Crash report in [AWSIoTStreamThread .cxx_destruct] #5452

Closed
tangguoEddy opened this issue Oct 17, 2024 · 5 comments
Labels
bug Something isn't working follow up Requires follow up from maintainers iot Issues related to the IoT SDK not-reproducible Not able to reproduce the issue

Comments

@tangguoEddy

tangguoEddy commented Oct 17, 2024

When I use AWSiOSSDKV2 2.37.2, I get this crash report.

I use the IoT APIs like this:

DispatchQueue.global(qos: .background).async{
    self.awsIoTDataManager
            .connectUsingWebSocket(withClientId: UUID().uuidString,
                                   cleanSession: true,
                                   statusCallback: self.statusCallback)
}
Date/Time:           2024-10-15 17:03:52.3587 +0800
Launch Time:         2024-10-15 11:59:07.0856 +0800
OS Version:          iPhone OS 17.5.1 (21F90)
Release Type:        User
Baseband Version:    5.00.00
Report Version:      104

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Subtype: KERN_INVALID_ADDRESS at 0x0000000c295319e0
Exception Codes: 0x0000000000000001, 0x0000000c295319e0
VM Region Info: 0xc295319e0 is not in any region.  Bytes after previous region: 38811146721  Bytes before following region: 15412815392
      REGION TYPE                 START - END      [ VSIZE] PRT/MAX SHRMOD  REGION DETAIL
      MALLOC_NANO              300000000-320000000 [512.0M] rw-/rwx SM=COW  
--->  GAP OF 0xca0000000 BYTES
      commpage (reserved)      fc0000000-1000000000 [  1.0G] ---/--- SM=NUL  reserved VM address space (unallocated)
Termination Reason: SIGNAL 11 Segmentation fault: 11
Terminating Process: exc handler [1054]

Triggered by Thread:  5


Thread 0 name:
Thread 0:
0   libsystem_kernel.dylib        	0x00000001ed824808 mach_msg2_trap + 8 (:-1)
1   libsystem_kernel.dylib        	0x00000001ed828008 mach_msg2_internal + 80 (mach_msg.c:201)
2   libsystem_kernel.dylib        	0x00000001ed827f20 mach_msg_overwrite + 436 (mach_msg.c:0)
3   libsystem_kernel.dylib        	0x00000001ed827d60 mach_msg + 24 (mach_msg.c:323)
4   CoreFoundation                	0x00000001a4744f5c __CFRunLoopServiceMachPort + 160 (CFRunLoop.c:2624)
5   CoreFoundation                	0x00000001a4744600 __CFRunLoopRun + 1208 (CFRunLoop.c:3007)
6   CoreFoundation                	0x00000001a4743cd8 CFRunLoopRunSpecific + 608 (CFRunLoop.c:3420)
7   GraphicsServices              	0x00000001e95f41a8 GSEventRunModal + 164 (GSEvent.c:2196)
8   UIKitCore                     	0x00000001a6d7c90c -[UIApplication _run] + 888 (UIApplication.m:3713)
9   UIKitCore                     	0x00000001a6e309d0 UIApplicationMain + 340 (UIApplication.m:5303)
10  Sesame                        	0x0000000102638898 main + 64 (AppDelegate.swift:16)
11  dyld                          	0x00000001c7df5e4c start + 2240 (dyldMain.cpp:1298)

Thread 1 name:
Thread 1:
0   libsystem_kernel.dylib        	0x00000001ed824808 mach_msg2_trap + 8 (:-1)
1   libsystem_kernel.dylib        	0x00000001ed828008 mach_msg2_internal + 80 (mach_msg.c:201)
2   libsystem_kernel.dylib        	0x00000001ed827f20 mach_msg_overwrite + 436 (mach_msg.c:0)
3   libsystem_kernel.dylib        	0x00000001ed827d60 mach_msg + 24 (mach_msg.c:323)
4   CoreFoundation                	0x00000001a4744f5c __CFRunLoopServiceMachPort + 160 (CFRunLoop.c:2624)
5   CoreFoundation                	0x00000001a4744600 __CFRunLoopRun + 1208 (CFRunLoop.c:3007)
6   CoreFoundation                	0x00000001a4743cd8 CFRunLoopRunSpecific + 608 (CFRunLoop.c:3420)
7   Foundation                    	0x00000001a3664e4c -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212 (NSRunLoop.m:373)
8   Foundation                    	0x00000001a3664c9c -[NSRunLoop(NSRunLoop) runUntilDate:] + 64 (NSRunLoop.m:420)
9   UIKitCore                     	0x00000001a6d90640 -[UIEventFetcher threadMain] + 420 (UIEventFetcher.m:1207)
10  Foundation                    	0x00000001a367b718 __NSThread__start__ + 732 (NSThread.m:991)
11  libsystem_pthread.dylib       	0x00000002015cd06c _pthread_start + 136 (pthread.c:931)
12  libsystem_pthread.dylib       	0x00000002015c80d8 thread_start + 8 (:-1)

Thread 2 name:
Thread 2:
0   libsystem_kernel.dylib        	0x00000001ed824808 mach_msg2_trap + 8 (:-1)
1   libsystem_kernel.dylib        	0x00000001ed828008 mach_msg2_internal + 80 (mach_msg.c:201)
2   libsystem_kernel.dylib        	0x00000001ed827f20 mach_msg_overwrite + 436 (mach_msg.c:0)
3   libsystem_kernel.dylib        	0x00000001ed827d60 mach_msg + 24 (mach_msg.c:323)
4   CoreFoundation                	0x00000001a4744f5c __CFRunLoopServiceMachPort + 160 (CFRunLoop.c:2624)
5   CoreFoundation                	0x00000001a4744600 __CFRunLoopRun + 1208 (CFRunLoop.c:3007)
6   CoreFoundation                	0x00000001a4743cd8 CFRunLoopRunSpecific + 608 (CFRunLoop.c:3420)
7   Foundation                    	0x00000001a3664e4c -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212 (NSRunLoop.m:373)
8   AWSIoT                        	0x0000000103ad4810 -[_SRRunLoopThread main] + 228 (AWSSRWebSocket.m:1905)
9   Foundation                    	0x00000001a367b718 __NSThread__start__ + 732 (NSThread.m:991)
10  libsystem_pthread.dylib       	0x00000002015cd06c _pthread_start + 136 (pthread.c:931)
11  libsystem_pthread.dylib       	0x00000002015c80d8 thread_start + 8 (:-1)

Thread 3 name:
Thread 3:
0   libsystem_kernel.dylib        	0x00000001ed824808 mach_msg2_trap + 8 (:-1)
1   libsystem_kernel.dylib        	0x00000001ed828008 mach_msg2_internal + 80 (mach_msg.c:201)
2   libsystem_kernel.dylib        	0x00000001ed827f20 mach_msg_overwrite + 436 (mach_msg.c:0)
3   libsystem_kernel.dylib        	0x00000001ed827d60 mach_msg + 24 (mach_msg.c:323)
4   CoreFoundation                	0x00000001a4744f5c __CFRunLoopServiceMachPort + 160 (CFRunLoop.c:2624)
5   CoreFoundation                	0x00000001a4744600 __CFRunLoopRun + 1208 (CFRunLoop.c:3007)
6   CoreFoundation                	0x00000001a4743cd8 CFRunLoopRunSpecific + 608 (CFRunLoop.c:3420)
7   CFNetwork                     	0x00000001a5924c90 +[__CFN_CoreSchedulingSetRunnable _run:] + 384 (CoreSchedulingSet.mm:1473)
8   Foundation                    	0x00000001a367b718 __NSThread__start__ + 732 (NSThread.m:991)
9   libsystem_pthread.dylib       	0x00000002015cd06c _pthread_start + 136 (pthread.c:931)
10  libsystem_pthread.dylib       	0x00000002015c80d8 thread_start + 8 (:-1)

Thread 4 name:
Thread 4:
0   libsystem_kernel.dylib        	0x00000001ed82c474 __select + 8 (:-1)
1   CoreFoundation                	0x00000001a47afb7c __CFSocketManager + 640 (CFSocket.c:1340)
2   libsystem_pthread.dylib       	0x00000002015cd06c _pthread_start + 136 (pthread.c:931)
3   libsystem_pthread.dylib       	0x00000002015c80d8 thread_start + 8 (:-1)

Thread 5 Crashed:
0   libobjc.A.dylib               	0x000000019c615b60 objc_release + 16 (:-1)
1   AWSIoT                        	0x0000000103add778 -[AWSIoTStreamThread .cxx_destruct] + 116 (AWSIoTStreamThread.m:32)
2   libobjc.A.dylib               	0x000000019c618c3c object_cxxDestructFromClass(objc_object*, objc_class*) + 116 (objc-class.mm:457)
3   libobjc.A.dylib               	0x000000019c618300 objc_destructInstance + 80 (objc-runtime-new.mm:9125)
4   libobjc.A.dylib               	0x000000019c6182a8 _objc_rootDealloc + 80 (NSObject.mm:2136)
5   Foundation                    	0x00000001a35cf7e4 -[NSThread dealloc] + 100 (NSThread.m:774)
6   Foundation                    	0x00000001a35cd9d8 __NSFinalizeThreadData + 728 (NSThread.m:1195)
7   CoreFoundation                	0x00000001a47ca2e8 __CFTSDFinalize + 124 (CFPlatform.c:840)
8   libsystem_pthread.dylib       	0x00000002015cbf18 _pthread_tsd_cleanup + 620 (pthread_tsd.c:416)
9   libsystem_pthread.dylib       	0x00000002015cbc88 _pthread_exit + 84 (pthread.c:1770)
10  libsystem_pthread.dylib       	0x00000002015cef48 pthread_exit + 88 (pthread.c:1787)
11  Foundation                    	0x00000001a367bad4 +[NSThread exit] + 20 (NSThread.m:656)
12  Foundation                    	0x00000001a367b724 __NSThread__start__ + 744 (NSThread.m:993)
13  libsystem_pthread.dylib       	0x00000002015cd06c _pthread_start + 136 (pthread.c:931)
14  libsystem_pthread.dylib       	0x00000002015c80d8 thread_start + 8 (:-1)

Thread 6:
0   libsystem_pthread.dylib       	0x00000002015c80c4 start_wqthread + 0 (:-1)

Thread 7:
0   libsystem_pthread.dylib       	0x00000002015c80c4 start_wqthread + 0 (:-1)

Thread 8:
0   libsystem_pthread.dylib       	0x00000002015c80c4 start_wqthread + 0 (:-1)

Thread 9:
0   libsystem_kernel.dylib        	0x00000001ed824808 mach_msg2_trap + 8 (:-1)
1   libsystem_kernel.dylib        	0x00000001ed828008 mach_msg2_internal + 80 (mach_msg.c:201)
2   libsystem_kernel.dylib        	0x00000001ed827f20 mach_msg_overwrite + 436 (mach_msg.c:0)
3   libsystem_kernel.dylib        	0x00000001ed827d60 mach_msg + 24 (mach_msg.c:323)
4   CoreFoundation                	0x00000001a4744f5c __CFRunLoopServiceMachPort + 160 (CFRunLoop.c:2624)
5   CoreFoundation                	0x00000001a4744600 __CFRunLoopRun + 1208 (CFRunLoop.c:3007)
6   CoreFoundation                	0x00000001a4743cd8 CFRunLoopRunSpecific + 608 (CFRunLoop.c:3420)
7   Foundation                    	0x00000001a3664e4c -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 212 (NSRunLoop.m:373)
8   AWSIoT                        	0x0000000103adcec0 -[AWSIoTStreamThread main] + 768 (AWSIoTStreamThread.m:88)
9   Foundation                    	0x00000001a367b718 __NSThread__start__ + 732 (NSThread.m:991)
10  libsystem_pthread.dylib       	0x00000002015cd06c _pthread_start + 136 (pthread.c:931)
11  libsystem_pthread.dylib       	0x00000002015c80d8 thread_start + 8 (:-1)

Thread 10:
0   libsystem_pthread.dylib       	0x00000002015c80c4 start_wqthread + 0 (:-1)


Thread 5 crashed with ARM Thread State (64-bit):
    x0: 0x00000003012a19c0   x1: 0x0000000000000000   x2: 0x0000000c295319c0   x3: 0x00000003020b30b0
    x4: 0x00000003020b30c0   x5: 0x0000000205468050   x6: 0x0000000000000000   x7: 0x0000000000000000
    x8: 0x0000000000000050   x9: 0x5a32c8b2236c00d3  x10: 0x00000001029bf000  x11: 0x0000000000000000
   x12: 0x00000000000007fb  x13: 0x00000000000007fd  x14: 0x000000003440e057  x15: 0x0000000000000057
   x16: 0x0000e3fc295319c0  x17: 0x000000000000e000  x18: 0x0000000000000000  x19: 0x0000000303cb7080
   x20: 0x0000000103d4b8c0  x21: 0x000000019c65afb1  x22: 0x0000000103add704  x23: 0x0000000205465000
   x24: 0x0000000204310000  x25: 0x000000020545c000  x26: 0x000000000000002d  x27: 0x0000000000000000
   x28: 0x0000000000000000   fp: 0x000000016e03e990   lr: 0x0000000103add778
    sp: 0x000000016e03e980   pc: 0x000000019c615b60        cpsr: 0x0
   esr: 0x92000006 (Data Abort) byte read Translation fault


Binary Images:
        0x1025f0000 -         0x102873fff Sesame arm64  <878a6534ac2c395ea18d1a4cd2c94125> /private/var/containers/Bundle/Application/67490276-50A6-4129-948D-366713EDFF6B/Sesame.app/Sesame
        0x102990000 -         0x102993fff AWSAPIGateway_-3E807E1F8380265C_PackageProduct arm64  <b46129bc67163714b282ea7a82b78e01> /private/var/containers/Bundle/Application/67490276-50A6-4129-948D-366713EDFF6B/Sesame.app/Frameworks/AWSAPIGateway_-3E807E1F8380265C_PackageProduct.framework/AWSAPIGateway_-3E807E1F8380265C_PackageProduct
        0x1029a0000 -         0x1029a3fff AWSIoT_1718228465003D_PackageProduct arm64  <10560e3ff5c939baac5c8be5bfd87607> /private/var/containers/Bundle/Application/67490276-50A6-4129-948D-366713EDFF6B/Sesame.app/Frameworks/AWSIoT_1718228465003D_PackageProduct.framework/AWSIoT_1718228465003D_PackageProduct
        0x1029c4000 -         0x1029cbfff AWSCognitoIdentityProviderASF arm64  <56b2087f727831cd9b1bf369f7ca9d99> /private/var/containers/Bundle/Application/67490276-50A6-4129-948D-366713EDFF6B/Sesame.app/Frameworks/AWSCognitoIdentityProviderASF.framework/AWSCognitoIdentityProviderASF
        0x1029f0000 -         0x1029f7fff AWSAuthCore arm64  <14ec0a62ed1639e5b865e9c219b4e485> /private/var/containers/Bundle/Application/67490276-50A6-4129-948D-366713EDFF6B/Sesame.app/Frameworks/AWSAuthCore.framework/AWSAuthCore
        0x102b58000 -         0x102b63fff libobjc-trampolines.dylib arm64e  <2e2c05f8377a30899ad91926d284dd03> /private/preboot/Cryptexes/OS/usr/lib/libobjc-trampolines.dylib
        0x102fa8000 -         0x102fb3fff AWSAPIGateway arm64  <8a0f27eb8f72317cb4f7fb2c64e5c67a> /private/var/containers/Bundle/Application/67490276-50A6-4129-948D-366713EDFF6B/Sesame.app/Frameworks/AWSAPIGateway.framework/AWSAPIGateway
        0x103044000 -         0x1030a3fff AWSMobileClientXCF arm64  <f4789685f4633618832e442802ad2171> /private/var/containers/Bundle/Application/67490276-50A6-4129-948D-366713EDFF6B/Sesame.app/Frameworks/AWSMobileClientXCF.framework/AWSMobileClientXCF
        0x1030ec000 -         0x1031d3fff AWSCore arm64  <db0be3ac2fb731c793e4c66f69e85250> /private/var/containers/Bundle/Application/67490276-50A6-4129-948D-366713EDFF6B/Sesame.app/Frameworks/AWSCore.framework/AWSCore
        0x103420000 -         0x1035a3fff AWSCognitoIdentityProvider arm64  <70ad826245603919bee17661d52f1e04> /private/var/containers/Bundle/Application/67490276-50A6-4129-948D-366713EDFF6B/Sesame.app/Frameworks/AWSCognitoIdentityProvider.framework/AWSCognitoIdentityProvider
        0x103610000 -         0x1036cbfff SesameSDK arm64  <fc19dfcd730932299bbc1d1eeeedd61b> /private/var/containers/Bundle/Application/67490276-50A6-4129-948D-366713EDFF6B/Sesame.app/Frameworks/SesameSDK.framework/SesameSDK
        0x103a50000 -         0x103cb7fff AWSIoT arm64  <1ee43632627a3a3490afc20c94172424> /private/var/containers/Bundle/Application/67490276-50A6-4129-948D-366713EDFF6B/Sesame.app/Frameworks/AWSIoT.framework/AWSIoT
        0x19c614000 -         0x19c661f43 libobjc.A.dylib arm64e  <53115e1fe35330d99e8a4e6e73489f05> /usr/lib/libobjc.A.dylib
        0x1a359d000 -         0x1a4112fff Foundation arm64e  <3d3a12e3f5e9361fb00a4a5e8861aa55> /System/Library/Frameworks/Foundation.framework/Foundation
        0x1a46f1000 -         0x1a4c1efff CoreFoundation arm64e  <00e76a98210c3cb5930bf236807ff24c> /System/Library/Frameworks/CoreFoundation.framework/CoreFoundation
        0x1a5827000 -         0x1a5c03fff CFNetwork arm64e  <a5124019e235371686c7e75cf0163945> /System/Library/Frameworks/CFNetwork.framework/CFNetwork
        0x1a6972000 -         0x1a8492fff UIKitCore arm64e  <1741fa374e53371e8daed611aab0043d> /System/Library/PrivateFrameworks/UIKitCore.framework/UIKitCore
        0x1ac65d000 -         0x1ac6daff3 libsystem_c.dylib arm64e  <b122f07fa15637f3a22d64627c0c4b24> /usr/lib/system/libsystem_c.dylib
        0x1c7db9000 -         0x1c7e45ef7 dyld arm64e  <71846eacee653697bf7d790b6a07dcdb> /usr/lib/dyld
        0x1e95f3000 -         0x1e95fbfff GraphicsServices arm64e  <c19b2aeb6aa83f998a53f76c7a0d98fe> /System/Library/PrivateFrameworks/GraphicsServices.framework/GraphicsServices
        0x1ed823000 -         0x1ed85cfef libsystem_kernel.dylib arm64e  <13b5134e819c3baab3004856112114cb> /usr/lib/system/libsystem_kernel.dylib
        0x2015c7000 -         0x2015d3ff3 libsystem_pthread.dylib arm64e  <1196b6c3333d3450818ff3663484b8eb> /usr/lib/system/libsystem_pthread.dylib

EOF
@github-actions github-actions bot added pending-triage Issue is pending triage pending-maintainer-response Issue is pending response from an Amplify team member labels Oct 17, 2024
@harsh62
Member

harsh62 commented Oct 17, 2024

@tangguoEddy Are you able to provide more details about how to reproduce this crash on a local machine?

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending response from an Amplify team member label Oct 17, 2024
@harsh62 harsh62 added bug Something isn't working iot Issues related to the IoT SDK not-reproducible Not able to reproduce the issue labels Oct 21, 2024
@AndrKonovalov

Hey all, we are also facing the same issue in production, with no luck reproducing it locally.
After digging into the code, we found several places that might lead to this crash.

Risk: The AWSIoTStreamThread relies on manually managing its lifecycle with start, isRunning, and cancel states. Improper synchronization can lead to race conditions, especially when starting or canceling the thread.

The isRunning flag is updated but not thread-safe, as it is accessed in both the main loop and external methods (cancel or cancelAndDisconnect).

Risk: Cleanup operations (e.g., invalidating streams, closing resources) are distributed across cleanUp, cancel, and cancelAndDisconnect. This can lead to inconsistent behavior or resource leaks if these methods are called out of sequence.

cleanUp depends on shouldDisconnect, which may not always be set consistently.
cancelAndDisconnect sets shouldDisconnect, but it doesn't always guarantee that cleanUp completes successfully before releasing resources.

A canceled thread may still execute cleanup logic if isRunning is not synchronized.

Risk: The defaultRunLoopTimer is essential for maintaining the run loop's activity. If the timer is invalidated prematurely (e.g., in cleanUp or during reconnection), the run loop might exit unexpectedly.

The timer is invalidated during cleanUp:

if (self.defaultRunLoopTimer) {
    [self.defaultRunLoopTimer invalidate];
    self.defaultRunLoopTimer = nil;
}

The run loop exits based on isRunning and isCancelled, but a missing timer could cause it to terminate prematurely.
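
For context on the timer risk, here is a minimal standalone sketch (not SDK code) of why a keep-alive timer matters. `-[NSRunLoop runMode:beforeDate:]` is documented to return NO immediately when no input sources or timers are attached, so a loop built around it stops blocking once the timer is gone. The 0.2 s interval, the 1 s deadline, and the log messages are illustrative assumptions.

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        NSRunLoop *runLoop = [NSRunLoop currentRunLoop];

        // Nothing attached yet: the call is documented to return NO right away
        // instead of waiting for the deadline.
        BOOL ranWithoutTimer = [runLoop runMode:NSDefaultRunLoopMode
                                     beforeDate:[NSDate dateWithTimeIntervalSinceNow:1.0]];
        NSLog(@"ran without timer: %@", ranWithoutTimer ? @"YES" : @"NO");

        // With a repeating timer attached, the run loop has something to wait on
        // and blocks until the deadline.
        NSTimer *keepAlive = [NSTimer timerWithTimeInterval:0.2
                                                    repeats:YES
                                                      block:^(NSTimer *timer) {
            NSLog(@"keep-alive timer fired");
        }];
        [runLoop addTimer:keepAlive forMode:NSDefaultRunLoopMode];

        BOOL ranWithTimer = [runLoop runMode:NSDefaultRunLoopMode
                                  beforeDate:[NSDate dateWithTimeIntervalSinceNow:1.0]];
        NSLog(@"ran with timer: %@", ranWithTimer ? @"YES" : @"NO");

        // Invalidate on the same thread the timer was scheduled on.
        [keepAlive invalidate];
    }
    return 0;
}
```

Apple's documentation also notes that the system may install its own input sources, so the first call is not guaranteed to return NO in every process.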

Since there are several things that could potentially lead to the crash, can you review this updated implementation?

@interface AWSIoTStreamThread()

@property(atomic, strong, nullable) AWSMQTTSession *session;
@property(atomic, strong, nullable) NSOutputStream *encoderOutputStream;
@property(atomic, strong, nullable) NSInputStream  *decoderInputStream;
@property(atomic, strong, nullable) NSOutputStream *outputStream;
@property(atomic, strong, nullable) NSTimer *defaultRunLoopTimer;
@property(atomic, strong, nullable) NSRunLoop *runLoopForStreamsThread;
@property(atomic, assign) NSTimeInterval defaultRunLoopTimeInterval;
@property(atomic, assign) BOOL isRunning;
@property(atomic, assign) BOOL shouldDisconnect;
@property(atomic, copy, nullable) dispatch_block_t onStop;

// Add synchronization primitives
@property(nonatomic, strong) dispatch_queue_t cleanupQueue;
@property(nonatomic, strong) dispatch_semaphore_t cleanupSemaphore;
@property(atomic, assign) BOOL isCleaningUp;
@end

@implementation AWSIoTStreamThread

- (instancetype)initWithSession:(nonnull AWSMQTTSession *)session
            decoderInputStream:(nonnull NSInputStream *)decoderInputStream
           encoderOutputStream:(nonnull NSOutputStream *)encoderOutputStream
                 outputStream:(nullable NSOutputStream *)outputStream {
    if (self = [super init]) {
        _session = session;
        _decoderInputStream = decoderInputStream;
        _encoderOutputStream = encoderOutputStream;
        _outputStream = outputStream;
        _defaultRunLoopTimeInterval = 10;
        _shouldDisconnect = NO;
        _isCleaningUp = NO;
        
        // Initialize synchronization primitives
        _cleanupQueue = dispatch_queue_create("com.amazonaws.iot.streamthread.cleanup", DISPATCH_QUEUE_SERIAL);
        _cleanupSemaphore = dispatch_semaphore_create(1);
    }
    return self;
}

- (void)main {
    @autoreleasepool {
        AWSDDLogVerbose(@"Started execution of Thread: [%@]", self);
        
        if (![self setupRunLoop]) {
            AWSDDLogError(@"Failed to setup run loop for thread: [%@]", self);
            return;
        }
        
        [self startIOOperations];
        
        while ([self shouldContinueRunning]) {
            @autoreleasepool {
                [self.runLoopForStreamsThread runMode:NSDefaultRunLoopMode
                                         beforeDate:[NSDate dateWithTimeIntervalSinceNow:self.defaultRunLoopTimeInterval]];
            }
        }
        
        [self performCleanup];
        
        AWSDDLogVerbose(@"Finished execution of Thread: [%@]", self);
    }
}

- (BOOL)setupRunLoop {
    if (self.isRunning) {
        AWSDDLogError(@"Thread already running");
        return NO;
    }
    
    self.runLoopForStreamsThread = [NSRunLoop currentRunLoop];
    
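    // Note: NSTimer holds a strong reference to its target until the timer is
    // invalidated, so passing weakSelf here does not by itself prevent a retain
    // cycle; the cycle is only broken once the timer is invalidated.
    __weak typeof(self) weakSelf = self;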
    self.defaultRunLoopTimer = [[NSTimer alloc] initWithFireDate:[NSDate dateWithTimeIntervalSinceNow:60.0]
                                                      interval:60.0
                                                        target:weakSelf
                                                      selector:@selector(timerHandler:)
                                                      userInfo:nil
                                                       repeats:YES];
    
    if (!self.defaultRunLoopTimer) {
        AWSDDLogError(@"Failed to create run loop timer");
        return NO;
    }
    
    [self.runLoopForStreamsThread addTimer:self.defaultRunLoopTimer
                                 forMode:NSDefaultRunLoopMode];
    
    self.isRunning = YES;
    return YES;
}

- (void)startIOOperations {
    if (self.outputStream) {
        [self.outputStream scheduleInRunLoop:self.runLoopForStreamsThread
                                  forMode:NSDefaultRunLoopMode];
        [self.outputStream open];
    }
    
    [self.session connectToInputStream:self.decoderInputStream
                        outputStream:self.encoderOutputStream];
}

- (BOOL)shouldContinueRunning {
    return self.isRunning && !self.isCancelled && self.defaultRunLoopTimer != nil;
}

- (void)invalidateTimer {
    dispatch_sync(self.cleanupQueue, ^{
        if (self.defaultRunLoopTimer) {
            [self.defaultRunLoopTimer invalidate];
            self.defaultRunLoopTimer = nil;
        }
    });
}

- (void)cancel {
    AWSDDLogVerbose(@"Issued Cancel on thread [%@]", (NSThread *)self);
    [self cancelWithDisconnect:NO];
}

- (void)cancelAndDisconnect:(BOOL)shouldDisconnect {
    AWSDDLogVerbose(@"Issued Cancel and Disconnect = [%@] on thread [%@]", 
                    shouldDisconnect ? @"YES" : @"NO", (NSThread *)self);
    [self cancelWithDisconnect:shouldDisconnect];
}

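- (void)cancelWithDisconnect:(BOOL)shouldDisconnect {
    // Ensure thread-safe property updates
    dispatch_sync(self.cleanupQueue, ^{
        if (!self.isCleaningUp) {
            self.shouldDisconnect = shouldDisconnect;
            self.isRunning = NO;
            [super cancel];
            
            // Invalidate the timer inline to trigger run loop exit. Calling
            // -invalidateTimer here would dispatch_sync onto cleanupQueue from
            // a block already running on that serial queue, which deadlocks.
            if (self.defaultRunLoopTimer) {
                [self.defaultRunLoopTimer invalidate];
                self.defaultRunLoopTimer = nil;
            }
        }
    });
}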

- (void)performCleanup {
    dispatch_semaphore_wait(self.cleanupSemaphore, DISPATCH_TIME_FOREVER);
    
    if (self.isCleaningUp) {
        dispatch_semaphore_signal(self.cleanupSemaphore);
        return;
    }
    
    self.isCleaningUp = YES;
    dispatch_semaphore_signal(self.cleanupSemaphore);
    
    dispatch_sync(self.cleanupQueue, ^{
        [self cleanupResources];
    });
}

- (void)cleanupResources {
    if (self.shouldDisconnect) {
        [self closeSession];
        [self closeStreams];
    } else {
        AWSDDLogVerbose(@"Skipping disconnect for thread: [%@]", (NSThread *)self);
    }
    
    // Handle onStop callback
    dispatch_block_t stopBlock = self.onStop;
    if (stopBlock) {
        self.onStop = nil;
        stopBlock();
    }
}

- (void)closeSession {
    if (self.session) {
        [self.session close];
        self.session = nil;
    }
}

- (void)closeStreams {
    if (self.outputStream) {
        self.outputStream.delegate = nil;
        [self.outputStream close];
        [self.outputStream removeFromRunLoop:self.runLoopForStreamsThread
                                  forMode:NSDefaultRunLoopMode];
        self.outputStream = nil;
    }
    
    if (self.decoderInputStream) {
        [self.decoderInputStream close];
        self.decoderInputStream = nil;
    }
    
    if (self.encoderOutputStream) {
        [self.encoderOutputStream close];
        self.encoderOutputStream = nil;
    }
}

- (void)timerHandler:(NSTimer*)theTimer {
    AWSDDLogVerbose(@"Default run loop timer executed on Thread: [%@]. isRunning = %@. isCancelled = %@", 
                    self, self.isRunning ? @"YES" : @"NO", self.isCancelled ? @"YES" : @"NO");
}

- (void)dealloc {
    AWSDDLogVerbose(@"Deallocating AWSIoTStreamThread: [%@]", self);
}

@end
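
One hazard to keep in mind with a serial cleanup queue like this: calling `dispatch_sync` onto a serial queue from a block that is already running on that queue deadlocks. A minimal sketch of one way to make such helpers safe to nest, using `dispatch_queue_set_specific` and `dispatch_get_specific` to detect whether the caller is already on the queue. The class `Guarded`, the method `performOnQueue:`, and the `timerValid` flag are hypothetical names for illustration and are not part of the SDK.

```objc
#import <Foundation/Foundation.h>

// The address of this variable serves as a unique key for queue-specific data.
static void *kCleanupQueueKey = &kCleanupQueueKey;

@interface Guarded : NSObject
@property (nonatomic, strong) dispatch_queue_t queue;
@property (nonatomic, assign) BOOL timerValid;
@end

@implementation Guarded

- (instancetype)init {
    if (self = [super init]) {
        _queue = dispatch_queue_create("example.cleanup", DISPATCH_QUEUE_SERIAL);
        // Tag the queue with this instance so we can recognise it later.
        dispatch_queue_set_specific(_queue, kCleanupQueueKey, (__bridge void *)self, NULL);
        _timerValid = YES;
    }
    return self;
}

// Runs the block inline when already on the queue, otherwise hops onto it.
- (void)performOnQueue:(dispatch_block_t)block {
    if (dispatch_get_specific(kCleanupQueueKey) == (__bridge void *)self) {
        block();
    } else {
        dispatch_sync(self.queue, block);
    }
}

- (void)invalidateTimer {
    [self performOnQueue:^{
        self.timerValid = NO;
    }];
}

- (void)cancel {
    [self performOnQueue:^{
        // Nested call: runs inline instead of a second dispatch_sync.
        [self invalidateTimer];
    }];
}

@end

int main(void) {
    @autoreleasepool {
        Guarded *guarded = [Guarded new];
        [guarded cancel];
        NSLog(@"timerValid = %@", guarded.timerValid ? @"YES" : @"NO");
    }
    return 0;
}
```

A simpler alternative is to keep a private helper that assumes it is already on the queue and only call `dispatch_sync` from the public entry points.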

The improvements ensure that:
 - Cleanup happens exactly once
 - Timer invalidation is properly synchronized
 - Resources are released in the correct order
 - State transitions are thread-safe
 - Run loop exits cleanly

I'm not sure whether it is critical for the properties to be nonatomic, though.
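
On the atomic question: an `atomic` property only makes each individual get or set indivisible. A read followed by a write, as in `if (!isCleaningUp) { isCleaningUp = YES; }`, can still interleave between threads. A minimal sketch of an "exactly once" guard using C11 `atomic_exchange`, which reads the old value and writes the new one in a single step. The class `OnceCleaner`, the method `beginCleanup`, and the counter `gWinners` are hypothetical names for illustration.

```objc
#import <Foundation/Foundation.h>
#import <stdatomic.h>

// Counts how many callers were allowed to run cleanup.
static atomic_int gWinners;

@interface OnceCleaner : NSObject
- (BOOL)beginCleanup;
@end

@implementation OnceCleaner {
    atomic_bool _isCleaningUp; // zero-initialised, i.e. false
}

// Returns YES for exactly one caller, no matter how many threads race here.
- (BOOL)beginCleanup {
    // atomic_exchange stores true and returns the previous value atomically.
    return !atomic_exchange(&_isCleaningUp, true);
}

@end

int main(void) {
    @autoreleasepool {
        OnceCleaner *cleaner = [OnceCleaner new];

        // Race eight concurrent callers against the guard.
        dispatch_apply(8, dispatch_get_global_queue(QOS_CLASS_DEFAULT, 0), ^(size_t i) {
            if ([cleaner beginCleanup]) {
                atomic_fetch_add(&gWinners, 1);
            }
        });

        NSLog(@"cleanup ran %d time(s)", atomic_load(&gWinners));
    }
    return 0;
}
```

Only the first caller sees the old value `false`, so the log reports one run. A lock or a serial queue achieves the same thing when more than one flag has to change together.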

@github-actions github-actions bot added the pending-maintainer-response Issue is pending response from an Amplify team member label Nov 15, 2024
@edisooon edisooon removed the pending-triage Issue is pending triage label Nov 15, 2024
@vincetran
Member

Thanks for your input, @AndrKonovalov! Someone will take a look at your code and provide comments as soon as they have bandwidth.

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending response from an Amplify team member label Nov 19, 2024
@vincetran vincetran added follow up Requires follow up from maintainers pending-maintainer-response Issue is pending response from an Amplify team member and removed pending-maintainer-response Issue is pending response from an Amplify team member labels Nov 19, 2024
AndrKonovalov pushed a commit to AndrKonovalov/aws-sdk-ios that referenced this issue Dec 4, 2024
…ws-amplify#5452

Related issue:
aws-amplify#5452

Description of changes:

1. Addition of Synchronization Primitives
New Properties:
 - dispatch_queue_t cleanupQueue
 - dispatch_semaphore_t cleanupSemaphore
 - BOOL isCleaningUp
Purpose: Ensures thread-safe access and modification of critical properties like isRunning, shouldDisconnect, and defaultRunLoopTimer.
Synchronization prevents race conditions during cleanup and cancellation processes.

2. Enhanced shouldContinueRunning Method

Before: Used direct property access without synchronization
After: Introduced synchronization using dispatch_sync for thread-safe checks
Purpose: Prevents inconsistencies if multiple threads attempt to read/write properties simultaneously.

3. Cleanup Enhancements

performCleanup and cleanupResources:
Added explicit synchronization: dispatch_sync and dispatch_semaphore ensure cleanup operations are thread-safe and do not overlap if called multiple times.
Handles complex cleanup sequences safely, such as invalidating timers, disconnecting streams, and deallocating the session.
Purpose: Ensures that cleanup actions (e.g., closing streams and invalidating timers) are thread-safe and only executed once.

4. Timer Initialization
Weak Reference to Prevent Retain Cycles: The timer in setupRunLoop now uses a __weak reference to avoid retain cycles
Before: Used a strong reference (target:self), which could result in a retain cycle.
Purpose: Avoids potential memory leaks by ensuring the thread does not retain itself via the timer.

5. Improved cancel Method
Before: Simple isRunning flag and direct super cancel call
After: Introduced thread-safe handling and ensured timer invalidation
Purpose: Prevents race conditions when canceling the thread, ensuring timers are invalidated and properties are safely updated.
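
A caveat on item 4 worth checking: per Apple's documentation, a target-based `NSTimer` keeps a strong reference to its target until the timer is invalidated, so passing a `__weak` variable as `target:` still ends up retained. A minimal sketch of one alternative, the block-based timer API (iOS 10 and later), where the block captures `self` weakly. The class `Ticker` and its `start` method are hypothetical names for illustration and are not part of the SDK.

```objc
#import <Foundation/Foundation.h>

@interface Ticker : NSObject
@property (nonatomic, strong) NSTimer *timer;
- (void)start;
@end

@implementation Ticker

- (void)start {
    __weak typeof(self) weakSelf = self;
    // The timer retains the block, and the block holds self only weakly,
    // so the timer does not keep this object alive.
    self.timer = [NSTimer timerWithTimeInterval:60.0
                                        repeats:YES
                                          block:^(NSTimer *timer) {
        __strong typeof(weakSelf) strongSelf = weakSelf;
        if (!strongSelf) {
            // Owner is gone: stop the timer so the run loop can drop it.
            [timer invalidate];
            return;
        }
        NSLog(@"timer fired for %@", strongSelf);
    }];
    [[NSRunLoop currentRunLoop] addTimer:self.timer forMode:NSDefaultRunLoopMode];
}

- (void)dealloc {
    // Assumes dealloc runs on the thread that scheduled the timer.
    [_timer invalidate];
    NSLog(@"Ticker deallocated");
}

@end

int main(void) {
    @autoreleasepool {
        Ticker *ticker = [Ticker new];
        [ticker start];
        // Dropping the last strong reference; the timer does not retain the Ticker.
        ticker = nil;
    }
    return 0;
}
```

With a target-based timer, the equivalent fix is to make sure the timer is always invalidated on the thread that scheduled it before the owner is released.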
sebaland pushed a commit that referenced this issue Dec 13, 2024
…5452

@sebaland
Member

A fix has been released in 2.38.1.


This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue feel free to do so.

6 participants