Skip to content
This repository has been archived by the owner on Jan 15, 2021. It is now read-only.

[Android] ThaliTest app crashes during start-up #563

Closed
tompaana opened this issue Feb 22, 2016 · 32 comments
Closed

[Android] ThaliTest app crashes during start-up #563

tompaana opened this issue Feb 22, 2016 · 32 comments
Assignees
Milestone

Comments

@tompaana
Copy link
Member

tompaana commented Feb 22, 2016

Reproduced with commit 8e7e29ab668e5569dc9c8c212030295f41ed47bd and on Lollipop.

No clear reason, but saw this: https://github.com/jxcore/jxcore-cordova/blob/master/src/android/java/io/jxcore/node/jxcore.java#L168

In addition, commenting out line 84 in ThaliPermissions.java (

) fixed the problem, but is obviously not an actual fix removes the crash, but nothing is happening after - the app just freezes functionality-wise.

@tompaana tompaana added this to the New Infra milestone Feb 22, 2016
@tompaana
Copy link
Member Author

The things that has broken the code has happened after 9b58dbf88f0f3cc7922802f1f50034f2c357665d in vNext branch.

@tompaana tompaana assigned tompaana and unassigned juhanak Feb 23, 2016
@tompaana
Copy link
Member Author

Commenting out stuff in app.js (UnitTest_app.js) makes the bug go away:

//testUtils.toggleRadios(true)
//.then(function () {
  Mobile('GetDeviceName').callNative(function (name) {
    console.log('My device name is: %s', name);
    testUtils.setName(name);
    require('./runTests.js');
  });
//})
//.catch(function (error) {
//  console.log('Something went wrong: ' + error);
//});

@tompaana tompaana assigned vjrantal and unassigned tompaana Feb 23, 2016
@vjrantal
Copy link
Member

Without commenting out anything, i.e., when the crash occurs, I see something like this in logcat:

14:52:24.907 511/511                     DEBUG  I   *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
14:52:24.907 511/511                     DEBUG  I   Build fingerprint: 'htc/HTCOneM9vzw/htc_himawl:5.0.2/LRX22G/511781.15:user/release-keys'
14:52:24.907 511/511                     DEBUG  I   Revision: '0'
14:52:24.907 511/511                     DEBUG  I   ABI: 'arm'
14:52:24.907 511/511                     DEBUG  I   pid: 13570, tid: 13659, name: Thread-491  >>> com.test.thalitest <<<
14:52:24.907 511/511                     DEBUG  I   signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
14:52:24.927 511/511                     DEBUG  I       r0 00000000  r1 c36c725c  r2 00000002  r3 00000000
14:52:24.927 511/511                     DEBUG  I       r4 ac809ad8  r5 00000000  r6 ac809368  r7 ad751270
14:52:24.927 511/511                     DEBUG  I       r8 ac5730e4  r9 ac809ad8  sl ac809340  fp c36c7254
14:52:24.927 511/511                     DEBUG  I       ip ffffffff  sp c36c7244  lr c3aa7b30  pc c3aa7a34  cpsr 60070010
14:52:24.927 511/511                     DEBUG  I   
14:52:24.927 511/511                     DEBUG  I   backtrace:
14:52:24.927 511/511                     DEBUG  I       #00 pc 003d1a34  /data/app/com.test.thalitest-2/lib/arm/libjxcore.so (bool ThingIsPermanentAtom<JS::Symbol>(JS::Symbol*)+4)
14:52:24.927 511/511                     DEBUG  I       #01 pc 003d1b2c  /data/app/com.test.thalitest-2/lib/arm/libjxcore.so
14:52:24.927 511/511                     DEBUG  I       #02 pc 01c08000  <unknown>

@obastemur: We will continue investigating this on our side, but if you have any thoughts on why we would see such an error, please comment.

vjrantal added a commit that referenced this issue Feb 23, 2016
@obastemur
Copy link
Contributor

@vjrantal do you have something more? looks like this happening while trying to call back to external method (see pc is beyond the libjxcore.so) or even worst ..

Guess, multiple threads are reaching to jxcore.

@vjrantal
Copy link
Member

@obastemur: Looks like this commit 1da7827 gets rid of the crash, but I have no idea why.

What I was thinking is that would there be any value in building a debug version of libjxcore to get more detailed backtrace?

@obastemur
Copy link
Contributor

Looks like this commit 1da7827 gets rid of the crash, but I have no idea why.

I bet that is just a temporary solution. you might enable logging inside jxcore and build and share the output.

see https://github.com/jxcore/jxcore/blob/master/src/jx/Proxy/EngineLogger.h

enable both

// #define JXCORE_PRINT_NATIVE_CALLS 1

// Print the logs as soon as happens.
// #define JXCORE_PRINT_LOGS_ASAP 1

@vjrantal
Copy link
Member

@obastemur: Do you happen to have any jxcore-cordova build machine where you could easily build a io.jxcore.node.jx package that would have the right binaries for debugging this? I'd have to spend some time to first configure the build environment.

@obastemur
Copy link
Contributor

unfortunately I don't have it on my second laptop. A bit away from the first one. Hey, guess you will need it later too.

@vjrantal
Copy link
Member

@obastemur: Yep - no worries :)

Thanks for the comments!

vjrantal added a commit that referenced this issue Feb 23, 2016
@tompaana
Copy link
Member Author

There may be calls from Java code JXcore from other than the main thread.

@vjrantal
Copy link
Member

@obastemur: Do you think this could be somehow related to jxcore/jxcore-cordova#60? Should we somehow change our code now that we started using jxcore-cordova > 0.1.0?

@obastemur
Copy link
Contributor

Indeed tou should update to latest

@vjrantal
Copy link
Member

@obastemur: We have updated to the latest and now seeing this issue. My question was, now that we have already updated to latest, do we need to change our code, because the reference issue said "anyone depending on UI - JXcore JS thread may need to update their code"?

@vjrantal
Copy link
Member

@obastemur: As further debugging information, with jxcore-cordova 0.1.1 I get 100% repro rate with our test app and without changing anything else, but downgrading to jxcore-cordova 0.0.7 (using jxc install 0.0.7 --force) the crash doesn't occur anymore (3/3 app startups were successful).

vjrantal added a commit that referenced this issue Feb 24, 2016
The radio toggling at the beginning of the tests was removed as a workaround
for #563. That would be good to put back once a root cause and fix is found,
but even then, it should not be necessary to turn off radios at the end of
tests.

This was originally done for CI where devices didn't reboot in between tests,
but now devices do reboot and thus tests always start from clean radio states.
@vjrantal
Copy link
Member

@obastemur: Was able to get a build with the JXcore EngineLogger debug output, but for some reason, the bug doesn't happen with the self-built jxcore-cordova. I used jxcore c2ca8b63216bf1d774f5c31ff4ab50b71d15e59a (current HEAD of master).

I can think of two possibilities:

  1. There is some delta between the jxcore version in jxcore-cordova 0.1.1 and the current HEAD of master
  2. The additional logging changes the timing of calls in a way that this bug doesn't occur

If you know the exact sha1 of the jxcore that is bundled with jxcore-cordova 0.1.1, I'll try once more with that one.

@vjrantal
Copy link
Member

We still see this very frequently with jxcore-cordova >= 0.1.0. This jxcore/jxcore#401 looked a bit related, but we are not calling stopEngine anywhere in our code.

@vjrantal
Copy link
Member

@obastemur: Since we can reliably reproduce this crash in our CI, it would be fairly easy to try out potential fix candidates if there would be any. We also now have the capability to build jxcore and jxcore-cordova ourselves so the fix proposals could include changes to them as well.

If possible, please let us know if you have any suggestions.

For clarity, the title of this issue implies that this happens on app startup (which was the case initially), but when we added workarounds for that, the crash may happen later on like in this example which is a CI run I had today:

I/jxcore-log( 8993): ok 282 constructor called with right args
I/jxcore-log( 8993): 
,I/jxcore-log( 8993): ok 283 match start arg
I/jxcore-log( 8993): 
I/jxcore-log( 8993): ok 284 start called once
I/jxcore-log( 8993): 
I/jxcore-log( 8993): ok 285 stop called once
I/jxcore-log( 8993): 
I/jxcore-log( 8993): ok 286 stop after start
I/jxcore-log( 8993): 
,I/jxcore-log( 8993): # setup
I/jxcore-log( 8993): 
,I/jxcore-log( 8993): # 82. two peers with empty DB, update the doc
I/jxcore-log( 8993): 
,I/jxcore-log( 8993): # teardown
I/jxcore-log( 8993): 
,F/libc    ( 8993): Fatal signal 11 (SIGSEGV), code 1, fault addr 0x0 in tid 9093 (Thread-1118)
,I/DEBUG   ( 2845): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
,I/DEBUG   ( 2845): Build fingerprint: 'samsung/treltexx/trelte:5.0.1/LRX22C/N910CXXU1BOE3:user/release-keys'
I/DEBUG   ( 2845): Revision: '21'
,I/DEBUG   ( 2845): ABI: 'arm'
,I/DEBUG   ( 2845): pid: 8993, tid: 9093, name: Thread-1118  >>> com.test.thalitest <<<
I/DEBUG   ( 2845): signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x0
,I/DEBUG   ( 2845):     r0 00000000  r1 96df63bc  r2 00000002  r3 00000000
,I/DEBUG   ( 2845):     r4 af97c900  r5 00000000  r6 af97c190  r7 8dc20000
I/DEBUG   ( 2845):     r8 af9bf67c  r9 af97c900  sl af97c168  fp 96df63b4
,I/DEBUG   ( 2845):     ip ffffffff  sp 96df63a4  lr 972d1b30  pc 972d1a34  cpsr 60070010
,I/DEBUG   ( 2845): 
I/DEBUG   ( 2845): backtrace:
I/DEBUG   ( 2845):     #00 pc 003d1a34  /data/app/com.test.thalitest-1/lib/arm/libjxcore.so (bool ThingIsPermanentAtom<JS::Symbol>(JS::Symbol*)+4)
,I/DEBUG   ( 2845):     #01 pc 003d1b2c  /data/app/com.test.thalitest-1/lib/arm/libjxcore.so
I/DEBUG   ( 2845):     #02 pc e5943000  <unknown>
,D/SSRM:n  ( 3526): SIOP:: AP = 270, PST = 262, CUR = -7

@yaronyg yaronyg removed this from the New Infra milestone Mar 19, 2016
@vjrantal vjrantal removed their assignment Mar 22, 2016
@yaronyg
Copy link
Member

yaronyg commented Apr 4, 2016

Note that others are seeing something very much like this in a scenario identical to ours - jxcore/jxcore-cordova#143

@yaronyg yaronyg added this to the V1 milestone Aug 3, 2016
@yaronyg
Copy link
Member

yaronyg commented Aug 18, 2016

We think this might be #22. Diego is investigating. We think something changed in Lollipop that causes this bug and is why we didn't see it in KitKat.

@yaronyg
Copy link
Member

yaronyg commented Aug 25, 2016

Diego will now try to reproduce #22 on Lollipop since we now know enough to know what to look for.

@yaronyg
Copy link
Member

yaronyg commented Aug 31, 2016

Diego is back and investigating and we hope to have a report tomorrow.

@yaronyg
Copy link
Member

yaronyg commented Sep 1, 2016

This does seem like a relative of #22. It's a seg fault that should have got caught at a lower level but makes its way to JXcore. So the #563 work around should handle it so we should have a PR on this tomorrow.

@yaronyg
Copy link
Member

yaronyg commented Sep 6, 2016

We have a work around that will keep the application from crashing but we haven't actually identified where the access violation is from so we still need to fix this. For now we are waiting to see if this will repo again and if so then we can try to fix it.

@enricogior
Copy link
Member

This is the same crash reported by #1592
It has been fixed in JXcore 0.3.1.10 and JXcore-Cordova 0.1.10.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

7 participants