Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #5093: Fix start-up crash on SDK versions <24 #5291

Merged
merged 47 commits into from
Mar 1, 2024

Conversation

BenHenning
Copy link
Member

@BenHenning BenHenning commented Dec 28, 2023

Explanation

Fixes #5093

This replaces #5211.

The core crash has come from using Java 8's Map.computeIfAbsent on devices running SDK <24. This function isn't supported on these SDK versions because:

  1. Java 8 isn't fully supported until SDK 24 (https://stackoverflow.com/a/35934691).
  2. While the build toolchain "desugars" Java 8 syntax and some APIs to provide support for Java 8 before SDK 24, not all functionality is present. While https://developer.android.com/studio/write/java8-support-table suggests that computeIfAbsent should be present, it's not clear why it isn't (it could be an out-of-date desugarer, or some other issue).

Only two uses for computeIfAbsent are present:

  • AndroidLocaleFactory: the main cause of the crash.
  • FakeAssetRepository: only used in tests, so ignored in this change (since it would require changing the production API for AssetRepository to fix).

A new regex pattern check was added to ensure this method can't inadvertently be used in the future since it won't work on lower API versions.

This PR mainly focuses on fixing AndroidLocaleFactory.

Need to review the code & finish it, and document the changes. #5211 explored different methods that we could take to keep synchronization when updating the factory to move away from computeIfAbsent, but all of them required introducing locking mechanisms which could cause deadlocking. I also considered in this PR removing the memoization altogether, but this doesn't seem ideal either since the creation of profiles is non-trivial and locales are frequently fetched by nearly all core lesson data providers throughout the app. I landed on a solution that leveraged the app's blocking dispatcher and made profile creation asynchronous. This has some specific implications:

  • Technically a "live lock" is still possible if the blocking dispatcher is starved. A better solution would be to use an actor-like pattern and funnel changes through the background dispatcher, but that's out of scope for this change and represents a problem with all uses of the blocking dispatcher.
  • Creations of DisplayLocaleImpl get more complicated since creation of the Locale is now asynchronous. This was adjusted by passing in the display locale's Locale object rather than having it compute it, and callsites often already are operating within a coroutine context which makes the asynchronous aspect of Locale creation a bit simpler.
  • Two cases where asynchronous creation cannot be used are the edge cases of initial app startup failing to fetching & create the locale in a timely manner, and initializing locale for all activities prior to the current locale being fetched. In both cases, the factory was updated to create a non-memoized Locale object specifically for these purposes. This method should only be used when necessary since it avoids the performance benefits of the memoized version.
  • [BUG]: App can apparently crash due to an incompatible IETF BCP-47 language ID #5046 might be addressed since one plausible cause for that observed issue is a root locale being used, or a region-only locale (both of which should now be handled per this PR).

During development, there were two other changes that were made to ease development:

  • AndroidLocaleProfile was refactored to use a sealed class to improve its typing, and to better facilitate the introduction of a root profile (which was added for better system compatibility in cases when the default app locale isn't supported on the system). Note that the new root profile only applies to app strings since it can actually have correct default behavior (whereas for content strings & audio voiceovers it can't actually be used effectively). The profile class was also updated to compute its own Locale (which is a simplification since before there were multiple ways to create a "correct" Locale), and have an application-injectable factory.
  • DataProviders.transformAsync was updated to handle caught exceptions in the same way as DataProviders.transform which helped while debugging the crash, and seems like it would have downstream benefits in the future. The method's tests were correspondingly updated.

The changes here led to some testing changes:

  • testCreateDisplayLocaleImpl_defaultInstance_hasDefaultInstanceContext was removed since using the default context isn't valid in most cases (see below point).
  • testReconstituteDisplayLocale_defaultContext_returnsDisplayLocaleForContext was renamed & a new test added to better represent that the default context is invalid except for app strings where it can represent root locale. For non-app cases, an exception should be thrown for default. This is also why testCreateLocale_appStrings_allIncompat_invalidLangType_throwsException was updated to instead verify that the root locale is returned.
  • In TranslationControllerTest, testGetSystemLanguageLocale_rootLocale_returnsLocaleWithBlankContext and testGetAppLanguageLocale_uninitialized_returnsLocaleWithSystemLanguage were updated to correctly indicate that there is no specified language for the root locale cases. To ensure coverage for valid IETF BCP-47 & Android resource language IDs, a new test was added: testGetAppLanguageLocale_ptBrDefLocale_returnsLocaleWithIetfAndAndroidResourcesLangIds.

Finally, please note that this is fixing a release blocking issue for the upcoming 0.13 beta release of the app.

Essential Checklist

  • The PR title and explanation each start with "Fix #bugnum: " (If this PR fixes part of an issue, prefix the title with "Fix part of #bugnum: ...".)
  • Any changes to scripts/assets files have their rationale included in the PR explanation.
  • The PR follows the style guide.
  • The PR does not contain any unnecessary code changes from Android Studio (reference).
  • The PR is made from a branch that's not called "develop" and is up-to-date with "develop".
  • The PR is assigned to the appropriate reviewers (reference).

For UI-specific PRs only

This is mainly an infrastructure change. The only user behavior impacted is that the app no longer crashes on startup on SDK versions below 24.

Attempting to open the app on an SDK 23 emulator on develop (4a07d8d):

open_app_without_fix_smaller

Attempting to open the app with the fixes from this branch:
open_app_with_fix_smaller

BenHenning and others added 9 commits August 22, 2023 01:30
These issues were found after I started using a new development
environment.
ProfileAndDeviceIdFragmentTest had been updated to use a newer fragment
initialization pattern, but that's no longer needed and seems to be
causing what appears to be timing discrepancies between local dev and
CI.
The issue ultimately arose from test parameters being initialized after
they're needed in the launched UI. This type of change was tried earlier
in the branch, but reverted since it didn't seem necessary. It is,
however, necessary when there are environment differences (e.g. local
vs. CI) or when running certain tests individually.

Due to the difficulty in finding this issue, ActivityScenarioRule has
been added as a prohibited pattern in the static regex checks (along
with ActivityTestRule since that's deprecated and discouraged, anyway).
The test was suffering from some proto encoding inconsistencies that
seem to occur between some development machines vs. on CI. The fix
improves the test's robustness by extracting the raw encoded string,
verifying that the other outputs in the intent message correctly
correspond to that string, and that the string (as a parsed proto)
contains the correct values. As a result, the test no longer depends on
a hardcoded encoding value to be present for verification. This does
result in a bit more logic than is generally good to have in a test (and
it lengthened the test code quite a bit), but it seems necessary in this
particular case.
The main problem was the use of computeIfAbsent() which is a Java 8 API
that does NOT desugar during building, and Java 8 APIs are only
available on SDK 24+.

The actual fix introduced a lot of complexity when working with locale
profiles since the utilization of memoized locale profiles now requires
using a coroutine to ensure thread safety (i.e. by leveraging a blocking
dispatcher).

This introduced some complexity in the process for creating locales
since a livelock was possible when trying to use the blocking dispatcher
for the system locale (since a blocking dispatcher is used farther up in
its call chain).

Due to these changes, a refactor of AndroidLocaleProfile has been
included to largely simplify a lot of the previous complexity around the
different states the class could take (these are now explicitly defined
as separate subclasses to the now sealed class AndroidLocaleProfile).

DataProviders.transformAsync() has also been updated to handle caught
exceptions in much the same way as transform(). This helped during
debugging the original crash, so it seems like a good idea to include.
Copy link

oppiabot bot commented Jan 4, 2024

Hi @BenHenning, I'm going to mark this PR as stale because it hasn't had any updates for 7 days. If no further activity occurs within 7 days, it will be automatically closed so that others can take up the issue.
If you are still working on this PR, please make a follow-up commit within 3 days (and submit it for review, if applicable). Please also let us know if you are stuck so we can help you!

@oppiabot oppiabot bot added the stale Corresponds to items that haven't seen a recent update and may be automatically closed. label Jan 4, 2024
@oppiabot oppiabot bot closed this Jan 11, 2024
@Vishwajith-Shettigar Vishwajith-Shettigar removed the stale Corresponds to items that haven't seen a recent update and may be automatically closed. label Jan 12, 2024
The main change here is ensuring that Bazel 4.0.0 is used & bzlmod
disabled in newer versions of Bazel when running operations in a test
Bazel environment.

This commit also introduces some more timing tweaks on CommandExecutor
for some tests, though these only affect very specific tests (as many
script tests directly call a script's main() function and thus don't
overwrite its executor behavior).

This commit attempted to introduce "--batch" mode to runs, but the
isolation didn't actually seem to improve stability and, instead,
substantially slowed down some of the tests.
@BenHenning BenHenning reopened this Jan 26, 2024
@BenHenning BenHenning self-assigned this Jan 26, 2024
Copy link

oppiabot bot commented Feb 2, 2024

Hi @BenHenning, I'm going to mark this PR as stale because it hasn't had any updates for 7 days. If no further activity occurs within 7 days, it will be automatically closed so that others can take up the issue.
If you are still working on this PR, please make a follow-up commit within 3 days (and submit it for review, if applicable). Please also let us know if you are stuck so we can help you!

@oppiabot oppiabot bot added the stale Corresponds to items that haven't seen a recent update and may be automatically closed. label Feb 2, 2024
@oppiabot oppiabot bot removed the stale Corresponds to items that haven't seen a recent update and may be automatically closed. label Feb 6, 2024
This was done by removing the //testing dependency and, instead, having
instrumentation targets depend on the direct module within //testing
that they need to build. This module & its corresponding implementation
binding (and tests) needed to be moved out of //testing and into their
own /firebase package.
@BenHenning BenHenning requested review from a team as code owners February 9, 2024 22:08
@BenHenning BenHenning requested review from adhiamboperes and removed request for a team February 9, 2024 22:08
@BenHenning BenHenning changed the base branch from fix-platform-specific-issues to develop February 9, 2024 22:09
@BenHenning BenHenning changed the base branch from develop to fix-platform-specific-issues February 9, 2024 22:09
@BenHenning BenHenning linked an issue Feb 9, 2024 that may be closed by this pull request
@BenHenning
Copy link
Member Author

PTAL @adhiamboperes.

Base automatically changed from fix-platform-specific-issues to develop February 14, 2024 21:13
@BenHenning BenHenning changed the title Fix #5093: Fix start-up crash on SDK versions <24 [Blocked: #5138] Fix #5093: Fix start-up crash on SDK versions <24 Feb 14, 2024
Copy link
Collaborator

@adhiamboperes adhiamboperes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @BenHenning! I have left some thoughts.

Copy link
Member Author

@BenHenning BenHenning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Self-reviewed the latest changes & addressed follow-up comments.

@BenHenning
Copy link
Member Author

Thanks @adhiamboperes! PTAL at latest.

Copy link
Collaborator

@adhiamboperes adhiamboperes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, Thanks @BenHenning!

@adhiamboperes adhiamboperes merged commit 0bc5e43 into develop Mar 1, 2024
41 checks passed
@adhiamboperes adhiamboperes deleted the fix-api-21-22-23-crash branch March 1, 2024 09:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG]:Oppia-Android app not opening on the Android 6.0.1 Marshmello
3 participants