Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce Oppia proto API v1: Android #1

Open
wants to merge 50 commits into
base: develop
Choose a base branch
from

Conversation

BenHenning
Copy link
Member

@BenHenning BenHenning commented Aug 6, 2021

Note that this PR was written largely in collaboration with @anandwana001 & @seanlip (see commit history for specific attribution if necessary),

Overview

This introduces the initial version of Oppia's backend proto API, which is currently only spec'd for the Android client.

Design details

Some things in particular to note:

  • This is introducing the request & response-based structures for both the topic list & individual topic content, based on the latest defined backend specification for Android support
  • The TopicSummary & related structures include some extra data that the client technically doesn't need for the preview experience: descriptions & thumbnails for stories, and skill IDs. This is primarily to (1) provide sufficient details to the client so that it can correctly download the content it needs, and (2) avoid the need for non-summary structures to be downloaded (all of the data needed for topics & stories are fully captured in these summary structures which simplifies the content download portions a bit).
  • Compatibility for matching against versions is designed based on the versioning scheme introduced here. See the next section in this comment for more specifics.
  • The content download API has pagination inherently built in. The client can request multiple structures (of the same or different types), and the backend has the choice of what to respond with. If, for example, the client requests 10 explorations, the backend may choose to only respond with 2 which will prompt the client to request the other 8 in a follow-up request.
  • Questions are being introduced to ensure they are at least defined, but they won't be used by the GA version of the Android app (per the current roadmap). That being said, they are worth noting because they behave a bit differently from the other structures. Since it's not known how many questions can be associated with a given topic, and that list could eventually be quite large (consider 20-100 skills per topic and dozens of questions per skill), the API is designed to provide the full list of questions separately based on a specific skill ID. This is done in three steps:
    1. First, by receiving the list of skills associated with a topic
    2. Second, by requesting the list of question IDs associated with a given skill (the client expects to receive the full list)
    3. Third, by requesting the questions themselves. This approach does not require pagination since, per the previous design point, the backend has full flexibility over how many questions it decides to provide.
  • The proto design here favor strong-typing aggressively (hence substantial duplication in the state interaction definitions) to reduce assumptions or validity checks being needed for clients that consume this API (at the expensive of having less generic interoperable code)
  • The repository is initially being set up to be Bazel compatible (the repository has been confirmed to build) for Java
  • The layout of the repository facilitates potentially supporting more than just the Android client in the future (or future versions of the Android client with evolved APIs)
  • The initial proto structure is very focused on the exact needs of the Android client, hence the structures are substantially reduced compared with the total data available for each structure in the backend

Data flow expectations

All data that the Android client needs to fetch is divided into two categories:

  • The metadata necessary to:
    • Show the topic list
    • Provide the user with enough context (topic details) to know whether they want to download/update a topic
    • Know whether a downloaded topic has an update available
    • Have sufficient technical context to ensure that topic downloads/updates are fully compatible with the local client
  • An explicit download/update as initiated by a user

The metadata is expected to always be automatically downloaded, and is a prerequisite to downloading or updating topics since it provides the necessary context for the client to make this availability determination.

Downloading/updating topic list metadata

Topic metadata provides all the context necessary for users to decide to download/update topics, and for the client to actually make this happen. At a high-level, this is:

  • Lists of topics, stories, chapters, and subtopics/revision cards
  • Versioning information needed to request specific topic content when downloading/updating (see below sections for specifics on versioning & content downloading)

Metadata is always automatically downloaded by the client when connectivity is available to ensure users have the most up-to-date context. However, there's one key challenge that needs to be accounted for in this design: some of the metadata is tied to versioned structures that can change in incompatible ways. For example, if a new incompatible chapter is added to a story, we don't want to display that chapter since the user wouldn't be able to download it, and this would lead to an inconsistency. Another such scenario is if a story's title is updated without the corresponding topic being downloaded.

To properly account for this, metadata is requested with the client's compatibility context so that the server can ensure that it's only providing compatible results. Further, the response for this metadata will distinguish between metadata for undownloaded topics vs. downloaded ones (so that titles & thumbnails aren't updated for topics that are downloaded until the topic itself is updated).

One other issue that crops up here is the time that the user may need to wait in the client for metadata to be available to display the homescreen. While this design is not stipulating frontend design, it is expected that the client may make use of several techniques to improve this experience, including:

  • Embedding an initial topic list that's up-to-date as of the time of that app version's launch
  • Downloading the latest topic list during sign-in to avoid needing to have a loading experience in the home screen
  • Never updating the topic list after the home screen is loaded until app restart (though this might not actually need to happen if the app only ever exposes update/download availability and never changes metadata for already-downloaded topics)

Downloading/updating topics (content downloading)

While directly user initiated, topics are downloaded/updated piecemeal. Topics consist of potentially thousands of files (dozens to ~hundreds of structures, and hundreds to ~thousands of images). An active design decision is to keep topics fragmented and put the onus for downloading each of these to the client. While this increases complexity, it has the key advantage of making topic downloads highly pausable/restartable by introducing more granularity when downloading the constituent pieces of topics.

The image part of topics are naturally piecemeal since they are separate statically GCS-hosted assets that must be individually downloaded. Structures, on the other hand, aren't quite as composed. To make that more viable, pagination & selection were key design choices in the proto API proposed here. In particular, the API is designed to:

  • Allow the client to limit how much it can download (based on actual data sizes rather than arbitrary limits)
  • Allow the client to select exactly the structures that it needs from a single endpoint rather than needing to bounce between multiple endpoints

It's expected that topic content endpoints provide exactly what's requested based on content & structure versions. See the versioning section below for specifics on how versioning is expected to work. Keep in mind that versioning information for topic content is retrieved via metadata (see the section above for more specifics).

Versioning

Versioning starts getting a bit complicated when considering the backend VersionedModels, the model structures themselves, schema structures, proto structures, the API, snapshots of the API, and other things. To try and keep things simple, this design intends to define a separate set of versions that are specific to this API. How these map to Oppia backend's many versions is an implementation detail not actually relevant to this design.

API version

The API itself is versioned and has separate releases. To keep things simple, it has a major & minor version (e.g. 1.0). The major version is represented directly in the directory structure (e.g. 'v1') since the repository could, in the future, house multiple major versions (which is necessary for the backend to maintain long-term backward compatibility with older API versions). The major version is only incremented if a breaking change must be introduced into the API (generally this will only come when we want to make substantial changes to the API/redesign it).

The API's minor version essentially counts the number of releases that major version has been changed. This means we increment the minor version anytime we make a compatible change to the API, and then release it. The main benefit of having distinct releases is it allows the backend & frontends to opt into taking on a particular version of the API. Due to the backward/forward interoperability nature of protobuf, it's unlikely there will ever be an incompatibility so long as structure versions are respected (see below), and even in those cases there should be potential for graceful failures. Note that the minor version is only ever represented in the release numbers and never in code form since it only serves the purpose of cataloging.

Content versions

Content versions are meant to be the version of the entity itself corresponding to a particular structure (such as a topic, skill, or exploration). These are used to track whether there are content updates for a particular structure. These are expected to map exactly to the versions stored in the corresponding structure's VersionedModel in the backend, but how this versioning behaves is up to the backend so long as it's monotonically increasing and positive.

Structure versions

Structure versions correspond to groups of proto structures themselves. These are unique versions to the proto structures and are not shared with schema structure versions in the backend. Generally, these only need to be incremented when a proto structure is updated in such a way that the default data will break existing logical assumptions (thus requiring a data migration). Proto structures should never be updated in an incompatible way (without incrementing the major version--see the API version section above), but sometimes data migrations are necessary and these versions help track when those need to occur. Unlike schema versions in the backend, these are likely to not increment as often since most proto changes can be safely ignored by older clients.

Note that one aspect of this is a bit complicated: some of the substructures are shared across high-level structures (such as SubtitledHtml, or other language-based substructures). To ensure these are properly tracked, some substructures are grouped into their own combined version to track changes. The following are the list of structure types whose versions are tracked:

  • TopicSummaryStructureVersion
  • RevisionCardStructureVersion
  • ConceptCardStructureVersion
  • ExplorationStructureVersion
  • QuestionStructureVersion
  • StateStructureVersion
  • LanguageStructuresVersion
  • ImageStructureVersion

A note on interaction versioning

I did want to briefly note that it was discovered during the creation of this PR & structure versions that separate versioning for interactions is not actually needed since, per the strong typing of State's substructures, the structures of individual interactions & how they relate to the exploration/question experience is implied as part of State's structure version. This does not mean the backend won't need additional version tracking for interactions in order to compute compatibility with specific proto State structure versions, but it does serendipitously simplify version tracking at the sub-State level a bit.

Open tasks

This includes the initial proto definitions for v1 of Oppia backend's
proto-based API. v1 only includes Android support.
@BenHenning
Copy link
Member Author

@anandwana001 @seanlip PTAL and let me know your initial thoughts. This isn't done yet (a lot of documentation still needs to be added for both the protos and the README), but this is the initial structure. I think it can potentially unblock the immediate next work until the PR can be checked in & an official v1.0 released.

Copy link
Member

@seanlip seanlip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @BenHenning for doing this! I took a first pass, and had some questions. PTAL when you get a chance.

proto/v1/api/android.proto Outdated Show resolved Hide resolved
proto/v1/api/android.proto Outdated Show resolved Hide resolved
proto/v1/api/android.proto Outdated Show resolved Hide resolved
proto/v1/api/android.proto Outdated Show resolved Hide resolved
proto/v1/api/android.proto Outdated Show resolved Hide resolved
repeated WorkedExample worked_examples = 5;
RecordedVoiceovers recorded_voiceovers = 6;
WrittenTranslations written_translations = 7;
ReferencedImageList referenced_image_list = 8;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wondering, why ReferencedImageList rather than repeated ReferencedImage (here and elsewhere)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the definition of an image list is a shared substructure, so its version is actually governed as part of ImageStructureVersion. My intent is to design for future iterations on how we represent images (which seems likely to change in coming years as we add more complex image editing in the exploration editor, etc.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure I follow, sorry -- wouldn't the same thing still work if you had repeated ReferencedImage and ReferencedImage still includes an ImageProtoVersion? I.e. what's an example of a future change that would affect a list of images that wouldn't just affect each image individually?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In your example the version would only handle how ReferencedImage is represented, not the fact that it's represented using a linear list. If we wanted to represent referenced images differently (for example, by mapping them to content or image IDs) then it would change ReferencedImageList but not the structures that contain it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Update: Chatted with Ben, I think the idea is that this might possibly change to e.g. a dict in the future. Note that, if this is necessary, the message name can be changed without issue, so it's not a problem that it's currently named "*List".]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though, changing the type will require a new field (and a migration by updating the proto version & having a routine to migrate lists to a map). We would deprecate the old field & add a new one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this now be considered resolved @seanlip?

proto/v1/structure/languages.proto Outdated Show resolved Hide resolved
message Voiceover {
string filename = 1;
int32 file_size_bytes = 2;
reserved 3; // needs_update
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably do want needs_update; any reason not to add it here?

(OK with keeping it reserved too, just wondered if you had any questions.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When would a client care about needs_update?

I actually meant to remove all reserved lines, I had just missed this file earlier.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs_update is for when the voiceover is a bit out of sync with the text, because the text has changed and the voiceover isn't updated yet.

(Up to you if you want to keep it in for this iteration or not. I don't have a strong preference.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have a clear UX flow for when that condition is hit. If anything, we'd consider not pushing updates that point to out-of-date voiceovers/text translations to the client in the first place if we're concerned with consistency (which is something we need to figure out as part of the versioning project). The fewer decisions the client has to make, the better I think.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had updated it. Removed all the reserved fields as per the Open task list.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this addressed now @seanlip?

proto/v1/structure/languages.proto Outdated Show resolved Hide resolved
@seanlip seanlip assigned BenHenning and unassigned seanlip Aug 8, 2021
Copy link

@anandwana001 anandwana001 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @BenHenning
Added few questions which I don't understand.

proto/v1/api/android.proto Outdated Show resolved Hide resolved
proto/v1/structure/state.proto Outdated Show resolved Hide resolved
proto/v1/structure/state.proto Outdated Show resolved Hide resolved
proto/v1/structure/state.proto Outdated Show resolved Hide resolved
proto/v1/structure/state.proto Outdated Show resolved Hide resolved
@anandwana001 anandwana001 removed their assignment Aug 9, 2021
Copy link
Member Author

@BenHenning BenHenning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @seanlip & @anandwana001. Just followed up.

Note that the changes aren't yet applied; I've just collected the remaining work items & added them to the opening comment.

proto/v1/api/android.proto Outdated Show resolved Hide resolved
proto/v1/api/android.proto Outdated Show resolved Hide resolved
proto/v1/api/android.proto Outdated Show resolved Hide resolved
proto/v1/api/android.proto Outdated Show resolved Hide resolved
repeated WorkedExample worked_examples = 5;
RecordedVoiceovers recorded_voiceovers = 6;
WrittenTranslations written_translations = 7;
ReferencedImageList referenced_image_list = 8;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the definition of an image list is a shared substructure, so its version is actually governed as part of ImageStructureVersion. My intent is to design for future iterations on how we represent images (which seems likely to change in coming years as we add more complex image editing in the exploration editor, etc.)

proto/v1/structure/state.proto Outdated Show resolved Hide resolved
proto/v1/structure/state.proto Outdated Show resolved Hide resolved
proto/v1/structure/state.proto Outdated Show resolved Hide resolved
proto/v1/api/android.proto Outdated Show resolved Hide resolved
proto/v1/api/android.proto Outdated Show resolved Hide resolved
proto/v1/api/android.proto Outdated Show resolved Hide resolved
proto/v1/api/android.proto Outdated Show resolved Hide resolved
repeated WorkedExample worked_examples = 5;
RecordedVoiceovers recorded_voiceovers = 6;
WrittenTranslations written_translations = 7;
ReferencedImageList referenced_image_list = 8;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite sure I follow, sorry -- wouldn't the same thing still work if you had repeated ReferencedImage and ReferencedImage still includes an ImageProtoVersion? I.e. what's an example of a future change that would affect a list of images that wouldn't just affect each image individually?

message Voiceover {
string filename = 1;
int32 file_size_bytes = 2;
reserved 3; // needs_update
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs_update is for when the voiceover is a bit out of sync with the text, because the text has changed and the voiceover isn't updated yet.

(Up to you if you want to keep it in for this iteration or not. I don't have a strong preference.)

option java_package = "org.oppia.v1.structure";
option java_multiple_files = true;

message TopicSummary {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note from discussion with Sean: this can't include things that can update immediately without a download.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update immediately without a download.

So, if we are removing those items which are not dependent on whether a topic is downloaded or not, then how could we show the topic preview?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A darn. I forgot part of the context here.

I think the issue is that changes to the stories in the topic shouldn't result in changes shown in the client (necessarily). This is actual a slight gap in our current product plan that needs more thought. We might need to change how we're approaching this design such that we download a very lightweight topic list by default, then download version-sensitive data separately. This will require us to block the home screen on a longer download operation (probably with a loading interstitial), or embed an initial topic list with all relevant data already included directly in the client (I really like this idea since it'll make startup much snappier & work even for older clients).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latest changes address this by updating the API to ensure the backend does not resend topic summary info for downloaded topics (though a failsafe needed to be added in case something goes wrong and the client still needs to fetch the topic summary). I think the latest version is more robust than the past implementation. I'm not 100% confident it covers all cases, but I think we cover about 95% of the cases (it'll become clearer as we dig into both the server & client implementations).

proto/v1/api/android.proto Outdated Show resolved Hide resolved
@BenHenning
Copy link
Member Author

FYI that I will need to follow up on this tomorrow.

@anandwana001
Copy link

Hi @BenHenning
Could you please check my last 2 commits:

  1. eabbdf2
  2. 937c883

I had added the Dto suffix and added 4 explorations in state.proto:

    AlgebraicExpressionInputInstance algebraic_expression_input = 10;
    MathEquationInputInstance MathEquationInput = 11;
    NumericExpressionInputInstance numeric_expression_input = 12;
    NumberWithUnitsInstance number_with_units = 13;

Let me know if these changes are correct.

@anandwana001
Copy link

I am guessing, the rules should only be *Spec and interaction name only be *Instance not the *Dto.
Let me know if this is correct thing I am thinking, I will update in next commit.

@BenHenning
Copy link
Member Author

BenHenning commented Dec 7, 2021

Thanks @anandwana001. To correct one thing: I think you mean 'interactions' not 'explorations.

Regarding the changes:

  • The 'DTO' change looks correct, but we should probably add a follow-up CI check (after this PR) to ensure all messages always end in DTO. Can you please update Introduce GitHub Actions #2 to include details for such a check? Also, if there are any other messages that aren't ending with 'Dto' then please make sure those are also updated.
  • We shouldn't be including number with units--that isn't supported by the app, and should be actually be removed from android_validation_constants
  • The new interactions & objects are missing doc strings--please add these (I suggest referencing the other similar fields in the proto file for reference)
  • I think your math interactions are missing some of the customization arguments (specifically, numeric expression has 'useFractionForDivision', math equation & algebraic expression inputs use both 'useFractionForDivision' and 'customOskLetters'); also, math equation & algebraic expression have an incorrect customization argument 'value'. If it helps, you should reference the interaction's definition files: https://github.com/oppia/oppia/tree/develop/extensions/interactions (e.g. https://github.com/oppia/oppia/blob/develop/extensions/interactions/MathEquationInput/MathEquationInput.py).
  • Please make sure field names always use snake_case; some don't
  • The RuleSpec input names could probably be clearer. Suggest 'math_expression' given the type used in the rule templates: https://github.com/oppia/oppia/blob/develop/extensions/interactions/rule_templates.json.
  • The actual rule specs that we define should correspond to the new PRD not the existing templates since we're changing all three interactions; there's no planned legacy support here so we should design against the final solution on the outset for the proto API

I am guessing, the rules should only be *Spec and interaction name only be *Instance not the *Dto. Let me know if this is correct thing I am thinking, I will update in next commit.

All messages should end with 'Dto' for consistency. I think this will help avoid any sorts of potential naming conflicts in both web & Android when they use the objects generated from these protos.

Copy link
Member

@seanlip seanlip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @BenHenning -- per our chat on Wednesday, I've done a full review pass on android.proto. PTAL and feel free to let me know if anything I wrote is unclear -- thanks!

//
// Another dimension of compatibility is the list of supported languages. For an optimal user
// experience, the backend won't provide the client with topics or topic updates that aren't fully
// localized to the languages expected by the client.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please clarify the phrase "languages expected by the client"? Specifically: (a) is it a user setting or is it determined solely by the version of the app? (b) if the former, will this be a list or a single language?

If the former case (user setting) -- most users will need/expect only a single language, I would think -- probably not more than two. But if they've, for some reason, browsed through 4-5, then will the unavailability of a topic for the languages they aren't actively using result in them not being able to download the topic?

option (org.oppia.proto.v1.versions.api_proto_version_type) = TOPIC_LIST_REQUEST_RESPONSE_PROTO_VERSION;

// The common proto version defining this request/response pair. Note that the response may be
// provided with a different version since it should match the requested proto version defined
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this line, does "version" refer to "proto version"?

Perhaps use "proto version" anywhere we're talking about proto versions, to eliminate ambiguity (since there are several different types of versions floating around).

Also, where is the "requested proto version defined below" -- are you referring to the ClientCompatibilityContextDto? If so, not sure it makes sense to call this a "requested proto version" since its docstring implies that it carries info about multiple versions.

// The current structures that are downloaded by the local client. Note that this must be a
// comprehensive list of downloaded structures in order to leverage certain consistency guarantees
// by the backend, including ensuring that structures are never re-downloaded or violate the
// one-version expectation (that is, only the latest compatible version of a shared cross-topic
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly is a "shared cross-topic structure"?

Depending on the answer to this question, I'm also wondering how we can enforce this, since there'd be local structures on the device not touched by the download operation -- would new downloads need to remain in sync with those somehow?


// The current structures that are downloaded by the local client. Note that this must be a
// comprehensive list of downloaded structures in order to leverage certain consistency guarantees
// by the backend, including ensuring that structures are never re-downloaded or violate the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean by "structures are never re-downloaded"? Do you mean within the same request/response cycle? (I thought the whole point of this system was that we can e.g. redownload a topic when it's upgraded.)

// structure is downloaded and never multiple versions of it).
//
// Note also that all returned structures are expected to be localized to English by default with
// optional additional language localizations available (as limited by the languages proto
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the "optional" in this sentence square with "the backend won't provide the client with topics or topic updates that aren't fully localized to the languages expected by the client"?

message VersionedStructureDetailsDto {
option (org.oppia.proto.v1.versions.api_proto_version_type) = TOPIC_LIST_REQUEST_RESPONSE_PROTO_VERSION;

// The backend-tracked and creator-affected content version of the structure. This is the primary
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use "server" instead of "backend" elsewhere in this file; should we standardize to "server" throughout for consistency?

// The backend-tracked and creator-affected content version of the structure. This is the primary
// indicator of how "new" a particular structure is, or whether it has updates available. This is
// always a monotonically increasing positive integer for newer versions.
uint32 content_version = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we ever want to include other fields in this structure, e.g. size, etc.?

The reason I ask is because I'm trying to understand conceptually what this structure is for. Is it meant to communicate info about content versions in particular? Or is it a general grab-bag for meta-information about a given structure?

}

// An identifier used to download a specific structure at a particular content version as part of
// the topic content API. It's expected that the server-provided structure will utilize the provided
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably strike the "provided" at the end of this line, I think it's not needed (it's inherent in "requested").

message DownloadRequestStructureIdentifierDto {
option (org.oppia.proto.v1.versions.api_proto_version_type) = TOPIC_CONTENT_REQUEST_RESPONSE_PROTO_VERSION;

// The content (creator-affected) version of the request. See the similar field in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest "The content version being requested." (this seems more direct)

// actually tied to any particular topics, and can be used across multiple topics. However, they are
// downloaded with the first topic that references them, and shouldn't be re-downloaded in
// subsequent topics (the backend makes guarantees to ensure that structures are never repeated).
message VersionedStructureDetailsDto {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth calling this something like LocalVersionedStructureDetailsDto if it is only meant to refer to stuff that's locally on the client (as opposed to stuff that's fetched from the server)?

Specifically:
- Adds buf & buildifier support (within Bazel) as actual tests, plus
fixes issues found by both.
- Restricts building to Bazel 4.0.0 (since this is what Oppia Android is
currently limited to).
- Significantly revamps how all things localization are stored (and
fully defines them for all structures).
- Finishes topic list request/response support.
- Changes how third-party dependencies are managed at the
repository-level for cleanliness.

There are still a few TODOs left to resolve, but the structures are much
closer to finished as of this commit (sans addressing any existing
comments on GitHub). These changes have been verified via a new asset
download script being built for Oppia Android's release pipeline.
"""

# Note to developers: Please keep this dict sorted by key to make it easier to find dependencies.
MAVEN_DEPENDENCY_VERSIONS = {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not actually maven.

@@ -0,0 +1,47 @@
"""
Defines Starlark macros that are used to set up dependency toolcahins needed to build the Oppia
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sp: toolchains

BenHenning and others added 5 commits June 2, 2023 19:15
Requests should be allowed to fail gracefully.

Also, fixes a small formatting inconsistency.
This is in preparation for introducing support in the Android app for
multiple learning classrooms.
This ensures that an older zlib isn't referenced per
bazelbuild/rules_proto#117.
BenHenning added a commit to oppia/oppia-android that referenced this pull request May 29, 2024
… for (multiple) classrooms & topic dependencies, and prepare for #4885 (#5398)

## Explanation
Fixes #1547
Fixes part of #169
Fixes part of #5344
Fixes part of #5365
Fixes part of #5411

The main purpose of this PR is to introduce support for multiple
classrooms in the data layer of the app (with minimal domain integration
to avoid the change extending beyond the lesson structures). However,
the PR is also doing a few more things including preparing the Android
codebase for introducing an asset download script (#4885) and other
peripheral cleanups of code (rather than updating it) that was noticed
along the way.

### Preparing for multiple classrooms

This addresses part of #5365.

#5344 is introducing support for multiple classrooms in the app. To help
prepare for those changes, this PR introduces the necessary data
structure and domain loading changes to load a new proto structure:

```proto
message ClassroomList {
  repeated ClassroomRecord classrooms = 1;
}
message ClassroomRecord {
  string id = 1;
  map<string, TranslationMapping> written_translations = 2;
  SubtitledHtml translatable_title = 3;
  LessonThumbnail classroom_thumbnail = 4;
  map<string, TopicIdList> topic_prerequisites = 5;
  message TopicIdList {
    repeated string topic_ids = 1;
  }
}
```

Rather than just a flat topic list. Some important details to note:
- The recommended topics structure has been updated to use this new
``topic_prerequisites`` value being loaded from classrooms. This will
also extend to production assets since the asset download script from
#4885 is also being updated to include support for this multiple
classrooms structure to address the remainder of #5365.
- To minimize domain changes, the new loading code assumes only **one**
classroom is present. TODOs have been added on #5344 to extend this to
support multiple classrooms.
- Current loading code (including for JSON) is ignoring all but:
``classrooms``, ``id``, and ``topic_prerequisites`` (including
``topic_ids``) from the protos above. These other fields are expected to
have supported added as part of #5344.
- There were some color simplifications made in ``TopicListController``.
These largely shouldn't have a major impact outside of developer assets.
These changes were made to ensure non-specificity to the previous lesson
classroom. In general, all of this should eventually be removed in favor
of loading colors from lesson assets, but that will need to wait until
the JSON pipeline is completely removed.

### Asset priming removal

This addresses part of #169.

``PrimeTopicAssetsController`` and its implementation were responsible
for hackily pre-loading all lesson images and audio to be on-device to
enable offline support. This was the first attempt at offline support
early in the app's development, but it had a few significant drawbacks:
- It required preloading everything upon first app open. Since it can
take a while for loading to occur, some robustness needed to be built in
for pausing, cancelling, and resuming. I'm not certain if these were
even 100% handled in the current implementation.
- It doesn't perform strong compatibility checks until you're in the app
which means lesson incompatibilities would just cause the app to get
stuck rather than addressing it during lesson import time (e.g. via an
asset downloader script).
- It required very significant workarounds to existing loading pipelines
that aren't ideal to keep in the codebase long-term (code smell).
- There's no guarantee the user even has enough disk space to download
all the needed assets (particularly audio), or if they'll have
sufficient internet connectivity & bandwidth to perform those downloads
upon first app open.

This approach was abandoned after the earliest alpha releases for an
asset download script (which is now being migrated over to this codebase
per #5411.

This removal unfortunately required removing a module that was
referenced in a lot of tests throughout the codebase. While the removal
itself was fairly simple, it does affect a lot of files.

Other areas changed (but unaffected by tests since these flows didn't
have automated tests):
- ``SplashActivityPresenter`` for enabling the downloader to start and
block the UI using a dialog box while the downloading occurred.
- ``AudioPlayerController` for removing the special loading logic for
primed audio files (the app now no longer supports loading audio files
from disk as we don't yet have a good long-term solution for
offline-available audio files due to their significant size).
- ``GlideImageLoader`` for removing support for loading locally cached
images (only through this flow; the flow we use for the asset download
script uses a different annotation and is still supported).

As alluded to above, two annotations were removed as part of this
cleanup:
- ``CacheAssetsLocally``: a boolean indicating whether the prime
download flow should be enabled.
- ``TopicListToCache``: this specified the list of topics to
pre-download if the flow was enabled.

### GAE structure cleanup & preparing for asset script

Preparing for #4885 included a few other changes, some of which were
nice-to-have:
- Introducing support for incorporating the protos from
https://github.com/oppia/oppia-proto-api (specifically
oppia/oppia-proto-api#1 since they are still a
work-in-progress).
- Adding ``java_proto`` versions for many of the app's proto structures
(since the download script runs in the JVM and doesn't use the javalite
proto environment).
- Removed all of the unused GAE services and models (addressing #1547 by
essentially making it obsolete), plus their mocks. These were never
hooked up, and they're never going to be: the long-term plan for the
codebase is to use new proto endpoints that will be added to Oppia web.
These Retrofit endpoints have actually been rebuilt and repurposed to be
used in the asset download script as a medium-term stop-gap until the
permanent proto endpoints can be added to Oppia web.
- ``RemoteAuthNetworkInterceptorTest`` was updated to use a different
example service since ``TopicService`` has been removed. The services
for platform parameters and user feedback are being kept.
- Some test file & KDoc exemptions have been removed since their
corresponding files have been deleted.

Note that the new Java proto targets & oppia-proto-api targets aren't
being used yet, but they will be in future PRs. This just provides the
basis of support for the asset download script while also helping to
split up the code to review across multiple PRs.

## Essential Checklist
- [x] The PR title and explanation each start with "Fix #bugnum: " (If
this PR fixes part of an issue, prefix the title with "Fix part of
#bugnum: ...".)
- [x] Any changes to
[scripts/assets](https://github.com/oppia/oppia-android/tree/develop/scripts/assets)
files have their rationale included in the PR explanation.
- [x] The PR follows the [style
guide](https://github.com/oppia/oppia-android/wiki/Coding-style-guide).
- [x] The PR does not contain any unnecessary code changes from Android
Studio
([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#undo-unnecessary-changes)).
- [x] The PR is made from a branch that's **not** called "develop" and
is up-to-date with "develop".
- [x] The PR is **assigned** to the appropriate reviewers
([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#clarification-regarding-assignees-and-reviewers-section)).

## For UI-specific PRs only
This is essentially only data infrastructural except for a couple of
points:
- Topic loading is now happening through a classrooms structure rather
than a tropic ID list. Since there's only one test & one production
classroom, this shouldn't actually change the UX.
- An at-app-open flow for predownloading image & audio assets has been
removed. This hasn't been used since the very earliest alpha releases of
the app, so it won't actually affect any users.
- Some colors for developer lesson and topic thumbnails were
updated--see above.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants