-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce Oppia proto API v1: Android #1
base: develop
Are you sure you want to change the base?
Conversation
This includes the initial proto definitions for v1 of Oppia backend's proto-based API. v1 only includes Android support.
@anandwana001 @seanlip PTAL and let me know your initial thoughts. This isn't done yet (a lot of documentation still needs to be added for both the protos and the README), but this is the initial structure. I think it can potentially unblock the immediate next work until the PR can be checked in & an official v1.0 released. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @BenHenning for doing this! I took a first pass, and had some questions. PTAL when you get a chance.
repeated WorkedExample worked_examples = 5; | ||
RecordedVoiceovers recorded_voiceovers = 6; | ||
WrittenTranslations written_translations = 7; | ||
ReferencedImageList referenced_image_list = 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wondering, why ReferencedImageList
rather than repeated ReferencedImage
(here and elsewhere)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the definition of an image list is a shared substructure, so its version is actually governed as part of ImageStructureVersion. My intent is to design for future iterations on how we represent images (which seems likely to change in coming years as we add more complex image editing in the exploration editor, etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure I follow, sorry -- wouldn't the same thing still work if you had repeated ReferencedImage
and ReferencedImage still includes an ImageProtoVersion? I.e. what's an example of a future change that would affect a list of images that wouldn't just affect each image individually?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In your example the version would only handle how ReferencedImage is represented, not the fact that it's represented using a linear list. If we wanted to represent referenced images differently (for example, by mapping them to content or image IDs) then it would change ReferencedImageList but not the structures that contain it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Update: Chatted with Ben, I think the idea is that this might possibly change to e.g. a dict in the future. Note that, if this is necessary, the message name can be changed without issue, so it's not a problem that it's currently named "*List".]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though, changing the type will require a new field (and a migration by updating the proto version & having a routine to migrate lists to a map). We would deprecate the old field & add a new one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this now be considered resolved @seanlip?
proto/v1/structure/languages.proto
Outdated
message Voiceover { | ||
string filename = 1; | ||
int32 file_size_bytes = 2; | ||
reserved 3; // needs_update |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably do want needs_update; any reason not to add it here?
(OK with keeping it reserved
too, just wondered if you had any questions.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would a client care about needs_update?
I actually meant to remove all reserved lines, I had just missed this file earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs_update is for when the voiceover is a bit out of sync with the text, because the text has changed and the voiceover isn't updated yet.
(Up to you if you want to keep it in for this iteration or not. I don't have a strong preference.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have a clear UX flow for when that condition is hit. If anything, we'd consider not pushing updates that point to out-of-date voiceovers/text translations to the client in the first place if we're concerned with consistency (which is something we need to figure out as part of the versioning project). The fewer decisions the client has to make, the better I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had updated it. Removed all the reserved fields as per the Open task list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this addressed now @seanlip?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @BenHenning
Added few questions which I don't understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @seanlip & @anandwana001. Just followed up.
Note that the changes aren't yet applied; I've just collected the remaining work items & added them to the opening comment.
repeated WorkedExample worked_examples = 5; | ||
RecordedVoiceovers recorded_voiceovers = 6; | ||
WrittenTranslations written_translations = 7; | ||
ReferencedImageList referenced_image_list = 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the definition of an image list is a shared substructure, so its version is actually governed as part of ImageStructureVersion. My intent is to design for future iterations on how we represent images (which seems likely to change in coming years as we add more complex image editing in the exploration editor, etc.)
repeated WorkedExample worked_examples = 5; | ||
RecordedVoiceovers recorded_voiceovers = 6; | ||
WrittenTranslations written_translations = 7; | ||
ReferencedImageList referenced_image_list = 8; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure I follow, sorry -- wouldn't the same thing still work if you had repeated ReferencedImage
and ReferencedImage still includes an ImageProtoVersion? I.e. what's an example of a future change that would affect a list of images that wouldn't just affect each image individually?
proto/v1/structure/languages.proto
Outdated
message Voiceover { | ||
string filename = 1; | ||
int32 file_size_bytes = 2; | ||
reserved 3; // needs_update |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs_update is for when the voiceover is a bit out of sync with the text, because the text has changed and the voiceover isn't updated yet.
(Up to you if you want to keep it in for this iteration or not. I don't have a strong preference.)
option java_package = "org.oppia.v1.structure"; | ||
option java_multiple_files = true; | ||
|
||
message TopicSummary { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note from discussion with Sean: this can't include things that can update immediately without a download.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
update immediately without a download.
So, if we are removing those items which are not dependent on whether a topic is downloaded or not, then how could we show the topic preview?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A darn. I forgot part of the context here.
I think the issue is that changes to the stories in the topic shouldn't result in changes shown in the client (necessarily). This is actual a slight gap in our current product plan that needs more thought. We might need to change how we're approaching this design such that we download a very lightweight topic list by default, then download version-sensitive data separately. This will require us to block the home screen on a longer download operation (probably with a loading interstitial), or embed an initial topic list with all relevant data already included directly in the client (I really like this idea since it'll make startup much snappier & work even for older clients).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Latest changes address this by updating the API to ensure the backend does not resend topic summary info for downloaded topics (though a failsafe needed to be added in case something goes wrong and the client still needs to fetch the topic summary). I think the latest version is more robust than the past implementation. I'm not 100% confident it covers all cases, but I think we cover about 95% of the cases (it'll become clearer as we dig into both the server & client implementations).
FYI that I will need to follow up on this tomorrow. |
Hi @BenHenning I had added the Dto suffix and added 4 explorations in state.proto:
Let me know if these changes are correct. |
I am guessing, the rules should only be *Spec and interaction name only be *Instance not the *Dto. |
Thanks @anandwana001. To correct one thing: I think you mean 'interactions' not 'explorations. Regarding the changes:
All messages should end with 'Dto' for consistency. I think this will help avoid any sorts of potential naming conflicts in both web & Android when they use the objects generated from these protos. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @BenHenning -- per our chat on Wednesday, I've done a full review pass on android.proto. PTAL and feel free to let me know if anything I wrote is unclear -- thanks!
// | ||
// Another dimension of compatibility is the list of supported languages. For an optimal user | ||
// experience, the backend won't provide the client with topics or topic updates that aren't fully | ||
// localized to the languages expected by the client. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please clarify the phrase "languages expected by the client"? Specifically: (a) is it a user setting or is it determined solely by the version of the app? (b) if the former, will this be a list or a single language?
If the former case (user setting) -- most users will need/expect only a single language, I would think -- probably not more than two. But if they've, for some reason, browsed through 4-5, then will the unavailability of a topic for the languages they aren't actively using result in them not being able to download the topic?
option (org.oppia.proto.v1.versions.api_proto_version_type) = TOPIC_LIST_REQUEST_RESPONSE_PROTO_VERSION; | ||
|
||
// The common proto version defining this request/response pair. Note that the response may be | ||
// provided with a different version since it should match the requested proto version defined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this line, does "version" refer to "proto version"?
Perhaps use "proto version" anywhere we're talking about proto versions, to eliminate ambiguity (since there are several different types of versions floating around).
Also, where is the "requested proto version defined below" -- are you referring to the ClientCompatibilityContextDto? If so, not sure it makes sense to call this a "requested proto version" since its docstring implies that it carries info about multiple versions.
// The current structures that are downloaded by the local client. Note that this must be a | ||
// comprehensive list of downloaded structures in order to leverage certain consistency guarantees | ||
// by the backend, including ensuring that structures are never re-downloaded or violate the | ||
// one-version expectation (that is, only the latest compatible version of a shared cross-topic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What exactly is a "shared cross-topic structure"?
Depending on the answer to this question, I'm also wondering how we can enforce this, since there'd be local structures on the device not touched by the download operation -- would new downloads need to remain in sync with those somehow?
|
||
// The current structures that are downloaded by the local client. Note that this must be a | ||
// comprehensive list of downloaded structures in order to leverage certain consistency guarantees | ||
// by the backend, including ensuring that structures are never re-downloaded or violate the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by "structures are never re-downloaded"? Do you mean within the same request/response cycle? (I thought the whole point of this system was that we can e.g. redownload a topic when it's upgraded.)
org/oppia/proto/v1/api/android.proto
Outdated
// structure is downloaded and never multiple versions of it). | ||
// | ||
// Note also that all returned structures are expected to be localized to English by default with | ||
// optional additional language localizations available (as limited by the languages proto |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the "optional" in this sentence square with "the backend won't provide the client with topics or topic updates that aren't fully localized to the languages expected by the client"?
message VersionedStructureDetailsDto { | ||
option (org.oppia.proto.v1.versions.api_proto_version_type) = TOPIC_LIST_REQUEST_RESPONSE_PROTO_VERSION; | ||
|
||
// The backend-tracked and creator-affected content version of the structure. This is the primary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use "server" instead of "backend" elsewhere in this file; should we standardize to "server" throughout for consistency?
// The backend-tracked and creator-affected content version of the structure. This is the primary | ||
// indicator of how "new" a particular structure is, or whether it has updates available. This is | ||
// always a monotonically increasing positive integer for newer versions. | ||
uint32 content_version = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would we ever want to include other fields in this structure, e.g. size, etc.?
The reason I ask is because I'm trying to understand conceptually what this structure is for. Is it meant to communicate info about content versions in particular? Or is it a general grab-bag for meta-information about a given structure?
} | ||
|
||
// An identifier used to download a specific structure at a particular content version as part of | ||
// the topic content API. It's expected that the server-provided structure will utilize the provided |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably strike the "provided" at the end of this line, I think it's not needed (it's inherent in "requested").
message DownloadRequestStructureIdentifierDto { | ||
option (org.oppia.proto.v1.versions.api_proto_version_type) = TOPIC_CONTENT_REQUEST_RESPONSE_PROTO_VERSION; | ||
|
||
// The content (creator-affected) version of the request. See the similar field in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest "The content version being requested." (this seems more direct)
// actually tied to any particular topics, and can be used across multiple topics. However, they are | ||
// downloaded with the first topic that references them, and shouldn't be re-downloaded in | ||
// subsequent topics (the backend makes guarantees to ensure that structures are never repeated). | ||
message VersionedStructureDetailsDto { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth calling this something like LocalVersionedStructureDetailsDto if it is only meant to refer to stuff that's locally on the client (as opposed to stuff that's fetched from the server)?
This also removes a field that shouldn't be in one of the math equation specs.
Specifically: - Adds buf & buildifier support (within Bazel) as actual tests, plus fixes issues found by both. - Restricts building to Bazel 4.0.0 (since this is what Oppia Android is currently limited to). - Significantly revamps how all things localization are stored (and fully defines them for all structures). - Finishes topic list request/response support. - Changes how third-party dependencies are managed at the repository-level for cleanliness. There are still a few TODOs left to resolve, but the structures are much closer to finished as of this commit (sans addressing any existing comments on GitHub). These changes have been verified via a new asset download script being built for Oppia Android's release pipeline.
""" | ||
|
||
# Note to developers: Please keep this dict sorted by key to make it easier to find dependencies. | ||
MAVEN_DEPENDENCY_VERSIONS = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not actually maven.
@@ -0,0 +1,47 @@ | |||
""" | |||
Defines Starlark macros that are used to set up dependency toolcahins needed to build the Oppia |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sp: toolchains
Requests should be allowed to fail gracefully. Also, fixes a small formatting inconsistency.
This is in preparation for introducing support in the Android app for multiple learning classrooms.
This ensures that an older zlib isn't referenced per bazelbuild/rules_proto#117.
… for (multiple) classrooms & topic dependencies, and prepare for #4885 (#5398) ## Explanation Fixes #1547 Fixes part of #169 Fixes part of #5344 Fixes part of #5365 Fixes part of #5411 The main purpose of this PR is to introduce support for multiple classrooms in the data layer of the app (with minimal domain integration to avoid the change extending beyond the lesson structures). However, the PR is also doing a few more things including preparing the Android codebase for introducing an asset download script (#4885) and other peripheral cleanups of code (rather than updating it) that was noticed along the way. ### Preparing for multiple classrooms This addresses part of #5365. #5344 is introducing support for multiple classrooms in the app. To help prepare for those changes, this PR introduces the necessary data structure and domain loading changes to load a new proto structure: ```proto message ClassroomList { repeated ClassroomRecord classrooms = 1; } message ClassroomRecord { string id = 1; map<string, TranslationMapping> written_translations = 2; SubtitledHtml translatable_title = 3; LessonThumbnail classroom_thumbnail = 4; map<string, TopicIdList> topic_prerequisites = 5; message TopicIdList { repeated string topic_ids = 1; } } ``` Rather than just a flat topic list. Some important details to note: - The recommended topics structure has been updated to use this new ``topic_prerequisites`` value being loaded from classrooms. This will also extend to production assets since the asset download script from #4885 is also being updated to include support for this multiple classrooms structure to address the remainder of #5365. - To minimize domain changes, the new loading code assumes only **one** classroom is present. TODOs have been added on #5344 to extend this to support multiple classrooms. - Current loading code (including for JSON) is ignoring all but: ``classrooms``, ``id``, and ``topic_prerequisites`` (including ``topic_ids``) from the protos above. These other fields are expected to have supported added as part of #5344. - There were some color simplifications made in ``TopicListController``. These largely shouldn't have a major impact outside of developer assets. These changes were made to ensure non-specificity to the previous lesson classroom. In general, all of this should eventually be removed in favor of loading colors from lesson assets, but that will need to wait until the JSON pipeline is completely removed. ### Asset priming removal This addresses part of #169. ``PrimeTopicAssetsController`` and its implementation were responsible for hackily pre-loading all lesson images and audio to be on-device to enable offline support. This was the first attempt at offline support early in the app's development, but it had a few significant drawbacks: - It required preloading everything upon first app open. Since it can take a while for loading to occur, some robustness needed to be built in for pausing, cancelling, and resuming. I'm not certain if these were even 100% handled in the current implementation. - It doesn't perform strong compatibility checks until you're in the app which means lesson incompatibilities would just cause the app to get stuck rather than addressing it during lesson import time (e.g. via an asset downloader script). - It required very significant workarounds to existing loading pipelines that aren't ideal to keep in the codebase long-term (code smell). - There's no guarantee the user even has enough disk space to download all the needed assets (particularly audio), or if they'll have sufficient internet connectivity & bandwidth to perform those downloads upon first app open. This approach was abandoned after the earliest alpha releases for an asset download script (which is now being migrated over to this codebase per #5411. This removal unfortunately required removing a module that was referenced in a lot of tests throughout the codebase. While the removal itself was fairly simple, it does affect a lot of files. Other areas changed (but unaffected by tests since these flows didn't have automated tests): - ``SplashActivityPresenter`` for enabling the downloader to start and block the UI using a dialog box while the downloading occurred. - ``AudioPlayerController` for removing the special loading logic for primed audio files (the app now no longer supports loading audio files from disk as we don't yet have a good long-term solution for offline-available audio files due to their significant size). - ``GlideImageLoader`` for removing support for loading locally cached images (only through this flow; the flow we use for the asset download script uses a different annotation and is still supported). As alluded to above, two annotations were removed as part of this cleanup: - ``CacheAssetsLocally``: a boolean indicating whether the prime download flow should be enabled. - ``TopicListToCache``: this specified the list of topics to pre-download if the flow was enabled. ### GAE structure cleanup & preparing for asset script Preparing for #4885 included a few other changes, some of which were nice-to-have: - Introducing support for incorporating the protos from https://github.com/oppia/oppia-proto-api (specifically oppia/oppia-proto-api#1 since they are still a work-in-progress). - Adding ``java_proto`` versions for many of the app's proto structures (since the download script runs in the JVM and doesn't use the javalite proto environment). - Removed all of the unused GAE services and models (addressing #1547 by essentially making it obsolete), plus their mocks. These were never hooked up, and they're never going to be: the long-term plan for the codebase is to use new proto endpoints that will be added to Oppia web. These Retrofit endpoints have actually been rebuilt and repurposed to be used in the asset download script as a medium-term stop-gap until the permanent proto endpoints can be added to Oppia web. - ``RemoteAuthNetworkInterceptorTest`` was updated to use a different example service since ``TopicService`` has been removed. The services for platform parameters and user feedback are being kept. - Some test file & KDoc exemptions have been removed since their corresponding files have been deleted. Note that the new Java proto targets & oppia-proto-api targets aren't being used yet, but they will be in future PRs. This just provides the basis of support for the asset download script while also helping to split up the code to review across multiple PRs. ## Essential Checklist - [x] The PR title and explanation each start with "Fix #bugnum: " (If this PR fixes part of an issue, prefix the title with "Fix part of #bugnum: ...".) - [x] Any changes to [scripts/assets](https://github.com/oppia/oppia-android/tree/develop/scripts/assets) files have their rationale included in the PR explanation. - [x] The PR follows the [style guide](https://github.com/oppia/oppia-android/wiki/Coding-style-guide). - [x] The PR does not contain any unnecessary code changes from Android Studio ([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#undo-unnecessary-changes)). - [x] The PR is made from a branch that's **not** called "develop" and is up-to-date with "develop". - [x] The PR is **assigned** to the appropriate reviewers ([reference](https://github.com/oppia/oppia-android/wiki/Guidance-on-submitting-a-PR#clarification-regarding-assignees-and-reviewers-section)). ## For UI-specific PRs only This is essentially only data infrastructural except for a couple of points: - Topic loading is now happening through a classrooms structure rather than a tropic ID list. Since there's only one test & one production classroom, this shouldn't actually change the UX. - An at-app-open flow for predownloading image & audio assets has been removed. This hasn't been used since the very earliest alpha releases of the app, so it won't actually affect any users. - Some colors for developer lesson and topic thumbnails were updated--see above.
Note that this PR was written largely in collaboration with @anandwana001 & @seanlip (see commit history for specific attribution if necessary),
Overview
This introduces the initial version of Oppia's backend proto API, which is currently only spec'd for the Android client.
Design details
Some things in particular to note:
Data flow expectations
All data that the Android client needs to fetch is divided into two categories:
The metadata is expected to always be automatically downloaded, and is a prerequisite to downloading or updating topics since it provides the necessary context for the client to make this availability determination.
Downloading/updating topic list metadata
Topic metadata provides all the context necessary for users to decide to download/update topics, and for the client to actually make this happen. At a high-level, this is:
Metadata is always automatically downloaded by the client when connectivity is available to ensure users have the most up-to-date context. However, there's one key challenge that needs to be accounted for in this design: some of the metadata is tied to versioned structures that can change in incompatible ways. For example, if a new incompatible chapter is added to a story, we don't want to display that chapter since the user wouldn't be able to download it, and this would lead to an inconsistency. Another such scenario is if a story's title is updated without the corresponding topic being downloaded.
To properly account for this, metadata is requested with the client's compatibility context so that the server can ensure that it's only providing compatible results. Further, the response for this metadata will distinguish between metadata for undownloaded topics vs. downloaded ones (so that titles & thumbnails aren't updated for topics that are downloaded until the topic itself is updated).
One other issue that crops up here is the time that the user may need to wait in the client for metadata to be available to display the homescreen. While this design is not stipulating frontend design, it is expected that the client may make use of several techniques to improve this experience, including:
Downloading/updating topics (content downloading)
While directly user initiated, topics are downloaded/updated piecemeal. Topics consist of potentially thousands of files (dozens to ~hundreds of structures, and hundreds to ~thousands of images). An active design decision is to keep topics fragmented and put the onus for downloading each of these to the client. While this increases complexity, it has the key advantage of making topic downloads highly pausable/restartable by introducing more granularity when downloading the constituent pieces of topics.
The image part of topics are naturally piecemeal since they are separate statically GCS-hosted assets that must be individually downloaded. Structures, on the other hand, aren't quite as composed. To make that more viable, pagination & selection were key design choices in the proto API proposed here. In particular, the API is designed to:
It's expected that topic content endpoints provide exactly what's requested based on content & structure versions. See the versioning section below for specifics on how versioning is expected to work. Keep in mind that versioning information for topic content is retrieved via metadata (see the section above for more specifics).
Versioning
Versioning starts getting a bit complicated when considering the backend VersionedModels, the model structures themselves, schema structures, proto structures, the API, snapshots of the API, and other things. To try and keep things simple, this design intends to define a separate set of versions that are specific to this API. How these map to Oppia backend's many versions is an implementation detail not actually relevant to this design.
API version
The API itself is versioned and has separate releases. To keep things simple, it has a major & minor version (e.g. 1.0). The major version is represented directly in the directory structure (e.g. 'v1') since the repository could, in the future, house multiple major versions (which is necessary for the backend to maintain long-term backward compatibility with older API versions). The major version is only incremented if a breaking change must be introduced into the API (generally this will only come when we want to make substantial changes to the API/redesign it).
The API's minor version essentially counts the number of releases that major version has been changed. This means we increment the minor version anytime we make a compatible change to the API, and then release it. The main benefit of having distinct releases is it allows the backend & frontends to opt into taking on a particular version of the API. Due to the backward/forward interoperability nature of protobuf, it's unlikely there will ever be an incompatibility so long as structure versions are respected (see below), and even in those cases there should be potential for graceful failures. Note that the minor version is only ever represented in the release numbers and never in code form since it only serves the purpose of cataloging.
Content versions
Content versions are meant to be the version of the entity itself corresponding to a particular structure (such as a topic, skill, or exploration). These are used to track whether there are content updates for a particular structure. These are expected to map exactly to the versions stored in the corresponding structure's VersionedModel in the backend, but how this versioning behaves is up to the backend so long as it's monotonically increasing and positive.
Structure versions
Structure versions correspond to groups of proto structures themselves. These are unique versions to the proto structures and are not shared with schema structure versions in the backend. Generally, these only need to be incremented when a proto structure is updated in such a way that the default data will break existing logical assumptions (thus requiring a data migration). Proto structures should never be updated in an incompatible way (without incrementing the major version--see the API version section above), but sometimes data migrations are necessary and these versions help track when those need to occur. Unlike schema versions in the backend, these are likely to not increment as often since most proto changes can be safely ignored by older clients.
Note that one aspect of this is a bit complicated: some of the substructures are shared across high-level structures (such as SubtitledHtml, or other language-based substructures). To ensure these are properly tracked, some substructures are grouped into their own combined version to track changes. The following are the list of structure types whose versions are tracked:
A note on interaction versioning
I did want to briefly note that it was discovered during the creation of this PR & structure versions that separate versioning for interactions is not actually needed since, per the strong typing of State's substructures, the structures of individual interactions & how they relate to the exploration/question experience is implied as part of State's structure version. This does not mean the backend won't need additional version tracking for interactions in order to compute compatibility with specific proto State structure versions, but it does serendipitously simplify version tracking at the sub-State level a bit.
Open tasks