Feat/memory optimize for sync molnix #2745
Draft
thenav56 wants to merge 2 commits into
NOTE: This is not the best approach, but it keeps the changes small to avoid unexpected breaking changes.
Addresses
Changes
Headline numbers
Replay A/B (memray-instrumented, two DBs, simultaneous) and live A/B (real Molnix API, two DBs, simultaneous)
`json.dumps`/`loads` extra times when cached. `_CachedPaginated`'s write+read tee with everything else being microseconds. In production the cache write happens during the network read, so its cost overlaps with network wait.

Re-iteration counts (`_CachedPaginated`)

Each `_CachedPaginated.__iter__` call emits a `log.warning` with a counter. The counts below are measured from log output of a replay run: the first iteration streams from source (writes JSONL cache to `/tmp/`); subsequent iterations replay from the cache.

Example log output from a verification replay:

`__iter__` calls in `sync_molnix.py`:

`_CachedPaginated[deployments]`: `get_unique_tags` (L78) → `[d["id"] for d in ...]` (L260) → `[get_go_event(d["tags"]) for d in ...]` (L266) → `for md in molnix_deployments:` (L295)

`_CachedPaginated[positions]`: `get_unique_tags` (L83) → `[p["id"] for p in ...]` (L462) → `for position in molnix_positions:` (L468)

With the cache, each re-iteration is a sequential read of a `/tmp/molnix_paginated_*.jsonl` file (deployments ~734 MB, positions ~890 MB).
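For readers who want the stream-once, replay-from-JSONL idea in concrete form, here is a minimal sketch. It is not the PR's `_CachedPaginated` implementation: the class name `CachedPaginated`, the `source_factory` parameter, the temp-file prefix, and the log message format are all assumptions made for illustration. It only shows the pattern described above, where the first iteration tees items to a JSONL file while yielding them and later iterations read that file back line by line.

```python
import json
import logging
import os
import tempfile

log = logging.getLogger(__name__)


class CachedPaginated:
    """Hypothetical stand-in: stream from the source once, then replay from JSONL."""

    def __init__(self, name, source_factory):
        self.name = name
        # source_factory is an assumed callable returning an iterable of dicts
        self._source_factory = source_factory
        self._cache_path = None  # set only after a complete first pass
        self._iter_count = 0

    def __iter__(self):
        self._iter_count += 1
        # One warning per iteration, so re-iterations show up in the logs
        log.warning("CachedPaginated[%s] iteration #%d", self.name, self._iter_count)
        if self._cache_path is None:
            return self._stream_and_cache()
        return self._replay()

    def _stream_and_cache(self):
        # Tee: write each item to the cache file, then hand it to the caller
        fd, tmp_path = tempfile.mkstemp(prefix="paginated_", suffix=".jsonl")
        complete = False
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as fh:
                for item in self._source_factory():
                    fh.write(json.dumps(item) + "\n")
                    yield item
            complete = True
        finally:
            if complete:
                self._cache_path = tmp_path
            else:
                # Iteration was abandoned early: drop the partial cache
                os.remove(tmp_path)

    def _replay(self):
        # Sequential read; only one decoded item is held in memory at a time
        with open(self._cache_path, encoding="utf-8") as fh:
            for line in fh:
                yield json.loads(line)
```

In this sketch, a caller that iterates the object several times (for example once to collect ids and once in a `for` loop) pays one `json.dumps` per item on the first pass and one `json.loads` per item on each later pass, instead of holding the full list in memory or fetching from the source again.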