Feat/memory optimize for sync molnix #2745

Draft
thenav56 wants to merge 2 commits into develop from feat/memory-optimize-for-sync-molnix


thenav56 commented May 22, 2026

Addresses

  • Molnix provisioning issue in k8s

Changes

  • Replace in-memory lists with generators to limit memory usage
  • Lower the Kubernetes CronJob memory requirement
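
As a rough illustration of the first change, the sketch below contrasts an eager list with a generator over a paginated source. `FAKE_PAGES`, `fetch_page`, `fetch_all_as_list` and `iter_all` are hypothetical names invented for this example; they are not the Molnix client or the code in this PR.

```python
from typing import Any, Dict, Iterator, List

# Hypothetical stand-in for a paginated API; not the real Molnix client.
FAKE_PAGES: List[List[Dict[str, Any]]] = [
    [{"id": 1}, {"id": 2}],
    [{"id": 3}, {"id": 4}],
    [{"id": 5}],
]


def fetch_page(page: int) -> List[Dict[str, Any]]:
    """Return one page of records, or an empty list past the last page."""
    return FAKE_PAGES[page] if page < len(FAKE_PAGES) else []


def fetch_all_as_list() -> List[Dict[str, Any]]:
    """Eager version: every record from every page is held in memory at once."""
    records: List[Dict[str, Any]] = []
    page = 0
    while True:
        batch = fetch_page(page)
        if not batch:
            return records
        records.extend(batch)
        page += 1


def iter_all() -> Iterator[Dict[str, Any]]:
    """Lazy version: only the current page is referenced while it is consumed."""
    page = 0
    while True:
        batch = fetch_page(page)
        if not batch:
            return
        yield from batch
        page += 1
```

Both functions produce the same records in the same order; the difference is that `iter_all()` can be consumed only once, which is where a cache for re-iteration becomes relevant.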

Note

This is not the best approach, but it keeps the changes small to avoid unexpected breaking changes.


Headline numbers

Replay A/B (memray-instrumented, two DBs, simultaneous) and live A/B (real Molnix API, two DBs, simultaneous)

| Metric | Baseline (measured) | Modified (measured) | Δ (measured) | Remark |
|---|---|---|---|---|
| Peak RAM (replay, memray) | 6.035 GB | 560.3 MB | −91 %, ~11× smaller | Cleanest A/B since memray reads the actual heap. The ratio is what matters; absolute numbers include some fixture-loader overhead in both. |
| Peak RAM (live, docker stats RSS) | ~5.60 GiB (observed) | ~360 MiB (observed) | ~16× smaller | docker stats samples once per ~60 s so it can miss a brief spike, but the trend across many samples is consistent. |
| Total allocations (replay) | 10,908,529 | 14,304,384 | +31 % | Modified has more small allocs because each record goes through json.dumps/loads extra times when cached. |
| Total bytes allocated, sum over time (replay) | 117.9 GB | 152.6 GB | +29 % | Sum-over-time inflated by the fixture loader's per-page rescans; production would be much smaller. |
| Max single allocation (replay) | 3.845 MB | 3.845 MB | unchanged | One JSON page worth; same either way. |
| Runtime (local cache only, no Molnix calls) | 91 s | 134 s | +47 % | Replay-only timing: no network at all, both runs read JSONL fixture files from local disk. The gap is the pure cost of _CachedPaginated's write+read tee with everything else being microseconds. In production the cache write happens during the network read, so its cost overlaps with network wait. |
| Runtime (live Molnix API) | 2196 s (36m 36s) | 2215 s (36m 55s) | +19 s (+0.9 %) | Within noise. Confirms the prediction: when network round-trips dominate the run, the cache I/O is invisible. |
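
For anyone who wants to reproduce the shape of the peak-RAM comparison without memray, here is a minimal sketch using the standard library's `tracemalloc` instead. This is a swapped-in technique, not how the numbers above were measured, and `make_records`, `peak_bytes`, `sum_ids_eager` and `sum_ids_lazy` are hypothetical helpers operating on synthetic records.

```python
import tracemalloc
from typing import Callable, Dict, Iterator


def make_records(n: int) -> Iterator[Dict[str, object]]:
    """Yield synthetic records one at a time (assumed stand-in for API rows)."""
    for i in range(n):
        yield {"id": i, "payload": str(i) * 20}


def peak_bytes(fn: Callable[[], int]) -> int:
    """Run fn under tracemalloc and return the peak traced Python heap size."""
    tracemalloc.start()
    try:
        fn()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def sum_ids_eager(n: int = 20_000) -> int:
    """Materialise every record in a list before processing."""
    records = list(make_records(n))
    return sum(r["id"] for r in records)


def sum_ids_lazy(n: int = 20_000) -> int:
    """Process records as they are produced; none are retained."""
    return sum(r["id"] for r in make_records(n))


eager_peak = peak_bytes(sum_ids_eager)
lazy_peak = peak_bytes(sum_ids_lazy)
print(f"eager peak: {eager_peak} B, lazy peak: {lazy_peak} B")
```

`tracemalloc` only sees allocations made through Python's allocator, so absolute values will not match memray or `docker stats`; as in the table, the ratio is the useful part.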

Re-iteration counts (_CachedPaginated)

Each _CachedPaginated.__iter__ call emits a log.warning with a counter. The counts below are measured from log output of a replay run — first iteration streams from source (writes JSONL cache to /tmp/); subsequent iterations replay from the cache.

Example log output from a verification replay:

WARNING  _CachedPaginated[deployments] first iteration #1 streaming from source -> /tmp/molnix_paginated_xxx.jsonl
WARNING  _CachedPaginated[positions]   first iteration #1 streaming from source -> /tmp/molnix_paginated_yyy.jsonl
WARNING  _CachedPaginated[positions]   re-iteration #2 from cache /tmp/molnix_paginated_yyy.jsonl
WARNING  _CachedPaginated[positions]   re-iteration #3 from cache /tmp/molnix_paginated_yyy.jsonl
WARNING  _CachedPaginated[deployments] re-iteration #2 from cache /tmp/molnix_paginated_xxx.jsonl
WARNING  _CachedPaginated[deployments] re-iteration #3 from cache /tmp/molnix_paginated_xxx.jsonl
WARNING  _CachedPaginated[deployments] re-iteration #4 from cache /tmp/molnix_paginated_xxx.jsonl

| Wrapper | First iteration (from API) | Re-iterations (from cache) | Total __iter__ calls | Call sites in sync_molnix.py |
|---|---|---|---|---|
| _CachedPaginated[deployments] | 1 | 3 | 4 | get_unique_tags (L78) → [d["id"] for d in ...] (L260) → [get_go_event(d["tags"]) for d in ...] (L266) → for md in molnix_deployments: (L295) |
| _CachedPaginated[positions] | 1 | 2 | 3 | get_unique_tags (L83) → [p["id"] for p in ...] (L462) → for position in molnix_positions: (L468) |
| Total | 2 | 5 | 7 | |
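
The call sites listed here each iterate the same collection again, which a bare generator cannot support. A minimal sketch of the problem, using a hypothetical `stream_deployments` generator rather than the real sync code:

```python
from typing import Dict, Iterator


def stream_deployments() -> Iterator[Dict[str, object]]:
    """Hypothetical stand-in for a streamed, paginated API response."""
    yield {"id": 1, "tags": ["a"]}
    yield {"id": 2, "tags": ["b"]}


deployments = stream_deployments()

# The first pass consumes the generator.
ids_first_pass = [d["id"] for d in deployments]
# The second pass silently sees nothing: the generator is already exhausted.
ids_second_pass = [d["id"] for d in deployments]

print(ids_first_pass)   # -> [1, 2]
print(ids_second_pass)  # -> []
```

No exception is raised on the second pass, so without a re-iterable wrapper the later call sites would quietly process zero records.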

With the cache, each re-iteration is a sequential read of a /tmp/molnix_paginated_*.jsonl file (deployments ~734 MB, positions ~890 MB).
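
To make the stream-then-replay behaviour concrete, here is a minimal sketch of a re-iterable wrapper backed by a JSONL temp file. It is an illustration under assumptions, not the `_CachedPaginated` implementation in this PR: the class name `CachedIterable`, its `source_factory` argument, the `iterations` counter and the `close` method are all invented for the example.

```python
import json
import os
import tempfile
from typing import Any, Callable, Dict, Iterable, Iterator, Optional


class CachedIterable:
    """Illustrative sketch only; not the PR's _CachedPaginated."""

    def __init__(self, source_factory: Callable[[], Iterable[Dict[str, Any]]]):
        self._source_factory = source_factory
        self._cache_path: Optional[str] = None
        self.iterations = 0  # how many times __iter__ has been called

    def __iter__(self) -> Iterator[Dict[str, Any]]:
        self.iterations += 1
        if self._cache_path is None:
            return self._stream_and_cache()
        return self._replay()

    def _stream_and_cache(self) -> Iterator[Dict[str, Any]]:
        """First pass: yield each record while appending it to a JSONL file."""
        fd, path = tempfile.mkstemp(prefix="paginated_", suffix=".jsonl")
        with os.fdopen(fd, "w", encoding="utf-8") as cache:
            for record in self._source_factory():
                cache.write(json.dumps(record) + "\n")
                yield record
        # Mark the cache usable only after the source was fully consumed.
        self._cache_path = path

    def _replay(self) -> Iterator[Dict[str, Any]]:
        """Later passes: sequentially read the JSONL file, one record per line."""
        assert self._cache_path is not None
        with open(self._cache_path, encoding="utf-8") as cache:
            for line in cache:
                yield json.loads(line)

    def close(self) -> None:
        """Delete the cache file, if one was completed."""
        if self._cache_path and os.path.exists(self._cache_path):
            os.remove(self._cache_path)
        self._cache_path = None
```

In this sketch only one record is held in memory per pass, at the cost of one `json.dumps` per record on the first pass and one `json.loads` per record on each replay. A first pass that is abandoned partway leaves the cache unmarked, so the next pass would stream from the source again; a production version would need to handle that case and clean up the partial file.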

thenav56 added 2 commits May 22, 2026 18:20
thenav56 force-pushed the feat/memory-optimize-for-sync-molnix branch from b5b90ff to 37f1a26 May 22, 2026 12:36