Skip to content

fix(session): stop retaining full file-diff patch text in summaries#32410

Open
OddballGreg wants to merge 2 commits into
anomalyco:devfrom
OddballGreg:fix/session-summary-diff-memory
Open

fix(session): stop retaining full file-diff patch text in summaries#32410
OddballGreg wants to merge 2 commits into
anomalyco:devfrom
OddballGreg:fix/session-summary-diff-memory

Conversation

@OddballGreg

Copy link
Copy Markdown

Issue for this PR

Closes #17622

Type of change

  • Bug fix

What does this PR do?

SessionSummary.summarize() stored the full per-file diff produced by Snapshot.diffFull() — including the complete patch text — onto target.info.summary.diffs, which is persisted into message.data and kept in the hydrated in-memory message list. On a session that repeatedly edits the same large file, each turn appends another full-file-content copy, so memory (and message.data) grows roughly linearly with edits, and reopening rehydrates all of it (#17622, and the symptoms in #24445 / #20695).

This changes summarize() to store only lightweight diff metadata (file, status, additions, deletions) and strip the heavy patch text. The aggregate additions/deletions/files counts are still computed. diff() now recomputes the full diff (with patch text) on demand via the existing computeDiff() path, so the diff viewer still gets full content when a client actually requests it — it just isn't retained for the session's lifetime. The full patch is always reconstructable from the git-backed snapshots, so storing it was redundant.

There's a fallback in diff() to the stored diffs when recomputation yields nothing, so legacy/imported sessions whose snapshots are gone still return whatever metadata they have.

How did you verify your code works?

  • Added test/session/summary-memory.test.ts: edits a 4000-line file across a turn, then asserts the stored summary.diffs carries no patch text (with correct file count) while diff() still returns full patch text on demand. It passes with the change and fails without it.
  • Existing test/session/{revert-compact,session}.test.ts and test/snapshot/snapshot.test.ts still pass (66 tests).
  • Ran a real TUI session from source and captured a heap snapshot before/after: the baseline retained 8 copies of the same ~2.7 MB file diff; after the change, 0 file-diff text is retained (only normal static assets remain).

Screenshots / recordings

N/A (no UI change; diff viewer output is unchanged).

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

Gregory Havenga added 2 commits June 14, 2026 12:41
SessionSummary.summarize() stored the full per-file unified-diff text
(FileDiff.patch) on each user message's in-memory info.summary.diffs every
turn. On a long session repeatedly editing the same large file this retained
one full-file-content copy per turn in the hydrated message list and the synced
session_diff state, driving multi-GB heap growth (~460 MiB/hr observed; heap
snapshot showed 8x identical 2.68MB copies of one file's diff).

- summarize(): store only lightweight diff metadata (file/status/additions/
  deletions), stripping the heavy patch text. Aggregate counts preserved.
- diff(): recompute full diffs (with patch text) on demand via computeDiff so
  the diff viewer still gets full content when actually requested, falling back
  to stored diffs for legacy sessions whose snapshots are gone.

The full patch is reconstructable from the git-backed snapshots, so retaining
it was redundant.
…still returns it

Regression test for the summary-diff memory fix. Verifies summarize() stores
only lightweight diff metadata (no FileDiff.patch text) while diff() still
reconstructs full patches on demand. Run in a git-backed tmpdir editing a large
file across a turn.
@github-actions

Copy link
Copy Markdown
Contributor

The following comment was made by an LLM, it may be inaccurate:

Based on my search, I found one potentially related PR:

Related PR:

This could be related as it also addresses diff handling performance, though it focuses on rendering rather than storage. However, they may be complementary fixes addressing different aspects of diff performance.

The current PR (32410) is specifically about reducing memory retention by stripping the patch text from SessionSummary.diffs, while allowing recomputation on demand. The search didn't find other open PRs addressing the same issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Blank session history after restart when summary diffs embed huge file contents

1 participant