[Feature] Batched Summarization for Long-Term Memory #387

@KaranShishodia

Search before asking

  • I searched in the issues and found nothing similar.

Description

The current design feeds all items into the LLM at once, which risks exceeding the context window. A practical solution is to introduce batched summarization with the following workflow:

  • Chunking Input Data
    • Split the memory items into fixed-size batches (configurable, e.g., 50–100 items per batch).
    • Each batch is summarized independently, producing a concise representation.

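  • Hierarchical Summarization
    • After batch-level summaries are generated, feed those summaries into a second summarization step.
    • This produces a global summary that captures the overall context while keeping each LLM call within the context window. If the batch summaries are themselves too numerous for a single call, the same batching can be applied to them again.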

  • Configurable Parameters
    • Allow users to configure:
      • Batch size (number of items per summarization call).
      • Summarization depth (single-pass vs. hierarchical).
      • Retention policy (e.g., keep both batch summaries and global summary for traceability).
  • Implementation Sketch (Pseudo-Java)

    // MemoryItem and llm are placeholders for the actual memory item type and LLM client.
    List<MemoryItem> items = getMemoryItems();
    int batchSize = 50; // configurable
    List<String> batchSummaries = new ArrayList<>();

    // Summarize each fixed-size batch independently
    for (int i = 0; i < items.size(); i += batchSize) {
        List<MemoryItem> batch = items.subList(i, Math.min(i + batchSize, items.size()));
        String summary = llm.summarize(batch);
        batchSummaries.add(summary);
    }

    // Hierarchical summarization: summarize the batch summaries
    String globalSummary = llm.summarize(batchSummaries);
    storeSummary(globalSummary);
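The two-level sketch above assumes that all batch summaries fit into one final call. With a very large memory set that may not hold, so one possible extension is to repeat the batching level by level until a single summary remains. The sketch below is an assumption-laden illustration, not a proposed API: `HierarchicalSummarizer` is a hypothetical name, the LLM call is stood in for by a plain `Function<List<String>, String>`, and memory items are treated as strings. It returns every level, so batch summaries stay available for traceability.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

/** Hypothetical multi-level summarizer; the LLM call is injected as a function. */
final class HierarchicalSummarizer {

    private final Function<List<String>, String> summarize; // stand-in for the LLM call
    private final int batchSize;

    HierarchicalSummarizer(Function<List<String>, String> summarize, int batchSize) {
        // A batch size of at least 2 guarantees each level is shorter than the previous one.
        if (batchSize < 2) {
            throw new IllegalArgumentException("batchSize must be at least 2, got " + batchSize);
        }
        this.summarize = Objects.requireNonNull(summarize, "summarize");
        this.batchSize = batchSize;
    }

    /**
     * Returns the summaries produced at each level. Level 0 holds the batch summaries;
     * the last level holds exactly one element, the global summary.
     * An empty input yields an empty result.
     */
    List<List<String>> summarizeLevels(List<String> items) {
        List<List<String>> levels = new ArrayList<>();
        if (items.isEmpty()) {
            return levels;
        }
        List<String> current = items;
        do {
            List<String> next = new ArrayList<>();
            for (int i = 0; i < current.size(); i += batchSize) {
                int end = Math.min(i + batchSize, current.size());
                next.add(summarize.apply(current.subList(i, end)));
            }
            levels.add(next);
            current = next;
        } while (current.size() > 1);
        return levels;
    }
}
```

Note that batching by item count is only a proxy for the context limit: a batch of unusually long items could still overflow, so a real implementation would probably also need a token budget per call.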

  • Advantages
    • Prevents context overflow by keeping each summarization call within the LLM's context limit.
    • Scales to very large memory sets.
    • Preserves more detail than truncating the input, because summaries are layered instead of items being discarded.

Next Steps

  • Add a runtime configuration option for batch size and summarization depth.
  • Implement a summarization operator in the Flink runtime that can be reused across agents.
  • Provide benchmarks comparing single-pass vs. batched summarization to validate efficiency.

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Labels: feature, priority/major