Add Deterministic Retrieval Mode (Stable Global→Local Routing, No Hops/Planner/Sampling) #2136

@yuer-dsl

Do you need to file an issue?

  • I have searched the existing issues and this feature is not already filed.
  • My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
  • I believe this is a legitimate feature request, not just a question. If this is a question, please use the Discussions area.

Is your feature request related to a problem? Please describe.

📌 Feature Request: Deterministic Retrieval Mode (Stable Global→Local Routing)

Hi team,

I’ve been experimenting with global→local retrieval patterns and built a very small PoC demonstrating a fully deterministic RAG pipeline — no planners, no hops, no sampling, and no hidden randomness.

🔗 Repo (minimal PoC)

https://github.com/yuer-dsl/deterministic-rag-poc

🧪 Example implementation

deterministic_rag_poc.py


🔍 Why this matters

Many RAG systems (including GraphRAG) rely on:

  • Multi-hop reasoning
  • Planner-generated routes
  • Sampling/temperature in intermediate steps

These introduce hidden randomness and make end-to-end reproducibility difficult.

For regulated, audit-sensitive, or high-reliability environments, we often need:

Same corpus + same query → same route → same output.

A deterministic mode can give GraphRAG a high-certainty retrieval path alongside its dynamic graph-native strengths.


🧩 What the PoC demonstrates

The PoC uses:

  • TF-IDF
  • KMeans with fixed seed
  • Deterministic community assignment
  • Deterministic exact search inside cluster
  • No sampling, no planner, no hop expansion

The goal is not to replace graph traversal — just to offer a strict, reproducible routing mode.
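
The components listed above can be sketched roughly as follows. This is not the code from `deterministic_rag_poc.py`; it is a stdlib-only illustration that swaps in a small hand-rolled TF-IDF and a cosine k-means with seeded initialization in place of a library KMeans. The function names (`build_index`, `retrieve`), the sample documents, and the trace layout are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def tokenize(text):
    return text.lower().split()


def embed(text, vocab, idf):
    """L2-normalised TF-IDF vector over a fixed, sorted vocabulary."""
    counts = Counter(tokenize(text))
    vec = [counts[t] * idf[t] for t in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def nearest(vec, centroids):
    # max() returns the first maximum, so ties resolve to the lowest index.
    return max(range(len(centroids)), key=lambda c: dot(vec, centroids[c]))


def build_index(docs, k=2, seed=0, iters=20):
    """Global step: TF-IDF plus seeded k-means (hypothetical helper)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})  # stable column order
    df = Counter(t for toks in tokenized for t in set(toks))
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = [embed(d, vocab, idf) for d in docs]

    rng = random.Random(seed)  # private RNG, global random state is untouched
    centroids = [vectors[i][:] for i in sorted(rng.sample(range(n), k))]
    for _ in range(iters):
        labels = [nearest(v, centroids) for v in vectors]
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                mean = [sum(col) / len(members) for col in zip(*members)]
                norm = math.sqrt(sum(x * x for x in mean)) or 1.0
                centroids[c] = [x / norm for x in mean]
    labels = [nearest(v, centroids) for v in vectors]
    return {"vocab": vocab, "idf": idf, "vectors": vectors,
            "centroids": centroids, "labels": labels}


def retrieve(query, docs, index, top_k=2):
    """Local step: route to one cluster, then exact search inside it."""
    q = embed(query, index["vocab"], index["idf"])
    cluster = nearest(q, index["centroids"])
    members = [i for i, lab in enumerate(index["labels"]) if lab == cluster]
    # Sort by score, then by document index, so equal scores cannot reorder.
    ranked = sorted(members, key=lambda i: (-dot(q, index["vectors"][i]), i))
    trace = {"cluster": cluster, "candidates": members, "ranked": ranked[:top_k]}
    return [docs[i] for i in ranked[:top_k]], trace


docs = [
    "graph communities group related entities",
    "graph traversal follows entity edges",
    "invoice payment due date reminder",
    "payment invoice overdue notice",
]
index = build_index(docs, k=2, seed=0)
results, trace = retrieve("invoice payment", docs, index)
print(trace)
print(results)
```

Every tie is broken by index and the only randomness comes from a seeded private generator, so rebuilding the index and repeating the query should yield an identical trace.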


💡 Proposal

Add an optional configuration:

deterministic_mode = true

When enabled:

  • Global routing uses fixed clustering or a deterministic partition
  • Local search uses exact/deterministic similarity
  • LLM calls disable sampling
  • Planner and hop expansion are disabled
  • Same input → same routing trace → same output

This enables:

  • Reproducibility
  • Research comparisons
  • Compliance/audit pipelines
  • Deterministic evaluation
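
One way to check the "same input, same routing trace, same output" property is to fingerprint each run and compare the digests. The sketch below is only an illustration: `trace_fingerprint`, `is_reproducible`, and the two toy pipelines are hypothetical names, not part of GraphRAG or the PoC, and a pipeline is assumed to return a `(route, output)` pair of JSON-serializable values.

```python
import hashlib
import itertools
import json


def trace_fingerprint(query, route, output):
    """Hash the (query, route, output) triple into a stable hex digest."""
    payload = json.dumps(
        {"query": query, "route": route, "output": output},
        sort_keys=True, ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_reproducible(pipeline, query, runs=3):
    """True if every run yields the same route and the same output."""
    fingerprints = {trace_fingerprint(query, *pipeline(query)) for _ in range(runs)}
    return len(fingerprints) == 1


def fixed_pipeline(query):
    # Toy stand-in for a deterministic mode: the route depends only on the query.
    route = ["cluster-%d" % (len(query) % 2)]
    return route, "answer for " + query


_calls = itertools.count()


def drifting_pipeline(query):
    # Toy stand-in for a planner/sampling path: the route changes on every call.
    route = ["hop-%d" % next(_calls)]
    return route, "answer for " + query


print(is_reproducible(fixed_pipeline, "what is due?"))     # True
print(is_reproducible(drifting_pipeline, "what is due?"))  # False
```

Storing the digest next to each answer would also give an audit pipeline a cheap way to detect when a later run diverges.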

🔧 Possible integration points

  • Add a deterministic branch inside the retrieval pipeline
  • Enable/disable via config or CLI flag
  • Provide simple examples for both modes
  • Allow users to benchmark deterministic vs dynamic behaviors

Open questions

  • Should deterministic routing still leverage communities from the graph?
  • Should this mode restrict multi-hop traversal completely, or just fix its route?
  • What parts of the graph are still meaningful under deterministic constraints?

I’d be happy to help with an example PR if this aligns with your roadmap.
Thanks for your work on GraphRAG. Excited to see this grow!


Labels: enhancement (New feature or request)
