feat: add AGENTMEMORY_OUTPUT_LANG to control generated-text language #711
cyangfang wants to merge 1 commit into
LLM-generated memory fields (title, narrative, facts, concepts, summary) are always produced in the language the system prompts are written in (English), regardless of the user's working language. Non-English users get English memories that are hard to read and that their native-language recall queries match poorly.

Add an opt-in AGENTMEMORY_OUTPUT_LANG env var, injected once in ResilientProvider (the wrapper every provider passes through), so a single change covers compress + summarize across all providers and prompts:

- unset / empty -> unchanged behaviour (English); strictly opt-in
- "match" -> follow the user's input/observation language
- "zh"/"ja"/... -> a known code, expanded to a full language label
- any other -> used verbatim as the target language name

Code, identifiers, and file paths are always preserved verbatim. Documented in .env.example.
Problem
All LLM-generated memory fields (title, narrative, facts, concepts, summary) are produced in the language the system prompts are written in (English), regardless of the user's working language. For non-English users this means English memories that are hard to read and that native-language recall queries match poorly.
A quick A/B against a real corpus showed same-concept recall scoring ~4x worse for a non-English query than its English equivalent, largely because the stored text had been forced to English.
Change
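Add an opt-in AGENTMEMORY_OUTPUT_LANG env var. It's injected once in ResilientProvider (the wrapper every provider passes through), so a single change covers compress + summarize across all providers and all prompts (compression, summary, consolidation, graph-extraction, reflect, …).

- unset / empty -> unchanged behaviour (English); strictly opt-in
- match -> follow the user's input/observation language
- zh / ja / ko / … -> a known code, expanded to a full language label
- any other value (e.g. Português) -> used verbatim as the target language name

Code, identifiers, and file paths are always preserved verbatim.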
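The value handling described above can be sketched roughly as follows. This is a hedged illustration, not the code in src/prompts/output-language.ts: the language table, the directive wording, the case-insensitive handling of match, and the choice to take the raw setting as a parameter (rather than reading the environment directly) are all assumptions made for the example.

```typescript
// Hypothetical sketch of an output-language helper. The real
// outputLanguageDirective() in the PR may differ in signature and wording.

// Assumed table of known codes -> full language labels (illustrative subset).
const KNOWN_LANGUAGES: Record<string, string> = {
  zh: "Chinese",
  ja: "Japanese",
  ko: "Korean",
};

// Assumed wording for the "preserve code verbatim" rule.
const PRESERVE_RULE = "Keep code, identifiers, and file paths verbatim.";

// `raw` stands in for the value of AGENTMEMORY_OUTPUT_LANG.
function outputLanguageDirective(raw: string | undefined): string {
  const value = (raw ?? "").trim();

  // Unset or empty: return "" so existing behaviour is unchanged.
  if (value === "") return "";

  const key = value.toLowerCase();

  // "match": follow the language of the user's input/observations.
  // (Treating "match" case-insensitively is an assumption.)
  if (key === "match") {
    return `Write all generated text in the same language as the user's input. ${PRESERVE_RULE}`;
  }

  // Known code (case-insensitive) -> full label; anything else is used verbatim.
  const language = Object.prototype.hasOwnProperty.call(KNOWN_LANGUAGES, key)
    ? KNOWN_LANGUAGES[key]
    : value;

  return `Write all generated text in ${language}. ${PRESERVE_RULE}`;
}
```

A caller would pass the env var's value, for example outputLanguageDirective("ja") for a known code or outputLanguageDirective("Português") for a verbatim language name. Taking the value as a parameter keeps the sketch testable without touching the process environment.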
Why ResilientProvider
Every provider is constructed inside a ResilientProvider (see createProvider / createFallbackProvider), and both compress and summarize funnel through it. Injecting the directive there keeps the diff to ~6 lines and guarantees no prompt path is missed, rather than touching each of the 10+ call sites.
Files
- src/prompts/output-language.ts (new): outputLanguageDirective() helper
- src/providers/resilient.ts: append directive in compress / summarize
- .env.example: document the flag
Notes
- Returning "" for the unset case keeps current behaviour exactly, so existing users see no change.
- … match, known codes case-insensitively, verbatim names).
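The single injection point argued for under "Why ResilientProvider" could look something like the wrapper below. This is only a sketch under stated assumptions: the MemoryProvider interface, the (systemPrompt, userPrompt) method signatures, the class name DirectiveInjectingProvider, and the choice to append the directive to the system prompt are all hypothetical. The PR states only that the directive is appended in compress / summarize.

```typescript
// Hypothetical minimal provider shape; the project's real interface
// likely has more methods and different signatures.
interface MemoryProvider {
  compress(systemPrompt: string, userPrompt: string): Promise<string>;
  summarize(systemPrompt: string, userPrompt: string): Promise<string>;
}

// Illustrative wrapper: every call passes through here, so the directive
// is added in one place instead of at each prompt call site.
class DirectiveInjectingProvider implements MemoryProvider {
  private readonly inner: MemoryProvider;
  private readonly directive: string; // "" when the setting is unset

  constructor(inner: MemoryProvider, directive: string) {
    this.inner = inner;
    this.directive = directive;
  }

  // Append the directive only when one is configured, so the unset
  // case forwards the prompt byte-for-byte unchanged.
  private withDirective(systemPrompt: string): string {
    return this.directive === ""
      ? systemPrompt
      : `${systemPrompt}\n\n${this.directive}`;
  }

  compress(systemPrompt: string, userPrompt: string): Promise<string> {
    return this.inner.compress(this.withDirective(systemPrompt), userPrompt);
  }

  summarize(systemPrompt: string, userPrompt: string): Promise<string> {
    return this.inner.summarize(this.withDirective(systemPrompt), userPrompt);
  }
}
```

Because the wrapper leaves the prompt untouched when the directive is empty, the opt-in guarantee falls out of the structure rather than relying on each caller to check the setting.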