Add configurable embedding models with ModelRegistry and provider inference by codenamev · Pull Request #4 · codenamev/claude_memory

codenamev · 2026-04-03T12:20:16Z

Enable users to configure embedding models across all providers (tfidf,
fastembed, api) via CLAUDE_MEMORY_EMBEDDING_MODEL env var. The resolver
now auto-infers the provider from the model name using the registry, so
setting just the model name is sufficient.

Key changes:

- ModelRegistry: known models with dimensions, descriptions, size metadata
- FastembedAdapter: dynamic dimensions from registry (was hardcoded 384)
- Resolver: model-based provider inference, unified model forwarding
- ApiAdapter: registry-backed dimensions (avoids probe API call)
- EmbeddingsCommand: CLI for listing models and validating setup
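The provider inference described above can be pictured roughly as follows. This is a minimal sketch, not the gem's code: `ModelRegistrySketch`, `Entry`, `resolve_embedding`, the two registry entries, the tfidf fallback when the variable is unset, and the error on unknown models are all assumptions made for illustration. Only the `CLAUDE_MEMORY_EMBEDDING_MODEL` variable and the three provider names come from the PR.

```ruby
# Hypothetical registry sketch; names and fields are assumptions,
# not the gem's actual ModelRegistry API.
module ModelRegistrySketch
  Entry = Struct.new(:provider, :dimensions, :description, keyword_init: true)

  MODELS = {
    "BAAI/bge-small-en-v1.5" => Entry.new(provider: :fastembed, dimensions: 384,
                                          description: "Small local English model"),
    "text-embedding-3-small" => Entry.new(provider: :api, dimensions: 1536,
                                          description: "Hosted API model")
  }.freeze

  def self.lookup(name)
    MODELS[name]
  end
end

# Infer the provider from the model name alone.
# Assumption: an unset variable falls back to tfidf, and an unknown
# model raises; the real resolver may behave differently.
def resolve_embedding(env = ENV)
  model = env["CLAUDE_MEMORY_EMBEDDING_MODEL"]
  return {provider: :tfidf, model: nil, dimensions: nil} if model.nil? || model.empty?

  entry = ModelRegistrySketch.lookup(model)
  raise ArgumentError, "unknown embedding model: #{model}" unless entry

  # Dimensions come from the registry, so no probe call is needed.
  {provider: entry.provider, model: model, dimensions: entry.dimensions}
end

result = resolve_embedding({"CLAUDE_MEMORY_EMBEDDING_MODEL" => "BAAI/bge-small-en-v1.5"})
puts "#{result[:provider]} / #{result[:dimensions]} dimensions"
```

Passing a plain Hash in place of `ENV` keeps the sketch testable without touching the process environment.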

https://claude.ai/code/session_01DrjisFD2mvy2nvognHczbd


Replace stubs with real provider instances and real SQLiteStore databases. Fastembed tests use the skip pattern (matching benchmarks/) when models can't be downloaded. EmbeddingsCommand tests use real tmpdir databases to verify dimension mismatch detection and database state display.

https://claude.ai/code/session_01DrjisFD2mvy2nvognHczbd

check_dimension_compatibility and show_database_state opened SQLiteStore connections but only closed them in the happy path. On exception, the connection would leak. Wrap both in begin/ensure blocks.
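The leak and its fix can be illustrated with a small stand-in. This is a sketch under stated assumptions: `FakeStore`, `embedding_dimensions`, and `check_dimensions` are hypothetical names, and the real SQLiteStore interface is not shown in this PR.

```ruby
# Hypothetical stand-in for SQLiteStore; it only records whether close ran.
class FakeStore
  attr_reader :closed

  def initialize
    @closed = false
  end

  # Simulates a read that fails partway through.
  def embedding_dimensions
    raise IOError, "database is locked"
  end

  def close
    @closed = true
  end
end

def check_dimensions(store)
  begin
    store.embedding_dimensions
  ensure
    # Runs on normal return and on exception, so the connection cannot leak.
    store.close
  end
end

store = FakeStore.new
begin
  check_dimensions(store)
rescue IOError => e
  puts "check failed: #{e.message}"
end
puts "closed? #{store.closed}"
```

Without the `ensure` clause, the raised `IOError` would skip the `close` call entirely, which is the happy-path-only behavior the commit describes.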

…tion Move default model knowledge into ModelRegistry (single source of truth) instead of hardcoding adapter constants in EmbeddingsCommand. Extract with_each_store helper to eliminate duplicated store open/close/ensure loops in show_database_state and check_dimension_compatibility.

Move DB-reading and dimension-checking logic into a focused Inspector class that returns structured Data.define value objects. The command becomes a thin router (173 LOC, down from 239) that formats output. Inspector owns: with_each_store, database_states, dimension_checks. Store connection safety (ensure close) lives in one place now.
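One way to picture the split between Inspector and command is sketched below. It assumes Ruby 3.2 or newer for `Data.define`. The names `with_each_store`, `database_states`, and `dimension_checks` come from the commit message; the field names, `InspectorSketch`, `MemoryStore`, and the constructor shape are hypothetical.

```ruby
# Hypothetical value objects; field names are assumptions.
DatabaseState = Data.define(:label, :stored_dimensions)
DimensionCheck = Data.define(:label, :expected, :actual) do
  # An empty database (no stored dimensions) is treated as compatible here.
  def compatible?
    actual.nil? || actual == expected
  end
end

class InspectorSketch
  # stores: Hash of label => callable that opens a store-like object.
  def initialize(stores)
    @stores = stores
  end

  def database_states
    with_each_store do |label, store|
      DatabaseState.new(label: label, stored_dimensions: store.dimensions)
    end
  end

  def dimension_checks(expected)
    database_states.map do |state|
      DimensionCheck.new(label: state.label, expected: expected, actual: state.stored_dimensions)
    end
  end

  private

  # The only place a store is opened, and the only place it is closed.
  def with_each_store
    @stores.map do |label, opener|
      store = opener.call
      begin
        yield label, store
      ensure
        store.close
      end
    end
  end
end

# Stand-in store for the example.
MemoryStore = Struct.new(:dimensions) do
  def close; end
end

inspector = InspectorSketch.new({
  "project" => -> { MemoryStore.new(384) },
  "global" => -> { MemoryStore.new(nil) }
})
checks = inspector.dimension_checks(768)
checks.each do |check|
  status = check.compatible? ? "ok" : "mismatch (#{check.actual} != #{check.expected})"
  puts "#{check.label}: #{status}"
end
```

Because the Inspector returns plain value objects, a command built on top of it only has to format them, which matches the "thin router" description.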

Fastembed::SUPPORTED_MODELS is a Hash, so use direct key lookup instead of iterating with find and accessing positional array elements.
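The difference between the two lookups, sketched with a local constant. The real `Fastembed::SUPPORTED_MODELS` value shape is not shown in this PR, so the `{dim: ...}` structure below is an assumption for illustration only.

```ruby
# Hypothetical stand-in for Fastembed::SUPPORTED_MODELS; the value shape
# ({dim: Integer}) is assumed, not taken from the fastembed gem.
SUPPORTED_MODELS = {
  "BAAI/bge-small-en-v1.5" => {dim: 384},
  "BAAI/bge-base-en-v1.5" => {dim: 768}
}.freeze

name = "BAAI/bge-base-en-v1.5"

# Before: find scans the Hash and returns a [key, value] pair,
# which forces positional access on the result.
pair = SUPPORTED_MODELS.find { |key, _value| key == name }
dim_via_find = pair && pair[1][:dim]

# After: direct key lookup; dig returns nil for an unknown model.
dim_via_lookup = SUPPORTED_MODELS.dig(name, :dim)

puts dim_via_find == dim_via_lookup
```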

…ance Extract resolve_or_skip helper in resolver_spec to eliminate 4 duplicated begin/rescue/skip blocks. Add rubocop disable comment to fastembed allow_any_instance_of (unavoidable: require is called inside initialize before an instance reference exists).

After testing identical prompts with and without ClaudeMemory, five categories of measurable improvement emerged. Documentation across all user-facing surfaces was updated to reflect these outcomes.

instructions_builder.rb (highest leverage): Added proactive_recall_guidance to the MCP server instructions. Instead of passive "Use memory.recall to search facts", now directs Claude to check memory.conventions BEFORE writing code, check memory.architecture BEFORE explaining structure, and check memory.decisions BEFORE refactoring. Addresses the gap where one-shot code generation didn't trigger memory recall (Test #4).

README.md: Added "Why It Matters" section with real A/B test results:

- Architecture recall: 76-line explanation vs honest refusal
- File paths: 8 correct steps vs 3 hallucinated files
- Preferences: 7 real preferences vs blank slate
- Honest about when memory doesn't help (grep-able questions)

Plugin metadata (plugin.json, marketplace.json): Rewrote descriptions from mechanism-focused ("fact extraction, truth maintenance, provenance tracking") to outcome-focused ("recalls architecture without file traversal, follows your patterns, never re-asks what it already learned"). Keywords updated: architecture, conventions, decisions, recall.

Gemspec: Summary and description rewritten to lead with outcomes.

0.12 "Release Discipline" punchlist refined post-0.11 ship:

- Promote #59 (API Stability Audit) from 1.0 → 0.12. Reason: #52's scoreboard needs an explicit stable-surface list to gate against. Without #59, any "regression" finding is arguable.
- Add new #63 (Pre-Release Hook Smoke Gate). Codifies the verification convention from feedback_hooks_run_installed_gem.md into a machine-enforced check. The 0.11 #47 incident was the second time this trap was sprung; documentation alone has not been enough.

0.12 scope grows from ~1 week to ~1.5 weeks (3.5d → ~6d):

- #3 Negative-fact harm benchmark full corpus (2d)
- #4 CLAUDE.md baseline in headline E2E (½d + $2-8 run)
- #6 Release-to-release benchmark scoreboard (1d)
- #11 API stability audit + Deprecations module (2d), promoted
- #12 Pre-release hook smoke gate (½d), new

1.0 calendar shifts ~1 week later as a result; net no compression of the soak window.

Risk note updated: harm-prototype 0/3 result reduces the headline 0.12 risk; #11 audit now the most likely overrun.

Improvements.md:

- #59 description updated with promotion rationale.
- New #63 entry with full implementation plan + manifest YAML sketch + skill integration design.

/release skill gains a new Step 6 between specs (Step 5) and lint (Step 7, formerly 6) that invokes bin/pre-release-smoke. Failure aborts the release before git push, exactly the trap the gate is designed to catch. Per-step numbering renumbered 6→7, 7→8, 8→9, 9→10, 10→11, 11→12 to keep sequential ordering.

Error-handling section gains a "Smoke gate fails" entry naming the common cause (forgot rake install) and the manifest-edit case for intentional field removal. It flags that removing a detail_json field will become a public-API change once #11 (API stability audit) lands.

CHANGELOG [Unreleased] section now lists both #63 (smoke gate) and the #61 Phase 1 prompt-only guard against /study-repo misattribution.

These are the first two 0.12 punchlist items landed; full 0.12 lineup is #3 (harm corpus), #4 (CLAUDE.md baseline), #6 (scoreboard), #11 (API stability audit), #12 (this one, the smoke gate).

Addresses: docs/improvements.md #63 / 1_0_punchlist.md 0.12 #12

bin/run-evals now writes spec/benchmarks/results/<version>.json after each run. Diff-friendly schema: pass-rate metrics by category and by scenario, plus version, timestamp, git_sha, git_branch, and what was run. Pass --no-write-results to skip the JSON write.

bin/bench-diff (new) compares the current scoreboard against the most recent prior tagged version's via Gem::Version ordering and reports per-category deltas. Pass-rate drops > threshold (default 5%) trigger exit 1; count growth (more specs landing) is reported but never flagged as regression.

Flags:

  --baseline VERSION   Pin to a specific prior version
  --threshold N        Tighten/loosen regression bar (default 0.05)
  --json               Machine-readable output for tooling
  --strict             Fail when no baseline exists yet

/release skill gains new Step 7 between smoke gate (Step 6) and lint (formerly Step 6, now Step 8). Full step renumbering: 8→9, 9→10, 10→11, 11→12, 12→13.

Error-handling section gains a "Bench-diff fails" entry distinguishing real correctness regressions from deliberate baseline changes, and explicitly forbids bypassing the gate without a CHANGELOG note (defeats the entire scoreboard).

The 0.12.0 release is the first with the gate enabled. Since there is no prior scoreboard, bench-diff exits 0 with a "No baseline scoreboard available" note. From 0.13.0 onward it actively gates against 0.12 baselines.

Verification:

- 11 unit specs covering missing-baseline (default + --strict), threshold tuning (default + custom), nested by_scenario / by_category metrics, --json output, --baseline pinning, and count-growth tolerance. All pass.
- End-to-end with simulated release-time flow (VERSION + RESULTS_DIR overrides via load): same pass-rate + new specs → exit 0; -15% pass_rate drop → exit 1 with named regressing metric path.
- bundle exec rake standard: clean.

Note: bin/run-evals's existing --benchmarks-only flag is broken on main (run_evals=false AND run_benchmarks=false → both sections skip). Not addressed here; tracked separately. Use --benchmarks (which enables both) or no args (which runs benchmarks + evals when available) to actually populate a scoreboard.

Punchlist: 0.12 #6 ✅ landed 2026-05-01. Three of six 0.12 items landed; remaining: #3 (harm corpus), #4 (CLAUDE.md baseline in headline E2E), #44/#46 (release-time observation).

Addresses: docs/improvements.md #52 / 1_0_punchlist.md 0.12 #6
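The baseline selection and threshold check described in this commit might look roughly like the following. This is a sketch, not bin/bench-diff itself: `pick_baseline`, `regressions`, and the flat `{category => pass_rate}` shape are assumptions, and the real scoreboard schema is richer (by_scenario, counts, git metadata).

```ruby
require "rubygems" # provides Gem::Version

# Pick the most recent version strictly older than the current one.
# Gem::Version compares segment by segment, so 0.11.0 sorts above 0.9.0
# (a plain string sort would get this wrong).
def pick_baseline(available_versions, current)
  current_version = Gem::Version.new(current)
  older = available_versions.map { |v| Gem::Version.new(v) }.select { |v| v < current_version }
  newest = older.max
  newest ? newest.to_s : nil
end

# Return the categories whose pass rate dropped by more than the threshold.
# Categories that exist only in the current run are ignored (count growth).
def regressions(baseline, current, threshold: 0.05)
  baseline.each_with_object({}) do |(category, old_rate), drops|
    new_rate = current[category]
    next if new_rate.nil?

    delta = new_rate - old_rate
    drops[category] = delta.round(4) if delta < -threshold
  end
end

baseline_version = pick_baseline(["0.9.0", "0.11.0", "0.12.0"], "0.12.0")
baseline_rates = {"recall" => 0.95, "harm" => 1.0}
current_rates = {"recall" => 0.80, "harm" => 1.0, "new_category" => 0.5}

found = regressions(baseline_rates, current_rates)
exit_code = found.empty? ? 0 : 1
puts "baseline=#{baseline_version} regressions=#{found} exit=#{exit_code}"
```

When no older version exists, `pick_baseline` returns nil, which corresponds to the "No baseline scoreboard available" case; whether that exits 0 or 1 would depend on a flag like `--strict`.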

claude and others added 8 commits April 1, 2026 17:15

Remove transient SQLite WAL files from tracking

5277579

https://claude.ai/code/session_01DrjisFD2mvy2nvognHczbd

[Fix] Add ensure blocks for SQLiteStore close in EmbeddingsCommand

b23d174


[Fix] Use hash lookup instead of find in probe_dimensions_from_fastembed

c9bb252


codenamev merged commit 0c95d87 into main Apr 10, 2026
1 check failed

codenamev deleted the claude/configurable-embedding-models-ys3Kc branch April 10, 2026 20:42

codenamev merged 8 commits into main from claude/configurable-embedding-models-ys3Kc
