Skip to content

fix(memory): thread BYO embedding credential — stop Cohere empty-key 401 flood (#3354)#3358

Merged
senamakel merged 5 commits into
tinyhumansai:mainfrom
oxoxDev:fix/3354-cohere-empty-key-flood
Jun 4, 2026
Merged

fix(memory): thread BYO embedding credential — stop Cohere empty-key 401 flood (#3354)#3358
senamakel merged 5 commits into
tinyhumansai:mainfrom
oxoxDev:fix/3354-cohere-empty-key-flood

Conversation

@oxoxDev
Copy link
Copy Markdown
Contributor

@oxoxDev oxoxDev commented Jun 4, 2026

Summary

  • Root-cause fix: the memory pipeline never passed the user's embedding API key — it built the embedder with the keyless factory, so a BYO provider (Cohere/OpenAI/Voyage) always sent an empty bearer. Now the stored credential is resolved and threaded through.
  • Safety net: fast-fail CohereEmbedding::embed() on an empty key (no wasted request), and classify Cohere's "no api key supplied" 401 as ApiKeyMissing so any residual event is demoted.
  • Fixes Sentry TAURI-RUST-52S (8.7k events, 1 user). Also silently un-breaks BYO OpenAI/Voyage memory embeddings.

Problem

A user selected Cohere as their embedding provider. Every memory embed produced 401 "no api key supplied"; not retryable, so each call reported an error and the per-document memory pipeline flooded Sentry.

The real cause: memory_store/factories.rs built the embedder via create_embedding_provider(provider, model, dims) — the keyless factory. The user's configured key (stored under embeddings:<slug>) was never resolved or passed, for any provider. The RPC "test embed" path (embeddings/rpc.rs) did it correctly via resolve_api_key + create_embedding_provider_with_credentials; the memory pipeline did not. So even a correctly-keyed Cohere setup got empty-bearer 401s.

The Sentry event escaped demotion because expected_error_kind() only matched "api key not set" / "missing api key" / "no api key is configured" — Cohere's "no api key supplied" wording matched none.

Solution

Real fix — thread the credential (memory_store/factories.rs + 3 call sites):

  • create_memory_full / create_memory_with_local_ai take an embedding_api_key: &str and build via create_embedding_provider_with_credentials.
  • The three production callers that hold the full Configruntime_node::ops, agent::harness::session::builder, channels::runtime::startup — resolve it with resolve_api_key(config, &config.memory.embedding_provider).
  • cloud/managed/ollama/none ignore the key; keyed providers now authenticate. create_memory (tests + migration, no Config in scope) passes "", preserving prior behavior. The key is never logged.

Safety net (defense-in-depth):

  • embeddings/cohere.rs: bail before the request when the key is empty/whitespace, with a classifiable "API key not set" message — covers genuinely-unconfigured setups without a wasted round-trip or a Sentry event.
  • core/observability.rs: add "no api key supplied" to the ApiKeyMissing arm for any residual 401 (older clients without the guard, or a present-but-rejected key).

Sibling: #2898 (429 embed-flood fix, same class).

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) — resolve_api_key credential round-trip + custom-slug normalization; empty/whitespace Cohere key guard; verbatim Cohere 401-body classification; existing over-suppression guard still green
  • Diff coverage ≥ 80% — new/changed lines exercised by the new unit tests plus existing memory-creation tests that drive create_memory_full
  • Coverage matrix updated — N/A: behaviour-only change (no feature row added/removed/renamed)
  • All affected feature IDs from the matrix are listed under ## RelatedN/A: no matrix feature touched
  • No new external network dependencies introduced — fix removes wasted calls; tests use the in-process credential store + axum mock
  • Manual smoke checklist updated — N/A: no release-cut surface touched
  • Linked issue closed via Closes #NNN in ## Related

Impact

  • Desktop Rust core (embeddings + memory factory + observability). No UI, RPC, or schema change.
  • BYO embedding providers (Cohere/OpenAI/Voyage) in the memory pipeline now actually authenticate — previously dead without a network-visible cause. Managed/cloud/ollama paths unchanged.
  • A genuinely keyless setup fails fast instead of looping 401s; Sentry signal stays honest.
  • No migration/compat impact.

Related


AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: fix/3354-cohere-empty-key-flood
  • Commit SHA: 9eabd86e7

Validation Run

  • N/A: pnpm --filter openhuman-app format:check — no app/ files changed
  • N/A: pnpm typecheck — no TypeScript changed
  • Focused tests: embeddings::rpc::tests (2 ok), embeddings::cohere (7 ok), memory_store::factories (16 ok), core::observability::tests::classif* (46 ok)
  • Rust fmt/check (if changed): cargo fmt --check + cargo check (exit 0)
  • N/A: Tauri fmt/check (if changed) — no app/src-tauri files changed

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: memory-pipeline BYO embeddings now use the stored API key; unconfigured Cohere fails fast instead of flooding.
  • User-visible effect: Cohere/OpenAI/Voyage memory embeddings work when a key is configured; previously silently broken.

Parity Contract

  • Legacy behavior preserved: cloud/managed/ollama/none unchanged; create_memory (tests/migration) keeps empty-key path; correctly-keyed providers' 429 backoff tests still pass.
  • Guard/fallback/dispatch parity checks: create_embedding_provider_with_credentials covers the same provider arms as the keyless factory; custom:<url> prefix self-stripped (custom_endpoint stays None).

Duplicate / Superseded PR Handling

  • Duplicate PR(s): none (dedup vs open + 30d-merged clean)
  • Canonical PR: this
  • Resolution: N/A

Summary by CodeRabbit

  • Bug Fixes

    • Detect and classify Cohere "no api key supplied" 401 responses as missing API key to reduce noisy error reports and retries.
    • Fast-fail when an embeddings API key is blank to avoid unnecessary failed requests.
  • Improvements

    • Ensure user-supplied embedding keys are honored when initializing local memory/embedding backends, reducing misconfigured-provider failures.
  • Tests

    • Added tests to validate API-key resolution and missing-key behavior.

oxoxDev added 2 commits June 4, 2026 16:24
…sai#3354)

Cohere's hosted /v2/embed always needs a bearer; with an empty key it
returned 401 "no api key supplied" on every call, and since 401 is not
retryable each embed attempt reported an error. The memory pipeline
re-embeds per document, flooding Sentry (TAURI-RUST-52S, 8.7k events from
one misconfigured user). Fast-fail when the key is empty/whitespace with a
classifiable "API key not set" message so no request is fired and the
error is demoted to a breadcrumb. Scoped to Cohere — the OpenAI-compatible
provider legitimately supports keyless local/custom endpoints.
…eyMissing (tinyhumansai#3354)

Defense-in-depth for TAURI-RUST-52S: demote any residual Cohere 401 of
this shape (older clients without the call-site guard, or a
present-but-rejected key) so the flood stays out of Sentry. The existing
arm only matched 'api key not set' / 'missing api key' / 'no api key is
configured'; Cohere's exact body wording fell through.
@oxoxDev oxoxDev requested a review from a team June 4, 2026 10:55
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 4, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

Adds an early empty/whitespace API-key guard for Cohere embeds, extends the observability classifier to match Cohere's "no api key supplied" 401 text, and threads resolved embedding API keys into memory factory creation and callers so keyed embedding providers receive BYO credentials.

Changes

Cohere embedding API key guard and error classification

Layer / File(s) Summary
Cohere embed early API key validation
src/openhuman/embeddings/cohere.rs
Early guard in embed detects empty or whitespace-only api_key and immediately bails with "API key not set" before constructing any HTTP request. Test verifies the function exits before attempting network calls against an unreachable base URL.
API key missing error classification
src/core/observability.rs
The ExpectedErrorKind::ApiKeyMissing classifier expands to match the substring "no api key supplied" (case-insensitive) alongside existing patterns. Regression test confirms Cohere's hosted 401 response is classified as ApiKeyMissing.

Memory factory and runtime wiring for embedding API keys

Layer / File(s) Summary
Resolve and wire embedding API key at callsites
src/openhuman/agent/harness/session/builder.rs, src/openhuman/channels/runtime/startup.rs, src/openhuman/runtime_node/ops.rs
Callers now resolve the configured embedding provider's API key via resolve_api_key(...) and pass the resolved &str into memory_store::create_memory_with_local_ai.
Memory factory accepts and forwards embedding_api_key
src/openhuman/memory_store/factories.rs
create_memory_with_local_ai and create_memory_full signatures accept embedding_api_key: &str; provider construction uses create_embedding_provider_with_credentials(...) so keyed providers receive the credential.
resolve_api_key unit tests
src/openhuman/embeddings/rpc.rs
Tests validate credential lookup for embeddings:<provider> keys and normalization of custom:<url> to embeddings:custom lookup keys.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

bug, memory

Suggested reviewers

  • graycyrus

Poem

🐰 I hopped through logs to find a cry,
"No api key supplied" — a telltale sigh.
I guard the gate before the net,
Thread keys through stores so none are met.
Hop, fix, and sleep — the errors fly less high.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately identifies the main focus: threading BYO embedding credentials and preventing Cohere empty-key 401 floods via fast-fail guards.
Linked Issues check ✅ Passed All coding objectives from #3354 are met: empty-key guard in CohereEmbedding::embed(), observability classifier adds 'no api key supplied', API key threading through memory pipeline, and tests verify both the guard and classifier.
Out of Scope Changes check ✅ Passed All changes are scoped to embedding credential threading and Cohere empty-key handling; no unrelated modifications to other systems or providers detected.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

Review ran into problems

🔥 Problems

Stopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a @coderabbit review after the pipeline has finished.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. sentry-traced-bug Bug identified via Sentry triage working A PR that is being worked on by the team. labels Jun 4, 2026
coderabbitai[bot]
coderabbitai Bot previously approved these changes Jun 4, 2026
@oxoxDev oxoxDev self-assigned this Jun 4, 2026
oxoxDev added 2 commits June 4, 2026 16:59
…tinyhumansai#3354)

The memory pipeline built its embedder via the keyless
`create_embedding_provider`, so the user's configured API key was never
passed for any provider. A user who selected Cohere — even with a valid
key — sent an empty `Bearer ` and got a guaranteed 401 "no api key
supplied" on every embed (TAURI-RUST-52S), and the same gap silently
broke BYO OpenAI / Voyage memory embeddings.

Resolve the stored credential at the three call sites that hold the full
`Config` (runtime_node, agent session builder, channels startup) via
`resolve_api_key`, thread it through `create_memory_with_local_ai` /
`create_memory_full`, and build the provider with
`create_embedding_provider_with_credentials`. cloud/managed/ollama/none
ignore the key; keyed providers now authenticate. The key is never
logged. `create_memory` (tests + migration, no Config in scope) passes
an empty key, preserving prior behavior.
…#3354)

Lock the round-trip the memory factory now depends on: a stored
`embeddings:<slug>` key resolves back, an unconfigured provider stays
empty (no cross-bleed), and `custom:<url>` normalizes to the `custom`
slug.
@oxoxDev oxoxDev changed the title fix(embeddings): stop Cohere empty-key 401 from flooding Sentry (#3354) fix(memory): thread BYO embedding credential — stop Cohere empty-key 401 flood (#3354) Jun 4, 2026
@coderabbitai coderabbitai Bot added memory Memory store, memory tree, recall, summarization, and embeddings in src/openhuman/memory/. bug labels Jun 4, 2026
@senamakel senamakel merged commit e37a569 into tinyhumansai:main Jun 4, 2026
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug memory Memory store, memory tree, recall, summarization, and embeddings in src/openhuman/memory/. rust-core Core Rust runtime in src/: CLI, core_server, shared infrastructure. sentry-traced-bug Bug identified via Sentry triage working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(embeddings): Cohere embed floods Sentry with 401 'no api key supplied'

2 participants