
fix(subconscious): degrade WAL journal mode to fix Intelligence tab hang on shm-incompatible filesystems #3233

Merged
M3gA-Mind merged 2 commits into tinyhumansai:main from M3gA-Mind:fix/subconscious-schema-wal-fallback
Jun 2, 2026

Conversation

@M3gA-Mind
Contributor

@M3gA-Mind M3gA-Mind commented Jun 2, 2026

Summary

Fixes #3231. The subconscious SQLite schema init hung the Intelligence tab on filesystems that can't back WAL's shared-memory segment.

subconscious::store ran PRAGMA journal_mode = WAL inside the SCHEMA_DDL batch. WAL eagerly mmaps a -shm shared-memory segment; on network mounts, FUSE, and some sandboxed/synced macOS paths that mmap fails with SQLITE_IOERR_SHMOPEN (extended code 4618, which SQLite reports as an I/O error within the xShmMap method) or SQLITE_CANTOPEN (14), aborting the entire DDL batch. The existing retry loop only handles SQLITE_BUSY/SQLITE_LOCKED, so the failure surfaced immediately as failed to run subconscious schema DDL: … and the Intelligence tab broke/hung (Sentry TAURI-RUST-8WM). A sibling report showed the CANTOPEN variant (Error code 14: Unable to open the database file).
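To illustrate why the retry loop could not help here, the following is a minimal sketch of how SQLite result codes decompose. The low byte of an extended code is the primary code, so 4618 is an I/O-class error (primary code 10) rather than a busy or locked condition. The numeric constants are SQLite's documented values; the helper names `primary_code` and `is_retryable` are hypothetical and not taken from the store.

```rust
/// Primary SQLite result codes (values from sqlite3.h).
const SQLITE_BUSY: i32 = 5;
const SQLITE_LOCKED: i32 = 6;
const SQLITE_IOERR: i32 = 10;
const SQLITE_CANTOPEN: i32 = 14;

/// Extended result codes keep the primary code in the low 8 bits.
fn primary_code(extended: i32) -> i32 {
    extended & 0xff
}

/// Hypothetical retry predicate: only contention errors are worth retrying.
/// I/O and cant-open failures will not clear on their own, so they fall through.
fn is_retryable(extended: i32) -> bool {
    matches!(primary_code(extended), SQLITE_BUSY | SQLITE_LOCKED)
}
```

Under this model, both 4618 and 14 are rejected by `is_retryable`, which matches the observed behaviour: the error surfaced on the first attempt instead of being retried.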

Root cause

WAL requires an mmap-backed -shm file. It's a performance optimization, not a correctness requirement — but having it inside the DDL batch made an unsupported filesystem fatal to schema init.

Changes

  • subconscious/store.rs — Move journal-mode selection out of SCHEMA_DDL into apply_journal_mode(), called before the table DDL. It prefers WAL and falls back to a TRUNCATE rollback journal (no shared memory) when WAL can't be honoured, logging the degrade. Failures are logged, never propagated — a working rollback journal beats a failed init.
  • core/observability.rs — Classify residual genuine open failures (where even Connection::open / the file open fails — a local permission/disk condition with no Sentry remediation path) as ExpectedErrorKind::SubconsciousSchemaUnavailable, demoting them to a warn! diagnostic instead of escalating Sentry errors. Anchored to the subconscious open/DDL envelope plus cant-open/shm-IO text, so transient database is locked (retried by the store) and unrelated DB failures in other domains still reach Sentry.
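The control flow of the journal-mode fallback can be sketched as below. This is not the code from store.rs: the `JournalModePragma` trait and the `NoSharedMemory` test double are hypothetical stand-ins for a real SQLite connection, introduced so the sketch compiles with the standard library alone. Only the name `apply_journal_mode` and the WAL-then-TRUNCATE order come from the PR.

```rust
/// Hypothetical stand-in for the one connection capability the helper needs.
/// In a real store this would run `PRAGMA journal_mode = <mode>` and return
/// the mode SQLite reports as actually in effect.
trait JournalModePragma {
    fn set_journal_mode(&self, mode: &str) -> Result<String, String>;
}

/// Prefer WAL, fall back to TRUNCATE, and never propagate a failure.
/// Returns the mode that took effect, or `None` if neither could be applied
/// (the connection then stays on whatever mode it already had).
fn apply_journal_mode<C: JournalModePragma>(conn: &C) -> Option<String> {
    for wanted in ["wal", "truncate"] {
        match conn.set_journal_mode(wanted) {
            Ok(actual) if actual.eq_ignore_ascii_case(wanted) => return Some(actual),
            Ok(actual) => eprintln!("journal_mode {} not honoured (got {})", wanted, actual),
            Err(_) => eprintln!("journal_mode {} failed", wanted),
        }
    }
    None
}

/// Test double that rejects WAL the way an shm-incompatible filesystem would.
struct NoSharedMemory;

impl JournalModePragma for NoSharedMemory {
    fn set_journal_mode(&self, mode: &str) -> Result<String, String> {
        if mode == "wal" {
            Err("disk I/O error".to_string())
        } else {
            Ok(mode.to_string())
        }
    }
}
```

Calling `apply_journal_mode(&NoSharedMemory)` yields `Some("truncate")`. Returning an `Option` rather than a `Result` is one way to encode "failures are logged, never propagated": the caller has nothing to abort on.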

Acceptance criteria

  • Root cause identified — WAL -shm mmap fails on shm-incompatible filesystems.
  • Intelligence no longer hangs — schema init now succeeds (rollback-journal fallback) where it previously aborted.
  • Graceful degraded state — WAL degrades to TRUNCATE; the subconscious status RPC already degrades to a config-derived snapshot when the DB is unavailable.
  • Schema init hardened — journal-mode failures handled out-of-band; busy/locked retry preserved.
  • Sentry noise reduced — residual CANTOPEN demoted with actionable diagnostics; locks/real bugs still reported.
  • Regression safety — store tests (WAL fallback, end-to-end DB init, no journal_mode in DDL) + observability classifier positive/negative cases.
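The "no journal_mode in DDL" regression check listed above could look roughly like this. The `SCHEMA_DDL` constant here is a hypothetical, heavily shortened stand-in (the table is invented for illustration), and `ddl_sets_journal_mode` is an assumed helper name, not one from the test suite.

```rust
/// Hypothetical stand-in for the store's `SCHEMA_DDL`; the table definition
/// is invented for illustration only.
const SCHEMA_DDL: &str = "
    CREATE TABLE IF NOT EXISTS example_items (
        id INTEGER PRIMARY KEY,
        body TEXT NOT NULL
    );
";

/// True when a DDL batch tries to select a journal mode itself.
/// The match is case-insensitive because SQLite keywords are.
fn ddl_sets_journal_mode(ddl: &str) -> bool {
    ddl.to_ascii_lowercase().contains("journal_mode")
}
```

A test would then assert `!ddl_sets_journal_mode(SCHEMA_DDL)`, so that a later edit cannot quietly move the pragma back into the batch.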

Testing

cargo test --lib subconscious   # 60 passed (store + classifier)
cargo test --lib observability  # 142 passed
cargo fmt --check               # clean

Summary by CodeRabbit

  • Bug Fixes

    • Certain database-open/shared-memory failures during subconscious engine initialization are now treated as warnings (demoted) to reduce noisy critical errors.
  • Improvements

    • Database initialization is more robust: journal mode now prefers WAL but transparently falls back to TRUNCATE when WAL isn’t available.
    • Schema initialization no longer embeds an explicit journal_mode pragma.
  • Tests

    • Added tests covering journal-mode fallback, schema DDL shape, and classification of related DB errors.

…ang on shm-incompatible filesystems

The subconscious SQLite schema init ran PRAGMA journal_mode = WAL inside the
DDL batch. WAL eagerly mmaps a -shm shared-memory segment; on network mounts,
FUSE, and some sandboxed/synced macOS paths that mmap fails with
SQLITE_IOERR_SHMOPEN (4618) or SQLITE_CANTOPEN (14), aborting the whole DDL and
hanging the Intelligence tab (issue tinyhumansai#3231 / TAURI-RUST-8WM).

WAL is a performance optimization, not a correctness requirement. Move
journal-mode selection out of the DDL batch into apply_journal_mode(), which
prefers WAL and falls back to a TRUNCATE rollback journal (no shared memory)
when WAL can't be backed — so schema init succeeds on those filesystems.

Also classify residual genuine CANTOPEN failures (where even opening the DB
file fails — a local permission/disk condition Sentry can't remediate) as
ExpectedErrorKind::SubconsciousSchemaUnavailable so they're demoted to a warn
diagnostic instead of escalating as Sentry errors. Scoped to the subconscious
open/DDL envelope plus cant-open/shm-IO text — transient 'database is locked'
(retried by the store) and unrelated DB failures still reach Sentry.

Tests: WAL-fallback + end-to-end DB init in store_tests; classifier
positive/negative cases in observability.
M3gA-Mind requested a review from a team June 2, 2026 20:31
@coderabbitai
Contributor

coderabbitai Bot commented Jun 2, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c99bb816-a24a-4592-bcb0-f15699844e14

📥 Commits

Reviewing files that changed from the base of the PR and between 332f26f and 443ee66.

📒 Files selected for processing (3)
  • src/core/observability.rs
  • src/openhuman/subconscious/store.rs
  • src/openhuman/subconscious/store_tests.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/openhuman/subconscious/store_tests.rs
  • src/openhuman/subconscious/store.rs

📝 Walkthrough

This PR separates SQLite journal-mode selection from subconscious schema DDL, adds WAL-to-TRUNCATE fallback during initialization, and classifies specific subconscious DB open/shared-memory I/O failures as expected warnings instead of full error reports.

Changes

Subconscious Schema Resilience

| Layer / File(s) | Summary |
|---|---|
| Journal Mode Initialization with Fallback: src/openhuman/subconscious/store.rs, src/openhuman/subconscious/store_tests.rs | open_and_initialize now calls apply_journal_mode(&conn) after busy_timeout and before SCHEMA_DDL. The helper attempts PRAGMA journal_mode = WAL, falls back to TRUNCATE when needed, and logs failures without aborting initialization. SCHEMA_DDL no longer embeds a journal_mode pragma, and tests cover pragma removal, on-disk mode verification, workspace DB creation, and in-memory fallback behavior. |
| Subconscious Schema Failure Classification and Demotion: src/core/observability.rs | Adds ExpectedErrorKind::SubconsciousSchemaUnavailable, wires it into expected_error_kind, introduces is_subconscious_schema_unavailable_message, and logs matching failures through report_expected_message as warn! breadcrumbs with kind="subconscious_schema_unavailable" and metadata-only fields. Tests cover matching SQLITE_IOERR_SHMMAP and SQLITE_CANTOPEN cases plus similar nonmatching failures. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • tinyhumansai/openhuman#2830: Extends the same ExpectedErrorKind classification and demotion flow in src/core/observability.rs with other expected-error buckets.

Suggested reviewers

  • graycyrus

Poem

🐰 The schema took a careful, softer track,
First WAL hopped forth, then TRUNCATE had its back.
When shared-memory burrows would not open right,
A warning lantern glowed instead of full alarm at night.
The rabbit nods: “Proceed, and keep the paths polite.”

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately captures the main change: degrading WAL journal mode to fix Intelligence tab hang on incompatible filesystems, matching the core objective of the changeset. |
| Linked Issues check | ✅ Passed | All coding requirements from #3231 are met: root cause identified and addressed via WAL→TRUNCATE fallback, Intelligence remains responsive via out-of-band failure handling, graceful degradation via fallback journal mode, schema init hardened with error classification, Sentry noise reduced via SubconsciousSchemaUnavailable demotion, and tests added. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to fixing #3231: journal-mode selection logic, error classification for subconscious schema failures, and related tests. No unrelated refactorings or feature additions are present. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


coderabbitai Bot added the rust-core, memory, sentry-traced-bug, and bug labels Jun 2, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/core/observability.rs`:
- Around line 1562-1568: The warning currently logs the full error body via
error = %message and embeds {message} in the formatted string; remove the raw
message body from the tracing::warn call in observability.rs (the call that uses
domain, operation, kind) and instead record only metadata—keep domain, operation
and kind the same and replace the raw error payload with a redacted or generic
marker (e.g., "redacted" or "unavailable") so the variable message is not
printed; update the formatted message to exclude {message} and reference only
the sanitized marker.
- Around line 452-456: The current predicate that classifies SQLite failures is
too broad because lower.contains("disk i/o error") will catch generic I/O
issues; change the boolean expression so it no longer treats a plain "disk i/o
error" as the special WAL/shm case—remove the standalone lower.contains("disk
i/o error") check and only keep the targeted matches (e.g.
lower.contains("xshmmap"), lower.contains("error code 4618"), and
lower.contains("unable to open the database file")) or narrow the disk I/O check
to a WAL-specific pattern (e.g. require both "disk i/o error" and "xshm" / "wal"
text) in the same conditional where these lower.contains(...) calls are used.

In `@src/openhuman/subconscious/store.rs`:
- Around line 123-146: The warnings in src/openhuman/subconscious/store.rs
currently log the absolute path via db_path = %db_path.display() in the
tracing::warn! calls (including the block that handles Err(e) and the TRUNCATE
fallback after query_journal_mode), which exposes PII; change those logging
fields to a non-sensitive identifier instead (for example a redacted constant
like "REDACTED_DB_PATH", the DB filename only, or a stable hash of db_path) and
update all tracing::warn! invocations in this file that reference db_path
(search for db_path = %db_path.display(), the tracing::warn! blocks around the
WAL and TRUNCATE fallbacks, and the query_journal_mode call) so they emit the
chosen redacted identifier rather than the full path.
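One of the options raised in the store.rs comment, logging only the file name, could be sketched as follows. The helper name `db_log_label` and the fallback marker are hypothetical; the follow-up commit on this PR took the simpler route of logging a constant db="subconscious.db" identifier instead.

```rust
use std::path::Path;

/// Reduce a database path to a log-safe label: the file name only, or a
/// fixed marker when the path has no usable (UTF-8) file name.
/// The directory part, which may contain a home dir or username, is dropped.
fn db_log_label(db_path: &Path) -> String {
    db_path
        .file_name()
        .and_then(|name| name.to_str())
        .unwrap_or("REDACTED_DB_PATH")
        .to_string()
}
```

The result can be passed as the `db` field of a warning in place of the full path, so the log line still identifies which database degraded without revealing where it lives.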

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 3f8e86d5-f045-468b-bc27-6a2cd32094bc

📥 Commits

Reviewing files that changed from the base of the PR and between b156392 and 332f26f.

📒 Files selected for processing (3)
  • src/core/observability.rs
  • src/openhuman/subconscious/store.rs
  • src/openhuman/subconscious/store_tests.rs

Comment thread src/core/observability.rs
Comment thread src/core/observability.rs
Comment thread src/openhuman/subconscious/store.rs
…ow IO matcher

- store: drop absolute DB path from journal-mode warnings (leaked home
  dir / username); log a stable db="subconscious.db" identifier instead.
  Removes the now-unused db_path arg from apply_journal_mode.
- observability: omit raw error body from the SubconsciousSchemaUnavailable
  demotion (could embed the absolute DB path); log domain/operation/kind
  only, mirroring DiskFull / FilesystemUserPathInvalid.
- observability: drop the over-broad `disk i/o error` anchor from the
  subconscious matcher so generic SQLite I/O failures stay visible to
  Sentry; xShmMap / error-code-4618 / cantopen anchors still classify the
  targeted WAL shared-memory case.
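A minimal sketch of the narrowed matcher, assuming it works on the lowercased error text as the review excerpt suggests. The function name and the three anchors come from this thread; the envelope string for the open path ("failed to open subconscious db") is an assumption, since only the DDL envelope text is quoted in the PR.

```rust
/// Sketch of the narrowed classifier. It requires BOTH a subconscious
/// open/DDL envelope and a targeted cant-open / shared-memory anchor.
fn is_subconscious_schema_unavailable_message(message: &str) -> bool {
    let lower = message.to_ascii_lowercase();

    // Envelope: the DDL text is quoted in the PR; the open text is assumed.
    let in_envelope = lower.contains("failed to run subconscious schema ddl")
        || lower.contains("failed to open subconscious db");

    // Anchors named in the review thread. A bare "disk i/o error" is
    // deliberately absent so generic I/O failures still reach Sentry.
    let has_anchor = lower.contains("xshmmap")
        || lower.contains("error code 4618")
        || lower.contains("unable to open the database file");

    in_envelope && has_anchor
}
```

With this shape, a "database is locked" message inside the subconscious envelope is not classified (no anchor), and a cant-open error from another domain is not classified either (no envelope), which is the scoping the commit message describes.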
@M3gA-Mind M3gA-Mind merged commit 235430d into tinyhumansai:main Jun 2, 2026
22 checks passed
senamakel pushed a commit to senamakel/openhuman that referenced this pull request Jun 6, 2026

Labels

  • bug
  • memory: Memory store, memory tree, recall, summarization, and embeddings in src/openhuman/memory/.
  • rust-core: Core Rust runtime in src/: CLI, core_server, shared infrastructure.
  • sentry-traced-bug: Bug identified via Sentry triage

Development

Successfully merging this pull request may close these issues.

Intelligence tab hangs after subconscious schema disk I/O error

1 participant