fix(subconscious): degrade WAL journal mode to fix Intelligence tab hang on shm-incompatible filesystems#3233
Conversation
…ang on shm-incompatible filesystems The subconscious SQLite schema init ran PRAGMA journal_mode = WAL inside the DDL batch. WAL eagerly mmaps a -shm shared-memory segment; on network mounts, FUSE, and some sandboxed/synced macOS paths that mmap fails with SQLITE_IOERR_SHMMAP (4618) or SQLITE_CANTOPEN (14), aborting the whole DDL and hanging the Intelligence tab (issue tinyhumansai#3231 / TAURI-RUST-8WM). WAL is a performance optimization, not a correctness requirement. Move journal-mode selection out of the DDL batch into apply_journal_mode(), which prefers WAL and falls back to a TRUNCATE rollback journal (no shared memory) when WAL can't be backed — so schema init succeeds on those filesystems. Also classify residual genuine CANTOPEN failures (where even opening the DB file fails — a local permission/disk condition Sentry can't remediate) as ExpectedErrorKind::SubconsciousSchemaUnavailable so they're demoted to a warn diagnostic instead of escalating as Sentry errors. Scoped to the subconscious open/DDL envelope plus cant-open/shm-IO text — transient 'database is locked' (retried by the store) and unrelated DB failures still reach Sentry. Tests: WAL-fallback + end-to-end DB init in store_tests; classifier positive/negative cases in observability.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughThis PR separates SQLite journal-mode selection from subconscious schema DDL, adds WAL-to-TRUNCATE fallback during initialization, and classifies specific subconscious DB open/shared-memory I/O failures as expected warnings instead of full error reports. ChangesSubconscious Schema Resilience
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/core/observability.rs`:
- Around line 1562-1568: The warning currently logs the full error body via
error = %message and embeds {message} in the formatted string; remove the raw
message body from the tracing::warn call in observability.rs (the call that uses
domain, operation, kind) and instead record only metadata—keep domain, operation
and kind the same and replace the raw error payload with a redacted or generic
marker (e.g., "redacted" or "unavailable") so the variable message is not
printed; update the formatted message to exclude {message} and reference only
the sanitized marker.
- Around line 452-456: The current predicate that classifies SQLite failures is
too broad because lower.contains("disk i/o error") will catch generic I/O
issues; change the boolean expression so it no longer treats a plain "disk i/o
error" as the special WAL/shm case—remove the standalone lower.contains("disk
i/o error") check and only keep the targeted matches (e.g.
lower.contains("xshmmap"), lower.contains("error code 4618"), and
lower.contains("unable to open the database file")) or narrow the disk I/O check
to a WAL-specific pattern (e.g. require both "disk i/o error" and "xshm" / "wal"
text) in the same conditional where these lower.contains(...) calls are used.
In `@src/openhuman/subconscious/store.rs`:
- Around line 123-146: The warnings in src/openhuman/subconscious/store.rs
currently log the absolute path via db_path = %db_path.display() in the
tracing::warn! calls (including the block that handles Err(e) and the TRUNCATE
fallback after query_journal_mode), which exposes PII; change those logging
fields to a non-sensitive identifier instead (for example a redacted constant
like "REDACTED_DB_PATH", the DB filename only, or a stable hash of db_path) and
update all tracing::warn! invocations in this file that reference db_path
(search for db_path = %db_path.display(), the tracing::warn! blocks around the
WAL and TRUNCATE fallbacks, and the query_journal_mode call) so they emit the
chosen redacted identifier rather than the full path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3f8e86d5-f045-468b-bc27-6a2cd32094bc
📒 Files selected for processing (3)
src/core/observability.rssrc/openhuman/subconscious/store.rssrc/openhuman/subconscious/store_tests.rs
…ow IO matcher - store: drop absolute DB path from journal-mode warnings (leaked home dir / username); log a stable db="subconscious.db" identifier instead. Removes the now-unused db_path arg from apply_journal_mode. - observability: omit raw error body from the SubconsciousSchemaUnavailable demotion (could embed the absolute DB path); log domain/operation/kind only, mirroring DiskFull / FilesystemUserPathInvalid. - observability: drop the over-broad `disk i/o error` anchor from the subconscious matcher so generic SQLite I/O failures stay visible to Sentry; xShmMap / error-code-4618 / cantopen anchors still classify the targeted WAL shared-memory case.
…ang on shm-incompatible filesystems (tinyhumansai#3233)
Summary
Fixes #3231. The subconscious SQLite schema init hung the Intelligence tab on filesystems that can't back WAL's shared-memory segment.
subconscious::storeranPRAGMA journal_mode = WALinside theSCHEMA_DDLbatch. WAL eagerly mmaps a-shmshared-memory segment; on network mounts, FUSE, and some sandboxed/synced macOS paths that mmap fails withSQLITE_IOERR_SHMMAP(4618) orSQLITE_CANTOPEN(14), aborting the entire DDL batch. The existing retry loop only handlesSQLITE_BUSY/SQLITE_LOCKED, so the failure surfaced immediately asfailed to run subconscious schema DDL: …and the Intelligence tab broke/hung (SentryTAURI-RUST-8WM). A sibling report showed theCANTOPENvariant (Error code 14: Unable to open the database file).Root cause
WAL requires an mmap-backed
-shmfile. It's a performance optimization, not a correctness requirement — but having it inside the DDL batch made an unsupported filesystem fatal to schema init.Changes
subconscious/store.rs— Move journal-mode selection out ofSCHEMA_DDLintoapply_journal_mode(), called before the table DDL. It prefers WAL and falls back to aTRUNCATErollback journal (no shared memory) when WAL can't be honoured, logging the degrade. Failures are logged, never propagated — a working rollback journal beats a failed init.core/observability.rs— Classify residual genuine open failures (where evenConnection::open/ the file open fails — a local permission/disk condition with no Sentry remediation path) asExpectedErrorKind::SubconsciousSchemaUnavailable, demoting them to awarn!diagnostic instead of escalating Sentry errors. Anchored to the subconscious open/DDL envelope plus cant-open/shm-IO text, so transientdatabase is locked(retried by the store) and unrelated DB failures in other domains still reach Sentry.Acceptance criteria
-shmmmap fails on shm-incompatible filesystems.journal_modein DDL) + observability classifier positive/negative cases.Testing
Summary by CodeRabbit
Bug Fixes
Improvements
Tests