refactor: split 24 oversized Rust modules into sub-module directories#3424
Conversation
New top-level domain `src/openhuman/agent_memory/` housing the memory retrieval agent, benchmark harness, and performance tracking types. - agent_memory domain: mod.rs, types.rs (WalkBenchmark, BenchmarkSummary), ops.rs (bench_walk, bench_batch wrapping run_smart_walk with timing) - Built-in agent_memory agent: worker tier, read-only sandbox, tools limited to memory_tree/recall/walk/smart_walk/doctor/query_memory - Prompt focused on retrieval strategy, performance contract, and citation-based output format - scripts/bench-memory-walk.sh: CLI benchmark runner against a live memory tree, JSONL output to target/bench-memory/
Move the canonical `memory_loader.rs` (MemoryLoader trait, DefaultMemoryLoader, MemoryCitation, cross-chat context, prior conversations) from `agent/memory_loader.rs` into the new `agent_memory/memory_loader.rs` domain module. The original file becomes a thin re-export shim so all existing harness imports (`agent::memory_loader::*`) continue to resolve.
…o domain - New `call_memory_agent` tool: spawns the agent_memory sub-agent to autonomously walk the memory tree and return cited context for a query. Supports sync (blocking) and async (fire-and-forget) modes. - Move agent definition from agent_registry/agents/agent_memory/ into agent_memory/agent/ so the domain owns its own agent definition. - Switch model hint from "agentic" to "chat" (lightweight/fast tier). - Register call_memory_agent in orchestrator's direct tool list. - Register tool in the global tool registry (tools/ops.rs + mod.rs).
The chat-tier LLM sometimes emits <tool_name>/<parameters> XML inside
<tool_call> blocks instead of the expected JSON {"name","arguments"}
format. The JSON-only parser silently dropped these, causing the walk
to exit after one turn with no evidence.
- Add parse_single_tool_call() with JSON-first + XML fallback parsing
- Add extract_xml_tag() helper for the XML path
- Update system prompt with explicit JSON format example to nudge the
model toward the primary path
- Add bench-memory-retrieval.sh script for local benchmarking
Covers the XML tool-call parsing fix and multi-source memory walking: Parser unit tests: - XML-style <tool_name>/<parameters> inside <tool_call> blocks - Mixed JSON + XML format in same response - XML params with no <parameters> tag - extract_xml_tag helper - parse_single_tool_call JSON/XML/garbage paths E2E walk tests (StubProvider, seeded content tree): - walk_synced_email_with_keyword_and_evidence: 4-turn walk across email + docs, verifies evidence from both sources - walk_with_xml_format_tool_calls: the actual bug scenario — XML calls are parsed, not silently dropped - walk_reads_across_content_types: evidence from raw/episodic/wiki - walk_llm_gives_up_uses_fallback: empty response → LlmGaveUp - walk_direct_answer_without_tools: no tool calls → direct answer - walk_collect_evidence_deduplicates_within_limit: multi-batch evidence collection
… call_memory_agent All 9 agents that had direct memory tool access now delegate through call_memory_agent instead. This consolidates memory retrieval behind the memory agent, which internally uses the full tool suite (smart walk, keyword search, entity search, tree browse, etc.). Agents updated: orchestrator, planner, researcher, help, presentation_agent, profile_memory_agent, crypto_agent, markets_agent, trigger_reactor. profile_memory_agent retains memory_doctor (diagnostics) and memory_store/memory_forget (write ops) since those are not retrieval.
- Rename tool.rs → tools.rs for canonical module shape consistency - Fix hardcoded user path in bench-memory-retrieval.sh (auto-detect) - Fix stdin double-read in bench-memory-retrieval.sh Python one-liner - Fix JSON injection in bench-memory-walk.sh via python3 json.dumps - Redact raw queries from log lines (ops.rs, tools.rs) - Return Err when all benchmark queries fail (ops.rs) - Wire max_turns through to subagent prompt (tools.rs) - Fix percentile indexing off-by-one (types.rs) - Add debug logging to XML parser fallback (smart_walk.rs) - Fix stray quote in system prompt example (smart_walk.rs) - Add actual duplicate to dedup test (smart_walk.rs) - Fix README code block language and registration path
…y_agent The e2e test asserted memory_tree was in orchestrator tools — updated to check for call_memory_agent instead, matching the agent TOML migration.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughRefactors and modularizes agent subsystems: adds an Archivist (indexing, segmentation, LLM recaps, optional tree ingest), splits Agent turn lifecycle into context/core/tools/session_io, implements typed subagent runner ops, refactors prompt construction into builder/sections/render_helpers, introduces a task dispatcher/poller/registry, and adds channel connect RPC operations with helpers and tests. ChangesCore refactor and new subsystems
Sequence Diagram(s)(Skipped — changes are extensive modular refactors and multiple separate flows; no single new multi-component sequential feature meets the diagram generation conditions.) Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
|
Split files exceeding the 500-line guideline into smaller, focused sub-modules while preserving all public APIs via re-exports. - channels/providers/web.rs (2894→7 files): ops, types, schemas, session, event_bus, progress_bridge, run_task - config/ops.rs (2515→6 files): loader, model, agent, ui, sandbox - config/schema/load.rs (2290→7 files): impl_load, env_overlay, env, dirs, secrets, migrate - inference/provider/compatible.rs (2746→5 files): helpers, provider_impl, stream_native, repeat + main - memory/read_rpc.rs (2164→7 files): types, chunks, entities, graph, vault, admin
10 focused sub-modules: error_utils, toolkits, connections, memory_cleanup, tools_ops, execute, triggers, providers_ops, direct_mode. All pub(crate) visibility preserved.
Sub-modules: helpers (field builders), controllers (registration), schema_defs (per-function schema match). Test companion preserved.
Sub-modules: core (turn lifecycle), tools (execution + delegation), context (history + prompt), session_io (transcript persistence).
…directory Sub-modules: factory (from_config constructors), setters (fluent API), tools (tool registration), inner (build finalization + delegation setup).
…directory Sub-modules: provider (lazy resolver), observer (progress bridging), tool_source (tool surface builder), spawn (core spawn logic).
…alk/ directory Splits the monolithic smart_walk.rs into focused sub-modules: - types.rs: public types, constants, and truncate_chars helper - prompts.rs: system/tool prompts, content inventory, model resolution, tool-call parser, fallback synthesis + InnerCall struct - dispatch.rs: dispatch_call router + all per-action dispatch_* functions - runner.rs: run_smart_walk main agentic loop - tool.rs: SmartMemoryWalkTool + ChatProviderAdapter - smart_walk_tests.rs: extracted inline #[cfg(test)] mod - mod.rs: re-exports preserving the external API unchanged
Breaks the monolithic policy.rs into five focused modules under src/openhuman/security/policy/: - types.rs — enums, structs, ActionTracker, workspace-internal lists - command_checks.rs — command risk/class classification and allowlist logic - path_checks.rs — path validation, symlink safety, workspace/trusted-root checks - enforcement.rs — gate decisions, tool-op enforcement, rate limiting, from_config - mod.rs — re-exports preserving the existing public API surface policy_command.rs and policy_tests.rs moved into the directory (rename detected). External API unchanged; cargo check passes clean.
Extract into three sibling files within prompts/: - builder.rs — SystemPromptBuilder + GLOBAL_STYLE_SUFFIX - sections.rs — all PromptSection structs and their impl blocks - render_helpers.rs — render_* free fns, render_subagent_system_prompt, and workspace-file I/O helpers (sync/inject/default) mod.rs is now a thin facade: declares the three new mods and re-exports everything so the public API is unchanged. No logic was moved or altered.
Extracts the flat schema.rs into three focused sub-modules under src/openhuman/memory/schema/: definitions.rs — the `schemas()` match (all ControllerSchema literals) handlers.rs — all `handle_*` bridge functions + shared helpers registry.rs — `all_controller_schemas` / `all_registered_controllers` schema/mod.rs re-exports the original public surface unchanged so all callers in core::all and memory::mod.rs continue to compile without modification. schema_tests.rs remains in place, referenced via #[path = "../schema_tests.rs"].
Extracted into four focused sub-modules under mcp_server/tools/: - types.rs: McpToolSpec, ToolCallError, shared constants - specs.rs: tool spec builders and JSON schema helpers - params.rs: build_rpc_params and all argument validation helpers - dispatch.rs: call_tool, list_tools_result, agent/subagent handlers External API (tool_specs, McpToolSpec, call_tool, list_tools_result, tool_success, tool_error) preserved via mod.rs re-exports. Companion tools_tests.rs included via #[path] with test-gated pub use re-exports so use super::* continues to work unchanged.
Extract inline tests to task_dispatcher/tests.rs and split production code into focused sub-modules: types, registry, prompt, executor, poller, dispatch. All sub-modules are ≤500 lines; public API (dispatch_card, cancel_session, start_board_poller, build_task_prompt, DispatchOutcome) is preserved via mod.rs re-exports.
…ctory Splits the monolithic schemas.rs into five focused sub-modules under src/openhuman/workflows/schemas/: - wire_types.rs (~200 lines) — RPC param/result structs, WorkflowSummary - helpers.rs (~80 lines) — config/workspace resolution, deserialize_params/to_json - run_machinery.rs (~230 lines) — spawn_workflow_run_background, await_run_outcome - handlers.rs (~240 lines) — thin handle_* dispatcher functions - controller_schemas.rs (~300 lines) — workflows_schemas match + all_* registry fns External API (all_workflows_controller_schemas, all_workflows_registered_controllers, workflows_schemas, resolve_workspace_dir, spawn_workflow_run_background, await_run_outcome) is fully preserved; existing callers require no changes.
Split the 1279-line factory.rs into five focused sub-modules (traits, stt_providers, tts_providers, helpers, entry) plus a companion tests.rs, each under 500 lines. Public API preserved exactly via pub use re-exports in mod.rs.
Splits the 1333-line ops.rs into six focused sub-modules (each ≤500 lines): types, connect, yuanbao, telegram, discord, messaging. External API preserved via pub use re-exports in mod.rs.
Splits the 1246-line compose.rs into four focused sub-modules: - yaml.rs: scalar helpers, scan_fm_field, split_front_matter, source_tag - chunk.rs: compose_chunk_file, rewrite_tags and email-specific helpers - summary.rs: SummaryComposeInput, ComposedSummary, compose_summary_md, rewrite_summary_tags - tests.rs: extracted #[cfg(test)] suite All public symbols are re-exported from mod.rs to preserve the external API.
Splits the 1290-line ops.rs into four focused sub-modules under ops/: - sanitize.rs — secret scrubbing and error formatting helpers - http_error.rs — HTTP error classification, Sentry routing, api_error - models.rs — model listing, OpenAI-compat /models parsing - provider_factory.rs — provider construction (create_*, ProviderRuntimeOptions) External API preserved exactly via pub use re-exports in ops/mod.rs. Test companion ops_tests.rs stays in place via #[path] attribute.
Splits the 1257-line archivist.rs into focused sub-modules each ≤500 lines: types.rs (struct def), lifecycle.rs (constructors + segment ops + on_segment_closed), recap.rs (summarize_entries + rolling_segment_recap + read_session_entries), tree_ingest.rs (pipe_segment_to_tree), hook_impl.rs (PostTurnHook impl), helpers.rs (utility fns), and test_constructors.rs (#[cfg(test)] impls). External API preserved via pub use re-export in mod.rs.
a2f8ccd to
36e289e
Compare
…e-modules-v2 # Conflicts: # scripts/bench-memory-retrieval.sh # src/openhuman/agent_memory/tools.rs # src/openhuman/memory/query/smart_walk.rs
read_rpc.rs → read_rpc/, schema.rs → schema/, compose.rs → compose/
There was a problem hiding this comment.
Actionable comments posted: 17
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/openhuman/agent/harness/session/builder/factory.rs (3)
800-806:⚠️ Potential issue | 🟠 Major | ⚡ Quick winThe no-filter path accidentally becomes delegation-only.
When
filter_from_scopeisNone, this populatesvisiblewith only synthesized delegation tool names.AgentBuilder::buildlater usesvisible_tool_namesas the allowlist for both provider-visible specs andToolPolicyEngine::build_session, so callers likefrom_config_for_read_only_council_juror()lose the read-only search/memory tools they just retained and end up delegation-only. Preserve an empty set for the wildcard / legacy "no filter" cases instead.Proposed fix
- let visible: std::collections::HashSet<String> = match filter_from_scope { - Some(set) => set, - None => delegation_tools - .iter() - .map(|t| t.name().to_string()) - .collect(), - }; + let visible: std::collections::HashSet<String> = + filter_from_scope.unwrap_or_default();🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent/harness/session/builder/factory.rs` around lines 800 - 806, The current logic sets visible to the delegation_tools names when filter_from_scope is None, causing the no-filter (wildcard) case to become delegation-only; change the branch so that when filter_from_scope is None you create an empty HashSet (preserving the legacy/wildcard meaning) and only populate visible from delegation_tools when filter_from_scope is Some(set); update the code paths that use visible_tool_names (e.g., AgentBuilder::build and ToolPolicyEngine::build_session) to rely on an empty set meaning "no filter" and ensure callers like from_config_for_read_only_council_juror() retain read-only search/memory tools.
165-169:⚠️ Potential issue | 🟠 Major | ⚡ Quick winKeep
with_reflection_chunksaligned with the main agent lookup contract.
from_config_for_agent_with_reflection_chunkscurrently treats both "registry missing" and "agent id not found" astarget_def = None, then falls through to a legacy/default build. A typoed non-orchestratoragent_idtherefore builds the wrong prompt/tool scope instead of returning the same lookup error asfrom_config_for_agent.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent/harness/session/builder/factory.rs` around lines 165 - 169, from_config_for_agent_with_reflection_chunks currently collapses "no registry" and "agent id not found" into target_def = None and falls back to a legacy build; change it to match from_config_for_agent's lookup contract: call AgentDefinitionRegistry::global(), and if it returns Some(reg) then if reg.get(agent_id) is None return the same agent-not-found lookup error (propagate/return the same Err used by from_config_for_agent) instead of setting target_def = None; only treat target_def as None when AgentDefinitionRegistry::global() returns None. Update the logic around target_def / AgentDefinitionRegistry::global() in from_config_for_agent_with_reflection_chunks accordingly so typoed non-orchestrator agent_ids produce the same lookup error as from_config_for_agent.
859-868:⚠️ Potential issue | 🟠 Major | ⚡ Quick winTool-memory rules are prefetched for hidden tools too.
agent_tool_namesis built from the fulltoolsvec, not the agent's visible/policy-filtered surface. On scoped agents this pins rules for tools the model can never call, which leaks hidden tool names into the system prompt and skews the guidance the model sees. Filter the names through the same visible-tool surface before callingprefetch_tool_memory_rules_blocking.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent/harness/session/builder/factory.rs` around lines 859 - 868, The current code builds agent_tool_names from the full tools vec and passes them to prefetch_tool_memory_rules_blocking, which leaks hidden tool names; change it to first derive the agent's visible/policy-filtered tool list (the same surface used elsewhere to decide callable tools), then map that filtered list to agent_tool_names and pass those names into prefetch_tool_memory_rules_blocking so only visible tools' rules are pinned; update the block around config.learning... where agent_tool_names, prefetch_tool_memory_rules_blocking, and prompt_builder.with_tool_memory_rules are used to use the filtered tool list instead of the full tools vec.
🧹 Nitpick comments (2)
src/openhuman/agent/harness/subagent_runner/ops/loop_.rs (1)
42-59: ⚡ Quick winAdd debug/trace lifecycle diagnostics for
run_inner_loop.The new loop has only one branch log at Line 118 and no entry/exit diagnostics around the external turn-engine call at Line 179. Please add
debug/tracelogs (stable[subagent_runner]prefix +task_id/agent_idcorrelation fields) for entry, key branches, external-call start/finish, and error/exit outcomes; and downgrade this diagnostic branch log frominfounless intentionally user-visible.As per coding guidelines: “Include debug logging with verbose diagnostics on all new/changed flows... with stable grep-friendly prefixes and correlation fields” and “Use Rust
log/tracingatdebug/tracelevels for diagnostics.”Also applies to: 118-123, 179-199
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent/harness/subagent_runner/ops/loop_.rs` around lines 42 - 59, Add structured debug/trace logging to run_inner_loop: emit a trace/debug entry log at the start of run_inner_loop with the stable prefix "[subagent_runner]" and include task_id and agent_id, add debug/trace logs at each key branch/decision (the branch currently logged at info around the loop branch), emit debug before/after the external turn-engine call and include start/finish timestamps and any correlation ids, and ensure all error/exit outcomes are logged at debug/trace with the same prefix and task_id/agent_id; specifically downgrade the existing info log in the diagnostic branch to debug. Use the run_inner_loop function name and the local variables task_id and agent_id to locate insertion points (entry, branch handling, external call start/finish, error/return paths).Source: Coding guidelines
src/openhuman/agent/harness/session/builder/helpers.rs (1)
27-32: ⚡ Quick winAdd debug logs for the silent prefetch-disable branches.
Both early returns collapse to
Vec::new(), so "no rules matched" and "prefetch could not run in this runtime" are indistinguishable in production. Please log the no-runtime and single-thread-runtime cases with the stable[memory::tool_memory]prefix and a small correlation field liketool_count.As per coding guidelines, "Include debug logging with verbose diagnostics on all new/changed flows: log entry/exit, branches, external calls, retries/timeouts, state transitions, and errors with stable grep-friendly prefixes (
[domain],[rpc]) and correlation fields."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent/harness/session/builder/helpers.rs` around lines 27 - 32, Add debug logs for the two early-return branches around tokio::runtime::Handle::try_current() and the runtime_flavor check so these cases aren’t silent; when try_current() fails log a debug message prefixed with "[memory::tool_memory]" and include a small correlation field tool_count=0 and context like "no runtime handle" before returning Vec::new(), and when handle.runtime_flavor() != tokio::runtime::RuntimeFlavor::MultiThread log another debug message with the same prefix and a tool_count (e.g., tool_count=0) and context like "single-thread runtime - prefetch disabled" before returning Vec::new(); keep messages grep-friendly and include entry/exit or branch text for diagnostics.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@scripts/bench-memory-retrieval.sh`:
- Around line 122-124: The SAMPLE assignment uses a pipeline with "while read f;
do" which can mangle backslashes in file paths; update the loop to use "while
read -r f; do" so backslashes are preserved when iterating matched files (look
for the SAMPLE=... grep -rl "$pattern" "$CONTENT_ROOT" | head -3 | while read f;
do ... done sequence and change read to read -r).
In `@scripts/bench-memory-walk.sh`:
- Line 21: The default CONTENT_ROOT currently hardcodes a user-id path via
OPENHUMAN_MEMORY_CONTENT_ROOT fallback which will break for other users; change
the fallback logic in scripts/bench-memory-walk.sh so CONTENT_ROOT derives from
OPENHUMAN_WORKSPACE (or, like the companion benchmark script, discovers the
first workspace content root) instead of the baked-in path — update references
to the CONTENT_ROOT variable and ensure the env var
OPENHUMAN_MEMORY_CONTENT_ROOT remains respected but only as an override,
otherwise compute the path from OPENHUMAN_WORKSPACE or by scanning workspace
directories for a content subfolder.
- Around line 42-43: The --model flag (variable MODEL) is parsed but never
included in the RPC payload; update the RPC invocation(s) where the script
builds the payload (the curl/HTTP/RPC call blocks around the locations handling
CUSTOM_QUERY and other flags) to add the model field when MODEL is set (for
example include "model": "$MODEL" or the equivalent key expected by the server)
so the override is sent; apply the same change to all payload constructions
referenced in the script (the blocks around lines handling flags at ~84-86 and
~128-141) and ensure quoting/escaping matches the existing payload format and
only adds the field when MODEL is non-empty.
In `@src/openhuman/agent_memory/agent/prompt.md`:
- Around line 32-38: The fenced code block in prompt.md is missing a language
identifier and triggers MD040; update the block starting with the triple
backticks containing the sample answer and Sources so it reads like a fenced
"text" block (e.g., change ``` to ```text) to satisfy the linter and keep
formatting consistent.
In `@src/openhuman/agent_memory/agent/prompt.rs`:
- Around line 10-37: Add structured debug/trace logging around the new Prompt
build flow: log entry and exit of the build(ctx: &PromptContext) function (e.g.,
"[agent-prompt] build start/finish"), log branch decisions when
user_files/tools/workspace are empty vs non-empty, and wrap each external,
fallible call (render_user_files, render_tools, render_workspace) with a trace
entry/exit and error log that includes a stable prefix (e.g., "[agent-prompt]"
or "[rpc]"), the function name, and a correlation id pulled from PromptContext
if available; also log render_safety invocation and any retries/timeouts or
unexpected empty results so failures are visible and grep-friendly.
- Around line 27-29: The builder unconditionally appends the safety preamble by
calling render_safety() and pushing its text, which ignores the
omit_safety_preamble config; update the prompt construction to check the
omit_safety_preamble flag (the config/field named omit_safety_preamble used when
building this agent) and only call render_safety() / push_str its output when
omit_safety_preamble is false (or absent), e.g., wrap the render_safety() /
out.push_str(...) calls in a conditional that references omit_safety_preamble so
the dynamic prompt honors the agent.toml setting.
- Around line 10-38: Add unit tests exercising build(...) by creating a
#[cfg(test)] mod tests in the same prompt.rs (or a sibling prompt_tests.rs) that
constructs minimal PromptContext instances and asserts the composed string
contains ARCHETYPE plus rendered sections; include a happy-path test where
render_user_files(ctx), render_tools(ctx), and render_workspace(ctx) return
normal content and assert expected concatenation and spacing, and an edge-case
test where you supply a PromptContext (or mocks/stubbed helpers) that causes
render_user_files, render_tools, or render_workspace to return empty strings and
assert build(...) omits those sections and preserves spacing/newlines correctly.
Reference the functions PromptContext, build, render_user_files, render_tools,
render_workspace and use Result<String> from build(...) in assertions.
In `@src/openhuman/agent_memory/README.md`:
- Around line 46-59: The README's Benchmarking section only lists the walk
benchmark script; add documentation for the new retrieval benchmark by appending
usage examples for scripts/bench-memory-retrieval.sh (mirroring the style of
bench-memory-walk.sh), including default invocation, passing a custom --query
and --max-turns, and a --content-root override; reference both script names
(bench-memory-walk.sh and bench-memory-retrieval.sh) so readers can run either
benchmark.
In `@src/openhuman/agent/harness/archivist/helpers.rs`:
- Around line 44-57: The current filtering logic that trims per-line markers
leaves multi-line JSON tool payloads in variable cleaned; update the logic that
processes cleaned (the chain starting with cleaned.lines().filter(...) ) to
detect start markers like "\"tool_use\"", "{\"tool_calls\"", or "\"tool_calls\""
and skip entire brace-balanced blocks (or until the matching closing bracket)
instead of only dropping matching lines; implement a block-skip parser that
scans the input string, on seeing those markers consumes the following JSON
object/array by tracking opening and closing braces/brackets (handling nested
braces and string-escaped braces) and only appends non-skipped regions to the
output so no multi-line tool_calls/tool_use payloads remain.
In `@src/openhuman/agent/harness/archivist/hook_impl.rs`:
- Around line 33-150: The fix: avoid indexing the assistant turn before segment
boundary closure by calling manage_segment_sync immediately after inserting the
user episodic entry (the first fts5::episodic_insert and after computing
current_episodic_id) and handling the returned closed_segment (and running
on_segment_closed) before creating/writing the assistant episodic entry, its
memory_archivist dual-write, and subsequent ingestion; move the assistant
fts5::episodic_insert + assistant dual-write + on-segment ingestion logic to run
only after manage_segment_sync completes so segment membership is
non-overlapping (refer to fts5::episodic_insert, manage_segment_sync, and
on_segment_closed).
In `@src/openhuman/agent/harness/session/builder/setters.rs`:
- Around line 446-462: The SegmentRecapSummarizer is being constructed with a
snapshot of self.event_session_id ("session_id_for_recap") inside build(),
causing it to hold the stale "standalone" id; instead, change the construction
so the summarizer queries the current session id at runtime (e.g., pass a
closure/getter or a reference to the session-id source) rather than the cloned
string. Update the builder code around unified_compaction_enabled /
archivist_hook where Arc::new(SegmentRecapSummarizer::new(...,
session_id_for_recap, ...)) is invoked so it supplies a dynamic accessor (or
shared Atomic/Arc<String> updated by from_config_for_read_only_council_juror
after build_session_agent_inner()) so SegmentRecapSummarizer uses the live
event_session_id rather than the captured snapshot.
In `@src/openhuman/agent/harness/session/turn/core.rs`:
- Around line 206-210: The current call to
memory_loader.load_context(...).await.unwrap_or_default() swallows load errors;
change it to match on the Result from
memory_loader.load_context(self.memory.as_ref(), user_message).await, and when
Err(e) occurs log the error with a stable, searchable prefix (e.g.
"MEMORY_LOAD_ERROR:") and relevant context (such as a short user_message
identifier and self.memory info), then continue using a default context on
error; ensure you use the project's logging facility (error! or tracing::error)
and keep the variable named context so downstream code is unchanged.
- Around line 93-95: The current debug branch logs the full rendered_prompt when
self.omit_profile && self.omit_memory_md, which can still contain PII (e.g.,
UserIdentitySection name/email/id); change this so you never emit the raw
rendered_prompt to logs: always log the redacted/hashed version (use whatever
redacted_prompt variable/path you already produce for user-scoped prompts) or
gate the raw dump behind an explicit opt-in flag (e.g.,
allow_full_prompt_logging) that additionally strips/suppresses
UserIdentitySection content before logging; update the branch using symbols
omit_profile, omit_memory_md, rendered_prompt and UserIdentitySection to either
call the redaction path or check the explicit opt-in and enforce identity
suppression before calling log::debug!.
In `@src/openhuman/agent/harness/session/turn/tools.rs`:
- Around line 228-276: The code currently commits schema-only changes when
refresh_delegation_tools() succeeds even if Arc::get_mut(&mut self.tools) (or
equivalent installation of Box<dyn Tool> instances) fails, causing published
delegate specs that cannot be executed; change
refresh_delegation_tools_from_cached_integrations to treat a schema-only refresh
as non-committal: after calling refresh_delegation_tools(), immediately verify
you can install/obtain mutable access to the runtime tools (i.e., attempt
Arc::get_mut on self.tools or otherwise confirm all required Box<dyn Tool>
instances were created); if that verification fails, restore
self.connected_integrations = prev_integrations and do not update
self.last_seen_integrations_hash, self.connected_integrations_initialized, or
mutate announced_integrations/pending_integration_announcement (rollback schema
changes), otherwise commit the hash and append connected_slugs as before; apply
the same rollback/verification logic to the duplicate code path around lines
381-401.
In `@src/openhuman/agent/harness/subagent_runner/ops/runner.rs`:
- Around line 70-74: The caller-provided task_id is used directly in filenames
(e.g., checkpoint_dir.join(format!("{task_id}.json"))), creating a path
traversal risk; update the code that sets task_id (the options.task_id handling
in runner.rs) to sanitize/validate the input: reject or normalize any value
containing path separators, leading dots/slashes, drive letters, or characters
outside a safe set (allow only alphanumerics, '-' and '_' and fall back to the
generated uuid when invalid), or replace invalid characters with '_' so the
result cannot escape checkpoint_dir; use the sanitized_task_id variable wherever
task_id is used for checkpoint filenames and apply the same sanitization to
other usages in the 674-702 region.
In `@src/openhuman/agent/harness/subagent_runner/ops/tool_source.rs`:
- Around line 121-123: The handoff call is passing a hardcoded empty task_id;
replace the "" with the real task identifier available in this scope (use the
correct identifier such as self.task_id or call.task_id or outcome.task_id
depending on which exists in this module) when invoking apply_handoff(cache,
&call.name, ..., &self.agent_id, outcome.text), preserving the existing argument
order and types so per-task correlation is retained in handoff stash logs.
---
Outside diff comments:
In `@src/openhuman/agent/harness/session/builder/factory.rs`:
- Around line 800-806: The current logic sets visible to the delegation_tools
names when filter_from_scope is None, causing the no-filter (wildcard) case to
become delegation-only; change the branch so that when filter_from_scope is None
you create an empty HashSet (preserving the legacy/wildcard meaning) and only
populate visible from delegation_tools when filter_from_scope is Some(set);
update the code paths that use visible_tool_names (e.g., AgentBuilder::build and
ToolPolicyEngine::build_session) to rely on an empty set meaning "no filter" and
ensure callers like from_config_for_read_only_council_juror() retain read-only
search/memory tools.
- Around line 165-169: from_config_for_agent_with_reflection_chunks currently
collapses "no registry" and "agent id not found" into target_def = None and
falls back to a legacy build; change it to match from_config_for_agent's lookup
contract: call AgentDefinitionRegistry::global(), and if it returns Some(reg)
then if reg.get(agent_id) is None return the same agent-not-found lookup error
(propagate/return the same Err used by from_config_for_agent) instead of setting
target_def = None; only treat target_def as None when
AgentDefinitionRegistry::global() returns None. Update the logic around
target_def / AgentDefinitionRegistry::global() in
from_config_for_agent_with_reflection_chunks accordingly so typoed
non-orchestrator agent_ids produce the same lookup error as
from_config_for_agent.
- Around line 859-868: The current code builds agent_tool_names from the full
tools vec and passes them to prefetch_tool_memory_rules_blocking, which leaks
hidden tool names; change it to first derive the agent's visible/policy-filtered
tool list (the same surface used elsewhere to decide callable tools), then map
that filtered list to agent_tool_names and pass those names into
prefetch_tool_memory_rules_blocking so only visible tools' rules are pinned;
update the block around config.learning... where agent_tool_names,
prefetch_tool_memory_rules_blocking, and prompt_builder.with_tool_memory_rules
are used to use the filtered tool list instead of the full tools vec.
---
Nitpick comments:
In `@src/openhuman/agent/harness/session/builder/helpers.rs`:
- Around line 27-32: Add debug logs for the two early-return branches around
tokio::runtime::Handle::try_current() and the runtime_flavor check so these
cases aren’t silent; when try_current() fails log a debug message prefixed with
"[memory::tool_memory]" and include a small correlation field tool_count=0 and
context like "no runtime handle" before returning Vec::new(), and when
handle.runtime_flavor() != tokio::runtime::RuntimeFlavor::MultiThread log
another debug message with the same prefix and a tool_count (e.g., tool_count=0)
and context like "single-thread runtime - prefetch disabled" before returning
Vec::new(); keep messages grep-friendly and include entry/exit or branch text
for diagnostics.
In `@src/openhuman/agent/harness/subagent_runner/ops/loop_.rs`:
- Around line 42-59: Add structured debug/trace logging to run_inner_loop: emit
a trace/debug entry log at the start of run_inner_loop with the stable prefix
"[subagent_runner]" and include task_id and agent_id, add debug/trace logs at
each key branch/decision (the branch currently logged at info around the loop
branch), emit debug before/after the external turn-engine call and include
start/finish timestamps and any correlation ids, and ensure all error/exit
outcomes are logged at debug/trace with the same prefix and task_id/agent_id;
specifically downgrade the existing info log in the diagnostic branch to debug.
Use the run_inner_loop function name and the local variables task_id and
agent_id to locate insertion points (entry, branch handling, external call
start/finish, error/return paths).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0975c8cf-35dc-41c5-9682-a024335e0fc4
📒 Files selected for processing (226)
scripts/bench-memory-retrieval.shscripts/bench-memory-walk.shsrc/openhuman/agent/harness/archivist.rssrc/openhuman/agent/harness/archivist/helpers.rssrc/openhuman/agent/harness/archivist/hook_impl.rssrc/openhuman/agent/harness/archivist/lifecycle.rssrc/openhuman/agent/harness/archivist/mod.rssrc/openhuman/agent/harness/archivist/recap.rssrc/openhuman/agent/harness/archivist/test_constructors.rssrc/openhuman/agent/harness/archivist/tree_ingest.rssrc/openhuman/agent/harness/archivist/types.rssrc/openhuman/agent/harness/session/builder/builder_tests.rssrc/openhuman/agent/harness/session/builder/factory.rssrc/openhuman/agent/harness/session/builder/helpers.rssrc/openhuman/agent/harness/session/builder/mod.rssrc/openhuman/agent/harness/session/builder/setters.rssrc/openhuman/agent/harness/session/turn.rssrc/openhuman/agent/harness/session/turn/context.rssrc/openhuman/agent/harness/session/turn/core.rssrc/openhuman/agent/harness/session/turn/mod.rssrc/openhuman/agent/harness/session/turn/session_io.rssrc/openhuman/agent/harness/session/turn/tools.rssrc/openhuman/agent/harness/subagent_runner/ops.rssrc/openhuman/agent/harness/subagent_runner/ops/checkpoint.rssrc/openhuman/agent/harness/subagent_runner/ops/handoff_helper.rssrc/openhuman/agent/harness/subagent_runner/ops/loop_.rssrc/openhuman/agent/harness/subagent_runner/ops/mod.rssrc/openhuman/agent/harness/subagent_runner/ops/observer.rssrc/openhuman/agent/harness/subagent_runner/ops/prompt.rssrc/openhuman/agent/harness/subagent_runner/ops/provider.rssrc/openhuman/agent/harness/subagent_runner/ops/runner.rssrc/openhuman/agent/harness/subagent_runner/ops/tool_source.rssrc/openhuman/agent/memory_loader.rssrc/openhuman/agent/prompts/builder.rssrc/openhuman/agent/prompts/mod.rssrc/openhuman/agent/prompts/render_helpers.rssrc/openhuman/agent/prompts/sections.rssrc/openhuman/agent/task_dispatcher.rssrc/openhuman/agent/task_dispatcher/dispatch.rssrc/openhuman/agent/task_dispatcher/executor.rssrc/openhuman/agent/task_dispatcher/mod.rssrc/openhuman/agent/task_dispatcher/poller.rssrc/openhuman/agent/task_dispatcher/prompt.rssrc/openhuman/agent/task_dispatcher/registry.rssrc/openhuman/agent/task_dispatcher/tests.rssrc/openhuman/agent/task_dispatcher/types.rssrc/openhuman/agent_memory/README.mdsrc/openhuman/agent_memory/agent/agent.tomlsrc/openhuman/agent_memory/agent/mod.rssrc/openhuman/agent_memory/agent/prompt.mdsrc/openhuman/agent_memory/agent/prompt.rssrc/openhuman/agent_memory/memory_loader.rssrc/openhuman/agent_memory/mod.rssrc/openhuman/agent_memory/ops.rssrc/openhuman/agent_memory/tools.rssrc/openhuman/agent_memory/types.rssrc/openhuman/agent_registry/agents/crypto_agent/agent.tomlsrc/openhuman/agent_registry/agents/help/agent.tomlsrc/openhuman/agent_registry/agents/loader.rssrc/openhuman/agent_registry/agents/markets_agent/agent.tomlsrc/openhuman/agent_registry/agents/orchestrator/agent.tomlsrc/openhuman/agent_registry/agents/planner/agent.tomlsrc/openhuman/agent_registry/agents/presentation_agent/agent.tomlsrc/openhuman/agent_registry/agents/profile_memory_agent/agent.tomlsrc/openhuman/agent_registry/agents/researcher/agent.tomlsrc/openhuman/agent_registry/agents/trigger_reactor/agent.tomlsrc/openhuman/channels/controllers/ops.rssrc/openhuman/channels/controllers/ops/connect.rssrc/openhuman/channels/controllers/ops/discord.rssrc/openhuman/channels/controllers/ops/messaging.rssrc/openhuman/channels/controllers/ops/mod.rssrc/openhuman/channels/controllers/ops/telegram.rssrc/openhuman/channels/controllers/ops/types.rssrc/openhuman/channels/controllers/ops/yuanbao.rssrc/openhuman/channels/providers/web.rssrc/openhuman/channels/providers/web/event_bus.rssrc/openhuman/channels/providers/web/mod.rssrc/openhuman/channels/providers/web/ops.rssrc/openhuman/channels/providers/web/progress_bridge.rssrc/openhuman/channels/providers/web/run_task.rssrc/openhuman/channels/providers/web/schemas.rssrc/openhuman/channels/providers/web/session.rssrc/openhuman/channels/providers/web/types.rssrc/openhuman/channels/runtime/dispatch/helpers.rssrc/openhuman/channels/runtime/dispatch/mod.rssrc/openhuman/channels/runtime/dispatch/processor.rssrc/openhuman/channels/runtime/dispatch/routing.rssrc/openhuman/composio/mod.rssrc/openhuman/composio/ops.rssrc/openhuman/composio/ops/connections.rssrc/openhuman/composio/ops/direct_mode.rssrc/openhuman/composio/ops/error_utils.rssrc/openhuman/composio/ops/execute.rssrc/openhuman/composio/ops/memory_cleanup.rssrc/openhuman/composio/ops/mod.rssrc/openhuman/composio/ops/providers_ops.rssrc/openhuman/composio/ops/toolkits.rssrc/openhuman/composio/ops/tools_ops.rssrc/openhuman/composio/ops/triggers.rssrc/openhuman/config/ops.rssrc/openhuman/config/ops/agent.rssrc/openhuman/config/ops/loader.rssrc/openhuman/config/ops/mod.rssrc/openhuman/config/ops/model.rssrc/openhuman/config/ops/sandbox.rssrc/openhuman/config/ops/ui.rssrc/openhuman/config/schema/load.rssrc/openhuman/config/schema/load/dirs.rssrc/openhuman/config/schema/load/env.rssrc/openhuman/config/schema/load/env_overlay.rssrc/openhuman/config/schema/load/impl_load.rssrc/openhuman/config/schema/load/migrate.rssrc/openhuman/config/schema/load/mod.rssrc/openhuman/config/schema/load/secrets.rssrc/openhuman/config/schema/tools.rssrc/openhuman/config/schema/tools/browser.rssrc/openhuman/config/schema/tools/http.rssrc/openhuman/config/schema/tools/integrations.rssrc/openhuman/config/schema/tools/mcp.rssrc/openhuman/config/schema/tools/mod.rssrc/openhuman/config/schema/tools/multimodal.rssrc/openhuman/config/schema/tools/search.rssrc/openhuman/config/schemas.rssrc/openhuman/config/schemas/controllers.rssrc/openhuman/config/schemas/helpers.rssrc/openhuman/config/schemas/mod.rssrc/openhuman/config/schemas/schema_defs.rssrc/openhuman/inference/local/service/ollama_admin.rssrc/openhuman/inference/local/service/ollama_admin/binary.rssrc/openhuman/inference/local/service/ollama_admin/diagnostics.rssrc/openhuman/inference/local/service/ollama_admin/health.rssrc/openhuman/inference/local/service/ollama_admin/mod.rssrc/openhuman/inference/local/service/ollama_admin/model_pull.rssrc/openhuman/inference/local/service/ollama_admin/server.rssrc/openhuman/inference/local/service/ollama_admin/util.rssrc/openhuman/inference/provider/compatible.rssrc/openhuman/inference/provider/compatible_helpers.rssrc/openhuman/inference/provider/compatible_provider_impl.rssrc/openhuman/inference/provider/compatible_repeat.rssrc/openhuman/inference/provider/compatible_stream_native.rssrc/openhuman/inference/provider/ops.rssrc/openhuman/inference/provider/ops/http_error.rssrc/openhuman/inference/provider/ops/mod.rssrc/openhuman/inference/provider/ops/models.rssrc/openhuman/inference/provider/ops/provider_factory.rssrc/openhuman/inference/provider/ops/sanitize.rssrc/openhuman/mcp_server/tools.rssrc/openhuman/mcp_server/tools/dispatch.rssrc/openhuman/mcp_server/tools/mod.rssrc/openhuman/mcp_server/tools/params.rssrc/openhuman/mcp_server/tools/specs.rssrc/openhuman/mcp_server/tools/types.rssrc/openhuman/meet_agent/brain.rssrc/openhuman/meet_agent/brain/access.rssrc/openhuman/meet_agent/brain/constants.rssrc/openhuman/meet_agent/brain/llm.rssrc/openhuman/meet_agent/brain/mod.rssrc/openhuman/meet_agent/brain/speech.rssrc/openhuman/meet_agent/brain/stubs.rssrc/openhuman/meet_agent/brain/text.rssrc/openhuman/meet_agent/brain/turns.rssrc/openhuman/meet_agent/brain_tests.rssrc/openhuman/memory/query/smart_walk.rssrc/openhuman/memory/query/smart_walk/dispatch.rssrc/openhuman/memory/query/smart_walk/mod.rssrc/openhuman/memory/query/smart_walk/prompts.rssrc/openhuman/memory/query/smart_walk/runner.rssrc/openhuman/memory/query/smart_walk/smart_walk_tests.rssrc/openhuman/memory/query/smart_walk/tool.rssrc/openhuman/memory/query/smart_walk/types.rssrc/openhuman/memory/read_rpc.rssrc/openhuman/memory/read_rpc/admin.rssrc/openhuman/memory/read_rpc/chunks.rssrc/openhuman/memory/read_rpc/entities.rssrc/openhuman/memory/read_rpc/graph.rssrc/openhuman/memory/read_rpc/mod.rssrc/openhuman/memory/read_rpc/types.rssrc/openhuman/memory/read_rpc/vault.rssrc/openhuman/memory/schema/definitions.rssrc/openhuman/memory/schema/handlers.rssrc/openhuman/memory/schema/mod.rssrc/openhuman/memory/schema/registry.rssrc/openhuman/memory_store/chunks/migrations.rssrc/openhuman/memory_store/chunks/raw_refs.rssrc/openhuman/memory_store/chunks/store.rssrc/openhuman/memory_store/content/compose.rssrc/openhuman/memory_store/content/compose/chunk.rssrc/openhuman/memory_store/content/compose/mod.rssrc/openhuman/memory_store/content/compose/summary.rssrc/openhuman/memory_store/content/compose/tests.rssrc/openhuman/memory_store/content/compose/yaml.rssrc/openhuman/mod.rssrc/openhuman/security/policy.rssrc/openhuman/security/policy/command_checks.rssrc/openhuman/security/policy/enforcement.rssrc/openhuman/security/policy/mod.rssrc/openhuman/security/policy/path_checks.rssrc/openhuman/security/policy/policy_command.rssrc/openhuman/security/policy/policy_tests.rssrc/openhuman/security/policy/types.rssrc/openhuman/tools/mod.rssrc/openhuman/tools/ops.rssrc/openhuman/voice/factory.rssrc/openhuman/voice/factory/entry.rssrc/openhuman/voice/factory/helpers.rssrc/openhuman/voice/factory/mod.rssrc/openhuman/voice/factory/stt_providers.rssrc/openhuman/voice/factory/tests.rssrc/openhuman/voice/factory/traits.rssrc/openhuman/voice/factory/tts_providers.rssrc/openhuman/voice/schemas.rssrc/openhuman/voice/schemas/handlers.rssrc/openhuman/voice/schemas/handlers/provider_server.rssrc/openhuman/voice/schemas/handlers/transcribe_tts.rssrc/openhuman/voice/schemas/helpers.rssrc/openhuman/voice/schemas/mod.rssrc/openhuman/voice/schemas/params.rssrc/openhuman/voice/schemas/registry.rssrc/openhuman/workflows/schemas.rssrc/openhuman/workflows/schemas/controller_schemas.rssrc/openhuman/workflows/schemas/handlers.rssrc/openhuman/workflows/schemas/helpers.rssrc/openhuman/workflows/schemas/mod.rssrc/openhuman/workflows/schemas/run_machinery.rssrc/openhuman/workflows/schemas/wire_types.rstests/agent_retrieval_e2e.rs
💤 Files with no reviewable changes (4)
- src/openhuman/agent/harness/subagent_runner/ops.rs
- src/openhuman/agent/harness/session/turn.rs
- src/openhuman/agent/harness/archivist.rs
- src/openhuman/agent/task_dispatcher.rs
There was a problem hiding this comment.
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
🛑 Comments failed to post (17)
scripts/bench-memory-retrieval.sh (1)
122-124:
⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUse
read -rwhen iterating matched file paths.The current loop can mangle backslashes in paths.
Suggested fix
- SAMPLE=$(grep -rl "$pattern" "$CONTENT_ROOT" 2>/dev/null | head -3 | while read f; do + SAMPLE=$(grep -rl "$pattern" "$CONTENT_ROOT" 2>/dev/null | head -3 | while read -r f; do basename "$f" | tr '\n' ' ' done)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.SAMPLE=$(grep -rl "$pattern" "$CONTENT_ROOT" 2>/dev/null | head -3 | while read -r f; do basename "$f" | tr '\n' ' ' done)🧰 Tools
🪛 Shellcheck (0.11.0)
[info] 122-122: read without -r will mangle backslashes.
(SC2162)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/bench-memory-retrieval.sh` around lines 122 - 124, The SAMPLE assignment uses a pipeline with "while read f; do" which can mangle backslashes in file paths; update the loop to use "while read -r f; do" so backslashes are preserved when iterating matched files (look for the SAMPLE=... grep -rl "$pattern" "$CONTENT_ROOT" | head -3 | while read f; do ... done sequence and change read to read -r).Source: Linters/SAST tools
scripts/bench-memory-walk.sh (2)
21-21:
⚠️ Potential issue | 🟠 Major | ⚡ Quick winDefault content root is environment-specific and brittle.
The baked-in user-id path makes default runs fail for most users. Prefer deriving from
OPENHUMAN_WORKSPACE(or discovering the first workspace content root) like the companion benchmark script.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/bench-memory-walk.sh` at line 21, The default CONTENT_ROOT currently hardcodes a user-id path via OPENHUMAN_MEMORY_CONTENT_ROOT fallback which will break for other users; change the fallback logic in scripts/bench-memory-walk.sh so CONTENT_ROOT derives from OPENHUMAN_WORKSPACE (or, like the companion benchmark script, discovers the first workspace content root) instead of the baked-in path — update references to the CONTENT_ROOT variable and ensure the env var OPENHUMAN_MEMORY_CONTENT_ROOT remains respected but only as an override, otherwise compute the path from OPENHUMAN_WORKSPACE or by scanning workspace directories for a content subfolder.
42-43:
⚠️ Potential issue | 🟠 Major | ⚡ Quick win
--modelis advertised but never applied to the RPC call.The flag is parsed and displayed, but the payload omits it, so users cannot actually benchmark with model overrides.
Suggested fix
- RPC_PAYLOAD=$(python3 -c " + RPC_PAYLOAD=$(python3 -c " import json, sys +params = { + 'query': sys.argv[1], + 'namespace': sys.argv[2], + 'max_turns': $MAX_TURNS +} +if sys.argv[3]: + params['model'] = sys.argv[3] payload = { 'jsonrpc': '2.0', 'id': 'bench-$idx', 'method': 'openhuman.memory_smart_walk', - 'params': { - 'query': sys.argv[1], - 'namespace': sys.argv[2], - 'max_turns': $MAX_TURNS - } + 'params': params } print(json.dumps(payload)) -" "$query" "$NAMESPACE") +" "$query" "$NAMESPACE" "$MODEL")Also applies to: 84-86, 128-141
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/bench-memory-walk.sh` around lines 42 - 43, The --model flag (variable MODEL) is parsed but never included in the RPC payload; update the RPC invocation(s) where the script builds the payload (the curl/HTTP/RPC call blocks around the locations handling CUSTOM_QUERY and other flags) to add the model field when MODEL is set (for example include "model": "$MODEL" or the equivalent key expected by the server) so the override is sent; apply the same change to all payload constructions referenced in the script (the blocks around lines handling flags at ~84-86 and ~128-141) and ensure quoting/escaping matches the existing payload format and only adds the field when MODEL is non-empty.src/openhuman/agent_memory/agent/prompt.md (1)
32-38:
⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd a language identifier to the fenced block.
This fence triggers MD040; add a language (for example
text) to keep markdown lint clean.Suggested fix
-``` +```text [Answer text with citations like [1], [2]...] Sources: 1. chat/conversations-agent/abc123.md — "relevant snippet" 2. raw/github-repo/def456.md — "relevant snippet"</details> <details> <summary>🧰 Tools</summary> <details> <summary>🪛 markdownlint-cli2 (0.22.1)</summary> [warning] 32-32: Fenced code blocks should have a language specified (MD040, fenced-code-language) </details> </details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.In
@src/openhuman/agent_memory/agent/prompt.mdaround lines 32 - 38, The fenced
code block in prompt.md is missing a language identifier and triggers MD040;
update the block starting with the triple backticks containing the sample answer
and Sources so it reads like a fenced "text" block (e.g., changetotext)
to satisfy the linter and keep formatting consistent.</details> <!-- fingerprinting:phantom:poseidon:hawk --> <!-- cr-comment:v1:b1185b9b427aa41e7405062a --> _Source: Linters/SAST tools_ <!-- This is an auto-generated comment by CodeRabbit --> </blockquote></details> <details> <summary>src/openhuman/agent_memory/agent/prompt.rs (3)</summary><blockquote> 10-37: _🛠️ Refactor suggestion_ | _🟠 Major_ | _⚡ Quick win_ **Add debug/trace diagnostics for this new prompt build flow.** This new flow has multiple branches and fallible external render calls but no structured diagnostics. As per coding guidelines, “Include debug logging with verbose diagnostics on all new/changed flows: log entry/exit, branches, external calls, retries/timeouts, state transitions, and errors with stable grep-friendly prefixes (`[domain]`, `[rpc]`) and correlation fields.” <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent_memory/agent/prompt.rs` around lines 10 - 37, Add structured debug/trace logging around the new Prompt build flow: log entry and exit of the build(ctx: &PromptContext) function (e.g., "[agent-prompt] build start/finish"), log branch decisions when user_files/tools/workspace are empty vs non-empty, and wrap each external, fallible call (render_user_files, render_tools, render_workspace) with a trace entry/exit and error log that includes a stable prefix (e.g., "[agent-prompt]" or "[rpc]"), the function name, and a correlation id pulled from PromptContext if available; also log render_safety invocation and any retries/timeouts or unexpected empty results so failures are visible and grep-friendly. ``` </details> <!-- fingerprinting:phantom:poseidon:hawk --> <!-- cr-comment:v1:500619716fb2ebf8ecd38f2f --> _Source: Coding guidelines_ <!-- This is an auto-generated comment by CodeRabbit --> --- 10-38: _⚠️ Potential issue_ | _🟠 Major_ <details> <summary>🧩 Analysis chain</summary> 🏁 Script executed: ```shell #!/usr/bin/env bash set -euo pipefail # Locate prompt builder and nearby test files fd -i 'prompt.rs' src/openhuman/agent_memory fd -i '*test*.rs' src/openhuman/agent_memory # Find direct coverage of the memory-agent prompt builder rg -n --type rust -C 2 'agent_memory::agent::prompt::build|prompt::build\(|include_str!\("prompt.md"\)' ``` Repository: tinyhumansai/openhuman Length of output: 416 --- 🏁 Script executed: ```shell #!/usr/bin/env bash set -euo pipefail # Locate prompt builder and nearby test files fd -i 'prompt\.rs$' src/openhuman/agent_memory fd --iglob '*test*.rs' src/openhuman/agent_memory fd --iglob '*_tests.rs' src/openhuman/agent_memory # Check if prompt.rs has inline tests rg -n --type rust 'cfg\(test\)|mod tests' src/openhuman/agent_memory/agent/prompt.rs # Find direct references to the build() function from Rust code/tests rg -n --type rust 'agent_memory::agent::prompt::build|prompt::build\(|prompt::build<' src/openhuman/agent_memory # Find any references in test files mentioning build() rg -n --type rust --glob '*test*.rs' 'agent_memory::agent::prompt::build|prompt::build\(' src/openhuman/agent_memory ``` Repository: tinyhumansai/openhuman Length of output: 312 --- 🏁 Script executed: ```shell #!/usr/bin/env bash set -euo pipefail # Locate prompt builder and any nearby test files fd -i 'prompt\.rs$' src/openhuman/agent_memory fd --glob '*test*.rs' src/openhuman/agent_memory fd --glob '*_tests.rs' src/openhuman/agent_memory # Check if prompt.rs has inline tests rg -n --type rust 'cfg\(test\)|mod tests' src/openhuman/agent_memory/agent/prompt.rs # Find direct references to build() from Rust code/tests rg -n --type rust 'agent_memory::agent::prompt::build|prompt::build<' src/openhuman/agent_memory rg -n --type rust 'prompt::build\(' src/openhuman/agent_memory # If there are snapshot/fixture-style tests, look for render/build usage around prompt.md rg -n --type rust 'include_str!\("prompt\.md"\)|render_user_files\(|render_tools\(|render_safety\(|render_workspace\(' src/openhuman/agent_memory ``` Repository: tinyhumansai/openhuman Length of output: 109 --- 🏁 Script executed: ```shell #!/usr/bin/env bash set -euo pipefail echo "== Rust test files in src/openhuman/agent_memory ==" fd --glob '*test*.rs' src/openhuman/agent_memory || true fd --glob '*_tests.rs' src/openhuman/agent_memory || true echo "== Inline tests in prompt.rs ==" rg -n --type rust 'cfg\(test\)|mod tests' src/openhuman/agent_memory/agent/prompt.rs || true echo "== Call sites: prompt::build( or module path (whole src) ==" rg -n --type rust 'agent_memory::agent::prompt::build' src || true rg -n --type rust 'prompt::build\(' src || true rg -n --type rust 'agent_memory::agent::prompt' src || true echo "== References to prompt.md / render_* helpers in prompt.rs ==" rg -n --type rust 'include_str!\("prompt\.md"\)|render_user_files\(|render_tools\(|render_safety\(|render_workspace\(' src/openhuman/agent_memory/agent/prompt.rs || true ``` Repository: tinyhumansai/openhuman Length of output: 884 --- 🏁 Script executed: ```shell #!/usr/bin/env bash set -euo pipefail # Find where the agent registry loader + prompt_fn are used in tests rg -n --type rust 'prompt_fn:\s*crate::openhuman::agent_memory::agent::prompt::build' -S src || true # Look for test modules / harness tests touching agent registry loader rg -n --type rust '\#\[cfg\(test\)\]|mod tests|test_support' src/openhuman | head -n 50 # Search for references to loader.rs in tests rg -n --type rust 'agent_registry::agents::loader|agents/loader\.rs|prompt_fn' src/openhuman --glob '*test*.rs' --glob '*_tests.rs' || true rg -n --type rust 'agent_registry::agents::loader|agents/loader\.rs' src/openhuman || true # Search for executing agent harness / prompt building during tests rg -n --type rust 'run_tool_call_loop|orchestrator prompt|orch_prompt::build|system_prompt|prompt_fn' src/openhuman --glob '*test*.rs' --glob '*_tests.rs' || true rg -n --type rust 'orch_prompt::build|orchestrator prompt|system_prompt' src/openhuman --glob '*test*.rs' --glob '*_tests.rs' || true # Finally, search for direct calls to the memory prompt build function in tests rg -n --type rust 'agent_memory::agent::prompt::build' src/openhuman --glob '*test*.rs' --glob '*_tests.rs' || true ``` Repository: tinyhumansai/openhuman Length of output: 32081 --- 🏁 Script executed: ```shell #!/usr/bin/env bash set -euo pipefail # 1) Direct invocations in tests rg -n --type rust 'agent_memory::agent::prompt::build' src --glob '*test*.rs' --glob '*_tests.rs' || true rg -n --type rust 'prompt_fn:\s*crate::openhuman::agent_memory::agent::prompt::build' src --glob '*test*.rs' --glob '*_tests.rs' || true # 2) Where the agent registry prompt_fn is actually called rg -n --type rust 'prompt_fn' src/openhuman/agent_registry src/openhuman/agent src/openhuman/workflows src --glob '*test*.rs' --glob '*_tests.rs' || true # 3) Find runtime/harness tests that construct agent registry / agent system prompt rg -n --type rust 'agent_registry|load_agent|AgentsLoader|registry' src/openhuman/workflows src/openhuman/agent src --glob '*test*.rs' --glob '*_tests.rs' || true ``` Repository: tinyhumansai/openhuman Length of output: 27837 --- </details> **Add Rust tests for `agent_memory::agent::prompt::build`** `src/openhuman/agent_memory/agent/prompt.rs` has no `#[cfg(test)] mod tests` / sibling `*_tests.rs` exercising `build(...)`, and there are no `*_test.rs` call sites invoking `crate::openhuman::agent_memory::agent::prompt::build`. It’s only wired as `prompt_fn` in `src/openhuman/agent_registry/agents/loader.rs` (line ~194). Add Rust tests covering the prompt composition flow (happy path) and at least one edge case where `render_user_files`/`render_tools`/`render_workspace` render empty sections, per the repo’s testing guidelines. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent_memory/agent/prompt.rs` around lines 10 - 38, Add unit tests exercising build(...) by creating a #[cfg(test)] mod tests in the same prompt.rs (or a sibling prompt_tests.rs) that constructs minimal PromptContext instances and asserts the composed string contains ARCHETYPE plus rendered sections; include a happy-path test where render_user_files(ctx), render_tools(ctx), and render_workspace(ctx) return normal content and assert expected concatenation and spacing, and an edge-case test where you supply a PromptContext (or mocks/stubbed helpers) that causes render_user_files, render_tools, or render_workspace to return empty strings and assert build(...) omits those sections and preserves spacing/newlines correctly. Reference the functions PromptContext, build, render_user_files, render_tools, render_workspace and use Result<String> from build(...) in assertions. ``` </details> <!-- fingerprinting:phantom:poseidon:hawk --> <!-- cr-comment:v1:21d4d1a7b4677d231022efba --> _Source: Coding guidelines_ <!-- This is an auto-generated comment by CodeRabbit --> --- 27-29: _⚠️ Potential issue_ | _🟠 Major_ | _⚡ Quick win_ **`omit_safety_preamble` is effectively ignored for this agent.** `agent_memory/agent/agent.toml` sets `omit_safety_preamble = true`, but this builder always appends `render_safety()`. That creates a config/behavior contract break for this dynamic prompt path. <details> <summary>Suggested fix</summary> ```diff -use crate::openhuman::context::prompt::{ - render_safety, render_tools, render_user_files, render_workspace, PromptContext, -}; +use crate::openhuman::context::prompt::{ + render_tools, render_user_files, render_workspace, PromptContext, +}; @@ - let safety = render_safety(); - out.push_str(safety.trim_end()); - out.push_str("\n\n"); + // Intentionally omitted: agent_memory sets omit_safety_preamble=true in agent.toml. ``` </details> <!-- suggestion_start --> <details> <summary>📝 Committable suggestion</summary> > ‼️ **IMPORTANT** > Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements. ```suggestion use crate::openhuman::context::prompt::{ render_tools, render_user_files, render_workspace, PromptContext, }; // ... earlier code ... let tools = render_tools(&ctx); out.push_str(tools.trim_end()); out.push_str("\n\n"); let user_files = render_user_files(&ctx); out.push_str(user_files.trim_end()); out.push_str("\n\n"); let workspace = render_workspace(&ctx); out.push_str(workspace.trim_end()); out.push_str("\n\n"); // Intentionally omitted: agent_memory sets omit_safety_preamble=true in agent.toml. ``` </details> <!-- suggestion_end --> <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent_memory/agent/prompt.rs` around lines 27 - 29, The builder unconditionally appends the safety preamble by calling render_safety() and pushing its text, which ignores the omit_safety_preamble config; update the prompt construction to check the omit_safety_preamble flag (the config/field named omit_safety_preamble used when building this agent) and only call render_safety() / push_str its output when omit_safety_preamble is false (or absent), e.g., wrap the render_safety() / out.push_str(...) calls in a conditional that references omit_safety_preamble so the dynamic prompt honors the agent.toml setting. ``` </details> <!-- fingerprinting:phantom:poseidon:hawk --> <!-- cr-comment:v1:0aaa8abc1ae4ad5acff5d1b2 --> <!-- This is an auto-generated comment by CodeRabbit --> </blockquote></details> <details> <summary>src/openhuman/agent_memory/README.md (1)</summary><blockquote> 46-59: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_ **Document the retrieval benchmark script alongside the walk benchmark.** This section only shows `bench-memory-walk.sh`, but this PR also introduces `scripts/bench-memory-retrieval.sh`. Adding it here keeps the benchmark docs complete. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent_memory/README.md` around lines 46 - 59, The README's Benchmarking section only lists the walk benchmark script; add documentation for the new retrieval benchmark by appending usage examples for scripts/bench-memory-retrieval.sh (mirroring the style of bench-memory-walk.sh), including default invocation, passing a custom --query and --max-turns, and a --content-root override; reference both script names (bench-memory-walk.sh and bench-memory-retrieval.sh) so readers can run either benchmark. ``` </details> <!-- fingerprinting:phantom:poseidon:hawk --> <!-- cr-comment:v1:f0b0910707f233a8339ff12f --> <!-- This is an auto-generated comment by CodeRabbit --> </blockquote></details> <details> <summary>src/openhuman/agent/harness/archivist/helpers.rs (1)</summary><blockquote> 44-57: _⚠️ Potential issue_ | _🟠 Major_ | _⚡ Quick win_ **Multiline `tool_calls` payloads are only partially stripped.** The current line filter removes marker lines but can leave inner JSON argument lines/braces when payloads are pretty-printed across multiple lines. That allows structured tool payloads to leak into tree ingest despite the raw-prose-only contract. Please remove the entire `tool_calls`/`tool_use` block (not just matching lines), e.g., with a block-skip parser keyed on marker detection plus balanced braces. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent/harness/archivist/helpers.rs` around lines 44 - 57, The current filtering logic that trims per-line markers leaves multi-line JSON tool payloads in variable cleaned; update the logic that processes cleaned (the chain starting with cleaned.lines().filter(...) ) to detect start markers like "\"tool_use\"", "{\"tool_calls\"", or "\"tool_calls\"" and skip entire brace-balanced blocks (or until the matching closing bracket) instead of only dropping matching lines; implement a block-skip parser that scans the input string, on seeing those markers consumes the following JSON object/array by tracking opening and closing braces/brackets (handling nested braces and string-escaped braces) and only appends non-skipped regions to the output so no multi-line tool_calls/tool_use payloads remain. ``` </details> <!-- fingerprinting:phantom:poseidon:hawk --> <!-- cr-comment:v1:cec636835e176d0d8456114d --> <!-- This is an auto-generated comment by CodeRabbit --> </blockquote></details> <details> <summary>src/openhuman/agent/harness/archivist/hook_impl.rs (1)</summary><blockquote> 33-150: _⚠️ Potential issue_ | _🟠 Major_ | _🏗️ Heavy lift_ **Boundary turn is indexed before boundary closure, causing segment overlap.** Line 33/Line 65 persist the current turn before Line 135 runs boundary management. When a boundary is detected, the old segment is closed after this write, so closed-segment recap extraction can include the boundary turn while the new segment also starts at that same turn. This creates duplicate segment attribution (recap/events/tree ingest). Please restructure so boundary close happens before persisting the new turn into segment scope (or otherwise enforce a non-overlapping segment membership source-of-truth not based on the current timestamp window). <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent/harness/archivist/hook_impl.rs` around lines 33 - 150, The fix: avoid indexing the assistant turn before segment boundary closure by calling manage_segment_sync immediately after inserting the user episodic entry (the first fts5::episodic_insert and after computing current_episodic_id) and handling the returned closed_segment (and running on_segment_closed) before creating/writing the assistant episodic entry, its memory_archivist dual-write, and subsequent ingestion; move the assistant fts5::episodic_insert + assistant dual-write + on-segment ingestion logic to run only after manage_segment_sync completes so segment membership is non-overlapping (refer to fts5::episodic_insert, manage_segment_sync, and on_segment_closed). ``` </details> <!-- fingerprinting:phantom:poseidon:hawk --> <!-- cr-comment:v1:5e8ee66d50f1911b4cd4af98 --> <!-- This is an auto-generated comment by CodeRabbit --> </blockquote></details> <details> <summary>src/openhuman/agent/harness/archivist/recap.rs (1)</summary><blockquote> 90-113: _⚠️ Potential issue_ | _🟠 Major_ | _⚡ Quick win_ **Recap summarization currently consumes raw assistant payloads.** The corpus builder assumes assistant `content` is already prose-only, but `on_turn_complete` stores `ctx.assistant_response` verbatim. That means tool-call blobs/image markers can be sent into `summarise(...)`, which is a data-leak and summary-quality risk. <details> <summary>Suggested fix</summary> ```diff +use super::helpers::strip_tool_calls_from_response; use super::types::ArchivistHook; @@ - let corpus_inputs: Vec<SummaryInput> = entries + let corpus_inputs: Vec<SummaryInput> = entries .iter() - .filter(|e| !e.content.trim().is_empty()) - .map(|e| { + .filter_map(|e| { use crate::openhuman::memory_store::chunks::types::approx_token_count; - let content = e.content.clone(); + let raw = if e.role == "assistant" { + strip_tool_calls_from_response(&e.content) + } else { + e.content.clone() + }; + let (content, _image_refs) = + crate::openhuman::agent::multimodal::parse_image_markers(&raw); + if content.trim().is_empty() { + return None; + } let token_count = approx_token_count(&content); let ts = chrono::DateTime::from_timestamp(e.timestamp as i64, 0) .unwrap_or_else(chrono::Utc::now); - SummaryInput { + Some(SummaryInput { id: format!("{}-{}", e.role, e.timestamp as u64), content, token_count, entities: Vec::new(), topics: Vec::new(), time_range_start: ts, time_range_end: ts, score: 0.5, - } + }) }) .collect(); ``` </details> <!-- fingerprinting:phantom:poseidon:hawk --> <!-- cr-comment:v1:5527223d53253a45d3dd6ac9 --> <!-- This is an auto-generated comment by CodeRabbit --> </blockquote></details> <details> <summary>src/openhuman/agent/harness/session/builder/setters.rs (1)</summary><blockquote> 446-462: _⚠️ Potential issue_ | _🟠 Major_ | _🏗️ Heavy lift_ **`SegmentRecapSummarizer` captures a stale session id.** This snapshots `self.event_session_id` during `build()`, but callers can set the real event context after construction. `from_config_for_read_only_council_juror()` already does that immediately after `build_session_agent_inner()`. The recap summarizer keeps `"standalone"` forever, so unified compaction queries the archivist under the wrong session and misses the live rolling recap. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent/harness/session/builder/setters.rs` around lines 446 - 462, The SegmentRecapSummarizer is being constructed with a snapshot of self.event_session_id ("session_id_for_recap") inside build(), causing it to hold the stale "standalone" id; instead, change the construction so the summarizer queries the current session id at runtime (e.g., pass a closure/getter or a reference to the session-id source) rather than the cloned string. Update the builder code around unified_compaction_enabled / archivist_hook where Arc::new(SegmentRecapSummarizer::new(..., session_id_for_recap, ...)) is invoked so it supplies a dynamic accessor (or shared Atomic/Arc<String> updated by from_config_for_read_only_council_juror after build_session_agent_inner()) so SegmentRecapSummarizer uses the live event_session_id rather than the captured snapshot. ``` </details> <!-- fingerprinting:phantom:medusa:grasshopper --> <!-- cr-comment:v1:aee36538327aa2e2df8dd7f3 --> <!-- This is an auto-generated comment by CodeRabbit --> </blockquote></details> <details> <summary>src/openhuman/agent/harness/session/turn/core.rs (2)</summary><blockquote> 93-95: _⚠️ Potential issue_ | _🟠 Major_ | _⚡ Quick win_ **Avoid dumping the full system prompt here.** `omit_profile && omit_memory_md` is not a safe proxy for “no sensitive user data”. The rendered prompt can still include `UserIdentitySection` content (`name` / `email` / `id`), so this branch leaks full PII into debug logs for narrow-specialist sessions. Keep the hashed/redacted path for all user-scoped prompts, or gate full-body logging behind an explicit opt-in that also suppresses identity sections. As per coding guidelines, use Rust `log`/`tracing` diagnostics but never log secrets or full PII. Based on `src/openhuman/agent/prompts/sections.rs`, the rendered prompt can still contain user identity fields even when `omit_profile` and `omit_memory_md` are both true. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent/harness/session/turn/core.rs` around lines 93 - 95, The current debug branch logs the full rendered_prompt when self.omit_profile && self.omit_memory_md, which can still contain PII (e.g., UserIdentitySection name/email/id); change this so you never emit the raw rendered_prompt to logs: always log the redacted/hashed version (use whatever redacted_prompt variable/path you already produce for user-scoped prompts) or gate the raw dump behind an explicit opt-in flag (e.g., allow_full_prompt_logging) that additionally strips/suppresses UserIdentitySection content before logging; update the branch using symbols omit_profile, omit_memory_md, rendered_prompt and UserIdentitySection to either call the redaction path or check the explicit opt-in and enforce identity suppression before calling log::debug!. ``` </details> <!-- fingerprinting:phantom:medusa:grasshopper --> <!-- cr-comment:v1:a0f06c5f864d712c8aade42e --> _Source: Coding guidelines_ <!-- This is an auto-generated comment by CodeRabbit --> --- 206-210: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_ **Log memory-context load failures instead of silently defaulting.** `unwrap_or_default()` makes a store/loader outage indistinguishable from “no recall matched”, so a broken recall path just degrades quietly. <details> <summary>Suggested fix</summary> ```diff - let context = self - .memory_loader - .load_context(self.memory.as_ref(), user_message) - .await - .unwrap_or_default(); + let context = match self + .memory_loader + .load_context(self.memory.as_ref(), user_message) + .await + { + Ok(context) => context, + Err(err) => { + log::warn!("[agent_loop] memory context load failed: {err}"); + String::new() + } + }; ``` </details> As per coding guidelines, changed flows should log errors with stable prefixes instead of swallowing them. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent/harness/session/turn/core.rs` around lines 206 - 210, The current call to memory_loader.load_context(...).await.unwrap_or_default() swallows load errors; change it to match on the Result from memory_loader.load_context(self.memory.as_ref(), user_message).await, and when Err(e) occurs log the error with a stable, searchable prefix (e.g. "MEMORY_LOAD_ERROR:") and relevant context (such as a short user_message identifier and self.memory info), then continue using a default context on error; ensure you use the project's logging facility (error! or tracing::error) and keep the variable named context so downstream code is unchanged. ``` </details> <!-- fingerprinting:phantom:medusa:grasshopper --> <!-- cr-comment:v1:21a29e1d69e199f6e2eac629 --> _Source: Coding guidelines_ <!-- This is an auto-generated comment by CodeRabbit --> </blockquote></details> <details> <summary>src/openhuman/agent/harness/session/turn/tools.rs (1)</summary><blockquote> 228-276: _⚠️ Potential issue_ | _🟠 Major_ | _🏗️ Heavy lift_ **The shared-Arc path advertises delegate tools that the executor cannot run.** When `Arc::get_mut(&mut self.tools)` fails, this function still publishes the refreshed schema/allowlist but does **not** install the new `Box<dyn Tool>` instances. The next provider call can therefore see `delegate_*` specs that `run_agent_tool_call` cannot resolve from `ctx.tools`, which degrades into `"Unknown tool"` at execution time. Returning `false` here also makes `refresh_delegation_tools_from_cached_integrations()` roll back `connected_integrations` and skip `last_seen_integrations_hash`, so the inconsistent refresh repeats on later turns. Please make the return contract reflect “schema-only refresh applied” vs “no refresh applied”, or roll the schema changes back when executable tool instances cannot be reconciled. Also applies to: 381-401 <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent/harness/session/turn/tools.rs` around lines 228 - 276, The code currently commits schema-only changes when refresh_delegation_tools() succeeds even if Arc::get_mut(&mut self.tools) (or equivalent installation of Box<dyn Tool> instances) fails, causing published delegate specs that cannot be executed; change refresh_delegation_tools_from_cached_integrations to treat a schema-only refresh as non-committal: after calling refresh_delegation_tools(), immediately verify you can install/obtain mutable access to the runtime tools (i.e., attempt Arc::get_mut on self.tools or otherwise confirm all required Box<dyn Tool> instances were created); if that verification fails, restore self.connected_integrations = prev_integrations and do not update self.last_seen_integrations_hash, self.connected_integrations_initialized, or mutate announced_integrations/pending_integration_announcement (rollback schema changes), otherwise commit the hash and append connected_slugs as before; apply the same rollback/verification logic to the duplicate code path around lines 381-401. ``` </details> <!-- fingerprinting:phantom:medusa:grasshopper --> <!-- cr-comment:v1:f9267136f04a58e27527997a --> <!-- This is an auto-generated comment by CodeRabbit --> </blockquote></details> <details> <summary>src/openhuman/agent/harness/subagent_runner/ops/runner.rs (1)</summary><blockquote> 70-74: _⚠️ Potential issue_ | _🔴 Critical_ | _⚡ Quick win_ **Sanitize `task_id` before using it in checkpoint file paths (path traversal risk).** Line 73 accepts caller-provided `task_id` unchanged, and Line 698 uses it in `checkpoint_dir.join(format!("{task_id}.json"))`. A crafted value (e.g., containing `../` or absolute prefixes) can escape the checkpoint directory and overwrite unintended files. <details> <summary>🔒 Proposed fix</summary> ```diff +fn checkpoint_file_stem(task_id: &str) -> String { + let sanitized: String = task_id + .chars() + .map(|c| { + if c.is_ascii_alphanumeric() || c == '-' || c == '_' { + c + } else { + '_' + } + }) + .collect(); + sanitized.trim_matches('_').to_string() +} + @@ - let checkpoint_path = checkpoint_dir.join(format!("{task_id}.json")); + let safe_stem = checkpoint_file_stem(task_id); + let safe_stem = if safe_stem.is_empty() { + format!("sub-{}", uuid::Uuid::new_v4()) + } else { + safe_stem + }; + let checkpoint_path = checkpoint_dir.join(format!("{safe_stem}.json")); ``` </details> Also applies to: 674-702 <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent/harness/subagent_runner/ops/runner.rs` around lines 70 - 74, The caller-provided task_id is used directly in filenames (e.g., checkpoint_dir.join(format!("{task_id}.json"))), creating a path traversal risk; update the code that sets task_id (the options.task_id handling in runner.rs) to sanitize/validate the input: reject or normalize any value containing path separators, leading dots/slashes, drive letters, or characters outside a safe set (allow only alphanumerics, '-' and '_' and fall back to the generated uuid when invalid), or replace invalid characters with '_' so the result cannot escape checkpoint_dir; use the sanitized_task_id variable wherever task_id is used for checkpoint filenames and apply the same sanitization to other usages in the 674-702 region. ``` </details> <!-- fingerprinting:phantom:poseidon:hawk --> <!-- cr-comment:v1:e4ba83fee1164fe72da853d1 --> <!-- This is an auto-generated comment by CodeRabbit --> </blockquote></details> <details> <summary>src/openhuman/agent/harness/subagent_runner/ops/tool_source.rs (1)</summary><blockquote> 121-123: _⚠️ Potential issue_ | _🟡 Minor_ | _⚡ Quick win_ **Pass the real `task_id` into `apply_handoff` (Line 122).** The current call hardcodes `""` for `task_id`, so handoff stash logs lose per-task correlation. <details> <summary>🔧 Suggested fix</summary> ```diff diff --git a/src/openhuman/agent/harness/subagent_runner/ops/tool_source.rs b/src/openhuman/agent/harness/subagent_runner/ops/tool_source.rs @@ pub(super) struct SubagentToolSource<'a> { @@ pub(super) policy: crate::openhuman::tools::policy::DefaultToolPolicy, pub(super) agent_id: String, + pub(super) task_id: String, } @@ let text = match self.handoff_cache { - Some(cache) => apply_handoff(cache, &call.name, "", &self.agent_id, outcome.text), + Some(cache) => apply_handoff( + cache, + &call.name, + &self.task_id, + &self.agent_id, + outcome.text, + ), None => outcome.text, }; ``` ```diff diff --git a/src/openhuman/agent/harness/subagent_runner/ops/loop_.rs b/src/openhuman/agent/harness/subagent_runner/ops/loop_.rs @@ let mut tool_source = SubagentToolSource { @@ policy: crate::openhuman::tools::policy::DefaultToolPolicy, agent_id: agent_id.to_string(), + task_id: task_id.to_string(), }; ``` </details> As per coding guidelines: “Include debug logging ... with ... correlation fields (`[domain]`, `[rpc]`)”. <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent/harness/subagent_runner/ops/tool_source.rs` around lines 121 - 123, The handoff call is passing a hardcoded empty task_id; replace the "" with the real task identifier available in this scope (use the correct identifier such as self.task_id or call.task_id or outcome.task_id depending on which exists in this module) when invoking apply_handoff(cache, &call.name, ..., &self.agent_id, outcome.text), preserving the existing argument order and types so per-task correlation is retained in handoff stash logs. ``` </details> <!-- fingerprinting:phantom:poseidon:hawk --> <!-- cr-comment:v1:59a57a99b94aa305730aceed --> _Source: Coding guidelines_ <!-- This is an auto-generated comment by CodeRabbit --> </blockquote></details> </blockquote></details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
Promote pub(super) to pub(crate) for items re-exported via #[cfg(test)] blocks in four mod.rs files (subagent_runner/ops, composio/ops, config/schema/load, meet_agent/brain) so the test companion modules can access them without E0364/E0365 errors during cargo test builds.
- Apply main's subagent allowlist changes to split turn/tools.rs - Fix pub(super) → pub(crate) visibility for 10+ symbols in split sub-modules so test re-exports compile - Add missing test-only imports (Path, PathBuf, Map, Value, RpcOutcome, SourceKind, etc.) to split module mod.rs files - Resolve turn.rs delete/modify merge conflict
The config schemas.rs was split into schemas/schema_defs.rs during the module refactor. Update the drift-guard test to read from the new path.
… modules The upstream merge (e73444d) pulled tinyhumansai#3424 (split 24 oversized modules into sub-dirs) + tinyhumansai#3426. Those split files (turn/, workflows/schemas/, builder/, prompts/, subagent_runner/ops/) and 3 new FE files (SkillsExplorerTab) carried the old `skills` naming; re-apply the workflows rename onto the new locations so the tree compiles. SkillsWildcard / related_skills / openhuman::skills:: module paths left intact. cargo fmt applied to schemas/handlers.rs. Co-Authored-By: Claude <noreply@anthropic.com>
Summary
src/openhuman/that exceeded 1000 lines into smaller sub-module directories (target ≤500 lines each)#[cfg(test)]blocks into companion*_tests.rsfiles where applicablepub usere-exports in each newmod.rs.rsfilenamesProblem
Solution
foo.rs→foo/mod.rs+ focused sub-modules (types, ops, helpers, etc.)pub usere-exports inmod.rspreserve the exact external API surface — no callers change#[cfg(test)]blocks extracted to sibling*_tests.rsfiles referenced via#[path]cargo fmtapplied to all new filesSubmission Checklist
Impact
Related
tests/json_rpc_e2e.rs(10,877 lines) also needs splittingAI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
Validation Run
pnpm --filter openhuman-app format:checkpnpm typecheckGGML_NATIVE=OFF cargo check— zero errorsValidation Blocked
command:N/Aerror:N/Aimpact:N/ABehavior Changes
Parity Contract
Duplicate / Superseded PR Handling
Summary by CodeRabbit
New Features
Refactor
Tests