Commit 53d6909

Emit structured doctor JSON diagnostics
1 parent ceaf9cb commit 53d6909

3 files changed

Lines changed: 256 additions & 32 deletions

ROADMAP.md

Lines changed: 1 addition & 1 deletion
@@ -309,7 +309,7 @@ Priority order: P0 = blocks CI/green state, P1 = blocks integration wiring, P2 =
 20. **Session state classification gap (working vs blocked vs finished vs truly stale)** — **done**: agent manifests now derive machine states such as `working`, `blocked_background_job`, `blocked_merge_conflict`, `degraded_mcp`, `interrupted_transport`, `finished_pending_report`, and `finished_cleanable`, and terminal-state persistence records commit provenance plus derived state so downstream monitoring can distinguish quiet progress from truly idle sessions.
 21. **Resumed `/status` JSON parity gap** — dogfooding shows fresh `claw status --output-format json` now emits structured JSON, but resumed slash-command status still leaks through a text-shaped path in at least one dispatch path. Local CI-equivalent repro fails `rust/crates/rusty-claude-cli/tests/resume_slash_commands.rs::resumed_status_command_emits_structured_json_when_requested` with `expected value at line 1 column 1`, so resumed automation can receive text where JSON was explicitly requested. **Action:** unify fresh vs resumed `/status` rendering through one output-format contract and add regression coverage so resumed JSON output is guaranteed valid.
 22. **Opaque failure surface for session/runtime crashes** — repeated dogfood-facing failures can currently collapse to generic wrappers like `Something went wrong while processing your request. Please try again, or use /new to start a fresh session.` without exposing whether the fault was provider auth, session corruption, slash-command dispatch, render failure, or transport/runtime panic. This blocks fast self-recovery and turns actionable clawability bugs into blind retries. **Action:** preserve a short user-safe failure class (`provider_auth`, `session_load`, `command_dispatch`, `render`, `runtime_panic`, etc.), attach a local trace/session id, and ensure operators can jump from the chat-visible error to the exact failure log quickly.
-23. **`doctor --output-format json` check-level structure gap** — direct dogfooding shows `claw doctor --output-format json` exposes `has_failures` at the top level, but individual check results (`auth`, `config`, `workspace`, `sandbox`, `system`) are buried inside flat prose fields like `message` / `report`. That forces claws to string-scrape human text instead of consuming stable machine-readable diagnostics. **Action:** emit structured per-check JSON (`name`, `status`, `summary`, `details`, and relevant typed fields such as sandbox fallback reason) while preserving the current human-readable report for text mode.
+23. **`doctor --output-format json` check-level structure gap** — **done**: `claw doctor --output-format json` now keeps the human-readable `message`/`report` while also emitting structured per-check diagnostics (`name`, `status`, `summary`, `details`, plus typed fields like workspace paths and sandbox fallback data), with regression coverage in `output_format_contract.rs`.
 24. **Plugin lifecycle init/shutdown test flakes under workspace-parallel execution** — dogfooding surfaced that `build_runtime_runs_plugin_lifecycle_init_and_shutdown` can fail under `cargo test --workspace` while passing in isolation because sibling tests race on tempdir-backed shell init script paths. This is test brittleness rather than a code-path regression, but it still destabilizes CI confidence and wastes diagnosis cycles. **Action:** isolate temp resources per test robustly (unique dirs + no shared cwd assumptions), audit cleanup timing, and add a regression guard so the plugin lifecycle test remains stable under parallel workspace execution.
 **P3 — Swarm efficiency**
 13. Swarm branch-lock protocol — **done**: `branch_lock::detect_branch_lock_collisions()` now detects same-branch/same-scope and nested-module collisions before parallel lanes drift into duplicate implementation
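
Note on item 23: the point of the structured output is that callers can branch on per-check fields instead of scraping `message`/`report`. A minimal consumer-side sketch of that idea follows, assuming the `checks` array has already been parsed into plain structs. `CheckEntry`, `failing_check_names`, `sample_checks`, the sample values, and the `"fail"` status label are all hypothetical and not taken from the commit; the real labels come from `DiagnosticLevel::label()`, which this diff does not show.

```rust
// Hypothetical consumer-side model of one entry in the `checks` array.
// Only `name` and `status` are modelled; the status strings are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
struct CheckEntry {
    name: String,
    status: String,
}

/// Returns the names of all checks whose status equals `failing_label`.
fn failing_check_names(checks: &[CheckEntry], failing_label: &str) -> Vec<String> {
    checks
        .iter()
        .filter(|check| check.status == failing_label)
        .map(|check| check.name.clone())
        .collect()
}

/// Made-up sample data, standing in for parsed `claw doctor` JSON output.
fn sample_checks() -> Vec<CheckEntry> {
    vec![
        CheckEntry {
            name: "auth".to_string(),
            status: "ok".to_string(),
        },
        CheckEntry {
            name: "sandbox".to_string(),
            status: "fail".to_string(),
        },
    ]
}

fn main() {
    let failing = failing_check_names(&sample_checks(), "fail");
    println!("{failing:?}");
}
```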

rust/crates/rusty-claude-cli/src/main.rs

Lines changed: 220 additions & 31 deletions
@@ -51,7 +51,7 @@ use runtime::{
     Session, TokenUsage, ToolError, ToolExecutor, UsageTracker,
 };
 use serde::Deserialize;
-use serde_json::json;
+use serde_json::{json, Map, Value};
 use tools::{GlobalToolRegistry, RuntimeToolDefinition, ToolSearchOutput};
 
 const DEFAULT_MODEL: &str = "claude-opus-4-6";
@@ -870,6 +870,7 @@ struct DiagnosticCheck {
     level: DiagnosticLevel,
     summary: String,
     details: Vec<String>,
+    data: Map<String, Value>,
 }
 
 impl DiagnosticCheck {
@@ -879,13 +880,45 @@ impl DiagnosticCheck {
             level,
             summary: summary.into(),
             details: Vec::new(),
+            data: Map::new(),
         }
     }
 
     fn with_details(mut self, details: Vec<String>) -> Self {
         self.details = details;
         self
     }
+
+    fn with_data(mut self, data: Map<String, Value>) -> Self {
+        self.data = data;
+        self
+    }
+
+    fn json_value(&self) -> Value {
+        let mut value = Map::from_iter([
+            (
+                "name".to_string(),
+                Value::String(self.name.to_ascii_lowercase()),
+            ),
+            (
+                "status".to_string(),
+                Value::String(self.level.label().to_string()),
+            ),
+            ("summary".to_string(), Value::String(self.summary.clone())),
+            (
+                "details".to_string(),
+                Value::Array(
+                    self.details
+                        .iter()
+                        .cloned()
+                        .map(Value::String)
+                        .collect::<Vec<_>>(),
+                ),
+            ),
+        ]);
+        value.extend(self.data.clone());
+        Value::Object(value)
+    }
 }
 
 #[derive(Debug, Clone, PartialEq, Eq)]
@@ -894,26 +927,29 @@ struct DoctorReport {
 }
 
 impl DoctorReport {
+    fn counts(&self) -> (usize, usize, usize) {
+        (
+            self.checks
+                .iter()
+                .filter(|check| check.level == DiagnosticLevel::Ok)
+                .count(),
+            self.checks
+                .iter()
+                .filter(|check| check.level == DiagnosticLevel::Warn)
+                .count(),
+            self.checks
+                .iter()
+                .filter(|check| check.level == DiagnosticLevel::Fail)
+                .count(),
+        )
+    }
+
     fn has_failures(&self) -> bool {
         self.checks.iter().any(|check| check.level.is_failure())
     }
 
     fn render(&self) -> String {
-        let ok_count = self
-            .checks
-            .iter()
-            .filter(|check| check.level == DiagnosticLevel::Ok)
-            .count();
-        let warn_count = self
-            .checks
-            .iter()
-            .filter(|check| check.level == DiagnosticLevel::Warn)
-            .count();
-        let fail_count = self
-            .checks
-            .iter()
-            .filter(|check| check.level == DiagnosticLevel::Fail)
-            .count();
+        let (ok_count, warn_count, fail_count) = self.counts();
         let mut lines = vec![
             "Doctor".to_string(),
             format!(
@@ -923,6 +959,28 @@ impl DoctorReport {
         lines.extend(self.checks.iter().map(render_diagnostic_check));
         lines.join("\n\n")
     }
+
+    fn json_value(&self) -> Value {
+        let report = self.render();
+        let (ok_count, warn_count, fail_count) = self.counts();
+        json!({
+            "kind": "doctor",
+            "message": report,
+            "report": report,
+            "has_failures": self.has_failures(),
+            "summary": {
+                "total": self.checks.len(),
+                "ok": ok_count,
+                "warnings": warn_count,
+                "failures": fail_count,
+            },
+            "checks": self
+                .checks
+                .iter()
+                .map(DiagnosticCheck::json_value)
+                .collect::<Vec<_>>(),
+        })
+    }
 }
 
 fn render_diagnostic_check(check: &DiagnosticCheck) -> String {
@@ -980,22 +1038,17 @@ fn run_doctor(output_format: CliOutputFormat) -> Result<(), Box<dyn std::error::
     let message = report.render();
     match output_format {
         CliOutputFormat::Text => println!("{message}"),
-        CliOutputFormat::Json => println!(
-            "{}",
-            serde_json::to_string_pretty(&json!({
-                "kind": "doctor",
-                "message": message,
-                "report": message,
-                "has_failures": report.has_failures(),
-            }))?
-        ),
+        CliOutputFormat::Json => {
+            println!("{}", serde_json::to_string_pretty(&report.json_value())?);
+        }
     }
     if report.has_failures() {
         return Err("doctor found failing checks".into());
     }
     Ok(())
 }
 
+#[allow(clippy::too_many_lines)]
 fn check_auth_health() -> DiagnosticCheck {
     let api_key_present = env::var("ANTHROPIC_API_KEY")
         .ok()
@@ -1060,6 +1113,21 @@ fn check_auth_health() -> DiagnosticCheck {
                 },
             )
             .with_details(details)
+            .with_data(Map::from_iter([
+                ("api_key_present".to_string(), json!(api_key_present)),
+                ("auth_token_present".to_string(), json!(auth_token_present)),
+                ("saved_oauth_present".to_string(), json!(true)),
+                ("saved_oauth_expired".to_string(), json!(expired)),
+                (
+                    "saved_oauth_expires_at".to_string(),
+                    json!(token_set.expires_at),
+                ),
+                (
+                    "refresh_token_present".to_string(),
+                    json!(token_set.refresh_token.is_some()),
+                ),
+                ("scopes".to_string(), json!(token_set.scopes)),
+            ]))
         }
         Ok(None) => DiagnosticCheck::new(
             "Auth",
@@ -1082,12 +1150,31 @@ fn check_auth_health() -> DiagnosticCheck {
             } else {
                 "absent"
             }
-        )]),
+        )])
+        .with_data(Map::from_iter([
+            ("api_key_present".to_string(), json!(api_key_present)),
+            ("auth_token_present".to_string(), json!(auth_token_present)),
+            ("saved_oauth_present".to_string(), json!(false)),
+            ("saved_oauth_expired".to_string(), json!(false)),
+            ("saved_oauth_expires_at".to_string(), Value::Null),
+            ("refresh_token_present".to_string(), json!(false)),
+            ("scopes".to_string(), json!(Vec::<String>::new())),
+        ])),
         Err(error) => DiagnosticCheck::new(
             "Auth",
             DiagnosticLevel::Fail,
             format!("failed to inspect saved credentials: {error}"),
-        ),
+        )
+        .with_data(Map::from_iter([
+            ("api_key_present".to_string(), json!(api_key_present)),
+            ("auth_token_present".to_string(), json!(auth_token_present)),
+            ("saved_oauth_present".to_string(), Value::Null),
+            ("saved_oauth_expired".to_string(), Value::Null),
+            ("saved_oauth_expires_at".to_string(), Value::Null),
+            ("refresh_token_present".to_string(), Value::Null),
+            ("scopes".to_string(), Value::Null),
+            ("saved_oauth_error".to_string(), json!(error.to_string())),
+        ])),
     }
 }
 
@@ -1121,7 +1208,7 @@ fn check_config_health(
             } else {
                 details.extend(
                     discovered_paths
-                        .into_iter()
+                        .iter()
                         .map(|path| format!("Discovered file {path}")),
                 );
             }
@@ -1139,6 +1226,22 @@ fn check_config_health(
                 },
             )
             .with_details(details)
+            .with_data(Map::from_iter([
+                ("discovered_files".to_string(), json!(discovered_paths)),
+                (
+                    "discovered_files_count".to_string(),
+                    json!(discovered_count),
+                ),
+                (
+                    "loaded_config_files".to_string(),
+                    json!(loaded_entries.len()),
+                ),
+                ("resolved_model".to_string(), json!(runtime_config.model())),
+                (
+                    "mcp_servers".to_string(),
+                    json!(runtime_config.mcp().servers().len()),
+                ),
+            ]))
         }
         Err(error) => DiagnosticCheck::new(
             "Config",
@@ -1149,10 +1252,21 @@ fn check_config_health(
             vec!["Discovered files <none>".to_string()]
         } else {
             discovered_paths
-                .into_iter()
+                .iter()
                 .map(|path| format!("Discovered file {path}"))
                 .collect()
-        }),
+        })
+        .with_data(Map::from_iter([
+            ("discovered_files".to_string(), json!(discovered_paths)),
+            (
+                "discovered_files_count".to_string(),
+                json!(discovered_count),
+            ),
+            ("loaded_config_files".to_string(), json!(0)),
+            ("resolved_model".to_string(), Value::Null),
+            ("mcp_servers".to_string(), Value::Null),
+            ("load_error".to_string(), json!(error.to_string())),
+        ])),
     }
 }
 
@@ -1194,6 +1308,38 @@ fn check_workspace_health(context: &StatusContext) -> DiagnosticCheck {
             context.memory_file_count, context.loaded_config_files, context.discovered_config_files
         ),
     ])
+    .with_data(Map::from_iter([
+        ("cwd".to_string(), json!(context.cwd.display().to_string())),
+        (
+            "project_root".to_string(),
+            json!(context
+                .project_root
+                .as_ref()
+                .map(|path| path.display().to_string())),
+        ),
+        ("in_git_repo".to_string(), json!(in_repo)),
+        ("git_branch".to_string(), json!(context.git_branch)),
+        (
+            "git_state".to_string(),
+            json!(context.git_summary.headline()),
+        ),
+        (
+            "changed_files".to_string(),
+            json!(context.git_summary.changed_files),
+        ),
+        (
+            "memory_file_count".to_string(),
+            json!(context.memory_file_count),
+        ),
+        (
+            "loaded_config_files".to_string(),
+            json!(context.loaded_config_files),
+        ),
+        (
+            "discovered_config_files".to_string(),
+            json!(context.discovered_config_files),
+        ),
+    ]))
 }
 
 fn check_sandbox_health(status: &runtime::SandboxStatus) -> DiagnosticCheck {
@@ -1224,17 +1370,51 @@ fn check_sandbox_health(status: &runtime::SandboxStatus) -> DiagnosticCheck {
         },
     )
     .with_details(details)
+    .with_data(Map::from_iter([
+        ("enabled".to_string(), json!(status.enabled)),
+        ("active".to_string(), json!(status.active)),
+        ("supported".to_string(), json!(status.supported)),
+        (
+            "namespace_supported".to_string(),
+            json!(status.namespace_supported),
+        ),
+        (
+            "namespace_active".to_string(),
+            json!(status.namespace_active),
+        ),
+        (
+            "network_supported".to_string(),
+            json!(status.network_supported),
+        ),
+        ("network_active".to_string(), json!(status.network_active)),
+        (
+            "filesystem_mode".to_string(),
+            json!(status.filesystem_mode.as_str()),
+        ),
+        (
+            "filesystem_active".to_string(),
+            json!(status.filesystem_active),
+        ),
+        ("allowed_mounts".to_string(), json!(status.allowed_mounts)),
+        ("in_container".to_string(), json!(status.in_container)),
+        (
+            "container_markers".to_string(),
+            json!(status.container_markers),
+        ),
+        ("fallback_reason".to_string(), json!(status.fallback_reason)),
+    ]))
 }
 
 fn check_system_health(cwd: &Path, config: Option<&runtime::RuntimeConfig>) -> DiagnosticCheck {
+    let default_model = config.and_then(runtime::RuntimeConfig::model);
     let mut details = vec![
         format!("OS {} {}", env::consts::OS, env::consts::ARCH),
         format!("Working dir {}", cwd.display()),
         format!("Version {}", VERSION),
         format!("Build target {}", BUILD_TARGET.unwrap_or("<unknown>")),
         format!("Git SHA {}", GIT_SHA.unwrap_or("<unknown>")),
     ];
-    if let Some(model) = config.and_then(runtime::RuntimeConfig::model) {
+    if let Some(model) = default_model {
         details.push(format!("Default model {model}"));
     }
     DiagnosticCheck::new(
@@ -1243,6 +1423,15 @@ fn check_system_health(cwd: &Path, config: Option<&runtime::RuntimeConfig>) -> D
         "captured local runtime metadata",
     )
     .with_details(details)
+    .with_data(Map::from_iter([
+        ("os".to_string(), json!(env::consts::OS)),
+        ("arch".to_string(), json!(env::consts::ARCH)),
+        ("working_dir".to_string(), json!(cwd.display().to_string())),
+        ("version".to_string(), json!(VERSION)),
+        ("build_target".to_string(), json!(BUILD_TARGET)),
+        ("git_sha".to_string(), json!(GIT_SHA)),
+        ("default_model".to_string(), json!(default_model)),
+    ]))
 }
 
 fn resume_command_can_absorb_token(current_command: &str, token: &str) -> bool {
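
Note on `DiagnosticCheck::json_value`: it builds the four base keys and then calls `value.extend(self.data.clone())`, so the typed fields land at the same level as `name`, `status`, `summary`, and `details`. The std-only sketch below illustrates that merge order, with `BTreeMap<String, String>` standing in for `serde_json::Map<String, Value>` (an assumption made to keep the example dependency-free) and `merge_check_fields` as a hypothetical helper. As with `extend` on the std maps, a `data` key that repeats a base key replaces the base value.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the merge step in `DiagnosticCheck::json_value`:
/// base fields first, then the per-check `data` entries on top.
fn merge_check_fields(
    name: &str,
    status: &str,
    summary: &str,
    data: &BTreeMap<String, String>,
) -> BTreeMap<String, String> {
    let mut value = BTreeMap::from([
        ("name".to_string(), name.to_ascii_lowercase()),
        ("status".to_string(), status.to_string()),
        ("summary".to_string(), summary.to_string()),
    ]);
    // `extend` inserts every entry; an existing key gets its value replaced.
    value.extend(data.clone());
    value
}

fn main() {
    let data = BTreeMap::from([("os".to_string(), "linux".to_string())]);
    let merged = merge_check_fields("System", "ok", "captured local runtime metadata", &data);
    for (key, field) in &merged {
        println!("{key} = {field}");
    }
}
```

None of the `with_data` calls in this commit reuse a base key, so the overwrite case only matters when new typed fields are added later.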

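Note on `DoctorReport::counts`: it walks `self.checks` three times, once per level. That is fine for a handful of checks; for comparison, a single-pass variant could look like the sketch below. `Level` is a local stand-in for `DiagnosticLevel` (only the `Ok`, `Warn`, and `Fail` variants appear in this diff, so any others are not modelled), and `count_levels` is a hypothetical name.

```rust
// Local stand-in for `DiagnosticLevel`; only the variants visible in the diff.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Ok,
    Warn,
    Fail,
}

/// Tallies (ok, warn, fail) in one pass over the levels.
fn count_levels(levels: &[Level]) -> (usize, usize, usize) {
    levels
        .iter()
        .fold((0, 0, 0), |(ok, warn, fail), level| match level {
            Level::Ok => (ok + 1, warn, fail),
            Level::Warn => (ok, warn + 1, fail),
            Level::Fail => (ok, warn, fail + 1),
        })
}

fn main() {
    let levels = [Level::Ok, Level::Warn, Level::Ok, Level::Fail];
    let (ok, warn, fail) = count_levels(&levels);
    println!("ok={ok} warn={warn} fail={fail}");
}
```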