Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,14 @@ All notable changes to cc-settings are documented here.

## [Unreleased]

### Delegation guidance — high-agency heuristic (issue #44)

- **`CLAUDE-FULL.md`**: replaced the "repeated nag" MUST/SHOULD list with a per-decision heuristic table (3+ files / 10+ calls / security-sensitive → delegate, route by shape; NO → act directly). Four closing rules replace the old enforcement list; the briefing-contract blockquote is preserved verbatim.
- **`src/hooks/tool-cadence.ts` (parallelmax branch)**: one nudge per streak (fires at 12+ calls OR 3+ distinct files edited, whichever comes first), followed by one escalation (soft block via `continueOnBlock`) if the streak continues past the reminder. Net: at most 2 signals per streak, down from one every 12 calls indefinitely. State tracks `files`, `nudged`, `countAtNudge`, `filesAtNudge`, `escalated`; old-shape state files handled with defensive defaults.
- **`config/40-hooks.json`**: `continueOnBlock: true` on the tool-cadence PostToolUse hook so the escalation block surfaces as a hard-to-ignore signal without aborting the turn.
- **`src/hooks/delegation-detector.ts`**: message compressed — single-line format with score, matched signals, and routing guide; overriding requires a stated reason.


### Cost tuning — "explore/execute cheap, decide on Fable"

Fable stays the session default and the tier for judgment agents, but the high-volume read/execute agents move off it, since they were the bulk of the burn (each subagent re-reads the repo, and on a Fable session the inheriting agents all ran Fable).
Expand Down
50 changes: 22 additions & 28 deletions CLAUDE-FULL.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,42 +17,36 @@ The Edit tool uses exact string matching. Follow these rules:

## Delegation

> **Opus 4.8 note**: Claude Opus 4.8 (and 4.7 before it) spawns fewer subagents by default than 4.6 and prefers internal reasoning over tool/agent use. The rules below are **not suggestions** — they are explicit triggers to counter that bias. When a trigger fires, delegate. Do not reason your way out of delegating "because you could do it yourself."
> **Opus 4.8 note**: Opus 4.8 (and 4.7) under-delegates — it prefers internal reasoning over spawning agents. The heuristic below is calibrated to counter that bias. Do not reason your way out of it "because you could do it yourself."

### You MUST delegate (non-negotiable) when:
### The per-decision heuristic

- **Multi-file exploration spanning 3+ files** → `Agent(explore, "...")`
- **Any task that would require 10+ sequential tool calls** → break into agent tasks
- **Security-sensitive code** (auth, payments, crypto, input validation) → `Agent(security-reviewer, "...")`
- **Writing new test files** → `Agent(tester, "...")`
- **Dead code cleanup or codebase deslop** → `Agent(deslopper, "...")`
- **Parallel independent workstreams** (3+ with no file conflicts) → spawn agents in a single message
Before each unit of work, ask once: **3+ files, 10+ tool calls, or security-sensitive code?**

### You SHOULD prefer delegation for:
**YES → delegate first, then route by shape:**

- **Genuinely complex implementation — 3+ files, or 10+ sequential tool calls** → `Agent(implementer, "...")`
- **Architecture decisions or upfront planning** → `Agent(planner, "...")`
- **Scaffolding new components/hooks/pages** → `Agent(scaffolder, "...")`
- **Code review on changes touching 3+ files** → `Agent(reviewer, "...")`
- **Expert second opinions / blast-radius / "why is this here" questions** → `Agent(explore, "...")`
- **Full-feature orchestration across 3+ agents** → `Agent(maestro, "...")`
- **Tasks structurally prone to single-window failure** — agentic laziness (quitting at 20 of 50 items), self-preferential bias (judging your own output), goal drift across compaction → a [dynamic workflow](https://code.claude.com/docs/en/workflows) / `/effort ultracode` (see `skills/orchestrate/SKILL.md`)
| Shape | Agent |
|---|---|
| understand / find / map / blast-radius | `explore` |
| build / change / fix across files | `implementer` |
| plan / architecture | `planner` |
| new test files | `tester` (MUST) |
| auth, payments, crypto, input validation | `security-reviewer` (MUST) |
| dead code / deslop | `deslopper` (MUST) |
| 3+ independent workstreams | parallel `Agent` calls in ONE message (MUST) |
| full feature spanning 3+ agents | `maestro` |
| prone to single-window failure — agentic laziness (quitting at 20 of 50 items), self-preferential bias (judging your own output), goal drift across compaction | a [dynamic workflow](https://code.claude.com/docs/en/workflows) / `/effort ultracode` (see `skills/orchestrate/SKILL.md`) |

> **Briefing contract for `implementer`**: as a subagent it gets only your prompt — no conversation context, none of the files you've read — so every prompt MUST contain actual content, not references: the user's ask verbatim, exact file paths and line ranges, the change to make (paste the planner output; never write "based on findings" or "according to plan"), the verification command, and a scope boundary. Thin prompts cause regressions; the agent will refuse them. It runs in the live working tree and leaves changes **uncommitted** for you to review before they land. Full contract: `agents/implementer.md` REQUIRED BRIEFING. This applies equally to `explore` → `implementer` and `planner` → `implementer` chains.

### Act directly ONLY when:
**NO → act directly.** 1–2 file edits, known-path reads, single greps/globs, build/test runs, conversational answers. Keeping small diffs in the main session is correct — don't spawn an `implementer` to prove you delegated.

- Reading a specific file you already know the path to
- Small or medium edits spanning 1–2 files, or any change you can finish in under ~10 tool calls
- Running build or test commands
- Simple searches (one grep for a string, one glob for a file pattern)
- Answering a conversational question with no code change
**Rules that close the loop:**

### Enforcement Rules
1. **Re-ask when scope grows.** Predicted small but it's now 3+ files or 10+ calls? Stop and delegate the remainder — sunk tool calls are not a reason to finish solo.
2. **Overriding a YES requires a stated reason.** One line, in your response, before proceeding (e.g. "12 calls but all sequential edits to one file"). The `tool-cadence` hook escalates on streaks that continue past a reminder with no Agent call.
3. **Delegating needs no narration** — just call the Agent tool.
4. **Parallelize**: independent delegations go in a single message — they run concurrently.

1. **Parallelize**: when multiple delegations have no dependencies, send all `Agent` calls in a single message — they run concurrently.
2. **Don't narrate the decision**: if a trigger fires, call the Agent tool directly. Don't explain why you're delegating — just delegate.
3. **Match the tool to the size**: the MUST/SHOULD triggers fire only for genuinely heavy work (3+ files, 10+ tool calls, security-sensitive code, new test files). Small and medium edits staying in the main session is the correct default — it keeps the diff reviewable in the working tree before commit. Don't spawn an `implementer` just to prove you delegated.
> **Briefing contract for `implementer`**: as a subagent it gets only your prompt — no conversation context, none of the files you've read — so every prompt MUST contain actual content, not references: the user's ask verbatim, exact file paths and line ranges, the change to make (paste the planner output; never write "based on findings" or "according to plan"), the verification command, and a scope boundary. Thin prompts cause regressions; the agent will refuse them. It runs in the live working tree and leaves changes **uncommitted** for you to review before they land. Full contract: `agents/implementer.md` REQUIRED BRIEFING. This applies equally to `explore` → `implementer` and `planner` → `implementer` chains.

For full orchestration mode, activate `profiles/maestro.md`. Model routing per agent: see `docs/agent-models.md`.

Expand Down
3 changes: 2 additions & 1 deletion config/40-hooks.json
Original file line number Diff line number Diff line change
Expand Up @@ -108,7 +108,8 @@
{
"type": "command",
"command": "bun \"$HOME/.claude/src/hooks/tool-cadence.ts\"",
"timeout": 3
"timeout": 3,
"continueOnBlock": true
}
]
},
Expand Down
4 changes: 2 additions & 2 deletions docs/hooks-reference.md
Original file line number Diff line number Diff line change
Expand Up @@ -315,7 +315,7 @@ Matchers filter which specific tool invocations or events trigger a hook.
| Script | Purpose | Async |
|--------|---------|-------|
| `session-title.ts` | Derives session title from first prompt; emits `hookSpecificOutput.sessionTitle` so `claude --resume <name>` works | Yes |
| `delegation-detector.ts` | Regex-scores incoming prompt for breadth signals (phrases like "do all", "across the repo", path-shaped tokens, large numbered lists). At score ≥ 2, injects a system reminder via `additionalContext` pointing at maestro / multi-agent delegation | No |
| `delegation-detector.ts` | Regex-scores incoming prompt for breadth signals (phrases like "do all", "across the repo", path-shaped tokens, large numbered lists). At score ≥ 2, injects a compact `additionalContext` reminder: score, matched signals, and a one-line routing guide (maestro / implementer / parallel agents in ONE message). Overriding requires a stated reason. | No |

> Note: Since v2.1.108 Claude Code has a native `Skill` tool that auto-matches skills; the old `skill-activation` hook was removed. Correction detection was removed as low-signal.

Expand Down Expand Up @@ -356,7 +356,7 @@ Logs are used by `bun run claude-audit` to analyze command patterns, security co

| Script | Purpose | Async |
|--------|---------|-------|
| `tool-cadence.ts` (parallelmax branch) | Counts consecutive non-Agent tool calls in main context. At N=8, emits a system reminder pointing at the delegation rules. State at `~/.claude/tmp/parallelmax-counter.json`; resets when an Agent tool fires. 60s debounce | No |
| `tool-cadence.ts` (parallelmax branch) | Counts consecutive non-Agent tool calls and distinct file edits per streak. **One nudge per streak**: fires at threshold 12 calls OR 3+ files edited — emits a compact `additionalContext` reminder with the delegation heuristic. **One escalation per streak**: if the streak continues past the nudge by another threshold-worth of calls or 2+ more files, emits a soft block (`continueOnBlock: true`) via `blockDecision` — the turn continues but the signal is hard to ignore. Both signals suppress when the review queue is at capacity. Resets on any Agent call. 60s debounce. State at `~/.claude/tmp/parallelmax-counter.json`. `CC_PARALLELMAX_THRESHOLD` env override (default 12). | No |

### PostToolUseFailure

Expand Down
10 changes: 4 additions & 6 deletions src/hooks/delegation-detector.ts
Original file line number Diff line number Diff line change
Expand Up @@ -56,13 +56,11 @@ async function main(): Promise<void> {

if (score < 2) return;

const reasonList = reasons.map((r) => ` • ${r}`).join("\n");
const msg =
`Breadth signals detected in this prompt (score ${score}):\n${reasonList}\n\n` +
`This prompt likely spans multiple files or requires parallel workstreams. ` +
`Per CLAUDE.md delegation rules: use Agent(maestro) for full-feature orchestration, ` +
`Agent(implementer) for multi-file implementation, or spawn multiple parallel agents ` +
`in a SINGLE message. Do not self-execute tasks that trigger the 3+ file or 10+ tool-call thresholds.`;
`Breadth signals in this prompt (score ${score}): ${reasons.join("; ")}. ` +
`Likely 3+ files / parallel workstreams — per CLAUDE.md: route to Agent(maestro) for orchestration, ` +
`Agent(implementer) for multi-file changes, or parallel agents in ONE message. ` +
`Overriding requires a one-line stated reason.`;

console.log(
JSON.stringify({
Expand Down
136 changes: 106 additions & 30 deletions src/hooks/tool-cadence.ts
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,9 @@
// unmatched on EVERY PostToolUse (two Bun spawns per tool call → one):
//
// • parallelmax branch (was parallelmax-nudge.ts) — counts consecutive
// non-Agent tool calls; at 8, nudges the model toward delegation.
// non-Agent tool calls and tracked files; fires one soft nudge per streak
// at THRESHOLD calls or FILES_THRESHOLD distinct file edits, then one
// escalation (soft block via continueOnBlock) if the streak continues.
// • review-queue branch (was review-queue-nudge.ts) — backpressure, the
// consumer-side counterpart: counts Agent spawns since the last commit,
// nudges at CC_MAX_UNREVIEWED, drains on successful commit/push,
Expand All @@ -15,7 +17,13 @@
// runHook — any error → silent success, never break a tool call.

import { runGit } from "../lib/git.ts";
import { readHookInput, readState, runHook, writeState } from "../lib/hook-runtime.ts";
import {
blockDecision,
readHookInput,
readState,
runHook,
writeState,
} from "../lib/hook-runtime.ts";
import {
type BashResult,
buildNudge,
Expand All @@ -41,19 +49,39 @@ import {
// routine multi-step edits don't trip it, low enough to catch genuine
// "should have fanned out" runs.
const THRESHOLD = Number(process.env.CC_PARALLELMAX_THRESHOLD) || 12;
// Distinct file edits before the nudge fires (regardless of call count).
const FILES_THRESHOLD = 3;
const DEBOUNCE_MS = 60_000;
const COUNTER_STATE = "parallelmax-counter.json";
const QUEUE_STATE = "review-queue.json";

// File-edit tools and the field carrying the path in tool_input.
const FILE_EDIT_TOOLS: Record<string, string> = {
Write: "file_path",
Edit: "file_path",
MultiEdit: "file_path",
NotebookEdit: "notebook_path",
};

interface CounterState {
count: number;
lastTool: string;
firedAt?: number;
files?: string[];
nudged?: boolean;
countAtNudge?: number;
filesAtNudge?: number;
escalated?: boolean;
}

type Payload = {
tool_name: string;
tool_input: { command?: string; subagent_type?: string };
tool_input: {
command?: string;
subagent_type?: string;
file_path?: string;
notebook_path?: string;
};
tool_response: BashResult;
cwd?: string;
};
Expand All @@ -74,41 +102,90 @@ async function currentHead(cwd: string): Promise<string | undefined> {

// --- Branch 1: consecutive non-Agent call counter --------------------------

async function parallelmaxBranch(toolName: string): Promise<void> {
const state = await readState<CounterState>(COUNTER_STATE, { count: 0, lastTool: "" });
async function parallelmaxBranch(
toolName: string,
toolInput: Payload["tool_input"],
): Promise<void> {
// Defensive defaults for old-shape state files.
const raw = await readState<CounterState>(COUNTER_STATE, { count: 0, lastTool: "" });
const state = {
count: raw.count ?? 0,
lastTool: raw.lastTool ?? "",
firedAt: raw.firedAt as number | undefined,
files: raw.files ?? ([] as string[]),
nudged: raw.nudged ?? false,
countAtNudge: raw.countAtNudge ?? 0,
filesAtNudge: raw.filesAtNudge ?? 0,
escalated: raw.escalated ?? false,
};

if (toolName === "Agent") {
// Reset the whole streak on delegation; preserve firedAt for debounce.
await writeState(COUNTER_STATE, {
count: 0,
lastTool: toolName,
firedAt: state.firedAt,
files: [],
nudged: false,
countAtNudge: 0,
filesAtNudge: 0,
escalated: false,
});
return;
}

state.count += 1;
state.lastTool = toolName;

if (state.count >= THRESHOLD) {
const now = Date.now();
// Debounce: skip if we fired recently to avoid spamming.
if (!state.firedAt || now - state.firedAt >= DEBOUNCE_MS) {
// Suppress the "delegate more" nudge when the review queue is already at/
// over capacity: pushing more production when review is the bottleneck is
// the orchestration-tax failure mode (the review-queue branch covers the
// other direction). Still debounce + reset so we don't re-check on every
// call.
const rq = await readState<{ awaiting: number }>(QUEUE_STATE, { awaiting: 0 });
if (rq.awaiting < maxUnreviewed()) {
const msg =
`You have made ${state.count} consecutive tool calls without delegating to an Agent. ` +
`Opus 4.8 defaults to self-execution, but CLAUDE.md requires delegation when tasks span ` +
`3+ files or 10+ tool calls. Consider Agent(implementer), Agent(explore), or Agent(maestro) ` +
`to parallelize work, reduce context pressure, and follow the project guardrails.`;
emit(msg);
}
// Track file edits (deduplicated, capped at 20).
const filePathField = FILE_EDIT_TOOLS[toolName];
if (filePathField) {
const fp = toolInput[filePathField as keyof typeof toolInput];
if (typeof fp === "string" && fp && !state.files.includes(fp)) {
state.files = [...state.files, fp].slice(0, 20);
}
}

const now = Date.now();
const debounceOk = !state.firedAt || now - state.firedAt >= DEBOUNCE_MS;

// --- Escalation (at most once per streak) ---------------------------------
// Fires AFTER nudge, when the streak continues past the reminder.
if (
state.nudged &&
!state.escalated &&
(state.count - state.countAtNudge >= THRESHOLD ||
state.files.length - state.filesAtNudge >= 2) &&
debounceOk
) {
const rq = await readState<{ awaiting: number }>(QUEUE_STATE, { awaiting: 0 });
if (rq.awaiting < maxUnreviewed()) {
state.escalated = true;
state.firedAt = now;
// Write state BEFORE calling blockDecision (never returns).
await writeState(COUNTER_STATE, state);
blockDecision(
`Delegation violation: ${state.count} tool calls and ${state.files.length} file(s) edited in this streak — past a prior reminder, still no Agent call. This matches a CLAUDE.md MUST-delegate trigger. Delegate the remainder (Agent(implementer) for changes, Agent(explore) for reads) or state a one-line justification before continuing.`,
);
}
}

// --- Soft nudge (at most once per streak) ---------------------------------
if (
!state.nudged &&
(state.count >= THRESHOLD || state.files.length >= FILES_THRESHOLD) &&
debounceOk
) {
const rq = await readState<{ awaiting: number }>(QUEUE_STATE, { awaiting: 0 });
if (rq.awaiting < maxUnreviewed()) {
const filesClause = state.files.length > 0 ? ` and ${state.files.length} file(s) edited` : "";
emit(
`Delegation check — ${state.count} tool calls${filesClause} in this streak with no Agent call. Heuristic: 3+ files or 10+ calls → delegate (explore = read/map, implementer = multi-file change, parallel agents for independent work). Delegate the remainder now, or state a one-line reason for staying solo and continue.`,
);
state.nudged = true;
state.countAtNudge = state.count;
state.filesAtNudge = state.files.length;
state.firedAt = now;
state.count = 0;
}
}

Expand Down Expand Up @@ -180,17 +257,16 @@ async function main(): Promise<void> {
const toolName = payload.tool_name ?? "";

// Branch isolation: a crash in one branch must not silence the other —
// before the merge these were separate processes. Order matches the old
// config order (parallelmax first), so on the rare event where both nudge
// (e.g. an 8th consecutive call that is also a draining commit) the two
// JSON lines print in the same order as before.
// before the merge these were separate processes. reviewQueueBranch runs
// first because parallelmaxBranch may end the process via blockDecision()
// on escalation; the review-queue state must already be written by then.
try {
await parallelmaxBranch(toolName);
await reviewQueueBranch(payload, toolName);
} catch {
// fail open
}
try {
await reviewQueueBranch(payload, toolName);
await parallelmaxBranch(toolName, payload.tool_input ?? {});
} catch {
// fail open
}
Expand Down
Loading
Loading