diff --git a/astro.config.ts b/astro.config.ts index 6b5caa2..8d77558 100644 --- a/astro.config.ts +++ b/astro.config.ts @@ -63,6 +63,7 @@ export default defineConfig({ items: [ { slug: "expanding-horizons/threads-context-and-caching" }, { slug: "expanding-horizons/model-pricing" }, + { slug: "expanding-horizons/high-level-harnesses" }, { slug: "expanding-horizons/what-to-read-next" }, ], }, diff --git a/src/content/docs/expanding-horizons/high-level-harnesses.mdx b/src/content/docs/expanding-horizons/high-level-harnesses.mdx new file mode 100644 index 0000000..061be77 --- /dev/null +++ b/src/content/docs/expanding-horizons/high-level-harnesses.mdx @@ -0,0 +1,102 @@ +--- +title: High-level harnesses +description: Beyond individual agent sessions — scheduled automations, parallel agent fleets, and the emerging pattern of AI-driven code pipelines. +--- + +import ExternalLink from "../../../components/ExternalLink.astro"; + +The [harness engineering](/becoming-productive/harness-engineering/) chapter covered shaping a single agent's actions through AGENTS.md, skills, hooks, and subagents. +This page is one level of abstraction up — it covers tools and patterns that treat agents as a manageable workforce. + +:::caution +Products and feature sets can change significantly between revisions of this guide. +Treat this page as an orientation that helps you build intuition for the field, not as a definitive reference. +::: + +## From engineering to managing + +So far in this guide, you have been an **engineer** — you worked interactively with a single agent, steering it turn by turn in real time. +Now, you will become a **manager**, delegating work to a fleet of agents running in parallel. +Instead of supervising each agent individually, you will manage the output queue — a review inbox, an issue tracker, a PR pipeline. +Your coding assistant now serves as an orchestrator rather than a conductor. 
+ +:::note[Remember] +The key shift is from "what should the agent do?" to "what work should be running right now, and how do I review what came back?" +::: + +## Running agents in parallel + +The key difference is running several agents simultaneously, each on an isolated task. + +Conductor by Melty Labs is a tool built for exactly this. +It runs multiple AI coding agents at once (both Claude Code and Codex are supported), with each agent working in its own Git worktree. +A dashboard shows what each agent is doing and lets you review changes as soon as they come in. + +You hand different issues to separate agents at once, come back to review the results, and merge the ones you like. +That is qualitatively different from the sequential, one-task-at-a-time conductor workflow from the previous chapters. + +## Scheduled and recurring agents + +Agents do not always need to wait for you to trigger them; you can also schedule them in advance. + +OpenAI's Codex App includes an Automations feature: +describe a recurring task, set a schedule, and have Codex run it in the background. +Results end up in a review inbox or are auto-archived if nothing needs attention. + +OpenAI uses automations internally for tasks like: +- Daily issue triage +- Surfacing and summarizing CI failures +- Generating release briefs +- Checking for regressions between versions + +With automations, the process becomes closer to a CI pipeline than a chat window: the agent stops being a tool you reach for and becomes a background process. + +## Issue-tracker-driven orchestration + +You may also set up agents to respond to issues as they appear. + +Symphony is an open-source orchestration service published by OpenAI. +It monitors a Linear board, creates an isolated workspace per issue, and runs a Codex agent on each one. +Engineers decide which issues belong in scope; Symphony handles assignment and execution. 
+ +Agent behavior is defined in a `WORKFLOW.md` file in the repository alongside the code. +The prompt and runtime settings for each agent run are versioned the same way you version a CI pipeline. +When an agent finishes, it gathers evidence: CI results, PR review feedback, complexity analysis. +You can review the output instead of the agent's process. + +:::tip +Symphony is recommended for codebases that have adopted [harness engineering](/becoming-productive/harness-engineering/). +::: + +## The Code Factory pattern + +Beyond specific products, there is an emerging pattern popularized by Ryan Carson under the name **Code Factory**. +The idea is a repository setup where agents autonomously write code and open pull requests, and a separate review agent validates those PRs with machine-verifiable evidence. +If validation passes, the PR merges without human intervention. + +The continuous loop looks like this: + +1. Agent writes code and opens a PR. +2. Risk-aware CI gates check the change. +3. A review agent inspects the PR and collects evidence: screenshots, test results, static analysis. +4. If all checks pass, the PR lands automatically. +5. If anything fails, the agent retries or flags the issue for human review. + +:::caution +A Code Factory is only as good as its quality gates. +An automated pipeline that merges bad PRs is strictly worse than one that does nothing. +Invest in solid tests, linters, and CI before automating the merge step. +::: + +- + +## The one-human company + +The Code Factory pattern is the technical foundation of a broader idea: that a single person with a well-configured agent fleet can operate at a scale that would previously have required a full engineering team. + +Projects like OpenClaw package infrastructure for connecting AI agents to communication platforms and scheduling systems, turning a single machine into an always-on agent runtime that responds to messages, executes tasks, and ships work continuously. 
+ +Steve Yegge, in a widely read interview with The Pragmatic Engineer, argues that the engineering profession is reorganizing along a spectrum of AI adoption. +His framing: most engineers are at the low end of AI adoption today, and those who stay there risk being outcompeted by engineers who learn to orchestrate agent fleets, acting as owners of work queues rather than writers of individual functions. + +- \ No newline at end of file diff --git a/src/data/links.csv b/src/data/links.csv index ca4f650..1b75335 100644 --- a/src/data/links.csv +++ b/src/data/links.csv @@ -25,6 +25,7 @@ https://code.claude.com/docs/en/security,Security - Claude Code Docs,Anthropic,, https://code.claude.com/docs/en/sub-agents,Create custom subagents - Claude Code Docs,Anthropic,,2026-03-13 https://code.claude.com/docs/en/sub-agents#code-reviewer,Create custom subagents - Claude Code Docs,,,2026-03-05 https://coderabbit.ai/,CodeRabbit,,,2026-03-05 +https://conductor.build/,Conductor - Run a team of coding agents on your Mac,,,2026-03-25 https://context7.com/,Context7 - Up-to-date documentation for LLMs and AI code editors,,,2026-03-13 https://cursor.com/blog,Cursor Blog,,,2026-03-04 https://cursor.com/bugbot,Cursor Bugbot,,,2026-03-05 @@ -38,6 +39,8 @@ https://cursor.com/for/code-review,Reviewing Code with Cursor | Cursor Docs,,,20 https://cursor.com/pricing,Cursor Subscription,,,2026-03-04 https://developers.openai.com/api/docs/guides/compaction,Compaction,OpenAI,,2026-03-04 https://developers.openai.com/codex/agent-approvals-security,Codex: Agent approvals & security,OpenAI,,2026-03-16 +https://developers.openai.com/codex/app,App – Codex | OpenAI Developers,,,2026-03-25 +https://developers.openai.com/codex/app/automations,Automations – Codex app | OpenAI Developers,,,2026-03-25 https://developers.openai.com/codex/app/worktrees/#working-between-local-and-worktree,Worktrees,,,2026-03-10 https://developers.openai.com/codex/cli/features#run-local-code-review,Codex CLI features (run local 
code review),,,2026-03-05 https://developers.openai.com/codex/integrations/github/,Use Codex in GitHub,,,2026-03-05 @@ -56,6 +59,7 @@ https://github.com/mcp,GitHub MCP Registry,,,2026-03-13 https://github.com/microsoft/playwright-mcp,microsoft/playwright-mcp,Microsoft,,2026-03-13 https://github.com/mkaput,Marek Kaput,,,2026-03-04 https://github.com/openai/skills,openai/skills,OpenAI,,2026-03-12 +https://github.com/openai/symphony,"GitHub - openai/symphony: Symphony turns project work into isolated, autonomous implementation runs, allowing teams to manage work instead of supervising coding agents. · GitHub",,,2026-03-25 https://github.com/software-mansion-labs/skills,software-mansion-labs/skills,Software Mansion,,2026-03-12 https://github.com/steipete/mcporter/,"steipete/mcporter: Call MCPs via TypeScript, masquerading as simple TypeScript API. Or package them as cli.",Peter Steinberger,,2026-03-04 https://github.com/topics/agent-skills,GitHub Topic: agent-skills,,,2026-03-12 @@ -73,6 +77,8 @@ https://lucumr.pocoo.org/,Thoughts and Writings,Armin Ronacher,,2026-03-04 https://mcp.grep.app/,mcp.grep.app,Vercel,,2026-03-04 https://mitchellh.com/,Blog,Mitchell Hashimoto,,2026-03-04 https://models.dev/,Models.dev - An open-source database of AI models,Opencode,,2026-03-04 +https://myclaw.ai/,OpenClaw & Clawdbot Cloud Hosting — Managed Hosting | MyClaw.ai,,,2026-03-25 +https://newsletter.pragmaticengineer.com/p/from-ides-to-ai-agents-with-steve,From IDEs to AI Agents with Steve Yegge - by Gergely Orosz,,,2026-03-25 https://openai.com/chatgpt/pricing/,ChatGPT Subscription,,,2026-03-04 https://openai.com/index/harness-engineering/,Harness engineering: leveraging Codex in an agent-first world,OpenAI,2026-02-11,2026-03-04 https://openai.com/news/engineering/,OpenAI Engineering News,,,2026-03-04 @@ -110,6 +116,7 @@ https://x.com/GeminiApp,Google Gemini (@GeminiApp) on X,,,2026-03-04 https://x.com/karpathy,Andrej Karpathy (@karpathy) on X,,,2026-03-04 
https://x.com/opencode,OpenCode (@opencode) on X,,,2026-03-04 https://x.com/RLanceMartin,Lance Martin (@RLanceMartin) on X,,,2026-03-04 +https://x.com/ryancarson,Ryan Carson (@ryancarson) on X,,,2026-03-25 https://x.com/thorstenball,Thorsten Ball (@thorstenball) on X,,,2026-03-04 https://x.com/thsottiaux,Tibo (@thsottiaux) on X,,,2026-03-04 https://x.com/trq212,Thariq Shihipar (@trq212) on X,,,2026-03-04