`Think` isolated agentic inference for `WorkflowEntrypoint` steps #1792

## Related issue

[#1752 — First-class detached sub-agent runs with a durable completion hook](https://github.com/cloudflare/agents/issues/1752)
overlaps on the "don't pollute main history" motivation, but that issue is about async A2A delegation
*within* an agent's own turn. This request is about calling a Think agent *from a Cloudflare Workflow*
step — with the full agentic loop, structured output, and zero conversation pollution — without having
to build a second nested `ThinkWorkflow` as a workaround.

---

## What we want

We want an `agent` workflow step that works **exactly like the `llm` step** — a clean, isolated call that
returns a result — except instead of a single raw `streamText` call it runs the Think agent's full
agentic capabilities: its configured model, its system prompt, its tools, its memory, and the full
inference loop.

| Need | `llm` step today | `agent` step (desired) |
|---|---|---|
| Agent's configured model | ❌ uses caller-supplied model | ✅ uses the agent's `getModel()` |
| System prompt / soul | ❌ caller writes the prompt | ✅ agent's own `getSystemPrompt()` |
| Tool use + agentic loop | ❌ single inference call | ✅ multi-turn, tools fire and retry |
| Stateful memory | ❌ | ✅ agent's SQLite memory blocks |
| Structured output (Zod schema) | ✅ | ✅ same `final_answer` enforcement |
| Isolated — no chat history write | ✅ never touches history | ✅ (desired — currently broken) |
| Multiple concurrent workflows | ✅ | ✅ (desired — currently broken) |

In short: **same ergonomics as `llm`, same power as a full Think agent**.

---

## Problem

There is currently no Think primitive that satisfies all of the above.
`step.prompt()` gets us schema enforcement and the agentic loop but **writes every run to the agent's
main `cf_agent_messages` history**. An agent used as a workflow step accumulates every automated
invocation in its chat, mixing operational calls with user-facing conversation. Two concurrent
workflows hitting the same agent Durable Object (DO) are serialized, and their turns interleave in
the same history.
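To make the pollution concrete, here is a minimal in-memory sketch of the behavior described above. It is not Think's implementation: `SharedHistoryAgent`, `HistoryMessage`, and `demoSharedHistory` are hypothetical names, the array merely stands in for `cf_agent_messages`, and turns are modeled as already serialized, as the DO would make them.

```typescript
// Hypothetical in-memory model; none of these names are part of @cloudflare/think.
type HistoryMessage = { runId: string; role: "user" | "assistant"; text: string };

class SharedHistoryAgent {
  // Stands in for the agent's single cf_agent_messages log.
  readonly history: HistoryMessage[] = [];

  // One turn: append the prompt, build the context from the whole log, append the reply.
  // Returns the context the model would have seen for this turn.
  prompt(runId: string, text: string): HistoryMessage[] {
    this.history.push({ runId, role: "user", text });
    const context = [...this.history];
    this.history.push({ runId, role: "assistant", text: `reply to: ${text}` });
    return context;
  }
}

// Two workflow runs reach the same agent and are handled as back-to-back turns.
function demoSharedHistory(): { contextB: HistoryMessage[]; total: number } {
  const agent = new SharedHistoryAgent();
  agent.prompt("workflow-A", "classify ticket 1");
  const contextB = agent.prompt("workflow-B", "classify ticket 2");
  return { contextB, total: agent.history.length };
}
```

In this model the second run's context already contains both messages from the first run, and the shared log keeps growing with every automated call.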

---

## Current workaround — and why it is painful

Because no isolated primitive exists, the only way to get the full agentic loop with structured output
from a `WorkflowEntrypoint` step is to build a second durable workflow *inside* the Think agent itself:

1. The outer `WorkflowEntrypoint` calls a `@callable` on the Think agent DO, then immediately calls
   `step.waitForEvent` to block.
2. That `@callable` spawns a `ThinkWorkflow` (a nested Cloudflare Workflow bound to the agent DO).
3. The inner `ThinkWorkflow` runs `step.prompt()` to get the full agentic loop and schema enforcement.
4. When the inner workflow finishes, it calls `sendWorkflowEvent` back to the outer
   `WorkflowEntrypoint` with the result.
5. The outer workflow wakes up, reads the event payload, and continues.

```
WorkflowEntrypoint (outer)
  │
  ├─ step.do("agent:start")
  │    └─ agent.startPrompt(input, schema, parentRef)   // @callable — fire and return
  │         └─ agent.runWorkflow("AGENT_TASK_WORKFLOW")  // spawns a second durable workflow
  │
  └─ step.waitForEvent("agent:wait", { timeout: "1 hour" })
       │
       └─ ThinkWorkflow (inner, nested inside agent DO)
            └─ step.prompt("respond", { output: schema })
                 └─ full agent turn: tools, loop, final_answer
                      └─ sendWorkflowEvent(parentRef)   // delivers result back to outer workflow
```
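The handshake in the diagram can be modeled in a few lines, which also shows how much plumbing sits around a single agent call. This is a sketch under stated assumptions, not SDK code: `EventChannel`, `innerWorkflow`, `outerWorkflow`, and `TaskResult` are hypothetical stand-ins for `waitForEvent` / `sendWorkflowEvent` and the two workflows, and the agent turn itself is faked.

```typescript
// Hypothetical in-memory stand-in for the waitForEvent / sendWorkflowEvent pair.
class EventChannel<T> {
  private readonly waiters = new Map<string, (payload: T) => void>();
  private readonly early = new Map<string, T>();

  // Resolves once an event for `ref` arrives (or immediately if it arrived first).
  waitForEvent(ref: string): Promise<T> {
    if (this.early.has(ref)) {
      const payload = this.early.get(ref) as T;
      this.early.delete(ref);
      return Promise.resolve(payload);
    }
    return new Promise<T>((resolve) => this.waiters.set(ref, resolve));
  }

  // Delivers a payload to the waiter for `ref`, or buffers it if nobody waits yet.
  sendEvent(ref: string, payload: T): void {
    const waiter = this.waiters.get(ref);
    if (waiter) {
      this.waiters.delete(ref);
      waiter(payload);
    } else {
      this.early.set(ref, payload);
    }
  }
}

type TaskResult = { answer: string };

// Steps 2 to 4: the inner run performs the turn, then reports back to its parent.
async function innerWorkflow(
  channel: EventChannel<TaskResult>,
  parentRef: string,
  input: string
): Promise<void> {
  await Promise.resolve(); // the full agent turn would run here
  channel.sendEvent(parentRef, { answer: `handled: ${input}` });
}

// Steps 1 and 5: the outer run fires the inner run, then blocks on the event.
async function outerWorkflow(input: string): Promise<TaskResult> {
  const channel = new EventChannel<TaskResult>();
  const parentRef = "outer-run-1"; // illustrative reference value
  void innerWorkflow(channel, parentRef, input); // fire and return
  return channel.waitForEvent(parentRef);
}
```

Everything except the one line marked as the agent turn is bridge code that exists only to carry a result from the inner run back to the outer one.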

### What this workaround costs

1. **Two nested durable workflows** for every single agent step.
   The outer `WorkflowEntrypoint`'s `step.do` already provides checkpointing and replay — the inner
   `ThinkWorkflow` adds a second layer of Cloudflare Workflow overhead that solves a problem the
   outer layer already solves.

2. **History still polluted.** `step.prompt()` inside the inner `ThinkWorkflow` calls
   `submitMessages()`, which appends the turn to `cf_agent_messages`. Every workflow run leaves a
   trace in the agent's chat history. Multiple concurrent workflow runs hitting the same agent
   serialize at the DO and cross-contaminate each other's context.

3. **Boilerplate every consumer has to rediscover.** The bridge from `WorkflowEntrypoint` → Think
   agent → result requires wiring a `@callable`, a nested `ThinkWorkflow`, and a
   `sendWorkflowEvent` / `waitForEvent` handshake. There is no SDK primitive that does this — every
   team that wants this pattern has to build and maintain it themselves.

---

## Proposed API

```ts
// On the Think class — callable over RPC from any WorkflowEntrypoint
async runTask<Schema extends ZodObject>(
  input: string,
  options: {
    output: Schema        // enforced via synthetic final_answer tool (same as step.prompt)
    signal?: AbortSignal
  }
): Promise<z.infer<Schema>>
```

Internally it would:

1. Run inference in a **child facet** (same isolation `startAgentToolRun` already provides —
   separate from `cf_agent_messages`, no history writes)
2. Enforce the output schema via the synthetic `final_answer` tool already used by `step.prompt()`
3. Return the validated result directly to the caller as the resolved value of the RPC call, with no event handshake
4. Require **no nested `ThinkWorkflow`** — durability is owned by the caller's `step.do`
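The four points above could look roughly like the loop below. It is a simplified, synchronous sketch and not Think's internals: `SketchAgent`, `SketchModel`, `SchemaCheck`, and the `maxSteps` cap are all hypothetical, a plain validator function stands in for the Zod schema, and a per-call array stands in for the child facet.

```typescript
type ToolCall = { tool: string; args: unknown };
type ScratchMessage = { role: "user" | "tool"; content: string };
// Hypothetical model interface: given the scratch transcript, pick the next tool call.
type SketchModel = (transcript: readonly ScratchMessage[]) => ToolCall;
// Stand-in for a Zod schema: returns the parsed value or throws.
type SchemaCheck<T> = (value: unknown) => T;

class SketchAgent {
  // Stands in for cf_agent_messages; runTask never writes to it.
  readonly mainHistory: ScratchMessage[] = [];
  private readonly model: SketchModel;
  private readonly tools: Record<string, (args: unknown) => string>;

  constructor(model: SketchModel, tools: Record<string, (args: unknown) => string>) {
    this.model = model;
    this.tools = tools;
  }

  runTask<T>(input: string, validate: SchemaCheck<T>, maxSteps = 8): T {
    // Per-call transcript, discarded when the call returns.
    const scratch: ScratchMessage[] = [{ role: "user", content: input }];
    for (let i = 0; i < maxSteps; i++) {
      const call = this.model(scratch);
      if (call.tool === "final_answer") {
        try {
          return validate(call.args); // the schema is enforced here
        } catch (err) {
          // Feed the validation error back so the model can correct itself.
          scratch.push({ role: "tool", content: `final_answer rejected: ${String(err)}` });
          continue;
        }
      }
      const tool = this.tools[call.tool];
      const output = tool ? tool(call.args) : `unknown tool: ${call.tool}`;
      scratch.push({ role: "tool", content: output });
    }
    throw new Error(`no valid final_answer within ${maxSteps} steps`);
  }
}
```

The caller gets either a validated value or an exception, which is the shape a surrounding `step.do` needs in order to decide whether to retry.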

### What the agent step becomes

```ts
// With runTask() — single step.do, isolated, no nested workflow
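const result = await step.do(
  label,
  // Workflows retry config needs `delay` as well as `limit`; "10 seconds" is an illustrative value
  { retries: { limit: 2, delay: "10 seconds" }, timeout: "15 minutes" },
  async () => {
    const stub = await getAgentByName(env.UserAgent, agentId) // resolve the agent DO stub
    return stub.runTask(input, { output: schema })            // proposed API, runs in a child facet
  })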
```

Additional benefits:

- **Any number of concurrent workflows** can target the **same agent** without serializing or
  cross-contaminating history — each `runTask` call runs in its own child facet.
- **`retries` is safe.** When `step.do` retries a failed attempt it re-calls `runTask`, which starts a
  fresh child-facet run rather than appending a duplicate turn to main history (the current
  `stub.prompt()` retry hazard).
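The retry point can be illustrated with a small model. Everything here is hypothetical: `withRetries` only imitates the re-run-on-failure behavior of `step.do`, `RetryDemoAgent` is not a Think class, and the sketch assumes the history-writing path appends the user turn before inference fails.

```typescript
// Hypothetical stand-in for step.do retries: re-run the callback until it succeeds.
function withRetries<T>(maxAttempts: number, attempt: (n: number) => T): T {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let n = 0; n < maxAttempts; n++) {
    try {
      return attempt(n);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

class RetryDemoAgent {
  // Stands in for cf_agent_messages.
  readonly mainHistory: string[] = [];

  // History-writing path (assumed): the user turn is appended before inference runs.
  promptIntoHistory(input: string, fail: boolean): string {
    this.mainHistory.push(`user: ${input}`);
    if (fail) throw new Error("simulated timeout");
    this.mainHistory.push(`assistant: done ${input}`);
    return `done ${input}`;
  }

  // Proposed path: all turn state lives in a per-call buffer that is dropped on exit.
  runTaskIsolated(input: string, fail: boolean): string {
    const scratch = [`user: ${input}`];
    if (fail) throw new Error("simulated timeout");
    scratch.push(`assistant: done ${input}`);
    return `done ${input}`;
  }
}

// First attempt fails, second succeeds; compare what each path leaves behind.
function demoRetry(): { sharedLog: string[]; isolatedLog: string[] } {
  const shared = new RetryDemoAgent();
  withRetries(3, (n) => shared.promptIntoHistory("task", n === 0));
  const isolated = new RetryDemoAgent();
  withRetries(3, (n) => isolated.runTaskIsolated("task", n === 0));
  return { sharedLog: shared.mainHistory, isolatedLog: isolated.mainHistory };
}
```

In this model the history-writing path ends up with the user turn recorded twice after one failed attempt, while the isolated path leaves the main log empty.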

---

## Environment

- `@cloudflare/think` (latest)
- Cloudflare Workers + Durable Objects
- Cloudflare Workflows (`WorkflowEntrypoint`)

