feat(copilotkit): integrate CopilotKit v2 with Bedrock AgentCore #1

Draft
blove wants to merge 3 commits into main from feat/copilotkit-bedrock-agentcore

blove commented Mar 13, 2026

Summary

Full integration of CopilotKit v2 with AWS Bedrock AgentCore, enabling a production-ready full-stack AI chat application in which frontend React components can participate in the agent's tool loop (generative UI). The PR covers the backend runtime, infrastructure (CDK + Terraform), the frontend, and two runtime bug fixes verified against a live deployment.


What changed

Backend: CopilotKit runtime proxy

  • New Lambda (infra-cdk/lambdas/copilotkit-runtime/) — TypeScript Lambda that acts as a CopilotKit Runtime, translating CopilotKit Cloud protocol to AG-UI invocations against the AgentCore runtime endpoint
  • API Gateway — new /copilotkit route added to the existing backend stack; URL exported as CopilotKitRuntimeUrl
  • CDK and Terraform modules updated in parallel (infra-cdk/lib/backend-stack.ts, infra-terraform/modules/backend/copilotkit_runtime.tf)

Backend: LangGraph AG-UI agent

  • ag_ui_langgraph/ — vendored AG-UI agent base class implementing the AG-UI streaming protocol on top of LangGraph; handles checkpoint loading, time-travel regeneration, state merging, and Bedrock-compatible message fixups
  • copilotkit/ — CopilotKit middleware package implementing AgentMiddleware for LangGraph: injects frontend tools into model calls, intercepts frontend tool calls to prevent backend tool execution, restores intercepted tool calls to checkpoint on agent exit, and manages app context injection via system messages
  • patterns/langgraph-single-agent/langgraph_agent.py — updated to use LangGraphAGUIAgent + CopilotKitMiddleware; exposes the agent via AG-UI protocol with Cognito JWT auth and M2M gateway token fetch
  • langgraph-checkpoint-aws upgraded 1.0.1 → 1.0.5

Frontend

  • @copilotkit/react-core and @copilotkit/react-ui added
  • CopilotChatInterface — wraps CopilotKit provider + CopilotChat; reads CopilotKitRuntimeUrl from runtime config
  • PieChart generative UI component (recharts) — registered as a frontend tool so the agent can render charts directly in the chat
  • useGenerativeUi hook — exports useMakePieChart action for the CopilotKit tool loop
  • Auth updated to surface accessToken (required by AgentCore JWT authorizer)
  • Runtime config extended with copilotKitRuntimeUrl

Bug fixes

Fix 1 — Infinite loop after frontend tool call

When the agent called a frontend tool (e.g. createPieChart) and the frontend returned the result, the agent re-called the same tool on every continuation instead of responding with text.

Root cause: patch_orphan_tool_calls (langgraph-checkpoint-aws v1.0.5) generates a new random ToolMessage ID on every checkpoint load. ID-based replacement of the "interrupted" placeholder always failed — the real result was appended alongside the placeholder, and Bedrock rejected the duplicate toolResult IDs (ValidationException).

Fix: _fix_messages_for_bedrock in CopilotKitMiddleware now deduplicates ToolMessages by tool_call_id in-place before each Converse API call, keeping the real result over any "interrupted before completion" placeholder. langgraph_default_merge_state simplified to pass incoming ToolMessages through without the unreliable ID-replacement logic.
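
The deduplication step described above can be sketched as follows. This is a minimal illustration, not the code in `_fix_messages_for_bedrock`: the `ToolMessage` dataclass is a hypothetical stand-in for the LangChain class, and detecting placeholders by the "was interrupted before completion" substring is an assumption based on the placeholder wording quoted elsewhere in this PR.

```python
from dataclasses import dataclass

# Assumption: placeholders are recognised by this substring of their content.
PLACEHOLDER_MARKER = "was interrupted before completion"


@dataclass
class ToolMessage:
    """Hypothetical stand-in for langchain_core.messages.ToolMessage."""
    id: str
    tool_call_id: str
    content: str


def is_placeholder(msg: ToolMessage) -> bool:
    return PLACEHOLDER_MARKER in msg.content


def dedupe_tool_messages(messages: list) -> None:
    """Keep one ToolMessage per tool_call_id, in place, preferring real results."""
    best = {}  # tool_call_id -> the message that should survive
    for msg in messages:
        if not isinstance(msg, ToolMessage):
            continue
        kept = best.get(msg.tool_call_id)
        # The first sighting wins unless it is a placeholder and this one is real.
        if kept is None or (is_placeholder(kept) and not is_placeholder(msg)):
            best[msg.tool_call_id] = msg

    result, emitted = [], set()
    for msg in messages:
        if isinstance(msg, ToolMessage):
            if msg.tool_call_id in emitted:
                continue  # a later duplicate for an already-answered call
            emitted.add(msg.tool_call_id)
            # Emit the winner at the position of the first occurrence.
            result.append(best[msg.tool_call_id])
        else:
            result.append(msg)
    # Slice assignment mutates the caller's list instead of rebinding it.
    messages[:] = result
```

Matching on `tool_call_id` rather than on the message `id` is the point of the fix: the tool call ID is stable across checkpoint loads, while the placeholder's own ID is not.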

Fix 2 — RUN_ERROR on multi-turn follow-up messages

New follow-up messages (brand-new message ID, not yet in any checkpoint) were incorrectly treated as time-travel regeneration requests and returned RUN_ERROR.

Fix: prepare_stream now checks is_continuation — whether all incoming non-ToolMessage IDs are already in the checkpoint — before triggering time-travel. Genuine follow-up turns fall through to normal continuation; time-travel only fires when the last user message ID is already in the checkpoint.


Architecture

Browser (React + CopilotKit)
  └─ CopilotKit Cloud
       └─ API Gateway /copilotkit  (CopilotKit Runtime Lambda)
            └─ AgentCore Runtime  (LangGraph agent + CopilotKitMiddleware)
                 ├─ Bedrock (claude-sonnet-4-5)
                 ├─ AgentCore Memory (langgraph-checkpoint-aws)
                 └─ MCP Gateway (backend tools)

Test plan

  • Deploy: cd infra-cdk && npx cdk deploy
  • Open Amplify URL; sign in via Cognito
  • Send "Create a pie chart" — verify createPieChart tool fires and chart renders in chat
  • Confirm agent responds with text after chart renders (no re-call loop)
  • Send a follow-up question in the same thread — verify correct contextual answer, no RUN_ERROR
  • Start a new thread, send two sequential messages — both return text responses
  • Verify the Terraform path: confirm that cd infra-terraform && terraform apply deploys equivalent infrastructure

🤖 Generated with Claude Code

blove and others added 3 commits March 13, 2026 15:53
Adds end-to-end CopilotKit v2 chat support — streaming generative UI
(pie chart), frontend action tools, and persistent memory — running on
Bedrock AgentCore with Claude Sonnet via the Converse API.

## Backend (patterns/langgraph-single-agent)

### langgraph_agent.py — complete rewrite for AgentCore + CopilotKit
- Replaces `create_react_agent` + raw streaming with `create_agent`
  (from `langchain.agents`) using `CopilotKitMiddleware`, connecting the
  LangGraph agent to CopilotKit's frontend action / generative-UI protocol
- Adopts `BedrockAgentCoreApp` (`@app.entrypoint`) instead of FastAPI,
  keeping AgentCore's native invocation model
- MCP tools fetched per-request from the AgentCore Gateway via
  `MultiServerMCPClient` with a fresh OAuth2 token each time (avoids
  token-expiry in long-running processes)
- Actor identity resolved from `forwardedProps` keys
  (`actor_id`/`user_id`) or the `sub` claim in the Cognito Bearer JWT,
  then threaded through `AgentCoreMemorySaver` so each user's history is
  isolated
- `ActorAwareLangGraphAgent(LangGraphAGUIAgent)` adds three overrides
  required by `AgentCoreMemorySaver`:
  1. `_filter_orphan_tool_messages` — restores `tool_calls` stripped by
     `clean_orphan_tool_calls` (frontend tools have no ToolMessage in the
     checkpoint); ensures MESSAGES_SNAPSHOT carries `toolCalls` so the
     rendered component (e.g. pie chart) is not removed when the snapshot
     overwrites client state
  2. `langgraph_default_merge_state` — prepends repaired AIMessages when
     Run 2 (CopilotKit follow-up) adds a ToolMessage; without this,
     `_fix_messages_for_bedrock` strips the `tool_use` content block
     (because `tool_calls=[]`) → orphan ToolMessage → Bedrock API error
  3. `get_checkpoint_before_message` — injects `actor_id` into the
     LangGraph config for time-travel / edit history lookups
- `serialize_agui_event` / terminal-event guard in `invocations` ensure
  the AG-UI stream always ends with `RUN_FINISHED` or `RUN_ERROR`
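
As an illustration of the terminal-event guard in the last bullet, a minimal synchronous sketch is shown below. It assumes events are plain dicts with a `type` key and that a stream ending without a terminal event should be closed with `RUN_FINISHED`; the real `invocations` handler is presumably asynchronous and works with AG-UI event objects, so the function name and event shape here are hypothetical.

```python
from typing import Iterable, Iterator

TERMINAL_TYPES = {"RUN_FINISHED", "RUN_ERROR"}


def with_terminal_guard(events: Iterable[dict]) -> Iterator[dict]:
    """Yield events, guaranteeing the stream ends with exactly one terminal event."""
    try:
        for event in events:
            yield event
            if event.get("type") in TERMINAL_TYPES:
                return  # nothing may follow a terminal event
    except Exception as exc:
        # The upstream stream failed mid-run: report it in-band.
        yield {"type": "RUN_ERROR", "message": str(exc)}
        return
    # Assumption: a stream that ends silently is treated as a clean finish.
    yield {"type": "RUN_FINISHED"}
```

Wrapping the agent's event stream this way means the client never waits on a run that ended without saying so.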

### requirements.txt
- Replaces `langgraph` + `langchain-aws` stubs with the full dependency
  set: `ag-ui-protocol`, `partialjson`, `langgraph==1.0.10rc1`,
  `langchain-aws==1.0.0`, `langchain-mcp-adapters`, `copilotkit` (local
  vendor), and `bedrock-agentcore`, with versions pinned for reproducibility

### Dockerfile
- Installs the local `copilotkit/` and `ag_ui_langgraph/` vendor packages
  via `pip install -e` so no PyPI dependency is needed for these patched
  libraries

## Vendored packages (copilotkit/ and ag_ui_langgraph/)

Local copies of CopilotKit SDK and ag-ui-langgraph with Bedrock
Converse API compatibility patches (sourced from PR mme:mme/local-copilotkit):

### copilotkit/copilotkit_lg_middleware.py — CopilotKitMiddleware
- `_fix_messages_for_bedrock`: strips unanswered `tool_calls` and syncs
  `tool_use` content blocks before each Bedrock model call, preventing
  the errors the Converse API raises when `toolUse` / `toolResult`
  blocks are not correctly paired
- `before_agent`: injects app context from `copilotkit.context` as a
  `SystemMessage` at the start of each agent turn
- `after_model` / `after_agent`: intercepts frontend tool calls so they
  are not forwarded to `ToolNode`, then restores them to the checkpoint
  after the agent exits (enables CopilotKit's frontend-action loop)
- `awrap_model_call`: merges frontend tool definitions into the model
  request so the LLM can call them
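
The interception in `after_model` can be pictured as partitioning the model's tool calls by name. The sketch below is illustrative only and is not the middleware's code: it assumes tool calls are dicts with `name`, `args`, and `id` keys (the shape LangChain uses for `AIMessage.tool_calls`), and `search_docs` is a hypothetical backend tool.

```python
def split_tool_calls(tool_calls: list, frontend_tool_names: set) -> tuple:
    """Partition tool calls into (frontend, backend) lists, preserving order."""
    frontend, backend = [], []
    for call in tool_calls:
        if call["name"] in frontend_tool_names:
            frontend.append(call)  # handed back to the browser to execute
        else:
            backend.append(call)   # left for the backend tool node
    return frontend, backend


calls = [
    {"name": "createPieChart", "args": {"title": "Sales"}, "id": "call_1"},
    {"name": "search_docs", "args": {"query": "q3"}, "id": "call_2"},
]
frontend_calls, backend_calls = split_tool_calls(calls, {"createPieChart"})
```

Only `backend_calls` would be executed server-side; `frontend_calls` would be held aside and restored to the checkpoint when the agent exits, as the bullet above describes.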

### ag_ui_langgraph/agent.py — LangGraphAGUIAgent
- `langgraph_default_merge_state`: fixes string `args` in checkpoint
  `tool_calls`, replaces fake ToolMessages injected by
  `patch_orphan_tool_calls` with the real AG-UI result, and deduplicates
  tool definitions across runs
- `_filter_orphan_tool_messages` / `_ORPHAN_TOOL_MSG_RE`: removes fake
  ToolMessages (pattern: "Tool call '…' with id '…' was interrupted
  before completion.") that AgentCore's saver injects when a tool call
  has no matching result in the checkpoint
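
A sketch of how such placeholder messages might be recognised and filtered is shown below. The regular expression is an assumption reconstructed from the pattern quoted above, not the actual `_ORPHAN_TOOL_MSG_RE`, and messages are modelled as plain dicts rather than LangChain message objects.

```python
import re

# Assumption: reconstructed from the quoted placeholder wording.
ORPHAN_TOOL_MSG_RE = re.compile(
    r"^Tool call '[^']*' with id '[^']*' was interrupted before completion\.$"
)


def filter_orphan_tool_messages(messages: list) -> list:
    """Return messages without tool messages whose content is a placeholder."""
    return [
        m for m in messages
        if not (
            m.get("role") == "tool"
            and isinstance(m.get("content"), str)
            and ORPHAN_TOOL_MSG_RE.match(m["content"])
        )
    ]
```

Anchoring the pattern at both ends keeps a genuine tool result that merely quotes the phrase from being dropped.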

## CopilotKit Lambda runtime (infra-cdk/lambdas/copilotkit-runtime/)

New TypeScript Lambda that sits between the frontend and the Python
AgentCore agent, acting as the CopilotKit server-side runtime:
- `CopilotRuntime` with `InMemoryAgentRunner` wraps the Python agent as
  an `HttpAgent` (AG-UI over HTTP)
- `CopilotKitRunner` (extends `InMemoryAgentRunner`) overrides `connect`:
  on reconnect it replays `TOOL_CALL_RESULT` events for every `toolCall`
  in the `MESSAGES_SNAPSHOT`, preventing CopilotKit's `processAgentResult`
  from re-triggering Run 2 when the user reloads the page
- Supports multiple named agents via env vars
  (`LANGGRAPH_AGENTCORE_AG_UI_URL`, `STRANDS_AGENTCORE_AG_UI_URL`) or a
  single `AGENTCORE_AG_UI_URL` fallback; agent selected by
  `COPILOTKIT_AGENT_NAME`
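
The agent selection in the last bullet can be sketched as below. The Lambda itself is TypeScript; Python is used here only for consistency with the other examples in this description. The environment variable names come from the bullet above, while the agent names (`langgraph`, `strands`, `default`) and the exact fallback behaviour are assumptions.

```python
# Assumption: the named agents are keyed as "langgraph" and "strands".
NAMED_AGENT_VARS = {
    "langgraph": "LANGGRAPH_AGENTCORE_AG_UI_URL",
    "strands": "STRANDS_AGENTCORE_AG_UI_URL",
}


def resolve_agent(env: dict) -> tuple:
    """Return (agent_name, url) for the agent the runtime should call."""
    agents = {
        name: env[var] for name, var in NAMED_AGENT_VARS.items() if env.get(var)
    }
    requested = env.get("COPILOTKIT_AGENT_NAME")
    if not agents:
        # No named agents configured: use the single-URL fallback.
        fallback = env.get("AGENTCORE_AG_UI_URL")
        if not fallback:
            raise RuntimeError("no AgentCore AG-UI URL configured")
        return requested or "default", fallback
    # Without an explicit name, take the first configured agent.
    name = requested or next(iter(agents))
    if name not in agents:
        raise KeyError(f"agent {name!r} has no URL configured")
    return name, agents[name]
```

In a deployed function this would be called with the process environment, for example `resolve_agent(dict(os.environ))`.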

## Frontend (frontend/)

### CopilotChatInterface.tsx (new)
- Loads runtime config (`copilotKitRuntimeUrl`) from `aws-exports.json`
- Wraps `<CopilotKitProvider>` + `<CopilotChat>` with Cognito Bearer
  token forwarded as `Authorization` header

### Generative UI
- `PieChart.tsx`: Recharts-based pie chart component with legend,
  dark-mode support, and a Zod schema (`PieChartPropsSchema`) so
  CopilotKit can validate tool arguments before rendering
- `useGenerativeUi.ts`: registers `pieChart` as a controlled generative-UI
  component via `useComponent` from `@copilotkit/react-core/v2`
- `useExampleSuggestions.ts`: registers suggested prompts in the chat UI

### ChatPage / ChatInterface routing
- `ChatPage.tsx` routes to the new `CopilotChatInterface` when
  `copilotKitRuntimeUrl` is present in config; falls back to the existing
  `ChatInterface` otherwise
- `main.tsx` / `auth.ts` updated for compatibility

### package.json
- Adds `@copilotkit/react-core`, `@copilotkit/react-ui`, `recharts`, and
  `zod`; pins `@copilotkit/runtime-client-gql` for the v2 API surface

## Infrastructure

### CDK (infra-cdk/)
- `backend-stack.ts`: adds a `CopilotKitRuntimeFunction` (Node.js 22
  Lambda, 512 MB, 5 min timeout) with `LANGGRAPH_AGENTCORE_AG_UI_URL` /
  `STRANDS_AGENTCORE_AG_UI_URL` env vars pointing to the AgentCore
  runtime URLs; exposes `/copilotkit/{proxy+}` on the existing API
  Gateway with Cognito authorizer
- `fast-main-stack.ts`: threads the new Lambda construct through the
  stack output
- `config.yaml`: adds `copilotKitRuntimeUrl` to the frontend runtime
  config exported to `aws-exports.json`

### Terraform (infra-terraform/)
- `copilotkit_runtime.tf`: equivalent Lambda + IAM + API Gateway
  resources for Terraform deployments
- `locals.tf`, `outputs.tf`, `ssm.tf`: expose `copilotkit_runtime_url`
  as an SSM parameter and stack output
- `deploy-frontend.{py,sh}` / `scripts/deploy-frontend.py`: write
  `copilotKitRuntimeUrl` into `aws-exports.json` during frontend deploy

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…oint-aws 1.0.5

Upgrade langgraph-checkpoint-aws from 1.0.1 to 1.0.5. Version 1.0.5 uses
patch_orphan_tool_calls (injecting placeholder ToolMessages) instead of
clean_orphan_tool_calls (stripping tool_calls). The vendored ag_ui_langgraph
base class already handles patch_orphan_tool_calls correctly via
_filter_orphan_tool_messages and langgraph_default_merge_state, and
CopilotKitMiddleware handles Bedrock API errors via _fix_messages_for_bedrock.
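
To make the difference between the two strategies concrete, here is a simplified model of each. It is not the `langgraph-checkpoint-aws` implementation: messages are plain dicts, the function names are hypothetical, and the fresh `uuid4` per placeholder is an assumption standing in for the saver's random ID generation.

```python
import copy
import uuid


def _answered_ids(messages: list) -> set:
    return {m["tool_call_id"] for m in messages if m["role"] == "tool"}


def clean_model(messages: list) -> list:
    """Model of the 'clean' strategy: drop tool calls that have no result."""
    answered = _answered_ids(messages)
    cleaned = copy.deepcopy(messages)
    for m in cleaned:
        if m["role"] == "ai":
            m["tool_calls"] = [
                c for c in m.get("tool_calls", []) if c["id"] in answered
            ]
    return cleaned


def patch_model(messages: list) -> list:
    """Model of the 'patch' strategy: answer orphans with a placeholder."""
    answered = _answered_ids(messages)
    patched = []
    for m in messages:
        patched.append(m)
        if m["role"] != "ai":
            continue
        for c in m.get("tool_calls", []):
            if c["id"] not in answered:
                patched.append({
                    "role": "tool",
                    "id": str(uuid.uuid4()),  # fresh ID on every load
                    "tool_call_id": c["id"],
                    "content": f"Tool call '{c['name']}' with id '{c['id']}' "
                               "was interrupted before completion.",
                })
    return patched
```

In this model the "clean" result loses the record that a frontend tool was ever called, while the "patch" result keeps the call and adds a placeholder whose own `id` differs on each load but whose `tool_call_id` stays stable.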

Remove from ActorAwareLangGraphAgent:
- _reconstruct_tool_calls helper
- _filter_orphan_tool_messages override
- langgraph_default_merge_state override
- get_checkpoint_before_message override

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sage bugs

Fix two bugs in the LangGraph AG-UI agent when used with CopilotKit frontend tools:

**Fix 1: Infinite loop after frontend tool call (pie chart / createPieChart)**

Root cause: `patch_orphan_tool_calls` (langgraph-checkpoint-aws v1.0.5) generates a
new random ToolMessage ID on every checkpoint load. The previous approach tried to
replace the placeholder by ID in `stream_input["messages"]`, but the ID from the first
`aget_state()` call never matches the placeholder ID from the internal `astream_events`
reload — so the real result was appended alongside the placeholder, causing Bedrock to
reject duplicate `toolResult` IDs (ValidationException).

Fix: added step 4 to `_fix_messages_for_bedrock` in `CopilotKitMiddleware` to
deduplicate `ToolMessage`s by `tool_call_id` in-place before each Converse API call,
keeping the real result over any "interrupted before completion" placeholder. Simplified
`langgraph_default_merge_state` to pass incoming `ToolMessage`s through without
attempting unreliable ID-based replacement.

Also upgrades `langgraph-checkpoint-aws` from 1.0.1 to 1.0.5 which switches from
`clean_orphan_tool_calls` (removes tool_calls, breaking continuation) to
`patch_orphan_tool_calls` (adds placeholder ToolMessages, enabling continuation).

**Fix 2: RUN_ERROR on multi-message follow-up turns**

Root cause: when the checkpoint had more messages than the incoming request, the
agent always triggered time-travel regeneration — including for legitimate follow-up
turns (new user message after a prior exchange).

Fix: checks whether all incoming non-ToolMessage IDs are already in the checkpoint
before deciding to time-travel. If they are, it's a continuation; only time-travel
when the last user message ID exists in the checkpoint but the incoming messages are
a proper subset (indicating a genuine re-generation request).
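
One possible reading of this rule is sketched below. It is an interpretation of the description above, not the code in `prepare_stream`: the `Msg` type, the role names, and the three return labels are hypothetical, and the real check may differ in detail.

```python
from dataclasses import dataclass


@dataclass
class Msg:
    """Hypothetical minimal message: an ID plus a role ('user', 'assistant', 'tool')."""
    id: str
    role: str


def classify_run(incoming: list, checkpoint: list) -> str:
    """Return 'follow_up', 'time_travel', or 'continuation' for a request."""
    checkpoint_ids = {m.id for m in checkpoint if m.role != "tool"}
    incoming_ids = {m.id for m in incoming if m.role != "tool"}
    user_ids = [m.id for m in incoming if m.role == "user"]
    last_user_id = user_ids[-1] if user_ids else None

    # A brand-new user message cannot be a regeneration request.
    if last_user_id is not None and last_user_id not in checkpoint_ids:
        return "follow_up"
    # Known last user message, but the checkpoint holds extra messages:
    # the client rewound the thread and wants that turn regenerated.
    if last_user_id is not None and incoming_ids < checkpoint_ids:
        return "time_travel"
    # Everything incoming is already checkpointed, e.g. a tool result return.
    return "continuation"
```

Under this reading, a request carrying only a frontend tool result on top of known messages is a continuation, a request ending in an unseen user message is a follow-up, and only a truncated history ending in a known user message triggers time-travel.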

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>