adds run api to the docs #29345
Open
ethulia wants to merge 11 commits into production from ml/ai-gateway-run
Commits
a1c6630 adds run api to the docs (ethulia)
d13b57d [AI Gateway] Add changelog entry for Run API beta launch (ethulia)
58c0d08 copy (ethulia)
2fb6cea [AI Gateway] Refine Run API docs: sidebar badge, link model docs, rem… (ethulia)
6ce312d [AI Gateway] Remove changelog entry from PR (stashed separately) (ethulia)
f7e6e2a [AI Gateway] Wrap Workers AI binding example in TypeScriptExample com… (ethulia)
8003697 [AI Gateway] Remove background and usedBYOKKey params, add back chang… (ethulia)
0a26bf7 [AI Gateway] Rename Run API from beta to preview, remove coming soon … (ethulia)
dd97650 [AI Gateway] Fix guardrails link path in Run API docs (ethulia)
6e60439 [AI Gateway] Revise changelog: update title, date, and reframe Worker… (ethulia)
c9b7356 copy (ethulia)
21 changes: 21 additions & 0 deletions
src/content/changelog/ai-gateway/2026-03-27-run-api-preview.mdx
@@ -0,0 +1,21 @@
---
title: AI Gateway Run API (preview) and third-party model support in Workers AI binding
description: AI Gateway introduces the Run API in preview and adds support for running third-party models through the Workers AI binding.
products:
  - ai-gateway
date: 2026-03-27
---

AI Gateway introduces the Run API (`/run`), a new endpoint with its own request envelope separate from the OpenAI-compatible `/chat/completions` format. The Run API is in preview. Authenticate with [Unified Billing](/ai-gateway/features/unified-billing/) or [BYOK (Gateway Key Store)](/ai-gateway/configuration/bring-your-own-keys/).

The Workers AI binding (`env.AI.run()`) now supports calling third-party models available through the Run API. You can run these models directly from a Cloudflare Worker without managing provider credentials in your code:

```ts
const response = await env.AI.run(
  "google/nano-banana",
  { prompt: "a cat riding a burrito" },
  { gateway: { id: "my-gateway" } },
);
```

For more information, refer to the [Run API](/ai-gateway/usage/run-api/) documentation.

@@ -0,0 +1,195 @@
---
title: Run API
pcx_content_type: reference
tags:
  - AI
sidebar:
  order: 3
  badge: Preview
---

import { Tabs, TabItem, TypeScriptExample, WranglerConfig } from "~/components";

The Run API is a new endpoint for AI Gateway that provides a simplified request/response pattern for running AI models. Unlike the [Unified API](/ai-gateway/usage/chat-completion/), which follows the OpenAI-compatible `/chat/completions` format, the Run API uses its own request envelope designed for a broader range of model types.

:::note
The Run API is currently in preview with limited model and feature support. Refer to [Current limitations](#current-limitations) for details on what is available today.
:::

## Endpoint URL

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/run
```

Replace `{account_id}` with your Cloudflare account ID and `{gateway_id}` with your gateway ID (or `default` to use the [default gateway](/ai-gateway/configuration/manage-gateway/#default-gateway)).

## Authentication

The Run API supports the following authentication methods for upstream provider access:

- **Unified Billing**: Use AI Gateway billing to pay for inference requests. Refer to [Unified Billing](/ai-gateway/features/unified-billing/).
- **BYOK (Gateway Key Store)**: Store your provider API keys with Cloudflare. Refer to [BYOK](/ai-gateway/configuration/bring-your-own-keys/).

:::caution
Passing provider API keys directly in request headers is not supported with the Run API. You must use Unified Billing or BYOK (AI Gateway secrets store) to authenticate with upstream providers.
:::

## Request

```txt
POST /v1/{account_id}/{gateway_id}/run
```

### Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `model` | `string` | Yes | The model to run. Refer to [Supported models](#supported-models) for available values. |
| `input` | `object` | Yes | Model-specific input parameters. The accepted fields depend on the model. Refer to the model's documentation for details. |
| `provider` | `string` | No | Pin the request to a specific provider instead of using the default. |

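
### Request sketch in TypeScript

As a rough illustration of the request envelope described above, the following sketch assembles the URL, headers, and JSON body for a Run API call. The names `RunRequest`, `RunFetchArgs`, `runApiUrl`, and `buildRunRequest` are hypothetical helpers, not part of any Cloudflare SDK, and the account ID, gateway ID, and token values are placeholders. The endpoint pattern, the `cf-aig-authorization` header, and the field names are taken from this page.

```ts
// Hypothetical local type mirroring the request body table above.
// It is not an official Cloudflare type.
interface RunRequest {
  model: string; // for example "google/nano-banana"
  input: Record<string, unknown>; // model-specific parameters
  provider?: string; // optional provider pin
}

// Shape of the arguments that would later be passed to fetch(url, init).
interface RunFetchArgs {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds the documented endpoint URL. The gateway ID falls back to "default".
function runApiUrl(accountId: string, gatewayId: string = "default"): string {
  const account = encodeURIComponent(accountId);
  const gateway = encodeURIComponent(gatewayId);
  return `https://gateway.ai.cloudflare.com/v1/${account}/${gateway}/run`;
}

// Assembles the URL, headers, and JSON body without sending anything.
function buildRunRequest(
  accountId: string,
  gatewayId: string,
  apiToken: string,
  body: RunRequest,
): RunFetchArgs {
  return {
    url: runApiUrl(accountId, gatewayId),
    init: {
      method: "POST",
      headers: {
        "cf-aig-authorization": `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Placeholder values for illustration only.
const example = buildRunRequest("my-account-id", "my-gateway", "my-token", {
  model: "google/nano-banana",
  input: { prompt: "a cat riding a burrito" },
});
// To send it: const res = await fetch(example.url, example.init);
```

Keeping request construction separate from the network call makes the envelope easy to inspect and unit test before any request is sent.
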
## Supported models

The Run API currently supports a single model:

| Model | Type | Provider |
| --- | --- | --- |
| [`google/nano-banana`](/workers-ai/models/google/nano-banana/) | Image generation | Google Vertex AI |

More models will be added in future updates.

## Examples

### Basic request

<Tabs>
<TabItem label="curl">

```bash
curl -X POST "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/run" \
  --header "cf-aig-authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "google/nano-banana",
    "input": {
      "prompt": "a cat riding a burrito"
    }
  }'
```

</TabItem>
<TabItem label="Workers AI binding">

<WranglerConfig>

```toml
name = "run-api-example"
main = "src/index.ts"
compatibility_date = "$today"

[ai]
binding = "AI"
```

</WranglerConfig>

<TypeScriptExample>

```ts
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    const response = await env.AI.run(
      "google/nano-banana",
      {
        prompt: "a cat riding a burrito",
      },
      {
        gateway: {
          id: "{gateway_id}",
        },
      },
    );

    return new Response(JSON.stringify(response), {
      headers: { "Content-Type": "application/json" },
    });
  },
} satisfies ExportedHandler<Env>;
```

</TypeScriptExample>

</TabItem>
</Tabs>

### With input options

```bash
curl -X POST "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/run" \
  --header "cf-aig-authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "google/nano-banana",
    "input": {
      "prompt": "a cat riding a burrito",
      "aspect_ratio": "16:9",
      "output_format": "png",
      "image_size": "2K"
    }
  }'
```

## Response

### Success

A successful request returns a response with the following envelope:

```json
{
  "state": "Completed",
  "result": {
    "image": "https://...r2.cloudflarestorage.com/..."
  },
  "provider": "google-vertex-ai",
  "model": "google/nano-banana"
}
```

| Field | Description |
| --- | --- |
| `state` | The request status. `Completed` on success, `Failed` on error. |
| `result` | The model output. For image models, this contains an `image` field with a URL to the generated image. |
| `provider` | The provider that served the request. |
| `model` | The model that was used. |

### Error

```json
{
  "state": "Failed",
  "error": {
    "code": "provider_unavailable",
    "message": "All providers for this model are currently unavailable."
  },
  "model": "google/nano-banana"
}
```

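
### Response handling sketch in TypeScript

Because the success and error envelopes above share a `state` field, a caller can branch on it before reading `result` or `error`. The sketch below shows one way to do that. The types `RunSuccess`, `RunFailure`, and `RunResponse` and the helper `imageUrlFromRunResponse` are hypothetical local definitions that model only the fields shown in the two examples above; the real envelope may carry additional fields.

```ts
// Hypothetical local types modelling only the fields shown in the examples above.
interface RunSuccess {
  state: "Completed";
  result: Record<string, unknown>;
  provider: string;
  model: string;
}

interface RunFailure {
  state: "Failed";
  error: { code: string; message: string };
  model: string;
}

type RunResponse = RunSuccess | RunFailure;

// Returns the generated image URL, or throws with the error code and message.
function imageUrlFromRunResponse(response: RunResponse): string {
  if (response.state === "Failed") {
    throw new Error(`${response.error.code}: ${response.error.message}`);
  }
  // For image models the page documents an `image` field holding a URL.
  const image = response.result.image;
  if (typeof image !== "string") {
    throw new Error("Run API response did not include an image URL");
  }
  return image;
}
```

Using `state` as the discriminant lets the TypeScript compiler narrow the union, so `error` is only reachable on the failure branch and `result` only on the success branch.
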
## Current limitations

The Run API is in preview. The following limitations apply:

- **One model**: Only `google/nano-banana` is supported. Additional models will be added over time.
- **No caching**: [Caching](/ai-gateway/features/caching/) is not supported for Run API requests.
- **No rate limiting**: [Rate limiting](/ai-gateway/features/rate-limiting/) rules do not apply to Run API requests.
- **No guardrails or DLP**: [Guardrails](/ai-gateway/features/guardrails/) and data loss prevention features are not available for the Run API.
- **No dynamic routing**: [Dynamic routing](/ai-gateway/features/dynamic-routing/) (fallbacks, A/B testing, conditional logic) is not available.
- **No streaming**: Streaming responses are not supported.
- **No TypeScript types**: TypeScript type definitions for the Run API are not yet available in `@cloudflare/workers-types`.

does this link really work? i'm not seeing it in this PR
ah it's in Charlie's PR -- will make sure that gets merged first