diff --git a/src/content/changelog/ai-gateway/2026-03-27-run-api-preview.mdx b/src/content/changelog/ai-gateway/2026-03-27-run-api-preview.mdx
new file mode 100644
index 00000000000000..f616784b2cc431
--- /dev/null
+++ b/src/content/changelog/ai-gateway/2026-03-27-run-api-preview.mdx
@@ -0,0 +1,21 @@
+---
+title: AI Gateway Run API (preview) and third-party model support in Workers AI binding
+description: AI Gateway introduces the Run API in preview and adds support for running third-party models through the Workers AI binding.
+products:
+  - ai-gateway
+date: 2026-03-27
+---
+
+AI Gateway introduces the Run API (`/run`), a new endpoint with its own request envelope separate from the OpenAI-compatible `/chat/completions` format. The Run API is in preview. Authenticate with [Unified Billing](/ai-gateway/features/unified-billing/) or [BYOK (Gateway Key Store)](/ai-gateway/configuration/bring-your-own-keys/).
+
+The Workers AI binding (`env.AI.run()`) now supports calling third-party models available through the Run API. You can run these models directly from a Cloudflare Worker without managing provider credentials in your code:
+
+```ts
+const response = await env.AI.run(
+  "google/nano-banana",
+  { prompt: "a cat riding a burrito" },
+  { gateway: { id: "my-gateway" } },
+);
+```
+
+For more information, refer to the [Run API](/ai-gateway/usage/run-api/) documentation.
diff --git a/src/content/docs/ai-gateway/usage/run-api.mdx b/src/content/docs/ai-gateway/usage/run-api.mdx
new file mode 100644
index 00000000000000..2186caec9bc41a
--- /dev/null
+++ b/src/content/docs/ai-gateway/usage/run-api.mdx
@@ -0,0 +1,195 @@
+---
+title: Run API
+pcx_content_type: reference
+tags:
+  - AI
+sidebar:
+  order: 3
+  badge: Preview
+---
+
+import { Tabs, TabItem, TypeScriptExample, WranglerConfig } from "~/components";
+
+The Run API is a new endpoint for AI Gateway that provides a simplified request/response pattern for running AI models.
Unlike the [Unified API](/ai-gateway/usage/chat-completion/), which follows the OpenAI-compatible `/chat/completions` format, the Run API uses its own request envelope designed for a broader range of model types.
+
+:::note
+The Run API is currently in preview with limited model and feature support. Refer to [Current limitations](#current-limitations) for details on what is available today.
+:::
+
+## Endpoint URL
+
+```txt
+https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/run
+```
+
+Replace `{account_id}` with your Cloudflare account ID and `{gateway_id}` with your gateway ID (or `default` to use the [default gateway](/ai-gateway/configuration/manage-gateway/#default-gateway)).
+
+## Authentication
+
+The Run API supports the following authentication methods for upstream provider access:
+
+- **Unified Billing**: Use AI Gateway billing to pay for inference requests. Refer to [Unified Billing](/ai-gateway/features/unified-billing/).
+- **BYOK (Gateway Key Store)**: Store your provider API keys with Cloudflare. Refer to [BYOK](/ai-gateway/configuration/bring-your-own-keys/).
+
+:::caution
+Passing provider API keys directly in request headers is not supported with the Run API. You must use Unified Billing or BYOK (AI Gateway secrets store) to authenticate with upstream providers.
+:::
+
+## Request
+
+```txt
+POST /v1/{account_id}/{gateway_id}/run
+```
+
+### Request body
+
+| Field | Type | Required | Description |
+| --- | --- | --- | --- |
+| `model` | `string` | Yes | The model to run. Refer to [Supported models](#supported-models) for available values. |
+| `input` | `object` | Yes | Model-specific input parameters. The accepted fields depend on the model. Refer to the model's documentation for details. |
+| `provider` | `string` | No | Pin the request to a specific provider instead of using the default.
|
+
+
+## Supported models
+
+The Run API currently supports a single model:
+
+| Model | Type | Provider |
+| --- | --- | --- |
+| [`google/nano-banana`](/workers-ai/models/google/nano-banana/) | Image generation | Google Vertex AI |
+
+More models will be added in future updates.
+
+## Examples
+
+### Basic request
+
+
+
+
+```bash
+curl -X POST "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/run" \
+  --header "cf-aig-authorization: Bearer $CLOUDFLARE_API_TOKEN" \
+  --header "Content-Type: application/json" \
+  --data '{
+    "model": "google/nano-banana",
+    "input": {
+      "prompt": "a cat riding a burrito"
+    }
+  }'
+```
+
+
+
+
+
+
+```toml
+name = "run-api-example"
+main = "src/index.ts"
+compatibility_date = "$today"
+
+[ai]
+binding = "AI"
+```
+
+
+
+
+
+```ts
+export interface Env {
+  AI: Ai;
+}
+
+export default {
+  async fetch(request, env): Promise<Response> {
+    const response = await env.AI.run(
+      "google/nano-banana",
+      {
+        prompt: "a cat riding a burrito",
+      },
+      {
+        gateway: {
+          id: "{gateway_id}",
+        },
+      },
+    );
+
+    return new Response(JSON.stringify(response), {
+      headers: { "Content-Type": "application/json" },
+    });
+  },
+} satisfies ExportedHandler<Env>;
+```
+
+
+
+
+
+
+### With input options
+
+```bash
+curl -X POST "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/run" \
+  --header "cf-aig-authorization: Bearer $CLOUDFLARE_API_TOKEN" \
+  --header "Content-Type: application/json" \
+  --data '{
+    "model": "google/nano-banana",
+    "input": {
+      "prompt": "a cat riding a burrito",
+      "aspect_ratio": "16:9",
+      "output_format": "png",
+      "image_size": "2K"
+    }
+  }'
+```
+
+## Response
+
+### Success
+
+A successful request returns a response with the following envelope:
+
+```json
+{
+  "state": "Completed",
+  "result": {
+    "image": "https://...r2.cloudflarestorage.com/..."
+  },
+  "provider": "google-vertex-ai",
+  "model": "google/nano-banana"
+}
+```
+
+| Field | Description |
+| --- | --- |
+| `state` | The request status.
`Completed` on success, `Failed` on error. |
+| `result` | The model output. For image models, this contains an `image` field with a URL to the generated image. |
+| `provider` | The provider that served the request. |
+| `model` | The model that was used. |
+
+### Error
+
+```json
+{
+  "state": "Failed",
+  "error": {
+    "code": "provider_unavailable",
+    "message": "All providers for this model are currently unavailable."
+  },
+  "model": "google/nano-banana"
+}
+```
+
+## Current limitations
+
+The Run API is in preview. The following limitations apply:
+
+- **One model**: Only `google/nano-banana` is supported. Additional models will be added over time.
+- **No caching**: [Caching](/ai-gateway/features/caching/) is not supported for Run API requests.
+- **No rate limiting**: [Rate limiting](/ai-gateway/features/rate-limiting/) rules do not apply to Run API requests.
+- **No guardrails or DLP**: [Guardrails](/ai-gateway/features/guardrails/) and data loss prevention features are not available for the Run API.
+- **No dynamic routing**: [Dynamic routing](/ai-gateway/features/dynamic-routing/) (fallbacks, A/B testing, conditional logic) is not available.
+- **No streaming**: Streaming responses are not supported.
+- **No TypeScript types**: TypeScript type definitions for the Run API are not yet available in `@cloudflare/workers-types`.
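
Because the page notes that no TypeScript types are published yet, callers of the REST endpoint may want local types for the request and response envelopes. The following is a minimal sketch, not an official definition: the interfaces are inferred only from the example payloads in this diff, and `buildRunRequest` and `unwrapRunResponse` are hypothetical helper names. It also assumes that `state` alone distinguishes success from failure, since the diff does not document HTTP status codes.

```typescript
// Envelope shapes inferred from the example payloads in this diff.
// These are assumptions, not official types.
interface RunRequestBody {
  model: string;
  input: Record<string, unknown>;
  provider?: string; // optional provider pin, per the request body table
}

interface RunSuccess {
  state: "Completed";
  result: Record<string, unknown>;
  provider: string;
  model: string;
}

interface RunFailure {
  state: "Failed";
  error: { code: string; message: string };
  model: string;
}

type RunResponse = RunSuccess | RunFailure;

// Hypothetical helper: builds the URL and fetch-style options for POST /run.
function buildRunRequest(
  accountId: string,
  gatewayId: string,
  apiToken: string,
  body: RunRequestBody,
) {
  return {
    url: `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/run`,
    init: {
      method: "POST",
      headers: {
        "cf-aig-authorization": `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      } as Record<string, string>,
      body: JSON.stringify(body),
    },
  };
}

// Hypothetical helper: returns `result` on success and throws on `Failed`.
function unwrapRunResponse(response: RunResponse): Record<string, unknown> {
  if (response.state === "Failed") {
    throw new Error(`${response.error.code}: ${response.error.message}`);
  }
  return response.result;
}
```

In a runtime that provides `fetch`, the pieces would combine as `const { url, init } = buildRunRequest(...)`, then `unwrapRunResponse(await (await fetch(url, init)).json())`. Keeping the helpers free of network calls makes them easy to unit test.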