21 changes: 21 additions & 0 deletions src/content/changelog/ai-gateway/2026-03-27-run-api-preview.mdx
@@ -0,0 +1,21 @@
---
title: AI Gateway Run API (preview) and third-party model support in Workers AI binding
description: AI Gateway introduces the Run API in preview and adds support for running third-party models through the Workers AI binding.
products:
  - ai-gateway
date: 2026-03-27
---

AI Gateway introduces the Run API (`/run`), a new endpoint with its own request envelope separate from the OpenAI-compatible `/chat/completions` format. The Run API is in preview. Authenticate with [Unified Billing](/ai-gateway/features/unified-billing/) or [BYOK (Gateway Key Store)](/ai-gateway/configuration/bring-your-own-keys/).

The Workers AI binding (`env.AI.run()`) now supports calling third-party models available through the Run API. You can run these models directly from a Cloudflare Worker without managing provider credentials in your code:

```ts
const response = await env.AI.run(
  "google/nano-banana",
  { prompt: "a cat riding a burrito" },
  { gateway: { id: "my-gateway" } },
);
```

For more information, refer to the [Run API](/ai-gateway/usage/run-api/) documentation.
195 changes: 195 additions & 0 deletions src/content/docs/ai-gateway/usage/run-api.mdx
@@ -0,0 +1,195 @@
---
title: Run API
pcx_content_type: reference
tags:
  - AI
sidebar:
  order: 3
  badge: Preview
---

import { Tabs, TabItem, TypeScriptExample, WranglerConfig } from "~/components";

The Run API is a new endpoint for AI Gateway that provides a simplified request/response pattern for running AI models. Unlike the [Unified API](/ai-gateway/usage/chat-completion/), which follows the OpenAI-compatible `/chat/completions` format, the Run API uses its own request envelope designed for a broader range of model types.

:::note
The Run API is currently in preview with limited model and feature support. Refer to [Current limitations](#current-limitations) for details on what is available today.
:::

## Endpoint URL

```txt
https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/run
```

Replace `{account_id}` with your Cloudflare account ID and `{gateway_id}` with your gateway ID (or `default` to use the [default gateway](/ai-gateway/configuration/manage-gateway/#default-gateway)).
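
As a minimal sketch of the substitution described above, the following helper assembles the endpoint URL and falls back to `default` when no gateway ID is given. The name `buildRunUrl` is hypothetical and not part of any Cloudflare SDK; only the URL pattern comes from this page.

```typescript
// Base URL taken from the endpoint pattern documented above.
const RUN_API_BASE = "https://gateway.ai.cloudflare.com/v1";

// Hypothetical helper (an assumption for illustration, not an official API).
// Omitting gatewayId targets the account's default gateway.
function buildRunUrl(accountId: string, gatewayId: string = "default"): string {
  if (accountId.length === 0) {
    throw new Error("accountId is required");
  }
  // Encode each segment so unexpected characters cannot change the path.
  const account = encodeURIComponent(accountId);
  const gateway = encodeURIComponent(gatewayId);
  return `${RUN_API_BASE}/${account}/${gateway}/run`;
}
```

Calling `buildRunUrl(accountId)` without a second argument yields a URL that ends in `/default/run`.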

## Authentication

The Run API supports the following authentication methods for upstream provider access:

- **Unified Billing** — Use AI Gateway billing to pay for inference requests. Refer to [Unified Billing](/ai-gateway/features/unified-billing/).
- **BYOK (Gateway Key Store)** — Store your provider API keys with Cloudflare. Refer to [BYOK](/ai-gateway/configuration/bring-your-own-keys/).

:::caution
Passing provider API keys directly in request headers is not supported with the Run API. You must use Unified Billing or BYOK (Gateway Key Store) to authenticate with upstream providers.
:::

## Request

```txt
POST /v1/{account_id}/{gateway_id}/run
```

### Request body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| `model` | `string` | Yes | The model to run. Refer to [Supported models](#supported-models) for available values. |
| `input` | `object` | Yes | Model-specific input parameters. The accepted fields depend on the model. Refer to the model's documentation for details. |
| `provider` | `string` | No | Pin the request to a specific provider instead of using the default. |
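
Because type definitions for the Run API are not yet published, the table above can be transcribed into a local type. This is a sketch only: `RunRequestBody` and `buildRunRequestBody` are hypothetical names, and `input` is left loosely typed because its fields depend on the model.

```typescript
// Local transcription of the request body table; not an official type.
interface RunRequestBody {
  model: string;
  input: Record<string, unknown>;
  provider?: string;
}

// Hypothetical helper: builds a body and omits `provider` unless it is set,
// so the gateway falls back to its default provider.
function buildRunRequestBody(
  model: string,
  input: Record<string, unknown>,
  provider?: string,
): RunRequestBody {
  if (model.trim() === "") {
    throw new Error("model is required");
  }
  const body: RunRequestBody = { model, input };
  if (provider !== undefined) {
    body.provider = provider;
  }
  return body;
}
```

Serializing the result with `JSON.stringify` gives the JSON payload to send in the `POST` request.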


## Supported models

The Run API currently supports a single model:

| Model | Type | Provider |
| --- | --- | --- |
| [`google/nano-banana`](/workers-ai/models/google/nano-banana/) | Image generation | Google Vertex AI |

Review comment (Contributor): does this link really work? I'm not seeing it in this PR.

Reply (Contributor Author): ah, it's in Charlie's PR. Will make sure that gets merged first.

More models will be added in future updates.

## Examples

### Basic request

<Tabs>
<TabItem label="curl">

```bash
curl -X POST "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/run" \
  --header "cf-aig-authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "google/nano-banana",
    "input": {
      "prompt": "a cat riding a burrito"
    }
  }'
```

</TabItem>
<TabItem label="Workers AI binding">

<WranglerConfig>

```toml
name = "run-api-example"
main = "src/index.ts"
compatibility_date = "$today"

[ai]
binding = "AI"
```

</WranglerConfig>

<TypeScriptExample>

```ts
export interface Env {
  AI: Ai;
}

export default {
  async fetch(request, env): Promise<Response> {
    const response = await env.AI.run(
      "google/nano-banana",
      {
        prompt: "a cat riding a burrito",
      },
      {
        gateway: {
          id: "{gateway_id}",
        },
      },
    );

    return new Response(JSON.stringify(response), {
      headers: { "Content-Type": "application/json" },
    });
  },
} satisfies ExportedHandler<Env>;
```

</TypeScriptExample>

</TabItem>
</Tabs>

### With input options

```bash
curl -X POST "https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/run" \
  --header "cf-aig-authorization: Bearer $CLOUDFLARE_API_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "model": "google/nano-banana",
    "input": {
      "prompt": "a cat riding a burrito",
      "aspect_ratio": "16:9",
      "output_format": "png",
      "image_size": "2K"
    }
  }'
```
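
The same request can be assembled in TypeScript. The sketch below only builds the `fetch` options and does not send anything. The names `NanoBananaInput`, `RunFetchInit`, and `buildRunFetchInit` are hypothetical, and the option fields are copied from the curl example above without knowledge of their full set of allowed values, so they are typed as plain strings.

```typescript
// Input fields copied from the curl example; allowed values are an unknown
// here, so every option is a plain string (assumption).
interface NanoBananaInput {
  prompt: string;
  aspect_ratio?: string;
  output_format?: string;
  image_size?: string;
}

// Minimal shape of the options object accepted by fetch().
interface RunFetchInit {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: mirrors the headers and JSON payload of the curl call.
function buildRunFetchInit(apiToken: string, input: NanoBananaInput): RunFetchInit {
  return {
    method: "POST",
    headers: {
      "cf-aig-authorization": `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ model: "google/nano-banana", input }),
  };
}
```

In a runtime that provides `fetch`, the returned object can be passed as the second argument alongside the Run API endpoint URL.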

## Response

### Success

A successful request returns a response with the following envelope:

```json
{
  "state": "Completed",
  "result": {
    "image": "https://...r2.cloudflarestorage.com/..."
  },
  "provider": "google-vertex-ai",
  "model": "google/nano-banana"
}
```

| Field | Description |
| --- | --- |
| `state` | The request status. `Completed` on success, `Failed` on error. |
| `result` | The model output. For image models, this contains an `image` field with a URL to the generated image. |
| `provider` | The provider that served the request. |
| `model` | The model that was used. |

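### Error

A failed request returns `state` set to `Failed` and an `error` object containing a `code` and a `message`, as in the following example: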

```json
{
  "state": "Failed",
  "error": {
    "code": "provider_unavailable",
    "message": "All providers for this model are currently unavailable."
  },
  "model": "google/nano-banana"
}
```
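
One way to handle both envelopes in TypeScript is a union discriminated on `state`. The types below are a local sketch inferred from the two JSON examples on this page, not official definitions, and `getImageUrl` is a hypothetical helper. Real responses may carry fields that these examples do not show.

```typescript
// Success envelope, inferred from the example under "Success".
interface RunSuccess {
  state: "Completed";
  result: Record<string, unknown>;
  provider: string;
  model: string;
}

// Error envelope, inferred from the example under "Error".
interface RunFailure {
  state: "Failed";
  error: { code: string; message: string };
  model: string;
}

type RunResponse = RunSuccess | RunFailure;

// Hypothetical helper: returns the generated image URL, or throws with the
// error code and message when the request failed.
function getImageUrl(response: RunResponse): string {
  if (response.state === "Failed") {
    throw new Error(`${response.error.code}: ${response.error.message}`);
  }
  // After the check above, TypeScript narrows `response` to RunSuccess.
  const image = response.result["image"];
  if (typeof image !== "string") {
    throw new Error("result.image is missing or not a string");
  }
  return image;
}
```

Checking `state` first lets the compiler narrow the type, so `error` is only read on failures and `result` only on successes.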

## Current limitations

The Run API is in preview. The following limitations apply:

- **One model** — Only `google/nano-banana` is supported. Additional models will be added over time.
- **No caching** — [Caching](/ai-gateway/features/caching/) is not supported for Run API requests.
- **No rate limiting** — [Rate limiting](/ai-gateway/features/rate-limiting/) rules do not apply to Run API requests.
- **No guardrails or DLP** — [Guardrails](/ai-gateway/features/guardrails/) and data loss prevention features are not available for the Run API.
- **No dynamic routing** — [Dynamic routing](/ai-gateway/features/dynamic-routing/) (fallbacks, A/B testing, conditional logic) is not available.
- **No streaming** — Streaming responses are not supported.
- **No TypeScript types** — TypeScript type definitions for the Run API are not yet available in `@cloudflare/workers-types`.