feat(ai,ai-fal): per-model typed durations for video generation #641
tombeckenham wants to merge 19 commits
…oses #619)
Adds a separate `@tanstack/ai-schemas` package that ships per-endpoint JSON Schema + Zod definitions for every supported provider, generated nightly from upstream OpenAPI specs. Architecture ported from fal-ai/fal-js PR #212 and generalised to multi-provider.
Pipeline: `fetch-schemas` (per-provider fetchers) → `generate-schemas` (`@hey-api/openapi-ts` with the Zod 4 plugin) → `generate-endpoint-maps` (bundles `$ref` closures under `$defs`, emits endpoint-id-keyed maps and namespaced top-level barrels).
Providers wired up with working generation:
- OpenAI: `github.com/openai/openai-openapi` raw YAML
- Anthropic: Stainless OpenAPI resolved via anthropic-sdk-typescript/.stats.yml
- Gemini: Google Discovery doc → OpenAPI converted in-pipeline
- ElevenLabs: `api.elevenlabs.io/openapi.json`
- FAL: per-model OpenAPI from `api.fal.ai/v1/models` (skips when FAL_KEY absent)
.github/workflows/sync-schemas.yml runs daily and opens an automated PR when any upstream spec diff lands, following the same pattern as sync-models.yml.
Per repo coupling rules: media-generation + adapter-configuration SKILL.md cross-reference the new package; new docs/advanced/ai-schemas.md page added under the Advanced section.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
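The "bundles `$ref` closures under `$defs`" step could look roughly like the sketch below. This is not the PR's generator code: the names `bundleRefClosure`, `collectRefs`, and `rewriteRefs` are hypothetical, and it assumes refs use the usual OpenAPI `#/components/schemas/<Name>` form.

```typescript
// Hypothetical sketch: bundle the transitive `$ref` closure of one schema
// under `$defs`. Assumes refs look like `#/components/schemas/<Name>`.
type JsonSchema = { [key: string]: unknown }

const REF_PREFIX = '#/components/schemas/'

// Collect every component name referenced anywhere inside `node`.
function collectRefs(node: unknown, out: Set<string>): void {
  if (Array.isArray(node)) {
    for (const item of node) collectRefs(item, out)
    return
  }
  if (node === null || typeof node !== 'object') return
  for (const [key, value] of Object.entries(node as JsonSchema)) {
    if (key === '$ref' && typeof value === 'string' && value.startsWith(REF_PREFIX)) {
      out.add(value.slice(REF_PREFIX.length))
    } else {
      collectRefs(value, out)
    }
  }
}

// Return a copy of `node` with component refs rewritten to `#/$defs/<Name>`.
function rewriteRefs(node: unknown): unknown {
  if (Array.isArray(node)) return node.map(rewriteRefs)
  if (node === null || typeof node !== 'object') return node
  const result: JsonSchema = {}
  for (const [key, value] of Object.entries(node as JsonSchema)) {
    result[key] =
      key === '$ref' && typeof value === 'string' && value.startsWith(REF_PREFIX)
        ? '#/$defs/' + value.slice(REF_PREFIX.length)
        : rewriteRefs(value)
  }
  return result
}

function bundleRefClosure(
  root: JsonSchema,
  components: Record<string, JsonSchema>,
): JsonSchema {
  const seen = new Set<string>()
  const pending = new Set<string>()
  collectRefs(root, pending)
  const queue = Array.from(pending)
  while (queue.length > 0) {
    const name = queue.pop()!
    if (seen.has(name)) continue // also guards against reference cycles
    seen.add(name)
    const target = components[name]
    if (!target) continue // dangling ref: left unresolved in this sketch
    const nested = new Set<string>()
    collectRefs(target, nested)
    for (const next of Array.from(nested)) queue.push(next)
  }
  const defs: Record<string, unknown> = {}
  for (const name of Array.from(seen).sort()) {
    const target = components[name]
    if (target) defs[name] = rewriteRefs(target)
  }
  return { ...(rewriteRefs(root) as JsonSchema), $defs: defs }
}
```

Only the components reachable from the endpoint's schema end up in `$defs`, so each emitted schema is self-contained without carrying the whole spec.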
Switches `@tanstack/ai-schemas` from format-first
(`@tanstack/ai-schemas/schemas/openai`, `/zod/gemini`) to provider-first
(`@tanstack/ai-schemas/openai/json-schema`, `/gemini/zod`) and drops the
namespaced aggregator barrels.
With provider-first subpaths and no aggregator, importing one provider's
schemas pulls only that provider's bytes. Importing
`@tanstack/ai-schemas/gemini/json-schema` carries no OpenAI, Anthropic,
ElevenLabs, or FAL code into the consumer's bundle.
The default `.` entry now only re-exports `toOpenAIStrict` (4 KB).
Wildcard exports in package.json:
"./*/json-schema" -> dist/esm/providers/*/schemas-index.{js,d.ts}
"./*/zod" -> dist/esm/providers/*/index.{js,d.ts}
Generator emits a flat layout — `providers/{providerId}` for single-
category providers, `providers/{providerId}-{category}` for FAL
(`fal-image`, `fal-video`, etc.) — so Node's single-segment `*` wildcard
resolves cleanly.
Vite discovers per-provider barrels at build time so the dist mirrors
the provider layout.
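As a rough illustration of the flat layout and the two wildcard exports above, the sketch below models the mapping as plain string substitution. `providerDir` and `resolveSubpath` are hypothetical helpers, not part of the package, and only the `.js` targets are modelled.

```typescript
// Hypothetical helpers illustrating the layout; not the package's own code.

// Flat directory name: `providers/openai`, or `providers/fal-video` for FAL.
function providerDir(providerId: string, category?: string): string {
  return category
    ? `providers/${providerId}-${category}`
    : `providers/${providerId}`
}

// The two wildcard exports from the commit message (JS targets only).
const EXPORT_PATTERNS: Record<string, string> = {
  './*/json-schema': './dist/esm/providers/*/schemas-index.js',
  './*/zod': './dist/esm/providers/*/index.js',
}

// Simplified model of subpath-pattern resolution: the text matched by `*`
// in the export key is substituted for `*` in the target.
function resolveSubpath(subpath: string): string | undefined {
  for (const [pattern, target] of Object.entries(EXPORT_PATTERNS)) {
    const star = pattern.indexOf('*')
    const prefix = pattern.slice(0, star)
    const suffix = pattern.slice(star + 1)
    const longEnough = subpath.length > prefix.length + suffix.length
    if (longEnough && subpath.startsWith(prefix) && subpath.endsWith(suffix)) {
      const matched = subpath.slice(prefix.length, subpath.length - suffix.length)
      return target.replace('*', matched)
    }
  }
  return undefined
}

console.log(resolveSubpath('./gemini/json-schema'))
console.log(resolveSubpath('./fal-video/zod'))
```

Because each FAL category gets its own hyphenated directory, every public subpath has exactly one segment before `/json-schema` or `/zod`.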
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds Grok as a provider in the nightly OpenAPI sync. Source is xAI's first-party OpenAPI spec at docs.x.ai/openapi.json (OpenAPI 3.1.0, 32 endpoints, 168 schemas). Public, no auth required for the spec.
Grok is API-compatible with OpenAI's chat completions at the wire level but ships an independent schema set (different names, slightly different shapes, plus xAI-specific endpoints: deferred-completion, documents/search, per-modality model lists, video edits/extensions, tokenize-text, etc.). Each provider stays isolated in the schemas package.
Bumps the build script's heap to 8 GB. With six providers now in the single dts program, the default 4 GB limit was tight.
Knip ignores `vite` for ai-schemas since the script invokes the binary directly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…bles
The earlier `/* eslint-disable */` and `// @ts-nocheck` headers on every generated file were holdovers from iterations before the spec-level transforms (`coerceDefaults`, discriminator rewrites) cleaned up the codegen. With those in place, the only real issues are:
- One `no-control-regex` lint error in `elevenlabs/zod.gen.ts` (a literal `\x09` in a regex pattern, faithful to upstream).
- One tsc error in `openai/zod.gen.ts` where the post-process rewrites `.extend()` on a discriminatedUnion to `z.intersection(...)` (runtime correct, but intersections aren't `$ZodTypeDiscriminable`).
Replace the blanket suppressors with two surgical mechanisms:
- `suppressControlRegexLines` injects `// eslint-disable-next-line no-control-regex` immediately above each regex literal containing a control-char escape.
- The `.extend()` rewriter now reports whether it touched anything; only files where it did get a two-line header (`ban-ts-comment` disable followed by `@ts-nocheck`) so the workspace lint rule doesn't fire.
Net effect: zero file-level disables, two line-level disables on the only file that needs them, ~100 `no-explicit-any` warnings (warn-level in the project config, do not fail CI). `pnpm test:eslint` and `pnpm test:types` both walk the generated files in full and pass clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
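The PR's `suppressControlRegexLines` body is not shown here. A minimal sketch of what such a pass might do follows, assuming a line-based heuristic: a line is flagged when it looks like it holds a regex literal and contains a `\x00`-`\x1f` or `\u0000`-`\u001f` escape.

```typescript
// Hypothetical reimplementation; the detection heuristic is an assumption.
const DISABLE_COMMENT = '// eslint-disable-next-line no-control-regex'

// Something that looks like `/.../flags` somewhere on the line.
const REGEX_LITERAL = /\/.+\/[dgimsuvy]*/
// An escaped control character written as \x00-\x1f or \u0000-\u001f.
const CONTROL_ESCAPE = /\\x[01][0-9a-fA-F]|\\u00[01][0-9a-fA-F]/

function suppressControlRegexLines(source: string): string {
  const out: Array<string> = []
  for (const line of source.split('\n')) {
    const needsDisable = REGEX_LITERAL.test(line) && CONTROL_ESCAPE.test(line)
    // Skip lines already covered so that re-running the pass is a no-op.
    const alreadyDisabled = out[out.length - 1]?.trim() === DISABLE_COMMENT
    if (needsDisable && !alreadyDisabled) {
      const indent = line.slice(0, line.length - line.trimStart().length)
      out.push(indent + DISABLE_COMMENT)
    }
    out.push(line)
  }
  return out.join('\n')
}
```

Keeping the pass idempotent matters for a nightly job: regenerating an unchanged spec should not stack duplicate comments and produce a spurious diff.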
…rkflow
`@hey-api/openapi-ts` runs with full code-execution privileges during nightly codegen, so an unintentional bump shouldn't be possible. Drop the caret on the version specifier (`^0.97.2` -> `0.97.2`) so future updates have to land via an explicit PR. The lockfile already pinned the resolved version; this just removes the wiggle room in the manifest.
Add a `pnpm audit --audit-level=high` step to sync-schemas.yml. It's non-blocking (`continue-on-error: true`) because the workspace currently carries pre-existing transitive advisories (e.g. an old protobufjs via @google/genai used by ai-gemini) that aren't related to schemas codegen and shouldn't be allowed to break the nightly. The step's purpose is to surface any *new* advisory landing on the codegen dep tree so the human reviewing the automated PR can spot it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
We already have pnpm's trust-downgrade policy active (refuses installs where provenance attestation regressed), the heyapi version explicitly pinned without a caret, and an automated PR pattern that forces human review before any codegen change lands. A non-blocking audit step would just be log noise, and a blocking one needs scope work we don't want to take on right now.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* feat(ai-openai): add gpt-image-2 to image model meta
Adds `gpt-image-2` to OPENAI_IMAGE_MODELS so it can be used through openaiImage adapters. Reuses the gpt-image-1 provider-options/size shape (quality, background, output_format, output_compression, moderation, partial_images; sizes 1024x1024 / 1536x1024 / 1024x1536 / auto) and extends size + prompt-length validators. Also updates the media-generation skill and image-generation doc page to list the new model.
* fix(ai-openrouter): restore web_fetch in tool capabilities map
The model-metadata sync in #623 regenerated `OpenRouterChatModelToolCapabilitiesByName` with `['web_search']` only, which made `webFetchTool()` (added in #611) unassignable to any OpenRouter text adapter and broke the per-model type-safety test. Add `'web_fetch'` back so the existing tests compile.
* docs: refresh README discoverability
* docs: address README review feedback
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* fix: complete tool calls with server results
* fix: hydrate server tool outputs from history
* test: cover server tool history hydration
* ci: apply automated fixes
* test: add issue 176 manual repro page
* ci: apply automated fixes
* test: add live issue 176 repro flow
* test: hide issue 176 repro from sidebar
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Adds a fifth generic `TModelDurationByName` to `VideoAdapter` plus two
introspection methods on the base class:
- `availableDurations()` returns a `DurationOptions` tagged union
(`discrete | range | mixed | none`) describing the durations the
current model accepts.
- `snapDuration(seconds)` coerces raw seconds to the closest valid
duration for the current model.
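To make the tagged union and the snapping behaviour concrete, here is a minimal sketch. Only the four `kind` tags come from the commit message; the field names (`values`, `min`, `max`), the tie-breaking rule, and the handling of `mixed` are assumptions, and the real `DurationOptions` may be shaped differently.

```typescript
// Sketch only: field names and `mixed` semantics are assumptions.
type DurationOptions =
  | { kind: 'discrete'; values: ReadonlyArray<string> }
  | { kind: 'range'; min: number; max: number }
  | { kind: 'mixed'; values: ReadonlyArray<string>; min: number; max: number }
  | { kind: 'none' }

// Parse '5' or '8s' into seconds; non-numeric keywords yield undefined.
function toSeconds(value: string): number | undefined {
  const match = /^(\d+(?:\.\d+)?)s?$/.exec(value)
  return match ? Number(match[1]) : undefined
}

function clamp(n: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, n))
}

// Pick the listed value closest to `seconds`; the earlier entry wins ties.
function nearestDiscrete(
  values: ReadonlyArray<string>,
  seconds: number,
): string | undefined {
  let best: string | undefined
  let bestDistance = Infinity
  for (const value of values) {
    const parsed = toSeconds(value)
    if (parsed === undefined) continue
    const distance = Math.abs(parsed - seconds)
    if (distance < bestDistance) {
      best = value
      bestDistance = distance
    }
  }
  return best
}

function snapToDurationOption(
  options: DurationOptions,
  seconds: number,
): string | number | undefined {
  switch (options.kind) {
    case 'none':
      return undefined // the model takes no duration at all
    case 'discrete':
      return nearestDiscrete(options.values, seconds)
    case 'range':
      return clamp(seconds, options.min, options.max)
    case 'mixed':
      // Assumed rule: in-range numbers pass through, otherwise fall back
      // to the nearest listed value, then to clamping.
      if (seconds >= options.min && seconds <= options.max) return seconds
      return (
        nearestDiscrete(options.values, seconds) ??
        clamp(seconds, options.min, options.max)
      )
  }
}

const veo: DurationOptions = { kind: 'discrete', values: ['4s', '6s', '8s'] }
console.log(snapToDurationOption(veo, 7.4)) // '8s'
```

Returning the original string form (`'8s'` rather than `8`) keeps the snapped value assignable to the per-model literal type.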
`generateVideo({ duration })` is now typed via
`VideoDurationForAdapter<TAdapter>`. The FAL adapter derives its
per-model duration type from the SDK's `EndpointTypeMap`, so e.g.
`falVideo('fal-ai/kling-video/v1.6/standard/text-to-video')` types
`duration` as `'5' | '10'`; `falVideo('fal-ai/veo3')` types it as
`'4s' | '6s' | '8s'`; `falVideo('fal-ai/minimax/video-01')` rejects
the field entirely.
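The typing described above can be imitated in miniature. The sketch below does not use the real `EndpointTypeMap` or `VideoDurationForAdapter`: `EndpointTypeMapSketch`, `DurationOf`, `DurationFor`, and `generateVideoSketch` are stand-ins, and the three entries simply mirror the examples in this commit message.

```typescript
// Stand-in for the SDK's endpoint map; the shapes are assumed for illustration.
type EndpointTypeMapSketch = {
  'fal-ai/kling-video/v1.6/standard/text-to-video': {
    input: { prompt: string; duration?: '5' | '10' }
  }
  'fal-ai/veo3': {
    input: { prompt: string; duration?: '4s' | '6s' | '8s' }
  }
  'fal-ai/minimax/video-01': {
    input: { prompt: string }
  }
}

type ModelId = keyof EndpointTypeMapSketch

// Extract the `duration` literal union, or `never` when the field is absent.
// `Required` strips the optional marker so the `infer` only matches inputs
// that actually declare the property.
type DurationOf<TInput> = Required<TInput> extends { duration: infer D }
  ? D
  : never

type DurationFor<M extends ModelId> = DurationOf<
  EndpointTypeMapSketch[M]['input']
>

// `duration?: never` means models without the field reject any value.
function generateVideoSketch<M extends ModelId>(args: {
  model: M
  prompt: string
  duration?: DurationFor<M>
}): string {
  return args.duration === undefined
    ? `${args.model}: provider default`
    : `${args.model}: ${String(args.duration)}`
}

// Compiles: '8s' is in veo3's union.
console.log(generateVideoSketch({ model: 'fal-ai/veo3', prompt: 'a cat', duration: '8s' }))

// @ts-expect-error '5' is a Kling value, not a Veo 3 value
generateVideoSketch({ model: 'fal-ai/veo3', prompt: 'a cat', duration: '5' })

// @ts-expect-error this model has no `duration` field at all
generateVideoSketch({ model: 'fal-ai/minimax/video-01', prompt: 'a cat', duration: '5' })
```

The model id is inferred from the `model` literal alone, so the `duration` check happens at the call site with no explicit type arguments.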
Adapters that have not yet declared their per-model duration map get
sensible defaults (`{ kind: 'none' }`, `undefined`) so existing video
adapters keep working without changes.
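The opt-in behaviour can be pictured with a small class hierarchy. This is a sketch under assumptions: `BaseVideoAdapterSketch`, `LegacyAdapter`, and `TwoStepAdapter` are hypothetical names, the `discrete` payload shape is guessed, and the real base class has more generics and methods.

```typescript
// Hypothetical miniature of the base-class defaults; not the real VideoAdapter.
type DurationOptionsSketch =
  | { kind: 'discrete'; values: ReadonlyArray<string> }
  | { kind: 'none' }

abstract class BaseVideoAdapterSketch<TDuration = never> {
  // Default: the adapter has not declared any duration constraints.
  availableDurations(): DurationOptionsSketch {
    return { kind: 'none' }
  }

  // Default: nothing to snap to.
  snapDuration(_seconds: number): TDuration | undefined {
    return undefined
  }
}

// An adapter that has not been migrated inherits both defaults untouched.
class LegacyAdapter extends BaseVideoAdapterSketch {}

// A migrated adapter overrides both methods with its own constraints.
class TwoStepAdapter extends BaseVideoAdapterSketch<'5' | '10'> {
  availableDurations(): DurationOptionsSketch {
    return { kind: 'discrete', values: ['5', '10'] }
  }

  snapDuration(seconds: number): '5' | '10' {
    return seconds < 7.5 ? '5' : '10'
  }
}

console.log(new LegacyAdapter().availableDurations().kind) // 'none'
console.log(new TwoStepAdapter().snapDuration(9)) // '10'
```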
Built on top of #622 (`@tanstack/ai-schemas`); once that PR's FAL
pipeline syncs runtime constraint data, the hand-curated map in
`packages/typescript/ai-fal/src/video/video-provider-options.ts` can
be replaced with schema-derived lookups. Follow-up issue #634 covers
building the Gemini Veo adapter directly on this contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🚀 Changeset Version Preview: 3 package(s) bumped directly, 28 bumped as dependents.
View your CI Pipeline Execution for commit a4ed1b6 (Nx Cloud)
@tanstack/ai
@tanstack/ai-anthropic
@tanstack/ai-client
@tanstack/ai-code-mode
@tanstack/ai-code-mode-skills
@tanstack/ai-devtools-core
@tanstack/ai-elevenlabs
@tanstack/ai-event-client
@tanstack/ai-fal
@tanstack/ai-gemini
@tanstack/ai-grok
@tanstack/ai-groq
@tanstack/ai-isolate-cloudflare
@tanstack/ai-isolate-node
@tanstack/ai-isolate-quickjs
@tanstack/ai-ollama
@tanstack/ai-openai
@tanstack/ai-openrouter
@tanstack/ai-preact
@tanstack/ai-react
@tanstack/ai-react-ui
@tanstack/ai-schemas
@tanstack/ai-solid
@tanstack/ai-solid-ui
@tanstack/ai-svelte
@tanstack/ai-utils
@tanstack/ai-vue
@tanstack/ai-vue-ui
@tanstack/openai-base
@tanstack/preact-ai-devtools
@tanstack/react-ai-devtools
@tanstack/solid-ai-devtools
7a05fca to d9904cf
Closes #534. Built on top of #622 (`@tanstack/ai-schemas`) and targets that branch as the base. When #622 merges into main, this PR will rebase and retarget to main.

Summary

- A fifth generic on `VideoAdapter` (`TModelDurationByName`) plus two introspection methods on the base class: `availableDurations()` and `snapDuration(seconds)`. Default implementations return `{ kind: 'none' }` / `undefined`, so video adapters that haven't been migrated keep working unchanged.
- `generateVideo({ duration })` is now per-model typed via `VideoDurationForAdapter<TAdapter>`.
- The FAL adapter derives `duration` per model from `@fal-ai/client`'s `EndpointTypeMap`. Runtime constraint data for popular models lives in a hand-curated map in `packages/typescript/ai-fal/src/video/video-provider-options.ts` (to be replaced with schema-derived lookups once #622's FAL pipeline syncs).
- The `snapToDurationOption` util handles discrete enums, ranges, mixed, keyword-with-unit forms (`'8s'`), and `kind: 'none'`.

Per-model behaviour after this PR

| Model | `duration` type | `availableDurations()` |
| --- | --- | --- |
| `fal-ai/kling-video/v1.6/{standard,pro}/text-to-video` | `'5' \| '10'` | discrete |
| `fal-ai/pika/v2.2/text-to-video` | `'5' \| '10'` | discrete |
| `fal-ai/luma-dream-machine/ray-2` | `'5s' \| '9s'` | discrete |
| `fal-ai/veo3` and `fal-ai/veo3/image-to-video` | `'4s' \| '6s' \| '8s'` | discrete |
| `fal-ai/wan-25-preview/text-to-video` | `'2'` … `'15'` | discrete |
| `fal-ai/minimax/video-01` | rejected (no `duration` field) | `{ kind: 'none' }` |
| `fal-ai/hunyuan-video-v1.5/text-to-video` | … `num_frames`) | `{ kind: 'none' }` |

Scope decisions

- … (`{ kind: 'none' }`) until someone migrates it.
- … (`packages/typescript/ai-gemini/src/model-meta.ts:827-940`). Filed as #634 (feat(gemini): add Google Veo video adapter on the typed-duration contract) to build directly on this contract.

Breaking change

Callers passing `duration: <number>` to FAL video models must either:

- pass the model's typed value (`'5'`, `'8s'`, etc.) directly, or
- call `adapter.snapDuration(seconds)` and let it coerce.

Existing video adapters that haven't opted into the typed-duration map (OpenAI Sora today) are not affected.

Test plan

- `pnpm test:types`: repo-wide type check passes
- `pnpm test:lib`: all 32 packages, 904 core tests + 86 FAL tests pass (5 new snap.test.ts cases + 4 new FAL durations cases)
- `pnpm test:eslint`: clean
- `pnpm build`: all 32 projects build clean
- … `FAL_KEY` is available; the E2E suite's `video-gen` matrix currently doesn't include `'fal'`, so this is also follow-up infra work.
- `falVideo('fal-ai/veo3')` autocompletes `'4s' | '6s' | '8s'` on `duration`; `falVideo('fal-ai/minimax/video-01')` rejects `duration`.

🤖 Generated with Claude Code