Noveum AI Gateway


A hyper-efficient, lightweight AI Gateway that provides a unified interface to access various AI model providers through a single endpoint. Built for edge deployment using Cloudflare Workers, it offers seamless integration with popular AI providers while maintaining high performance and low latency.

🌟 Features

  • 🚀 Edge-Optimized Performance: Built on Cloudflare Workers for minimal latency
  • 🔄 Universal Interface: Single endpoint for multiple AI providers
  • 🔌 Provider Agnostic: Easily switch between different AI providers
  • 📡 Streaming Support: Real-time streaming responses for all supported providers
  • 🛠 Extensible Middleware: Customizable request/response pipeline
  • ✅ Built-in Validation: Automatic request validation and error handling
  • 🔄 Auto-Transform: Automatic request/response transformation
  • 📊 Detailed Metrics: Comprehensive request metrics and cost tracking
  • 📝 Comprehensive Logging: Detailed logging for monitoring and debugging
  • 💪 Type-Safe: Built with TypeScript for robust type safety
  • 🔒 OpenAI Compatible: Drop-in replacement for OpenAI's API

🤖 Supported Providers

| Provider  | Streaming | OpenAI Compatible |
| --------- | --------- | ----------------- |
| OpenAI    | ✅        | Native            |
| Anthropic | ✅        | ✅                |
| GROQ      | ✅        | ✅                |
| Fireworks | ✅        | ✅                |
| Together  | ✅        | ✅                |
🚀 Quick Start

Using Cloudflare Workers (Recommended)

```bash
# Install Wrangler CLI
npm install -g wrangler

# Clone and set up
git clone https://github.com/Noveum/ai-gateway.git
cd ai-gateway
npm install

# Login to Cloudflare
wrangler login

# Development
npm run dev     # Server starts at http://localhost:3000

# Deploy
npm run deploy
```

Using Docker (Alternative)

```bash
docker pull noveum/ai-gateway:latest
docker run -p 3000:3000 noveum/ai-gateway:latest
```

📚 Usage Examples

OpenAI-Compatible Interface

The gateway is a drop-in replacement for OpenAI's API, so you can keep your existing OpenAI client libraries and simply change the base URL:

```typescript
import OpenAI from 'openai';

const openai = new OpenAI({
    baseURL: 'http://localhost:3000/v1',
    apiKey: 'your-provider-api-key',
    defaultHeaders: { 'x-provider': 'openai' }
});

const response = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: 'Hello!' }]
});
```
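Because the gateway routes on headers, switching providers needs no SDK at all. A minimal sketch using the built-in `fetch` API (Node 18+); the helper names `gatewayHeaders` and `chatOnce` are illustrative, not part of the gateway:

```typescript
// Build the headers the gateway expects: x-provider selects the backend,
// Authorization carries that provider's own API key.
function gatewayHeaders(provider: string, apiKey: string): Record<string, string> {
  return {
    'Content-Type': 'application/json',
    'x-provider': provider,
    'Authorization': `Bearer ${apiKey}`,
  };
}

// One-shot chat completion through the gateway (assumes it runs locally).
async function chatOnce(provider: string, apiKey: string, model: string, prompt: string) {
  const res = await fetch('http://localhost:3000/v1/chat/completions', {
    method: 'POST',
    headers: gatewayHeaders(provider, apiKey),
    body: JSON.stringify({ model, messages: [{ role: 'user', content: prompt }] }),
  });
  return res.json();
}

// The call shape is identical for every provider; only the arguments change:
// await chatOnce('openai', 'your-openai-api-key', 'gpt-4', 'Hello!');
// await chatOnce('groq', 'your-groq-api-key', 'mixtral-8x7b-32768', 'Hello!');
```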

Provider-Specific Examples

Anthropic (Claude)

```bash
curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-provider: anthropic" \
  -H "Authorization: Bearer your-anthropic-api-key" \
  -d '{
    "model": "claude-3-sonnet-20240229",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7,
    "max_tokens": 1000
  }'
```

GROQ

```bash
curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-provider: groq" \
  -H "Authorization: Bearer your-groq-api-key" \
  -d '{
    "model": "mixtral-8x7b-32768",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.7,
    "max_tokens": 1000
  }'
```

Streaming Example

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/v1",
    api_key="your-provider-api-key",
    default_headers={"x-provider": "anthropic"}  # or any other provider
)

stream = client.chat.completions.create(
    model="claude-3-sonnet-20240229",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
```
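TypeScript callers without the SDK can consume the gateway's OpenAI-style server-sent-event stream with `fetch` directly. A sketch assuming events arrive as `data: {...}` lines terminated by a `data: [DONE]` sentinel, as in OpenAI's streaming format; `extractDeltas` and `streamChat` are illustrative names:

```typescript
// Parse completed SSE lines and collect the text deltas they carry.
// Pure function, so it can be tested without a live gateway.
function extractDeltas(sseText: string): string[] {
  const deltas: string[] = [];
  for (const line of sseText.split('\n')) {
    if (!line.startsWith('data: ')) continue;          // ignore non-data lines
    const payload = line.slice('data: '.length).trim();
    if (payload === '[DONE]') continue;                // end-of-stream sentinel
    const content = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (typeof content === 'string') deltas.push(content);
  }
  return deltas;
}

// Stream a completion and print deltas as they arrive. Buffers partial
// lines so JSON split across network chunks is never parsed half-done.
async function streamChat(provider: string, apiKey: string, model: string, prompt: string) {
  const res = await fetch('http://localhost:3000/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-provider': provider,
      'Authorization': `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model, messages: [{ role: 'user', content: prompt }], stream: true }),
  });
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const cut = buffer.lastIndexOf('\n');
    if (cut === -1) continue;                          // no complete line yet
    for (const delta of extractDeltas(buffer.slice(0, cut + 1))) {
      process.stdout.write(delta);
    }
    buffer = buffer.slice(cut + 1);                    // keep the partial tail
  }
}
```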

Example Response with Metrics

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1709312768,
  "model": "gpt-4",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I help you today?"
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 9,
    "total_tokens": 19
  },
  "system_fingerprint": "fp_1234",
  "metrics": {
    "latency_ms": 450,
    "tokens_per_second": 42.2,
    "cost": {
      "input_cost": 0.0003,
      "output_cost": 0.0006,
      "total_cost": 0.0009
    }
  }
}
```
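The `metrics` object is a gateway addition on top of the standard OpenAI response shape, so typed clients need a small extension to read it. A sketch with field names taken from the example above; the exact shape may evolve:

```typescript
// Shape of the gateway's per-request metrics, as shown in the example above.
interface GatewayMetrics {
  latency_ms: number;
  tokens_per_second: number;
  cost: { input_cost: number; output_cost: number; total_cost: number };
}

// Render a one-line summary suitable for request logging.
function summarizeMetrics(m: GatewayMetrics): string {
  return `latency=${m.latency_ms}ms, ` +
         `throughput=${m.tokens_per_second} tok/s, ` +
         `cost=$${m.cost.total_cost.toFixed(4)}`;
}

// Example: feed it the metrics block from a gateway response.
const line = summarizeMetrics({
  latency_ms: 450,
  tokens_per_second: 42.2,
  cost: { input_cost: 0.0003, output_cost: 0.0006, total_cost: 0.0009 },
});
console.log(line); // latency=450ms, throughput=42.2 tok/s, cost=$0.0009
```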

📖 Documentation

📄 Contributing Opportunities

We welcome contributions! Here are some tasks we're actively looking for help with:

High Priority Tasks

  1. AWS Bedrock Integration

    • Add support for AWS Bedrock models
    • Implement authentication and cost tracking
  2. Testing Framework

  3. Performance Benchmarks

Feature Requests

  1. Prometheus Integration

  2. Response Caching

  3. Rate Limiting

Documentation

  1. Provider Guides

  2. Deployment Examples

Want to contribute?

  1. Pick a task from above
  2. Open an issue to discuss your approach
  3. Submit a pull request

Need help? Join our Discord or check existing issues.

📄 Metrics & Monitoring

The gateway collects detailed metrics for every request, providing insights into:

  • 📈 Real-time performance tracking
  • 💰 Token usage and cost calculation
  • 🔄 Streaming metrics support
  • 📊 Provider-specific metadata
  • ⏱️ Latency and TTFB (time to first byte) monitoring
  • 🔍 Detailed debugging information

For detailed metrics documentation, see METRICS.md.

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

📬 Contact

Made with ❤️ by the Noveum Team

Copyright 2024 Noveum AI

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

πŸ™ Acknowledgments

πŸ“¬ Contact


Made with ❀️ by the Noveum Team
