- 🌟 Overview
- 📜 Scripts
- 🆓 Free API Providers
- 💻 Local Model Providers
- 🔀 API Proxies
- 📚 Detailed Tool Guides
- 🖥️ AI-Enhanced Terminals
This repository is your comprehensive guide to getting the most out of AI tools in your terminal. It contains curated scripts, expert tips, and detailed guides for terminal-based AI development.
💡 Pro Tip: This is a companion to the awesome-terminals-ai list—your one-stop resource for terminal AI tools!
Useful scripts to enhance your AI terminal workflow:
| Script | Description | Guide |
|---|---|---|
| 📊 copilot-usage.sh | Check your GitHub Copilot usage and quota | Copilot CLI Guide |
| 🤖 run-claude-copilot.sh | Run Claude Code with GitHub Copilot models | See below ⬇️ |
Access powerful Google Gemini models with generous free tier limits:
| Feature | Gemini 2.5 Pro (Free) | Gemini 2.5 Flash (Free) |
|---|---|---|
| ⚡ Rate Limit | 2 requests/minute | 15 requests/minute |
| 📅 Daily Limit | 50 requests/day | 1,500 requests/day |
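The free tier can be exercised straight from the shell. Below is a minimal sketch using the Gemini REST API's `generateContent` endpoint with the `gemini-2.5-flash` model from the table above; it skips the live call when `GEMINI_API_KEY` is not set:

```shell
# Sketch: call Gemini 2.5 Flash via the REST API (requires GEMINI_API_KEY for a live request)
ENDPOINT="https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent"
BODY='{"contents": [{"parts": [{"text": "Summarize what a shell pipeline is in one sentence."}]}]}'
if [ -n "${GEMINI_API_KEY:-}" ]; then
  # The response JSON carries the generated text under candidates[0].content.parts
  curl -s "${ENDPOINT}?key=${GEMINI_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "${BODY}"
else
  echo "GEMINI_API_KEY not set; skipping live request" >&2
fi
```

When scripting loops, keep the 2 requests/minute Pro limit in mind; Flash is the safer default for batch work.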
GitHub provides two types of AI model access for developers:
- 🤖 GitHub Copilot Models
- 🛒 GitHub Market Models
Overview:
- 🌐 Endpoint: https://api.githubcopilot.com
- 📖 Documentation: Supported Models
- ⚡ Rate Limits: see Individual Plan Comparison
Premium request limits (per month):
| Feature | GitHub Copilot Free | GitHub Copilot Pro | GitHub Copilot Pro+ |
|---|---|---|---|
| Premium requests | 0 per month | 300 per month | 1,500 per month |
ℹ️ Exact limits and availability may change over time—always confirm via the official docs above.
Model multipliers:
- 📖 Model Multipliers Documentation
- Models (accessible via API) with a 0× multiplier for non-free plans (not counted toward premium usage): gpt-4.1, gpt-5-mini, gpt-4o
⚠️ Integration Note: The endpoint https://api.githubcopilot.com cannot be used directly by most third-party AI tools. To use GitHub Copilot models with tools like Aider or Claude Code, use the 🌉 Copilot API Bridge proxy to expose an OpenAI/Anthropic-compatible interface.
List available models:
curl -L \
-H "Accept: application/vnd.github+json" \
-H "Authorization: Bearer ${OAUTH_TOKEN}" \
  https://api.githubcopilot.com/models | jq -r '.data[].id'
Overview:
- 🌐 Endpoint: https://models.github.ai/inference
- 🔍 Browse: GitHub Marketplace Models
- 📊 Rate Limits: 4,000 input tokens and 4,000 output tokens per request
List available models:
curl -L \
-H "Accept: application/vnd.github+json" \
-H "Authorization: Bearer ${OAUTH_TOKEN}" \
-H "X-GitHub-Api-Version: 2022-11-28" \
  https://models.github.ai/catalog/models | jq -r '.[].id'
OpenRouter provides unified API access to multiple AI models—try different models using one API to find your best fit!
| Model | Link |
|---|---|
| GPT OSS 20B | Try it |
| Qwen3 Coder | Try it |
| GLM 4.5 Air | Try it |
| Kimi K2 | Try it |
| DeepSeek Chat v3.1 | Try it |
| Grok 4.1 Fast | Try it |
Setup: 🔑 Generate API Key
💡 Rate Limits:
- With 10+ credits purchased: 1,000 requests/day
- Otherwise: 50 requests/day
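A single OpenRouter request looks like any OpenAI-style chat completion. In the sketch below the model id `deepseek/deepseek-chat-v3.1:free` is illustrative (free variants typically carry a `:free` suffix; confirm exact ids on the OpenRouter site):

```shell
# Sketch: one OpenRouter chat request (requires OPENROUTER_API_KEY; model id is illustrative)
ENDPOINT="https://openrouter.ai/api/v1/chat/completions"
BODY='{"model": "deepseek/deepseek-chat-v3.1:free", "messages": [{"role": "user", "content": "Hello"}]}'
if [ -n "${OPENROUTER_API_KEY:-}" ]; then
  curl -s "${ENDPOINT}" \
    -H "Authorization: Bearer ${OPENROUTER_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "${BODY}"
else
  echo "OPENROUTER_API_KEY not set; skipping live request" >&2
fi
```

Because the interface is OpenAI-compatible, swapping the model string is all it takes to compare providers.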
Groq offers high-speed inference with free tier access.
Available models from Rate Limits documentation:
- openai/gpt-oss-120b
- openai/gpt-oss-20b
- qwen/qwen3-32b
- moonshotai/kimi-k2-instruct-0905
Setup: 🔑 Generate API Key
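Groq exposes an OpenAI-compatible endpoint, so the same curl pattern works; here is a sketch using one of the models listed above (requires a `GROQ_API_KEY`):

```shell
# Sketch: Groq's OpenAI-compatible chat endpoint (requires GROQ_API_KEY)
ENDPOINT="https://api.groq.com/openai/v1/chat/completions"
BODY='{"model": "openai/gpt-oss-20b", "messages": [{"role": "user", "content": "Hello"}]}'
if [ -n "${GROQ_API_KEY:-}" ]; then
  curl -s "${ENDPOINT}" \
    -H "Authorization: Bearer ${GROQ_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "${BODY}"
else
  echo "GROQ_API_KEY not set; skipping live request" >&2
fi
```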
NVIDIA Build provides free API access to a wide selection of AI models optimized on NVIDIA infrastructure.
| Model | Full Model Name | Link |
|---|---|---|
| Qwen3 Next 80B | qwen/qwen3-next-80b-a3b-instruct | Try it |
| Qwen3 Coder 480B | qwen/qwen3-coder-480b-a35b-instruct | Try it |
| GPT-OSS 120B | openai/gpt-oss-120b | Try it |
| Kimi K2 Instruct | moonshotai/kimi-k2-instruct-0905 | Try it |
| DeepSeek V3.1 | deepseek-ai/deepseek-v3_1 | Try it |
| MiniMax M2 | minimaxai/minimax-m2 | Try it |
Setup:
💡 Note: Use the full model name (with namespace) when making API requests.
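The note above can be sketched as a request against NVIDIA's OpenAI-compatible endpoint (`integrate.api.nvidia.com` per NVIDIA Build's documentation; requires an `NVIDIA_API_KEY`). Note the namespaced model name:

```shell
# Sketch: NVIDIA Build chat request using the full (namespaced) model name
ENDPOINT="https://integrate.api.nvidia.com/v1/chat/completions"
BODY='{"model": "openai/gpt-oss-120b", "messages": [{"role": "user", "content": "Hello"}]}'
if [ -n "${NVIDIA_API_KEY:-}" ]; then
  curl -s "${ENDPOINT}" \
    -H "Authorization: Bearer ${NVIDIA_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "${BODY}"
else
  echo "NVIDIA_API_KEY not set; skipping live request" >&2
fi
```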
Ollama now provides cloud-hosted models via API access, offering powerful AI capabilities without the need for local infrastructure. These models are accessible through a simple API and integrate seamlessly with popular AI coding tools.
💰 Pricing:
- 🆓 Free Plan - Available with hourly and daily usage limits
- 📈 Pay-per-use - No upfront costs or hardware investment required
| Model | Full Name | Use Case |
|---|---|---|
| 🤖 DeepSeek V3.1 | deepseek-v3.1:671b | Advanced reasoning and code generation |
| 🔥 GPT-OSS 20B | gpt-oss:20b | Efficient coding and text tasks |
| 🚀 GPT-OSS 120B | gpt-oss:120b | High-capacity reasoning and analysis |
| 🌙 Kimi K2 | kimi-k2:1t | Long-context understanding and generation |
| 💻 Qwen3 Coder | qwen3-coder:480b | Specialized code completion and programming |
| ⚡ GLM 4.6 | glm-4.6 | Balanced performance for diverse tasks |
| 🎯 MiniMax M2 | minimax-m2 | Optimized for productivity and speed |
Ollama Cloud Models integrate seamlessly with popular AI coding tools and IDEs through native integrations and OpenAI-compatible APIs:
🎯 Supported AI Coding Tools & IDEs:
| Tool | Integration Type | Documentation |
|---|---|---|
| VS Code | Native Extension | View Guide |
| JetBrains | Native Plugin | View Guide |
| Codex | API Integration | View Guide |
| Cline | API Integration | View Guide |
| Droid | API Integration | View Guide |
| Goose | API Integration | View Guide |
| Zed | Native Extension | View Guide |
Key Benefits:
- OpenAI-compatible API - Use existing OpenAI client libraries
- Direct terminal integration - Run queries from command line
- No local setup required - Access powerful models via API
- Cost-effective - Pay-per-use without hardware investment
- Zero local storage - Models run in the cloud
Example API Usage:
# Query via REST API
curl https://api.ollama.ai/v1/chat/completions \
-H "Authorization: Bearer ${OLLAMA_API_KEY}" \
-H "Content-Type: application/json" \
-d '{
"model": "qwen3-coder:480b",
"messages": [
{"role": "user", "content": "Write a Python function to parse JSON"}
]
  }'
Setup:
- 🔑 Generate API Key
- 📚 View Documentation
- 🆓 Free Plan Available - Includes hourly and daily usage limits
💡 Pro Tip: Most integrations support both local and cloud models. For cloud models, append -cloud to the model name in your tool's configuration.
Ollama - Lightweight framework for running LLMs locally via command line.
Key Features:
- ⚡ Simple CLI interface
- 🌐 RESTful API
- 🐳 Docker-like model management
- 🤖 Popular models: LLaMA, Gemma, DeepSeek
- 🔌 OpenAI-compatible API
- 🖥️ Cross-platform support
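The OpenAI-compatible API means existing clients can point at a local Ollama server. A sketch that probes the server first (the `/api/version` check and port 11434 are Ollama defaults; the model name assumes you have pulled `qwen3:8b`):

```shell
# Sketch: query a local Ollama server through its OpenAI-compatible endpoint
ENDPOINT="http://localhost:11434/v1/chat/completions"
BODY='{"model": "qwen3:8b", "messages": [{"role": "user", "content": "Hello"}]}'
# Probe the server before sending a request
if curl -s --max-time 2 "http://localhost:11434/api/version" >/dev/null 2>&1; then
  curl -s "${ENDPOINT}" -H "Content-Type: application/json" -d "${BODY}"
else
  echo "No Ollama server on localhost:11434; start one with 'ollama serve'" >&2
fi
```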
Access Ollama's cloud-hosted models locally using the same CLI interface. Simply append -cloud (or :cloud for some models) to the model name when pulling or running models.
Available Cloud Models:
| Local Model Name | Cloud Model |
|---|---|
| deepseek-v3.1:671b-cloud | deepseek-v3.1:671b |
| gpt-oss:20b-cloud | gpt-oss:20b |
| gpt-oss:120b-cloud | gpt-oss:120b |
| kimi-k2:1t-cloud | kimi-k2:1t |
| qwen3-coder:480b-cloud | qwen3-coder:480b |
| glm-4.6:cloud | glm-4.6 |
| minimax-m2:cloud | minimax-m2 |
Usage Example:
# Pull a cloud model (stored locally for faster access)
ollama pull qwen3-coder:480b-cloud
# Run the model
ollama run qwen3-coder:480b-cloud
Key Benefits:
- 🚀 Zero local storage - Models run in the cloud
- ⚡ Faster startup - No download or local VRAM requirements
- 🔄 Same CLI experience - Use familiar ollama run/pull commands
- 💰 Pay-per-use - No hardware investment needed
- 🌍 Always up-to-date - Access to latest model versions
💡 Note: Cloud models require an active internet connection and an Ollama Cloud account. Sign in with ollama signin to authenticate your account.
Model Sizes:
| Model | Size |
|---|---|
| gpt-oss:120b | 65 GB |
| gpt-oss:20b | 13 GB |
| qwen3:8b | 5.2 GB |
| qwen3:30b | 18 GB |
Performance Benchmark (tokens/second):
| Machine | gpt-oss:120b | gpt-oss:20b | qwen3:8b | qwen3:30b |
|---|---|---|---|---|
| 🖥️ Windows PC (Intel i9) | - | 15 t/s | 12 t/s | 22 t/s |
| 💻 MacBook Pro (M3 Max) | - | 70 t/s | 57 t/s | 74 t/s |
| 🖥️ Linux Server (Dual RTX 4090) | 36 t/s | 156 t/s | 140 t/s | 163 t/s |
📋 Machine Specifications
- Windows PC (Intel i9):
  - CPU: Intel i9-12900
  - GPU: Intel UHD Graphics 770 (2 GB)
  - RAM: 64 GB
- MacBook Pro (M3 Max):
  - Apple M3 Max with 64 GB RAM
- Linux Server (Dual RTX 4090):
  - CPU: Xeon(R) w7-3445 (40 CPUs)
  - GPU: 2 × Nvidia RTX 4090
  - RAM: 128 GB
LM Studio - User-friendly desktop GUI for running local LLMs with no technical setup required.
Key Features:
- 🛍️ Model marketplace
- 🌐 OpenAI-compatible API server
- 💬 Chat interface
- 📦 GGUF model support
- 💰 Free for personal & commercial use
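LM Studio's built-in server speaks the OpenAI protocol on port 1234 by default. A sketch that probes for a running server first (the model id in the body depends on whichever model you have loaded in the GUI, so it is illustrative):

```shell
# Sketch: query LM Studio's local OpenAI-compatible server (default port 1234)
ENDPOINT="http://localhost:1234/v1/chat/completions"
BODY='{"model": "local-model", "messages": [{"role": "user", "content": "Hello"}]}'
# Probe the server's model listing before sending a request
if curl -s --max-time 2 "http://localhost:1234/v1/models" >/dev/null 2>&1; then
  curl -s "${ENDPOINT}" -H "Content-Type: application/json" -d "${BODY}"
else
  echo "No LM Studio server on localhost:1234; start it from the app's Server tab" >&2
fi
```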
Most AI tools support OpenAI-compatible APIs. For tools requiring Anthropic-compatible APIs, these solutions provide compatibility:
Claude Code Router - Routes Claude Code requests to different models with request customization.
📦 Installation (Linux/macOS)
# Install Claude Code CLI (prerequisite)
npm install -g @anthropic-ai/claude-code
# Install Claude Code Router
npm install -g @musistudio/claude-code-router
⚙️ Configuration Examples
Create ~/.claude-code-router/config.json with your preferred providers:
{
"LOG": true,
"API_TIMEOUT_MS": 600000,
"Providers": [
{
"name": "gemini",
"api_base_url": "https://generativelanguage.googleapis.com/v1beta/models/",
"api_key": "$GEMINI_API_KEY",
"models": ["gemini-2.5-flash", "gemini-2.5-pro"],
"transformer": { "use": ["gemini"] }
},
{
"name": "openrouter",
"api_base_url": "https://openrouter.ai/api/v1/chat/completions",
"api_key": "$OPENROUTER_API_KEY",
"models": ["google/gemini-2.5-pro-preview", "anthropic/claude-sonnet-4"],
"transformer": { "use": ["openrouter"] }
},
{
"name": "grok",
"api_base_url": "https://api.x.ai/v1/chat/completions",
"api_key": "$GROK_API_KEY",
"models": ["grok-beta"]
},
{
"name": "github-copilot",
"api_base_url": "https://api.githubcopilot.com/chat/completions",
"api_key": "$GITHUB_TOKEN",
"models": ["gpt-4o", "claude-3-7-sonnet", "o1-preview"]
},
{
"name": "github-marketplace",
"api_base_url": "https://models.github.ai/inference/chat/completions",
"api_key": "$GITHUB_TOKEN",
"models": ["openai/gpt-4o", "openai/o1-preview", "xai/grok-3"]
},
{
"name": "ollama",
"api_base_url": "http://localhost:11434/v1/chat/completions",
"api_key": "ollama",
"models": ["qwen3:30b", "gpt-oss:20b", "llama3.2:latest"]
}
],
"Router": {
"default": "gemini,gemini-2.5-flash",
"background": "ollama,qwen3:30b",
"longContext": "openrouter,google/gemini-2.5-pro-preview"
}
}
💻 Usage Commands
# Start Claude Code with router
ccr code
# Use UI mode for configuration
ccr ui
# Restart after config changes
ccr restart
# Switch models dynamically in Claude Code
/model ollama,llama3.2:latest
⚠️ Known Issue: The proxy for Ollama models does not work properly with Claude Code.
The GitHub Copilot API (https://api.githubcopilot.com) does not provide direct access for most third-party AI integrations. copilot-api, an open-source proxy, provides the necessary bridge: it exposes both an OpenAI-compatible and an Anthropic-compatible interface at http://localhost:4141.
Installation and Authentication:
# Install copilot-api globally
npm install -g copilot-api
# Device authentication
copilot-api auth
# Start the API proxy
copilot-api start
The copilot-api tool is also available in specialized environments like the modern-linuxtools Singularity image on CVMFS.
CVMFS Setup:
# Setup the environment
source /cvmfs/atlas.sdcc.bnl.gov/users/yesw/singularity/alma9-x86/modern-linuxtools/setupMe.sh
# Then use copilot-api as normal
copilot-api auth
copilot-api start
💻 Usage Examples
# Use with Aider
export ANTHROPIC_BASE_URL=http://localhost:4141 && aider --no-git --anthropic-api-key dummy --model anthropic/claude-sonnet-4.5
# Or use with Claude Code CLI
export ANTHROPIC_BASE_URL=http://localhost:4141 ANTHROPIC_AUTH_TOKEN=dummy ANTHROPIC_MODEL=claude-sonnet-4.5 && claude
📌 Important Notes:
- Use your own URL in ANTHROPIC_BASE_URL and remove any trailing /
- Enable X11 forwarding when SSH-ing: ssh -X username@hostname
- All GitHub Copilot models (excluding Market models) become accessible
For a streamlined experience, this script automates the entire setup process for using Claude Code with GitHub Copilot models.
✨ Key Features:
| Feature | Description |
|---|---|
| 📦 Auto Dependency Management | Installs nvm, npm, copilot-api, and claude-code |
| ⚡ Simplified Usage | Single command to start fully configured Claude session |
| 🔄 Model Selection | Specify which Copilot model to use |
| 🛠️ Utility Functions | Check usage, list models, update packages |
| 🔗 Transparent Args | Forwards arguments directly to claude command |
💻 Usage Examples:
# Run Claude with default settings
./scripts/run-claude-copilot.sh
# List available Copilot models
./scripts/run-claude-copilot.sh --list-models
# Check your Copilot API usage
./scripts/run-claude-copilot.sh --check-usage
# Run Claude with a specific model and pass a prompt
./scripts/run-claude-copilot.sh --model claude-sonnet-4 -- -p "Explain quantum computing"
# Get help on the script's options
./scripts/run-claude-copilot.sh --help
# Get help on Claude's own options
./scripts/run-claude-copilot.sh -- --help
Comprehensive documentation for each AI terminal tool:
| Tool | Description | Guide |
|---|---|---|
| 🤝 Aider | AI pair programming in your terminal | Read Guide |
| 🤖 GitHub Copilot CLI | Copilot coding agent directly in your terminal | Read Guide |
| 💎 Gemini CLI | Google's Gemini in your terminal | Read Guide |
| 🚀 Qwen Code | Qwen3-Coder models in your terminal | Read Guide |
AI-first terminal that integrates intelligent agents directly into the command line.
✨ Key Features:
- Generate commands with AI
- Autosuggestions and error detection
- Multi-agent parallel workflows
- SAML SSO, BYOL, zero data retention
📊 Usage Limits:
- 🆓 Free tier: 150 requests/month
- 💎 Paid plans available for higher usage
📦 Installation:
brew install --cask warp # macOS
winget install Warp.Warp # Windows
# Linux - Multiple package formats available
# See: https://www.warp.dev/blog/warp-for-linux
# Packages include: .deb (apt), .rpm (yum/dnf/zypper), Snap, Flatpak, AppImage, and AUR
Open-source terminal that brings graphical capabilities into the command line.
✨ Key Features:
- Images, markdown, CSV, video files
- Integrated editor for remote files
- Web browser and SSH connection manager
- Dashboard creation capabilities
- Local data storage for privacy
🤖 AI Integration:
- ✅ Built-in AI assistance for command suggestions
- ⚙️ Configurable AI models via "Add AI preset..."
- 🦙 Support for Ollama and other local models
- 🎯 Context-aware recommendations
📦 Installation:
Download from waveterm.dev/download
Available as: Snap, AppImage, .deb, .rpm, and Windows installers
Made with ❤️ by the Community
⭐ Star on GitHub | 🐛 Report Issues | 💡 Contribute
Supercharge your terminal workflow! 🚀