Save 30-60% on Claude Code costs with an installable skill, web tools, and 11 deep-dive guides.
Claude Code (official plugin system):
/plugin marketplace add Sagargupta16/claude-cost-optimizer
/plugin install cost-mode@sagargupta16-claude-cost-optimizer
Multi-agent (Cursor, Cline, Codex, 40+ agents):
npx skills add Sagargupta16/claude-cost-optimizer
Then activate in any session:
/cost-mode # Standard (40-60% output token reduction)
/cost-mode lite # Professional brevity (20-40% output reduction)
/cost-mode strict # Telegraphic, max savings (60-70% output reduction)
/cost-mode off # Resume normal behavior
| Feature | How It Saves Tokens |
|---|---|
| Strips filler | Drops pleasantries, hedging, restating your question, trailing summaries |
| Suggests cheaper models | "Haiku handles this -- /model haiku" for simple tasks |
| Suggests CLI tools | "Use prettier/eslint --fix directly" instead of burning LLM tokens |
| Session awareness | Reminds you to /compact after 20+ turns and to start a fresh session for each new task |
| Minimal code gen | Diffs over rewrites, no obvious comments, no speculative error handling |
| Auto-deactivates | Full clarity for security warnings, destructive ops, and when you're confused |
Technical accuracy is never sacrificed. Code in commits and PRs is written normally.
No install needed -- use these in your browser:
| Tool | What It Does |
|---|---|
| Repo Analyzer | Paste a GitHub URL to get a cost audit, grade (A+ to F), and recommendations |
| Cost Calculator | Estimate monthly spend based on your model, sessions, and config |
| Badge Checker | Score your setup and get a shields.io badge for your repo |
Real project, 30-turn session, Opus 4.7:
| Metric | Before (no optimization) | After (5 minutes of setup) |
|---|---|---|
| CLAUDE.md | 6,200 chars (truncated) | 2,800 chars (under limit) |
| .claudeignore | missing | 12 patterns |
| MCP servers | 6 active | 2 active |
| System prompt | ~15,000 tokens/turn | ~5,500 tokens/turn |
| Session cost | $2.85 | $1.12 |
| Monthly (3x/day) | $188.10 | $73.92 |

Savings: $114.18/month (61%)
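
The monthly figures above can be reproduced from the per-session costs. The sketch below is illustrative only: the README does not say how many days per month it counts, so `WORKING_DAYS_PER_MONTH = 22` is an assumption, chosen because it is the value that matches the published totals.

```python
# Reproduce the before/after monthly figures from the per-session costs.
SESSIONS_PER_DAY = 3          # "3x/day" from the comparison above
WORKING_DAYS_PER_MONTH = 22   # ASSUMPTION: not stated in the source


def monthly_cost(session_cost: float) -> float:
    """Monthly spend in USD for a given per-session cost, rounded to cents."""
    return round(session_cost * SESSIONS_PER_DAY * WORKING_DAYS_PER_MONTH, 2)


before = monthly_cost(2.85)
after = monthly_cost(1.12)
savings = round(before - after, 2)
savings_pct = round(100 * savings / before)

print(f"Before: ${before:.2f}  After: ${after:.2f}")
print(f"Savings: ${savings:.2f}/month ({savings_pct}%)")
```

Swap in your own session cost and session count to see how the same setup changes would scale for your usage.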
Even without installing the skill, these 5 changes cut costs immediately:
| # | Strategy | Savings | Guide |
|---|---|---|---|
| 1 | Keep CLAUDE.md under 4,000 characters -- content beyond 4K is silently truncated | 10-20% | Context Optimization |
| 2 | Use Haiku for simple tasks (--model haiku) -- 5x cheaper than Opus | 20-40% | Model Selection |
| 3 | Use Plan Mode before coding -- prevents wasted iterative cycles | 15-25% | Workflow Patterns |
| 4 | Add .claudeignore -- stop Claude from reading node_modules, dist, lock files | 5-15% | Context Optimization |
| 5 | Delegate to subagents -- isolate expensive searches from main context | 20-40% | Workflow Patterns |
Full walkthrough: Getting Started in 5 Minutes
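
Strategy 1 is easy to check automatically. Here is a minimal sketch of such a check, assuming the 4,000-character limit quoted in the table and a CLAUDE.md in the project root. The function name `check_claude_md` is hypothetical and not part of this repo's tooling.

```python
from pathlib import Path

# Limit taken from the quick-wins table (4,000 characters).
CLAUDE_MD_CHAR_LIMIT = 4_000


def check_claude_md(text: str, limit: int = CLAUDE_MD_CHAR_LIMIT) -> tuple[bool, int]:
    """Return (within_limit, overflow_chars) for the given CLAUDE.md contents."""
    overflow = max(0, len(text) - limit)
    return overflow == 0, overflow


if __name__ == "__main__":
    path = Path("CLAUDE.md")  # ASSUMPTION: file lives in the current directory
    if path.exists():
        ok, overflow = check_claude_md(path.read_text(encoding="utf-8"))
        print("OK" if ok else f"Over the limit by {overflow} characters")
    else:
        print("No CLAUDE.md found in the current directory")
```

For the 6,200-character file in the before/after comparison, this would report an overflow of 2,200 characters.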
cost-mode is the first skill. More are planned:
| Skill | Status | What It Does | Cost Impact |
|---|---|---|---|
| cost-mode | Live | Concise responses, model routing suggestions, session awareness | 30-60% cost savings |
| claudeignore-gen | Planned | Auto-generates .claudeignore based on your project's tech stack | 5-15% input reduction |
| context-compress | Planned | Rewrites your CLAUDE.md to be shorter while keeping all essential info | 10-20% input reduction |
| cache-optimizer | Planned | Detects cache-busting patterns and suggests fixes to maximize prompt cache hits | 10-25% input reduction |
| budget-guard | Planned | Per-session and per-day spending limits with warnings before you blow past them | Prevents overspend |
Want to build one? Skills are just SKILL.md files -- see CONTRIBUTING.md and the skills/cost-mode/ directory for the pattern.
11 deep-dive guides covering every optimization area:
| Guide | What You'll Learn |
|---|---|
| 00 - Getting Started | Zero to optimized in 5 minutes -- the essential setup |
| 01 - Understanding Costs | How billing works, what costs the most, where money goes |
| 02 - Context Optimization | Reduce input tokens: CLAUDE.md, .claudeignore, file reads |
| 03 - Model Selection | When to use Opus vs Sonnet vs Haiku (with decision tree) |
| 04 - Workflow Patterns | Plan mode, subagents, commands, batch operations |
| 05 - Team Budgeting | Per-developer budgets, cost tracking, ROI calculation |
| 06 - Access Methods & Pricing | Compare API vs Bedrock vs Vertex AI vs Claude Code pricing |
| 07 - MCP & Agent Cost Impact | MCP server overhead, subagent costs, Agent SDK patterns |
| 08 - Prompt Caching Deep Dive | Cache mechanics, TTL economics, maximizing hit rates, ROI math |
| 09 - Subscription Plan Value | Choose the right plan, maximize allowance, upgrade/downgrade signals |
| 10 - Three-Tier Task Routing | Skip the LLM for Tier 0 tasks, route cheap tasks to Haiku, save Opus for complex work |
Also: Visual Diagrams (Mermaid flowcharts) | One-Page Cheatsheet
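
As a rough illustration of the three-tier idea from guide 10, the sketch below routes a task description to a CLI tool, Haiku, or Opus. The keyword lists and the `route_task` name are hypothetical placeholders, not the guide's actual decision rules; a real router would need better signals than substring matching.

```python
# ASSUMPTION: crude keyword heuristics, for illustration only.
TIER0_KEYWORDS = ("format", "lint", "sort imports")           # no LLM needed
TIER1_KEYWORDS = ("rename", "typo", "docstring", "comment")   # simple edits


def route_task(description: str) -> str:
    """Return 'cli', 'haiku', or 'opus' for a task description."""
    text = description.lower()
    if any(keyword in text for keyword in TIER0_KEYWORDS):
        return "cli"    # Tier 0: e.g. run prettier or eslint --fix directly
    if any(keyword in text for keyword in TIER1_KEYWORDS):
        return "haiku"  # Tier 1: cheap model for routine work
    return "opus"       # Tier 2: complex work


for task in ("Format the src directory",
             "Fix typo in README",
             "Design a caching layer for the API"):
    print(f"{task!r} -> {route_task(task)}")
```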
Copy-paste configs that are already optimized:
CLAUDE.md: Minimal | Standard | Comprehensive | Monorepo
By Stack: React+Vite | Next.js | FastAPI | MERN | Terraform | Go | Rust | Django | Rails | Spring Boot
Settings: Cost-Conscious | Balanced | Performance-First
Commands: /cost-check | /budget-mode | /quick-fix | /optimize
7 tools for measuring, tracking, and reducing costs. Full tools documentation
| Tool | What It Does |
|---|---|
| Token Estimator | Estimate token count and cost for any file |
| Usage Analyzer | Find cost hotspots across your sessions |
| Badge Generator | Grade your project config (A+ to F) from the CLI |
| MCP Cost Server | In-session cost estimation via MCP |
| VS Code Extension | Token count and cost in the status bar |
| GitHub Action | Automated cost audit on PRs |
| Budget Hooks | Track tool calls, log costs, warn at thresholds |
| Model | Input / 1M | Output / 1M | Cache Hit / 1M | Context | Max Output |
|---|---|---|---|---|---|
| Opus 4.7 (current) | $5.00 | $25.00 | $0.50 | 1M | 128K |
| Opus 4.6 (legacy) | $5.00 | $25.00 | $0.50 | 1M | 128K |
| Sonnet 4.6 | $3.00 | $15.00 | $0.30 | 1M | 64K |
| Haiku 4.5 | $1.00 | $5.00 | $0.10 | 200K | 64K |
| Mythos Preview (invite-only, Glasswing) | $25.00 | $125.00 | -- | 1M | -- |
1M context on Opus 4.7, Opus 4.6, and Sonnet 4.6 bills at standard rates across the full window (no long-context premium). Batch API: 50% off. Fast Mode (Opus 4.6 only): 6x standard rates. Regional endpoints (Bedrock/Vertex): +10%. Plans: Pro $20/mo, Max 5x $100/mo, Max 20x $200/mo.
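
To turn the pricing table into a per-request estimate, something like the following sketch may help. The prices are copied from the table; the dictionary keys are informal labels rather than API model IDs, and `estimate_cost` is a hypothetical helper. It ignores any cache-write pricing, which the table does not list, and assumes the batch discount applies uniformly to the whole request.

```python
# Prices in USD per 1M tokens, copied from the pricing table.
PRICES = {
    "opus-4.7": {"input": 5.00, "output": 25.00, "cache_hit": 0.50},
    "sonnet-4.6": {"input": 3.00, "output": 15.00, "cache_hit": 0.30},
    "haiku-4.5": {"input": 1.00, "output": 5.00, "cache_hit": 0.10},
}
BATCH_DISCOUNT = 0.50  # "Batch API: 50% off"


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0, batch: bool = False) -> float:
    """Estimate USD cost; cached_tokens are input tokens billed at the cache-hit rate."""
    price = PRICES[model]
    uncached_tokens = input_tokens - cached_tokens
    cost = (
        uncached_tokens * price["input"]
        + cached_tokens * price["cache_hit"]
        + output_tokens * price["output"]
    ) / 1_000_000
    # ASSUMPTION: the batch discount scales the total linearly.
    return cost * (1 - BATCH_DISCOUNT) if batch else cost


opus = estimate_cost("opus-4.7", 100_000, 10_000)
haiku = estimate_cost("haiku-4.5", 100_000, 10_000)
print(f"Opus: ${opus:.2f}, Haiku: ${haiku:.2f}")
```

For the same 100K-in / 10K-out request, the Opus estimate is five times the Haiku estimate, which matches the "5x cheaper" figure used elsewhere in this README.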
Opus 4.7 tokenizer caveat: The new tokenizer may use up to 35% more tokens for the same text. Effective cost is higher than posted pricing suggests -- factor this into per-task budgets.
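
One way to factor the caveat into a budget is to scale the posted price by the worst-case overhead. This is a sketch under the stated "up to 35%" bound; actual overhead depends on the text, and `effective_price` is a hypothetical helper name.

```python
TOKENIZER_OVERHEAD_MAX = 0.35  # upper bound quoted in the caveat


def effective_price(posted_price_per_mtok: float,
                    overhead: float = TOKENIZER_OVERHEAD_MAX) -> float:
    """Worst-case USD price for text that previously took 1M tokens."""
    return round(posted_price_per_mtok * (1 + overhead), 2)


print(effective_price(5.00))   # Opus 4.7 input, worst case
print(effective_price(25.00))  # Opus 4.7 output, worst case
```

Under this bound, the posted $5.00 / $25.00 rates behave like roughly $6.75 / $33.75 for the same text.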
Mythos Preview: Research preview for defensive cybersecurity (vulnerability discovery, binary analysis). Invite-only via Project Glasswing (AWS, Apple, Cisco, CrowdStrike, Google, JPMorganChase, Microsoft, NVIDIA, Palo Alto Networks, 40+ critical-infra orgs). Not a general-purpose dev model.
Model retirements coming: Haiku 3 retires April 19, 2026 (migrate to Haiku 4.5). Sonnet 4 and Opus 4 (the original 4.0 versions) retire June 15, 2026 (migrate to Sonnet 4.6 / Opus 4.7).
- Task Comparison -- same task, optimized vs not
- Model Comparison -- Opus vs Sonnet vs Haiku
- Context Size Impact -- how CLAUDE.md size affects cost
- Community Leaderboard -- crowdsourced cost-per-task data
- Case Studies -- real-world optimization stories
- GitHub Discussions -- ask questions, share strategies
- Contributing -- from starring the repo to building new skills
- Issue Templates -- tips, benchmarks, case studies
| Project | What It Does | Savings |
|---|---|---|
| caveman | Brevity skill -- strips filler from responses | 50-75% output |
| claude-mem | Compresses session context for handoff | Input reduction |
| claudetop | htop-style monitoring with cost tracking | Visibility |
How much does Claude Code actually cost?
With Pro ($20/mo), Max 5x ($100/mo), or Max 20x ($200/mo), you get included usage. Heavy users report $3-15/day without optimization, $1-5/day with it. Opus 4.7 (and legacy Opus 4.6) at $5/$25 is much more affordable than Opus 4.1 ($15/$75).
Does this apply to the Claude API too?
Context optimization, model selection, and prompt engineering apply to both. The skill and commands are Claude Code-specific.
Will optimization reduce output quality?
No. These strategies eliminate waste (duplicate context, unnecessary file reads, expensive models for simple tasks). Quality stays the same or improves -- less noise means better reasoning.
What's the biggest single change I can make?
Install cost-mode (npx skills add Sagargupta16/claude-cost-optimizer) and switch to Haiku for routine tasks. Combined: 50-70% savings.
If this repo helped you save money, consider giving it a star!
If you found this useful, check out my other AI/Claude tools:
| Project | Description |
|---|---|
| claude-code-recipes | 50+ copy-paste recipes for Claude Code - commands, subagents, hooks, skills |
| claude-skills | Custom Claude Code plugin marketplace with dev-workflow, FARM stack, and more |
| agent-recipes | AI agent workflows for real-world dev tasks - code review, testing, security |
| ai-git-hooks | AI-powered git hooks - auto-review diffs, generate commit messages, security scanning |
| mcp-toolkit | Production-ready middleware for MCP servers - auth, caching, rate limiting |
MIT - use these strategies, templates, and tools however you want.