Stop Paying Opus Prices for Haiku Tasks

claude-model-router automatically routes Claude Code sub-tasks to the cheapest capable model — saving 35–50% on token costs without sacrificing quality.

$ claude plugin add /path/to/claude-model-router

Every Claude Code session burns tokens.
Most of them are wasted.

When Claude Code spawns sub-agents for file exploration, code search, or formatting, it defaults to the most capable model. That's like hiring an architect to measure a room.

  • ~60% of sub-agent tasks are simple enough for Haiku
  • 19x price difference between Opus and Haiku output tokens
  • $0 cost to find out: model-router is free and local

Two-step routing. Zero configuration.

Install the plugin and everything is automatic. Token tracking starts immediately, routing instructions are injected into your CLAUDE.md, and savings are reported every session.

1. Agent Type Rules (hard-coded)
Known-simple agent types are routed instantly:
  • Explore, code-guide, statusline → Haiku
  • Plan → Sonnet
  • General-purpose → Step 2
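
To make the hard-coded rules concrete, here is a minimal sketch of a Step 1 lookup. It is not the plugin's actual code: the names `AGENT_TYPE_RULES` and `routeByAgentType` are hypothetical, and sending unknown agent types on to Step 2 is an assumption, since the page only lists the known types.

```typescript
type Model = "haiku" | "sonnet" | "opus";
// "classify" means: no hard rule applies, hand the task to Step 2.
type RuleResult = Model | "classify";

// Hypothetical rule table mirroring the Step 1 rules listed above.
const AGENT_TYPE_RULES = new Map<string, RuleResult>([
  ["explore", "haiku"],
  ["code-guide", "haiku"],
  ["statusline", "haiku"],
  ["plan", "sonnet"],
  ["general-purpose", "classify"],
]);

function routeByAgentType(agentType: string): RuleResult {
  // Assumption: agent types not in the table are also classified in Step 2.
  return AGENT_TYPE_RULES.get(agentType.toLowerCase()) ?? "classify";
}
```

A table lookup keeps the obvious cases cheap: no model call is needed to decide that an Explore agent can run on Haiku.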
2. Complexity Classification
For general-purpose agents, a fast Haiku call classifies the task by complexity and context dependency.

Complexity   Context   Model
Simple       Low       Haiku
Simple       High      Sonnet
Medium       Low       Sonnet
Medium       High      Opus
Complex      Any       Opus
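
The matrix above can be written as a small pure function. This is a sketch, not the plugin's implementation: the type and function names are hypothetical, and it assumes the Haiku classifier has already produced the two labels.

```typescript
type Complexity = "simple" | "medium" | "complex";
type ContextDependency = "low" | "high";
type Model = "haiku" | "sonnet" | "opus";

// Maps the classifier's two labels to a model, row by row from the matrix.
function pickModel(complexity: Complexity, context: ContextDependency): Model {
  if (complexity === "complex") return "opus"; // Complex + any context
  if (complexity === "simple") return context === "low" ? "haiku" : "sonnet";
  return context === "low" ? "sonnet" : "opus"; // Medium
}
```

Note the pattern: high context dependency bumps a task up exactly one tier, and complex tasks go to Opus regardless.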
Escalation
If a cheaper model fails, the task is retried once at the next tier (Haiku → Sonnet, Sonnet → Opus), never at the same tier.
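
The retry-once rule might look like the sketch below. The names `NEXT_TIER` and `runWithEscalation` are hypothetical, the runner is synchronous for brevity (a real sub-agent call would be asynchronous), and letting an Opus failure propagate is an assumption, since there is no higher tier to escalate to.

```typescript
type Model = "haiku" | "sonnet" | "opus";

// Next tier up for each model; Opus has nowhere to escalate.
const NEXT_TIER: Record<Model, Model | null> = {
  haiku: "sonnet",
  sonnet: "opus",
  opus: null,
};

// Runs a task on `model`; on failure, retries exactly once at the next tier.
function runWithEscalation<T>(model: Model, run: (m: Model) => T): T {
  try {
    return run(model);
  } catch (err) {
    const next = NEXT_TIER[model];
    if (next === null) throw err; // assumption: Opus failures are not retried
    return run(next); // a second failure propagates to the caller
  }
}
```

Because the retry never reuses the failing model, a task costs at most two attempts before the error surfaces.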

Everything automatic. Nothing to configure.

Automatic Token Tracking

Every API interaction is logged — main thread and sub-agents. No manual calls needed. A stop hook processes transcripts after every response.

Intelligent Routing

Two-step classification routes tasks by agent type, complexity, and context dependency. Hard rules catch obvious cases before the classifier runs.

Session & Lifetime Reports

Colorized terminal banners show per-session and all-time cost breakdowns with per-model usage and savings percentages.

Transparent Pricing

The get_routing_config tool exposes the full routing matrix and model pricing. No black boxes.

Smart Escalation

Failed sub-agents are retried once at the next model tier, never the same one: Haiku → Sonnet, or Sonnet → Opus. A cheap first attempt does not lock a task into a weak result.

Local-First Privacy

All data stays in a local SQLite database. Nothing leaves your machine except standard Anthropic API calls.
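
As a rough illustration of the automatic token tracking described above, the sketch below tallies tokens per model from a JSONL transcript. Everything here is an assumption: the entry shape (`message.model`, `message.usage`) and the `tallyUsage` name are hypothetical, the field names `input_tokens` and `output_tokens` follow the Anthropic API's usage object, and the plugin's real hook may read transcripts differently.

```typescript
// Assumed shape of one transcript line; the real format may differ.
interface TranscriptEntry {
  message?: {
    model?: string;
    usage?: { input_tokens?: number; output_tokens?: number };
  };
}

interface Tally {
  inputTokens: number;
  outputTokens: number;
}

// Sums input and output tokens per model across all parseable lines.
function tallyUsage(jsonl: string): Map<string, Tally> {
  const totals = new Map<string, Tally>();
  for (const line of jsonl.split("\n")) {
    if (line.trim() === "") continue;
    let entry: TranscriptEntry | null;
    try {
      entry = JSON.parse(line) as TranscriptEntry | null;
    } catch {
      continue; // skip lines that are not valid JSON
    }
    const model = entry?.message?.model;
    const usage = entry?.message?.usage;
    if (!model || !usage) continue; // not a model response
    const tally = totals.get(model) ?? { inputTokens: 0, outputTokens: 0 };
    tally.inputTokens += usage.input_tokens ?? 0;
    tally.outputTokens += usage.output_tokens ?? 0;
    totals.set(model, tally);
  }
  return totals;
}
```

Per-model totals like these are what a cost report needs: multiply each tally by that model's price and compare against the same tokens priced at Opus rates.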

The numbers speak for themselves.

Real-world results from heavy Claude Code usage with agent-based workflows.

  • 38.8% average cost reduction vs. all-Opus baseline
  • 151 interactions tracked
  • 330M+ tokens processed
  • $556 actual cost
  • $350 saved vs. Opus baseline
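
For readers who want to check the arithmetic, here is a short sketch of how a savings percentage falls out of the two dollar figures. The `savingsPercent` helper is hypothetical, and it assumes the all-Opus baseline equals actual cost plus the amount saved.

```typescript
// Savings as a percentage of the baseline (actual cost + amount saved).
function savingsPercent(actualCost: number, savedVsBaseline: number): number {
  const baseline = actualCost + savedVsBaseline;
  if (baseline <= 0) throw new RangeError("baseline cost must be positive");
  return (savedVsBaseline / baseline) * 100;
}

// $350 saved on a $906 baseline ($556 + $350) is roughly 38.6%.
const aggregateSavings = savingsPercent(556, 350);
```

The rounded dollar amounts give about 38.6%, close to the 38.8% headline. The headline is described as an average, so the small gap is plausibly due to rounding or per-session averaging; that explanation is an assumption, not something the page states.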

Install in 60 seconds.

Quick Start

# Clone the repo
$ git clone https://github.com/jdogresorg/claude-model-router.git

# Install dependencies
$ cd claude-model-router && npm install

# Add to Claude Code (run from inside the cloned repo)
$ claude plugin add .

That's it. Token tracking starts automatically on your next session.

Better Together with claude-mem

When paired with claude-mem, the two plugins attack costs from both sides:

  • claude-mem reduces input tokens by recalling past discoveries
  • model-router lowers the price paid per token by routing to cheaper models
  • Combined savings are tracked and reported together in every session

Requirements

  • Claude Code CLI
  • Node.js 18+
  • ANTHROPIC_API_KEY environment variable