Stop Paying Opus Prices
for Haiku Tasks

claude-model-router automatically routes Claude Code sub-tasks to the cheapest capable model — saving 35–50% on token costs without sacrificing quality.

$ claude plugin add /path/to/claude-model-router

View on GitHub Read the Docs

The Problem

Every Claude Code session burns tokens.
Most of them are wasted.

When Claude Code spawns sub-agents for file exploration, code search, or formatting, it defaults to the most capable model. That's like hiring an architect to measure a room.

~60%

of sub-agent tasks are simple enough for Haiku

19x

price difference between Opus and Haiku output tokens

cost to find out — model-router is free and local

How It Works

Two-step routing. Zero configuration.

Install the plugin and everything is automatic. Token tracking starts immediately, routing instructions are injected into your CLAUDE.md, and savings are reported every session.

Agent Type Rules (hard-coded)

Known-simple agent types are routed instantly:
Explore, code-guide, statusline → Haiku Plan → Sonnet General-purpose → Step 2

Complexity Classification

For general-purpose agents, a fast Haiku call classifies the task by complexity and context dependency.

Complexity	Context	Model
Simple	Low	Haiku
Simple	High	Sonnet
Medium	Low	Sonnet
Medium	High	Opus
Complex	Any	Opus

Escalation

If a cheaper model fails, retry once at the next tier (haiku → sonnet, sonnet → opus). Never at the same level.

Features

Everything automatic. Nothing to configure.

◈

Automatic Token Tracking

Every API interaction is logged — main thread and sub-agents. No manual calls needed. A stop hook processes transcripts after every response.

◈

Intelligent Routing

Two-step classification routes tasks by agent type, complexity, and context dependency. Hard rules catch obvious cases before the classifier runs.

◈

Session & Lifetime Reports

Colorized terminal banners show per-session and all-time cost breakdowns with per-model usage and savings percentages.

◈

Transparent Pricing

The get_routing_config tool exposes the full routing matrix and model pricing. No black boxes.

◈

Smart Escalation

Failed sub-agents retry at the next model tier, never the same one. Haiku → Sonnet → Opus. Quality is never sacrificed.

◈

Local-First Privacy

All data stays in a local SQLite database. Nothing leaves your machine except standard Anthropic API calls.

Real Savings

The numbers speak for themselves.

Real-world results from heavy Claude Code usage with agent-based workflows.

38.8%

average cost reduction vs. all-Opus baseline

151

interactions tracked

330M+

tokens processed

$556

actual cost

$350

saved vs. Opus baseline

Get Started

Install in 60 seconds.

Quick Start

            # Clone the repo

            $ git clone https://github.com/jdogresorg/claude-model-router.git

            # Install dependencies

            $ cd claude-model-router && npm install

            # Add to Claude Code

            $ claude plugin add ./claude-model-router

That's it. Token tracking starts automatically on your next session.

Better Together with claude-mem

When paired with claude-mem, the two plugins attack costs from both sides:

claude-mem reduces input tokens by recalling past discoveries
model-router reduces output costs by routing to cheaper models
Combined savings tracked and reported together in every session

Requirements

Claude Code CLI
Node.js 18+
ANTHROPIC_API_KEY environment variable

Stop Paying Opus Pricesfor Haiku Tasks

Every Claude Code session burns tokens.Most of them are wasted.