Day 45

Cost Optimization Stack

Day 44 Day 46

80%

cheaper. same output.

Cost Optimization Stack

3 changes that cut your Claude Code bill. Caching, batching, model routing, prompt compression.

75%

fewer tokens with shorthand

30x

cheaper: CLI vs MCP

98%

context cut via pruning

The optimization stack

3 changes. One compound effect.

Each strategy stacks. Apply all three and you hit 80% savings.

Caveman Tokenization

75% fewer tokens per reasoning step

Shorthand reasoning format that compresses chain-of-thought into dense, token-efficient notation. Same quality output, fraction of the cost. The model understands it perfectly.

github.com/JuliusBrussee/caveman — HN 653 pts, Reddit 10.1K upvotes

CLI Over MCP

30x cheaper per operation

Every MCP server loads its full schema into context on every call. A CLI tool loads nothing — just runs and returns. Replace MCP tools with lightweight CLI wrappers wherever possible.

Google gwscli, FuturMinds benchmark

Context Pruning

Remove 98% of wasted context

Most sessions carry dead context — old file contents, stale tool schemas, irrelevant history. Aggressively prune what the model sees. Only load what it needs for the current task.

mksg.lu context-mode MCP

Before vs after

What the numbers look like

Same task. Optimized vs unoptimized session.

Before

12 MCP schemas loaded

verbose reasoning mode

100% token usage

full context every call

After

3 CLIs (zero schema overhead)

shorthand reasoning mode

20% token usage

pruned context per task

Savings breakdown

By method

*Estimates based on community benchmarks and documented results.

Method	Token Saving	Impact	Effort
Caveman Tokenization	–75%	High	Low
CLI over MCP	–97%	High	Medium
Context Pruning	–98%	High	Low
Cache Token Drain Fix	–90%*	Medium	Low
Built-in API Credits	$20–$100/mo	Free money	Zero

Bonus

Two quick wins most people miss

No setup required.

Bug fix

Cache Token Drain Fix

Fixes a known issue causing 10–20x token drain from broken cache invalidation.

github.com/Rangizingo/cc-cache-fix

Free credits

Built-in API Credits

Pro plan includes $20/mo. Max plan includes $100/mo (5x). Check Settings > Usage.

Most people don't know these exist.

Copy-ready config

Add this to your CLAUDE.md

Drop it in and start saving on the next session.

# CLAUDE.md — Token optimization
output-style: default

# Caveman tokenization (75% fewer reasoning tokens)
Use shorthand reasoning: abbrev words, drop filler, dense notation.
Think: step→, result=, b/c=, w/=, cfg=, init=, upd=

# CLI over MCP
Prefer CLI tools over MCP schemas wherever possible.
MCP: load full schema per call. CLI: zero schema overhead.

# Context pruning
Only load files needed for current task.
Prune stale tool output, old file contents, irrelevant history.

Day 44: Boris Cherny's 14 Hidden Features

Day 46: Every Claude Code Command