Day 45

Cost Optimization Stack

80% cheaper. same output.

3 changes that cut your Claude Code bill: shorthand reasoning, CLI tools instead of MCP, and context pruning.

75% fewer tokens with shorthand
30x cheaper: CLI vs MCP
98% context cut via pruning

3 changes. One compound effect.

Each strategy stacks. Apply all three and you hit 80% savings.

Caveman Tokenization
75% fewer tokens per reasoning step

Shorthand reasoning format that compresses chain-of-thought into dense, token-efficient notation. Same quality output, fraction of the cost. The model understands it perfectly.
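To make the idea concrete, here is a minimal sketch of what "shorthand" does to a sentence. It is not the caveman repo's implementation: the abbreviation table and filler list are hypothetical (seeded from the shorthand in the CLAUDE.md block further down), and word count is only a crude stand-in for real token counts.

```python
import re

# Hypothetical abbreviation table (assumption), based on the b/c, cfg, upd
# shorthand shown in the CLAUDE.md snippet.
ABBREVIATIONS = {
    "because": "b/c",
    "with": "w/",
    "configuration": "cfg",
    "initialize": "init",
    "update": "upd",
}

# Hypothetical filler words (assumption) that are dropped outright.
FILLER = {"the", "a", "an", "we", "need", "to", "is", "really", "just", "basically"}


def to_shorthand(text: str) -> str:
    """Lowercase the text, drop filler words, and abbreviate known words."""
    words = re.findall(r"[\w/']+", text.lower())
    kept = [ABBREVIATIONS.get(word, word) for word in words if word not in FILLER]
    return " ".join(kept)


def word_reduction(before: str, after: str) -> float:
    """Fraction of whitespace-separated words removed (a rough proxy for tokens)."""
    n_before = len(before.split())
    if n_before == 0:
        return 0.0
    return 1 - len(after.split()) / n_before


verbose = "We need to update the configuration because the server is really slow"
dense = to_shorthand(verbose)
print(dense)
print(f"{word_reduction(verbose, dense):.0%} fewer words")
```

In practice the compression comes from instructing the model to reason in this style, not from post-processing its output; the sketch only shows why dense notation carries the same information in fewer words.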

github.com/JuliusBrussee/caveman (HN 653 pts, Reddit 10.1K upvotes)

CLI Over MCP
30x cheaper per operation

Every MCP server loads its full tool schema into context, and that schema rides along on every call whether or not the tool is used. A CLI tool adds no schema: it just runs and returns its output. Replace MCP tools with lightweight CLI wrappers wherever possible.
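As a rough illustration of a "lightweight CLI wrapper", the sketch below prints only the requested fields of a JSON file in compact form, so the model reads a short line of output instead of carrying a tool schema. The tool name `pick` and its arguments are hypothetical, not part of any existing tool.

```python
import argparse
import json
import sys
from typing import Optional, Sequence


def pick_fields(record: dict, fields: Sequence[str]) -> dict:
    """Keep only the requested top-level keys so the output stays small."""
    return {key: record[key] for key in fields if key in record}


def main(argv: Optional[Sequence[str]] = None) -> int:
    parser = argparse.ArgumentParser(
        prog="pick",  # hypothetical tool name (assumption)
        description="Print selected top-level fields of a JSON file.",
    )
    parser.add_argument("path", help="JSON file containing a single object")
    parser.add_argument("fields", nargs="+", help="top-level keys to keep")
    args = parser.parse_args(argv)

    with open(args.path, encoding="utf-8") as handle:
        record = json.load(handle)

    # Compact separators keep the text the model has to read as short as possible.
    json.dump(pick_fields(record, args.fields), sys.stdout, separators=(",", ":"))
    sys.stdout.write("\n")
    return 0
```

To run it as a script, add the usual `if __name__ == "__main__": raise SystemExit(main())` guard and call it as, for example, `python pick.py issue.json id title`. The design choice is that the wrapper decides what is worth returning, so nothing beyond the command line and its output enters the context.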

Google gwscli, FuturMinds benchmark

Context Pruning
Remove 98% of wasted context

Most sessions carry dead context — old file contents, stale tool schemas, irrelevant history. Aggressively prune what the model sees. Only load what it needs for the current task.
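One simple way to picture pruning is a token budget: keep what the task cannot do without, then fill the rest with the most recent messages and drop everything older. The sketch below is an assumption-laden illustration, not how context-mode or Claude Code works internally; the `Message` type, the `pinned` flag, and the 4-characters-per-token estimate are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str              # e.g. "user", "assistant", "tool"
    content: str
    pinned: bool = False   # hypothetical flag: always keep (e.g. the task statement)


def estimate_tokens(text: str) -> int:
    """Rough heuristic (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)


def prune_context(messages: list[Message], budget: int) -> list[Message]:
    """Keep pinned messages, then the newest others that still fit in `budget`."""
    keep = {i for i, message in enumerate(messages) if message.pinned}
    used = sum(estimate_tokens(messages[i].content) for i in keep)

    # Walk from newest to oldest so stale history is what gets dropped.
    for i in range(len(messages) - 1, -1, -1):
        if i in keep:
            continue
        cost = estimate_tokens(messages[i].content)
        if used + cost > budget:
            break
        keep.add(i)
        used += cost

    # Preserve the original order of whatever survived.
    return [message for i, message in enumerate(messages) if i in keep]


history = [
    Message("user", "Task: fix login bug", pinned=True),
    Message("tool", "x" * 400),        # old file dump, about 100 tokens
    Message("assistant", "y" * 40),
    Message("user", "z" * 40),
]
pruned = prune_context(history, budget=30)
print([message.role for message in pruned])
```

Here the old 400-character tool output is the only thing dropped: the pinned task statement and the two recent messages fit inside the budget, the stale dump does not.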

mksg.lu context-mode MCP

What the numbers look like

Same task. Optimized vs unoptimized session.

Before
12 MCP schemas loaded
verbose reasoning mode
100% token usage
full context every call

After
3 CLIs (zero schema overhead)
shorthand reasoning mode
20% token usage
pruned context per task

By method

| Method | Token saving | Impact | Effort |
|---|---|---|---|
| Caveman Tokenization | –75% | High | Low |
| CLI over MCP | –97% | High | Medium |
| Context Pruning | –98% | High | Low |
| Cache Token Drain Fix | –90%* | Medium | Low |
| Built-in API Credits | $20–$100/mo | Free money | Zero |

*Estimates based on community benchmarks and documented results.
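The per-method figures apply to different parts of a session, so they do not simply add up. The sketch below shows one way to combine them: weight each saving by the share of tokens it affects. The savings come from the By method figures; the split of a session into shares is a made-up assumption for illustration, so measure your own sessions before trusting any total.

```python
def saving_from_factor(times_cheaper: float) -> float:
    """Convert 'N times cheaper' into a fractional saving: 1 - 1/N."""
    return 1 - 1 / times_cheaper


def overall_saving(shares: dict[str, float], savings: dict[str, float]) -> float:
    """Weight each method's saving by the share of total tokens it applies to."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    remaining = sum(
        share * (1 - savings.get(name, 0.0)) for name, share in shares.items()
    )
    return 1 - remaining


# Per-method savings as listed under "By method".
savings = {"reasoning": 0.75, "tool_schemas": 0.97, "context": 0.98}

# Hypothetical split of an unoptimized session's tokens (assumption).
# "other" is whatever none of the three methods touches.
shares = {"reasoning": 0.35, "tool_schemas": 0.25, "context": 0.30, "other": 0.10}

print(f"30x cheaper  = {saving_from_factor(30):.1%} saving")
print(f"overall      = {overall_saving(shares, savings):.1%} saving")
```

The first line shows where the CLI over MCP row comes from: 30x cheaper is 1 - 1/30, roughly 97%. With these illustrative shares the combined saving lands just under 80%, because the untouched "other" share sets a floor that no single method can remove.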

Two quick wins most people miss

No setup required.

Bug fix
Cache Token Drain Fix

Fixes a known issue causing 10–20x token drain from broken cache invalidation.

github.com/Rangizingo/cc-cache-fix

Free credits
Built-in API Credits

Pro plan includes $20/mo. Max plan includes $100/mo (5x). Check Settings > Usage.

Most people don't know these exist.

Add this to your CLAUDE.md

Drop it in and start saving on the next session.

# CLAUDE.md — Token optimization
output-style: default

# Caveman tokenization (75% fewer reasoning tokens)
Use shorthand reasoning: abbrev words, drop filler, dense notation.
Think: step→, result=, b/c=, w/=, cfg=, init=, upd=

# CLI over MCP
Prefer CLI tools over MCP schemas wherever possible.
MCP: load full schema per call. CLI: zero schema overhead.

# Context pruning
Only load files needed for current task.
Prune stale tool output, old file contents, irrelevant history.

Day 45 of 30 · AY Automate · Claude Code Series