Book a Free Strategy Call

Skip the read — talk to Walid in 30 min.

Free strategy call. We map your AI engineering team, you keep the notes.

Sakana Fugu Pricing Explained: What It Costs and How Billing Works (2026)

Want transparent, self-hosted cost control? Open orchestrators show you the per-model receipts. Compare options in Sakana Fugu alternatives and the best open-source LLM orchestration tools.

The honest answer up front: as of its June 22, 2026 launch, Sakana has not published per-token list prices for Fugu, so any "sakana fugu pricing" chart you see with exact dollar figures is guesswork. What Sakana has confirmed is the billing model: Fugu is sold through subscription plans for daily use plus usage-based billing for heavier workloads, and every request returns its own token usage and cost so you can monitor real-time spend. This post explains how Fugu billing actually works, why orchestration pricing is genuinely harder to predict than a single model's, how Fugu's sakana fugu cost conceptually compares to a single frontier LLM, and how to estimate and cap your spend before the meter runs.

TL;DR

No public per-token prices yet. Sakana has not released specific fugu api pricing numbers as of launch. Treat any exact figure you find online as unverified.
Two billing tracks. Subscription plans for everyday use, and usage-based billing for bigger workloads.
Per-request cost reporting. Every Fugu request reports its own token usage and cost, so you get real-time spend visibility instead of a monthly surprise.
Orchestration makes pricing fuzzy by design. One Fugu request can fan out to several underlying frontier models (Thinker/Worker/Verifier roles) plus verification passes — so one "query" is often many model calls.
Fugu Ultra costs more than Fugu for the same task, because it runs a fixed, higher-quality pool with no opt-out and harder multi-step reasoning. Exact fugu ultra pricing is not public.
You pay for routing and verification, not just tokens. That's the tradeoff versus calling one model yourself.
Access is OpenAI-compatible — grab a key from console.sakana.ai and point your existing OpenAI client at it.

How Sakana Fugu Billing Works

Sakana Fugu launched on June 22, 2026 as a multi-agent orchestration model. Instead of being one large model, Fugu exposes a single OpenAI-compatible API and, behind that endpoint, routes your task across a pool of other frontier LLMs — selecting which models to use, delegating sub-tasks, verifying intermediate results, and synthesizing the final answer internally. If you want the full architecture, start with What is Sakana Fugu.

For billing, Sakana has confirmed two tracks and one reporting mechanism:

Subscription plans for daily use. Aimed at steady, everyday workloads where you want predictable monthly cost rather than metering each call.
Usage-based billing for bigger workloads. For spiky or high-volume jobs, you pay for what you consume rather than committing to a flat plan.
Per-request cost reporting. This is the part that matters most for spend control. Every Fugu request returns its own token usage and an associated cost figure, so you can watch real-time spend at the granularity of individual calls instead of reconciling a single end-of-month invoice.

What Sakana has not published is the actual numbers — there is no official sakana fugu price sheet with a cost per token, no posted fugu billing rate card, and no confirmed free-tier figure as of launch. So this guide deliberately avoids quoting dollar amounts for Fugu. Anyone who hands you a precise fugu cost per token today is inventing it.

Access is straightforward: generate a key at console.sakana.ai, and because the API is OpenAI-compatible, your existing OpenAI client libraries work with only a base-URL and key change. That low switching cost is part of the pitch — Fugu is positioned as a hedge against vendor lock-in, especially after the June 12, 2026 export controls pulled Anthropic's Fable 5 and Mythos from some markets.

Why Orchestration Pricing Is Harder to Predict Than a Single Model

With a normal single-model API, cost is mechanical: count input tokens, count output tokens, multiply by the published rates, done. You can estimate a request's cost before you send it.

Fugu breaks that mental model in a few specific ways.

One request can fan out to many model calls. A single Fugu request may invoke multiple underlying models in different roles — a Thinker to plan, Workers to execute sub-tasks, and a Verifier to check the result — and it may run additional verification passes. So a single "query" on your side can translate into several frontier-model calls behind the endpoint. You are not paying for one model's tokens; you are paying for the aggregate work of an internal swarm.

Routing is proprietary and hidden. Fugu decides which models to use and how many calls to make based on its own internal logic, and that routing is not exposed per query. You cannot see, before running a task, which models it will pick or how many passes it will take. Two requests that look similar to you can do very different amounts of internal work.

Fugu Ultra's pool is fixed with no opt-out. With Fugu Ultra (model ID fugu-ultra-20260615), you don't get to trim the pool to save money — the higher-quality model set is fixed. That guarantees quality on hard, multi-step tasks, but it also means you can't dial cost down by excluding the expensive members.

The practical consequence: you should think in terms of cost per completed task, observed from the per-request reporting, not a clean cost-per-token you can calculate up front. Estimate empirically by running representative tasks and reading back the reported cost, rather than trying to derive it from a rate card.

Fugu vs Fugu Ultra on Cost

Sakana ships two variants, and the cost difference between them is qualitative (no public numbers), but the direction is clear:

Fugu — balanced and low-latency. The default for everyday work: drafting, summarization, routine reasoning, customer-facing responses where good-enough-fast beats maximal-and-slow. Lower expected cost per task because it leans toward fewer, faster calls.
Fugu Ultra (fugu-ultra-20260615) — maximum quality for hard, multi-step problems. It runs a fixed pool with no opt-out and is built to grind through difficult reasoning with more verification. Expect higher cost per task, because "max quality" means more underlying model calls and more verification passes per request.

Rule of thumb: default to Fugu, escalate to Ultra only when a task genuinely needs it. Reaching for Ultra on simple work is the fastest way to inflate your bill with no quality benefit you'd actually notice.

How an Orchestrator's Cost Compares to a Single Frontier Model

To reason about whether Fugu is "expensive," it helps to anchor against what a single frontier model lists for. The table below shows Anthropic's published per-token list prices for context — these are Anthropic's prices, not Fugu's, and Fugu has no published equivalent. Use them only to understand the comparison, not as a stand-in for fugu api pricing.

Model (Anthropic — for context only)	~Input / 1M tokens	~Output / 1M tokens
Fable 5	~$10	~$50
Opus 4.8	~$5	~$25
Sonnet 4.6	~$3	~$15

These are approximate Anthropic list prices, shown only to frame the comparison. They are NOT Sakana Fugu prices. For the full breakdown, see the Claude Fable 5 pricing breakdown.

Here's the conceptual point. When you call a single model like Fable 5, you pay one model's input and output rate for one pass. When you send the same task to an orchestrator, that one task may trigger several underlying model calls plus verification. So on a per-task basis, an orchestrator can plausibly cost more than a single model — because it's doing more work: selecting models, delegating, checking its own output, and synthesizing.

The tradeoff is what that extra spend buys you. With a single model you get one model's answer and one model's failure modes. With Fugu you get routing (the right model for the sub-task), verification (a second model checking the first), and resilience (if one model is weak or unavailable, the pool absorbs it). For tasks where a wrong answer is expensive — and especially as a hedge against losing access to any single vendor — paying more per task for verification and routing can be the cheaper option once you price in the cost of being wrong or being locked in. For a deeper head-to-head, see Sakana Fugu vs Fable 5.

How to Estimate and Control Your Fugu Spend

Because you can't compute Fugu cost from a rate card, control it operationally:

Lean on per-request cost reporting. This is your single most useful tool. Log the reported token usage and cost on every call, tag it by task type, and build an empirical cost-per-task table from real traffic. That table — not a published rate — is your budget model.
Default to Fugu, not Ultra. Start every workload on the balanced variant. Only promote specific task types to Fugu Ultra after you've confirmed Fugu's quality is insufficient for them. Don't pay Ultra rates by habit.
Cap usage. On usage-based billing, set hard limits and alerts so a runaway loop or a bad batch job can't silently rack up cost. Treat orchestration like any metered cloud resource: budget, alert, and kill-switch.
Pick the right billing track for your shape. Steady daily volume usually fits a subscription plan better; spiky or one-off heavy jobs usually fit usage-based billing. Run both for a billing cycle if you're unsure and compare the reported totals.
Monitor continuously, not monthly. The whole value of per-request reporting is early warning. Watch cost-per-task trend lines so you catch a routing change or a prompt regression that doubles internal calls — before it shows up as a bill.
Right-size the task to the model. Send trivial work to a cheaper single model directly and reserve Fugu for tasks that actually benefit from orchestration and verification. Not everything needs a swarm.

Bottom Line

Sakana Fugu's pricing is, for now, a billing model rather than a price list. There's no public sakana fugu cost per token, no posted fugu ultra pricing, and no confirmed free tier — what exists is two billing tracks (subscription and usage-based) and per-request cost reporting that gives you real, call-level spend visibility. Orchestration is inherently harder to predict than a single model because one request can fan out into many internal model calls and verification passes, and the routing is hidden. The right move isn't to chase a number that hasn't been published — it's to instrument the per-request reporting, default to Fugu over Ultra, cap usage, and budget by observed cost-per-task. You may pay more per task than a single model, but what you're buying is routing, verification, and a hedge against vendor lock-in.

FAQ

How much does Sakana Fugu cost? Sakana has not published specific per-token prices as of the June 22, 2026 launch. What's confirmed is the billing model: subscription plans for daily use, usage-based billing for bigger workloads, and per-request cost reporting so you can monitor real-time spend. Estimate your actual sakana fugu cost empirically from the reported per-request figures.

Is there a free tier? Not specified at launch. Sakana has not confirmed a free tier or trial figure. Check console.sakana.ai for the current plan options rather than relying on third-party numbers.

Is Fugu cheaper than Fable 5? There's no published number to compare directly. Conceptually, an orchestrator like Fugu may cost more per task than a single model such as Fable 5, because one Fugu request can trigger several underlying model calls plus verification. Whether that's "cheaper" depends on whether you value the routing and verification it adds — and on the cost of a wrong answer.

Why is Fugu pricing hard to predict? Because one request fans out. A single Fugu query can invoke multiple frontier models in Thinker/Worker/Verifier roles plus verification passes, and the routing is proprietary and hidden. You can't see which models it picks per query, so there's no clean cost-per-token you can calculate before running the task.

What's the difference in cost between Fugu and Fugu Ultra? No public figures, but the direction is clear: Fugu Ultra (fugu-ultra-20260615) targets maximum quality on hard, multi-step tasks using a fixed pool with no opt-out, so it does more internal work and costs more per task. Fugu is balanced and low-latency, so it's the cheaper default.

How is Fugu billed — and can I see costs in real time? Yes. Token usage and cost are reported per request, so you get call-level, real-time spend monitoring. That per-request reporting is the foundation of any sensible fugu billing budget, since you build your cost model from observed data rather than a rate card.

How do I access Fugu, and does it need a new SDK? No new SDK. Fugu exposes a single OpenAI-compatible API — get a key at console.sakana.ai, point your existing OpenAI client at it, and you're running. That low switching cost is part of its positioning as a vendor lock-in hedge.

Need Predictable Multi-Model Costs?

At AY Automate we build production multi-model systems with the cost controls, routing, and failover that keep spend predictable even when you're orchestrating several frontier models per task. If you want orchestration without billing surprises, we can design and instrument it for you. Learn more about our AI agent development work.

Sources

Book a Free Strategy Call

Building this in production?

Walid runs a 30-min call to map your AI engineering team. Free, no slides.

Or send us a brief →

Share this article

#LLM#Sakana Fugu#Sakana AI#AI Orchestration#AI Pricing

About the Author

Adel Dahani

COO | Ex IBM

Adel keeps the engine running at AY Automate. He owns internal processes, team coordination, and the operational excellence that lets us ship fast for clients.

Book a Free Strategy Call

Skip the read — talk to Walid in 30 min.

Free strategy call. We map your AI engineering team, you keep the notes.

Or send us a brief →

Sakana Fugu Pricing Explained: What It Costs and How Billing Works (2026)

Want transparent, self-hosted cost control? Open orchestrators show you the per-model receipts. Compare options in Sakana Fugu alternatives and the best open-source LLM orchestration tools.

TL;DR

No public per-token prices yet. Sakana has not released specific fugu api pricing numbers as of launch. Treat any exact figure you find online as unverified.
Two billing tracks. Subscription plans for everyday use, and usage-based billing for bigger workloads.
Per-request cost reporting. Every Fugu request reports its own token usage and cost, so you get real-time spend visibility instead of a monthly surprise.
Orchestration makes pricing fuzzy by design. One Fugu request can fan out to several underlying frontier models (Thinker/Worker/Verifier roles) plus verification passes — so one "query" is often many model calls.
Fugu Ultra costs more than Fugu for the same task, because it runs a fixed, higher-quality pool with no opt-out and harder multi-step reasoning. Exact fugu ultra pricing is not public.
You pay for routing and verification, not just tokens. That's the tradeoff versus calling one model yourself.
Access is OpenAI-compatible — grab a key from console.sakana.ai and point your existing OpenAI client at it.

How Sakana Fugu Billing Works

For billing, Sakana has confirmed two tracks and one reporting mechanism:

Subscription plans for daily use. Aimed at steady, everyday workloads where you want predictable monthly cost rather than metering each call.
Usage-based billing for bigger workloads. For spiky or high-volume jobs, you pay for what you consume rather than committing to a flat plan.
Per-request cost reporting. This is the part that matters most for spend control. Every Fugu request returns its own token usage and an associated cost figure, so you can watch real-time spend at the granularity of individual calls instead of reconciling a single end-of-month invoice.

Why Orchestration Pricing Is Harder to Predict Than a Single Model

With a normal single-model API, cost is mechanical: count input tokens, count output tokens, multiply by the published rates, done. You can estimate a request's cost before you send it.

Fugu breaks that mental model in a few specific ways.

Fugu vs Fugu Ultra on Cost

Sakana ships two variants, and the cost difference between them is qualitative (no public numbers), but the direction is clear:

Fugu — balanced and low-latency. The default for everyday work: drafting, summarization, routine reasoning, customer-facing responses where good-enough-fast beats maximal-and-slow. Lower expected cost per task because it leans toward fewer, faster calls.
Fugu Ultra (fugu-ultra-20260615) — maximum quality for hard, multi-step problems. It runs a fixed pool with no opt-out and is built to grind through difficult reasoning with more verification. Expect higher cost per task, because "max quality" means more underlying model calls and more verification passes per request.

How an Orchestrator's Cost Compares to a Single Frontier Model

Model (Anthropic — for context only)	~Input / 1M tokens	~Output / 1M tokens
Fable 5	~$10	~$50
Opus 4.8	~$5	~$25
Sonnet 4.6	~$3	~$15

These are approximate Anthropic list prices, shown only to frame the comparison. They are NOT Sakana Fugu prices. For the full breakdown, see the Claude Fable 5 pricing breakdown.

How to Estimate and Control Your Fugu Spend

Because you can't compute Fugu cost from a rate card, control it operationally:

Lean on per-request cost reporting. This is your single most useful tool. Log the reported token usage and cost on every call, tag it by task type, and build an empirical cost-per-task table from real traffic. That table — not a published rate — is your budget model.
Default to Fugu, not Ultra. Start every workload on the balanced variant. Only promote specific task types to Fugu Ultra after you've confirmed Fugu's quality is insufficient for them. Don't pay Ultra rates by habit.
Cap usage. On usage-based billing, set hard limits and alerts so a runaway loop or a bad batch job can't silently rack up cost. Treat orchestration like any metered cloud resource: budget, alert, and kill-switch.
Pick the right billing track for your shape. Steady daily volume usually fits a subscription plan better; spiky or one-off heavy jobs usually fit usage-based billing. Run both for a billing cycle if you're unsure and compare the reported totals.
Monitor continuously, not monthly. The whole value of per-request reporting is early warning. Watch cost-per-task trend lines so you catch a routing change or a prompt regression that doubles internal calls — before it shows up as a bill.
Right-size the task to the model. Send trivial work to a cheaper single model directly and reserve Fugu for tasks that actually benefit from orchestration and verification. Not everything needs a swarm.

Bottom Line

FAQ

Need Predictable Multi-Model Costs?

Sources

Book a Free Strategy Call

Building this in production?

Walid runs a 30-min call to map your AI engineering team. Free, no slides.

Or send us a brief →

Share this article

#LLM#Sakana Fugu#Sakana AI#AI Orchestration#AI Pricing

About the Author

Adel Dahani

COO | Ex IBM

Adel keeps the engine running at AY Automate. He owns internal processes, team coordination, and the operational excellence that lets us ship fast for clients.

Sakana Fugu Pricing Explained: What It Costs and How Billing Works (2026)

Skip the read — talk to Walid in 30 min.

Sakana Fugu Pricing Explained: What It Costs and How Billing Works (2026)

TL;DR

How Sakana Fugu Billing Works

Why Orchestration Pricing Is Harder to Predict Than a Single Model

Fugu vs Fugu Ultra on Cost

How an Orchestrator's Cost Compares to a Single Frontier Model

How to Estimate and Control Your Fugu Spend

Bottom Line

FAQ

Need Predictable Multi-Model Costs?

Sources

Building this in production?

Sakana Fugu Alternatives: Best Open-Source & Self-Hosted Options (2026)

Maestro vs Sakana Fugu: Open-Source vs Closed LLM Orchestration (2026)

7 Best Open-Source LLM Orchestration & Routing Tools (2026)

Sakana Fugu Pricing Explained: What It Costs and How Billing Works (2026)

Skip the read — talk to Walid in 30 min.

Sakana Fugu Pricing Explained: What It Costs and How Billing Works (2026)

TL;DR

How Sakana Fugu Billing Works

Why Orchestration Pricing Is Harder to Predict Than a Single Model

Fugu vs Fugu Ultra on Cost

How an Orchestrator's Cost Compares to a Single Frontier Model

How to Estimate and Control Your Fugu Spend

Bottom Line

FAQ

Need Predictable Multi-Model Costs?

Sources

Building this in production?

Sakana Fugu Alternatives: Best Open-Source & Self-Hosted Options (2026)

Maestro vs Sakana Fugu: Open-Source vs Closed LLM Orchestration (2026)

7 Best Open-Source LLM Orchestration & Routing Tools (2026)