12 June 2026 / 8 min read

Claude Fable 5 API Tutorial: Python, TypeScript, and Streaming Examples (2026)

This is the practical tutorial for calling Claude Fable 5 from your own code. We cover the minimal "hello world" in Python and TypeScript, streaming for long generations, prompt caching for cost control, tool use, and the three gotchas that catch most first-time users.

Author: Adel Dahani, COO | Ex IBM

If you're just getting set up, see our day-zero access guide for installation across Claude.ai, Claude Code, and the desktop app.


Prerequisites

  • An Anthropic API key — generate one at console.anthropic.com
  • Python 3.10+ or Node.js 18+
  • The official SDK:
    • Python: pip install anthropic
    • Node: npm install @anthropic-ai/sdk

Set your API key in your shell:

export ANTHROPIC_API_KEY="sk-ant-..."

Minimal Python Example

import anthropic

client = anthropic.Anthropic()  # picks up ANTHROPIC_API_KEY from env

message = client.messages.create(
    model="claude-fable-5",
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": "Refactor this 200-line module into 4 testable units. Return a single diff."
        }
    ]
)

print(message.content[0].text)
print(f"\nUsage: {message.usage.input_tokens} in / {message.usage.output_tokens} out")

That's a complete working integration. Three things worth noting:

  1. model="claude-fable-5": the exact model ID. If you get a 404, your API tier may not have access yet; check the models list endpoint to confirm what your org sees.
  2. max_tokens=4096: a hard cap on the response length. The Messages API requires this parameter, so pick the value deliberately; Fable 5 will gladly write 30K tokens when the cap allows it.
  3. message.usage: log this. Tracking input/output tokens per call is the only way to catch cost regressions before your bill spikes.
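To make the third point concrete, here is a minimal sketch of turning the token counts from message.usage into a dollar estimate you can log per call. The helper name estimate_cost and the two prices are assumptions for illustration only; they are not real Fable 5 pricing, so substitute the rates from your own plan.

```python
from dataclasses import dataclass


@dataclass
class CallCost:
    input_tokens: int
    output_tokens: int
    usd: float


def estimate_cost(input_tokens: int, output_tokens: int,
                  usd_per_mtok_in: float, usd_per_mtok_out: float) -> CallCost:
    """Convert token counts (as reported in message.usage) into an estimated cost."""
    usd = (input_tokens * usd_per_mtok_in + output_tokens * usd_per_mtok_out) / 1_000_000
    return CallCost(input_tokens, output_tokens, usd)


# Placeholder prices in USD per million tokens. NOT real Fable 5 pricing.
PRICE_IN, PRICE_OUT = 10.0, 50.0

# In a real integration, pass message.usage.input_tokens and
# message.usage.output_tokens instead of these sample numbers.
cost = estimate_cost(input_tokens=1_200, output_tokens=3_400,
                     usd_per_mtok_in=PRICE_IN, usd_per_mtok_out=PRICE_OUT)
print(f"{cost.input_tokens} in / {cost.output_tokens} out = ${cost.usd:.4f}")
```

Emitting this line to your logs on every call gives you a time series you can alert on, which is usually enough to spot a prompt that has quietly doubled in size.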

Minimal TypeScript Example

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const message = await client.messages.create({
  model: "claude-fable-5",
  max_tokens: 4096,
  messages: [
    {
      role: "user",
      content: "Build a complete /pricing page in Next.js App Router that loads from a JSON config and matches our existing Tailwind theme. Return all files in a single response."
    }
  ]
});

if (message.content[0].type === "text") {
  console.log(message.content[0].text);
}
console.log(`Usage: ${message.usage.input_tokens} in / ${message.usage.output_tokens} out`);

Same shape as Python. The TypeScript SDK uses a typed content discriminated union — guard with .type === "text" before accessing .text.
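The same guard is worth applying in Python, where message.content[0].text raises if the first block is not a text block. A small sketch, assuming only that each content block exposes a type attribute and that text blocks expose text; the helper name extract_text and the stand-in blocks are hypothetical:

```python
from types import SimpleNamespace


def extract_text(content) -> str:
    """Join the text of every text block, skipping tool_use and other block types."""
    return "".join(block.text for block in content if block.type == "text")


# Stand-in blocks for illustration; a real response carries SDK block objects.
demo_content = [
    SimpleNamespace(type="text", text="Here is the diff:\n"),
    SimpleNamespace(type="tool_use", id="toolu_demo", name="get_weather", input={}),
    SimpleNamespace(type="text", text="--- a/module.py"),
]
print(extract_text(demo_content))
```

With a real response you would call extract_text(message.content) in place of message.content[0].text.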


Streaming (You Should Always Use This for Fable 5)

Fable 5 runs are slow. A non-streaming call to Fable 5 for a complex task means staring at a blank terminal for 60+ seconds. Use streaming. It gives you progress visibility and lets you start processing the output as it arrives.

Python streaming

import anthropic

client = anthropic.Anthropic()

with client.messages.stream(
    model="claude-fable-5",
    max_tokens=8192,
    messages=[{"role": "user", "content": "Build a complete user-authentication system with tests."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

    final = stream.get_final_message()
    print(f"\n\nUsage: {final.usage.input_tokens} in / {final.usage.output_tokens} out")

TypeScript streaming

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const stream = client.messages.stream({
  model: "claude-fable-5",
  max_tokens: 8192,
  messages: [{ role: "user", content: "Build a complete user-authentication system with tests." }],
});

for await (const chunk of stream) {
  if (chunk.type === "content_block_delta" && chunk.delta.type === "text_delta") {
    process.stdout.write(chunk.delta.text);
  }
}

const final = await stream.finalMessage();
console.log(`\n\nUsage: ${final.usage.input_tokens} in / ${final.usage.output_tokens} out`);

Streaming changes the user experience entirely for long-running Fable 5 calls. Make it the default in your integrations.
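If you also need the complete text once the stream ends (to save it or parse it), one option is a small relay that prints chunks as they arrive and returns the joined result. This is a sketch over any iterable of strings; relay_stream is a hypothetical helper, and the chunk list below stands in for stream.text_stream from the Python example above.

```python
import sys
from typing import Iterable, TextIO


def relay_stream(chunks: Iterable[str], sink: TextIO = sys.stdout) -> str:
    """Write each chunk to sink as it arrives and return the full text."""
    parts = []
    for chunk in chunks:
        sink.write(chunk)
        sink.flush()  # show progress immediately instead of waiting for a newline
        parts.append(chunk)
    return "".join(parts)


# Simulated chunks; with the SDK you would pass stream.text_stream instead.
full_text = relay_stream(["def login(", "user):", "\n    ..."])
```

Because the helper only depends on an iterable, the same function can be unit-tested without network access.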


Prompt Caching (Critical for Cost)

If you're calling Fable 5 with a large system prompt or tool definitions repeatedly (e.g. an agent that runs over many turns, or a batch pipeline that processes 1,000 records), enable prompt caching. It cuts the input cost of cached tokens by ~90%.

import anthropic

client = anthropic.Anthropic()

SYSTEM_PROMPT = """You are a senior software engineer. When given a refactoring task:
1. Read the entire input carefully
2. Identify the natural seams in the code
3. Propose a clean, testable decomposition
4. Return a unified diff
... (long system prompt here) ..."""

message = client.messages.create(
    model="claude-fable-5",
    max_tokens=4096,
    system=[
        {
            "type": "text",
            "text": SYSTEM_PROMPT,
            "cache_control": {"type": "ephemeral"}  # cache this block
        }
    ],
    messages=[{"role": "user", "content": "Refactor this module: ..."}]
)

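The cache_control marker sets a cache breakpoint: the prompt prefix up to and including the marked block is cached. Subsequent requests with an identical prefix (within a 5-minute window by default) hit the cache and pay ~10% of the normal input cost for those tokens. On current Claude models the request that writes the cache is billed slightly above the normal input rate, and prompts shorter than the model's minimum cacheable length are not cached at all, so caching pays off only for large, frequently reused prefixes.

You can set up to 4 cache breakpoints per request, typically on the system prompt, tool definitions, and large reference documents.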
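To confirm that caching is actually working, check the cache counters in the response usage. The sketch below assumes the usage object exposes input_tokens, cache_creation_input_tokens, and cache_read_input_tokens, as the Messages API does for existing Claude models. The helper name cache_report is hypothetical, the read multiplier follows the ~10% figure above, and the 1.25 write multiplier is an assumption you should verify against current pricing.

```python
from types import SimpleNamespace


def cache_report(usage, read_multiplier: float = 0.10,
                 write_multiplier: float = 1.25) -> dict:
    """Summarize cache behavior from a usage object.

    Multipliers are relative to the normal input price. write_multiplier is an
    assumption; check the pricing page for the model you call.
    """
    uncached = usage.input_tokens
    written = getattr(usage, "cache_creation_input_tokens", 0) or 0
    read = getattr(usage, "cache_read_input_tokens", 0) or 0
    total = uncached + written + read
    billed_equivalent = uncached + written * write_multiplier + read * read_multiplier
    return {
        "hit_ratio": read / total if total else 0.0,
        "billed_input_equivalent": billed_equivalent,
    }


# Stand-in usage objects: the first call writes the cache, the second reads it.
first_call = SimpleNamespace(input_tokens=50, cache_creation_input_tokens=2000,
                             cache_read_input_tokens=0)
second_call = SimpleNamespace(input_tokens=50, cache_creation_input_tokens=0,
                              cache_read_input_tokens=2000)

print(cache_report(first_call))
print(cache_report(second_call))
```

A hit ratio that stays at zero across repeated calls usually means the prefix differs between requests or is too short to be cached.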


Tool Use (Function Calling)

Fable 5 supports tool use with the same schema as Opus 4.8 and Sonnet 4.6 — no migration needed.

import anthropic
import json

client = anthropic.Anthropic()

tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a location.",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City and state, e.g. San Francisco, CA"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
            },
            "required": ["location"]
        }
    }
]

def get_weather(location, unit="celsius"):
    # your real implementation here
    return {"location": location, "temp": 22, "unit": unit, "conditions": "clear"}

# Keep the conversation in one list so every call sees the full history
messages = [{"role": "user", "content": "What's the weather in San Francisco?"}]

# First call: the model decides whether to use the tool
response = client.messages.create(
    model="claude-fable-5",
    max_tokens=1024,
    tools=tools,
    messages=messages
)

# Loop: handle tool_use → tool_result until the model is done
while response.stop_reason == "tool_use":
    # One turn can contain several tool_use blocks; each needs its own tool_result
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps(get_weather(**block.input))
            })

    messages.append({"role": "assistant", "content": response.content})
    messages.append({"role": "user", "content": tool_results})

    response = client.messages.create(
        model="claude-fable-5",
        max_tokens=1024,
        tools=tools,
        messages=messages
    )

# Final text response
for block in response.content:
    if block.type == "text":
        print(block.text)

The pattern: model returns stop_reason: "tool_use" → you run the tool → append the result → call again. Repeat until stop_reason: "end_turn".

For multi-tool agents, the loop runs many times. Fable 5 is particularly good at long tool-use chains — that's part of what makes it a "whole-job delegation" model.
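For agents with more than one tool, a dispatch table keeps the loop body small. The following is a sketch rather than a definitive design: run_tool_calls, the registry dict, and the get_stock_price tool name are hypothetical, and the stand-in blocks mimic the type, id, name, and input attributes of tool_use blocks. Failures are reported back to the model through the is_error field of the tool_result block instead of crashing the loop.

```python
import json
from types import SimpleNamespace


def run_tool_calls(content, registry) -> list:
    """Execute every tool_use block in a response and build tool_result blocks."""
    results = []
    for block in content:
        if block.type != "tool_use":
            continue
        handler = registry.get(block.name)
        if handler is None:
            # Tell the model the tool does not exist rather than raising.
            results.append({"type": "tool_result", "tool_use_id": block.id,
                            "content": f"Unknown tool: {block.name}", "is_error": True})
            continue
        try:
            output = handler(**block.input)
            results.append({"type": "tool_result", "tool_use_id": block.id,
                            "content": json.dumps(output)})
        except Exception as exc:  # surface tool failures to the model
            results.append({"type": "tool_result", "tool_use_id": block.id,
                            "content": str(exc), "is_error": True})
    return results


def get_weather(location, unit="celsius"):
    # Placeholder implementation, same shape as the example above.
    return {"location": location, "temp": 22, "unit": unit, "conditions": "clear"}


registry = {"get_weather": get_weather}

# Stand-in response content with one known and one unknown tool.
demo_content = [
    SimpleNamespace(type="text", text="Checking two things."),
    SimpleNamespace(type="tool_use", id="toolu_1", name="get_weather",
                    input={"location": "San Francisco, CA"}),
    SimpleNamespace(type="tool_use", id="toolu_2", name="get_stock_price",
                    input={"ticker": "XYZ"}),
]
results = run_tool_calls(demo_content, registry)
print(json.dumps(results, indent=2))
```

In the loop above, the returned list would become the content of the next user message. Adding a cap on the number of loop iterations is a sensible companion so a confused agent cannot run indefinitely.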


Three Gotchas That Catch First-Time Users

Gotcha 1: 404 on model: "claude-fable-5"

Two likely causes:

  1. Your API tier doesn't have access yet. Anthropic rolls out new models region-by-region; some orgs get access on day 0, others over the following 24–48 hours. Verify by calling the models list endpoint.
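  2. Your SDK is out of date. Update with pip install -U anthropic or npm install @anthropic-ai/sdk@latest. The SDKs send the model ID as a plain string, so this is the less likely cause, but updating rules out stale type definitions and request defaults.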
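The first check can be scripted. This sketch assumes a recent Python SDK in which client.models.list() returns an iterable of model objects with an id attribute; model_is_available is a hypothetical helper, and the fake client below only imitates that shape so the example runs without a network call.

```python
from types import SimpleNamespace


def model_is_available(client, model_id: str) -> bool:
    """Return True if model_id appears among the models visible to this API key."""
    return any(model.id == model_id for model in client.models.list())


# Fake client for illustration. With the real SDK, use anthropic.Anthropic().
# The second ID is a made-up placeholder.
demo_client = SimpleNamespace(
    models=SimpleNamespace(
        list=lambda: [
            SimpleNamespace(id="claude-fable-5"),
            SimpleNamespace(id="some-other-model"),
        ]
    )
)

print(model_is_available(demo_client, "claude-fable-5"))
```

Running the check at startup turns a confusing 404 in the middle of a job into a clear configuration error.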

Gotcha 2: Unexpected routing to Opus 4.8

Fable 5 includes safeguards that route cybersecurity and biology queries to Opus 4.8 automatically. If your prompt touches offensive security, malware, exploit development, gain-of-function research, or synthesis routes, you'll get an Opus 4.8 response back — and the response itself will note the routing.

If you're doing legitimate research in these areas, you'd need Mythos 5 access (the unrestricted variant), which is limited to vetted partners. Apply through Anthropic's enterprise contact.
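How the routing is reported is not specified here. One cheap check, assuming the serving model shows up in the model field that every Messages API response carries, is to compare it with the ID you requested. The helper was_rerouted is hypothetical, and both the dated ID and the Opus ID in the example are made-up placeholders.

```python
def was_rerouted(requested_model: str, response_model: str) -> bool:
    """Return True when the response came from a different model family.

    A prefix match is used because returned IDs may carry a version suffix.
    """
    return not response_model.startswith(requested_model)


# Hypothetical IDs for illustration; with a real response, pass message.model.
print(was_rerouted("claude-fable-5", "claude-fable-5-20260601"))
print(was_rerouted("claude-fable-5", "claude-opus-4-8"))
```

Logging this flag alongside usage makes it easy to see how often routing affects your workload.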

Gotcha 3: Hitting max_tokens mid-response

If Fable 5's response is cut off, you'll see stop_reason: "max_tokens" in the response. The output is truncated; you didn't get the full answer.

Two fixes:

  1. Increase max_tokens for the call. Fable 5 can produce 8K–32K token responses on complex tasks.
  2. Continue the response by sending the truncated output back as the assistant's message and asking the model to continue. This suits long-form generations that you want to produce in chunks.

# Continue a truncated response.
# original_user_message and first_response are placeholders for the message dict
# and the response object from the call that was cut off.
continued = client.messages.create(
    model="claude-fable-5",
    max_tokens=4096,
    messages=[
        original_user_message,
        {"role": "assistant", "content": first_response.content},  # the truncated output
        {"role": "user", "content": "continue"}
    ]
)
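When a single continuation may not be enough, the second fix generalizes to a bounded loop. This is a sketch under stated assumptions: generate_until_done is a hypothetical helper, and call(messages) stands for your own wrapper around client.messages.create that returns an object with stop_reason and text attributes. The scripted responses below replace the API so the example runs offline.

```python
from types import SimpleNamespace


def generate_until_done(call, first_prompt: str, max_rounds: int = 5) -> str:
    """Request continuations while stop_reason is "max_tokens", up to max_rounds calls."""
    messages = [{"role": "user", "content": first_prompt}]
    parts = []
    for _ in range(max_rounds):
        response = call(messages)
        parts.append(response.text)
        if response.stop_reason != "max_tokens":
            break  # the model finished on its own
        # Feed the truncated output back and ask for the rest.
        messages.append({"role": "assistant", "content": response.text})
        messages.append({"role": "user", "content": "continue"})
    return "".join(parts)


# Scripted stand-in responses: one truncated chunk, then a normal finish.
scripted = iter([
    SimpleNamespace(stop_reason="max_tokens", text="part one, "),
    SimpleNamespace(stop_reason="end_turn", text="part two."),
])
result = generate_until_done(lambda messages: next(scripted), "Write a long report.")
print(result)
```

The max_rounds cap matters: without it, a response that never finishes would keep generating billable tokens.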

Where to Go From Here

  • Anthropic API docs — full reference for messages, streaming, tool use, vision, prompt caching, and the Batch API
  • Anthropic Python SDK on GitHub
  • Anthropic TypeScript SDK on GitHub
  • Our Fable 5 pricing breakdown — real cost per workload
  • Our Fable 5 vs Opus 4.8 comparison — when to pick each

Shipping Fable 5 in Production?

Building production agents on Fable 5 means making decisions about model fallbacks (Fable → Opus when rate-limited), prompt caching strategy (which blocks to cache, when to invalidate), tool-use loop design (how many turns is too many), and observability (cost per request, p95 latency, tool-call success rate).

Getting these right is the difference between an agent that ships and an agent that burns $5K/month and silently fails on 8% of requests.
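As one illustration of the fallback decision, here is a minimal sketch that tries models in order and moves on when a retryable error occurs. The helper call_with_fallback, the FakeRateLimit exception, and the "claude-opus-4-8" ID are assumptions for the example; in real code you would likely pass the SDK's rate-limit exception class as retryable and wrap client.messages.create in call.

```python
def call_with_fallback(call, models, retryable=(Exception,)):
    """Try each model in order; return (model_used, result) from the first success."""
    last_error = None
    for model in models:
        try:
            return model, call(model)
        except retryable as exc:
            last_error = exc  # remember the failure and try the next model
    raise RuntimeError("all models failed") from last_error


class FakeRateLimit(Exception):
    """Stand-in for a rate-limit error so the example runs offline."""


def fake_call(model: str) -> str:
    # Simulate the primary model being rate-limited.
    if model == "claude-fable-5":
        raise FakeRateLimit("429")
    return f"ok from {model}"


# "claude-opus-4-8" is a hypothetical model ID used only for illustration.
used_model, reply = call_with_fallback(
    fake_call, ["claude-fable-5", "claude-opus-4-8"], retryable=(FakeRateLimit,)
)
print(used_model, reply)
```

Returning the model that actually served the request keeps your cost and latency metrics attributable to the right model.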

AY Automate places senior AI engineers into your team to design, build, and ship production agents on Claude Fable 5 and the broader Anthropic stack. Book a free 30-min strategy call — we'll look at your architecture and tell you where the risks are.

About the Author
Adel Dahani
COO | Ex IBM

Adel keeps the engine running at AY Automate. He owns internal processes, team coordination, and the operational excellence that lets us ship fast for clients.