Claude Fable 5
The most capable Claude model, for the hardest reasoning and long-horizon agentic work. Explicit cache_control breakpoints keep cache reads 90% off - on a 1M-token context, prefix placement drives the real bill.
At a glance.
| Provider | Anthropic |
| Family | claude-fable |
| Released | 2026-06 |
| License | Proprietary |
| Context window | 1M tokens |
| Max output | 128K tokens |
| Modalities | text, image, pdf |
| Tool calling | Yes |
| Reasoning mode | Yes |
| Caching | explicit |
| Batch discount | 50% off |
What reuse looks like here.
What you actually pay once caching works.
At a typical 55% prefix reuse, a million input tokens on Claude Fable 5 effectively costs $5.05 instead of $10.00 - blending to roughly $16.29 with a 25% output share. Background work drops a further 50% on the batch tier.
Estimate it for your workloadRoutes through these aliases:
Same OpenAI client, this model.
from openai import OpenAI
client = OpenAI(base_url="https://api.zumik.ai/v1", api_key="zk_live_...")
r = client.responses.create(
model="claude-fable-5", # or an alias like auto.best
input="Draft a fix for the failing test.",
)
print(r.usage.input_tokens_cached) # confirm reuseHow Claude Fable 5 stacks up.
Other options for these workloads.
Claude Fable 5, answered.
How much does Claude Fable 5 cost?
Claude Fable 5 is $10.00 per million input tokens and $50.00 per million output tokens through Zumik. Cache reads are $1.00 per million, a 90% discount on input.
What is Claude Fable 5's context window?
Claude Fable 5 supports a 1M-token context window with up to 128K output tokens.
Does Claude Fable 5 support prompt caching?
Yes. Anthropic uses Explicit cache_control breakpoints caching. In the Zumik corpus, Claude Fable 5 shows a median cache capture of 93% on agent workloads.
Which Zumik aliases route to Claude Fable 5?
Claude Fable 5 is a candidate for the auto.best, reasoning.best aliases, selected when it wins under current routing policy.
Run Claude Fable 5 with reuse measured.
Point an OpenAI client at Zumik and see exactly how much of this model's input you are reusing.