Anthropic · explicit caching

Claude Fable 5

The most capable Claude model, for the hardest reasoning and long-horizon agentic work. Explicit cache_control breakpoints keep cache reads 90% off - on a 1M-token context, prefix placement drives the real bill.

Call it through Zumik About Anthropic

$10.00

Input / 1M tokens

$50.00

Output / 1M tokens

$1.00

Cache read · −90%

Context window

Specifications

At a glance.

Provider	Anthropic
Family	claude-fable
Released	2026-06
License	Proprietary
Context window	1M tokens
Max output	128K tokens
Modalities	text, image, pdf
Tool calling	Yes
Reasoning mode	Yes
Caching	explicit
Batch discount	50% off

Measured by Zumik

What reuse looks like here.

Claude Fable 5 · agent trafficper request

Total input100%

Candidate reuse67%

Realized reuse63%

Capture rate94%

280ms

Warm TTFT · −70% vs cold

Output tokens / sec

Reuse economics

What you actually pay once caching works.

At a typical 55% prefix reuse, a million input tokens on Claude Fable 5 effectively costs $5.05 instead of $10.00 - blending to roughly $16.29 with a 25% output share. Background work drops a further 50% on the batch tier.

Estimate it for your workload

Best for

Routes through these aliases:

Call it

Same OpenAI client, this model.

python

from openai import OpenAI

client = OpenAI(base_url="https://api.zumik.ai/v1", api_key="zk_live_...")

r = client.responses.create(
    model="claude-fable-5",          # or an alias like auto.best
    input="Draft a fix for the failing test.",
)
print(r.usage.input_tokens_cached)   # confirm reuse

Compare

How Claude Fable 5 stacks up.

Head-to-head

Claude Fable 5 vs GPT-5.5

The two default flagships for general and coding agents.

Similar models

Other options for these workloads.

Frequently asked

Claude Fable 5, answered.

How much does Claude Fable 5 cost?

Claude Fable 5 is $10.00 per million input tokens and $50.00 per million output tokens through Zumik. Cache reads are $1.00 per million, a 90% discount on input.

What is Claude Fable 5's context window?

Claude Fable 5 supports a 1M-token context window with up to 128K output tokens.

Does Claude Fable 5 support prompt caching?

Yes. Anthropic uses Explicit cache_control breakpoints caching. In the Zumik corpus, Claude Fable 5 shows a median cache capture of 94% on agent workloads.

Which Zumik aliases route to Claude Fable 5?

Claude Fable 5 is a candidate for the auto.best, reasoning.best aliases, selected when it wins under current routing policy.

Run Claude Fable 5 with reuse measured.

Point an OpenAI client at Zumik and see exactly how much of this model's input you are reusing.

Migration quickstart Back to catalog

At a glance.

What reuse looks like here.

What you actually pay once caching works.

Same OpenAI client, this model.

How Claude Fable 5 stacks up.

Claude Fable 5 vs GPT-5.5

Other options for these workloads.

Claude Sonnet 4.6

Claude Opus 4.7

Claude Opus 4.8

GPT-5.5 Pro

Claude Fable 5, answered.

Run Claude Fable 5 with reuse measured.