Tool · calculator

What will this actually cost per month?

Pick a model, describe your traffic, and set how much of your prompt is a reusable stable prefix. The calculator shows list-price spend, reuse-adjusted spend, and the gap between them. No subscription on top - the bill is credits.

Model

Requests per day: 20,000

Avg input tokens: 12,000

Avg output tokens: 700

Stable prefix reused: 55%

600,000 requests/mo · 7,200,000,000 input tokens · 420,000,000 output tokens.
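The totals above follow from simple multiplication. Here is a minimal sketch, assuming a 30-day month and per-request averages of 12,000 input and 700 output tokens. Those averages are back-derived from the totals line rather than quoted from the calculator, and the function name is illustrative only.

```python
DAYS_PER_MONTH = 30  # assumption: the totals above imply a 30-day month


def monthly_traffic(requests_per_day, avg_input_tokens, avg_output_tokens):
    """Return (requests, input_tokens, output_tokens) for one month."""
    requests = requests_per_day * DAYS_PER_MONTH
    return requests, requests * avg_input_tokens, requests * avg_output_tokens


# 12,000 and 700 are assumed averages, back-derived from the totals line.
requests, input_tokens, output_tokens = monthly_traffic(20_000, 12_000, 700)
print(
    f"{requests:,} requests/mo · {input_tokens:,} input tokens · "
    f"{output_tokens:,} output tokens"
)
```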

$93,000

List price, no reuse / mo

$57,360

With 55% reuse / mo

Estimated monthly saving

$35,640

−38%

Monthly subscription: $0.00

Estimated credit spend: $57,360/mo

Estimate only, at provider list prices; Zumik's published per-model rates land at or under list on managed routing. Actual reuse depends on prompt ordering and retention locality. Output is excluded from cache savings.

Method

Reused input is billed at the model's cache-read rate; the rest at list input. Output is never discounted by caching. This is an estimate - measure your real reuse with a workload diagnostic.
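The method can be sketched in a few lines of Python. The `Rates` class, the function name, and the example rates ($10 input, $1 cache read, $50 output per million tokens) are all assumptions: the rates were chosen only because they reproduce the example figures shown by the calculator, and they are not any particular model's published prices.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Rates:
    """Prices in USD per million tokens (hypothetical values, not a real model's)."""
    input_rate: float
    cache_read_rate: float
    output_rate: float


def estimate_monthly_cost(input_tokens, output_tokens, reuse, rates):
    """Return (list_price, reuse_adjusted, saving) in USD, rounded to cents.

    reuse is the fraction of input tokens billed at the cache-read rate (0 to 1).
    """
    if not 0.0 <= reuse <= 1.0:
        raise ValueError("reuse must be between 0 and 1")
    input_m = input_tokens / TOKENS_PER_MILLION
    output_m = output_tokens / TOKENS_PER_MILLION
    output_cost = output_m * rates.output_rate  # output is never discounted
    list_price = input_m * rates.input_rate + output_cost
    adjusted = (
        input_m * reuse * rates.cache_read_rate
        + input_m * (1.0 - reuse) * rates.input_rate
        + output_cost
    )
    return round(list_price, 2), round(adjusted, 2), round(list_price - adjusted, 2)


# Assumed rates: picked only to reproduce the example figures on this page.
example_rates = Rates(input_rate=10.0, cache_read_rate=1.0, output_rate=50.0)
list_price, adjusted, saving = estimate_monthly_cost(
    7_200_000_000, 420_000_000, 0.55, example_rates
)
print(f"list ${list_price:,.0f} · with reuse ${adjusted:,.0f} · saving ${saving:,.0f}")
```

Swapping in a real model's three rates and your own token volumes gives the same three numbers the calculator reports, subject to the caveat that realized reuse may differ from the percentage you enter.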

Cheapest cached models

Find the lowest reuse-adjusted cost for agent traffic.

Capture more reuse

How to actually land each provider's cache discount.

Move work to batch

Halve the cost of non-interactive traffic.

Frequently asked

Calculator, answered.

How does the calculator estimate savings?

It splits your input tokens by the reuse percentage you set, billing the reused portion at the model’s cache-read rate and the rest at list input price. Output is excluded from cache savings.

What reuse percentage should I use?

Median realized reuse across agent workloads in our corpus is around 53%. Coding agents often run higher; consumer chat runs much lower. Run a diagnostic to measure yours.
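To see how sensitive the result is to this choice, it can help to sweep a few reuse levels. The sketch below is illustrative only: the rates are hypothetical (USD per million tokens), the token volumes are the example traffic in millions of tokens, and the 20% and 80% points are arbitrary stand-ins for "much lower" and "higher", not measured values.

```python
def saving_share(reuse, input_m, output_m, input_rate, cache_read_rate, output_rate):
    """Fraction of list-price spend saved when `reuse` (0 to 1) of input is cached."""
    list_price = input_m * input_rate + output_m * output_rate
    saved = input_m * reuse * (input_rate - cache_read_rate)
    return saved / list_price


# Hypothetical rates in USD per million tokens; volumes in millions of tokens.
INPUT_M, OUTPUT_M = 7_200, 420
INPUT_RATE, CACHE_READ_RATE, OUTPUT_RATE = 10.0, 1.0, 50.0

for reuse in (0.20, 0.53, 0.80):
    share = saving_share(
        reuse, INPUT_M, OUTPUT_M, INPUT_RATE, CACHE_READ_RATE, OUTPUT_RATE
    )
    print(f"reuse {reuse:.0%}: saving {share:.1%} of list price")
```

Under these assumptions the saving scales linearly with reuse, so an error of a few points in the reuse estimate moves the monthly figure proportionally.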

Is there a monthly fee on top of this?

No. Zumik has no subscription; the whole bill is prepaid pay-as-you-go credits, drawn down per token at published per-model rates. The smallest credit top-up is $5.

Estimate, then measure.

The calculator is a model; a diagnostic is the truth. Run one on your real traffic.

Run a diagnostic · See pricing
