Five providers, deep failover
Zumik defaults to managed provider accounts and adds BYOK and BYOC as optional profiles. It captures provider-native caching, batch tiers, and service tiers, and reports an honest QoS outcome for every request.
Optimized, not just proxied.
OpenAI (automatic)
GPT-5 family, automatic prefix caching, flex/scale tiers.
Anthropic (explicit)
Claude Fable / Opus, explicit cache breakpoints, 90% read discount.
Google Gemini (implicit)
Gemini 3 family, implicit caching, up to 2M-token context.
xAI (context)
Grok 4 and Grok-3 Mini, context caching, cheap fast frontier.
Fireworks AI (automatic)
Open-weights serving with speculative decoding and dedicated tiers.
Start managed. Escalate with evidence.
Managed default
Company-managed provider accounts via Bifrost. Fastest onboarding, broad coverage, lowest operational burden.
BYOK
Your provider keys. Billing, quotas, and retention follow your agreements; provider-native caching stays active.
BYOC
The data plane in your cloud, for dedicated SLOs and isolation. Activated only when replay proves it.
Hybrid
Managed providers for breadth and overflow, plus BYOC hot lanes for concentrated, reusable workloads.
One scheduler owns replica selection per profile. Zumik never stacks competing routing control planes inside a single deployment path - a decision that keeps behavior predictable under load.
Accountable for results, not just inputs.
Asking for an interactive class is not enough - the platform should tell you whether it delivered. Every request returns a formal outcome with a stable, machine-readable reason code.
Admission
admitted · queued · rejected · expired_before_start
Completion
completed · failed · cancelled · expired_during_execution
Signals
target_met · degraded · fallback_used · reason_code
On /v1, a compact subset is returned via response headers so the OpenAI body stays exactly compatible. The full outcome lives on the native surface.
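As a rough illustration, the outcome fields listed above could be modeled as a small typed record and rebuilt from response headers. This is a minimal sketch, not the platform's actual schema: the `x-qos-` header prefix, the header names, and the `QosOutcome` and `outcome_from_headers` names are all hypothetical. Because only a compact subset is sent on /v1, missing headers fall back to defaults.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Mapping, Optional


class Admission(Enum):
    ADMITTED = "admitted"
    QUEUED = "queued"
    REJECTED = "rejected"
    EXPIRED_BEFORE_START = "expired_before_start"


class Completion(Enum):
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"
    EXPIRED_DURING_EXECUTION = "expired_during_execution"


@dataclass(frozen=True)
class QosOutcome:
    admission: Admission
    completion: Optional[Completion]  # None when no completion state was reported
    target_met: bool
    degraded: bool
    fallback_used: bool
    reason_code: str


# Hypothetical prefix: the real /v1 header names are not given in the text.
HEADER_PREFIX = "x-qos-"


def outcome_from_headers(headers: Mapping[str, str]) -> QosOutcome:
    """Build a QosOutcome from response headers, tolerating a compact subset."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {key.lower(): value for key, value in headers.items()}

    def flag(name: str) -> bool:
        return h.get(HEADER_PREFIX + name, "false").lower() == "true"

    completion_raw = h.get(HEADER_PREFIX + "completion")
    return QosOutcome(
        admission=Admission(h[HEADER_PREFIX + "admission"]),
        completion=Completion(completion_raw) if completion_raw else None,
        target_met=flag("target-met"),
        degraded=flag("degraded"),
        fallback_used=flag("fallback-used"),
        reason_code=h.get(HEADER_PREFIX + "reason-code", ""),
    )
```

Using enums for the admission and completion states means an unexpected value raises immediately instead of being passed along silently, which suits a field that is meant to be machine-readable.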
Provider routing, answered.
Which providers are first-class?
OpenAI, Anthropic, xAI, Google Gemini, and Fireworks AI - for both managed and BYOK execution. Broader coverage is available through Bifrost where policy allows.
Do you use OpenRouter for routing?
Only as an explicit last-resort outage bridge when a required model path has a verified total outage, and only for customers whose policy permits brokered execution. It is never used for routine price arbitration.
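The rule above reduces to a conjunction of three conditions, sketched here under assumptions. The `CustomerPolicy`, `ModelPathStatus`, and `may_use_outage_bridge` names are hypothetical and only mirror the wording of the answer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerPolicy:
    # Hypothetical flag: whether the customer permits brokered execution.
    allows_brokered_execution: bool


@dataclass(frozen=True)
class ModelPathStatus:
    required: bool               # the request cannot be served without this path
    verified_total_outage: bool  # the outage is confirmed, not merely suspected


def may_use_outage_bridge(policy: CustomerPolicy, path: ModelPathStatus) -> bool:
    """Allow the last-resort bridge only when every condition holds."""
    return (
        path.required
        and path.verified_total_outage
        and policy.allows_brokered_execution
    )
```

Price never appears as an input, so a gate shaped like this cannot be triggered for routine price arbitration.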
How does failover work?
The execution broker preserves the logical response id and opens a new attempt on a healthy provider or region, recording the retry cause, provider, model resolution, and timing per attempt.
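The failover behavior described above can be sketched as a loop that keeps one logical response id while recording each attempt. This is an illustrative sketch, not the broker's real implementation: `execute_with_failover`, `Attempt`, `BrokeredResponse`, and the `call` callback are hypothetical names, and health checks and region selection are reduced to an ordered list of routes.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Attempt:
    provider: str
    resolved_model: str
    started_at: float
    duration_s: float
    retry_cause: Optional[str]  # why this attempt was opened; None for the first
    error: Optional[str] = None


@dataclass
class BrokeredResponse:
    response_id: str  # logical id, unchanged across attempts
    attempts: List[Attempt] = field(default_factory=list)
    output: Optional[str] = None


# A route is (provider name, resolved model name); both are illustrative.
Route = Tuple[str, str]


def execute_with_failover(
    prompt: str,
    routes: Sequence[Route],
    call: Callable[[str, str, str], str],
) -> BrokeredResponse:
    """Try each route in order, keeping one logical response id throughout."""
    response = BrokeredResponse(response_id=f"resp_{uuid.uuid4().hex}")
    retry_cause: Optional[str] = None
    for provider, model in routes:
        started = time.monotonic()
        try:
            output = call(provider, model, prompt)
        except Exception as exc:  # a real broker would classify errors more finely
            cause = f"{type(exc).__name__}: {exc}"
            elapsed = time.monotonic() - started
            response.attempts.append(
                Attempt(provider, model, started, elapsed, retry_cause, error=cause)
            )
            retry_cause = cause  # becomes the recorded cause of the next attempt
            continue
        elapsed = time.monotonic() - started
        response.attempts.append(
            Attempt(provider, model, started, elapsed, retry_cause)
        )
        response.output = output
        return response
    return response  # every route failed; output stays None
```

A caller would pass its own `call` function that talks to the chosen provider. The returned `attempts` list then holds the retry cause, provider, model resolution, and timing for each attempt.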
What is a QoS outcome?
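A formal result for each request. Admission is reported as admitted, queued, rejected, or expired before start. Completion is reported as completed, failed, cancelled, or expired during execution. Signals state whether the target was met, whether service was degraded, and whether a fallback was used, along with a stable, machine-readable reason code.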
Route on evidence, fail over on policy.
Bring your own keys or use managed accounts, and get an honest QoS outcome on every request.