Run a workload diagnostic
You suspect repeated context is costing you and want evidence before touching infrastructure.
01
Submit a trace sample
Upload metadata-only traces (no raw prompts required). Zumik computes prefix families and the reuse opportunity from them.
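If you would rather assemble the upload from Python, the request can be built and inspected before anything is sent. This is a minimal sketch using only the standard library, not an official client. The endpoint and payload fields are taken from the curl example in this step. The `build_diagnostic_request` helper and the `Content-Type` header are assumptions: the curl example sets no content type, so check what the API expects.

```python
import json
import urllib.request

# Endpoint and payload fields are copied from the curl example in this guide.
DIAGNOSTICS_URL = "https://api.zumik.ai/v2/diagnostics"


def build_diagnostic_request(api_key: str, sample_ref: str) -> urllib.request.Request:
    """Assemble (but do not send) a metadata-only diagnostic request.

    Hypothetical helper for illustration; not part of any Zumik SDK.
    """
    payload = {
        "source": "trace_export",
        "trace_mode": "metadata",  # metadata only, no raw prompts
        "sample_ref": sample_ref,
    }
    return urllib.request.Request(
        DIAGNOSTICS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the API accepts a JSON body with this header.
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; substitute your own key and trace sample reference.
    req = build_diagnostic_request("YOUR_API_KEY", "YOUR_SAMPLE_REF")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it lets you confirm that the body carries only the trace reference and mode, with no prompt text, before it leaves your machine.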
curl https://api.zumik.ai/v2/diagnostics \
  -H "Authorization: Bearer zk_live_..." \
  -d '{"source":"trace_export","trace_mode":"metadata","sample_ref":"trc_…"}'
02
Read the score and waterfall
The report returns the Workload Reuse Score, the reuse waterfall, a provider-fit matrix, and the lowest-complexity next step.
# Assumes `z` is an already-configured Zumik client.
d = z.diagnostics.get("dgn_01JY…")
print(d.wrs, d.waterfall.realized_reuse, d.recommended_profile)
Where to go from here.
Got it running? Measure it.
Once requests flow through Zumik, a diagnostic shows exactly what you are reusing.