How to make model aliases actually reproducible

Hard-coding a provider model string ages badly: the model gets deprecated, the pricing changes, a better option ships, and now every call site needs editing. So teams reach for aliases - ask for code.fast and let routing pick. The problem is that most alias systems are mutable config, which means the answer to "which model handled this request last March?" is a shrug.

Reproducibility is the whole value. An alias you cannot explain after the fact is just a global variable with extra steps.

The immutable release is the unit

The fix is to make the resolution policy itself a versioned, immutable object. An alias release freezes how code.fast resolves at a point in time: the candidate models, the policy that ranks them, and the constraints. The live alias points at exactly one release. When a provider ships a new model revision or you change the policy, you cut a new release; you never mutate the old one.

Now the question has an answer. A request carries the alias release id, so you can replay the exact decision logic that was in force when it ran, even after the live alias has moved on three times since.

Log the decision, not just the result

Reproducibility needs the receipt, not only the outcome. Every resolution should record the requested alias, the release id, the resolved concrete model, how many candidates were considered, and the reason one won under policy. That last field is what turns an opaque router into something an engineer can audit.

Keep this metadata off the OpenAI-compatible surface. /v1/models should still return exact upstream-compatible objects so drop-in clients do not break; the rich alias metadata belongs under a separate endpoint where it cannot leak into code that expects the standard shape.

Replay has to pin something concrete

There are two honest ways to replay. Pin the alias release, and you reproduce the decision logic - useful when you want to know what the router would have done. Pin the concrete resolved model, and you reproduce the exact output path - useful when you are debugging a specific response. Both are valid; what is not valid is replaying against a live alias and pretending the result is historical, because the alias has almost certainly changed.

Decide which one you mean and record it in the replay manifest. A replay that does not say what it pinned is not reproducible.

What you get for the discipline

Automatic routing usually trades away explainability. Immutable releases buy it back: you get the convenience of asking for a capability and a durable record of every decision. When a regression shows up, you bisect releases instead of guessing. When finance asks why the model mix shifted, you point at the release that changed it. That is the difference between a routing layer you trust and one you merely tolerate.

How to make model aliases actually reproducible

The immutable release is the unit

Log the decision, not just the result

Replay has to pin something concrete

What you get for the discipline

Keep going.

Scoring a workload before you change infrastructure

When bringing your own cloud actually pays off

Prompt ordering is the cheapest optimization you are skipping

Turn the idea into a measurement.