
What is fallback?

When a model fails (provider outage, rate limit, bad request), Manifest retries with a backup model from the same tier. Your agent gets a response instead of an error.

How it works

1. Request fails

The primary model returns an error (an HTTP 4xx or 5xx status other than 424).

2. Manifest selects a backup

Manifest picks the next model from the tier’s fallback list. Fallback models are tried in the order you configure them.

3. Request is retried

The original request is forwarded to the backup model. If that model also fails, Manifest continues down the fallback list until a model succeeds or all options are exhausted.
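
The three steps above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not Manifest's actual implementation: `route_with_fallback`, the `send` callable, and the model names are hypothetical, and the failure check follows the rule described on this page (status 400 or higher, except 424).

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical transport: takes a model name and returns an HTTP status code.
Send = Callable[[str], int]


def route_with_fallback(
    primary: str,
    fallbacks: Sequence[str],
    send: Send,
) -> Tuple[Optional[str], int, Optional[int]]:
    """Try the primary model, then each fallback in its configured order.

    Returns (model_that_answered, status, fallback_index).
    fallback_index is None when the primary answered, 0 for the first
    fallback, 1 for the second, and so on.
    If every model fails, returns (None, 424, None).
    """
    status = send(primary)
    # A status below 400 is a success; 424 never triggers a fallback.
    if status < 400 or status == 424:
        return primary, status, None

    for index, model in enumerate(fallbacks):
        status = send(model)
        if status < 400 or status == 424:
            return model, status, index

    # Chain exhausted: 424 is the status this page says Manifest returns.
    return None, 424, None


# Usage: the primary and the first fallback fail, the second fallback succeeds.
statuses = {"primary-model": 503, "backup-a": 429, "backup-b": 200}
result = route_with_fallback(
    "primary-model", ["backup-a", "backup-b"], statuses.__getitem__
)
print(result)  # ('backup-b', 200, 1)
```

Each failed attempt costs one more call to `send`, which is why a longer fallback list adds latency only when earlier models fail.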

What triggers a fallback

Any HTTP status code of 400 or higher triggers a fallback, with one exception: 424 (Failed Dependency). Manifest itself returns 424 when the entire chain is exhausted, so excluding it prevents infinite loops. Common triggers include:

| Status | Example |
| --- | --- |
| 400 | Bad request |
| 401 | Authentication error |
| 403 | Forbidden |
| 429 | Rate limited |
| 500 | Internal server error |
| 502 | Bad gateway |
| 503 | Service unavailable |
| 529 | Provider overloaded |
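
The trigger rule can be written as a one-line predicate. This is a sketch of the rule as stated on this page; the function name `triggers_fallback` is hypothetical and is not part of any Manifest API.

```python
def triggers_fallback(status: int) -> bool:
    """Return True if an HTTP status should move the request to the next fallback."""
    # 424 is excluded: per this page, it is what Manifest itself returns
    # once the whole fallback chain has been exhausted.
    return status >= 400 and status != 424


for code in (200, 400, 424, 429, 529):
    print(code, triggers_fallback(code))
# 200 False
# 400 True
# 424 False
# 429 True
# 529 True
```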

Configuration

Fallback models are configured per tier in the Manifest dashboard. Each tier can have up to 5 fallback models, tried in order.
1. Open the dashboard

Go to app.manifest.build and navigate to Routing.

2. Select a tier

Click on any tier (Simple, Standard, Complex, or Reasoning).

3. Add fallback models

Add up to 5 fallback models. Drag to reorder; models are tried from top to bottom.

Response headers

When a fallback succeeds, the response includes the standard routing headers plus two extra ones:

| Header | Description |
| --- | --- |
| X-Manifest-Tier | The routing tier |
| X-Manifest-Model | The model that served the response (the fallback model, not the original) |
| X-Manifest-Provider | The provider that handled the request |
| X-Manifest-Confidence | Routing confidence score |
| X-Manifest-Reason | Why this tier was selected |
| X-Manifest-Fallback-From | The primary model that was attempted first |
| X-Manifest-Fallback-Index | Position in the fallback chain (0 = first fallback, 1 = second, etc.) |

When the chain is exhausted:

| Header | Description |
| --- | --- |
| X-Manifest-Fallback-Exhausted | Set to true when all models failed |
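
A client can inspect these headers to tell whether a fallback happened. The sketch below uses only the header names listed above. The `FallbackInfo` type and `parse_fallback_headers` function are hypothetical helpers, and the code assumes the index is sent as a decimal integer and the exhausted flag as the literal string `true`.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class FallbackInfo:
    served_by: Optional[str]       # X-Manifest-Model
    fallback_from: Optional[str]   # X-Manifest-Fallback-From
    fallback_index: Optional[int]  # X-Manifest-Fallback-Index
    exhausted: bool                # X-Manifest-Fallback-Exhausted


def parse_fallback_headers(headers: Mapping[str, str]) -> FallbackInfo:
    """Extract fallback details from a response's headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    raw_index = h.get("x-manifest-fallback-index")
    return FallbackInfo(
        served_by=h.get("x-manifest-model"),
        fallback_from=h.get("x-manifest-fallback-from"),
        # Assumption: the index is a decimal integer string such as "0".
        fallback_index=int(raw_index) if raw_index is not None else None,
        # Assumption: the flag is sent as the literal string "true".
        exhausted=h.get("x-manifest-fallback-exhausted", "").strip().lower() == "true",
    )


# Usage with made-up model names: the first fallback served the response.
info = parse_fallback_headers({
    "X-Manifest-Model": "backup-model",
    "X-Manifest-Fallback-From": "primary-model",
    "X-Manifest-Fallback-Index": "0",
})
print(info.fallback_index, info.exhausted)  # 0 False
```

When no fallback headers are present, `fallback_from` and `fallback_index` are `None`, which a caller can treat as "the primary model answered".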

Fallback vs routing

| | Routing | Fallback |
| --- | --- | --- |
| When | Before the request is sent | After the request fails |
| Goal | Pick the cheapest capable model | Recover from a failure |
| Speed | < 2 ms scoring | Adds one extra round-trip per retry |
| Tier | Assigns a tier | Stays within the same tier |

Routing picks the model. Fallback steps in when that model fails.

Connect at least two providers. With a single provider, fallback can only switch between that provider’s models, so it cannot route around an outage that affects the whole provider.