What is fallback?

When a model fails (provider outage, rate limit, bad request), Manifest retries with a backup model from the same tier. Your agent gets a response instead of an error.

How it works

1. Request fails
The primary model returns an error (an HTTP 4xx or 5xx status, except 424).

2. Manifest selects a backup
Manifest picks the next fallback model from the tier’s fallback list. Fallback models are tried in the order you configure them.

3. Request is retried
The original request is forwarded to the backup model. If that model also fails, Manifest continues down the fallback list until a model succeeds or all options are exhausted.
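The three steps above can be sketched as a loop over an ordered list of models. This is a minimal illustration, not Manifest's implementation: `ModelCall`, `run_with_fallback`, and `always` are hypothetical names, and each "model" is a plain function returning an HTTP status and a body. The check here treats any status of 400 or higher as a failure; the exact trigger rule is covered under "What triggers a fallback".

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical stand-in for a provider call: takes the request, returns (http_status, body).
ModelCall = Callable[[str], Tuple[int, str]]


def run_with_fallback(
    request: str,
    primary: ModelCall,
    fallbacks: Sequence[ModelCall],
) -> Optional[Tuple[int, str, int]]:
    """Try the primary, then each fallback in its configured order.

    Returns (status, body, attempt_index) for the first success, where
    attempt_index 0 is the primary, or None if every model failed.
    """
    for index, call in enumerate([primary, *fallbacks]):
        status, body = call(request)  # the same original request is forwarded each time
        if status < 400:
            return status, body, index
    return None  # all options exhausted


def always(status: int, body: str) -> ModelCall:
    """Build a fake model that always answers with the given status and body."""
    return lambda request: (status, body)


result = run_with_fallback(
    "hello",
    primary=always(503, "down"),
    fallbacks=[always(429, "slow down"), always(200, "ok")],
)
print(result)  # (200, 'ok', 2)
```

In this sketch the primary and the first fallback both fail, so the response comes from the second fallback (attempt index 2).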

What triggers a fallback

Any HTTP status code of 400 or higher triggers a fallback, with one exception: 424 (Failed Dependency). Manifest itself returns 424 when the entire chain is exhausted, so excluding it prevents infinite loops. Common triggers include:

| Status | Example |
| --- | --- |
| 400 | Bad request |
| 401 | Authentication error |
| 403 | Forbidden |
| 429 | Rate limited |
| 500 | Internal server error |
| 502 | Bad gateway |
| 503 | Service unavailable |
| 529 | Provider overloaded |
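The trigger rule reduces to a one-line predicate. The sketch below restates the rule as described on this page; the names `triggers_fallback` and `CHAIN_EXHAUSTED_STATUS` are illustrative and not part of any Manifest API.

```python
# 424 is the status Manifest returns when the whole chain is exhausted,
# so it must not start another fallback round.
CHAIN_EXHAUSTED_STATUS = 424


def triggers_fallback(status: int) -> bool:
    """True for any HTTP status >= 400 except 424."""
    return status >= 400 and status != CHAIN_EXHAUSTED_STATUS


for status in (200, 400, 401, 403, 424, 429, 500, 502, 503, 529):
    print(status, triggers_fallback(status))
```

Only 200 and 424 print `False` in this list; every other status would move the request to the next model.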

Configuration

Fallback models are configured per tier in the Manifest dashboard. Each tier can have up to 5 fallback models, tried in order.
1. Open Routing in the dashboard
Navigate to Routing in the dashboard.

2. Select a tier
Click any tier (Simple, Standard, Complex, or Reasoning).

3. Add fallback models
Add up to 5 fallback models. Drag to reorder; models are tried from top to bottom.

Hung providers and the per-attempt timeout

A provider that opens a connection but never responds triggers a fallback once Manifest’s per-attempt timeout (default 180 seconds) elapses. The timeout surfaces as a synthetic 504 Gateway Timeout, and Manifest moves on to the next model in the chain. If your upstream client (e.g. an agent gateway) has its own timeout that fires at the same time or earlier, the client may disconnect first, and Manifest gives up before reaching a healthy fallback. On self-hosted installs, set PROVIDER_TIMEOUT_MS strictly below your client’s timeout so the fallback chain has room to run within the client’s window. See Self-hosted for the env var reference.
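A little arithmetic shows why the per-attempt timeout should sit well below the client timeout. The helpers below are a rough budgeting sketch with hypothetical names; they ignore network overhead and assume every hung attempt consumes the full per-attempt timeout.

```python
DEFAULT_PROVIDER_TIMEOUT_MS = 180_000  # default per-attempt timeout stated above (180 seconds)


def budget_after_hung_primary(
    client_timeout_ms: int,
    provider_timeout_ms: int = DEFAULT_PROVIDER_TIMEOUT_MS,
) -> int:
    """Milliseconds left for fallbacks after a hung primary times out."""
    return max(0, client_timeout_ms - provider_timeout_ms)


def hung_attempts_tolerated(client_timeout_ms: int, provider_timeout_ms: int) -> int:
    """Number of consecutive hung attempts that can time out strictly
    before the client's own timeout fires."""
    if provider_timeout_ms <= 0:
        raise ValueError("provider_timeout_ms must be positive")
    return max(0, (client_timeout_ms - 1) // provider_timeout_ms)


# Client and Manifest both at 180 s: the client gives up as the first timeout fires.
print(budget_after_hung_primary(180_000))           # 0
print(hung_attempts_tolerated(180_000, 180_000))    # 0

# PROVIDER_TIMEOUT_MS lowered to 60 s, client still at 180 s.
print(budget_after_hung_primary(180_000, 60_000))   # 120000
print(hung_attempts_tolerated(180_000, 60_000))     # 2
```

With the 60-second value, two hung models can time out and a third attempt still has about 60 seconds before the client disconnects.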

Response headers

When a fallback succeeds, the response carries X-Manifest-Fallback-From (the primary that failed) and X-Manifest-Fallback-Index (its position in the chain) on top of the standard routing headers. When the chain is exhausted, X-Manifest-Fallback-Exhausted: true is set and the request returns 424. Full table: Headers reference.
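On the client side, these headers can be inspected to log what happened. The sketch below uses only the header names given above; the function name, the example model name, and the assumption that the index header can be treated as an opaque string are illustrative. Header names are lowercased before lookup because HTTP header names are case-insensitive.

```python
from typing import Mapping


def describe_fallback(status: int, headers: Mapping[str, str]) -> str:
    """Summarise the fallback outcome from a response's status and headers."""
    h = {name.lower(): value for name, value in headers.items()}

    # Chain exhausted: Manifest sets the flag and returns 424.
    if h.get("x-manifest-fallback-exhausted", "").lower() == "true":
        return f"fallback chain exhausted (status {status})"

    failed_model = h.get("x-manifest-fallback-from")
    if failed_model is not None:
        index = h.get("x-manifest-fallback-index", "unknown")
        return f"fallback succeeded after {failed_model} failed (index {index})"

    return "no fallback headers present"


print(describe_fallback(200, {
    "X-Manifest-Fallback-From": "example-primary-model",  # hypothetical model name
    "X-Manifest-Fallback-Index": "1",
}))
print(describe_fallback(424, {"X-Manifest-Fallback-Exhausted": "true"}))
print(describe_fallback(200, {}))
```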

Fallback vs routing

| | Routing | Fallback |
| --- | --- | --- |
| When | Before the request is sent | After the request fails |
| Goal | Pick the cheapest capable model | Recover from a failure |
| Speed | < 2 ms scoring | Adds one extra round-trip per retry |
| Tier | Assigns a tier | Stays within the same tier |

Routing picks the model. Fallback catches it if that model is down.

Connect at least two providers. With a single provider, fallback can only switch between that provider’s models.