Fallback

What is fallback?

When a model fails (provider outage, rate limit, bad request), Manifest retries with a backup model from the same tier. Your agent gets a response instead of an error.

How it works

Request fails

The primary model returns an error (any HTTP 4xx or 5xx status other than 424).

Manifest selects a backup

Manifest picks the next fallback model from the tier’s fallback list. Fallback models are tried in the order you configure them.

Request is retried

The original request is forwarded to the backup model. If that model also fails, Manifest continues down the fallback list until a model succeeds or all options are exhausted.
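The three steps above can be sketched as a simple loop. This is a minimal illustration and not Manifest's actual implementation: `run_with_fallback`, the `(status, body)` return shape, and the model names are hypothetical, and each provider call is replaced by a plain Python function.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a provider call: takes a request, returns (status, body).
ModelCall = Callable[[str], Tuple[int, str]]

EXHAUSTED_STATUS = 424  # status returned once every model in the chain has failed


def run_with_fallback(
    request: str,
    primary: Tuple[str, ModelCall],
    fallbacks: List[Tuple[str, ModelCall]],
) -> Dict[str, object]:
    """Try the primary, then each fallback in configured order."""
    chain = [primary] + fallbacks
    for attempt, (name, call) in enumerate(chain, start=1):
        status, body = call(request)
        # Statuses below 400 are successes; 424 is passed through without a retry.
        if status < 400 or status == EXHAUSTED_STATUS:
            return {"status": status, "body": body, "model": name, "attempts": attempt}
    # Every model in the chain failed.
    return {
        "status": EXHAUSTED_STATUS,
        "body": "all models failed",
        "model": None,
        "attempts": len(chain),
    }


# Illustrative fake providers (assumptions, not real models).
def rate_limited(_request: str) -> Tuple[int, str]:
    return 429, "rate limited"


def healthy(request: str) -> Tuple[int, str]:
    return 200, f"echo: {request}"


result = run_with_fallback(
    "hello",
    ("primary", rate_limited),
    [("backup-a", rate_limited), ("backup-b", healthy)],
)
print(result["model"], result["attempts"])  # backup-b 3
```

The same request is forwarded unchanged on every attempt, so each failed model adds one extra round-trip before the caller sees a response.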

What triggers a fallback

Any HTTP status code >= 400 triggers a fallback, with one exception: 424 (Failed Dependency). Manifest itself returns 424 when the entire chain is exhausted, so excluding it prevents infinite loops. Statuses that trigger a fallback include:

Status	Example
400	Bad request
401	Authentication error
403	Forbidden
429	Rate limited
500	Internal server error
502	Bad gateway
503	Service unavailable
529	Provider overloaded
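The trigger rule reduces to a one-line predicate. The sketch below only restates the rule from this section; the function name `triggers_fallback` is hypothetical and not part of any Manifest API.

```python
EXHAUSTED_STATUS = 424  # returned by Manifest when the chain is exhausted


def triggers_fallback(status: int) -> bool:
    """True for any status >= 400 except 424, per the rule above."""
    return status >= 400 and status != EXHAUSTED_STATUS


for status in (200, 400, 401, 403, 424, 429, 500, 502, 503, 529):
    print(status, triggers_fallback(status))
```

Only 200 and 424 print `False` here; every other status in the loop triggers a fallback.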

Configuration

Fallback models are configured per tier in the Manifest dashboard. Each tier can have up to 5 fallback models, tried in order.

Open Routing in the dashboard

Navigate to Routing in the dashboard.

Select a tier

Click any tier (Simple, Standard, Complex, or Reasoning).

Add fallback models

Add up to 5 fallback models. Drag to reorder; models are tried from top to bottom.

Hung providers and the per-attempt timeout

A provider that opens a connection but never returns a response eventually triggers a fallback through Manifest’s per-attempt timeout (default 180 seconds). The timeout surfaces as a synthetic 504 Gateway Timeout, and Manifest moves on to the next model in the chain. If your upstream client (e.g. an agent gateway) has its own timeout that fires at the same time or earlier, the client may disconnect first, and Manifest will give up before reaching a healthy fallback. On self-hosted installs, set PROVIDER_TIMEOUT_MS strictly below your client’s timeout so the fallback chain has room to run within the client’s window. See Self-hosted for the env var reference.
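A rough way to reason about the budget is to count how many hung attempts can time out before the client gives up. The sketch below is back-of-the-envelope arithmetic under stated assumptions: it ignores network and scoring overhead, treats every failed attempt as consuming the full per-attempt timeout, and assumes PROVIDER_TIMEOUT_MS is expressed in milliseconds. Both function names are hypothetical.

```python
DEFAULT_PROVIDER_TIMEOUT_MS = 180_000  # the 180 s default from the text, in ms


def hung_attempts_before_client_timeout(client_timeout_ms: int, provider_timeout_ms: int) -> int:
    """How many hung attempts can time out strictly before the client disconnects."""
    if client_timeout_ms <= 0 or provider_timeout_ms <= 0:
        raise ValueError("timeouts must be positive")
    # Largest n such that n * provider_timeout_ms < client_timeout_ms.
    return (client_timeout_ms - 1) // provider_timeout_ms


def suggested_provider_timeout_ms(client_timeout_ms: int, hung_attempts_to_survive: int) -> int:
    """Per-attempt timeout that leaves one full slot for a healthy model
    after the given number of hung attempts (illustrative heuristic)."""
    if client_timeout_ms <= 0 or hung_attempts_to_survive < 1:
        raise ValueError("need a positive client timeout and at least one hung attempt")
    return client_timeout_ms // (hung_attempts_to_survive + 1)


# Client and Manifest both at 180 s: no hung attempt finishes before the client leaves.
print(hung_attempts_before_client_timeout(180_000, DEFAULT_PROVIDER_TIMEOUT_MS))  # 0

# Surviving two hung providers within a 180 s client window.
print(suggested_provider_timeout_ms(180_000, 2))  # 60000
print(hung_attempts_before_client_timeout(180_000, 60_000))  # 2
```

Treat the suggested value as an upper bound for a starting point: real requests also spend time on the attempt that finally succeeds, so a lower setting leaves more headroom.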

Response headers

When a fallback succeeds, the response carries X-Manifest-Fallback-From (the primary that failed) and X-Manifest-Fallback-Index (its position in the chain) on top of the standard routing headers. When the chain is exhausted, X-Manifest-Fallback-Exhausted: true is set and the request returns 424. Full table: Headers reference.
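A client can use these headers to tell the three outcomes apart. The sketch below works on a plain mapping of response headers, so it is independent of any HTTP library. The function name `describe_fallback` and the example header values are hypothetical; only the header names and the 424 status come from this page.

```python
from typing import Mapping


def describe_fallback(status: int, headers: Mapping[str, str]) -> str:
    """Summarize what happened to a request based on Manifest's fallback headers."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {name.lower(): value for name, value in headers.items()}

    if h.get("x-manifest-fallback-exhausted", "").lower() == "true":
        return f"fallback chain exhausted (status {status})"

    failed_primary = h.get("x-manifest-fallback-from")
    if failed_primary is not None:
        index = h.get("x-manifest-fallback-index", "unknown")
        return f"served by a fallback; primary {failed_primary} failed (index {index})"

    return "served by the primary model"


# Example values are made up for illustration.
print(describe_fallback(200, {}))
print(describe_fallback(200, {
    "X-Manifest-Fallback-From": "example-primary-model",
    "X-Manifest-Fallback-Index": "1",
}))
print(describe_fallback(424, {"X-Manifest-Fallback-Exhausted": "true"}))
```

Logging this summary alongside each request makes it easier to spot a primary model that is failing often enough to warrant reordering the tier.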

Fallback vs routing

	Routing	Fallback
When	Before the request is sent	After the request fails
Goal	Pick the cheapest capable model	Recover from a failure
Speed	< 2 ms scoring	Adds one extra round-trip per retry
Tier	Assigns a tier	Stays within the same tier

Routing picks the model. Fallback catches it if that model is down.

Connect at least two providers. With a single provider, fallback can only switch between that provider’s models.
