What is routing?
Instead of sending every request to the same expensive model, Manifest scores each query and routes it to the cheapest model that can handle it.
- Four tiers: simple, standard, complex, reasoning.
- Scoring happens in under 2 ms with zero external calls.
The four tiers
Simple
Greetings, definitions, short factual questions. Routed to the cheapest model.
Standard
General coding help, moderate questions. Good quality at low cost.
Complex
Multi-step tasks, large context, code generation. Best quality models.
Reasoning
Formal logic, proofs, math, multi-constraint problems. Reasoning-capable models only.
How scoring works
Manifest scores each request on 23 dimensions grouped into three categories:
- Keyword-based (14): scans the prompt for patterns like “prove”, “write function”, “what is”, etc.
- Structural (5): analyzes token count, nesting depth, code-to-prose ratio, conditional logic, and constraint density.
- Contextual (4): considers expected output length, repetition requests, tool count, and conversation depth.
Each dimension has a weight. The weighted sum maps to a tier via threshold boundaries. A confidence score (0–1) indicates how clearly the request fits its tier.
Session momentum
Manifest remembers the last 5 tier assignments (30-minute TTL). Short follow-up messages (“yes”, “do it”) inherit momentum from the conversation, so they don’t drop to a cheaper tier unnecessarily.
Tier overrides
Some signals force a minimum tier regardless of the score:

| Signal | Minimum tier |
|---|---|
| Tools detected | standard |
| Large context (>50k tokens) | complex |
| Formal logic keywords | reasoning |
Response headers
Every response includes these headers:

| Header | Description |
|---|---|
| X-Manifest-Tier | Assigned tier |
| X-Manifest-Model | Actual model used |
| X-Manifest-Provider | Provider (anthropic, openai, google, etc.) |
| X-Manifest-Confidence | Scoring confidence (0–1) |
| X-Manifest-Reason | Why this tier was selected |
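
As a minimal sketch of how a client might read these headers, the helper below turns a response's header mapping into a typed record. The `RoutingInfo` class and `parse_routing_headers` function are hypothetical names introduced for illustration, and the sample values in the usage note are made up; only the header names come from the table above.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RoutingInfo:
    tier: str
    model: str
    provider: str
    confidence: float
    reason: str


def parse_routing_headers(headers: Mapping[str, str]) -> RoutingInfo:
    """Extract the X-Manifest-* routing headers from a response header mapping."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RoutingInfo(
        tier=lowered["x-manifest-tier"],
        model=lowered["x-manifest-model"],
        provider=lowered["x-manifest-provider"],
        confidence=float(lowered["x-manifest-confidence"]),
        reason=lowered["x-manifest-reason"],
    )
```

Called with a mapping such as `{"X-Manifest-Tier": "standard", "X-Manifest-Model": "example-model", "X-Manifest-Provider": "anthropic", "X-Manifest-Confidence": "0.82", "X-Manifest-Reason": "tools detected"}`, it returns a `RoutingInfo` whose `confidence` is the float `0.82`. A missing header raises `KeyError`, which seems reasonable given that the headers are documented as present on every response.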
Cloud vs Local
- Cloud
- Local
Routing is performed server-side. Model mappings are managed by the Manifest team and updated regularly.