

Manifest exposes both OpenAI- and Anthropic-format endpoints on a single proxy. Point your client at the Manifest base URL, send manifest/auto as the model, and routing picks the real model behind the scenes.

Base URL

Mode         URL
Cloud        https://app.manifest.build
Self-hosted  http://localhost:2099 (or your custom port)

Authentication

Every request requires a Manifest agent key:
Authorization: Bearer mnfst_YOUR_KEY_HERE
Generate a key from the dashboard’s Agents page. Keys always start with mnfst_.

Endpoints

Method  Path                  Format            Use it for
POST    /v1/chat/completions  OpenAI            Most clients (OpenAI SDK, LangChain, Vercel AI SDK, custom HTTP)
POST    /v1/responses         OpenAI Responses  Codex, *-pro, o1-pro, deep-research models
POST    /v1/messages          Anthropic         Anthropic SDK, Claude Code, anything that speaks the Messages API
The proxy translates between formats internally, so you can send an OpenAI-shaped request and Manifest will reshape it before forwarding to an Anthropic-only model. The reverse works too.
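
There is no /v1/responses example below, so here is a minimal sketch. It assumes the request body follows OpenAI's Responses API shape (an input field in place of messages); the exact fields Manifest accepts on this path aren't listed on this page.

# Sketch: Responses-format request ("input" instead of "messages")
curl -X POST http://localhost:2099/v1/responses \
  -H "Authorization: Bearer mnfst_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "manifest/auto",
    "input": "What is the capital of France?"
  }'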

Chat completions

curl -X POST http://localhost:2099/v1/chat/completions \
  -H "Authorization: Bearer mnfst_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "manifest/auto",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
The body is forwarded verbatim to the resolved provider, with the model field rewritten to the actual model ID. All standard OpenAI fields (temperature, max_tokens, tools, tool_choice, response_format, stream, etc.) pass through, as in the sketch below.
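
The field values here are illustrative; any standard field the upstream model accepts rides along unchanged:

curl -X POST http://localhost:2099/v1/chat/completions \
  -H "Authorization: Bearer mnfst_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "manifest/auto",
    "temperature": 0.2,
    "max_tokens": 256,
    "response_format": {"type": "json_object"},
    "messages": [
      {"role": "user", "content": "Return the capital of France as JSON."}
    ]
  }'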

Anthropic messages

curl -X POST http://localhost:2099/v1/messages \
  -H "Authorization: Bearer mnfst_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "manifest/auto",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'

Streaming

Set "stream": true to get an SSE stream back. The stream format matches the upstream protocol: OpenAI-style data: {...} chunks for /v1/chat/completions, Anthropic event blocks for /v1/messages. Routing and fallback both work with streams. If the primary model fails before the first chunk, the request restarts on the fallback. If it fails mid-stream, the connection closes. There’s no silent mid-stream retry.

Errors

The proxy returns a standard JSON error envelope:
{
  "error": {
    "message": "Limit exceeded: cost usage ($1.23) exceeds $1.00 per day",
    "type": "limit_exceeded",
    "code": 429
  }
}
Status  Meaning
401     Invalid or missing Authorization header
402     Provider requires payment / quota exceeded on the upstream
424     Fallback chain exhausted (all configured models failed)
429     Hard limit hit, or Manifest rate limit (THROTTLE_LIMIT) tripped
5xx     Upstream provider error (triggers fallback)
Status 424 is the only one that never triggers a fallback: Manifest returns it itself once the chain is exhausted, so re-routing it would loop forever.
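
Since 424 and 429 call for different client behavior (give up versus back off), here is a minimal shell sketch for branching on the status; the per-status handling is illustrative:

# Capture the HTTP status and the body separately
status=$(curl -s -o /tmp/manifest_response.json -w "%{http_code}" \
  -X POST http://localhost:2099/v1/chat/completions \
  -H "Authorization: Bearer mnfst_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{"model": "manifest/auto", "messages": [{"role": "user", "content": "Hello"}]}')

case "$status" in
  200) cat /tmp/manifest_response.json ;;                       # success
  429) echo "Limit hit; retry after a backoff" ;;               # rate or hard limit
  424) echo "Fallback chain exhausted; do not retry blindly" ;;
  *)   echo "Error $status"; cat /tmp/manifest_response.json ;;
esac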

Rate limits

Self-hosted instances default to 100 requests per 60 seconds per agent. Override with THROTTLE_TTL and THROTTLE_LIMIT (see Environment variables). Cloud rate limits are tied to your plan and shown in the dashboard.
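
A sketch of those overrides in a self-hosted .env; the semantics assumed here (THROTTLE_TTL is the window length in seconds, THROTTLE_LIMIT the request count per window) mirror the 100-per-60-seconds default above:

# Allow 200 requests per 30-second window per agent (assumed semantics)
THROTTLE_TTL=30
THROTTLE_LIMIT=200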

Response headers

Every response carries routing headers, so your client can see which model handled the request, the assigned tier, and the routing confidence without parsing the response body.
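
The exact header names aren't listed on this page; to see what your instance returns, dump the response headers with curl:

# -D - writes received headers to stdout; -o /dev/null discards the body
curl -s -D - -o /dev/null -X POST http://localhost:2099/v1/chat/completions \
  -H "Authorization: Bearer mnfst_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{"model": "manifest/auto", "messages": [{"role": "user", "content": "Hello"}]}'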