Manifest exposes both OpenAI-format and Anthropic-format endpoints on one proxy. Point your client at the Manifest URL, send auto as the model, and routing picks the real model behind the scenes.

Base URL

| Mode | URL |
| --- | --- |
| Cloud | https://app.manifest.build |
| Self-hosted | http://localhost:2099 (or your custom port) |

Authentication

Every request requires a Manifest agent key:
Authorization: Bearer mnfst_YOUR_KEY_HERE
Generate a key from the dashboard’s Agents page. Keys always start with mnfst_.

Endpoints

| Method | Path | Format | Use it for |
| --- | --- | --- | --- |
| POST | /v1/chat/completions | OpenAI | Most clients (OpenAI SDK, LangChain, Vercel AI SDK, custom HTTP) |
| POST | /v1/responses | OpenAI Responses | Codex, *-pro, o1-pro, deep-research models |
| POST | /v1/messages | Anthropic | Anthropic SDK, Claude Code, anything that speaks the Messages API |
| GET | /v1/models | OpenAI | Listing the models your agent can route to |

The proxy translates between formats internally, so you can send an OpenAI-shaped request and Manifest will reshape it before forwarding to an Anthropic-only model. The reverse works too.

Chat completions

curl -X POST http://localhost:2099/v1/chat/completions \
  -H "Authorization: Bearer mnfst_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
The body is forwarded verbatim to the resolved provider, with model rewritten to the actual model ID. All standard OpenAI fields (temperature, max_tokens, tools, tool_choice, response_format, stream, etc.) pass through.
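
The same request can be built from Python's standard library. This is a minimal sketch, assuming the self-hosted base URL and a placeholder key; the helper names `build_chat_request` and `send` are hypothetical and not part of Manifest.

```python
import json
import urllib.request

BASE_URL = "http://localhost:2099"  # self-hosted default; Cloud uses https://app.manifest.build
API_KEY = "mnfst_YOUR_KEY_HERE"     # placeholder agent key


def build_chat_request(prompt, model="auto", **extra):
    """Build (but do not send) a POST to /v1/chat/completions.

    Hypothetical helper: extra keyword arguments such as temperature or
    max_tokens are merged into the JSON body unchanged.
    """
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    body.update(extra)
    return urllib.request.Request(
        BASE_URL + "/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + API_KEY,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request):
    """Send a prepared request and decode the JSON reply (needs a running instance)."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Calling `send(build_chat_request("What is the capital of France?", temperature=0.2))` against a running instance should behave like the curl command above, with `temperature` passed through as an ordinary OpenAI field.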

Anthropic messages

curl -X POST http://localhost:2099/v1/messages \
  -H "Authorization: Bearer mnfst_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "auto",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
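
A rough Python equivalent of the Anthropic-format call, using only the standard library. The helper names are hypothetical, and `extract_text` assumes the reply follows the usual Messages API shape (a `content` list of typed blocks), which this page does not spell out.

```python
import json
import urllib.request

BASE_URL = "http://localhost:2099"  # self-hosted default from the Base URL table
API_KEY = "mnfst_YOUR_KEY_HERE"     # placeholder agent key


def build_messages_request(prompt, max_tokens=1024, model="auto"):
    """Build (but do not send) a POST to /v1/messages (hypothetical helper)."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        BASE_URL + "/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + API_KEY,
            "Content-Type": "application/json",
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )


def extract_text(message):
    """Join the text blocks of an Anthropic-shaped response body.

    Assumption: text blocks look like {"type": "text", "text": "..."};
    blocks of any other type are ignored.
    """
    blocks = message.get("content", [])
    return "".join(b.get("text", "") for b in blocks if b.get("type") == "text")
```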

Listing models

GET /v1/models returns the models your agent can reach, in OpenAI format. The first entry is always auto (routing); the rest are the real model IDs from your connected providers.
curl http://localhost:2099/v1/models \
  -H "Authorization: Bearer mnfst_YOUR_KEY_HERE"
Send auto to let Manifest route, or send any listed model ID to skip routing and go straight to that provider. See Routing → Route a specific model.
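
A client might use the listing to decide between routing and pinning a model. The sketch below assumes the usual OpenAI list shape (`{"data": [{"id": ...}, ...]}`); the function names are hypothetical, and the IDs in the usage note are invented for illustration.

```python
def split_models(listing):
    """Return (has_auto, concrete_ids) from an OpenAI-format model list."""
    ids = [entry["id"] for entry in listing.get("data", [])]
    concrete = [model_id for model_id in ids if model_id != "auto"]
    return ("auto" in ids, concrete)


def choose_model(listing, preferred):
    """Use `preferred` when the agent can reach it, otherwise fall back to "auto"."""
    _, concrete = split_models(listing)
    return preferred if preferred in concrete else "auto"
```

For example, given a listing whose entries are `auto`, `example-model-a`, and `example-model-b`, `choose_model(listing, "example-model-b")` returns that ID, while an unlisted ID yields `"auto"` so the request is routed instead of rejected.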

Streaming

Set "stream": true to get an SSE stream back. The stream format matches the upstream protocol: OpenAI-style data: {...} chunks for /v1/chat/completions, Anthropic event blocks for /v1/messages. Routing and fallback both work with streams. If the primary model fails before the first chunk, the request restarts on the fallback. If it fails mid-stream, the connection closes. There’s no silent mid-stream retry.
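
Because a mid-stream failure simply closes the connection, a client may want to tell a complete stream from a truncated one. The sketch below parses OpenAI-style chunks from /v1/chat/completions; it assumes the stream ends with the conventional `data: [DONE]` sentinel, which this page does not confirm, and `read_stream` is a hypothetical helper.

```python
import json


def read_stream(lines):
    """Return (text, completed) from an iterable of decoded SSE lines.

    completed is False when the input ends without "data: [DONE]", which is
    how a connection closed mid-stream would look to this parser.
    """
    parts = []
    completed = False
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            completed = True
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts), completed
```

When `completed` comes back False, the caller decides whether to discard the partial text or reissue the request, since the proxy does not retry mid-stream.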

Errors

The proxy returns a standard JSON error envelope:
{
  "error": {
    "message": "Limit exceeded: cost usage ($1.23) exceeds $1.00 per day",
    "type": "limit_exceeded",
    "code": 429
  }
}

| Status | Meaning |
| --- | --- |
| 401 | Invalid or missing Authorization header |
| 402 | Provider requires payment / quota exceeded on the upstream |
| 424 | Fallback chain exhausted (all configured models failed) |
| 429 | Hard limit hit, or Manifest rate limit (THROTTLE_LIMIT) tripped |
| 5xx | Upstream provider error (triggers fallback) |

Status 424 is the only one that does not trigger a fallback. Manifest returns it itself when the chain is exhausted, so re-routing it would loop forever.
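
On the client side, the envelope and status code can be folded into one record for logging. This is a sketch only: `parse_error` is a hypothetical helper, the meanings are copied from the status table, and a non-JSON body is kept as the raw message.

```python
import json

# Meanings copied from the status table; 5xx is handled as a range below.
STATUS_MEANINGS = {
    401: "Invalid or missing Authorization header",
    402: "Provider requires payment / quota exceeded on the upstream",
    424: "Fallback chain exhausted (all configured models failed)",
    429: "Hard limit hit, or Manifest rate limit tripped",
}


def parse_error(status, body_text):
    """Combine an HTTP status and the JSON error envelope into one dict."""
    try:
        parsed = json.loads(body_text)
    except ValueError:
        parsed = None  # body was not JSON
    err = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(err, dict):
        err = {}
    if 500 <= status <= 599:
        meaning = "Upstream provider error"
    else:
        meaning = STATUS_MEANINGS.get(status, "Unknown status")
    return {
        "status": status,
        "meaning": meaning,
        "type": err.get("type"),
        "message": err.get("message", body_text),
    }
```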

Rate limits

Self-hosted instances default to 100 requests per 60 seconds per agent. Override with THROTTLE_TTL and THROTTLE_LIMIT (see Environment variables). Cloud rate limits are tied to your plan and shown in the dashboard.
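
A client can stay under the default by throttling itself before it ever receives a 429. The class below is a hypothetical client-side sliding-window limiter set to 100 requests per 60 seconds; it is not how Manifest enforces the limit, which this page does not describe.

```python
import collections
import time


class SlidingWindowLimiter:
    """Allow at most `limit` calls within any `window`-second span."""

    def __init__(self, limit=100, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.sent = collections.deque()  # timestamps of recent accepted calls

    def try_acquire(self):
        """Return True and record the call if it fits in the window, else False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) >= self.limit:
            return False
        self.sent.append(now)
        return True
```

Call `try_acquire()` before each request and wait or queue the request when it returns False. If you change THROTTLE_LIMIT on the server, pass a matching `limit` here.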

Response headers

Every response carries routing headers so your client can see which model and tier handled the request, without parsing the response body.