# API

> The Manifest proxy speaks both the OpenAI and Anthropic API formats. This page covers endpoints, authentication, streaming, and error responses.

Manifest exposes both OpenAI- and Anthropic-format endpoints on one proxy. Point your client at the Manifest URL, send `auto` as the model, and routing picks the real model behind the scenes.

## Base URL

| Mode        | URL                                           |
| ----------- | --------------------------------------------- |
| Cloud       | `https://app.manifest.build`                  |
| Self-hosted | `http://localhost:2099` (or your custom port) |

## Authentication

Every request requires a Manifest agent key:

```http
Authorization: Bearer mnfst_YOUR_KEY_HERE
```

Generate a key from the dashboard's **Agents** page. Keys always start with `mnfst_`.

## Endpoints

| Method | Path                   | Format           | Use it for                                                        |
| ------ | ---------------------- | ---------------- | ----------------------------------------------------------------- |
| `POST` | `/v1/chat/completions` | OpenAI           | Most clients (OpenAI SDK, LangChain, Vercel AI SDK, custom HTTP)  |
| `POST` | `/v1/responses`        | OpenAI Responses | Codex, `*-pro`, `o1-pro`, deep-research models                    |
| `POST` | `/v1/messages`         | Anthropic        | Anthropic SDK, Claude Code, anything that speaks the Messages API |
| `GET`  | `/v1/models`           | OpenAI           | Listing the models your agent can route to                        |

The proxy translates between formats internally, so you can send an OpenAI-shaped request and Manifest will reshape it before forwarding to an Anthropic-only model. The reverse works too.

## Chat completions

```bash
curl -X POST http://localhost:2099/v1/chat/completions \
  -H "Authorization: Bearer mnfst_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [
      {"role": "user", "content": "What is the capital of France?"}
    ]
  }'
```

The body is forwarded to the resolved provider as-is, except that `model` is rewritten to the actual model ID. All standard OpenAI fields (`temperature`, `max_tokens`, `tools`, `tool_choice`, `response_format`, `stream`, etc.) pass through.
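For clients that don't use an SDK, the same call can be sketched in Python with only the standard library. The helper names (`build_chat_request`, `send`) are illustrative and not part of Manifest, the base URL assumes the self-hosted default, and the reply is assumed to follow the usual OpenAI `choices[0].message.content` layout.

```python
import json
import urllib.request

BASE_URL = "http://localhost:2099"  # self-hosted default; swap in your own URL


def build_chat_request(agent_key: str, prompt: str, **options) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/chat/completions."""
    body = {
        "model": "auto",  # let Manifest routing pick the real model
        "messages": [{"role": "user", "content": prompt}],
        **options,  # extra OpenAI fields, e.g. temperature=0.2, max_tokens=256
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {agent_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (raises HTTPError on 4xx/5xx)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as `send(build_chat_request("mnfst_YOUR_KEY_HERE", "What is the capital of France?", temperature=0.2))` would then return the decoded completion. Keeping the builder separate from the network call makes the request easy to inspect or unit-test.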

## Anthropic messages

```bash
curl -X POST http://localhost:2099/v1/messages \
  -H "Authorization: Bearer mnfst_YOUR_KEY_HERE" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "auto",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello"}
    ]
  }'
```
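The Anthropic-format call differs from the OpenAI one mainly in the path, the `anthropic-version` header, and the `max_tokens` field, which the Messages API requires. A rough standard-library Python equivalent follows; `build_messages_request` and `reply_text` are hypothetical helper names, and the reply is assumed to use the standard Messages API shape (a `content` list of typed blocks).

```python
import json
import urllib.request


def build_messages_request(
    agent_key: str,
    prompt: str,
    max_tokens: int = 1024,
    base_url: str = "http://localhost:2099",  # self-hosted default
) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/messages."""
    body = {
        "model": "auto",  # let Manifest routing pick the real model
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {agent_key}",
            "Content-Type": "application/json",
            "anthropic-version": "2023-06-01",
        },
        method="POST",
    )


def reply_text(reply: dict) -> str:
    """Join the text blocks of a Messages-style reply, skipping other block types."""
    return "".join(
        block.get("text", "")
        for block in reply.get("content", [])
        if block.get("type") == "text"
    )
```

Send the request with `urllib.request.urlopen(...)`, decode the body as JSON, and pass it to `reply_text`.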

## Listing models

`GET /v1/models` returns the models your agent can reach, in OpenAI format. The first entry is always `auto` (routing); the rest are the real model IDs from your connected providers.

```bash
curl http://localhost:2099/v1/models \
  -H "Authorization: Bearer mnfst_YOUR_KEY_HERE"
```

Send `auto` to let Manifest route, or send any listed model ID to skip routing and go straight to that provider. See [Routing → Route a specific model](/routing#route-a-specific-model).

## Streaming

Set `"stream": true` to get an SSE stream back. The stream format follows the endpoint you call: OpenAI-style `data: {...}` chunks for `/v1/chat/completions`, Anthropic event blocks for `/v1/messages`.

Routing and fallback both work with streams. If the primary model fails before the first chunk, the request restarts on the fallback. If it fails mid-stream, the connection closes. There's no silent mid-stream retry.
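As a minimal sketch of consuming the OpenAI-style stream, the generator below pulls text deltas out of `data: {...}` lines. It assumes the usual OpenAI chunk layout (`choices[].delta.content`) and the `[DONE]` sentinel that OpenAI's own API sends; whether Manifest forwards that sentinel is an assumption here. Because a mid-stream failure closes the connection, a caller would see the iterator end without `[DONE]`.

```python
import json
from typing import Iterable, Iterator


def iter_content_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:  # role-only and empty deltas carry no text
                yield text
```

With a live response, decode each line first, for example `iter_content_deltas(l.decode("utf-8") for l in response)`, and join or print the fragments as they arrive.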

## Errors

The proxy returns a standard JSON error envelope:

```json
{
  "error": {
    "message": "Limit exceeded: cost usage ($1.23) exceeds $1.00 per day",
    "type": "limit_exceeded",
    "code": 429
  }
}
```

| Status | Meaning                                                                          |
| ------ | -------------------------------------------------------------------------------- |
| `401`  | Invalid or missing `Authorization` header                                        |
| `402`  | Provider requires payment / quota exceeded on the upstream                       |
| `424`  | Fallback chain exhausted (all configured models failed)                          |
| `429`  | [Hard limit](/set-limits) hit, or Manifest rate limit (`THROTTLE_LIMIT`) tripped |
| `5xx`  | Upstream provider error (triggers [fallback](/fallback))                         |

Status `424` is the only one that **does not** trigger a fallback. Manifest returns it itself when the chain is exhausted, so re-routing it would loop forever.
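A client can combine the HTTP status with the JSON envelope to produce a readable error. The following is a sketch only: `ProxyError`, `parse_error`, and the wording in `STATUS_MEANINGS` are illustrative, with the meanings paraphrased from the status table above.

```python
import json
from typing import NamedTuple

# Meanings paraphrased from the status table; 5xx is handled as a range.
STATUS_MEANINGS = {
    401: "invalid or missing Authorization header",
    402: "upstream provider requires payment or quota exceeded",
    424: "fallback chain exhausted",
    429: "hard limit hit or Manifest rate limit tripped",
}


class ProxyError(NamedTuple):
    status: int
    type: str
    message: str
    meaning: str


def parse_error(status: int, body: str) -> ProxyError:
    """Turn an HTTP status plus a JSON error body into a ProxyError."""
    try:
        error = json.loads(body).get("error", {})
    except (json.JSONDecodeError, AttributeError):
        error = {}  # body was not the expected JSON object
    if not isinstance(error, dict):
        error = {}
    if 500 <= status <= 599:
        meaning = "upstream provider error"
    else:
        meaning = STATUS_MEANINGS.get(status, "unrecognized status")
    return ProxyError(status, error.get("type", "unknown"), error.get("message", ""), meaning)
```

With `urllib`, this would typically be called from an `except urllib.error.HTTPError as exc:` handler as `parse_error(exc.code, exc.read().decode("utf-8"))`.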

## Rate limits

Self-hosted instances default to **100 requests per 60 seconds** per agent. Override with `THROTTLE_TTL` and `THROTTLE_LIMIT` ([Environment variables](/reference/environment-variables)).

Cloud rate limits are tied to your plan and shown in the dashboard.
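To avoid tripping the limit in the first place, a client can throttle itself. The class below is a generic sliding-window limiter sized to the self-hosted default of 100 requests per 60 seconds. It is a client-side sketch, not Manifest's implementation, and it assumes nothing about how the server counts its own window.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any rolling `window_seconds` span."""

    def __init__(
        self,
        limit: int = 100,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,  # injectable for tests
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._sent: deque[float] = deque()  # timestamps of recent requests

    def try_acquire(self) -> bool:
        """Record a request and return True if there is room, else return False."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._limit:
            self._sent.append(now)
            return True
        return False
```

Call `try_acquire()` before each request and wait briefly when it returns `False`. If you override `THROTTLE_LIMIT` or `THROTTLE_TTL` on the server, adjust the constructor arguments to match.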

## Response headers

Every response carries [routing headers](/reference/headers) so your client can see which model and tier handled the request, without parsing the response body.