

If your endpoint speaks OpenAI or Anthropic, Manifest can route to it. Useful for self-hosted inference servers (vLLM, TGI, LocalAI), internal endpoints behind your VPN, or providers that aren’t on the built-in list yet.

Compatible servers

Any server exposing one of these endpoints works out of the box:
| Format | Endpoint |
| --- | --- |
| OpenAI-compatible | POST /v1/chat/completions |
| Anthropic-compatible | POST /v1/messages |
Common servers that speak one of these formats include vLLM, TGI, LocalAI, Xinference, and OpenLLM.
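
If you're unsure which format your server speaks, the two request shapes below may help you check. This is a minimal sketch: the base URL, model name, and use of the requests library are illustrative, not Manifest-specific.

```python
import requests

BASE_URL = "https://my-vllm.internal:8000/v1"  # placeholder endpoint

# OpenAI-compatible format: POST /v1/chat/completions
openai_resp = requests.post(
    f"{BASE_URL}/chat/completions",
    json={
        "model": "my-model",  # an ID your server reports
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=30,
)

# Anthropic-compatible format: POST /v1/messages
anthropic_resp = requests.post(
    f"{BASE_URL}/messages",
    json={
        "model": "my-model",
        "max_tokens": 256,  # required by the Anthropic messages format
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=30,
)

print(openai_resp.status_code, anthropic_resp.status_code)
```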

Add a custom provider

1. Open the Routing page. Navigate to Routing in the dashboard and click Add custom provider.

2. Enter the base URL. Paste the base URL of your endpoint, e.g. https://my-vllm.internal:8000/v1. Manifest normalizes the trailing /v1 automatically.

3. Pick the protocol. Choose OpenAI (/v1/chat/completions) or Anthropic (/v1/messages), whichever your server speaks.

4. Add credentials (optional). If the endpoint requires authentication, paste an API key. It’s sent as Authorization: Bearer <key> for OpenAI-format endpoints, or x-api-key: <key> for Anthropic-format endpoints. (The sketch after this list illustrates the URL normalization and header choice.)

5. Probe for models. Manifest calls GET /v1/models against your base URL and lists every model the endpoint reports. Pin one to a complexity tier and you’re routed.
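
For concreteness, here is a sketch of the behavior described in steps 2 and 4. The function names are hypothetical, not Manifest internals, and the sketch assumes "normalizes the trailing /v1" means stripping it so route paths can be appended uniformly.

```python
def normalize_base_url(url: str) -> str:
    # Hypothetical helper: strip a trailing slash and a trailing /v1
    # so endpoint paths can be appended uniformly (assumed behavior).
    url = url.rstrip("/")
    if url.endswith("/v1"):
        url = url[: -len("/v1")]
    return url


def auth_headers(protocol: str, api_key: str | None) -> dict[str, str]:
    # Header choice per step 4: Bearer for OpenAI-format endpoints,
    # x-api-key for Anthropic-format endpoints.
    if not api_key:
        return {}
    if protocol == "openai":
        return {"Authorization": f"Bearer {api_key}"}
    return {"x-api-key": api_key}


assert normalize_base_url("https://my-vllm.internal:8000/v1/") == "https://my-vllm.internal:8000"
assert auth_headers("anthropic", "sk-123") == {"x-api-key": "sk-123"}
```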

Model discovery

Manifest discovers models by hitting GET <base_url>/v1/models and reading the data[].id field. If your server doesn’t expose /v1/models, you can register models manually from the same panel.
Older builds of llama.cpp (pre-b3800) don’t expose /v1/models. Either upgrade llama-server or register models by hand.
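
As a sketch of what the discovery call looks like from the outside (the function name is hypothetical, and the base URL is assumed to already be normalized without the trailing /v1):

```python
import requests

def discover_models(base_url: str, headers: dict[str, str] | None = None) -> list[str]:
    # Probe GET <base_url>/v1/models and read data[].id, mirroring the
    # discovery behavior described above.
    resp = requests.get(f"{base_url}/v1/models", headers=headers or {}, timeout=10)
    resp.raise_for_status()
    return [entry["id"] for entry in resp.json().get("data", [])]

# e.g. discover_models("https://my-vllm.internal:8000")
```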

Security

User-supplied URLs are revalidated on every request to defend against SSRF. Manifest blocks resolution to private IP ranges (10.x, 192.168.x, 127.x, link-local, etc.) unless the request originates from a self-hosted instance on the same network.
Custom providers run with your Manifest instance’s network access. If you expose Manifest publicly, anyone with a valid agent key can route requests through any custom provider you’ve added. Gate access accordingly.
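
Manifest's actual guard isn't shown here, but a resolve-then-check pattern like the following captures the idea: resolve the host on every request and reject private, loopback, and link-local destinations.

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_public_destination(url: str) -> bool:
    # Illustrative SSRF check, not Manifest's implementation: resolve
    # the hostname fresh on each request, then reject any address in a
    # private, loopback, or link-local range.
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return False
    for info in infos:
        ip_str = info[4][0].split("%")[0]  # drop IPv6 scope IDs
        ip = ipaddress.ip_address(ip_str)
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            return False
    return True
```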

Cost tracking

Manifest can’t infer pricing for unknown models. Custom-provider requests show up in the dashboard with cost = 0 and model = <your-id>. Token counts and latency are still recorded, so hard limits on token volume still work.
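
To make the last point concrete, here is a hypothetical record shape (field names are illustrative, not Manifest's schema) showing why a token-volume limit still applies when cost is zero:

```python
# Hypothetical shape of a logged custom-provider request.
request_log = {
    "model": "my-custom-model",  # the ID you registered
    "cost": 0.0,                 # pricing unknown for custom models
    "input_tokens": 412,         # still recorded
    "output_tokens": 96,         # still recorded
    "latency_ms": 830,           # still recorded
}

# A hard limit on token volume works regardless of cost:
TOKEN_BUDGET = 1_000_000
used = request_log["input_tokens"] + request_log["output_tokens"]
assert used <= TOKEN_BUDGET
```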