Local model providers run entirely on your own hardware. Manifest detects the running server, fetches the model list, and routes requests to
http://localhost:<port> like any other provider. No API key, no network egress, no per-token cost.
Supported runtimes
| Runtime | Default port | Install |
|---|---|---|
| Ollama | 11434 | ollama.com/download |
| LM Studio | 1234 | lmstudio.ai |
| llama.cpp | 8080 | llama.cpp build guide |
All three runtimes expose an OpenAI-compatible API at /v1/chat/completions and accept any GGUF model file.
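For a quick sanity check, you can call that endpoint directly. The sketch below uses only Python's standard library and assumes an Ollama server on its default port with a model named llama3.1 already pulled; adjust the port and model name to your setup.

```python
# Minimal sketch: send a chat completion to a local OpenAI-compatible server.
# Assumes Ollama on its default port (11434) and a locally pulled "llama3.1";
# swap in 1234 (LM Studio) or 8080 (llama.cpp) and your own model name.
import json
import urllib.request

payload = {
    "model": "llama3.1",
    "messages": [{"role": "user", "content": "Say hello in five words."}],
}

req = urllib.request.Request(
    "http://localhost:11434/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},  # no API key required
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

print(body["choices"][0]["message"]["content"])
```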
Start the server
- Ollama: run `ollama serve` (the desktop app also keeps a server running).
- LM Studio: start the local server from the Developer tab, or run `lms server start` from the CLI.
- llama.cpp: run `llama-server -m <model>.gguf`, which serves on port 8080 by default.
Connect to Manifest
Confirm the server is reachable
Manifest probes http://localhost:<default-port>/v1/models. If the probe succeeds, every loaded model appears for routing.
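To run the same probe by hand, here is a minimal sketch (assuming Ollama's default port; substitute 1234 for LM Studio or 8080 for llama.cpp):

```python
# List the models a local OpenAI-compatible server reports at /v1/models.
# 11434 assumes Ollama; LM Studio defaults to 1234 and llama.cpp to 8080.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:11434/v1/models") as resp:
    models = json.load(resp)

# OpenAI-style responses nest the model entries under "data".
for model in models.get("data", []):
    print(model["id"])
```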
Pin a model to a tier
Open any complexity tier and pick a local model as the primary. You can mix local and cloud models in the same fallback chain.
Running Manifest in Docker
If you self-host Manifest in Docker, the container can’t reach a local server bound to 127.0.0.1 on the host. Two of the three runtimes default to loopback and need an explicit override:
- LM Studio
- llama.cpp
- Ollama
Either flip the GUI toggle (LM Studio → ⚙ Developer → Serve on Local Network) or pass `--bind` when starting the server from the CLI; LM Studio remembers the last `--bind`, so this is one-time setup.
Inside the Manifest container, the host is reachable as host.docker.internal. Manifest sets this automatically when probing local providers.
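To verify that path yourself from inside the container, a quick sketch (again assuming Ollama's default port; Manifest performs this substitution on its own):

```python
# From inside a container, the host's loopback-bound server is reached via
# host.docker.internal instead of localhost. Port 11434 assumes Ollama.
import json
import urllib.request

url = "http://host.docker.internal:11434/v1/models"
try:
    with urllib.request.urlopen(url, timeout=5) as resp:
        print("reachable:", [m["id"] for m in json.load(resp)["data"]])
except OSError as exc:  # URLError is a subclass of OSError
    print("not reachable from this container:", exc)
```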
Cost & privacy
| Aspect | Local |
|---|---|
| API cost | $0. The model runs on your hardware. |
| Network egress | None. Requests never leave the machine. |
| Cost in dashboard | Recorded as 0. Token counts and latency are still tracked. |
| Pricing data | Not applicable. Local providers are excluded from pricing sync. |