Use the right AI model for every query
With Manifest, find the perfect AI model for your use case and get it set up as it should be.
- Smart routing
- Model benchmarking
- Limits and notifications
Don't waste money on the wrong model
Many agents and apps are using a single model to perform all their queries. This is inefficient as top-tier, thus expensive models are used to perform simple requests. Manifest is the tool you wish you had that helps you get the most cost-efficient AI inference.
Routing
Define routing rules upfront, isolating queries and treating them separately, or use our smart router that analyses queries and routes them on the fly based on their complexity or specificity.
Benchmarking
Not satisfied with your current value for money? Replay past queries against alternative models and compare quality, price and latency side-by-side. Choose the model that fits best.
Control
Visualize each dollar spent in real time. See how much you spend and where. Set soft and hard limits to control your consumption.
Plug in every kind of provider
Simultaneously use different kinds of model or inference providers based on the query. Manifest does not limit you to a restricted list of providers.
API key providers
Bring your own key from all the main providers to get instant access to all their models. Pay by the usage directly to the provider.
Subscription providers
Already paying for a monthly subscription? Connect it to Manifest to use those quotas first, fallback to pay-as-you-go only when limits are exceeded.
Custom providers
Plug in any OpenAI-compatible or Anthropic-compatible provider that exists out there! We don't limit you to the providers that we know only.
Local models
Run open-weight models at home! Manifest handles Ollama, LM Studio and llama.cpp as first-class providers so you can run them on your own infrastructure.
Built for AI apps, and your own AI workflows
Without Manifest, optimizing inference efficiency for AI apps can be incredibly difficult. That's why builders often skip it and end up spending more than they should. Manifest is here to help.
- Replay past queries to optimize them
- Set up fallback models and providers
- Choose the right model beforehand
Personal agents like OpenClaw or Hermes are known to produce surprisingly high AI bills. Any automation can potentially drain your whole budget. Use Manifest to reduce those costs.
- Connect your personal subscriptions
- Set up budget limits and notifications
- Use our curated free model list
Don't let coding tools lock you into their providers' models. Use your favorite tools, now with your favorite models. Take back control of your code.
- Choose your coding models
- Visualize consumption
- Plug in your local models
| Model | Pricing / 1M tokens | Tokens / last 30 days |
|---|---|---|
| Loading... | ||
| Model | Pricing / 1M tokens | Tokens / last 7 days |
|---|---|---|
| Loading... | ||
Why Manifest?
AI is an incredible technology, but it is expensive.
Nevertheless there is room for all of us to use it more efficiently by following some principles. However the techniques to do so are not within the reach of everyone: they require time and expertise.
That is Manifest's mission: giving you the tools to use AI efficiently and reduce your bills without trading-off quality. Putting you in control of this layer, by being open source and flexible.
We all deserve affordable AI, from the solo hacker building agents at 2:00 AM to the established company implementing AI at scale. This is why we are here.
Built in the open
Manifest is fully open source. Use our cloud version for easy onboarding or our self-hosted version based on Docker. We encourage you to participate in this project.
Join the community
Chat with the team and meet the community! Ask questions, no such thing as a stupid question here.
Join Manifest Discord ServerGitHub
Read the source and contribute to the project.
Docker Hub
Get the latest image from Manifest and use it locally!
Start saving on AI inference today
- Drop-in OpenAI replacement
- API key and subscription providers
- Model fallbacks
Frequently asked questions
Which LLM providers and models does Manifest support?
Manifest supports all major LLM providers out-of-the-box, including OpenAI, Anthropic, MiniMax, DeepSeek, Mistral and so on.
You can also connect any provider that has an OpenAI-compatible API. The self-hosted version allows you to run local models too (Ollama, Llama.cpp, LM Studio).
Can I use my pro subscription with Manifest?
Yes, Manifest lets you connect your subscription to make the most of it. As subscriptions often have lower API rate limits, we recommend adding fallback models to complete requests when the subscription model fails.
Manifest provides an easy way to connect popular subscriptions like Anthropic, GitHub Copilot, MiniMax, Ollama Cloud, OpenAI, OpenCode Go and Z.ai.
Do I need to pay to use Manifest?
No, our basic version is free to use. You can connect your own API keys from supported providers (BYOK) and start routing your requests.
How does routing work?
Manifest allows you to create routing tiers and assign models to them. Custom routing lets you define HTTP headers that route requests to a certain model or provider. Default routing allows you to set a primary model and fallback models for each tier, and route requests based on their success or failure.
Complexity and specificity routing uses a rule-based algorithm to analyze the content of the request and route it to the most suitable tier.
Can I use Manifest with free models?
Yes. We support free models from OpenRouter and other providers. You can mix free and paid models in your routing. Simple tasks go to free models, complex ones go to paid ones.
What happens if my primary model fails?
You can add up to 5 fallback models per tier. If a provider returns an error or times out, the request goes to the next model in the chain. Your agent keeps working.
Do I need to adapt my agent setup?
Manifest is a drop-in replacement for OpenAI's API. You just need to change the endpoint to the Manifest URL.
We provide tutorials for SDKs like Anthropic's SDK, OpenAI's SDK, or the Vercel AI SDK, as well as agents like OpenClaw and Hermes, and coding assistants like Claude Code.
What's the difference between cloud and self-hosted?
The cloud version is hosted on our servers, and the self-hosted version runs on your machine or your infrastructure.
Local model providers (Ollama, Llama.cpp, LM Studio) are only available in self-hosted mode. The cloud version is easier to set up and maintain, while the self-hosted version gives you more control and privacy.
Messages routed through Manifest
2,131,137