Stop overpaying for AI
Manifest is a plug-and-play platform for reducing AI inference costs.
- Smart routing
- Model benchmarking
- Limits and notifications
Don't waste money on the wrong model
Many agents and apps are using a single model to perform all their queries. This is inefficient as top-tier, thus expensive models are used to perform simple requests. Manifest is the tool you wish you had that helps you get the most cost-efficient AI inference.
Routing
Define routing rules upfront, isolating queries and treating them separately, or use our smart router that analyses queries and routes them on the fly based on their complexity or specificity.
Benchmarking
Not satisfied with your current value for money? Replay past queries against alternative models and compare quality, price and latency side-by-side. Choose the model that fits best.
Control
Visualize each dollar spent in real time. See how much you spend and where. Set soft and hard limits to control your consumption.
Plug in every kind of provider
Simultaneously use different kinds of model or inference providers based on the query. Manifest does not limit you to a restricted list of providers.
API key providers
Bring your own key from all the main providers to get instant access to all their models. Pay by the usage directly to the provider.
Subscription providers
Already paying for a monthly subscription? Connect it to Manifest to use those quotas first, fallback to pay-as-you-go only when limits are exceeded.
Custom providers
Plug in any OpenAI-compatible or Anthropic-compatible provider that exists out there! We don't limit you to the providers that we know only.
Local models
Run open-weight models at home! Manifest handles Ollama, LM Studio and llama.cpp as first-class providers so you can run them on your own infrastructure.
Built for AI apps, and your own AI workflows
Without Manifest, optimizing inference efficiency for AI apps can be incredibly difficult. That's why builders often skip it and end up spending more than they should. Manifest is here to help.
- Replay past queries to optimize them
- Set up fallback models and providers
- Choose the right model beforehand
Personal agents like OpenClaw or Hermes are known to produce surprisingly high AI bills. Any automation can potentially drain your whole budget. Use Manifest to reduce those costs.
- Connect your personal subscriptions
- Set up budget limits and notifications
- Use our curated free model list
Don't let coding tools lock you into their providers' models. Use your favorite tools, now with your favorite models. Take back control of your code.
- Choose your coding models
- Visualize consumption
- Plug in your local models
| Model | Pricing / 1M tokens | Tokens / last 30 days |
|---|---|---|
| Loading... | ||
| Model | Pricing / 1M tokens | Tokens / last 7 days |
|---|---|---|
| Loading... | ||
Why Manifest?
AI is an incredible technology, but it is expensive.
Nevertheless there is room for all of us to use it more efficiently by following some principles. However the techniques to do so are not within the reach of everyone: they require time and expertise.
That is Manifest's mission: giving you the tools to use AI efficiently and reduce your bills without trading-off quality. Putting you in control of this layer, by being open source and flexible.
We all deserve affordable AI, from the solo hacker building agents at 2:00 AM to the established company implementing AI at scale. This is why we are here.
Built in the open
Manifest is fully open source. Use our cloud version for easy onboarding or our self-hosted version based on Docker. We encourage you to participate in this project.
Join the community
Chat with the team and meet the community! Ask questions, no such thing as a stupid question here.
Join Manifest Discord ServerGitHub
Read the source and contribute to the project.
Docker Hub
Get the latest image from Manifest and use it locally!
Start saving on AI inference today
- Drop-in OpenAI replacement
- API key and subscription providers
- Model fallbacks
Frequently asked questions
Is your API compatible with OpenAI?
Yes. Our endpoints and schemas are drop-in compatible with the OpenAI API, so you can integrate in minutes.
Do you retain my data?
We don't store your prompts or responses. We only log metadata (model used, token count, cost). In local mode, nothing leaves your machine.
How does routing choose a model?
Each request goes through a scorer that checks the prompt length, technical keywords, tool usage, and other signals to classify it into one of four complexity tiers (simple, standard, complex, reasoning). Then it picks the cheapest model assigned to that tier from your connected providers.
Can I use Manifest with free models?
Yes. We support free models from OpenRouter and other providers. You can mix free and paid models in your routing. Simple tasks go to free models, complex ones go to paid ones.
What happens if my primary model fails?
You can add up to 5 fallback models per tier. If a provider returns an error or times out, the request goes to the next model in the chain. Your agent keeps working.
Which agents does Manifest work with?
Any tool that uses the OpenAI-compatible API. Personal AI agents like OpenClaw or Hermes Agent, plus app SDKs like OpenAI SDK, Vercel AI SDK or LangChain.
What's the difference between cloud and local mode?
Same features, same routing. Cloud routes through app.manifest.build. Local runs on your machine, nothing leaves your network, and you can route to local models via Ollama.
Messages routed through Manifest
1,735,934