Run Claude Code on your ChatGPT Plus subscription
If you run agents, you know that API usage is expensive and the costs are hard to predict.
At the same time, most of us already pay for subscriptions (OpenAI, Claude, GitHub…). We use them in their web apps to chat or generate code, but our agents and harnesses run separately on API keys that we pay for on top.
Manifest lets you connect your subscriptions to your harnesses. Claude Code is one example, but the same setup works with other agents too, such as Hermes.
What this gives you:
- Costs under control
- Fallbacks when a model hits its rate limit
- The same subscription reused across multiple agents
- One place to see what’s running where
Setup: Claude Code with ChatGPT Plus
Create a Claude Code agent in Manifest and copy the base URL and API key.
Then open ~/.claude/settings.json and point Claude Code to Manifest:
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://app.manifest.build/v1",
    "ANTHROPIC_AUTH_TOKEN": "mnfst_your_key_here"
  }
}
Once that is done, your agent will send requests to Manifest.
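If you would rather not edit the file by hand, the same change can be scripted. This is a minimal sketch, not something Manifest ships: the function name `point_claude_code_at_manifest` is my own, and it assumes your settings file is plain JSON with an optional top-level "env" object. It merges the two variables in and leaves every other key alone.

```python
import json
from pathlib import Path


def point_claude_code_at_manifest(settings_path: Path, base_url: str, token: str) -> dict:
    """Merge the two Manifest variables into the "env" block of a settings file.

    Hypothetical helper for illustration; existing keys in the file are preserved.
    """
    settings = {}
    if settings_path.exists():
        # Raises json.JSONDecodeError if the existing file is not valid JSON.
        settings = json.loads(settings_path.read_text(encoding="utf-8"))

    env = settings.setdefault("env", {})
    env["ANTHROPIC_BASE_URL"] = base_url
    env["ANTHROPIC_AUTH_TOKEN"] = token

    # Create the parent directory if needed, then write the merged settings back.
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

To apply it to your real config, call it with `Path.home() / ".claude" / "settings.json"`, the base URL, and the key you copied from Manifest. Back up the file first if it already holds settings you care about.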
Now go into Manifest, open Providers, and connect your ChatGPT Plus subscription. You get access to the OpenAI models included in your plan. I set GPT-5.4 as my default: it handles most Claude Code tasks well and doesn’t burn through the GPT-5.5 quota.
After that, every request from Claude Code goes through Manifest first, and Manifest routes it to the model you selected as default.
Routing by tier
You can also split your traffic across multiple models. For simple requests, route to a lightweight model that uses fewer tokens. For heavier ones, keep the strong model in reserve.
If you want more control, you can create your own custom tier mapped to a specific header value. Any Claude Code request that carries that header gets routed to that tier. Useful if you have specific workflows you want pinned to specific models.

You can also set model parameters like temperature or max output length, so the routing stays flexible without becoming messy.
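To make the header-based tiers concrete, here is a rough sketch of what that kind of routing looks like conceptually. It is not Manifest's code: the header name `x-example-tier`, the tier names, the model IDs, and the parameter keys are all placeholders I made up. In practice you configure this in the Manifest UI.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    name: str
    model: str
    # Per-tier model parameters, e.g. temperature or a max output length.
    params: dict = field(default_factory=dict)


# Everything below is hypothetical: header name, tier names, and model IDs.
TIER_HEADER = "x-example-tier"
DEFAULT_TIER = Tier("default", "strong-model")
CUSTOM_TIERS = {
    "docs": Tier("docs", "light-model", {"temperature": 0.2, "max_tokens": 1024}),
}


def pick_tier(headers: dict[str, str]) -> Tier:
    """Return the custom tier matching the header value, else the default tier."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    value = normalized.get(TIER_HEADER)
    return CUSTOM_TIERS.get(value, DEFAULT_TIER)
```

A request carrying `x-example-tier: docs` would land on the lightweight model with its own parameters, and anything else falls through to the default. On the Claude Code side, the `ANTHROPIC_CUSTOM_HEADERS` environment variable looks like the way to attach such a header, but check the current Claude Code docs before relying on it.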
Fallbacks
Fallbacks kick in when a model fails or hits a rate limit. You can chain up to 5 fallback models per tier, so the agent never gets stuck mid-session.
In my case, I keep one API-based model as the very last fallback. That way it’s either never used or used very rarely, and I stay in control of costs.
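The fallback behaviour described above boils down to "try the next model in the chain". The sketch below shows that idea under my own assumptions: `call` stands in for whatever actually sends the request, and `ModelUnavailable` is a made-up exception for a failure or rate limit. Manifest's real implementation may differ.

```python
from typing import Callable

MAX_FALLBACKS = 5  # up to 5 fallback models per tier, as mentioned above


class ModelUnavailable(Exception):
    """Hypothetical error raised by `call` when a model fails or is rate limited."""


def complete_with_fallbacks(
    prompt: str,
    primary: str,
    fallbacks: list[str],
    call: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try the primary model, then each fallback in order.

    Returns (model_used, reply). Raises ModelUnavailable if every model fails.
    """
    if len(fallbacks) > MAX_FALLBACKS:
        raise ValueError(f"at most {MAX_FALLBACKS} fallbacks per tier")

    errors = []
    for model in [primary, *fallbacks]:
        try:
            return model, call(model, prompt)
        except ModelUnavailable as exc:
            errors.append(f"{model}: {exc}")  # remember why, then move on
    raise ModelUnavailable("all models failed: " + "; ".join(errors))
```

Putting the API-based model last in `fallbacks` mirrors my setup: it is only reached after every subscription-backed model has failed.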

Limit
You can set a spending limit, so even with API fallbacks you know you won’t go over a certain amount.
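Conceptually, a limit like this is a check that runs before a paid request goes out. Here is a small sketch of that idea. The `Budget` class and its dollar-based fields are my own illustration and say nothing about how Manifest tracks or enforces limits internally.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Hypothetical spending guard for paid API fallbacks."""

    limit_usd: float
    spent_usd: float = 0.0

    def can_afford(self, estimated_cost_usd: float) -> bool:
        """True if spending this amount keeps the total at or under the limit."""
        return self.spent_usd + estimated_cost_usd <= self.limit_usd

    def charge(self, cost_usd: float) -> None:
        """Record a cost, refusing any charge that would exceed the limit."""
        if not self.can_afford(cost_usd):
            raise RuntimeError("spending limit reached")
        self.spent_usd += cost_usd
```

With `Budget(limit_usd=10.0)`, two charges of 4.0 go through and a third is refused, which is the behaviour you want from a hard cap.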
Visibility
You can see what each provider costs, how much each tier consumes, and where your requests are going in real time. That makes it easier to keep API fallbacks under control and stay within budget.

About Manifest
Manifest is an open-source LLM router for agents and harnesses. It gives you one place to connect your subscriptions, route requests to the right models, and keep track of token usage and spending. It is MIT licensed and can be self-hosted.
Feedback is welcome on GitHub.