The Ultimate Guide to AI Subscription Plans

Guillaume Gay Jun 3, 2026 5 min read

Manifest lets you connect your existing AI provider subscriptions directly to your workflow. Instead of paying per-request API fees, you can leverage the plans you already pay for (like Claude Pro, GitHub Copilot, the Qwen Token Plan, the Xiaomi MiMo Token Plan, or the Kimi Coding Plan) to get premium model access — often with better rates or included quotas.

This guide compares every supported subscription provider in Manifest, so you can choose the best fit for your needs.

How pricing works in Manifest

Manifest’s routing engine is provider-agnostic. It helps to separate two things: how Manifest tracks your usage internally, and how each provider actually bills you.

Internal cost tracking: For every provider, Manifest estimates usage from input and output tokens (priced per million tokens) so you can compare spend across providers in one dashboard. This is an estimate for your visibility — it doesn’t change what the provider charges.
How you’re actually billed falls into two models, shown in the Pricing Model column below:
- Rate limited (most subscriptions): you pay a flat monthly fee and send as much as you want until you hit the provider’s rate limits. Those limits vary — some use rolling 5-hour session windows plus weekly caps (e.g. Anthropic), others a monthly token allowance, and some meter a credit allowance that depletes (e.g. GitHub Copilot). There’s no per-request charge; the limit is your usage cap, not your bill.
- Billed per request (OpenCode Go): you load a credit balance (e.g. $10) and each request costs a small flat amount (e.g. ~$0.01), drawn down per call rather than per token.
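To make the arithmetic behind both ideas concrete, here is a minimal sketch. The function names and the per-million-token prices in the usage lines are hypothetical, chosen only for illustration; they are not Manifest's internals or any provider's actual rates.

```python
def estimate_token_cost(input_tokens: int, output_tokens: int,
                        input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate spend from token counts and per-million-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_m
    output_cost = (output_tokens / 1_000_000) * output_price_per_m
    return input_cost + output_cost


def requests_remaining(balance: float, cost_per_request: float) -> int:
    """Whole requests a credit balance covers at a flat per-request cost."""
    # Convert dollars to integer micro-dollars to avoid floating-point drift.
    balance_units = round(balance * 1_000_000)
    cost_units = round(cost_per_request * 1_000_000)
    return balance_units // cost_units


# Hypothetical prices: $3 per million input tokens, $15 per million output tokens.
estimate = estimate_token_cost(200_000, 50_000, 3.0, 15.0)  # about 1.35 dollars

# The per-request example from the text: a $10 balance at roughly $0.01 per call.
calls = requests_remaining(10.0, 0.01)  # 1000
```

The first function mirrors the dashboard estimate (visibility only), while the second mirrors how a per-request balance is drawn down regardless of how many tokens each call uses.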

Subscription plans comparison matrix

| Provider | Plan / Auth Method | Pricing Model | Key Models Available | Manifest Review & Pro Tips |
| --- | --- | --- | --- | --- |
| Anthropic | Claude Max / Pro | Rate limited | Claude Opus, Sonnet, Haiku | Anthropic reserves subscription capacity for Claude Code and aggressively rate-limits third-party usage (this is not in the ToS and changes without notice). Expect 429s. Limits run on rolling 5-hour session windows plus weekly caps. Prompt caching and batching are unavailable. |
| BytePlus | ModelArk Coding Plan | Rate limited | Ark Code, Seed Code, GLM, DeepSeek, Kimi, GPT OSS | One coding plan spans Ark Code, Seed Code, GLM, DeepSeek, Kimi, and GPT OSS models through BytePlus. |
| Command Code | Command Code subscription | Rate limited | Claude, GPT, Gemini, Qwen (FREE), Kimi, GLM, MiniMax, DeepSeek, Step, MiMo | Qwen 3.7 Max is included free: frontier-class open-source reasoning at zero extra cost. |
| GitHub Copilot | Copilot Subscription | Rate limited | Claude, GPT, Gemini, Grok | As of June 2026, Copilot moved to usage-based billing with AI Credits (token-based, no more flat premium requests). Each plan includes credits equal to its monthly cost ($10 for Pro, $39 for Pro+). Frontier models like Opus 4.8 consume credits fast, so prefer lighter models (Haiku, GPT-4o-mini) for routine tasks. |
| Google | Gemini Code Assist | Rate limited | Gemini models | Context window of up to 1M tokens. |
| Kiro | Kiro Subscription | Rate limited | Claude, DeepSeek, MiniMax, GLM, Qwen | 9 models from 5 providers (Anthropic, DeepSeek, MiniMax, Z.ai, Qwen) at zero marginal cost. The `kiro/auto` meta-model picks the best backend for each request. |
| MiniMax | MiniMax Coding Plan | Rate limited | MiniMax models | Every recent model (M2.1+) has a “highspeed” variant for rapid inference. |
| Moonshot | Kimi Coding Plan | Rate limited | Kimi | 262k context window. Single-model whitelist. |
| Ollama Cloud | Ollama Cloud Plan | Rate limited | Open-weight models | Open-weight models via a managed cloud endpoint. |
| OpenAI | ChatGPT Plus / Pro / Team | Rate limited | GPT Chat, GPT Codex, GPT Codex Spark | Your subscription has two separate usage pools: normal and Codex Spark. Most users never touch their Codex Spark quota, so consider setting a Codex Spark model as the primary model in one of your agents’ routing, with fallbacks, to make the most of it. Using the Codex subscription for third-party agents isn’t clearly specified in the ToS, but OpenAI officials have allowed it via social media posts. Rate limits run on a rolling 5-hour session window plus a weekly cap. |
| OpenCode Go | OpenCode Go (Beta) | Billed per request | Open-weight models | Open-weight models (DeepSeek, GLM, MiMo, MiniMax, Qwen, Kimi) with per-request pricing (flat cost per request, not per token). |
| Qwen (Alibaba Cloud) | Qwen Token Plan | Rate limited | Qwen models | New Alibaba Cloud subscription. Uses `sk-sp-` credentials. |
| Xiaomi MiMo | MiMo Token Plan | Rate limited | MiMo, Omni, Flash | One of the best cost-quality ratios at the moment with MiMo v2.5 models. |
| xAI | Grok Subscription | Rate limited | Grok models | Grok Build 0.1 is xAI’s new coding agent harness. |
| Z.ai | GLM Coding Plan | Rate limited | GLM models | Wide model range from ultra-fast (turbo/air) to frontier (5.1). |
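Several plans in the matrix (Anthropic and OpenAI, for example) combine a rolling 5-hour window with a weekly cap. The sketch below shows one way such a double limit could be tracked on the client side. It is an illustration under loud assumptions: the `UsageTracker` class is hypothetical, it counts requests rather than tokens, it treats both limits as simple sliding windows, and the limit values are made up. Providers do not publish their exact accounting, so real behavior may differ.

```python
from collections import deque

FIVE_HOURS = 5 * 60 * 60        # seconds
ONE_WEEK = 7 * 24 * 60 * 60     # seconds


class UsageTracker:
    """Hypothetical client-side tracker for a 5-hour window plus a weekly cap."""

    def __init__(self, session_limit: int, weekly_limit: int):
        self.session_limit = session_limit
        self.weekly_limit = weekly_limit
        self._timestamps = deque()  # request times, oldest first

    def allow(self, now: float) -> bool:
        """Record a request at time `now` if both limits permit it."""
        # Forget requests that fell out of the weekly window.
        while self._timestamps and self._timestamps[0] <= now - ONE_WEEK:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.weekly_limit:
            return False
        # Count requests still inside the rolling 5-hour window.
        recent = sum(1 for t in self._timestamps if t > now - FIVE_HOURS)
        if recent >= self.session_limit:
            return False
        self._timestamps.append(now)
        return True


# Made-up limits: 2 requests per 5 hours, 3 requests per week.
tracker = UsageTracker(session_limit=2, weekly_limit=3)
results = [tracker.allow(0), tracker.allow(1), tracker.allow(2)]
```

With these made-up limits, the third call is refused by the 5-hour window even though the weekly cap still has room, which is the situation that typically surfaces as a 429.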

Frequently asked questions

How do I connect my subscription in Manifest?

Open Provider Settings → Subscriptions tab → find your provider and follow the setup instructions. Then toggle the provider ON.

Do I still need an API key if I connect a subscription?

No. Once a subscription provider is connected and active, Manifest will prioritize it for supported models, using your subscription’s authentication method instead of a standard API key.

What happens if I hit my subscription rate limit?

Manifest will gracefully fall back to your configured pay-as-you-go API key (if one is set up) or return a rate-limit error, depending on your routing rules. We recommend monitoring your provider’s dashboard for quota usage.
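The fallback behavior described above can be pictured with a small sketch. Everything here is hypothetical and only illustrates the decision logic: `RateLimitError`, `send_with_fallback`, and the two callables are made-up names, not Manifest's API, and real routing rules may apply more conditions.

```python
from typing import Callable, Optional


class RateLimitError(Exception):
    """Hypothetical error standing in for a provider's HTTP 429 response."""


def send_with_fallback(
    prompt: str,
    subscription_call: Callable[[str], str],
    api_key_call: Optional[Callable[[str], str]] = None,
) -> str:
    """Try the subscription first; on a rate limit, use the API key if one exists."""
    try:
        return subscription_call(prompt)
    except RateLimitError:
        if api_key_call is None:
            # No pay-as-you-go key configured: surface the rate-limit error.
            raise
        return api_key_call(prompt)


# Stand-in callables for illustration only.
def limited_subscription(prompt: str) -> str:
    raise RateLimitError("429: subscription quota exhausted")


def pay_as_you_go(prompt: str) -> str:
    return "paid:" + prompt


answer = send_with_fallback("hello", limited_subscription, pay_as_you_go)  # "paid:hello"
```

Calling `send_with_fallback("hello", limited_subscription)` without a second callable re-raises the `RateLimitError`, matching the case where no pay-as-you-go key is set up.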

Need a specific provider added to the subscription catalog? Request a new subscription provider on GitHub.
