The Ultimate Guide to AI Subscription Plans

[Figure: AI subscription plans flowing through the Manifest router to a single AI agent]

Manifest lets you connect your existing AI provider subscriptions directly to your workflow. Instead of paying per-request API fees, you can leverage the plans you already pay for (like Claude Pro, GitHub Copilot, the Qwen Token Plan, or the Kimi Coding Plan) to get premium model access — often with better rates or included quotas.

This guide compares every supported subscription provider in Manifest, so you can choose the best fit for your needs.


How pricing works in Manifest

Manifest’s routing engine is provider-agnostic. It helps to separate two things: how Manifest tracks your usage internally, and how each provider actually bills you.

  • Internal cost tracking: For every provider, Manifest estimates usage from input and output tokens (priced per million tokens) so you can compare spend across providers in one dashboard. This is an estimate for your visibility — it doesn’t change what the provider charges.
  • How you’re actually billed falls into two models, shown in the Pricing Model column below:
    • Rate limited (most subscriptions): you pay a flat monthly fee and send as much as you want until you hit the provider’s rate limits. Those limits vary — some use rolling 5-hour session windows plus weekly caps (e.g. Anthropic), others a monthly token allowance, and some meter a credit allowance that depletes (e.g. GitHub Copilot). There’s no per-request charge; the limit is your usage cap, not your bill.
    • Billed per request (OpenCode Go): you load a credit balance (e.g. $10) and each request costs a small flat amount (e.g. ~$0.01), drawn down per call rather than per token.
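Both models come down to simple arithmetic. Here is a minimal sketch for illustration only: the function names and the per-million-token prices are assumptions invented for this example, not Manifest's actual code or any provider's real rates. The $10 balance and $0.01 per-request cost are the example figures used above.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_token_cost(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Estimate spend from token counts, with prices quoted per million tokens."""
    input_cost = Decimal(input_tokens) / TOKENS_PER_MILLION * Decimal(input_price_per_m)
    output_cost = Decimal(output_tokens) / TOKENS_PER_MILLION * Decimal(output_price_per_m)
    return input_cost + output_cost


def requests_remaining(credit_balance, cost_per_request):
    """Number of whole requests a prepaid balance covers under per-request billing."""
    return int(Decimal(credit_balance) // Decimal(cost_per_request))


# Hypothetical prices: $3 per million input tokens, $15 per million output tokens.
estimate = estimate_token_cost(200_000, 50_000, "3", "15")
print(f"Estimated token cost: ${estimate:.2f}")

# Example figures from the text: a $10 balance at roughly $0.01 per request.
print(f"Requests covered: {requests_remaining('10', '0.01')}")
```

The token estimate is for comparison only. On a rate-limited plan the monthly bill stays flat whatever the estimate says, while on a per-request plan the balance drops by the same amount for a short prompt as for a long one.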

Subscription plans comparison matrix

Provider | Plan / Auth Method | Pricing Model | Key Models Available | Manifest Review & Pro Tips
--- | --- | --- | --- | ---
Anthropic | Claude Max / Pro | Rate limited | Claude Opus, Sonnet, Haiku | Anthropic reserves subscription capacity for Claude Code and aggressively rate-limits third-party usage (not in the ToS, changes without notice). Expect 429s. Limits run on rolling 5-hour session windows plus weekly caps. Prompt caching and batching unavailable.
BytePlus | ModelArk Coding Plan | Rate limited | Ark Code, Seed Code, GLM, DeepSeek, Kimi, GPT OSS | One coding plan spans Ark Code, Seed Code, GLM, DeepSeek, Kimi, and GPT OSS models through BytePlus.
Command Code | Command Code subscription | Rate limited | Claude, GPT, Gemini, Qwen (FREE), Kimi, GLM, MiniMax, DeepSeek, Step, MiMo | Qwen 3.7 Max is included free: frontier-class open-source reasoning at zero extra cost.
GitHub Copilot | Copilot Subscription | Rate limited | Claude, GPT, Gemini, Grok | As of June 2026, Copilot moved to usage-based billing with AI Credits (token-based, no more flat premium requests). Each plan includes credits equal to its monthly cost ($10 for Pro, $39 for Pro+). Frontier models like Opus 4.8 consume credits fast, so prefer lighter models (Haiku, GPT-4o-mini) for routine tasks.
Google | Gemini Code Assist | Rate limited | Gemini models | Up to 1M token window.
Kiro | Kiro Subscription | Rate limited | Claude, DeepSeek, MiniMax, GLM, Qwen | 9 models from 5 providers (Anthropic, DeepSeek, MiniMax, Z.ai, Qwen) at zero marginal cost. The kiro/auto meta-model picks the best backend for each request.
Minimax | MiniMax Coding Plan | Rate limited | MiniMax models | Every recent model (M2.1+) has a “highspeed” variant for rapid inference.
Moonshot | Kimi Coding Plan | Rate limited | Kimi | 262k context window. Single-model whitelist.
Ollama Cloud | Ollama Cloud Plan | Rate limited | Open-weight models | Open-weight models via a managed cloud endpoint.
OpenAI | ChatGPT Plus / Pro / Team | Rate limited | GPT Chat, GPT Codex, GPT Codex Spark | Your subscription has two separate usage pools: normal and Codex Spark. Most users never touch their Codex Spark quota, so consider setting a Codex Spark model as a primary model on one of your agents’ routing, with fallbacks, to make the most of it. Using the Codex subscription for third-party agents isn’t clearly specified in the ToS, but OpenAI officials have allowed it via social media posts. Rate limits run on a rolling 5-hour session window plus a weekly cap.
OpenCode Go | OpenCode Go (Beta) | Billed per request | Open-weight models | Open-weight models (DeepSeek, GLM, MiMo, MiniMax, Qwen, Kimi) with per-request pricing (flat cost per request, not per token).
Qwen (Alibaba Cloud) | Qwen Token Plan | Rate limited | Qwen models | New Alibaba Cloud subscription. Uses sk-sp- credentials.
xAI | Grok Subscription | Rate limited | Grok models | Grok Build 0.1 is xAI’s new coding agent harness.
Z.ai | GLM Coding Plan | Rate limited | GLM models | Wide model range from ultra-fast (turbo/air) to frontier (5.1).
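
Several entries above mention a rolling 5-hour window combined with a weekly cap. The sketch below shows one way such a two-level limit can behave. It is an assumption-heavy illustration: the `RollingLimit` class, the choice to count requests rather than tokens, and the tiny limit values are all hypothetical, and real providers may measure usage differently.

```python
from collections import deque

FIVE_HOURS = 5 * 60 * 60
ONE_WEEK = 7 * 24 * 60 * 60


class RollingLimit:
    """Tracks request timestamps against a rolling 5-hour window and a weekly cap."""

    def __init__(self, session_limit, weekly_limit):
        self.session_limit = session_limit
        self.weekly_limit = weekly_limit
        self.timestamps = deque()  # request times in seconds, oldest first

    def _count_since(self, cutoff):
        return sum(1 for t in self.timestamps if t > cutoff)

    def try_request(self, now):
        """Record a request at time `now` if both limits allow it; return True if allowed."""
        # Drop entries older than a week; they no longer count toward either limit.
        while self.timestamps and self.timestamps[0] <= now - ONE_WEEK:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.weekly_limit:
            return False  # weekly cap reached
        if self._count_since(now - FIVE_HOURS) >= self.session_limit:
            return False  # rolling 5-hour window is full
        self.timestamps.append(now)
        return True


# Hypothetical limits: 2 requests per 5-hour window, 3 per week.
limit = RollingLimit(session_limit=2, weekly_limit=3)
for t in (0, 60, 120, FIVE_HOURS + 61, 2 * FIVE_HOURS + 200):
    print(t, limit.try_request(t))
```

In this run the third request is refused because the 5-hour window is full, the fourth succeeds once the window has rolled forward, and the fifth is refused by the weekly cap even though the window has room again.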

Frequently asked questions

How do I connect my subscription in Manifest?

Open Provider Settings → Subscriptions tab, find your provider, and follow the setup instructions. Then toggle the provider ON.

Do I still need an API key if I connect a subscription?

No. Once a subscription provider is connected and active, Manifest will prioritize it for supported models, using your subscription’s authentication method instead of a standard API key.

What happens if I hit my subscription rate limit?

Manifest will gracefully fall back to your configured pay-as-you-go API key (if one is set up) or return a rate-limit error, depending on your routing rules. We recommend monitoring your provider’s dashboard for quota usage.
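
The fallback behavior can be pictured roughly as follows. This is a sketch under stated assumptions, not Manifest's routing code: `RateLimitError`, `send_with_fallback`, and the two stub callables are hypothetical names used only to show the decision.

```python
class RateLimitError(Exception):
    """Hypothetical error standing in for an HTTP 429 from a provider."""


def send_with_fallback(prompt, subscription_call, api_key_call=None):
    """Try the subscription first; on a rate limit, use the API key route if one exists."""
    try:
        return subscription_call(prompt)
    except RateLimitError:
        if api_key_call is None:
            raise  # no fallback configured: surface the rate-limit error
        return api_key_call(prompt)


# Stub callables standing in for real provider requests.
def limited_subscription(prompt):
    raise RateLimitError("subscription quota exhausted")


def pay_as_you_go(prompt):
    return f"answered via API key: {prompt}"


print(send_with_fallback("hello", limited_subscription, pay_as_you_go))
```

Without a pay-as-you-go key, the same call re-raises the rate-limit error, which matches the second outcome described above.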


Need a specific provider added to the subscription catalog? Request a new subscription provider on GitHub.
