Groq 429 — Rate limit exceeded

38 teams hit this · 30d
300 occurrences · 30d
76% auto-recovered by Manifest
Occurrences across all Manifest teams, last 14 days

What this error means

Groq enforces tight per-minute request and token limits for each model. A 429 response means a request exceeded one of those limits; fast agent loops tend to hit them in bursts.

How to fix it

  • Wait for the duration given in the Retry-After header before retrying
  • Limit the number of concurrent requests on the client
  • Spread load across models, or request a higher limit
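
The first two fixes can be combined in a small client-side wrapper. The following is a minimal sketch, not Groq's or Manifest's implementation: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, `MAX_IN_FLIGHT` is an assumed concurrency cap, and the code assumes Retry-After carries a number of seconds.

```python
import threading
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)

# Hypothetical cap on in-flight requests; tune it to your own limits.
MAX_IN_FLIGHT = 4
_slots = threading.BoundedSemaphore(MAX_IN_FLIGHT)


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Read Retry-After as a number of seconds, falling back to `default`."""
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return max(0.0, float(value))
            except ValueError:
                return default  # e.g. an HTTP-date; not handled in this sketch
    return default


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, waiting out 429 responses for up to `max_attempts` tries."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        with _slots:  # throttle concurrency across threads
            response = send()
        if response[0] != 429:
            return response
        if attempt < max_attempts - 1:
            # Sleep outside the semaphore so a waiting caller frees its slot.
            sleep(retry_after_seconds(response[1]))
    return response  # still rate limited after the last attempt
```

Passing `sleep` as a parameter keeps the wrapper easy to test without real delays. If the last attempt is still a 429, the wrapper returns that response so the caller can decide what to do next.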

Example error message

{
  "error": {
    "message": "Rate limit reached for model llama-3.3-70b in organization on tokens per minute (TPM).",
    "code": "rate_limit_exceeded",
    "type": "tokens"
  }
}
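
To react differently to token limits and request limits, a client can inspect the error body. This is a minimal sketch that assumes the field names shown in the example above (`error.code`, `error.type`); real payloads may differ, and `rate_limit_kind` is a hypothetical helper name.

```python
import json
from typing import Optional


def rate_limit_kind(body: str) -> Optional[str]:
    """Return the limit kind (e.g. "tokens") for a rate-limit error body, else None."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None  # not JSON, so not an error body we recognise
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    if error.get("code") != "rate_limit_exceeded":
        return None  # some other error
    # Field name taken from the example message; treat a missing value as unknown.
    return error.get("type") or "unknown"
```

A caller might shorten prompts or lower `max_tokens` when the kind is "tokens", and reduce request concurrency otherwise.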

Frequently asked

Why does Groq rate limit so aggressively?

Groq pairs very high throughput with tight per-minute caps, so a fast client can exhaust its quota quickly. Pacing requests and falling back to another model keep runs alive.
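
A fallback can be as simple as trying a list of models in order. The sketch below is illustrative only: `send` is a hypothetical callable that issues one request for a given model and returns `(status, body)`, and the model names passed in are whatever your account has access to.

```python
from typing import Callable, Sequence, Tuple


def send_with_fallback(
    models: Sequence[str],
    send: Callable[[str], Tuple[int, str]],
) -> Tuple[str, int, str]:
    """Try each model in order; return (model, status, body) for the first non-429 reply."""
    if not models:
        raise ValueError("models must not be empty")
    for model in models:
        status, body = send(model)
        if status != 429:
            return model, status, body
    # Every model was rate limited; report the last attempt.
    return model, status, body
```

Returning the model name along with the response lets the caller log which model actually served the request.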

Don't let your requests fail again and again

Manifest fixes your bad LLM requests on the fly so they return successful responses before they reach your agent. No downtime.

  • Deprecated / Not-found models
  • Wrong parameters
  • Malformed requests
  • Exceeded context windows

Join the waitlist to get early access with a free month.