Groq — Free LLM API
9 free models available — credit card may be required. Get API key →
World's fastest LLM inference — ultra-low latency, free tier.
Groq is a cloud AI platform powered by its proprietary LPU (Language Processing Unit) chips, delivering dramatically faster inference than GPU-based providers. The free tier supports Llama, Qwen, DeepSeek-R1, and Whisper models with generous daily limits. Groq is fully OpenAI SDK-compatible, making it a drop-in replacement for any tool that accepts a custom base URL.
- Ultra-fast inference (~2,600 tok/s)
- Free tier: 14,400 RPD for most models
- Supports Llama 4, Qwen3, DeepSeek-R1
- OpenAI-compatible
API Compatibility: OpenAI SDK-compatible (Chat Completions)
All Free Groq Models — Context Windows & Rate Limits
| Model | Context | Max Output | Modality | Rate Limit | Status | |
|---|---|---|---|---|---|---|
| llama-3.3-70b-versatile | 131K | 32K | 30 RPM, 14,400 RPD | Details | ||
| llama-3.1-8b-instant | 131K | 131K | 30 RPM, 14,400 RPD | Details | ||
| llama-4-scout-17b-16e-instruct | 131K | 8K | 30 RPM, 14,400 RPD | Details | ||
| llama-4-maverick-17b-128e-instruct | 131K | 8K | 15 RPM, 500 RPD | Details | ||
| qwen3-32b | 131K | 131K | 30 RPM, 14,400 RPD | Details | ||
| kimi-k2-instruct | 262K | 262K | 30 RPM, 14,400 RPD | Details | ||
| deepseek-r1-distill-70b | 131K | 8K | 30 RPM, 14,400 RPD | Details | ||
| whisper-large-v3 | 131K | 131K | 20 RPM, 2,000 RPD | Details | ||
| whisper-large-v3-turbo | 131K | 131K | 20 RPM, 2,000 RPD | Details |
Frequently Asked Questions about Groq Free API
Is Groq free to use?
Groq offers a permanently free tier with 9 available models. Account creation is required, and a credit card may be needed to activate the free tier.
What models does Groq offer for free?
Groq provides 9 free models covering chat, reasoning use cases. Supported modalities include text. Browse the full list above with context windows and rate limits.
How do I use Groq with Claude Code or Cursor?
Click "Details" on any model above to get one-click configuration snippets for Claude Code (cc), Cursor, Codex, and more.
All Groq models listed here use an OpenAI-compatible endpoint, so any tool that accepts a custom baseURL will work.