Groq — Free LLM API

9 free models available — credit card may be required. Get API key →

World's fastest LLM inference — ultra-low latency, free tier.

Groq is a cloud AI platform powered by its proprietary LPU (Language Processing Unit) chips, delivering dramatically faster inference than GPU-based providers. The free tier supports Llama, Qwen, DeepSeek-R1, and Whisper models with generous daily limits. Groq is fully OpenAI SDK-compatible, making it a drop-in replacement for any tool that accepts a custom base URL.

  • Ultra-fast inference (~2,600 tok/s)
  • Free tier: 14,400 RPD for most models
  • Supports Llama 4, Qwen3, DeepSeek-R1
  • OpenAI-compatible

API Compatibility: OpenAI SDK-compatible (Chat Completions)

All Free Groq Models — Context Windows & Rate Limits

Model Context Max Output Modality Rate Limit Status
llama-3.3-70b-versatile 131K 32K text 30 RPM, 14,400 RPD Details
llama-3.1-8b-instant 131K 131K text 30 RPM, 14,400 RPD Details
llama-4-scout-17b-16e-instruct 131K 8K text 30 RPM, 14,400 RPD Details
llama-4-maverick-17b-128e-instruct 131K 8K text 15 RPM, 500 RPD Details
qwen3-32b 131K 131K text 30 RPM, 14,400 RPD Details
kimi-k2-instruct 262K 262K text 30 RPM, 14,400 RPD Details
deepseek-r1-distill-70b 131K 8K text 30 RPM, 14,400 RPD Details
whisper-large-v3 131K 131K text 20 RPM, 2,000 RPD Details
whisper-large-v3-turbo 131K 131K text 20 RPM, 2,000 RPD Details

Frequently Asked Questions about Groq Free API

Is Groq free to use?

Groq offers a permanently free tier with 9 available models. Account creation is required, and a credit card may be needed to activate the free tier.

What models does Groq offer for free?

Groq provides 9 free models covering chat, reasoning use cases. Supported modalities include text. Browse the full list above with context windows and rate limits.

How do I use Groq with Claude Code or Cursor?

Click "Details" on any model above to get one-click configuration snippets for Claude Code (cc), Cursor, Codex, and more. All Groq models listed here use an OpenAI-compatible endpoint, so any tool that accepts a custom baseURL will work.