Groq

Groq is an inference provider built on its LPU (Language Processing Unit) hardware. It advertises some of the fastest inference speeds in the industry, reaching 1000 or more tokens/second on its fastest models. Pricing is linear per token, with no hidden costs.

Plans & Pricing

Free Tier

$0
  • All supported open models
  • Rate limits apply
  • Prompt caching available
  • Batch API: 50% discount

API Pay-Per-Use

From $0.05/1M tokens
  • Llama 3.1 8B Instant: $0.05/1M in / $0.08/1M out (840 TPS)
  • Llama 4 Scout: $0.11/1M in / $0.34/1M out (594 TPS)
  • GPT OSS 20B: $0.075/1M in / $0.30/1M out (1000 TPS)
  • Qwen3 32B: $0.29/1M in / $0.59/1M out (662 TPS)
  • Throughput of 1000+ TPS (tokens per second) on the fastest models
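
Since pricing is linear per token, the cost of a request follows directly from the rates listed above. The following is a minimal Python sketch, not an official calculator: the `PRICING` table and the `estimate` function are hypothetical names introduced for illustration. It assumes that the listed TPS figure applies to output tokens and that the 50% Batch API discount applies equally to input and output tokens; neither assumption is confirmed by this page.

```python
# Rates copied from the pay-per-use list above:
# (USD per 1M input tokens, USD per 1M output tokens, tokens per second).
PRICING = {
    "Llama 3.1 8B Instant": (0.05, 0.08, 840),
    "Llama 4 Scout": (0.11, 0.34, 594),
    "GPT OSS 20B": (0.075, 0.30, 1000),
    "Qwen3 32B": (0.29, 0.59, 662),
}


def estimate(model, input_tokens, output_tokens, batch=False):
    """Return (cost in USD, approximate generation time in seconds)."""
    price_in, price_out, tps = PRICING[model]
    # Linear pricing: tokens times the per-million rate, divided by one million.
    cost = (input_tokens * price_in + output_tokens * price_out) / 1_000_000
    if batch:
        # Assumption: the 50% batch discount applies to the whole request.
        cost *= 0.5
    # Assumption: TPS describes output generation speed only.
    seconds = output_tokens / tps
    return cost, seconds


cost, seconds = estimate("Llama 4 Scout", input_tokens=2_000, output_tokens=500)
print(f"${cost:.5f}, about {seconds:.2f} s")
```

Under these assumptions, a request with 2,000 input tokens and 500 output tokens on Llama 4 Scout costs about $0.00039 and takes roughly 0.84 seconds to generate.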

Free Tier

Free tier available with rate limits. No idle infrastructure costs.

Models (7)

Model | Capabilities | In /1M | Out /1M
Llama 4 Scout (Groq) | Chat, Functions | $0.11 | $0.34
Qwen3 32B (Groq) | Chat, Functions | $0.29 | $0.59
Llama 3.3 70B Versatile (Groq) | Chat, Functions | $0.59 | $0.79
Llama 3.1 8B Instant (Groq) | Chat, Functions | $0.05 | $0.08
GPT OSS 20B (Groq) | Chat, Functions | $0.075 | $0.30
GPT OSS 120B (Groq) | Chat, Functions | $0.15 | $0.60
Kimi K2 (Groq) | Chat, Functions | $1.00 | $3.00
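
To compare the listed models for a given workload, the per-token rates can be applied to an expected token mix and sorted. This is an illustrative sketch only: `MODELS` and `rank_by_cost` are hypothetical names, and the GPT OSS 20B input rate uses the $0.075 figure from the pay-per-use list.

```python
# (USD per 1M input tokens, USD per 1M output tokens), from the model list above.
MODELS = {
    "Llama 4 Scout": (0.11, 0.34),
    "Qwen3 32B": (0.29, 0.59),
    "Llama 3.3 70B Versatile": (0.59, 0.79),
    "Llama 3.1 8B Instant": (0.05, 0.08),
    "GPT OSS 20B": (0.075, 0.30),  # input rate taken from the pay-per-use list
    "GPT OSS 120B": (0.15, 0.60),
    "Kimi K2": (1.00, 3.00),
}


def rank_by_cost(input_tokens, output_tokens):
    """Return (model, cost in USD) pairs sorted from cheapest to most expensive."""
    costs = {
        name: (input_tokens * price_in + output_tokens * price_out) / 1_000_000
        for name, (price_in, price_out) in MODELS.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# Example workload: 10M input tokens and 2M output tokens per month.
for name, cost in rank_by_cost(10_000_000, 2_000_000):
    print(f"{name}: ${cost:.2f}")
```

The ranking depends on the ratio of input to output tokens, so it is worth rerunning with a token mix that matches the actual workload. Price alone also says nothing about output quality or rate limits.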