Groq - Timbal

Source: Groq model docs. All model IDs use the prefix groq/. Ultra-low latency inference via custom LPU hardware.
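
As a minimal sketch of the prefix rule above, the helper below turns a Groq model name from this page into its groq/-prefixed ID. The function name `to_prefixed_id` and the constant `GROQ_PREFIX` are hypothetical names introduced for illustration; they are not part of Timbal or the Groq SDK.

```python
# Hypothetical helper, not a Timbal or Groq API: it only applies the
# "all model IDs use the prefix groq/" rule stated on this page.
GROQ_PREFIX = "groq/"


def to_prefixed_id(groq_model_name: str) -> str:
    """Return the model ID with the groq/ prefix, adding it only if missing."""
    name = groq_model_name.strip()
    if not name:
        raise ValueError("model name must not be empty")
    if name.startswith(GROQ_PREFIX):
        return name  # already prefixed, leave unchanged
    return GROQ_PREFIX + name


print(to_prefixed_id("openai/gpt-oss-120b"))
print(to_prefixed_id("llama-3.1-8b-instant"))
```

Names that already carry the prefix are returned unchanged, so the helper is safe to call on either form.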

All Models

qwen/qwen3.6-27b

Reasoning · Speed

groq/qwen/qwen3.6-27b

Qwen3.6 27B with hybrid thinking mode, delivered at Groq’s ultra-low latency. Preview tier replacing retired Qwen3 32B / Llama 4 Scout.

$0.60 / $3.00
131K context
Text input
Hybrid thinking

openai/gpt-oss-120b

Reasoning · Speed

groq/openai/gpt-oss-120b

OpenAI’s open-weight MoE model with 120B total parameters (5.1B active per token), running at Groq speeds. Near-parity with o4-mini on reasoning benchmarks. Apache 2.0.

$0.15 / $0.60
128K context
Text input
Thinking

openai/gpt-oss-20b

Reasoning · Speed

groq/openai/gpt-oss-20b

OpenAI’s compact open-weight MoE model with 20B total parameters (3.6B active), delivering results similar to o3-mini at Groq’s ultra-low latency. Apache 2.0.

$0.075 / $0.30
128K context
Text input
Thinking

llama-3.3-70b-versatile

Reasoning · Speed

groq/llama-3.3-70b-versatile

Multilingual instruction-tuned model with 70B parameters, optimized for versatile tasks with Groq’s ultra-low latency inference.

$0.59 / $0.79
131K context
Text input
Deprecated — shutdown August 16, 2026

llama-3.1-8b-instant

Reasoning · Speed

groq/llama-3.1-8b-instant

The most compact Llama 3.1 model with 8B parameters, optimized for instant responses on Groq’s LPU hardware.

$0.05 / $0.08
131K context
Text input
Deprecated — shutdown August 16, 2026
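
The deprecation notices above can be checked in code before a model is selected. The sketch below hard-codes the two shutdown dates listed on this page and reports how many days remain. The names `SHUTDOWN_DATES` and `days_until_shutdown` are hypothetical and introduced only for illustration; they are not part of Timbal or the Groq SDK.

```python
from datetime import date
from typing import Optional

# Shutdown dates copied from the deprecation notices on this page.
# The table and the function below are illustrative, not a library API.
SHUTDOWN_DATES = {
    "groq/llama-3.3-70b-versatile": date(2026, 8, 16),
    "groq/llama-3.1-8b-instant": date(2026, 8, 16),
}


def days_until_shutdown(model_id: str, today: date) -> Optional[int]:
    """Return days remaining before shutdown, or None if no shutdown is listed.

    A result of zero or less means the listed shutdown date has been reached.
    """
    shutdown = SHUTDOWN_DATES.get(model_id)
    if shutdown is None:
        return None
    return (shutdown - today).days


remaining = days_until_shutdown("groq/llama-3.1-8b-instant", date.today())
if remaining is not None and remaining <= 0:
    print("Model has reached its listed shutdown date; pick another model.")
```

Passing `today` explicitly keeps the check deterministic, which makes it straightforward to test.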
