Source: Cerebras model docs. All model IDs use the prefix cerebras/. Powered by Cerebras wafer-scale chips, the world’s largest AI accelerator, delivering up to 3000 tokens/second.
Cerebras models are not yet available through the Timbal platform proxy. A CEREBRAS_API_KEY is required to use these models. If you’d like to access Cerebras models via your Timbal API key, please contact sales.
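As a minimal sketch of the two requirements above (the cerebras/ prefix and the CEREBRAS_API_KEY environment variable), the following check can run before any request is made. The helper names split_model_id and require_cerebras_key are hypothetical and are not part of Timbal or the Cerebras SDK.

```python
import os

CEREBRAS_PREFIX = "cerebras/"


def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'cerebras/gpt-oss-120b' into ('cerebras', 'gpt-oss-120b').

    Hypothetical helper: it only validates the ID format described above.
    """
    if not model_id.startswith(CEREBRAS_PREFIX):
        raise ValueError(f"expected a model ID starting with {CEREBRAS_PREFIX!r}, got {model_id!r}")
    provider, name = model_id.split("/", 1)
    if not name:
        raise ValueError("model ID has no model name after the prefix")
    return provider, name


def require_cerebras_key(env=None) -> str:
    """Return the API key, or fail early with a clear message if it is missing.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("CEREBRAS_API_KEY", "").strip()
    if not key:
        raise RuntimeError("CEREBRAS_API_KEY is not set; it is required for cerebras/ models")
    return key
```

Failing early like this gives a clearer error than a rejected request, since these models are not served through the platform proxy.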

All Models

gpt-oss-120b

Reasoning · Speed
cerebras/gpt-oss-120b
OpenAI’s open-weight mixture-of-experts (MoE) model with 120B total parameters (5.1B active per token), running at up to 3000 tokens/s on Cerebras wafer-scale hardware. Near-parity with o4-mini on reasoning benchmarks. Supports extended thinking. Licensed under Apache 2.0.
  • $0.35 / $0.75 per 1M tokens (input / output)
  • 128K context
  • Text input
  • Thinking
  • Knowledge cutoff Jun 2024

zai-glm-4.7

Reasoning · Speed
cerebras/zai-glm-4.7
ZAI GLM 4.7 with 355B parameters, running at ~1000 tokens/s on Cerebras hardware. Strong multilingual reasoning and instruction-following capabilities.
  • $2.25 / $2.75 per 1M tokens (input / output)
  • 128K context
  • Text input
  • Knowledge cutoff ~early 2025
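
To compare the two models on price, a rough per-request estimate can be sketched as below. It assumes the paired prices listed above are USD per 1M input tokens and per 1M output tokens, in that order; the page does not state the unit, so treat that as an assumption. The function estimate_cost is a hypothetical helper, not a Timbal or Cerebras API.

```python
# Prices copied from the model cards above.
# ASSUMPTION: (input, output) in USD per 1 million tokens.
PRICES_PER_MILLION = {
    "cerebras/gpt-oss-120b": (0.35, 0.75),
    "cerebras/zai-glm-4.7": (2.25, 2.75),
}


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request under the assumed pricing unit."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    in_price, out_price = PRICES_PER_MILLION[model_id]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000
```

Under these assumptions, a request with 10,000 input tokens and 2,000 output tokens on cerebras/zai-glm-4.7 comes to about $0.028.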