> ## Documentation Index
> Fetch the complete documentation index at: https://docs.timbal.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Cerebras

> Wafer-scale inference at world-record token speeds with specs, pricing, and capabilities

Source: [Cerebras model docs](https://inference-docs.cerebras.ai/introduction). All model IDs use the prefix `cerebras/`.

Powered by Cerebras wafer-scale chips, the world's largest AI accelerators, delivering up to 3000 tokens/second.

Cerebras models are not yet available through the Timbal platform proxy. A `CEREBRAS_API_KEY` is required to use these models. If you'd like to access Cerebras models via your Timbal API key, please [contact sales](https://timbal.ai).

## All Models

### `cerebras/gpt-oss-120b`

Reasoning · Speed

OpenAI's open-weight MoE model with 120B total parameters (5.1B active per token), running at up to 3000 tokens/s on Cerebras wafer-scale hardware. Near-parity with o4-mini on reasoning benchmarks. Supports extended thinking. Released under the Apache 2.0 license.

* \$0.35 / \$0.75
* 128K context
* Text input
* Thinking
* Knowledge cutoff Jun 2024

### `cerebras/zai-glm-4.7`

Reasoning · Speed

ZAI GLM 4.7 with 355B parameters, running at \~1000 tokens/s on Cerebras hardware. Strong multilingual reasoning and instruction-following capabilities.

* \$2.25 / \$2.75
* 128K context
* Text input
* Knowledge cutoff \~early 2025
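
To compare the two models listed above, the following is a minimal sketch that turns the quoted prices and speeds into rough cost and latency estimates. It rests on assumptions this page does not state: the two prices are read as USD per 1M input tokens and USD per 1M output tokens, respectively, and the quoted tokens/s figures are treated as peak output throughput. The names `ModelSpec`, `CEREBRAS_MODELS`, `estimate_cost_usd`, `min_generation_seconds`, and `has_api_key` are hypothetical helpers for illustration, not part of the Timbal or Cerebras SDKs.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    # Assumption: the first price on the page is USD per 1M input tokens.
    input_usd_per_mtok: float
    # Assumption: the second price on the page is USD per 1M output tokens.
    output_usd_per_mtok: float
    # Peak speed quoted on the page, treated here as output tokens per second.
    tokens_per_second: int


# Values copied from the model list on this page.
CEREBRAS_MODELS = {
    "cerebras/gpt-oss-120b": ModelSpec(0.35, 0.75, 3000),
    "cerebras/zai-glm-4.7": ModelSpec(2.25, 2.75, 1000),
}


def estimate_cost_usd(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Rough request cost in USD under the per-1M-token pricing assumption."""
    spec = CEREBRAS_MODELS[model_id]
    return (
        input_tokens * spec.input_usd_per_mtok
        + output_tokens * spec.output_usd_per_mtok
    ) / 1_000_000


def min_generation_seconds(model_id: str, output_tokens: int) -> float:
    """Lower bound on generation time if the quoted peak speed were sustained."""
    return output_tokens / CEREBRAS_MODELS[model_id].tokens_per_second


def has_api_key(env=os.environ) -> bool:
    """True if CEREBRAS_API_KEY is set to a non-empty value in the environment."""
    return bool(env.get("CEREBRAS_API_KEY"))


if __name__ == "__main__":
    for model_id in CEREBRAS_MODELS:
        cost = estimate_cost_usd(model_id, input_tokens=200_000, output_tokens=50_000)
        secs = min_generation_seconds(model_id, output_tokens=50_000)
        print(f"{model_id}: ~${cost:.4f}, at least {secs:.1f}s of generation")
    if not has_api_key():
        print("CEREBRAS_API_KEY is not set; these models cannot be called.")
```

The time estimate is only a lower bound: real throughput depends on load, prompt length, and network latency, so treat the output as a comparison aid rather than a guarantee.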