Wafer

Browse models provided by Wafer (Terms of Service)

3 models

Z.ai: GLM 5.2
GLM 5.2 is a large-scale reasoning model from Z.ai. It supports text input and output with a 1M-token context window, and is suited for long-horizon agent workflows, project-level software engineering, and complex multi-step automation. Reasoning efforts `high` and `xhigh` are supported; `xhigh` maps to max reasoning. It is particularly strong at coding and tool use across long-running tasks, able to maintain engineering context and follow standards consistently through a full development workflow, from requirements to multi-platform deployment, in a single task.
by z-ai | Jun 16, 2026 | 1.05M context | $1.20/M input tokens | $4.10/M output tokens

Z.ai: GLM 5.1

GLM-5.1 delivers a major leap in coding capability, with particularly significant gains in handling long-horizon tasks. Unlike previous models built around minute-level interactions, GLM-5.1 can work independently and continuously on a single task for more than 8 hours, autonomously planning, executing, and improving itself throughout the process, ultimately delivering complete, engineering-grade results.

by z-ai | Apr 7, 2026 | 203K context | $1/M input tokens | $3.20/M output tokens

Google: Gemma 4 26B A4B

Gemma 4 26B A4B IT is an instruction-tuned Mixture-of-Experts (MoE) model from Google DeepMind. Of its 25.2B total parameters, only 3.8B activate per token during inference, so it delivers near-31B quality at a fraction of the compute cost. It supports multimodal input including text, images, and video (up to 60s at 1fps), and features a 256K token context window, native function calling, a configurable thinking/reasoning mode, and structured output support. Released under Apache 2.0.

by google | Apr 3, 2026 | 262K context | $0.13/M input tokens | $0.40/M output tokens
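
The per-million-token prices listed above can be turned into a rough per-request cost estimate. The following is a minimal sketch, not an official billing formula: the `Pricing` class, the `PRICES` table keys, and the `estimate_cost` helper are hypothetical names introduced for illustration, and the calculation assumes cost scales linearly with token counts, with no caching discounts, tiered pricing, or other fees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD prices per one million tokens, as shown in the listing."""
    input_per_million: float
    output_per_million: float


# Prices copied from the listing; the dictionary keys are illustrative
# labels, not confirmed API model identifiers.
PRICES = {
    "GLM 5.2": Pricing(1.20, 4.10),
    "GLM 5.1": Pricing(1.00, 3.20),
    "Gemma 4 26B A4B": Pricing(0.13, 0.40),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, assuming simple linear pricing."""
    p = PRICES[model]
    total = (input_tokens * p.input_per_million
             + output_tokens * p.output_per_million)
    return total / 1_000_000


if __name__ == "__main__":
    # Compare the three models on the same hypothetical workload.
    for name in PRICES:
        cost = estimate_cost(name, input_tokens=200_000, output_tokens=10_000)
        print(f"{name}: ${cost:.4f}")
```

Keeping the prices in one table makes it easy to compare models on the same workload; actual charges may differ from this estimate.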