Available Now · February 2026

The model that thinks
before it answers.

GLM-5 is Zhipu AI's most capable foundation model — 745 billion parameters, 202K context, and frontier-level reasoning in a single unified architecture.

745B Parameters
202K Context
256 Experts
44B Active Params

Compatible with

vLLM SGLang HuggingFace ModelScope Ascend NPU

Everything you need.
Nothing you don't.

Multi-Step Reasoning

Chain complex inferences across mathematical proofs, scientific analysis, and strategic planning. Enhanced System 2 thinking for problems that require depth, not shortcuts.

1 Decompose problem
2 Validate assumptions
3 Synthesize answer

Advanced Coding

Opus-level code generation across languages. Full-stack applications, terminal workflows, and SWE-bench Verified results.

async function solve(task) {
  const plan = await reason(task);
  return execute(plan);
}

Multimodal

Process images, documents, and text in a unified pipeline. Cross-modal understanding as a first-class capability.

Agentic AI

Plan, execute, and adapt through multi-step workflows. AutoGLM-powered autonomous task completion.

Creative Writing

Narratives, documents, and long-form content with natural conversational style.

Enterprise-Ready

Sparse MoE activates only about 5.9% of parameters (44B of 745B) per token for cost-efficient inference. Deploy anywhere.

Mixture of Experts,
mastery of everything.

Input
Sparse Router
E1
E2
E3
E4
E5
E6
E7
E8
E9
···
E256
8 of 256 experts activated per pass · about 5.9% of parameters active
Weighted Merge + MTP
Output Tokens
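
The routing flow in the diagram (sparse router, top 8 of 256 experts, weighted merge) can be sketched in a few lines of Python. This is a toy illustration, not GLM-5's implementation: the softmax over the selected experts, the scalar "experts", and the names `route` and `moe_forward` are all assumptions made for the example.

```python
import math
import random

NUM_EXPERTS = 256  # from the diagram above
TOP_K = 8          # experts activated per pass, from the diagram above


def route(logits, k=TOP_K):
    """Pick the k highest-scoring experts and softmax-normalise their scores."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)  # subtract the max for numerical stability
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, router_logits, experts, k=TOP_K):
    """Weighted merge of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in route(router_logits, k))


# Toy stand-ins (assumption): each "expert" is a scalar function, not a real FFN.
random.seed(0)
experts = [lambda x, s=i: x * (1 + s / NUM_EXPERTS) for i in range(NUM_EXPERTS)]
router_logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]

selected = route(router_logits)                     # [(expert_index, weight), ...]
output = moe_forward(1.0, router_logits, experts)   # only 8 experts are evaluated
```

Only the eight selected experts run for a given input, which is where the compute savings of a sparse MoE come from.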
Architecture
Mixture-of-Experts (MoE)
Total Parameters
745 Billion
Active Parameters
44 Billion per token
Expert Modules
256 total · 8 activated
Hidden Layers
78 transformer layers
Context Window
202,000 tokens
Attention
DeepSeek Sparse Attention (DSA)
Prediction
Multi-Token Prediction (MTP)
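
As a quick sanity check on the figures above, the 5.9% figure matches the share of active parameters (44B of 745B), not the share of active experts (8 of 256). A minimal calculation using only the numbers listed in the spec:

```python
# All four values are taken from the spec list above.
TOTAL_PARAMS_B = 745
ACTIVE_PARAMS_B = 44
NUM_EXPERTS = 256
ACTIVE_EXPERTS = 8

active_param_share = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
active_expert_share = ACTIVE_EXPERTS / NUM_EXPERTS

print(f"active parameter share: {active_param_share:.1%}")   # 5.9%
print(f"active expert share:    {active_expert_share:.3%}")  # 3.125%
```

The parameter share is higher than the expert share presumably because some components, such as attention layers and embeddings, run for every token regardless of routing.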

Benchmarks don't lie.

SWE-bench Verified · Coding
MATH-500 · Mathematics
GPQA Diamond · Reasoning
T-Bench · Agent Tasks
MMMU · Multimodal
Arena ELO · Creative Writing

Scores represent publicly reported evaluation results. Visit z.ai for the latest data.

From zero to GLM-5
in under a minute.

Available via API, open-weight download, or local deployment with the frameworks you already use.

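from zhipuai import ZhipuAI

# Replace "your-key" with your own API key.
client = ZhipuAI(api_key="your-key")

# Send a single-turn chat request to GLM-5.
response = client.chat.completions.create(
    model="glm-5",
    messages=[{
        "role": "user",
        "content": "Hello, GLM-5!"
    }]
)

# Print the text of the first completion.
print(response.choices[0].message.content)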
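
For conversations longer than one turn, the same `messages` list can carry a system prompt and earlier exchanges. The sketch below only assembles the request arguments and sends nothing. The helper `build_request`, the `GLM_API_KEY` variable name, and the use of a `system` role are assumptions for illustration, so check them against the API documentation before relying on them.

```python
import os


def build_request(prompt, history=None, system=None, model="glm-5"):
    """Assemble keyword arguments for client.chat.completions.create()."""
    messages = []
    if system:
        messages.append({"role": "system", "content": system})
    messages.extend(history or [])  # earlier user/assistant turns, oldest first
    messages.append({"role": "user", "content": prompt})
    return {"model": model, "messages": messages}


# Hypothetical variable name; use whatever your deployment defines.
api_key = os.environ.get("GLM_API_KEY", "your-key")

request = build_request(
    "Summarize the MoE design in one sentence.",
    history=[
        {"role": "user", "content": "Hello, GLM-5!"},
        {"role": "assistant", "content": "Hello! How can I help?"},
    ],
    system="You are a concise assistant.",
)
# Send it with: client.chat.completions.create(**request)
```

Keeping the API key in an environment variable rather than in source code avoids committing it by accident.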