Chinese LLM Tracker
Track and compare the latest releases from China's leading AI labs. Qwen, DeepSeek, Kimi, GLM, and Baichuan - all in one place.
5 Providers · 26 Models Tracked · 19 Open Source · Latest Release: Dec 2025
Providers
Qwen
Alibaba Cloud
Alibaba's flagship LLM series, known for strong multilingual capabilities, a wide range of model sizes from 0.5B to 235B parameters, and industry-leading open-source coding models.
DeepSeek
DeepSeek AI
Known for efficient MoE architectures and groundbreaking reasoning models. DeepSeek-V3 and R1 series have achieved remarkable performance with innovative training techniques.
Kimi
Moonshot AI
Moonshot AI's Kimi series is known for pioneering long-context understanding and efficient MoE architectures. K2 became a major open-source competitor in 2025.
GLM
Zhipu AI (Z.ai)
Zhipu AI's GLM series has evolved from ChatGLM to the powerful GLM-4.7, competing with GPT-5 on coding tasks. Known for strong Chinese language capabilities and open-source models.
Baichuan
Baichuan Intelligence
Baichuan focuses on Chinese language excellence and specialized vertical models. Strong in finance and healthcare domains. Backed by Alibaba and Xiaomi.
Latest Releases
GLM
GLM-4.7
400B parameters
Major release competing with GPT-5.2 on coding. Ranked #1 on Code Arena among open-source and domestic models. AIME 2025: 95.7% accuracy.
- LiveCodeBench: 84.9% (beats Claude 4.5)
- SWE-bench Verified: 73.8% (SOTA open-source)
- HLE benchmark: 42.8%
Benchmarks: HumanEval 91.5% · GSM8K 97% · MATH 95.7%
Kimi
Kimi K2 Thinking
1T (32B active) parameters
Reasoning and tool-using thinking agent that can execute 200-300 sequential tool calls without human intervention. State-of-the-art agentic capabilities (a minimal loop sketch follows after this card).
- 200-300 sequential tool calls
- LiveCodeBench-v6: 83.1%
- Advanced agentic reasoning
Benchmarks: HumanEval 91% · GSM8K 96.5%
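The card above describes K2 Thinking driving hundreds of sequential tool calls. The sketch below shows the general shape of such a loop; `call_model` and the `TOOLS` table are hypothetical stand-ins, not Moonshot's actual API, and a real agent would parse structured tool-call messages from the model.

```python
import json

# Hypothetical stubs for illustration only; Moonshot's real API differs.
def call_model(messages):
    """Stand-in for a K2 Thinking call. A real model would return either
    a tool request or a final answer based on the conversation so far."""
    return {"tool": None, "answer": "done"}

TOOLS = {"search": lambda query: f"top hits for {query!r}"}

def agent_loop(task, max_steps=300):
    """Execute tool calls sequentially until the model emits a final answer.

    The step budget mirrors the 200-300 calls quoted above: each iteration
    feeds the previous tool result back so the model can plan the next call.
    """
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = call_model(messages)
        if reply["tool"] is None:          # no more tools needed: finish
            return reply["answer"]
        result = TOOLS[reply["tool"]](reply.get("args", ""))
        messages.append({"role": "tool", "content": json.dumps(result)})
    raise RuntimeError("tool budget exhausted without a final answer")

print(agent_loop("summarize the latest GLM release"))  # -> "done"
```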
Kimi
Kimi Linear
48B (3B active) parameters
Uses Kimi Delta Attention (KDA) for efficient long-context processing, reducing memory usage and improving generation speed at longer context windows (see the delta-rule sketch after this card).
- 1M token context window
- Novel Delta Attention mechanism
- Efficient memory usage
Benchmarks: MMLU 78%
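KDA builds on the delta rule used by DeltaNet-style linear attention: instead of accumulating every key-value pair, a fixed-size fast-weight state overwrites stale associations, so memory stays constant in sequence length. Below is a minimal NumPy sketch of that underlying recurrence; KDA's actual formulation adds gating and chunked hardware-efficient computation, and the function name and shapes here are illustrative.

```python
import numpy as np

def delta_rule_attention(q, k, v, beta):
    """Recurrent delta-rule linear attention (illustrative, not KDA itself).

    q, k, v: (T, d) arrays; beta: (T,) per-token write strengths in [0, 1].
    The state S is a d x d fast-weight matrix, so memory is O(d^2)
    regardless of context length T.
    """
    T, d = q.shape
    S = np.zeros((d, d))                 # fast-weight key -> value memory
    out = np.zeros((T, d))
    for t in range(T):
        pred = S @ k[t]                  # value the memory currently predicts
        # Delta rule: move the stored value toward v[t] rather than just
        # adding it, which overwrites stale associations for this key.
        S = S + beta[t] * np.outer(v[t] - pred, k[t])
        out[t] = S @ q[t]                # read out with the query
    return out

rng = np.random.default_rng(0)
T, d = 6, 4
y = delta_rule_attention(rng.normal(size=(T, d)), rng.normal(size=(T, d)),
                         rng.normal(size=(T, d)), np.full(T, 0.5))
print(y.shape)  # (6, 4)
```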
DeepSeek
DeepSeek-V3.2
671B (37B active) parameters
Latest V3 iteration with improved general capabilities. Enhanced reasoning and coding performance over V3.1.
- SWE-bench Verified: 73.1%
- Improved performance on knowledge and academic tasks
- LiveCodeBench-v6: 83.3%
Benchmarks: MMLU 88.5% · HumanEval 90% · GSM8K 95.5%
Kimi
Kimi K2
1T (32B active) parameters
Mixture-of-experts model with 1T total parameters, of which 32B are active per token (see the routing sketch after this card). State-of-the-art open-source performance on coding benchmarks. Reportedly trained for $4.6M, rivaling ChatGPT and Claude.
- SOTA open-source coding performance
- Trained for only $4.6M
- Beats GPT-4o on multiple benchmarks
Benchmarks: MMLU 87.5% · HumanEval 90.5% · GSM8K 95%
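The "1T total, 32B active" split comes from mixture-of-experts routing: a small router picks a few experts per token, so only a fraction of the weights run on any forward pass. A minimal sketch of top-k routing is below; production MoEs add shared experts, load-balancing losses, and expert parallelism, and all names here are illustrative.

```python
import numpy as np

def moe_layer(x, experts, router_w, top_k=2):
    """Minimal top-k mixture-of-experts routing for one token (illustrative).

    x: (d,) token activation; experts: list of callables (the expert FFNs);
    router_w: (n_experts, d) router weights. Only top_k experts execute,
    which is why a 1T-parameter MoE can have only ~32B "active" params.
    """
    logits = router_w @ x
    top = np.argsort(logits)[-top_k:]            # indices of chosen experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                         # softmax over chosen experts
    # Weighted sum of the selected experts' outputs; the rest stay idle.
    return sum(g * experts[i](x) for g, i in zip(gates, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [(lambda W: (lambda x: np.tanh(W @ x)))(rng.normal(size=(d, d)))
           for _ in range(n_experts)]
router_w = rng.normal(size=(n_experts, d))
print(moe_layer(rng.normal(size=d), experts, router_w).shape)  # (8,)
```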
Kimi
Kimi-Dev
72B parameters
Coding-focused model built on Qwen2.5-72B. State-of-the-art among open-source models on SWE-bench Verified.
- SOTA open-source on SWE-bench Verified
- Built on Qwen2.5-72B foundation
- Specialized for software development
Benchmarks: HumanEval 89%
Benchmark Leaders
Top MMLU Scores
1. DeepSeek-R1-0528 (DeepSeek)
2. DeepSeek-V3.2 (DeepSeek)
3. DeepSeek-V3-0324 (DeepSeek)
4. Kimi K2 (Kimi)
5. DeepSeek-V3 (DeepSeek)
Top HumanEval Scores
1. Qwen2.5-Coder-32B-Instruct (Qwen)
2. Qwen3-235B-A22B (Qwen)
3. GLM-4.7 (GLM)
4. Kimi K2 Thinking (Kimi)
5. Kimi K2 (Kimi)
Compare Models
Select models to compare benchmarks side-by-side with interactive charts (a minimal static sketch follows below).
Start Comparing
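The site's comparison view is interactive; as a rough stand-in, the snippet below plots two of the cards' benchmark rows side-by-side with matplotlib. The scores are copied from the release cards above; the library choice and layout are just one way to do it.

```python
import numpy as np
import matplotlib.pyplot as plt

# Scores copied from the release cards above.
benchmarks = ["HumanEval", "GSM8K"]
scores = {"GLM-4.7": [91.5, 97.0], "Kimi K2 Thinking": [91.0, 96.5]}

x = np.arange(len(benchmarks))
width = 0.35
for i, (model, vals) in enumerate(scores.items()):
    plt.bar(x + i * width, vals, width, label=model)  # one bar group per model
plt.xticks(x + width / 2, benchmarks)
plt.ylabel("Score (%)")
plt.title("Side-by-side benchmark comparison")
plt.legend()
plt.show()
```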