Release Timeline
A chronological view of major model releases from Chinese LLM providers.
December 2025
December 22, 2025
GLM-4.7
400B
Major release competing with GPT-5.2 on coding. Ranked #1 on Code Arena among open-source and domestic models. AIME 2025: 95.7% accuracy.
- LiveCodeBench: 84.9% (beats Claude 4.5)
- SWE-bench Verified: 73.8% (SOTA open-source)
November 2025
November 1, 2025
Kimi K2 Thinking
1T (32B active)
Thinking agent for reasoning and tool use. Can execute 200-300 sequential tool calls without human intervention (a minimal driver loop is sketched below the highlights). State-of-the-art agentic capabilities.
- 200-300 sequential tool calls
- LiveCodeBench-v6: 83.1%
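Chains of this length are typically driven by a simple harness loop: the model emits a tool call, the harness executes it and appends the result, and generation resumes until the model replies without requesting another tool. Below is a minimal sketch against an OpenAI-compatible chat API; the endpoint, model identifier, `search` tool, and `run_tool` dispatcher are illustrative assumptions, not Moonshot's published interface.

```python
# Minimal sequential tool-calling loop; endpoint and model name are assumptions.
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_KEY")

TOOLS = [{  # one illustrative tool, declared as JSON Schema
    "type": "function",
    "function": {
        "name": "search",
        "description": "Search the web for a query.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

def run_tool(name: str, arguments: dict) -> str:
    """Hypothetical dispatcher; a real harness would invoke actual tools here."""
    return f"stub result for {name}({arguments})"

messages = [{"role": "user", "content": "Research topic X and summarize your findings."}]
for _ in range(300):  # cap consistent with the reported 200-300 call budget
    msg = client.chat.completions.create(
        model="kimi-k2-thinking",  # assumed identifier
        messages=messages,
        tools=TOOLS,
    ).choices[0].message
    messages.append(msg)
    if not msg.tool_calls:       # plain-text reply: the agent is finished
        break
    for call in msg.tool_calls:  # execute each requested tool, feed results back
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": run_tool(call.function.name, json.loads(call.function.arguments)),
        })
print(msg.content)
```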
October 2025
October 1, 2025
Kimi Linear
48B (3B active)
Uses Kimi Delta Attention (KDA) for efficient long-context processing, reducing memory usage and improving generation speed at long context windows; a toy delta-rule sketch follows the highlights.
- 1M token context window
- Novel Delta Attention mechanism
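KDA belongs to the delta-rule family of linear attention: instead of a key-value cache that grows with sequence length, each head keeps a fixed-size associative state that is corrected toward every new key/value pair, which is what keeps memory flat at long context. The toy single-head sketch below shows the plain delta rule only; KDA's gating and per-channel decay are omitted, and all shapes are illustrative.

```python
# Toy delta-rule linear attention: constant-size state per head
# instead of a growing KV cache. KDA's specific gating is omitted.
import torch

def delta_rule_attention(q, k, v, beta):
    """q, k, v: (seq, d); beta: (seq,) write strengths in [0, 1]."""
    d = q.shape[-1]
    S = torch.zeros(d, d)              # fixed-size associative state (key -> value map)
    outputs = []
    for t in range(q.shape[0]):
        pred = S @ k[t]                # value currently stored under key k[t]
        # Delta rule: nudge the stored value toward v[t] by beta[t]
        S = S + beta[t] * torch.outer(v[t] - pred, k[t])
        outputs.append(S @ q[t])       # read out with the query
    return torch.stack(outputs)

q, k, v = (torch.randn(16, 8) for _ in range(3))
beta = torch.sigmoid(torch.randn(16))
out = delta_rule_attention(q, k, v, beta)  # (16, 8); state stayed 8x8 throughout
```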
September 2025
September 1, 2025
DeepSeek-V3.2
671B (37B active)
Latest V3 iteration with improved general capabilities. Enhanced reasoning and coding performance over V3.1.
- SWE-bench Verified: 73.1%
- Improved knowledge and academic tasks
July 2025
July 14, 2025
Kimi K2
1T (32B active)
Mixture-of-experts model with 1T total and 32B active parameters. State-of-the-art open-source performance on coding benchmarks; reportedly trained for about $4.6M while rivaling ChatGPT and Claude.
- SOTA open-source coding performance
- Trained for only $4.6M
June 2025
June 1, 2025
Kimi-Dev
72B
Coding-focused model built on Qwen2.5-72B. State-of-the-art among open-source models on SWE-bench Verified.
- SOTA open-source on SWE-bench Verified
- Built on Qwen2.5-72B foundation
May 2025
May 28, 2025
DeepSeek-R1-0528
671B (37B active)
Major reasoning upgrade: AIME 2025 accuracy improved from 70% to 87.5%, approaching o3 and Gemini 2.5 Pro.
- Intelligence Index: 68 (Artificial Analysis)
- AIME 2025: 87.5%
April 2025
April 28, 2025
Qwen3-235B-A22B
235B (22B active)
Flagship MoE reasoning model with hybrid thinking/non-thinking modes (the toggle is sketched below the highlights). Trained on 36T tokens across 119 languages. Outperforms DeepSeek-R1 on 17 of 23 benchmarks.
- State-of-the-art open-source reasoning model
- Hybrid thinking mode for complex problems
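The hybrid mode is exposed through the chat template: per the Qwen3 model cards, an `enable_thinking` flag controls whether the model emits a `<think>...</think>` reasoning block before its answer. A minimal Transformers sketch follows; flag and model names track the published model card but may vary across releases.

```python
# Toggling Qwen3's hybrid thinking mode via the chat template (per the model card).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-235B-A22B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Prove that sqrt(2) is irrational."}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # False skips the <think> block for fast, direct answers
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(output[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```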
April 1, 2025
Kimi-VL
16B (3B active)
Open-source vision-language MoE model. Efficient multimodal understanding with only 3B active parameters.
- 16B MoE with 3B active
- Efficient multimodal processing
March 2025
March 24, 2025
DeepSeek-V3-0324
671B (37B active)
Significant benchmark improvements. MMLU-Pro: 75.9 to 81.2 (+5.3). GPQA: 59.1 to 68.4 (+9.3). AIME: 39.6 to 59.4 (+19.8).
- MMLU-Pro: 81.2
- GPQA: 68.4
March 6, 2025
QwQ-32B
32B
Medium-sized reasoning model achieving performance comparable to DeepSeek-R1 (671B). Strong step-by-step reasoning capabilities.
- Competitive with 671B DeepSeek-R1 using only 32B parameters
- ArenaHard: 89.5
March 1, 2025
Baichuan-M2Plus
Unknown
Medical-domain specialized model with a significantly lower hallucination rate than general-purpose models, reported at roughly one-third that of DeepSeek.
- 3x lower hallucination rate
- Medical domain specialized
January 2025
January 20, 2025
Kimi K1.5
Unknown
First major Kimi reasoning model. Matches OpenAI o1 in math, coding, and multimodal reasoning. Trained with reinforcement learning.
- Matches OpenAI o1 performance
- Free with no usage limits
December 2024
December 26, 2024
DeepSeek-V3
671B (37B active)
Breakthrough MoE model: 671B parameters with only 37B activated per token (a generic routing sketch follows the highlights). Introduced auxiliary-loss-free load balancing and a multi-token prediction objective. Trained on 14.8T tokens.
- Outperforms GPT-4o and Claude 3.5 Sonnet on MMLU
- Cost-efficient MoE architecture
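The 671B-total/37B-active arithmetic comes from sparse expert routing: a router scores each token against every expert but dispatches it to only the top-k, so per-token compute tracks active rather than total parameters. The sketch below is the generic textbook pattern, not DeepSeek's exact router (which adds shared experts and the auxiliary-loss-free balancing noted above); all sizes are toy values.

```python
# Generic top-k MoE layer: only k of n_experts MLPs run per token,
# so FLOPs scale with active parameters rather than the total count.
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                    # x: (tokens, d_model)
        scores = self.router(x).softmax(-1)  # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)
        weights = weights / weights.sum(-1, keepdim=True)  # renormalize over chosen experts
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)  # tokens routed to expert e
            if token_ids.numel():
                out[token_ids] += weights[token_ids, slot, None] * expert(x[token_ids])
        return out

moe = TopKMoE()
y = moe(torch.randn(10, 64))  # each token passes through only 2 of the 8 experts
```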
December 24, 2024
Baichuan4-Finance
Unknown
Specialized financial model: over 95% accuracy in banking, insurance, funds, and securities, and 93.62% overall on the FLAME-Cer benchmark.
- FLAME-Cer: 93.62%
- Surpasses GPT-4o on financial tasks
November 2024
November 12, 2024
Qwen2.5-Coder-32B-Instruct
32B
State-of-the-art open-source code LLM. Trained on 5.5T tokens. Performance matches GPT-4o on coding tasks. Supports 40+ programming languages.
- HumanEval pass@1: 92.7%
- Aider benchmark: 73.7 (comparable to GPT-4o)
September 2024
September 19, 2024
Qwen2.5-72B-Instruct
72B
Major upgrade with improved reasoning and 128K context. Surpasses Llama-3.1-405B on several benchmarks. MMLU improved from 84.2 to 86.1.
- Beats Llama-3.1-405B despite smaller size
- 128K context window
August 2024
August 29, 2024
Qwen2-VL-72B
72B
Multimodal vision-language model with strong image understanding, document analysis, and visual reasoning. Its dynamic-resolution image handling is sketched below the highlights.
- State-of-the-art open-source VLM
- Dynamic resolution support
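Dynamic resolution means each image is mapped to a variable number of visual tokens rather than a fixed grid; in the published Transformers integration that budget is bounded via `min_pixels`/`max_pixels` on the processor. A sketch following the model card; the image URL is a placeholder, and `qwen_vl_utils` is the helper package the card uses.

```python
# Bounding Qwen2-VL's dynamic-resolution token budget (per the model card).
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

model_id = "Qwen/Qwen2-VL-72B-Instruct"
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
# min_pixels/max_pixels cap how many visual tokens each image may consume
processor = AutoProcessor.from_pretrained(
    model_id, min_pixels=256 * 28 * 28, max_pixels=1280 * 28 * 28
)

messages = [{"role": "user", "content": [
    {"type": "image", "image": "https://example.com/page.png"},  # placeholder URL
    {"type": "text", "text": "Describe this document page."},
]}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text], images=image_inputs, videos=video_inputs,
    padding=True, return_tensors="pt",
).to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(
    output[:, inputs.input_ids.shape[-1]:], skip_special_tokens=True
)[0])
```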
July 2024
July 1, 2024
CodeGeeX-4
9B
Code generation model built on GLM-4-9B. IDE extensions available for VS Code, JetBrains, and more.
- IDE integration available
- Multi-language support
June 2024
June 17, 2024
DeepSeek-Coder-V2
236B (21B active)
First open-source model to match GPT-4 Turbo on coding. Supports 338 programming languages. 128K context length.
- HumanEval: 90.2%
- MBPP: 76.2%
June 5, 2024
GLM-4-9B
9B
Efficient open-source version of GLM-4. Strong performance for its size with 128K context support.
- Apache 2.0 license
- 128K context
June 1, 2024
Baichuan4-Turbo
Unknown
Cost-optimized version of Baichuan 4 with the lowest deployment cost among comparable models; runs on two RTX 4090 GPUs.
- Runs on 2x RTX 4090
- Lowest cost in class
May 2024
May 24, 2024
CogVLM2
19B
Vision-language model based on Llama 3. Strong image understanding and visual reasoning capabilities.
- Built on Llama 3
- Strong OCR capabilities
May 22, 2024
Baichuan 4
Unknown
Fourth generation with a 10%+ improvement in general capabilities (math +14%, code +9%). Ranked #1 on the SuperCLUE Chinese benchmark.
- SuperCLUE score: 80.64 (#1)
- Beats GPT-4-Turbo on Chinese tasks
January 2024
January 16, 2024
GLM-4
Unknown
Fourth generation GLM with significantly improved reasoning and Chinese language capabilities. 128K context window.
- 128K context window
- Strong Chinese understanding
September 2023
September 6, 2023
Baichuan2-13B
13B
Open-source bilingual model with strong Chinese capabilities. Good balance of performance and efficiency.
- Open source
- Strong Chinese performance