Release Timeline
A chronological view of major model releases from Chinese LLM providers.
December 2025
December 22, 2025
GLM-4.7
400B
Major release competing with GPT-5.2 on coding. Ranked #1 on Code Arena among open-source and domestic models. AIME 2025: 95.7% accuracy.
- LiveCodeBench: 84.9% (beats Claude 4.5)
- SWE-bench Verified: 73.8% (SOTA open-source)
November 2025
November 1, 2025
Kimi K2 Thinking
1T (32B active)
Thinking agent for reasoning and tool use. Can execute 200-300 sequential tool calls without human intervention (a minimal driver loop is sketched below the highlights). State-of-the-art agentic capabilities.
- 200-300 sequential tool calls
- LiveCodeBench-v6: 83.1%
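Chains of this length are typically driven by a simple harness loop: the model emits a tool call, the harness executes it and appends the result, and generation resumes until the model replies without requesting another tool. Below is a minimal sketch against an OpenAI-compatible chat API; the endpoint, model identifier, `search` tool, and `run_tool` dispatcher are illustrative assumptions, not Moonshot's published interface.

```python
# Minimal sequential tool-calling loop; endpoint and model name are assumptions.
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.example.com/v1", api_key="YOUR_KEY")

TOOLS = [{  # one illustrative tool, declared as JSON Schema
    "type": "function",
    "function": {
        "name": "search",
        "description": "Search the web for a query.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

def run_tool(name: str, arguments: dict) -> str:
    """Hypothetical dispatcher; a real harness would invoke actual tools here."""
    return f"stub result for {name}({arguments})"

messages = [{"role": "user", "content": "Research topic X and summarize your findings."}]
for _ in range(300):  # cap consistent with the reported 200-300 call budget
    msg = client.chat.completions.create(
        model="kimi-k2-thinking",  # assumed identifier
        messages=messages,
        tools=TOOLS,
    ).choices[0].message
    messages.append(msg)
    if not msg.tool_calls:       # plain-text reply: the agent is finished
        break
    for call in msg.tool_calls:  # execute each requested tool, feed results back
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": run_tool(call.function.name, json.loads(call.function.arguments)),
        })
print(msg.content)
```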
October 2025
October 1, 2025
Kimi Linear
48B (3B active)
Uses Kimi Delta Attention (KDA) for efficient long-context processing, reducing memory usage and improving generation speed at long context windows; a toy delta-rule sketch follows the highlights.
- 1M token context window
- Novel Delta Attention mechanism
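KDA belongs to the delta-rule family of linear attention: instead of a key-value cache that grows with sequence length, each head keeps a fixed-size associative state that is corrected toward every new key/value pair, which is what keeps memory flat at long context. The toy single-head sketch below shows the plain delta rule only; KDA's gating and per-channel decay are omitted, and all shapes are illustrative.

```python
# Toy delta-rule linear attention: constant-size state per head
# instead of a growing KV cache. KDA's specific gating is omitted.
import torch

def delta_rule_attention(q, k, v, beta):
    """q, k, v: (seq, d); beta: (seq,) write strengths in [0, 1]."""
    d = q.shape[-1]
    S = torch.zeros(d, d)              # fixed-size associative state (key -> value map)
    outputs = []
    for t in range(q.shape[0]):
        pred = S @ k[t]                # value currently stored under key k[t]
        # Delta rule: nudge the stored value toward v[t] by beta[t]
        S = S + beta[t] * torch.outer(v[t] - pred, k[t])
        outputs.append(S @ q[t])       # read out with the query
    return torch.stack(outputs)

q, k, v = (torch.randn(16, 8) for _ in range(3))
beta = torch.sigmoid(torch.randn(16))
out = delta_rule_attention(q, k, v, beta)  # (16, 8); state stayed 8x8 throughout
```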
September 2025
September 1, 2025
DeepSeek-V3.2
671B (37B active)
Latest V3 iteration with improved general capabilities. Enhanced reasoning and coding performance over V3.1.
- SWE-bench Verified: 73.1%
- Improved knowledge and academic tasks
July 2025
July 14, 2025
Kimi K2
1T (32B active)
Mixture-of-experts model with 1T total and 32B active parameters. State-of-the-art open-source performance on coding benchmarks; reportedly trained for about $4.6M while rivaling ChatGPT and Claude.
- SOTA open-source coding performance
- Trained for only $4.6M
June 2025
June 1, 2025
Kimi-Dev
72B
Coding-focused model built on Qwen2.5-72B. State-of-the-art among open-source models on SWE-bench Verified.
- SOTA open-source on SWE-bench Verified
- Built on Qwen2.5-72B foundation
May 2025
May 28, 2025
DeepSeek-R1-0528
671B (37B active)
Major reasoning upgrade: AIME 2025 accuracy improved from 70% to 87.5%, approaching o3 and Gemini 2.5 Pro.
- Intelligence Index: 68 (Artificial Analysis)
- AIME 2025: 87.5%
April 2025
April 28, 2025
Qwen3-235B-A22B
235B (22B active)
Flagship MoE reasoning model with hybrid thinking/non-thinking modes (the toggle is sketched below the highlights). Trained on 36T tokens across 119 languages. Outperforms DeepSeek-R1 on 17 of 23 benchmarks.
- State-of-the-art open-source reasoning model
- Hybrid thinking mode for complex problems
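The hybrid mode is exposed through the chat template: per the Qwen3 model cards, an `enable_thinking` flag controls whether the model emits a `<think>...</think>` reasoning block before its answer. A minimal Transformers sketch follows; flag and model names track the published model card but may vary across releases.

```python
# Toggling Qwen3's hybrid thinking mode via the chat template (per the model card).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen3-235B-A22B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Prove that sqrt(2) is irrational."}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # False skips the <think> block for fast, direct answers
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=2048)
print(tokenizer.decode(output[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```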
April 1, 2025
Kimi-VL
16B (3B active)
Open-source vision-language MoE model. Efficient multimodal understanding with only 3B active parameters.
- 16B MoE with 3B active
- Efficient multimodal processing
March 2025
March 24, 2025
DeepSeek-V3-0324
671B (37B active)
Significant benchmark improvements. MMLU-Pro: 75.9 to 81.2 (+5.3). GPQA: 59.1 to 68.4 (+9.3). AIME: 39.6 to 59.4 (+19.8).
- MMLU-Pro: 81.2
- GPQA: 68.4
March 6, 2025
QwQ-32B
32B
Medium-sized reasoning model achieving performance comparable to DeepSeek-R1 (671B). Strong step-by-step reasoning capabilities.
- Competitive with 671B DeepSeek-R1 using only 32B parameters
- ArenaHard: 89.5
March 1, 2025
Baichuan-M2Plus
Unknown
Medical-domain specialized model with a significantly lower hallucination rate than general-purpose models, reported at roughly one-third that of DeepSeek.
- 3x lower hallucination rate
- Medical domain specialized
January 2025
January 20, 2025
Kimi K1.5
Unknown
First major Kimi reasoning model. Matches OpenAI o1 in math, coding, and multimodal reasoning. Trained with reinforcement learning.
- Matches OpenAI o1 performance
- Free with no usage limits
December 2024
December 26, 2024
DeepSeek-V3
671B (37B active)
Breakthrough MoE model: 671B parameters with only 37B activated per token (a generic routing sketch follows the highlights). Introduced auxiliary-loss-free load balancing and a multi-token prediction objective. Trained on 14.8T tokens.
- Outperforms GPT-4o and Claude 3.5 Sonnet on MMLU
- Cost-efficient MoE architecture
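The 671B-total/37B-active arithmetic comes from sparse expert routing: a router scores each token against every expert but dispatches it to only the top-k, so per-token compute tracks active rather than total parameters. The sketch below is the generic textbook pattern, not DeepSeek's exact router (which adds shared experts and the auxiliary-loss-free balancing noted above); all sizes are toy values.

```python
# Generic top-k MoE layer: only k of n_experts MLPs run per token,
# so FLOPs scale with active parameters rather than the total count.
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                    # x: (tokens, d_model)
        scores = self.router(x).softmax(-1)  # (tokens, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)
        weights = weights / weights.sum(-1, keepdim=True)  # renormalize over chosen experts
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)  # tokens routed to expert e
            if token_ids.numel():
                out[token_ids] += weights[token_ids, slot, None] * expert(x[token_ids])
        return out

moe = TopKMoE()
y = moe(torch.randn(10, 64))  # each token passes through only 2 of the 8 experts
```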
December 24, 2024
Baichuan4-Finance
Unknown
Specialized financial model: over 95% accuracy in banking, insurance, funds, and securities, and 93.62% overall on the FLAME-Cer benchmark.
- FLAME-Cer: 93.62%
- Surpasses GPT-4o on financial tasks
November 2024
November 12, 2024
Qwen2.5-Coder-32B-Instruct
32B
State-of-the-art open-source code LLM. Trained on 5.5T tokens. Performance matches GPT-4o on coding tasks. Supports 40+ programming languages.
- HumanEval pass@1: 92.7%
- Aider benchmark: 73.7 (comparable to GPT-4o)
September 2024
September 19, 2024
Qwen2.5-72B-Instruct
72B
Major upgrade with improved reasoning and 128K context. Surpasses Llama-3.1-405B on several benchmarks. MMLU improved from 84.2 to 86.1.
- Beats Llama-3.1-405B despite smaller size
- 128K context window
August 2024
August 29, 2024
Qwen2-VL-72B
72B
Multimodal vision-language model with strong image understanding, document analysis, and visual reasoning. Its dynamic-resolution image handling is sketched below the highlights.
- State-of-the-art open-source VLM
- Dynamic resolution support
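Dynamic resolution means each image is mapped to a variable number of visual tokens rather than a fixed grid; in the published Transformers integration that budget is bounded via `min_pixels`/`max_pixels` on the processor. A sketch following the model card; the image URL is a placeholder, and `qwen_vl_utils` is the helper package the card uses.

```python
# Bounding Qwen2-VL's dynamic-resolution token budget (per the model card).
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

model_id = "Qwen/Qwen2-VL-72B-Instruct"
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
# min_pixels/max_pixels cap how many visual tokens each image may consume
processor = AutoProcessor.from_pretrained(
    model_id, min_pixels=256 * 28 * 28, max_pixels=1280 * 28 * 28
)

messages = [{"role": "user", "content": [
    {"type": "image", "image": "https://example.com/page.png"},  # placeholder URL
    {"type": "text", "text": "Describe this document page."},
]}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text], images=image_inputs, videos=video_inputs,
    padding=True, return_tensors="pt",
).to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(
    output[:, inputs.input_ids.shape[-1]:], skip_special_tokens=True
)[0])
```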
July 2024
July 1, 2024
CodeGeeX-4
9B
Code generation model built on GLM-4-9B. IDE extensions available for VS Code, JetBrains, and more.
- IDE integration available
- Multi-language support
June 2024
June 17, 2024
DeepSeek-Coder-V2
236B (21B active)
First open-source model to match GPT-4 Turbo on coding. Supports 338 programming languages. 128K context length.
- HumanEval: 90.2%
- MBPP: 76.2%
June 5, 2024
GLM-4-9B
9B
Efficient open-source version of GLM-4. Strong performance for its size with 128K context support.
- Apache 2.0 license
- 128K context
June 1, 2024
Baichuan4-Turbo
Unknown
Cost-optimized version of Baichuan 4 with the lowest deployment cost among comparable models; runs on two RTX 4090 GPUs.
- Runs on 2x RTX 4090
- Lowest cost in class
May 2024
May 24, 2024
CogVLM2
19B
Vision-language model based on Llama 3. Strong image understanding and visual reasoning capabilities.
- Built on Llama 3
- Strong OCR capabilities
May 22, 2024
Baichuan 4
Unknown
Fourth generation with a 10%+ improvement in general capabilities (math +14%, code +9%). Ranked #1 on the SuperCLUE Chinese benchmark.
- SuperCLUE score: 80.64 (#1)
- Beats GPT-4-Turbo on Chinese tasks
January 2024
January 16, 2024
GLM-4
Unknown
Fourth generation GLM with significantly improved reasoning and Chinese language capabilities. 128K context window.
- 128K context window
- Strong Chinese understanding
September 2023
September 6, 2023
Baichuan2-13B
13B
Open-source bilingual model with strong Chinese capabilities. Good balance of performance and efficiency.
- Open source
- Strong Chinese performance