DeepSeek
DeepSeek AI
Known for efficient Mixture-of-Experts (MoE) architectures and strong open reasoning models. The DeepSeek-V3 and R1 series achieve near-frontier benchmark performance with notably cost-efficient training.
Models
DeepSeek-V3.2
671B (37B active) parameters
Latest V3 iteration with improved general capabilities and stronger reasoning and coding performance than V3.1.
- SWE-bench Verified: 73.1%
- Improved performance on knowledge and academic tasks
- LiveCodeBench-v6: 83.3%
Benchmarks
- MMLU: 88.5%
- HumanEval: 90%
- GSM8K: 95.5%
DeepSeek-R1-0528
671B (37B active) parameters
Major reasoning upgrade: AIME 2025 accuracy improved from 70% to 87.5%, approaching the level of OpenAI o3 and Gemini 2.5 Pro.
- Intelligence Index: 68 (Artificial Analysis)
- AIME 2025: 87.5%
- Approaching frontier model performance
Benchmarks
- MMLU: 90.8%
- HumanEval: 89.5%
- GSM8K: 97%
- MATH: 94.5%
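DeepSeek serves its R1-series reasoning models through an OpenAI-compatible API. The snippet below is a minimal sketch of querying it and reading both the reasoning trace and the final answer; the base URL, the "deepseek-reasoner" model name, and the reasoning_content field are taken from DeepSeek's public API documentation at the time of writing and should be checked against the current docs before relying on them.

```python
# Hedged sketch: calling a DeepSeek R1-series reasoning model via the
# OpenAI-compatible API. Base URL, model name, and reasoning_content are
# assumptions based on DeepSeek's published docs; verify before use.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

resp = client.chat.completions.create(
    model="deepseek-reasoner",  # served R1-series reasoning model
    messages=[{"role": "user", "content": "How many primes are there below 50?"}],
)

msg = resp.choices[0].message
print(getattr(msg, "reasoning_content", None))  # chain-of-thought trace, if exposed
print(msg.content)                              # final answer
```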
DeepSeek-V3-0324
671B (37B active) parameters
Significant benchmark improvements over the original V3: MMLU-Pro 75.9 to 81.2 (+5.3), GPQA 59.1 to 68.4 (+9.3), AIME 39.6 to 59.4 (+19.8).
- MMLU-Pro: 81.2
- GPQA: 68.4
- LiveCodeBench: 49.2
Benchmarks
- MMLU: 87.5%
- MMLU-Pro: 81.2%
- GPQA: 68.4%
- HumanEval: 88%
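The deltas quoted in this entry are plain differences between the V3-0324 and original V3 scores; a quick check, using only the figures already listed above:

```python
# Reproducing the score deltas quoted for DeepSeek-V3-0324 over the original V3.
# Figures are copied from the entry above; this is just the subtraction, no new data.
v3      = {"MMLU-Pro": 75.9, "GPQA": 59.1, "AIME": 39.6}
v3_0324 = {"MMLU-Pro": 81.2, "GPQA": 68.4, "AIME": 59.4}
for bench in v3:
    print(f"{bench}: {v3[bench]} to {v3_0324[bench]} (+{v3_0324[bench] - v3[bench]:.1f})")
# MMLU-Pro: 75.9 to 81.2 (+5.3)
# GPQA: 59.1 to 68.4 (+9.3)
# AIME: 39.6 to 59.4 (+19.8)
```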
DeepSeek-V3
671B (37B active) parameters
Breakthrough MoE model: 671B total parameters with only 37B activated per token, combining auxiliary-loss-free load balancing with multi-token prediction (a minimal routing sketch follows the benchmarks below). Trained on 14.8T tokens.
- Outperforms GPT-4o and Claude 3.5 Sonnet on MMLU
- Cost-efficient MoE architecture
- First open-weights model above 600B parameters
Benchmarks
- MMLU: 87.1%
- HumanEval: 86.5%
- GSM8K: 93%
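The 671B-total / 37B-active split comes from Mixture-of-Experts routing: each token is dispatched to a small top-k subset of expert feed-forward networks, so only a fraction of the weights (37/671, roughly 5.5%) participates in any single forward pass. The sketch below is an illustrative top-k router in PyTorch; the layer sizes, expert count, and k are placeholders and do not reproduce DeepSeek-V3's actual configuration, which uses far more routed experts plus a shared expert and auxiliary-loss-free balancing.

```python
# Illustrative top-k Mixture-of-Experts routing in PyTorch. Sizes, expert
# count, and k are placeholders, NOT DeepSeek-V3's real configuration; the
# point is that each token only runs through k of the experts, which is how
# 671B total parameters can imply only ~37B (about 5.5%) active per token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=16, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                          # x: [tokens, d_model]
        scores = self.router(x)                    # [tokens, n_experts]
        topk_scores, topk_idx = scores.topk(self.k, dim=-1)
        gates = F.softmax(topk_scores, dim=-1)     # weights over the chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):                 # route each token to its k experts
            idx = topk_idx[:, slot]
            for e in idx.unique().tolist():
                mask = idx == e
                out[mask] += gates[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

moe = TopKMoE()
tokens = torch.randn(8, 512)
print(moe(tokens).shape)  # torch.Size([8, 512]); only 2 of 16 experts ran per token
```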
DeepSeek-Coder-V2
236B (21B active) parameters
First open-source model to match GPT-4 Turbo on coding benchmarks. Supports 338 programming languages with a 128K-token context length.
- HumanEval: 90.2%
- MBPP: 76.2%
- First open model to score above 10% on SWE-bench
Benchmarks
- HumanEval: 90.2%
- MBPP: 76.2%
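DeepSeek-Coder-V2 checkpoints are published on Hugging Face. Below is a hedged loading sketch using transformers; the repo ID (the smaller "Lite" instruct variant) is an assumption about the published weights, and the full 236B model requires a multi-GPU setup, so treat this as a minimal sketch rather than a verified recipe.

```python
# Hedged sketch: running a DeepSeek-Coder-V2 checkpoint with Hugging Face
# transformers. The repo ID below (the smaller "Lite" instruct variant) is an
# assumption; the full 236B model needs multiple GPUs. trust_remote_code is
# passed because the checkpoint ships custom MoE model code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct"  # assumed HF repo ID
tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

messages = [{"role": "user", "content": "Write a Python function that reverses a linked list."}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```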