Best AI models for reasoning

The strongest AI models for hard reasoning, ranked by GPQA Diamond (a graduate-level science benchmark). All are available on anyAInow, where you can switch between them or compare them side by side on a pay-as-you-go basis.

| # | Model | Provider | Tier | Price (per 1M tokens) | GPQA Diamond |
|---|-------|----------|------|-----------------------|--------------|
| 1 | Gemini 3.1 Pro | Google | Flagship | from $20 / 1M | 94.3% |
| 2 | Claude Opus 4.7 | Anthropic | Flagship | from $40 / 1M | 94.2% |
| 3 | GPT-5.5 | OpenAI | Flagship | from $50 / 1M | 93.6% |
| 4 | Claude Opus 4.8 | Anthropic | Flagship | from $40 / 1M | 93.6% |
| 5 | GPT-5.4 | OpenAI | Balanced | from $30 / 1M | 92.8% |
| 6 | GLM 5.2 | Z.ai (GLM) | Open | from $10 / 1M | 91.2% |
| 7 | Kimi K2 | Moonshot (Kimi) | Balanced | from $10 / 1M | 90.5% |
| 8 | Qwen 3.6 Plus | Qwen | Open | from $10 / 1M | 90.4% |
| 9 | Qwen 3.7 Plus | Qwen | Open | from $10 / 1M | 90.3% |
| 10 | Claude Sonnet 4.6 | Anthropic | Balanced | from $30 / 1M | 89.9% |
| 11 | GPT-5.4 mini | OpenAI | Fast | from $10 / 1M | 88% |
| 12 | Gemini 3.1 Flash Lite | Google | Fast | from $10 / 1M | 86.9% |
| 13 | Gemini 2.5 Pro | Google | Flagship | from $20 / 1M | 83% |
| 14 | DeepSeek R1 | DeepSeek | Reasoning | from $10 / 1M | 81% |
| 15 | Claude Haiku 4.5 | Anthropic | Fast | from $10 / 1M | 73% |
| 16 | GPT-5 nano | OpenAI | Free | Free | 71.2% |
| 17 | Mistral Small | Mistral | Fast | from $5 / 1M | 71.2% |
| 18 | Llama 4 Maverick | Meta Llama | Balanced | from $10 / 1M | 69.8% |
| 19 | MiMo V2 Pro | Xiaomi (MiMo) | Open | from $10 / 1M | 66.7% |
| 20 | Llama 4 Scout | Meta Llama | Open | from $5 / 1M | 57.2% |
| 21 | Phi-4 | Microsoft | Open | from $10 / 1M | 56.1% |
| 22 | Llama 3.3 70B | Meta Llama | Free | Free | 50.5% |
| 23 | Nova Pro | Amazon | Balanced | from $10 / 1M | 46.9% |
| 24 | Mistral Large | Mistral | Flagship | from $20 / 1M | 43.9% |
| 25 | Gemma 3 27B | Google | Open | from $10 / 1M | 42.4% |
| 26 | Nova Lite | Amazon | Fast | from $5 / 1M | 42% |

Benchmarks via llm-stats.com, as of 28 Jun 2026. Figures are publisher-reported where not independently verified. Prices shown are the anyAInow pay-as-you-go floor (per 1M tokens); free-tier models deduct nothing.
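
As a rough illustration of the per-1M-token pricing, the sketch below estimates a lower-bound cost from a listed floor price. The function name `floor_cost_usd` is hypothetical, and the sketch assumes a single flat rate applied to every token; the listed figures are floors ("from"), so an actual charge may be higher, and input and output tokens may be priced differently.

```python
def floor_cost_usd(tokens: int, floor_price_per_million: float) -> float:
    """Lower-bound cost in USD for `tokens` tokens.

    Assumption: one flat rate per 1M tokens, taken from the listed floor
    price. Free-tier models can be modeled with a floor price of 0.
    """
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return tokens / 1_000_000 * floor_price_per_million


# 250,000 tokens on a model listed at "from $20 / 1M"
print(floor_cost_usd(250_000, 20))   # 5.0

# Any usage on a free-tier model deducts nothing
print(floor_cost_usd(1_000_000, 0))  # 0.0
```

Treat the result as a minimum, not a quote: it only restates the listed floor scaled to a token count.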
