Best AI models for reasoning
The strongest AI models for hard reasoning, ranked by GPQA Diamond (a graduate-level science benchmark). All are available on anyAInow, where you can switch between them or compare them side by side on a pay-as-you-go basis.
| # | Model | Provider | Tier | Price from (per 1M tokens) | GPQA Diamond |
|---|---|---|---|---|---|
| 1 | Gemini 3.1 Pro | Google | Flagship | $20 | 94.3% |
| 2 | Claude Opus 4.7 | Anthropic | Flagship | $40 | 94.2% |
| 3 | GPT-5.5 | OpenAI | Flagship | $50 | 93.6% |
| 4 | Claude Opus 4.8 | Anthropic | Flagship | $40 | 93.6% |
| 5 | GPT-5.4 | OpenAI | Balanced | $30 | 92.8% |
| 6 | GLM 5.2 | Z.ai (GLM) | Open | $10 | 91.2% |
| 7 | Kimi K2 | Moonshot (Kimi) | Balanced | $10 | 90.5% |
| 8 | Qwen 3.6 Plus | Qwen | Open | $10 | 90.4% |
| 9 | Qwen 3.7 Plus | Qwen | Open | $10 | 90.3% |
| 10 | Claude Sonnet 4.6 | Anthropic | Balanced | $30 | 89.9% |
| 11 | GPT-5.4 mini | OpenAI | Fast | $10 | 88% |
| 12 | Gemini 3.1 Flash Lite | Google | Fast | $10 | 86.9% |
| 13 | Gemini 2.5 Pro | Google | Flagship | $20 | 83% |
| 14 | DeepSeek R1 | DeepSeek | Reasoning | $10 | 81% |
| 15 | Claude Haiku 4.5 | Anthropic | Fast | $10 | 73% |
| 16 | GPT-5 nano | OpenAI | Free | Free | 71.2% |
| 17 | Mistral Small | Mistral | Fast | $5 | 71.2% |
| 18 | Llama 4 Maverick | Meta Llama | Balanced | $10 | 69.8% |
| 19 | MiMo V2 Pro | Xiaomi (MiMo) | Open | $10 | 66.7% |
| 20 | Llama 4 Scout | Meta Llama | Open | $5 | 57.2% |
| 21 | Phi-4 | Microsoft | Open | $10 | 56.1% |
| 22 | Llama 3.3 70B | Meta Llama | Free | Free | 50.5% |
| 23 | Nova Pro | Amazon | Balanced | $10 | 46.9% |
| 24 | Mistral Large | Mistral | Flagship | $20 | 43.9% |
| 25 | Gemma 3 27B | Google | Open | $10 | 42.4% |
| 26 | Nova Lite | Amazon | Fast | $5 | 42% |
Benchmarks via llm-stats.com, as of 28 Jun 2026. Figures are publisher-reported where not independently verified. Prices shown are the anyAInow pay-as-you-go floor (per 1M tokens); free-tier models deduct nothing.