GPT-5.4 · Claude Sonnet 4.6 · Gemini 3.1 Pro
| Dimension | GPT-5.4 | Claude Sonnet 4.6 | Gemini 3.1 Pro |
|---|---|---|---|
| Release date | Mar 5, 2026 | Feb 17, 2026 | Feb 19, 2026 |
| Context window | 1.05M tokens | 1M (beta, API only) | 1M tokens |
| Max output | Not disclosed | 128K tokens | 64K tokens |
| Input price / 1M tokens | $2.50 | $3.00 | $2.00 |
| Output price / 1M tokens | $15.00 | $15.00 | $12.00 |
| Cached input price / 1M tokens | $0.25 | 90% savings (est. ~$0.30) | Supported (tiered) |
| Input modalities | Text, Image | Text, Image | Text, Image, Video, Audio |
| Reasoning control | 5-level configurable | Hybrid extended thinking | 3 thinking levels |
| Computer use | Native API | Improved in 4.6 | Not available |
| SWE-bench Verified | ~80.0% | Approaches Opus-level | Not reported |
| ARC-AGI-2 | Not reported | Not reported | 77.1% |
| GPQA (reasoning) | 93.2% (Pro variant) | Not reported | Not reported |
| Batch pricing | 50% off standard | 50% off standard | Available |
| Availability | API + ChatGPT Pro/Plus | API + Claude.ai + Bedrock + Vertex + Foundry | API + Gemini App + Vertex |
| Pro / Premium tier | $30 / $180 per 1M (GPT-5.4 Pro) | $15 / $75 per 1M (Opus 4.6) | Deep Think (Ultra subs) |
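To make the pricing rows concrete, here is a minimal sketch of a per-request cost estimate. It assumes the list prices in the table above apply flat per token, uses the table's estimated ~$0.30 cached rate for Sonnet 4.6, and ignores batch discounts and any tiered pricing. The `Pricing` class and `request_cost` function are illustrative names, not part of any vendor SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pricing:
    """USD per 1M tokens, copied from the comparison table."""
    input_per_m: float
    output_per_m: float
    cached_input_per_m: Optional[float]  # None where the table gives no figure


PRICES = {
    "GPT-5.4": Pricing(2.50, 15.00, 0.25),
    # Assumption: 90% savings on the $3.00 input price, i.e. the table's ~$0.30 estimate.
    "Claude Sonnet 4.6": Pricing(3.00, 15.00, 0.30),
    # The table lists caching as "Supported (tiered)" with no single rate.
    "Gemini 3.1 Pro": Pricing(2.00, 12.00, None),
}


def request_cost(model: str, input_tokens: int, output_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request; cached_tokens is the part of the input billed at the cached rate."""
    p = PRICES[model]
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    if cached_tokens and p.cached_input_per_m is None:
        raise ValueError(f"no cached-input rate listed for {model}")
    fresh_tokens = input_tokens - cached_tokens
    total = fresh_tokens * p.input_per_m + output_tokens * p.output_per_m
    if cached_tokens:
        total += cached_tokens * p.cached_input_per_m
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200K input tokens, 4K output tokens, no caching.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 200_000, 4_000):.3f}")
```

For that hypothetical 200K-in / 4K-out request, the estimate is about $0.56 for GPT-5.4, $0.66 for Sonnet 4.6, and $0.45 for Gemini 3.1 Pro. Serving 150K of the GPT-5.4 input from cache would bring its estimate down to roughly $0.22.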
Best pick for coding: GPT-5.4, especially for complex, multi-file codebases. It scores ~80% on SWE-bench Verified, and its 5-level reasoning control lets you trade cost against quality per request. Sonnet 4.6 is the value play at $3 input versus $15 for Opus 4.6; developers in early access often preferred it over Opus for frontend and financial code. Gemini 3.1 Pro is strong for agentic coding workflows thanks to its efficient token usage.
Best pick for agents and tool use: GPT-5.4 leads with a native computer-use API and configurable reasoning (dial down for fast tool-calling loops, dial up for planning). Sonnet 4.6 has improved computer use and agent planning, which makes it a good fit for multi-step browser automation. Gemini 3.1 Pro excels at long-horizon tool orchestration with its medium thinking mode.
Best pick for RAG and long-context work: Gemini 3.1 Pro. It is the cheapest per token ($2 input / $12 output per 1M) and offers a full 1M context window with native multimodal grounding (text, image, video, audio), which suits document-heavy pipelines. GPT-5.4 has the largest window at 1.05M tokens but costs more ($2.50 / $15). Sonnet 4.6's 128K max output is ideal when you need long-form synthesis from retrieved docs.
Best pick for multimodal: Gemini 3.1 Pro is the clear leader, with native text, image, video, and audio input plus unified embeddings (gemini-embedding-2). GPT-5.4 and Sonnet 4.6 both accept text and image input but not video or audio. For multimodal RAG or video analysis, Gemini is the only one of the three that takes those inputs directly.
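The modality comparison can also be expressed as a simple lookup. This is a minimal sketch that encodes only the "Input modalities" row of the table; `INPUT_MODALITIES` and `models_supporting` are hypothetical names used for illustration and do not query any real API.

```python
from typing import Iterable, List

# Input modalities per model, copied from the comparison table.
INPUT_MODALITIES = {
    "GPT-5.4": {"text", "image"},
    "Claude Sonnet 4.6": {"text", "image"},
    "Gemini 3.1 Pro": {"text", "image", "video", "audio"},
}


def models_supporting(required: Iterable[str]) -> List[str]:
    """Return the models (sorted by name) whose listed inputs cover every required modality."""
    needed = {modality.lower() for modality in required}
    return sorted(
        name for name, modalities in INPUT_MODALITIES.items()
        if needed <= modalities  # subset test: every required modality is listed
    )


if __name__ == "__main__":
    print(models_supporting(["text", "image"]))
    print(models_supporting(["video", "audio"]))
```

With the table's data, a text-plus-image requirement matches all three models, while any requirement that includes video or audio narrows the list to Gemini 3.1 Pro.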