AI Model Tracker

GPT-5.4 · Claude Sonnet 4.6 · Gemini 3.1 Pro

Updated: Mar 19, 2026
OpenAI
GPT-5.4
Released: March 5, 2026
Context1.05M tokens
Input$2.50 / 1M
Output$15.00 / 1M
Cached input$0.25 / 1M
ModalityText + Image in
Reasoning5-level configurable
Computer useYes (native)
SWE-bench~80.0%
Anthropic
Claude Sonnet 4.6
Released: February 17, 2026
Context1M tokens (beta)
Input$3.00 / 1M
Output$15.00 / 1M
Max output128K tokens
ModalityText + Image in
ReasoningHybrid (extended thinking)
Computer useYes (improved)
Caching savingsUp to 90%
Google
Gemini 3.1 Pro
Released: February 19, 2026
Context1M tokens
Input (≤200K)$2.00 / 1M
Output (≤200K)$12.00 / 1M
Input (>200K)$4.00 / 1M
ModalityText, Image, Video, Audio
Reasoning3 thinking levels
ARC-AGI-277.1%
Max output64K tokens

Side-by-Side Comparison

Dimension GPT-5.4 Claude Sonnet 4.6 Gemini 3.1 Pro
Release date Mar 5, 2026 Feb 17, 2026 Feb 19, 2026
Context window 1.05M tokens 1M (beta, API only) 1M tokens
Max output Not disclosed 128K tokens 64K tokens
Input price / 1M $2.50 $3.00 $2.00
Output price / 1M $15.00 $15.00 $12.00
Cached input price $0.25 / 1M 90% savings (est. ~$0.30) Supported (tiered)
Input modalities Text, Image Text, Image Text, Image, Video, Audio
Reasoning control 5-level configurable Hybrid extended thinking 3 thinking levels
Computer use Native API Improved in 4.6 Not available
SWE-bench Verified ~80.0% Approaches Opus-level Not reported
ARC-AGI-2 Not reported Not reported 77.1%
GPQA (reasoning) 93.2% (Pro variant) Not reported Not reported
Batch pricing 50% off standard 50% off standard Available
Availability API + ChatGPT Pro/Plus API + Claude.ai + Bedrock + Vertex + Foundry API + Gemini App + Vertex
Pro / Premium tier $30 / $180 per 1M (GPT-5.4 Pro) $15 / $75 per 1M (Opus 4.6) Deep Think (Ultra subs)

What Matters For Your Stack

Coding

Best pick: GPT-5.4 for complex, multi-file codebases. ~80% SWE-bench and 5-level reasoning control let you dial cost vs. quality per request. Sonnet 4.6 is the value play at $3 input — devs in early access often preferred it over Opus for frontend and financial code. Gemini 3.1 Pro is strong for agentic coding workflows with its efficient token usage.

🤖 Agents

Best pick: GPT-5.4 leads with native computer-use API and configurable reasoning (dial down for fast tool-calling loops, dial up for planning). Sonnet 4.6 has improved computer use and agent planning — great for multi-step browser automation. Gemini 3.1 Pro excels at long-horizon tool orchestration with its medium thinking mode.

📚 RAG

Best pick: Gemini 3.1 Pro — cheapest per-token ($2/$12) with a full 1M context window and native multimodal grounding (text, image, video, audio). Great for document-heavy pipelines. GPT-5.4 at 1.05M context is the largest window but costs more. Sonnet 4.6's 128K max output is ideal when you need long-form synthesis from retrieved docs.

🎨 Multimodal Apps

Best pick: Gemini 3.1 Pro is the clear leader — native text, image, video, and audio input with unified embeddings (gemini-embedding-2). GPT-5.4 handles text + image but no video/audio. Sonnet 4.6 is text + image only. For multimodal RAG or video analysis, Gemini is the only real option at the frontier.

Running Changelog

Mar 18
Google Built-in tools + function calling in one call; Maps grounding for Gemini 3
Gemini API now supports using Gemini’s built-in tools alongside custom function calling tools in a single request; Google Maps grounding is now supported for Gemini 3 models going forward.
Mar 17
OpenAI GPT-5.4 mini + nano released
GPT-5.4 mini brings GPT-5.4-class capability to high-volume workloads with tool search, computer use, and compaction; GPT-5.4 nano is optimized for simple, high-volume tasks and supports compaction only.
Mar 16
Google Revamped Usage Tiers & billing spend caps
Google introduced new usage tier structure with billing account spend caps for Gemini API. Better control for production workloads.
Mar 12
Google AI Studio project-level spend caps
Added project-level spend caps for Gemini API billing in AI Studio, enabling tighter cost control for production experimentation.
Mar 10
Google Multimodal embedding model: gemini-embedding-2-preview
First multimodal embedding model supporting text, image, video, audio, and PDF inputs in a unified embedding space. Major for multimodal RAG and unified retrieval.
Mar 10
Google gemini-2.5-flash-lite-preview shutdown Mar 31
Deprecated. Will be shut down March 31, 2026. Migrate to gemini-3.1-flash-lite-preview.
Mar 9
Google gemini-3-pro-preview alias redirected
Gemini 3 Pro Preview was shut down; the gemini-3-pro-preview alias now points to gemini-3.1-pro-preview.
Mar 12
OpenAI Sora API: 20s generations, 1080p, edits endpoint; remix deprecated
Sora API expanded with character references, longer generations up to 20s, 1080p for sora-2-pro (billed at $0.70/sec), Batch API support, and a new /v1/videos/edits endpoint. /v1/videos/{video_id}/remix will be deprecated in 6 months.
Mar 5
OpenAI GPT-5.4 launched
1.05M context, configurable 5-level reasoning, native computer-use API. First general-purpose model with built-in computer use. Priced at $2.50/$15 per 1M tokens. GPT-5.4 Pro at $30/$180.
Mar 4
OpenAI Legacy GPT-4 models deprecated Mar 26
gpt-4-0314, gpt-4-1106-preview, gpt-4-0125-preview, gpt-4-turbo-preview all sunset March 26, 2026. Requests auto-route to gpt-4.1.
Mar 3
Google Gemini 3.1 Flash-Lite Preview
First Flash-Lite model in the Gemini 3 series. 1M context, $0.25/$1.50 pricing. Ultra-efficient for high-volume, cost-sensitive workloads.
Feb 26
Google Nano Banana 2 image model + Gemini 3 Pro deprecated
Nano Banana 2 (gemini-3.1-flash-image-preview) optimized for speed. Gemini 3 Pro Preview shut down March 9.
Feb 19
Google Gemini 3.1 Pro launched
Frontier reasoning model. 1M context, $2/$12 per 1M tokens. Medium thinking level for cost/speed/performance balance. Enhanced agentic coding and multimodal analysis.
Feb 17
Anthropic Claude Sonnet 4.6 launched
Most capable Sonnet yet. 1M context (beta), 128K max output, improved coding + computer use + long-context reasoning. $3/$15 per 1M tokens. Approaches Opus intelligence at Sonnet pricing.
Feb 13
OpenAI GPT-4o, GPT-4.1, o4-mini retired from ChatGPT
GPT-4o, GPT-4.1, GPT-4.1 mini, o4-mini, and GPT-5 (Instant/Thinking) removed from ChatGPT. Still available in API. GPT-4o in Custom GPTs until March 31.
Feb 12
Google Gemini 3 Deep Think major upgrade
Specialized reasoning mode upgraded to solve modern science, research, and engineering challenges. Available to Google AI Ultra subscribers.

Upcoming Deprecations & Deadlines

OpenAI — March 26, 2026

  • gpt-4-0314 — sunset, auto-routes to gpt-4.1
  • gpt-4-1106-preview — sunset, auto-routes to gpt-4.1
  • gpt-4-0125-preview — sunset, auto-routes to gpt-4.1
  • gpt-4-turbo-preview — sunset, auto-routes to gpt-4.1

OpenAI — March 31, 2026

  • GPT-4o in Custom GPTs — fully retired across Business, Enterprise, Edu plans

Google — March 31, 2026

  • gemini-2.5-flash-lite-preview-09-2025 — shutdown. Migrate to gemini-3.1-flash-lite-preview

OpenAI — May 7, 2026

  • Realtime API Beta — deprecated and will be removed. Migrate to production Realtime API

OpenAI — 2026 (date TBD)

  • Assistants API — being replaced by Responses API. Sunset date in 2026