AI Model Tracker

OpenAI

GPT-5.4

Released: March 5, 2026

Context1.05M tokens

Input$2.50 / 1M

Output$15.00 / 1M

Cached input$0.25 / 1M

ModalityText + Image in

Reasoning5-level configurable

Computer useYes (native)

SWE-bench~80.0%

Anthropic

Claude Sonnet 4.6

Released: February 17, 2026

Context1M tokens (beta)

Input$3.00 / 1M

Output$15.00 / 1M

Max output128K tokens

ModalityText + Image in

ReasoningHybrid (extended thinking)

Computer useYes (improved)

Caching savingsUp to 90%

Google

Gemini 3.1 Pro

Released: February 19, 2026

Context1M tokens

Input (≤200K)$2.00 / 1M

Output (≤200K)$12.00 / 1M

Input (>200K)$4.00 / 1M

ModalityText, Image, Video, Audio

Reasoning3 thinking levels

ARC-AGI-277.1%

Max output64K tokens

Side-by-Side Comparison

Dimension	GPT-5.4	Claude Sonnet 4.6	Gemini 3.1 Pro
Release date	Mar 5, 2026	Feb 17, 2026	Feb 19, 2026
Context window	1.05M tokens	1M (beta, API only)	1M tokens
Max output	Not disclosed	128K tokens	64K tokens
Input price / 1M	$2.50	$3.00	$2.00
Output price / 1M	$15.00	$15.00	$12.00
Cached input price	$0.25 / 1M	90% savings (est. ~$0.30)	Supported (tiered)
Input modalities	Text, Image	Text, Image	Text, Image, Video, Audio
Reasoning control	5-level configurable	Hybrid extended thinking	3 thinking levels
Computer use	Native API	Improved in 4.6	Not available
SWE-bench Verified	~80.0%	Approaches Opus-level	Not reported
ARC-AGI-2	Not reported	Not reported	77.1%
GPQA (reasoning)	93.2% (Pro variant)	Not reported	Not reported
Batch pricing	50% off standard	50% off standard	Available
Availability	API + ChatGPT Pro/Plus	API + Claude.ai + Bedrock + Vertex + Foundry	API + Gemini App + Vertex
Pro / Premium tier	$30 / $180 per 1M (GPT-5.4 Pro)	$15 / $75 per 1M (Opus 4.6)	Deep Think (Ultra subs)

What Matters For Your Stack

⚙ Coding

Best pick: GPT-5.4 for complex, multi-file codebases. ~80% SWE-bench and 5-level reasoning control let you dial cost vs. quality per request. Sonnet 4.6 is the value play at $3 input — devs in early access often preferred it over Opus for frontend and financial code. Gemini 3.1 Pro is strong for agentic coding workflows with its efficient token usage.

🤖 Agents

Best pick: GPT-5.4 leads with native computer-use API and configurable reasoning (dial down for fast tool-calling loops, dial up for planning). Sonnet 4.6 has improved computer use and agent planning — great for multi-step browser automation. Gemini 3.1 Pro excels at long-horizon tool orchestration with its medium thinking mode.

📚 RAG

Best pick: Gemini 3.1 Pro — cheapest per-token ($2/$12) with a full 1M context window and native multimodal grounding (text, image, video, audio). Great for document-heavy pipelines. GPT-5.4 at 1.05M context is the largest window but costs more. Sonnet 4.6's 128K max output is ideal when you need long-form synthesis from retrieved docs.

🎨 Multimodal Apps

Best pick: Gemini 3.1 Pro is the clear leader — native text, image, video, and audio input with unified embeddings (gemini-embedding-2). GPT-5.4 handles text + image but no video/audio. Sonnet 4.6 is text + image only. For multimodal RAG or video analysis, Gemini is the only real option at the frontier.

Running Changelog

Mar 18

Google Built-in tools + function calling in one call; Maps grounding for Gemini 3

Gemini API now supports using Gemini’s built-in tools alongside custom function calling tools in a single request; Google Maps grounding is now supported for Gemini 3 models going forward.

Mar 17

OpenAI GPT-5.4 mini + nano released

GPT-5.4 mini brings GPT-5.4-class capability to high-volume workloads with tool search, computer use, and compaction; GPT-5.4 nano is optimized for simple, high-volume tasks and supports compaction only.

Mar 16

Google Revamped Usage Tiers & billing spend caps

Google introduced new usage tier structure with billing account spend caps for Gemini API. Better control for production workloads.

Mar 12

Google AI Studio project-level spend caps

Added project-level spend caps for Gemini API billing in AI Studio, enabling tighter cost control for production experimentation.

Mar 10

Google Multimodal embedding model: gemini-embedding-2-preview

First multimodal embedding model supporting text, image, video, audio, and PDF inputs in a unified embedding space. Major for multimodal RAG and unified retrieval.

Mar 10

Google gemini-2.5-flash-lite-preview shutdown Mar 31

Deprecated. Will be shut down March 31, 2026. Migrate to gemini-3.1-flash-lite-preview.

Mar 9

Google gemini-3-pro-preview alias redirected

Gemini 3 Pro Preview was shut down; the gemini-3-pro-preview alias now points to gemini-3.1-pro-preview.

Mar 12

OpenAI Sora API: 20s generations, 1080p, edits endpoint; remix deprecated

Sora API expanded with character references, longer generations up to 20s, 1080p for sora-2-pro (billed at $0.70/sec), Batch API support, and a new /v1/videos/edits endpoint. /v1/videos/{video_id}/remix will be deprecated in 6 months.

Mar 5

OpenAI GPT-5.4 launched

1.05M context, configurable 5-level reasoning, native computer-use API. First general-purpose model with built-in computer use. Priced at $2.50/$15 per 1M tokens. GPT-5.4 Pro at $30/$180.

Mar 4

OpenAI Legacy GPT-4 models deprecated Mar 26

gpt-4-0314, gpt-4-1106-preview, gpt-4-0125-preview, gpt-4-turbo-preview all sunset March 26, 2026. Requests auto-route to gpt-4.1.

Mar 3

Google Gemini 3.1 Flash-Lite Preview

First Flash-Lite model in the Gemini 3 series. 1M context, $0.25/$1.50 pricing. Ultra-efficient for high-volume, cost-sensitive workloads.

Feb 26

Google Nano Banana 2 image model + Gemini 3 Pro deprecated

Nano Banana 2 (gemini-3.1-flash-image-preview) optimized for speed. Gemini 3 Pro Preview shut down March 9.

Feb 19

Google Gemini 3.1 Pro launched

Frontier reasoning model. 1M context, $2/$12 per 1M tokens. Medium thinking level for cost/speed/performance balance. Enhanced agentic coding and multimodal analysis.

Feb 17

Anthropic Claude Sonnet 4.6 launched

Most capable Sonnet yet. 1M context (beta), 128K max output, improved coding + computer use + long-context reasoning. $3/$15 per 1M tokens. Approaches Opus intelligence at Sonnet pricing.

Feb 13

OpenAI GPT-4o, GPT-4.1, o4-mini retired from ChatGPT

GPT-4o, GPT-4.1, GPT-4.1 mini, o4-mini, and GPT-5 (Instant/Thinking) removed from ChatGPT. Still available in API. GPT-4o in Custom GPTs until March 31.

Feb 12

Google Gemini 3 Deep Think major upgrade

Specialized reasoning mode upgraded to solve modern science, research, and engineering challenges. Available to Google AI Ultra subscribers.

Side-by-Side Comparison

What Matters For Your Stack

⚙ Coding

🤖 Agents

📚 RAG

🎨 Multimodal Apps

Running Changelog

Upcoming Deprecations & Deadlines

OpenAI — March 26, 2026

OpenAI — March 31, 2026

Google — March 31, 2026

OpenAI — May 7, 2026

OpenAI — 2026 (date TBD)