Claude Opus 4.8 Review: Pricing, New Features, and Whether to Upgrade

If you run agentic or coding workloads, upgrade to Opus 4.8 now. The reason is the part most launch coverage buries: the price did not move. Anthropic ships Opus 4.8 at $5 per million input tokens and $25 per million output tokens, identical to Opus 4.7, per the Anthropic Opus product page (fetched 2026-05-30). At the API level the upgrade is a one-line model-ID swap with no bill change, and the tool calling and long-context handling are better than 4.7 per Anthropic's announcement (May 28, 2026). Chat users on Pro and Max get effort control, which alone justifies the switch. The single group that should stay put: anyone whose real work is summarization, drafting, or high-volume Q&A, where Sonnet 4.6 at $3/$15 does the job for 40 percent less money per the models overview.

This is a pricing and features breakdown, not a benchmark run. Every performance number below is attributed to Anthropic's published announcement or a named early tester. We pulled the current pricing and feature specs from Anthropic's docs on 2026-05-30 and lay out who should pay for Opus 4.8, who is better served by Sonnet, and how the model stacks up against GPT-5.5.

What actually changed in Opus 4.8

Seven things shipped with the model, per the announcement and the API "what's new" doc. Each one has a practical consequence, not just a line on a slide.

Dynamic workflows in Claude Code (research preview, Enterprise/Team/Max). Claude plans and runs hundreds of parallel subagents in a single session. The headline use case Anthropic names is a codebase-scale migration across hundreds of thousands of lines, run from kickoff to merge with your existing test suite as the quality bar.
Effort control on claude.ai and Cowork. You pick low, high, extra, or max. Higher effort buys more thinking and more tokens and better answers; lower effort returns faster and burns your rate limit slower. Claude Code adds an xhigh level on top.
Fast mode (research preview, API only). Set speed: "fast" and the model runs at up to 2.5x output tokens per second, priced at $10/$50 per million tokens, per the announcement. Anthropic calls this fast mode three times cheaper than the previous model's fast mode.
Mid-conversation system messages. You can inject a role: "system" message partway through a conversation without breaking the prompt cache, and with no beta header. On a long agentic loop that re-steers the model mid-run, this keeps your cached input cheap instead of forcing a fresh non-cached read.
Lower prompt-cache minimum: 1,024 tokens. Short prompts that fell under the old Opus caching threshold now cache, so repeated small calls get the cache discount they could not get before.
Documented refusal stop details. The stop_details object is now public and categorizes why Claude declined a request, so your app can route the user to the right next step instead of showing a dead end.
Effort default is high everywhere, including Claude Code. Anthropic says this matches Opus 4.7's default token usage on coding tasks while delivering higher performance, so you get the lift without a token-bill surprise.

Two smaller deltas are worth a line. Tool triggering is more reliable, with fewer required tool calls skipped, and long-context compaction handles big sessions better than 4.7 per the what's-new doc. For anyone building agents, those two are the quiet wins that show up in production before any benchmark does.

Pricing: what each tier costs

API pricing is the clean number because it is published per token. Here is the current Claude lineup, all from the models overview and the Opus product page, fetched 2026-05-30.

Model	Input ($/M tokens)	Output ($/M tokens)	Context	Max output	Thinking	Source
Opus 4.8	$5	$25	1M	128k	Adaptive only	Anthropic
Opus 4.7 (legacy)	$5	$25	1M	128k	Adaptive	Anthropic
Sonnet 4.6	$3	$15	1M	64k	Extended + adaptive	Anthropic
Haiku 4.5	$1	$5	200k	64k	Adaptive	Anthropic

The row that drives the upgrade decision is Opus 4.7 versus 4.8: same price, same context, same max output. You are not paying a premium for the new model, so the only question is whether the better tool calling and long-context handling are worth a model-ID change, and for agent and coding work they are.

Three levers cut the API bill further, per the Opus product page. Prompt caching saves up to 90 percent on repeated context. Batch processing saves 50 percent on non-urgent jobs. US-only inference, if your compliance posture needs it, runs at 1.1x the standard rate. The new 1,024-token cache minimum means more of your short prompts now qualify for that 90 percent caching discount.

On the subscription side, Opus 4.8 is available on Pro, Max, Team, and Enterprise. It is not on the Free plan. Anthropic does not publish stable subscription dollar figures in a way we can cite cleanly here, so check claude.com/pricing for the current tier prices before you buy. The practical read: if you live in the chat app and want Opus 4.8 plus effort control, you need at least Pro.

Opus 4.8 vs GPT-5.5 vs Opus 4.7

The most-searched comparison right now is Opus 4.8 against OpenAI's GPT-5.5, so here is the three-way table with the previous Opus included as the upgrade baseline. Benchmark rows are attributed inline; we did not run these.

	Claude Opus 4.8	GPT-5.5	Claude Opus 4.7
Input price ($/M)	$5 Anthropic	$5 OpenRouter	$5 Anthropic
Output price ($/M)	$25 Anthropic	$30 OpenRouter	$25 Anthropic
Context window	1M Anthropic	~1.05M OpenRouter	1M Anthropic
Max output	128k	128k OpenRouter	128k
Fast mode	Yes, 2.5x, $10/$50 Anthropic	Not offered	No
Dynamic workflows	Yes (Claude Code, gated)	No	No
Online-Mind2Web (browser agent)	84%, "a meaningful jump over both Opus 4.7 and GPT-5.5" per Miguel Gonzalez, Tech Lead, BrowserBase Anthropic	not cited in this source	below 4.8 per same quote
Super-Agent benchmark	"the only model to complete every case end-to-end, beating prior Opus models and GPT-5.5 at parity on cost" per Kay Zhu, Co-Founder and CTO, Genspark Anthropic	beaten at parity per same quote	beaten per same quote

On raw token price the two frontier models are close: identical $5 input, and Opus 4.8 comes in $5 cheaper per million output tokens. The benchmark claims all come from Anthropic's own announcement, including the named early-tester quotes, so weigh them as vendor-published rather than third-party. The Cursor and Harvey quotes round out the picture. Michael Truell, CEO of Cursor, says "Claude Opus 4.8 exceeds prior Opus models across every effort level. Tool calling is meaningfully more efficient, using fewer steps for the same intelligence" per the announcement. Niko Grupen of Harvey reports it is "the first model to break 10% overall on the all-pass standard" on the Legal Agent Benchmark, in the same source. On safety, Anthropic's own evaluation puts Opus 4.8 "around four times less likely than its predecessor to allow flaws in code it has written to pass unremarked."

What the table does not settle is your specific workload. GPT-5.5 and Opus 4.8 trade blows depending on the task, and the gap between $25 and $30 output (Anthropic vs OpenRouter) matters only at scale. The clear read: if you are already in the Claude ecosystem, the at-cost-parity claim against GPT-5.5 plus the unchanged price means you have little reason to switch vendors to chase a benchmark.

Which plan for which user

The pricing table tells you what things cost. This tells you what to buy.

Solo developer shipping side projects

Run Opus 4.8 inside your editor rather than paying for max chat usage you will not use. Cursor runs Opus 4.8 as a model option, and Truell's tool-calling-efficiency quote is in Anthropic's launch, so the IDE that drove that benchmark is the natural place to put the model to work. If your usage is bursty and code-heavy, a Pro chat seat plus Cursor covers you; if you are mostly calling the API from scripts, Sonnet 4.6 at $3/$15 with prompt caching keeps the bill low and only reaches for Opus 4.8 on the hard problems. Check claude.com/pricing for the current Pro price.

Engineering team running agentic pipelines

You want the Team plan or direct API access with prompt caching turned on, because a multi-agent pipeline re-reads the same context constantly and the up-to-90-percent cache discount (per the Opus product page) is the difference between a sane bill and a scary one. The mid-conversation system-message feature matters here too: it lets you re-steer a running agent without busting the cache. If your team self-hosts the orchestration layer rather than living entirely in Anthropic's cloud, Cloudways gives you managed infrastructure to run those pipelines, with DigitalOcean as the lower-level alternative if you want to manage the droplet yourself. Pair either with Cursor as the IDE your engineers actually write in.

Enterprise doing a codebase migration

This is the persona dynamic workflows was built for. Enterprise (and Team and Max) unlock the Claude Code feature that runs hundreds of parallel subagents and carries a migration across hundreds of thousands of lines using your test suite as the pass/fail gate per the announcement. If you have a legacy framework migration sitting in the backlog because nobody wants to hand-edit ten thousand files, this is the line item that pays for the Enterprise tier on its own. Confirm current Enterprise pricing through Anthropic sales via claude.com/pricing.

Automation builder wiring research loops

If your work is "run this research or enrichment loop on a schedule" rather than interactive coding, you do not need a chat seat at all. Connect Opus 4.8 to Make and let the scenario fire the API calls, branch on the results, and write them where you need them. Set effort low for the cheap high-volume passes and reserve high effort for the synthesis step, so you are not paying max-thinking rates on every node. Sonnet 4.6 handles the bulk classification work in the same scenario if you want to push the cost down further.

Budget-conscious API user

If price is the binding constraint, Opus 4.8 is not your model. Haiku 4.5 at $1/$5 or Sonnet 4.6 at $3/$15, both from the overview, plus batch processing for the 50 percent discount, will run most summarization and extraction workloads at a fraction of Opus cost. Reach for Opus 4.8 only on the prompts where the cheaper models visibly fail.

The verdict

Opus 4.8 earns a 4.5 of 5 because Anthropic shipped a stronger model at an unchanged price, which is the rarest kind of upgrade. The decision splits cleanly by who you are.

API and agent builders: swap claude-opus-4-7 for claude-opus-4-8 today. Same $5/$25, better tool calling, better long-context handling, and an at-parity cost claim against GPT-5.5 from the announcement. There is no downside row in this trade. Run it in Cursor if you want the IDE that Anthropic's own tool-calling quote came from.

Chat users on Pro or Max: the upgrade is automatic and effort control is the feature you will use daily, so spend a week dialing effort down on routine prompts and up on the hard ones.

Enterprise and large teams: the question is dynamic workflows. If you have a codebase-scale migration in the backlog, that one feature justifies the tier. If you do not, the model upgrade still lands for free.

Stay on Sonnet 4.6 if your work is summarization, drafting, or high-volume Q&A. At $3/$15 (per the overview) it does that work for 40 percent less than Opus output, and the Opus premium buys depth you are not using. The half-point we held back reflects the two caveats: fast mode is an API-only research preview rather than a shipped chat feature, and dynamic workflows skip solo Pro users entirely. Neither changes the core call. At the same price as the model it replaces, Opus 4.8 is the default Anthropic model to run for any serious agentic or coding job.