Claude Opus 4.8 in GitHub Copilot: pricing, the 15x multiplier, and which plan actually needs it
Here is the short answer before the numbers: turn Claude Opus 4.8 on if you run long agentic sessions across a big codebase and you are on Pro+, Business, or Enterprise. Leave it off for autocomplete, quick chat, and one-file edits, where it quietly drains your monthly credit balance for output you could get cheaper. For a solo Pro+ developer, treat Opus 4.8 as a "break glass" model you reach for on the hard bug, not your daily driver. For a Business team, enable the policy but pair it with guidance on when to use it. For an Enterprise org, enable it and lean on the promotional included usage through August to learn your real consumption before the bill is yours alone.
The model went generally available in GitHub Copilot on May 28, 2026 (GitHub Changelog). Four days later, on June 1, the whole cost question underneath it changed. So the timing matters, and the rest of this guide is the math you need to make the call.
The 15x multiplier is already history
When Opus 4.8 shipped on May 28, Copilot was still on seat-based billing. Under that model, every premium request to Opus 4.8 counted as 15 requests against your monthly premium-request allowance (GitHub Changelog). One question to Opus burned the same budget as fifteen questions to a cheaper model. That 15x figure is the number floating around every forum thread right now.
It applied for exactly four days. Usage-based billing launched June 1, 2026, and it replaced Premium Request Units with GitHub AI Credits (GitHub Blog). The multiplier is gone. There is no "15 requests" tax on an Opus 4.8 call anymore.
What replaced it is simpler and, for heavy users, more honest. Your AI Credits draw down by actual token usage at each model's published API rate. GitHub's own wording: "Credits will be consumed based on token usage, including input, output, and cached tokens, according to the published API rates for each model" (GitHub Blog). One AI Credit equals one cent.
So the cost question moved. It used to be "how many of my premium requests does Opus 4.8 eat." Now it is "how many tokens does an Opus 4.8 session burn against my credit balance." A short question to Opus 4.8 is cheap. A long agent run that reads half your repo into a 1M-token context is not. Under the old multiplier, both cost the same flat 15x. They no longer do, and that is the change worth understanding.
What an Opus 4.8 interaction actually costs
Opus 4.8 bills at $5 per million input tokens and $25 per million output tokens on the Claude API, the same rates as Opus 4.7 (Anthropic pricing). Copilot draws your AI Credits down at these published rates. Since a credit is a cent, the dollar figures map directly to credits: a one-cent draw is one credit.
Example: take a chat-driven edit that reads in roughly 8,000 tokens of file and context and writes back 2,000 tokens of revised code. This is the same shape the companion model-picker math uses, so the comparison is apples to apples.
Per-interaction cost = (input_tokens / 1,000,000 * input_rate)
+ (output_tokens / 1,000,000 * output_rate)
Opus 4.8 (8k in @ $5.00, 2k out @ $25.00) = $0.0400 + $0.0500 = $0.0900
Nine cents, or nine AI Credits, for one mid-sized edit. On a Pro+ plan that ships with $39 of credits a month (GitHub Blog), that is roughly 430 edits of this size before you hit overage, if Opus 4.8 were the only thing you ran. It will not be.
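The formula above fits in a few lines of Python if you want to plug in your own token counts. Treat this as a minimal sketch, not anything GitHub or Anthropic ships: the names `interaction_cost` and `to_credits` are made up for illustration, and the rates and the $39 Pro+ allowance are simply the figures quoted in this article. `Decimal` keeps the cents exact.

```python
from decimal import Decimal

# Rates quoted in this article, in dollars per million tokens (Opus 4.8).
OPUS_INPUT_RATE = Decimal("5.00")
OPUS_OUTPUT_RATE = Decimal("25.00")
MILLION = Decimal(1_000_000)


def interaction_cost(input_tokens, output_tokens,
                     input_rate=OPUS_INPUT_RATE, output_rate=OPUS_OUTPUT_RATE):
    """Dollar cost of one interaction at per-million-token rates."""
    input_cost = Decimal(input_tokens) / MILLION * input_rate
    output_cost = Decimal(output_tokens) / MILLION * output_rate
    return input_cost + output_cost


def to_credits(dollars):
    """One AI Credit is one cent, so credits are dollars times 100."""
    return dollars * 100


edit_cost = interaction_cost(8_000, 2_000)        # the mid-sized edit: 0.09 dollars
edit_credits = to_credits(edit_cost)              # 9 credits
edits_per_month = Decimal(3900) // edit_credits   # whole edits inside 3,900 credits
print(edit_cost, edit_credits, edits_per_month)
```

Swap in the token counts from one of your own sessions to see where a typical edit lands.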
The real cost lives in agent mode, not single edits. An agentic run that reads a large slice of the repo into context and iterates over several tool calls can push input into the hundreds of thousands of tokens.
Example agent run (250k in @ $5.00, 40k out @ $25.00)
input = 250,000 / 1,000,000 * $5.00 = $1.2500
output = 40,000 / 1,000,000 * $25.00 = $1.0000
total = $2.25 (225 AI Credits)
That single long-horizon run costs more than two dollars. At $2.25 each, seventeen of those runs consume almost all of a Pro+ plan's $39 monthly credit allowance, and the eighteenth puts you into overage. The old 15x multiplier hid this difference. Token billing exposes it, which is exactly why you want Opus 4.8 on a leash for routine work and unleashed only for the tasks that justify the spend.
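To see how far a monthly allowance stretches against runs like this one, you can divide it out. The sketch below is illustrative only: `run_cost` and `runs_within` are hypothetical helper names, and the token counts, rates, and $39 Pro+ allowance are the numbers used in this article, not measurements of a real session.

```python
from decimal import Decimal

MILLION = Decimal(1_000_000)
INPUT_RATE = Decimal("5.00")           # dollars per million input tokens (from this article)
OUTPUT_RATE = Decimal("25.00")         # dollars per million output tokens (from this article)
PRO_PLUS_ALLOWANCE = Decimal("39.00")  # monthly included credits, expressed in dollars


def run_cost(input_tokens, output_tokens):
    """Dollar cost of one agent run at the rates above."""
    return (Decimal(input_tokens) / MILLION * INPUT_RATE
            + Decimal(output_tokens) / MILLION * OUTPUT_RATE)


def runs_within(allowance, cost_per_run):
    """Whole runs that fit inside the allowance before overage starts."""
    return int(allowance // cost_per_run)


agent_run = run_cost(250_000, 40_000)                    # 2.25 dollars
full_runs = runs_within(PRO_PLUS_ALLOWANCE, agent_run)   # 17 runs
left_over = PRO_PLUS_ALLOWANCE - full_runs * agent_run   # 0.75 dollars remaining
print(agent_run, full_runs, left_over)
```

This assumes Opus 4.8 agent runs are the only thing drawing on the balance, which will rarely be true in practice.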
One nuance that bites quietly: Opus 4.7 and later switched to a new tokenizer, and Anthropic notes it "may use up to 35% more tokens for the same fixed text" than older Claude models (Anthropic pricing). Same code, same prompt, more tokens counted. Budget for it. If you were sizing your credit math against an older Claude model's token counts, pad your estimate.
There is also a Fast mode, a research preview that delivers up to 2.5x higher output tokens per second (Anthropic what's new). It bills at $10 per million input and $50 per million output, double the standard rate (Anthropic pricing). Worth knowing it exists; not worth defaulting to unless latency is the bottleneck and you have the credits to spend.
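Both adjustments are easy to fold into an estimate. Here is a rough sketch under two stated assumptions: the 35% figure is treated as a worst-case ceiling applied to input and output alike, and the helper names `cost` and `padded` are invented for this example. The rates are the standard and Fast mode figures quoted above.

```python
from decimal import Decimal, ROUND_CEILING

MILLION = Decimal(1_000_000)
STANDARD = {"input": Decimal("5.00"), "output": Decimal("25.00")}
FAST = {"input": Decimal("10.00"), "output": Decimal("50.00")}
TOKENIZER_PADDING = Decimal("1.35")  # "up to 35% more tokens", used here as a ceiling


def cost(input_tokens, output_tokens, rates):
    """Dollar cost at the given per-million-token rates."""
    return (Decimal(input_tokens) / MILLION * rates["input"]
            + Decimal(output_tokens) / MILLION * rates["output"])


def padded(tokens):
    """Worst-case token count if the estimate came from an older tokenizer."""
    return int((Decimal(tokens) * TOKENIZER_PADDING)
               .to_integral_value(rounding=ROUND_CEILING))


# The 250k-in / 40k-out agent run, re-estimated three ways.
baseline = cost(250_000, 40_000, STANDARD)
padded_standard = cost(padded(250_000), padded(40_000), STANDARD)
fast_mode = cost(250_000, 40_000, FAST)
print(baseline, padded_standard, fast_mode)
```

Padding only matters when your token counts were measured on an older Claude model; counts taken from Opus 4.7 or later already reflect the new tokenizer.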
Opus 4.8 vs the lighter options
Most of your day does not need Opus 4.8. The point of the picker is to match the model to the task: pick the lightest model that clears the bar. Here is how Opus 4.8 sits next to the two models you will actually pick against it: Microsoft's lightweight MAI Code 1 Flash, and the default Auto router.
| Model | Best for | Request cost tier | Context window | Speed |
|---|---|---|---|---|
| Claude Opus 4.8 | Hardest bugs, long agentic runs, large-codebase reasoning | High ($5/M in, $25/M out, Anthropic pricing) | 1M tokens on the Claude API, 200k on Microsoft Foundry | Standard, with an optional Fast mode at premium rates |
| MAI Code 1 Flash | Everyday edits, docs, quick explanations, light PR triage | Low end of the menu | Tuned for the Copilot harness | Fast, built for high-volume light work |
| Auto (default picker) | Mixed work when you do not want to choose | Varies by what it routes to | Depends on the routed model | Optimized for "good enough now" |
Opus 4.8 runs a 1M-token context window by default on the Claude API, with 128k max output tokens; on Microsoft Foundry the window is 200k (Anthropic what's new). That million-token context is the real reason to reach for it: holding a large codebase in working memory across a long agent run is where it earns its rate. For everything lighter, MAI Code 1 Flash is the cheaper, faster pick. We walk through the full per-task breakdown in our GitHub Copilot model picker guide.
A word on Auto. It picks a model for you based on availability and leans toward "good enough right now," not your monthly credit math. It is a reasonable baseline. Override it up to Opus 4.8 when you know the task needs deep reasoning across a wide context, and override it down to a light model when you are doing volume work and want to protect your balance.
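If it helps to see the override rule written down, it reduces to a three-way choice. This is only a sketch of the heuristic described here, not how Copilot's Auto router works: `suggest_model` and its three boolean inputs are hypothetical, and you would judge those inputs yourself for each task.

```python
def suggest_model(needs_deep_reasoning: bool, wide_context: bool, high_volume: bool) -> str:
    """Return the model this article's rule of thumb would point to."""
    # Override up: deep reasoning across a wide context is what Opus 4.8 is for.
    if needs_deep_reasoning and wide_context:
        return "Claude Opus 4.8"
    # Override down: volume work should protect the credit balance.
    if high_volume:
        return "MAI Code 1 Flash"
    # Otherwise the default picker is a reasonable baseline.
    return "Auto"


print(suggest_model(needs_deep_reasoning=True, wide_context=True, high_volume=False))
print(suggest_model(needs_deep_reasoning=False, wide_context=False, high_volume=True))
```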
Where Opus 4.8 shows up, surface by surface
Opus 4.8 reached a long list of surfaces at GA: VS Code (chat, ask, edit, and agent modes), Visual Studio, the Copilot CLI, the GitHub Copilot cloud agent, the GitHub Copilot App, github.com, GitHub Mobile on iOS and Android, JetBrains, Xcode, and Eclipse (GitHub Changelog). The surface you pick changes how much the model costs you, because each one tends toward a different token shape.
Agent mode in VS Code is where Opus 4.8 is worth its rate. Long-horizon agentic coding is the model's strength, and Anthropic calls out improved tool triggering and compaction recovery in this release (Anthropic what's new). It is also where token spend climbs fastest, because the agent reads a lot of context and iterates. High value, high cost. Use it when the task earns it.
Plain chat and the ask flow are cheaper per turn because they read less context. A focused question to Opus 4.8 in chat costs cents, not dollars. If your question genuinely needs Opus-class reasoning, chat is a fine place to spend it.
The GitHub Copilot App and GitHub Mobile put Opus 4.8 in your pocket and on your desktop outside the IDE. Handy for reviewing a design decision away from your editor. Same billing applies: you are spending credits at token rates wherever you call it.
How to enable Claude Opus 4.8 in VS Code
The rollout is gradual, and Business and Enterprise admins have to flip a policy before anyone on the team sees the model (GitHub Changelog). Here is the path, VS Code first.
1. Confirm your plan. Opus 4.8 is available to Copilot Pro+, Business, and Enterprise (GitHub Changelog). Plain Pro does not get it. If you are on Pro and want Opus-class reasoning, you need to move up to Pro+.
2. If you are an admin: open Copilot settings for your organization and enable the Claude Opus 4.8 policy. Until an admin enables it, Business and Enterprise seats will not see the model in their picker (GitHub Changelog).
3. Update the GitHub Copilot extension in VS Code so you are on a build that lists the model. The rollout is gradual, so an older extension build may not surface it yet.
4. Open Copilot Chat and click the model picker dropdown, then select Claude Opus 4.8. For agent work, switch to agent mode first, then pick the model so the run uses it.
5. Verify it stuck. The picker should now read Claude Opus 4.8 for that chat or agent session. If the model is missing, you are early in the rollout queue or the org policy is off; recheck steps 2 and 3.
Watch your credit balance before it surprises you
Under token billing, the number to track is your AI Credit draw, not a premium-request counter. The quickest way to get there is to open the billing page straight from the terminal.
# Open your Copilot usage and billing dashboard.
# macOS:
open "https://github.com/settings/billing"
# Linux:
xdg-open "https://github.com/settings/billing"
From the billing page, find the Copilot usage section to see your AI Credit consumption and any overage above your plan's included allowance. The included credits vary by plan: Pro ships with $10 in credits, Pro+ with $39, Business with $19 per user, and Enterprise with $39 per user. Business and Enterprise also get promotional included usage from June through August, $30 and $70 respectively (GitHub Blog).
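Once you have read your consumption off the billing page, checking it against your plan is simple arithmetic. The sketch below hard-codes the included monthly credits listed above (one credit per cent). The names `PLAN_CREDITS`, `remaining_credits`, and `overage_credits` are hypothetical, and the promotional amounts are deliberately left out because this article does not spell out how they combine with the base allowance.

```python
# Included monthly AI Credits per plan, from the figures above (1 credit = 1 cent).
# Business and Enterprise values are per user.
PLAN_CREDITS = {
    "Pro": 1_000,
    "Pro+": 3_900,
    "Business": 1_900,
    "Enterprise": 3_900,
}


def remaining_credits(plan: str, spent: int) -> int:
    """Credits left in the included allowance, never below zero."""
    return max(PLAN_CREDITS[plan] - spent, 0)


def overage_credits(plan: str, spent: int) -> int:
    """Credits consumed beyond the included allowance."""
    return max(spent - PLAN_CREDITS[plan], 0)


# Example: a Pro+ user who has drawn 4,125 credits this month.
print(remaining_credits("Pro+", 4_125), overage_credits("Pro+", 4_125))
```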
Two billing facts worth pinning. Code completions and Next Edit suggestions stay free, so your autocomplete habit does not touch your credit balance (GitHub Blog). And when your credits run out, "fallback experiences will no longer be available" (GitHub Blog). Translation: burn the balance on Opus 4.8 agent runs and you can lose access to paid features until the next cycle or a top-up. If you want the deeper plan-level breakdown, see our Copilot usage-based billing guide.
So set a habit. Glance at the billing page weekly for the first month, learn what a normal week of your work draws, and you will know whether Opus 4.8 is a sensible default for you or a tool to ration. Promotional usage on Business and Enterprise runs through August, which makes the weeks until then a low-cost window to learn your real number.
A few things changed in this release that make the spend land better when you do reach for it. Adaptive thinking is now the only thinking mode, with effort defaulting to high. The prompt cache minimum dropped to 1,024 tokens, so more of your repeated context can be cached and billed at the lower cached rate (Anthropic what's new). Mid-conversation system messages and better compaction recovery mean long agent sessions hold together longer before they go off the rails. None of that makes Opus 4.8 cheap. It makes the money you do spend on it work harder.
Bottom line: for a solo Pro+ developer, keep MAI Code 1 Flash or Auto as your default and switch to Opus 4.8 only for the bug that has eaten your afternoon or the refactor that needs to see the whole repo at once. The 1M-token context and long-horizon agent strength are worth the token bill on those tasks and a waste of it on everything else. For a Business or Enterprise admin, enable the policy now, point your team at the same "heavy model for heavy work" rule, and use the June-to-August promotional credits to measure real consumption before the full rate is yours. If GitHub Copilot is already your editor, you can start on the Copilot plans page, move to Pro+ if you are stuck on plain Pro, and turn Opus 4.8 on for the work that actually needs it.