GitHub Copilot Project Polaris vs Claude Code: What Changes in August 2026
Microsoft announced on June 2 that a new in-house model called Project Polaris will become the default engine inside GitHub Copilot starting August 2026, replacing GPT-4 Turbo. Every Copilot subscriber migrates automatically, with an optional three-month window to stay on GPT-4 if you are not ready (ChatForest's Build 2026 recap). If you are choosing between Copilot and Claude Code today, here is the short version: the model under Copilot's hood is going to change in two months, but the decision you make this week does not have to wait for it.
Polaris is Microsoft's first real answer to the question "what does Copilot look like without OpenAI?" The two companies ended their seven-year exclusive partnership in April, and AI Weekly reports Microsoft is naming Claude Code directly as the product Polaris is built to displace. That is an unusually blunt admission that Anthropic took developer ground in agentic coding. So the framing writes itself: Microsoft's homegrown model versus the tool that pulled ahead.
One caveat up front, because it shapes everything below. Microsoft has not published a head-to-head benchmark against Claude Code. The HumanEval and MBPP numbers it cites are against GPT-4 Turbo, not Anthropic's model, and the comparison was internal. Treat the Polaris-beats-Claude-Code story as a stated goal, not a measured result, until real benchmarks land after the August release.
What happened on June 2 (the short version)
Build 2026 ran June 2 to 3 at Fort Mason in San Francisco. Polaris was the headline, but it shipped alongside a large batch of Copilot updates the same day.
The Polaris piece, per the ChatForest recap: a mixture-of-experts model with sub-modules tuned per programming language, set to replace GPT-4 Turbo as Copilot's default in August, running on Microsoft's custom Maia 200 accelerators inside Azure (buildfastwithai). Migration is automatic. The fallback to GPT-4 is opt-in and lasts three months.
Around it, GitHub pushed a stack of features that are already live, not future-dated. The Copilot SDK went generally available. Cloud agent scheduling landed, so the agent can run on a timer or off a repo event. Voice input and prompt scheduling arrived in Copilot CLI. And MAI-Code-1-Flash, a separate small-tier Microsoft model, became selectable in VS Code. Copilot Workspace also exited beta with two new autonomous modes.
Why bunch all this into one day? It is Copilot's first significant model shift since the OpenAI exclusive ended. Microsoft wanted the whole story told at once.
What Project Polaris is, exactly
A mixture-of-experts model with language-specific sub-modules
Polaris is not one monolithic network. Windows News describes it as a mixture-of-experts architecture where specialized sub-modules handle distinct programming languages, frameworks, and paradigms. In practice, that means the routing layer hands a Rust task to a Rust-tuned expert and a Python task to a different one, rather than asking a single generalist model to be good at everything.
The design pays off most in languages that usually get short-changed by general-purpose models. Microsoft says the biggest gains show up in low-resource languages like Rust and Haskell, where training data is thinner and a dedicated sub-module has more room to help.
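To make the routing idea concrete, here is a toy sketch of top-1 expert selection. It is not Polaris's architecture, which Microsoft has not published. The expert functions, the keyword-based scores, and the `route` helper are all hypothetical stand-ins. A real mixture-of-experts model learns its gating scores and usually routes per token, not per task.

```python
import math

# Hypothetical experts: plain functions standing in for tuned sub-modules.
def rust_expert(task: str) -> str:
    return f"[rust expert] {task}"

def python_expert(task: str) -> str:
    return f"[python expert] {task}"

def general_expert(task: str) -> str:
    return f"[general expert] {task}"

EXPERTS = {"rust": rust_expert, "python": python_expert, "general": general_expert}

# Toy gating signal (assumption): keyword counts. A real router learns its scores.
KEYWORDS = {
    "rust": ("fn ", "impl ", "cargo", "borrow"),
    "python": ("def ", "import ", "pip", "self."),
}

def softmax(scores: dict) -> dict:
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores.values())
    exps = {name: math.exp(value - peak) for name, value in scores.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}

def route(task: str):
    """Return the top-1 expert name and the full weight distribution."""
    scores = {
        name: float(sum(task.count(kw) for kw in kws))
        for name, kws in KEYWORDS.items()
    }
    scores["general"] = 0.5  # small constant so the generalist wins when nothing matches
    weights = softmax(scores)
    best = max(weights, key=weights.get)
    return best, weights

def run(task: str) -> str:
    """Send the task to whichever expert the router picked."""
    name, _ = route(task)
    return EXPERTS[name](task)
```

Calling `run("fn main() { impl Foo }")` in this sketch sends the task to the Rust stand-in, while a prompt with no language cues falls through to the generalist. The point is only the shape: score, normalize, pick one specialist.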
It runs on Microsoft's own silicon
Polaris is trained and served on Microsoft's custom Maia 200 AI accelerators inside Azure. Microsoft says this cuts per-inference latency and lowers cost relative to routing inference through OpenAI. For a team, that is a vendor-economics detail more than a day-one feature, but it explains the strategic logic. Owning the model and the chips it runs on removes a margin Microsoft was paying to a partner that also competes with it.
Reasoning aimed at multi-file work
The capability Microsoft keeps pointing at is multi-file refactoring, which is exactly where Claude Code earned its reputation. AI Weekly reports that Polaris's stated differentiator is chain-of-thought and tree-of-thought reasoning at inference time. Pro-tier subscribers also get multi-file context up to 100,000 lines and autonomous test generation, per the ChatForest recap.
The benchmarks Microsoft claims, and the one it doesn't
Here is the honest read on the numbers. Microsoft says Polaris outperforms GPT-4 Turbo on HumanEval and MBPP, with the standout gains in Rust and Haskell. Windows News notes those results came from "internal benchmarking" and that the model "reportedly" beat GPT-4 Turbo. So the comparison is real, the source is the vendor, and the baseline is GPT-4 Turbo.
What is missing is any number against Claude Code. There is no published Polaris-versus-Anthropic benchmark, which the AI Weekly coverage makes plain. And HumanEval and MBPP measure single-function code generation, not the multi-file agentic work where Claude Code actually leads. So even the benchmarks Microsoft does cite don't touch the workflow it says Polaris is built to win. Until Polaris ships in August and third parties test it on real repos, the head-to-head is a marketing claim with no scoreboard behind it.
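For a sense of what "single-function code generation" means in practice, here is a rough sketch of a check in that style. The task, the `running_max` function, and the hand-written candidate are invented for illustration and are not taken from HumanEval or MBPP. The real harnesses also run candidates in a sandbox, which this sketch skips.

```python
# A hypothetical task in the style of single-function benchmarks:
# the model sees a signature and docstring and must supply the body.
PROMPT = '''def running_max(values):
    """Return a list where element i is the max of values[:i+1]."""
'''

# Stand-in for a model completion (hand-written here, not real model output).
CANDIDATE = PROMPT + '''    out, best = [], None
    for v in values:
        best = v if best is None or v > best else best
        out.append(best)
    return out
'''

def passes(candidate_source: str) -> bool:
    """Run the candidate and report whether it satisfies the unit tests.

    Note: exec() on untrusted model output is unsafe outside a sandbox.
    """
    namespace = {}
    try:
        exec(candidate_source, namespace)
        fn = namespace["running_max"]
        assert fn([1, 3, 2, 5]) == [1, 3, 3, 5]
        assert fn([]) == []
        assert fn([-2, -5]) == [-2, -2]
        return True
    except Exception:
        return False
```

Everything the check needs fits in one function and a handful of asserts. Nothing here exercises reading a repo, editing several files, or keeping a plan straight across steps, which is the gap between these scores and agentic work.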
How the August transition works
| Date | What happens |
|---|---|
| Today (June 2, 2026) | Polaris announced at Build. GPT-4 Turbo remains Copilot's default. Nothing changes in your setup. |
| August 2026 | Polaris becomes the default model. All subscribers migrate automatically. The three-month fallback window for staying on GPT-4 opens. |
| November 2026 | The fallback window closes (roughly three months after the August switch). Teams that stayed on GPT-4 move to Polaris. |
A few things to keep straight as that timeline plays out.
The migration is automatic and opt-out, not opt-in. If you do nothing, your team is on Polaris in August. The escape hatch is the three-month fallback to GPT-4, and you have to choose it. Per the ChatForest recap, that window gives cautious teams a quarter to validate the new model before it becomes mandatory.
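The November date is just the August switch plus three months. A small sketch of that arithmetic follows. Microsoft gave only the month, so the August 1 start date is a placeholder assumption, and `add_months` is a helper written for this example, not a standard-library function.

```python
import calendar
from datetime import date

def add_months(start: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of short months."""
    index = start.year * 12 + (start.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))

# Assumption: the exact switch day is unannounced, so August 1 is a placeholder.
ASSUMED_SWITCH = date(2026, 8, 1)
FALLBACK_MONTHS = 3  # per the announced three-month window

fallback_ends = add_months(ASSUMED_SWITCH, FALLBACK_MONTHS)
print(f"Fallback window closes around {fallback_ends:%B %Y}")
```

Swap in the real switch date once GitHub publishes it and put the result on your team calendar.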
What stays the same matters as much as what changes. This is a model swap, not a billing or product overhaul. Code completions, chat, the agent surface, and your seat structure all stay in place; Polaris changes the model behind them, not the product around them. If you want the current Copilot pricing picture, that lives in our Copilot plan overview, and it is separate from this model change.
What might change is quality and feel. A new default model can shift completion latency, refactoring accuracy, and how the agent behaves on large diffs. Microsoft's pitch is that all three improve. The honest answer is you will not know how Polaris handles your codebase until you run your own work through it in August.
Copilot vs Claude Code: the current state, before Polaris ships
Strip out the August speculation and look at where the two tools stand today. The picture is clearer than the announcement makes it sound.
Copilot's strengths right now are about gravity and breadth. If your pull requests, Actions, issues, and reviews already run through GitHub, Copilot is sitting inside the place your work already lives. Code completions are fast and unlimited on paid plans, the IDE coverage is wide, and the June 2 additions (SDK, cloud agent scheduling, Workspace GA) make the GitHub-native loop tighter. For everyday completion-and-chat coding, Copilot with its existing models is already competitive.
Claude Code's lead is narrower and deeper. It pulled ahead on multi-file agentic refactoring and on following long, complex instructions across a repo. That is the exact workflow Microsoft is now targeting, and it is no accident: AI Weekly notes that Microsoft's naming of Claude Code marks the first time an incumbent has publicly acknowledged losing measurable developer adoption to Anthropic in a product category. Today, on complex agentic work, Claude Code is still the tool to beat.
So the Polaris claim is best understood as a shot, not a result. Microsoft is aiming Polaris squarely at Claude Code's strongest ground. Whether it lands is an August question with no June answer.
The other June 2 features worth knowing
The model headline buried a batch of shipping features that change what Copilot can do this week.
The Copilot SDK is generally available in six languages: Node.js/TypeScript, Python, Go, .NET, Rust (new at GA), and Java. It gives you programmatic access to the same agent runtime behind Copilot, including planning, tool calls, file edits, and multi-turn sessions, so you can embed it in your own tools instead of building orchestration yourself.
# Install the Copilot SDK in your stack of choice (per the June 2 GA changelog)
npm install @github/copilot-sdk # Node.js / TypeScript
pip install github-copilot-sdk # Python
go get github.com/github/copilot-sdk/go # Go
dotnet add package GitHub.Copilot.SDK # .NET
cargo add github-copilot-sdk # Rust (new at GA)
Cloud agent scheduling lets the agent run on its own, on a schedule or in response to repository events. GitHub's documented examples: auto-labeling new issues as bug, enhancement, or other; checking for failing tests on main each night and opening a draft fix PR; and drafting weekly release notes. It is available for Pro, Pro+, Max, Business, and Enterprise; Business and Enterprise admins have to enable the cloud agent policy first. Automations run in private and internal repos, with public repo support coming.
You set one up from the Agents tab on github.com (or the GitHub Copilot app), giving it a name, a prompt, a trigger, and a scoped tool list:
# Shape of a Copilot cloud agent automation (per the June 2 changelog)
name: nightly-test-fix
trigger: daily # or: hourly | weekly | on new issue | on PR created/updated
prompt: >
  Check for failing tests on main. Attempt a fix and open a draft pull request.
tools:
  - create pull request
  - update issue labels
Heavy agent and chat usage is where Copilot's June 1 billing changes bite, since those runs draw from your AI Credit pool. If you are turning on scheduled agents, read how that metering works in our Copilot cloud agent model costs breakdown before you set a nightly job loose.
Copilot CLI picked up three generally available features: voice input, prompt scheduling with /every and /after, and a "rubber duck" second-opinion mode. There is also an experimental redesigned terminal interface behind /experimental.
# Copilot CLI, June 2 update (features GA per the changelog)
copilot # launch the CLI
/experimental # opt into the redesigned terminal UI
/every 1h "summarize new commits on main" # recurring scheduled prompt
/after 30m "draft release notes for the open milestone" # one-shot delayed prompt
# Voice input: hold to talk instead of typing your prompt
Examples above follow GitHub's June 2, 2026 Copilot CLI and SDK changelogs; install commands are reproduced from the GA release notes.
MAI-Code-1-Flash is the one that confuses people, so be clear on it. It is a separate, small-tier Microsoft coding model, available in the VS Code model picker now. It is not Polaris. Microsoft frames it as the first in a wave of purpose-built coding models, tuned for lightweight workflows where speed matters more than maximum reasoning depth. Polaris is the big August story; MAI-Code-1-Flash is a fast little model you can pick today.
What to do before August
No "it depends." Three personas, three picks.
If your team is on Copilot Business and lives in completions and chat, stay put and do nothing. The August migration is automatic, the features you use most (completions and chat) stay in place through the model swap, and Polaris is more likely to help that workflow than hurt it. Note the November date so you remember the GPT-4 fallback window exists, and revisit in late August once Polaris is actually running. There is no reason to churn tools over an announcement.
If you are evaluating a switch to Claude Code for agentic, multi-file work, try it now. Do not wait for Polaris. The Polaris-beats-Claude-Code claim has no benchmark behind it yet. If Claude Code is in fact the right call for your repos, waiting on that promise costs you two months of better agentic tooling. Run both against your own codebase this week and decide on evidence you control, not a Build keynote. If part of that evaluation is cost, our Cursor vs Copilot Teams pricing comparison lays out the current per-seat math, and Cursor's team plans are worth a look if you want an agent-first IDE in the same bake-off.
If you run both tools already, keep the split and revisit in September. The June 2 additions (cloud agent scheduling, the SDK, Workspace GA) make Copilot genuinely stronger for GitHub-native automation, so it is earning its seat for that work regardless of Polaris. Let Claude Code keep the heavy agentic refactoring until August benchmarks tell you whether Polaris has closed the gap. September is the first month you will have real Polaris data to act on. You can compare GitHub Copilot's plans against your Claude Code spend then with actual numbers instead of a roadmap.
The thread across all three: Polaris is a real change coming on a real date, but it is two months out and unproven against the competitor it targets. Make the call your current work justifies, and let August bring its own evidence.
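If you do run a bake-off on your own repos, it helps to record results the same way for every tool. The sketch below is one possible shape for that tally, not a standard methodology. The `Trial` fields and `summarize` helper are hypothetical, and the sample rows use placeholder tool names and made-up numbers purely to show the output format.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Trial:
    tool: str           # whichever assistant ran the task
    task: str           # short label for the refactor or fix attempted
    tests_passed: bool  # did the repo's own test suite pass afterwards?
    human_edits: int    # lines you had to fix by hand
    minutes: float      # wall-clock time until the change was mergeable

def summarize(trials):
    """Group trials by tool and compute simple per-tool averages."""
    grouped = defaultdict(list)
    for trial in trials:
        grouped[trial.tool].append(trial)
    summary = {}
    for tool, rows in grouped.items():
        count = len(rows)
        summary[tool] = {
            "pass_rate": sum(r.tests_passed for r in rows) / count,
            "avg_human_edits": sum(r.human_edits for r in rows) / count,
            "avg_minutes": sum(r.minutes for r in rows) / count,
        }
    return summary

# Placeholder rows with invented numbers; replace with your own measurements.
SAMPLE = [
    Trial("tool_a", "rename module", True, 4, 10.0),
    Trial("tool_a", "split service", False, 12, 20.0),
    Trial("tool_b", "rename module", True, 2, 15.0),
    Trial("tool_b", "split service", True, 6, 25.0),
]

for tool, stats in summarize(SAMPLE).items():
    print(tool, stats)
```

Keeping the same tasks and the same metrics across tools now also gives you a baseline to rerun against Polaris once it ships.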
FAQ
When exactly does Polaris go live? August 2026. Microsoft gave the month, not a specific day, at Build on June 2. Polaris becomes Copilot's default model then, replacing GPT-4 Turbo, per the ChatForest Build recap.
Can I opt out of the Polaris migration? For a while, yes. Migration is automatic, but there is an optional three-month fallback window to stay on GPT-4 after the August switch. That puts the hard cutover around November 2026. If you do nothing, you are on Polaris in August; to stay on GPT-4, you have to choose the fallback.
Does Polaris change Copilot pricing? The model swap itself is not a pricing change. Copilot's billing change already happened on June 1 with the move to AI Credits, which is a separate event. For current plan and credit details, see our Copilot plan overview.
Is Polaris better than Claude Code? There is no published benchmark comparing the two as of June 2. Microsoft says Polaris beats GPT-4 Turbo on HumanEval and MBPP, with the largest gains on Rust and Haskell (ChatForest's Build recap), but those are internal results against GPT-4 Turbo, not Anthropic's model, and they measure single-function generation rather than the multi-file agentic work where Claude Code leads (AI Weekly). Anyone claiming a winner today is guessing.
What is MAI-Code-1-Flash, and is it the same as Polaris? No. MAI-Code-1-Flash is a separate small-tier Microsoft coding model, available in VS Code's model picker now, built for lightweight, fast workflows. Polaris is the larger reasoning model arriving as Copilot's default in August. Both are Microsoft models, but they are different products at different tiers.
Is Polaris available in all IDEs at launch? The August rollout is a default-model change for Copilot subscribers, so it follows wherever Copilot already runs. Microsoft has not published a per-IDE availability schedule beyond the August default-switch timing. Expect the same surfaces you use Copilot in today; confirm against GitHub's changelog closer to the date.