GitHub Copilot Model Picker 2026: Which Model to Use for Each Task
There is a new name in your Copilot model picker, and for most of what you do all day, it is the right one to pick. Microsoft shipped MAI-Code-1-Flash at Build 2026 on June 2, and GitHub started rolling it out the same day to Copilot Free, Pro, Pro+, and Max (GitHub Changelog). For everyday edits, short files, docs, and quick explanations, switch to it and move on. Reach for a heavier model only when the task genuinely needs deep reasoning. The rest of this guide is the per-task breakdown so you can set up the picker once and stop second-guessing it.
The short version: MAI-Code-1-Flash is a lightweight Microsoft model tuned specifically for the Copilot harness. It posts a 51.2% pass rate on SWE-Bench Pro against Claude Haiku 4.5's 35.2%, and solves harder problems with up to 60% fewer tokens on SWE-Bench Verified (Microsoft AI). That combination, higher accuracy at lower token spend, is exactly what you want for the high-volume light work that fills a normal coding day.
What MAI-Code-1-Flash is, and why Microsoft built it
Most models in the Copilot picker are general-purpose. They were trained to chat, write, reason, and code, then wired into Copilot afterward. MAI-Code-1-Flash went the other direction. Microsoft says it was built end-to-end on clean, appropriately licensed data and trained directly against the GitHub Copilot harness developers use in production (Microsoft AI). It learned how to work with Copilot's surrounding tools during training, not as an afterthought.
The headline feature is adaptive solution length. The model scales how much it reasons to how hard the task is. Simple request, short answer. Hard bug, more reasoning budget. GitHub's docs call it a "reliable default for everyday coding tasks, writing, and multi-turn development workflows" (GitHub Docs).
The benchmark snapshot
Microsoft put MAI-Code-1-Flash head to head with Claude Haiku 4.5 across SWE-Bench Verified, SWE-Bench Pro, SWE-Bench Multilingual, and Terminal Bench 2, using the same production harness developers run. It came out ahead on all four, with a +16-point margin on SWE-Bench Pro, the most diverse, real-world set: 51.2% versus 35.2% (Microsoft AI).
The token-efficiency claim is the one that hits your wallet. Microsoft reports the model solving harder problems with up to 60% fewer tokens on SWE-Bench Verified. Fewer tokens means lower cost per task and lower latency, which is why it feels snappy on small edits.
One caveat worth keeping straight: GitHub's docs flag MAI-Code-1-Flash as "a continuously improving model," with performance that may shift as new checkpoints ship (GitHub Docs). Treat the benchmark numbers as the June 2026 launch snapshot, not a permanent figure.
Which plans get it
All of them. The changelog confirms MAI-Code-1-Flash is rolling out to Copilot Free, Pro, Pro+, and Max, starting in VS Code with a limited set of users and expanding over the following weeks (GitHub Changelog). If you do not see it in your picker yet, you are early in the rollout queue, not locked out. It also shows up under the default Auto picker, so you may already be using it without choosing it by name.
For context on the broader Build 2026 wave, our GitHub Copilot App setup guide covers the new desktop client that shipped alongside this model.
Every model in the Copilot picker right now
The picker has grown into a crowded menu. Here is the current roster with the pricing that actually decides your AI credit spend. All prices are per 1 million tokens, pulled from the GitHub Docs pricing page on June 3, 2026 (GitHub Docs). One AI credit equals $0.01, and token usage above your plan's included allowance bills at these rates.
| Model | Provider | Input ($/M) | Output ($/M) | Best for |
|---|---|---|---|---|
| MAI-Code-1-Flash | Microsoft | pricing pending (see note) | pricing pending | Light edits, docs, quick explanations |
| GPT-5 mini | OpenAI | $0.25 | $2.00 | Cheap general-purpose default, multimodal |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | Fast answers to simple coding questions |
| Gemini 3.5 Flash | Google | $1.50 | $9.00 | Lightweight tasks (IDE only) |
| Claude Sonnet 4.6 | Anthropic | $3.00 | $15.00 | Multi-file refactors, agent tasks |
| GPT-5.5 | OpenAI | $5.00 | $30.00 | Deep reasoning, architecture analysis |
| Claude Opus 4.7 | Anthropic | $5.00 | $25.00 | Hardest bugs, complex reasoning (Pro+) |

Per-token rates from the GitHub Docs model pricing page, fetched June 3, 2026.
A note on that top row. As of June 3, MAI-Code-1-Flash does not yet have a per-token line in the GitHub Docs pricing table, and its Microsoft model card does not publish a price either. Until GitHub lists it, treat any specific per-token figure floating around as unconfirmed. Check the GitHub Docs model pricing page for the current rate before you make a cost-sensitive call. The qualitative point holds regardless: it is a lightweight model built for efficiency, so it sits at the cheap end of the menu.
What Auto does, and when to override it
When you use Copilot Chat in a supported IDE, Auto picks a model for you based on availability (GitHub Docs). It is a fine default and it now includes MAI-Code-1-Flash in the pool. Override it when you know the task is heavier than Auto will guess, or when you want to pin a cheap model for a long session of small edits and protect your premium request budget. Auto optimizes for "good enough right now," not for your monthly credit math.
Which model to pick for each task
This is the part to keep in a browser tab. Match the task to the model, set the picker, and go.
Autocomplete, single-line edits, short files: MAI-Code-1-Flash. This is its home turf. Inline code completions are not billed in AI credits on any paid plan anyway (GitHub Docs), but for the chat-driven small edits that do bill, MAI-Code-1-Flash is the low-cost pick that still delivers solid quality. GitHub recommends it for writing or reviewing short files and diffs, generating docs and comments, and explaining errors quickly.
Multi-file refactors and test generation: Claude Sonnet 4.6. Once a task spans several files or needs the model to hold a wider mental map, step up. GitHub puts Sonnet 4.6 in its general-purpose coding and agent-task tier, noting it improves on Sonnet 4.5 with "smarter reasoning under pressure" (GitHub Docs). It costs $3.00 input and $15.00 output per million tokens, so you do not want it for a one-line typo fix, but for a refactor that touches five files it earns the spend.
Complex bug analysis and architecture review: GPT-5.5 or Claude Opus 4.7. When you need step-by-step reasoning across a large context, these are the heavy hitters. GitHub lists GPT-5.5 as "great at complex reasoning, code analysis, and technical decision-making," and Claude Opus 4.7 as Anthropic's most powerful model (GitHub Docs). At $5.00 input with output running $25.00 (Opus) to $30.00 (GPT-5.5) per million tokens, this is the most expensive corner of the picker. Use it for the gnarly bug that has eaten an afternoon, not for routine work.
PR review and issue triage: MAI-Code-1-Flash or GPT-5 mini. Both are cheap, both are fast, both handle the "read this diff and tell me what is off" loop well. GPT-5 mini runs $0.25 input and $2.00 output per million tokens per the GitHub Docs pricing page, and adds multimodal input, so if a PR includes a screenshot, it has an edge.
Screenshot-to-code and visual debugging: GPT-5 mini, Claude Sonnet 4.6, or Gemini 3.1 Pro. GitHub's docs list these three for multimodal work where you ask about diagrams, screenshots, or UI components (GitHub Docs). MAI-Code-1-Flash is not in that tier. If your task hinges on an image, skip it and pick one of these.
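If you would rather keep the breakdown above as a lookup than as a browser tab, here is a minimal Python sketch. The task labels (`small_edit`, `pr_review`, and so on) and the `pick_models` helper are made up for illustration and are not part of any Copilot API. The multimodal set reflects only the three models this guide names for visual work.

```python
# Task labels are hypothetical; the model names come from the breakdown above.
RECOMMENDED = {
    "small_edit": ["MAI-Code-1-Flash"],
    "multi_file_refactor": ["Claude Sonnet 4.6"],
    "complex_bug": ["GPT-5.5", "Claude Opus 4.7"],
    "pr_review": ["MAI-Code-1-Flash", "GPT-5 mini"],
    "visual": ["GPT-5 mini", "Claude Sonnet 4.6", "Gemini 3.1 Pro"],
}

# Only the models this guide lists for screenshot and diagram work.
MULTIMODAL = set(RECOMMENDED["visual"])


def pick_models(task: str, has_image: bool = False) -> list[str]:
    """Return the suggested models for a task, falling back to Auto."""
    candidates = RECOMMENDED.get(task, ["Auto"])
    if not has_image:
        return candidates
    # With an image in play, keep only candidates from the multimodal list;
    # if none qualify, fall back to the full visual tier.
    with_vision = [m for m in candidates if m in MULTIMODAL]
    return with_vision or RECOMMENDED["visual"]


if __name__ == "__main__":
    print(pick_models("small_edit"))
    print(pick_models("pr_review", has_image=True))
    print(pick_models("something_else"))
```

Unknown tasks fall through to Auto, which matches the advice to treat Auto as the baseline and override only when you know better.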
The token cost math
Here is where picking the right model stops being a preference and starts being money. The premium-request budget is the thing you actually run out of, so model choice is budget management.
Take a representative chat-driven edit that consumes roughly 8,000 input tokens (the file plus surrounding context) and 2,000 output tokens (the rewritten code). Run the numbers against the confirmed pricing (GitHub Docs):
Per-interaction cost = (input_tokens / 1,000,000 * input_rate)
+ (output_tokens / 1,000,000 * output_rate)
GPT-5 mini (8k in @ $0.25, 2k out @ $2.00) = $0.0020 + $0.0040 = $0.0060
Claude Haiku 4.5 (8k in @ $1.00, 2k out @ $5.00) = $0.0080 + $0.0100 = $0.0180
Claude Sonnet 4.6 (8k in @ $3.00, 2k out @ $15.00) = $0.0240 + $0.0300 = $0.0540
GPT-5.5 (8k in @ $5.00, 2k out @ $30.00) = $0.0400 + $0.0600 = $0.1000
That single edit costs about six-tenths of a cent on GPT-5 mini and ten cents on GPT-5.5, a roughly 17x spread for the same prompt. Default a day of small edits to Sonnet 4.6 and you pay 9x what GPT-5 mini charges for output you could not tell apart on a one-file change. MAI-Code-1-Flash sits in the same low-cost bracket as GPT-5 mini and, per Microsoft, uses up to 60% fewer tokens to get there, so the real bill can land lower still.
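Run the formula against any model in the table. The pattern holds: output tokens cost 5x to 8x more than input tokens across the table, so output dominates the bill even at a quarter of the input volume, and the heavy models charge a steep premium on output you only need for hard problems.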
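To rerun the math with your own token counts, here is a small Python sketch of the same formula. The rates are copied from the pricing table in this guide (June 3, 2026 snapshot), and the credit conversion uses the $0.01-per-credit figure quoted earlier. The function and variable names are illustrative. MAI-Code-1-Flash is left out because its per-token rate was not published at the time of writing.

```python
# (input $/M tokens, output $/M tokens), copied from the table in this guide.
RATES = {
    "GPT-5 mini": (0.25, 2.00),
    "Claude Haiku 4.5": (1.00, 5.00),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "GPT-5.5": (5.00, 30.00),
    "Claude Opus 4.7": (5.00, 25.00),
}

USD_PER_CREDIT = 0.01  # one AI credit equals $0.01, per this guide


def interaction_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one interaction at the listed per-million-token rates."""
    input_rate, output_rate = RATES[model]
    return (input_tokens / 1_000_000 * input_rate
            + output_tokens / 1_000_000 * output_rate)


if __name__ == "__main__":
    # The representative edit from above: 8,000 tokens in, 2,000 tokens out.
    for model in RATES:
        cost = interaction_cost(model, 8_000, 2_000)
        credits = cost / USD_PER_CREDIT
        print(f"{model:<18} ${cost:.4f}  ({credits:.1f} credits)")
```

Swap in the token counts from a real session to see how quickly the gap between the light and heavy models widens.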
How to check your AI credit usage
Watch the spend before it surprises you. The included premium-request allowance is the number to track: Pro includes 300 premium requests per month, and Pro+ includes 5x that (GitHub plans).
# Open your Copilot usage and billing dashboard from the terminal.
# macOS:
open "https://github.com/settings/billing"
# Linux:
xdg-open "https://github.com/settings/billing"
From the billing page, find the Copilot usage section to see premium requests consumed and any AI credit overage. If you are burning the budget fast, the fix is almost always the same: stop defaulting heavy models to light tasks. For a deeper plan-level breakdown, see our Copilot Pro+ vs Business pricing comparison.
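Once you have read the premium-request count off the billing page, a quick straight-line projection tells you whether you are on pace to blow through the allowance. The sketch below is a back-of-the-envelope helper, not a GitHub API client: you type in the number by hand, it assumes usage continues at the same daily rate, and it assumes the allowance resets on the first of the calendar month. The function names are made up for illustration.

```python
import calendar
from datetime import date

PRO_ALLOWANCE = 300                      # premium requests per month on Pro
PRO_PLUS_ALLOWANCE = 5 * PRO_ALLOWANCE   # Pro+ includes 5x that


def projected_month_end(used_so_far: int, today: date) -> float:
    """Linear projection of premium requests consumed by month end."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return used_so_far * days_in_month / today.day


def will_exceed(used_so_far: int, today: date, allowance: int) -> bool:
    """True if the current pace would pass the allowance before month end."""
    return projected_month_end(used_so_far, today) > allowance


if __name__ == "__main__":
    # Hypothetical reading: 120 requests used by June 10.
    today = date(2026, 6, 10)
    print(projected_month_end(120, today))
    print(will_exceed(120, today, PRO_ALLOWANCE))
    print(will_exceed(120, today, PRO_PLUS_ALLOWANCE))
```

If the projection lands above your allowance, that is the cue to move routine edits to a lighter model before the overage starts.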
MAI-Code-1-Flash on Copilot vs Cursor: the edge neither side advertises
Here is a difference that does not show up on any pricing page. MAI-Code-1-Flash is Copilot-exclusive. Microsoft built it for the GitHub Copilot harness and ships it inside Copilot (Microsoft AI). You cannot drop it into Cursor through a bring-your-own-key setup, because it is not offered as a standalone API model the way Claude and GPT models are.
For a Copilot subscriber, that is a quiet plus. You get a Microsoft-tuned model, trained against the exact harness you are coding in, that a Cursor user cannot swap in. For a Cursor user, the trade runs the other way: Cursor's pitch has always been model flexibility, its own tab-completion tuning, and the freedom to bring whatever frontier model you have a key for. Neither approach is wrong. They are different bets.
The practical difference is small but real. On Copilot, MAI-Code-1-Flash is one tap in the picker. On Cursor, you reach for Cursor's own fast model or wire up an API key, and you will not have this specific Microsoft model at all. If model-by-model flexibility matters more than one exclusive lightweight model, Cursor leans into that. Size up both in our full Cursor vs Copilot comparison, and Cursor's own pricing and plans are worth a look if BYO-model flexibility is your priority.
The verdict: how to set up your picker today
Set Auto as your baseline and trust it for mixed work. Then override deliberately in two directions.
Override down to MAI-Code-1-Flash for high-volume light editing: small diffs, docs, comments, error explanations, quick PR triage. This is where it is the right default, and it protects your premium-request budget for the work that needs it. Override up to GPT-5.5 or Claude Opus 4.7 only for the hard stuff: a multi-file architecture decision, a bug that has resisted two cheaper passes, anything that needs deep reasoning across a large context.
If you are on Copilot Free ($0) and you keep hitting the 50-request monthly cap, Pro at $10 per user per month buys 300 premium requests and unlimited GPT-5 mini chats. If you are on Pro and you keep reaching for Opus-class reasoning, Pro+ at $39 per user per month unlocks all models, including Claude Opus 4.7, and 5x the premium requests (GitHub plans). Note that GitHub currently shows individual-plan upgrades as paused while it rolls out a new billing experience, so check the plans page for current availability. You can compare every individual tier and what each unlocks on the Copilot plans page.
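The plan logic above can be sketched as a simple lookup. This is a rough illustration using only the figures quoted in this guide (Free at $0 with 50 requests, Pro at $10 with 300, Pro+ at $39 with 5x Pro's allowance). It skips the Max plan because this guide does not state its price or allowance, and it ignores the paused-upgrade caveat. The `cheapest_plan` helper and its parameters are hypothetical.

```python
# (plan name, USD per user per month, monthly request allowance),
# ordered cheapest first. Figures as quoted in this guide.
PLANS = [
    ("Free", 0, 50),
    ("Pro", 10, 300),
    ("Pro+", 39, 5 * 300),
]


def cheapest_plan(monthly_requests: int, needs_opus: bool = False):
    """Return the cheapest listed plan covering the request volume, or None.

    needs_opus forces Pro+, since this guide lists Claude Opus 4.7 as a
    Pro+ model.
    """
    for name, _price, allowance in PLANS:
        if needs_opus and name != "Pro+":
            continue
        if monthly_requests <= allowance:
            return name
    return None  # volume exceeds every plan listed here


if __name__ == "__main__":
    print(cheapest_plan(40))
    print(cheapest_plan(200))
    print(cheapest_plan(200, needs_opus=True))
```

A `None` result means your volume outruns the tiers covered here, at which point overage billing or a higher tier is the conversation to have.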
One housekeeping habit: check the model roster every four to six weeks. The picker churned hard through 2026. Gemini variants came and went on the web, GPT-5 point releases stacked up, and MAI-Code-1-Flash arrived overnight. Today's right default may have a cheaper or sharper replacement next month.
FAQ
Is MAI-Code-1-Flash free to use in Copilot? It is available on the Copilot Free plan, so yes, you can use it without paying for a tier (GitHub Changelog). The Free plan caps you at 50 agent or chat requests per month and 2,000 completions (GitHub plans). Its per-token rate for usage above plan allowances was not yet published in the GitHub Docs pricing table as of June 3, so check that page for current numbers before assuming a cost.
Can I use MAI-Code-1-Flash outside GitHub, in Cursor or VS Code with a BYO key? No. It is built for and shipped inside the GitHub Copilot harness (Microsoft AI). It is not offered as a standalone API model you can bring into Cursor or any other client with your own key. To use it, you need a Copilot plan and the VS Code Copilot extension.
Does the Auto model router pick MAI-Code-1-Flash automatically? Yes, it is in the Auto pool. Microsoft's announcement says the model is available both in the manual picker and under the default Auto picker (Microsoft AI). If you want it specifically for a session, select it by name; otherwise Auto may route some light tasks to it on its own.
What happened to the Gemini models in Copilot? GitHub removed all Gemini models (plus a few others, like GPT-5.2 Codex and GPT-5.4 nano) from Copilot Chat on the web in May 2026 to keep web responses consistent (GitHub Changelog). That change was web-only. Gemini models such as Gemini 3.1 Pro and Gemini 3.5 Flash still appear in the IDE model picker and on the GitHub Docs pricing and comparison pages, so in VS Code you still have them.