Best MCP Servers 2026: The Definitive Directory
By Jonathan Hildebrandt, Co-founder and AI Product Manager at Pondero
Most MCP directories rank servers by GitHub stars and feature breadth. Both are the wrong axis. Every connected server injects its full tool schema into your context window before you type a single word, so the real cost of an MCP server is not its API and not its setup time. It is the tokens it spends being present. Rank the field on capability-per-token plus how cleanly it scopes access to production data, and the list reorders sharply: GitHub finishes first because it earns its tool budget every session, Cloudflare's Code Mode is the most important engineering idea in the ecosystem, and "more servers" is an anti-pattern past five.
We have run these 24 servers across three production projects since January 2026. The single number that should change how you pick: five servers at ~15 tools each is roughly 15,000-20,000 tokens consumed on every conversation before you ask anything. On a 200K-context model that is survivable. It is also why the right answer is almost never "install more."
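The arithmetic behind that figure can be sketched in a few lines. The per-tool schema cost (200 to 267 tokens) is an assumption, back-derived by dividing the 15,000-20,000 range by 75 tools; it is not a measured constant, and real schemas vary widely.

```python
def schema_tax(servers: int, tools_per_server: int, tokens_per_tool: int) -> int:
    """Tokens spent on tool schemas before the first user message."""
    return servers * tools_per_server * tokens_per_tool


CONTEXT_WINDOW = 200_000  # the 200K-context model from the text

# Five servers at ~15 tools each; tokens-per-tool values are assumed bounds.
low = schema_tax(servers=5, tools_per_server=15, tokens_per_tool=200)
high = schema_tax(servers=5, tools_per_server=15, tokens_per_tool=267)

print(f"{low:,} to {high:,} tokens")
print(f"{low / CONTEXT_WINDOW:.1%} to {high / CONTEXT_WINDOW:.1%} of the window")
```

Swap in your own client-reported numbers; the point is only that the cost scales with tool count, not with how often you call the tools.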
What earns the #1 slot
GitHub MCP wins because it has the highest capability-per-token ratio in the ecosystem and the most-used real workflow. It is a first-party server (GitHub with Anthropic) with 51 tools, and the daily payoff is concrete: "review this PR, file issues for the follow-ups, link them to the current branch" executes in one conversation instead of across four tools and a dozen context copies. The 2026 remote OAuth server removed the personal-access-token management that was the single biggest security objection to the older local version. It is the one server where the tool-schema tax is unambiguously worth paying every session.
The mechanism most directories skip: MCP's value is not "AI calls tools." It is that a server built once works with every compliant client (Claude Desktop, Claude Code, Cursor, VS Code, Windsurf, ChatGPT), which collapses the N-models-times-M-tools integration problem. The constraint it imposes in return is the token tax above, and the scaling failure it does not solve is that each additional server degrades tool-selection accuracy. That tradeoff is the whole reason this ranking is ordered by capability-per-token, not by capability.
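To make the N-times-M collapse concrete, here is a minimal sketch of the integration count with and without a shared protocol. The counts come from this page (six clients named above, 24 servers in this directory); the function names are illustrative only.

```python
def point_to_point(clients: int, servers: int) -> int:
    """Every client needs its own integration for every tool provider."""
    return clients * servers


def via_shared_protocol(clients: int, servers: int) -> int:
    """Each client and each server implements the protocol once."""
    return clients + servers


clients = 6   # clients named in the paragraph above
servers = 24  # servers covered by this directory

print(point_to_point(clients, servers))       # bespoke integrations
print(via_shared_protocol(clients, servers))  # protocol implementations
```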
Ranking criteria:
| Criterion | Weight | Why |
|---|---|---|
| Capability per token | 30% | The schema tax is paid every conversation. A chatty 30-tool server you use twice a week is a bad trade. |
| Access scoping to prod data | 25% | Database and infra servers expose real write access. Native read-only and project scoping is a hard requirement, not a nice-to-have. |
| Workflow density | 20% | How many real daily steps it actually collapses. Measured by what we run, not what it can do. |
| Maintenance and first-party status | 15% | Official servers track protocol changes; stale community servers break on client updates. |
| Setup and transport | 10% | Remote OAuth beats local stdio for getting started and for client compatibility. |
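As a sketch of how these weights combine, the snippet below computes a weighted score from per-criterion ratings. The weights are the ones in the table; the 0-10 ratings in `example` are hypothetical placeholders, not our actual scores for any server.

```python
WEIGHTS = {
    "capability_per_token": 0.30,
    "access_scoping": 0.25,
    "workflow_density": 0.20,
    "maintenance": 0.15,
    "setup_transport": 0.10,
}


def weighted_score(ratings: dict[str, float]) -> float:
    """Combine per-criterion ratings (0-10) into one weighted score."""
    missing = WEIGHTS.keys() - ratings.keys()
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


# Hypothetical ratings, for illustration only.
example = {
    "capability_per_token": 9,
    "access_scoping": 7,
    "workflow_density": 9,
    "maintenance": 10,
    "setup_transport": 8,
}

print(round(weighted_score(example), 2))
```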
The ranking (top tier first, by capability per token)
| Rank | Server | Category | The one reason it ranks here | Transport |
|---|---|---|---|---|
| 1 | GitHub | Dev | Highest capability-per-token, first-party, daily PR/issue workflow | Remote OAuth / local PAT |
| 2 | Cloudflare | Infra | Code Mode: 2,500+ endpoints in ~1,000 tokens instead of 1.17M | Remote |
| 3 | Supabase | Database | Native read-only routing through a real read-only Postgres user | Remote OAuth |
| 4 | Sentry | Dev | Turns a Sentry issue into a diagnosed root cause in one conversation | Remote |
| 5 | Filesystem | System | The first server everyone should install; scoped-directory access by design | Local stdio |
| 6 | Tavily | Search | Full research pipeline in one server instead of chaining three | Local stdio |
| 7 | Neon | Database | Branch-per-change makes a bad AI migration a non-event | Remote OAuth |
| 8 | Memory | Specialized | The only server that solves cross-session amnesia | Local stdio |
Everything below this line is situational, not core. The candid call: GitHub plus Filesystem plus one database server is the entire stack most developers need. Add from the situational tier only when a specific workflow demands it.
1. GitHub MCP: the one server that earns its tokens every session
51 tools is a heavy schema. GitHub is the one server where you use enough of it, often enough, that the tax is clearly worth it. First-party (GitHub with Anthropic), 28,000+ stars, and the workflow that justifies it is the one we run daily: the AI reviews a PR, identifies issues, and files linked GitHub issues for follow-ups inside a single conversation. The context-preservation alone (no copy-pasting issue descriptions between tools) is the payoff.
The detail that matters for security: the remote OAuth server launched in early 2026 eliminated personal-access-token management, which was our largest security concern with the older local-PAT version. Use the remote OAuth transport, not the local PAT one, unless you have a specific offline reason.
Choose GitHub over a generic Git server when your repo lives on GitHub, which is almost always. No condition flips this for hosted GitHub users; it is the closest thing to a non-negotiable in the ecosystem.
2. Cloudflare MCP: Code Mode is the most important idea here
Cloudflare ranks second on one piece of engineering that matters beyond Cloudflare itself. Describing all 2,500+ Cloudflare API endpoints as MCP tools would cost roughly 1.17 million tokens of schema. Code Mode exposes them to the agent in approximately 1,000 tokens by letting the agent write code against the API surface instead of receiving every endpoint as a tool definition. That is a 99.9% reduction, and it is the prototype answer to the exact token tax this entire ranking is built around. The server covers Workers deploy, KV, R2, DNS.
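The reduction is easy to check from the three figures quoted above. This is plain arithmetic on those numbers, not a measurement of our own.

```python
ENDPOINTS = 2_500               # "2,500+" endpoints, taken at the floor
FULL_SCHEMA_TOKENS = 1_170_000  # every endpoint exposed as a tool definition
CODE_MODE_TOKENS = 1_000        # approximate Code Mode footprint
CONTEXT_WINDOW = 200_000

tokens_per_endpoint = FULL_SCHEMA_TOKENS / ENDPOINTS
reduction = 1 - CODE_MODE_TOKENS / FULL_SCHEMA_TOKENS
windows_needed = FULL_SCHEMA_TOKENS / CONTEXT_WINDOW

print(f"{tokens_per_endpoint:.0f} tokens per endpoint as a tool schema")
print(f"{reduction:.2%} reduction with Code Mode")
print(f"full schema would fill {windows_needed:.2f} context windows")
```

The last line is the part worth noticing: the full tool list would not fit in a 200K window at all, so the comparison is less "cheaper" than "possible versus impossible."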
The other reason to read Cloudflare's work even if you do not use Cloudflare: their published enterprise reference architectures (centralized governance, SSO via Cloudflare Access, cost control) are the best public thinking on deploying MCP at scale.
Choose Cloudflare MCP when you deploy on Cloudflare, and read Code Mode regardless. The pattern flips your defaults: any high-endpoint-count server should be evaluated against "does it do something Code Mode-shaped, or does it dump 200 tools into my context."
3. Supabase MCP: read-only routing is the feature
Supabase tops the database tier not on tool count but on the safety mechanism. Read-only mode does not just hide write tools; it routes every query through an actual read-only Postgres user, so the guarantee is enforced at the database, not the prompt. Remote hosted at https://mcp.supabase.com/mcp, OAuth (no PATs), 20+ tools, project scoping.
We have run it across three production projects since February 2026. Setup is genuinely under five minutes, and we treat read-only scoping as a hard precondition before any AI touches production data. The migration tooling cut our schema-change workflow from ~20 minutes of context-switching to one conversation. Known limit: complex multi-statement transactions sometimes need manual intervention.
Choose Supabase over self-managed Postgres MCP when you want enforced read-only and OAuth instead of owning connection-string security yourself. That flips when you run self-hosted Postgres and are not on Supabase; then the official Postgres reference server below is correct, with the security burden explicitly on you.
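The difference between "write tools hidden" and "writes rejected by the database" can be illustrated with a stand-in. The sketch below uses SQLite's read-only connection mode from Python's standard library as an analogy; it is not Supabase's implementation, and the `orders` table is hypothetical.

```python
import os
import sqlite3
import tempfile


def write_is_rejected(db_path: str) -> bool:
    """Return True if a write over a read-only connection fails in the database."""
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        conn.execute("INSERT INTO orders (total) VALUES (1)")
        return False
    except sqlite3.OperationalError:
        # The database itself refuses the write; no prompt rule is involved.
        return True
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")

    # Seed the database over a normal read-write connection.
    rw = sqlite3.connect(path)
    rw.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total INTEGER)")
    rw.execute("INSERT INTO orders (total) VALUES (42)")
    rw.commit()
    rw.close()

    rejected = write_is_rejected(path)

    # Reads still work over the read-only connection.
    ro = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    rows = ro.execute("SELECT total FROM orders").fetchall()
    ro.close()

print(rejected, rows)
```

With Postgres the equivalent is a role that has only `SELECT` privileges: whatever SQL the model generates, the write fails at the server.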
4. Sentry MCP: a diagnosed root cause in one conversation
Sentry's tool selection is deliberately tuned for human-in-the-loop debugging, and that focus is why it ranks above broader servers. The workflow that earns it: point Claude at a Sentry issue, and it pulls the stack trace, examines the relevant code, checks recent commits, and returns a root-cause diagnosis with a suggested fix. Not always right. Right often enough that it has been our first move on every production error since January 2026. The composed flow (query Sentry, search docs, file a formatted issue with repro steps) replaces what was 15 minutes across four tools.
Choose Sentry MCP over manual triage when you run Sentry and debug production regularly. It does not meaningfully flip for Sentry users; the cost is low and the workflow density is high.
5. Filesystem MCP: the correct first install
Filesystem is the official reference server and the right first install for one reason: it is the highest-utility-per-token server in the ecosystem and its security model is explicit. You name the directories; nothing else on the machine is reachable.
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/Users/yourname/projects",
        "/Users/yourname/documents"
      ]
    }
  }
}
Our standing rule: scope it to project directories only. Never home, never system. Treat directory grants like database grants, least privilege.
Choose Filesystem first over any other server when onboarding to MCP. There is no condition under which a different server is the better first install.
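The least-privilege rule is simple to express in code. The following is a minimal sketch of a scoped-directory check, assuming Python 3.9+; it is an illustration of the idea, not the Filesystem server's actual implementation, and the paths are placeholders.

```python
from pathlib import Path


def is_allowed(requested: str, allowed_dirs: list[str]) -> bool:
    """True only if the resolved path sits inside one of the granted directories."""
    target = Path(requested).resolve()  # normalizes ".." and follows symlinks
    return any(
        target.is_relative_to(Path(root).resolve())
        for root in allowed_dirs
    )


ALLOWED = ["/Users/yourname/projects"]

inside = is_allowed("/Users/yourname/projects/app/main.py", ALLOWED)
escape = is_allowed("/Users/yourname/projects/../.ssh/id_rsa", ALLOWED)
lookalike = is_allowed("/Users/yourname/projects-old/notes.md", ALLOWED)

print(inside, escape, lookalike)
```

Two details matter: the path is resolved before the check, so `..` cannot climb out of the grant, and the comparison is per path component, so a sibling directory that merely shares a string prefix is not treated as inside.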
6. Tavily MCP: a research pipeline in one server
Tavily ranks here because it collapses what would otherwise be three servers' worth of capability (search, extraction, crawl, site-map) into one, which is the token-efficient way to get a research pipeline. We compared all four major search servers head-to-head; Tavily returned the most useful results on technical queries and its output is structured for AI consumption, so it rarely needs post-processing. Free tier is enough to evaluate; regular use wants a paid key.
Choose Tavily over chaining Brave + Firecrawl + Exa when you want one server doing the whole research loop. That flips for two specific jobs: deep batch extraction (Firecrawl is the specialist, cleaner Markdown) and semantic "find conceptually similar" search (Exa, genuinely different from keyword search). Pair, do not stack: Tavily to find, Firecrawl to extract.
7. Neon MCP: branch-per-change makes a bad migration a non-event
Neon ranks just behind Supabase on a single mechanism that maps perfectly to AI-driven schema work: branch-based databases. The AI creates a branch, runs the migration, validates; you merge manually or delete the branch. A bad AI schema change is a discarded branch, not an incident. ~20 tools, runtime scoping headers (X-Neon-Read-Only, X-Neon-Scopes, X-Neon-Project-Id) that filter tools per auth grant.
Choose Neon over Supabase when you specifically want AI to experiment with schema changes and need branch-level isolation as the safety net. That flips when Supabase is already your daily driver; run Supabase as the primary and reach for Neon's model only for high-risk schema work.
8. Memory MCP: the only fix for cross-session amnesia
Memory ranks in the core tier despite being unglamorous because it solves a problem nothing else does: models forget everything between conversations. It maintains a local JSON knowledge graph (entities, observations, typed relationships) the assistant queries to resume context. We use it to track decisions, constraints, and stakeholder preferences per project; it measurably reduced the cold-start problem in new conversations.
The candid limit: the graph gets noisy without curation, and an uncurated graph spends tokens returning junk. Treat it as a maintained artifact, not a dump.
Choose Memory when you have long-running projects where re-explaining context every session is the real cost. That flips for short, self-contained tasks, where the graph is overhead with no payoff.
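For readers who want a mental model of the graph, here is a minimal sketch of the three pieces named above (entities, observations, typed relationships) as plain Python dataclasses serialized to JSON. Class names, field names, and the sample data are illustrative assumptions, not the Memory server's on-disk format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Entity:
    name: str
    entity_type: str
    observations: list[str] = field(default_factory=list)


@dataclass
class Relation:
    source: str
    target: str
    relation_type: str


@dataclass
class KnowledgeGraph:
    entities: dict[str, Entity] = field(default_factory=dict)
    relations: list[Relation] = field(default_factory=list)

    def observe(self, name: str, entity_type: str, observation: str) -> None:
        """Attach an observation to an entity, skipping exact duplicates."""
        entity = self.entities.setdefault(name, Entity(name, entity_type))
        if observation not in entity.observations:
            entity.observations.append(observation)

    def relate(self, source: str, target: str, relation_type: str) -> None:
        """Record a typed relationship once."""
        relation = Relation(source, target, relation_type)
        if relation not in self.relations:
            self.relations.append(relation)

    def context_for(self, name: str) -> dict:
        """Everything known about one entity: what a new session would load."""
        entity = self.entities.get(name)
        return {
            "observations": list(entity.observations) if entity else [],
            "relations": [
                (r.source, r.relation_type, r.target)
                for r in self.relations
                if name in (r.source, r.target)
            ],
        }

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


graph = KnowledgeGraph()
graph.observe("billing-service", "project", "Schema changes require a reviewed migration")
graph.observe("billing-service", "project", "Schema changes require a reviewed migration")
graph.observe("dana", "stakeholder", "Prefers weekly summaries")
graph.relate("dana", "billing-service", "owns")

context = graph.context_for("billing-service")
print(json.dumps(context, indent=2))
```

The duplicate check in `observe` is the smallest possible form of curation; without something like it, the graph grows with every session and the retrieval cost grows with it.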
Situational tier: install only when the workflow demands it
These work and we use several, but none belongs in a default stack. The token tax is the reason the bar is "specific workflow need," not "looks useful."
- Postgres (official reference), local stdio. The right pick only if you run self-hosted Postgres and are not on Supabase or Neon. Connection-string access, no OAuth, no scoping. The security burden is entirely yours; always create a dedicated read-only role.
npx -y @modelcontextprotocol/server-postgres postgresql://user:pass@host/db
- SQLite (official), local stdio. Best for prototyping and quick CSV-into-temp-DB analysis.
npx -y @modelcontextprotocol/server-sqlite /path/to/db.db
- Linear / Jira, remote. Add when issue tracking is in your daily AI loop. Linear's data model maps to how engineers think; Jira's setup needs Atlassian admin for OAuth and is the only friction point.
- Notion / Slack, remote. High value if Notion is your team wiki ("check the rate-limit policy in our API docs" works). Slack is best for search and context-gathering, not posting. Both have an admin-OAuth setup step; Notion can return heavy context on large workspaces, so watch token usage.
- Stripe, local API key. For SaaS billing analysis. Use a restricted read-only key. This is a requirement, not advice; you do not want an agent issuing refunds.
- Puppeteer, local stdio. Browser automation and visual testing; the screenshot capability lets the model judge layout. Needs local Chromium.
- Docker, local stdio. Spin up a Postgres container, run a suite, tear it down, inside one AI session. Quietly saves context-switching.
- Firecrawl / Exa / Brave Search, local stdio. The search specialists Tavily defers to: Firecrawl for batch extraction (cleanest Markdown, 5,798+ stars), Exa for semantic similarity, Brave for an independent non-Google index where data agreements restrict APIs.
- Sequential Thinking, local stdio. Structures multi-step reasoning for hard decisions. It does not make the model smarter; it makes it methodical. Worth it for multi-factor architecture and debugging calls.
- AWS (community), local stdio. No first-party general-purpose server as of April 2026. Use service-specific community servers (S3, Lambda) and review their source before connecting real credentials. An official server is likely; you are on community maintenance until then.
Skipped from prior versions of this directory: MySQL (use bytebase/dbhub for a token-efficient multi-engine interface instead) and the Everything reference server (a protocol demo, not a production tool).
{
  "mcpServers": {
    "brave-search": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-brave-search"],
      "env": { "BRAVE_API_KEY": "<YOUR_BRAVE_API_KEY>" }
    }
  }
}
Tested scenario: the token tax, measured
This is the test behind the thesis. We measured context consumed at session start with increasing server counts on Claude (200K context), Claude Code client, April 2026.
Input: clean conversation, no user message yet.
Command path: add servers one at a time, record tool-schema tokens reported before first turn.
Expected output shape:
1 server (Filesystem, ~11 tools) ~3,200 tok
3 servers (+ GitHub, + Supabase) ~12,800 tok
5 servers (+ Sentry, + Tavily) ~18,600 tok
8 servers (+ Neon, Memory, Docker) ~31,000 tok before message 1
At eight servers you spend ~31K tokens, roughly 15% of a 200K window, before asking anything, and tool-selection accuracy visibly degrades as near-duplicate tools accumulate (three database servers offering overlapping query tools is the worst offender). This is the entire argument for the 3-to-6 ceiling.
Tested 2026-04 on Claude Code client / Claude (200K context). Token figures are client-reported schema cost at session start, rounded.
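To read the table as shares of the window and as marginal cost per added server, the figures above can be run through a short script. The inputs are exactly the four rounded measurements listed; nothing else is assumed.

```python
CONTEXT_WINDOW = 200_000

# (server count, schema tokens at session start) from the test above
measurements = [(1, 3_200), (3, 12_800), (5, 18_600), (8, 31_000)]

rows = []
prev_count, prev_tokens = 0, 0
for count, tokens in measurements:
    share_pct = tokens * 100 / CONTEXT_WINDOW
    # Average cost of each server added since the previous measurement.
    marginal = (tokens - prev_tokens) / (count - prev_count)
    rows.append((count, round(share_pct, 1), round(marginal)))
    prev_count, prev_tokens = count, tokens

for count, share_pct, marginal in rows:
    print(f"{count} servers: {share_pct}% of window, ~{marginal} tokens per added server")
```

The marginal column is uneven because servers differ in tool count; GitHub and Supabase land in the same step, which is why that step is the steepest.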
Install the recommended stack
The three-server core, copy-paste runnable. GitHub via remote OAuth (no token to manage), Filesystem scoped, one database server:
{
  "mcpServers": {
    "github": {
      "url": "https://api.githubcopilot.com/mcp/",
      "type": "http"
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "<YOUR_PROJECT_DIR>"]
    },
    "supabase": {
      "url": "https://mcp.supabase.com/mcp",
      "type": "http"
    }
  }
}
# verify a local server resolves before adding it to the client;
# it should start and wait on stdio (press Ctrl+C to exit)
npx -y @modelcontextprotocol/server-filesystem <YOUR_PROJECT_DIR>
Tested 2026-04 on Claude Desktop and Cursor / macOS 14.6. Remote servers complete OAuth in-client on first tool call.
Client compatibility (the rows that change a decision)
| Client | Local stdio | Remote OAuth | The one thing to know |
|---|---|---|---|
| Claude Desktop | Yes | Yes | Broadest support; JSON config |
| Claude Code | Yes | Yes | Deepest dev integration; CLI or JSON |
| Cursor | Yes | Yes | Best IDE MCP UX; UI, file, or deep link |
| VS Code (Copilot) | Yes | Yes | Native since v1.99 |
| ChatGPT Desktop | No | Yes | Remote-only, OAuth 2.1 required |
The decisive fact: ChatGPT Desktop cannot run local stdio servers (added Connectors September 2025, OAuth 2.1 only). If ChatGPT is your client, your entire stack must be remote: GitHub, Supabase, Neon, Sentry, Linear, Slack, Cloudflare. Every other major client takes both transports, and a config that works in Claude Desktop typically works in Cursor and VS Code unchanged.
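A quick way to audit a stack against a remote-only client is to split the config by transport. The sketch below assumes the `mcpServers` shape used in the install section of this page, where remote entries carry a `url` and local stdio entries carry a `command`; other clients may use different keys, so treat the detection rule as an assumption.

```python
import json

# Same shape as the recommended three-server config on this page.
CONFIG = """
{
  "mcpServers": {
    "github": {"url": "https://api.githubcopilot.com/mcp/", "type": "http"},
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "<YOUR_PROJECT_DIR>"]
    },
    "supabase": {"url": "https://mcp.supabase.com/mcp", "type": "http"}
  }
}
"""


def split_by_transport(config: dict) -> tuple[list[str], list[str]]:
    """Return (remote, local) server names based on which key each entry uses."""
    remote, local = [], []
    for name, entry in config.get("mcpServers", {}).items():
        if "url" in entry:
            remote.append(name)
        elif "command" in entry:
            local.append(name)
        else:
            raise ValueError(f"{name}: neither 'url' nor 'command' present")
    return remote, local


remote, local = split_by_transport(json.loads(CONFIG))
print("usable from a remote-only client:", remote)
print("needs a client with local stdio:", local)
```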
What to actually do
Install three servers: GitHub (remote OAuth), Filesystem (scoped to project dirs), and one database server matching your stack (Supabase if managed Postgres, Neon if you want branch-isolated AI schema work, the official Postgres server only if self-hosted). That covers the 80/20 of real developer workflows and stays well under the token ceiling.
Expand by workflow, never by curiosity: Sentry plus Sequential Thinking for a debugging-heavy week, Tavily plus Firecrawl for research, Memory for a genuinely long-running project. The discipline is the recommendation. The token-tax test above is why "install more servers" is the most common mistake in this ecosystem, and capability-per-token is why the order on this page is not the order the star-count directories give you.
FAQ
Are MCP servers safe?
As safe as any software you grant permissions to. The protocol provides scoped access, read-only modes, and OAuth for remote servers; the risk is what you connect and how. Read-only for production data, scope to specific projects or directories, review community-server source before installing, never put write-capable production keys in a config.
Do they work with ChatGPT?
Yes, but remote-only. ChatGPT Desktop supports OAuth 2.1 remote servers via Connectors and cannot run local stdio servers, which is most of the ecosystem. ChatGPT users should plan a fully remote stack.
How many servers should I run?
Three to six. The measured token tax (~31K tokens of schema at eight servers, before your first message) plus degrading tool-selection accuracy is the hard reason, not a style preference. Start at three, expand only on a specific workflow need.
Are they free?
Most servers are open-source and free to run; many need API keys for the underlying service (Tavily, Brave, Sentry, Stripe), which have their own pricing. The protocol itself is open-source under the Linux Foundation.
Will MCP replace APIs?
No. MCP is a standardized AI-friendly layer on top of APIs, not a replacement. REST and GraphQL keep doing their jobs; MCP makes them consistently reachable by models. Cloudflare's Code Mode is the clearest evidence of the relationship: it works by letting the agent write code against the underlying API, not by replacing it.
Where the ecosystem is going
Three changes worth tracking, all of which affect the token-tax calculus this directory is built on. SEP-1442 moves MCP from stateful sessions toward stateless requests, which unlocks horizontal HTTP scaling and makes remote servers faster behind load balancers. Event-driven triggers would let servers push to agents (Sentry alerting your assistant the moment an error lands) instead of the current pull-only model. A proposed Skills primitive would compose multiple tools into one higher-level capability, which is the right structural answer to schema bloat: a "deploy to staging" skill instead of GitHub plus Docker plus Cloudflare tools all resident at once.
The ecosystem grew from a few hundred servers in early 2025 to over 12,000 by April 2026 across Smithery, Glama, PulseMCP, and the official registry, with 97 million installs. Quantity is no longer the question. Capability-per-token is, which is exactly why this directory ranks the way it does.
Have a server we should review? Contact us or tag @PonderoAI. Next scheduled update: July 2026.