Mastra vs CrewAI vs LangGraph in May 2026: Picking a TypeScript-First Agent Framework Without Rewriting Your Stack

The deciding axis for a TypeScript agent stack is not features. It is whether the framework treats TypeScript as a first language or a port, because that one fact propagates into tool type-safety, deploy targets, and how much glue you write for streaming. On that axis Mastra wins for greenfield TypeScript work, and it is not close. The call only flips on two specific conditions, both about requirements Mastra does not yet own: enterprise governance (CrewAI Python plus AMP) and complex stateful workflows with approval gates (LangGraph TS). If neither applies, building on anything other than Mastra means paying a port tax for no return.

Six months ago this came out differently. Mastra was pre-1.0, LangGraph's TS port felt bolted onto a Python graph engine, CrewAI had no credible JS story, and none of them had a clean answer for streaming HTTP mid-run. That window has closed. Mastra hit v1.33.0 on May 13, 2026, passed 23.9k GitHub stars, and crossed 300k weekly downloads at its v1.0 launch in January. LangGraph's @langchain/langgraph added real HITL checkpoint support. CrewAI shipped Enterprise AMP with dashboards, RBAC, and audit logs. Meanwhile, the comparisons on Stack Overflow and Reddit are either LangGraph-centric Python posts or stale pre-1.0 benchmarks.

We built the same agent three times and measured install-to-first-run, lines of code, tool ergonomics, the observability story you actually ship with, and the escape hatch you reach for when the framework stops covering you.

The agent we built three times

Same spec in every framework. A research summarizer with three tools:

Web search via Tavily API. Takes a query string, returns the top 5 results with URLs and snippets.
Document ingest. Accepts a PDF path, extracts text, returns a plain string.
Output formatter. Takes the raw summary, returns structured JSON with title, bullets, and sources.

The agent takes a topic as input, calls web search, optionally reads an uploaded document for context, then formats and returns a structured summary. No streaming for the baseline (we cover that in the escape hatch section). The full tool surface is identical across all three implementations so the line counts are an honest comparison.
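
To make the shared tool surface concrete, here is a minimal sketch of the data shapes the three tools exchange, plus one possible output formatter. The interface names, the formatSummary signature, and the one-bullet-per-line rule are assumptions for illustration; the spec above only fixes the fields (URLs and snippets in; title, bullets, and sources out).

```typescript
// Hypothetical shapes for the shared tool surface; names are illustrative.
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
}

interface StructuredSummary {
  title: string;
  bullets: string[];
  sources: string[];
}

// One possible formatter: one bullet per non-empty line of the raw summary,
// and a de-duplicated source list taken from the search results.
function formatSummary(
  topic: string,
  rawSummary: string,
  results: SearchResult[]
): StructuredSummary {
  const bullets = rawSummary
    .split(/\r?\n/)
    .map((line) => line.replace(/^\s*[-*]\s*/, "").trim())
    .filter((line) => line.length > 0);
  const sources = Array.from(new Set(results.map((r) => r.url)));
  return { title: topic, bullets, sources };
}
```

Keeping these shapes identical across the three builds is what lets the later line counts be compared like for like.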

All three rely on TAVILY_API_KEY and OPENAI_API_KEY environment variables. Install scripts below are copy-paste runnable; swap in your keys where you see <PLACEHOLDER>.

Mastra: TypeScript-native, running in 4 minutes

Only one framework here is purpose-built for TypeScript, and it is Mastra. The mechanism that matters: the tool inputSchema and outputSchema are Zod objects, and Zod's z.infer gives you the static types for free. So the execute function's context argument is typed from the same schema the model sees at call time. One declaration drives runtime validation, the LLM tool spec, and compile-time types. Change the schema and the type error surfaces in your editor before the agent ever runs. In a port (LangGraph's tool() and the unofficial CrewAI JS Tool) the schema and the handler signature are declared separately, so a drift between them is a runtime failure, not a red squiggle. That single design choice is most of why Mastra needs fewer lines and fewer integration tests for the same agent.
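
The single-declaration mechanism can be sketched without any library. The block below is a toy stand-in written only to show how one schema object can drive both runtime validation and the handler's static type. It is not Zod and not Mastra's createTool; Schema, Infer, str, num, object, and defineTool are all hypothetical names.

```typescript
// A tiny stand-in for a schema library, written only to show the mechanism.
// It is NOT Zod and NOT Mastra's createTool; every name here is hypothetical.
type Schema<T> = { parse(input: unknown): T };
type Infer<S> = S extends Schema<infer T> ? T : never;

const str = (): Schema<string> => ({
  parse(input) {
    if (typeof input !== "string") throw new TypeError("expected a string");
    return input;
  },
});

const num = (): Schema<number> => ({
  parse(input) {
    if (typeof input !== "number") throw new TypeError("expected a number");
    return input;
  },
});

// Builds an object schema whose static type is derived from its field schemas.
function object<S extends Record<string, Schema<unknown>>>(
  shape: S
): Schema<{ [K in keyof S]: Infer<S[K]> }> {
  return {
    parse(input) {
      if (typeof input !== "object" || input === null) {
        throw new TypeError("expected an object");
      }
      const record = input as Record<string, unknown>;
      const out: Record<string, unknown> = {};
      for (const key in shape) {
        out[key] = shape[key].parse(record[key]);
      }
      return out as { [K in keyof S]: Infer<S[K]> };
    },
  };
}

// The handler's parameter type is inferred from inputSchema, so the schema
// is the only place the input shape is declared.
function defineTool<I, O>(config: {
  id: string;
  inputSchema: Schema<I>;
  execute: (input: I) => O;
}) {
  return {
    id: config.id,
    // Validate untyped model output at runtime, then hand typed data to execute.
    call: (raw: unknown): O => config.execute(config.inputSchema.parse(raw)),
  };
}

const searchTool = defineTool({
  id: "web-search",
  inputSchema: object({ query: str(), maxResults: num() }),
  execute: (input) => `${input.query} (top ${input.maxResults})`,
});
```

Rename query in the schema and input.query stops compiling; pass a non-string at runtime and call throws before execute runs. That pairing is the property the paragraph above attributes to schema-first tool definitions.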

Install

npm create mastra@latest research-agent
cd research-agent
npm install @ai-sdk/openai zod tavily-js pdf-parse

Set your keys:

# .env
OPENAI_API_KEY=<PLACEHOLDER>
TAVILY_API_KEY=<PLACEHOLDER>

From npm create to a running dev server: roughly 4 minutes on a clean machine with a warm npm cache.

Tool definition

Mastra tools use Zod for schema validation. The type safety flows end-to-end from tool definition through agent invocation.

// src/tools/webSearch.ts
import { createTool } from "@mastra/core/tools";
import { z } from "zod";
import { TavilyClient } from "tavily-js";

const tavily = new TavilyClient({ apiKey: process.env.TAVILY_API_KEY! });

export const webSearchTool = createTool({
  id: "web-search",
  description: "Search the web for a given query and return top results.",
  inputSchema: z.object({
    query: z.string().describe("The search query"),
    maxResults: z.number().default(5),
  }),
  outputSchema: z.object({
    results: z.array(
      z.object({
        title: z.string(),
        url: z.string(),
        snippet: z.string(),
      })
    ),
  }),
  execute: async ({ context }) => {
    const response = await tavily.search(context.query, {
      max_results: context.maxResults,
    });
    return {
      results: response.results.map((r) => ({
        title: r.title,
        url: r.url,
        snippet: r.content,
      })),
    };
  },
});

The agent itself is an instance of Mastra's Agent class:

// src/agents/researcher.ts
import { Agent } from "@mastra/core/agent";
import { openai } from "@ai-sdk/openai";
import { webSearchTool } from "../tools/webSearch";
import { documentIngestTool } from "../tools/documentIngest";
import { outputFormatterTool } from "../tools/outputFormatter";

export const researchAgent = new Agent({
  name: "Research Summarizer",
  instructions: `You are a research assistant. Given a topic, search the web,
    read any provided documents, then return a structured JSON summary
    with title, bullets, and sources.`,
  model: openai("gpt-4o"),
  tools: { webSearchTool, documentIngestTool, outputFormatterTool },
});

That's the agent. No graph wiring, no YAML, no separate config files.

Observability

Mastra ships OpenTelemetry traces out of the box. Point it at your OTLP collector, and every tool call, agent step, and token count shows up in Grafana or Honeycomb without additional instrumentation code.
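
For a sense of what "without additional instrumentation code" saves, this is roughly the kind of wrapper you would otherwise hand-write around each tool call. It is a sketch under stated assumptions: SpanRecord and InMemoryTracer are hypothetical, and the code is neither Mastra's tracing implementation nor the OpenTelemetry API. It only shows the shape of the data a span carries.

```typescript
// Illustrative only: neither Mastra's tracing code nor the OpenTelemetry API.
interface SpanRecord {
  name: string;
  status: "ok" | "error";
  durationMs: number;
  attributes: Record<string, string | number>;
}

class InMemoryTracer {
  readonly spans: SpanRecord[] = [];

  // Wraps one unit of work (a tool call or agent step) and records a span,
  // whether the work resolves or throws.
  async trace<T>(
    name: string,
    attributes: Record<string, string | number>,
    work: () => Promise<T>
  ): Promise<T> {
    const startedAt = Date.now();
    try {
      const result = await work();
      this.spans.push({ name, status: "ok", durationMs: Date.now() - startedAt, attributes });
      return result;
    } catch (error) {
      this.spans.push({ name, status: "error", durationMs: Date.now() - startedAt, attributes });
      throw error;
    }
  }
}
```

A real exporter would batch these records and ship them to a collector; the point is only that every tool call needs this bracket somewhere, and a framework that owns the call site can add it once.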

Deploy

Three first-class options as of v1.33.0: Vercel Edge Functions, Cloudflare Workers, or a standard Docker container. The mastra build command outputs a Hono HTTP server, so it drops into any Node-compatible runtime.

Total line count for the research agent (three tools + agent definition + index): 91 lines.

CrewAI: Python first, enterprise second, TypeScript... not yet

CrewAI's Python framework is mature. The March 2026 Enterprise AMP release added dashboards, RBAC at the agent and task level, streaming logs, and full audit trails. For Python teams running agents at scale, that's a real edge. Nothing else in this comparison ships that governance layer out of the box.

For TypeScript teams the picture is worse. There is no official CrewAI JavaScript SDK. The crewai-js packages on npm are community efforts, not maintained by crewAIInc. The most complete unofficial port, crewai-ts, is actively developed but has not reached feature parity. No Zod-native tool definitions. No streaming. No checkpoint API.

We can show what the unofficial JS SDK setup looks like, but calling it production-ready would misrepresent the state of things.

Install (unofficial crewai-js path)

# This is the unofficial community SDK, not an official CrewAI product.
npm install crewai-js

# .env
OPENAI_API_KEY=<PLACEHOLDER>
TAVILY_API_KEY=<PLACEHOLDER>

Tool definition

import { Tool } from "crewai-js";

const webSearchTool = new Tool({
  name: "web_search",
  description: "Search the web for a given query.",
  func: async ({ query }: { query: string }) => {
    // Placeholder: call Tavily with `query` here and collect its results.
    const results: unknown[] = [];
    return JSON.stringify(results);
  },
});

The tooling is thinner than Mastra's. No Zod schemas, no output type validation, no built-in OTLP traces. What you get is the crew/agent/task abstraction ported to JavaScript with enough coverage for simple pipelines.

Line count for the equivalent research agent in the unofficial JS SDK: approximately 110 lines, though a meaningful chunk of that is error-handling scaffolding the SDK doesn't provide natively.
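
As an illustration of that scaffolding, here is a sketch of the output guard you end up writing when a tool hands back an unvalidated JSON string. The SearchResult shape follows the spec at the top of this article; isSearchResult and parseSearchResults are hypothetical helpers, not part of any CrewAI package.

```typescript
interface SearchResult {
  title: string;
  url: string;
  snippet: string;
}

// Hand-written type guard standing in for the schema validation the SDK lacks.
function isSearchResult(value: unknown): value is SearchResult {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record.title === "string" &&
    typeof record.url === "string" &&
    typeof record.snippet === "string"
  );
}

// Parses a tool's string output and fails loudly if the shape has drifted.
function parseSearchResults(raw: string): SearchResult[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new Error("Tool returned invalid JSON");
  }
  if (!Array.isArray(parsed) || !parsed.every(isSearchResult)) {
    throw new Error("Tool output does not match the expected result shape");
  }
  return parsed;
}
```

With a schema-first tool definition this guard is generated from the schema; here it is code you own and have to keep in sync by hand.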

Install to first agent run: roughly 5 minutes, but more of that time is reading the community README to understand which features actually work.

Python teams with enterprise governance needs should look at the Python SDK plus the AMP suite seriously. TypeScript teams should wait for an official SDK, or pick a different framework now.

LangGraph TypeScript: the graph that grew up

LangGraph's TypeScript package (@langchain/langgraph) has been around since 2024. The v0.4 release (April 2026) is when it became a viable production choice. The headline addition: real HITL checkpoint support. Interrupt graph execution at any node boundary, persist the state to PostgreSQL or Redis, resume from the checkpoint after a human review step.

The tradeoff is verbosity, and it is structural, not stylistic. LangGraph makes you declare the agent's state shape (Annotation.Root), every node as a function over that state, and every edge between nodes by hand. Mastra infers control flow from the agent's tool set and the model's decisions; LangGraph requires you to spell it out. That is the 61-line gap on this agent. The line count is the price of a property that is hard to retrofit onto an inferred-flow framework: because the graph is explicit, you can interrupt it at any named node boundary, persist the entire typed state to Postgres, and resume it after a human acts. An inferred control flow has no node boundary to name, so there is no equally clean place to checkpoint. The verbosity is the checkpoint API's enabling condition, not a tax beside it.
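
The link between named nodes and checkpointing can be shown with a toy engine. This is a sketch under stated assumptions, not LangGraph's implementation: LinearGraph, Checkpoint, and the node names are hypothetical, the graph is a straight line rather than a general graph, and the checkpoint is plain JSON rather than a Postgres row.

```typescript
type ResearchState = { topic: string; draft: string | null; summary: string | null };
type NodeFn = (state: ResearchState) => Partial<ResearchState>;
// A checkpoint is just "which named node comes next" plus the full state.
type Checkpoint = { pausedBefore: string | null; state: ResearchState };

class LinearGraph {
  private readonly nodes: Array<{ name: string; fn: NodeFn }> = [];
  private readonly interruptBefore: string[];

  constructor(interruptBefore: string[] = []) {
    this.interruptBefore = interruptBefore;
  }

  addNode(name: string, fn: NodeFn): this {
    this.nodes.push({ name, fn });
    return this;
  }

  start(state: ResearchState): Checkpoint {
    return this.runFrom(0, state, false);
  }

  // Resumes at the node the checkpoint names, without pausing on it again.
  resume(checkpoint: Checkpoint): Checkpoint {
    if (checkpoint.pausedBefore === null) return checkpoint; // already finished
    const index = this.nodes.findIndex((n) => n.name === checkpoint.pausedBefore);
    if (index === -1) throw new Error(`Unknown node: ${checkpoint.pausedBefore}`);
    return this.runFrom(index, checkpoint.state, true);
  }

  private runFrom(index: number, initial: ResearchState, resuming: boolean): Checkpoint {
    let state = initial;
    for (const [offset, node] of this.nodes.slice(index).entries()) {
      const isResumePoint = resuming && offset === 0;
      if (this.interruptBefore.includes(node.name) && !isResumePoint) {
        return { pausedBefore: node.name, state };
      }
      state = { ...state, ...node.fn(state) };
    }
    return { pausedBefore: null, state };
  }
}

const toyGraph = new LinearGraph(["format"])
  .addNode("search", (s) => ({ draft: `notes on ${s.topic}` }))
  .addNode("format", (s) => ({ summary: (s.draft ?? "").toUpperCase() }));

// Run until the gate, serialize, let a human edit, then resume.
const paused = toyGraph.start({ topic: "agents", draft: null, summary: null });
const restored: Checkpoint = JSON.parse(JSON.stringify(paused));
restored.state.draft = "reviewed notes";
const finished = toyGraph.resume(restored);
```

The checkpoint only works because "format" is a name the engine can store and look up again. An engine that decides its next step inside a model call has nothing comparable to write into pausedBefore.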

Install

npm install @langchain/langgraph @langchain/openai @langchain/core zod
npm install tavily-js pdf-parse @langchain/langgraph-checkpoint-postgres

# .env
OPENAI_API_KEY=<PLACEHOLDER>
TAVILY_API_KEY=<PLACEHOLDER>

Install to first working agent run: roughly 8 minutes. The config overhead is real.

Graph definition

// src/graph/researchGraph.ts
import { StateGraph, Annotation } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";
import { tool } from "@langchain/core/tools";
import { z } from "zod";

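// State definition. Channels with a default also declare a reducer, which
// says how a node's update merges into the existing value.
const ResearchState = Annotation.Root({
  topic: Annotation<string>(),
  searchResults: Annotation<string[]>({ reducer: (a, b) => a.concat(b), default: () => [] }),
  documentText: Annotation<string | null>({ reducer: (_prev, next) => next, default: () => null }),
  summary: Annotation<object | null>({ reducer: (_prev, next) => next, default: () => null }),
  messages: Annotation<any[]>({ reducer: (a, b) => a.concat(b), default: () => [] }),
});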

const model = new ChatOpenAI({ model: "gpt-4o" });

// Tool definitions
const webSearchTool = tool(
  async ({ query }: { query: string }) => {
    // Placeholder: the Tavily call goes here and fills `results` from `query`.
    const results: unknown[] = [];
    return JSON.stringify(results);
  },
  {
    name: "web_search",
    description: "Search the web for a query.",
    schema: z.object({ query: z.string() }),
  }
);

// Node definitions
async function searchNode(state: typeof ResearchState.State) {
  const response = await model
    .bindTools([webSearchTool])
    .invoke(state.messages);
  return { messages: [response] };
}

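async function formatNode(state: typeof ResearchState.State) {
  // Placeholder formatter: a real node would ask the model for title, bullets, and sources.
  const formattedResult = { title: state.topic, bullets: state.searchResults, sources: [] as string[] };
  return { summary: formattedResult };
}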

// Graph assembly
const graph = new StateGraph(ResearchState)
  .addNode("search", searchNode)
  .addNode("format", formatNode)
  .addEdge("__start__", "search")
  .addEdge("search", "format")
  .addEdge("format", "__end__");

HITL checkpoint config

import { MemorySaver } from "@langchain/langgraph";
import { PostgresSaver } from "@langchain/langgraph-checkpoint-postgres";

// Development
const devCheckpointer = new MemorySaver();

// Production
const prodCheckpointer = PostgresSaver.fromConnString(
  process.env.POSTGRES_CONNECTION_STRING!
);

const compiledGraph = graph.compile({
  checkpointer: prodCheckpointer,
  interruptBefore: ["format"], // pause here for human review
});

Total line count for the research agent: 152 lines across graph definition, node functions, and checkpoint config.

Observability

LangGraph integrates with LangSmith for tracing. You set LANGCHAIN_TRACING_V2=true and LANGCHAIN_API_KEY and every graph run shows up in the LangSmith dashboard with node-level timing and token counts. Third-party options (Langfuse, Phoenix) also work via OTLP. There's no built-in OTLP export the way Mastra ships it, but the ecosystem coverage is solid.

Side-by-side

Criterion	Mastra	CrewAI (JS, unofficial)	LangGraph TS
Language	TypeScript-native	TypeScript (unofficial SDK)	TypeScript (official port)
Lines of code (research agent)	91	~110	152
Install to first run	~4 min	~5 min	~8 min
Observability	OpenTelemetry built-in	None built-in	LangSmith + OTLP via third party
Deploy targets	Vercel Edge, Cloudflare Workers, Docker	Node.js only (no edge support)	Node.js, Docker
HITL support	Yes (native suspend/resume)	No	Yes (v0.4 checkpoint API)
Official TS SDK	Yes (1.0 GA)	No (community only)	Yes (official port)

The 61-line gap between Mastra and LangGraph isn't noise. For simple agents it's a real ergonomic difference. For complex stateful workflows, LangGraph's graph definition pays for itself in debuggability.

Where the escape hatch matters

Every framework has a ceiling. Here's where each one breaks and what you do about it.

Streaming HTTP responses mid-agent run. This is the scenario that surfaces the most pain. You're building an API endpoint and need to stream the agent's output back to the browser as it generates. Mastra handles this natively: the mastra build output is a Hono server, and you get streamText from the Vercel AI SDK with no extra plumbing. The response streams token by token to your client. LangGraph emits graph events as an async iterator; you proxy those over a ReadableStream in your HTTP layer, which works but requires about 30 lines of glue code. CrewAI JS has no clean answer here. The unofficial SDK doesn't expose streaming at the tool or agent output level.
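
The LangGraph-side glue can be sketched without the framework. The function below adapts any async iterable of events into a byte stream; toSseStream is a hypothetical helper, and the server-sent-events framing ("data: ...", blank line) is an assumed wire format rather than anything LangGraph prescribes. It relies on the ReadableStream and TextEncoder globals available in Node 18+ and edge runtimes.

```typescript
// Adapts an async iterable of events into a stream of SSE-framed bytes.
// Pull-based: the next event is only requested when the consumer is ready,
// so a slow client applies backpressure to the event source.
function toSseStream<T>(events: AsyncIterable<T>): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  const iterator = events[Symbol.asyncIterator]();

  return new ReadableStream<Uint8Array>({
    async pull(controller) {
      const { value, done } = await iterator.next();
      if (done) {
        controller.close();
        return;
      }
      controller.enqueue(encoder.encode(`data: ${JSON.stringify(value)}\n\n`));
    },
    // If the client disconnects, tell the event source to stop producing.
    async cancel() {
      await iterator.return?.();
    },
  });
}
```

In an HTTP handler you would return this stream as the response body with a text/event-stream content type. The real glue also has to decide which graph events are worth forwarding, which is where the rest of the 30 lines tends to go.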

Dropping to raw API calls. Sometimes the framework abstraction is wrong for a specific node and you just want to call the OpenAI API directly. Mastra: instantiate the official OpenAI SDK client and call openai.chat.completions.create(...) inside any tool's execute function (this is the SDK client, not the @ai-sdk/openai provider passed to the agent's model field). The framework doesn't block raw SDK calls. LangGraph: write a custom node that returns a state update. The graph doesn't care how the node computes its output. CrewAI Python: subclass BaseTool and do whatever you need inside _run. CrewAI JS: same pattern but less tested.
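
For reference, this is what "raw" can look like with no SDK at all: a direct POST to OpenAI's chat completions endpoint. The sketch swaps the SDK call for plain HTTP so it stays dependency-free; rawChatCompletion and FetchLike are hypothetical names, and the fetch implementation is passed in so the function can be exercised without a network.

```typescript
// Minimal structural type for the parts of fetch this function uses.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Sends one user message to the chat completions endpoint and returns the reply text.
async function rawChatCompletion(
  fetchImpl: FetchLike,
  apiKey: string,
  prompt: string
): Promise<string> {
  const response = await fetchImpl("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "gpt-4o",
      messages: [{ role: "user", content: prompt }],
    }),
  });
  if (!response.ok) {
    throw new Error(`OpenAI request failed with status ${response.status}`);
  }
  const data = (await response.json()) as {
    choices?: Array<{ message?: { content?: string | null } }>;
  };
  const content = data.choices?.[0]?.message?.content;
  if (typeof content !== "string") {
    throw new Error("Response contained no message content");
  }
  return content;
}
```

Inside a tool's execute function or a graph node you would pass the runtime's global fetch as fetchImpl and process.env.OPENAI_API_KEY as the key. Neither framework needs to know the call happened.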

Production scenario: approval gate before a write action. Your agent calls a tool that edits a database. You need a human to approve every write before execution. LangGraph wins here clearly. Set interruptBefore: ["write-node"] in the checkpoint config, persist state to PostgreSQL, surface the pending decision to a UI, resume on approval. Mastra has HITL support but the docs for the approval-gate pattern are thinner than LangGraph's. CrewAI Enterprise AMP has a human review step in the GUI, but only on the Python side.

Custom error recovery. What happens when a tool call fails halfway through a multi-step agent run? Mastra raises the exception to the agent and lets the model decide whether to retry. LangGraph gives you explicit error edges you can wire per node. For high-stakes pipelines, LangGraph's explicit error topology is safer.
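
One way to read "explicit error edges" is a routing rule, evaluated after each node, that inspects an error field in state. The toy pipeline below sketches that idea without LangGraph. The node names, flakyWrite, the three-attempt cap, and the state shape are all assumptions for illustration.

```typescript
type NodeName = "write" | "recover" | "done";
type RunnableNode = Exclude<NodeName, "done">;

interface PipelineState {
  attempts: number;
  error: string | null;
  log: string[];
}

// Hypothetical flaky tool: fails on its first attempt, succeeds afterwards.
function flakyWrite(attempt: number): string {
  if (attempt === 1) throw new Error("connection reset");
  return `wrote row on attempt ${attempt}`;
}

const pipelineNodes: Record<RunnableNode, (s: PipelineState) => PipelineState> = {
  // The node catches its own failure and records it in state instead of throwing.
  write(s) {
    const attempts = s.attempts + 1;
    try {
      return { attempts, error: null, log: [...s.log, flakyWrite(attempts)] };
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { attempts, error: message, log: s.log };
    }
  },
  recover(s) {
    return { ...s, error: null, log: [...s.log, `recovered from: ${s.error}`] };
  },
};

// The "error edge": failure routes to recover (up to 3 attempts), success to done.
function nextNode(current: RunnableNode, s: PipelineState): NodeName {
  if (current === "recover") return "write"; // after recovery, retry the write
  if (s.error === null) return "done";
  return s.attempts < 3 ? "recover" : "done";
}

function runPipeline(): PipelineState {
  let state: PipelineState = { attempts: 0, error: null, log: [] };
  let current: NodeName = "write";
  while (current !== "done") {
    state = pipelineNodes[current](state);
    current = nextNode(current, state);
  }
  return state;
}
```

The recovery policy lives in nextNode, where it can be read, tested, and audited. In the model-decides approach the equivalent policy is whatever the model infers from the error message on that particular run.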

Which one to pick

Mastra is the default. The other two are flip conditions, not coin flips. The table is the flip map; the rationale follows it.

Team situation	Pick
TypeScript-first team, greenfield project	Mastra. Fastest path to production, best deploy story, no JS/Python context switch.
Python team, enterprise governance requirements	CrewAI (Python) + Enterprise AMP. RBAC, audit logs, dashboards. Nothing else in this list ships that layer.
Any team with complex stateful workflows or approval gates	LangGraph TS. The graph model and checkpoint API pay off when your agent needs to pause, persist, and resume with human input.
You need to audit exactly what ran, edge by edge, for high-stakes pipelines	LangGraph TS. Explicit node/edge topology and per-node error edges make the run inspectable in a way inferred control flow is not.

Default to Mastra for any greenfield TypeScript agent. The single Zod declaration that drives both types and tool spec, the 4-minute install, built-in OTLP, and edge deploy targets compound. At v1.33.0, with the team behind Gatsby running it, the 1.0 stability bet is reasonable.

Two flips, both narrow and specific. Flip to CrewAI Python plus Enterprise AMP only if you are already on Python and the requirement is a compliance trail across five or more agents on a shared knowledge base. RBAC, audit logs, and dashboards out of the box are worth more than avoiding the language; the unofficial JS SDK is not, and "wait for an official one" is not a plan you can ship against. Flip to LangGraph TS the moment an agent must pause at a named step, persist state, and resume on human approval. That is the one capability Mastra's docs are thin on and LangGraph's explicit graph makes structurally sound. Outside those two conditions, the port tax buys nothing.