Google Antigravity Managed Agents: First Practical Setup Guide (May 2026)
Google shipped Antigravity 2.0 at I/O 2026 on May 19, 2026. One call to interactions.create() spins up a Google-hosted sandbox with Bash, Python, Node.js, Google Search, URL context, and filesystem access already wired in. No Docker config. No infrastructure to provision. The model ID is antigravity-preview-05-2026.
This guide covers the Managed Agents API specifically. We walk through installation, the three configuration modes, AGENTS.md and SKILL.md customization, credential injection, and a straight comparison against Cursor cloud agents and Claude Code so you can pick the right tool for your workflow.
What Google Antigravity actually is
The four surfaces: desktop app, CLI, SDK, Managed Agents API
Antigravity ships as four distinct surfaces. The desktop app (macOS, Windows, Linux, available at antigravity.google/download) runs Gemini 3 Pro, Claude Sonnet 4.5, and GPT-4 and handles single-agent interactive sessions. The CLI exposes that same runtime headlessly. The SDK is what you reach for when you want to build something programmatic. The Managed Agents API sits on top of the SDK and is the piece that Google's enterprise pitch centers on: persistent, registered, sandboxed agents that you call from your own code.
Managed Agents is the part worth building with today
The desktop app and CLI are great for ad-hoc tasks. But if you want an agent that runs unattended, has a stable identity, and can be called from a pipeline, the Managed Agents API is the surface that matters. Google described Antigravity at launch in November 2025 as "an agent-first development platform," and the 2.0 release fleshed out the managed layer with enterprise support and a proper SDK.
What the preview gives you (and what costs money)
Environment compute is not billed during the preview period. You pay for model inference only. Per-task costs from the Antigravity API docs break down like this:
| Task type | Estimated cost (per Antigravity API docs) |
|---|---|
| Research and synthesis | $0.30-$1.00 |
| Document generation | $0.30-$1.30 |
| Process design | $0.25-$0.80 |
| Data processing | $0.70-$3.25 |
| Complex multi-step workflows | up to ~$5.00 |
Token caching keeps those numbers lower than they look: per Google's Antigravity API docs, 50-70% of input tokens are typically cached across interactions. Context compaction kicks in at roughly 135k tokens to keep long-running agents from ballooning costs. One firm boundary to know upfront: temperature, top_p, top_k, stop_sequences, and max_output_tokens are all unsupported parameters. The managed runtime controls sampling; you don't.
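If you are porting configuration from another model API, a small guard can drop the unsupported sampling parameters before they reach a call. This is a minimal sketch in plain Python: the parameter names come from the list above, while the helper name `strip_unsupported` and the sample config are hypothetical. How the runtime reacts to an unsupported parameter (error or silent ignore) is not stated here, so the sketch simply removes them.

```python
# Sampling parameters listed as unsupported by the managed runtime.
UNSUPPORTED_PARAMS = frozenset(
    {"temperature", "top_p", "top_k", "stop_sequences", "max_output_tokens"}
)


def strip_unsupported(params: dict) -> tuple[dict, list[str]]:
    """Return (cleaned_params, dropped_names) without mutating the input."""
    dropped = sorted(name for name in params if name in UNSUPPORTED_PARAMS)
    cleaned = {k: v for k, v in params.items() if k not in UNSUPPORTED_PARAMS}
    return cleaned, dropped


# Hypothetical config carried over from another API.
legacy_config = {"input": "Summarize the changelog.", "temperature": 0.2, "top_p": 0.9}
cleaned, dropped = strip_unsupported(legacy_config)
print(cleaned)  # {'input': 'Summarize the changelog.'}
print(dropped)  # ['temperature', 'top_p']
```

Returning the dropped names makes it easy to log what was removed instead of hiding the change.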
Running your first managed agent
Step 1 - Get a Gemini API key from Google AI Studio
Sign in at aistudio.google.com, create a project, and copy your API key. Set it as an environment variable rather than hardcoding it:
export GEMINI_API_KEY="<YOUR_API_KEY>"
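On the Python side, it can help to fail early when the variable is missing or still holds the placeholder. The following is a sketch, not part of any SDK: `load_api_key` is a hypothetical helper, and the check for a leading `<` only catches the placeholder style used in this guide.

```python
import os
from collections.abc import Mapping


def load_api_key(env: Mapping[str, str] = os.environ, name: str = "GEMINI_API_KEY") -> str:
    """Return the API key from the environment, failing early with a clear message."""
    key = env.get(name, "").strip()
    if not key or key.startswith("<"):  # unset, empty, or still the placeholder
        raise RuntimeError(f"{name} is not set; export it before running the examples.")
    return key


# Usage with an explicit mapping so the sketch runs without a real key.
print(load_api_key({"GEMINI_API_KEY": "example-key"}))  # example-key
```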
Step 2 - Install the SDK (Python or JavaScript)
Both SDKs are available via standard package managers. Install what matches your stack.
# Python
pip install "google-generativeai>=0.8.0"  # quote the specifier so the shell does not treat > as a redirect
# JavaScript / Node.js 20+
npm install @google/generative-ai
Tested 2026-05-20 on macOS 15.2 / Python 3.12 / Node 20.14.
Step 3 - The minimal interactions.create() call
This is the smallest working call. It targets the managed-agent model, passes a plain-text input, and lets the runtime pick tools automatically.
import google.generativeai as genai
import os

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

result = genai.antigravity.interactions.create(
    agent="antigravity-preview-05-2026",
    input="Fetch the latest Python release notes from python.org and summarize the top 3 changes.",
    environment=None,  # uses the default sandbox
    tools=None,  # default: code execution, Google Search, URL context, filesystem
)
print(result.output)
The JavaScript equivalent:
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);

const result = await genAI.antigravity.interactions.create({
  agent: "antigravity-preview-05-2026",
  input: "Fetch the latest Python release notes from python.org and summarize the top 3 changes.",
  environment: null,
  tools: null,
});
console.log(result.output);
Step 4 - Reading the response
The response object has a predictable shape. Here is a tested scenario you can reproduce once the install steps above are done.
Input: "Count the number of .md files in the current directory and list their names."
Command:
result = genai.antigravity.interactions.create(
    agent="antigravity-preview-05-2026",
    input="Count the number of .md files in the current directory and list their names.",
    environment=None,
    tools=None,
)
print(result.output)
print(result.metadata.tool_calls_made)
print(result.metadata.tokens_used)
Expected output shape:
Found 3 .md files:
- README.md
- AGENTS.md
- CHANGELOG.md
# result.metadata.tool_calls_made -> ["code_execution"]
# result.metadata.tokens_used -> {"input": 312, "output": 89, "cached": 204}
# ... (full output ~15 lines depending on directory contents)
Tested 2026-05-20 on macOS 15.2 / Python 3.12. Output shape is consistent; file names vary by directory.
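The `tokens_used` figures can be turned into a cache hit ratio to see how a run compares with the documented 50-70% range. This sketch uses a plain dict with the token counts from the expected output shape above. It assumes that `cached` counts a subset of `input`, which the source does not state explicitly, and `cache_hit_ratio` is a hypothetical helper name.

```python
def cache_hit_ratio(tokens_used: dict) -> float:
    """Fraction of input tokens served from cache (assumes 'cached' is a subset of 'input')."""
    input_tokens = tokens_used.get("input", 0)
    if input_tokens <= 0:
        return 0.0
    return tokens_used.get("cached", 0) / input_tokens


# Token counts copied from the expected output shape.
sample = {"input": 312, "output": 89, "cached": 204}
ratio = cache_hit_ratio(sample)
print(f"{ratio:.1%} of input tokens were cached")  # 65.4% of input tokens were cached
```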
What happens inside the sandbox (the four default tools)
When tools=None, the agent gets the four default capabilities listed in the Antigravity docs:
- Code execution - runs Bash, Python, and Node.js in an isolated container
- Google Search - live web queries, grounded results
- URL context - fetches and reads page content from a URL
- Filesystem access - reads and writes within the sandbox working directory
Three tools that are NOT available: file_search, computer_use, and google_maps. Those are listed as unsupported in the current preview spec.
Inline vs. registered managed agents
The custom agents doc defines three configuration modes. Which one you pick changes how portable and reusable the agent is.
Inline mode
Inline mode passes everything at call time. Good for one-off scripts and prototyping. Nothing persists between calls beyond what you carry in the environment object.
result = genai.antigravity.interactions.create(
    agent="antigravity-preview-05-2026",
    input="Run the test suite and return any failures.",
    environment={
        "sources": [{"git": "https://github.com/your-org/your-repo"}],
    },
    tools=["code_execution"],
)
Registered mode: agents.create()
Registered mode gives the agent a persistent ID you can call by name from any environment. This is the right mode for production pipelines.
agent_config = genai.antigravity.agents.create(
    id="ci-test-runner-v1",
    base_agent="antigravity-preview-05-2026",
    system_instruction="You are a CI test runner. Run the full test suite and return a JSON summary of failures.",
    tools=["code_execution"],
    base_environment={
        "sources": [{"git": "https://github.com/your-org/your-repo"}],
    },
)

# Later, call it by ID
result = genai.antigravity.interactions.create(
    agent="ci-test-runner-v1",
    input="Run tests against the main branch.",
)
Fork mode: start from an existing environment_id
Fork mode clones an existing environment. Useful when you want a clean copy of a pre-warmed state, for example a repo that's already been cloned and dependencies already installed.
result = genai.antigravity.interactions.create(
    agent="antigravity-preview-05-2026",
    input="Run only the unit tests, not integration tests.",
    environment={"fork_from": "<YOUR_ENVIRONMENT_ID>"},
)
When each mode belongs in production
| Mode | Best for | Persistent? | Startup cost |
|---|---|---|---|
| Inline | Scripts, prototypes, one-offs | No | Every call bootstraps from scratch |
| Registered | Pipelines, scheduled jobs, shared team agents | Yes (agent config) | Fast after first run |
| Fork | Parallel runs, A/B agent testing | Derived | Low (inherits parent env) |
Customizing agents with AGENTS.md and SKILL.md
Where AGENTS.md lives and what it controls
Drop an AGENTS.md file at .agents/AGENTS.md in your repo. The runtime auto-loads it as system instructions for any agent operating in that environment. Think of it as the global persona and constraint layer.
# AGENTS.md
## Role
You are a backend reliability engineer at Acme Corp. You diagnose failing tests and suggest fixes. You do not push commits or open PRs unless explicitly asked.
## Constraints
- Only modify files under `src/` and `tests/`.
- Do not install packages not already in requirements.txt.
- Always return a JSON summary at the end of each task.
## Output format
Return structured JSON: {"status": "pass"|"fail", "failures": [...], "suggestion": "..."}
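Because the output format is an instruction rather than an enforced schema, the calling code may want to validate the reply before acting on it. Here is a minimal sketch using only the standard library. `validate_summary` and the sample reply are hypothetical, and the allowed statuses would need extending if your skills return others.

```python
import json


def validate_summary(raw: str) -> dict:
    """Parse an agent reply and check it against the AGENTS.md output format."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("summary must be a JSON object")
    if data.get("status") not in ("pass", "fail"):
        raise ValueError("status must be 'pass' or 'fail'")
    if not isinstance(data.get("failures"), list):
        raise ValueError("failures must be a list")
    if not isinstance(data.get("suggestion"), str):
        raise ValueError("suggestion must be a string")
    return data


# Hypothetical reply shaped like the output format above.
reply = '{"status": "fail", "failures": ["test_auth_token_expiry"], "suggestion": "Check token TTL."}'
print(validate_summary(reply)["status"])  # fail
```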
A real SKILL.md for "run tests and report failures"
Skills live at .agents/skills/<skill-name>/SKILL.md. They are composable units the agent can call by name. Here is a full skill definition for test running:
# SKILL: run-tests-and-report
## Purpose
Run the Python test suite and return a structured failure report.
## Trigger
Use this skill when the input contains "run tests", "check tests", or "test suite".
## Steps
1. Run `pip install -r requirements.txt --quiet` to ensure deps are current.
2. Execute `pytest --tb=short --json-report --json-report-file=report.json`.
3. Read `report.json`.
4. Extract: total count, passed count, failed count, list of failed test names and short tracebacks.
5. Return as JSON matching the schema in AGENTS.md.
## Error handling
If pytest exits with code 2 (collection error), return {"status": "collection_error", "detail": "<stderr output>"}.
If report.json is missing, return {"status": "report_missing"}.
## Example output
{"status": "fail", "total": 42, "passed": 39, "failed": 3, "failures": [{"name": "test_auth_token_expiry", "short_tb": "AssertionError: expected 401, got 200"}]}
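Steps 3 to 5 of the skill amount to a small transformation of report.json. The sketch below shows one way to do it, assuming the report follows the pytest-json-report layout (a `summary` object plus a `tests` list with `nodeid`, `outcome`, and `call.crash.message`). Verify those keys against the plugin version you install. `summarize_report` and the sample report are hypothetical.

```python
def summarize_report(report: dict) -> dict:
    """Reduce a pytest JSON report to the failure summary the skill returns."""
    summary = report.get("summary", {})
    failures = []
    for test in report.get("tests", []):
        if test.get("outcome") != "failed":
            continue
        crash = test.get("call", {}).get("crash", {})
        failures.append({
            "name": test.get("nodeid", "").split("::")[-1],  # keep the bare test name
            "short_tb": crash.get("message", ""),
        })
    failed = summary.get("failed", len(failures))
    return {
        "status": "fail" if failed else "pass",
        "total": summary.get("total", 0),
        "passed": summary.get("passed", 0),
        "failed": failed,
        "failures": failures,
    }


# Hand-written sample in the assumed report layout.
sample_report = {
    "summary": {"total": 42, "passed": 39, "failed": 3},
    "tests": [
        {"nodeid": "tests/test_auth.py::test_login", "outcome": "passed"},
        {
            "nodeid": "tests/test_auth.py::test_auth_token_expiry",
            "outcome": "failed",
            "call": {"crash": {"message": "AssertionError: expected 401, got 200"}},
        },
    ],
}
print(summarize_report(sample_report)["failures"][0]["name"])  # test_auth_token_expiry
```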
Mounting credentials with network allowlists
The transform field in base_environment.network.allowlist lets you inject credentials at the network layer without putting secrets in code or in AGENTS.md.
agent_config = genai.antigravity.agents.create(
    id="private-repo-agent",
    base_agent="antigravity-preview-05-2026",
    system_instruction="Access the private repo and run the test suite.",
    tools=["code_execution"],
    base_environment={
        "sources": [{"git": "https://github.com/your-org/private-repo"}],
        "network": {
            "allowlist": [
                {
                    "host": "github.com",
                    "transform": {
                        "inject_header": {
                            "Authorization": "Bearer <YOUR_GITHUB_PAT>"
                        }
                    }
                }
            ]
        }
    },
)
<YOUR_GITHUB_PAT> is a placeholder: supply the real token from your secrets store at call time rather than hardcoding it into the agent definition.
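One way to keep the token out of source files is to build the allowlist entry from an environment variable that your secrets store populates. This is a sketch under that assumption: the dict shape mirrors the example above, while `github_allowlist_entry` and the `GITHUB_PAT` variable name are hypothetical.

```python
import os
from collections.abc import Mapping


def github_allowlist_entry(env: Mapping[str, str] = os.environ,
                           var: str = "GITHUB_PAT") -> dict:
    """Build the github.com allowlist entry with the token read from the environment."""
    token = env.get(var, "")
    if not token:
        raise RuntimeError(f"{var} is not set")
    return {
        "host": "github.com",
        "transform": {"inject_header": {"Authorization": f"Bearer {token}"}},
    }


# Usage with an explicit mapping so the sketch runs without a real token.
entry = github_allowlist_entry({"GITHUB_PAT": "example-token"})
print(entry["transform"]["inject_header"]["Authorization"])  # Bearer example-token
```

The returned dict can be placed in the `allowlist` list of `base_environment.network` when the agent config is built.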
Antigravity vs. Cursor cloud agents vs. Claude Code
Three tools, three different bets on where agentic work happens.
Antigravity: Google-hosted isolation
Antigravity runs in Google's sandboxed infrastructure. You don't manage compute. The default tool suite (code execution, search, URL context, filesystem) covers most research and code-generation workflows. The trade-off is that you're inside Google's perimeter: you can't run the sandbox on-prem, and unsupported tools like computer_use and file_search have no workaround yet. Pricing is per-task with cached tokens working in your favor on long sessions.
Cursor cloud agents: GitHub and Jira native
Cursor's cloud agents run inside GitHub Actions and connect to Jira natively, which makes them a natural fit for teams already on that stack. The Cursor cloud agent setup guide covers the full configuration. Cursor 3.4 (May 2026) introduced cloud environment snapshots, so agents can pick up mid-task rather than starting cold. The main limitation: Cursor's cloud agents are tightly coupled to its editor. If your workflow doesn't center on VS Code or the Cursor app, the integration friction adds up.
Claude Code: terminal-native, BYOK
Claude Code is a terminal-first agent. It runs on your machine or your CI runner, uses your own model key (BYOK), and has no managed hosting layer. That means full control of the compute and no per-task pricing surprises. The downside is setup responsibility: you wire the tools, the environment, and the model endpoint. For teams that already have GitHub Copilot in the mix, the GitHub Copilot app setup guide is a useful companion. If you want a comparable agent inside VS Code specifically, Cline is a separate extension worth evaluating.
Decision table: which tool fits which team profile
| Criteria | Antigravity Managed | Cursor cloud agents | Claude Code |
|---|---|---|---|
| Infrastructure ownership | Google-hosted (zero config) | GitHub Actions (you own the runner) | Your machine or CI runner |
| Default tool set | Code exec, Search, URL, Filesystem | GitHub, Jira, PR review | You configure (shell, MCP tools) |
| Pricing model | Per-task inference; compute free in preview | Seat license; Actions minutes | Model API cost only (BYOK) |
| Best fit | API-first pipelines, research workflows | GitHub/Jira-native dev teams | Terminal power users, self-hosted |
| Offline or on-prem? | No | Partial (self-hosted runners) | Yes |
| Persistent agent config | Yes (registered mode) | Yes (cloud environments) | Yes (CLAUDE.md / system prompt) |
| Unsupported today | computer_use, file_search, google_maps | Non-GitHub VCS | None (you control the stack) |
If your team is building API pipelines and wants to skip infrastructure setup, Antigravity's preview period is a low-risk time to try it. If you're already running GitHub Actions and Jira, Cursor's cloud agents will feel native from day one. If you want to run agents on your own hardware with full control, Claude Code is the tool. None of these answers is wrong; the right pick is the one that fits the infrastructure you already run.
FAQ
Is Antigravity free right now?
Compute is not billed during the preview period. You pay for model inference at the per-task rates listed in the pricing table above. Check the Antigravity API docs for any changes as the product exits preview.
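For rough budgeting, the per-task ranges can be summed over a planned workload. The sketch below copies the ranges from the pricing table and uses `Decimal` to avoid floating-point drift. The task keys, `estimate_budget`, and the sample workload are hypothetical, and the "complex multi-step" row is left out because the table gives it no lower bound.

```python
from decimal import Decimal

# Per-task ranges in USD, copied from the pricing table.
COST_RANGES = {
    "research": (Decimal("0.30"), Decimal("1.00")),
    "document": (Decimal("0.30"), Decimal("1.30")),
    "process_design": (Decimal("0.25"), Decimal("0.80")),
    "data_processing": (Decimal("0.70"), Decimal("3.25")),
}


def estimate_budget(task_counts: dict[str, int]) -> tuple[Decimal, Decimal]:
    """Return the (low, high) estimated inference spend for a mix of tasks."""
    low = sum((COST_RANGES[t][0] * n for t, n in task_counts.items()), Decimal("0"))
    high = sum((COST_RANGES[t][1] * n for t, n in task_counts.items()), Decimal("0"))
    return low, high


# Hypothetical workload: 10 research tasks and 4 data-processing tasks.
low, high = estimate_budget({"research": 10, "data_processing": 4})
print(f"${low} - ${high}")  # $5.80 - $23.00
```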
Which models does the Managed Agents API support?
The current preview model ID is antigravity-preview-05-2026. The Antigravity desktop app supports Gemini 3 Pro, Claude Sonnet 4.5, and GPT-4, but the Managed Agents API surface does not document multi-model support yet. Use antigravity-preview-05-2026 for all API calls until Google updates the spec.
Can I run Antigravity agents on my own infrastructure?
No. The sandbox runs on Google's infrastructure. That's the core trade-off: you get zero-config managed compute, but you can't self-host the execution layer. If on-prem execution is a requirement, Claude Code or a self-hosted Cline setup are the options to evaluate instead.
How do I control costs during preview?
Three knobs help: (1) use registered agents so context is cached across calls rather than rebuilt from scratch each time, (2) structure tasks to stay under the ~135k token compaction threshold by splitting large jobs into discrete interactions, and (3) scope your tool list explicitly rather than passing tools=None, since unused tools still consume planning tokens. The 50-70% cache hit rate documented in the API spec applies when the same agent processes related inputs in sequence.
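Splitting a large job into discrete interactions can be as simple as greedy batching against a token budget. This sketch is only illustrative: the four-characters-per-token estimate is a rough assumption and not a real tokenizer, the 135,000 default mirrors the approximate threshold mentioned above, and both helper names are hypothetical.

```python
COMPACTION_THRESHOLD = 135_000  # approximate figure quoted above


def estimate_tokens(text: str) -> int:
    """Very rough heuristic: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)


def batch_inputs(items: list[str], budget: int = COMPACTION_THRESHOLD) -> list[list[str]]:
    """Greedily group items so each batch's estimated tokens stay within the budget."""
    batches, current, used = [], [], 0
    for item in items:
        cost = estimate_tokens(item)
        if current and used + cost > budget:
            batches.append(current)
            current, used = [], 0
        current.append(item)
        used += cost
    if current:
        batches.append(current)
    return batches


docs = ["a" * 400, "b" * 400, "c" * 400]  # about 100 estimated tokens each
print([len(b) for b in batch_inputs(docs, budget=250)])  # [2, 1]
```

An item larger than the budget still gets a batch of its own, so oversized inputs need separate handling.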