DigitalOcean Gradient ADK: Deploy a Python AI Agent to Production Without Managing Infrastructure (2026)

The short version

DigitalOcean's Gradient ADK is a pip-installable Python SDK and CLI that ships your LangGraph or CrewAI agent as a hosted /run endpoint with built-in tracing. Here is the setup, the gotchas, and when it beats a bare Droplet or Fly.io.

Published June 17, 2026 by Pondero Labs

You wrote a LangGraph agent on your laptop. It plans a research task, calls a couple of tools, writes a summary, and it works. Then you try to share it. Now you need an HTTP endpoint, a process that stays up overnight, somewhere to read the logs when a tool call hangs, and a way to see which node in the graph actually broke. That gap, the one between "runs on my machine" and "runs in production," is where most agent side-projects quietly die. DigitalOcean's Gradient AI Agent Development Kit (ADK) is built to close exactly that gap, and its own announcement frames the problem the same way: the prototype-to-production step "usually means stitching together infrastructure, deployment pipelines, logging, endpoints, and monitoring, often without changing the agent logic itself," per DigitalOcean's launch blog. This guide walks through the real commands, the trace view that makes it worth using, the knowledge-base wiring, and the cases where you should reach for a bare Droplet or AWS instead.

What Gradient ADK actually is

It is a Python SDK plus a CLI. Not a no-code builder, not a drag-and-drop canvas. You keep writing agent code in your editor; the ADK is the thing that takes that code and runs it as a hosted service. Per the launch blog, it is "a Python SDK and CLI that lets you deploy agent code as a hosted, production-ready service."

The split between what it manages and what you own is the whole pitch, so it is worth being concrete about it.

What the platform handles for you:

  • The HTTP endpoint. Deployed agents are exposed via a standardized /run endpoint, per the launch blog.
  • Multiple deployments of the same agent (dev, staging, prod) behind that runtime.
  • Execution traces and logs, including LLM calls, tool calls, and knowledge-base calls.
  • Knowledge Base indexing and retrieval, if you wire one in.

What you still own:

  • The agent code itself, in whatever Python you like.
  • The framework. LangGraph, LangChain, CrewAI, PydanticAI, or your own orchestration all run, as long as the code conforms to the entry-point contract, per the launch blog.
  • Model selection on Gradient inference.

Status, as of June 2026: ADK is in public preview, and agent hosting is free during the preview, per the launch blog. You must opt in from the Feature Preview page in the DigitalOcean console; until you do, the CLI will not deploy anything.

On models, Gradient inference is not a one-model platform. The current library on the Inference Engine includes NVIDIA Nemotron 3 Ultra (a 550B MoE model aimed at long-running agentic workflows), Anthropic's Claude Fable 5, OpenAI's GPT-5.5, DeepSeek-V4-Flash, and Llama. DigitalOcean added Nemotron 3 Ultra the week of June 1 and Claude Fable 5 the week of June 8, 2026, both reachable from inside an ADK agent. DigitalOcean's own note on Nemotron 3 Ultra claims "up to 5x faster inference and up to 30% lower cost for agentic workloads," per the Inference Engine update. Read that as a vendor benchmark, not an independent one. The practical point stands either way: you pick the model in code and it bills on the same DigitalOcean invoice as the agent.

When ADK is the right call (and when it isn't)

ADK is not the answer to every agent-hosting question. It earns its keep when you want managed infra, framework-aware tracing, and one DigitalOcean bill instead of three vendors. It is the wrong tool when you need raw GPU control or you already live on another cloud. Here is the decision split.

Your situation | The pick
You want managed infra, tracing, and one DigitalOcean bill in one place | Gradient ADK
You need GPU-heavy training or custom CUDA kernels | Bare GPU Droplet or Hetzner
You already run on AWS and want serverless | Lambda plus Bedrock
You want Git-push deploys on minimal infra | Fly.io
You're self-hosting n8n or an MCP server, not an agent | A raw VPS (see below)

That last row is a different buyer. If your job is keeping an always-on n8n instance or a self-hosted MCP server alive, you are choosing a Linux box and deciding how much of it to babysit, not deploying agent code. We covered that call in detail in our Cloudways vs DigitalOcean vs Hetzner guide. ADK does not replace that; it solves the other problem, where the unit of deployment is a Python agent, not a server you SSH into.

The honest line on the rest: if you have a working Python agent and you want it reachable, observable, and billed in one place, ADK is the shortest path. If you need to drop to bare metal, you will fight the abstraction.

Setup: from local agent to deployed endpoint

You need Python 3.10+, a DigitalOcean account, and the ADK public preview enabled in your workspace. If you don't have an account yet, create a DigitalOcean account first, then turn on the ADK preview from the Feature Preview page; the CLI will not deploy until the preview is enabled.

Install the package. It is gradient-adk, and it brings both the SDK and the gradient CLI:

pip install gradient-adk

Every ADK agent is a main.py with an @entrypoint-decorated function. This is the contract the runtime calls, and it is the smallest possible agent that deploys cleanly:

# main.py
from gradient_adk import entrypoint

@entrypoint
def main(input: dict) -> dict:
    # your agent logic here
    return {"response": "..."}

Source: gradient-adk-templates README, fetched 2026-06-17.
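To make the contract concrete, here is a slightly fuller sketch of the same function shape. It is deliberately plain Python so it runs without the package installed: the @entrypoint decorator shown above is left out, and the "prompt" input key and the echo logic are assumptions for illustration, not part of the ADK contract.

```python
# Sketch only: same dict-in, dict-out shape as the main.py above.
# In a real project you would add `from gradient_adk import entrypoint`
# and decorate main() with @entrypoint. The "prompt" key is hypothetical.

def main(input: dict) -> dict:
    prompt = input.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        # Return an error payload instead of raising, so callers always get a dict.
        return {"error": "expected a non-empty string under 'prompt'"}
    # Placeholder for real agent logic (graph invocation, tool calls, and so on).
    return {"response": f"You asked: {prompt.strip()}"}


if __name__ == "__main__":
    print(main({"prompt": "Summarize the ADK launch post"}))
```

Validating the input at the top of the function keeps malformed requests from reaching the model, which matters once inference is billed per call.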

That signature matters. The runtime hands your function an input dict and expects a dict back; that contract is what gets exposed at the /run endpoint. Most teams start from a template instead of an empty file. The gradient-adk-templates repo ships starting points for LangGraph, CrewAI, a multi-agent RAG workflow, a web-research agent, and an MCP integration. Clone one and you inherit the project shape:

git clone https://github.com/digitalocean/gradient-adk-templates.git
cd gradient-adk-templates/StateGraph

python -m venv venv
source venv/bin/activate          # Windows: venv\Scripts\activate
pip install -r requirements.txt

cp .env.example .env
# edit .env: set DIGITALOCEAN_API_TOKEN and DIGITALOCEAN_INFERENCE_KEY

Source: gradient-adk-templates README (StateGraph quick start), fetched 2026-06-17.

Test it locally before you spend a deploy on it. With your token exported, gradient agent run executes the agent on your machine against the real model endpoint:

export DIGITALOCEAN_API_TOKEN=<YOUR_TOKEN>
gradient agent run

When the local run looks right, ship it. One command pushes the agent to managed infrastructure:

gradient agent deploy

Source: gradient-adk-templates README + DigitalOcean launch blog, fetched 2026-06-17.

After deploy, the agent is live behind the standardized /run endpoint, and you can stand up separate dev, staging, and prod deployments of the same code from the runtime, per the launch blog. The deployment config lives in a .gradient/agent.yml file the templates already include, so the CLI knows what it's shipping.
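Calling the deployed agent is then an ordinary HTTP POST. The sketch below uses only the standard library and rests on several assumptions: the base URL is a placeholder for whatever your deployment reports, the bearer-token header is an assumed auth scheme, and the payload keys are whatever your own entry-point function expects. Check the DigitalOcean docs for the real values before relying on it.

```python
import json
import urllib.request


def build_run_request(base_url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST to <base_url>/run (URL and auth scheme are assumptions)."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


def call_agent(base_url: str, token: str, payload: dict, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON dict the agent returns."""
    request = build_run_request(base_url, token, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Splitting request construction from sending keeps the part you can unit-test (URL, headers, body) separate from the network call.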

The framework choice has one real consequence, and the docs bury it. If you're starting fresh, pick LangGraph (or PydanticAI). For those two frameworks, ADK captures node-level traces automatically, with zero extra configuration, per the launch blog. Build on LangChain, CrewAI, or custom orchestration and you still get standardized input/output and runtime visibility, but the deep per-step tracing needs decorators added to your agent logic. If you already have a CrewAI codebase, ADK still wraps it; you just do a little more work to get the same depth of trace. That single difference is the reason to default to LangGraph when the slate is clean.
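To show what "decorators added to your agent logic" means in general, here is a minimal sketch of decorator-based step tracing in plain Python. The traced_step decorator, the TRACE list, and both example functions are hypothetical and are not the ADK's API; the ADK ships its own decorators, so treat this only as an illustration of the pattern.

```python
import functools
import time

TRACE: list[dict] = []  # in-memory stand-in for a real trace backend


def traced_step(kind: str):
    """Hypothetical decorator: record name, kind, inputs, output, and duration per call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            started = time.perf_counter()
            record = {"step": func.__name__, "kind": kind, "args": args, "kwargs": kwargs}
            try:
                result = func(*args, **kwargs)
                record["output"] = result
                return result
            except Exception as exc:
                record["error"] = repr(exc)
                raise
            finally:
                # Runs on success and on failure, so every call leaves a record.
                record["seconds"] = time.perf_counter() - started
                TRACE.append(record)
        return wrapper
    return decorator


@traced_step("tool")
def lookup_order(order_id: str) -> dict:
    return {"order_id": order_id, "status": "shipped"}


@traced_step("llm")
def draft_reply(order: dict) -> str:
    return f"Order {order['order_id']} is {order['status']}."


reply = draft_reply(lookup_order("A-17"))
```

After the last line, TRACE holds one record per step in execution order, which is the raw material a trace view renders. With LangGraph or PydanticAI you skip this annotation work entirely.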

Tracing: the part that makes it production-credible

A bare Droplet gives you a process and a log file. That is fine until an agent silently loops, or a tool call returns garbage three steps deep and the final answer is plausible-but-wrong. Then you are grepping stdout trying to reconstruct what the model decided and when.

ADK's tracing is the actual differentiator here. Per the launch blog, it adds "full tracing (including LLM calls, tool calls, knowledge base calls, and model specific metadata)" to your agent, and for LangGraph and PydanticAI that capture is automatic. No OpenTelemetry collector to stand up, no span plumbing to write. Each run shows up in your Gradient workspace as a tree of steps: the graph nodes that fired, the prompt and response on each LLM call, every tool invocation with its arguments and return value, and any knowledge-base lookups along the way. When an agent does the wrong thing, you open the trace and read the exact step where the reasoning went sideways instead of reverse-engineering it from logs.

There is a second debugging surface worth knowing about. ADK includes evaluations for multi-step agents: per the launch blog, you build test cases from datasets and apply metrics "for every step of an ADK Agent, not just the final output." That is the difference between "the final answer looked okay" and "step four retrieved the wrong document but the model covered for it." Catching the second one is what keeps an agent honest in production.
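The idea of scoring every step rather than only the final output can be sketched in a few lines. The recorded run, the step names, and the checks below are all hypothetical, and this is not the ADK evaluation API; it only shows why a step-level check catches what an output-level check misses.

```python
from typing import Callable

# Hypothetical recorded run: one dict per step, in execution order.
run = [
    {"step": "retrieve", "output": {"doc_id": "refund-policy-2023"}},
    {"step": "answer", "output": "Refunds are available within 30 days."},
]

# One check per step name; each returns True when that step's output is acceptable.
checks: dict[str, Callable[[object], bool]] = {
    "retrieve": lambda out: out["doc_id"] == "refund-policy-2026",
    "answer": lambda out: "refund" in out.lower(),
}


def evaluate_steps(run: list[dict], checks: dict) -> dict[str, bool]:
    """Return {step_name: passed} for every step that has a check."""
    return {s["step"]: checks[s["step"]](s["output"]) for s in run if s["step"] in checks}


results = evaluate_steps(run, checks)
failed = [name for name, ok in results.items() if not ok]
```

Here the final answer passes its check while the retrieval step fails, because it pulled an outdated document. An evaluation that looked only at the answer would have reported a clean run.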

Knowledge bases: built-in RAG without a separate vector DB

For a lot of agents, the whole value is answering from your own documents. ADK agents can query a DigitalOcean Knowledge Base directly, per the launch blog, which means common RAG cases don't need a separate vector database you operate yourself.

The piece that makes this low-maintenance is auto-indexing. You connect a source (Google Drive, Amazon S3, or Dropbox), flip on auto-indexing, and pick a schedule. New, updated, and deleted documents get detected, fetched, and re-indexed into the underlying OpenSearch store automatically, per DigitalOcean's capabilities update. The schedule choices are daily, weekly, or manual, and there are sync logs showing last sync time, status, and errors. So a support agent pointed at a Drive folder of policy docs stays current as the docs change, without anyone re-running an ingestion script.
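Conceptually, each scheduled sync has to work out which documents are new, updated, or deleted since the previous run. The sketch below shows one common way to do that, by comparing content fingerprints between two snapshots. It is an illustration of the technique under assumed names (fingerprint, diff_snapshots, the file paths), not a description of how DigitalOcean implements auto-indexing.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Stable content hash used to detect edits."""
    return hashlib.sha256(content).hexdigest()


def diff_snapshots(previous: dict[str, str], current: dict[str, str]) -> dict[str, list[str]]:
    """Compare {path: fingerprint} maps from two syncs."""
    return {
        "new": sorted(p for p in current if p not in previous),
        "updated": sorted(p for p in current if p in previous and current[p] != previous[p]),
        "deleted": sorted(p for p in previous if p not in current),
    }


# Hypothetical folder contents at two sync times.
previous = {
    "returns.md": fingerprint(b"v1"),
    "shipping.md": fingerprint(b"v1"),
    "old-faq.md": fingerprint(b"v1"),
}
current = {
    "returns.md": fingerprint(b"v2"),
    "shipping.md": fingerprint(b"v1"),
    "warranty.md": fingerprint(b"v1"),
}

changes = diff_snapshots(previous, current)
```

Only the paths listed under "new" and "updated" need re-embedding, and "deleted" paths need removing from the index, which is why incremental syncs stay cheap as the folder grows.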

The catch worth naming: auto-indexing keeps the index fresh, it does not make retrieval perfect. Garbage or duplicate docs in the source folder still poison the answers. The evaluations from the previous section are how you catch a retrieval step that pulled the wrong doc before it ships a confidently wrong reply.

What it costs

Here is the part to read carefully, because the preview pricing is generous and temporary.

Agent hosting (the ADK compute itself) is free during the public preview, per the launch blog. You deploy, test, and iterate without paying for the runtime while you evaluate it. DigitalOcean has not published a post-preview hosting rate, so anyone quoting you a future per-hour number is guessing. Don't budget around it yet.

What you do pay for during the preview is inference and any supporting resources. Model calls bill at standard Gradient serverless inference rates, knowledge-base storage and any databases bill normally, and it all lands on one DigitalOcean invoice. That is the billing model as of June 2026; for current numbers, check DigitalOcean's Gradient model pricing directly rather than trusting a figure copied into a blog post that may be stale by the time you read it. A new DigitalOcean account starts at $0 and typically comes with sign-up credit, so you can run the whole walkthrough above and pay only for the tokens you actually burn.
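Since token usage is the main cost during the preview, a small estimator helps with budgeting. The sketch assumes per-million-token pricing with separate input and output rates, which is a common scheme but should be confirmed on the pricing page. The rates in the example are placeholders chosen for easy arithmetic, not DigitalOcean prices.

```python
def estimate_inference_cost(
    input_tokens: int,
    output_tokens: int,
    input_rate_per_million: float,
    output_rate_per_million: float,
) -> float:
    """Estimated spend in dollars, given per-million-token rates you supply."""
    input_cost = (input_tokens / 1_000_000) * input_rate_per_million
    output_cost = (output_tokens / 1_000_000) * output_rate_per_million
    return input_cost + output_cost


# Placeholder rates (NOT real prices): $1.00 per million input tokens,
# $4.00 per million output tokens.
cost = estimate_inference_cost(
    input_tokens=2_000_000,
    output_tokens=500_000,
    input_rate_per_million=1.00,
    output_rate_per_million=4.00,
)
```

With those placeholder rates, 2 million input tokens and 500,000 output tokens come to $2.00 plus $2.00. Substitute the published rates for the model you picked.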

Which way to go tomorrow morning

If you're a Python developer with a working agent and you want it hosted, observable, and billed in one place, Gradient ADK is the clear pick right now. The free preview compute removes the cost objection, the LangGraph and PydanticAI automatic tracing removes the "I can't see what it's doing" objection, and running the agent next to inference and knowledge bases under one platform removes the three-vendors-and-a-spreadsheet objection. Start from a template, deploy with one command, and read the trace when something breaks.

Reach for an alternative in three cases. If you need custom CUDA or GPU-heavy training, a bare GPU Droplet or Hetzner gives you the metal ADK abstracts away. If your stack already lives on AWS, Lambda plus Bedrock keeps you in one bill there instead of starting a new one. And if you're hosting n8n or an MCP server rather than agent code, that's the VPS decision in our Cloudways vs DigitalOcean vs Hetzner guide, not this one.

For everyone in the sweet spot: enable the preview, install gradient-adk, and deploy a template before the free window closes. Create a DigitalOcean account to get started and turn on the ADK preview from the Feature Preview page.