A year ago, if you wanted to build an AI agent, you picked an orchestration framework — CrewAI, LangGraph, AutoGen — and wired it to whatever model API you were using. That layer still exists. But in 2026, something shifted: the model providers started shipping their own agent SDKs.
Anthropic has the Claude Agent SDK. OpenAI has the Agents SDK. AWS built Strands Agents. Google released the Agent Development Kit. Each one bundles an agent loop, built-in tools, and deployment opinions into a single package that’s tightly integrated with its parent platform.
This is a different decision than choosing between CrewAI and LangGraph. Those frameworks are model-agnostic orchestration layers. These SDKs are lower-level: they give you the primitives that orchestration frameworks are built on top of. Think of it as choosing your engine versus choosing your car.
I’ve been building with all four over the past couple of months, and the differences aren’t just cosmetic. They reflect very different philosophies about what an agent should be, how much control you should have, and how tightly your code should couple to one vendor’s ecosystem.
What You’re Actually Choosing Between
Here’s a quick orientation before we get into the details.
| | Claude Agent SDK | OpenAI Agents SDK | Strands Agents | Google ADK |
|---|---|---|---|---|
| Maker | Anthropic | OpenAI | AWS | Google |
| Languages | Python, TypeScript | Python (TS coming) | Python, TypeScript | Python, Java, Go, TS |
| Model lock-in | Claude only | 100+ LLMs via Chat Completions | Model-agnostic (Bedrock, OpenAI, Ollama, etc.) | Gemini-first, but supports others |
| MCP support | Native | No | Native | Partial |
| Open source | Yes | Yes (MIT) | Yes (Apache 2.0) | Yes (Apache 2.0) |
| Traction | ~8K GitHub stars | ~18K GitHub stars | ~14M PyPI downloads | ~17K GitHub stars |
The stars and download numbers don’t tell the full story — Strands has been around since May 2025 and counts PyPI downloads differently — but they give you a rough sense of community momentum.
Claude Agent SDK: The “Give You Everything Claude Code Has” Approach
Anthropic’s pitch is straightforward: you get the same agent loop, tools, and context management that power Claude Code, but as a programmable SDK. If you’ve used Claude Code and thought “I wish I could embed this in my own app,” that’s exactly the use case.
The built-in tool set is the most comprehensive of the four. Out of the box, you get file read/write/edit, bash execution, glob and grep for code search, web search, web fetch, and — this is the interesting part — the ability to spawn subagents. The subagent model lets you decompose complex tasks into independent parallel work, which maps well to how you’d naturally break down a problem.
Architecture-wise, the SDK is built around four concepts: Agent (model + system prompt + tools), Environment (a configured container template), Session (a running instance with persistent state), and Events (the message stream). The session abstraction is well-designed — you can pause, resume, and inspect agent state without losing context.
Context compaction is another standout. For long-running agents that accumulate thousands of tokens of conversation history, the SDK automatically summarizes older context to keep the window manageable. You can also set a hard budget cap with max_budget_usd, which is a lifesaver when you’re running agents in production and don’t want a runaway loop to cost you $50.
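The mechanics behind both features are simple to sketch. Here's a toy stdlib-only illustration of compaction plus a budget guard; the names (`compact`, `charge`, `BudgetExceeded`) and the summarization strategy are mine, not the Claude Agent SDK's actual API, which does real LLM-based summarization rather than a placeholder:

```python
# Toy sketch of context compaction and a hard budget cap.
# Illustrative only -- not the Claude Agent SDK's real API.

class BudgetExceeded(RuntimeError):
    pass

def compact(history, keep_recent=4):
    """Replace all but the most recent turns with a one-line summary."""
    if len(history) <= keep_recent:
        return history
    older, recent = history[:-keep_recent], history[-keep_recent:]
    summary = f"[summary of {len(older)} earlier turns]"
    return [summary] + recent

def charge(spent_usd, cost_usd, max_budget_usd=5.0):
    """Fail the loop before a runaway agent blows past its budget."""
    if spent_usd + cost_usd > max_budget_usd:
        raise BudgetExceeded(f"would exceed ${max_budget_usd:.2f}")
    return spent_usd + cost_usd

history = [f"turn {i}" for i in range(10)]
compacted = compact(history)
print(compacted[0])    # [summary of 6 earlier turns]
print(len(compacted))  # 5
```

The key design point is that compaction is lossy on purpose: you trade perfect recall of old turns for a window that never grows unbounded, and the budget check runs before each model call, not after.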
The catch: Claude-only. There’s no model flexibility here. If you’re building on the Claude Agent SDK, you’re committed to Anthropic’s models and Anthropic’s pricing. For many teams, that’s fine — Claude is strong enough that model optionality isn’t a real concern. But if your organization has a mandate to use Bedrock models or wants the option to swap in an open-source model for cost-sensitive tasks, this becomes a hard constraint.
MCP (Model Context Protocol) support is first-class, which matters if you’re building integrations with databases, APIs, or external services. The SDK treats MCP servers as just another tool source, so connecting to a Postgres database or a Slack workspace is configuration, not code.
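To make "configuration, not code" concrete: wiring a Postgres MCP server into an agent typically looks like a config entry along these lines. The server package shown is the community Postgres MCP server and the connection string is a placeholder; check the SDK docs for the exact schema it expects.

```json
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres",
               "postgresql://localhost/mydb"]
    }
  }
}
```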
Best for: Teams already invested in Claude who want production-grade agent infrastructure with minimal boilerplate. If you’re building developer tools, code assistants, or anything that involves file system interaction, the built-in tools are hard to beat.
OpenAI Agents SDK: Handoffs, Guardrails, and the Enterprise Play
OpenAI’s approach is more opinionated about multi-agent orchestration. The core primitives are agents (with instructions and tools), handoffs (delegation between agents), and guardrails (input/output validation). The guardrails piece is doing real work here — it’s not just schema validation, it’s LLM-powered checking that can catch semantic issues like “the agent is about to send an email to the wrong person.”
The April 2026 update added sandbox support, which matters more than it sounds. Agents can now operate in controlled compute environments — inspecting files, running commands, editing code — inside sandboxes from providers like E2B, Modal, Cloudflare, or Vercel. This moves the SDK from “chat agent” territory into “coding agent” territory, which is clearly where the market is heading.
The handoff model is conceptually clean. Instead of building explicit routing logic, you define agents that know about each other. When Agent A decides it needs Agent B’s expertise, it hands off the conversation. It’s a nice abstraction, but it can get opaque when you’re debugging why a particular handoff happened (or didn’t). OpenAI’s built-in tracing helps here — you get a visual timeline of every agent interaction, tool call, and handoff, which feeds directly into their eval and fine-tuning pipeline.
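The control flow underneath a handoff is easy to picture as a small loop. This is a toy stdlib sketch of the pattern, with hypothetical agent functions; OpenAI's actual SDK does the routing via the model's tool-calling, not string matching:

```python
# Toy sketch of the handoff pattern: each agent either answers or
# delegates to a named peer. Not OpenAI's actual Agents SDK API.

def triage(message):
    if "refund" in message:
        return ("handoff", "billing")  # delegate to the billing agent
    return ("answer", "Let me help with that directly.")

def billing(message):
    return ("answer", "Refund initiated.")

AGENTS = {"triage": triage, "billing": billing}

def run(message, agent="triage", max_hops=3):
    """Follow handoffs until some agent answers, with a hop limit."""
    for _ in range(max_hops):
        action, payload = AGENTS[agent](message)
        if action == "answer":
            return agent, payload
        agent = payload  # handoff: payload names the next agent
    raise RuntimeError("handoff loop exceeded max_hops")

print(run("I want a refund"))  # ('billing', 'Refund initiated.')
```

The hop limit is the part worth copying into real systems: without it, two agents that each think the other is responsible can bounce a conversation back and forth indefinitely.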
The catch: The SDK was Python-only until recently, and TypeScript support is still catching up. The “subagents” feature that would let you spawn child agents under a primary agent is announced but hasn’t shipped yet. If you need that pattern today, you have to build it yourself with handoffs, which is clunkier.
Model flexibility is surprisingly good for a vendor SDK. OpenAI opened it up to work with 100+ non-OpenAI LLMs via the Chat Completions API format. So while the SDK is optimized for OpenAI models, you can plug in other providers. Whether those providers work as smoothly as native OpenAI models is another question — tool calling and structured outputs can behave differently across providers, and the SDK doesn’t abstract those differences away.
No MCP support. If you’ve been investing in MCP servers for your tool integrations, this is a gap. OpenAI has its own tool ecosystem (function calling, code interpreter, file search), but it doesn’t interop with the MCP standard that Anthropic, AWS, and Google are converging on.
Best for: Teams that want strong guardrails and observability out of the box, especially for customer-facing agents where you need to validate outputs before they reach users. The tracing-to-eval pipeline is genuinely useful for iterating on agent behavior.
Strands Agents: Model-Agnostic and AWS-Native
Strands takes a fundamentally different position than the other three: model agnosticism isn’t a feature, it’s the core design principle. The SDK works with Amazon Bedrock, Anthropic’s API directly, OpenAI, Google Gemini, Ollama for local models, LiteLLM — basically anything that speaks an LLM API. AWS built it, but they deliberately didn’t lock it to Bedrock.
The “model-driven” philosophy means the SDK leans heavily on the foundation model’s native capabilities rather than wrapping them in framework-specific abstractions. You define an agent with a system prompt and tools, and the model’s own reasoning drives the loop. There’s less framework magic, which means less framework surprise.
Where Strands gets interesting is the multi-agent coordination patterns. It ships with three built-in patterns: Swarm (many agents working independently on similar tasks), Graph (explicit state machine routing, similar to LangGraph’s approach), and Workflow (sequential pipeline). Having these as first-class patterns rather than something you build yourself saves real development time.
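Stripped of the SDK machinery, the three patterns reduce to three different ways of composing agent calls. Here's a toy sketch where each "agent" is just a function; the pattern names come from Strands, but these implementations are illustrative stand-ins, not the SDK's actual classes:

```python
# Toy sketch of the three coordination patterns as pure functions.
# Illustrative only -- not the Strands Agents SDK's real API.

def swarm(agents, task):
    """Every agent works the same task independently."""
    return [agent(task) for agent in agents]

def workflow(agents, task):
    """Sequential pipeline: each agent's output feeds the next."""
    for agent in agents:
        task = agent(task)
    return task

def graph(nodes, start, task):
    """Explicit routing: each node returns (output, next_node_or_None)."""
    node = start
    while node is not None:
        task, node = nodes[node](task)
    return task

upper = lambda s: s.upper()
exclaim = lambda s: s + "!"
print(workflow([upper, exclaim], "ship it"))  # SHIP IT!
print(swarm([upper, exclaim], "go"))          # ['GO', 'go!']
```

The differences that matter in production are how results merge (swarm needs a reducer), how failures propagate (a workflow stalls at the failed stage), and whether routing is data-dependent (only the graph pattern supports that).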
The 14 million+ downloads since the May 2025 launch are impressive, and the community has built a substantial tools ecosystem — the tools repository has nearly 300 stars with contributions covering AWS services, databases, and common APIs.
Strands Labs, launched in early 2026, pushes into experimental territory: AI Functions (generating code at runtime from natural language specs), Strands Robots (connecting LLMs to physical hardware), and bidirectional streaming for real-time voice conversations. These are explicitly experimental, but they signal where AWS sees agent development heading.
The catch: Model agnosticism is a double-edged sword. When your SDK has to work with every model, it can’t deeply optimize for any one model’s strengths. Features like extended thinking, native tool use, or model-specific caching work better when the SDK knows exactly which model it’s talking to. Strands handles this through provider-specific configuration, but you’ll notice the difference in polish compared to Claude Agent SDK’s tight integration with Claude or OpenAI Agents SDK’s tight integration with GPT models.
Native MCP support is solid. If you’re already running MCP servers, Strands picks them up cleanly.
Best for: Teams on AWS infrastructure who want model flexibility, or anyone who needs to run different models for different tasks within the same agent system (e.g., Claude for complex reasoning, a smaller open-source model for simple classification). Also the strongest choice if you need multi-agent coordination patterns out of the box.
Google ADK: The Enterprise Framework That Thinks in Systems
Google’s Agent Development Kit arrived with the most ambitious scope. It’s not just an SDK — it’s a framework for building, evaluating, and deploying agent systems at enterprise scale. The fact that it’s the same framework running inside Google’s Agentspace, Customer Engagement Suite, and other internal products gives it a certain gravity.
The multi-language support is the broadest of the four: Python, Java, Go, and TypeScript. The Java SDK reaching 1.0 with native A2A (Agent-to-Agent) protocol support matters a lot for enterprise Java shops that have historically been an afterthought in the AI tooling space.
Context management is where ADK’s engineering shows. The framework treats context like source code — sessions, memory, tool outputs, and artifacts are assembled into a structured view where token usage is tracked and optimized automatically. It filters irrelevant events, summarizes older turns, and lazy-loads artifacts. For long-running enterprise agents that process dozens of documents across multiple sessions, this matters more than you’d think.
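The core idea, budgeted assembly of the newest relevant events with older ones summarized away, can be sketched in a few lines. Everything here (the 4-chars-per-token estimate, the function names) is illustrative; ADK's real context machinery is far richer and tracks actual token counts:

```python
# Toy sketch of budgeted context assembly: keep the newest events
# that fit a token budget, summarize the rest. Not ADK's real API.

def rough_tokens(text):
    return max(1, len(text) // 4)  # crude 4-chars-per-token estimate

def assemble(events, budget_tokens):
    kept, used = [], 0
    for event in reversed(events):  # walk newest-first
        cost = rough_tokens(event)
        if used + cost > budget_tokens:
            kept.append(f"[{len(events) - len(kept)} older events summarized]")
            break
        kept.append(event)
        used += cost
    return list(reversed(kept))  # restore chronological order
```

Walking newest-first is the important choice: recent turns are almost always more relevant than old ones, so the budget is spent on them first and the summary absorbs whatever is left over.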
The A2A protocol support deserves attention. While MCP handles tool integration (connecting agents to databases, APIs, services), A2A handles agent-to-agent communication across different frameworks and languages. ADK for Java natively supports A2A, meaning your ADK agent can talk to agents built in completely different SDKs. If you’re building a system where multiple teams own different agents, this is the interop layer you need.
ADK’s deployment story is the tightest of the four if you’re on Google Cloud. Deploy to Agent Runtime, Cloud Run, or GKE, and you inherit managed infrastructure, authentication, Cloud Trace observability, and enterprise security with zero additional configuration. But this cuts both ways.
The catch: ADK is Gemini-first. It technically supports other models, but the integration depth with Gemini — especially around multimodal capabilities, grounding with Google Search, and Vertex AI features — creates a strong gravitational pull toward the Google ecosystem. The framework is also the heaviest of the four. If you want to build a simple single-agent tool, ADK’s enterprise scaffolding can feel like driving a semi-truck to the grocery store.
Best for: Enterprise teams on Google Cloud building multi-agent systems that need to interop with agents from other teams or vendors. The A2A protocol support and multi-language coverage make it the natural choice for large organizations with diverse tech stacks.
The Lock-in Question Nobody Wants to Talk About
Here’s the honest version of the model flexibility conversation: every SDK works best with its own vendor’s models, regardless of what the compatibility list says.
Claude Agent SDK only works with Claude. That’s transparent, at least. You know exactly what you’re signing up for.
OpenAI Agents SDK supports 100+ models but is optimized for GPT. Features like structured outputs and tool calling work most reliably with OpenAI models. Third-party model support is “it works” rather than “it works great.”
Strands Agents is genuinely model-agnostic in practice, but it’s built by AWS and integrates most deeply with Bedrock. If you’re using Strands with Bedrock, you get model selection, fine-tuned models, and AWS service integrations that you don’t get through other providers.
Google ADK technically supports multiple models, but the Vertex AI integration, Google Search grounding, and Gemini-specific features create a clear path of least resistance toward the Google stack.
The practical advice: pick the SDK that matches the cloud provider and model you’re already using. If you’re an Anthropic shop, Claude Agent SDK. If you’re OpenAI, their Agents SDK. If you’re on AWS with Bedrock, Strands. If you’re on GCP, ADK. The model flexibility features are insurance policies, not primary use cases.
What About Orchestration Frameworks?
If you’ve been following the agent space, you might wonder where CrewAI, LangGraph, and AutoGen fit relative to these SDKs. They’re solving a different problem at a different layer.
These vendor SDKs give you the primitives: an agent loop, tool execution, context management, and model integration. Orchestration frameworks give you higher-level patterns: role-based agents, graph workflows, conversation-based collaboration.
You can use both. Run LangGraph on top of Claude Agent SDK, or use CrewAI with OpenAI’s Agents SDK as the underlying engine. The SDKs handle the model interaction and tool execution; the frameworks handle the multi-agent orchestration logic.
That said, all four SDKs are adding their own multi-agent features — handoffs in OpenAI, subagents in Claude, swarm/graph/workflow in Strands, multi-agent composition in ADK — which means the line between “SDK” and “framework” is blurring fast. If your orchestration needs are straightforward, you might not need a separate framework at all.
Decision Matrix: Cut Through the Noise
Stop reading comparison tables and ask yourself these three questions:
1. What cloud provider are you on? This is the strongest signal. AWS → Strands. GCP → ADK. Not locked to a cloud → Claude Agent SDK or OpenAI Agents SDK based on which model you prefer.
2. How many models do you need to support? One model (Claude or GPT) → use that vendor’s SDK. Multiple models → Strands. Multiple models across multiple languages → ADK.
3. What are you building? Developer tools / code agents → Claude Agent SDK (best built-in file and code tools). Customer-facing agents → OpenAI Agents SDK (strongest guardrails). Data pipeline agents on AWS → Strands. Enterprise multi-team agent systems → ADK.
A Note on Maturity
None of these SDKs is mature in the way that, say, React or Django are. They’re all less than 18 months old. APIs will change. Features will be added and deprecated. The documentation ranges from “pretty good” (Claude Agent SDK, ADK) to “you’ll be reading source code” (parts of Strands and OpenAI).
Build with the understanding that you’ll need to update your integration at least quarterly. Abstract your business logic away from SDK-specific code where reasonable — not because you’re planning to switch SDKs (you probably won’t), but because the SDK you chose today will ship breaking changes within six months.
The vendor-backed SDK era is still early. But it’s already clear that these four players are shaping how production agents get built in 2026 and beyond. Pick the one that fits your stack, build something real with it, and worry about the rest later.