The chatbot era is basically over, and nobody sent a memo. Sometime between January and May 2026, four of the biggest AI vendors stopped selling you something that talks and started selling you something that does. Anthropic shipped Claude Cowork. Microsoft folded Copilot Cowork into Microsoft 365. OpenAI’s ChatGPT Agent went from a Pro-only curiosity to a checkbox on the $20 plan. And at I/O on May 19, Google announced Gemini Spark, a 24/7 agent that keeps working when your laptop is closed.
So now you’ve got four products that all promise the same thing: give it a goal, walk away, come back to finished work. The pitch is identical. The reality is not. Each one is good at a genuinely different job, and each one has a spot where it falls on its face. I’ve spent enough time with all four to tell you which is which.
What “agent” actually means now
Quick reset, because the marketing has muddied this. A chatbot answers. An agent executes — it takes multiple steps, calls tools, touches your apps and files, and produces a deliverable instead of instructions for making one. The dividing line is whether the thing can act without you babysitting each step.
All four contenders cross that line. Where they diverge is where they run and what they can touch:
- Gemini Spark runs in the cloud, 24/7, grounded in your Google account.
- ChatGPT Agent runs in a cloud browser sandbox, app-agnostic.
- Claude Cowork runs on your actual desktop, touching your real local files.
- Copilot Cowork runs inside the Microsoft 365 boundary, grounded in your work graph.
That “where it runs” detail isn’t trivia. It decides everything downstream — what data it sees, what it can break, and who it’s actually for.
Gemini Spark: the always-on one
Spark is the newest and the strangest of the four, in a good way. Google built it on Gemini 3.5 Flash paired with the Antigravity agent harness, and the headline trick is that it runs on dedicated VMs in Google Cloud rather than on your device. Practical translation: you can hand it a task, turn off your phone, and it keeps going. Sundar Pichai’s framing was “your personal AI agent that helps you navigate your digital life, taking action on your behalf and under your direction.” Background execution is the whole point.
It plugs straight into your Google world — Gmail, Drive, Docs, Sheets, Photos, Search, YouTube history — through an opt-in menu where you choose what it can read. Ask it to pull facts from your emails and last quarter’s spreadsheet and draft a client update, and it’ll do that without you uploading anything. It also speaks MCP, so third-party app connections are coming, with Google promising more over the next few months.
Here’s the catch, and it’s a big one as of June 2026: Spark is barely available. It launched in beta to U.S. Google AI Ultra subscribers — that’s the $99.99/mo plan, or the $200/mo tier for heavy users — and even then it’s rolling out to trusted testers first, not GA. Flash is also a speed-tuned model, not Google’s deepest reasoner, so Spark is quick and great at the routine stuff but not where you send a genuinely hard, ambiguous problem. It’s the most ambitious architecture of the four and the one you’re least likely to be able to use today.
ChatGPT Agent: the generalist that goes anywhere
ChatGPT Agent is the one that doesn’t care what ecosystem you live in. It drives a virtual browser — clicking, typing, filling forms, navigating sites — which means it can act across more or less any web app rather than just the ones in one company’s walled garden. Book a thing, scrape a comparison, fill out a tedious portal, pull data from six sites and reconcile it: this is its turf.
It’s also riding the strongest general reasoning of the bunch, in my experience. For open-ended research — synthesize forty sources, cite them, find the contradiction — nothing here beats it. And it’s the cheapest entry point: Agent Mode is included on ChatGPT Plus at $20/mo, which a year ago would’ve sounded absurd. If you want priority on the heavier computer-use work, the Pro tiers ($100/mo as of April 9, and $200/mo) bump your quotas and access.
The browser sandbox is the limiter. Because it’s working through a clean virtual browser, it doesn’t have your logged-in sessions, your local files, or your desktop apps unless you explicitly wire them up. Anything behind a login it hasn’t been handed is a wall. It’s brilliant at navigating the open web and clumsy at touching the stuff that lives only on your machine — which is the exact inverse of the next one.
Claude Cowork: the one that works on your actual computer
Cowork is Anthropic taking the engine behind Claude Code — file access, multi-step execution, tool use — and wrapping it for people who aren’t developers. It launched as a research preview in January 2026, hit Windows on February 10 (after a macOS-only stretch), and went generally available in April. It runs as a desktop app and works on your real local files: you point it at a folder, give it read/edit/create permission, and it produces finished deliverables on disk instead of telling you how to make them.
This is the one I reach for when the work is files. Restructure a thirty-tab spreadsheet, reconcile a folder of messy CSVs, turn a pile of notes into a formatted report, refactor a codebase — Cowork operates where that work actually lives. Anthropic explicitly aimed it at non-developers whose days are full of tasks that are time-consuming but not technically hard: analysts, ops, legal, finance. That targeting shows. It feels less like a chat window and more like handing a competent contractor your laptop.
The local-execution model is its strength and its tax. An agent with read/edit/create rights on your files is exactly as powerful and as nerve-wracking as it sounds — you want to scope its folders carefully and watch what it does the first few times. It also leans on your machine and your attention more than the cloud agents; it’s not the “close the laptop and forget it” experience Spark is going for. Different philosophy. One trusts the cloud, the other trusts your desktop.
Copilot Cowork: the one that already knows your work
Copilot Cowork is the enterprise play, and it’s almost unfair how much context it starts with. It’s an agentic layer living natively inside Microsoft 365 — Outlook, Word, Excel, Teams — grounded in your organization’s work graph. It can send emails, schedule meetings, draft documents, and build things directly inside the apps your company already runs on. The Excel integration is the standout: with Wave 3’s Agent Mode (rolled out March 2026) it builds formulas, tables, and conditional formatting inside a live spreadsheet and explains its reasoning as it goes.
The real selling point is governance. Copilot Cowork inherits your 365 tenant’s entire security posture — Entra ID, DLP policies, sensitivity labels, conditional access. Your data never leaves the Microsoft cloud boundary. For a regulated company, that one fact outweighs almost everything else on this page. It also asks for approval before sensitive actions and lets you pause, resume, or cancel mid-task, which is the kind of control a compliance team actually wants. Pricing is $30/user/mo as a Microsoft 365 Copilot add-on — incremental if you’re already on 365.
Where it stops is the open internet. Copilot Cowork has limited web access; its native habitat is your internal 365 graph, not the wider web. For open-ended research across the public internet, it’s the weakest of the four — which is the precise mirror image of ChatGPT Agent’s strength. And Microsoft now lets you run Claude inside Copilot via multi-model selection, so the lines between these products are starting to smudge.
Head-to-head, by the job you actually have
Forget the spec sheets. Here’s who I’d hand each task to.
Inbox triage and email drafting. Spark and Copilot Cowork tie here, because both are grounded in your actual mail (Gmail for one, Outlook for the other) and can pull context across your messages and docs to draft a real reply. ChatGPT Agent can do it but needs to be given access. Claude Cowork is the odd one out unless your email lives in local files.
Calendar and scheduling. Same split. Spark (Google Calendar) and Copilot (Outlook) win because the calendar is right there in their grounding. The other two need wiring up.
Multi-step research across the open web. ChatGPT Agent, and it’s not close: strongest reasoning, real browser, cited sources. Spark is competent and fast. Copilot Cowork is the wrong tool; it barely sees the public internet.
Working inside SaaS apps you log into. ChatGPT Agent, because the browser sandbox can navigate anything you hand it credentials for. Spark’s MCP connections will eventually compete here; today the partner list is still filling in.
Heavy file and document work. Claude Cowork. It’s the only one operating directly on your local disk, and for spreadsheet surgery, document generation, or anything code-adjacent it’s in a different league. Copilot’s Excel mode is excellent but stays inside Microsoft’s files.
Coding and long technical tasks. Claude Cowork, by inheritance from Claude Code. ChatGPT’s Codex Agent is a real competitor; the other two aren’t really aiming here.
So which one do you pay for?
The honest answer is that the question is mostly settled by where your data already lives.
If your work runs on Microsoft 365 and you answer to a compliance team, it’s Copilot Cowork. The governance story and the native Office grounding are worth more than any benchmark, and at $30/user/mo on top of a license you already hold, it’s the path of least resistance. Just don’t expect it to research the open web.
If you live in Google (Gmail, Drive, the whole thing) and you want the most futuristic version of this idea, Gemini Spark is the one to watch. The 24/7 background execution is genuinely a different model of working. The asterisk is availability: it’s beta, U.S.-only, and gated behind the $99.99/mo Ultra plan as of June 2026. For most people that’s “soon,” not “now.”
If you don’t want to marry an ecosystem, ChatGPT Agent is the pragmatic pick and the cheapest serious one at $20/mo. It goes anywhere on the web, has the strongest general reasoning in my experience, and doesn’t assume your life lives in any one company’s cloud. The browser sandbox just means it won’t touch your local machine.
And if your work is files and documents more than apps and inboxes, Claude Cowork is the one that actually does the work where the work is. It’s the least flashy framing — no 24/7 cloud agent, no enterprise graph — and for a huge amount of real knowledge work it’s also the most directly useful.
One privacy note worth chewing on before you commit. These four make opposite bets about your data: Copilot Cowork keeps it inside your tenant, Claude Cowork keeps it on your desktop, while Spark and ChatGPT Agent send it to a cloud VM to run. That’s not a footnote; it’s the actual decision. The smartest agent for your inbox might be the one you’re least comfortable handing your inbox to.
If you’re just starting, grab the $20 ChatGPT plan and give Agent Mode a genuinely annoying multi-step task this week — the kind you’ve been putting off. It’s the lowest-commitment way to feel where the chatbot ends and the agent begins. Then decide whether you want one that lives in your files, your office suite, or the cloud.
Sources: TechCrunch — Gemini Spark, Engadget — Google AI Ultra pricing, Anthropic — Claude Cowork, Microsoft — Copilot vs ChatGPT Enterprise, OpenAI — ChatGPT Pricing