Field notes · 2026-05-23 · 6 min read

Anthropic's hosted agents, what they actually are, and when to use them in your business

Anthropic now runs the agent runtime for you. Here is what that means in plain English, where it is the right call for a business, and where you still want a custom build.

Anthropic shipped hosted agents and the question we have been getting all week from operators is the same one: do I still need a custom build?

The honest answer is: sometimes no, sometimes yes, and the line is more about your data and your latency than your headcount. Here is the version we wish someone had handed us before we spent two weeks evaluating it.

What "hosted agents" actually means

Hosted agents are a managed runtime. You point at Claude, give it a system prompt and a set of tools, and Anthropic runs the loop. They handle the model calls, the tool dispatch, the conversation state, and the retries. You get an API that takes a request and returns an outcome. No Lambdas, no queues, no orchestration framework, no infrastructure to babysit.

Google has the same shape under Vertex AI Agent Builder. OpenAI has it under Assistants and the newer Responses API. All three labs are converging on the same idea: the model loop is now somebody else's problem if you want it to be. The differences are in compliance posture, latency, ecosystem maturity, and what they let you wire in. We pick across all three depending on the job, more on that in the comparison piece.

For a long time the path to a production agent looked like: pick a framework (LangChain, LlamaIndex, your own), stand up a runtime, wire observability, write the eval harness, deal with timeouts, retry semantics, partial failures, token accounting, and prompt caching. That is two to four engineering weeks before you have anything to show a buyer.

Hosted agents collapse most of that into a config file. The model loop is now somebody else's problem.

When hosted agents are the right call

We have shipped a half-dozen agent integrations across dental, dispatch, retail, and nonprofit. The pattern we keep seeing for when hosted is the right call:

The work is stateless or the state lives in your existing systems. If your agent's job is "read a ticket, classify it, escalate or close" and the ticket lives in your existing helpdesk, hosted is great. The agent calls your helpdesk API, the state stays where it belongs, you do not need a side database.

The tool surface is narrow and well-defined. Five to fifteen tools wired to existing APIs is the sweet spot. Anthropic's hosted runtime handles the tool-calling protocol with the same accuracy as a self-hosted one, and you do not have to maintain it.

Your latency budget is generous. Hosted agents add a network hop. If your user is waiting on a single AI response at chair-side, in the cab of a truck, or on a call, the extra 200 to 600ms can be the difference between "useful" and "annoying." For background processing or async workflows, latency is a non-issue.

You do not need exotic orchestration. Sub-agents, dynamic agent spawning, custom retry policies, manual context window management, all of that is more flexible in your own runtime. If you do not need it, the hosted runtime is faster to ship and cheaper to maintain.

You want the eval and observability for free. Hosted comes with trace inspection and pricing visibility built in. For a first-version agent, that is real value.

When you still want a custom build

Three cases.

Compliance demands isolation. Most of our healthcare and finance work runs on Google Vertex AI under a BAA specifically so we can prove where the data lives and that it never crosses the boundary. Smile PreVue runs on Vertex AI Gemini 3 Pro for this reason. The HIPAA-grade audit log is a contractual deliverable, not a nice-to-have. Anthropic's hosted agents are not yet positioned for this kind of regulated work. Vertex is. If your sector requires a signed agreement on data residency, default to Google for now and check the compliance roadmap on Anthropic and OpenAI every quarter, because both are moving fast.

Your orchestration is the product. If the agent you are building is doing something that no off-the-shelf runtime can model, chained sub-agents with different model providers, dynamic skill loading per tenant, custom rollback semantics on tool failures, you will out-grow a hosted runtime within a quarter. The multi-tenant nonprofit platform we are building does AI attribution where the orchestration logic is the moat. Hosted does not fit that.

Latency is non-negotiable. When a driver is waiting on a route decision or a clinician is waiting on a smile preview, every network hop matters. Self-hosted next to the API endpoint cuts a lot of wall time.

You need control over the retry and timeout policy. Hosted runtimes have sensible defaults. If your business logic needs different defaults, "fail loud after 3 seconds, do not retry" or "retry forever with exponential backoff because this job has to complete", you want your own loop.

What this means for how we scope projects

We re-priced our AI Integration track around this last week. Three of the four engagement shapes look different now:

Greenfield agent for a new workflow. Default to hosted. Six weeks to a thin slice instead of ten. The savings are real and we pass them through.

Replacing a vendor that is not working out. Default to hosted unless the vendor's stickiness is data-portability rather than orchestration. Most of these were over-engineered to begin with.

Adding AI to an existing critical workflow. Default to custom. Latency, compliance, and integration with existing observability usually make hosted a worse fit. The cost of getting the integration wrong is too high.

AI as a feature inside a SaaS we are building. Default to hosted for the first version. Plan to migrate to custom by the time you have ten production tenants if the orchestration grows complex. The optionality is worth more than the small savings of starting custom.

What we would deploy for a law firm tomorrow

We get asked this often enough that it deserves a concrete answer.

A small-to-mid law firm wants an AI that can read a new client intake form, pull relevant prior cases from the firm's matter management system, draft a conflicts-check summary, and surface anything that looks like a red flag to a partner.

On hosted, that is roughly: one agent, five tools (intake fetch, matter search, conflicts check, draft summary, partner alert), one system prompt that defines the conflicts-check rubric and the red-flag taxonomy. Two-week build, one-week eval and tuning, two-week firm pilot. Total: five weeks, well under a typical custom integration budget.

The reason hosted works here: the data lives in the matter management system, the latency budget is generous (a partner reads the summary on Monday morning, not in real time), the workflow is well-defined, and the orchestration is straightforward.

If we were building the same thing for a 200-attorney firm with custom internal systems and PII handling requirements, the answer would be custom and the timeline would be ten weeks. Same agent shape, different runtime.

The trap to avoid

The cheapest version of every agent is the one nobody is using six months later because it never quite earned the next workflow. The discount on hosted vs. custom is real, but it disappears the first time you have to migrate or the first time it fails an audit.

When we recommend hosted, it is because we think the agent will still be the right architecture in two years. When we recommend custom, it is because hosted's defaults, somewhere we cannot see, will cost more in remediation than the engineering hours we are spending up front.

What this means for you, today

If you are evaluating Anthropic's hosted agents for a workflow at your business and want to know whether it is the right call: that is exactly what our 60-minute consultation is for.

Show us the workflow. We will tell you in an hour whether hosted, custom, or "do not build this with AI yet" is the right answer. If we are not the right studio for it, we will say so and the call is refundable.

Book a consultation · Read the approach

hosted agentsAnthropicagent orchestrationAI integrationbuild vs buy

Liked this?

Tell us what is broken. We’ll tell you what the first week looks like.

Next read →

The Interview, Analyze, Execute method, how we approach every engagement