Subject Agent Frameworks
Date Feb 2026
Section 05 / 07
05

Agent SDK to Agent Server: crossing the service boundary

An agent can be many thing: ephemeral or long-lived, stateful or stateless, automating processes behind-the-scene or user-facing.

How do these behavioural features map with technical capabilities provided by the agent frameworks? And the other way around, OpenCode has a server-client architecture while other frameworks are libraries: how does that matter when you want to build an agent

In this part, I'm looking at how agent behaviours and agent implementation details are related. In particular what technical layers need to be implemented to go from an Agent SDK to an Agent Server (OpenCode).

What's an Agent "SDK" anyway?

Libraries and services

Think of the difference between Excel and Google Sheets.

An Excel spreadsheet lives on your machine. Nobody else can see it while you're working. It exists on your machine and only your machine.

Google Sheets lives on Google's servers. You open it in a browser, but the spreadsheet is not on your machine. You can close your browser and it's still there. You can open it from your phone, from another laptop. It keeps running whether or not you're connected.

Excel behaves in this example like a library, it's embedded. Google Sheets is "hosted": it lives behind the service boundary. It's a service.

The lifecycle of a service is not bound to the lifecycle of the client that is calling it. The service boundary is not just about separate physical machines — it is about whether a capability runs inside an application or as a separate, independent process. An application calls a library directly; it connects to a service over a protocol.

A more technical example: databases.

SQLite is embedded. Your application links the library, calls functions directly. No service boundary. When your app exits, SQLite exits.

PostgreSQL is hosted. It runs as a separate server process. Your application connects over a socket, sends SQL as messages, receives results. Service boundary. PostgreSQL keeps running after your app disconnects.

What is the difference between an Agent SDK and a regular coding agent?

What's the difference between Claude Agent SDK and Claude Code, between Codex SDK and Codex, between Pi coding agent and Pi SDK?

An Agent SDK provides the same kind of capabilities you would expect from a coding agent — but as a "programmable interface" (API) instead of a user interface

# Send a prompt to the Claude Agent SDK with a list of allowed tools
from claude_agent_sdk import query

async for message in query(
    prompt="Run the test suite and fix any failures",
    options={"allowed_tools": ["Bash", "Read", "Edit"]}
):
    print(message)

With an Agent SDK, you may:

Example: automated code review in CI.

Example: agentic search in a support app.

In both cases, the agent runs within the host process. It starts, does its work, and stops. No independent lifecycle. No reconnection. No background continuation.

How is an Agent Server different from an Agent SDK?

The Agent Server use case

If you want to build a ChatGPT clone, an Agent SDK is a start. But it's not enough.

You need the agent's lifecycle to be decoupled from the client's so that you can:

You cannot just put the SDK on a server and call it done. The SDK gives you the agent loop. It does not handle what comes with running a process that other people connect to over a network:

Agent-specific server capabilities

Authentication and network resilience need to be thought through for any client-server application. Agents require additional layers:

Transport: how the user's browser (or app) talks to the agent server. You build an HTTP server that accepts requests and returns agent output. The question is how much real-time interaction you need. There are multiple options of growing complexity from standard HTTP request/response (the user submits a task and waits for the complete result: no progress updates while the agent works) to Websocket. See focus on the Transport layer in Part 5 for more details.

Routing: how each message reaches the right conversation. You build this by assigning a session ID to each conversation and maintaining a registry — a lookup table that maps session IDs to agent processes. When a message comes in, the server looks up the session ID and forwards the message to the right place.

Persistence: how conversations can be accessed and resumed later. You build this by "persisting" the conversation state (messages, context, artifacts). Unless the runtime is run without interruption that means saving the state and reloading it when the user reconnects. Part 5 shows how different projects solve this differently.

Lifecycle: what happens when the user closes the tab while the agent is working. When the agent runs inside the request handler, when the user disconnects, the connection closes and the agent stops. For longer tasks, you need the agent to survive disconnection. To do so, first you need to separate the agent process from the request handler. The agent runs in its own container or background process, not inside the HTTP handler.

SDK to Agent Server layers

OpenCode: the only Agent Server

OpenCode ships as a server with most layers built in.

Layer OpenCode provides What it does not provide
Transport HTTP API + SSE streaming. Client sends prompts via POST, receives output via SSE. No WebSocket. SSE is one-way — the client cannot send messages while the agent is streaming without making a separate HTTP request.
Routing Full session management — create, list, fork, delete conversations. Each session has an ID. Sessions are scoped to one machine. No global registry for routing across multiple servers or sandboxes.
Persistence Sessions, messages, and artifacts saved to disk as JSON files. Restart the server and conversations are still there. Persistence is tied to the local filesystem. If the machine or sandbox is destroyed, the files are gone. No external database, no durable state across environments.
Lifecycle Server continues running when client disconnects. Agent keeps processing. Reconnect with opencode attach. No recovery from server crashes — in-flight work is lost. No job queue, no supervisor, no automatic restart.
Multi-client Multiple SSE clients can watch the same session simultaneously. Only one client can prompt at a time (busy lock). No presence awareness, no real-time sync between clients. Multiple viewers, single driver.
Authentication Optional HTTP Basic Auth. No tokens, no user identity, no multi-tenant isolation, no fine-grained permissions.

What to keep in mind