# Building a Unified Orchestrator for Agentic Flow with A2A + MCP
Modern AI applications rarely live in isolation. They need to delegate work to other agents, call external tools, and stream progress back to users in real time—keeping humans in the loop on what the agent is doing in the background. This post walks you through a practical proof‑of‑concept architecture that does exactly that: a unified orchestrator that fronts a React frontend, plans execution of user queries, and bridges two ecosystems—Agent‑to‑Agent (A2A) protocol servers and Model Context Protocol (MCP) servers—while streaming live updates end to end.
## Why unify A2A and MCP?
- **Broader tool surface**: A2A servers expose agent capabilities via JSON‑RPC and stream updates with SSE; MCP servers expose typed tools and schema‑driven calls. Together, they unlock both agent collaboration and structured tool execution from a single orchestrator.
- **One client contract**: The frontend speaks only WebSocket to the orchestrator. Everything else—planning, routing, streaming, and auth—happens behind the scenes.
- **Zero‑friction expansion**: Add new A2A agents by publishing an `agent.json`; add new MCP servers via a URL. The orchestrator discovers and registers tools at startup.
- **Scaling philosophies**: A2A scales *horizontally* (one agent, one job), while MCP scales *vertically* by adding more tools to a single server. They embody two different scaling philosophies, yet the orchestrator lets them complement each other.
## The high-level architecture
- **React frontend**: Connects to the orchestrator over WebSocket and sends tasks as simple JSON payloads with an optional bearer token.
- **Unified orchestrator**:
- Maintains chat history per connection.
- Initializes a dispatcher agent (LLM) with tools discovered from A2A and MCP.
- Plans execution: decides whether a query is simple (single tool), complex (multi‑step), or compound (requiring multiple tools/agents).
- Routes to the right A2A agent or MCP tool, or executes a multi‑step plan.
- Bridges streaming from tools/agents back to the UI in real time.
- **A2A agents**: Receive JSON‑RPC task requests; stream execution events via SSE.
- **MCP servers**: Provide tool catalogs over SSE; the orchestrator executes them directly.
![Unified orchestrator architecture](/images/blog/unified-gateway-architecture.svg)
Typical flow:
1. Frontend → Orchestrator (WebSocket): send `SEND_TASK` with optional token.
2. Orchestrator → A2A (JSON‑RPC): forwards `Authorization: Bearer <token>` if present.
3. A2A → Orchestrator (SSE): streams status/messages; orchestrator forwards them over the same WebSocket.
4. Orchestrator ↔ MCP: discovers tools via SSE and calls them directly when selected by the dispatcher.
5. Orchestrator → Frontend (WebSocket): streams thought/tool/text events and a final unified answer.
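The WebSocket contract behind steps 1–5 can be sketched as a pair of TypeScript types plus a runtime guard. The event names match those used throughout this post; any field names beyond them are assumptions for illustration:

```typescript
// Frame the frontend sends over the WebSocket.
type ClientMessage = {
  type: "SEND_TASK";
  payload: { message: string };
  token?: string; // optional bearer token, forwarded only to A2A JSON-RPC calls
};

// Frames the orchestrator streams back (see "Event types you'll see" below).
type ServerEvent =
  | { type: "thought"; message: string }
  | { type: "tool_start"; tool: string }
  | { type: "tool_status_update"; tool: string; status: string }
  | { type: "tool_end"; tool: string }
  | { type: "text_chunk"; chunk: string }
  | { type: "final_answer"; message: string }
  | { type: "error"; message: string };

// Minimal runtime guard the orchestrator can apply before processing a frame
// (the real system uses Zod for this validation step).
function isClientMessage(value: unknown): value is ClientMessage {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    v.type === "SEND_TASK" &&
    typeof v.payload === "object" &&
    v.payload !== null &&
    typeof (v.payload as Record<string, unknown>).message === "string" &&
    (v.token === undefined || typeof v.token === "string")
  );
}
```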
## Frontend: minimal, real-time UX
Your React app only needs a WebSocket and a couple of handlers to receive live updates.
```tsx
import { useEffect, useRef, useState } from "react";

export function Chat() {
  const [messages, setMessages] = useState<string[]>([]);
  const wsRef = useRef<WebSocket | null>(null);

  useEffect(() => {
    const ws = new WebSocket("ws://localhost:4000");
    wsRef.current = ws;
    ws.onmessage = (evt) => {
      const event = JSON.parse(evt.data);
      if (event.type === "text_chunk") {
        setMessages((m) => [...m, event.chunk]);
      } else if (event.type === "tool_status_update") {
        // Optionally show tool-level progress
      } else if (event.type === "final_answer") {
        setMessages((m) => [...m, "\n---\n" + event.message]);
      } else if (event.type === "error") {
        setMessages((m) => [...m, `Error: ${event.message}`]);
      }
    };
    return () => ws.close();
  }, []);

  const sendTask = (text: string, token?: string) => {
    wsRef.current?.send(
      JSON.stringify({
        type: "SEND_TASK",
        payload: { message: text },
        token: token?.trim() || undefined,
      })
    );
  };

  return (
    <div>
      {messages.map((m, i) => (
        <p key={i}>{m}</p>
      ))}
      <button onClick={() => sendTask("Hello")}>Send</button>
    </div>
  );
}
```
- **Event types you’ll see**: `text_chunk`, `tool_status_update`, `tool_start`, `tool_end`, `thought`, `final_answer`, `error`.
- **Auth**: Provide a bearer token per message if needed; the orchestrator forwards it only on A2A JSON‑RPC requests, never on SSE streams.
## Orchestrator internals: planning, routing, and streaming
Once a user sends a query, the orchestrator becomes the hub of activity. It accepts the incoming `SEND_TASK` event over WebSocket, validates it (using Zod), and attaches the message to a per‑connection chat history. From here, the orchestrator decides how best to move forward.
At the center is the **dispatcher agent**, a tools‑enabled LLM running at temperature 0. This dispatcher is responsible for reasoning about the query:
- If the task is straightforward, the dispatcher routes it directly to the most suitable tool.
- If the query is complex or compound, it invokes a planning mode. The orchestrator drafts a step‑by‑step plan, executes those steps in order, and then synthesizes a unified final answer for the user. Compound queries might involve both A2A agents and MCP tools working together.
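The multi‑step execution loop can be sketched roughly as follows. In the real orchestrator the dispatcher LLM produces the plan and selects tools; here both are passed in explicitly so the control flow is visible, and the types are illustrative:

```typescript
// One step of a drafted plan: which tool to call and with what input.
type PlanStep = { tool: string; input: string };
// A registered tool (A2A agent or MCP tool) wrapped as an async callable.
type Tool = (input: string) => Promise<string>;

// Execute plan steps in order, reporting progress as each step starts.
// The collected results are what the dispatcher later synthesizes into
// a unified final answer.
async function executePlan(
  steps: PlanStep[],
  tools: Map<string, Tool>,
  onProgress: (msg: string) => void
): Promise<string[]> {
  const results: string[] = [];
  for (const [i, step] of steps.entries()) {
    const tool = tools.get(step.tool);
    if (!tool) throw new Error(`Unknown tool: ${step.tool}`);
    onProgress(`Step ${i + 1}/${steps.length}: ${step.tool}`);
    results.push(await tool(step.input));
  }
  return results;
}
```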
### Dynamic tool discovery
The orchestrator is never locked to a static set of tools. Instead it discovers them dynamically:
- A2A discovery: By reading `AGENT_CARD_URLS`, the orchestrator fetches each agent’s agent card from `/.well-known/agent.json`. Each card is registered as a callable tool with a simple schema like `{ input: string }`. When invoked, the orchestrator sends a JSON‑RPC `tasks/send`, then subscribes to `/events/:taskId` via SSE to stream progress back to the user.
- MCP discovery: From `MCP_SERVER_URLS`, the orchestrator connects to servers that speak the Model Context Protocol. It calls `listTools` over SSE and wraps each tool with its published JSON schema. This means new MCP tools become immediately usable without custom integration code.
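The A2A side of that discovery step might look like the sketch below: each fetched agent card is wrapped into a tool entry the dispatcher can select. The `AgentCard` and `ToolSpec` shapes are assumptions for illustration; only the `name`/`url` card fields and the `{ input: string }` schema come from the description above:

```typescript
// Minimal view of an A2A agent card (agent.json).
type AgentCard = { name: string; url: string; description?: string };

// The tool entry the dispatcher chooses from (shape is illustrative).
type ToolSpec = {
  name: string;
  description: string;
  schema: { input: "string" }; // A2A tools take a single string input
  kind: "a2a" | "mcp";
};

// Wrap one agent card as a dispatcher-callable tool at startup.
function registerAgentCard(card: AgentCard): ToolSpec {
  return {
    name: card.name,
    description: card.description ?? `A2A agent at ${card.url}`,
    schema: { input: "string" },
    kind: "a2a",
  };
}
```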
### Streaming back to the user
The orchestrator doesn’t wait until the end to respond. As partial outputs arrive—from A2A event streams or MCP calls—it forwards them directly over WebSocket. Users see tool status updates, partial text, and progress messages in real time, making the system feel alive and transparent.
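The SSE-to-WebSocket bridge reduces to a small translation step: each A2A task event becomes one client frame and is written straight to the socket. The A2A event field names below are assumptions; the outgoing frame types match the event list earlier in this post:

```typescript
// Simplified view of events arriving on an A2A task's SSE stream
// (field names are illustrative, not the exact wire format).
type A2AEvent =
  | { kind: "status"; state: string }
  | { kind: "message"; text: string };

// Translate one A2A event into the JSON frame sent over the WebSocket.
function toClientFrame(toolName: string, ev: A2AEvent): string {
  switch (ev.kind) {
    case "status":
      return JSON.stringify({
        type: "tool_status_update",
        tool: toolName,
        status: ev.state,
      });
    case "message":
      return JSON.stringify({ type: "text_chunk", chunk: ev.text });
  }
}
```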
## A2A protocol in practice
The A2A protocol is designed around simple contracts:
- **Agent card**: a JSON file describing an agent’s base URL, name, and optional skills.
- **Task lifecycle**:
- `tasks/send`: create a task and receive a `taskId`.
- `/events/:taskId`: stream progress and messages via SSE until the task completes.
- `tasks/get`: poll for status if streaming isn’t available.
- `tasks/cancel`: allow graceful termination of long‑running jobs.
- **Security model**: Authentication is enforced only on JSON‑RPC POSTs (the orchestrator forwards the bearer token). SSE endpoints remain public, keyed by `taskId`. Importantly, the orchestrator never injects secrets into SSE streams.
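Since `tasks/send` is plain JSON‑RPC over POST, a minimal client for it is short. The request builder follows the lifecycle above; the id scheme and response shape are assumptions, and note how the bearer token is attached only to the POST, never to the SSE subscription:

```typescript
// Build a JSON-RPC 2.0 request for the tasks/send method.
function buildSendTaskRequest(message: string, id: number) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tasks/send",
    params: { message },
  };
}

// POST the task to an A2A agent; the caller then subscribes to
// /events/:taskId via SSE to stream progress.
async function sendTask(
  baseUrl: string,
  message: string,
  token?: string
): Promise<string> {
  const res = await fetch(baseUrl, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // Auth only on JSON-RPC POSTs; SSE endpoints stay public.
      ...(token ? { Authorization: `Bearer ${token}` } : {}),
    },
    body: JSON.stringify(buildSendTaskRequest(message, 1)),
  });
  const { result } = await res.json();
  return result.taskId as string;
}
```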
## MCP integration at a glance
The MCP standard takes a different philosophy: rather than agents, it exposes typed tools with JSON schemas.
- **Discovery**: The orchestrator connects to each MCP server via SSE, calls `listTools`, and caches the tool specifications.
- **Execution**: When selected, the orchestrator calls `callTool({ name, arguments })` and streams the results.
- **Results**: Responses are normalized into simple strings or structured content so they can flow back through the dispatcher and into the chat.
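That normalization step can be sketched as follows, assuming MCP's typed content blocks where text parts carry a `text` field; non‑text parts are summarized rather than silently dropped:

```typescript
// Simplified view of one content part in an MCP callTool result.
type McpContent = { type: string; text?: string };

// Flatten a tool result into a single string the dispatcher can consume.
function normalizeMcpResult(result: { content: McpContent[] }): string {
  return result.content
    .map((part) =>
      part.type === "text" && part.text !== undefined
        ? part.text
        : `[${part.type} content]` // placeholder for non-text parts
    )
    .join("\n");
}
```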
Together, these two ecosystems—A2A for horizontal agent collaboration and MCP for structured, schema‑driven tools—are unified through the orchestrator, which plans, routes, and streams everything back to the human in the loop.
## Extensibility
- **Add an A2A agent**: Publish `/.well-known/agent.json` and add its URL to `AGENT_CARD_URLS`.
- **Add an MCP server**: Provide an SSE endpoint; the orchestrator registers its tools automatically.
- **Customize planning**: Adjust heuristics to produce more granular steps, handle errors differently, or enforce domain constraints.
- **Security hardening**: Keep SSE read‑only, enforce auth on JSON‑RPC POSTs, validate tokens, and apply rate limiting.
## Closing thoughts
This architecture delivers a pragmatic path to agentic orchestration: a single orchestrator that discovers tools across ecosystems, plans execution, and streams progress back to your UI. It stays simple where it can (one WebSocket contract) and flexible where it matters (A2A and MCP as pluggable sources of capability).
- **A2A ideology**: horizontal scaling—more agents for more jobs.
- **MCP ideology**: vertical scaling—more tools per server.
- **The orchestrator**: the bridge that understands both philosophies and unifies them.
As your tool surface grows, the orchestrator’s discovery‑first design keeps your frontend stable and your runtime expandable—without coupling to any specific agent brand or service.
If you already have an agent or tool server, publish an agent card or expose an MCP endpoint and add a URL. The orchestrator—and your users—will take it from there.