Lesson

The Control Plane

cmux as agent infrastructure — workspaces, surfaces, refs, and why your terminal multiplexer is the orchestration layer.

You're going to build a system where multiple AI agents run simultaneously — each in its own terminal, each visible, each steerable. Before any of that works, you need to understand the physical substrate they live on.

That substrate is cmux.

Not a terminal multiplexer

cmux looks like a terminal multiplexer. It has tabs, split panes, a sidebar. You can type in it. But calling cmux a terminal multiplexer is like calling Kubernetes a process manager — technically true, architecturally misleading.

cmux is a control plane. Every workspace, every pane, every surface is a programmable object with a stable reference. You can create them, read them, write to them, and destroy them — all from the command line, all from code. This is what makes multi-agent orchestration possible. Your agents don't run in invisible background processes. They run in surfaces you can see, read, and interact with.

The hierarchy

cmux organizes everything into four levels:

window
└── workspace
    └── pane
        └── surface

A window is the macOS window. You probably have one. Multiple windows are supported but rarely needed.

A workspace is a named context — like a tab, but heavier. Each workspace has its own set of panes, its own sidebar state, its own environment variables. When you switch workspaces, everything changes. The workspace named "the hive" is a completely separate world from the workspace named "support."

A pane is a rectangular region inside a workspace. You split panes left/right/up/down, like tmux. Each pane shows one surface at a time; a pane can hold several surfaces as tabs, and the selected one is the one you see.

A surface is the actual thing you interact with. A surface is either a terminal or a browser. Most surfaces are terminals — that's where your shells, your editors, and your pi sessions live. But a surface can also be a full Chromium browser, embedded right in the pane alongside your terminals.

This matters for agents. When you spawn a worker agent later in this course, you're creating a pane with a terminal surface inside an existing workspace. When you need to observe what that agent is doing, you read that surface. The hierarchy gives you a precise address for everything.

The tree

Run cmux tree and you get the full hierarchy:

window window:1 [current] ◀ active
├── workspace workspace:2 "ai-hero" [selected] ◀ active
│   ├── pane pane:2 [focused] ◀ active
│   │   └── surface surface:2 [terminal] "π - support inbox pass" [selected] ◀ active
│   └── pane pane:7
│       └── surface surface:7 [terminal] "…/packages/aihero-cli" [selected]
├── workspace workspace:7 "the hive"
│   ├── pane pane:10 [focused]
│   │   └── surface surface:11 [terminal] "π - phased curriculum design" [selected]
│   ├── pane pane:58
│   │   └── surface surface:59 [terminal] "π - course-the-hive" [selected]
│   └── pane pane:57
│       └── surface surface:58 [terminal] "π - course-the-hive" [selected]
├── workspace workspace:1 "π dev"
│   ├── pane pane:1 [focused]
│   │   └── surface surface:1 [terminal] "π - cmux fork-bomb resolved" [selected]
│   └── pane pane:9
│       └── surface surface:9 [browser] "cmux × pi: Multi-Agent Orchestration" [selected]
└── workspace workspace:5 "codet-tv"
    └── pane pane:5 [focused]
        └── surface surface:5 [terminal] "PANDA - π - CodeTV egghead merger" [selected]

Read this carefully. Every object has a ref: workspace:7, pane:10, surface:11. These refs are stable identifiers that work across every cmux command. When you want to read what's happening in a specific terminal, you target the surface ref. When you want to send text to a specific agent, you target the surface ref. When you want to close a pane, you target the pane ref.

Notice the markers: [current], [selected], [focused], ◀ active. These tell you what's visible right now. The [focused] pane is the one receiving keyboard input in its workspace. The [selected] surface is the active one in its pane (panes can have multiple surfaces tabbed together). The ◀ active chain traces from window to the surface you're literally looking at.

Now look at workspace "π dev". Pane 1 holds a terminal surface. Pane 9 holds a browser surface — an actual browser rendering a GitHub gist, sitting right next to the terminal. Both are first-class surfaces with refs. Both are programmable.
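Because the tree is plain text with refs on every line, you can pull addresses out of it with ordinary shell tools. Here is a minimal sketch, assuming the output keeps the layout shown above (one workspace line followed by its panes and surfaces). The function name surfaces_in_workspace is hypothetical, not part of cmux.

```shell
# List the surface refs under a named workspace.
# Reads `cmux tree` text on stdin; $1 is the workspace name as it appears in quotes.
# Assumes the tree layout shown in this lesson.
surfaces_in_workspace() {
  awk -v name="$1" '
    # A workspace line starts a new section; remember whether it is the one we want.
    /workspace workspace:[0-9]+/ { in_ws = (index($0, "\"" name "\"") > 0) }
    # Inside the wanted section, print every surface ref we see.
    in_ws && match($0, /surface:[0-9]+/) { print substr($0, RSTART, RLENGTH) }
  '
}
```

You would call it as cmux tree | surfaces_in_workspace "the hive" and get one surface ref per line, ready to feed into other commands.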

Refs are addresses

Every cmux command accepts refs. This is the API surface your agents will use:

# Read the last 10 lines of a specific surface
cmux read-screen --surface surface:11 --lines 10

# Send text to a specific surface (as if you typed it)
cmux send --surface surface:59 "run the tests"

# Send a keypress
cmux send-key --surface surface:59 Enter

# Create a new workspace with a command already running
cmux new-workspace --cwd ~/repos/auth-service --command "pi"

# Split a pane to the right, creating a new terminal
cmux new-pane --direction right --workspace workspace:7

# Find out which surface has focus right now
cmux identify
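Sending text and pressing Enter are separate commands, so steering an agent usually means pairing them. A small sketch of that pairing, using only the send and send-key forms shown above; the wrapper name send_line is hypothetical.

```shell
# Type a line into a surface and submit it with Enter.
# $1 = surface ref (for example surface:59); remaining arguments = the text to type.
send_line() {
  ref=$1
  shift
  # Only press Enter if the text was sent successfully.
  cmux send --surface "$ref" "$*" && cmux send-key --surface "$ref" Enter
}
```

For example, send_line surface:59 run the tests types the text into that surface and then submits it.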

cmux identify returns JSON describing where focus is and where the command was called from:

{
  "focused": {
    "workspace_ref": "workspace:2",
    "pane_ref": "pane:2",
    "surface_ref": "surface:2",
    "surface_type": "terminal"
  },
  "caller": {
    "workspace_ref": "workspace:7",
    "pane_ref": "pane:55",
    "surface_ref": "surface:56",
    "surface_type": "terminal"
  }
}

Two objects: focused is where the user is looking. caller is where the command ran from. This distinction matters when an agent needs to decide whether to send a notification — if the user is already looking at its surface, a macOS popup would be annoying.
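The focused-versus-caller check can be sketched in a few lines of shell. This is a text-level sketch that assumes the flat two-object JSON shape shown above; a real JSON parser such as jq would be more robust. The helper names ref_of and user_is_watching are hypothetical.

```shell
# Print the surface_ref inside the top-level object named $1 ("focused" or "caller").
# Reads JSON on stdin. Assumes each object is flat (no nested braces).
ref_of() {
  tr -d '\n' \
    | sed -n "s/.*\"$1\"[[:space:]]*:[[:space:]]*{\([^}]*\)}.*/\1/p" \
    | sed -n 's/.*"surface_ref"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Succeed when the focused surface is the surface this command ran from.
user_is_watching() {
  json=$(cmux identify) || return 1
  focused=$(printf '%s' "$json" | ref_of focused)
  caller=$(printf '%s' "$json" | ref_of caller)
  [ -n "$focused" ] && [ "$focused" = "$caller" ]
}
```

An agent could then guard its alerts with something like user_is_watching || cmux notify --title "Worker: auth" --body "Task complete", so the popup only fires when you are looking elsewhere.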

Environment variables as implicit context

Every terminal surface in cmux gets three environment variables injected automatically:

  • CMUX_WORKSPACE_ID — the workspace this terminal belongs to
  • CMUX_SURFACE_ID — this specific surface
  • CMUX_TAB_ID — the tab this surface belongs to

These act as defaults. When you run cmux send "hello" without specifying a surface, it targets CMUX_SURFACE_ID. When an extension calls cmux set-status "state" "Running" without a workspace flag, it sets the status on CMUX_WORKSPACE_ID.

This is why pi sessions know which workspace they're in without any configuration. The cmux extension reads CMUX_WORKSPACE_ID from the environment and uses it for every sidebar update. When you spawn a worker agent in a new pane, that pane's terminal inherits the workspace's environment — the worker automatically knows where it lives.
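A script can check these variables before it relies on the defaults. A minimal sketch, treating the values as opaque strings (this lesson does not specify their format); the function name cmux_context is hypothetical.

```shell
# Print the cmux context of the current shell, or fail if it is not a cmux terminal.
cmux_context() {
  if [ -z "${CMUX_WORKSPACE_ID:-}" ] || [ -z "${CMUX_SURFACE_ID:-}" ]; then
    echo "not inside a cmux terminal surface" >&2
    return 1
  fi
  printf 'workspace=%s surface=%s\n' "$CMUX_WORKSPACE_ID" "$CMUX_SURFACE_ID"
}
```

Running cmux_context at the top of an agent script gives it a clear failure outside cmux instead of a confusing error from the first command that needed a default target.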

The sidebar as cross-workspace broadcast

Each workspace has sidebar state: key-value pairs with optional icons and colors. You set them with cmux set-status:

cmux set-status "state" "Running" --icon "bolt.fill" --color "#4CAF50"
cmux set-status "tool" "editing auth.ts" --icon "wrench.fill"
cmux set-status "model" "claude-sonnet-4" --icon "cpu"

The sidebar is visible from every workspace. Switch from "the hive" to "ai-hero" and you still see the sidebar entries that other workspaces set. This is the quiet signal — the ambient awareness channel that tells you something is happening without pulling your attention.

For multi-agent orchestration, each worker agent updates its own workspace's sidebar with what it's doing. The orchestrator doesn't need to poll each agent — it can see at a glance which workers are running, which are idle, which need input. The sidebar is the fleet's heartbeat.
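One way a worker might keep its sidebar entry consistent is to route every update through a single function that maps a state name to a label, icon, and color. This is a sketch: the state names, the function name report_state, and every icon and color other than "bolt.fill" and "#4CAF50" are illustrative assumptions, not values cmux prescribes.

```shell
# Update this workspace's "state" sidebar entry from a short state name.
# Relies on the default workspace, since no workspace flag is passed.
report_state() {
  case "$1" in
    running) cmux set-status "state" "Running" --icon "bolt.fill" --color "#4CAF50" ;;
    idle)    cmux set-status "state" "Idle" --icon "pause.circle" --color "#9E9E9E" ;;
    blocked) cmux set-status "state" "Needs input" --icon "exclamationmark.triangle.fill" --color "#FF9800" ;;
    *)       echo "unknown state: $1" >&2; return 2 ;;
  esac
}
```

Keeping the mapping in one place means every worker in the fleet uses the same vocabulary, which is what makes the sidebar readable at a glance.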

Notifications as the loud signal

When something demands attention — an agent finished its task, an error occurred, input is needed — cmux sends a macOS notification:

cmux notify --title "Worker: auth" --body "Task complete" --workspace workspace:7

This is the loud signal. The sidebar is continuous ambient awareness; notifications are interrupts. The attention model you'll build in a later lesson uses both: sidebar for status, notifications for state transitions that require human attention.
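"Notifications for state transitions" can be sketched as a function that remembers the last state and only notifies when the state changes to one that needs a human. The state names and the function name on_state are hypothetical; the notify flags are the ones shown above.

```shell
LAST_STATE=""

# $1 = worker name, $2 = new state, $3 = workspace ref
on_state() {
  # Same state as last time: nothing changed, stay quiet.
  if [ "$2" = "$LAST_STATE" ]; then
    return 0
  fi
  LAST_STATE=$2
  # Only these transitions are worth an interrupt; everything else stays silent.
  case "$2" in
    complete|error|needs-input)
      cmux notify --title "Worker: $1" --body "State: $2" --workspace "$3"
      ;;
  esac
}
```

Calling on_state auth running workspace:7 is silent, on_state auth complete workspace:7 fires one notification, and repeating the same call does not fire again.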

Two surface types

Terminal surfaces are where shells and agents run. Browser surfaces are where documentation, dashboards, and web apps live. Both are first-class:

# Open a browser surface in a new pane
cmux new-pane --type browser --url "https://docs.example.com" --direction right

# Navigate an existing browser surface
cmux browser goto "https://github.com/your-org/your-repo"

# Take a snapshot of the browser's DOM
cmux browser snapshot --compact

# Click a button in the browser
cmux browser click "#submit-btn"

Browser surfaces have their own command namespace — cmux browser snapshot, cmux browser click, cmux browser eval. An agent running in a terminal surface can control a browser surface in the adjacent pane. This means your agents can interact with web UIs, not just terminals.

You won't use browser surfaces heavily in this course. But knowing they exist changes how you think about what an agent can observe and control. The control plane isn't limited to text.
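To make "observe, then act" concrete, here is a hedged sketch that clicks an element only when some expected text shows up in the snapshot. It assumes the snapshot is text that includes visible page content, which this lesson does not specify, and the function name click_when_present is hypothetical.

```shell
# Click a selector only if the browser snapshot mentions the expected text.
# $1 = text to look for, $2 = CSS selector to click
click_when_present() {
  if cmux browser snapshot --compact | grep -qF -- "$1"; then
    cmux browser click "$2"
  else
    echo "text not found in snapshot: $1" >&2
    return 1
  fi
}
```

For example, click_when_present "Submit" "#submit-btn" refuses to click when the page is not in the state the agent expects.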

What the control plane gives you

Here's what cmux provides that a regular terminal multiplexer doesn't:

| Capability | Command | Why it matters for agents |
| --- | --- | --- |
| Enumerate everything | cmux tree | Know exactly what's running |
| Read any surface | cmux read-screen | Observe agent output without switching context |
| Write to any surface | cmux send | Steer a running agent mid-task |
| Create workspaces on demand | cmux new-workspace | Spawn agent environments programmatically |
| Split panes | cmux new-pane | Add workers alongside existing sessions |
| Sidebar metadata | cmux set-status | Ambient fleet awareness across all workspaces |
| Notifications | cmux notify | Interrupt-level alerts for state changes |
| Focus detection | cmux identify | Know if the human is watching |
| Browser automation | cmux browser * | Agents can control web UIs |

Every one of these is a shell command. Every one returns structured output or accepts structured input. There's no SDK to install, no daemon to configure. If you can call cmux tree from a shell, you can orchestrate a fleet of agents.
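Here is a small sketch of that claim: two functions that combine cmux tree and cmux read-screen to peek at every terminal in the fleet. It assumes the tree marks terminal surfaces with [terminal] as in the output shown earlier; the function names terminal_surfaces and peek_fleet are hypothetical.

```shell
# Print the ref of every terminal surface in the tree, one per line.
terminal_surfaces() {
  cmux tree | grep -F '[terminal]' | grep -o 'surface:[0-9][0-9]*'
}

# Show the last N lines (default 5) of every terminal surface.
peek_fleet() {
  for ref in $(terminal_surfaces); do
    printf '== %s ==\n' "$ref"
    cmux read-screen --surface "$ref" --lines "${1:-5}"
  done
}
```

Running peek_fleet 3 prints a short header and the last three lines for each terminal, without switching workspaces or touching focus.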

This is the physical substrate. Every agent in this course lives on a cmux surface. Every orchestration command flows through cmux refs. Every status update broadcasts through the cmux sidebar. The next lesson covers the bridge that connects pi to this substrate — the extension that turns raw cmux commands into agent lifecycle hooks. But the foundation is here: a terminal multiplexer that's actually a control plane.