Lesson

Completion Detection

The gap between 'tools the LLM calls' and 'infrastructure that runs alongside the LLM' — a polling loop that watches workers and notifies the orchestrator.

Workers complete silently. When a worker's pi session finishes its turn, it shows "Needs input" in the sidebar and sends a macOS notification. But the orchestrator — the LLM that spawned the worker — has no programmatic hook into this completion. The orchestrator would have to call list_agents or read_agent manually, in a loop, hoping to catch completions.

LLMs can't self-poll reliably: between turns the model isn't running at all, so there is nothing awake to do the checking.

Infrastructure vs tools

Most agent tutorials teach you tools. Functions the LLM calls. spawn_pi is a tool. But once you spawn workers, you've crossed into a different paradigm: infrastructure that runs alongside the LLM.

The completion detection loop is infrastructure. It runs on a timer, outside LLM turns. It observes worker state and decides when to wake the orchestrator. This is the foundation of autonomous orchestration — the extension actively manages the fleet, not the LLM reactively checking on it.

The detection strategy

How do you know a worker is idle? Two conditions:

  1. Pi at input prompt: Cost footer visible without spinner
  2. Shell fallback: Raw shell prompt (worker exited pi)

The code reads each worker's surface and tests both patterns:

const HAS_PI_FOOTER = /\$[0-9]+\.[0-9]+/;
const IS_WORKING = /[⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏]\s|Working|Thinking/;
const IDLE_SHELL_RE = /^[❯$]\s*$/m;

const screen = cmuxSafe("read-screen", "--surface", agent.surfaceRef, "--lines", "5");
const piIdle = HAS_PI_FOOTER.test(screen) && !IS_WORKING.test(screen);

if (piIdle || IDLE_SHELL_RE.test(screen)) {
  // worker is idle
}
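The predicate can be exercised against hand-written screen captures. A minimal sketch — the sample strings below are illustrative stand-ins for what cmux read-screen actually returns:

```typescript
const HAS_PI_FOOTER = /\$[0-9]+\.[0-9]+/;
const IS_WORKING = /[⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏]\s|Working|Thinking/;
const IDLE_SHELL_RE = /^[❯$]\s*$/m;

// Same test as the loop body, extracted into a pure function.
function isIdle(screen: string): boolean {
  const piIdle = HAS_PI_FOOTER.test(screen) && !IS_WORKING.test(screen);
  return piIdle || IDLE_SHELL_RE.test(screen);
}

console.log(isIdle("⠹ Thinking\n$0.42"));     // false: spinner overrides footer
console.log(isIdle("> \n$0.42 · 3k tokens")); // true: footer visible, no spinner
console.log(isIdle("❯"));                     // true: bare shell prompt (pi exited)
```

Note that the cost footer alone isn't enough — the spinner check has veto power, which is what makes the mid-work footer flashes (below) a problem.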

The original design assumed workers would exit to shell when done. But interactive pi workers never exit — they stay in pi's TUI, waiting for the next prompt. State detection requires rendering-layer awareness, not process-level signals.

The debounce bug

First dogfood session: spawned 2 workers to write lessons. Completion detection immediately fired false positives. Workers briefly show the cost footer between tool calls — a fraction of a second where the spinner disappears but work continues.

The fix: state machine with debounce counter.

const IDLE_POLLS_REQUIRED = 2; // debounce false positives

if (piIdle || IDLE_SHELL_RE.test(screen)) {
  agent.idleCount++;
  if (agent.idleCount < IDLE_POLLS_REQUIRED) continue;
  
  agent.status = "idle";
  // fire completion event
} else {
  agent.idleCount = 0; // reset on any activity
}

Two consecutive idle polls = truly idle. Any activity resets the counter. Simple state machine beats regex tuning every time.
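The debounce can be checked in isolation with a tiny simulation. The agent record and poll sequence here are simplified stand-ins, not the extension's real types:

```typescript
const IDLE_POLLS_REQUIRED = 2; // debounce false positives

interface AgentState { idleCount: number; status: string }

// Feed one poll observation (true = screen looked idle) into the state machine.
function pollOnce(agent: AgentState, lookedIdle: boolean): void {
  if (lookedIdle) {
    agent.idleCount++;
    if (agent.idleCount >= IDLE_POLLS_REQUIRED) agent.status = "idle";
  } else {
    agent.idleCount = 0; // any activity resets the counter
  }
}

const agent: AgentState = { idleCount: 0, status: "working" };

// Poll 1 is a false positive (footer flash between tool calls); poll 2 sees
// activity again; polls 3-4 are genuinely idle.
[true, false, true, true].forEach((seen) => pollOnce(agent, seen));
console.log(agent.status); // "idle" — fired only after two consecutive idle polls
```

The single flash at poll 1 never fires because poll 2 resets the counter before the threshold is reached.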

Waking the orchestrator

When a worker idles out, the extension sends a message to pi with triggerTurn: true:

pi.sendMessage(
  { 
    customType: "agent-completion",
    content: `🐝 Agent ${id} has finished and is idle.\nPrompt: ${agent.prompt}\nSurface: ${agent.surfaceRef}`,
    display: true 
  },
  { triggerTurn: true }
);

This is the key insight: the extension wakes the LLM. Infrastructure-initiated turns, not LLM-initiated polls. The orchestrator doesn't need to remember to check on workers. The extension watches continuously and interrupts when something happens.

triggerTurn: true tells pi to start a new LLM turn immediately, as if the user had sent a message. The message content gives context — which agent completed, what it was working on, where to find it.

Self-management

The polling loop manages its own lifecycle:

let _completionPollTimer: ReturnType<typeof setInterval> | null = null;
const COMPLETION_POLL_MS = 5000;
const SPAWN_GRACE_MS = 15000;

function startCompletionPolling(): void {
  if (_completionPollTimer) return; // already running
  
  _completionPollTimer = setInterval(() => {
    if (fleet.size === 0) { 
      stopCompletionPolling(); 
      return; 
    }
    
    for (const [id, agent] of fleet) {
      if (agent.status === "idle" || agent.status === "completed" || agent.status === "failed") continue;
      if (Date.now() - agent.spawnedAt < SPAWN_GRACE_MS) continue; // boot grace period
      
      // ... detection logic
    }
  }, COMPLETION_POLL_MS);
}

startCompletionPolling() is called when the first worker spawns. stopCompletionPolling() runs when the fleet empties. The 15-second grace period lets pi boot completely before state detection begins.

Five-second polling interval balances responsiveness with CPU overhead. Fast enough to feel immediate, slow enough not to spam cmux read operations.
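These constants also bound detection latency. With two consecutive idle polls required at five-second intervals, a genuinely idle worker is reported roughly five to ten seconds after it stops (plus the grace window if it only just spawned). A quick sanity check:

```typescript
const COMPLETION_POLL_MS = 5000;
const IDLE_POLLS_REQUIRED = 2;

// Worst case: the worker goes idle just after a tick, so it takes
// IDLE_POLLS_REQUIRED full intervals to accumulate the count.
const worstCaseMs = IDLE_POLLS_REQUIRED * COMPLETION_POLL_MS;
// Best case: it goes idle just before a tick, saving one interval.
const bestCaseMs = worstCaseMs - COMPLETION_POLL_MS;

console.log(bestCaseMs, worstCaseMs); // 5000 10000
```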

The full loop

Here's the complete implementation:

let _completionPollTimer: ReturnType<typeof setInterval> | null = null;
const COMPLETION_POLL_MS = 5000;
const SPAWN_GRACE_MS = 15000;
const IDLE_POLLS_REQUIRED = 2;

const HAS_PI_FOOTER = /\$[0-9]+\.[0-9]+/;
const IS_WORKING = /[⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏]\s|Working|Thinking/;
const IDLE_SHELL_RE = /^[❯$]\s*$/m;

function startCompletionPolling(): void {
  if (_completionPollTimer) return;
  
  _completionPollTimer = setInterval(() => {
    if (fleet.size === 0) { 
      stopCompletionPolling(); 
      return; 
    }

    for (const [id, agent] of fleet) {
      if (agent.status === "idle" || agent.status === "completed" || agent.status === "failed") continue;
      if (Date.now() - agent.spawnedAt < SPAWN_GRACE_MS) continue;

      const screen = cmuxSafe("read-screen", "--surface", agent.surfaceRef, "--lines", "5");
      if (!screen) continue;

      const piIdle = HAS_PI_FOOTER.test(screen) && !IS_WORKING.test(screen);
      if (piIdle || IDLE_SHELL_RE.test(screen)) {
        agent.idleCount++;
        if (agent.idleCount < IDLE_POLLS_REQUIRED) continue;

        agent.status = "idle";
        pi.sendMessage(
          { 
            customType: "agent-completion",
            content: `🐝 Agent ${id} has finished and is idle.\nPrompt: ${agent.prompt}\nSurface: ${agent.surfaceRef}`,
            display: true 
          },
          { triggerTurn: true }
        );
      } else {
        agent.idleCount = 0;
      }
    }
  }, COMPLETION_POLL_MS);
}

function stopCompletionPolling(): void {
  if (_completionPollTimer) {
    clearInterval(_completionPollTimer);
    _completionPollTimer = null;
  }
}

This loop is the bridge between spawning workers and truly autonomous orchestration. Workers complete, the extension notices, the LLM gets woken up to decide what happens next. No manual polling, no forgotten workers, no silent failures.

The orchestrator becomes reactive to its own fleet.

What's next

The completion detection loop tells you when workers finish, but not what they produced. The next layer is reading worker output, parsing results, and routing follow-up tasks. But first, you need to understand how attention works in a multi-agent system.

When 5 workers all complete simultaneously, which one gets the orchestrator's attention first?