Build a Coding Agent from Scratch

07. Interruption, Steering, and Follow-up Tasks

Real users do not wait for the agent to finish before speaking. They add information while the model is streaming, realize the direction is wrong while a tool is running, or append the next thing just as the current task is about to finish. The agent needs to distinguish three kinds of input: stop, steering, and follow-up tasks. Treating them all as new user prompts will corrupt the runtime state.

Three semantics

Stop cancels the current work. It should abort through the AbortController so that the AbortSignal fires, stop the provider request or tool execution, and write the aborted state to the log.
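As a minimal sketch of the stop path, assume tools run as a sequence of cooperative steps that check the signal between steps. `LogEntry` and `runSteps` are hypothetical names made up for this illustration; `AbortController` and `AbortSignal` are the standard APIs available in browsers and Node.js.

```typescript
// Hypothetical log shape for this sketch.
type LogEntry = { kind: "completed" | "aborted"; detail: string };

// Runs steps in order and checks the signal before each one.
function runSteps(
  steps: Array<() => string>,
  signal: AbortSignal,
  log: LogEntry[],
): void {
  for (const step of steps) {
    if (signal.aborted) {
      // Record the abort explicitly so recovery and auditing can see it.
      log.push({ kind: "aborted", detail: "stopped by user" });
      return;
    }
    log.push({ kind: "completed", detail: step() });
  }
  // Covers an abort that arrives during the final step.
  if (signal.aborted) {
    log.push({ kind: "aborted", detail: "stopped by user" });
  }
}

const controller = new AbortController();
const log: LogEntry[] = [];
runSteps(
  [
    () => "read file",
    () => {
      controller.abort(); // simulate the user pressing stop mid-run
      return "ran tests";
    },
    () => "write file", // never runs
  ],
  controller.signal,
  log,
);
```

A real tool or provider request would more likely pass the signal down (for example to `fetch`) instead of polling it, but the logging rule stays the same: an abort leaves a visible record.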

Steering means "the current direction needs adjustment." It does not immediately kill all tools; instead, it inserts a new user message at a safe moment so the next model turn sees it. A typical example is "don't change the tests, fix the implementation."

A follow-up means "do this after the current task finishes." It should wait in the queue until the current task ends naturally, then start the next round as a new user message. A typical example is "after the fix, also update the changelog."

These three semantics are different, and the UI copy, queue behavior, and log records should differ accordingly.

Why not interrupt the tool batch immediately

Suppose the model has already launched two tools: one reading a file and one running the tests. The user steers at this moment: "hold off on the tests." If you kill the running tools immediately, you may leave behind a truncated log entry, an orphaned process, or an unfinished file write. A safer strategy is to let the current tool batch reach a consistent boundary, then inject the steering into the next turn.

A consistent boundary usually means:

  • The current assistant message is complete.
  • All tools that were started have finished or been cancelled in a controlled way.
  • Tool results have been written to the log.
  • The next model request has not yet started.
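The four conditions above can be sketched as a single predicate. The `TurnState` shape and its field names are assumptions made for illustration; a real runtime would derive these flags from its own state.

```typescript
// Hypothetical snapshot of the current turn.
type ToolStatus = "running" | "finished" | "cancelled";

type TurnState = {
  assistantMessageComplete: boolean;
  tools: { id: string; status: ToolStatus; resultLogged: boolean }[];
  nextRequestStarted: boolean;
};

// True only when it is safe to inject steering into the next turn.
function isConsistentBoundary(state: TurnState): boolean {
  // Every started tool has finished or been cancelled in a controlled way.
  const toolsSettled = state.tools.every((t) => t.status !== "running");
  // Every tool, including cancelled ones, has its result in the log.
  const resultsLogged = state.tools.every((t) => t.resultLogged);
  return (
    state.assistantMessageComplete &&
    toolsSettled &&
    resultsLogged &&
    !state.nextRequestStarted
  );
}
```

The kernel would check this predicate before touching the message list, and keep steering in its queue whenever it returns false.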

This is not the most responsive strategy, but it makes recovery, auditing, and testing more reliable. For a coding agent, recoverability usually matters more than millisecond-level steering.

The dual-queue model

The runtime can maintain two queues:

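// One user input captured while the agent is busy.
type QueuedInput = {
  id: string;
  text: string; // the user's original wording, unedited
  createdAt: string; // timestamp of when the input arrived
};

// Two separate queues, because the inputs are applied at different times.
type AgentQueues = {
  steering: QueuedInput[]; // applied to the current task at the next boundary
  followUp: QueuedInput[]; // started only after the current task ends
};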

After each tool batch ends, the kernel first checks steering. If steering entries exist, it merges them into a new user message and continues with the next turn of the current task. Only after the current task stops does it take from the follow-up queue to start a new task.
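One way to sketch this decision is a function the kernel calls at each boundary. `NextStep` and `nextStep` are hypothetical names, and the sketch assumes that pending steering is applied even when the task has just finished, so a late correction still reaches the current task before any follow-up starts.

```typescript
type QueuedInput = { id: string; text: string; createdAt: string };
type AgentQueues = { steering: QueuedInput[]; followUp: QueuedInput[] };

// Hypothetical result type: what the kernel should do next.
type NextStep =
  | { kind: "continue_with_steering"; inputs: QueuedInput[] }
  | { kind: "continue" }
  | { kind: "start_follow_up"; input: QueuedInput }
  | { kind: "idle" };

function nextStep(queues: AgentQueues, taskFinished: boolean): NextStep {
  // Steering always wins: drain the whole queue into the next turn.
  if (queues.steering.length > 0) {
    const inputs = queues.steering.splice(0, queues.steering.length);
    return { kind: "continue_with_steering", inputs };
  }
  // The current task is still running, so follow-ups keep waiting.
  if (!taskFinished) {
    return { kind: "continue" };
  }
  // The task has ended: start the oldest follow-up, if there is one.
  const input = queues.followUp.shift();
  return input ? { kind: "start_follow_up", input } : { kind: "idle" };
}
```

Follow-ups are taken one at a time, so each becomes its own task, while steering entries are drained together because they all adjust the same task.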

When merging steering, preserve the user's original wording and the order in which the inputs arrived. Do not compress multiple user inputs into one vague summary, or you will lose intent. You can use an injection format like this:

User provided steering while you were working:
1. Do not edit tests.
2. Keep the public API unchanged.
Continue from the current state and adjust your plan.
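A small formatter can produce that message from the queued entries. This is a sketch under one assumption: `createdAt` holds timestamps in a single sortable format (such as ISO 8601 in UTC), so a plain string comparison restores arrival order. `formatSteering` is a hypothetical helper name.

```typescript
type QueuedInput = { id: string; text: string; createdAt: string };

// Builds one user message from all pending steering inputs.
function formatSteering(inputs: QueuedInput[]): string {
  // Copy before sorting so the caller's queue is not reordered.
  const ordered = inputs.slice().sort((a, b) => {
    if (a.createdAt < b.createdAt) return -1;
    if (a.createdAt > b.createdAt) return 1;
    return 0;
  });
  // Keep each input's text verbatim; only add a number in front.
  const items = ordered.map((input, i) => `${i + 1}. ${input.text}`);
  return [
    "User provided steering while you were working:",
    ...items,
    "Continue from the current state and adjust your plan.",
  ].join("\n");
}
```

Each entry stays on its own numbered line, so the model sees every instruction exactly as the user typed it.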

Events and the UI

Queue changes should also be events:

queue_updated steering=1 followUp=0
steering_applied count=1
follow_up_started id=...

The UI needs to tell the user "queued; will apply after the current tool finishes" rather than staying silent. Otherwise the user will type the same thing again, and the model will receive multiple copies of the same instruction.
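The three events above could be modeled as a discriminated union with a small subscription helper, so that a UI can render queue state as it changes. `AgentEvent`, `QueueEvents`, and `formatEvent` are illustrative names, not an existing API.

```typescript
// One variant per event line shown above.
type AgentEvent =
  | { type: "queue_updated"; steering: number; followUp: number }
  | { type: "steering_applied"; count: number }
  | { type: "follow_up_started"; id: string };

type Listener = (event: AgentEvent) => void;

// Minimal publish/subscribe hub for queue events.
class QueueEvents {
  private listeners: Listener[] = [];

  // Returns a function that removes the listener again.
  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  emit(event: AgentEvent): void {
    for (const listener of this.listeners) {
      listener(event);
    }
  }
}

// Renders an event in the same text form as the log lines above.
function formatEvent(event: AgentEvent): string {
  switch (event.type) {
    case "queue_updated":
      return `queue_updated steering=${event.steering} followUp=${event.followUp}`;
    case "steering_applied":
      return `steering_applied count=${event.count}`;
    case "follow_up_started":
      return `follow_up_started id=${event.id}`;
  }
}
```

A UI subscriber can turn `queue_updated` into the "queued" notice as soon as the user submits input, which removes the reason to retype it.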

Failure modes

The most dangerous failure is appending steering directly to messages while the current assistant message is still streaming. The next turn's context can then contain an interleaved sequence of a half-finished assistant message, the user's steering message, and trailing tool results. Many providers do not tolerate that ordering, and the model may be confused by it as well.
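A test helper can catch this interleaving before a request is sent. The `Message` shape below is a simplified assumption for illustration and does not match any particular provider's format; `findOrderingError` is a hypothetical name.

```typescript
// Simplified, hypothetical message shape.
type Message =
  | { role: "assistant"; text: string; toolCallIds: string[] }
  | { role: "tool"; toolCallId: string; text: string }
  | { role: "user"; text: string };

// Returns a description of the first ordering problem, or null if none.
function findOrderingError(messages: Message[]): string | null {
  // Tool calls that were issued but have no result yet.
  const pending = new Set<string>();
  for (let i = 0; i < messages.length; i++) {
    const m = messages[i];
    if (m === undefined) continue;
    if (m.role === "assistant") {
      if (pending.size > 0) {
        return `message ${i}: assistant message before pending tool results`;
      }
      for (const id of m.toolCallIds) pending.add(id);
    } else if (m.role === "tool") {
      if (!pending.delete(m.toolCallId)) {
        return `message ${i}: tool result for unknown call ${m.toolCallId}`;
      }
    } else if (pending.size > 0) {
      // A user message landed between a tool call and its result.
      const ids = Array.from(pending).join(", ");
      return `message ${i}: user message before tool results for ${ids}`;
    }
  }
  return null;
}
```

Running a check like this in tests makes the failure visible as an assertion instead of a provider error or a confused model turn.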

The second failure is a follow-up preempting the current task. The user says "update the docs when you're done," and the agent goes off to write documentation before the bug is even fixed, which breaks the task order. The point of a follow-up is "after," not "also now."

Exercises

Add steer(text) and followUp(text) to the agent.

Acceptance criteria:

  • Calling steer while the model is streaming does not directly modify the request being sent.
  • After the current tool batch ends, steering enters the next turn's context.
  • A follow-up starts only after the current task stops.
  • The UI or event subscribers can see the queue lengths change.
  • After a user abort, the policy for handling steering and follow-ups is explicit: keep them, clear them, or ask the user. They must not be dropped silently.
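For the last criterion, one possible starting point is to make the policy a value the runtime must pass in, so that no code path can drop queued input by accident. `AbortPolicy`, `AbortOutcome`, and `applyAbortPolicy` are hypothetical names for this sketch, not a prescribed solution.

```typescript
type QueuedInput = { id: string; text: string; createdAt: string };
type AgentQueues = { steering: QueuedInput[]; followUp: QueuedInput[] };

// The three explicit choices named in the criterion.
type AbortPolicy = "keep" | "clear" | "ask";

type AbortOutcome = {
  queues: AgentQueues; // queue state after the policy is applied
  discarded: QueuedInput[]; // reported to the log or UI, never dropped silently
  needsUserDecision: boolean; // true when the UI must ask what to do
};

function applyAbortPolicy(queues: AgentQueues, policy: AbortPolicy): AbortOutcome {
  const pending = queues.steering.concat(queues.followUp);
  switch (policy) {
    case "keep":
      return { queues, discarded: [], needsUserDecision: false };
    case "clear":
      // Return what was removed so the caller can log or display it.
      return {
        queues: { steering: [], followUp: [] },
        discarded: pending,
        needsUserDecision: false,
      };
    case "ask":
      // Leave the queues untouched until the user answers.
      return { queues, discarded: [], needsUserDecision: pending.length > 0 };
  }
}
```

Returning the discarded entries lets the caller emit an event or log record for them, which is what keeps the "clear" option from being a silent drop.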