Build a Coding Agent from Scratch

Agent Engineering Tutorial

From one LLM call to an agent that can edit code.

Modeled on the architecture of the Pi coding agent, this book breaks a mature system (protocol layer, runtime, session log, tools, safety boundaries, and extensions) into engineering knowledge you can implement chapter by chapter.

[Figure: layered agent architecture diagram]

01

Learned from a real system

The material distills the layered architecture of Pi, a production coding agent, into general, battle-tested design lessons.

02

Every chapter has a checkpoint

Each chapter answers one concrete engineering question and ends with behavior you can verify yourself.

03

Driven by failure modes

See where the naive approach breaks first, then add protocols, logging, compaction, permissions, and extension boundaries.

Contents

One path from protocol to product.

00 Introduction: An Agent Is Not a Single API Call

If you already know how to make a single LLM API call, what you have is "a function that answers questions." A coding agent solves a different class of problem…

01 The Full Protocol of a Tool Call

The first building block of an agent is tool calling. Many tutorials describe it as "give the model a list of functions and the model will pick one." That is not…

02 A Minimal Agent Loop

Once you understand the tool protocol, the heart of an agent is the loop. It is not a "while true" loop that lets the model keep talking; it is a set of state transitions…

03 Tool Design Fundamentals

Tools are the agent's only channel to the outside world. Design tools badly and the model learns the wrong behavior; make tool output unstable and the loop becomes hard…

04 Provider Abstraction and a Unified Message Protocol

When an agent has only one model, you might pass the vendor SDK's types throughout the entire system. That gets you started quickly, but it soon locks the runtime into…

05 Streaming Output and the Event Model

An agent without streaming output feels sluggish. After the user submits a task, the model may think for several seconds, then request a tool, and the tool may run for…

06 The Agent Kernel and Its Lifecycle

By now you have a message protocol, providers, tools, and streaming events. The next step is to organize them into an agent kernel. The kernel is not a product…

07 Interruption, Steering, and Follow-up Tasks

Real users do not wait for the agent to finish before speaking. They add information while the model is streaming, realize the direction is wrong while a tool is…

08 The Session Log and Recovery

The source of truth for an agent runtime should be the session log, not the in-memory messages array. Memory gets lost, UIs get refreshed, processes crash, and users…

09 Context Engineering and Compaction

No matter how large the context window is, an agent will fill it. A coding agent reads files, runs commands, emits diffs, hits errors, and receives user steering. The…

10 Coding Tools: read, edit, write, bash

What separates a coding agent from an ordinary tool-using agent shows up most in the write operations. Reading files and searching let the model observe the world; edit…

11 System Prompts and Project Context

Tools and the loop determine what an agent can do; the system prompt determines how it should do it. A coding agent's system prompt is not a single "you are a helpful…

12 Safety Boundaries and the Permission Model

A coding agent can read and write files, run commands, access the network, and call models. As long as it runs on your machine, it has every capability within the scope…

13 One Kernel, Many Shells

Once the agent kernel is finished, you'll want to add more entry points: an interactive CLI, a one-shot print mode, a JSON event mode, an SDK, RPC, a TUI. Don't write a…

14 The Extension System

Once your agent is used by different teams, customization requests never stop: add an internal search tool, intercept dangerous commands, change the system prompt, show…

15 Evaluation, Debugging, and the Capstone Project

The biggest illusion in agent development is "that last run looked fine, so it must be correct." Model output is unstable, real repositories are complex, and tools and…