A phase isn't a new idea. Any competent developer working through a complex ticket already does this mentally — they break it into pieces, get one piece working, then move to the next. The difference is that with a human developer, that decomposition can stay informal. It lives in their head, or in a quick Slack message, or in the way they structure their commits. The person doing the work is also the person holding the context, so it doesn't need to be written down.
With an agent, the informality breaks. The agent doesn't hold context between sessions, doesn't infer intent from the surrounding situation, and doesn't catch its own scope drift. The mental decomposition that an experienced developer does naturally has to become explicit — written down, with a testable endpoint, before the session opens. That's what a phase is. The same instinct, made legible.
Ticket sizing comes first
This was true before agents. Agile always worked better with small, well-scoped tickets — the kind where the developer could start without three clarifying conversations and finish without three scope changes. Teams that wrote good tickets had better sprints. Teams that let tickets grow vague and large had bad ones.
With agents, the same principle applies with less tolerance for sloppiness. A vague ticket handed to a human developer results in a conversation. The same ticket handed to an agent results in the agent guessing — and then building confidently on those guesses for the rest of the session. By the time you look at the output, the drift has already happened.
Every ambiguity in a ticket is a decision the agent makes on your behalf.
The right size for a ticket is whatever a developer and agent can complete in one session with a clear, verifiable outcome. If a ticket is too large for that, it gets broken into smaller tickets at refinement — not decomposed informally into phases during the session. The decomposition happens before the work starts, not inside it.
Phases within a session
A single session can include multiple phases. A developer and agent might work through three distinct checkpoints in one sitting — building something, reviewing it, building the next thing, reviewing that. Each checkpoint closes a segment of work and opens a new one. The ticket moves forward continuously.
Phases are not a pre-planned breakdown that lives in the ticket system. They're the natural rhythm of the session: build to a verifiable point, pause and review, continue. The checkpoint defines the phase boundary, and knowing when to call one is a judgment call, not a rule.
When to call a checkpoint
| Signal | What it means |
|---|---|
| You can write one sentence that says the work is done | Good place for a checkpoint. The criterion is clear enough to verify. |
| The next step is meaningfully different from the last | Natural boundary. The Review closes one concern before a different one opens. |
| Something unexpected surfaced during Build | Stop and review before continuing. The agent should not absorb the surprise silently. |
| The agent is about to touch something high-risk | Call a checkpoint first. Review what was built so far before the stakes rise. |
| You can't explain what was just built | The checkpoint is overdue. Stop now rather than continuing without understanding. |
The wrong time to call a checkpoint is on a rigid schedule regardless of what just happened. The right time is when there's something worth verifying — when the question "do I understand this?" has a meaningful answer.
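The signals in the table can also be written down as a small checklist. The sketch below is only an illustration: the names `SIGNALS`, `checkpointReasons`, and `shouldCallCheckpoint` are hypothetical, and the function is meant to prompt the developer's judgment, not replace it. Elapsed time is deliberately not an input, since a rigid schedule is the wrong trigger.

```javascript
// Hypothetical checklist mirroring the table above; not part of any real tool.
const SIGNALS = {
  doneInOneSentence: "Good place for a checkpoint",
  nextStepDiffers: "Natural boundary",
  surpriseDuringBuild: "Stop and review before continuing",
  highRiskAhead: "Call a checkpoint first",
  cannotExplainWork: "The checkpoint is overdue",
};

// Returns one line per signal the developer has marked as true.
function checkpointReasons(observed) {
  return Object.keys(SIGNALS)
    .filter((name) => observed[name] === true)
    .map((name) => `${name}: ${SIGNALS[name]}`);
}

// Any single signal is enough to consider pausing for a Review.
function shouldCallCheckpoint(observed) {
  return checkpointReasons(observed).length > 0;
}

// Example: something unexpected surfaced during Build.
console.log(checkpointReasons({ surpriseDuringBuild: true }));
```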
Where scope decisions belong
The agent session is not where scope decisions happen. If something unexpected comes up during Build — a dependency you didn't account for, a requirement that turns out to be more complex — that's a conversation with the Spec Owner, not a prompt to the agent. The scope holds. New work goes into the ticket system as a new ticket or a refinement conversation.
The temptation is to just ask the agent to handle it while you're already in the session. Resist. Every time you expand scope mid-session, you're creating work with no criterion, no Review, and no record: exactly the kind of invisible work the methodology exists to prevent.
In practice
Here's a ticket in progress. Two checkpoints have run — the first closed cleanly, the second surfaced a question that needed escalating.
Ticket: Add structured logging to API layer
Description
Replace ad-hoc console.log calls with structured JSON logging across all API routes.
Acceptance criteria
- All console.log calls replaced with structured pino logger
- Every request logs method, path, status code, and duration as JSON
- Unhandled errors log stack trace and request ID; no sensitive fields
- Each request carries a UUID that propagates through all log lines
Activity
- Checkpoint 1: Logger configured; console.log calls removed from all route handlers.
- Checkpoint 2: Should request bodies be logged? They could contain PII depending on the endpoint. Escalated to Product Owner.
What to notice
The first checkpoint closed cleanly — the developer could explain everything. The second surfaced a real policy question the developer couldn't answer alone. The work continued; the question is in the Escalation queue. It didn't block and it didn't disappear. Both outcomes are visible directly on the ticket.
Keeping the scope small enough to call a checkpoint is what made the question surfaceable. A single "add logging" session would have buried it — or forced the agent to guess.
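For concreteness, here is roughly what the ticket's acceptance criteria describe. This is a minimal sketch, not the ticket's implementation: it uses a hand-rolled stand-in in place of pino so it runs without dependencies, and `createRequestLogger`, `logRequest`, `redact`, `SENSITIVE_FIELDS`, and the sample request ID are all hypothetical. In a real route the ID would be a UUID (in Node, `crypto.randomUUID()` is one way to generate it). Request bodies are not logged at all here, which leaves the escalated question open.

```javascript
// Sketch only: a stand-in for a structured logger such as pino.
// Every name below is an assumption made for illustration.
const SENSITIVE_FIELDS = new Set(["password", "authorization", "token"]);

// Drop sensitive keys from a flat object before it is logged.
function redact(fields) {
  const safe = {};
  for (const [key, value] of Object.entries(fields)) {
    if (!SENSITIVE_FIELDS.has(key.toLowerCase())) safe[key] = value;
  }
  return safe;
}

// Returns a logger bound to one request ID, so every line carries that ID.
// `sink` receives each JSON line; it defaults to stdout via console.log.
function createRequestLogger(requestId, sink = console.log) {
  const write = (level, msg, fields = {}) =>
    sink(JSON.stringify({ level, requestId, msg, ...redact(fields) }));
  return {
    info: (msg, fields) => write("info", msg, fields),
    error: (msg, err, fields) =>
      write("error", msg, { ...fields, stack: err.stack }),
  };
}

// Logs one completed request with method, path, status code, and duration.
function logRequest(logger, { method, path, statusCode, durationMs }) {
  logger.info("request completed", { method, path, statusCode, durationMs });
}

// Example usage with a placeholder ID.
const logger = createRequestLogger("example-request-id");
logRequest(logger, { method: "GET", path: "/users", statusCode: 200, durationMs: 12 });
logger.error("unhandled error", new Error("boom"));
```

Binding the request ID when the logger is created, instead of passing it on every call, is what makes the "propagates through all log lines" criterion easy to verify at a checkpoint.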