The Night Shift - Blaz Kos

A human-agent collaboration model where humans own the specification layer and agents own the implementation loop — running autonomously while you rest.

Two Shifts. One Codebase. Radically Different Work.

The Night Shift is a structured human-agent collaboration model built on a single premise: human time and cognitive energy are finite and expensive; agent tokens are cheap and nearly unlimited. The system optimises accordingly.

During the day, the human operates in deep specification mode: gathering requirements, designing architecture, writing precise feature specs, and thinking through edge cases without the distraction of running agents. During the night, one or more coding agents execute autonomously against those specs (implementing, testing, reviewing, and committing) with no real-time human supervision. The human reviews the results the next morning.

The result is a workflow with no context-switching, no babysitting, no idle anxiety, and a codebase that arrives each morning either finished or clearly annotated with what needs human attention.

Human time and energy are precious. Agents and agent tokens are cheap and plentiful. Design the system accordingly.

Day Shift — Human	Night Shift — Agent
Think, Specify, Design	Implement, Test, Review
Gather requirements from stakeholders	Load agent loop instructions
Design system architecture	Pick specs from the queue
Write detailed feature specs	Write tests before code
Use AI only in narrow "Ask" mode	Run multi-persona review agents
File completed specs in /Specs	Commit, document, repeat
Agents sit idle: no babysitting	Human is away: fully autonomous

Four Operating Principles

The workflow is built on four constraints that drive every design decision — from how specs are written to how the agent loop terminates.

Human tokens are the scarce resource. Reading and writing text by a human is expensive. Every workflow decision minimises human token consumption. The human should never read agent plans, long generated outputs, or verbose agent updates.
Agent tokens are practically unlimited. Agents should burn tokens freely to achieve quality. Multiple review passes, sub-agent critiques, extended test suites — none of this is rationed. The bottleneck is never compute.
Human retains control, minimally. The human approves the specification before execution, reviews commits the next morning, and corrects the system — not the code. Control is preserved without becoming a bottleneck.
The system improves continuously. Every agent mistake is a signal. The correct response is to improve docs, skills, and validations — not to manually patch the output. Each fix amortises across all future work.

The Night Shift Loop

The agent operates a structured loop. Each cycle handles one feature or bug through the following sequence. The loop repeats until no specs remain, then produces a concise summary report and goes silent.

Step	Phase	Details
0	Preparation	Clean the working tree: analyse any uncommitted changes, then stash or commit them. Run the full existing test suite and fix any pre-existing failures before touching anything new.
1	Select a task	Bugs are prioritised first. If the bug queue is clear, pick the next completed spec. Files prefixed draft- are skipped entirely.
2–3	Load context	Parse the spec thoroughly, then load the relevant system documentation and inspect existing code. A routing document guides which docs to load for each feature domain.
4–5	Write tests first, expect failures	Develop a comprehensive testing plan, then write all tests before writing any implementation code. Run them and confirm they fail as expected. No test suite, no implementation.
6	Develop implementation plan	The agent produces a detailed plan for its own use. The human never reads this. Its purpose is to force structured reasoning before coding begins.
7–8	Multi-persona review cycle	Six sub-agent reviewers critique the plan: Designer, Architect, Domain Expert, Code Expert, Performance Expert, Human Advocate. The plan iterates until all six signal approval.
9	Implement, including documentation	Execute the approved plan. Documentation is updated as part of implementation, not as an afterthought.
10–11	Static analysis and regression testing	Run type checking, linting, compiler, and all relevant tests. Fix issues and iterate. Then run the entire test suite to catch regressions.
12	Second review cycle, on the diff	Run all six reviewer personas again, this time against the actual implementation diff. If any reviewer raises a concern, loop back until all approve.
13–14	Commit and document	Log unrelated issues for human review. Write a CHANGELOG entry. Commit with a detailed message written for human context.
15–17	Loop or terminate	Return to step 1 and select the next task. When all specs are complete, write a concise summary report and go silent.
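
The control flow of the loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's implementation: `Task`, `select_task`, `review_until_approved`, `night_shift`, and the `review` and `revise` callbacks are hypothetical names, and the `max_rounds` cap is an added safeguard that the description above does not mention.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# The six reviewer personas named in steps 7-8.
PERSONAS = ["Designer", "Architect", "Domain Expert", "Code Expert",
            "Performance Expert", "Human Advocate"]

@dataclass
class Task:
    name: str
    is_bug: bool = False

# Hypothetical callbacks: a reviewer returns a concern string, or None to approve;
# a reviser returns an updated artifact given the list of concerns.
Reviewer = Callable[[str, str], Optional[str]]
Reviser = Callable[[str, list[str]], str]

def select_task(queue: list[Task]) -> Optional[Task]:
    """Step 1: bugs first, otherwise the next task in queue order."""
    for task in queue:
        if task.is_bug:
            return task
    return queue[0] if queue else None

def review_until_approved(artifact: str, review: Reviewer, revise: Reviser,
                          max_rounds: int = 10) -> str:
    """Steps 7-8 and 12: iterate until all six personas approve.

    max_rounds is an assumption added here so the sketch cannot spin forever.
    """
    for _ in range(max_rounds):
        concerns = [c for p in PERSONAS if (c := review(p, artifact)) is not None]
        if not concerns:
            return artifact
        artifact = revise(artifact, concerns)
    raise RuntimeError("reviewers did not converge; flag for human attention")

def night_shift(queue: list[Task], review: Reviewer, revise: Reviser) -> list[str]:
    """Steps 15-17: loop until no tasks remain, then return the summary lines."""
    summary = []
    while (task := select_task(queue)) is not None:
        queue.remove(task)
        review_until_approved(f"plan for {task.name}", review, revise)
        summary.append(f"done: {task.name}")
    return summary
```

In a real setup the callbacks would call out to sub-agents; here they are plain functions so the ordering and termination behaviour can be checked in isolation.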

Morning Review Protocol

The human's morning routine is deliberate and structured. The goal is not just to catch bugs — it's to improve the system so the next night runs better than this one.

Start with the CHANGELOG and agent summary to understand the scope of what was done. Then review commit by commit: read the commit message, examine the diff, inspect the tests, and check the documentation changes. Each commit is designed to be self-explanatory without additional context.
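
A commit-by-commit review queue can be built from `git log`. The sketch below assumes the human marks a starting point before leaving for the night; `since_ref` (for example a tag or branch name), `parse_log`, and `overnight_commits` are hypothetical names, not part of the workflow as described.

```python
import subprocess

def parse_log(output: str) -> list[tuple[str, str]]:
    """Split `git log --format=%H%x09%s` output into (hash, subject) pairs."""
    pairs = []
    for line in output.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        commit, _, subject = line.partition("\t")
        pairs.append((commit, subject))
    return pairs

def overnight_commits(since_ref: str) -> list[tuple[str, str]]:
    """List commits made after since_ref, oldest first, for morning review."""
    result = subprocess.run(
        ["git", "log", "--reverse", "--format=%H%x09%s", f"{since_ref}..HEAD"],
        capture_output=True, text=True, check=True,
    )
    return parse_log(result.stdout)
```

Each hash in the returned list can then be inspected with `git show <hash>` to read the message, diff, tests, and documentation changes together.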

Manual testing is non-optional. Almost every change gets checked by hand, not because the tests are insufficient, but because hands-on checking catches gaps in documentation, tests, and the human's own understanding of the system.

The Correction Protocol

When the agent makes a wrong decision, the correct response is never to patch the output and move on. Instead, identify why the agent made that decision: which doc was unclear, which validation was missing, which skill lacked sufficient detail. Fix those root causes first, then fix the original issue. Every correction made this way amortises across all future work.

The Specification Layer

Spec writing is the human's primary contribution to this system. Specs are not written for the agent — they are written for the human, to organise thinking, surface edge cases, and make architecture decisions explicit before implementation begins.

A good spec describes the feature, all edge cases to handle, integration points, expected behaviour, and anything that might create ambiguity. AI autocomplete is turned off during spec writing. The human does this work themselves, at a sustainable pace, with full attention.

Over time, the amount of detail required in each spec shrinks. As system documentation improves, the agent increasingly pulls domain knowledge from its docs rather than from the spec. The spec becomes a lightweight intent document rather than an exhaustive briefing.

System Architecture

The Night Shift workflow relies on a small set of well-maintained files. Their quality directly determines the quality of overnight output.

File	Purpose
AGENT_LOOP.md	The agent's operating instructions for the night — the master loop definition
AGENTS.md	A routing document pointing to all docs, skills, and system knowledge by domain
REVIEW_PERSONAS.md	Definitions for six critical reviewer sub-agents and their documentation ownership
/Specs	Completed specs ready for execution. Files prefixed draft- are ignored by the agent
/Docs	System documentation maintained inside the codebase, updated as part of every commit
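
The draft- convention is simple to enforce when building the spec queue. A minimal sketch, assuming specs are plain files in a single directory and that alphabetical order is an acceptable processing order (the source does not say how specs are ordered); `ready_specs` is a hypothetical helper name.

```python
from pathlib import Path

def ready_specs(spec_dir: Path) -> list[Path]:
    """Return spec files ready for execution, skipping draft- files.

    Sorting by name is an assumption made for determinism in this sketch.
    """
    return sorted(
        p for p in spec_dir.iterdir()
        if p.is_file() and not p.name.startswith("draft-")
    )
```

Renaming a file from `draft-billing.md` to `billing.md` is then the only action needed to release a spec to the night shift.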

Reported Outcomes

5x faster throughput — Measured against previous agentic workflows after one month of continuous use

0 context switches — The human never pivots between speccing and supervising. Each mode is protected and undisrupted.

Compounding quality — Every correction improves the system permanently. Quality compounds daily — not just the code, but the agent's operating context.

This is not a prompt-and-review workflow. It is a system design discipline. The agent loop is only as good as the specifications, documentation, and validations that surround it. Every morning review is an investment in the next night's output.