The Full Cycle

In one line: Brainstorm → design (one artifact) → worktree → TDD → verify → smoke → review → merge; the Brainstorm Gate blocks the path until a Pre-Mortem Block clears qualifying proposals.

The full development cycle applies to any non-trivial new feature or architectural change. Trivial changes (config updates, typo fixes, single-line adjustments with no architectural impact) may skip directly to implementation.

In the diagram, the pink nodes are the gates. The lifecycle is defined in §3 and the gates in §7, so until now a reader had to hold both sections in mind and join them mentally. Here they are joined, with each gate shown at the point where it fires.

The drawing makes two things hard to miss. First, the Brainstorm Gate sits before design, not before code: by the time there is a plan to review, the decision the gate exists to interrogate has already been made. Second, second-party scrutiny is a separate SESSION, not a second opinion in the same one: same-session self-review does not count, because the context that produced the design is the context least able to see what it missed.

Phase details:

Brainstorm (/brainstorming). Open-ended problem exploration. The skill guides the conversation through problem statement articulation, constraint identification, and generation of at least 3 candidate approaches with explicit tradeoffs. The output is a conversation log, not a document. Its purpose is to prevent the common failure mode of committing to the first approach that seems reasonable.

Duration: 10-30 minutes. Skip threshold: if the implementation approach is obvious and has no meaningful alternatives (e.g., "add a new column to an existing table with an obvious type and no migration concerns"), skip to the design artifact (/writing-plans).

The Brainstorm Gate (Pre-Mortem Block). When a brainstorm proposal triggers any of the conditions below, the brainstorm phase cannot conclude — and no spec/plan/code work begins — until a Pre-Mortem Block is emitted that addresses the Decision-Cost Rubric (Section 2.7) and three reflection fields.

Trigger conditions (any one fires the gate):

- Adds a new dependency (library, framework, external service)
- Replicates a pattern across 3 or more files / call sites
- Estimated to change hot-path latency by more than 100 ms (either direction)
- Modifies a public API surface, schema, or data contract
- Spans more than 2 hours of estimated implementation work
- Changes safety policy, guard/refusal behavior, or any end-user-protection mechanism (relaxations AND additions). This is the hardest trigger: the gate cannot be cleared by the proposer alone. The block must end with an explicit human sign-off line (Safety sign-off: <name, date>), and the Pre-Mortem must name the incident class the change could re-open.
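The six trigger conditions can be read as a simple any-of check. The following is a minimal sketch, not part of the methodology's tooling: the `Proposal` class, its field names, and both helper functions are hypothetical, and the estimates they hold would come from the brainstorm conversation itself.

```python
from dataclasses import dataclass


@dataclass
class Proposal:
    """Hypothetical summary of a brainstorm proposal; all field names are illustrative."""
    new_dependencies: int = 0
    replicated_call_sites: int = 0
    hot_path_latency_delta_ms: float = 0.0  # signed estimate; either direction counts
    touches_public_contract: bool = False   # public API surface, schema, or data contract
    estimated_hours: float = 0.0
    touches_safety_policy: bool = False     # relaxations AND additions


def fired_triggers(p: Proposal) -> list[str]:
    """Return the names of the trigger conditions this proposal fires."""
    checks = [
        ("new dependency", p.new_dependencies > 0),
        ("pattern replicated across 3+ sites", p.replicated_call_sites >= 3),
        ("hot-path latency change > 100 ms", abs(p.hot_path_latency_delta_ms) > 100),
        ("public API / schema / data contract", p.touches_public_contract),
        ("more than 2 hours of work", p.estimated_hours > 2),
        ("safety policy change", p.touches_safety_policy),
    ]
    return [name for name, fired in checks if fired]


def gate_fires(p: Proposal) -> bool:
    """Any single trigger is enough to require a Pre-Mortem Block."""
    return bool(fired_triggers(p))
```

The list returned by `fired_triggers` maps directly onto the "Triggers fired" line of the Pre-Mortem Block. Whether the safety trigger fired is also what decides if the human sign-off line is required.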

Pre-Mortem Block format (exact shape — the structure is the gate):

## Pre-Mortem — <proposal name>

Proposal in one sentence: ...

Triggers fired: <which of the six conditions above>

Rubric axes:
- Latency: <estimate, or "not measured because…">
- Dependency surface: <new deps + transitive deps + lines we own vs. depend on>
- Debuggability: <what a 3am stack trace looks like; who can fix it>
- Reversibility: <hours to undo>
- Blast radius: <code paths affected; additive vs. substitutive>
- Alternative considered: <one credible alternative + one-sentence "why rejected">
- Cost: <estimate, or "not measured because…">

Strongest risk I see: <specific, named-component, falsifiable>
What would change my mind: <concrete signal — measurement, benchmark, user report>
Confidence: <low / medium / high, with reason>

The two non-skippable lines are Strongest risk I see and What would change my mind. If those fields read "no significant risks" and "nothing comes to mind," the gate has not been cleared — the brainstorm continues until they can be filled with specifics. Generic risks (complexity, maintenance burden) do not satisfy the format; the field demands a specific failure mode tied to a named system component. For safety-policy triggers a third line is non-skippable: Safety sign-off: <name, date> — without it the gate is not cleared, regardless of who proposed the change.
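Because the structure is the gate, part of the check can be mechanised. The sketch below is an assumption-laden illustration, not an existing tool: `field_value`, `unmet_requirements`, and the `GENERIC_ANSWERS` set are hypothetical names, and a string match can only reject empty, placeholder, or stock answers. Judging whether a risk is truly specific and tied to a named component still needs a reader.

```python
import re

# Hypothetical list of stock answers that do not clear the gate.
GENERIC_ANSWERS = {
    "no significant risks",
    "nothing comes to mind",
    "complexity",
    "maintenance burden",
    "none",
    "n/a",
}

NON_SKIPPABLE = ("Strongest risk I see", "What would change my mind")


def field_value(block: str, label: str) -> str:
    """Return the text after 'label:' on its own line, or '' if the line is absent."""
    match = re.search(rf"^{re.escape(label)}:[ \t]*(.*)$", block, flags=re.MULTILINE)
    return match.group(1).strip() if match else ""


def unmet_requirements(block: str, safety_trigger: bool = False) -> list[str]:
    """List the non-skippable lines that are missing, unfilled, or generic."""
    problems = []
    for label in NON_SKIPPABLE:
        value = field_value(block, label).rstrip(".").lower()
        # An untouched template placeholder such as "<specific, ...>" counts as unfilled.
        if not value or value.startswith("<") or value in GENERIC_ANSWERS:
            problems.append(f"{label}: needs a specific answer")
    if safety_trigger and not field_value(block, "Safety sign-off"):
        problems.append("Safety sign-off: missing")
    return problems
```

An empty result means only that the block has the required shape; it does not mean the risk named is the right one.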

Why this lives at brainstorm. Pivot cost compounds non-linearly with phase. Changing course costs a sentence at brainstorm, a rewritten spec at the spec stage, a rewritten plan plus re-alignment at the plan stage, and rewritten code once code exists. Fired at brainstorm, the rubric changes the question from "did we catch the bad decision" to "did we make the right decision in the first place", which is where the leverage is.

Why it works despite being the hardest point to enforce. At brainstorm the proposer is most invested and the AI's gradient toward agreement peaks. The visible-artifact design compensates: if the AI drifts past the gate without emitting the Pre-Mortem Block, the absence is detectable to the human in real time. The norm is enforceable because a missing artifact is visible, not because the AI remembered to push back.

The cost of not having this gate is the single-dimension adoption case in Section 2.7's Evidence paragraph: a framework adopted without a latency axis, the tax surfacing only in production.

Design (/writing-plans). One artifact replaces the former specification + plan pair (template: templates/spec-template.md in the methodology repo). It captures the reasoning AND the ordered task list: audience, problem statement, design decisions table (decision / choice / rationale), the Pre-Mortem Block when the gate fires, data flow, API contracts, error-handling strategy, testing strategy, and the implementation tasks. The old dual artifact produced reams of process prose with specs written minutes before their code by the same session — ceremony, not scrutiny. Trivial changes (single file, no schema/API/safety surface) need only a one-sentence design note in the PR description.

Each task in the artifact specifies: task number and title; dependencies; acceptance criteria; files likely to be created or modified. Tasks are sized for a single subagent session (~60 minutes max — split anything larger). Annotate ‖ parallelisable only when the task is genuinely independent (Section 5.4's wave-dispatch consumes the marker).
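The task fields and the ‖ marker lend themselves to a small data model. The sketch below is illustrative only: the `Task` class, `oversized`, and `waves` are hypothetical names, and how Section 5.4's wave-dispatch really consumes the marker is not described here. The sketch assumes that tasks marked ‖ whose dependencies are all finished may share a wave, while an unmarked task runs in a wave of its own.

```python
from dataclasses import dataclass, field

MAX_TASK_MINUTES = 60  # tasks are sized for a single subagent session


@dataclass
class Task:
    number: int
    title: str
    acceptance_criteria: list[str]
    files: list[str]                      # files likely to be created or modified
    depends_on: list[int] = field(default_factory=list)
    estimated_minutes: int = 60
    parallelisable: bool = False          # the "‖" marker


def oversized(tasks: list[Task]) -> list[int]:
    """Task numbers that exceed the session budget and should be split."""
    return [t.number for t in tasks if t.estimated_minutes > MAX_TASK_MINUTES]


def waves(tasks: list[Task]) -> list[list[int]]:
    """Group task numbers into dispatch waves (assumed semantics, see lead-in)."""
    remaining = {t.number: t for t in tasks}
    done: set[int] = set()
    result: list[list[int]] = []
    while remaining:
        ready = sorted(n for n, t in remaining.items() if set(t.depends_on) <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        parallel = [n for n in ready if remaining[n].parallelisable]
        # Marked tasks go out together; otherwise run the lowest-numbered ready task alone.
        wave = parallel if parallel else [ready[0]]
        result.append(wave)
        done.update(wave)
        for n in wave:
            del remaining[n]
    return result
```

Raising on an unsatisfiable dependency is a deliberate choice in this sketch: a task list that cannot be ordered is a defect in the design artifact and is cheaper to surface before dispatch than during it.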

Duration: 30-60 minutes. The artifact is committed to docs/ before any implementation begins.

Second-party scrutiny threshold. Designs touching a safety path, schema, public API, or auth get review by a second party — a human, or a fresh-context agent in a SEPARATE session that has not seen the design being written. Same-session self-review does not count: its catch-rate is indistinguishable from zero, and "Draft (awaiting review)" designs tend to merge unreviewed. The heavier "plan walkthrough" ritual this threshold replaces had near-zero completions under throughput pressure; review-by-fresh-context is the form of scrutiny that survives contact with reality.
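The threshold reduces to two questions: does the design touch a listed surface, and does the review come from a qualifying second party? A minimal sketch follows, assuming sessions can be told apart by an identifier. The `Review` class, the surface labels, and both function names are hypothetical, and a differing session id is only a proxy for "has not seen the design being written".

```python
from dataclasses import dataclass

# Surfaces named in the threshold; the string labels themselves are illustrative.
SCRUTINY_SURFACES = {"safety", "schema", "public_api", "auth"}


@dataclass(frozen=True)
class Review:
    reviewer_is_human: bool
    reviewer_session: str   # hypothetical session identifier
    authoring_session: str  # session in which the design was written


def needs_second_party(surfaces_touched: set[str]) -> bool:
    """True when the design touches any surface that requires second-party review."""
    return bool(surfaces_touched & SCRUTINY_SURFACES)


def review_counts(review: Review) -> bool:
    """A human review counts; an agent review counts only from a separate session."""
    if review.reviewer_is_human:
        return True
    return review.reviewer_session != review.authoring_session
```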

Create Worktree (/using-git-worktrees). Filesystem isolation: the feature branch lives in a separate directory, preventing accidental commits to the wrong branch. The skill handles branch creation, worktree setup, and context transfer.

Implement with TDD (/test-driven-development). Red-green-refactor by default: write a failing test, write the minimal code that passes it, refactor while green. An anti-rationalization table blocks rationalizing a failure as "expected" — every failure is fixed or explicitly documented as a known limitation. In PoC mode the ordering relaxes (code-first, tests-after), but the tests stay mandatory.

Verify Completion (/verification-before-completion). Requires evidence before marking complete: fresh timestamped test output, a coverage report meeting threshold, zero-new-warning linting, and visual verification for UI changes. This is "evidence over claims" (Section 2.3) at the task level.
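The evidence requirements can be expressed as a checklist that returns what is still missing. This is a sketch under stated assumptions: the `Evidence` class and `missing_evidence` function are hypothetical, and the source gives no value for the coverage threshold or for how recent "fresh" test output must be, so both are left as required parameters rather than guessed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Evidence:
    tests_passed: bool
    test_run_at: datetime        # timestamp on the test output
    coverage_percent: float
    new_lint_warnings: int
    is_ui_change: bool = False
    visually_verified: bool = False


def missing_evidence(
    e: Evidence,
    *,
    now: datetime,
    max_age: timedelta,          # assumption: how old output may be and still count as fresh
    coverage_threshold: float,   # assumption: project-specific, not stated in the source
) -> list[str]:
    """Return the evidence gaps that block marking the task complete."""
    gaps = []
    if not e.tests_passed:
        gaps.append("tests did not pass")
    if now - e.test_run_at > max_age:
        gaps.append("test output is not fresh")
    if e.coverage_percent < coverage_threshold:
        gaps.append("coverage below threshold")
    if e.new_lint_warnings > 0:
        gaps.append("new lint warnings")
    if e.is_ui_change and not e.visually_verified:
        gaps.append("UI change not visually verified")
    return gaps
```

An empty list is the only result that permits marking the task complete. Anything else names the evidence still owed.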

Code Review (/requesting-code-review). Two stages (Section 3.3): spec compliance, then code quality.

Finish & Merge (/finishing-a-development-branch). Merges the branch, cleans up the worktree, and, in the same commit, regenerates STATE.md and updates the relevant memory files to reflect the new state. This step is mandatory: STATE.md and memory are updated on every finished branch (memory also on any decision, correction, or new pattern discovered mid-session), so a fresh session's entry point never rots. The reviewer's documentation check (Section 11.4) confirms both. As an advisory backstop, templates/hooks/check-doc-staleness.sh warns when STATE.md has gone untouched for more than 30 days; it never blocks.
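The advisory nature of the staleness check is easy to show in code. The sketch below is not the contents of templates/hooks/check-doc-staleness.sh, which is a shell script not reproduced here. It is a Python illustration of the same idea, assuming file modification time is an acceptable proxy for "untouched" (the real hook may consult git history instead). The function names are hypothetical.

```python
import time
from pathlib import Path
from typing import Optional

STALE_AFTER_DAYS = 30
SECONDS_PER_DAY = 86_400


def staleness_warning(path: Path, now: Optional[float] = None) -> Optional[str]:
    """Return a warning if the file is missing or older than the threshold, else None."""
    if not path.exists():
        return f"{path} not found"
    current = time.time() if now is None else now
    age_days = (current - path.stat().st_mtime) / SECONDS_PER_DAY
    if age_days > STALE_AFTER_DAYS:
        return f"{path} untouched for {int(age_days)} days (advisory threshold: {STALE_AFTER_DAYS})"
    return None


def main(path: str = "STATE.md") -> int:
    """Print a warning when stale; always return 0 so the check never blocks."""
    warning = staleness_warning(Path(path))
    if warning:
        print(f"warning: {warning}")
    return 0  # advisory only: a non-zero exit code would turn the warning into a gate
```

Returning 0 unconditionally is the design point. The mandatory update happens at Finish & Merge, and this check only surfaces cases where that step was missed.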