The Four Project Reviewers

In one line: A worked example — four reviewer agents, each with a focused checklist where every item traces to a real failure it now prevents.

What: Four project-specific reviewer agents provide automated checks calibrated to the project's conventions. Each has a focused checklist, a defined scope, and a structured output format. (The exact checklist items below are illustrative of a regulated multi-tenant system; adapt them to your domain.)
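One way to picture "a focused checklist, a defined scope, and a structured output format" is as plain data. The sketch below is an assumption, not the project's actual reviewer definition format: the class names, fields, and severity levels are hypothetical. The single checklist entry is taken from the Migration Reviewer described later.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChecklistItem:
    rule: str      # what the reviewer checks
    prevents: str  # the real failure this item traces back to


@dataclass(frozen=True)
class Finding:
    # Hypothetical structured output: one record per violation found.
    reviewer: str
    rule: str
    location: str  # e.g. a file path and line
    severity: str  # assumed levels: "blocker", "warning", "note"


@dataclass(frozen=True)
class Reviewer:
    name: str
    model: str
    scope: str  # which files or changes the reviewer looks at
    checklist: tuple[ChecklistItem, ...] = field(default_factory=tuple)


migration_reviewer = Reviewer(
    name="migration",
    model="Sonnet",
    scope="Alembic migration files",
    checklist=(
        ChecklistItem(
            rule="CAST(:param AS jsonb), not ::jsonb",
            prevents="post-deploy runtime failure under asyncpg",
        ),
    ),
)
```

Keeping the "prevents" field next to each rule is one way to make the traceability claim checkable: an item with no recorded failure behind it stands out.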

API Reviewer (model: Sonnet)

Checks FastAPI endpoints for consistency with established patterns. The checklist covers:

  • Pydantic BaseModel for all request/response schemas (not raw dicts)
  • response_model declared in route decorators
  • Correct HTTP status codes (201 for creation, 404 for not found)
  • Authentication via Depends(get_current_user)
  • Tenant context via Depends(get_current_tenant)
  • Tenant-scoped database access via get_tenant_session(tenant_id)
  • Pagination for large responses
  • No N+1 query patterns

When to dispatch: After adding or modifying API routes. The API reviewer catches convention violations that general code review does not see. For example, using get_session() instead of get_tenant_session() produces code that works in testing but leaks data across tenants in production.
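The "works in testing, leaks in production" failure can be reproduced in a few lines. This is a minimal sketch using the standard-library sqlite3 module as a stand-in for the project's database. The two functions only borrow the names get_session and get_tenant_session; their bodies, the extra conn argument, the cases table, and the WHERE-clause filtering are assumptions made for illustration (the real system may enforce scoping differently, for example through row-level security).

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cases (id INTEGER PRIMARY KEY, tenant_id TEXT, title TEXT)"
    )
    return conn


class UnscopedSession:
    """Stand-in for get_session(): no tenant filter is applied."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn

    def list_cases(self) -> list[str]:
        rows = self.conn.execute("SELECT title FROM cases ORDER BY id")
        return [row[0] for row in rows]


class TenantSession:
    """Stand-in for get_tenant_session(tenant_id): every query is filtered."""

    def __init__(self, conn: sqlite3.Connection, tenant_id: str) -> None:
        self.conn = conn
        self.tenant_id = tenant_id

    def list_cases(self) -> list[str]:
        rows = self.conn.execute(
            "SELECT title FROM cases WHERE tenant_id = :tenant_id ORDER BY id",
            {"tenant_id": self.tenant_id},
        )
        return [row[0] for row in rows]


def get_session(conn: sqlite3.Connection) -> UnscopedSession:
    return UnscopedSession(conn)


def get_tenant_session(conn: sqlite3.Connection, tenant_id: str) -> TenantSession:
    return TenantSession(conn, tenant_id)


# Test fixture with a single tenant: both sessions agree, so the bug is invisible.
conn = make_db()
conn.execute("INSERT INTO cases (tenant_id, title) VALUES ('tenant-a', 'A-1')")
single_unscoped = get_session(conn).list_cases()
single_scoped = get_tenant_session(conn, "tenant-a").list_cases()

# Production-like data with a second tenant: the unscoped session leaks.
conn.execute("INSERT INTO cases (tenant_id, title) VALUES ('tenant-b', 'B-1')")
multi_unscoped = get_session(conn).list_cases()
multi_scoped = get_tenant_session(conn, "tenant-a").list_cases()
```

With one tenant in the fixture, both calls return the same rows, which is why the violation survives ordinary tests. Once a second tenant exists, only the tenant-scoped session still returns tenant-a's rows alone.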

Security Reviewer (model: Opus)

Reviews code for authentication bypass, authorization flaws, injection vulnerabilities, and tenant isolation issues. The checklist covers:

  • Authentication enforced on all protected endpoints (portal endpoints have token-based auth instead)
  • Role-based access control: require_role() guards on admin-only endpoints
  • Parameterized SQL queries (:param placeholders, not f-strings)
  • No eval(), exec(), or __import__() with user input
  • PII never logged
  • MinIO bucket access scoped to case and tenant
  • Error messages do not expose internal details

When to dispatch: After implementing features that handle user input, access control, data queries, or file operations. This reviewer uses the strongest model because security review requires judgment about attack vectors and exploitation scenarios that pattern matching alone cannot assess.
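To make the parameterized-query item concrete, here is a small sketch of the difference between a :param placeholder and an f-string. It uses the standard-library sqlite3 module, which accepts the same :name placeholder style, as a stand-in for the project's database. The users table and both function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users (name, role) VALUES (:name, :role)",
    [{"name": "alice", "role": "admin"}, {"name": "bob", "role": "viewer"}],
)


def find_user_unsafe(name: str) -> list[tuple]:
    # Flagged by the checklist: user input is interpolated into the SQL text,
    # so quotes in the input change the structure of the query.
    query = f"SELECT name, role FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(name: str) -> list[tuple]:
    # The driver binds the value separately; it is only ever treated as data.
    return conn.execute(
        "SELECT name, role FROM users WHERE name = :name", {"name": name}
    ).fetchall()


payload = "nobody' OR '1'='1"
leaked = find_user_unsafe(payload)  # the OR clause matches every row
blocked = find_user_safe(payload)   # no user has this literal name
```

The unsafe version returns both rows for the crafted input, while the parameterized version returns none. Spotting the f-string is pattern matching; deciding whether a given input path is actually reachable by an attacker is the judgment the paragraph above refers to.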

Compliance Reviewer (model: Opus)

Reviews code for the project's regulatory obligations (where the domain has them). A representative checklist:

  • Every AI output has input provenance (what data was the decision based on?)
  • Model identification present (which model, version, prompt template?)
  • Chain of thought captured (full message log, evidence bundles)
  • Confidence scoring with documented methodology
  • A prompt-version foreign key on AI execution records for traceability
  • No PII in log files or error messages
  • Audit trail is append-only (never modified)
  • System can add scrutiny but never suppress risk signals

When to dispatch: After implementing features involving AI decisions, data processing, or audit trails. This is the highest-stakes reviewer, because compliance gaps discovered after deployment have legal and financial consequences. It uses the strongest model because regulatory interpretation requires contextual judgment.
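Several of the compliance items describe the shape of a record rather than a behavior, so they can be sketched as a data structure. The example below is a minimal in-memory illustration under stated assumptions: the class names, field names, and sample values are hypothetical, and a real system would persist these records in the database rather than a Python list.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class AIExecutionRecord:
    model_name: str                  # model identification
    model_version: str
    prompt_template_version_id: int  # version foreign key for traceability
    input_refs: tuple[str, ...]      # input provenance: what the output was based on
    messages: tuple[str, ...]        # full message log
    confidence: float
    confidence_method: str           # reference to the documented methodology


class AuditTrail:
    """Append-only by construction: there is no update or delete method."""

    def __init__(self) -> None:
        self._records: list[AIExecutionRecord] = []

    def append(self, record: AIExecutionRecord) -> int:
        if not record.input_refs:
            raise ValueError("AI output recorded without input provenance")
        self._records.append(record)
        return len(self._records) - 1

    def get(self, index: int) -> AIExecutionRecord:
        return self._records[index]

    def __len__(self) -> int:
        return len(self._records)


trail = AuditTrail()
record = AIExecutionRecord(
    model_name="example-model",  # placeholder values throughout
    model_version="v1",
    prompt_template_version_id=7,
    input_refs=("document:123",),
    messages=("system: ...", "user: ...", "assistant: ..."),
    confidence=0.82,
    confidence_method="documented-heuristic-v1",
)
index = trail.append(record)

# Stored records cannot be modified in place.
try:
    trail.get(index).confidence = 1.0
    mutated = True
except FrozenInstanceError:
    mutated = False

# A record without provenance is rejected before it reaches the trail.
try:
    trail.append(replace(record, input_refs=()))
    rejected = False
except ValueError:
    rejected = True
```

A structural sketch like this covers presence of fields only. Whether the recorded provenance is sufficient for a given regulation is the contextual judgment the reviewer is there to supply.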

Migration Reviewer (model: Sonnet)

Reviews Alembic database migrations for safety, RLS compliance, and asyncpg compatibility. The checklist covers:

  • upgrade() and downgrade() are symmetric
  • NOT NULL columns on existing tables have server_default
  • New tenant-scoped tables have ENABLE ROW LEVEL SECURITY and FORCE ROW LEVEL SECURITY
  • RLS policy uses current_setting('app.current_tenant')::UUID
  • CAST(:param AS jsonb) syntax for asyncpg compatibility (not ::jsonb)
  • Migration revision chain is correct (single head)

When to dispatch: Before committing any new Alembic migration. This reviewer uses a fast model because migration review is checklist-based: the correct patterns are well-defined, and the reviewer verifies their presence or absence.
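Because these items reduce to presence or absence of known patterns, part of the checklist can be approximated with plain text matching. The sketch below is not the reviewer agent itself. It is a hypothetical helper (review_migration, the pattern names, and the sample migration text are all assumptions) that treats a migration file as a string and assumes the migration creates a new tenant-scoped table. It does not attempt the upgrade/downgrade symmetry or revision-chain checks.

```python
import re

# Patterns that must appear in a migration adding a tenant-scoped table.
REQUIRED = {
    "enable_rls": re.compile(r"ENABLE\s+ROW\s+LEVEL\s+SECURITY", re.I),
    "force_rls": re.compile(r"FORCE\s+ROW\s+LEVEL\s+SECURITY", re.I),
    "tenant_policy": re.compile(
        r"current_setting\('app\.current_tenant'\)::UUID", re.I
    ),
    "has_downgrade": re.compile(r"^def\s+downgrade\s*\(", re.M),
}

# Patterns that must not appear: a bind parameter followed by ::jsonb.
FORBIDDEN = {
    "jsonb_shorthand_cast": re.compile(r":\w+::jsonb", re.I),
}


def review_migration(source: str) -> list[str]:
    """Return a list of checklist problems found in the migration source."""
    problems = [
        f"missing: {name}"
        for name, pattern in REQUIRED.items()
        if not pattern.search(source)
    ]
    problems += [
        f"forbidden: {name}"
        for name, pattern in FORBIDDEN.items()
        if pattern.search(source)
    ]
    return problems


# Sample migration text (illustrative; it is only matched, never executed).
GOOD = '''
def upgrade():
    op.execute("ALTER TABLE notes ENABLE ROW LEVEL SECURITY")
    op.execute("ALTER TABLE notes FORCE ROW LEVEL SECURITY")
    op.execute(
        "CREATE POLICY tenant_isolation ON notes "
        "USING (tenant_id = current_setting('app.current_tenant')::UUID)"
    )
    op.execute(sa.text("UPDATE notes SET meta = CAST(:meta AS jsonb)"))

def downgrade():
    op.execute("DROP POLICY tenant_isolation ON notes")
    op.execute("ALTER TABLE notes DISABLE ROW LEVEL SECURITY")
'''

# Same migration with FORCE removed and the shorthand cast introduced.
BAD = GOOD.replace(
    '"ALTER TABLE notes FORCE ROW LEVEL SECURITY"', '"SELECT 1"'
).replace("CAST(:meta AS jsonb)", ":meta::jsonb")

good_report = review_migration(GOOD)
bad_report = review_migration(BAD)
```

Here good_report is empty and bad_report names the missing FORCE statement and the shorthand cast. Text matching like this cannot tell whether downgrade() really reverses upgrade(), which is one reason a model-based reviewer is still used for the full checklist.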

Evidence: Each checklist item traces to a specific failure the reviewer now prevents. The asyncpg cast-syntax check exists because the ::jsonb form caused a post-deploy runtime failure, and the AI-traceability check exists because prompt versioning was missing and required a retrofit. Reproducible: the checklist is committed in the reviewer definition, so the same check fires for every author. See Section 5.3.