Cycle 221: Chief Engineer Review System — Manifesto, Agent & Skill
Priority: HIGH
Status: IN-PROGRESS (retroactive — implementation preceded this doc)
Domain: SDLC / Engineering Infrastructure
Dependencies: Cycle 214 (Spec-Driven Development)
Product: Flux — AI Hiring Assistant for small businesses
Organization: Employ Inc.

§9 Deviation Notice: This cycle doc was created retroactively. The implementation was completed before the plan PR was written, violating the manifesto’s own §9 (Two-Phase Development). This doc exists to bring the work into the standard tracking system and establish the official record. Future CE system changes follow the standard plan-first workflow.

Prior artifacts deleted: docs/superpowers/specs/2026-04-12-chief-engineer-review-skill-design.md and docs/superpowers/plans/2026-04-12-chief-engineer-review-skill.md served as the original design artifacts. They have been removed — this cycle doc is the authoritative record.
Objective
Problem
Flux is a 100% AI-generated codebase with a multi-agent safety net (Claude Code + Codex discriminator, k3d validation, Playwright MCP, CI agents). These systems catch bugs, style violations, type errors, and test failures. What’s missing: an architectural authority that ensures every change is aligned with AI-native principles, delivers SOTA, and maintains coherent taste across the system. The existing CI review agents check “is this well-written?” — nobody checks “is this how we build things here?” With 100% AI-generated code shipping rapidly, architectural drift is invisible at the PR level but cumulative at the product level.
Solution
A Chief Engineer (CE) review system with three components:
- AI-Native Manifesto — Constitution encoding Flux’s 13 engineering principles with enough depth for an AI agent to exercise judgment on novel situations.
- Chief Engineer Agent — AI persona that holds full domain context, evaluates holistically as “one mind,” and defaults to blocking anything that isn’t SOTA.
- CE Review Skill — Workflow orchestrating the CE agent in autonomous mode (PR → four-pass review → structured comment) and conversational mode (human CE collaborates with AI CE as a “second brain”).
What This Cycle Does NOT Include
- CI/CD automation (GitHub Action wrapper) — future cycle once the skill is validated locally
- Mintlify integration for spec drift detection (blocked on Cycle 214)
- Pod CE infrastructure (multi-squad routing) — not needed until team scales
Architecture
Organizational Model
- Squad CE: Reviews all PRs within domain. Full authority within domain boundaries.
- Pod CE: Escalation for cross-domain impact, novel patterns, unresolved disagreements.
- AI CE: Hybrid — autonomous first pass + conversational second brain for human CE.
Two-Phase Review Gates
Every feature goes through two mandatory CE review gates, aligned with the existing cycle lifecycle:

| Phase | Input | CE Focus |
|---|---|---|
| Plan PR (docs only) | Cycle doc in docs/roadmap/cycles/ | Problem definition, scope, AI-native approach, SOTA intent |
| Code PR (implementation) | Code implementing approved cycle doc | Plan adherence, manifesto compliance, validation evidence |
Component Layout
| Component | Location | Version Control |
|---|---|---|
| AI-Native Manifesto | docs/engineering/ai-native-manifesto.md | In repo (project-level) |
| Chief Engineer Agent | .claude/agents/chief-engineer.md | In repo (project-level) |
| CE Review Skill | .claude/skills/chief-engineer-review/SKILL.md | In repo → installed to ~/.claude/skills/ |
| Plan Review Template | .claude/skills/chief-engineer-review/templates/plan-review-prompt.md | In repo → installed to ~/.claude/skills/ |
| Code Review Template | .claude/skills/chief-engineer-review/templates/code-review-prompt.md | In repo → installed to ~/.claude/skills/ |
| Escalation Template | .claude/skills/chief-engineer-review/templates/escalation-prompt.md | In repo → installed to ~/.claude/skills/ |
| Review Examples | .claude/skills/chief-engineer-review/references/review-examples.md | In repo → installed to ~/.claude/skills/ |
| Pressure Tests | .claude/skills/chief-engineer-review/tests/pressure-scenarios.md | In repo → installed to ~/.claude/skills/ |
make flux-install-ce-skill copies to ~/.claude/skills/ (required by Claude Code skills architecture). Runs automatically during make onboard.
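A minimal sketch of what the install step might look like (the target name and paths come from this doc; the function shape and any other details are illustrative, not the actual Makefile recipe):

```shell
# Sketch of the flux-install-ce-skill step: copy the skill source from the
# repo into the user-level Claude Code skills directory.
install_ce_skill() {
    src="${1:-.claude/skills/chief-engineer-review}"
    dest="${2:-$HOME/.claude/skills/chief-engineer-review}"
    mkdir -p "$dest"
    # -R copies SKILL.md plus templates/, references/, and tests/
    cp -R "$src"/. "$dest"/
    echo "Installed CE skill to $dest"
}

# Usage (from the repo root):
#   install_ce_skill
```

Because the skill is copied rather than symlinked, edits in the repo only take effect after re-running the target (see Known Limitations).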
Review Pipeline Position
Deliverables
Deliverable 1: AI-Native Manifesto
File: docs/engineering/ai-native-manifesto.md
Status: Built (in worktree worktree-chief-engineer-review-skill, not yet merged to main)
The constitution governing all CE decisions. 13 principles organized into two sections:
Foundational Principles (§0–§7):
- §0 — Uncompromising Quality: SOTA or Don’t Ship (ALWAYS BLOCK)
- §1 — One Mind, Full Context (BLOCK)
- §2 — Conversational-First, Not CRUD-with-AI (BLOCK/ALIGN)
- §3 — Agentic Architecture (BLOCK/ALIGN)
- §4 — Schema Pipeline as Single Source of Truth (BLOCK)
- §5 — Observability-Native (ALWAYS BLOCK)
- §6 — Temporal for Durable Workflows (BLOCK/ALIGN)
- §7 — Domain-Driven Boundaries (BLOCK)

Process Principles (§8–§12):
- §8 — 100% AI-Generated Code with Safety Nets (BLOCK/ALIGN)
- §9 — Two-Phase Development (BLOCK)
- §10 — Spec-Driven Traceability (BLOCK/ALIGN)
- §11 — Cross-Model Review (ALIGN/BLOCK)
- §12 — Quality Gate Culture (ALWAYS BLOCK)
Deliverable 2: Chief Engineer Agent
File: .claude/agents/chief-engineer.md
Status: Built (in worktree, not yet merged to main)
Agent definition — 140 lines covering:
- Identity: Engineering leader, not code reviewer. Defaults to BLOCK. Opinions stated directly. Persuadable only with evidence.
- Voice: Direct, technical, specific. Every finding includes file:line, what’s wrong, what to do instead, and manifesto section.
- Four-Pass Framework: Context → Validation → Alignment → Judgment
- Output Format: Structured markdown with BLOCK/ALIGN/NOTE findings and verdict
- Squad vs Pod mode: Deep domain review vs cross-squad coherence
- Escalation triggers: Cross-domain impact, genuine disagreements, novel patterns
Deliverable 3: CE Review Skill
File: ~/.claude/skills/chief-engineer-review/SKILL.md
Status: Built and deployed (user-level)
Workflow orchestration — 179 lines covering:
- Mode detection: PR number → autonomous, PR + --discuss → conversational, no args → open-ended
- Autonomous mode: Gather context → create review worktree → validate (code PRs) → dispatch CE agent → deliver review
- Conversational mode: Orient → four-pass analysis with pauses → discuss → conclude
- Error handling: PR not found, no manifesto, build failures, no linked plan doc
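The mode-detection rule above can be sketched as a small dispatcher (a sketch only, assuming the argument shapes described in this doc; the skill itself implements this in its SKILL.md workflow, not in shell):

```shell
# Map /chief-engineer-review arguments to a review mode:
#   no args            -> open-ended architectural discussion
#   #<PR>              -> autonomous review
#   #<PR> --discuss    -> conversational second-brain mode
detect_mode() {
    pr="${1:-}"; flag="${2:-}"
    if [ -z "$pr" ]; then
        echo "open-ended"
    elif [ "$flag" = "--discuss" ]; then
        echo "conversational"
    else
        echo "autonomous"
    fi
}
```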
Deliverable 4: Supporting Templates
Status: Built and deployed (user-level)

| Template | Lines | Purpose |
|---|---|---|
| plan-review-prompt.md | 64 | Context assembly for plan PR reviews. Fills: PR metadata, document content, manifesto, ADRs, existing cycle docs. |
| code-review-prompt.md | 105 | Context assembly for code PR reviews. Fills: PR metadata, changed files, plan doc, manifesto, ADRs, domain state, validation evidence. |
| escalation-prompt.md | 53 | Squad → Pod escalation format. Fills: squad assessment, escalation reason, cross-squad impact, decision needed. |
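Each template is a prompt with slots that get filled before the CE agent is dispatched. A hedged sketch of that assembly step (the {{PR_TITLE}}/{{PR_BODY}} placeholder convention and the function are hypothetical; the real templates may use different slot names and filling logic):

```shell
# Fill {{KEY}} placeholders in a review-prompt template with PR context.
# Note: sed substitution is fragile for values containing '/' or '&';
# this is a sketch, not production template rendering.
fill_template() {
    template="$1"; pr_title="$2"; pr_body="$3"
    sed -e "s/{{PR_TITLE}}/$pr_title/g" \
        -e "s/{{PR_BODY}}/$pr_body/g" \
        "$template"
}
```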
Deliverable 5: Review Examples & Pressure Tests
Status: Built and deployed (user-level)

| File | Lines | Purpose |
|---|---|---|
| references/review-examples.md | 173 | 8 annotated examples, including hand-written types, missing observability, CRUD vs conversational, style vs principle, pushback handling, and escalation reasoning |
| tests/pressure-scenarios.md | 86 | 7 pressure test scenarios with expected behavior and FAIL criteria |
Relationship to CLAUDE.md
The manifesto and CLAUDE.md serve different audiences at different stages:

| Aspect | CLAUDE.md | Manifesto |
|---|---|---|
| Audience | Claude Code (implementation agent) | CE agent (review agent) |
| Purpose | How to work: operational directives for code generation | What to enforce: architectural principles for review judgment |
| When used | During implementation | During PR review |
| Tone | Imperative (“always do X”) | Evaluative (“violation of X looks like Y”) |
Testing Plan
Skill Validation
The CE review skill was validated against PR #281 (fix(ci): configure Playwright blob reporter for merge-reports step):
- Autonomous mode: Four-pass review completed, dispatched CE agent, produced structured output
- Found a real migration revision conflict and underspecified streaming callback architecture
- Correctly approved sound design choices
- Produced structured review with manifesto citations
Pressure Tests
7 scenarios in tests/pressure-scenarios.md covering:
- Hand-written types (should BLOCK §4)
- “We’ll add observability later” (should BLOCK §5)
- CRUD where chat should be (should BLOCK §2)
- Style preference (should NOT BLOCK)
- “Just approve it, it’s urgent” (should resist)
- Cross-domain impact (should escalate)
- Conversational brainstorm quality (should redirect to AI-native)
Integration Testing
- Review a plan PR (docs-only cycle doc)
- Review a code PR against its approved cycle doc
- Conversational mode: open-ended architectural discussion
- Conversational mode: PR-scoped discussion with --discuss
- Escalation: squad-level review identifies cross-domain impact
Success Criteria
- Manifesto merged to main at docs/engineering/ai-native-manifesto.md
- Agent merged to main at .claude/agents/chief-engineer.md
- Skill source in repo at .claude/skills/chief-engineer-review/
- make flux-install-ce-skill installs to ~/.claude/skills/
- make onboard includes CE skill installation
- /chief-engineer-review #<PR> produces structured autonomous review
- /chief-engineer-review #<PR> --discuss enters conversational mode
- /chief-engineer-review (no args) enters open-ended architectural discussion
- CE correctly classifies plan PRs vs code PRs
- CE cites manifesto sections in all findings
- CE defaults to BLOCK and resists pressure to rubber-stamp
- Pressure test scenarios pass (7/7)
- Superpowers spec/plan docs deleted (this cycle doc is the authoritative record)
- GitHub issue created and linked
Known Limitations
- Context budget: Large code PRs may exceed context when loading manifesto + agent + full files + plan doc + ADRs + validation output. Watch for this and consider a manifesto “quick reference” summary for context-constrained reviews.
- No CI automation yet: The skill runs locally only. Future cycle will wrap it as a GitHub Action.
- Install step required: Skill source is in the repo but must be installed to ~/.claude/skills/ via make flux-install-ce-skill (included in make onboard). Updates to the skill require re-running the install target.
- §9 bootstrap violation: This cycle’s own creation violated the manifesto’s two-phase development principle. Documented and accepted as a one-time bootstrap exception.