diff --git a/CLAUDE.md b/CLAUDE.md
index 104ddfb..5b57da7 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -129,6 +129,27 @@ CI: `.gitea/workflows/ci.yaml` (Gitea Actions, mirrors foreman).
 README.md must match reality in the same commit that changes behavior — no
 aspirational docs; unbuilt features are marked pending in the matrix.
 
+## Adversarial review loop (Gadfly)
+
+Ship work through PRs and let Gadfly review it before merge:
+
+- **Push to a PR, never straight to `main`.** Branch, push, open a PR.
+  `.gitea/workflows/adversarial-review.yml` runs Gadfly (the standalone
+  agentic adversarial reviewer) — a full fleet of 9 ollama-cloud models +
+  the M1/M5 Macs via foreman, each running the 3-lens suite (security,
+  correctness, error-handling). Advisory only; it never blocks the merge.
+- **Wait for Gadfly to finish, then read its output.** Don't merge while the
+  review is still running. Each model posts one consolidated comment; weigh
+  every finding on its merits and fix the real ones (Gadfly is a simple
+  system — findings are advisory, so confirm before acting).
+- **Grade the findings back to the Gadfly MCP.** For each finding, call
+  `mcp__gadfly__record_finding_grade`: `is_real=true` + a `severity`
+  (trivial|small|medium|high|critical) for a genuine problem, or
+  `is_real=false` for a false positive; add `notes`/`usefulness` when
+  useful. Use `mcp__gadfly__list_findings` (`only_ungraded=true`) to find
+  what still needs grading and `mcp__gadfly__scoreboard` for the per-model
+  rollup. This telemetry is how we measure whether each model earns its keep.
+
 ## Out of scope (anti-creep)
 
 No persistent store (health is in-memory behind the registry), no