Merge remote-tracking branch 'origin/main' into feat/llama-swap-provider
CI / Build & Test (pull_request) Successful in 10m15s
CI / Tidy (pull_request) Successful in 10m20s
Adversarial Review (Gadfly) / review (pull_request) Successful in 18m24s

2026-06-27 15:12:26 -04:00
2 changed files with 113 additions and 0 deletions
@@ -135,6 +135,27 @@ CI: `.gitea/workflows/ci.yaml` (Gitea Actions, mirrors foreman). README.md
must match reality in the same commit that changes behavior — no
aspirational docs; unbuilt features are marked pending in the matrix.
## Adversarial review loop (Gadfly)
Ship work through PRs and let Gadfly review it before merge:
- **Push to a PR, never straight to `main`.** Branch, push, open a PR.
`.gitea/workflows/adversarial-review.yml` runs Gadfly (the standalone
agentic adversarial reviewer): a fleet of 9 ollama-cloud models plus
the M5 Mac via foreman, each running the 3-lens suite (security,
correctness, error-handling). It is advisory only and never blocks the merge.
- **Wait for Gadfly to finish, then read its output.** Don't merge while the
review is still running. Each model posts one consolidated comment; weigh
every finding on its merits and fix the real ones (Gadfly is a simple
system and its findings are advisory, so confirm each one before acting).
- **Grade the findings back to the Gadfly MCP.** For each finding, call
`mcp__gadfly__record_finding_grade`: `is_real=true` + a `severity`
(trivial|small|medium|high|critical) for a genuine problem, or
`is_real=false` for a false positive; add `notes`/`usefulness` when
useful. Use `mcp__gadfly__list_findings` (`only_ungraded=true`) to find
what still needs grading and `mcp__gadfly__scoreboard` for the per-model
rollup. This telemetry is how we measure whether each model earns its keep.
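
The grading rule above can be sketched as a small argument builder. This is only an illustration of the `is_real`/`severity` pairing, not Gadfly's actual schema: `build_grade_args` and the `finding_id` key are hypothetical names, and it assumes `severity` is simply omitted for a false positive. `usefulness` is left out because its type is not stated here.

```python
from typing import Optional

# Severity levels as listed in the grading bullet above.
SEVERITIES = ("trivial", "small", "medium", "high", "critical")


def build_grade_args(
    finding_id: str,  # hypothetical: the real identifier field may differ
    is_real: bool,
    severity: Optional[str] = None,
    notes: Optional[str] = None,
) -> dict:
    """Build an argument dict for a record_finding_grade call (sketch)."""
    if is_real:
        # A genuine problem must carry one of the known severities.
        if severity not in SEVERITIES:
            raise ValueError(f"severity must be one of {SEVERITIES}")
        args = {"finding_id": finding_id, "is_real": True, "severity": severity}
    else:
        # Assumption: a false positive carries no severity at all.
        if severity is not None:
            raise ValueError("a false positive takes no severity")
        args = {"finding_id": finding_id, "is_real": False}
    if notes:
        args["notes"] = notes
    return args
```

Validating before the call keeps a mistyped severity from skewing the per-model rollup, assuming the tool itself does not already reject it.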
## Out of scope (anti-creep)
No persistent store (health is in-memory behind the registry), no