feat: cross-model consensus consolidation (one ranked comment, not N walls) (#17)
Build & push image / build-and-push (push) Successful in 9s

Co-authored-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com>
Co-committed-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com>
This commit was merged in pull request #17.
This commit is contained in:
2026-06-28 22:56:15 +00:00
committed by steve
parent 84b891b1ba
commit 88f74aa768
8 changed files with 841 additions and 32 deletions
+30
View File
@@ -277,6 +277,35 @@ every `GADFLY_STATUS_POLL_SECS` (default 12s) until the swarm finishes. It's adv
best-effort — the per-model findings comments are unaffected — and entirely separate from those.
Turn it off with `GADFLY_STATUS_BOARD=0`.
### Consensus consolidation
With **two or more models**, posting one comment each means a reader faces N walls of prose that
mostly agree. Instead Gadfly consolidates: every model writes its findings to a shared file, and
after the whole swarm finishes a single pass clusters those findings by location, counts **how
many models independently flagged each one**, and posts **one consensus comment**:
```
## 🪰 Gadfly review — consensus across 7 models
**Verdict: Blocking issues found** · 9 findings (3 with multi-model agreement)
| | Finding | Where | Models | Lens |
|--|--|--|--|--|
| 🔴 | Auth bypass: token not verified | `auth/login.go:42` | 6/7 | security |
| 🟠 | Unbounded retry loop | `sync/worker.go:88` | 3/7 | error-handling |
<details><summary>4 single-model findings (lower confidence)</summary> … </details>
<details><summary>Per-model detail</summary> … each model's full review, folded … </details>
```
Cross-model agreement is the strongest real-vs-false-positive signal available, so findings are
ranked by it (a lone low-severity finding folds away; a lone *critical* still surfaces). The
per-model comments are suppressed in this mode — each model's full review is preserved, folded,
inside the consensus comment — and nothing is lost: if consolidation can't run, Gadfly falls
back to posting the per-model comments. Controlled by `GADFLY_CONSOLIDATE`: `auto` (default — on
for ≥2 models), `1` (force on), `0` (force off, one comment per model). Single-model runs are
unaffected.
### Triggers
1. A **new/reopened/ready** non-draft PR — automatic.
@@ -362,6 +391,7 @@ The reviewer binary reads these (the stub/entrypoint set sane defaults):
| `GADFLY_MAX_DIFF_CHARS` | 60000 | diff chars embedded in the prompt (full diff via `get_diff`) |
| `GADFLY_STATUS_BOARD` | on | set `0` to disable the live status-board comment |
| `GADFLY_STATUS_POLL_SECS` | 12 | how often the status board re-renders/upserts |
| `GADFLY_CONSOLIDATE` | `auto` | cross-model consensus comment: `auto` (on for ≥2 models), `1` (force on), `0` (off — one comment per model) |
| `GADFLY_TRIGGER_PHRASE` | `@gadfly review` | comment phrase that re-triggers |
| `GADFLY_ALLOWED_USERS` | *(collaborators)* | comma-separated allow-list for comment triggers |
| `GADFLY_FINDINGS_URL` | — | gadfly-reports store base URL; set to enable findings telemetry (off when empty) |