feat: cross-model consensus consolidation (one ranked comment, not N walls)
Build & push image / build-and-push (pull_request) Successful in 4s
Adversarial Review (Gadfly) / review (pull_request) Successful in 16m16s

With >=2 models the swarm now posts ONE consensus comment instead of a
per-model comment each. Every model writes its structured findings to a shared
dir (GADFLY_FINDINGS_OUT); after the swarm finishes, a consolidation pass
(the binary in GADFLY_CONSOLIDATE_DIR mode) clusters findings by location
(±3 lines), counts how many models independently flagged each, escalates to the
highest reported severity, and renders an agreement-ranked table — cross-model
agreement being the strongest real-vs-false-positive signal we have.

- consensus.go: modelFindings artifact, collectFindings (shared with emit),
  writeFindingsOut, runConsolidate, clustering + renderConsensus. Lone
  low-severity findings fold away; a lone critical still surfaces; each model's
  full review is preserved folded inside the one comment.
- main.go: GADFLY_CONSOLIDATE_DIR dispatches to consolidation; writeFindingsOut
  after a normal review.
- emit.go: reuse collectFindings (one definition of "what a finding is").
- run.sh: when GADFLY_CONSOLIDATE=1, write findings + skip the per-model comment
  (live progress still on the status board).
- entrypoint.sh: auto-enable for >=2 models; run the consolidator after the
  barrier and upsert ONE consensus comment; FALL BACK to per-model comments if
  consolidation yields nothing (advisory invariant: never lose output).
- README + reusable workflow (`consolidate` input) + entrypoint docs updated.

Tests: clustering/agreement/tolerance, severity escalation, nit folding, lone
critical, write→consolidate round-trip, empty-dir error. gofmt/vet clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-28 18:35:17 -04:00
parent 84b891b1ba
commit 052833d830
8 changed files with 718 additions and 32 deletions
+2
View File
@@ -60,6 +60,7 @@ on:
worker_model: { type: string, default: "" } # GADFLY_WORKER_MODEL
allowed_users: { type: string, default: "" } # GADFLY_ALLOWED_USERS (consumer-specific; set in your stub)
trigger_phrase: { type: string, default: "" } # GADFLY_TRIGGER_PHRASE
consolidate: { type: string, default: "" } # GADFLY_CONSOLIDATE — "" => auto (one consensus comment for >=2 models); "0" => one comment per model
# Job wall-clock cap. 90 as a default: the 5-lens suite across a slow lane
# (claude-code with extended thinking) over two passes can run long.
timeout_minutes: { type: number, default: 90 }
@@ -143,3 +144,4 @@ jobs:
GADFLY_WORKER_MODEL: ${{ inputs.worker_model }}
GADFLY_ALLOWED_USERS: ${{ inputs.allowed_users }}
GADFLY_TRIGGER_PHRASE: ${{ inputs.trigger_phrase }}
GADFLY_CONSOLIDATE: ${{ inputs.consolidate }}