feat: cross-model consensus consolidation (one ranked comment, not N walls)
With >=2 models the swarm now posts ONE consensus comment instead of a per-model comment each. Every model writes its structured findings to a shared dir (GADFLY_FINDINGS_OUT); after the swarm finishes, a consolidation pass (the binary in GADFLY_CONSOLIDATE_DIR mode) clusters findings by location (±3 lines), counts how many models independently flagged each, escalates to the highest reported severity, and renders an agreement-ranked table — cross-model agreement being the strongest real-vs-false-positive signal we have. - consensus.go: modelFindings artifact, collectFindings (shared with emit), writeFindingsOut, runConsolidate, clustering + renderConsensus. Lone low-severity findings fold away; a lone critical still surfaces; each model's full review is preserved folded inside the one comment. - main.go: GADFLY_CONSOLIDATE_DIR dispatches to consolidation; writeFindingsOut after a normal review. - emit.go: reuse collectFindings (one definition of "what a finding is"). - run.sh: when GADFLY_CONSOLIDATE=1, write findings + skip the per-model comment (live progress still on the status board). - entrypoint.sh: auto-enable for >=2 models; run the consolidator after the barrier and upsert ONE consensus comment; FALL BACK to per-model comments if consolidation yields nothing (advisory invariant: never lose output). - README + reusable workflow (`consolidate` input) + entrypoint docs updated. Tests: clustering/agreement/tolerance, severity escalation, nit folding, lone critical, write→consolidate round-trip, empty-dir error. gofmt/vet clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
+22
-5
@@ -50,6 +50,13 @@ MAX_DIFF_CHARS="${MAX_DIFF_CHARS:-60000}"
|
||||
MARKER="<!-- gadfly-review:${PROVIDER}:${MODEL} -->"
|
||||
say() { echo "[gadfly-review:${PROVIDER}:${MODEL}] $*" >&2; }
|
||||
|
||||
# When the swarm is consolidating (GADFLY_CONSOLIDATE=1, set by entrypoint.sh for
|
||||
# a multi-model run), this model does NOT post its own comment — it writes its
|
||||
# findings to GADFLY_FINDINGS_OUT and a single cross-model consensus comment is
|
||||
# posted after the whole swarm finishes. Live progress still shows on the status
|
||||
# board. Default 0 (post a per-model comment, the standalone behavior).
|
||||
CONSOLIDATE="${GADFLY_CONSOLIDATE:-0}"
|
||||
|
||||
# Display the model's ACTUAL backend: the provider segment of the spec
|
||||
# ("m1pro/qwen3.6:35b-mlx" -> "m1pro"); a bare id uses GADFLY_PROVIDER (default
|
||||
# ollama-cloud). This is what the comment header shows, not the run.sh lane.
|
||||
@@ -126,7 +133,9 @@ USR="$(printf 'PR #%s: %s\n\nDescription:\n%s\n\nUnified diff to review:\n```dif
|
||||
# --- announce start (placeholder comment) -----------------------------------
|
||||
START_TS="$(date +%s)"
|
||||
say "starting review with ${MODEL}"
|
||||
upsert_comment "$(printf '%s\n### 🪰 Gadfly review — `%s` (%s)\n\n⏳ Reviewing… this comment will update with findings and run time.' \
|
||||
# Skip the per-model placeholder when consolidating (the consensus comment is
|
||||
# posted later; live progress is on the status board).
|
||||
[ "$CONSOLIDATE" = "1" ] || upsert_comment "$(printf '%s\n### 🪰 Gadfly review — `%s` (%s)\n\n⏳ Reviewing… this comment will update with findings and run time.' \
|
||||
"$MARKER" "$MODEL" "$MODEL_PROVIDER")"
|
||||
|
||||
# --- call the model ---------------------------------------------------------
|
||||
@@ -170,6 +179,7 @@ case "$PROVIDER" in
|
||||
GADFLY_BODY="$BODY" \
|
||||
GADFLY_MAX_DIFF_CHARS="$MAX_DIFF_CHARS" \
|
||||
GADFLY_STATUS_FILE="${GADFLY_STATUS_FILE:-}" \
|
||||
GADFLY_FINDINGS_OUT="${GADFLY_FINDINGS_OUT:-}" \
|
||||
"$BIN" 2>"$ERR_FILE"
|
||||
)"
|
||||
rc=$?
|
||||
@@ -204,7 +214,14 @@ esac
|
||||
# --- assemble + post final comment (with run time) --------------------------
|
||||
ELAPSED="$(( $(date +%s) - START_TS ))"
|
||||
DUR="$(fmt_duration "$ELAPSED")"
|
||||
COMMENT="$(printf '%s\n### 🪰 Gadfly review — `%s` (%s)\n\n%s\n\n<sub>Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in %s</sub>' \
|
||||
"$MARKER" "$MODEL" "$MODEL_PROVIDER" "$REVIEW" "$DUR")"
|
||||
upsert_comment "$COMMENT"
|
||||
say "done in ${DUR}"
|
||||
# When consolidating, the binary has written this model's findings to
|
||||
# GADFLY_FINDINGS_OUT; the consensus comment is posted by entrypoint.sh after the
|
||||
# whole swarm finishes, so this model posts no comment of its own.
|
||||
if [ "$CONSOLIDATE" = "1" ]; then
|
||||
say "done in ${DUR} (consolidated; no per-model comment)"
|
||||
else
|
||||
COMMENT="$(printf '%s\n### 🪰 Gadfly review — `%s` (%s)\n\n%s\n\n<sub>Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in %s</sub>' \
|
||||
"$MARKER" "$MODEL" "$MODEL_PROVIDER" "$REVIEW" "$DUR")"
|
||||
upsert_comment "$COMMENT"
|
||||
say "done in ${DUR}"
|
||||
fi
|
||||
|
||||
Reference in New Issue
Block a user