Files
gadfly/entrypoint.sh
T
steve 86f12c126f
Build & push image / build-and-push (push) Successful in 28s
feat: claude-code reviewer engine (#2)
Phase 1: a second review engine alongside the majordomo agent loop. For
each lens, shell out to the Claude Code CLI (`claude -p --output-format
json`) inside the checked-out repo so it verifies findings with its own
read tools, then reuse gadfly's verdict-parse + recheck + consolidate +
emit pipeline. Select via GADFLY_MODELS `claude-code`/`claude-code/<model>`;
auth via CLAUDE_CODE_OAUTH_TOKEN (no --bare) else ANTHROPIC_API_KEY;
read-only by default; GADFLY_CLAUDE_* knobs. Dockerfile bundles Node +
@anthropic-ai/claude-code. Also bumped the dogfood pin to the status-board
image (PR #2 was the first dogfood with the live board + full fleet).

Folded in the swarm's own review findings: minimal subprocess env (no
GITEA_TOKEN leak to the CLI), runPass robustness (ctx/empty-result/runErr),
process-group cleanup on timeout, rune-safe error truncation, and
engine-neutral prompts (also de-mort-ified the recheck prompt). 66 findings
graded via the gadfly MCP.

gofmt clean, go vet quiet, go build + go test -race green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com>
Co-committed-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com>
2026-06-27 20:40:41 +00:00

261 lines
12 KiB
Bash
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
#!/usr/bin/env bash
# Gadfly container entrypoint.
#
# This is the brains that used to live in the Gitea Actions workflow YAML. A
# consuming repo only commits a ~15-line stub workflow that runs this image and
# passes the event context as env; ALL the gating, cloning, model-looping and
# comment I/O happens here, so the stub stays dumb (act_runner has weak YAML
# expression support — keep logic in the image, not the workflow).
#
# What it does:
# 1. Decides whether this event should trigger a review (draft skip, comment
# trigger phrase + allowed-user gate, PR detection). Non-triggers exit 0.
# 2. Acknowledges a comment trigger with a 👀 reaction.
# 3. Shallow-clones the PR's head branch (the agentic reviewer reads the
# checked-out tree to VERIFY findings, not just the diff).
# 4. Runs the gadfly reviewer once per configured model via run.sh, which
# upserts one labeled PR comment per model.
#
# Advisory only: it never blocks a merge. Config/usage errors exit non-zero;
# everything review-related is posted as a comment, never a failed check.
#
# Env (set by the consumer's stub workflow from the github.* context):
# GITEA_API https://HOST/api/v1/repos/OWNER/REPO (required)
# GITEA_TOKEN built-in Actions token (posts comments) (required)
# OLLAMA_CLOUD_API_KEY Ollama Cloud key; empty => "not configured" notice
# EVENT_NAME pull_request | issue_comment | workflow_dispatch (required)
# PR pull request number (required)
# PR_BRANCH head branch (github.head_ref); empty => fetched from API
# IS_DRAFT 'true' on a draft PR => skipped
# COMMENT_BODY comment text (issue_comment only)
# COMMENT_ID comment id, for the 👀 reaction (issue_comment only)
# ACTOR github.actor (the user who triggered)
# Optional config:
# GADFLY_MODELS comma-separated model ids/specs (alias: OLLAMA_REVIEW_MODELS)
# GADFLY_PROVIDER majordomo provider for bare model ids (default ollama-cloud;
# e.g. "ollama" local, "openai", "anthropic", "google")
# GADFLY_BASE_URL override backend endpoint (OpenAI/Ollama-compatible servers)
# GADFLY_API_KEY provider key (else provider's standard env: OPENAI_API_KEY, …)
# CLAUDE_CODE_OAUTH_TOKEN auth for the claude-code engine (GADFLY_MODELS entry
# "claude-code"/"claude-code/<model>"); Pro/Max subscription
# token from `claude setup-token`. Else ANTHROPIC_API_KEY.
# GADFLY_TRIGGER_PHRASE comment phrase that triggers a re-review (default "@gadfly review")
# GADFLY_ALLOWED_USERS comma-separated usernames allowed to comment-trigger;
# empty => fall back to "is a repo collaborator"
# GADFLY_FINDINGS_URL optional gadfly-reports store base URL; set to POST the run +
# findings for model-quality tracking (off when empty)
# GADFLY_FINDINGS_TOKEN optional bearer token for the gadfly-reports store
set -uo pipefail
# One model by default: the specialist suite already provides breadth, so a
# multi-model default would multiply cost (models × specialists × 2 passes).
DEFAULT_MODELS="qwen3-coder:480b-cloud"
TRIGGER_PHRASE="${GADFLY_TRIGGER_PHRASE:-@gadfly review}"
SCRIPTS_DIR="/app/scripts"
WORKDIR="${WORKDIR:-/tmp/gadfly}"
log() { echo "[gadfly] $*" >&2; }
die() { log "ERROR: $*"; exit 1; }
: "${GITEA_API:?GITEA_API required}"
: "${GITEA_TOKEN:?GITEA_TOKEN required}"
: "${PR:?PR required}"
: "${EVENT_NAME:?EVENT_NAME required}"
API() { curl -fsS --connect-timeout 20 --max-time 30 -H "Authorization: token ${GITEA_TOKEN}" "$@"; }
# --- is the commenter allowed to trigger a re-review? ----------------------
actor_allowed() {
local actor="$1"
[ -z "$actor" ] && return 1
if [ -n "${GADFLY_ALLOWED_USERS:-}" ]; then
local IFS=','
for u in $GADFLY_ALLOWED_USERS; do
[ "$(echo "$u" | tr -d '[:space:]')" = "$actor" ] && return 0
done
return 1
fi
# No explicit allow-list: allow anyone with collaborator (write) access.
local code
code="$(curl -s -o /dev/null -w '%{http_code}' --connect-timeout 20 --max-time 30 \
-H "Authorization: token ${GITEA_TOKEN}" "${GITEA_API}/collaborators/${actor}")"
[ "$code" = "204" ]
}
# --- trigger gating --------------------------------------------------------
case "$EVENT_NAME" in
workflow_dispatch)
log "manual dispatch for PR #${PR}" ;;
pull_request)
if [ "${IS_DRAFT:-false}" = "true" ]; then
log "PR #${PR} is a draft; skipping"; exit 0
fi
log "new/updated PR #${PR}" ;;
issue_comment)
case "${COMMENT_BODY:-}" in
*"$TRIGGER_PHRASE"*) : ;;
*) log "comment does not contain trigger phrase ${TRIGGER_PHRASE}; skipping"; exit 0 ;;
esac
if ! actor_allowed "${ACTOR:-}"; then
log "actor '${ACTOR:-}' not allowed to trigger; skipping"; exit 0
fi
# Must be a comment on a PR, not a plain issue.
if ! API "${GITEA_API}/pulls/${PR}" >/dev/null 2>&1; then
log "issue #${PR} is not a pull request; skipping"; exit 0
fi
# Acknowledge with 👀.
if [ -n "${COMMENT_ID:-}" ]; then
curl -s -X POST -H "Authorization: token ${GITEA_TOKEN}" -H "Content-Type: application/json" \
"${GITEA_API}/issues/comments/${COMMENT_ID}/reactions" -d '{"content":"eyes"}' >/dev/null 2>&1 || true
fi
log "comment-triggered review for PR #${PR} by ${ACTOR:-?}" ;;
*)
log "event '${EVENT_NAME}' not handled; skipping"; exit 0 ;;
esac
# --- resolve head branch ---------------------------------------------------
BRANCH="${PR_BRANCH:-}"
if [ -z "$BRANCH" ]; then
BRANCH="$(API "${GITEA_API}/pulls/${PR}" | jq -r '.head.ref // ""')"
fi
[ -z "$BRANCH" ] && die "could not determine PR #${PR} head branch"
# --- clone the PR's checked-out tree (shallow) -----------------------------
HOST="${GITEA_API%%/api/v1/*}" # https://host
REPO_PATH="${GITEA_API##*/api/v1/repos/}" # owner/repo
CLONE_URL="https://token:${GITEA_TOKEN}@${HOST#https://}/${REPO_PATH}.git"
REPO_DIR="${WORKDIR}/repo"
rm -rf "$REPO_DIR"; mkdir -p "$WORKDIR"
log "cloning ${REPO_PATH} @ ${BRANCH}"
git clone --depth=1 --branch "$BRANCH" "$CLONE_URL" "$REPO_DIR" 2>/dev/null \
|| die "clone of ${REPO_PATH}@${BRANCH} failed"
# --- findings telemetry (optional) -----------------------------------------
# Plumb the run context to the binary (inherited through run.sh and the gadfly
# process env). The binary's emit is OFF unless GADFLY_FINDINGS_URL is set; when
# set it POSTs the run + its findings to a gadfly-reports store. GADFLY_FINDINGS_URL /
# GADFLY_FINDINGS_TOKEN come from the consumer's stub env and are re-exported so
# they reach the binary even if unset (empty => disabled).
export GADFLY_REPO="$REPO_PATH"
export GADFLY_PR="$PR"
export GADFLY_FINDINGS_URL="${GADFLY_FINDINGS_URL:-}"
export GADFLY_FINDINGS_TOKEN="${GADFLY_FINDINGS_TOKEN:-}"
# --- review once per model, with per-provider concurrency -------------------
# GADFLY_MODELS is the provider-agnostic name; OLLAMA_REVIEW_MODELS is a
# back-compat alias. GADFLY_PROVIDER / GADFLY_BASE_URL / GADFLY_API_KEY and any
# provider key envs (OPENAI_API_KEY, …) are inherited by run.sh and the binary.
#
# Concurrency: each PROVIDER is its own lane and lanes run in PARALLEL, so a fast
# cloud provider isn't stuck behind a slow local box. Within a lane, at most
# `cap` models run at once. cap = GADFLY_PROVIDER_CONCURRENCY's "provider=N"
# entry, else GADFLY_CONCURRENCY (default 1). A model's provider is the spec's
# first path segment ("m1pro/qwen3.6:35b-mlx" -> m1pro), or GADFLY_PROVIDER /
# ollama-cloud for a bare id. Default (cap 1) keeps a single-provider pool fully
# sequential, exactly as before.
MODELS="${GADFLY_MODELS:-${OLLAMA_REVIEW_MODELS:-$DEFAULT_MODELS}}"
DEFAULT_CONC="${GADFLY_CONCURRENCY:-1}"
provider_of() { case "$1" in */*) echo "${1%%/*}";; *) echo "${GADFLY_PROVIDER:-ollama-cloud}";; esac; }
# Per-model status file path for the live board. The model id can contain '/'
# and ':' (e.g. m1/qwen3:14b), so sanitize to a flat filename; the JSON inside
# carries the real model/provider, so this just needs to be unique per model.
STATUS_DIR="${WORKDIR}/status"
status_file_for() { echo "${STATUS_DIR}/$(echo "$1" | tr -c '[:alnum:]._-' '_').json"; }
provider_cap() { # provider -> concurrency (override map "p=N,...", else default)
local p="$1" item k v
IFS=',' read -ra _caps <<< "${GADFLY_PROVIDER_CONCURRENCY:-}"
for item in "${_caps[@]}"; do
k="$(echo "${item%%=*}" | tr -d '[:space:]')"
v="$(echo "${item#*=}" | tr -d '[:space:]')"
if [ "$k" = "$p" ] && [ -n "$v" ]; then echo "$v"; return; fi
done
echo "$DEFAULT_CONC"
}
review_one() {
local sf=""
[ "${GADFLY_STATUS_BOARD:-1}" != "0" ] && sf="$(status_file_for "$1")"
PROVIDER=ollama MODEL="$1" GADFLY_BIN="/usr/local/bin/gadfly" GADFLY_REPO_DIR="$REPO_DIR" \
GADFLY_STATUS_FILE="$sf" \
bash "${SCRIPTS_DIR}/run.sh" || log "model $1 failed (continuing)"
# If the binary never wrote real status (run.sh skipped it: empty diff, no key,
# binary missing), the pre-seed stays {started:0, done:false} and the board
# would show this model "waiting to start" forever and never reach N/N. Mark
# such a never-started file done so the board can complete. The binary stamps a
# nonzero `started`, so that reliably distinguishes "ran" from "skipped".
if [ -n "$sf" ] && [ -f "$sf" ] && [ "$(jq -r '.started // 0' "$sf" 2>/dev/null)" = "0" ]; then
tmp="$(jq '.done = true' "$sf" 2>/dev/null)" && printf '%s' "$tmp" > "$sf"
fi
}
# Normalize the model list (trim, drop blanks) into MODEL_LIST.
IFS=',' read -ra _raw <<< "$MODELS" || true
MODEL_LIST=()
for raw in "${_raw[@]}"; do m="$(echo "$raw" | tr -d '[:space:]')"; [ -n "$m" ] && MODEL_LIST+=("$m"); done
# Distinct providers, in first-seen order (no associative arrays — portable).
PROVIDERS=""
for m in "${MODEL_LIST[@]}"; do
p="$(provider_of "$m")"
case " $PROVIDERS " in *" $p "*) ;; *) PROVIDERS="${PROVIDERS}${PROVIDERS:+ }$p" ;; esac
done
run_lane() { # $1=provider: run its models, at most `cap` at a time
local p="$1" cap inflight=0 m
cap="$(provider_cap "$p")"; [ "$cap" -ge 1 ] 2>/dev/null || cap=1
local mine=()
for m in "${MODEL_LIST[@]}"; do [ "$(provider_of "$m")" = "$p" ] && mine+=("$m"); done
log "lane ${p}: cap ${cap}; models: ${mine[*]}"
for m in "${mine[@]}"; do
review_one "$m" &
inflight=$((inflight+1))
if [ "$inflight" -ge "$cap" ]; then wait -n 2>/dev/null || wait; inflight=$((inflight-1)); fi
done
wait
}
# --- live status board (optional, default on) ------------------------------
# Each model process publishes per-lens progress to STATUS_DIR/<model>.json; a
# background renderer (status-board.sh) upserts ONE consolidated PR comment so
# progress across all models/lenses is visible at a glance — and a watcher can
# tell when the whole swarm is finished. Advisory/best-effort; the per-model
# findings still land in each model's own comment. Disable with
# GADFLY_STATUS_BOARD=0.
BOARD_PID=""
if [ "${GADFLY_STATUS_BOARD:-1}" != "0" ]; then
rm -rf "$STATUS_DIR"; mkdir -p "$STATUS_DIR"
# Pre-seed every model as queued so the board shows the full swarm from t=0,
# even models still waiting on their provider lane's concurrency cap. Each
# binary overwrites its own file with real per-lens detail once it starts.
for m in "${MODEL_LIST[@]}"; do
jq -n --arg model "$m" --arg provider "$(provider_of "$m")" \
'{model:$model, provider:$provider, started:0, updated:0, done:false, lenses:[]}' \
> "$(status_file_for "$m")" 2>/dev/null || true
done
GITEA_API="$GITEA_API" GITEA_TOKEN="$GITEA_TOKEN" PR="$PR" GADFLY_STATUS_DIR="$STATUS_DIR" \
bash "${SCRIPTS_DIR}/status-board.sh" &
BOARD_PID=$!
log "status board started (pid ${BOARD_PID})"
fi
log "providers: ${PROVIDERS:-none}"
# Each provider lane runs in parallel; cap is enforced within each lane. Track
# the lane PIDs so we wait ONLY for the review work — not the status board,
# which intentionally runs until we signal it below.
LANE_PIDS=()
for p in $PROVIDERS; do
run_lane "$p" &
LANE_PIDS+=("$!")
done
[ "${#LANE_PIDS[@]}" -gt 0 ] && wait "${LANE_PIDS[@]}"
# Reviews are done: signal the board to render the final state once and exit.
if [ -n "$BOARD_PID" ]; then
touch "${STATUS_DIR}/.done" 2>/dev/null || true
wait "$BOARD_PID" 2>/dev/null || true
fi
log "done"