gadfly

Author	SHA1	Message	Date
steve	b860119332	feat: re-platform agentic review onto executus + large-PR cost controls Build & push image / build-and-push (pull_request) Successful in 11s Details Adversarial Review (Gadfly) / review (pull_request) Successful in 25m56s Details Fixes the large-PR token burn: a ~250K-token diff was re-sent every agent step across models × lenses × passes, draining a metered usage block in minutes. Small PRs are untouched (every mitigation is size-gated / no-op under threshold). - Re-platform the in-process review path onto executus run.Executor: context compaction (executus/compact, threshold from the model's real context window via executus/model), run-bounding, a per-PR budget gate (Ports.Budget), and the wrap-up nudge re-expressed as a run.Critic. Lens fan-out now uses executus/fanout. gadfly keeps its own model.go, so GADFLY_ENDPOINT_<NAME> aliases and the claude-code engine are unaffected. No majordomo bump; the binary stays static (executus core is majordomo+stdlib only). - Paginate get_diff (per-file `path` + start_line/limit) instead of dumping the whole diff; trim the recheck diff embed (60k -> 20k chars). - entrypoint.sh: downshift the fleet above GADFLY_HUGE_DIFF_BYTES (one cheap model, fewer lenses/steps, no recheck) + a swarm-wide GADFLY_PR_BUDGET_SECS wall-clock backstop (adds procps for pkill). All advisory; CI never fails. - README + CLAUDE.md + tests updated. Note: run.Result exposes no transcript, so the old transcript-based forced- finalization fallback is dropped; the wrap-up critic nudge is the remaining "always emit something" mechanism. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-30 10:43:34 -04:00
steve	53971603d3	feat: structured findings contract (machine-readable gadfly-findings block) (#16 ) Build & push image / build-and-push (push) Successful in 5s Details Co-authored-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com> Co-committed-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com>	2026-06-28 22:23:02 +00:00
Steve Dudenhoeffer	49f3623204	fix: per-lens timeout, errored-verdict honesty, accurate provider label, tighter lens focus, run timing Build & push image / build-and-push (push) Successful in 8s Details Five fixes, several surfaced by the live bake-off: - PER-LENS TIMEOUT (critical): GADFLY_TIMEOUT_SECS now applies to EACH specialist (own context), not shared across the suite. A slow model (e.g. a 35B local MLX) was exhausting the whole 600s budget on lens 1, leaving the rest "step 0: context deadline exceeded". Default lowered to 300s (per-lens). cmd/gadfly/main.go. - ERRORED VERDICT: a lens whose review pass failed no longer counts as "clean". Header shows "· ⚠️ N/M lens(es) errored" (or "Review incomplete — all lenses errored"); the section reads "⚠️ could not complete". consolidate.go. - PROVIDER LABEL: the comment header now shows the model's ACTUAL backend from the spec ("m1pro/qwen3.6:35b-mlx" -> m1pro), not the global GADFLY_PROVIDER default (was wrongly "ollama-cloud" for local models). scripts/run.sh. - LENS FOCUS: base prompt no longer licenses "report anything serious"; each lens stays in its lane, says "nothing in my area" rather than re-reporting another lens's bug, with a one-line "Outside my lens:" escape hatch. The re-derive- constants discipline is now lane-scoped, not "every lens". system-prompt.txt + specialists.go. - RUN TIMING: run.sh posts a "⏳ Reviewing…" placeholder at model start and updates it with "⏱️ reviewed in 1m 23s" on finish, for per-model comparison. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 20:15:40 -04:00
Steve Dudenhoeffer	7809d1b93d	feat: specialist suite — configurable + custom review lenses (one consolidated comment) Build & push image / build-and-push (push) Successful in 8s Details Replace the single generic review with a suite of focused specialists, each its own review+recheck pass, merged into ONE comment (a collapsible section per lens, led by the worst verdict; the optional `improvements` lens never escalates it). - cmd/gadfly/specialists.go: built-in lenses + default suite (security, correctness, maintainability, performance, error-handling) + opt-in (tests, docs, conventions, improvements). Selection via GADFLY_SPECIALISTS (csv/"all"); custom defs via GADFLY_SPECIALIST_<NAME> env and a repo .gadfly.yml (specialists + define). Precedence: built-ins < file < env. Unknown names error but don't sink the run. - cmd/gadfly/consolidate.go: verdict parse + one-comment render. - main.go: loop specialists; per-lens failure is an inline notice, never fatal. Default timeout bumped to 600s (suite runs sequentially). - base system prompt trimmed to persona+tools+discipline+output; lens-specific focus is appended per specialist (semantic re-derivation discipline kept in base). - entrypoint default models -> single model (suite already gives breadth; cost ~= specialists × models × 2). Adds gopkg.in/yaml.v3. - docs/examples: README "Specialists" section, examples/.gadfly.yml, stub var, CLAUDE.md architecture/config. Dynamic `auto` selection is the planned next step. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 19:23:05 -04:00
Steve Dudenhoeffer	c0d0152a34	Gadfly: agentic adversarial PR reviewer (initial extraction) Standalone, Docker-packaged extraction of the agentic PR reviewer that runs in Gitea Actions: reads the checked-out repo with read-only tools (read_file/grep/ find_files/get_diff), verifies findings before reporting, two-pass review + adversarial recheck, posts one labeled comment per model. Advisory only. - cmd/gadfly: reviewer binary (majordomo + Ollama Cloud), zero deps beyond stdlib + majordomo - entrypoint.sh: container brains — trigger gating, PR clone, model loop (logic out of YAML) - Dockerfile: multi-stage; build-time module token never reaches the final image - .gitea/workflows/build-image.yml: tag v* → build & push image - examples/: ~15-line consumer stub - system prompt genericized + hardened to re-derive constants/formulas (semantic bugs) Vibe-coded with Claude Code; see README disclosure. Advisory, never blocks merge. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-25 18:42:20 -04:00

5 Commits