feat: dynamic auto specialist selection + worker-tier delegation
Build & push image / build-and-push (push) Successful in 33s

Two Phase-2 swarm upgrades:

- auto.go: GADFLY_SPECIALISTS=auto routes the review — a selector model
  (GADFLY_SELECTOR_MODEL, else the review model) reads the changed files + PR
  description and picks the smallest relevant lens set from the catalog, and may
  propose ad-hoc lenses for gaps (e.g. migrations). Structured output via
  majordomo.Generate[T]; capped + de-duped; falls back to the default suite.
- delegate.go: GADFLY_WORKER_MODEL adds a delegate_investigation tool so the
  reviewer offloads mechanical legwork (trace callers, gather usages) to a cheap
  worker sub-agent that returns an evidence-cited digest — the top model reasons
  over summaries, not raw file dumps. Workers get an fs-only toolbox (no
  sub-delegation). Unset = off.

resolveSpecialists now also returns the registry + an auto flag. Docs (README
Specialists + config table, CLAUDE.md, main.go header) + tests updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
Steve Dudenhoeffer
2026-06-25 19:35:59 -04:00
parent 7809d1b93d
commit 4b8f9aa39b
9 changed files with 370 additions and 26 deletions
+7 -2
View File
@@ -31,8 +31,10 @@ verifies each one against the actual code, and posts its findings as a comment.
cmd/gadfly/ the reviewer binary — pure producer of review markdown (stdout)
main.go orchestration: loop specialists, each a review pass + adversarial recheck
specialists.go specialist lenses: built-ins, default suite, env + .gadfly.yml resolution
auto.go dynamic `auto` selection: a selector model picks lenses per-diff (may invent)
delegate.go worker-tier delegate_investigation tool (cheap sub-agent does legwork)
consolidate.go verdict parsing + one-comment consolidation (a section per specialist)
model.go provider/model resolution (majordomo.Parse) + env endpoint aliases
model.go provider/model + selector + worker resolution (majordomo.Parse) + endpoint aliases
tools.go the 5 read-only repo tools (read_file/list_dir/grep/find_files/get_diff)
recheck.go second-pass verification prompt + verdict recompute
*_test.go sandbox, recheck, wrap-up, spec/endpoint-parse, specialist-resolution tests
@@ -92,7 +94,10 @@ error-handling; opt-in built-ins = tests/docs/conventions/improvements. Select v
`GADFLY_SPECIALISTS` (csv or `all`); define/override via `GADFLY_SPECIALIST_<NAME>` env or a repo
`.gadfly.yml` (`specialists:` + `define:`). See `cmd/gadfly/specialists.go`. Cost ≈
specialists × models × 2 passes — keep the default model count low (entrypoint defaults to one).
Dynamic `auto` selection (a cheap model picks lenses per-diff) is the planned next step.
**Dynamic `auto`** (`GADFLY_SPECIALISTS=auto`): a selector (`GADFLY_SELECTOR_MODEL` or the review
model) picks lenses per-diff and may invent ad-hoc ones (`cmd/gadfly/auto.go`). **Worker-tier**
(`GADFLY_WORKER_MODEL`): a `delegate_investigation` tool offloads grep/read legwork to a cheap
sub-agent (`cmd/gadfly/delegate.go`).
**Tested vs untested:** only the Ollama paths (local + OpenAI-compatible pointed at Ollama)
are actually exercised. OpenAI/Anthropic/Google come from majordomo's abstraction and are