feat: dynamic auto specialist selection + worker-tier delegation

Two Phase-2 swarm upgrades:

- auto.go: GADFLY_SPECIALISTS=auto routes the review. A selector model
  (GADFLY_SELECTOR_MODEL, else the review model) reads the changed files +
  PR description and picks the smallest relevant lens set from the catalog,
  and may propose ad-hoc lenses for gaps (e.g. migrations). Structured
  output via majordomo.Generate[T]; capped + de-duped; falls back to the
  default suite.
- delegate.go: GADFLY_WORKER_MODEL adds a delegate_investigation tool so the
  reviewer offloads mechanical legwork (trace callers, gather usages) to a
  cheap worker sub-agent that returns an evidence-cited digest, so the top
  model reasons over summaries, not raw file dumps. Workers get an fs-only
  toolbox (no sub-delegation). Unset = off.

resolveSpecialists now also returns the registry + an auto flag.

Docs (README Specialists + config table, CLAUDE.md, main.go header) + tests
updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 19:35:59 -04:00
parent 7809d1b93d
commit 4b8f9aa39b
9 changed files with 370 additions and 26 deletions
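
Reviewer note: the "capped + de-duped; falls back to the default suite" step in the commit message could look roughly like the sketch below. This is not the code in auto.go. `normalizeSelection`, `limit`, and the sample lens names are hypothetical, and treating an empty selection as the fallback trigger is an assumption (the real code may also fall back on selector errors).

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeSelection is a hypothetical helper, not taken from auto.go.
// It lower-cases and trims each proposed lens name, drops blanks and
// duplicates while preserving order, stops once limit names are kept
// (limit <= 0 means no cap), and returns a copy of fallback when
// nothing usable remains.
func normalizeSelection(picked []string, limit int, fallback []string) []string {
	seen := make(map[string]bool, len(picked))
	out := make([]string, 0, len(picked))
	for _, name := range picked {
		key := strings.ToLower(strings.TrimSpace(name))
		if key == "" || seen[key] {
			continue
		}
		seen[key] = true
		out = append(out, key)
		if len(out) == limit {
			break
		}
	}
	if len(out) == 0 {
		// Assumed fallback condition: the selector produced no usable lens.
		return append([]string(nil), fallback...)
	}
	return out
}

func main() {
	defaults := []string{"correctness", "security"} // illustrative default suite
	fmt.Println(normalizeSelection([]string{"Security", "security", " migrations ", "perf"}, 2, defaults))
	// prints: [security migrations]
	fmt.Println(normalizeSelection(nil, 2, defaults))
	// prints: [correctness security]
}
```

Preserving the selector's order keeps the highest-priority lenses when the cap truncates the list, and copying the fallback slice avoids callers mutating a shared default suite.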
@@ -31,8 +31,10 @@ verifies each one against the actual code, and posts its findings as a comment.
 cmd/gadfly/            the reviewer binary — pure producer of review markdown (stdout)
  main.go              orchestration: loop specialists, each a review pass + adversarial recheck
  specialists.go       specialist lenses: built-ins, default suite, env + .gadfly.yml resolution
+  auto.go              dynamic `auto` selection: a selector model picks lenses per-diff (may invent)
+  delegate.go          worker-tier delegate_investigation tool (cheap sub-agent does legwork)
  consolidate.go       verdict parsing + one-comment consolidation (a section per specialist)
-  model.go             provider/model resolution (majordomo.Parse) + env endpoint aliases
+  model.go             provider/model + selector + worker resolution (majordomo.Parse) + endpoint aliases
  tools.go             the 5 read-only repo tools (read_file/list_dir/grep/find_files/get_diff)
  recheck.go           second-pass verification prompt + verdict recompute
  *_test.go            sandbox, recheck, wrap-up, spec/endpoint-parse, specialist-resolution tests
@@ -92,7 +94,10 @@ error-handling; opt-in built-ins = tests/docs/conventions/improvements. Select v
 `GADFLY_SPECIALISTS` (csv or `all`); define/override via `GADFLY_SPECIALIST_<NAME>` env or a repo
 `.gadfly.yml` (`specialists:` + `define:`). See `cmd/gadfly/specialists.go`. Cost ≈
 specialists × models × 2 passes — keep the default model count low (entrypoint defaults to one).
-Dynamic `auto` selection (a cheap model picks lenses per-diff) is the planned next step.
+**Dynamic `auto`** (`GADFLY_SPECIALISTS=auto`): a selector (`GADFLY_SELECTOR_MODEL` or the review
+model) picks lenses per-diff and may invent ad-hoc ones (`cmd/gadfly/auto.go`). **Worker-tier**
+(`GADFLY_WORKER_MODEL`): a `delegate_investigation` tool offloads grep/read legwork to a cheap
+sub-agent (`cmd/gadfly/delegate.go`).

 **Tested vs untested:** only the Ollama paths (local + OpenAI-compatible pointed at Ollama)
 are actually exercised. OpenAI/Anthropic/Google come from majordomo's abstraction and are