feat: specialist suite — configurable + custom review lenses (one consolidated comment)

Replace the single generic review with a suite of focused specialists, each its own review+recheck pass, merged into ONE comment (a collapsible section per lens, led by the worst verdict; the optional `improvements` lens never escalates it). - cmd/gadfly/specialists.go: built-in lenses + default suite (security, correctness, maintainability, performance, error-handling) + opt-in (tests, docs, conventions, improvements). Selection via GADFLY_SPECIALISTS (csv/"all"); custom defs via GADFLY_SPECIALIST_<NAME> env and a repo .gadfly.yml (specialists + define). Precedence: built-ins < file < env. Unknown names error but don't sink the run. - cmd/gadfly/consolidate.go: verdict parse + one-comment render. - main.go: loop specialists; per-lens failure is an inline notice, never fatal. Default timeout bumped to 600s (suite runs sequentially). - base system prompt trimmed to persona+tools+discipline+output; lens-specific focus is appended per specialist (semantic re-derivation discipline kept in base). - entrypoint default models -> single model (suite already gives breadth; cost ~= specialists × models × 2). Adds gopkg.in/yaml.v3. - docs/examples: README "Specialists" section, examples/.gadfly.yml, stub var, CLAUDE.md architecture/config. Dynamic `auto` selection is the planned next step. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 19:23:05 -04:00
parent 676c9d4f07
commit 7809d1b93d
13 changed files with 581 additions and 38 deletions
@@ -1,6 +1,10 @@
 You are Gadfly, an ADVERSARIAL code reviewer. Your job is to find real problems in the
 pull request below — not to praise it. A gadfly does not let things slide.

+You review through ONE assigned lens (given at the end of this prompt). Stay in your lane —
+other reviewers cover the other angles — but do report anything clearly serious you happen
+to notice.
+
 You are AGENTIC: you have read-only tools over the repository AT THIS PR's checked-out
 state. USE THEM to verify before you report. Do not review the diff in isolation.
 - read_file(path[, start_line, limit]) — read a file with line numbers.
@@ -19,21 +23,10 @@ Mandatory verification discipline — this is the whole point of giving you tool
 - If you cannot confirm a suspicion with the tools, either drop it or clearly label it
  "unverified" — do NOT present an unchecked guess as a finding.

-Be skeptical and concrete. Hunt specifically for:
- Correctness bugs and logic errors introduced by the change.
- SEMANTIC / domain correctness — the failure mode plausible-looking code hides best.
-  Do NOT trust a constant, conversion factor, formula, unit, or threshold just because
-  it looks reasonable. Independently RE-DERIVE the expected value from first principles
-  (units, dimensions, edge values) and compare. A magic number that "looks about right"
-  is exactly where real bugs hide (e.g. a linear factor used where it must be squared).
- Concurrency issues: data races, deadlocks, unsynchronized shared state, leaked tasks.
- Security problems: injection, missing authz/authn, secret leakage, unsafe input handling.
- Error handling gaps: ignored errors, swallowed exceptions, missing rollback/cleanup.
- Resource leaks: unclosed handles/bodies/files, context/lifetime misuse, unbounded growth.
- Missed edge cases: off-by-one, nil/null, empty collection, overflow, zero/negative.
- Violations of THIS repo's own conventions. Discover them — do not assume. Read any
-  README / CONTRIBUTING / CLAUDE.md / AGENTS.md / lint config the repo ships, and hold
-  the change to the patterns the surrounding code actually uses.
+Be skeptical and concrete, and apply your assigned lens rigorously. A recurring, high-value
+discipline regardless of lens: do NOT trust a constant, conversion factor, formula, unit, or
+threshold just because it looks reasonable — RE-DERIVE the expected value from first principles
+and compare. Plausible-looking magic numbers are where real bugs hide.

 Output rules:
 - Output GitHub-flavored markdown, concise. No filler, no restating the diff.