feat: claude-code opus reviewer + max-thinking spec support
Build & push image / build-and-push (pull_request) Successful in 6s
Adversarial Review (Gadfly) / review (pull_request) Successful in 10m29s

Per Steve: add Claude Code opus to gadfly's own swarm, and prep a
max-thinking variant.

- Dogfood workflow: add claude-code/opus alongside claude-code/sonnet
  (claude-code lane bumped to 2 so they run in parallel), and bump the
  image pin to :sha-80d8f53 so the clean-lens telemetry fix from #4 is
  actually live in dogfood reviews.
- Engine: a "claude-code/<model>:<thinking>" spec now sets an extended-
  thinking budget for that run via MAX_THINKING_TOKENS on the subprocess
  — ":max" (high ultrathink tier) or ":<n>". Best-effort (a no-op if the
  CLI build ignores it); harmless, never errors. This ships the capability
  so a follow-up can enable claude-code/opus:max once this image builds
  (the currently-pinned image predates the parse and would mis-route it).
- README documents the :thinking suffix; new tests cover the spec parse.

gofmt clean, go vet quiet, go test -race green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-27 18:23:23 -04:00
parent 80d8f53f63
commit af2d3a2938
4 changed files with 97 additions and 19 deletions
+8 -2
View File
@@ -89,12 +89,18 @@ parses the result and runs the same verdict-parse → recheck → consolidate
CLI is bundled in the image (Node + `@anthropic-ai/claude-code`).
Select it as a model id — bare `claude-code` (CLI default model) or `claude-code/<model>` (the
suffix becomes `--model`, e.g. `claude-code/sonnet`, `claude-code/opus`):
suffix becomes `--model`, e.g. `claude-code/sonnet`, `claude-code/opus`). An optional
`:<thinking>` suffix forces an extended-thinking budget for that reviewer — `:max` (the high
"ultrathink" tier) or `:<n>` for a specific token budget — so you can run the same model at two
thinking depths as separate reviewers:
```yaml
GADFLY_MODELS: "claude-code/sonnet,claude-code/opus"
GADFLY_MODELS: "claude-code/sonnet,claude-code/opus,claude-code/opus:max"
```
The thinking budget is applied via the `MAX_THINKING_TOKENS` env on the CLI subprocess; it's
best-effort (a no-op if the installed CLI build doesn't honor it).
Auth is read from the environment: the default is a **Pro/Max subscription** via
`CLAUDE_CODE_OAUTH_TOKEN` (from `claude setup-token`; no `--bare`), falling back to
`ANTHROPIC_API_KEY`. Don't set both. Tuning knobs (all optional):