feat: add claude-code/opus reviewer + max-thinking spec support #5

Merged

steve merged 2 commits from ci/add-opus-reviewer into main

2026-06-27 22:39:14 +00:00


Author

SHA1

Message

Date

steve

bd24b0a423

fix: fold in PR #5 review findings (thinking-env test + keep-list)

Build & push image / build-and-push (pull_request) Successful in 6s


The swarm reviewed PR #5 (8 reviewers; the telemetry fix from #4 is now
live, so only 13 findings vs 43 on the comparably-clean #4 — the fix
works). Folded in the two warranted ones:

- engine: keep MAX_THINKING_TOKENS in claudeEnv() so a globally-set value
  reaches the CLI too (not just the per-spec :max append). (minimax)
- test: TestRunPassInjectsThinkingTokens verifies runPass actually puts
  MAX_THINKING_TOKENS in the subprocess env (31999 for :max, unset for a
  plain spec) — the parse was tested, the injection wasn't. (minimax)
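
A minimal sketch of the two behaviours described in the bullets above: an allow-list that lets a globally set MAX_THINKING_TOKENS through, followed by a per-spec append. The names `keep`, `filterEnv`, and `withThinking` are hypothetical, and the PATH and HOME entries are illustrative; only MAX_THINKING_TOKENS and the 31999 value come from this commit message. The real claudeEnv() and runPass may be structured quite differently.

```go
package main

import (
	"fmt"
	"strings"
)

// keep is a hypothetical allow-list of variables passed to the subprocess.
// Only MAX_THINKING_TOKENS is taken from the commit message.
var keep = map[string]bool{
	"PATH":                true,
	"HOME":                true,
	"MAX_THINKING_TOKENS": true,
}

// filterEnv returns the "NAME=value" entries whose NAME is on the allow-list.
func filterEnv(parent []string) []string {
	var out []string
	for _, kv := range parent {
		name, _, ok := strings.Cut(kv, "=")
		if ok && keep[name] {
			out = append(out, kv)
		}
	}
	return out
}

// withThinking appends a per-spec budget after any inherited value.
// A budget of 0 or less leaves the environment unchanged.
func withThinking(env []string, budget int) []string {
	if budget <= 0 {
		return env
	}
	return append(env, fmt.Sprintf("MAX_THINKING_TOKENS=%d", budget))
}

func main() {
	parent := []string{"PATH=/usr/bin", "SECRET=x", "MAX_THINKING_TOKENS=1000"}
	fmt.Println(withThinking(filterEnv(parent), 31999))
}
```

The per-spec entry is appended last because os/exec uses the last value when Cmd.Env contains duplicate keys, so under that assumption a ":max" spec would take precedence over the global setting.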

The MAX_THINKING_TOKENS-is-unverified concern (minimax, qwen) is the same
caveat already documented; left as-is. gofmt/vet/test -race green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-27 18:39:00 -04:00

steve

af2d3a2938

feat: claude-code opus reviewer + max-thinking spec support

Build & push image / build-and-push (pull_request) Successful in 6s


Adversarial Review (Gadfly) / review (pull_request) Successful in 10m29s


Per Steve: add Claude Code opus to gadfly's own swarm, and prep a
max-thinking variant.

- Dogfood workflow: add claude-code/opus alongside claude-code/sonnet
  (claude-code lane bumped to 2 so they run in parallel), and bump the
  image pin to :sha-80d8f53 so the clean-lens telemetry fix from #4 is
  actually live in dogfood reviews.
- Engine: a "claude-code/<model>:<thinking>" spec now sets an extended-
  thinking budget for that run via MAX_THINKING_TOKENS on the subprocess
  — ":max" (high ultrathink tier) or ":<n>". Best-effort (a no-op if the
  CLI build ignores it); harmless, never errors. This ships the capability
  so a follow-up can enable claude-code/opus:max once this image builds
  (the currently-pinned image predates the parse and would mis-route it).
- README documents the :thinking suffix; new tests cover the spec parse.
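
For illustration, the ":<thinking>" suffix handling described in the Engine bullet could look roughly like the sketch below. `parseSpec` and `thinkingEnv` are hypothetical names, not the engine's actual functions. The 31999 value for ":max" is taken from the follow-up commit's test description. Treating an unrecognized suffix as "no budget" is an assumption made here to match the "never errors" wording.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// maxThinkingTokens is the budget used for the ":max" suffix
// (31999 per the follow-up commit's test description).
const maxThinkingTokens = 31999

// parseSpec splits "provider/model[:thinking]" into the base spec and a
// thinking budget. A budget of 0 means none was requested.
// Assumption: an unrecognized suffix returns the spec unchanged with budget 0.
func parseSpec(spec string) (string, int) {
	base, suffix, found := strings.Cut(spec, ":")
	if !found {
		return spec, 0
	}
	if suffix == "max" {
		return base, maxThinkingTokens
	}
	n, err := strconv.Atoi(suffix)
	if err != nil || n <= 0 {
		return spec, 0
	}
	return base, n
}

// thinkingEnv returns the extra subprocess environment entry for a budget,
// or nil when no budget was requested.
func thinkingEnv(budget int) []string {
	if budget <= 0 {
		return nil
	}
	return []string{fmt.Sprintf("MAX_THINKING_TOKENS=%d", budget)}
}

func main() {
	for _, spec := range []string{"claude-code/sonnet", "claude-code/opus:max", "claude-code/opus:8000"} {
		base, budget := parseSpec(spec)
		fmt.Println(base, budget, thinkingEnv(budget))
	}
}
```

Whether the CLI honours the variable is outside this sketch; as the commit notes, the setting is best-effort.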

gofmt clean, go vet quiet, go test -race green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-27 18:23:23 -04:00
