feat: claude-code reviewer engine (per-lens `claude -p` shell-out) #2

Merged

steve merged 2 commits from feat/claude-code-engine into main

2026-06-27 20:40:41 +00:00

Author	SHA1	Message	Date
steve	2ca1ce0b6b	fix: fold in claude-code-engine review findings	2026-06-27 16:39:50 -04:00
steve	4237a18d09	feat: claude-code reviewer engine (per-lens `claude -p` shell-out)	2026-06-27 15:26:57 -04:00

steve

2ca1ce0b6b

fix: fold in claude-code-engine review findings

Build & push image / build-and-push (pull_request) Successful in 14s

The dogfood swarm reviewed PR #2 (9 cloud reviewers; m5 was wedged by a
host reboot and skipped this time). 66 findings were graded via the gadfly
MCP (roughly half real, half false positives or clean verifications).
Folding in the warranted ones:

- engine.go: claudeEnv() builds a MINIMAL subprocess environment (auth
  token + PATH/HOME/locale/GADFLY_CLAUDE_*), no longer handing GITEA_TOKEN
  and provider keys to the third-party CLI (4-model consensus).
- engine.go: runPass rewrite — check ctx.Err() first (don't emit a
  review from a timed-out run), treat an empty parsed result as an error
  instead of returning the raw JSON envelope, only trust a JSON answer on
  a clean exit, and drop the dangling ": " when there's no error detail.
- engine.go: put the CLI in its own process group (Setpgid) and SIGKILL
  the whole group on cancel, so a timed-out lens can't orphan node procs.
- engine.go: rune-safe truncateForErr.
- prompts: genericized the tool-name hints in buildTask + recheck so the
  claude-code engine isn't told to call majordomo-only tools (read_file/
  get_diff); also dropped the mort-specific framing from the recheck
  prompt (it must stay generic per CLAUDE.md).
- README: documented that GADFLY_CLAUDE_EXTRA_ARGS is whitespace-split
  and can override the read-only default, and that the subprocess gets a
  minimal env.

Left as-is (graded, noted in finding notes): operator-knob override of
read-only (intentional escape hatch), shared per-lens timeout (by design),
GADFLY_CLAUDE_BIN trust (operator-controlled, like GADFLY_BIN).

New tests: claudeEnv filtering, rune-safe truncation, and runPass paths
(clean / empty-result / is_error / non-zero) via a stub binary. gofmt
clean, go vet quiet, go test -race green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-27 16:39:50 -04:00

steve

4237a18d09

feat: claude-code reviewer engine (per-lens `claude -p` shell-out)

Build & push image / build-and-push (pull_request) Successful in 46s

Adversarial Review (Gadfly) / review (pull_request) Failing after 30m16s

Phase 1 of the gadfly-games build. Adds a second review engine alongside
the majordomo agent loop: for each lens, shell out to the Claude Code CLI
(`claude -p`) inside the checked-out repo so it verifies findings with
its OWN read tools, then reuse gadfly's verdict-parse + recheck +
consolidate + emit pipeline unchanged.

- cmd/gadfly/engine.go: new reviewEngine interface with two impls —
  majordomoEngine (wraps the existing runAgent path) and claudeCodeEngine
  (exec `claude -p ... --output-format json`, parse `.result`). main.go's
  runSpecialists/reviewWithSpecialist are now engine-agnostic.
- Select via a model id: `claude-code` (CLI default) or
  `claude-code/<model>` (suffix → --model). Auth inherits from the env:
  Pro/Max via CLAUDE_CODE_OAUTH_TOKEN (no --bare), else ANTHROPIC_API_KEY.
  Read-only by default (--permission-mode plan); tunable via GADFLY_CLAUDE_*.
- auto-select + delegate worker are majordomo-only and are skipped with
  this engine (Claude Code does its own legwork).
- Dockerfile bundles Node + @anthropic-ai/claude-code (larger image).
- Docs: README "Claude Code engine" section + config rows, examples/
  claude-code.yml stub, examples/README + CLAUDE.md updated. Honest note
  that subscription-auth-in-CI is untested here / a ToS gray area.
- Bumps the dogfood image pin to :sha-c3d09d3 so gadfly's own PRs now
  review with the live status board from Phase 3.
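
As an illustration of the model-id selection described in the list above, spec detection and model derivation might look something like this. `parseClaudeCodeSpec` is a hypothetical name, and rejecting an empty suffix (`claude-code/`) is an assumption the commit message does not settle. It needs Go 1.20 or later for `strings.CutPrefix`.

```go
package main

import (
	"fmt"
	"strings"
)

const claudeCodePrefix = "claude-code"

// parseClaudeCodeSpec reports whether id selects the Claude Code engine
// and, if it does, which model suffix (possibly empty) was requested.
func parseClaudeCodeSpec(id string) (model string, ok bool) {
	if id == claudeCodePrefix {
		// Bare id: let the CLI pick its default model.
		return "", true
	}
	// "claude-code/<model>": the suffix becomes the --model value.
	if rest, found := strings.CutPrefix(id, claudeCodePrefix+"/"); found && rest != "" {
		return rest, true
	}
	// Anything else belongs to the other engine.
	return "", false
}

func main() {
	for _, id := range []string{"claude-code", "claude-code/some-model", "other-model"} {
		model, ok := parseClaudeCodeSpec(id)
		fmt.Printf("%q -> model=%q claudeCode=%v\n", id, model, ok)
	}
}
```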

New engine_test.go covers spec detection, model derivation, and argv
building (no live CLI call). gofmt clean, go vet quiet, go test -race green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-27 15:26:57 -04:00
