feat: claude-code backends + llamaswap provider + dogfood the CC engine #3

Merged
steve merged 2 commits from feat/cc-backends into main 2026-06-27 21:53:42 +00:00
Owner

Phase 2 of the gadfly-games build.

majordomo bump → new llamaswap provider

  • Bumped majordomo to the latest build (which adds the llamaswap provider) and wired it into gadfly's two endpoint switches, mirroring ollama: GADFLY_PROVIDER=llamaswap + GADFLY_BASE_URL, and GADFLY_ENDPOINT_<NAME>="llamaswap|url[|key]". Wired but untested here (no llama-swap instance).

Dogfood the Phase-1 claude-code engine

  • Added claude-code/sonnet to gadfly's own review swarm as a competitor alongside the 9 cloud models + M5 — its own provider lane (claude-code=1), parallel with the cloud/m5 lanes.
  • Pinned the dogfood workflow to :sha-86f12c1 (the first image that bundles the claude CLI) and mapped CLAUDE_CODE_OAUTH_TOKEN.
  • So this PR's review should include a claude-code/sonnet comment — the first real end-to-end exercise of the engine.

Backend docs (example-only)

  • Documented the alternate path: running Ollama models through Claude Code's harness by pointing ANTHROPIC_BASE_URL at an Anthropic-API-compatible proxy (claude-code-router / LiteLLM). The subprocess env already forwards ANTHROPIC_*/CLAUDE_*, so no code change is needed — but tool-use support varies by proxy, so it's documented as example only, not wired/tested (per the plan).
  • README: llamaswap row in the providers table + Claude Code "alternate backends" note; examples/claude-code.yml shows the proxy env.

Tests

gofmt clean · go vet quiet · go build ok · go test -race ./... green · workflow YAML validated.

🤖 Generated with Claude Code

steve added 1 commit 2026-06-27 21:38:06 +00:00
feat: claude-code backends, llamaswap provider, dogfood the CC engine
Build & push image / build-and-push (pull_request) Successful in 21s
Adversarial Review (Gadfly) / review (pull_request) Successful in 7m37s
9a6c662615
Phase 2 of the gadfly-games build.

- Bump majordomo to the latest build and wire its new **llamaswap**
  provider into gadfly's endpoint switches (GADFLY_PROVIDER=llamaswap +
  GADFLY_BASE_URL, and GADFLY_ENDPOINT_<NAME>="llamaswap|url[|key]"),
  mirroring the ollama provider. Wired but untested (no llama-swap here).
- Dogfood the Phase-1 claude-code engine: add `claude-code/sonnet` to
  gadfly's own review swarm as a competitor alongside the 9 cloud models
  + M5, pin the workflow to :sha-86f12c1 (the image that bundles the
  claude CLI), and map CLAUDE_CODE_OAUTH_TOKEN. Its own provider lane
  (claude-code=1), parallel with the cloud and m5 lanes.
- Document the alternate Claude Code backends: Pro/Max (default), and —
  EXAMPLE ONLY, not wired — running Ollama models THROUGH the CC harness
  via ANTHROPIC_BASE_URL pointed at an Anthropic-compatible proxy
  (claude-code-router / LiteLLM). The subprocess env already forwards
  ANTHROPIC_*/CLAUDE_*, so no code change is needed for that path.
- README: llamaswap row in the providers table + a Claude Code
  "alternate backends" note; examples/claude-code.yml shows the proxy env.

gofmt clean, go vet quiet, go build + go test -race green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

🪰 Gadfly — live review status

11/11 reviewers finished · updated 2026-06-27 21:45:43Z

claude-code/sonnet · claude-code — done

  • security — No material issues found
  • correctness — No material issues found
  • error-handling — Minor issues

deepseek-v4-pro:cloud · ollama-cloud — done

  • security — No material issues found
  • correctness — No material issues found
  • error-handling — No material issues found

gemma4:cloud · ollama-cloud — done

  • security — No material issues found
  • correctness — No material issues found
  • error-handling — No material issues found

glm-5.1:cloud · ollama-cloud — done

  • security — No material issues found
  • correctness — No material issues found
  • error-handling — No material issues found

glm-5.2:cloud · ollama-cloud — done

  • security — No material issues found
  • correctness — No material issues found
  • error-handling — No material issues found

gpt-oss:120b-cloud · ollama-cloud — done

  • security — No material issues found
  • correctness — Minor issues
  • error-handling — No material issues found

kimi-k2.7-code:cloud · ollama-cloud — done

  • security — No material issues found
  • correctness — No material issues found
  • error-handling — No material issues found

m5/qwen3.6:35b-mlx · m5 — done

  • security — No material issues found
  • correctness — No material issues found
  • error-handling — Minor issues

minimax-m3:cloud · ollama-cloud — done

  • security — No material issues found
  • correctness — No material issues found
  • error-handling — No material issues found

nemotron-3-super:cloud · ollama-cloud — done

  • security — No material issues found
  • correctness — No material issues found
  • error-handling — No material issues found

qwen3-coder:480b-cloud · ollama-cloud — done

  • security — Minor issues
  • correctness — No material issues found
  • error-handling — No material issues found

Live status board. Findings are posted in each model's own comment. Advisory only — does not block merge.


🪰 Gadfly review — m5/qwen3.6:35b-mlx (m5)

Verdict: Minor issues — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

Security Review

Verdict

No material issues found.

Findings

No security issues identified in this change. Specifically:

  • llamaswap provider wiring is sound: The new case "llamaswap" blocks in resolveModel() (cmd/gadfly/model.go:83-88) and endpointProvider() (cmd/gadfly/model.go:225-230) mirror the existing ollama/openai pattern exactly — WithBaseURL(baseURL) + conditional WithToken(apiKey). No new auth bypass or input-leak path is introduced; these are standard provider constructs that consume env-derived strings for HTTP calls. Error messages in default: cases were correctly updated to mention llamaswap (lines 107, 255).

  • Claude Code subprocess isolation remains intact: The claudeEnv() whitelist filter (engine.go:184-203) continues to drop runner secrets (GITEA_TOKEN, OLLAMA_API_KEY, GADFLY_API_KEY, GADFLY_FINDINGS_TOKEN). The new workflow step that adds CLAUDE_CODE_OAUTH_TOKEN is explicitly kept by the ANTHROPIC_* prefix match (correct, since that token is consumed by the claude CLI). No new secret injection vector was introduced.

  • No injection paths in model specs: The claude engine receives e.bin, e.model, e.permissionMode, e.allowedTools — all derived from env vars or fixed strings. No user-controlled PR body/branch/ref flows into the subprocess argv. The args() method constructs a clean, parameterized CLI invocation.

  • Workflow secrets mapping is explicit: CLAUDE_CODE_OAUTH_TOKEN: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }} follows Gitea Actions best practices (no wildcard secret exposure). The existing actor_allowed gate and job-level if: guard on issue_comment remain unchanged.

All changes are additive/config-only with no new auth boundaries or data flows introduced.

🎯 Correctness — No material issues found

Correctness Review

VERDICT: No material issues found.

The diff adds two features with no correctness problems:

  1. llamaswap provider wiring: cmd/gadfly/model.go correctly adds the llamaswap case in both resolveModel() (lines 83-88) and endpointProvider() (lines 225-230). The call pattern (WithBaseURL, WithToken/WithName, then New().Model(model) or New()) follows the exact same structure as the existing ollama, openai, anthropic, and google cases. Error messages at lines 107 and 255 are updated to list llamaswap in both places consistently.

  2. Claude Code dogfooding — The workflow YAML updates (sha-86f12c1 image pin, CLAUDE_CODE_OAUTH_TOKEN secret mapping, claude-code/sonnet model addition, claude-code=1 concurrency) are additive and consistent with the existing setup. The claude-code/sonnet model spec follows the established / naming convention that isClaudeCodeSpec() (engine.go:63-65) already recognizes correctly as strings.HasPrefix(m, "claude-code/").

All imports, function calls, and error messages are syntactically consistent with the existing codebase patterns. The go.mod bump is an update to a newer commit of the same module; no new direct dependencies require local adjustment since llamaswap lives in the majordomo module.

🧯 Error handling & edge cases — Minor issues

⚠️ Minor issues found

1. No unit tests for the new llamaswap provider paths

Where: cmd/gadfly/model_test.go (lines 5–42), covering endpointProvider; cmd/gadfly/model.go lines 83–88 and 225–230 for the actual code.

Issue: The test file tests ollama, openai, and foreman endpoints but has zero test cases for the new llamaswap provider in either endpointProvider or resolveModel. This means:

  • The llamaswap.WithBaseURL(baseURL) call is never exercised under test.
  • The error-handling path where apiKey != "" (the append branch) is never hit for llamaswap.
  • If the endpoint parsing rejects or mishandles a malformed llamaswap|... string, the unit tests won't catch it.

While the README explicitly calls out "wired, untested", adding at least one smoke-test case in model_test.go (e.g. a happy-path llamaswap|http://... and an empty-URL rejection) would be cheap and would surface edge cases early.

2. Silent errors in registerEnvProviders are logged but never surfaced as fatal

Where: cmd/gadfly/model.go lines 47–48 (calling site) and 175–198 (registerEnvProviders).

Issue: Any malformed GADFLY_ENDPOINT_* or GADFLY_ALIAS_* env var produces an error that is appended to errs, but at the call site:

for _, err := range registerEnvProviders() {
    fmt.Fprintln(os.Stderr, "gadfly: ignoring bad endpoint/alias:", err)
}

The errors are never returned or checked further — just printed to stderr. If llamaswap endpoint registration fails (e.g. bad URL format for the new provider), the caller sees only a warning and the binary proceeds, potentially using default models without realizing a configured endpoint is dead.

This is consistent with the "best-effort" comment in the code, but it means an operator misconfiguring llamaswap could silently get degraded behavior. Worth considering whether registerEnvProviders should also return a combined error (or fatalize on non-empty) when critical endpoints are missing.

3. Empty provider name edge case in endpointProvider

Where: cmd/gadfly/model.go lines 204–215.

Issue: If the raw string is GADFLY_ENDPOINT_X="|http://x" (empty provider), provider becomes "" after strings.TrimSpace(parts[0]), which falls through to default. This correctly rejects it via the error at line 255, but the error message reads "unknown provider \"\" (use ...)" — a confusing empty-string error that provides no guidance to the user. Minor UX issue, not a crash risk.

Verdict

Minor issues. The llamaswap addition is functionally complete and error-handling is structurally sound (the code validates inputs and returns errors). The gaps are test coverage (expected per design) and the silent-error drop for malformed endpoints (a known pattern in this codebase, but worth flagging when adding new providers to ensure operators aren't surprised).

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 7m 8s


🪰 Gadfly review — glm-5.2:cloud (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

I've reviewed the change through the security lens, tracing the new secret mapping and the claude-code subprocess boundary. Verdict below.

Verdict: No material issues found

  • Secret scoping to the claude subprocess is sound. claudeEnv() (cmd/gadfly/engine.go:184-204) is an allowlist that keeps only PATH/HOME/USER/..., and LC_*/CLAUDE_*/ANTHROPIC_*/GADFLY_CLAUDE_*/NODE_*/XDG_* prefixed vars — so the newly mapped CLAUDE_CODE_OAUTH_TOKEN (prefix CLAUDE_) reaches the CLI, while GITEA_TOKEN, GADFLY_API_KEY, GADFLY_FINDINGS_TOKEN, OLLAMA_API_KEY are dropped. TestClaudeEnvFilters (cmd/gadfly/engine_test.go:120-150) asserts both halves. No regression from this PR.
  • No new command/argument injection surface. The claude CLI is invoked via exec.CommandContext with discrete argv elements (engine.go:94-106,116), never a shell. The claude-code/<model> spec's post-slash segment becomes the --model value as a single argv slot (engine.go:73-87), so a crafted GADFLY_MODEL can't inject extra flags. GADFLY_CLAUDE_EXTRA_ARGS is strings.Fields-split and appended verbatim, but it's operator-controlled env, not PR-derived input.
  • No SSRF/injection via the new llamaswap wiring. model.go:83-88 and 225-230 mirror the existing ollama/openai/foreman paths: baseURL is operator-supplied (GADFLY_BASE_URL / GADFLY_ENDPOINT_*), used verbatim — the same posture as every other provider in this file. No untrusted/PR-derived URL is introduced. WithToken(key) applies a bearer; empty key is simply omitted.
  • Default posture stays read-only. newClaudeCodeEngine defaults permissionMode to "plan" (engine.go:84), so the dogfood claude-code/sonnet lane — which does not set GADFLY_CLAUDE_PERMISSION_MODE in .gitea/workflows/adversarial-review.yml — runs the CLI in plan (read-only) mode over the checked-out PR tree. No edit capability is granted by this change.
  • Alternate-backend doc (examples/claude-code.yml:67-69, README.md:121) is commented out / example-only. It documents pointing ANTHROPIC_BASE_URL at a proxy; that's a consumer self-configured value (and would route the consumer's own ANTHROPIC_AUTH_TOKEN to their chosen proxy), not a gadget this repo exposes to untrusted input. Not a finding.
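The allowlist behaviour described in the first bullet can be sketched as below. This is an illustration under stated assumptions, not the body of `claudeEnv()`: `filterEnv` is a hypothetical name, the exact-name list contains only the three names the review spells out (the review elides the rest), and the prefixes are the ones the review lists.

```go
package main

import (
	"fmt"
	"strings"
)

// Exact names and prefixes taken from the review text above; the real
// exact-name list is longer (the review elides it), so this is partial.
var keepExact = map[string]bool{"PATH": true, "HOME": true, "USER": true}
var keepPrefixes = []string{"LC_", "CLAUDE_", "ANTHROPIC_", "GADFLY_CLAUDE_", "NODE_", "XDG_"}

// filterEnv returns only the KEY=VALUE entries whose key is allowlisted.
// Anything not explicitly kept (tokens, API keys) is dropped by default.
func filterEnv(environ []string) []string {
	var kept []string
	for _, kv := range environ {
		name, _, found := strings.Cut(kv, "=")
		if !found {
			continue
		}
		if keepExact[name] || hasAnyPrefix(name, keepPrefixes) {
			kept = append(kept, kv)
		}
	}
	return kept
}

func hasAnyPrefix(name string, prefixes []string) bool {
	for _, p := range prefixes {
		if strings.HasPrefix(name, p) {
			return true
		}
	}
	return false
}

func main() {
	env := []string{
		"PATH=/usr/bin",
		"GITEA_TOKEN=runner-secret",
		"CLAUDE_CODE_OAUTH_TOKEN=oauth-value",
		"GADFLY_API_KEY=another-secret",
	}
	for _, kv := range filterEnv(env) {
		fmt.Println(kv)
	}
}
```

Because the filter is an allowlist rather than a denylist, a newly added runner secret stays out of the subprocess unless someone deliberately adds it.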

One non-blocking observation (outside my lens, noted for awareness): if the claude CLI ever emits the OAuth token in an auth-error message, truncateForErr caps it at 800 bytes but does not redact secret-looking strings, so it could be surfaced in the posted PR comment via run.sh's error path. This is pre-existing for ANTHROPIC_API_KEY as well, not introduced by this PR, and I could not confirm any CLI path that actually echoes the token.
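A possible mitigation for this observation is to redact known secret values before truncating. The sketch below is hypothetical: `redactSecrets` and `truncate` are illustrative names, the 800-byte limit is the one quoted above, and nothing here claims to reflect how `truncateForErr` is actually written.

```go
package main

import (
	"fmt"
	"strings"
)

// redactSecrets replaces every occurrence of a known secret value in msg.
// Empty values are skipped so the message is not mangled.
func redactSecrets(msg string, secrets []string) string {
	for _, s := range secrets {
		if s == "" {
			continue
		}
		msg = strings.ReplaceAll(msg, s, "[REDACTED]")
	}
	return msg
}

// truncate caps msg at max bytes. It cuts on a byte boundary, so a
// multi-byte character at the cut point may be split.
func truncate(msg string, max int) string {
	if len(msg) > max {
		return msg[:max]
	}
	return msg
}

func main() {
	token := "oauth-token-value" // stand-in for CLAUDE_CODE_OAUTH_TOKEN
	stderr := "auth failed for token oauth-token-value: 401"

	// Redact first, then truncate, so a cut never leaves half a token behind.
	fmt.Println(truncate(redactSecrets(stderr, []string{token}), 800))
}
```

Redacting exact values only catches secrets the process already knows about; pattern-based detection of "secret-looking" strings would be a separate, fuzzier step.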

🎯 Correctness — No material issues found

VERDICT: No material issues found

I verified the change through the correctness lens — focusing on the new llamaswap provider wiring and the claude-code lane wiring in the workflow — and found no logic bugs.

Verified by reading the code:

  • cmd/gadfly/model.go llamaswap cases — The two new case "llamaswap" blocks (in resolveModel at model.go:83-88 and endpointProvider at model.go:225-230) mirror the adjacent ollama cases exactly: same option pattern (WithBaseURL/WithToken, plus WithName in the endpoint form), same error-message format. The resolveModel path correctly omits WithName (matching ollama there too, since that path has no endpoint name). The default-case error strings were updated in both functions to list llamaswap. Symmetric and correct.
  • claude-code lane routing: entrypoint.sh:159 provider_of splits claude-code/sonnet on / → claude-code, which matches the claude-code=1 entry in GADFLY_PROVIDER_CONCURRENCY (workflow line 82). So it lands in its own lane with cap 1, exactly as the PR description claims. The m5/qwen3.6:35b-mlx and bare :cloud ids route the same way they did before (the :cloud suffix is part of the model id, not the provider segment; bare ids fall back to the ollama-cloud default). No regression in the existing fleet routing.
  • Token propagation to the CLI: CLAUDE_CODE_OAUTH_TOKEN is mapped as step env in the workflow and inherited by the entrypoint. cmd/gadfly/engine.go:184-196 claudeEnv() keeps any CLAUDE_*-prefixed env var, so the token reaches the claude subprocess. The claude-code engine is selected via isClaudeCodeSpec in main.go:156-159 before resolveModel is called, so a claude-code/sonnet spec never falls into the majordomo provider switch. Correct.
  • Lens concurrency for the new lane: GADFLY_PROVIDER_LENS_CONCURRENCY only overrides ollama-cloud=3; the claude-code and m5 lanes fall back to the default lens concurrency (1). That's a conservative, safe choice (serial lenses), not a bug. The "9 concurrent ollama-cloud queries" comment still holds (3 models × 3 lenses).

Could not confirm (external dependency, not in this repo): the existence/signatures of llamaswap.New, llamaswap.WithBaseURL, llamaswap.WithToken, llamaswap.WithName in the bumped majordomo module. The integration code is symmetric with the proven ollama option pattern, and the PR reports go build/go test green, so I treat this as unverified-but-plausible rather than a finding.
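The lane routing verified above lives in `entrypoint.sh`; the same logic can be sketched in Go for illustration. Everything here is an assumption layered on the review text: `providerOf`, `parseCaps`, and `capFor` are hypothetical names, the comma-separated `name=n` format for the concurrency variable is a guess, and the cap values in `main` are examples only.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// providerOf returns the segment before the first "/" in a model id, or
// fallback for bare ids. A ":cloud" suffix stays part of the model id.
func providerOf(model, fallback string) string {
	if provider, _, found := strings.Cut(model, "/"); found && provider != "" {
		return provider
	}
	return fallback
}

// parseCaps reads "name=n" entries (comma-separated here, which is an
// assumption) and skips anything that is not an integer >= 1.
func parseCaps(raw string) map[string]int {
	caps := make(map[string]int)
	for _, entry := range strings.Split(raw, ",") {
		name, value, found := strings.Cut(strings.TrimSpace(entry), "=")
		if !found {
			continue
		}
		if n, err := strconv.Atoi(value); err == nil && n >= 1 {
			caps[name] = n
		}
	}
	return caps
}

// capFor returns the lane cap for provider, defaulting to 1 when the
// provider has no valid entry.
func capFor(caps map[string]int, provider string) int {
	if n, ok := caps[provider]; ok {
		return n
	}
	return 1
}

func main() {
	caps := parseCaps("ollama-cloud=3,claude-code=1") // example values
	for _, model := range []string{"claude-code/sonnet", "m5/qwen3.6:35b-mlx", "glm-5.2:cloud"} {
		lane := providerOf(model, "ollama-cloud")
		fmt.Printf("%s -> lane %s (cap %d)\n", model, lane, capFor(caps, lane))
	}
}
```

Defaulting an unknown or invalid cap to 1 is the conservative choice: a misconfigured lane runs serially instead of flooding a provider.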

🧯 Error handling & edge cases — No material issues found

VERDICT: No material issues found

Reviewed the diff through the 🧯 Error handling & edge cases lens, focusing on the new llamaswap provider wiring in cmd/gadfly/model.go and the claude-code/sonnet dogfood wiring in the workflow/scripts.

  • llamaswap branch in resolveModel (model.go:83-88) and endpointProvider (model.go:225-230): Both mirror the existing ollama case exactly — empty-key guard (if apiKey != "" / if key != ""), error returned on missing base URL via the shared len(parts) < 2 / baseURL == "" checks in endpointProvider, and the default cases return descriptive errors. No swallowed errors, no nil returns on the happy path. Verified by reading the full model.go.

  • endpointProvider parsing (model.go:203-256): The SplitN(raw, "|", 3) + len(parts) < 2 guard and the baseURL == "" check correctly reject malformed/empty entries; registerEnvProviders collects per-entry errors and continues (best-effort), logging them in resolveModel. No new edge cases introduced by adding the llamaswap case — it reuses the same parse/validation flow.

  • claude-code engine error handling (engine.go:115-177): Not changed by this diff, but it's the engine now being dogfooded. I verified the existing handling is robust for the unhappy paths this PR exposes it to: ctx.Err() check before parsing partial output, res.IsError surfaced, empty-result and no-parseable-output cases return explicit errors, claudeEnv drops secrets, and run.sh (156-181) catches non-zero exit / empty review and posts a ⚠️ comment rather than failing the lane. The workflow's CLAUDE_CODE_OAUTH_TOKEN mapping is just an env passthrough — no error path to miss.

  • Workflow concurrency wiring (.gitea/workflows/adversarial-review.yml): Adding claude-code/sonnet to GADFLY_MODELS and claude-code=1 to GADFLY_PROVIDER_CONCURRENCY is handled by entrypoint.sh's provider-lane logic (provider_of, run_lane, cap validation with [ "$cap" -ge 1 ] 2>/dev/null || cap=1 fallback). No new unhandled edge case — an unknown/missing lane cap safely defaults to 1.

No ignored errors, missing cleanup, panic-prone input, or unhandled boundary conditions were introduced by this diff within the error-handling lens.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 7s

<details><summary><b>🧯 Error handling & edge cases</b> — No material issues found</summary> ## VERDICT: No material issues found Reviewed the diff through the 🧯 Error handling & edge cases lens, focusing on the new `llamaswap` provider wiring in `cmd/gadfly/model.go` and the `claude-code/sonnet` dogfood wiring in the workflow/scripts. - **`llamaswap` branch in `resolveModel` (model.go:83-88) and `endpointProvider` (model.go:225-230):** Both mirror the existing `ollama` case exactly — empty-key guard (`if apiKey != ""` / `if key != ""`), error returned on missing base URL via the shared `len(parts) < 2` / `baseURL == ""` checks in `endpointProvider`, and the `default` cases return descriptive errors. No swallowed errors, no nil returns on the happy path. Verified by reading the full `model.go`. - **`endpointProvider` parsing (model.go:203-256):** The `SplitN(raw, "|", 3)` + `len(parts) < 2` guard and the `baseURL == ""` check correctly reject malformed/empty entries; `registerEnvProviders` collects per-entry errors and continues (best-effort), logging them in `resolveModel`. No new edge cases introduced by adding the `llamaswap` case — it reuses the same parse/validation flow. - **`claude-code` engine error handling (engine.go:115-177):** Not changed by this diff, but it's the engine now being dogfooded. I verified the existing handling is robust for the unhappy paths this PR exposes it to: `ctx.Err()` check before parsing partial output, `res.IsError` surfaced, empty-result and no-parseable-output cases return explicit errors, `claudeEnv` drops secrets, and `run.sh` (156-181) catches non-zero exit / empty review and posts a `⚠️` comment rather than failing the lane. The workflow's `CLAUDE_CODE_OAUTH_TOKEN` mapping is just an env passthrough — no error path to miss. 
- **Workflow concurrency wiring (`.gitea/workflows/adversarial-review.yml`):** Adding `claude-code/sonnet` to `GADFLY_MODELS` and `claude-code=1` to `GADFLY_PROVIDER_CONCURRENCY` is handled by `entrypoint.sh`'s provider-lane logic (provider_of, run_lane, cap validation with `[ "$cap" -ge 1 ] 2>/dev/null || cap=1` fallback). No new unhandled edge case — an unknown/missing lane cap safely defaults to 1. No ignored errors, missing cleanup, panic-prone input, or unhandled boundary conditions were introduced by this diff within the error-handling lens. </details> <sub>Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 7s</sub>

🪰 Gadfly review — glm-5.1:cloud (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

VERDICT: No material issues found

After reviewing the diff through the security lens and verifying the key code paths:

  • claudeEnv() (engine.go:184–203) — The allowlist filter that builds the claude subprocess environment is sound and well-tested (engine_test.go:120–150). It explicitly drops GITEA_TOKEN, GADFLY_FINDINGS_TOKEN, OLLAMA_API_KEY, GADFLY_API_KEY, etc., while keeping only CLAUDE_*, ANTHROPIC_*, GADFLY_CLAUDE_*, and standard system vars. This defense-in-depth is correctly implemented.

  • CLAUDE_CODE_OAUTH_TOKEN in workflow — Properly injected via ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }} and only reaches the claude subprocess through the CLAUDE_ prefix filter. No leakage to untrusted code paths.

  • Subprocess argument construction — The claude CLI is invoked via exec.Command (not a shell), so the PR-sourced system prompt and task content cannot cause argument injection. Flags like -p and --append-system-prompt receive their values as separate slice elements, immune to shell metacharacter attacks.

  • --permission-mode plan (default) — Read-only mode for the Claude Code engine. Safe default.

  • Workflow actor gate: issue_comment triggers are restricted to steve, fizi, dazed; no arbitrary-user secret-bearing runs.

  • Alternate backend docs — The ANTHROPIC_BASE_URL/ANTHROPIC_AUTH_TOKEN example in examples/claude-code.yml is commented-out and documented as "EXAMPLE ONLY, not validated." The ANTHROPIC_* passthrough in claudeEnv() is by design for this use case; no unsanctioned redirect risk since it requires explicit user configuration.

  • Process group cleanup: Setpgid: true combined with syscall.Kill(-pid, SIGKILL) on context cancel properly reaps the CLI and its Node children. No orphan risk.

  • llamaswap provider — Follows the identical pattern to ollama/openai/anthropic with WithBaseURL + optional WithToken. No new attack surface.

No authn/authz gaps, injection vectors, secret leaks, or unsafe input handling found.
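
The allowlist behaviour described for claudeEnv() can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in engine.go: the name filterEnv and the shortened lists of exact names and prefixes are hypothetical, picked from the variables the review mentions.

```go
package main

import (
	"fmt"
	"strings"
)

// filterEnv is a hypothetical stand-in for the allowlist the review describes.
// It keeps a variable only if its name is listed exactly or starts with an
// allowed prefix; everything else (tokens, provider keys) is dropped.
func filterEnv(environ []string) []string {
	exact := map[string]bool{"PATH": true, "HOME": true, "USER": true}
	prefixes := []string{"CLAUDE_", "ANTHROPIC_", "GADFLY_CLAUDE_"}

	var kept []string
	for _, kv := range environ {
		name, _, ok := strings.Cut(kv, "=")
		if !ok {
			continue // skip malformed entries that lack '='
		}
		allowed := exact[name]
		for _, p := range prefixes {
			if strings.HasPrefix(name, p) {
				allowed = true
				break
			}
		}
		if allowed {
			kept = append(kept, kv)
		}
	}
	return kept
}

func main() {
	env := []string{
		"PATH=/usr/bin",
		"GITEA_TOKEN=secret",
		"CLAUDE_CODE_OAUTH_TOKEN=token",
		"GADFLY_API_KEY=key",
	}
	// Only PATH and CLAUDE_CODE_OAUTH_TOKEN pass the filter.
	for _, kv := range filterEnv(env) {
		fmt.Println(kv)
	}
}
```

Note that GADFLY_API_KEY is dropped even though GADFLY_CLAUDE_* is allowed, because the match is on the full prefix rather than on "GADFLY_".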

🎯 Correctness — No material issues found

VERDICT: No material issues found

After verifying the key correctness concerns:

  • llamaswap provider wiring: The new llamaswap case in both resolveModel() (lines 83–88) and endpointProvider() (lines 225–230) follows the same pattern as ollama: WithBaseURL/WithToken/WithName in the endpoint path, WithBaseURL/WithToken (no WithName) in the GADFLY_BASE_URL path. Import is present. Error messages are updated. All consistent and correct.

  • claude-code/sonnet workflow dispatch: provider_of() in entrypoint.sh (line 159) extracts claude-code from claude-code/sonnet via ${1%%/*}, matching the claude-code=1 concurrency lane. The Go binary's isClaudeCodeSpec() (engine.go:64–65) correctly matches strings.HasPrefix(m, "claude-code/") and extracts sonnet as the model. The GADFLY_PROVIDER_LENS_CONCURRENCY only specifies ollama-cloud=3, which is correct because the claude-code engine manages its own loop (engine.go:30 comment confirms this).

  • Provider concurrency partitioning: All 11 models in GADFLY_MODELS map to exactly one lane each: 9 bare ids → ollama-cloud, m5/qwen3.6:35b-mlx → m5, claude-code/sonnet → claude-code. These three lane keys (ollama-cloud, m5, claude-code) all appear in GADFLY_PROVIDER_CONCURRENCY with explicit caps. No orphaned lane, no unkeyed provider.

  • Dependency bump: majordomo bumped from 0147a79d187b to b2487a1a370c, bringing in the llamaswap package. The go.sum changes are consistent with the go.mod update (transitive deps updated, old ones pruned). No suspicious additions.

  • Default-provider fallback: When GADFLY_BASE_URL is empty (the common case for ollama-cloud and claude-code models), resolution goes through majordomo.Parse(buildSpec(provider, model)), which correctly handles all registered providers including claude-code. No path incorrectly requires GADFLY_BASE_URL.

No logic bugs, semantic errors, or incorrect behavior found in this change.
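
The lane partitioning checked above can be illustrated with a short sketch. It assumes the rule the review states (the segment before the first "/" is the lane key, and bare ids fall back to ollama-cloud); providerOf and groupByLane are hypothetical names, and the real logic lives in entrypoint.sh and the Go binary.

```go
package main

import (
	"fmt"
	"strings"
)

// providerOf returns the lane key for a model spec: the segment before the
// first "/", or def for a bare id such as "glm-5.1:cloud".
// Hypothetical helper, not the project's implementation.
func providerOf(spec, def string) string {
	if i := strings.Index(spec, "/"); i >= 0 {
		return spec[:i]
	}
	return def
}

// groupByLane buckets model specs by lane key, keeping input order per lane.
func groupByLane(specs []string, def string) map[string][]string {
	lanes := make(map[string][]string)
	for _, s := range specs {
		key := providerOf(s, def)
		lanes[key] = append(lanes[key], s)
	}
	return lanes
}

func main() {
	specs := []string{"glm-5.1:cloud", "m5/qwen3.6:35b-mlx", "claude-code/sonnet"}
	lanes := groupByLane(specs, "ollama-cloud")
	for _, key := range []string{"ollama-cloud", "m5", "claude-code"} {
		fmt.Println(key, lanes[key])
	}
}
```

The ":cloud" suffix contains no slash, so it stays part of the model id and does not affect the lane key.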

🧯 Error handling & edge cases — No material issues found

VERDICT: No material issues found

I reviewed the diff through the error-handling & edge-case lens, verifying the new llamaswap provider wiring, the claude-code engine integration in the workflow, and the go.mod bump. Key checks:

  • llamaswap in resolveModel() (line 83-88): Identical pattern to ollama. baseURL is guaranteed non-empty by the if baseURL == "" guard at line 65, so llamaswap.WithBaseURL always receives a real value. Error from .Model() propagates; nothing swallowed.

  • llamaswap in endpointProvider() (line 225-230): Empty baseURL is explicitly rejected with an error at line 214-216 before the switch. Identical option-construction pattern to ollama. No nil-able paths.

  • claudeCodeEngine.runPass() (engine.go:115-177): Context cancellation is checked before parsing stdout (ctx.Err()), process group cleanup uses Setpgid + negative-PID kill with a nil-process guard, empty/missing result returns an error rather than leaking raw JSON, and UTF-8 truncation is rune-safe. Missing CLI binary surfaces as a normal pass error (advisory). All unhappy paths are handled.

  • claudeEnv() (engine.go:184-204): Correctly passes CLAUDE_CODE_OAUTH_TOKEN through (matches the CLAUDE_ prefix) while stripping unrelated secrets. Verified by reading the test at engine_test.go:120-149.

  • Workflow YAML: If CLAUDE_CODE_OAUTH_TOKEN is unset/empty, the claude-code engine fails auth, which surfaces as an advisory error in the review comment — never fatal. Acceptable degradation.

  • provider_of / lane routing: claude-code/sonnet correctly extracts provider claude-code and routes to its own concurrency lane. Verified in entrypoint.sh:159 and the Go binary's modelProvider().

No swallowed errors, missing defers, nil panics, or unhandled edge cases in the changed code.
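
As a rough illustration of the endpoint validation discussed above, the sketch below parses a value in the "kind|url[|key]" form from the PR description and rejects a missing or empty URL before any provider is built. The type endpointSpec and the function parseEndpoint are hypothetical; the actual checks in endpointProvider() may be ordered or worded differently.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// endpointSpec holds the parsed parts of a "kind|url[|key]" value.
type endpointSpec struct {
	Kind    string
	BaseURL string
	Key     string // optional; empty means no bearer token
}

// parseEndpoint is a hypothetical sketch of the guards the review describes:
// require at least kind and URL, reject an empty URL, treat the key as optional.
func parseEndpoint(raw string) (endpointSpec, error) {
	parts := strings.SplitN(raw, "|", 3)
	if len(parts) < 2 {
		return endpointSpec{}, errors.New("endpoint must be kind|url[|key]")
	}
	spec := endpointSpec{Kind: parts[0], BaseURL: parts[1]}
	if spec.BaseURL == "" {
		return endpointSpec{}, errors.New("endpoint has an empty base URL")
	}
	if len(parts) == 3 {
		spec.Key = parts[2]
	}
	return spec, nil
}

func main() {
	for _, raw := range []string{
		"llamaswap|http://swap.lan:8080|tok",
		"llamaswap|http://swap.lan:8080",
		"llamaswap",
	} {
		spec, err := parseEndpoint(raw)
		fmt.Printf("%q -> %+v, err=%v\n", raw, spec, err)
	}
}
```

Validating before the provider switch means every provider arm, including a newly added one, inherits the same rejection of malformed entries.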

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 17s

🪰 Gadfly review — claude-code/sonnet (claude-code)

Verdict: Minor issues — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

I now have enough to complete the review. The claudeEnv() isolation logic is fully verified.


No material issues found

Reviewed through the security lens. Verified findings below.

Verified clean:

  • claudeEnv() secret isolation (engine.go:184-204): The whitelist correctly excludes GITEA_TOKEN, GADFLY_FINDINGS_TOKEN, and provider keys (e.g. OLLAMA_CLOUD_API_KEY) from the claude subprocess. Only CLAUDE_*, ANTHROPIC_*, GADFLY_CLAUDE_*, NODE_*, XDG_*, and the named standard vars reach the CLI. Verified by reading the full function; the new CLAUDE_CODE_OAUTH_TOKEN secret is intentionally forwarded via the CLAUDE_ prefix.

  • llamaswap provider wiring (model.go:83-88, 222-229): Both the resolveModel and endpointProvider paths mirror the existing ollama case exactly — URL and API key come from operator-controlled env vars, passed to the library, never interpolated into a shell command or used in URL construction within gadfly's own code. No injection surface.

  • GADFLY_CLAUDE_EXTRA_ARGS passthrough (engine.go:86, 105): Uses strings.Fields (not a shell) and passes args directly to exec.CommandContext. No shell injection risk. This is operator-controlled trusted input.

  • Alternate backend proxy docs (examples/claude-code.yml:63-69, README.md): The example keeps ANTHROPIC_BASE_URL / ANTHROPIC_AUTH_TOKEN commented out and marks them "EXAMPLE ONLY." The claudeEnv() function will forward both the redirect URL and any ANTHROPIC_* credential to the subprocess as designed — standard proxy-configuration behavior, well-documented as the operator's responsibility.

  • CLAUDE_CODE_OAUTH_TOKEN in CI (.gitea/workflows/adversarial-review.yml:56): Correctly scoped as a Gitea secret; forwarded only to the claude subprocess, not to majordomo, provider API calls, or the findings endpoint.
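
To make the "no shell" argument concrete, here is a minimal sketch of building an argv slice. It assumes the flag names mentioned in these reviews (-p, --model, --permission-mode); buildArgs, the argument order, and the extra-args value are hypothetical and not taken from engine.go.

```go
package main

import (
	"fmt"
	"strings"
)

// buildArgs assembles the argv for a CLI call as discrete elements.
// Hypothetical sketch: the real engine may order or name things differently.
func buildArgs(model, permissionMode, prompt, extra string) []string {
	args := []string{"-p", prompt, "--model", model, "--permission-mode", permissionMode}
	// strings.Fields splits on whitespace only; it performs no quoting,
	// globbing, or variable expansion.
	return append(args, strings.Fields(extra)...)
}

func main() {
	// "--example-flag value" is a made-up operator setting for illustration.
	args := buildArgs("sonnet", "plan", "review this; rm -rf /", "--example-flag value")
	// Passing args to exec.CommandContext(ctx, "claude", args...) would hand
	// each element to the process as one argv slot, so the ';' in the prompt
	// stays plain text and is never interpreted as a command separator.
	for i, a := range args {
		fmt.Printf("%d: %q\n", i, a)
	}
}
```

The trade-off of strings.Fields is that an extra argument containing spaces cannot be expressed, which is acceptable for operator-controlled flags.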

🎯 Correctness — No material issues found

I've now read model.go in full, checked engine.go, main.go, and the adversarial-review workflow. Here is my verdict.


No material issues found

The correctness-relevant changes are:

  • llamaswap in resolveModel() (model.go:82–87): mirrors the ollama case exactly — WithBaseURL + optional WithToken, no WithName (correct: WithName is only needed for registry registration, not for direct .Model() calls). Verified by reading the full function.

  • llamaswap in endpointProvider() (model.go:225–230): mirrors the ollama case exactly — WithName(name), WithBaseURL, optional WithToken. Correct; named endpoints need WithName so majordomo.Parse("name/model") can find them. Verified.

  • claude-code/sonnet routing (adversarial-review.yml:79,82): isClaudeCodeSpec("claude-code/sonnet") → true (verified at engine.go:63–65); newClaudeCodeEngine extracts "sonnet" after the slash → --model sonnet (verified at engine.go:76–78). entrypoint.sh:159's provider_of extracts "claude-code" as the lane key, which matches GADFLY_PROVIDER_CONCURRENCY: "claude-code=1". All routing is consistent.

  • Model count in GADFLY_MODELS: 9 :cloud entries + m5/qwen3.6:35b-mlx + claude-code/sonnet = 11 total. Matches the comment "9 cloud + M5 Mac + claude-code/sonnet". Verified by counting.

  • Concurrency math (adversarial-review.yml:82–84): ollama-cloud=3 models × GADFLY_PROVIDER_LENS_CONCURRENCY: ollama-cloud=3 lenses = 9 concurrent cloud queries. Comment at line 83 says "under the 10 budget" — arithmetic is correct.

  • Error message updates in resolveModel() default and endpointProvider() default both now include llamaswap in the provider list — accurate.
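
The concurrency arithmetic above can be written out as a tiny check. This is only a worked example of the numbers quoted in the review (3 models, 3 lenses, a budget of 10); peakQueries is a hypothetical helper, not part of gadfly.

```go
package main

import "fmt"

// peakQueries returns the worst-case number of simultaneous requests a lane
// can issue: models running in parallel times lenses running per model.
func peakQueries(modelCap, lensCap int) int {
	return modelCap * lensCap
}

func main() {
	const budget = 10         // the "10 budget" quoted from the workflow comment
	peak := peakQueries(3, 3) // ollama-cloud=3 models, 3 lenses each
	fmt.Printf("peak=%d within budget=%t\n", peak, peak <= budget)
}
```

Raising either cap to 4 would give 12 concurrent queries and exceed the quoted budget, so the two settings have to be adjusted together.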

🧯 Error handling & edge cases — Minor issues

The finding is confirmed. The llamaswap case at model.go:225-230 exists, and model_test.go has no subtest covering it — only ollama, openai, and foreman (×2) are exercised, plus the bad-input cases.

Minor issues


  • cmd/gadfly/model_test.go:35: llamaswap provider arm in endpointProvider has no test case

    TestEndpointProvider exercises ollama, openai-compatible, and foreman (with and without token), plus bad-input paths. The llamaswap arm added at cmd/gadfly/model.go:225-230 is not covered. The project's own maintenance rule (CLAUDE.md: "Add a test when you add logic") calls this out explicitly, and the file has a ready-to-copy subtest shape.

    Concrete risk: if llamaswap.WithName, llamaswap.WithBaseURL, or llamaswap.WithToken have a different signature than the ollama equivalents (plausible for a newly-added upstream package), the mismatch is invisible at test time and only surfaces during a live run.

    Suggested fix — add to TestEndpointProvider:

    t.Run("llamaswap with key", func(t *testing.T) {
        p, err := endpointProvider("ls", "llamaswap|http://swap.lan:8080|tok")
        if err != nil {
            t.Fatalf("unexpected error: %v", err)
        }
        if p.Name() != "ls" {
            t.Errorf("Name() = %q, want %q", p.Name(), "ls")
        }
    })
    t.Run("llamaswap without token", func(t *testing.T) {
        if _, err := endpointProvider("ls2", "llamaswap|http://swap.lan:8080"); err != nil {
            t.Fatalf("unexpected error: %v", err)
        }
    })
    

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 7m 16s

🪰 Gadfly review — minimax-m3:cloud (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

VERDICT: No material issues found

Findings — Security lens only:

  • Auth/authz on claude-code/sonnet lane: Verified .gitea/workflows/adversarial-review.yml:37-42 — the job-level if: requires the issue_comment actor to be one of steve|fizi|dazed (same allow-list as GADFLY_ALLOWED_USERS), so an untrusted commenter cannot trigger a secret-bearing run that would invoke the Claude Code subscription. pull_request and workflow_dispatch are already trusted by definition. No regression from adding the claude-code lane.
  • Secret exposure of CLAUDE_CODE_OAUTH_TOKEN (adversarial-review.yml:56): Passed via ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }} and the workflow runs pull_request/workflow_dispatch/trusted-comment events on gadfly's own repo. In Gitea, secrets are not exposed to PRs from forks, and the job itself never echoes the token; cmd/gadfly/engine.go forwards it only into the claude subprocess via env. The engine test (engine_test.go:125-139) confirms the env-allowlist is explicit, so GITEA_TOKEN/provider keys aren't leaked into the CLI. No leakage introduced.
  • New llamaswap provider (cmd/gadfly/model.go:83-88, 225-230): Mirrors the existing ollama case — passes baseURL and apiKey straight through to the constructor with no string concatenation into shell/URL, no template rendering, no logging of the key. No injection or SSRF surface beyond what ollama already had; the verbatim-base-URL behavior is the same plaintext-http://-friendly pattern already established for ollama and foreman and is not a regression.
  • "Alternate backends" example (examples/claude-code.yml:63-69): Example-only, all lines commented out, and the docs explicitly warn that tool-use support depends on the proxy and that the env forwards ANTHROPIC_* to the subprocess. The workflow file (not the engine) is the only place this URL could be set; not wired into code. No issue.
  • Subscription-auth-in-CI caveat: Already flagged in README + example as a terms-of-service gray area. Not a code-level security defect, and the PR description acknowledges it. Worth noting but not a fix-blocking finding.

I did not find any new injection (SQL/command/path/template), SSRF, unsafe-deserialization, or insecure-default surface introduced by this PR.

🎯 Correctness — No material issues found

VERDICT: No material issues found

I traced the change through the correctness lens and could not substantiate a real bug. The diffs in scope:

  • cmd/gadfly/model.go — the new llamaswap case in resolveModel (lines 83–88) and endpointProvider (lines 225–230) is a verbatim mirror of the existing ollama case: same Option{}/WithBaseURL/WithToken/WithName/New pattern, same fmt.Errorf text updated consistently in both error messages (use ollama/llamaswap/foreman/openai/anthropic/google). I read model.go end-to-end; the import (provider/llamaswap) matches the bump in go.mod, and the symbols used (New, Option, WithBaseURL, WithToken, WithName) are the conventional names for majordomo providers (same as ollama/openai/anthropic/google above/below).
  • .gitea/workflows/adversarial-review.yml: adds claude-code/sonnet to GADFLY_MODELS and claude-code=1 to GADFLY_PROVIDER_CONCURRENCY, and maps CLAUDE_CODE_OAUTH_TOKEN. Concurrency-key resolution is consistent: modelProvider() (cmd/gadfly/model.go:142–151) returns the segment before the first / in GADFLY_MODEL, so claude-code/sonnet → "claude-code", which matches the new concurrency key. The pre-existing claude-code engine (added in Phase 1, exercised in examples/claude-code.yml with the exact same claude-code/sonnet spec) means the entrypoint already routes this prefix, so no behavior change is required on the engine side, only a new lane entry.
  • README.md / examples/claude-code.yml — doc-only additions; the new "10 concurrent ollama-cloud queries" framing in the workflow comment is unaffected because the "under the 10 budget" parenthetical refers to ollama-cloud's per-provider budget (3 × 3 = 9, still true), not total process budget.
  • go.mod / go.sum — straightforward dep bump; no replacement of the project's own semantic constants.

I could not independently confirm the majordomo version v0.0.0-20260627201401-b2487a1a370c exposes the exact llamaswap.New/Option/WithBaseURL/WithToken/WithName symbol set — that's an external module check, and the PR description states this bump is what adds it. The go build reported in the PR footer is consistent with the import compiling.

Outside my lens: nothing. The cross-cutting lenses (security of the CLAUDE_CODE_OAUTH_TOKEN mapping, CI perf with one more lane, etc.) belong to other reviewers.

🧯 Error handling & edge cases — No material issues found

VERDICT: No material issues found

Lens: 🧯 Error handling & edge cases.

I verified the new code paths against the checked-out tree (cmd/gadfly/model.go, cmd/gadfly/engine.go, cmd/gadfly/main.go, entrypoint.sh, scripts/run.sh, the workflow, the README). Findings in my lane:

  • llamaswap wiring (cmd/gadfly/model.go:83-88, :225-230): mirrors the ollama pattern exactly — empty/missing GADFLY_BASE_URL is handled by the earlier if baseURL == "" fallthrough to majordomo.Parse, and the endpoint-override path requires baseURL to be non-empty (endpointProvider returns an error on missing URL at line 214-216). Optional API key handled. No new nil/empty edge cases introduced.
  • claude-code env scrubbing (cmd/gadfly/engine.go:184-204): claudeEnv() keeps CLAUDE_* / ANTHROPIC_* / NODE_* / XDG_* / LC_* / PATH|HOME|USER|LOGNAME|TMPDIR|LANG|TERM|SHELL and drops everything else. Missing CLAUDE_CODE_OAUTH_TOKEN surfaces as a subprocess auth failure → caught by runPass's IsError/stderr fallback and reported as a per-model comment error rather than a hung job. Defense-in-depth matches the README claim.
  • runPass cancel/error handling (cmd/gadfly/engine.go:115-177): ctx.Err() is checked AFTER cmd.Run(), so a timed-out lens never returns partial bytes as a "successful" review. cmd.Cancel SIGKILLs the whole process group, preventing orphan claude/Node children. Empty result, parse failure, exit-error, and stdout-text fallback all handled distinctly.
  • Concurrency budget: GADFLY_PROVIDER_LENS_CONCURRENCY: "ollama-cloud=3" × 3 models = 9 concurrent ollama-cloud calls (under the documented 10-budget). m5 and claude-code fall through to default lensConcurrency=1 (cmd/gadfly/main.go:282-286), matching the workflow comment "the Mac one at a time; claude-code one at a time". No oversubscription risk.
  • Status-board "never started" guard (entrypoint.sh:189-191): unchanged from before, still correct — a claude-code model that crashes at subprocess spawn leaves .started == 0 and gets marked done:true so the board can complete. No new failure mode here.
  • Workflow timeout: 90 minutes is enough headroom; the slowest lane (claude-code/sonnet, CLI default model, 3 sequential lenses) is well within that.
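The env-scrubbing behavior attributed to claudeEnv() amounts to an allowlist filter. The sketch below uses the prefixes and exact names listed in the finding above; filterEnv and hasKeptPrefix are hypothetical names, and the real function in cmd/gadfly/engine.go may be structured differently.

```go
package main

import (
	"fmt"
	"strings"
)

// Prefixes and exact names taken from the review finding; everything else is dropped.
var keepPrefixes = []string{"CLAUDE_", "ANTHROPIC_", "NODE_", "XDG_", "LC_"}

var keepExact = map[string]bool{
	"PATH": true, "HOME": true, "USER": true, "LOGNAME": true,
	"TMPDIR": true, "LANG": true, "TERM": true, "SHELL": true,
}

// filterEnv (hypothetical name) keeps only allowlisted KEY=VALUE entries.
func filterEnv(environ []string) []string {
	var kept []string
	for _, kv := range environ {
		name, _, ok := strings.Cut(kv, "=")
		if !ok {
			continue // skip malformed entries without "="
		}
		if keepExact[name] || hasKeptPrefix(name) {
			kept = append(kept, kv)
		}
	}
	return kept
}

func hasKeptPrefix(name string) bool {
	for _, p := range keepPrefixes {
		if strings.HasPrefix(name, p) {
			return true
		}
	}
	return false
}

func main() {
	env := []string{"CLAUDE_CODE_OAUTH_TOKEN=example", "GITEA_TOKEN=example", "PATH=/usr/bin"}
	// GITEA_TOKEN is not on the allowlist, so it never reaches the subprocess.
	fmt.Println(filterEnv(env))
}
```

An allowlist fails closed: a secret added to the workflow later is withheld from the subprocess unless its name matches one of the kept patterns.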

Nothing materially wrong in this lens. (The MODEL marker prefix is gadfly-review:ollama: even for claude-code/... because run.sh hardcodes PROVIDER=ollama at line 181 — confusing but pre-existing and not a bug; the marker still uniquely identifies the comment for upsert.)

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 53s


🪰 Gadfly review — kimi-k2.7-code:cloud (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

Verdict: No material issues found.

Through the security lens, the PR introduces the llamaswap provider as a symmetric twin of the existing ollama/foreman endpoint-override providers: it receives a base URL and optional bearer token and passes them directly to majordomo's llamaswap provider constructor. No new injection, SSRF, deserialization, or credential-exposure surface is added relative to the already-intentional configurable-backend design.

  • cmd/gadfly/model.go:83-88 and cmd/gadfly/model.go:225-230 — the new llamaswap branch mirrors the existing ollama path and forwards GADFLY_BASE_URL/GADFLY_API_KEY (or the GADFLY_ENDPOINT_* URL and key) to the provider without any additional parsing that could introduce path/SQL/command injection.
  • .gitea/workflows/adversarial-review.yml:56 — CLAUDE_CODE_OAUTH_TOKEN is correctly passed as a secret into the container; it is not logged or templated.
  • examples/claude-code.yml:63-69 — the alternate-backend configuration is commented out and explicitly labeled as example-only/not validated, so it does not expose a default insecure path.
  • Authz guard at .gitea/workflows/adversarial-review.yml:37-42 is unchanged and still restricts issue_comment triggers to the allow-listed maintainers.

No security findings require action.

🎯 Correctness — No material issues found

Verdict: No material issues found

Through the correctness lens, the change is clean:

  • cmd/gadfly/model.go wires llamaswap symmetrically with the other endpoint-override providers: resolveModel() constructs it when GADFLY_PROVIDER=llamaswap + GADFLY_BASE_URL is set, and endpointProvider() constructs it for GADFLY_ENDPOINT_*="llamaswap|url[|key]". Both paths honor name, base URL, and optional token exactly like ollama. Error messages were updated to include llamaswap.
  • The claude-code dogfood configuration matches the engine implementation: claude-code/sonnet is added to GADFLY_MODELS, its concurrency lane is claude-code=1, and CLAUDE_CODE_OAUTH_TOKEN is forwarded. The binary's isClaudeCodeSpec and modelProvider() correctly route that spec to the claude-code lane.
  • The dependency bump and the documented "alternate backends" example for Claude Code are consistent with the existing env-forwarding logic in claudeEnv() (ANTHROPIC_* / CLAUDE_* are already passed through).
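The GADFLY_ENDPOINT_<NAME>="llamaswap|url[|key]" format can be parsed along these lines. This is a sketch under assumptions: parseEndpoint and the endpoint struct are hypothetical, and gadfly's own endpointProvider() may validate differently.

```go
package main

import (
	"fmt"
	"strings"
)

// endpoint is a hypothetical holder for the three pipe-separated fields.
type endpoint struct {
	Provider string
	URL      string
	Key      string // optional; empty when the spec has no third field
}

// parseEndpoint (hypothetical name) splits "provider|url[|key]".
// The provider and URL are required; the key is optional.
func parseEndpoint(spec string) (endpoint, error) {
	parts := strings.SplitN(spec, "|", 3)
	if len(parts) < 2 || parts[0] == "" || parts[1] == "" {
		return endpoint{}, fmt.Errorf("endpoint %q: want \"provider|url[|key]\"", spec)
	}
	ep := endpoint{Provider: parts[0], URL: parts[1]}
	if len(parts) == 3 {
		ep.Key = parts[2]
	}
	return ep, nil
}

func main() {
	// The URL here is a placeholder, not a real llama-swap instance.
	ep, err := parseEndpoint("llamaswap|http://localhost:8080")
	fmt.Println(ep.Provider, ep.URL, ep.Key == "", err)
}
```

Using SplitN with a limit of 3 means a key that itself contains "|" is kept intact rather than truncated.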

No logic bugs, incorrect constants/conversions, or domain-correctness issues were found after reading the relevant files.

🧯 Error handling & edge cases — No material issues found

No material issues found.

From the error handling & edge cases lens, the diff is clean:

  • The new llamaswap provider branch in cmd/gadfly/model.go mirrors the existing ollama/openai branches exactly, including the empty baseURL guard already enforced by endpointProvider and the unchanged apiKey optional-token path. No swallowed errors or nil/empty edge cases are introduced.
  • The claude-code engine path in cmd/gadfly/engine.go already handles cancellation/timeouts (ctx.Err() check), empty JSON results, parse failures, nonzero exits, and process-group cleanup. None of those guards are touched by this PR.
  • The workflow change just adds a new model string and a new provider-concurrency entry; it doesn't change retry/timeout/error-surfacing behavior.
  • No new integer arithmetic, boundary loops, or collection indexing appears in the diff.

Verified by reading cmd/gadfly/model.go, cmd/gadfly/engine.go, cmd/gadfly/main.go, and their tests.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 50s


🪰 Gadfly review — deepseek-v4-pro:cloud (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

No material issues found

I traced the key security boundaries in this PR:

  • claudeEnv() in engine.go:184-204: The environment filter for the Claude Code subprocess correctly passes through CLAUDE_* (which covers CLAUDE_CODE_OAUTH_TOKEN) and ANTHROPIC_* (for the documented alternate-backend path), while blocking GITEA_TOKEN, OLLAMA_CLOUD_API_KEY, GADFLY_API_KEY, GADFLY_FINDINGS_TOKEN, and all other provider secrets. Verified by reading the full function and cross-referencing the prefix list against every secret env var set in the workflow YAML.

  • CLAUDE_CODE_OAUTH_TOKEN in the workflow YAML (adversarial-review.yml:56): Mapped from secrets.CLAUDE_CODE_OAUTH_TOKEN — Gitea Actions redacts secret values from logs. The token flows through entrypoint.sh → run.sh → the Go binary's environment → claudeEnv() → the claude CLI subprocess. No intermediate logging or echo. Verified by reading all three files end-to-end.

  • llamaswap provider wiring (model.go:83-88, 225-230): Follows the identical pattern as ollama, openai, anthropic, and google — base URL and token come from workflow-author-controlled env vars (GADFLY_BASE_URL, GADFLY_API_KEY, or GADFLY_ENDPOINT_*), not from untrusted input. No injection surface.

  • Claude Code subprocess invocation (engine.go:116): Uses exec.CommandContext (no shell), so the system prompt, task text, and PR content — which flow into --append-system-prompt and -p flags — cannot inject commands regardless of content.

  • examples/claude-code.yml:68-69: The alternate-backend example correctly uses vars for the non-secret ANTHROPIC_BASE_URL and secrets for ANTHROPIC_AUTH_TOKEN. Commented out, so inert.

No authn/authz gaps, no injection vectors, no secret leakage paths, no unsafe defaults introduced by this change.

🎯 Correctness — No material issues found

Verdict: No material issues found

After reviewing the diff through the 🎯 Correctness lens, I find no logic bugs or semantic errors introduced by this change. Here's what I verified:

  • llamaswap wiring in resolveModel() (line 83-88): Structurally identical to the ollama case — same option pattern (WithBaseURL, WithToken), same return shape. No WithName is used here, which is correct because this is the direct-provider path (not endpoint registration). Verified by reading cmd/gadfly/model.go in full.

  • llamaswap wiring in endpointProvider() (line 225-230): Mirrors the ollama endpoint case, including WithName(name) for registry registration. Consistent with all other endpoint providers. Verified by reading the full function.

  • Error message updates (lines 107, 255): Both default branches now list llamaswap alongside the other supported providers. The messages are accurate — claude-code is intentionally absent from these lists because it doesn't support GADFLY_BASE_URL endpoint overrides (it reads ANTHROPIC_BASE_URL from the subprocess env instead).

  • claude-code/sonnet in GADFLY_MODELS: The spec claude-code/sonnet resolves through majordomo.Parse() (no GADFLY_BASE_URL is set), which is the correct path for the Phase-1 engine. modelProvider() extracts claude-code as the lane, matching claude-code=1 in both GADFLY_PROVIDER_CONCURRENCY and (by absence, falling back to scalar default 1) GADFLY_PROVIDER_LENS_CONCURRENCY. Verified by reading the workflow YAML and model.go.

  • Concurrency settings: claude-code=1 in GADFLY_PROVIDER_CONCURRENCY correctly limits the single claude-code model to one-at-a-time. No claude-code entry in GADFLY_PROVIDER_LENS_CONCURRENCY means lenses fall back to the scalar default (1, sequential), which is appropriate for a subprocess-based engine. Verified against the lens concurrency test matrix in lens_concurrency_test.go (the "no lane match falls back to scalar" test case at line 175).

  • CLAUDE_CODE_OAUTH_TOKEN: Correctly mapped from secrets and named consistently with what the majordomo claude-code provider expects.
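The per-lane fallback described for GADFLY_PROVIDER_LENS_CONCURRENCY can be sketched like this. lensConcurrencyFor is a hypothetical name, and the comma-separated "lane=N" format for multiple entries is an assumption; the review only shows the single entry "ollama-cloud=3".

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// lensConcurrencyFor (hypothetical name) looks up a lane in a "lane=N" list
// and falls back to the scalar default when the lane has no valid entry.
// Assumption: multiple entries are comma-separated.
func lensConcurrencyFor(raw, lane string, def int) int {
	for _, entry := range strings.Split(raw, ",") {
		name, val, ok := strings.Cut(strings.TrimSpace(entry), "=")
		if !ok || name != lane {
			continue
		}
		if n, err := strconv.Atoi(val); err == nil && n > 0 {
			return n
		}
	}
	return def
}

func main() {
	raw := "ollama-cloud=3"
	fmt.Println(lensConcurrencyFor(raw, "ollama-cloud", 1)) // prints 3
	fmt.Println(lensConcurrencyFor(raw, "claude-code", 1))  // prints 1 (fallback)
}
```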

🧯 Error handling & edge cases — No material issues found

No material issues found.

The llamaswap additions in cmd/gadfly/model.go (both resolveModel and endpointProvider) follow the exact same guard-and-propagate pattern as the existing ollama/openai/anthropic/google cases: baseURL is validated non-empty before the switch, apiKey/key is conditionally applied only when non-empty, and any error from New(...).Model(...) propagates up through the existing resolveModel → run → main chain (where run.sh catches non-zero exits and renders an advisory failure comment). The error messages in the default branches correctly include llamaswap. No swallowed errors, no nil dereference paths, no missing cleanup.
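The guard-and-propagate pattern can be sketched with local stand-in types. Option, WithBaseURL, and WithToken below are defined in the example itself and only echo the names the review attributes to majordomo providers; they are not the library's API, and newConfig is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// config and Option are local stand-ins, not majordomo types.
type config struct {
	baseURL string
	token   string
}

type Option func(*config)

func WithBaseURL(u string) Option { return func(c *config) { c.baseURL = u } }
func WithToken(t string) Option   { return func(c *config) { c.token = t } }

// newConfig (hypothetical name) rejects an empty base URL up front and
// applies the token option only when a key was actually supplied.
func newConfig(baseURL, apiKey string) (*config, error) {
	if baseURL == "" {
		return nil, errors.New("base URL is required")
	}
	opts := []Option{WithBaseURL(baseURL)}
	if apiKey != "" {
		opts = append(opts, WithToken(apiKey))
	}
	c := &config{}
	for _, apply := range opts {
		apply(c)
	}
	return c, nil
}

func main() {
	// The URL is a placeholder; no key is supplied, so no token option is applied.
	c, err := newConfig("http://localhost:8080", "")
	fmt.Println(c.baseURL, c.token == "", err)
}
```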

The workflow YAML changes are configuration-only. The claude-code/sonnet model and CLAUDE_CODE_OAUTH_TOKEN secret are wired into the existing error-surfacing chain: if the token is missing or the CLI fails, claudeCodeEngine.runPass returns an error → reviewWithSpecialist renders it as "⚠️ This reviewer failed to complete: …" → the comment is posted advisory (never blocks the job). The claude-code=1 concurrency cap and the absence of a claude-code entry in GADFLY_PROVIDER_LENS_CONCURRENCY (falling back to the default of 1) are both correct and safe.

The go.mod/go.sum bump is a dependency update only; no error-handling surface there.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 59s


🪰 Gadfly review — nemotron-3-super:cloud (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

No material issues found

Reasoning: After reviewing the changes through the security lens (authn/authz, injection, SSRF, secrets leakage, input validation, etc.), no material security issues were found in the gadfly repository changes:

  1. CLAUDE_CODE_OAUTH_TOKEN handling: The token is properly sourced from Gitea secrets and passed via environment to the Claude Code CLI subprocess. The engine's entrypoint script forwards the token without logging or exposure. While the README notes subscription auth in CI is a "gray area" per Anthropic's terms, this is a policy concern, not a code vulnerability.

  2. llamaswap provider integration: The provider is integrated similarly to existing providers (ollama/openai/etc.), passing baseURL and token via constructor options. No SSRF/injection risks are introduced in gadfly's code (validation/sanitization depends on the provider implementation in majordomo, which is out of scope for this repo review).

  3. Example documentation: The commented-out Claude Code proxy example in examples/claude-code.yml and README notes are explicitly marked as "example only, not validated/wired." No active code changes introduce proxy-related risks.

  4. Dependency updates: Updated dependencies (including majordomo for llamaswap) are indirect and vetted by CI (go test -race ./... passed). No direct vulnerable dependencies were introduced in gadfly's own code.

  5. No authn/authz gaps, injection points, or secrets leaks were identified in the modified files after verifying via read_file and grep.

No material security issues were found within the assigned lens. Outside my lens: none noted.

🎯 Correctness — No material issues found

No material issues found

🧯 Error handling & edge cases — No material issues found

No material issues found

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 4m 38s


🪰 Gadfly review — gpt-oss:120b-cloud (ollama-cloud)

Verdict: Minor issues — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

VERDICT: No material issues found

  • The changes introduce a new llamaswap provider and a claude-code/sonnet backend, but all added environment variables (CLAUDE_CODE_OAUTH_TOKEN) and configuration strings are only used for authentication to external services. No code paths directly expose these secrets, inject them into commands, SQL, or file paths, nor forward unvalidated user input to external APIs.
  • Provider selection and endpoint parsing remain unchanged; the newly added provider (llamaswap) follows the same pattern as existing providers, using llamaswap.WithToken and llamaswap.WithBaseURL without constructing any unsafe command strings or URLs from untrusted data.
  • Documentation updates merely illustrate alternate backends and do not affect runtime behavior.

All modifications were verified by reading the affected files (cmd/gadfly/model.go, workflow YAML, and README) to confirm that secret handling and input validation remain consistent with existing, already‑reviewed providers. No new authentication/authorization gaps, injection vectors, or secret leakage issues are introduced.

🎯 Correctness — Minor issues

Verdict: Minor issues

  • Incorrect error message for unknown provider (cmd/gadfly/model.go:107 & cmd/gadfly/model.go:255).
    The default error strings list supported providers but omit "openai-compatible" even though that provider is accepted in the switch statements above. This can mislead users when they receive an error for a valid provider.
    Fix: Update the error messages to include "openai-compatible" (e.g., "use openai/openai-compatible/ollama/llamaswap/foreman/anthropic/google").

🧯 Error handling & edge cases — No material issues found

Verdict: No material issues found

  • The new llamaswap handling mirrors existing provider logic and correctly propagates errors from endpointProvider and majordomo.Parse. No ignored or swallowed errors were introduced, and all error paths return appropriate messages.
  • No new edge‑case or nil‑dereference scenarios were added; the code validates required environment variables (GADFLY_MODEL) before use and handles empty API keys gracefully.
  • No missing cleanup, defer, or panic‑inducing paths were found in the changes.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 50s


🪰 Gadfly review — qwen3-coder:480b-cloud (ollama-cloud)

Verdict: Minor issues — 3 reviewers: security, correctness, error-handling

🔒 Security — Minor issues

VERDICT: Minor issues

  • Untested llamaswap provider: The new llamaswap provider is added but marked as untested in the README. This could pose a security risk if it's later used in production without proper validation. Confirmed by grep("llamaswap", "README.md", 5).
  • Proxy backend warning lacks enforcement: The README warns that alternate backends (e.g., proxies for Claude Code) are untested and may not support tool use. However, there's no code-level safeguard preventing their use, which could lead to unexpected behavior or data exposure if misconfigured. Verified by read_file("examples/claude-code.yml", 63, 7) and read_file("README.md", 119, 8).

🎯 Correctness — No material issues found

VERDICT: No material issues found

After reviewing the changes through the lens of correctness, I found no material issues. The PR correctly implements:

  1. Integration of the new llamaswap provider in the model resolution logic with proper option handling
  2. Wiring of the Claude Code engine as a competitor in the adversarial review workflow
  3. Updates to documentation and configuration to reflect the new providers and engine
  4. Proper concurrency settings for the new claude-code provider lane

All code paths appear logically sound and consistent with the existing patterns. The changes correctly handle API key configuration, base URL construction, and error messaging for the new provider. The Claude Code integration follows the established engine abstraction without introducing correctness issues.

🧯 Error handling & edge cases — No material issues found

No material issues found.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 35s


🪰 Gadfly review — gemma4:cloud (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

No material issues found

🎯 Correctness — No material issues found

No material issues found

🧯 Error handling & edge cases — No material issues found

No material issues found

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 23s

steve added 1 commit 2026-06-27 21:53:18 +00:00
fix: fold in PR #3 review findings (llamaswap test + error message)
Build & push image / build-and-push (pull_request) Successful in 33s
540645c2ae
The 11-model swarm (incl. the new claude-code/sonnet competitor)
reviewed PR #3. 52 findings graded via the MCP — almost all accurate
clean-verifications of a small, clean PR (telemetry over-extracts them).
Two were real and are folded in:

- model_test.go: add llamaswap with/without-token cases to
  TestEndpointProvider. Caught by claude-code/sonnet itself, citing
  CLAUDE.md's "add a test when you add logic" rule — the standout
  finding and a strong validation of the Phase-1 engine.
- model.go: the default provider error messages omitted
  "openai-compatible" though the switch accepts it. Caught by gpt-oss.
  Added it to both messages.

gofmt clean, go vet quiet, go test -race green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
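
The with-token and without-token cases mentioned in the commit above can be illustrated with a table-driven check over the `"llamaswap|url[|key]"` endpoint format from the PR description. This is a self-contained sketch only: `endpointSpec` and `parseEndpointSpec` are hypothetical stand-ins, not gadfly's `endpointProvider`, and the URL and key values are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// endpointSpec holds the parsed pieces of "provider|url[|key]" (hypothetical type).
type endpointSpec struct {
	Provider string
	BaseURL  string
	Key      string // empty when the optional key segment is omitted
}

// parseEndpointSpec splits a spec on "|" and validates the required parts.
func parseEndpointSpec(spec string) (endpointSpec, error) {
	parts := strings.Split(spec, "|")
	if len(parts) < 2 || len(parts) > 3 {
		return endpointSpec{}, fmt.Errorf("endpoint %q: want provider|url[|key]", spec)
	}
	ep := endpointSpec{
		Provider: strings.TrimSpace(parts[0]),
		BaseURL:  strings.TrimSpace(parts[1]),
	}
	if ep.Provider == "" || ep.BaseURL == "" {
		return endpointSpec{}, fmt.Errorf("endpoint %q: provider and url are required", spec)
	}
	if len(parts) == 3 {
		ep.Key = strings.TrimSpace(parts[2])
	}
	return ep, nil
}

func main() {
	cases := []struct {
		name    string
		spec    string
		wantKey string
		wantErr bool
	}{
		{"llamaswap with token", "llamaswap|http://localhost:8080|secret", "secret", false},
		{"llamaswap without token", "llamaswap|http://localhost:8080", "", false},
		{"missing url", "llamaswap", "", true},
	}
	for _, tc := range cases {
		ep, err := parseEndpointSpec(tc.spec)
		ok := (err != nil) == tc.wantErr && ep.Key == tc.wantKey
		fmt.Printf("%s: pass=%v\n", tc.name, ok)
	}
}
```

In a real test file the same table would live in a `func TestXxx(t *testing.T)` with `t.Run` per case; the loop here just keeps the sketch runnable as a standalone program.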
steve merged commit 82f7ef78d5 into main 2026-06-27 21:53:42 +00:00
steve deleted branch feat/cc-backends 2026-06-27 21:53:42 +00:00

Reference: steve/gadfly#3