feat: claude-code backends + llamaswap provider + dogfood the CC engine #3
Phase 2 of the gadfly-games build.
majordomo bump → new `llamaswap` provider
- `ollama`: `GADFLY_PROVIDER=llamaswap` + `GADFLY_BASE_URL`, and `GADFLY_ENDPOINT_<NAME>="llamaswap|url[|key]"`. Wired but untested here (no llama-swap instance).
Dogfood the Phase-1 claude-code engine
- `claude-code/sonnet` to gadfly's own review swarm as a competitor alongside the 9 cloud models + M5 — its own provider lane (`claude-code=1`), parallel with the cloud/m5 lanes.
- `:sha-86f12c1` (the first image that bundles the `claude` CLI) and mapped `CLAUDE_CODE_OAUTH_TOKEN`.
- `claude-code/sonnet` comment — the first real end-to-end exercise of the engine.
Backend docs (example-only)
- `ANTHROPIC_BASE_URL` at an Anthropic-API-compatible proxy (claude-code-router / LiteLLM). The subprocess env already forwards `ANTHROPIC_*`/`CLAUDE_*`, so no code change is needed — but tool-use support varies by proxy, so it's documented as example only, not wired/tested (per the plan).
- `llamaswap` row in the providers table + Claude Code "alternate backends" note; `examples/claude-code.yml` shows the proxy env.
Tests
`gofmt` clean · `go vet` quiet · `go build` ok · `go test -race ./...` green · workflow YAML validated.
🤖 Generated with Claude Code
🪰 Gadfly — live review status
11/11 reviewers finished · updated 2026-06-27 21:45:43Z
- `claude-code/sonnet` · claude-code — ✅ done
- `deepseek-v4-pro:cloud` · ollama-cloud — ✅ done
- `gemma4:cloud` · ollama-cloud — ✅ done
- `glm-5.1:cloud` · ollama-cloud — ✅ done
- `glm-5.2:cloud` · ollama-cloud — ✅ done
- `gpt-oss:120b-cloud` · ollama-cloud — ✅ done
- `kimi-k2.7-code:cloud` · ollama-cloud — ✅ done
- `m5/qwen3.6:35b-mlx` · m5 — ✅ done
- `minimax-m3:cloud` · ollama-cloud — ✅ done
- `nemotron-3-super:cloud` · ollama-cloud — ✅ done
- `qwen3-coder:480b-cloud` · ollama-cloud — ✅ done
Live status board. Findings are posted in each model's own comment. Advisory only — does not block merge.
🪰 Gadfly review — `m5/qwen3.6:35b-mlx` (m5)
Verdict: Minor issues — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
Security Review
Verdict
No material issues found.
Findings
No security issues identified in this change. Specifically:
- `llamaswap` provider wiring is sound: The new `case "llamaswap"` blocks in `resolveModel()` (`cmd/gadfly/model.go:83-88`) and `endpointProvider()` (`cmd/gadfly/model.go:225-230`) mirror the existing `ollama`/`openai` pattern exactly — `WithBaseURL(baseURL)` + conditional `WithToken(apiKey)`. No new auth bypass or input-leak path is introduced; these are standard provider constructs that consume env-derived strings for HTTP calls. Error messages in `default:` cases were correctly updated to mention `llamaswap` (lines 107, 255).
- Claude Code subprocess isolation remains intact: The `claudeEnv()` whitelist filter (`engine.go:184-203`) continues to drop runner secrets (`GITEA_TOKEN`, `OLLAMA_API_KEY`, `GADFLY_API_KEY`, `GADFLY_FINDINGS_TOKEN`). The `CLAUDE_CODE_OAUTH_TOKEN` added by the new workflow step is kept by the `CLAUDE_*` prefix match (correct, since that token is consumed by the `claude` CLI). No new secret injection vector was introduced.
- No injection paths in model specs: The claude engine receives `e.bin`, `e.model`, `e.permissionMode`, `e.allowedTools` — all derived from env vars or fixed strings. No user-controlled PR body/branch/ref flows into the subprocess argv. The `args()` method constructs a clean, parameterized CLI invocation.
- Workflow secrets mapping is explicit: `CLAUDE_CODE_OAUTH_TOKEN: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}` follows Gitea Actions best practices (no wildcard secret exposure). The existing `actor_allowed` gate and job-level `if:` guard on `issue_comment` remain unchanged.
All changes are additive/config-only with no new auth boundaries or data flows introduced.
🎯 Correctness — No material issues found
Correctness Review
VERDICT: No material issues found.
The diff adds two features with no correctness problems:
- `llamaswap` provider wiring — `cmd/gadfly/model.go` correctly adds the `llamaswap` case in both `resolveModel()` (line 83-88) and `endpointProvider()` (line 225-230). The call pattern (`WithBaseURL`, `WithToken`/`WithName`, then `New().Model(model)` or `New()`) follows the exact same structure as the existing `ollama`, `openai`, `anthropic`, and `google` cases. Error messages at lines 107 and 255 are updated to list `llamaswap` in both places consistently.
- Claude Code dogfooding — The workflow YAML updates (`sha-86f12c1` image pin, `CLAUDE_CODE_OAUTH_TOKEN` secret mapping, `claude-code/sonnet` model addition, `claude-code=1` concurrency) are additive and consistent with the existing setup. The `claude-code/sonnet` model spec follows the established `/` naming convention that `isClaudeCodeSpec()` (`engine.go:63-65`) already recognizes correctly as `strings.HasPrefix(m, "claude-code/")`.
All imports, function calls, and error messages are syntactically consistent with the existing codebase patterns. The go.mod bump is an update to a newer commit of the same module; no new direct dependencies require local adjustment since `llamaswap` lives in the `majordomo` module.
🧯 Error handling & edge cases — Minor issues
⚠️ Minor issues found
1. No unit tests for the new `llamaswap` provider paths
Where: `cmd/gadfly/model_test.go` (lines 5–42), covering `endpointProvider`; `cmd/gadfly/model.go` lines 83–88 and 225–230 for the actual code.
Issue: The test file tests `ollama`, `openai`, and `foreman` endpoints but has zero test cases for the new `llamaswap` provider in either `endpointProvider` or `resolveModel`. This means:
- `llamaswap.WithBaseURL(baseURL)` call is never exercised under test.
- `apiKey != ""` (the `append` branch) is never hit for llamaswap.
- `llamaswap|...` string, the unit tests won't catch it.
While the README explicitly calls out "wired, untested", adding at least one smoke-test case in `model_test.go` (e.g. a happy-path `llamaswap|http://...` and an empty-URL rejection) would be cheap and would surface edge cases early.
2. Silent errors in `registerEnvProviders` are logged but never surfaced as fatal
Where: `cmd/gadfly/model.go` lines 47–48 (calling site) and 175–198 (`registerEnvProviders`).
Issue: Any malformed `GADFLY_ENDPOINT_*` or `GADFLY_ALIAS_*` env var produces an error that is appended to `errs`, but at the call site the errors are never returned or checked further — just printed to stderr. If `llamaswap` endpoint registration fails (e.g. bad URL format for the new provider), the caller sees only a warning and the binary proceeds, potentially using default models without realizing a configured endpoint is dead.
This is consistent with the "best-effort" comment in the code, but it means an operator misconfiguring `llamaswap` could silently get degraded behavior. Worth considering whether `registerEnvProviders` should also return a combined error (or exit fatally when `errs` is non-empty) when critical endpoints are missing.
3. Empty provider name edge case in `endpointProvider`
Where: `cmd/gadfly/model.go` line 204–215.
Issue: If the raw string is `GADFLY_ENDPOINT_X="|http://x"` (empty provider), `provider` becomes `""` after `strings.TrimSpace(parts[0])`, which falls through to `default`. This correctly rejects it via the error at line 255, but the error message reads `"unknown provider \"\" (use ...)"` — a confusing empty-string error that provides no guidance to the user. Minor UX issue, not a crash risk.
Verdict
Minor issues. The llamaswap addition is functionally complete and error-handling is structurally sound (the code validates inputs and returns errors). The gaps are test coverage (expected per design) and the silent-error drop for malformed endpoints (a known pattern in this codebase, but worth flagging when adding new providers to ensure operators aren't surprised).
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 7m 8s
🪰 Gadfly review — `glm-5.2:cloud` (ollama-cloud)
Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
I've reviewed the change through the security lens, tracing the new secret mapping and the claude-code subprocess boundary. Verdict below.
Verdict: No material issues found
- `claude` subprocess is sound. `claudeEnv()` (`cmd/gadfly/engine.go:184-204`) is an allowlist that keeps only `PATH`/`HOME`/`USER`/..., and `LC_*`/`CLAUDE_*`/`ANTHROPIC_*`/`GADFLY_CLAUDE_*`/`NODE_*`/`XDG_*` prefixed vars — so the newly mapped `CLAUDE_CODE_OAUTH_TOKEN` (prefix `CLAUDE_`) reaches the CLI, while `GITEA_TOKEN`, `GADFLY_API_KEY`, `GADFLY_FINDINGS_TOKEN`, `OLLAMA_API_KEY` are dropped. `TestClaudeEnvFilters` (`cmd/gadfly/engine_test.go:120-150`) asserts both halves. No regression from this PR.
- `claude` CLI is invoked via `exec.CommandContext` with discrete argv elements (`engine.go:94-106,116`), never a shell. The `claude-code/<model>` spec's post-slash segment becomes the `--model` value as a single argv slot (`engine.go:73-87`), so a crafted `GADFLY_MODEL` can't inject extra flags. `GADFLY_CLAUDE_EXTRA_ARGS` is `strings.Fields`-split and appended verbatim, but it's operator-controlled env, not PR-derived input.
- `llamaswap` wiring. `model.go:83-88` and `225-230` mirror the existing `ollama`/`openai`/`foreman` paths: `baseURL` is operator-supplied (`GADFLY_BASE_URL`/`GADFLY_ENDPOINT_*`), used verbatim — the same posture as every other provider in this file. No untrusted/PR-derived URL is introduced. `WithToken(key)` applies a bearer; empty key is simply omitted.
- `newClaudeCodeEngine` defaults `permissionMode` to `"plan"` (`engine.go:84`), so the dogfood `claude-code/sonnet` lane — which does not set `GADFLY_CLAUDE_PERMISSION_MODE` in `.gitea/workflows/adversarial-review.yml` — runs the CLI in plan (read-only) mode over the checked-out PR tree. No edit capability is granted by this change.
- (`examples/claude-code.yml:67-69`, `README.md:121`) is commented out / example-only. It documents pointing `ANTHROPIC_BASE_URL` at a proxy; that's a consumer self-configured value (and would route the consumer's own `ANTHROPIC_AUTH_TOKEN` to their chosen proxy), not a gadget this repo exposes to untrusted input. Not a finding.
One non-blocking observation (outside my lens, noted for awareness): if the `claude` CLI ever emits the OAuth token in an auth-error message, `truncateForErr` caps it at 800 bytes but does not redact secret-looking strings, so it could be surfaced in the posted PR comment via `run.sh`'s error path. This is pre-existing for `ANTHROPIC_API_KEY` as well, not introduced by this PR, and I could not confirm any CLI path that actually echoes the token.
🎯 Correctness — No material issues found
VERDICT: No material issues found
I verified the change through the correctness lens — focusing on the new `llamaswap` provider wiring and the `claude-code` lane wiring in the workflow — and found no logic bugs.
Verified by reading the code:
- `cmd/gadfly/model.go` llamaswap cases — The two new `case "llamaswap"` blocks (in `resolveModel` at `model.go:83-88` and `endpointProvider` at `model.go:225-230`) mirror the adjacent `ollama` cases exactly: same option pattern (`WithBaseURL`/`WithToken`, plus `WithName` in the endpoint form), same error-message format. The `resolveModel` path correctly omits `WithName` (matching ollama there too, since that path has no endpoint name). The default-case error strings were updated in both functions to list `llamaswap`. Symmetric and correct.
- `claude-code` lane routing — `entrypoint.sh:159` `provider_of` splits `claude-code/sonnet` on `/` → `claude-code`, which matches the `claude-code=1` entry in `GADFLY_PROVIDER_CONCURRENCY` (workflow line 82). So it lands in its own lane with cap 1, exactly as the PR description claims. The `m5/qwen3.6:35b-mlx` and bare `:cloud` ids route the same way they did before (the `:cloud` suffix is part of the model id, not the provider segment; bare ids fall back to the `ollama-cloud` default). No regression in the existing fleet routing.
- `CLAUDE_CODE_OAUTH_TOKEN` is mapped as step env in the workflow and inherited by the entrypoint. `cmd/gadfly/engine.go:184-196` `claudeEnv()` keeps any `CLAUDE_*`-prefixed env var, so the token reaches the `claude` subprocess. The claude-code engine is selected via `isClaudeCodeSpec` in `main.go:156-159` before `resolveModel` is called, so a `claude-code/sonnet` spec never falls into the majordomo provider switch. Correct.
- `GADFLY_PROVIDER_LENS_CONCURRENCY` only overrides `ollama-cloud=3`; the `claude-code` and `m5` lanes fall back to the default lens concurrency (1). That's a conservative, safe choice (serial lenses), not a bug. The "9 concurrent ollama-cloud queries" comment still holds (3 models × 3 lenses).
Could not confirm (external dependency, not in this repo): the existence/signatures of `llamaswap.New`, `llamaswap.WithBaseURL`, `llamaswap.WithToken`, `llamaswap.WithName` in the bumped majordomo module. The integration code is symmetric with the proven `ollama` option pattern, and the PR reports `go build`/`go test` green, so I treat this as unverified-but-plausible rather than a finding.
🧯 Error handling & edge cases — No material issues found
VERDICT: No material issues found
Reviewed the diff through the 🧯 Error handling & edge cases lens, focusing on the new `llamaswap` provider wiring in `cmd/gadfly/model.go` and the `claude-code/sonnet` dogfood wiring in the workflow/scripts.
- `llamaswap` branch in `resolveModel` (`model.go:83-88`) and `endpointProvider` (`model.go:225-230`): Both mirror the existing `ollama` case exactly — empty-key guard (`if apiKey != ""` / `if key != ""`), error returned on missing base URL via the shared `len(parts) < 2` / `baseURL == ""` checks in `endpointProvider`, and the `default` cases return descriptive errors. No swallowed errors, no nil returns on the happy path. Verified by reading the full `model.go`.
- `endpointProvider` parsing (`model.go:203-256`): The `SplitN(raw, "|", 3)` + `len(parts) < 2` guard and the `baseURL == ""` check correctly reject malformed/empty entries; `registerEnvProviders` collects per-entry errors and continues (best-effort), logging them in `resolveModel`. No new edge cases introduced by adding the `llamaswap` case — it reuses the same parse/validation flow.
- `claude-code` engine error handling (`engine.go:115-177`): Not changed by this diff, but it's the engine now being dogfooded. I verified the existing handling is robust for the unhappy paths this PR exposes it to: `ctx.Err()` check before parsing partial output, `res.IsError` surfaced, empty-result and no-parseable-output cases return explicit errors, `claudeEnv` drops secrets, and `run.sh` (156-181) catches non-zero exit / empty review and posts a ⚠️ comment rather than failing the lane. The workflow's `CLAUDE_CODE_OAUTH_TOKEN` mapping is just an env passthrough — no error path to miss.
- Workflow concurrency wiring (`.gitea/workflows/adversarial-review.yml`): Adding `claude-code/sonnet` to `GADFLY_MODELS` and `claude-code=1` to `GADFLY_PROVIDER_CONCURRENCY` is handled by `entrypoint.sh`'s provider-lane logic (`provider_of`, `run_lane`, cap validation with `[ "$cap" -ge 1 ] 2>/dev/null || cap=1` fallback). No new unhandled edge case — an unknown/missing lane cap safely defaults to 1.
No ignored errors, missing cleanup, panic-prone input, or unhandled boundary conditions were introduced by this diff within the error-handling lens.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 7s
🪰 Gadfly review — `glm-5.1:cloud` (ollama-cloud)
Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
VERDICT: No material issues found
After reviewing the diff through the security lens and verifying the key code paths:
- `claudeEnv()` (`engine.go:184–203`) — The allowlist filter that builds the `claude` subprocess environment is sound and well-tested (`engine_test.go:120–150`). It explicitly drops `GITEA_TOKEN`, `GADFLY_FINDINGS_TOKEN`, `OLLAMA_API_KEY`, `GADFLY_API_KEY`, etc., while keeping only `CLAUDE_*`, `ANTHROPIC_*`, `GADFLY_CLAUDE_*`, and standard system vars. This defense-in-depth is correctly implemented.
- `CLAUDE_CODE_OAUTH_TOKEN` in workflow — Properly injected via `${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}` and only reaches the `claude` subprocess through the `CLAUDE_` prefix filter. No leakage to untrusted code paths.
- Subprocess argument construction — The `claude` CLI is invoked via `exec.Command` (not a shell), so the PR-sourced system prompt and task content cannot cause argument injection. Flags like `-p` and `--append-system-prompt` receive their values as separate slice elements, immune to shell metacharacter attacks.
- `--permission-mode plan` (default) — Read-only mode for the Claude Code engine. Safe default.
- Workflow actor gate — `issue_comment` triggers are restricted to `steve`, `fizi`, `dazed`; no arbitrary-user secret-bearing runs.
- Alternate backend docs — The `ANTHROPIC_BASE_URL`/`ANTHROPIC_AUTH_TOKEN` example in `examples/claude-code.yml` is commented-out and documented as "EXAMPLE ONLY, not validated." The `ANTHROPIC_*` passthrough in `claudeEnv()` is by design for this use case; no unsanctioned redirect risk since it requires explicit user configuration.
- Process group cleanup — `Setpgid: true` + `syscall.Kill(-pid, SIGKILL)` on context cancel properly reaps the CLI and its Node children. No orphan risk.
- `llamaswap` provider — Follows the identical pattern to `ollama`/`openai`/`anthropic` with `WithBaseURL` + optional `WithToken`. No new attack surface.
No authn/authz gaps, injection vectors, secret leaks, or unsafe input handling found.
🎯 Correctness — No material issues found
VERDICT: No material issues found
After verifying the key correctness concerns:
- `llamaswap` provider wiring: The new `llamaswap` case in both `resolveModel()` (lines 83–88) and `endpointProvider()` (lines 225–230) follows the identical pattern as `ollama` — `WithBaseURL`/`WithToken`/`WithName` in the endpoint path, `WithBaseURL`/`WithToken` (no `WithName`) in the `GADFLY_BASE_URL` path. Import is present. Error messages are updated. All consistent and correct.
- `claude-code/sonnet` workflow dispatch: `provider_of()` in `entrypoint.sh` (line 159) extracts `claude-code` from `claude-code/sonnet` via `${1%%/*}`, matching the `claude-code=1` concurrency lane. The Go binary's `isClaudeCodeSpec()` (`engine.go:64–65`) correctly matches `strings.HasPrefix(m, "claude-code/")` and extracts `sonnet` as the model. The `GADFLY_PROVIDER_LENS_CONCURRENCY` only specifies `ollama-cloud=3`, which is correct because the claude-code engine manages its own loop (`engine.go:30` comment confirms this).
- Provider concurrency partitioning: All 11 models in `GADFLY_MODELS` map to exactly one lane each: 9 bare ids → `ollama-cloud`, `m5/qwen3.6:35b-mlx` → `m5`, `claude-code/sonnet` → `claude-code`. These three lane keys (`ollama-cloud`, `m5`, `claude-code`) all appear in `GADFLY_PROVIDER_CONCURRENCY` with explicit caps. No orphaned lane, no unkeyed provider.
- Dependency bump: `majordomo` bumped from `0147a79d187b` to `b2487a1a370c`, bringing in the `llamaswap` package. The `go.sum` changes are consistent with the `go.mod` update (transitive deps updated, old ones pruned). No suspicious additions.
- Default-provider fallback: When `GADFLY_BASE_URL` is empty (the common case for `ollama-cloud` and `claude-code` models), resolution goes through `majordomo.Parse(buildSpec(provider, model))`, which correctly handles all registered providers including `claude-code`. No path incorrectly requires `GADFLY_BASE_URL`.
No logic bugs, semantic errors, or incorrect behavior found in this change.
🧯 Error handling & edge cases — No material issues found
VERDICT: No material issues found
I reviewed the diff through the error-handling & edge-case lens, verifying the new `llamaswap` provider wiring, the claude-code engine integration in the workflow, and the `go.mod` bump. Key checks:
- `llamaswap` in `resolveModel()` (line 83-88): Identical pattern to `ollama` — `baseURL` is guaranteed non-empty by the `if baseURL == ""` guard at line 65, so `llamaswap.WithBaseURL` always receives a real value. Error from `.Model()` propagates; nothing swallowed.
- `llamaswap` in `endpointProvider()` (line 225-230): Empty `baseURL` is explicitly rejected with an error at line 214-216 before the switch. Identical option-construction pattern to `ollama`. No nil-able paths.
- `claudeCodeEngine.runPass()` (`engine.go:115-177`): Context cancellation is checked before parsing stdout (`ctx.Err()`), process group cleanup uses `Setpgid` + negative-PID kill with a nil-process guard, empty/missing result returns an error rather than leaking raw JSON, and UTF-8 truncation is rune-safe. Missing CLI binary surfaces as a normal pass error (advisory). All unhappy paths are handled.
- `claudeEnv()` (`engine.go:184-204`): Correctly passes `CLAUDE_CODE_OAUTH_TOKEN` through (matches the `CLAUDE_` prefix) while stripping unrelated secrets. Verified by reading the test at `engine_test.go:120-149`.
- Workflow YAML: If `CLAUDE_CODE_OAUTH_TOKEN` is unset/empty, the claude-code engine fails auth, which surfaces as an advisory error in the review comment — never fatal. Acceptable degradation.
- `provider_of` / lane routing: `claude-code/sonnet` correctly extracts provider `claude-code` and routes to its own concurrency lane. Verified in `entrypoint.sh:159` and the Go binary's `modelProvider()`.
No swallowed errors, missing defers, nil panics, or unhandled edge cases in the changed code.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 17s
🪰 Gadfly review — `claude-code/sonnet` (claude-code)
Verdict: Minor issues — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
I now have enough to complete the review. The `claudeEnv()` isolation logic is fully verified.
No material issues found
Reviewed through the security lens. Verified findings below.
Verified clean:
- `claudeEnv()` secret isolation (`engine.go:184-204`): The whitelist correctly excludes `GITEA_TOKEN`, `GADFLY_FINDINGS_TOKEN`, and provider keys (e.g. `OLLAMA_CLOUD_API_KEY`) from the claude subprocess. Only `CLAUDE_*`, `ANTHROPIC_*`, `GADFLY_CLAUDE_*`, `NODE_*`, `XDG_*`, and the named standard vars reach the CLI. Verified by reading the full function; the new `CLAUDE_CODE_OAUTH_TOKEN` secret is intentionally forwarded via the `CLAUDE_` prefix.
- `llamaswap` provider wiring (`model.go:83-88,222-229`): Both the `resolveModel` and `endpointProvider` paths mirror the existing `ollama` case exactly — URL and API key come from operator-controlled env vars, passed to the library, never interpolated into a shell command or used in URL construction within gadfly's own code. No injection surface.
- `GADFLY_CLAUDE_EXTRA_ARGS` passthrough (`engine.go:86,105`): Uses `strings.Fields` (not a shell) and passes args directly to `exec.CommandContext`. No shell injection risk. This is operator-controlled trusted input.
- Alternate backend proxy docs (`examples/claude-code.yml:63-69`, `README.md`): The example keeps `ANTHROPIC_BASE_URL`/`ANTHROPIC_AUTH_TOKEN` commented out and marks them "EXAMPLE ONLY." The `claudeEnv()` function will forward both the redirect URL and any `ANTHROPIC_*` credential to the subprocess as designed — standard proxy-configuration behavior, well-documented as the operator's responsibility.
- `CLAUDE_CODE_OAUTH_TOKEN` in CI (`.gitea/workflows/adversarial-review.yml:56`): Correctly scoped as a Gitea secret; forwarded only to the claude subprocess, not to majordomo, provider API calls, or the findings endpoint.
🎯 Correctness — No material issues found
I've now read `model.go` in full, checked `engine.go`, `main.go`, and the adversarial-review workflow. Here is my verdict.
No material issues found
The correctness-relevant changes are:
- `llamaswap` in `resolveModel()` (`model.go:82–87`): mirrors the `ollama` case exactly — `WithBaseURL` + optional `WithToken`, no `WithName` (correct: `WithName` is only needed for registry registration, not for direct `.Model()` calls). Verified by reading the full function.
- `llamaswap` in `endpointProvider()` (`model.go:225–230`): mirrors the `ollama` case exactly — `WithName(name)`, `WithBaseURL`, optional `WithToken`. Correct; named endpoints need `WithName` so `majordomo.Parse("name/model")` can find them. Verified.
- `claude-code/sonnet` routing (`adversarial-review.yml:79,82`): `isClaudeCodeSpec("claude-code/sonnet")` → `true` (verified at `engine.go:63–65`); `newClaudeCodeEngine` extracts `"sonnet"` after the slash → `--model sonnet` (verified at `engine.go:76–78`). `entrypoint.sh:159`'s `provider_of` extracts `"claude-code"` as the lane key, which matches `GADFLY_PROVIDER_CONCURRENCY: "claude-code=1"`. All routing is consistent.
- Model count in `GADFLY_MODELS`: 9 `:cloud` entries + `m5/qwen3.6:35b-mlx` + `claude-code/sonnet` = 11 total. Matches the comment "9 cloud + M5 Mac + claude-code/sonnet". Verified by counting.
- Concurrency math (`adversarial-review.yml:82–84`): `ollama-cloud=3` models × `GADFLY_PROVIDER_LENS_CONCURRENCY: ollama-cloud=3` lenses = 9 concurrent cloud queries. Comment at line 83 says "under the 10 budget" — arithmetic is correct.
- Error message updates in `resolveModel()` default and `endpointProvider()` default both now include `llamaswap` in the provider list — accurate.
🧯 Error handling & edge cases — Minor issues
The finding is confirmed. The `llamaswap` case at `model.go:225-230` exists, and `model_test.go` has no subtest covering it — only `ollama`, `openai`, and `foreman` (×2) are exercised, plus the bad-input cases.
Minor issues
`cmd/gadfly/model_test.go:35` — `llamaswap` provider arm in `endpointProvider` has no test case
`TestEndpointProvider` exercises `ollama`, `openai`-compatible, and `foreman` (with and without token), plus bad-input paths. The `llamaswap` arm added at `cmd/gadfly/model.go:225-230` is not covered. The project's own maintenance rule (CLAUDE.md: "Add a test when you add logic") calls this out explicitly, and the file has a ready-to-copy subtest shape.
Concrete risk: if `llamaswap.WithName`, `llamaswap.WithBaseURL`, or `llamaswap.WithToken` have a different signature than the `ollama` equivalents (plausible for a newly-added upstream package), the mismatch is invisible at test time and only surfaces during a live run.
Suggested fix — add to `TestEndpointProvider`:
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 7m 16s
🪰 Gadfly review — `minimax-m3:cloud` (ollama-cloud)
Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
VERDICT: No material issues found
Findings — Security lens only:
- `claude-code/sonnet` lane: Verified `.gitea/workflows/adversarial-review.yml:37-42` — the job-level `if:` requires the `issue_comment` actor to be one of `steve|fizi|dazed` (same allow-list as `GADFLY_ALLOWED_USERS`), so an untrusted commenter cannot trigger a secret-bearing run that would invoke the Claude Code subscription. `pull_request` and `workflow_dispatch` are already trusted by definition. No regression from adding the `claude-code` lane.
- `CLAUDE_CODE_OAUTH_TOKEN` (`adversarial-review.yml:56`): Passed via `${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}` and the workflow runs `pull_request`/`workflow_dispatch`/trusted-comment events on gadfly's own repo. In Gitea, secrets are not exposed to PRs from forks, and the job itself never echoes the token — `cmd/gadfly/engine.go` forwards it only into the `claude` subprocess via env. The engine test (`engine_test.go:125-139`) confirms the env-allowlist is explicit, so `GITEA_TOKEN`/provider keys aren't leaked into the CLI. No leakage introduced.
- `llamaswap` provider (`cmd/gadfly/model.go:83-88, 225-230`): Mirrors the existing `ollama` case — passes `baseURL` and `apiKey` straight through to the constructor with no string concatenation into shell/URL, no template rendering, no logging of the key. No injection or SSRF surface beyond what `ollama` already had; the verbatim-base-URL behavior is the same plaintext-`http://`-friendly pattern already established for `ollama` and `foreman` and is not a regression.
- (`examples/claude-code.yml:63-69`): Example-only, all lines commented out, and the docs explicitly warn that tool-use support depends on the proxy and that the env forwards `ANTHROPIC_*` to the subprocess. The workflow file (not the engine) is the only place this URL could be set; not wired into code. No issue.
I did not find any new injection (SQL/command/path/template), SSRF, unsafe-deserialization, or insecure-default surface introduced by this PR.
🎯 Correctness — No material issues found
VERDICT: No material issues found
I traced the change through the correctness lens and could not substantiate a real bug. The diffs in scope:
- `cmd/gadfly/model.go` — the new `llamaswap` case in `resolveModel` (lines 83–88) and `endpointProvider` (lines 225–230) is a verbatim mirror of the existing `ollama` case: same `Option{}`/`WithBaseURL`/`WithToken`/`WithName`/`New` pattern, same `fmt.Errorf` text updated consistently in both error messages (`use ollama/llamaswap/foreman/openai/anthropic/google`). I read model.go end-to-end; the import (`provider/llamaswap`) matches the bump in go.mod, and the symbols used (`New`, `Option`, `WithBaseURL`, `WithToken`, `WithName`) are the conventional names for majordomo providers (same as ollama/openai/anthropic/google above/below).
- `.gitea/workflows/adversarial-review.yml` — adds `claude-code/sonnet` to `GADFLY_MODELS` and `claude-code=1` to `GADFLY_PROVIDER_CONCURRENCY`, and maps `CLAUDE_CODE_OAUTH_TOKEN`. Concurrency-key resolution is consistent: `modelProvider()` (`cmd/gadfly/model.go:142–151`) returns the segment before the first `/` in `GADFLY_MODEL`, so `claude-code/sonnet` → `"claude-code"`, which matches the new concurrency key. The pre-existing `claude-code` engine (added in Phase 1, exercised in `examples/claude-code.yml` with the exact same `claude-code/sonnet` spec) means the entrypoint already routes this prefix — no behavior change required on the engine side, only a new lane entry.
- `README.md` / `examples/claude-code.yml` — doc-only additions; the new "10 concurrent ollama-cloud queries" framing in the workflow comment is unaffected because the "under the 10 budget" parenthetical refers to ollama-cloud's per-provider budget (3 × 3 = 9, still true), not total process budget.
- `go.mod` / `go.sum` — straightforward dep bump; no replacement of the project's own semantic constants.
I could not independently confirm the majordomo version `v0.0.0-20260627201401-b2487a1a370c` exposes the exact `llamaswap.New`/`Option`/`WithBaseURL`/`WithToken`/`WithName` symbol set — that's an external module check, and the PR description states this bump is what adds it. The `go build` reported in the PR footer is consistent with the import compiling.
Outside my lens: nothing. The cross-cutting lenses (security of the `CLAUDE_CODE_OAUTH_TOKEN` mapping, CI perf with one more lane, etc.) belong to other reviewers.
🧯 Error handling & edge cases — No material issues found
VERDICT: No material issues found
Lens: 🧯 Error handling & edge cases.
I verified the new code paths against the checked-out tree (`cmd/gadfly/model.go`, `cmd/gadfly/engine.go`, `cmd/gadfly/main.go`, `entrypoint.sh`, `scripts/run.sh`, the workflow, the README). Findings in my lane:
- `llamaswap` wiring (`cmd/gadfly/model.go:83-88`, `:225-230`): mirrors the `ollama` pattern exactly — empty/missing `GADFLY_BASE_URL` is handled by the earlier `if baseURL == ""` fallthrough to `majordomo.Parse`, and the endpoint-override path requires `baseURL` to be non-empty (`endpointProvider` returns an error on missing URL at line 214-216). Optional API key handled. No new nil/empty edge cases introduced.
- `claude-code` env scrubbing (`cmd/gadfly/engine.go:184-204`): `claudeEnv()` keeps `CLAUDE_*`/`ANTHROPIC_*`/`NODE_*`/`XDG_*`/`LC_*`/`PATH|HOME|USER|LOGNAME|TMPDIR|LANG|TERM|SHELL` and drops everything else. Missing `CLAUDE_CODE_OAUTH_TOKEN` surfaces as a subprocess auth failure → caught by `runPass`'s `IsError`/stderr fallback and reported as a per-model comment error rather than a hung job. Defense-in-depth matches the README claim.
- `runPass` cancel/error handling (`cmd/gadfly/engine.go:115-177`): `ctx.Err()` is checked AFTER `cmd.Run()`, so a timed-out lens never returns partial bytes as a "successful" review. `cmd.Cancel` SIGKILLs the whole process group, preventing orphan `claude`/Node children. Empty result, parse failure, exit-error, and stdout-text fallback all handled distinctly.
- `GADFLY_PROVIDER_LENS_CONCURRENCY: "ollama-cloud=3"` × 3 models = 9 concurrent ollama-cloud calls (under the documented 10-budget). `m5` and `claude-code` fall through to default `lensConcurrency=1` (`cmd/gadfly/main.go:282-286`), matching the workflow comment "the Mac one at a time; claude-code one at a time". No oversubscription risk.
- (`entrypoint.sh:189-191`): unchanged from before, still correct — a `claude-code` model that crashes at subprocess spawn leaves `.started == 0` and gets marked `done:true` so the board can complete. No new failure mode here.
- (`claude-code/sonnet`, CLI default model, 3 sequential lenses) is well within that.
Nothing materially wrong in this lens. (The `MODEL` marker prefix is `gadfly-review:ollama:` even for `claude-code/...` because `run.sh` hardcodes `PROVIDER=ollama` at line 181 — confusing but pre-existing and not a bug; the marker still uniquely identifies the comment for upsert.)
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 53s
🪰 Gadfly review — kimi-k2.7-code:cloud (ollama-cloud)
Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
Verdict: No material issues found.
Through the security lens, the PR introduces the llamaswap provider as a symmetric twin of the existing ollama/foreman endpoint-override providers: it receives a base URL and optional bearer token and passes them directly to majordomo's llamaswap provider constructor. No new injection, SSRF, deserialization, or credential-exposure surface is added relative to the already-intentional configurable-backend design.
cmd/gadfly/model.go:83-88 and cmd/gadfly/model.go:225-230: the new llamaswap branch mirrors the existing ollama path and forwards GADFLY_BASE_URL/GADFLY_API_KEY (or the GADFLY_ENDPOINT_* URL and key) to the provider without any additional parsing that could introduce path/SQL/command injection.
.gitea/workflows/adversarial-review.yml:56: CLAUDE_CODE_OAUTH_TOKEN is correctly passed as a secret into the container; it is not logged or templated.
examples/claude-code.yml:63-69: the alternate-backend configuration is commented out and explicitly labeled as example-only/not validated, so it does not expose a default insecure path.
.gitea/workflows/adversarial-review.yml:37-42 is unchanged and still restricts issue_comment triggers to the allow-listed maintainers.
No security findings require action.
🎯 Correctness — No material issues found
Verdict: No material issues found
Through the correctness lens, the change is clean:
cmd/gadfly/model.go wires llamaswap symmetrically with the other endpoint-override providers: resolveModel() constructs it when GADFLY_PROVIDER=llamaswap + GADFLY_BASE_URL is set, and endpointProvider() constructs it for GADFLY_ENDPOINT_*="llamaswap|url[|key]". Both paths honor name, base URL, and optional token exactly like ollama. Error messages were updated to include llamaswap.
claude-code/sonnet is added to GADFLY_MODELS, its concurrency lane is claude-code=1, and CLAUDE_CODE_OAUTH_TOKEN is forwarded. The binary's isClaudeCodeSpec and modelProvider() correctly route that spec to the claude-code lane.
claudeEnv() (ANTHROPIC_*/CLAUDE_* are already passed through).
No logic bugs, incorrect constants/conversions, or domain-correctness issues were found after reading the relevant files.
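The GADFLY_ENDPOINT_* value format quoted above, "llamaswap|url[|key]", can be illustrated with a small parser. This is only a sketch under assumptions: the endpointSpec type and parseEndpointSpec function are hypothetical names, and gadfly's real endpointProvider() may split and validate the value differently.

```go
package main

import (
	"fmt"
	"strings"
)

// endpointSpec holds the pieces of a "provider|url[|key]" value.
// Hypothetical type for illustration only.
type endpointSpec struct {
	Provider string
	BaseURL  string
	Key      string // optional; empty when the third field is absent
}

// parseEndpointSpec splits a spec on "|" and rejects a missing provider
// or URL, mirroring the "error on missing URL" behaviour the reviews describe.
func parseEndpointSpec(raw string) (endpointSpec, error) {
	// SplitN with n=3 keeps any further "|" characters inside the key.
	parts := strings.SplitN(raw, "|", 3)
	if len(parts) < 2 || parts[0] == "" || parts[1] == "" {
		return endpointSpec{}, fmt.Errorf("endpoint spec %q: want provider|url[|key]", raw)
	}
	spec := endpointSpec{Provider: parts[0], BaseURL: parts[1]}
	if len(parts) == 3 {
		spec.Key = parts[2]
	}
	return spec, nil
}

func main() {
	// The URL and key below are placeholders, not values from the PR.
	spec, err := parseEndpointSpec("llamaswap|http://localhost:8080|example-key")
	fmt.Println(spec.Provider, spec.BaseURL, spec.Key != "", err)
}
```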
🧯 Error handling & edge cases — No material issues found
No material issues found.
From the error handling & edge cases lens, the diff is clean:
llamaswap provider branch in cmd/gadfly/model.go mirrors the existing ollama/openai branches exactly, including the empty baseURL guard already enforced by endpointProvider and the unchanged apiKey optional-token path. No swallowed errors or nil/empty edge cases are introduced.
claude-code engine path in cmd/gadfly/engine.go already handles cancellation/timeouts (ctx.Err() check), empty JSON results, parse failures, nonzero exits, and process-group cleanup. None of those guards are touched by this PR.
Verified by reading cmd/gadfly/model.go, cmd/gadfly/engine.go, cmd/gadfly/main.go, and their tests.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 50s
🪰 Gadfly review — deepseek-v4-pro:cloud (ollama-cloud)
Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
No material issues found
I traced the key security boundaries in this PR:
claudeEnv() in engine.go:184-204: The environment filter for the Claude Code subprocess correctly passes through CLAUDE_* (which covers CLAUDE_CODE_OAUTH_TOKEN) and ANTHROPIC_* (for the documented alternate-backend path), while blocking GITEA_TOKEN, OLLAMA_CLOUD_API_KEY, GADFLY_API_KEY, GADFLY_FINDINGS_TOKEN, and all other provider secrets. Verified by reading the full function and cross-referencing the prefix list against every secret env var set in the workflow YAML.
CLAUDE_CODE_OAUTH_TOKEN in the workflow YAML (adversarial-review.yml:56): Mapped from secrets.CLAUDE_CODE_OAUTH_TOKEN; Gitea Actions redacts secret values from logs. The token flows through entrypoint.sh → run.sh → the Go binary's environment → claudeEnv() → the claude CLI subprocess. No intermediate logging or echo. Verified by reading all three files end-to-end.
llamaswap provider wiring (model.go:83-88, 225-230): Follows the identical pattern as ollama, openai, anthropic, and google: base URL and token come from workflow-author-controlled env vars (GADFLY_BASE_URL, GADFLY_API_KEY, or GADFLY_ENDPOINT_*), not from untrusted input. No injection surface.
Claude Code subprocess invocation (engine.go:116): Uses exec.CommandContext (no shell), so the system prompt, task text, and PR content, which flow into the --append-system-prompt and -p flags, cannot inject commands regardless of content.
examples/claude-code.yml:68-69: The alternate-backend example correctly uses vars for the non-secret ANTHROPIC_BASE_URL and secrets for ANTHROPIC_AUTH_TOKEN. Commented out, so inert.
No authn/authz gaps, no injection vectors, no secret leakage paths, no unsafe defaults introduced by this change.
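The allowlist behaviour attributed to claudeEnv() in this thread (keep a few prefixes and exact names, drop everything else) can be sketched as follows. The prefixes and names are the ones the reviews list; the function filterEnv and its structure are assumptions for illustration, not the code in engine.go.

```go
package main

import (
	"fmt"
	"strings"
)

// Prefixes and exact names that survive filtering, as listed in the reviews.
var allowedPrefixes = []string{"CLAUDE_", "ANTHROPIC_", "NODE_", "XDG_", "LC_"}

var allowedNames = map[string]bool{
	"PATH": true, "HOME": true, "USER": true, "LOGNAME": true,
	"TMPDIR": true, "LANG": true, "TERM": true, "SHELL": true,
}

// filterEnv returns only the KEY=VALUE entries whose key is allowlisted.
// Hypothetical stand-in for claudeEnv(); the real function may differ.
func filterEnv(environ []string) []string {
	var kept []string
	for _, kv := range environ {
		name, _, ok := strings.Cut(kv, "=")
		if !ok {
			continue // skip malformed entries without "="
		}
		if allowedNames[name] || hasAllowedPrefix(name) {
			kept = append(kept, kv)
		}
	}
	return kept
}

func hasAllowedPrefix(name string) bool {
	for _, p := range allowedPrefixes {
		if strings.HasPrefix(name, p) {
			return true
		}
	}
	return false
}

func main() {
	// Placeholder values; GITEA_TOKEN is dropped, the other two are kept.
	env := []string{"PATH=/usr/bin", "GITEA_TOKEN=placeholder", "CLAUDE_CODE_OAUTH_TOKEN=placeholder"}
	for _, kv := range filterEnv(env) {
		fmt.Println(strings.SplitN(kv, "=", 2)[0])
	}
}
```

In a real caller the result would typically be assigned to the Env field of an exec.Cmd, for example cmd.Env = filterEnv(os.Environ()), so the subprocess never sees the other provider secrets.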
🎯 Correctness — No material issues found
Verdict: No material issues found
After reviewing the diff through the 🎯 Correctness lens, I find no logic bugs or semantic errors introduced by this change. Here's what I verified:
llamaswap wiring in resolveModel() (lines 83-88): Structurally identical to the ollama case: same option pattern (WithBaseURL, WithToken), same return shape. No WithName is used here, which is correct because this is the direct-provider path (not endpoint registration). Verified by reading cmd/gadfly/model.go in full.
llamaswap wiring in endpointProvider() (lines 225-230): Mirrors the ollama endpoint case, including WithName(name) for registry registration. Consistent with all other endpoint providers. Verified by reading the full function.
Error message updates (lines 107, 255): Both default branches now list llamaswap alongside the other supported providers. The messages are accurate: claude-code is intentionally absent from these lists because it doesn't support GADFLY_BASE_URL endpoint overrides (it reads ANTHROPIC_BASE_URL from the subprocess env instead).
claude-code/sonnet in GADFLY_MODELS: The spec claude-code/sonnet resolves through majordomo.Parse() (no GADFLY_BASE_URL is set), which is the correct path for the Phase-1 engine. modelProvider() extracts claude-code as the lane, matching claude-code=1 in GADFLY_PROVIDER_CONCURRENCY and, by absence, the scalar default of 1 in GADFLY_PROVIDER_LENS_CONCURRENCY. Verified by reading the workflow YAML and model.go.
Concurrency settings: claude-code=1 in GADFLY_PROVIDER_CONCURRENCY correctly limits the single claude-code model to one-at-a-time. No claude-code entry in GADFLY_PROVIDER_LENS_CONCURRENCY means lenses fall back to the scalar default (1, sequential), which is appropriate for a subprocess-based engine. Verified against the lens concurrency test matrix in lens_concurrency_test.go (the "no lane match falls back to scalar" test case at line 175).
CLAUDE_CODE_OAUTH_TOKEN: Correctly mapped from secrets and named consistently with what the majordomo claude-code provider expects.
🧯 Error handling & edge cases — No material issues found
No material issues found.
The llamaswap additions in cmd/gadfly/model.go (both resolveModel and endpointProvider) follow the exact same guard-and-propagate pattern as the existing ollama/openai/anthropic/google cases: baseURL is validated non-empty before the switch, apiKey/key is conditionally applied only when non-empty, and any error from New(...).Model(...) propagates up through the existing resolveModel → run → main chain (where run.sh catches non-zero exits and renders an advisory failure comment). The error messages in the default branches correctly include llamaswap. No swallowed errors, no nil dereference paths, no missing cleanup.
The workflow YAML changes are configuration-only. The claude-code/sonnet model and CLAUDE_CODE_OAUTH_TOKEN secret are wired into the existing error-surfacing chain: if the token is missing or the CLI fails, claudeCodeEngine.runPass returns an error → reviewWithSpecialist renders it as "⚠️ This reviewer failed to complete: …" → the comment is posted advisory (never blocks the job). The claude-code=1 concurrency cap and the absence of a claude-code entry in GADFLY_PROVIDER_LENS_CONCURRENCY (falling back to the default of 1) are both correct and safe.
The go.mod/go.sum bump is a dependency update only; no error-handling surface there.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 59s
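The lane fallback discussed in this review (a lane with no entry in GADFLY_PROVIDER_LENS_CONCURRENCY uses the scalar default) can be sketched as a lookup. Assumptions are labeled: the function name lensConcurrencyFor is hypothetical, and the comma-separated lane=N format for multiple entries is a guess, since the thread only shows single entries such as "ollama-cloud=3".

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// lensConcurrencyFor returns the per-lane lens concurrency from a value
// like "ollama-cloud=3", or fallback when the lane has no valid entry.
// Assumption: multiple entries are comma-separated (not confirmed by the PR).
func lensConcurrencyFor(lane, raw string, fallback int) int {
	for _, entry := range strings.Split(raw, ",") {
		name, val, ok := strings.Cut(strings.TrimSpace(entry), "=")
		if !ok || name != lane {
			continue
		}
		// Ignore non-numeric or non-positive values rather than failing.
		if n, err := strconv.Atoi(val); err == nil && n > 0 {
			return n
		}
	}
	return fallback
}

func main() {
	const raw = "ollama-cloud=3"
	for _, lane := range []string{"ollama-cloud", "m5", "claude-code"} {
		fmt.Printf("%s -> %d\n", lane, lensConcurrencyFor(lane, raw, 1))
	}
}
```

With the workflow value quoted in this thread, only ollama-cloud resolves to 3; m5 and claude-code have no entry and take the default of 1, which is the sequential behaviour the reviewers describe.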
🪰 Gadfly review — nemotron-3-super:cloud (ollama-cloud)
Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
No material issues found
Reasoning: After reviewing the changes through the security lens (authn/authz, injection, SSRF, secrets leakage, input validation, etc.), no material security issues were found in the gadfly repository changes:
CLAUDE_CODE_OAUTH_TOKEN handling: The token is properly sourced from Gitea secrets and passed via environment to the Claude Code CLI subprocess. The engine's entrypoint script forwards the token without logging or exposure. While the README notes subscription auth in CI is a "gray area" per Anthropic's terms, this is a policy concern, not a code vulnerability.
llamaswap provider integration: The provider is integrated similarly to existing providers (ollama/openai/etc.), passing baseURL and token via constructor options. No SSRF/injection risks are introduced in gadfly's code (validation/sanitization depends on the provider implementation in majordomo, which is out of scope for this repo review).
Example documentation: The commented-out Claude Code proxy example in examples/claude-code.yml and README notes are explicitly marked as "example only, not validated/wired." No active code changes introduce proxy-related risks.
Dependency updates: Updated dependencies (including majordomo for llamaswap) are indirect and vetted by CI (go test -race ./... passed). No direct vulnerable dependencies were introduced in gadfly's own code.
No authn/authz gaps, injection points, or secrets leaks were identified in the modified files after verifying via read_file and grep.
No material security issues were found within the assigned lens. Outside my lens: none noted.
🎯 Correctness — No material issues found
No material issues found
🧯 Error handling & edge cases — No material issues found
No material issues found
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 4m 38s
🪰 Gadfly review — gpt-oss:120b-cloud (ollama-cloud)
Verdict: Minor issues — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
VERDICT: No material issues found
llamaswap provider and a claude-code/sonnet backend, but all added environment variables (CLAUDE_CODE_OAUTH_TOKEN) and configuration strings are only used for authentication to external services. No code paths directly expose these secrets, inject them into commands, SQL, or file paths, nor forward unvalidated user input to external APIs.
llamaswap) follows the same pattern as existing providers, using llamaswap.WithToken and llamaswap.WithBaseURL without constructing any unsafe command strings or URLs from untrusted data.
All modifications were verified by reading the affected files (cmd/gadfly/model.go, workflow YAML, and README) to confirm that secret handling and input validation remain consistent with existing, already-reviewed providers. No new authentication/authorization gaps, injection vectors, or secret leakage issues are introduced.
🎯 Correctness — Minor issues
Verdict: Minor issues
cmd/gadfly/model.go:107 & cmd/gadfly/model.go:255).
The default error strings list supported providers but omit "openai-compatible" even though that provider is accepted in the switch statements above. This can mislead users when they receive an error for a valid provider.
Fix: Update the error messages to include "openai-compatible" (e.g., "use openai/openai-compatible/ollama/llamaswap/foreman/anthropic/google").
🧯 Error handling & edge cases — No material issues found
Verdict: No material issues found
llamaswap handling mirrors existing provider logic and correctly propagates errors from endpointProvider and majordomo.Parse. No ignored or swallowed errors were introduced, and all error paths return appropriate messages.
GADFLY_MODEL) before use and handles empty API keys gracefully.
defer, or panic-inducing paths were found in the changes.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 50s
🪰 Gadfly review — qwen3-coder:480b-cloud (ollama-cloud)
Verdict: Minor issues — 3 reviewers: security, correctness, error-handling
🔒 Security — Minor issues
VERDICT: Minor issues
llamaswap provider: The new llamaswap provider is added but marked as untested in the README. This could pose a security risk if it's later used in production without proper validation. Confirmed by grep("llamaswap", "README.md", 5).
read_file("examples/claude-code.yml", 63, 7) and read_file("README.md", 119, 8).
🎯 Correctness — No material issues found
VERDICT: No material issues found
After reviewing the changes through the lens of correctness, I found no material issues. The PR correctly implements:
llamaswap provider in the model resolution logic with proper option handling
claude-code provider lane
All code paths appear logically sound and consistent with the existing patterns. The changes correctly handle API key configuration, base URL construction, and error messaging for the new provider. The Claude Code integration follows the established engine abstraction without introducing correctness issues.
🧯 Error handling & edge cases — No material issues found
No material issues found.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 35s
🪰 Gadfly review — gemma4:cloud (ollama-cloud)
Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
No material issues found
🎯 Correctness — No material issues found
No material issues found
🧯 Error handling & edge cases — No material issues found
No material issues found
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 23s