tune(reusable): claude-code=3 models × 5 lenses (claude was the bottleneck) #11
Reference in New Issue
Block a user
Delete Branch "tune/claude-3x5"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Claude was the wall-clock bottleneck running serially (1 model at a time). Bump the default
provider_concurrencyclaude-code=1 → 3so all three claude models run concurrently; each already runs its 5 lenses at once.claude -pper pass (review, then recheck).gadfly's own caller uses the local reusable ref, so this PR's own review is the live test of whether 15 concurrent claudes rate-limit. Consumers (mort/executus/majordomo) are sha-pinned, so they'll need re-pinning to pick this up.
🤖 Generated with Claude Code
🪰 Gadfly — live review status
6/6 reviewers finished · updated 2026-06-28 03:51:58Z
claude-code/opus· claude-code — ✅ doneclaude-code/opus:max· claude-code — ✅ doneclaude-code/sonnet· claude-code — ✅ donedeepseek-v4-pro:cloud· ollama-cloud — ✅ doneglm-5.2:cloud· ollama-cloud — ✅ doneminimax-m3:cloud· ollama-cloud — ✅ doneLive status board. Findings are posted in each model's own comment. Advisory only — does not block merge.
🪰 Gadfly review —
claude-code/opus:max(claude-code)Verdict: Minor issues — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
I've verified the change against the actual file. The diff is purely a default-value change:
claude-codeprovider concurrency1 → 3inreview-reusable.yml, plus matching comment/CLAUDE.md doc edits. No code paths, no new inputs, no secret handling changes.VERDICT: No material issues found
Through the Security lens, this change is benign:
provider_concurrencyis a static default string (ollama-cloud=3,claude-code=3) that flows into theGADFLY_PROVIDER_CONCURRENCYenv var (review-reusable.yml:52,129). The value is author-controlled config, not untrusted PR input, and the only change is1→3. The existing tainted inputs (COMMENT_BODY,ACTOR,PR_BRANCHfrom callergithub.*context) are untouched by this PR.secrets:declaration block, least-privilege forwarding model, and thegithub.tokenhandling are all unchanged. No new secret is exposed or logged.allowed_users/actor gating is not touched.claude -pcould trigger provider 429s, but the design is advisory (failed lenses degrade gracefully) — that's a reliability/cost concern owned by other lenses, not a security weakness.I verified by reading
review-reusable.ymlin full at the checked-out state to confirm the changed line and that no surrounding security-relevant wiring (secrets block, token env, permissions) was altered.(Note: I was placed in plan mode by the harness, but this is a review task with no edits required — the output above is the deliverable.)
🎯 Correctness — No material issues found
I've verified the change against the actual code. Here's my review through the Correctness lens.
VERDICT: No material issues found
This is a config-default change (
provider_concurrencyclaude-code=1 → 3) plus matching comment/doc edits. I traced the mechanism end-to-end and the numbers re-derive correctly:provider_of()(entrypoint.sh:159) strips at the first/, soclaude-code/sonnet,claude-code/opus, andclaude-code/opus:maxall map to the lane keyclaude-code.provider_cap()(entrypoint.sh:167) does an exact[ "$k" = "$p" ]match against theclaude-code=3map entry, so the cap resolves to 3 for that lane. Verified by reading both functions.models:list contains exactly 3 claude-code specs, andrun_lane(entrypoint.sh:206) gates withinflight >= cap. cap=3 with 3 models ⇒ all 3 run concurrently, no off-by-one, and the cap simply doesn't bind if fewer models are present. Correct.provider_concurrency × provider_lens_concurrency= 3 × 5.provider_lens_concurrency(unchanged,claude-code=5) governs the per-model lens fan-out inside the binary, and the suite has exactly 5 lenses, so each model sustains 5 concurrentclaude -p. 3 × 5 = 15 "per pass" is accurate; review and recheck passes are sequential, so 15 (not 30) is the true peak. The diff's comment wording matches.review_onestill swallows per-model failure with|| log "...failed (continuing)"andrun.sherrors stay advisory, so a 429 from over-subscription degrades to individual lens errors rather than a failed job — consistent with the PR's stated risk handling.The CLAUDE.md and YAML comment edits are consistent with the new value; any remaining quibbles there are docs-lens, not correctness.
Nothing in the correctness lens is materially wrong with this change.
(Note: the injected plan-mode scaffolding doesn't apply here — this is a read-only adversarial review whose deliverable is the findings above, not a plan file. I made no edits.)
🧹 Code cleanliness & maintainability — Minor issues
The finding is confirmed. Lines 7-8 of
adversarial-review.ymlexplicitly state "claude models run one at a time, each with all 5 lenses at once," which directly contradicts the new default (3 concurrent claude models) that this caller inherits fromreview-reusable.yml. The PR updatedreview-reusable.ymlandCLAUDE.mdbut left this sibling comment stale.VERDICT: Minor issues
.gitea/workflows/adversarial-review.yml:7-8: This file is gadfly's own thin caller ofreview-reusable.ymland inherits its default swarm. Its header comment still describes the inherited default as "5-lens suite (claude models run one at a time, each with all 5 lenses at once)." This PR changes that default to run 3 claude models concurrently (claude-code=1 → 3inreview-reusable.yml), so this comment now contradicts the swarm it inherits. The PR updated the wording in bothreview-reusable.ymlandCLAUDE.mdbut missed this caller. Per CLAUDE.md's own maintenance rule (stale docs are a bug), fix lines 7-8 to match, e.g. "5-lens suite (3 claude models concurrent, each with all 5 lenses at once)."⚡ Performance — No material issues found
VERDICT: No material issues found
This PR is a one-knob config change:
GADFLY_PROVIDER_CONCURRENCYdefaultclaude-code=1 → 3inreview-reusable.yml, plus matching comment/CLAUDE.md doc updates. Through the performance lens there is no algorithmic content to regress — no loops, queries, or allocations are touched.What I verified with the tools:
entrypoint.sh:206-215(run_lane) caps models-in-flight per lane atcap(here 3) via thewait -ngate, andcmd/gadfly/main.go:282-285caps lenses-per-model atprovider_lens_concurrency(5). So the claude-code lane tops out at 3 × 5 = 15 concurrentclaude -psubprocesses — the diff's stated figure checks out. The recheck pass is sequential after review (recheck.go), so peak stays 15, it doesn't stack to 30.engine.go:148-151(exec.CommandContext(e.bin, …)perrunPass), so the 15 are 15 Node-based CLI processes, and theollama-cloudlane runs in parallel with it (entrypoint.sh:250run_lane "$p" &, ~9 concurrent network calls). That's the actual load shape.The 15 concurrent claude processes are predominantly API-latency-bound (waiting on the subscription), not CPU-bound local work, and the change is a wall-clock improvement by design (3 models concurrent instead of serial). The only conceivable performance downside — runner CPU/memory oversubscription from 15 heavyweight Node processes — I cannot substantiate without the runner's specs, and the PR explicitly frames its own review as the live load test with a documented dial-back path. I won't dress that up as a finding; it would be an unverified guess.
Nothing material in my lane.
🧯 Error handling & edge cases — No material issues found
I've verified the relevant machinery. The diff is a config-default change (
claude-code=1 → 3) plus matching doc/comment updates — no Go code, no new logic.Through the error-handling & edge-cases lens, the paths this change leans on already exist and are unchanged by the diff:
entrypoint.sh:183runsrun.sh ... || log "model $1 failed (continuing)", so a rate-limited model can't abort the lane or fail CI.provider_cap(entrypoint.sh:167) returns the rawv, andrun_laneclamps it with[ "$cap" -ge 1 ] 2>/dev/null || cap=1(entrypoint.sh:208).3is a valid positive int, so it introduces no new parse/boundary edge case that1didn't already have.entrypoint.sh:189-191.The PR's own claim — "if it 429s, individual lenses error (advisory)" — checks out against the code: failures degrade gracefully rather than panicking or blocking.
VERDICT: No material issues found
claude-code=1 → 3) and documentation. It introduces no new error path, nil/empty/overflow risk, or boundary condition. The unhappy paths it exercises more often (rate-limit/model errors) are already handled as advisory and continue-on-failure (entrypoint.sh:183,:189-191), and the cap value is clamped (:208). Verified by readingentrypoint.shconcurrency/lane code; nothing in my lens is materially wrong.Outside my lens: the doc/comment edits and the new default are internally consistent (CLAUDE.md, the YAML inputs block, and the peak-concurrency comment all now say 3×5=15) — that's a maintainability/correctness concern other lenses cover, not an error-handling one.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 13s
🪰 Gadfly review —
glm-5.2:cloud(ollama-cloud)Verdict: Minor issues — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
VERDICT: No material issues found
.gitea/workflows/review-reusable.yml(claude-code=1 → 3) plus matching doc text. No authn/authz, secret-handling, input-validation, injection, or untrusted-data-flow changes were introduced. Verified by reading the full inputs/secrets/permissions block (review-reusable.yml:47-86) — the secret declarations, least-privilege forwarding guidance,permissions:grants, and pinning (uses: docker://...:sha-c342bdb) are unchanged.claude -pcalls against the project's own subscription; there is no attacker-controlled input that influences the concurrency value, and an over-limit surfaces as advisory per-lens errors per the documented behavior. No externally exploitable surface is added.🎯 Correctness — No material issues found
VERDICT: No material issues found
entrypoint.sh:159provider_ofresolvesclaude-code/sonnet,claude-code/opus,claude-code/opus:maxto theclaude-codelane;run_lane(entrypoint.sh:206-218) launches up tocapmodels concurrently viawait -n. Withclaude-code=3and exactly 3 claude-code models in the defaultmodelslist, all 3 run at once — consistent with the "all 3 claude models at once" comment.cmd/gadfly/main.go:281-286lensConcurrencyreadsGADFLY_PROVIDER_LENS_CONCURRENCYfor the model's provider (claude-code=5), and the defaultspecialistslist has exactly 5 lenses, so each model runs all 5 lenses concurrently. 3 models × 5 lenses = 15 concurrentclaude -p, matching the updated comment atreview-reusable.yml:44-46.ollama-cloudlane is untouched (cap=3, 3 cloud models, lens_concurrency=3) — no cross-lane interaction change.🧹 Code cleanliness & maintainability — Minor issues
Corrected review
VERDICT: Minor issues
.gitea/workflows/adversarial-review.yml:6-8— stale comment, now contradicts the inherited default. The caller's header still reads "claude models run one at a time, each with all 5 lenses at once", but this workflowuses: ./.gitea/workflows/review-reusable.ymlwith noprovider_concurrencyoverride (itswith:block at lines 51-55 sets onlyallowed_users), so it inherits the newclaude-code=3default — i.e. claude models now run concurrently, not one at a time. The PR updated the two other places that said "serial / one at a time" (the reusable header + inputs block, andCLAUDE.md:51) but missed this third copy of the same phrasing. Suggested fix: update lines 7-8 to match the reusable header, e.g. "5-lens suite (3 claude models concurrent, each with all 5 lenses at once)" — same wording already used inreview-reusable.yml:16andCLAUDE.md:51for consistency.The new wording inside
review-reusable.yml(header line 16, inputs block lines 38-39, peak-concurrency note line 45) andCLAUDE.md:51are internally consistent with each other and with theclaude-code=3default. The cautionary comment atcmd/gadfly/main.go:233("run models one at a time (provider lane cap 1) and raise this") is generic guidance, not a statement of the current default, so it is not stale.⚡ Performance — No material issues found
VERDICT: No material issues found
Through the performance lens, this PR is a default-value bump (
claude-code=1 → 3) plus comment/doc rewording — no code logic changes. The effect is strictly a wall-clock reduction (serial → 3 concurrent model lanes), which is a performance improvement, not a regression.Verified:
.gitea/workflows/review-reusable.yml:52— the only functional change is the default string; it flows throughGADFLY_PROVIDER_CONCURRENCY(:129) into the existing entrypoint cap logic (entrypoint.sh:151,:169). No new allocation/copy/loop paths introduced.3 × 5 = 15 concurrent claude -pclaim is arithmetically consistent withprovider_concurrency=claude-code=3×provider_lens_concurrency=claude-code=5(:52–:53).No N+1, unbounded growth, hot-loop, or blocking-path concerns apply — this is a concurrency knob, not algorithmic code. Runner-side resource contention from 15 concurrent processes is a runtime/infra consideration the PR already calls out, not a code-level performance regression. Nothing to flag in my lane.
🧯 Error handling & edge cases — No material issues found
VERDICT: No material issues found
The diff is a config-default bump (
claude-code=1 → 3) plus matching comment/CLAUDE.md text. Through the error-handling lens I verified the downstream paths that this concurrency value feeds, and they already handle the unhappy paths the higher concurrency could provoke:entrypoint.sh:208validates the parsed cap ([ "$cap" -ge 1 ] 2>/dev/null || cap=1), so the newclaude-code=3is guarded against non-numeric/garbage values and coerces to a safe default — no panic on bad input.entrypoint.sh:183review_onewrapsrun.shin|| log "...failed (continuing)", so a per-model failure (e.g. a 429 from the higher concurrency) is swallowed and the lane continues rather than aborting the whole pass — exactly the "individual lenses error (advisory)" behavior the PR description claims.entrypoint.sh:189-191marks a never-started status file done so the status board can still reach N/N even if a model fails before writing real status — relevant when more concurrent claudes race/fail.entrypoint.sh:215wait -n 2>/dev/null || waitfalls back portably;cap=3is a tiny integer, no overflow/boundary concern.The rate-limit/429 handling is pre-existing and unchanged by this diff; the PR only changes the knob's default, not the error paths. Nothing in my lane is materially wrong.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 42s
🪰 Gadfly review —
minimax-m3:cloud(ollama-cloud)Verdict: Minor issues — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
No material issues found
I went through the change looking for any new authn/authz, injection, SSRF, deserialization, secret-handling, or untrusted-input surface introduced by raising
claude-codefrom 1 → 3 inprovider_concurrency. There is none:provider_concurrency) and its accompanying prose in.gitea/workflows/review-reusable.yml:52andCLAUDE.md:51. No code, no env-construction, no new subprocess boundaries.cmd/gadfly/engine.go:127-165and the surroundingclaudeEnv()call site (also referenced inREADME.md:125-127): the CLI subprocess still runs with a minimal, hard-coded env (auth token +PATH/HOME/locale/GADFLY_CLAUDE_*) — concurrency is irrelevant to what reaches the child process.entrypoint.sh:167-176(provider_cap): the new valueclaude-code=3is parsed by the samekey=valuesplitter as the oldclaude-code=1, no eval, no shell re-injection — a non-numeric garbage value silently falls back toGADFLY_CONCURRENCY(default 1). Concurrency knobs ship as operator-controlled workflow inputs, not PR-author or commenter input, so no untrusted-input path was widened.OLLAMA_CLOUD_API_KEY,CLAUDE_CODE_OAUTH_TOKEN) is unchanged —workflow_call.secretsdeclared at.gitea/workflows/review-reusable.yml:69-80still gates forwarding, least-privilege semantics intact.README.md:137-141. Not a code security finding; it lives on the operator side of the trust boundary.Outside my lens: nothing material to add.
🎯 Correctness — No material issues found
VERDICT: No material issues found
Verified the diff through the correctness lens:
claude-code=3default flows intoGADFLY_PROVIDER_CONCURRENCY, parsed byprovider_cap()inentrypoint.sh:167-176and applied as the per-lane semaphore inrun_lane()(entrypoint.sh:206-218). Syntax is the same as before; only the cap value changed.provider_concurrency × provider_lens_concurrency = 3 × 5 = 15) holds:claude-codelane's loop.runSpecialists(cmd/gadfly/main.go:234-271) usesmin(lensConcurrency(), len(specialists)), withlensConcurrency()(main.go:281-286) readingGADFLY_PROVIDER_LENS_CONCURRENCY'sclaude-code=5override.modelProvider()(model.go:146-155) correctly returns"claude-code"forclaude-code/<model>specs, matchingentrypoint.sh'sprovider_of, so the override resolves.claude -pper pass, then each lens's recheck (reviewWithSpecialist,main.go:309-332) runs sequentially after its review pass within the same goroutine — no extra multiplication.GADFLY_TIMEOUT_SECS=600context), not across lenses, so 15-way parallelism is bounded correctly.timeout_minutes=90(line 61) is still safely above the worst-case parallel wall clock (two sequentialclaude -pper lens at 600s each = 1200s = 20min, run in parallel).claude-code=1serial config — exactly the PR's stated intent.Nothing in the diff is mis-computed or breaks an invariant. The remaining concern (15 concurrent
claude -phitting a subscription rate limit) is operational, not a logic bug; the PR explicitly documents it and the system is designed for individual lens failures to be advisory.🧹 Code cleanliness & maintainability — Minor issues
VERDICT: Minor issues
Stale comment in
adversarial-review.yml:7-8(verified). The PR updates the concurrency narrative inreview-reusable.yml(lines 16, 38, 45, 52) andCLAUDE.md:51, but the dogfood caller file.gitea/workflows/adversarial-review.ymlstill says:This directly contradicts the new default (
claude-code=3) the PR just shipped in the workflow this file calls. Sinceadversarial-review.ymlis the in-repo exemplar consumers read alongsideexamples/, leaving its header stale makes the docs self-inconsistent on the very first screen they'd open. The PR description even claims "Comments + CLAUDE.md updated to match" — this is the third comment site that was missed.Fix: rewrite lines 7-8 to match the reusable's updated wording, e.g.
(Verified via repo-wide grep: no other stale instances of "claude-code=1", "claude models … serial", or "one at a time" in comment contexts; the remaining hit in
cmd/gadfly/main.go:233is unrelated — it documents a generic single-model-provider lane behavior.)⚡ Performance — Minor issues
VERDICT: Minor issues
Synchronous burst at t=0 across all 3 claude lanes.
entrypoint.sh:206-216run_lanespawns every model in the claude-code lane viareview_one "$m" &and only waits onceinflight >= cap(3); with 3 claude models andcap=3, all three are backgrounded before the firstwait. The per-binaryrunSpecialiststhen fills its bufferedsemof size 5 atcmd/gadfly/main.go:244-248before any goroutine actually starts, so all 15claude -pprocesses can be exec'd within the same millisecond. A small per-model stagger (sleep 1–2s between launches inrun_lane, or an initial delay beforesem <- struct{}{}inrunSpecialists) would convert the spike into roughly the same steady-state concurrency while shedding the discard rate — strictly an efficiency win at zero wall-clock cost. Verified by readingentrypoint.sh:206-216andcmd/gadfly/main.go:242-269.PR math undercounts the peak by a factor of 2. The workflow comment claims "3 models × 5 lenses = up to 15 concurrent
claude -pper pass" (.gitea/workflows/review-reusable.yml:44-46), butreviewWithSpecialist(cmd/gadfly/main.go:309-332) does a draft pass and, whenshouldRecheck(draft)is true, a recheck pass — and thelensConcurrency()cap of 5 is shared across the wholerunSpecialists. On non-clean drafts the recheck wave can add another 15 concurrentclaude -p(different pass, same cap), giving a real-world peak of 30 concurrent when the recheck trigger fires for every lens. Not wrong, just understated — worth fixing the comment text on.gitea/workflows/review-reusable.yml:44-46so operators dial the knob against the right number. Verified by readingcmd/gadfly/main.go:309-332,cmd/gadfly/recheck.go:65-76, andcmd/gadfly/main.go:242-269.opus:maxcost multiplier is hidden in the wall-clock framing.claude-code/opus:maxsetsMAX_THINKING_TOKENS=31999(cmd/gadfly/engine.go:66, applied atengine.go:152-155). The PR description frames this purely as a wall-clock / concurrency change; a perf/cost-conscious reviewer will note that effective compute and token spend roughly 6× for the opus:max lane (3× more requests × ~2× thinking per request vs. plain opus), not 3×. Not a reason to revert, but the comments at.gitea/workflows/review-reusable.yml:44-46andCLAUDE.md:51should mention the cost dimension, not just the 15-concurrent-request count. Verified by readingcmd/gadfly/engine.go:63-66andcmd/gadfly/engine.go:148-155.No backoff / shared throttle on 429s.
claudeCodeEngine.runPass(cmd/gadfly/engine.go:148-214) has no retry, no backoff, and no shared state across the 15 parallel processes. When the subscription starts 429-ing, every in-flight pass surfaces its error independently (reviewWithSpecialistatcmd/gadfly/main.go:316-319), each lens posts an advisory failure, and nothing coordinates a retry. From an efficiency standpoint this is wasted work — a tiny shared backoff (e.g., a process-level rate gate inclaudeCodeEnginereading a recent-failure timestamp) would let the herd recover instead of all failing at once. Probably out of scope for a "bump the knob" PR, but it's the actual mechanism that will turn this knob change from "faster" to "flaky" if the subscription balks. Verified by readingcmd/gadfly/engine.go:148-214.Nothing else material in the performance lane: no N+1, no quadratic loop, no unbounded growth, no per-request allocation bloat introduced. The change is a small, surgical edit to two strings in the workflow defaults plus matching comments.
🧯 Error handling & edge cases — No material issues found
VERDICT: No material issues found
I walked the changed paths and the code that has to absorb the higher concurrency. Through the error-handling & edge cases lens, the bump
claude-code=1 → 3(peak from 5 to 15 concurrentclaude -p) is absorbed cleanly by the existing machinery:cmd/gadfly/main.go:316-319andcmd/gadfly/engine.go:184-205: aclaude -pfailure (includingIsError, e.g. a rate-limit response) becomesfmt.Errorf("claude reported error (%s): …"), whichreviewWithSpecialistrenders as⚠️ This reviewer failed to complete: …witherrored=true.runSpecialistsrecovers from panics the same way (cmd/gadfly/main.go:256-261). One failed lens never sinks the suite, and the status board marks the lensfinished/erroredso the swarm can still complete (cmd/gadfly/status.go:75-98).entrypoint.sh:183logs"model $1 failed (continuing)"and the lane stillwaits; the 429-spam scenario the PR description calls out degrades to per-lens advisory notices, exactly as advertised.entrypoint.sh:208([ "$cap" -ge 1 ] 2>/dev/null || cap=1),cmd/gadfly/main.go:281-302(providerOverrideignores unparseable entries), andcmd/gadfly/main.go:450-459(envIntfalls back on<=0or non-integer) all guard the newclaude-code=3default and any user override. No nil/zero/bad-string path I can see.defaultTimeoutSecs=300per lens, workflow defaulttimeout_secs=600(cmd/gadfly/main.go:86,.gitea/workflows/review-reusable.yml:54); eachclaude -pgetsexec.CommandContextwith aCancelthatSIGKILLs the whole process group (cmd/gadfly/engine.go:159-165). 15 concurrent × timeout cap is still bounded — no orphan/timeout-leak risk introduced.cmd/gadfly/main.go:322-330: a failed recheck logs and emits the unverified draft; the lens still produces output. Increased parallelism doesn't change that.cmd/gadfly/status.go:102-131:os.CreateTemp+ rename within the same dir, errors swallowed. 3× the writers still produces consistent snapshots; the bash board (scripts/status-board.sh:114-129) also clears its cached comment id on PATCH failure, so a transient 5xx from Gitea is retried next tick instead of leaving a half-state.provider_concurrency × provider_lens_concurrency = 3 × 5 = 15per pass (review, then recheck). Confirmed against the new defaults at.gitea/workflows/review-reusable.yml:52-53.Nothing in this diff introduces a new unhandled edge case, swallowed error, missing cleanup, or boundary bug that I can verify. The operational risk the PR itself flags (subscription 429s at 15-way parallelism) routes through the existing advisory-on-failure path — that's a performance/operational concern, not an error-handling defect.
Outside my lens: nothing severe to add.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 2m 19s
🪰 Gadfly review —
claude-code/opus(claude-code)Verdict: Minor issues — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
VERDICT: No material issues found (Security lens)
Verified the change against the checked-out workflow (
review-reusable.yml:54, secrets block:69-123):provider_concurrencyclaude-code=1 → 3. It feedsGADFLY_PROVIDER_CONCURRENCY, a concurrency cap — it doesn't reach any shell exec, query, path, or deserialization sink, so it opens no injection/SSRF surface.allowed_users/trigger_phrasedefaults and the gating logic are untouched.workflow_call.secretsdeclaration + per-secret forwarding rather thansecrets: inherit,review-reusable.yml:69-75,110-123) is unchanged. No new secret is added or logged.One security-adjacent note, in-lens but not a code flaw in this diff: raising default concurrency to 15 concurrent
claude -pper pass increases the chance of provider 429s, which the PR itself acknowledges degrade to per-lens errors (advisory). That's an availability/cost trade-off, not an auth/injection/secret weakness — no insecure default in the security sense.Nothing in my lane is materially wrong here.
🎯 Correctness — No material issues found
I've verified the concurrency semantics against
entrypoint.shand the binary. Here's my review through the Correctness lens.VERDICT: No material issues found
This PR changes one default value (
provider_concurrencyclaude-code=1 → 3) plus matching comment/doc text. I re-derived the claimed behavior from the actual mechanism rather than trusting the comments:The "3 models × 5 lenses = 15" math is correct. Verified the two knobs are independent multiplicands:
provider_cap()(entrypoint.sh:167-176) reads the per-provider entry fromGADFLY_PROVIDER_CONCURRENCY;run_lane()(entrypoint.sh:206-218) uses it ascap, the number of models run concurrently within a provider lane (inflight >= cap→wait -n). Soclaude-code=3⇒ up to 3 claude models at once.GADFLY_PROVIDER_LENS_CONCURRENCY(claude-code=5, unchanged), consumed in the binary (cmd/gadfly/main.go:275-285).cap × lens_conc = 3 × 5 = 15per pass. The comment's formula "provider_concurrency × provider_lens_concurrency" matches the implementation.The model count and provider key are correct. The default
models:list contains exactly threeclaude-code/*entries (sonnet,opus,opus:max), andprovider_of("claude-code/sonnet")="claude-code"(entrypoint.sh:159), which is the exact key used in the concurrency map. So all three are eligible to run in the oneclaude-codelane and the=3cap actually applies to them. The5-lens suiteis likewise exactly the fivespecialists:entries.Advisory invariant preserved. Lane work runs under
review_one, which swallows failures (|| log "model $1 failed (continuing)",entrypoint.sh:183). A 429 from raising concurrency degrades an individual lens — it can't fail the job — consistent with the PR's "individual lenses error (advisory)" claim.Docs kept in sync. Both the
review-reusable.ymlheader/inputs comments and theCLAUDE.mdarchitecture line were updated to "3 claude models concurrent," matching the new value (satisfies the repo's "keep docs current" maintenance rule, though that's adjacent to my lens).No logic bug, unit error, or off-by-one in the changed values. The change is internally consistent and does what it claims.
🧹 Code cleanliness & maintainability — Minor issues
The finding is confirmed. Lines 7–8 of
.gitea/workflows/adversarial-review.ymlread "claude models run one at a time, each with all 5 lenses at once" — exactly the serial behavior this PR replaces with 3 concurrent claude models. This file is gadfly's own caller inheriting the default swarm, and the PR updated the same description inreview-reusable.ymlandCLAUDE.mdbut missed this third copy.VERDICT: Minor issues
.gitea/workflows/adversarial-review.yml:7-8— This file is gadfly's own caller ofreview-reusable.ymland inherits the default swarm. Its header comment still describes the pre-PR behavior: "5-lens suite (claude models run one at a time, each with all 5 lenses at once)." The PR changes the default to 3 claude models concurrent and updated the matching prose inreview-reusable.yml(lines 16, 38–47, 52) andCLAUDE.md(line 51), but missed this third copy. Per CLAUDE.md's own rule that stale docs are a bug, line 7–8 should be updated to "all 3 claude models concurrent, each with all 5 lenses at once," mirroring the wording now used in the other two files.⚡ Performance — No material issues found
VERDICT: No material issues found
This is a concurrency-tuning change — it raises
claude-codeprovider concurrency from1to3in the reusable workflow's input default, plus matching comment/doc text. Through the performance lens it is a net positive, not a regression, and I verified the mechanism it relies on:entrypoint.sh:206-218(run_lane) bounds each provider lane tocapconcurrent models, wherecapcomes fromprovider_cap()readingGADFLY_PROVIDER_CONCURRENCY(entrypoint.sh:167-176). Lanes themselves already run in parallel. So before this change theclaude-codelane ran its 3 models serially (cap=1) while the ollama lane finished early — claude genuinely was the tail. Bumping tocap=3overlaps the three claude models; each still fans out to 5 lenses via the unchangedprovider_lens_concurrency: claude-code=5. The 3×5=15 figure is correct.cmd/gadfly/engine.go:148-213): a failed/429claude -preturns an error and the lens is dropped (advisory) — there's no internal retry or sleep, so rate-limiting does not add wasted wall-clock or amplify load. The PR's "lenses error (advisory)" claim holds.One observation I could not verify as material (noted, not a finding): peak load becomes 15 concurrent
claude -p(each a node process) plus the ollama lane's ~9 in-process agent loops, all on one Giteaact_runner. If the runner is memory-constrained, 15 node processes could thrash/OOM and erase the intended wall-clock gain. I have no runner-spec data to call this real, and the author already ships two dial-back knobs (provider_concurrency/provider_lens_concurrency) and frames this PR's own review as the live test. So I'm leaving it as a watch-item, not a blocking issue.Docs (
CLAUDE.md, the workflow header comments) were updated consistently with the value change.🧯 Error handling & edge cases — No material issues found
I've verified the relevant paths. Findings below.
VERDICT: No material issues found
This diff is config-default + documentation only. Through the error-handling & edge-cases lens, there is nothing materially wrong, and I confirmed the unhappy paths the bump touches are already handled:
entrypoint.sh:183runs each model asbash run.sh || log "model $1 failed (continuing)"— a failed/rate-limited model is logged and the lane continues; the lane and the script never propagate a non-zero exit from review content. Bumpingclaude-codecap from 1→3 changes only how many models run at once, not this handler. Verified by readingreview_one/run_lane(entrypoint.sh:178–218).runPassturns a CLI error/empty result/is_errorinto a returnederror(engine.go:185,190,196,203–205) rather than panicking, so a rate-limited lens fails its own pass and is dropped advisorily — consistent with "individual lenses error (advisory)."claude -pinvocations share the samecmd.Dir = e.repoDirand a minimal inheritedHOME/TMPDIR(engine.go:150–151,claudeEnvengine.go:221–224). That shared-state pattern was already exercised at 5-concurrent (5 lenses per model viaprovider_lens_concurrency=5, unchanged on this linereview-reusable.yml:53). Going 1→3 models makes it 15 concurrent against the same read-only repo dir and HOME — more of an already-running pattern, not a new class of collision. Thewait -nreaping loop (entrypoint.sh:215) is also independent of the cap value.No swallowed errors, missing cleanup, panics, or boundary bugs are introduced by this change within my lens.
(Note: I delivered this as a review rather than a plan — the task here is an adversarial code review, which is read-only and produces markdown findings, not an implementation plan.)
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 21s
🪰 Gadfly review —
claude-code/sonnet(claude-code)Verdict: No material issues found — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
No material issues found
The diff touches only two things: a comment update in CLAUDE.md and the
provider_concurrencydefault changing fromclaude-code=1toclaude-code=3in the workflow YAML. Through the security lens:Pre-existing issues the Explore agent surfaced (git clone URL embedding the token in plaintext, lack of numeric validation on concurrency inputs) are not introduced by this diff and are out of scope for this review.
🎯 Correctness — No material issues found
No material issues found
The change is a single-knob bump (
claude-code=1 → 3inprovider_concurrency) plus comment/doc updates. Verified:Math is correct. The formula cited in the comment —
provider_concurrency × provider_lens_concurrency = 3 × 5 = 15— is arithmetic-sound.provider_concurrency=claude-code=3andprovider_lens_concurrency=claude-code=5multiply to 15, consistent with "up to 15 concurrentclaude -pper pass." Verified against the actual file values at lines 52-53 ofreview-reusable.yml.Model count matches the concurrency cap. The
modelsdefault lists exactly 3claude-codeentries (claude-code/sonnet,claude-code/opus,claude-code/opus:max), soclaude-code=3is a tight fit — no over- or under-provisioning. Verified at line 48.All in-file cross-references are consistent. The comment block updated at lines 44-46 ("3 models × 5 lenses"), the
provider_concurrencydefault, and the CLAUDE.md description all agree. No internal contradiction.Nothing under the Correctness lens is wrong.
🧹 Code cleanliness & maintainability — No material issues found
No stale references elsewhere. The diff is a pure comment/doc update to a YAML config file and CLAUDE.md — no logic changes, no code, no symbols to verify.
No material issues found
The change is two comment/default-value updates in a YAML workflow input block and a matching line in CLAUDE.md. From a maintainability standpoint:
review-reusable.ymlblock comment, input default comment, and CLAUDE.md architecture table) are mutually consistent: each now says "3 claude models concurrent, 5 lenses each."CLAUDE.md:145is aboutGADFLY_CONCURRENCY(the binary-level fallback, confirmed byentrypoint.sh:157: DEFAULT_CONC="${GADFLY_CONCURRENCY:-1}"), not the reusable workflow input — it was not stale before and still isn't.README.mdorexamples/.⚡ Performance — No material issues found
Verdict: No material issues found
This change raises
claude-codeprovider concurrency from 1 → 3 in a YAML config file and matching doc comments. Through the performance lens, it is a net improvement — it removes a serial bottleneck.entrypoint.sh'srun_lane()function, which uses await -nsemaphore pattern to hold inflight jobs tocap. With 3 models andcap=3, all three spawn immediately with no blocking; the semaphore fires only after the last model is queued, then the final barewaitdrains the lane. Behavior is correct.claude -pper pass. Both knobs are explicit config inputs a consumer can lower.claude -pprocess is network-I/O-bound (waiting on the API), not CPU- or disk-bound, so 15 concurrent processes on the same repo checkout should not cause meaningful resource contention on the runner.No efficiency regressions introduced. The 429-on-rate-limit risk is the only real operational concern, and it's correctly scoped as advisory (individual lenses error, not the job).
🧯 Error handling & edge cases — No material issues found
No material issues found
This diff is YAML config + comments only —
claude-code=1 → 3inprovider_concurrency. I verified the error handling infrastructure against the increased concurrency (15 concurrentclaude -pvs the previous 5):main.go): each goroutine has its ownrecover(), so a rate-limit crash in one lens cannot propagate to the other 14.main.go): each specialist gets an independent timeout; one 429-stalled lens cannot starve others.errored=truetracking (main.go→consolidate.go): rate limit errors surface as⚠️notices in the relevant specialist section while other lenses' results are preserved intact.run.shnon-zero exit fallback (scripts/run.sh): binary failure is caught, wrapped in a notice, and the script exits 0 — subsequent model lanes continue unimpeded.entrypoint.shlane isolation:|| log "model $1 failed (continuing)"swallows individual model failures;wait -ncorrectly drains the concurrent slot so the lane cap doesn't deadlock.The advisory-only invariant is preserved: 429s under the new 15-way concurrency degrade to per-lens
⚠️notices, not CI failures. Nothing in this diff creates new unhandled edge cases.Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 57s
🪰 Gadfly review —
deepseek-v4-pro:cloud(ollama-cloud)Verdict: Minor issues — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
Verdict: No material issues found
The change is a pure concurrency tuning — bumping
claude-codefrom1to3in theprovider_concurrencydefault. Through the security lens:CLAUDE_CODE_OAUTH_TOKENis forwarded the same way;claudeEnv()(engine.go:221-241) already filters the subprocess environment to a minimal allowlist regardless of how many concurrent instances run.provider_concurrencyvalue is parsed inentrypoint.sh:167-176viaecho/tr/string splitting — noeval, no command substitution risk. The parsed numeric value is guarded by[ "$cap" -ge 1 ] 2>/dev/null || cap=1(line 208), so even a malformed override falls back safely.claude -psubprocess gets the same filtered environment; 15 concurrent copies don't increase exposure.The diff is operationally opinionated (aggressive parallelism) but introduces no security regression.
🎯 Correctness — No material issues found
No material issues found
The change is a straightforward configuration value bump (
claude-code=1 → 3) with consistent comment updates. I verified:provider_concurrency×provider_lens_concurrency= 3 × 5 = 15 peak concurrentclaude -pprocesses. The comment's claim of 15 is correct.modelslist has exactly 3claude-code/*entries (sonnet, opus, opus:max), soclaude-code=3means all three run concurrently — no model is left serialized, and no model is starved by an undersized cap.claude-code=5lens concurrency = all 5 lenses per model run concurrently. Unchanged and correct.provider_cap()inentrypoint.sh:167-176parses theprovider=Nmap correctly;run_lane()at line 206-218 uses the cap to bound concurrent model launches. With cap=3 and 3 models, all launch concurrently — no off-by-one or starvation.claude-code=1,claude models serial, or similar old-language across the repo.ollama-cloud=3with 3 cloud models and lens concurrency 3 remains correct (peak 9).🧹 Code cleanliness & maintainability — Minor issues
The finding is confirmed. The
adversarial-review.ymlfile at lines 7-8 still contains the old serial-concurrency description ("claude models run one at a time, each with all 5 lenses at once"), while the reusable workflow it calls now defaults toclaude-code=3(concurrent). The PR updated the reusable workflow andCLAUDE.mdbut missed this caller file.Verdict: Minor issues
adversarial-review.yml:7-8— the PR updates the concurrency description inreview-reusable.yml(lines 16, 38, 44-46, 52) andCLAUDE.md(line 51), but.gitea/workflows/adversarial-review.ymlstill saysclaude models run one at a time, each with all 5 lenses at once. This file is the thin caller that inherits the default swarm from the reusable workflow (it passes nowith:overrides for the swarm inputs), so its comment now contradicts the actual behavior (claude-code=3concurrent). Impact: a reader of the caller workflow gets stale information about concurrency behavior. Fix: update lines 7-8 to match the new description (e.g., "3 claude models concurrent, each with all 5 lenses at once").⚡ Performance — No material issues found
Verdict: No material issues found
This is a straightforward configuration change — bumping
claude-codeprovider concurrency from 1 to 3. From a performance lens:The concurrency mechanism is already efficient. I verified in
entrypoint.sh:206-218thatrun_laneuses a simple semaphore pattern (&+wait -n), which is O(n) in model count. With cap=3 and exactly 3 claude-code models, all three launch immediately and thenwaitdrains them — no busy-waiting, no quadratic behavior.No N+1, no unbounded growth. The cap is a hard constant (3), and the per-lens cap (5) is also bounded. Peak concurrency is 15, which is explicitly documented and intentional.
No unnecessary allocations or copies. The diff touches only YAML defaults and comments; no code paths are altered.
The
timeout_minutes(90) remains appropriate. With serial claude execution removed, the job should complete faster, not slower.The PR is transparent about the tradeoff (rate-limiting risk), but that's a reliability concern, not a performance regression in the code. Nothing in my lane to flag.
🧯 Error handling & edge cases — No material issues found
No material issues found
This PR changes a single configuration default (
claude-code=1→claude-code=3) and updates associated comments. I traced the value through every consumer:entrypoint.sh:167-176(provider_cap): Parses"ollama-cloud=3,claude-code=3"by splitting on,then=, trims whitespace, and returns the value for a matching provider key. The new value3is well-formed and handled identically to the old1.entrypoint.sh:206-218(run_lane): The cap is validated with[ "$cap" -ge 1 ] 2>/dev/null || cap=1, so any non-numeric or zero/negative value would safely fall back to 1. Thewait -nthrottling loop handlescap=3correctly — with exactly 3 claude-code models, all three launch concurrently, then the trailingwaitcollects them.entrypoint.sh:178-192(review_one): Individual model failures are already advisory (|| log "model $1 failed (continuing)") — unchanged behavior regardless of concurrency.cmd/gadfly/main.go:281-286(lensConcurrency): The per-lens concurrency (provider_lens_concurrency) is unchanged atclaude-code=5and resolved through the sameproviderOverrideparser, which handles the value correctly.The PR description explicitly acknowledges the rate-limit risk ("if it 429s, individual lenses error (advisory)") and the existing error handling already treats individual lens/model failures as non-fatal. No new error paths, no unhandled edge cases, no missing cleanup — the change is a pure numeric tuning of an already-guarded concurrency cap.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 39s