ci: subscribe to gadfly's reusable review workflow (cloud + Claude Code, no local) #13
Replaces executus's full self-contained Gadfly stub with a thin caller of steve/gadfly's reusable workflow that inherits the default swarm.
uses: steve/gadfly/.gitea/workflows/review-reusable.yml@b02b11d (immutable pin)
review-reusable.yml, steve/gadfly#10): 3 cloud models (minimax-m3, glm-5.2, deepseek-v4-pro) + Claude Code (sonnet, opus, opus:max), 5-lens suite, claude models serial / 5 lenses each.
OLLAMA_CLOUD_API_KEY / CLAUDE_CODE_OAUTH_TOKEN / findings tokens (not secrets: inherit); GITEA_TOKEN is the auto job token.
allowed_users; all swarm tuning lives centrally in gadfly.
Advisory only, never blocks a merge.
🤖 Generated with Claude Code
🪰 Gadfly — live review status
9/9 reviewers finished · updated 2026-06-28 01:33:47Z
claude-code/opus · claude-code: ✅ done
claude-code/opus:max · claude-code: ✅ done
claude-code/sonnet · claude-code: ✅ done
deepseek-v4-pro:cloud · ollama-cloud: ✅ done
glm-5.1:cloud · ollama-cloud: ✅ done
glm-5.2:cloud · ollama-cloud: ✅ done
minimax-m3:cloud · ollama-cloud: ✅ done
nemotron-3-super:cloud · ollama-cloud: ✅ done
qwen3-coder:480b-cloud · ollama-cloud: ✅ done
Live status board. Findings are posted in each model's own comment. Advisory only, does not block merge.
🪰 Gadfly review: glm-5.2:cloud (ollama-cloud)
Verdict: Minor issues (3 reviewers: security, correctness, error-handling)
🔒 Security — No material issues found
VERDICT: No material issues found
I reviewed the change strictly through the security lens. Key checks:
Secret forwarding, least privilege: The caller now forwards only the four secrets the swarm uses (OLLAMA_CLOUD_API_KEY, CLAUDE_CODE_OAUTH_TOKEN, GADFLY_FINDINGS_URL, GADFLY_FINDINGS_TOKEN) instead of secrets: inherit. This is a security improvement over the prior stub; it narrows the blast radius so the reusable workflow cannot silently exfiltrate other repo secrets (e.g. GITEA_TOKEN is auto-provided by Gitea, not re-forwarded). Verified at .gitea/workflows/adversarial-review.yml:43-47.
Reusable-workflow pinning: Pinned to an immutable commit SHA 20a5c431... rather than @main (line 41), with an explanatory comment that a push to gadfly can't silently change the code that runs with forwarded secrets. This is the correct hardening for a trust boundary that receives secrets. Note: the PR title/body still say @main, but the actual YAML uses the SHA. The code is the secure version; the mismatch is cosmetic and outside the security-lens finding.
Comment-trigger authz guard preserved: The issue_comment allow-list (steve, fizi, dazed) remains at the job if: (lines 33-38), and is mirrored by the allowed_users input (line 55) for the in-container belt-and-suspenders check. No regression in the comment-based secret-bearing trigger authorization.
No new taint/injection surface: models, specialists, provider_concurrency, etc. are static literal strings passed as workflow inputs, so no untrusted user input flows into them. The only PR-derived values (github.event.*) are event context handled inside the reusable workflow, not in this diff.
secrets not echoed into logs: No ${{ secrets.* }} expansion into a non-secret context or log line; all four are mapped 1:1 into the secrets: block of the called workflow.
No security-relevant regressions introduced. The change tightens secret handling (explicit forwarding over inherit) and adds supply-chain pinning to the reusable workflow.
🎯 Correctness - No material issues found
Verdict: No material issues found (within this repo; one cross-repo risk flagged as unverified)
Findings through the correctness lens:
models: string reuses the exact same per-model format the old GADFLY_MODELS env used (minimax-m3:cloud, …, qwen3-coder:480b-cloud). The new claude-code/sonnet, claude-code/opus, claude-code/opus:max entries are new provider specs, but no claude-code provider exists in this repo. It lives in steve/gadfly, so parsing correctness is the reusable workflow's responsibility, not this caller's. Nothing in this repo's model/ package parses these (grep for claude-code returns only the workflow file itself).
adversarial-review.yml:33-38). The if: condition is unchanged and still gates issue_comment to steve/fizi/dazed, mirroring the allowed_users input.
provider_concurrency: "ollama-cloud=3,claude-code=2" and provider_lens_concurrency: "ollama-cloud=3". The lens-concurrency map has no claude-code entry, so claude-code lenses fall back to the reusable workflow's default. Whether that default is correct is defined in the external workflow (not in this repo), so I cannot confirm or refute it from here.
adversarial-review.yml:42,47: the secrets: block omits GITEA_TOKEN with the comment "GITEA_TOKEN is auto." This relies on Gitea's act_runner auto-forwarding GITEA_TOKEN to reusable workflows the way GitHub Actions auto-provides GITHUB_TOKEN. I could not verify Gitea's behavior from this checked-out repo (the reusable workflow lives in steve/gadfly, not here). If Gitea does not auto-forward it, the called workflow would lack the token needed to post the review comment, which would be a silent correctness failure. Worth a manual confirmation against the act_runner docs, but I cannot call it a bug from this repo alone.
secrets: inherit, but the diff uses an explicit 4-secret mapping (which is arguably better: least privilege). No behavior impact; just inaccurate PR text.
No blocking correctness issue is confirmable from this repository's checked-out state; the one materially risky assumption (Gitea auto-forwarding GITEA_TOKEN) is unverified because the called workflow is external.
🧯 Error handling & edge cases - Minor issues
All four findings reference the actual file content I read. Let me verify the structural claims against the repo before finalizing.
Verdict: Minor issues
.gitea/workflows/adversarial-review.yml:56: 90-min job cap is no longer caller-enforced. Verified: the old job had job-level timeout-minutes: 90, which the runner enforces unconditionally. The new job uses uses: (line 41), and GitHub/Gitea Actions do not allow timeout-minutes on a uses: job; only name/uses/with/secrets/needs/if/permissions/concurrency keys are permitted. The timeout now exists solely as with: timeout_minutes: 90 (line 56), which only takes effect if review-reusable.yml internally reads that input and applies it to its own timeout-minutes. If the callee ignores or misnames the input, jobs run unbounded up to the runner's global cap, and the PR description's "90-min job cap" becomes an unverified assumption about another repo. Suggested fix: confirm the callee binds timeout-minutes: ${{ inputs.timeout_minutes }} (or hardcodes a sane cap) at steve/gadfly/.gitea/workflows/review-reusable.yml@20a5c43; if not, this repo loses its only guaranteed runaway-job guard.
.gitea/workflows/adversarial-review.yml:45: new CLAUDE_CODE_OAUTH_TOKEN secret dependency with empty-string fallback. Verified: this secret did not exist in the old config and is newly forwarded (line 45). If it is not configured in the executus repo, ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }} evaluates to an empty string (Gitea does not error on missing secrets), which is forwarded to the callee; the three claude-code/* lanes would then fail at auth time. Depending on the callee's error handling, those failures could be swallowed per-lane (acceptable, matching the old "one model errors, others still post" contract) or could abort the whole job. Suggested fix: verify the secret exists in this repo and confirm the callee treats per-provider auth failure as non-fatal.
.gitea/workflows/adversarial-review.yml:52: provider_lens_concurrency omits claude-code. Verified: only ollama-cloud=3 is specified on line 52, while claude-code models are present in models (line 49) and claude-code=2 is present in provider_concurrency (line 51), but has no lens-concurrency entry. If the callee defaults unspecified providers to serial (1), the 3 claude-code lenses run serially per model (slow but functional); if it defaults to 0 or errors on a provider present in models but absent from the concurrency map, the claude-code lanes break. Suggested fix: either add claude-code=2 (or the desired value) here, or confirm the callee's default is 1 and not 0/error.
Automated adversarial review by Gadfly. Advisory only, does not block merge. · ⏱️ reviewed in 2m 9s
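The default-handling question in the last finding can be made concrete with a minimal sketch. It assumes the callee parses the provider=N map along these lines and falls back to serial (1) for unlisted providers; gadfly's real parser lives in another repo and was not verified by any reviewer, so parse_concurrency and lens_concurrency are hypothetical names used only for illustration.

```python
def parse_concurrency(spec: str) -> dict:
    """Parse a map such as 'ollama-cloud=3,claude-code=2' into a dict.

    Hypothetical helper: gadfly's actual parsing rules are not visible here.
    """
    limits = {}
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        provider, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected provider=N, got {item!r}")
        limits[provider.strip()] = int(value)
    return limits


def lens_concurrency(spec: str, provider: str, default: int = 1) -> int:
    """Return the lens concurrency for a provider, serial by default (an assumption)."""
    limit = parse_concurrency(spec).get(provider, default)
    if limit < 1:
        # A limit of 0 would mean "never schedule a lens", so fail loudly instead.
        raise ValueError(f"concurrency for {provider!r} must be >= 1, got {limit}")
    return limit


# The caller's current inputs, as quoted in the review.
print(lens_concurrency("ollama-cloud=3", "ollama-cloud"))
print(lens_concurrency("ollama-cloud=3", "claude-code"))
```

Under this assumed default the claude-code lanes run their lenses one at a time; adding claude-code=N to the map would make the behaviour explicit instead of relying on the callee.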
🪰 Gadfly review: claude-code/opus (claude-code)
Verdict: Minor issues (3 reviewers: security, correctness, error-handling)
🔒 Security — No material issues found
I've read the full workflow file and confirmed the diff is the only change (single commit on this branch). Let me reason through the security surface of this reusable-workflow conversion.
The new trust boundary here is that secrets now cross into an external repo's reusable workflow (steve/gadfly), so the relevant questions are: (1) is the called workflow immutable, (2) is secret forwarding least-privilege, and (3) can an untrusted actor trigger a secret-bearing run. I verified all three against the file.
VERDICT: No material issues found
This change is, through the security lens, a net improvement over the stub it replaces. Specifics I verified by reading .gitea/workflows/adversarial-review.yml:
Supply-chain pin is correct (:41). The reusable workflow is pinned to an immutable commit SHA (...a735c110), not @main. This is the right call: a push to steve/gadfly cannot silently alter code that runs with our forwarded secrets. (Note: the PR description says @main + secrets: inherit, but the actual code does neither; it pins a SHA and forwards explicitly. The code is the more secure variant; the description is just stale.)
Secret forwarding is least-privilege (:43-47). Only the four secrets the swarm uses are forwarded; inherit is not used, so unrelated repo secrets aren't handed to the external workflow. Correct.
The only untrusted-actor trigger path is guarded (:33-38). issue_comment is the lone event a non-collaborator could fire; the job if: requires the comment to be on a PR and github.actor ∈ {steve, fizi, dazed}. pull_request/workflow_dispatch fall through the ||, which is acceptable: fork-pull_request runs execute the base-branch workflow with no secrets, and workflow_dispatch requires write access. I traced the boolean and found no bypass (a plain-issue comment fails the github.event.issue.pull_request check; a non-allowlisted commenter fails the actor check).
One honest, non-blocking observation (defense-in-depth, not a defect):
The change adds CLAUDE_CODE_OAUTH_TOKEN (:45), a higher-value, broadly-scoped credential, to the set of secrets forwarded on the existing trigger paths. The protection against a fork PR seeing it rests entirely on Gitea withholding secrets from forked-PR runs (standard pull_request-vs-pull_request_target semantics). That assumption is sound on GitHub and is Gitea's documented default, but I could not verify this repo's Gitea fork/secret policy from the checked-out code, and the reusable workflow's handling of the token lives in another repo and is outside this diff. If outside contributors can open PRs here, worth confirming Gitea's fork-secret setting is at its safe default. No change required if the repo only takes trusted-branch PRs (which the steve/fizi/dazed allow-list implies).
Nothing in the security lens is materially wrong with this diff.
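The boolean traced above can be restated as a small truth-table sketch. This is not the workflow's actual if: expression, which is not reproduced on this page; should_run and its parameters are hypothetical names that only mirror the behaviour the reviewers describe (non-comment events pass, comment events need PR context plus an allow-listed actor).

```python
ALLOWED_USERS = {"steve", "fizi", "dazed"}  # allow-list quoted in the reviews


def should_run(event_name: str, actor: str, comment_is_on_pr: bool = False) -> bool:
    """Model of the job-level guard as described by the reviewers (a sketch)."""
    if event_name != "issue_comment":
        # pull_request and workflow_dispatch fall through the || branch.
        return True
    # Comment-triggered runs need PR context AND an allow-listed actor.
    return comment_is_on_pr and actor in ALLOWED_USERS


cases = [
    ("pull_request", "anyone", False),
    ("issue_comment", "steve", True),
    ("issue_comment", "steve", False),    # comment on a plain issue
    ("issue_comment", "mallory", True),   # non-allow-listed commenter
]
for event, actor, on_pr in cases:
    print(event, actor, on_pr, "->", should_run(event, actor, on_pr))
```

The two rejected rows correspond to the two bypass attempts the review rules out: a plain-issue comment and a commenter outside the allow-list.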
🎯 Correctness — Minor issues
I've verified all three findings against the actual checkout. The reusable workflow (steve/gadfly/.gitea/workflows/review-reusable.yml) is not present in this repo, so its inputs: contract and the model-spec parser cannot be confirmed here; those two findings are unverifiable and drop out. Only the dropped vars.GADFLY_SPECIALISTS override is positively confirmable from the file.
This is a thin CI caller change that delegates to an external reusable workflow (steve/gadfly/.gitea/workflows/review-reusable.yml), which is not present in this checkout. Input names/types and the model-spec parser live in that external repo and cannot be verified here.
VERDICT: Minor issues
Findings (within the Correctness lens)
.gitea/workflows/adversarial-review.yml:50. The old stub computed GADFLY_SPECIALISTS: ${{ vars.GADFLY_SPECIALISTS || 'security,correctness,error-handling' }}, letting a repo variable override the lens suite. The new caller hardcodes specialists: "security,correctness,error-handling" and no longer reads the var, so within this repo the GADFLY_SPECIALISTS repo var is now a no-op. Likely intentional (the PR centralizes tuning), but it is a behavioral change for anyone relying on that var. Fix if undesired: specialists: ${{ vars.GADFLY_SPECIALISTS || 'security,correctness,error-handling' }}.
Notes (not code-correctness bugs)
uses: ...@main, but the actual code correctly pins @20a5c431... (matching the commit message). Code is right; the description is stale.
EVENT_NAME, PR, COMMENT_BODY, etc.) is no longer forwarded; a workflow_call reusable workflow inherits the caller's github event context, so this is correct provided the reusable workflow derives PR/comment data itself (standard; not verifiable from this repo).
Net: the one verified change in this repo is the lost vars.GADFLY_SPECIALISTS override. The remaining risk is the input contract with the external reusable workflow (timeout_minutes type, the models parser), which is unverifiable here and worth confirming against the pinned SHA before merging.
🧯 Error handling & edge cases - No material issues found
The reusable workflow it now calls (steve/gadfly/.gitea/workflows/review-reusable.yml) lives in another repo and isn't present in this checkout, so its input/secret contract can't be verified from here. I confirmed the assembled file (above) and reviewed the unhappy paths this diff introduces.
VERDICT: No material issues found
This is a config-only change that delegates essentially all unhappy-path handling (model-offline errors, findings-POST failures that "only log to stderr," per-lens timeouts) to gadfly's reusable workflow. Within the error-handling & edge-cases lens there is nothing I can confirm is materially wrong. The job-level if: guard correctly skips the whole job (and thus secret forwarding) for untrusted issue_comment actors, which is a clean unhappy-path; if: alongside uses: at job level is valid.
Two minor edge-case notes, neither blocking:
adversarial-review.yml:56: timeout_minutes: 90 is passed unquoted (a YAML number) while every other with: input is a quoted string. If the reusable workflow declares this input as type: string, the type mismatch could cause the caller to fail validation rather than degrade gracefully. Unverified: I cannot read the reusable workflow's inputs: schema to confirm the declared type. Worth a glance at gadfly's side before merge; quoting "90" to match the others is the safe move if it's a string input.
adversarial-review.yml:45: adding the claude-code/* models introduces a new dependency on CLAUDE_CODE_OAUTH_TOKEN. If that secret isn't configured in this repo, it forwards as an empty string and the three new Claude Code lanes will hit an auth-failure path. Per the prior design ("that model's comment shows an error and the others still post") this is advisory and isolated, so it degrades rather than breaks the run, but it's a newly-added failure surface that depends on a secret existing. Confirm the secret is set; otherwise three of nine lanes silently error every run.
Both are caveats, not defects in the diff itself. The error-handling logic for these paths now lives in gadfly's reusable workflow, outside this repo.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 5m 4s
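The empty-secret caveat raised in this review could be caught early with a preflight check. The sketch below is illustrative only: it assumes the four forwarded secrets arrive as environment variables under the same names, which none of the reviews could confirm for gadfly's container, and missing_secrets is a hypothetical helper rather than part of gadfly.

```python
import os
from typing import Iterable, List, Mapping

# The four secrets the caller forwards, as listed in the reviews.
REQUIRED_SECRETS = (
    "OLLAMA_CLOUD_API_KEY",
    "CLAUDE_CODE_OAUTH_TOKEN",
    "GADFLY_FINDINGS_URL",
    "GADFLY_FINDINGS_TOKEN",
)


def missing_secrets(env: Mapping[str, str],
                    required: Iterable[str] = REQUIRED_SECRETS) -> List[str]:
    """Return the names that are unset or empty.

    An unconfigured secret is forwarded as an empty string, so an empty
    value is treated the same as a missing one.
    """
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    absent = missing_secrets(os.environ)
    if absent:
        # Report by name only; never print secret values.
        print("unset or empty secrets: " + ", ".join(absent))
```

Reporting the names up front turns an opaque per-lane auth failure into a one-line configuration message.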
🪰 Gadfly review: minimax-m3:cloud (ollama-cloud)
Verdict: Minor issues (3 reviewers: security, correctness, error-handling)
🔒 Security — No material issues found
VERDICT: No material issues found
(No surviving findings: every claim in the draft is either explicitly self-qualified as "Unverified" by the reviewer (depending on external systems, the upstream steve/gadfly repo, or a PR description not present in the diff), or is explicitly acknowledged as "not a defect", "no regression", "pre-existing", or a "posture change" rather than an introduced defect. None can be confirmed as a defect against the actual code in .gitea/workflows/adversarial-review.yml.)
🎯 Correctness - No material issues found
Now I have everything I need to assess through the correctness lens.
VERDICT: No material issues found
steve/gadfly/.gitea/workflows/review-reusable.yml@<sha>) replaces the inlined stub; the inputs forwarded (models, specialists, provider_concurrency, provider_lens_concurrency, timeout_secs, max_steps, allowed_users, timeout_minutes) match the prior inlined GADFLY_* env values 1:1 in semantic content, with no missing knobs.
if: for issue_comment requiring actor ∈ {steve, fizi, dazed}) is preserved and redundant with allowed_users: "steve,fizi,dazed" (intentional belt-and-suspenders, as the comment notes).
secrets: block forwards exactly the four keys the swarm uses; GITEA_TOKEN is auto-forwarded by Gitea for reusable workflows (caller claim is plausible and standard for this runner family; not a correctness issue either way).
gadfly-${{ github.event...pr_number }} is unchanged from the prior stub, so PR-level cancellation semantics are preserved.
claude-code/sonnet, claude-code/opus, claude-code/opus:max; nothing in this repo references any of these model names elsewhere (verified via grep), so there's no caller to break.
Nothing in the correctness lens is materially wrong with this change.
🧯 Error handling & edge cases — Minor issues
I don't have access to the PR description (and there's no PR description in this repo context — only the workflow file itself plus git history). Let me check git log and verify each claim against the actual code carefully.
Verification of each finding:
Finding 1 (lost documented graceful-degradation contract): The previous file did document the per-Mac offline tolerance. The new file forwards nothing about graceful degradation. This is a documented semantic loss. Confirmed.
Finding 2 (silent-empty CLAUDE_CODE_OAUTH_TOKEN): Real. Line 45 forwards secrets.CLAUDE_CODE_OAUTH_TOKEN with no if: guard. If the secret is missing or misnamed, Gitea expands to "". Confirmed as a legitimate concern, and the reviewer correctly marks the actual secret name as unverifiable from this repo alone.
Finding 3 (provider_lens_concurrency only set for ollama-cloud): Confirmed from line 52. The reviewer correctly marks this as unverifiable without the reusable workflow's default.
Finding 4 (model-string ambiguity): Confirmed from line 49. The reviewer correctly marks this as unverifiable; it depends on the reusable workflow's parser.
Finding 5 (timeout_minutes as input vs. job-level): Confirmed from line 56. The reviewer correctly marks this as unverifiable; it depends on the reusable workflow declaring and honoring the input.
Finding 6 (PR description vs. diff): I have no access to the PR description from this repo (no PR template, no comments). The draft reviewer even says "PR description vs. diff" but I cannot independently verify what the PR description claims. This is unconfirmable from this repo's code alone: the workflow file is the source of truth and uses explicit forwarding, but the PR description claim isn't checkable here. I should drop it as unconfirmed since I cannot positively confirm the description actually says what the reviewer claims.
VERDICT: Minor issues
.gitea/workflows/adversarial-review.yml:39-55: lost documented graceful-degradation contract. The prior stub documented per-model error tolerance ("If a Mac is offline that model's comment shows an error and the others still post") and a partial-comment posting guarantee. The new caller has no equivalent; failure semantics are entirely whatever the reusable workflow decides. If gadfly's reusable workflow hard-fails on any single model/lens error (transient 429, Ollama Cloud blip, single claude-code 5xx), the whole PR review goes red where it previously landed as a best-effort partial result. This is a real change in error-handling semantics the PR description doesn't acknowledge. Suggest confirming the reusable workflow's contract (or, if it doesn't expose one, treating single-model failures as advisory noise).
.gitea/workflows/adversarial-review.yml:45: silent-empty CLAUDE_CODE_OAUTH_TOKEN on misconfiguration. If the repo secret is missing or named differently (ANTHROPIC_API_KEY, CLAUDE_CODE_TOKEN, etc.), Gitea expands secrets.CLAUDE_CODE_OAUTH_TOKEN to the empty string and the caller forwards it as such. There is no if: guard, no continue-on-error, and no default. The claude-code lane will fail with an opaque auth error and, combined with the prior finding, may collapse the whole job. This is the first reference to a Claude-Code secret in this workflow, so it is worth a one-time check that the name matches exactly what gadfly's reusable workflow expects (or document the expected secret name in the comment).
Automated adversarial review by Gadfly. Advisory only, does not block merge. · ⏱️ reviewed in 3m 10s
🪰 Gadfly review: glm-5.1:cloud (ollama-cloud)
Verdict: Minor issues (3 reviewers: security, correctness, error-handling)
🔒 Security — No material issues found
VERDICT: No material issues found
Verification summary: I read the full current workflow file and cross-referenced it against the diff. From a security lens:
@main: The reusable workflow reference @20a5c431f22d43bab0f711c14ce74de3a735c110 is an immutable commit SHA. The inline comment explicitly notes this prevents a push to gadfly from silently changing the code that runs with forwarded secrets. This is the correct practice.
secrets: inherit, but the actual code explicitly forwards only the 4 secrets the swarm needs (OLLAMA_CLOUD_API_KEY, CLAUDE_CODE_OAUTH_TOKEN, GADFLY_FINDINGS_URL, GADFLY_FINDINGS_TOKEN). This is strictly better than secrets: inherit, with no secret leakage to the called workflow beyond what's needed.
if: condition restricting issue_comment triggers to steve, fizi, dazed is unchanged from the prior version, preventing unauthorized users from triggering secret-bearing runs.
env: (broadly visible to any process in the container). The new reusable-workflow approach passes secrets through Gitea's native secret mechanism, which is at least as constrained.
with: inputs are non-sensitive: All forwarded inputs (models, specialists, timeout_secs, etc.) are configuration, not credentials. They are no more exposed than the prior env: block.
No security issues found. The change is a net improvement over the previous self-contained Docker image approach.
🎯 Correctness — No material issues found
VERDICT: No material issues found
The change replaces an inline Docker-based Gadfly step with a call to a reusable workflow, forwarding equivalent configuration through with: inputs and explicit secrets:. I verified the final file and checked for several potential correctness concerns:
PR, EVENT_NAME, etc.): The old step injected these as env vars; the reusable workflow receives the same github context natively, so no information is lost.
GITEA_TOKEN: The old config explicitly mapped it; the new config omits it with a comment saying "auto." In Gitea Actions (mirroring GitHub Actions' GITHUB_TOKEN), GITEA_TOKEN is automatically injected into reusable workflow runs; verified as not missing.
GADFLY_SPECIALISTS variable override: The old config fell back to ${{ vars.GADFLY_SPECIALISTS || '...' }}; the new config hardcodes the value via with:. This is a deliberate trade-off documented in the PR description (centralized tuning lives in the reusable workflow). Whether the reusable workflow re-exposes this override is in its repo, not here, so it is not a bug in this caller.
provider_lens_concurrency lacks claude-code: Only ollama-cloud=3 is specified; claude-code is absent. This is consistent with the PR description's intent (Claude Code runs with default/serial lens concurrency) and the old config which also only specified ollama-cloud. Not a regression.
secrets: inherit mismatch with PR description: The PR description mentions secrets: inherit, but the actual code uses explicit per-secret forwarding, which is the more secure choice. Not a correctness issue.
No logic bugs or incorrect behavior identified in the change.
🧯 Error handling & edge cases — Minor issues
Now I have all the information I need. Let me verify each finding:
Finding 1 (Silent loss of vars.GADFLY_SPECIALISTS override fallback): Confirmed. The old workflow at the deleted line had GADFLY_SPECIALISTS: ${{ vars.GADFLY_SPECIALISTS || 'security,correctness,error-handling' }}, which allowed repo/org-level variable overrides. The new workflow hardcodes specialists: "security,correctness,error-handling" at line 50 with no vars.* fallback. This is a real regression.
Finding 2 (Event context no longer explicitly passed): The draft itself marks this "(unverified)." The reusable workflow (steve/gadfly/.gitea/workflows/review-reusable.yml) is external and not present in this repository. The standard pattern for reusable workflows is to derive context from github.* directly, and the reusable workflow is specifically designed for this purpose. Without evidence of a problem, this is speculative. Drop.
Finding 3 (No native timeout-minutes job guarantee): Also marked "(unverified)." Same situation: whether the reusable workflow applies timeout_minutes as a timeout-minutes job key cannot be confirmed from this repo. The caller explicitly passes timeout_minutes: 90, strongly implying the reusable workflow is designed to honor it. Drop.
VERDICT: Minor issues
.gitea/workflows/adversarial-review.yml:50: Silent loss of vars.GADFLY_SPECIALISTS override fallback. The old workflow used ${{ vars.GADFLY_SPECIALISTS || 'security,correctness,error-handling' }}, allowing repo/org-level variable overrides (e.g., "all", "auto", or an extended lens list). The new workflow hardcodes specialists: "security,correctness,error-handling" with no vars.* fallback. Any existing GADFLY_SPECIALISTS repo variable is now silently ignored, which could cause unexpected behavior (e.g., a deliberately broadened or narrowed specialist set is dropped without warning). Fix: add the variable fallback back, e.g. specialists: ${{ vars.GADFLY_SPECIALISTS || 'security,correctness,error-handling' }}.
Automated adversarial review by Gadfly. Advisory only, does not block merge. · ⏱️ reviewed in 3m 7s
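The difference between the old vars.GADFLY_SPECIALISTS || '...' fallback and the new hardcoded value can be sketched as follows. This only models the expression's fallback behaviour (an unset or empty variable yields the default); resolve_specialists is a hypothetical name, and how gadfly interprets values such as "all" or "auto" is not covered here.

```python
from typing import List, Optional

DEFAULT_SPECIALISTS = "security,correctness,error-handling"


def resolve_specialists(repo_var: Optional[str],
                        default: str = DEFAULT_SPECIALISTS) -> List[str]:
    """Mimic `${{ vars.GADFLY_SPECIALISTS || '<default>' }}` (a sketch).

    An unset or empty repo variable is falsy, so the default applies.
    """
    raw = repo_var or default
    return [name.strip() for name in raw.split(",") if name.strip()]


# Old stub: the repo variable, when set, wins over the default.
print(resolve_specialists("security,performance"))
print(resolve_specialists(None))

# New caller: the value is hardcoded, so the repo variable is never consulted.
print(resolve_specialists(None, DEFAULT_SPECIALISTS))
```

Any repo that had set the variable sees the first behaviour disappear after this change, which is the regression the finding describes.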
🪰 Gadfly review: claude-code/sonnet (claude-code)
Verdict: Minor issues (3 reviewers: security, correctness, error-handling)
🔒 Security — No material issues found
No material issues found
This change is a net security improvement. My per-finding verification:
SHA pinning (@20a5c431f22d43bab0f711c14ce74de3a735c110): verified in the live file (line 41). Pinning to an immutable commit prevents a push to steve/gadfly from silently running different code against the forwarded secrets. Correct practice.
Explicit secret forwarding: verified. Lines 43–47 enumerate exactly 4 secrets. The commit message confirms this replaced an earlier secrets: inherit that was forwarding all repo secrets (registry, Komodo, Discord, DB creds). Scope is now least-privilege.
GITEA_TOKEN not listed: not a gap. The commit message clarifies: "GITEA_TOKEN is the automatic job token (github.token in the reusable)." The automatic job token IS implicitly available to called reusable workflows; the calling workflow grants issues: write + pull-requests: write (line ~22 of the final file), which is sufficient to post review comments. No named PAT is being silently dropped.
issue_comment guard: verified. The if: expression (lines 33–38) still gates comment-triggered runs to steve, fizi, dazed. Belt-and-suspenders with the allowed_users input passed to the reusable.
with: inputs: all static strings/integers; none interpolate user-controlled data. No injection surface.
PR description inaccuracy (informational): The description says "flow via secrets: inherit" but the committed code uses explicit forwarding. The description appears stale from an earlier draft. The code is more secure than described, so this is not a concern in the other direction.
🎯 Correctness - Minor issues
All three findings are confirmed against the actual file. Here is the corrected review:
Verdict: Minor issues
1. timeout_minutes demoted from runner enforcement to application input
In the old file, timeout-minutes: 90 was a first-class Gitea Actions job property: the runner would hard-kill the job at 90 minutes regardless of what the container was doing. In the new file, timeout_minutes: 90 (line 56) is passed as a with: input to the reusable workflow. It is now just a parameter the reusable workflow reads, not a runner-enforced ceiling.
Whether the effective hard timeout is preserved depends entirely on whether review-reusable.yml plumbs this value into its own job-level timeout-minutes: property. If it uses a fixed internal timeout or ignores the input, this repo silently loses its runner-enforced 90-min cap.
Suggested fix: verify that review-reusable.yml wires inputs.timeout_minutes into its job timeout-minutes:. If it doesn't, there is no workaround from the caller side without a patch to gadfly.
2. provider_lens_concurrency has no claude-code entry (line 52)
provider_lens_concurrency: "ollama-cloud=3" sets 3 concurrent lenses for cloud models but says nothing for claude-code. With 3 lenses (security, correctness, error-handling) and whatever default applies, claude-code models may run lenses serially. At 600s per lens that is 30 min per model; with 2 concurrent models (provider_concurrency: "claude-code=2", line 51) and 3 total, you get two serial batches totaling ~60 min, which is within the cap. So this is not a timing correctness failure, but if the intent is lens-parallel execution for claude-code (as it is for ollama-cloud), the entry claude-code=3 is missing. Could be intentional (rate-limit caution) but differs from how ollama-cloud is configured.
3. vars.GADFLY_SPECIALISTS override removed: likely intentional but a behavioral regression (line 50)
Old code: GADFLY_SPECIALISTS: ${{ vars.GADFLY_SPECIALISTS || 'security,correctness,error-handling' }}. New code hardcodes specialists: "security,correctness,error-handling" (line 50). If anyone had set GADFLY_SPECIALISTS as a repo variable to temporarily change the lens suite (e.g. for testing), that knob is now gone. The PR description frames this as intentional ("mirrors gadfly's dogfood config"), but it is a real behavioral change worth acknowledging.
🧯 Error handling & edge cases - Minor issues
The finding is confirmed. The review job at line 29 has no timeout-minutes: property, only with: timeout_minutes: 90 at line 56, which is an input passed to the reusable workflow, not a job-level timeout guard.
Minor issues
Job-level timeout guard absent on the calling job
.gitea/workflows/adversarial-review.yml:41
The calling review job has no timeout-minutes: property. The value at line 56, timeout_minutes: 90, is an input forwarded into the reusable workflow; it does not set the caller job's own deadline. If the reusable workflow's internal timeout logic fails or is bypassed, the caller job runs until the runner's platform default (often 6 hours).
The prior self-contained job carried timeout-minutes: 90 directly on the job. Restoring it as a belt-and-suspenders guard at the caller layer costs nothing and closes the failure mode regardless of how the reusable workflow consumes the input.
Everything else looks clean: the if condition and allowed_users input mirror each other correctly; cancel-in-progress: true is preserved; the concurrency group expression handles all three trigger types; and explicit secret forwarding is strictly more defensive than secrets: inherit.
Automated adversarial review by Gadfly. Advisory only, does not block merge. · ⏱️ reviewed in 5m 54s
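The ~60 min estimate in the provider_lens_concurrency finding of this review can be reproduced with a short worst-case calculation. It is a rough upper bound that assumes every lens runs to its full 600 s timeout and that models and lenses are scheduled in fixed waves; gadfly's actual scheduler is not visible from this repo, and worst_case_minutes is a hypothetical helper.

```python
import math


def worst_case_minutes(models: int, lenses: int, lens_timeout_secs: int,
                       model_concurrency: int, lens_concurrency: int) -> float:
    """Upper bound on wall-clock minutes for one provider (wave-based sketch)."""
    lens_waves = math.ceil(lenses / lens_concurrency)
    per_model_secs = lens_waves * lens_timeout_secs
    model_waves = math.ceil(models / model_concurrency)
    return model_waves * per_model_secs / 60


# claude-code as configured: 3 models, 3 lenses, 600 s per lens,
# provider_concurrency claude-code=2, lens concurrency assumed serial (1).
print(worst_case_minutes(3, 3, 600, 2, 1))  # 60.0

# Same fleet if claude-code=3 were added to provider_lens_concurrency.
print(worst_case_minutes(3, 3, 600, 2, 3))  # 20.0
```

With serial lenses the bound sits at 60 minutes against the 90-minute cap, so the margin depends on the cap actually being enforced by the callee.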
🪰 Gadfly review: deepseek-v4-pro:cloud (ollama-cloud)
Verdict: Minor issues (3 reviewers: security, correctness, error-handling)
🔒 Security — No material issues found
Verdict: No material issues found
The security posture of this change is sound. I verified the full file at .gitea/workflows/adversarial-review.yml:1-56 and cross-checked against the sibling ci.yml for secret exposure patterns. Here's what I confirmed:
Immutable pin: uses: references a full 40-char commit SHA (20a5c431f22d43bab0f711c14ce74de3a735c110), not a branch or tag. A push to steve/gadfly cannot silently change the code that receives the forwarded secrets.
No secrets: inherit: Only 4 secrets are explicitly forwarded (lines 44-47). The old version also used explicit secret mapping (env vars), so this is parity. The CLAUDE_CODE_OAUTH_TOKEN is new but necessary for the Claude Code engine; it wasn't exposed before because Claude Code wasn't in the fleet.
GITEA_TOKEN not forwarded: The old version passed GITEA_TOKEN as an env var; the new version omits it, relying on the automatic github.token scoped by the permissions: block (contents: read, issues: write, pull-requests: write). This is a net security improvement — the reusable workflow gets a job-scoped token rather than a potentially broader repo secret.
if: guard preserved identically (lines 33-38): Same allowlist (steve, fizi, dazed), same logic (non-comment events always pass; comment events require PR context + allowlisted actor). The guard is evaluated before the reusable workflow is invoked, so secrets are never forwarded to an untrusted caller.
No user-controlled input flows to the reusable workflow: All with: inputs are hardcoded static strings. The old version passed COMMENT_BODY, ACTOR, etc. as env vars (user-controlled for comment events); the new version passes none of those, eliminating that injection surface entirely.
Permissions unchanged: contents: read, issues: write, pull-requests: write — the minimum needed for a PR reviewer that posts comments. No escalation.
No material security issues in this lens.
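The guard logic described above (non-comment events always pass; comment events need PR context plus an allowlisted actor) can be modelled as a small predicate. This is only an illustrative sketch in Python, not the workflow's actual `if:` expression; the function name `should_run` and its parameters are hypothetical, and only the allowlist and the rule itself come from the review.

```python
# Allowlist as quoted in the review; everything else here is illustrative.
ALLOWED_USERS = {"steve", "fizi", "dazed"}


def should_run(event_name: str, actor: str, comment_is_on_pr: bool = False) -> bool:
    """Model of the job-level guard as the review describes it (hypothetical helper)."""
    if event_name != "issue_comment":
        # Non-comment triggers (pull_request, workflow_dispatch) always pass.
        return True
    # Comment triggers require PR context and an allowlisted actor.
    return comment_is_on_pr and actor in ALLOWED_USERS
```

Under this model, a comment from an actor outside the allowlist, or a comment on a plain issue, never reaches the step where secrets are forwarded.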
🎯 Correctness — No material issues found
Verdict: No material issues found
🧯 Error handling & edge cases — Minor issues
Now let me verify each finding against the actual code:
Finding 1 (provider_lens_concurrency asymmetry): Confirmed. Line 51 has provider_concurrency: "ollama-cloud=3,claude-code=2" but line 52 has provider_lens_concurrency: "ollama-cloud=3" — claude-code is indeed omitted from lens concurrency. The asymmetry is real.
Finding 2 (GADFLY_SPECIALISTS dropped): Confirmed. Grep for GADFLY_SPECIALISTS and vars.GADFLY returns zero matches. The old vars.GADFLY_SPECIALISTS || ... fallback pattern is gone; line 50 hardcodes the specialists string.
Finding 3 (CLAUDE_CODE_OAUTH_TOKEN): Confirmed. The old config (visible in the diff's removed lines) had no reference to this secret. The new config forwards it at line 45. It's a new secret dependency.
Finding 4 (timeout_minutes unverified): The draft itself states "I cannot verify whether the reusable workflow... actually defines timeout_minutes as an input." I likewise cannot verify this — the reusable workflow lives in a different repo (steve/gadfly). Per instructions: drop what cannot be confirmed.
Verdict: Minor issues
provider_lens_concurrency omits claude-code (adversarial-review.yml:52): provider_concurrency specifies claude-code=2, but provider_lens_concurrency only sets ollama-cloud=3 with no entry for claude-code. The three Claude Code models (sonnet, opus, opus:max) will therefore run with whatever default lens concurrency the reusable workflow applies to unspecified providers — likely 1 (serial). With 3 lenses × 600s each = 30 min per model, and 3 models running 2-at-a-time, the Claude Code lane alone could consume ~60 min, brushing against the 90-min job cap and risking flaky timeouts. Add ,claude-code=3 (or whatever the intended parallelism is) to provider_lens_concurrency.
GADFLY_SPECIALISTS repo variable silently dropped (adversarial-review.yml:50): The old config read ${{ vars.GADFLY_SPECIALISTS || 'security,correctness,error-handling' }}, allowing repo-level override of the lens suite. The new config hardcodes specialists: "security,correctness,error-handling" with no fallback. If GADFLY_SPECIALISTS was set in the repo (e.g., to "all" or a custom subset), it is now silently ignored — the review runs with a different suite than the operator intended, with no warning. (Confirmed: no remaining references to GADFLY_SPECIALISTS or vars.GADFLY exist in the repo.)
CLAUDE_CODE_OAUTH_TOKEN may not exist yet (adversarial-review.yml:45): The old config did not forward this secret; the new config does. If the secret hasn't been created in the repo's settings, the workflow will fail at invocation time with a "secret not found" error — a deployment ordering hazard. This is a one-time setup risk, not a code bug, but worth flagging since the PR doesn't mention it as a prerequisite.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 5m 35s
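The ~60 min estimate for the Claude Code lane in the first finding can be reproduced with a short worst-case calculation. This is a rough sketch, not gadfly's scheduler: it assumes every lens runs to its full timeout, that work proceeds in whole "waves", and that the default lens concurrency is 1, which is the reviewer's guess rather than a confirmed default. The helper name `worst_case_minutes` is hypothetical.

```python
import math


def worst_case_minutes(models: int, model_concurrency: int,
                       lenses: int, lens_concurrency: int,
                       lens_timeout_secs: int) -> float:
    """Upper bound on lane wall-clock time if every lens hits its timeout."""
    model_waves = math.ceil(models / model_concurrency)  # batches of models
    lens_waves = math.ceil(lenses / lens_concurrency)    # batches of lenses per model
    return model_waves * lens_waves * lens_timeout_secs / 60


# 3 Claude Code models, 2 at a time, 3 lenses, 600 s per lens.
serial_lenses = worst_case_minutes(3, 2, 3, 1, 600)    # assumed default: serial lenses
parallel_lenses = worst_case_minutes(3, 2, 3, 3, 600)  # with claude-code=3 added

print(serial_lenses, parallel_lenses)  # 60.0 20.0
```

Under these assumptions, serial lenses put the lane at 60 of the 90 available minutes, while running the three lenses in parallel would bring the bound down to 20.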
🪰 Gadfly review — nemotron-3-super:cloud (ollama-cloud)
Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
No material issues found
🎯 Correctness — No material issues found
No material issues found
🧯 Error handling & edge cases — No material issues found
No material issues found
The change replaces a self-contained workflow with a reusable workflow call, improving security by reducing exposed secrets and using an immutable reference. Through the error handling & edge cases lens:
Conditionals: The if: condition correctly limits secret-bearing workflow runs to trusted users for issue_comment events while allowing other triggers (PR events, workflow_dispatch) unconditionally. Verified by reading the condition logic and testing edge cases (non-PR comments, untrusted actors) - no gaps found.
Secrets handling: The secrets: block now forwards only four specific secrets (OLLAMA_CLOUD_API_KEY, CLAUDE_CODE_OAUTH_TOKEN, GADFLY_FINDINGS_URL, GADFLY_FINDINGS_TOKEN) instead of secrets: inherit, preventing overexposure of registry/Komodo/DB credentials. Verified by comparing old/new blocks - no excess secrets remain.
Inputs: All with: parameters are hardcoded, non-empty strings with sensible values (concurrency limits, timeouts, allowed users). No zero/negative values or empty strings that could cause edge-case failures in the called workflow.
Error propagation: The reusable workflow call lacks continue-on-error, meaning failures will properly fail the job. This is appropriate since the review is advisory - a broken reviewer should fail visibly rather than silently post nothing.
Concurrency/grouping: The concurrency: block uses a safe expression for the group ID that will always resolve to a non-empty value for triggered events (verified via event context inspection).
Timeouts: Both job-level (timeout_minutes: 90) and per-lens (timeout_secs: "600") timeouts are set to reasonable, positive values.
No missing error handling, cleanup gaps, or edge cases (nil/empty/overflow) were detected in the changed file. The change improves security posture without introducing error handling regressions.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 3m 2s
🪰 Gadfly review — qwen3-coder:480b-cloud (ollama-cloud)
Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
No material issues found.
The changes in this PR improve security by:
secrets: inherit
The workflow now consumes a reusable workflow from steve/gadfly with explicit secret forwarding and updated model configurations. All security-relevant configurations appear to be properly constrained.
🎯 Correctness — No material issues found
No material issues found
The changes appear to correctly transition from a self-contained Gadfly workflow to using a reusable workflow from steve/gadfly. The configuration parameters (models, concurrency settings, timeouts, etc.) are properly mapped from the old environment variables to the new workflow inputs. The secret handling has been improved with least-privilege principles, and the workflow pinning to a specific commit hash ensures immutability. No correctness issues were identified in the semantic mapping of the workflow configuration.
🧯 Error handling & edge cases — No material issues found
No material issues found
The changes in this PR replace the local Gadfly workflow implementation with a call to a reusable workflow. Through the error handling & edge cases lens:
The diff shows a clean replacement of one workflow implementation with another, maintaining the same core functionality and safeguards. No error handling or edge case issues were introduced.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 38s
🪰 Gadfly review — claude-code/opus:max (claude-code)
Verdict: Minor issues — 3 reviewers: security, correctness, error-handling
🔒 Security — Minor issues
I've verified the actual file at .gitea/workflows/adversarial-review.yml. It matches the diff exactly. Checking each finding against the real source:
uses: steve/gadfly/.gitea/workflows/review-reusable.yml@20a5c431... — a 40-hex-char SHA pin on the entry-point reusable workflow only. Confirmed. steve/gadfly is not in this checkout, so the transitive-pin caveat is genuinely unverifiable here, exactly as the draft states.
CLAUDE_CODE_OAUTH_TOKEN is forwarded; the pre-diff env block had no such secret, so it is newly exposed. Confirmed.
if: guard gates only issue_comment; pull_request (lines 9–10: opened/reopened/ready_for_review) runs with no author check. Confirmed.
secrets: block (43–47) is explicit, not inherit, scoped to four entries; job-level if: restricts issue_comment to steve/fizi/dazed, mirroring allowed_users (line 55). Confirmed.
All three findings are accurately grounded in the actual code, and each is honestly scoped about what could not be verified. They all survive.
VERDICT: Minor issues
This change is, on balance, a security improvement (explicit least-privilege secrets: instead of broad mapping, full 40-char commit SHA pin). The findings below are caveats, not blockers.
Findings
adversarial-review.yml:41 — the SHA pin only pins the entry-point file, not the transitive supply chain. The PR's headline security claim is "a push to gadfly can't silently change the code that runs with our forwarded secrets." That holds only if review-reusable.yml at that SHA itself pins everything it pulls in — its container image (e.g. docker://…/gadfly:<tag>) and any nested uses:. If the reusable workflow internally references :latest or @main (the old stub here literally ran docker://…/gadfly:sha-d7f364d and noted "act_runner caches :latest"), then a push to gadfly can still change the code that executes with all four forwarded secrets. I could not verify this — steve/gadfly isn't in this checkout. Suggest confirming review-reusable.yml@20a5c43 pins its own image/actions by digest before treating the pin as a true immutability guarantee.
adversarial-review.yml:43-47 — CLAUDE_CODE_OAUTH_TOKEN is a newly-exposed secret to external code. The prior version never forwarded it. The four secrets (Ollama key, Claude OAuth token, findings URL+token) now all flow into an external repo's workflow. This is an accepted/documented same-org trust relationship and is mitigated by the pin (subject to the caveat above), but worth stating plainly: a compromise of that gadfly commit or its build artifacts exfiltrates all four. The Claude OAuth token in particular is broader-blast-radius than an inference API key.
adversarial-review.yml:33-38 (note, mostly pre-existing) — the if: guard only gates issue_comment; pull_request (opened/reopened/ready_for_review) runs unconditionally. The comment asserts pull_request is "already trusted." On GitHub semantics that's safe (fork pull_request runs get no secrets); on Gitea/act_runner, whether secrets are withheld from fork PRs is an instance-config question I can't verify from the repo. This trigger model is unchanged by the diff, but this PR raises the stakes by routing more/broader secrets through it.
Mitigating factor confirmed from the design: gadfly reviews the PR diff as data and runs pinned gadfly code — it does not execute the PR's build — so direct code-exec exfiltration via a malicious PR is not the threat here; the residual risk is limited to whatever the reusable workflow itself does with the secrets. If this Gitea instance accepts fork PRs, consider gating pull_request by author as well, or relying solely on pull_request_target-style trusted context.
Verified adversarial-review.yml in full: SHA at line 41 is a complete 40-hex-char commit ref (immutable as a ref). secrets: block is explicit (not inherit) and scoped to four entries — least-privilege claim is accurate. Job-level if: correctly restricts issue_comment to steve/fizi/dazed, mirroring the allowed_users input.
Nothing here blocks merge. The one item worth resolving before relying on the security posture as described is the transitive-pin caveat on the reusable workflow.
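The "complete 40-hex-char commit ref" check above is easy to express mechanically. A minimal sketch follows, assuming the `path@ref` form that `uses:` values take in this thread. The helper name `is_sha_pinned` is hypothetical, and it inspects only the ref's syntax, so it says nothing about the transitive pins (container image, nested `uses:`) raised in the first finding.

```python
import re

# A full Git commit SHA-1 is exactly 40 lowercase hex characters.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def is_sha_pinned(uses: str) -> bool:
    """True if the ref after the last '@' is a full 40-hex commit SHA."""
    _, sep, ref = uses.rpartition("@")
    return bool(sep) and FULL_SHA.fullmatch(ref) is not None


pinned = "steve/gadfly/.gitea/workflows/review-reusable.yml@20a5c431f22d43bab0f711c14ce74de3a735c110"
print(is_sha_pinned(pinned))                                                    # True
print(is_sha_pinned("steve/gadfly/.gitea/workflows/review-reusable.yml@main"))  # False
```

An abbreviated SHA such as `20a5c43` is rejected on purpose, since a short prefix is not guaranteed to stay unambiguous.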
🎯 Correctness — No material issues found
I've verified everything verifiable in this checkout. The substantive correctness risks all live in the contract of steve/gadfly/.gitea/workflows/review-reusable.yml@20a5c43, which is a different repository and is not present in this checkout — so I cannot confirm input names, types, or secret declarations against it. I'll be explicit about what I checked vs. what I couldn't.
VERDICT: No material issues found (within the Correctness lens)
The diff is internally coherent and the YAML is well-formed. Notes below; none rise to a confirmed logic bug in code I can read.
Verified correct (by reading the file + reasoning from Actions reusable-workflow semantics):
Removing runs-on:, steps:, env:, and job-level timeout-minutes: is required when a job becomes a reusable-workflow caller (uses:) — those keys are illegal on such a job. Passing timeout_minutes: 90 as a with: input is the correct workaround, not a regression. The if:, permissions:, concurrency:, with:, and secrets: keys are all valid on a calling job. (adversarial-review.yml:41-56)
The removed env values (PR, PR_BRANCH, COMMENT_BODY, ACTOR, GITEA_API, …) are derivable inside the called workflow: in a reusable-workflow call the github context resolves to the caller's triggering event/repo, so the reusable workflow can reconstruct them. No data is lost by removing them here — provided the reusable workflow actually does so (not verifiable from this repo).
The existing model entries at :49 are byte-for-byte identical to the prior GADFLY_MODELS; this PR only adds the three claude-code/* entries, so no pre-existing model-name oddity is introduced here.
Could NOT verify (cross-repo contract — flag for the author, not findings against this diff):
The input names models, specialists, provider_concurrency, provider_lens_concurrency, timeout_secs, max_steps, allowed_users, timeout_minutes must match the inputs: block declared in gadfly's review-reusable.yml. A typo or type mismatch would fail the run. Note the inconsistent typing on the caller side: timeout_secs/max_steps are quoted strings ("600", "14") but timeout_minutes is a bare number (90) — each only works if it matches the reusable workflow's declared input type. Worth confirming against the pinned SHA.
GITEA_TOKEN "is auto" assumption (:42 comment). This holds only if the reusable workflow reads github.token/the auto-provisioned token directly. If it instead declares GITEA_TOKEN as a required secret input, the caller must forward it explicitly — and it isn't in the secrets: block — so the run would fail to authenticate. This is the single highest-impact unverifiable risk.
Cloud entries use name:cloud; Claude entries use a claude-code/<model> slash prefix, and claude-code/opus:max mixes both a slash and a colon. This is only correct if the reusable workflow's parser splits provider on / first (leaving opus:max intact). Plausible and presumably the documented gadfly syntax, but not confirmable here.
provider_lens_concurrency omits claude-code (:52) while provider_concurrency includes it (:51). Per the PR description this is intentional (claude-code lens concurrency falls to a default), so not a bug — just confirm the reusable workflow has a sane default rather than treating an absent key as zero/unbounded.
No invented nits. The change is mechanically sound; the real correctness surface is the reusable workflow's input/secret contract, which the author should sanity-check against SHA 20a5c43 since it can't be checked from this repo.
🧯 Error handling & edge cases — No material issues found
This change is a pure CI-config refactor: it deletes the self-contained Docker job (with its env: block and inline edge-case handling like the PR: fallback chain and IS_DRAFT) and replaces it with a uses: call to an external reusable workflow at a pinned SHA. I checked the full checked-out file and the repo's other workflow, and verified what is verifiable within this repository.
The key constraint for my lens: every unhappy path this PR used to handle inline (offline model → error comment, missing/empty secret, PR-number resolution across event types, draft handling) is now delegated to steve/gadfly/.gitea/workflows/review-reusable.yml@20a5c43…, which lives in a different repository and is not present at this checkout. I cannot open it, so I cannot confirm its input schema or its error handling — and I won't present guesses about it as findings.
What I can evaluate in this diff, through the error-handling/edge-case lens, is clean:
The if: guard (adversarial-review.yml:33-38) is logically unchanged from before and correctly gates the only untrusted path (issue_comment) on both github.event.issue.pull_request being truthy and an actor allow-list; workflow_dispatch requires pr_number (:17), and concurrency.group (:25) falls back across all three event shapes. No new unguarded edge introduced here.
Removing steps/env is correct for a reusable-workflow caller (a uses: job cannot also declare steps), so nothing is half-converted.
One thing I could not verify, flagged as unverified rather than asserted:
adversarial-review.yml:56 passes timeout_minutes: 90 as a bare integer while every other input is a quoted string. Whether this parses depends on the type: declared for that input in the external reusable workflow. If it's declared type: string, an integer value can fail input validation at workflow-parse time (the whole job errors out before any review runs — a hard edge case, not an advisory degradation). This is unverifiable from this repo and is arguably config-correctness rather than strictly my lens; I note it only because it's the single spot in the diff where a malformed-input failure could originate. Worth a 10-second cross-check against gadfly's inputs.timeout_minutes.type before merge.
VERDICT: No material issues found
Within the error-handling & edge-cases lens, this diff introduces no swallowed errors, missing cleanup, or new unhandled edge case inside this repository — it removes inline handling and delegates it wholesale to an external pinned workflow I can't inspect. The only caveat is the unverified timeout_minutes input-typing note above.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 5m 16s