ci: subscribe to gadfly's reusable review workflow (cloud + Claude Code, no local) #13

Merged
steve merged 2 commits from ci/gadfly-reusable into main 2026-06-28 01:43:42 +00:00

Replaces executus's full self-contained Gadfly stub with a thin caller of steve/gadfly's reusable workflow that inherits the default swarm.

  • uses: steve/gadfly/.gitea/workflows/review-reusable.yml@b02b11d (immutable pin)
  • Inherits the default swarm (defined once in gadfly's review-reusable.yml, steve/gadfly#10): 3 cloud models (minimax-m3, glm-5.2, deepseek-v4-pro) + Claude Code (sonnet, opus, opus:max), 5-lens suite, claude models serial / 5 lenses each.
  • Least-privilege secrets: explicit forward of OLLAMA_CLOUD_API_KEY / CLAUDE_CODE_OAUTH_TOKEN / findings tokens (not secrets: inherit); GITEA_TOKEN is the auto job token.
  • The stub now carries only triggers + actor gate + secret forwarding + allowed_users; all swarm tuning lives centrally in gadfly.
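
For orientation, here is a minimal sketch of what such a thin caller might look like. It is not the file from this PR. The trigger list, the gate expression, the `GADFLY_FINDINGS_*` secret names, and the comma-separated `allowed_users` format are assumptions pieced together from this description and the review comments further down.

```yaml
# Hypothetical thin caller (.gitea/workflows/adversarial-review.yml); a sketch only.
name: Adversarial Review (Gadfly)

on:
  pull_request:
  issue_comment:
    types: [created]
  workflow_dispatch:

jobs:
  review:
    # Actor gate: a comment trigger must be on a PR and come from an allow-listed user.
    # Other events pass through the first clause.
    if: >-
      github.event_name != 'issue_comment' ||
      (github.event.issue.pull_request &&
      (github.actor == 'steve' || github.actor == 'fizi' || github.actor == 'dazed'))
    # Immutable pin, as stated in the description above.
    uses: steve/gadfly/.gitea/workflows/review-reusable.yml@b02b11d
    # Explicit forwarding instead of `secrets: inherit`; GITEA_TOKEN is not listed
    # because the description treats it as the auto job token.
    secrets:
      OLLAMA_CLOUD_API_KEY: ${{ secrets.OLLAMA_CLOUD_API_KEY }}
      CLAUDE_CODE_OAUTH_TOKEN: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
      GADFLY_FINDINGS_URL: ${{ secrets.GADFLY_FINDINGS_URL }}      # name assumed
      GADFLY_FINDINGS_TOKEN: ${{ secrets.GADFLY_FINDINGS_TOKEN }}  # name assumed
    with:
      allowed_users: "steve,fizi,dazed"  # value format assumed
```

Everything else (models, lenses, concurrency) would then come from the defaults in gadfly's reusable workflow, so the caller has nothing further to tune.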

Advisory only — never blocks a merge.

🤖 Generated with Claude Code

steve added 1 commit 2026-06-28 00:04:24 +00:00
ci: subscribe to gadfly's reusable review workflow (cloud + Claude Code, no local)
executus CI / test (pull_request) Successful in 47s
Adversarial Review (Gadfly) / review (pull_request) Successful in 12m29s
c08ce47fa6
Replace the full self-contained stub with a thin caller of steve/gadfly's
reusable workflow, using gadfly's own dogfood config: 6 cloud models +
the Claude Code engine (sonnet, opus, opus:max). No local Macs / foreman.
Advisory only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

🪰 Gadfly — live review status

9/9 reviewers finished · updated 2026-06-28 01:33:47Z

claude-code/opus · claude-code — done

  • security — No material issues found
  • correctness — Minor issues
  • error-handling — No material issues found

claude-code/opus:max · claude-code — done

  • security — Minor issues
  • correctness — No material issues found
  • error-handling — No material issues found

claude-code/sonnet · claude-code — done

  • security — No material issues found
  • correctness — Minor issues
  • error-handling — Minor issues

deepseek-v4-pro:cloud · ollama-cloud — done

  • security — No material issues found
  • correctness — No material issues found
  • error-handling — Minor issues

glm-5.1:cloud · ollama-cloud — done

  • security — No material issues found
  • correctness — No material issues found
  • error-handling — Minor issues

glm-5.2:cloud · ollama-cloud — done

  • security — No material issues found
  • correctness — No material issues found
  • error-handling — Minor issues

minimax-m3:cloud · ollama-cloud — done

  • security — No material issues found
  • correctness — No material issues found
  • error-handling — Minor issues

nemotron-3-super:cloud · ollama-cloud — done

  • security — No material issues found
  • correctness — No material issues found
  • error-handling — No material issues found

qwen3-coder:480b-cloud · ollama-cloud — done

  • security — No material issues found
  • correctness — No material issues found
  • error-handling — No material issues found

Live status board. Findings are posted in each model's own comment. Advisory only — does not block merge.

🪰 Gadfly review — glm-5.2:cloud (ollama-cloud)

Verdict: Minor issues — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

VERDICT: No material issues found

I reviewed the change strictly through the security lens. Key checks:

  • Secret forwarding — least privilege: The caller now forwards only the four secrets the swarm uses (OLLAMA_CLOUD_API_KEY, CLAUDE_CODE_OAUTH_TOKEN, GADFLY_FINDINGS_URL, GADFLY_FINDINGS_TOKEN) instead of secrets: inherit. This is a security improvement over the prior stub; it narrows the blast radius so the reusable workflow cannot silently exfiltrate other repo secrets (e.g. GITEA_TOKEN is auto-provided by Gitea, not re-forwarded). Verified at .gitea/workflows/adversarial-review.yml:43-47.

  • Reusable-workflow pinning: Pinned to an immutable commit SHA 20a5c431... rather than @main (line 41), with an explanatory comment that a push to gadfly can't silently change the code that runs with forwarded secrets. This is the correct hardening for a trust boundary that receives secrets. Note: the PR title/body still say @main, but the actual YAML uses the SHA — the code is the secure version; the mismatch is cosmetic and outside the security-lens finding.

  • Comment-trigger authz guard preserved: The issue_comment allow-list (steve, fizi, dazed) remains at the job if: (lines 33-38), and is mirrored by the allowed_users input (line 55) for the in-container belt-and-suspenders check. No regression in the comment-based secret-bearing trigger authorization.

  • No new taint/injection surface: models, specialists, provider_concurrency, etc. are static literal strings passed as workflow inputs — no untrusted user input flows into them. The only PR-derived values (github.event.*) are event context handled inside the reusable workflow, not in this diff.

  • secrets not echoed into logs: No ${{ secrets.* }} expansion into a non-secret context or log line; all four are mapped 1:1 into the secrets: block of the called workflow.

No security-relevant regressions introduced. The change tightens secret handling (explicit forwarding over inherit) and adds supply-chain pinning to the reusable workflow.

🎯 Correctness — No material issues found

Verdict: No material issues found (within this repo; one cross-repo risk flagged as unverified)

Findings through the correctness lens:

  • Model spec format preserved — verified. The models: string reuses the exact same per-model format the old GADFLY_MODELS env used (minimax-m3:cloud, …, qwen3-coder:480b-cloud). The new claude-code/sonnet,claude-code/opus,claude-code/opus:max entries are new provider specs, but no claude-code provider exists in this repo — it lives in steve/gadfly, so parsing correctness is the reusable workflow's responsibility, not this caller's. Nothing in this repo's model/ package parses these (grep for claude-code returns only the workflow file itself).
  • Trusted-user guard intact — verified (adversarial-review.yml:33-38). The if: condition is unchanged and still gates issue_comment to steve/fizi/dazed, mirroring the allowed_users input.
  • Concurrency config self-consistent — verified for the parts this repo knows about: provider_concurrency: "ollama-cloud=3,claude-code=2" and provider_lens_concurrency: "ollama-cloud=3". The lens-concurrency map has no claude-code entry, so claude-code lenses fall back to the reusable workflow's default. Whether that default is correct is defined in the external workflow (not in this repo), so I cannot confirm or refute it from here.
  • Unverified cross-repo risk (flagged, not asserted): adversarial-review.yml:42,47 — the secrets: block omits GITEA_TOKEN with the comment "GITEA_TOKEN is auto." This relies on Gitea's act_runner auto-forwarding GITEA_TOKEN to reusable workflows the way GitHub Actions auto-provides GITHUB_TOKEN. I could not verify Gitea's behavior from this checked-out repo (the reusable workflow lives in steve/gadfly, not here). If Gitea does not auto-forward it, the called workflow would lack the token needed to post the review comment — a silent correctness failure. Worth a manual confirmation against the act_runner docs, but I cannot call it a bug from this repo alone.
  • PR-description vs. diff mismatch (non-blocking): the description states secrets: inherit, but the diff uses an explicit 4-secret mapping (which is arguably better — least privilege). No behavior impact; just inaccurate PR text.

No blocking correctness issue is confirmable from this repository's checked-out state; the one materially risky assumption (Gitea auto-forwarding GITEA_TOKEN) is unverified because the called workflow is external.

🧯 Error handling & edge cases — Minor issues

All four findings reference the actual file content I read. Let me verify the structural claims against the repo before finalizing.

Verdict: Minor issues

  • .gitea/workflows/adversarial-review.yml:56 — 90-min job cap is no longer caller-enforced. Verified: the old job had job-level timeout-minutes: 90, which the runner enforces unconditionally. The new job uses uses: (line 41), and GitHub/Gitea Actions do not allow timeout-minutes on a uses: job — only name/uses/with/secrets/needs/if/permissions/concurrency keys are permitted. The timeout now exists solely as with: timeout_minutes: 90 (line 56), which only takes effect if review-reusable.yml internally reads that input and applies it to its own timeout-minutes. If the callee ignores or misnames the input, jobs run unbounded up to the runner's global cap, and the PR description's "90-min job cap" becomes an unverified assumption about another repo. Suggested fix: confirm the callee binds timeout-minutes: ${{ inputs.timeout_minutes }} (or hardcodes a sane cap) at steve/gadfly/.gitea/workflows/review-reusable.yml@20a5c43; if not, this repo loses its only guaranteed runaway-job guard.

  • .gitea/workflows/adversarial-review.yml:45 — new CLAUDE_CODE_OAUTH_TOKEN secret dependency with empty-string fallback. Verified: this secret did not exist in the old config and is newly forwarded (line 45). If it is not configured in the executus repo, ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }} evaluates to an empty string (Gitea does not error on missing secrets), which is forwarded to the callee; the three claude-code/* lanes would then fail at auth time. Depending on the callee's error handling, those failures could be swallowed per-lane (acceptable, matching the old "one model errors, others still post" contract) or could abort the whole job. Suggested fix: verify the secret exists in this repo and confirm the callee treats per-provider auth failure as non-fatal.

  • .gitea/workflows/adversarial-review.yml:52 — provider_lens_concurrency omits claude-code. Verified: only ollama-cloud=3 is specified on line 52, while claude-code models are present in models (line 49) and claude-code=2 is present in provider_concurrency (line 51), but has no lens-concurrency entry. If the callee defaults unspecified providers to serial (1), the 3 claude-code lenses run serially per model (slow but functional); if it defaults to 0 or errors on a provider present in models but absent from the concurrency map, the claude-code lanes break. Suggested fix: either add claude-code=2 (or the desired value) here, or confirm the callee's default is 1 and not 0/error.
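
To make the first finding concrete, the fragment below sketches what the callee-side binding could look like. The callee (`review-reusable.yml` in steve/gadfly) is not visible from this repo, so the declared type, the default, the job name, and the runner label are all assumptions. Only the input name `timeout_minutes` comes from the caller's `with:` block.

```yaml
# Hypothetical callee fragment; illustrates the binding the reviewer asks to confirm.
on:
  workflow_call:
    inputs:
      timeout_minutes:
        type: number       # assumed; the caller passes an unquoted 90
        required: false
        default: 90        # assumed fallback if the caller omits the input

jobs:
  review:
    runs-on: ubuntu-latest  # assumed runner label
    # The runner-enforced cap lives here, on the callee's own job.
    timeout-minutes: ${{ inputs.timeout_minutes }}
    steps:
      - run: echo "review steps would run here"
```

If the callee has no such binding, or declares the input under a different name, the caller's `timeout_minutes: 90` is accepted but has no effect. That is the silent failure the finding warns about.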

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 2m 9s

🪰 Gadfly review — claude-code/opus (claude-code)

Verdict: Minor issues — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

I've read the full workflow file and confirmed the diff is the only change (single commit on this branch). Let me reason through the security surface of this reusable-workflow conversion.

The new trust boundary here is that secrets now cross into an external repo's reusable workflow (steve/gadfly), so the relevant questions are: (1) is the called workflow immutable, (2) is secret forwarding least-privilege, and (3) can an untrusted actor trigger a secret-bearing run. I verified all three against the file.

VERDICT: No material issues found

This change is, through the security lens, a net improvement over the stub it replaces. Specifics I verified by reading .gitea/workflows/adversarial-review.yml:

  • Supply-chain pin is correct (:41). The reusable workflow is pinned to an immutable commit SHA (...a735c110), not @main. This is the right call — a push to steve/gadfly cannot silently alter code that runs with our forwarded secrets. (Note: the PR description says @main + secrets: inherit, but the actual code does neither — it pins a SHA and forwards explicitly. The code is the more secure variant; the description is just stale.)

  • Secret forwarding is least-privilege (:43-47). Only the four secrets the swarm uses are forwarded; inherit is not used, so unrelated repo secrets aren't handed to the external workflow. Correct.

  • The only untrusted-actor trigger path is guarded (:33-38). issue_comment is the lone event a non-collaborator could fire; the job if: requires the comment to be on a PR and github.actor ∈ {steve, fizi, dazed}. pull_request/workflow_dispatch fall through the ||, which is acceptable: fork-pull_request runs execute the base-branch workflow with no secrets, and workflow_dispatch requires write access. I traced the boolean and found no bypass (a plain-issue comment fails the github.event.issue.pull_request check; a non-allowlisted commenter fails the actor check).

One honest, non-blocking observation (defense-in-depth, not a defect):

  • This PR adds CLAUDE_CODE_OAUTH_TOKEN (:45) — a higher-value, broadly-scoped credential — to the set of secrets forwarded on the existing trigger paths. The protection against a fork PR seeing it rests entirely on Gitea withholding secrets from forked-PR runs (standard pull_request-vs-pull_request_target semantics). That assumption is sound on GitHub and is Gitea's documented default, but I could not verify this repo's Gitea fork/secret policy from the checked-out code, and the reusable workflow's handling of the token lives in another repo and is outside this diff. If outside contributors can open PRs here, worth confirming Gitea's fork-secret setting is at its safe default. No change required if the repo only takes trusted-branch PRs (which the steve/fizi/dazed allow-list implies).

Nothing in the security lens is materially wrong with this diff.

🎯 Correctness — Minor issues

I've verified all three findings against the actual checkout. The reusable workflow (steve/gadfly/.gitea/workflows/review-reusable.yml) is not present in this repo, so its inputs: contract and the model-spec parser cannot be confirmed here — those two findings are unverifiable and drop out. Only the dropped vars.GADFLY_SPECIALISTS override is positively confirmable from the file.


This is a thin CI caller change that delegates to an external reusable workflow (steve/gadfly/.gitea/workflows/review-reusable.yml), which is not present in this checkout. Input names/types and the model-spec parser live in that external repo and cannot be verified here.

VERDICT: Minor issues

Findings (within the Correctness lens)

  • Repo-var override for specialists silently dropped — .gitea/workflows/adversarial-review.yml:50. The old stub computed GADFLY_SPECIALISTS: ${{ vars.GADFLY_SPECIALISTS || 'security,correctness,error-handling' }}, letting a repo variable override the lens suite. The new caller hardcodes specialists: "security,correctness,error-handling" and no longer reads the var, so within this repo the GADFLY_SPECIALISTS repo var is now a no-op. Likely intentional (the PR centralizes tuning), but it is a behavioral change for anyone relying on that var. Fix if undesired: specialists: ${{ vars.GADFLY_SPECIALISTS || 'security,correctness,error-handling' }}.

Notes (not code-correctness bugs)

  • The PR description reportedly says uses: ...@main, but the actual code correctly pins @20a5c431... (matching the commit message). Code is right; the description is stale.
  • Event context (EVENT_NAME, PR, COMMENT_BODY, etc.) is no longer forwarded; a workflow_call reusable workflow inherits the caller's github event context, so this is correct provided the reusable workflow derives PR/comment data itself (standard; not verifiable from this repo).

Net: the one verified change in this repo is the lost vars.GADFLY_SPECIALISTS override. The remaining risk is the input contract with the external reusable workflow (timeout_minutes type, the models parser) — unverifiable here and worth confirming against the pinned SHA before merging.

🧯 Error handling & edge cases — No material issues found

The reusable workflow it now calls (steve/gadfly/.gitea/workflows/review-reusable.yml) lives in another repo and isn't present in this checkout, so its input/secret contract can't be verified from here. I confirmed the assembled file (above) and reviewed the unhappy paths this diff introduces.

VERDICT: No material issues found

This is a config-only change that delegates essentially all unhappy-path handling (model-offline errors, findings-POST failures that "only log to stderr," per-lens timeouts) to gadfly's reusable workflow. Within the error-handling & edge-cases lens there is nothing I can confirm is materially wrong. The job-level if: guard correctly skips the whole job (and thus secret forwarding) for untrusted issue_comment actors — that's a clean unhappy-path; if: alongside uses: at job level is valid.

Two minor edge-case notes, neither blocking:

  • adversarial-review.yml:56 — timeout_minutes: 90 is passed unquoted (a YAML number) while every other with: input is a quoted string. If the reusable workflow declares this input as type: string, the type mismatch could cause the caller to fail validation rather than degrade gracefully. Unverified — I cannot read the reusable workflow's inputs: schema to confirm the declared type. Worth a glance at gadfly's side before merge; quoting "90" to match the others is the safe move if it's a string input.
  • adversarial-review.yml:45 — adding the claude-code/* models introduces a new dependency on CLAUDE_CODE_OAUTH_TOKEN. If that secret isn't configured in this repo, it forwards as an empty string and the three new Claude Code lanes will hit an auth-failure path. Per the prior design ("that model's comment shows an error and the others still post") this is advisory and isolated, so it degrades rather than breaks the run — but it's a newly-added failure surface that depends on a secret existing. Confirm the secret is set; otherwise three of nine lanes silently error every run.

Both are caveats, not defects in the diff itself — the error-handling logic for these paths now lives in gadfly's reusable workflow, outside this repo.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 5m 4s

Advisory only — does not block merge. · ⏱️ reviewed in 5m 4s</sub>

🪰 Gadfly review — minimax-m3:cloud (ollama-cloud)

Verdict: Minor issues — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

VERDICT: No material issues found

(No surviving findings — every claim in the draft is either explicitly self-qualified as "Unverified" by the reviewer (depending on external systems, the upstream steve/gadfly repo, or a PR description not present in the diff), or is explicitly acknowledged as "not a defect", "no regression", "pre-existing", or a "posture change" rather than an introduced defect. None can be confirmed as a defect against the actual code in .gitea/workflows/adversarial-review.yml.)

🎯 Correctness — No material issues found

Now I have everything I need to assess through the correctness lens.

VERDICT: No material issues found

  • The reusable-workflow call (steve/gadfly/.gitea/workflows/review-reusable.yml@<sha>) replaces the inlined stub; the inputs forwarded (models, specialists, provider_concurrency, provider_lens_concurrency, timeout_secs, max_steps, allowed_users, timeout_minutes) match the prior inlined GADFLY_* env values 1:1 in semantic content — no missing knobs.
  • The trust gate (if: for issue_comment requiring actor ∈ {steve, fizi, dazed}) is preserved and redundant with allowed_users: "steve,fizi,dazed" (intentional belt-and-suspenders, as the comment notes).
  • secrets: block forwards exactly the four keys the swarm uses; GITEA_TOKEN is auto-forwarded by Gitea for reusable workflows (caller claim is plausible and standard for this runner family — not a correctness issue either way).
  • Concurrency group on gadfly-${{ github.event...pr_number }} is unchanged from the prior stub, so PR-level cancellation semantics are preserved.
  • The model list includes claude-code/sonnet, claude-code/opus, claude-code/opus:max; nothing in this repo references any of these model names elsewhere (verified via grep), so there's no caller to break.
  • No compiled/imported code in this repo is touched; this is a pure CI manifest change, so there's nothing to re-derive from first principles beyond the YAML semantics above, which check out.

Nothing in the correctness lens is materially wrong with this change.

🧯 Error handling & edge cases — Minor issues

I don't have access to the PR description (and there's no PR description in this repo context — only the workflow file itself plus git history). Let me check git log and verify each claim against the actual code carefully.

Verification of each finding:

Finding 1 (lost documented graceful-degradation contract): The previous file documented the per-Mac offline tolerance. The new file says nothing about graceful degradation, so that documented guarantee is lost. Confirmed.

Finding 2 (silent-empty CLAUDE_CODE_OAUTH_TOKEN): Real — line 45 forwards secrets.CLAUDE_CODE_OAUTH_TOKEN with no if: guard. If the secret is missing or misnamed, Gitea expands to "". Confirmed as a legitimate concern, and the reviewer correctly marks the actual secret name as unverifiable from this repo alone.

Finding 3 (provider_lens_concurrency only set for ollama-cloud): Confirmed from line 52. The reviewer correctly marks this as unverifiable without the reusable workflow's default.

Finding 4 (model-string ambiguity): Confirmed from line 49. The reviewer correctly marks this as unverifiable — depends on the reusable workflow's parser.

Finding 5 (timeout_minutes as input vs. job-level): Confirmed from line 56. The reviewer correctly marks this as unverifiable — depends on the reusable workflow declaring and honoring the input.

Finding 6 (PR description vs. diff): The PR description is not available from this repo (no PR template, no comments), so I cannot independently verify what it claims. The workflow file is the source of truth and uses explicit forwarding; whether the description says otherwise is not checkable here. Dropped as unconfirmed.


VERDICT: Minor issues

  • .gitea/workflows/adversarial-review.yml:39-55 — lost documented graceful-degradation contract. The prior stub documented per-model error tolerance ("If a Mac is offline that model's comment shows an error and the others still post") and a partial-comment posting guarantee. The new caller has no equivalent — failure semantics are entirely whatever the reusable workflow decides. If gadfly's reusable workflow hard-fails on any single model/lens error (transient 429, Ollama Cloud blip, single claude-code 5xx), the whole PR review goes red where it previously landed as a best-effort partial result. This is a real change in error-handling semantics the PR description doesn't acknowledge. Suggest confirming the reusable workflow's contract (or, if it doesn't expose one, treating single-model failures as advisory noise).

  • .gitea/workflows/adversarial-review.yml:45 — silent-empty CLAUDE_CODE_OAUTH_TOKEN on misconfiguration. If the repo secret is missing or named differently (ANTHROPIC_API_KEY, CLAUDE_CODE_TOKEN, etc.), Gitea expands secrets.CLAUDE_CODE_OAUTH_TOKEN to the empty string and the caller forwards it as such. There is no if: guard, no continue-on-error, and no default. The claude-code lane will fail with an opaque auth error and, combined with the prior finding, may collapse the whole job. This is the first reference to a Claude-Code secret in this workflow — worth a one-time check that the name matches exactly what gadfly's reusable workflow expects (or document the expected secret name in the comment).
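One way to surface the empty-secret case early is a small preflight job in the caller. The following is a minimal sketch, not part of the reviewed workflow: the job name `preflight`, the `ubuntu-latest` runner label, and the `TOKEN` variable name are all assumptions made for illustration. It only warns, so the review stays advisory.

```yaml
jobs:
  # Hypothetical preflight job; not present in adversarial-review.yml.
  preflight:
    runs-on: ubuntu-latest            # assumed runner label
    steps:
      - name: Warn when the Claude Code token is missing
        env:
          # A missing or misnamed secret expands to an empty string.
          TOKEN: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
        run: |
          if [ -z "$TOKEN" ]; then
            echo "CLAUDE_CODE_OAUTH_TOKEN is empty or unset; claude-code lanes will fail auth" >&2
          fi
```

Because the step exits successfully either way, a missing secret shows up as a log line rather than a failed run.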

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 3m 10s


🪰 Gadfly review — glm-5.1:cloud (ollama-cloud)

Verdict: Minor issues — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

VERDICT: No material issues found

Verification summary: I read the full current workflow file and cross-referenced it against the diff. From a security lens:

  • Pinned SHA, not @main: The reusable workflow reference @20a5c431f22d43bab0f711c14ce74de3a735c110 is an immutable commit SHA. The inline comment explicitly notes this prevents a push to gadfly from silently changing the code that runs with forwarded secrets. This is the correct practice.
  • Least-privilege secret forwarding: The PR description mentions secrets: inherit, but the actual code explicitly forwards only the 4 secrets the swarm needs (OLLAMA_CLOUD_API_KEY, CLAUDE_CODE_OAUTH_TOKEN, GADFLY_FINDINGS_URL, GADFLY_FINDINGS_TOKEN). This is strictly better than secrets: inherit — no secret leakage to the called workflow beyond what's needed.
  • Actor guard preserved: The if: condition restricting issue_comment triggers to steve, fizi, dazed is unchanged from the prior version, preventing unauthorized users from triggering secret-bearing runs.
  • No regression in secret handling: The old Docker container approach exposed secrets via env: (broadly visible to any process in the container). The new reusable-workflow approach passes secrets through Gitea's native secret mechanism, which is at least as constrained.
  • with: inputs are non-sensitive: All forwarded inputs (models, specialists, timeout_secs, etc.) are configuration, not credentials. They are no more exposed than the prior env: block.

No security issues found. The change is a net improvement over the previous self-contained Docker image approach.

🎯 Correctness — No material issues found

VERDICT: No material issues found

The change replaces an inline Docker-based Gadfly step with a call to a reusable workflow, forwarding equivalent configuration through with: inputs and explicit secrets:. I verified the final file and checked for several potential correctness concerns:

  • Event context (PR, EVENT_NAME, etc.): The old step injected these as env vars; the reusable workflow receives the same github context natively, so no information is lost.
  • GITEA_TOKEN: The old config explicitly mapped it; the new config omits it with a comment saying "auto." In Gitea Actions (mirroring GitHub Actions' GITHUB_TOKEN), GITEA_TOKEN is automatically injected into reusable workflow runs — verified as not missing.
  • GADFLY_SPECIALISTS variable override: The old config fell back to ${{ vars.GADFLY_SPECIALISTS || '...' }}; the new config hardcodes the value via with:. This is a deliberate trade-off documented in the PR description (centralized tuning lives in the reusable workflow). Whether the reusable workflow re-exposes this override is in its repo, not here — not a bug in this caller.
  • provider_lens_concurrency lacks claude-code: Only ollama-cloud=3 is specified; claude-code is absent. This is consistent with the PR description's intent (Claude Code runs with default/serial lens concurrency) and the old config which also only specified ollama-cloud. Not a regression.
  • secrets: inherit mismatch with PR description: The PR description mentions secrets: inherit, but the actual code uses explicit per-secret forwarding — which is the more secure choice. Not a correctness issue.

No logic bugs or incorrect behavior identified in the change.

🧯 Error handling & edge cases — Minor issues

Now I have all the information I need. Let me verify each finding:

Finding 1 — Silent loss of vars.GADFLY_SPECIALISTS override fallback: Confirmed. The old workflow at the deleted line had GADFLY_SPECIALISTS: ${{ vars.GADFLY_SPECIALISTS || 'security,correctness,error-handling' }}, which allowed repo/org-level variable overrides. The new workflow hardcodes specialists: "security,correctness,error-handling" at line 50 with no vars.* fallback. This is a real regression.

Finding 2 — Event context no longer explicitly passed: The draft itself marks this "(unverified)." The reusable workflow (steve/gadfly/.gitea/workflows/review-reusable.yml) is external and not present in this repository. The standard pattern for reusable workflows is to derive context from github.* directly, and the reusable workflow is specifically designed for this purpose. Without evidence of a problem, this is speculative. Drop.

Finding 3 — No native timeout-minutes job guarantee: Also marked "(unverified)." Same situation — whether the reusable workflow applies timeout_minutes as a timeout-minutes job key cannot be confirmed from this repo. The caller explicitly passes timeout_minutes: 90, strongly implying the reusable workflow is designed to honor it. Drop.

VERDICT: Minor issues

  • .gitea/workflows/adversarial-review.yml:50 — Silent loss of vars.GADFLY_SPECIALISTS override fallback. The old workflow used ${{ vars.GADFLY_SPECIALISTS || 'security,correctness,error-handling' }}, allowing repo/org-level variable overrides (e.g., "all", "auto", or an extended lens list). The new workflow hardcodes specialists: "security,correctness,error-handling" with no vars.* fallback. Any existing GADFLY_SPECIALISTS repo variable is now silently ignored, which could cause unexpected behavior (e.g., a deliberately broadened or narrowed specialist set is dropped without warning). Fix: add the variable fallback back, e.g. specialists: ${{ vars.GADFLY_SPECIALISTS || 'security,correctness,error-handling' }}.
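The suggested fallback, placed in the caller's `with:` block, might look like the sketch below. It assumes the `vars` context is available in `with:` on Gitea as it is on GitHub Actions, and it shows only the one input under discussion.

```yaml
jobs:
  review:
    uses: steve/gadfly/.gitea/workflows/review-reusable.yml@20a5c431f22d43bab0f711c14ce74de3a735c110
    with:
      # A repo/org variable wins when set; otherwise use the fixed suite.
      specialists: ${{ vars.GADFLY_SPECIALISTS || 'security,correctness,error-handling' }}
```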

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 3m 7s


🪰 Gadfly review — claude-code/sonnet (claude-code)

Verdict: Minor issues — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

No material issues found

This change is a net security improvement. My per-finding verification:

  • SHA pinning (@20a5c431f22d43bab0f711c14ce74de3a735c110) — verified in the live file (line 41). Pinning to an immutable commit prevents a push to steve/gadfly from silently running different code against the forwarded secrets. Correct practice.

  • Explicit secret forwarding — verified: lines 43–47 enumerate exactly 4 secrets. The commit message confirms this replaced an earlier secrets: inherit that was forwarding all repo secrets (registry, Komodo, Discord, DB creds). Scope is now least-privilege.

  • GITEA_TOKEN not listed — not a gap. The commit message clarifies: "GITEA_TOKEN is the automatic job token (github.token in the reusable)." The automatic job token IS implicitly available to called reusable workflows; the calling workflow grants issues: write + pull-requests: write (line ~22 of the final file), which is sufficient to post review comments. No named PAT is being silently dropped.

  • issue_comment guard — verified: the if: expression (lines 33–38) still gates comment-triggered runs to steve, fizi, dazed. Belt-and-suspenders with the allowed_users input passed to the reusable.

  • with: inputs — all static strings/integers; none interpolate user-controlled data. No injection surface.

  • PR description inaccuracy (informational): The description says "flow via secrets: inherit" but the committed code uses explicit forwarding. The description appears stale from an earlier draft. Because the code is stricter than the description, the mismatch is a documentation issue rather than a security concern.
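For reference, a gate of the shape described in the `issue_comment` bullet can be sketched as follows. This is an illustration of the logic only; the actual `if:` expression at lines 33–38 is not reproduced in this review, and the use of `contains(fromJSON(...))` is an assumption about how the allow-list could be written.

```yaml
jobs:
  review:
    # Non-comment events pass through; comment events must be on a PR
    # and come from an allow-listed actor.
    if: >-
      github.event_name != 'issue_comment' ||
      (github.event.issue.pull_request &&
       contains(fromJSON('["steve","fizi","dazed"]'), github.actor))
    uses: steve/gadfly/.gitea/workflows/review-reusable.yml@20a5c431f22d43bab0f711c14ce74de3a735c110
```

A comment on a plain issue fails the `github.event.issue.pull_request` check, and a comment from anyone outside the list fails the actor check, so in both cases the job (and its secret forwarding) is skipped.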

🎯 Correctness — Minor issues

All three findings are confirmed against the actual file. Here is the corrected review:


Verdict: Minor issues


1. timeout_minutes demoted from runner enforcement to application input

In the old file, timeout-minutes: 90 was a first-class Gitea Actions job property — the runner would hard-kill the job at 90 minutes regardless of what the container was doing. In the new file, timeout_minutes: 90 (line 56) is passed as a with: input to the reusable workflow — it is now just a parameter the reusable workflow reads, not a runner-enforced ceiling.

Whether the effective hard timeout is preserved depends entirely on whether review-reusable.yml plumbs this value into its own job-level timeout-minutes: property. If it uses a fixed internal timeout or ignores the input, this repo silently loses its runner-enforced 90-min cap.

Suggested fix: verify that review-reusable.yml wires inputs.timeout_minutes into its job timeout-minutes:. If it doesn't, there is no workaround from the caller side without a patch to gadfly.
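What to look for on gadfly's side is roughly the wiring below. This is a hypothetical excerpt: the real `review-reusable.yml` is not visible from this repo, so the input type, the default, the job name, and the runner label are all assumptions.

```yaml
# Hypothetical excerpt of a reusable workflow, for illustration only.
on:
  workflow_call:
    inputs:
      timeout_minutes:
        type: number
        default: 90

jobs:
  review:
    runs-on: ubuntu-latest                           # assumed runner label
    # Runner-enforced ceiling driven by the caller's input.
    timeout-minutes: ${{ inputs.timeout_minutes }}
    steps:
      - run: echo "swarm runs here"
```

If the pinned SHA contains something equivalent, the 90-minute cap is still enforced by the runner; if the input is only read by application code, it is not.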


2. provider_lens_concurrency has no claude-code entry (line 52)

provider_lens_concurrency: "ollama-cloud=3" sets 3 concurrent lenses for cloud models but says nothing for claude-code. With 3 lenses (security, correctness, error-handling) and whatever default applies, claude-code models may run lenses serially. At 600s per lens that is 30 min per model; with 2 concurrent models (provider_concurrency: "claude-code=2", line 51) and 3 total, you get two serial batches totaling ~60 min — within the cap. So this is not a timing correctness failure, but if the intent is lens-parallel execution for claude-code (as it is for ollama-cloud), the entry claude-code=3 is missing. Could be intentional (rate-limit caution) but differs from how ollama-cloud is configured.
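The estimate above can be written out step by step. It assumes lenses run serially within each claude-code model, every lens uses its full 600 s budget, and the three models run in batches of two; these are the reviewer's assumptions, not confirmed behavior of the reusable workflow.

```latex
\begin{aligned}
t_{\text{model}}   &= 3 \times 600\,\text{s} = 1800\,\text{s} = 30\,\text{min} \\
n_{\text{batches}} &= \lceil 3 / 2 \rceil = 2 \\
t_{\text{total}}   &\approx 2 \times 30\,\text{min} = 60\,\text{min} \le 90\,\text{min}
\end{aligned}
```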


3. vars.GADFLY_SPECIALISTS override removed — likely intentional but a behavioral regression (line 50)

Old code: GADFLY_SPECIALISTS: ${{ vars.GADFLY_SPECIALISTS || 'security,correctness,error-handling' }}. New code hardcodes specialists: "security,correctness,error-handling" (line 50). If anyone had set GADFLY_SPECIALISTS as a repo variable to temporarily change the lens suite (e.g. for testing), that knob is now gone. The PR description frames this as intentional ("mirrors gadfly's dogfood config"), but it is a real behavioral change worth acknowledging.

🧯 Error handling & edge cases — Minor issues

The finding is confirmed. The review job at line 29 has no timeout-minutes: property — only with: timeout_minutes: 90 at line 56, which is an input passed to the reusable workflow, not a job-level timeout guard.


Minor issues


Job-level timeout guard absent on the calling job

.gitea/workflows/adversarial-review.yml:41

The calling review job has no timeout-minutes: property. The value at line 56:

    with:
      timeout_minutes: 90

is an input forwarded into the reusable workflow — it does not set the caller job's own deadline. If the reusable workflow's internal timeout logic fails or is bypassed, the caller job runs until the runner's platform default (often 6 hours).

The prior self-contained job carried timeout-minutes: 90 directly on the job. Restoring it as a belt-and-suspenders guard at the caller layer costs nothing and closes the failure mode regardless of how the reusable workflow consumes the input:

  review:
    if: >-
      ...
    timeout-minutes: 90          # caller-layer deadline
    uses: steve/gadfly/.gitea/workflows/review-reusable.yml@20a5c431...
    with:
      timeout_minutes: 90        # input forwarded into the reusable workflow
      ...

Everything else looks clean: the if condition and allowed_users input mirror each other correctly; cancel-in-progress: true is preserved; the concurrency group expression handles all three trigger types; and explicit secret forwarding is strictly more defensive than secrets: inherit.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 5m 54s


🪰 Gadfly review — deepseek-v4-pro:cloud (ollama-cloud)

Verdict: Minor issues — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

Verdict: No material issues found

The security posture of this change is sound. I verified the full file at .gitea/workflows/adversarial-review.yml:1-56 and cross-checked against the sibling ci.yml for secret exposure patterns. Here's what I confirmed:

  • Immutable pin: uses: references a full 40-char commit SHA (20a5c431f22d43bab0f711c14ce74de3a735c110), not a branch or tag. A push to steve/gadfly cannot silently change the code that receives the forwarded secrets.

  • No secrets: inherit: Only 4 secrets are explicitly forwarded (lines 44-47). The old version also used explicit secret mapping (env vars), so this is parity. The CLAUDE_CODE_OAUTH_TOKEN is new but necessary for the Claude Code engine; it wasn't exposed before because Claude Code wasn't in the fleet.

  • GITEA_TOKEN not forwarded: The old version passed GITEA_TOKEN as an env var; the new version omits it, relying on the automatic github.token scoped by the permissions: block (contents: read, issues: write, pull-requests: write). This is a net security improvement — the reusable workflow gets a job-scoped token rather than a potentially broader repo secret.

  • if: guard preserved identically (lines 33-38): Same allowlist (steve,fizi,dazed), same logic (non-comment events always pass; comment events require PR context + allowlisted actor). The guard is evaluated before the reusable workflow is invoked, so secrets are never forwarded to an untrusted caller.

  • No user-controlled input flows to the reusable workflow: All with: inputs are hardcoded static strings. The old version passed COMMENT_BODY, ACTOR, etc. as env vars (user-controlled for comment events); the new version passes none of those, eliminating that injection surface entirely.

  • Permissions unchanged: contents: read, issues: write, pull-requests: write — the minimum needed for a PR reviewer that posts comments. No escalation.

No material security issues in this lens.

🎯 Correctness — No material issues found

Verdict: No material issues found

🧯 Error handling & edge cases — Minor issues

Now let me verify each finding against the actual code:

Finding 1 (provider_lens_concurrency asymmetry): Confirmed. Line 51 has provider_concurrency: "ollama-cloud=3,claude-code=2" but line 52 has provider_lens_concurrency: "ollama-cloud=3" — claude-code is indeed omitted from lens concurrency. The asymmetry is real.

Finding 2 (GADFLY_SPECIALISTS dropped): Confirmed. Grep for GADFLY_SPECIALISTS and vars.GADFLY returns zero matches. The old vars.GADFLY_SPECIALISTS || ... fallback pattern is gone; line 50 hardcodes the specialists string.

Finding 3 (CLAUDE_CODE_OAUTH_TOKEN): Confirmed. The old config (visible in the diff's removed lines) had no reference to this secret. The new config forwards it at line 45. It's a new secret dependency.

Finding 4 (timeout_minutes unverified): The draft itself states "I cannot verify whether the reusable workflow... actually defines timeout_minutes as an input." I likewise cannot verify this — the reusable workflow lives in a different repo (steve/gadfly). Per instructions: drop what cannot be confirmed.


Verdict: Minor issues

  • provider_lens_concurrency omits claude-code (adversarial-review.yml:52): provider_concurrency specifies claude-code=2, but provider_lens_concurrency only sets ollama-cloud=3 with no entry for claude-code. The three Claude Code models (sonnet, opus, opus:max) will therefore run with whatever default lens concurrency the reusable workflow applies to unspecified providers — likely 1 (serial). With 3 lenses × 600s each = 30 min per model, and 3 models running 2-at-a-time, the Claude Code lane alone could consume ~60 min, brushing against the 90-min job cap and risking flaky timeouts. Add ,claude-code=3 (or whatever the intended parallelism is) to provider_lens_concurrency.

  • GADFLY_SPECIALISTS repo variable silently dropped (adversarial-review.yml:50): The old config read ${{ vars.GADFLY_SPECIALISTS || 'security,correctness,error-handling' }}, allowing repo-level override of the lens suite. The new config hardcodes specialists: "security,correctness,error-handling" with no fallback. If GADFLY_SPECIALISTS was set in the repo (e.g., to "all" or a custom subset), it is now silently ignored — the review runs with a different suite than the operator intended, with no warning. (Confirmed: no remaining references to GADFLY_SPECIALISTS or vars.GADFLY exist in the repo.)

  • CLAUDE_CODE_OAUTH_TOKEN may not exist yet (adversarial-review.yml:45): The old config did not forward this secret; the new config does. If the secret hasn't been created in the repo's settings, the workflow will fail at invocation time with a "secret not found" error — a deployment ordering hazard. This is a one-time setup risk, not a code bug, but worth flagging since the PR doesn't mention it as a prerequisite.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 5m 35s


🪰 Gadfly review — nemotron-3-super:cloud (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

No material issues found

🎯 Correctness — No material issues found

No material issues found

🧯 Error handling & edge cases — No material issues found

No material issues found

The change replaces a self-contained workflow with a reusable workflow call, improving security by reducing exposed secrets and using an immutable reference. Through the error handling & edge cases lens:

  1. Conditionals: The if: condition correctly limits secret-bearing workflow runs to trusted users for issue_comment events while allowing other triggers (PR events, workflow_dispatch) unconditionally. Verified by reading the condition logic and testing edge cases (non-PR comments, untrusted actors) - no gaps found.

  2. Secrets handling: The secrets: block now forwards only four specific secrets (OLLAMA_CLOUD_API_KEY, CLAUDE_CODE_OAUTH_TOKEN, GADFLY_FINDINGS_URL, GADFLY_FINDINGS_TOKEN) instead of secrets: inherit, preventing overexposure of registry/Komodo/DB credentials. Verified by comparing old/new blocks - no excess secrets remain.

  3. Inputs: All with: parameters are hardcoded, non-empty strings with sensible values (concurrency limits, timeouts, allowed users). No zero/negative values or empty strings that could cause edge-case failures in the called workflow.

  4. Error propagation: The reusable workflow call lacks continue-on-error, meaning failures will properly fail the job. This is appropriate since the review is advisory - a broken reviewer should fail visibly rather than silently post nothing.

  5. Concurrency/grouping: The concurrency: block uses a safe expression for the group ID that will always resolve to a non-empty value for triggered events (verified via event context inspection).

  6. Timeouts: Both job-level (timeout_minutes: 90) and per-lens (timeout_secs: "600") timeouts are set to reasonable, positive values.

No missing error handling, cleanup gaps, or edge cases (nil/empty/overflow) were detected in the changed file. The change improves security posture without introducing error handling regressions.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 3m 2s


🪰 Gadfly review — qwen3-coder:480b-cloud (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

No material issues found.

The changes in this PR improve security by:

  1. Using a pinned immutable commit reference (@20a5c431f22d43bab0f711c14ce74de3a735c110) instead of a mutable ref like @main
  2. Implementing least-privilege secret forwarding - only explicitly listing the secrets needed rather than using secrets: inherit
  3. Removing local Mac execution which reduces the attack surface
  4. Maintaining the same trusted user allow-list for comment triggers

The workflow now consumes a reusable workflow from steve/gadfly with explicit secret forwarding and updated model configurations. All security-relevant configurations appear to be properly constrained.

🎯 Correctness — No material issues found

No material issues found

The changes appear to correctly transition from a self-contained Gadfly workflow to using a reusable workflow from steve/gadfly. The configuration parameters (models, concurrency settings, timeouts, etc.) are properly mapped from the old environment variables to the new workflow inputs. The secret handling has been improved with least-privilege principles, and the workflow pinning to a specific commit hash ensures immutability. No correctness issues were identified in the semantic mapping of the workflow configuration.

🧯 Error handling & edge cases — No material issues found

No material issues found

The changes in this PR replace the local Gadfly workflow implementation with a call to a reusable workflow. Through the error handling & edge cases lens:

  1. The workflow properly forwards only necessary secrets, reducing the risk of exposure
  2. The commit pin ensures reproducible behavior
  3. Timeout and concurrency settings are maintained
  4. User allow-list is preserved for security
  5. No error handling logic appears to be lost in the transition

The diff shows a clean replacement of one workflow implementation with another, maintaining the same core functionality and safeguards. No error handling or edge case issues were introduced.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 38s


🪰 Gadfly review — claude-code/opus:max (claude-code)

Verdict: Minor issues — 3 reviewers: security, correctness, error-handling

🔒 Security — Minor issues

I've verified the actual file at .gitea/workflows/adversarial-review.yml. It matches the diff exactly. Checking each finding against the real source:

  • Line 41 is uses: steve/gadfly/.gitea/workflows/review-reusable.yml@20a5c431... — a 40-hex-char SHA pin on the entry-point reusable workflow only. Confirmed. steve/gadfly is not in this checkout, so the transitive-pin caveat is genuinely unverifiable here, exactly as the draft states.
  • Line 45 CLAUDE_CODE_OAUTH_TOKEN is forwarded; the pre-diff env block had no such secret, so it is newly exposed. Confirmed.
  • Lines 33–38 the if: guard gates only issue_comment; pull_request (lines 9–10: opened/reopened/ready_for_review) runs with no author check. Confirmed.
  • secrets: block (43–47) is explicit, not inherit, scoped to four entries; job-level if: restricts issue_comment to steve/fizi/dazed, mirroring allowed_users (line 55). Confirmed.

All three findings are accurately grounded in the actual code, and each is honestly scoped about what could not be verified. They all survive.


VERDICT: Minor issues

This change is, on balance, a security improvement (explicit least-privilege secrets: instead of broad mapping, full 40-char commit SHA pin). The findings below are caveats, not blockers.

Findings

  • adversarial-review.yml:41: the SHA pin only pins the entry-point file, not the transitive supply chain. The PR's headline security claim is "a push to gadfly can't silently change the code that runs with our forwarded secrets." That holds only if review-reusable.yml at that SHA itself pins everything it pulls in: its container image (e.g. docker://…/gadfly:<tag>) and any nested uses:. If the reusable workflow internally references :latest or @main (the old stub here literally ran docker://…/gadfly:sha-d7f364d and noted "act_runner caches :latest"), then a push to gadfly can still change the code that executes with all four forwarded secrets. I could not verify this because steve/gadfly isn't in this checkout. Suggest confirming that review-reusable.yml@20a5c43 pins its own image/actions by digest before treating the pin as a true immutability guarantee.

  • adversarial-review.yml:43-47: CLAUDE_CODE_OAUTH_TOKEN is a secret newly exposed to external code. The prior version never forwarded it. The four secrets (Ollama key, Claude OAuth token, findings URL+token) now all flow into an external repo's workflow. This is an accepted/documented same-org trust relationship and is mitigated by the pin (subject to the caveat above), but worth stating plainly: a compromise of that gadfly commit or its build artifacts exfiltrates all four. The Claude OAuth token in particular has a broader blast radius than an inference API key.

  • adversarial-review.yml:33-38 (note, mostly pre-existing) — the if: guard only gates issue_comment; pull_request (opened/reopened/ready_for_review) runs unconditionally. The comment asserts pull_request is "already trusted." On GitHub semantics that's safe (fork pull_request runs get no secrets); on Gitea/act_runner, whether secrets are withheld from fork PRs is an instance-config question I can't verify from the repo. This trigger model is unchanged by the diff, but this PR raises the stakes by routing more/broader secrets through it. Mitigating factor confirmed from the design: gadfly reviews the PR diff as data and runs pinned gadfly code — it does not execute the PR's build — so direct code-exec exfiltration via a malicious PR is not the threat here; the residual risk is limited to whatever the reusable workflow itself does with the secrets. If this Gitea instance accepts fork PRs, consider gating pull_request by author as well, or relying solely on pull_request_target-style trusted context.
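
One way the pull_request gate could be expressed, as a sketch only: it assumes this Gitea instance populates github.event.pull_request.head.repo.full_name the way GitHub does, and the clause would need to be ANDed with the existing issue_comment actor gate rather than replace it.

```yaml
jobs:
  review:
    # Extra clause: pull_request runs proceed only when the head branch lives
    # in this repository, so PRs from forks are skipped. Other event types
    # pass this clause and remain subject to the existing actor gate.
    if: >-
      github.event_name != 'pull_request' ||
      github.event.pull_request.head.repo.full_name == github.repository
```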

Verified

  • Read adversarial-review.yml in full: SHA at line 41 is a complete 40-hex-char commit ref (immutable as a ref). secrets: block is explicit (not inherit) and scoped to four entries — least-privilege claim is accurate. Job-level if: correctly restricts issue_comment to steve/fizi/dazed, mirroring the allowed_users input.

Nothing here blocks merge. The one item worth resolving before relying on the security posture as described is the transitive-pin caveat on the reusable workflow.
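
For illustration, a digest-pinned image reference inside the reusable workflow might look like the following. The registry host, image path, job name, and runner label are placeholders, the digest is deliberately left unfilled, and whether gadfly runs its image through a docker:// step at all is an assumption based on the old stub.

```yaml
# Hypothetical excerpt; not taken from review-reusable.yml.
jobs:
  review:
    runs-on: ubuntu-latest
    steps:
      # A digest reference resolves to one exact image; a tag such as
      # :latest can be repointed by a later push.
      - uses: docker://registry.example.com/gadfly@sha256:<digest>
```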

🎯 Correctness — No material issues found

I've verified everything verifiable in this checkout. The substantive correctness risks all live in the contract of steve/gadfly/.gitea/workflows/review-reusable.yml@20a5c43, which is a different repository and is not present in this checkout — so I cannot confirm input names, types, or secret declarations against it. I'll be explicit about what I checked vs. what I couldn't.

VERDICT: No material issues found (within the Correctness lens)

The diff is internally coherent and the YAML is well-formed. Notes below; none rise to a confirmed logic bug in code I can read.

Verified correct (by reading the file + reasoning from Actions reusable-workflow semantics):

  • Removing runs-on:, steps:, env:, and job-level timeout-minutes: is required when a job becomes a reusable-workflow caller (uses:) — those keys are illegal on such a job. Passing timeout_minutes: 90 as a with: input is the correct workaround, not a regression. The if:, permissions:, concurrency:, with:, and secrets: keys are all valid on a calling job. (adversarial-review.yml:41-56)
  • The dropped event-context env vars (PR, PR_BRANCH, COMMENT_BODY, ACTOR, GITEA_API, …) are derivable inside the called workflow: in a reusable-workflow call the github context resolves to the caller's triggering event/repo, so the reusable workflow can reconstruct them. No data is lost by removing them here — provided the reusable workflow actually does so (not verifiable from this repo).
  • The 6 cloud model strings on :49 are byte-for-byte identical to the prior GADFLY_MODELS; this PR only adds the three claude-code/* entries, so no pre-existing model-name oddity is introduced here.

Could NOT verify (cross-repo contract — flag for the author, not findings against this diff):

  • Input name/type contract. models, specialists, provider_concurrency, provider_lens_concurrency, timeout_secs, max_steps, allowed_users, timeout_minutes must match the inputs: block declared in gadfly's review-reusable.yml. A typo or type mismatch would fail the run. Note the inconsistent typing on the caller side: timeout_secs/max_steps are quoted strings ("600", "14") but timeout_minutes is a bare number (90) — each only works if it matches the reusable workflow's declared input type. Worth confirming against the pinned SHA.
  • GITEA_TOKEN "is auto" assumption (:42 comment). This holds only if the reusable workflow reads github.token/the auto-provisioned token directly. If it instead declares GITEA_TOKEN as a required secret input, the caller must forward it explicitly — and it isn't in the secrets: block — so the run would fail to authenticate. This is the single highest-impact unverifiable risk.
  • Mixed model-string syntax. Cloud models use name:cloud; Claude entries use a claude-code/<model> slash prefix, and claude-code/opus:max mixes both a slash and a colon. This is only correct if the reusable workflow's parser splits provider on / first (leaving opus:max intact). Plausible and presumably the documented gadfly syntax, but not confirmable here.
  • provider_lens_concurrency omits claude-code (:52) while provider_concurrency includes it (:51). Per the PR description this is intentional (claude-code lens concurrency falls to a default), so not a bug — just confirm the reusable workflow has a sane default rather than treating an absent key as zero/unbounded.
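
The contract the first two items depend on would live in the `on: workflow_call:` block of the reusable workflow. A minimal sketch of what to check for is below. The types and `required` flags shown are assumptions chosen to match how the caller passes values, not gadfly's confirmed declarations, and only a subset of the inputs is listed.

```yaml
# Hypothetical excerpt of review-reusable.yml (assumed, not verified).
on:
  workflow_call:
    inputs:
      timeout_secs:
        type: string      # caller passes the quoted string "600"
        required: false
      max_steps:
        type: string      # caller passes the quoted string "14"
        required: false
      timeout_minutes:
        type: number      # caller passes the bare number 90
        required: false
    secrets:
      OLLAMA_CLOUD_API_KEY:
        required: true
      CLAUDE_CODE_OAUTH_TOKEN:
        required: true
      # GITEA_TOKEN is deliberately not declared here: under the "is auto"
      # assumption the called jobs read the job token (github.token) instead.
```

If the real file declares `timeout_minutes` as `type: string`, or lists `GITEA_TOKEN` under `secrets:` with `required: true`, the caller as written would need a matching change.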

No invented nits. The change is mechanically sound; the real correctness surface is the reusable workflow's input/secret contract, which the author should sanity-check against SHA 20a5c43 since it can't be checked from this repo.

🧯 Error handling & edge cases — No material issues found

This change is a pure CI-config refactor: it deletes the self-contained Docker job (with its env: block and inline edge-case handling like the PR: fallback chain and IS_DRAFT) and replaces it with a uses: call to an external reusable workflow at a pinned SHA. I checked the full checked-out file and the repo's other workflow, and verified what is verifiable within this repository.

The key constraint for my lens: every unhappy path this PR used to handle inline (offline model → error comment, missing/empty secret, PR-number resolution across event types, draft handling) is now delegated to steve/gadfly/.gitea/workflows/review-reusable.yml@20a5c43…, which lives in a different repository and is not present at this checkout. I cannot open it, so I cannot confirm its input schema or its error handling — and I won't present guesses about it as findings.

What I can evaluate in this diff, through the error-handling/edge-case lens, is clean:

  • The trigger guard (adversarial-review.yml:33-38) is logically unchanged from before and correctly gates the only untrusted path (issue_comment) on both github.event.issue.pull_request being truthy and an actor allow-list; workflow_dispatch requires pr_number (:17), and concurrency.group (:25) falls back across all three event shapes. No new unguarded edge introduced here.
  • Removing the inline steps/env is correct for a reusable-workflow caller (a uses: job cannot also declare steps), so nothing is half-converted.

One thing I could not verify, flagged as unverified rather than asserted:

  • adversarial-review.yml:56 passes timeout_minutes: 90 as a bare integer while every other input is a quoted string. Whether this parses depends on the type: declared for that input in the external reusable workflow. If it's declared type: string, an integer value can fail input validation at workflow-parse time (the whole job errors out before any review runs — a hard edge case, not an advisory degradation). This is unverifiable from this repo and is arguably config-correctness rather than strictly my lens; I note it only because it's the single spot in the diff where a malformed-input failure could originate. Worth a 10-second cross-check against gadfly's inputs.timeout_minutes.type before merge.

VERDICT: No material issues found

Within the error-handling & edge-cases lens, this diff introduces no swallowed errors, missing cleanup, or new unhandled edge case inside this repository — it removes inline handling and delegates it wholesale to an external pinned workflow I can't inspect. The only caveat is the unverified timeout_minutes input-typing note above.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 5m 16s

steve added 1 commit 2026-06-28 01:19:01 +00:00
security: scope forwarded secrets + pin gadfly reusable to an immutable sha
executus CI / test (pull_request) Successful in 2m13s
Adversarial Review (Gadfly) / review (pull_request) Successful in 10m31s
1a2a2364ec
Address the swarm's findings on this rollout:
- Replace `secrets: inherit` (which forwarded ALL repo secrets — registry/
  Komodo/Discord/DB creds the reviewer never uses) with explicit forwarding of
  only OLLAMA_CLOUD_API_KEY / CLAUDE_CODE_OAUTH_TOKEN / findings tokens.
  GITEA_TOKEN is the automatic job token (github.token in the reusable).
- Pin uses: ...@main -> @20a5c43 (immutable) so a push to gadfly can't change
  the code that runs with our forwarded secrets.

Requires gadfly's review-reusable.yml secrets contract (steve/gadfly#9, merged).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
steve closed this pull request 2026-06-28 01:23:07 +00:00
steve reopened this pull request 2026-06-28 01:23:17 +00:00
steve merged commit 5779035722 into main 2026-06-28 01:43:42 +00:00
Reference: steve/executus#13