ci(gadfly): trim the weakest reviewers from the swarm #5

2026-06-27T22:07:39Z

steve commented

2026-06-27 22:07:39 +00:00

Removes the four lowest-value reviewers per your call:

m5/qwen3.6:35b-mlx (also drops the last local Mac → cloud-only now)
gemma4:cloud
gpt-oss:120b-cloud
kimi-k2.7-code:cloud

Result: a cloud-only fleet of 6 — minimax-m3, glm-5.2, glm-5.1, deepseek-v4-pro, nemotron-3-super, qwen3-coder:480b.

Also removes GADFLY_ENDPOINT_M5 and the m5=1 concurrency entry (no local lanes left), drops the per-job timeout 90m → 45m, and keeps the README/CLAUDE.md descriptions in sync.

🤖 Generated with Claude Code


steve added 1 commit 2026-06-27 22:07:39 +00:00

ci(gadfly): trim the weakest reviewers from the swarm

Adversarial Review (Gadfly) / review (pull_request) Successful in 5m27s


CI / Tidy (pull_request) Successful in 9m31s


CI / Build & Test (pull_request) Successful in 9m48s


98a2164aba

Drop the four lowest-graded reviewers — m5/qwen3.6:35b-mlx, gemma4:cloud,
gpt-oss:120b-cloud, kimi-k2.7-code:cloud. Removing m5/qwen3.6 takes the last
local Mac out, so this is now a cloud-only fleet of 6 ollama-cloud models;
GADFLY_ENDPOINT_M5 and the m5 concurrency entry are gone and the per-job timeout
drops to 45m. README/CLAUDE.md kept in sync.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

gitea-actions bot commented

2026-06-27 22:07:49 +00:00

🪰 Gadfly review — `glm-5.2:cloud` (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

Verdict: No material issues found

The change removes the GADFLY_ENDPOINT_M5 secret mapping and the m5=1 provider-concurrency entry, leaving only cloud providers. All remaining secrets (GITEA_TOKEN, OLLAMA_CLOUD_API_KEY, GADFLY_FINDINGS_URL, GADFLY_FINDINGS_TOKEN) are still explicitly mapped, so no credential is silently dropped or left dangling. Verified via grep that no remaining reference to GADFLY_ENDPOINT_M5 exists in the workflow or repo config.
The trusted-actor guard (job-level if: + GADFLY_ALLOWED_USERS) is unchanged, so authn/authz for the secret-bearing comment trigger is intact. Verified at .gitea/workflows/adversarial-review.yml:38-43,68.
No new injection surface, untrusted input handling, or insecure defaults are introduced — this is a pure fleet-trim.

🎯 Correctness — No material issues found

VERDICT: No material issues found

The model list in GADFLY_MODELS (.gitea/workflows/adversarial-review.yml:58) now contains exactly the six advertised survivors — minimax-m3, glm-5.2, glm-5.1, deepseek-v4-pro, nemotron-3-super, qwen3-coder:480b — and the four removed models (m5/qwen3.6, gemma4, gpt-oss, kimi-k2.7) are gone from the env. Verified by reading the file.
GADFLY_PROVIDER_CONCURRENCY (:59) correctly drops the m5=1 entry now that no m5 model remains in GADFLY_MODELS; no orphaned concurrency reference.
GADFLY_ENDPOINT_M5 env mapping is removed (:50-area no longer references it), consistent with dropping the local Mac lane. No code path in the workflow references it after the change (verified via grep across the repo — remaining m5/.../GADFLY_ENDPOINT_M5 hits are unrelated docs/tests/examples, not this workflow).
timeout-minutes: 45 (:48) is internally consistent with the comment justification (no slow local lane dominating wall time). This is a config threshold with no first-principles "correct" value to re-derive against; the change is self-consistent with the removed lane.
README/CLAUDE.md prose counts updated to "6 ollama-cloud models" and drop the M5 Mac mention, matching the workflow. No stale references to 9 models or the M5 remain in these files.
The comment at :4 still says "this build is what carries foreman provider-type support" — now slightly stale since no foreman/Mac endpoint is used, but that's a documentation-nit outside the correctness lens (no behavioral impact; foreman support is still carried by the image regardless).

No logic/behavioral defects introduced through the correctness lens.

🧯 Error handling & edge cases — No material issues found

Verdict: No material issues found

Through the error-handling / edge-case lens, I checked the unhappy paths this config diff touches:

Dangling provider references after the trim — GADFLY_MODELS now lists only :cloud / :480b-cloud models, which all resolve to the ollama-cloud provider (confirmed via builtin.go:20 ProviderOllamaCloud = "ollama-cloud" and parse_test.go showing :cloud/:480b-cloud tags stay under ollama-cloud). GADFLY_PROVIDER_CONCURRENCY / GADFLY_PROVIDER_LENS_CONCURRENCY both still specify ollama-cloud=3, and the removed m5=1 entry is no longer needed because no model references provider m5. No orphaned provider name that would cause a parse/startup error. Verified.
Timeout headroom after the 90m → 45m cut — 6 models, provider concurrency 3 ⇒ 2 sequential waves; per-model lenses run 3-wide (LENS ollama-cloud=3) under a 600s per-lens deadline (GADFLY_TIMEOUT_SECS). Worst case ≈ 2 × 600s ≈ 20 min of deadline-bound work, well under the 45m job cap. The reduction is safe with margin. Verified against the values in the file (adversarial-review.yml:48,58,59,60,65).
Removed secret mapping (GADFLY_ENDPOINT_M5) — dropping the env mapping is consistent with removing the only foreman model. If the repo secret still exists unused, it's harmless (no longer passed in). No nil/empty fallback path introduced.
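The timeout-headroom estimate in the list above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming models are admitted in waves of at most the provider concurrency and that one model costs at most one per-lens deadline because its lenses run in parallel. The function name and parameters are illustrative and are not part of Gadfly.

```python
import math


def worst_case_wall_seconds(models: int, provider_concurrency: int, lens_deadline_s: int) -> int:
    """Upper bound on wall time if every lens runs to its deadline.

    Assumption: lenses within a model run fully in parallel, so a single
    model never takes longer than one lens deadline.
    """
    waves = math.ceil(models / provider_concurrency)
    return waves * lens_deadline_s


# Values quoted in the review: 6 models, ollama-cloud=3, 600 s per lens.
bound_s = worst_case_wall_seconds(models=6, provider_concurrency=3, lens_deadline_s=600)
job_cap_s = 45 * 60  # timeout-minutes: 45

print(bound_s, job_cap_s, bound_s < job_cap_s)  # 1200 2700 True
```

The bound only covers deadline-bound review work; image pull, checkout, and comment posting are not modeled here.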

(One stale-comment note — adversarial-review.yml:4 still says the image "carries foreman provider-type support," now that no foreman/Mac model remains — but that's a documentation issue outside my lens, not an error-handling defect.)

No edge-case regressions introduced by this change.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 0s


gitea-actions bot commented

2026-06-27 22:07:52 +00:00

🪰 Gadfly review — `glm-5.1:cloud` (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

VERDICT: No material issues found

The change removes four reviewer models and a local Mac endpoint from the Gadfly fleet. From the security lens:

Secret surface reduced: The GADFLY_ENDPOINT_M5 secret is no longer injected into the workflow, which is a slight net positive (one fewer credential in the job environment, one fewer network egress path to a foreman host).
No new secrets, endpoints, or injection vectors are introduced.
The actor allow-list (steve,fizi,dazed) and job-level if: guard are unchanged — authz posture is the same.
Timeout reduction (90→45 min) is neutral-to-positive; a shorter window limits resource-abuse potential.
No tainted input paths are added or modified.

The remaining secrets (GITEA_TOKEN, OLLAMA_CLOUD_API_KEY, GADFLY_FINDINGS_URL, GADFLY_FINDINGS_TOKEN) are unchanged and already appropriately scoped. Nothing to flag.

🎯 Correctness — No material issues found

VERDICT: No material issues found

All removed models (m5/qwen3.6:35b-mlx, gemma4:cloud, gpt-oss:120b-cloud, kimi-k2.7-code:cloud) have no remaining active references in the codebase (only a stale test fixture reference to gpt-oss:120b in provider/ollama/ollama_test.go:250 which is unrelated provider-test data, not Gadfly config).
GADFLY_ENDPOINT_M5 and m5=1 concurrency entry are fully removed; no stale references found.
No stale "9 models" / "fleet of 9" / "M5 Mac" references remain in the workflow file or CLAUDE.md after the edits.
The GADFLY_MODELS value lists exactly 6 models, all with the :cloud provider suffix consistent with the single ollama-cloud=3 concurrency entry.
timeout-minutes: 45 vs GADFLY_TIMEOUT_SECS: "600" (10 min per lens): with ollama-cloud=3 model concurrency and 6 models, worst-case wall time is 2 batches × 600s ≈ 20 min, well within 45 min. No correctness issue.
CLAUDE.md description matches the workflow definition (6 cloud models, 3-lens suite, advisory-only).

🧯 Error handling & edge cases — No material issues found

VERDICT: No material issues found

Reviewed through the error-handling & edge-case lens:

Timeout reduction 90→45 min: Safe given the fleet shrank 9→6 and the slow local lane (M5 Mac) was removed. Worst-case wall time: 2 batches (concurrency 3) × 600s per-lens deadline ≈ 20 min, leaving 25 min headroom. No risk of premature job kill.
Removed GADFLY_ENDPOINT_M5 secret: No remaining references to this env var anywhere in the repo (verified via grep), so a missing/empty secret cannot cause a runtime error.
Removed m5=1 from GADFLY_PROVIDER_CONCURRENCY: The only m5 model, m5/qwen3.6:35b-mlx, is gone from GADFLY_MODELS, so the stale concurrency entry would be unreachable anyway. Removing it is correct and avoids any "unknown provider" parsing edge case in Gadfly.
Stale secret in Gitea: GADFLY_ENDPOINT_M5 will become an orphaned secret, but that's a hygiene concern, not an error-handling issue — Gitea won't inject it unless referenced.
Model list: Verified the 6 remaining models are all :cloud suffixed and all route to the single ollama-cloud provider whose concurrency entries (ollama-cloud=3) are intact. No orphaned provider references.
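The orphaned-provider check described in the list above can be sketched as a small consistency test between the model list and the concurrency entries. Everything here is an assumption made for illustration: the comma-separated formats, the `provider_of` resolution rule (a `name/` prefix selects a named provider, a tag ending in `cloud` maps to `ollama-cloud`), and the helper names. Gadfly's real parser lives in an external image and may behave differently. Only model names that appear verbatim in this thread are used, so the list is a subset of the fleet.

```python
def provider_of(model: str) -> str:
    """Hypothetical resolution rule, not Gadfly's actual parser."""
    if "/" in model:
        return model.split("/", 1)[0]  # e.g. "m5/qwen3.6:35b-mlx" -> "m5"
    tag = model.rsplit(":", 1)[-1]     # e.g. "480b-cloud"
    return "ollama-cloud" if tag.endswith("cloud") else "unknown"


def parse_concurrency(spec: str) -> dict[str, int]:
    """Parse an assumed 'provider=limit,provider=limit' string."""
    entries = (item.split("=", 1) for item in spec.split(",") if item.strip())
    return {name.strip(): int(limit) for name, limit in entries}


def check_fleet(models_csv: str, concurrency_spec: str) -> tuple[list[str], list[str]]:
    """Return (providers used without a limit, limits with no model)."""
    models = [m.strip() for m in models_csv.split(",") if m.strip()]
    limits = set(parse_concurrency(concurrency_spec))
    used = {provider_of(m) for m in models}
    return sorted(used - limits), sorted(limits - used)


cloud_models = "minimax-m3:cloud,glm-5.2:cloud,glm-5.1:cloud,qwen3-coder:480b-cloud"

print(check_fleet(cloud_models, "ollama-cloud=3"))       # ([], [])
print(check_fleet(cloud_models, "ollama-cloud=3,m5=1"))  # ([], ['m5'])
```

The second call shows what leaving `m5=1` behind would look like: an entry with no consumer. Whether Gadfly treats such an entry as an error or ignores it cannot be determined from this thread.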

No edge-case or error-handling defects found in this change.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 30s


gitea-actions bot commented

2026-06-27 22:07:53 +00:00

🪰 Gadfly review — `minimax-m3:cloud` (ollama-cloud)

Verdict: Minor issues — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

Reviewing through the Security lens — this PR is purely a fleet-size trim and timeout shrink in the CI workflow, with parallel documentation updates. No new code paths, no new trust boundaries, no new taint sources, no input handling changes.

What I verified:

GADFLY_ENDPOINT_M5 is gone from the workflow (grep across repo returns zero hits outside the deleted block). The foreman DSN plumbing for m5 is not referenced anywhere else (.env.example and README.md still describe the LLM_M* foreman DSN feature generically — those are the library's documented capabilities, not a CI integration, and they're not security-relevant to this change).
The remaining secrets.* bindings (GITEA_TOKEN, OLLAMA_CLOUD_API_KEY, GADFLY_FINDINGS_URL, GADFLY_FINDINGS_TOKEN) and the trusted-user if: guard (steve|fizi|dazed) plus GADFLY_ALLOWED_USERS are unchanged.
The permissions: block (contents: read, issues: write, pull-requests: write) is unchanged — still no elevated scopes.
timeout-minutes: 90 → 45 reduces the worst-case window a runaway reviewer run holds secrets in memory and posts comments. That's a mild improvement on the secrets-exposure surface, not a regression.
GADFLY_MODELS / GADFLY_PROVIDER_CONCURRENCY / GADFLY_PROVIDER_LENS_CONCURRENCY are untrusted-input-free envs set by workflow author; no user input is interpolated into them. Model name strings are not executed anywhere a security lens cares about.

No material issues found.

The change is purely a fleet trim and a documentation sync. Through the security lens it is a small net positive (one fewer secret plumbed into the job, smaller worst-case runtime window for the secrets-bearing run). Nothing in the diff opens a new injection, SSRF, deserialization, authz, or secret-leakage vector.

🎯 Correctness — No material issues found

All consistent. The CLAUDE.md is updated and README.md has no Gadfly-fleet claim (its references to m5/qwen3:30b,m1/qwen3:30b are in library code examples, not workflow descriptions).

Let me verify the per-lens concurrency math one more time. The fleet has 6 cloud models. With GADFLY_PROVIDER_CONCURRENCY: "ollama-cloud=3", at most 3 of those 6 are running concurrently (sliced into 2 batches of 3). Each running provider fans out its 3 lenses (GADFLY_PROVIDER_LENS_CONCURRENCY: "ollama-cloud=3"). Each lens has a 600s deadline. Worst-case wall time: ~1200s = 20min (two sequential provider-batches, each running 3 lenses in parallel for up to 600s). The 45-min job timeout is fine.

Actually wait — I should think about this more carefully. If provider concurrency is 3, then up to 3 of the 6 cloud models run in parallel. As each finishes, the next one starts. That's a rolling queue, not necessarily 2 strict batches. In the worst case where ALL 6 take 600s each, the first three hold all 3 slots until 600s, so the last three cannot start before then.

Let me re-derive: if provider concurrency = 3 and there are 6 models, the queue is processed 3-at-a-time. If each model takes 600s (max), the LAST model starts at 600s into the run, when a first-wave slot frees up, and runs for 600s, finishing at ~1200s = 20min. Within 45min. ✓

But there's still concern: if lenses run concurrently within a model (3 lenses parallel), and the model takes 600s total wall time (its slowest lens), then with provider concurrency = 3, total worst case ≈ 20 min. With provider concurrency = 1 (serial per provider), it would be 6 × 600s = 60min, which would EXCEED 45min. But provider concurrency = 3 means parallel, so we're fine.

Actually, let me look at this differently: the per-model wall time is bounded by its slowest lens (lenses run in parallel within a provider, so model wall time ≤ lens timeout = 600s). With 6 models and concurrency 3, the 4th/5th/6th models can start as soon as any of the first 3 finishes. So total wall time ≤ 600s × 2 = 1200s = 20min in the absolute worst case. Definitely under 45min. ✓

So 90m → 45m is reasonable.
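The rolling-queue case considered above, where the next model starts as soon as any slot frees up, can be checked with a short simulation. This is a sketch under the same assumptions as the text: each model occupies one slot for at most 600s, and models are started in list order. The `makespan` helper is illustrative and is not Gadfly code.

```python
import heapq


def makespan(durations: list[int], slots: int) -> int:
    """Finish time when jobs start in order as soon as any slot is free."""
    free_at = [0] * slots  # min-heap of the times at which each slot frees up
    heapq.heapify(free_at)
    finish = 0
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available slot
        end = start + duration
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


all_at_deadline = [600] * 6  # six models, each running to the 600 s deadline

print(makespan(all_at_deadline, slots=3))  # 1200
print(makespan(all_at_deadline, slots=1))  # 3600
```

With three slots the queue and the two-batch pictures agree at 1200s, about 20 minutes. With a single slot the same fleet would need 3600s, which is over the 45-minute cap.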

Now let me look one more time at the comment update on line 55-57:

# Cloud-only fleet (no local Macs). Cloud concurrency lives in the
# LENSES: models run a few at a time (ollama-cloud=3) with their 3 lenses
# concurrent (LENS ollama-cloud=3) so comments land sooner.

The original was nearly identical; this is just a cleanup removing the M5-specific phrasing. ✓

VERDICT: No material issues found.

No material issues found.

This PR is a self-contained trim of the review fleet and its supporting documentation. Verified each claim:

Fleet math (.gitea/workflows/adversarial-review.yml:58). The new GADFLY_MODELS lists exactly 6 entries (minimax-m3, glm-5.2, glm-5.1, deepseek-v4-pro, nemotron-3-super, qwen3-coder:480b-cloud) — matches the PR's "cloud-only fleet of 6" claim.
Drop list match. Each of the four dropped names (m5/qwen3.6:35b-mlx, gemma4:cloud, gpt-oss:120b-cloud, kimi-k2.7-code:cloud) was present in the original model string and is absent in the new one; the six remaining names are the original minus those four. 10 → 6 ✓.
GADFLY_ENDPOINT_M5 removal is safe. After the model list change, no m5/... reference remains in GADFLY_MODELS, so the secret mapping has no consumer (grep confirmed the only GADFLY_ENDPOINT_M5 occurrence was the removed line).
m5=1 concurrency removal is safe. Same reason — no m5 provider in GADFLY_MODELS, so the entry would have been a no-op.
Timeout reduction 90m → 45m is correctly sized. With 6 cloud models, GADFLY_PROVIDER_CONCURRENCY=ollama-cloud=3, GADFLY_PROVIDER_LENS_CONCURRENCY=ollama-cloud=3, and a per-lens deadline of 600s: each model's wall time is bounded by its slowest lens (≤600s because lenses within a provider run in parallel), and at most 2 full provider-waves are queued, giving a worst-case wall time ≤ 1200s ≈ 20 min. 45 min is comfortably above that bound; the old 90 min was paying for the slow local Mac lane that no longer exists.
Doc sync (CLAUDE.md:143–146). Updated to "fleet of 6 ollama-cloud models"; README.md has no fleet-size claim to update (its m5/qwen3 references are library usage examples in code blocks, not workflow descriptions).

Nothing in the correctness lane — model set, concurrency knobs, timeout budget, and secret mapping all line up.

🧯 Error handling & edge cases — Minor issues

These are general mort library concerns about per-target retries inside majordomo, not about gadfly's multi-model posting behavior. The draft's claim about gadfly's runtime behavior ("zero findings with no diagnostic lane") is unverifiable from this repo since gadfly is an external tool.

Now let me write the corrected review. The first finding's factual core (provider diversity removed; old graceful-degradation comment deleted) is verifiable, but the dramatic conclusion about gadfly going all-or-nothing isn't. Findings 2 and 3 contain arithmetic that's partially wrong (Finding 2's "2× worst case") and partially unverifiable speculation (rate limiting). Let me produce a clean, verifiable output.

VERDICT: Minor issues

.gitea/workflows/adversarial-review.yml:45–66 (verified by diff vs. previous file): The fleet now consists solely of *:cloud models (GADFLY_MODELS, line 58) and GADFLY_PROVIDER_CONCURRENCY only declares ollama-cloud=3 (line 59) — GADFLY_ENDPOINT_M5 and the M5 entry in GADFLY_MODELS were removed. The header comment that previously documented graceful degradation ("If the Mac is offline that model's comment shows an error and the others still post") is also gone. CLAUDE.md:143–146 is updated consistently to "6 ollama-cloud models". The factual loss of the only non-cloud provider lane is confirmed by the diff. Whether this constitutes a regression in review reliability depends on gadfly's behavior when the cloud provider is unreachable, which lives in the external gitea.stevedudenhoeffer.com/steve/gadfly:sha-d7f364d image and is not in this repo — that part of the claim is not verifiable here. Worth restating the new failure mode honestly in the workflow header (or adding a second provider lane) before merging, but not blocking on what can be checked in-tree.
.gitea/workflows/adversarial-review.yml:48,58–60,65 (timeout/concurrency math): timeout-minutes: 45 vs. GADFLY_TIMEOUT_SECS: "600" × ceil(6/3)=2 model-batches ≈ 20 min worst-case if all requests run to deadline, i.e. ~2.25× headroom, not the "2×" the draft asserts. The 45-min job budget is reasonable. GADFLY_PROVIDER_CONCURRENCY=3 × GADFLY_PROVIDER_LENS_CONCURRENCY=3 = up to 9 concurrent ollama-cloud calls per CI runner — that arithmetic is correct. The claim that ollama-cloud rate-limits "will eat 429s and burn the full 45 min" is speculation about an external service's behavior and not verifiable from this repo. No actionable change required based on what is in-tree.
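The headroom ratio and the peak fan-out quoted in the second finding follow directly from the values cited there. A small sketch using only those numbers, with illustrative variable names:

```python
from fractions import Fraction

job_cap_s = 45 * 60     # timeout-minutes: 45
worst_case_s = 2 * 600  # ceil(6 / 3) = 2 model batches x 600 s deadline

headroom = Fraction(job_cap_s, worst_case_s)  # exact ratio, 9/4
peak_calls = 3 * 3      # 3 models in flight x 3 lenses per model

print(float(headroom), peak_calls)  # 2.25 9
```

The figure of 9 is an upper bound on simultaneous ollama-cloud requests from one job. How the service responds to that load is outside what these numbers can show.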

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 2m 27s

### 🪰 Gadfly review — `minimax-m3:cloud` (ollama-cloud) **Verdict: Minor issues** — 3 reviewers: security, correctness, error-handling <details><summary>🔒 Security — No material issues found</summary> Reviewing through the Security lens — this PR is purely a fleet-size trim and timeout shrink in the CI workflow, with parallel documentation updates. No new code paths, no new trust boundaries, no new taint sources, no input handling changes. What I verified: - `GADFLY_ENDPOINT_M5` is gone from the workflow (grep across repo returns zero hits outside the deleted block). The foreman DSN plumbing for `m5` is not referenced anywhere else (`.env.example` and `README.md` still describe the `LLM_M*` foreman DSN feature generically — those are the library's documented capabilities, not a CI integration, and they're not security-relevant to this change). - The remaining `secrets.*` bindings (`GITEA_TOKEN`, `OLLAMA_CLOUD_API_KEY`, `GADFLY_FINDINGS_URL`, `GADFLY_FINDINGS_TOKEN`) and the trusted-user `if:` guard (`steve|fizi|dazed`) plus `GADFLY_ALLOWED_USERS` are unchanged. - The `permissions:` block (`contents: read`, `issues: write`, `pull-requests: write`) is unchanged — still no elevated scopes. - `timeout-minutes: 90 → 45` reduces the worst-case window a runaway reviewer run holds secrets in memory and posts comments. That's a mild improvement on the secrets-exposure surface, not a regression. - `GADFLY_MODELS` / `GADFLY_PROVIDER_CONCURRENCY` / `GADFLY_PROVIDER_LENS_CONCURRENCY` are untrusted-input-free envs set by workflow author; no user input is interpolated into them. Model name strings are not executed anywhere a security lens cares about. **No material issues found.** The change is purely a fleet trim and a documentation sync. Through the security lens it is a small net positive (one fewer secret plumbed into the job, smaller worst-case runtime window for the secrets-bearing run). 
Nothing in the diff opens a new injection, SSRF, deserialization, authz, or secret-leakage vector. </details> <details><summary>🎯 Correctness — No material issues found</summary> All consistent. The CLAUDE.md is updated and README.md has no Gadfly-fleet claim (its references to `m5/qwen3:30b,m1/qwen3:30b` are in library code examples, not workflow descriptions). Let me verify the per-lens concurrency math one more time. The fleet has 6 cloud models. With `GADFLY_PROVIDER_CONCURRENCY: "ollama-cloud=3"`, at most 3 of those 6 are running concurrently (sliced into 2 batches of 3). Each running provider fans out its 3 lenses (`GADFLY_PROVIDER_LENS_CONCURRENCY: "ollama-cloud=3"`). Each lens has a 600s deadline. Worst-case wall time: ~1200s = 20min (two sequential provider-batches, each running 3 lenses in parallel for up to 600s). The 45-min job timeout is fine. Actually wait — I should think about this more carefully. If provider concurrency is 3, then up to 3 of the 6 cloud models run in parallel. As each finishes, the next one starts. That's a single wave shape, not 2 sequential batches. So in the worst case where ALL 6 take 600s each, with 3 slots, the last one wouldn't start until 3 × 600s = 1800s = 30min in — past the 45-min timeout. Let me re-derive: if provider concurrency = 3 and there are 6 models, the queue is processed 3-at-a-time. If each model takes 600s (max), the LAST model starts at (5/3) × 600s ≈ 1000s into the run and runs for 600s, finishing at ~1600s ≈ 27min. Within 45min. ✓ But there's still concern: if lenses run concurrently within a model (3 lenses parallel), and the model takes 600s total wall time (its slowest lens), then with provider concurrency = 3, total worst case ≈ 27 min. With provider concurrency = 1 (serial per provider), it would be 6 × 600s = 60min, which would EXCEED 45min. But provider concurrency = 3 means parallel, so we're fine. 
Actually, let me look at this differently: the per-model wall time is bounded by its slowest lens (lenses run in parallel within a provider, so model wall time ≤ lens timeout = 600s). With 6 models and concurrency 3, the 4th/5th/6th models can start as soon as any of the first 3 finishes. So total wall time ≤ 600s × 2 = 1200s = 20min in the absolute worst case. Definitely under 45min. ✓ So 90m → 45m is reasonable. Now let me look one more time at the comment update on line 55-57: ``` # Cloud-only fleet (no local Macs). Cloud concurrency lives in the # LENSES: models run a few at a time (ollama-cloud=3) with their 3 lenses # concurrent (LENS ollama-cloud=3) so comments land sooner. ``` The original was nearly identical; this is just a cleanup removing the M5-specific phrasing. ✓ VERDICT: **No material issues found.** --- **No material issues found.** This PR is a self-contained trim of the review fleet and its supporting documentation. Verified each claim: - **Fleet math (`.gitea/workflows/adversarial-review.yml:58`).** The new `GADFLY_MODELS` lists exactly 6 entries (`minimax-m3`, `glm-5.2`, `glm-5.1`, `deepseek-v4-pro`, `nemotron-3-super`, `qwen3-coder:480b-cloud`) — matches the PR's "cloud-only fleet of 6" claim. - **Drop list match.** Each of the four dropped names (`m5/qwen3.6:35b-mlx`, `gemma4:cloud`, `gpt-oss:120b-cloud`, `kimi-k2.7-code:cloud`) was present in the original model string and is absent in the new one; the six remaining names are the original minus those four. 10 → 6 ✓. - **`GADFLY_ENDPOINT_M5` removal is safe.** After the model list change, no `m5/...` reference remains in `GADFLY_MODELS`, so the secret mapping has no consumer (grep confirmed the only `GADFLY_ENDPOINT_M5` occurrence was the removed line). - **`m5=1` concurrency removal is safe.** Same reason — no `m5` provider in `GADFLY_MODELS`, so the entry would have been a no-op. 
- **Timeout reduction 90m → 45m is correctly sized.** With 6 cloud models, `GADFLY_PROVIDER_CONCURRENCY=ollama-cloud=3`, `GADFLY_PROVIDER_LENS_CONCURRENCY=ollama-cloud=3`, and a per-lens deadline of 600s: each model's wall time is bounded by its slowest lens (≤600s because lenses within a provider run in parallel), and at most 2 full provider-waves are queued, giving a worst-case wall time ≤ 1200s ≈ 20 min. 45 min is comfortably above that bound; the old 90 min was paying for the slow local Mac lane that no longer exists. - **Doc sync (`CLAUDE.md:143–146`).** Updated to "fleet of 6 ollama-cloud models"; README.md has no fleet-size claim to update (its `m5/qwen3` references are library usage examples in code blocks, not workflow descriptions). Nothing in the correctness lane — model set, concurrency knobs, timeout budget, and secret mapping all line up. </details> <details><summary>🧯 Error handling & edge cases — Minor issues</summary> These are general mort library concerns about per-target retries inside majordomo, not about gadfly's multi-model posting behavior. The draft's claim about gadfly's runtime behavior ("zero findings with no diagnostic lane") is unverifiable from this repo since gadfly is an external tool. Now let me write the corrected review. The first finding's factual core (provider diversity removed; old graceful-degradation comment deleted) is verifiable, but the dramatic conclusion about gadfly going all-or-nothing isn't. Findings 2 and 3 contain arithmetic that's partially wrong (Finding 2's "2× worst case") and partially unverifiable speculation (rate limiting). Let me produce a clean, verifiable output. **VERDICT: Minor issues** - **.gitea/workflows/adversarial-review.yml:45–66** (verified by diff vs. previous file): The fleet now consists solely of `*:cloud` models (`GADFLY_MODELS`, line 58) and `GADFLY_PROVIDER_CONCURRENCY` only declares `ollama-cloud=3` (line 59) — `GADFLY_ENDPOINT_M5` and the M5 entry in `GADFLY_MODELS` were removed. 
The header comment that previously documented graceful degradation ("If the Mac is offline that model's comment shows an error and the others still post") is also gone. CLAUDE.md:143–146 is updated consistently to "6 ollama-cloud models". The factual loss of the only non-cloud provider lane is confirmed by the diff. Whether this constitutes a regression in review reliability depends on gadfly's behavior when the cloud provider is unreachable, which lives in the external `gitea.stevedudenhoeffer.com/steve/gadfly:sha-d7f364d` image and is not in this repo — that part of the claim is not verifiable here. Worth restating the new failure mode honestly in the workflow header (or adding a second provider lane) before merging, but not blocking on what can be checked in-tree. - **.gitea/workflows/adversarial-review.yml:48,58–60,65** (timeout/concurrency math): `timeout-minutes: 45` vs. `GADFLY_TIMEOUT_SECS: "600"` × `ceil(6/3)=2` model-batches ≈ 20 min worst-case if all requests run to deadline, i.e. ~2.25× headroom, not the "2×" the draft asserts. The 45-min job budget is reasonable. `GADFLY_PROVIDER_CONCURRENCY=3` × `GADFLY_PROVIDER_LENS_CONCURRENCY=3` = up to 9 concurrent ollama-cloud calls per CI runner — that arithmetic is correct. The claim that ollama-cloud rate-limits "will eat 429s and burn the full 45 min" is speculation about an external service's behavior and not verifiable from this repo. No actionable change required based on what is in-tree. </details> Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 2m 27s
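The worst-case wall-time figures debated above (20, 27, and 30 minutes) can be checked with a small scheduling simulation. This is a minimal sketch, not Gadfly's actual scheduler: it assumes each model occupies one provider slot for at most the 600s lens deadline (lenses within a model run in parallel), that a queued model starts as soon as any slot frees up, and that there are no retries or startup overhead. The function name `worst_case_wall_time` is hypothetical.

```python
import heapq

def worst_case_wall_time(num_models: int, slots: int, per_model_secs: int) -> int:
    """Greedy slot scheduler: each queued model starts when the earliest slot frees."""
    free_at = [0] * slots              # time at which each slot next becomes free
    heapq.heapify(free_at)
    finish = 0
    for _ in range(num_models):
        start = heapq.heappop(free_at)  # earliest available slot
        end = start + per_model_secs    # assumed: model runs to the full deadline
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish

# Values quoted in the review; how Gadfly applies them is an assumption.
MODELS = 6
PROVIDER_SLOTS = 3        # GADFLY_PROVIDER_CONCURRENCY: "ollama-cloud=3"
LENSES_PER_MODEL = 3      # GADFLY_PROVIDER_LENS_CONCURRENCY: "ollama-cloud=3"
LENS_TIMEOUT_SECS = 600   # GADFLY_TIMEOUT_SECS: "600"

parallel = worst_case_wall_time(MODELS, PROVIDER_SLOTS, LENS_TIMEOUT_SECS)
serial = worst_case_wall_time(MODELS, 1, LENS_TIMEOUT_SECS)
PEAK_CALLS = PROVIDER_SLOTS * LENSES_PER_MODEL

print(f"3 slots: {parallel}s ({parallel / 60:.0f} min)")
print(f"1 slot:  {serial}s ({serial / 60:.0f} min)")
print(f"peak concurrent calls: {PEAK_CALLS}")
```

Under these assumptions the rolling queue and the two-batch picture give the same bound, because every model takes the same maximum time. The 45-minute job timeout would only be at risk if provider concurrency fell to 1 or if a model could exceed the per-lens deadline.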

gitea-actions bot commented

2026-06-27 22:08:50 +00:00

🪰 Gadfly review — `deepseek-v4-pro:cloud` (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

No material issues found.

Through the security lens, the diff is a net positive:

- `GADFLY_ENDPOINT_M5` secret removed: the foreman URL+token that connected to a local Mac is no longer exposed as an environment variable in the workflow. Verified via `grep` that no references to `GADFLY_ENDPOINT_M5` or `GADFLY_ENDPOINT` remain anywhere in the repo. This shrinks the secret surface area.
- No new secrets, inputs, or injection vectors introduced: the remaining env vars (`GITEA_TOKEN`, `OLLAMA_CLOUD_API_KEY`, `GADFLY_FINDINGS_URL`, `GADFLY_FINDINGS_TOKEN`) are unchanged. No new user-controlled input flows into the container.
- Auth guard intact: the `if:` condition restricting comment-triggered runs to `steve`, `fizi`, `dazed` is untouched, and `GADFLY_ALLOWED_USERS` mirrors it.
- No path/command/SQL/template construction in the diff: it is purely a model-list trim, timeout reduction, and comment sync.

🎯 Correctness — No material issues found

No material issues found

The diff is a straightforward fleet trim: four models are removed from `GADFLY_MODELS`, their associated provider registration (`GADFLY_ENDPOINT_M5`) and concurrency entry (`m5=1`) are dropped, and the timeout is halved. I verified:

- No dangling references: `m5/qwen3.6:35b-mlx` was the sole consumer of the `m5` provider (registered via `GADFLY_ENDPOINT_M5`) and the sole entry behind `m5=1` in `GADFLY_PROVIDER_CONCURRENCY`. All three are removed together, so nothing is left behind that would reference a missing provider or concurrency key.
- Remaining model count matches: 6 models in the new `GADFLY_MODELS` string, all `:cloud`-suffixed, all served by `ollama-cloud`, consistent with the `ollama-cloud=3` concurrency settings.
- Timeout 90m → 45m is sound: with the slow local Mac lane gone, worst-case wall time is ~2 batches × 600s lens timeout = 20 min. 45 min provides >2× headroom.
- `CLAUDE.md` references to `m5/qwen3:30b` (line 81) and other repo references are majordomo library examples/docs. They describe the library's own env-DSN provider feature, not Gadfly's fleet, and are unaffected by this change.
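The "no dangling references" and "model count matches" checks above can be expressed as a small consistency test. This is an illustrative sketch only: the model names come from the PR description, but the comma-separated list format, the rule that a `provider/model` prefix selects a named provider while everything else goes to `ollama-cloud`, and the helper names `provider_of`, `parse_concurrency`, and `unused_concurrency_keys` are assumptions, not Gadfly's documented behavior.

```python
REMAINING = [
    "minimax-m3:cloud", "glm-5.2:cloud", "glm-5.1:cloud",
    "deepseek-v4-pro:cloud", "nemotron-3-super:cloud", "qwen3-coder:480b-cloud",
]
DROPPED = [
    "m5/qwen3.6:35b-mlx", "gemma4:cloud",
    "gpt-oss:120b-cloud", "kimi-k2.7-code:cloud",
]

def provider_of(model: str) -> str:
    """Assumed rule: 'provider/model' names a provider; otherwise ollama-cloud."""
    return model.split("/", 1)[0] if "/" in model else "ollama-cloud"

def parse_concurrency(spec: str) -> dict[str, int]:
    """Parse a spec such as 'ollama-cloud=3,m5=1' into {provider: limit}."""
    pairs = (item.partition("=") for item in spec.split(",") if item)
    return {name.strip(): int(limit) for name, _, limit in pairs}

def unused_concurrency_keys(models: list[str], spec: str) -> set[str]:
    """Concurrency entries whose provider has no model left in the fleet."""
    return set(parse_concurrency(spec)) - {provider_of(m) for m in models}

old_models = REMAINING + DROPPED

# Before the PR: both concurrency entries have a consumer.
print(unused_concurrency_keys(old_models, "ollama-cloud=3,m5=1"))
# Trimming the fleet without touching the spec would leave m5 orphaned.
print(unused_concurrency_keys(REMAINING, "ollama-cloud=3,m5=1"))
# After the PR: nothing is orphaned.
print(unused_concurrency_keys(REMAINING, "ollama-cloud=3"))
```

The middle case shows why the `m5=1` entry and `GADFLY_ENDPOINT_M5` had to be removed together with the model: left in place, they would configure a provider that no model uses.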

🧯 Error handling & edge cases — No material issues found

No material issues found.

The diff is a clean removal: the four models, the `GADFLY_ENDPOINT_M5` env var, and the `m5=1` concurrency entry are all dropped together. I verified:

- No dangling references: `grep` confirms `GADFLY_ENDPOINT_M5`, `m5/qwen`, and `m5=1` no longer appear in the workflow file. The remaining `GADFLY_MODELS` list (6 entries, all `:cloud`) is consistent with the single `GADFLY_PROVIDER_CONCURRENCY` entry (`ollama-cloud=3`).
- No orphaned provider: since no model references the `m5` provider, the container won't attempt to resolve `GADFLY_ENDPOINT_M5`, so removing the env var is safe.
- Timeout reduction 90m → 45m: with the slow local Mac lane gone, the worst-case wall time is ~2 batches × 10 min per-lens timeout + overhead ≈ 25–30 minutes. 45 minutes provides reasonable headroom; the per-lens `GADFLY_TIMEOUT_SECS: "600"` still bounds individual runaway reviews regardless.
- No nil/empty/zero edge cases introduced: the remaining config values are all non-empty strings with the same structure as before.

Automated adversarial review by Gadfly. Advisory only, does not block merge. · ⏱️ reviewed in 31s


gitea-actions bot commented

2026-06-27 22:09:20 +00:00

🪰 Gadfly review — `nemotron-3-super:cloud` (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

No material issues found

🎯 Correctness — No material issues found

No material issues found. The changes align with the PR description: removed the four specified reviewers, removed the corresponding endpoint and concurrency setting, reduced the timeout from 90 to 45 minutes, and updated the comments accordingly. All changes were verified in the workflow file. No correctness issues were detected.

🧯 Error handling & edge cases — No material issues found

No material issues found

Reasoning (Error handling & edge cases lens):
The changes are limited to workflow configuration (`.gitea/workflows/adversarial-review.yml`) and documentation (`CLAUDE.md`). No actual Gadfly source code is modified, so error handling concerns are limited to the workflow's operational aspects:

1. Timeout reduction (90 → 45 min): The workflow timeout was halved after removing the slow local Mac lane and four cloud models. The comment states "Plenty of headroom for the cloud lanes." Without runtime data, this is a judgment call but not an outright error handling flaw; it is a configuration adjustment. If the timeout proves insufficient in practice, the workflow would fail visibly (step timeout), which is an observable failure mode, not a silent error.
2. Removed Mac lane secrets/concurrency:
   - `GADFLY_ENDPOINT_M5` and its associated comment were removed.
   - `GADFLY_PROVIDER_CONCURRENCY` changed from `"ollama-cloud=3,m5=1"` to `"ollama-cloud=3"`.

   Verified via `grep` that `GADFLY_ENDPOINT_M5` is no longer referenced anywhere in the repository. No dangling references or orphaned cleanup obligations remain.
3. Error propagation: The workflow step runs a container; if the container fails (e.g., missing secrets, runtime errors), the step fails and the workflow stops, which is appropriate for a CI job. No swallowed errors or missing rollback logic is evident because the job allocates no persistent resources requiring cleanup.
4. Documentation: Changes in `CLAUDE.md` merely update the fleet size description (9 → 6 models, Mac removal) and remove references to the Mac lane. No error handling implications.

No material error handling or edge case defects were identified in the scoped changes.

Verdict: No material issues found.

Automated adversarial review by Gadfly. Advisory only, does not block merge. · ⏱️ reviewed in 3m 49s


gitea-actions bot commented

2026-06-27 22:09:21 +00:00

🪰 Gadfly review — `qwen3-coder:480b-cloud` (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

VERDICT: No material issues found

The changes in this PR are focused on reducing the number of reviewers in the Gadfly swarm by removing four models and the local M5 Mac endpoint. From a security perspective, the modifications involve:

1. Removing the `GADFLY_ENDPOINT_M5` environment variable and associated configuration
2. Updating the model fleet from 9 cloud models + 1 local to 6 cloud-only models
3. Adjusting timeout values and concurrency settings accordingly
4. Updating documentation to reflect these changes

These are operational changes that do not introduce new security vulnerabilities or weaken existing security controls. The removal of local endpoints actually reduces the attack surface by eliminating a potential local network access point. The changes are consistent across both the workflow configuration and documentation, maintaining the advisory-only nature of the review system.

No security-related issues were identified in the diff. All changes appear to be intentional operational adjustments that maintain the existing security posture.

🎯 Correctness — No material issues found

VERDICT: No material issues found

The changes correctly update the documentation and workflow configuration to reflect the reduced fleet of reviewers. The timeout reduction from 90m to 45m is consistent with removing the slower local models. The model list and concurrency settings have been appropriately updated to reflect the cloud-only configuration. No correctness issues were identified in the changes.

🧯 Error handling & edge cases — No material issues found

VERDICT: No material issues found

The changes in this PR are focused on updating documentation and configuration to reflect the removal of several reviewers from the Gadfly swarm. Through the lens of error handling and edge cases:

- The removal of `GADFLY_ENDPOINT_M5` and related configurations appears consistent across both the workflow file and documentation.
- The reduction in `timeout-minutes` from 90 to 45 is accompanied by explanatory comments that clarify the change.
- The model list in `GADFLY_MODELS` has been appropriately trimmed, and the concurrency settings have been updated to remove the `m5=1` entry.
- No new error handling scenarios or edge cases are introduced in the diff that would require additional scrutiny.

All changes appear to be consistent and do not introduce new error handling or edge case concerns. The removal of local Mac reviewers and the corresponding configuration updates are handled cleanly.

Automated adversarial review by Gadfly. Advisory only, does not block merge. · ⏱️ reviewed in 10s


steve merged commit 2b35f1741c into main

2026-06-27 22:56:57 +00:00

Reference: steve/majordomo#5
