executus

Author	SHA1	Message	Date
steve	82a816ae29	ci(gadfly): trim pool to the strong 6 (drop m5/qwen3.6, gemma4, gpt-oss, kimi-k2.7) [executus CI / test (push): Successful in 46s] Pool now: minimax-m3, glm-5.2, glm-5.1, deepseek-v4-pro, nemotron-3-super, qwen3-coder:480b (all cloud, ollama-cloud=3). Removed the low-value reviewers + the last local endpoint (m5). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-27 18:06:36 -04:00
steve	f3bd43b726	ci(gadfly): drop the m1 reviewer (dead weight; keep m5) [executus CI / test (pull_request): Failing after 1m1s] m1/qwen3:14b proved consistently low-value + slowest in the pool over multiple PRs. Removed from GADFLY_MODELS + GADFLY_PROVIDER_CONCURRENCY + its endpoint so it never fires again. m5 retained. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-27 14:41:14 -04:00
steve	a103cc5e9f	ci(gadfly): 9-cloud panel @ 3 models x 3 lenses (9 concurrent) [executus CI / test (push): Failing after 1m57s] Match mort: minimax-m3, glm-5.2, glm-5.1 (SWE-Bench Pro SOTA), kimi-k2.7-code, deepseek-v4-pro, nemotron-3-super, gpt-oss:120b, qwen3-coder:480b, gemma4 (8 families) + m1/m5 locals. ollama-cloud=3 x lens=3 = 9 concurrent (10 budget). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-27 12:17:24 -04:00
steve	4d28cd6e2c	ci(gadfly): 4-cloud pool: add kimi-k2.7-code + deepseek-v4-pro, drop v4-flash [executus CI / test (push): Failing after 1m2s] Match mort's new cloud panel: minimax-m3, glm-5.2, kimi-k2.7-code (Moonshot), deepseek-v4-pro (frontier, replaces v4-flash). Keeps m1/m5 locals + the existing ollama-cloud=1 + lens-concurrency=3 serial-model style. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-27 11:59:13 -04:00
steve	dcaefff756	ci(gadfly): add M1/M5 Macs back to the reviewer pool (full fleet) [executus CI / test (push): Failing after 1m23s] Re-adds the local Macs (m1/qwen3:14b, m5/qwen3.6:35b-mlx) via their foreman endpoints alongside the 3 cloud models. Cloud keeps lens fan-out (ollama-cloud=1 model + lens=3); each Mac runs one model with lenses serial (foreman serializes anyway); all provider lanes parallel. Bumps the job timeout 30->90m for the slow local lanes. With findings telemetry now on, gadfly-reports can quantify whether the Macs earn their keep. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-27 10:44:22 -04:00
steve	e37cf415de	ci(gadfly): emit findings to gadfly-reports + bump image to sha-d7f364d [executus CI / test (push): Failing after 2m40s] Adds GADFLY_FINDINGS_URL / GADFLY_FINDINGS_TOKEN (user-scope secrets) so each review POSTs its run + findings to the gadfly-reports store, and bumps the pinned gadfly image to sha-d7f364d (the build carrying the findings-emit). Advisory only: emit failures never affect the review. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-27 09:12:46 -04:00
steve	16ddd90914	ci(gadfly): new build sha-d0de034 + per-lens concurrency [executus CI / test (push): Successful in 59s] Bump the gadfly image to sha-d0de034 (adds GADFLY_PROVIDER_LENS_CONCURRENCY) and move ollama-cloud's concurrency from the MODEL axis to the LENS axis: GADFLY_PROVIDER_CONCURRENCY: ollama-cloud=1 (one model at a time); GADFLY_PROVIDER_LENS_CONCURRENCY: ollama-cloud=3 (its 3 lenses concurrent). Net: still 3 models, but reviewed serially: the first model's consolidated comment lands sooner and each model finishes faster, while the other two models' comments arrive in series after it (instead of all 3 in parallel). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-26 22:57:14 -04:00
steve	b35514dfaa	ci(gadfly): cloud-only fleet (3 models, drop local Macs) [executus CI / test (push): Successful in 57s] Measured on the P2 review: the local Macs (m1/m5) took 26–29 min with lens timeouts and found ZERO real bugs, while the two cloud models found every genuine finding in 6–12 min. Drop the Macs; add glm-5.2:cloud as a third cloud reviewer. Net: faster (~29→~12 min) and higher signal. Models: minimax-m3:cloud, deepseek-v4-flash:cloud, glm-5.2:cloud (ollama-cloud=3 concurrency). timeout-minutes 90→30. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-27 02:02:21 +00:00
steve	e0e2c0451a	ci: sync gadfly review config to mort's foreman-provider setup Mirror mort's updated adversarial-review.yml: m1/m5 pulled in via the GADFLY_ENDPOINT_M1/_M5 secrets using gadfly's "foreman" provider type (providers m1/m5; models m1/qwen3:14b, m5/qwen3.6:35b-mlx), 2 cloud models, 3-lens suite, pinned to the gadfly :sha-6e3a83c image. Header adjusted for executus; functional config identical to mort's tested version. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-27 02:02:21 +00:00
steve	741d7816ed	ci: add gadfly adversarial review on PRs (mirrors mort) [executus CI / test (push): Successful in 35s] Same setup as mort: the published gadfly:v1 image as a specialist swarm (m1/m5 local Macs + 2 cloud models, 3-lens suite), posting one consolidated advisory comment. Must live on main so it triggers on PRs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-26 19:46:47 -04:00

10 Commits