17 Commits

Author SHA1 Message Date
Steve Dudenhoeffer 6e87a3e73f docs: correct examples/reusable.yml pin guidance (runners cache @v1; prefer @sha)
Adversarial Review (Gadfly) / review (pull_request) Successful in 3m4s
The @v1 comment claimed it auto-updates on releases, but long-lived act_runners
cache the reusable by ref so a moved tag isn't re-fetched. Recommend an
immutable @<sha>; routine tuning rides owner variables.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 02:10:35 -04:00
steve 8f69e71311 docs: recommend the @v1 release tag for reusable-workflow consumers (#12)
Build & push image / build-and-push (push) Successful in 6s
2026-06-28 04:17:19 +00:00
steve b02b11d691 feat(reusable): ship the curated swarm as the default config consumers inherit (#10)
Build & push image / build-and-push (push) Successful in 8s
2026-06-28 02:23:40 +00:00
Steve Dudenhoeffer f06fe5ef72 security: scope reusable-workflow secrets (least privilege) over secrets: inherit
Adversarial Review (Gadfly) / review (pull_request) Failing after 2s
Build & push image / build-and-push (pull_request) Successful in 6s
The swarm (reviewing the mort/executus rollout PRs) correctly flagged that
`secrets: inherit` forwards EVERY caller secret to the reusable review
workflow — registry/deploy/db creds the reviewer never touches. Fix:

- review-reusable.yml: declare workflow_call.secrets (all optional) so a
  caller can forward only what the reviewer needs.
- adversarial-review.yml (gadfly's own caller) + examples/reusable.yml:
  replace `secrets: inherit` with an explicit forward of just
  OLLAMA_CLOUD_API_KEY / CLAUDE_CODE_OAUTH_TOKEN / findings tokens.
  GITEA_TOKEN stays automatic.
- Docs (README, examples) updated; also advise pinning consumers to an
  immutable @<sha> instead of @main (supply-chain, the other finding).

gadfly's own review on this PR exercises the explicit-secrets path (local
reusable ref) — validating it on the act_runner before mort/executus adopt it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 20:45:18 -04:00
steve 5f86062a5a feat: Phase 4 — reusable "subscribe" workflow (+ dogfood it) (#8)
Build & push image / build-and-push (push) Successful in 9s
Centralizes the consumer stub into a reusable Gitea workflow
(.gitea/workflows/review-reusable.yml, workflow_call + defaulted inputs +
secrets: inherit); gadfly's own dogfood is now a thin caller of it, which
proved end-to-end that github.event context propagates into the reusable
on this act_runner. Adds the slim examples/reusable.yml stub + docs.

Folded in the swarm's findings: timeout_minutes default 30->45, map
GADFLY_API_KEY, explicit permissions block, drop the dead specialist_suite
input, and harden the example's actor gate. ~70 findings graded.

Completes the gadfly-games build (Phases 1-4 + quality fixes).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com>
Co-committed-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com>
2026-06-27 23:42:01 +00:00
steve 82f7ef78d5 feat: claude-code backends + llamaswap provider + dogfood the CC engine (#3)
Build & push image / build-and-push (push) Successful in 10s
Phase 2: bump majordomo to latest and wire its new llamaswap provider
into gadfly's endpoint switches; add claude-code/sonnet to gadfly's own
dogfood swarm (pin :sha-86f12c1, map CLAUDE_CODE_OAUTH_TOKEN) so the
Phase-1 engine runs as a live competitor; document the Ollama-through-CC
ANTHROPIC_BASE_URL proxy path as example-only.

The 11-model swarm (incl. claude-code/sonnet) reviewed it; 52 findings
graded via the MCP. Folded in the two real ones: a llamaswap
endpointProvider test (caught by claude-code/sonnet, citing CLAUDE.md)
and adding "openai-compatible" to the provider error messages (gpt-oss).

gofmt clean, go vet quiet, go build + go test -race green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com>
Co-committed-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com>
2026-06-27 21:53:41 +00:00
steve 86f12c126f feat: claude-code reviewer engine (#2)
Build & push image / build-and-push (push) Successful in 28s
Phase 1: a second review engine alongside the majordomo agent loop. For
each lens, shell out to the Claude Code CLI (`claude -p --output-format
json`) inside the checked-out repo so it verifies findings with its own
read tools, then reuse gadfly's verdict-parse + recheck + consolidate +
emit pipeline. Select via GADFLY_MODELS `claude-code`/`claude-code/<model>`;
auth via CLAUDE_CODE_OAUTH_TOKEN (no --bare) else ANTHROPIC_API_KEY;
read-only by default; GADFLY_CLAUDE_* knobs. Dockerfile bundles Node +
@anthropic-ai/claude-code. Also bumped the dogfood pin to the status-board
image (PR #2 was the first dogfood with the live board + full fleet).

Folded in the swarm's own review findings: minimal subprocess env (no
GITEA_TOKEN leak to the CLI), runPass robustness (ctx/empty-result/runErr),
process-group cleanup on timeout, rune-safe error truncation, and
engine-neutral prompts (also de-mort-ified the recheck prompt). 66 findings
graded via the gadfly MCP.

gofmt clean, go vet quiet, go build + go test -race green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com>
Co-committed-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com>
2026-06-27 20:40:41 +00:00
steve c3d09d3bd4 feat: live status-board comment + full-fleet dogfood (#1)
Build & push image / build-and-push (push) Successful in 6s
Phase 3: one consolidated, live-updating PR comment aggregating every
model's per-lens progress (queued -> running -> finished + verdict), so
the swarm's progress is visible at a glance and a watcher can tell when
it's done. Opt-in statusWriter in the binary (atomic writes) + a
background status-board.sh renderer wired through entrypoint.sh; default
on, GADFLY_STATUS_BOARD=0 to disable.

Also restores gadfly's dogfood swarm to the full cloud fleet (9 cloud +
M5; M1 dropped as too slow) matching mort, and folds in the 3 real bugs
the swarm found on its own PR (skip-binary stuck-waiting, panic-stuck
lens, busy-loop on bad poll interval). All 36 findings graded via the
gadfly MCP (18 real / 18 false-positive).

gofmt clean, go vet quiet, go build + go test -race green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com>
Co-committed-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com>
2026-06-27 19:00:12 +00:00
steve d7f364d803 feat: optional findings telemetry — emit runs+findings to a gadfly-reports store
Build & push image / build-and-push (push) Successful in 8s
After each review the binary POSTs the run + its heuristically-extracted findings to GADFLY_FINDINGS_URL (off unless set). Advisory: any error only goes to stderr — never touches stdout, the exit code, or the review. stdlib net/http only (no new deps). entrypoint.sh derives GADFLY_REPO/GADFLY_PR and passes through GADFLY_FINDINGS_URL/GADFLY_FINDINGS_TOKEN. Also renames store references from the old 'docket' name to 'gadfly-reports'.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 09:09:51 -04:00
steve d0de034726 feat: configurable lens fan-out, per-provider like model concurrency
Build & push image / build-and-push (push) Successful in 9s
Specialist lenses ran strictly sequentially within a model. Add a
GADFLY_LENS_CONCURRENCY knob (default 1 = unchanged) that overlaps the
independent per-lens review+recheck passes, so a model posts its
consolidated comment as soon as its lenses finish.

Per-provider configurable, mirroring GADFLY_PROVIDER_CONCURRENCY:
GADFLY_PROVIDER_LENS_CONCURRENCY takes a "provider=N,..." map keyed by
the same provider lanes (modelProvider() mirrors entrypoint's provider_of;
providerOverride() mirrors provider_cap). The override wins for the model's
lane, else the scalar default.

runSpecialists fans out via a bounded worker pool, order-preserving
(results written by index) and keeping each lens's own timeout/recheck.
repoFS is immutable + fresh-toolbox-per-pass, so lenses share no mutable
state (verified under -race). Docs/examples updated; dropped a duplicate
GADFLY_TIMEOUT_SECS README row.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 22:53:27 -04:00
steve 6e3a83c437 feat: add foreman provider type for endpoint overrides
Build & push image / build-and-push (push) Successful in 7s
Accept "foreman" in both resolveModel (GADFLY_BASE_URL) and endpointProvider
(GADFLY_ENDPOINT_*) switches, mapping to majordomo's ollama.Foreman() preset
(handles foreman's non-streaming/long-poll quirks). Unlike the HTTPS-only
LLM_* foreman:// DSN, the base URL is verbatim, so a plaintext http:// foreman
queue works. Tests + README provider table + endpoint-aliases example updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 20:13:47 -04:00
Steve Dudenhoeffer a1e9d109e5 security: add job-level if-guard to example stubs (gate comment trigger by actor)
Build & push image / build-and-push (push) Successful in 5s
Per a Gadfly self-review finding (kimi-k2.7-code): an issue_comment can start a
secret-bearing run before the in-container allowed-users check. Add a workflow
if: that only lets trusted actors trigger via comment (PR/dispatch already
trusted); keep GADFLY_ALLOWED_USERS as the belt-and-suspenders layer. README
documents it + the default-branch caveat for comment triggers. (Docs/examples
only — paths-ignored, no image rebuild.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
EOF
2026-06-25 21:49:23 -04:00
Steve Dudenhoeffer 7809d1b93d feat: specialist suite — configurable + custom review lenses (one consolidated comment)
Build & push image / build-and-push (push) Successful in 8s
Replace the single generic review with a suite of focused specialists, each its
own review+recheck pass, merged into ONE comment (a collapsible section per lens,
led by the worst verdict; the optional `improvements` lens never escalates it).

- cmd/gadfly/specialists.go: built-in lenses + default suite (security, correctness,
  maintainability, performance, error-handling) + opt-in (tests, docs, conventions,
  improvements). Selection via GADFLY_SPECIALISTS (csv/"all"); custom defs via
  GADFLY_SPECIALIST_<NAME> env and a repo .gadfly.yml (specialists + define).
  Precedence: built-ins < file < env. Unknown names error but don't sink the run.
- cmd/gadfly/consolidate.go: verdict parse + one-comment render.
- main.go: loop specialists; per-lens failure is an inline notice, never fatal.
  Default timeout bumped to 600s (suite runs sequentially).
- base system prompt trimmed to persona+tools+discipline+output; lens-specific
  focus is appended per specialist (semantic re-derivation discipline kept in base).
- entrypoint default models -> single model (suite already gives breadth; cost ~=
  specialists × models × 2). Adds gopkg.in/yaml.v3.
- docs/examples: README "Specialists" section, examples/.gadfly.yml, stub var,
  CLAUDE.md architecture/config. Dynamic `auto` selection is the planned next step.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 19:23:05 -04:00
Steve Dudenhoeffer 04cd260ff9 docs: add CLAUDE.md + provider example configs
Build & push image / build-and-push (push) Successful in 6s
- CLAUDE.md: project goals (advisory-only, real-bugs-not-nits, easy-to-enable,
  provider-agnostic, portable), architecture map, build/test/release, and
  maintenance rules — incl. "keep README + examples/ current with any env/flag/
  provider/trigger change" and the advisory-only invariant.
- examples/: local-ollama.yml, openai-compatible.yml, endpoint-aliases.yml +
  an examples/README index; README setup step points at them.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 19:06:08 -04:00
Steve Dudenhoeffer bd76aa8286 feat: env-defined endpoint aliases (http-capable, local Ollama friendly)
Build & push image / build-and-push (push) Successful in 9s
majordomo's built-in LLM_* env DSNs are HTTPS-only (DSN.BaseURL forces https),
so they can't express a plaintext local Ollama. Add Gadfly-native env families
that register named providers/aliases with majordomo before resolution:

  GADFLY_ENDPOINT_<NAME>="<provider>|<base-url>[|<key>]"  # base URL verbatim (http ok)
  GADFLY_ALIAS_<NAME>="<majordomo spec>"                  # plain alias / failover chain

Then reference them as "<name>/<model>" (or the bare alias) in GADFLY_MODEL(S).
<NAME> lowercases to the registry name, matching majordomo's LLM_* convention.
LLM_* DSNs still work (and are documented) for HTTPS endpoints. + unit tests,
README "Endpoint aliases via env vars", stub example.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 19:01:07 -04:00
Steve Dudenhoeffer d9405f4f69 feat: multi-provider model support via majordomo (local Ollama, OpenAI-compatible, etc.)
Build & push image / build-and-push (push) Successful in 18s
Replace the hardcoded ollama.Cloud binding with majordomo's provider registry,
so Gadfly can target any backend majordomo supports without code changes.

- cmd/gadfly/model.go: resolveModel() — GADFLY_PROVIDER (default ollama-cloud)
  prefixes bare model ids; GADFLY_MODEL may be a full provider/model spec, alias,
  or failover chain (verbatim). GADFLY_BASE_URL constructs openai/ollama/anthropic/
  google directly at a custom endpoint (OpenAI-compatible + local/remote Ollama).
  GADFLY_API_KEY else the provider's standard env var. + buildSpec unit tests.
- run.sh: provider-aware key gate (local Ollama needs none); maps OLLAMA_CLOUD_API_KEY
  -> OLLAMA_API_KEY; provider/base-url/key inherited by the binary. Gadfly-branded comment.
- entrypoint.sh: GADFLY_MODELS alias for OLLAMA_REVIEW_MODELS; provider passthrough.
- examples + README: Models & providers section. Upfront: only the Ollama paths
  (local + OpenAI-compatible-against-Ollama) are tested; OpenAI/Anthropic/Google
  are wired via majordomo but UNTESTED (no spend).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 18:58:00 -04:00
Steve Dudenhoeffer c0d0152a34 Gadfly: agentic adversarial PR reviewer (initial extraction)
Standalone, Docker-packaged extraction of the agentic PR reviewer that runs in
Gitea Actions: reads the checked-out repo with read-only tools (read_file/grep/
find_files/get_diff), verifies findings before reporting, two-pass review +
adversarial recheck, posts one labeled comment per model. Advisory only.

- cmd/gadfly: reviewer binary (majordomo + Ollama Cloud), zero deps beyond stdlib + majordomo
- entrypoint.sh: container brains — trigger gating, PR clone, model loop (logic out of YAML)
- Dockerfile: multi-stage; build-time module token never reaches the final image
- .gitea/workflows/build-image.yml: tag v* → build & push image
- examples/: ~15-line consumer stub
- system prompt genericized + hardened to re-derive constants/formulas (semantic bugs)

Vibe-coded with Claude Code; see README disclosure. Advisory, never blocks merge.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 18:42:20 -04:00