feat: re-platform agentic review onto executus + large-PR cost controls (#20)
Build & push image / build-and-push (push) Successful in 33s
Build & push image / build-and-push (push) Successful in 33s
Makes gadfly a consumer of executus (run.Executor compaction/bounding/budget/critic + fanout) and fixes the large-PR token burn in size-gated layers: paginated get_diff, downshift above GADFLY_HUGE_DIFF_BYTES, and a swarm-wide GADFLY_PR_BUDGET_SECS backstop. Small PRs untouched; advisory-only and the static binary preserved. Dogfood swarm reviewed it (6 models, 21 real findings graded + folded in). Co-authored-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com> Co-committed-by: Steve Dudenhoeffer <steve@stevedudenhoeffer.com>
This commit was merged in pull request #20.
This commit is contained in:
@@ -22,15 +22,21 @@ verifies each one against the actual code, and posts its findings as a comment.
|
||||
4. **Provider-agnostic.** Powered by [majordomo](https://gitea.stevedudenhoeffer.com/steve/majordomo),
|
||||
so it can target Ollama (local/cloud), OpenAI, Anthropic, Google, or any
|
||||
OpenAI/Ollama-compatible endpoint. Don't re-hardcode a single provider.
|
||||
5. **Portable & self-contained.** `cmd/gadfly` depends only on the Go stdlib + majordomo. Keep
|
||||
it that way — no heavyweight deps, no coupling to any one consumer repo (e.g. mort).
|
||||
5. **Portable & self-contained.** `cmd/gadfly` depends only on the Go stdlib, majordomo, and
|
||||
[executus](https://gitea.stevedudenhoeffer.com/steve/executus) (whose *core* — `run`/`compact`/
|
||||
`model`/`fanout`/`tool` — is itself majordomo+stdlib only, so the binary stays static; do NOT
|
||||
pull executus's `contrib/store` or any battery that drags in a DB driver). No heavyweight deps,
|
||||
no coupling to any one consumer repo (e.g. mort). Gadfly is executus's canonical *light* consumer.
|
||||
|
||||
## Architecture
|
||||
|
||||
```
|
||||
cmd/gadfly/ the reviewer binary — pure producer of review markdown (stdout)
|
||||
main.go orchestration: loop specialists, each a review pass + adversarial recheck
|
||||
engine.go reviewEngine abstraction: majordomo agent loop vs claude-code CLI shell-out
|
||||
main.go orchestration: fan specialists out (executus/fanout), each a review pass + recheck
|
||||
engine.go reviewEngine abstraction: executus run.Executor (majordomo agent loop +
|
||||
compaction/bounding/budget/critic) vs claude-code CLI shell-out
|
||||
executus.go executus wiring: tool.Registry over the repo tools, the run.Executor build
|
||||
(compact + model context-limit threshold + per-PR budget + wrap-up critic)
|
||||
specialists.go specialist lenses: built-ins, default suite, env + .gadfly.yml resolution
|
||||
auto.go dynamic `auto` selection: a selector model picks lenses per-diff (may invent)
|
||||
delegate.go worker-tier delegate_investigation tool (cheap sub-agent does legwork)
|
||||
@@ -69,7 +75,7 @@ verdict. Verdict is one of: `No material issues found` / `Minor issues` / `Block
|
||||
## Build / test
|
||||
|
||||
```sh
|
||||
go build ./cmd/gadfly # needs read access to the private majordomo module
|
||||
go build ./cmd/gadfly # needs read access to the private majordomo + executus modules
|
||||
go test ./...
|
||||
gofmt -l cmd/ # must be clean
|
||||
docker build -t gadfly:dev --secret id=REGISTRY_USER,env=REGISTRY_USER --secret id=REGISTRY_PASSWORD,env=REGISTRY_PASSWORD .
|
||||
@@ -149,3 +155,21 @@ are actually exercised. OpenAI/Anthropic/Google come from majordomo's abstractio
|
||||
parallel, `cap` (from `GADFLY_PROVIDER_CONCURRENCY` else `GADFLY_CONCURRENCY`, default 1) bounds
|
||||
models-at-once within a lane. The review timeout (`GADFLY_TIMEOUT_SECS`) is **per-lens**, not
|
||||
shared across the suite — a slow model can't starve later lenses (the original timeout bug).
|
||||
- **Large-PR token burn**: the agent loop re-sends the whole transcript every step, so a giant
|
||||
diff (the old `get_diff` dumped it untruncated, and it was embedded in both the review and
|
||||
recheck task) was re-transmitted ~steps × lenses × passes × models times — a ~250 K-token PR
|
||||
could drain a metered usage block in minutes. Fixed in three size-gated layers (small PRs
|
||||
untouched): paginated `get_diff` + `executus/compact` compaction in the binary; an
|
||||
`entrypoint.sh` downshift above `GADFLY_HUGE_DIFF_BYTES` (one cheap model, fewer lenses/steps,
|
||||
no recheck); and a swarm-wide `GADFLY_PR_BUDGET_SECS` wall-clock backstop. Compaction's threshold
|
||||
is intentionally LOW (`GADFLY_COMPACT_RATIO` 0.45, not executus's 0.7) because the burning
|
||||
transcript on the embedded path rarely reaches 0.7×context.
|
||||
- **executus re-platform**: the in-process review path runs through `executus/run`'s `run.Executor`
|
||||
(compaction, run-bounding, `Ports.Budget`, the wrap-up nudge as `Ports.Critic`), wiring it in
|
||||
`cmd/gadfly/executus.go`. Gadfly KEEPS its own `model.go` resolution (so `GADFLY_ENDPOINT_<NAME>`
|
||||
http aliases + the claude-code engine survive) and only hands `run.Executor` the already-resolved
|
||||
model via a trivial resolver — do NOT route review-model resolution through
|
||||
`model.ParseModelForContext` (it bypasses gadfly's endpoint aliases). `run.Result` exposes no
|
||||
transcript, so the old transcript-based forced-finalization fallback is gone; the wrap-up critic
|
||||
nudge is the remaining "always emit something" mechanism. The claude-code engine still shells out
|
||||
and is unaffected.
|
||||
|
||||
Reference in New Issue
Block a user