feat: re-platform agentic review onto executus + large-PR cost controls #20

Merged
steve merged 2 commits from feat/executus-replatform into main 2026-06-30 15:41:03 +00:00

2 Commits

Author SHA1 Message Date
steve f5ca813af3 fix: address gadfly swarm review of the executus re-platform
Build & push image / build-and-push (pull_request) Successful in 8s
Folds in the real findings the dogfood swarm raised on PR #20 (21 graded
real, 1 false positive):

- tools.go: anchor get_diff `path` matching to whole path tokens (foo.go no
  longer pulls in barfoo.go; a trailing "/" still scopes a directory); split
  the diff once + cache it; drop the spurious trailing blank line; fix the
  "truncated after line N" off-by-one wording. Share one tool-list source
  (allTools) between toolbox() and the executus registry.
- executus.go: drop the dead Config.Defaults caps (per-run RunnableAgent always
  overrides them); shared envBool/reviewTimeout helpers; resolveContextTokens
  logs a failed lookup and uses a 5s timeout (was 15s); note the budget guard
  is pass-granular (the wall-clock backstop covers mid-pass).
- main.go/recheck.go: shared envBool; fix package-doc drift (the removed
  finalization fallback, the paginated get_diff).
- entrypoint.sh/run.sh: export GADFLY_MAX_DIFF_CHARS directly (run.sh prefers
  it); guard the watchdog's delayed SIGKILL on a .disarmed marker so it can't
  catch the consolidation pass.
- tests: anchoring test; corrected obsolete env var + truncation-wording asserts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 11:32:23 -04:00
steve b860119332 feat: re-platform agentic review onto executus + large-PR cost controls
Build & push image / build-and-push (pull_request) Successful in 11s
Adversarial Review (Gadfly) / review (pull_request) Successful in 25m56s
Fixes the large-PR token burn: a ~250K-token diff was re-sent every agent
step across models × lenses × passes, draining a metered usage block in
minutes. Small PRs are untouched (every mitigation is size-gated / no-op
under threshold).

- Re-platform the in-process review path onto executus run.Executor: context
  compaction (executus/compact, threshold from the model's real context window
  via executus/model), run-bounding, a per-PR budget gate (Ports.Budget), and
  the wrap-up nudge re-expressed as a run.Critic. Lens fan-out now uses
  executus/fanout. gadfly keeps its own model.go, so GADFLY_ENDPOINT_<NAME>
  aliases and the claude-code engine are unaffected. No majordomo bump; the
  binary stays static (executus core is majordomo+stdlib only).
- Paginate get_diff (per-file `path` + start_line/limit) instead of dumping the
  whole diff; trim the recheck diff embed (60k -> 20k chars).
- entrypoint.sh: downshift the fleet above GADFLY_HUGE_DIFF_BYTES (one cheap
  model, fewer lenses/steps, no recheck) + a swarm-wide GADFLY_PR_BUDGET_SECS
  wall-clock backstop (adds procps for pkill). All advisory; CI never fails.
- README + CLAUDE.md + tests updated.

Note: run.Result exposes no transcript, so the old transcript-based forced-
finalization fallback is dropped; the wrap-up critic nudge is the remaining
"always emit something" mechanism.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-30 10:43:34 -04:00