Commit Graph

2 Commits

Author SHA1 Message Date
steve eea84e6e2c fix: address verified gadfly P4c findings (3-cloud fleet)
executus CI / test (pull_request) Successful in 1m39s
critic (all 3 models — HIGH):
- ExtendOnce was a single global one-shot shared across every run a System
  monitors, so only the FIRST run to stall got its extension and all others
  were killed by the backstop. Key the fired-state per run (RunInfo.RunID).
- Kill is now sticky: a `killed` flag short-circuits later ticks so a wavering
  Escalator returning ExtendBy after a Kill can't un-collapse the deadline; a
  Kill paired with Nudge/ExtendBy ignores the latter.
- watch() recovers panics from a misbehaving Escalator (logs; the run falls
  back to its existing deadline) instead of silently killing the watch goroutine.

checkpoint (deepseek — HIGH): handle.Save advanced the throttle clock BEFORE
the store write, so a failed save was silently throttled away (caller believes
it persisted). Advance lastSave only after a successful persist.

schedule (all 3): compute Next BEFORE Run — a permanently-unparseable cron now
skips the job entirely instead of re-running it every tick forever; nil required
callbacks return a validate() error instead of a first-tick nil panic; Loop
recovers tick panics; the Mark-failure => possible-re-run trade-off is documented
(Run must be idempotent). + tests for each.

Triaged-but-kept: critic backstopMul<=1 floor (it's a total-runtime multiple, so
a floor >1 is intentional, not the reported footgun); checkpoint Load (nil,nil)
on miss (documented convention).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 23:32:27 -04:00
steve e5cab5525e P4: critic battery — two-tier timeout watchdog + Escalator seam
executus CI / test (pull_request) Successful in 1m35s
Adversarial Review (Gadfly) / review (pull_request) Successful in 9m21s
The last Tier-2 battery, plugging into run.Ports.Critic (executor call site is a
P2 follow-up). Clean split of concerns:
- executus owns the deterministic MECHANICS: System.Monitor returns a
  run.CriticHandle that tracks activity (RecordStep/RecordToolStart), and a
  watcher goroutine fires once per idle period a run crosses its soft timeout,
  applies the decision (queue Steer nudges / extend the Deadline / collapse it
  to now on Kill), and enforces a hard-kill backstop (softTimeout * mul).
- the POLICY is the Escalator seam (nudge/extend/kill/escalate). Mort plugs its
  LLM critic-agent in here; ExtendOnce is the zero-dependency default (extend
  once, then let the backstop kill a truly hung run).

Race-tested: escalate-once-per-idle-period with re-arm on fresh activity, Kill
collapses the deadline, ExtendOnce fires once, zero soft-timeout => nil handle.
Core imports ZERO from critic.

This completes the P4 battery set: audit, budget, persona, skill, checkpoint,
schedule, critic — each nil-safe, each with a default, each core-import-clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 23:04:29 -04:00