feat(run): critic owns the deadline — MaxRuntime becomes the soft trigger #21
Reference in New Issue
Block a user
Delete Branch "feat/critic-owns-deadline"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Problem
The executus migration ported mort's run-critic faithfully, but the kernel defeats it.
run/executor.goappliedcontext.WithTimeout(ctx, MaxRuntime)unconditionally — even forcritic_enabledagents — and started the critic on a context already hard-capped atMaxRuntime. So a critic-enabled agent (mort'sgeneral,MaxRuntime=30m) was hard-killed at exactly 30 min; the critic's extendable backstop (MaxRuntime × multiplier, clamped to 6h) was structurally unreachable. The soft trigger also used a globalDefaults.CriticSoftTimeout(90s) instead of the agent'sMaxRuntime.Live symptom in mort: a
general@mention that delegated to the long-runninganimateskill (a legit 30-min render) was killed atMaxRuntime(agent: step 3: context deadline exceeded), cascading to kill the in-flight render.Fix — restore the old agentexec two-tier semantics
run/executor.go: theWithTimeout(MaxRuntime)is now conditional oncriticOwnsDeadline = Ports.Critic != nil && ra.Critic.Enabled.MaxRuntimestays a literal hard kill (→timeout). Unchanged.MaxRuntimebecomes the SOFT trigger; the run is wrapped in a generousWithTimeoutat the newDefaults.CriticAbsoluteMax(default 6h) as a failsafe ceiling only — it never fires before the critic's backstop, and guarantees a broken/nil host handle can't run unbounded. The critic's extendable deadline (watched instartCritic, cancels viacancelCause) is the real deadline.run/critic.go:startCritictakes the resolvedMaxRuntimeas the soft trigger (falls back toDefaults.CriticSoftTimeout, then 90s).Defaults.CriticAbsoluteMaxadded (withFallbacks default 6h).Tests (
run/critic_deadline_test.go)TestNoCritic_MaxRuntimeIsHardCap— legacy contract preserved: a non-critic run whose work (200ms tool) outlastsMaxRuntime(20ms) ends in error.TestCriticOwnsDeadline_SurvivesPastMaxRuntime— the fix: an identical critic-owned run completes.TestCriticSoftTriggerIsMaxRuntime— the soft trigger handed toMonitoris the agent'sMaxRuntime, notDefaults.CriticSoftTimeout.Full
go test ./...green. The mort host adapter (executushost/critic_adapter.go) already clamps the per-run backstop to its convar-drivenAbsoluteMax, so the kernel ceiling is a redundant failsafe.🤖 Generated with Claude Code
🪰 Gadfly — live review status
6/6 reviewers finished · updated 2026-06-30 15:49:58Z
claude-code/opus· claude-code — ✅ doneclaude-code/opus:max· claude-code — ✅ donedeepseek-v4-pro:cloud· ollama-cloud — ✅ doneglm-5.2:cloud· ollama-cloud — ✅ donekimi-k2.6:cloud· ollama-cloud — ✅ doneqwen3.5:397b-cloud· ollama-cloud — ✅ doneLive status board. Findings are posted in each model's own comment. Advisory only — does not block merge.
🪰 Gadfly review — consensus across 6 models
Verdict: Minor issues · 5 findings (0 with multi-model agreement)
5 single-model findings (lower confidence)
run/executor.go:315run/critic.go:44run/executor.go:336run/critic.go:56run/executor.go:37Per-model detail
claude-code/opus (claude-code) — Minor issues
Verdict: Minor issues — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
VERDICT: No material issues found
I reviewed this change strictly through the Security lens (authz, injection, SSRF, deserialization, secret leakage, input validation, untrusted-data handling, insecure defaults, resource/DoS bounding). This is a timeout-semantics change in
run/executor.go+run/critic.go; it touches no parsing, no authz, no external I/O, and no secret handling. The one security-relevant axis is resource bounding / availability, which I traced carefully:Critic-owned runs no longer hard-capped at
MaxRuntime. Verifiedexecutor.go:304-340: for a critic-owned run the hard ceiling becomesDefaults.CriticAbsoluteMax(default 24h,executor.go:65-67). The effective bound is normally the critic's extendable backstop, enforced by the deadline-watch incritic.go:62-90. I confirmed the shipped default handle (critic/critic.go:143) always initializesdeadline = now + softTimeout × backstopMul— i.e. never zero — so a healthy supervised run is always bounded well below the 24h ceiling. The 24h ceiling is a failsafe, not the normal bound.Unsupervised failsafe.
executor.go:336-340re-tightens toMaxRuntimewhen the hostMonitorreturns a nil handle, preventing an enabled-but-unarmed run from holding a worker slot up to 24h. Verified thecriticOwns && critic == nilpredicate matchesstartCritic's nil-handle return (critic.go:51-61), so the two can't drift (sharedcriticOwnsDeadline). The testTestCriticOwnsDeadline_NilHandleFallsBackToMaxRuntimeexercises exactly this.Caller cancellation still propagates.
executor.go:319re-merges the caller ctx viaMergeCancellationafter theWithoutCanceldetach, so a cancelled/disconnected caller still tears the run down — no detach regression that would let an abandoned run linger.Insecure-default check on the new constant. 24h
CriticAbsoluteMaxis a deliberately-loose failsafe ceiling, not a per-run budget; the host (mort) clamps its own backstop to 6h and the default battery's backstop is far smaller. I re-derived that the ceiling only bites for a misbehaving custom host handle that returns a permanently zeroDeadline()— which contradicts the shipped default's behavior and is explicitly called out as defence-in-depth. Not a material insecure default.One non-blocking observation (defense-in-depth only, not a finding): the unsupervised failsafe keys on
critic == nil, so it does not cover a custom host handle that arms but leavesDeadline()zero forever — such a run would be bound only by the 24h ceiling. The shipped default never produces that state, so this is hypothetical and arguably a correctness/robustness concern rather than security; I'm not raising it as a finding.Nothing in my lane is materially wrong.
🎯 Correctness — Minor issues
I've verified the finding against the actual source. Every cited line checks out:
critic/critic.go:262— the default battery'stickgates escalation onidle := now - lastActivity; if idle < h.softTimeout { skip }. Confirmed it is idleness-based, not elapsed-runtime-based.critic/critic.go:143—deadline = now.Add(softTimeout × backstopMul), confirmingsoftTimeoutdoubles as the backstop basis.run/critic.go:43-45— the doc comment does assert "the critic first reviews once the run exceeds its nominal budget," and the sole caller passesmaxRuntimeassoftTrigger, so the single value now feeds both the idle-stall window and the backstop. Confirmed.CriticSoftTimeout(was default 90s, doc "idle window before the critic wakes") is confirmed from the diff; the idle window is now forced to equalMaxRuntime.The semantics/documentation concern is real and confirmable, so it survives.
VERDICT: Minor issues
The two-tier rewrite in
run/executor.gois logically sound and I could not find a hard logic bug. The context layering is correct: in the critic-owned path the run is bounded byCriticAbsoluteMax(runaway guard) + the critic'scancelCause-driven deadline-watch; in the unsupervised-handle path it's correctly tightened back tomaxRuntime. The removal ofCriticSoftTimeoutis clean — nothing fails to compile.One substantive correctness/semantics concern:
run/critic.go:43-45(and the newexecutor.gocall passingmaxRuntimeassoftTrigger) — the soft-trigger value is overloaded, and the comment misdescribes the shipped default's behavior. The doc says "the critic first reviews once the run exceeds its nominal budget." But the shipped defaultcriticbattery does not review on elapsed-runtime; it escalates on idleness:critic/critic.go:262gates onidle := now - lastActivity; if idle < h.softTimeout { skip }. So the same value now plays two roles for the default battery: (a) the stall-detection idle window, and (b) the backstop basis (deadline = now + softTimeout × backstopMul,critic.go:143). Before this PR the idle window was the independentCriticSoftTimeout(default 90s, doc: "idle window before the critic wakes"); this PR forces it to equalMaxRuntime. Concrete effect for a default-battery host with a 30-min agent: a genuinely hung run is no longer flagged after ~90s of inactivity but only after a full 30 minutes of inactivity — the watchdog's primary stall-detection purpose is materially weakened, and a host can no longer express "review after 90s idle but allow up toMaxRuntimetotal" because the knob that separated those two meanings was deleted. This may be a deliberate trade-off (it also prevents false-killing a legit long single tool call, which is the bug the PR targets), but the comment asserting "reviews once the run exceeds its nominal budget" is inaccurate for the default battery. Suggest either keeping a distinct idle-window knob or correcting the comment to describe the actual idle-based semantics. Verified by readingcritic/critic.go:125-149(Monitor) and:251-268(tick/idle gate).Non-blocking note (not a code defect): the PR description says
CriticAbsoluteMaxdefaults to 6h, but the code defaults to 24h (executor.gowithFallbacks, and the struct comment consistently says 24h). The code is internally consistent; only the PR prose is wrong.🧹 Code cleanliness & maintainability — No material issues found
I've now verified everything. The commit message is decisive on the draft's only finding.
The draft's single finding rests on "concrete evidence this already drifted: the PR description repeatedly says the default is 6h, while the code... say 24h." That evidence is false. The commit body shows the PR deliberately changed
CriticAbsoluteMaxfrom 6h to 24h ("CriticAbsoluteMax was 6h... Now CriticAbsoluteMax is a 24h RUNAWAY guard"), precisely because 6h was too low. The 6h that remains in the comments is mort's host convar (mort's 6h convar), a deliberately distinct value from the 24h executus default — the comments correctly distinguish the two. There is no drift; the PR documents an intentional value change.Stripped of that false evidence, the finding reduces to "the constant 24h is also mentioned in a third comment" (lines 37, 66, 294) — a trivial, speculative doc nit the draft itself concedes is partly house pattern. It cannot be confirmed as a material issue, so it drops. The draft's positive prose (rename complete — no
CriticSoftTimeoutrefs remain; single caller atexecutor.go:327passingmaxRuntime;maxRuntime>0guaranteed bywithFallbacks) all checks out, but those are not findings.With the only finding dropped, the verdict resets.
VERDICT: No material issues found
This PR is small, focused, and well-documented. The
CriticSoftTimeout→CriticAbsoluteMaxrename is complete (no stale references remain in any.go,README.md,CLAUDE.md, orexamples/).startCritichas a single caller (run/executor.go:327) that passes the resolvedmaxRuntime, whichwithFallbacks(executor.go:50) guarantees is>0. ThecriticOwnsDeadlineextraction (critic.go:29) is a clean dedup shared byRunandstartCritic. The new test helpers are well-factored and thecapturingCriticinterface-typed handle correctly exercises thenil-handle path.The "default 24h vs 6h" concern in the draft does not hold: the comments deliberately distinguish the executus default ceiling (24h) from mort's host backstop clamp (6h), and the PR intentionally moved the default from 6h to 24h — there is no drift.
⚡ Performance — No material issues found
I have enough to assess the performance lens. Let me confirm one thing about the deadline-watch cadence relative to the new much-longer-lived runs.
The relevant facts I verified by reading the code:
run/critic.go:14—criticDeadlineCheck = time.Second; the watch goroutine (startCritic, lines 63–90) ticks once/sec doing a cheaph.Deadline()compare. This is pre-existing and unchanged by the PR.run/executor.go:315— the newcontext.WithTimeout(..., hardCap)and the failsafe re-wrap at:338are O(1) per run, with their timers stopped viadefer cancelTimeout()/defer cancelUnsupervised(). No timer/goroutine leak.VERDICT: No material issues found
Reviewing strictly through the ⚡ Performance lens:
criticOwnsDeadline(a two-field predicate), one extracontext.WithTimeoutathardCap, and the conditional unsupervised re-wrap — executes once perRun, not inside the agent loop or any tool/step hot path. No new per-iteration work, allocation, or copy instepObserver(executor.go:353).CriticAbsoluteMaxWithTimeoutdoes not create a lingering timer: Go schedules it viatime.AfterFuncanddefer cancelTimeout()stops it whenRunreturns, so a fast-completing run pays nothing measurable. Same for the failsafe'scancelUnsupervised.general, etc.), so the fact that such runs may now legitimately live to the backstop (minutes/hours) instead of being killed atMaxRuntimeadds no per-tick cost beyond an already-trivialDeadline()comparison. Not a regression I can justify flagging.maxRuntimefailsafe when unsupervised), so there's no unbounded-resource-hold concern within this lens.Nothing in the performance lane is materially wrong here.
🧯 Error handling & edge cases — No material issues found
VERDICT: No material issues found
I reviewed this PR strictly through the error-handling & edge-cases lens, verifying the diff against the checked-out code rather than the hunks alone.
What I verified:
startCriticreturns(nil, noop)both when!criticOwnsDeadlineand whenMonitorreturns a nil handle (run/critic.go:51,:59). The re-wrap guard atrun/executor.go:336uses the samecriticOwnsDeadlinepredicate, so the "critic enabled but no handle armed" case correctly tightens the ceiling back from 24h tomaxRuntime.TestCriticOwnsDeadline_NilHandleFallsBackToMaxRuntimeexercises exactly this (nil interface handle → falls back). No drift between the two decision sites.MaxRuntime > CriticAbsoluteMaxis guarded (run/executor.go:311-313):hardCapis clamped up tomaxRuntime, so the ceiling can never be shorter than the nominal budget. No off-by-direction bug.softTriggerhas the 90s floor (run/critic.go:55);withFallbacksguaranteesmaxRuntime > 0(:50-52) so it's defensive-only, as commented — not dead-but-wrong.Deadline()/KillCause()panic on its own goroutine (run/critic.go:67); a panic in the deferredstopCritic→h.Stop()unwinds into the top-levelrecover()atexecutor.go:139(registered first, runs last), so it still becomesResult.Errrather than crashing.cancelTimeout,cancelCause,mergeCancel,cancelUnsupervised) isdefer-cancelled.checkpointCauseis correctly set after the unsupervised re-wrap (executor.go:344) so it reads the context the loop actually runs on.CriticSoftTimeoutfield has zero remaining references in.gofiles; the onlystartCriticcall site was updated to the new 5-arg signature. The diff compiles cleanly across files.Non-blocking observation (not a finding): the unsupervised-failsafe only tightens the ceiling when the handle is
nil. A handle that is non-nil but never advancesDeadline()(a "live but lazy/broken" host critic) is bounded only byCriticAbsoluteMax(kernel default 24h). This is the documented runaway-guard behavior and is exercised byTestCriticOwnsDeadline_SurvivesPastMaxRuntime, so it's intentional, not a bug — I note it only because the asymmetry (nil →maxRuntime; live-but-zero-deadline → 24h) is a real edge case a host should be aware of.Outside my lens: the PR description says
CriticAbsoluteMaxdefaults to 6h, but the code defaults to 24h (executor.go:66) — a description/doc mismatch worth a correctness/docs pass.claude-code/opus:max (claude-code) — Minor issues
Verdict: Minor issues — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
This change is purely about timeout/deadline ownership semantics for critic-owned runs. I've verified the key paths (
run/critic.go:29-95,run/executor.go:304-344).Through the security lens specifically:
critic.recordToolStart(call.Name, string(call.Arguments))(executor.go:371) and theKillCause().Error()string wrapped intoErrCriticKill(critic.go:82) — both are forwarded to host-side handles / error chains, not into a shell, SQL, path, or template. Pre-existing behavior, unchanged by this diff.MaxRuntimehard cap with aCriticAbsoluteMax(default 24h) ceiling for critic-owned runs. This loosens the worst-case resource-holding bound, but it is deliberate and still bounded: the nil-handle failsafe (executor.go:336-340) re-tightens toMaxRuntimewhen no supervisor arms, the ceiling is never shorter thanMaxRuntime, and a healthy host clamps its own backstop well below the ceiling. This is an availability/DoS-resilience tuning question (24h is a long slot to hold), but it does not constitute a security regression in my lane — it's a resource-management/correctness concern that the correctness/availability lens owns.Nothing in the security lens is materially wrong.
VERDICT: No material issues found
🎯 Correctness — No material issues found
I've verified the full two-tier timeout flow against the actual code (
run/executor.go,run/critic.go, the critic battery incritic/critic.go, and theCriticport +CriticConfigcontracts). Here is my correctness review.VERDICT: No material issues found
The two-tier timeout logic is correct and, importantly, consistent with the documented design intent — which is what makes this a genuine fix rather than a behavior change.
What I checked and confirmed sound:
criticOwnsDeadlinegating is consistent across both call sites.executor.go:304(choosinghardCap) andcritic.go:51(the arm gate) use the same predicate, so the "did we widen the cap?" decision and the "did we arm the watch?" decision can't disagree. (Read both.)criticOwns && critic == nil(executor.go:336) is reachable only whenMonitorreturned a nil handle (the other nil-return path instartCriticrequires!criticOwns). Re-wrappingrunCtxatmaxRuntimecorrectly tightens an armed-but-unsupervised run back to the nominal budget. ThecheckpointCauseclosure is set after this re-wrap (executor.go:344), so it reads the context the loop actually runs on. Verified against the testTestCriticOwnsDeadline_NilHandleFallsBackToMaxRuntime.hardCap < maxRuntimeclamp (executor.go:311) correctly handles the degenerateMaxRuntime > CriticAbsoluteMaxconfig.MaxRuntimeis the correct value, not a regression. I was suspicious that flippingsoftfrom the dedicatedCriticSoftTimeout(90s) toMaxRuntimewould mis-drive the battery, sincecritic/critic.gousessoftTimeoutfor three things: the backstop base (deadline = now + soft×mul, line 143), the idle-escalation threshold (line 262), and the poll interval (soft/2, line 132). But the pre-existing contract already nails the intended meaning:CriticConfig.BackstopMultiplieris documented (run/agent.go:83) as "a multiple of the soft trigger (MaxRuntime)". So the old code passingCriticSoftTimeoutwas the value that contradicted the design; this PR aligns it. The backstop now correctly resolves toMaxRuntime × multiplier, matching the PR's stated semantics.grepforCriticSoftTimeoutacross the whole module returns zero hits — no dangling references, no compile break from dropping the field.Two non-blocking observations (neither is a code defect):
CriticAbsoluteMaxdefaults to 6h, but the code (executor.go:66) and its own doc comment say 24h. The code is internally consistent and is the correct choice: the runaway ceiling must sit strictly above the host's own backstop clamp (mort's 6h), so 24h > 6h is right and a 6h default would be wrong (it could pre-empt a healthy supervised run). Only the PR prose is stale — worth fixing the description so a future reader doesn't "correct" the code down to 6h.critic.go:44. The comment says the critic "first reviews once the run exceeds its nominal budget," but the battery actually escalates on idle time (idle >= softTimeout), not total elapsed — a progressing run whose activity clock keeps refreshing may never be reviewed before the backstop. This is the battery's unchanged behavior, not something this PR alters; the wording just slightly oversells when a review fires. Cosmetic.I did not find any introduced logic bug in my lens.
🧹 Code cleanliness & maintainability — Minor issues
All claims verified against the code:
run/critic.go:54-57: thesoft <= 0 → 90sbranch confirmed —soft = softTrigger, and the sole caller (executor.go:327) passesmaxRuntime, whichexecutor.go:158-160resolves to eitherra.MaxRuntimeorDefaults.MaxRuntime.NewrunswithFallbacks()(executor.go:104) guaranteeingDefaults.MaxRuntime >= 60s(executor.go:50-51). SosoftTriggeris always> 0and the branch is unreachable. The bare90 * time.Secondliteral replaces the former namedCriticSoftTimeoutfield. Confirmed; only one caller exists.CriticSoftTimeoutreferences remain (grep clean).executor.go:66defaultsCriticAbsoluteMaxto 24h; the field doc (executor.go:38) cites mort's 6h host clamp. Internally consistent, so the "6h" PR-description note holds.The finding survives. Outputting the corrected review.
VERDICT: Minor issues
Through the cleanliness/maintainability lens this is a careful, well-structured change. The new
criticOwnsDeadlinehelper (run/critic.go:29) sensibly factors out the predicate soRunandstartCriticcan't drift, names are clear (hardCap,criticOwns,softTrigger), and the removedCriticSoftTimeoutfield is gone cleanly — no stale references remain in.go,.md, orexamples/(verified via grep). Tests are descriptively named and commented. Only small notes:run/critic.go:54-57— thesoft <= 0 → 90sfallback is now unreachable dead code with a bare magic literal. Verified: the sole caller (run/executor.go:327) always passesmaxRuntime, which is resolved atexecutor.go:158-160fromra.MaxRuntimeore.cfg.Defaults.MaxRuntime, andcfg.Defaults = cfg.Defaults.withFallbacks()(executor.go:104) guaranteesMaxRuntime >= 60s(executor.go:50-51). SosoftTriggeris always> 0and the branch never fires. Previously this 90s lived as a named, fallback-defaulted field (CriticSoftTimeout); collapsing it into an inline literal inside an unreachable branch is a minor regression in maintainability. The code even documents it as dead twice (the doc comment atcritic.go:47and the inline comment atcritic.go:56). Low-churn fix: either drop the branch and assert/document the precondition, or keep it but lift90 * time.Secondto a named const so the magic number isn't bare. Trivial; the current form is harmless and clearly commented.PR-description vs. code drift (not a code defect): the description says
CriticAbsoluteMaxdefaults to "6h", but the code defaults to 24h (executor.go:66, and the field doc atexecutor.go:38and inline comment atexecutor.go:294both say 24h — internally consistent). The "6h" in the description is mort's host clamp, not the kernel default. Worth correcting the PR text so future readers aren't misled; nothing to change in code.No other cleanliness problems: the triple documentation of the two-tier semantics (field doc, the
Runblock comment atexecutor.go:279-303, and thestartCriticdoc) is verbose but each block targets a distinct reader and matches this repo's documented// Why:house style, so I'm not flagging it as a defect.⚡ Performance — No material issues found
I've verified the relevant code. The polling goroutine (
criticDeadlineCheck = time.Second,run/critic.go:14) is pre-existing and untouched by this diff — the soft-trigger change (CriticSoftTimeout→MaxRuntime) doesn't alter its cadence. The unsupervised failsafe adds one trivial extracontext.WithTimeoutlayer. No new allocations in hot loops, no N+1, no quadratic behavior.The one performance-adjacent effect is that critic-owned runs now hold their lane/worker slot up to the extendable backstop (host-clamped 6h) instead of being hard-killed at
MaxRuntime(e.g. 30m) — and the per-run 1s-ticker goroutine therefore lives proportionally longer (~21k ticks over 6h vs ~1.8k over 30m). But (a) that longer hold is the deliberate, documented intent of the change, not an accidental regression; (b) the per-tick cost is a singleDeadline()/KillCause()read — negligible even across many concurrent runs; and (c) the unbounded-hold risk is explicitly bounded by theCriticAbsoluteMaxceiling and the unsupervised-failsafe re-tighten. None of this rises to a material efficiency problem in my lane.VERDICT: No material issues found
🧯 Error handling & edge cases — No material issues found
I've verified the change against the checked-out code. Summary of what I checked through the error handling & edge cases lens:
cfg.Defaultsis normalized at construction (run.NewcallswithFallbacks(), executor.go:104), soCriticAbsoluteMaxis guaranteed>0(24h) before it's ever used as aWithTimeoutduration at executor.go:315. No zero-duration / instantly-expired context risk.maxRuntimeis guaranteed>0(executor.go:158-161 →Defaults.MaxRuntimefloor of 60s), so the soft trigger passed tostartCriticand thehardCap < maxRuntimeclamp (executor.go:311) are well-defined. Thesoft <= 0 → 90sguard instartCritic(critic.go:55) is genuinely just defensive.criticOwnsso it only fires on the nil-handle case (not the no-critic case), and tightens the cap back tomaxRuntime.cancelUnsupervisedis deferred — no context leak. Verified the nil-handle path incapturingCritic.Monitorreturns a nil interface, so the test exercises it.checkpointCausereordering (moved to executor.go:344, after the re-wrap) reads the actual loop context; the top-level defer is nil-safe (checkpointCause != nil, executor.go:144), so the early-return/panic window before assignment is handled.criticBindingmethods are nil-receiver-safe (critic.go:97-139), so thecritic == nilfailsafe path doesn't panic onrecordStep/maxStepsOption/drainSteer.CriticSoftTimeoutfield anywhere in the repo (grep clean); status classification (statusFor) is untouched and still nil-safe.recover()against a hostDeadline()panic (critic.go:67), and exits on bothdoneandrunCtx.Done()— no goroutine leak.The unhappy paths this diff introduces (zero/negative durations, nil critic handle, degenerate
MaxRuntime > AbsoluteMax, cleanup on every exit) are each handled.One pre-existing item I noticed but is outside this diff and outside my lens (correctness): a natural backstop expiry cancels via
cancelCause(context.DeadlineExceeded)(critic.go:84), butstatusForonly consultscontext.CauseforErrCriticKill(executor.go:529) — so a natural backstop expiry likely classifies ascancelled, not thetimeoutthe critic.go:39 comment promises. Not introduced here; flagging only as it brushes the deadline logic.VERDICT: No material issues found (error handling & edge cases lens)
The two-tier timeout change is defensively solid: durations are floored before use, the nil-handle and no-critic paths are both bounded, contexts are cancelled on every exit path, and the new test file exercises the hard-cap, soft-trigger, survival, and nil-handle-fallback edge cases.
deepseek-v4-pro:cloud (ollama-cloud) — No material issues found
Verdict: No material issues found — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
Verdict: No material issues found
I reviewed this change through the security lens: authn/authz, injection, SSRF, unsafe deserialization, secret leakage, input validation of untrusted data, and insecure defaults. The diff is purely about context-timeout management — converting
MaxRuntimefrom a hardWithTimeoutkill to a soft trigger when a critic owns the deadline, with a generousCriticAbsoluteMaxrunaway ceiling as defense-in-depth.maxRuntime(atime.Duration), which is passed to the hostCritic.Monitorinterface — the host is responsible for its own validation.criticOwnsDeadlinepredicate gates the two-tier path onPorts.Critic != nil && ra.Critic.Enabled, both controlled by the host/agent config, not by untrusted user input.CriticAbsoluteMaxdefault is 24h (not 6h as the PR description states — the code comment explains this is intentional: the host clamps to a smaller value, and the kernel ceiling must sit well above it). A 24h runaway ceiling for a broken critic is generous but is a deliberate defense-in-depth choice, not a vulnerability — the host's own clamp (e.g., mort's 6h convar) is the real bound. I verified the code atrun/executor.go:65-66and the comment at lines 32-42.executor.go:336-339): when the critic is enabled butMonitorreturns nil, the run context is re-wrapped with the nominalMaxRuntimetimeout. This correctly prevents an unsupervised run from consuming resources up to the 24h ceiling. I verified thatstartCriticlaunches no goroutine in this path (critic.go:58-60), so no dangling watcher exists on the old context.criticOwnsDeadlinepredicate is called once atexecutor.go:304and again insidestartCriticatcritic.go:51— the same predicate gates both the ceiling choice and the arm/no-op decision, so they cannot drift.No material security issues found.
🎯 Correctness — No material issues found
Verdict: No material issues found
I traced the full context chain, the two-tier timeout logic, the unsupervised failsafe, the
checkpointCausetiming, thestatusForclassification, and the deadline-watch goroutine interaction. The logic is sound:criticOwnsDeadlineis the single predicate used in bothRun(hard cap selection) andstartCritic(arm/no-op gate) — they cannot drift.maxRuntime; critic-owned →max(CriticAbsoluteMax, maxRuntime). The floor guard handles degenerate configs correctly.startCriticsoft trigger: now receives the resolvedmaxRuntime(always > 0 viawithFallbacks), with a 90s defensive floor that is genuinely unreachable from the sole caller.criticOwns && critic == nil,runCtxis re-wrapped withWithTimeout(runCtx, maxRuntime). The child deadline is tighter than the parent'sCriticAbsoluteMax, so it fires first — correct. ThecancelCausefunction (which references the pre-wrap parent) is never called with a cause in this path because the deadline-watch goroutine was never started.checkpointCauseclosure is set after the unsupervised re-wrap, capturing the context the loop actually runs on. No early returns exist between the old and new positions.statusForcorrectly classifies the unsupervised timeout as"timeout"viaerrors.Is(runErr, context.DeadlineExceeded).Defaults.CriticSoftTimeoutremoval is clean — zero remaining references confirmed by grep.One minor documentation discrepancy (not a code bug): the PR description says
CriticAbsoluteMaxdefault is 6h, but the code has 24h. The code is internally consistent (the 24h comment explains it must be "FAR beyond any realistic backstop" since the host clamps to 6h), so this is a PR-description error, not a logic defect.🧹 Code cleanliness & maintainability — No material issues found
Verdict: No material issues found
The change is clean and well-structured through the maintainability lens:
criticOwnsDeadlineis a well-extracted single predicate, used in bothRunandstartCritic, preventing the two decision points from drifting — exactly the right pattern.CriticSoftTimeoutremoval is complete: zero remaining references across the repo (confirmed via grep). No dead config field left behind.Defaultsstruct comment forCriticAbsoluteMaxis thorough and explains the "why" (runaway guard, host clamps lower, never pre-empts a healthy run). TheRunmethod comment repeats some of this, but that's appropriate for a caller-side explanation vs. a field-level contract.checkpointCausereassignment after the unsupervised-failsafe re-wrap is correctly ordered and explicitly commented, so a future reader won't mistakenly move it back above the re-wrap.critic_deadline_test.go) is well-organized: helpers (slowToolInvocation,slowModel,capturingCritic) are scoped to the file, tests are named clearly, and thecapturingCritictype is a purposeful variant of the existingfakeCritic(captures soft trigger vs. ignores it) — not copy-paste.startCriticis explicitly documented as unreachable by the sole caller; it's dead code by construction but the comment makes the intent unambiguous, so it won't confuse a future maintainer.No duplicated logic, no leaky abstractions, no confusing names, no deeply nested or overly long functions introduced by this diff.
⚡ Performance — No material issues found
I've verified the full code in
run/executor.goandrun/critic.go, confirmed there are no other callers ofstartCritic, and confirmedCriticSoftTimeoutis fully removed (no stale references). Let me now assess through the performance lens.The changes are all per-run (not per-step or per-tool):
criticOwnsDeadline: two field reads, called once perRun(). Zero cost.hardCapcomputation: a conditional + comparison, once perRun(). Zero cost.startCriticsignature change: passes a value already computed (maxRuntime). No new work.WithTimeout: one extra context layer, only in the edge case whereMonitorreturns nil. Single allocation, negligible.startCritic: unchanged polling frequency (criticDeadlineCheck = 1s). No regression.Defaults.CriticAbsoluteMax(24h) vs oldCriticSoftTimeout(90s): both are just timer durations in the runtime heap — a 24h timer costs the same as a 90s timer.Verdict: No material issues found
🧯 Error handling & edge cases — No material issues found
Verdict: No material issues found
I reviewed the diff through the 🧯 Error handling & edge cases lens. The change is defensive and well-considered:
startCriticsoft-trigger fallback: Theif soft <= 0 { soft = 90 * time.Second }guard atcritic.go:55is genuinely defensive — the sole caller (executor.go:327) passesmaxRuntime, which is always> 0afterwithFallbacks()(verified atexecutor.go:158-161andexecutor.go:50-52). No reachable zero path.Unsupervised-run failsafe (
executor.go:336-340): WhencriticOwns && critic == nil, the code wrapsrunCtxwith a tighterWithTimeout(maxRuntime). ThecancelCausefunction remains tied to the parentWithCancelCausecontext, but sincecritic == nilmeans no goroutine was launched to callcancelCause, the only cancellation source is the newWithTimeout—context.Causecorrectly returnsDeadlineExceeded. ThecheckpointCauseclosure is set after this re-wrap (executor.go:344), so it reads the final context. Verified by tracing the full context chain atexecutor.go:315-344.criticOwnsDeadlinepredicate (critic.go:29-31): Single source of truth used by both the hard-cap decision and the arm/no-op gate — they cannot drift. The nil-check one.cfg.Ports.Criticis correct (interface nil check).Nil-handle path in
startCritic(critic.go:58-60):Monitorreturning nil is handled cleanly — returnsnil, noop, no goroutine launched, and the caller's unsupervised failsafe catches it.Goroutine panic recovery (
critic.go:67): Therecover()in the deadline-watch goroutine is pre-existing, not introduced by this diff. Silent swallow is intentional per the comment.Deferred cleanup ordering:
cancelUnsupervised→stopCritic→mergeCancel→cancelCause(nil)→cancelTimeout— each cancels its own resource in correct reverse order. The conditionaldefer cancelUnsupervised()inside theifblock is correct Go.hardCapfloor (executor.go:311-313): Theif hardCap < maxRuntime { hardCap = maxRuntime }guard handles the degenerate config case cleanly.No ignored errors, missing cleanup, nil-pointer risks, or unhandled edge cases found in the new code paths.
glm-5.2:cloud (ollama-cloud) — No material issues found
Verdict: No material issues found — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
VERDICT: No material issues found
Reviewed through the 🔒 Security lens. This is timeout-plumbing: no new untrusted-input paths, no authn/authz surfaces, no injection/SSRF/deserialization/secret-handling changes.
criticOwnsDeadline(run/critic.go:29) gates only on operator-controlled config (Ports.Critic != nil && ra.Critic.Enabled); thesoftTrigger/maxRuntimevalues are agent/host config, not user-tainted.Security-relevant behavior I verified:
Monitorreturns a nil handle, the base context is re-tightened from the 24h runaway ceiling back toMaxRuntimeviacontext.WithTimeout(runCtx, maxRuntime). Verifiedcritic == nilis the exact nil-handle return path fromstartCritic(run/critic.go:59-61). This closes the obvious "critic enabled but never armed → run free for 24h" hole. Good defense-in-depth.Deadline()always returns zero, theWithTimeout(..., CriticAbsoluteMax)at executor.go:315 still bounds the run. So there is no unbounded-execution path.Minor (non-blocking) note, not a security vulnerability:
CriticAbsoluteMaxdefaults to 6h, but the code (run/executor.go:65-67) and its own comment default to 24h. From a threat-model standpoint an operator relying on the description would underestimate the worst-case runaway window (resource exhaustion / cost exposure for a broken-critic run) by 4×. The code is self-consistent; only the description is wrong. Suggest correcting the description to 24h so operators size the failsafe accurately. (Verified:withFallbackssets24 * time.Hour.)No findings rise to a security defect; the timeout layering is bounded at every path.
🎯 Correctness — No material issues found
VerDICT: No material issues found
I traced the two-tier timeout logic end-to-end against the actual checked-out code and verified the context-layering semantics are correct.
Verified findings (all confirmed by reading the code):
run/executor.go:304-315):criticOwnsis false →hardCap = maxRuntime→WithTimeout(..., maxRuntime). Legacy hard-cap preserved. ✓run/critic.go:49-95,executor.go:317-327): base context getsCriticAbsoluteMax(24h) ceiling; critic goroutine callscancelCause(ErrCriticKill|DeadlineExceeded)on theWithCancelCauselayer. Cause propagates: critic-kill →Cause(runCtx)=ErrCriticKill→ "killed"; backstop expiry →DeadlineExceeded→ "timeout"; caller cancel flows throughMergeCancellation. All three classification paths correct. ✓executor.go:336-340):critic == nil→ no deadline-watch goroutine was started (startCriticreturnsnil, noop), so theWithTimeout(runCtx, maxRuntime)re-wrap is the only sub-24h deadline. It fires at ~maxRuntimefrom now,Cause=DeadlineExceeded→ "timeout".checkpointCauseis set after the re-wrap (line 344) so it reads the context the loop runs on. Correct. ✓cancelUnsupervised(innermost) runs beforestopCritic(noop here) and the outer cancels — no leak. ✓CriticSoftTimeoutfully removed: grep acrossrun/shows zero remaining references; thewithFallbacksnow seedsCriticAbsoluteMaxinstead. Clean. ✓criticOwnsDeadlinesingle predicate used by bothRun(ceiling choice) andstartCritic(arm/no-op gate) — they cannot drift. ✓One trivial, non-blocking note (doc vs code, not a runtime bug):
run/executor.go:66sets theCriticAbsoluteMaxdefault to 24h, and the code comment at line 294/308 says "default 24h", but the PR description repeatedly claims 6h ("withFallbacks default 6h", "clamped to 6h"). The code is internally consistent and 24h is strictly safer (more conservative ceiling; the host adapter clamps its own backstop regardless), so there is no behavioral defect — but the description would mislead an operator reading only the PR text. Suggest updating the description to 24h.🧹 Code cleanliness & maintainability — No material issues found
Verdict: No material issues found
Through the code-cleanliness & maintainability lens, this change is well-structured:
criticOwnsDeadline(run/critic.go:29) is the one source of truth shared by bothRun(ceiling choice) andstartCritic(arm/no-op gate), with a clear doc note that they share it "so the two can never drift." This eliminates the old duplicatede.cfg.Ports.Critic == nil || !ra.Critic.Enabledcheck and keeps the two-tier decision consistent.CriticSoftTimeoutis fully removed — I grepped the whole repo and found zero remaining references.CriticAbsoluteMaxreplaces it with a thorough doc comment. No dangling/stale references.critic_deadline_test.gois inrun_testand reusesfakeCritic/fakeCriticHandlefromcritic_test.gorather than redefining them.slowToolInvocation/slowModelare factored out and reused across the new tests.checkpointCauseclosure is set after the unsupervised-failsafe re-wrap, with a comment explaining why ("reads the context the loop runs on") — avoiding a subtle stale-closure bug a future maintainer could easily introduce.Minor/trivial (not material):
run/critic_deadline_test.go:94capturingCriticpartially re-implementsfakeCritic.Monitor(just adds a soft-capture); it could embedfakeCriticto avoid the near-duplicate, but it's a 4-line test helper, so churn isn't worth it.CriticAbsoluteMaxfield doc and theRunblock comment are quite long (~11 lines / ~20 lines). They're accurate and genuinely help a future maintainer avoid re-breaking the two-tier semantics, so this is a style preference, not a defect.⚡ Performance — No material issues found
VERDICT: No material issues found
run/executor.go:310—e.cfg.Defaults.CriticAbsoluteMaxis read from thewithFallbacks()-resolved copy (applied inNewat line 104), so it is the defaulted value (24h when unset). Correct, not a bug.run/executor.go:336-339— the unsupervised-failsafeWithTimeoutre-wrapsrunCtxonly on the rare nil-handle path (hostMonitorreturns nil). One extra timer allocation per such run; immaterial on the hot path and thecheckpointCauseclosure is set afterward (line 344) so it reads the re-wrapped context. Correct.run/critic.go:68— the deadline-watch ticker interval (criticDeadlineCheck = time.Second, line 14) is unchanged by this diff; the soft trigger value does not affect the poll cadence. No regression; negligible per-in-flight-run cost. Correct.context.WithTimeoutplus, in the rare nil-handle case, one extraWithTimeoutand a deferred cancel.criticOwnsDeadlineis two field reads. No quadratic or blocking behavior on a hot path.🧯 Error handling & edge cases — No material issues found
VERDICT: No material issues found
I reviewed the diff through the error-handling & edge-cases lens, verifying against the checked-out code.
nil-handle failsafe (
run/executor.go:336-340): whencriticOwns && critic == nil, the run is re-wrapped withcontext.WithTimeout(runCtx, maxRuntime). I confirmedmaxRuntimeis always > 0 (resolved atexecutor.go:158-161with aDefaults.MaxRuntimefallback thatwithFallbacksguarantees > 0), so this never creates a zero/burst timeout. Thedefer cancelUnsupervised()is registered in correct LIFO order aftercancelCause/mergeCancel, so cleanup is complete on every exit path. Verified.cause propagation for the failsafe ceiling (
run/executor.go:315,statusForat:520-538): the 24hWithTimeoutparent →WithCancelCausechild chain means a ceiling expiry surfaces ascontext.DeadlineExceededviacontext.Cause(runCtx)→ "timeout"; a critic kill surfaces asErrCriticKill→ "killed". ThecheckpointCauseclosure is deliberately assigned after the unsupervised re-wrap (:344) so it reads the context the loop actually runs on. Verified correct.startCriticsoft-trigger floor (run/critic.go:54-57):soft <= 0falls back to 90s. The sole caller passes the resolvedmaxRuntime(always > 0), so the floor is pure defence — but it correctly handles a hypothetical non-positive caller, no panic/zero-durationMonitorcall. Verified.critic goroutine panic recovery (
run/critic.go:67):defer func() { _ = recover() }()guards a panicking hostDeadline()/KillCause()on its own goroutine (the mainrecovercan't catch it). Verified present and appropriate.re-wrap vs. goroutine reference aliasing: the unsupervised
WithTimeoutreassignsrunCtx, but the critic deadline-watch goroutine only exists whencritic != nil, and the re-wrap branch is gated oncritic == nil— so the goroutine never observes a stalerunCtx. No leak or missed cancellation. Verified by reading both branches.MergeCancellation interaction (
run/runengine.go:62-73): the merge goroutine cancels the merged context with the caller's cause on caller cancel; this composes correctly with both thecriticOwnsand non-critic paths. No swallowed error.No ignored errors, missing defers, nil-deref, zero/negative-duration, or boundary issues found in this lens.
kimi-k2.6:cloud (ollama-cloud) — No material issues found
Verdict: No material issues found · ⚠️ 1/5 lens(es) errored — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
Verdict: No material issues found
From a security lens, the change correctly restores the two-tier timeout semantics without introducing new trust-boundary gaps, injection vectors, or unsafe defaults.
criticOwnsDeadlinepredicate is used consistently as the single gate for both the run context ceiling selection and the critic arm decision, preventing drift.CriticAbsoluteMax(default 24h) is explicitly documented as a generous runaway guard for a broken/nil host handle, not the normal path. The host is expected to clamp its own backstop to a much smaller value (e.g., mort's 6h convar). An operator can lower it viaDefaults.CriticAbsoluteMax.criticOwns && critic == nil) correctly tightens the context back toMaxRuntimewhen the host Monitor returns no handle, preventing a run from escaping to the 24h ceiling if the critic fails to arm.startCritic'ssoftTriggeris the resolvedMaxRuntime, whichwithFallbacksguarantees is> 0; the<= 0branch is purely defensive and unreachable from the actual caller.statusForpreserves the kill-vs-timeout-vs-cancel distinction viacontext.Cause, withErrCriticKillwrapped via%wsoerrors.Ismatches correctly.No authn/authz gaps, injection, SSRF, deserialization, secret leakage, or insecure-default issues were identified.
🎯 Correctness — ⚠️ could not complete
⚠️ This reviewer failed to complete: agent: step 10: context deadline exceeded
🧹 Code cleanliness & maintainability — No material issues found
No material issues found.
The change is structurally clean through the maintainability lens:
criticOwnsDeadlineis a well-chosen extraction that centralises the two-tier-timeout predicate and preventsRunandstartCriticfrom drifting out of sync.CriticSoftTimeoutis cleanly removed fromDefaults; no stale references remain.slowToolInvocation,slowModel,capturingCritic).executor.goandcritic.goare verbose but justified by the semantic complexity of the two-tier timeout model.⚡ Performance — No material issues found
No material issues found
Reviewed through the Performance lens. The change restores two-tier deadline semantics: critic-owned runs get a generous runaway ceiling (
CriticAbsoluteMax, default 24 h) instead ofMaxRuntime, while non-critic runs remain unchanged.What I checked:
run/executor.go:304-344— the new context chain for critic-owned runs adds at most one extraWithTimeoutwrap (the unsupervised nil-handle fallback). This is per-run, not per-step or per-iteration, so it does not affect hot-path latency.run/critic.go:62-95— the deadline-watch goroutine and its 1-second ticker are unchanged.MergeCancellationgoroutine behavior is unchanged.cancelTimeout()on every exit path (including panic/recover), so there is no timer leak or heap pressure from long-duration timers.No N+1 queries, no new allocations inside loops, no unbounded growth, no blocking calls on hot paths, and no quadratic behavior introduced. The diff is clean from a performance standpoint.
🧯 Error handling & edge cases — No material issues found
No material issues found.
The two-tier timeout logic is carefully structured with proper nil-safety, defer ordering, and edge-case handling:
criticOwns && critic == nil) correctly re-wrapsrunCtxwithmaxRuntimeso a nilMonitorhandle cannot accidentally run up to the 24h ceiling.checkpointCauseis assigned after the unsupervised re-wrap, so it reads the correct context the loop actually executes on.criticBindingmethods are nil-safe, so the executor can callcritic.recordStep,critic.drainSteer(), etc. unconditionally.startCritic's deadline-watch goroutine defends against panics inh.Deadline()viarecover().withFallbacksguaranteesCriticAbsoluteMax > 0, and thehardCap < maxRuntimeclamp ensures the ceiling never undershoots the nominal budget even for degenerate configs.No ignored errors, missing cleanups, nil panics, or unhandled boundary conditions were introduced.
qwen3.5:397b-cloud (ollama-cloud) — Minor issues
Verdict: Minor issues · ⚠️ 1/5 lens(es) errored — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
Verdict: No material issues found
After reviewing the changes through the security lens and verifying against the checked-out code:
What I verified:
run/executor.go(lines 1-100, 260-360) to trace the timeout context chainrun/critic.go(lines 1-100) to verify the deadline-watch goroutine logicCriticSoftTimeoutis no longer referenced (grep returned no results)checkpointCauseclosure is set AFTER the unsupervised failsafe re-wrap (line 344)Security analysis:
Timeout escape paths: The unsupervised failsafe (lines 336-340) correctly tightens the context back to
maxRuntimewhencriticOwns && critic == nil. The context chain is:timeoutCtx(24h ceiling) →runCtx(cancelCause) → newrunCtx(maxRuntime timeout). When maxRuntime expires, cancellation propagates correctly.Critic handle with no deadline: A non-nil critic handle whose
Deadline()always returns zero would leave the run bounded only byCriticAbsoluteMax(24h default). This is intentional design - confirmed byTestCriticOwnsDeadline_SurvivesPastMaxRuntimewhich explicitly tests this scenario withfakeCriticHandle{}(Deadline returns zero). The 24h ceiling is the documented runaway guard.Input validation: The
softTriggerparameter comes frommaxRuntime, whichwithFallbacks()guarantees is >0. The defensive check atcritic.go:55-56is a harmless backup.No injection/leakage/authz issues: The changes are purely timeout semantics - no untrusted input handling, no secrets, no external calls modified.
The 24h default for
CriticAbsoluteMaxcould be considered generous (mort uses 6h per the PR description), but this is a configuration choice, not a code vulnerability. The mechanism correctly implements the two-tier timeout semantics.🎯 Correctness — ⚠️ could not complete
⚠️ This reviewer failed to complete: agent: max steps reached without a final answer (max 14)
🧹 Code cleanliness & maintainability — Minor issues
VERDICT: Minor issues found
Findings:
run/executor.go:336-340— Unusual defer-in-conditional pattern: The unsupervised-run failsafe wrapsrunCtxwith a new timeout and defers cancellation inside theifblock. While this works in Go (the defer is registered conditionally and captures the variable), this pattern is less common and slightly harder to read than the alternative of declaring the variable outside and using a named cleanup function. Suggested fix: Extract to a small helper or use a more conventional pattern:run/executor.go:37-39— External system reference in comment: TheCriticAbsoluteMaxfield comment referencesmort's agents.critic.absolute_max_seconds = 6h— an external convar from a different codebase. While contextually helpful, embedding external system configuration details in kernel comments creates a maintenance coupling. Suggested fix: Rephrase to describe the relationship without hardcoding external values (e.g., "the host clamps its own backstop to a smaller absolute max").⚡ Performance — Minor issues
The finding is verified. The code does create a nested timeout structure where:
timeoutCtxis created withhardCap(24h when critic owns)criticOwns && critic == nil, a new timeout wrapsrunCtxwithmaxRuntimeThe outer 24h timer remains scheduled until the function exits and the defer at line 316 fires. This is technically wasteful, though it's a rare edge case (defensive path).
The finding is accurate and survives verification. The verdict of "Minor issues" is appropriate since this is a resource efficiency concern in a rare edge case, not a correctness bug.
VERDICT: Minor issues
Findings:
criticOwns && critic == nil(critic configured/enabled but host returned nil handle), the code first createstimeoutCtxwith the generous 24hCriticAbsoluteMax(line 315), then wraps it with a nestedWithTimeout(runCtx, maxRuntime)(line 338). The outer 24h timer remains scheduled even though it will never fire (the innermaxRuntimetimeout fires first). In a high-throughput system where critic handles occasionally fail to arm, this accumulates dormant 24h timers that waste resources. Fix: Check the critic handle validity before creating the generous timeout, or cancel/replace the outer timeout when tightening:cancelTimeout(); timeoutCtx, cancelTimeout = context.WithTimeout(context.WithoutCancel(ctx), maxRuntime). (Verified by reading run/executor.go:315-340 and run/executor.go:65-66 confirming 24h default)🧯 Error handling & edge cases — No material issues found
All findings in the draft review are either style suggestions or based on misunderstandings. The code is well-documented, tested, and works as designed:
context.Causeusage is standard Go 1.20+ behavior and is already well-documented in the code comments.Monitorreturnsnilis intentional defensive behavior, as documented in the code comment ("mort's adapter always arms when the flag is set, so this is pure defence in depth").CriticAbsoluteMaxdefault of 24h vs the host's 6h clamp is intentional design - the 24h is the kernel ceiling while 6h is the host clamp, as clearly documented in the struct comment at lines 32-42.No material issues found.
Verdict: No material issues found
After reviewing the changes through the error handling & edge cases lens and verifying against the checked-out code, all findings in the draft review are either style suggestions or based on misunderstandings of the intentional design:
executor.go:336-340correctly re-wraps the context and setscheckpointCauseafter the re-wrap, ensuring proper cancellation cause propagation. The behavior is tested byTestCriticOwnsDeadline_NilHandleFallsBackToMaxRuntime.Monitorreturnsnilatcritic.go:58-60is intentional defensive behavior, as documented in the code comment atexecutor.go:330-335.CriticAbsoluteMaxdefault of 24h atexecutor.go:66is intentionally larger than the host's 6h clamp, as explained in the struct comment at lines 32-42.Automated adversarial review by Gadfly — consensus across the model swarm. Advisory only — does not block merge.
🪰 Gadfly consensus review — 5 inline findings on changed lines. See the consensus comment for the full ranked summary.
Advisory only — does not block merge.
@@ -34,1 +42,3 @@func (e *Executor) startCritic(runCtx context.Context, cancelCause context.CancelCauseFunc, ra RunnableAgent, info RunInfo) (*criticBinding, func()) {//// softTrigger is the run's resolved MaxRuntime: for a critic-owned run MaxRuntime// is the soft wake (mort's two-tier semantics — the critic first reviews once the🟡 softTrigger=MaxRuntime is overloaded as the default battery's idle-stall window (critic.go:262), delaying hang detection from ~90s to a full MaxRuntime; comment 'reviews once the run exceeds its nominal budget' misdescribes the idle-based default behavior
correctness · flagged by 1 model
🪰 Gadfly · advisory
@@ -40,2 +54,3 @@soft := softTriggerif soft <= 0 {soft = 90 * time.Second // defensive: withFallbacks normally guarantees >0soft = 90 * time.Second // defensive only; the sole caller passes MaxRuntime (>0)⚪ Unreachable 90s defensive fallback reintroduces a bare magic literal that was previously a named default (CriticSoftTimeout)
maintainability · flagged by 1 model
🪰 Gadfly · advisory
@@ -33,0 +34,4 @@// not a hard cap, and the critic's own extendable backstop is the normal// deadline. This ceiling exists ONLY to stop a critic that never advances its// deadline (a broken host handle) from running forever, so it is deliberately// set FAR beyond any realistic backstop (default 24h): the host clamps its own⚪ Comment references external system convar value
maintainability · flagged by 1 model
run/executor.go:37-39— External system reference in comment: TheCriticAbsoluteMaxfield comment referencesmort's agents.critic.absolute_max_seconds = 6h— an external convar from a different codebase. While contextually helpful, embedding external system configuration details in kernel comments creates a maintenance coupling. Suggested fix: Rephrase to describe the relationship without hardcoding external values (e.g., "the host clamps its own backstop to a smaller absolute max").🪰 Gadfly · advisory
@@ -280,0 +312,4 @@hardCap = maxRuntime}}timeoutCtx, cancelTimeout := context.WithTimeout(context.WithoutCancel(ctx), hardCap)🟠 Nested 24h timeout wastes timer resources in unsupervised-run failsafe path
performance · flagged by 1 model
criticOwns && critic == nil(critic configured/enabled but host returned nil handle), the code first createstimeoutCtxwith the generous 24hCriticAbsoluteMax(line 315), then wraps it with a nestedWithTimeout(runCtx, maxRuntime)(line 338). The outer 24h timer remains scheduled even though it will never fire (the innermaxRuntimetimeout fires first). In a high-throughput system where cri…🪰 Gadfly · advisory
@@ -295,0 +333,4 @@// bounded only by the 24h ceiling. Tighten it back to the nominal MaxRuntime so// an unsupervised run can't hold its slot far past budget. mort's adapter always// arms when the flag is set, so this is pure defence in depth.if criticOwns && critic == nil {🟡 Unusual defer-in-conditional pattern in unsupervised failsafe
maintainability · flagged by 1 model
run/executor.go:336-340— Unusual defer-in-conditional pattern: The unsupervised-run failsafe wrapsrunCtxwith a new timeout and defers cancellation inside theifblock. While this works in Go (the defer is registered conditionally and captures the variable), this pattern is less common and slightly harder to read than the alternative of declaring the variable outside and using a named cleanup function. Suggested fix: Extract to a small helper or use a more conventional pattern: ```…🪰 Gadfly · advisory
All low-severity follow-ups on the critic-deadline change: - hardCap uses max(CriticAbsoluteMax, maxRuntime) instead of a nested if (723193a7). - Drop the now-dead 90s soft-trigger fallback + its bare literal: the sole caller passes the resolved MaxRuntime (>0), and Run's unsupervised-run failsafe bounds even an impossible 0 (8d377051, 2f86bf58). - Decouple the kernel doc from a named downstream convar ("a 6h host convar") (730c67fc). Graded false-positive: agent.go BackstopMultiplier validation (handled in the host; not in this diff), the 24h default "magic number" (matches every withFallbacks default), and the defer-in-conditional pattern (idiomatic). Kept: the thorough two-tier comment (this logic regressed once) and the rare-path nested timer. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_01Jo75sqmeVPgFUWZQBn179X