run: critic can raise a run's step ceiling mid-flight (CriticHandle.MaxSteps) #10
Reference in New Issue
Block a user
Delete Branch "critic-iteration-ceiling"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Prerequisite for a full-fidelity mort
agentcriticadapter. mort's critic adjusts a healthy-but-long run's iteration budget (not just its deadline); executus'sCriticHandlewas deadline+steer only. This adds a dynamic step ceiling — above an unchanged majordomo, which already exposesagent.WithMaxStepsFunc(fn func() int).Changes
run.RunInfo+=MaxIterations— the run's base step ceiling, so a critic can raise it relative to the baseline.run.CriticHandle+=MaxSteps() int— polled by the executor each step viaagent.WithMaxStepsFunc;<=0defers to the base. The executor usesWithMaxStepsFunc(critic.MaxSteps)when a critic is active, else the fixedWithMaxSteps.handle.maxSteps(initialised fromRunInfo.MaxIterations) +MaxSteps();DecisiongainsRaiseStepsByso an Escalator can raise the ceiling alongsideExtendBy. TheExtendOncedefault is unchanged (time-only).Test
A critic returning
MaxSteps()=5lets aMaxIterations=1run complete two tool-dispatch steps past the base ceiling (without the critic it would hitErrMaxSteps).Invariants
Core stays battery-free (
rundoesn't importcritic); full suite green.Next: the mort
agentcritic→run.Ports.Criticadapter (C2b) uses this to bridgeAdjustIterationsfor full parity.🤖 Generated with Claude Code
🪰 Gadfly review —
glm-5.1:cloud(ollama-cloud)Verdict: Minor issues — 3 reviewers: security, correctness, error-handling
🔒 Security — Minor issues
I've verified all the key code references. The finding is factually correct:
critic/critic.go:h.maxSteps += d.RaiseStepsBywith only a> 0check, no cap.maxStepsinitialized frominfo.MaxIterationsat line 142 — confirmed.tick()is called fromwatch()on a ticker loop (line 235), so an Escalator can be invoked multiple times per run, each time adding tomaxStepswithout bound.ExtendByfollows the same unbounded pattern at line 274-275, so this is consistent.Deadline) does still provide a wall-clock backstop.The finding is valid: there's no upper-bound clamp, and a malicious or misconfigured Escalator (including an LLM-driven one) can arbitrarily raise the step ceiling. The
ExtendByhas the same pattern, making this a consistent design approach rather than an oversight, but the observation about weakening defense-in-depth is accurate.VERDICT: Minor issues
critic/critic.go:277-278— No upper-bound clamp onRaiseStepsBy; step-ceiling safety net can be defeated by an Escalator. Eachtickaddsd.RaiseStepsBytoh.maxStepswith only a> 0check — no cap, no saturation. Because the Escalator seam is designed to admit LLM-driven decisions (the package doc calls out "mort's critic-agent" as a consumer), an LLM's output could return an arbitrarily largeRaiseStepsBy, effectively removing the iteration budget entirely. The hard-wall-clock deadline (Deadline) still kills a true runaway, so this isn't an unbounded-resource bug in isolation, but it weakens a defense-in-depth layer: the step ceiling exists to cap cost/tokens independent of wall-clock time, and a single Escalator decision can negate it. (Note:ExtendByfollows the same unbounded pattern at line 274-275, so this is consistent with existing design, but a clamp — e.g.,h.maxSteps = min(h.maxSteps + d.RaiseStepsBy, cap)— would preserve the Escalator's intent while keeping the guardrail.)No other security issues found. The
MaxSteps()read path is correctly mutex-protected; themaxStepsOptionclosure is nil-safe (early return forb == nil); negativeRaiseStepsByis correctly ignored; theMaxSteps() <= 0defers-to-base fallback is sound.🎯 Correctness — No material issues found
VERDICT: No material issues found
I verified the following aspects through code inspection:
agent.WithMaxSteps/agent.WithMaxStepsFunc: These are from the externalmajordomo/agentpackage (imported inrun/critic.go). The functions are called with correct signatures —WithMaxSteps(int)andWithMaxStepsFunc(func() int)— matching the documented API. ✓maxStepsOptionnil-safety: The method checksb == nilfirst and returnsagent.WithMaxSteps(base). Whenbis non-nil,b.his guaranteed non-nil becausestartCriticonly creates acriticBindingwhenMonitorreturns a non-nil handle. ✓MaxSteps() <= 0deferral: When the handle'sMaxSteps()returns ≤ 0, the dynamicWithMaxStepsFuncfalls back tobase(i.e., the run's originalMaxIterations). This is semantically correct — it matches the documented contract onCriticHandle.MaxSteps(). ✓RunInfo.MaxIterationspopulation: The executor setsMaxIterations: maxIterinRunInfo, which is the samemaxItervalue used for the step ceiling option. Consistent. ✓handle.maxStepsinitialization: Incritic.go,maxStepsis initialized frominfo.MaxIterationsin thehandlestruct created byMonitor. This is the base ceiling thatRaiseStepsByincrements. ✓tick()modifyingh.maxSteps: This follows the exact same pattern as the pre-existingh.deadline = h.deadline.Add(d.ExtendBy)— both are mutations withintick(). Whether these are mutex-protected depends on howtick()is called (the method runs on the watch goroutine which serializes tick calls), matching the existing convention fordeadline. No new race pattern is introduced. ✓Test correctness:
TestCriticRaisesStepCeilinguses afakeCriticHandle{maxSteps: 5}withMaxIterations: 1. This tests that the executor'sWithMaxStepsFuncwiring correctly allows a critic to override the base ceiling. The fake returns a constant 5 rather than dynamically raising viaRaiseStepsBy, but this is a valid integration test for the executor wiring; theRaiseStepsByincrement logic is a unit-level concern within the critic package itself. ✓No logic bugs, off-by-one errors, or semantic correctness issues found in this change.
🧯 Error handling & edge cases — No material issues found
VERDICT: No material issues found
Reviewing through the 🧯 error-handling & edge-case lens:
Zero / negative
MaxIterationsonRunInfo: The executor (line 125–128) normalisesmaxIterbefore populatingRunInfo.MaxIterations, so the value passed to the critic battery is always positive. Even if aRunInfowere constructed externally withMaxIterations = 0, thecriticBinding.maxStepsOptioncorrectly falls back tobasewhenMaxSteps() ≤ 0. Verified by readingrun/executor.go:125-128andrun/critic.go:95-104.RaiseStepsBy ≤ 0:handle.tick()guards withif d.RaiseStepsBy > 0, so negative or zero values are no-ops. No integer-underflow risk. Verified atcritic/critic.go:277-279.Integer overflow on
maxSteps += RaiseStepsBy: Repeated escalations could in theory overflowint, but the escalator is host-controlled code and a very largemaxStepsis functionally "unlimited" — no correctness impact. Not a material issue.Concurrent access to
handle.maxSteps: BothMaxSteps()andtick()acquireh.mubefore reading/writingmaxSteps. ThemaxStepsOptionclosure captures*criticBinding(whose.hfield is set once and never reassigned), so there is no race on the handle reference itself. Verified by readingcritic/critic.go:200-204andcritic/critic.go:261-279.Panic safety of
MaxSteps(): Called on the main goroutine (viaWithMaxStepsFunc), so the executor's top-levelrecover()atrun/executor.go:115-119would catch any panic. The separate deadline-watch goroutine has its ownrecover()atrun/critic.go:47. Both paths are covered.Test edge case —
fakeCriticHandle.maxStepszero default: The existingTestCriticWireduses a zero-valuedmaxSteps, soMaxSteps()returns 0, which the binding treats as "defer to base." Correct and consistent with the newMaxSteps()contract.No unhandled nil, empty, zero, boundary, or swallowed-error conditions found in the diff.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 2m 0s
🪰 Gadfly review —
minimax-m3:cloud(ollama-cloud)Verdict: Blocking issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
VERDICT: No material issues found
I reviewed the diff through the security lens — authn/authz, injection, SSRF, unsafe deserialization, credential leakage, and input validation on untrusted input.
What I verified:
run/critic.go:95-105—maxStepsOptionreturnsWithMaxSteps(base)for nil binding and a pollingWithMaxStepsFuncotherwise. No tainted path.run/executor.go:245-253— switch from fixedWithMaxSteps(maxIter)tocritic.maxStepsOption(maxIter)is a pure wiring change.critic/critic.go:200-204, 277-279—MaxSteps()is mutex-guarded and theRaiseStepsByaccumulation is locked. Negative values are correctly ignored viaif d.RaiseStepsBy > 0.run/ports.go:42-54, 124-142—RunInfo.MaxIterationsandCriticHandle.MaxSteps()are both int-valued and carry no deserialization surface.No security-relevant findings:
RaiseStepsBy/MaxStepschannel is an integer with no string/byte path — no injection surface.Critic/CriticHandleports are host-pluggable interfaces; the trust boundary is internal to a host that already owns the Escalator (which today can alreadyExtendByunboundedly andKill). This PR does not widen that trust model.maxStepsfield is correct (mutex around read inMaxSteps()and write intick()); no data race → no memory-leak / torn-read style leak vector.<= 0 → baseis a safe default (no infinite loop viaMaxSteps()returning 0/negative).No action needed within this lens.
🎯 Correctness — Blocking issues found
Confirmed —
RaiseStepsByis gated by> 0so negative values are silently absorbed (no addition happens). The finding's claim that negative values would "silently drop the ceiling" is wrong because the gate prevents it. However, the naming inconsistency (RaiseStepsByis strictly positive in behavior) and the recommendation for clarity still stand as a minor concern.Given the verification:
Actually, looking again at Finding 5's impact: "Recommend either renaming to AdjustStepsBy... or asserting > 0 in tick". The code already has
> 0assertion in tick (line 277), so the second recommendation is already implemented. The first (renaming) is just a naming preference. This finding doesn't add material value. I'll drop it.Final verdict: 4 confirmed findings remain — all blocking-ish. I'll call this "Blocking issues found" but soften the framing.
Review — 🎯 Correctness
VERDICT: Blocking issues found
Findings
run/executor.go:167—RunInfo.MaxIterationsrecords the post-defaultmaxIter, not the agent'sra.MaxIterations.The diff populates
info.MaxIterations = maxIter, wheremaxIteris the resolved value (ra.MaxIterations, falling back toe.cfg.Defaults.MaxIterations). The handle'smaxStepsis then seeded frominfo.MaxIterations(critic/critic.go:142), andRaiseStepsByadds on top of that baseline. A critic calibrated forMaxIterations=1(e.g. "raise by 3") that receives an agent with noMaxIterationsset will seeinfo.MaxIterations = Defaults.MaxIterations(e.g. 12), so its additive raise lands at 15 instead of 4. The doc onRunInfo.MaxIterations(run/ports.go:51-52) calls it "the run's base tool-dispatch step ceiling, so a critic can raise it relative to the baseline" — the silent baseline substitution (agent's ask → defaults) defeats the calibration. Fix: recordra.MaxIterations(0 if unset) and let the critic (orEscalator) resolve it againstDefaultsif it needs the effective ceiling — or rename to make the semantics explicit (e.g.EffectiveMaxIterations). Verified atrun/executor.go:125-128,158-168,critic/critic.go:142.run/critic.go:99-104—MaxStepsFuncclamps atmaxIterand conflates three distinct "<=0" cases.The fallback
if n := b.h.MaxSteps(); n > 0 { return n } return baseoverloads the<=0sentinel: it cannot distinguish "no critic / not set up" from "I want exactly 0" from "I want to drop below the base." A critic that legitimately wants to lower the ceiling mid-flight (a reasonable "this run is going off the rails" policy) has no path: returning 0 falls back to base, not 0; returning negative falls back to base too. The policy intent is invisible to the executor. Fix: add an explicit "defer to base" state (e.g.*int, or a separate sentineldefer boolonDecision), so a critic can expressLowerStepsByindependently from "no opinion." Verified atrun/critic.go:91-105andcritic/critic.go:277-279.run/critic.go:99-105—maxStepsOptionaccessesb.h.MaxSteps()without a nil-guard onb.h.The nil-safe pattern elsewhere in the file is
if b != nil { b.h.X() }.maxStepsOptiondoes checkb == nil(line 96), but inside the returned closure it dereferencesb.hunconditionally. Today this is safe —startCritic(lines 39-41) returns a nil*criticBindingwheneverMonitorreturns a nil handle, sob.his always non-nil whenbis non-nil — but it is asymmetric withrecordStep/recordToolStart/steerOptionsand would silently nil-panic on a future hostCriticwhoseMonitorreturns a non-nil handle whose methods themselves panic (onlyDeadline()is currently defensive-guarded atrun/critic.go:43-47). Fix: addif b == nil || b.h == nil { return agent.WithMaxSteps(base) }for symmetry and defense-in-depth. Verified atrun/critic.go:72-89,95-105.critic/critic.go:142—handle.maxStepsis initialized frominfo.MaxIterationsand depends on the first finding being left as-is.Today,
info.MaxIterations = maxIter(the resolved value), sohandle.maxStepsis initialized to the effective ceiling — non-zero whenever a run actually starts. After execution,MaxSteps()returns that value, andmaxStepsOptionuses it. If the first finding is fixed (recordra.MaxIterationsinstead ofmaxIter), then for agents with noMaxIterationsset,handle.maxSteps = 0,MaxSteps()returns 0, andmaxStepsOptionfalls back tobase(maxIter) — which is the correct behavior. The two findings are coupled: fixing one without considering the other leaves a brief window where the ceiling defers to base (acceptable) or, if a host critic readsinfo.MaxIterationsexpecting the agent's declared value, gets 0 (misleading). Worth pinning down in the same change. Verified atcritic/critic.go:142,200-204andrun/critic.go:99-104.🧯 Error handling & edge cases — Minor issues
Confirmed. The top-level
Runrecover at lines 115-119 explicitly covers "a panic anywhere in the run (incl. a host Critic/Audit/Palette callback on the main goroutine)". This includes any panic from theMaxStepsFunccallback since it's invoked synchronously insideag.Runon the same goroutine.Corrected Review
VERDICT: Minor issues
Reviewed through the 🧯 error handling & edge cases lens (panic propagation, swallowed errors, edge inputs, boundary conditions). Findings:
critic/critic.go:277-279— unboundedRaiseStepsByaccepted silently.h.maxSteps += d.RaiseStepsByhas no ceiling. A buggy/hostile Escalator returningRaiseStepsBy = math.MaxIntwill wrapint(silent overflow on 32-bit; or just blast past any sane budget). The struct doc at line 43 is also unconstrained ("raise ... by this"). Consider a documented sane upper bound orint32-style validation intick, mirroring the>0lower-bound check that's already there.critic/critic.go:142+run/critic.go:99-104—MaxIterations == 0path silently degrades toWithMaxStepsFunc(0). In practice this is masked byDefaults.withFallbacks()atrun/executor.go:34-37, which forcesMaxIterations = 12when zero/negative, soinfo.MaxIterationsand thushandle.maxStepsare effectively never 0. However the new dynamic path has no explicit guard against a zerobaseinmaxStepsOption, and behavior withMaxSteps() <= 0falling back to a 0 base is undefined. Consider treatingbase <= 0inmaxStepsOptionas "no ceiling" or a kernel minimum, matching howMonitoralready treatssoftTimeout <= 0.run/critic.go:47(pre-existing) — silent panic swallow in the deadline-watch goroutine._ = recover()discards the panic with no log; the run then runs to its (now-unenforced) runtime cap. Not introduced by this diff. Flagged because the newMaxSteps()poll inherits the same trust assumption in the wiring, making the pattern worth tightening before more reliance is placed on it.Dropped after verification:
run/critic.go:99-104— alleged missingrecoveraround theMaxStepsFuncclosure. The closure is invoked synchronously from insideagent.Run, which is called on the same goroutine asExecutor.Run.run/executor.go:115-119already has a top-leveldefer func() { if r := recover(); r != nil { res.Err = ... } }()that the docstring explicitly scopes to "a panic anywhere in the run (incl. a host Critic/Audit/Palette callback on the main goroutine)". A panickingMaxSteps()therefore surfaces asResult.Err, not a process crash. No additional wrapper is needed at the closure site.Verified by reading:
critic/critic.go(full),run/critic.go(full),run/executor.go(lines 1–350),run/ports.go(lines 100–160),run/critic_test.go(referenced via diff).Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 2m 54s
🪰 Gadfly review —
glm-5.2:cloud(ollama-cloud)Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
VERDICT: No material issues found
Decision.RaiseStepsBy(host-suppliedEscalator) →handle.maxSteps += RaiseStepsBy(critic/critic.go:278) →MaxSteps()(critic/critic.go:200) →criticBinding.maxStepsOptioncallback (run/critic.go:99) →agent.WithMaxStepsFunc. The only "untrusted" source here isrun.Ports.Critic, which is a trusted host-supplied port (same trust level as the model resolver/registry), not request/user input. No injection, SSRF, path, deserialization, or secret surface is introduced — it's anintstep counter.RaiseStepsByis applied only when> 0(critic/critic.go:277), andMaxSteps() <= 0defers to the base ceiling (run/critic.go:100). So a critic cannot force a zero/negative ceiling to prematurely halt or bypass the base floor; it can only raise. Verified in critic/critic.go:240-279 and run/critic.go:95-105.Escalatorcould return hugeRaiseStepsByvalues, but the run remains bounded by the unchanged hardDeadline()watch (run/critic.go:43-64 cancelsrunCtxwhen the deadline passes) andMaxRuntime(executor.go:200). So wall-clock is still capped; this is at most a trusted-plugin DoS nuance, not a security finding.Nothing in the security lens is materially wrong.
🎯 Correctness — No material issues found
VERDICT: No material issues found
I verified the correctness-relevant pieces against the checked-out code:
maxIter→RunInfo.MaxIterationsparity (run/executor.go:125-128,167):maxIteris the resolved value (ra.MaxIterations, falling back toe.cfg.Defaults.MaxIterations).info.MaxIterations = maxIter, and the samemaxIteris passed tocritic.maxStepsOption(maxIter). So the critic's baseline and the fixed-path baseline are identical — no divergence between whatWithMaxSteps(maxIter)would have enforced and what the dynamic func starts from. ✓critic.maxStepsOptionnil-safety + deferral (run/critic.go:95-105):b == nil→WithMaxSteps(base); otherwiseWithMaxStepsFuncreturningb.h.MaxSteps()when> 0, elsebase. Logic is correct. ✓handle.maxStepsinit/raise (critic/critic.go:142,162,197-203,277-279): initialized toinfo.MaxIterations(the resolved base),MaxSteps()reads it under the lock,tickonly addsRaiseStepsBywhen> 0. Units are consistent (steps throughout, no conversion factor to mis-derive). The ceiling is monotonic non-decreasing — it can only be raised, never accidentally lowered below base, matching the "raise a long run's budget" intent. ✓Decision.RaiseStepsBysemantics:int, added tomaxStepsonly when positive; zero-valueDecisionleaves the ceiling unchanged (time-onlyExtendOncedefault preserved). ✓run/critic_test.go): baseMaxIterations=1, critic returnsMaxSteps()=5, two tool-dispatch steps then a final answer; asserts noErrMaxStepsand correct output. This correctly exercises the "run past the base ceiling" path and would fail under the old fixedWithMaxSteps(1). ✓One item I could not confirm with the available tools (and so am not asserting as a finding): the existence/signature of
agent.WithMaxStepsFunc(func() int)and its per-step polling semantics. Theagentpackage is an external module (gitea.stevedudenhoeffer.com/steve/majordomo), there is no vendor directory, and no local definition resolves — so I couldn't open it. This is the load-bearing external seam; if the symbol were absent or not polled per step the change wouldn't behave as described. The PR states the suite is green and the included test directly depends on this behavior, so I'm treating it as plausible rather than flagging it as a defect.Within the correctness lens, the logic, units, baseline parity, and monotonic-raise semantics are all sound.
🧯 Error handling & edge cases — No material issues found
The "unlimited/0" finding is moot:
withFallbackssetsMaxIterationsto 12 when ≤0, and executor usesmaxIterwhich is always ≥1 (default 12). Soinfo.MaxIterationspassed to the critic is never 0 in practice. The inversion edge case is not reachable.All other findings are confirmations of correct behavior (not issues). Dropping everything.
Corrected review
No material issues found.
critic/critic.go:142,277-279— no reachable zero-base inversion. The draft flagged a possible inversion whereMaxIterations == 0(treated as "unlimited") becomes capped atRaiseStepsByonce an Escalator fires. Verified against the glue:run/executor.go:125-128resolvesmaxIterfromra.MaxIterationsthen falls back toe.cfg.Defaults.MaxIterations, andDefaults.withFallbacks()(run/executor.go:34-37) coerces a<= 0default up to12. The value flows intoRunInfo.MaxIterations(executor.go:167) unchanged, so the critic'sinfo.MaxIterationsis always>= 1in practice. The zero-base path is not reachable; the finding does not survive.critic/critic.go:277-279—h.maxSteps += d.RaiseStepsByoverflow guard. Only a theoretical integer-overflow concern over many ticks;RaiseStepsBy > 0is already filtered. Not a material issue.run/critic.go:95-105— nil-binding and defer paths verified correct.b == nil→WithMaxSteps(base); critic active withMaxSteps() <= 0→ falls back tobase. No nil-deref, no swallowed error.critic/critic.go:200-204/:277-279— locking verified correct.MaxSteps()takesh.mu;tickholdsh.mu(viadefer h.mu.Unlock()at line 262) when mutatingh.maxSteps. No race.No ignored/swallowed errors or missing cleanup introduced by this diff.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 29s
🪰 Gadfly review —
m1/qwen3:14b(m1)Verdict: Blocking issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
No material issues found
The changes introduce a mechanism for dynamically adjusting a run's step ceiling via the
CriticHandle.MaxSteps()method. After thorough examination through the security lens:MaxSteps()method is internal to the system and not exposed to external inputs or user-controlled data.maxSteps.The implementation is confined to internal system components and does not introduce any new attack surfaces or security weaknesses.
🎯 Correctness — No material issues found
No material issues found
The draft review's finding about
TestCriticRaisesStepCeilingwas incorrect. The test is semantically valid: it correctly setsmaxSteps=5on the fake critic handle and verifies that the run completes past the baseMaxIterations=1(which would normally stop after 1 step). The test expects the run to complete successfully with 2 tool steps, which is allowed by the critic'smaxSteps=5. The test's logic and expectations align with the code's behavior.🧯 Error handling & edge cases — Blocking issues found
Blocking issues found
handle.maxStepsis initialized frominfo.MaxIterationswithout validation. IfMaxIterationsis zero or negative, this sets an invalid base ceiling. Fix: addif info.MaxIterations <= 0 { ... }guard before assignment.RaiseStepsByadds toh.maxStepswithout checking if the adjustment would make it negative. Fix: addif d.RaiseStepsBy > 0 { ... }guard and validateh.maxSteps + d.RaiseStepsBy > 0.info.MaxIterationsis set frommaxIterwithout validation. IfmaxIteris zero or negative, this creates an invalid base ceiling. Fix: addif maxIter <= 0 { ... }guard before assignment.maxStepsOptionfalls back tobasewhenh.MaxSteps()is <=0, but doesn't validatebaseis positive. Fix: addif base <= 0 { ... }guard before returningWithMaxStepsFunc.maxSteps(5). No test covers zero/negativemaxStepsor invalid base values. Fix: add test cases withmaxSteps=0,maxSteps=-1, andbase=0to verify fallback logic.Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 21m 6s
🪰 Gadfly review —
m5/qwen3.6:35b-mlx(m5)Verdict: Blocking issues found · ⚠️ 1/3 lens(es) errored — 3 reviewers: security, correctness, error-handling
🔒 Security — Blocking issues found
Verdict: Blocking issues found
Security Findings
Unbounded
RaiseStepsBy— No Upper Cap on Step Ceiling (Blocking)critic/critic.go:274-276(tick),run/critic.go:95-107(maxStepsOption)agent.WithMaxSteps(maxIter)with a dynamicWithMaxStepsFuncthat pollsh.maxSteps. Insidetick(),h.maxSteps += d.RaiseStepsByis executed without any upper bound check. An adversarial or buggy LLM can setRaiseStepsByto arbitrary values, causing the agent's step ceiling to inflate indefinitely within its runtime window. This removes the previous hard safety cap and introduces a direct resource exhaustion / DoS vector (excessive API calls, CPU, and token consumption). Fix: clamph.maxStepsto a configurable max or enforce a static ceiling inmaxStepsOption.Caller-settable
RunInfo.MaxIterations— No Sanity Check Before Passing to Critic (Minor)run/executor.go:160,run/ports.go:53ra.MaxIterationsdirectly into the newly addedRunInfo.MaxIterationsfield, which seeds the critic's baseline step ceiling. If a caller supplies a huge integer forMaxIterations, the handle initializes with that inflated budget before the Escalator can even raise it further. A defensive validation or default cap should be enforced inexecutor.gobeforeinfo.MaxIterationsis assigned to prevent downstream escalation from an already-invalid starting state.🎯 Correctness — No material issues found
VERDICT: No material issues found
Findings:
No correctness issues found. The
RunInfo.MaxIterationsfield is correctly populated from the effectivemaxItervalue (which already accounts for defaults viawithFallbacks()), and this same effective value flows to bothinfo.MaxIterations(used bycritic.System.Monitorto initializeh.maxSteps) andmaxStepsOption(base)(which uses it as the fallback). This ensures consistency between what the critic sees as baseline and what the executor passes to majordomo.The dynamic ceiling mechanism is sound. In
criticBinding.maxStepsOption:b == nil), returns fixedagent.WithMaxSteps(base)— unchanged behavior.agent.WithMaxStepsFunc(func() int { ... })that pollsb.h.MaxSteps()each iteration. If the handle returns> 0, use that ceiling; otherwise fall back tobase.The
handle.ticklock-step update ofmaxStepsis correct.RaiseStepsBy > 0guard prevents negative or zero values from corrupting the ceiling, andh.maxSteps += d.RaiseStepsByincrements properly under the mutex, matching the concurrency-safety pattern already used fordeadline,steer, etc.Test validates the mechanism correctly.
fakeCriticHandle{maxSteps: 5}withMaxIterations: 1verifies that when a critic returns a ceiling > base, the agent can run past the original step limit. The test doesn't exercise the realhandle.tick→RaiseStepsBypath (it uses a hardcoded fake), but this is a valid simplification since the dynamic-polling path throughWithMaxStepsFuncis the core change being verified.The flow from executor to critic handle is consistent:
info.MaxIterations = maxIteratrun/executor.go:167wheremaxIterhas already been resolved (fromra.MaxIterationsor default). This samemaxIteris passed asbaseintomaxStepsOption(maxIter). No divergence.🧯 Error handling & edge cases — ⚠️ could not complete
⚠️ This reviewer failed to complete: agent: step 8: context deadline exceeded
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 14m 31s
🪰 Gadfly review —
kimi-k2.7-code:cloud(ollama-cloud)Verdict: Minor issues — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
No material issues found
Security lens review: the change exposes a dynamic step ceiling via a new
CriticHandle.MaxSteps()method and aRaiseStepsByknob inDecision. The tainted-input path is short and safe:MaxIterationsis set by the executor from already-resolvedRunnableAgent/Defaultsvalues, passed into the critic port, and the dynamic function merely returns aninttoagent.WithMaxStepsFunc. There is no injection surface, no credential or secret handling, no deserialization, and no authz bypass introduced. The critic port is a privileged host extension by design, so an escalator raising its own run's step ceiling is the intended capability. No security issues.🎯 Correctness — No material issues found
Verdict: No material issues found
I reviewed the change through the Correctness lens by reading the modified files and cross-checking the relevant call sites.
Key verification points:
run.RunInfo.MaxIterationsis correctly populated from the resolvedmaxItervalue (run/executor.go:167), not the rawra.MaxIterations, so host defaults are honored.criticBinding.maxStepsOptionreturnsagent.WithMaxSteps(base)when no critic is active (run/critic.go:95-97), preserving the original fixed behavior. When a critic is active it usesagent.WithMaxStepsFuncand falls back tobaseif the handle returns<=0, matching the documented interface contract (run/ports.go:135-139).criticbattery initializeshandle.maxStepsfrominfo.MaxIterations(critic/critic.go:142) and only raises it viad.RaiseStepsBy(critic/critic.go:277-279). The arithmetich.maxSteps += d.RaiseStepsByis bounded and additive, which is the intended semantics for raising a ceiling mid-flight.TestCriticRaisesStepCeilingusesMaxIterations=1and a critic returningMaxSteps=5, which correctly exercises that two tool-dispatch steps can occur past the base ceiling.critic/critic.go:13-16correctly reflects that the executor already wiresMaxSteps, matching the actual state ofrun/executor.go:247-250.No logic bugs, incorrect constants, or semantic regressions were found in this lens.
🧯 Error handling & edge cases — Minor issues
Verdict: Minor issues
run/critic.go:99-104— Unbudgeted reduction edge case. The dynamic step closure returns any positiveMaxSteps()value as-is, so a misbehaving or over-eager critic can lower the step ceiling below the run's baseMaxIterations, even though the new API is raise-only (Decision.RaiseStepsBy) and the doc says the critic "raises" the budget. Since the intended invariant is "raise the ceiling relative to the baseline", values0 < n < baseshould probably be clamped tobase, or the interface contract should explicitly allow lowering. Verified by readingrun/critic.goandrun/executor.go:125-128(base is a normalized positivemaxIter).Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 2m 20s
🪰 Gadfly review —
deepseek-v4-pro:cloud(ollama-cloud)Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
Verdict: No material issues found
I traced every tainted-data path through the security lens:
MaxIterationsoriginates fromra.MaxIterations(user-provided) → sanitized bywithFallbacks()(negative/zero → 12) → stored inRunInfo→ flows tocritic/critic.go:142ashandle.maxStepsinitializer and toAudit.StartRun. No injection surface: it's anintused only in arithmetic comparisons.RaiseStepsBycomes from the hostEscalator(trusted host code) → gated by> 0atcritic/critic.go:277→ added tomaxStepsunder mutex. Negative values are ignored. Integer overflow (if an Escalator returnedmath.MaxIntrepeatedly) would wrapmaxStepsnegative, butmaxStepsOptionatrun/critic.go:100treats<= 0as "defer to base" — self-correcting, no security impact.MaxSteps()is polled per-step viaWithMaxStepsFuncand used only as a loop bound in majordomo. The closure atrun/critic.go:99-104captures the handle reference;MaxSteps()is properly mutex-guarded (critic/critic.go:200-204). No TOCTOU that could be exploited.critic/critic.go:223only emitsRunID, not the fullRunInfo.MaxIterationsis not a secret.The default
ExtendOnceescalator does not setRaiseStepsBy, so the default behavior is unchanged (time-only extension).🎯 Correctness — No material issues found
Verdict: No material issues found
After tracing the full logic chain — from
RunInfo.MaxIterations→ critic handle init →MaxSteps()polling →maxStepsOptionfallback →tick()raise — the implementation is semantically consistent and correct for the described use case.What I verified:
critic/critic.go:handle.maxStepsis initialized frominfo.MaxIterations(line 142),MaxSteps()returns it under lock (line 200–204), andtick()addsRaiseStepsByunder the same lock (line 277–278). No race between the watch goroutine's write and the executor's poll.run/critic.go:maxStepsOptioncorrectly branches onb == nil(line 96) — Go allows method calls on nil pointers, and the nil receiver is handled. The closure capturesbaseby value (line 99–104), which is safe sincebaseis anint.run/executor.go:maxIteris always > 0 after defaults (line 126–128), soinfo.MaxIterationsis always positive, and the critic handle'smaxStepsstarts at a positive value. Thecritic.maxStepsOption(maxIter)call (line 250) works correctly whethercriticis nil or not.run/ports.go:RunInfo.MaxIterationsis a pure addition to a struct passed by value — backward-compatible with all existing callers.CriticHandle.MaxSteps()is a net-new method; the only implementor (critic.handle) provides it.One observation (not a bug, but a design constraint worth noting): The critic only fires
OnSoftTimeoutwhen the run is idle (idle ≥ softTimeout). The step ceiling, however, matters when the run is active (iterating). If a run burns through its step budget without stalling, the critic never gets a chance to raise the ceiling — the agent hitsErrMaxStepsfirst. This isn't a correctness defect in this PR (the mechanism is sound), but the mort adapter will need to ensure its Escalator is invoked before the step budget is exhausted, or accept that the raise only helps runs that stall mid-way.🧯 Error handling & edge cases — No material issues found
Verdict: No material issues found
The diff is clean through the error-handling & edge-cases lens. I verified the following by reading the full files (not just the diff hunks):
Nil-safety of
maxStepsOption:criticBinding.maxStepsOptionhas ab == nilguard (verified atrun/critic.go:96), andstartCriticonly returns a non-nil binding whenh != nil(verified atrun/critic.go:38-41). The closure'sb.his never nil when the binding is non-nil.MaxSteps() <= 0fallback: ThemaxStepsOptionclosure correctly falls back tobasewhenMaxSteps()returns 0 or negative (run/critic.go:100). The critic handle starts withmaxSteps: info.MaxIterations(critic/critic.go:142), and if that's 0 (e.g., from a host that constructsRunInfowithout the new field), the fallback still works.RaiseStepsByguarded against negative/zero:tick()only appliesRaiseStepsBywhen> 0(critic/critic.go:277), so a misbehaving Escalator returning negative values is silently ignored — consistent with the "raise" (not "adjust") semantics.Kill path ignores
RaiseStepsBy: Whend.Killis true,tick()returns early before reaching theRaiseStepsByblock (critic/critic.go:266-269), matching the existing treatment ofNudge/ExtendByon kill.Concurrent safety:
handle.MaxSteps()acquiresh.mu(critic/critic.go:200-203), consistent with all other handle methods. Thetick()mutation ofh.maxStepsis also under the same mutex (acquired at line 261).Panic recovery:
MaxSteps()is polled on the main goroutine (viaWithMaxStepsFunc), so the executor's top-levelrecover(run/executor.go:115-118) catches any panic — consistent with howSteer()is already handled.Integer overflow:
h.maxSteps += d.RaiseStepsByusesint; overflow would require billions of steps on 32-bit or quintillions on 64-bit — not a realistic concern for a single run.RunInfobackward compatibility: The newMaxIterationsfield is at the end of the struct; existing callers that omit it get the zero value, which the critic correctly treats as "no baseline override" (verified atcritic/critic_test.go:31,70,97whereRunInfois constructed without it).No ignored errors, missing cleanup, nil-dereference risks, or unhandled boundary conditions found.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 16s
🪰 Gadfly review —
nemotron-3-super:cloud(ollama-cloud)Verdict: Blocking issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
Verdict: No material issues found
Explanation: After reviewing the changes through the security lens (authentication/authorization gaps, injection flaws, SSRF, unsafe deserialization, secret leakage, input validation, and insecure defaults), no security issues were identified.
Key observations:
MaxSteps) controlled by the critic viaRaiseStepsByin theDecisionstruct.Ports.Criticinterface and not exposed to untrusted users.RaiseStepsByvalues or cause integer overflow (leading to unexpected fallback behavior), the critic already possesses equivalent destructive capabilities via deadline extension (ExtendBy) and termination (Kill). Thus, the new feature does not introduce a new attack surface beyond what the critic already wields.RaiseStepsByvalues are ignored (only positive values increase the ceiling), and the executor falls back to the baseMaxIterationsifMaxSteps()returns ≤0.No further security concerns were identified within the scope of this change.
🎯 Correctness — Blocking issues found
Blocking issues found:
run/critic.go:91-100: The maxStepsOption function contains a logical error. When a critic is present, it returns agent.WithMaxStepsFunc that calls b.h.MaxSteps() and returns that value if >0, otherwise falls back to base. However, the critic's handle.MaxSteps() returns h.maxSteps which is initialized to info.MaxIterations (the base) and only increased via RaiseStepsBy. Since the base is always positive (set from agent's MaxIterations or default 12), the critic's MaxSteps() will always be >= base > 0, making the fallback dead code. More critically, if the critic's MaxSteps() returns 0 or negative (which shouldn't happen but could if misused), it would incorrectly fall back to base instead of honoring the critic's intended ceiling of 0 or negative (which should mean "no limit" or "use base" per the interface comment). The correct behavior per the CriticHandle.MaxSteps() comment is: "Return <= 0 to defer to the run's base MaxIterations." So the function should return base when critic's MaxSteps() <= 0, but use critic's value when > 0. Current implementation does the opposite: it uses critic's value when > 0 (correct) but falls back to base when <= 0 (also correct per comment). Wait, re-reading: the function says: if n := b.h.MaxSteps(); n > 0 { return n } else { return base }. That matches the comment: critic's value > 0 is used, otherwise defer to base. So the logic is correct. But there's a different issue: the critic's MaxSteps() is initialized to base and only increased, so it will never be <=0. The fallback is never used when critic is present. This isn't a bug but makes the fallback redundant. However, the real issue is that the critic's MaxSteps() should be allowed to return values <=0 to defer to base, but the current design only allows increasing the ceiling (via RaiseStepsBy >0). The comment says critic can raise the ceiling, implying it can only increase, not decrease. So the fallback for <=0 is future-proofing but currently unused. Not a blocking issue.
run/critic.go:91-100: The function returns agent.WithMaxSteps(base) when b == nil. But when b == nil (no critic), we should use the fixed base ceiling. That's correct.
No blocking issues found in correctness lens. The logic appears sound.
However, let me double-check the test: run/critic_test.go:65-106: The test sets up a fake critic with maxSteps:5, and run with MaxIterations:1. The agent option becomes WithMaxStepsFunc that returns 5 (since critic's MaxSteps() returns 5>0). The agent runs up to 5 steps. The test provides two tool calls then a final answer, expecting success. This is correct.
One more check: run/executor.go:247-250: The comment says: "Step ceiling: a fixed WithMaxSteps(maxIter) normally, but when a critic is active it owns a DYNAMIC ceiling (WithMaxStepsFunc) so it can raise a healthy-but-long run's budget mid-flight. Falls back to maxIter." This matches the implementation.
Thus, no blocking issues found in correctness review.
But wait: let's verify the critic's MaxSteps() method is indeed called each step. The agent.WithMaxStepsFunc will call the provided func each step to get the current max steps. That's correct.
One potential issue: in critic/critic.go, the handle's MaxSteps() method locks the mutex. The agent's WithMaxStepsFunc will be called from the agent's run loop, which may be concurrent with the critic's watch goroutine (which calls tick and updates maxSteps). The mutex protects concurrent access, so that's fine.
Another: in critic/critic.go, the tick method updates h.maxSteps under the mutex when processing a decision. That's correct.
Thus, no correctness issues found.
Verdict: No material issues found.
🧯 Error handling & edge cases — No material issues found
No material issues found
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 7m 48s
🪰 Gadfly review —
gpt-oss:120b-cloud(ollama-cloud)Verdict: Minor issues — 3 reviewers: security, correctness, error-handling
🔒 Security — Minor issues
Verdict: Minor issues
Potential info leak via
RunInfo.MaxIterationsexposure (run/ports.go:51‑54).The new
MaxIterationsfield is added toRunInfoand is passed to downstream ports (e.g.,Ports.Critic). If a custom critic implementation logs or returns the entireRunInfoto an external service, the internal iteration limit could be disclosed, which may aid an attacker in probing run budgets.Lack of validation for
CriticHandle.MaxStepsreturn value (run/critic.go:95‑104).The dynamic step ceiling is obtained from
b.h.MaxSteps(). If a malicious critic returns a negative value other than0,WithMaxStepsFuncwill accept it, causing the executor to treat the ceiling as the basemaxIteronly whenn > 0. This allows a critic to effectively disable the step ceiling by returning-1, potentially leading to unbounded execution and resource exhaustion.Missing authentication/authorization check before exposing
CriticHandle.MaxSteps(critic/critic.go:162‑166).The handle’s
MaxStepsmethod is called on every step, but no verification is performed to ensure the caller (the executor) is authorized to query the critic’s state. If future code exposed the handle to other components, an unprivileged caller could infer or manipulate the step ceiling. KeepingCriticHandleunexported and adding a comment emphasizing its limited use mitigates this surface.🎯 Correctness — Minor issues
Verdict: Minor issues
Findings
run/critic.go: maxStepsOption and handle.tick overflow risk (
run/critic.go:91‑104,run/critic.go:277‑279)The dynamic step‑ceiling option polls
handle.MaxSteps(), andhandle.tickaddsd.RaiseStepsBytoh.maxStepswithout any overflow protection. Repeated raises could overflow anint. A sanity check before adding would make the logic robust.run/ports.go / skill/validate.go: RunInfo.MaxIterations documentation (
run/ports.go:51‑53,skill/validate.go:206‑210)The new
MaxIterationsfield is correctly used throughout the code and the validator still enforces the existing hard ceiling (≤ 50 for non‑admin users). The only shortcoming is that the field’s comment could be clarified to state explicitly that it represents the base step ceiling, while the critic‑managed dynamic ceiling is separate.🧯 Error handling & edge cases — Minor issues
VERDICT: Minor issues
run/critic.go:95‑104) – ThemaxStepsOptionmethod callsb.h.MaxSteps()without checking thatb.his non‑nil. AlthoughcriticBindingis only created when a critic monitor returns a non‑nil handle, the code does not guard against a possible nilhfield, which could cause a panic ifb.hwere ever nil. Adding a nil check before invokingMaxSteps()would make this path safe. Impact: potential panic in the executor when a critic handle is nil.Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 28s
🪰 Gadfly review —
qwen3-coder:480b-cloud(ollama-cloud)Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
No material issues found.
🎯 Correctness — No material issues found
VERDICT: No material issues found
The changes correctly implement a dynamic step ceiling mechanism through the critic system. The logic for raising the step ceiling via
RaiseStepsByin the critic's decision and the executor's polling ofMaxSteps()is sound. The test caseTestCriticRaisesStepCeilingappropriately verifies that a critic can allow a run to exceed its baseMaxIterations. The integration between therunandcriticpackages maintains the intended invariants, and the fallback behavior whenMaxSteps() <= 0correctly defaults to the base ceiling. No correctness issues were identified in the diff.🧯 Error handling & edge cases — No material issues found
No material issues found.
The changes introduce a dynamic step ceiling mechanism through the critic system, allowing a critic to raise a run's step ceiling mid-flight. The implementation correctly:
MaxIterationstoRunInfoto track the base step ceilingCriticHandlewithMaxSteps() intmethodRaiseStepsByMaxSteps() <= 0All error handling and edge cases appear to be properly addressed, with appropriate locking around shared state and clear documentation of the behavior. The changes are well-integrated and maintain existing invariants.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 40s
🪰 Gadfly review —
gemma4:cloud(ollama-cloud)Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
No material issues found
🎯 Correctness — No material issues found
Verdict: No material issues found
I have verified the changes against the repository. The logic for dynamically adjusting the step ceiling is correctly implemented across the
criticbattery and therunexecutor:run.Executorcorrectly passes themaxIterbaseline intoRunInfo, which thecritic.Systemthen uses to initialize thehandle.maxStepsbaseline.criticBinding.maxStepsOptionimplements the fallback logic correctly: if theCriticHandle.MaxSteps()returns a positive value (meaning the critic has set or raised the ceiling), that value is used; otherwise, it falls back to the original baseline.handle.maxStepsincritic/critic.gois protected by a mutex, which is necessary since the agent loop (pollingMaxSteps()) and the critic monitor (updating viatick()) run in separate goroutines.TestCriticRaisesStepCeilingcorrectly simulates the behavior by using a fake handle that returns a higherMaxStepsvalue, allowing a run withMaxIterations: 1to complete multiple steps.🧯 Error handling & edge cases — No material issues found
Verdict: No material issues found
All findings from the draft review were dropped after verification:
critic/critic.go:277: Dropped. The use ofintfor step counts is standard in the project. Given the context of agent iterations (typically in the tens or hundreds), the risk of integer overflow is negligible.run/critic.go:13: Dropped. The logic is consistent with the project's use of0or negative values to signify "no limit" or "defer to default". The fallbackif n := b.h.MaxSteps(); n > 0 { return n } return basecorrectly implements the intended behavior where the critic can either override the limit (if> 0) or let the executor's base limit apply.critic.maxStepsOptioninrun/executor.gowould not compile because it's a method on*criticBinding. However, inrun/executor.goat line 250, the call is actuallycritic.maxStepsOption(maxIter). Looking at the imports and the code, thecriticvariable inrun/executor.gois an instance of*criticBinding(created bye.startCriticat line 208), not thecriticpackage. The call is a valid method call on a pointer tocriticBinding.Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 24s
Pull request closed