run: wire SessionToolFactory + PostRun artifacts + AttachImages #12
Reference in New Issue
Block a user
Delete Branch "run-postrun-sessiontools"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
The gateway for migrating mort's artifact-producing surfaces (chat API, chatbot,
.skill, scaddy) onto executus. The session-tool types already lived intool/(the P4 move), but the executor never used them — this wires them.Changes
run/session.go—steerMailbox(thread-safe message queue) +runSession(tool.AgentSessionover it:AttachImages→ a user-role multimodal message injected before the agent's next step) +runPostRun(panic-isolated hook call).inv.AttachImagesbefore the toolbox build (so tool handlers capture the live seam); addinv.ExtraTools+ aSessionToolFactory's per-runToolsto the toolbox; defer itsCleanup; merge the session mailbox with the critic's nudges into oneWithSteer; after the run, callPostRunwith the full transcript (runRes.Messages) →Result.PostRunResult(best-effort — a PostRun panic never fails the run).run.Result+=PostRunResult *tool.PostRunResult.criticBinding.steerOptions(superseded bydrainSteer).Tests
A factory whose
PostRunemits an artifact from the output+transcript +Cleanuplands onResult.PostRunResult; a factory-added tool is callable by the model.Next: a mort PR maps
run.Result.PostRunResultthroughExecutusRunnerso a flagged artifact run delivers its gif/image — then the chat-API/chatbot surface migrations.🤖 Generated with Claude Code
🪰 Gadfly review —
glm-5.1:cloud(ollama-cloud)Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
VERDICT: No material issues found
Reviewing through the 🔒 Security lens:
inv.AttachImages/steerMailbox.push— The newAttachImagesseam queues arbitraryllm.Messages (user-role, constructed fromtext + images) into the agent's conversation loop. Only trusted, server-side code can reach this: tool handlers viainv, andSessionToolFactoryclosures viaAgentSession. No end-user input flows directly intopushwithout passing through a trusted tool handler first. The same trust model already applies to tool return values the model sees.SessionToolFactory/ExtraToolsinjection — Both allow injecting arbitrary tools and code (includingPostRun/Cleanupclosures) into a run. This is by design — they're set by the trusted dispatch path (mort's code constructingInvocation), not from raw user input.Invocationis passed by value toRun, so post-construction tampering by the caller isn't possible.PostRun panic isolation —
runPostRuncorrectly usesdefer/recoverto isolate panics, setting the result tonilrather than propagating. This prevents a malicious or buggyPostRunfrom crashing the process or failing an otherwise-successful run. ✓Cleanupisdeferred — Ensures resource teardown runs even on early returns or panics in the run path. ✓steerMailboxsynchronization —pushacquiresmu, appends, releases;drainacquiresmu, swaps tonil, releases. No TOCTOU, no data race on the slice after drain (the caller owns the returned slice). ✓No secret/credential leakage —
PostRunResultcarriesArtifacts(name+MIME+bytes) andMetadata(arbitrary map). These are produced by trusted PostRun hooks and returned in theResultstruct. No credentials, tokens, or secrets are in scope.No security-relevant findings after verification.
🎯 Correctness — No material issues found
VERDICT: No material issues found
After thorough verification:
runSession↔tool.AgentSessionconformance: TheAgentSessioninterface (intool/session_tools.go:23-30) requiresAttachImages(text string, images ...llm.ImagePart). The concreterunSessioninsession.go:64has exactly this signature. ✓Two
runSessioninstances sharing one mailbox: Lines 186–187 and 219 both create&runSession{mailbox: mailbox}pointing to the samesteerMailbox. This is intentional —inv.AttachImages(for ordinary tools) and the factory'sAgentSession.AttachImagesmust both inject into the same agent loop. ✓Nil-safety on
critic.drainSteer():drainSteer()returnsnilwhenb == nil(critic.go:119–123). The merged steer closureappend(mailbox.drain(), critic.drainSteer()...)appends to a nil slice when the critic is absent, which is valid Go. ✓runPostRunpanic recovery: Uses a named returnprrcorrectly; thedefer/recoversetsprr = nilon panic, which overwrites any partial assignment before the panic. ✓Cleanupdeferral:defer st.Cleanup()(executor.go:221) is registered before the tool-adding loop, so it runs even iftoolbox.Addfails at line 224. Early returns before the factory block (lines 192, 199) correctly skip Cleanup since no factory was created yet. ✓ctxvsrunCtxin PostRun:runPostRunreceivesctx(the parent context, line 341), not the potentially-cancelledrunCtx. This is correct — PostRun should still be able to produce artifacts after a timeout. ✓steerMailbox.drain()thread safety:drain()locks, swapsm.msgswith nil, and returns the old slice. Each call yields a fresh slice. Theappend(mailbox.drain(), critic.drainSteer()...)closure in the steer callback cannot corrupt the mailbox state. ✓No logic bugs, semantic errors, or domain-correctness issues found.
🧯 Error handling & edge cases — No material issues found
VERDICT: No material issues found
Reviewing through the error-handling & edge-case lens:
runPostRunpanic isolation is correct: named returnprris assignednilinside thedefer/recover, so a panicking PostRun hook safely produces a nilPostRunResultwithout propagating the panic. ✓steerMailbox.drain()returns the internal slice then setsm.msgs = nil. Subsequentpushes allocate a new backing array, so the returned slice is not aliased. ✓criticBinding.drainSteer(): returnsnilwhenb == nil(no critic configured). Combined withmailbox.drain()(returnsnilwhen empty), the mergedsteerclosurefunc() []llm.Message { return append(mailbox.drain(), critic.drainSteer()...) }is nil-safe. ✓AttachImagesno-op guards: early return when all parts are empty (whitespace-only text + no images). ✓st.Cleanupdefer ordering:defer st.Cleanup()is registered after the top-levelrecoverdefer, so it executes first (LIFO). IfCleanuppanics, the top-level recover catches it — acceptable best-effort cleanup semantics. ✓Cleanup: Whentoolbox.Addfails for an extra/session tool, the function returns withres.Errset, butst.Cleanupwas already deferred and will still execute. ✓transcripton partial results: WhenrunResisnil,transcriptstays the zero-valuenil, which the PostRun hook receives as a valid empty slice. By-design best-effort. ✓runSessioninstances share one mailbox (line 187 vs 219): both wrap the samemailboxpointer, soinv.AttachImagesand the factory's session both push to the same queue. Correct. ✓Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 23s
🪰 Gadfly review —
minimax-m3:cloud(ollama-cloud)Verdict: Minor issues — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
🔒 Security Review
VERDICT: No material issues found
Through the security lens I checked:
run/session.gorunSession.AttachImages— Builds a[]llm.Partfrom caller-suppliedtextandimages. Both go through typed constructors (llm.Text,ImagePart) intollm.UserParts(...). No string interpolation into SQL/shell/path/templates; no deserialization; no SSRF surface (no URL fetch). Thepartsslice is purely model-facing. No injection vector.steerMailbox— Thread-safe queue protected bysync.Mutex. Onlypush/drainpaths, no external exposure. No untrusted-data parsing.runPostRun— Wraps the host-suppliedhookinrecover(), logs the panic, returnsnil. Panic isolation is correct (named returnprris reset to nil in the recover). No swallowed-error leak: the panic is logged viaslog.Errorwith the panic value.executor.gomailbox +inv.AttachImagesassignment — Occurs BEFORE toolbox build so registered tool handlers close over the live seam. The closure is set on a localinv(toolInvis what tools actually receive via the registry), butRegistry.Buildcopiesinvto a per-tooltoolInvand the executor then mutatesinv.AttachImagesafterwards — verified attool/registry.go:693(toolInv := inv). Wait:Buildruns atrun/executor.go:190, AFTER theinv.AttachImagesassignment at line 187, so the copy picks up the populatedAttachImages. Confirmed correct ordering — no stale-nil seam handed to tool handlers. (Verified by readingtool/registry.go:678-707andrun/executor.go:183-195.)ExtraTools/SessionToolFactory.Tools— These come from the invoker (already trusted: the same caller that builttool.Invocation). No external/user input flows into the toolbox unchecked; authn/authz gating happens via existingCheckGate(inv)per-tool. No privilege escalation vector beyond what the invoker already has.Cleanupdeferred — Runs at executor return, afterfinishAudit+ budget commit; isolated from hook panic viarunPostRun's recover. No resource-leak or post-panic-corruption issue observed.toolbox.Addfor session tools — Returns error on collision; duplicates fromExtraTools+ factory + registry are surfaced as run errors, not silently overwriting. (Verifiedrun/executor.go:211, 224;run/palette.go:45, 67use the same pattern.)No authn/authz gaps, no injection, no SSRF, no unsafe deserialization, no secret leakage, and no insecure defaults introduced by this change.
Outside my lens: none worth raising.
🎯 Correctness — Minor issues
VERDICT: Minor issues
run/executor.go:221—defer st.Cleanup()is registered BEFORE the run-context defers atrun/executor.go:245-255(defer cancelTimeout(),defer cancelCause(nil),defer mergeCancel(),defer stopCritic()), so by Go's LIFO defer orderst.Cleanupfires LAST — afterstopCritic()has calledh.Stop()on the critic and afterrunCtxhas been cancelled. The doc attool/session_tools.go:63-67only promises "deferred by the executor immediately after the factory returns"; it doesn't tell implementers theirCleanupruns on a fully-torn-down run context. ACleanupthat reads from the critic handle, callsh.Record*, or touches any resource the run context supervised will observe a cancelled/teardown world. Concrete fix: movedefer st.Cleanup()to AFTER therunCtx/criticdefers (e.g., right beforeagent.New) so it runs while the run context is still observable; alternatively document the teardown-after order so factory authors don't reach for live state.🧯 Error handling & edge cases — Minor issues
VERDICT: Minor issues
run/session.go+run/executor.go:341— PostRun runs on the caller'sctx, not a detached one. Verified. The executor's own comment says artifacts are "best-effort" and the PostRun is invoked with the run's callerctx(runPostRun(ctx, ...)at line 341). Compare tofinishAudit(line 344) which routes throughe.finishAuditand that helper explicitly doesdetach(ctx)so a cancelled run still records (per the comment atrun/executor.go:377-378). If the caller cancels betweenag.Runreturning andrunPostRunbeing called — e.g. Discord request deadline, gateway timeout — PostRun'sctxis alreadyCanceledand the hook will fail or short-circuit before producing artifacts. The panic isolation inrunPostRundoes not help with a cancelled ctx, only with a panic. Fix: callrunPostRun(detach(ctx), …)(or accept a detached context at the top ofRun) for symmetry with the audit path.run/executor.go:336-342— no timeout on PostRun. Verified by readingrunPostRuninrun/session.go:18-28: it has panic isolation but no deadline. A hung artifact hook (e.g. a renderer that wedges) blocksExecutor.Runindefinitely after the agent loop has finished. The "best-effort + panic-isolated" framing in the comment understates the liveness risk — panic isolation only catches panics, not infinite loops. Consider wrapping with a shortcontext.WithTimeout(e.g. 10s) usingdetach(ctx)as parent so a stuck hook can't pin the executor.run/executor.go:186-187—inv.AttachImagesis stamped on the localinvonly. Verified. The comment promises the seam is captured "live" by tool handlers built viae.cfg.Registry.Build(... inv ...). That part is fine becauseBuildtakesinvby value but tool handlers close over it. However, ifBuild(or any later step) copies/aliasesinvinto a sub-context that outlives this stack frame (e.g. into a delegation sub-run), the local mutation may not propagate. Worth confirming with the registry author; flagging as a potential edge case rather than a confirmed bug.No swallowed errors found — every error path in the new code (
toolbox.Add,SessionToolFactoryinvocation path) either returns withres.Errset +finishAudit, or panic-isolates viarunPostRun.steerMailbox.draincorrectly handles the empty case (returns nil, not a zero-length slice).runSession.AttachImagescorrectly no-ops on emptytext + images. The mailbox mutex usage is correct (push and drain both lock). The defer order forst.Cleanupis correct (deferred before the tool Add loop, so even a partial-Add failure still releases resources).Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 54s
🪰 Gadfly review —
glm-5.2:cloud(ollama-cloud)Verdict: Minor issues — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
VERDICT: No material issues found
run/executor.go:186-187setsinv.AttachImagesbefore the toolbox build, andtool/registry.go:678-707copiesinvby value into each tool (toolInv := inv), so a tool handler cannot reassign the executor's seam — no tampering surface.inv.ExtraToolsandinv.SessionToolFactoryare host-constructed fields ontool.Invocation(confirmed attool/registry.go:427-453); they are not deserialized from untrusted/user input in this repo (grep found only test/executor call-sites). The addedtoolbox.Add(t)loop introduces no new injection or authz bypass — it reuses the same trust model as registry-built tools. No new untrusted-input surface.runPostRun(run/session.go:18-28) is panic-isolated and passes the callerctx(not the cancelledrunCtx); a PostRun panic is recovered and logged, never fails the run. No unsafe deserialization or execution of hook-supplied data.steerMailbox(run/session.go:35-56) is unbounded, so a runaway tool could queue many messages — but tools are trusted, host-provided code, not an untrusted-input vector; not a material security finding.No authn/authz gaps, injection, SSRF, deserialization, or secret-leakage issues introduced by this change.
🎯 Correctness — No material issues found
VERDICT: No material issues found
run/executor.go:322steer := func() []llm.Message { return append(mailbox.drain(), critic.drainSteer()...) }— matches the shapeagent.WithSteerpreviously consumed (b.h.Steerisfunc() []llm.Message, perrun/ports.go:140andcritic/critic.go:188). Bothdrain()andcritic.drainSteer()return fresh slices each call (mailbox resets to nil;Steer()drains), so there is no cross-step aliasing or double-injection. Merge order (session images before critic nudges) is benign.run/session.go:19runPostRunpanic isolation: named returnprris correctly reset tonilin thedefer recover(), so a panicking hook cannot leak a partial/zero-value*PostRunResult.SessionToolFactorywiring (run/executor.go:217-231):defer st.Cleanup()is registered before the tool-adding loop, so the early-return error path inside the loop still runs Cleanup. The factory's&runSession{mailbox: mailbox}shares the same mailbox pointer asinv.AttachImages, so the two distinctrunSessioninstances are correctly coupled.PostRuntranscript plumbing (run/executor.go:336-342):runRes.Messagesis read only whenrunRes != nil, andres.Outputis only populated whenrunRes != nil(run/executor.go:326-329), so on a failed run with no result the hook getstranscript=nil, output=""— consistent with the documented "populated even on partial results" contract (transcript is nil only when there's no result at all).(I could not resolve
agent.WithSteer's exact signature in-repo — it lives in the externalmajordomo/agentmodule — but the call-site type matches the priorcritic.steerOptions()usage that compiled, so this is not flagged.)🧯 Error handling & edge cases — Minor issues
All three findings are verified against the actual code. Let me confirm the details of each:
PostRun ctx (finding 1): Confirmed —
executor.go:341passesctx(the caller's context), notrunCtx. TherunCtxis built withMaxRuntimetimeout at line 244-248. The doc attool/session_tools.go:46-56says PostRun should produce artifacts on partial-success (timeout) runs. IfrunCtxwere passed, it would beDone()on a timed-out run, preventing artifact production. So passingctxis deliberate and defensible. Caller cancellation can drop artifacts — a narrow race.Panic skips PostRun (finding 2): Confirmed — top-level recover at
executor.go:119-123catches panics intores.Err.PostRunis called inline at line 336-342, not deferred. A panic insideag.Run(line 323) would unwind through the deferred recover, setres.Err, but skip thepostRuncall at 336.Cleanupis deferred (line 221) so it still runs. The doc says PostRun is "called even if the run fails" — a panic is a failure mode. Minor, as agent panics aren't expected.drain returns internal slice (finding 3): Confirmed —
session.go:53-55returnsm.msgsdirectly, then setsm.msgs = nil. The backing array is orphaned so current usage atexecutor.go:322(append(mailbox.drain(), critic.drainSteer()...)) is safe. But the aliasing is non-obvious; a future caller mutating the returned slice would mutate former mailbox state.All three are correctly characterized as minor/defensible, not blocking.
Verdict: Minor issues found
run/executor.go:341—PostRunreceives the caller'sctx, notrunCtx. This is a defensible choice, not a bug:runCtxcarries the run's ownMaxRuntimedeadline (executor.go:244-248), so on a timed-out (partial-success) runrunCtxwould already beDone()and a context-respecting hook could not produce artifacts — the exact partial-success case theSessionTools.PostRundoc calls out (tool/session_tools.go:46-56). Using the originalctxpreserves artifact production across run timeouts. The tradeoff is that a caller cancellation (client disconnect) will cancelPostRunand silently drop artifacts even on an otherwise-completed run — a narrow race (caller must cancel in the window betweenag.Runreturning andrunPostRunfinishing), arguably acceptable under "best-effort," but worth a one-line comment noting the deliberate ctx choice so a future reader doesn't "fix" it torunCtxand break the timeout/partial-success path.run/executor.go:323-342— a panic insideag.RunskipsPostRunentirely. The top-level recover atexecutor.go:119-123converts any panic (including one from the agent run) intores.Err, but becausePostRunis a synchronous call afterag.Run(not deferred), a panicking agent run meanspostRunis never invoked, so no artifacts are produced for that run.Cleanupis still honored (deferred atexecutor.go:221). TheSessionToolsdoc saysPostRunis called "even if the run fails," and a panic is a failure mode. In practice majordomo'sagent.Runisn't expected to panic (errors surface viarunErr), so impact is minimal — but if strict "PostRun always runs" coverage matters, the hook would need to be invoked from a deferred closure rather than inline. Flagging as minor; no change required if panics are considered out-of-contract for the agent loop.run/session.go:47-56—drainreturns the internal slice (not a copy). This is safe today: the mailbox setsm.msgs = nilbefore returning, so the backing array is orphaned and the caller'sappend(mailbox.drain(), critic.drainSteer()...)atexecutor.go:322cannot corrupt the mailbox. Not a bug, but the aliasing is non-obvious; a future caller that holds the returned slice and mutates it would be mutating state that was the mailbox's. A one-line comment or a copy would harden it — low priority.No issues found with:
AttachImagesempty-text/no-image early return (session.go:64-74),runPostRunnamed-return panic recovery (session.go:18-28), nil-guard onCleanup(executor.go:220-222), or the nil-safedrainSteer/critic == nilpath (critic.go:119-123,executor.go:322).Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 34s
🪰 Gadfly review —
deepseek-v4-pro:cloud(ollama-cloud)Verdict: Minor issues — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
Verdict: No material issues found
The PR introduces a steer-mailbox mechanism (
steerMailbox/runSession.AttachImages) and aPostRunartifact hook — both are framework-level plumbing that pass data between trusted components (the executor, host-provided tools, and host-provided factory hooks). From a security perspective:AttachImagestext injection: Tools can inject arbitrary user-role messages into the model's conversation context. This is a prompt-injection surface, but it's the tool's design intent (showing rendered previews to the model), not a framework bug. The executor correctly provides the mechanism; the trust boundary is at the tool author, who controls what text/images they feed in. No executor-level validation is missing that would change this.PostRunartifact data:PostRunResult.Artifacts[].Datais raw[]byteproduced by the host's hook. The executor neither interprets, stores, nor serves these bytes — it just attaches them toResult. No injection or exposure risk at this layer.steerMailboxconcurrency: The mutex-guardedpush/drainis correctly implemented — no TOCTOU or data-race risk. Messages are consumed (cleared on drain), not duplicated.runPostRunpanic isolation: Thedefer/recovercorrectly prevents a hook panic from failing the run. Theslog.Errorcall logs the panic value; if a hook panics with sensitive data, that data hits the log — but the hook is trusted host code, and Go panics rarely carry secrets.No untrusted input flows into executor-controlled paths:
ExtraTools,SessionToolFactory, andPostRunare all supplied by the host (trusted). The executor does not deserialize, parse, or execute anything from an external/untrusted source.No injection, authn/authz, SSRF, deserialization, credential-leakage, or insecure-default issues were found in the changed code.
🎯 Correctness — Minor issues
Verdict: Minor issues
defer st.Cleanup()is not panic-isolated, unlikerunPostRun(run/executor.go:220–222). The PR explicitly wrapsPostRuninrunPostRunwithrecoverso a panic there never fails the run. Butst.Cleanup()is deferred raw — if it panics, the top-levelrecoverat line 119 catches it and setsres.Err, turning an otherwise-successful run into an error. If the intent is that Cleanup should also be best-effort (consistent with PostRun), it should be wrapped in its own recover. If the intent is that a Cleanup panic should fail the run, that's a defensible choice but should be documented — right now the asymmetry is silent and surprising. (Confirmed:executor.go:119–123top-level recover catches all panics inRun;executor.go:220–222defersst.Cleanup()with no local recover;session.go:21–26showsrunPostRunhas its own recover.)runPostRunreceives the raw callerctxrather than a detached context (run/executor.go:341vs344,348). BothfinishAuditanddeliverinternally calldetach(ctx)to insulate their I/O from caller cancellation, butrunPostRunpassesctxstraight through to the hook. If the caller's context is cancelled (deadline, client disconnect, etc.), the PostRun hook's I/O may fail silently and artifacts won't be produced. This is inconsistent with every other post-run operation in the same function. (Confirmed:executor.go:341passes rawctx;executor.go:393usesdetach(ctx)insidefinishAudit;executor.go:425usesdetach(ctx)insidedeliver;session.go:18–28showsrunPostRundoes not detach.)🧯 Error handling & edge cases — Minor issues
Both findings are factually confirmed against the actual code. Let me verify the exact line numbers match:
Finding 1: Line 341 indeed calls
runPostRun(ctx, ...)with rawctx, whilefinishAudit(line 393:rec.Close(detach(ctx), stats)) anddeliver(line 425:dctx := detach(ctx)) both detach internally. TherunPostRunfunction atsession.go:18-28passesctxstraight through to the hook with no detachment. Confirmed.Finding 2: Lines 218-221 show
st := inv.SessionToolFactory(...)followed bydefer st.Cleanup(). If the factory panics,stis never assigned, the defer never registers, and only the top-level recover at line 119-123 catches it — any resources the factory allocated before panicking leak. Confirmed.Both findings are accurate and properly scoped as minor. The verdict stands.
Verdict: Minor issues
run/executor.go:341— PostRun receives the caller's (potentially cancelled)ctx, not a detached context. Every other post-run operation in this function —finishAudit(line 344 → internally callsdetach(ctx)at line 393) anddeliver(line 348 → internally callsdetach(ctx)at line 425) — detaches from the caller's cancellation so a timed-out HTTP request or cancelled gRPC stream doesn't prevent audit writes or delivery.runPostRunpasses the rawctxdirectly to the hook. If the caller cancelled, the hook receives a cancelled context and any I/O it attempts (file writes, uploads) will fail withcontext.Canceled. The hook can work around this withcontext.WithoutCancel, but the asymmetry with every other post-run path is a trap. ConsiderrunPostRun(ctx, ...)→runPostRun(detach(ctx), ...)or document that the hook must self-detach. (Verified by readingfinishAuditat executor.go:379-394 anddeliverat executor.go:420-431 — both usedetach(ctx);runPostRunat session.go:18-28 does not.)run/executor.go:219-221— Ifinv.SessionToolFactoryitself panics,st.Cleanupnever runs. Thedefer st.Cleanup()is registered after the factory call returns. If the factory allocates resources (temp dirs, file handles) and then panics before returning, those resources leak — there is nostto callCleanupon. The executor's top-level recover (line 119) converts the panic tores.Err, but the leak stands. This is inherent to the shape (you can't defer Cleanup before you havest), but it's worth noting that a factory must not allocate before it's ready to return successfully, or it must do its own internal defer/recover. (Verified by reading executor.go:119-123 and 218-231.)Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 3m 53s
🪰 Gadfly review —
nemotron-3-super:cloud(ollama-cloud)Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
No material issues found
The changes primarily wire session tooling and attach images functionality. From a security perspective:
The SessionToolFactory hook execution is panic-isolated and best-effort, but factory trust boundary remains unchanged from existing tool invocation patterns. All new code follows existing security patterns in the codebase.
🎯 Correctness — No material issues found
No material issues found
🧯 Error handling & edge cases — No material issues found
No material issues found
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 2m 20s
🪰 Gadfly review —
qwen3-coder:480b-cloud(ollama-cloud)Verdict: No material issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
VERDICT: No material issues found
After reviewing the changes through a security lens, I found no material security issues. The diff introduces new functionality for session tools, artifact production, and image attachment but does so without introducing apparent security vulnerabilities.
Key security-relevant aspects I examined:
The changes appear to be well-contained and follow secure coding practices. The deferment of cleanup functions and panic isolation for PostRun hooks are implemented safely. The new steer mailbox mechanism for injecting messages into the agent loop is thread-safe.
🎯 Correctness — No material issues found
VERDICT: No material issues found
The changes correctly implement the wiring of SessionToolFactory, PostRun artifacts, and AttachImages functionality. The logic for merging steer messages from both the session mailbox and critic nudges is sound. The PostRun hook execution is properly isolated and best-effort as intended. All findings from the diff have been verified through code inspection and no correctness issues were identified.
🧯 Error handling & edge cases — No material issues found
VERDICT: No material issues found
The changes introduce new functionality for session tools, including
AttachImages,PostRunhooks, and asteerMailboxfor managing messages. Error handling and edge cases have been carefully considered:runPostRunfunction properly isolates panics from thePostRunhook, ensuring that artifact generation failures don't affect the main run result.steerMailboximplementation is thread-safe with proper locking around push and drain operations.AttachImagesmethod correctly handles empty inputs by returning early.All new code paths appear to handle errors appropriately, and the tests cover the key scenarios including panic isolation and proper cleanup. No blocking issues or significant edge case oversights were identified in this review.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 24s
Pull request closed