C0: wire Palette delegation into run.Executor (skill__/agent__ tools) #8

Closed
steve wants to merge 0 commits from phase-c0-executor-ports into main
Owner

First step of the mort cutover (plan: strangler-fig, .agent run canary first, flag + instant fallback, PR-per-phase). This wires the one run.Ports piece the canary needs that wasn't yet called by the executor: Palette delegation.

What

The executor now turns an agent's SkillPalette / SubAgentPalette into delegation tools, so a mort agent that delegates runs through run.Executor:

  • run/palette.go: addDelegationTools builds a skill__<name> tool (structured inputs) per SkillPalette entry and an agent__<name> tool (prompt) per SubAgentPalette entry. Each invokes run.Ports.Palette as a child of the current run (parentRunID = inv.RunID, inheriting caller + channel). A non-ok child status is surfaced to the parent with the partial output.
  • executor.go: called right after the low-level toolbox is built.
  • nil-safe: no PaletteSource or empty palette → no delegation tools (unchanged behavior).

Tests

  • The model calls skill__helper → routed through Palette with the right name/caller/inputs/parent run id.
  • nil palette → run still works.

Deferred to C0b (the rest of the executor run.Ports wiring)

Critic (soft-timeout monitor + deadline binding + steer), Delivery (executor-side output egress, for chatbot/cron surfaces), Checkpointer (needs a majordomo message-history hook to snapshot resumable state). The .agent run canary delivers its returned Result.Output itself, so none of these are on its critical path.

Next after this merges: C1 — mort-side adapters (audit/persona/skill stores + host-tool registration) and routing .agent run through executushost behind a convar flag.

🤖 Generated with Claude Code

First step of the **mort cutover** (plan: strangler-fig, `.agent run` canary first, flag + instant fallback, PR-per-phase). This wires the one `run.Ports` piece the canary needs that wasn't yet called by the executor: **Palette delegation**. ## What The executor now turns an agent's `SkillPalette` / `SubAgentPalette` into delegation tools, so a mort agent that delegates runs through `run.Executor`: - `run/palette.go`: `addDelegationTools` builds a `skill__<name>` tool (structured `inputs`) per `SkillPalette` entry and an `agent__<name>` tool (`prompt`) per `SubAgentPalette` entry. Each invokes `run.Ports.Palette` as a **child** of the current run (`parentRunID = inv.RunID`, inheriting caller + channel). A non-ok child status is surfaced to the parent with the partial output. - `executor.go`: called right after the low-level toolbox is built. - **nil-safe**: no `PaletteSource` or empty palette → no delegation tools (unchanged behavior). ## Tests - The model calls `skill__helper` → routed through `Palette` with the right name/caller/inputs/**parent run id**. - nil palette → run still works. ## Deferred to C0b (the rest of the executor `run.Ports` wiring) **Critic** (soft-timeout monitor + deadline binding + steer), **Delivery** (executor-side output egress, for chatbot/cron surfaces), **Checkpointer** (needs a majordomo message-history hook to snapshot resumable state). The `.agent run` canary delivers its returned `Result.Output` itself, so none of these are on its critical path. Next after this merges: **C1** — mort-side adapters (audit/persona/skill stores + host-tool registration) and routing `.agent run` through `executushost` behind a convar flag. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
steve added 1 commit 2026-06-27 13:28:22 +00:00
C0: wire Palette delegation into run.Executor (skill__/agent__ tools)
executus CI / test (pull_request) Failing after 1m2s
Adversarial Review (Gadfly) / review (pull_request) Successful in 3m47s
9d41987b0e
The first cutover prerequisite: the executor now turns an agent's SkillPalette /
SubAgentPalette into delegation tools so a mort agent that delegates works
through run.Executor (the piece the `.agent run` canary needs beyond the
already-wired audit/budget).

- run/palette.go: addDelegationTools builds a skill__<name> tool (structured
  inputs) per SkillPalette entry and an agent__<name> tool (prompt) per
  SubAgentPalette entry, each invoking run.Ports.Palette as a CHILD of the
  current run (parentRunID = inv.RunID, inheriting caller + channel). A non-ok
  child status is surfaced to the parent with the partial output. nil-safe: no
  PaletteSource or empty palette → no delegation tools (unchanged behavior).
- executor.go: call it right after building the low-level toolbox.

Tests: the model calls skill__helper → routed through Palette with the right
name/caller/inputs/parent; nil palette → run still works.

Deferred to C0b (the remaining run.Ports executor wiring): Critic (soft-timeout
monitor + deadline binding + steer), Delivery (output egress for surfaces that
need executor-side delivery), Checkpointer (needs a majordomo message-history
hook to snapshot resumable state). The `.agent run` canary delivers its returned
Result.Output itself, so these aren't on its critical path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

🪰 Gadfly review — minimax-m3:cloud (ollama-cloud)

Verdict: Minor issues — 3 reviewers: security, correctness, error-handling

🔒 Security — Minor issues

Confirmed — no gate/audit wrapper is applied in addDelegationTools. The tools are built via raw llm.DefineTool and added directly. Both findings stand.

🔒 Security lens

VERDICT: Minor issues

  • run/palette.go:22-58 — Delegation tools bypass the tool.Registry gate / permissions / share-safety / EmitAudit. Every other low-level tool routed through e.cfg.Registry.Build(...) at run/executor.go:164 is gated (CheckGate) and audited via the registry wrappers (see tool/gated_tool.go:144, tool/registry.go:685-690). The skill__<name> / agent__<name> handlers are built directly via llm.DefineTool and call palette.InvokeSkill/InvokeAgent with no gate check and no EmitAudit. Concretely: an LLM can call skill__<name> regardless of inv.SkillName / share-safety policy, and the audit recorder (rec.OnTool / EmitAudit) sees nothing for the delegation, only the upstream step observer's emit of call.Name/call.Arguments. Fix: route them through Registry.Build (register the delegation closures as a tool impl whose Permission carries the appropriate gate and SafeForShare), or at minimum call tool.CheckGate(inv) and tool.EmitAudit(inv, ...) inside each closure before/after the palette.* call.
  • run/palette.go:32,49 — Caller/Channel inherit verbatim from the parent inv. inv.CallerID / inv.ChannelID are passed straight to the child run. That is the intended "child of current run" semantics, but it means a child can perform actions charged to the parent's caller with no additional authorization step (no separate budget check, no Critic re-bind — both deferred to C0b). Worth a security note for C0b; not blocking here, but flagged because the parent-run model's tool call is the only authorization gate and it now extends the parent's caller identity into a fresh child run.

Nothing else in my lane is materially wrong: no SQL/command/path injection (no shell, no templating, no path concatenation from user input), no deserialization, no secret handling, no SSRF surface introduced by this diff. The Toolbox.Add API and llm.DefineTool signatures are verified against existing usage in tool/registry.go:696 and tool/registry_test.go:26.

🎯 Correctness — Minor issues

VERDICT: Minor issues

  • run/palette.go:23 — typed-nil PaletteSource would NPE. The palette == nil check uses the interface-vs-nil comparison, which does NOT catch a typed-nil concrete value (var p *ConcretePalette; Ports.Palette = p). The Ports doc at run/ports.go:33-34 explicitly promises that a nil Palette leaves palette entries inert, so a host following the docs could reasonably pass a typed-nil and then crash inside the first InvokeSkill/InvokeAgent call. Confirmed against run/palette.go:23-25 and run/ports.go:32-34.

  • run/palette.go:49 — sub-agent overrides are silently dropped. InvokeAgent is called with hardcoded "", "", nil, nil for modelTierOverride/promptPrepend/toolsSubset/onEvent, even though PaletteSource.InvokeAgent accepts all four (run/ports.go:164-167). The RunnableAgent DTO has no fields to thread them through (run/agent.go:16-55), so a saved sub-agent that needs any of these overrides cannot be wired up at this layer. Confirmed against the interface signature and DTO definition.

  • run/palette.go:31-37 — skill input schema is not enforced at the wrapper. skillDelegateArgs.Inputs is map[string]any; the description only says "matching its declared input schema" without enforcement. Any keys/values the model emits are passed straight through. Not a bug for C0 (the skill itself presumably validates downstream), but the PR's "structured inputs" framing is not actually enforced here. Confirmed against the type at run/palette.go:72-74.

  • run/palette.go:65-69 — delegationResult mixes status text into the output string. A non-ok child returns prose of the form [skill "x" ended with status "timeout"]\n<output>, which the model will see as plain tool output and may re-feed into subsequent steps. The executor's own step observer at run/executor.go:214 already propagates r.IsError, so a IsError=true result would be a more honest channel for surfacing a partial-status outcome. Confirmed against run/palette.go:65-70 and run/executor.go:196-219.

🧯 Error handling & edge cases — Minor issues

VERDICT: Minor issues

Findings through the error-handling & edge-cases lens:

  • run/palette.go:31-37, 48-54 — Partial output silently discarded on the err path. The doc comment for delegationResult (lines 63-64) promises a "non-ok child status is surfaced to the parent with the partial output" — but when InvokeSkill/InvokeAgent returns a non-nil err, the callback returns nil, fmt.Errorf(...) and never touches out/status. If a battery ever returns (partialOutput, childRunID, "timeout", someTransportErr) (work started, then failed mid-flight), the parent model only sees a bare error and can't react to the partial result. Either gate this behind a clear "err means hard-fail with no usable output" contract (and document it on PaletteSource), or merge the paths so delegationResult always runs (e.g. log the partial into the error message).

  • run/palette.go:26, 43 — Empty / duplicate palette names not pre-validated. ra.SkillPalette: []string{""} produces the tool name skill__, and a duplicate entry produces a duplicate tool name. Both surface as an opaque box.Add error (add skill__: %w / add skill__foo: %w) far from the real cause. An upfront check that rejects empty names with a clear message (and either dedupes or reports duplicates) before calling box.Add would turn a latent landmine into a fast-fail.

  • run/palette.go:65-69 — Header with trailing blank line when out == "" and status != ok. If the child produced no output but ended non-ok, delegationResult returns "[skill \"x\" ended with status \"timeout\"]\n" followed by the empty out formatted via %s, producing a header line followed by a blank line. Not wrong, just noisy. An if out == "" { return header } branch (or strings.TrimRight) would clean it up. Minor.

  • Nil-safe behavior verified. Confirmed by reading run/palette.go:23-25 and run/palette_test.go (TestNoPaletteNoDelegationTools): a nil PaletteSource short-circuits before any tool is added, and an empty palette (both slices nil) yields zero iterations. The executor error path at run/executor.go:173-177 mirrors the existing Build failure path (same res.Err + finishAudit("error", …) shape).

  • Loop capture verified. name := name (palette.go:27, 44) correctly shadows the loop variable — no cross-iteration aliasing if the toolbox dispatches these concurrently later.

  • inv value-capture verified. The closures capture tool.Invocation (struct) by value, so the executor's reuse of the local inv after this point can't race with the closure. The audit-stamping write at executor.go:160 (inv.RunState = stateAcc) happens before addDelegationTools is called at line 173, so the captured value is the post-audit-mutation one — inv.RunState propagates to delegates too, which appears intentional.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 44s

<!-- gadfly-review:ollama:minimax-m3:cloud --> ### 🪰 Gadfly review — `minimax-m3:cloud` (ollama-cloud) **Verdict: Minor issues** — 3 reviewers: security, correctness, error-handling <details><summary><b>🔒 Security</b> — Minor issues</summary> Confirmed — no gate/audit wrapper is applied in `addDelegationTools`. The tools are built via raw `llm.DefineTool` and added directly. Both findings stand. ## 🔒 Security lens **VERDICT: Minor issues** - **run/palette.go:22-58 — Delegation tools bypass the tool.Registry gate / permissions / share-safety / EmitAudit.** Every other low-level tool routed through `e.cfg.Registry.Build(...)` at `run/executor.go:164` is gated (`CheckGate`) and audited via the registry wrappers (see `tool/gated_tool.go:144`, `tool/registry.go:685-690`). The `skill__<name>` / `agent__<name>` handlers are built directly via `llm.DefineTool` and call `palette.InvokeSkill/InvokeAgent` with no gate check and no `EmitAudit`. Concretely: an LLM can call `skill__<name>` regardless of `inv.SkillName` / share-safety policy, and the audit recorder (`rec.OnTool` / `EmitAudit`) sees nothing for the delegation, only the upstream step observer's emit of `call.Name`/`call.Arguments`. Fix: route them through `Registry.Build` (register the delegation closures as a tool impl whose `Permission` carries the appropriate gate and `SafeForShare`), or at minimum call `tool.CheckGate(inv)` and `tool.EmitAudit(inv, ...)` inside each closure before/after the `palette.*` call. - **run/palette.go:32,49 — Caller/Channel inherit verbatim from the parent inv.** `inv.CallerID` / `inv.ChannelID` are passed straight to the child run. That is the intended "child of current run" semantics, but it means a child can perform actions charged to the parent's caller with no additional authorization step (no separate budget check, no Critic re-bind — both deferred to C0b). Worth a security note for C0b; not blocking here, but flagged because the parent-run model's tool call is the only authorization gate and it now extends the parent's caller identity into a fresh child run. Nothing else in my lane is materially wrong: no SQL/command/path injection (no shell, no templating, no path concatenation from user input), no deserialization, no secret handling, no SSRF surface introduced by this diff. The Toolbox.Add API and `llm.DefineTool` signatures are verified against existing usage in `tool/registry.go:696` and `tool/registry_test.go:26`. </details> <details><summary><b>🎯 Correctness</b> — Minor issues</summary> ## VERDICT: Minor issues - **run/palette.go:23 — typed-nil `PaletteSource` would NPE.** The `palette == nil` check uses the interface-vs-nil comparison, which does NOT catch a typed-nil concrete value (`var p *ConcretePalette; Ports.Palette = p`). The Ports doc at `run/ports.go:33-34` explicitly promises that a nil Palette leaves palette entries inert, so a host following the docs could reasonably pass a typed-nil and then crash inside the first `InvokeSkill`/`InvokeAgent` call. **Confirmed** against `run/palette.go:23-25` and `run/ports.go:32-34`. - **run/palette.go:49 — sub-agent overrides are silently dropped.** `InvokeAgent` is called with hardcoded `""`, `""`, `nil`, `nil` for `modelTierOverride`/`promptPrepend`/`toolsSubset`/`onEvent`, even though `PaletteSource.InvokeAgent` accepts all four (`run/ports.go:164-167`). The `RunnableAgent` DTO has no fields to thread them through (`run/agent.go:16-55`), so a saved sub-agent that needs any of these overrides cannot be wired up at this layer. **Confirmed** against the interface signature and DTO definition. - **run/palette.go:31-37 — skill input schema is not enforced at the wrapper.** `skillDelegateArgs.Inputs` is `map[string]any`; the description only says "matching its declared input schema" without enforcement. Any keys/values the model emits are passed straight through. Not a bug for C0 (the skill itself presumably validates downstream), but the PR's "structured inputs" framing is not actually enforced here. **Confirmed** against the type at `run/palette.go:72-74`. - **run/palette.go:65-69 — `delegationResult` mixes status text into the output string.** A non-ok child returns prose of the form `[skill "x" ended with status "timeout"]\n<output>`, which the model will see as plain tool output and may re-feed into subsequent steps. The executor's own step observer at `run/executor.go:214` already propagates `r.IsError`, so a `IsError=true` result would be a more honest channel for surfacing a partial-status outcome. **Confirmed** against `run/palette.go:65-70` and `run/executor.go:196-219`. </details> <details><summary><b>🧯 Error handling & edge cases</b> — Minor issues</summary> **VERDICT: Minor issues** Findings through the error-handling & edge-cases lens: - **`run/palette.go:31-37, 48-54` — Partial output silently discarded on the err path.** The doc comment for `delegationResult` (lines 63-64) promises a "non-ok child status is surfaced to the parent with the partial output" — but when `InvokeSkill`/`InvokeAgent` returns a non-nil `err`, the callback returns `nil, fmt.Errorf(...)` and never touches `out`/`status`. If a battery ever returns `(partialOutput, childRunID, "timeout", someTransportErr)` (work started, then failed mid-flight), the parent model only sees a bare error and can't react to the partial result. Either gate this behind a clear "err means hard-fail with no usable output" contract (and document it on `PaletteSource`), or merge the paths so `delegationResult` always runs (e.g. log the partial into the error message). - **`run/palette.go:26, 43` — Empty / duplicate palette names not pre-validated.** `ra.SkillPalette: []string{""}` produces the tool name `skill__`, and a duplicate entry produces a duplicate tool name. Both surface as an opaque `box.Add` error (`add skill__: %w` / `add skill__foo: %w`) far from the real cause. An upfront check that rejects empty names with a clear message (and either dedupes or reports duplicates) before calling `box.Add` would turn a latent landmine into a fast-fail. - **`run/palette.go:65-69` — Header with trailing blank line when `out == ""` and status != ok.** If the child produced no output but ended non-ok, `delegationResult` returns `"[skill \"x\" ended with status \"timeout\"]\n"` followed by the empty `out` formatted via `%s`, producing a header line followed by a blank line. Not wrong, just noisy. An `if out == "" { return header }` branch (or `strings.TrimRight`) would clean it up. Minor. - **Nil-safe behavior verified.** Confirmed by reading `run/palette.go:23-25` and `run/palette_test.go` (`TestNoPaletteNoDelegationTools`): a nil `PaletteSource` short-circuits before any tool is added, and an empty palette (both slices nil) yields zero iterations. The executor error path at `run/executor.go:173-177` mirrors the existing `Build` failure path (same `res.Err` + `finishAudit("error", …)` shape). - **Loop capture verified.** `name := name` (palette.go:27, 44) correctly shadows the loop variable — no cross-iteration aliasing if the toolbox dispatches these concurrently later. - **`inv` value-capture verified.** The closures capture `tool.Invocation` (struct) by value, so the executor's reuse of the local `inv` after this point can't race with the closure. The audit-stamping write at `executor.go:160` (`inv.RunState = stateAcc`) happens before `addDelegationTools` is called at line 173, so the captured value is the post-audit-mutation one — `inv.RunState` propagates to delegates too, which appears intentional. </details> <sub>Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 44s</sub>

🪰 Gadfly review — deepseek-v4-flash:cloud (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

🔒 Security Review

Verdict: No material issues found

I reviewed the change through the security lens, tracing tainted input paths and checking for authn/authz gaps, injection, secret leakage, and unsafe defaults. No material security issues were found.

Analysis:

  • Authn/authz continuity is preserved. The delegation tools correctly pass inv.CallerID and inv.ChannelID from the parent invocation through to palette.InvokeSkill / palette.InvokeAgent. The parent run's identity is inherited, so the palette implementation can perform its own authorization against the same caller. Verified by reading run/palette.go:32,49 and run/ports.go:160-167.

  • No injection vector in tool names. Tool names are constructed as "skill__"+name / "agent__"+name from palette entries. These are registered with llm.DefineTool — they are identifiers in the LLM tool registry, not executed as commands or interpolated into any shell/path/query context. An attacker controlling palette entry names could not achieve code execution through this path.

  • LLM-generated input reaches the palette without sanitization (args.Inputs map and args.Prompt string at run/palette.go:32,49). This is by design — the PaletteSource interface is the security boundary, and the palette implementation is expected to validate its own inputs. The delegation layer is a thin bridge, not a sanitization layer. No finding here.

  • No secret/credential leakage. The delegationResult function (run/palette.go:65-69) returns partial child output even on non-ok status, but this is documented intentional behavior ("A non-ok child status is surfaced to the parent with the partial output"). The parent agent already has access to the child's output through the tool result; this is a functional design choice, not a leak.

  • No SSRF, unsafe deserialization, or insecure defaults. The code does not make network calls, deserialize untrusted data, or introduce new insecure defaults. The nil-safe palette check (run/palette.go:23-25) is correct — a nil PaletteSource produces no delegation tools, matching the pre-existing behavior.

Summary: This change is clean from a security standpoint. The delegation layer correctly threads caller identity through to the palette, uses no dangerous operations, and introduces no new attack surface beyond what the PaletteSource interface already defines.

🎯 Correctness — No material issues found

🎯 Correctness — Verdict: No material issues found

I verified the following against the actual checked-out code:

  • PaletteSource interface (run/ports.go:158-167): The InvokeSkill and InvokeAgent signatures match the calls in palette.go exactly — argument count, types, and order all line up. The parentRunID parameter is correctly passed inv.RunID in both cases. ✓
  • RunnableAgent struct (run/agent.go:16-55): Has SkillPalette []string and SubAgentPalette []string fields as consumed. ✓
  • llm.DefineTool call form: The same package (run/submit.go:76) already calls llm.DefineTool[submitArgs](...) with an explicit type parameter. The calls in palette.go omit the explicit type parameter, relying on Go's type inference from the closure argument — this is valid Go and consistent with usage in tool/registry_test.go:26. ✓
  • delegationResult logic: The status check (status != "" && status != "ok") correctly surfaces non-ok child statuses (timeout, cancelled, budget_exceeded) to the parent while still passing partial output. ✓
  • Loop-variable capture: name := name correctly shadows the loop variable for each closure. inv and palette are function parameters, not loop variables, so they are already properly captured. ✓
  • Nil-safety: palette == nil checks the interface value; the test TestNoPaletteNoDelegationTools confirms the nil-interface case works. ✓
  • Error wrapping: Both tool handlers wrap errors with fmt.Errorf("... %w", ...), preserving the error chain. ✓

No logic bugs, no signature mismatches, no incorrect data flow. The change is semantically correct for its stated purpose.

🧯 Error handling & edge cases — No material issues found

🧯 Error handling & edge cases — Verdict: No material issues found

After reading the full source files (run/palette.go, run/executor.go, run/ports.go, run/agent.go) and cross-referencing the PaletteSource interface and RunnableAgent struct, I find no error-handling or edge-case defects in this change.

What I verified:

  • Nil paletteaddDelegationTools returns nil immediately (line 23–25). The executor calls it unconditionally; the nil-guard makes this safe.
  • Empty SkillPalette / SubAgentPalette → the for range loops simply don't execute. No delegation tools are added, which is the documented behavior.
  • box.Add failure (e.g. duplicate tool name) → the error is propagated up, the executor catches it, sets res.Err, calls finishAudit with "error" status, and returns. No resource leak or partial state.
  • InvokeSkill / InvokeAgent returning an error → the closures return (nil, fmt.Errorf(...)), which the tool framework treats as a tool error. The model sees the failure.
  • Non-"ok" status with empty outputdelegationResult still produces a meaningful string like [skill "x" ended with status "timeout"]\n (the %s verb on an empty string is safe).
  • name := name capture inside the loop is correct (explicit shadow, safe for all Go versions).
  • InvokeAgent call passes nil, nil for toolsSubset and onEvent — both are valid zero values for slice and func types respectively; the interface contract permits them.
  • args.Inputs can be nil if the model omits the field — map[string]any accepts nil without issue.

No test coverage for error paths (e.g. InvokeSkill returning an error, or a non-ok status) — but this is a test-coverage gap, not a code defect. The production code handles these paths correctly.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 1s

<!-- gadfly-review:ollama:deepseek-v4-flash:cloud --> ### 🪰 Gadfly review — `deepseek-v4-flash:cloud` (ollama-cloud) **Verdict: No material issues found** — 3 reviewers: security, correctness, error-handling <details><summary><b>🔒 Security</b> — No material issues found</summary> ## 🔒 Security Review **Verdict: No material issues found** I reviewed the change through the security lens, tracing tainted input paths and checking for authn/authz gaps, injection, secret leakage, and unsafe defaults. No material security issues were found. **Analysis:** - **Authn/authz continuity is preserved.** The delegation tools correctly pass `inv.CallerID` and `inv.ChannelID` from the parent invocation through to `palette.InvokeSkill` / `palette.InvokeAgent`. The parent run's identity is inherited, so the palette implementation can perform its own authorization against the same caller. Verified by reading `run/palette.go:32,49` and `run/ports.go:160-167`. - **No injection vector in tool names.** Tool names are constructed as `"skill__"+name` / `"agent__"+name` from palette entries. These are registered with `llm.DefineTool` — they are identifiers in the LLM tool registry, not executed as commands or interpolated into any shell/path/query context. An attacker controlling palette entry names could not achieve code execution through this path. - **LLM-generated input reaches the palette without sanitization** (`args.Inputs` map and `args.Prompt` string at `run/palette.go:32,49`). This is by design — the `PaletteSource` interface *is* the security boundary, and the palette implementation is expected to validate its own inputs. The delegation layer is a thin bridge, not a sanitization layer. No finding here. - **No secret/credential leakage.** The `delegationResult` function (`run/palette.go:65-69`) returns partial child output even on non-ok status, but this is documented intentional behavior ("A non-ok child status is surfaced to the parent with the partial output"). The parent agent already has access to the child's output through the tool result; this is a functional design choice, not a leak. - **No SSRF, unsafe deserialization, or insecure defaults.** The code does not make network calls, deserialize untrusted data, or introduce new insecure defaults. The nil-safe palette check (`run/palette.go:23-25`) is correct — a nil `PaletteSource` produces no delegation tools, matching the pre-existing behavior. **Summary:** This change is clean from a security standpoint. The delegation layer correctly threads caller identity through to the palette, uses no dangerous operations, and introduces no new attack surface beyond what the `PaletteSource` interface already defines. </details> <details><summary><b>🎯 Correctness</b> — No material issues found</summary> ## 🎯 Correctness — Verdict: No material issues found I verified the following against the actual checked-out code: - **`PaletteSource` interface** (`run/ports.go:158-167`): The `InvokeSkill` and `InvokeAgent` signatures match the calls in `palette.go` exactly — argument count, types, and order all line up. The `parentRunID` parameter is correctly passed `inv.RunID` in both cases. ✓ - **`RunnableAgent` struct** (`run/agent.go:16-55`): Has `SkillPalette []string` and `SubAgentPalette []string` fields as consumed. ✓ - **`llm.DefineTool` call form**: The same package (`run/submit.go:76`) already calls `llm.DefineTool[submitArgs](...)` with an explicit type parameter. The calls in `palette.go` omit the explicit type parameter, relying on Go's type inference from the closure argument — this is valid Go and consistent with usage in `tool/registry_test.go:26`. ✓ - **`delegationResult` logic**: The status check (`status != "" && status != "ok"`) correctly surfaces non-ok child statuses (timeout, cancelled, budget_exceeded) to the parent while still passing partial output. ✓ - **Loop-variable capture**: `name := name` correctly shadows the loop variable for each closure. `inv` and `palette` are function parameters, not loop variables, so they are already properly captured. ✓ - **Nil-safety**: `palette == nil` checks the interface value; the test `TestNoPaletteNoDelegationTools` confirms the nil-interface case works. ✓ - **Error wrapping**: Both tool handlers wrap errors with `fmt.Errorf("... %w", ...)`, preserving the error chain. ✓ No logic bugs, no signature mismatches, no incorrect data flow. The change is semantically correct for its stated purpose. </details> <details><summary><b>🧯 Error handling & edge cases</b> — No material issues found</summary> ## 🧯 Error handling & edge cases — Verdict: **No material issues found** After reading the full source files (`run/palette.go`, `run/executor.go`, `run/ports.go`, `run/agent.go`) and cross-referencing the `PaletteSource` interface and `RunnableAgent` struct, I find no error-handling or edge-case defects in this change. **What I verified:** - **Nil palette** → `addDelegationTools` returns `nil` immediately (line 23–25). The executor calls it unconditionally; the nil-guard makes this safe. - **Empty `SkillPalette` / `SubAgentPalette`** → the `for range` loops simply don't execute. No delegation tools are added, which is the documented behavior. - **`box.Add` failure** (e.g. duplicate tool name) → the error is propagated up, the executor catches it, sets `res.Err`, calls `finishAudit` with `"error"` status, and returns. No resource leak or partial state. - **`InvokeSkill` / `InvokeAgent` returning an error** → the closures return `(nil, fmt.Errorf(...))`, which the tool framework treats as a tool error. The model sees the failure. - **Non-`"ok"` status with empty output** → `delegationResult` still produces a meaningful string like `[skill "x" ended with status "timeout"]\n` (the `%s` verb on an empty string is safe). - **`name := name` capture** inside the loop is correct (explicit shadow, safe for all Go versions). - **`InvokeAgent` call** passes `nil, nil` for `toolsSubset` and `onEvent` — both are valid zero values for slice and func types respectively; the interface contract permits them. - **`args.Inputs` can be nil** if the model omits the field — `map[string]any` accepts nil without issue. **No test coverage for error paths** (e.g. `InvokeSkill` returning an error, or a non-ok status) — but this is a test-coverage gap, not a code defect. The production code handles these paths correctly. </details> <sub>Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 1s</sub>

🪰 Gadfly review — glm-5.2:cloud (ollama-cloud)

Verdict: No material issues found — 3 reviewers: security, correctness, error-handling

🔒 Security — No material issues found

VERDICT: No material issues found

Through the security lens, I traced the trust model and tainted inputs in run/palette.go:

  • Tool names ("skill__"+name, "agent__"+name) are built from ra.SkillPalette / ra.SubAgentPalette — these are entries on RunnableAgent, i.e. host/agent config, not model- or caller-controlled input. The model only selects which pre-registered tool to call; it cannot inject arbitrary names at dispatch time. Duplicate-name collisions are caught by box.Add(t) (which errors and aborts the run). Verified RunnableAgent fields at run/agent.go:44-46.
  • Delegation inputs (args.Inputs map[string]any, args.Prompt string) are model-produced JSON/strings, which is the standard executor threat model — the agent loop already trusts the model to author tool arguments. They're forwarded verbatim to palette.InvokeSkill / palette.InvokeAgent (verified signatures at run/ports.go:158-167), which own their own input validation. No extra validation is expected at this thin wiring layer.
  • Identity propagation: delegated calls inherit inv.CallerID / inv.ChannelID / inv.RunID (as parentRunID) directly from the current invocation — there's no privilege change or escalation; the child runs under the same caller identity. Consistent with the doc comment.
  • agent__ delegation passes empty overrides and nil for toolsSubset, so the sub-agent uses its own default tool set — a config choice, not a security gap (no blanket host-tool exposure beyond what the sub-agent is already configured with).

No authn/authz gap, injection vector, SSRF, or secret leakage introduced by this change. The one thing I could not confirm is whether llm.DefineTool / Toolbox.Add rejects names containing characters that would break the tool namespace (e.g. embedded __ or control chars), but since the names are config-sourced rather than attacker-controlled, that would at most be a robustness nit, not a material security issue — and I'll leave it to the correctness lens rather than assert it unverified.

🎯 Correctness — No material issues found

VERDICT: No material issues found

Verified against the checked-out code:

  • Interface match (run/palette.go vs run/ports.go:158-167): InvokeSkill(ctx, callerID, channelID, name, inputs, parentRunID) (output, runID, status, err) — the palette.go call (palette.InvokeSkill(ctx, inv.CallerID, inv.ChannelID, name, args.Inputs, inv.RunID) returning out, _, status, err) matches positionally. InvokeAgent's 10-arg call (ctx, inv.CallerID, inv.ChannelID, name, args.Prompt, inv.RunID, "", "", nil, nil) matches the declared signature order (callerID, channelID, name, prompt, parentRunID, modelTierOverride, promptPrepend, toolsSubset, onEvent). ✓
  • Parent-run semantics (run/executor.go:105): inv.RunID is the current run's id (Result{RunID: inv.RunID}), so passing it as parentRunID to the child delegation is correct per the design ("child of the current run"). ✓
  • Closure capture: inv is a value parameter (not a pointer), and name := name rebinds the loop var — no aliasing across iterations. The tools are added before the run context exists; the closures reference the stable inv/palette/name, not the loop variable. ✓
  • Nil-safety: addDelegationTools returns nil immediately when palette == nil (run/palette.go:23); empty SkillPalette/SubAgentPalette slices simply skip both loops. Matches the TestNoPaletteNoDelegationTools expectation. ✓
  • delegationResult status handling (run/palette.go:65): non-ok (and non-empty) status wraps the partial output so the parent agent can react; ok/empty returns raw out. Behavior is as documented. ✓

Unverified (not pursued): the exact declaration of llm.DefineTool — grep did not locate it in the vendored dep, but the call form matches the existing, working usage in run/submit.go:76 (llm.DefineTool[submitArgs]), so compile-correctness is consistent with established usage.

No correctness bug found in this lens.

🧯 Error handling & edge cases — No material issues found

Verdict: No material issues found

Through the error-handling & edge-cases lens, this change is clean:

  • Nil palette guard (run/palette.go:23): early return nil when palette == nil, so the no-PaletteSource / empty-palette paths produce no tools and no error. Verified the executor callsite (run/executor.go:173) passes e.cfg.Ports.Palette directly and treats the returned error as fatal — consistent with the other fatal build failures above it (build toolbox).
  • Delegation error propagation (run/palette.go:32-34, 49-51): when InvokeSkill/InvokeAgent returns a non-nil err, it is wrapped (with %w) and returned to the loop as the tool's (nil, err) result — not swallowed. The non-error path surfaces a non-ok child status via delegationResult while still passing partial output (run/palette.go:65-69), so timeout/cancel/budget-stop children are visible to the parent agent rather than silently dropped.
  • Empty-status / ok boundary (run/palette.go:66): status == "" is treated as the "return out unchanged" branch (same as "ok"), so a child that reports an empty status doesn't get mislabeled as a failure. Correct edge-case handling.
  • Nil/empty inputs: args.Inputs is a plain map[string]any; if the model omits inputs, it's nil and passed through to InvokeSkill — no nil-deref here. Whether the palette tolerates nil inputs is that layer's contract, not this one's, so no defect in run/.
  • No missing cleanup: the tool funcs are pure closures over inv/palette; they allocate no resources requiring defer/rollback. The toolbox/runCtx lifecycle is already managed by the executor.

Nothing material to flag in this lens.

Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 0s

<!-- gadfly-review:ollama:glm-5.2:cloud --> ### 🪰 Gadfly review — `glm-5.2:cloud` (ollama-cloud) **Verdict: No material issues found** — 3 reviewers: security, correctness, error-handling <details><summary><b>🔒 Security</b> — No material issues found</summary> **VERDICT: No material issues found** Through the security lens, I traced the trust model and tainted inputs in `run/palette.go`: - **Tool names** (`"skill__"+name`, `"agent__"+name`) are built from `ra.SkillPalette` / `ra.SubAgentPalette` — these are entries on `RunnableAgent`, i.e. host/agent **config**, not model- or caller-controlled input. The model only *selects* which pre-registered tool to call; it cannot inject arbitrary names at dispatch time. Duplicate-name collisions are caught by `box.Add(t)` (which errors and aborts the run). Verified `RunnableAgent` fields at `run/agent.go:44-46`. - **Delegation inputs** (`args.Inputs map[string]any`, `args.Prompt string`) are model-produced JSON/strings, which is the standard executor threat model — the agent loop already trusts the model to author tool arguments. They're forwarded verbatim to `palette.InvokeSkill` / `palette.InvokeAgent` (verified signatures at `run/ports.go:158-167`), which own their own input validation. No extra validation is expected at this thin wiring layer. - **Identity propagation**: delegated calls inherit `inv.CallerID` / `inv.ChannelID` / `inv.RunID` (as parentRunID) directly from the current invocation — there's no privilege change or escalation; the child runs under the same caller identity. Consistent with the doc comment. - **`agent__` delegation** passes empty overrides and `nil` for `toolsSubset`, so the sub-agent uses its own default tool set — a config choice, not a security gap (no blanket host-tool exposure beyond what the sub-agent is already configured with). No authn/authz gap, injection vector, SSRF, or secret leakage introduced by this change. The one thing I could not confirm is whether `llm.DefineTool` / `Toolbox.Add` rejects names containing characters that would break the tool namespace (e.g. embedded `__` or control chars), but since the names are config-sourced rather than attacker-controlled, that would at most be a robustness nit, not a material security issue — and I'll leave it to the correctness lens rather than assert it unverified. </details> <details><summary><b>🎯 Correctness</b> — No material issues found</summary> ## VERDICT: No material issues found Verified against the checked-out code: - **Interface match** (`run/palette.go` vs `run/ports.go:158-167`): `InvokeSkill(ctx, callerID, channelID, name, inputs, parentRunID) (output, runID, status, err)` — the palette.go call (`palette.InvokeSkill(ctx, inv.CallerID, inv.ChannelID, name, args.Inputs, inv.RunID)` returning `out, _, status, err`) matches positionally. `InvokeAgent`'s 10-arg call (`ctx, inv.CallerID, inv.ChannelID, name, args.Prompt, inv.RunID, "", "", nil, nil`) matches the declared signature order (callerID, channelID, name, prompt, parentRunID, modelTierOverride, promptPrepend, toolsSubset, onEvent). ✓ - **Parent-run semantics** (`run/executor.go:105`): `inv.RunID` is the current run's id (`Result{RunID: inv.RunID}`), so passing it as `parentRunID` to the child delegation is correct per the design ("child of the current run"). ✓ - **Closure capture**: `inv` is a value parameter (not a pointer), and `name := name` rebinds the loop var — no aliasing across iterations. The tools are added before the run context exists; the closures reference the stable `inv`/`palette`/`name`, not the loop variable. ✓ - **Nil-safety**: `addDelegationTools` returns nil immediately when `palette == nil` (`run/palette.go:23`); empty `SkillPalette`/`SubAgentPalette` slices simply skip both loops. Matches the `TestNoPaletteNoDelegationTools` expectation. ✓ - **`delegationResult` status handling** (`run/palette.go:65`): non-ok (and non-empty) status wraps the partial output so the parent agent can react; ok/empty returns raw `out`. Behavior is as documented. ✓ Unverified (not pursued): the exact declaration of `llm.DefineTool` — grep did not locate it in the vendored dep, but the call form matches the existing, working usage in `run/submit.go:76` (`llm.DefineTool[submitArgs]`), so compile-correctness is consistent with established usage. No correctness bug found in this lens. </details> <details><summary><b>🧯 Error handling & edge cases</b> — No material issues found</summary> **Verdict: No material issues found** Through the error-handling & edge-cases lens, this change is clean: - **Nil palette guard** (`run/palette.go:23`): early `return nil` when `palette == nil`, so the no-`PaletteSource` / empty-palette paths produce no tools and no error. Verified the executor callsite (`run/executor.go:173`) passes `e.cfg.Ports.Palette` directly and treats the returned error as fatal — consistent with the other fatal build failures above it (`build toolbox`). - **Delegation error propagation** (`run/palette.go:32-34`, `49-51`): when `InvokeSkill`/`InvokeAgent` returns a non-nil `err`, it is wrapped (with `%w`) and returned to the loop as the tool's `(nil, err)` result — not swallowed. The non-error path surfaces a non-`ok` child status via `delegationResult` while still passing partial output (`run/palette.go:65-69`), so timeout/cancel/budget-stop children are visible to the parent agent rather than silently dropped. - **Empty-status / ok boundary** (`run/palette.go:66`): `status == ""` is treated as the "return `out` unchanged" branch (same as `"ok"`), so a child that reports an empty status doesn't get mislabeled as a failure. Correct edge-case handling. - **Nil/empty inputs**: `args.Inputs` is a plain `map[string]any`; if the model omits `inputs`, it's nil and passed through to `InvokeSkill` — no nil-deref here. Whether the palette tolerates nil inputs is that layer's contract, not this one's, so no defect in `run/`. - **No missing cleanup**: the tool funcs are pure closures over `inv`/`palette`; they allocate no resources requiring `defer`/rollback. The toolbox/`runCtx` lifecycle is already managed by the executor. Nothing material to flag in this lens. </details> <sub>Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 1m 0s</sub>
steve added 1 commit 2026-06-27 13:53:14 +00:00
C0: address verified gadfly findings (trivial fixes)
executus CI / test (pull_request) Failing after 1m31s
executus CI / test (push) Failing after 1m31s
0c80679719
From the PR #8 review (all graded in the gadfly MCP):
- skip empty palette names + dedupe by final tool name, instead of producing a
  "skill__" tool or an opaque box.Add duplicate error.
- delegationResult: no trailing blank line when a non-ok child produced no output.
- delegationErr: fold a child's partial output into the hard-failure error so it
  isn't silently dropped.

Deferred to C0b (design-level, not trivial): route delegation through the
tool.Registry gate/audit wrappers; expose the skill's real input schema to the
LLM instead of a generic inputs map. typed-nil PaletteSource is left as a caller
contract (the == nil guard catches the untyped-nil interface).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
steve closed this pull request 2026-06-27 13:53:17 +00:00
steve deleted branch phase-c0-executor-ports 2026-06-27 13:53:17 +00:00
Some checks are pending
executus CI / test (pull_request) Failing after 1m31s
executus CI / test (push) Failing after 1m31s

Pull request closed

Sign in to join this conversation.
No Reviewers
No Label
2 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: steve/executus#8