executus/run/agent.go
steve 30b79a330f
feat(run): execute multi-phase pipelines (RunnableAgent.Phases)
The kernel carried RunnableAgent.Phases as a DTO but never executed it —
Run always ran a single agent loop with ra.SystemPrompt, so a phased agent
(mort's deepresearch/research) silently ran one loop with the base prompt
instead of its pipeline. This implements the phase loop, ported from mort's
agentexec pipeline but reusing the kernel's own machinery.

- run/phases.go: runPhases / runOnePhase. Phases run sequentially; each is a
  fresh agent loop (or a bare LLM call for IsRunFunc phases) with its own
  template-expanded system prompt ({{.Query}} + {{.<PhaseName>}}), model
  tier, step cap, and tool subset. Outputs thread into later phases; the
  final phase's output is the run output. Optional phases swallow errors and
  substitute FallbackMessage; a non-optional phase that merely exhausts its
  step/tool budget salvages its partial transcript and continues (a hard
  error still aborts); per-phase tier-resolve failures fall back with a WARN.
- run/agent.go: Phase gains IsRunFunc + FallbackMessage (the kernel Phase
  struct previously omitted them).
- run/executor.go: Run factors the shared agent options (tool-error limits,
  step observer, compactor) and branches — single loop (critic's dynamic
  step ceiling) vs the phase runner (fixed per-phase caps; the run-level
  critic's steer + hard deadline still apply across phases). systemPrompt
  now delegates to systemPromptWithBody so each phase keeps the platform
  header. The same step observer feeds audit/steps/critic across all phases.

Tests (run/phases_test.go): sequential output threading + template
expansion, Optional-failure → FallbackMessage continues, hard-error abort,
IsRunFunc bare call, per-phase SystemHeader, filterToolbox subset. Full
./... suite green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-29 15:14:45 -04:00


package run

import "time"

// RunnableAgent is the kernel's view of "a thing to run": an identity, a model
// tier, a system prompt, execution caps, and a tool palette. It is a plain DTO
// on purpose — the run kernel never imports a noun battery. The persona Agent
// and the saved Skill each LOWER themselves into a RunnableAgent (a ToRunnable
// method on the battery side), and the kernel runs the DTO. This is the
// inversion of mort's agentexec.Executor.Run(*agents.Agent): the executor no
// longer depends on the persona struct, only on this shape.
//
// A light host can build a RunnableAgent inline (model tier + prompt + a few
// tool names) for a one-shot bounded run, with no persona or skill battery at
// all — that is exactly gadfly's swarm task.
type RunnableAgent struct {
	// ID is a stable identifier for the run subject (an agent/skill UUID, or
	// any host-chosen id). Used for audit attribution and dispatch-guard
	// genealogy. Empty is allowed for anonymous one-shot runs.
	ID string

	// Name is a human label (audit/logs/delivery). Empty is allowed.
	Name string

	// SystemPrompt is the agent's base system prompt (before per-run
	// personalization, which a host layers via Ports).
	SystemPrompt string

	// ModelTier is a tier alias or concrete spec resolved through
	// model.ParseModelForContext. Empty resolves to the host's default tier.
	ModelTier string

	// MaxIterations caps the agent loop's tool-dispatch steps. 0 = kernel
	// default. MaxRuntime caps wall-clock for the whole run (the kernel starts
	// this clock AFTER any lane dequeue, not at submission). 0 = kernel
	// default.
	MaxIterations int
	MaxRuntime    time.Duration

	// LowLevelTools are tool-registry names the run may call directly.
	// SkillPalette / SubAgentPalette name saved skills / sub-agents exposed as
	// skill__<name> / agent__<name> delegation tools, resolved through
	// Ports.Palette (nil Palette => those entries are inert).
	LowLevelTools   []string
	SkillPalette    []string
	SubAgentPalette []string

	// Phases optionally model a multi-step pipeline (each phase its own prompt
	// + tier + tools). An empty slice is a single-phase run (the common case).
	Phases []Phase

	// Critic configures the optional two-tier run-critic (Ports.Critic). The
	// zero value (disabled) is the light-host default.
	Critic CriticConfig
}

// Phase is one step of a multi-step run: its own system prompt, model tier,
// iteration cap, and tool subset. Phase prompts are Go text/template strings
// expanded against {{.Query}} (the original input) and {{.<PhaseName>}} (a
// prior phase's output) before the phase runs, so a phase can consume earlier
// work. The final phase's output is the run's output.
type Phase struct {
	// Name identifies the phase; later phases read its output as {{.<Name>}}.
	Name string
	// SystemPrompt is the phase's prompt, a text/template string.
	SystemPrompt string
	// ModelTier is the model tier this phase runs on.
	ModelTier string
	// MaxIterations caps this phase's agent-loop steps.
	MaxIterations int
	// Tools names the tool subset this phase may call.
	Tools []string

	// Optional swallows a phase's error and substitutes FallbackMessage (or a
	// generated note) as its output, so a non-critical phase failing does not
	// abort the pipeline.
	Optional bool

	// FallbackMessage is the substitute output when an Optional phase fails.
	// Empty → a generated "(phase %q encountered an error…)" note.
	FallbackMessage string

	// IsRunFunc marks a phase as a single bare LLM call (no tool loop, no tools
	// array): a deterministic transform step (plan/synthesize) rather than an
	// agentic loop. Its Tools/MaxIterations are ignored.
	IsRunFunc bool
}

// CriticConfig configures the optional run-critic. Enabled gates whether a
// critic monitor is started at all; BackstopMultiplier sets the hard-kill
// deadline as a multiple of the soft trigger (MaxRuntime). A non-positive
// multiplier uses the kernel default.
type CriticConfig struct {
	Enabled            bool
	BackstopMultiplier float64
}