30b79a330f
The kernel carried RunnableAgent.Phases as a DTO but never executed it —
Run always ran a single agent loop with ra.SystemPrompt, so a phased agent
(mort's deepresearch/research) silently ran one loop with the base prompt
instead of its pipeline. This implements the phase loop, ported from mort's
agentexec pipeline but reusing the kernel's own machinery.
- run/phases.go: runPhases / runOnePhase. Phases run sequentially; each is a
fresh agent loop (or a bare LLM call for IsRunFunc phases) with its own
template-expanded system prompt ({{.Query}} + {{.<PhaseName>}}), model
tier, step cap, and tool subset. Outputs thread into later phases; the
final phase's output is the run output. Optional phases swallow errors and
substitute FallbackMessage; a non-optional phase that merely exhausts its
step/tool budget salvages its partial transcript and continues (a hard
error still aborts); per-phase tier-resolve failures fall back with a WARN.
- run/agent.go: Phase gains IsRunFunc + FallbackMessage (the kernel Phase
struct previously omitted them).
- run/executor.go: Run factors the shared agent options (tool-error limits,
step observer, compactor) and branches — single loop (critic's dynamic
step ceiling) vs the phase runner (fixed per-phase caps; the run-level
critic's steer + hard deadline still apply across phases). systemPrompt
now delegates to systemPromptWithBody so each phase keeps the platform
header. The same step observer feeds audit/steps/critic across all phases.
Tests (run/phases_test.go): sequential output threading + template
expansion, Optional-failure → FallbackMessage continues, hard-error abort,
IsRunFunc bare call, per-phase SystemHeader, filterToolbox subset, template
expansion. Full ./... suite green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
89 lines
3.7 KiB
Go
89 lines
3.7 KiB
Go
package run
|
|
|
|
import "time"
|
|
|
|
// RunnableAgent is the kernel's view of "a thing to run": an identity, a model
|
|
// tier, a system prompt, execution caps, and a tool palette. It is a plain DTO
|
|
// on purpose — the run kernel never imports a noun battery. The persona Agent
|
|
// and the saved Skill each LOWER themselves into a RunnableAgent (a ToRunnable
|
|
// method on the battery side), and the kernel runs the DTO. This is the
|
|
// inversion of mort's agentexec.Executor.Run(*agents.Agent): the executor no
|
|
// longer depends on the persona struct, only on this shape.
|
|
//
|
|
// A light host can build a RunnableAgent inline (model tier + prompt + a few
|
|
// tool names) for a one-shot bounded run, with no persona or skill battery at
|
|
// all — that is exactly gadfly's swarm task.
|
|
type RunnableAgent struct {
|
|
// ID is a stable identifier for the run subject (an agent/skill UUID, or
|
|
// any host-chosen id). Used for audit attribution and dispatch-guard
|
|
// genealogy. Empty is allowed for anonymous one-shot runs.
|
|
ID string
|
|
|
|
// Name is a human label (audit/logs/delivery). Empty is allowed.
|
|
Name string
|
|
|
|
// SystemPrompt is the agent's base system prompt (before per-run
|
|
// personalization, which a host layers via Ports).
|
|
SystemPrompt string
|
|
|
|
// ModelTier is a tier alias or concrete spec resolved through
|
|
// model.ParseModelForContext. Empty resolves to the host's default tier.
|
|
ModelTier string
|
|
|
|
// MaxIterations caps the agent loop's tool-dispatch steps. 0 = kernel
|
|
// default. MaxRuntime caps wall-clock for the whole run (the kernel starts
|
|
// this clock AFTER any lane dequeue, not at submission). 0 = kernel
|
|
// default.
|
|
MaxIterations int
|
|
MaxRuntime time.Duration
|
|
|
|
// LowLevelTools are tool-registry names the run may call directly.
|
|
// SkillPalette / SubAgentPalette name saved skills / sub-agents exposed as
|
|
// skill__<name> / agent__<name> delegation tools, resolved through
|
|
// Ports.Palette (nil Palette => those entries are inert).
|
|
LowLevelTools []string
|
|
SkillPalette []string
|
|
SubAgentPalette []string
|
|
|
|
// Phases optionally model a multi-step pipeline (each phase its own prompt
|
|
// + tier + tools). An empty slice is a single-phase run — the common case.
|
|
Phases []Phase
|
|
|
|
// Critic configures the optional two-tier run-critic (Ports.Critic). The
|
|
// zero value (disabled) is the light-host default.
|
|
Critic CriticConfig
|
|
}
|
|
|
|
// Phase is one step of a multi-step run: its own system prompt, model tier,
|
|
// iteration cap, and tool subset. Phase prompts are Go text/template strings
|
|
// expanded against {{.Query}} (the original input) and {{.<PhaseName>}} (a
|
|
// prior phase's output) before the phase runs, so a phase can consume earlier
|
|
// work. The final phase's output is the run's output.
|
|
type Phase struct {
|
|
Name string
|
|
SystemPrompt string
|
|
ModelTier string
|
|
MaxIterations int
|
|
Tools []string
|
|
// Optional swallows a phase's error and substitutes FallbackMessage (or a
|
|
// generated note) as its output, so a non-critical phase failing does not
|
|
// abort the pipeline.
|
|
Optional bool
|
|
// FallbackMessage is the substitute output when an Optional phase fails.
|
|
// Empty → a generated "(phase %q encountered an error…)" note.
|
|
FallbackMessage string
|
|
// IsRunFunc marks a phase as a single bare LLM call (no tool loop, no tools
|
|
// array) — a deterministic transform step (plan/synthesize) rather than an
|
|
// agentic loop. Its Tools/MaxIterations are ignored.
|
|
IsRunFunc bool
|
|
}
|
|
|
|
// CriticConfig configures the optional run-critic. Enabled gates whether a
|
|
// critic monitor is started at all; BackstopMultiplier sets the hard-kill
|
|
// deadline as a multiple of the soft trigger (MaxRuntime). A non-positive
|
|
// multiplier uses the kernel default.
|
|
type CriticConfig struct {
|
|
Enabled bool
|
|
BackstopMultiplier float64
|
|
}
|