43b2471737
Continues finishing the executor's run.Ports wiring (after C0's Palette).
Critic (run/critic.go): when Ports.Critic is set and the agent enables it, the
executor calls Monitor at run start, feeds RecordStep/RecordToolStart from the
step observer, drains the critic's Steer messages into the loop via
agent.WithSteer, and binds the run's hard cancellation to the critic's
(extendable) Deadline through a watch goroutine — a healthy-but-slow run gets
room while a hung one is killed. Stop() on run end. Soft timeout from
Defaults.CriticSoftTimeout (default 90s). nil-safe: no critic / not-enabled =
no-op.
Delivery (run/executor.go deliver): after the run, when Ports.Delivery is set
and inv.DeliveryID is non-empty, the executor posts Result.Output (or
DeliverError on failure) to a host-interpreted deliver.Target
{inv.DeliveryKind, inv.DeliveryID}. Empty target = caller reads Result.Output
itself (the synchronous default; the `.agent run` canary). Best-effort +
detached.
tool.Invocation gains DeliveryKind/DeliveryID (host-set egress target).
Tests: critic monitored/fed/steered/stopped when enabled, untouched when not;
delivery posts on a target, skips without one. Deferred: Checkpointer (needs a
majordomo hook to snapshot the running message history).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
709 lines
31 KiB
Go
// Package skilltools is the tool registry for the agentic skills platform.
// Tools registered here can be referenced by name from a Skill's Tools
// list and are surfaced to the underlying majordomo agent loop via Build().
//
// Independent of pkg/logic/chatbot/tool_provider.go: the chatbot's
// ToolProvider supplies tools per-channel during a chatbot turn; skill
// tools are scoped to one skill execution. Bridging happens once, in
// pkg/logic/skills/chatbot_provider.go, which exposes whole agent skills
// as chatbot tools (not individual skill tools).
//
// Permission model is documented in
// docs/superpowers/specs/2026-05-02-agentic-skills-design.md, "Tool
// registry" section. Three orthogonal checks:
//
//  1. Save-time: AuthoringRequirement vs caller's admin status.
//  2. Share-time: SafeForShare for visibility != private.
//  3. Execute-time: SkillNameGate.
package tool

import (
	"context"
	"fmt"
	"sync"
	"time"

	llm "gitea.stevedudenhoeffer.com/steve/majordomo/llm"
)

// Visibility is the spec's visibility enum mirrored here as a typed
// string. It's redeclared (vs imported from pkg/logic/skills) to break
// the import cycle that would otherwise form: skills → skilltools →
// skills. The string values match Visibility one-to-one so a
// caller can pass `string(VisibilityPublic)` and it just works.
type Visibility string

const (
	VisibilityPrivate Visibility = "private"
	VisibilityShared  Visibility = "shared"
	VisibilityPublic  Visibility = "public"
)

// Tool is what a registry entry implements. Concrete tools wrap an
// underlying mort subsystem (e.g. wolfram, weather, paste) and produce
// an llm.Tool on demand for a given Invocation.
//
// Why an interface (vs majordomo's concrete llm.Tool): we need richer
// metadata (Permission, Categories, SkillNameGate) for the platform's
// gating logic before we hand the tool to majordomo. BuildLLM converts
// to llm.Tool for one execution, closing over the Invocation so the
// per-tool handler can read CallerID/ChannelID without further plumbing.
//
// Why BuildLLM-per-call (vs static llm.Tool): per-user tools must close
// over inv.CallerID — the LLM-supplied args are intentionally ignored
// for those. Constructing the llm.Tool inside BuildLLM lets each tool
// craft its own typed Define call while reading the invocation context.
//
// Test: each tool under pkg/skilltools/tools/ has its own *_test.go.
type Tool interface {
	Name() string
	Description() string
	Permission() Permission
	// BuildLLM produces the llm.Tool for one invocation. The returned
	// tool's name MUST equal Name(); the registry's Build() relies on
	// this when wiring multiple tools into a Toolbox.
	BuildLLM(inv Invocation) llm.Tool
}

// Permission summarises the three lifecycle gates plus UI metadata.
type Permission struct {
	// AuthoringRequirement governs who may SAVE a skill that uses
	// this tool: anyone or admin-only.
	AuthoringRequirement Requirement

	// OperatesOn classifies whose data the tool reads: global
	// (channel-wide, public sources) or caller (the invoking user's
	// own data).
	OperatesOn Scope

	// SafeForShare reports whether the tool may appear in a shared or
	// public skill. Tools that operate on caller data are typically
	// not safe for share — the executing skill becomes a vector for
	// reading other users' data.
	SafeForShare bool

	// Categories are free-form labels used for UI grouping (read,
	// write, network, code, data, social). Code does NOT branch on
	// these strings.
	Categories []string

	// SkillNameGate, if non-empty, restricts execution to the named
	// skill. Used for wizard-only tools in v2; SkillNameGate=="" means
	// any skill may use the tool.
	SkillNameGate string
}

// Requirement is who is allowed to author a skill using this tool.
type Requirement string

const (
	RequirementAnyone Requirement = "anyone"
	RequirementAdmin  Requirement = "admin"
)

// Scope classifies the data domain a tool acts on.
type Scope string

const (
	ScopeGlobal Scope = "global"
	ScopeCaller Scope = "caller"
)

// ContinuationContext describes a V10 reply continuation. When set on
// an Invocation, the skill executor reuses the parent run's KV scope,
// renders a continuation prompt, and bumps ChainDepth for cap
// enforcement.
//
// The executor reads ParentRunID to set the new run's parent_run_id
// column (for call-tree reconstruction); ParentOutput to render the
// "previous output you sent" line in the agent prompt; ReplyText to
// render the "user replied with" line; ReplyMessageID for diagnostic
// logging; and ChainDepth to compare against
// skills.reply.max_chain_depth.
//
// Why ChainDepth (vs walking parent_run_id at execution time): a fresh
// query per turn would add a DB roundtrip on every reply hop. Carrying
// the count in the invocation is cheap and authoritative.
type ContinuationContext struct {
	// ParentRunID is the run that produced the message the user
	// replied to. The new run inherits its KV scope (run:<ParentRunID>).
	ParentRunID string

	// ParentOutput is the text the parent run delivered to Discord —
	// stored on the run row so it survives even if the parent's
	// run-scope KV has been auto-purged (24h after parent finished).
	ParentOutput string

	// ReplyText is what the user said when they replied (the new
	// turn's user input). May be empty if the reply was an attachment-
	// only message (handle gracefully — agent should handle empty
	// input as a "noop continuation").
	ReplyText string

	// ReplyMessageID is the Discord message ID of the user's reply.
	// Used for audit + log breadcrumbs; not currently consumed by the
	// agent prompt.
	ReplyMessageID string

	// ChainDepth is how many continuation hops have happened in the
	// chain rooted at the original invocation. The router should set
	// this to (parent's chain depth + 1). The executor rejects when
	// it exceeds skills.reply.max_chain_depth.
	ChainDepth int
}

// InputFile is a non-image file the user supplied with a run (audio,
// etc.). The executor stages it into the file store under run scope and
// surfaces its file_id to the agent. Name is a safe base name (no path
// separators) suitable for /workspace/<name>; MimeType is the resolved
// content type; Data is the raw bytes.
type InputFile struct {
	Name     string
	MimeType string
	Data     []byte
}

// Invocation is the runtime context passed to Tool.BuildLLM. The executor
// builds it once per skill run and the same struct is closed over by
// every tool's handler, so each tool sees the caller / channel identity.
type Invocation struct {
	SkillID   string
	SkillName string
	RunID     string
	CallerID  string
	ChannelID string
	GuildID   string
	// DeliveryKind / DeliveryID name where the executor posts the run's output
	// via run.Ports.Delivery — a host-interpreted Target ("channel"/"dm"/
	// "thread"/...). An empty DeliveryID means the executor delivers nothing
	// and the caller reads Result.Output itself (the synchronous default; the
	// `.agent run` canary works this way).
	DeliveryKind string
	DeliveryID   string
	// CallerIsAdmin is true when the caller is a mort admin (Member.Admin).
	// Populated by the executor at run dispatch via Bot.GetMember; defaults
	// to false on any lookup failure (member not found, DB error, empty
	// CallerID for system-invoked runs). Read by tools that gate behaviour
	// on admin status — currently `code_exec` for the v15 admin-only WAN
	// network mode.
	//
	// Why a precomputed bool on Invocation (vs an AdminChecker dep on
	// every tool): the admin lookup is read-once-per-run; every tool
	// would otherwise have to redo the work. The executor knows the
	// caller's admin status by the time it builds Invocation, so it
	// stamps the field once and every tool reads it for free.
	CallerIsAdmin bool
	// SkillInputs is the parsed input map for the enclosing skill —
	// available so a tool can reference values the user supplied at
	// invocation time. Tools may read this to specialise behaviour but
	// MUST NOT use it as a substitute for inv.CallerID-based isolation.
	SkillInputs map[string]any
	// ParentRunID is set when the skill was invoked via skill_invoke
	// from a parent skill run. Empty for top-level invocations
	// (Discord, chatbot, scheduler). Used by the loop guard in
	// skill_invoke and by the audit log for call-tree reconstruction.
	//
	// Why threaded through Invocation (vs context.Value): the loop
	// guard runs at tool-handler time, where the only context the
	// handler sees is inv. Stuffing it into context would force a
	// helper for unwrap on every read; an explicit field is easier to
	// audit and impossible to forget.
	ParentRunID string

	// RootRunID is the audit run id at the ROOT of the dispatch tree
	// this run belongs to — for a top-level run, its own RunID; for a
	// delegated run (skill_invoke / agent_invoke / agent_spawn /
	// palette wrappers), the outermost ancestor's. Stamped by both
	// executors from the dispatchguard ancestor chain right after
	// guard entry. Backs the shared `root_run:<id>` KV scope that lets
	// parallel sibling workers coordinate (see tools/scope_validate.go
	// + RootRunKVPartition).
	RootRunID string

	// ToolsSubset, when non-empty, narrows an AGENT run's low-level tools
	// to the named subset of the agent's configured LowLevelTools. Set by
	// agent_invoke's `tools_subset` arg for ephemeral fan-out — spawning a
	// focused worker from a template (e.g. a `coder` template with only
	// code_exec + read_page). Names outside the agent's tool menu are
	// rejected upstream (in the invoke adapter), so by the time the
	// executor reads this the intersection is safe. Empty = full palette.
	// Skill runs ignore this field.
	ToolsSubset []string

	// SystemPromptPrepend, when non-empty, is prepended to an AGENT's
	// system prompt for this invocation only — the fan-out "customized
	// system prompt" lever (agent_invoke's `prompt_prepend` arg). It
	// specializes a template persona to a task without mutating the
	// persisted agent row. Skill runs ignore this field.
	SystemPromptPrepend string

	// SuppressDelivery, when true, instructs the skill executor to
	// SKIP its OutputTarget Delivery (Deliver / DeliverError) entirely.
	// The run still produces an output string (returned from Run) and
	// still writes to the audit log — only the side-channel delivery
	// (Discord channel/DM/thread post) is suppressed.
	//
	// Why: when the chatbot exposure adapter invokes a skill, the skill's
	// output is already going to be consumed by the chatbot as a tool
	// result; ALSO posting it to Discord via OutputTarget produces double
	// output and (worse) primes the chatbot to call the tool again on
	// the next turn after seeing its own output as a "human message",
	// kicking off a tool-loop. The chatbot adapter sets this to true on
	// every invocation it constructs.
	SuppressDelivery bool

	// HandlerOwnsDelivery, when true, tells the executor that the caller
	// (typically a Discord command handler) will assemble the final
	// user-visible reply itself — folding any deferred attachments
	// (rows queued by send_attachments to skill_run_pending_attachments)
	// into the same message as the text output. The executor's
	// post-run AttachmentDrainer is skipped so the handler can drain +
	// classify + chain-overflow + post in one place.
	//
	// Why an explicit flag (vs reusing SuppressDelivery): SuppressDelivery
	// also short-circuits the OutputTarget Delivery layer (channel/dm/
	// thread post), which is the right shape for chatbot exposure but
	// the WRONG shape for `.agent run` — the handler still wants the
	// audit row to land and the executor's drainer to NOT post a
	// separate "here's an image" follow-up message after the handler's
	// own text reply. HandlerOwnsDelivery is the narrow "the caller is
	// taking over post-run delivery" signal that does NOT change any
	// other executor behaviour.
	//
	// SuppressDelivery and HandlerOwnsDelivery are independent. The
	// drainer is skipped when EITHER is set (the chatbot path doesn't
	// want stray posts either; agent-run sets HandlerOwnsDelivery
	// because it owns delivery; sub-agent dispatches set SuppressDelivery
	// because they surface output as a tool result).
	HandlerOwnsDelivery bool

	// Priority is the v9 per-invocation priority override for the lane
	// scheduler. When non-zero, the executor uses this value when
	// constructing the lane Job; zero falls back to the skill's
	// Skill.DefaultPriority. Owners are capped by convar
	// `skills.priority_max_per_user` (default 5); admins may exceed it.
	//
	// Why a non-pointer (vs *int): zero means "use the default", which
	// matches the convention everywhere else in this struct. Skills
	// that need an explicit zero priority can store
	// DefaultPriority=0 — the result is identical.
	Priority int

	// LaneWaitMaxSeconds is the v9 per-invocation lane backoff cap. When
	// >0, the executor calls SubmitWithMaxWait so the run is rejected
	// with ErrLaneBusy (surfaced as `lane_busy`) when the estimated
	// queue wait would exceed this many seconds. 0 (default) preserves
	// the legacy block-forever Submit semantics.
	LaneWaitMaxSeconds int

	// LaneOverride forces the run onto the named lane regardless of
	// Skill.ExecutionLane. Used by the v9 inbound webhook handler to
	// route webhook-triggered runs to the dedicated webhook-default
	// lane. Empty preserves the per-skill ExecutionLane.
	LaneOverride string

	// Continuation, when non-nil, signals that this Invocation is a
	// V10 reply continuation: a Discord user replied to a message the
	// originating skill posted, and mort is re-invoking the skill to
	// produce the next turn. The executor reads this field to:
	//
	//   - Reuse the parent run's `run:<parent_run_id>` KV scope (so any
	//     state the prior turn saved is still readable).
	//   - Render a continuation block at the top of the agent's user
	//     prompt that includes the parent output + reply text.
	//   - Enforce the per-deployment chain-depth cap
	//     (skills.reply.max_chain_depth, default 20).
	//   - Stamp parent_run_id on the new run for call-tree
	//     reconstruction in audit + UI.
	//
	// Why a pointer struct (vs flat fields): all five fields are
	// meaningful only together — splitting them would invite
	// half-populated states. nil = "this is a fresh invocation, not a
	// continuation".
	Continuation *ContinuationContext

	// SourceWebhookSecretMatched is set true by the inbound webhook
	// handler AFTER it has validated both the URL secret AND the HMAC
	// signature for the named skill. It signals to System.Run that the
	// caller is authenticated by a per-skill secret (not by Discord
	// identity), so the visibility / owner gate in CanInvoke should be
	// bypassed for THIS skill (matching SkillID). All other gates —
	// pinned_version, budget caps, lane caps — still apply.
	//
	// Hotfix-5 Bug 1: pre-fix the webhook handler built an Invocation
	// with CallerID=`<webhook>:<source-IP>` and dispatched through
	// System.Run. CanInvoke saw a non-owner non-admin caller against a
	// private skill and rejected with HTTP 500 ("caller is not
	// permitted to invoke skill"). The cure isn't to weaken
	// CanInvoke's general-purpose policy — it's to recognise that a
	// matched secret IS the auth gate for the named skill.
	//
	// Why per-Invocation (vs a separate gate path): the executor uses
	// Run as the single canonical dispatch point — adding a second
	// "authenticated dispatch" entry would split run-recording, lane
	// dispatch, and audit emission into two parallel implementations.
	SourceWebhookSecretMatched bool

	// OnEvent, when non-nil, is called by the executor at run
	// boundaries and by the agent loop on each tool dispatch. The
	// bot's command handler closes over the invoking message and
	// reacts an emoji from the skill's StateReactEmoji map. Nil-safe.
	//
	// Event names:
	//   "__start__" — right before agent.Run starts
	//   "__end__" — on successful completion
	//   "__error__" — on terminal error
	//   <tool_name> — when a tool dispatches (any registered tool)
	//
	// The executor passes the resolved emoji as `emoji` so callers
	// don't have to look it up themselves; emoji=="" means "no react
	// for this event" and callers should skip the react entirely.
	//
	// Why a callback (vs a state-react map carried in the Invocation):
	// the lookup table lives on the Skill, not the Invocation, but the
	// caller-supplied side effect (a Discord react) lives on the bot
	// command surface. A callback bridges the two without forcing the
	// executor to import discord types and without forcing the bot
	// command surface to know about the Skill's emoji map shape.
	OnEvent func(ctx context.Context, event string, emoji string)

	// OnToolEvent, when non-nil, is called by the executor on each tool
	// dispatch with phase "start" (before the tool runs) then "end" or
	// "error" (after it completes, with the result text in detail). Distinct
	// from OnEvent (which is the emoji state-react hook): this carries the
	// tool name + args/result so an out-of-band caller — e.g. the mortise
	// chat API streaming SSE tool.start/tool.end frames — can surface live
	// tool-progress. Nil-safe; the callback MUST be fast and non-blocking
	// (it runs on the agent-loop goroutine).
	OnToolEvent func(ctx context.Context, toolName, phase, detail string)

	// OnStep, when non-nil, is called by the executor as the agent loop
	// makes progress — currently once per tool call: phase "start" before
	// the tool runs, phase "end" after it completes (StepEvent.Step.Status
	// is "complete" or "error"). Correlate the two by StepEvent.Step.ID.
	// "delta" is reserved for progressive detail and is unused today.
	//
	// Distinct from OnToolEvent (the raw tool-name/result hook): OnStep
	// carries a richer, presentation-ready Step (kind + human present-tense
	// summary) so an out-of-band consumer — e.g. the mortise chat API
	// streaming SSE step.start/step.end frames — can render structured
	// progress without re-deriving it. The executor ALSO accumulates the
	// same Steps onto its run Result, so persistence does not depend on
	// this callback being set. Nil-safe; the callback MUST be fast and
	// non-blocking (it runs on the agent-loop goroutine).
	OnStep func(ctx context.Context, ev StepEvent)

	// InvokingMessageID is the Discord message ID of the user's command
	// that triggered this run, when it was triggered by a Discord text
	// command. Used by delivery to thread the reply (Discord native
	// reply with the gray quote bar + jump link). Empty for chatbot
	// exposure, scheduled, or webhook invocations — delivery falls
	// back to a plain channel post for those.
	//
	// Why threaded through Invocation (vs a separate field on Skill or
	// a magic SkillInputs key): the message ID is per-invocation, not
	// per-skill, and the delivery layer is the natural reader. Direct
	// field on Invocation matches the existing ChannelID / GuildID
	// fields' shape.
	InvokingMessageID string

	// Images carries multi-modal image content for the initial user
	// message. When non-empty, the executor builds the initial user
	// message with llm.UserParts(text + image parts) instead of plain
	// llm.UserText. Populated by callers that extract images from Discord
	// attachments or URLs in prompt text (pkg/imageutil downloads the
	// bytes — majordomo image parts are bytes-only). Nil = text-only.
	Images []llm.ImagePart

	// InputFiles carries non-image attachments (audio, etc.) the user
	// supplied with the run. Unlike Images, these are NOT inlined into
	// the model's context — the LLM can't ingest raw mp3/wav/midi bytes.
	// Instead the executor stages each into the skill file store under
	// run scope and tells the agent the resulting file_ids (in the
	// prompt) so it can hand one to a worker tool (e.g. code_exec
	// files_in → /workspace/<name>) for processing. Nil = none.
	InputFiles []InputFile

	// ExtraTools are additional llm.Tool instances injected for this
	// run only. They are appended to the palette after registry-built
	// tools, skill-palette wrappers, and sub-agent wrappers. Use this
	// for session-specific tools that cannot be pre-registered in the
	// catalog (e.g., scaddy's write_scad which needs per-session
	// workspace + renderer state).
	//
	// Why on Invocation (vs a dedicated Run parameter): the Invocation
	// is the per-run context carrier in mort's execution path. Adding
	// a separate ExtraTools arg to Executor.Run would fork the
	// signature for one use case; a field on the existing carrier
	// keeps the surface stable.
	ExtraTools []llm.Tool

	// SessionToolFactory, if set, is called with the live AgentSession
	// after the executor constructs the agent but before it runs. It
	// returns a SessionTools struct carrying the tools to add, an
	// optional PostRun hook for post-processing (e.g., rendering final
	// artifacts from workspace state), and an optional Cleanup func for
	// resource teardown. Types are defined in session_tools.go.
	//
	// Why a factory (vs ExtraTools): ExtraTools are static — they
	// don't have access to the running agent. Tools that need to call
	// session.AttachImages (to show rendered previews to the model on
	// its next turn) require the live session handle that only exists
	// after construction. The factory receives that handle.
	SessionToolFactory SessionToolFactory

	// PostRunDelivery, if set, is called by the agent command handler
	// (`.agent run`) INSTEAD of the default text + paste-fallback reply
	// when the executor's result carries a PostRunResult. The callback
	// receives the Discord message to reply to, the agent's text output,
	// and the PostRunResult. It returns the message ID of the primary
	// reply (for origin recording) and any error.
	//
	// Why a callback on Invocation (vs a handler method on the agent):
	// delivery needs services (paste, filetransfer, Discord session)
	// that live outside the agents package. A callback lets the adapter
	// (e.g., scaddy) close over the services at factory-build time
	// without adding service dependencies to the agents.System struct.
	//
	// When nil, `handleRun` falls through to the standard text-based
	// reply path (formatRunReply + postRunReply). When set, the
	// callback owns the ENTIRE reply — `handleRun` does NOT post a
	// text reply alongside it.
	PostRunDelivery func(ctx context.Context, channelID, replyToMsgID string, output string, prr *PostRunResult) (primaryMsgID string, err error)

	// RunState, when set by the executor, lets a tool read the live
	// run's progress + budget snapshot (iteration vs cap, tool calls,
	// tokens, cost, elapsed). Nil on paths that do not provide it (e.g.
	// the no-tools direct path, or executors that predate the hook).
	// The skill_self_status tool reads this.
	RunState RunStateAccessor

	// AttachImages, when set by the executor, queues a user-role message
	// (optional text + image parts) into the LIVE run so the model sees
	// the images on its next step — the same steer-mailbox mechanism the
	// SessionToolFactory's AgentSession exposes, but reachable from any
	// ordinary tool handler. A tool returns text; images cannot ride a
	// string result, so a tool that fetches images the model must SEE
	// (e.g. discord_list_recent_messages reading channel history) calls
	// this to feed the pixels in. Nil on paths that do not own a steer
	// mailbox (skillexec, the no-tools direct path); tools MUST nil-check
	// before calling and degrade to text-only when it is nil.
	AttachImages func(text string, images ...llm.ImagePart)

	// gate / audit are populated by the registry's Build before
	// BuildLLM is called. Tools should call CheckGate(inv) at the top
	// of their handler and EmitAudit(inv, ...) when reporting tool
	// results. The fields are unexported in the public surface but
	// available to tools via the helpers in helpers.go.
	gate         string
	currentSkill string
	audit        AuditHook
	toolName     string
}

// RunState is a live, read-only snapshot of the current run's progress
// and budget. Populated on demand by the executor's per-run accessor
// (see Invocation.RunState).
type RunState struct {
	Iteration      int
	MaxIterations  int
	ToolCalls      int
	MaxToolCalls   int
	InputTokens    int64
	OutputTokens   int64
	ThinkingTokens int64
	ElapsedSeconds int
}

// RunStateAccessor returns the live RunState for the enclosing run. The
// executor builds one per run and stamps it on Invocation.RunState
// before the toolbox is built; tools read it via inv.RunState. Nil on
// any path that does not provide it.
type RunStateAccessor interface {
	RunState() RunState
}

// Registry is the read interface to the tool catalog. Concrete impl is
// the package-private *registry struct returned by NewRegistry.
type Registry interface {
	Register(t Tool) error
	Get(name string) (Tool, bool)
	List() []Tool
	// Build returns an llm.Toolbox with each named tool prepared for
	// execution against the given invocation. Save-time authoring
	// checks happen elsewhere (CheckAuthoring in checks.go) — Build
	// trusts that the skill was already saved past those gates and
	// only re-checks runtime invariants:
	//
	//  1. Share-safety drift: rejects an unsafe tool when visibility
	//     != private.
	//  2. SkillNameGate enforcement is delegated to the per-tool
	//     handler via CheckGate, which reads invocation context.
	//  3. Audit emission via EmitAudit (also per-tool).
	//
	// The optional `trusted` variadic argument lets the caller declare
	// the skill as trusted infrastructure (a builtin loaded from disk
	// by the project's own loader) so the share-safety drift check is
	// skipped. Builtins legitimately ship with public visibility AND
	// not-safe-for-share tools (e.g. skill-wizard's wizard_* tools),
	// and the loader bypasses save-time gates by design — applying the
	// share-safety check at invocation would be inconsistent with the
	// rest of the trusted-builtin contract. Pass true ONLY for builtins
	// (Skill.Source == SourceBuiltin / OwnerID == ""). Variadic so the
	// existing call sites (and tests) compile unchanged.
	Build(names []string, inv Invocation, vis Visibility, audit AuditHook, trusted ...bool) (*llm.Toolbox, error)
}

// AuditHook is invoked synchronously around each tool call. Implementations
// typically forward to skillaudit.Writer. May be nil for tests.
type AuditHook func(call AuditCall)

// AuditCall describes one tool invocation. Result is set on success;
// Err is set on failure. Either may be present together (e.g. the tool
// returned partial output then errored).
type AuditCall struct {
	Tool   string
	Args   string
	Result string
	Err    error
}

// Step is one unit of agent progress surfaced to a consumer of OnStep
// (and accumulated onto the executor's run Result). Today there is one
// Step per tool call; the shape is deliberately open so future kinds
// (a coalesced reasoning beat, a sub-agent delegation) slot in without a
// wire change.
//
// This is a plain DTO — no HTTP/Discord/JSON-tag coupling beyond the
// neutral snake_case tags a transport may reuse. The chat API converts
// it to its own persisted/wire type; Discord/cron consumers read the
// Result field directly.
type Step struct {
	// ID is stable per-step and unique within one run; it is the
	// correlation key between the "start" and "end" emissions.
	ID string `json:"id"`
	// Kind is an open vocabulary (search, read, code, image, file,
	// memory, delegate, tool, …); consumers map known values to an icon
	// and fall back for unknown ones. Never drop a step for an
	// unrecognised kind.
	Kind string `json:"kind"`
	// Title is a short machine-ish label (typically the raw tool name).
	Title string `json:"title,omitempty"`
	// Summary is the human present-tense one-liner ("Searching the web
	// for …"); on end it may be replaced with a result phrase.
	Summary string `json:"summary"`
	// Status is "running" | "complete" | "error".
	Status string `json:"status"`
	// Detail is optional, user-safe, size-capped markdown. Never raw tool
	// output, credentials, or chain-of-thought.
	Detail string `json:"detail,omitempty"`
	// StartedAt is when the step began.
	StartedAt time.Time `json:"started_at"`
	// EndedAt is set on the terminal "end" emission.
	EndedAt *time.Time `json:"ended_at,omitempty"`
}

// StepEvent is one live emission to OnStep. Phase is "start" or "end"
// ("delta" is reserved for progressive detail and unused today). Step
// carries the full current snapshot; Detail holds the delta text when
// Phase == "delta".
type StepEvent struct {
	Phase  string
	Step   Step
	Detail string
}

// NewRegistry constructs an empty registry. Call Register for each tool;
// see pkg/skilltools/default_registry.go for the v1 set.
func NewRegistry() Registry {
	return &registry{tools: make(map[string]Tool)}
}

type registry struct {
	mu    sync.RWMutex
	tools map[string]Tool
}

func (r *registry) Register(t Tool) error {
	if t == nil {
		return fmt.Errorf("skilltools: nil tool")
	}
	name := t.Name()
	if name == "" {
		return fmt.Errorf("skilltools: tool with empty name")
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, dup := r.tools[name]; dup {
		return fmt.Errorf("skilltools: duplicate tool name %q", name)
	}
	r.tools[name] = t
	return nil
}

func (r *registry) Get(name string) (Tool, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	t, ok := r.tools[name]
	return t, ok
}

func (r *registry) List() []Tool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	out := make([]Tool, 0, len(r.tools))
	for _, t := range r.tools {
		out = append(out, t)
	}
	return out
}

// Build prepares an llm.Toolbox for one skill execution.
//
// Why: each tool needs to know the caller / channel / skill name plus
// the audit hook. Stuffing them into Invocation lets each Tool.BuildLLM
// produce a closure that has everything it needs without further
// plumbing.
//
// Defence in depth: rejects an unsafe tool when visibility != private —
// the share-time check should already have prevented this; this catches
// drift (e.g. a tool's SafeForShare flag flipping after a skill saved).
//
// The trusted variadic flag lets a caller bypass the share-safety drift
// check for builtin (trusted-infrastructure) skills. The mortventure /
// skill-wizard builtins legitimately ship with public visibility AND
// not-safe-for-share tools — the loader bypasses save-time gates and
// the share-safety check at invocation would block them inconsistently.
// Pass true ONLY for builtins.
func (r *registry) Build(names []string, inv Invocation, vis Visibility, audit AuditHook, trusted ...bool) (*llm.Toolbox, error) {
	isTrusted := len(trusted) > 0 && trusted[0]
	box := llm.NewToolbox("skilltools")
	for _, name := range names {
		t, ok := r.Get(name)
		if !ok {
			return nil, fmt.Errorf("skilltools: unknown tool %q", name)
		}

		if !isTrusted && vis != VisibilityPrivate && !t.Permission().SafeForShare {
			return nil, fmt.Errorf("skilltools: tool %q is not safe for share but skill visibility is %s", name, vis)
		}

		// Populate the gate/audit fields on the Invocation so the tool
		// can call CheckGate / EmitAudit from its handler.
		toolInv := inv
		toolInv.gate = t.Permission().SkillNameGate
		toolInv.currentSkill = inv.SkillName
		toolInv.audit = audit
		toolInv.toolName = name

		built := t.BuildLLM(toolInv)
		if built.Name == "" {
			return nil, fmt.Errorf("skilltools: tool %q built llm.Tool with empty name", name)
		}
		if err := box.Add(built); err != nil {
			return nil, fmt.Errorf("skilltools: adding tool %q: %w", name, err)
		}
	}
	return box, nil
}
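
The share-safety drift check inside Build, together with its optional trusted flag, can be isolated into a small standalone sketch. `checkShareSafety` is a hypothetical helper written for illustration only; it mirrors the condition in Build but takes the tool's SafeForShare value directly, so it compiles without the majordomo llm package.

```go
package main

import "fmt"

// Visibility mirrors the typed string used by the registry.
type Visibility string

const (
	VisibilityPrivate Visibility = "private"
	VisibilityShared  Visibility = "shared"
	VisibilityPublic  Visibility = "public"
)

// checkShareSafety (hypothetical helper) rejects a tool that is not safe
// for share when the skill is visible beyond its owner, unless the caller
// passes trusted=true for a builtin skill.
func checkShareSafety(name string, safeForShare bool, vis Visibility, trusted ...bool) error {
	isTrusted := len(trusted) > 0 && trusted[0]
	if !isTrusted && vis != VisibilityPrivate && !safeForShare {
		return fmt.Errorf("skilltools: tool %q is not safe for share but skill visibility is %s", name, vis)
	}
	return nil
}

func main() {
	// Private skill: an unsafe tool is allowed.
	fmt.Println(checkShareSafety("kv_get", false, VisibilityPrivate))
	// Public skill, untrusted: the unsafe tool is rejected.
	fmt.Println(checkShareSafety("kv_get", false, VisibilityPublic))
	// Public builtin passed as trusted: the check is skipped.
	fmt.Println(checkShareSafety("wizard_save", false, VisibilityPublic, true))
}
```

Because the flag is variadic, call sites that omit it keep the stricter untrusted behaviour by default.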