executus/tool/registry.go
steve 43b2471737
C0b: wire Critic + Delivery into run.Executor
Continues finishing the executor's run.Ports wiring (after C0's Palette).

Critic (run/critic.go): when Ports.Critic is set and the agent enables it, the
executor calls Monitor at run start, feeds RecordStep/RecordToolStart from the
step observer, drains the critic's Steer messages into the loop via
agent.WithSteer, and binds the run's hard cancellation to the critic's
(extendable) Deadline through a watch goroutine — a healthy-but-slow run gets
room while a hung one is killed. Stop() on run end. Soft timeout from
Defaults.CriticSoftTimeout (default 90s). nil-safe: no critic / not-enabled =
no-op.

Delivery (run/executor.go deliver): after the run, when Ports.Delivery is set
and inv.DeliveryID is non-empty, the executor posts Result.Output (or
DeliverError on failure) to a host-interpreted deliver.Target
{inv.DeliveryKind, inv.DeliveryID}. Empty target = caller reads Result.Output
itself (the synchronous default; the `.agent run` canary). Best-effort +
detached.

tool.Invocation gains DeliveryKind/DeliveryID (host-set egress target).

Tests: critic monitored/fed/steered/stopped when enabled, untouched when not;
delivery posts on a target, skips without one. Deferred: Checkpointer (needs a
majordomo hook to snapshot the running message history).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 10:00:05 -04:00


// Package tool is the tool registry for the agentic skills platform.
// Tools registered here can be referenced by name from a Skill's Tools
// list and are surfaced to the underlying majordomo agent loop via Build().
//
// Independent of pkg/logic/chatbot/tool_provider.go: the chatbot's
// ToolProvider supplies tools per-channel during a chatbot turn; skill
// tools are scoped to one skill execution. Bridging happens once, in
// pkg/logic/skills/chatbot_provider.go, which exposes whole agent skills
// as chatbot tools (not individual skill tools).
//
// Permission model is documented in
// docs/superpowers/specs/2026-05-02-agentic-skills-design.md, "Tool
// registry" section. Three orthogonal checks:
//
// 1. Save-time: AuthoringRequirement vs caller's admin status.
// 2. Share-time: SafeForShare for visibility != private.
// 3. Execute-time: SkillNameGate.
package tool

import (
	"context"
	"fmt"
	"sync"
	"time"

	llm "gitea.stevedudenhoeffer.com/steve/majordomo/llm"
)

// Visibility is the spec's visibility enum mirrored here as a typed
// string. It's redeclared (vs imported from pkg/logic/skills) to break
// the import cycle that would otherwise form: skills → skilltools →
// skills. The string values match the skills package's Visibility
// one-to-one, so a caller can convert with `string(VisibilityPublic)`.
type Visibility string
const (
	VisibilityPrivate Visibility = "private"
	VisibilityShared  Visibility = "shared"
	VisibilityPublic  Visibility = "public"
)
// Tool is what a registry entry implements. Concrete tools wrap an
// underlying mort subsystem (e.g. wolfram, weather, paste) and produce
// an llm.Tool on demand for a given Invocation.
//
// Why an interface (vs majordomo's concrete llm.Tool): we need richer
// metadata (Permission, Categories, SkillNameGate) for the platform's
// gating logic before we hand the tool to majordomo. BuildLLM converts
// to llm.Tool for one execution, closing over the Invocation so the
// per-tool handler can read CallerID/ChannelID without further plumbing.
//
// Why BuildLLM-per-call (vs static llm.Tool): per-user tools must close
// over inv.CallerID — the LLM-supplied args are intentionally ignored
// for those. Constructing the llm.Tool inside BuildLLM lets each tool
// craft its own typed Define call while reading the invocation context.
//
// Test: each tool under pkg/skilltools/tools/ has its own *_test.go.
type Tool interface {
	Name() string
	Description() string
	Permission() Permission
	// BuildLLM produces the llm.Tool for one invocation. The returned
	// tool's name MUST equal Name(); the registry's Build() relies on
	// this when wiring multiple tools into a Toolbox.
	BuildLLM(inv Invocation) llm.Tool
}
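
The BuildLLM-per-call pattern described above can be sketched without the majordomo dependency. This is a minimal sketch under stated assumptions: `invocation`, `fakeLLMTool`, and `notesTool` are hypothetical stand-ins, not the real Invocation or llm.Tool types. It only illustrates that the handler reads the caller from the closed-over invocation and ignores any user ID the model supplies.

```go
package main

import "fmt"

// invocation is a hypothetical, trimmed stand-in for Invocation.
type invocation struct {
	CallerID string
}

// fakeLLMTool is a hypothetical stand-in for llm.Tool: a name plus a handler.
type fakeLLMTool struct {
	Name    string
	Handler func(args map[string]string) string
}

// notesTool is an illustrative per-user tool (not a real registry entry).
type notesTool struct{}

func (notesTool) Name() string { return "my_notes" }

// BuildLLM closes over inv, so the handler always acts on the real caller.
func (t notesTool) BuildLLM(inv invocation) fakeLLMTool {
	return fakeLLMTool{
		Name: t.Name(),
		Handler: func(args map[string]string) string {
			// Any "user_id" key in args is deliberately ignored.
			return "notes for " + inv.CallerID
		},
	}
}

func main() {
	built := notesTool{}.BuildLLM(invocation{CallerID: "alice"})
	// The model tries to name another user; the closure still uses alice.
	fmt.Println(built.Name, "->", built.Handler(map[string]string{"user_id": "mallory"}))
}
```

Because the identity comes from the closure rather than the arguments, a prompt-injected `user_id` cannot redirect the tool to another user's data in this sketch.
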
// Permission summarises the three lifecycle gates plus UI metadata.
type Permission struct {
// AuthoringRequirement governs who may SAVE a skill that uses
// this tool: anyone or admin-only.
AuthoringRequirement Requirement
// OperatesOn classifies whose data the tool reads: global
// (channel-wide, public sources) or caller (the invoking user's
// own data).
OperatesOn Scope
// SafeForShare reports whether the tool may appear in a shared or
// public skill. Tools that operate on caller data are typically
// not safe for share — the executing skill becomes a vector for
// reading other users' data.
SafeForShare bool
// Categories are free-form labels used for UI grouping (read,
// write, network, code, data, social). Code does NOT branch on
// these strings.
Categories []string
// SkillNameGate, if non-empty, restricts execution to the named
// skill. Used for wizard-only tools in v2; SkillNameGate=="" means
// any skill may use the tool.
SkillNameGate string
}
// Requirement is who is allowed to author a skill using this tool.
type Requirement string
const (
	RequirementAnyone Requirement = "anyone"
	RequirementAdmin  Requirement = "admin"
)
// Scope classifies the data domain a tool acts on.
type Scope string
const (
	ScopeGlobal Scope = "global"
	ScopeCaller Scope = "caller"
)
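
The three lifecycle gates that Permission summarises (save-time, share-time, execute-time) can be sketched as three independent predicates. The function names `canSave`, `canShare`, and `canExecute` are hypothetical; the real checks live in checks.go and CheckGate, whose signatures are not shown in this file. The `permission` struct is a trimmed copy of only the fields the gates read.

```go
package main

import "fmt"

type Requirement string

const (
	RequirementAnyone Requirement = "anyone"
	RequirementAdmin  Requirement = "admin"
)

// permission is a trimmed copy of the Permission fields the gates read.
type permission struct {
	AuthoringRequirement Requirement
	SafeForShare         bool
	SkillNameGate        string
}

// canSave is the save-time gate: admin-only tools need an admin author.
func canSave(p permission, callerIsAdmin bool) bool {
	return p.AuthoringRequirement != RequirementAdmin || callerIsAdmin
}

// canShare is the share-time gate: unsafe tools may only sit in private skills.
func canShare(p permission, visibility string) bool {
	return visibility == "private" || p.SafeForShare
}

// canExecute is the execute-time gate: an empty gate admits any skill.
func canExecute(p permission, skillName string) bool {
	return p.SkillNameGate == "" || p.SkillNameGate == skillName
}

func main() {
	wizard := permission{AuthoringRequirement: RequirementAdmin, SkillNameGate: "skill-wizard"}
	fmt.Println("save by non-admin:", canSave(wizard, false))
	fmt.Println("share as public:", canShare(wizard, "public"))
	fmt.Println("run in skill-wizard:", canExecute(wizard, "skill-wizard"))
	fmt.Println("run elsewhere:", canExecute(wizard, "other-skill"))
}
```

The checks are orthogonal: each reads a different Permission field, so passing one says nothing about the others.
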
// ContinuationContext describes a V10 reply continuation. When set on
// an Invocation, the skill executor reuses the parent run's KV scope,
// renders a continuation prompt, and bumps ChainDepth for cap
// enforcement.
//
// The executor reads ParentRunID to set the new run's parent_run_id
// column (for call-tree reconstruction); ParentOutput to render the
// "previous output you sent" line in the agent prompt; ReplyText to
// render the "user replied with" line; ReplyMessageID for diagnostic
// logging; and ChainDepth to compare against
// skills.reply.max_chain_depth.
//
// Why ChainDepth (vs walking parent_run_id at execution time): a fresh
// query per turn would add a DB roundtrip on every reply hop. Carrying
// the count in the invocation is cheap and authoritative.
type ContinuationContext struct {
// ParentRunID is the run that produced the message the user
// replied to. The new run inherits its KV scope (run:<ParentRunID>).
ParentRunID string
// ParentOutput is the text the parent run delivered to Discord —
// stored on the run row so it survives even if the parent's
// run-scope KV has been auto-purged (24h after parent finished).
ParentOutput string
// ReplyText is what the user said when they replied (the new
// turn's user input). May be empty if the reply was an
// attachment-only message; the agent should treat empty input as a
// "noop continuation".
ReplyText string
// ReplyMessageID is the Discord message ID of the user's reply.
// Used for audit + log breadcrumbs; not currently consumed by the
// agent prompt.
ReplyMessageID string
// ChainDepth is how many continuation hops have happened in the
// chain rooted at the original invocation. The router should set
// this to (parent's chain depth + 1). The executor rejects when
// it exceeds skills.reply.max_chain_depth.
ChainDepth int
}
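
The ChainDepth bookkeeping described above (router increments, executor compares against the cap) can be sketched as follows. `nextHop`, `checkDepth`, and `errChainTooDeep` are hypothetical names, and `continuation` is a trimmed stand-in for ContinuationContext; the cap value 20 is the documented default for skills.reply.max_chain_depth.

```go
package main

import (
	"errors"
	"fmt"
)

// continuation is a trimmed stand-in for ContinuationContext.
type continuation struct {
	ParentRunID string
	ChainDepth  int
}

// errChainTooDeep is a hypothetical sentinel error for this sketch.
var errChainTooDeep = errors.New("reply chain exceeds max depth")

// nextHop is what a router might do: carry the hop count forward by one.
func nextHop(parentRunID string, parentDepth int) continuation {
	return continuation{ParentRunID: parentRunID, ChainDepth: parentDepth + 1}
}

// checkDepth is what an executor might do: compare the carried count to the
// cap, with no database walk over parent_run_id.
func checkDepth(c *continuation, maxDepth int) error {
	if c == nil {
		return nil // fresh invocation, not a continuation
	}
	if c.ChainDepth > maxDepth {
		return errChainTooDeep
	}
	return nil
}

func main() {
	first := nextHop("run-1", 0)
	fmt.Println(first.ChainDepth, checkDepth(&first, 20))

	deep := nextHop("run-21", 20)
	fmt.Println(deep.ChainDepth, checkDepth(&deep, 20))
}
```
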
// InputFile is a non-image file the user supplied with a run (audio,
// etc.). The executor stages it into the file store under run scope and
// surfaces its file_id to the agent. Name is a safe base name (no path
// separators) suitable for /workspace/<name>; MimeType is the resolved
// content type; Data is the raw bytes.
type InputFile struct {
	Name     string
	MimeType string
	Data     []byte
}
// Invocation is the runtime context passed to Tool.BuildLLM. The executor
// builds it once per skill run and the same struct is closed over by
// every tool's handler, so each tool sees the caller / channel identity.
type Invocation struct {
SkillID string
SkillName string
RunID string
CallerID string
ChannelID string
GuildID string
// DeliveryKind / DeliveryID name where the executor posts the run's output
// via run.Ports.Delivery — a host-interpreted Target ("channel"/"dm"/
// "thread"/...). An empty DeliveryID means the executor delivers nothing
// and the caller reads Result.Output itself (the synchronous default; the
// `.agent run` canary works this way).
DeliveryKind string
DeliveryID string
// CallerIsAdmin is true when the caller is a mort admin (Member.Admin).
// Populated by the executor at run dispatch via Bot.GetMember; defaults
// to false on any lookup failure (member not found, DB error, empty
// CallerID for system-invoked runs). Read by tools that gate behaviour
// on admin status — currently `code_exec` for the v15 admin-only WAN
// network mode.
//
// Why a precomputed bool on Invocation (vs an AdminChecker dep on
// every tool): the admin lookup is read-once-per-run; every tool
// would otherwise have to redo the work. The executor knows the
// caller's admin status by the time it builds Invocation, so it
// stamps the field once and every tool reads it for free.
CallerIsAdmin bool
// SkillInputs is the parsed input map for the enclosing skill —
// available so a tool can reference values the user supplied at
// invocation time. Tools may read this to specialise behaviour but
// MUST NOT use it as a substitute for inv.CallerID-based isolation.
SkillInputs map[string]any
// ParentRunID is set when the skill was invoked via skill_invoke
// from a parent skill run. Empty for top-level invocations
// (Discord, chatbot, scheduler). Used by the loop guard in
// skill_invoke and by the audit log for call-tree reconstruction.
//
// Why threaded through Invocation (vs context.Value): the loop
// guard runs at tool-handler time, where the only context the
// handler sees is inv. Stuffing it into context would force a
// helper for unwrap on every read; an explicit field is easier to
// audit and impossible to forget.
ParentRunID string
// RootRunID is the audit run id at the ROOT of the dispatch tree
// this run belongs to — for a top-level run, its own RunID; for a
// delegated run (skill_invoke / agent_invoke / agent_spawn /
// palette wrappers), the outermost ancestor's. Stamped by both
// executors from the dispatchguard ancestor chain right after
// guard entry. Backs the shared `root_run:<id>` KV scope that lets
// parallel sibling workers coordinate (see tools/scope_validate.go
// + RootRunKVPartition).
RootRunID string
// ToolsSubset, when non-empty, narrows an AGENT run's low-level tools
// to the named subset of the agent's configured LowLevelTools. Set by
// agent_invoke's `tools_subset` arg for ephemeral fan-out — spawning a
// focused worker from a template (e.g. a `coder` template with only
// code_exec + read_page). Names outside the agent's tool menu are
// rejected upstream (in the invoke adapter), so by the time the
// executor reads this the intersection is safe. Empty = full palette.
// Skill runs ignore this field.
ToolsSubset []string
// SystemPromptPrepend, when non-empty, is prepended to an AGENT's
// system prompt for this invocation only — the fan-out "customized
// system prompt" lever (agent_invoke's `prompt_prepend` arg). It
// specializes a template persona to a task without mutating the
// persisted agent row. Skill runs ignore this field.
SystemPromptPrepend string
// SuppressDelivery, when true, instructs the skill executor to
// SKIP its OutputTarget Delivery (Deliver / DeliverError) entirely.
// The run still produces an output string (returned from Run) and
// still writes to the audit log — only the side-channel delivery
// (Discord channel/DM/thread post) is suppressed.
//
// Why: when the chatbot exposure adapter invokes a skill, the skill's
// output is already going to be consumed by the chatbot as a tool
// result; ALSO posting it to Discord via OutputTarget produces double
// output and (worse) primes the chatbot to call the tool again on
// the next turn after seeing its own output as a "human message",
// kicking off a tool-loop. The chatbot adapter sets this to true on
// every invocation it constructs.
SuppressDelivery bool
// HandlerOwnsDelivery, when true, tells the executor that the caller
// (typically a Discord command handler) will assemble the final
// user-visible reply itself — folding any deferred attachments
// (rows queued by send_attachments to skill_run_pending_attachments)
// into the same message as the text output. The executor's
// post-run AttachmentDrainer is skipped so the handler can drain +
// classify + chain-overflow + post in one place.
//
// Why an explicit flag (vs reusing SuppressDelivery): SuppressDelivery
// also short-circuits the OutputTarget Delivery layer (channel/dm/
// thread post), which is the right shape for chatbot exposure but
// the WRONG shape for `.agent run` — the handler still wants the
// audit row to land and the executor's drainer to NOT post a
// separate "here's an image" follow-up message after the handler's
// own text reply. HandlerOwnsDelivery is the narrow "the caller is
// taking over post-run delivery" signal that does NOT change any
// other executor behaviour.
//
// SuppressDelivery and HandlerOwnsDelivery are independent. The
// drainer is skipped when EITHER is set (the chatbot path doesn't
// want stray posts either; agent-run sets HandlerOwnsDelivery
// because it owns delivery; sub-agent dispatches set SuppressDelivery
// because they surface output as a tool result).
HandlerOwnsDelivery bool
// Priority is the v9 per-invocation priority override for the lane
// scheduler. When non-zero, the executor uses this value when
// constructing the lane Job; zero falls back to the skill's
// Skill.DefaultPriority. Owners are capped by convar
// `skills.priority_max_per_user` (default 5); admins may exceed it.
//
// Why a non-pointer (vs *int): zero means "use the default", which
// matches the convention everywhere else in this struct. Skills
// that need an explicit zero priority can store
// DefaultPriority=0 — the result is identical.
Priority int
// LaneWaitMaxSeconds is the v9 per-invocation lane backoff cap. When
// >0, the executor calls SubmitWithMaxWait so the run is rejected
// with ErrLaneBusy (surfaced as `lane_busy`) when the estimated
// queue wait would exceed this many seconds. 0 (default) preserves
// the legacy block-forever Submit semantics.
LaneWaitMaxSeconds int
// LaneOverride forces the run onto the named lane regardless of
// Skill.ExecutionLane. Used by the v9 inbound webhook handler to
// route webhook-triggered runs to the dedicated webhook-default
// lane. Empty preserves the per-skill ExecutionLane.
LaneOverride string
// Continuation, when non-nil, signals that this Invocation is a
// V10 reply continuation: a Discord user replied to a message the
// originating skill posted, and mort is re-invoking the skill to
// produce the next turn. The executor reads this field to:
//
// - Reuse the parent run's `run:<parent_run_id>` KV scope (so any
// state the prior turn saved is still readable).
// - Render a continuation block at the top of the agent's user
// prompt that includes the parent output + reply text.
// - Enforce the per-deployment chain-depth cap
// (skills.reply.max_chain_depth, default 20).
// - Stamp parent_run_id on the new run for call-tree
// reconstruction in audit + UI.
//
// Why a pointer struct (vs flat fields): all five fields are
// meaningful only together — splitting them would invite
// half-populated states. nil = "this is a fresh invocation, not a
// continuation".
Continuation *ContinuationContext
// SourceWebhookSecretMatched is set true by the inbound webhook
// handler AFTER it has validated both the URL secret AND the HMAC
// signature for the named skill. It signals to System.Run that the
// caller is authenticated by a per-skill secret (not by Discord
// identity), so the visibility / owner gate in CanInvoke should be
// bypassed for THIS skill (matching SkillID). All other gates —
// pinned_version, budget caps, lane caps — still apply.
//
// Hotfix-5 Bug 1: pre-fix the webhook handler built an Invocation
// with CallerID=`<webhook>:<source-IP>` and dispatched through
// System.Run. CanInvoke saw a non-owner non-admin caller against a
// private skill and rejected with HTTP 500 ("caller is not
// permitted to invoke skill"). The cure isn't to weaken
// CanInvoke's general-purpose policy — it's to recognise that a
// matched secret IS the auth gate for the named skill.
//
// Why per-Invocation (vs a separate gate path): the executor uses
// Run as the single canonical dispatch point — adding a second
// "authenticated dispatch" entry would split run-recording, lane
// dispatch, and audit emission into two parallel implementations.
SourceWebhookSecretMatched bool
// OnEvent, when non-nil, is called by the executor at run
// boundaries and by the agent loop on each tool dispatch. The
// bot's command handler closes over the invoking message and
// reacts an emoji from the skill's StateReactEmoji map. Nil-safe.
//
// Event names:
//
//	"__start__": right before agent.Run starts
//	"__end__": on successful completion
//	"__error__": on terminal error
//	<tool_name>: when a tool dispatches (any registered tool)
//
// The executor passes the resolved emoji as `emoji` so callers
// don't have to look it up themselves; emoji=="" means "no react
// for this event" and callers should skip the react entirely.
//
// Why a callback (vs a state-react map carried in the Invocation):
// the lookup table lives on the Skill, not the Invocation, but the
// caller-supplied side effect (a Discord react) lives on the bot
// command surface. A callback bridges the two without forcing the
// executor to import discord types and without forcing the bot
// command surface to know about the Skill's emoji map shape.
OnEvent func(ctx context.Context, event string, emoji string)
// OnToolEvent, when non-nil, is called by the executor on each tool
// dispatch with phase "start" (before the tool runs) then "end" or
// "error" (after it completes, with the result text in detail). Distinct
// from OnEvent (which is the emoji state-react hook): this carries the
// tool name + args/result so an out-of-band caller — e.g. the mortise
// chat API streaming SSE tool.start/tool.end frames — can surface live
// tool-progress. Nil-safe; the callback MUST be fast and non-blocking
// (it runs on the agent-loop goroutine).
OnToolEvent func(ctx context.Context, toolName, phase, detail string)
// OnStep, when non-nil, is called by the executor as the agent loop
// makes progress — currently once per tool call: phase "start" before
// the tool runs, phase "end" after it completes (StepEvent.Step.Status
// is "complete" or "error"). Correlate the two by StepEvent.Step.ID.
// "delta" is reserved for progressive detail and is unused today.
//
// Distinct from OnToolEvent (the raw tool-name/result hook): OnStep
// carries a richer, presentation-ready Step (kind + human present-tense
// summary) so an out-of-band consumer — e.g. the mortise chat API
// streaming SSE step.start/step.end frames — can render structured
// progress without re-deriving it. The executor ALSO accumulates the
// same Steps onto its run Result, so persistence does not depend on
// this callback being set. Nil-safe; the callback MUST be fast and
// non-blocking (it runs on the agent-loop goroutine).
OnStep func(ctx context.Context, ev StepEvent)
// InvokingMessageID is the Discord message ID of the user's command
// that triggered this run, when it was triggered by a Discord text
// command. Used by delivery to thread the reply (Discord native
// reply with the gray quote bar + jump link). Empty for chatbot
// exposure, scheduled, or webhook invocations — delivery falls
// back to a plain channel post for those.
//
// Why threaded through Invocation (vs a separate field on Skill or
// a magic SkillInputs key): the message ID is per-invocation, not
// per-skill, and the delivery layer is the natural reader. Direct
// field on Invocation matches the existing ChannelID / GuildID
// fields' shape.
InvokingMessageID string
// Images carries multi-modal image content for the initial user
// message. When non-empty, the executor builds the initial user
// message with llm.UserParts(text + image parts) instead of plain
// llm.UserText. Populated by callers that extract images from Discord
// attachments or URLs in prompt text (pkg/imageutil downloads the
// bytes — majordomo image parts are bytes-only). Nil = text-only.
Images []llm.ImagePart
// InputFiles carries non-image attachments (audio, etc.) the user
// supplied with the run. Unlike Images, these are NOT inlined into
// the model's context — the LLM can't ingest raw mp3/wav/midi bytes.
// Instead the executor stages each into the skill file store under
// run scope and tells the agent the resulting file_ids (in the
// prompt) so it can hand one to a worker tool (e.g. code_exec
// files_in → /workspace/<name>) for processing. Nil = none.
InputFiles []InputFile
// ExtraTools are additional llm.Tool instances injected for this
// run only. They are appended to the palette after registry-built
// tools, skill-palette wrappers, and sub-agent wrappers. Use this
// for session-specific tools that cannot be pre-registered in the
// catalog (e.g., scaddy's write_scad which needs per-session
// workspace + renderer state).
//
// Why on Invocation (vs a dedicated Run parameter): the Invocation
// is the per-run context carrier in mort's execution path. Adding
// a separate ExtraTools arg to Executor.Run would fork the
// signature for one use case; a field on the existing carrier
// keeps the surface stable.
ExtraTools []llm.Tool
// SessionToolFactory, if set, is called with the live AgentSession
// after the executor constructs the agent but before it runs. It
// returns a SessionTools struct carrying the tools to add, an
// optional PostRun hook for post-processing (e.g., rendering final
// artifacts from workspace state), and an optional Cleanup func for
// resource teardown. Types are defined in session_tools.go.
//
// Why a factory (vs ExtraTools): ExtraTools are static — they
// don't have access to the running agent. Tools that need to call
// session.AttachImages (to show rendered previews to the model on
// its next turn) require the live session handle that only exists
// after construction. The factory receives that handle.
SessionToolFactory SessionToolFactory
// PostRunDelivery, if set, is called by the agent command handler
// (`.agent run`) INSTEAD of the default text + paste-fallback reply
// when the executor's result carries a PostRunResult. The callback
// receives the Discord message to reply to, the agent's text output,
// and the PostRunResult. It returns the message ID of the primary
// reply (for origin recording) and any error.
//
// Why a callback on Invocation (vs a handler method on the agent):
// delivery needs services (paste, filetransfer, Discord session)
// that live outside the agents package. A callback lets the adapter
// (e.g., scaddy) close over the services at factory-build time
// without adding service dependencies to the agents.System struct.
//
// When nil, `handleRun` falls through to the standard text-based
// reply path (formatRunReply + postRunReply). When set, the
// callback owns the ENTIRE reply — `handleRun` does NOT post a
// text reply alongside it.
PostRunDelivery func(ctx context.Context, channelID, replyToMsgID string, output string, prr *PostRunResult) (primaryMsgID string, err error)
// RunState, when set by the executor, lets a tool read the live
// run's progress + budget snapshot (iteration vs cap, tool calls,
// tokens, cost, elapsed). Nil on paths that do not provide it (e.g.
// the no-tools direct path, or executors that predate the hook).
// The skill_self_status tool reads this.
RunState RunStateAccessor
// AttachImages, when set by the executor, queues a user-role message
// (optional text + image parts) into the LIVE run so the model sees
// the images on its next step — the same steer-mailbox mechanism the
// SessionToolFactory's AgentSession exposes, but reachable from any
// ordinary tool handler. A tool returns text; images cannot ride a
// string result, so a tool that fetches images the model must SEE
// (e.g. discord_list_recent_messages reading channel history) calls
// this to feed the pixels in. Nil on paths that do not own a steer
// mailbox (skillexec, the no-tools direct path); tools MUST nil-check
// before calling and degrade to text-only when it is nil.
AttachImages func(text string, images ...llm.ImagePart)
// gate, currentSkill, audit, and toolName are populated by the
// registry's Build before BuildLLM is called. Tools should call
// CheckGate(inv) at the top of their handler and EmitAudit(inv, ...)
// when reporting tool results. The fields are unexported; tools reach
// them through the helpers in helpers.go.
gate string
currentSkill string
audit AuditHook
toolName string
}
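
The interaction between the delivery-related fields on Invocation can be sketched as two predicates. This is an illustration only: `flags`, `postsOutput`, and `drainsAttachments` are hypothetical names, and how the executor actually combines DeliveryID with SuppressDelivery is an assumption here. What the field comments do state is that an empty DeliveryID means nothing is delivered, that SuppressDelivery skips the post, and that the attachment drainer is skipped when either flag is set.

```go
package main

import "fmt"

// flags is a trimmed stand-in for the delivery-related Invocation fields.
type flags struct {
	DeliveryID          string
	SuppressDelivery    bool
	HandlerOwnsDelivery bool
}

// postsOutput reports whether the executor would post the run output.
// ASSUMPTION: an empty target and SuppressDelivery both veto the post;
// HandlerOwnsDelivery alone does not.
func postsOutput(f flags) bool {
	return f.DeliveryID != "" && !f.SuppressDelivery
}

// drainsAttachments reports whether the post-run attachment drainer runs.
// It is skipped when EITHER flag is set, as the field comments describe.
func drainsAttachments(f flags) bool {
	return !f.SuppressDelivery && !f.HandlerOwnsDelivery
}

func main() {
	cases := map[string]flags{
		"scheduled run":   {DeliveryID: "channel-1"},
		"chatbot adapter": {DeliveryID: "channel-1", SuppressDelivery: true},
		"agent run":       {HandlerOwnsDelivery: true},
	}
	for _, name := range []string{"scheduled run", "chatbot adapter", "agent run"} {
		f := cases[name]
		fmt.Printf("%s: post=%v drain=%v\n", name, postsOutput(f), drainsAttachments(f))
	}
}
```
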
// RunState is a live, read-only snapshot of the current run's progress
// and budget. Populated on demand by the executor's per-run accessor
// (see Invocation.RunState).
type RunState struct {
	Iteration      int
	MaxIterations  int
	ToolCalls      int
	MaxToolCalls   int
	InputTokens    int64
	OutputTokens   int64
	ThinkingTokens int64
	ElapsedSeconds int
}
// RunStateAccessor returns the live RunState for the enclosing run. The
// executor builds one per run and stamps it on Invocation.RunState
// before the toolbox is built; tools read it via inv.RunState. Nil on
// any path that does not provide it.
type RunStateAccessor interface {
	RunState() RunState
}
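
One way an executor could back RunStateAccessor is a small mutex-guarded tracker that hands out copies. This is a sketch, not the executor's implementation: `liveRun`, `newLiveRun`, `recordToolCall`, and `describe` are hypothetical, and RunState is trimmed to a few fields. The `describe` helper shows the nil check a tool needs on paths that do not provide an accessor.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// RunState is trimmed to the fields this sketch uses.
type RunState struct {
	ToolCalls      int
	MaxToolCalls   int
	ElapsedSeconds int
}

type RunStateAccessor interface {
	RunState() RunState
}

// liveRun is a hypothetical executor-side tracker for one run.
type liveRun struct {
	mu      sync.Mutex
	state   RunState
	started time.Time
}

func newLiveRun(maxToolCalls int) *liveRun {
	return &liveRun{state: RunState{MaxToolCalls: maxToolCalls}, started: time.Now()}
}

// recordToolCall would be called by the executor on each tool dispatch.
func (l *liveRun) recordToolCall() {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.state.ToolCalls++
}

// RunState returns a copy, so callers hold a read-only snapshot.
func (l *liveRun) RunState() RunState {
	l.mu.Lock()
	defer l.mu.Unlock()
	s := l.state
	s.ElapsedSeconds = int(time.Since(l.started).Seconds())
	return s
}

// describe is what a status tool might do; it nil-checks the accessor.
func describe(acc RunStateAccessor) string {
	if acc == nil {
		return "run state unavailable"
	}
	s := acc.RunState()
	return fmt.Sprintf("tool calls %d/%d", s.ToolCalls, s.MaxToolCalls)
}

func main() {
	run := newLiveRun(10)
	run.recordToolCall()
	fmt.Println(describe(run))
	fmt.Println(describe(nil))
}
```
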
// Registry is the interface to the tool catalog (registration, lookup,
// and toolbox construction). Concrete impl is the package-private
// *registry struct returned by NewRegistry.
type Registry interface {
Register(t Tool) error
Get(name string) (Tool, bool)
List() []Tool
// Build returns an llm.Toolbox with each named tool prepared for
// execution against the given invocation. Save-time authoring
// checks happen elsewhere (CheckAuthoring in checks.go) — Build
// trusts that the skill was already saved past those gates and
// only re-checks runtime invariants:
//
// 1. Share-safety drift: rejects an unsafe tool when visibility
// != private.
// 2. SkillNameGate enforcement is delegated to the per-tool
// handler via CheckGate, which reads invocation context.
// 3. Audit emission via EmitAudit (also per-tool).
//
// The optional `trusted` variadic argument lets the caller declare
// the skill as trusted infrastructure (a builtin loaded from disk
// by the project's own loader) so the share-safety drift check is
// skipped. Builtins legitimately ship with public visibility AND
// not-safe-for-share tools (e.g. skill-wizard's wizard_* tools),
// and the loader bypasses save-time gates by design — applying the
// share-safety check at invocation would be inconsistent with the
// rest of the trusted-builtin contract. Pass true ONLY for builtins
// (Skill.Source == SourceBuiltin / OwnerID == ""). Variadic so the
// existing call sites (and tests) compile unchanged.
Build(names []string, inv Invocation, vis Visibility, audit AuditHook, trusted ...bool) (*llm.Toolbox, error)
}
// AuditHook is invoked synchronously around each tool call. Implementations
// typically forward to skillaudit.Writer. May be nil for tests.
type AuditHook func(call AuditCall)
// AuditCall describes one tool invocation. Result is set on success;
// Err is set on failure. Both may be present together (e.g. the tool
// returned partial output then errored).
type AuditCall struct {
	Tool   string
	Args   string
	Result string
	Err    error
}
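
Since an AuditHook may be nil in tests, any emitter needs a nil check before calling it. The sketch below shows a recording hook and a nil-safe wrapper. The wrapper `emit` is hypothetical; the package's real helper is EmitAudit, whose signature is not shown in this file.

```go
package main

import (
	"errors"
	"fmt"
)

type AuditCall struct {
	Tool   string
	Args   string
	Result string
	Err    error
}

type AuditHook func(call AuditCall)

// emit is a hypothetical nil-safe wrapper around an AuditHook.
func emit(hook AuditHook, call AuditCall) {
	if hook == nil {
		return // tests and hook-less callers fall through silently
	}
	hook(call)
}

func main() {
	var recorded []AuditCall
	record := AuditHook(func(c AuditCall) { recorded = append(recorded, c) })

	emit(record, AuditCall{Tool: "weather", Args: `{"city":"Oslo"}`, Result: "4C"})
	// Result and Err can both be set: partial output followed by a failure.
	emit(record, AuditCall{Tool: "paste", Result: "partial", Err: errors.New("upstream timeout")})
	emit(nil, AuditCall{Tool: "ignored"})

	for _, c := range recorded {
		fmt.Printf("%s failed=%v\n", c.Tool, c.Err != nil)
	}
}
```
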
// Step is one unit of agent progress surfaced to a consumer of OnStep
// (and accumulated onto the executor's run Result). Today there is one
// Step per tool call; the shape is deliberately open so future kinds
// (a coalesced reasoning beat, a sub-agent delegation) slot in without a
// wire change.
//
// This is a plain DTO — no HTTP/Discord/JSON-tag coupling beyond the
// neutral snake_case tags a transport may reuse. The chat API converts
// it to its own persisted/wire type; Discord/cron consumers read the
// Result field directly.
type Step struct {
	// ID is stable per-step and unique within one run; it is the
	// correlation key between the "start" and "end" emissions.
	ID string `json:"id"`
	// Kind is an open vocabulary (search, read, code, image, file,
	// memory, delegate, tool, …); consumers map known values to an icon
	// and fall back for unknown ones. Never drop a step for an
	// unrecognised kind.
	Kind string `json:"kind"`
	// Title is a short machine-ish label (typically the raw tool name).
	Title string `json:"title,omitempty"`
	// Summary is the human present-tense one-liner ("Searching the web
	// for …"); on end it may be replaced with a result phrase.
	Summary string `json:"summary"`
	// Status is "running" | "complete" | "error".
	Status string `json:"status"`
	// Detail is optional, user-safe, size-capped markdown. Never raw tool
	// output, credentials, or chain-of-thought.
	Detail string `json:"detail,omitempty"`
	// StartedAt is when the step began.
	StartedAt time.Time `json:"started_at"`
	// EndedAt is set on the terminal "end" emission.
	EndedAt *time.Time `json:"ended_at,omitempty"`
}
// StepEvent is one live emission to OnStep. Phase is "start" or "end"
// ("delta" is reserved for progressive detail and unused today). Step
// carries the full current snapshot; Detail holds the delta text when
// Phase == "delta".
type StepEvent struct {
	Phase  string
	Step   Step
	Detail string
}
// NewRegistry constructs an empty registry. Call Register for each tool;
// see pkg/skilltools/default_registry.go for the v1 set.
func NewRegistry() Registry {
	return &registry{tools: make(map[string]Tool)}
}

type registry struct {
	mu    sync.RWMutex
	tools map[string]Tool
}

func (r *registry) Register(t Tool) error {
	if t == nil {
		return fmt.Errorf("skilltools: nil tool")
	}
	name := t.Name()
	if name == "" {
		return fmt.Errorf("skilltools: tool with empty name")
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, dup := r.tools[name]; dup {
		return fmt.Errorf("skilltools: duplicate tool name %q", name)
	}
	r.tools[name] = t
	return nil
}

func (r *registry) Get(name string) (Tool, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	t, ok := r.tools[name]
	return t, ok
}

func (r *registry) List() []Tool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	out := make([]Tool, 0, len(r.tools))
	for _, t := range r.tools {
		out = append(out, t)
	}
	return out
}

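
List ranges over a map, and Go does not specify map iteration order, so the returned slice can be ordered differently from call to call. A caller that needs a stable listing (for a UI or a golden-file test) could sort by name. In this sketch `named`, `stub`, and `sortedNames` are hypothetical; `named` stands in for the Name() part of Tool.

```go
package main

import (
	"fmt"
	"sort"
)

// named stands in for the Name() method of Tool.
type named interface {
	Name() string
}

// stub is an illustrative tool that is just its name.
type stub string

func (s stub) Name() string { return string(s) }

// sortedNames returns tool names in lexical order, whatever order the
// registry's map iteration produced.
func sortedNames(tools []named) []string {
	names := make([]string, 0, len(tools))
	for _, t := range tools {
		names = append(names, t.Name())
	}
	sort.Strings(names)
	return names
}

func main() {
	listed := []named{stub("wolfram"), stub("paste"), stub("weather")}
	fmt.Println(sortedNames(listed))
}
```
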
// Build prepares an llm.Toolbox for one skill execution.
//
// Why: each tool needs to know the caller / channel / skill name plus
// the audit hook. Stuffing them into Invocation lets each Tool.BuildLLM
// produce a closure that has everything it needs without further
// plumbing.
//
// Defence in depth: rejects an unsafe tool when visibility != private —
// the share-time check should already have prevented this; this catches
// drift (e.g. a tool's SafeForShare flag flipping after a skill saved).
//
// The trusted variadic flag lets a caller bypass the share-safety drift
// check for builtin (trusted-infrastructure) skills. The mortventure /
// skill-wizard builtins legitimately ship with public visibility AND
// not-safe-for-share tools — the loader bypasses save-time gates and
// the share-safety check at invocation would block them inconsistently.
// Pass true ONLY for builtins.
func (r *registry) Build(names []string, inv Invocation, vis Visibility, audit AuditHook, trusted ...bool) (*llm.Toolbox, error) {
	isTrusted := len(trusted) > 0 && trusted[0]
	box := llm.NewToolbox("skilltools")
	for _, name := range names {
		t, ok := r.Get(name)
		if !ok {
			return nil, fmt.Errorf("skilltools: unknown tool %q", name)
		}
		if !isTrusted && vis != VisibilityPrivate && !t.Permission().SafeForShare {
			return nil, fmt.Errorf("skilltools: tool %q is not safe for share but skill visibility is %s", name, vis)
		}
		// Populate the gate/audit fields on the Invocation so the tool
		// can call CheckGate / EmitAudit from its handler.
		toolInv := inv
		toolInv.gate = t.Permission().SkillNameGate
		toolInv.currentSkill = inv.SkillName
		toolInv.audit = audit
		toolInv.toolName = name
		built := t.BuildLLM(toolInv)
		if built.Name == "" {
			return nil, fmt.Errorf("skilltools: tool %q built llm.Tool with empty name", name)
		}
		if err := box.Add(built); err != nil {
			return nil, fmt.Errorf("skilltools: adding tool %q: %w", name, err)
		}
	}
	return box, nil
}
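
The share-safety drift check and the optional trusted flag in Build can be exercised in isolation. The sketch below reproduces only that part of the logic over a hypothetical `catalog` map and `entry` type; it does not construct an llm.Toolbox, and `build` is not the registry's method.

```go
package main

import "fmt"

// entry is a hypothetical catalog record carrying only the share flag.
type entry struct {
	safeForShare bool
}

// build mirrors the shape of Build's unknown-tool and share-safety checks.
// The variadic trusted flag lets existing two-argument-style callers
// compile unchanged while builtins opt out of the drift check.
func build(catalog map[string]entry, names []string, visibility string, trusted ...bool) ([]string, error) {
	isTrusted := len(trusted) > 0 && trusted[0]
	out := make([]string, 0, len(names))
	for _, name := range names {
		e, ok := catalog[name]
		if !ok {
			return nil, fmt.Errorf("unknown tool %q", name)
		}
		if !isTrusted && visibility != "private" && !e.safeForShare {
			return nil, fmt.Errorf("tool %q is not safe for share but skill visibility is %s", name, visibility)
		}
		out = append(out, name)
	}
	return out, nil
}

func main() {
	catalog := map[string]entry{
		"weather":     {safeForShare: true},
		"wizard_save": {safeForShare: false},
	}
	names := []string{"weather", "wizard_save"}

	_, err := build(catalog, names, "public")
	fmt.Println("public, untrusted:", err)

	got, err := build(catalog, names, "public", true)
	fmt.Println("public, trusted:", got, err)

	got, err = build(catalog, names, "private")
	fmt.Println("private:", got, err)
}
```
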