P1 (part 1): move skilltools core -> tool/ (clean, verbatim)

The tool registry core (registry, permission model, Invocation, gated-tool wrapper, ssrf guard, hmac, encryption, argcoerce, helpers, rootrun, session_tools, webhook_rate_limit) had zero mort coupling: it imports only majordomo/llm + x/crypto/hkdf, so it moves verbatim with a package rename (skilltools -> tool). All same-package tests came along and pass; the SSRF, gated-wrapper, encryption and output-pattern invariants are re-anchored here.

majordomo re-enters the module graph (now pinned to the latest, incl. the front-loaded-output fix). model/ + llmmeta + structured follow next.

Docs: CLAUDE.md now requires README/examples to stay in sync with changes in the same commit; CI skips docs/example-only pushes via paths-ignore.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 19:31:47 -04:00
parent d2c18ad5bb
commit dc28b63ad8
24 changed files with 3461 additions and 1 deletions
@@ -0,0 +1,701 @@
+// Package tool is the tool registry for the agentic skills platform.
+// Tools registered here can be referenced by name from a Skill's Tools
+// list and are surfaced to the underlying majordomo agent loop via Build().
+//
+// Independent of pkg/logic/chatbot/tool_provider.go: the chatbot's
+// ToolProvider supplies tools per-channel during a chatbot turn; skill
+// tools are scoped to one skill execution. Bridging happens once, in
+// pkg/logic/skills/chatbot_provider.go, which exposes whole agent skills
+// as chatbot tools (not individual skill tools).
+//
+// Permission model is documented in
+// docs/superpowers/specs/2026-05-02-agentic-skills-design.md, "Tool
+// registry" section. Three orthogonal checks:
+//
+//  1. Save-time:    AuthoringRequirement vs caller's admin status.
+//  2. Share-time:   SafeForShare for visibility != private.
+//  3. Execute-time: SkillNameGate.
+package tool
+
+import (
+	"context"
+	"fmt"
+	"sync"
+	"time"
+
+	llm "gitea.stevedudenhoeffer.com/steve/majordomo/llm"
+)
+
+// Visibility is the spec's visibility enum mirrored here as a typed
+// string. It's redeclared (vs imported from pkg/logic/skills) to break
+// the import cycle that would otherwise form: skills → skilltools →
+// skills. The string values match that package's enum one-to-one so a
+// caller can pass `string(VisibilityPublic)` and it just works.
+type Visibility string
+
+const (
+	VisibilityPrivate Visibility = "private"
+	VisibilityShared  Visibility = "shared"
+	VisibilityPublic  Visibility = "public"
+)
+
+// Tool is what a registry entry implements. Concrete tools wrap an
+// underlying mort subsystem (e.g. wolfram, weather, paste) and produce
+// an llm.Tool on demand for a given Invocation.
+//
+// Why an interface (vs majordomo's concrete llm.Tool): we need richer
+// metadata (Permission, Categories, SkillNameGate) for the platform's
+// gating logic before we hand the tool to majordomo. BuildLLM converts
+// to llm.Tool for one execution, closing over the Invocation so the
+// per-tool handler can read CallerID/ChannelID without further plumbing.
+//
+// Why BuildLLM-per-call (vs static llm.Tool): per-user tools must close
+// over inv.CallerID — the LLM-supplied args are intentionally ignored
+// for those. Constructing the llm.Tool inside BuildLLM lets each tool
+// craft its own typed Define call while reading the invocation context.
+//
+// Test: each tool under pkg/skilltools/tools/ has its own *_test.go.
+type Tool interface {
+	Name() string
+	Description() string
+	Permission() Permission
+	// BuildLLM produces the llm.Tool for one invocation. The returned
+	// tool's name MUST equal Name(); the registry's Build() relies on
+	// this when wiring multiple tools into a Toolbox.
+	BuildLLM(inv Invocation) llm.Tool
+}
+
+// Permission summarises the three lifecycle gates plus UI metadata.
+type Permission struct {
+	// AuthoringRequirement governs who may SAVE a skill that uses
+	// this tool: anyone or admin-only.
+	AuthoringRequirement Requirement
+
+	// OperatesOn classifies whose data the tool reads: global
+	// (channel-wide, public sources) or caller (the invoking user's
+	// own data).
+	OperatesOn Scope
+
+	// SafeForShare reports whether the tool may appear in a shared or
+	// public skill. Tools that operate on caller data are typically
+	// not safe for share — the executing skill becomes a vector for
+	// reading other users' data.
+	SafeForShare bool
+
+	// Categories are free-form labels used for UI grouping (read,
+	// write, network, code, data, social). Code does NOT branch on
+	// these strings.
+	Categories []string
+
+	// SkillNameGate, if non-empty, restricts execution to the named
+	// skill. Used for wizard-only tools in v2; SkillNameGate=="" means
+	// any skill may use the tool.
+	SkillNameGate string
+}
+
+// Requirement is who is allowed to author a skill using this tool.
+type Requirement string
+
+const (
+	RequirementAnyone Requirement = "anyone"
+	RequirementAdmin  Requirement = "admin"
+)
+
+// Scope classifies the data domain a tool acts on.
+type Scope string
+
+const (
+	ScopeGlobal Scope = "global"
+	ScopeCaller Scope = "caller"
+)
+
+// ContinuationContext describes a V10 reply continuation. When set on
+// an Invocation, the skill executor reuses the parent run's KV scope,
+// renders a continuation prompt, and bumps ChainDepth for cap
+// enforcement.
+//
+// The executor reads ParentRunID to set the new run's parent_run_id
+// column (for call-tree reconstruction); ParentOutput to render the
+// "previous output you sent" line in the agent prompt; ReplyText to
+// render the "user replied with" line; ReplyMessageID for diagnostic
+// logging; and ChainDepth to compare against
+// skills.reply.max_chain_depth.
+//
+// Why ChainDepth (vs walking parent_run_id at execution time): a fresh
+// query per turn would add a DB roundtrip on every reply hop. Carrying
+// the count in the invocation is cheap and authoritative.
+type ContinuationContext struct {
+	// ParentRunID is the run that produced the message the user
+	// replied to. The new run inherits its KV scope (run:<ParentRunID>).
+	ParentRunID string
+
+	// ParentOutput is the text the parent run delivered to Discord —
+	// stored on the run row so it survives even if the parent's
+	// run-scope KV has been auto-purged (24h after parent finished).
+	ParentOutput string
+
+	// ReplyText is what the user said when they replied (the new
+	// turn's user input). May be empty if the reply was an
+	// attachment-only message; the agent should treat empty input
+	// as a "noop continuation".
+	ReplyText string
+
+	// ReplyMessageID is the Discord message ID of the user's reply.
+	// Used for audit + log breadcrumbs; not currently consumed by the
+	// agent prompt.
+	ReplyMessageID string
+
+	// ChainDepth is how many continuation hops have happened in the
+	// chain rooted at the original invocation. The router should set
+	// this to (parent's chain depth + 1). The executor rejects when
+	// it exceeds skills.reply.max_chain_depth.
+	ChainDepth int
+}
+
+// InputFile is a non-image file the user supplied with a run (audio,
+// etc.). The executor stages it into the file store under run scope and
+// surfaces its file_id to the agent. Name is a safe base name (no path
+// separators) suitable for /workspace/<name>; MimeType is the resolved
+// content type; Data is the raw bytes.
+type InputFile struct {
+	Name     string
+	MimeType string
+	Data     []byte
+}
+
+// Invocation is the runtime context passed to Tool.BuildLLM. The executor
+// builds it once per skill run and the same struct is closed over by
+// every tool's handler, so each tool sees the caller / channel identity.
+type Invocation struct {
+	SkillID   string
+	SkillName string
+	RunID     string
+	CallerID  string
+	ChannelID string
+	GuildID   string
+	// CallerIsAdmin is true when the caller is a mort admin (Member.Admin).
+	// Populated by the executor at run dispatch via Bot.GetMember; defaults
+	// to false on any lookup failure (member not found, DB error, empty
+	// CallerID for system-invoked runs). Read by tools that gate behaviour
+	// on admin status — currently `code_exec` for the v15 admin-only WAN
+	// network mode.
+	//
+	// Why a precomputed bool on Invocation (vs an AdminChecker dep on
+	// every tool): the admin lookup is read-once-per-run; every tool
+	// would otherwise have to redo the work. The executor knows the
+	// caller's admin status by the time it builds Invocation, so it
+	// stamps the field once and every tool reads it for free.
+	CallerIsAdmin bool
+	// SkillInputs is the parsed input map for the enclosing skill —
+	// available so a tool can reference values the user supplied at
+	// invocation time. Tools may read this to specialise behaviour but
+	// MUST NOT use it as a substitute for inv.CallerID-based isolation.
+	SkillInputs map[string]any
+	// ParentRunID is set when the skill was invoked via skill_invoke
+	// from a parent skill run. Empty for top-level invocations
+	// (Discord, chatbot, scheduler). Used by the loop guard in
+	// skill_invoke and by the audit log for call-tree reconstruction.
+	//
+	// Why threaded through Invocation (vs context.Value): the loop
+	// guard runs at tool-handler time, where the only context the
+	// handler sees is inv. Stuffing it into context would force a
+	// helper for unwrap on every read; an explicit field is easier to
+	// audit and impossible to forget.
+	ParentRunID string
+
+	// RootRunID is the audit run id at the ROOT of the dispatch tree
+	// this run belongs to — for a top-level run, its own RunID; for a
+	// delegated run (skill_invoke / agent_invoke / agent_spawn /
+	// palette wrappers), the outermost ancestor's. Stamped by both
+	// executors from the dispatchguard ancestor chain right after
+	// guard entry. Backs the shared `root_run:<id>` KV scope that lets
+	// parallel sibling workers coordinate (see tools/scope_validate.go
+	// + RootRunKVPartition).
+	RootRunID string
+
+	// ToolsSubset, when non-empty, narrows an AGENT run's low-level tools
+	// to the named subset of the agent's configured LowLevelTools. Set by
+	// agent_invoke's `tools_subset` arg for ephemeral fan-out — spawning a
+	// focused worker from a template (e.g. a `coder` template with only
+	// code_exec + read_page). Names outside the agent's tool menu are
+	// rejected upstream (in the invoke adapter), so by the time the
+	// executor reads this the intersection is safe. Empty = full palette.
+	// Skill runs ignore this field.
+	ToolsSubset []string
+
+	// SystemPromptPrepend, when non-empty, is prepended to an AGENT's
+	// system prompt for this invocation only — the fan-out "customized
+	// system prompt" lever (agent_invoke's `prompt_prepend` arg). It
+	// specializes a template persona to a task without mutating the
+	// persisted agent row. Skill runs ignore this field.
+	SystemPromptPrepend string
+
+	// SuppressDelivery, when true, instructs the skill executor to
+	// SKIP its OutputTarget Delivery (Deliver / DeliverError) entirely.
+	// The run still produces an output string (returned from Run) and
+	// still writes to the audit log — only the side-channel delivery
+	// (Discord channel/DM/thread post) is suppressed.
+	//
+	// Why: when the chatbot exposure adapter invokes a skill, the skill's
+	// output is already going to be consumed by the chatbot as a tool
+	// result; ALSO posting it to Discord via OutputTarget produces double
+	// output and (worse) primes the chatbot to call the tool again on
+	// the next turn after seeing its own output as a "human message",
+	// kicking off a tool-loop. The chatbot adapter sets this to true on
+	// every invocation it constructs.
+	SuppressDelivery bool
+
+	// HandlerOwnsDelivery, when true, tells the executor that the caller
+	// (typically a Discord command handler) will assemble the final
+	// user-visible reply itself — folding any deferred attachments
+	// (rows queued by send_attachments to skill_run_pending_attachments)
+	// into the same message as the text output. The executor's
+	// post-run AttachmentDrainer is skipped so the handler can drain +
+	// classify + chain-overflow + post in one place.
+	//
+	// Why an explicit flag (vs reusing SuppressDelivery): SuppressDelivery
+	// also short-circuits the OutputTarget Delivery layer (channel/dm/
+	// thread post), which is the right shape for chatbot exposure but
+	// the WRONG shape for `.agent run` — the handler still wants the
+	// audit row to land and the executor's drainer to NOT post a
+	// separate "here's an image" follow-up message after the handler's
+	// own text reply. HandlerOwnsDelivery is the narrow "the caller is
+	// taking over post-run delivery" signal that does NOT change any
+	// other executor behaviour.
+	//
+	// SuppressDelivery and HandlerOwnsDelivery are independent. The
+	// drainer is skipped when EITHER is set (the chatbot path doesn't
+	// want stray posts either; agent-run sets HandlerOwnsDelivery
+	// because it owns delivery; sub-agent dispatches set SuppressDelivery
+	// because they surface output as a tool result).
+	HandlerOwnsDelivery bool
+
+	// Priority is the v9 per-invocation priority override for the lane
+	// scheduler. When non-zero, the executor uses this value when
+	// constructing the lane Job; zero falls back to the skill's
+	// Skill.DefaultPriority. Owners are capped by convar
+	// `skills.priority_max_per_user` (default 5); admins may exceed it.
+	//
+	// Why a non-pointer (vs *int): zero means "use the default", which
+	// matches the convention everywhere else in this struct. Skills
+	// that need an explicit zero priority can store
+	// DefaultPriority=0 — the result is identical.
+	Priority int
+
+	// LaneWaitMaxSeconds is the v9 per-invocation lane backoff cap. When
+	// >0, the executor calls SubmitWithMaxWait so the run is rejected
+	// with ErrLaneBusy (surfaced as `lane_busy`) when the estimated
+	// queue wait would exceed this many seconds. 0 (default) preserves
+	// the legacy block-forever Submit semantics.
+	LaneWaitMaxSeconds int
+
+	// LaneOverride forces the run onto the named lane regardless of
+	// Skill.ExecutionLane. Used by the v9 inbound webhook handler to
+	// route webhook-triggered runs to the dedicated webhook-default
+	// lane. Empty preserves the per-skill ExecutionLane.
+	LaneOverride string
+
+	// Continuation, when non-nil, signals that this Invocation is a
+	// V10 reply continuation: a Discord user replied to a message the
+	// originating skill posted, and mort is re-invoking the skill to
+	// produce the next turn. The executor reads this field to:
+	//
+	//  - Reuse the parent run's `run:<parent_run_id>` KV scope (so any
+	//    state the prior turn saved is still readable).
+	//  - Render a continuation block at the top of the agent's user
+	//    prompt that includes the parent output + reply text.
+	//  - Enforce the per-deployment chain-depth cap
+	//    (skills.reply.max_chain_depth, default 20).
+	//  - Stamp parent_run_id on the new run for call-tree
+	//    reconstruction in audit + UI.
+	//
+	// Why a pointer struct (vs flat fields): all five fields are
+	// meaningful only together — splitting them would invite
+	// half-populated states. nil = "this is a fresh invocation, not a
+	// continuation".
+	Continuation *ContinuationContext
+
+	// SourceWebhookSecretMatched is set true by the inbound webhook
+	// handler AFTER it has validated both the URL secret AND the HMAC
+	// signature for the named skill. It signals to System.Run that the
+	// caller is authenticated by a per-skill secret (not by Discord
+	// identity), so the visibility / owner gate in CanInvoke should be
+	// bypassed for THIS skill (matching SkillID). All other gates —
+	// pinned_version, budget caps, lane caps — still apply.
+	//
+	// Hotfix-5 Bug 1: pre-fix the webhook handler built an Invocation
+	// with CallerID=`<webhook>:<source-IP>` and dispatched through
+	// System.Run. CanInvoke saw a non-owner non-admin caller against a
+	// private skill and rejected with HTTP 500 ("caller is not
+	// permitted to invoke skill"). The cure isn't to weaken
+	// CanInvoke's general-purpose policy — it's to recognise that a
+	// matched secret IS the auth gate for the named skill.
+	//
+	// Why per-Invocation (vs a separate gate path): the executor uses
+	// Run as the single canonical dispatch point — adding a second
+	// "authenticated dispatch" entry would split run-recording, lane
+	// dispatch, and audit emission into two parallel implementations.
+	SourceWebhookSecretMatched bool
+
+	// OnEvent, when non-nil, is called by the executor at run
+	// boundaries and by the agent loop on each tool dispatch. The
+	// bot's command handler closes over the invoking message and
+	// reacts an emoji from the skill's StateReactEmoji map. Nil-safe.
+	//
+	// Event names:
+	//   "__start__" — right before agent.Run starts
+	//   "__end__"   — on successful completion
+	//   "__error__" — on terminal error
+	//   <tool_name> — when a tool dispatches (any registered tool)
+	//
+	// The executor passes the resolved emoji as `emoji` so callers
+	// don't have to look it up themselves; emoji=="" means "no react
+	// for this event" and callers should skip the react entirely.
+	//
+	// Why a callback (vs a state-react map carried in the Invocation):
+	// the lookup table lives on the Skill, not the Invocation, but the
+	// caller-supplied side effect (a Discord react) lives on the bot
+	// command surface. A callback bridges the two without forcing the
+	// executor to import discord types and without forcing the bot
+	// command surface to know about the Skill's emoji map shape.
+	OnEvent func(ctx context.Context, event string, emoji string)
+
+	// OnToolEvent, when non-nil, is called by the executor on each tool
+	// dispatch with phase "start" (before the tool runs) then "end" or
+	// "error" (after it completes, with the result text in detail). Distinct
+	// from OnEvent (which is the emoji state-react hook): this carries the
+	// tool name + args/result so an out-of-band caller — e.g. the mortise
+	// chat API streaming SSE tool.start/tool.end frames — can surface live
+	// tool-progress. Nil-safe; the callback MUST be fast and non-blocking
+	// (it runs on the agent-loop goroutine).
+	OnToolEvent func(ctx context.Context, toolName, phase, detail string)
+
+	// OnStep, when non-nil, is called by the executor as the agent loop
+	// makes progress — currently once per tool call: phase "start" before
+	// the tool runs, phase "end" after it completes (StepEvent.Step.Status
+	// is "complete" or "error"). Correlate the two by StepEvent.Step.ID.
+	// "delta" is reserved for progressive detail and is unused today.
+	//
+	// Distinct from OnToolEvent (the raw tool-name/result hook): OnStep
+	// carries a richer, presentation-ready Step (kind + human present-tense
+	// summary) so an out-of-band consumer — e.g. the mortise chat API
+	// streaming SSE step.start/step.end frames — can render structured
+	// progress without re-deriving it. The executor ALSO accumulates the
+	// same Steps onto its run Result, so persistence does not depend on
+	// this callback being set. Nil-safe; the callback MUST be fast and
+	// non-blocking (it runs on the agent-loop goroutine).
+	OnStep func(ctx context.Context, ev StepEvent)
+
+	// InvokingMessageID is the Discord message ID of the user's command
+	// that triggered this run, when it was triggered by a Discord text
+	// command. Used by delivery to thread the reply (Discord native
+	// reply with the gray quote bar + jump link). Empty for chatbot
+	// exposure, scheduled, or webhook invocations — delivery falls
+	// back to a plain channel post for those.
+	//
+	// Why threaded through Invocation (vs a separate field on Skill or
+	// a magic SkillInputs key): the message ID is per-invocation, not
+	// per-skill, and the delivery layer is the natural reader. Direct
+	// field on Invocation matches the existing ChannelID / GuildID
+	// fields' shape.
+	InvokingMessageID string
+
+	// Images carries multi-modal image content for the initial user
+	// message. When non-empty, the executor builds the initial user
+	// message with llm.UserParts(text + image parts) instead of plain
+	// llm.UserText. Populated by callers that extract images from Discord
+	// attachments or URLs in prompt text (pkg/imageutil downloads the
+	// bytes — majordomo image parts are bytes-only). Nil = text-only.
+	Images []llm.ImagePart
+
+	// InputFiles carries non-image attachments (audio, etc.) the user
+	// supplied with the run. Unlike Images, these are NOT inlined into
+	// the model's context — the LLM can't ingest raw mp3/wav/midi bytes.
+	// Instead the executor stages each into the skill file store under
+	// run scope and tells the agent the resulting file_ids (in the
+	// prompt) so it can hand one to a worker tool (e.g. code_exec
+	// files_in → /workspace/<name>) for processing. Nil = none.
+	InputFiles []InputFile
+
+	// ExtraTools are additional llm.Tool instances injected for this
+	// run only. They are appended to the palette after registry-built
+	// tools, skill-palette wrappers, and sub-agent wrappers. Use this
+	// for session-specific tools that cannot be pre-registered in the
+	// catalog (e.g., scaddy's write_scad which needs per-session
+	// workspace + renderer state).
+	//
+	// Why on Invocation (vs a dedicated Run parameter): the Invocation
+	// is the per-run context carrier in mort's execution path. Adding
+	// a separate ExtraTools arg to Executor.Run would fork the
+	// signature for one use case; a field on the existing carrier
+	// keeps the surface stable.
+	ExtraTools []llm.Tool
+
+	// SessionToolFactory, if set, is called with the live AgentSession
+	// after the executor constructs the agent but before it runs. It
+	// returns a SessionTools struct carrying the tools to add, an
+	// optional PostRun hook for post-processing (e.g., rendering final
+	// artifacts from workspace state), and an optional Cleanup func for
+	// resource teardown. Types are defined in session_tools.go.
+	//
+	// Why a factory (vs ExtraTools): ExtraTools are static — they
+	// don't have access to the running agent. Tools that need to call
+	// session.AttachImages (to show rendered previews to the model on
+	// its next turn) require the live session handle that only exists
+	// after construction. The factory receives that handle.
+	SessionToolFactory SessionToolFactory
+
+	// PostRunDelivery, if set, is called by the agent command handler
+	// (`.agent run`) INSTEAD of the default text + paste-fallback reply
+	// when the executor's result carries a PostRunResult. The callback
+	// receives the Discord message to reply to, the agent's text output,
+	// and the PostRunResult. It returns the message ID of the primary
+	// reply (for origin recording) and any error.
+	//
+	// Why a callback on Invocation (vs a handler method on the agent):
+	// delivery needs services (paste, filetransfer, Discord session)
+	// that live outside the agents package. A callback lets the adapter
+	// (e.g., scaddy) close over the services at factory-build time
+	// without adding service dependencies to the agents.System struct.
+	//
+	// When nil, `handleRun` falls through to the standard text-based
+	// reply path (formatRunReply + postRunReply). When set, the
+	// callback owns the ENTIRE reply — `handleRun` does NOT post a
+	// text reply alongside it.
+	PostRunDelivery func(ctx context.Context, channelID, replyToMsgID string, output string, prr *PostRunResult) (primaryMsgID string, err error)
+
+	// RunState, when set by the executor, lets a tool read the live
+	// run's progress + budget snapshot (iteration vs cap, tool calls,
+	// tokens, cost, elapsed). Nil on paths that do not provide it (e.g.
+	// the no-tools direct path, or executors that predate the hook).
+	// The skill_self_status tool reads this.
+	RunState RunStateAccessor
+
+	// AttachImages, when set by the executor, queues a user-role message
+	// (optional text + image parts) into the LIVE run so the model sees
+	// the images on its next step — the same steer-mailbox mechanism the
+	// SessionToolFactory's AgentSession exposes, but reachable from any
+	// ordinary tool handler. A tool returns text; images cannot ride a
+	// string result, so a tool that fetches images the model must SEE
+	// (e.g. discord_list_recent_messages reading channel history) calls
+	// this to feed the pixels in. Nil on paths that do not own a steer
+	// mailbox (skillexec, the no-tools direct path); tools MUST nil-check
+	// before calling and degrade to text-only when it is nil.
+	AttachImages func(text string, images ...llm.ImagePart)
+
+	// gate / audit are populated by the registry's Build before
+	// BuildLLM is called. Tools should call CheckGate(inv) at the top
+	// of their handler and EmitAudit(inv, ...) when reporting tool
+	// results. The fields are unexported in the public surface but
+	// available to tools via the helpers in helpers.go.
+	gate         string
+	currentSkill string
+	audit        AuditHook
+	toolName     string
+}
+
+// RunState is a live, read-only snapshot of the current run's progress
+// and budget. Populated on demand by the executor's per-run accessor
+// (see Invocation.RunState).
+type RunState struct {
+	Iteration      int
+	MaxIterations  int
+	ToolCalls      int
+	MaxToolCalls   int
+	InputTokens    int64
+	OutputTokens   int64
+	ThinkingTokens int64
+	ElapsedSeconds int
+}
+
+// RunStateAccessor returns the live RunState for the enclosing run. The
+// executor builds one per run and stamps it on Invocation.RunState
+// before the toolbox is built; tools read it via inv.RunState. Nil on
+// any path that does not provide it.
+type RunStateAccessor interface {
+	RunState() RunState
+}
+
+// Registry is the interface to the tool catalog. Concrete impl is
+// the package-private *registry struct returned by NewRegistry.
+type Registry interface {
+	Register(t Tool) error
+	Get(name string) (Tool, bool)
+	List() []Tool
+	// Build returns an llm.Toolbox with each named tool prepared for
+	// execution against the given invocation. Save-time authoring
+	// checks happen elsewhere (CheckAuthoring in checks.go) — Build
+	// trusts that the skill was already saved past those gates and
+	// only re-checks runtime invariants:
+	//
+	//  1. Share-safety drift: rejects an unsafe tool when visibility
+	//     != private.
+	//  2. SkillNameGate enforcement is delegated to the per-tool
+	//     handler via CheckGate, which reads invocation context.
+	//  3. Audit emission via EmitAudit (also per-tool).
+	//
+	// The optional `trusted` variadic argument lets the caller declare
+	// the skill as trusted infrastructure (a builtin loaded from disk
+	// by the project's own loader) so the share-safety drift check is
+	// skipped. Builtins legitimately ship with public visibility AND
+	// not-safe-for-share tools (e.g. skill-wizard's wizard_* tools),
+	// and the loader bypasses save-time gates by design — applying the
+	// share-safety check at invocation would be inconsistent with the
+	// rest of the trusted-builtin contract. Pass true ONLY for builtins
+	// (Skill.Source == SourceBuiltin / OwnerID == ""). Variadic so the
+	// existing call sites (and tests) compile unchanged.
+	Build(names []string, inv Invocation, vis Visibility, audit AuditHook, trusted ...bool) (*llm.Toolbox, error)
+}
+
+// AuditHook is invoked synchronously around each tool call. Implementations
+// typically forward to skillaudit.Writer. May be nil for tests.
+type AuditHook func(call AuditCall)
+
+// AuditCall describes one tool invocation. Result is set on success;
+// Err is set on failure. Both may be present together (e.g. the tool
+// returned partial output then errored).
+type AuditCall struct {
+	Tool   string
+	Args   string
+	Result string
+	Err    error
+}
+
+// Step is one unit of agent progress surfaced to a consumer of OnStep
+// (and accumulated onto the executor's run Result). Today there is one
+// Step per tool call; the shape is deliberately open so future kinds
+// (a coalesced reasoning beat, a sub-agent delegation) slot in without a
+// wire change.
+//
+// This is a plain DTO — no HTTP/Discord/JSON-tag coupling beyond the
+// neutral snake_case tags a transport may reuse. The chat API converts
+// it to its own persisted/wire type; Discord/cron consumers read the
+// Result field directly.
+type Step struct {
+	// ID is stable per-step and unique within one run; it is the
+	// correlation key between the "start" and "end" emissions.
+	ID string `json:"id"`
+	// Kind is an open vocabulary (search, read, code, image, file,
+	// memory, delegate, tool, …); consumers map known values to an icon
+	// and fall back for unknown ones. Never drop a step for an
+	// unrecognised kind.
+	Kind string `json:"kind"`
+	// Title is a short machine-ish label (typically the raw tool name).
+	Title string `json:"title,omitempty"`
+	// Summary is the human present-tense one-liner ("Searching the web
+	// for …"); on end it may be replaced with a result phrase.
+	Summary string `json:"summary"`
+	// Status is "running" | "complete" | "error".
+	Status string `json:"status"`
+	// Detail is optional, user-safe, size-capped markdown. Never raw tool
+	// output, credentials, or chain-of-thought.
+	Detail string `json:"detail,omitempty"`
+	// StartedAt is when the step began.
+	StartedAt time.Time `json:"started_at"`
+	// EndedAt is set on the terminal "end" emission.
+	EndedAt *time.Time `json:"ended_at,omitempty"`
+}
+
+// StepEvent is one live emission to OnStep. Phase is "start" or "end"
+// ("delta" is reserved for progressive detail and unused today). Step
+// carries the full current snapshot; Detail holds the delta text when
+// Phase == "delta".
+type StepEvent struct {
+	Phase  string
+	Step   Step
+	Detail string
+}
+
+// NewRegistry constructs an empty registry. Call Register for each tool;
+// see pkg/skilltools/default_registry.go for the v1 set.
+func NewRegistry() Registry {
+	return &registry{tools: make(map[string]Tool)}
+}
+
+type registry struct {
+	mu    sync.RWMutex
+	tools map[string]Tool
+}
+
+func (r *registry) Register(t Tool) error {
+	if t == nil {
+		return fmt.Errorf("skilltools: nil tool")
+	}
+	name := t.Name()
+	if name == "" {
+		return fmt.Errorf("skilltools: tool with empty name")
+	}
+	r.mu.Lock()
+	defer r.mu.Unlock()
+	if _, dup := r.tools[name]; dup {
+		return fmt.Errorf("skilltools: duplicate tool name %q", name)
+	}
+	r.tools[name] = t
+	return nil
+}
+
+func (r *registry) Get(name string) (Tool, bool) {
+	r.mu.RLock()
+	defer r.mu.RUnlock()
+	t, ok := r.tools[name]
+	return t, ok
+}
+
+func (r *registry) List() []Tool {
+	r.mu.RLock()
+	defer r.mu.RUnlock()
+	out := make([]Tool, 0, len(r.tools))
+	for _, t := range r.tools {
+		out = append(out, t)
+	}
+	return out
+}
+
+// Build prepares an llm.Toolbox for one skill execution.
+//
+// Why: each tool needs to know the caller / channel / skill name plus
+// the audit hook. Stuffing them into Invocation lets each Tool.BuildLLM
+// produce a closure that has everything it needs without further
+// plumbing.
+//
+// Defence in depth: rejects an unsafe tool when visibility != private —
+// the share-time check should already have prevented this; this catches
+// drift (e.g. a tool's SafeForShare flag flipping after a skill was saved).
+//
+// The trusted variadic flag lets a caller bypass the share-safety drift
+// check for builtin (trusted-infrastructure) skills. The mortventure /
+// skill-wizard builtins legitimately ship with public visibility AND
+// not-safe-for-share tools — the loader bypasses save-time gates and
+// the share-safety check at invocation would block them inconsistently.
+// Pass true ONLY for builtins.
+func (r *registry) Build(names []string, inv Invocation, vis Visibility, audit AuditHook, trusted ...bool) (*llm.Toolbox, error) {
+	isTrusted := len(trusted) > 0 && trusted[0]
+	box := llm.NewToolbox("skilltools")
+	for _, name := range names {
+		t, ok := r.Get(name)
+		if !ok {
+			return nil, fmt.Errorf("skilltools: unknown tool %q", name)
+		}
+
+		if !isTrusted && vis != VisibilityPrivate && !t.Permission().SafeForShare {
+			return nil, fmt.Errorf("skilltools: tool %q is not safe for share but skill visibility is %s", name, vis)
+		}
+
+		// Populate the gate/audit fields on the Invocation so the tool
+		// can call CheckGate / EmitAudit from its handler.
+		toolInv := inv
+		toolInv.gate = t.Permission().SkillNameGate
+		toolInv.currentSkill = inv.SkillName
+		toolInv.audit = audit
+		toolInv.toolName = name
+
+		built := t.BuildLLM(toolInv)
+		if built.Name == "" {
+			return nil, fmt.Errorf("skilltools: tool %q built llm.Tool with empty name", name)
+		}
+		if err := box.Add(built); err != nil {
+			return nil, fmt.Errorf("skilltools: adding tool %q: %w", name, err)
+		}
+	}
+	return box, nil
+}