P1 (part 1): move skilltools core -> tool/ (clean, verbatim)
executus CI / test (push) Successful in 36s
The tool registry core (registry, permission model, Invocation, gated-tool wrapper, ssrf guard, hmac, encryption, argcoerce, helpers, rootrun, session_tools, webhook_rate_limit) had zero mort coupling — it imports only majordomo/llm + x/crypto/hkdf — so it moves verbatim with a package rename (skilltools -> tool). All same-package tests came along and pass; the SSRF, gated-wrapper, encryption and output-pattern invariants are re-anchored here.

majordomo re-enters the module graph (now pinned to the latest, incl. the front-loaded-output fix). model/ + llmmeta + structured follow next.

Docs: CLAUDE.md now requires README/examples to stay in sync with changes in the same commit; CI skips docs/example-only pushes via paths-ignore.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,701 @@
// Package skilltools is the tool registry for the agentic skills platform.
// Tools registered here can be referenced by name from a Skill's Tools
// list and are surfaced to the underlying majordomo agent loop via Build().
//
// Independent of pkg/logic/chatbot/tool_provider.go: the chatbot's
// ToolProvider supplies tools per-channel during a chatbot turn; skill
// tools are scoped to one skill execution. Bridging happens once, in
// pkg/logic/skills/chatbot_provider.go, which exposes whole agent skills
// as chatbot tools (not individual skill tools).
//
// Permission model is documented in
// docs/superpowers/specs/2026-05-02-agentic-skills-design.md, "Tool
// registry" section. Three orthogonal checks:
//
//  1. Save-time: AuthoringRequirement vs caller's admin status.
//  2. Share-time: SafeForShare for visibility != private.
//  3. Execute-time: SkillNameGate.
package tool

import (
	"context"
	"fmt"
	"sync"
	"time"

	llm "gitea.stevedudenhoeffer.com/steve/majordomo/llm"
)

// Visibility is the spec's visibility enum mirrored here as a typed
// string. It's redeclared (vs imported from pkg/logic/skills) to break
// the import cycle that would otherwise form: skills → skilltools →
// skills. The string values match Visibility one-to-one so a
// caller can pass `string(VisibilityPublic)` and it just works.
type Visibility string

const (
	VisibilityPrivate Visibility = "private"
	VisibilityShared  Visibility = "shared"
	VisibilityPublic  Visibility = "public"
)

// Tool is what a registry entry implements. Concrete tools wrap an
// underlying mort subsystem (e.g. wolfram, weather, paste) and produce
// an llm.Tool on demand for a given Invocation.
//
// Why an interface (vs majordomo's concrete llm.Tool): we need richer
// metadata (Permission, Categories, SkillNameGate) for the platform's
// gating logic before we hand the tool to majordomo. BuildLLM converts
// to llm.Tool for one execution, closing over the Invocation so the
// per-tool handler can read CallerID/ChannelID without further plumbing.
//
// Why BuildLLM-per-call (vs static llm.Tool): per-user tools must close
// over inv.CallerID — the LLM-supplied args are intentionally ignored
// for those. Constructing the llm.Tool inside BuildLLM lets each tool
// craft its own typed Define call while reading the invocation context.
//
// Test: each tool under pkg/skilltools/tools/ has its own *_test.go.
type Tool interface {
	Name() string
	Description() string
	Permission() Permission
	// BuildLLM produces the llm.Tool for one invocation. The returned
	// tool's name MUST equal Name(); the registry's Build() relies on
	// this when wiring multiple tools into a Toolbox.
	BuildLLM(inv Invocation) llm.Tool
}

// Permission summarises the three lifecycle gates plus UI metadata.
type Permission struct {
	// AuthoringRequirement governs who may SAVE a skill that uses
	// this tool: anyone or admin-only.
	AuthoringRequirement Requirement

	// OperatesOn classifies whose data the tool reads: global
	// (channel-wide, public sources) or caller (the invoking user's
	// own data).
	OperatesOn Scope

	// SafeForShare reports whether the tool may appear in a shared or
	// public skill. Tools that operate on caller data are typically
	// not safe for share — the executing skill becomes a vector for
	// reading other users' data.
	SafeForShare bool

	// Categories are free-form labels used for UI grouping (read,
	// write, network, code, data, social). Code does NOT branch on
	// these strings.
	Categories []string

	// SkillNameGate, if non-empty, restricts execution to the named
	// skill. Used for wizard-only tools in v2; SkillNameGate=="" means
	// any skill may use the tool.
	SkillNameGate string
}
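The three gates that Permission summarises (save-time, share-time, execute-time) can be sketched as three small checks. This is an illustrative sketch under stated assumptions: the types are local copies trimmed to the fields involved, and `checkSave`, `checkShare`, and `checkExecute` are hypothetical names, not the package's real helpers (the source mentions CheckAuthoring and CheckGate but does not show them here).

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins mirroring the declarations in this file.
type Requirement string
type Visibility string

const (
	RequirementAnyone Requirement = "anyone"
	RequirementAdmin  Requirement = "admin"

	VisibilityPrivate Visibility = "private"
	VisibilityPublic  Visibility = "public"
)

type Permission struct {
	AuthoringRequirement Requirement
	SafeForShare         bool
	SkillNameGate        string
}

// checkSave is the save-time gate: admin-only tools need an admin author.
func checkSave(p Permission, callerIsAdmin bool) error {
	if p.AuthoringRequirement == RequirementAdmin && !callerIsAdmin {
		return errors.New("tool requires an admin author")
	}
	return nil
}

// checkShare is the share-time gate: unsafe tools may only appear in
// private skills.
func checkShare(p Permission, vis Visibility) error {
	if vis != VisibilityPrivate && !p.SafeForShare {
		return errors.New("tool is not safe for shared or public skills")
	}
	return nil
}

// checkExecute is the execute-time gate: an empty SkillNameGate allows
// any skill, otherwise only the named skill may run the tool.
func checkExecute(p Permission, skillName string) error {
	if p.SkillNameGate != "" && p.SkillNameGate != skillName {
		return fmt.Errorf("tool is gated to skill %q", p.SkillNameGate)
	}
	return nil
}

func main() {
	p := Permission{AuthoringRequirement: RequirementAdmin, SkillNameGate: "skill-wizard"}
	fmt.Println("save as non-admin:", checkSave(p, false))
	fmt.Println("share as public:", checkShare(p, VisibilityPublic))
	fmt.Println("execute in skill-wizard:", checkExecute(p, "skill-wizard"))
}
```

The checks are independent of each other, which matches the "three orthogonal checks" wording: passing one gate says nothing about the other two.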

// Requirement is who is allowed to author a skill using this tool.
type Requirement string

const (
	RequirementAnyone Requirement = "anyone"
	RequirementAdmin  Requirement = "admin"
)

// Scope classifies the data domain a tool acts on.
type Scope string

const (
	ScopeGlobal Scope = "global"
	ScopeCaller Scope = "caller"
)

// ContinuationContext describes a V10 reply continuation. When set on
// an Invocation, the skill executor reuses the parent run's KV scope,
// renders a continuation prompt, and bumps ChainDepth for cap
// enforcement.
//
// The executor reads ParentRunID to set the new run's parent_run_id
// column (for call-tree reconstruction); ParentOutput to render the
// "previous output you sent" line in the agent prompt; ReplyText to
// render the "user replied with" line; ReplyMessageID for diagnostic
// logging; and ChainDepth to compare against
// skills.reply.max_chain_depth.
//
// Why ChainDepth (vs walking parent_run_id at execution time): a fresh
// query per turn would add a DB roundtrip on every reply hop. Carrying
// the count in the invocation is cheap and authoritative.
type ContinuationContext struct {
	// ParentRunID is the run that produced the message the user
	// replied to. The new run inherits its KV scope (run:<ParentRunID>).
	ParentRunID string

	// ParentOutput is the text the parent run delivered to Discord —
	// stored on the run row so it survives even if the parent's
	// run-scope KV has been auto-purged (24h after parent finished).
	ParentOutput string

	// ReplyText is what the user said when they replied (the new
	// turn's user input). May be empty if the reply was an attachment-
	// only message (handle gracefully — agent should handle empty
	// input as a "noop continuation").
	ReplyText string

	// ReplyMessageID is the Discord message ID of the user's reply.
	// Used for audit + log breadcrumbs; not currently consumed by the
	// agent prompt.
	ReplyMessageID string

	// ChainDepth is how many continuation hops have happened in the
	// chain rooted at the original invocation. The router should set
	// this to (parent's chain depth + 1). The executor rejects when
	// it exceeds skills.reply.max_chain_depth.
	ChainDepth int
}
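The ChainDepth contract described above (the router sets parent depth + 1, the executor rejects once the cap is exceeded) might look like this. It is a sketch only: `nextContinuation` and `checkChainDepth` are hypothetical helpers, the struct is trimmed to two fields, and `maxDepth` stands in for the skills.reply.max_chain_depth setting.

```go
package main

import "fmt"

// ContinuationContext is trimmed to the fields this sketch needs.
type ContinuationContext struct {
	ParentRunID string
	ChainDepth  int
}

// nextContinuation plays the router's role: the new hop's depth is the
// parent's depth plus one, so no DB walk over parent_run_id is needed.
func nextContinuation(parentRunID string, parentDepth int) *ContinuationContext {
	return &ContinuationContext{ParentRunID: parentRunID, ChainDepth: parentDepth + 1}
}

// checkChainDepth plays the executor's role: a nil context is a fresh
// invocation and always passes; otherwise the depth must not exceed the cap.
func checkChainDepth(c *ContinuationContext, maxDepth int) error {
	if c == nil {
		return nil
	}
	if c.ChainDepth > maxDepth {
		return fmt.Errorf("continuation depth %d exceeds cap %d", c.ChainDepth, maxDepth)
	}
	return nil
}

func main() {
	const maxDepth = 2 // small cap for the demo; the documented default is 20
	depth := 0
	for hop := 1; hop <= 3; hop++ {
		c := nextContinuation(fmt.Sprintf("run-%d", hop-1), depth)
		depth = c.ChainDepth
		fmt.Printf("hop %d: depth=%d err=%v\n", hop, c.ChainDepth, checkChainDepth(c, maxDepth))
	}
}
```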

// InputFile is a non-image file the user supplied with a run (audio,
// etc.). The executor stages it into the file store under run scope and
// surfaces its file_id to the agent. Name is a safe base name (no path
// separators) suitable for /workspace/<name>; MimeType is the resolved
// content type; Data is the raw bytes.
type InputFile struct {
	Name     string
	MimeType string
	Data     []byte
}

// Invocation is the runtime context passed to Tool.BuildLLM. The executor
// builds it once per skill run and the same struct is closed over by
// every tool's handler, so each tool sees the caller / channel identity.
type Invocation struct {
	SkillID   string
	SkillName string
	RunID     string
	CallerID  string
	ChannelID string
	GuildID   string
	// CallerIsAdmin is true when the caller is a mort admin (Member.Admin).
	// Populated by the executor at run dispatch via Bot.GetMember; defaults
	// to false on any lookup failure (member not found, DB error, empty
	// CallerID for system-invoked runs). Read by tools that gate behaviour
	// on admin status — currently `code_exec` for the v15 admin-only WAN
	// network mode.
	//
	// Why a precomputed bool on Invocation (vs an AdminChecker dep on
	// every tool): the admin lookup is read-once-per-run; every tool
	// would otherwise have to redo the work. The executor knows the
	// caller's admin status by the time it builds Invocation, so it
	// stamps the field once and every tool reads it for free.
	CallerIsAdmin bool
	// SkillInputs is the parsed input map for the enclosing skill —
	// available so a tool can reference values the user supplied at
	// invocation time. Tools may read this to specialise behaviour but
	// MUST NOT use it as a substitute for inv.CallerID-based isolation.
	SkillInputs map[string]any
	// ParentRunID is set when the skill was invoked via skill_invoke
	// from a parent skill run. Empty for top-level invocations
	// (Discord, chatbot, scheduler). Used by the loop guard in
	// skill_invoke and by the audit log for call-tree reconstruction.
	//
	// Why threaded through Invocation (vs context.Value): the loop
	// guard runs at tool-handler time, where the only context the
	// handler sees is inv. Stuffing it into context would force a
	// helper for unwrap on every read; an explicit field is easier to
	// audit and impossible to forget.
	ParentRunID string

	// RootRunID is the audit run id at the ROOT of the dispatch tree
	// this run belongs to — for a top-level run, its own RunID; for a
	// delegated run (skill_invoke / agent_invoke / agent_spawn /
	// palette wrappers), the outermost ancestor's. Stamped by both
	// executors from the dispatchguard ancestor chain right after
	// guard entry. Backs the shared `root_run:<id>` KV scope that lets
	// parallel sibling workers coordinate (see tools/scope_validate.go
	// + RootRunKVPartition).
	RootRunID string

	// ToolsSubset, when non-empty, narrows an AGENT run's low-level tools
	// to the named subset of the agent's configured LowLevelTools. Set by
	// agent_invoke's `tools_subset` arg for ephemeral fan-out — spawning a
	// focused worker from a template (e.g. a `coder` template with only
	// code_exec + read_page). Names outside the agent's tool menu are
	// rejected upstream (in the invoke adapter), so by the time the
	// executor reads this the intersection is safe. Empty = full palette.
	// Skill runs ignore this field.
	ToolsSubset []string

	// SystemPromptPrepend, when non-empty, is prepended to an AGENT's
	// system prompt for this invocation only — the fan-out "customized
	// system prompt" lever (agent_invoke's `prompt_prepend` arg). It
	// specializes a template persona to a task without mutating the
	// persisted agent row. Skill runs ignore this field.
	SystemPromptPrepend string

	// SuppressDelivery, when true, instructs the skill executor to
	// SKIP its OutputTarget Delivery (Deliver / DeliverError) entirely.
	// The run still produces an output string (returned from Run) and
	// still writes to the audit log — only the side-channel delivery
	// (Discord channel/DM/thread post) is suppressed.
	//
	// Why: when the chatbot exposure adapter invokes a skill, the skill's
	// output is already going to be consumed by the chatbot as a tool
	// result; ALSO posting it to Discord via OutputTarget produces double
	// output and (worse) primes the chatbot to call the tool again on
	// the next turn after seeing its own output as a "human message",
	// kicking off a tool-loop. The chatbot adapter sets this to true on
	// every invocation it constructs.
	SuppressDelivery bool

	// HandlerOwnsDelivery, when true, tells the executor that the caller
	// (typically a Discord command handler) will assemble the final
	// user-visible reply itself — folding any deferred attachments
	// (rows queued by send_attachments to skill_run_pending_attachments)
	// into the same message as the text output. The executor's
	// post-run AttachmentDrainer is skipped so the handler can drain +
	// classify + chain-overflow + post in one place.
	//
	// Why an explicit flag (vs reusing SuppressDelivery): SuppressDelivery
	// also short-circuits the OutputTarget Delivery layer (channel/dm/
	// thread post), which is the right shape for chatbot exposure but
	// the WRONG shape for `.agent run` — the handler still wants the
	// audit row to land and the executor's drainer to NOT post a
	// separate "here's an image" follow-up message after the handler's
	// own text reply. HandlerOwnsDelivery is the narrow "the caller is
	// taking over post-run delivery" signal that does NOT change any
	// other executor behaviour.
	//
	// SuppressDelivery and HandlerOwnsDelivery are independent. The
	// drainer is skipped when EITHER is set (the chatbot path doesn't
	// want stray posts either; agent-run sets HandlerOwnsDelivery
	// because it owns delivery; sub-agent dispatches set SuppressDelivery
	// because they surface output as a tool result).
	HandlerOwnsDelivery bool

	// Priority is the v9 per-invocation priority override for the lane
	// scheduler. When non-zero, the executor uses this value when
	// constructing the lane Job; zero falls back to the skill's
	// Skill.DefaultPriority. Owners are capped by convar
	// `skills.priority_max_per_user` (default 5); admins may exceed it.
	//
	// Why a non-pointer (vs *int): zero means "use the default", which
	// matches the convention everywhere else in this struct. Skills
	// that need an explicit zero priority can store
	// DefaultPriority=0 — the result is identical.
	Priority int

	// LaneWaitMaxSeconds is the v9 per-invocation lane backoff cap. When
	// >0, the executor calls SubmitWithMaxWait so the run is rejected
	// with ErrLaneBusy (surfaced as `lane_busy`) when the estimated
	// queue wait would exceed this many seconds. 0 (default) preserves
	// the legacy block-forever Submit semantics.
	LaneWaitMaxSeconds int

	// LaneOverride forces the run onto the named lane regardless of
	// Skill.ExecutionLane. Used by the v9 inbound webhook handler to
	// route webhook-triggered runs to the dedicated webhook-default
	// lane. Empty preserves the per-skill ExecutionLane.
	LaneOverride string

	// Continuation, when non-nil, signals that this Invocation is a
	// V10 reply continuation: a Discord user replied to a message the
	// originating skill posted, and mort is re-invoking the skill to
	// produce the next turn. The executor reads this field to:
	//
	//   - Reuse the parent run's `run:<parent_run_id>` KV scope (so any
	//     state the prior turn saved is still readable).
	//   - Render a continuation block at the top of the agent's user
	//     prompt that includes the parent output + reply text.
	//   - Enforce the per-deployment chain-depth cap
	//     (skills.reply.max_chain_depth, default 20).
	//   - Stamp parent_run_id on the new run for call-tree
	//     reconstruction in audit + UI.
	//
	// Why a pointer struct (vs flat fields): all five fields are
	// meaningful only together — splitting them would invite
	// half-populated states. nil = "this is a fresh invocation, not a
	// continuation".
	Continuation *ContinuationContext

	// SourceWebhookSecretMatched is set true by the inbound webhook
	// handler AFTER it has validated both the URL secret AND the HMAC
	// signature for the named skill. It signals to System.Run that the
	// caller is authenticated by a per-skill secret (not by Discord
	// identity), so the visibility / owner gate in CanInvoke should be
	// bypassed for THIS skill (matching SkillID). All other gates —
	// pinned_version, budget caps, lane caps — still apply.
	//
	// Hotfix-5 Bug 1: pre-fix the webhook handler built an Invocation
	// with CallerID=`<webhook>:<source-IP>` and dispatched through
	// System.Run. CanInvoke saw a non-owner non-admin caller against a
	// private skill and rejected with HTTP 500 ("caller is not
	// permitted to invoke skill"). The cure isn't to weaken
	// CanInvoke's general-purpose policy — it's to recognise that a
	// matched secret IS the auth gate for the named skill.
	//
	// Why per-Invocation (vs a separate gate path): the executor uses
	// Run as the single canonical dispatch point — adding a second
	// "authenticated dispatch" entry would split run-recording, lane
	// dispatch, and audit emission into two parallel implementations.
	SourceWebhookSecretMatched bool

	// OnEvent, when non-nil, is called by the executor at run
	// boundaries and by the agent loop on each tool dispatch. The
	// bot's command handler closes over the invoking message and
	// reacts an emoji from the skill's StateReactEmoji map. Nil-safe.
	//
	// Event names:
	//   "__start__" — right before agent.Run starts
	//   "__end__" — on successful completion
	//   "__error__" — on terminal error
	//   <tool_name> — when a tool dispatches (any registered tool)
	//
	// The executor passes the resolved emoji as `emoji` so callers
	// don't have to look it up themselves; emoji=="" means "no react
	// for this event" and callers should skip the react entirely.
	//
	// Why a callback (vs a state-react map carried in the Invocation):
	// the lookup table lives on the Skill, not the Invocation, but the
	// caller-supplied side effect (a Discord react) lives on the bot
	// command surface. A callback bridges the two without forcing the
	// executor to import discord types and without forcing the bot
	// command surface to know about the Skill's emoji map shape.
	OnEvent func(ctx context.Context, event string, emoji string)

	// OnToolEvent, when non-nil, is called by the executor on each tool
	// dispatch with phase "start" (before the tool runs) then "end" or
	// "error" (after it completes, with the result text in detail). Distinct
	// from OnEvent (which is the emoji state-react hook): this carries the
	// tool name + args/result so an out-of-band caller — e.g. the mortise
	// chat API streaming SSE tool.start/tool.end frames — can surface live
	// tool-progress. Nil-safe; the callback MUST be fast and non-blocking
	// (it runs on the agent-loop goroutine).
	OnToolEvent func(ctx context.Context, toolName, phase, detail string)

	// OnStep, when non-nil, is called by the executor as the agent loop
	// makes progress — currently once per tool call: phase "start" before
	// the tool runs, phase "end" after it completes (StepEvent.Step.Status
	// is "complete" or "error"). Correlate the two by StepEvent.Step.ID.
	// "delta" is reserved for progressive detail and is unused today.
	//
	// Distinct from OnToolEvent (the raw tool-name/result hook): OnStep
	// carries a richer, presentation-ready Step (kind + human present-tense
	// summary) so an out-of-band consumer — e.g. the mortise chat API
	// streaming SSE step.start/step.end frames — can render structured
	// progress without re-deriving it. The executor ALSO accumulates the
	// same Steps onto its run Result, so persistence does not depend on
	// this callback being set. Nil-safe; the callback MUST be fast and
	// non-blocking (it runs on the agent-loop goroutine).
	OnStep func(ctx context.Context, ev StepEvent)

	// InvokingMessageID is the Discord message ID of the user's command
	// that triggered this run, when it was triggered by a Discord text
	// command. Used by delivery to thread the reply (Discord native
	// reply with the gray quote bar + jump link). Empty for chatbot
	// exposure, scheduled, or webhook invocations — delivery falls
	// back to a plain channel post for those.
	//
	// Why threaded through Invocation (vs a separate field on Skill or
	// a magic SkillInputs key): the message ID is per-invocation, not
	// per-skill, and the delivery layer is the natural reader. Direct
	// field on Invocation matches the existing ChannelID / GuildID
	// fields' shape.
	InvokingMessageID string

	// Images carries multi-modal image content for the initial user
	// message. When non-empty, the executor builds the initial user
	// message with llm.UserParts(text + image parts) instead of plain
	// llm.UserText. Populated by callers that extract images from Discord
	// attachments or URLs in prompt text (pkg/imageutil downloads the
	// bytes — majordomo image parts are bytes-only). Nil = text-only.
	Images []llm.ImagePart

	// InputFiles carries non-image attachments (audio, etc.) the user
	// supplied with the run. Unlike Images, these are NOT inlined into
	// the model's context — the LLM can't ingest raw mp3/wav/midi bytes.
	// Instead the executor stages each into the skill file store under
	// run scope and tells the agent the resulting file_ids (in the
	// prompt) so it can hand one to a worker tool (e.g. code_exec
	// files_in → /workspace/<name>) for processing. Nil = none.
	InputFiles []InputFile

	// ExtraTools are additional llm.Tool instances injected for this
	// run only. They are appended to the palette after registry-built
	// tools, skill-palette wrappers, and sub-agent wrappers. Use this
	// for session-specific tools that cannot be pre-registered in the
	// catalog (e.g., scaddy's write_scad which needs per-session
	// workspace + renderer state).
	//
	// Why on Invocation (vs a dedicated Run parameter): the Invocation
	// is the per-run context carrier in mort's execution path. Adding
	// a separate ExtraTools arg to Executor.Run would fork the
	// signature for one use case; a field on the existing carrier
	// keeps the surface stable.
	ExtraTools []llm.Tool

	// SessionToolFactory, if set, is called with the live AgentSession
	// after the executor constructs the agent but before it runs. It
	// returns a SessionTools struct carrying the tools to add, an
	// optional PostRun hook for post-processing (e.g., rendering final
	// artifacts from workspace state), and an optional Cleanup func for
	// resource teardown. Types are defined in session_tools.go.
	//
	// Why a factory (vs ExtraTools): ExtraTools are static — they
	// don't have access to the running agent. Tools that need to call
	// session.AttachImages (to show rendered previews to the model on
	// its next turn) require the live session handle that only exists
	// after construction. The factory receives that handle.
	SessionToolFactory SessionToolFactory

	// PostRunDelivery, if set, is called by the agent command handler
	// (`.agent run`) INSTEAD of the default text + paste-fallback reply
	// when the executor's result carries a PostRunResult. The callback
	// receives the Discord message to reply to, the agent's text output,
	// and the PostRunResult. It returns the message ID of the primary
	// reply (for origin recording) and any error.
	//
	// Why a callback on Invocation (vs a handler method on the agent):
	// delivery needs services (paste, filetransfer, Discord session)
	// that live outside the agents package. A callback lets the adapter
	// (e.g., scaddy) close over the services at factory-build time
	// without adding service dependencies to the agents.System struct.
	//
	// When nil, `handleRun` falls through to the standard text-based
	// reply path (formatRunReply + postRunReply). When set, the
	// callback owns the ENTIRE reply — `handleRun` does NOT post a
	// text reply alongside it.
	PostRunDelivery func(ctx context.Context, channelID, replyToMsgID string, output string, prr *PostRunResult) (primaryMsgID string, err error)

	// RunState, when set by the executor, lets a tool read the live
	// run's progress + budget snapshot (iteration vs cap, tool calls,
	// tokens, cost, elapsed). Nil on paths that do not provide it (e.g.
	// the no-tools direct path, or executors that predate the hook).
	// The skill_self_status tool reads this.
	RunState RunStateAccessor

	// AttachImages, when set by the executor, queues a user-role message
	// (optional text + image parts) into the LIVE run so the model sees
	// the images on its next step — the same steer-mailbox mechanism the
	// SessionToolFactory's AgentSession exposes, but reachable from any
	// ordinary tool handler. A tool returns text; images cannot ride a
	// string result, so a tool that fetches images the model must SEE
	// (e.g. discord_list_recent_messages reading channel history) calls
	// this to feed the pixels in. Nil on paths that do not own a steer
	// mailbox (skillexec, the no-tools direct path); tools MUST nil-check
	// before calling and degrade to text-only when it is nil.
	AttachImages func(text string, images ...llm.ImagePart)

	// gate / audit are populated by the registry's Build before
	// BuildLLM is called. Tools should call CheckGate(inv) at the top
	// of their handler and EmitAudit(inv, ...) when reporting tool
	// results. The fields are unexported in the public surface but
	// available to tools via the helpers in helpers.go.
	gate         string
	currentSkill string
	audit        AuditHook
	toolName     string
}
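The Priority field's fallback and cap rules can be sketched as one small function. This is a hedged illustration, not the executor's code: `effectivePriority` is a hypothetical helper, `skillDefault` stands in for Skill.DefaultPriority, and `maxPerUser` for the `skills.priority_max_per_user` convar. Whether the real cap applies to the skill default as well as to the override is not stated in the source; the sketch assumes it applies to the resolved value.

```go
package main

import "fmt"

// effectivePriority resolves the priority for one run.
// override is Invocation.Priority; zero means "use the skill default".
func effectivePriority(override, skillDefault, maxPerUser int, callerIsAdmin bool) int {
	p := override
	if p == 0 {
		p = skillDefault
	}
	// Assumption: non-admin callers are capped on the resolved value;
	// admins may exceed the cap.
	if !callerIsAdmin && p > maxPerUser {
		p = maxPerUser
	}
	return p
}

func main() {
	fmt.Println(effectivePriority(0, 3, 5, false)) // no override: skill default
	fmt.Println(effectivePriority(9, 3, 5, false)) // override above cap: capped
	fmt.Println(effectivePriority(9, 3, 5, true))  // admin: override kept
}
```

Using zero as the "unset" value keeps the field a plain int, which is the trade-off the Priority comment describes: an explicit zero priority and "use the default" are indistinguishable here, and a default of 0 yields the same result either way.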

// RunState is a live, read-only snapshot of the current run's progress
// and budget. Populated on demand by the executor's per-run accessor
// (see Invocation.RunState).
type RunState struct {
	Iteration      int
	MaxIterations  int
	ToolCalls      int
	MaxToolCalls   int
	InputTokens    int64
	OutputTokens   int64
	ThinkingTokens int64
	ElapsedSeconds int
}

// RunStateAccessor returns the live RunState for the enclosing run. The
// executor builds one per run and stamps it on Invocation.RunState
// before the toolbox is built; tools read it via inv.RunState. Nil on
// any path that does not provide it.
type RunStateAccessor interface {
	RunState() RunState
}

// Registry is the read interface to the tool catalog. Concrete impl is
// the package-private *registry struct returned by NewRegistry.
type Registry interface {
	Register(t Tool) error
	Get(name string) (Tool, bool)
	List() []Tool
	// Build returns an llm.Toolbox with each named tool prepared for
	// execution against the given invocation. Save-time authoring
	// checks happen elsewhere (CheckAuthoring in checks.go) — Build
	// trusts that the skill was already saved past those gates and
	// only re-checks runtime invariants:
	//
	//  1. Share-safety drift: rejects an unsafe tool when visibility
	//     != private.
	//  2. SkillNameGate enforcement is delegated to the per-tool
	//     handler via CheckGate, which reads invocation context.
	//  3. Audit emission via EmitAudit (also per-tool).
	//
	// The optional `trusted` variadic argument lets the caller declare
	// the skill as trusted infrastructure (a builtin loaded from disk
	// by the project's own loader) so the share-safety drift check is
	// skipped. Builtins legitimately ship with public visibility AND
	// not-safe-for-share tools (e.g. skill-wizard's wizard_* tools),
	// and the loader bypasses save-time gates by design — applying the
	// share-safety check at invocation would be inconsistent with the
	// rest of the trusted-builtin contract. Pass true ONLY for builtins
	// (Skill.Source == SourceBuiltin / OwnerID == ""). Variadic so the
	// existing call sites (and tests) compile unchanged.
	Build(names []string, inv Invocation, vis Visibility, audit AuditHook, trusted ...bool) (*llm.Toolbox, error)
}

// AuditHook is invoked synchronously around each tool call. Implementations
// typically forward to skillaudit.Writer. May be nil for tests.
type AuditHook func(call AuditCall)

// AuditCall describes one tool invocation. Result is set on success;
// Err is set on failure. Either may be present together (e.g. the tool
// returned partial output then errored).
type AuditCall struct {
	Tool   string
	Args   string
	Result string
	Err    error
}

// Step is one unit of agent progress surfaced to a consumer of OnStep
// (and accumulated onto the executor's run Result). Today there is one
// Step per tool call; the shape is deliberately open so future kinds
// (a coalesced reasoning beat, a sub-agent delegation) slot in without a
// wire change.
//
// This is a plain DTO — no HTTP/Discord/JSON-tag coupling beyond the
// neutral snake_case tags a transport may reuse. The chat API converts
// it to its own persisted/wire type; Discord/cron consumers read the
// Result field directly.
type Step struct {
	// ID is stable per-step and unique within one run; it is the
	// correlation key between the "start" and "end" emissions.
	ID string `json:"id"`
	// Kind is an open vocabulary (search, read, code, image, file,
	// memory, delegate, tool, …); consumers map known values to an icon
	// and fall back for unknown ones. Never drop a step for an
	// unrecognised kind.
	Kind string `json:"kind"`
	// Title is a short machine-ish label (typically the raw tool name).
	Title string `json:"title,omitempty"`
	// Summary is the human present-tense one-liner ("Searching the web
	// for …"); on end it may be replaced with a result phrase.
	Summary string `json:"summary"`
	// Status is "running" | "complete" | "error".
	Status string `json:"status"`
	// Detail is optional, user-safe, size-capped markdown. Never raw tool
	// output, credentials, or chain-of-thought.
	Detail string `json:"detail,omitempty"`
	// StartedAt is when the step began.
	StartedAt time.Time `json:"started_at"`
	// EndedAt is set on the terminal "end" emission.
	EndedAt *time.Time `json:"ended_at,omitempty"`
}

// StepEvent is one live emission to OnStep. Phase is "start" or "end"
// ("delta" is reserved for progressive detail and unused today). Step
// carries the full current snapshot; Detail holds the delta text when
// Phase == "delta".
type StepEvent struct {
	Phase  string
	Step   Step
	Detail string
}
|
||||
|
||||
// NewRegistry constructs an empty registry. Call Register for each tool;
|
||||
// see pkg/skilltools/default_registry.go for the v1 set.
|
||||
func NewRegistry() Registry {
|
||||
return &registry{tools: make(map[string]Tool)}
|
||||
}
|
||||
|
||||
type registry struct {
|
||||
mu sync.RWMutex
|
||||
tools map[string]Tool
|
||||
}
|
||||
|
||||
func (r *registry) Register(t Tool) error {
|
||||
if t == nil {
|
||||
return fmt.Errorf("skilltools: nil tool")
|
||||
}
|
||||
name := t.Name()
|
||||
if name == "" {
|
||||
return fmt.Errorf("skilltools: tool with empty name")
|
||||
}
|
||||
r.mu.Lock()
|
||||
defer r.mu.Unlock()
|
||||
if _, dup := r.tools[name]; dup {
|
||||
return fmt.Errorf("skilltools: duplicate tool name %q", name)
|
||||
}
|
||||
r.tools[name] = t
|
||||
return nil
|
||||
}
|
||||
|
||||
func (r *registry) Get(name string) (Tool, bool) {
|
||||
r.mu.RLock()
|
||||
defer r.mu.RUnlock()
|
||||
t, ok := r.tools[name]
|
||||
return t, ok
|
||||
}
|
||||
|
||||
func (r *registry) List() []Tool {
|
||||
r.mu.RLock()
|
||||
defer r.mu.RUnlock()
|
||||
out := make([]Tool, 0, len(r.tools))
|
||||
for _, t := range r.tools {
|
||||
out = append(out, t)
|
||||
}
|
||||
return out
|
||||
}
|
||||
|
||||
// Build prepares an llm.Toolbox for one skill execution.
|
||||
//
|
||||
// Why: each tool needs to know the caller / channel / skill name plus
|
||||
// the audit hook. Stuffing them into Invocation lets each Tool.BuildLLM
|
||||
// produce a closure that has everything it needs without further
|
||||
// plumbing.
|
||||
//
|
||||
// Defence in depth: rejects an unsafe tool when visibility != private —
|
||||
// the share-time check should already have prevented this; this catches
|
||||
// drift (e.g. a tool's SafeForShare flag flipping after a skill was saved).
|
||||
//
|
||||
// The trusted variadic flag lets a caller bypass the share-safety drift
|
||||
// check for builtin (trusted-infrastructure) skills. The mortventure /
|
||||
// skill-wizard builtins legitimately ship with public visibility AND
|
||||
// not-safe-for-share tools — the loader bypasses save-time gates and
|
||||
// the share-safety check at invocation would block them inconsistently.
|
||||
// Pass true ONLY for builtins.
|
||||
func (r *registry) Build(names []string, inv Invocation, vis Visibility, audit AuditHook, trusted ...bool) (*llm.Toolbox, error) {
|
||||
isTrusted := len(trusted) > 0 && trusted[0]
|
||||
box := llm.NewToolbox("skilltools")
|
||||
for _, name := range names {
|
||||
t, ok := r.Get(name)
|
||||
if !ok {
|
||||
return nil, fmt.Errorf("skilltools: unknown tool %q", name)
|
||||
}
|
||||
|
||||
if !isTrusted && vis != VisibilityPrivate && !t.Permission().SafeForShare {
|
||||
return nil, fmt.Errorf("skilltools: tool %q is not safe for share but skill visibility is %s", name, vis)
|
||||
}
|
||||
|
||||
// Populate the gate/audit fields on the Invocation so the tool
|
||||
// can call CheckGate / EmitAudit from its handler.
|
||||
toolInv := inv
|
||||
toolInv.gate = t.Permission().SkillNameGate
|
||||
toolInv.currentSkill = inv.SkillName
|
||||
toolInv.audit = audit
|
||||
toolInv.toolName = name
|
||||
|
||||
built := t.BuildLLM(toolInv)
|
||||
if built.Name == "" {
|
||||
return nil, fmt.Errorf("skilltools: tool %q built llm.Tool with empty name", name)
|
||||
}
|
||||
if err := box.Add(built); err != nil {
|
||||
return nil, fmt.Errorf("skilltools: adding tool %q: %w", name, err)
|
||||
}
|
||||
}
|
||||
return box, nil
|
||||
}
|
||||
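The share-safety drift check and the trusted bypass in Build can be exercised in isolation. The following is a minimal sketch, not the package's actual code: `Tool` is reduced to a plain struct, `VisibilityPublic` and the tool names `web_search` and `shell` are assumed for illustration, and Build returns the resolved names instead of an `*llm.Toolbox` so the example has no external dependencies.

```go
package main

import (
	"fmt"
	"sync"
)

// Visibility is a simplified stand-in for the real type.
type Visibility string

const (
	VisibilityPrivate Visibility = "private"
	VisibilityPublic  Visibility = "public" // assumed value, for illustration only
)

// Tool is a hypothetical flattened version of the real Tool interface:
// just a name and the SafeForShare permission flag.
type Tool struct {
	Name         string
	SafeForShare bool
}

type registry struct {
	mu    sync.RWMutex
	tools map[string]Tool
}

func newRegistry() *registry {
	return &registry{tools: make(map[string]Tool)}
}

// Register rejects empty and duplicate names, mirroring the checks above.
func (r *registry) Register(t Tool) error {
	if t.Name == "" {
		return fmt.Errorf("tool with empty name")
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, dup := r.tools[t.Name]; dup {
		return fmt.Errorf("duplicate tool name %q", t.Name)
	}
	r.tools[t.Name] = t
	return nil
}

// Build resolves names and applies the share-safety check unless the
// optional trusted flag is passed as true.
func (r *registry) Build(names []string, vis Visibility, trusted ...bool) ([]string, error) {
	isTrusted := len(trusted) > 0 && trusted[0]
	out := make([]string, 0, len(names))
	for _, name := range names {
		r.mu.RLock()
		t, ok := r.tools[name]
		r.mu.RUnlock()
		if !ok {
			return nil, fmt.Errorf("unknown tool %q", name)
		}
		if !isTrusted && vis != VisibilityPrivate && !t.SafeForShare {
			return nil, fmt.Errorf("tool %q is not safe for share but skill visibility is %s", name, vis)
		}
		out = append(out, name)
	}
	return out, nil
}

func main() {
	r := newRegistry()
	_ = r.Register(Tool{Name: "web_search", SafeForShare: true})
	_ = r.Register(Tool{Name: "shell", SafeForShare: false})

	// A non-private skill that references an unsafe tool is rejected.
	_, err := r.Build([]string{"web_search", "shell"}, VisibilityPublic)
	fmt.Println("public, untrusted:", err)

	// The same tool list is accepted for a private skill.
	_, err = r.Build([]string{"web_search", "shell"}, VisibilityPrivate)
	fmt.Println("private:", err)

	// Passing trusted=true skips the share-safety check (builtins only).
	_, err = r.Build([]string{"web_search", "shell"}, VisibilityPublic, true)
	fmt.Println("public, trusted:", err)
}
```

Because the variadic flag defaults to untrusted when omitted, call sites that predate it keep the stricter behaviour without modification.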