P1 (part 1): move skilltools core -> tool/ (clean, verbatim)
executus CI / test (push) Successful in 36s

The tool registry core (registry, permission model, Invocation, gated-tool
wrapper, ssrf guard, hmac, encryption, argcoerce, helpers, rootrun,
session_tools, webhook_rate_limit) had zero mort coupling — it imports only
majordomo/llm + x/crypto/hkdf — so it moves verbatim with a package rename
(skilltools -> tool). All same-package tests came along and pass; the SSRF,
gated-wrapper, encryption and output-pattern invariants are re-anchored here.

majordomo re-enters the module graph (now pinned to the latest, incl. the
front-loaded-output fix). model/ + llmmeta + structured follow next.

Docs: CLAUDE.md now requires README/examples to stay in sync with changes in
the same commit; CI skips docs/example-only pushes via paths-ignore.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-26 19:31:47 -04:00
parent d2c18ad5bb
commit dc28b63ad8
24 changed files with 3461 additions and 1 deletions
+272
View File
@@ -0,0 +1,272 @@
package tool
import (
"context"
"encoding/json"
"fmt"
llm "gitea.stevedudenhoeffer.com/steve/majordomo/llm"
)
// gatedToolMarker is the unexported interface implemented by every Tool
// constructed via NewGatedTool. The IsGatedTool helper performs a type
// assertion against this marker so the meta-test in default_test.go
// (and the wizardtools meta-test) can enforce that every registered
// production tool uses the wrapper.
//
// Why an unexported method (vs a public marker): the goal is to make it
// IMPOSSIBLE for an external caller to lie about being gated. Only the
// implementation in this file can satisfy the interface, so the
// type-assertion in IsGatedTool is a real proof of provenance, not a
// pinky-swear from a struct that opts in.
type gatedToolMarker interface {
isGatedTool()
}
// gatedTool is the concrete Tool returned by NewGatedTool. It carries
// the per-tool metadata (Name/Description/Permission) and the typed
// handler closure; BuildLLM wraps the handler with CheckGate +
// EmitAudit so tool authors literally cannot forget either call.
//
// Why generic on Args (vs accepting any-shaped JSON): each tool's
// handler is typed against its own param struct. defineTypedTool
// derives a JSON schema for the LLM from Args (llm.SchemaFor) and
// parses the args before invoking the handler. We re-marshal args to
// JSON once for the audit row so the captured shape matches exactly
// what the handler ran with (post-coercion).
type gatedTool[Args any] struct {
name string
description string
permission Permission
fn func(ctx context.Context, inv Invocation, args Args) (string, error)
}
// isGatedTool implements gatedToolMarker for the meta-test.
func (g *gatedTool[Args]) isGatedTool() {}
// defineTypedTool builds the majordomo llm.Tool for a typed handler:
// schema derived from Args, arguments decoded leniently (string→number/
// boolean coercion preserved from the legacy gollm era — see argcoerce.go)
// before the handler runs. An unparseable arguments object returns the
// decode error WITHOUT running fn, framing arg-parse-error as a
// tool-call wiring failure rather than a tool-handler failure.
//
// Why not majordomo's llm.DefineTool: its decode is strict by design;
// mort's tool catalog keeps the lenient dialect for parity with years
// of model traffic that emits "3" where the schema says integer.
func defineTypedTool[Args any](name, description string, fn func(ctx context.Context, args Args) (string, error)) llm.Tool {
schema, err := llm.SchemaFor[Args]()
if err != nil {
panic(fmt.Sprintf("skilltools: defineTypedTool(%q): %v", name, err))
}
return llm.Tool{
Name: name,
Description: description,
Parameters: schema,
Handler: func(ctx context.Context, raw json.RawMessage) (any, error) {
var args Args
if err := unmarshalArgsLenient(raw, &args); err != nil {
return nil, fmt.Errorf("invalid arguments for %s: %w", name, err)
}
return fn(ctx, args)
},
}
}
// NewGatedTool wraps a typed handler so it automatically:
// 1. Calls CheckGate(inv) before the handler runs. On gate rejection
// emits EmitAudit(inv, "{}", "", err) and returns the gate error.
// 2. Calls fn(ctx, inv, args) once gate passes.
// 3. Re-marshals args to JSON for the audit row (so the captured args
// reflect any coercion performed during deserialisation), then
// emits EmitAudit(inv, argsJSON, result, err) once the handler
// returns.
//
// Production tools SHOULD use NewGatedTool unless they have a strong
// reason to handle gating manually. The wrapper exists because the
// previous per-tool pattern repeated four lines of boilerplate
// (CheckGate at the top, EmitAudit on every return path), and that
// boilerplate is easy to forget — wizard tools in v1 hotfix #4 had to
// be retrofitted because the author overlooked CheckGate. Centralising
// the calls makes them impossible to skip and the meta-test in
// tools/default_test.go enforces the discipline.
//
// The typed define layer handles JSON parsing and arg coercion before
// fn runs; if the args JSON is unparseable, the decode error is
// returned directly (the wrapper's audit emission does NOT fire on
// parse error — arg-parse-error is a tool-call wiring failure rather
// than a tool-handler failure).
//
// Test: pkg/skilltools/gated_tool_test.go covers gate rejection,
// happy path, fn-returned error, and the IsGatedTool assertion. The
// meta-test in pkg/skilltools/tools/default_test.go walks the registry
// and asserts every production tool implements gatedToolMarker.
func NewGatedTool[Args any](
name, description string,
permission Permission,
fn func(ctx context.Context, inv Invocation, args Args) (string, error),
) Tool {
return &gatedTool[Args]{
name: name,
description: description,
permission: permission,
fn: fn,
}
}
// Name returns the tool's registry key.
func (g *gatedTool[Args]) Name() string { return g.name }
// Description is shown to the LLM.
func (g *gatedTool[Args]) Description() string { return g.description }
// Permission classifies the tool for save-time / share-time gating.
func (g *gatedTool[Args]) Permission() Permission { return g.permission }
// BuildLLM produces the per-invocation llm.Tool. The returned tool's
// handler:
// - Runs CheckGate(inv) FIRST (before any handler logic). On gate
// rejection emits the audit row and returns the gate error.
// - Calls the user-supplied fn with the typed args. fn never sees a
// gate-rejected invocation.
// - Re-marshals args to JSON and emits the audit row exactly once,
// regardless of fn's return value (success or error).
//
// Why re-marshal vs using the raw LLM JSON: the lenient decode performs
// numeric/boolean coercion (e.g. "3" → 3) before invoking the handler;
// the audit row should reflect what fn actually received, not the
// pre-coercion text the LLM emitted.
func (g *gatedTool[Args]) BuildLLM(inv Invocation) llm.Tool {
return defineTypedTool[Args](
g.name,
g.description,
func(ctx context.Context, args Args) (string, error) {
if err := CheckGate(inv); err != nil {
EmitAudit(inv, "{}", "", err)
return "", err
}
argsJSON, mErr := json.Marshal(args)
if mErr != nil {
// Vanishingly rare for the typed param structs in use;
// fall back to "{}" so the audit row never carries a
// half-formed args field.
argsJSON = []byte("{}")
}
result, err := g.fn(ctx, inv, args)
EmitAudit(inv, string(argsJSON), result, err)
return result, err
},
)
}
// IsGatedTool reports whether t was constructed via NewGatedTool /
// NewGatedToolWithAudit. Used by the meta-test in
// tools/default_test.go to enforce that every registered production
// tool uses the wrapper. The check is a type assertion against the
// unexported gatedToolMarker interface, so only the gatedTool variants
// from this package can satisfy it — there is no way for an external
// Tool to pretend to be gated.
func IsGatedTool(t Tool) bool {
_, ok := t.(gatedToolMarker)
return ok
}
// AuditedResult is what a NewGatedToolWithAudit handler returns:
// LLMResult is the string surfaced to the LLM (the tool-call result
// the model sees in its conversation); AuditArgs and AuditResult are
// what the wrapper logs to the audit row INSTEAD of the auto-derived
// values.
//
// Why a separate variant: a small number of tools (paste_create being
// the canonical example) need to return a sensitive value to the LLM
// (a URL containing an encryption-key fragment) but MUST redact that
// value from the audit row, since the audit row is rendered to admins
// in the webui run-trace view. The default wrapper auto-logs args +
// result, which would leak the key. NewGatedToolWithAudit lets the
// handler explicitly separate the LLM-visible output from the
// audit-visible output, while still benefitting from auto-injected
// CheckGate.
type AuditedResult struct {
// LLMResult is the string returned to the LLM as the tool result.
LLMResult string
// AuditArgs is the args string written to the audit row. If empty,
// the wrapper falls back to the JSON-marshaled typed args (same
// behaviour as NewGatedTool).
AuditArgs string
// AuditResult is the result string written to the audit row. May
// be empty (logged as "") to suppress sensitive fragments.
AuditResult string
}
// gatedToolWithAudit is the variant of gatedTool whose handler returns
// an AuditedResult so it can override what the audit row captures.
type gatedToolWithAudit[Args any] struct {
name string
description string
permission Permission
fn func(ctx context.Context, inv Invocation, args Args) (AuditedResult, error)
}
// isGatedTool implements gatedToolMarker for the meta-test.
func (g *gatedToolWithAudit[Args]) isGatedTool() {}
func (g *gatedToolWithAudit[Args]) Name() string { return g.name }
func (g *gatedToolWithAudit[Args]) Description() string { return g.description }
func (g *gatedToolWithAudit[Args]) Permission() Permission { return g.permission }
// NewGatedToolWithAudit is the redaction-aware variant of NewGatedTool.
// Use it ONLY when the LLM-facing result must differ from the audit
// row (e.g. the result contains an encryption key that the audit must
// NOT capture). Most tools should use NewGatedTool.
//
// Behaviour matches NewGatedTool exactly except:
// - The handler returns AuditedResult; the wrapper passes
// AuditedResult.LLMResult to the LLM and writes
// AuditedResult.AuditArgs / AuditedResult.AuditResult to the
// audit row (falling back to the JSON-marshaled args if
// AuditArgs is empty).
// - Gate rejection still emits an audit row with empty Result and
// args="{}" before returning the gate error.
//
// Test: covered alongside NewGatedTool in pkg/skilltools/
// gated_tool_test.go.
func NewGatedToolWithAudit[Args any](
name, description string,
permission Permission,
fn func(ctx context.Context, inv Invocation, args Args) (AuditedResult, error),
) Tool {
return &gatedToolWithAudit[Args]{
name: name,
description: description,
permission: permission,
fn: fn,
}
}
// BuildLLM produces the per-invocation llm.Tool. Same gate-injection
// semantics as gatedTool[Args].BuildLLM; the audit row uses the
// handler-supplied AuditArgs / AuditResult so a sensitive LLM-visible
// result string never leaks into the audit log.
func (g *gatedToolWithAudit[Args]) BuildLLM(inv Invocation) llm.Tool {
return defineTypedTool[Args](
g.name,
g.description,
func(ctx context.Context, args Args) (string, error) {
if err := CheckGate(inv); err != nil {
EmitAudit(inv, "{}", "", err)
return "", err
}
res, err := g.fn(ctx, inv, args)
auditArgs := res.AuditArgs
if auditArgs == "" {
if b, mErr := json.Marshal(args); mErr == nil {
auditArgs = string(b)
} else {
auditArgs = "{}"
}
}
EmitAudit(inv, auditArgs, res.AuditResult, err)
return res.LLMResult, err
},
)
}