P3: meta + primitive tool group (think/now/cite + classify/extract/summarize)

Grow executus/tools into a real generic tool library:

- Register(reg): the always-available, zero-config tools: think, now (UTC
  unless a CurrentTimeProvider is wired), cite (inert unless a CitationStorage
  is wired). All nil-safe, so a lightweight host that only calls Register
  still gets working tools.
- RegisterMeta(reg, MetaDeps): the LLM-backed meta tools (classify,
  extract_entities, summarize) built on the llmmeta helper. Budget defaults to
  the shipped in-memory per-run cap; Files is optional; caps fall back to
  their defaults.
- Seams moved (interface/type-only, no host coupling): research_providers.go
  (CurrentTimeProvider/CitationStorage/SearchBudget/PageExtractor/PDFFetcher/…)
  and file_storage.go (FileStorage + FileDomainMeta). Plus the in-memory budget
  default (research_defaults.go) and scope_validate.go.

calculate deferred (drags github.com/Krognol/go-wolfram + a module-path replace
— not worth it in the lean core for one tool). Core go.sum still free of
gorm/redis/discordgo/sqlite/wolfram.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 21:00:45 -04:00
parent df95425bb5
commit 1e201550b3
11 changed files with 1802 additions and 17 deletions
@@ -0,0 +1,332 @@
// Package tools — research provider plumbing for v11.
//
// This file declares the narrow interfaces v11's research tools
// (web_search, read_page, read_video, read_pdf, verify_url, etc.) need
// at execute time. Production wiring lives in pkg/logic/mort.go and
// closes over the searcher chain, the extractor / chromedp client, the
// PDF extractor, and the yt-dlp wrapper.
//
// Why narrow interfaces (vs importing pkg/logic/searcher / extractor
// directly): the same cycle-break pattern used by KVStorage, FileStorage,
// HTTPConfigProvider — keeps pkg/skilltools/tools free of the wiring
// layer so tests can stub each dependency. Each provider is nil-safe:
// the tool surfaces "not configured" at first call rather than failing
// at registration.
//
// Test: each tool under pkg/skilltools/tools/ wired against these
// interfaces has its own *_test.go using the in-package fakes in
// research_providers_fakes_test.go.
package tools
import (
"context"
"errors"
"time"
)
// PageCache is the narrow surface read_page (and read_pdf) consult to
// avoid re-fetching the same URL within the cache's TTL. Production
// wiring bridges this interface to the legacy *cache.Cache held by
// pkg/logic/query.System so a `.query foo.com` and a
// `.skill query foo.com` for the same URL share one cache slot.
//
// Why a narrow interface (vs importing the cache package directly):
// same cycle-break pattern as KVStorage / FileStorage / CitationStorage
// — keeps pkg/skilltools/tools free of the wiring layer. The legacy
// cache slot key is `sha256(url)`; the production adapter is
// responsible for hashing so this interface stays clean (raw URL in/out)
// and skill-tool authors never need to know the slot shape.
//
// nil-safe: a tool constructed with a nil PageCache simply skips the
// cache layer (every Get is treated as a miss and every Set is a no-op).
//
// Test: tests pass a fake PageCache that records Get/Set calls and
// returns canned hits. See page_cache_test.go for the read_page hit /
// miss scenarios.
type PageCache interface {
// Get returns the cached body for urlStr and true on hit, or
// (nil, false) on miss. Implementations MUST treat any backing-
// store error as a miss (best-effort, never fail the caller).
Get(ctx context.Context, urlStr string) ([]byte, bool)
// Set writes body under the slot for urlStr with the supplied TTL.
// Implementations MUST swallow backing-store errors (best-effort
// caching is correct: a write failure should not propagate to the
// agent loop).
Set(ctx context.Context, urlStr string, body []byte, ttl time.Duration)
}
// PageCacheTTL is the default TTL applied by tools that consult a
// PageCache. Mirrors the legacy `query.pageCacheTTL` constant
// (1 hour) so a `.query`-warmed slot reads back from a `.skill query`
// (and vice versa) within the same window.
//
// Tools that want a different TTL pass an explicit value to
// PageCache.Set; this constant is the project default the v11 / v-research
// tools all use.
const PageCacheTTL = 1 * time.Hour
// PageExtractor is the narrow surface read_page needs at execute
// time. The production adapter wraps mort's existing extractor
// (Ollama web_fetch first, chromedp fallback on JS-heavy pages).
//
// nil-safe: a tool constructed with a nil PageExtractor surfaces
// "not configured" at first call.
//
// Why: read_page used to be a thin io.ReadAll over the URL — it
// missed JS rendering, didn't honour the v6 page cache, and could
// not surface the underlying provider name. v11 routes through this
// interface so the production wiring (mort.go) can plug in the
// existing query-side extractor without exposing query.Agent.
type PageExtractor interface {
// ExtractPage fetches and extracts readable text from urlStr.
// Returns the extracted body, a final URL (after any redirects
// the extractor followed), the provider name ("ollama" |
// "chromedp" | "ytdlp"), and an error.
//
// The returned body is the FULL extracted text — callers apply
// the v10 byte-vs-reference cap before surfacing to the agent.
//
// bypassCache=true skips any page cache and forces a fresh
// extraction. Default false.
ExtractPage(ctx context.Context, urlStr string, bypassCache bool) (text string, finalURL string, provider string, err error)
}
// VideoTranscriber is the narrow surface read_video needs at
// execute time. Production wiring wraps internal/ytdlp.
//
// nil-safe: tool surfaces "not configured" at first call.
//
// Why a separate interface from PageExtractor: video is a different
// shape (transcript + metadata) and a different binary (yt-dlp).
// Keeping them distinct lets tests stub each independently.
type VideoTranscriber interface {
// ExtractVideoTranscript returns the transcript text and the
// best-effort metadata (title, duration in seconds, channel).
// Implementations MUST return a non-empty transcript or an
// error — empty-transcript success is interpreted by the tool
// as a "transcript_unavailable" failure.
ExtractVideoTranscript(ctx context.Context, urlStr string) (transcript string, meta VideoMeta, err error)
}
// VideoMeta is best-effort metadata returned alongside a video
// transcript. Any field may be empty/zero if the implementation
// could not extract it.
type VideoMeta struct {
Title string
Channel string
DurationSeconds int
}
// PDFFetcher is the narrow surface read_pdf needs at execute time.
// Production wiring uses an HTTP-aware fetcher that HEAD-validates
// content-type before downloading the body.
//
// nil-safe: tool surfaces "not configured" at first call.
//
// Why: a tool that just embedded PDF extraction would couple
// fetching + parsing. Splitting the fetch (allowlist + SSRF +
// HEAD check) from the extract (page-level parsing) keeps each
// step testable and lets the same fetcher serve verify_url one
// day if we want a PDF-aware fast path.
type PDFFetcher interface {
// FetchPDF downloads the PDF at urlStr (after HEAD-validating
// content-type) and returns the raw bytes plus the final URL.
// HEAD-validation rejects a URL whose Content-Type is not a
// PDF mime AND whose path does not end in .pdf.
FetchPDF(ctx context.Context, urlStr string) (body []byte, finalURL string, err error)
}
// PDFExtractor parses PDF bytes into plain text + page count.
// Production wires internal.ExtractPDFText.
//
// Why split from PDFFetcher: tests want to vary the fetch (mock
// server returning bytes) without rebuilding the extractor.
type PDFExtractor interface {
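// ExtractPDFText returns the concatenated plain-text content
// of the PDF along with the page count, reading at most maxPages
// pages. truncated reports whether the output was cut short (for
// example by maxPages). The caller applies any per-page cap and
// the v10 byte-vs-reference cap on the result.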
ExtractPDFText(ctx context.Context, body []byte, maxPages int) (text string, pageCount int, truncated bool, err error)
}
// HEADChecker is the narrow surface verify_url needs at execute
// time. Production wiring uses the same SSRF-pinned transport as
// http_get so the security envelope is consistent.
//
// Why a separate interface (vs reusing HTTPConfigProvider+doHTTP):
// verify_url's contract is simpler — HEAD only, no body bytes
// returned, and the agent only cares about reachable / status /
// final URL / content-type. A bespoke surface lets the production
// adapter optimise for that path (no body buffer, no body close).
type HEADChecker interface {
// HEAD performs a HEAD request against urlStr (with SSRF +
// allowlist enforcement) and returns the final URL after any
// redirects, the HTTP status code, and the Content-Type header.
// Returns reachable=false with a non-nil err for transport
// failures (DNS, TCP, allowlist rejection); reachable=true with
// any HTTP status (including 4xx/5xx) is the success shape —
// the agent decides whether the URL is "real".
HEAD(ctx context.Context, urlStr string) (finalURL string, status int, contentType string, reachable bool, err error)
}
// CitationStorage is the narrow surface cite() needs at execute
// time. Production wires *skills.System.Storage(); tests stub.
//
// nil-safe: tool surfaces "not configured" at first call.
//
// Why a narrow interface (vs importing pkg/logic/skills): same
// cycle constraint as KVStorage / FileStorage. Production adapter
// in mort.go bridges to skills.Storage's RecordCitation /
// ListCitations methods AND a separate URL-history tracker.
//
// Two responsibilities, deliberately separate:
//
// 1. RecordCitation writes a row into skill_run_sources — this is
// the user-visible citations table for the Sources panel and
// CSV export. ONLY rows the agent successfully cited via
// cite() land here.
// 2. RecordURLTouch / GetTouchedURLs maintains a per-run set of
// URLs the agent has interacted with (web_search results,
// read_page input, read_pdf input, read_video input). cite()
// reads this set to reject claims for URLs the agent never
// touched. This set lives in a different table or scope from
// the citations table — it's working state, not a record.
type CitationStorage interface {
// RecordCitation appends one (run_id, url, claim, cited_at)
// row to the citations table (skill_run_sources). The caller
// does not supply cited_at; the storage layer stamps it with
// time.Now() at insert. The caller has already verified the URL
// is in the touched-URL set (via GetTouchedURLs); this method is
// the persistence step.
RecordCitation(ctx context.Context, runID, url, claim string) error
// RecordURLTouch records that the agent has interacted with
// `url` during `runID`. Called by web_search (per result),
// read_page, read_pdf, and read_video. Idempotent — repeat
// calls for the same (run_id, url) are no-ops at the storage
// layer.
RecordURLTouch(ctx context.Context, runID, url string) error
// GetTouchedURLs returns the set of URLs the run has
// interacted with. Used by cite() to verify that a claim's
// URL is one the agent actually visited. Empty for a fresh
// run — cite() then rejects every claim with
// "url_not_in_run_history".
GetTouchedURLs(ctx context.Context, runID string) (map[string]struct{}, error)
// ListCitations returns all citations recorded for the run, in
// insertion order. Powers the /skills/{id}/runs/{run_id}
// Sources panel.
ListCitations(ctx context.Context, runID string) ([]CitationRow, error)
}
// CitationRow mirrors the skill_run_sources row shape. Fields
// match the spec: run_id is implicit in the query, url + claim are
// what the agent submitted, cited_at is the wall-clock timestamp
// at insert.
type CitationRow struct {
URL string
Claim string
CitedAt int64 // unix-seconds; storage adapter normalises from time.Time
}
// CurrentTimeProvider exposes the per-user timezone lookup used by the
// now tool. Production wiring closes over the bot's member-config getter.
//
// nil-safe: a tool constructed with a nil provider falls back to
// server-time + UTC (current behaviour of NewNow before v11).
type CurrentTimeProvider interface {
// UserTimezone returns the IANA timezone name configured for
// the given Discord member ID, or "" when the member has no
// timezone configured. Callers treat "" as "UTC".
UserTimezone(ctx context.Context, memberID string) string
}
// SearchBudget is the narrow surface web_search reads at execute
// time to honour skills.web_search.max_per_run.
//
// Production wiring closes over a per-run counter held by the
// executor. nil-safe: tool falls back to a built-in package
// counter (process-wide, NOT per-run) — useful for tests but NOT
// production-correct because budget bleeds across runs. The
// production adapter MUST be wired.
type SearchBudget interface {
// CheckAndIncrement returns the current count AFTER incrementing
// for the given runID, the configured max, and exceeded=true when
// the call would exceed the cap. There is no error return: on
// exceeded the handler returns a clean "search_budget_exceeded"
// string (not an error) so the agent can react.
CheckAndIncrement(ctx context.Context, runID, kind string) (count, max int, exceeded bool)
}
// ResearchConfig is the narrow surface that read_page / read_video /
// read_pdf / verify_url read at execute time for per-tool budget caps
// and inline-vs-file_id thresholds. Production wiring closes over
// the relevant convars.
//
// nil-safe: tools fall back to package defaults.
type ResearchConfig interface {
// MaxInlineBytes returns the cap above which extracted text is
// persisted as a file_id under run-scope (v10 byte-vs-reference
// principle). Default 12 KiB.
MaxInlineBytes(ctx context.Context) int
// PDFMaxPages returns the cap on pages extracted from a PDF
// before truncation. Default 50.
PDFMaxPages(ctx context.Context) int
// WebSearchEnabled is the master switch for web_search.
WebSearchEnabled(ctx context.Context) bool
// WebSearchMaxPerRun is the per-run search cap.
WebSearchMaxPerRun(ctx context.Context) int
// ReadPageMaxPerRun is the per-run page-read cap.
ReadPageMaxPerRun(ctx context.Context) int
// VideoMaxPerRun is the per-run video-read cap.
VideoMaxPerRun(ctx context.Context) int
// VerifyURLMaxPerRun is the per-run HEAD-check cap.
VerifyURLMaxPerRun(ctx context.Context) int
// ReadPDFMaxPerRun is the per-run PDF-read cap.
ReadPDFMaxPerRun(ctx context.Context) int
// HTTPGetMaxPerRun (v15.2) is the per-run http_get cap. The agent
// otherwise can retry-storm through random URLs and bloat its own
// context with each tool result. Default 20.
HTTPGetMaxPerRun(ctx context.Context) int
// HTTPPostMaxPerRun (v15.2) is the per-run http_post cap. Default 20.
HTTPPostMaxPerRun(ctx context.Context) int
// WebSearchAugmentThreshold is the minimum number of primary
// (Ollama) results required to skip the secondary (DDG/Brave)
// search. When the primary backend returns fewer than this many
// results, the augmented searcher also queries the secondary and
// merges both result sets. Default 5.
WebSearchAugmentThreshold(ctx context.Context) int
// ReplyChainDepthMax is deliberately NOT part of this interface. It
// is unused by these tools; callers that need it reach into the
// convar reader directly. Future per-tool caps would follow the
// same method shape as the ones above.
}
// ErrPageExtractionFailed is the sentinel returned by a PageExtractor
// when both Ollama and chromedp paths produce empty content.
var ErrPageExtractionFailed = errors.New("page extraction failed: empty content")
// ErrVideoTranscriptUnavailable is the sentinel returned by a
// VideoTranscriber when no captions / transcript could be obtained.
var ErrVideoTranscriptUnavailable = errors.New("video transcript unavailable")
// ErrPDFNotPDF is the sentinel returned by a PDFFetcher when the
// HEAD response indicates a non-PDF content-type AND the URL path
// has no .pdf extension. Surfaces a clean "url_is_not_a_pdf"
// rejection rather than a generic transport error.
var ErrPDFNotPDF = errors.New("url does not serve a PDF")
// ErrPDFEncrypted is returned by a PDFExtractor when the PDF refuses
// extraction because it is password-protected. Surfaces a clean
// "pdf_encrypted" rejection.
var ErrPDFEncrypted = errors.New("pdf is encrypted")