P3: meta + primitive tool group (think/now/cite + classify/extract/summarize)

Grow executus/tools into a real generic tool library:

- Register(reg): the always-available, zero-config tools — think, now (UTC
  unless a CurrentTimeProvider is wired), cite (inert unless a CitationStorage
  is wired). All nil-safe; a light host calls Register and is useful.
- RegisterMeta(reg, MetaDeps): the LLM-backed meta tools — classify,
  extract_entities, summarize — over the llmmeta helper. Budget defaults to the
  shipped in-memory per-run cap; Files optional; caps default.
- Seams moved (interface/type-only, no host coupling): research_providers.go
  (CurrentTimeProvider/CitationStorage/SearchBudget/PageExtractor/PDFFetcher/…)
  and file_storage.go (FileStorage + FileDomainMeta). Plus the in-memory budget
  default (research_defaults.go) and scope_validate.go.

calculate deferred (drags github.com/Krognol/go-wolfram + a module-path replace
— not worth it in the lean core for one tool). Core go.sum still free of
gorm/redis/discordgo/sqlite/wolfram.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-26 21:00:45 -04:00
parent df95425bb5
commit 1e201550b3
11 changed files with 1802 additions and 17 deletions
+97
View File
@@ -0,0 +1,97 @@
package tools
import (
"context"
"sync"
)
// DefaultResearchConfig returns a ResearchConfig pinned to the v11
// design defaults. Production wiring overrides via a convar-aware
// adapter; tests use the defaults directly.
func DefaultResearchConfig() ResearchConfig {
return defaultResearchConfig{}
}
type defaultResearchConfig struct{}
func (defaultResearchConfig) MaxInlineBytes(_ context.Context) int { return 12 * 1024 }
func (defaultResearchConfig) PDFMaxPages(_ context.Context) int { return 50 }
func (defaultResearchConfig) WebSearchEnabled(_ context.Context) bool { return true }
func (defaultResearchConfig) WebSearchMaxPerRun(_ context.Context) int { return 10 }
func (defaultResearchConfig) ReadPageMaxPerRun(_ context.Context) int { return 10 }
func (defaultResearchConfig) VideoMaxPerRun(_ context.Context) int { return 5 }
func (defaultResearchConfig) VerifyURLMaxPerRun(_ context.Context) int { return 20 }
func (defaultResearchConfig) ReadPDFMaxPerRun(_ context.Context) int { return 5 }
func (defaultResearchConfig) HTTPGetMaxPerRun(_ context.Context) int { return 20 }
func (defaultResearchConfig) HTTPPostMaxPerRun(_ context.Context) int { return 20 }
func (defaultResearchConfig) WebSearchAugmentThreshold(_ context.Context) int { return 5 }
// InMemorySearchBudget is the package-default SearchBudget — a
// simple per-(run,kind) counter held in a map. NOT
// production-correct because the map persists across the process
// lifetime; production wiring MUST plug a per-run reset.
//
// Why a default at all: tests want a working SearchBudget without
// rolling their own. Documenting the production-correctness gap
// here keeps the production adapter (in mort.go) honest.
type InMemorySearchBudget struct {
cap map[string]int // by kind; "" means "use Default"
mu sync.Mutex
counts map[string]int // key = runID+"|"+kind
}
// NewInMemorySearchBudget constructs a default SearchBudget. Pass a
// per-kind cap map (e.g. {"web_search": 10, "read_page": 10}); kinds
// missing from the map fall back to maxPerKindDefault.
func NewInMemorySearchBudget(caps map[string]int) *InMemorySearchBudget {
if caps == nil {
caps = map[string]int{}
}
return &InMemorySearchBudget{
cap: caps,
counts: make(map[string]int),
}
}
// CheckAndIncrement implements SearchBudget. Returns the count AFTER
// incrementing on success; the counter is NOT incremented when the
// call would exceed the cap (so a "search_budget_exceeded" rejection
// doesn't burn budget on retry).
func (b *InMemorySearchBudget) CheckAndIncrement(_ context.Context, runID, kind string) (int, int, bool) {
max := b.cap[kind]
if max <= 0 {
max = 10 // safe default
}
b.mu.Lock()
defer b.mu.Unlock()
key := runID + "|" + kind
cur := b.counts[key]
if cur >= max {
return cur, max, true
}
b.counts[key] = cur + 1
return cur + 1, max, false
}
// ResetRun is a test helper: clears the counters for a single run
// across all kinds. Production wiring uses its own per-run lifecycle
// (the executor's RunFinalizer interface).
func (b *InMemorySearchBudget) ResetRun(runID string) {
b.mu.Lock()
defer b.mu.Unlock()
prefix := runID + "|"
for k := range b.counts {
if len(k) > len(prefix) && k[:len(prefix)] == prefix {
delete(b.counts, k)
}
}
}
// StaticTimeProvider is the package-default CurrentTimeProvider —
// returns "" for every member (the tool then falls back to UTC).
// Tests that need a specific timezone wire a one-line struct.
type StaticTimeProvider struct{}
// UserTimezone implements CurrentTimeProvider with a flat fallback to "".
func (StaticTimeProvider) UserTimezone(_ context.Context, _ string) string { return "" }