P1: model layer (convar->config inversion) + llmmeta

Lifts mort's pkg/logic/llms into executus/model, decoupled from mort:

- tiers.go: the tier resolver now reads a host-supplied config.Source under "model.tier.<name>" with host-supplied fallbacks (Configure(cfg, defaults, ttl)), instead of convar.Manager. Tier NAMES + specs are host config; the resolution mechanism (cache, reasoning-suffix dialect, chain validation) is generic. No tier names hard-coded in the harness.
- sink.go: usage/trace recording inverted off mort's llmusage/llmtrace into UsageSink / TraceSink seams + a model-owned Span, with nil-safe context attribution helpers (WithModel/WithTraceID/WithUsageTool/WithUsageUser). Both sinks optional (nil = off) so a light host records nothing.
- lane decoration repointed to executus/lane; utils.Errorf -> fmt.Errorf.
- call.go keeps GenerateWith[T] (instrumented structured output): this is the structured-output primitive; no separate structured/ package.
- llmmeta moved over model/ (the meta-LLM helper: tier allowlist + JSON retry + ledger). Its tests configure a minimal tier table via TestMain.

New tests cover the inversion: config overrides fallback, tier registration, reasoning-suffix survival, nested-tier rejection, nil-sink no-ops.

Full module: go build/vet/test -race green; core go.sum still free of gorm/redis/discordgo/sqlite.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 19:47:13 -04:00
parent 741d7816ed
commit b424261aca
17 changed files with 3698 additions and 3 deletions
@@ -0,0 +1,91 @@
+// Package model (lane_mapping.go): maps a model spec to a stable lane
+// name. Pure data + a single function; no dependency on the registry,
+// no provider wrapping. Kept separate from lane_transport.go so the
+// mapping table can be committed and reviewed in isolation, and so
+// admin / webui code that just wants to *display* lane assignments
+// doesn't drag in the transport machinery.
+//
+// Why a fixed table: provider concurrency caps differ — Ollama Pro is
+// 3 connections, Anthropic Claude has higher per-tier limits, etc.
+// Each provider gets its own lane name so they can be configured
+// independently via convars (lanes.<name>.max_concurrent). Lane names
+// are user-facing (admin dashboard + convar key suffixes) and need to
+// stay stable across deploys; an env-overridable map adds complexity
+// for no current benefit.
+//
+// Test: lane_transport_test.go covers TestLaneFor_Mapping.
+package model
+
+import "strings"
+
+// Lane name constants. Defined as exported strings so admin code (.skill
+// admin set-lane <skill> <lane>), webui dropdowns, and convar consumers
+// share a single canonical spelling.
+const (
+	// LaneOllama covers ollama-cloud/* and Ollama-backed foreman/*
+	// (and any future ollama/* local). The local ollama instance is
+	// on the same physical resource as the cloud account from mort's
+	// perspective, so the connection cap should apply jointly.
+	LaneOllama = "ollama"
+
+	// LaneAnthropicThinking is the lane for Anthropic models in
+	// extended-thinking mode. Separated from default because thinking
+	// requests hold connections longer and can starve faster lanes
+	// when multiplexed.
+	LaneAnthropicThinking = "anthropic-thinking"
+
+	// LaneAnthropicDefault is the lane for non-thinking Anthropic
+	// requests (haiku, sonnet, opus without -thinking-).
+	LaneAnthropicDefault = "anthropic-default"
+
+	// LaneM1 is the lane for m1/* models (foreman-style router
+	// pointing at a dedicated local instance). Separated from the
+	// ollama lane because m1 targets a distinct host with its own
+	// connection budget.
+	LaneM1 = "m1"
+
+	// LaneLLMDefault is the catch-all lane for any provider/model
+	// combination not explicitly mapped above.
+	LaneLLMDefault = "llm-default"
+)
+
+// LaneFor returns the lane name for the given model spec. Mapping:
+//
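+//	ollama-cloud/*               → "ollama"          (Pro account: 3 connections)
+//	anthropic/*-thinking-*       → "anthropic-thinking"
+//	anthropic/*                  → "anthropic-default"
+//	foreman/*                    → "ollama"          (Ollama-backed; shares its cap)
+//	m1/*                         → "m1"              (dedicated local instance)
+//	(anything else)              → "llm-default"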
+//
+// Tier aliases (fast/standard/thinking) flow through this function as
+// the resolver's expanded provider/model spec, so callers don't need
+// to think about tier indirection. Empty input falls through to
+// LaneLLMDefault rather than panicking — defensive against unset
+// model specs in edge-case test wiring.
+//
+// Substring match for "-thinking-" keeps future Anthropic naming
+// variations classified correctly without churning this table on
+// every model release.
+func LaneFor(modelSpec string) string {
+	s := strings.TrimSpace(modelSpec)
+	if strings.HasPrefix(s, "ollama-cloud/") {
+		return LaneOllama
+	}
+	if strings.HasPrefix(s, "anthropic/") {
+		if strings.Contains(s, "-thinking-") {
+			return LaneAnthropicThinking
+		}
+		return LaneAnthropicDefault
+	}
+	// Foreman instances are backed by Ollama and share its connection
+	// cap, so they route to the same lane.
+	if strings.HasPrefix(s, "foreman/") {
+		return LaneOllama
+	}
+	// m1/ is a foreman-style router pointing at a dedicated local
+	// instance with its own connection budget. Separate lane so its
+	// concurrency cap is independent of the shared ollama lane.
+	if strings.HasPrefix(s, "m1/") {
+		return LaneM1
+	}
+	return LaneLLMDefault
+}
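
To make the mapping easier to review, here is a minimal standalone sketch that reproduces the prefix rules of LaneFor from the diff above and runs a few specs through it. The model names after each provider prefix are hypothetical placeholders, not real model identifiers, and the lane names are inlined as string literals rather than the package constants so the program compiles on its own.

```go
package main

import (
	"fmt"
	"strings"
)

// LaneFor mirrors the prefix rules of the function in the diff. It is a
// standalone copy for illustration, not the package's exported symbol.
func LaneFor(modelSpec string) string {
	s := strings.TrimSpace(modelSpec)
	switch {
	case strings.HasPrefix(s, "ollama-cloud/"), strings.HasPrefix(s, "foreman/"):
		// foreman/ is Ollama-backed and shares the same connection cap.
		return "ollama"
	case strings.HasPrefix(s, "anthropic/"):
		// Substring match: requires a hyphen on both sides of "thinking".
		if strings.Contains(s, "-thinking-") {
			return "anthropic-thinking"
		}
		return "anthropic-default"
	case strings.HasPrefix(s, "m1/"):
		return "m1"
	}
	return "llm-default"
}

func main() {
	// All model names below are hypothetical placeholders.
	specs := []string{
		"ollama-cloud/example-model",
		"foreman/example-model",
		"anthropic/example-thinking-v1",
		"anthropic/example-thinking", // no trailing hyphen: default lane
		"anthropic/example-model",
		"  m1/example-model  ", // surrounding whitespace is trimmed
		"ollama/example-model", // local ollama/* is not mapped yet
		"",
	}
	for _, spec := range specs {
		fmt.Printf("%q -> %s\n", spec, LaneFor(spec))
	}
}
```

Two edge cases are worth noting when reading the output. A spec that ends in "-thinking" with no trailing hyphen lands in the default Anthropic lane, and a bare ollama/* spec currently falls through to the catch-all lane because only ollama-cloud/* and foreman/* are matched.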