executus/model/lane_mapping.go
steve b424261aca
P1: model layer (convar->config inversion) + llmmeta
Lifts mort's pkg/logic/llms into executus/model, decoupled from mort:

- tiers.go: the tier resolver now reads a host-supplied config.Source under
  "model.tier.<name>" with host-supplied fallbacks (Configure(cfg, defaults,
  ttl)), instead of convar.Manager. Tier NAMES + specs are host config; the
  resolution mechanism (cache, reasoning-suffix dialect, chain validation) is
  generic. No tier names hard-coded in the harness.
- sink.go: usage/trace recording inverted off mort's llmusage/llmtrace into
  UsageSink / TraceSink seams + a model-owned Span, with nil-safe context
  attribution helpers (WithModel/WithTraceID/WithUsageTool/WithUsageUser).
  Both sinks optional (nil = off) so a light host records nothing.
- lane decoration repointed to executus/lane; utils.Errorf -> fmt.Errorf.
- call.go keeps GenerateWith[T] (instrumented structured output) — this is the
  structured-output primitive; no separate structured/ package.
- llmmeta moved over model/ (the meta-LLM helper: tier allowlist + JSON retry
  + ledger). Its tests configure a minimal tier table via TestMain.

New tests cover the inversion: config overrides fallback, tier registration,
reasoning-suffix survival, nested-tier rejection, nil-sink no-ops.

Full module: go build/vet/test -race green; core go.sum still free of
gorm/redis/discordgo/sqlite.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 19:47:13 -04:00
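
The config-over-fallback ordering described in the commit message can be pictured with a minimal sketch. Everything in it is hypothetical: `Source`, `mapSource`, and `resolveTier` are illustrative names rather than the actual executus/model API, and the model specs are placeholders. Only the `model.tier.<name>` key shape and the "host config first, host fallback second" ordering come from the message above. The cache, TTL, reasoning-suffix handling, and chain validation are left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Source is a hypothetical stand-in for the host-supplied config source.
type Source interface {
	Get(key string) (string, bool)
}

// mapSource is an illustrative in-memory Source.
type mapSource map[string]string

func (m mapSource) Get(key string) (string, bool) {
	v, ok := m[key]
	return v, ok
}

// resolveTier looks up "model.tier.<name>" in cfg and falls back to the
// host-supplied defaults when the key is unset or blank. An unknown tier
// is an error; no tier names are built in.
func resolveTier(cfg Source, defaults map[string]string, name string) (string, error) {
	if cfg != nil {
		if v, ok := cfg.Get("model.tier." + name); ok && strings.TrimSpace(v) != "" {
			return strings.TrimSpace(v), nil
		}
	}
	if v, ok := defaults[name]; ok {
		return v, nil
	}
	return "", fmt.Errorf("unknown tier %q", name)
}

func main() {
	// Placeholder specs; real tier names and specs are host config.
	cfg := mapSource{"model.tier.fast": "anthropic/example-small"}
	defaults := map[string]string{
		"fast":     "ollama-cloud/example-a",
		"standard": "ollama-cloud/example-b",
	}
	for _, name := range []string{"fast", "standard", "missing"} {
		spec, err := resolveTier(cfg, defaults, name)
		fmt.Println(name, spec, err)
	}
}
```

In this sketch "fast" resolves from config, "standard" from the fallback table, and "missing" returns an error, which mirrors the "config overrides fallback" case the new tests are said to cover.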


// Package model (lane_mapping.go) maps a model spec to a stable lane
// name. Pure data + a single function; no dependency on the registry,
// no provider wrapping. Kept separate from lane_transport.go so the
// mapping table can be committed and reviewed in isolation, and so
// admin / webui code that just wants to *display* lane assignments
// doesn't drag in the transport machinery.
//
// Why a fixed table: provider concurrency caps differ — Ollama Pro is
// 3 connections, Anthropic Claude has higher per-tier limits, etc.
// Each provider gets its own lane name so they can be configured
// independently via convars (lanes.<name>.max_concurrent). Lane names
// are user-facing (admin dashboard + convar key suffixes) and need to
// stay stable across deploys; an env-overridable map adds complexity
// for no current benefit.
//
// Test: TestLaneFor_Mapping in lane_transport_test.go covers this table.
package model

import "strings"

// Lane name constants. Defined as exported strings so admin code (.skill
// admin set-lane <skill> <lane>), webui dropdowns, and convar consumers
// share a single canonical spelling.
const (
	// LaneOllama covers ollama-cloud/* and foreman/* (and any future
	// ollama/* local). The local ollama instance is on the same physical
	// resource as the cloud account from mort's perspective, so the
	// connection cap should apply jointly.
	LaneOllama = "ollama"

	// LaneAnthropicThinking is the lane for Anthropic models in
	// extended-thinking mode. Separated from default because thinking
	// requests hold connections longer and can starve faster lanes
	// when multiplexed.
	LaneAnthropicThinking = "anthropic-thinking"

	// LaneAnthropicDefault is the lane for non-thinking Anthropic
	// requests (haiku, sonnet, opus without -thinking-).
	LaneAnthropicDefault = "anthropic-default"

	// LaneM1 is the lane for m1/* models (foreman-style router
	// pointing at a dedicated local instance). Separated from the
	// ollama lane because m1 targets a distinct host with its own
	// connection budget.
	LaneM1 = "m1"

	// LaneLLMDefault is the catch-all lane for any provider/model
	// combination not explicitly mapped above.
	LaneLLMDefault = "llm-default"
)

// LaneFor returns the lane name for the given model spec. Surrounding
// whitespace is trimmed before matching. Mapping:
//
//	ollama-cloud/*          → "ollama" (Pro account: 3 connections)
//	foreman/*               → "ollama" (Ollama-backed, shares the cap)
//	anthropic/*-thinking-*  → "anthropic-thinking"
//	anthropic/*             → "anthropic-default"
//	m1/*                    → "m1"
//	(anything else)         → "llm-default"
//
// Tier aliases (fast/standard/thinking) flow through this function as
// the resolver's expanded provider/model spec, so callers don't need
// to think about tier indirection. Empty input falls through to
// LaneLLMDefault rather than panicking; this is defensive against
// unset model specs in edge-case test wiring.
//
// Substring match for "-thinking-" keeps future Anthropic naming
// variations classified correctly without churning this table on
// every model release.
func LaneFor(modelSpec string) string {
	s := strings.TrimSpace(modelSpec)
	if strings.HasPrefix(s, "ollama-cloud/") {
		return LaneOllama
	}
	if strings.HasPrefix(s, "anthropic/") {
		if strings.Contains(s, "-thinking-") {
			return LaneAnthropicThinking
		}
		return LaneAnthropicDefault
	}
	// Foreman instances are backed by Ollama and share its connection
	// cap, so they route to the same lane.
	if strings.HasPrefix(s, "foreman/") {
		return LaneOllama
	}
	// m1/ is a foreman-style router pointing at a dedicated local
	// instance with its own connection budget. Separate lane so its
	// concurrency cap is independent of the shared ollama lane.
	if strings.HasPrefix(s, "m1/") {
		return LaneM1
	}
	return LaneLLMDefault
}
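
The file header motivates lanes by per-provider concurrency caps (lanes.<name>.max_concurrent). As a rough illustration of how a lane name could gate concurrency, here is a minimal sketch using one buffered channel per lane as a counting semaphore. `laneLimiter`, `newLaneLimiter`, `do`, and `peakConcurrency` are hypothetical names and are not the lane_transport.go implementation. The cap of 3 for "ollama" comes from the comments above; the cap for "m1" is an assumption. In the real package the lane string would come from LaneFor(modelSpec).

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// laneLimiter is a hypothetical per-lane gate: one buffered channel per
// lane name, used as a counting semaphore.
type laneLimiter struct {
	slots map[string]chan struct{}
}

// newLaneLimiter builds a limiter from lane name -> max concurrent calls.
// Caps below 1 are raised to 1 so a lane can never deadlock.
func newLaneLimiter(caps map[string]int) *laneLimiter {
	l := &laneLimiter{slots: make(map[string]chan struct{}, len(caps))}
	for lane, n := range caps {
		if n < 1 {
			n = 1
		}
		l.slots[lane] = make(chan struct{}, n)
	}
	return l
}

// do runs fn while holding a slot in the lane. Lanes without a configured
// cap run uncapped in this sketch.
func (l *laneLimiter) do(lane string, fn func()) {
	sem, ok := l.slots[lane]
	if !ok {
		fn()
		return
	}
	sem <- struct{}{}        // acquire a slot, blocking when the lane is full
	defer func() { <-sem }() // release the slot
	fn()
}

// peakConcurrency fires n concurrent calls into a lane and reports the
// highest number of calls observed in flight at the same time.
func peakConcurrency(l *laneLimiter, lane string, n int) int {
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		inFlight int
		peak     int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l.do(lane, func() {
				mu.Lock()
				inFlight++
				if inFlight > peak {
					peak = inFlight
				}
				mu.Unlock()
				time.Sleep(time.Millisecond) // simulate a provider call
				mu.Lock()
				inFlight--
				mu.Unlock()
			})
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	// "ollama": 3 mirrors the Pro-account cap; "m1": 1 is an assumed value.
	l := newLaneLimiter(map[string]int{"ollama": 3, "m1": 1})
	peak := peakConcurrency(l, "ollama", 10)
	fmt.Println("ollama peak within cap:", peak <= 3)
}
```

Keying the limiter by lane name rather than by model spec is what lets ollama-cloud/* and foreman/* share one budget while m1/* keeps its own.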