P1: model layer (convar->config inversion) + llmmeta
Lifts mort's pkg/logic/llms into executus/model, decoupled from mort:

- tiers.go: the tier resolver now reads a host-supplied config.Source under "model.tier.<name>" with host-supplied fallbacks (Configure(cfg, defaults, ttl)), instead of convar.Manager. Tier NAMES + specs are host config; the resolution mechanism (cache, reasoning-suffix dialect, chain validation) is generic. No tier names hard-coded in the harness.
- sink.go: usage/trace recording inverted off mort's llmusage/llmtrace into UsageSink / TraceSink seams + a model-owned Span, with nil-safe context attribution helpers (WithModel/WithTraceID/WithUsageTool/WithUsageUser). Both sinks optional (nil = off) so a light host records nothing.
- lane decoration repointed to executus/lane; utils.Errorf -> fmt.Errorf.
- call.go keeps GenerateWith[T] (instrumented structured output). This is the structured-output primitive; no separate structured/ package.
- llmmeta moved over model/ (the meta-LLM helper: tier allowlist + JSON retry + ledger). Its tests configure a minimal tier table via TestMain.

New tests cover the inversion: config overrides fallback, tier registration, reasoning-suffix survival, nested-tier rejection, nil-sink no-ops.

Full module: go build/vet/test -race green; core go.sum still free of gorm/redis/discordgo/sqlite.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit was merged in pull request #1.
@@ -0,0 +1,91 @@
// Package model (lane_mapping.go) maps a model spec to a stable lane
// name. Pure data + a single function; no dependency on the registry,
// no provider wrapping. Kept separate from lane_transport.go so the
// mapping table can be committed and reviewed in isolation, and so
// admin / webui code that just wants to *display* lane assignments
// doesn't drag in the transport machinery.
//
// Why a fixed table: provider concurrency caps differ. Ollama Pro is
// 3 connections, Anthropic Claude has higher per-tier limits, etc.
// Each provider gets its own lane name so they can be configured
// independently via convars (lanes.<name>.max_concurrent). Lane names
// are user-facing (admin dashboard + convar key suffixes) and need to
// stay stable across deploys; an env-overridable map adds complexity
// for no current benefit.
//
// Test: lane_transport_test.go covers TestLaneFor_Mapping.
package model

import "strings"

// Lane name constants. Defined as exported strings so admin code (.skill
// admin set-lane <skill> <lane>), webui dropdowns, and convar consumers
// share a single canonical spelling.
const (
	// LaneOllama covers ollama-cloud/* and foreman/* (and any future
	// ollama/* local). The local ollama instance is on the same
	// physical resource as the cloud account from mort's perspective,
	// so the connection cap should apply jointly.
	LaneOllama = "ollama"

	// LaneAnthropicThinking is the lane for Anthropic models in
	// extended-thinking mode. Separated from default because thinking
	// requests hold connections longer and can starve faster lanes
	// when multiplexed.
	LaneAnthropicThinking = "anthropic-thinking"

	// LaneAnthropicDefault is the lane for non-thinking Anthropic
	// requests (haiku, sonnet, opus without -thinking-).
	LaneAnthropicDefault = "anthropic-default"

	// LaneM1 is the lane for m1/* models (foreman-style router
	// pointing at a dedicated local instance). Separated from the
	// ollama lane because m1 targets a distinct host with its own
	// connection budget.
	LaneM1 = "m1"

	// LaneLLMDefault is the catch-all lane for any provider/model
	// combination not explicitly mapped above.
	LaneLLMDefault = "llm-default"
)

// LaneFor returns the lane name for the given model spec. Mapping:
//
//	ollama-cloud/*         → "ollama" (Pro account: 3 connections)
//	anthropic/*-thinking-* → "anthropic-thinking"
//	anthropic/*            → "anthropic-default"
//	foreman/*              → "ollama" (Ollama-backed, shares its cap)
//	m1/*                   → "m1"
//	(anything else)        → "llm-default"
//
// Tier aliases (fast/standard/thinking) flow through this function as
// the resolver's expanded provider/model spec, so callers don't need
// to think about tier indirection. Empty input falls through to
// LaneLLMDefault rather than panicking, which is defensive against
// unset model specs in edge-case test wiring.
//
// Substring match for "-thinking-" keeps future Anthropic naming
// variations classified correctly without churning this table on
// every model release.
func LaneFor(modelSpec string) string {
	s := strings.TrimSpace(modelSpec)
	if strings.HasPrefix(s, "ollama-cloud/") {
		return LaneOllama
	}
	if strings.HasPrefix(s, "anthropic/") {
		if strings.Contains(s, "-thinking-") {
			return LaneAnthropicThinking
		}
		return LaneAnthropicDefault
	}
	// Foreman instances are backed by Ollama and share its connection
	// cap, so they route to the same lane.
	if strings.HasPrefix(s, "foreman/") {
		return LaneOllama
	}
	// m1/ is a foreman-style router pointing at a dedicated local
	// instance with its own connection budget. Separate lane so its
	// concurrency cap is independent of the shared ollama lane.
	if strings.HasPrefix(s, "m1/") {
		return LaneM1
	}
	return LaneLLMDefault
}