# Phase 1 design summary (for after-the-fact review)

Written at the Phase 1 → 2 boundary of the unattended build run (2026-06-10). Captures the public surface and the decisions behind it. Authoritative details live in the ADRs; this is the review digest.

## What the library looks like to a consumer

```go
reg := majordomo.New() // built-ins + LLM_* env providers
reg.RegisterAlias("thinking", "anthropic/opus-4.8,ollama-cloud/minimax-m3:cloud")

m, err := reg.Parse("m5/qwen3:30b,ollama-cloud/kimi-k2.6:cloud,thinking")
resp, err := m.Generate(ctx, majordomo.Request{
	System:   "You are terse.",
	Messages: []majordomo.Message{majordomo.UserText("hi")},
}, majordomo.WithMaxTokens(200))
```

- `Model` = `Generate` / `Stream` / `Capabilities`; a chain and a single target are the same interface.
- `Provider` = `Name` / `Model(id, opts...)`; ids verbatim, no catalogs.
- Canonical types live in `majordomo/llm`, re-exported at root via aliases (ADR-0001); providers import `llm` only.

## Parse grammar (ADR-0003)

`spec := element ("," element)*`; element = `provider/model` (model id = everything after the first slash, verbatim) or a bare alias token expanded inline + recursively with cycle detection. Both kickoff README examples are covered by tests, including the trailing-`thinking` variant and dedup of overlapping alias expansions.

**Deviation from go-llm worth reviewing:** no `:low/:medium/:high` reasoning-suffix stripping, because it conflicts with verbatim ids (`minimax-m3:cloud`, `richardyoung/qwen3-14b-abliterated:q4_K_M` in mort's tiers). Plan: reasoning effort becomes an explicit request option when providers land; mort's wrapper translates its legacy suffix dialect during Phase 9. If you want suffix parity instead, it's an additive change behind a RegistryOption.

## LLM_* env DSNs (ADR-0004)

Parser is byte-for-byte go-llm (`scheme://[token@]host[/path]`, https forced, fail-on-use for malformed values).
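
To make the DSN shape concrete, here is a minimal sketch of one way to read `scheme://[token@]host[/path]`. It is not go-llm's parser: it uses the standard `net/url` package as a stand-in, and `Endpoint` and `parseDSN` are hypothetical names. It also assumes that "https forced" means the resulting base URL always uses `https`, and that "fail-on-use" means a malformed value is recorded as an error at load time and only reported when the provider is actually used.

```go
package main

import (
	"fmt"
	"net/url"
)

// Endpoint is a hypothetical result type for this sketch only.
type Endpoint struct {
	Scheme  string // selects the provider kind (assumption)
	Token   string // optional credential taken from the userinfo part
	BaseURL string // always https in this sketch ("https forced")
	Err     error  // set for malformed values; reported on use, not at load
}

// parseDSN reads scheme://[token@]host[/path]. Malformed input does not
// abort loading; the error is stored on the Endpoint instead.
func parseDSN(dsn string) Endpoint {
	u, err := url.Parse(dsn)
	if err != nil {
		return Endpoint{Err: fmt.Errorf("malformed DSN: %w", err)}
	}
	if u.Scheme == "" || u.Host == "" {
		return Endpoint{Err: fmt.Errorf("malformed DSN: want scheme://[token@]host[/path]")}
	}
	ep := Endpoint{Scheme: u.Scheme, BaseURL: "https://" + u.Host + u.Path}
	if u.User != nil {
		ep.Token = u.User.Username()
	}
	return ep
}

func main() {
	ep := parseDSN("ollama://tok@host.example/v1")
	fmt.Println(ep.Scheme, ep.Token, ep.BaseURL) // ollama tok https://host.example/v1

	bad := parseDSN("host-without-scheme")
	fmt.Println(bad.Err != nil) // true
}
```

Storing the error on the value rather than returning it from the loader is what lets one bad environment variable coexist with working providers under this reading.
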
Two resolution paths: eager scan in `New()`/`LoadEnv(map)` (kickoff requirement; `LLM_M1` → provider `m1`) **plus** go-llm's lazy `LLM_{UPPER(name)}` fallback at Parse time (so hyphenated names keep working). Schemes are factories (`RegisterScheme`), so consumers can bind custom provider kinds to DSNs.

## Health & chains (ADR-0006, ADR-0008)

Clock-injected in-memory tracker keyed `provider/model`. Transient vs permanent via `llm.Classify` (unknown → transient; `context.Canceled` → permanent). Defaults: 1 same-target retry; bench after 2 consecutive failed attempts; cooldown 5s ×2 capped 5m; success resets everything.

Chains skip benched targets, advance penalty-free on 404, fail fast on auth/malformed (flippable via `AdvanceOnPermanent`), and join per-target reasons on exhaustion. Chain `Capabilities()` = head element (per-attempt media normalization will use the actual target, Phase 3). Streaming failover covers stream establishment only.

## Flagged for reconsideration

1. **Reasoning suffixes** (above): deliberate deviation, easy to add back.
2. **Duplicate-element dedup in chains** (first occurrence wins): right for health semantics, but means `a,b,a` won't retry `a` at the tail even after `b` fails. Believed correct (same request, same bench state); flag if "retry head last" matters to you.
3. **`AdvanceOnPermanent` default = fail-fast** on auth/malformed errors: matches the kickoff; mort's old behavior was closer to advance-on-everything. Phase 9 can set the flag per-registry if mort's UX prefers availability.
4. **Stub built-ins**: until Phases 3–4, `openai/...` etc. parse fine and error on use with "not implemented yet". Chains mixing stubs and real providers will fail over past stubs naturally (the error classifies transient). Temporary, gone by Phase 4.

## ADR set

0001 package layout · 0002 message model · 0003 parse grammar · 0004 env DSNs · 0005 provider/capabilities · 0006 health/backoff · 0007 dependency policy · 0008 chain semantics