Phase 1 of the majordomo build:
- llm/ canonical contract (messages, parts, tools, capabilities, streaming, Model/Provider, error classification)
- health/ clock-injected tracker (threshold bench, exponential capped cooldown, reset-on-success)
- root Registry + Parse (verbatim model ids, inline recursive alias expansion with cycle detection, chain dedup), LLM_* env-DSN providers (go-llm parity: lazy fallback + eager LoadEnv), health-aware chain executor behind the Model interface
- provider/fake scriptable test provider; hermetic test suite incl. the trailing-thinking chain and foreman:// env loading
- ADRs 0001-0008, CLAUDE.md, README (honest matrix), CI workflow, docs/phase-1-design.md

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Phase 1 design summary (for after-the-fact review)
Written at the Phase 1 → 2 boundary of the unattended build run (2026-06-10). Captures the public surface and the decisions behind it. Authoritative details live in the ADRs; this is the review digest.
What the library looks like to a consumer
reg := majordomo.New() // built-ins + LLM_* env providers
reg.RegisterAlias("thinking", "anthropic/opus-4.8,ollama-cloud/minimax-m3:cloud")
m, err := reg.Parse("m5/qwen3:30b,ollama-cloud/kimi-k2.6:cloud,thinking")
resp, err := m.Generate(ctx, majordomo.Request{
	System:   "You are terse.",
	Messages: []majordomo.Message{majordomo.UserText("hi")},
}, majordomo.WithMaxTokens(200))
- Model = Generate/Stream/Capabilities; a chain and a single target are the same interface.
- Provider = Name/Model(id, opts...); ids verbatim, no catalogs.
- Canonical types live in majordomo/llm, re-exported at root via aliases (ADR-0001); providers import llm only.
Parse grammar (ADR-0003)
spec := element ("," element)*; element = provider/model (model id =
everything after the first slash, verbatim) or a bare alias token expanded
inline + recursively with cycle detection. Both kickoff README examples are
covered by tests, including the trailing-thinking variant and dedup of
overlapping alias expansions.
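
The expansion rules above can be sketched roughly as follows. This is an illustration only, not the code in the repository: expandSpec and Target are hypothetical names, and trimming whitespace around elements and rejecting empty elements are assumptions the grammar does not state.

```go
package main

import (
	"fmt"
	"strings"
)

// Target is one resolved chain element (hypothetical type for this sketch).
type Target struct {
	Provider string
	Model    string // everything after the first slash, kept verbatim
}

// expandSpec splits spec on commas. An element containing "/" is taken as
// provider/model; any other element is looked up as an alias and expanded
// inline, recursively. Duplicates are dropped (first occurrence wins).
func expandSpec(spec string, aliases map[string]string) ([]Target, error) {
	var out []Target
	seen := map[Target]bool{}   // targets already emitted
	active := map[string]bool{} // aliases on the current expansion path

	var walk func(s string) error
	walk = func(s string) error {
		for _, el := range strings.Split(s, ",") {
			el = strings.TrimSpace(el) // assumption: surrounding spaces are ignored
			if el == "" {
				return fmt.Errorf("empty element in %q", s)
			}
			if provider, model, ok := strings.Cut(el, "/"); ok {
				t := Target{Provider: provider, Model: model}
				if !seen[t] {
					seen[t] = true
					out = append(out, t)
				}
				continue
			}
			body, ok := aliases[el]
			if !ok {
				return fmt.Errorf("unknown alias %q", el)
			}
			if active[el] {
				return fmt.Errorf("alias cycle through %q", el)
			}
			active[el] = true
			if err := walk(body); err != nil {
				return err
			}
			delete(active, el) // leaving this alias; reuse elsewhere is not a cycle
		}
		return nil
	}

	if err := walk(spec); err != nil {
		return nil, err
	}
	return out, nil
}
```

Tracking only the aliases on the current path (rather than every alias ever visited) lets the same alias appear twice in one spec without being reported as a cycle; the dedup set then collapses the repeated targets.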
Deviation from go-llm worth reviewing: no :low/:medium/:high
reasoning-suffix stripping — it conflicts with verbatim ids
(minimax-m3:cloud, richardyoung/qwen3-14b-abliterated:q4_K_M in mort's
tiers). Plan: reasoning effort becomes an explicit request option when
providers land; mort's wrapper translates its legacy suffix dialect during
Phase 9. If you want suffix parity instead, it's an additive change behind
a RegistryOption.
LLM_* env DSNs (ADR-0004)
Parser is byte-for-byte go-llm (scheme://[token@]host[/path], https
forced, fail-on-use for malformed values). Two resolution paths:
eager scan in New()/LoadEnv(map) (kickoff requirement;
LLM_M1 → provider m1) plus go-llm's lazy LLM_{UPPER(name)}
fallback at Parse time (so hyphenated names keep working). Schemes are
factories (RegisterScheme) — consumers can bind custom provider kinds to
DSNs.
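
For reviewers who want a feel for the DSN shape, here is a minimal sketch built on net/url. It makes no claim to go-llm parity: parseDSN, DSN, and providerName are hypothetical names, and both the lowercasing rule for eager-scan names and the handling of edge cases are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// DSN is a hypothetical parsed form of scheme://[token@]host[/path].
type DSN struct {
	Scheme  string // selects the provider factory
	Token   string // optional credential from the userinfo part
	BaseURL string // always https, regardless of the DSN scheme
}

// parseDSN validates the shape and forces https on the resulting base URL.
// Errors deliberately omit the raw value so a token never ends up in logs.
func parseDSN(raw string) (DSN, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return DSN{}, fmt.Errorf("malformed DSN: %w", err)
	}
	if u.Scheme == "" || u.Host == "" {
		return DSN{}, errors.New("malformed DSN: want scheme://[token@]host[/path]")
	}
	token := ""
	if u.User != nil {
		token = u.User.Username()
	}
	return DSN{
		Scheme:  u.Scheme,
		Token:   token,
		BaseURL: "https://" + u.Host + u.Path,
	}, nil
}

// providerName maps an environment key such as LLM_M1 to the provider name
// m1. Assumption: the eager scan simply lowercases whatever follows LLM_.
func providerName(envKey string) (string, bool) {
	if !strings.HasPrefix(envKey, "LLM_") {
		return "", false
	}
	name := strings.ToLower(strings.TrimPrefix(envKey, "LLM_"))
	return name, name != ""
}
```

To get the fail-on-use behavior described above, a caller would store the error from parseDSN alongside the provider name and surface it only when that provider is first used, instead of rejecting the environment at load time.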
Health & chains (ADR-0006, ADR-0008)
Clock-injected in-memory tracker keyed provider/model. Transient vs
permanent via llm.Classify (unknown → transient; context.Canceled →
permanent). Defaults: 1 same-target retry; bench after 2 consecutive failed
attempts; cooldown 5s ×2 capped 5m; success resets everything. Chains skip
benched targets, advance penalty-free on 404, fail fast on auth/malformed
(flippable via AdvanceOnPermanent), and join per-target reasons on
exhaustion. Chain Capabilities() = head element (per-attempt media
normalization will use the actual target, Phase 3). Streaming failover
covers stream establishment only.
Flagged for reconsideration
- Reasoning suffixes (above) — deliberate deviation, easy to add back.
- Duplicate-element dedup in chains (first occurrence wins): right for
  health semantics, but means a,b,a won't retry a at the tail even after b
  fails. Believed correct (same request, same bench state); flag if "retry
  head last" matters to you.
- AdvanceOnPermanent default = fail-fast on auth/malformed errors: matches
  the kickoff; mort's old behavior was closer to advance-on-everything.
  Phase 9 can set the flag per-registry if mort's UX prefers availability.
- Stub built-ins: until Phases 3–4, openai/... etc. parse fine and error on
  use with "not implemented yet". Chains mixing stubs and real providers
  will fail over past stubs naturally (the error classifies transient).
  This is temporary and gone by Phase 4.
ADR set
0001 package layout · 0002 message model · 0003 parse grammar · 0004 env DSNs · 0005 provider/capabilities · 0006 health/backoff · 0007 dependency policy · 0008 chain semantics