feat: foundations — canonical types, Parse grammar, env DSNs, health, chains

Phase 1 of the majordomo build:
- llm/ canonical contract (messages, parts, tools, capabilities, streaming,
  Model/Provider, error classification)
- health/ clock-injected tracker (threshold bench, exponential capped
  cooldown, reset-on-success)
- root Registry + Parse (verbatim model ids, inline recursive alias
  expansion with cycle detection, chain dedup), LLM_* env-DSN providers
  (go-llm parity: lazy fallback + eager LoadEnv), health-aware chain
  executor behind the Model interface
- provider/fake scriptable test provider; hermetic test suite incl. the
  trailing-thinking chain and foreman:// env loading
- ADRs 0001-0008, CLAUDE.md, README (honest matrix), CI workflow,
  docs/phase-1-design.md

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 12:35:23 +02:00
parent 3025044817
commit dcd004289f
42 changed files with 3863 additions and 0 deletions
@@ -0,0 +1,84 @@
# Phase 1 design summary (for after-the-fact review)
Written at the Phase 1 → 2 boundary of the unattended build run
(2026-06-10). Captures the public surface and the decisions behind it.
Authoritative details live in the ADRs; this is the review digest.
## What the library looks like to a consumer
```go
reg := majordomo.New() // built-ins + LLM_* env providers
reg.RegisterAlias("thinking", "anthropic/opus-4.8,ollama-cloud/minimax-m3:cloud")
m, err := reg.Parse("m5/qwen3:30b,ollama-cloud/kimi-k2.6:cloud,thinking")
resp, err := m.Generate(ctx, majordomo.Request{
System: "You are terse.",
Messages: []majordomo.Message{majordomo.UserText("hi")},
}, majordomo.WithMaxTokens(200))
```
- `Model` = `Generate` / `Stream` / `Capabilities`; a chain and a single
target are the same interface.
- `Provider` = `Name` / `Model(id, opts...)`; ids verbatim, no catalogs.
- Canonical types live in `majordomo/llm`, re-exported at root via aliases
(ADR-0001) — providers import `llm` only.
## Parse grammar (ADR-0003)
`spec := element ("," element)*`; element = `provider/model` (model id =
everything after the first slash, verbatim) or a bare alias token expanded
inline + recursively with cycle detection. Both kickoff README examples are
covered by tests, including the trailing-`thinking` variant and dedup of
overlapping alias expansions.
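As a rough illustration of the grammar, the sketch below expands a spec into an ordered, deduplicated target list. It is not majordomo's `Parse`: the `expand` function, the `map[string]string` alias table, the rejection of empty elements, and the error messages are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// expand flattens a comma-separated spec into provider/model targets.
// aliases maps a bare alias token to another spec (hypothetical storage).
func expand(spec string, aliases map[string]string) ([]string, error) {
	var out []string
	seen := map[string]bool{}   // targets already emitted; first occurrence wins
	active := map[string]bool{} // aliases on the current expansion path

	var walk func(spec string) error
	walk = func(spec string) error {
		for _, el := range strings.Split(spec, ",") {
			// Split on the FIRST slash only: the model id is kept verbatim,
			// including any later slashes or colons.
			if provider, model, ok := strings.Cut(el, "/"); ok {
				if provider == "" || model == "" {
					return fmt.Errorf("malformed element %q", el)
				}
				if !seen[el] {
					seen[el] = true
					out = append(out, el)
				}
				continue
			}
			body, ok := aliases[el]
			if !ok {
				return fmt.Errorf("unknown alias %q", el)
			}
			if active[el] {
				return fmt.Errorf("alias cycle at %q", el)
			}
			active[el] = true
			if err := walk(body); err != nil {
				return err
			}
			delete(active, el) // leaving this alias; it may legally reappear later
		}
		return nil
	}

	if err := walk(spec); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	aliases := map[string]string{
		"thinking": "anthropic/opus-4.8,ollama-cloud/minimax-m3:cloud",
	}
	targets, err := expand("m5/qwen3:30b,ollama-cloud/kimi-k2.6:cloud,thinking", aliases)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, t := range targets {
		fmt.Println(t)
	}
}
```

Tracking the active expansion path separately from the emitted set keeps cycle detection apart from dedup: an alias used twice in a spec is fine, while an alias that reaches itself is an error.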
**Deviation from go-llm worth reviewing:** no `:low/:medium/:high`
reasoning-suffix stripping — it conflicts with verbatim ids
(`minimax-m3:cloud`, `richardyoung/qwen3-14b-abliterated:q4_K_M` in mort's
tiers). Plan: reasoning effort becomes an explicit request option when
providers land; mort's wrapper translates its legacy suffix dialect during
Phase 9. If you want suffix parity instead, it's an additive change behind
a RegistryOption.
## LLM_* env DSNs (ADR-0004)
Parser is byte-for-byte go-llm (`scheme://[token@]host[/path]`, https
forced, fail-on-use for malformed values). Two resolution paths:
eager scan in `New()`/`LoadEnv(map)` (kickoff requirement;
`LLM_M1` → provider `m1`) **plus** go-llm's lazy `LLM_{UPPER(name)}`
fallback at Parse time (so hyphenated names keep working). Schemes are
factories (`RegisterScheme`) — consumers can bind custom provider kinds to
DSNs.
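The eager path can be pictured with the sketch below. It is an approximation built on the standard `net/url` package, not the go-llm parser, so edge-case behavior may differ. The `dsn` and `entry` types, the `parseDSN` and `loadEnv` functions, and the example host are hypothetical. The lazy `LLM_{UPPER(name)}` fallback is not shown.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// dsn holds the pieces of scheme://[token@]host[/path].
type dsn struct {
	Scheme  string // selects the provider factory
	Token   string // optional credential before '@'
	BaseURL string // always https, regardless of the DSN scheme
}

// entry keeps the parse error so it can surface when the provider is used,
// rather than when the environment is scanned (fail-on-use).
type entry struct {
	dsn dsn
	err error
}

func parseDSN(raw string) (dsn, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return dsn{}, errors.New("malformed DSN")
	}
	if u.Scheme == "" || u.Host == "" {
		// Deliberately omit the raw value: it may contain a token.
		return dsn{}, errors.New("malformed DSN: missing scheme or host")
	}
	d := dsn{Scheme: u.Scheme, BaseURL: "https://" + u.Host + u.Path}
	if u.User != nil {
		d.Token = u.User.Username()
	}
	return d, nil
}

// loadEnv maps LLM_<NAME> variables to lowercase provider names.
func loadEnv(env map[string]string) map[string]entry {
	out := map[string]entry{}
	for key, val := range env {
		if !strings.HasPrefix(key, "LLM_") || key == "LLM_" {
			continue
		}
		name := strings.ToLower(strings.TrimPrefix(key, "LLM_"))
		d, err := parseDSN(val)
		out[name] = entry{dsn: d, err: err}
	}
	return out
}

func main() {
	env := map[string]string{"LLM_M1": "foreman://secret@llm.example.internal/v1"}
	for name, e := range loadEnv(env) {
		fmt.Println(name, e.dsn.Scheme, e.dsn.BaseURL, e.err)
	}
}
```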
## Health & chains (ADR-0006, ADR-0008)
Clock-injected in-memory tracker keyed `provider/model`. Transient vs
permanent via `llm.Classify` (unknown → transient; `context.Canceled`
permanent). Defaults: 1 same-target retry; bench after 2 consecutive failed
attempts; cooldown starts at 5s and doubles up to a 5m cap; success resets
everything. Chains skip
benched targets, advance penalty-free on 404, fail fast on auth/malformed
(flippable via `AdvanceOnPermanent`), and join per-target reasons on
exhaustion. Chain `Capabilities()` = head element (per-attempt media
normalization will use the actual target, Phase 3). Streaming failover
covers stream establishment only.
## Flagged for reconsideration
1. **Reasoning suffixes** (above) — deliberate deviation, easy to add back.
2. **Duplicate-element dedup in chains** (first occurrence wins): right for
health semantics, but means `a,b,a` won't retry `a` at the tail even
after `b` fails. Believed correct (same request, same bench state);
flag if "retry head last" matters to you.
3. **`AdvanceOnPermanent` default = fail-fast** on auth/malformed errors:
matches the kickoff; mort's old behavior was closer to
advance-on-everything. Phase 9 can set the flag per-registry if mort's
UX prefers availability.
4. **Stub built-ins**: until Phases 3-4, `openai/...` etc. parse fine and
error on use with "not implemented yet". Chains mixing stubs and real
providers will fail over past stubs naturally (the error classifies
transient) — temporary, gone by Phase 4.
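To make items 3 and 4 concrete, here is a simplified sketch of a chain loop with the fail-fast toggle. Everything in it is hypothetical: the `kind` enum stands in for whatever `llm.Classify` returns, `runChain` and `target` are invented names, and returning the joined reasons on fail-fast is an assumption. Same-target retries and health bookkeeping are omitted.

```go
package main

import (
	"errors"
	"fmt"
)

// kind is a hypothetical stand-in for an error classification.
type kind int

const (
	transient kind = iota
	notFound       // e.g. a 404: advance without a health penalty
	permanent      // auth or malformed request
)

type classified struct {
	kind kind
	msg  string
}

func (e *classified) Error() string { return e.msg }

// classify treats any error it does not recognize as transient.
func classify(err error) kind {
	var c *classified
	if errors.As(err, &c) {
		return c.kind
	}
	return transient
}

type target struct {
	key  string
	call func() (string, error)
}

func runChain(targets []target, benched func(string) bool, advanceOnPermanent bool) (string, error) {
	if len(targets) == 0 {
		return "", errors.New("empty chain")
	}
	var reasons []error
	for _, t := range targets {
		if benched(t.key) {
			reasons = append(reasons, fmt.Errorf("%s: benched", t.key))
			continue
		}
		out, err := t.call()
		if err == nil {
			return out, nil
		}
		reasons = append(reasons, fmt.Errorf("%s: %w", t.key, err))
		if classify(err) == permanent && !advanceOnPermanent {
			return "", errors.Join(reasons...) // fail fast
		}
		// transient and notFound both advance to the next target; a real
		// executor would record a health failure only for transient errors.
	}
	return "", errors.Join(reasons...)
}

func main() {
	stub := func() (string, error) { return "", errors.New("not implemented yet") }
	real := func() (string, error) { return "hello", nil }
	chain := []target{{key: "openai/some-model", call: stub}, {key: "m5/qwen3:30b", call: real}}
	out, err := runChain(chain, func(string) bool { return false }, false)
	fmt.Println(out, err)
}
```

In this sketch the stub's plain error is unclassified, so it counts as transient and the chain moves on, which mirrors the stub behavior described in item 4.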
## ADR set
0001 package layout · 0002 message model · 0003 parse grammar ·
0004 env DSNs · 0005 provider/capabilities · 0006 health/backoff ·
0007 dependency policy · 0008 chain semantics