feat: foundations — canonical types, Parse grammar, env DSNs, health, chains
Phase 1 of the majordomo build:

- llm/ canonical contract (messages, parts, tools, capabilities, streaming, Model/Provider, error classification)
- health/ clock-injected tracker (threshold bench, exponential capped cooldown, reset-on-success)
- root Registry + Parse (verbatim model ids, inline recursive alias expansion with cycle detection, chain dedup), LLM_* env-DSN providers (go-llm parity: lazy fallback + eager LoadEnv), health-aware chain executor behind the Model interface
- provider/fake scriptable test provider; hermetic test suite incl. the trailing-thinking chain and foreman:// env loading
- ADRs 0001-0008, CLAUDE.md, README (honest matrix), CI workflow, docs/phase-1-design.md

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@@ -0,0 +1,84 @@
# Phase 1 design summary (for after-the-fact review)

Written at the Phase 1 → 2 boundary of the unattended build run
(2026-06-10). Captures the public surface and the decisions behind it.
Authoritative details live in the ADRs; this is the review digest.

## What the library looks like to a consumer

```go
reg := majordomo.New() // built-ins + LLM_* env providers
reg.RegisterAlias("thinking", "anthropic/opus-4.8,ollama-cloud/minimax-m3:cloud")

m, err := reg.Parse("m5/qwen3:30b,ollama-cloud/kimi-k2.6:cloud,thinking")
resp, err := m.Generate(ctx, majordomo.Request{
	System:   "You are terse.",
	Messages: []majordomo.Message{majordomo.UserText("hi")},
}, majordomo.WithMaxTokens(200))
```

- `Model` = `Generate` / `Stream` / `Capabilities`; a chain and a single
  target are the same interface.
- `Provider` = `Name` / `Model(id, opts...)`; model ids are passed through
  verbatim, and there are no model catalogs.
- Canonical types live in `majordomo/llm`, re-exported at root via aliases
  (ADR-0001); providers import `llm` only.

## Parse grammar (ADR-0003)

`spec := element ("," element)*`. An element is either `provider/model`
(the model id is everything after the first slash, verbatim) or a bare
alias token, which is expanded inline and recursively with cycle detection.
Both kickoff README examples are covered by tests, including the
trailing-`thinking` variant and dedup of overlapping alias expansions.
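
The grammar above is small enough to sketch in full. The following is an illustrative reading of it, not the library's implementation: `Target`, `ParseSpec`, the error messages, and the trimming of spaces around elements are all assumptions made for the example. It shows the three behaviours the paragraph names: splitting on the first slash only, inline recursive alias expansion with cycle detection, and first-occurrence-wins dedup.

```go
package main

import (
	"fmt"
	"strings"
)

// Target is one resolved chain element (hypothetical type for this sketch).
type Target struct {
	Provider string
	Model    string
}

// ParseSpec expands spec into an ordered, deduplicated target list.
// aliases maps an alias name to the spec it stands for.
func ParseSpec(spec string, aliases map[string]string) ([]Target, error) {
	var out []Target
	seen := map[Target]bool{}   // dedup: first occurrence wins
	active := map[string]bool{} // aliases on the current expansion path

	var expand func(spec string) error
	expand = func(spec string) error {
		for _, raw := range strings.Split(spec, ",") {
			el := strings.TrimSpace(raw) // assumption: surrounding spaces are ignored
			// Split on the first slash only; the rest is the verbatim model id.
			if provider, model, ok := strings.Cut(el, "/"); ok {
				if provider == "" || model == "" {
					return fmt.Errorf("malformed element %q", el)
				}
				t := Target{Provider: provider, Model: model}
				if !seen[t] {
					seen[t] = true
					out = append(out, t)
				}
				continue
			}
			// No slash: treat the token as an alias and expand it in place.
			sub, ok := aliases[el]
			if !ok {
				return fmt.Errorf("unknown alias %q", el)
			}
			if active[el] {
				return fmt.Errorf("alias cycle through %q", el)
			}
			active[el] = true
			err := expand(sub)
			delete(active, el)
			if err != nil {
				return err
			}
		}
		return nil
	}
	if err := expand(spec); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	aliases := map[string]string{
		"thinking": "anthropic/opus-4.8,ollama-cloud/minimax-m3:cloud",
	}
	targets, err := ParseSpec("m5/qwen3:30b,ollama-cloud/kimi-k2.6:cloud,thinking", aliases)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, t := range targets {
		fmt.Printf("%s/%s\n", t.Provider, t.Model)
	}
}
```

Because only the first slash separates provider from model, ids that themselves contain slashes or colons survive untouched, which is the property the verbatim-id rule depends on.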

**Deviation from go-llm worth reviewing:** no `:low/:medium/:high`
reasoning-suffix stripping, because it conflicts with verbatim ids
(`minimax-m3:cloud`, `richardyoung/qwen3-14b-abliterated:q4_K_M` in mort's
tiers). Plan: reasoning effort becomes an explicit request option when
providers land; mort's wrapper translates its legacy suffix dialect during
Phase 9. If you want suffix parity instead, it's an additive change behind
a `RegistryOption`.

## LLM_* env DSNs (ADR-0004)

The parser is byte-for-byte go-llm: `scheme://[token@]host[/path]`, https
forced, fail-on-use for malformed values. There are two resolution paths:
an eager scan in `New()`/`LoadEnv(map)` (kickoff requirement; `LLM_M1` →
provider `m1`) **plus** go-llm's lazy `LLM_{UPPER(name)}` fallback at Parse
time (so hyphenated names keep working). Schemes are factories
(`RegisterScheme`), so consumers can bind custom provider kinds to DSNs.
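
To make the DSN shape and the eager scan concrete, here is a minimal sketch. It is not the go-llm parser (which the text says is matched byte for byte); it leans on the standard `net/url` package instead, and `DSN`, `Entry`, `ParseDSN`, `ScanEnv`, and the `example.invalid` host are names invented for the example. Malformed values are stored with their error rather than rejected up front, which is one way to get fail-on-use behaviour.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// DSN is the parsed form of scheme://[token@]host[/path] (hypothetical type).
type DSN struct {
	Scheme  string // selects the provider factory
	Token   string // optional credential from the userinfo part
	BaseURL string // endpoint, always rewritten to https
}

// The message deliberately omits the raw value, which may contain a token.
var errBadDSN = errors.New("dsn: want scheme://[token@]host[/path]")

// ParseDSN is an approximation built on net/url, not the real parser.
func ParseDSN(raw string) (DSN, error) {
	u, err := url.Parse(raw)
	if err != nil || u.Scheme == "" || u.Host == "" {
		return DSN{}, errBadDSN
	}
	token := ""
	if u.User != nil {
		token = u.User.Username()
	}
	base := url.URL{Scheme: "https", Host: u.Host, Path: u.Path}
	return DSN{Scheme: u.Scheme, Token: token, BaseURL: base.String()}, nil
}

// Entry keeps the parse error so it can surface when the provider is used.
type Entry struct {
	DSN DSN
	Err error
}

// ScanEnv maps LLM_<NAME> variables to lowercase provider names (LLM_M1 -> m1).
func ScanEnv(env map[string]string) map[string]Entry {
	out := map[string]Entry{}
	for key, val := range env {
		if !strings.HasPrefix(key, "LLM_") || key == "LLM_" {
			continue
		}
		name := strings.ToLower(strings.TrimPrefix(key, "LLM_"))
		dsn, err := ParseDSN(val)
		out[name] = Entry{DSN: dsn, Err: err}
	}
	return out
}

func main() {
	providers := ScanEnv(map[string]string{
		"LLM_M1": "foreman://secret@m1.example.invalid/v1",
		"HOME":   "/home/dev", // ignored: no LLM_ prefix
	})
	e := providers["m1"]
	fmt.Println(e.DSN.Scheme, e.DSN.BaseURL, e.Err)
}
```

The lazy `LLM_{UPPER(name)}` fallback is left out here because its exact name-mangling rules are not spelled out in this digest.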

## Health & chains (ADR-0006, ADR-0008)

Clock-injected in-memory tracker keyed by `provider/model`. Errors are
classified as transient or permanent via `llm.Classify` (unknown →
transient; `context.Canceled` → permanent). Defaults: 1 same-target retry;
bench after 2 consecutive failed attempts; exponential cooldown with a 5s
base, ×2 multiplier, and 5m cap; success resets everything. Chains skip
benched targets, advance penalty-free on 404, fail fast on auth/malformed
(flippable via `AdvanceOnPermanent`), and join per-target reasons on
exhaustion. Chain `Capabilities()` reports the head element's capabilities
(per-attempt media normalization will use the actual target, in Phase 3).
Streaming failover covers stream establishment only.
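
The bench and cooldown defaults above can be pictured with a small clock-injected tracker. This is a sketch under assumptions, not the `health/` package: the type and method names are invented, and it takes one plausible reading of the backoff in which every further consecutive failure after the threshold doubles the cooldown. The real tracker may count differently.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Defaults quoted from the text; the names are hypothetical.
const (
	benchThreshold = 2
	baseCooldown   = 5 * time.Second
	maxCooldown    = 5 * time.Minute
)

type targetState struct {
	failures int       // consecutive failures since the last success
	until    time.Time // benched until this instant
}

// Tracker is keyed by "provider/model" and reads time only through now.
type Tracker struct {
	mu    sync.Mutex
	now   func() time.Time
	state map[string]*targetState
}

func NewTracker(now func() time.Time) *Tracker {
	return &Tracker{now: now, state: map[string]*targetState{}}
}

// Failure records one failed attempt and benches the target at the threshold.
func (t *Tracker) Failure(key string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	s := t.state[key]
	if s == nil {
		s = &targetState{}
		t.state[key] = s
	}
	s.failures++
	if s.failures < benchThreshold {
		return
	}
	// Assumed reading: cooldown = base * 2^(failures - threshold), capped.
	d := baseCooldown
	for i := benchThreshold; i < s.failures && d < maxCooldown; i++ {
		d *= 2
	}
	if d > maxCooldown {
		d = maxCooldown
	}
	s.until = t.now().Add(d)
}

// Success clears all state for the target.
func (t *Tracker) Success(key string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.state, key)
}

// Benched reports whether the target is still cooling down.
func (t *Tracker) Benched(key string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	s := t.state[key]
	return s != nil && t.now().Before(s.until)
}

// ManualClock only moves when Advance is called, keeping tests hermetic.
type ManualClock struct{ t time.Time }

func (c *ManualClock) Now() time.Time          { return c.t }
func (c *ManualClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

func main() {
	clock := &ManualClock{}
	tr := NewTracker(clock.Now)
	tr.Failure("m5/qwen3:30b")
	tr.Failure("m5/qwen3:30b")
	fmt.Println("benched:", tr.Benched("m5/qwen3:30b"))
	clock.Advance(baseCooldown)
	fmt.Println("benched after cooldown:", tr.Benched("m5/qwen3:30b"))
}
```

Injecting `now` instead of calling `time.Now` directly is what lets a test step through cooldown windows without sleeping.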

## Flagged for reconsideration

1. **Reasoning suffixes** (above): deliberate deviation, easy to add back.
2. **Duplicate-element dedup in chains** (first occurrence wins): right for
   health semantics, but means `a,b,a` won't retry `a` at the tail even
   after `b` fails. Believed correct (same request, same bench state);
   flag if "retry head last" matters to you.
3. **`AdvanceOnPermanent` default = fail-fast** on auth/malformed errors:
   matches the kickoff; mort's old behavior was closer to
   advance-on-everything. Phase 9 can set the flag per-registry if mort's
   UX prefers availability.
4. **Stub built-ins**: until Phases 3–4, `openai/...` etc. parse fine and
   error on use with "not implemented yet". Chains mixing stubs and real
   providers will fail over past stubs naturally (the error classifies
   as transient). This is temporary and gone by Phase 4.
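
Items 3 and 4 in the list above both come down to how the chain loop reacts to a classified error. The following is a stripped-down sketch of that decision only: it omits health tracking, same-target retries, and the penalty-free 404 case, and every name in it (`Kind`, `kindError`, `classify`, `runChain`, the target helpers) is hypothetical rather than taken from the library. The only rules borrowed from the text are that unknown errors classify as transient and that permanent errors stop the chain unless the advance flag is set.

```go
package main

import (
	"errors"
	"fmt"
)

type Kind int

const (
	Transient Kind = iota // fail over to the next target
	Permanent             // auth or malformed request
)

// kindError is a stand-in for whatever llm.Classify inspects.
type kindError struct {
	kind Kind
	msg  string
}

func (e *kindError) Error() string { return e.msg }

// classify treats anything it does not recognise as transient.
func classify(err error) Kind {
	var ke *kindError
	if errors.As(err, &ke) {
		return ke.kind
	}
	return Transient
}

type target struct {
	name string
	call func() (string, error)
}

func okTarget(name, reply string) target {
	return target{name, func() (string, error) { return reply, nil }}
}

func failTarget(name string, kind Kind, msg string) target {
	return target{name, func() (string, error) { return "", &kindError{kind, msg} }}
}

// stubTarget returns a plain error, so it classifies as transient.
func stubTarget(name string) target {
	return target{name, func() (string, error) { return "", errors.New("not implemented yet") }}
}

// runChain tries targets in order and joins per-target reasons on failure.
func runChain(targets []target, advanceOnPermanent bool) (string, error) {
	var reasons []error
	for _, t := range targets {
		out, err := t.call()
		if err == nil {
			return out, nil
		}
		reasons = append(reasons, fmt.Errorf("%s: %w", t.name, err))
		if classify(err) == Permanent && !advanceOnPermanent {
			break // fail fast: later targets are not attempted
		}
	}
	return "", errors.Join(reasons...)
}

func main() {
	chain := []target{
		stubTarget("openai/some-model"),
		failTarget("anthropic/opus-4.8", Permanent, "401 unauthorized"),
		okTarget("m5/qwen3:30b", "hi"),
	}
	_, err := runChain(chain, false)
	fmt.Println("fail-fast:", err != nil)
	out, err := runChain(chain, true)
	fmt.Println("advance:", out, err)
}
```

With the flag off, the chain steps past the stub but stops at the auth failure; with it on, the same chain reaches the last target.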

## ADR set

0001 package layout · 0002 message model · 0003 parse grammar ·
0004 env DSNs · 0005 provider/capabilities · 0006 health/backoff ·
0007 dependency policy · 0008 chain semantics