steve dcd004289f feat: foundations — canonical types, Parse grammar, env DSNs, health, chains
Phase 1 of the majordomo build:
- llm/ canonical contract (messages, parts, tools, capabilities, streaming,
  Model/Provider, error classification)
- health/ clock-injected tracker (threshold bench, exponential capped
  cooldown, reset-on-success)
- root Registry + Parse (verbatim model ids, inline recursive alias
  expansion with cycle detection, chain dedup), LLM_* env-DSN providers
  (go-llm parity: lazy fallback + eager LoadEnv), health-aware chain
  executor behind the Model interface
- provider/fake scriptable test provider; hermetic test suite incl. the
  trailing-thinking chain and foreman:// env loading
- ADRs 0001-0008, CLAUDE.md, README (honest matrix), CI workflow,
  docs/phase-1-design.md

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 12:35:34 +02:00

# majordomo
A clean-slate Go library for building LLM-backed agents: one canonical API
over many model providers, a parseable model naming / failover / tiering
system with built-in health tracking, capability-aware multimodality, tool
calls, structured output, and composable agents and skills.
> **Status:** under construction, phase by phase. The
> [support matrix](#featureprovider-support-matrix) below is kept honest:
> *pending* means not built yet, and this README is updated in the same
> commit as the behavior it describes.
## Install
```bash
go get gitea.stevedudenhoeffer.com/steve/majordomo
```
Requires Go 1.26+.
## Quickstart
```go
package main

import (
	"context"
	"fmt"

	"gitea.stevedudenhoeffer.com/steve/majordomo"
)

func main() {
	reg := majordomo.New() // built-ins + LLM_* env providers

	m, err := reg.Parse("ollama-cloud/minimax-m3:cloud")
	if err != nil {
		panic(err)
	}

	resp, err := m.Generate(context.Background(), majordomo.Request{
		Messages: []majordomo.Message{majordomo.UserText("hello!")},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Text())
}
```
The package-level `majordomo.Parse(...)` uses a lazily built default registry,
for callers that don't need an isolated one.
## Model specs: targets, chains, tiers
A model spec is a comma-separated **failover chain**; each element is either
a `provider/model` target or a registered **alias** (tier):
```go
// Try minimax-m3 first; on failure kimi-k2.6; finally fall back to opus-4.8.
m, _ := reg.Parse("ollama-cloud/minimax-m3:cloud,ollama-cloud/kimi-k2.6:cloud,anthropic/opus-4.8")
// The same chain, with the registered alias "thinking" appended; the alias
// is expanded in place as the tail of the chain:
m, _ = reg.Parse("ollama-cloud/minimax-m3:cloud,ollama-cloud/kimi-k2.6:cloud,anthropic/opus-4.8,thinking")
```
Everything after the **first `/`** (up to the next comma) is the model id,
passed to the provider **verbatim** — tags (`:cloud`, `:30b`) and ids with
extra slashes survive intact. majordomo never validates ids against a
catalog.
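The splitting rule above can be sketched in a few lines of Go. This is an illustration only, not majordomo's actual parser: the `Element` type and `SplitSpec` function are hypothetical names, and rejecting empty elements is an assumption about sensible behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// Element is a hypothetical parsed chain element (not majordomo's real type).
type Element struct {
	Provider string // text before the first "/"
	Model    string // everything after the first "/", kept verbatim
	Alias    string // set instead of Provider/Model when there is no "/"
}

// SplitSpec splits a spec on commas and cuts each element at its first "/"
// only, so tags (":cloud") and extra slashes stay inside the model id.
func SplitSpec(spec string) ([]Element, error) {
	var out []Element
	for _, elem := range strings.Split(spec, ",") {
		if elem == "" {
			return nil, fmt.Errorf("empty element in spec %q", spec)
		}
		provider, model, found := strings.Cut(elem, "/")
		if !found {
			// No "/" at all: treat the element as an alias name.
			out = append(out, Element{Alias: elem})
			continue
		}
		if provider == "" || model == "" {
			return nil, fmt.Errorf("malformed target %q", elem)
		}
		out = append(out, Element{Provider: provider, Model: model})
	}
	return out, nil
}

func main() {
	elems, err := SplitSpec("ollama-cloud/minimax-m3:cloud,openai/org/model-x,thinking")
	if err != nil {
		panic(err)
	}
	for _, e := range elems {
		fmt.Printf("%+v\n", e)
	}
}
```

Cutting at the first `/` rather than the last is what lets an id such as `org/model-x` reach the provider untouched.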
### Custom tiers (aliases)
```go
reg.RegisterAlias("thinking", "anthropic/opus-4.8,ollama-cloud/minimax-m3:cloud")
reg.RegisterAlias("workhorse", "ollama-cloud/minimax-m2.7:cloud,ollama-cloud/qwen3-coder:480b-cloud")
m, _ := reg.Parse("thinking") // a chain, same Model interface as a single target
```
Aliases may appear anywhere in a chain (head, middle, tail), may reference
other aliases, and expand inline and recursively; cycles are detected and
returned as errors.
### Failover & health
Chains are health-tracked per target:
- A **single transient error** (429/5xx, timeout, connection failure) is
retried once on the same target.
- **Repeated transient errors** (default: 2 consecutive failed attempts)
bench the target — chains skip it until its cooldown expires (exponential:
5s, 10s, 20s, ... capped at 5m). Any success resets it.
- `model not found` advances down the chain without penalty; auth/malformed
errors fail fast (failing over can't fix a bad key). All knobs are
configurable via `WithChainConfig` / `WithHealthConfig`.
- If every element fails, you get one joined error naming each target and
why it failed.
## Providers
### Built-in env vars
| Provider | Spec name | Key env var | Default endpoint |
|----------|-----------|-------------|------------------|
| OpenAI (+compatible) | `openai` | `OPENAI_API_KEY` | api.openai.com *(pending)* |
| Anthropic (+compatible) | `anthropic` | `ANTHROPIC_API_KEY` | api.anthropic.com *(pending)* |
| Google (Gemini) | `google` | `GOOGLE_API_KEY` / `GEMINI_API_KEY` | Gen AI API *(pending)* |
| Ollama Cloud | `ollama-cloud` | `OLLAMA_API_KEY` | https://ollama.com *(pending)* |
| Ollama (local) | `ollama` | — | `OLLAMA_HOST` or http://localhost:11434 *(pending)* |
| foreman | `foreman` | — (token via DSN) | requires DSN/base URL *(pending)* |
### `LLM_*` env-DSN provider definitions
Define named providers entirely from the environment (go-llm parity):
```
LLM_M1=foreman://test-token-change-me@foreman-m1.orgrimmar.dudenhoeffer.casa
LLM_M5=foreman://test-token-change-me@foreman-m5.orgrimmar.dudenhoeffer.casa
```
defines providers `m1` and `m5` (foreman targets — native Ollama wire
protocol behind a bearer token). They are first-class in `Parse`, chains,
and aliases:
```go
m, _ := reg.Parse("m5/qwen3:30b,m1/qwen3:30b,thinking")
```
DSN format: `scheme://[token@]host[/path]`, scheme ∈ `foreman`, `ollama`,
`ollama-cloud`, `openai`, `anthropic`, `google`/`gemini`, or any scheme you
add with `RegisterScheme`. The token is the credential (bearer token / API
key); the base URL is always `https://host[/path]`. `New()` loads `LLM_*`
vars eagerly; unknown provider names also resolve lazily at Parse time
(`my-prov/x` → `LLM_MY_PROV`).
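For illustration, a DSN of this shape can be taken apart with the standard `net/url` package. `ProviderDef`, `ParseDSN`, and `EnvVarFor` are hypothetical names for this sketch, and the name-to-variable mapping (uppercase, hyphens to underscores) is inferred from the single `my-prov` example rather than from a documented rule.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// ProviderDef is a hypothetical result type for a parsed DSN.
type ProviderDef struct {
	Scheme  string // e.g. "foreman", "openai"
	Token   string // bearer token / API key; empty when the DSN has none
	BaseURL string // always https://host[/path]
}

// ParseDSN parses scheme://[token@]host[/path]. Errors deliberately omit
// the DSN text so a token never ends up in a log line.
func ParseDSN(dsn string) (ProviderDef, error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return ProviderDef{}, errors.New("invalid DSN")
	}
	if u.Scheme == "" || u.Host == "" {
		return ProviderDef{}, errors.New("DSN must look like scheme://[token@]host[/path]")
	}
	return ProviderDef{
		Scheme:  u.Scheme,
		Token:   u.User.Username(), // safe on a nil Userinfo: returns ""
		BaseURL: "https://" + u.Host + u.Path,
	}, nil
}

// EnvVarFor maps a provider name to the variable a lazy lookup would read
// (assumed rule: uppercase, "-" becomes "_", prefixed with "LLM_").
func EnvVarFor(provider string) string {
	return "LLM_" + strings.ToUpper(strings.ReplaceAll(provider, "-", "_"))
}

func main() {
	def, err := ParseDSN("foreman://test-token-change-me@foreman-m1.orgrimmar.dudenhoeffer.casa")
	if err != nil {
		panic(err)
	}
	fmt.Println(def.Scheme, def.BaseURL)
	fmt.Println(EnvVarFor("my-prov"))
}
```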
### Custom providers
Implement the two-method `Provider` interface and register it:
```go
reg.RegisterProvider(myProvider) // now "myprovider/model-x" parses, chains, aliases
```
## Multimodality *(pending — Phase 3)*
Attach images without knowing the target's limits; majordomo normalizes
(downscale, re-encode, count/size limits) against the resolved target's
declared capabilities and rejects clearly what cannot fit.
```go
resp, err := m.Generate(ctx, majordomo.Request{
	Messages: []majordomo.Message{
		majordomo.UserParts(
			majordomo.Text("what's in this image?"),
			majordomo.Image("image/png", pngBytes),
		),
	},
})
```
## Tool calls *(canonical API ready; provider wiring pending — Phase 3)*
```go
weather := majordomo.Tool{
	Name:        "get_weather",
	Description: "Current weather for a city",
	Parameters:  json.RawMessage(`{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}`),
	Handler: func(ctx context.Context, args json.RawMessage) (any, error) {
		var p struct {
			City string `json:"city"`
		}
		// Surface malformed arguments instead of silently using a zero value.
		if err := json.Unmarshal(args, &p); err != nil {
			return nil, err
		}
		return map[string]any{"city": p.City, "temp_c": 21}, nil
	},
}
resp, _ := m.Generate(ctx, req, majordomo.WithTools(weather))
// resp.ToolCalls → execute → append ToolResultsMessage → continue
```
## Structured output *(canonical API ready; provider wiring pending — Phase 3)*
```go
resp, _ := m.Generate(ctx, req, majordomo.WithSchema(schemaJSON, "answer"))
```
A generic `Generate[T]` helper (schema from your struct, unmarshal into it)
lands with the agent phase.
## Agents & skills *(pending — Phases 5–6)*
Agents = model + system prompt + toolboxes, running a tool-dispatch loop;
skills = reusable instruction+tool bundles attachable to any agent.
## Feature/provider support matrix
| Provider | Resolve/Parse | Chat | Streaming | Tools | Structured | Images | Env DSN |
|----------------------|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| OpenAI (+compatible) | ✅ | pending | pending | pending | pending | pending | ✅ |
| Anthropic (+compat) | ✅ | pending | pending | pending | pending | pending | ✅ |
| Google (Gemini) | ✅ | pending | pending | pending | pending | pending | ✅ |
| Ollama Cloud | ✅ | pending | pending | pending | pending | pending | ✅ |
| Ollama (local) | ✅ | pending | pending | pending | pending | pending | ✅ |
| foreman | ✅ | pending | pending | pending | pending | pending | ✅ |
| fake (testing) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | — |
Cross-cutting: Parse grammar ✅ · aliases/tiers ✅ · failover chains ✅ ·
health tracking/backoff ✅ · LLM_* env DSNs ✅ · media pipeline pending ·
agent loop pending · skills pending · `Generate[T]` pending.
## Development
```bash
go build ./... && go vet ./... && go test -race -count=1 ./...
```
The default test suite is fully hermetic (no network, no credentials).
Live integration tests (Phase 8) are gated behind the `live` build tag and
read `.env` (see `.env.example`; never commit `.env`).
Design decisions are recorded in [docs/adr/](docs/adr/README.md);
conventions in [CLAUDE.md](CLAUDE.md); build history in
[progress.md](progress.md).