# majordomo

A clean-slate Go library for building LLM-backed agents: one canonical API
over many model providers, a parseable model naming / failover / tiering
system with built-in health tracking, capability-aware multimodality, tool
calls, structured output, and composable agents and skills.

> **Status:** under construction, phase by phase. The
> [support matrix](#featureprovider-support-matrix) below is kept honest:
> *pending* means not built yet, and this README is updated in the same
> commit as the behavior it describes.

## Install

```bash
go get gitea.stevedudenhoeffer.com/steve/majordomo
```

Requires Go 1.26+.

## Quickstart

```go
package main

import (
	"context"
	"fmt"

	"gitea.stevedudenhoeffer.com/steve/majordomo"
)

func main() {
	reg := majordomo.New() // built-ins + LLM_* env providers

	m, err := reg.Parse("ollama-cloud/minimax-m3:cloud")
	if err != nil {
		panic(err)
	}

	resp, err := m.Generate(context.Background(), majordomo.Request{
		Messages: []majordomo.Message{majordomo.UserText("hello!")},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(resp.Text())
}
```

`majordomo.Parse(...)` (package level) uses a lazily-built default registry
if you don't need isolation.

## Model specs: targets, chains, tiers

A model spec is a comma-separated **failover chain**; each element is either
a `provider/model` target or a registered **alias** (tier):

```go
// Try minimax-m3 first; on failure kimi-k2.6; finally fall back to opus-4.8.
m, _ := reg.Parse("ollama-cloud/minimax-m3:cloud,ollama-cloud/kimi-k2.6:cloud,anthropic/opus-4.8")

// Identical, with the registered alias "thinking" appended and expanded
// in place as the tail of the chain:
m, _ = reg.Parse("ollama-cloud/minimax-m3:cloud,ollama-cloud/kimi-k2.6:cloud,anthropic/opus-4.8,thinking")
```

Everything after the **first `/`** (up to the next comma) is the model id,
passed to the provider **verbatim**: tags (`:cloud`, `:30b`) and ids with
extra slashes survive intact. majordomo never validates ids against a catalog.
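
The splitting rule above can be sketched in a few lines of Go. This is an illustration only, not majordomo's parser: the `target` type and `splitSpec` function are hypothetical names, `openai/org/model-x` is a made-up id, and alias expansion is left out (an element with no `/` is only reported as an alias candidate).

```go
package main

import (
	"fmt"
	"strings"
)

// target is a hypothetical type for this sketch; it is not part of majordomo.
type target struct {
	Provider string // text before the first "/"
	Model    string // everything after the first "/", kept verbatim
	Alias    string // set instead when the element contains no "/"
}

// splitSpec splits a comma-separated chain and cuts each element at its
// first "/" only, so tags and extra slashes stay inside the model id.
func splitSpec(spec string) []target {
	var out []target
	for _, elem := range strings.Split(spec, ",") {
		provider, model, found := strings.Cut(elem, "/")
		if !found {
			out = append(out, target{Alias: elem})
			continue
		}
		out = append(out, target{Provider: provider, Model: model})
	}
	return out
}

func main() {
	// "openai/org/model-x" is a made-up id with an extra slash.
	for _, t := range splitSpec("ollama-cloud/kimi-k2.6:cloud,openai/org/model-x,thinking") {
		fmt.Printf("%+v\n", t)
	}
}
```

Cutting at the first `/` rather than splitting on every `/` is what lets an id such as `org/model-x` reach the provider unchanged.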

### Custom tiers (aliases)

```go
reg.RegisterAlias("thinking", "anthropic/opus-4.8,ollama-cloud/minimax-m3:cloud")
reg.RegisterAlias("workhorse", "ollama-cloud/minimax-m2.7:cloud,ollama-cloud/qwen3-coder:480b-cloud")

m, _ := reg.Parse("thinking") // a chain, same Model interface as a single target
```

Aliases may appear anywhere in a chain (head, middle, tail), may reference
other aliases, and expand inline and recursively; cycles are detected and
returned as errors.

### Failover & health

Chains are health-tracked per target:

- A **single transient error** (429/5xx, timeout, connection failure) is
  retried once on the same target.
- **Repeated transient errors** (default: 2 consecutive failed attempts)
  bench the target: chains skip it until its cooldown expires (exponential:
  5s, 10s, 20s, ... capped at 5m). Any success resets it.
- `model not found` advances down the chain without penalty; auth/malformed
  errors fail fast (failing over can't fix a bad key). All knobs are
  configurable via `WithChainConfig` / `WithHealthConfig`.
- If every element fails, you get one joined error naming each target and
  why it failed.

## Providers

### Built-in env vars

| Provider | Spec name | Key env var | Default endpoint |
|----------|-----------|-------------|------------------|
| OpenAI (+compatible) | `openai` | `OPENAI_API_KEY` | https://api.openai.com/v1 |
| Anthropic (+compatible) | `anthropic` | `ANTHROPIC_API_KEY` | https://api.anthropic.com |
| Google (Gemini) | `google` | `GOOGLE_API_KEY` / `GEMINI_API_KEY` | Gemini API (official SDK) |
| Ollama Cloud | `ollama-cloud` | `OLLAMA_API_KEY` | https://ollama.com |
| Ollama (local) | `ollama` | none | `OLLAMA_HOST` or http://localhost:11434 |
| foreman | `foreman` | none (token via DSN) | requires an LLM_* DSN or `ollama.Foreman(url, token)` |

For OpenAI-compatible / Anthropic-compatible endpoints, construct the
provider with a name and base URL and register it:

```go
reg.RegisterProvider(openai.New(
	openai.WithName("groq"),
	openai.WithBaseURL("https://api.groq.com/openai/v1"),
	openai.WithAPIKey(key),
	// openai.WithLegacyMaxTokens(), // for servers that only honor max_tokens
))
// now "groq/llama-3.3-70b" works in Parse, chains, and aliases
```

### `LLM_*` env-DSN provider definitions

Define named providers entirely from the environment (go-llm parity):

```
LLM_M1=foreman://test-token-change-me@foreman-m1.orgrimmar.dudenhoeffer.casa
LLM_M5=foreman://test-token-change-me@foreman-m5.orgrimmar.dudenhoeffer.casa
```

This defines providers `m1` and `m5` (foreman targets: native Ollama wire
protocol behind a bearer token). They are first-class in `Parse`, chains,
and aliases:

```go
m, _ := reg.Parse("m5/qwen3:30b,m1/qwen3:30b,thinking")
```

DSN format: `scheme://[token@]host[/path]`, scheme ∈ `foreman`, `ollama`,
`ollama-cloud`, `openai`, `anthropic`, `google`/`gemini`, or any scheme you
add with `RegisterScheme`. The token is the credential (bearer token / API
key); the base URL is always `https://host[/path]`. `New()` loads `LLM_*`
vars eagerly; unknown provider names also resolve lazily at Parse time
(`my-prov/x` → `LLM_MY_PROV`).
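
The DSN shape and the name-to-variable mapping described above can be sketched with the standard library. This is an illustration under stated assumptions, not majordomo's loader: `providerDef`, `parseDSN`, and `envVarFor` are hypothetical names, and the host and token in the example are placeholders.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// providerDef is a hypothetical type for this sketch; it is not part of majordomo.
type providerDef struct {
	Scheme  string // e.g. "foreman", "openai"
	Token   string // credential taken from the userinfo part, may be empty
	BaseURL string // always rebuilt as https://host[/path]
}

// parseDSN splits scheme://[token@]host[/path] and rebuilds the base URL
// as https://host[/path], following the format described above.
func parseDSN(dsn string) (providerDef, error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return providerDef{}, err
	}
	if u.Scheme == "" || u.Host == "" {
		return providerDef{}, errors.New("invalid DSN: want scheme://[token@]host[/path]")
	}
	def := providerDef{Scheme: u.Scheme, BaseURL: "https://" + u.Host + u.Path}
	if u.User != nil {
		def.Token = u.User.Username()
	}
	return def, nil
}

// envVarFor maps a provider name to its LLM_* variable (my-prov -> LLM_MY_PROV).
func envVarFor(name string) string {
	return "LLM_" + strings.ToUpper(strings.ReplaceAll(name, "-", "_"))
}

func main() {
	// Placeholder host and token, for illustration only.
	def, err := parseDSN("foreman://example-token@foreman.example.com/api")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", def)
	fmt.Println(envVarFor("my-prov"))
}
```

The error for a malformed DSN deliberately omits the input string, since the DSN carries the credential.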

### Custom providers

Implement the two-method `Provider` interface and register it:

```go
reg.RegisterProvider(myProvider)
// now "myprovider/model-x" parses, chains, aliases
```

## Multimodality

Attach images without knowing the target's limits. Before each attempt the
request is normalized against the **actual serving target's** declared
capabilities: the real format is sniffed from the bytes, oversize images are
downscaled (aspect preserved), disallowed formats are re-encoded, and byte
budgets are enforced by a quality ladder. What cannot be made to fit is
rejected with a clear `ErrUnsupported` error; in a chain, the request simply
advances to the next (e.g. vision-capable) element.

```go
resp, err := m.Generate(ctx, majordomo.Request{
	Messages: []majordomo.Message{
		majordomo.UserParts(majordomo.Text("what's in this image?"),
			majordomo.Image("image/png", pngBytes)),
	},
})
```

## Tool calls

```go
weather := majordomo.Tool{
	Name:        "get_weather",
	Description: "Current weather for a city",
	Parameters:  json.RawMessage(`{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}`),
	Handler: func(ctx context.Context, args json.RawMessage) (any, error) {
		var p struct{ City string `json:"city"` }
		if err := json.Unmarshal(args, &p); err != nil {
			return nil, err // report malformed arguments instead of ignoring them
		}
		return map[string]any{"city": p.City, "temp_c": 21}, nil
	},
}
resp, _ := m.Generate(ctx, req, majordomo.WithTools(weather))
// resp.ToolCalls → execute → append ToolResultsMessage → continue
```

Each provider maps this one shape to its native function-calling format
(OpenAI tools/tool_calls, Anthropic tool_use/tool_result, Ollama tools with
object arguments). Tool-call ids are synthesized when a backend omits them;
streaming buffers tool-call arguments until they parse.

## Structured output

```go
resp, _ := m.Generate(ctx, req, majordomo.WithSchema(schemaJSON, "answer"))
```

Maps to OpenAI `response_format: json_schema`, Anthropic
`output_config.format`, Ollama `format`, and Google `responseJsonSchema`.

The typed helper derives the schema from your struct (all fields required,
`additionalProperties:false`, pointers nullable; `description:"..."` and
`enum:"a,b,c"` tags supported) and unmarshals the result:

```go
type Verdict struct {
	Guilty bool   `json:"guilty"`
	Why    string `json:"why" description:"one-sentence rationale"`
}

v, err := majordomo.Generate[Verdict](ctx, m, req)
```

## Agents

An agent is a model + system prompt + toolboxes, run as a tool-dispatch loop
until the model answers (or `MaxSteps`):

```go
import "gitea.stevedudenhoeffer.com/steve/majordomo/agent"

a := agent.New(m, "You are a research assistant.",
	agent.WithToolbox(searchTools),
	agent.WithMaxSteps(8),
	agent.WithStepObserver(func(s agent.Step) { log.Printf("step %d", s.Index) }),
)
res, err := a.Run(ctx, "What changed in Go 1.26?")
// res.Output, res.Steps, res.Usage; res.Messages round-trips via
// agent.WithHistory for conversation continuation.
```

The loop never panics: tool handler errors and panics become error results
the model can react to; unknown tools likewise; duplicate tool names across
toolboxes fail loudly. On `agent.ErrMaxSteps` (and on model errors) the
partial result with the full transcript is still returned.

## Skills

Skills are reusable instruction+tool bundles attachable to **any** agent, at
construction or on demand. Instructions extend the system prompt and tools
extend the toolset, additively and in attachment order.

```go
import (
	"gitea.stevedudenhoeffer.com/steve/majordomo/skill"
	"gitea.stevedudenhoeffer.com/steve/majordomo/skill/calc"
	"gitea.stevedudenhoeffer.com/steve/majordomo/skill/clock"
)

research := skill.New("research",
	skill.WithInstructions("Cite a source for every claim."),
	skill.WithTools(searchTool, fetchTool),
)

a := agent.New(m, "You are helpful.", agent.WithSkill(research))
a.AddSkill(clock.New()) // ready-made: time awareness
a.AddSkill(calc.New())  // ready-made: exact arithmetic
```

Anything implementing the three-method `agent.Skill` interface (Name /
Instructions / Tools) is a skill; `skill.New` is just the convenient way to
build one.

## Feature/provider support matrix

| Provider             | Resolve/Parse | Chat | Streaming | Tools | Structured | Images | Env DSN |
|----------------------|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| OpenAI (+compatible) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Anthropic (+compat)  | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Google (Gemini)      | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Ollama Cloud         | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Ollama (local)       | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| foreman              | ✅ | ✅ | ✅¹ | ✅ | ✅ | ✅ | ✅ |
| fake (testing)       | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | n/a |

¹ foreman's daemon currently buffers sync chat responses (no token-by-token
streaming); majordomo's stream API works against it and delivers the
response as a single delta plus final event.

Notes: Ollama has no native tool_choice. `"none"` drops the tools;
`"required"` and named choices are treated as best-effort and ignored there.

Cross-cutting: Parse grammar ✅ · aliases/tiers ✅ · failover chains ✅ ·
health tracking/backoff ✅ · LLM_* env DSNs ✅ · media pipeline ✅ (per-target
normalization in chains) · agent loop ✅ · `Generate[T]` + schema derivation ✅ ·
skills ✅ (with clock + calc examples).

## Development

```bash
go build ./... && go vet ./... && go test -race -count=1 ./...
```

The default test suite is fully hermetic (no network, no credentials).
Live integration tests (Phase 8) are gated behind the `live` build tag and
read `.env` (see `.env.example`; never commit `.env`).

Design decisions are recorded in [docs/adr/](docs/adr/README.md);
conventions in [CLAUDE.md](CLAUDE.md); build history in
[progress.md](progress.md).