Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
majordomo
A clean-slate Go library for building LLM-backed agents: one canonical API over many model providers, a parseable model naming / failover / tiering system with built-in health tracking, capability-aware multimodality, tool calls, structured output, and composable agents and skills.
The support matrix below is kept honest: pending means not built, and this README is updated in the same commit as the behavior it describes. Runnable programs for every feature live in examples/.
Install
go get gitea.stevedudenhoeffer.com/steve/majordomo
Requires Go 1.26+.
Quickstart
package main
import (
"context"
"fmt"
"gitea.stevedudenhoeffer.com/steve/majordomo"
)
func main() {
reg := majordomo.New() // built-ins + LLM_* env providers
m, err := reg.Parse("ollama-cloud/minimax-m3:cloud")
if err != nil { panic(err) }
resp, err := m.Generate(context.Background(), majordomo.Request{
Messages: []majordomo.Message{majordomo.UserText("hello!")},
})
if err != nil { panic(err) }
fmt.Println(resp.Text())
}
If you don't need isolation, the package-level majordomo.Parse(...) uses a
lazily built default registry.
Model specs: targets, chains, tiers
A model spec is a comma-separated failover chain; each element is either
a provider/model target or a registered alias (tier):
// Try minimax-m3 first; on failure kimi-k2.6; finally fall back to opus-4.8.
m, _ := reg.Parse("ollama-cloud/minimax-m3:cloud,ollama-cloud/kimi-k2.6:cloud,anthropic/opus-4.8")
// The same chain with the registered alias "thinking" appended; the alias
// expands in place as the tail of the chain:
m, _ = reg.Parse("ollama-cloud/minimax-m3:cloud,ollama-cloud/kimi-k2.6:cloud,anthropic/opus-4.8,thinking")
Everything after the first / (up to the next comma) is the model id,
passed to the provider verbatim — tags (:cloud, :30b) and ids with
extra slashes survive intact. majordomo never validates ids against a
catalog.
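The grammar above can be sketched in a few lines of standard-library Go. This is an illustration only, not majordomo's parser: the `element` type and `splitSpec` function are hypothetical names, and trimming whitespace around elements is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// element is one entry of a failover chain (hypothetical type for this sketch).
type element struct {
	Provider string // empty when the element is an alias
	Model    string // everything after the first "/", kept verbatim
	Alias    string // set when the element contains no "/"
}

// splitSpec splits a comma-separated spec into chain elements.
// Only the FIRST "/" separates provider from model id, so tags (":cloud")
// and ids with extra slashes pass through untouched.
func splitSpec(spec string) ([]element, error) {
	var out []element
	for _, raw := range strings.Split(spec, ",") {
		s := strings.TrimSpace(raw) // assumption: surrounding spaces are ignored
		if s == "" {
			return nil, fmt.Errorf("empty element in spec %q", spec)
		}
		provider, model, ok := strings.Cut(s, "/")
		if !ok {
			out = append(out, element{Alias: s})
			continue
		}
		if provider == "" || model == "" {
			return nil, fmt.Errorf("malformed target %q", s)
		}
		out = append(out, element{Provider: provider, Model: model})
	}
	return out, nil
}

func main() {
	els, err := splitSpec("ollama-cloud/minimax-m3:cloud,openai/org/custom/model:30b,thinking")
	if err != nil {
		panic(err)
	}
	for _, e := range els {
		fmt.Printf("%+v\n", e)
	}
}
```

Note that nothing here checks the model id against a list of known models, which matches the "never validates ids against a catalog" rule.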
Custom tiers (aliases)
reg.RegisterAlias("thinking", "anthropic/opus-4.8,ollama-cloud/minimax-m3:cloud")
reg.RegisterAlias("workhorse", "ollama-cloud/minimax-m2.7:cloud,ollama-cloud/qwen3-coder:480b-cloud")
m, _ := reg.Parse("thinking") // a chain, same Model interface as a single target
Aliases may appear anywhere in a chain (head, middle, tail), may reference other aliases, and expand inline and recursively; cycles are detected and returned as errors.
For tiers that live in a database or config system, register a dynamic resolver — consulted after static aliases, output expanded with the same recursion and cycle guards:
reg.RegisterResolver(majordomo.ResolverFunc(func(name string) (string, bool) {
return myConfigStore.LookupTier(name) // e.g. "agent-thinking" → a chain
}))
Failover & health
Chains are health-tracked per target:
- A single transient error (429/5xx, timeout, connection failure) is retried once on the same target.
- Repeated transient errors (default: 2 consecutive failed attempts) bench the target — chains skip it until its cooldown expires (exponential: 5s, 10s, 20s, ... capped at 5m). Any success resets it.
- A "model not found" error advances down the chain without penalty; auth/malformed errors fail fast (failing over can't fix a bad key). All knobs are configurable via WithChainConfig/WithHealthConfig.
- If every element fails, you get one joined error naming each target and why it failed.
- Ops surfaces: reg.Health() exposes Bench/Unbench/Snapshot for manual control and dashboards; ChainConfig.Observer receives one event per failover decision (failed attempt, bench, benched-skip) for logging.
Providers
Built-in env vars
| Provider | Spec name | Key env var | Default endpoint |
|---|---|---|---|
| OpenAI (+compatible) | openai | OPENAI_API_KEY | https://api.openai.com/v1 |
| Anthropic (+compatible) | anthropic | ANTHROPIC_API_KEY | https://api.anthropic.com |
| Google (Gemini) | google | GOOGLE_API_KEY / GEMINI_API_KEY | Gemini API (official SDK) |
| Ollama Cloud | ollama-cloud | OLLAMA_API_KEY | https://ollama.com |
| Ollama (local) | ollama | — | OLLAMA_HOST or http://localhost:11434 |
| foreman | foreman | — (token via DSN) | requires an LLM_* DSN or ollama.Foreman(url, token) |
OpenAI-compatible / Anthropic-compatible endpoints: construct the provider with a name and base URL and register it —
reg.RegisterProvider(openai.New(
openai.WithName("groq"),
openai.WithBaseURL("https://api.groq.com/openai/v1"),
openai.WithAPIKey(key),
// openai.WithLegacyMaxTokens(), // for servers that only honor max_tokens
))
// now "groq/llama-3.3-70b" works in Parse, chains, and aliases
LLM_* env-DSN provider definitions
Define named providers entirely from the environment (go-llm parity):
LLM_M1=foreman://test-token-change-me@foreman-m1.orgrimmar.dudenhoeffer.casa
LLM_M5=foreman://test-token-change-me@foreman-m5.orgrimmar.dudenhoeffer.casa
defines providers m1 and m5 (foreman targets — native Ollama wire
protocol behind a bearer token). They are first-class in Parse, chains,
and aliases:
m, _ := reg.Parse("m5/qwen3:30b,m1/qwen3:30b,thinking")
DSN format: scheme://[token@]host[/path], scheme ∈ foreman, ollama,
ollama-cloud, openai, anthropic, google/gemini, or any scheme you
add with RegisterScheme. The token is the credential (bearer token / API
key); the base URL is always https://host[/path]. New() loads LLM_*
vars eagerly; unknown provider names also resolve lazily at Parse time
(my-prov/x → LLM_MY_PROV).
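As an illustration of the DSN shape and the provider-name-to-variable mapping, the sketch below parses a DSN with the standard net/url package. `parseDSN`, `envVarFor`, and the `dsn` type are hypothetical helpers, not majordomo functions, and the example host is a placeholder.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// dsn is the parsed form of scheme://[token@]host[/path] (hypothetical type).
type dsn struct {
	Scheme  string // selects the provider kind, e.g. "foreman" or "openai"
	Token   string // bearer token / API key; empty when omitted
	BaseURL string // always https://host[/path]
}

// parseDSN splits a DSN into its parts. A token containing reserved
// characters would need percent-encoding to survive URL parsing.
func parseDSN(raw string) (dsn, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return dsn{}, err
	}
	if u.Scheme == "" || u.Host == "" {
		return dsn{}, fmt.Errorf("dsn %q: want scheme://[token@]host[/path]", raw)
	}
	d := dsn{Scheme: u.Scheme, BaseURL: "https://" + u.Host + u.Path}
	if u.User != nil {
		d.Token = u.User.Username()
	}
	return d, nil
}

// envVarFor maps a provider name to its LLM_* variable: my-prov -> LLM_MY_PROV.
func envVarFor(provider string) string {
	return "LLM_" + strings.ToUpper(strings.ReplaceAll(provider, "-", "_"))
}

func main() {
	d, err := parseDSN("foreman://test-token-change-me@foreman-m1.example.com")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", d)
	fmt.Println(envVarFor("my-prov"))
}
```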
Custom providers
Implement the two-method Provider interface and register it:
reg.RegisterProvider(myProvider) // now "myprovider/model-x" parses, chains, aliases
Multimodality
Attach images without knowing the target's limits. Before each attempt the
request is normalized against the actual serving target's declared
capabilities: the real format is sniffed from the bytes, oversize images
are downscaled (aspect preserved), disallowed formats are re-encoded, and
byte budgets are enforced by a quality ladder. What cannot be made to fit
is rejected with a clear ErrUnsupported error — and in a chain, the
request simply advances to the next (e.g. vision-capable) element.
resp, err := m.Generate(ctx, majordomo.Request{
Messages: []majordomo.Message{
majordomo.UserParts(majordomo.Text("what's in this image?"),
majordomo.Image("image/png", pngBytes)),
},
})
Tool calls
weather := majordomo.Tool{
Name: "get_weather",
Description: "Current weather for a city",
Parameters: json.RawMessage(`{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}`),
Handler: func(ctx context.Context, args json.RawMessage) (any, error) {
var p struct{ City string `json:"city"` }
if err := json.Unmarshal(args, &p); err != nil {
return nil, err // surface malformed arguments instead of ignoring them
}
return map[string]any{"city": p.City, "temp_c": 21}, nil
},
}
resp, _ := m.Generate(ctx, req, majordomo.WithTools(weather))
// resp.ToolCalls → execute → append ToolResultsMessage → continue
Or typed, with the schema derived from your argument struct:
weather := majordomo.DefineTool("get_weather", "Current weather for a city",
func(ctx context.Context, args struct {
City string `json:"city" description:"city name"`
}) (any, error) {
return lookup(args.City)
})
Each provider maps this one shape to its native function-calling format (OpenAI tools/tool_calls, Anthropic tool_use/tool_result, Ollama tools with object arguments). Tool-call ids are synthesized when a backend omits them; streaming buffers tool-call arguments until they parse.
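The last two behaviors (synthesized ids and buffering streamed arguments until they parse) can be sketched as below. The `argBuffer` type, `ensureID`, and the `call_N` id format are hypothetical and only illustrate the idea.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// argBuffer accumulates streamed argument fragments for one tool call
// (hypothetical type for this sketch).
type argBuffer struct {
	name string
	buf  strings.Builder
}

// add appends a fragment and returns the complete arguments once the
// accumulated text is valid JSON; until then ok is false. Tool arguments
// are JSON objects, and a strict prefix of an object is never valid JSON,
// so a partial payload is not emitted early.
func (b *argBuffer) add(fragment string) (args json.RawMessage, ok bool) {
	b.buf.WriteString(fragment)
	s := b.buf.String()
	if !json.Valid([]byte(s)) {
		return nil, false
	}
	return json.RawMessage(s), true
}

// ensureID keeps a backend-supplied id and synthesizes one otherwise.
// The "call_N" format is an assumption of this sketch.
func ensureID(id string, index int) string {
	if id != "" {
		return id
	}
	return fmt.Sprintf("call_%d", index)
}

func main() {
	b := &argBuffer{name: "get_weather"}
	for _, frag := range []string{`{"ci`, `ty":"Par`, `is"}`} {
		if args, ok := b.add(frag); ok {
			fmt.Println(ensureID("", 0), b.name, string(args))
		}
	}
}
```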
Structured output
resp, _ := m.Generate(ctx, req, majordomo.WithSchema(schemaJSON, "answer"))
Maps to OpenAI response_format: json_schema, Anthropic
output_config.format, Ollama format, and Google responseJsonSchema.
The typed helper derives the schema from your struct (all fields required,
additionalProperties:false, pointers nullable; description:"..." and
enum:"a,b,c" tags supported) and unmarshals the result:
type Verdict struct {
Guilty bool `json:"guilty"`
Why string `json:"why" description:"one-sentence rationale"`
}
v, err := majordomo.Generate[Verdict](ctx, m, req)
Agents
An agent is a model + system prompt + toolboxes, run as a tool-dispatch
loop until the model answers (or MaxSteps):
import "gitea.stevedudenhoeffer.com/steve/majordomo/agent"
a := agent.New(m, "You are a research assistant.",
agent.WithToolbox(searchTools),
agent.WithMaxSteps(8),
agent.WithStepObserver(func(s agent.Step) { log.Printf("step %d", s.Index) }),
)
res, err := a.Run(ctx, "What changed in Go 1.26?")
// res.Output, res.Steps, res.Usage; res.Messages round-trips via
// agent.WithHistory for conversation continuation.
The loop never panics: tool handler errors and panics become error results
the model can react to; unknown tools likewise; duplicate tool names across
toolboxes fail loudly. On agent.ErrMaxSteps (and on model errors) the
partial result with the full transcript is still returned.
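The loop semantics described above can be sketched without any model at all. In this illustration a scripted function stands in for the model, and every type and name (`modelFunc`, `reply`, `safeCall`, `run`, `errMaxSteps`) is hypothetical; it shows one way handler errors, panics, and unknown tools can become ordinary results while the transcript is returned in every outcome.

```go
package main

import (
	"errors"
	"fmt"
)

type toolCall struct{ Name, Args string }

// reply is one model turn: either final text or tool calls.
type reply struct {
	Text  string
	Calls []toolCall
}

type message struct{ Role, Content string }

// modelFunc stands in for a model: transcript in, next turn out.
type modelFunc func(transcript []message) (reply, error)

type handler func(args string) (string, error)

var errMaxSteps = errors.New("max steps reached")

// safeCall runs a handler and converts errors, panics, and unknown tool
// names into a result string the model can read.
func safeCall(tools map[string]handler, c toolCall) (result string) {
	h, ok := tools[c.Name]
	if !ok {
		return "error: unknown tool " + c.Name
	}
	defer func() {
		if r := recover(); r != nil {
			result = fmt.Sprintf("error: tool panicked: %v", r)
		}
	}()
	out, err := h(c.Args)
	if err != nil {
		return "error: " + err.Error()
	}
	return out
}

// run dispatches tool calls until the model answers or maxSteps is hit.
// The transcript is returned in every case, including on error.
func run(m modelFunc, tools map[string]handler, prompt string, maxSteps int) (string, []message, error) {
	transcript := []message{{Role: "user", Content: prompt}}
	for step := 0; step < maxSteps; step++ {
		r, err := m(transcript)
		if err != nil {
			return "", transcript, err
		}
		if len(r.Calls) == 0 {
			transcript = append(transcript, message{Role: "assistant", Content: r.Text})
			return r.Text, transcript, nil
		}
		for _, c := range r.Calls {
			transcript = append(transcript,
				message{Role: "assistant", Content: "call " + c.Name + "(" + c.Args + ")"},
				message{Role: "tool", Content: safeCall(tools, c)})
		}
	}
	return "", transcript, errMaxSteps
}

func main() {
	tools := map[string]handler{
		"boom": func(string) (string, error) { panic("kaboom") },
	}
	turn := 0
	model := func(t []message) (reply, error) {
		turn++
		if turn == 1 {
			return reply{Calls: []toolCall{{Name: "boom"}}}, nil
		}
		return reply{Text: "the tool failed: " + t[len(t)-1].Content}, nil
	}
	out, transcript, err := run(model, tools, "try the tool", 4)
	fmt.Println(out, len(transcript), err)
}
```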
Supervision hooks for orchestrators: WithMaxStepsFunc (dynamic step
budget), WithSteer (inject messages into a running agent),
WithCompactor (transform the outbound transcript when context grows —
the canonical Result.Messages stays complete), and WithToolErrorLimits
(circuit breakers for all-error steps and identical repeated calls,
surfacing agent.ErrToolLoop).
Skills
Skills are reusable instruction+tool bundles attachable to any agent, at construction or on demand. Instructions extend the system prompt; tools extend the toolset — additively, in attachment order.
import (
"gitea.stevedudenhoeffer.com/steve/majordomo/skill"
"gitea.stevedudenhoeffer.com/steve/majordomo/skill/calc"
"gitea.stevedudenhoeffer.com/steve/majordomo/skill/clock"
)
research := skill.New("research",
skill.WithInstructions("Cite a source for every claim."),
skill.WithTools(searchTool, fetchTool),
)
a := agent.New(m, "You are helpful.", agent.WithSkill(research))
a.AddSkill(clock.New()) // ready-made: time awareness
a.AddSkill(calc.New()) // ready-made: exact arithmetic
Anything implementing the three-method agent.Skill interface (Name /
Instructions / Tools) is a skill — skill.New is just the convenient way
to build one.
Feature/provider support matrix
| Provider | Resolve/Parse | Chat | Streaming | Tools | Structured | Images | Env DSN |
|---|---|---|---|---|---|---|---|
| OpenAI (+compatible) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Anthropic (+compatible) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Google (Gemini) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Ollama Cloud | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Ollama (local) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| foreman | ✅ | ✅ | ✅¹ | ✅ | ✅ | ✅ | ✅ |
| fake (testing) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | — |
¹ foreman's daemon currently buffers sync chat responses (no token-by-token streaming); majordomo's stream API works against it and delivers the response as a single delta plus final event.
Notes: Ollama has no native tool_choice: "none" drops the tools, while
"required" and named choices are best-effort only and are ignored there.
Ollama Cloud ignores the format field (verified live), so the provider also
states the schema as an explicit system instruction. The result is
constrained decoding on local Ollama and instruction-guided JSON on cloud,
with one canonical API either way.
Cross-cutting: Parse grammar ✅ · aliases/tiers ✅ · failover chains ✅ ·
health tracking/backoff ✅ · LLM_* env DSNs ✅ · media pipeline ✅
(per-target normalization in chains) · agent loop ✅ · Generate[T] +
schema derivation ✅ · skills ✅ (with clock + calc examples).
Development
go build ./... && go vet ./... && go test -race -count=1 ./...
The default test suite is fully hermetic (no network, no credentials).
Live integration tests (Phase 8) are gated behind the live build tag and
read .env (see .env.example; never commit .env).
Design decisions are recorded in docs/adr/; conventions in CLAUDE.md; build history in progress.md; the mort conversion plan in docs/mort-migration.md.