majordomo/README.md
steve 1ca607906d feat: Google (Gemini) provider on the official Gen AI SDK
Phase 4: provider/google on google.golang.org/genai v1.59.0 — lazy cached
client, FunctionResponse tool loop, raw-JSON-schema tools and structured
output, ThinkingLevel reasoning mapping, iter.Pull2 streaming, hermetic
httptest suite via HTTPOptions.BaseURL. Registry wires google + gemini
schemes to the real client; stub machinery deleted (all built-ins real).
ADR-0011; README matrix + CLAUDE.md synced.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 13:04:28 +02:00

# majordomo
A clean-slate Go library for building LLM-backed agents: one canonical API
over many model providers, a parseable model naming / failover / tiering
system with built-in health tracking, capability-aware multimodality, tool
calls, structured output, and composable agents and skills.
> **Status:** under construction, phase by phase. The
> [support matrix](#featureprovider-support-matrix) below is kept honest:
> *pending* means not built yet, and this README is updated in the same
> commit as the behavior it describes.
## Install
```bash
go get gitea.stevedudenhoeffer.com/steve/majordomo
```
Requires Go 1.26+.
## Quickstart
```go
package main

import (
	"context"
	"fmt"

	"gitea.stevedudenhoeffer.com/steve/majordomo"
)

func main() {
	reg := majordomo.New() // built-ins + LLM_* env providers
	m, err := reg.Parse("ollama-cloud/minimax-m3:cloud")
	if err != nil { panic(err) }
	resp, err := m.Generate(context.Background(), majordomo.Request{
		Messages: []majordomo.Message{majordomo.UserText("hello!")},
	})
	if err != nil { panic(err) }
	fmt.Println(resp.Text())
}
```
If you don't need an isolated registry, the package-level
`majordomo.Parse(...)` uses a lazily built default one.
## Model specs: targets, chains, tiers
A model spec is a comma-separated **failover chain**; each element is either
a `provider/model` target or a registered **alias** (tier):
```go
// Try minimax-m3 first; on failure kimi-k2.6; finally fall back to opus-4.8.
m, _ := reg.Parse("ollama-cloud/minimax-m3:cloud,ollama-cloud/kimi-k2.6:cloud,anthropic/opus-4.8")
// Identical, with the registered alias "thinking" appended and expanded
// in place as the tail of the chain:
m, _ = reg.Parse("ollama-cloud/minimax-m3:cloud,ollama-cloud/kimi-k2.6:cloud,anthropic/opus-4.8,thinking")
```
Everything after the **first `/`** (up to the next comma) is the model id,
passed to the provider **verbatim** — tags (`:cloud`, `:30b`) and ids with
extra slashes survive intact. majordomo never validates ids against a
catalog.
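
The splitting rule above can be sketched with the standard library alone. This is an illustration, not majordomo's parser: the `target` type and `splitSpec` function are hypothetical names, and trimming whitespace around elements is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// target is a hypothetical parsed chain element, not majordomo's real type.
type target struct {
	Provider string // text before the first "/"; empty for an alias
	Model    string // everything after the first "/", kept verbatim
	Alias    string // set when the element contains no "/"
}

// splitSpec splits a spec on commas, then each element on its first "/".
// Whitespace trimming is an assumption made for this sketch.
func splitSpec(spec string) []target {
	var out []target
	for _, el := range strings.Split(spec, ",") {
		el = strings.TrimSpace(el)
		if el == "" {
			continue
		}
		provider, model, ok := strings.Cut(el, "/")
		if !ok {
			// No slash: treat the element as an alias (tier) name.
			out = append(out, target{Alias: el})
			continue
		}
		// Only the first "/" separates provider from model, so tags and
		// extra slashes stay inside the model id.
		out = append(out, target{Provider: provider, Model: model})
	}
	return out
}

func main() {
	for _, t := range splitSpec("ollama-cloud/minimax-m3:cloud,openai/org/model-x,thinking") {
		fmt.Printf("%+v\n", t)
	}
}
```

Cutting on the first `/` only is what lets an id such as `org/model-x` reach the provider unchanged.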
### Custom tiers (aliases)
```go
reg.RegisterAlias("thinking", "anthropic/opus-4.8,ollama-cloud/minimax-m3:cloud")
reg.RegisterAlias("workhorse", "ollama-cloud/minimax-m2.7:cloud,ollama-cloud/qwen3-coder:480b-cloud")
m, _ := reg.Parse("thinking") // a chain, same Model interface as a single target
```
Aliases may appear anywhere in a chain (head, middle, tail), may reference
other aliases, and expand inline and recursively; cycles are detected and
returned as errors.
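
Inline, recursive expansion with cycle detection can be sketched as a depth-first walk that tracks which aliases are currently being expanded. The `expand` function and the plain `map[string]string` alias table are hypothetical, and treating an unknown alias as an error is an assumption of this sketch rather than documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// expand resolves aliases inline and recursively. Elements containing "/"
// are provider/model targets and are kept as-is. visiting holds the aliases
// on the current expansion path so a cycle can be reported as an error.
func expand(spec string, aliases map[string]string, visiting map[string]bool) ([]string, error) {
	var out []string
	for _, el := range strings.Split(spec, ",") {
		el = strings.TrimSpace(el)
		if el == "" {
			continue
		}
		if strings.Contains(el, "/") {
			out = append(out, el)
			continue
		}
		body, ok := aliases[el]
		if !ok {
			return nil, fmt.Errorf("unknown alias %q", el) // assumption: unknown names are errors
		}
		if visiting[el] {
			return nil, fmt.Errorf("alias cycle through %q", el)
		}
		visiting[el] = true
		sub, err := expand(body, aliases, visiting)
		delete(visiting, el) // off the path again; reuse elsewhere in the chain is fine
		if err != nil {
			return nil, err
		}
		out = append(out, sub...)
	}
	return out, nil
}

func main() {
	aliases := map[string]string{
		"thinking":  "anthropic/opus-4.8,workhorse",
		"workhorse": "ollama-cloud/minimax-m2.7:cloud",
		"loop-a":    "loop-b",
		"loop-b":    "loop-a",
	}
	chain, err := expand("m5/qwen3:30b,thinking", aliases, map[string]bool{})
	fmt.Println(chain, err)
	_, err = expand("loop-a", aliases, map[string]bool{})
	fmt.Println(err)
}
```

Removing an alias from `visiting` after its expansion returns means only true cycles are rejected; the same alias appearing twice in one chain still expands.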
### Failover & health
Chains are health-tracked per target:
- A **single transient error** (429/5xx, timeout, connection failure) is
retried once on the same target.
- **Repeated transient errors** (default: 2 consecutive failed attempts)
bench the target — chains skip it until its cooldown expires (exponential:
5s, 10s, 20s, ... capped at 5m). Any success resets it.
- `model not found` advances down the chain without penalty; auth/malformed
errors fail fast (failing over can't fix a bad key).
- If every element fails, you get one joined error naming each target and
why it failed.

All knobs are configurable via `WithChainConfig` / `WithHealthConfig`.
## Providers
### Built-in env vars
| Provider | Spec name | Key env var | Default endpoint |
|----------|-----------|-------------|------------------|
| OpenAI (+compatible) | `openai` | `OPENAI_API_KEY` | https://api.openai.com/v1 |
| Anthropic (+compatible) | `anthropic` | `ANTHROPIC_API_KEY` | https://api.anthropic.com |
| Google (Gemini) | `google` | `GOOGLE_API_KEY` / `GEMINI_API_KEY` | Gemini API (official SDK) |
| Ollama Cloud | `ollama-cloud` | `OLLAMA_API_KEY` | https://ollama.com |
| Ollama (local) | `ollama` | — | `OLLAMA_HOST` or http://localhost:11434 |
| foreman | `foreman` | — (token via DSN) | requires an LLM_* DSN or `ollama.Foreman(url, token)` |

OpenAI-compatible / Anthropic-compatible endpoints: construct the provider
with a name and base URL and register it:
```go
reg.RegisterProvider(openai.New(
	openai.WithName("groq"),
	openai.WithBaseURL("https://api.groq.com/openai/v1"),
	openai.WithAPIKey(key),
	// openai.WithLegacyMaxTokens(), // for servers that only honor max_tokens
))
// now "groq/llama-3.3-70b" works in Parse, chains, and aliases
```
### `LLM_*` env-DSN provider definitions
Define named providers entirely from the environment (go-llm parity):
```
LLM_M1=foreman://test-token-change-me@foreman-m1.orgrimmar.dudenhoeffer.casa
LLM_M5=foreman://test-token-change-me@foreman-m5.orgrimmar.dudenhoeffer.casa
```
defines providers `m1` and `m5` (foreman targets — native Ollama wire
protocol behind a bearer token). They are first-class in `Parse`, chains,
and aliases:
```go
m, _ := reg.Parse("m5/qwen3:30b,m1/qwen3:30b,thinking")
```
DSN format: `scheme://[token@]host[/path]`, scheme ∈ `foreman`, `ollama`,
`ollama-cloud`, `openai`, `anthropic`, `google`/`gemini`, or any scheme you
add with `RegisterScheme`. The token is the credential (bearer token / API
key); the base URL is always `https://host[/path]`. `New()` loads `LLM_*`
vars eagerly; unknown provider names also resolve lazily at Parse time
(`my-prov/x` → `LLM_MY_PROV`).
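
The DSN shape and the name-to-variable mapping can be sketched with `net/url`. The `dsn` struct, `parseDSN`, and `envName` are hypothetical helpers written for illustration. Reading the token from the URL's user part assumes tokens contain no `:`, and upper-casing with `-` mapped to `_` is inferred from the `my-prov` example above.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// dsn is a hypothetical parsed form of scheme://[token@]host[/path].
type dsn struct {
	Scheme  string // selects the provider implementation, e.g. "foreman"
	Token   string // bearer token / API key; empty when omitted
	BaseURL string // always https://host[/path]
}

// parseDSN splits a DSN into scheme, token, and an https base URL.
func parseDSN(raw string) (dsn, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return dsn{}, errors.New("malformed DSN") // do not echo raw: it may hold a token
	}
	if u.Scheme == "" || u.Host == "" {
		return dsn{}, errors.New("DSN needs scheme://[token@]host[/path]")
	}
	d := dsn{Scheme: u.Scheme, BaseURL: "https://" + u.Host + u.Path}
	if u.User != nil {
		d.Token = u.User.Username()
	}
	return d, nil
}

// envName maps a provider name to its variable, e.g. "my-prov" to "LLM_MY_PROV".
func envName(provider string) string {
	return "LLM_" + strings.ToUpper(strings.ReplaceAll(provider, "-", "_"))
}

func main() {
	d, err := parseDSN("foreman://test-token-change-me@foreman-m1.orgrimmar.dudenhoeffer.casa")
	fmt.Println(d.Scheme, d.BaseURL, err)
	fmt.Println(envName("my-prov"))
}
```

The error messages deliberately omit the raw DSN so that a token never ends up in logs.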
### Custom providers
Implement the two-method `Provider` interface and register it:
```go
reg.RegisterProvider(myProvider) // now "myprovider/model-x" parses, chains, aliases
```
## Multimodality
Attach images without knowing the target's limits. Before each attempt the
request is normalized against the **actual serving target's** declared
capabilities: the real format is sniffed from the bytes, oversize images
are downscaled (aspect preserved), disallowed formats are re-encoded, and
byte budgets are enforced by a quality ladder. What cannot be made to fit
is rejected with a clear `ErrUnsupported` error — and in a chain, the
request simply advances to the next (e.g. vision-capable) element.
```go
resp, err := m.Generate(ctx, majordomo.Request{
	Messages: []majordomo.Message{
		majordomo.UserParts(majordomo.Text("what's in this image?"),
			majordomo.Image("image/png", pngBytes)),
	},
})
```
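
Two of the normalization steps described above, sniffing the real format from the bytes and downscaling with the aspect ratio preserved, can be sketched with the standard library. `sniff` and `fit` are hypothetical helpers. Using `http.DetectContentType` and a single longest-side limit are assumptions about how such a pipeline could work, not a description of majordomo's implementation.

```go
package main

import (
	"fmt"
	"net/http"
)

// sniff reports the MIME type implied by the leading bytes, regardless of
// what the caller declared.
func sniff(data []byte) string {
	return http.DetectContentType(data)
}

// fit scales (w, h) down so the longest side is at most maxSide, keeping
// the aspect ratio. Images that already fit are returned unchanged.
func fit(w, h, maxSide int) (int, int) {
	if w <= maxSide && h <= maxSide {
		return w, h
	}
	if w >= h {
		nh := h * maxSide / w
		if nh < 1 {
			nh = 1 // never collapse a dimension to zero
		}
		return maxSide, nh
	}
	nw := w * maxSide / h
	if nw < 1 {
		nw = 1
	}
	return nw, maxSide
}

func main() {
	pngHeader := []byte("\x89PNG\r\n\x1a\n")
	fmt.Println(sniff(pngHeader))
	fmt.Println(fit(4000, 2000, 1000))
}
```

Sniffing first matters because a payload declared as `image/png` may actually carry another format, and the re-encode decision should follow the real bytes.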
## Tool calls
```go
weather := majordomo.Tool{
	Name:        "get_weather",
	Description: "Current weather for a city",
	Parameters:  json.RawMessage(`{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}`),
	Handler: func(ctx context.Context, args json.RawMessage) (any, error) {
		var p struct{ City string `json:"city"` }
		if err := json.Unmarshal(args, &p); err != nil {
			return nil, err // surface malformed arguments instead of ignoring them
		}
		return map[string]any{"city": p.City, "temp_c": 21}, nil
	},
}
resp, _ := m.Generate(ctx, req, majordomo.WithTools(weather))
// resp.ToolCalls → execute → append ToolResultsMessage → continue
```
Each provider maps this one shape to its native function-calling format
(OpenAI tools/tool_calls, Anthropic tool_use/tool_result, Gemini
FunctionCall/FunctionResponse parts, Ollama tools with object arguments).
Tool-call ids are synthesized when a backend omits them; streaming buffers
tool-call arguments until they parse.
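
Buffering streamed arguments "until they parse" can be approximated by accumulating fragments and checking `json.Valid` after each one. The `argBuffer` type is a hypothetical illustration. It relies on arguments being JSON objects, since a proper prefix of an object is never valid JSON on its own.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// argBuffer accumulates streamed argument fragments for one tool call.
type argBuffer struct {
	buf []byte
}

// Add appends a fragment and reports whether the buffer now holds complete,
// valid JSON. The arguments are returned only once they parse.
func (b *argBuffer) Add(fragment string) (json.RawMessage, bool) {
	b.buf = append(b.buf, fragment...)
	if !json.Valid(b.buf) {
		return nil, false // still partial: keep buffering
	}
	return json.RawMessage(b.buf), true
}

func main() {
	var b argBuffer
	for _, frag := range []string{`{"ci`, `ty":"Par`, `is"}`} {
		if args, ok := b.Add(frag); ok {
			fmt.Println("complete:", string(args))
		} else {
			fmt.Println("buffering...")
		}
	}
}
```

A handler then only ever sees arguments that `json.Unmarshal` can accept, which avoids dispatching a tool on a half-received payload.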
## Structured output
```go
resp, _ := m.Generate(ctx, req, majordomo.WithSchema(schemaJSON, "answer"))
```
Maps to OpenAI `response_format: json_schema`, Anthropic
`output_config.format`, and Ollama `format`. A generic `Generate[T]` helper
(schema from your struct, unmarshal into it) lands with the agent phase.
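
Until that helper lands, the "unmarshal into it" half can be done by hand on the response text. The sketch below is a hypothetical stand-in, not the planned `Generate[T]` API: `decode` and the `answer` struct are illustrative names, and rejecting unknown fields is a design choice of this example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// answer is an example target type matching a schema you would pass to
// WithSchema; its fields are made up for this sketch.
type answer struct {
	City  string  `json:"city"`
	TempC float64 `json:"temp_c"`
}

// decode unmarshals structured-output text into T, rejecting fields the
// struct does not declare so schema drift surfaces as an error.
func decode[T any](text string) (T, error) {
	var v T
	dec := json.NewDecoder(strings.NewReader(text))
	dec.DisallowUnknownFields()
	if err := dec.Decode(&v); err != nil {
		return v, fmt.Errorf("decode structured output: %w", err)
	}
	return v, nil
}

func main() {
	a, err := decode[answer](`{"city":"Paris","temp_c":21}`)
	fmt.Println(a, err)
}
```

In real use the input would be the text of a response produced with `majordomo.WithSchema`, for example `resp.Text()`.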
## Agents & skills *(pending: Phases 5–6)*
Agents = model + system prompt + toolboxes, running a tool-dispatch loop;
skills = reusable instruction+tool bundles attachable to any agent.
## Feature/provider support matrix
| Provider | Resolve/Parse | Chat | Streaming | Tools | Structured | Images | Env DSN |
|----------------------|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| OpenAI (+compatible) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Anthropic (+compat) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Google (Gemini) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Ollama Cloud | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Ollama (local) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| foreman | ✅ | ✅ | ✅¹ | ✅ | ✅ | ✅ | ✅ |
| fake (testing) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | — |

¹ foreman's daemon currently buffers sync chat responses (no token-by-token
streaming); majordomo's stream API works against it and delivers the
response as a single delta plus final event.

Notes: Ollama has no native tool_choice. There, `"none"` drops the tools,
and `"required"`/named choices are ignored on a best-effort basis.

Cross-cutting: Parse grammar ✅ · aliases/tiers ✅ · failover chains ✅ ·
health tracking/backoff ✅ · LLM_* env DSNs ✅ · media pipeline ✅
(per-target normalization in chains) · agent loop pending · skills pending
· `Generate[T]` pending.
## Development
```bash
go build ./... && go vet ./... && go test -race -count=1 ./...
```
The default test suite is fully hermetic (no network, no credentials).
Live integration tests (Phase 8) are gated behind the `live` build tag and
read `.env` (see `.env.example`; never commit `.env`).
Design decisions are recorded in [docs/adr/](docs/adr/README.md);
conventions in [CLAUDE.md](CLAUDE.md); build history in
[progress.md](progress.md).