
Steve Dudenhoeffer 88d3fc3279 chore: repin gadfly reusable to @5007597 (structured findings + consensus + inline review)

Adopts gadfly's review-representation overhaul: one ranked consensus comment
across the swarm + an advisory COMMENT-state inline PR review, on image
sha-3095ebf. Swarm config still rides the owner variables.

[skip ci]

2026-06-28 22:13:24 -04:00

README.md

majordomo

A clean-slate Go library for building LLM-backed agents: one canonical API over many model providers, a parseable model naming / failover / tiering system with built-in health tracking, capability-aware multimodality, tool calls, structured output, and composable agents and skills.

🤖 Heads up: this is a vibe-coded project

majordomo was built almost entirely by an AI agent (Claude Code) — design, code, and docs. It is reasonably well-tested (a fully hermetic suite plus gated live integration tests) and is used in earnest, but treat it accordingly: read the code before depending on it, expect the occasional AI-flavored rough edge, and please open issues. No warranty implied.

The support matrix below is kept honest: pending means not built, and this README is updated in the same commit as the behavior it describes. Runnable programs for every feature live in examples/.

Install

go get gitea.stevedudenhoeffer.com/steve/majordomo

Requires Go 1.26+.

Quickstart

package main

import (
    "context"
    "fmt"

    "gitea.stevedudenhoeffer.com/steve/majordomo"
)

func main() {
    reg := majordomo.New() // built-ins + LLM_* env providers

    m, err := reg.Parse("ollama-cloud/minimax-m3:cloud")
    if err != nil { panic(err) }

    resp, err := m.Generate(context.Background(), majordomo.Request{
        Messages: []majordomo.Message{majordomo.UserText("hello!")},
    })
    if err != nil { panic(err) }
    fmt.Println(resp.Text())
}

majordomo.Parse(...) (package level) uses a lazily-built default registry if you don't need isolation.

Model specs: targets, chains, tiers

A model spec is a comma-separated failover chain; each element is either a provider/model target or a registered alias (tier):

// Try minimax-m3 first; on failure kimi-k2.6; finally fall back to opus-4.8.
m, _ := reg.Parse("ollama-cloud/minimax-m3:cloud,ollama-cloud/kimi-k2.6:cloud,anthropic/opus-4.8")

// The same chain with the registered alias "thinking" appended; the alias
// expands in place as the tail of the chain:
m, _ = reg.Parse("ollama-cloud/minimax-m3:cloud,ollama-cloud/kimi-k2.6:cloud,anthropic/opus-4.8,thinking")

Everything after the first / (up to the next comma) is the model id, passed to the provider verbatim — tags (:cloud, :30b) and ids with extra slashes survive intact. majordomo never validates ids against a catalog.
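The grammar described above can be sketched in a few lines. This is a minimal illustration, not majordomo's parser: the `target` type and `splitSpec` function are hypothetical names, and trimming whitespace around elements is an assumption. The point it shows is that splitting at the first slash only keeps tags and extra slashes in the model id.

```go
package main

import (
	"fmt"
	"strings"
)

// target is a hypothetical type for this sketch; it is not majordomo's API.
type target struct {
	Provider string // empty when the element is an alias name
	Model    string // model id, or the alias name when Provider is empty
}

// splitSpec breaks a comma-separated chain into elements. An element that
// contains "/" becomes a provider/model target, split at the FIRST slash
// only; an element without one is kept as an alias name.
func splitSpec(spec string) []target {
	var out []target
	for _, el := range strings.Split(spec, ",") {
		el = strings.TrimSpace(el) // assumption: surrounding spaces are ignored
		if el == "" {
			continue
		}
		if provider, model, ok := strings.Cut(el, "/"); ok {
			out = append(out, target{Provider: provider, Model: model})
		} else {
			out = append(out, target{Model: el})
		}
	}
	return out
}

func main() {
	spec := "ollama-cloud/minimax-m3:cloud,openai/org/model-x,thinking"
	for _, t := range splitSpec(spec) {
		fmt.Printf("provider=%q model=%q\n", t.Provider, t.Model)
	}
}
```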

Custom tiers (aliases)

reg.RegisterAlias("thinking", "anthropic/opus-4.8,ollama-cloud/minimax-m3:cloud")
reg.RegisterAlias("workhorse", "ollama-cloud/minimax-m2.7:cloud,ollama-cloud/qwen3-coder:480b-cloud")

m, _ := reg.Parse("thinking") // a chain, same Model interface as a single target

Aliases may appear anywhere in a chain (head, middle, tail), may reference other aliases, and expand inline and recursively; cycles are detected and returned as errors.
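Recursive, in-place expansion with a cycle guard might look roughly like the sketch below. It is an illustration under assumptions, not the library's implementation: `expand` is a hypothetical function, aliases live in a plain map, and any element containing a slash is treated as a concrete target.

```go
package main

import (
	"fmt"
	"strings"
)

// expand resolves alias names in a comma-separated chain, recursively and in
// place. The stack map holds the aliases currently being expanded, so a
// cycle is reported as an error instead of recursing forever.
func expand(spec string, aliases map[string]string, stack map[string]bool) ([]string, error) {
	var out []string
	for _, el := range strings.Split(spec, ",") {
		el = strings.TrimSpace(el)
		if el == "" {
			continue
		}
		if strings.Contains(el, "/") { // concrete provider/model target
			out = append(out, el)
			continue
		}
		body, ok := aliases[el]
		if !ok {
			return nil, fmt.Errorf("unknown alias %q", el)
		}
		if stack[el] {
			return nil, fmt.Errorf("alias cycle through %q", el)
		}
		stack[el] = true
		sub, err := expand(body, aliases, stack)
		delete(stack, el) // the same alias may legally appear again elsewhere
		if err != nil {
			return nil, err
		}
		out = append(out, sub...)
	}
	return out, nil
}

func main() {
	aliases := map[string]string{
		"thinking":  "anthropic/opus-4.8,workhorse",
		"workhorse": "ollama-cloud/minimax-m2.7:cloud",
		"a":         "b",
		"b":         "a",
	}
	chain, err := expand("m5/qwen3:30b,thinking", aliases, map[string]bool{})
	fmt.Println(chain, err)

	_, err = expand("a", aliases, map[string]bool{})
	fmt.Println(err)
}
```

Removing an alias from the stack after it has been expanded means only true cycles are rejected; an alias that appears twice in one chain still expands both times.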

For tiers that live in a database or config system, register a dynamic resolver — consulted after static aliases, output expanded with the same recursion and cycle guards:

reg.RegisterResolver(majordomo.ResolverFunc(func(name string) (string, bool) {
    return myConfigStore.LookupTier(name) // e.g. "agent-thinking" → a chain
}))

Failover & health

Chains are health-tracked per target:

A single transient error (429/5xx, timeout, connection failure) is retried once on the same target.
Repeated transient errors (default: 2 consecutive failed attempts) bench the target — chains skip it until its cooldown expires (exponential: 5s, 10s, 20s, ... capped at 5m). Any success resets it.
model not found advances down the chain without penalty; auth/malformed errors fail fast (failing over can't fix a bad key). All knobs are configurable via WithChainConfig / WithHealthConfig.
If every element fails, you get one joined error naming each target and why it failed.
Ops surfaces: reg.Health() exposes Bench/Unbench/Snapshot for manual control and dashboards; ChainConfig.Observer receives one event per failover decision (failed attempt, bench, benched-skip) for logging.

Providers

Built-in env vars

Provider	Spec name	Key env var	Default endpoint
OpenAI (+compatible)	`openai`	`OPENAI_API_KEY`	https://api.openai.com/v1
Anthropic (+compatible)	`anthropic`	`ANTHROPIC_API_KEY`	https://api.anthropic.com
Google (Gemini)	`google`	`GOOGLE_API_KEY` / `GEMINI_API_KEY`	Gemini API (official SDK)
Ollama Cloud	`ollama-cloud`	`OLLAMA_API_KEY`	https://ollama.com
Ollama (local)	`ollama`	—	`OLLAMA_HOST` or http://localhost:11434
foreman	`foreman`	— (token via DSN)	requires an LLM_* DSN or `ollama.Foreman(url, token)`
llama-swap	`llama-swap`	— (token via DSN)	requires an LLM_* DSN or `llamaswap.New(...)`

OpenAI-compatible / Anthropic-compatible endpoints: construct the provider with a name and base URL and register it —

reg.RegisterProvider(openai.New(
    openai.WithName("groq"),
    openai.WithBaseURL("https://api.groq.com/openai/v1"),
    openai.WithAPIKey(key),
    // openai.WithLegacyMaxTokens(), // for servers that only honor max_tokens
))
// now "groq/llama-3.3-70b" works in Parse, chains, and aliases

`LLM_*` env-DSN provider definitions

Define named providers entirely from the environment (go-llm parity):

LLM_M1=foreman://test-token-change-me@foreman-m1.example.com
LLM_M5=foreman://test-token-change-me@foreman-m5.example.com

defines providers m1 and m5 (foreman targets — native Ollama wire protocol behind a bearer token). They are first-class in Parse, chains, and aliases:

m, _ := reg.Parse("m5/qwen3:30b,m1/qwen3:30b,thinking")

DSN format: scheme://[token@]host[/path], where scheme is one of foreman, ollama, ollama-cloud, openai, anthropic, google/gemini, llama-swap, llama-swaps, or any scheme you add with RegisterScheme. The token is the credential (bearer token / API key). The base URL is https://host[/path], with one exception: llama-swap builds http://host[:port] because it is local-first (llama-swaps is its TLS twin and builds https://host, mirroring redis/rediss). New() loads LLM_* vars eagerly; unknown provider names also resolve lazily at Parse time (my-prov/x → LLM_MY_PROV).

LLM_LS=llama-swap://token@box.local:8080    # http  → "ls/qwen3:14b" parses
LLM_LS=llama-swaps://token@swap.example.com # https → TLS-fronted instance
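To make the DSN rules concrete, here is a small sketch built on the standard net/url package. `parseDSN` and `envVarFor` are hypothetical helpers written for this illustration, not majordomo functions; the only rules encoded are the ones stated above (token before the @, https by default, http for llama-swap, and the my-prov to LLM_MY_PROV name mapping).

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseDSN splits scheme://[token@]host[/path] into its scheme, token, and
// the base URL a provider would talk to.
func parseDSN(dsn string) (scheme, token, baseURL string, err error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return "", "", "", err
	}
	if u.Scheme == "" || u.Host == "" {
		return "", "", "", fmt.Errorf("dsn %q: want scheme://[token@]host[/path]", dsn)
	}
	if u.User != nil {
		token = u.User.Username()
	}
	transport := "https"
	if u.Scheme == "llama-swap" { // local-first: plain http
		transport = "http"
	}
	return u.Scheme, token, transport + "://" + u.Host + u.Path, nil
}

// envVarFor maps a provider name to the LLM_* variable that would define it.
func envVarFor(provider string) string {
	return "LLM_" + strings.ToUpper(strings.ReplaceAll(provider, "-", "_"))
}

func main() {
	for _, dsn := range []string{
		"foreman://test-token-change-me@foreman-m1.example.com",
		"llama-swap://token@box.local:8080",
		"llama-swaps://token@swap.example.com",
	} {
		scheme, token, base, err := parseDSN(dsn)
		fmt.Println(scheme, token, base, err)
	}
	fmt.Println(envVarFor("my-prov"))
}
```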

llama-swap is a model-swapping proxy over llama.cpp. Its chat API is OpenAI-compatible (majordomo reuses the openai client), and the *llamaswap.Provider adds management methods (ListModels/Running/Unload) plus image generation (see below). A cold model swap can take many seconds — bound calls with a context deadline, not a client timeout.

Custom providers

Implement the two-method Provider interface and register it:

reg.RegisterProvider(myProvider) // now "myprovider/model-x" parses, chains, aliases

Multimodality

Attach images without knowing the target's limits. Before each attempt the request is normalized against the actual serving target's declared capabilities: the real format is sniffed from the bytes, oversize images are downscaled (aspect preserved), disallowed formats are re-encoded, and byte budgets are enforced by a quality ladder. What cannot be made to fit is rejected with a clear ErrUnsupported error — and in a chain, the request simply advances to the next (e.g. vision-capable) element.

resp, err := m.Generate(ctx, majordomo.Request{
    Messages: []majordomo.Message{
        majordomo.UserParts(majordomo.Text("what's in this image?"),
            majordomo.Image("image/png", pngBytes)),
    },
})
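Sniffing the real format from the bytes, rather than trusting the declared MIME type, can be illustrated with the standard library's http.DetectContentType. Whether majordomo uses that function is not stated in this README; `sniffImage` is a hypothetical helper showing the idea.

```go
package main

import (
	"fmt"
	"net/http"
)

// sniffImage reports the MIME type detected from the leading bytes and
// whether it disagrees with what the caller declared.
func sniffImage(declared string, data []byte) (actual string, mismatch bool) {
	actual = http.DetectContentType(data)
	return actual, actual != declared
}

func main() {
	// The 8-byte PNG signature, mislabelled as JPEG by the caller.
	png := []byte("\x89PNG\r\n\x1a\n")
	actual, mismatch := sniffImage("image/jpeg", png)
	fmt.Println(actual, mismatch) // image/png true
}
```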

Image generation

Text-to-image is a separate contract (imagegen) from chat, because it shares none of the message/tool/stream machinery. Generated images come back as llm.ImagePart, so they drop straight back into a chat turn. The first backend is llama-swap (OpenAI /v1/images/generations → a stable-diffusion.cpp upstream).

ls := llamaswap.New(llamaswap.WithBaseURL("http://box.local:8080"))
im, _ := ls.ImageModel("sd-xl")

res, err := im.Generate(ctx, imagegen.Request{Prompt: "a red bicycle"},
    imagegen.WithSize("1024x1024"))
// res.Images[0] is an llm.ImagePart (bytes + MIME) — feed it back into chat:
// majordomo.UserParts(majordomo.Text("describe this"), res.Images[0])

*llamaswap.Provider also exposes management methods: ListModels (what llama-swap can serve), Running (what's loaded), and Unload (free a model).

Tool calls

weather := majordomo.Tool{
    Name:        "get_weather",
    Description: "Current weather for a city",
    Parameters:  json.RawMessage(`{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}`),
    Handler: func(ctx context.Context, args json.RawMessage) (any, error) {
        var p struct{ City string `json:"city"` }
        if err := json.Unmarshal(args, &p); err != nil { return nil, err }
        return map[string]any{"city": p.City, "temp_c": 21}, nil
    },
}
resp, _ := m.Generate(ctx, req, majordomo.WithTools(weather))
// resp.ToolCalls → execute → append ToolResultsMessage → continue

Or typed, with the schema derived from your argument struct:

weather := majordomo.DefineTool("get_weather", "Current weather for a city",
    func(ctx context.Context, args struct {
        City string `json:"city" description:"city name"`
    }) (any, error) {
        return lookup(args.City)
    })

Each provider maps this one shape to its native function-calling format (OpenAI tools/tool_calls, Anthropic tool_use/tool_result, Ollama tools with object arguments). Tool-call ids are synthesized when a backend omits them; streaming buffers tool-call arguments until they parse.

Structured output

resp, _ := m.Generate(ctx, req, majordomo.WithSchema(schemaJSON, "answer"))

Maps to OpenAI response_format: json_schema, Anthropic output_config.format, Ollama format, and Google responseJsonSchema.

The typed helper derives the schema from your struct (all fields required, additionalProperties:false, pointers nullable; description:"..." and enum:"a,b,c" tags supported) and unmarshals the result:

type Verdict struct {
    Guilty bool   `json:"guilty"`
    Why    string `json:"why" description:"one-sentence rationale"`
}
v, err := majordomo.Generate[Verdict](ctx, m, req)

Agents

An agent is a model + system prompt + toolboxes, run as a tool-dispatch loop until the model answers or the MaxSteps budget runs out:

import "gitea.stevedudenhoeffer.com/steve/majordomo/agent"

a := agent.New(m, "You are a research assistant.",
    agent.WithToolbox(searchTools),
    agent.WithMaxSteps(8),
    agent.WithStepObserver(func(s agent.Step) { log.Printf("step %d", s.Index) }),
)
res, err := a.Run(ctx, "What changed in Go 1.26?")
// res.Output, res.Steps, res.Usage; res.Messages round-trips via
// agent.WithHistory for conversation continuation.

The loop never panics: tool handler errors and panics become error results the model can react to; unknown tools likewise; duplicate tool names across toolboxes fail loudly. On agent.ErrMaxSteps (and on model errors) the partial result with the full transcript is still returned.

Supervision hooks for orchestrators: WithMaxStepsFunc (dynamic step budget), WithSteer (inject messages into a running agent), WithCompactor (transform the outbound transcript when context grows — the canonical Result.Messages stays complete), and WithToolErrorLimits (circuit breakers for all-error steps and identical repeated calls, surfacing agent.ErrToolLoop).

Skills

Skills are reusable instruction+tool bundles attachable to any agent, at construction or on demand. Instructions extend the system prompt; tools extend the toolset — additively, in attachment order.

import (
    "gitea.stevedudenhoeffer.com/steve/majordomo/skill"
    "gitea.stevedudenhoeffer.com/steve/majordomo/skill/calc"
    "gitea.stevedudenhoeffer.com/steve/majordomo/skill/clock"
)

research := skill.New("research",
    skill.WithInstructions("Cite a source for every claim."),
    skill.WithTools(searchTool, fetchTool),
)

a := agent.New(m, "You are helpful.", agent.WithSkill(research))
a.AddSkill(clock.New()) // ready-made: time awareness
a.AddSkill(calc.New())  // ready-made: exact arithmetic

Anything implementing the three-method agent.Skill interface (Name / Instructions / Tools) is a skill — skill.New is just the convenient way to build one.
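As a rough picture of what additive composition means, the sketch below defines a three-method interface with the method names given above and folds skills into a prompt and toolset in attachment order. The return types, the `Tool` stand-in, and the `compose` function are assumptions made for this example; the real agent.Skill signatures may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Tool is a stand-in for a real tool definition.
type Tool struct{ Name string }

// Skill mirrors the three method names from the README; the return types
// here are assumptions.
type Skill interface {
	Name() string
	Instructions() string
	Tools() []Tool
}

// static is the simplest possible Skill: fixed values.
type static struct {
	name, instructions string
	tools              []Tool
}

func (s static) Name() string         { return s.name }
func (s static) Instructions() string { return s.instructions }
func (s static) Tools() []Tool        { return s.tools }

// compose appends each skill's instructions to the base prompt and its tools
// to the toolset, in attachment order. Duplicate tool names are an error.
func compose(base string, skills ...Skill) (string, []Tool, error) {
	prompt := []string{base}
	var tools []Tool
	seen := map[string]string{} // tool name -> skill that provided it
	for _, s := range skills {
		if ins := s.Instructions(); ins != "" {
			prompt = append(prompt, ins)
		}
		for _, t := range s.Tools() {
			if owner, dup := seen[t.Name]; dup {
				return "", nil, fmt.Errorf("tool %q provided by both %q and %q", t.Name, owner, s.Name())
			}
			seen[t.Name] = s.Name()
			tools = append(tools, t)
		}
	}
	return strings.Join(prompt, "\n\n"), tools, nil
}

func main() {
	research := static{"research", "Cite a source for every claim.", []Tool{{"search"}, {"fetch"}}}
	clock := static{"clock", "Use now() for the current time.", []Tool{{"now"}}}

	prompt, tools, err := compose("You are helpful.", research, clock)
	fmt.Println(prompt)
	fmt.Println(tools, err)
}
```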

Feature/provider support matrix

Provider	Resolve/Parse	Chat	Streaming	Tools	Structured	Images	Env DSN
OpenAI (+compatible)	✅	✅	✅	✅	✅	✅	✅
Anthropic (+compat)	✅	✅	✅	✅	✅	✅	✅
Google (Gemini)	✅	✅	✅	✅	✅	✅	✅
Ollama Cloud	✅	✅	✅	✅	✅	✅	✅
Ollama (local)	✅	✅	✅	✅	✅	✅	✅
foreman	✅	✅	✅¹	✅	✅	✅	✅
llama-swap	✅	✅	✅	✅²	✅²	✅²	✅
fake (testing)	✅	✅	✅	✅	✅	✅	—

¹ foreman's daemon currently buffers sync chat responses (no token-by-token streaming); majordomo's stream API works against it and delivers the response as a single delta plus final event.

² llama-swap's chat is OpenAI-compatible and reuses the openai client, so these capabilities are present at the client level; whether a given call succeeds depends on the llama.cpp model llama-swap loads. llama-swap also provides image generation (a separate imagegen axis, not shown above) and management methods on *llamaswap.Provider.

Notes: Ollama has no native tool_choice — "none" drops the tools; "required"/named choices are best-effort ignored there. Ollama Cloud ignores the format field (verified live), so the provider also states the schema as an explicit system instruction — constrained decoding on local Ollama, instruction-guided JSON on cloud, one canonical API either way.

Cross-cutting: Parse grammar ✅ · aliases/tiers ✅ · failover chains ✅ · health tracking/backoff ✅ · LLM_* env DSNs ✅ · media pipeline ✅ (per-target normalization in chains) · agent loop ✅ · Generate[T] + schema derivation ✅ · skills ✅ (with clock + calc examples).

Development

go build ./... && go vet ./... && go test -race -count=1 ./...

The default test suite is fully hermetic (no network, no credentials). Live integration tests (Phase 8) are gated behind the live build tag and read .env (see .env.example; never commit .env).

Design decisions are recorded in docs/adr/; conventions in CLAUDE.md; build history in progress.md; the mort conversion plan in docs/mort-migration.md.
