feat: OpenAI, Anthropic, and native-Ollama providers + media pipeline

Phase 3:
- provider/openai: Chat Completions for OpenAI + compat endpoints (SSE
  streaming with by-index tool-call assembly, response_format json_schema,
  legacy max_tokens option, reasoning_effort)
- provider/anthropic: Messages API (tool_use/tool_result, GA structured
  output via output_config.format, full SSE event parser, 529 transient)
- provider/ollama: one native /api/chat client behind the ollama,
  ollama-cloud, and foreman built-ins (presets; NDJSON streaming tolerant
  of foreman's buffered single-object responses; object tool arguments;
  format-schema structured output; think mapping)
- media/: capability normalization (sniff, downscale, transcode, byte
  ladder, ErrUnsupported), wired into the chain executor per target with
  penalty-free advance past incapable elements
- registry: real provider + scheme wiring, WithHTTPClient option, required
  env-foreman TLS chat round-trip test
- ADR-0009 multimodal strategy, ADR-0010 tools/structured mapping; README
  matrix + CLAUDE.md synced

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 12:58:08 +02:00
parent 323558ed72
commit 043249e0e1
31 changed files with 6194 additions and 74 deletions
+51 -20
@@ -100,12 +100,25 @@ Chains are health-tracked per target:
| Provider | Spec name | Key env var | Default endpoint |
|----------|-----------|-------------|------------------|
| OpenAI (+compatible) | `openai` | `OPENAI_API_KEY` | api.openai.com *(pending)* |
| Anthropic (+compatible) | `anthropic` | `ANTHROPIC_API_KEY` | api.anthropic.com *(pending)* |
| OpenAI (+compatible) | `openai` | `OPENAI_API_KEY` | https://api.openai.com/v1 |
| Anthropic (+compatible) | `anthropic` | `ANTHROPIC_API_KEY` | https://api.anthropic.com |
| Google (Gemini) | `google` | `GOOGLE_API_KEY` / `GEMINI_API_KEY` | Gen AI API *(pending)* |
| Ollama Cloud | `ollama-cloud` | `OLLAMA_API_KEY` | https://ollama.com *(pending)* |
| Ollama (local) | `ollama` | — | `OLLAMA_HOST` or http://localhost:11434 *(pending)* |
| foreman | `foreman` | — (token via DSN) | requires DSN/base URL *(pending)* |
| Ollama Cloud | `ollama-cloud` | `OLLAMA_API_KEY` | https://ollama.com |
| Ollama (local) | `ollama` | — | `OLLAMA_HOST` or http://localhost:11434 |
| foreman | `foreman` | — (token via DSN) | requires an LLM_* DSN or `ollama.Foreman(url, token)` |
OpenAI-compatible / Anthropic-compatible endpoints: construct the provider
with a name and base URL, then register it:
```go
reg.RegisterProvider(openai.New(
	openai.WithName("groq"),
	openai.WithBaseURL("https://api.groq.com/openai/v1"),
	openai.WithAPIKey(key),
	// openai.WithLegacyMaxTokens(), // for servers that only honor max_tokens
))
// now "groq/llama-3.3-70b" works in Parse, chains, and aliases
```
### `LLM_*` env-DSN provider definitions
@@ -139,11 +152,15 @@ Implement the two-method `Provider` interface and register it:
reg.RegisterProvider(myProvider) // now "myprovider/model-x" parses, chains, aliases
```
## Multimodality *(pending — Phase 3)*
## Multimodality
Attach images without knowing the target's limits; majordomo normalizes
(downscale, re-encode, count/size limits) against the resolved target's
declared capabilities and rejects clearly what cannot fit.
Attach images without knowing the target's limits. Before each attempt the
request is normalized against the **actual serving target's** declared
capabilities: the real format is sniffed from the bytes, oversize images
are downscaled (aspect preserved), disallowed formats are re-encoded, and
byte budgets are enforced by a quality ladder. What cannot be made to fit
is rejected with a clear `ErrUnsupported` error — and in a chain, the
request simply advances to the next (e.g. vision-capable) element.
```go
resp, err := m.Generate(ctx, majordomo.Request{
@@ -154,7 +171,7 @@ resp, err := m.Generate(ctx, majordomo.Request{
})
```
## Tool calls *(canonical API ready; provider wiring pending — Phase 3)*
## Tool calls
```go
weather := majordomo.Tool{
@@ -171,14 +188,20 @@ resp, _ := m.Generate(ctx, req, majordomo.WithTools(weather))
// resp.ToolCalls → execute → append ToolResultsMessage → continue
```
## Structured output *(canonical API ready; provider wiring pending — Phase 3)*
Each provider maps this one shape to its native function-calling format
(OpenAI tools/tool_calls, Anthropic tool_use/tool_result, Ollama tools with
object arguments). Tool-call ids are synthesized when a backend omits them;
streaming buffers tool-call arguments until they parse.
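As a rough sketch of what "buffer until they parse" and id synthesis can look like, the following accumulates streamed argument fragments by call index and only emits calls whose arguments are valid JSON. The types `toolDelta`, `toolCall`, and `assembler`, and the `call_N` id format, are hypothetical and not taken from the majordomo source.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// toolDelta is a hypothetical streamed fragment of one tool call.
type toolDelta struct {
	Index int
	ID    string
	Name  string
	Args  string // partial JSON text of the arguments
}

// toolCall is a hypothetical fully assembled call.
type toolCall struct {
	ID   string
	Name string
	Args json.RawMessage
}

type partial struct {
	id, name string
	args     strings.Builder
}

// assembler collects fragments keyed by the call's index in the stream.
type assembler struct {
	parts map[int]*partial
}

func (a *assembler) add(d toolDelta) {
	if a.parts == nil {
		a.parts = map[int]*partial{}
	}
	p := a.parts[d.Index]
	if p == nil {
		p = &partial{}
		a.parts[d.Index] = p
	}
	if d.ID != "" {
		p.id = d.ID
	}
	if d.Name != "" {
		p.name = d.Name
	}
	p.args.WriteString(d.Args)
}

// finish returns the calls in index order. It fails if any argument buffer
// is not yet valid JSON, and synthesizes an id when the backend sent none.
func (a *assembler) finish() ([]toolCall, error) {
	idx := make([]int, 0, len(a.parts))
	for i := range a.parts {
		idx = append(idx, i)
	}
	sort.Ints(idx)
	calls := make([]toolCall, 0, len(idx))
	for _, i := range idx {
		p := a.parts[i]
		raw := p.args.String()
		if raw == "" {
			raw = "{}" // a call without arguments
		}
		if !json.Valid([]byte(raw)) {
			return nil, fmt.Errorf("tool call %d: arguments do not parse yet", i)
		}
		id := p.id
		if id == "" {
			id = fmt.Sprintf("call_%d", i) // synthesized id (format is an assumption)
		}
		calls = append(calls, toolCall{ID: id, Name: p.name, Args: json.RawMessage(raw)})
	}
	return calls, nil
}

func main() {
	var a assembler
	a.add(toolDelta{Index: 0, ID: "abc", Name: "weather", Args: `{"city":`})
	a.add(toolDelta{Index: 1, Name: "time", Args: `{"tz":"UTC"}`})
	a.add(toolDelta{Index: 0, Args: `"Oslo"}`})
	calls, err := a.finish()
	fmt.Println(len(calls), err)
	for _, c := range calls {
		fmt.Println(c.ID, c.Name, string(c.Args))
	}
}
```

Keying on the index rather than the id matters because, in OpenAI-style streams, only the first fragment of a call carries its id and name; later fragments carry just the index and more argument text.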
## Structured output
```go
resp, _ := m.Generate(ctx, req, majordomo.WithSchema(schemaJSON, "answer"))
```
A generic `Generate[T]` helper (schema from your struct, unmarshal into it)
lands with the agent phase.
Maps to OpenAI `response_format: json_schema`, Anthropic
`output_config.format`, and Ollama `format`. A generic `Generate[T]` helper
(schema from your struct, unmarshal into it) lands with the agent phase.
## Agents & skills *(pending — Phases 5–6)*
@@ -189,17 +212,25 @@ skills = reusable instruction+tool bundles attachable to any agent.
| Provider | Resolve/Parse | Chat | Streaming | Tools | Structured | Images | Env DSN |
|----------------------|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| OpenAI (+compatible) | ✅ | pending | pending | pending | pending | pending | ✅ |
| Anthropic (+compat) | ✅ | pending | pending | pending | pending | pending | ✅ |
| OpenAI (+compatible) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Anthropic (+compat) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Google (Gemini) | ✅ | pending | pending | pending | pending | pending | ✅ |
| Ollama Cloud | ✅ | pending | pending | pending | pending | pending | ✅ |
| Ollama (local) | ✅ | pending | pending | pending | pending | pending | ✅ |
| foreman | ✅ | pending | pending | pending | pending | pending | ✅ |
| Ollama Cloud | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Ollama (local) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| foreman | ✅ | ✅ | ✅¹ | ✅ | ✅ | ✅ | ✅ |
| fake (testing) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | — |
¹ foreman's daemon currently buffers sync chat responses (no token-by-token
streaming); majordomo's stream API works against it and delivers the
response as a single delta plus final event.
Notes: Ollama has no native tool_choice. `"none"` drops the tools;
`"required"` and named choices are ignored there, so they are best-effort only.
Cross-cutting: Parse grammar ✅ · aliases/tiers ✅ · failover chains ✅ ·
health tracking/backoff ✅ · LLM_* env DSNs ✅ · media pipeline pending ·
agent loop pending · skills pending · `Generate[T]` pending.
health tracking/backoff ✅ · LLM_* env DSNs ✅ · media pipeline ✅
(per-target normalization in chains) · agent loop pending · skills pending
· `Generate[T]` pending.
## Development