feat: OpenAI, Anthropic, and native-Ollama providers + media pipeline

Phase 3:
- provider/openai: Chat Completions for OpenAI + compat endpoints (SSE
  streaming with by-index tool-call assembly, response_format json_schema,
  legacy max_tokens option, reasoning_effort)
- provider/anthropic: Messages API (tool_use/tool_result, GA structured
  output via output_config.format, full SSE event parser, 529 transient)
- provider/ollama: one native /api/chat client behind the ollama,
  ollama-cloud, and foreman built-ins (presets; NDJSON streaming tolerant of
  foreman's buffered single-object responses; object tool arguments;
  format-schema structured output; think mapping)
- media/: capability normalization (sniff, downscale, transcode, byte
  ladder, ErrUnsupported), wired into the chain executor per target with
  penalty-free advance past incapable elements
- registry: real provider + scheme wiring, WithHTTPClient option, required
  env-foreman TLS chat round-trip test
- ADR-0009 multimodal strategy, ADR-0010 tools/structured mapping; README
  matrix + CLAUDE.md synced

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 12:58:08 +02:00
parent 323558ed72
commit 043249e0e1
31 changed files with 6194 additions and 74 deletions
@@ -100,12 +100,25 @@ Chains are health-tracked per target:

 | Provider | Spec name | Key env var | Default endpoint |
 |----------|-----------|-------------|------------------|
-| OpenAI (+compatible) | `openai` | `OPENAI_API_KEY` | api.openai.com *(pending)* |
-| Anthropic (+compatible) | `anthropic` | `ANTHROPIC_API_KEY` | api.anthropic.com *(pending)* |
+| OpenAI (+compatible) | `openai` | `OPENAI_API_KEY` | https://api.openai.com/v1 |
+| Anthropic (+compatible) | `anthropic` | `ANTHROPIC_API_KEY` | https://api.anthropic.com |
 | Google (Gemini) | `google` | `GOOGLE_API_KEY` / `GEMINI_API_KEY` | Gen AI API *(pending)* |
-| Ollama Cloud | `ollama-cloud` | `OLLAMA_API_KEY` | https://ollama.com *(pending)* |
-| Ollama (local) | `ollama` | — | `OLLAMA_HOST` or http://localhost:11434 *(pending)* |
-| foreman | `foreman` | — (token via DSN) | requires DSN/base URL *(pending)* |
+| Ollama Cloud | `ollama-cloud` | `OLLAMA_API_KEY` | https://ollama.com |
+| Ollama (local) | `ollama` | — | `OLLAMA_HOST` or http://localhost:11434 |
+| foreman | `foreman` | — (token via DSN) | requires an LLM_* DSN or `ollama.Foreman(url, token)` |
+
+OpenAI-compatible / Anthropic-compatible endpoints: construct the provider
+with a name and base URL and register it —
+
+```go
+reg.RegisterProvider(openai.New(
+    openai.WithName("groq"),
+    openai.WithBaseURL("https://api.groq.com/openai/v1"),
+    openai.WithAPIKey(key),
+    // openai.WithLegacyMaxTokens(), // for servers that only honor max_tokens
+))
+// now "groq/llama-3.3-70b" works in Parse, chains, and aliases
+```

 ### `LLM_*` env-DSN provider definitions

@@ -139,11 +152,15 @@ Implement the two-method `Provider` interface and register it:
 reg.RegisterProvider(myProvider) // now "myprovider/model-x" parses, chains, aliases
 ```

-## Multimodality *(pending — Phase 3)*
+## Multimodality

-Attach images without knowing the target's limits; majordomo normalizes
-(downscale, re-encode, count/size limits) against the resolved target's
-declared capabilities and rejects clearly what cannot fit.
+Attach images without knowing the target's limits. Before each attempt the
+request is normalized against the **actual serving target's** declared
+capabilities: the real format is sniffed from the bytes, oversize images
+are downscaled (aspect preserved), disallowed formats are re-encoded, and
+byte budgets are enforced by a quality ladder. What cannot be made to fit
+is rejected with a clear `ErrUnsupported` error — and in a chain, the
+request simply advances to the next (e.g. vision-capable) element.

 ```go
 resp, err := m.Generate(ctx, majordomo.Request{
@@ -154,7 +171,7 @@ resp, err := m.Generate(ctx, majordomo.Request{
 })
 ```

-## Tool calls *(canonical API ready; provider wiring pending — Phase 3)*
+## Tool calls

 ```go
 weather := majordomo.Tool{
@@ -171,14 +188,20 @@ resp, _ := m.Generate(ctx, req, majordomo.WithTools(weather))
 // resp.ToolCalls → execute → append ToolResultsMessage → continue
 ```

-## Structured output *(canonical API ready; provider wiring pending — Phase 3)*
+Each provider maps this one shape to its native function-calling format
+(OpenAI tools/tool_calls, Anthropic tool_use/tool_result, Ollama tools with
+object arguments). Tool-call IDs are synthesized when a backend omits them;
+when streaming, tool-call argument fragments are buffered until they parse.
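
The buffering and ID synthesis mentioned above can be sketched as a small by-index assembler. This is an illustrative sketch, not the library's code: `toolCallDelta`, `toolCall`, `assembler`, the `call_N` ID scheme, and the treatment of empty arguments as `{}` are all assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// toolCallDelta is a hypothetical streamed fragment of one tool call.
type toolCallDelta struct {
	Index int    // position of the call within the assistant message
	ID    string // may be empty on some backends
	Name  string // usually present only on the first fragment
	Args  string // partial JSON text
}

// toolCall is the assembled result handed to the caller.
type toolCall struct {
	ID   string
	Name string
	Args json.RawMessage
}

type pending struct {
	id, name string
	args     strings.Builder
}

type assembler struct{ calls map[int]*pending }

func newAssembler() *assembler { return &assembler{calls: map[int]*pending{}} }

// add appends a fragment to the call identified by its index.
func (a *assembler) add(d toolCallDelta) {
	p, ok := a.calls[d.Index]
	if !ok {
		p = &pending{}
		a.calls[d.Index] = p
	}
	if d.ID != "" {
		p.id = d.ID
	}
	if d.Name != "" {
		p.name = d.Name
	}
	p.args.WriteString(d.Args)
}

// finish returns the calls in index order. It fails if any buffered
// arguments are not valid JSON and synthesizes an ID where none arrived.
func (a *assembler) finish() ([]toolCall, error) {
	idxs := make([]int, 0, len(a.calls))
	for i := range a.calls {
		idxs = append(idxs, i)
	}
	sort.Ints(idxs)
	out := make([]toolCall, 0, len(idxs))
	for _, i := range idxs {
		p := a.calls[i]
		raw := p.args.String()
		if raw == "" {
			raw = "{}" // assumption: no fragments means no arguments
		}
		if !json.Valid([]byte(raw)) {
			return nil, fmt.Errorf("tool call %d: arguments are not valid JSON", i)
		}
		id := p.id
		if id == "" {
			id = fmt.Sprintf("call_%d", i)
		}
		out = append(out, toolCall{ID: id, Name: p.name, Args: json.RawMessage(raw)})
	}
	return out, nil
}
```

Keying on the index rather than the ID matters because, in this sketch, later fragments may carry neither an ID nor a name.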
+
+## Structured output

 ```go
 resp, _ := m.Generate(ctx, req, majordomo.WithSchema(schemaJSON, "answer"))
 ```

-A generic `Generate[T]` helper (schema from your struct, unmarshal into it)
-lands with the agent phase.
+Maps to OpenAI `response_format: json_schema`, Anthropic
+`output_config.format`, and Ollama `format`. A generic `Generate[T]` helper
+(schema from your struct, unmarshal into it) lands with the agent phase.

 ## Agents & skills *(pending — Phases 5–6)*

@@ -189,17 +212,25 @@ skills = reusable instruction+tool bundles attachable to any agent.

 | Provider | Resolve/Parse | Chat | Streaming | Tools | Structured | Images | Env DSN |
 |----------------------|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
-| OpenAI (+compatible) | ✅ | pending | pending | pending | pending | pending | ✅ |
-| Anthropic (+compat) | ✅ | pending | pending | pending | pending | pending | ✅ |
+| OpenAI (+compatible) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| Anthropic (+compat) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
 | Google (Gemini) | ✅ | pending | pending | pending | pending | pending | ✅ |
-| Ollama Cloud | ✅ | pending | pending | pending | pending | pending | ✅ |
-| Ollama (local) | ✅ | pending | pending | pending | pending | pending | ✅ |
-| foreman | ✅ | pending | pending | pending | pending | pending | ✅ |
+| Ollama Cloud | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| Ollama (local) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| foreman | ✅ | ✅ | ✅¹ | ✅ | ✅ | ✅ | ✅ |
 | fake (testing) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | — |

+¹ foreman's daemon currently buffers sync chat responses (no token-by-token
+streaming); majordomo's stream API works against it and delivers the
+response as a single delta plus final event.
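
One way a reader can tolerate both true NDJSON streaming and a buffered single-object reply is to decode consecutive JSON values rather than split on newlines. The sketch below assumes the Ollama `/api/chat` fields `message.content` and `done`; `chunk`, `readStream`, and `collect` are hypothetical names, and this is not necessarily how majordomo implements it.

```go
package main

import (
	"encoding/json"
	"errors"
	"io"
	"strings"
)

// chunk holds the subset of an /api/chat response object used here.
type chunk struct {
	Message struct {
		Content string `json:"content"`
	} `json:"message"`
	Done bool `json:"done"`
}

// readStream decodes JSON values from r one after another, calling onDelta
// for every non-empty content piece. It returns when a chunk has done=true
// or the body ends, so a single buffered object yields exactly one delta.
func readStream(r io.Reader, onDelta func(string)) error {
	dec := json.NewDecoder(r)
	for {
		var c chunk
		if err := dec.Decode(&c); err != nil {
			if errors.Is(err, io.EOF) {
				return nil
			}
			return err
		}
		if c.Message.Content != "" {
			onDelta(c.Message.Content)
		}
		if c.Done {
			return nil
		}
	}
}

// collect runs readStream over an in-memory body and gathers the deltas.
func collect(body string) ([]string, error) {
	var deltas []string
	err := readStream(strings.NewReader(body), func(s string) {
		deltas = append(deltas, s)
	})
	return deltas, err
}
```

With this approach the caller sees the same callback sequence shape in both cases; only the number of deltas differs.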
+
+Note: Ollama has no native tool_choice. `"none"` drops the tools;
+`"required"` and named choices are ignored there (best effort).
+
 Cross-cutting: Parse grammar ✅ · aliases/tiers ✅ · failover chains ✅ ·
-health tracking/backoff ✅ · LLM_* env DSNs ✅ · media pipeline pending ·
-agent loop pending · skills pending · `Generate[T]` pending.
+health tracking/backoff ✅ · LLM_* env DSNs ✅ · media pipeline ✅
+(per-target normalization in chains) · agent loop pending · skills pending
+· `Generate[T]` pending.

 ## Development