feat: OpenAI, Anthropic, and native-Ollama providers + media pipeline

Phase 3:
- provider/openai: Chat Completions for OpenAI + compat endpoints (SSE
  streaming with by-index tool-call assembly, response_format json_schema,
  legacy max_tokens option, reasoning_effort)
- provider/anthropic: Messages API (tool_use/tool_result, GA structured
  output via output_config.format, full SSE event parser, 529 transient)
- provider/ollama: one native /api/chat client behind the ollama,
  ollama-cloud, and foreman built-ins (presets; NDJSON streaming tolerant
  of foreman's buffered single-object responses; object tool arguments;
  format-schema structured output; think mapping)
- media/: capability normalization (sniff, downscale, transcode, byte
  ladder, ErrUnsupported), wired into the chain executor per target with
  penalty-free advance past incapable elements
- registry: real provider + scheme wiring, WithHTTPClient option, required
  env-foreman TLS chat round-trip test
- ADR-0009 multimodal strategy, ADR-0010 tools/structured mapping; README
  matrix + CLAUDE.md synced

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@@ -100,12 +100,25 @@ Chains are health-tracked per target:

 | Provider | Spec name | Key env var | Default endpoint |
 |----------|-----------|-------------|------------------|
-| OpenAI (+compatible) | `openai` | `OPENAI_API_KEY` | api.openai.com *(pending)* |
-| Anthropic (+compatible) | `anthropic` | `ANTHROPIC_API_KEY` | api.anthropic.com *(pending)* |
+| OpenAI (+compatible) | `openai` | `OPENAI_API_KEY` | https://api.openai.com/v1 |
+| Anthropic (+compatible) | `anthropic` | `ANTHROPIC_API_KEY` | https://api.anthropic.com |
 | Google (Gemini) | `google` | `GOOGLE_API_KEY` / `GEMINI_API_KEY` | Gen AI API *(pending)* |
-| Ollama Cloud | `ollama-cloud` | `OLLAMA_API_KEY` | https://ollama.com *(pending)* |
-| Ollama (local) | `ollama` | — | `OLLAMA_HOST` or http://localhost:11434 *(pending)* |
-| foreman | `foreman` | — (token via DSN) | requires DSN/base URL *(pending)* |
+| Ollama Cloud | `ollama-cloud` | `OLLAMA_API_KEY` | https://ollama.com |
+| Ollama (local) | `ollama` | — | `OLLAMA_HOST` or http://localhost:11434 |
+| foreman | `foreman` | — (token via DSN) | requires an LLM_* DSN or `ollama.Foreman(url, token)` |
+
+OpenAI-compatible / Anthropic-compatible endpoints: construct the provider
+with a name and base URL and register it —
+
+```go
+reg.RegisterProvider(openai.New(
+	openai.WithName("groq"),
+	openai.WithBaseURL("https://api.groq.com/openai/v1"),
+	openai.WithAPIKey(key),
+	// openai.WithLegacyMaxTokens(), // for servers that only honor max_tokens
+))
+// now "groq/llama-3.3-70b" works in Parse, chains, and aliases
+```

 ### `LLM_*` env-DSN provider definitions

@@ -139,11 +152,15 @@ Implement the two-method `Provider` interface and register it:
 reg.RegisterProvider(myProvider) // now "myprovider/model-x" parses, chains, aliases
 ```

-## Multimodality *(pending — Phase 3)*
+## Multimodality

-Attach images without knowing the target's limits; majordomo normalizes
-(downscale, re-encode, count/size limits) against the resolved target's
-declared capabilities and rejects clearly what cannot fit.
+Attach images without knowing the target's limits. Before each attempt the
+request is normalized against the **actual serving target's** declared
+capabilities: the real format is sniffed from the bytes, oversize images
+are downscaled (aspect preserved), disallowed formats are re-encoded, and
+byte budgets are enforced by a quality ladder. What cannot be made to fit
+is rejected with a clear `ErrUnsupported` error — and in a chain, the
+request simply advances to the next (e.g. vision-capable) element.

 ```go
 resp, err := m.Generate(ctx, majordomo.Request{
@@ -154,7 +171,7 @@ resp, err := m.Generate(ctx, majordomo.Request{
 })
 ```

-## Tool calls *(canonical API ready; provider wiring pending — Phase 3)*
+## Tool calls

 ```go
 weather := majordomo.Tool{
@@ -171,14 +188,20 @@ resp, _ := m.Generate(ctx, req, majordomo.WithTools(weather))
 // resp.ToolCalls → execute → append ToolResultsMessage → continue
 ```

-## Structured output *(canonical API ready; provider wiring pending — Phase 3)*
+Each provider maps this one shape to its native function-calling format
+(OpenAI tools/tool_calls, Anthropic tool_use/tool_result, Ollama tools with
+object arguments). Tool-call ids are synthesized when a backend omits them;
+streaming buffers tool-call arguments until they parse.
+
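To make "streaming buffers tool-call arguments until they parse" and "ids are synthesized" concrete, here is a rough sketch of by-index assembly. The `fragment` and `toolCall` types, the `assemble` function, and the `call_N` id format are hypothetical; they are not majordomo's types, only an illustration of the technique under those assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// fragment is a hypothetical, simplified streaming delta: the arguments
// arrive as pieces of JSON text tagged with the index of the call they extend.
type fragment struct {
	Index int
	ID    string // may be empty on some backends
	Name  string // usually present only on the first piece
	Args  string // partial JSON text
}

type toolCall struct {
	ID, Name string
	Args     json.RawMessage
}

type pending struct {
	id, name string
	args     strings.Builder
}

// assemble concatenates fragments per index and returns only the calls whose
// accumulated arguments are valid JSON, synthesizing an id when none arrived.
func assemble(frags []fragment) []toolCall {
	byIndex := map[int]*pending{}
	for _, f := range frags {
		p, ok := byIndex[f.Index]
		if !ok {
			p = &pending{}
			byIndex[f.Index] = p
		}
		if f.ID != "" {
			p.id = f.ID
		}
		if f.Name != "" {
			p.name = f.Name
		}
		p.args.WriteString(f.Args)
	}

	idxs := make([]int, 0, len(byIndex))
	for i := range byIndex {
		idxs = append(idxs, i)
	}
	sort.Ints(idxs) // emit calls in index order, not map order

	var out []toolCall
	for _, i := range idxs {
		p := byIndex[i]
		raw := p.args.String()
		if !json.Valid([]byte(raw)) {
			continue // incomplete so far; a live stream would keep buffering
		}
		id := p.id
		if id == "" {
			id = fmt.Sprintf("call_%d", i) // synthesized id; format is an assumption
		}
		out = append(out, toolCall{ID: id, Name: p.name, Args: json.RawMessage(raw)})
	}
	return out
}

func main() {
	calls := assemble([]fragment{
		{Index: 0, Name: "get_weather", Args: `{"city":`},
		{Index: 1, ID: "abc", Name: "get_time", Args: `{"tz":"UTC"}`},
		{Index: 0, Args: `"Oslo"}`},
	})
	for _, c := range calls {
		fmt.Println(c.ID, c.Name, string(c.Args))
	}
}
```

Keying the buffers by index rather than by arrival order is what allows interleaved pieces of two parallel calls to be reassembled correctly.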
+## Structured output

 ```go
 resp, _ := m.Generate(ctx, req, majordomo.WithSchema(schemaJSON, "answer"))
 ```

-A generic `Generate[T]` helper (schema from your struct, unmarshal into it)
-lands with the agent phase.
+Maps to OpenAI `response_format: json_schema`, Anthropic
+`output_config.format`, and Ollama `format`. A generic `Generate[T]` helper
+(schema from your struct, unmarshal into it) lands with the agent phase.

 ## Agents & skills *(pending — Phases 5–6)*

@@ -189,17 +212,25 @@ skills = reusable instruction+tool bundles attachable to any agent.

 | Provider | Resolve/Parse | Chat | Streaming | Tools | Structured | Images | Env DSN |
 |----------------------|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
-| OpenAI (+compatible) | ✅ | pending | pending | pending | pending | pending | ✅ |
-| Anthropic (+compat) | ✅ | pending | pending | pending | pending | pending | ✅ |
+| OpenAI (+compatible) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| Anthropic (+compat) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
 | Google (Gemini) | ✅ | pending | pending | pending | pending | pending | ✅ |
-| Ollama Cloud | ✅ | pending | pending | pending | pending | pending | ✅ |
-| Ollama (local) | ✅ | pending | pending | pending | pending | pending | ✅ |
-| foreman | ✅ | pending | pending | pending | pending | pending | ✅ |
+| Ollama Cloud | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| Ollama (local) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
+| foreman | ✅ | ✅ | ✅¹ | ✅ | ✅ | ✅ | ✅ |
 | fake (testing) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | — |

+¹ foreman's daemon currently buffers sync chat responses (no token-by-token
+streaming); majordomo's stream API works against it and delivers the
+response as a single delta plus final event.
+
+Notes: Ollama has no native tool_choice — `"none"` drops the tools;
+`"required"`/named choices are best-effort ignored there.
+
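The tool_choice rule in the notes above reduces to a small filter applied before the request is built. The sketch below is an assumption about how such a filter could look; the `tool` type and `toolsForOllama` function are placeholders, not names from the codebase.

```go
package main

import "fmt"

// tool is a placeholder for whatever tool definition type the request carries.
type tool struct{ Name string }

// toolsForOllama applies the rule from the notes: "none" removes the tools
// from the request, and any other choice ("auto", "required", or a named
// tool) passes them through unchanged because there is no field to map it to.
func toolsForOllama(choice string, tools []tool) []tool {
	if choice == "none" {
		return nil
	}
	return tools
}

func main() {
	tools := []tool{{Name: "get_weather"}}
	fmt.Println(len(toolsForOllama("none", tools)))     // tools dropped
	fmt.Println(len(toolsForOllama("required", tools))) // choice ignored, tools kept
}
```

Because `"required"` cannot be enforced this way, callers that depend on a tool call should still handle a plain text reply from an Ollama-backed target.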
 Cross-cutting: Parse grammar ✅ · aliases/tiers ✅ · failover chains ✅ ·
-health tracking/backoff ✅ · LLM_* env DSNs ✅ · media pipeline pending ·
-agent loop pending · skills pending · `Generate[T]` pending.
+health tracking/backoff ✅ · LLM_* env DSNs ✅ · media pipeline ✅
+(per-target normalization in chains) · agent loop pending · skills pending
+· `Generate[T]` pending.

 ## Development
