feat: OpenAI, Anthropic, and native-Ollama providers + media pipeline

Phase 3:
- provider/openai: Chat Completions for OpenAI + compat endpoints (SSE
  streaming with by-index tool-call assembly, response_format json_schema,
  legacy max_tokens option, reasoning_effort)
- provider/anthropic: Messages API (tool_use/tool_result, GA structured
  output via output_config.format, full SSE event parser, 529 transient)
- provider/ollama: one native /api/chat client behind the ollama,
  ollama-cloud, and foreman built-ins (presets; NDJSON streaming tolerant of
  foreman's buffered single-object responses; object tool arguments;
  format-schema structured output; think mapping)
- media/: capability normalization (sniff, downscale, transcode, byte
  ladder, ErrUnsupported), wired into the chain executor per target with
  penalty-free advance past incapable elements
- registry: real provider + scheme wiring, WithHTTPClient option, required
  env-foreman TLS chat round-trip test
- ADR-0009 multimodal strategy, ADR-0010 tools/structured mapping; README
  matrix + CLAUDE.md synced

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@@ -1,5 +1,45 @@
# progress

## 2026-06-10 — Phase 3: REST providers (OpenAI, Anthropic, Ollama×3) + media

**Landed:**
- `provider/openai`: Chat Completions client for OpenAI and every
  OpenAI-compatible endpoint (tools with string-arguments mapping, strict
  SSE streaming incl. by-index tool-call assembly and the empty-choices
  usage chunk, response_format json_schema, max_completion_tokens with a
  WithLegacyMaxTokens compat option, reasoning_effort).
- `provider/anthropic`: Messages API client (anthropic-version 2023-06-01,
  required-max_tokens defaulting, tool_use/tool_result blocks with native
  is_error, GA structured output via output_config.format, full SSE event
  parser with input_json_delta buffering, 529-overloaded classified
  transient, usage sums cache tokens).
- `provider/ollama`: ONE native /api/chat client serving ollama (local,
  OLLAMA_HOST normalization), ollama-cloud (https://ollama.com + bearer
  OLLAMA_API_KEY), and foreman (base URL + bearer; tolerates its
  buffered-single-object "streaming"). Object tool arguments, tool_name
  results, format-schema structured output, think-level mapping, NDJSON
  streaming with 16MB lines.
- `media/`: normalization pipeline per ADR-0009 (magic-byte sniffing,
  box-filter downscale, transcode preference ladder, byte-budget quality
  ladder, webp passthrough-or-reject, copy-on-write, everything-unfittable
  wraps ErrUnsupported).
- Chain executor now normalizes media PER TARGET before each attempt and
  advances penalty-free past targets that can't take the request (proven:
  text-only head + vision fallback; per-target downscale assertions).
- Registry: real providers + scheme factories wired for openai, anthropic,
  ollama, ollama-cloud, foreman (google still stubbed, Phase 4);
  WithHTTPClient registry option; required env-foreman TLS chat round-trip
  test (LLM_FM=foreman://token@host → Parse("fm/qwen3:30b") → bearer
  arrives, chat answers).
- ADR-0009 (multimodal), ADR-0010 (tools/structured mapping); README
  matrix flipped to ✅ for the four landed provider families; ~70 new
  hermetic tests across the three provider packages + media.
- Run note: openai/anthropic/media were built by three parallel
  subagents against the frozen llm contract; ollama/foreman, chain wiring,
  and registry integration done in the main line. All gates green.

**Next:** Phase 4 — Google provider on google.golang.org/genai.
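The by-index tool-call assembly listed for `provider/openai` can be sketched roughly as follows. This is an illustration, not the package's actual code: `ToolCallDelta`, `ToolCall`, and `AssembleToolCalls` are hypothetical names, and the fragment shape is simplified from streamed `tool_calls` deltas, where each fragment carries an index and the argument JSON arrives in pieces.

```go
package main

import "sort"

// ToolCallDelta is a hypothetical, simplified view of one streamed
// tool-call fragment. Only Index is assumed to be present on every
// fragment; the other fields may be empty.
type ToolCallDelta struct {
	Index     int
	ID        string
	Name      string
	Arguments string // partial JSON text, concatenated in arrival order
}

// ToolCall is the assembled result for one index.
type ToolCall struct {
	ID, Name, Arguments string
}

// AssembleToolCalls merges fragments that share an index and returns the
// calls ordered by index. Interleaved fragments for different calls are
// kept apart because the index, not arrival order, selects the target.
func AssembleToolCalls(deltas []ToolCallDelta) []ToolCall {
	byIndex := map[int]*ToolCall{}
	for _, d := range deltas {
		tc, ok := byIndex[d.Index]
		if !ok {
			tc = &ToolCall{}
			byIndex[d.Index] = tc
		}
		if d.ID != "" {
			tc.ID = d.ID
		}
		if d.Name != "" {
			tc.Name = d.Name
		}
		tc.Arguments += d.Arguments
	}

	indexes := make([]int, 0, len(byIndex))
	for i := range byIndex {
		indexes = append(indexes, i)
	}
	sort.Ints(indexes)

	out := make([]ToolCall, 0, len(indexes))
	for _, i := range indexes {
		out = append(out, *byIndex[i])
	}
	return out
}
```

The assembled `Arguments` string would still need to be validated as JSON once the stream ends; that step is left out of this sketch.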
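The NDJSON reading noted for `provider/ollama` (large lines, and tolerance of a response that arrives as one buffered object) might look something like the sketch below. It is an assumption-laden illustration: `Chunk`, `ReadNDJSON`, and `CollectText` are hypothetical names, the chunk shape is reduced to a `message.content` string and a `done` flag, and the single-object case is only handled when the object arrives on one line.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// maxLineBytes mirrors the 16MB line limit mentioned in the log.
const maxLineBytes = 16 << 20

// Chunk is a hypothetical, reduced view of one streamed chat object.
type Chunk struct {
	Message struct {
		Content string `json:"content"`
	} `json:"message"`
	Done bool `json:"done"`
}

// ReadNDJSON decodes one JSON object per line and passes it to emit.
// Blank lines are skipped, and a final line without a trailing newline
// is still delivered, so a body holding a single object yields exactly
// one chunk.
func ReadNDJSON(r io.Reader, emit func(Chunk) error) error {
	sc := bufio.NewScanner(r)
	// Raise the scanner's default 64KB token limit.
	sc.Buffer(make([]byte, 0, 64*1024), maxLineBytes)
	for sc.Scan() {
		line := bytes.TrimSpace(sc.Bytes())
		if len(line) == 0 {
			continue
		}
		var c Chunk
		if err := json.Unmarshal(line, &c); err != nil {
			return fmt.Errorf("decode NDJSON line: %w", err)
		}
		if err := emit(c); err != nil {
			return err
		}
	}
	return sc.Err()
}

// CollectText is a small convenience for examples: it concatenates the
// content of every chunk and reports how many chunks were seen.
func CollectText(body string) (string, int, error) {
	var sb strings.Builder
	n := 0
	err := ReadNDJSON(strings.NewReader(body), func(c Chunk) error {
		sb.WriteString(c.Message.Content)
		n++
		return nil
	})
	return sb.String(), n, err
}
```

With this shape the same loop serves both a true line-by-line stream and a server that buffers the whole reply into one object, which is the behaviour the log attributes to foreman.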
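The penalty-free advance described for the chain executor can be illustrated with a minimal sketch. Everything here is hypothetical and much simpler than a real executor: `Target`, `Run`, `Unsupported`, `ErrTransient`, and the `Penalties` counter are invented for the example, and the local `ErrUnsupported` only stands in for the sentinel the log mentions. The one idea shown is that an error wrapping `ErrUnsupported` moves the chain forward without marking the target unhealthy.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnsupported stands in for the sentinel that signals "this target
// cannot take the request" (for example, images sent to a text-only model).
var ErrUnsupported = errors.New("unsupported by target")

// ErrTransient stands in for an ordinary failure that should count
// against a target's health.
var ErrTransient = errors.New("transient failure")

// Unsupported wraps ErrUnsupported with context, as a media
// normalization step might.
func Unsupported(what string) error {
	return fmt.Errorf("%s: %w", what, ErrUnsupported)
}

// Target is a hypothetical chain element with a canned outcome.
type Target struct {
	Name      string
	Err       error  // returned by the attempt when non-nil
	Reply     string // returned when Err is nil
	Penalties int    // health penalties recorded by Run
}

func (t *Target) attempt() (string, error) {
	if t.Err != nil {
		return "", t.Err
	}
	return t.Reply, nil
}

// Run tries each target in order. Incapable targets are skipped without
// a penalty; any other failure is penalized before moving on.
func Run(chain []*Target) (string, error) {
	var last error
	for _, t := range chain {
		out, err := t.attempt()
		if err == nil {
			return out, nil
		}
		last = fmt.Errorf("%s: %w", t.Name, err)
		if errors.Is(err, ErrUnsupported) {
			continue // incapable, not unhealthy
		}
		t.Penalties++
	}
	if last == nil {
		last = errors.New("empty chain")
	}
	return "", last
}
```

A chain of a text-only head, a failing middle target, and a vision fallback would return the fallback's reply, with only the failing middle target penalized.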
## 2026-06-10 — Phase 2: health + failover chain, proven

**Landed:** the full deterministic failover test matrix over the fake