majordomo/progress.md
steve 043249e0e1 feat: OpenAI, Anthropic, and native-Ollama providers + media pipeline
Phase 3:
- provider/openai: Chat Completions for OpenAI + compat endpoints (SSE
  streaming with by-index tool-call assembly, response_format json_schema,
  legacy max_tokens option, reasoning_effort)
- provider/anthropic: Messages API (tool_use/tool_result, GA structured
  output via output_config.format, full SSE event parser, 529 transient)
- provider/ollama: one native /api/chat client behind the ollama,
  ollama-cloud, and foreman built-ins (presets; NDJSON streaming tolerant
  of foreman's buffered single-object responses; object tool arguments;
  format-schema structured output; think mapping)
- media/: capability normalization (sniff, downscale, transcode, byte
  ladder, ErrUnsupported), wired into the chain executor per target with
  penalty-free advance past incapable elements
- registry: real provider + scheme wiring, WithHTTPClient option, required
  env-foreman TLS chat round-trip test
- ADR-0009 multimodal strategy, ADR-0010 tools/structured mapping; README
  matrix + CLAUDE.md synced

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 12:58:08 +02:00

# progress
## 2026-06-10 — Phase 3: REST providers (OpenAI, Anthropic, Ollama×3) + media
**Landed:**
- `provider/openai`: Chat Completions client for OpenAI and every
OpenAI-compatible endpoint (tools with string-arguments mapping, strict
SSE streaming incl. by-index tool-call assembly and the empty-choices
usage chunk, response_format json_schema, max_completion_tokens with a
WithLegacyMaxTokens compat option, reasoning_effort).
- `provider/anthropic`: Messages API client (anthropic-version 2023-06-01,
required-max_tokens defaulting, tool_use/tool_result blocks with native
is_error, GA structured output via output_config.format, full SSE event
parser with input_json_delta buffering, 529-overloaded classified
transient, usage sums cache tokens).
- `provider/ollama`: ONE native /api/chat client serving ollama (local,
OLLAMA_HOST normalization), ollama-cloud (https://ollama.com + bearer
OLLAMA_API_KEY), and foreman (base URL + bearer; tolerates its
buffered-single-object "streaming"). Object tool arguments, tool_name
results, format-schema structured output, think-level mapping, NDJSON
streaming with 16MB lines.
- `media/`: normalization pipeline per ADR-0009 (magic-byte sniffing,
box-filter downscale, transcode preference ladder, byte-budget quality
ladder, webp passthrough-or-reject, copy-on-write, everything-unfittable
wraps ErrUnsupported).
- Chain executor now normalizes media PER TARGET before each attempt and
advances penalty-free past targets that can't take the request (proven:
text-only head + vision fallback; per-target downscale assertions).
- Registry: real providers + scheme factories wired for openai, anthropic,
ollama, ollama-cloud, foreman (google still stubbed, Phase 4);
WithHTTPClient registry option; required env-foreman TLS chat round-trip
test (LLM_FM=foreman://token@host → Parse("fm/qwen3:30b") → bearer
arrives, chat answers).
- ADR-0009 (multimodal), ADR-0010 (tools/structured mapping); README
matrix flipped to ✅ for the four landed provider families; ~70 new
hermetic tests across the three provider packages + media.
- Run note: openai/anthropic/media were built by three parallel
subagents against the frozen llm contract; ollama/foreman, chain wiring,
and registry integration done in the main line. All gates green.
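
The by-index tool-call assembly mentioned for `provider/openai` can be sketched as follows. This is a minimal illustration, not the package's actual code: `ToolCallDelta`, `ToolCall`, and `Assembler` are hypothetical names, and the delta shape is a simplified view of the streamed fragments (id and name usually arrive once, the arguments string arrives in pieces).

```go
package main

import (
	"sort"
	"strings"
)

// ToolCallDelta is a hypothetical, simplified view of one streamed
// tool-call fragment. Index says which call the fragment belongs to.
type ToolCallDelta struct {
	Index     int
	ID        string // typically set only on the first fragment
	Name      string // typically set only on the first fragment
	Arguments string // a piece of the JSON arguments string
}

// ToolCall is the fully assembled call.
type ToolCall struct {
	ID        string
	Name      string
	Arguments string
}

type pendingCall struct {
	id, name string
	args     strings.Builder
}

// Assembler accumulates fragments keyed by index, so interleaved
// fragments of parallel calls do not corrupt each other.
type Assembler struct {
	calls map[int]*pendingCall
}

func NewAssembler() *Assembler {
	return &Assembler{calls: make(map[int]*pendingCall)}
}

// Add merges one fragment into the call at its index.
func (a *Assembler) Add(d ToolCallDelta) {
	p, ok := a.calls[d.Index]
	if !ok {
		p = &pendingCall{}
		a.calls[d.Index] = p
	}
	if d.ID != "" {
		p.id = d.ID
	}
	if d.Name != "" {
		p.name = d.Name
	}
	p.args.WriteString(d.Arguments)
}

// Finish returns the assembled calls ordered by index.
func (a *Assembler) Finish() []ToolCall {
	indexes := make([]int, 0, len(a.calls))
	for i := range a.calls {
		indexes = append(indexes, i)
	}
	sort.Ints(indexes)
	out := make([]ToolCall, 0, len(indexes))
	for _, i := range indexes {
		p := a.calls[i]
		out = append(out, ToolCall{ID: p.id, Name: p.name, Arguments: p.args.String()})
	}
	return out
}
```

Keying on the index rather than on arrival order is what keeps two concurrently streamed calls separate; the arguments are only valid JSON once the stream has ended, so parsing belongs after `Finish`.
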
**Next:** Phase 4 — Google provider on google.golang.org/genai.
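
For the NDJSON streaming noted under `provider/ollama` in this phase, a line reader with a raised line limit might look like the sketch below. It assumes a reduced chunk shape (`message.content` and `done`); `readNDJSON`, `collectText`, and `chunk` are hypothetical names, not the package's API. A buffered response that arrives as a single JSON object is just a one-line stream, so the same loop covers it.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"io"
)

// maxLine is the largest single NDJSON line accepted (16 MiB).
const maxLine = 16 << 20

// chunk is a hypothetical subset of one streamed chat object.
type chunk struct {
	Message struct {
		Content string `json:"content"`
	} `json:"message"`
	Done bool `json:"done"`
}

// readNDJSON decodes one JSON object per line and hands each to fn.
// It stops at the first object with done=true or at end of input.
func readNDJSON(r io.Reader, fn func(chunk) error) error {
	sc := bufio.NewScanner(r)
	// The default Scanner limit is 64 KiB per line; raise it.
	sc.Buffer(make([]byte, 0, 64*1024), maxLine)
	for sc.Scan() {
		line := bytes.TrimSpace(sc.Bytes())
		if len(line) == 0 {
			continue // tolerate blank lines between objects
		}
		var c chunk
		if err := json.Unmarshal(line, &c); err != nil {
			return err
		}
		if err := fn(c); err != nil {
			return err
		}
		if c.Done {
			return nil
		}
	}
	return sc.Err()
}

// collectText is a small demo helper: it concatenates the content
// of every chunk in body.
func collectText(body []byte) (string, error) {
	var text bytes.Buffer
	err := readNDJSON(bytes.NewReader(body), func(c chunk) error {
		text.WriteString(c.Message.Content)
		return nil
	})
	return text.String(), err
}
```

`bufio.Scanner` returns `bufio.ErrTooLong` when a line exceeds the configured maximum, so an oversized line surfaces as an error from `sc.Err()` instead of being silently truncated.
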
## 2026-06-10 — Phase 2: health + failover chain, proven
**Landed:** the full deterministic failover test matrix over the fake
provider + fake clock (no sleeps, no network): single-transient recovery
via same-target retry; repeated transients bench + advance; cooldown expiry
re-admits and success resets; backoff doubling across bench rounds;
mixed chain with an inline-expanded alias element failing over through the
expanded targets; permanent-policy default (fail-fast on auth) and
`AdvanceOnPermanent` override; `TransientRetries` disabled/custom; retry
loop stops early when the tracker benches mid-request; exhaustion error
lists skipped-while-benched targets; custom classifier override;
chain-of-one gets identical semantics; HTTP 529 fails over. Implementation needed no
changes — Phase 1's executor held up.
**Next:** Phase 3 — OpenAI/Anthropic/Ollama/foreman REST clients + media
pipeline.
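
The behaviours this matrix exercises (bench after repeated failures, cooldown expiry re-admitting a target, backoff doubling across bench rounds, reset on success) can be illustrated with a small clock-injected tracker. This is a sketch under assumptions: `Tracker`, `FakeClock`, and their methods are hypothetical names and do not reproduce the real `health/` API.

```go
package main

import (
	"sync"
	"time"
)

// FakeClock is a manually advanced clock, so tests need no sleeps.
// It is not safe for concurrent use; that is enough for a sketch.
type FakeClock struct{ t time.Time }

func (c *FakeClock) Now() time.Time          { return c.t }
func (c *FakeClock) Advance(d time.Duration) { c.t = c.t.Add(d) }

type entry struct {
	failures     int // consecutive failures since last success or bench
	rounds       int // bench rounds since the last success
	benchedUntil time.Time
}

// Tracker benches a target after threshold consecutive failures.
// Each further bench round doubles the cooldown, capped at max.
type Tracker struct {
	mu        sync.Mutex
	now       func() time.Time
	threshold int
	base, max time.Duration
	state     map[string]*entry
}

func NewTracker(now func() time.Time, threshold int, base, max time.Duration) *Tracker {
	return &Tracker{now: now, threshold: threshold, base: base, max: max,
		state: make(map[string]*entry)}
}

// Available reports whether target is outside its cooldown window.
func (t *Tracker) Available(target string) bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	e := t.state[target]
	return e == nil || !t.now().Before(e.benchedUntil)
}

// Failure records one failure and benches the target at the threshold.
func (t *Tracker) Failure(target string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	e := t.state[target]
	if e == nil {
		e = &entry{}
		t.state[target] = e
	}
	e.failures++
	if e.failures < t.threshold {
		return
	}
	cooldown := t.base
	for i := 0; i < e.rounds && cooldown < t.max; i++ {
		cooldown *= 2
	}
	if cooldown > t.max {
		cooldown = t.max
	}
	e.benchedUntil = t.now().Add(cooldown)
	e.rounds++
	e.failures = 0
}

// Success clears all state for target, resetting the backoff.
func (t *Tracker) Success(target string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.state, target)
}
```

With a threshold of 2, a base of one second, and a cap of four seconds, two failures bench the target for one second, the next two for two seconds, and a single success returns it to the one-second base. Driving this with `FakeClock.Advance` keeps every assertion deterministic.
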
## 2026-06-10 — Phase 1: foundations, ADRs, skeleton, docs
**Landed:**
- Module scaffold (Go 1.26), `.gitea/workflows/ci.yaml` (foreman-style
gates: build, vet, race tests, tidy-diff), `.env.example`.
- `llm/` canonical contract: Message/Part (sealed; text+image),
Request/Options, Response/Usage/FinishReason, Stream/StreamEvent,
Tool/Toolbox (panic-safe Execute), Capabilities (zero-value semantics),
Model/Provider interfaces, APIError + transient/permanent Classify.
- `health/`: clock-injected tracker — consecutive-failure threshold,
exponential capped cooldown, reset-on-success, thread-safe; full
deterministic test suite (fake clock).
- Root: Registry (providers/aliases/schemes/health), Parse with the binding
grammar (verbatim model ids, inline recursive alias expansion, cycle
detection, dedup), LLM_* env-DSN loading (go-llm-parity lazy fallback +
eager LoadEnv/New scan), chain executor implementing Model
(retry-on-transient, bench-on-repeat, skip-benched, 404-advance,
fail-fast-on-auth, joined exhaustion errors). Built-ins register as
resolvable stubs until their phases land.
- `provider/fake/`: scriptable provider (per-model outcome queues, request
recording, capabilities overrides, streaming) — the hermetic test rig.
- ADRs 0001–0008 + index; CLAUDE.md; honest README with pending-marked
matrix.
- Tests cover the two required cases: the trailing-`thinking` chain parse
and `LLM_M1=foreman://token@host` loading (plus DSN table, lazy fallback,
cycle detection, chain failover/backoff/exhaustion, toolbox execution,
error classification).
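
As an illustration of the `LLM_*` env-DSN loading listed above, the sketch below splits a value such as `foreman://token@host` with `net/url`. The `Binding` type and `parseEnvDSN` function are hypothetical, and the rule that the alias is the lowercased suffix of the variable name is an assumption inferred from the `LLM_FM` / `fm/qwen3:30b` example in this log; the real loader may differ.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Binding is a hypothetical result type for one LLM_* variable.
type Binding struct {
	Name   string // alias, e.g. "fm" for LLM_FM (assumed rule)
	Scheme string // provider scheme, e.g. "foreman"
	Token  string // bearer token taken from the userinfo part
	Host   string // host, with port if one was given
}

// parseEnvDSN turns one environment entry into a Binding.
// Errors name only the key, never the value, so a token embedded
// in the DSN cannot leak into logs.
func parseEnvDSN(key, value string) (Binding, error) {
	suffix, ok := strings.CutPrefix(key, "LLM_")
	if !ok || suffix == "" {
		return Binding{}, fmt.Errorf("%s: not an LLM_* variable", key)
	}
	u, err := url.Parse(value)
	if err != nil {
		return Binding{}, fmt.Errorf("%s: invalid DSN", key)
	}
	if u.Scheme == "" || u.Host == "" {
		return Binding{}, fmt.Errorf("%s: DSN needs a scheme and a host", key)
	}
	return Binding{
		Name:   strings.ToLower(suffix),
		Scheme: u.Scheme,
		Token:  u.User.Username(), // nil-safe: empty when no userinfo
		Host:   u.Host,
	}, nil
}
```

For `LLM_M1=foreman://token@host` this yields the alias `m1`, the scheme `foreman`, the token `token`, and the host `host`.
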
**Notes:** chain executor landed in Phase 1 (design was settled);
Phase 2 deepens its test matrix (cooldown re-admission via fake clock,
alias-in-chain failover, permanent-policy override) and wires anything the
tests flush out.
**Next:** Phase 2 — exhaustive health/chain test matrix.