majordomo/progress.md
steve 1ca607906d feat: Google (Gemini) provider on the official Gen AI SDK
Phase 4: provider/google on google.golang.org/genai v1.59.0 — lazy cached
client, FunctionResponse tool loop, raw-JSON-schema tools and structured
output, ThinkingLevel reasoning mapping, iter.Pull2 streaming, hermetic
httptest suite via HTTPOptions.BaseURL. Registry wires google + gemini
schemes to the real client; stub machinery deleted (all built-ins real).
ADR-0011; README matrix + CLAUDE.md synced.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 13:04:28 +02:00

# progress
## 2026-06-10 — Phase 4: Google provider (official genai SDK)
**Landed:** `provider/google` on google.golang.org/genai v1.59.0 (ADR-0011):
lazy cached client (construction never fails; missing key = synthetic 401
so chains fail over), assistant→model role mapping, FunctionResponse tool
results with output/error payloads, ParametersJsonSchema raw-schema tools,
ResponseJsonSchema structured output, ToolChoice→FunctionCallingConfig,
ReasoningEffort→ThinkingConfig.ThinkingLevel, usage includes thought
tokens, iter.Pull2-adapted streaming, genai.APIError→llm.APIError mapping.
Hermetic tests via HTTPOptions.BaseURL + httptest (SSE fixtures for
streaming). Registry: google + gemini schemes wired to the real provider;
the last stub machinery deleted — all six built-ins are now real clients.
README matrix: Google row fully ✅.
**Next:** Phase 5 — Agent run loop, Toolbox ergonomics, Generate[T].
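
A minimal sketch of the iter.Pull2 adaptation mentioned above, assuming the SDK side exposes a push-style `iter.Seq2[chunk, error]` and the consumer wants a pull-style `Next`/`Close`. The names `chunkSeq`, `pullStream`, `collect`, and `errBoom` are hypothetical and are not taken from `provider/google`.

```go
package main

import (
	"errors"
	"fmt"
	"iter"
)

// errBoom is a hypothetical terminal stream error used by the demo.
var errBoom = errors.New("boom")

// chunkSeq is a hypothetical push-style stream: it yields text chunks and
// then, if tail is non-nil, one final error.
func chunkSeq(chunks []string, tail error) iter.Seq2[string, error] {
	return func(yield func(string, error) bool) {
		for _, c := range chunks {
			if !yield(c, nil) {
				return // consumer stopped early
			}
		}
		if tail != nil {
			yield("", tail)
		}
	}
}

// pullStream wraps a push iterator behind a pull-style API.
type pullStream struct {
	next func() (string, error, bool)
	stop func()
}

func newPullStream(seq iter.Seq2[string, error]) *pullStream {
	next, stop := iter.Pull2(seq)
	return &pullStream{next: next, stop: stop}
}

// Next returns the next chunk; ok is false once the sequence is exhausted.
func (s *pullStream) Next() (chunk string, ok bool, err error) {
	chunk, err, ok = s.next()
	return chunk, ok, err
}

// Close releases the underlying iterator; calling it twice is harmless.
func (s *pullStream) Close() { s.stop() }

// collect drains a stream and returns the text seen before any error.
func collect(seq iter.Seq2[string, error]) (string, error) {
	s := newPullStream(seq)
	defer s.Close()
	var out string
	for {
		chunk, ok, err := s.Next()
		if !ok {
			return out, nil
		}
		if err != nil {
			return out, err
		}
		out += chunk
	}
}

func main() {
	text, err := collect(chunkSeq([]string{"hel", "lo"}, nil))
	fmt.Println(text, err)
	text, err = collect(chunkSeq([]string{"x"}, errBoom))
	fmt.Println(text, err)
}
```

Calling `Close` (the `stop` function) matters when the consumer abandons the stream early: it lets the producer goroutine behind `iter.Pull2` exit.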
## 2026-06-10 — Phase 3: REST providers (OpenAI, Anthropic, Ollama×3) + media
**Landed:**
- `provider/openai`: Chat Completions client for OpenAI and every
OpenAI-compatible endpoint (tools with string-arguments mapping, strict
SSE streaming incl. by-index tool-call assembly and the empty-choices
usage chunk, response_format json_schema, max_completion_tokens with a
WithLegacyMaxTokens compat option, reasoning_effort).
- `provider/anthropic`: Messages API client (anthropic-version 2023-06-01,
required-max_tokens defaulting, tool_use/tool_result blocks with native
is_error, GA structured output via output_config.format, full SSE event
parser with input_json_delta buffering, 529-overloaded classified
transient, usage sums cache tokens).
- `provider/ollama`: ONE native /api/chat client serving ollama (local,
OLLAMA_HOST normalization), ollama-cloud (https://ollama.com + bearer
OLLAMA_API_KEY), and foreman (base URL + bearer; tolerates its
buffered-single-object "streaming"). Object tool arguments, tool_name
results, format-schema structured output, think-level mapping, NDJSON
streaming with 16MB lines.
- `media/`: normalization pipeline per ADR-0009 (magic-byte sniffing,
box-filter downscale, transcode preference ladder, byte-budget quality
ladder, webp passthrough-or-reject, copy-on-write, everything-unfittable
wraps ErrUnsupported).
- Chain executor now normalizes media PER TARGET before each attempt and
advances penalty-free past targets that can't take the request (proven:
text-only head + vision fallback; per-target downscale assertions).
- Registry: real providers + scheme factories wired for openai, anthropic,
ollama, ollama-cloud, foreman (google still stubbed, Phase 4);
WithHTTPClient registry option; required env-foreman TLS chat round-trip
test (LLM_FM=foreman://token@host → Parse("fm/qwen3:30b") → bearer
arrives, chat answers).
- ADR-0009 (multimodal), ADR-0010 (tools/structured mapping); README
matrix flipped to ✅ for the four landed provider families; ~70 new
hermetic tests across the three provider packages + media.
- Run note: openai/anthropic/media were built by three parallel
subagents against the frozen llm contract; ollama/foreman, chain wiring,
and registry integration done in the main line. All gates green.
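
The by-index tool-call assembly noted for `provider/openai` can be sketched as follows. This assumes the streamed fragments carry an index, with id and name arriving once and the argument string arriving in pieces; the types `toolCallDelta`, `toolCall`, and `assembler` are hypothetical, not the package's real ones.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// toolCallDelta is a hypothetical streamed fragment. Index identifies the
// call; ID and Name are usually set once, Args arrives in pieces.
type toolCallDelta struct {
	Index          int
	ID, Name, Args string
}

// toolCall is a fully assembled call.
type toolCall struct{ ID, Name, Args string }

type partialCall struct {
	id, name string
	args     strings.Builder
}

// assembler accumulates fragments keyed by their stream index.
type assembler struct{ calls map[int]*partialCall }

func newAssembler() *assembler {
	return &assembler{calls: map[int]*partialCall{}}
}

// Add merges one fragment into the call with the same index.
func (a *assembler) Add(d toolCallDelta) {
	p, ok := a.calls[d.Index]
	if !ok {
		p = &partialCall{}
		a.calls[d.Index] = p
	}
	if d.ID != "" {
		p.id = d.ID
	}
	if d.Name != "" {
		p.name = d.Name
	}
	p.args.WriteString(d.Args)
}

// Finish returns the assembled calls ordered by index.
func (a *assembler) Finish() []toolCall {
	idx := make([]int, 0, len(a.calls))
	for i := range a.calls {
		idx = append(idx, i)
	}
	sort.Ints(idx)
	out := make([]toolCall, 0, len(idx))
	for _, i := range idx {
		p := a.calls[i]
		out = append(out, toolCall{ID: p.id, Name: p.name, Args: p.args.String()})
	}
	return out
}

func main() {
	a := newAssembler()
	for _, d := range []toolCallDelta{
		{Index: 0, ID: "call_a", Name: "get_weather"},
		{Index: 1, ID: "call_b", Name: "get_time"},
		{Index: 0, Args: `{"city":`},
		{Index: 1, Args: `{}`},
		{Index: 0, Args: `"Oslo"}`},
	} {
		a.Add(d)
	}
	for _, c := range a.Finish() {
		fmt.Printf("%s %s %s\n", c.ID, c.Name, c.Args)
	}
}
```

Keying on the index rather than the id is the point of the sketch: later fragments of a call may omit the id entirely, and fragments of different calls may interleave.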
**Next:** Phase 4 — Google provider on google.golang.org/genai.
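
For the NDJSON streaming with 16MB lines listed under `provider/ollama`, a rough sketch using `bufio.Scanner` with a raised token limit. The `chatChunk` struct models only the `message.content` and `done` fields of an /api/chat stream object; `readNDJSON` and `concat` are hypothetical helpers, not the provider's code.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// maxLine is the per-line cap; bufio.Scanner's default (64 KiB) is too small
// for large streamed objects.
const maxLine = 16 << 20

// chatChunk models only the fields this sketch needs.
type chatChunk struct {
	Message struct {
		Content string `json:"content"`
	} `json:"message"`
	Done bool `json:"done"`
}

// readNDJSON decodes one JSON object per line and passes it to fn,
// stopping at the first object with done=true.
func readNDJSON(r io.Reader, fn func(chatChunk) error) error {
	sc := bufio.NewScanner(r)
	sc.Buffer(make([]byte, 0, 64*1024), maxLine)
	for sc.Scan() {
		line := bytes.TrimSpace(sc.Bytes())
		if len(line) == 0 {
			continue // tolerate blank lines between objects
		}
		var c chatChunk
		if err := json.Unmarshal(line, &c); err != nil {
			return fmt.Errorf("ndjson: decode line: %w", err)
		}
		if err := fn(c); err != nil {
			return err
		}
		if c.Done {
			return nil
		}
	}
	return sc.Err() // bufio.ErrTooLong if a line exceeds maxLine
}

// concat joins the content of every chunk in the stream.
func concat(stream string) (string, error) {
	var sb strings.Builder
	err := readNDJSON(strings.NewReader(stream), func(c chatChunk) error {
		sb.WriteString(c.Message.Content)
		return nil
	})
	return sb.String(), err
}

func main() {
	stream := `{"message":{"content":"Hel"},"done":false}` + "\n" +
		`{"message":{"content":"lo"},"done":true}` + "\n"
	text, err := concat(stream)
	fmt.Println(text, err)
}
```

The same loop also accepts a body that is one buffered object with `done` set and no trailing newline, which is the shape described for foreman above.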
## 2026-06-10 — Phase 2: health + failover chain, proven
**Landed:** the full deterministic failover test matrix over the fake
provider + fake clock (no sleeps, no network): single-transient recovery
via same-target retry; repeated transients bench + advance; cooldown expiry
re-admits and success resets; backoff doubling across bench rounds;
mixed chain with an inline-expanded alias element failing over through the
expanded targets; permanent-policy default (fail-fast on auth) and
`AdvanceOnPermanent` override; `TransientRetries` disabled/custom; retry
loop stops early when the tracker benches mid-request; exhaustion error
lists skipped-while-benched targets; custom classifier override; chain-of-
one gets identical semantics; HTTP 529 fails over. Implementation needed no
changes — Phase 1's executor held up.
**Next:** Phase 3 — OpenAI/Anthropic/Ollama/foreman REST clients + media
pipeline.
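
The cooldown behaviour exercised by this matrix (bench after repeated failures, doubling per bench round, re-admission on expiry, reset on success) can be illustrated with a small clock-injected tracker. This is a sketch under assumed semantics; `tracker`, `fakeClock`, and the demo constants are hypothetical and simpler than the real `health/` package.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Demo parameters (hypothetical values).
const (
	demoBase = time.Second
	demoMax  = 4 * time.Second
)

// fakeClock is a manually advanced clock, so tests need no sleeps.
type fakeClock struct{ now time.Time }

func (c *fakeClock) Now() time.Time          { return c.now }
func (c *fakeClock) Advance(d time.Duration) { c.now = c.now.Add(d) }

// tracker benches a target after threshold consecutive failures, doubling
// the cooldown each bench round up to max.
type tracker struct {
	mu           sync.Mutex
	now          func() time.Time
	threshold    int
	base, max    time.Duration
	failures     int
	rounds       int
	benchedUntil time.Time
}

func newTracker(now func() time.Time, threshold int, base, max time.Duration) *tracker {
	return &tracker{now: now, threshold: threshold, base: base, max: max}
}

// Available reports whether the cooldown (if any) has expired.
func (t *tracker) Available() bool {
	t.mu.Lock()
	defer t.mu.Unlock()
	return !t.now().Before(t.benchedUntil)
}

// Failure records one failure and benches the target at the threshold.
func (t *tracker) Failure() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.failures++
	if t.failures < t.threshold {
		return
	}
	cooldown := t.base
	for i := 0; i < t.rounds && cooldown < t.max; i++ {
		cooldown *= 2
	}
	if cooldown > t.max {
		cooldown = t.max
	}
	t.benchedUntil = t.now().Add(cooldown)
	t.rounds++
	t.failures = 0
}

// Success clears failure count, bench rounds, and any active cooldown.
func (t *tracker) Success() {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.failures, t.rounds = 0, 0
	t.benchedUntil = time.Time{}
}

func main() {
	clk := &fakeClock{}
	tr := newTracker(clk.Now, 2, demoBase, demoMax)
	tr.Failure()
	tr.Failure()
	fmt.Println("benched:", !tr.Available())
	clk.Advance(demoBase)
	fmt.Println("re-admitted:", tr.Available())
}
```

Injecting `now` as a function is what keeps such tests deterministic: time only moves when the test calls `Advance`.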
## 2026-06-10 — Phase 1: foundations, ADRs, skeleton, docs
**Landed:**
- Module scaffold (Go 1.26), `.gitea/workflows/ci.yaml` (foreman-style
gates: build, vet, race tests, tidy-diff), `.env.example`.
- `llm/` canonical contract: Message/Part (sealed; text+image),
Request/Options, Response/Usage/FinishReason, Stream/StreamEvent,
Tool/Toolbox (panic-safe Execute), Capabilities (zero-value semantics),
Model/Provider interfaces, APIError + transient/permanent Classify.
- `health/`: clock-injected tracker — consecutive-failure threshold,
exponential capped cooldown, reset-on-success, thread-safe; full
deterministic test suite (fake clock).
- Root: Registry (providers/aliases/schemes/health), Parse with the binding
grammar (verbatim model ids, inline recursive alias expansion, cycle
detection, dedup), LLM_* env-DSN loading (go-llm-parity lazy fallback +
eager LoadEnv/New scan), chain executor implementing Model
(retry-on-transient, bench-on-repeat, skip-benched, 404-advance,
fail-fast-on-auth, joined exhaustion errors). Built-ins register as
resolvable stubs until their phases land.
- `provider/fake/`: scriptable provider (per-model outcome queues, request
recording, capabilities overrides, streaming) — the hermetic test rig.
- ADRs 0001-0008 + index; CLAUDE.md; honest README with pending-marked
matrix.
- Tests cover the two required cases: the trailing-`thinking` chain parse
and `LLM_M1=foreman://token@host` loading (plus DSN table, lazy fallback,
cycle detection, chain failover/backoff/exhaustion, toolbox execution,
error classification).
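
As an illustration of the `LLM_*` env-DSN shape used in the required test case, the sketch below splits one `NAME=dsn` pair with `net/url`. It assumes the alias is the lowercased suffix after `LLM_` and that the userinfo part carries the bearer token; `parseEnvDSN` and the `dsn` struct are hypothetical, and the real loader may differ.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// dsn is a hypothetical parsed form of one LLM_* environment entry.
type dsn struct {
	Alias  string // assumed: lowercased name after the LLM_ prefix
	Scheme string // e.g. "foreman"
	Token  string // assumed: userinfo carries the bearer token
	Host   string
}

// parseEnvDSN parses a single "LLM_NAME=scheme://token@host" pair.
func parseEnvDSN(kv string) (dsn, error) {
	name, raw, ok := strings.Cut(kv, "=")
	if !ok || !strings.HasPrefix(name, "LLM_") || name == "LLM_" {
		return dsn{}, fmt.Errorf("not an LLM_* entry: %q", kv)
	}
	u, err := url.Parse(raw)
	if err != nil {
		return dsn{}, fmt.Errorf("parse %s: %w", name, err)
	}
	if u.Scheme == "" || u.Host == "" {
		return dsn{}, fmt.Errorf("%s: DSN needs scheme and host", name)
	}
	return dsn{
		Alias:  strings.ToLower(strings.TrimPrefix(name, "LLM_")),
		Scheme: u.Scheme,
		Token:  u.User.Username(), // nil-safe: empty when no userinfo
		Host:   u.Host,
	}, nil
}

func main() {
	d, err := parseEnvDSN("LLM_M1=foreman://token@host")
	fmt.Printf("%+v %v\n", d, err)
}
```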
**Notes:** chain executor landed in Phase 1 (design was settled);
Phase 2 deepens its test matrix (cooldown re-admission via fake clock,
alias-in-chain failover, permanent-policy override) and wires anything the
tests flush out.
**Next:** Phase 2 — exhaustive health/chain test matrix.
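
The inline recursive alias expansion with cycle detection and dedup from the Phase 1 binding grammar might look roughly like this. The sketch assumes a chain is already split into elements and that an element is an alias exactly when it appears in the alias map; `expand` and the model ids in the demo are hypothetical.

```go
package main

import "fmt"

// expand flattens a chain, replacing alias names with their targets in
// place, dropping duplicate targets, and rejecting alias cycles.
func expand(aliases map[string][]string, chain []string) ([]string, error) {
	var out []string
	seen := map[string]bool{}   // targets already emitted (dedup)
	onPath := map[string]bool{} // aliases currently being expanded

	var walk func(elems []string) error
	walk = func(elems []string) error {
		for _, e := range elems {
			sub, isAlias := aliases[e]
			if !isAlias {
				if !seen[e] {
					seen[e] = true
					out = append(out, e)
				}
				continue
			}
			if onPath[e] {
				return fmt.Errorf("alias cycle at %q", e)
			}
			onPath[e] = true
			err := walk(sub)
			delete(onPath, e)
			if err != nil {
				return err
			}
		}
		return nil
	}

	if err := walk(chain); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	aliases := map[string][]string{
		"fast":  {"ollama/qwen3:30b", "smart"},
		"smart": {"anthropic/model-a", "ollama/qwen3:30b"},
	}
	got, err := expand(aliases, []string{"fast", "openai/model-b"})
	fmt.Println(got, err)
}
```

Tracking only the aliases on the current expansion path, rather than every alias ever visited, lets the same alias appear twice in a chain without being mistaken for a cycle.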