4 Commits

Author SHA1 Message Date
steve 67c3ebe067 feat(ollama): add automatic retry with exponential backoff for transient HTTP errors
CI / Build, Test & Lint (push) Successful in 10m50s
Ollama Cloud returns HTTP 503 when the model is temporarily overloaded,
429 on rate limit, and 502 on upstream failures. These are transient
conditions that resolve on retry. Previously they bubbled up as hard
errors, forcing callers to implement their own retry logic.

The retry is implemented at the HTTP transport level in doChatRequest,
so both Complete and Stream benefit transparently. Strategy: up to 3
retries with exponential backoff (1s, 2s, 4s); the Retry-After header is
respected for 429; context cancellation is checked between retries.
Non-transient errors (400, 401, 403, 404, 500) are never retried.
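
The retry strategy described above can be sketched roughly as follows. This is
a minimal illustration, not the code in doChatRequest: the names isTransient,
backoffDelay, retryDelay and doWithRetry, and the send callback, are
hypothetical. It also assumes Retry-After carries whole seconds (the HTTP-date
form is not handled).

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"strconv"
	"time"
)

const (
	maxRetries = 3           // retries after the first attempt
	baseDelay  = time.Second // first backoff step; doubles on each retry
)

// isTransient reports whether status is one of the codes the commit
// message names as retryable: 429, 502, 503.
func isTransient(status int) bool {
	switch status {
	case http.StatusTooManyRequests, http.StatusBadGateway, http.StatusServiceUnavailable:
		return true
	}
	return false
}

// backoffDelay returns the wait before retry number attempt (0-based):
// 1s, 2s, 4s.
func backoffDelay(attempt int) time.Duration {
	return baseDelay << uint(attempt)
}

// retryDelay prefers a Retry-After value (assumed to be whole seconds) on a
// 429 and otherwise falls back to the exponential schedule.
func retryDelay(status int, retryAfter string, attempt int) time.Duration {
	if status == http.StatusTooManyRequests {
		if secs, err := strconv.Atoi(retryAfter); err == nil && secs >= 0 {
			return time.Duration(secs) * time.Second
		}
	}
	return backoffDelay(attempt)
}

// doWithRetry calls send until it yields a non-transient status or the retry
// budget is spent. send is a hypothetical callback that performs one attempt
// and must build a fresh request body each time.
func doWithRetry(ctx context.Context, send func() (*http.Response, error)) (*http.Response, error) {
	for attempt := 0; ; attempt++ {
		resp, err := send()
		if err != nil {
			return nil, err
		}
		if !isTransient(resp.StatusCode) || attempt == maxRetries {
			return resp, nil
		}
		delay := retryDelay(resp.StatusCode, resp.Header.Get("Retry-After"), attempt)
		resp.Body.Close() // discard the failed response before waiting
		select {
		case <-ctx.Done():
			return nil, ctx.Err()
		case <-time.After(delay):
		}
	}
}

func main() {
	for attempt := 0; attempt < maxRetries; attempt++ {
		fmt.Println("retry", attempt+1, "after", backoffDelay(attempt))
	}
	// A 400 is not transient, so this returns after a single attempt.
	resp, err := doWithRetry(context.Background(), func() (*http.Response, error) {
		return &http.Response{StatusCode: http.StatusBadRequest, Header: http.Header{}, Body: http.NoBody}, nil
	})
	fmt.Println(resp.StatusCode, err)
}
```

In this sketch, exhausting the retry budget returns the last transient
response to the caller instead of synthesizing an error. That is one possible
design choice, not necessarily the one taken in the commit.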

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 11:58:25 -04:00
steve f70c7c0842 feat(v2/ollama): implement native Stream() with NDJSON parsing
Reads Ollama's NDJSON stream (one JSON object per line) and emits
provider.StreamEvent values for text, thinking, tool-call start/delta/end,
and a final Done event carrying assembled Response and Usage. Uses
bufio.Scanner with a 4 MiB max-line buffer so multi-KB tool-call deltas
parse cleanly, and accepts tool-call arguments delivered either as
escaped string fragments (delta-style) or a complete JSON object
(one-shot).
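
A rough sketch of the NDJSON reading loop described above, under stated
assumptions: the streamChunk shape is a trimmed-down guess at the wire format,
assemble and argumentText are hypothetical names, and the sketch tracks only
one tool call and returns plain strings instead of emitting
provider.StreamEvent values.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

const maxLineBytes = 4 << 20 // 4 MiB cap per NDJSON line

// streamFunction, streamToolCall and streamChunk are assumed, simplified
// shapes for one line of the stream.
type streamFunction struct {
	Name      string          `json:"name"`
	Arguments json.RawMessage `json:"arguments"`
}

type streamToolCall struct {
	Function streamFunction `json:"function"`
}

type streamChunk struct {
	Message struct {
		Content   string           `json:"content"`
		ToolCalls []streamToolCall `json:"tool_calls"`
	} `json:"message"`
	Done bool `json:"done"`
}

// argumentText accepts both argument encodings: a JSON string is unescaped
// and treated as a fragment; anything else is kept as raw JSON text.
func argumentText(raw json.RawMessage) string {
	var fragment string
	if err := json.Unmarshal(raw, &fragment); err == nil {
		return fragment
	}
	return string(raw)
}

// assemble reads one JSON object per line and concatenates the text content
// and the tool-call argument text until a chunk with done=true arrives.
func assemble(r io.Reader) (string, string, error) {
	sc := bufio.NewScanner(r)
	// The default Scanner limit is 64 KiB per line; raise the cap to 4 MiB.
	sc.Buffer(make([]byte, 0, 64*1024), maxLineBytes)

	var text, args strings.Builder
	for sc.Scan() {
		line := sc.Bytes()
		if len(line) == 0 {
			continue // tolerate blank lines between objects
		}
		var c streamChunk
		if err := json.Unmarshal(line, &c); err != nil {
			return "", "", fmt.Errorf("decode NDJSON line: %w", err)
		}
		text.WriteString(c.Message.Content)
		for _, tc := range c.Message.ToolCalls {
			args.WriteString(argumentText(tc.Function.Arguments))
		}
		if c.Done {
			break
		}
	}
	if err := sc.Err(); err != nil {
		return "", "", err
	}
	return text.String(), args.String(), nil
}

// demoStream returns a made-up stream used only for the usage example.
func demoStream() io.Reader {
	return strings.NewReader(strings.Join([]string{
		`{"message":{"content":"Hel"}}`,
		`{"message":{"content":"lo"}}`,
		`{"message":{"tool_calls":[{"function":{"name":"weather","arguments":"{\"city\":"}}]}}`,
		`{"message":{"tool_calls":[{"function":{"name":"weather","arguments":"\"Paris\"}"}}]}}`,
		`{"done":true}`,
	}, "\n"))
}

func main() {
	text, args, err := assemble(demoStream())
	fmt.Printf("text=%q args=%s err=%v\n", text, args, err)
}
```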

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-01 18:29:04 +00:00
steve 583f8724b2 feat(v2/ollama): implement native Complete() with tools, vision, thinking
Non-streaming /api/chat support including:
- Vision via images: []base64
- Tool calls on assistant + tool-role response messages
- think field accepting string reasoning levels (or "true"/"false")
- Authorization header when apiKey is non-empty (cloud mode)
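
The request-side features listed above might look roughly like this. It is a
sketch under assumptions, not the provider's actual code: the struct and
function names are hypothetical, mapping "true"/"false" onto JSON booleans is
one reading of the think bullet, and the Bearer scheme for the Authorization
header is assumed.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"net/http"
)

// chatMessage and chatRequest are assumed, trimmed-down /api/chat wire types.
type chatMessage struct {
	Role    string   `json:"role"`
	Content string   `json:"content"`
	Images  []string `json:"images,omitempty"` // base64-encoded image bytes
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
	Stream   bool          `json:"stream"`
	Think    any           `json:"think,omitempty"`
}

// thinkValue maps a configuration string to the wire value: "true"/"false"
// become booleans, any other non-empty value is sent as a reasoning level.
func thinkValue(s string) any {
	switch s {
	case "":
		return nil // omitted from the JSON body
	case "true":
		return true
	case "false":
		return false
	default:
		return s
	}
}

// buildBody encodes a single user message with an optional image.
func buildBody(model, prompt string, image []byte, think string) ([]byte, error) {
	msg := chatMessage{Role: "user", Content: prompt}
	if len(image) > 0 {
		msg.Images = []string{base64.StdEncoding.EncodeToString(image)}
	}
	return json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{msg},
		Think:    thinkValue(think),
	})
}

// newChatRequest adds the Authorization header only when apiKey is set.
func newChatRequest(baseURL, apiKey string, body []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, baseURL+"/api/chat", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if apiKey != "" {
		req.Header.Set("Authorization", "Bearer "+apiKey) // Bearer scheme is an assumption
	}
	return req, nil
}

func main() {
	// Model name, prompt, image bytes and key are placeholders.
	body, err := buildBody("example-model", "Describe this image.", []byte("fake image bytes"), "high")
	if err != nil {
		panic(err)
	}
	req, err := newChatRequest("http://localhost:11434", "example-key", body)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL, req.Header.Get("Authorization") != "")
	fmt.Println(string(body))
}
```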

Tool-call arguments are sent on the wire as JSON objects and surfaced
as JSON-string Arguments on provider.ToolCall. When Ollama omits a
tool-call ID, a synthetic one (tc_<index>) is assigned so that the
assistant tool_calls message and the tool-role reply sent back on the
next turn remain correlated.
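
The argument and ID handling can be illustrated with a small sketch. The types
here are stand-ins: ToolCall only approximates provider.ToolCall, the wire
field names are assumed, and the synthetic index is assumed to be 0-based.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// wireFunction and wireToolCall are assumed shapes for a tool call on the wire.
type wireFunction struct {
	Name      string          `json:"name"`
	Arguments json.RawMessage `json:"arguments"` // a JSON object on the wire
}

type wireToolCall struct {
	ID       string       `json:"id,omitempty"`
	Function wireFunction `json:"function"`
}

// ToolCall is a stand-in for provider.ToolCall; Arguments holds JSON text.
type ToolCall struct {
	ID        string
	Name      string
	Arguments string
}

// fromWire surfaces arguments as a JSON string and fills in a synthetic
// tc_<index> ID when the wire value carries none.
func fromWire(calls []wireToolCall) []ToolCall {
	out := make([]ToolCall, 0, len(calls))
	for i, c := range calls {
		id := c.ID
		if id == "" {
			id = fmt.Sprintf("tc_%d", i)
		}
		out = append(out, ToolCall{
			ID:        id,
			Name:      c.Function.Name,
			Arguments: string(c.Function.Arguments),
		})
	}
	return out
}

// toWire converts back for the follow-up request, embedding the argument
// string as a raw JSON object instead of a quoted string.
func toWire(tc ToolCall) (wireToolCall, error) {
	if !json.Valid([]byte(tc.Arguments)) {
		return wireToolCall{}, fmt.Errorf("tool call %s: arguments are not valid JSON", tc.ID)
	}
	return wireToolCall{
		ID:       tc.ID,
		Function: wireFunction{Name: tc.Name, Arguments: json.RawMessage(tc.Arguments)},
	}, nil
}

func main() {
	// A made-up response fragment in which no tool call carries an ID.
	raw := `[{"function":{"name":"weather","arguments":{"city":"Paris"}}}]`
	var wire []wireToolCall
	if err := json.Unmarshal([]byte(raw), &wire); err != nil {
		panic(err)
	}
	for _, tc := range fromWire(wire) {
		fmt.Println(tc.ID, tc.Name, tc.Arguments)
	}
}
```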

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-01 18:24:02 +00:00
steve 0e358148eb feat(v2/ollama): scaffold native /api/chat provider
Adds wire types and a Provider struct that will replace the
openaicompat-based Ollama shim with a native /api/chat implementation.
Complete and Stream methods are stubs; subsequent commits implement them.

Adjusts the existing ollama.go to drop the type alias on
openaicompat.Provider (renaming the legacy shim to a temporary internal
helper) so the new native Provider type does not collide. Public New()
still returns the openaicompat-backed provider until Task 4 swaps it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-01 18:22:11 +00:00