phase-2.md — Ollama target client, model poller, native passthrough
Re-ground: CLAUDE.md + ADR-0003 (API surface), 0007 (model polling), 0012
(streaming), 0002 (unreachable = transient). Plan, get approval, implement.
Objective
Make foreman a working transparent front for its Ollama target — enough that
go-llm can use the Mac as a target today, before any queue exists. (Phase 3
will move this through the queue; here it can proxy directly.)
Tasks
- internal/ollama: a small client to the target (FOREMAN_OLLAMA_URL) behind an interface, covering POST /api/chat (streaming and non-streaming), GET /api/tags, GET /api/ps. Attach the outbound bearer if configured. Wrap errors; classify connection failures distinctly (Phase 3 needs that signal).
- Model poller (goroutine): poll /api/tags every FOREMAN_POLL_INTERVAL (default 30s) into an in-memory inventory with a mutex; track last-poll time and a degraded flag. On target unreachable, retain last-known inventory and set degraded; do not clear it. Wire degraded state into /healthz.
- Passthrough handlers in internal/server: GET /api/tags and GET /api/ps served from the poller/target. POST /api/chat: validate the requested model against the inventory (one re-poll on miss, then 4xx if still absent); proxy to the target. Support streaming faithfully (stream the target's chunks straight through; set the right content type). For now this may call the target directly, with no queue.
- Tests: a stub HTTP server standing in for Ollama; assert tags/ps proxy, model validation rejects unknown models, streaming passes chunks through, and the poller flips degraded on target failure and recovers.
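
The poller and model-validation rules above could be sketched roughly as follows. This is an illustration under assumptions, not the intended implementation: `Inventory`, `fetchFunc`, `Validate`, and `errUnreachable` are hypothetical names, `fetchFunc` stands in for the client's GET /api/tags call, and the sketch assumes "last-poll time" means the last successful poll.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// errUnreachable is a hypothetical sentinel for a target that cannot be reached.
var errUnreachable = errors.New("target unreachable")

// fetchFunc is a hypothetical stand-in for the client's GET /api/tags call.
type fetchFunc func() ([]string, error)

// Inventory holds the last successfully polled model names plus poll health.
type Inventory struct {
	mu       sync.RWMutex
	models   map[string]struct{}
	lastPoll time.Time // assumption: time of the last successful poll
	degraded bool
}

// Refresh runs one poll. On failure it keeps the last-known models and
// only sets the degraded flag; on success it replaces them and clears it.
func (inv *Inventory) Refresh(fetch fetchFunc) error {
	names, err := fetch() // the network call happens outside the lock
	inv.mu.Lock()
	defer inv.mu.Unlock()
	if err != nil {
		inv.degraded = true
		return fmt.Errorf("poll /api/tags: %w", err)
	}
	models := make(map[string]struct{}, len(names))
	for _, n := range names {
		models[n] = struct{}{}
	}
	inv.models, inv.lastPoll, inv.degraded = models, time.Now(), false
	return nil
}

// Has reports whether the model is in the last-known inventory.
func (inv *Inventory) Has(model string) bool {
	inv.mu.RLock()
	defer inv.mu.RUnlock()
	_, ok := inv.models[model]
	return ok
}

// Degraded reports whether the most recent poll failed.
func (inv *Inventory) Degraded() bool {
	inv.mu.RLock()
	defer inv.mu.RUnlock()
	return inv.degraded
}

// Validate checks the inventory and re-polls exactly once on a miss.
// A false result is where a handler would answer with a 4xx.
func (inv *Inventory) Validate(model string, fetch fetchFunc) bool {
	if inv.Has(model) {
		return true
	}
	_ = inv.Refresh(fetch) // a failed re-poll leaves the inventory as it was
	return inv.Has(model)
}

// Run polls immediately, then once per interval, until ctx is cancelled.
func (inv *Inventory) Run(ctx context.Context, every time.Duration, fetch fetchFunc) {
	t := time.NewTicker(every)
	defer t.Stop()
	for {
		_ = inv.Refresh(fetch)
		select {
		case <-ctx.Done():
			return
		case <-t.C:
		}
	}
}

func main() {
	inv := &Inventory{}
	up := func() ([]string, error) { return []string{"llama3"}, nil }
	down := func() ([]string, error) { return nil, errUnreachable }

	_ = inv.Refresh(up)
	fmt.Println("after good poll:", inv.Has("llama3"), inv.Degraded())
	_ = inv.Refresh(down)
	fmt.Println("after failed poll:", inv.Has("llama3"), inv.Degraded())
	fmt.Println("unknown model valid:", inv.Validate("missing", up))
}
```

Doing the fetch before taking the lock keeps readers such as /healthz from blocking on a slow or hung target.
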
Definition of done
- go build/vet/test -race green.
- Against a real or stubbed Ollama: curl .../api/tags returns the inventory; a non-streaming and a streaming /api/chat both work end-to-end.
- Acceptance: from a scratch Go program, llm.Ollama(llm.WithBaseURL("http://<foreman>:8080")) (or llm.OllamaCloud(token, WithBaseURL(...)) if a token is set) completes a chat through foreman. Note this in progress.md.
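
The streaming end-to-end check can be tried without a real Ollama. The sketch below is one possible shape, with all names (`stubOllama`, `passthrough`, `runSmoke`) hypothetical: a stub target streams two newline-delimited JSON chunks, a minimal passthrough handler forwards them with a flush per read, and a client reads them back through the front. It skips model validation, bearer handling, and error classification.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// stubOllama is a hypothetical stand-in target that streams two NDJSON chunks.
func stubOllama() *httptest.Server {
	chunks := []string{`{"message":{"content":"hi"},"done":false}`, `{"done":true}`}
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/x-ndjson")
		for _, c := range chunks {
			fmt.Fprintln(w, c)
			if f, ok := w.(http.Flusher); ok {
				f.Flush()
			}
		}
	}))
}

// passthrough forwards the request body to the target's /api/chat and
// copies the response back, flushing after every read so chunks are not held.
func passthrough(targetURL string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		req, err := http.NewRequestWithContext(r.Context(), http.MethodPost, targetURL+"/api/chat", r.Body)
		if err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		req.Header.Set("Content-Type", r.Header.Get("Content-Type"))
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			http.Error(w, "target unreachable", http.StatusBadGateway)
			return
		}
		defer resp.Body.Close()

		// Mirror the target's content type and status rather than guessing them.
		w.Header().Set("Content-Type", resp.Header.Get("Content-Type"))
		w.WriteHeader(resp.StatusCode)
		flusher, _ := w.(http.Flusher)
		buf := make([]byte, 4096)
		for {
			n, rerr := resp.Body.Read(buf)
			if n > 0 {
				if _, werr := w.Write(buf[:n]); werr != nil {
					return // the client went away
				}
				if flusher != nil {
					flusher.Flush()
				}
			}
			if rerr != nil {
				return // io.EOF or an upstream error ends the stream
			}
		}
	})
}

// runSmoke wires stub and front together and returns the lines and
// content type a client sees when chatting through the front.
func runSmoke() ([]string, string, error) {
	target := stubOllama()
	defer target.Close()
	front := httptest.NewServer(passthrough(target.URL))
	defer front.Close()

	body := strings.NewReader(`{"model":"llama3","stream":true}`)
	resp, err := http.Post(front.URL+"/api/chat", "application/json", body)
	if err != nil {
		return nil, "", err
	}
	defer resp.Body.Close()
	var lines []string
	sc := bufio.NewScanner(resp.Body)
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	return lines, resp.Header.Get("Content-Type"), sc.Err()
}

func main() {
	lines, contentType, err := runSmoke()
	if err != nil {
		fmt.Println("smoke check failed:", err)
		return
	}
	fmt.Println(contentType, len(lines), "chunks")
}
```

Copying the upstream Content-Type header, instead of hard-coding one, lets the same handler serve streaming and non-streaming responses.
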
Wrap up: progress.md, commit on phase-2-passthrough, note what Phase 3 changes
(routing this through the queue).
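
Because Phase 3 depends on telling connection failures apart from other errors, one possible shape for that classification is sketched here. It assumes a connection failure surfaces as a `*net.OpError` somewhere in the error chain, which holds for a refused connection through `net/http` but does not cover every transient case (timeouts, for example). `ErrUnreachable`, `classify`, `IsUnreachable`, and the probe helpers are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"net/http/httptest"
)

// ErrUnreachable is a hypothetical sentinel marking connection-level failures.
var ErrUnreachable = errors.New("ollama target unreachable")

// classify wraps err with operation context and tags connection failures
// with ErrUnreachable so callers can test for them with errors.Is.
func classify(op string, err error) error {
	if err == nil {
		return nil
	}
	var opErr *net.OpError
	if errors.As(err, &opErr) {
		return fmt.Errorf("%s: %w: %v", op, ErrUnreachable, err)
	}
	return fmt.Errorf("%s: %w", op, err)
}

// IsUnreachable reports whether err was classified as a connection failure.
func IsUnreachable(err error) bool {
	return errors.Is(err, ErrUnreachable)
}

// getTags calls GET /api/tags on baseURL and reports only success or failure.
func getTags(baseURL string) error {
	resp, err := http.Get(baseURL + "/api/tags")
	if err != nil {
		return classify("get tags", err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		// The target answered, so this is not a connection failure.
		return fmt.Errorf("get tags: unexpected status %d", resp.StatusCode)
	}
	return nil
}

// probeClosedTarget calls a stub that has already shut down (connection refused).
func probeClosedTarget() error {
	srv := httptest.NewServer(http.NotFoundHandler())
	url := srv.URL
	srv.Close()
	return getTags(url)
}

// probeLiveTarget calls a stub that is reachable but answers 404.
func probeLiveTarget() error {
	srv := httptest.NewServer(http.NotFoundHandler())
	defer srv.Close()
	return getTags(srv.URL)
}

func main() {
	fmt.Println("closed target unreachable:", IsUnreachable(probeClosedTarget()))
	fmt.Println("live target unreachable:", IsUnreachable(probeLiveTarget()))
}
```
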