add initial prompts
@@ -0,0 +1,42 @@
# phase-2.md: Ollama target client, model poller, native passthrough

Re-ground: `CLAUDE.md` + ADR-0003 (API surface), ADR-0007 (model polling),
ADR-0012 (streaming), ADR-0002 (unreachable = transient). Plan, get approval,
implement.

## Objective

Make foreman a working transparent front for its Ollama target, enough that
`go-llm` can use the Mac as a target *today*, before any queue exists. (Phase 3
will move this through the queue; here it can proxy directly.)

## Tasks

- `internal/ollama`: a small client to the target (`FOREMAN_OLLAMA_URL`) behind
  an interface, covering `POST /api/chat` (streaming and non-streaming),
  `GET /api/tags`, and `GET /api/ps`. Attach the outbound bearer token if one is
  configured. Wrap errors; classify connection failures distinctly (Phase 3
  needs that signal).
- Model poller (goroutine): poll `/api/tags` every `FOREMAN_POLL_INTERVAL`
  (default 30s) into a mutex-guarded in-memory inventory; track last-poll time
  and a degraded flag. When the target is unreachable, retain the last-known
  inventory and set degraded; do not clear the inventory. Wire degraded state
  into `/healthz`.
- Passthrough handlers in `internal/server`:
  - `GET /api/tags` and `GET /api/ps` served from the poller/target.
  - `POST /api/chat`: validate the requested model against the inventory (one
    re-poll on miss, then 4xx if still absent); proxy to the target. Support
    streaming faithfully (stream the target's chunks straight through; set the
    right content type). For now this may call the target directly, with no
    queue.
- Tests: a stub HTTP server standing in for Ollama; assert that tags/ps proxy,
  model validation rejects unknown models, streaming passes chunks through, and
  the poller flips degraded on target failure and recovers.
## Definition of done

- `go build`, `go vet`, and `go test -race` are green.
- Against a real or stubbed Ollama: `curl .../api/tags` returns the inventory;
  a non-streaming and a streaming `/api/chat` both work end-to-end.
- Acceptance: from a scratch Go program, `llm.Ollama(llm.WithBaseURL("http://<foreman>:8080"))`
  (or `llm.OllamaCloud(token, llm.WithBaseURL(...))` if a token is set) completes a
  chat through foreman. Note this in `progress.md`.

Wrap up: update `progress.md`, commit on `phase-2-passthrough`, and note what
Phase 3 changes (routing this through the queue).