add initial prompts

2026-05-23 16:51:19 -04:00
parent 8fde024281
commit d5702f7a75
7 changed files with 383 additions and 0 deletions
@@ -0,0 +1,42 @@
# phase-2.md — Ollama target client, model poller, native passthrough
Re-ground: `CLAUDE.md` + ADR-0003 (API surface), 0007 (model polling), 0012
(streaming), 0002 (unreachable = transient). Plan, get approval, implement.
## Objective
Make foreman a working transparent front for its Ollama target — enough that
`go-llm` can use the Mac as a target *today*, before any queue exists. (Phase 3
will move this through the queue; here it can proxy directly.)
## Tasks
- `internal/ollama`: a small client to the target (`FOREMAN_OLLAMA_URL`) behind
an interface, covering `POST /api/chat` (streaming and non-streaming),
`GET /api/tags`, `GET /api/ps`. Attach the outbound bearer if configured. Wrap
errors; classify connection failures distinctly (Phase 3 needs that signal).
- Model poller (goroutine): poll `/api/tags` every `FOREMAN_POLL_INTERVAL`
(default 30s) into an in-memory inventory with a mutex; track last-poll time
and a degraded flag. On target unreachable, retain last-known inventory and set
degraded — do not clear it. Wire degraded state into `/healthz`.
- Passthrough handlers in `internal/server`:
  - `GET /api/tags` and `GET /api/ps` served from the poller/target.
  - `POST /api/chat`: validate the requested model against the inventory (one
    re-poll on miss, then 4xx if still absent); proxy to the target. Support
    streaming faithfully (stream the target's chunks straight through; set the
    right content type). For now this may call the target directly, with no queue.
- Tests: a stub HTTP server standing in for Ollama; assert tags/ps proxy,
model validation rejects unknown models, streaming passes chunks through, and
the poller flips degraded on target failure and recovers.
## Definition of done
- `go build`, `go vet`, and `go test -race` are green.
- Against a real or stubbed Ollama: `curl .../api/tags` returns the inventory;
a non-streaming and a streaming `/api/chat` both work end-to-end.
- Acceptance: from a scratch Go program, `llm.Ollama(llm.WithBaseURL("http://<foreman>:8080"))`
(or `llm.OllamaCloud(token, llm.WithBaseURL(...))` if a token is set) completes a
chat through foreman. Note this in `progress.md`.
Wrap up: `progress.md`, commit on `phase-2-passthrough`, note what Phase 3 changes
(routing this through the queue).