add initial prompts

2026-05-23 16:51:19 -04:00
parent 8fde024281
commit d5702f7a75
7 changed files with 383 additions and 0 deletions
@@ -0,0 +1,40 @@
+# phase-3.md — Durable queue, single worker, drain-by-model
+
+Re-ground: `CLAUDE.md` + ADR-0009 (single worker / drain-by-model), ADR-0008
+(queue), ADR-0004 (lifecycle/retry). Plan, get approval, implement.
+
+## Objective
+
+Route execution through the SQLite queue with exactly one worker and
+drain-by-model scheduling. The synchronous passthrough from Phase 2 now enqueues
+and blocks on completion instead of calling the target directly.
+
+## Tasks
+
+- Promote chat requests to persisted jobs: every `/api/chat` call creates a `jobs`
+  row (state `queued`), and the handler blocks until that job reaches a terminal
+  state, then writes the response. Assign a **ULID** as the job id now (used
+  everywhere in Phase 4).
+- `internal/worker`: a single worker loop (concurrency 1). Select the next job
+  with `ORDER BY (model != :current_resident), created_at` so all jobs for the
+  currently-resident model (from `/api/ps`) drain before a swap. Transition
+  `queued→loading→working→done`. Pin residency with Ollama `keep_alive`.
+- Retry semantics (ADR-0004): a connection failure to the target re-queues the
+  job with backoff and increments `attempt`; exceeding a bounded max moves it to
+  `failed` with the last error stored. Never auto-fail on a single transient
+  error. Jobs survive a process restart: on boot, resume any `queued` or in-flight jobs.
+- Tests: against the stub Ollama — jobs persist and execute serially; a sequence
+  mixing two models drains by model (assert the swap happens once, not per job);
+  a flapping target causes retry-then-success without data loss; restart mid-queue
+  resumes cleanly.
+
+## Definition of done
+
+- `go build`, `go vet`, and `go test -race` all green.
+- The Phase 2 acceptance (go-llm completes a chat) still passes, now served
+  through the queue.
+- Demonstrable: enqueue several jobs across two models and observe drain-by-model
+  ordering in logs; kill and restart foreman mid-queue and watch it resume.
+
+Wrap up: `progress.md`, commit on `phase-3-queue`. M0 is effectively complete here
+— note that. Phase 4 adds the async surface on top of this same engine.