phase-3.md — Durable queue, single worker, drain-by-model
Re-ground: CLAUDE.md + ADR-0009 (single worker / drain-by-model), 0008 (queue),
0004 (lifecycle/retry). Plan, get approval, implement.
Objective
Route execution through the SQLite queue with exactly one worker and drain-by-model scheduling. The synchronous passthrough from Phase 2 now enqueues and blocks on completion instead of calling the target directly.
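One way to picture "enqueue and block on completion" is a small registry of per-job channels: the handler registers a waiter before the job becomes visible, and the worker delivers the terminal result. This is only a sketch under assumptions. `Waiters`, `Result`, `Wait`, and `runDemo` are hypothetical names, and the in-memory map stands in for whatever the real handler uses alongside the SQLite `jobs` table.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// Result is a hypothetical terminal outcome for a job ("done" or "failed").
type Result struct {
	State string
	Body  string
}

// Waiters maps job ids to the channel a blocked handler is reading from.
type Waiters struct {
	mu sync.Mutex
	ch map[string]chan Result
}

func NewWaiters() *Waiters { return &Waiters{ch: make(map[string]chan Result)} }

// Register should be called before the job row is visible to the worker,
// so a fast worker cannot finish before the handler starts waiting.
func (w *Waiters) Register(id string) <-chan Result {
	c := make(chan Result, 1) // buffered: Complete never blocks on a gone handler
	w.mu.Lock()
	w.ch[id] = c
	w.mu.Unlock()
	return c
}

// Complete delivers the terminal result and reports whether a handler was waiting.
func (w *Waiters) Complete(id string, r Result) bool {
	w.mu.Lock()
	c, ok := w.ch[id]
	delete(w.ch, id)
	w.mu.Unlock()
	if ok {
		c <- r
	}
	return ok
}

// Wait blocks until the job is terminal or the request context ends.
// The persisted job is untouched if the client goes away.
func Wait(ctx context.Context, c <-chan Result) (Result, error) {
	select {
	case r := <-c:
		return r, nil
	case <-ctx.Done():
		return Result{}, ctx.Err()
	}
}

// runDemo plays both roles: a handler that waits and a worker that completes.
func runDemo() (Result, error) {
	w := NewWaiters()
	c := w.Register("job-1")
	go w.Complete("job-1", Result{State: "done", Body: "hello"})
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return Wait(ctx, c)
}

func main() {
	r, err := runDemo()
	fmt.Println(r.State, r.Body, err)
}
```

The buffered channel is the design choice worth noting: the worker never stalls on a handler whose client has disconnected, which matters when there is exactly one worker.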
Tasks
- Promote chat requests to persisted jobs: every `/api/chat` call creates a `jobs` row (state `queued`), and the handler blocks until that job reaches a terminal state, then writes the response. Assign a ULID as the job id now (used everywhere in Phase 4).
- `internal/worker`: a single worker loop (concurrency 1). Select the next job with `ORDER BY (model != :current_resident), created_at` so all jobs for the currently-resident model (from `/api/ps`) drain before a swap. Transition `queued` → `loading` → `working` → `done`. Pin residency with Ollama `keep_alive`.
- Retry semantics (ADR-0004): a connection failure to the target re-queues the job with backoff and increments `attempt`; exceeding a bounded max moves it to `failed` with the last error stored. Never auto-fail on a single transient error. Jobs survive process restart (resume `queued`/in-flight on boot).
- Tests (against the stub Ollama): jobs persist and execute serially; a sequence mixing two models drains by model (assert the swap happens once, not per job); a flapping target causes retry-then-success without data loss; restart mid-queue resumes cleanly.
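The selection rule and the retry rule above can be sketched in memory. This is an illustration under assumptions, not the ADR text: `pickNext` mirrors `ORDER BY (model != :current_resident), created_at` with a limit of one, `drain` replays the single-worker loop and counts swaps, and `afterFailure` shows one possible backoff. The `maxAttempts` value, the exponential doubling, and the model names are all hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Job holds only the columns this sketch needs; the real schema may differ.
type Job struct {
	ID        string
	Model     string
	CreatedAt int64 // any monotonically comparable timestamp
	Attempt   int
}

// less orders resident-model jobs first, then oldest first.
// In SQL the boolean (model != :current_resident) sorts false before true.
func less(a, b Job, resident string) bool {
	aOther, bOther := a.Model != resident, b.Model != resident
	if aOther != bOther {
		return !aOther
	}
	return a.CreatedAt < b.CreatedAt
}

// pickNext returns the index of the job the worker would take next.
func pickNext(queued []Job, resident string) (int, bool) {
	best := -1
	for i := range queued {
		if best == -1 || less(queued[i], queued[best], resident) {
			best = i
		}
	}
	return best, best != -1
}

// drain simulates the worker loop and reports execution order and model swaps.
func drain(queued []Job, resident string) (order []string, swaps int) {
	q := append([]Job(nil), queued...) // work on a copy
	for {
		i, ok := pickNext(q, resident)
		if !ok {
			return order, swaps
		}
		if q[i].Model != resident {
			resident = q[i].Model
			swaps++
		}
		order = append(order, q[i].ID)
		q = append(q[:i], q[i+1:]...)
	}
}

const maxAttempts = 5 // hypothetical bound; the real value belongs to ADR-0004

// afterFailure decides what happens after a connection failure to the target.
func afterFailure(j Job, base time.Duration) (state string, attempt int, delay time.Duration) {
	attempt = j.Attempt + 1
	if attempt > maxAttempts {
		return "failed", attempt, 0
	}
	return "queued", attempt, base << uint(attempt-1) // base, 2*base, 4*base, ...
}

func main() {
	jobs := []Job{
		{ID: "a1", Model: "model-a", CreatedAt: 1},
		{ID: "b1", Model: "model-b", CreatedAt: 2},
		{ID: "a2", Model: "model-a", CreatedAt: 3},
		{ID: "b2", Model: "model-b", CreatedAt: 4},
	}
	order, swaps := drain(jobs, "model-a")
	fmt.Println(order, "swaps:", swaps)
	fmt.Println(afterFailure(Job{Attempt: 1}, time.Second))
}
```

With `model-a` resident, the interleaved queue runs both `model-a` jobs before either `model-b` job, so the swap count is one. That is the property the mixed-model test is meant to assert.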
Definition of done
- `go build/vet/test -race` green.
- The Phase 2 acceptance (go-llm completes a chat) still passes, now served through the queue.
- Demonstrable: enqueue several jobs across two models and observe drain-by-model ordering in logs; kill and restart foreman mid-queue and watch it resume.
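For the restart half of that demonstration, boot-time recovery could be as simple as returning interrupted jobs to `queued`. The sketch below assumes that jobs left in `loading` or `working` are safe to re-run and that `attempt` is left unchanged on restart; both are assumptions to settle against ADR-0004. `requeueInFlight` and the `State` constants are hypothetical names.

```go
package main

import "fmt"

// State mirrors the lifecycle named in the tasks: queued, loading, working, done, failed.
type State string

const (
	Queued  State = "queued"
	Loading State = "loading"
	Working State = "working"
	Done    State = "done"
	Failed  State = "failed"
)

type Job struct {
	ID    string
	State State
}

// requeueInFlight resets jobs that were interrupted mid-run back to queued
// and returns how many were reset. Against SQLite this would likely be a
// single UPDATE over the in-flight states, run once before the worker starts.
func requeueInFlight(jobs []Job) int {
	n := 0
	for i := range jobs {
		switch jobs[i].State {
		case Loading, Working:
			jobs[i].State = Queued
			n++
		}
	}
	return n
}

func main() {
	jobs := []Job{
		{ID: "j1", State: Done},
		{ID: "j2", State: Working},
		{ID: "j3", State: Loading},
		{ID: "j4", State: Queued},
	}
	n := requeueInFlight(jobs)
	fmt.Println("requeued:", n, jobs)
}
```

Terminal jobs (`done`, `failed`) are deliberately left alone, so a restart never replays work that already produced a response.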
Wrap up: progress.md, commit on phase-3-queue. M0 is effectively complete here
— note that. Phase 4 adds the async surface on top of this same engine.