Commit pre-existing uncommitted working-tree changes that predate the license/public-readiness work — NOT authored in this session, just flushed so they're not lost: ADR-0003/0005/0009/0012 edits, the new ADR-0013 (embeddings-bypass + two-slot residency, already referenced by CLAUDE.md), and the phase-0..3 prompt revisions + prompts/README.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
foreman — Architecture Decision Records
foreman is a small daemon that fronts one Ollama target. It turns a single
Ollama instance into a queued, observable job endpoint: it polls the target's
installed models, serializes jobs through the target (managing model swaps),
assigns every job an ID, and reports progress + artifacts via webhooks. It also
ships a Go client so the target is trivial to use from go-llm.
It is the deliberately pared-down successor to peon-overseer. One daemon, one
worker, one queue. No distributed dispatch, no leases, no fair queueing.
Index
| ADR | Title | Status |
|---|---|---|
| 0001 | One daemon per Ollama target | Accepted |
| 0002 | Daemon placement and remote target configuration | Accepted |
| 0003 | API surface: native Ollama passthrough vs OpenAI-compat | Accepted |
| 0004 | Async job surface, job IDs, and queued execution | Accepted |
| 0005 | Webhook state-update protocol | Accepted |
| 0006 | Artifact handling and transport | Accepted |
| 0007 | Model inventory polling and discovery | Accepted |
| 0008 | Durable SQLite-backed queue | Accepted |
| 0009 | Single-worker serialization and drain-by-model scheduling | Accepted |
| 0010 | Authentication and security boundary | Accepted |
| 0011 | Go client library and go-llm integration | Accepted |
| 0012 | Streaming support | Accepted |
| 0013 | Two-slot residency and embedding bypass | Accepted |
| 0014 | No webhooks on synchronous /api/chat | Accepted |
ADR-0003 was resolved in favor of native Ollama as the v1 surface: foreman is,
on the wire, a private authenticated Ollama deployment, so go-llm integrates via
a thin llm.Foreman(baseURL, token) constructor that delegates to the existing
ollama provider (ADR-0011). OpenAI-compat /v1 is deferred.
These ADRs refine the API/integration sections of the project CLAUDE.md. The
queue, single-worker, drain-by-model, and security guardrails carry forward
unchanged.
Format
Each ADR: Status, Context, Decision, Consequences, and Alternatives where useful. One decision per file. Append new ADRs; supersede rather than rewrite.