foreman — Architecture Decision Records

foreman is a small daemon that fronts one Ollama target. It turns a single Ollama instance into a queued, observable job endpoint: it polls the target's installed models, serializes jobs through the target (managing model swaps), assigns every job an ID, and reports progress and artifacts via webhooks. It also ships a Go client so the target is trivial to use from go-llm.

It is the deliberately pared-down successor to peon-overseer. One daemon, one worker, one queue. No distributed dispatch, no leases, no fair queueing.

Index

ADR	Title	Status
0001	One daemon per Ollama target	Accepted
0002	Daemon placement and remote target configuration	Accepted
0003	API surface: native Ollama passthrough vs OpenAI-compat	Accepted
0004	Async job surface, job IDs, and queued execution	Accepted
0005	Webhook state-update protocol	Accepted
0006	Artifact handling and transport	Accepted
0007	Model inventory polling and discovery	Accepted
0008	Durable SQLite-backed queue	Accepted
0009	Single-worker serialization and drain-by-model scheduling	Accepted
0010	Authentication and security boundary	Accepted
0011	Go client library and go-llm integration	Accepted
0012	Streaming support	Accepted

ADR-0003 was resolved in favor of native Ollama as the v1 surface. On the wire, foreman is a private authenticated Ollama deployment, so go-llm integrates via a thin llm.Foreman(baseURL, token) constructor that delegates to the existing ollama provider (ADR-0011). OpenAI-compat /v1 is deferred.

These ADRs refine the API/integration sections of the project CLAUDE.md. The queue, single-worker, drain-by-model, and security guardrails carry forward unchanged.

Format

Each ADR contains Status, Context, Decision, and Consequences sections, plus Alternatives where useful. One decision per file. Append new ADRs; supersede rather than rewrite.