foreman/docs/adr
steve 0526bada90 docs: land prior ADR + prompt updates
Commit pre-existing uncommitted working-tree changes that predate the
license/public-readiness work — NOT authored in this session, just flushed so
they're not lost: ADR-0003/0005/0009/0012 edits, the new ADR-0013
(embeddings-bypass + two-slot residency, already referenced by CLAUDE.md), and
the phase-0..3 prompt revisions + prompts/README.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 20:33:39 -04:00

foreman — Architecture Decision Records

foreman is a small daemon that fronts one Ollama target. It turns a single Ollama instance into a queued, observable job endpoint: it polls the target's installed models, serializes jobs through the target (managing model swaps), assigns every job an ID, and reports progress and artifacts via webhooks. It also ships a Go client so the target is trivial to use from go-llm.

It is the deliberately pared-down successor to peon-overseer. One daemon, one worker, one queue. No distributed dispatch, no leases, no fair queueing.
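The "one worker, one queue" shape can be illustrated with a minimal in-memory sketch. This is not foreman's code: the `Job`, `Queue`, `Submit`, `Run`, and `processAll` names are hypothetical, the `job-N` ID format is invented for illustration, and the real queue is SQLite-backed (ADR-0008) rather than a Go channel.

```go
package main

import (
	"fmt"
	"sync"
)

// Job is a hypothetical unit of work; the field names are illustrative only.
type Job struct {
	ID    string
	Model string
}

// Queue hands out job IDs and feeds a single worker.
// A buffered channel stands in for the durable queue here.
type Queue struct {
	mu   sync.Mutex
	next int
	jobs chan Job
}

// NewQueue returns a queue that can hold up to capacity pending jobs.
func NewQueue(capacity int) *Queue {
	return &Queue{jobs: make(chan Job, capacity)}
}

// Submit assigns the next sequential ID and enqueues the job.
// The lock is held across the send so IDs and queue order agree.
func (q *Queue) Submit(model string) Job {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.next++
	j := Job{ID: fmt.Sprintf("job-%d", q.next), Model: model}
	q.jobs <- j
	return j
}

// Close signals that no more jobs will be submitted.
func (q *Queue) Close() { close(q.jobs) }

// Run is the single worker: it handles jobs strictly one at a time,
// in submission order, until the queue is closed and drained.
func (q *Queue) Run(handle func(Job)) {
	for j := range q.jobs {
		handle(j)
	}
}

// processAll submits one job per model name and returns the order
// in which the single worker handled them.
func processAll(models []string) []string {
	q := NewQueue(len(models))
	for _, m := range models {
		q.Submit(m)
	}
	q.Close()
	var order []string
	q.Run(func(j Job) { order = append(order, j.ID+":"+j.Model) })
	return order
}

func main() {
	// The model names are placeholders, not a statement about what is installed.
	for _, s := range processAll([]string{"llama3", "mistral"}) {
		fmt.Println(s)
	}
}
```

Because only one goroutine ever calls the handler, no two jobs reach the target at the same time, which is the property the single-worker design relies on.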

Index

| ADR  | Title                                                      | Status   |
|------|------------------------------------------------------------|----------|
| 0001 | One daemon per Ollama target                               | Accepted |
| 0002 | Daemon placement and remote target configuration           | Accepted |
| 0003 | API surface: native Ollama passthrough vs OpenAI-compat    | Accepted |
| 0004 | Async job surface, job IDs, and queued execution           | Accepted |
| 0005 | Webhook state-update protocol                              | Accepted |
| 0006 | Artifact handling and transport                            | Accepted |
| 0007 | Model inventory polling and discovery                      | Accepted |
| 0008 | Durable SQLite-backed queue                                | Accepted |
| 0009 | Single-worker serialization and drain-by-model scheduling  | Accepted |
| 0010 | Authentication and security boundary                       | Accepted |
| 0011 | Go client library and go-llm integration                   | Accepted |
| 0012 | Streaming support                                          | Accepted |
| 0013 | Two-slot residency and embedding bypass                    | Accepted |
| 0014 | No webhooks on synchronous /api/chat                       | Accepted |

ADR-0003 was resolved in favor of native Ollama as the v1 surface: foreman is, on the wire, a private authenticated Ollama deployment, so go-llm integrates via a thin llm.Foreman(baseURL, token) constructor that delegates to the existing ollama provider (ADR-0011). OpenAI-compat /v1 is deferred.

These ADRs refine the API/integration sections of the project CLAUDE.md. The queue, single-worker, drain-by-model, and security guardrails carry forward unchanged.
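One plausible reading of drain-by-model can be sketched as follows, assuming it means "finish the queued jobs for the currently loaded model before swapping to another". ADR-0009 is the authority on the real policy. The `Job`, `pickNext`, and `drainOrder` names are hypothetical, and the sketch ignores two-slot residency (ADR-0013) and any starvation concerns.

```go
package main

import "fmt"

// Job is a hypothetical queued job; only the fields the policy needs are shown.
type Job struct {
	ID    string
	Model string
}

// pickNext returns the index of the job to run next, or -1 if the queue is empty.
// It prefers the oldest job for the already-loaded model, and falls back to the
// oldest job overall, which forces a model swap.
func pickNext(queue []Job, loaded string) int {
	for i, j := range queue {
		if j.Model == loaded {
			return i
		}
	}
	if len(queue) > 0 {
		return 0
	}
	return -1
}

// drainOrder simulates a single worker consuming the whole queue and
// returns the job IDs in the order they would run.
func drainOrder(queue []Job, loaded string) []string {
	pending := append([]Job(nil), queue...) // copy so the caller's slice is untouched
	var order []string
	for {
		i := pickNext(pending, loaded)
		if i < 0 {
			return order
		}
		j := pending[i]
		loaded = j.Model // running the job leaves its model loaded
		order = append(order, j.ID)
		pending = append(pending[:i], pending[i+1:]...)
	}
}

func main() {
	queue := []Job{
		{ID: "1", Model: "a"},
		{ID: "2", Model: "b"},
		{ID: "3", Model: "a"},
		{ID: "4", Model: "b"},
	}
	fmt.Println(drainOrder(queue, "a")) // prints [1 3 2 4]
}
```

With strict FIFO the same queue would need three model swaps (a, b, a, b); draining by model brings that down to one, at the cost of letting job 3 overtake job 2.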

Format

Each ADR: Status, Context, Decision, Consequences, and Alternatives where useful. One decision per file. Append new ADRs; supersede rather than rewrite.