steve daf07fd759 feat: add async /jobs surface, state webhooks, and artifact handling
Add the async job submission API, webhook state notifications, and
artifact serving endpoints on top of the Phase 3 queue infrastructure.

Key changes:
- POST /jobs: async job submission with 202 + job_id ULID; optional
  state_webhook_url for push notifications on state transitions
- GET /jobs/{id}: job status polling with result, error, and artifact
  metadata; artifacts <= 256KB inlined, larger ones by URL reference
- GET /jobs/{id}/artifacts/{name}: raw artifact data serving
- Webhook dispatcher: at-least-once delivery with exponential backoff
  (5 retries); optional HMAC-SHA256 signing (X-Foreman-Signature)
- ADR-0014: state_webhook_url only honored on POST /jobs, not sync
  /api/chat (caller already blocks for result)
- Comprehensive tests for /jobs lifecycle, webhook delivery, HMAC
  verification, artifact inline/URL threshold, and TTL pruning

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 18:30:18 -04:00

foreman — Architecture Decision Records

foreman is a small daemon that fronts a single Ollama target. It turns that Ollama instance into a queued, observable job endpoint: it polls the target's installed models, serializes jobs through the target (managing model swaps), assigns every job an ID, and reports progress and artifacts via webhooks. It also ships a Go client, so the target is easy to use from go-llm.
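As an illustration of the receiving side of those webhooks, here is a minimal sketch of signature verification. The commit summary above names HMAC-SHA256 signing and the X-Foreman-Signature header; everything else is an assumption made for the example: the hex encoding of the signature, the helper names signBody and verifySignature, and the "state" field in the sample payload.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// signBody returns the hex-encoded HMAC-SHA256 of body under secret.
// ASSUMPTION: hex encoding. The source names only the algorithm and the
// X-Foreman-Signature header, not the wire encoding.
func signBody(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// verifySignature reports whether header is a valid signature for body.
// hmac.Equal compares in constant time to avoid leaking timing information.
func verifySignature(secret, body []byte, header string) bool {
	got, err := hex.DecodeString(header)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("shared-secret")
	// Hypothetical payload: only job_id is named in the source.
	body := []byte(`{"job_id":"example","state":"running"}`)

	sig := signBody(secret, body) // value a sender would place in X-Foreman-Signature
	fmt.Println(verifySignature(secret, body, sig))               // prints true
	fmt.Println(verifySignature(secret, []byte("tampered"), sig)) // prints false
}
```

Because delivery is at-least-once, a receiver built along these lines should also tolerate seeing the same notification more than once.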

It is the deliberately pared-down successor to peon-overseer. One daemon, one worker, one queue. No distributed dispatch, no leases, no fair queueing.

Index

ADR   Title                                                      Status
0001  One daemon per Ollama target                               Accepted
0002  Daemon placement and remote target configuration           Accepted
0003  API surface: native Ollama passthrough vs OpenAI-compat    Accepted
0004  Async job surface, job IDs, and queued execution           Accepted
0005  Webhook state-update protocol                              Accepted
0006  Artifact handling and transport                            Accepted
0007  Model inventory polling and discovery                      Accepted
0008  Durable SQLite-backed queue                                Accepted
0009  Single-worker serialization and drain-by-model scheduling  Accepted
0010  Authentication and security boundary                       Accepted
0011  Go client library and go-llm integration                   Accepted
0012  Streaming support                                          Accepted
0013  Two-slot residency and embedding bypass                    Accepted
0014  No webhooks on synchronous /api/chat                       Accepted

ADR-0003 was resolved in favor of native Ollama as the v1 surface. On the wire, foreman is a private, authenticated Ollama deployment, so go-llm integrates via a thin llm.Foreman(baseURL, token) constructor that delegates to the existing ollama provider (ADR-0011). The OpenAI-compatible /v1 surface is deferred.

These ADRs refine the API/integration sections of the project CLAUDE.md. The queue, single-worker, drain-by-model, and security guardrails carry forward unchanged.

Format

Each ADR contains Status, Context, Decision, and Consequences sections, plus Alternatives where useful. One decision per file. Append new ADRs; supersede existing ones rather than rewriting them.