steve daf07fd759 feat: add async /jobs surface, state webhooks, and artifact handling

Add the async job submission API, webhook state notifications, and
artifact serving endpoints on top of the Phase 3 queue infrastructure.

Key changes:
- POST /jobs: async job submission with 202 + job_id ULID; optional
  state_webhook_url for push notifications on state transitions
- GET /jobs/{id}: job status polling with result, error, and artifact
  metadata; artifacts <= 256KB inlined, larger ones by URL reference
- GET /jobs/{id}/artifacts/{name}: raw artifact data serving
- Webhook dispatcher: at-least-once delivery with exponential backoff
  (5 retries); optional HMAC-SHA256 signing (X-Foreman-Signature)
- ADR-0014: state_webhook_url only honored on POST /jobs, not sync
  /api/chat (caller already blocks for result)
- Comprehensive tests for /jobs lifecycle, webhook delivery, HMAC
  verification, artifact inline/URL threshold, and TTL pruning

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-23 18:30:18 -04:00

foreman — Architecture Decision Records

foreman is a small daemon that fronts one Ollama target. It turns a single Ollama instance into a queued, observable job endpoint: it polls the target's installed models, serializes jobs through the target (managing model swaps), assigns every job an ID, and reports progress and artifacts via webhooks. It also ships a Go client so the target is trivial to use from go-llm.

It is the deliberately pared-down successor to peon-overseer. One daemon, one worker, one queue. No distributed dispatch, no leases, no fair queueing.
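As a rough illustration of the webhook reporting described above, here is how a receiver might check the optional HMAC-SHA256 signature carried in the X-Foreman-Signature header. The header name and algorithm come from the commit message; the hex encoding of the signature and the helper names `sign` and `verify` are assumptions made for this sketch, not the daemon's confirmed wire format.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
)

// sign returns the hex-encoded HMAC-SHA256 of body under secret.
// ASSUMPTION: hex encoding is illustrative; the source names only the
// algorithm (HMAC-SHA256) and the header (X-Foreman-Signature).
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// verify reports whether headerValue, taken from the X-Foreman-Signature
// header, is a valid signature for the raw request body.
func verify(secret, body []byte, headerValue string) bool {
	got, err := hex.DecodeString(headerValue)
	if err != nil {
		return false // malformed signature
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	// hmac.Equal compares in constant time to avoid timing leaks.
	return hmac.Equal(got, mac.Sum(nil))
}
```

A receiver would read the raw request body before any JSON decoding and pass it to `verify` unchanged. Because delivery is at-least-once, a handler built on this should also tolerate seeing the same state update more than once.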

Index

ADR	Title	Status
0001	One daemon per Ollama target	Accepted
0002	Daemon placement and remote target configuration	Accepted
0003	API surface: native Ollama passthrough vs OpenAI-compat	Accepted
0004	Async job surface, job IDs, and queued execution	Accepted
0005	Webhook state-update protocol	Accepted
0006	Artifact handling and transport	Accepted
0007	Model inventory polling and discovery	Accepted
0008	Durable SQLite-backed queue	Accepted
0009	Single-worker serialization and drain-by-model scheduling	Accepted
0010	Authentication and security boundary	Accepted
0011	Go client library and go-llm integration	Accepted
0012	Streaming support	Accepted
0013	Two-slot residency and embedding bypass	Accepted
0014	No webhooks on synchronous /api/chat	Accepted

ADR-0003 was resolved in favor of native Ollama as the v1 surface: foreman is, on the wire, a private authenticated Ollama deployment, so go-llm integrates via a thin llm.Foreman(baseURL, token) constructor that delegates to the existing ollama provider (ADR-0011). OpenAI-compat /v1 is deferred.
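To make the "thin constructor" idea concrete, the sketch below shows one possible shape for it. Only the name `Foreman(baseURL, token)` and the fact that it delegates to the ollama provider come from the text; `OllamaProvider`, `NewOllama`, `NewRequest`, and the use of a bearer `Authorization` header are hypothetical stand-ins, since go-llm's real types are not shown here.

```go
package main

import (
	"io"
	"net/http"
	"strings"
)

// OllamaProvider is a HYPOTHETICAL stand-in for go-llm's existing ollama
// provider; the real type and its fields may differ.
type OllamaProvider struct {
	BaseURL string
	Header  http.Header
}

// NewOllama is a hypothetical constructor for the stand-in provider.
func NewOllama(baseURL string) *OllamaProvider {
	return &OllamaProvider{
		BaseURL: strings.TrimRight(baseURL, "/"),
		Header:  http.Header{},
	}
}

// Foreman mirrors the shape of llm.Foreman(baseURL, token): it builds an
// ordinary ollama provider and attaches credentials, nothing more.
// ASSUMPTION: the token is sent as a bearer Authorization header.
func Foreman(baseURL, token string) *OllamaProvider {
	p := NewOllama(baseURL)
	p.Header.Set("Authorization", "Bearer "+token)
	return p
}

// NewRequest builds a request against the provider's base URL with the
// provider's headers applied.
func (p *OllamaProvider) NewRequest(method, path string, body io.Reader) (*http.Request, error) {
	req, err := http.NewRequest(method, p.BaseURL+path, body)
	if err != nil {
		return nil, err
	}
	for k, vs := range p.Header {
		for _, v := range vs {
			req.Header.Add(k, v)
		}
	}
	return req, nil
}
```

The point of the design is that nothing foreman-specific lives in the client: because the daemon speaks native Ollama on the wire, the constructor only needs to add a base URL and a credential to the provider that already exists.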

These ADRs refine the API/integration sections of the project CLAUDE.md. The queue, single-worker, drain-by-model, and security guardrails carry forward unchanged.
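The single-worker, drain-by-model guardrail can be pictured with a small simulation. This is a sketch under a stated assumption: that "drain-by-model" means the worker keeps running queued jobs for the currently loaded model before swapping to another. The `Job` type, `nextJob`, and `drainOrder` are hypothetical names, and the sketch ignores two-slot residency and the embedding bypass from ADR-0013.

```go
package main

// Job is a HYPOTHETICAL queued job; only the fields needed for
// scheduling are shown.
type Job struct {
	ID    string
	Model string
}

// nextJob returns the index of the job to run next, or -1 if the queue is
// empty. It prefers the oldest job whose model is already loaded and
// otherwise falls back to the oldest job overall, which forces a swap.
func nextJob(queue []Job, loaded string) int {
	for i, j := range queue {
		if j.Model == loaded {
			return i
		}
	}
	if len(queue) == 0 {
		return -1
	}
	return 0
}

// drainOrder simulates one worker consuming the queue and returns the job
// IDs in execution order together with the number of model swaps.
func drainOrder(queue []Job, loaded string) (order []string, swaps int) {
	pending := append([]Job(nil), queue...) // copy so the caller's slice is untouched
	for {
		i := nextJob(pending, loaded)
		if i < 0 {
			return order, swaps
		}
		j := pending[i]
		if j.Model != loaded {
			loaded = j.Model
			swaps++
		}
		order = append(order, j.ID)
		pending = append(pending[:i], pending[i+1:]...)
	}
}
```

With jobs a and c on one model and b and d on another, and the first model already loaded, this ordering runs a, c, b, d with a single swap, where strict FIFO would swap three times. A naive version like this one can starve other models if jobs for the loaded model keep arriving; how the daemon itself bounds that is not covered here.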

Format

Each ADR: Status, Context, Decision, Consequences, and Alternatives where useful. One decision per file. Append new ADRs; supersede rather than rewrite.
