steve daf07fd759 feat: add async /jobs surface, state webhooks, and artifact handling

Add the async job submission API, webhook state notifications, and
artifact serving endpoints on top of the Phase 3 queue infrastructure.

Key changes:
- POST /jobs: async job submission with 202 + job_id ULID; optional
  state_webhook_url for push notifications on state transitions
- GET /jobs/{id}: job status polling with result, error, and artifact
  metadata; artifacts <= 256KB inlined, larger ones by URL reference
- GET /jobs/{id}/artifacts/{name}: raw artifact data serving
- Webhook dispatcher: at-least-once delivery with exponential backoff
  (5 retries); optional HMAC-SHA256 signing (X-Foreman-Signature)
- ADR-0014: state_webhook_url only honored on POST /jobs, not sync
  /api/chat (caller already blocks for result)
- Comprehensive tests for /jobs lifecycle, webhook delivery, HMAC
  verification, artifact inline/URL threshold, and TTL pruning

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-23 18:30:18 -04:00

Name	Last commit	Date
.gitea/workflows	feat: scaffold project with config, store, health endpoint, CI, and Dockerfile	2026-05-23 17:58:36 -04:00
cmd/foreman	feat: add durable queue, single worker, and drain-by-model scheduling	2026-05-23 18:29:32 -04:00
docs/adr	feat: add async /jobs surface, state webhooks, and artifact handling	2026-05-23 18:30:18 -04:00
internal	feat: add async /jobs surface, state webhooks, and artifact handling	2026-05-23 18:30:18 -04:00
prompts	add initial prompts	2026-05-23 16:51:19 -04:00
.env.example	feat: add async /jobs surface, state webhooks, and artifact handling	2026-05-23 18:30:18 -04:00
.gitignore	feat: scaffold project with config, store, health endpoint, CI, and Dockerfile	2026-05-23 17:58:36 -04:00
CLAUDE.md	initial commit	2026-05-23 16:41:20 -04:00
Dockerfile	feat: scaffold project with config, store, health endpoint, CI, and Dockerfile	2026-05-23 17:58:36 -04:00
go.mod	feat: add durable queue, single worker, and drain-by-model scheduling	2026-05-23 18:29:32 -04:00
go.sum	feat: add durable queue, single worker, and drain-by-model scheduling	2026-05-23 18:29:32 -04:00
progress.md	feat: add async /jobs surface, state webhooks, and artifact handling	2026-05-23 18:30:18 -04:00
README.md	feat: scaffold project with config, store, health endpoint, CI, and Dockerfile	2026-05-23 17:58:36 -04:00

README.md

foreman

A small, always-on Go daemon that fronts one Ollama target. It turns a single Ollama instance into a queued, observable job endpoint: it polls the target's installed models, serializes work through the target (managing model swaps), assigns every job an ID, and reports progress via webhooks.

On the wire it speaks native Ollama, so it doubles as a drop-in go-llm target.

Quickstart

# Set the required Ollama target URL
export FOREMAN_OLLAMA_URL=http://mac.tail:11434

# Run directly
go run ./cmd/foreman serve

# Or build and run
go build -o foreman ./cmd/foreman
./foreman serve

Docker

docker build -t foreman .
docker run -e FOREMAN_OLLAMA_URL=http://mac.tail:11434 -p 8080:8080 foreman

Configuration

All configuration is via environment variables, namespaced under FOREMAN_*. See .env.example for the full list.

Variable	Default	Description
`FOREMAN_ADDR`	`:8080`	Listen address
`FOREMAN_OLLAMA_URL`	(required)	Ollama target base URL
`FOREMAN_OLLAMA_TOKEN`	(empty)	Bearer token sent to the target
`FOREMAN_TOKEN`	(empty)	Bearer token callers must present
`FOREMAN_EMBED_MODEL`	(empty)	Always-resident embedder model
`FOREMAN_DB_PATH`	`foreman.db`	SQLite database path
`FOREMAN_POLL_INTERVAL`	`30s`	Target model poll interval
`FOREMAN_WEBHOOK_SECRET`	(empty)	HMAC key for webhook signing
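
When `FOREMAN_WEBHOOK_SECRET` is set, webhook deliveries are signed with HMAC-SHA256 and the signature travels in the `X-Foreman-Signature` header. A receiver could check it roughly as follows. This is a minimal sketch, not foreman's own code: it assumes the header carries the hex-encoded HMAC of the raw request body, which the README does not confirm, and the payload fields and function names are hypothetical.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign returns the hex-encoded HMAC-SHA256 of body under secret.
// ASSUMPTION: hex encoding over the raw body; check the real wire format.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// verifySignature reports whether header (the X-Foreman-Signature value)
// matches the HMAC of body. hmac.Equal compares in constant time.
func verifySignature(secret, body []byte, header string) bool {
	got, err := hex.DecodeString(header)
	if err != nil {
		return false // not valid hex, so it cannot be a valid signature
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("example-secret")                            // value of FOREMAN_WEBHOOK_SECRET
	body := []byte(`{"job_id":"hypothetical","state":"done"}`) // hypothetical payload
	header := sign(secret, body)

	fmt.Println(verifySignature(secret, body, header))               // true
	fmt.Println(verifySignature(secret, []byte("tampered"), header)) // false
}
```

Because delivery is at-least-once, a receiver should also expect to see the same notification more than once and handle it idempotently.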

Health check

curl http://localhost:8080/healthz
# {"status":"ok","degraded":false}
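
For a programmatic probe, the response above can be decoded in Go. This is a sketch: the two field names come from the sample output, while the `ready` rule (status `ok` and not degraded) is an assumption about how a caller might interpret them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// health mirrors the two fields shown in the sample /healthz response.
type health struct {
	Status   string `json:"status"`
	Degraded bool   `json:"degraded"`
}

// parseHealth decodes a /healthz response body.
func parseHealth(raw []byte) (health, error) {
	var h health
	err := json.Unmarshal(raw, &h)
	return h, err
}

// ready is a hypothetical readiness rule: "ok" and not degraded.
func ready(h health) bool {
	return h.Status == "ok" && !h.Degraded
}

func main() {
	h, err := parseHealth([]byte(`{"status":"ok","degraded":false}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(ready(h)) // true
}
```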

Architecture

See docs/adr/ for design decisions. Key points:

- One daemon per Ollama target (ADR-0001)
- SQLite-backed durable job queue in WAL mode (ADR-0008)
- Single worker loop with drain-by-model scheduling (ADR-0009)
- Native Ollama passthrough + async /jobs surface (ADR-0003, ADR-0004)
- Embeddings bypass the queue entirely (ADR-0013)

Description

🪓 Small always-on Go daemon that fronts one Ollama target — turns it into a queued, observable job endpoint (model-swap serialization, job IDs, progress webhooks). Speaks native Ollama on the wire, so it's a drop-in target for any Ollama client.

Readme MIT 244 KiB