4759a06d1bfc64c7d0bbee5bb23a9af1fe16b776
Adds client/ -- a public Go package providing a synchronous facade over
foreman's async POST /jobs API (Level 1 integration per ADR-0011).
Two delivery modes:
- Webhook receiver (preferred): ephemeral HTTP server on random port,
pushes results immediately, verifies HMAC when configured
- Polling fallback: polls GET /jobs/{id} at configurable interval
Also includes Tags() and Embed() helpers, bearer auth support, and
comprehensive integration tests against the real foreman HTTP handlers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
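The HMAC check performed by the webhook receiver could look roughly like the sketch below. The source only says the receiver "verifies HMAC when configured"; the hash function (SHA-256), the hex encoding, and the names `sign` and `verify` are assumptions made for illustration, and the header that carries the signature is not specified here.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign returns the hex-encoded HMAC-SHA256 of body under secret.
// Assumption: foreman's real algorithm and encoding may differ.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// verify reports whether sigHex is a valid signature for body.
// hmac.Equal compares in constant time to avoid timing leaks.
func verify(secret, body []byte, sigHex string) bool {
	got, err := hex.DecodeString(sigHex)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("example-secret") // hypothetical value
	body := []byte(`{"id":"job-1"}`)   // hypothetical payload
	sig := sign(secret, body)
	fmt.Println(verify(secret, body, sig))
	fmt.Println(verify(secret, []byte("tampered"), sig))
}
```

A receiver would compute the signature over the raw request body before decoding it, since re-encoding JSON can change the bytes.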
foreman
A small, always-on Go daemon that fronts one Ollama target. It turns a single Ollama instance into a queued, observable job endpoint: it polls the target's installed models, serializes work through the target (managing model swaps), assigns every job an ID, and reports progress via webhooks.
On the wire it speaks native Ollama, so it doubles as a drop-in go-llm
target.
Quickstart
# Set the required Ollama target URL
export FOREMAN_OLLAMA_URL=http://mac.tail:11434
# Run directly
go run ./cmd/foreman serve
# Or build and run
go build -o foreman ./cmd/foreman
./foreman serve
Docker
docker build -t foreman .
docker run -e FOREMAN_OLLAMA_URL=http://mac.tail:11434 -p 8080:8080 foreman
Configuration
All configuration is via environment variables, namespaced under FOREMAN_*.
See .env.example for the full list.
| Variable | Default | Description |
|---|---|---|
| FOREMAN_ADDR | :8080 | Listen address |
| FOREMAN_OLLAMA_URL | (required) | Ollama target base URL |
| FOREMAN_OLLAMA_TOKEN | (empty) | Bearer token sent to the target |
| FOREMAN_TOKEN | (empty) | Bearer token callers must present |
| FOREMAN_EMBED_MODEL | (empty) | Always-resident embedder model |
| FOREMAN_DB_PATH | foreman.db | SQLite database path |
| FOREMAN_POLL_INTERVAL | 30s | Target model poll interval |
| FOREMAN_WEBHOOK_SECRET | (empty) | HMAC key for webhook signing |
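
As an illustration of how these variables and defaults fit together, here is a minimal sketch of an environment loader. The variable names and defaults come from the table; the `Config` struct, the `loadConfig` function, and the use of `time.ParseDuration` for the poll interval are assumptions, not foreman's actual code.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"time"
)

// Config mirrors the FOREMAN_* variables (hypothetical struct).
type Config struct {
	Addr          string
	OllamaURL     string
	OllamaToken   string
	Token         string
	EmbedModel    string
	DBPath        string
	PollInterval  time.Duration
	WebhookSecret string
}

// loadConfig reads settings through getenv (pass os.Getenv in production)
// and applies the documented defaults.
func loadConfig(getenv func(string) string) (Config, error) {
	orDefault := func(key, def string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return def
	}
	cfg := Config{
		Addr:          orDefault("FOREMAN_ADDR", ":8080"),
		OllamaURL:     getenv("FOREMAN_OLLAMA_URL"),
		OllamaToken:   getenv("FOREMAN_OLLAMA_TOKEN"),
		Token:         getenv("FOREMAN_TOKEN"),
		EmbedModel:    getenv("FOREMAN_EMBED_MODEL"),
		DBPath:        orDefault("FOREMAN_DB_PATH", "foreman.db"),
		WebhookSecret: getenv("FOREMAN_WEBHOOK_SECRET"),
	}
	if cfg.OllamaURL == "" {
		return Config{}, errors.New("FOREMAN_OLLAMA_URL is required")
	}
	// Assumption: the interval uses Go duration syntax such as "30s".
	d, err := time.ParseDuration(orDefault("FOREMAN_POLL_INTERVAL", "30s"))
	if err != nil {
		return Config{}, fmt.Errorf("FOREMAN_POLL_INTERVAL: %w", err)
	}
	cfg.PollInterval = d
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Printf("listening on %s, target %s\n", cfg.Addr, cfg.OllamaURL)
}
```

Passing the lookup function as a parameter keeps the loader testable without touching the process environment.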
Health check
curl http://localhost:8080/healthz
# {"status":"ok","degraded":false}
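The same check can be scripted from Go, for example as a readiness probe. This is a sketch only: the two JSON fields are taken from the sample response above, while the `health` type, the helper names, and the policy of treating a degraded daemon as not ready are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

// health models the /healthz response body shown above.
type health struct {
	Status   string `json:"status"`
	Degraded bool   `json:"degraded"`
}

// parseHealth decodes a raw /healthz response body.
func parseHealth(body []byte) (health, error) {
	var h health
	err := json.Unmarshal(body, &h)
	return h, err
}

// ready is a hypothetical policy: status must be "ok" and not degraded.
func ready(h health) bool {
	return h.Status == "ok" && !h.Degraded
}

func main() {
	resp, err := http.Get("http://localhost:8080/healthz")
	if err != nil {
		fmt.Fprintln(os.Stderr, "request failed:", err)
		os.Exit(1)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read failed:", err)
		os.Exit(1)
	}
	h, err := parseHealth(body)
	if err != nil {
		fmt.Fprintln(os.Stderr, "bad response:", err)
		os.Exit(1)
	}
	fmt.Printf("status=%s degraded=%t ready=%t\n", h.Status, h.Degraded, ready(h))
}
```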
Architecture
See docs/adr/ for design decisions. Key points:
- One daemon per Ollama target (ADR-0001)
- SQLite-backed durable job queue in WAL mode (ADR-0008)
- Single worker loop with drain-by-model scheduling (ADR-0009)
- Native Ollama passthrough + async /jobs surface (ADR-0003, ADR-0004)
- Embeddings bypass the queue entirely (ADR-0013)
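
To make drain-by-model scheduling concrete, the sketch below shows one plausible reading of the idea: keep running queued jobs for the model that is already loaded, and swap only when none remain. ADR-0009 is the authoritative description; the `Job` type, the function names, and the exact selection rule here are assumptions, and the model names are placeholders.

```go
package main

import "fmt"

// Job is a hypothetical queued request for a given model.
type Job struct {
	ID    int
	Model string
}

// next returns the index of the job to run from a FIFO queue:
// the oldest job for the loaded model, else the oldest job overall.
// It returns -1 when the queue is empty.
func next(queue []Job, loaded string) int {
	if len(queue) == 0 {
		return -1
	}
	for i, j := range queue {
		if j.Model == loaded {
			return i
		}
	}
	return 0 // nothing left for the loaded model: swap
}

// drainOrder simulates a single worker loop and returns the job IDs
// in the order they would run.
func drainOrder(queue []Job, loaded string) []int {
	pending := append([]Job(nil), queue...) // copy; leave the input intact
	var order []int
	for len(pending) > 0 {
		i := next(pending, loaded)
		job := pending[i]
		loaded = job.Model // running the job loads its model
		order = append(order, job.ID)
		pending = append(pending[:i], pending[i+1:]...)
	}
	return order
}

func main() {
	queue := []Job{{1, "model-a"}, {2, "model-b"}, {3, "model-a"}, {4, "model-b"}}
	fmt.Println(drainOrder(queue, "")) // prints [1 3 2 4]
}
```

With this rule the four jobs need two model loads instead of four. A real scheduler would likely also need a fairness bound, since a steady stream of jobs for one model could otherwise starve the others.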