Phase 6 deployment infrastructure: finalize Dockerfile with OCI labels,
improve .env.example with grouped config keys, add scripts/pull-models.sh
for Mac-side model setup, and add docs/deploy.md covering the full
deployment topology, prerequisites, security model, and troubleshooting.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds client/ -- a public Go package providing a synchronous facade over
foreman's async POST /jobs API (Level 1 integration per ADR-0011).
Two delivery modes:
- Webhook receiver (preferred): ephemeral HTTP server on a random port;
  receives results as soon as foreman pushes them and verifies the HMAC
  when configured
- Polling fallback: polls GET /jobs/{id} at a configurable interval
Also includes Tags() and Embed() helpers, bearer auth support, and
comprehensive integration tests against the real foreman HTTP handlers.
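
Illustrative sketch only, not the code in client/: one way the webhook
receiver's HMAC check could look. The hex encoding of the signature and the
helper names sign and verify are assumptions; only HMAC-SHA256 itself is
stated in this log.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// sign returns the hex-encoded HMAC-SHA256 of body under secret.
// Hex encoding is an assumption made for this sketch.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hex.EncodeToString(mac.Sum(nil))
}

// verify reports whether signature is a valid hex HMAC-SHA256 of body.
// hmac.Equal compares in constant time to avoid leaking timing information.
func verify(secret, body []byte, signature string) bool {
	got, err := hex.DecodeString(signature)
	if err != nil {
		return false // malformed signature
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("example-secret")
	body := []byte(`{"job_id":"example","state":"done"}`)
	sig := sign(secret, body)

	fmt.Println(verify(secret, body, sig))               // untouched payload
	fmt.Println(verify(secret, []byte("tampered"), sig)) // modified payload
}
```

The receiver would read the raw request body before decoding JSON, since the
signature covers the exact bytes that were sent.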
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add the async job submission API, webhook state notifications, and
artifact serving endpoints on top of the Phase 3 queue infrastructure.
Key changes:
- POST /jobs: async job submission with 202 + job_id ULID; optional
  state_webhook_url for push notifications on state transitions
- GET /jobs/{id}: job status polling with result, error, and artifact
  metadata; artifacts <= 256KB inlined, larger ones by URL reference
- GET /jobs/{id}/artifacts/{name}: raw artifact data serving
- Webhook dispatcher: at-least-once delivery with exponential backoff
  (5 retries); optional HMAC-SHA256 signing (X-Foreman-Signature)
- ADR-0014: state_webhook_url only honored on POST /jobs, not sync
  /api/chat (caller already blocks for result)
- Comprehensive tests for /jobs lifecycle, webhook delivery, HMAC
  verification, artifact inline/URL threshold, and TTL pruning
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the Phase 2 in-flight chat gate (buffered channel) with a real
SQLite-backed job queue and single worker loop. Every /api/chat request
now creates a job row, blocks until the worker completes it, and returns
the result transparently.
Key changes:
- internal/store: NextJob (drain-by-model ordering), IncrementAttempt,
  ResetInterruptedJobs, DeleteTerminalJobsBefore; busy_timeout pragma
- internal/worker: single-threaded worker loop with Notifier for sync
  handler completion signaling; retry on ConnectionError, terminal fail
  on HTTPError; crash recovery resets interrupted jobs on startup
- internal/webhook: dispatcher infrastructure for async webhook delivery
- internal/server: chat handler rewritten to enqueue+wait; old chatGate
  removed; embeddings remain direct concurrent proxies (ADR-0013)
- internal/config: FOREMAN_MAX_ATTEMPTS, FOREMAN_JOB_TTL
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2 of foreman: the daemon now acts as a transparent Ollama proxy.
- internal/ollama: Client interface and HTTP implementation for chat
  (streaming + non-streaming), embed, tags, ps with auth forwarding,
  NDJSON streaming via bufio.Scanner, and connection vs HTTP error
  classification via custom error types.
- internal/ollama: ModelInventory with background poller for /api/tags
  and /api/ps, degraded mode on target unreachable with model retention,
  automatic recovery on reconnect.
- internal/server: Passthrough routes (/api/chat, /api/tags, /api/ps,
  /api/embed, /api/embeddings) with model validation, chat serialization
  gate (capacity-1 channel), concurrent embedding bypass (ADR-0013),
  NDJSON streaming with per-chunk flush, and degraded health reporting.
- cmd/foreman: Full serve wiring with Ollama client, poller goroutine,
  embedder warmup (keep_alive:-1), and signal-based shutdown.
The Mac is now usable as a go-llm target through foreman.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1 of foreman: initialize the Go module, project layout, and core
infrastructure. Includes env-based configuration (FOREMAN_* namespace),
SQLite-backed durable job queue with WAL mode via modernc.org/sqlite,
stdlib HTTP server with /healthz and optional bearer-token auth middleware,
subcommand dispatch (serve + stubs), Gitea CI workflow, multi-stage
distroless Dockerfile, and comprehensive tests for all packages.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>