steve/foreman

Commit Graph

Author	SHA1	Message	Date
steve

7cd7eaff8b

feat: add FOREMAN_KEEP_ALIVE config for worker model residency

CI / Tidy (push) Successful in 9m42s

CI / Build & Test (push) Successful in 10m28s

CI / Publish Docker Image (push) Successful in 21s

Allow configuring how long the worker model stays resident on the Ollama
target after a request via FOREMAN_KEEP_ALIVE env var. Accepts Ollama
duration strings ("-1" forever, "0" unload, "15m", "1h", etc). Defaults
to "-1" (pin forever). The embedder warm-up is unaffected and always
uses keep_alive=-1.
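
To illustrate the behaviour described above, here is a minimal sketch of how such a setting might be resolved. The helper name resolveKeepAlive and its validation rules are assumptions for illustration, not the code in this commit; only the variable name FOREMAN_KEEP_ALIVE, the example values, and the "-1" default come from the message.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// defaultKeepAlive mirrors the default stated in the commit message
// ("-1", pin forever).
const defaultKeepAlive = "-1"

// resolveKeepAlive is a hypothetical helper. It returns the value to pass
// through as keep_alive, falling back to the default when raw is empty.
func resolveKeepAlive(raw string) (string, error) {
	if raw == "" {
		return defaultKeepAlive, nil
	}
	// Bare integers such as "-1" or "0" are accepted as-is.
	if _, err := strconv.Atoi(raw); err == nil {
		return raw, nil
	}
	// Otherwise expect a duration string such as "15m" or "1h".
	if _, err := time.ParseDuration(raw); err != nil {
		return "", fmt.Errorf("invalid FOREMAN_KEEP_ALIVE %q: %w", raw, err)
	}
	return raw, nil
}

func main() {
	v, err := resolveKeepAlive(os.Getenv("FOREMAN_KEEP_ALIVE"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("keep_alive =", v)
}
```

Validating at startup, as this sketch does, would surface a malformed value before the first request rather than at the Ollama target; whether the real config package validates at all is not stated in the message.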

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-23 21:29:37 -04:00

steve

6fd050855a

feat: add durable queue, single worker, and drain-by-model scheduling

Replace the Phase 2 in-flight chat gate (buffered channel) with a real
SQLite-backed job queue and single worker loop. Every /api/chat request
now creates a job row, blocks until the worker completes it, and returns
the result transparently.

Key changes:
- internal/store: NextJob (drain-by-model ordering), IncrementAttempt,
  ResetInterruptedJobs, DeleteTerminalJobsBefore; busy_timeout pragma
- internal/worker: single-threaded worker loop with Notifier for sync
  handler completion signaling; retry on ConnectionError, terminal fail
  on HTTPError; crash recovery resets interrupted jobs on startup
- internal/webhook: dispatcher infrastructure for async webhook delivery
- internal/server: chat handler rewritten to enqueue+wait; old chatGate
  removed; embeddings remain direct concurrent proxies (ADR-0013)
- internal/config: FOREMAN_MAX_ATTEMPTS, FOREMAN_JOB_TTL
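
As a rough illustration of the "drain-by-model ordering" mentioned for NextJob, the sketch below assumes it means: prefer the oldest pending job for the model that is already loaded, and only switch models when none remain. That interpretation, the Job struct, the nextJob function, and the model names are all assumptions; the real implementation is SQLite-backed and may order differently.

```go
package main

import "fmt"

// Job is a hypothetical, pared-down view of a queued chat request.
type Job struct {
	ID    int64
	Model string
}

// nextJob picks the next job from pending, which is assumed to be in
// enqueue order. Jobs for the currently loaded model are drained first;
// otherwise the oldest job wins and implies a model switch.
func nextJob(pending []Job, loadedModel string) (Job, bool) {
	if len(pending) == 0 {
		return Job{}, false
	}
	for _, j := range pending {
		if j.Model == loadedModel {
			return j, true
		}
	}
	return pending[0], true
}

func main() {
	pending := []Job{
		{ID: 1, Model: "model-a"},
		{ID: 2, Model: "model-b"},
		{ID: 3, Model: "model-a"},
	}
	j, _ := nextJob(pending, "model-b")
	fmt.Println(j.ID) // prints 2: the loaded model is drained first
}
```

The appeal of an ordering like this is fewer model loads on the target, at the cost of strict FIFO fairness across models.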

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-23 18:29:32 -04:00

2 Commits