feat: add FOREMAN_KEEP_ALIVE config for worker model residency

Allow configuring, via the FOREMAN_KEEP_ALIVE env var, how long the worker
model stays resident on the Ollama target after a request. Accepts Ollama
duration values ("-1" forever, "0" unload immediately, "15m", "1h", etc.).
Defaults to "-1" (pin forever). The embedder warm-up is unaffected and
always uses keep_alive=-1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -17,6 +17,13 @@ FOREMAN_TOKEN=change-me-to-a-secret
 # Always-resident embedding model (pinned in slot 1)
 FOREMAN_EMBED_MODEL=nomic-embed-text
 
+# How long the worker model stays resident on the target after a request.
+# Accepts Ollama duration strings: "-1" (forever/pin), "0" (unload immediately),
+# "15m", "1h", "3600" (seconds), etc.
+# Does NOT affect the embedder, which is always pinned with keep_alive=-1.
+# Default: -1 (pin forever — best for a dedicated box)
+FOREMAN_KEEP_ALIVE=-1
+
 # === Persistence ===
 
 # SQLite database path (default: foreman.db)
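For illustration only, here is a minimal sketch of how a setting like FOREMAN_KEEP_ALIVE might be read and attached to a request payload. This is not Foreman's actual code: the helper names `parse_keep_alive` and `build_payload` are hypothetical, and the payload fields are assumptions. It also assumes that the Ollama API reads a numeric `keep_alive` as seconds and expects string values to carry a unit, which is why bare integers such as "-1" or "3600" are converted to numbers here.

```python
import os
import re

# Default taken from the commit message: "-1" pins the worker model forever.
DEFAULT_KEEP_ALIVE = "-1"


def parse_keep_alive(raw):
    """Hypothetical helper: map the env string to a keep_alive request value.

    Bare integers ("-1", "0", "3600") become ints (assumed to mean seconds);
    anything else ("15m", "1h") is passed through as a duration string.
    """
    value = (raw if raw is not None else DEFAULT_KEEP_ALIVE).strip()
    if not value:
        # Treat an empty setting the same as an unset one.
        value = DEFAULT_KEEP_ALIVE
    if re.fullmatch(r"[+-]?\d+", value):
        return int(value)
    return value


def build_payload(model, prompt, env=os.environ):
    """Hypothetical helper: build a request body that carries keep_alive."""
    return {
        "model": model,
        "prompt": prompt,
        "keep_alive": parse_keep_alive(env.get("FOREMAN_KEEP_ALIVE")),
    }
```

Converting only bare integers keeps unit-suffixed values untouched, so validating them is left to the server. How Foreman itself handles malformed values is not shown in this commit.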