# foreman

🪓 A small, always-on Go daemon that fronts **one** Ollama target. It turns a
single Ollama instance into a queued, observable job endpoint: it polls the
target's installed models, serializes work through the target (managing model
swaps), assigns every job an ID, and reports progress via webhooks.

On the wire it speaks **native Ollama**, so it doubles as a drop-in target for
any Ollama client, including [majordomo](https://gitea.stevedudenhoeffer.com/steve/majordomo)
(via its `ollama.Foreman(url, token)` preset) and, through that,
[gadfly](https://gitea.stevedudenhoeffer.com/steve/gadfly). Point a client at the
foreman URL instead of the raw Ollama URL and you get queuing and model-swap
serialization for free.
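
Because foreman speaks native Ollama, a plain HTTP client only needs a different base URL. The following is a minimal Go sketch, not code from foreman or majordomo. It assumes foreman listens on `localhost:8080` (the `FOREMAN_ADDR` default), that the caller token is sent as an `Authorization: Bearer` header, and that a model named `llama3.2` is installed on the target. The helper name `newGenerateRequest` is hypothetical.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"strings"
	"time"
)

// generateRequest holds the fields of Ollama's POST /api/generate body used here.
type generateRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
	Stream bool   `json:"stream"`
}

// generateResponse holds the one field of the non-streaming reply this sketch reads.
type generateResponse struct {
	Response string `json:"response"`
}

// newGenerateRequest (hypothetical helper) builds a native Ollama generate call.
// baseURL may point at foreman or at a raw Ollama instance; the request is the same.
func newGenerateRequest(baseURL, token, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(generateRequest{Model: model, Prompt: prompt, Stream: false})
	if err != nil {
		return nil, err
	}
	url := strings.TrimRight(baseURL, "/") + "/api/generate"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if token != "" {
		// Assumption: foreman expects the caller token as a standard bearer header.
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

func run() error {
	// Generous timeout, on the assumption that a request may wait behind queued work.
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Minute)
	defer cancel()

	req, err := newGenerateRequest("http://localhost:8080", os.Getenv("FOREMAN_TOKEN"),
		"llama3.2", "Say hello in five words.")
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req.WithContext(ctx))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	var out generateResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return err
	}
	fmt.Println(out.Response)
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		os.Exit(1)
	}
}
```

Swapping the base URL for the raw Ollama address (for example `http://mac.tail:11434`) should send the same request straight to the target, without foreman's queue in between.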

> **This is a public, vibe-coded project** (built largely by an AI agent). It runs
> the author's homelab but is intentionally generic: one daemon, one target, one
> queue. Treat the homelab specifics in the docs as illustrative, and don't
> oversell it: it's a deliberately small queue in front of Ollama, not a
> distributed scheduler.

## Quickstart

```bash
# Set the required Ollama target URL
export FOREMAN_OLLAMA_URL=http://mac.tail:11434

# Run directly
go run ./cmd/foreman serve

# Or build and run
go build -o foreman ./cmd/foreman
./foreman serve
```

## Docker

```bash
docker build -t foreman .
docker run -e FOREMAN_OLLAMA_URL=http://mac.tail:11434 -p 8080:8080 foreman
```

## Configuration

All configuration is via environment variables, namespaced under `FOREMAN_*`.
See [`.env.example`](.env.example) for the full list.

| Variable | Default | Description |
|---|---|---|
| `FOREMAN_ADDR` | `:8080` | Listen address |
| `FOREMAN_OLLAMA_URL` | *(required)* | Ollama target base URL |
| `FOREMAN_OLLAMA_TOKEN` | *(empty)* | Bearer token sent to the target |
| `FOREMAN_TOKEN` | *(empty)* | Bearer token callers must present |
| `FOREMAN_EMBED_MODEL` | *(empty)* | Always-resident embedder model |
| `FOREMAN_DB_PATH` | `foreman.db` | SQLite database path |
| `FOREMAN_POLL_INTERVAL` | `30s` | Target model poll interval |
| `FOREMAN_WEBHOOK_SECRET` | *(empty)* | HMAC key for webhook signing |
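
To make the defaults concrete, here is a sketch of how variables like these might be read in Go. It is not foreman's actual loader: `Config`, `loadConfig`, and the injected `getenv` function are hypothetical names, and parsing `FOREMAN_POLL_INTERVAL` with `time.ParseDuration` is an assumption based on the `30s` default.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"time"
)

// Config mirrors the variables in the table above (hypothetical struct).
type Config struct {
	Addr          string
	OllamaURL     string
	OllamaToken   string
	Token         string
	EmbedModel    string
	DBPath        string
	PollInterval  time.Duration
	WebhookSecret string
}

// loadConfig reads FOREMAN_* values through getenv, applying the documented
// defaults and rejecting a missing FOREMAN_OLLAMA_URL.
func loadConfig(getenv func(string) string) (Config, error) {
	orDefault := func(key, def string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return def
	}

	cfg := Config{
		Addr:          orDefault("FOREMAN_ADDR", ":8080"),
		OllamaURL:     getenv("FOREMAN_OLLAMA_URL"),
		OllamaToken:   getenv("FOREMAN_OLLAMA_TOKEN"),
		Token:         getenv("FOREMAN_TOKEN"),
		EmbedModel:    getenv("FOREMAN_EMBED_MODEL"),
		DBPath:        orDefault("FOREMAN_DB_PATH", "foreman.db"),
		WebhookSecret: getenv("FOREMAN_WEBHOOK_SECRET"),
	}
	if cfg.OllamaURL == "" {
		return Config{}, errors.New("FOREMAN_OLLAMA_URL is required")
	}

	// Assumption: the interval uses Go duration syntax such as "30s" or "2m".
	interval, err := time.ParseDuration(orDefault("FOREMAN_POLL_INTERVAL", "30s"))
	if err != nil {
		return Config{}, fmt.Errorf("FOREMAN_POLL_INTERVAL: %w", err)
	}
	cfg.PollInterval = interval
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Printf("listen=%s target=%s poll=%s db=%s\n",
		cfg.Addr, cfg.OllamaURL, cfg.PollInterval, cfg.DBPath)
}
```

Passing `getenv` as a parameter keeps the loader testable without touching the real process environment.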

## Health check

```bash
curl http://localhost:8080/healthz
# {"status":"ok","degraded":false}
```
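
For scripted readiness checks, the same response can be decoded in Go. This sketch assumes the body contains only the two fields shown above (`status` and `degraded`) and that foreman listens on `localhost:8080`. The names `health` and `parseHealth` are hypothetical, and treating `degraded` as a failure is this sketch's choice, not documented behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// health mirrors the JSON body shown in the curl example.
type health struct {
	Status   string `json:"status"`
	Degraded bool   `json:"degraded"`
}

// parseHealth decodes a /healthz response body.
func parseHealth(data []byte) (health, error) {
	var h health
	if err := json.Unmarshal(data, &h); err != nil {
		return health{}, err
	}
	return h, nil
}

func run() error {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get("http://localhost:8080/healthz")
	if err != nil {
		return fmt.Errorf("healthz unreachable: %w", err)
	}
	defer resp.Body.Close()

	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}
	h, err := parseHealth(data)
	if err != nil {
		return fmt.Errorf("unexpected healthz body: %w", err)
	}
	// Assumption: a degraded daemon should fail the check; relax this if
	// degraded operation is acceptable in your setup.
	if h.Status != "ok" || h.Degraded {
		return fmt.Errorf("unhealthy: status=%q degraded=%t", h.Status, h.Degraded)
	}
	fmt.Println("foreman is healthy")
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```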

## Architecture

See [`docs/adr/`](docs/adr/) for design decisions. Key points:

- One daemon per Ollama target (ADR-0001)
- SQLite-backed durable job queue in WAL mode (ADR-0008)
- Single worker loop with drain-by-model scheduling (ADR-0009)
- Native Ollama passthrough + async `/jobs` surface (ADR-0003, ADR-0004)
- Embeddings bypass the queue entirely (ADR-0013)
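
One plausible reading of "drain-by-model" is that the worker keeps taking jobs for the model that is already loaded and only swaps once none remain. The sketch below illustrates that reading only. It is not taken from ADR-0009, it uses an in-memory slice rather than SQLite, and `job`, `nextJob`, and `drainOrder` are hypothetical names.

```go
package main

import "fmt"

// job is a queued request for a specific model (hypothetical shape).
type job struct {
	ID    int
	Model string
}

// nextJob returns the index of the job to run next. It prefers the oldest job
// for the currently loaded model; if there is none, it falls back to the
// oldest job overall, which implies a model swap. queue must be oldest-first.
func nextJob(queue []job, loaded string) (int, bool) {
	if len(queue) == 0 {
		return 0, false
	}
	for i, j := range queue {
		if j.Model == loaded {
			return i, true
		}
	}
	return 0, true
}

// drainOrder simulates a single worker loop and returns job IDs in the order
// they would run.
func drainOrder(queue []job) []int {
	pending := append([]job(nil), queue...) // copy so the caller's slice is untouched
	var order []int
	loaded := "" // no model loaded at start
	for {
		i, ok := nextJob(pending, loaded)
		if !ok {
			return order
		}
		loaded = pending[i].Model
		order = append(order, pending[i].ID)
		pending = append(pending[:i], pending[i+1:]...)
	}
}

func main() {
	queue := []job{
		{ID: 1, Model: "llama3"},
		{ID: 2, Model: "qwen"},
		{ID: 3, Model: "llama3"},
		{ID: 4, Model: "qwen"},
	}
	fmt.Println(drainOrder(queue)) // prints [1 3 2 4]: one swap instead of three
}
```

A strict version of this policy can starve other models while jobs for the loaded model keep arriving, so a real scheduler would likely need a fairness bound; whether foreman has one is not stated here.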

## License

[MIT](LICENSE) © 2026 Steve Dudenhoeffer.