# phase-0-kickoff.md: foreman build kickoff

You are building **foreman**, a Go daemon that fronts one Ollama target and turns it into a queued, observable, OpenAI/Ollama-compatible job endpoint. This is a deliberately pared-down restart of a system (`peon-overseer`) that died of scope creep. Restraint is a feature, not a limitation.

## Read these first (authoritative, in order)

1. `CLAUDE.md` in this repo: the operating manual. It is the source of truth for architecture, stack, conventions, and the **out-of-scope guardrails**.
2. `docs/adr/README.md` then every `docs/adr/00NN-*.md`. The ADRs are the *why*. Do not relitigate them; if you believe one is wrong, say so and propose a new superseding ADR rather than silently diverging.
3. Via the **gitea MCP**, read the integration target, `steve/go-llm`: `v2/provider/provider.go` (the `Provider` interface you must stay compatible with), `v2/ollama/ollama.go` and `v2/constructors.go` (how `Ollama` / `OllamaCloud` construct over native `/api/chat` + Bearer), and `v2/CLAUDE.md` (DD#8: native API, not OpenAI-compat).
4. Via the gitea MCP, study deployment conventions in `steve/steveternet`: `kalimdor/orgrimmar/warhol-queue/`, `kalimdor/orgrimmar/ratchet/`, and `kalimdor/orgrimmar/mort/` for `docker-compose.yml` + `.env.example` patterns, and `kalimdor/orgrimmar/traefik/` (incl. `custom/`) for the Traefik network name, entrypoint, certresolver, and router/label conventions. foreman will live at `kalimdor/orgrimmar/foreman/`. **Mirror these exactly; do not invent label syntax.**

## Working agreement (opusplan)

- **Plan before code.** For each phase, produce a plan and wait for my approval before implementing. Do not run ahead to later phases.
- **One phase at a time**, in order. Each phase is its own prompt I will paste.
- After every phase: `go build ./...`, `go vet ./...`, `go test -race -count=1 ./...` must all pass. Append a dated entry to `progress.md`. Commit on a phase branch with conventional-commit messages (`feat:`, `chore:`, `test:`, `docs:`).
- **Ask before assuming.** If a detail is ambiguous and not settled by CLAUDE.md or an ADR, ask me; don't guess.
- **Propose an ADR** (append-only, next number) for any architectural decision not already covered. Keep `docs/adr/README.md`'s index current.
- Keep dependencies minimal; match `go-llm` house style (tabs; wrap errors with `fmt.Errorf("%w: ...", err)`; imports stdlib → third-party → internal). SQLite via `modernc.org/sqlite` (pure-Go, `CGO_ENABLED=0`). No UI.
- **Refuse scope creep.** No distributed dispatch, leases, fair queueing, capacity budgets, auth framework/SSO, GUI, or multi-target support. If a task seems to need them, stop and flag it: that means the design is being violated.

## Definition of done (whole project)

A deployable daemon that:

- fronts one configurable Ollama target and transparently proxies native `/api/chat`, `/api/tags`, `/api/ps` (so `go-llm` uses the Mac as a target with no provider changes), including streaming;
- runs a durable SQLite-backed queue with a single worker and drain-by-model scheduling, surviving restarts and target sleep;
- exposes an async `POST /jobs` surface returning a job ID, with `queued→loading→working→done/failed` state webhooks and artifact delivery;
- ships a Go client package (synchronous facade over the async surface);
- passes CI on Gitea, builds as a container, and deploys via a steveternet `docker-compose.yml` behind Traefik.

## Phase map

1. Scaffold, config, SQLite store, health, CI, Dockerfile.
2. Ollama target client + model poller + native passthrough (the go-llm target).
3. Durable queue + single worker + drain-by-model.
4. Async `/jobs` + job IDs + state webhooks + artifacts.
5. Go client package (sync facade) + `llm.Foreman()` in go-llm.
6. Deploy: steveternet compose + Traefik, `.env.example`, deploy docs, model-pull script.

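
For orientation only, the job lifecycle named in the definition of done (`queued→loading→working→done/failed`) could be modelled as a small transition table. This is a minimal sketch, not a settled design: the `State` type, the `CanTransition` and `Advance` helpers, and the rule that a job may fail from any non-terminal state are all illustrative assumptions, and CLAUDE.md and the ADRs remain authoritative.

```go
package main

import "fmt"

// State is a hypothetical job-state type. The values mirror the lifecycle
// in the definition of done; the type and helper names are assumptions.
type State string

const (
	StateQueued  State = "queued"
	StateLoading State = "loading"
	StateWorking State = "working"
	StateDone    State = "done"
	StateFailed  State = "failed"
)

// allowed lists the legal next states for each state.
// Assumption: a job may fail from any non-terminal state, and
// done/failed are terminal (no entry means no outgoing transitions).
var allowed = map[State][]State{
	StateQueued:  {StateLoading, StateFailed},
	StateLoading: {StateWorking, StateFailed},
	StateWorking: {StateDone, StateFailed},
}

// CanTransition reports whether a job may move from one state to another.
func CanTransition(from, to State) bool {
	for _, next := range allowed[from] {
		if next == to {
			return true
		}
	}
	return false
}

// Advance returns the new state, or the unchanged state and an error
// if the requested move is not in the transition table.
func Advance(from, to State) (State, error) {
	if !CanTransition(from, to) {
		return from, fmt.Errorf("illegal transition %s -> %s", from, to)
	}
	return to, nil
}

func main() {
	s := StateQueued
	for _, next := range []State{StateLoading, StateWorking, StateDone} {
		var err error
		if s, err = Advance(s, next); err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println("state:", s)
	}
	// A terminal state accepts no further transitions.
	if _, err := Advance(s, StateQueued); err != nil {
		fmt.Println("error:", err)
	}
}
```

A table like this would give the worker and the webhook emitter one shared definition of a legal state change. Whether `failed` is reachable from `queued` is exactly the kind of detail to confirm against the ADRs rather than guess.
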
## Your task right now

Confirm you've read the sources above, briefly restate the architecture in your own words (so I can check your understanding), flag anything in the ADRs you'd push back on, then produce a **detailed plan for Phase 1 only**. Do not write code yet. Stop for my approval.