phase-0-kickoff.md — foreman build kickoff
You are building foreman, a Go daemon that fronts one Ollama target and turns
it into a queued, observable, OpenAI/Ollama-compatible job endpoint. This is a
deliberately pared-down restart of a system (peon-overseer) that died of scope
creep. Restraint is a feature, not a limitation.
Read these first (authoritative, in order)
- `CLAUDE.md` in this repo: the operating manual. It is the source of truth for architecture, stack, conventions, and the out-of-scope guardrails.
- `docs/adr/README.md`, then every `docs/adr/00NN-*.md`. The ADRs are the why. Do not relitigate them; if you believe one is wrong, say so and propose a new superseding ADR rather than silently diverging.
- Via the gitea MCP, read the integration target `steve/go-llm`: `v2/provider/provider.go` (the `Provider` interface you must stay compatible with), `v2/ollama/ollama.go` and `v2/constructors.go` (how `Ollama`/`OllamaCloud` construct over native `/api/chat` + Bearer), and `v2/CLAUDE.md` (DD#8: native API, not OpenAI-compat).
- Via the gitea MCP, study deployment conventions in `steve/steveternet`: `kalimdor/orgrimmar/warhol-queue/`, `kalimdor/orgrimmar/ratchet/`, and `kalimdor/orgrimmar/mort/` for `docker-compose.yml` + `.env.example` patterns, and `kalimdor/orgrimmar/traefik/` (incl. `custom/`) for the Traefik network name, entrypoint, certresolver, and router/label conventions. foreman will live at `kalimdor/orgrimmar/foreman/`. Mirror these exactly; do not invent label syntax.
Working agreement (opusplan)
- Plan before code. For each phase, produce a plan and wait for my approval before implementing. Do not run ahead to later phases.
- One phase at a time, in order. Each phase is its own prompt I will paste.
- After every phase: `go build ./...`, `go vet ./...`, `go test -race -count=1 ./...` must all pass. Append a dated entry to `progress.md`. Commit on a phase branch with conventional-commit messages (`feat:`, `chore:`, `test:`, `docs:`).
- Ask before assuming. If a detail is ambiguous and not settled by `CLAUDE.md` or an ADR, ask me; don't guess.
- Propose an ADR (append-only, next number) for any architectural decision not already covered. Keep `docs/adr/README.md`'s index current.
- Keep dependencies minimal; match `go-llm` house style (tabs; wrap errors with `fmt.Errorf("%w: ...", err)`; imports stdlib → third-party → internal). SQLite via `modernc.org/sqlite` (pure-Go, `CGO_ENABLED=0`). No UI.
- Refuse scope creep. No distributed dispatch, leases, fair queueing, capacity budgets, auth framework/SSO, GUI, or multi-target support. If a task seems to need them, stop and flag it: that means the design is being violated.
Definition of done (whole project)
A deployable daemon that:
- fronts one configurable Ollama target and transparently proxies native `/api/chat`, `/api/tags`, `/api/ps` (so `go-llm` uses the Mac as a target with no provider changes), including streaming;
- runs a durable SQLite-backed queue with a single worker and drain-by-model scheduling, surviving restarts and target sleep;
- exposes an async `POST /jobs` surface returning a job ID, with `queued→loading→working→done/failed` state webhooks and artifact delivery;
- ships a Go client package (synchronous facade over the async surface);
- passes CI on Gitea, builds as a container, and deploys via a steveternet `docker-compose.yml` behind Traefik.
Phase map
- Scaffold, config, SQLite store, health, CI, Dockerfile.
- Ollama target client + model poller + native passthrough (the go-llm target).
- Durable queue + single worker + drain-by-model.
- Async `/jobs` + job IDs + state webhooks + artifacts.
- Go client package (sync facade) + `llm.Foreman()` in go-llm.
- Deploy: steveternet compose + Traefik, `.env.example`, deploy docs, model-pull script.
Your task right now
Confirm you've read the sources above, briefly restate the architecture in your own words (so I can check your understanding), flag anything in the ADRs you'd push back on, then produce a detailed plan for Phase 1 only. Do not write code yet. Stop for my approval.