# ADR-0001: One daemon per Ollama target

**Status:** Accepted (2026-05-23)

## Context

`peon-overseer` ballooned because it coordinated *multiple* workers from a central service: pull-based dispatch, claim leases, weighted fair queueing, capacity budgets, eligibility gates. All of that complexity existed solely to arbitrate shared workers. We want none of it back.

The system being built fronts inference hardware (initially the M1 Pro running Ollama) and exposes it as a managed job endpoint.

## Decision

Each `foreman` process is bound to **exactly one** Ollama target, configured by a single base URL. One target = one daemon = one queue. There is no cross-daemon awareness and no shared state between daemons.

If a second worker is added later (the 4090 box, the M5 Max), it gets its own `foreman` instance. Any fan-out across workers is the concern of a *separate* higher-level router that talks to multiple `foreman` instances. That router is explicitly out of scope here and not to be anticipated in this codebase.

## Consequences

- The daemon is radically simple: one target, one serialized work stream.
- Horizontal scale is "run another daemon," an operational act, not a code change.
- No lease/fairness/budget machinery is permitted in this repo. If a change starts to require it, that is the signal that the multi-worker router (a different project) is what's actually needed.

## Alternatives considered

- **One daemon managing many targets.** Rejected: reintroduces the scheduling and arbitration complexity that sank the predecessor.
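
To make the "one target = one daemon = one queue" rule concrete, here is a minimal Python sketch. It is illustrative only and is not taken from the `foreman` codebase: the `Foreman` class, its `submit` and `close` methods, the injected `run_job` callable, and `echo_runner` are all hypothetical names. The sketch makes no real calls to Ollama; the base URLs are placeholders.

```python
import queue
import threading
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Foreman:
    """Hypothetical sketch: a daemon bound to exactly one target."""

    base_url: str                       # the single target this daemon fronts
    run_job: Callable[[str, str], str]  # assumed hook: (base_url, prompt) -> result
    results: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # One FIFO queue and one worker thread: a single serialized work stream.
        self._jobs = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, prompt: str) -> None:
        """Enqueue a job; no leases, weights, or budgets are involved."""
        self._jobs.put(prompt)

    def _drain(self) -> None:
        # Jobs run strictly one at a time, in arrival order.
        while True:
            prompt = self._jobs.get()
            if prompt is None:  # sentinel pushed by close()
                break
            self.results.append(self.run_job(self.base_url, prompt))

    def close(self) -> None:
        """Finish the queued jobs, then stop the worker."""
        self._jobs.put(None)
        self._worker.join()


def echo_runner(base_url: str, prompt: str) -> str:
    # Stand-in for real inference so the sketch runs without a server.
    return f"{base_url} <- {prompt}"


if __name__ == "__main__":
    # Scaling out means starting another instance, not adding scheduling code.
    first = Foreman("http://localhost:11434", echo_runner)
    second = Foreman("http://second-box.example:11434", echo_runner)
    first.submit("hello")
    second.submit("world")
    first.close()
    second.close()
    print(first.results, second.results)
```

The two instances share nothing: each owns its queue and worker thread, and neither knows the other exists. Anything that needs to choose between them would belong in the separate router, not in this class.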