foreman/docs/adr/0001-one-daemon-per-target.md

# ADR-0001: One daemon per Ollama target

**Status:** Accepted — 2026-05-23

## Context

`peon-overseer` ballooned because it coordinated *multiple* workers from a
central service: pull-based dispatch, claim leases, weighted fair queueing,
capacity budgets, eligibility gates. All of that complexity existed solely to
arbitrate shared workers. We want none of it back.

`foreman`, the system being built, fronts inference hardware (initially the M1 Pro
running Ollama) and exposes it as a managed job endpoint.

## Decision

Each `foreman` process is bound to **exactly one** Ollama target, configured by a
single base URL. One target = one daemon = one queue. There is no cross-daemon
awareness and no shared state between daemons.
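
A minimal sketch of what this binding could look like, assuming a Python implementation. The names `Target`, `parse_target`, `Daemon`, and `run_job` are hypothetical and do not come from the `foreman` codebase, and the comma/whitespace check is just one illustrative way to refuse a list of targets. The actual Ollama call is injected as `run_job`, so the sketch shows only the shape: one validated base URL, one FIFO queue, one worker.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Any, Callable, List
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Target:
    """The single Ollama endpoint this daemon fronts (hypothetical type)."""
    base_url: str


def parse_target(raw: str) -> Target:
    """Accept exactly one http(s) base URL; refuse anything resembling a list."""
    value = raw.strip()
    if not value or "," in value or any(ch.isspace() for ch in value):
        raise ValueError("expected exactly one base URL")
    parts = urlsplit(value)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an http(s) base URL: {value!r}")
    return Target(base_url=value.rstrip("/"))


class Daemon:
    """One target, one queue, one worker thread: jobs run strictly in order."""

    def __init__(self, target: Target, run_job: Callable[[Target, Any], Any]) -> None:
        self.target = target
        self._run_job = run_job  # assumption: the caller supplies the Ollama call
        self._jobs: "queue.Queue[Any]" = queue.Queue()
        self.results: List[Any] = []
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def submit(self, job: Any) -> None:
        """Enqueue a job; there is no claiming, leasing, or prioritization."""
        self._jobs.put(job)

    def _drain(self) -> None:
        while True:
            job = self._jobs.get()
            try:
                outcome = self._run_job(self.target, job)
            except Exception as exc:  # keep the single worker alive on failure
                outcome = exc
            self.results.append(outcome)
            self._jobs.task_done()

    def wait_idle(self) -> None:
        """Block until every submitted job has been processed."""
        self._jobs.join()
```

With this shape, `Daemon(parse_target("http://localhost:11434"), run_job)` is the whole wiring (the URL is only an example value), and a second machine would mean a second process with its own URL rather than a second entry in any list.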

If a second worker is added later (the 4090 box, the M5 Max), it gets its own
`foreman` instance. Any fan-out across workers is the concern of a *separate*,
higher-level router that talks to multiple `foreman` instances. That router is
explicitly out of scope here, and this codebase must not anticipate it.

## Consequences

- The daemon is radically simple: one target, one serialized work stream.
- Horizontal scale is "run another daemon," an operational act, not a code change.
- No lease/fairness/budget machinery is permitted in this repo. If a change
  starts to require it, that is the signal that the multi-worker router (a
  different project) is what's actually needed.

## Alternatives considered

- **One daemon managing many targets.** Rejected: reintroduces the scheduling and
  arbitration complexity that sank `peon-overseer`.