foreman/docs/adr/0012-streaming.md
2026-05-23 16:41:20 -04:00


# ADR-0012: Streaming support
**Status:** Accepted — 2026-05-23
## Context
`go-llm`'s provider interface has a `Stream()` method, and Ollama's native
`/api/chat` streams token-by-token by default. The synchronous passthrough
(ADR-0003) must not break streaming clients. Separately, the async `/jobs`
surface (ADR-0004) reports progress via discrete state webhooks, which is a
different granularity than token streaming.
## Decision
- **Sync passthrough: support streaming.** When a `/api/chat` request sets
`stream: true`, foreman streams the target's token deltas back to the caller
(SSE/chunked, matching Ollama's native streaming). A streamed job still moves
through the queue; streaming begins once the job reaches `working`, so a job
waiting behind the drain-by-model queue (ADR-0009) simply starts streaming when
its turn comes. go-llm's `Stream()` works against foreman unchanged.
- **Async `/jobs` surface: no token streaming in v1.** Webhooks carry coarse state
transitions (ADR-0005) and the final result/artifacts, not per-token deltas.
Token-level streaming over a fire-and-forget webhook job is deliberately
deferred — it adds a transport (persistent connection or chunked webhook) whose
complexity isn't justified yet.
## Consequences
- Interactive go-llm usage gets real streaming through the transparent surface.
- Orchestration callers get state + final artifacts, which is what they need;
they can use the sync streaming surface directly if they want tokens.
- The job state machine and webhook protocol stay simple (no streaming transport
to design or operate).
## Alternatives considered
- **Stream tokens over the async surface too.** Deferred: requires either a
long-lived connection (defeats the point of async) or chunked-delta webhooks
(complex, rarely needed). Revisit only on a concrete need.
- **No streaming at all.** Rejected: it would break go-llm's `Stream()` and
interactive use on the very path whose support is the primary goal.