# ADR-0005: Webhook state-update protocol
**Status:** Accepted — 2026-05-23
## Context
Async callers (ADR-0004) need to know how their job is progressing without
polling. The requirement: periodically push state updates
(`queued → loading → working → done`, or `failed`) and deliver
results/artifacts on completion.
## Decision
When a job is submitted with `state_webhook_url`, foreman POSTs a JSON event to
that URL on every state transition.
### Event payload
```json
{
"job_id": "01J...",
"state": "loading",
"previous_state": "queued",
"timestamp": "2026-05-23T12:00:00Z",
"model": "qwen3.6:35b",
"attempt": 1,
"error": null,
"result": null,
"artifacts": null
}
```
- `state`: one of `queued`, `loading`, `working`, `done`, `failed`.
- On `done`: `result` holds the completion (native-Ollama-shaped) and `artifacts`
holds artifact references (ADR-0006).
- On `failed`: `error` holds a message; `result` is null.
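
A receiver might model and validate the payload roughly as follows. This is a minimal sketch, not foreman's own code: the `StateEvent` class and `parse_event` helper are hypothetical names, and the field types (for example, `error` as a string and `previous_state` as nullable) are assumptions inferred from the example above.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional

# The five states listed in this ADR.
STATES = {"queued", "loading", "working", "done", "failed"}


@dataclass(frozen=True)
class StateEvent:
    """Hypothetical receiver-side model of the event payload."""
    job_id: str
    state: str
    previous_state: Optional[str]  # assumption: may be null on the first event
    timestamp: str
    model: str
    attempt: int
    error: Optional[str]           # assumption: a plain message string
    result: Optional[Any]
    artifacts: Optional[Any]


def parse_event(body: bytes) -> StateEvent:
    """Decode a webhook body and check the invariants stated in this ADR."""
    # Missing or unexpected keys raise TypeError from the dataclass constructor.
    event = StateEvent(**json.loads(body))
    if event.state not in STATES:
        raise ValueError(f"unknown state: {event.state!r}")
    if event.state == "failed" and event.result is not None:
        raise ValueError("failed events must carry a null result")
    return event
```

Rejecting unknown keys is a strict choice made for the sketch; a receiver that wants to tolerate future payload fields would filter the decoded dict before constructing the model.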
### Delivery semantics
- **At-least-once.** Callers must be idempotent on `job_id` + `state`. A missed
webhook can always be reconciled via `GET /jobs/{id}` (ADR-0004).
- **Retry with backoff** on a non-2xx response or connection failure. Attempts
are bounded; once they are exhausted the event is dropped (the job state itself
is unaffected and remains queryable).
- **Ordering is not guaranteed** across retries: a retried event can arrive after
the event for a later transition. `previous_state` + `timestamp` let callers
order/deduplicate.
- **Optional HMAC signing:** if a webhook secret is configured, foreman sends an
`X-Foreman-Signature` header (HMAC-SHA256 of the body) so receivers can verify
authenticity. Off by default; recommended once foreman is reachable beyond a
fully trusted network.
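
Signature verification on the receiving side could look like the sketch below. The ADR only specifies "HMAC-SHA256 of the body", so the hex encoding of the digest and the absence of any prefix in the header value are assumptions here, and `sign_body` and `verify_signature` are hypothetical helper names.

```python
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """HMAC-SHA256 of the raw request body, hex-encoded (encoding is assumed)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, header_value: str) -> bool:
    """Check an X-Foreman-Signature header value against the raw body."""
    expected = sign_body(secret, body)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode(), header_value.encode())
```

The check has to run over the exact bytes received, before any JSON parsing or re-serialization, because re-encoding can change whitespace or key order and invalidate the digest.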
## Consequences
- Callers get push observability with a polling fallback.
- Idempotency is pushed onto the caller — documented as a hard requirement.
- Webhook delivery is decoupled from job execution: a flaky receiver never blocks
or fails the job.
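
To illustrate what the idempotency requirement asks of callers, here is a minimal in-memory sketch that deduplicates on `job_id` + `state` and uses `timestamp` to ignore late arrivals when tracking the latest state. `JobTracker` is a hypothetical name, and a real receiver would likely persist this bookkeeping instead of keeping it in process memory.

```python
from datetime import datetime


class JobTracker:
    """Hypothetical receiver-side bookkeeping for at-least-once delivery."""

    def __init__(self):
        self._seen = set()   # (job_id, state) pairs already handled
        self._latest = {}    # job_id -> (timestamp, state)

    def handle(self, event: dict) -> bool:
        """Return True if the event is new, False if it is a duplicate."""
        key = (event["job_id"], event["state"])
        if key in self._seen:
            return False
        self._seen.add(key)
        # Normalize the trailing "Z" so fromisoformat also works before 3.11.
        ts = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
        current = self._latest.get(event["job_id"])
        # Only advance the latest state; a late event with an older timestamp
        # is recorded as seen but does not overwrite newer state. Events with
        # equal timestamps are resolved in arrival order.
        if current is None or ts >= current[0]:
            self._latest[event["job_id"]] = (ts, event["state"])
        return True

    def state_of(self, job_id: str):
        entry = self._latest.get(job_id)
        return entry[1] if entry else None
```

For example, if the `working` event arrives before a retried `loading` event, both are accepted once, a redelivered `working` is rejected as a duplicate, and `state_of` still reports `working`.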
## Alternatives considered
- **Polling only.** Simpler for foreman, worse for callers; rejected since
webhooks were an explicit requirement. (Polling is still available as fallback.)
- **WebSocket/SSE for state.** Heavier; SSE is reserved for token streaming on the
sync surface (ADR-0012), not job-state fan-out.