foreman/docs/adr/0005-webhook-protocol.md
steve 0526bada90 docs: land prior ADR + prompt updates
Commit pre-existing uncommitted working-tree changes that predate the
license/public-readiness work — NOT authored in this session, just flushed so
they're not lost: ADR-0003/0005/0009/0012 edits, the new ADR-0013
(embeddings-bypass + two-slot residency, already referenced by CLAUDE.md), and
the phase-0..3 prompt revisions + prompts/README.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 20:33:39 -04:00


# ADR-0005: Webhook state-update protocol
**Status:** Accepted — 2026-05-23
## Context
Async callers (ADR-0004) need to know how their job is progressing without
polling. The requirement: push state updates as the job advances
(`queued → loading → working → done`, or `failed`) and deliver results/artifacts
on completion.
## Decision
When a job is submitted with `state_webhook_url`, foreman POSTs a JSON event to
that URL on every state transition.
### Event payload
```json
{
  "job_id": "01J...",
  "state": "loading",
  "previous_state": "queued",
  "timestamp": "2026-05-23T12:00:00Z",
  "model": "qwen3:30b",
  "attempt": 1,
  "error": null,
  "result": null,
  "artifacts": null
}
```
- `state`: one of `queued`, `loading`, `working`, `done`, `failed`.
- On `done`: `result` holds the completion (native-Ollama-shaped) and `artifacts`
holds artifact references (ADR-0006).
- On `failed`: `error` holds a message; `result` is null.
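A receiver might model this payload as a typed record. The sketch below is one possible Python shape, not foreman's own code: `StateEvent`, `STATES`, and `parse_event` are hypothetical names, the nullable `previous_state` and `model` are assumptions, and unknown keys are ignored on the assumption that the payload may gain fields later.

```python
import json
from dataclasses import dataclass, fields
from typing import Any, Optional

# The five states listed in this ADR.
STATES = ("queued", "loading", "working", "done", "failed")


@dataclass(frozen=True)
class StateEvent:
    job_id: str
    state: str
    timestamp: str
    previous_state: Optional[str] = None  # assumption: may be null on the first event
    model: Optional[str] = None
    attempt: int = 1
    error: Optional[str] = None       # set on `failed`
    result: Optional[Any] = None      # set on `done`
    artifacts: Optional[Any] = None   # set on `done` (ADR-0006)


def parse_event(body: bytes) -> StateEvent:
    """Decode a webhook body and reject states this ADR does not define."""
    raw = json.loads(body)
    known = {f.name for f in fields(StateEvent)}
    # Drop keys we do not know about instead of failing on them.
    event = StateEvent(**{k: v for k, v in raw.items() if k in known})
    if event.state not in STATES:
        raise ValueError(f"unknown state: {event.state!r}")
    return event
```

A missing required key (`job_id`, `state`, `timestamp`) surfaces as a `TypeError` from the dataclass constructor, which a receiver could map to a 4xx response.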
### Delivery semantics
- **At-least-once.** Callers must be idempotent on `job_id` + `state`. A missed
webhook can always be reconciled via `GET /jobs/{id}` (ADR-0004).
- **Retry with backoff** on a non-2xx response or connection failure. Attempts are
bounded; once they are exhausted the event is dropped (the job state itself is
unaffected and remains queryable).
- **Ordering is not guaranteed** across retries; `previous_state` + `timestamp`
let callers order/deduplicate.
- **Optional HMAC signing:** if a webhook secret is configured, foreman sends an
`X-Foreman-Signature` header (HMAC-SHA256 of the request body, keyed with that
secret) so receivers can verify authenticity. Signing is off by default and is
recommended once foreman is reachable beyond a fully trusted network.
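On the receiving side, these semantics translate into three duties: verify the signature when one is configured, ignore duplicates, and tolerate late arrivals. The following is a minimal sketch under stated assumptions. The hex encoding of the `X-Foreman-Signature` value is a guess (this ADR only says HMAC-SHA256 of the body), and `verify_signature` and `JobTracker` are hypothetical names.

```python
import hashlib
import hmac
from datetime import datetime


def verify_signature(secret: bytes, body: bytes, signature: str) -> bool:
    """Recompute the HMAC over the raw body and compare in constant time."""
    # Assumption: the header carries a lowercase hex digest.
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature.encode())


def _parse_ts(ts: str) -> datetime:
    # Map the trailing "Z" to an explicit offset so older Pythons parse it too.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


class JobTracker:
    """Deduplicates on (job_id, state) and orders events by timestamp."""

    def __init__(self) -> None:
        self._seen = set()   # {(job_id, state)}
        self._latest = {}    # job_id -> (timestamp, state)

    def accept(self, event: dict) -> bool:
        """Return True if the event is new, False if it is a redelivery."""
        key = (event["job_id"], event["state"])
        if key in self._seen:
            return False
        self._seen.add(key)
        ts = _parse_ts(event["timestamp"])
        current = self._latest.get(event["job_id"])
        # A late event is recorded as seen but never rolls the state backwards.
        if current is None or ts >= current[0]:
            self._latest[event["job_id"]] = (ts, event["state"])
        return True

    def state_of(self, job_id: str):
        current = self._latest.get(job_id)
        return current[1] if current else None
```

A real receiver would persist the seen set and fall back to `GET /jobs/{id}` when it suspects an event was dropped; the in-memory containers here only illustrate the bookkeeping.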
## Consequences
- Callers get push observability with a polling fallback.
- Idempotency is pushed onto the caller — documented as a hard requirement.
- Webhook delivery is decoupled from job execution: a flaky receiver never blocks
or fails the job.
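The decoupling described above can be sketched from the sender's side as a delivery loop that never raises into job execution. This is an illustration, not foreman's implementation: `http_post` and `deliver` are hypothetical names, and the attempt count, base delay, and exponential doubling are assumed values, since this ADR only specifies "backoff" and "bounded attempts".

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_post(url: str, body: bytes, timeout: float = 5.0) -> int:
    """POST a JSON body and return the HTTP status code."""
    req = urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # urllib raises on 4xx/5xx; convert that back into a status code.
        return exc.code


def deliver(
    send: Callable[[], int],
    max_attempts: int = 5,       # assumed bound
    base_delay: float = 0.5,     # assumed first delay, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try `send` until it returns 2xx; return False if the event is dropped."""
    for attempt in range(max_attempts):
        try:
            if 200 <= send() < 300:
                return True
        except OSError:
            pass  # connection failure: handled like a non-2xx response
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # exponential backoff
    return False  # dropped; the job itself is untouched
```

A caller would bind the request with something like `deliver(lambda: http_post(url, body))`, ideally on a worker separate from the one running the job, and log a `False` return rather than act on it.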
## Alternatives considered
- **Polling only.** Simpler for foreman, worse for callers; rejected since
webhooks were an explicit requirement. (Polling is still available as fallback.)
- **WebSocket/streamed connection for state.** Heavier; token streaming on the
sync surface is NDJSON (ADR-0012), and job-state fan-out doesn't need a
persistent connection — discrete webhook POSTs suffice.