foreman/docs/adr/0005-webhook-protocol.md
steve 0526bada90 docs: land prior ADR + prompt updates
Commit pre-existing uncommitted working-tree changes that predate the
license/public-readiness work — NOT authored in this session, just flushed so
they're not lost: ADR-0003/0005/0009/0012 edits, the new ADR-0013
(embeddings-bypass + two-slot residency, already referenced by CLAUDE.md), and
the phase-0..3 prompt revisions + prompts/README.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 20:33:39 -04:00


ADR-0005: Webhook state-update protocol

Status: Accepted — 2026-05-23

Context

Async callers (ADR-0004) need to follow a job's progress without polling. The requirement: push state updates as the job advances (queued → loading → working → done) and deliver results and artifacts on completion.

Decision

When a job is submitted with state_webhook_url, foreman POSTs a JSON event to that URL on every state transition.

Event payload

{
  "job_id": "01J...",
  "state": "loading",
  "previous_state": "queued",
  "timestamp": "2026-05-23T12:00:00Z",
  "model": "qwen3:30b",
  "attempt": 1,
  "error": null,
  "result": null,
  "artifacts": null
}
  • state: one of queued, loading, working, done, failed.
  • On done: result holds the completion (native-Ollama-shaped) and artifacts holds artifact references (ADR-0006).
  • On failed: error holds a message; result is null.
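The field rules above can be sketched as a small receiver-side check. This is an illustrative sketch, not foreman's own code: the names `StateEvent` and `check_event` are hypothetical, and treating `previous_state` as nullable (for the first event of a job) is an assumption the ADR does not spell out.

```python
from typing import Any, Optional, TypedDict

# The five states named in the ADR.
STATES = ("queued", "loading", "working", "done", "failed")


class StateEvent(TypedDict):
    """Shape of one webhook event, mirroring the example payload."""
    job_id: str
    state: str
    previous_state: Optional[str]  # assumption: null on the first event
    timestamp: str                 # ISO 8601, e.g. "2026-05-23T12:00:00Z"
    model: str
    attempt: int
    error: Optional[str]
    result: Optional[Any]          # completion object on "done"
    artifacts: Optional[Any]       # artifact references on "done"


def check_event(event: dict) -> list[str]:
    """Return a list of problems; an empty list means the event fits the rules."""
    problems = []
    state = event.get("state")
    if state not in STATES:
        problems.append(f"unknown state: {state!r}")
    if state == "failed":
        if not event.get("error"):
            problems.append("failed event should carry an error message")
        if event.get("result") is not None:
            problems.append("failed event should have a null result")
    if state == "done" and event.get("result") is None:
        problems.append("done event should carry a result")
    return problems
```

A receiver might call `check_event` before acting on a payload and log anything it returns rather than rejecting the request, since a rejected request would only trigger retries.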

Delivery semantics

  • At-least-once. Callers must be idempotent on job_id + state. A missed webhook can always be reconciled via GET /jobs/{id} (ADR-0004).
  • Retry with backoff on a non-2xx response or a connection failure, up to a bounded number of attempts. After the last attempt the event is dropped (the job state itself is unaffected and remains queryable).
  • Ordering is not guaranteed across retries; previous_state + timestamp let callers order/deduplicate.
  • Optional HMAC signing: if a webhook secret is configured, foreman sends an X-Foreman-Signature header (HMAC-SHA256 of the body) so receivers can verify authenticity. Off by default; recommended once foreman is reachable beyond a fully trusted network.
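The delivery rules above (signing, bounded retry with backoff, drop on exhaustion) could look roughly like the following sender-side sketch. Several details are assumptions, since the ADR does not fix them: the signature is hex-encoded, the backoff is exponential with a 1-second base, and the default limit is 5 attempts. The `post` callable is a hypothetical transport hook that sends the body and returns the HTTP status code.

```python
import hashlib
import hmac
import json
import time
from typing import Callable, Optional


def sign(body: bytes, secret: bytes) -> str:
    """HMAC-SHA256 of the raw body. Hex encoding is an assumption."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(body: bytes, secret: bytes, signature: str) -> bool:
    """Receiver-side check, using a constant-time comparison."""
    return hmac.compare_digest(sign(body, secret).encode(), signature.encode())


def deliver(
    event: dict,
    post: Callable[[bytes, dict], int],   # hypothetical: returns HTTP status
    secret: Optional[bytes] = None,
    max_attempts: int = 5,                # assumed bound
    base_delay: float = 1.0,              # assumed backoff base, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to deliver one event; return False if it was dropped."""
    body = json.dumps(event).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if secret is not None:
        headers["X-Foreman-Signature"] = sign(body, secret)

    for attempt in range(max_attempts):
        try:
            if 200 <= post(body, headers) < 300:
                return True
        except OSError:
            pass  # connection failure: handled like a non-2xx response
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return False  # event dropped; job state is still queryable
```

The signature is computed over the exact bytes that are sent, so a receiver should verify against the raw request body rather than a re-serialized copy of the parsed JSON.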

Consequences

  • Callers get push observability with a polling fallback.
  • Idempotency is pushed onto the caller — documented as a hard requirement.
  • Webhook delivery is decoupled from job execution: a flaky receiver never blocks or fails the job.
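Because idempotency falls to the caller, a receiver needs some way to ignore duplicates and tolerate out-of-order arrivals. A minimal in-memory sketch follows; `JobTracker` is a hypothetical name, and a real receiver would likely persist this state. It deduplicates on `job_id` + `state` and uses `timestamp` to decide which state is current.

```python
from datetime import datetime
from typing import Optional


class JobTracker:
    """Tracks the latest known state per job from at-least-once events."""

    def __init__(self) -> None:
        self._seen: set = set()     # (job_id, state) pairs already handled
        self._latest: dict = {}     # job_id -> (timestamp, state)

    def handle(self, event: dict) -> bool:
        """Process one event. Returns False if it was a duplicate."""
        job_id, state = event["job_id"], event["state"]
        if (job_id, state) in self._seen:
            return False  # redelivery of an event already processed
        self._seen.add((job_id, state))

        # Normalize the trailing "Z" so older Python versions can parse it.
        ts = datetime.fromisoformat(event["timestamp"].replace("Z", "+00:00"))
        current = self._latest.get(job_id)
        # A late-arriving older event must not overwrite a newer state.
        if current is None or ts >= current[0]:
            self._latest[job_id] = (ts, state)
        return True

    def state(self, job_id: str) -> Optional[str]:
        """Latest known state, or None if no event has been seen."""
        entry = self._latest.get(job_id)
        return entry[1] if entry else None
```

If `state` returns None or looks stale for longer than expected, the caller can fall back to GET /jobs/{id} to reconcile.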

Alternatives considered

  • Polling only. Simpler for foreman, worse for callers; rejected since webhooks were an explicit requirement. (Polling is still available as fallback.)
  • WebSocket/streamed connection for state. Heavier; token streaming on the sync surface is NDJSON (ADR-0012), and job-state fan-out doesn't need a persistent connection — discrete webhook POSTs suffice.