feat: add async /jobs surface, state webhooks, and artifact handling

Add the async job submission API, webhook state notifications, and
artifact serving endpoints on top of the Phase 3 queue infrastructure.

Key changes:
- POST /jobs: async job submission with 202 + job_id ULID; optional
  state_webhook_url for push notifications on state transitions
- GET /jobs/{id}: job status polling with result, error, and artifact
  metadata; artifacts <= 256KB inlined, larger ones by URL reference
- GET /jobs/{id}/artifacts/{name}: raw artifact data serving
- Webhook dispatcher: at-least-once delivery with exponential backoff
  (5 retries); optional HMAC-SHA256 signing (X-Foreman-Signature)
- ADR-0014: state_webhook_url only honored on POST /jobs, not sync
  /api/chat (caller already blocks for result)
- Comprehensive tests for /jobs lifecycle, webhook delivery, HMAC
  verification, artifact inline/URL threshold, and TTL pruning

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -117,3 +117,51 @@ with the real SQLite-backed job queue and single worker loop.
DeleteTerminalJobsBefore.
- Server: chat model validation (404), non-streaming chat through queue,
  serialization (max 1 concurrent), context cancellation, embed bypass unchanged.

## Phase 4: Async /jobs surface, webhooks, artifacts — 2026-05-23

**M1 core complete** (minus CLI and go-llm constructor, which are separate work).

- `internal/webhook/` — webhook dispatcher:
  - `Dispatcher.Fire(url, event)`: non-blocking goroutine delivery with
    exponential backoff retry (1s, 2s, 4s, 8s, 16s — max 5 attempts).
  - Optional HMAC-SHA256 signing via `FOREMAN_WEBHOOK_SECRET` — sets
    `X-Foreman-Signature: sha256=<hex>` header.
  - `VerifySignature()`: exported for webhook receivers.
  - `FormatArtifacts()`: inline (data field) for artifacts <= 256KB, URL reference
    for larger ones.
  - Webhook failures are logged and dropped — never block or fail the job
    (ADR-0005).

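For webhook receivers, the signing scheme listed above (`X-Foreman-Signature: sha256=<hex>`) can be checked roughly as follows. This is a minimal sketch, not the package's own code: `sign` and `verify` are hypothetical stand-ins (the real `VerifySignature()` signature is not shown in this entry), the payload fields are made up, and it assumes the MAC is computed over the raw request body.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

const sigPrefix = "sha256="

// sign returns a header value of the form "sha256=<hex>".
// Hypothetical helper; assumes the MAC covers the raw request body.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return sigPrefix + hex.EncodeToString(mac.Sum(nil))
}

// verify reports whether header is a valid signature of body under secret.
// hmac.Equal compares in constant time to avoid leaking timing information.
func verify(secret, body []byte, header string) bool {
	if !strings.HasPrefix(header, sigPrefix) {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, sigPrefix))
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("example-secret")
	// Illustrative payload only; the real event shape is not specified here.
	body := []byte(`{"job_id":"example","state":"done"}`)

	header := sign(secret, body)
	fmt.Println(verify(secret, body, header))               // true
	fmt.Println(verify(secret, []byte("tampered"), header)) // false
}
```

A receiver would read the full request body first, then pass it with the header value to the check before parsing any JSON.
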
- `internal/server/` — new routes:
  - `POST /jobs`: validates model, creates job row with optional
    `state_webhook_url`, returns `202 Accepted` with `{"job_id":"<ulid>"}`.
    Fires initial "queued" webhook. Wakes worker.
  - `GET /jobs/{id}`: returns full job state, result, error, and artifact
    metadata. 404 for unknown IDs. Artifacts of 256KB or less are inlined; larger
    ones get a URL reference.
  - `GET /jobs/{id}/artifacts/{name}`: serves raw artifact data with stored
    content type. 404 for unknown job/artifact.

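The inline-versus-URL rule for artifacts described above can be sketched as below. The `artifactRef` type, the `formatArtifact` name, the base64 encoding of inline data, and reading 256KB as 256 × 1024 bytes are all assumptions made for illustration; only the threshold and the `/jobs/{id}/artifacts/{name}` route come from this entry.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/url"
)

// inlineLimit is the inline threshold. Assumption: 256KB means 256 * 1024 bytes.
const inlineLimit = 256 * 1024

// artifactRef is a hypothetical response shape: small artifacts set Data,
// larger ones set URL instead.
type artifactRef struct {
	Name string
	Data string // base64-encoded content (the encoding is an assumption)
	URL  string
}

// formatArtifact inlines content at or below the limit and otherwise
// points at the artifact-serving route.
func formatArtifact(jobID, name string, content []byte) artifactRef {
	if len(content) <= inlineLimit {
		return artifactRef{
			Name: name,
			Data: base64.StdEncoding.EncodeToString(content),
		}
	}
	return artifactRef{
		Name: name,
		URL:  "/jobs/" + url.PathEscape(jobID) + "/artifacts/" + url.PathEscape(name),
	}
}

func main() {
	small := formatArtifact("01JEXAMPLE", "notes.txt", []byte("hello"))
	large := formatArtifact("01JEXAMPLE", "big.bin", make([]byte, inlineLimit+1))

	fmt.Println(small.Data) // aGVsbG8=
	fmt.Println(large.URL)  // /jobs/01JEXAMPLE/artifacts/big.bin
}
```

A client polling `GET /jobs/{id}` would then fetch only the URL-referenced artifacts in a second request.
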
- `docs/adr/0014-no-webhooks-on-sync-chat.md`:
  - `state_webhook_url` is only honored on `POST /jobs`. Sync `/api/chat` does
    not fire webhooks (ADR-0014). Rationale: the caller already holds a blocking
    HTTP connection.

- `cmd/foreman/main.go` — full serve wiring:
  - Creates webhook dispatcher, notifier, worker.
  - Starts worker loop goroutine and TTL pruner goroutine.
  - TTL pruner runs every `jobTTL/4` (min 1 minute), deletes terminal jobs
    older than `FOREMAN_JOB_TTL` (default 24h).
  - Server constructor now receives notifier, worker, and dispatcher.

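The pruner timing above (every `jobTTL/4`, floored at one minute) works out as in this small sketch. The function names `pruneInterval` and `pruneCutoff` are hypothetical, and the cutoff calculation assumes "older than the TTL" is measured back from the current time.

```go
package main

import (
	"fmt"
	"time"
)

// pruneInterval returns jobTTL/4, but never less than one minute.
func pruneInterval(jobTTL time.Duration) time.Duration {
	interval := jobTTL / 4
	if interval < time.Minute {
		interval = time.Minute
	}
	return interval
}

// pruneCutoff returns the instant before which terminal jobs are eligible
// for deletion. Assumption: job age is measured relative to now.
func pruneCutoff(now time.Time, jobTTL time.Duration) time.Time {
	return now.Add(-jobTTL)
}

func main() {
	fmt.Println(pruneInterval(24 * time.Hour))  // 6h0m0s
	fmt.Println(pruneInterval(2 * time.Minute)) // 1m0s

	now := time.Date(2026, 5, 23, 12, 0, 0, 0, time.UTC)
	fmt.Println(pruneCutoff(now, 24*time.Hour).Format(time.RFC3339)) // 2026-05-22T12:00:00Z
}
```

With the default 24h TTL this gives a pass every six hours, so under these assumptions a terminal job could linger for up to roughly 30 hours before it is removed.
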
- Tests (all passing with `-race`):
  - Jobs API: 202 on submit, ULID format, 404 for unknown model, 400 for
    missing model, 404 for unknown job, job state after completion, artifact
    retrieval, artifact 404.
  - Webhooks: full lifecycle events (queued->working->done), 500-returning
    receiver does not affect job state, HMAC signature verification.
  - Webhook dispatcher: delivery, retry on 500, non-blocking Fire, HMAC signing,
    no HMAC when no secret, signature format validation.
  - Artifacts: small inline, large by URL, empty returns nil.
  - TTL pruner: deletes old terminal jobs.

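The dispatcher's retry schedule (1s, 2s, 4s, 8s, 16s) and the "retry on 500" behavior covered by the tests above can be illustrated with a doubling delay. This sketch is not the dispatcher's code: `deliver`, `backoffDelay`, `recorder`, and `errFail` are hypothetical names, and it assumes one initial attempt followed by up to five retries, which is one reading of the schedule. The sleep function is injected so the example runs instantly.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errFail stands in for a receiver that answers with HTTP 500.
var errFail = errors.New("receiver returned 500")

// backoffDelay returns the wait before retry n (0-based): 1s, 2s, 4s, ...
func backoffDelay(retry int) time.Duration {
	return time.Second << retry
}

// deliver calls send once, then retries with doubling delays until send
// succeeds or maxRetries is exhausted. The last error is returned so the
// caller can log it and drop the event.
func deliver(send func() error, maxRetries int, sleep func(time.Duration)) error {
	err := send()
	for retry := 0; err != nil && retry < maxRetries; retry++ {
		sleep(backoffDelay(retry))
		err = send()
	}
	return err
}

// recorder captures requested delays instead of actually sleeping.
type recorder struct{ delays []time.Duration }

func (r *recorder) sleep(d time.Duration) { r.delays = append(r.delays, d) }

func main() {
	r := &recorder{}
	err := deliver(func() error { return errFail }, 5, r.sleep)
	fmt.Println(r.delays, err) // [1s 2s 4s 8s 16s] receiver returned 500
}
```

In a real dispatcher `time.Sleep` would be passed as the sleep function and the whole call would run in its own goroutine, which keeps the firing side non-blocking.
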