ADR-0006: Artifact handling and transport
Status: Accepted — 2026-05-23
Context
Jobs must "transmit artifacts when done." For a chat completion the obvious artifact is the assistant's text/tool-call output, but the term is deliberately broader: a job may produce structured data, multiple named outputs, or content too large to embed comfortably in a webhook body.
Decision
An artifact is a named, typed blob attached to a completed job:
{ "name": "completion", "content_type": "application/json", "size": 1234,
"inline": { ... }, "url": null }
- The primary completion is always emitted as an artifact named `completion` (the native-Ollama response shape), so there is one consistent access pattern.
- Additional artifacts use distinct names.
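The record shape above could be consumed roughly as follows. This is a minimal sketch, not the project's actual client code: the `Artifact` class and `parse_artifacts` helper are hypothetical names, and it assumes a completed job exposes its artifacts as a list of such records.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Artifact:
    """Hypothetical in-memory form of one artifact record."""
    name: str
    content_type: str
    size: int
    inline: Optional[Any] = None   # populated when the content is embedded
    url: Optional[str] = None      # populated when the content must be fetched


def parse_artifacts(entries: list[dict]) -> dict[str, Artifact]:
    """Index artifact records by name, rejecting duplicate names."""
    by_name: dict[str, Artifact] = {}
    for entry in entries:
        artifact = Artifact(**entry)
        if artifact.name in by_name:
            raise ValueError(f"duplicate artifact name: {artifact.name}")
        by_name[artifact.name] = artifact
    return by_name
```

With this, a caller would read the primary output as `parse_artifacts(entries)["completion"]` regardless of what other artifacts the job produced.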
Transport: inline vs fetch
- Small artifacts (under a configurable threshold, default ~256 KB) are delivered inline in the `done` webhook (`inline` populated, `url` null) and in `GET /jobs/{id}`.
- Large artifacts (over the threshold) are delivered by reference: the webhook/`GET` carries metadata plus a `url` (`GET /jobs/{id}/artifacts/{name}`), and the bytes are fetched on demand. This keeps webhook payloads bounded and avoids shipping megabytes through a callback POST.
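The inline-versus-fetch rule might be sketched on the producer side like this. The function name `describe_artifact`, the constant `INLINE_THRESHOLD_BYTES`, and the choice to measure size as the UTF-8 length of the JSON-serialized payload are all assumptions made for illustration; the ADR only fixes the threshold as configurable with a default of roughly 256 KB.

```python
import json
from typing import Any

# Assumed default; the ADR says "~256 KB" and leaves the exact value configurable.
INLINE_THRESHOLD_BYTES = 256 * 1024


def describe_artifact(job_id: str, name: str, content_type: str, payload: Any,
                      threshold: int = INLINE_THRESHOLD_BYTES) -> dict:
    """Build the artifact entry for the done webhook and GET /jobs/{id}."""
    body = json.dumps(payload).encode("utf-8")
    small = len(body) < threshold  # "under the threshold" means inline
    return {
        "name": name,
        "content_type": content_type,
        "size": len(body),
        # Exactly one of inline/url is populated.
        "inline": payload if small else None,
        "url": None if small else f"/jobs/{job_id}/artifacts/{name}",
    }
```

Because the same entry is used for the webhook and for `GET /jobs/{id}`, a consumer sees identical metadata on both paths.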
Retention
Artifacts are stored alongside the job in SQLite (ADR-0008) and pruned with the job after a configurable TTL. There is no separate blob store in v1; revisit only if artifact sizes outgrow SQLite's comfort zone (single-digit MB).
Consequences
- One uniform way to read output (the `completion` artifact), extensible to richer jobs later without protocol changes.
- Webhook bodies stay small; large outputs don't bloat or break delivery.
- A pull endpoint for artifacts means a missed/oversized webhook never loses data.
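On the consumer side, the uniform access pattern could reduce to a single helper that prefers inline content and otherwise pulls from the artifact endpoint. This is a sketch under assumptions: `read_artifact` is a hypothetical name, and the `fetch` callable stands in for whatever HTTP client the consumer uses, so the example stays free of network calls.

```python
import json
from typing import Any, Callable


def read_artifact(entry: dict, fetch: Callable[[str], bytes]) -> Any:
    """Return an artifact's content, using inline data when it is present."""
    if entry.get("inline") is not None:
        return entry["inline"]
    url = entry.get("url")
    if url is None:
        raise ValueError(
            f"artifact {entry.get('name')!r} has neither inline data nor a url"
        )
    raw = fetch(url)  # e.g. GET /jobs/{id}/artifacts/{name}
    # Decode JSON artifacts; hand back raw bytes for every other content type.
    if entry.get("content_type") == "application/json":
        return json.loads(raw)
    return raw
```

The same helper works whether the entry came from a webhook or from polling `GET /jobs/{id}`, so a consumer that missed the webhook can recover by polling and calling it unchanged.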
Alternatives considered
- Always inline. Simple but risks huge webhook bodies and SQLite row bloat in the hot path. Rejected.
- External object store (S3/MinIO) from day one. Over-engineered for the expected sizes; deferred behind the TTL/threshold knobs.