# ADR-0006: Artifact handling and transport
**Status:** Accepted — 2026-05-23
## Context
Jobs must "transmit artifacts when done." For a chat completion the obvious
artifact is the assistant's text/tool-call output, but the term is deliberately
broader: a job may produce structured data, multiple named outputs, or content
too large to embed comfortably in a webhook body.
## Decision
An **artifact** is a named, typed blob attached to a completed job:
```json
{ "name": "completion", "content_type": "application/json", "size": 1234,
"inline": { ... }, "url": null }
```
- The primary completion is always emitted as an artifact named `completion`
(the native-Ollama response shape), so there is one consistent access pattern.
- Additional artifacts use distinct names.
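
As a rough illustration of the shape above, the artifact can be modelled as a small record type. This is a minimal sketch, not the project's actual code: the class name `Artifact`, the helpers `completion_artifact` and `check_unique_names`, and the choice to measure `size` as the length of the UTF-8 JSON encoding are all assumptions made for the example.

```python
import json
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Artifact:
    """Hypothetical in-memory form of the artifact JSON shown above."""
    name: str
    content_type: str
    size: int                     # assumed: payload size in bytes
    inline: Optional[Any] = None  # populated when delivered inline
    url: Optional[str] = None     # populated when fetched on demand


def completion_artifact(response: dict) -> Artifact:
    """Wrap a completion response as the artifact named `completion`."""
    body = json.dumps(response).encode("utf-8")
    return Artifact(
        name="completion",
        content_type="application/json",
        size=len(body),
        inline=response,
    )


def check_unique_names(artifacts: List[Artifact]) -> None:
    """Raise ValueError if two artifacts on one job share a name."""
    seen = set()
    for artifact in artifacts:
        if artifact.name in seen:
            raise ValueError(f"duplicate artifact name: {artifact.name!r}")
        seen.add(artifact.name)
```

Keeping the primary output under a fixed name means a consumer can look up `completion` without knowing what else a job produced.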
### Transport: inline vs fetch
- **Small artifacts** (under a configurable threshold, default ~256 KB) are
delivered **inline** in the `done` webhook (`inline` populated, `url` null) and
in `GET /jobs/{id}`.
- **Large artifacts** (over the threshold) are delivered by reference: the webhook and `GET /jobs/{id}` carry metadata
plus a `url` (`GET /jobs/{id}/artifacts/{name}`), and the bytes are fetched
on demand. This keeps webhook payloads bounded and avoids shipping megabytes
through a callback POST.
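
The inline-versus-fetch rule can be sketched as a single function. The function name `describe_artifact`, the constant `INLINE_THRESHOLD_BYTES`, and the reading of "~256 KB" as 256 * 1024 bytes are assumptions for illustration. The sketch also assumes payloads are JSON or UTF-8 text, and treats a payload exactly at the threshold as large, since only artifacts *under* the threshold are described as inline.

```python
import json

# Assumed interpretation of the "~256 KB" default; configurable in practice.
INLINE_THRESHOLD_BYTES = 256 * 1024


def describe_artifact(job_id: str, name: str, content_type: str,
                      payload: bytes,
                      threshold: int = INLINE_THRESHOLD_BYTES) -> dict:
    """Build the artifact entry carried by the webhook or GET /jobs/{id}."""
    size = len(payload)
    meta = {"name": name, "content_type": content_type, "size": size}

    if size < threshold:
        # Small: embed the content, leave url null.
        if content_type == "application/json":
            inline = json.loads(payload)
        else:
            inline = payload.decode("utf-8")
        return {**meta, "inline": inline, "url": None}

    # Large: metadata plus a pointer to the pull endpoint.
    return {**meta, "inline": None,
            "url": f"/jobs/{job_id}/artifacts/{name}"}
```

With this shape, the size of a webhook body is bounded by the threshold times the number of artifacts, plus a small amount of metadata per artifact.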
### Retention
Artifacts are stored alongside the job in SQLite (ADR-0008) and pruned with the
job after a configurable TTL. v1 has no separate blob store; revisit only if
artifacts outgrow the single-digit-MB range that SQLite handles comfortably.
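
One way to get "pruned with the job" is to let a foreign key cascade remove artifact rows when the job row is deleted. The sketch below is illustrative only: the table and column names (`jobs`, `artifacts`, `finished_at`, `body`) and the function `prune_expired` are hypothetical, and the real schema is whatever ADR-0008 defines.

```python
import sqlite3

# Hypothetical schema; the actual one is defined elsewhere (ADR-0008).
SCHEMA = """
CREATE TABLE jobs (
    id          TEXT PRIMARY KEY,
    finished_at INTEGER NOT NULL          -- unix seconds
);
CREATE TABLE artifacts (
    job_id       TEXT NOT NULL REFERENCES jobs(id) ON DELETE CASCADE,
    name         TEXT NOT NULL,
    content_type TEXT NOT NULL,
    body         BLOB NOT NULL,
    PRIMARY KEY (job_id, name)            -- names are distinct per job
);
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # SQLite leaves foreign keys off by default; the cascade needs them on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def prune_expired(conn: sqlite3.Connection, now: int, ttl_seconds: int) -> int:
    """Delete jobs older than the TTL; artifacts go with them via cascade.

    Returns the number of job rows deleted.
    """
    with conn:  # commit on success, roll back on error
        cur = conn.execute(
            "DELETE FROM jobs WHERE finished_at < ?",
            (now - ttl_seconds,),
        )
    return cur.rowcount
```

Because artifacts have no lifetime of their own in this layout, there is no second cleanup pass to forget.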
## Consequences
- One uniform way to read output (`completion` artifact), extensible to richer
jobs later without protocol changes.
- Webhook bodies stay small; large outputs don't bloat or break delivery.
- A pull endpoint for artifacts means a missed or oversized webhook does not lose data: the artifact stays retrievable until the job is pruned.
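
From the consumer's side, the uniform access pattern might look like the sketch below. It assumes a job body that exposes its artifacts as a list under an `artifacts` key, which this ADR does not specify; `find_artifact`, `read_artifact`, and the injected `fetch` callable are likewise hypothetical names, with `fetch` standing in for whatever HTTP client performs the `GET`.

```python
from typing import Any, Callable


def find_artifact(job: dict, name: str = "completion") -> dict:
    """Return the artifact entry with the given name from a job body."""
    for artifact in job.get("artifacts", []):  # "artifacts" key is assumed
        if artifact["name"] == name:
            return artifact
    raise KeyError(name)


def read_artifact(artifact: dict, fetch: Callable[[str], Any]) -> Any:
    """Return the artifact content, inline if present, else via its url."""
    if artifact.get("inline") is not None:
        return artifact["inline"]
    url = artifact.get("url")
    if url is None:
        raise ValueError(
            f"artifact {artifact.get('name')!r} has neither inline content nor a url"
        )
    return fetch(url)
```

The same two calls work whether the job body arrived in a webhook or was polled after a missed delivery.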
## Alternatives considered
- **Always inline.** Simple but risks huge webhook bodies and SQLite row bloat in
the hot path. Rejected.
- **External object store (S3/MinIO) from day one.** Over-engineered for the
expected sizes; deferred behind the TTL/threshold knobs.