feat(llamaswap): add llama-swap provider + canonical imagegen interface

Add provider/llamaswap, a tailored provider for llama-swap (the model-swapping proxy over llama.cpp / stable-diffusion.cpp). Its chat path delegates to provider/openai at {base}/v1, with no duplicated wire client (ADR-0007), legacy max_tokens, a Bearer no-key placeholder for keyless local instances, and a timeout-free client so cold model swaps rely on context deadlines. The "tailored" surface is concrete management methods (ListModels / Running / Unload) that don't belong on the canonical llm.Provider interface. The llama-swap:// DSN scheme builds an http base URL (local-first); a no-URL built-in errors clearly on use, mirroring foreman.

Add imagegen, a new canonical text-to-image interface separate from llm (Request/Result/Model/Provider; Image = llm.ImagePart so generated images feed straight back into chat). First backend is llama-swap via OpenAI /v1/images/generations (b64_json, bytes-only). Re-exported from the root. v1 is txt2img only.

Hermetic httptest coverage for chat delegation, management endpoints, image decode, and scheme wiring. ADR-0015 + ADR-0016, README support matrix + image-gen section, CLAUDE.md package map, and progress.md updated in the same commit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 15:01:54 -04:00
parent 1fd7109a42
commit 96c612e707
14 changed files with 994 additions and 7 deletions
@@ -0,0 +1,44 @@
+# ADR-0016: imagegen — a canonical text-to-image interface
+
+**Status:** Accepted — 2026-06-27
+
+## Context
+
+mort needs to generate images (via llama-swap's stable-diffusion.cpp backend),
+and majordomo had no image-generation surface. Image generation does not fit
+the chat contract: there are no conversation messages, tools, streaming, or
+failover-chain semantics — forcing it through `llm.Request`/`llm.Response`/
+`llm.Model` would overload that contract with mostly-unused fields. The user
+asked for "a new ai image interface as opposed to llm".
+
+## Decision
+
+- A new canonical **leaf package `imagegen`**, parallel to `llm`, re-exported
+  from the root (`ImageModel`, `ImageProvider`, `ImageRequest`, `ImageResult`,
+  `ImageOption`, plus `WithImageCount`/`WithImageSize`). Providers import
+  `imagegen`; mort codes to the interface, not to llama-swap.
+- Minimal v1 surface (text-to-image only):
+  - `Request{ Prompt string; N int; Size string }` — zero values mean provider
+    default (N=0 → backend default count; "" Size → backend default).
+  - `Result{ Images []Image; Raw any }`.
+  - `Model.Generate(ctx, Request, ...Option) (*Result, error)` and
+    `Provider.ImageModel(id, ...ModelOption) (Model, error)`.
+  - Functional options + `Request.Apply`, mirroring `llm`.
+- **`type Image = llm.ImagePart`** (bytes + MIME). Reusing the chat content type
+  means a generated image drops straight back into a chat turn
+  (`llm.UserParts(res.Images[0])`) with no conversion — the key interop win.
+- Out of scope for v1 (designed-for, deferred): image edits / img2img, the raw
+  A1111 SDAPI, masks/seeds/steps, streaming, and registry-level image-model DSN
+  resolution (construct the provider directly for now).
+- First implementation: `provider/llamaswap`, targeting OpenAI
+  `/v1/images/generations` with `response_format: "b64_json"` (bytes inline; we
+  never fetch remote URLs — mirrors `ImagePart`'s bytes-only contract).
+
+## Consequences
+
+- Image generation is provider-agnostic from day one; a future OpenAI DALL·E or
+  Gemini image backend implements the same interface.
+- The narrow interface keeps the door open for richer requests without breaking
+  callers (additive fields/options).
+- No health/failover for image models yet; if needed it can be added as a
+  separate chain type rather than retrofitting the chat chain.