The OpenAI /v1/images/generations endpoint ignores `seed` on our
stable-diffusion.cpp build: every render of a given prompt comes back
byte-identical, so a drawbot batch of N collapsed to N copies of one image.
Switch the image provider to sd-server's A1111 /sdapi/v1/txt2img endpoint,
which honors `seed` (verified live: distinct seeds -> distinct images on SDXL
and Qwen-Image). Size is now split into separate width and height fields;
llama-swap still routes by the `model` field. Tests + ADR-0016 updated.
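A minimal sketch of the size split and per-image seeding described above. The
type and function names (txt2imgPayload, splitSize, batchPayloads) are
hypothetical, and deriving seeds as base, base+1, ... is an assumption made for
illustration, not necessarily what the provider does.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// txt2imgPayload is an illustrative subset of an A1111-style txt2img body;
// it is not the provider's real wire type.
type txt2imgPayload struct {
	Model  string `json:"model"` // llama-swap routes on this field
	Prompt string `json:"prompt"`
	Width  int    `json:"width"`
	Height int    `json:"height"`
	Seed   int64  `json:"seed"`
}

// splitSize parses a "WIDTHxHEIGHT" string into its two dimensions.
func splitSize(size string) (w, h int, err error) {
	ws, hs, ok := strings.Cut(size, "x")
	if !ok {
		return 0, 0, fmt.Errorf("size %q: want WIDTHxHEIGHT", size)
	}
	if w, err = strconv.Atoi(ws); err != nil {
		return 0, 0, fmt.Errorf("size %q: bad width: %w", size, err)
	}
	if h, err = strconv.Atoi(hs); err != nil {
		return 0, 0, fmt.Errorf("size %q: bad height: %w", size, err)
	}
	if w <= 0 || h <= 0 {
		return 0, 0, fmt.Errorf("size %q: dimensions must be positive", size)
	}
	return w, h, nil
}

// batchPayloads builds n payloads that differ only in seed (assumed scheme:
// base, base+1, ...), so each render in a batch can come back distinct.
func batchPayloads(model, prompt, size string, base int64, n int) ([]txt2imgPayload, error) {
	if n < 0 {
		return nil, fmt.Errorf("n must not be negative, got %d", n)
	}
	w, h, err := splitSize(size)
	if err != nil {
		return nil, err
	}
	out := make([]txt2imgPayload, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, txt2imgPayload{
			Model: model, Prompt: prompt, Width: w, Height: h, Seed: base + int64(i),
		})
	}
	return out, nil
}

func main() {
	ps, err := batchPayloads("sdxl", "a lighthouse at dusk", "1024x768", 42, 3)
	if err != nil {
		panic(err)
	}
	for _, p := range ps {
		b, _ := json.Marshal(p)
		fmt.Println(string(b))
	}
}
```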
Add Steps, CFGScale, NegativePrompt, Sampler, Seed to imagegen.Request
(a nil pointer or empty string leaves the backend's per-model default in
place), with a matching option for each field, and forward them in the
llamaswap wire payload as the stable-diffusion.cpp fields
(steps/cfg_scale/negative_prompt/sample_method/seed). Unset fields are
omitted so sd-server keeps its baked defaults. This lets callers (e.g. mort
drawbots) override only the fields they explicitly set.
- Unload: reject model ids containing URL delimiters (/, ?, #) so a model name
can't redirect the request to another endpoint; ":" (common in ids) is passed
through verbatim.
- doJSON: take a model arg so image/management HTTP errors carry the target id
(was always ""); add a base-URL guard so management methods fail clearly
instead of building a bare-path request; cap the success-path JSON decode with
io.LimitReader (64 MiB) and drain the body when out is nil so the connection
can be reused.
- image: reject negative Request.N before sending.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add provider/llamaswap, a tailored provider for llama-swap (the model-swapping
proxy over llama.cpp / stable-diffusion.cpp). Its chat path delegates to
provider/openai at {base}/v1 — no duplicated wire client (ADR-0007) — with
legacy max_tokens, a Bearer no-key placeholder for keyless local instances, and
a timeout-free client so cold model swaps rely on context deadlines. The
"tailored" surface is concrete management methods (ListModels / Running /
Unload) that don't belong on the canonical llm.Provider interface. The
llama-swap:// DSN scheme builds an http base URL (local-first); a no-URL
built-in errors clearly on use, mirroring foreman.
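A rough sketch of how a llama-swap:// DSN might map to an http base URL. The
exact DSN syntax is not given above, so the "llama-swap://host:port" form, the
function name baseURLFromDSN, and the error wording are all assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// baseURLFromDSN turns an assumed "llama-swap://host:port" DSN into an
// "http://host:port" base URL. A DSN with no host is rejected so a
// misconfigured provider fails with a clear message.
func baseURLFromDSN(dsn string) (string, error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return "", fmt.Errorf("llama-swap: parse DSN: %w", err)
	}
	if u.Scheme != "llama-swap" {
		return "", fmt.Errorf("llama-swap: unexpected scheme %q", u.Scheme)
	}
	if u.Host == "" {
		return "", errors.New("llama-swap: no base URL configured")
	}
	// Local-first: plain http, since the target is usually a local proxy.
	return "http://" + u.Host, nil
}

func main() {
	base, err := baseURLFromDSN("llama-swap://localhost:8080")
	fmt.Println(base, err)

	_, err = baseURLFromDSN("llama-swap://")
	fmt.Println(err)
}
```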
Add imagegen, a new canonical text-to-image interface separate from llm
(Request/Result/Model/Provider; Image = llm.ImagePart so generated images feed
straight back into chat). First backend is llama-swap via OpenAI
/v1/images/generations (b64_json, bytes-only). Re-exported from the root. v1 is
txt2img only.
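For illustration, decoding a b64_json images response into raw bytes can be
sketched as follows. The response shape follows the OpenAI images API; the
imagesResponse type and decodeImages function are hypothetical names, not the
package's real ones.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// imagesResponse covers only the part of the response this sketch needs.
type imagesResponse struct {
	Data []struct {
		B64JSON string `json:"b64_json"`
	} `json:"data"`
}

// decodeImages returns the decoded bytes of every image in the response.
func decodeImages(body []byte) ([][]byte, error) {
	var resp imagesResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode images response: %w", err)
	}
	out := make([][]byte, 0, len(resp.Data))
	for i, d := range resp.Data {
		raw, err := base64.StdEncoding.DecodeString(d.B64JSON)
		if err != nil {
			return nil, fmt.Errorf("image %d: bad base64: %w", i, err)
		}
		out = append(out, raw)
	}
	return out, nil
}

func main() {
	// "aGk=" is the base64 encoding of the two bytes "hi".
	imgs, err := decodeImages([]byte(`{"data":[{"b64_json":"aGk="}]}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(len(imgs), string(imgs[0]))
}
```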
Hermetic httptest coverage for chat delegation, management endpoints, image
decode, and scheme wiring. ADR-0015 + ADR-0016, README support matrix +
image-gen section, CLAUDE.md package map, and progress.md updated in the same
commit.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Phase 8: all six live checks pass (tier aliases, thinking-tier chat, real
tool invocation, structured Generate[T], forced failover with bench+skip,
skill agent). Discovery: ollama.com ignores the format field, so the
provider now also states the schema as a system instruction (constrained
decoding locally, instruction-guided JSON on cloud), with a hermetic test.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>