feat(llamaswap): add llama-swap provider + canonical imagegen interface

Add provider/llamaswap, a tailored provider for llama-swap (the model-swapping
proxy over llama.cpp / stable-diffusion.cpp). Its chat path delegates to
provider/openai at {base}/v1, so no wire client is duplicated (ADR-0007). It
sends the legacy max_tokens field, uses a "Bearer no-key" placeholder for
keyless local instances, and uses a timeout-free HTTP client so that cold model
swaps rely on context deadlines rather than a client-side timeout. The
"tailored" surface is a set of concrete management methods (ListModels /
Running / Unload) that don't belong on the canonical llm.Provider interface.
The llama-swap:// DSN scheme builds an http base URL (local-first); a built-in
with no URL configured returns a clear error on use, mirroring foreman.

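As an illustration of the DSN handling described above, here is a minimal
sketch, not the code in this commit. The names parseLlamaSwapDSN, endpoint, and
noKey are hypothetical, and the literal placeholder value "no-key" is an
assumption based on the wording of this message.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// endpoint is a hypothetical holder for what a llama-swap DSN resolves to.
type endpoint struct {
	BaseURL string // plain-http base, e.g. http://localhost:8080
	Token   string // Bearer credential, or the placeholder when none is given
}

// noKey is the assumed placeholder sent as "Bearer no-key" to keyless instances.
const noKey = "no-key"

// parseLlamaSwapDSN turns llama-swap://[token@]host[/path] into an http base
// URL plus a token. It returns an error when the DSN carries no host.
func parseLlamaSwapDSN(dsn string) (endpoint, error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return endpoint{}, fmt.Errorf("llama-swap: parse DSN: %w", err)
	}
	if u.Scheme != "llama-swap" {
		return endpoint{}, fmt.Errorf("llama-swap: unexpected scheme %q", u.Scheme)
	}
	if u.Host == "" {
		return endpoint{}, fmt.Errorf("llama-swap: DSN has no host")
	}
	token := noKey
	if u.User != nil && u.User.Username() != "" {
		token = u.User.Username()
	}
	base := url.URL{Scheme: "http", Host: u.Host, Path: strings.TrimRight(u.Path, "/")}
	return endpoint{BaseURL: base.String(), Token: token}, nil
}

func main() {
	ep, err := parseLlamaSwapDSN("llama-swap://localhost:8080")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// The chat path would then target {base}/v1.
	fmt.Println(ep.BaseURL+"/v1", ep.Token)
}
```

Under these assumptions a keyless DSN still yields a non-empty Authorization
value, which matters for clients that refuse to send an empty API key.
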
Add imagegen, a new canonical text-to-image interface separate from llm
(Request/Result/Model/Provider; Image = llm.ImagePart so generated images feed
straight back into chat). The first backend is llama-swap via the OpenAI-style
/v1/images/generations endpoint (b64_json responses, bytes only). The types are
re-exported from the root package. v1 is txt2img only.

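To make the b64_json decode step concrete, here is a minimal sketch assuming
the standard OpenAI images response shape ({"data":[{"b64_json":"..."}]}). The
names imagesResponse and decodeImages are hypothetical and are not taken from
this commit.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// imagesResponse mirrors only the part of an OpenAI-style
// /v1/images/generations response that carries base64 image data.
type imagesResponse struct {
	Data []struct {
		B64JSON string `json:"b64_json"`
	} `json:"data"`
}

// decodeImages returns the raw bytes of every image in the response body.
func decodeImages(body []byte) ([][]byte, error) {
	var resp imagesResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("imagegen: decode response: %w", err)
	}
	if len(resp.Data) == 0 {
		return nil, fmt.Errorf("imagegen: response contains no images")
	}
	out := make([][]byte, 0, len(resp.Data))
	for i, d := range resp.Data {
		raw, err := base64.StdEncoding.DecodeString(d.B64JSON)
		if err != nil {
			return nil, fmt.Errorf("imagegen: image %d: invalid base64: %w", i, err)
		}
		out = append(out, raw)
	}
	return out, nil
}

func main() {
	// "aGVsbG8=" is base64 for the five bytes "hello"; a real response
	// would carry encoded PNG or JPEG data instead.
	body := []byte(`{"data":[{"b64_json":"aGVsbG8="}]}`)
	imgs, err := decodeImages(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d image(s), first is %d bytes\n", len(imgs), len(imgs[0]))
}
```

Returning plain bytes keeps the result independent of any URL-based delivery,
which matches the bytes-only note above.
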
Add hermetic httptest coverage for chat delegation, management endpoints, image
decode, and scheme wiring. ADR-0015 and ADR-0016, the README support matrix and
image-gen section, the CLAUDE.md package map, and progress.md are updated in
the same commit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -32,6 +32,8 @@ mort-agnostic: no mort types, no Discord, no mort config.
 majordomo            Registry, Parse, env-DSN loading, chain executor, re-exports
 llm/                 canonical contract: Message/Part/Request/Response/Option,
                      Tool/Toolbox, Capabilities, Stream, Model, Provider, errors
+imagegen/            canonical text-to-image contract: Request/Result/Model/
+                     Provider (separate from llm; Image = llm.ImagePart) (ADR-0016)
 health/              clock-injected health tracker (bench/backoff)
 media/               image normalization to target capabilities (sniff real
                      format, downscale, transcode, byte ladder; ErrUnsupported
@@ -41,6 +43,8 @@ majordomo Registry, Parse, env-DSN loading, chain executor, re-exports
 provider/anthropic/  Messages API client (+ Anthropic-compat targets)
 provider/ollama/     one native /api/chat client serving the ollama,
                      ollama-cloud, and foreman built-ins via presets
+provider/llamaswap/  llama-swap proxy: chat delegates to provider/openai,
+                     plus management methods + imagegen image client (ADR-0015)
 provider/google/     Gemini on google.golang.org/genai (the one approved
                      dependency; lazy client, raw-JSON-schema tools,
                      ThinkingLevel reasoning, iter.Pull2 streaming)
@@ -75,10 +79,12 @@ alias := bare token (no slash), expands INLINE, recursively, cycle-checked
 `LLM_<NAME>=scheme://[token@]host[/path]` — e.g.
 `LLM_M5=foreman://token@foreman-m5.example` defines provider `m5`; then
 `m5/qwen3:30b` works in Parse, chains, and aliases. Scheme ∈ {foreman,
-ollama, ollama-cloud, openai, anthropic, google, gemini} ∪ RegisterScheme.
-Token = credential; base URL = `https://host` always. `New()` scans the
-process env eagerly; unknown names also resolve lazily at Parse time
-(`my-prov` → `LLM_MY_PROV`). Malformed entries fail on use, not at startup.
+ollama, ollama-cloud, openai, anthropic, google, gemini, llama-swap} ∪
+RegisterScheme. Token = credential; base URL = `https://host` always —
+**except `llama-swap`, which builds `http://host` (local-first; ADR-0015).**
+`New()` scans the process env eagerly; unknown names also resolve lazily at
+Parse time (`my-prov` → `LLM_MY_PROV`). Malformed entries fail on use, not at
+startup.
 
 ## Health & failover (ADR-0006, ADR-0008)
 