fix(llamaswap): use A1111 /sdapi/v1/txt2img so seed is honored #10
Reference in New Issue
Block a user
Delete Branch "fix/llamaswap-a1111-seed"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
sd-server's OpenAI
/v1/images/generationsendpoint ignoresseedon our stable-diffusion.cpp build — every render of a given prompt is byte-identical, so a drawbot batch of N collapses to one image (confirmed live: seeds 111 vs 222 → identical md5 on both SDXL and Qwen-Image).Switch the image provider to sd-server's A1111
/sdapi/v1/txt2imgendpoint, which honorsseed(verified live: distinct seeds → distinct images, different md5 and byte sizes).Sizeis split intowidth/height; llama-swap still routes by themodelfield in the body. Optional fields stay pointer/omitempty so unset values fall back to each model's baked--steps/--cfg-scaledefaults. Tests + ADR-0016 updated.🪰 Gadfly — live review status
7/7 reviewers finished · updated 2026-06-29 03:08:20Z
claude-code/opus· claude-code — ✅ doneclaude-code/opus:max· claude-code — ✅ donedeepseek-v4-pro:cloud· ollama-cloud — ✅ donedevstral-2:123b-cloud· ollama-cloud — ✅ doneglm-5.2:cloud· ollama-cloud — ✅ donekimi-k2.6:cloud· ollama-cloud — ✅ doneqwen3.5:397b-cloud· ollama-cloud — ✅ doneLive status board. Findings are posted in each model's own comment. Advisory only — does not block merge.
🪰 Gadfly review —
claude-code/opus(claude-code)Verdict: Minor issues — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
I've verified the full request/response path. Here's my security review.
VERDICT: No material issues found
I traced the change through the Security lens against the checked-out code:
"/sdapi/v1/txt2img"(image.go:85), and the request body is built as a typed struct andjson.Marshal-ed indoJSON(llamaswap.go:194). User/agent-controlled fields (Prompt,NegativePrompt,Model=m.id) ride in the JSON body where the encoder escapes them — none of them reach the URL path or a shell. Verified by readingdoJSON(llamaswap.go:188-227).p.baseURLcomes from provider config (WithBaseURL/LLM_*DSN), not from per-request untrusted input, andpathis a literal. Request target =p.baseURL+pathwith a fixed suffix.resp.Imagescomes from the upstream sd-server, but the response is decoded throughio.LimitReader(resp.Body, maxResponseBytes)(llamaswap.go:219), and each base64 entry is decoded withbase64.StdEncoding.DecodeStringwith the error surfaced (image.go:94-97). The newcontinue-on-empty (image.go:91) only changes which-entry behavior, not a security boundary. MIME is sniffed locally, not trusted from the wire (sniffImageMIME).parseSize. Width/height are parsed withstrconv.Atoiand rejected if non-numeric or<= 0(image.go:121-124); empty input yields(nil,nil). No negative/zero values reach the wire. There's no explicit upper bound, but oversized dimensions are a resource concern for the downstream renderer (performance lens), not an injection/auth issue — and the value is still just a JSON integer.Authorization: Bearerheader logic indoJSONis untouched; nothing new is logged or echoed.Nothing in the authn/authz, injection, SSRF, deserialization, or secret-leakage areas is materially wrong in this diff.
🎯 Correctness — Minor issues
I've verified both findings against the actual source.
Finding 1 (
batch_countexternal contract): The draft itself concedes this is "Unverified — external HTTP contract, not checkable from the repo." It cannot be positively confirmed against the code. Moreover, its supporting argument is misleading: the target is stable-diffusion.cpp'ssd-server, not stock AUTOMATIC1111, andbatch_countis in fact the field sd-server's A1111-compatible handler accepts. This is speculation about external backend behavior, not a defect confirmable from the repo. Dropped.Finding 2 (under-count silently accepted): Verified by reading
image.go:90-107— the loop appends only non-empty entries and errors only whenlen(out.Images) == 0. Factually accurate and confirmed.VERDICT: Minor issues
Findings (Correctness lens)
provider/llamaswap/image.go:90-107only errors when the result contains zero images (len(out.Images) == 0). A caller requestingNimages that gets back fewer thanNreceives a successfulResultwith too few images and no error. This property existed on the prior OpenAI path too, so the diff doesn't introduce it, but the switch to A1111batch_countmakes a collapsed/partial batch a more plausible outcome. Consider asserting the returned count against the requestedNwhenN>1. (Note:imagegen.RequestdocumentsNloosely —N==0means "provider default, usually one" — so this is a defensive-validation suggestion, not a contract violation.)Checked and clean
parseSize(image.go:112-127): correct —"WxH"split,ToLowerhandles512X512, rejects non-positive/non-numeric, and""→(nil, nil)so the model's baked resolution applies. Consistent withimagegen.Request.Sizesemantics (imagegen/imagegen.go:38-40).Seed *int64is forwarded (omitempty) to an endpoint that honors it; nil omits → backend random. Consistent with the stated fix./sdapi/v1/txt2imgmatching the new path and the updated tests.txt2imgResponsereading onlyimagesis fine; extra A1111 fields (info/parameters) are harmlessly ignored.🧹 Code cleanliness & maintainability — Minor issues
I've verified all three findings against the actual source. Findings 1 and 2 are confirmed verbatim. Finding 3 is not a confirmable defect — the comment reads "pointers/omitempty," which already parses as "pointer fields or omitempty value fields," so it is not actually inaccurate; I'm dropping it.
VERDICT: Minor issues
Stale package doc now contradicts the implementation —
imagegen/imagegen.go:13-14still reads "The first implementation is provider/llamaswap, which targets the OpenAI/v1/images/generationsendpoint routed to a stable-diffusion.cpp backend." This PR is the change that makes that false: the code now POSTs/sdapi/v1/txt2img(provider/llamaswap/image.go:56,85). CLAUDE.md makes "docs match reality in the same commit that changes behavior" a credibility promise; the PR updated ADR-0016 but missed this package-level comment. Fix: update the sentence to reference the A1111/sdapi/v1/txt2imgendpoint (or drop the endpoint specifics, since it's a leaky implementation detail in a contract-package doc).Trivial: duplicated error literal in
parseSize—provider/llamaswap/image.go:119and:124repeatfmt.Errorf("invalid size %q (want WxH)", size)verbatim. Collapsing the two failure paths into one shared error removes the copy-paste. Low priority.Notes:
imagegen/,provider/, andmedia/;parseSizeis the only one, so introducing it locally is reasonable — no duplication of an existing utility.parseSizehelper, naming, and field reordering otherwise read cleanly. The ADR change uses the append-only "Update" section, which is the repo's documented convention — correct, not a nit.b64continue-then-error-if-all-empty change (image.go:90-106) is an error-handling/behavior shift left to the error-handling lens.⚡ Performance — No material issues found
I've verified the full changed file. This is a like-for-like endpoint/shape swap; let me assess it strictly for performance.
VERDICT: No material issues found
Performance-lens assessment (verified by reading
provider/llamaswap/image.goat the PR checkout):Generatestill does onedoJSONround-trip (now to/sdapi/v1/txt2img) and one pass over the returned images. No new network calls, no N+1, no added round-trips.image.go:84-87parseSizeis O(1) and runs once per request — a singleSplitN(..., 2)and twoAtoi. Not on any loop or hot path.image.go:112-127req.N, validated>= 0), same as before. Base64 decode + MIME sniff per image is inherent to returning bytes, not a regression.image.go:90-99BatchCount: req.Nsimply relabels the previously-sentNfield; no change to how much work is requested upstream. Any per-image generation cost is the intended behavior of the fix (distinct seeds now actually produce distinct renders), not avoidable overhead introduced here.Minor non-issue I considered and am not flagging as material:
out.Imagesis appended without a pre-sizedmake(..., 0, len(resp.Images)). With image batches being small (single digits typically) and dwarfed by GPU render + base64 decode cost, the reallocations are negligible — a true micro-optimization, below the bar.Nothing in the performance lens is materially wrong here.
🧯 Error handling & edge cases — No material issues found
I've verified the actual checked-out code, which includes lines the truncated diff hunk didn't show (the post-loop
len(out.Images) == 0guard atimage.go:100-106). That changes the analysis materially, so reading the file was necessary.VERDICT: No material issues found
Reviewed through the error-handling & edge-cases lens. The unhappy paths this diff introduces are handled well:
image.go:59) and negative N (image.go:62) still rejected up front — unchanged, good.parseSizeedge cases (image.go:112-127) — verified robust:""→(nil, nil, nil), correctly falling back to model default.WxHinput ("512","512x512x512"→ SplitN gives"512x512"whichAtoirejects,"512x"→ empty part) all produce a cleanErrUnsupportederror rather than a panic or silent garbage.w <= 0 || h <= 0; integer overflow caught bystrconv.Atoireturningerr1/err2. No off-by-one or panic surface.image.go:100-106): the loop nowcontinues past emptyb64strings, but the post-looplen(out.Images) == 0guard returns anAPIError. So a response of{"images":[]}or{"images":[""]}fails loudly rather than returning a hollow success. This was not visible in the diff hunk but is the real safety net replacing the old per-entry error — confirmed by reading the file.image.go:94-97) still returns a wrapped error per image; indexiis the slice index, correct.One deliberate behavior change worth being aware of (not a defect): partial-empty responses are now tolerated — if the backend returns N slots with some empty, the caller silently gets fewer than N images instead of an error (old code errored on the first empty entry). For A1111
batch_countthis is defensible defensive handling, and the all-empty case is still an error, so I don't consider it a material edge-case bug.No ignored errors, no missing cleanup (no resources to release here —
doJSONowns the HTTP body), and no panic-on-bad-input paths in the changed code.Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 2m 46s
🪰 Gadfly review —
glm-5.2:cloud(ollama-cloud)Verdict: Minor issues — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
Verdict: No material issues found
I traced the change end-to-end through the security lens and verified each tainted input path against the checked-out code.
Request construction (
image.go:71-82):Prompt,NegativePrompt,Sampler,Seed,Steps,CFGScaleare all placed into a struct and marshaled byjson.MarshalinsidedoJSON(llamaswap.go:194). No string interpolation into the URL, body, or a query — so no injection surface. The endpoint path/sdapi/v1/txt2imgis a hardcoded constant, not user-derived, so no path traversal/SSF vector is introduced by the endpoint switch.modelfield routing (image.go:73):m.id(caller-supplied atImageModel) is sent only as a JSON body field, never interpolated into the URL path. Verified the URL is built purely asp.baseURL+path(llamaswap.go:200) withpathconstant. No route-injection.Sizeparsing (image.go:112-127): newparseSizevalidates withstrconv.Atoi(digits-only) and rejectsw <= 0 || h <= 0, so only positive integers reachwidth/height. Malformed sizes returnErrUnsupportedrather than being passed through. This is stricter than the prior code, which forwarded an opaqueSizestring. Good.Auth (
llamaswap.go:207-209): the bearer token is applied to the new endpoint exactly as before; no authn/authz regression from the endpoint change.Response handling (
image.go:90-98,llamaswap.go:219): responses are read viaio.LimitReader(resp.Body, maxResponseBytes)(64 MiB cap,llamaswap.go:44), decoded withencoding/jsoninto a plain struct (no customUnmarshalJSON), thenbase64.StdEncoding.DecodeString. No unsafe deserialization, no second fetch of remoteurlcontent (the old URL-only branch was removed entirely), andsniffImageMIMEuseshttp.DetectContentTypeon the decoded bytes. No execution/reflective path.batch_count(image.go:81):req.Nis validated>= 0atimage.go:62; withomitempty,N=0omits the field (backend default),N>0forwards it. No unbounded/amplification-by-attacker issue beyond the existing caller-controlled contract.No security-relevant problems in this PR.
🎯 Correctness — Minor issues
VERDICT: Minor issues
provider/llamaswap/image.go:81—BatchCount: req.Ncombined with a single fixedSeed(image.go:75) could re-collapse a batch to identical images, depending on sd-server backend behavior not present in this repo. The code is confirmed: onetxt2imgRequestsends oneSeed *int64plusBatchCount int = req.N, so a caller that sets bothWithSeed(s)andWithN(k>1)issues a single request with a fixed seed and a batch count. The PR's justification (the quoted live test) only covers the single-image, distinct-seed-per-request case and does not demonstrate thebatch_count > 1+ fixed-seed case. Whether stable-diffusion.cpp's/sdapi/v1/txt2imgvaries noise per batch image or applies the literal seed to every image cannot be determined from this repository. This is not a confirmed correctness bug, but a real semantic gap worth either documenting or guarding against (e.g., whenreq.N > 1andreq.Seed != nil, issue N separate requests withseed, seed+1, …). Suggested: add a test asserting the intended semantics, or document the constraint. Minor / unverified-impact.provider/llamaswap/image.go:90-106— no check that the decoded-image count matchesreq.N(pre-existing, not introduced by this PR). The loop (image.go:90-99) iteratesresp.Images, skips empty base64 strings, and appends; the only error path islen(out.Images) == 0(image.go:100). If the backend silently under-delivers (e.g., ignoresbatch_countand returns 1 image for anN=4request), the caller receives a short result with no error. Confirmed in the current code; this lenient behavior predates the PR, but the switch from the OpenAInfield tobatch_count(a field whose honoring by this sd-server build is not verified in-repo) makes the silent-under-delivery path more plausible. Not blocking.provider/llamaswap/image.go:68—fmt.Errorf("%w: %v", llm.ErrUnsupported, err)wraps the outer sentinel with%wbut renders the innerparseSizeerror with%v, so the parse error is not unwrappable. Confirmed:parseSizereturnsfmt.Errorf("invalid size %q (want WxH)", size)errors, andGenerateformats them via%v. Material impact is negligible (the inner error is a formatted string with no further wrapping), but for consistency the inner error should use%was well. Minor.🧹 Code cleanliness & maintainability — Minor issues
All four findings are verifiable against the actual code. Let me confirm each:
Finding 1 (image.go:90-107): Confirmed. Old code returned an explicit
*llm.APIErrorper-image with message "image %d returned no inline b64_json data" and a comment explaining the rationale. New code (lines 90-92) silentlycontinues on empty string, only reporting generic "image response contained no images" (line 104) if all empty. The "Why error rather than skip" comment is dropped with no replacement.Finding 2 (image.go:117): Confirmed.
parseSizecallsstrings.ToLower(size)before splitting on"x", enabling"512X512". Doc comment (line 110-111) says only"WxH"— case-insensitive separator not documented. The ToLower only matters for uppercase separatorX.Finding 3 (ADR lines 33-35): Confirmed. "First implementation" bullet still states OpenAI
/v1/images/generationswithresponse_format: "b64_json", now historically inaccurate given the appended Update section (lines 56-63). Not marked superseded.Finding 4 (image.go:49): Confirmed. Struct comment (lines 31-38) describes pointers/omitempty as the unset-default mechanism.
BatchCount int(line 49) withomitemptyuses zero-value semantics (0 = backend default) differently from pointer fields, but this isn't called out.imagegen.Request.N(imagegen.go:35-36) confirmsN == 0= provider default. All findings survive verification.All four findings are cleanliness/maintainability nits (explicitly the reviewer's lens), not correctness blockers. The verdict "Minor issues" is appropriate.
Corrected review
VERDICT: Minor issues
provider/llamaswap/image.go:90-107— The old code surfaced an explicit*llm.APIErrorfor ab64_json-less entry with a helpful "image %d returned no inline b64_json data" message; the new code silentlycontinues on an empty string and only later reports a generic "image response contained no images" (line 104) if all entries were empty. The comment explaining the prior "Why error rather than skip" rationale was dropped with no replacement. From a cleanliness standpoint this is a regression in self-documentation: a reader can no longer tell why empty entries are tolerated vs. flagged. Consider either restoring a brief comment (e.g.// A1111 may return empty slots; skip them, but error if none decode) or keeping the per-image error. Verified by reading the fullGeneratebody (image.go:90-107).provider/llamaswap/image.go:117—parseSizecallsstrings.ToLower(size)before splitting on"x", which is only meaningful if the separator could be uppercase"X"(e.g."512X512");strconv.Atoihandles digits/sign case-insensitively already. That's a real, harmless input-tolerance nicety, but it's undocumented in the function's doc comment (image.go:110-111), which says only"WxH". Minor: either mention"WxH"is case-insensitive on the separator, or drop theToLowerif you don't intend to accept"X".docs/adr/0016-imagegen-interface.md:33-35— The ADR's original "First implementation" bullet still states the provider targets OpenAI/v1/images/generationswithresponse_format: "b64_json", which is now historically inaccurate. The new "Update — A1111 txt2img endpoint" section is appended below (lines 56-63), leaving a contradictory in-place claim. For maintainability the stale bullet should be marked superseded (e.g.~~First implementation: OpenAI /v1/images/generations~~ (superseded; see Update below)), consistent with how the doc already layers "Update" sections.provider/llamaswap/image.go:49—BatchCount intusesomitempty, andreq.N == 0(the documented "provider default" sentinel perimagegen.Request.N,imagegen.go:35-36) correctly maps tobatch_countbeing omitted. However the struct comment (lines 31-38) describes pointer/omitempty fields as the unset-default mechanism and then listsBatchCountas a plainintwithomitemptywithout calling out that its zero-value semantics differ (0 = backend default, not "unset"). Minor readability nit: a one-line note thatN/batch_countuses zero-as-default rather than pointer-as-default would keep the comment honest.⚡ Performance — No material issues found
VERDICT: No material issues found
Reviewed through the performance lens. The change is a wire-format swap (OpenAI → A1111), not an algorithmic change:
parseSize(image.go:112) runs once perGeneratecall and does a singleSplitN+ twoAtoi— negligible, not on a hot loop.BatchCount: req.N(image.go:81) maps the request count 1:1 tobatch_count; the server generates N images in one request rather than the client issuing N requests, so this is if anything more efficient than a loop.Nothing material to flag in the performance lane.
🧯 Error handling & edge cases — Minor issues
VERDICT: Minor issues
provider/llamaswap/image.go:62-64,81— The guard rejectsreq.N < 0but permitsreq.N == 0. With the new A1111 wire shape,BatchCountis a plainintwithomitempty(line 49), soN == 0maps toBatchCount: 0(line 81) and the field is omitted from the request body entirely. Theimagegen.Requestcontract (imagegen/imagegen.go:29,35) explicitly promises "N == 0 yields the backend's default count (usually one)". Whether sd-server's/sdapi/v1/txt2imgtreats an omittedbatch_countas 1 (default count) or 0 is not determinable from this repo; if it defaults to 0, the code's own safety net at line 100 (len(out.Images) == 0→"image response contained no images"APIError) turns a legitimate "default count" request into a hard error. Suggested fix: mapreq.N == 0toBatchCount: 1, or document that callers must setNexplicitly for this provider. (Backend default unverified — no sd-server source in repo.)provider/llamaswap/image.go:90-93— A returned-but-empty image slot (b64"") is now silentlycontinued. The previous code explicitly returned a hardAPIError("image %d returned no inline b64_json data") for an empty entry, with a comment explaining the choice. If sd-server returns a mix of populated and empty strings (e.g. a partial mid-batch failure that yields an empty element), the caller now gets a silently-shortened result with no indication that imageiwas dropped; an error only surfaces if the whole batch collapses to empty (line 100). This is a regression in partial-failure visibility. Suggested fix: count skipped entries and surface that in the result/error, or treat a non-emptyresp.Imagesarray containing any empty string as a partial-failure error as before.provider/llamaswap/image.go:66-69—parseSizefailure is wrapped withllm.ErrUnsupported. A malformedSizeis a caller-input bug rather than an unsupported-capability condition, so the error category is semantically off; this is consistent with the file's existing use ofErrUnsupportedfor theNand prompt guards, so the mismatch is preserved rather than introduced. Note only; no action required.Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 3m 10s
🪰 Gadfly review —
devstral-2:123b-cloud(ollama-cloud)⏳ Reviewing… this comment will update with findings and run time.
🪰 Gadfly review —
deepseek-v4-pro:cloud(ollama-cloud)Verdict: Minor issues — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
No material issues found
I reviewed the full diff and the surrounding code (
provider/llamaswap/image.go,provider/llamaswap/llamaswap.go,imagegen/imagegen.go, and the test file) specifically through the security lens. Here's what I checked:Injection: All user-controlled fields (
Prompt,NegativePrompt,Size→Width/Height,Seed,Steps,CFGScale,Sampler,N→BatchCount) are serialized viajson.MarshalindoJSON, which prevents JSON injection. The backend (sd-server) is the same trusted service as before; no new interpreter boundary is introduced.Path traversal / SSRF: The endpoint path is hardcoded (
/sdapi/v1/txt2img). Themodelfield is sent in the JSON body, not interpolated into the URL.doJSONconstructs the URL asp.baseURL + pathwith no user-controlled segments. No SSRF or path-traversal vector.parseSize: The new size-parsing function usesstrconv.Atoi(safe against overflow — it errors on out-of-range), validates positivity, and handles whitespace. No injection or overflow risk.BatchCountwithomitempty: Whenreq.N == 0, the zero-valueintis omitted, deferring to the backend default. This is the documented contract. No abuse vector.Response handling: The
txt2imgResponseonly carries base64 image strings. TheRawescape hatch exposes no secrets. Empty image entries are now silently skipped (previously errored), but the "no images" sentinel error still fires if all are empty — no information leak.Auth: Unchanged.
doJSONstill sendsAuthorization: Bearer <token>(orBearer no-keyplaceholder) exactly as before.Model ID validation: The model ID is passed through verbatim (same as before), but it's only used in the JSON body for routing and in error messages — never in a URL path or shell command. The existing
Unloadpath-traversal guard for model IDs is unrelated to this code path.No material security issues introduced by this change.
🎯 Correctness — No material issues found
Verdict: No material issues found
All three findings in the draft are speculative claims about the field names and semantics expected by an external server (stable-diffusion.cpp's sd-server) that is not part of this repository. The code is internally consistent —
sample_method,batch_count, and thereq.N → BatchCountmapping are used uniformly across the struct definition, the request construction, the ADR documentation, and the unit tests. Whether the external sd-server build actually honors those exact field names is an integration concern that cannot be confirmed or refuted from the code in this repo alone. None of the findings can be positively verified, so all are dropped.🧹 Code cleanliness & maintainability — Minor issues
All three findings are confirmed against the actual code. Here is the corrected review:
Verdict: Minor issues
imagegen/imagegen.go:14— stale package doc comment. The package doc still says the first implementation "targets the OpenAI/v1/images/generationsendpoint." This PR switches to A1111/sdapi/v1/txt2img, so the comment is now factually wrong. A future reader who starts at the package doc will be misled.provider/llamaswap/image.go:90-93— silent skip of empty images replaces previous hard error. The old code returned an*llm.APIErrorwhen a response entry had nob64_jsondata. The new code silentlycontinues past empty strings inresp.Images. If the A1111 endpoint can return empty entries for partially-failed batches, the skip may be intentional — but there's no comment explaining the change in error-handling philosophy. (Note: if all images are empty, lines 100-106 still return anllm.APIError, so the silent skip only applies to partial-empty responses.)provider/llamaswap/image.go:49—BatchCountuses plainintwithomitemptywhile every other optional numeric field in the same struct uses a pointer.Steps,CFGScale,Width,Height, andSeedare all*int/*float64/*int64so thatomitemptycleanly distinguishes "caller didn't set" from "caller set zero."BatchCountis plainint, relying on the fact that 0 happens to mean "use default." This works today but is inconsistent with the established pattern in the same struct. If a future model ever treatsbatch_count=0as a meaningful value distinct from "unset," the plain-int approach would break silently.⚡ Performance — No material issues found
No material issues found.
The diff makes a single HTTP call per
Generateinvocation (unchanged), adds one trivialparseSizehelper that runs once per request (two heap-allocated*intpointers — negligible against the seconds-to-minutes latency of image generation), and the response iteration remains O(N) over a user-bounded image count. No N+1 queries, no allocations inside hot loops, no unbounded growth, no quadratic behavior introduced. ThemaxResponseBytes = 64 << 20limit on the response body is unchanged and remains generous for the base64-encoded payloads the A1111 endpoint returns.🧯 Error handling & edge cases — No material issues found
No material issues found
The error-handling and edge-case coverage in this diff is solid:
parseSize(line 112–127) correctly handles empty string →(nil, nil, nil), malformed formats, zero/negative dimensions, andAtoioverflow — all verified by reading the implementation. Thestrings.ToLower+SplitNapproach correctly handles mixed-case like"512X512".req.N < 0guard (line 62–64) is preserved.N=0maps toBatchCount=0whichomitemptydrops, letting the backend use its default — consistent with theimagegen.Requestcontract.Nil-pointer
omitemptyfields (Seed,Steps,CFGScale,Width,Height) all correctly fall through to backend defaults when unset. Verified by reading both the struct tags and theimagegen.Requestzero-value semantics.Empty
resp.Imagesslice (nil or[]) is a no-op infor range, falling through to thelen(out.Images) == 0→APIErrorguard (line 100–106). Verified.Base64 decode failure (line 94–97) returns a wrapped error — not swallowed. Verified.
continueon empty b64 string (line 91–93): The old code errored on an emptyb64_jsonentry; the new code skips empty strings and only errors if all images are empty. This is a deliberate softening — partial results are better than none, and the A1111imagesarray shouldn't contain empties in normal operation. The "all empty" case is still caught. Reasonable.doJSON(inllamaswap.go:188–227) handles non-2xx viaapiError, drains the body on nilout, and bounds reads withio.LimitReader— all pre-existing and unchanged, but confirmed the new call site at line 85 benefits from the same protections.No panics on bad input, no swallowed errors, no missing cleanup, no nil-pointer dereference paths introduced by this change.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 6m 51s
🪰 Gadfly review —
claude-code/opus:max(claude-code)Verdict: Minor issues — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
VERDICT: No material issues found
Reviewed through the 🔒 Security lens. I traced the untrusted/attacker-influenceable inputs (model id, prompt, size, and the upstream response body) to where they're used.
image.go:85POSTs to the literal"/sdapi/v1/txt2img"; the model id is carried in the JSON body (txt2imgRequest.Model,image.go:72), not interpolated into the URL or path. So this change introduces no path/route injection. (Verified by readingdoJSONatllamaswap.go:188-227: it buildsp.baseURL+pathfrom the fixed constant, and JSON-marshals the body so the model string is escaped — contrast the deliberate/?#rejection on the path-segmentUnloadroute atllamaswap.go:173, which doesn't apply here.)imageResponsecarried aurlfield and the code special-cased url-only entries; the newtxt2imgResponseisImages []string(base64 only) and the code never fetches a remote URL. Verified atimage.go:90-99.image.go:94), but the response read is capped atmaxResponseBytes = 64<<20viaio.LimitReaderindoJSON(llamaswap.go:219), so a malicious/runaway upstream can't drive unbounded allocation through this path. Decode errors are surfaced, not panicked (image.go:95-96).Sizeis sound.parseSize(image.go:112-127) rejects malformed input and non-positive dimensions, andstrconv.Atoirejects out-of-range integers — no negative/overflow values reach the wire. (Very large but valid dimensions could be a resource concern on the upstream backend, but that's a DoS/correctness matter on infra outside this library's trust boundary, not a security defect here.)Authorization: Bearertoken handling indoJSON(llamaswap.go:207-209) is untouched by this diff; no new logging of body, token, or response.Nothing in the security lane is materially wrong with this change.
🎯 Correctness — Minor issues
I've verified the draft against the actual code. The mechanical refactor claims all check out. The one substantive "Minor" finding rests on a premise the draft itself admits is unverifiable from the repo (whether sd-server honors
batch_count/sample_method) — that's speculation about an external system, not a defect confirmable against the code, so I drop it. The only confirmable residue is the test-coverage gap.VERDICT: Minor issues
Verified correct (no action):
imageRequest/imageResponse→txt2imgRequest/txt2imgResponse(image.go:39,52) has no dangling references; grep for the old names,/v1/images, andb64_jsonturns up only the unrelated openai provider and doc comments. Compiles.req.*fields forwarded (Size,N,Steps,CFGScale,Sampler,Seed,NegativePrompt) exist onimagegen.Requestwith matching types (Seed *int64,Steps *int,CFGScale *float64); pointer/omitemptyfall-through is preserved and asserted byTestImageGenerateSettings(llamaswap_test.go:212-240).parseSize(image.go:112) is correct:""→(nil,nil,nil); rejects non-WxHand non-positive dims; lowercases so512X512works;SplitN(...,2)rejects"1x2x3"via theAtoifailure.image.go:100-106returns anAPIErrorwhen zero images decode.N==0→BatchCountomitted → server default.Minor — batch path (
req.N→batch_count) is untested:image.go:49,81:BatchCount int \json:"batch_count,omitempty"`←req.Nis the new batch mechanism, but no test exercises N>1 or assertsgotBody["batch_count"]—TestImageGenerate/TestImageGenerateSettingsonly checkwidth/height/steps/cfg_scale/negative_prompt/sample_method/seed. Adding a test that sends N≥2 and assertsbatch_countwould lock the field in. (The draft's stronger claim — that an unrecognizedbatch_count/sample_method` would silently collapse N>1 — depends on sd-server's actual A1111 field names, which cannot be confirmed from this repo; it's a live-build question, not a code defect.)🧹 Code cleanliness & maintainability — Minor issues
Both findings verified against the source. Finding 1 (stale package doc at
imagegen/imagegen.go:13-14) is confirmed — the doc still pins the OpenAI/v1/images/generationsendpoint that this PR replaces.Finding 2 (the "pointers" nit) does not survive: the comment at
image.go:37-38literally reads "Optional fields are pointers/omitempty" — the slash already accounts for the non-pointeromitemptyfields (SampleMethod,BatchCount), so the comment is not actually misleading. The "outside my lens"batch_countnote is speculative ("worth a correctness check"), not a confirmed finding, so it is dropped as well.VERDICT: Minor issues
Findings
imagegen/imagegen.go:13-14. The package doc still says "The first implementation is provider/llamaswap, which targets the OpenAI/v1/images/generationsendpoint routed to a stable-diffusion.cpp backend" — exactly the behavior this PR changes. This is a live doc comment (not the append-only ADR), andprovider/llamaswap/image.go:56now POSTs/sdapi/v1/txt2img. CLAUDE.md requires docs to be updated in the same commit as the behavior they describe. Fix: update the sentence to reference the A1111/sdapi/v1/txt2imgendpoint, or generalize it so it doesn't pin a specific endpoint.⚡ Performance — No material issues found
This is a clean, isolated image-generation path. Image generation is dominated by upstream model inference (seconds per render); the Go-side request marshaling and response decoding here are not a hot path. I checked for the usual performance regressions and found nothing material.
VERDICT: No material issues found
provider/llamaswap/image.go:90-99— the response loop overresp.Imagesdoes one base64 decode + MIME sniff per returned image. This is O(images) with no nested work, no per-image network round-trip (the bytes are inline, same as before — the old code's url-only branch that could have implied a second fetch was never actually fetching). No N+1 introduced; if anythingBatchCount: req.Nlets sd-server produce N images in a single request rather than N calls. Verified by reading the full function.provider/llamaswap/image.go:66,112-127—parseSizeruns once perGeneratecall, does a couple of small string ops and twoAtois. Negligible, not in any loop.Trivial (not worth acting on):
out.Imagesat line 98 grows viaappendwithout preallocation —make([]llm.ImagePart, 0, len(resp.Images))would avoid a couple of slice regrowths, but for the handful of images a batch returns this is immeasurable against inference cost. Flagging only for completeness; not a finding.🧯 Error handling & edge cases — No material issues found
I verified against the actual checked-out file (
provider/llamaswap/image.go), which is important here — the diff in the task message is truncated: it ends the loop atreturn out, nil, but the real code has an added guard at lines 100–106 that the diff doesn't show.Let me trace the unhappy paths through my lens.
Empty / all-empty image array — From the diff alone this looks like a swallowed-error bug: the new loop does
if b64 == "" { continue }, so an{"images":[""]}or{"images":[]}response would seem to fall through to a "successful"Resultwith zero images. But the real file guards it:So an all-empty/empty response surfaces an error rather than a hollow success. Handled correctly (and consistent with the old per-entry error, just more tolerant of partial empties). Not a finding.
parseSizeedge cases — I checked the boundaries:""→(nil,nil,nil)→ falls back to model default. ✓"x512","512x","512x512x3"→Atoierror orlen(parts)!=2→ error. ✓"-512x512","0x0") →w<=0 || h<=0→ error. ✓"99999999999999999999x512") →Atoireturns error, caught byerr1/err2. ✓"512X512") →ToLowernormalizes. ✓Robust; the error is wrapped with
llm.ErrUnsupportedand returned before any network call.N/BatchCountboundary —req.N < 0is rejected;N==0→BatchCount==0→omitemptydrops it → sd-server default. No panic or negative-count path. ✓Decode failure mid-batch —
base64.StdEncoding.DecodeStringerror returns immediately with a wrapped%werror rather than being swallowed. ✓VERDICT: No material issues found
Through the error-handling & edge-cases lens, the change is clean. The one path that looked suspect from the diff (silently returning zero images on an empty/all-empty response) is actually closed by the
len(out.Images) == 0guard in the checked-out code — verified by readingprovider/llamaswap/image.go:90–107.parseSizerejects every malformed/non-positive/overflowing input before issuing a request, the negative-Nguard is intact, and decode errors are propagated rather than ignored.Outside my lens (one note, not pursued): A1111's
/sdapi/v1/txt2imgconventionally usesbatch_size/n_iter, notbatch_count— if sd-server ignores the unrecognized field,N>1would silently yield one image. That's a correctness/API-contract question for another lens, not an error-handling defect.Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 2m 43s
🪰 Gadfly review —
kimi-k2.6:cloud(ollama-cloud)Verdict: Minor issues — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — No material issues found
No material issues found
Reviewed through the Security lens. The change switches the image generation endpoint from OpenAI
/v1/images/generationsto A1111/sdapi/v1/txt2img, refactors the request/response structs, and adds aparseSizehelper.Key security-relevant findings after verification:
parseSizeatprovider/llamaswap/image.go:112only extracts integer dimensions from a"WxH"string usingstrconv.Atoi; the values are emitted as JSON numbers, not concatenated into URLs, commands, or queries.SplitNwith limit 2 safely handles extra"x"characters."/sdapi/v1/txt2img"); user input (req.Size,req.Prompt, etc.) never influences the request URL.doJSONstill attaches the Bearer token the same way; nothing in the PR adds, removes, or weakens authentication.b64_jsoncheck existed because the OpenAI endpoint could return either inline data or a remote URL. The A1111 endpoint only returns base64 strings inimages; skipping empty entries is correct because there is no URL-fetch fallback to accidentally trigger.base64.StdEncoding.DecodeStringandhttp.DetectContentTypeon the decoded bytes are standard and safe.req.N < 0is rejected; width/height must be> 0;parseSizereturns an error on malformed input.Nothing in this diff introduces an authn/authz gap, injection risk, unsafe deserialization, secret leakage, or insecure default.
🎯 Correctness — No material issues found
Verdict: No material issues found
After a thorough correctness review:
parseSizecorrectly handles the documented"WxH"format, rejects invalid/negative/zero dimensions, and returns(nil, nil)for an empty string so the backend default applies. Verified by reading the function inprovider/llamaswap/image.go:112-127.imagegen.Requesttotxt2imgRequestpreserves the semantic intent: optional overrides remain pointers withomitemptyso the backend falls back to its own defaults;Sizeis properly split intowidth/height;Nmaps tobatch_count. Verified by reading the struct andGenerateinprovider/llamaswap/image.go:39-82.images []string). The loop skips empty strings rather than hard-failing on the first missing entry, but the finallen(out.Images) == 0check still guarantees at least one valid image is returned. Verified inprovider/llamaswap/image.go:89-107.width/height/batch_countfrom the omission assertion inTestImageGenerateSettingsis a test-coverage gap, not a code correctness bug.No logic errors, incorrect constants, or semantic/domain bugs were found in the change itself.
🧹 Code cleanliness & maintainability — Minor issues
Minor issues
provider/llamaswap/image.go:112— NewparseSizehas zero unit tests. It has several branches (empty, malformed, non-positive numbers, whitespace trimming) but the test file only exercises the happy path ("512x512") indirectly throughTestImageGenerate. Add a small table-drivenTestParseSizecovering"","512x512"," 512X512 ","512","0x512","-1x512"to prevent regressions in size parsing.provider/llamaswap/image.go:68— Error wrapping uses%vfor the inner error instead of%w, breaking error-chain inspection:fmt.Errorf("%w: %v", llm.ErrUnsupported, err)should befmt.Errorf("%w: %w", llm.ErrUnsupported, err)(or%w: %sif the inner error should not be wrapped).provider/llamaswap/image.go:91— Empty base64 entries are silently skipped withcontinuebut there is no comment explaining why. The old code had a detailed comment justifying the hard error; the new skip behavior should have at least a// skip empty entriesnote so future readers know it's intentional and not a missing validation.⚡ Performance — No material issues found
No material issues found.
🧯 Error handling & edge cases — No material issues found
No material issues found
After walking through the diff with the error-handling lens:
parseSizehandles its edge cases correctly: empty string yieldsnil,nil,nil; non-"WxH"shapes, non-numeric components, zero/negative dimensions, and overflow all return errors.TrimSpace+ToLower+SplitN(..., "x", 2)is robust against extra spaces and mixed case.req.N < 0and emptyreq.Promptare validated before the request is built.BatchCountis omitted viaomitempty, matching theimagegen.Requestcontract whereN == 0means "backend default".resp.Imagesfully-empty (nil or all empty strings) is caught by the post-looplen(out.Images) == 0check, returning anAPIError.base64.DecodeStringwith a wrapped error including the index.doJSONpropagates HTTP and decode errors and stilldefersresp.Body.Close().The switch from erroring on a single empty
b64_jsonentry tocontinue-ing on an empty string inresp.Imagesis justified by the endpoint change: the old check existed because the OpenAI endpoint could ignoreresponse_formatand return URLs instead; the A1111 endpoint has no suchresponse_formatconcept, and a total failure is still caught by the zero-images guard.I found no swallowed errors, missing cleanups, nil dereference risks, or unhandled boundary conditions introduced by this change.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 8m 9s
🪰 Gadfly review —
qwen3.5:397b-cloud(ollama-cloud)Verdict: Minor issues — 5 reviewers: security, correctness, maintainability, performance, error-handling
🔒 Security — Minor issues
Verdict: Minor issues found
Security findings (input validation / DoS vectors):
provider/llamaswap/image.go:123— No upper bound on width/height inparseSize"9999999x9999999"). The validation only checksw <= 0 || h <= 0, so any positive integer passes. This could cause backend resource exhaustion (memory/CPU) or trigger costly renders, enabling DoS by authenticated users.provider/llamaswap/image.go:62-64— No upper bound on batch countNreq.N < 0only prevents negative values. A caller can requestN=1000000images in one batch, which gets forwarded asBatchCountto the backend. This could exhaust backend resources or cause extremely long-running renders.Both issues are input validation gaps that could be exploited for denial-of-service by authenticated API users. They don't enable remote code execution or data leakage, but they do allow resource exhaustion attacks against the image generation backend.
🎯 Correctness — No material issues found
VERDICT: No material issues found
All three findings in the draft review have been dropped after verification against the actual code:
Finding 1 (image.go:48-49 field names) — DROPPED. The code intentionally uses
sample_methodandbatch_countfor the stable-diffusion.cpp sd-server implementation. The ADR document (docs/adr/0016-imagegen-interface.md:52) explicitly documents these field names as the intended contract with sd-server. The draft incorrectly assumed standard A1111 webui field names (sampler_name/batch_size), but this code targets the stable-diffusion.cpp project's sd-server which uses different field names. A repo-wide grep confirms no usage ofsampler_nameorbatch_sizeanywhere.Finding 2 (test validates incorrect field) — DROPPED. This finding depends on Finding 1 being correct. Since
sample_methodis the intentional field name for this sd-server implementation, the test at line 235 correctly validates the expected behavior.Finding 3 (BatchCount omitempty semantics) — DROPPED. While the code structure (
intwithomitempty) can be confirmed, the draft itself acknowledges this "may be intentional" and matches the original OpenAI code pattern. Without access to sd-server's documented contract forbatch_count=0, this cannot be positively confirmed as a bug rather than intended design (N=0 likely means "use backend default", which aligns with omitting the field).🧹 Code cleanliness & maintainability — No material issues found
VERDICT: No material issues found
Analysis through the code cleanliness & maintainability lens:
No dead code left behind: Verified via grep that old types
imageRequestandimageResponseare completely removed from the codebase — no orphaned definitions.Naming and structure are clear: New types
txt2imgRequest/txt2imgResponseare well-named and match their endpoint. TheparseSize()helper is a focused, single-responsibility function that cleanly splits the "WxH" parsing logic out ofGenerate().Comments explain the "why": The struct comments (lines 31-38, 52-54) document not just the shape but why this endpoint is used over the OpenAI one — excellent maintainability practice.
Consistent optional-field pattern: Pointer types with
omitemptyforSeed,Steps,CFGScale,Width,Heightcorrectly allow sd-server to use its baked defaults when unset.Test updates match the change: Test assertions updated to check
width/heightintegers instead ofsizestring, and response JSON structure matches the newtxt2imgResponse.Verified by reading: Confirmed in
provider/llamaswap/image.gothat no references to old types remain, and the new code follows the same patterns as surrounding code.The change is a clean refactor with no maintainability debt introduced.
⚡ Performance — No material issues found
VERDICT: No material issues found
Performance lens findings:
After reviewing the change through the performance lens:
No N+1 regression: The code correctly uses
BatchCount: req.N(line 81) to request all N images in a single API call, matching the old behavior whereNwas sent to the OpenAI endpoint.No new allocations: The response structure changed from
[]struct{B64JSON string}to[]string, which is actually fewer allocations per image returned.Bounded memory: The
doJSONhelper usesio.LimitReaderwithmaxResponseBytes(64MB cap) for response decoding, preventing unbounded growth.No hot-path work added:
parseSizeis called once per request (not in a loop), and base64 decoding is necessary work that was already performed in the old code.No blocking calls introduced: The HTTP request pattern is unchanged; still a single POST with bounded response reading.
Verified by: Reading
provider/llamaswap/image.go(full file),provider/llamaswap/llamaswap.go:188-227(doJSON implementation), and confirmingmaxResponseBytesis defined and used for bounding.The endpoint switch from OpenAI
/v1/images/generationsto A1111/sdapi/v1/txt2imgdoes not introduce performance regressions in the code path reviewed.🧯 Error handling & edge cases — Minor issues
VERDICT: Minor issues found
provider/llamaswap/image.go:91-92— Empty base64 entries now silently skipped (continue) instead of returning an error. The old code explicitly errored on emptyb64_jsonwith the rationale that url-only entries meant the backend ignoredresponse_formatand the client doesn't fetch remote content. The new behavior is more lenient: if the A1111 endpoint returns a mix of valid and empty entries, partial failures are masked. Note: there is a check at line 100-106 that errors if all images are empty, but mixed results (some valid, some empty) will silently drop the empty ones. Suggested fix: either restore the error (if the A1111 endpoint should never return empty entries) or add a warning/log when skipping to aid debugging.provider/llamaswap/image.go:110-126—parseSizevalidates format and positive values but has no upper bound on dimensions. Extremely large values (e.g.,"999999999x999999999") pass validation and are sent to the upstream, which may reject them or fail unpredictably. Suggested fix: add reasonable max dimension checks (e.g., 4096 or 8192) if the upstream has known limits, or document that validation is minimal.Both issues are in the error-handling/edge-case lane. The first is a behavioral regression on how partial failures are handled; the second is a missing guard on an input boundary.
Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 4m 6s