fix: address gadfly P1 review (3 low-risk findings)

Triaged gadfly's P1 review (advisory). Fixed the three clearly-correct,
low-risk items; the rest were pre-existing mort behavior or theoretical:

- model/call.go: recordUsage dropped fully-cached responses. The early return
  on input==0 && output==0 skipped usage consisting only of CacheRead/CacheWrite
  tokens, which Anthropic/OpenAI prompt caching bills. The guard now also
  checks the cache token counts.
- llmmeta/helper.go: recordLedger swallowed Storage.RecordMetaCall errors;
  now logs them (slog.Warn) so a non-logging Storage impl can't silently drop
  audit rows.
- model/cloud_sync.go: the ollama.com limit-cache used unbounded io.ReadAll;
  wrapped both reads in io.LimitReader(1 MiB) so a misbehaving endpoint can't
  exhaust memory before the 15s timeout.

Noted-not-fixed (follow-ups / pre-existing mort semantics): tier_not_allowed
ledger label on resolution failure, unknown-model usage attribution, the
cloud_sync https scheme allowlist, and several theoretical/cosmetic items.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Steve Dudenhoeffer

2026-06-26 20:32:58 -04:00

committed by

steve

parent e0e2c0451a

commit d9b44387f5

3 changed files with 12 additions and 4 deletions

									
model/call.go (+1 -1)

@@ -237,7 +237,7 @@ func recordUsage(ctx context.Context, resp *llm.Response) {
 		return
 	}
 	u := resp.Usage
-	if u.InputTokens == 0 && u.OutputTokens == 0 {
+	if u.InputTokens == 0 && u.OutputTokens == 0 && u.CacheReadTokens == 0 && u.CacheWriteTokens == 0 {
 		return
 	}
 	model := resolvedModelName(ctx, resp)
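
The effect of the widened guard can be checked in isolation with a minimal
sketch. The Usage type here is a hypothetical stand-in that models only the
four fields named in the hunk (the field types are assumed to be int), and
isEmpty is an illustrative helper, not a function from the repository.

```go
package main

import "fmt"

// Usage is a hypothetical stand-in for the usage struct on llm.Response.
// Only the four counters that appear in the diff are modeled.
type Usage struct {
	InputTokens      int
	OutputTokens     int
	CacheReadTokens  int
	CacheWriteTokens int
}

// isEmpty reports whether there is nothing to record. It matches the new
// guard: all four counters must be zero before the record is skipped.
func (u Usage) isEmpty() bool {
	return u.InputTokens == 0 && u.OutputTokens == 0 &&
		u.CacheReadTokens == 0 && u.CacheWriteTokens == 0
}

func main() {
	fmt.Println(Usage{}.isEmpty())                      // true
	fmt.Println(Usage{CacheReadTokens: 1200}.isEmpty()) // false
}
```

Under the old two-field check, the second case would have been treated as
empty and skipped.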