fix: address verified gadfly P2 findings (9 real of 18)
Independently verified all 18 gadfly findings against the code (18-agent
fan-out). Fixed the 9 real ones; the other 9 were false positives,
hallucinations, or valid trade-offs (no change).
High:
- F1 nil model: a Models resolver returning (ctx,nil,nil) flowed into the
agent loop and nil-panicked. Now a clean error (Run never panics). +test.
- F9 compactor data-leak: renderTranscript sent tool-call args verbatim to
the summarizer (a possibly-different provider/tier); secret-bearing tool
args (mcp_call/email_send/http_*/webhook_*) are now redacted, with a doc
note that result bodies still flow (summary needs them).
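The F9 redaction can be pictured with a minimal sketch like the one below. It is not the code from this commit: renderToolCall, isSecretBearing and the "[redacted]" placeholder are hypothetical names, and the exact-name plus prefix matching rule is an assumption based only on the tool names listed above.

```go
package main

import (
	"fmt"
	"strings"
)

// redactedPlaceholder stands in for the arguments of secret-bearing tool calls.
// The literal text is an assumption, not taken from the commit.
const redactedPlaceholder = "[redacted]"

// isSecretBearing reports whether a tool's arguments may carry credentials.
// The names mirror the commit message (mcp_call, email_send, http_*, webhook_*);
// the matching rule itself is assumed.
func isSecretBearing(tool string) bool {
	switch tool {
	case "mcp_call", "email_send":
		return true
	}
	return strings.HasPrefix(tool, "http_") || strings.HasPrefix(tool, "webhook_")
}

// renderToolCall is a hypothetical stand-in for the per-call part of
// renderTranscript: it renders one tool call as text for the summarizer and
// drops the arguments when the tool is secret-bearing.
func renderToolCall(tool, args string) string {
	if isSecretBearing(tool) {
		args = redactedPlaceholder
	}
	return fmt.Sprintf("%s(%s)", tool, args)
}

func main() {
	// Prints: http_get([redacted])
	fmt.Println(renderToolCall("http_get", `{"url":"https://example.test","token":"s3cr3t"}`))
	// Prints: file_read({"path":"notes.txt"})
	fmt.Println(renderToolCall("file_read", `{"path":"notes.txt"}`))
}
```

Only the arguments are replaced here; as the note above says, result bodies still reach the summarizer.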
Medium/minor:
- F2 compactor error path returned the folded slice, not the original msgs
(contradicting the documented non-fatal contract) -> return msgs.
- F3 RunStats.Status was only ever ok/error; it now reports timeout
(DeadlineExceeded) / cancelled (Canceled) via statusFor. +test.
- F4 step-zip emitted empty-name "ghost" steps when results>calls; now pairs
min(calls,results) only.
- F5 SetIteration was never called -> RunState.Iteration always 0; the step
observer now updates it each loop.
- F6 matchPending fallback was LIFO; now FIFO (matches the per-key queue).
- F7 estimateTokens had no default arm (future Part kinds counted as 0);
unknown parts now counted conservatively.
- F8 cloud_sync silently truncated >1MiB responses -> opaque JSON error; now
a clear "response exceeded N bytes" via readCapped.
- F12 step observer captured the caller ctx; now the merged runCtx.
- F13 compaction onFire was nil (doc claimed it logged); now wired to
audit LogEvent("compaction_fired").
- F11 (no pre-dispatch hook in majordomo) documented honestly as a known
limitation; F18 UsageSink doc clarified cache tokens are subsets of input.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
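For the F3 item above, a status mapping along these lines would give the described behaviour. This is a sketch only: the name statusFor and the four status strings come from the commit message, while the signature and the use of errors.Is (so that wrapped context errors are also recognized) are assumptions.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// statusFor maps a run error to a status string. The signature is assumed;
// only the name and the ok/error/timeout/cancelled values come from the
// commit message.
func statusFor(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, context.DeadlineExceeded):
		return "timeout"
	case errors.Is(err, context.Canceled):
		return "cancelled"
	default:
		return "error"
	}
}

// demoStatuses runs statusFor over a few representative errors, including
// wrapped context errors, and returns the resulting statuses in order.
func demoStatuses() []string {
	errs := []error{
		nil,
		fmt.Errorf("generate: %w", context.DeadlineExceeded),
		fmt.Errorf("generate: %w", context.Canceled),
		errors.New("provider returned 500"),
	}
	out := make([]string, 0, len(errs))
	for _, err := range errs {
		out = append(out, statusFor(err))
	}
	return out
}

func main() {
	fmt.Println(demoStatuses()) // prints [ok timeout cancelled error]
}
```

Checking with errors.Is rather than == matters when the loop wraps the context error with extra detail before returning it.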
@@ -314,7 +314,7 @@ func (c *CloudOllamaLimitCache) fetchTags(ctx context.Context) ([]string, error)
 		return nil, err
 	}
 	defer resp.Body.Close()
-	body, err := io.ReadAll(io.LimitReader(resp.Body, maxLimitCacheResponseBytes))
+	body, err := readCapped(resp.Body)
 	if err != nil {
 		return nil, err
 	}
@@ -367,7 +367,7 @@ func (c *CloudOllamaLimitCache) fetchContextLength(ctx context.Context, modelNam
 		return 0, err
 	}
 	defer resp.Body.Close()
-	respBody, err := io.ReadAll(io.LimitReader(resp.Body, maxLimitCacheResponseBytes))
+	respBody, err := readCapped(resp.Body)
 	if err != nil {
 		return 0, err
 	}
@@ -456,3 +456,19 @@ func truncate(b []byte, n int) string {
 // (/api/tags, /api/show) so a misbehaving endpoint can't stream an unbounded
 // body before the 15s timeout fires. 1 MiB is far above any real response.
 const maxLimitCacheResponseBytes = 1 << 20
+
+// readCapped reads up to maxLimitCacheResponseBytes from r and returns a clear
+// error if the response EXCEEDS the cap — rather than silently truncating (as a
+// bare io.LimitReader does) and letting downstream json.Unmarshal fail with an
+// opaque "unexpected end of JSON input". It reads one extra byte to detect the
+// overflow.
+func readCapped(r io.Reader) ([]byte, error) {
+	body, err := io.ReadAll(io.LimitReader(r, maxLimitCacheResponseBytes+1))
+	if err != nil {
+		return nil, err
+	}
+	if len(body) > maxLimitCacheResponseBytes {
+		return nil, fmt.Errorf("cloud_sync: response exceeded %d bytes", maxLimitCacheResponseBytes)
+	}
+	return body, nil
+}
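The read-one-extra-byte check in readCapped can be exercised in isolation with a small cap. The sketch below uses a hypothetical parameterized variant, readAtMost, so the boundary is easy to see; the helper capOutcome and the 8-byte limit are illustrative only and are not part of the commit.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readAtMost is a hypothetical variant of readCapped that takes the cap as a
// parameter. It reads limit+1 bytes so a body of exactly limit bytes can be
// told apart from one that overflows.
func readAtMost(r io.Reader, limit int64) ([]byte, error) {
	body, err := io.ReadAll(io.LimitReader(r, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(body)) > limit {
		return nil, fmt.Errorf("response exceeded %d bytes", limit)
	}
	return body, nil
}

// capOutcome reports whether a body of n bytes is accepted under the given cap.
func capOutcome(n int, limit int64) bool {
	_, err := readAtMost(strings.NewReader(strings.Repeat("x", n)), limit)
	return err == nil
}

func main() {
	fmt.Println(capOutcome(8, 8)) // true: exactly at the cap is accepted
	fmt.Println(capOutcome(9, 8)) // false: one byte over is rejected
}
```

A bare io.LimitReader(r, limit) would return 8 bytes in both cases with no error, which is the silent truncation this change removes.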
@@ -16,6 +16,11 @@ import (
 // UsageSink receives one record per successful Generate through a model parsed
 // by this package (ParseModelRequest / ParseModelForContext). Implement it to
 // meter or bill; the token detail mirrors majordomo's Response.Usage.
+//
+// IMPORTANT: cacheReadTokens and cacheWriteTokens are PORTIONS of inputTokens,
+// not independent additive values (they let a sink price cached vs fresh input
+// differently). A sink must NOT compute total = input+output+cacheRead+
+// cacheWrite — that double-counts the cached input.
 type UsageSink interface {
 	Record(ctx context.Context, model string, inputTokens, outputTokens, cacheReadTokens, cacheWriteTokens int)
 }
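A sink that follows the doc comment above might look like the sketch below. The interface is copied from the diff; tallySink, demoTally and the sample token counts are hypothetical, and treating the cache-read and cache-write portions as non-overlapping is an assumption made for the fresh-input figure.

```go
package main

import (
	"context"
	"fmt"
)

// UsageSink is reproduced from the diff so the example is self-contained.
type UsageSink interface {
	Record(ctx context.Context, model string, inputTokens, outputTokens, cacheReadTokens, cacheWriteTokens int)
}

// tallySink is a hypothetical in-memory sink. It keeps a running total and
// the share of input that was not served from or written to the cache.
type tallySink struct {
	total      int // input + output only; cache tokens are already inside input
	freshInput int // input minus the cached portions (assumed non-overlapping)
}

// Record adds one Generate call to the tally without double-counting cache tokens.
func (s *tallySink) Record(_ context.Context, _ string, in, out, cacheRead, cacheWrite int) {
	s.total += in + out
	s.freshInput += in - cacheRead - cacheWrite
}

// demoTally records one made-up call: 100 input tokens (60 read from cache,
// 10 written to cache) and 20 output tokens.
func demoTally() (total, fresh int) {
	s := &tallySink{}
	var sink UsageSink = s
	sink.Record(context.Background(), "example-model", 100, 20, 60, 10)
	return s.total, s.freshInput
}

func main() {
	total, fresh := demoTally()
	fmt.Println(total, fresh) // prints 120 30
}
```

Summing all four values for the same call would give 190, which is the double-counting the comment warns against.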