P3: generic tool library (think/now/cite + meta + store groups) #3
Reference in New Issue
Block a user
Delete Branch "phase-3-tools"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Phase 3: stand up
executus/tools— the generic, host-agnostic tool library a host registers against atool.Registry, then names inRunnableAgent.LowLevelTools.What's here
Register(reg)— always-available, zero-config tools:think,now(UTC unless aCurrentTimeProvideris wired),cite(inert unless aCitationStorageis wired). All nil-safe.RegisterMeta(reg, MetaDeps)— LLM-backed:classify,extract_entities,summarizeover thellmmetahelper. Budget defaults to a shipped in-memory per-run cap.RegisterStore(reg, StoreDeps)— agent memory:kv_get/set/list/delete+file_save/get/get_text/get_metadata/list/delete(+file_search,create_file_urlwhen wired). Quota defaults to a generous static cap; kv and file groups register independently.research_providers.go,file_storage.go,kv_storage.go,quota_provider.go,file_descendant_grant.go, plus the in-memory budget default.Verification
go build/vet/test -race ./...green; corego.sumstill free of gorm/redis/discordgo/sqlite/wolfram.thinktool call → dispatched through the registry → finalised; step instrumentation captures it).Notes
calculatewas deferred (dragsgo-wolfram+ a module-path replace — not worth it in the lean core for one tool). Remaining P3: web/net/compose groups + their default backends (DDG searcher, stdlib readability, SSRF http). The default in-memory KV/File store backends arrive withcontrib/storeat P4.Stacked on the now-merged P2. Gadfly review (3 cloud models) is advisory — findings will be independently verified.
🤖 Generated with Claude Code
🪰 Gadfly review —
minimax-m3:cloud(ollama-cloud)Verdict: Blocking issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — No material issues found
VERDICT: No material issues found
Findings through the 🔒 security lens (host-agnostic generic tool library):
tools/file_search.go:73 —
file_searchis admin-only + share-safe. The handler at tools/file_search.go:110 pinsskillIDtoinv.SkillID, so a non-admin caller invoking an admin-authored public skill can only search that skill's bucket. The doc-comment claim matches the implementation. Fail-closed type-assertion in tools/file_descendant_grant.go:48 keeps the cross-skill escape hatch off for fakes that don't implementDescendantRunChecker. Verified by reading both files.tools/scope_validate.go — strict scope authorisation. Every code path I traced (
kv_get,kv_set,kv_list,kv_delete,file_save,file_get,file_get_text,file_get_metadata,file_list,file_delete,create_file_url,summarize) reachesValidateScope(inv, scope, false)before any storage call, andisAdmin=falseis constant — no path can be coerced via anInvocationfield into another user's bucket.user:,run:,root_run:cases I read in tools/scope_validate.go:48-103 reject mismatch; parent-run and continuation allowances use exact-string equality, not prefix match.tools/create_file_url.go — token randomness + cross-skill guard. tools/create_file_url.go:170 reads 32 bytes from
crypto/rand(~190 bits) and base64url-encodes; tools/create_file_url.go:118 does the cross-skill rejection + descendant grant before persisting. The default + max caps (DefaultFileURLExpiry,MaxFileURLExpiry,MaxFileURLViews) are applied before any DB write. Verified.tools/kv_set.go — quota delta math + scope partition. tools/kv_set.go:107 subtracts prior value length on UPDATE only inside the per-skill (non-root_run) branch; the root_run partition is excluded because per-skill attribution is meaningless there. The type-assertion for
KVHistoryRecorderis gated on storage succeeding — a failed history write cannot shadow the kv_set return. Verified.tools/now.go —
time.LoadLocationon untrusted string. tools/now.go:79 callstime.LoadLocation(tzName)with a caller-supplied IANA name. Go's stdlib does tzdata lookups only (no arbitrary file IO), and unknown zones fall back to UTC with awarning— not a security issue.tools/classify.go / tools/extract_entities.go — prompt composition with caller text.
Text,Categories, and fieldDescriptionare concatenated into the user prompt verbatim (tools/classify.go:180-186, tools/extract_entities.go:259-267). This is the standard extraction-tool shape and the budgets gate cost; no system-prompt injection vector (system prompt is fixed; user text is on the user side). The coerced output is returned to the agent's context — downstream prompt-injection surface, inherent to LLM extract/classification, not introduced by this PR.tools/file_save.go — base64 decode + size cap. tools/file_save.go:107 uses
base64.StdEncoding.DecodeStringand rejects on parse error; size cap is checked AFTER decode so an oversized-but-well-formed payload is caught. Per-file cap is the only line of defence whenquotais nil, which is the documented test/admin opt-out path.tools/cite.go —
url_not_in_run_historyrejection. tools/cite.go:97 reads the per-run touched-URL set viaCitationStorage.GetTouchedURLsand rejects any URL not in it, preventing fabricated citations. Theno_run_contextbranch (tools/cite.go:82) bails loudly wheninv.RunID == "", so an empty run context cannot smuggle a cite through.tools/think.go — empty-thought rejection. tools/think.go:60 returns a structured
{"ok":false,"error":"empty_thought"}rather than silently accepting, closing off the "free spam" foot-gun noted in the doc-comment.tools/scope_validate.go:60 — empty
user:/run:/root_run:rejection. tools/scope_validate.go:54, 64, 83 each reject the bare-prefix shape with a descriptive error, closing off trivial"user:"matches.No authn/authz gap, no SQL/command/path injection, no SSRF surface (network tools are deferred to a follow-up PR per the description), no unsafe deserialization, no secret/credential handling, and no insecure default that I can confirm by reading the code. Outside my lens I noticed nothing severe enough to flag.
🎯 Correctness — No material issues found
Review — Correctness Lens
VERDICT: No material issues found
Verification results:
Finding 1 (tools/kv_set.go:115, partition mismatch) — DROPPED. Read of
tools/kv_set.golines 85–115 confirms the gateif quota != nil && partition == inv.SkillID(line 93) is the only place the inner block runs; inside that branchpartition == inv.SkillIDis a tautology, so theKVGet(..., inv.SkillID, ...)call on line 107 is equivalent to passingpartition. The draft itself concedes "the call is internally consistent" and frames the issue as a future-refactor hazard ("if a future refactor loosens the gating"). No current correctness bug — not a blocker.Finding 2 (tools/file_save.go:127, dedup ignored) — DROPPED. Read of
tools/file_save.golines 24–27 (the in-file doc comment) explicitly states: "we do NOT subtract a 'prior' value here the way kv_set does: file_save always inserts a new file row (content-addressable dedup is at the blob layer, not the row layer), so every save is additive to FileUsageBytes." That is,FileUsageBytesmeasures ROWS (not dedup'd blob bytes), and eachfile_savedoes insert a new row, so the per-row quota charge is correct as written. The draft's premise ("as the comment elsewhere implies"FileUsageBytesreturns dedup'd blob-layer total) is not supported — I grepped the whole repo for any such comment and found none; the in-file comment states the opposite. The agent-facing description ("Content is dedup'd by hash — multiple file_save calls with identical bytes share storage") reflects the blob-layer dedup, which is independent of the per-row quota count. No current correctness bug — not a blocker.Neither finding survives verification.
🧯 Error handling & edge cases — Blocking issues found
VERDICT: Blocking issues found
Error-handling & edge-case review of the P3 tools package. Several material gaps; one cross-file inconsistency to flag.
tools/summarize.go:198-203— missingdescendantFileGrantescape hatch on file_id input.loadSummarizeInputrejects cross-skill file_ids outright, butfile_get.go:74-79,file_get_text.go, andfile_get_metadata.goall allow the parent → spawned-worker artifact path viadescendantFileGrant. A parent orchestrator (the agent_spawn use casefile_descendant_grant.gowas added for) canread_page/file_savea worker's chart but cannot summarize it viafile_id— the LLM is forced to re-read bytes inline. The package's own doc-comment forfile_descendant_grant.gosays "you may touch the artifacts of workers you (transitively) dispatched" but summarize silently breaks that contract. Fix: type-assert the storage dep toDescendantRunCheckerthe same wayfile_getdoes and fall through when the owner is a descendant.tools/kv_set.go:93-114+tools/file_save.go:118-130— TOCTOU race in quota enforcement. Both gates doused := storage.UsageBytes(...)→ arithmetic check → write, with no locking. Two concurrent calls in the same run can both observeused < kvMaxand both commit, pushing total storage past the cap. The PR description even names this "per-skill byte quota" as a hard gate, but it isn't atomic. Suggested fix: serialize the check+write under a per-skill mutex in the production adapter (e.g.,gormStorage.QuotaLock(skillID)), or convert to an atomic SQL UPDATE with a WHERE-clause guard and retry on conflict.tools/file_delete.go:52-54— inconsistent cross-skill handling (no descendant grant, unlike file_get/file_get_text/file_get_metadata). Probably defensible (delete is destructive; explicit non-inheritance is the safer default), but the asymmetry with the read tools should be a deliberate choice, not an accident — and the package doc onfile_descendant_grant.godoesn't carve out a delete exception. Either add a code comment marking the asymmetry intentional, or wire the descendant check (matching the other tools).tools/research_defaults.go:63— comment references a non-existent constant.NewInMemorySearchBudget's comment says "kinds missing from the map fall back to maxPerKindDefault" but no such constant exists in the package;CheckAndIncrementhardcodesmax = 10(line 64). Either rename the inline10to a named constant matching the comment, or fix the comment. Verified by grep — nomaxPerKindDefaultsymbol exists.tools/think.go:71—var _ = fmt.Errorfis a "keep the import alive" hack with no actual use. The onlyfmtreference is the unused-import keeper itself. The empty-thought branch returns a hand-rolled JSON literal, never formats anything. Remove thefmtimport and the_ =line. Verified by reading lines 1-72 —fmt.Errorfis not called anywhere in the file.tools/kv_set.go:81-83— silent accept ofnullvalue.json.Unmarshal([]byte("null"), &probe)succeeds withprobe == nil. Combined with thekv_getpath returning the rawValuebytes verbatim, an agent that storesnullreads back"null"(a 4-byte JSON literal string, not a JSON null). It's coherent at the wire level, but the schema comment ("object, array, string, number, bool, or null") advertises a value type the storage layer doesn't round-trip —kv_getof anullrow returns the string"null", not the JSON null. Either rejectnullatkv_settime, or havekv_getreturn it as the literalnull(unquoted). Minor, but the asymmetry betweennulland other JSON values is a sharp edge.tools/summarize.go:208-214+tools/classify.go:108-109— byte-truncation can split a UTF-8 multibyte rune.text[:summarizeMaxInputBytes]and the matching classify slice happen at byte boundaries, not rune boundaries. A 32 KiB Chinese / emoji string truncates into invalid UTF-8 that the LLM then sees as replacement chars. Fix: useutf8.DecodeLastRuneto walk back to a valid boundary, or convert via[]runefirst.tools/cite.go:87-95— storage errors swallowed into the responseReasonfield but not surfaced aserror. A failedGetTouchedURLsreturnsreason: "touched_lookup_failed: <err>"rather than the standarderrorfield. The agent's reaction logic must special-case thereasonstring. Minor consistency issue: every other cite failure mode usesReason; a transport error is structurally indistinguishable from a validation failure, which is fine, but ifRecordCitationever returns a non-recoverable error (e.g., DB down), the agent silently keeps retrying. Consider at minimum returning("", error)forGetTouchedURLs/RecordCitationfailures so the executor surfaces a real error and can decide whether to retry or abort.tools/store.go:46-49—RegisterStoresilently installsstaticQuotawhend.Quota == nildespite theQuotaProviderinterface comment saying nil is "do NOT enforce".staticQuota.EffectiveQuotareturns64 << 20/1 << 30regardless of caller. A host that readsquota_provider.go's "nil-safe: tools constructed against a nil QuotaProvider do NOT enforce" and passesStoreDeps{Quota: nil}expecting disabled enforcement actually gets a 64 MiB KV / 1 GiB file cap on every skill. The doc onStoreDepsitself says "defaults to a generous static cap" (correct), but theQuotaProviderinterface comment (line 22 ofquota_provider.go) contradicts that. Fix: align theQuotaProviderinterface comment to the actual wrap-time default, or renamestaticQuota→defaultStaticQuotaand document it as "always-on, not skipped".Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 7m 34s
🪰 Gadfly review —
glm-5.2:cloud(ollama-cloud)Verdict: Blocking issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — Blocking issues found
All findings confirmed against the actual code. The corrected review:
VERDICT: Blocking issues found
Security lens findings
1.
ValidateScopeignoresCallerIsAdmin— every tool hard-codesfalse, so admin overrides are dead codetools/scope_validate.go:42—ValidateScope(inv, scope, isAdmin)takes an explicitisAdminbool, but every production caller passes the literalfalse. Verified via grep across all 11 call sites:file_save.go:87,file_list.go:44,file_get.go:85,file_get_metadata.go:73,file_get_text.go:80,file_delete.go:55,file_search.go:101,kv_get.go:46,kv_set.go:62,kv_list.go:52,kv_delete.go:36,summarize.go:201.The comment at
scope_validate.go:20-24and:36-40claims "production tools always pass isAdmin=false" and "Invocation does NOT carry an admin flag." That is stale:tool/registry.go:188showsInvocation.CallerIsAdminexists and is populated by the executor (documented as backed byBot.GetMember), and is already read bycode_execfor v15 admin-only network mode. The correct, available value isinv.CallerIsAdmin, and it is dropped on the floor everywhere.Security impact: the entire admin branch of the authorization matrix (
user:<other>allowed for admin atscope_validate.go:53,run:<other>allowed for admin at:107,root_run:<other>allowed for admin at:72) is unreachable in production. The documented ruleset (file doc lines 5-12) promises admin escalation, but it does not actually exist. This is a defense-in-depth gap: fail-closed today (safe), but any future reliance on admin status via these tools silently fails. One-token fix: passinv.CallerIsAdmininstead offalseat each call site, and correct the stale comments.2.
create_file_urlmissing scope check — a caller can mint a public URL for a file bypassing row-scope authorizationtools/create_file_url.go:109-130— the handler fetchesFileDomainMetaviaFileGet, checksmeta.SkillID != inv.SkillIDwith the descendant grant (descendantFileGrant), but never callsValidateScope. Grep confirms noValidateScopereference anywhere increate_file_url.go. Compare every other file-access tool:file_get.go:85,file_get_metadata.go:73,file_get_text.go:80,summarize.go:201all runValidateScope(inv, meta.Scope, false)after the cross-skill check;file_save.go:87checks scope before writing.Security impact:
create_file_urlis a publication primitive — it mints an unauthenticatedmort.sh/files/<token>link. A skill that legitimately owns a file stored underuser:bobscope can mint a public URL for it, whereasfile_getcorrectly rejects the same caller reading that file's bytes becauseValidateScopeenforcesuser:<id>==inv.CallerID. Same-skill ownership (meta.SkillID == inv.SkillID) passes the cross-skill check at:127, then falls straight through to token minting with no scope gate. A file saved underrun:<other_run_id>oruser:<other>by the same skill becomes publicly publishable without the scope authorization every read path enforces. This is the one file-access tool that drops the scope check, and it drops it at exactly the wrong (publication) boundary.Suggested fix: mirror
file_get_metadata's pattern — trackgrantedViaDescendant, and when the file is owned by this skill (meta.SkillID == inv.SkillID), still runValidateScope(inv, meta.Scope, inv.CallerIsAdmin)before minting the public link; skip it only for the descendant-grant case.3.
file_searchlets an emptyscopebypassValidateScopeand reach the searcher unvalidatedtools/file_search.go:99-104— whenargs.Scope == "", the scope check is skipped and""is passed straight tosearcher.SearchFiles(ctx, inv.SkillID, "", ...). The handler pinsskill_idtoinv.SkillID(the primary cross-skill boundary), so theskill_iddimension is safe. But the scope dimension is left to theFileSearcherbackend to interpret: an empty scope handed to a backend that treats""as "all scopes within this skill" could surface rows fromuser:<other>scopes thatValidateScopewould have rejected for a non-admin caller.Security impact: minor / defense-in-depth — the
skill_idpin is the primary boundary and is enforced here (:110). The empty-scope path is the one place a caller-supplied scope string flows into storage without validation. TheFileSearcherbackend is not in this PR (lives inmort.go), so whether""is interpreted as cross-scope is an assumption about the future adapter. Worth either rejecting empty scope or documenting that the backend must not interpret""as cross-scope.🎯 Correctness — Blocking issues found
Both findings confirmed against the actual code. Verifying the remaining details and recomputing.
VERDICT: Blocking issues found
tools/create_file_url.go:120-129— missing scope check (confirmed).file_get(line 85),file_get_text(line 80),file_get_metadata(line 73), andfile_delete(line 55) all callValidateScope(inv, meta.Scope, false)after theSkillID/descendant check.create_file_urldoes not: whenmeta.SkillID == inv.SkillIDit falls straight through from line 129 to minting a public URL at line 154. A file saved by the same skill underuser:<callerA>can therefore be published to a public/files/<token>URL by a different caller (callerB) of that shared skill who only knows thefile_id. This breaks the per-user scope isolation the other four file tools enforce and is a concrete cross-user exposure — public link minting, not merely a read. Fix: trackgrantedViaDescendant(as the sibling tools do) and addif !grantedViaDescendant { if err := ValidateScope(inv, meta.Scope, false); err != nil { return "", fmt.Errorf("create_file_url: %w", err) } }after the skill/descendant branch, mirroringfile_get.go:73-87.tools/classify.go:286-298— single-label returns a "winner" even when all scores are 0 (confirmed).selectClassifyLabelsinitialisesbestScore := -1.0and iterates withscores[c] > bestScore. If the LLM reply yields all-zero scores (empty/absent keys default to 0 innormaliseClassifyScores), the first category always satisfies0 > -1, so the tool returnslabels: [firstCategory]withscoresall0. ThebestCat == ""guard at line 295 only fires whencategoriesis empty, which earlier validation (line ~140) already rejects, so it never triggers here. An agent consuming justlabels[]gets a false-positive classification. Consider returningnil(or surfacing an error) when the best score is0/below a small epsilon so "no category fit" is distinguishable from "category A won".tools/file_delete.go:52-53vs the other file tools — descendant-grant inconsistency (confirmed).file_get/file_get_text/file_get_metadata/create_file_urlhonourdescendantFileGrantfor cross-skill access;file_deleterejects outright whenmeta.SkillID != inv.SkillIDwith no descendant escape hatch. Deleting a worker's artifact is arguably a distinct act from reading/publishing it, so the omission may be deliberate — but it's worth a one-line comment infile_deletestating the omission is intentional, since the package's stated blanket rule otherwise routes every cross-skill site throughdescendantFileGrant.🧯 Error handling & edge cases — Minor issues
All three findings confirmed against the actual code. Outputting the corrected review.
VERDICT: Minor issues
tools/tools.go:59—MetaDeps.MaxPerRunis silently ignored viaRegisterMeta. Whend.Budgetis nil (the normal path),RegisterMetaconstructsNewInMemorySearchBudget(nil)— an empty caps map (tools.go:59,research_defaults.go:47-54) — and passes that shared budget to all three meta tools. Inside each handler, theif budget == nilfallback that would have consultedcfg.MaxPerRun(ctx)is skipped (budget is non-nil,classify.go:118,extract_entities.go:128,summarize.go:108), soCheckAndIncrementlands inresearch_defaults.go:62-64whereb.cap[kind]is 0 → hardcodedmax = 10. ThefixedMetaConfig.MaxPerRunvalue derived fromMetaDeps.MaxPerRun(tools.go:67,88) is therefore dead code in theRegisterMetapath. A host that setsMetaDeps.MaxPerRun = 50expecting 50 calls/run still gets capped at 10. The default (10) happens to coincide, masking the bug. Fix:RegisterMetashould build the budget with the configured caps, e.g.NewInMemorySearchBudget(map[string]int{"classify": d.MaxPerRun, "extract_entities": d.MaxPerRun, "summarize": d.MaxPerRun}).tools/create_file_url.go:133-138—ExpiresInSecondsoverflow bypasses the max-expiry cap.expiry = time.Duration(args.ExpiresInSeconds) * time.Second(create_file_url.go:134) can overflowint64nanoseconds for very largeintvalues (e.g. nearmath.MaxInt64), producing a negativetime.Duration. The subsequent guardif expiry > MaxFileURLExpiry(create_file_url.go:136) is then false (negative < positive), so the cap is skipped andtime.Now().Add(expiry)(create_file_url.go:139) yields a pastexpiresAt— the token is minted already-expired. The doc promises "Max 2592000 (30 days)" but the clamp only handles the over-but-positive case, not overflow/negative. Suggested fix: clampargs.ExpiresInSecondsto[0, int(MaxFileURLExpiry/time.Second)]before the multiply, or checkexpiry <= 0after the multiply and fall back to the default.tools/classify.go:118,extract_entities.go:128,summarize.go:108— per-call budget fallback defeats enforcement when a tool is constructed directly withnilbudget. Each handler, whenbudget == nil, builds a freshInMemorySearchBudgeton every invocation (classify.go:123-125,extract_entities.go:133-135,summarize.go:113-115), so the counter always starts at 0 → never exceeds → no effective cap. This is a footgun for hosts that construct the tools directly (rather than viaRegisterMeta, which supplies a shared non-nil budget). The nil-safe claim ("budget defaults to an in-memory per-run cap") is misleading because the per-call allocation makes it per-call, not per-run. Not blocking (RegisterMeta is the documented path), but worth a comment or a constructor-level default so direct construction is still bounded.Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 6m 32s
🪰 Gadfly review —
deepseek-v4-flash:cloud(ollama-cloud)Verdict: Blocking issues found — 3 reviewers: security, correctness, error-handling
🔒 Security — Blocking issues found
🔒 Security Review — Verdict: Blocking issues found
1.
tools/create_file_url.go:127— Missing scope validation allows cross-user public URL mintingConfirmed. The
create_file_urlhandler checks cross-skill ownership (line 127) but never callsValidateScope(inv, meta.Scope, false)after fetching the file metadata. Every other file-access tool (file_get.go:85,file_get_metadata.go:73,file_get_text.go:80,file_delete.go:55,summarize.go:201) performs this scope check.Impact: A caller in skill X can mint a public, unauthenticated URL for any file in skill X, regardless of its scope. For example, if a file was saved with scope
user:bob, calleralice(same skill) can still create a public URL for it. The file's bytes become accessible to anyone with the link, bypassing the per-user isolation thatValidateScopeenforces. This is a privilege escalation / access control bypass.Suggested fix: Add a scope validation after the cross-skill check, matching the pattern in
file_get.go:The descendant-grant exception should also be handled: when
descendantFileGrantpasses, the scope check should be skipped (asfile_get.godoes withgrantedViaDescendant), because the worker's file scope is the worker's run, not the caller's.2.
tools/file_delete.go:52— Missing descendant grant for cross-skill file deletion (fail-closed, lower severity)Confirmed.
file_delete.gorejects any file wheremeta.SkillID != inv.SkillID(line 52) without consultingdescendantFileGrant. All other file-access tools (file_get.go:75,file_get_metadata.go:67,file_get_text.go:74,create_file_url.go:127) allow the descendant-grant exception.Impact: A parent skill that spawned a worker cannot delete the worker's files, even though it can read them. This is fail-closed (safe by default) but breaks the parent/worker handoff pattern that the descendant grant was designed to support. A worker's orphaned files would accumulate until the run-scope sweeper cleans them.
Suggested fix: Add the descendant grant check before the cross-skill rejection, matching the pattern in
file_get.go:3.
tools/research_defaults.go—InMemorySearchBudgetis explicitly not production-correct (documented, but worth flagging)Confirmed. The
InMemorySearchBudget(lines 37–85) is a process-wide map that never resets across runs. The documentation at lines 29–36 explicitly says: "NOT production-correct because the map persists across the process lifetime; production wiring MUST plug a per-run reset." The default wiring intools.go:59uses this implementation.Impact: In a test or misconfigured deployment where this default is used in production, budget counters bleed across runs. A single run could exhaust the budget for all subsequent runs of the same kind (e.g.,
classify,summarize), causing false "budget exceeded" rejections. This is a documented gap, not a code bug, but it is a security-relevant deployment pitfall.Suggested fix: Ensure production wiring supplies a per-run-resetting
SearchBudgetimplementation. TheInMemorySearchBudgetshould be restricted to test use only, perhaps by unexporting it or adding a// +build testconstraint.🎯 Correctness — Minor issues
Now I have all the information I need to evaluate each finding.
Evaluation of Finding 1
The draft claims that when
budget == nil, a newInMemorySearchBudgetis created inside the handler closure, and "the local variable shadows the closure's capturedbudget", so every call gets a fresh budget.Looking at the actual code in
classify.golines 118-126:The
budgethere is the parameter ofNewClassify(line 70). The closure captures this variable. The assignmentbudget = NewInMemorySearchBudget(...)uses=(not:=), so it modifies the captured variable. There is no new local variable declaration forbudget— onlymaxPerRunuses:=. On the next invocation,budgetwill be non-nil and theif budget == nilbranch is skipped.The draft's claim that "the local variable shadows the closure's captured budget" is incorrect. The budget is created once on the first nil-budget call and reused thereafter. The per-run cap IS enforced (though the
InMemorySearchBudgetis process-wide, as documented).Finding 1 is DROPPED.
Evaluation of Finding 2
The draft claims
coerceClassifyScoreaccepts malformed percent strings like"50%extra". Let me trace:strings.TrimSpace("50%extra")→"50%extra"strings.TrimSuffix("50%extra", "%")→"50extra"fmt.Sscanf("50extra", "%f", &f)parses50.0and stops (partial match), returnserr == nilstrings.HasSuffix(strings.TrimSpace(v), "%")—"50%extra"ends with"a", not%→ falseSo the draft's specific example is wrong:
"50%extra"does NOT end with%, soHasSuffixreturns false and the function returns50.0(not0.5). However, the underlying issue is real:Sscanfwith%fsilently ignores trailing garbage like"50extra", returning50.0for a clearly malformed input. The function should validate that the entire string was consumed.Finding 2 is KEPT (with corrected analysis).
Evaluation of Finding 3
tools/file_save.go:46:const defaultFileMaxBytes = 10 * 1024 * 1024 // 10 MiBtools/store.go:37: comment saysMaxFileBytes int // file_save per-file cap; default 16 MiBtools/store.go:44-45:if d.MaxFileBytes <= 0 { d.MaxFileBytes = 16 << 20 }Confirmed:
RegisterStoredefaults to 16 MiB,NewFileSavecalled directly with0defaults to 10 MiB. Real inconsistency.Finding 3 is KEPT.
Corrected Review
🎯 Correctness — Verdict: Minor issues
1.
coerceClassifyScorepercent-string parsing accepts malformed input silentlyFile:
tools/classify.go:249-265fmt.Sscanfwith%fparses a leading float and silently ignores trailing garbage. A string like"50extra"(no percent sign) is accepted:sbecomes"50extra",Sscanfparses50.0and stops,HasSuffixis false, and the function returns(50.0, true). The caller gets a score of50.0for a clearly malformed input — the clamping innormaliseClassifyScoreswould cap it to1.0, but the value is still silently accepted when it should be rejected.Impact: An LLM that emits
"50extra"instead of50gets a score of1.0(after clamping) rather than a rejection. This silently corrupts the classification distribution.Fix: Validate that after removing the
%suffix, the remainder is a pure numeric string with no trailing garbage. For example, usestrconv.ParseFloatinstead offmt.Sscanf, which rejects trailing characters:2. Minor:
defaultFileMaxBytes(10 MiB) vsRegisterStoredefault (16 MiB) inconsistencyFiles:
tools/file_save.go:46,tools/store.go:37NewFileSavefalls back todefaultFileMaxBytes = 10 * 1024 * 1024(10 MiB) whenmaxFileBytes <= 0. ButRegisterStoredefaultsMaxFileBytesto16 << 20(16 MiB) before passing it toNewFileSave. A caller usingRegisterStoregets 16 MiB; a caller callingNewFileSavedirectly with0gets 10 MiB.Not a correctness bug per se (both are valid caps), but the inconsistency is confusing and suggests one of the two constants is stale. The doc comment on
StoreDeps.MaxFileBytessays "default 16 MiB", which matchesRegisterStore— sodefaultFileMaxBytesinfile_save.gois likely the stale one.🧯 Error handling & edge cases — Minor issues
All three findings are confirmed against the actual code. Here is the corrected review.
🧯 Error handling & edge cases — review
Verdict: Minor issues
1.
create_file_url.go:181— Swallowedjson.MarshalerrorThe error from
json.Marshalis silently discarded. If marshaling fails (e.g. due to a non-serializable value in the struct), the function returns an empty string with no error, which would confuse the agent loop. Every other tool in this package (file_get.go:97,file_get_metadata.go:84,file_list.go:68, etc.) correctly returns the marshal error.Fix: Change to
b, err := json.Marshal(res)and return the error:2.
create_file_url.go— MissingValidateScopecall after file fetchcreate_file_urlfetches the file metadata and checksmeta.SkillIDagainstinv.SkillID(cross-skill guard), but it never callsValidateScopeon the file'smeta.Scope. Every other file-access tool in this package does:ValidateScope?file_get.go:85file_get_text.go:80file_get_metadata.go:73file_delete.go:55create_file_url.goImpact: A caller in skill
Scould mint a public URL for a file that belongs to skillSbut was saved underuser:bobscope, even when the caller isalice. The cross-skill check passes (same skill), but the scope isolation is bypassed — the file was meant to be private tobob.Fix: Add the same scope-validation pattern used in
file_get.go:3. Byte-boundary truncation can split multi-byte UTF-8 characters
Three tools truncate input text at a hard byte boundary without regard for UTF-8 rune boundaries:
classify.go:114:text = text[:classifyMaxInputBytes](16 KiB)extract_entities.go:124:text = text[:extractEntitiesMaxInputBytes](32 KiB)summarize.go:213:return text[:summarizeMaxInputBytes], true, nil(32 KiB)If the byte boundary falls in the middle of a multi-byte UTF-8 character, the resulting string contains an invalid trailing byte sequence. While LLMs are generally tolerant, this can cause tokenization artifacts or trigger validation warnings in downstream processing.
Fix: Truncate at the nearest rune boundary instead:
(Add
"unicode/utf8"to imports.)Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in 10m 8s
Pull request closed