fix: address verified gadfly P4/#4 findings (audit/budget/persona)
executus CI / test (push) Failing after 1m4s

Security (all 3 models, HIGH): audit OnTool persisted raw tool args and results
verbatim for the same tools that the OnStep narration redaction treats as
secret-bearing (mcp_call, email_send, http_*). The args/results are what CARRY
the secret, so secrets landed in skill_run_logs unredacted. Factored the
predicate into isSecretTool() (single source of truth); OnTool now emits
args_redacted/result_redacted (plus lengths) for secret tools. A test asserts
that no secret reaches the log.

(persona) webhook_ip_allowlist entries are now CIDR/IP-validated at load:
malformed entries are dropped with a warning instead of being accepted raw.
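
A minimal sketch of the load-time allowlist validation described above, not the code from this commit. The helper name validateAllowlist and the choice of net/netip are assumptions made for illustration; the actual persona loader may be structured differently.

```go
package main

import (
	"fmt"
	"log/slog"
	"net/netip"
	"strings"
)

// validateAllowlist is a hypothetical helper (not from the commit). It keeps
// entries that parse as a CIDR prefix or a single IP address and drops
// everything else with a warning.
func validateAllowlist(entries []string) []string {
	valid := make([]string, 0, len(entries))
	for _, raw := range entries {
		e := strings.TrimSpace(raw)
		if _, err := netip.ParsePrefix(e); err == nil {
			valid = append(valid, e) // well-formed CIDR, e.g. 10.0.0.0/8
			continue
		}
		if _, err := netip.ParseAddr(e); err == nil {
			valid = append(valid, e) // well-formed single IP
			continue
		}
		slog.Warn("dropping malformed webhook_ip_allowlist entry", "entry", raw)
	}
	return valid
}

func main() {
	in := []string{"10.0.0.0/8", "192.168.1.7", "not-an-ip", "300.1.1.1/33"}
	fmt.Println(validateAllowlist(in))
}
```

Dropping bad entries rather than failing the whole load keeps a single typo from disabling the persona, at the cost of a narrower allowlist than the operator intended; the warning is what makes that visible.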

Contract correctness (glm-5.2 + deepseek): audit Memory now honors its
documented Storage contract:
- ListChildrenByParent and ListFinishedRunsBefore return oldest-first.
- WalkParentChain returns root-first and honors MaxParentChainDepth.
- ListRunsFiltered clamps limit (<=0 or >500 -> 50).
- ListFinishedRunsBefore with limit<=0 returns none.
- An explicit RunFilter.Status (incl. "dry_run") matches regardless of
  IncludeDryRun.
- LastRunBySkills counts only status=="ok" unless includeFailed.
PurgeOlderThan keying on FinishedAt is the SAFE behavior (in-flight runs are
retained), so the doc was aligned to the implementation rather than the
reverse.
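
Two of the contract rules above can be sketched as small pure functions. This is an illustration under assumptions: the names clampListLimit, statusMatches, defaultListLimit and maxListLimit are hypothetical, the RunFilter type here carries only the two fields the message mentions, and hiding dry runs when no explicit status is set is an assumed default.

```go
package main

import "fmt"

// Hypothetical constants; the message only states the values 50 and 500.
const (
	defaultListLimit = 50
	maxListLimit     = 500
)

// clampListLimit applies the ListRunsFiltered rule: a limit that is <=0 or
// greater than 500 falls back to 50.
func clampListLimit(limit int) int {
	if limit <= 0 || limit > maxListLimit {
		return defaultListLimit
	}
	return limit
}

// RunFilter is a stand-in with only the fields named in the commit message.
type RunFilter struct {
	Status        string
	IncludeDryRun bool
}

// statusMatches lets an explicit Status win regardless of IncludeDryRun.
// ASSUMPTION: with no explicit Status, dry runs are hidden unless
// IncludeDryRun is set.
func statusMatches(f RunFilter, status string) bool {
	if f.Status != "" {
		return status == f.Status
	}
	if status == "dry_run" {
		return f.IncludeDryRun
	}
	return true
}

func main() {
	fmt.Println(clampListLimit(0), clampListLimit(501), clampListLimit(25))
	fmt.Println(statusMatches(RunFilter{Status: "dry_run"}, "dry_run"))
	fmt.Println(statusMatches(RunFilter{}, "dry_run"))
}
```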

Error-handling: appendLog now uses a bounded context (auditAppendTimeout=3s) so
a hung backend can't block the run goroutine on the hot path; Sink.StartRun
logs its (still best-effort) failure instead of swallowing it; budget Memory.Get
uses RLock (RWMutex); budget package doc fixed (was skillexec's); Check uses the
budgetWindow constant, not a duplicated literal.
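
The RLock change can be pictured with a small in-memory store. This is a sketch only: memoryBudget, its spent map, and the Add method are hypothetical stand-ins, and the real budget Memory type likely stores richer records.

```go
package main

import (
	"fmt"
	"sync"
)

// memoryBudget is a hypothetical stand-in for the in-memory budget store.
type memoryBudget struct {
	mu    sync.RWMutex
	spent map[string]int
}

func newMemoryBudget() *memoryBudget {
	return &memoryBudget{spent: make(map[string]int)}
}

// Get only reads, so it takes the shared read lock; concurrent readers do
// not block each other.
func (m *memoryBudget) Get(key string) (int, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	v, ok := m.spent[key]
	return v, ok
}

// Add mutates the map, so it takes the exclusive write lock.
func (m *memoryBudget) Add(key string, n int) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.spent[key] += n
}

func main() {
	m := newMemoryBudget()
	m.Add("skill-a", 3)

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			m.Get("skill-a") // readers share the lock
		}()
	}
	wg.Wait()

	v, ok := m.Get("skill-a")
	fmt.Println(v, ok)
}
```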

Triaged false-positive: NewNoOpBudget returning BudgetTracker is assignable to
run.Budget (identical method sets) — no change needed.
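
Why this is a false positive can be shown in isolation. In the sketch below, Budget stands in for run.Budget and the single Check method is a hypothetical signature; only the names BudgetTracker and NewNoOpBudget come from the message. Go allows assigning one interface value to another interface type whenever the method sets are compatible, so no conversion is needed.

```go
package main

import "fmt"

// BudgetTracker and Budget are declared separately but have identical method
// sets. The Check signature is an assumption for illustration.
type BudgetTracker interface {
	Check(skill string) error
}

// Budget stands in for run.Budget.
type Budget interface {
	Check(skill string) error
}

type noOpBudget struct{}

func (noOpBudget) Check(string) error { return nil }

// NewNoOpBudget returns the BudgetTracker interface type, as in the message.
func NewNoOpBudget() BudgetTracker { return noOpBudget{} }

func main() {
	// Compiles without a conversion: BudgetTracker implements Budget.
	var b Budget = NewNoOpBudget()
	fmt.Println(b.Check("any-skill") == nil)
}
```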

Core go.sum still free of host/DB deps.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 23:44:34 -04:00
parent 2260480c81
commit d82cef46b4
8 changed files with 197 additions and 24 deletions
+43 -6
@@ -168,16 +168,26 @@ func (w *Writer) OnStep(iter int, resp *llm.Response) {
 // surrounding narration could leak a secret (MCP args, email body/
 // recipients, raw HTTP request). Mirrors the steps.go redaction list so
 // the audit trace never persists secret-adjacent assistant text.
+// isSecretTool reports whether a tool's arguments/results may carry secrets
+// (MCP args, email bodies/recipients, HTTP auth headers/bodies) and so must be
+// redacted from the persisted audit log. Single source of truth for both the
+// step-narration redaction and the OnTool arg/result redaction. NOTE: this is
+// a name-prefix allowlist — a NEW secret-bearing tool must be added here or its
+// args/results will be logged verbatim.
+func isSecretTool(name string) bool {
+	switch name {
+	case "mcp_call", "email_send":
+		return true
+	}
+	return strings.HasPrefix(name, "http_")
+}
 func stepHasSecretTool(resp *llm.Response) bool {
 	if resp == nil {
 		return false
 	}
 	for _, c := range resp.ToolCalls {
-		switch c.Name {
-		case "mcp_call", "email_send":
-			return true
-		}
-		if strings.HasPrefix(c.Name, "http_") {
+		if isSecretTool(c.Name) {
 			return true
 		}
 	}
@@ -211,6 +221,24 @@ func (w *Writer) OnTool(call llm.ToolCall, result string) {
 		return
 	}
 	w.calls.Add(1)
+	// Redact the args/result of secret-bearing tools — these fields actually
+	// CARRY the secret (MCP args, email body/recipients, HTTP auth/body), so
+	// logging them verbatim would defeat the OnStep narration redaction.
+	if isSecretTool(call.Name) {
+		w.appendLog("tool_call", map[string]any{
+			"name": call.Name,
+			"id": call.ID,
+			"args_redacted": true,
+			"args_len": len(call.Arguments),
+		})
+		w.appendLog("tool_result", map[string]any{
+			"name": call.Name,
+			"id": call.ID,
+			"result_redacted": true,
+			"result_len": len(result),
+		})
+		return
+	}
 	w.appendLog("tool_call", map[string]any{
 		"name": call.Name,
 		"args": string(call.Arguments),
@@ -296,6 +324,10 @@ func (w *Writer) Close(ctx context.Context, stats RunStats) {
 // hung connection that the run goroutine shouldn't keep waiting on.
 const auditFinishTimeout = 10 * time.Second
+// auditAppendTimeout bounds each per-event AppendLog on the hot path so a hung
+// storage backend can't block the run goroutine.
+const auditAppendTimeout = 3 * time.Second
 // ToolCallsCount returns how many tool invocations OnTool has seen so
 // far. Useful for budget enforcement.
 func (w *Writer) ToolCallsCount() int { return int(w.calls.Load()) }
@@ -309,7 +341,12 @@ func (w *Writer) appendLog(eventType string, payload map[string]any) {
 		Payload: payload,
 		CreatedAt: time.Now(),
 	}
-	if err := w.storage.AppendLog(context.Background(), log); err != nil {
+	// Bound the write: a hung storage backend must not block the run goroutine
+	// on the hot path (every step/tool event flows through here). Detached from
+	// any caller deadline — the log write is independent of the run's context.
+	ctx, cancel := context.WithTimeout(context.Background(), auditAppendTimeout)
+	defer cancel()
+	if err := w.storage.AppendLog(ctx, log); err != nil {
 		slog.Warn("skillaudit: AppendLog failed", "run_id", w.runID, "seq", seq, "type", eventType, "error", err)
 	}
 }