P1 (part 1): move skilltools core -> tool/ (clean, verbatim)

The tool registry core (registry, permission model, Invocation, gated-tool wrapper, ssrf guard, hmac, encryption, argcoerce, helpers, rootrun, session_tools, webhook_rate_limit) had zero mort coupling: it imports only majordomo/llm + x/crypto/hkdf, so it moves verbatim with a package rename (skilltools -> tool). All same-package tests came along and pass; the SSRF, gated-wrapper, encryption and output-pattern invariants are re-anchored here.

majordomo re-enters the module graph (now pinned to the latest, incl. the front-loaded-output fix). model/ + llmmeta + structured follow next.

Docs: CLAUDE.md now requires README/examples to stay in sync with changes in the same commit; CI skips docs/example-only pushes via paths-ignore.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 19:31:47 -04:00
parent d2c18ad5bb
commit dc28b63ad8
24 changed files with 3461 additions and 1 deletions
@@ -0,0 +1,145 @@
+// Package tool, webhook_rate_limit.go: per-IP-per-skill
+// sliding-window rate limiter for the v7 inbound webhook handler.
+//
+// Why an in-memory limiter (vs Redis or DB-backed): rate limiting is
+// the cheap reject path BEFORE the HMAC compute and run-budget check,
+// and an extra round-trip per inbound webhook would be wasted. The
+// 6-person server's volume is well within a single-process limiter's
+// scale; if mort is ever multi-process the limiter becomes
+// approximate (still good enough to throttle abusive sources).
+//
+// Why per-IP-per-skill (vs per-IP global): one busy webhook (e.g.
+// GitHub PR opened) shouldn't shadow another (Stripe charge). The
+// composite key keeps a noisy source from pushing other skills'
+// callers off the lane.
+//
+// Test: webhook_rate_limit_test.go covers admit + reject paths.
+package tool
+
+import (
+	"sync"
+	"time"
+)
+
+// WebhookRateLimiter is a sliding-window per-(skillID, sourceIP)
+// counter. Configure once at construction; concurrent-safe.
+type WebhookRateLimiter struct {
+	limit  int
+	window time.Duration
+	clock  func() time.Time
+
+	mu      sync.Mutex
+	buckets map[string]*rateBucket // key = skillID + "|" + sourceIP
+}
+
+type rateBucket struct {
+	// hits is a slice of timestamps within the window. Pruned on
+	// every Admit call so the slice never grows unbounded.
+	hits []time.Time
+}
+
+// NewWebhookRateLimiter constructs the limiter.
+//
+//   - limit: max calls per (skill, ip) within window. <=0 means
+//     "unlimited" (every call admitted; useful for tests).
+//   - window: sliding window length. <=0 falls back to 1 minute.
+//   - clock: testable wall-clock; nil falls back to time.Now.
+func NewWebhookRateLimiter(limit int, window time.Duration, clock func() time.Time) *WebhookRateLimiter {
+	if window <= 0 {
+		window = time.Minute
+	}
+	if clock == nil {
+		clock = time.Now
+	}
+	return &WebhookRateLimiter{
+		limit:   limit,
+		window:  window,
+		clock:   clock,
+		buckets: make(map[string]*rateBucket),
+	}
+}
+
+// Admit returns (true, 0) if the call is within the rate cap (records
+// the hit), or (false, retry-after) if the cap is hit. retry-after is
+// the time until the OLDEST hit in the window expires — the caller can
+// surface it via the Retry-After response header.
+//
+// Why return retry-after not just bool: HTTP 429 responses commonly
+// include Retry-After so clients know when to retry instead of
+// polling blindly; computing it from the sliding window is
+// essentially free.
+func (l *WebhookRateLimiter) Admit(skillID, sourceIP string) (bool, time.Duration) {
+	if l.limit <= 0 {
+		return true, 0
+	}
+	now := l.clock()
+	cutoff := now.Add(-l.window)
+	key := skillID + "|" + sourceIP
+
+	l.mu.Lock()
+	defer l.mu.Unlock()
+
+	b, ok := l.buckets[key]
+	if !ok {
+		b = &rateBucket{}
+		l.buckets[key] = b
+	}
+	// Prune in place. The slice is append-only at the tail; the head
+	// shrinks as old hits fall out of the window.
+	first := 0
+	for first < len(b.hits) && b.hits[first].Before(cutoff) {
+		first++
+	}
+	if first > 0 {
+		// Copy the surviving tail to the head; reuse backing array.
+		n := copy(b.hits, b.hits[first:])
+		b.hits = b.hits[:n]
+	}
+	if len(b.hits) >= l.limit {
+		oldest := b.hits[0]
+		retryAfter := oldest.Add(l.window).Sub(now)
+		if retryAfter < 0 {
+			retryAfter = 0
+		}
+		return false, retryAfter
+	}
+	b.hits = append(b.hits, now)
+	return true, 0
+}
+
+// Sweep purges buckets whose hit-list is empty after pruning. Called
+// periodically (e.g. once per minute) to bound the buckets map's
+// growth.
+//
+// Why a separate Sweep vs auto-prune in Admit: Admit only prunes the
+// bucket for the key it is called with, so a hostile source that
+// rotates through many IP addresses, each hitting once, would leave
+// millions of single-hit buckets in the map. A periodic sweep keeps
+// the worst case bounded.
+func (l *WebhookRateLimiter) Sweep() {
+	now := l.clock()
+	cutoff := now.Add(-l.window)
+	l.mu.Lock()
+	defer l.mu.Unlock()
+	for k, b := range l.buckets {
+		// Prune in place.
+		first := 0
+		for first < len(b.hits) && b.hits[first].Before(cutoff) {
+			first++
+		}
+		if first > 0 {
+			n := copy(b.hits, b.hits[first:])
+			b.hits = b.hits[:n]
+		}
+		if len(b.hits) == 0 {
+			delete(l.buckets, k)
+		}
+	}
+}
+
+// CountKeys returns the bucket count. Test helper.
+func (l *WebhookRateLimiter) CountKeys() int {
+	l.mu.Lock()
+	defer l.mu.Unlock()
+	return len(l.buckets)
+}