feat(ui): false-positive penalty (severity-scaled, default -0.5)

Adds an editable 'false-positive penalty ×' to the dashboard. A false positive carries no graded severity, so it's penalized by the severity the model CLAIMED (its lens verdict / raw_severity, mapped onto the curve: Blocking->high, Minor->small). points(net) = confirmed points + Σ penalty×points[claimed], so a model with a few good finds but many false positives nets down — even negative — and sorts to the bottom. Adds an 'fp pen' column; net points/pts-min/pts-run shown red when negative. Client-side only; the store stays point-free.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 09:50:18 -04:00
parent 35ebc53561
commit 0cb6b25f11
2 changed files with 30 additions and 7 deletions
@@ -131,6 +131,12 @@ points curve (default `trivial=1, small=3, medium=5, high=8, critical=20`) and c
`points = Σ weight[severity]·count` and `value/min = points / minutes` on the fly — retune it without
touching stored data.
There's also an editable **false-positive penalty ×** (default `-0.5`). A false positive has no
graded severity, so it's penalized by the severity the model **claimed** (its lens verdict —
Blocking→high, Minor→small): `penalty × points[claimed]`. So a Blocking-claimed FP at `-0.5` costs
`high(8) × -0.5 = -4`, and a model with the odd good find but many false positives nets *down*,
even negative, instead of coasting on its hits.
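
As a rough illustration of the arithmetic only (not the dashboard's actual client-side code), the net score can be sketched in Python. The function name `net_points`, the `CLAIMED_TO_SEVERITY` table, and the input shapes are assumptions made for this example; only the default curve, the `-0.5` default, and the Blocking→high / Minor→small mapping come from the text above.

```python
# Default points curve, as documented above.
POINTS = {"trivial": 1, "small": 3, "medium": 5, "high": 8, "critical": 20}

# Claimed verdict -> severity on the curve. Only these two mappings are
# documented; any other verdict would need its own entry (assumption).
CLAIMED_TO_SEVERITY = {"Blocking": "high", "Minor": "small"}


def net_points(confirmed, false_positive_claims, penalty=-0.5, points=POINTS):
    """Hypothetical helper: confirmed points + sum(penalty * points[claimed]).

    confirmed: dict of severity -> count of confirmed findings
    false_positive_claims: list of claimed verdicts, one per false positive
    """
    confirmed_pts = sum(points[sev] * count for sev, count in confirmed.items())
    fp_penalty = sum(
        penalty * points[CLAIMED_TO_SEVERITY[claim]]
        for claim in false_positive_claims
    )
    return confirmed_pts + fp_penalty


# One confirmed medium find (5) and three Blocking-claimed false positives
# (3 x -4 = -12) net out below zero.
print(net_points({"medium": 1}, ["Blocking"] * 3))  # -7.0
```

Setting `penalty=0` in this sketch reproduces the plain `Σ weight[severity]·count` score, which is one way to sanity-check a retuned curve.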
Auth: the `/ui` shell is public (it holds no data); paste the store token into its **connect** box,
or open `/ui?token=<token>` once (remembered in `localStorage`). Prefer your own dashboard? Point
Grafana/Metabase/etc. at the SQLite file or the same `/export` + `/scoreboard` + `/runs` JSON.