gadfly-reports

steve/gadfly-reports

Commit Graph

Author

SHA1

Message

Date

steve

1af115fdf1

feat: PR filter — compare models on the same set of PRs

Build & push image / build-and-push (push) Successful in 13s

CI / test (push) Successful in 9m51s

UI: a repo#pr multi-select (labeled with how many models ran each PR)
scopes the whole table — runs, minutes, findings, points — to the chosen
PRs, so a model with 2 runs can be fairly compared against one with 60.
API: GET /scoreboard accepts ?repo= and ?pr= (repeatable or comma-list).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

2026-07-02 22:56:49 -04:00

steve

ddcf42a3ce

feat: gadfly-reports — findings store + scoreboard daemon

Build & push image / build-and-push (push) Successful in 1m13s

CI / test (push) Successful in 10m39s

SQLite-backed HTTP store for Gadfly review findings, per-review run timings, and human/Claude grades, with a points-free per-model scoreboard. Pure fact store: it computes no points or rankings (the dashboard maps severity->points client-side and retunes without re-scoring). Findings are content-addressed by location so cross-model reports collapse for consensus; one grade per finding, latest wins. Pure-Go SQLite (CGO-free) + Docker image CI + tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-26 23:55:24 -04:00

2 Commits