# 🪰📋 gadfly-reports

A small **durable store + scoreboard** for [Gadfly](https://gitea.stevedudenhoeffer.com/steve/gadfly)
review findings. Gadfly (and any CI) POST each model's findings and per-review timing here; a human
or Claude, via [gadfly-mcp](https://gitea.stevedudenhoeffer.com/steve/gadfly-mcp), later grades
each finding. It's a single Go binary backed by SQLite, speaking a tiny HTTP API.

> ### 🤖 Heads up: this is a vibe-coded project
> gadfly-reports was built almost entirely by an AI agent (Claude Code): the design, the code, and
> these docs. It's small and it's tested, but treat it accordingly: it's a homelab-grade service,
> not a hardened product, and there may be the occasional AI-flavored rough edge. Issues and PRs
> welcome.

## What it stores, and what it deliberately doesn't

gadfly-reports is a **pure fact store**:

- **runs**: one per model's review of a PR, with wall-clock duration, lens count, and optional token/cost.
- **findings**: **content-addressed by location** (`repo + pr + lens + file + line`), so the *same*
  issue raised by several models collapses to one finding with many **reports**. That collapse is
  what makes cross-model **consensus** and per-model **precision** measurable.
- **grades**: a triage verdict per finding, made up of `is_real`, `severity`
  (`trivial|small|medium|high|critical`), optional `usefulness` (1–5), notes, and grader. Grade history
  is kept; the latest wins.

It stores **no points and computes no rankings.** Mapping severity → points and ranking models by
"value per minute" (or per token) is a **client/dashboard concern**, so you can retune the curve any
time without migrating or re-scoring stored data.

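As a rough illustration of the content-addressing idea above, the sketch below derives a finding ID from the location fields only, so two models reporting the same spot land on the same finding. The `Report` type, the `FindingID` and `GroupByFinding` names, and the SHA-256 key format are all assumptions made for this example; the store's real ID scheme may differ.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// Report is a hypothetical, trimmed-down report: one model's claim about one location.
type Report struct {
	Repo  string
	PR    int
	Lens  string
	File  string
	Line  int
	Model string
}

// FindingID hashes the location fields only (repo, pr, lens, file, line).
// The reporting model is deliberately left out of the key.
func FindingID(r Report) string {
	key := fmt.Sprintf("%s\x00%d\x00%s\x00%s\x00%d", r.Repo, r.PR, r.Lens, r.File, r.Line)
	sum := sha256.Sum256([]byte(key))
	return hex.EncodeToString(sum[:8])
}

// GroupByFinding collapses reports that share a location into one finding
// and returns a map of finding ID to the models that reported it.
func GroupByFinding(reports []Report) map[string][]string {
	out := make(map[string][]string)
	for _, r := range reports {
		id := FindingID(r)
		out[id] = append(out[id], r.Model)
	}
	return out
}

func main() {
	reports := []Report{
		{Repo: "steve/gadfly", PR: 12, Lens: "security", File: "main.go", Line: 40, Model: "model-a"},
		{Repo: "steve/gadfly", PR: 12, Lens: "security", File: "main.go", Line: 40, Model: "model-b"},
		{Repo: "steve/gadfly", PR: 12, Lens: "perf", File: "main.go", Line: 40, Model: "model-a"},
	}
	// The first two reports share a location, so they share one finding.
	for id, models := range GroupByFinding(reports) {
		fmt.Printf("%s reported by %s\n", id, strings.Join(models, ", "))
	}
}
```

Counting how many distinct models appear under one ID gives a simple consensus signal, and joining those IDs with the latest grade gives per-model precision.
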
## Run it

```sh
# from source
go run gitea.stevedudenhoeffer.com/steve/gadfly-reports@latest serve

# or Docker (image published by CI on every push to main)
docker run -d --name gadfly-reports -p 8090:8090 -v gadfly-reports-data:/data \
  -e GADFLY_REPORTS_TOKEN=change-me \
  gitea.stevedudenhoeffer.com/steve/gadfly-reports:latest
```

### Deploy behind Traefik (expose over a domain)

```yaml
# docker-compose.yml: publish gadfly-reports at https://reports.example.com via Traefik.
services:
  gadfly-reports:
    image: gitea.stevedudenhoeffer.com/steve/gadfly-reports:latest
    restart: unless-stopped
    environment:
      # Auth is built in: callers (gadfly emit, gadfly-mcp) send this as a bearer
      # token; /healthz stays open. ADDR and DB default to :8090 and
      # /data/gadfly-reports.db inside the image.
      GADFLY_REPORTS_TOKEN: ${GADFLY_REPORTS_TOKEN:?set GADFLY_REPORTS_TOKEN in .env}
    volumes:
      - gadfly-reports-data:/data
    networks: [traefik]
    healthcheck:
      test: ["CMD", "wget", "-q", "-O", "-", "http://localhost:8090/healthz"]
      interval: 30s
      timeout: 5s
      retries: 3
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.gadfly-reports.rule=Host(`reports.example.com`)"
      - "traefik.http.routers.gadfly-reports.entrypoints=websecure"
      - "traefik.http.routers.gadfly-reports.tls=true"
      - "traefik.http.routers.gadfly-reports.tls.certresolver=letsencrypt"
      - "traefik.http.services.gadfly-reports.loadbalancer.server.port=8090"

volumes:
  gadfly-reports-data:

networks:
  traefik:
    external: true # the network your Traefik instance is attached to
```

Put `GADFLY_REPORTS_TOKEN=<secret>` in a `.env` beside the compose file. Tailor the three
Traefik bits to your setup: the **host** (`reports.example.com`), the **entrypoint**
(`websecure`) and the **certresolver** (`letsencrypt`) must match your Traefik config, and the
`traefik` network must be the external one Traefik watches. Traefik terminates TLS and forwards
to the container's `:8090`. Then point `gadfly`'s `GADFLY_FINDINGS_URL` and `gadfly-mcp`'s
`--store` at `https://reports.example.com` (with the same token).

## HTTP API (the canonical contract)

| Method & path | Body / query | Purpose |
|---|---|---|
| `GET /healthz` | none | liveness (open even when a token is set) |
| `GET /` · `GET /ui` | none | **view-only dashboard**: public HTML shell; its JS fetches the gated endpoints with the token |
| `POST /runs` | one run object | upsert a model's review of a PR (timing/tokens) |
| `POST /reports` | JSON **array** of report objects | record findings + which model reported each |
| `POST /findings/{id}/grade` | `{is_real, severity?, usefulness?, notes?, grader?}` | record a triage grade |
| `GET /export` | none | flat report×finding×run×latest-grade rows (the dashboard feed) |
| `GET /runs` | none | list all runs (timing/tokens), oldest first |
| `GET /scoreboard` | none | points-free per-model rollup |

`POST /runs` body: `{run_id, repo, pr, model, provider, lenses, duration_secs, input_tokens?, output_tokens?, cost_usd?}`
(re-posting the same `run_id` updates it).

`POST /reports` array element: `{repo, pr, lens, file, line, title, model, provider, run_id, raw_severity, detail}`.

`GET /scoreboard` element: `{model, provider, runs, minutes, input_tokens, output_tokens, findings, confirmed, false_positive, ungraded, by_severity:{severity:count}}`.

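For callers other than gadfly itself, a `POST /runs` request could be built along these lines. This is a minimal sketch: the JSON field names come from the body shape listed above, but the Go types (`int` for `pr` and `lenses`, `float64` for `duration_secs`, pointers for the optional fields) and the `Run` / `NewRunRequest` names are assumptions, not the store's own definitions.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// Run mirrors the documented POST /runs body. Field types are assumed.
type Run struct {
	RunID        string   `json:"run_id"`
	Repo         string   `json:"repo"`
	PR           int      `json:"pr"`
	Model        string   `json:"model"`
	Provider     string   `json:"provider"`
	Lenses       int      `json:"lenses"`
	DurationSecs float64  `json:"duration_secs"`
	InputTokens  *int64   `json:"input_tokens,omitempty"`
	OutputTokens *int64   `json:"output_tokens,omitempty"`
	CostUSD      *float64 `json:"cost_usd,omitempty"`
}

// NewRunRequest builds (but does not send) a POST /runs request.
// An empty token leaves the Authorization header off, for open stores.
func NewRunRequest(baseURL, token string, run Run) (*http.Request, error) {
	body, err := json.Marshal(run)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/runs", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

func main() {
	req, err := NewRunRequest("https://reports.example.com", "change-me", Run{
		RunID:        "example-run-1", // hypothetical ID; re-posting it updates the run
		Repo:         "steve/gadfly",
		PR:           12,
		Model:        "model-a",
		Provider:     "provider-x",
		Lenses:       4,
		DurationSecs: 93.5,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String()) // POST https://reports.example.com/runs
	// To send it: resp, err := http.DefaultClient.Do(req)
}
```

Because the endpoint upserts on `run_id`, a caller can safely retry the same request after a timeout.
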
If `GADFLY_REPORTS_TOKEN` is set, every route except the public ones (`/healthz`, `/`, `/ui`)
requires `Authorization: Bearer <token>`. The `/ui` shell carries no data itself; its JS sends the
token on each fetch, so the public shell leaks nothing.

## Configuration

| Env | Default | Meaning |
|-----|---------|---------|
| `GADFLY_REPORTS_ADDR` | `:8090` | listen address |
| `GADFLY_REPORTS_DB` | `gadfly-reports.db` (`/data/gadfly-reports.db` in Docker) | SQLite path |
| `GADFLY_REPORTS_TOKEN` | *(empty)* | bearer token callers must present (empty = open) |

CLI flags `--addr` / `--db` / `--token` override the env.

## Dashboard

A built-in **read-only dashboard** ships at **`/ui`** (hit the host root and you're redirected
there). It's a single self-contained page that pulls `/runs` + `/export` and does everything in your
browser: a **per-model performance table** (runs, minutes, findings, confirmed / false-positive /
ungraded, points, **points-per-minute**, points-per-run, by-severity) with **drill-down filters**
(date range, repo, provider, model, lens, grade/severity), free-text search, and a click-to-scope
findings detail table.

True to the store's "no points" rule, **scoring lives in the browser**: the page has an editable
points curve (default `trivial=1, small=3, medium=5, high=8, critical=20`) and computes
`points = Σ weight[severity]·count` and `value/min = points / minutes` on the fly, so you can retune
it without touching stored data.

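The scoring arithmetic is small enough to reproduce in any client. The dashboard does this in browser JavaScript; the Go sketch below only mirrors the two formulas with the default curve, for anyone building their own view on top of `/export` or `/scoreboard`. The function names and the choice to return 0 when minutes is zero are assumptions.

```go
package main

import "fmt"

// DefaultCurve is the dashboard's default severity-to-points mapping.
var DefaultCurve = map[string]float64{
	"trivial":  1,
	"small":    3,
	"medium":   5,
	"high":     8,
	"critical": 20,
}

// Points computes the sum over severities of weight[severity] * count.
// Severities missing from the curve contribute zero.
func Points(curve map[string]float64, bySeverity map[string]int) float64 {
	total := 0.0
	for sev, n := range bySeverity {
		total += curve[sev] * float64(n)
	}
	return total
}

// ValuePerMinute divides points by minutes, returning 0 for non-positive
// minutes (an assumed convention to avoid dividing by zero).
func ValuePerMinute(points, minutes float64) float64 {
	if minutes <= 0 {
		return 0
	}
	return points / minutes
}

func main() {
	bySeverity := map[string]int{"small": 2, "high": 1, "critical": 1}
	p := Points(DefaultCurve, bySeverity) // 2*3 + 1*8 + 1*20 = 34
	fmt.Printf("points=%.0f value/min=%.2f\n", p, ValuePerMinute(p, 8))
	// prints: points=34 value/min=4.25
}
```

Swapping in a different curve map changes every score without any change to stored data, which is the point of keeping the store points-free.
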
Auth: the `/ui` shell is public (it holds no data); paste the store token into its **connect** box,
or open `/ui?token=<token>` once (remembered in `localStorage`). Prefer your own dashboard? Point
Grafana/Metabase/etc. at the SQLite file or the same `/export` + `/scoreboard` + `/runs` JSON.

## How it fits together

- **[gadfly](https://gitea.stevedudenhoeffer.com/steve/gadfly)** POSTs findings here after each
  review when `GADFLY_FINDINGS_URL` points at this store (advisory; off by default).
- **[gadfly-mcp](https://gitea.stevedudenhoeffer.com/steve/gadfly-mcp)** is the MCP server Claude
  uses to list findings and record grades against this store.

## Build / test

```sh
go build ./...
go test ./...
gofmt -l .   # must be clean
```

## License

MIT © 2026 Steve Dudenhoeffer.