
# 🪰📋 gadfly-reports
A small **durable store + scoreboard** for [Gadfly](https://gitea.stevedudenhoeffer.com/steve/gadfly)
review findings. Gadfly (and any CI) POST each model's findings and per-review timing here; a human
or Claude — via [gadfly-mcp](https://gitea.stevedudenhoeffer.com/steve/gadfly-mcp) — later grades
each finding. It's a single Go binary backed by SQLite, speaking a tiny HTTP API.
> ### 🤖 Heads up: this is a vibe-coded project
> gadfly-reports was built almost entirely by an AI agent (Claude Code) — the design, the code, and
> these docs. It's small and it's tested, but treat it accordingly: it's a homelab-grade service,
> not a hardened product, and there may be the occasional AI-flavored rough edge. Issues and PRs
> welcome.
## What it stores — and what it deliberately doesn't
gadfly-reports is a **pure fact store**:
- **runs** — one per model's review of a PR: wall-clock duration, lens count, optional token/cost.
- **findings** — **content-addressed by location** (`repo + pr + lens + file + line`), so the *same*
issue raised by several models collapses to one finding with many **reports**. That collapse is
what makes cross-model **consensus** and per-model **precision** measurable.
- **grades** — a triage verdict per finding: `is_real`, `severity`
(`trivial|small|medium|high|critical`), optional `usefulness` (1–5), notes, grader. Grade history
is kept; the latest wins.
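
As a rough illustration of why location-based addressing makes consensus measurable, here is a minimal sketch. The hash function, ID length, field types, and the `Report`, `FindingID`, and `Collapse` names are all assumptions made for this example; the store's real ID scheme is not documented here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Report is one model's claim about a location. The location fields follow
// the tuple named above; the Go types are assumed.
type Report struct {
	Repo  string
	PR    int
	Lens  string
	File  string
	Line  int
	Model string
}

// FindingID hashes only the location fields, so the reporting model never
// influences the ID. Hash choice and truncation are illustrative only.
func FindingID(r Report) string {
	h := sha256.New()
	// %q quotes each string so field boundaries stay unambiguous.
	fmt.Fprintf(h, "%q|%d|%q|%q|%d", r.Repo, r.PR, r.Lens, r.File, r.Line)
	return hex.EncodeToString(h.Sum(nil))[:16]
}

// Collapse groups reports by finding ID and returns, per finding, the set
// of distinct models that raised it.
func Collapse(reports []Report) map[string]map[string]bool {
	out := make(map[string]map[string]bool)
	for _, r := range reports {
		id := FindingID(r)
		if out[id] == nil {
			out[id] = make(map[string]bool)
		}
		out[id][r.Model] = true
	}
	return out
}

func main() {
	// Made-up sample data: two models agree on one location.
	reports := []Report{
		{"steve/gadfly", 12, "security", "main.go", 42, "model-a"},
		{"steve/gadfly", 12, "security", "main.go", 42, "model-b"},
		{"steve/gadfly", 12, "security", "util.go", 7, "model-a"},
	}
	for id, models := range Collapse(reports) {
		fmt.Printf("finding %s reported by %d model(s)\n", id, len(models))
	}
}
```

The number of models behind each ID is the consensus signal: a finding raised by two models shows up once, with two reports attached.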
It stores **no points and computes no rankings.** Mapping severity → points and ranking models by
"value per minute" (or per token) is a **client/dashboard concern**, so you can retune the curve any
time without migrating or re-scoring stored data.
## Run it
```sh
# from source
go run gitea.stevedudenhoeffer.com/steve/gadfly-reports@latest serve
# or Docker (image published by CI on every push to main)
docker run -d --name gadfly-reports -p 8090:8090 -v gadfly-reports-data:/data \
  -e GADFLY_REPORTS_TOKEN=change-me \
  gitea.stevedudenhoeffer.com/steve/gadfly-reports:latest
```
### Deploy behind Traefik (expose over a domain)
```yaml
# docker-compose.yml — publish gadfly-reports at https://reports.example.com via Traefik.
services:
  gadfly-reports:
    image: gitea.stevedudenhoeffer.com/steve/gadfly-reports:latest
    restart: unless-stopped
    environment:
      # Auth is built in: callers (gadfly emit, gadfly-mcp) send this as a bearer
      # token; /healthz stays open. ADDR and DB default to :8090 and
      # /data/gadfly-reports.db inside the image.
      GADFLY_REPORTS_TOKEN: ${GADFLY_REPORTS_TOKEN:?set GADFLY_REPORTS_TOKEN in .env}
    volumes:
      - gadfly-reports-data:/data
    networks: [traefik]
    healthcheck:
      test: ["CMD", "wget", "-q", "-O", "-", "http://localhost:8090/healthz"]
      interval: 30s
      timeout: 5s
      retries: 3
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.gadfly-reports.rule=Host(`reports.example.com`)"
      - "traefik.http.routers.gadfly-reports.entrypoints=websecure"
      - "traefik.http.routers.gadfly-reports.tls=true"
      - "traefik.http.routers.gadfly-reports.tls.certresolver=letsencrypt"
      - "traefik.http.services.gadfly-reports.loadbalancer.server.port=8090"
volumes:
  gadfly-reports-data:
networks:
  traefik:
    external: true # the network your Traefik instance is attached to
```
Put `GADFLY_REPORTS_TOKEN=<secret>` in a `.env` beside the compose file. Tailor the three
Traefik bits to your setup — the **host** (`reports.example.com`), the **entrypoint**
(`websecure`) and the **certresolver** (`letsencrypt`) must match your Traefik config, and the
`traefik` network must be the external one Traefik watches. Traefik terminates TLS and forwards
to the container's `:8090`. Then point `gadfly`'s `GADFLY_FINDINGS_URL` and `gadfly-mcp`'s
`--store` at `https://reports.example.com` (with the same token).
## HTTP API (the canonical contract)
| Method & path | Body / query | Purpose |
|---|---|---|
| `GET /healthz` | — | liveness (open even when a token is set) |
| `GET /` · `GET /ui` | — | **view-only dashboard** — HTML shell, public; its JS fetches the gated endpoints with the token |
| `POST /runs` | one run object | upsert a model's review of a PR (timing/tokens) |
| `POST /reports` | JSON **array** of report objects | record findings + which model reported each |
| `POST /findings/{id}/grade` | `{is_real, severity?, usefulness?, notes?, grader?}` | record a triage grade |
| `GET /export` | — | flat report×finding×run×latest-grade rows — the dashboard feed |
| `GET /runs` | — | list all runs (timing/tokens), oldest first |
| `GET /scoreboard` | — | points-free per-model rollup |

`POST /runs` body: `{run_id, repo, pr, model, provider, lenses, duration_secs, input_tokens?, output_tokens?, cost_usd?}`
(re-posting the same `run_id` updates it).

`POST /reports` array element: `{repo, pr, lens, file, line, title, model, provider, run_id, raw_severity, detail}`.

`GET /scoreboard` element: `{model, provider, runs, minutes, input_tokens, output_tokens, findings, confirmed, false_positive, ungraded, by_severity:{severity:count}}`.

If `GADFLY_REPORTS_TOKEN` is set, every route except `/healthz` and the public dashboard shell
(`/`, `/ui`) requires `Authorization: Bearer <token>`. The shell carries no data itself (its JS sends
the token on each fetch), so leaving it public leaks nothing.
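
A client-side sketch of the bearer-token rule, using only the Go standard library. `NewStoreRequest` is a hypothetical helper, and the `Content-Type: application/json` header is an assumption based on the JSON bodies in the table above.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// NewStoreRequest builds a request against the store and attaches the
// bearer token when one is configured. An empty token leaves the request
// unauthenticated, which is enough for /healthz or a token-less store.
func NewStoreRequest(method, baseURL, path, token string, body []byte) (*http.Request, error) {
	var rdr io.Reader
	if body != nil {
		rdr = bytes.NewReader(body)
	}
	req, err := http.NewRequest(method, baseURL+path, rdr)
	if err != nil {
		return nil, err
	}
	if body != nil {
		req.Header.Set("Content-Type", "application/json") // assumed
	}
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

func main() {
	req, err := NewStoreRequest(http.MethodGet, "https://reports.example.com", "/scoreboard", "change-me", nil)
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	// Nothing is sent here; hand req to http.DefaultClient.Do to call a real store.
	fmt.Println(req.Method, req.URL, req.Header.Get("Authorization"))
}
```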
## Configuration
| Env | Default | Meaning |
|-----|---------|---------|
| `GADFLY_REPORTS_ADDR` | `:8090` | listen address |
| `GADFLY_REPORTS_DB` | `gadfly-reports.db` (`/data/gadfly-reports.db` in Docker) | SQLite path |
| `GADFLY_REPORTS_TOKEN` | *(empty)* | bearer token callers must present (empty = open) |

CLI flags `--addr` / `--db` / `--token` override the corresponding environment variables.
## Dashboard
A built-in **read-only dashboard** ships at **`/ui`** (hit the host root and you're redirected
there). It's a single self-contained page that pulls `/runs` + `/export` and does everything in your
browser: a **per-model performance table** — runs, minutes, findings, confirmed / false-positive /
ungraded, points, **points-per-minute**, points-per-run, by-severity — with **drill-down filters**
(date range, repo, provider, model, lens, grade/severity), free-text search, and a click-to-scope
findings detail table.
True to the store's "no points" rule, **scoring lives in the browser**: the page has an editable
points curve (default `trivial=1, small=3, medium=5, high=8, critical=20`) and computes
`points = Σ weight[severity]·count` and `value/min = points / minutes` on the fly — retune it without
touching stored data.
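
The same arithmetic is easy to reproduce outside the browser. The sketch below applies the default curve quoted above to a made-up set of severity counts; `Points` and `ValuePerMinute` are hypothetical names, and returning 0 for non-positive minutes is a choice made for this example, not documented dashboard behavior.

```go
package main

import "fmt"

// DefaultCurve is the dashboard's default points curve as quoted above.
var DefaultCurve = map[string]float64{
	"trivial": 1, "small": 3, "medium": 5, "high": 8, "critical": 20,
}

// Points computes the sum over severities of weight[severity] * count.
// Severities missing from the curve contribute 0.
func Points(curve map[string]float64, bySeverity map[string]int) float64 {
	total := 0.0
	for sev, n := range bySeverity {
		total += curve[sev] * float64(n)
	}
	return total
}

// ValuePerMinute divides points by minutes, returning 0 when minutes is
// not positive (an assumption for this sketch).
func ValuePerMinute(points, minutes float64) float64 {
	if minutes <= 0 {
		return 0
	}
	return points / minutes
}

func main() {
	counts := map[string]int{"trivial": 2, "medium": 1, "critical": 1}
	p := Points(DefaultCurve, counts)    // 2*1 + 1*5 + 1*20 = 27
	fmt.Println(p, ValuePerMinute(p, 9)) // prints "27 3"
}
```

Because the curve is just a map passed in at call time, retuning it means changing the weights and recomputing; nothing stored has to change.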
Auth: the `/ui` shell is public (it holds no data); paste the store token into its **connect** box,
or open `/ui?token=<token>` once (remembered in `localStorage`). Prefer your own dashboard? Point
Grafana/Metabase/etc. at the SQLite file or the same `/export` + `/scoreboard` + `/runs` JSON.
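
If you do build your own view on the JSON, a consumer could look something like this sketch. It assumes `/scoreboard` returns a JSON array of the element shape listed in the API section, and it defines precision as `confirmed / (confirmed + false_positive)`, ignoring ungraded findings; both are assumptions, and the sample payload is invented.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ScoreRow mirrors a subset of the documented /scoreboard element. Types are assumed.
type ScoreRow struct {
	Model         string         `json:"model"`
	Provider      string         `json:"provider"`
	Runs          int            `json:"runs"`
	Minutes       float64        `json:"minutes"`
	Findings      int            `json:"findings"`
	Confirmed     int            `json:"confirmed"`
	FalsePositive int            `json:"false_positive"`
	Ungraded      int            `json:"ungraded"`
	BySeverity    map[string]int `json:"by_severity"`
}

// Precision returns confirmed / (confirmed + false_positive). The second
// result is false when the model has no graded findings yet.
func (r ScoreRow) Precision() (float64, bool) {
	graded := r.Confirmed + r.FalsePositive
	if graded == 0 {
		return 0, false
	}
	return float64(r.Confirmed) / float64(graded), true
}

// ParseScoreboard decodes a scoreboard response, assumed to be a JSON array.
func ParseScoreboard(data []byte) ([]ScoreRow, error) {
	var rows []ScoreRow
	err := json.Unmarshal(data, &rows)
	return rows, err
}

func main() {
	// Invented payload in the documented shape.
	sample := []byte(`[{"model":"model-a","provider":"example","runs":3,"minutes":9,
		"findings":9,"confirmed":6,"false_positive":2,"ungraded":1,
		"by_severity":{"medium":4,"high":2}}]`)

	rows, err := ParseScoreboard(sample)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for _, r := range rows {
		if p, ok := r.Precision(); ok {
			fmt.Printf("%s precision=%.2f\n", r.Model, p) // model-a precision=0.75
		}
	}
}
```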
## How it fits together
- **[gadfly](https://gitea.stevedudenhoeffer.com/steve/gadfly)** POSTs findings here after each
review when `GADFLY_FINDINGS_URL` points at this store (advisory; off by default).
- **[gadfly-mcp](https://gitea.stevedudenhoeffer.com/steve/gadfly-mcp)** is the MCP server Claude
uses to list findings and record grades against this store.
## Build / test
```sh
go build ./...
go test ./...
gofmt -l . # must be clean
```
## License
MIT © 2026 Steve Dudenhoeffer.