steve f92e54e3ed
feat: gadfly-mcp — MCP server for grading gadfly-reports findings
Thin, stateless stdio MCP server (official Go SDK) that exposes a gadfly-reports store to an MCP client (e.g. Claude). Tools: list_findings, record_finding_grade, scoreboard (grader forced to claude). Launch via 'go run ...@latest' — nothing to install. Core logic tested against httptest, no daemon required.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 23:55:24 -04:00

🪰🔌 gadfly-mcp

An MCP server that lets an MCP client (e.g. Claude) read and grade Gadfly review findings stored in gadfly-reports. It is a tiny, stateless stdio process that acts as a thin HTTP client to the store, so there is nothing to install or manage beyond a Go toolchain: your MCP client launches it on demand with go run …@latest.

🤖 Heads up: this is a vibe-coded project

gadfly-mcp was built almost entirely by an AI agent (Claude Code) — code and docs. It's small and tested, but treat it as homelab-grade. Issues and PRs welcome.

Add it to Claude

The store (gadfly-reports) runs persistently somewhere; this MCP server is throwaway. Point your client at it via go run. The first launch compiles and caches the build, and it needs a Go toolchain plus network access to the module host:

{
  "mcpServers": {
    "gadfly": {
      "command": "go",
      "args": [
        "run", "gitea.stevedudenhoeffer.com/steve/gadfly-mcp@latest",
        "--store", "https://gadfly-reports.your-host:8090"
      ],
      "env": { "GADFLY_REPORTS_TOKEN": "the-same-token-the-store-uses" }
    }
  }
}

If --store is omitted, the server uses $GADFLY_REPORTS_URL, falling back to http://localhost:8090 when that is unset. If the store requires a bearer token, set GADFLY_REPORTS_TOKEN.
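
The lookup order described above can be sketched as follows. This is a minimal illustration, not the project's actual code: resolveStore and newRequest are hypothetical helper names, and the sketch assumes the token is sent as a standard Authorization: Bearer header.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// defaultStore is the fallback address named in this README.
const defaultStore = "http://localhost:8090"

// resolveStore (hypothetical helper) picks the store URL:
// the --store flag value first, then $GADFLY_REPORTS_URL, then the default.
func resolveStore(flagValue string, getenv func(string) string) string {
	if flagValue != "" {
		return flagValue
	}
	if v := getenv("GADFLY_REPORTS_URL"); v != "" {
		return v
	}
	return defaultStore
}

// newRequest (hypothetical helper) builds a GET request and attaches a
// bearer token only when one is configured. Assumption: the store reads
// the token from the Authorization header.
func newRequest(url, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	return req, nil
}

func main() {
	store := resolveStore("", os.Getenv)
	req, err := newRequest(store, os.Getenv("GADFLY_REPORTS_TOKEN"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Report the resolved URL and whether a token was attached.
	fmt.Println(req.URL.String(), req.Header.Get("Authorization") != "")
}
```

Passing getenv as a parameter keeps the resolution logic testable without touching the real environment.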

Tools

| Tool | Args | Does |
| --- | --- | --- |
| list_findings | repo?, pr?, only_ungraded? | lists findings (one entry per finding; reports from multiple models grouped, distinct models listed) |
| record_finding_grade | finding_id, is_real, severity?, usefulness?, notes? | records a triage grade (grader is always claude) |
| scoreboard | model? | per-model rollup (runs, minutes, tokens, confirmed-by-severity histogram) |
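
For orientation, an MCP client invokes a tool with a JSON-RPC tools/call request. The sketch below builds one for record_finding_grade. The argument types are assumptions (finding_id as a string, is_real as a boolean), the finding ID is a placeholder, and gradeCall is a hypothetical helper; in practice the MCP client builds this message for you.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toolCall mirrors the JSON-RPC envelope MCP uses for tool invocations.
type toolCall struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

type callParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

// gradeCall (hypothetical helper) builds a record_finding_grade call.
// Severity is included only for real findings, matching the README's
// guidance to omit it for a false positive.
func gradeCall(id int, findingID string, isReal bool, severity string) toolCall {
	args := map[string]any{
		"finding_id": findingID, // assumed to be a string
		"is_real":    isReal,
	}
	if isReal && severity != "" {
		args["severity"] = severity
	}
	return toolCall{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callParams{Name: "record_finding_grade", Arguments: args},
	}
}

func main() {
	out, err := json.MarshalIndent(gradeCall(1, "example-finding-id", true, "medium"), "", "  ")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```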

severity is one of trivial|small|medium|high|critical (set it when is_real=true; omit for a false positive). Points are not stored or returned — gadfly-reports keeps raw facts, so any "value per minute / per token" ranking is computed client-side (map severity → points, divide by minutes). Use scoreboard for the raw material.
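
A client-side ranking along those lines might look like the sketch below. Everything specific here is an assumption for illustration: the point weights are arbitrary, modelStats is a hypothetical shape rather than the scoreboard tool's actual response, and the sample numbers are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// points maps each severity to a weight. These weights are illustrative
// assumptions; gadfly-reports does not define or store them.
var points = map[string]float64{
	"trivial":  1,
	"small":    2,
	"medium":   5,
	"high":     10,
	"critical": 20,
}

// modelStats is a hypothetical, simplified view of one scoreboard row.
type modelStats struct {
	Model     string
	Minutes   float64
	Confirmed map[string]int // severity -> confirmed finding count
}

// valuePerMinute sums severity points and divides by minutes spent.
func valuePerMinute(s modelStats) (float64, error) {
	if s.Minutes <= 0 {
		return 0, fmt.Errorf("model %q has no recorded minutes", s.Model)
	}
	total := 0.0
	for sev, n := range s.Confirmed {
		w, ok := points[sev]
		if !ok {
			return 0, fmt.Errorf("unknown severity %q", sev)
		}
		total += w * float64(n)
	}
	return total / s.Minutes, nil
}

func main() {
	// Made-up sample rows, not real results.
	rows := []modelStats{
		{Model: "model-a", Minutes: 10, Confirmed: map[string]int{"high": 1, "small": 5}},
		{Model: "model-b", Minutes: 4, Confirmed: map[string]int{"medium": 2}},
	}
	type ranked struct {
		Model string
		Score float64
	}
	var out []ranked
	for _, r := range rows {
		v, err := valuePerMinute(r)
		if err != nil {
			fmt.Println("skipping:", err)
			continue
		}
		out = append(out, ranked{r.Model, v})
	}
	// Highest value per minute first.
	sort.Slice(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	for _, r := range out {
		fmt.Printf("%s %.2f\n", r.Model, r.Score)
	}
}
```

Dividing by tokens instead of minutes gives the per-token variant; only the denominator changes.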

Typical flow: "List the ungraded gadfly findings on PR 2, look into each against the code, and record a grade for each."

Build / test

go build ./...
go test ./...
gofmt -l .   # must be clean

License

MIT © 2026 Steve Dudenhoeffer.