gadfly/CLAUDE.md
Steve Dudenhoeffer 04cd260ff9
docs: add CLAUDE.md + provider example configs
- CLAUDE.md: project goals (advisory-only, real-bugs-not-nits, easy-to-enable,
  provider-agnostic, portable), architecture map, build/test/release, and
  maintenance rules — incl. "keep README + examples/ current with any env/flag/
  provider/trigger change" and the advisory-only invariant.
- examples/: local-ollama.yml, openai-compatible.yml, endpoint-aliases.yml +
  an examples/README index; README setup step points at them.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 19:06:08 -04:00


# Gadfly — Developer Guide
Gadfly (🪰) is an **agentic adversarial code reviewer** that runs in Gitea Actions. On a pull
request it reads the *checked-out repository* with read-only tools, hunts for real problems,
verifies each one against the actual code, and posts its findings as a comment. It is
**advisory only** — it never blocks a merge.
> This is a public, **vibe-coded** project (built largely by an AI agent). Keep that framing
> honest in the README; don't oversell it.
## Project goals (keep changes aligned to these)
1. **Find *real* problems, not nits.** The whole point of the agentic tools + two-pass
recheck is to kill diff-only false positives. Anything that raises the false-positive rate
(or removes verification) works against the project.
2. **Advisory, never blocking.** Gadfly must never fail a CI job for review *content*, never
merge, never deploy. The binary exits non-zero only on usage/config errors; even then `run.sh`
posts a notice rather than failing the job. Do not add it to branch-protection required checks.
3. **Easy to turn on for any repo.** Consumers should need only a ~15-line stub workflow + a
couple of secrets/vars. All real logic lives in the image (`entrypoint.sh`), not in the
consumer's YAML (Gitea's act_runner has weak YAML expression support).
4. **Provider-agnostic.** Powered by [majordomo](https://gitea.stevedudenhoeffer.com/steve/majordomo),
so it can target Ollama (local/cloud), OpenAI, Anthropic, Google, or any
OpenAI/Ollama-compatible endpoint. Don't re-hardcode a single provider.
5. **Portable & self-contained.** `cmd/gadfly` depends only on the Go stdlib + majordomo. Keep
it that way — no heavyweight deps, no coupling to any one consumer repo (e.g. mort).
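The advisory-only rule in goal 2 can be pictured as a small exit-code policy. The sketch below is illustrative only: `outcome`, `exitCode`, and `errMissingModel` are hypothetical names, not identifiers from `cmd/gadfly`, and the specific non-zero value is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errMissingModel is a hypothetical config error used for illustration.
var errMissingModel = errors.New("no model configured")

// outcome is a hypothetical summary of one reviewer run; the real binary
// may model this differently.
type outcome struct {
	configErr error  // usage/config problem, if any
	verdict   string // review verdict, e.g. "Blocking issues found"
}

// exitCode returns non-zero only for usage/config errors. The verdict is
// deliberately ignored so review content can never fail a CI job.
func exitCode(o outcome) int {
	if o.configErr != nil {
		return 2 // assumed value; any non-zero code signals a config error
	}
	return 0
}

func main() {
	fmt.Println(exitCode(outcome{verdict: "Blocking issues found"})) // prints "0"
	fmt.Println(exitCode(outcome{configErr: errMissingModel}))       // prints "2"
}
```

Keeping the verdict out of the exit-code decision entirely, rather than special-casing it, makes the invariant easy to re-confirm when exit codes change.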
## Architecture
```
cmd/gadfly/                         the reviewer binary: pure producer of review markdown (stdout)
  main.go                           agent orchestration: review pass + adversarial recheck pass, budgets
  model.go                          provider/model resolution (majordomo.Parse) + env endpoint aliases
  tools.go                          the 5 read-only repo tools (read_file/list_dir/grep/find_files/get_diff)
  recheck.go                        second-pass verification prompt + verdict recompute
  *_test.go                         sandbox, recheck, wrap-up, and spec/endpoint-parse unit tests
scripts/run.sh                      fetch PR diff+meta, run the binary, upsert ONE labeled PR comment
scripts/system-prompt.txt           the reviewer persona + verification discipline (generic, not repo-specific)
entrypoint.sh                       container brains: trigger gating, PR clone, model loop (the logic that
                                    used to live in workflow YAML)
Dockerfile                          multi-stage; private-module creds via BuildKit secrets never reach the final image
.gitea/workflows/build-image.yml    push main → :latest; tag v* → :<tag>+:latest; PR → build-only
examples/                           copy-paste consumer stub workflows for different providers
```
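The map above mentions read-only repo tools and sandbox tests. One common way to confine file tools to a checkout is to resolve every requested path against the repo root and reject anything that escapes it. The sketch below assumes that approach; `resolveInRepo` and `errEscape` are hypothetical names, and the actual checks in `tools.go` may differ (for example, by also handling symlinks).

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// errEscape is returned when a requested path would leave the repo root.
var errEscape = errors.New("path escapes repository root")

// resolveInRepo joins a tool-supplied relative path onto root and rejects
// absolute paths and "../" traversal. Hypothetical helper: it does not
// resolve symlinks, which a real sandbox would also need to consider.
func resolveInRepo(root, rel string) (string, error) {
	if filepath.IsAbs(rel) {
		return "", errEscape
	}
	full := filepath.Join(root, rel) // Join also cleans ".." segments
	back, err := filepath.Rel(root, full)
	if err != nil {
		return "", err
	}
	if back == ".." || strings.HasPrefix(back, ".."+string(filepath.Separator)) {
		return "", errEscape
	}
	return full, nil
}

func main() {
	_, err := resolveInRepo("/repo", "cmd/gadfly/main.go")
	fmt.Println(err) // prints "<nil>"
	_, err = resolveInRepo("/repo", "../etc/passwd")
	fmt.Println(err) // prints "path escapes repository root"
}
```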
**Data flow:** consumer stub workflow → container `entrypoint.sh` (gate + clone) →
`scripts/run.sh` (per model) → `cmd/gadfly` binary (agentic review) → markdown → run.sh
upserts a PR comment as `gitea-actions`.
**Two passes:** a *review* pass drafts findings; an adversarial *recheck* pass independently
re-verifies each finding against the code and drops the unconfirmed ones, recomputing the
verdict. Verdict is one of: `No material issues found` / `Minor issues` / `Blocking issues found`.
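A minimal sketch of the recheck step, assuming each finding carries a severity flag and a confirmation flag. The `finding` type, its fields, and the mapping from severity to verdict are assumptions made for illustration; only the three verdict strings come from this guide.

```go
package main

import "fmt"

// finding is a hypothetical shape for one drafted issue.
type finding struct {
	title     string
	blocking  bool // assumed severity flag; the real severity model may differ
	confirmed bool // set by the recheck pass
}

// recheck keeps only confirmed findings and recomputes the verdict from
// what survives, so an unconfirmed blocker cannot influence the result.
func recheck(drafted []finding) ([]finding, string) {
	var kept []finding
	verdict := "No material issues found"
	for _, f := range drafted {
		if !f.confirmed {
			continue // dropped: the recheck pass could not verify it
		}
		kept = append(kept, f)
		if f.blocking {
			verdict = "Blocking issues found"
		} else if verdict == "No material issues found" {
			verdict = "Minor issues"
		}
	}
	return kept, verdict
}

func main() {
	drafted := []finding{
		{title: "nil deref in handler", blocking: true, confirmed: false},
		{title: "unchecked error from Close", blocking: false, confirmed: true},
	}
	kept, verdict := recheck(drafted)
	fmt.Println(len(kept), verdict) // prints "1 Minor issues"
}
```

Recomputing the verdict from the surviving findings, rather than reusing the review pass's verdict, is what stops a dropped finding from leaving a stale "Blocking issues found" behind.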
## Build / test
```sh
go build ./cmd/gadfly # needs read access to the private majordomo module
go test ./...
gofmt -l cmd/ # must be clean
docker build -t gadfly:dev \
  --secret id=REGISTRY_USER,env=REGISTRY_USER \
  --secret id=REGISTRY_PASSWORD,env=REGISTRY_PASSWORD .
```
Run it locally against a real diff without CI:
```sh
git -C /path/to/repo diff main > /tmp/x.diff
GADFLY_PROVIDER=ollama GADFLY_MODEL=qwen2.5-coder:7b \
GADFLY_REPO_DIR=/path/to/repo GADFLY_DIFF_FILE=/tmp/x.diff \
GADFLY_SYSTEM_FILE=scripts/system-prompt.txt ./gadfly
```
## Release / deploy
- **Push to `main`** → CI builds and pushes `:latest` (+ `:sha-<short>`).
- **Tag `v*`** → publishes `:<tag>` (+ `:latest`). Pin consumers to `:vN` for stability.
- Required CI secrets: `REGISTRY_USER` / `REGISTRY_PASSWORD` (registry push + read access to the
private majordomo module). Optional `DISCORD_WEBHOOK_URL`.
## Configuration
The full env reference lives in the **README** (`Models & providers` + `Configuration`).
Provider selection uses `GADFLY_PROVIDER` (default `ollama-cloud`), `GADFLY_MODEL`/`GADFLY_MODELS`,
`GADFLY_BASE_URL`, and `GADFLY_API_KEY`. Named endpoint aliases come from `GADFLY_ENDPOINT_<NAME>` /
`GADFLY_ALIAS_<NAME>` (http-capable) and from majordomo `LLM_*` DSNs (HTTPS-only).
**Tested vs untested:** only the Ollama paths (local + OpenAI-compatible pointed at Ollama)
are actually exercised. OpenAI/Anthropic/Google come from majordomo's abstraction and are
**untested** (no money has been spent on those APIs). Keep the README honest about this;
update it if that changes.
## When making changes — maintenance rules
- **Keep the README and `examples/` current.** Any change to env vars, flags, defaults,
triggers, provider support, or the consumer stub MUST be reflected in `README.md` and the
relevant files under `examples/` in the *same* change. The README's `Configuration` table,
the `Models & providers` table, and the example workflows are the contract users rely on —
stale docs are a bug.
- **Preserve the advisory-only invariant** (goal #2). If you touch exit codes or the workflow,
re-confirm a review can never fail/block a consumer's CI.
- **Don't add mort-specific (or any single-consumer) assumptions** to the binary or system
prompt. The system prompt is intentionally generic; repo-specific conventions should be
discovered by the agent at runtime (it can read the repo's own CONTRIBUTING/CLAUDE.md), not
hardcoded here.
- **Keep secrets out of image layers.** Private-module creds flow via BuildKit `--mount=type=secret`
in the build stage only; never bake them into the final image or commit them.
- Add a test when you add logic (see the `*_test.go` patterns). Keep `gofmt` clean and `go vet` quiet.
## Lessons
- majordomo's `LLM_*` env DSNs are **HTTPS-only** (`DSN.BaseURL()` forces `https://`), so they
can't express a plaintext local Ollama. That's why Gadfly adds the http-capable
`GADFLY_ENDPOINT_<NAME>="provider|base-url[|key]"` mechanism (see `cmd/gadfly/model.go`).
- Gitea `vars`/`secrets` are **not** auto-exposed as env in a job — the consumer stub must map
each one explicitly in its `env:` block (dynamic alias names can't be auto-enumerated).
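To make the first lesson concrete, here is a minimal sketch of parsing a `provider|base-url[|key]` alias value. The `endpoint` type and `parseEndpoint` function are hypothetical and are not the code in `cmd/gadfly/model.go`; the point is only that the base URL is kept verbatim, so a plaintext `http://` address survives.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// endpoint is a hypothetical parsed form of one alias value.
type endpoint struct {
	provider string
	baseURL  string
	apiKey   string // optional third field
}

// parseEndpoint splits "provider|base-url[|key]". The base URL is not
// rewritten, so an "http://" scheme is preserved as given.
func parseEndpoint(value string) (endpoint, error) {
	parts := strings.SplitN(value, "|", 3)
	if len(parts) < 2 || parts[0] == "" || parts[1] == "" {
		return endpoint{}, errors.New("want provider|base-url[|key]")
	}
	ep := endpoint{provider: parts[0], baseURL: parts[1]}
	if len(parts) == 3 {
		ep.apiKey = parts[2]
	}
	return ep, nil
}

func main() {
	ep, err := parseEndpoint("ollama|http://localhost:11434")
	fmt.Println(ep.provider, ep.baseURL, err) // prints "ollama http://localhost:11434 <nil>"
}
```

Using `SplitN` with a limit of 3 means any further `|` characters stay inside the key instead of being silently discarded.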