fix: per-lens timeout, errored-verdict honesty, accurate provider label, tighter lens focus, run timing
Build & push image / build-and-push (push) Successful in 8s
Build & push image / build-and-push (push) Successful in 8s
Five fixes, several surfaced by the live bake-off: - PER-LENS TIMEOUT (critical): GADFLY_TIMEOUT_SECS now applies to EACH specialist (own context), not shared across the suite. A slow model (e.g. a 35B local MLX) was exhausting the whole 600s budget on lens 1, leaving the rest "step 0: context deadline exceeded". Default lowered to 300s (per-lens). cmd/gadfly/main.go. - ERRORED VERDICT: a lens whose review pass failed no longer counts as "clean". Header shows "· ⚠️ N/M lens(es) errored" (or "Review incomplete — all lenses errored"); the section reads "⚠️ could not complete". consolidate.go. - PROVIDER LABEL: the comment header now shows the model's ACTUAL backend from the spec ("m1pro/qwen3.6:35b-mlx" -> m1pro), not the global GADFLY_PROVIDER default (was wrongly "ollama-cloud" for local models). scripts/run.sh. - LENS FOCUS: base prompt no longer licenses "report anything serious"; each lens stays in its lane, says "nothing in my area" rather than re-reporting another lens's bug, with a one-line "Outside my lens:" escape hatch. The re-derive- constants discipline is now lane-scoped, not "every lens". system-prompt.txt + specialists.go. - RUN TIMING: run.sh posts a "⏳ Reviewing…" placeholder at model start and updates it with "⏱️ reviewed in 1m 23s" on finish, for per-model comparison. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -49,6 +49,7 @@ type specialistResult struct {
|
|||||||
spec Specialist
|
spec Specialist
|
||||||
out string
|
out string
|
||||||
verdict verdict
|
verdict verdict
|
||||||
|
errored bool // the review pass failed (timeout/model error) — not a clean result
|
||||||
}
|
}
|
||||||
|
|
||||||
// worstVerdict returns the most severe verdict across results. The optional
|
// worstVerdict returns the most severe verdict across results. The optional
|
||||||
@@ -70,17 +71,36 @@ func worstVerdict(results []specialistResult) verdict {
|
|||||||
// followed by one verbatim section per specialist. run.sh wraps this with the
|
// followed by one verbatim section per specialist. run.sh wraps this with the
|
||||||
// "🪰 Gadfly review — <model>" header and the advisory footer.
|
// "🪰 Gadfly review — <model>" header and the advisory footer.
|
||||||
func renderConsolidated(results []specialistResult) string {
|
func renderConsolidated(results []specialistResult) string {
|
||||||
|
errored := 0
|
||||||
|
for _, r := range results {
|
||||||
|
if r.errored {
|
||||||
|
errored++
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
headline := "Verdict: " + worstVerdict(results).label()
|
||||||
|
if len(results) > 0 && errored == len(results) {
|
||||||
|
// Every lens errored — do NOT report this as "clean".
|
||||||
|
headline = "Review incomplete — all lenses errored"
|
||||||
|
} else if errored > 0 {
|
||||||
|
headline += fmt.Sprintf(" · ⚠️ %d/%d lens(es) errored", errored, len(results))
|
||||||
|
}
|
||||||
|
|
||||||
var b strings.Builder
|
var b strings.Builder
|
||||||
fmt.Fprintf(&b, "**Verdict: %s** — %d reviewers: %s\n",
|
fmt.Fprintf(&b, "**%s** — %d reviewers: %s\n",
|
||||||
worstVerdict(results).label(), len(results), strings.Join(specialistNames(results), ", "))
|
headline, len(results), strings.Join(specialistNames(results), ", "))
|
||||||
|
|
||||||
for _, r := range results {
|
for _, r := range results {
|
||||||
body := strings.TrimSpace(r.out)
|
body := strings.TrimSpace(r.out)
|
||||||
if body == "" {
|
if body == "" {
|
||||||
body = "_(no output)_"
|
body = "_(no output)_"
|
||||||
}
|
}
|
||||||
|
summary := r.verdict.label()
|
||||||
|
if r.errored {
|
||||||
|
summary = "⚠️ could not complete"
|
||||||
|
}
|
||||||
fmt.Fprintf(&b, "\n<details><summary><b>%s</b> — %s</summary>\n\n%s\n\n</details>\n",
|
fmt.Fprintf(&b, "\n<details><summary><b>%s</b> — %s</summary>\n\n%s\n\n</details>\n",
|
||||||
r.spec.Title, r.verdict.label(), body)
|
r.spec.Title, summary, body)
|
||||||
}
|
}
|
||||||
return strings.TrimRight(b.String(), "\n")
|
return strings.TrimRight(b.String(), "\n")
|
||||||
}
|
}
|
||||||
|
|||||||
+24
-17
@@ -69,10 +69,14 @@ import (
|
|||||||
|
|
||||||
const (
|
const (
|
||||||
defaultMaxSteps = 24
|
defaultMaxSteps = 24
|
||||||
// defaultTimeoutSecs is the overall deadline shared by ALL specialist passes
|
// defaultTimeoutSecs is the deadline for EACH specialist's passes (review +
|
||||||
// in one invocation; the default suite runs several lenses sequentially.
|
// recheck). It is per-lens, not shared across the suite, so one slow lens
|
||||||
defaultTimeoutSecs = 600
|
// (e.g. a big local model) can't starve the others. Slow local models may
|
||||||
|
// need this raised (and a higher job timeout to match the suite total).
|
||||||
|
defaultTimeoutSecs = 300
|
||||||
defaultMaxDiffChars = 60000
|
defaultMaxDiffChars = 60000
|
||||||
|
// autoSelectTimeout bounds the dynamic specialist-selection call.
|
||||||
|
autoSelectTimeout = 120 * time.Second
|
||||||
// defaultWrapUpReserve is how many steps before the cap the agent is told
|
// defaultWrapUpReserve is how many steps before the cap the agent is told
|
||||||
// to stop investigating and write its final answer. Reserving a margin is
|
// to stop investigating and write its final answer. Reserving a margin is
|
||||||
// what keeps a thorough reviewer from spending its whole budget on tool
|
// what keeps a thorough reviewer from spending its whole budget on tool
|
||||||
@@ -148,17 +152,15 @@ func run() error {
|
|||||||
fmt.Fprintln(os.Stderr, "gadfly:", e)
|
fmt.Fprintln(os.Stderr, "gadfly:", e)
|
||||||
}
|
}
|
||||||
|
|
||||||
timeout := time.Duration(envInt("GADFLY_TIMEOUT_SECS", defaultTimeoutSecs)) * time.Second
|
|
||||||
ctx, cancel := context.WithTimeout(context.Background(), timeout)
|
|
||||||
defer cancel()
|
|
||||||
|
|
||||||
// Dynamic selection: a (cheap) model picks the lenses this diff needs.
|
// Dynamic selection: a (cheap) model picks the lenses this diff needs.
|
||||||
if auto {
|
if auto {
|
||||||
selector, serr := resolveSelectorModel(mdl)
|
selector, serr := resolveSelectorModel(mdl)
|
||||||
if serr != nil {
|
if serr != nil {
|
||||||
return fmt.Errorf("resolve selector model: %w", serr)
|
return fmt.Errorf("resolve selector model: %w", serr)
|
||||||
}
|
}
|
||||||
picked, aerr := autoSelectSpecialists(ctx, selector, os.Getenv("GADFLY_TITLE"), os.Getenv("GADFLY_BODY"), diff, registry)
|
selCtx, cancel := context.WithTimeout(context.Background(), autoSelectTimeout)
|
||||||
|
picked, aerr := autoSelectSpecialists(selCtx, selector, os.Getenv("GADFLY_TITLE"), os.Getenv("GADFLY_BODY"), diff, registry)
|
||||||
|
cancel()
|
||||||
if aerr != nil {
|
if aerr != nil {
|
||||||
fmt.Fprintln(os.Stderr, "gadfly: auto-select failed; falling back to the default suite:", aerr)
|
fmt.Fprintln(os.Stderr, "gadfly: auto-select failed; falling back to the default suite:", aerr)
|
||||||
specialists = suiteFromRegistry(registry, defaultSuite)
|
specialists = suiteFromRegistry(registry, defaultSuite)
|
||||||
@@ -176,24 +178,29 @@ func run() error {
|
|||||||
task := buildTask(diff)
|
task := buildTask(diff)
|
||||||
results := make([]specialistResult, 0, len(specialists))
|
results := make([]specialistResult, 0, len(specialists))
|
||||||
for _, sp := range specialists {
|
for _, sp := range specialists {
|
||||||
out := reviewWithSpecialist(ctx, mdl, fsTools, base, sp, task, diff)
|
out, errored := reviewWithSpecialist(mdl, fsTools, base, sp, task, diff)
|
||||||
results = append(results, specialistResult{spec: sp, out: out, verdict: parseVerdict(out)})
|
results = append(results, specialistResult{spec: sp, out: out, verdict: parseVerdict(out), errored: errored})
|
||||||
}
|
}
|
||||||
|
|
||||||
fmt.Println(renderConsolidated(results))
|
fmt.Println(renderConsolidated(results))
|
||||||
return nil
|
return nil
|
||||||
}
|
}
|
||||||
|
|
||||||
// reviewWithSpecialist runs one lens end-to-end: a review pass under the
|
// reviewWithSpecialist runs one lens end-to-end under its OWN timeout, so a slow
|
||||||
// specialist's composed prompt, then the shared adversarial recheck pass. A
|
// model on one lens can't starve the others: a review pass under the
|
||||||
// failed pass is rendered as an inline notice (advisory — one lens failing
|
// specialist's composed prompt, then the shared adversarial recheck pass. The
|
||||||
// never sinks the others or the job).
|
// returned bool is true when the review pass failed (rendered as an inline
|
||||||
func reviewWithSpecialist(ctx context.Context, mdl llm.Model, fsTools *repoFS, base string, sp Specialist, task, diff string) string {
|
// notice — advisory; one lens failing never sinks the others or the job).
|
||||||
|
func reviewWithSpecialist(mdl llm.Model, fsTools *repoFS, base string, sp Specialist, task, diff string) (string, bool) {
|
||||||
|
timeout := time.Duration(envInt("GADFLY_TIMEOUT_SECS", defaultTimeoutSecs)) * time.Second
|
||||||
|
ctx, cancel := context.WithTimeout(context.Background(), timeout)
|
||||||
|
defer cancel()
|
||||||
|
|
||||||
draft, err := runAgent(ctx, mdl, fsTools, composeSpecialistPrompt(base, sp), task,
|
draft, err := runAgent(ctx, mdl, fsTools, composeSpecialistPrompt(base, sp), task,
|
||||||
envInt("GADFLY_MAX_STEPS", defaultMaxSteps))
|
envInt("GADFLY_MAX_STEPS", defaultMaxSteps))
|
||||||
if err != nil {
|
if err != nil {
|
||||||
fmt.Fprintf(os.Stderr, "gadfly: specialist %q review pass failed: %v\n", sp.Name, err)
|
fmt.Fprintf(os.Stderr, "gadfly: specialist %q review pass failed: %v\n", sp.Name, err)
|
||||||
return fmt.Sprintf("⚠️ This reviewer failed to complete: %v", err)
|
return fmt.Sprintf("⚠️ This reviewer failed to complete: %v", err), true
|
||||||
}
|
}
|
||||||
|
|
||||||
final := draft
|
final := draft
|
||||||
@@ -206,7 +213,7 @@ func reviewWithSpecialist(ctx context.Context, mdl llm.Model, fsTools *repoFS, b
|
|||||||
final = rechecked
|
final = rechecked
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
return final
|
return final, false
|
||||||
}
|
}
|
||||||
|
|
||||||
// runAgent runs one agent pass (its own fresh toolbox over the sandbox) and
|
// runAgent runs one agent pass (its own fresh toolbox over the sandbox) and
|
||||||
|
|||||||
@@ -192,8 +192,13 @@ func specialistNamesOf(specs []Specialist) []string {
|
|||||||
// composeSpecialistPrompt appends a specialist's lens to the base system prompt.
|
// composeSpecialistPrompt appends a specialist's lens to the base system prompt.
|
||||||
func composeSpecialistPrompt(base string, sp Specialist) string {
|
func composeSpecialistPrompt(base string, sp Specialist) string {
|
||||||
return strings.TrimRight(base, "\n") +
|
return strings.TrimRight(base, "\n") +
|
||||||
"\n\n## Your review lens: " + sp.Title + "\n" +
|
"\n\n## Your assigned lens — " + sp.Title + "\n" +
|
||||||
"Focus your review on this lens; other reviewers cover the rest. " + sp.Focus
|
"Review the change specifically and ONLY through this lens. Scrutinize it for:\n" +
|
||||||
|
sp.Focus +
|
||||||
|
"\n\nStay in this lane: other lenses (correctness, security, performance, etc.) are reviewed " +
|
||||||
|
"separately, so don't duplicate their findings here. If nothing in your lens is materially " +
|
||||||
|
"wrong, reply with the \"No material issues found\" verdict for this lens — do not reach for " +
|
||||||
|
"another lens's issue just to have something to say."
|
||||||
}
|
}
|
||||||
|
|
||||||
func loadFileConfig(repoDir string) (fileConfig, bool, error) {
|
func loadFileConfig(repoDir string) (fileConfig, bool, error) {
|
||||||
|
|||||||
+48
-29
@@ -42,6 +42,14 @@ MAX_DIFF_CHARS="${MAX_DIFF_CHARS:-60000}"
|
|||||||
MARKER="<!-- gadfly-review:${PROVIDER}:${MODEL} -->"
|
MARKER="<!-- gadfly-review:${PROVIDER}:${MODEL} -->"
|
||||||
say() { echo "[gadfly-review:${PROVIDER}:${MODEL}] $*" >&2; }
|
say() { echo "[gadfly-review:${PROVIDER}:${MODEL}] $*" >&2; }
|
||||||
|
|
||||||
|
# Display the model's ACTUAL backend: the provider segment of the spec
|
||||||
|
# ("m1pro/qwen3.6:35b-mlx" -> "m1pro"); a bare id uses GADFLY_PROVIDER (default
|
||||||
|
# ollama-cloud). This is what the comment header shows, not the run.sh lane.
|
||||||
|
case "$MODEL" in
|
||||||
|
*/*) MODEL_PROVIDER="${MODEL%%/*}" ;;
|
||||||
|
*) MODEL_PROVIDER="${GADFLY_PROVIDER:-ollama-cloud}" ;;
|
||||||
|
esac
|
||||||
|
|
||||||
# jq is required for payload building / response parsing; install if missing.
|
# jq is required for payload building / response parsing; install if missing.
|
||||||
if ! command -v jq >/dev/null 2>&1; then
|
if ! command -v jq >/dev/null 2>&1; then
|
||||||
say "jq not found; attempting install"
|
say "jq not found; attempting install"
|
||||||
@@ -55,6 +63,32 @@ fi
|
|||||||
# binary / agy, not here.)
|
# binary / agy, not here.)
|
||||||
API_TIMEOUT="--connect-timeout 20 --max-time 30"
|
API_TIMEOUT="--connect-timeout 20 --max-time 30"
|
||||||
|
|
||||||
|
# upsert_comment BODY — create or update (by MARKER) this model's single comment.
|
||||||
|
upsert_comment() {
|
||||||
|
local body="$1" post_body existing_id page=1 cmts
|
||||||
|
post_body="$(jq -n --arg b "$body" '{body:$b}')"
|
||||||
|
existing_id=""
|
||||||
|
while [ "$page" -le 10 ]; do
|
||||||
|
cmts="$(curl $API_TIMEOUT -fsS -H "Authorization: token ${GITEA_TOKEN}" \
|
||||||
|
"${GITEA_API}/issues/${PR}/comments?limit=50&page=${page}" || echo '[]')"
|
||||||
|
[ "$(echo "$cmts" | jq 'length')" = "0" ] && break
|
||||||
|
existing_id="$(echo "$cmts" | jq -r --arg m "$MARKER" \
|
||||||
|
'.[] | select(.body != null and (.body | startswith($m))) | .id' | head -n1)"
|
||||||
|
[ -n "$existing_id" ] && break
|
||||||
|
page=$((page+1))
|
||||||
|
done
|
||||||
|
if [ -n "$existing_id" ]; then
|
||||||
|
curl $API_TIMEOUT -sS -X PATCH -H "Authorization: token ${GITEA_TOKEN}" -H "Content-Type: application/json" \
|
||||||
|
"${GITEA_API}/issues/comments/${existing_id}" -d "$post_body" >/dev/null
|
||||||
|
else
|
||||||
|
curl $API_TIMEOUT -sS -X POST -H "Authorization: token ${GITEA_TOKEN}" -H "Content-Type: application/json" \
|
||||||
|
"${GITEA_API}/issues/${PR}/comments" -d "$post_body" >/dev/null
|
||||||
|
fi
|
||||||
|
}
|
||||||
|
|
||||||
|
# fmt_duration SECONDS -> "1m 23s" / "45s"
|
||||||
|
fmt_duration() { if [ "$1" -ge 60 ]; then echo "$(($1/60))m $(($1%60))s"; else echo "$1s"; fi; }
|
||||||
|
|
||||||
# --- fetch PR context -------------------------------------------------------
|
# --- fetch PR context -------------------------------------------------------
|
||||||
say "fetching PR #${PR} context"
|
say "fetching PR #${PR} context"
|
||||||
DIFF="$(curl $API_TIMEOUT -fsS -H "Authorization: token ${GITEA_TOKEN}" "${GITEA_API}/pulls/${PR}.diff" || true)"
|
DIFF="$(curl $API_TIMEOUT -fsS -H "Authorization: token ${GITEA_TOKEN}" "${GITEA_API}/pulls/${PR}.diff" || true)"
|
||||||
@@ -81,6 +115,12 @@ SYS="$(cat "${SCRIPT_DIR}/system-prompt.txt")"
|
|||||||
USR="$(printf 'PR #%s: %s\n\nDescription:\n%s\n\nUnified diff to review:\n```diff\n%s\n```%s' \
|
USR="$(printf 'PR #%s: %s\n\nDescription:\n%s\n\nUnified diff to review:\n```diff\n%s\n```%s' \
|
||||||
"$PR" "$TITLE" "$BODY" "$DIFF" "$TRUNC_NOTE")"
|
"$PR" "$TITLE" "$BODY" "$DIFF" "$TRUNC_NOTE")"
|
||||||
|
|
||||||
|
# --- announce start (placeholder comment) -----------------------------------
|
||||||
|
START_TS="$(date +%s)"
|
||||||
|
say "starting review with ${MODEL}"
|
||||||
|
upsert_comment "$(printf '%s\n### 🪰 Gadfly review — `%s` (%s)\n\n⏳ Reviewing… this comment will update with findings and run time.' \
|
||||||
|
"$MARKER" "$MODEL" "$MODEL_PROVIDER")"
|
||||||
|
|
||||||
# --- call the model ---------------------------------------------------------
|
# --- call the model ---------------------------------------------------------
|
||||||
REVIEW=""
|
REVIEW=""
|
||||||
case "$PROVIDER" in
|
case "$PROVIDER" in
|
||||||
@@ -96,7 +136,7 @@ case "$PROVIDER" in
|
|||||||
if [ -n "${OLLAMA_CLOUD_API_KEY:-}" ] && [ -z "${OLLAMA_API_KEY:-}" ]; then
|
if [ -n "${OLLAMA_CLOUD_API_KEY:-}" ] && [ -z "${OLLAMA_API_KEY:-}" ]; then
|
||||||
export OLLAMA_API_KEY="$OLLAMA_CLOUD_API_KEY"
|
export OLLAMA_API_KEY="$OLLAMA_CLOUD_API_KEY"
|
||||||
fi
|
fi
|
||||||
GADFLY_PROVIDER_EFF="${GADFLY_PROVIDER:-ollama-cloud}"
|
GADFLY_PROVIDER_EFF="$MODEL_PROVIDER"
|
||||||
|
|
||||||
# Only the default cloud provider strictly needs a key up front; local Ollama
|
# Only the default cloud provider strictly needs a key up front; local Ollama
|
||||||
# and other providers either need none or read their own standard env var.
|
# and other providers either need none or read their own standard env var.
|
||||||
@@ -152,31 +192,10 @@ $(tail -c 1500 agy.err 2>/dev/null)
|
|||||||
say "unknown provider: ${PROVIDER}"; exit 1 ;;
|
say "unknown provider: ${PROVIDER}"; exit 1 ;;
|
||||||
esac
|
esac
|
||||||
|
|
||||||
# --- assemble comment -------------------------------------------------------
|
# --- assemble + post final comment (with run time) --------------------------
|
||||||
COMMENT="$(printf '%s\n### 🪰 Gadfly review — `%s` (%s)\n\n%s\n\n<sub>Automated adversarial review by Gadfly. Advisory only — does not block merge.</sub>' \
|
ELAPSED="$(( $(date +%s) - START_TS ))"
|
||||||
"$MARKER" "$MODEL" "${GADFLY_PROVIDER_EFF:-$PROVIDER}" "$REVIEW")"
|
DUR="$(fmt_duration "$ELAPSED")"
|
||||||
POST_BODY="$(jq -n --arg b "$COMMENT" '{body:$b}')"
|
COMMENT="$(printf '%s\n### 🪰 Gadfly review — `%s` (%s)\n\n%s\n\n<sub>Automated adversarial review by Gadfly. Advisory only — does not block merge. · ⏱️ reviewed in %s</sub>' \
|
||||||
|
"$MARKER" "$MODEL" "$MODEL_PROVIDER" "$REVIEW" "$DUR")"
|
||||||
# --- upsert by marker -------------------------------------------------------
|
upsert_comment "$COMMENT"
|
||||||
EXISTING_ID=""
|
say "done in ${DUR}"
|
||||||
page=1
|
|
||||||
while [ "$page" -le 10 ]; do
|
|
||||||
CMTS="$(curl $API_TIMEOUT -fsS -H "Authorization: token ${GITEA_TOKEN}" \
|
|
||||||
"${GITEA_API}/issues/${PR}/comments?limit=50&page=${page}" || echo '[]')"
|
|
||||||
[ "$(echo "$CMTS" | jq 'length')" = "0" ] && break
|
|
||||||
EXISTING_ID="$(echo "$CMTS" | jq -r --arg m "$MARKER" \
|
|
||||||
'.[] | select(.body != null and (.body | startswith($m))) | .id' | head -n1)"
|
|
||||||
[ -n "$EXISTING_ID" ] && break
|
|
||||||
page=$((page+1))
|
|
||||||
done
|
|
||||||
|
|
||||||
if [ -n "$EXISTING_ID" ]; then
|
|
||||||
say "updating existing comment ${EXISTING_ID}"
|
|
||||||
curl $API_TIMEOUT -sS -X PATCH -H "Authorization: token ${GITEA_TOKEN}" -H "Content-Type: application/json" \
|
|
||||||
"${GITEA_API}/issues/comments/${EXISTING_ID}" -d "$POST_BODY" >/dev/null
|
|
||||||
else
|
|
||||||
say "creating new comment"
|
|
||||||
curl $API_TIMEOUT -sS -X POST -H "Authorization: token ${GITEA_TOKEN}" -H "Content-Type: application/json" \
|
|
||||||
"${GITEA_API}/issues/${PR}/comments" -d "$POST_BODY" >/dev/null
|
|
||||||
fi
|
|
||||||
say "done"
|
|
||||||
|
|||||||
@@ -1,9 +1,14 @@
|
|||||||
You are Gadfly, an ADVERSARIAL code reviewer. Your job is to find real problems in the
|
You are Gadfly, an ADVERSARIAL code reviewer. Your job is to find real problems in the
|
||||||
pull request below — not to praise it. A gadfly does not let things slide.
|
pull request below — not to praise it. A gadfly does not let things slide.
|
||||||
|
|
||||||
You review through ONE assigned lens (given at the end of this prompt). Stay in your lane —
|
You review through ONE assigned lens (named at the end of this prompt). Evaluate the change
|
||||||
other reviewers cover the other angles — but do report anything clearly serious you happen
|
THROUGH THAT LENS — that is your job. A separate reviewer independently covers each other
|
||||||
to notice.
|
angle, so problems outside your lens WILL be caught without you. Do not restate a finding that
|
||||||
|
plainly belongs to another lens just to have something to report — that only creates noise. If
|
||||||
|
your lens turns up nothing material, say so plainly; an honest "nothing in my area" beats
|
||||||
|
re-reporting the obvious bug every other lens already sees. Only exception: if you spot a
|
||||||
|
SEVERE issue clearly outside your lens, you may add ONE line prefixed "Outside my lens:" — but
|
||||||
|
your actual findings must stay within your lens.
|
||||||
|
|
||||||
You are AGENTIC: you have read-only tools over the repository AT THIS PR's checked-out
|
You are AGENTIC: you have read-only tools over the repository AT THIS PR's checked-out
|
||||||
state. USE THEM to verify before you report. Do not review the diff in isolation.
|
state. USE THEM to verify before you report. Do not review the diff in isolation.
|
||||||
@@ -23,10 +28,10 @@ Mandatory verification discipline — this is the whole point of giving you tool
|
|||||||
- If you cannot confirm a suspicion with the tools, either drop it or clearly label it
|
- If you cannot confirm a suspicion with the tools, either drop it or clearly label it
|
||||||
"unverified" — do NOT present an unchecked guess as a finding.
|
"unverified" — do NOT present an unchecked guess as a finding.
|
||||||
|
|
||||||
Be skeptical and concrete, and apply your assigned lens rigorously. A recurring, high-value
|
Be skeptical and concrete, and apply your assigned lens rigorously. (If your lens leads you to
|
||||||
discipline regardless of lens: do NOT trust a constant, conversion factor, formula, unit, or
|
a constant, conversion factor, formula, unit, or threshold, don't trust it because it looks
|
||||||
threshold just because it looks reasonable — RE-DERIVE the expected value from first principles
|
reasonable — re-derive the expected value from first principles and compare; plausible-looking
|
||||||
and compare. Plausible-looking magic numbers are where real bugs hide.
|
magic numbers hide real bugs. Pursue this when it's in your lane, not as a reason to leave it.)
|
||||||
|
|
||||||
Output rules:
|
Output rules:
|
||||||
- Output GitHub-flavored markdown, concise. No filler, no restating the diff.
|
- Output GitHub-flavored markdown, concise. No filler, no restating the diff.
|
||||||
|
|||||||
Reference in New Issue
Block a user