feat: configurable lens fan-out, per-provider like model concurrency
Build & push image / build-and-push (push) Successful in 9s

Specialist lenses ran strictly sequentially within a model. Add a
GADFLY_LENS_CONCURRENCY knob (default 1 = unchanged) that overlaps the
independent per-lens review+recheck passes, so a model posts its
consolidated comment as soon as its lenses finish.

Per-provider configurable, mirroring GADFLY_PROVIDER_CONCURRENCY:
GADFLY_PROVIDER_LENS_CONCURRENCY takes a "provider=N,..." map keyed by
the same provider lanes (modelProvider() mirrors entrypoint's provider_of;
providerOverride() mirrors provider_cap). The override wins for the model's
lane, else the scalar default.

runSpecialists fans out via a bounded worker pool, order-preserving
(results written by index) and keeping each lens's own timeout/recheck.
repoFS is immutable + fresh-toolbox-per-pass, so lenses share no mutable
state (verified under -race). Docs/examples updated; dropped a duplicate
GADFLY_TIMEOUT_SECS README row.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-26 22:53:27 -04:00
parent 6e3a83c437
commit d0de034726
5 changed files with 317 additions and 6 deletions
+20 -1
View File
@@ -174,6 +174,24 @@ GADFLY_MODELS: "m1pro/qwen3:14b,qwen3-coder:480b-cloud,gpt-oss:120b-cloud"
A model's provider is the spec's first segment (`m1pro/…``m1pro`), or `GADFLY_PROVIDER`/
`ollama-cloud` for a bare id. Default (`cap 1`) keeps a single-provider pool fully sequential.
**Lens fan-out (within a model).** By default the specialist lenses run **sequentially** inside
each model (`GADFLY_LENS_CONCURRENCY=1`). Raise it to overlap the independent per-lens
review+recheck passes — the model then posts its consolidated comment as soon as its lenses
finish (so with sequential models, results stream in per model and per-model timings stay
clean). Like the model cap, it's **per-provider configurable**: `GADFLY_PROVIDER_LENS_CONCURRENCY`
takes a `provider=N` map keyed by the **same provider lanes** as `GADFLY_PROVIDER_CONCURRENCY`,
falling back to the `GADFLY_LENS_CONCURRENCY` scalar (default `1`). **It multiplies with the
model cap:** total in-flight requests ≈ *models-at-once × lenses-at-once*, so to fan lenses out
without oversubscribing a backend, keep its model cap low and raise its lens cap:
```yaml
# Per provider: cloud runs one model at a time but fans its 3 lenses out (3 concurrent requests);
# the slow local box stays fully serial. Both provider lanes still run in parallel.
GADFLY_PROVIDER_CONCURRENCY: "ollama-cloud=1,m1=1"
GADFLY_PROVIDER_LENS_CONCURRENCY: "ollama-cloud=3,m1=1"
GADFLY_SPECIALISTS: "security,correctness,error-handling"
```
### Triggers
1. A **new/reopened/ready** non-draft PR — automatic.
@@ -227,11 +245,12 @@ The reviewer binary reads these (the stub/entrypoint set sane defaults):
| `GADFLY_WORKER_MAX_STEPS` | 8 | tool-step cap for a delegated worker run |
| `GADFLY_CONCURRENCY` | 1 | default max models run at once **per provider** |
| `GADFLY_PROVIDER_CONCURRENCY` | — | per-provider overrides, e.g. `ollama-cloud=3,m1pro=1` |
| `GADFLY_LENS_CONCURRENCY` | 1 | specialist lenses run at once **within a model** (× model cap = total in-flight) |
| `GADFLY_PROVIDER_LENS_CONCURRENCY` | — | per-provider lens overrides, same lanes as `GADFLY_PROVIDER_CONCURRENCY`, e.g. `ollama-cloud=3,m1=1` |
| `GADFLY_MAX_STEPS` | 24 | review-pass tool-step cap |
| `GADFLY_TIMEOUT_SECS` | 300 | deadline **per specialist lens** (review+recheck) |
| `GADFLY_RECHECK` | on | set `0`/`false` to skip the recheck pass |
| `GADFLY_RECHECK_MAX_STEPS` | 16 | recheck-pass step cap |
| `GADFLY_TIMEOUT_SECS` | 300 | overall deadline (both passes) |
| `GADFLY_MAX_DIFF_CHARS` | 60000 | diff chars embedded in the prompt (full diff via `get_diff`) |
| `GADFLY_TRIGGER_PHRASE` | `@gadfly review` | comment phrase that re-triggers |
| `GADFLY_ALLOWED_USERS` | *(collaborators)* | comma-separated allow-list for comment triggers |