feat(reusable): add the 4090 Ti (qwen3.6-27b via llama-swap) to the default swarm
Build & push image / build-and-push (pull_request) Successful in 4s
Adversarial Review (Gadfly) / review (pull_request) Successful in 4m21s

Adds a local GPU reviewer to the shared default:
- models += ragnaros/qwen3.6-27b
- GADFLY_ENDPOINT_RAGNAROS=llama-swaps|https://llama-swap.ragnaros.dudenhoeffer.casa
  (plain LAN URL, no credential; registers provider "ragnaros")
- provider_concurrency ragnaros=1, provider_lens_concurrency ragnaros=1
  (one model, one lens at a time — a single local GPU)

Inherited by all @v1 consumers (mort/executus/majordomo) once v1 moves.
Comments + README + CLAUDE.md updated.
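
A consumer whose runners cannot reach the LAN endpoint can opt out per input. A minimal sketch of a cloud-only caller, reusing the pre-change defaults shown in this diff; the file name, workflow name, trigger, and job name are illustrative assumptions, not taken from any consumer repo:

```yaml
# .gitea/workflows/adversarial-review.yml (hypothetical caller in a consumer repo)
name: Adversarial Review (Gadfly)
on:
  pull_request:            # assumed trigger; keep whatever events the caller already uses
jobs:
  review:
    uses: steve/gadfly/.gitea/workflows/review-reusable.yml@v1
    with:
      # Same list as the new default, minus ragnaros/qwen3.6-27b.
      models: "minimax-m3:cloud,glm-5.2:cloud,deepseek-v4-pro:cloud,claude-code/sonnet,claude-code/opus,claude-code/opus:max"
      # Previous defaults, without the ragnaros=1 entries.
      provider_concurrency: "ollama-cloud=3,claude-code=3"
      provider_lens_concurrency: "ollama-cloud=3,claude-code=5"
    secrets:
      OLLAMA_CLOUD_API_KEY: ${{ secrets.OLLAMA_CLOUD_API_KEY }}
      CLAUDE_CODE_OAUTH_TOKEN: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
```

Overriding `models:` alone is probably enough to skip the local GPU; the two concurrency inputs are restated only so the caller carries no reference to a provider it does not use.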

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Steve Dudenhoeffer
2026-06-28 00:23:45 -04:00
parent 8f69e71311
commit 2b02cbb4ba
3 changed files with 18 additions and 9 deletions
+14 -6
@@ -13,7 +13,8 @@
 # with: { allowed_users: "..." } # config inputs are optional (see below)
 #
 # Inputs ship the DEFAULT swarm (see the inputs block): 3 cloud models + the
-# Claude Code engine, 5-lens suite (3 claude models concurrent, 5 lenses each). A consumer
+# Claude Code engine + a local 4090 Ti (qwen3.6-27b via llama-swap), 5-lens suite
+# (3 claude models concurrent / 5 lenses each; the 4090 Ti runs 1 model × 1 lens). A consumer
 # inherits it by omitting `with:` entirely, or overrides any field (e.g.
 # `models:` for a cloud-only / different-provider setup; "" falls back to the
 # image's built-in default). Secrets are DECLARED below (workflow_call.secrets) so a
@@ -35,9 +36,11 @@ on:
 workflow_call:
 # Inputs ship the DEFAULT Gadfly swarm so a consumer can just call this
 # workflow (no `with:` block) and inherit it. The default is opinionated —
-# 3 strong cloud models + the Claude Code engine (sonnet/opus/opus:max), the
+# 3 strong cloud models + the Claude Code engine (sonnet/opus/opus:max) + a
+# local 4090 Ti (qwen3.6-27b via llama-swap at GADFLY_ENDPOINT_RAGNAROS), the
 # 5-lens suite, with all 3 claude models concurrent and each running its 5
-# lenses at once. It needs OLLAMA_CLOUD_API_KEY and CLAUDE_CODE_OAUTH_TOKEN; a consumer
+# lenses at once (the 4090 Ti runs 1 model × 1 lens — a single local GPU). It
+# needs OLLAMA_CLOUD_API_KEY and CLAUDE_CODE_OAUTH_TOKEN; a consumer
 # with only one (or a different provider) overrides `models:` (and forwards
 # just the secrets it uses). Set any input to "" to fall back to the
 # image/entrypoint built-in default.
@@ -46,12 +49,12 @@ on:
 # (3 models × 5 lenses = up to 15 concurrent `claude -p` per pass). If you hit
 # subscription rate limits or runner load, dial claude-code down in either knob.
 inputs:
-models: { type: string, default: "minimax-m3:cloud,glm-5.2:cloud,deepseek-v4-pro:cloud,claude-code/sonnet,claude-code/opus,claude-code/opus:max" } # GADFLY_MODELS (csv)
+models: { type: string, default: "minimax-m3:cloud,glm-5.2:cloud,deepseek-v4-pro:cloud,claude-code/sonnet,claude-code/opus,claude-code/opus:max,ragnaros/qwen3.6-27b" } # GADFLY_MODELS (csv); ragnaros/* = the 4090 Ti via llama-swap (see GADFLY_ENDPOINT_RAGNAROS)
 specialists: { type: string, default: "security,correctness,maintainability,performance,error-handling" } # GADFLY_SPECIALISTS (5-lens default suite)
 provider: { type: string, default: "" } # GADFLY_PROVIDER
 base_url: { type: string, default: "" } # GADFLY_BASE_URL
-provider_concurrency: { type: string, default: "ollama-cloud=3,claude-code=3" } # GADFLY_PROVIDER_CONCURRENCY (all 3 claude models at once)
-provider_lens_concurrency: { type: string, default: "ollama-cloud=3,claude-code=5" } # GADFLY_PROVIDER_LENS_CONCURRENCY (each claude runs all 5 lenses at once)
+provider_concurrency: { type: string, default: "ollama-cloud=3,claude-code=3,ragnaros=1" } # GADFLY_PROVIDER_CONCURRENCY (claude all 3 at once; ragnaros 4090 Ti one model at a time)
+provider_lens_concurrency: { type: string, default: "ollama-cloud=3,claude-code=5,ragnaros=1" } # GADFLY_PROVIDER_LENS_CONCURRENCY (claude 5 lenses at once; ragnaros 1 lens at a time)
 timeout_secs: { type: string, default: "600" } # GADFLY_TIMEOUT_SECS (per lens)
 max_steps: { type: string, default: "14" } # GADFLY_MAX_STEPS
 worker_model: { type: string, default: "" } # GADFLY_WORKER_MODEL
@@ -119,6 +122,11 @@ jobs:
 # reusable workflow can't enumerate arbitrary names.
 GADFLY_ENDPOINT_M1: ${{ secrets.GADFLY_ENDPOINT_M1 }}
 GADFLY_ENDPOINT_M5: ${{ secrets.GADFLY_ENDPOINT_M5 }}
+# ragnaros = the 4090 Ti, reached over the LAN through its llama-swap
+# proxy (lazy-loads models on demand). Plain URL, no credential — set
+# here so the default `ragnaros/qwen3.6-27b` model resolves for all
+# consumers. Registers provider "ragnaros".
+GADFLY_ENDPOINT_RAGNAROS: "llama-swaps|https://llama-swap.ragnaros.dudenhoeffer.casa"
 # --- findings telemetry (optional) --------------------------------
 GADFLY_FINDINGS_URL: ${{ secrets.GADFLY_FINDINGS_URL }}
 GADFLY_FINDINGS_TOKEN: ${{ secrets.GADFLY_FINDINGS_TOKEN }}
+3 -2
@@ -47,8 +47,9 @@ entrypoint.sh container brains: trigger gating, PR clone, model loop (t
 Dockerfile multi-stage; private-module creds via BuildKit secrets never reach the final image
 .gitea/workflows/build-image.yml push main → :latest; tag v* → :<tag>+:latest; PR → build-only
 .gitea/workflows/review-reusable.yml reusable (workflow_call) review job; ships the DEFAULT swarm as
-input defaults (3 cloud + Claude Code sonnet/opus/opus:max, 5-lens suite;
-3 claude models concurrent, 5 lenses each) so consumers inherit it by omitting `with:`. Consumers subscribe
+input defaults (3 cloud + Claude Code sonnet/opus/opus:max + a local 4090 Ti
+via llama-swap, 5-lens suite; 3 claude models concurrent / 5 lenses each, the
+4090 Ti 1 model × 1 lens) so consumers inherit it by omitting `with:`. Consumers subscribe
 with an ~8-line caller forwarding only the secrets the reviewer needs (Phase 4);
 gadfly's own adversarial-review.yml is a thin caller of it (dogfoods the path).
 examples/ copy-paste consumer stub workflows for different providers
+1 -1
@@ -40,7 +40,7 @@ it. Drop one file in your repo and set a couple of secrets/vars:
 your repo. Two flavors: the slim [`reusable.yml`](examples/reusable.yml) — a tiny caller of
 Gadfly's **reusable workflow** (`uses: steve/gadfly/.gitea/workflows/review-reusable.yml@…`,
 forwarding only the secrets the reviewer needs), which ships a **default swarm** (3 cloud models +
-the Claude Code engine, 5-lens suite) you inherit by omitting `with:` or override per-input — or the full self-contained
+the Claude Code engine + a local 4090 Ti via llama-swap, 5-lens suite) you inherit by omitting `with:` or override per-input — or the full self-contained
 [`adversarial-review.yml`](examples/adversarial-review.yml) (Ollama Cloud default, with inline
 notes for every provider / local Ollama / OpenAI-compatible / endpoint aliases). See the
 [examples index](examples/README.md).
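
To illustrate the "inherit by omitting `with:`" path this README hunk describes, here is a hedged sketch of a slim caller. It is not the contents of `examples/reusable.yml`; the workflow name, trigger, and job name are assumptions, and the `@v1` ref is taken from the commit message's mention of `@v1` consumers:

```yaml
# Hypothetical minimal caller: there is no `with:` block, so every input takes
# the reusable workflow's default, which after this commit includes
# ragnaros/qwen3.6-27b.
name: Adversarial Review (Gadfly)
on:
  pull_request:            # assumed trigger
jobs:
  review:
    uses: steve/gadfly/.gitea/workflows/review-reusable.yml@v1
    secrets:
      # The two secrets the default swarm is documented to need.
      OLLAMA_CLOUD_API_KEY: ${{ secrets.OLLAMA_CLOUD_API_KEY }}
      CLAUDE_CODE_OAUTH_TOKEN: ${{ secrets.CLAUDE_CODE_OAUTH_TOKEN }}
```

A caller like this forwards no secret for the local GPU, since the endpoint is a plain URL set inside the reusable workflow; presumably the runner still has to be able to reach that host over the LAN for the `ragnaros` model to respond.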