majordomo/docs/adr/0008-chain-semantics.md

# ADR-0008: Failover-chain execution semantics

**Status:** Accepted — 2026-06-10

## Context

A parsed spec is an ordered chain of targets sharing the registry's health
tracker. The executor must realize the kickoff's failover story (retry one
blip; bench repeat offenders; skip benched targets; clear exhaustion errors)
identically for chains of one and many.

## Decision

For each request, iterate elements head-to-tail:

1. **Skip** targets currently benched (recorded in the exhaustion error).
2. Attempt the target. On success → report success (resets health), return.
3. On error, classify:
   - **Permanent + model-not-found** → advance, no health penalty.
   - **Permanent otherwise** (auth, malformed) → **fail fast** by default —
     failing over cannot fix a bad request; `ChainConfig.AdvanceOnPermanent`
     flips this for callers who prefer availability.
   - **Transient** → report the failed attempt to the tracker; retry the
     same target while attempts remain (`TransientRetries`, default 1)
     **unless the tracker just benched it**, in which case advance
     immediately.
4. All elements failed/skipped → return `errors.Join(ErrChainExhausted,
   per-target reasons...)` naming every target and why.

Other decisions:

- **Capabilities() = head element's capabilities.** The head is the
  preferred target and the honest answer to "what should I prepare for?".
  Per-attempt media normalization (Phase 3) uses the *actual* target's
  capabilities, so fallbacks still get correctly-fitted inputs.
  Intersection semantics were rejected: a rarely-used tail fallback would
  artificially constrain every request.
- **Streaming failover applies to stream establishment only.** Once a
  stream is open, mid-stream errors propagate; silently restarting on
  another target would re-deliver partial output.
- `context.Canceled` aborts the chain immediately between and during
  attempts.
- Duplicate post-expansion elements were already dropped at Parse
  (ADR-0003).

## Consequences

- "One transient error is fine" holds: blip → same-target retry succeeds,
  no failover, one health mark that the success immediately clears... and
  with default knobs (retries=1, threshold=2) a target whose retry also
  fails is benched in the same request and the chain advances — exactly the
  kickoff narrative.
- Single-target specs get the same retry/backoff behavior for free.

## Alternatives considered

- Per-request (not per-attempt) failure counting — needs two failed
  *requests* to bench, letting a dead model eat the retry budget twice.
  Rejected as weaker than the kickoff's story.
- Intersection capabilities — see above. Rejected.