3 Commits

Author SHA1 Message Date
steve 1fd7109a42 fix(agent): recover front-loaded answer when terminal turn is degenerate
CI / Tidy (pull_request) Successful in 9m31s
CI / Build & Test (pull_request) Successful in 10m14s
CI / Tidy (push) Successful in 9m26s
CI / Build & Test (push) Successful in 10m19s
The agent loop took the final answer only from the terminal (no-tool-call)
turn. Models that "front-load" their answer into an earlier turn that also
calls a tool — then close with a trivial pointer like "(Already answered
above.)" — had their real answer discarded and the pointer delivered. This
recurs across several open-weight models (glm-5.2, etc.); well-behaved models
(Claude/GPT) defer their answer to the terminal turn and are unaffected.

finalOutput() now falls back to the last substantive assistant content in the
transcript when the terminal text is weak (empty, or a short back-reference).
The predicate is narrow and back-reference-gated so short-but-correct answers
("42", "It's down, restarting now.") are never overridden; recovery only picks
a prior turn that reads like a real answer, not a preamble. Zero extra model
calls. Terminal-answer behavior for normal runs is unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 18:37:38 -04:00
steve 0147a79d18 feat: conversion-driven extensions — resolvers, DefineTool, hooks, ops controls
CI / Tidy (push) Successful in 9m31s
CI / Build & Test (push) Successful in 10m13s
Phase 9a (ADR-0014): Registry.RegisterResolver for dynamic tiers;
DefineTool[Args] typed tools; Usage cache/reasoning detail fields wired
through anthropic/openai/google; WithPromptCaching (Anthropic
cache_control); agent supervision hooks (WithMaxStepsFunc, WithSteer,
WithCompactor, WithToolErrorLimits + ErrToolLoop); health
Bench/Unbench/Snapshot; ChainConfig.Observer failover events.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 13:30:06 +02:00
steve 7dab4112ff feat: agent run loop, Generate[T], reflect-derived schemas
Phase 5:
- agent/: model + system prompt + toolboxes composition; bounded
  tool-dispatch loop (default 10 steps); panic-proof tool execution;
  unknown-tool and duplicate-name handling; history continuation; step
  observers; partial results on ErrMaxSteps/errors (ADR-0012)
- llm.SchemaFor[T]: strict-compatible JSON schemas from Go types
  (nullable pointers, description/enum tags, recursion rejected)
- majordomo.Generate[T]: typed structured output with fence-stripping
  decode and model-naming errors
- README agents/structured-output sections + matrix synced

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 13:10:18 +02:00