majordomo

steve/majordomo

Fork 0

Commit Graph

Author	SHA1	Message	Date
steve	1fd7109a42	fix(agent): recover front-loaded answer when terminal turn is degenerate CI / Tidy (pull_request) Successful in 9m31s Details CI / Build & Test (pull_request) Successful in 10m14s Details CI / Tidy (push) Successful in 9m26s Details CI / Build & Test (push) Successful in 10m19s Details The agent loop took the final answer only from the terminal (no-tool-call) turn. Models that "front-load" their answer into an earlier turn that also calls a tool — then close with a trivial pointer like "(Already answered above.)" — had their real answer discarded and the pointer delivered. This recurs across several open-weight models (glm-5.2, etc.); well-behaved models (Claude/GPT) defer their answer to the terminal turn and are unaffected. finalOutput() now falls back to the last substantive assistant content in the transcript when the terminal text is weak (empty, or a short back-reference). The predicate is narrow and back-reference-gated so short-but-correct answers ("42", "It's down, restarting now.") are never overridden; recovery only picks a prior turn that reads like a real answer, not a preamble. Zero extra model calls. Terminal-answer behavior for normal runs is unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-26 18:37:38 -04:00
steve	0147a79d18	feat: conversion-driven extensions — resolvers, DefineTool, hooks, ops controls CI / Tidy (push) Successful in 9m31s Details CI / Build & Test (push) Successful in 10m13s Details Phase 9a (ADR-0014): Registry.RegisterResolver for dynamic tiers; DefineTool[Args] typed tools; Usage cache/reasoning detail fields wired through anthropic/openai/google; WithPromptCaching (Anthropic cache_control); agent supervision hooks (WithMaxStepsFunc, WithSteer, WithCompactor, WithToolErrorLimits + ErrToolLoop); health Bench/Unbench/Snapshot; ChainConfig.Observer failover events. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 13:30:06 +02:00
steve	7dab4112ff	feat: agent run loop, Generate[T], reflect-derived schemas Phase 5: - agent/: model + system prompt + toolboxes composition; bounded tool-dispatch loop (default 10 steps); panic-proof tool execution; unknown-tool and duplicate-name handling; history continuation; step observers; partial results on ErrMaxSteps/errors (ADR-0012) - llm.SchemaFor[T]: strict-compatible JSON schemas from Go types (nullable pointers, description/enum tags, recursion rejected) - majordomo.Generate[T]: typed structured output with fence-stripping decode and model-naming errors - README agents/structured-output sections + matrix synced Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 13:10:18 +02:00

Author

SHA1

Message

Date

steve

1fd7109a42

fix(agent): recover front-loaded answer when terminal turn is degenerate

CI / Tidy (pull_request) Successful in 9m31s

Details

CI / Build & Test (pull_request) Successful in 10m14s

Details

CI / Tidy (push) Successful in 9m26s

Details

CI / Build & Test (push) Successful in 10m19s

Details

The agent loop took the final answer only from the terminal (no-tool-call)
turn. Models that "front-load" their answer into an earlier turn that also
calls a tool — then close with a trivial pointer like "(Already answered
above.)" — had their real answer discarded and the pointer delivered. This
recurs across several open-weight models (glm-5.2, etc.); well-behaved models
(Claude/GPT) defer their answer to the terminal turn and are unaffected.

finalOutput() now falls back to the last substantive assistant content in the
transcript when the terminal text is weak (empty, or a short back-reference).
The predicate is narrow and back-reference-gated so short-but-correct answers
("42", "It's down, restarting now.") are never overridden; recovery only picks
a prior turn that reads like a real answer, not a preamble. Zero extra model
calls. Terminal-answer behavior for normal runs is unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-26 18:37:38 -04:00

steve

0147a79d18

feat: conversion-driven extensions — resolvers, DefineTool, hooks, ops controls

CI / Tidy (push) Successful in 9m31s

Details

CI / Build & Test (push) Successful in 10m13s

Details

Phase 9a (ADR-0014): Registry.RegisterResolver for dynamic tiers;
DefineTool[Args] typed tools; Usage cache/reasoning detail fields wired
through anthropic/openai/google; WithPromptCaching (Anthropic
cache_control); agent supervision hooks (WithMaxStepsFunc, WithSteer,
WithCompactor, WithToolErrorLimits + ErrToolLoop); health
Bench/Unbench/Snapshot; ChainConfig.Observer failover events.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

2026-06-10 13:30:06 +02:00

steve

7dab4112ff

feat: agent run loop, Generate[T], reflect-derived schemas

Phase 5:
- agent/: model + system prompt + toolboxes composition; bounded
  tool-dispatch loop (default 10 steps); panic-proof tool execution;
  unknown-tool and duplicate-name handling; history continuation; step
  observers; partial results on ErrMaxSteps/errors (ADR-0012)
- llm.SchemaFor[T]: strict-compatible JSON schemas from Go types
  (nullable pointers, description/enum tags, recursion rejected)
- majordomo.Generate[T]: typed structured output with fence-stripping
  decode and model-naming errors
- README agents/structured-output sections + matrix synced

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

2026-06-10 13:10:18 +02:00

3 Commits