P3 (kickoff): generic tools/ library + end-to-end tool-using-agent test

Stand up executus/tools, the generic, host-agnostic tool library, and prove the full pattern end to end:

- tools/tools.go: Register(reg) adds the always-available zero-dependency tools (currently `think`). A light host calls it and is immediately useful; backed tools (web/store/meta groups) will register via grouped registrars with nil-safe Deps as they land.
- tools/think.go: the `think` tool, moved from mort (imports only executus/tool).
- tools/integration_test.go: end-to-end proof that the executor runs an agent which CALLS a registered tool: the fake model emits a `think` tool call, the executor dispatches it through the registry, the model finalises, and the step instrumentation captures the `think` step. Exercises the full tool-dispatch loop through run.Executor.

Stacked on phase-2-run-kernel (P3 needs run.Executor).

Remaining P3: the meta/web/net/store/compose groups + their Deps + default backends (splitting mort's default.go grab-bag).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
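
For readers without the repository at hand, the dispatch loop that the integration test is described as exercising can be sketched roughly as follows. This is a minimal, self-contained illustration only: `Registry`, `Handler`, `Turn`, `scriptedModel`, and `Run` are hypothetical stand-ins invented for this sketch, not the actual executus `tool` registry or `run.Executor` API, whose signatures are not shown in this commit.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Handler is a hypothetical stand-in for a tool's execution function.
type Handler func(args map[string]string) (string, error)

// Registry is a hypothetical stand-in for the tool registry.
type Registry struct{ tools map[string]Handler }

func NewRegistry() *Registry { return &Registry{tools: map[string]Handler{}} }

func (r *Registry) Add(name string, h Handler) { r.tools[name] = h }

// Register plays the role the commit message gives tools.Register:
// it adds the always-available, zero-dependency tools (here only think).
func Register(r *Registry) {
	r.Add("think", func(args map[string]string) (string, error) {
		if strings.TrimSpace(args["thought"]) == "" {
			return `{"ok":false,"error":"empty_thought"}`, nil
		}
		return `{"ok":true}`, nil
	})
}

// Turn is one model output: a tool call (ToolName set) or a final answer.
type Turn struct {
	ToolName string
	Args     map[string]string
	Final    string
}

// scriptedModel is a fake model that replays a fixed list of turns.
type scriptedModel struct {
	turns []Turn
	next  int
}

// Next returns the next scripted turn; once the script is exhausted it
// returns an empty final answer so the loop always terminates.
func (m *scriptedModel) Next(lastResult string) Turn {
	if m.next >= len(m.turns) {
		return Turn{}
	}
	t := m.turns[m.next]
	m.next++
	return t
}

// Run loops until the model finalises, dispatching each tool call through
// the registry and recording the tool name as one instrumented step.
func Run(m *scriptedModel, r *Registry, maxSteps int) (string, []string, error) {
	var steps []string
	last := ""
	for i := 0; i < maxSteps; i++ {
		t := m.Next(last)
		if t.ToolName == "" {
			return t.Final, steps, nil
		}
		h, ok := r.tools[t.ToolName]
		if !ok {
			return "", steps, fmt.Errorf("unknown tool %q", t.ToolName)
		}
		out, callErr := h(t.Args)
		if callErr != nil {
			return "", steps, callErr
		}
		steps = append(steps, t.ToolName)
		last = out
	}
	return "", steps, errors.New("step limit reached without a final answer")
}

func main() {
	reg := NewRegistry()
	Register(reg)
	model := &scriptedModel{turns: []Turn{
		{ToolName: "think", Args: map[string]string{"thought": "plan: answer directly"}},
		{Final: "done"},
	}}
	final, steps, err := Run(model, reg, 4)
	fmt.Println(final, steps, err) // prints: done [think] <nil>
}
```

The real test presumably asserts the same two things this sketch surfaces: the final answer arrives, and exactly one `think` step was captured along the way.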
@@ -0,0 +1,72 @@
// Package tools — v11 think.
//
// Pure prompt-engineering tool: the agent's "thought" is recorded
// to skill_run_logs (via the audit hook the gated wrapper applies
// transparently) but produces no side effect. The literature on
// agent design notes that giving an agent an explicit `think` tool
// keeps it on plan better than giving it nothing — without one,
// agents tend to either skip planning OR babble into the final
// output. With one, planning lands in tool calls and the final
// output stays clean.
//
// V11 deliberately rejects empty thoughts. An agent that learns
// "calling think with empty args is free" will spam it; a
// rejection forces the call to actually carry reasoning.
package tools

import (
	"context"
	"fmt"
	"strings"

	"gitea.stevedudenhoeffer.com/steve/executus/tool"
)

type thinkParams struct {
	Thought string `json:"thought" description:"Your reasoning. May be a plan, a working hypothesis, an analysis of a tool result, or anything else you'd note in a private scratchpad. Empty input is rejected — make this load-bearing."`
}

// thinkResponse is intentionally minimal. The agent doesn't need
// machine-readable output; the value is the audit trail + the
// implicit "now you've planned, what's next" prompting the call
// gives the agent loop.
type thinkResponse struct {
	OK    bool   `json:"ok"`
	Error string `json:"error,omitempty"`
}

// NewThink constructs the v11 think tool. No deps — the audit
// hook wrapper handles persistence transparently.
func NewThink() tool.Tool {
	return tool.NewGatedTool[thinkParams](
		"think",
		"Record a thought / plan / working hypothesis. The thought is logged to the run trace but does NOT affect any external state. Use to slow down before a tricky tool call, sketch a multi-step plan, or summarise findings before continuing. Empty thoughts are rejected.",
		tool.Permission{
			AuthoringRequirement: tool.RequirementAnyone,
			OperatesOn:           tool.ScopeGlobal,
			SafeForShare:         true,
			Categories:           []string{"utility"},
		},
		func(_ context.Context, _ tool.Invocation, p thinkParams) (string, error) {
			if strings.TrimSpace(p.Thought) == "" {
				// Returns ok:false in a structured envelope rather
				// than an error so the agent loop continues with a
				// recoverable signal.
				return `{"ok":false,"error":"empty_thought"}`, nil
			}
			// Successful think emits a flat JSON. The audit hook
			// (auto-injected by NewGatedTool) writes the args + result
			// pair so the trace UI shows the thought verbatim.
			return `{"ok":true}`, nil
		},
	)
}

// Note: returning a hand-rolled JSON literal instead of a marshaller
// keeps think the cheapest possible tool — no heap allocation, no
// json.Marshal call, no goroutine-local buffer churn. The two output
// shapes are static. If a future field is added to thinkResponse,
// switch back to json.Marshal — but until then, the literal is the
// idiom that matches the tool's "do nothing" intent.
var _ = thinkResponse{} // declared so vet doesn't flag the unused struct
var _ = fmt.Errorf
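
The closing note in think.go rests on the two hand-written JSON literals being exactly what `json.Marshal` would emit for `thinkResponse`. A small standalone check of that claim, and of the whitespace-only rejection, might look like the sketch below. The `think` function here is a simplified stand-in for the handler body (it drops `tool.NewGatedTool`, the context, and the invocation), and `marshalled` is a helper invented for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// thinkResponse is copied from think.go so the literals can be compared
// against real json.Marshal output.
type thinkResponse struct {
	OK    bool   `json:"ok"`
	Error string `json:"error,omitempty"`
}

// think is a stand-in for the handler body: it rejects empty or
// whitespace-only thoughts and otherwise returns the success literal.
func think(thought string) string {
	if strings.TrimSpace(thought) == "" {
		return `{"ok":false,"error":"empty_thought"}`
	}
	return `{"ok":true}`
}

// marshalled is a hypothetical helper returning the json.Marshal
// encoding of r as a string.
func marshalled(r thinkResponse) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// Whitespace-only input takes the rejection path.
	fmt.Println(think(" \n\t") == marshalled(thinkResponse{OK: false, Error: "empty_thought"})) // prints: true
	// Any non-blank thought takes the success path.
	fmt.Println(think("check the cache first") == marshalled(thinkResponse{OK: true})) // prints: true
}
```

A check along these lines would also catch drift if a field is later added to `thinkResponse` without updating the literals, which is the situation the note warns about.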