43b2471737
Continues the executor's run.Ports wiring (after C0's Palette).

Critic (run/critic.go): when Ports.Critic is set and the agent enables it, the
executor calls Monitor at run start, feeds RecordStep/RecordToolStart from the
step observer, drains the critic's Steer messages into the loop via
agent.WithSteer, and binds the run's hard cancellation to the critic's
(extendable) Deadline through a watch goroutine, so a healthy-but-slow run gets
room while a hung one is killed. Stop() on run end. Soft timeout from
Defaults.CriticSoftTimeout (default 90s). Nil-safe: no critic / not enabled =
no-op.

Delivery (run/executor.go deliver): after the run, when Ports.Delivery is set
and inv.DeliveryID is non-empty, the executor posts Result.Output (or
DeliverError on failure) to a host-interpreted deliver.Target
{inv.DeliveryKind, inv.DeliveryID}. Empty target = caller reads Result.Output
itself (the synchronous default; the `.agent run` canary). Best-effort +
detached.

tool.Invocation gains DeliveryKind/DeliveryID (host-set egress target).

Tests: critic monitored/fed/steered/stopped when enabled, untouched when not;
delivery posts on a target, skips without one. Deferred: Checkpointer (needs a
majordomo hook to snapshot the running message history).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
136 lines
6.7 KiB
Markdown
# executus — developer & agent guide

> ⚠️ **This project is vibe-coded** (AI-authored, human-steered). See `README.md`.

executus is a **batteries-included base for LLM agent harnesses**, layered
strictly above [majordomo]. majordomo is the lean substrate (agent loop, `llm`
types, providers, media, parse/failover/tiering). executus is the opinionated
layer majordomo deliberately omits. **executus requires no majordomo changes**:
it decorates `llm.Model` and wraps `majordomo/agent.Agent`.

[majordomo]: https://gitea.stevedudenhoeffer.com/steve/majordomo

## North star

A brand-new project imports executus, does a little setup, and is most of the way
to agentic capabilities. The mechanism is **one shipped default per seam**:
`executus.New()` (once the runtime lands) is agentic with zero host wiring; the
same builder lets a serious host swap each default for its own implementation and
register its own tools.

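As a rough illustration of "one shipped default per seam", the sketch below uses Go's functional-options pattern. Every name in it (`Deliverer`, `WithDeliverer`, `Runtime`, this `New` signature) is hypothetical and chosen for the example; the real `executus.New()` has not landed yet and may look different.

```go
package main

import "fmt"

// Deliverer is a hypothetical seam: where a run's output goes.
type Deliverer interface {
	Deliver(output string) string
}

// stdoutDeliverer stands in for a shipped default implementation.
type stdoutDeliverer struct{}

func (stdoutDeliverer) Deliver(output string) string { return "stdout: " + output }

// Runtime holds exactly one implementation per seam.
type Runtime struct {
	deliver Deliverer
}

// Option swaps one default for a host-provided implementation.
type Option func(*Runtime)

// WithDeliverer is a hypothetical option name, not executus's API.
// A nil argument keeps the shipped default in place.
func WithDeliverer(d Deliverer) Option {
	return func(r *Runtime) {
		if d != nil {
			r.deliver = d
		}
	}
}

// New works with zero options: every seam starts on its default.
func New(opts ...Option) *Runtime {
	r := &Runtime{deliver: stdoutDeliverer{}}
	for _, opt := range opts {
		opt(r)
	}
	return r
}

// Run fakes an agent run by echoing the prompt through the delivery seam.
func (r *Runtime) Run(prompt string) string {
	return r.deliver.Deliver("echo " + prompt)
}

// discordDeliverer plays the role of a heavy host's own implementation.
type discordDeliverer struct{ channel string }

func (d discordDeliverer) Deliver(output string) string {
	return "discord#" + d.channel + ": " + output
}

func main() {
	// Light host: no wiring at all.
	fmt.Println(New().Run("hi"))
	// Heavy host: same builder, one default swapped.
	fmt.Println(New(WithDeliverer(discordDeliverer{channel: "ops"})).Run("hi"))
}
```

The point of the pattern is that the zero-option call and the fully customised call go through the same constructor, so a host can move from light to heavy one seam at a time.
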
Two consumers define the envelope:

- **mort** (heavy): Discord, mortbux, media, MySQL/GORM, DB-backed convar config,
  saved skills, audit, scheduling, run-critic.
- **gadfly** (light): a CI PR-reviewer Docker image, env-var configured, running
  an N-models × M-lenses structured-output swarm. Needs model fleet, lanes,
  bounded runs, structured output, fan-out, a few read tools, and **none** of the
  batteries.

That spread is why executus is **tiered**: a light host imports core only; a heavy
host opts into batteries.

## Module & layering

One module `gitea.stevedudenhoeffer.com/steve/executus`, `go.mod` = **majordomo +
stdlib only** (no gorm/redis/discordgo/cgo). A second nested module
`contrib/store` carries the SQLite dependency so the core never inherits it.

```
CORE (majordomo + stdlib):
  config/         ConfigSource seam (+ env default)             [P0 ✓]
  lane/           bounded fair-share worker pool                [P0 ✓]
  fanout/         programmatic N×M swarm                        [P0 ✓]
  deliver/        output egress seam (+ Discard/Stdout)         [P0 ✓]
  identity/       caller identity seams                         [P0 ✓]
  run/            run.Executor is RUNNABLE: model-resolve +     [P2 core ✓]
                  toolbox + majordomo loop + compaction +
                  run-bounding (V10 detached timeout) + step/
                  audit observers + Budget gate; RunnableAgent
                  DTO + nil-safe run.Ports. Palette delegation +
                  Critic (monitor/deadline/steer) + Delivery
                  WIRED. Follow-ups: Checkpointer (needs a
                  majordomo msg-history hook), Phases           [C0c]
  dispatchguard/  loop/depth/fan-out caps                       [P0 ✓]
  pendingattach/  attachment dedupe                             [P0 ✓]
  tool/           registry + 3-stage permissions + ssrf         [P1 ✓]
  model/          config-driven tier resolution over majordomo  [P1 ✓]
                  (convar->config.Source; UsageSink/TraceSink seams; GenerateWith[T]
                  structured output, no separate structured/ pkg)
  llmmeta/        shared meta-LLM helper over model/            [P1 ✓]
  compact/        context compactor (WithCompactor hook)        [P2 ✓]
  tools/          generic tool library: Register (think/now/    [P3 ✓]
                  cite, zero-config) + RegisterMeta (classify/
                  extract_entities/summarize) + RegisterStore
                  (kv_*/file_*, default static quota); seams in
                  research_providers.go/file_storage.go/
                  kv_storage.go/quota_provider.go. End-to-end
                  "agent calls a tool" test green. Remaining
                  (deferred): web/net/compose groups + backends

BATTERIES (opt-in siblings, each nil-safe + a default):
  persona/        Agent noun + Storage seam + builtin loader    [P4 ✓]
                  + ToRunnable() bridge to run.RunnableAgent +
                  Memory default (host: chatbot/commands/personalization)
  skill/          Skill noun + LEAN SkillStore (lifecycle/      [P4 ✓]
                  versions/schedule, NOT mort's 60-method
                  monster) + ToRunnable + Memory default
  audit/          run.Audit Sink + Writer + queryable Memory    [P4 ✓]
                  default (skillaudit Storage iface; GORM stays in mort)
  critic/         two-tier timeout watchdog (run.Critic) +      [P4 ✓]
                  Escalator policy seam + ExtendOnce default
  schedule/       generic cron Runner (Tick/Loop over a wired   [P4 ✓]
                  Due/Run/Mark/Next; no cron grammar of its own)
  checkpoint/     CheckpointStore + run.Checkpointer handle     [P4 ✓]
                  (throttled Save/Complete/Fail) + Memory
  budget/         DBBudget rolling-7d + NoOp (run.Budget);      [P4 ✓]
                  BudgetStorage iface + Memory default

contrib/store/    SECOND module (+ modernc.org/sqlite):         [P4 ✓]
                  pure-Go SQLite impls of ALL store seams: budget +
                  persona + skill + audit (JSON-blob+indexed cols,
                  round-trip tested). CI proves the driver lands HERE,
                  not in the core go.sum.

NOTE: checkpoint executor wiring (run.Ports.Checkpointer call sites)
      is still a follow-up; the battery + default exist ahead of that
      wiring. Critic wiring (run.Ports.Critic) has landed in run/.
```

### The one architectural move

The kernel must import **no battery**. In mort today, `agentexec` imports
`agents`, `agentcritic`, and `skillaudit` directly. Those three up-pointing edges
get inverted into nil-safe `run.Ports` interfaces (`PaletteSource`, `Critic`,
`Audit`) plus a `RunnableAgent` DTO. Everything else is wide-but-shallow
repackaging.

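The inversion described above can be sketched as follows: the kernel declares the interfaces it needs, batteries implement them, and every call goes through a nil-safe accessor. The type names `Ports`, `Critic`, `Audit`, and `RunnableAgent` come from this guide, but their fields, method signatures, and the `Execute` function are assumptions made for illustration.

```go
package main

import "fmt"

// Critic and Audit are owned by the kernel; batteries implement them.
// The method sets here are illustrative guesses.
type Critic interface {
	RecordStep(step string)
}

type Audit interface {
	Write(event string)
}

// Ports bundles the optional hooks. A nil field means "not wired".
type Ports struct {
	Critic Critic
	Audit  Audit
}

// recordStep is safe to call on a nil *Ports or with a nil Critic.
func (p *Ports) recordStep(step string) {
	if p == nil || p.Critic == nil {
		return
	}
	p.Critic.RecordStep(step)
}

// audit is safe to call on a nil *Ports or with a nil Audit.
func (p *Ports) audit(event string) {
	if p == nil || p.Audit == nil {
		return
	}
	p.Audit.Write(event)
}

// RunnableAgent is a plain DTO, so the kernel never imports persona/skill.
type RunnableAgent struct {
	Name  string
	Steps []string
}

// Execute is a toy kernel loop: it only ever talks to Ports.
func Execute(a RunnableAgent, p *Ports) int {
	for _, step := range a.Steps {
		p.recordStep(step)
		p.audit(a.Name + ":" + step)
	}
	return len(a.Steps)
}

// memCritic plays the role of a battery-side implementation.
type memCritic struct{ seen []string }

func (c *memCritic) RecordStep(step string) { c.seen = append(c.seen, step) }

func main() {
	agent := RunnableAgent{Name: "demo", Steps: []string{"plan", "act"}}

	// Light host: no batteries wired, the kernel still runs.
	fmt.Println(Execute(agent, nil))

	// Heavy host: opts into a critic only.
	c := &memCritic{}
	fmt.Println(Execute(agent, &Ports{Critic: c}), c.seen)
}
```

Because the import arrow now points from battery to kernel, a light host such as gadfly can pass no ports at all and never link the battery packages.
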
## Invariants (enforced in CI)

- The core module builds with **majordomo + stdlib only**. `go.sum` must not
  contain gorm/redis/discordgo/sqlite/gin.
- No `core/*` package imports a `battery/*` package.
- Standard Go gates: `go build`, `go vet`, `go test -race`, `go mod tidy` clean.

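The `go.sum` invariant can be checked mechanically. The sketch below is one possible check, not the project's actual CI script (which this guide does not show): it splits each module path into tokens and flags exact matches, so that `gin` does not trip on a path such as `plugin`. The sample `go.sum` text, including its versions and hashes, is made up for the demonstration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// forbidden holds the names from the invariant above.
var forbidden = map[string]bool{
	"gorm": true, "redis": true, "discordgo": true, "sqlite": true, "gin": true,
}

// forbiddenModules returns each module path in goSum that has a forbidden
// token. Tokens are the pieces of the path between '/', '.', and '-'.
func forbiddenModules(goSum string) []string {
	seen := map[string]bool{}
	var hits []string
	sc := bufio.NewScanner(strings.NewReader(goSum))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 0 {
			continue
		}
		mod := fields[0]
		if seen[mod] {
			continue // go.sum lists each module twice (zip hash and go.mod hash)
		}
		tokens := strings.FieldsFunc(mod, func(r rune) bool {
			return r == '/' || r == '.' || r == '-'
		})
		for _, tok := range tokens {
			if forbidden[tok] {
				seen[mod] = true
				hits = append(hits, mod)
				break
			}
		}
	}
	return hits
}

// sample is fabricated input: the versions and hashes are placeholders.
const sample = `example.com/core/lib v1.0.0 h1:aaaa=
example.com/core/lib v1.0.0/go.mod h1:bbbb=
gorm.io/gorm v1.0.0 h1:cccc=
gorm.io/gorm v1.0.0/go.mod h1:dddd=
example.com/tools/plugin v1.0.0 h1:eeee=
`

func main() {
	// In CI this would read the real file, e.g. via os.ReadFile("go.sum"),
	// and exit non-zero when the result is non-empty.
	fmt.Println(forbiddenModules(sample))
}
```

Exact token matching is deliberately strict: it would miss a path segment like `go-sqlite3`, so a real gate might add such variants to the list or use a prefix rule instead.
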
## Extraction roadmap

P0 module + zero-coupling moves + core seams (this) → P1 tool registry + model →
P2 run kernel + Ports inversion → P3 generic tools + defaults → P4 persona/skill
redesign + batteries + SQLite store → P5 gadfly-on-core canary (examples/reviewer ✓) → P6
rewire mort + tag v0.1.0. The mort-side rewrite reuses mort's existing
`mort_*_adapters.go` wall as the host adapter layer.

## Conventions

- **Keep `README.md`, this `CLAUDE.md`, and `examples/` in sync with every change,
  in the SAME commit.** No aspirational docs: when you add/rename a package, change
  a seam or a default, or alter the public API, update the docs and the relevant
  example so they always reflect reality (mirrors majordomo's house rule). The
  status markers in the tier map above must track what's actually landed.
- Mirror majordomo's house style: gofmt; check errors immediately and wrap with
  `fmt.Errorf("...: %w", err)`; `// Why:` comments where rationale isn't obvious;
  hermetic tests (majordomo's fake provider; no network in the default suite).
- Every seam is an interface with a nil-safe accessor and a shipped default.
- Keep the core seam surface small and stable; push churn into tools and host
  adapters, not core interfaces.