steve cb4c612461
executus CI / test (pull_request) Successful in 1m45s
fix(run): address gadfly review of the critic-deadline PR
All 11 findings were real (3 clusters):

- Failsafe ceiling could pre-empt the critic's backstop (e9c9483f, 9109317b,
  d5a9bf0d, 76ad171e): CriticAbsoluteMax was 6h, but the host's backstop
  (MaxRuntime × multiplier, or its own absolute max) can reach 6h+, so the
  ceiling fired first and reintroduced a premature hard cap. Now CriticAbsoluteMax
  is a 24h RUNAWAY guard set far beyond any realistic backstop (the host clamps
  its own backstop to a much smaller absolute max, e.g. mort's 6h convar), so it
  never pre-empts a healthy supervised run. Comments corrected.

- nil Monitor handle lost the MaxRuntime cap (df016a6f, 9dd42827): a critic-enabled
  run whose host Monitor returned no handle had no deadline-watch and was bounded
  only by the generous ceiling. Added an unsupervised-run failsafe that re-wraps
  runCtx to the nominal MaxRuntime when the critic is enabled but didn't arm.
  New test TestCriticOwnsDeadline_NilHandleFallsBackToMaxRuntime.

- CriticSoftTimeout vestigial / dead fallback (f7764919, 9805bebe, 6864086f,
  b2b11721): the soft trigger is now always the resolved MaxRuntime (> 0), so the
  CriticSoftTimeout field + its startCritic fallback were unreachable. Removed the
  field entirely; the remaining 90s floor is documented as defensive-only.

- DRY (f30ce827): extracted e.criticOwnsDeadline(ra), now the single predicate used
  by both Run and startCritic so they can't drift.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Jo75sqmeVPgFUWZQBn179X
2026-06-30 11:32:46 -04:00

executus

⚠️ This project is vibe-coded. executus is written almost entirely by an AI coding agent (Claude), with a human steering at the design and review level rather than typing the code. That's a deliberate choice, stated up front — the same way gadfly is. Read the code before you depend on it, pin a version, and file issues if something looks off. It is offered as-is.

A batteries-included base for building LLM agent harnesses in Go. Import it, do a little wiring, and you have agentic capabilities: a bounded run loop, a tool registry with a suite of common tools, context compaction, config-driven model tiering and failover, structured output, and parallel fan-out — with sensible defaults so a brand-new project is agentic with almost no setup, and pluggable seams so a serious host can swap in its own storage, config, delivery, and tools.

executus sits strictly above majordomo — the lean LLM substrate (agent loop, canonical llm types, providers, media normalization, model parsing / failover / tiering). majordomo stays the substrate; executus is the opinionated, batteries-included layer on top. executus requires no changes to majordomo.

Status

Early. Being extracted, phase by phase, from the agent layer of mort (a Discord bot) — mort and gadfly are the first two consumers (heavy and light). See CLAUDE.md for the architecture and the extraction roadmap (P0P6).

Available today:

  • run/executus is runnable. run.Executor ties model resolution, the tool registry, majordomo's agent loop, context compaction, run-bounding, and step/audit instrumentation into one Run(ctx, RunnableAgent, inv) Result, with every host concern behind a nil-safe run.Ports (Audit/Budget/Critic/ Checkpointer/PaletteSource/Delivery/InputFiles). See examples/minimal.
  • model/ — config-driven tier resolution + failover over majordomo, with pluggable UsageSink/TraceSink and GenerateWith[T] structured output.
  • tool/ — the tool registry + 3-stage permission model + SSRF guard.
  • compact/ — the per-run context compactor.
  • lane/ — bounded worker pool with fair-share queueing (run- and provider-concurrency).
  • fanout/ — programmatic N×M swarm with bounded global + per-key concurrency.
  • config/, deliver/, identity/ — host seams (config / output / identity), each with a shipped default.
  • dispatchguard/, pendingattach/ — run-safety primitives.
  • examples/reviewer — a gadfly-shaped PR reviewer on the core only (env-config model fleet → fanout N×M swarm → model.GenerateWith[T] structured findings → consolidation), the light-tier canary; CI asserts it pulls in no battery.

Design

Two tiers in one module (go.mod = majordomo + stdlib only):

  • Core — everything a light host needs to be agentic: run loop, tool registry + common tools, model resolution, compaction, lanes, fan-out, structured output. No persistence, no scheduling.
  • Batteries (opt-in sibling packages) — persona/agent nouns, saved skills, audit, run-critic, scheduling, budgets, checkpointing. Each is nil-safe and ships a default, so you add only what you use.

Persistence that needs a real database lives in a separate nested module (contrib/store, pure-Go SQLite) so the core never drags in a DB driver — a static-binary host (gadfly) stays static.

License

TBD.

S
Description
Batteries-included base for building LLM agent harnesses in Go (above majordomo). Vibe-coded.
Readme 1.1 MiB
Languages
Go 100%