dc2d4ec425
executus CI / test (push) Failing after 1m6s
Completes the P4 battery set (squashed onto main from phase-4c-batteries).

- checkpoint/: run.Checkpointer durable-resume (CheckpointStore + throttled handle + Memory).
- schedule/: generic cron Runner (Tick/Loop; no cron grammar of its own).
- critic/: two-tier timeout watchdog (run.Critic) + Escalator policy seam + ExtendOnce default.

Includes the verified gadfly #6 fixes (ExtendOnce per-run, Kill-sticky, watch panic-recovery; checkpoint throttle-after-success; schedule Next-before-Run + nil-guard + Loop recovery).

P4 battery set complete: audit, budget, persona, skill, checkpoint, schedule, critic; each nil-safe, each with a default, each core-import-clean. Executor wiring for Critic/Checkpointer remains a P2 follow-up.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
51 lines
1.8 KiB
Go
// Package checkpoint is the durable-resume battery: it persists a run's
// resumable progress so a run interrupted by a shutdown can be recovered and
// continued on the next boot, rather than silently lost. It plugs into
// run.Ports.Checkpointer.
//
// Mort backs CheckpointStore with its durable-job table; Memory() is the
// zero-dependency default; contrib/store can add a SQLite one. NOTE: the
// executor's call into run.Ports.Checkpointer is a P2 follow-up; this battery
// provides the seam + impls ahead of that wiring.
package checkpoint

import (
	"context"
	"time"

	"gitea.stevedudenhoeffer.com/steve/majordomo/llm"
)

// RunCheckpointMeta is the run attribution needed to resume a run from scratch
// (mirrors mort's agentexec.RunCheckpointMeta).
type RunCheckpointMeta struct {
	RunID       string
	AgentID     string
	AgentName   string
	CallerID    string
	ChannelID   string
	GuildID     string
	Prompt      string
	ModelTier   string
	ParentRunID string
}

// RunCheckpoint is one persisted snapshot of a run's resumable progress.
type RunCheckpoint struct {
	Meta        RunCheckpointMeta
	Messages    []llm.Message // conversation so far
	Iteration   int           // completed agent-loop iterations
	ActivePhase string        // current phase name (multi-phase agents); "" otherwise
	UpdatedAt   time.Time
}

// CheckpointStore persists run checkpoints keyed by run id. A live checkpoint
// means "this run was in flight and not cleanly finished"; Complete/Fail delete
// it. ListInterrupted returns every surviving checkpoint at boot for recovery.
type CheckpointStore interface {
	Save(ctx context.Context, cp RunCheckpoint) error
	Load(ctx context.Context, runID string) (*RunCheckpoint, error)
	Delete(ctx context.Context, runID string) error
	ListInterrupted(ctx context.Context) ([]RunCheckpoint, error)
}
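The CheckpointStore contract described above (Save while a run is in flight, Delete on a clean finish, ListInterrupted at boot) can be illustrated with a small map-backed store. This is only a sketch under stated assumptions, not the package's actual Memory() implementation: `Message` stands in for `llm.Message`, `RunCheckpoint` is trimmed to a bare `RunID`, and `memoryStore`, `ErrNotFound`, and `interruptedAfterCrash` are hypothetical names introduced for the example. The source does not say what Load returns for a missing run, so the sentinel error is a guess.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sort"
	"sync"
	"time"
)

// Message is a hypothetical stand-in for llm.Message so the sketch compiles
// without the majordomo module.
type Message struct {
	Role    string
	Content string
}

// RunCheckpoint is a trimmed copy of the real type (Meta reduced to RunID).
type RunCheckpoint struct {
	RunID     string
	Messages  []Message
	Iteration int
	UpdatedAt time.Time
}

// ErrNotFound is an assumed sentinel for a missing checkpoint.
var ErrNotFound = errors.New("checkpoint: not found")

// memoryStore is an illustrative map-backed store guarded by a mutex.
type memoryStore struct {
	mu  sync.Mutex
	cps map[string]RunCheckpoint
}

func newMemoryStore() *memoryStore {
	return &memoryStore{cps: make(map[string]RunCheckpoint)}
}

// Save overwrites any earlier snapshot for the same run id.
func (s *memoryStore) Save(_ context.Context, cp RunCheckpoint) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	// Copy the slice so later appends by the caller cannot alter the snapshot.
	cp.Messages = append([]Message(nil), cp.Messages...)
	s.cps[cp.RunID] = cp
	return nil
}

// Load returns a copy of the stored checkpoint, or ErrNotFound.
func (s *memoryStore) Load(_ context.Context, runID string) (*RunCheckpoint, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	cp, ok := s.cps[runID]
	if !ok {
		return nil, ErrNotFound
	}
	return &cp, nil
}

// Delete is idempotent: removing an unknown run id is not an error.
func (s *memoryStore) Delete(_ context.Context, runID string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.cps, runID)
	return nil
}

// ListInterrupted returns every surviving checkpoint, sorted by run id.
func (s *memoryStore) ListInterrupted(_ context.Context) ([]RunCheckpoint, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make([]RunCheckpoint, 0, len(s.cps))
	for _, cp := range s.cps {
		out = append(out, cp)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].RunID < out[j].RunID })
	return out, nil
}

// interruptedAfterCrash checkpoints two runs, finishes one cleanly, and
// returns the run ids a recovery pass would still see.
func interruptedAfterCrash() ([]string, error) {
	ctx := context.Background()
	store := newMemoryStore()
	for _, id := range []string{"run-a", "run-b"} {
		cp := RunCheckpoint{RunID: id, Iteration: 1, UpdatedAt: time.Now()}
		if err := store.Save(ctx, cp); err != nil {
			return nil, err
		}
	}
	// run-a finishes cleanly, so its checkpoint is removed.
	if err := store.Delete(ctx, "run-a"); err != nil {
		return nil, err
	}
	cps, err := store.ListInterrupted(ctx)
	if err != nil {
		return nil, err
	}
	ids := make([]string, 0, len(cps))
	for _, cp := range cps {
		ids = append(ids, cp.RunID)
	}
	return ids, nil
}

func main() {
	ids, err := interruptedAfterCrash()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ids) // prints [run-b]
}
```

Only run-b survives, which matches the stated meaning of a live checkpoint: the run was in flight and never cleanly finished. An in-memory map does not outlive the process, so a real recovery pass would need a durable backing such as the job table or SQLite store the package comment mentions.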