majordomo/docs/adr/0002-canonical-message-model.md
steve dcd004289f feat: foundations — canonical types, Parse grammar, env DSNs, health, chains
Phase 1 of the majordomo build:
- llm/ canonical contract (messages, parts, tools, capabilities, streaming,
  Model/Provider, error classification)
- health/ clock-injected tracker (threshold bench, exponential capped
  cooldown, reset-on-success)
- root Registry + Parse (verbatim model ids, inline recursive alias
  expansion with cycle detection, chain dedup), LLM_* env-DSN providers
  (go-llm parity: lazy fallback + eager LoadEnv), health-aware chain
  executor behind the Model interface
- provider/fake scriptable test provider; hermetic test suite incl. the
  trailing-thinking chain and foreman:// env loading
- ADRs 0001-0008, CLAUDE.md, README (honest matrix), CI workflow,
  docs/phase-1-design.md

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 12:35:34 +02:00


ADR-0002: Canonical message/content model

Status: Accepted — 2026-06-10

Context

Every provider has a different wire shape for conversations, content, tool calls, and system prompts. majordomo needs one canonical shape that all providers translate to/from, expressive enough for multimodality and tool loops, small enough to keep providers honest.

Decision

  • Message{Role, Parts, ToolCalls, ToolResults} with roles system / user / assistant / tool. Part is a sealed interface (TextPart, ImagePart) so providers can switch exhaustively; new media kinds are deliberate API changes, not silent pass-throughs.
  • ImagePart is bytes + MIME only — no URL form. The media pipeline must inspect/resize/transcode images against target capabilities, which requires bytes; fetching remote URLs is the caller's job, not a hidden network dependency inside a model call.
  • Request.System is a dedicated top-level field (maps to Anthropic system, Google SystemInstruction, an OpenAI/Ollama system message). RoleSystem messages in the history are also accepted and folded by providers. Request also carries Tools, ToolChoice, Schema/SchemaName, and sampling knobs; per-call mutation happens via Option funcs applied to a copy, so Request values are reusable.
  • Model ids never carry behavior suffixes: unlike go-llm there is no :low/:medium/:high reasoning-suffix grammar (it conflicts with verbatim model ids like minimax-m3:cloud, see ADR-0003). Reasoning effort will be a request option when providers land.
  • Response{Parts, ToolCalls, FinishReason, Usage, Model, Raw}. Model names the target that actually served the request (vital with chains); Raw is the provider-native escape hatch, never required.
  • Streaming (Stream.Next() → StreamEvent): text deltas stream as they arrive; tool-call arguments are buffered until complete (consumers never see partial JSON); the final event carries the accumulated *Response; io.EOF terminates.
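A minimal Go sketch of two of the decisions above: the sealed Part interface and Option funcs applied to a copy of Request. This is an illustration under assumptions, not the majordomo source: the marker method isPart, the Data/MIME field names, the Apply method, WithSystem, and describe are all hypothetical, and Request is cut down to two fields. Note that Go does not check type switches for exhaustiveness at compile time; sealing only guarantees that no outside package can add a new Part kind, so the default branch (or a linter) is still the safety net.

```go
package main

import "fmt"

// Part is sealed: the unexported marker method means only this package
// can declare implementations. (isPart is an assumed name.)
type Part interface{ isPart() }

// TextPart carries plain text.
type TextPart struct{ Text string }

// ImagePart is bytes + MIME only; there is deliberately no URL field.
type ImagePart struct {
	Data []byte
	MIME string
}

func (TextPart) isPart()  {}
func (ImagePart) isPart() {}

// Request is reduced to the fields this sketch needs.
type Request struct {
	System      string
	Temperature float64
}

// Option mutates a Request in place; Apply decides which Request that is.
type Option func(*Request)

// WithSystem is a hypothetical option that overrides the system prompt.
func WithSystem(s string) Option {
	return func(r *Request) { r.System = s }
}

// Apply runs the options against a copy and returns it. The value receiver
// makes r a shallow copy, so the caller's Request is not modified. A real
// Request with slice fields (for example Tools) would need those cloned too.
func (r Request) Apply(opts ...Option) Request {
	for _, opt := range opts {
		opt(&r)
	}
	return r
}

// describe shows the kind of type switch a provider translation layer
// might use over the closed set of Part kinds.
func describe(p Part) string {
	switch v := p.(type) {
	case TextPart:
		return "text: " + v.Text
	case ImagePart:
		return fmt.Sprintf("image: %s, %d bytes", v.MIME, len(v.Data))
	default:
		// Reached only if a new kind is added here without updating
		// this switch (or if p is nil).
		return "unknown part"
	}
}

func main() {
	base := Request{System: "be brief"}
	perCall := base.Apply(WithSystem("be thorough"))

	fmt.Println(base.System, "|", perCall.System)
	fmt.Println(describe(TextPart{Text: "hi"}))
	fmt.Println(describe(ImagePart{Data: []byte{1, 2, 3}, MIME: "image/png"}))
}
```

Because Apply works on a copy, the same base Request can be reused across calls with different per-call options.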

Consequences

  • Providers stay translation layers; nothing provider-specific leaks into the canonical API.
  • Callers needing remote images fetch them first — explicit, testable.
  • Partial-tool-call streaming UIs are out of scope (acceptable: arguments are rarely useful before they parse).

Alternatives considered

  • Open Part interface — silent content drops on unknown kinds. Rejected.
  • URL image parts with lazy fetch — hidden I/O inside Generate, breaks capability normalization. Rejected.
  • go-llm-style reasoning suffixes — see ADR-0003. Rejected.