majordomo/docs/adr/0002-canonical-message-model.md

# ADR-0002: Canonical message/content model

**Status:** Accepted — 2026-06-10

## Context

Every provider has a different wire shape for conversations, content,
tool calls, and system prompts. majordomo needs one canonical shape that all
providers translate to and from: expressive enough for multimodality and
tool loops, yet small enough that provider adapters stay thin translation
layers.

## Decision

- `Message{Role, Parts, ToolCalls, ToolResults}` with roles system / user /
  assistant / tool. `Part` is a **sealed** interface (`TextPart`,
  `ImagePart`) so providers can switch exhaustively; new media kinds are
  deliberate API changes, not silent pass-throughs.
- `ImagePart` is **bytes + MIME only** — no URL form. The media pipeline
  must inspect/resize/transcode images against target capabilities, which
  requires bytes; fetching remote URLs is the caller's job, not a hidden
  network dependency inside a model call.
- `Request.System` is a dedicated top-level field (maps to Anthropic
  `system`, Google `SystemInstruction`, or an OpenAI/Ollama system message).
  `RoleSystem` messages in the history are also accepted and folded by
  providers. `Request` also carries `Tools`, `ToolChoice`,
  `Schema`/`SchemaName`, and sampling knobs; per-call mutation happens via
  `Option` funcs applied to a copy, so `Request` values are reusable.
- Model ids never carry behavior suffixes: unlike go-llm, there is **no
  `:low/:medium/:high` reasoning-suffix grammar** (it conflicts with
  verbatim model ids like `minimax-m3:cloud`, see ADR-0003). Reasoning
  effort will be a request option when providers land.
- `Response{Parts, ToolCalls, FinishReason, Usage, Model, Raw}` — `Model`
  names the target that actually served the request (vital with chains);
  `Raw` is the provider-native escape hatch, never required.
- Streaming (`Stream.Next() → StreamEvent`): text deltas stream as they
  arrive; **tool-call arguments are buffered until complete** (consumers
  never see partial JSON); the final event carries the accumulated
  `*Response`; `io.EOF` terminates.
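
A minimal sketch of how the sealed `Part` interface and copy-on-apply options described above might look. The type names `Message`, `Request`, `Part`, `TextPart`, `ImagePart`, and `Option` come from this ADR; the field names `Data`, `MIME`, `Messages`, and `Temperature`, and the helpers `WithTemperature`, `apply`, and `describe`, are assumptions made for illustration and may not match the real package.

```go
package main

import "fmt"

// Role identifies the author of a message.
type Role string

const (
	RoleSystem    Role = "system"
	RoleUser      Role = "user"
	RoleAssistant Role = "assistant"
	RoleTool      Role = "tool"
)

// Part is sealed: isPart is unexported, so only this package can add kinds.
type Part interface{ isPart() }

// TextPart carries plain text.
type TextPart struct{ Text string }

// ImagePart carries raw bytes plus a MIME type (field names are assumed).
type ImagePart struct {
	Data []byte
	MIME string
}

func (TextPart) isPart()  {}
func (ImagePart) isPart() {}

// Message omits ToolCalls and ToolResults to keep the sketch short.
type Message struct {
	Role  Role
	Parts []Part
}

// Request shows only the fields this sketch needs.
type Request struct {
	System      string
	Messages    []Message // hypothetical field name
	Temperature float64   // hypothetical sampling knob
}

// Option mutates a Request; it is only ever applied to a copy.
type Option func(*Request)

// WithTemperature is a hypothetical per-call option.
func WithTemperature(t float64) Option {
	return func(r *Request) { r.Temperature = t }
}

// apply copies base by value and applies opts to the copy. The copy is
// shallow: slices such as Messages still share backing storage with base.
func apply(base Request, opts ...Option) Request {
	r := base
	for _, o := range opts {
		o(&r)
	}
	return r
}

// describe is the kind of exhaustive type switch a provider would write.
func describe(p Part) string {
	switch v := p.(type) {
	case TextPart:
		return "text:" + v.Text
	case ImagePart:
		return fmt.Sprintf("image:%s:%d bytes", v.MIME, len(v.Data))
	default:
		return "unknown"
	}
}

func main() {
	base := Request{
		System:   "be terse",
		Messages: []Message{{Role: RoleUser, Parts: []Part{TextPart{Text: "hi"}}}},
	}
	hot := apply(base, WithTemperature(0.9))
	fmt.Println(base.Temperature, hot.Temperature)
	fmt.Println(describe(base.Messages[0].Parts[0]))
}
```

Because `isPart` is unexported, code outside the package cannot introduce a new `Part` kind, which is what makes the provider-side switch exhaustive in practice.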

## Consequences

- Providers stay translation layers; nothing provider-specific leaks into
  the canonical API.
- Callers needing remote images fetch them first — explicit, testable.
- Partial-tool-call streaming UIs are out of scope (acceptable: arguments
  are rarely useful before they parse).
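
The streaming contract could be consumed roughly as sketched below. This assumes `Next` returns `(StreamEvent, error)` and that `StreamEvent` exposes `TextDelta`, `ToolCall`, and `Response` fields; the ADR fixes only the behavior (text deltas, buffered tool calls, a final `*Response`, `io.EOF`), so these names, the simplified `Response`, and the `sliceStream` test double are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// ToolCall always carries complete JSON arguments, never a fragment.
type ToolCall struct {
	Name string
	Args string
}

// Response is simplified here; the real one carries Parts, Usage, and more.
type Response struct {
	Text      string
	ToolCalls []ToolCall
}

// StreamEvent uses an assumed field layout.
type StreamEvent struct {
	TextDelta string    // incremental text, may be empty
	ToolCall  *ToolCall // set once a tool call is fully buffered
	Response  *Response // set only on the final event
}

// Stream is the assumed shape of the streaming interface.
type Stream interface {
	Next() (StreamEvent, error)
}

// sliceStream replays canned events and then reports io.EOF.
type sliceStream struct {
	events []StreamEvent
	i      int
}

func (s *sliceStream) Next() (StreamEvent, error) {
	if s.i >= len(s.events) {
		return StreamEvent{}, io.EOF
	}
	ev := s.events[s.i]
	s.i++
	return ev, nil
}

// drain reads until io.EOF, returning the streamed text and the final response.
func drain(s Stream) (string, *Response, error) {
	var text strings.Builder
	var final *Response
	for {
		ev, err := s.Next()
		if errors.Is(err, io.EOF) {
			return text.String(), final, nil
		}
		if err != nil {
			return text.String(), final, err
		}
		text.WriteString(ev.TextDelta)
		if ev.Response != nil {
			final = ev.Response
		}
	}
}

// demoStream builds a canned stream: two text deltas, one tool call, one final event.
func demoStream() Stream {
	call := ToolCall{Name: "lookup", Args: `{"q":"weather"}`}
	return &sliceStream{events: []StreamEvent{
		{TextDelta: "Hel"},
		{TextDelta: "lo"},
		{ToolCall: &call},
		{Response: &Response{Text: "Hello", ToolCalls: []ToolCall{call}}},
	}}
}

func main() {
	text, final, err := drain(demoStream())
	if err != nil {
		fmt.Println("stream error:", err)
		return
	}
	fmt.Println(text, len(final.ToolCalls))
}
```

A UI only needs to render `TextDelta` as it arrives; any tool call it observes can be parsed immediately, because the arguments are already complete.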

## Alternatives considered

- Open `Part` interface — silent content drops on unknown kinds. Rejected.
- URL image parts with lazy fetch — hidden I/O inside Generate, breaks
  capability normalization. Rejected.
- go-llm-style reasoning suffixes — see ADR-0003. Rejected.