Files
foreman/docs/adr/0011-go-client-and-go-llm-integration.md
T
2026-05-23 16:41:20 -04:00

3.1 KiB

ADR-0011: Go integration — the Foreman interface

Status: Accepted — 2026-05-23

Context

The ultimate goal: use the M1 Pro simply as a target for go-llm.

Verified (v2/constructors.go, v2/ollama/ollama.go): llm.OllamaCloud(key, WithBaseURL(...)) already targets "a private Ollama deployment that requires auth" — native /api/chat + Authorization: Bearer <key> against any base URL. foreman is exactly that on the wire (ADR-0003). So integration needs no new provider — only a clean, intent-revealing seam so call sites say "foreman," not "Ollama."

go-llm's provider contract (v2/provider) is two methods, Complete and Stream; a future dedicated provider would implement them.

Decision

Add a llm.Foreman(baseURL, apiKey, opts...) constructor to go-llm that delegates to the ollama native provider — the ollama translation happens behind the scenes:

func Foreman(baseURL, apiKey string, opts ...ClientOption) *Client {
    cfg := &clientConfig{}
    for _, opt := range opts {
        opt(cfg)
    }
    if cfg.baseURL != "" {
        baseURL = cfg.baseURL
    }
    return NewClient(ollamaProvider.New(apiKey, baseURL))
}

// model := llm.Foreman("http://foreman.orgrimmar:PORT", token).Model("qwen3.6:35b")

baseURL is required (foreman has no default public address). This is a deliberate seam: v1 is a pass-through to the ollama provider; a dedicated foreman provider can later replace the delegate to surface job IDs / async state without changing call sites.

Three escalating levels

  • Level 0 — llm.Foreman(...) (now, the headline goal). Transparent, synchronous, full native tool-calling / think:false / streaming. Queueing and model-swap management happen invisibly inside the daemon. Zero provider code.
  • Level 1 — foreman client package (when an orchestration caller needs it). A synchronous facade over the async /jobs surface: given messages, it manages an ephemeral webhook receiver, blocks until done, and returns result + artifacts (falling back to GET /jobs/{id} polling if it can't receive callbacks). For callers wanting async semantics — surfaced job IDs, no long-held connection — with a synchronous call signature.
  • Level 2 — dedicated provider.Provider (only if needed). Wraps Level 1 so foreman is a first-class go-llm backend exposing job IDs / state / artifacts the plain ollama provider can't. Built only if Level 0 proves insufficient.

Consequences

  • Headline goal met with one constructor and no provider code.
  • Call sites are foreman-named and future-proofed by the seam.
  • Async ergonomics are available later without forcing webhook plumbing on callers, and without touching Level-0 users.

Alternatives considered

  • Just tell users to call OllamaCloud with a base URL. Works identically today, but leaks the implementation ("it's Ollama") and offers no seam for future foreman-specific behavior. The named constructor is the requested "foreman interface."
  • Ship a dedicated provider from day one (Level 2 first). More code; bypasses the zero-friction win. Deferred.