majordomo

Author	SHA1	Message	Date
steve	64642c43c4	fix(llamaswap): address Gadfly review findings - Unload: reject model ids containing path separators (/?#) so a model name can't redirect the request to another endpoint; ":" (common in ids) stays verbatim. - doJSON: take a model arg so image/management HTTP errors carry the target id (was always ""); add a base-URL guard so management methods fail clearly instead of building a bare-path request; cap the success-path JSON decode with io.LimitReader (64 MiB) and drain the body when out is nil for conn reuse. - image: reject negative Request.N before sending. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-27 16:04:23 -04:00
steve	96c612e707	feat(llamaswap): add llama-swap provider + canonical imagegen interface Add provider/llamaswap, a tailored provider for llama-swap (the model-swapping proxy over llama.cpp / stable-diffusion.cpp). Its chat path delegates to provider/openai at {base}/v1 — no duplicated wire client (ADR-0007) — with legacy max_tokens, a Bearer no-key placeholder for keyless local instances, and a timeout-free client so cold model swaps rely on context deadlines. The "tailored" surface is concrete management methods (ListModels / Running / Unload) that don't belong on the canonical llm.Provider interface. The llama-swap:// DSN scheme builds an http base URL (local-first); a no-URL built-in errors clearly on use, mirroring foreman. Add imagegen, a new canonical text-to-image interface separate from llm (Request/Result/Model/Provider; Image = llm.ImagePart so generated images feed straight back into chat). First backend is llama-swap via OpenAI /v1/images/generations (b64_json, bytes-only). Re-exported from the root. v1 is txt2img only. Hermetic httptest coverage for chat delegation, management endpoints, image decode, and scheme wiring. ADR-0015 + ADR-0016, README support matrix + image-gen section, CLAUDE.md package map, and progress.md updated in the same commit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-27 15:01:54 -04:00
steve	0147a79d18	feat: conversion-driven extensions — resolvers, DefineTool, hooks, ops controls Phase 9a (ADR-0014): Registry.RegisterResolver for dynamic tiers; DefineTool[Args] typed tools; Usage cache/reasoning detail fields wired through anthropic/openai/google; WithPromptCaching (Anthropic cache_control); agent supervision hooks (WithMaxStepsFunc, WithSteer, WithCompactor, WithToolErrorLimits + ErrToolLoop); health Bench/Unbench/Snapshot; ChainConfig.Observer failover events. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 13:30:06 +02:00
steve	04b21fdad2	feat: live-validated against Ollama Cloud; schema instruction fallback for cloud Phase 8: all six live checks pass (tier aliases, thinking-tier chat, real tool invocation, structured Generate[T], forced failover with bench+skip, skill agent). Discovery: ollama.com ignores the format field — the provider now also states the schema as a system instruction (constrained decoding locally, instruction-guided JSON on cloud), with hermetic test. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 13:22:54 +02:00
steve	1ca607906d	feat: Google (Gemini) provider on the official Gen AI SDK Phase 4: provider/google on google.golang.org/genai v1.59.0 — lazy cached client, FunctionResponse tool loop, raw-JSON-schema tools and structured output, ThinkingLevel reasoning mapping, iter.Pull2 streaming, hermetic httptest suite via HTTPOptions.BaseURL. Registry wires google + gemini schemes to the real client; stub machinery deleted (all built-ins real). ADR-0011; README matrix + CLAUDE.md synced. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 13:04:28 +02:00
steve	043249e0e1	feat: OpenAI, Anthropic, and native-Ollama providers + media pipeline Phase 3: - provider/openai: Chat Completions for OpenAI + compat endpoints (SSE streaming with by-index tool-call assembly, response_format json_schema, legacy max_tokens option, reasoning_effort) - provider/anthropic: Messages API (tool_use/tool_result, GA structured output via output_config.format, full SSE event parser, 529 transient) - provider/ollama: one native /api/chat client behind the ollama, ollama-cloud, and foreman built-ins (presets; NDJSON streaming tolerant of foreman's buffered single-object responses; object tool arguments; format-schema structured output; think mapping) - media/: capability normalization (sniff, downscale, transcode, byte ladder, ErrUnsupported), wired into the chain executor per target with penalty-free advance past incapable elements - registry: real provider + scheme wiring, WithHTTPClient option, required env-foreman TLS chat round-trip test - ADR-0009 multimodal strategy, ADR-0010 tools/structured mapping; README matrix + CLAUDE.md synced Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 12:58:08 +02:00
steve	dcd004289f	feat: foundations — canonical types, Parse grammar, env DSNs, health, chains Phase 1 of the majordomo build: - llm/ canonical contract (messages, parts, tools, capabilities, streaming, Model/Provider, error classification) - health/ clock-injected tracker (threshold bench, exponential capped cooldown, reset-on-success) - root Registry + Parse (verbatim model ids, inline recursive alias expansion with cycle detection, chain dedup), LLM_* env-DSN providers (go-llm parity: lazy fallback + eager LoadEnv), health-aware chain executor behind the Model interface - provider/fake scriptable test provider; hermetic test suite incl. the trailing-thinking chain and foreman:// env loading - ADRs 0001-0008, CLAUDE.md, README (honest matrix), CI workflow, docs/phase-1-design.md Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>	2026-06-10 12:35:34 +02:00

7 Commits