Introduces an opt-in level-based reasoning toggle (low/medium/high) that
each provider translates to its native parameter:
- Anthropic: thinking.budget_tokens (1024/8000/24000), with temperature
forced to default and MaxTokens auto-grown above the budget.
- OpenAI/xAI/Groq via openaicompat: reasoning_effort string, gated by a
new Rules.SupportsReasoning predicate so non-reasoning models don't
receive the parameter. xAI uses Rules.MapReasoningEffort to remap
"medium" to "high" since its API only accepts low|high.
- Google: thinking_config.thinking_budget + include_thoughts:true.
- DeepSeek: SupportsReasoning=false (reasoner is always-on; the
reasoning_content trace was already extracted via openaicompat).
Reasoning content is surfaced as Response.Thinking on Complete and as
StreamEventThinking deltas during streaming. Provider-side: extracted
from Anthropic thinking content blocks, Google's part.Thought=true
parts, and the non-standard reasoning_content field that DeepSeek and
Groq emit (parsed out of raw JSON since openai-go doesn't type it).
Public API:
- llm.ReasoningLevel + ReasoningLow/Medium/High constants
- llm.WithReasoning(level) request option
- Model.WithReasoning(level) for baked-in defaults
- provider.Request.Reasoning, provider.Response.Thinking
- provider.StreamEventThinking
Tests cover Rules-based gating, MapReasoningEffort, reasoning_content
extraction (Complete + Stream), Anthropic budget mapping, and
temperature suppression when thinking is enabled. Existing behavior is
unchanged when Reasoning is the empty string.
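A minimal usage sketch (the constructor and chat plumbing are
illustrative; the reasoning identifiers are the new API):

    // Baked-in default on the model (constructor shape is illustrative).
    m := llm.Anthropic("claude-sonnet-4").WithReasoning(llm.ReasoningMedium)

    // Per-request override via the option; Send's variadic options are assumed.
    reply, _, err := client.Chat(m).Send(ctx, "Why is the sky blue?",
        llm.WithReasoning(llm.ReasoningHigh))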
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Five OpenAI-compatible providers join the library as first-class constructors
(llm.DeepSeek, llm.Moonshot, llm.XAI, llm.Groq, llm.Ollama). Their wire-level
implementation is shared via a new v2/openaicompat package, the extracted
guts of the old v2/openai provider; each provider supplies its own
Rules value to declare per-model constraints (e.g., DeepSeek Reasoner rejects
tools and temperature, Moonshot/xAI accept images only on *-vision* models,
Groq rejects audio input). v2/openai itself becomes a thin wrapper that sets
RestrictTemperature for o-series and gpt-5 models.
A new provider registry (v2/registry.go) exposes llm.Providers() and drives
the TUI's provider picker, so adding a provider in the future is a
single-file change.
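Call sites, sketched (the constructor argument and registry entry shape
are assumptions):

    // Each compat provider is now a one-line constructor backed by openaicompat.
    m := llm.DeepSeek("deepseek-chat") // argument shape assumed

    // The registry enumerates providers for pickers like the TUI's.
    for _, p := range llm.Providers() {
        fmt.Println(p.Name) // entry fields are illustrative
    }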
The TUI at cmd/llm was migrated from v1 to v2 and moved to v2/cmd/llm. With
nothing else depending on v1, the v1 code at the repo root (all .go files,
schema/, internal/, provider/, root go.mod/go.sum) is deleted.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Verifies that WithPromptCaching() on a Chat results in CacheHints being
set on the provider.Request that reaches the provider layer, and that
omitting the option leaves CacheHints nil (no behavior change for
existing callers).
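Shape of the check, sketched with a capturing mock (NewClient(provider.Provider)
is the real seam; the mock's fields and the Chat call shape are illustrative):

    // Capture the provider.Request that reaches the provider layer.
    var got provider.Request
    mock := &capturingProvider{onComplete: func(r provider.Request) { got = r }}
    chat := llm.NewClient(mock).Chat(model, llm.WithPromptCaching())
    _, _, _ = chat.Send(ctx, "hi")
    if got.CacheHints == nil {
        t.Fatal("WithPromptCaching: expected CacheHints on provider.Request")
    }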
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
buildRequest now tracks a source-index → built-message-index mapping
during the role-merge pass, then uses the mapping to attach
cache_control: {type: ephemeral} markers at the positions indicated by
Request.CacheHints. The last tool, the last system part, and the last
non-system message each get a marker when the corresponding hint is set.
Covers the merge-induced index drift that would otherwise cause the
breakpoint to land on the wrong content block when consecutive same-role
source messages are combined into a single Anthropic message with
multiple content blocks.
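The mapping pass, sketched (names hypothetical; `built` is the slice of
Anthropic messages being assembled):

    // builtIdx[i] = index in built of the message that source message i
    // landed in after same-role merging.
    builtIdx := make([]int, len(src))
    for i, m := range src {
        if i > 0 && m.Role == src[i-1].Role {
            last := &built[len(built)-1]
            last.Content = append(last.Content, blocksOf(m)...)
        } else {
            built = append(built, newAnthMessage(m))
        }
        builtIdx[i] = len(built) - 1
    }
    // Attach the marker where the hint points, post-merge:
    // built[builtIdx[hints.LastCacheableMessageIndex]] gets cache_control.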
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Removes the blank-assign workaround that was only needed because the
anth import was being kept alive for Task 5's use. Task 5 will bring
the import back when it actually references anth.CacheControlTypeEphemeral.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Switches buildRequest to emit anthReq.MultiSystem instead of anthReq.System
whenever a system message is present. Upstream's MarshalJSON prefers
MultiSystem when non-empty, so the wire format is unchanged for requests
without cache_control. This refactor is a prerequisite for attaching
cache_control markers to system parts in the next commit.
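The switch itself is small; sketched with the `anth` alias used elsewhere
in this series (the part type's exact shape is an assumption):

    if systemText != "" {
        anthReq.MultiSystem = []anth.MessageSystemPart{
            {Type: "text", Text: systemText},
        }
        // anthReq.System stays empty; upstream MarshalJSON prefers MultiSystem.
    }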
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds two boundary tests suggested by code review:
- TestBuildProviderRequest_CachingEnabled_EmptyMessages: verifies
that caching with an empty message list still emits a CacheHints
with LastCacheableMessageIndex=-1, not a spurious breakpoint.
- TestBuildProviderRequest_CachingNonNilButDisabled: verifies that
an explicitly-disabled cacheConfig (non-nil, enabled=false)
produces nil CacheHints, exercising the &&-guard branch that
the previous "disabled" test left untested.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
buildProviderRequest now computes cache-breakpoint positions automatically
when the WithPromptCaching() option is set. It places up to 3 hints:
tools, system, and the index of the last non-system message. Providers
that don't support caching (OpenAI, Google) ignore the field.
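The placement logic, sketched (the hint field names follow the tests
elsewhere in this series; the rest is hypothetical):

    // Last non-system message; -1 when there is none.
    last := -1
    for i := len(msgs) - 1; i >= 0; i-- {
        if msgs[i].Role != "system" {
            last = i
            break
        }
    }
    req.CacheHints = &provider.CacheHints{
        CacheTools:                len(tools) > 0,
        CacheSystem:               hasSystem,
        LastCacheableMessageIndex: last,
    }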
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduces an opt-in RequestOption that callers can pass to enable
automatic prompt-caching markers. The option populates a cacheConfig
on requestConfig but has no effect yet — plumbing through to
provider.Request and on to the Anthropic provider lands in subsequent
commits.
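A sketch of the option, using the internal names mentioned above (the
exact fields are illustrative):

    // WithPromptCaching opts a request into automatic cache breakpoints.
    func WithPromptCaching() RequestOption {
        return func(c *requestConfig) {
            c.cache = &cacheConfig{enabled: true}
        }
    }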
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds an optional CacheHints field on provider.Request that carries
cache-breakpoint placement directives from the public llm package down
to individual provider implementations. Anthropic will consume these in
a follow-up commit; OpenAI and Google ignore them.
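Sketch of the field's shape (LastCacheableMessageIndex matches the
boundary tests elsewhere in this series; the boolean names are assumptions):

    // CacheHints carries breakpoint placement down to providers.
    type CacheHints struct {
        CacheTools                bool // mark the last tool definition
        CacheSystem               bool // mark the last system part
        LastCacheableMessageIndex int  // last non-system message; -1 means none
    }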
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add provider-specific usage details, fix streaming usage, and return
usage from all high-level APIs (Chat.Send, Generate[T], Agent.Run).
Breaking changes:
- Chat.Send/SendMessage/SendWithImages now return (string, *Usage, error)
- Generate[T]/GenerateWith[T] now return (T, *Usage, error)
- Agent.Run/RunMessages now return (string, *Usage, error)
New features:
- Usage.Details map for provider-specific token breakdowns
(reasoning, cached, audio, thoughts tokens)
- OpenAI streaming now captures usage via StreamOptions.IncludeUsage
- Google streaming now captures UsageMetadata from final chunk
- UsageTracker.Details() for accumulated detail totals
- ModelPricing and PricingRegistry for cost computation
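Call sites pick up the new return value like so (sketch; field and
detail-key names are illustrative):

    reply, usage, err := chat.Send(ctx, "hello")
    if err != nil {
        return err
    }
    fmt.Println(reply, usage.InputTokens, usage.OutputTokens) // field names assumed
    if n, ok := usage.Details["reasoning_tokens"]; ok {       // detail keys vary by provider
        fmt.Println("reasoning tokens:", n)
    }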
Closes #2
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add Audio struct alongside Image for sending audio attachments to
multimodal LLMs. OpenAI uses input_audio content parts (wav/mp3),
Google Gemini uses genai.NewPartFromBytes, and Anthropic skips
audio gracefully since it's not supported.
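Sketch of the call shape (the Audio fields and the send method here are
assumptions, not the confirmed API):

    aud := llm.Audio{Data: wavBytes, Format: "wav"} // field names assumed
    reply, _, err := chat.SendWithAudio(ctx, "Transcribe this clip", aud)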
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Provides a complete lifecycle manager for ephemeral sandbox environments:
- ProxmoxClient: thin REST wrapper for container CRUD, IP discovery, internet toggle
- SSHExecutor: persistent SSH/SFTP for command execution and file transfer
- Manager/Sandbox: high-level orchestrator tying Proxmox + SSH together
- 22 unit tests with mock Proxmox HTTP server
- Proxmox setup & hardening guide (docs/sandbox-setup.md)
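Intended lifecycle, sketched (all names here are illustrative, not the
package's confirmed API):

    mgr := sandbox.NewManager(client)  // client: the Proxmox REST wrapper
    sb, err := mgr.Create(ctx)         // clone container, discover IP, connect SSH
    if err != nil {
        return err
    }
    defer sb.Destroy(ctx)
    out, err := sb.Exec(ctx, "go test ./...")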
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduces v2/agent with a minimal API: Agent, New(), Run(), and AsTool().
Agents wrap a model + system prompt + tools. AsTool() turns an agent into
an llm.Tool, enabling parent agents to delegate to sub-agents through the
normal tool-call loop — no channels, pools, or orchestration needed.
Also exports NewClient(provider.Provider) for custom provider integration.
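The whole surface in one sketch (option and argument shapes are
assumptions; New, Run, and AsTool are the real API):

    researcher := agent.New(model,
        agent.WithSystem("You research topics thoroughly."),
        agent.WithTools(tools.WebSearch()),
    )
    // A sub-agent is just another tool to its parent.
    parent := agent.New(model, agent.WithTools(researcher.AsTool()))
    answer, err := parent.Run(ctx, "Compare Go and Rust error handling")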
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Generic functions that use the "hidden tool" technique to force models
to return structured JSON matching a Go struct's schema, replacing the
verbose "tool as structured output" pattern.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Cover all core library logic (Client, Model, Chat, middleware, streaming,
message conversion, request building) using a configurable mock provider
that avoids real API calls. ~50 tests across 7 files.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Migrate speech-to-text transcription types and OpenAI transcriber
implementation from v1. Types are defined in provider/ to avoid
import cycles and re-exported via type aliases from the root package.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Runs on all pushes and PRs:
- Build, vet, and test both root and v2 modules (with -race)
- Verify go.mod/go.sum tidiness for both modules
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
v2 is a new Go module (v2/) with a dramatically simpler API:
- Unified Message type (no more Input marker interface)
- Define[T] for ergonomic tool creation with standard context.Context
- Chat session with automatic tool-call loop (agent loop)
- Streaming via pull-based StreamReader
- MCP one-call connect (MCPStdioServer, MCPHTTPServer, MCPSSEServer)
- Middleware support (logging, retry, timeout, usage tracking)
- Decoupled JSON Schema (map[string]any, no provider coupling)
- Sample tools: WebSearch, Browser, Exec, ReadFile, WriteFile, HTTP
- Providers: OpenAI, Anthropic, Google (all with streaming)
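Flavor of the new surface in one sketch (identifiers from the list above;
exact signatures are illustrative):

    type WeatherArgs struct {
        City string `json:"city"`
    }

    // Define[T] derives the JSON schema from the args struct.
    weather := llm.Define("get_weather", "Current weather for a city",
        func(ctx context.Context, a WeatherArgs) (string, error) {
            return "22°C and clear in " + a.City, nil
        })

    chat := client.Chat(model, llm.WithTools(weather))
    reply, err := chat.Send(ctx, "What's the weather in Oslo?") // tool loop runs automatically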
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Introduce `MCPServer` to support connecting to MCP servers via stdio, SSE, or HTTP.
- Implement tool fetching, management, and invocation through MCP.
- Add `WithMCPServer` method to `ToolBox` for seamless tool integration.
- Extend schema package to handle raw JSON schemas for MCP tools.
- Update documentation with MCP usage guidelines and examples.
- Migrate `compress_image.go` to `internal/imageutil` for better encapsulation.
- Reorganize LLM provider implementations into distinct packages (`google`, `openai`, and `anthropic`).
- Replace `go_llm` package name with `llm`.
- Refactor internal APIs for improved clarity, including renaming `anthropic` to `anthropicImpl` and `google` to `googleImpl`.
- Add helper methods and restructure message handling for better separation of concerns.
- Update all Go dependencies to latest versions
- Migrate from github.com/google/generative-ai-go/genai to google.golang.org/genai
- Fix google.go to use the new SDK API (NewPartFromText, NewContentFromParts, etc.)
- Update schema package imports to use the new genai package
- Add CLAUDE.md with README maintenance guideline
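The new SDK builds parts and contents explicitly; in brief (this mirrors
the google.golang.org/genai constructors, with an illustrative prompt):

    part := genai.NewPartFromText("Describe Go's context package")
    content := genai.NewContentFromParts([]*genai.Part{part}, genai.RoleUser)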
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Introduce `Providers` struct to handle different language model providers. Implement `Parse` method to extract and validate the provider/model from an input string, then return a chat completion interface. Add error handling for invalid formats or unknown providers.
Previously, required fields were not handled in OpenAI and Google parameter generation. This update adds logic to include a "required" list for both, ensuring mandatory fields are accurately captured in the schema outputs.
This update introduces support for `jsonschema.Integer` types and updates the logic to handle nested items in schemas. Added a new default error log for unknown types using `slog.Error`. Also, integrated tool configuration with a `FunctionCallingConfig` when `dontRequireTool` is false.
Updated the return type of functions and related code from `string` to `any` to improve flexibility and support more diverse outputs. Adjusted function implementations, signatures, and handling of results accordingly.
Introduced `WithFunctionRemoved` and `ExecuteCallbacks` methods to enhance `ToolBox` functionality. This allows dynamic function removal and execution of custom callbacks during tool call processing. Also cleaned up logging and improved handling for required tools in `openai.go`.
Introduce `Response()` and `ToolCall()` methods to access the respective fields from the `Context` struct. This enhances encapsulation and provides a standardized way to retrieve these values.
Previously, OpenAI messages containing both `Content` and `MultiContent` could cause inconsistent behavior. This update ensures `Content` is converted into a `MultiContent` entry to maintain compatibility.
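The normalization, sketched against the go-openai content-part types (the
surrounding message handling is abbreviated):

    if msg.Content != "" && len(msg.MultiContent) > 0 {
        msg.MultiContent = append([]openai.ChatMessagePart{{
            Type: openai.ChatMessagePartTypeText,
            Text: msg.Content,
        }}, msg.MultiContent...)
        msg.Content = "" // the API rejects both fields being set
    }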