diff --git a/v2/CLAUDE.md b/v2/CLAUDE.md
index dad6c90..840ca28 100644
--- a/v2/CLAUDE.md
+++ b/v2/CLAUDE.md
@@ -17,6 +17,7 @@
 - Root package `llm` — public API (Client, Model, Chat, ToolBox, Message types)
 - `provider/` — Provider interface that backends implement
 - `openai/`, `anthropic/`, `google/` — Provider implementations
+- `ollama/` — Native `/api/chat` provider, used by both `llm.Ollama()` (local) and `llm.OllamaCloud(apiKey)` (cloud).
 - `tools/` — Ready-to-use sample tools (WebSearch, Browser, Exec, ReadFile, WriteFile, HTTP)
 - `sandbox/` — Isolated Linux container environments via Proxmox LXC + SSH
 - `internal/schema/` — JSON Schema generation from Go structs
@@ -30,3 +31,4 @@
 5. MCP one-call connect: `MCPStdioServer(ctx, cmd, args...)`
 6. Streaming via pull-based `StreamReader.Next()`
 7. Middleware for logging, retry, timeout, usage tracking
+8. Ollama uses the native `/api/chat` API rather than the OpenAI-compat `/v1` endpoint. Native API supports `think: false` for thinking-capable models, has more reliable tool calling, and is approximately 15-20% lower latency. Both local and cloud share the same provider; only the apiKey/baseURL differ. `llm.Ollama()` targets `http://localhost:11434` with no Authorization header; `llm.OllamaCloud(key)` targets `https://ollama.com` with `Authorization: Bearer <key>`.