- Add /models route (ModelsDash) with unload-all, model list with
start/stop buttons, and show-unlisted toggle
- Make sidebar Models and Playground headings navigate to their pages
while the chevron independently expands/collapses the section
- Extract shared model load/unload orchestration into modelLoad store
- Left-align model names in the ConcurrencyInterface load-test list
- Widen CaptureDialog to 90% with flex-based scroll overflow
- Use sm:max-w-[90%] to override the shadcn dialog's sm:max-w-sm cap
Convert the remaining .btn usages (Concurrency, Performance, CaptureDialog)
to shadcn Button, fix CaptureDialog/PerformanceChart styles to shadcn
tokens, and remove the transitional legacy palette aliases and component
classes from index.css. Drop the now-unused lucide-svelte and shadcn-svelte
dependencies.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01UmuGqwNBJNEAMaWsdCDqUC
- RerankInterface uses Button/Input/Textarea/ToggleGroup
- normalize legacy color utilities and lucide imports across the
remaining playground interfaces, Performance and CaptureDialog
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01UmuGqwNBJNEAMaWsdCDqUC
- add support for http handlers in the request chain to append metadata
to the request
- metrics middleware will include metadata in the activity log
- update Activity UI to support metadata, drag sort columns
- update Activity UI capture dialog to use more screen space
Updates #834
Add a comprehensive performance monitoring system that collects CPU, memory, swap, load average, network IO, and GPU stats. Provides both a REST API for the UI and a Prometheus /metrics endpoint.
Backend changes:
- New internal/perf package with configurable interval-based stats collection
- GPU monitoring via LACT (Unix socket) and nvidia-smi fallback on Linux
- Ring buffer (internal/ring) for time-series stat storage
- Prometheus /metrics endpoint with all system and GPU metrics
- Moved LogMonitor to internal/logmon package
- New PerformanceConfig for hot-reloadable monitoring settings
- REST /api/performance endpoint replacing SSE streaming
UI changes:
- New Performance page with real-time charts for CPU, memory, GPU, and network
- Reusable PerformanceChart component
- LLAMA_SWAP_URL environment variable support
- Improved capture dialog display
Other:
- Example Grafana dashboard for Prometheus metrics
- monitor-test standalone binary
- Config schema and example updates
fixes#596
Add saving request and response headers and bodies that go through
llama-swap in memory.
- captureBuffer added to configuration. Captures are enabled by default.
- 5MB of memory is allocated for req/response captures in a ring buffer.
Setting captureBuffer to 0 will disable captures.
- UI elements to view captured data added to Activity page. Includes
some
QOL features like json formatting and recombining SSE chat streams
- capture saving is done at the byte level and has minimal impact on
llama-swap performance
Fixes#464
Ref #503