Add a comprehensive performance monitoring system that collects CPU, memory, swap, load average, network IO, and GPU stats. It provides both a REST API for the UI and a Prometheus /metrics endpoint.
Backend changes:
- New internal/perf package with configurable interval-based stats collection
- GPU monitoring via LACT (Unix socket) and nvidia-smi fallback on Linux
- Ring buffer (internal/ring) for time-series stat storage
- Prometheus /metrics endpoint with all system and GPU metrics
- Moved LogMonitor to internal/logmon package
- New PerformanceConfig for hot-reloadable monitoring settings
- REST /api/performance endpoint replacing SSE streaming
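A minimal sketch of the ring buffer idea used for time-series stat storage. The `Ring`, `NewRing`, `Push`, `Snapshot`, and `Sample` names are assumptions for illustration only and are not taken from internal/ring; the actual package may differ.

```go
package main

import "fmt"

// Ring is a fixed-capacity circular buffer. Once full, each Push
// overwrites the oldest element. (Hypothetical type, not internal/ring.)
type Ring[T any] struct {
	buf   []T
	start int // index of the oldest element
	n     int // number of stored elements
}

// NewRing allocates a ring holding at most capacity elements.
func NewRing[T any](capacity int) *Ring[T] {
	if capacity < 1 {
		capacity = 1
	}
	return &Ring[T]{buf: make([]T, capacity)}
}

// Push appends v, evicting the oldest element when the ring is full.
func (r *Ring[T]) Push(v T) {
	if r.n < len(r.buf) {
		r.buf[(r.start+r.n)%len(r.buf)] = v
		r.n++
		return
	}
	r.buf[r.start] = v
	r.start = (r.start + 1) % len(r.buf)
}

// Snapshot returns a copy of the stored elements, oldest first.
func (r *Ring[T]) Snapshot() []T {
	out := make([]T, 0, r.n)
	for i := 0; i < r.n; i++ {
		out = append(out, r.buf[(r.start+i)%len(r.buf)])
	}
	return out
}

// Sample is a hypothetical stat record collected once per interval.
type Sample struct {
	UnixSec    int64
	CPUPercent float64
}

func main() {
	r := NewRing[Sample](3)
	for i := 0; i < 5; i++ {
		r.Push(Sample{UnixSec: int64(i), CPUPercent: float64(10 * i)})
	}
	// Only the three most recent samples remain.
	fmt.Println(r.Snapshot())
}
```

A fixed capacity keeps memory use constant regardless of how long the collector runs, which is why this structure suits interval-based sampling.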
UI changes:
- New Performance page with real-time charts for CPU, memory, GPU, and network
- Reusable PerformanceChart component
- LLAMA_SWAP_URL environment variable support
- Improved capture dialog display
Other:
- Example Grafana dashboard for Prometheus metrics
- monitor-test standalone binary
- Config schema and example updates
Fixes #596
- inference handlers now store an activity record for all inference endpoints
- add path, status code, and content type to Activities page
- toggle columns on/off on the Activities page
- add configurable capture level for inference endpoints so large binary blobs are not stored in memory
- store captures in compressed binary format
Upgrade vite and related dependencies to take advantage of Vite 8's
improved build times via Rolldown and Oxc.
- vite: ^6.3.5 → ^8.0.0
- @sveltejs/vite-plugin-svelte: ^5.0.3 → ^7.0.0
- svelte: ^5.19.0 → ^5.46.4
- vite-plugin-compression2: ^2.4.0 → ^2.5.1
- vitest: ^4.0.18 → ^4.1.0
---------
Co-authored-by: Claude <noreply@anthropic.com>
Add a copy-to-clipboard button that appears on hover for each code block
rendered in the chat interface assistant messages.
- Svelte action `codeBlockCopy` injects a button into every `<pre>` element
- MutationObserver reattaches buttons as streaming content arrives
- Button shows a check icon for 2 seconds after a successful copy
- Uses clipboard API with execCommand fallback for non-secure contexts
- CSS hides button by default and reveals it on pre:hover
https://claude.ai/code/session_01PTA5ao5YQuFAS6a9juLeZW
---------
Co-authored-by: Claude <noreply@anthropic.com>
Save request and response headers and bodies that pass through
llama-swap in memory.
- captureBuffer added to configuration. Captures are enabled by default.
- 5MB of memory is allocated for req/response captures in a ring buffer.
Setting captureBuffer to 0 will disable captures.
- UI elements to view captured data added to Activity page. Includes
some QOL features like JSON formatting and recombining SSE chat streams
- capture saving is done at the byte level and has minimal impact on
llama-swap performance
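A rough sketch of a byte-budgeted capture buffer, where the oldest captures are evicted once the budget is exceeded and a budget of 0 disables storage. The `captureStore` type and its methods are hypothetical, and whether `captureBuffer` is expressed in megabytes is an assumption.

```go
package main

import "fmt"

// captureStore keeps captures in arrival order under a total byte budget.
// (Hypothetical type; the real implementation may differ.)
type captureStore struct {
	maxBytes int
	used     int
	items    [][]byte
}

// newCaptureStore creates a store; maxBytes <= 0 disables capturing.
func newCaptureStore(maxBytes int) *captureStore {
	return &captureStore{maxBytes: maxBytes}
}

// Add stores c, evicting the oldest entries until it fits. It reports
// false when capturing is disabled or c alone exceeds the budget.
func (s *captureStore) Add(c []byte) bool {
	if s.maxBytes <= 0 || len(c) > s.maxBytes {
		return false
	}
	for s.used+len(c) > s.maxBytes && len(s.items) > 0 {
		s.used -= len(s.items[0])
		s.items[0] = nil // release the evicted capture for GC
		s.items = s.items[1:]
	}
	s.items = append(s.items, c)
	s.used += len(c)
	return true
}

// Len returns the number of captures currently held.
func (s *captureStore) Len() int { return len(s.items) }

// Used returns the number of bytes currently held.
func (s *captureStore) Used() int { return s.used }

func main() {
	const mb = 1 << 20
	s := newCaptureStore(5 * mb) // assumed default budget of 5MB
	for i := 0; i < 8; i++ {
		s.Add(make([]byte, mb)) // eight 1MB captures
	}
	fmt.Println(s.Len(), s.Used()/mb)
}
```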
Fixes #464
Ref #503
Reorganizes control placement in the playground interfaces and
improves form interactions for better UX, particularly on mobile
devices.
## Key Changes
- **AudioInterface & ImageInterface**: Moved "Clear" buttons from the
top control bar into the action button group below the form inputs for
better visual hierarchy and logical grouping
- **ImageInterface**:
  - Added prompt clearing to the `clearImage()` function so the input
    field is reset when clearing generated images
  - Updated Clear button disabled state to also check if prompt is empty,
    allowing users to clear an empty prompt
  - Added responsive flex styling (`flex-1 md:flex-none`) to the Clear
    button for better mobile layout
- **ExpandableTextarea**:
  - Imported `untrack` from Svelte to properly handle reactive
    dependencies
  - Wrapped `expandedValue.length` in `untrack()` to prevent unnecessary
    reactivity when setting cursor position
  - Improved button visibility on mobile by changing opacity from
    `opacity-0` to `opacity-60` with `md:opacity-0` breakpoint, making the
    expand button more discoverable on touch devices
## Implementation Details
The `untrack()` usage in ExpandableTextarea ensures that reading the
text length doesn't create a reactive dependency, preventing potential
infinite loops while still allowing the effect to run when `isExpanded`
changes.
Replace the legacy React UI with the new Svelte-based one. Introduce a Playground in the UI to quickly test text, image, text-to-speech, and speech-to-text models behind llama-swap.
Key Changes
New Svelte UI (ui-svelte/)
- Multi-tab Playground with Chat, Image Generation, Audio Transcription, and Speech interfaces
- Chat: message editing/regeneration, markdown rendering with LaTeX math support, image attachments, code syntax highlighting
- Image: size selector, download/fullscreen viewing
- Audio: transcription with peer support
- Speech: voice caching with manual refresh, download button
- Responsive mobile layout with collapsible navigation
- XSS fixes and accessibility improvements
Proxy Improvements
- Add gzip/brotli compression for UI static assets (proxy/ui_compress.go)
- Add GET /v1/audio/voices?model={model} endpoint for voice listing
- Add peer support for /v1/audio/transcriptions
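Serving compressed static assets generally depends on the client's `Accept-Encoding` header. The following is a simplified sketch of that negotiation, preferring brotli over gzip; `pickEncoding` is a hypothetical helper and is not taken from proxy/ui_compress.go, which may handle this differently.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// pickEncoding returns "br", "gzip", or "" (serve uncompressed) based on
// an Accept-Encoding header value. Encodings with q=0 are treated as
// refused. This simplified parser ignores wildcards and q-value ranking.
func pickEncoding(acceptEncoding string) string {
	hasBr, hasGzip := false, false
	for _, part := range strings.Split(acceptEncoding, ",") {
		name, params, _ := strings.Cut(part, ";")
		name = strings.ToLower(strings.TrimSpace(name))
		params = strings.TrimSpace(params)
		if strings.HasPrefix(params, "q=") {
			q, err := strconv.ParseFloat(strings.TrimPrefix(params, "q="), 64)
			if err == nil && q == 0 {
				continue // client explicitly refuses this encoding
			}
		}
		switch name {
		case "br":
			hasBr = true
		case "gzip":
			hasGzip = true
		}
	}
	switch {
	case hasBr:
		return "br"
	case hasGzip:
		return "gzip"
	default:
		return ""
	}
}

func main() {
	for _, h := range []string{"gzip, deflate, br", "gzip;q=1.0, br;q=0", "identity"} {
		fmt.Printf("%q -> %q\n", h, pickEncoding(h))
	}
}
```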
Trying out svelte for the UI. The port was done by Claude Code on the iOS app w/ Opus 4.5.
---
* ui: add Svelte port of React UI
Port the React-based UI to Svelte 5 with the following changes:
- Create new ui-svelte directory with complete Svelte 5 implementation
- Use Svelte stores instead of React contexts for state management
- Implement custom ResizablePanels component to replace react-resizable-panels
- Port all pages: LogViewer, Models, Activity
- Port all components: Header, ConnectionStatus, LogPanel, ModelsPanel, etc.
- Use svelte-spa-router for client-side routing
- Same build output directory (proxy/ui_dist) and base path (/ui/)
- Tailwind CSS 4 with same theme configuration
https://claude.ai/code/session_01F3xXLYsd62gePVSFv7aboP
* ui-svelte: simplify state management
- Remove redundant state syncing pattern in LogPanel and ModelsPanel
- Use store values directly with $ syntax instead of manual subscriptions
- Consolidate duplicate title sync logic in App.svelte
- Use existing syncTitleToDocument() from theme.ts
https://claude.ai/code/session_01F3xXLYsd62gePVSFv7aboP
* ui-svelte: use idiomatic Svelte 5 patterns
- Use $effect for document side effects (theme, title) instead of
store subscriptions
- Use class: directive for active nav links in Header
- Remove SSR guards (unnecessary for client-only SPA)
- Remove leaked subscription in syncThemeToDocument
- Simplify theme.ts by removing sync functions
https://claude.ai/code/session_01F3xXLYsd62gePVSFv7aboP
* ui-svelte: fix build warnings and improve accessibility
Fix Svelte build warnings and add proper accessibility support
to interactive components.
- add aria-labels to buttons for screen readers
- implement keyboard navigation for resizable separator
- suppress intentional state initialization warnings
- update Makefile to use ui-svelte build directory
- add peer:true to package-lock.json dependencies
* ui-svelte: reorganize navigation and add log view toggle
Make Models the default landing page and add view mode toggle
to the Logs page with persistent state.
- set Models as default route at /
- move Logs to /logs route
- reorder navigation: Models, Activity, Logs
- add view toggle with three modes: Panels, Proxy only, Upstream only
- fix horizontal overflow with width constraints