Updated README to enhance the description of the web interface and added details about features like token metrics, request inspection, model management, and real-time log streaming.
Add a new Rerank tab to the playground that lets users test /v1/rerank
endpoints. Supports a visual table editor and a JSON editor mode that
stay in sync when toggling between them.
- add rerankApi.ts with typed wrapper for /v1/rerank
- add RerankInterface.svelte with query input, sortable document table,
color-coded scores, auto-add row, cancel/clear, and token usage
- add rerankLoading store to playgroundActivity derived store
- register Rerank tab in Playground.svelte
Updates #481
Add setParamsByID filter that applies different request parameters based
on the requested model ID, enabling per-alias behaviour for a single
loaded model.
- add SetParamsByID field to Filters struct and SanitizedSetParamsByID
method
- substitute ${MODEL_ID} and other macros in setParamsByID keys and
values
- validate no unknown macros remain in keys or values after substitution
- apply setParamsByID in proxyInferenceHandler after setParams (can
override it)
- update config-schema.json with setParamsByID definition
- update UI to show aliases and make them selectable in the Playground
closes#534
Pause auto-scroll when the user scrolls up to review logs, and resume
when they scroll back to the bottom.
- add `userScrolledUp` state variable
- add `handleScroll` to detect scroll position with 40px threshold
- guard the auto-scroll effect with `!userScrolledUp`
closes#529
- Keep Playground component mounted when navigating away, preserving
streaming/generating state
- Add animated gradient effect on Playground nav link when activity is
in progress
## Summary
- Add `--provenance=false` to docker build commands in
`build-container.sh`
- BuildKit attestation manifests are stored as untagged images in GHCR,
and the `delete-untagged-containers` cleanup job deletes them, breaking
the manifest list and causing `manifest unknown` errors on pull
- ref: https://github.com/actions/delete-package-versions/issues/162
Add saving request and response headers and bodies that go through
llama-swap in memory.
- captureBuffer added to configuration. Captures are enabled by default.
- 5MB of memory is allocated for req/response captures in a ring buffer.
Setting captureBuffer to 0 will disable captures.
- UI elements to view captured data added to Activity page. Includes
some
QOL features like json formatting and recombining SSE chat streams
- capture saving is done at the byte level and has minimal impact on
llama-swap performance
Fixes#464
Ref #503
Reorganizes control placement in the playground interfaces and
improves form interactions for better UX, particularly on mobile
devices.
## Key Changes
- **AudioInterface & ImageInterface**: Moved "Clear" buttons from the
top control bar into the action button group below the form inputs for
better visual hierarchy and logical grouping
- **ImageInterface**:
- Added prompt clearing to the `clearImage()` function so the input
field is reset when clearing generated images
- Updated Clear button disabled state to also check if prompt is empty,
allowing users to clear an empty prompt
- Added responsive flex styling (`flex-1 md:flex-none`) to the Clear
button for better mobile layout
- **ExpandableTextarea**:
- Imported `untrack` from Svelte to properly handle reactive
dependencies
- Wrapped `expandedValue.length` in `untrack()` to prevent unnecessary
reactivity when setting cursor position
- Improved button visibility on mobile by changing opacity from
`opacity-0` to `opacity-60` with `md:opacity-0` breakpoint, making the
expand button more discoverable on touch devices
## Implementation Details
The `untrack()` usage in ExpandableTextarea ensures that reading the
text length doesn't create a reactive dependency, preventing potential
infinite loops while still allowing the effect to run when `isExpanded`
changes.
* .github/workflows: add UI tests and path-filter Go CI
Add ui-tests.yml workflow to run svelte type checking and vitest
on push/PR to main when ui-svelte/ files change.
- Add path filters to go-ci.yml and go-ci-windows.yml to skip
Go tests when only non-backend files change
- Filter on **/*.go, go.mod, go.sum, and Makefile
https://claude.ai/code/session_01E6acq54D8JjuE7pczxPGT7
* ui-svelte: remove unused declarations in SpeechInterface
Remove unused `generatedText` state and `clearAudio` function
that caused svelte-check errors.
https://claude.ai/code/session_01E6acq54D8JjuE7pczxPGT7
* .github/workflows: update Node.js to v24
Node 23 is end-of-life; bump to 24 in ui-tests.yml and release.yml.
https://claude.ai/code/session_01E6acq54D8JjuE7pczxPGT7
---------
Co-authored-by: Claude <noreply@anthropic.com>
Replace the legacy React UI with the new Svelte-based one. Introduce a Playground in the UI to quickly test out text, image, text to speech and speech to text models behind llama-swap.
Key Changes
New Svelte UI (ui-svelte/)
- Multi-tab Playground with Chat, Image Generation, Audio Transcription, and Speech interfaces
- Chat: message editing/regeneration, markdown rendering with LaTeX math support, image attachments, code syntax highlighting
- Image: size selector, download/fullscreen viewing
- Audio: transcription with peer support
- Speech: voice caching with manual refresh, download button
- Responsive mobile layout with collapsible navigation
- XSS fixes and accessibility improvements
Proxy Improvements
- Add gzip/brotli compression for UI static assets (proxy/ui_compress.go)
- Add GET /v1/audio/voices?model={model} endpoint for voice listing
- Add peer support for /v1/audio/transcriptions
Trying out svelte for the UI. The port was done by Claude Code on the iOS app w/ Opus 4.5.
---
* ui: add Svelte port of React UI
Port the React-based UI to Svelte 5 with the following changes:
- Create new ui-svelte directory with complete Svelte 5 implementation
- Use Svelte stores instead of React contexts for state management
- Implement custom ResizablePanels component to replace react-resizable-panels
- Port all pages: LogViewer, Models, Activity
- Port all components: Header, ConnectionStatus, LogPanel, ModelsPanel, etc.
- Use svelte-spa-router for client-side routing
- Same build output directory (proxy/ui_dist) and base path (/ui/)
- Tailwind CSS 4 with same theme configuration
https://claude.ai/code/session_01F3xXLYsd62gePVSFv7aboP
* ui-svelte: simplify state management
- Remove redundant state syncing pattern in LogPanel and ModelsPanel
- Use store values directly with $ syntax instead of manual subscriptions
- Consolidate duplicate title sync logic in App.svelte
- Use existing syncTitleToDocument() from theme.ts
https://claude.ai/code/session_01F3xXLYsd62gePVSFv7aboP
* ui-svelte: use idiomatic Svelte 5 patterns
- Use $effect for document side effects (theme, title) instead of
store subscriptions
- Use class: directive for active nav links in Header
- Remove SSR guards (unnecessary for client-only SPA)
- Remove leaked subscription in syncThemeToDocument
- Simplify theme.ts by removing sync functions
https://claude.ai/code/session_01F3xXLYsd62gePVSFv7aboP
* ui-svelte: fix build warnings and improve accessibility
Fix Svelte build warnings and add proper accessibility support
to interactive components.
- add aria-labels to buttons for screen readers
- implement keyboard navigation for resizable separator
- suppress intentional state initialization warnings
- update Makefile to use ui-svelte build directory
- add peer:true to package-lock.json dependencies
* ui-svelte: reorganize navigation and add log view toggle
Make Models the default landing page and add view mode toggle
to the Logs page with persistent state.
- set Models as default route at /
- move Logs to /logs route
- reorder navigation: Models, Activity, Logs
- add view toggle with three modes: Panels, Proxy only, Upstream only
- fix horizontal overflow with width constraints
Extend the /running endpoint to return more details about running
processes beyond just model and state.
- add cmd field to show the command being executed
- add proxy field to show the proxy URL
- add ttl (UnloadAfter) for automatic unloading configuration
- add name and description for model metadata
- update tests to verify new fields are returned correctly
fixes#471
This commit simplifies substitution of environment variables into the configuration. There was a lot of repetitive code substituting ${env.VAR_NAME} into different fields after the configuration was parsed into a config.Config. This refactor uses a string substitution of env vars into the YAML config before it is fully parsed. This eliminates a lot of logic while maintaining backwards compatibility.
* config: support environment variable macros in peer apiKeys
Add ${env.VAR_NAME} substitution for peer apiKey fields, consistent
with existing env macro support for model fields and global apiKeys.
- Add env macro substitution for peers.{name}.apiKey in LoadConfigFromReader
- Add tests for peer apiKey env substitution
- Update config.example.yaml to show env macro usage
* config: support macros in peer apiKey and filters
Extend macro substitution to peer configuration fields:
- peers.{name}.apiKey supports both global macros and env macros
- peers.{name}.filters.stripParams supports both macro types
- peers.{name}.filters.setParams supports both macro types
Also renamed validateMetadataForUnknownMacros to validateNestedForUnknownMacros
for reuse across model metadata and peer filters validation.
Add substituteEnvMacros support for apiKeys configuration field,
allowing API keys to be loaded from environment variables using
the ${env.VAR_NAME} syntax.
- Apply env macro substitution before validation
- Add tests for env macro substitution in apiKeys
* config: add environment variable macros
Add support for ${env.VAR_NAME} syntax to pull values from system
environment variables during config loading.
- env macros processed before regular macros (allows macros to reference env vars)
- works in cmd, cmdStop, proxy, checkEndpoint, filters.stripParams, metadata
- returns error if env var is not set
- add comprehensive tests
fixes#462
* docs: add env macro example to config.example.yaml
This unifies the filtering capabilities for models and peers
- stripParams: removes params in the request
- setParams: sets params in the request
fixes#453
* docker: add .env usage in build-container.sh
* .github,docker: add rocm, improve logging
* .github,CLAUDE.md: fix workflow and update guidelines
Update containers workflow to only push images when triggered
manually or on schedule, not on workflow file changes.
- add push trigger for workflow file changes in containers.yml
- update push condition to skip on regular push events
- update CLAUDE.md commit message guidelines
* docker: remove comma in build-container.sh
* .github,docker: improve container build workflow
Add pagination support for fetching llama.cpp tags and improve debugging.
- add build-container.sh to workflow trigger paths
- implement fetch_llama_tag() with pagination support
- replace .env with local testing instructions
- add DEBUG_ABORT_BUILD flag for testing
Fixes#444 where the UI with api keys did not work. The choice to use
http basic authorization is for simple, automatic browser support. No
changes to the UI were necessary. Just use an API key as the password,
no user name is required.
This PR allows a single llama-swap to be the central proxy for models served by other inference servers. The peer servers can be another llama-swap or any API that supports the /v1/* inference endpoint.
Updates: #433, #299Closes: #296
Add configuration support for api keys that are enforced by llama-swap. Keys are stripped before sending them to upstream servers.
Updates: #433, #50 and #251