Compare commits

..

2 Commits

Author SHA1 Message Date
Benson Wong 574fdfabb4 UI improvements (#213)
* use two column for logs view on wider screens

* hide log controls when panel is minimized
2025-07-31 11:59:21 -07:00
Benson Wong 5172cb2e12 Update docs in Readme [skip ci] 2025-07-30 11:51:14 -07:00
2 changed files with 43 additions and 32 deletions
+16 -7
View File
@@ -27,6 +27,7 @@ Written in golang, it is very easy to install (single binary with no dependencie
- `/upstream/:model_id` - direct access to upstream HTTP server ([demo](https://github.com/mostlygeek/llama-swap/pull/31))
- `/unload` - manually unload running models ([#58](https://github.com/mostlygeek/llama-swap/issues/58))
- `/running` - list currently running models ([#61](https://github.com/mostlygeek/llama-swap/issues/61))
- `/health` - just returns "OK"
- ✅ Run multiple models at once with `Groups` ([#107](https://github.com/mostlygeek/llama-swap/issues/107))
- ✅ Automatic unloading of models after timeout by setting a `ttl`
- ✅ Use any local OpenAI compatible server (llama.cpp, vllm, tabbyAPI, etc)
@@ -74,10 +75,18 @@ llama-swap ships with a real time web interface to monitor logs and status of mo
<img width="1786" height="1334" alt="image" src="https://github.com/user-attachments/assets/d6258cb9-1dad-40db-828f-2be860aec8fe" />
## Installation
## Docker Install ([download images](https://github.com/mostlygeek/llama-swap/pkgs/container/llama-swap))
llama-swap can be installed in multiple ways
Docker is the quickest way to try out llama-swap:
1. Docker
2. Homebrew (OSX and Linux)
3. From release binaries
4. From source
### Docker Install ([download images](https://github.com/mostlygeek/llama-swap/pkgs/container/llama-swap))
Docker images with llama-swap and llama-server are built nightly.
```shell
# use CPU inference comes with the example config above
@@ -99,7 +108,7 @@ $ curl -s http://localhost:9292/v1/chat/completions \
```
<details>
<summary>Docker images are built nightly for cuda, intel, vulcan, etc ...</summary>
<summary>Docker images are built nightly with llama-server for cuda, intel, vulcan and musa.</summary>
They include:
@@ -122,9 +131,9 @@ $ docker run -it --rm --runtime nvidia -p 9292:8080 \
</details>
## Homebrew Install (macOS/Linux)
### Homebrew Install (macOS/Linux)
For macOS & Linux users, `llama-swap` can be installed via [Homebrew](https://brew.sh):
The latest release of `llama-swap` can be installed via [Homebrew](https://brew.sh).
```shell
# Set up tap and install formula
@@ -136,9 +145,9 @@ llama-swap --config path/to/config.yaml --listen localhost:8080
This will install the `llama-swap` binary and make it available in your path. See the [configuration documentation](https://github.com/mostlygeek/llama-swap/wiki/Configuration)
## Bare metal Install ([download](https://github.com/mostlygeek/llama-swap/releases))
### Pre-built Binaries ([download](https://github.com/mostlygeek/llama-swap/releases))
Pre-built binaries are available for Linux, Mac, Windows and FreeBSD. These are automatically published and are likely a few hours ahead of the docker releases. The baremetal install works with any OpenAI compatible server, not just llama-server.
Binaries are available for Linux, Mac, Windows and FreeBSD. These are automatically published and are likely a few hours ahead of the docker releases. The binary install works with any OpenAI compatible server, not just llama-server.
1. Download a [release](https://github.com/mostlygeek/llama-swap/releases) appropriate for your OS and architecture.
1. Create a configuration file, see the [configuration documentation](https://github.com/mostlygeek/llama-swap/wiki/Configuration).
+27 -25
View File
@@ -6,7 +6,7 @@ const LogViewer = () => {
const { proxyLogs, upstreamLogs } = useAPI();
return (
<div className="flex flex-col gap-5" style={{ height: "calc(100vh - 125px)" }}>
<div className="flex flex-col lg:flex-row gap-5" style={{ height: "calc(100vh - 125px)" }}>
<LogPanel id="proxy" title="Proxy Logs" logData={proxyLogs} />
<LogPanel id="upstream" title="Upstream Logs" logData={upstreamLogs} />
</div>
@@ -90,34 +90,36 @@ export const LogPanel = ({ id, title, logData, className }: LogPanelProps) => {
<div className="flex flex-col md:flex-row md:items-center md:justify-between gap-4">
{/* Title - Always full width on mobile, normal on desktop */}
<div className="w-full md:w-auto" onClick={() => setIsCollapsed(!isCollapsed)}>
<h3 className="m-0 text-lg">{title}</h3>
<h3 className="m-0 text-lg p-0">{title}</h3>
</div>
<div className="flex flex-col sm:flex-row gap-4 w-full md:w-auto">
{/* Sizing Buttons - Stacks vertically on mobile */}
<div className="flex flex-wrap gap-2">
<button className="btn" onClick={toggleFontSize}>
font: {fontSize}
</button>
<button className="btn" onClick={() => setTextWrap((prev) => !prev)}>
{wrapText ? "wrap" : "wrap off"}
</button>
</div>
{!isCollapsed && (
<div className="flex flex-col sm:flex-row gap-4 w-full md:w-auto">
{/* Sizing Buttons - Stacks vertically on mobile */}
<div className="flex flex-wrap gap-2">
<button className="btn" onClick={toggleFontSize}>
font: {fontSize}
</button>
<button className="btn" onClick={() => setTextWrap((prev) => !prev)}>
{wrapText ? "wrap" : "wrap off"}
</button>
</div>
{/* Filtering Options - Full width on mobile, normal on desktop */}
<div className="flex flex-1 min-w-0 gap-2">
<input
type="text"
className="flex-1 min-w-[120px] text-sm border p-2 rounded"
placeholder="Filter logs..."
value={filterRegex}
onChange={(e) => setFilterRegex(e.target.value)}
/>
<button className="btn" onClick={() => setFilterRegex("")}>
Clear
</button>
{/* Filtering Options - Full width on mobile, normal on desktop */}
<div className="flex flex-1 min-w-0 gap-2">
<input
type="text"
className="flex-1 min-w-[120px] text-sm border p-2 rounded"
placeholder="Filter logs..."
value={filterRegex}
onChange={(e) => setFilterRegex(e.target.value)}
/>
<button className="btn" onClick={() => setFilterRegex("")}>
Clear
</button>
</div>
</div>
</div>
)}
</div>
</div>