Compare commits

3 Commits

Author       SHA1        Date                        Message
Benson Wong  574fdfabb4  2025-07-31 11:59:21 -07:00  UI improvements (#213)
    * use two column for logs view on wider screens
    * hide log controls when panel is minimized
Benson Wong  5172cb2e12  2025-07-30 11:51:14 -07:00  Update docs in Readme [skip ci]
Benson Wong  5672cb03fd  2025-07-30 11:29:03 -07:00  Update github actions for notifying homebrew build (#212)
    Combine homebrew-llama-swap event with the release action
4 changed files with 76 additions and 59 deletions
+33 -3
@@ -7,6 +7,10 @@ on:
   # Allows manual triggering of the workflow
   workflow_dispatch:
+    inputs:
+      tag:
+        description: 'Tag version to release (e.g. v144)'
+        required: true
 permissions:
   contents: write
@@ -20,15 +24,15 @@ jobs:
         uses: actions/checkout@v4
         with:
           fetch-depth: 0
+          ref: ${{ github.event.inputs.tag || github.ref }}
       -
         name: Set up Go
         uses: actions/setup-go@v5
       -
         name: Set up Node.js
         uses: actions/setup-node@v4
         with:
-          node-version: '23' # or your preferred version
+          node-version: '23'
       -
         name: Install dependencies and build UI
         run: |
@@ -46,4 +50,30 @@ jobs:
           version: '~> v2'
           args: release --clean
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+  trigger-tap-update:
+    runs-on: ubuntu-latest
+    needs: goreleaser
+    steps:
+      - name: "Resolve tag to dispatch"
+        id: tag
+        run: |
+          if [[ "${{ github.event_name }}" == "workflow_dispatch" ]]; then
+            echo "tag=${{ github.event.inputs.tag }}" >> "$GITHUB_OUTPUT"
+          else
+            echo "tag=${{ github.ref_name }}" >> "$GITHUB_OUTPUT"
+          fi
+      - name: "Trigger tap repository update"
+        uses: peter-evans/repository-dispatch@v2
+        with:
+          token: ${{ secrets.TAP_REPO_PAT }}
+          repository: mostlygeek/homebrew-llama-swap
+          event-type: new-release
+          client-payload: |
+            {
+              "release": {
+                "tag_name": "${{ steps.tag.outputs.tag }}"
+              }
+            }
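The branch logic of the new "Resolve tag to dispatch" step can be tried outside of GitHub Actions with a minimal shell sketch. The function name `resolve_tag` and its positional arguments are hypothetical stand-ins for `github.event_name`, `github.event.inputs.tag` and `github.ref_name`; it only mirrors the if/else and is not part of the workflow.

```shell
# Hypothetical helper mirroring the "Resolve tag to dispatch" step.
# $1: event name, $2: manually supplied tag input, $3: ref name of the run.
resolve_tag() {
  event_name="$1"
  input_tag="$2"
  ref_name="$3"
  if [ "$event_name" = "workflow_dispatch" ]; then
    # Manual run: use the tag typed into the workflow_dispatch form.
    printf '%s\n' "$input_tag"
  else
    # Any other trigger: fall back to the ref name of the run.
    printf '%s\n' "$ref_name"
  fi
}

resolve_tag workflow_dispatch v144 main
resolve_tag push "" v145
```

On a manual run the ref name is whatever branch or tag the workflow was dispatched from, which is presumably why the explicit `tag` input takes precedence in that case.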
@@ -1,24 +0,0 @@
-name: Trigger Homebrew Tap Update
-on:
-  release:
-    types: [published]
-  # Allows manual triggering of the workflow
-  workflow_dispatch:
-jobs:
-  trigger-tap-update:
-    runs-on: ubuntu-latest
-    steps:
-      - name: "Trigger tap repository update"
-        uses: peter-evans/repository-dispatch@v2
-        with:
-          token: ${{ secrets.TAP_REPO_PAT }}
-          repository: mostlygeek/homebrew-llama-swap
-          event-type: new-release
-          client-payload: |-
-            {
-              "release": {
-                "tag_name": "${{ github.event.release.tag_name }}"
-              }
-            }
+16 -7
@@ -27,6 +27,7 @@ Written in golang, it is very easy to install (single binary with no dependencie
 - `/upstream/:model_id` - direct access to upstream HTTP server ([demo](https://github.com/mostlygeek/llama-swap/pull/31))
 - `/unload` - manually unload running models ([#58](https://github.com/mostlygeek/llama-swap/issues/58))
 - `/running` - list currently running models ([#61](https://github.com/mostlygeek/llama-swap/issues/61))
+- `/health` - just returns "OK"
 - ✅ Run multiple models at once with `Groups` ([#107](https://github.com/mostlygeek/llama-swap/issues/107))
 - ✅ Automatic unloading of models after timeout by setting a `ttl`
 - ✅ Use any local OpenAI compatible server (llama.cpp, vllm, tabbyAPI, etc)
@@ -74,10 +75,18 @@ llama-swap ships with a real time web interface to monitor logs and status of mo
 <img width="1786" height="1334" alt="image" src="https://github.com/user-attachments/assets/d6258cb9-1dad-40db-828f-2be860aec8fe" />
-## Docker Install ([download images](https://github.com/mostlygeek/llama-swap/pkgs/container/llama-swap))
-Docker is the quickest way to try out llama-swap:
+## Installation
+llama-swap can be installed in multiple ways
+1. Docker
+2. Homebrew (OSX and Linux)
+3. From release binaries
+4. From source
+### Docker Install ([download images](https://github.com/mostlygeek/llama-swap/pkgs/container/llama-swap))
+Docker images with llama-swap and llama-server are built nightly.
 ```shell
 # use CPU inference comes with the example config above
@@ -99,7 +108,7 @@ $ curl -s http://localhost:9292/v1/chat/completions \
 ```
 <details>
-<summary>Docker images are built nightly for cuda, intel, vulcan, etc ...</summary>
+<summary>Docker images are built nightly with llama-server for cuda, intel, vulcan and musa.</summary>
 They include:
@@ -122,9 +131,9 @@ $ docker run -it --rm --runtime nvidia -p 9292:8080 \
 </details>
-## Homebrew Install (macOS/Linux)
+### Homebrew Install (macOS/Linux)
-For macOS & Linux users, `llama-swap` can be installed via [Homebrew](https://brew.sh):
+The latest release of `llama-swap` can be installed via [Homebrew](https://brew.sh).
 ```shell
 # Set up tap and install formula
@@ -136,9 +145,9 @@ llama-swap --config path/to/config.yaml --listen localhost:8080
 This will install the `llama-swap` binary and make it available in your path. See the [configuration documentation](https://github.com/mostlygeek/llama-swap/wiki/Configuration)
-## Bare metal Install ([download](https://github.com/mostlygeek/llama-swap/releases))
+### Pre-built Binaries ([download](https://github.com/mostlygeek/llama-swap/releases))
-Pre-built binaries are available for Linux, Mac, Windows and FreeBSD. These are automatically published and are likely a few hours ahead of the docker releases. The baremetal install works with any OpenAI compatible server, not just llama-server.
+Binaries are available for Linux, Mac, Windows and FreeBSD. These are automatically published and are likely a few hours ahead of the docker releases. The binary install works with any OpenAI compatible server, not just llama-server.
 1. Download a [release](https://github.com/mostlygeek/llama-swap/releases) appropriate for your OS and architecture.
 1. Create a configuration file, see the [configuration documentation](https://github.com/mostlygeek/llama-swap/wiki/Configuration).
+27 -25
@@ -6,7 +6,7 @@ const LogViewer = () => {
   const { proxyLogs, upstreamLogs } = useAPI();
   return (
-    <div className="flex flex-col gap-5" style={{ height: "calc(100vh - 125px)" }}>
+    <div className="flex flex-col lg:flex-row gap-5" style={{ height: "calc(100vh - 125px)" }}>
       <LogPanel id="proxy" title="Proxy Logs" logData={proxyLogs} />
       <LogPanel id="upstream" title="Upstream Logs" logData={upstreamLogs} />
     </div>
@@ -90,34 +90,36 @@ export const LogPanel = ({ id, title, logData, className }: LogPanelProps) => {
         <div className="flex flex-col md:flex-row md:items-center md:justify-between gap-4">
           {/* Title - Always full width on mobile, normal on desktop */}
           <div className="w-full md:w-auto" onClick={() => setIsCollapsed(!isCollapsed)}>
-            <h3 className="m-0 text-lg">{title}</h3>
+            <h3 className="m-0 text-lg p-0">{title}</h3>
           </div>
-          <div className="flex flex-col sm:flex-row gap-4 w-full md:w-auto">
-            {/* Sizing Buttons - Stacks vertically on mobile */}
-            <div className="flex flex-wrap gap-2">
-              <button className="btn" onClick={toggleFontSize}>
-                font: {fontSize}
-              </button>
-              <button className="btn" onClick={() => setTextWrap((prev) => !prev)}>
-                {wrapText ? "wrap" : "wrap off"}
-              </button>
-            </div>
-            {/* Filtering Options - Full width on mobile, normal on desktop */}
-            <div className="flex flex-1 min-w-0 gap-2">
-              <input
-                type="text"
-                className="flex-1 min-w-[120px] text-sm border p-2 rounded"
-                placeholder="Filter logs..."
-                value={filterRegex}
-                onChange={(e) => setFilterRegex(e.target.value)}
-              />
-              <button className="btn" onClick={() => setFilterRegex("")}>
-                Clear
-              </button>
-            </div>
-          </div>
+          {!isCollapsed && (
+            <div className="flex flex-col sm:flex-row gap-4 w-full md:w-auto">
+              {/* Sizing Buttons - Stacks vertically on mobile */}
+              <div className="flex flex-wrap gap-2">
+                <button className="btn" onClick={toggleFontSize}>
+                  font: {fontSize}
+                </button>
+                <button className="btn" onClick={() => setTextWrap((prev) => !prev)}>
+                  {wrapText ? "wrap" : "wrap off"}
+                </button>
+              </div>
+              {/* Filtering Options - Full width on mobile, normal on desktop */}
+              <div className="flex flex-1 min-w-0 gap-2">
+                <input
+                  type="text"
+                  className="flex-1 min-w-[120px] text-sm border p-2 rounded"
+                  placeholder="Filter logs..."
+                  value={filterRegex}
+                  onChange={(e) => setFilterRegex(e.target.value)}
+                />
+                <button className="btn" onClick={() => setFilterRegex("")}>
+                  Clear
+                </button>
+              </div>
+            </div>
+          )}
         </div>
       </div>