Compare commits

...

12 Commits

Author SHA1 Message Date
Benson Wong 558a72de17 UI Improvements (#219)
- use react-resizable-panels for UI
- improve icons for buttons
- improve mobile layout with drag/resize panels
2025-08-03 17:49:13 -07:00
Leoyzen dc42cf366d Add config monitor support for k8s configmap. (#217) 2025-08-03 08:05:48 -07:00
Ryein Goddard ba0a81937a Update README.md (#216)
Update git clone protocol to https
2025-08-01 19:48:09 -07:00
Benson Wong 574fdfabb4 UI improvements (#213)
* use two column for logs view on wider screens

* hide log controls when panel is minimized
2025-07-31 11:59:21 -07:00
Benson Wong 5172cb2e12 Update docs in Readme [skip ci] 2025-07-30 11:51:14 -07:00
Benson Wong 5672cb03fd Update github actions for notifying homebrew build (#212)
Combine homebrew-llama-swap event with the release action
2025-07-30 11:29:03 -07:00
Benson Wong 0f583163f7 add /health (#211) 2025-07-30 10:37:10 -07:00
Benson Wong 7905fa9ea3 Update trigger-homebrew-update.yml [skip ci] 2025-07-30 10:13:49 -07:00
Ian Sebastian Mathew bbaf172956 add trigger to rebuild homebrew formula (#210) 2025-07-30 10:12:21 -07:00
Benson Wong fd50932dbc Decouple MetricsMiddleware from downstream handlers (#206)
* Decouple MetricsMiddleware from downstream handlers

Remove ls-real-model-name optimization. Within proxyOAIHandler the
request body's bytes are required for various rewriting features
anyways. This negated any benefits from trying not to parse it twice.
2025-07-27 10:36:06 -07:00
Gaël James 8c693e7fcf Add endpoint aliases for reranking models (#201)
* Add endpoint aliases for reranking models
* Add MetricsMiddleware to the previous reranking endpoint
* Fix the embeddings endpoint not having model set
2025-07-24 08:32:47 -07:00
Benson Wong 8f2af26a41 fix stats on model page 2025-07-23 13:57:33 -07:00
12 changed files with 369 additions and 157 deletions
+33 -3
View File
@@ -7,6 +7,10 @@ on:
# Allows manual triggering of the workflow
workflow_dispatch:
inputs:
tag:
description: 'Tag version to release (e.g. v144)'
required: true
permissions:
contents: write
@@ -20,15 +24,15 @@ jobs:
uses: actions/checkout@v4
with:
fetch-depth: 0
ref: ${{ github.event.inputs.tag || github.ref }}
-
name: Set up Go
uses: actions/setup-go@v5
-
name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: '23' # or your preferred version
node-version: '23'
-
name: Install dependencies and build UI
run: |
@@ -46,4 +50,30 @@ jobs:
version: '~> v2'
args: release --clean
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
trigger-tap-update:
runs-on: ubuntu-latest
needs: goreleaser
steps:
- name: "Resolve tag to dispatch"
id: tag
run: |
if [[ "${{ github.event_name }}" == "workflow_dispatch" ]]; then
echo "tag=${{ github.event.inputs.tag }}" >> "$GITHUB_OUTPUT"
else
echo "tag=${{ github.ref_name }}" >> "$GITHUB_OUTPUT"
fi
- name: "Trigger tap repository update"
uses: peter-evans/repository-dispatch@v2
with:
token: ${{ secrets.TAP_REPO_PAT }}
repository: mostlygeek/homebrew-llama-swap
event-type: new-release
client-payload: |
{
"release": {
"tag_name": "${{ steps.tag.outputs.tag }}"
}
}
+30 -7
View File
@@ -18,7 +18,7 @@ Written in golang, it is very easy to install (single binary with no dependencie
- `v1/completions`
- `v1/chat/completions`
- `v1/embeddings`
- `v1/rerank`
- `v1/rerank`, `v1/reranking`, `rerank`
- `v1/audio/speech` ([#36](https://github.com/mostlygeek/llama-swap/issues/36))
- `v1/audio/transcriptions` ([docs](https://github.com/mostlygeek/llama-swap/issues/41#issuecomment-2722637867))
- ✅ llama-swap custom API endpoints
@@ -27,6 +27,7 @@ Written in golang, it is very easy to install (single binary with no dependencie
- `/upstream/:model_id` - direct access to upstream HTTP server ([demo](https://github.com/mostlygeek/llama-swap/pull/31))
- `/unload` - manually unload running models ([#58](https://github.com/mostlygeek/llama-swap/issues/58))
- `/running` - list currently running models ([#61](https://github.com/mostlygeek/llama-swap/issues/61))
- `/health` - just returns "OK"
- ✅ Run multiple models at once with `Groups` ([#107](https://github.com/mostlygeek/llama-swap/issues/107))
- ✅ Automatic unloading of models after timeout by setting a `ttl`
- ✅ Use any local OpenAI compatible server (llama.cpp, vllm, tabbyAPI, etc)
@@ -74,10 +75,18 @@ llama-swap ships with a real time web interface to monitor logs and status of mo
<img width="1786" height="1334" alt="image" src="https://github.com/user-attachments/assets/d6258cb9-1dad-40db-828f-2be860aec8fe" />
## Installation
## Docker Install ([download images](https://github.com/mostlygeek/llama-swap/pkgs/container/llama-swap))
llama-swap can be installed in multiple ways
Docker is the quickest way to try out llama-swap:
1. Docker
2. Homebrew (OSX and Linux)
3. From release binaries
4. From source
### Docker Install ([download images](https://github.com/mostlygeek/llama-swap/pkgs/container/llama-swap))
Docker images with llama-swap and llama-server are built nightly.
```shell
# use CPU inference comes with the example config above
@@ -99,7 +108,7 @@ $ curl -s http://localhost:9292/v1/chat/completions \
```
<details>
<summary>Docker images are built nightly for cuda, intel, vulcan, etc ...</summary>
<summary>Docker images are built nightly with llama-server for cuda, intel, vulcan and musa.</summary>
They include:
@@ -122,9 +131,23 @@ $ docker run -it --rm --runtime nvidia -p 9292:8080 \
</details>
## Bare metal Install ([download](https://github.com/mostlygeek/llama-swap/releases))
### Homebrew Install (macOS/Linux)
Pre-built binaries are available for Linux, Mac, Windows and FreeBSD. These are automatically published and are likely a few hours ahead of the docker releases. The baremetal install works with any OpenAI compatible server, not just llama-server.
The latest release of `llama-swap` can be installed via [Homebrew](https://brew.sh).
```shell
# Set up tap and install formula
brew tap mostlygeek/llama-swap
brew install llama-swap
# Run llama-swap
llama-swap --config path/to/config.yaml --listen localhost:8080
```
This will install the `llama-swap` binary and make it available in your path. See the [configuration documentation](https://github.com/mostlygeek/llama-swap/wiki/Configuration)
### Pre-built Binaries ([download](https://github.com/mostlygeek/llama-swap/releases))
Binaries are available for Linux, Mac, Windows and FreeBSD. These are automatically published and are likely a few hours ahead of the docker releases. The binary install works with any OpenAI compatible server, not just llama-server.
1. Download a [release](https://github.com/mostlygeek/llama-swap/releases) appropriate for your OS and architecture.
1. Create a configuration file, see the [configuration documentation](https://github.com/mostlygeek/llama-swap/wiki/Configuration).
@@ -138,7 +161,7 @@ Pre-built binaries are available for Linux, Mac, Windows and FreeBSD. These are
### Building from source
1. Build requires golang and nodejs for the user interface.
1. `git clone git@github.com:mostlygeek/llama-swap.git`
1. `git clone https://github.com/mostlygeek/llama-swap.git`
1. `make clean all`
1. Binaries will be in `build/` subdirectory
+5
View File
@@ -132,6 +132,11 @@ func main() {
event.Emit(proxy.ConfigFileChangedEvent{
ReloadingState: proxy.ReloadingStateStart,
})
} else if changeEvent.Name == filepath.Join(configDir, "..data") && changeEvent.Has(fsnotify.Create) {
// the change for k8s configmap
event.Emit(proxy.ConfigFileChangedEvent{
ReloadingState: proxy.ReloadingStateStart,
})
}
case err := <-watcher.Errors:
+3 -1
View File
@@ -17,6 +17,7 @@ func MetricsMiddleware(pm *ProxyManager) gin.HandlerFunc {
bodyBytes, err := io.ReadAll(c.Request.Body)
if err != nil {
pm.sendErrorResponse(c, http.StatusBadRequest, "could not ready request body")
c.Abort()
return
}
c.Request.Body = io.NopCloser(bytes.NewBuffer(bodyBytes))
@@ -24,15 +25,16 @@ func MetricsMiddleware(pm *ProxyManager) gin.HandlerFunc {
requestedModel := gjson.GetBytes(bodyBytes, "model").String()
if requestedModel == "" {
pm.sendErrorResponse(c, http.StatusBadRequest, "missing or invalid 'model' key")
c.Abort()
return
}
realModelName, found := pm.config.RealModelName(requestedModel)
if !found {
pm.sendErrorResponse(c, http.StatusBadRequest, fmt.Sprintf("could not find real modelID for %s", requestedModel))
c.Abort()
return
}
c.Set("ls-real-model-name", realModelName)
writer := &MetricsResponseWriter{
ResponseWriter: c.Writer,
+17 -5
View File
@@ -14,6 +14,7 @@ import (
"time"
"github.com/gin-gonic/gin"
"github.com/tidwall/gjson"
"github.com/tidwall/sjson"
)
@@ -160,8 +161,10 @@ func (pm *ProxyManager) setupGinEngine() {
pm.ginEngine.POST("/v1/completions", mm, pm.proxyOAIHandler)
// Support embeddings
pm.ginEngine.POST("/v1/embeddings", pm.proxyOAIHandler)
pm.ginEngine.POST("/v1/rerank", pm.proxyOAIHandler)
pm.ginEngine.POST("/v1/embeddings", mm, pm.proxyOAIHandler)
pm.ginEngine.POST("/v1/rerank", mm, pm.proxyOAIHandler)
pm.ginEngine.POST("/v1/reranking", mm, pm.proxyOAIHandler)
pm.ginEngine.POST("/rerank", mm, pm.proxyOAIHandler)
// Support audio/speech endpoint
pm.ginEngine.POST("/v1/audio/speech", pm.proxyOAIHandler)
@@ -188,6 +191,9 @@ func (pm *ProxyManager) setupGinEngine() {
pm.ginEngine.GET("/unload", pm.unloadAllModelsHandler)
pm.ginEngine.GET("/running", pm.listRunningProcessesHandler)
pm.ginEngine.GET("/health", func(c *gin.Context) {
c.String(http.StatusOK, "OK")
})
pm.ginEngine.GET("/favicon.ico", func(c *gin.Context) {
if data, err := reactStaticFS.ReadFile("ui_dist/favicon.ico"); err == nil {
@@ -365,9 +371,15 @@ func (pm *ProxyManager) proxyOAIHandler(c *gin.Context) {
return
}
realModelName := c.GetString("ls-real-model-name") // Should be set in MetricsMiddleware
if realModelName == "" {
pm.sendErrorResponse(c, http.StatusInternalServerError, "ls-real-model-name not set")
requestedModel := gjson.GetBytes(bodyBytes, "model").String()
if requestedModel == "" {
pm.sendErrorResponse(c, http.StatusBadRequest, "missing or invalid 'model' key")
return
}
realModelName, found := pm.config.RealModelName(requestedModel)
if !found {
pm.sendErrorResponse(c, http.StatusBadRequest, fmt.Sprintf("could not find real modelID for %s", requestedModel))
return
}
+18
View File
@@ -755,3 +755,21 @@ func TestProxyManager_MiddlewareWritesMetrics_Streaming(t *testing.T) {
assert.Greater(t, lastMetric.TokensPerSecond, 0.0, "tokens per second should be greater than 0")
assert.Greater(t, lastMetric.DurationMs, 0, "duration should be greater than 0")
}
func TestProxyManager_HealthEndpoint(t *testing.T) {
config := AddDefaultGroupToConfig(Config{
HealthCheckTimeout: 15,
Models: map[string]ModelConfig{
"model1": getTestSimpleResponderConfig("model1"),
},
LogLevel: "error",
})
proxy := New(config)
defer proxy.StopProcesses(StopWaitForInflightRequest)
req := httptest.NewRequest("GET", "/health", nil)
rec := httptest.NewRecorder()
proxy.ServeHTTP(rec, req)
assert.Equal(t, http.StatusOK, rec.Code)
assert.Equal(t, "OK", rec.Body.String())
}
+21
View File
@@ -12,6 +12,8 @@
"@tanstack/react-query": "^5.80.6",
"react": "^19.1.0",
"react-dom": "^19.1.0",
"react-icons": "^5.5.0",
"react-resizable-panels": "^3.0.4",
"react-router-dom": "^7.6.2",
"tailwindcss": "^4.1.8"
},
@@ -3460,6 +3462,15 @@
"react": "^19.1.0"
}
},
"node_modules/react-icons": {
"version": "5.5.0",
"resolved": "https://registry.npmjs.org/react-icons/-/react-icons-5.5.0.tgz",
"integrity": "sha512-MEFcXdkP3dLo8uumGI5xN3lDFNsRtrjbOEKDLD7yv76v4wpnEq2Lt2qeHaQOr34I/wPN3s3+N08WkQ+CW37Xiw==",
"license": "MIT",
"peerDependencies": {
"react": "*"
}
},
"node_modules/react-refresh": {
"version": "0.17.0",
"resolved": "https://registry.npmjs.org/react-refresh/-/react-refresh-0.17.0.tgz",
@@ -3470,6 +3481,16 @@
"node": ">=0.10.0"
}
},
"node_modules/react-resizable-panels": {
"version": "3.0.4",
"resolved": "https://registry.npmjs.org/react-resizable-panels/-/react-resizable-panels-3.0.4.tgz",
"integrity": "sha512-8Y4KNgV94XhUvI2LeByyPIjoUJb71M/0hyhtzkHaqpVHs+ZQs8b627HmzyhmVYi3C9YP6R+XD1KmG7hHjEZXFQ==",
"license": "MIT",
"peerDependencies": {
"react": "^16.14.0 || ^17.0.0 || ^18.0.0 || ^19.0.0 || ^19.0.0-rc",
"react-dom": "^16.14.0 || ^17.0.0 || ^18.0.0 || ^19.0.0 || ^19.0.0-rc"
}
},
"node_modules/react-router": {
"version": "7.6.2",
"resolved": "https://registry.npmjs.org/react-router/-/react-router-7.6.2.tgz",
+3 -1
View File
@@ -14,6 +14,8 @@
"@tanstack/react-query": "^5.80.6",
"react": "^19.1.0",
"react-dom": "^19.1.0",
"react-icons": "^5.5.0",
"react-resizable-panels": "^3.0.4",
"react-router-dom": "^7.6.2",
"tailwindcss": "^4.1.8"
},
@@ -30,4 +32,4 @@
"typescript-eslint": "^8.30.1",
"vite": "^6.3.5"
}
}
}
+8 -6
View File
@@ -4,16 +4,18 @@ import { APIProvider } from "./contexts/APIProvider";
import LogViewerPage from "./pages/LogViewer";
import ModelPage from "./pages/Models";
import ActivityPage from "./pages/Activity";
import { RiSunFill, RiMoonFill } from "react-icons/ri";
function App() {
const theme = useTheme();
const { isNarrow, toggleTheme, isDarkMode } = useTheme();
return (
<Router basename="/ui/">
<APIProvider>
<div>
<div className="flex flex-col h-screen">
<nav className="bg-surface border-b border-border p-2 h-[75px]">
<div className="flex items-center justify-between mx-auto px-4 h-full">
<h1 className="flex items-center p-0">llama-swap</h1>
{!isNarrow && <h1 className="flex items-center p-0">llama-swap</h1>}
<div className="flex items-center space-x-4">
<NavLink to="/" className={({ isActive }) => (isActive ? "navlink active" : "navlink")}>
Logs
@@ -26,14 +28,14 @@ function App() {
<NavLink to="/activity" className={({ isActive }) => (isActive ? "navlink active" : "navlink")}>
Activity
</NavLink>
<button className="btn btn--sm" onClick={theme.toggleTheme}>
{theme.isDarkMode ? "🌙" : "☀️"}
<button className="" onClick={toggleTheme}>
{isDarkMode ? <RiMoonFill /> : <RiSunFill />}
</button>
</div>
</div>
</nav>
<main className="mx-auto py-4 px-4">
<main className="flex-1 overflow-auto p-4">
<Routes>
<Route path="/" element={<LogViewerPage />} />
<Route path="/models" element={<ModelPage />} />
+37 -2
View File
@@ -1,8 +1,11 @@
import { createContext, useContext, useEffect, type ReactNode } from "react";
import { createContext, useContext, useEffect, type ReactNode, useMemo, useState } from "react";
import { usePersistentState } from "../hooks/usePersistentState";
type ScreenWidth = "xs" | "sm" | "md" | "lg" | "xl" | "2xl";
type ThemeContextType = {
isDarkMode: boolean;
screenWidth: ScreenWidth;
isNarrow: boolean;
toggleTheme: () => void;
};
@@ -14,14 +17,46 @@ type ThemeProviderProps = {
export function ThemeProvider({ children }: ThemeProviderProps) {
const [isDarkMode, setIsDarkMode] = usePersistentState<boolean>("theme", false);
const [screenWidth, setScreenWidth] = useState<ScreenWidth>("md"); // Default to md
// matches tailwind classes
// https://tailwindcss.com/docs/responsive-design
useEffect(() => {
const checkInnerWidth = () => {
const innerWidth = window.innerWidth;
if (innerWidth < 640) {
setScreenWidth("xs");
} else if (innerWidth < 768) {
setScreenWidth("sm");
} else if (innerWidth < 1024) {
setScreenWidth("md");
} else if (innerWidth < 1280) {
setScreenWidth("lg");
} else if (innerWidth < 1536) {
setScreenWidth("xl");
} else {
setScreenWidth("2xl");
}
};
checkInnerWidth();
window.addEventListener("resize", checkInnerWidth);
return () => window.removeEventListener("resize", checkInnerWidth);
}, []);
useEffect(() => {
document.documentElement.setAttribute("data-theme", isDarkMode ? "dark" : "light");
}, [isDarkMode]);
const toggleTheme = () => setIsDarkMode((prev) => !prev);
const isNarrow = useMemo(() => {
return screenWidth === "xs" || screenWidth === "sm" || screenWidth === "md";
}, [screenWidth]);
return <ThemeContext.Provider value={{ isDarkMode, toggleTheme }}>{children}</ThemeContext.Provider>;
return (
<ThemeContext.Provider value={{ isDarkMode, toggleTheme, screenWidth, isNarrow }}>{children}</ThemeContext.Provider>
);
}
export function useTheme(): ThemeContextType {
+70 -45
View File
@@ -1,15 +1,38 @@
import { useState, useEffect, useRef, useMemo, useCallback } from "react";
import { useAPI } from "../contexts/APIProvider";
import { usePersistentState } from "../hooks/usePersistentState";
import { Panel, PanelGroup, PanelResizeHandle } from "react-resizable-panels";
import {
RiTextWrap,
RiAlignJustify,
RiFontSize,
RiMenuSearchLine,
RiMenuSearchFill,
RiCloseCircleFill,
} from "react-icons/ri";
import { useTheme } from "../contexts/ThemeProvider";
const LogViewer = () => {
const { proxyLogs, upstreamLogs } = useAPI();
const { isNarrow } = useTheme();
const direction = isNarrow ? "vertical" : "horizontal";
return (
<div className="flex flex-col gap-5" style={{ height: "calc(100vh - 125px)" }}>
<LogPanel id="proxy" title="Proxy Logs" logData={proxyLogs} />
<LogPanel id="upstream" title="Upstream Logs" logData={upstreamLogs} />
</div>
<PanelGroup direction={direction} className="gap-2" autoSaveId={`logviewer-panel-group-${direction}`}>
<Panel id="proxy" defaultSize={50} minSize={5} maxSize={100} collapsible={true}>
<LogPanel id="proxy" title="Proxy Logs" logData={proxyLogs} />
</Panel>
<PanelResizeHandle
className={
direction === "horizontal"
? "w-2 h-full bg-primary hover:bg-success transition-colors rounded"
: "w-full h-2 bg-primary hover:bg-success transition-colors rounded"
}
/>
<Panel id="upstream" defaultSize={50} minSize={5} maxSize={100} collapsible={true}>
<LogPanel id="upstream" title="Upstream Logs" logData={upstreamLogs} />
</Panel>
</PanelGroup>
);
};
@@ -17,17 +40,15 @@ interface LogPanelProps {
id: string;
title: string;
logData: string;
className?: string;
}
export const LogPanel = ({ id, title, logData, className }: LogPanelProps) => {
const [isCollapsed, setIsCollapsed] = usePersistentState(`logPanel-${id}-isCollapsed`, false);
export const LogPanel = ({ id, title, logData }: LogPanelProps) => {
const [filterRegex, setFilterRegex] = useState("");
const [fontSize, setFontSize] = usePersistentState<"xxs" | "xs" | "small" | "normal">(
`logPanel-${id}-fontSize`,
"normal"
);
const [wrapText, setTextWrap] = usePersistentState(`logPanel-${id}-wrapText`, false);
const [showFilter, setShowFilter] = usePersistentState(`logPanel-${id}-showFilter`, false);
const textWrapClass = useMemo(() => {
return wrapText ? "whitespace-pre-wrap" : "whitespace-pre";
@@ -48,6 +69,19 @@ export const LogPanel = ({ id, title, logData, className }: LogPanelProps) => {
});
}, []);
const toggleWrapText = useCallback(() => {
setTextWrap((prev) => !prev);
}, []);
const toggleFilter = useCallback(() => {
if (showFilter) {
setShowFilter(false);
setFilterRegex(""); // Clear filter when closing
} else {
setShowFilter(true);
}
}, [filterRegex, setFilterRegex, showFilter]);
const fontSizeClass = useMemo(() => {
switch (fontSize) {
case "xxs":
@@ -81,56 +115,47 @@ export const LogPanel = ({ id, title, logData, className }: LogPanelProps) => {
}, [filteredLogs]);
return (
<div
className={`bg-surface border border-border rounded-lg overflow-hidden flex flex-col ${
!isCollapsed && "h-full"
} ${className || ""}`}
>
<div className="bg-surface border border-border rounded-lg overflow-hidden flex flex-col h-full">
<div className="p-4 border-b border-border bg-secondary">
<div className="flex flex-col md:flex-row md:items-center md:justify-between gap-4">
{/* Title - Always full width on mobile, normal on desktop */}
<div className="w-full md:w-auto" onClick={() => setIsCollapsed(!isCollapsed)}>
<h3 className="m-0 text-lg">{title}</h3>
<div className="flex items-center justify-between">
<h3 className="m-0 text-lg p-0">{title}</h3>
<div className="flex gap-2 items-center">
<button className="btn" onClick={toggleFontSize}>
<RiFontSize />
</button>
<button className="btn" onClick={toggleWrapText}>
{wrapText ? <RiTextWrap /> : <RiAlignJustify />}
</button>
<button className="btn" onClick={toggleFilter}>
{showFilter ? <RiMenuSearchFill /> : <RiMenuSearchLine />}
</button>
</div>
</div>
<div className="flex flex-col sm:flex-row gap-4 w-full md:w-auto">
{/* Sizing Buttons - Stacks vertically on mobile */}
<div className="flex flex-wrap gap-2">
<button className="btn" onClick={toggleFontSize}>
font: {fontSize}
</button>
<button className="btn" onClick={() => setTextWrap((prev) => !prev)}>
{wrapText ? "wrap" : "wrap off"}
</button>
</div>
{/* Filtering Options - Full width on mobile, normal on desktop */}
<div className="flex flex-1 min-w-0 gap-2">
{/* Filtering Options - Full width on mobile, normal on desktop */}
{showFilter && (
<div className="mt-2 w-full">
<div className="flex gap-2 items-center w-full">
<input
type="text"
className="flex-1 min-w-[120px] text-sm border p-2 rounded"
className="w-full text-sm border p-2 rounded"
placeholder="Filter logs..."
value={filterRegex}
onChange={(e) => setFilterRegex(e.target.value)}
/>
<button className="btn" onClick={() => setFilterRegex("")}>
Clear
<button className="pl-2" onClick={() => setFilterRegex("")}>
<RiCloseCircleFill size="24" />
</button>
</div>
</div>
</div>
)}
</div>
<div className="bg-background font-mono text-sm flex-1 overflow-hidden">
<pre ref={preTagRef} className={`${textWrapClass} ${fontSizeClass} h-full overflow-auto p-4`}>
{filteredLogs}
</pre>
</div>
{!isCollapsed && (
<div className="flex-1 bg-background font-mono text-sm p-3 overflow-hidden">
<pre
ref={preTagRef}
className={`h-full p-4 overflow-y-auto whitespace-pre min-h-0 ${textWrapClass} ${fontSizeClass}`}
>
{filteredLogs}
</pre>
</div>
)}
</div>
);
};
+124 -87
View File
@@ -2,9 +2,42 @@ import { useState, useCallback, useMemo } from "react";
import { useAPI } from "../contexts/APIProvider";
import { LogPanel } from "./LogViewer";
import { usePersistentState } from "../hooks/usePersistentState";
import { Panel, PanelGroup, PanelResizeHandle } from "react-resizable-panels";
import { useTheme } from "../contexts/ThemeProvider";
import { RiEyeFill, RiEyeOffFill, RiStopCircleLine } from "react-icons/ri";
export default function ModelsPage() {
const { models, unloadAllModels, loadModel, upstreamLogs, metrics } = useAPI();
const { isNarrow } = useTheme();
const direction = isNarrow ? "vertical" : "horizontal";
const { upstreamLogs } = useAPI();
return (
<PanelGroup direction={direction} className="gap-2" autoSaveId={`models-panel-group-${direction}`}>
<Panel id="models" defaultSize={50} minSize={isNarrow ? 0 : 25} maxSize={100} collapsible={isNarrow}>
<ModelsPanel />
</Panel>
<PanelResizeHandle
className={
direction === "horizontal"
? "w-2 h-full bg-primary hover:bg-success transition-colors rounded"
: "w-full h-2 bg-primary hover:bg-success transition-colors rounded"
}
/>
<Panel collapsible={true} defaultSize={50} minSize={0}>
<div className="flex flex-col h-full space-y-4">
{direction === "horizontal" && <StatsPanel />}
<div className="flex-1 min-h-0">
<LogPanel id="modelsupstream" title="Upstream Logs" logData={upstreamLogs} />
</div>
</div>
</Panel>
</PanelGroup>
);
}
function ModelsPanel() {
const { models, loadModel, unloadAllModels } = useAPI();
const [isUnloading, setIsUnloading] = useState(false);
const [showUnlisted, setShowUnlisted] = usePersistentState("showUnlisted", true);
@@ -19,101 +52,105 @@ export default function ModelsPage() {
} catch (e) {
console.error(e);
} finally {
// at least give it a second to show the unloading message
setTimeout(() => {
setIsUnloading(false);
}, 1000);
}
}, []);
const [totalRequests, totalTokens, avgTokensPerSecond] = useMemo(() => {
const totalTokens = metrics.reduce((sum, m) => sum + m.input_tokens + m.output_tokens, 0);
const totalSeconds = metrics.reduce((sum, m) => sum + m.duration_ms / 1000, 0);
const avgTokensPerSecond = totalSeconds > 0 ? totalTokens / totalSeconds : 0;
return [metrics.length, totalTokens, avgTokensPerSecond.toFixed(2)];
}, [metrics]);
}, [unloadAllModels]);
return (
<div>
<div className="flex flex-col md:flex-row gap-4">
{/* Left Column */}
<div className="w-full md:w-1/2 flex items-top">
<div className="card w-full">
<h2 className="">Models</h2>
<div className="flex justify-between">
<button className="btn" onClick={() => setShowUnlisted(!showUnlisted)} style={{ lineHeight: "1.2" }}>
{showUnlisted ? "🟢 unlisted" : "⚫️ unlisted"}
</button>
<button className="btn" onClick={handleUnloadAllModels} disabled={isUnloading}>
{isUnloading ? "Stopping ..." : "Stop All"}
</button>
</div>
<table className="w-full mt-4">
<thead>
<tr className="border-b border-primary">
<th className="text-left p-2">Name</th>
<th className="text-left p-2"></th>
<th className="text-left p-2">State</th>
</tr>
</thead>
<tbody>
{filteredModels.map((model) => (
<tr key={model.id} className="border-b hover:bg-secondary-hover border-border">
<td className="p-2">
<a href={`/upstream/${model.id}/`} className="underline" target="_blank">
{model.name !== "" ? model.name : model.id}
</a>
{model.description != "" && (
<p>
<em>{model.description}</em>
</p>
)}
</td>
<td className="p-2 w-[50px]">
<button
className="btn btn--sm"
disabled={model.state !== "stopped"}
onClick={() => loadModel(model.id)}
>
Load
</button>
</td>
<td className="p-2 w-[75px]">
<span className={`status status--${model.state}`}>{model.state}</span>
</td>
</tr>
))}
</tbody>
</table>
</div>
<div className="card h-full flex flex-col">
<div className="shrink-0">
<h2>Models</h2>
<div className="flex justify-between">
<button
className="btn flex items-center gap-2"
onClick={() => setShowUnlisted(!showUnlisted)}
style={{ lineHeight: "1.2" }}
>
{showUnlisted ? <RiEyeFill /> : <RiEyeOffFill />} unlisted
</button>
<button className="btn flex items-center gap-2" onClick={handleUnloadAllModels} disabled={isUnloading}>
<RiStopCircleLine size="24" /> {isUnloading ? "Unloading..." : "Unload"}
</button>
</div>
</div>
{/* Right Column */}
<div className="w-full md:w-1/2 flex flex-col" style={{ height: "calc(100vh - 125px)" }}>
<div className="card mb-4 min-h-[225px]">
<h2>Chat Activity</h2>
<table className="w-full border border-gray-200">
<tbody>
<tr className="border-b border-gray-200">
<td className="py-2 px-4 font-medium border-r border-gray-200">Requests</td>
<td className="py-2 px-4 text-right">{totalRequests}</td>
</tr>
<tr className="border-b border-gray-200">
<td className="py-2 px-4 font-medium border-r border-gray-200">Total Tokens Generated</td>
<td className="py-2 px-4 text-right">{totalTokens}</td>
</tr>
<tr>
<td className="py-2 px-4 font-medium border-r border-gray-200">Average Tokens/Second</td>
<td className="py-2 px-4 text-right">{avgTokensPerSecond}</td>
</tr>
</tbody>
</table>
</div>
<LogPanel id="modelsupstream" title="Upstream Logs" logData={upstreamLogs} />
</div>
<div className="flex-1 overflow-y-auto">
<table className="w-full">
<thead className="sticky top-0 bg-card z-10">
<tr className="border-b border-primary bg-surface">
<th className="text-left p-2">Name</th>
<th className="text-left p-2"></th>
<th className="text-left p-2">State</th>
</tr>
</thead>
<tbody>
{filteredModels.map((model) => (
<tr key={model.id} className="border-b hover:bg-secondary-hover border-border">
<td className={`p-2 ${model.unlisted ? "text-txtsecondary" : ""}`}>
<a href={`/upstream/${model.id}/`} className={`underline`} target="_blank">
{model.name !== "" ? model.name : model.id}
</a>
{model.description !== "" && (
<p className={model.unlisted ? "text-opacity-70" : ""}>
<em>{model.description}</em>
</p>
)}
</td>
<td className="p-2 w-[50px]">
<button
className="btn btn--sm"
disabled={model.state !== "stopped"}
onClick={() => loadModel(model.id)}
>
Load
</button>
</td>
<td className="p-2 w-[75px]">
<span className={`status status--${model.state}`}>{model.state}</span>
</td>
</tr>
))}
</tbody>
</table>
</div>
</div>
);
}
function StatsPanel() {
const { metrics } = useAPI();
const [totalRequests, totalTokens, avgTokensPerSecond] = useMemo(() => {
const totalRequests = metrics.length;
if (totalRequests === 0) {
return [0, 0, 0];
}
const totalTokens = metrics.reduce((sum, m) => sum + m.output_tokens, 0);
const avgTokensPerSecond = (metrics.reduce((sum, m) => sum + m.tokens_per_second, 0) / totalRequests).toFixed(2);
return [totalRequests, totalTokens, avgTokensPerSecond];
}, [metrics]);
return (
<div className="card">
<h2>Chat Activity</h2>
<table className="w-full border border-gray-200">
<tbody>
<tr className="border-b border-gray-200">
<td className="py-2 px-4 font-medium border-r border-gray-200">Requests</td>
<td className="py-2 px-4 text-right">{totalRequests}</td>
</tr>
<tr className="border-b border-gray-200">
<td className="py-2 px-4 font-medium border-r border-gray-200">Total Tokens Generated</td>
<td className="py-2 px-4 text-right">{totalTokens}</td>
</tr>
<tr>
<td className="py-2 px-4 font-medium border-r border-gray-200">Average Tokens/Second</td>
<td className="py-2 px-4 text-right">{avgTokensPerSecond}</td>
</tr>
</tbody>
</table>
</div>
);
}