proxy,ui: add performance monitoring with Prometheus metrics (#743)

Add a comprehensive performance monitoring system that collects CPU, memory, swap, load average, network IO, and GPU stats. Provides both a REST API for the UI and a Prometheus /metrics endpoint.

Backend changes:

- New internal/perf package with configurable interval-based stats collection
- GPU monitoring via LACT (Unix socket) and nvidia-smi fallback on Linux
- Ring buffer (internal/ring) for time-series stat storage
- Prometheus /metrics endpoint with all system and GPU metrics
- Moved LogMonitor to internal/logmon package
- New PerformanceConfig for hot-reloadable monitoring settings
- REST /api/performance endpoint replacing SSE streaming

UI changes:

- New Performance page with real-time charts for CPU, memory, GPU, and network
- Reusable PerformanceChart component
- LLAMA_SWAP_URL environment variable support
- Improved capture dialog display

Other:

- Example Grafana dashboard for Prometheus metrics
- monitor-test standalone binary
- Config schema and example updates

fixes #596
2026-05-09 13:29:22 -07:00
parent e261745c66
commit 7e3e94a08a
49 changed files with 4322 additions and 273 deletions
@@ -55,6 +55,28 @@ metricsMaxInMemory: 1000
 # - set to 0 to disable
 captureBuffer: 15

+# performance: configuration for system monitoring statistics
+# - timing values are duration strings like 1s, 1h30m, 90m, 2h10s, etc.
+performance:
+  # enable: boolean
+  # - default: true
+  enable: true
+
+  # every: delay between polling for new performance statistics
+  # - default: 15s
+  # - minimum duration 1s
+  # - note: setting this very low will use up more RAM as stats are kept in memory.
+  every: 15s
+
+  # maxAge: maximum age of a performance statistic before it is eligible for garbage collection
+  # - default: 1h
+  maxAge: 12h
+
+  # gc: garbage collection frequency, as a duration string
+  # - how often the garbage collector runs to clear old stats
+  # - default: 5m
+  gc: 5m
+
 # startPort: sets the starting port number for the automatic ${PORT} macro.
 # - optional, default: 5800
 # - the ${PORT} macro can be used in model.cmd and model.proxy settings
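
The duration strings in the performance block above (1s, 1h30m, 90m, 2h10s) are all forms that Go's time.ParseDuration accepts. The following is a minimal sketch, not the code from this commit, of how such settings could be parsed, checked against the documented 1s minimum for every, and used to estimate how many samples stay in memory. PerfSettings, parsePerfSettings, and retainedSamples are hypothetical names, and treating an empty string as "use the documented default" is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// PerfSettings is a hypothetical parsed form of the performance block.
// It is not the PerformanceConfig type added by this commit.
type PerfSettings struct {
	Enable bool
	Every  time.Duration
	MaxAge time.Duration
	GC     time.Duration
}

// minEvery mirrors the documented "minimum duration 1s" for every.
const minEvery = time.Second

// parsePerfSettings converts duration strings into time.Duration values.
// Assumption: an empty string falls back to the default documented in the
// config comments (every: 15s, maxAge: 1h, gc: 5m).
func parsePerfSettings(enable bool, every, maxAge, gc string) (PerfSettings, error) {
	s := PerfSettings{Enable: enable}
	fields := []struct {
		name, raw, def string
		dst            *time.Duration
	}{
		{"every", every, "15s", &s.Every},
		{"maxAge", maxAge, "1h", &s.MaxAge},
		{"gc", gc, "5m", &s.GC},
	}
	for _, f := range fields {
		raw := f.raw
		if raw == "" {
			raw = f.def
		}
		d, err := time.ParseDuration(raw)
		if err != nil {
			return PerfSettings{}, fmt.Errorf("performance.%s: %w", f.name, err)
		}
		if d <= 0 {
			return PerfSettings{}, fmt.Errorf("performance.%s: must be positive, got %v", f.name, d)
		}
		*f.dst = d
	}
	if s.Every < minEvery {
		return PerfSettings{}, fmt.Errorf("performance.every: %v is below the minimum of %v", s.Every, minEvery)
	}
	return s, nil
}

// retainedSamples estimates how many samples are held in memory at once.
// It is approximate: expired samples may linger until the next gc run.
func retainedSamples(s PerfSettings) int {
	return int(s.MaxAge / s.Every)
}

func main() {
	// Values taken from the example config in the diff above.
	s, err := parsePerfSettings(true, "15s", "12h", "5m")
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("samples retained:", retainedSamples(s))
}
```

With the example values (every: 15s, maxAge: 12h) the estimate is 12h / 15s = 2880 samples, versus 240 with the documented 1h default for maxAge. This is the trade-off the note on every describes: a shorter polling interval or a longer maxAge keeps more stats in memory.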