proxy,ui: add performance monitoring with Prometheus metrics (#743)
Add a comprehensive performance monitoring system that collects CPU, memory, swap, load average, network IO, and GPU stats. Provides both a REST API for the UI and a Prometheus /metrics endpoint.

Backend changes:
- New internal/perf package with configurable interval-based stats collection
- GPU monitoring via LACT (Unix socket) and nvidia-smi fallback on Linux
- Ring buffer (internal/ring) for time-series stat storage
- Prometheus /metrics endpoint with all system and GPU metrics
- Moved LogMonitor to internal/logmon package
- New PerformanceConfig for hot-reloadable monitoring settings
- REST /api/performance endpoint replacing SSE streaming

UI changes:
- New Performance page with real-time charts for CPU, memory, GPU, and network
- Reusable PerformanceChart component
- LLAMA_SWAP_URL environment variable support
- Improved capture dialog display

Other:
- Example Grafana dashboard for Prometheus metrics
- monitor-test standalone binary
- Config schema and example updates

fixes #596
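The commit message mentions a ring buffer for time-series stat storage. As a rough illustration of that idea, here is a minimal sketch of a fixed-capacity ring buffer in Go. The `Ring` type and its `NewRing`, `Push`, and `Snapshot` names are hypothetical and chosen for this example only; they are not taken from the commit's internal/ring package, whose actual API is not shown here.

```go
package main

import "fmt"

// Ring is a hypothetical fixed-capacity ring buffer used for illustration.
// It is NOT the API of the commit's internal/ring package.
type Ring[T any] struct {
	buf   []T
	start int // index of the oldest stored element
	size  int // number of stored elements
}

// NewRing returns a ring that holds at most capacity elements.
// A capacity below 1 is raised to 1 (an assumption of this sketch).
func NewRing[T any](capacity int) *Ring[T] {
	if capacity < 1 {
		capacity = 1
	}
	return &Ring[T]{buf: make([]T, capacity)}
}

// Push appends v, overwriting the oldest element once the ring is full.
func (r *Ring[T]) Push(v T) {
	if r.size < len(r.buf) {
		r.buf[(r.start+r.size)%len(r.buf)] = v
		r.size++
		return
	}
	r.buf[r.start] = v
	r.start = (r.start + 1) % len(r.buf)
}

// Snapshot returns a copy of the stored elements, ordered oldest to newest.
func (r *Ring[T]) Snapshot() []T {
	out := make([]T, 0, r.size)
	for i := 0; i < r.size; i++ {
		out = append(out, r.buf[(r.start+i)%len(r.buf)])
	}
	return out
}

func main() {
	r := NewRing[int](3)
	for i := 1; i <= 5; i++ {
		r.Push(i)
	}
	// The two oldest samples (1 and 2) have been overwritten.
	fmt.Println(r.Snapshot()) // prints [3 4 5]
}
```

A fixed capacity keeps memory use bounded no matter how long the collector runs, which is presumably why a ring suits periodically sampled stats.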
@@ -55,6 +55,28 @@ metricsMaxInMemory: 1000
 # - set to 0 to disable
 captureBuffer: 15

+# performance: configuration for system monitoring statistics
+# - timing values are duration strings like 1s, 1h30m, 90m, 2h10s, etc.
+performance:
+  # enabled: boolean
+  # - default: true
+  enable: true
+
+  # every: delay between polling for new performance statistics
+  # - default: 15s
+  # - minimum duration 1s
+  # - note: setting this very low will use up more RAM as stats are kept in memory.
+  every: 15s
+
+  # maxAge: maximum age of a performance statistics before it is eligible for garbage collection
+  # - default: 1h
+  maxAge: 12h
+
+  # gc: garbage collection frequency in seconds
+  # - how many seconds the garbage collector runs to clear old stats
+  # - default 5m
+  gc: 5m
+
 # startPort: sets the starting port number for the automatic ${PORT} macro.
 # - optional, default: 5800
 # - the ${PORT} macro can be used in model.cmd and model.proxy settings