Improve Activity event handling in the UI (#254)

Improve Activity event handling in the UI

Fixes #252, which found that the Activity page showed activity inconsistent with /api/metrics.

- Change the data structure for event metrics to an array.
- Add an event stream connection status indicator.
add 'unconfirmed bug' as default label in bug-report.md
2025-08-15 21:44:08 -07:00 · 2025-08-15 15:38:12 -07:00 · 2025-08-14 10:27:28 -07:00 · 2025-08-14 10:02:16 -07:00 · 2025-08-08 13:39:46 -07:00 · 2025-08-08 13:33:47 -07:00
22 changed files with 571 additions and 126 deletions
@@ -1,11 +1,13 @@
 ---
 name: Bug Report
-about: Something is not working as expected...
+about: I found a defect
 title: ''
-labels: bug
+labels: 'unconfirmed bug'
 assignees: ''

 ---
+> [!IMPORTANT]
+> If you have questions about llama-swap please post in the Q&A in Discussions. Use bug reports when you've found a defect and wish to discuss a fix.

 **Describe the bug**
 A clear and concise description of what the bug is.
@@ -22,6 +22,13 @@ jobs:
      with:
        go-version: '1.23'

+    # Only run in this Linux-based runner
+    - name: Check Formatting
+      run: |
+        if [ "$(gofmt -l . | grep -v 'event/.*_test.go' | wc -l)" -gt 0 ]; then
+          gofmt -l . | grep -v 'event/.*_test.go'
+          exit 1
+        fi
    # cache simple-responder to save the build time
    - name: Restore Simple Responder
      id: restore-simple-responder
@@ -4,3 +4,4 @@ build/
 dist/
 .vscode
 .DS_Store
+.dev/
@@ -31,8 +31,9 @@ Written in golang, it is very easy to install (single binary with no dependencie
 - ✅ Run multiple models at once with `Groups` ([#107](https://github.com/mostlygeek/llama-swap/issues/107))
 - ✅ Automatic unloading of models after timeout by setting a `ttl`
 - ✅ Use any local OpenAI compatible server (llama.cpp, vllm, tabbyAPI, etc)
- ✅ Docker and Podman support
+- ✅ Reliable Docker and Podman support with `cmdStart` and `cmdStop`
 - ✅ Full control over server settings per model
+- ✅ Preload models on startup with `hooks` ([#235](https://github.com/mostlygeek/llama-swap/pull/235))

 ## How does llama-swap work?

@@ -71,9 +72,13 @@ See the [configuration documentation](https://github.com/mostlygeek/llama-swap/w

 ## Web UI

-llama-swap ships with a real time web interface to monitor logs and status of models:
+llama-swap includes a real-time web interface for monitoring logs and models:

-<img width="1786" height="1334" alt="image" src="https://github.com/user-attachments/assets/d6258cb9-1dad-40db-828f-2be860aec8fe" />
+<img width="1360" height="963" alt="image" src="https://github.com/user-attachments/assets/adef4a8e-de0b-49db-885a-8f6dedae6799" />
+
+The Activity Page shows recent requests:
+
+<img width="1360" height="963" alt="image" src="https://github.com/user-attachments/assets/5f3edee6-d03a-4ae5-ae06-b20ac1f135bd" />

 ## Installation

@@ -1,6 +1,13 @@
 # llama-swap YAML configuration example
 # -------------------------------------
 #
+# 💡 Tip - Use an LLM with this file!
+# ====================================
+#  This example configuration is written to be LLM friendly! Try
+#  copying this file into an LLM and asking it to explain or generate
+#  sections for you.
+# ====================================
+#
 # - Below are all the available configuration options for llama-swap.
 # - Settings with a default value, or noted as optional can be omitted.
 # - Settings that are marked required must be in your configuration file
@@ -207,3 +214,19 @@ groups:
      - "forever-modelA"
      - "forever-modelB"
      - "forever-modelc"
+
+# hooks: a dictionary of event triggers and actions
+# - optional, default: empty dictionary
+# - the only supported hook is on_startup
+hooks:
+  # on_startup: a dictionary of actions to perform on startup
+  # - optional, default: empty dictionary
+  # - the only supported action is preload
+  on_startup:
+    # preload: a list of model ids to load on startup
+    # - optional, default: empty list
+    # - model names must match keys in the models section
+    # - when preloading multiple models at once, define a group,
+    #   otherwise models will be loaded and swapped out
+    preload:
+      - "llama"
@@ -0,0 +1,159 @@
+package main
+
+// created for issue: #252 https://github.com/mostlygeek/llama-swap/issues/252
+// this simple benchmark tool sends a lot of small chat completion requests to llama-swap
+// to make sure all the requests are accounted for.
+//
+// requests can be sent in parallel, and the tool will report the results.
+// usage: go run main.go -baseurl http://localhost:8080/v1 -model llama3 -requests 1000 -par 5
+
+import (
+	"bytes"
+	"flag"
+	"fmt"
+	"io"
+	"log"
+	"net/http"
+	"os"
+	"sync"
+	"time"
+)
+
+func main() {
+	// ----- CLI arguments ----------------------------------------------------
+	var (
+		baseurl         string
+		modelName       string
+		totalRequests   int
+		parallelization int
+	)
+
+	flag.StringVar(&baseurl, "baseurl", "http://localhost:8080/v1", "Base URL of the API (e.g., https://api.example.com)")
+	flag.StringVar(&modelName, "model", "", "Model name to use")
+	flag.IntVar(&totalRequests, "requests", 1, "Total number of requests to send")
+	flag.IntVar(&parallelization, "par", 1, "Maximum number of concurrent requests")
+	flag.Parse()
+
+	if baseurl == "" || modelName == "" {
+		fmt.Println("Error: both -baseurl and -model are required.")
+		flag.Usage()
+		os.Exit(1)
+	}
+	if totalRequests <= 0 {
+		fmt.Println("Error: -requests must be greater than 0.")
+		os.Exit(1)
+	}
+	if parallelization <= 0 {
+		fmt.Println("Error: -par must be greater than 0.")
+		os.Exit(1)
+	}
+
+	// ----- HTTP client -------------------------------------------------------
+	client := &http.Client{
+		Timeout: 30 * time.Second,
+	}
+
+	// ----- Tracking response codes -------------------------------------------
+	statusCounts := make(map[int]int) // map[statusCode]count
+	var mu sync.Mutex                 // protects statusCounts
+
+	// ----- Request queue (buffered channel) ----------------------------------
+	requests := make(chan int, 10) // Buffered channel with capacity 10
+
+	// Goroutine to fill the request queue
+	go func() {
+		for i := 0; i < totalRequests; i++ {
+			requests <- i + 1
+		}
+		close(requests)
+	}()
+
+	// ----- Worker pool -------------------------------------------------------
+	var wg sync.WaitGroup
+	for i := 0; i < parallelization; i++ {
+		wg.Add(1)
+		go func(workerID int) {
+			defer wg.Done()
+
+			for reqID := range requests {
+				// Build request payload as a single line JSON string
+				payload := `{"model":"` + modelName + `","max_tokens":100,"stream":false,"messages":[{"role":"user","content":"write a snake game in python"}]}`
+
+				// Send POST request
+				req, err := http.NewRequest(http.MethodPost,
+					fmt.Sprintf("%s/chat/completions", baseurl),
+					bytes.NewReader([]byte(payload)))
+				if err != nil {
+					log.Printf("[worker %d][req %d] request creation error: %v", workerID, reqID, err)
+					mu.Lock()
+					statusCounts[-1]++
+					mu.Unlock()
+					continue
+				}
+				req.Header.Set("Content-Type", "application/json")
+
+				resp, err := client.Do(req)
+				if err != nil {
+					log.Printf("[worker %d][req %d] HTTP request error: %v", workerID, reqID, err)
+					mu.Lock()
+					statusCounts[-1]++
+					mu.Unlock()
+					continue
+				}
+				io.Copy(io.Discard, resp.Body)
+				resp.Body.Close()
+
+				// Record status code
+				mu.Lock()
+				statusCounts[resp.StatusCode]++
+				mu.Unlock()
+			}
+		}(i + 1)
+	}
+
+	// ----- Status ticker (prints every second) -------------------------------
+	done := make(chan struct{})
+	tickerDone := make(chan struct{})
+	go func() {
+		ticker := time.NewTicker(1 * time.Second)
+		startTime := time.Now()
+		for {
+			select {
+			case <-ticker.C:
+				mu.Lock()
+				// Compute how many requests have completed so far
+				completed := 0
+				for _, cnt := range statusCounts {
+					completed += cnt
+				}
+				// Calculate duration and progress
+				duration := time.Since(startTime)
+				progress := completed * 100 / totalRequests
+				fmt.Printf("Duration: %v, Completed: %d%% requests\n", duration, progress)
+				mu.Unlock()
+			case <-done:
+				duration := time.Since(startTime)
+				fmt.Printf("Duration: %v, Completed: %d%% requests\n", duration, 100)
+				close(tickerDone)
+				return
+			}
+		}
+	}()
+
+	// Wait for all workers to finish
+	wg.Wait()
+	close(done)  // stops the status-update goroutine
+	<-tickerDone // give ticker time to finish / print
+
+	// ----- Summary ------------------------------------------------------------
+	fmt.Println("\n\n=== HTTP response code summary ===")
+	mu.Lock()
+	for code, cnt := range statusCounts {
+		if code == -1 {
+			fmt.Printf("Client-side errors (no HTTP response): %d\n", cnt)
+		} else {
+			fmt.Printf("%d : %d\n", code, cnt)
+		}
+	}
+	mu.Unlock()
+}
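
The benchmark tool above builds its request body by concatenating the model name into a JSON string, which would yield invalid JSON if the name ever contained a quote or backslash. A minimal sketch of an alternative, assuming the same payload fields, is to marshal a struct with `encoding/json`. The `chatRequest`, `chatMessage`, and `buildPayload` names are hypothetical and are not part of this change.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chatMessage and chatRequest are hypothetical types that mirror the fields
// used in the benchmark's hand-written payload.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model     string        `json:"model"`
	MaxTokens int           `json:"max_tokens"`
	Stream    bool          `json:"stream"`
	Messages  []chatMessage `json:"messages"`
}

// buildPayload marshals the request so that special characters in the model
// name or prompt are escaped by the JSON encoder.
func buildPayload(model, prompt string) ([]byte, error) {
	return json.Marshal(chatRequest{
		Model:     model,
		MaxTokens: 100,
		Stream:    false,
		Messages:  []chatMessage{{Role: "user", Content: prompt}},
	})
}

func main() {
	payload, err := buildPayload(`my "quoted" model`, "write a snake game in python")
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(payload))
}
```

The resulting byte slice could be passed to `bytes.NewReader` in the same way the benchmark passes its string payload.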
@@ -138,6 +138,14 @@ func (c *GroupConfig) UnmarshalYAML(unmarshal func(interface{}) error) error {
 	return nil
 }

+type HooksConfig struct {
+	OnStartup HookOnStartup `yaml:"on_startup"`
+}
+
+type HookOnStartup struct {
+	Preload []string `yaml:"preload"`
+}
+
 type Config struct {
 	HealthCheckTimeout int                    `yaml:"healthCheckTimeout"`
 	LogRequests        bool                   `yaml:"logRequests"`
@@ -155,6 +163,9 @@ type Config struct {

 	// automatic port assignments
 	StartPort int `yaml:"startPort"`
+
+	// hooks, see: #209
+	Hooks HooksConfig `yaml:"hooks"`
 }

 func (c *Config) RealModelName(search string) (string, bool) {
@@ -330,6 +341,22 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
 		}
 	}

+	// clean up hooks preload
+	if len(config.Hooks.OnStartup.Preload) > 0 {
+		var toPreload []string
+		for _, modelID := range config.Hooks.OnStartup.Preload {
+			modelID = strings.TrimSpace(modelID)
+			if modelID == "" {
+				continue
+			}
+			if real, found := config.RealModelName(modelID); found {
+				toPreload = append(toPreload, real)
+			}
+		}
+
+		config.Hooks.OnStartup.Preload = toPreload
+	}
+
 	return config, nil
 }
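
The preload cleanup in the hunk above can be read as a small filter: trim each entry, skip blanks, and keep only the entries that resolve to a known model, storing the real model name. The sketch below isolates that logic so it can be run on its own. `resolveFunc`, `cleanPreload`, and `demoResolve` are hypothetical names, with `resolveFunc` standing in for `Config.RealModelName`.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveFunc is a hypothetical stand-in for Config.RealModelName: it maps a
// model ID or alias to the real model name and reports whether it was found.
type resolveFunc func(search string) (string, bool)

// cleanPreload trims each entry, skips blank ones, and keeps only entries
// that resolve to a known model, replacing aliases with the real name.
func cleanPreload(preload []string, resolve resolveFunc) []string {
	var cleaned []string
	for _, id := range preload {
		id = strings.TrimSpace(id)
		if id == "" {
			continue
		}
		if realName, ok := resolve(id); ok {
			cleaned = append(cleaned, realName)
		}
	}
	return cleaned
}

// demoResolve is an illustrative resolver with one model and one alias.
func demoResolve(search string) (string, bool) {
	known := map[string]string{"model1": "model1", "model-alias": "model1"}
	realName, ok := known[search]
	return realName, ok
}

func main() {
	input := []string{" model1 ", "", "model-alias", "missing"}
	fmt.Println(cleanPreload(input, demoResolve))
}
```

As in the hunk, entries that do not resolve are dropped silently rather than reported as configuration errors.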

@@ -100,6 +100,9 @@ func TestConfig_LoadPosix(t *testing.T) {
 	content := `
 macros:
  svr-path: "path/to/server"
+hooks:
+  on_startup:
+    preload: ["model1", "model2"]
 models:
  model1:
    cmd: path/to/cmd --arg1 one
@@ -163,6 +166,11 @@ groups:
 		Macros: map[string]string{
 			"svr-path": "path/to/server",
 		},
+		Hooks: HooksConfig{
+			OnStartup: HookOnStartup{
+				Preload: []string{"model1", "model2"},
+			},
+		},
 		Models: map[string]ModelConfig{
 			"model1": {
 				Cmd:           "path/to/cmd --arg1 one",
@@ -0,0 +1,27 @@
+package proxy
+
+import "net/http"
+
+// Custom discard writer that implements http.ResponseWriter but just discards everything
+type DiscardWriter struct {
+	header http.Header
+	status int
+}
+
+func (w *DiscardWriter) Header() http.Header {
+	if w.header == nil {
+		w.header = make(http.Header)
+	}
+	return w.header
+}
+
+func (w *DiscardWriter) Write(data []byte) (int, error) {
+	return len(data), nil
+}
+
+func (w *DiscardWriter) WriteHeader(code int) {
+	w.status = code
+}
+
+// Satisfy the http.Flusher interface for streaming responses
+func (w *DiscardWriter) Flush() {}
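
To illustrate how a discarding `http.ResponseWriter` like the one above can drive a handler without a real client, here is a minimal, self-contained sketch. It redefines a lowercase `discardWriter` locally so the example compiles on its own. `warmUp` and `demoHandler` are hypothetical helpers and are not part of this change.

```go
package main

import (
	"fmt"
	"net/http"
)

// discardWriter mirrors the DiscardWriter in the diff: it records the status
// code and throws away the response body.
type discardWriter struct {
	header http.Header
	status int
}

func (w *discardWriter) Header() http.Header {
	if w.header == nil {
		w.header = make(http.Header)
	}
	return w.header
}

func (w *discardWriter) Write(data []byte) (int, error) { return len(data), nil }

func (w *discardWriter) WriteHeader(code int) { w.status = code }

// warmUp (hypothetical) calls the handler with a throwaway GET request and
// returns the status code the handler set, or 0 if it never called WriteHeader.
func warmUp(h http.Handler) (int, error) {
	req, err := http.NewRequest(http.MethodGet, "/", nil)
	if err != nil {
		return 0, err
	}
	w := &discardWriter{}
	h.ServeHTTP(w, req)
	return w.status, nil
}

// demoHandler (hypothetical) writes a status and a body that will be discarded.
func demoHandler() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusAccepted)
		w.Write([]byte("ignored body"))
	})
}

func main() {
	status, err := warmUp(demoHandler())
	fmt.Println(status, err)
}
```

Unlike the standard library's response writer, this type does not default the status to 200 when a handler writes a body without calling `WriteHeader`, so a recorded status of 0 means the handler never set one.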
@@ -7,6 +7,7 @@ const ChatCompletionStatsEventID = 0x02
 const ConfigFileChangedEventID = 0x03
 const LogDataEventID = 0x04
 const TokenMetricsEventID = 0x05
+const ModelPreloadedEventID = 0x06

 type ProcessStateChangeEvent struct {
 	ProcessName string
@@ -48,3 +49,12 @@ type LogDataEvent struct {
 func (e LogDataEvent) Type() uint32 {
 	return LogDataEventID
 }
+
+type ModelPreloadedEvent struct {
+	ModelName string
+	Success   bool
+}
+
+func (e ModelPreloadedEvent) Type() uint32 {
+	return ModelPreloadedEventID
+}
@@ -13,9 +13,10 @@ import (
 )

 var (
-	nextTestPort int = 12000
-	portMutex    sync.Mutex
-	testLogger   = NewLogMonitorWriter(os.Stdout)
+	nextTestPort        int = 12000
+	portMutex           sync.Mutex
+	testLogger          = NewLogMonitorWriter(os.Stdout)
+	simpleResponderPath = getSimpleResponderPath()
 )

 // Check if the binary exists
@@ -69,13 +70,11 @@ func getTestSimpleResponderConfig(expectedMessage string) ModelConfig {
 }

 func getTestSimpleResponderConfigPort(expectedMessage string, port int) ModelConfig {
-	binaryPath := getSimpleResponderPath()
-
 	// Create a YAML string with just the values we want to set
 	yamlStr := fmt.Sprintf(`
 cmd: '%s --port %d --silent --respond %s'
 proxy: "http://127.0.0.1:%d"
-`, binaryPath, port, expectedMessage, port)
+`, simpleResponderPath, port, expectedMessage, port)

 	var cfg ModelConfig
 	if err := yaml.Unmarshal([]byte(yamlStr), &cfg); err != nil {
@@ -79,10 +79,12 @@ func (rec *MetricsRecorder) parseAndRecordMetrics(jsonData gjson.Result) bool {
 	outputTokens := int(jsonData.Get("usage.completion_tokens").Int())
 	inputTokens := int(jsonData.Get("usage.prompt_tokens").Int())
 	tokensPerSecond := -1.0
+	promptPerSecond := -1.0
 	durationMs := int(time.Since(rec.startTime).Milliseconds())

 	// use llama-server's timing data for tok/sec and duration as it is more accurate
 	if timings := jsonData.Get("timings"); timings.Exists() {
+		promptPerSecond = jsonData.Get("timings.prompt_per_second").Float()
 		tokensPerSecond = jsonData.Get("timings.predicted_per_second").Float()
 		durationMs = int(jsonData.Get("timings.prompt_ms").Float() + jsonData.Get("timings.predicted_ms").Float())
 	}
@@ -92,6 +94,7 @@ func (rec *MetricsRecorder) parseAndRecordMetrics(jsonData gjson.Result) bool {
 		Model:           rec.realModelName,
 		InputTokens:     inputTokens,
 		OutputTokens:    outputTokens,
+		PromptPerSecond: promptPerSecond,
 		TokensPerSecond: tokensPerSecond,
 		DurationMs:      durationMs,
 	})
@@ -15,6 +15,7 @@ type TokenMetrics struct {
 	Model           string    `json:"model"`
 	InputTokens     int       `json:"input_tokens"`
 	OutputTokens    int       `json:"output_tokens"`
+	PromptPerSecond float64   `json:"prompt_per_second"`
 	TokensPerSecond float64   `json:"tokens_per_second"`
 	DurationMs      int       `json:"duration_ms"`
 }
@@ -15,6 +15,7 @@ import (
 	"time"

 	"github.com/gin-gonic/gin"
+	"github.com/mostlygeek/llama-swap/event"
 	"github.com/tidwall/gjson"
 	"github.com/tidwall/sjson"
 )
@@ -96,6 +97,35 @@ func New(config Config) *ProxyManager {
 	}

 	pm.setupGinEngine()
+
+	// run any startup hooks
+	if len(config.Hooks.OnStartup.Preload) > 0 {
+		// do it in the background, don't block startup -- not sure if good idea yet
+		go func() {
+			discardWriter := &DiscardWriter{}
+			for _, realModelName := range config.Hooks.OnStartup.Preload {
+				proxyLogger.Infof("Preloading model: %s", realModelName)
+				processGroup, _, err := pm.swapProcessGroup(realModelName)
+
+				if err != nil {
+					event.Emit(ModelPreloadedEvent{
+						ModelName: realModelName,
+						Success:   false,
+					})
+					proxyLogger.Errorf("Failed to preload model %s: %v", realModelName, err)
+					continue
+				} else {
+					req, _ := http.NewRequest("GET", "/", nil)
+					processGroup.ProxyRequest(realModelName, discardWriter, req)
+					event.Emit(ModelPreloadedEvent{
+						ModelName: realModelName,
+						Success:   true,
+					})
+				}
+			}
+		}()
+	}
+
 	return pm
 }

@@ -361,7 +391,7 @@ func (pm *ProxyManager) proxyToUpstream(c *gin.Context) {
 		return
 	}

-	processGroup, _, err := pm.swapProcessGroup(requestedModel)
+	processGroup, realModelName, err := pm.swapProcessGroup(requestedModel)
 	if err != nil {
 		pm.sendErrorResponse(c, http.StatusInternalServerError, fmt.Sprintf("error swapping process group: %s", err.Error()))
 		return
@@ -369,7 +399,7 @@ func (pm *ProxyManager) proxyToUpstream(c *gin.Context) {

 	// rewrite the path
 	c.Request.URL.Path = c.Param("upstreamPath")
-	processGroup.ProxyRequest(requestedModel, c.Writer, c.Request)
+	processGroup.ProxyRequest(realModelName, c.Writer, c.Request)
 }

 func (pm *ProxyManager) proxyOAIHandler(c *gin.Context) {
@@ -132,7 +132,7 @@ func (pm *ProxyManager) apiSendEvents(c *gin.Context) {
 		}
 	}

-	sendMetrics := func(metrics TokenMetrics) {
+	sendMetrics := func(metrics []TokenMetrics) {
 		jsonData, err := json.Marshal(metrics)
 		if err == nil {
 			select {
@@ -168,16 +168,14 @@ func (pm *ProxyManager) apiSendEvents(c *gin.Context) {
 	 * Send Metrics data
 	 */
 	defer event.On(func(e TokenMetricsEvent) {
-		sendMetrics(e.Metrics)
+		sendMetrics([]TokenMetrics{e.Metrics})
 	})()

 	// send initial batch of data
 	sendLogData("proxy", pm.proxyLogger.GetHistory())
 	sendLogData("upstream", pm.upstreamLogger.GetHistory())
 	sendModels()
-	for _, metrics := range pm.metricsMonitor.GetMetrics() {
-		sendMetrics(metrics)
-	}
+	sendMetrics(pm.metricsMonitor.GetMetrics())

 	for {
 		select {
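
The change above sends metrics to the event stream as a JSON array in both cases: the initial history goes out as one batch, and each live event is wrapped in a one-element slice. The sketch below shows the resulting wire shape using a trimmed, hypothetical `tokenMetrics` type. The nil guard in the hypothetical `encodeMetrics` helper is an extra precaution added for the example, since a nil slice marshals to `null` rather than `[]`. Whether the real `GetMetrics()` can return nil is not shown in this diff.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// tokenMetrics is a trimmed, hypothetical version of the TokenMetrics struct.
type tokenMetrics struct {
	ID              int     `json:"id"`
	Model           string  `json:"model"`
	TokensPerSecond float64 `json:"tokens_per_second"`
}

// encodeMetrics always produces a JSON array, so a consumer can treat the
// initial batch and single live events the same way.
func encodeMetrics(metrics []tokenMetrics) (string, error) {
	if metrics == nil {
		// A nil slice would marshal to "null"; send an empty array instead.
		metrics = []tokenMetrics{}
	}
	b, err := json.Marshal(metrics)
	return string(b), err
}

func main() {
	// A single live event is wrapped in a one-element slice.
	single := tokenMetrics{ID: 0, Model: "llama", TokensPerSecond: 42.5}
	out, err := encodeMetrics([]tokenMetrics{single})
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(out)

	// An empty history still yields a valid array.
	empty, _ := encodeMetrics(nil)
	fmt.Println(empty)
}
```

With one message shape, the UI only needs a single code path that spreads the received array into its existing list.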
@@ -9,10 +9,12 @@ import (
 	"net/http"
 	"net/http/httptest"
 	"strconv"
+	"strings"
 	"sync"
 	"testing"
 	"time"

+	"github.com/mostlygeek/llama-swap/event"
 	"github.com/stretchr/testify/assert"
 	"github.com/tidwall/gjson"
 )
@@ -280,48 +282,48 @@ func TestProxyManager_ListModelsHandler(t *testing.T) {
 }

 func TestProxyManager_ListModelsHandler_SortedByID(t *testing.T) {
-    // Intentionally add models in non-sorted order and with an unlisted model
-    config := Config{
-        HealthCheckTimeout: 15,
-        Models: map[string]ModelConfig{
-            "zeta":   getTestSimpleResponderConfig("zeta"),
-            "alpha":  getTestSimpleResponderConfig("alpha"),
-            "beta":   getTestSimpleResponderConfig("beta"),
-            "hidden": func() ModelConfig {
-                mc := getTestSimpleResponderConfig("hidden")
-                mc.Unlisted = true
-                return mc
-            }(),
-        },
-        LogLevel: "error",
-    }
+	// Intentionally add models in non-sorted order and with an unlisted model
+	config := Config{
+		HealthCheckTimeout: 15,
+		Models: map[string]ModelConfig{
+			"zeta":  getTestSimpleResponderConfig("zeta"),
+			"alpha": getTestSimpleResponderConfig("alpha"),
+			"beta":  getTestSimpleResponderConfig("beta"),
+			"hidden": func() ModelConfig {
+				mc := getTestSimpleResponderConfig("hidden")
+				mc.Unlisted = true
+				return mc
+			}(),
+		},
+		LogLevel: "error",
+	}

-    proxy := New(config)
+	proxy := New(config)

-    // Request models list
-    req := httptest.NewRequest("GET", "/v1/models", nil)
-    w := httptest.NewRecorder()
-    proxy.ServeHTTP(w, req)
+	// Request models list
+	req := httptest.NewRequest("GET", "/v1/models", nil)
+	w := httptest.NewRecorder()
+	proxy.ServeHTTP(w, req)

-    assert.Equal(t, http.StatusOK, w.Code)
+	assert.Equal(t, http.StatusOK, w.Code)

-    var response struct {
-        Data []map[string]interface{} `json:"data"`
-    }
-    if err := json.Unmarshal(w.Body.Bytes(), &response); err != nil {
-        t.Fatalf("Failed to parse JSON response: %v", err)
-    }
+	var response struct {
+		Data []map[string]interface{} `json:"data"`
+	}
+	if err := json.Unmarshal(w.Body.Bytes(), &response); err != nil {
+		t.Fatalf("Failed to parse JSON response: %v", err)
+	}

-    // We expect only the listed models in sorted order by id
-    expectedOrder := []string{"alpha", "beta", "zeta"}
-    if assert.Len(t, response.Data, len(expectedOrder), "unexpected number of listed models") {
-        got := make([]string, 0, len(response.Data))
-        for _, m := range response.Data {
-            id, _ := m["id"].(string)
-            got = append(got, id)
-        }
-        assert.Equal(t, expectedOrder, got, "models should be sorted by id ascending")
-    }
+	// We expect only the listed models in sorted order by id
+	expectedOrder := []string{"alpha", "beta", "zeta"}
+	if assert.Len(t, response.Data, len(expectedOrder), "unexpected number of listed models") {
+		got := make([]string, 0, len(response.Data))
+		for _, m := range response.Data {
+			id, _ := m["id"].(string)
+			got = append(got, id)
+		}
+		assert.Equal(t, expectedOrder, got, "models should be sorted by id ascending")
+	}
 }

 func TestProxyManager_Shutdown(t *testing.T) {
@@ -656,21 +658,34 @@ func TestProxyManager_CORSOptionsHandler(t *testing.T) {
 }

 func TestProxyManager_Upstream(t *testing.T) {
-	config := AddDefaultGroupToConfig(Config{
-		HealthCheckTimeout: 15,
-		Models: map[string]ModelConfig{
-			"model1": getTestSimpleResponderConfig("model1"),
-		},
-		LogLevel: "error",
-	})
+	configStr := fmt.Sprintf(`
+logLevel: error
+models:
+  model1:
+    cmd: %s -port ${PORT} -silent -respond model1
+    aliases: [model-alias]
+`, getSimpleResponderPath())
+
+	config, err := LoadConfigFromReader(strings.NewReader(configStr))
+	assert.NoError(t, err)

 	proxy := New(config)
 	defer proxy.StopProcesses(StopWaitForInflightRequest)
-	req := httptest.NewRequest("GET", "/upstream/model1/test", nil)
-	rec := httptest.NewRecorder()
-	proxy.ServeHTTP(rec, req)
-	assert.Equal(t, http.StatusOK, rec.Code)
-	assert.Equal(t, "model1", rec.Body.String())
+	t.Run("main model name", func(t *testing.T) {
+		req := httptest.NewRequest("GET", "/upstream/model1/test", nil)
+		rec := httptest.NewRecorder()
+		proxy.ServeHTTP(rec, req)
+		assert.Equal(t, http.StatusOK, rec.Code)
+		assert.Equal(t, "model1", rec.Body.String())
+	})
+
+	t.Run("model alias", func(t *testing.T) {
+		req := httptest.NewRequest("GET", "/upstream/model-alias/test", nil)
+		rec := httptest.NewRecorder()
+		proxy.ServeHTTP(rec, req)
+		assert.Equal(t, http.StatusOK, rec.Code)
+		assert.Equal(t, "model1", rec.Body.String())
+	})
 }

 func TestProxyManager_ChatContentLength(t *testing.T) {
@@ -818,3 +833,62 @@ func TestProxyManager_HealthEndpoint(t *testing.T) {
 	assert.Equal(t, http.StatusOK, rec.Code)
 	assert.Equal(t, "OK", rec.Body.String())
 }
+
+func TestProxyManager_StartupHooks(t *testing.T) {
+
+	// Use real YAML here: the configuration has become more complex, and
+	// LoadConfigFromReader() now does much more than parse YAML.
+	// Eventually all tests should migrate to this approach.
+	configStr := strings.Replace(`
+logLevel: error
+hooks:
+  on_startup:
+    preload:
+      - model1
+      - model2
+groups:
+  preloadTestGroup:
+    swap: false
+    members:
+       - model1
+       - model2
+models:
+  model1:
+    cmd: ${simpleresponderpath} --port ${PORT} --silent --respond model1
+  model2:
+      cmd: ${simpleresponderpath} --port ${PORT} --silent --respond model2
+`, "${simpleresponderpath}", simpleResponderPath, -1)
+
+	// Create a test model configuration
+	config, err := LoadConfigFromReader(strings.NewReader(configStr))
+	if !assert.NoError(t, err, "Invalid configuration") {
+		return
+	}
+
+	preloadChan := make(chan ModelPreloadedEvent, 2) // buffer for 2 expected events
+
+	unsub := event.On(func(e ModelPreloadedEvent) {
+		preloadChan <- e
+	})
+
+	defer unsub()
+
+	// Create the proxy which should trigger preloading
+	proxy := New(config)
+	defer proxy.StopProcesses(StopWaitForInflightRequest)
+
+	for i := 0; i < 2; i++ {
+		select {
+		case <-preloadChan:
+		case <-time.After(5 * time.Second):
+			t.Fatal("timed out waiting for models to preload")
+		}
+	}
+	// make sure they are both loaded
+	_, foundGroup := proxy.processGroups["preloadTestGroup"]
+	if !assert.True(t, foundGroup, "preloadTestGroup should exist") {
+		return
+	}
+	assert.Equal(t, StateReady, proxy.processGroups["preloadTestGroup"].processes["model1"].CurrentState())
+	assert.Equal(t, StateReady, proxy.processGroups["preloadTestGroup"].processes["model2"].CurrentState())
+}
@@ -4,6 +4,7 @@ import { APIProvider } from "./contexts/APIProvider";
 import LogViewerPage from "./pages/LogViewer";
 import ModelPage from "./pages/Models";
 import ActivityPage from "./pages/Activity";
+import ConnectionStatus from "./components/ConnectionStatus";
 import { RiSunFill, RiMoonFill } from "react-icons/ri";

 function App() {
@@ -31,6 +32,7 @@ function App() {
                <button className="" onClick={toggleTheme}>
                  {isDarkMode ? <RiMoonFill /> : <RiSunFill />}
                </button>
+                <ConnectionStatus />
              </div>
            </div>
          </nav>
@@ -0,0 +1,36 @@
+import { useAPI } from "../contexts/APIProvider";
+import { useEffect, useState, useMemo } from "react";
+
+type ConnectionStatus = "disconnected" | "connecting" | "connected";
+
+const ConnectionStatus = () => {
+  const { getConnectionStatus } = useAPI();
+  const [eventStreamStatus, setEventStreamStatus] = useState<ConnectionStatus>("disconnected");
+
+  useEffect(() => {
+    const interval = setInterval(() => {
+      setEventStreamStatus(getConnectionStatus());
+    }, 1000);
+    return () => clearInterval(interval);
+  }, [getConnectionStatus]);
+
+  const eventStatusColor = useMemo(() => {
+    switch (eventStreamStatus) {
+      case "connected":
+        return "bg-green-500";
+      case "connecting":
+        return "bg-yellow-500";
+      case "disconnected":
+      default:
+        return "bg-red-500";
+    }
+  }, [eventStreamStatus]);
+
+  return (
+    <div className="flex items-center" title={`event stream: ${eventStreamStatus}`}>
+      <span className={`inline-block w-3 h-3 rounded-full ${eventStatusColor} mr-2`}></span>
+    </div>
+  );
+};
+
+export default ConnectionStatus;
@@ -20,6 +20,7 @@ interface APIProviderType {
  proxyLogs: string;
  upstreamLogs: string;
  metrics: Metrics[];
+  getConnectionStatus: () => "connected" | "connecting" | "disconnected";
 }

 interface Metrics {
@@ -28,6 +29,7 @@ interface Metrics {
  model: string;
  input_tokens: number;
  output_tokens: number;
+  prompt_per_second: number;
  tokens_per_second: number;
  duration_ms: number;
 }
@@ -62,6 +64,16 @@ export function APIProvider({ children, autoStartAPIEvents = true }: APIProvider
    });
  }, []);

+  const getConnectionStatus = useCallback(() => {
+    if (apiEventSource.current?.readyState === EventSource.OPEN) {
+      return "connected";
+    } else if (apiEventSource.current?.readyState === EventSource.CONNECTING) {
+      return "connecting";
+    } else {
+      return "disconnected";
+    }
+  }, []);
+
  const enableAPIEvents = useCallback((enabled: boolean) => {
    if (!enabled) {
      apiEventSource.current?.close();
@@ -76,6 +88,14 @@ export function APIProvider({ children, autoStartAPIEvents = true }: APIProvider
    const connect = () => {
      const eventSource = new EventSource("/api/events");

+      eventSource.onopen = () => {
+        // clear everything out on connect to keep things in sync
+        setProxyLogs("");
+        setUpstreamLogs("");
+        setMetrics([]); // clear metrics on reconnect
+        setModels([]); // clear models on reconnect
+      };
+
      eventSource.onmessage = (e: MessageEvent) => {
        try {
          const message = JSON.parse(e.data) as APIEventEnvelope;
@@ -83,6 +103,12 @@ export function APIProvider({ children, autoStartAPIEvents = true }: APIProvider
            case "modelStatus":
              {
                const models = JSON.parse(message.data) as Model[];
+
+                // sort models by name and id
+                models.sort((a, b) => {
+                  return (a.name + a.id).localeCompare(b.name + b.id);
+                });
+
                setModels(models);
              }
              break;
@@ -101,9 +127,9 @@ export function APIProvider({ children, autoStartAPIEvents = true }: APIProvider

            case "metrics":
              {
-                const newMetric = JSON.parse(message.data) as Metrics;
+                const newMetrics = JSON.parse(message.data) as Metrics[];
                setMetrics((prevMetrics) => {
-                  return [newMetric, ...prevMetrics];
+                  return [...newMetrics, ...prevMetrics];
                });
              }
              break;
@@ -187,6 +213,7 @@ export function APIProvider({ children, autoStartAPIEvents = true }: APIProvider
      proxyLogs,
      upstreamLogs,
      metrics,
+      getConnectionStatus,
    }),
    [models, listModels, unloadAllModels, loadModel, enableAPIEvents, proxyLogs, upstreamLogs, metrics]
  );
@@ -1,4 +1,4 @@
-import { useState, useEffect } from "react";
+import { useMemo } from "react";
 import { useAPI } from "../contexts/APIProvider";

 const formatTimestamp = (timestamp: string): string => {
@@ -15,25 +15,10 @@ const formatDuration = (ms: number): string => {

 const ActivityPage = () => {
  const { metrics } = useAPI();
-  const [error, setError] = useState<string | null>(null);
-
-  useEffect(() => {
-    if (metrics.length > 0) {
-      setError(null);
-    }
+  const sortedMetrics = useMemo(() => {
+    return [...metrics].sort((a, b) => b.id - a.id);
  }, [metrics]);

-  if (error) {
-    return (
-      <div className="p-6">
-        <h1 className="text-2xl font-bold mb-4">Activity</h1>
-        <div className="bg-red-50 border border-red-200 rounded-md p-4">
-          <p className="text-red-800">{error}</p>
-        </div>
-      </div>
-    );
-  }
-
  return (
    <div className="p-6">
      <h1 className="text-2xl font-bold mb-4">Activity</h1>
@@ -47,21 +32,25 @@ const ActivityPage = () => {
          <table className="min-w-full divide-y">
            <thead>
              <tr>
+                <th className="px-4 py-3 text-left text-xs font-medium uppercase tracking-wider">Id</th>
                <th className="px-6 py-3 text-left text-xs font-medium uppercase tracking-wider">Timestamp</th>
                <th className="px-6 py-3 text-left text-xs font-medium uppercase tracking-wider">Model</th>
                <th className="px-6 py-3 text-left text-xs font-medium uppercase tracking-wider">Input Tokens</th>
                <th className="px-6 py-3 text-left text-xs font-medium uppercase tracking-wider">Output Tokens</th>
+                <th className="px-6 py-3 text-left text-xs font-medium uppercase tracking-wider">Prompt Processing</th>
                <th className="px-6 py-3 text-left text-xs font-medium uppercase tracking-wider">Generation Speed</th>
                <th className="px-6 py-3 text-left text-xs font-medium uppercase tracking-wider">Duration</th>
              </tr>
            </thead>
            <tbody className="divide-y">
-              {metrics.map((metric, index) => (
-                <tr key={`${metric.id}-${index}`}>
+              {sortedMetrics.map((metric) => (
+                <tr key={`metric_${metric.id}`}>
+                  <td className="px-4 py-4 whitespace-nowrap text-sm">{metric.id + 1 /* un-zero index */}</td>
                  <td className="px-6 py-4 whitespace-nowrap text-sm">{formatTimestamp(metric.timestamp)}</td>
                  <td className="px-6 py-4 whitespace-nowrap text-sm">{metric.model}</td>
                  <td className="px-6 py-4 whitespace-nowrap text-sm">{metric.input_tokens.toLocaleString()}</td>
                  <td className="px-6 py-4 whitespace-nowrap text-sm">{metric.output_tokens.toLocaleString()}</td>
+                  <td className="px-6 py-4 whitespace-nowrap text-sm">{formatSpeed(metric.prompt_per_second)}</td>
                  <td className="px-6 py-4 whitespace-nowrap text-sm">{formatSpeed(metric.tokens_per_second)}</td>
                  <td className="px-6 py-4 whitespace-nowrap text-sm">{formatDuration(metric.duration_ms)}</td>
                </tr>
@@ -4,7 +4,7 @@ import { LogPanel } from "./LogViewer";
 import { usePersistentState } from "../hooks/usePersistentState";
 import { Panel, PanelGroup, PanelResizeHandle } from "react-resizable-panels";
 import { useTheme } from "../contexts/ThemeProvider";
-import { RiEyeFill, RiEyeOffFill, RiStopCircleLine } from "react-icons/ri";
+import { RiEyeFill, RiEyeOffFill, RiStopCircleLine, RiSwapBoxFill } from "react-icons/ri";

 export default function ModelsPage() {
  const { isNarrow } = useTheme();
@@ -40,6 +40,7 @@ function ModelsPanel() {
  const { models, loadModel, unloadAllModels } = useAPI();
  const [isUnloading, setIsUnloading] = useState(false);
  const [showUnlisted, setShowUnlisted] = usePersistentState("showUnlisted", true);
+  const [showIdorName, setShowIdorName] = usePersistentState<"id" | "name">("showIdorName", "id"); // "id" = show model ID, "name" = show display name

  const filteredModels = useMemo(() => {
    return models.filter((model) => showUnlisted || !model.unlisted);
@@ -58,18 +59,28 @@ function ModelsPanel() {
    }
  }, [unloadAllModels]);

+  const toggleIdorName = useCallback(() => {
+    setShowIdorName((prev) => (prev === "name" ? "id" : "name"));
+  }, [setShowIdorName]);
+
  return (
    <div className="card h-full flex flex-col">
      <div className="shrink-0">
        <h2>Models</h2>
        <div className="flex justify-between">
-          <button
-            className="btn flex items-center gap-2"
-            onClick={() => setShowUnlisted(!showUnlisted)}
-            style={{ lineHeight: "1.2" }}
-          >
-            {showUnlisted ? <RiEyeFill /> : <RiEyeOffFill />} unlisted
-          </button>
+          <div className="flex gap-2">
+            <button className="btn flex items-center gap-2" onClick={toggleIdorName} style={{ lineHeight: "1.2" }}>
+              <RiSwapBoxFill /> {showIdorName === "id" ? "ID" : "Name"}
+            </button>
+
+            <button
+              className="btn flex items-center gap-2"
+              onClick={() => setShowUnlisted(!showUnlisted)}
+              style={{ lineHeight: "1.2" }}
+            >
+              {showUnlisted ? <RiEyeFill /> : <RiEyeOffFill />} unlisted
+            </button>
+          </div>
          <button className="btn flex items-center gap-2" onClick={handleUnloadAllModels} disabled={isUnloading}>
            <RiStopCircleLine size="24" /> {isUnloading ? "Unloading..." : "Unload"}
          </button>
@@ -80,7 +91,7 @@ function ModelsPanel() {
        <table className="w-full">
          <thead className="sticky top-0 bg-card z-10">
            <tr className="border-b border-primary bg-surface">
-              <th className="text-left p-2">Name</th>
+              <th className="text-left p-2">{showIdorName === "id" ? "Model ID" : "Name"}</th>
              <th className="text-left p-2"></th>
              <th className="text-left p-2">State</th>
            </tr>
@@ -90,7 +101,7 @@ function ModelsPanel() {
              <tr key={model.id} className="border-b hover:bg-secondary-hover border-border">
                <td className={`p-2 ${model.unlisted ? "text-txtsecondary" : ""}`}>
                  <a href={`/upstream/${model.id}/`} className={`underline`} target="_blank">
-                    {model.name !== "" ? model.name : model.id}
+                    {showIdorName === "id" ? model.id : model.name !== "" ? model.name : model.id}
                  </a>
                  {model.description !== "" && (
                    <p className={model.unlisted ? "text-opacity-70" : ""}>
@@ -122,35 +133,41 @@ function ModelsPanel() {
 function StatsPanel() {
  const { metrics } = useAPI();

-  const [totalRequests, totalTokens, avgTokensPerSecond] = useMemo(() => {
+  const [totalRequests, totalInputTokens, totalOutputTokens, avgTokensPerSecond] = useMemo(() => {
    const totalRequests = metrics.length;
    if (totalRequests === 0) {
      return [0, 0, 0, 0];
    }
-    const totalTokens = metrics.reduce((sum, m) => sum + m.output_tokens, 0);
+    const totalInputTokens = metrics.reduce((sum, m) => sum + m.input_tokens, 0);
+    const totalOutputTokens = metrics.reduce((sum, m) => sum + m.output_tokens, 0);
    const avgTokensPerSecond = (metrics.reduce((sum, m) => sum + m.tokens_per_second, 0) / totalRequests).toFixed(2);
-    return [totalRequests, totalTokens, avgTokensPerSecond];
+    return [totalRequests, totalInputTokens, totalOutputTokens, avgTokensPerSecond];
  }, [metrics]);

  return (
    <div className="card">
-      <h2>Chat Activity</h2>
-      <table className="w-full border border-gray-200">
-        <tbody>
-          <tr className="border-b border-gray-200">
-            <td className="py-2 px-4 font-medium border-r border-gray-200">Requests</td>
-            <td className="py-2 px-4 text-right">{totalRequests}</td>
-          </tr>
-          <tr className="border-b border-gray-200">
-            <td className="py-2 px-4 font-medium border-r border-gray-200">Total Tokens Generated</td>
-            <td className="py-2 px-4 text-right">{totalTokens}</td>
-          </tr>
-          <tr>
-            <td className="py-2 px-4 font-medium border-r border-gray-200">Average Tokens/Second</td>
-            <td className="py-2 px-4 text-right">{avgTokensPerSecond}</td>
-          </tr>
-        </tbody>
-      </table>
+      <div className="rounded-lg overflow-hidden border border-gray-200">
+        <table className="w-full">
+          <tbody>
+            <tr>
+              <th className="p-2 font-medium border-b border-gray-200 text-right">Requests</th>
+              <th className="p-2 font-medium border-l border-b border-gray-200 text-right">Processed</th>
+              <th className="p-2 font-medium border-l border-b border-gray-200 text-right">Generated</th>
+              <th className="p-2 font-medium border-l border-b border-gray-200 text-right">Tokens/Sec</th>
+            </tr>
+            <tr>
+              <td className="p-2 text-right border-r border-gray-200">{totalRequests}</td>
+              <td className="p-2 text-right border-r border-gray-200">
+                {new Intl.NumberFormat().format(totalInputTokens)}
+              </td>
+              <td className="p-2 text-right border-r border-gray-200">
+                {new Intl.NumberFormat().format(totalOutputTokens)}
+              </td>
+              <td className="p-2 text-right">{avgTokensPerSecond}</td>
+            </tr>
+          </tbody>
+        </table>
+      </div>
    </div>
  );
 }
Author	SHA1	Message	Date
Benson Wong	04fc67354a	Improve Activity event handling in the UI (#254) - fixes #252 found that the Activity page showed activity inconsistent with /api/metrics - Change data structure for event metrics to array. - Add Event stream connections status indicator	2025-08-15 21:44:08 -07:00
Benson Wong	4662cf7699	add 'unconfirmed bug' as default label in bug-report.md	2025-08-15 15:38:12 -07:00
Benson Wong	5dc6b3e6d9	Add barebones but working implementation of model preload (#209, #235) * add config test for Preload hook * improve TestProxyManager_StartupHooks * docs for new hook configuration * add a .dev to .gitignore	2025-08-14 10:27:28 -07:00
Benson Wong	74c69f39ef	Add prompt processing metrics (#250) - capture prompt processing metrics - display prompt processing metrics on UI Activity page	2025-08-14 10:02:16 -07:00
Benson Wong	a186318892	Update Readme, Add screenshot for Activities page [skip ci]	2025-08-08 13:39:46 -07:00
Benson Wong	c4e4d5e1e9	Update Readme UI Screenshot [skip ci]	2025-08-08 13:33:47 -07:00
Benson Wong	7985e94ba4	add tokens processed to ui models page	2025-08-08 13:28:39 -07:00
Benson Wong	74556c3a36	Update bug-report.md [skip ci]	2025-08-08 09:52:05 -07:00
Benson Wong	5c381e4b30	Add gofmt linting to ci	2025-08-07 20:29:18 -07:00
Benson Wong	10569ed546	Fix model alias usage in upstream path (#230) Model alias values are not properly resolved and work in upstream/ path. Related to #229.	2025-08-07 20:16:56 -07:00
Benson Wong	5b10b3c23f	UI Tweaks (#228) * sort model names in UI * add toggle to show model id/name on UI model page	2025-08-07 11:07:03 -07:00
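
The `StatsPanel` change in the diff above sums input and output tokens and averages `tokens_per_second` across all metrics. A minimal sketch of that aggregation outside React follows, assuming a metric object with only the three fields used here. The `Metric` and `Stats` types and the `summarizeMetrics` helper are hypothetical names for illustration and are not part of llama-swap. Returning an object rather than a positional array is a design choice for the sketch: it keeps the empty case and the populated case the same shape.

```typescript
// Hypothetical shape: only the fields the aggregation reads,
// named as they appear in the diff.
interface Metric {
  input_tokens: number;
  output_tokens: number;
  tokens_per_second: number;
}

// Hypothetical result type; the diff itself returns a 4-element array.
interface Stats {
  totalRequests: number;
  totalInputTokens: number;
  totalOutputTokens: number;
  avgTokensPerSecond: string; // formatted with two decimals
}

// Illustrative helper (an assumption, not the project's API).
function summarizeMetrics(metrics: Metric[]): Stats {
  const totalRequests = metrics.length;
  if (totalRequests === 0) {
    // Same shape as the populated case, so callers never see undefined.
    return {
      totalRequests: 0,
      totalInputTokens: 0,
      totalOutputTokens: 0,
      avgTokensPerSecond: "0.00",
    };
  }

  let totalInputTokens = 0;
  let totalOutputTokens = 0;
  let speedSum = 0;
  // Single pass instead of three separate reduce() calls.
  for (const m of metrics) {
    totalInputTokens += m.input_tokens;
    totalOutputTokens += m.output_tokens;
    speedSum += m.tokens_per_second;
  }

  return {
    totalRequests,
    totalInputTokens,
    totalOutputTokens,
    avgTokensPerSecond: (speedSum / totalRequests).toFixed(2),
  };
}
```

In a component, this could be wrapped as `useMemo(() => summarizeMetrics(metrics), [metrics])`, with the totals passed through `Intl.NumberFormat` for display as the diff does.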