Change fsnotify to watch config directory instead of file

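The fsnotify documentation recommends watching the parent directory and filtering events by file name instead of watching the file itself, because editors often replace a file atomically, which drops a watch placed directly on it.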
Add Event Bus (#184)
2025-07-02 10:23:52 -07:00 · 2025-07-01 22:17:35 -07:00 · 2025-06-30 23:02:44 -07:00 · 2025-06-27 11:49:31 -07:00 · 2025-06-26 09:20:50 -07:00 · 2025-06-25 12:27:49 -07:00
62 changed files with 6628 additions and 1047 deletions
@@ -15,8 +15,7 @@ jobs:
    runs-on: ubuntu-latest
    strategy:
      matrix:
-        #platform: [intel, cuda, vulkan, cpu, musa]
+        platform: [intel, cuda, vulkan, cpu, musa]
        platform: [cuda, vulkan, cpu, musa]
      fail-fast: false
    steps:
      - name: Checkout code
@@ -23,6 +23,19 @@ jobs:
      -
        name: Set up Go
        uses: actions/setup-go@v5
      -
        name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '23'  # or your preferred version
      -
        name: Install dependencies and build UI
        run: |
          cd ui
          npm ci
          npm run build
      -
        name: Run GoReleaser
        uses: goreleaser/goreleaser-action@v6
@@ -17,14 +17,16 @@ builds:
      - goos: windows
        goarch: arm64
 # use zip format for windows
 archives:
  - id: default
-    format: tar.gz
+    formats:
      - tar.gz
    name_template: "{{ .ProjectName }}_{{ .Version }}_{{ .Os }}_{{ .Arch }}"
    builds_info:
      group: root
      owner: root
    format_overrides:
      # use zip format for windows
      - goos: windows
-        format: zip
+        formats:
          - zip
@@ -19,24 +19,35 @@ all: mac linux simple-responder
 clean:
 	rm -rf $(BUILD_DIR)
-test:
+proxy/ui_dist/placeholder.txt:
 	mkdir -p proxy/ui_dist
 	touch $@
 test: proxy/ui_dist/placeholder.txt
 	go test -short -v -count=1 ./proxy
-test-all:
+test-all: proxy/ui_dist/placeholder.txt
 	go test -v -count=1 ./proxy
 ui/node_modules:
 	cd ui && npm install
 # build react UI
 ui: ui/node_modules
 	cd ui && npm run build
 # Build OSX binary
-mac:
+mac: ui
 	@echo "Building Mac binary..."
 	GOOS=darwin GOARCH=arm64 go build -ldflags="-X main.commit=${GIT_HASH} -X main.version=local_${GIT_HASH} -X main.date=${BUILD_DATE}" -o $(BUILD_DIR)/$(APP_NAME)-darwin-arm64
 # Build Linux binary
-linux:
+linux: ui
 	@echo "Building Linux binary..."
 	GOOS=linux GOARCH=amd64 go build -ldflags="-X main.commit=${GIT_HASH} -X main.version=local_${GIT_HASH} -X main.date=${BUILD_DATE}" -o $(BUILD_DIR)/$(APP_NAME)-linux-amd64
 # Build Windows binary
-windows:
+windows: ui
 	@echo "Building Windows binary..."
 	GOOS=windows GOARCH=amd64 go build -ldflags="-X main.commit=${GIT_HASH} -X main.version=local_${GIT_HASH} -X main.date=${BUILD_DATE}" -o $(BUILD_DIR)/$(APP_NAME)-windows-amd64.exe
@@ -69,4 +80,4 @@ release:
 	git tag "$$new_tag";
 # Phony targets
-.PHONY: all clean mac linux windows simple-responder
+.PHONY: all clean ui mac linux windows simple-responder
@@ -22,6 +22,7 @@ Written in golang, it is very easy to install (single binary with no dependencie
  - `v1/audio/speech` ([#36](https://github.com/mostlygeek/llama-swap/issues/36))
  - `v1/audio/transcriptions` ([docs](https://github.com/mostlygeek/llama-swap/issues/41#issuecomment-2722637867))
 - ✅ llama-swap custom API endpoints
  - `/ui` - web UI
  - `/log` - remote log monitoring
  - `/upstream/:model_id` - direct access to upstream HTTP server ([demo](https://github.com/mostlygeek/llama-swap/pull/31))
  - `/unload` - manually unload running models ([#58](https://github.com/mostlygeek/llama-swap/issues/58))
@@ -40,36 +41,38 @@ In the most basic configuration llama-swap handles one model at a time. For more
 ## config.yaml
-llama-swap's configuration is purposefully simple:
+llama-swap is managed entirely through a yaml configuration file. 
 It can be very minimal to start: 
 ```yaml
 models:
  "qwen2.5":
    cmd: |
-      /app/llama-server
+      /path/to/llama-server
      -hf bartowski/Qwen2.5-0.5B-Instruct-GGUF:Q4_K_M
      --port ${PORT}
  "smollm2":
    cmd: |
      /app/llama-server
      -hf bartowski/SmolLM2-135M-Instruct-GGUF:Q4_K_M
      --port ${PORT}
 ```
-.. but also supports many advanced features:
+However, there are many more capabilities that llama-swap supports: 
 - `groups` to run multiple models at once
 - `macros` for reusable snippets
 - `ttl` to automatically unload models
 - `macros` for reusable snippets
 - `aliases` to use familiar model names (e.g., "gpt-4o-mini")
- `env` variables to pass custom environment to inference servers
+- `env` to pass custom environment variables to inference servers
 - `cmdStop` to gracefully stop Docker/Podman containers
 - `useModelName` to override model names sent to upstream servers
 - `healthCheckTimeout` to control model startup wait times
 - `${PORT}` automatic port variables for dynamic port assignment
 - `cmdStop` to gracefully stop Docker/Podman containers
-Check the [configuration documentation](https://github.com/mostlygeek/llama-swap/wiki/Configuration) in the wiki for all options.
+See the [configuration documentation](https://github.com/mostlygeek/llama-swap/wiki/Configuration) in the wiki for all options and examples.
 ## Web UI
 llama-swap ships with a web based interface to make it easier to monitor logs and check the status of models. 
 <img width="1758" alt="image" src="https://github.com/user-attachments/assets/31ae5bcd-5efd-46b0-b64b-6db9e60196d3" />
 ## Docker Install ([download images](https://github.com/mostlygeek/llama-swap/pkgs/container/llama-swap))
@@ -120,11 +123,11 @@ $ docker run -it --rm --runtime nvidia -p 9292:8080 \
 ## Bare metal Install ([download](https://github.com/mostlygeek/llama-swap/releases))
-Pre-built binaries are available for Linux, FreeBSD and Darwin (OSX). These are automatically published and are likely a few hours ahead of the docker releases. The baremetal install works with any OpenAI compatible server, not just llama-server.
+Pre-built binaries are available for Linux, Mac, Windows and FreeBSD. These are automatically published and are likely a few hours ahead of the docker releases. The baremetal install works with any OpenAI compatible server, not just llama-server.
 1. Create a configuration file, see [config.example.yaml](config.example.yaml)
 1. Download a [release](https://github.com/mostlygeek/llama-swap/releases) appropriate for your OS and architecture.
-1. Run the binary with `llama-swap --config path/to/config.yaml`.
+1. Create a configuration file, see the [configuration documentation](https://github.com/mostlygeek/llama-swap/wiki/Configuration).
 1. Run the binary with `llama-swap --config path/to/config.yaml --listen localhost:8080`.
   Available flags:
   - `--config`: Path to the configuration file (default: `config.yaml`).
   - `--listen`: Address and port to listen on (default: `:8080`).
@@ -133,16 +136,16 @@ Pre-built binaries are available for Linux, FreeBSD and Darwin (OSX). These are
 ### Building from source
-1. Install golang for your system
+1. Build requires golang and nodejs for the user interface.
 1. `git clone git@github.com:mostlygeek/llama-swap.git`
 1. `make clean all`
 1. Binaries will be in `build/` subdirectory
 ## Monitoring Logs
-Open the `http://<host>/logs` with your browser to get a web interface with streaming logs.
+Open `http://<host>:<port>/` in your browser to get a web interface with streaming logs.
-Of course, CLI access is also supported:
+CLI access is also supported:
 ```shell
 # sends up to the last 10KB of logs
@@ -1,88 +1,203 @@
-# Seconds to wait for llama.cpp to be available to serve requests
+# llama-swap YAML configuration example
-# Default (and minimum): 15 seconds
+# -------------------------------------
-healthCheckTimeout: 90
+#
 # - Below are all the available configuration options for llama-swap.
 # - Settings with a default value, or noted as optional can be omitted.
 # - Settings that are marked required must be in your configuration file
-# valid log levels: debug, info (default), warn, error
+# healthCheckTimeout: number of seconds to wait for a model to be ready to serve requests
-logLevel: debug
+# - optional, default: 120
 # - minimum value is 15 seconds, anything less will be set to this value
 healthCheckTimeout: 500
-# creating a coding profile with models for code generation and general questions
+# logLevel: sets the logging value
-groups:
+# - optional, default: info
-  coding:
+# - Valid log levels: debug, info, warn, error
-    swap: false
+logLevel: info
    members:
      - "qwen"
      - "llama"
 # startPort: sets the starting port number for the automatic ${PORT} macro.
 # - optional, default: 5800
 # - the ${PORT} macro can be used in model.cmd and model.proxy settings
 # - it is automatically incremented for every model that uses it
 startPort: 10001
 # macros: sets a dictionary of string:string pairs
 # - optional, default: empty dictionary
 # - these are reusable snippets
 # - used in a model's cmd, cmdStop, proxy and checkEndpoint
 # - useful for reducing common configuration settings
 macros:
  "latest-llama": >
    /path/to/llama-server/llama-server-ec9e0301
    --port ${PORT}
 # models: a dictionary of model configurations
 # - required
 # - each key is the model's ID, used in API requests
 # - model settings have default values that are used if they are not defined here
 # - below are examples of the various settings a model can have:
 # - available model settings: env, cmd, cmdStop, proxy, aliases, checkEndpoint, ttl, unlisted
 models:
  # keys are the model names used in API requests
  "llama":
    # cmd: the command to run to start the inference server.
    # - required
    # - it is just a string, similar to what you would run on the CLI
    # - using `|` allows for comments in the command, these will be parsed out
    # - macros can be used within cmd
    cmd: |
-      models/llama-server-osx
+      # ${latest-llama} is a macro that is defined above
-      --port ${PORT}
+      ${latest-llama}
-      -m models/Llama-3.2-1B-Instruct-Q4_0.gguf
+      --model path/to/llama-8B-Q4_K_M.gguf
-    # list of model name aliases this llama.cpp instance can serve
+    # name: a display name for the model
    # - optional, default: empty string
    # - if set, it will be used in the v1/models API response
    # - if not set, it will be omitted in the JSON model record
    name: "llama 3.1 8B"
    # description: a description for the model
    # - optional, default: empty string
    # - if set, it will be used in the v1/models API response
    # - if not set, it will be omitted in the JSON model record
    description: "A small but capable model used for quick testing"
    # env: define an array of environment variables to inject into cmd's environment
    # - optional, default: empty array
    # - each value is a single string
    # - in the format: ENV_NAME=value
    env:
      - "CUDA_VISIBLE_DEVICES=0,1,2"
    # proxy: the URL where llama-swap routes API requests
    # - optional, default: http://localhost:${PORT}
    # - if you used ${PORT} in cmd this can be omitted
    # - if you use a custom port in cmd this *must* be set
    proxy: http://127.0.0.1:8999
    # aliases: alternative model names that this model configuration is used for
    # - optional, default: empty array
    # - aliases must be unique globally
    # - useful for impersonating a specific model
    aliases:
-    - gpt-4o-mini
+      - "gpt-4o-mini"
      - "gpt-3.5-turbo"
-    # check this path for a HTTP 200 response for the server to be ready
+    # checkEndpoint: URL path to check if the server is ready
-    checkEndpoint: /health
+    # - optional, default: /health
    # - use "none" to skip endpoint ready checking
    # - endpoint is expected to return an HTTP 200 response
    # - all requests wait until the endpoint is ready (or fails)
    checkEndpoint: /custom-endpoint
-    # unload model after 5 seconds
+    # ttl: automatically unload the model after this many seconds
-    ttl: 5
+    # - optional, default: 0
    # - ttl values must be a value greater than 0
    # - a value of 0 disables automatic unloading of the model
    ttl: 60
-  "qwen":
+    # useModelName: overrides the model name that is sent to upstream server
-    cmd: models/llama-server-osx --port ${PORT} -m models/qwen2.5-0.5b-instruct-q8_0.gguf
+    # - optional, default: ""
-    aliases:
+    # - useful when the upstream server expects a specific model name or format
-      - gpt-3.5-turbo
+    useModelName: "qwen:qwq"
-  # Embedding example with Nomic
+    # filters: a dictionary of filter settings
-  # https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF
+    # - optional, default: empty dictionary
-  "nomic":
+    filters:
-    cmd: |
+      # strip_params: a comma separated list of parameters to remove from the request
-      models/llama-server-osx --port ${PORT}
+      # - optional, default: ""
-      -m models/nomic-embed-text-v1.5.Q8_0.gguf
+      # - useful for preventing overriding of default server params by requests
-      --ctx-size 8192
+      # - `model` parameter is never removed
-      --batch-size 8192
+      # - can be any JSON key in the request body
-      --rope-scaling yarn
+      # - recommended to stick to sampling parameters
-      --rope-freq-scale 0.75
+      strip_params: "temperature, top_p, top_k"
      -ngl 99
      --embeddings
-  # Reranking example with bge-reranker
+  # Unlisted model example:
-  # https://huggingface.co/gpustack/bge-reranker-v2-m3-GGUF
+  "qwen-unlisted":
-  "bge-reranker":
+    # unlisted: true or false
-    cmd: |
+    # - optional, default: false
-      models/llama-server-osx --port ${PORT}
+    # - unlisted models do not show up in /v1/models or /upstream lists
-      -m models/bge-reranker-v2-m3-Q4_K_M.gguf
+    # - can be requested as normal through all apis
-      --ctx-size 8192
+    unlisted: true
-      --reranking
+    cmd: llama-server --port ${PORT} -m Llama-3.2-1B-Instruct-Q4_K_M.gguf -ngl 0
-  # Docker Support (v26.1.4+ required!)
+  # Docker example:
-  "dockertest":
+  # container runtimes like Docker and Podman can also be used with
   # a combination of cmd and cmdStop.
  "docker-llama":
    proxy: "http://127.0.0.1:${PORT}"
    cmd: |
      docker run --name dockertest
      --init --rm -p ${PORT}:8080 -v /mnt/nvme/models:/models
-      ghcr.io/ggerganov/llama.cpp:server
+      ghcr.io/ggml-org/llama.cpp:server
      --model '/models/Qwen2.5-Coder-0.5B-Instruct-Q4_K_M.gguf'
-  "simple":
+    # cmdStop: command to run to stop the model gracefully
-    # example of setting environment variables
+    # - optional, default: ""
-    env:
+    # - useful for stopping commands managed by another system
-      - CUDA_VISIBLE_DEVICES=0,1
+    # - on POSIX systems: a SIGTERM is sent for graceful shutdown
-      - env1=hello
+    # - on Windows, taskkill is used
-    cmd: build/simple-responder --port ${PORT}
+    # - processes are given 5 seconds to shut down before they are forcefully killed
-    unlisted: true
+    # - the upstream's process id is available in the ${PID} macro
    cmdStop: docker stop dockertest
-    # use "none" to skip check. Caution this may cause some requests to fail
+# groups: a dictionary of group settings
-    # until the upstream server is ready for traffic
+# - optional, default: empty dictionary
-    checkEndpoint: none
+# - provide advanced controls over model swapping behaviour.
 # - Using groups some models can be kept loaded indefinitely, while others are swapped out.
 # - model ids must be defined in the Models section
 # - a model can only be a member of one group
 # - group behaviour is controlled via the `swap`, `exclusive` and `persistent` fields
 # - see issue #109 for details
 #
 # NOTE: the example below uses model names that are not defined above for demonstration purposes
 groups:
   # group1 is the same as the default behaviour of llama-swap where only one model is allowed
   # to run at a time across the whole llama-swap instance
  "group1":
     # swap: controls the model swapping behaviour within the group
    # - optional, default: true
    # - true : only one model is allowed to run at a time
    # - false: all models can run together, no swapping
    swap: true
-  # don't use these, just for testing if things are broken
+    # exclusive: controls how the group affects other groups
-  "broken":
+    # - optional, default: true
-    cmd: models/llama-server-osx --port 8999 -m models/doesnotexist.gguf
+    # - true: causes all other groups to unload when this group runs a model
-    proxy: http://127.0.0.1:8999
+    # - false: does not affect other groups
-    unlisted: true
+    exclusive: true
-  "broken_timeout":
+
-    cmd: models/llama-server-osx --port 8999 -m models/qwen2.5-0.5b-instruct-q8_0.gguf
+    # members references the models defined above
-    proxy: http://127.0.0.1:9000
+    # required
-    unlisted: true
+    members:
      - "llama"
      - "qwen-unlisted"
  # Example:
  # - in this group all the models can run at the same time
  # - when a different group loads all running models in this group are unloaded
  "group2":
    swap: false
    exclusive: false
    members:
      - "docker-llama"
      - "modelA"
      - "modelB"
  # Example:
  # - a persistent group, prevents other groups from unloading it
  "forever":
     # persistent: prevents other groups from unloading the models in this group
    # - optional, default: false
    # - does not affect individual model behaviour
    persistent: true
    # set swap/exclusive to false to prevent swapping inside the group
    # and the unloading of other groups
    swap: false
    exclusive: false
    members:
      - "forever-modelA"
      - "forever-modelB"
      - "forever-modelc"
@@ -9,6 +9,7 @@ require (
 	github.com/tidwall/gjson v1.18.0
 	github.com/tidwall/sjson v1.2.5
 	gopkg.in/yaml.v3 v3.0.1
 	github.com/kelindar/event v1.5.2
 )
 require (
@@ -36,6 +36,8 @@ github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510 h1:El6M4kTTCOh6aBiKaU
 github.com/google/shlex v0.0.0-20191202100458-e7afc7fbc510/go.mod h1:pupxD2MaaD3pAXIBCelhxNneeOaAeabZDe5s4K6zSpQ=
 github.com/json-iterator/go v1.1.12 h1:PV8peI4a0ysnczrg+LtxykD8LfKY9ML6u2jnxaEnrnM=
 github.com/json-iterator/go v1.1.12/go.mod h1:e30LSqwooZae/UwlEbR2852Gd8hjQvJoHmT4TnhNGBo=
 github.com/kelindar/event v1.5.2 h1:qtgssZqMh/QQMCIxlbx4wU3DoMHOrJXKdiZhphJ4YbY=
 github.com/kelindar/event v1.5.2/go.mod h1:UxWPQjWK8u0o9Z3ponm2mgREimM95hm26/M9z8F488Q=
 github.com/klauspost/cpuid/v2 v2.0.9/go.mod h1:FInQzS24/EEf25PyTYn52gqo7WaD8xa0213Md/qVLRg=
 github.com/klauspost/cpuid/v2 v2.2.7 h1:ZWSB3igEs+d0qvnxR/ZBzXVmxkgt8DdzP6m9pfuVLDM=
 github.com/klauspost/cpuid/v2 v2.2.7/go.mod h1:Lcz8mBdAVJIBVzewtcLocK12l3Y+JytZYpaMropDUws=
@@ -14,6 +14,7 @@ import (
 	"github.com/fsnotify/fsnotify"
 	"github.com/gin-gonic/gin"
 	"github.com/kelindar/event"
 	"github.com/mostlygeek/llama-swap/proxy"
 )
@@ -53,137 +54,130 @@ func main() {
 		gin.SetMode(gin.ReleaseMode)
 	}
 	proxyManager := proxy.New(config)
 	// Setup channels for server management
 	reloadChan := make(chan *proxy.ProxyManager)
 	exitChan := make(chan struct{})
 	sigChan := make(chan os.Signal, 1)
 	signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)
 	// Create server with initial handler
 	srv := &http.Server{
-		Addr:    *listenStr,
+		Addr: *listenStr,
 		Handler: proxyManager,
 	}
 	// Support for watching config and reloading when it changes
 	reloadProxyManager := func() {
 		if currentPM, ok := srv.Handler.(*proxy.ProxyManager); ok {
 			config, err = proxy.LoadConfig(*configPath)
 			if err != nil {
 				fmt.Printf("Warning, unable to reload configuration: %v\n", err)
 				return
 			}
 			fmt.Println("Configuration Changed")
 			currentPM.Shutdown()
 			srv.Handler = proxy.New(config)
 			fmt.Println("Configuration Reloaded")
 			// wait a few seconds and tell any UI to reload
 			time.AfterFunc(3*time.Second, func() {
 				event.Emit(proxy.ConfigFileChangedEvent{
 					ReloadingState: proxy.ReloadingStateEnd,
 				})
 			})
 		} else {
 			config, err = proxy.LoadConfig(*configPath)
 			if err != nil {
 				fmt.Printf("Error, unable to load configuration: %v\n", err)
 				os.Exit(1)
 			}
 			srv.Handler = proxy.New(config)
 		}
 	}
 	// load the initial proxy manager
 	reloadProxyManager()
 	debouncedReload := debounce(time.Second, reloadProxyManager)
 	if *watchConfig {
 		defer event.On(func(e proxy.ConfigFileChangedEvent) {
 			if e.ReloadingState == proxy.ReloadingStateStart {
 				debouncedReload()
 			}
 		})()
 		fmt.Println("Watching Configuration for changes")
 		go func() {
 			absConfigPath, err := filepath.Abs(*configPath)
 			if err != nil {
 				fmt.Printf("Error getting absolute path for watching config file: %v\n", err)
 				return
 			}
 			watcher, err := fsnotify.NewWatcher()
 			if err != nil {
 				fmt.Printf("Error creating file watcher: %v. File watching disabled.\n", err)
 				return
 			}
 			configDir := filepath.Dir(absConfigPath)
 			err = watcher.Add(configDir)
 			if err != nil {
 				fmt.Printf("Error adding config path directory (%s) to watcher: %v. File watching disabled.", configDir, err)
 				return
 			}
 			defer watcher.Close()
 			for {
 				select {
 				case changeEvent := <-watcher.Events:
 					if changeEvent.Name == absConfigPath && (changeEvent.Has(fsnotify.Write) || changeEvent.Has(fsnotify.Create) || changeEvent.Has(fsnotify.Remove)) {
 						event.Emit(proxy.ConfigFileChangedEvent{
 							ReloadingState: proxy.ReloadingStateStart,
 						})
 					}
 				case err := <-watcher.Errors:
 					log.Printf("File watcher error: %v", err)
 				}
 			}
 		}()
 	}
 	// shutdown on signal
 	go func() {
 		sig := <-sigChan
 		fmt.Printf("Received signal %v, shutting down...\n", sig)
 		ctx, cancel := context.WithTimeout(context.Background(), time.Second*5)
 		defer cancel()
 		if pm, ok := srv.Handler.(*proxy.ProxyManager); ok {
 			pm.Shutdown()
 		} else {
 			fmt.Println("srv.Handler is not of type *proxy.ProxyManager")
 		}
 		if err := srv.Shutdown(ctx); err != nil {
 			fmt.Printf("Server shutdown error: %v\n", err)
 		}
 		close(exitChan)
 	}()
 	// Start server
 	fmt.Printf("llama-swap listening on %s\n", *listenStr)
 	go func() {
 		if err := srv.ListenAndServe(); err != nil && err != http.ErrServerClosed {
-			fmt.Printf("Fatal server error: %v\n", err)
+			log.Fatalf("Fatal server error: %v\n", err)
 			close(exitChan)
 		}
 	}()
 	// Handle config reloads and signals
 	go func() {
 		currentManager := proxyManager
 		for {
 			select {
 			case newManager := <-reloadChan:
 				log.Println("Config change detected, waiting for in-flight requests to complete...")
 				// Stop old manager processes gracefully (this waits for in-flight requests)
 				currentManager.StopProcesses(proxy.StopWaitForInflightRequest)
 				// Now do a full shutdown to clear the process map
 				currentManager.Shutdown()
 				currentManager = newManager
 				srv.Handler = newManager
 				log.Println("Server handler updated with new config")
 			case sig := <-sigChan:
 				fmt.Printf("Received signal %v, shutting down...\n", sig)
 				ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
 				defer cancel()
 				currentManager.Shutdown()
 				if err := srv.Shutdown(ctx); err != nil {
 					fmt.Printf("Server shutdown error: %v\n", err)
 				}
 				close(exitChan)
 				return
 			}
 		}
 	}()
 	// Start file watcher if requested
 	if *watchConfig {
 		absConfigPath, err := filepath.Abs(*configPath)
 		if err != nil {
 			log.Printf("Error getting absolute path for config: %v. File watching disabled.", err)
 		} else {
 			go watchConfigFileWithReload(absConfigPath, reloadChan)
 		}
 	}
 	// Wait for exit signal
 	<-exitChan
 }
-// watchConfigFileWithReload monitors the configuration file and sends new ProxyManager instances through reloadChan.
+func debounce(interval time.Duration, f func()) func() {
-func watchConfigFileWithReload(configPath string, reloadChan chan<- *proxy.ProxyManager) {
+	var timer *time.Timer
-	watcher, err := fsnotify.NewWatcher()
+	return func() {
-	if err != nil {
+		if timer != nil {
-		log.Printf("Error creating file watcher: %v. File watching disabled.", err)
+			timer.Stop()
 		return
 	}
 	defer watcher.Close()
 	err = watcher.Add(configPath)
 	if err != nil {
 		log.Printf("Error adding config path (%s) to watcher: %v. File watching disabled.", configPath, err)
 		return
 	}
 	log.Printf("Watching config file for changes: %s", configPath)
 	var debounceTimer *time.Timer
 	debounceDuration := 2 * time.Second
 	for {
 		select {
 		case event, ok := <-watcher.Events:
 			if !ok {
 				return
 			}
 			// We only care about writes to the specific config file
 			if event.Name == configPath && event.Has(fsnotify.Write) {
 				// Reset or start the debounce timer
 				if debounceTimer != nil {
 					debounceTimer.Stop()
 				}
 				debounceTimer = time.AfterFunc(debounceDuration, func() {
 					log.Printf("Config file modified: %s, reloading...", event.Name)
 					// Try up to 3 times with exponential backoff
 					var newConfig proxy.Config
 					var err error
 					for retries := 0; retries < 3; retries++ {
 						// Load new configuration
 						newConfig, err = proxy.LoadConfig(configPath)
 						if err == nil {
 							break
 						}
 						log.Printf("Error loading new config (attempt %d/3): %v", retries+1, err)
 						if retries < 2 {
 							time.Sleep(time.Duration(1<<retries) * time.Second)
 						}
 					}
 					if err != nil {
 						log.Printf("Failed to load new config after retries: %v", err)
 						return
 					}
 					// Create new ProxyManager with new config
 					newPM := proxy.New(newConfig)
 					reloadChan <- newPM
 					log.Println("Config reloaded successfully")
 				})
 			}
 		case err, ok := <-watcher.Errors:
 			if !ok {
 				log.Println("File watcher error channel closed.")
 				return
 			}
 			log.Printf("File watcher error: %v", err)
 		}
 		timer = time.AfterFunc(interval, f)
 	}
 }
@@ -0,0 +1,91 @@
 package main
 import (
 	"context"
 	"errors"
 	"fmt"
 	"os"
 	"os/exec"
 	"os/signal"
 	"syscall"
 	"time"
 )
 /*
 **
 Test how exec.Cmd.CommandContext behaves under certain conditions:*
  - process is killed externally, what happens with cmd.Wait() *
    ✔︎ it returns. catches crashes.*
  - process ignores SIGTERM*
    ✔︎ `kill()` is called after cmd.WaitDelay*
  - this process exits, what happens with children (kill -9 <this process' pid>)*
    x they stick around. have to be manually killed.*
  - .WithTimeout()'s cancel is called *
    ✔︎ process is killed after it ignores sigterm, cmd.Wait() catches it.*
  - parent receives SIGINT/SIGTERM, what happens
    ✔︎ waits for child process to exit, then exits gracefully.
 */
 func main() {
 	// swap between these to use kill -9 <pid> on the cli to sim external crash
 	ctx, cancel := context.WithCancel(context.Background())
 	//ctx, cancel := context.WithTimeout(context.Background(), 1000*time.Millisecond)
 	defer cancel()
 	//cmd := exec.CommandContext(ctx, "sleep", "1")
 	cmd := exec.CommandContext(ctx,
 		"../../build/simple-responder_darwin_arm64",
 		//"-ignore-sig-term", /* so it doesn't exit on receiving SIGTERM, test cmd.WaitTimeout */
 	)
 	cmd.Stdin = os.Stdin
 	cmd.Stdout = os.Stdout
 	cmd.Stderr = os.Stderr
 	// set a wait delay before sending SIGKILL
 	cmd.WaitDelay = 500 * time.Millisecond
 	cmd.Cancel = func() error {
 		fmt.Println("✔︎ Cancel() called, sending SIGTERM")
 		cmd.Process.Signal(syscall.SIGTERM)
 		//return nil
 		// this error is returned by cmd.Wait(), and can be used to
 		// signal an error when the process couldn't be normally terminated
 		// but since a SIGTERM is sent, it's probably ok to return a nil
 		// as WaitDelay timing out will override any error set here.
 		//
 		// test by enabling/disabling -ignore-sig-term on the process
 		// with -ignore-sig-term enabled, cmd.Wait() will have "signal: killed"
 		// without it, it will show the "new error from cancel"
 		return errors.New("error from cmd.Cancel()") // sets error returned by cmd.Wait()
 	}
 	if err := cmd.Start(); err != nil {
 		fmt.Println("Error starting process:", err)
 		return
 	}
 	// catch signals. Calls cancel() which will cause cmd.Wait() to return and
 	// this program to eventually exit gracefully.
 	sigChan := make(chan os.Signal, 1)
 	signal.Notify(sigChan, syscall.SIGINT, syscall.SIGTERM)
 	go func() {
 		signal := <-sigChan
 		fmt.Printf("✔︎ Received signal: %d, Killing process... with cancel before exiting\n", signal)
 		cancel()
 	}()
 	fmt.Printf("✔︎ Parent Pid: %d, Process Pid: %d\n", os.Getpid(), cmd.Process.Pid)
 	fmt.Println("✔︎ Process started, cmd.Wait() ... ")
 	if err := cmd.Wait(); err != nil {
 		fmt.Println("✔︎ cmd.Wait returned, Error:", err)
 	} else {
 		fmt.Println("✔︎ cmd.Wait returned, Process exited on its own")
 	}
 	fmt.Println("✔︎ Child process exited, Done.")
 }
@@ -42,9 +42,12 @@ func main() {
 			time.Sleep(wait)
 		}
 		bodyBytes, _ := io.ReadAll(c.Request.Body)
 		c.JSON(http.StatusOK, gin.H{
 			"responseMessage":  *responseMessage,
 			"h_content_length": c.Request.Header.Get("Content-Length"),
 			"request_body":     string(bodyBytes),
 		})
 	})
@@ -0,0 +1 @@
 ui_dist/*
@@ -6,6 +6,7 @@ import (
 	"os"
 	"regexp"
 	"runtime"
 	"slices"
 	"sort"
 	"strconv"
 	"strings"
@@ -27,14 +28,91 @@ type ModelConfig struct {
 	Unlisted      bool     `yaml:"unlisted"`
 	UseModelName  string   `yaml:"useModelName"`
 	// #179 for /v1/models
 	Name        string `yaml:"name"`
 	Description string `yaml:"description"`
 	// Limit concurrency of HTTP requests to process
 	ConcurrencyLimit int `yaml:"concurrencyLimit"`
 	// Model filters see issue #174
 	Filters ModelFilters `yaml:"filters"`
 }
 func (m *ModelConfig) UnmarshalYAML(unmarshal func(interface{}) error) error {
 	type rawModelConfig ModelConfig
 	defaults := rawModelConfig{
 		Cmd:              "",
 		CmdStop:          "",
 		Proxy:            "http://localhost:${PORT}",
 		Aliases:          []string{},
 		Env:              []string{},
 		CheckEndpoint:    "/health",
 		UnloadAfter:      0,
 		Unlisted:         false,
 		UseModelName:     "",
 		ConcurrencyLimit: 0,
 		Name:             "",
 		Description:      "",
 	}
 	// the default cmdStop to taskkill /f /t /pid ${PID}
 	if runtime.GOOS == "windows" {
 		defaults.CmdStop = "taskkill /f /t /pid ${PID}"
 	}
 	if err := unmarshal(&defaults); err != nil {
 		return err
 	}
 	*m = ModelConfig(defaults)
 	return nil
 }
 func (m *ModelConfig) SanitizedCommand() ([]string, error) {
 	return SanitizeCommand(m.Cmd)
 }
 // ModelFilters see issue #174
 type ModelFilters struct {
 	StripParams string `yaml:"strip_params"`
 }
 func (m *ModelFilters) UnmarshalYAML(unmarshal func(interface{}) error) error {
 	type rawModelFilters ModelFilters
 	defaults := rawModelFilters{
 		StripParams: "",
 	}
 	if err := unmarshal(&defaults); err != nil {
 		return err
 	}
 	*m = ModelFilters(defaults)
 	return nil
 }
 func (f ModelFilters) SanitizedStripParams() ([]string, error) {
 	if f.StripParams == "" {
 		return nil, nil
 	}
 	params := strings.Split(f.StripParams, ",")
 	cleaned := make([]string, 0, len(params))
 	for _, param := range params {
 		trimmed := strings.TrimSpace(param)
 		if trimmed == "model" || trimmed == "" {
 			continue
 		}
 		cleaned = append(cleaned, trimmed)
 	}
 	// sort cleaned
 	slices.Sort(cleaned)
 	return cleaned, nil
 }
 type GroupConfig struct {
 	Swap       bool     `yaml:"swap"`
 	Exclusive  bool     `yaml:"exclusive"`
@@ -111,26 +189,23 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
 		return Config{}, err
 	}
-	var config Config
+	// default configuration values
+	config := Config{
+		HealthCheckTimeout: 120,
+		StartPort:          5800,
+		LogLevel:           "info",
+	}
 	err = yaml.Unmarshal(data, &config)
 	if err != nil {
 		return Config{}, err
 	}
-	if config.HealthCheckTimeout == 0 {
-		// this high default timeout helps avoid failing health checks
-		// for configurations that wait for docker or have slower startup
-		config.HealthCheckTimeout = 120
-	} else if config.HealthCheckTimeout < 15 {
+	if config.HealthCheckTimeout < 15 {
 		// set a minimum of 15 seconds
 		config.HealthCheckTimeout = 15
 	}
-	// set default port ranges
-	if config.StartPort == 0 {
-		// default to 5800
-		config.StartPort = 5800
-	} else if config.StartPort < 1 {
+	if config.StartPort < 1 {
 		return Config{}, fmt.Errorf("startPort must be greater than 1")
 	}
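The hunk above seeds `Config` with default values before calling `yaml.Unmarshal`, so only keys present in the file overwrite them. Below is a minimal sketch of the same pattern. It assumes `encoding/json` in place of YAML so that it runs with the standard library alone, and `settings` and `loadSettings` are hypothetical names that do not exist in the project.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// settings is a hypothetical stand-in for the project's Config struct.
type settings struct {
	HealthCheckTimeout int    `json:"healthCheckTimeout"`
	StartPort          int    `json:"startPort"`
	LogLevel           string `json:"logLevel"`
}

// loadSettings pre-populates defaults, then lets the decoder overwrite
// only the fields that appear in the input.
func loadSettings(data []byte) (settings, error) {
	s := settings{HealthCheckTimeout: 120, StartPort: 5800, LogLevel: "info"}
	if err := json.Unmarshal(data, &s); err != nil {
		return settings{}, err
	}
	if s.HealthCheckTimeout < 15 {
		s.HealthCheckTimeout = 15 // enforce the 15 second minimum
	}
	if s.StartPort < 1 {
		return settings{}, fmt.Errorf("startPort must be greater than 1")
	}
	return s, nil
}

func main() {
	s, err := loadSettings([]byte(`{"healthCheckTimeout": 5}`))
	fmt.Println(s, err)
}
```

One consequence of this pattern: an explicit zero in the input is no longer replaced by a default, so it reaches the validation checks instead.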
@@ -187,21 +262,21 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
 			modelConfig.CmdStop = strings.ReplaceAll(modelConfig.CmdStop, macroSlug, macroValue)
 			modelConfig.Proxy = strings.ReplaceAll(modelConfig.Proxy, macroSlug, macroValue)
 			modelConfig.CheckEndpoint = strings.ReplaceAll(modelConfig.CheckEndpoint, macroSlug, macroValue)
 			modelConfig.Filters.StripParams = strings.ReplaceAll(modelConfig.Filters.StripParams, macroSlug, macroValue)
 		}
 		// enforce ${PORT} used in both cmd and proxy
 		if !strings.Contains(modelConfig.Cmd, "${PORT}") && strings.Contains(modelConfig.Proxy, "${PORT}") {
 			return Config{}, fmt.Errorf("model %s: proxy uses ${PORT} but cmd does not - ${PORT} is only available when used in cmd", modelId)
 		}
 		// only iterate over models that use ${PORT} to keep port numbers from increasing unnecessarily
 		if strings.Contains(modelConfig.Cmd, "${PORT}") || strings.Contains(modelConfig.Proxy, "${PORT}") || strings.Contains(modelConfig.CmdStop, "${PORT}") {
 			if modelConfig.Proxy == "" {
 				modelConfig.Proxy = "http://localhost:${PORT}"
 			}
 			nextPortStr := strconv.Itoa(nextPort)
 			modelConfig.Cmd = strings.ReplaceAll(modelConfig.Cmd, "${PORT}", nextPortStr)
 			modelConfig.CmdStop = strings.ReplaceAll(modelConfig.CmdStop, "${PORT}", nextPortStr)
 			modelConfig.Proxy = strings.ReplaceAll(modelConfig.Proxy, "${PORT}", nextPortStr)
 			nextPort++
 		} else if modelConfig.Proxy == "" {
 			return Config{}, fmt.Errorf("model %s requires a proxy value when not using automatic ${PORT}", modelId)
 		}
 		// make sure there are no unknown macros that have not been replaced
@@ -217,6 +292,9 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
 			matches := macroPattern.FindAllStringSubmatch(fieldValue, -1)
 			for _, match := range matches {
 				macroName := match[1]
 				if macroName == "PID" && fieldName == "cmdStop" {
 					continue // this is ok, has to be replaced by process later
 				}
 				if _, exists := config.Macros[macroName]; !exists {
 					return Config{}, fmt.Errorf("unknown macro '${%s}' found in %s.%s", macroName, modelId, fieldName)
 				}
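The `${PORT}` handling in the hunk above can be sketched in isolation. This is a simplified illustration: `assignPort` is a hypothetical helper, it ignores `cmdStop`, and its error messages are shortened compared with the ones in the diff.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// assignPort is a hypothetical helper mirroring the checks in the diff.
// It returns the rewritten cmd and proxy plus the next free port.
func assignPort(cmd, proxy string, nextPort int) (string, string, int, error) {
	const slug = "${PORT}"
	// ${PORT} in proxy is only valid when cmd also uses it
	if !strings.Contains(cmd, slug) && strings.Contains(proxy, slug) {
		return "", "", nextPort, errors.New("proxy uses ${PORT} but cmd does not")
	}
	if strings.Contains(cmd, slug) {
		if proxy == "" {
			proxy = "http://localhost:${PORT}"
		}
		p := strconv.Itoa(nextPort)
		cmd = strings.ReplaceAll(cmd, slug, p)
		proxy = strings.ReplaceAll(proxy, slug, p)
		// only consume a port number when the macro was actually used
		return cmd, proxy, nextPort + 1, nil
	}
	if proxy == "" {
		return "", "", nextPort, errors.New("proxy value required when not using ${PORT}")
	}
	return cmd, proxy, nextPort, nil
}

func main() {
	cmd, proxy, next, err := assignPort("svr --port ${PORT}", "", 5800)
	fmt.Println(cmd, proxy, next, err)
}
```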
@@ -3,6 +3,9 @@
 package proxy
 import (
 	"os"
 	"path/filepath"
 	"strings"
 	"testing"
 	"github.com/stretchr/testify/assert"
@@ -40,3 +43,191 @@ func TestConfig_SanitizeCommand(t *testing.T) {
 	assert.Error(t, err)
 	assert.Nil(t, args)
 }
 // Test the default values are automatically set for global, model and group configurations
 // after loading the configuration
 func TestConfig_DefaultValuesPosix(t *testing.T) {
 	content := `
 models:
  model1:
    cmd: path/to/cmd --port ${PORT}
 `
 	config, err := LoadConfigFromReader(strings.NewReader(content))
 	assert.NoError(t, err)
 	assert.Equal(t, 120, config.HealthCheckTimeout)
 	assert.Equal(t, 5800, config.StartPort)
 	assert.Equal(t, "info", config.LogLevel)
 	// Test default group exists
 	defaultGroup, exists := config.Groups["(default)"]
 	assert.True(t, exists, "default group should exist")
 	if assert.NotNil(t, defaultGroup, "default group should not be nil") {
 		assert.Equal(t, true, defaultGroup.Swap)
 		assert.Equal(t, true, defaultGroup.Exclusive)
 		assert.Equal(t, false, defaultGroup.Persistent)
 		assert.Equal(t, []string{"model1"}, defaultGroup.Members)
 	}
 	model1, exists := config.Models["model1"]
 	assert.True(t, exists, "model1 should exist")
 	if assert.NotNil(t, model1, "model1 should not be nil") {
 		assert.Equal(t, "path/to/cmd --port 5800", model1.Cmd) // has the port replaced
 		assert.Equal(t, "", model1.CmdStop)
 		assert.Equal(t, "http://localhost:5800", model1.Proxy)
 		assert.Equal(t, "/health", model1.CheckEndpoint)
 		assert.Equal(t, []string{}, model1.Aliases)
 		assert.Equal(t, []string{}, model1.Env)
 		assert.Equal(t, 0, model1.UnloadAfter)
 		assert.Equal(t, false, model1.Unlisted)
 		assert.Equal(t, "", model1.UseModelName)
 		assert.Equal(t, 0, model1.ConcurrencyLimit)
 	}
 	// default empty filter exists
 	assert.Equal(t, "", model1.Filters.StripParams)
 }
 func TestConfig_LoadPosix(t *testing.T) {
 	// Create a temporary YAML file for testing
 	tempDir, err := os.MkdirTemp("", "test-config")
 	if err != nil {
 		t.Fatalf("Failed to create temporary directory: %v", err)
 	}
 	defer os.RemoveAll(tempDir)
 	tempFile := filepath.Join(tempDir, "config.yaml")
 	content := `
 macros:
  svr-path: "path/to/server"
 models:
  model1:
    cmd: path/to/cmd --arg1 one
    proxy: "http://localhost:8080"
    name: "Model 1"
    description: "This is model 1"
    aliases:
      - "m1"
      - "model-one"
    env:
      - "VAR1=value1"
      - "VAR2=value2"
    checkEndpoint: "/health"
  model2:
    cmd: ${svr-path} --arg1 one
    proxy: "http://localhost:8081"
    aliases:
      - "m2"
    checkEndpoint: "/"
  model3:
    cmd: path/to/cmd --arg1 one
    proxy: "http://localhost:8081"
    aliases:
      - "mthree"
    checkEndpoint: "/"
  model4:
    cmd: path/to/cmd --arg1 one
    proxy: "http://localhost:8082"
    checkEndpoint: "/"
 healthCheckTimeout: 15
 profiles:
  test:
    - model1
    - model2
 groups:
  group1:
    swap: true
    exclusive: false
    members: ["model2"]
  forever:
    exclusive: false
    persistent: true
    members:
      - "model4"
 `
 	if err := os.WriteFile(tempFile, []byte(content), 0644); err != nil {
 		t.Fatalf("Failed to write temporary file: %v", err)
 	}
 	// Load the config and verify
 	config, err := LoadConfig(tempFile)
 	if err != nil {
 		t.Fatalf("Failed to load config: %v", err)
 	}
 	expected := Config{
 		LogLevel:  "info",
 		StartPort: 5800,
 		Macros: map[string]string{
 			"svr-path": "path/to/server",
 		},
 		Models: map[string]ModelConfig{
 			"model1": {
 				Cmd:           "path/to/cmd --arg1 one",
 				Proxy:         "http://localhost:8080",
 				Aliases:       []string{"m1", "model-one"},
 				Env:           []string{"VAR1=value1", "VAR2=value2"},
 				CheckEndpoint: "/health",
 				Name:          "Model 1",
 				Description:   "This is model 1",
 			},
 			"model2": {
 				Cmd:           "path/to/server --arg1 one",
 				Proxy:         "http://localhost:8081",
 				Aliases:       []string{"m2"},
 				Env:           []string{},
 				CheckEndpoint: "/",
 			},
 			"model3": {
 				Cmd:           "path/to/cmd --arg1 one",
 				Proxy:         "http://localhost:8081",
 				Aliases:       []string{"mthree"},
 				Env:           []string{},
 				CheckEndpoint: "/",
 			},
 			"model4": {
 				Cmd:           "path/to/cmd --arg1 one",
 				Proxy:         "http://localhost:8082",
 				CheckEndpoint: "/",
 				Aliases:       []string{},
 				Env:           []string{},
 			},
 		},
 		HealthCheckTimeout: 15,
 		Profiles: map[string][]string{
 			"test": {"model1", "model2"},
 		},
 		aliases: map[string]string{
 			"m1":        "model1",
 			"model-one": "model1",
 			"m2":        "model2",
 			"mthree":    "model3",
 		},
 		Groups: map[string]GroupConfig{
 			DEFAULT_GROUP_ID: {
 				Swap:      true,
 				Exclusive: true,
 				Members:   []string{"model1", "model3"},
 			},
 			"group1": {
 				Swap:      true,
 				Exclusive: false,
 				Members:   []string{"model2"},
 			},
 			"forever": {
 				Swap:       true,
 				Exclusive:  false,
 				Persistent: true,
 				Members:    []string{"model4"},
 			},
 		},
 	}
 	assert.Equal(t, expected, config)
 	realname, found := config.RealModelName("m1")
 	assert.True(t, found)
 	assert.Equal(t, "model1", realname)
 }
@@ -1,151 +1,12 @@
 package proxy
 import (
 	"os"
 	"path/filepath"
 	"strings"
 	"testing"
 	"github.com/stretchr/testify/assert"
 )
 func TestConfig_Load(t *testing.T) {
 	// Create a temporary YAML file for testing
 	tempDir, err := os.MkdirTemp("", "test-config")
 	if err != nil {
 		t.Fatalf("Failed to create temporary directory: %v", err)
 	}
 	defer os.RemoveAll(tempDir)
 	tempFile := filepath.Join(tempDir, "config.yaml")
 	content := `
 macros:
  svr-path: "path/to/server"
 models:
  model1:
    cmd: path/to/cmd --arg1 one
    proxy: "http://localhost:8080"
    aliases:
      - "m1"
      - "model-one"
    env:
      - "VAR1=value1"
      - "VAR2=value2"
    checkEndpoint: "/health"
  model2:
    cmd: ${svr-path} --arg1 one
    proxy: "http://localhost:8081"
    aliases:
      - "m2"
    checkEndpoint: "/"
  model3:
    cmd: path/to/cmd --arg1 one
    proxy: "http://localhost:8081"
    aliases:
      - "mthree"
    checkEndpoint: "/"
  model4:
    cmd: path/to/cmd --arg1 one
    proxy: "http://localhost:8082"
    checkEndpoint: "/"
 healthCheckTimeout: 15
 profiles:
  test:
    - model1
    - model2
 groups:
  group1:
    swap: true
    exclusive: false
    members: ["model2"]
  forever:
    exclusive: false
    persistent: true
    members:
      - "model4"
 `
 	if err := os.WriteFile(tempFile, []byte(content), 0644); err != nil {
 		t.Fatalf("Failed to write temporary file: %v", err)
 	}
 	// Load the config and verify
 	config, err := LoadConfig(tempFile)
 	if err != nil {
 		t.Fatalf("Failed to load config: %v", err)
 	}
 	expected := Config{
 		StartPort: 5800,
 		Macros: map[string]string{
 			"svr-path": "path/to/server",
 		},
 		Models: map[string]ModelConfig{
 			"model1": {
 				Cmd:           "path/to/cmd --arg1 one",
 				Proxy:         "http://localhost:8080",
 				Aliases:       []string{"m1", "model-one"},
 				Env:           []string{"VAR1=value1", "VAR2=value2"},
 				CheckEndpoint: "/health",
 			},
 			"model2": {
 				Cmd:           "path/to/server --arg1 one",
 				Proxy:         "http://localhost:8081",
 				Aliases:       []string{"m2"},
 				Env:           nil,
 				CheckEndpoint: "/",
 			},
 			"model3": {
 				Cmd:           "path/to/cmd --arg1 one",
 				Proxy:         "http://localhost:8081",
 				Aliases:       []string{"mthree"},
 				Env:           nil,
 				CheckEndpoint: "/",
 			},
 			"model4": {
 				Cmd:           "path/to/cmd --arg1 one",
 				Proxy:         "http://localhost:8082",
 				CheckEndpoint: "/",
 			},
 		},
 		HealthCheckTimeout: 15,
 		Profiles: map[string][]string{
 			"test": {"model1", "model2"},
 		},
 		aliases: map[string]string{
 			"m1":        "model1",
 			"model-one": "model1",
 			"m2":        "model2",
 			"mthree":    "model3",
 		},
 		Groups: map[string]GroupConfig{
 			DEFAULT_GROUP_ID: {
 				Swap:      true,
 				Exclusive: true,
 				Members:   []string{"model1", "model3"},
 			},
 			"group1": {
 				Swap:      true,
 				Exclusive: false,
 				Members:   []string{"model2"},
 			},
 			"forever": {
 				Swap:       true,
 				Exclusive:  false,
 				Persistent: true,
 				Members:    []string{"model4"},
 			},
 		},
 	}
 	assert.Equal(t, expected, config)
 	realname, found := config.RealModelName("m1")
 	assert.True(t, found)
 	assert.Equal(t, "model1", realname)
 }
 func TestConfig_GroupMemberIsUnique(t *testing.T) {
 	content := `
 models:
@@ -333,7 +194,7 @@ models:
    cmd: svr --port 111
 `
 		_, err := LoadConfigFromReader(strings.NewReader(content))
-		assert.Equal(t, "model model1 requires a proxy value when not using automatic ${PORT}", err.Error())
+		assert.Equal(t, "model model1: proxy uses ${PORT} but cmd does not - ${PORT} is only available when used in cmd", err.Error())
 	})
 }
@@ -439,3 +300,28 @@ models:
 		})
 	}
 }
 func TestConfig_ModelFilters(t *testing.T) {
 	content := `
 macros:
  default_strip: "temperature, top_p"
 models:
  model1:
    cmd: path/to/cmd --port ${PORT}
    filters:
      strip_params: "model, top_k, ${default_strip}, , ,"
 `
 	config, err := LoadConfigFromReader(strings.NewReader(content))
 	assert.NoError(t, err)
 	modelConfig, ok := config.Models["model1"]
 	if !assert.True(t, ok) {
 		t.FailNow()
 	}
 	// make sure `model` and empty strings are not in the list
 	assert.Equal(t, "model, top_k, temperature, top_p, , ,", modelConfig.Filters.StripParams)
 	sanitized, err := modelConfig.Filters.SanitizedStripParams()
 	if assert.NoError(t, err) {
 		assert.Equal(t, []string{"temperature", "top_k", "top_p"}, sanitized)
 	}
 }
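The sanitizing behaviour exercised by `TestConfig_ModelFilters` can be reproduced as a standalone sketch. The function name `sanitizeStripParams` is hypothetical, and `sort.Strings` is swapped in where the diff uses `slices.Sort`.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sanitizeStripParams splits a comma separated list, trims whitespace,
// drops empty entries and the protected "model" key, then sorts the result.
func sanitizeStripParams(raw string) []string {
	if raw == "" {
		return nil
	}
	parts := strings.Split(raw, ",")
	cleaned := make([]string, 0, len(parts))
	for _, part := range parts {
		trimmed := strings.TrimSpace(part)
		if trimmed == "model" || trimmed == "" {
			continue
		}
		cleaned = append(cleaned, trimmed)
	}
	sort.Strings(cleaned)
	return cleaned
}

func main() {
	fmt.Println(sanitizeStripParams("model, top_k, temperature, top_p, , ,"))
}
```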
@@ -3,6 +3,9 @@
 package proxy
 import (
 	"os"
 	"path/filepath"
 	"strings"
 	"testing"
 	"github.com/stretchr/testify/assert"
@@ -39,3 +42,189 @@ func TestConfig_SanitizeCommand(t *testing.T) {
 	assert.Error(t, err)
 	assert.Nil(t, args)
 }
 func TestConfig_DefaultValuesWindows(t *testing.T) {
 	content := `
 models:
  model1:
    cmd: path/to/cmd --port ${PORT}
 `
 	config, err := LoadConfigFromReader(strings.NewReader(content))
 	assert.NoError(t, err)
 	assert.Equal(t, 120, config.HealthCheckTimeout)
 	assert.Equal(t, 5800, config.StartPort)
 	assert.Equal(t, "info", config.LogLevel)
 	// Test default group exists
 	defaultGroup, exists := config.Groups["(default)"]
 	assert.True(t, exists, "default group should exist")
 	if assert.NotNil(t, defaultGroup, "default group should not be nil") {
 		assert.Equal(t, true, defaultGroup.Swap)
 		assert.Equal(t, true, defaultGroup.Exclusive)
 		assert.Equal(t, false, defaultGroup.Persistent)
 		assert.Equal(t, []string{"model1"}, defaultGroup.Members)
 	}
 	model1, exists := config.Models["model1"]
 	assert.True(t, exists, "model1 should exist")
 	if assert.NotNil(t, model1, "model1 should not be nil") {
 		assert.Equal(t, "path/to/cmd --port 5800", model1.Cmd) // has the port replaced
 		assert.Equal(t, "taskkill /f /t /pid ${PID}", model1.CmdStop)
 		assert.Equal(t, "http://localhost:5800", model1.Proxy)
 		assert.Equal(t, "/health", model1.CheckEndpoint)
 		assert.Equal(t, []string{}, model1.Aliases)
 		assert.Equal(t, []string{}, model1.Env)
 		assert.Equal(t, 0, model1.UnloadAfter)
 		assert.Equal(t, false, model1.Unlisted)
 		assert.Equal(t, "", model1.UseModelName)
 		assert.Equal(t, 0, model1.ConcurrencyLimit)
 	}
 	// default empty filter exists
 	assert.Equal(t, "", model1.Filters.StripParams)
 }
 func TestConfig_LoadWindows(t *testing.T) {
 	// Create a temporary YAML file for testing
 	tempDir, err := os.MkdirTemp("", "test-config")
 	if err != nil {
 		t.Fatalf("Failed to create temporary directory: %v", err)
 	}
 	defer os.RemoveAll(tempDir)
 	tempFile := filepath.Join(tempDir, "config.yaml")
 	content := `
 macros:
  svr-path: "path/to/server"
 models:
  model1:
    cmd: path/to/cmd --arg1 one
    proxy: "http://localhost:8080"
    aliases:
      - "m1"
      - "model-one"
    env:
      - "VAR1=value1"
      - "VAR2=value2"
    checkEndpoint: "/health"
  model2:
    cmd: ${svr-path} --arg1 one
    proxy: "http://localhost:8081"
    aliases:
      - "m2"
    checkEndpoint: "/"
  model3:
    cmd: path/to/cmd --arg1 one
    proxy: "http://localhost:8081"
    aliases:
      - "mthree"
    checkEndpoint: "/"
  model4:
    cmd: path/to/cmd --arg1 one
    proxy: "http://localhost:8082"
    checkEndpoint: "/"
 healthCheckTimeout: 15
 profiles:
  test:
    - model1
    - model2
 groups:
  group1:
    swap: true
    exclusive: false
    members: ["model2"]
  forever:
    exclusive: false
    persistent: true
    members:
      - "model4"
 `
 	if err := os.WriteFile(tempFile, []byte(content), 0644); err != nil {
 		t.Fatalf("Failed to write temporary file: %v", err)
 	}
 	// Load the config and verify
 	config, err := LoadConfig(tempFile)
 	if err != nil {
 		t.Fatalf("Failed to load config: %v", err)
 	}
 	expected := Config{
 		LogLevel:  "info",
 		StartPort: 5800,
 		Macros: map[string]string{
 			"svr-path": "path/to/server",
 		},
 		Models: map[string]ModelConfig{
 			"model1": {
 				Cmd:           "path/to/cmd --arg1 one",
 				CmdStop:       "taskkill /f /t /pid ${PID}",
 				Proxy:         "http://localhost:8080",
 				Aliases:       []string{"m1", "model-one"},
 				Env:           []string{"VAR1=value1", "VAR2=value2"},
 				CheckEndpoint: "/health",
 			},
 			"model2": {
 				Cmd:           "path/to/server --arg1 one",
 				CmdStop:       "taskkill /f /t /pid ${PID}",
 				Proxy:         "http://localhost:8081",
 				Aliases:       []string{"m2"},
 				Env:           []string{},
 				CheckEndpoint: "/",
 			},
 			"model3": {
 				Cmd:           "path/to/cmd --arg1 one",
 				CmdStop:       "taskkill /f /t /pid ${PID}",
 				Proxy:         "http://localhost:8081",
 				Aliases:       []string{"mthree"},
 				Env:           []string{},
 				CheckEndpoint: "/",
 			},
 			"model4": {
 				Cmd:           "path/to/cmd --arg1 one",
 				CmdStop:       "taskkill /f /t /pid ${PID}",
 				Proxy:         "http://localhost:8082",
 				CheckEndpoint: "/",
 				Aliases:       []string{},
 				Env:           []string{},
 			},
 		},
 		HealthCheckTimeout: 15,
 		Profiles: map[string][]string{
 			"test": {"model1", "model2"},
 		},
 		aliases: map[string]string{
 			"m1":        "model1",
 			"model-one": "model1",
 			"m2":        "model2",
 			"mthree":    "model3",
 		},
 		Groups: map[string]GroupConfig{
 			DEFAULT_GROUP_ID: {
 				Swap:      true,
 				Exclusive: true,
 				Members:   []string{"model1", "model3"},
 			},
 			"group1": {
 				Swap:      true,
 				Exclusive: false,
 				Members:   []string{"model2"},
 			},
 			"forever": {
 				Swap:       true,
 				Exclusive:  false,
 				Persistent: true,
 				Members:    []string{"model4"},
 			},
 		},
 	}
 	assert.Equal(t, expected, config)
 	realname, found := config.RealModelName("m1")
 	assert.True(t, found)
 	assert.Equal(t, "model1", realname)
 }
@@ -0,0 +1,49 @@
 package proxy
 // package level registry of the different event types
 const ProcessStateChangeEventID = 0x01
 const ChatCompletionStatsEventID = 0x02
 const ConfigFileChangedEventID = 0x03
 const LogDataEventID = 0x04
 type ProcessStateChangeEvent struct {
 	ProcessName string
 	NewState    ProcessState
 	OldState    ProcessState
 }
 func (e ProcessStateChangeEvent) Type() uint32 {
 	return ProcessStateChangeEventID
 }
 type ChatCompletionStats struct {
 	TokensGenerated int
 }
 func (e ChatCompletionStats) Type() uint32 {
 	return ChatCompletionStatsEventID
 }
 type ReloadingState int
 const (
 	ReloadingStateStart ReloadingState = iota
 	ReloadingStateEnd
 )
 type ConfigFileChangedEvent struct {
 	ReloadingState ReloadingState
 }
 func (e ConfigFileChangedEvent) Type() uint32 {
 	return ConfigFileChangedEventID
 }
 type LogDataEvent struct {
 	Data []byte
 }
 func (e LogDataEvent) Type() uint32 {
 	return LogDataEventID
 }
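Each event struct above exposes a `Type() uint32` method so a dispatcher can route it by ID. The sketch below shows that idea with the standard library only. It is not the `github.com/kelindar/event` API: `bus`, `subscribe`, and `publish` are hypothetical names, and this version delivers events synchronously, which may differ from the library's behaviour.

```go
package main

import (
	"fmt"
	"sync"
)

// Event mirrors the Type() uint32 method implemented by the structs in the diff.
type Event interface{ Type() uint32 }

const logDataEventID = 0x04

type logDataEvent struct{ Data []byte }

func (e logDataEvent) Type() uint32 { return logDataEventID }

// bus is a hypothetical synchronous dispatcher keyed by event type ID.
type bus struct {
	mu       sync.RWMutex
	nextID   int
	handlers map[uint32]map[int]func(Event)
}

func newBus() *bus { return &bus{handlers: make(map[uint32]map[int]func(Event))} }

// subscribe registers fn for one event type and returns a cancel function.
func (b *bus) subscribe(eventType uint32, fn func(Event)) func() {
	b.mu.Lock()
	defer b.mu.Unlock()
	id := b.nextID
	b.nextID++
	if b.handlers[eventType] == nil {
		b.handlers[eventType] = make(map[int]func(Event))
	}
	b.handlers[eventType][id] = fn
	return func() {
		b.mu.Lock()
		defer b.mu.Unlock()
		delete(b.handlers[eventType], id)
	}
}

// publish calls every handler registered for the event's type.
func (b *bus) publish(e Event) {
	b.mu.RLock()
	fns := make([]func(Event), 0, len(b.handlers[e.Type()]))
	for _, fn := range b.handlers[e.Type()] {
		fns = append(fns, fn)
	}
	b.mu.RUnlock()
	// handlers run outside the lock so they may subscribe or cancel safely
	for _, fn := range fns {
		fn(e)
	}
}

func main() {
	b := newBus()
	cancel := b.subscribe(logDataEventID, func(e Event) {
		fmt.Printf("log: %s\n", e.(logDataEvent).Data)
	})
	b.publish(logDataEvent{Data: []byte("hello")})
	cancel()
	b.publish(logDataEvent{Data: []byte("dropped")}) // no subscribers remain
}
```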
@@ -9,6 +9,7 @@ import (
 	"testing"
 	"github.com/gin-gonic/gin"
 	"gopkg.in/yaml.v3"
 )
 var (
@@ -70,10 +71,16 @@ func getTestSimpleResponderConfig(expectedMessage string) ModelConfig {
 func getTestSimpleResponderConfigPort(expectedMessage string, port int) ModelConfig {
 	binaryPath := getSimpleResponderPath()
-	// Create a process configuration
-	return ModelConfig{
-		Cmd:           fmt.Sprintf("%s --port %d --silent --respond %s", binaryPath, port, expectedMessage),
-		Proxy:         fmt.Sprintf("http://127.0.0.1:%d", port),
-		CheckEndpoint: "/health",
+	// Create a YAML string with just the values we want to set
+	yamlStr := fmt.Sprintf(`
+cmd: '%s --port %d --silent --respond %s'
+proxy: "http://127.0.0.1:%d"
+`, binaryPath, port, expectedMessage, port)
+	var cfg ModelConfig
+	if err := yaml.Unmarshal([]byte(yamlStr), &cfg); err != nil {
+		panic(fmt.Sprintf("failed to unmarshal test config: %v in [%s]", err, yamlStr))
 	}
+	return cfg
 }
@@ -1,14 +0,0 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>llama-swap</title>
 </head>
 <body>
    <h1>llama-swap</h1>
    <p>
        <a href="/logs">view logs</a> | <a href="/upstream">configured models</a> | <a href="https://github.com/mostlygeek/llama-swap">github</a>
    </p>
 </body>
 </html>
@@ -1,259 +0,0 @@
 <!DOCTYPE html>
 <html lang="en">
 <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Logs</title>
    <style>
        body {
            margin: 0;
            height: 100vh;
            display: flex;
            flex-direction: column;
            font-family: "Courier New", Courier, monospace;
        }
        .log-container {
            display: flex;
            flex: 1;
            gap: 0.5em;
            margin: 0.5em;
            min-height: 0;
        }
        .log-column {
            display: flex;
            flex-direction: column;
            flex: 1;
            min-width: 0;
            transition: flex 0.3s ease;
        }
        .log-column.minimized {
            flex: 0.1;
            max-width: 50px;
            border: 1px solid #777;
            color: green;
        }
        .log-controls {
            display: grid;
            grid-template-columns: 1fr auto;
            gap: 0.5em;
            margin-bottom: 0.5em;
        }
        .log-controls input {
            width: 100%;
            padding: 4px;
        }
        .log-controls input:focus {
           outline: none;
        }
        .log-stream {
            flex: 1;
            padding: 1em;
            background: #f4f4f4;
            overflow-y: auto;
            white-space: pre-wrap;
            word-wrap: break-word;
            min-height: 0;
        }
        .regex-error {
            background-color: #ff0000 !important;
        }
        /* Make headers clickable and show pointer cursor */
        h2 {
            cursor: pointer;
            user-select: none;
            margin: 0 0 0.5em 0;
            padding: 0.5em;
        }
        h2:hover {
            background-color: rgba(0, 0, 0, 0.05);
        }
        /* Dark mode styles */
        @media (prefers-color-scheme: dark) {
            body {
                background-color: #333;
                color: #fff;
            }
            .log-stream {
                background: #444;
                color: #fff;
            }
            .log-controls input {
                background: #555;
                color: #fff;
                border: 1px solid #777;
            }
            .log-controls button {
                background: #555;
                color: #fff;
                border: 1px solid #777;
            }
            h2:hover {
                background-color: rgba(255, 255, 255, 0.1);
            }
        }
        /* Hide content when minimized */
        .log-column.minimized .log-controls,
        .log-column.minimized .log-stream {
            display: none;
        }
        .log-column.minimized h2 {
            writing-mode: vertical-rl;
            text-orientation: mixed;
            transform: rotate(180deg);
            white-space: nowrap;
            margin: auto;
        }
    </style>
 </head>
 <body>
    <div class="log-container">
        <div class="log-column">
            <h2>Proxy Logs</h2>
            <div class="log-controls">
                <input type="text" id="proxy-filter-input" placeholder="proxy regex filter">
                <button id="proxy-clear-button">clear</button>
            </div>
            <pre class="log-stream" id="proxy-log-stream">Waiting for proxy logs...</pre>
        </div>
        <div class="log-column minimized">
            <h2>Upstream Logs</h2>
            <div class="log-controls">
                <input type="text" id="upstream-filter-input" placeholder="upstream regex filter">
                <button id="upstream-clear-button">clear</button>
            </div>
            <pre class="log-stream" id="upstream-log-stream">Waiting for upstream logs...</pre>
        </div>
    </div>
    <script>
        class LogStream {
            constructor(streamElement, filterInput, clearButton, endpoint) {
                this.streamElement = streamElement;
                this.filterInput = filterInput;
                this.clearButton = clearButton;
                this.endpoint = endpoint;
                this.logData = "";
                this.regexFilter = null;
                this.eventSource = null;
                this.initialize();
            }
            initialize() {
                this.filterInput.addEventListener('input', () => this.updateFilter());
                this.clearButton.addEventListener('click', () => {
                    this.filterInput.value = "";
                    this.regexFilter = null;
                    this.render();
                });
                this.setupEventSource();
            }
            setupEventSource() {
                if (typeof(EventSource) === "undefined") {
                    this.logData = "SSE Not supported by this browser.";
                    this.render();
                    return;
                }
                const connect = () => {
                    this.eventSource = new EventSource(this.endpoint);
                    this.eventSource.onmessage = (event) => {
                        this.logData += event.data;
                        this.logData = this.logData.slice(-1024 * 100);
                        this.render();
                    };
                    this.eventSource.onerror = (err) => {
                        // Close the current connection
                        this.eventSource.close();
                        this.logData += "\nConnection lost. Retrying in 5 seconds...\n";
                        this.render();
                        // Attempt to reconnect after 5 seconds
                        setTimeout(() => {
                            this.logData += "Attempting to reconnect...\n";
                            this.render();
                            connect();
                        }, 5000);
                    };
                };
                // Initial connection
                connect();
            }
            render() {
                let content = this.logData;
                if (this.regexFilter) {
                    const lines = content.split('\n');
                    const filteredLines = lines.filter(line => this.regexFilter.test(line));
                    content = filteredLines.length > 0 ? filteredLines.join('\n') + '\n' : "";
                }
                this.streamElement.textContent = content;
                this.streamElement.scrollTop = this.streamElement.scrollHeight;
            }
            updateFilter() {
                const pattern = this.filterInput.value.trim();
                this.filterInput.classList.remove('regex-error');
                if (!pattern) {
                    this.regexFilter = null;
                    this.render();
                    return;
                }
                try {
                    this.regexFilter = new RegExp(pattern);
                } catch (e) {
                    console.error("Invalid regex pattern:", e);
                    this.regexFilter = null;
                    this.filterInput.classList.add('regex-error');
                    return;
                }
                this.render();
            }
        }
        // Initialize both log streams
        document.addEventListener('DOMContentLoaded', () => {
            new LogStream(
                document.getElementById('proxy-log-stream'),
                document.getElementById('proxy-filter-input'),
                document.getElementById('proxy-clear-button'),
                "/logs/streamSSE/proxy"
            );
            new LogStream(
                document.getElementById('upstream-log-stream'),
                document.getElementById('upstream-filter-input'),
                document.getElementById('upstream-clear-button'),
                "/logs/streamSSE/upstream"
            );
            // Initialize clickable headers
            document.querySelectorAll('h2').forEach(header => {
                header.addEventListener('click', () => {
                    const column = header.closest('.log-column');
                    column.classList.toggle('minimized');
                });
            });
        });
    </script>
 </body>
 </html>
@@ -1,10 +0,0 @@
 package proxy
 import "embed"
 //go:embed html
 var htmlFiles embed.FS
 func getHTMLFile(path string) ([]byte, error) {
 	return htmlFiles.ReadFile("html/" + path)
 }
@@ -2,10 +2,13 @@ package proxy
 import (
 	"container/ring"
 	"context"
 	"fmt"
 	"io"
 	"os"
 	"sync"
 	"github.com/kelindar/event"
 )
 type LogLevel int
@@ -18,7 +21,7 @@ const (
 )
 type LogMonitor struct {
-	clients  map[chan []byte]bool
+	eventbus *event.Dispatcher
 	mu       sync.RWMutex
 	buffer   *ring.Ring
 	bufferMu sync.RWMutex
@@ -37,11 +40,11 @@ func NewLogMonitor() *LogMonitor {
 func NewLogMonitorWriter(stdout io.Writer) *LogMonitor {
 	return &LogMonitor{
-		clients: make(map[chan []byte]bool),
-		buffer:  ring.New(10 * 1024), // keep 10KB of buffered logs
-		stdout:  stdout,
-		level:   LevelInfo,
-		prefix:  "",
+		eventbus: event.NewDispatcher(),
+		buffer:   ring.New(10 * 1024), // keep 10KB of buffered logs
+		stdout:   stdout,
+		level:    LevelInfo,
+		prefix:   "",
 	}
 }
@@ -81,34 +84,14 @@ func (w *LogMonitor) GetHistory() []byte {
 	return history
 }
-func (w *LogMonitor) Subscribe() chan []byte {
-	w.mu.Lock()
-	defer w.mu.Unlock()
-	ch := make(chan []byte, 100)
-	w.clients[ch] = true
-	return ch
-}
-func (w *LogMonitor) Unsubscribe(ch chan []byte) {
-	w.mu.Lock()
-	defer w.mu.Unlock()
-	delete(w.clients, ch)
-	close(ch)
+func (w *LogMonitor) OnLogData(callback func(data []byte)) context.CancelFunc {
+	return event.Subscribe(w.eventbus, func(e LogDataEvent) {
+		callback(e.Data)
+	})
 }
 func (w *LogMonitor) broadcast(msg []byte) {
-	w.mu.RLock()
-	defer w.mu.RUnlock()
-	for client := range w.clients {
-		select {
-		case client <- msg:
-		default:
-			// If client buffer is full, skip
-		}
-	}
+	event.Publish(w.eventbus, LogDataEvent{Data: msg})
 }
 func (w *LogMonitor) SetPrefix(prefix string) {
@@ -10,38 +10,29 @@ import (
 func TestLogMonitor(t *testing.T) {
 	logMonitor := NewLogMonitorWriter(io.Discard)
-	// Test subscription
-	client1 := logMonitor.Subscribe()
-	client2 := logMonitor.Subscribe()
-	defer logMonitor.Unsubscribe(client1)
-	defer logMonitor.Unsubscribe(client2)
+	// A WaitGroup is used to wait for all the expected writes to complete
+	var wg sync.WaitGroup
 	client1Messages := make([]byte, 0)
 	client2Messages := make([]byte, 0)
-	var wg sync.WaitGroup
-	wg.Add(1)
-	go func() {
-		defer wg.Done()
-		for {
-			select {
-			case data := <-client1:
-				client1Messages = append(client1Messages, data...)
-			case data := <-client2:
-				client2Messages = append(client2Messages, data...)
-			default:
-				return
-			}
-		}
-	}()
+	defer logMonitor.OnLogData(func(data []byte) {
+		client1Messages = append(client1Messages, data...)
+		wg.Done()
+	})()
+	defer logMonitor.OnLogData(func(data []byte) {
+		client2Messages = append(client2Messages, data...)
+		wg.Done()
+	})()
+	wg.Add(6) // 2 x 3 writes
 	logMonitor.Write([]byte("1"))
 	logMonitor.Write([]byte("2"))
 	logMonitor.Write([]byte("3"))
-	// Wait for the goroutine to finish
+	// wait for all writes to complete
 	wg.Wait()
 	// Check the buffer
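The rewritten test waits on a `sync.WaitGroup` sized as subscribers times writes, because callbacks may run after `Write` returns. A minimal sketch of that counting pattern follows. It assumes a hypothetical `fanOut` broadcaster that delivers each message on its own goroutine, which is not the project's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// fanOut is a hypothetical asynchronous broadcaster: each message is
// delivered to every callback on a separate goroutine.
func fanOut(callbacks []func([]byte), msg []byte) {
	for _, cb := range callbacks {
		go cb(msg)
	}
}

// collect sends every write to every subscriber and returns the number of
// bytes each subscriber received once all deliveries have completed.
func collect(writes []string, subscribers int) []int {
	var wg sync.WaitGroup
	var mu sync.Mutex
	counts := make([]int, subscribers)
	callbacks := make([]func([]byte), subscribers)
	for i := range callbacks {
		i := i // capture the loop variable for the closure
		callbacks[i] = func(data []byte) {
			mu.Lock()
			counts[i] += len(data)
			mu.Unlock()
			wg.Done()
		}
	}
	// one Done() is expected per subscriber per write
	wg.Add(subscribers * len(writes))
	for _, w := range writes {
		fanOut(callbacks, []byte(w))
	}
	wg.Wait()
	return counts
}

func main() {
	fmt.Println(collect([]string{"1", "2", "3"}, 2))
}
```

The sketch counts bytes rather than comparing message order, since asynchronous delivery does not guarantee ordering.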
@@ -8,12 +8,13 @@ import (
 	"net/http"
 	"net/url"
 	"os/exec"
 	"runtime"
 	"strconv"
 	"strings"
 	"sync"
 	"syscall"
 	"time"
 	"github.com/kelindar/event"
 )
 type ProcessState string
@@ -24,9 +25,6 @@ const (
 	StateReady    ProcessState = ProcessState("ready")
 	StateStopping ProcessState = ProcessState("stopping")
 	// failed a health check on start and will not be recovered
 	StateFailed ProcessState = ProcessState("failed")
 	// process is shutdown and will not be restarted
 	StateShutdown ProcessState = ProcessState("shutdown")
 )
@@ -43,8 +41,11 @@ type Process struct {
 	config ModelConfig
 	cmd    *exec.Cmd
-	// for p.cmd.Wait() select { ... }
-	cmdWaitChan chan error
+	// PR #155 called to cancel the upstream process
+	cancelUpstream context.CancelFunc
+	// closed when command exits
+	cmdWaitChan chan struct{}
 	processLogger *LogMonitor
 	proxyLogger   *LogMonitor
@@ -62,22 +63,17 @@ type Process struct {
 	// used to block on multiple start() calls
 	waitStarting sync.WaitGroup
-	// for managing shutdown state
-	shutdownCtx    context.Context
-	shutdownCancel context.CancelFunc
 	// for managing concurrency limits
 	concurrencyLimitSemaphore chan struct{}
-	// stop timeout waiting for graceful shutdown
+	// used for testing to override the default value
 	gracefulStopTimeout time.Duration
-	// track that this happened
+	// track the number of failed starts
-	upstreamWasStoppedWithKill bool
+	failedStartCount int
 }
 func NewProcess(ID string, healthCheckTimeout int, config ModelConfig, processLogger *LogMonitor, proxyLogger *LogMonitor) *Process {
-	ctx, cancel := context.WithCancel(context.Background())
 	concurrentLimit := 10
 	if config.ConcurrencyLimit > 0 {
 		concurrentLimit = config.ConcurrencyLimit
@@ -87,21 +83,20 @@ func NewProcess(ID string, healthCheckTimeout int, config ModelConfig, processLo
 		ID:                      ID,
 		config:                  config,
 		cmd:                     nil,
-		cmdWaitChan:             make(chan error, 1),
+		cancelUpstream:          nil,
 		processLogger:           processLogger,
 		proxyLogger:             proxyLogger,
 		healthCheckTimeout:      healthCheckTimeout,
 		healthCheckLoopInterval: 5 * time.Second, /* default, can not be set by user - used for testing */
 		state:                   StateStopped,
-		shutdownCtx:             ctx,
-		shutdownCancel:          cancel,
 		// concurrency limit
 		concurrencyLimitSemaphore: make(chan struct{}, concurrentLimit),
+		// To be removed when migration over exec.CommandContext is complete
 		// stop timeout
-		gracefulStopTimeout:        10 * time.Second,
+		gracefulStopTimeout: 10 * time.Second,
-		upstreamWasStoppedWithKill: false,
+		cmdWaitChan:         make(chan struct{}),
 	}
 }
@@ -134,6 +129,7 @@ func (p *Process) swapState(expectedState, newState ProcessState) (ProcessState,
 	p.state = newState
 	p.proxyLogger.Debugf("<%s> swapState() State transitioned from %s to %s", p.ID, expectedState, newState)
+	event.Emit(ProcessStateChangeEvent{ProcessName: p.ID, NewState: newState, OldState: expectedState})
 	return p.state, nil
 }
@@ -143,13 +139,11 @@ func isValidTransition(from, to ProcessState) bool {
 	case StateStopped:
 		return to == StateStarting
 	case StateStarting:
-		return to == StateReady || to == StateFailed || to == StateStopping
+		return to == StateReady || to == StateStopping || to == StateStopped
 	case StateReady:
 		return to == StateStopping
 	case StateStopping:
 		return to == StateStopped || to == StateShutdown
-	case StateFailed:
-		return to == StateStopping
 	case StateShutdown:
 		return false // No transitions allowed from these states
 	}
@@ -197,17 +191,26 @@ func (p *Process) start() error {
 	p.waitStarting.Add(1)
 	defer p.waitStarting.Done()
+	cmdContext, ctxCancelUpstream := context.WithCancel(context.Background())
-	p.cmd = exec.Command(args[0], args[1:]...)
+	p.cmd = exec.CommandContext(cmdContext, args[0], args[1:]...)
 	p.cmd.Stdout = p.processLogger
 	p.cmd.Stderr = p.processLogger
-	p.cmd.Env = p.config.Env
+	p.cmd.Env = append(p.cmd.Environ(), p.config.Env...)
+	p.cmd.Cancel = p.cmdStopUpstreamProcess
+	p.cmd.WaitDelay = p.gracefulStopTimeout
+	p.cancelUpstream = ctxCancelUpstream
+	p.cmdWaitChan = make(chan struct{})
+	p.failedStartCount++ // this will be reset to zero when the process has successfully started
 	p.proxyLogger.Debugf("<%s> Executing start command: %s, env: %s", p.ID, strings.Join(args, " "), strings.Join(p.config.Env, ", "))
 	err = p.cmd.Start()
 	// Set process state to failed
 	if err != nil {
-		if curState, swapErr := p.swapState(StateStarting, StateFailed); swapErr != nil {
+		if curState, swapErr := p.swapState(StateStarting, StateStopped); swapErr != nil {
 			p.state = StateStopped // force it into a stopped state
 			return fmt.Errorf(
 				"failed to start command and state swap failed. command error: %v, current state: %v, state swap error: %v",
 				err, curState, swapErr,
@@ -217,20 +220,7 @@ func (p *Process) start() error {
 	}
 	// Capture the exit error for later signalling
-	go func() {
-		exitErr := p.cmd.Wait()
-		p.proxyLogger.Debugf("<%s> cmd.Wait() returned error: %v", p.ID, exitErr)
-		// there is a race condition when SIGKILL is used, p.cmd.Wait() returns, and then
-		// the code below fires, putting an error into cmdWaitChan. This code is to prevent this
-		if p.upstreamWasStoppedWithKill {
-			p.proxyLogger.Debugf("<%s> process was killed, NOT sending exitErr: %v", p.ID, exitErr)
-			p.upstreamWasStoppedWithKill = false
-			return
-		}
-		p.cmdWaitChan <- exitErr
-	}()
+	go p.waitForCmd()
 	// One of three things can happen at this stage:
 	// 1. The command exits unexpectedly
@@ -246,67 +236,38 @@ func (p *Process) start() error {
 	// a "none" means don't check for health ... I could have picked a better word :facepalm:
 	if checkEndpoint != "none" {
 		// keep default behaviour
 		if checkEndpoint == "" {
 			checkEndpoint = "/health"
 		}
 		proxyTo := p.config.Proxy
 		healthURL, err := url.JoinPath(proxyTo, checkEndpoint)
 		if err != nil {
 			return fmt.Errorf("failed to create health check URL proxy=%s and checkEndpoint=%s", proxyTo, checkEndpoint)
 		}
-		checkDeadline, cancelHealthCheck := context.WithDeadline(
-			context.Background(),
-			checkStartTime.Add(maxDuration),
-		)
-		defer cancelHealthCheck()
-	loop:
 		// Ready Check loop
 		for {
-			select {
+			currentState := p.CurrentState()
-			case <-checkDeadline.Done():
+			if currentState != StateStarting {
-				if curState, err := p.swapState(StateStarting, StateFailed); err != nil {
+				if currentState == StateStopped {
-					return fmt.Errorf("health check timed out after %vs AND state swap failed: %v, current state: %v", maxDuration.Seconds(), err, curState)
+					return fmt.Errorf("upstream command exited prematurely but successfully")
 				} else {
 					return fmt.Errorf("health check timed out after %vs", maxDuration.Seconds())
 				}
-			case <-p.shutdownCtx.Done():
 				return errors.New("health check interrupted due to shutdown")
-			case exitErr := <-p.cmdWaitChan:
-				if exitErr != nil {
-					p.proxyLogger.Warnf("<%s> upstream command exited prematurely with error: %v", p.ID, exitErr)
-					if curState, err := p.swapState(StateStarting, StateFailed); err != nil {
-						return fmt.Errorf("upstream command exited unexpectedly: %s AND state swap failed: %v, current state: %v", exitErr.Error(), err, curState)
-					} else {
-						return fmt.Errorf("upstream command exited unexpectedly: %s", exitErr.Error())
-					}
-				} else {
-					p.proxyLogger.Warnf("<%s> upstream command exited prematurely but successfully", p.ID)
-					if curState, err := p.swapState(StateStarting, StateFailed); err != nil {
-						return fmt.Errorf("upstream command exited prematurely but successfully AND state swap failed: %v, current state: %v", err, curState)
-					} else {
-						return fmt.Errorf("upstream command exited prematurely but successfully")
-					}
-				}
-			default:
-				if err := p.checkHealthEndpoint(healthURL); err == nil {
-					p.proxyLogger.Infof("<%s> Health check passed on %s", p.ID, healthURL)
-					cancelHealthCheck()
-					break loop
-				} else {
-					if strings.Contains(err.Error(), "connection refused") {
-						endTime, _ := checkDeadline.Deadline()
-						ttl := time.Until(endTime)
-						p.proxyLogger.Debugf("<%s> Connection refused on %s, giving up in %.0fs (normal during startup)", p.ID, healthURL, ttl.Seconds())
-					} else {
-						p.proxyLogger.Debugf("<%s> Health check error on %s, %v (normal during startup)", p.ID, healthURL, err)
-					}
-				}
 			}
+			if time.Since(checkStartTime) > maxDuration {
+				p.stopCommand()
+				return fmt.Errorf("health check timed out after %vs", maxDuration.Seconds())
+			}
+			if err := p.checkHealthEndpoint(healthURL); err == nil {
+				p.proxyLogger.Infof("<%s> Health check passed on %s", p.ID, healthURL)
+				break
+			} else {
+				if strings.Contains(err.Error(), "connection refused") {
+					ttl := time.Until(checkStartTime.Add(maxDuration))
+					p.proxyLogger.Debugf("<%s> Connection refused on %s, giving up in %.0fs (normal during startup)", p.ID, healthURL, ttl.Seconds())
+				} else {
+					p.proxyLogger.Debugf("<%s> Health check error on %s, %v (normal during startup)", p.ID, healthURL, err)
+				}
+			}
 			<-time.After(p.healthCheckLoopInterval)
 		}
 	}
@@ -337,6 +298,7 @@ func (p *Process) start() error {
 	if curState, err := p.swapState(StateStarting, StateReady); err != nil {
 		return fmt.Errorf("failed to set Process state to ready: current state: %v, error: %v", curState, err)
 	} else {
+		p.failedStartCount = 0
 		return nil
 	}
 }
@@ -361,26 +323,12 @@ func (p *Process) StopImmediately() {
 	}
 	p.proxyLogger.Debugf("<%s> Stopping process, current state: %s", p.ID, p.CurrentState())
-	currentState := p.CurrentState()
-
-	if currentState == StateFailed {
-		if curState, err := p.swapState(StateFailed, StateStopping); err != nil {
-			p.proxyLogger.Infof("<%s> Stop() Failed -> StateStopping err: %v, current state: %v", p.ID, err, curState)
-			return
-		}
-	} else {
-		if curState, err := p.swapState(StateReady, StateStopping); err != nil {
-			p.proxyLogger.Infof("<%s> Stop() Ready -> StateStopping err: %v, current state: %v", p.ID, err, curState)
-			return
-		}
-	}
-	// stop the process with a graceful exit timeout
-	p.stopCommand(p.gracefulStopTimeout)
-	if curState, err := p.swapState(StateStopping, StateStopped); err != nil {
-		p.proxyLogger.Infof("<%s> Stop() StateStopping -> StateStopped err: %v, current state: %v", p.ID, err, curState)
-	}
+	if curState, err := p.swapState(StateReady, StateStopping); err != nil {
+		p.proxyLogger.Infof("<%s> Stop() Ready -> StateStopping err: %v, current state: %v", p.ID, err, curState)
+		return
+	}
+	p.stopCommand()
 }
 // Shutdown is called when llama-swap is shutting down. It will give a little bit
@@ -392,91 +340,26 @@ func (p *Process) Shutdown() {
 		return
 	}
-	p.shutdownCancel()
-	p.stopCommand(p.gracefulStopTimeout)
+	p.stopCommand()
 	// just force it to this state since there is no recovery from shutdown
 	p.state = StateShutdown
 }
 // stopCommand will send a SIGTERM to the process and wait for it to exit.
 // If it does not exit within 5 seconds, it will send a SIGKILL.
-func (p *Process) stopCommand(sigtermTTL time.Duration) {
+func (p *Process) stopCommand() {
 	stopStartTime := time.Now()
 	defer func() {
 		p.proxyLogger.Debugf("<%s> stopCommand took %v", p.ID, time.Since(stopStartTime))
 	}()
-	sigtermTimeout, cancelTimeout := context.WithTimeout(context.Background(), sigtermTTL)
-	defer cancelTimeout()
-	if p.cmd == nil || p.cmd.Process == nil {
-		p.proxyLogger.Debugf("<%s> cmd or cmd.Process is nil (normal during config reload)", p.ID)
+	if p.cancelUpstream == nil {
+		p.proxyLogger.Errorf("<%s> stopCommand has a nil p.cancelUpstream()", p.ID)
 		return
 	}
-	// if err := p.terminateProcess(); err != nil {
-	// 	p.proxyLogger.Debugf("<%s> Process already terminated: %v (normal during shutdown)", p.ID, err)
-	// }
-	// the default cmdStop to taskkill /f /t /pid ${PID}
-	if runtime.GOOS == "windows" && strings.TrimSpace(p.config.CmdStop) == "" {
-		p.config.CmdStop = "taskkill /f /t /pid ${PID}"
-	}
-	if p.config.CmdStop != "" {
-		// replace ${PID} with the pid of the process
-		stopArgs, err := SanitizeCommand(strings.ReplaceAll(p.config.CmdStop, "${PID}", fmt.Sprintf("%d", p.cmd.Process.Pid)))
-		if err != nil {
-			p.proxyLogger.Errorf("<%s> Failed to sanitize stop command: %v", p.ID, err)
-			return
-		}
-		p.proxyLogger.Debugf("<%s> Executing stop command: %s", p.ID, strings.Join(stopArgs, " "))
-		stopCmd := exec.Command(stopArgs[0], stopArgs[1:]...)
-		stopCmd.Stdout = p.processLogger
-		stopCmd.Stderr = p.processLogger
-		stopCmd.Env = p.config.Env
-		if err := stopCmd.Run(); err != nil {
-			p.proxyLogger.Errorf("<%s> Failed to exec stop command: %v", p.ID, err)
-			return
-		}
-	} else {
-		if err := p.cmd.Process.Signal(syscall.SIGTERM); err != nil {
-			p.proxyLogger.Errorf("<%s> Failed to send SIGTERM to process: %v", p.ID, err)
-			return
-		}
-	}
-	select {
-	case <-sigtermTimeout.Done():
-		p.proxyLogger.Debugf("<%s> Process timed out waiting to stop, sending KILL signal (normal during shutdown)", p.ID)
-		p.upstreamWasStoppedWithKill = true
-		if err := p.cmd.Process.Kill(); err != nil {
-			p.proxyLogger.Errorf("<%s> Failed to kill process: %v", p.ID, err)
-		}
-	case err := <-p.cmdWaitChan:
-		// Note: in start(), p.cmdWaitChan also has a select { ... }. That should be OK
-		// because if we make it here then the cmd has been successfully running and made it
-		// through the health check. There is a possibility that the cmd crashed after the health check
-		// succeeded but that's not a case llama-swap is handling for now.
-		if err != nil {
-			if errno, ok := err.(syscall.Errno); ok {
-				p.proxyLogger.Errorf("<%s> errno >> %v", p.ID, errno)
-			} else if exitError, ok := err.(*exec.ExitError); ok {
-				if strings.Contains(exitError.String(), "signal: terminated") {
-					p.proxyLogger.Debugf("<%s> Process stopped OK", p.ID)
-				} else if strings.Contains(exitError.String(), "signal: interrupt") {
-					p.proxyLogger.Debugf("<%s> Process interrupted OK", p.ID)
-				} else {
-					p.proxyLogger.Warnf("<%s> ExitError >> %v, exit code: %d", p.ID, exitError, exitError.ExitCode())
-				}
-			} else {
-				p.proxyLogger.Errorf("<%s> Process exited >> %v", p.ID, err)
-			}
-		}
-	}
+	p.cancelUpstream()
+	<-p.cmdWaitChan
 }
 func (p *Process) checkHealthEndpoint(healthURL string) error {
@@ -509,7 +392,7 @@ func (p *Process) ProxyRequest(w http.ResponseWriter, r *http.Request) {
 	// prevent new requests from being made while stopping or irrecoverable
 	currentState := p.CurrentState()
-	if currentState == StateFailed || currentState == StateShutdown || currentState == StateStopping {
+	if currentState == StateShutdown || currentState == StateStopping {
 		http.Error(w, fmt.Sprintf("Process can not ProxyRequest, state is %s", currentState), http.StatusServiceUnavailable)
 		return
 	}
@@ -591,3 +474,79 @@ func (p *Process) ProxyRequest(w http.ResponseWriter, r *http.Request) {
 	p.proxyLogger.Debugf("<%s> request %s - start: %v, total: %v",
 		p.ID, r.RequestURI, startDuration, totalTime)
 }
+// waitForCmd waits for the command to exit and handles exit conditions depending on current state
+func (p *Process) waitForCmd() {
+	exitErr := p.cmd.Wait()
+	p.proxyLogger.Debugf("<%s> cmd.Wait() returned error: %v", p.ID, exitErr)
+	if exitErr != nil {
+		if errno, ok := exitErr.(syscall.Errno); ok {
+			p.proxyLogger.Errorf("<%s> errno >> %v", p.ID, errno)
+		} else if exitError, ok := exitErr.(*exec.ExitError); ok {
+			if strings.Contains(exitError.String(), "signal: terminated") {
+				p.proxyLogger.Debugf("<%s> Process stopped OK", p.ID)
+			} else if strings.Contains(exitError.String(), "signal: interrupt") {
+				p.proxyLogger.Debugf("<%s> Process interrupted OK", p.ID)
+			} else {
+				p.proxyLogger.Warnf("<%s> ExitError >> %v, exit code: %d", p.ID, exitError, exitError.ExitCode())
+			}
+		} else {
+			if exitErr.Error() != "context canceled" /* this is normal */ {
+				p.proxyLogger.Errorf("<%s> Process exited >> %v", p.ID, exitErr)
+			}
+		}
+	}
+	currentState := p.CurrentState()
+	switch currentState {
+	case StateStopping:
+		if curState, err := p.swapState(StateStopping, StateStopped); err != nil {
+			p.proxyLogger.Errorf("<%s> Process exited but could not swap to StateStopped. curState=%s, err: %v", p.ID, curState, err)
+			p.state = StateStopped
+		}
+	default:
+		p.proxyLogger.Infof("<%s> process exited but not StateStopping, current state: %s", p.ID, currentState)
+		p.state = StateStopped // force it to be in this state
+	}
+	close(p.cmdWaitChan)
+}
+// cmdStopUpstreamProcess attempts to stop the upstream process gracefully
+func (p *Process) cmdStopUpstreamProcess() error {
+	p.processLogger.Debugf("<%s> cmdStopUpstreamProcess() initiating graceful stop of upstream process", p.ID)
+	// this should never happen ...
+	if p.cmd == nil || p.cmd.Process == nil {
+		p.proxyLogger.Debugf("<%s> cmd or cmd.Process is nil (normal during config reload)", p.ID)
+		return fmt.Errorf("<%s> process is nil or cmd is nil, skipping graceful stop", p.ID)
+	}
+	if p.config.CmdStop != "" {
+		// replace ${PID} with the pid of the process
+		stopArgs, err := SanitizeCommand(strings.ReplaceAll(p.config.CmdStop, "${PID}", fmt.Sprintf("%d", p.cmd.Process.Pid)))
+		if err != nil {
+			p.proxyLogger.Errorf("<%s> Failed to sanitize stop command: %v", p.ID, err)
+			return err
+		}
+		p.proxyLogger.Debugf("<%s> Executing stop command: %s", p.ID, strings.Join(stopArgs, " "))
+		stopCmd := exec.Command(stopArgs[0], stopArgs[1:]...)
+		stopCmd.Stdout = p.processLogger
+		stopCmd.Stderr = p.processLogger
+		stopCmd.Env = p.cmd.Env
+		if err := stopCmd.Run(); err != nil {
+			p.proxyLogger.Errorf("<%s> Failed to exec stop command: %v", p.ID, err)
+			return err
+		}
+	} else {
+		if err := p.cmd.Process.Signal(syscall.SIGTERM); err != nil {
+			p.proxyLogger.Errorf("<%s> Failed to send SIGTERM to process: %v", p.ID, err)
+			return err
+		}
+	}
+	return nil
+}
@@ -106,8 +106,8 @@ func TestProcess_BrokenModelConfig(t *testing.T) {
 	w = httptest.NewRecorder()
 	process.ProxyRequest(w, req)
-	assert.Equal(t, http.StatusServiceUnavailable, w.Code)
+	assert.Equal(t, http.StatusBadGateway, w.Code)
-	assert.Contains(t, w.Body.String(), "Process can not ProxyRequest, state is failed")
+	assert.Contains(t, w.Body.String(), "start() failed: ")
 }
 func TestProcess_UnloadAfterTTL(t *testing.T) {
@@ -248,18 +248,14 @@ func TestProcess_SwapState(t *testing.T) {
 	}{
 		{"Stopped to Starting", StateStopped, StateStopped, StateStarting, nil, StateStarting},
 		{"Starting to Ready", StateStarting, StateStarting, StateReady, nil, StateReady},
-		{"Starting to Failed", StateStarting, StateStarting, StateFailed, nil, StateFailed},
 		{"Starting to Stopping", StateStarting, StateStarting, StateStopping, nil, StateStopping},
+		{"Starting to Stopped", StateStarting, StateStarting, StateStopped, nil, StateStopped},
 		{"Ready to Stopping", StateReady, StateReady, StateStopping, nil, StateStopping},
 		{"Stopping to Stopped", StateStopping, StateStopping, StateStopped, nil, StateStopped},
 		{"Stopping to Shutdown", StateStopping, StateStopping, StateShutdown, nil, StateShutdown},
 		{"Stopped to Ready", StateStopped, StateStopped, StateReady, ErrInvalidStateTransition, StateStopped},
-		{"Starting to Stopped", StateStarting, StateStarting, StateStopped, ErrInvalidStateTransition, StateStarting},
 		{"Ready to Starting", StateReady, StateReady, StateStarting, ErrInvalidStateTransition, StateReady},
-		{"Ready to Failed", StateReady, StateReady, StateFailed, ErrInvalidStateTransition, StateReady},
 		{"Stopping to Ready", StateStopping, StateStopping, StateReady, ErrInvalidStateTransition, StateStopping},
-		{"Failed to Stopped", StateFailed, StateFailed, StateStopped, ErrInvalidStateTransition, StateFailed},
-		{"Failed to Starting", StateFailed, StateFailed, StateStarting, ErrInvalidStateTransition, StateFailed},
 		{"Shutdown to Stopped", StateShutdown, StateShutdown, StateStopped, ErrInvalidStateTransition, StateShutdown},
 		{"Shutdown to Starting", StateShutdown, StateShutdown, StateStarting, ErrInvalidStateTransition, StateShutdown},
 		{"Expected state mismatch", StateStopped, StateStarting, StateStarting, ErrExpectedStateMismatch, StateStopped},
@@ -339,7 +335,7 @@ func TestProcess_ExitInterruptsHealthCheck(t *testing.T) {
 	process.healthCheckLoopInterval = time.Second // make it faster
 	err := process.start()
 	assert.Equal(t, "upstream command exited prematurely but successfully", err.Error())
-	assert.Equal(t, process.CurrentState(), StateFailed)
+	assert.Equal(t, process.CurrentState(), StateStopped)
 }
 func TestProcess_ConcurrencyLimit(t *testing.T) {
@@ -398,6 +394,9 @@ func TestProcess_StopImmediately(t *testing.T) {
 // Test that SIGKILL is sent when gracefulStopTimeout is reached and properly terminates
 // the upstream command
 func TestProcess_ForceStopWithKill(t *testing.T) {
+	if runtime.GOOS == "windows" {
+		t.Skip("skipping SIGTERM test on Windows ")
+	}
 	expectedMessage := "test_sigkill"
 	binaryPath := getSimpleResponderPath()
@@ -468,3 +467,27 @@ func TestProcess_StopCmd(t *testing.T) {
 	process.StopImmediately()
 	assert.Equal(t, process.CurrentState(), StateStopped)
 }
+func TestProcess_EnvironmentSetCorrectly(t *testing.T) {
+	expectedMessage := "test_env_not_emptied"
+	config := getTestSimpleResponderConfig(expectedMessage)
+	// ensure that the default config does not blank out the inherited environment
+	configWEnv := config
+	// ensure the additional variables are appended to the process' environment
+	configWEnv.Env = append(configWEnv.Env, "TEST_ENV1=1", "TEST_ENV2=2")
+	process1 := NewProcess("env_test", 2, config, debugLogger, debugLogger)
+	process2 := NewProcess("env_test", 2, configWEnv, debugLogger, debugLogger)
+	process1.start()
+	defer process1.Stop()
+	process2.start()
+	defer process2.Stop()
+	assert.NotZero(t, len(process1.cmd.Environ()))
+	assert.NotZero(t, len(process2.cmd.Environ()))
+	assert.Equal(t, len(process1.cmd.Environ())+2, len(process2.cmd.Environ()), "process2 should have 2 more environment variables than process1")
+}
@@ -2,13 +2,12 @@ package proxy
 import (
 	"bytes"
-	"encoding/json"
+	"context"
 	"fmt"
 	"io"
 	"mime/multipart"
 	"net/http"
 	"os"
-	"sort"
 	"strconv"
 	"strings"
 	"sync"
@@ -35,6 +34,10 @@ type ProxyManager struct {
 	muxLogger      *LogMonitor
 	processGroups map[string]*ProcessGroup
+	// shutdown signaling
+	shutdownCtx    context.Context
+	shutdownCancel context.CancelFunc
 }
 func New(config Config) *ProxyManager {
@@ -65,6 +68,8 @@ func New(config Config) *ProxyManager {
 		upstreamLogger.SetLogLevel(LevelInfo)
 	}
+	shutdownCtx, shutdownCancel := context.WithCancel(context.Background())
 	pm := &ProxyManager{
 		config:    config,
 		ginEngine: gin.New(),
@@ -74,6 +79,9 @@ func New(config Config) *ProxyManager {
 		upstreamLogger: upstreamLogger,
 		processGroups: make(map[string]*ProcessGroup),
+		shutdownCtx:    shutdownCtx,
+		shutdownCancel: shutdownCancel,
 	}
 	// create the process groups
@@ -159,42 +167,61 @@ func (pm *ProxyManager) setupGinEngine() {
 	// in proxymanager_loghandlers.go
 	pm.ginEngine.GET("/logs", pm.sendLogsHandlers)
 	pm.ginEngine.GET("/logs/stream", pm.streamLogsHandler)
 	pm.ginEngine.GET("/logs/streamSSE", pm.streamLogsHandlerSSE)
 	pm.ginEngine.GET("/logs/stream/:logMonitorID", pm.streamLogsHandler)
 	pm.ginEngine.GET("/logs/streamSSE/:logMonitorID", pm.streamLogsHandlerSSE)
-	pm.ginEngine.GET("/upstream", pm.upstreamIndex)
+	/**
+	 * User Interface Endpoints
+	 */
+	pm.ginEngine.GET("/", func(c *gin.Context) {
+		c.Redirect(http.StatusFound, "/ui")
+	})
+	pm.ginEngine.GET("/upstream", func(c *gin.Context) {
+		c.Redirect(http.StatusFound, "/ui/models")
+	})
 	pm.ginEngine.Any("/upstream/:model_id/*upstreamPath", pm.proxyToUpstream)
 	pm.ginEngine.GET("/unload", pm.unloadAllModelsHandler)
 	pm.ginEngine.GET("/running", pm.listRunningProcessesHandler)
-	pm.ginEngine.GET("/", func(c *gin.Context) {
-		// Set the Content-Type header to text/html
-		c.Header("Content-Type", "text/html")
-		// Write the embedded HTML content to the response
-		htmlData, err := getHTMLFile("index.html")
-		if err != nil {
-			c.String(http.StatusInternalServerError, err.Error())
-			return
-		}
-		_, err = c.Writer.Write(htmlData)
-		if err != nil {
-			c.String(http.StatusInternalServerError, fmt.Sprintf("failed to write response: %v", err))
-			return
-		}
-	})
 	pm.ginEngine.GET("/favicon.ico", func(c *gin.Context) {
-		if data, err := getHTMLFile("favicon.ico"); err == nil {
+		if data, err := reactStaticFS.ReadFile("ui_dist/favicon.ico"); err == nil {
 			c.Data(http.StatusOK, "image/x-icon", data)
 		} else {
 			c.String(http.StatusInternalServerError, err.Error())
 		}
 	})
+	reactFS, err := GetReactFS()
+	if err != nil {
+		pm.proxyLogger.Errorf("Failed to load React filesystem: %v", err)
+	} else {
+		// serve files that exist under /ui/*
+		pm.ginEngine.StaticFS("/ui", reactFS)
+		// serve the SPA for UI routes under /ui/*
+		pm.ginEngine.NoRoute(func(c *gin.Context) {
+			if !strings.HasPrefix(c.Request.URL.Path, "/ui") {
+				c.AbortWithStatus(http.StatusNotFound)
+				return
+			}
+			file, err := reactFS.Open("index.html")
+			if err != nil {
+				c.String(http.StatusInternalServerError, err.Error())
+				return
+			}
+			defer file.Close()
+			http.ServeContent(c.Writer, c.Request, "index.html", time.Now(), file)
+		})
+	}
+	// see: proxymanager_api.go
+	// add API handler functions
+	addApiHandlers(pm)
 	// Disable console color for testing
 	gin.DisableConsoleColor()
 }
@@ -242,6 +269,7 @@ func (pm *ProxyManager) Shutdown() {
 		}(processGroup)
 	}
 	wg.Wait()
+	pm.shutdownCancel()
 }
 func (pm *ProxyManager) swapProcessGroup(requestedModel string) (*ProcessGroup, string, error) {
@@ -269,32 +297,41 @@ func (pm *ProxyManager) swapProcessGroup(requestedModel string) (*ProcessGroup,
 }
 func (pm *ProxyManager) listModelsHandler(c *gin.Context) {
-	data := []interface{}{}
+	data := make([]gin.H, 0, len(pm.config.Models))
+	createdTime := time.Now().Unix()
 	for id, modelConfig := range pm.config.Models {
 		if modelConfig.Unlisted {
 			continue
 		}
-		data = append(data, map[string]interface{}{
+		record := gin.H{
 			"id":       id,
 			"object":   "model",
-			"created":  time.Now().Unix(),
+			"created":  createdTime,
 			"owned_by": "llama-swap",
-		})
+		}
+		if name := strings.TrimSpace(modelConfig.Name); name != "" {
+			record["name"] = name
+		}
+		if desc := strings.TrimSpace(modelConfig.Description); desc != "" {
+			record["description"] = desc
+		}
+		data = append(data, record)
 	}
-	// Set the Content-Type header to application/json
-	c.Header("Content-Type", "application/json")
-	if origin := c.Request.Header.Get("Origin"); origin != "" {
+	// Set CORS headers if origin exists
+	if origin := c.GetHeader("Origin"); origin != "" {
 		c.Header("Access-Control-Allow-Origin", origin)
 	}
-	// Encode the data as JSON and write it to the response writer
-	if err := json.NewEncoder(c.Writer).Encode(map[string]interface{}{"object": "list", "data": data}); err != nil {
-		pm.sendErrorResponse(c, http.StatusInternalServerError, fmt.Sprintf("error encoding JSON %s", err.Error()))
-		return
-	}
+	// Use gin's JSON method which handles content-type and encoding
+	c.JSON(http.StatusOK, gin.H{
+		"object": "list",
+		"data":   data,
+	})
 }
 func (pm *ProxyManager) proxyToUpstream(c *gin.Context) {
@@ -316,57 +353,6 @@ func (pm *ProxyManager) proxyToUpstream(c *gin.Context) {
 	processGroup.ProxyRequest(requestedModel, c.Writer, c.Request)
 }
-func (pm *ProxyManager) upstreamIndex(c *gin.Context) {
-	var html strings.Builder
-	html.WriteString("<!doctype HTML>\n<html><body><h1>Available Models</h1><a href=\"/unload\">Unload all models</a><ul>")
-	// Extract keys and sort them
-	var modelIDs []string
-	for modelID, modelConfig := range pm.config.Models {
-		if modelConfig.Unlisted {
-			continue
-		}
-		modelIDs = append(modelIDs, modelID)
-	}
-	sort.Strings(modelIDs)
-	// Iterate over sorted keys
-	for _, modelID := range modelIDs {
-		// Get process state
-		processGroup := pm.findGroupByModelName(modelID)
-		var state string
-		if processGroup != nil {
-			process := processGroup.processes[modelID]
-			if process != nil {
-				var stateStr string
-				switch process.CurrentState() {
-				case StateReady:
-					stateStr = "Ready"
-				case StateStarting:
-					stateStr = "Starting"
-				case StateStopping:
-					stateStr = "Stopping"
-				case StateFailed:
-					stateStr = "Failed"
-				case StateShutdown:
-					stateStr = "Shutdown"
-				case StateStopped:
-					stateStr = "Stopped"
-				default:
-					stateStr = "Unknown"
-				}
-				state = stateStr
-			}
-		}
-		html.WriteString(fmt.Sprintf("<li><a href=\"/upstream/%s\">%s</a> - %s</li>", modelID, modelID, state))
-	}
-	html.WriteString("</ul></body></html>")
-	c.Header("Content-Type", "text/html")
-	c.String(http.StatusOK, html.String())
-}
 func (pm *ProxyManager) proxyOAIHandler(c *gin.Context) {
 	bodyBytes, err := io.ReadAll(c.Request.Body)
 	if err != nil {
@@ -396,6 +382,21 @@ func (pm *ProxyManager) proxyOAIHandler(c *gin.Context) {
 		}
 	}
+	// issue #174 strip parameters from the JSON body
+	stripParams, err := pm.config.Models[realModelName].Filters.SanitizedStripParams()
+	if err != nil { // just log it and continue
+		pm.proxyLogger.Errorf("Error sanitizing strip params string: %s, %s", pm.config.Models[realModelName].Filters.StripParams, err.Error())
+	} else {
+		for _, param := range stripParams {
+			pm.proxyLogger.Debugf("<%s> stripping param: %s", realModelName, param)
+			bodyBytes, err = sjson.DeleteBytes(bodyBytes, param)
+			if err != nil {
+				pm.sendErrorResponse(c, http.StatusInternalServerError, fmt.Sprintf("error deleting parameter %s from request", param))
+				return
+			}
+		}
+	}
 	c.Request.Body = io.NopCloser(bytes.NewBuffer(bodyBytes))
 	// dechunk it as we already have all the body bytes see issue #11
@@ -0,0 +1,169 @@
 package proxy
 import (
 	"context"
 	"encoding/json"
 	"net/http"
 	"sort"
 	"github.com/gin-gonic/gin"
 	"github.com/kelindar/event"
 )
 type Model struct {
 	Id          string `json:"id"`
 	Name        string `json:"name"`
 	Description string `json:"description"`
 	State       string `json:"state"`
 }
 func addApiHandlers(pm *ProxyManager) {
 	// Add API endpoints for React to consume
 	apiGroup := pm.ginEngine.Group("/api")
 	{
 		apiGroup.POST("/models/unload", pm.apiUnloadAllModels)
 		apiGroup.GET("/events", pm.apiSendEvents)
 	}
 }
 func (pm *ProxyManager) apiUnloadAllModels(c *gin.Context) {
 	pm.StopProcesses(StopImmediately)
 	c.JSON(http.StatusOK, gin.H{"msg": "ok"})
 }
 func (pm *ProxyManager) getModelStatus() []Model {
 	// Extract keys and sort them
 	models := []Model{}
 	modelIDs := make([]string, 0, len(pm.config.Models))
 	for modelID := range pm.config.Models {
 		modelIDs = append(modelIDs, modelID)
 	}
 	sort.Strings(modelIDs)
 	// Iterate over sorted keys
 	for _, modelID := range modelIDs {
 		// Get process state
 		processGroup := pm.findGroupByModelName(modelID)
 		state := "unknown"
 		if processGroup != nil {
 			process := processGroup.processes[modelID]
 			if process != nil {
 				var stateStr string
 				switch process.CurrentState() {
 				case StateReady:
 					stateStr = "ready"
 				case StateStarting:
 					stateStr = "starting"
 				case StateStopping:
 					stateStr = "stopping"
 				case StateShutdown:
 					stateStr = "shutdown"
 				case StateStopped:
 					stateStr = "stopped"
 				default:
 					stateStr = "unknown"
 				}
 				state = stateStr
 			}
 		}
 		models = append(models, Model{
 			Id:          modelID,
 			Name:        pm.config.Models[modelID].Name,
 			Description: pm.config.Models[modelID].Description,
 			State:       state,
 		})
 	}
 	return models
 }
 type messageType string
 const (
 	msgTypeModelStatus messageType = "modelStatus"
 	msgTypeLogData     messageType = "logData"
 )
 type messageEnvelope struct {
 	Type messageType `json:"type"`
 	Data string      `json:"data"`
 }
 // sends a stream of different message types that happen on the server
 func (pm *ProxyManager) apiSendEvents(c *gin.Context) {
 	c.Header("Content-Type", "text/event-stream")
 	c.Header("Cache-Control", "no-cache")
 	c.Header("Connection", "keep-alive")
 	c.Header("X-Content-Type-Options", "nosniff")
 	sendBuffer := make(chan messageEnvelope, 25)
 	ctx, cancel := context.WithCancel(c.Request.Context())
 	sendModels := func() {
 		data, err := json.Marshal(pm.getModelStatus())
 		if err == nil {
 			msg := messageEnvelope{Type: msgTypeModelStatus, Data: string(data)}
 			select {
 			case sendBuffer <- msg:
 			case <-ctx.Done():
 				return
 			default:
 			}
 		}
 	}
 	sendLogData := func(source string, data []byte) {
 		data, err := json.Marshal(gin.H{
 			"source": source,
 			"data":   string(data),
 		})
 		if err == nil {
 			select {
 			case sendBuffer <- messageEnvelope{Type: msgTypeLogData, Data: string(data)}:
 			case <-ctx.Done():
 				return
 			default:
 			}
 		}
 	}
 	/**
 	 * Send updated models list
 	 */
 	defer event.On(func(e ProcessStateChangeEvent) {
 		sendModels()
 	})()
 	defer event.On(func(e ConfigFileChangedEvent) {
 		sendModels()
 	})()
 	/**
 	 * Send Log data
 	 */
 	defer pm.proxyLogger.OnLogData(func(data []byte) {
 		sendLogData("proxy", data)
 	})()
 	defer pm.upstreamLogger.OnLogData(func(data []byte) {
 		sendLogData("upstream", data)
 	})()
 	// send initial batch of data
 	sendLogData("proxy", pm.proxyLogger.GetHistory())
 	sendLogData("upstream", pm.upstreamLogger.GetHistory())
 	sendModels()
 	for {
 		select {
 		case <-c.Request.Context().Done():
 			cancel()
 			return
 		case <-pm.shutdownCtx.Done():
 			cancel()
 			return
 		case msg := <-sendBuffer:
 			c.SSEvent("message", msg)
 			c.Writer.Flush()
 		}
 	}
 }
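The envelope sent by `apiSendEvents` carries its payload as a JSON string inside a JSON object, so a client has to parse twice. A minimal TypeScript sketch of that decoding step follows. The `decodeEvent` function, the `Decoded` union, and the `ModelStatusEntry` and `LogChunk` interfaces are hypothetical names introduced for illustration; the lowercase field names are assumed to match the `Model` interface used by the UI.

```typescript
// Hypothetical types mirroring the Go messageEnvelope and its two payloads.
interface Envelope {
  type: string; // "modelStatus" or "logData"
  data: string; // JSON-encoded payload, so it must be parsed a second time
}

interface ModelStatusEntry {
  id: string;
  state: string;
}

interface LogChunk {
  source: string;
  data: string;
}

type Decoded =
  | { kind: "models"; models: ModelStatusEntry[] }
  | { kind: "log"; chunk: LogChunk };

// decodeEvent parses the outer envelope first, then the inner payload.
function decodeEvent(raw: string): Decoded {
  const envelope = JSON.parse(raw) as Envelope;
  switch (envelope.type) {
    case "modelStatus":
      return { kind: "models", models: JSON.parse(envelope.data) as ModelStatusEntry[] };
    case "logData":
      return { kind: "log", chunk: JSON.parse(envelope.data) as LogChunk };
    default:
      throw new Error(`unknown event type: ${envelope.type}`);
  }
}

// Example input shaped like a single SSE "message" body.
const sampleEvent = JSON.stringify({
  type: "logData",
  data: JSON.stringify({ source: "proxy", data: "hello\n" }),
});
console.log(decodeEvent(sampleEvent));
```

Returning a tagged union rather than calling state setters directly keeps the parsing step testable without a browser or an `EventSource`.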
@@ -1,6 +1,7 @@
 package proxy
 import (
 	"context"
 	"fmt"
 	"net/http"
 	"strings"
@@ -11,20 +12,7 @@ import (
 func (pm *ProxyManager) sendLogsHandlers(c *gin.Context) {
 	accept := c.GetHeader("Accept")
 	if strings.Contains(accept, "text/html") {
-		// Set the Content-Type header to text/html
+		c.Redirect(http.StatusFound, "/ui/")
 		c.Header("Content-Type", "text/html")
 		// Write the embedded HTML content to the response
 		logsHTML, err := getHTMLFile("logs.html")
 		if err != nil {
 			c.String(http.StatusInternalServerError, err.Error())
 			return
 		}
 		_, err = c.Writer.Write(logsHTML)
 		if err != nil {
 			c.String(http.StatusInternalServerError, fmt.Sprintf("failed to write response: %v", err))
 			return
 		}
 	} else {
 		c.Header("Content-Type", "text/plain")
 		history := pm.muxLogger.GetHistory()
@@ -47,10 +35,7 @@ func (pm *ProxyManager) streamLogsHandler(c *gin.Context) {
 		c.String(http.StatusBadRequest, err.Error())
 		return
 	}
 	ch := logger.Subscribe()
 	defer logger.Unsubscribe(ch)
 	notify := c.Request.Context().Done()
 	flusher, ok := c.Writer.(http.Flusher)
 	if !ok {
 		c.AbortWithError(http.StatusInternalServerError, fmt.Errorf("streaming unsupported"))
@@ -68,57 +53,28 @@ func (pm *ProxyManager) streamLogsHandler(c *gin.Context) {
 		}
 	}
-	// Stream new logs
+	sendChan := make(chan []byte, 10)
 	ctx, cancel := context.WithCancel(c.Request.Context())
 	defer logger.OnLogData(func(data []byte) {
 		select {
 		case sendChan <- data:
 		case <-ctx.Done():
 			return
 		default:
 		}
 	})()
 	for {
 		select {
-		case msg := <-ch:
-			_, err := c.Writer.Write(msg)
-			if err != nil {
-				// just break the loop if we can't write for some reason
-				return
-			}
+		case <-c.Request.Context().Done():
+			cancel()
+			return
+		case <-pm.shutdownCtx.Done():
+			cancel()
+			return
 		case data := <-sendChan:
 			c.Writer.Write(data)
 			flusher.Flush()
 		case <-notify:
 			return
 		}
 	}
 }
 func (pm *ProxyManager) streamLogsHandlerSSE(c *gin.Context) {
 	c.Header("Content-Type", "text/event-stream")
 	c.Header("Cache-Control", "no-cache")
 	c.Header("Connection", "keep-alive")
 	c.Header("X-Content-Type-Options", "nosniff")
 	logMonitorId := c.Param("logMonitorID")
 	logger, err := pm.getLogger(logMonitorId)
 	if err != nil {
 		c.String(http.StatusBadRequest, err.Error())
 		return
 	}
 	ch := logger.Subscribe()
 	defer logger.Unsubscribe(ch)
 	notify := c.Request.Context().Done()
 	// Send history first if not skipped
 	_, skipHistory := c.GetQuery("no-history")
 	if !skipHistory {
 		history := logger.GetHistory()
 		if len(history) != 0 {
 			c.SSEvent("message", string(history))
 			c.Writer.Flush()
 		}
 	}
 	// Stream new logs
 	for {
 		select {
 		case msg := <-ch:
 			c.SSEvent("message", string(msg))
 			c.Writer.Flush()
 		case <-notify:
 			return
 		}
 	}
 }
@@ -183,11 +183,20 @@ func TestProxyManager_SwapMultiProcessParallelRequests(t *testing.T) {
 }
 func TestProxyManager_ListModelsHandler(t *testing.T) {
 	model1Config := getTestSimpleResponderConfig("model1")
 	model1Config.Name = "Model 1"
 	model1Config.Description = "Model 1 description is used for testing"
 	model2Config := getTestSimpleResponderConfig("model2")
 	model2Config.Name = "     " // whitespace-only strings are ignored
 	model2Config.Description = "  "
 	config := Config{
 		HealthCheckTimeout: 15,
 		Models: map[string]ModelConfig{
-			"model1": getTestSimpleResponderConfig("model1"),
+			"model1": model1Config,
-			"model2": getTestSimpleResponderConfig("model2"),
+			"model2": model2Config,
 			"model3": getTestSimpleResponderConfig("model3"),
 		},
 		LogLevel: "error",
@@ -213,6 +222,7 @@ func TestProxyManager_ListModelsHandler(t *testing.T) {
 	var response struct {
 		Data []map[string]interface{} `json:"data"`
 	}
 	if err := json.Unmarshal(w.Body.Bytes(), &response); err != nil {
 		t.Fatalf("Failed to parse JSON response: %v", err)
 	}
@@ -227,6 +237,7 @@ func TestProxyManager_ListModelsHandler(t *testing.T) {
 		"model3": {},
 	}
 	// make all models
 	for _, model := range response.Data {
 		modelID, ok := model["id"].(string)
 		assert.True(t, ok, "model ID should be a string")
@@ -245,6 +256,21 @@ func TestProxyManager_ListModelsHandler(t *testing.T) {
 		ownedBy, ok := model["owned_by"].(string)
 		assert.True(t, ok, "owned_by should be a string")
 		assert.Equal(t, "llama-swap", ownedBy)
 		// check for optional name and description
 		if modelID == "model1" {
 			name, ok := model["name"].(string)
 			assert.True(t, ok, "name should be a string")
 			assert.Equal(t, "Model 1", name)
 			description, ok := model["description"].(string)
 			assert.True(t, ok, "description should be a string")
 			assert.Equal(t, "Model 1 description is used for testing", description)
 		} else {
 			_, exists := model["name"]
 			assert.False(t, exists, "unexpected name field for model: %s", modelID)
 			_, exists = model["description"]
 			assert.False(t, exists, "unexpected description field for model: %s", modelID)
 		}
 	}
 	// Ensure all expected models were returned
@@ -623,3 +649,37 @@ func TestProxyManager_ChatContentLength(t *testing.T) {
 	assert.Equal(t, "81", response["h_content_length"])
 	assert.Equal(t, "model1", response["responseMessage"])
 }
 func TestProxyManager_FiltersStripParams(t *testing.T) {
 	modelConfig := getTestSimpleResponderConfig("model1")
 	modelConfig.Filters = ModelFilters{
 		StripParams: "temperature, model, stream",
 	}
 	config := AddDefaultGroupToConfig(Config{
 		HealthCheckTimeout: 15,
 		LogLevel:           "error",
 		Models: map[string]ModelConfig{
 			"model1": modelConfig,
 		},
 	})
 	proxy := New(config)
 	defer proxy.StopProcesses(StopWaitForInflightRequest)
 	reqBody := `{"model":"model1", "temperature":0.1, "x_param":"123", "y_param":"abc", "stream":true}`
 	req := httptest.NewRequest("POST", "/v1/chat/completions", bytes.NewBufferString(reqBody))
 	w := httptest.NewRecorder()
 	proxy.ServeHTTP(w, req)
 	assert.Equal(t, http.StatusOK, w.Code)
 	var response map[string]string
 	assert.NoError(t, json.Unmarshal(w.Body.Bytes(), &response))
 	// `temperature` and `stream` are gone but model remains
 	assert.Equal(t, `{"model":"model1", "x_param":"123", "y_param":"abc"}`, response["request_body"])
 	// assert.Nil(t, response["temperature"])
 	// assert.Equal(t, "123", response["x_param"])
 	// assert.Equal(t, "abc", response["y_param"])
 	// t.Logf("%v", response)
 }
@@ -0,0 +1,24 @@
 package proxy
 import (
 	"embed"
 	"io/fs"
 	"net/http"
 )
 //go:embed ui_dist
 var reactStaticFS embed.FS
 // GetReactFS returns the embedded React filesystem
 func GetReactFS() (http.FileSystem, error) {
 	subFS, err := fs.Sub(reactStaticFS, "ui_dist")
 	if err != nil {
 		return nil, err
 	}
 	return http.FS(subFS), nil
 }
 // GetReactIndexHTML returns the main index.html for the React app
 func GetReactIndexHTML() ([]byte, error) {
 	return reactStaticFS.ReadFile("ui_dist/index.html")
 }
@@ -0,0 +1,25 @@
 .vite
 # Logs
 logs
 *.log
 npm-debug.log*
 yarn-debug.log*
 yarn-error.log*
 pnpm-debug.log*
 lerna-debug.log*
 node_modules
 dist
 dist-ssr
 *.local
 # Editor directories and files
 .vscode/*
 !.vscode/extensions.json
 .idea
 .DS_Store
 *.suo
 *.ntvs*
 *.njsproj
 *.sln
 *.sw?
@@ -0,0 +1,54 @@
 # React + TypeScript + Vite
 This template provides a minimal setup to get React working in Vite with HMR and some ESLint rules.
 Currently, two official plugins are available:
 - [@vitejs/plugin-react](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react) uses [Babel](https://babeljs.io/) for Fast Refresh
 - [@vitejs/plugin-react-swc](https://github.com/vitejs/vite-plugin-react/blob/main/packages/plugin-react-swc) uses [SWC](https://swc.rs/) for Fast Refresh
 ## Expanding the ESLint configuration
 If you are developing a production application, we recommend updating the configuration to enable type-aware lint rules:
 ```js
 export default tseslint.config({
  extends: [
    // Remove ...tseslint.configs.recommended and replace with this
    ...tseslint.configs.recommendedTypeChecked,
    // Alternatively, use this for stricter rules
    ...tseslint.configs.strictTypeChecked,
    // Optionally, add this for stylistic rules
    ...tseslint.configs.stylisticTypeChecked,
  ],
  languageOptions: {
    // other options...
    parserOptions: {
      project: ['./tsconfig.node.json', './tsconfig.app.json'],
      tsconfigRootDir: import.meta.dirname,
    },
  },
 })
 ```
 You can also install [eslint-plugin-react-x](https://github.com/Rel1cx/eslint-react/tree/main/packages/plugins/eslint-plugin-react-x) and [eslint-plugin-react-dom](https://github.com/Rel1cx/eslint-react/tree/main/packages/plugins/eslint-plugin-react-dom) for React-specific lint rules:
 ```js
 // eslint.config.js
 import reactX from 'eslint-plugin-react-x'
 import reactDom from 'eslint-plugin-react-dom'
 export default tseslint.config({
  plugins: {
    // Add the react-x and react-dom plugins
    'react-x': reactX,
    'react-dom': reactDom,
  },
  rules: {
    // other rules...
    // Enable its recommended typescript rules
    ...reactX.configs['recommended-typescript'].rules,
    ...reactDom.configs.recommended.rules,
  },
 })
 ```
@@ -0,0 +1,28 @@
 import js from '@eslint/js'
 import globals from 'globals'
 import reactHooks from 'eslint-plugin-react-hooks'
 import reactRefresh from 'eslint-plugin-react-refresh'
 import tseslint from 'typescript-eslint'
 export default tseslint.config(
  { ignores: ['dist'] },
  {
    extends: [js.configs.recommended, ...tseslint.configs.recommended],
    files: ['**/*.{ts,tsx}'],
    languageOptions: {
      ecmaVersion: 2020,
      globals: globals.browser,
    },
    plugins: {
      'react-hooks': reactHooks,
      'react-refresh': reactRefresh,
    },
    rules: {
      ...reactHooks.configs.recommended.rules,
      'react-refresh/only-export-components': [
        'warn',
        { allowConstantExport: true },
      ],
    },
  },
 )
@@ -0,0 +1,17 @@
 <!doctype html>
 <html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <link rel="icon" type="image/png" href="/favicon-96x96.png" sizes="96x96" />
    <link rel="icon" type="image/svg+xml" href="/favicon.svg" />
    <link rel="shortcut icon" href="/favicon.ico" />
    <link rel="apple-touch-icon" sizes="180x180" href="/apple-touch-icon.png" />
    <link rel="manifest" href="/site.webmanifest" />
    <title>llama-swap</title>
  </head>
  <body >
    <div id="root"></div>
    <script type="module" src="/src/main.tsx"></script>
  </body>
 </html>
@@ -0,0 +1,33 @@
 {
  "name": "ui",
  "private": true,
  "version": "0.0.0",
  "type": "module",
  "scripts": {
    "dev": "vite",
    "build": "tsc -b && vite build --emptyOutDir",
    "lint": "eslint .",
    "preview": "vite preview"
  },
  "dependencies": {
    "@tailwindcss/vite": "^4.1.8",
    "@tanstack/react-query": "^5.80.6",
    "react": "^19.1.0",
    "react-dom": "^19.1.0",
    "react-router-dom": "^7.6.2",
    "tailwindcss": "^4.1.8"
  },
  "devDependencies": {
    "@eslint/js": "^9.25.0",
    "@types/react": "^19.1.2",
    "@types/react-dom": "^19.1.2",
    "@vitejs/plugin-react": "^4.4.1",
    "eslint": "^9.25.0",
    "eslint-plugin-react-hooks": "^5.2.0",
    "eslint-plugin-react-refresh": "^0.4.19",
    "globals": "^16.0.0",
    "typescript": "~5.8.3",
    "typescript-eslint": "^8.30.1",
    "vite": "^6.3.5"
  }
 }
@@ -0,0 +1,21 @@
 {
  "name": "llama-swap",
  "short_name": "llama-swap",
  "icons": [
    {
      "src": "/web-app-manifest-192x192.png",
      "sizes": "192x192",
      "type": "image/png",
      "purpose": "maskable"
    },
    {
      "src": "/web-app-manifest-512x512.png",
      "sizes": "512x512",
      "type": "image/png",
      "purpose": "maskable"
    }
  ],
  "theme_color": "#ffffff",
  "background_color": "#ffffff",
  "display": "standalone"
 }
@@ -0,0 +1,6 @@
 #root {
  max-width: 1280px;
  margin: 0 auto;
  padding: 2rem;
  text-align: center;
 }
@@ -0,0 +1,44 @@
 import { BrowserRouter as Router, Routes, Route, Navigate, NavLink } from "react-router-dom";
 import { useTheme } from "./contexts/ThemeProvider";
 import { APIProvider } from "./contexts/APIProvider";
 import LogViewerPage from "./pages/LogViewer";
 import ModelPage from "./pages/Models";
 function App() {
  const theme = useTheme();
  return (
    <Router basename="/ui/">
      <APIProvider>
        <div>
          <nav className="bg-surface border-b border-border p-2 h-[75px]">
            <div className="flex items-center justify-between mx-auto px-4 h-full">
              <h1 className="flex items-center p-0">llama-swap</h1>
              <div className="flex items-center space-x-4">
                <NavLink to="/" className={({ isActive }) => (isActive ? "navlink active" : "navlink")}>
                  Logs
                </NavLink>
                <NavLink to="/models" className={({ isActive }) => (isActive ? "navlink active" : "navlink")}>
                  Models
                </NavLink>
                <button className="btn btn--sm" onClick={theme.toggleTheme}>
                  {theme.isDarkMode ? "🌙" : "☀️"}
                </button>
              </div>
            </div>
          </nav>
          <main className="mx-auto py-4 px-4">
            <Routes>
              <Route path="/" element={<LogViewerPage />} />
              <Route path="/models" element={<ModelPage />} />
              <Route path="*" element={<Navigate to="/" replace />} />
            </Routes>
          </main>
        </div>
      </APIProvider>
    </Router>
  );
 }
 export default App;
@@ -0,0 +1 @@
 <svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" class="iconify iconify--logos" width="35.93" height="32" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 228"><path fill="#00D8FF" d="M210.483 73.824a171.49 171.49 0 0 0-8.24-2.597c.465-1.9.893-3.777 1.273-5.621c6.238-30.281 2.16-54.676-11.769-62.708c-13.355-7.7-35.196.329-57.254 19.526a171.23 171.23 0 0 0-6.375 5.848a155.866 155.866 0 0 0-4.241-3.917C100.759 3.829 77.587-4.822 63.673 3.233C50.33 10.957 46.379 33.89 51.995 62.588a170.974 170.974 0 0 0 1.892 8.48c-3.28.932-6.445 1.924-9.474 2.98C17.309 83.498 0 98.307 0 113.668c0 15.865 18.582 31.778 46.812 41.427a145.52 145.52 0 0 0 6.921 2.165a167.467 167.467 0 0 0-2.01 9.138c-5.354 28.2-1.173 50.591 12.134 58.266c13.744 7.926 36.812-.22 59.273-19.855a145.567 145.567 0 0 0 5.342-4.923a168.064 168.064 0 0 0 6.92 6.314c21.758 18.722 43.246 26.282 56.54 18.586c13.731-7.949 18.194-32.003 12.4-61.268a145.016 145.016 0 0 0-1.535-6.842c1.62-.48 3.21-.974 4.76-1.488c29.348-9.723 48.443-25.443 48.443-41.52c0-15.417-17.868-30.326-45.517-39.844Zm-6.365 70.984c-1.4.463-2.836.91-4.3 1.345c-3.24-10.257-7.612-21.163-12.963-32.432c5.106-11 9.31-21.767 12.459-31.957c2.619.758 5.16 1.557 7.61 2.4c23.69 8.156 38.14 20.213 38.14 29.504c0 9.896-15.606 22.743-40.946 31.14Zm-10.514 20.834c2.562 12.94 2.927 24.64 1.23 33.787c-1.524 8.219-4.59 13.698-8.382 15.893c-8.067 4.67-25.32-1.4-43.927-17.412a156.726 156.726 0 0 1-6.437-5.87c7.214-7.889 14.423-17.06 21.459-27.246c12.376-1.098 24.068-2.894 34.671-5.345a134.17 134.17 0 0 1 1.386 6.193ZM87.276 214.515c-7.882 2.783-14.16 2.863-17.955.675c-8.075-4.657-11.432-22.636-6.853-46.752a156.923 156.923 0 0 1 1.869-8.499c10.486 2.32 22.093 3.988 34.498 4.994c7.084 9.967 14.501 19.128 21.976 27.15a134.668 134.668 0 0 1-4.877 4.492c-9.933 8.682-19.886 14.842-28.658 17.94ZM50.35 144.747c-12.483-4.267-22.792-9.812-29.858-15.863c-6.35-5.437-9.555-10.836-9.555-15.216c0-9.322 
13.897-21.212 37.076-29.293c2.813-.98 5.757-1.905 8.812-2.773c3.204 10.42 7.406 21.315 12.477 32.332c-5.137 11.18-9.399 22.249-12.634 32.792a134.718 134.718 0 0 1-6.318-1.979Zm12.378-84.26c-4.811-24.587-1.616-43.134 6.425-47.789c8.564-4.958 27.502 2.111 47.463 19.835a144.318 144.318 0 0 1 3.841 3.545c-7.438 7.987-14.787 17.08-21.808 26.988c-12.04 1.116-23.565 2.908-34.161 5.309a160.342 160.342 0 0 1-1.76-7.887Zm110.427 27.268a347.8 347.8 0 0 0-7.785-12.803c8.168 1.033 15.994 2.404 23.343 4.08c-2.206 7.072-4.956 14.465-8.193 22.045a381.151 381.151 0 0 0-7.365-13.322Zm-45.032-43.861c5.044 5.465 10.096 11.566 15.065 18.186a322.04 322.04 0 0 0-30.257-.006c4.974-6.559 10.069-12.652 15.192-18.18ZM82.802 87.83a323.167 323.167 0 0 0-7.227 13.238c-3.184-7.553-5.909-14.98-8.134-22.152c7.304-1.634 15.093-2.97 23.209-3.984a321.524 321.524 0 0 0-7.848 12.897Zm8.081 65.352c-8.385-.936-16.291-2.203-23.593-3.793c2.26-7.3 5.045-14.885 8.298-22.6a321.187 321.187 0 0 0 7.257 13.246c2.594 4.48 5.28 8.868 8.038 13.147Zm37.542 31.03c-5.184-5.592-10.354-11.779-15.403-18.433c4.902.192 9.899.29 14.978.29c5.218 0 10.376-.117 15.453-.343c-4.985 6.774-10.018 12.97-15.028 18.486Zm52.198-57.817c3.422 7.8 6.306 15.345 8.596 22.52c-7.422 1.694-15.436 3.058-23.88 4.071a382.417 382.417 0 0 0 7.859-13.026a347.403 347.403 0 0 0 7.425-13.565Zm-16.898 8.101a358.557 358.557 0 0 1-12.281 19.815a329.4 329.4 0 0 1-23.444.823c-7.967 0-15.716-.248-23.178-.732a310.202 310.202 0 0 1-12.513-19.846h.001a307.41 307.41 0 0 1-10.923-20.627a310.278 310.278 0 0 1 10.89-20.637l-.001.001a307.318 307.318 0 0 1 12.413-19.761c7.613-.576 15.42-.876 23.31-.876H128c7.926 0 15.743.303 23.354.883a329.357 329.357 0 0 1 12.335 19.695a358.489 358.489 0 0 1 11.036 20.54a329.472 329.472 0 0 1-11 20.722Zm22.56-122.124c8.572 4.944 11.906 24.881 6.52 51.026c-.344 1.668-.73 3.367-1.15 5.09c-10.622-2.452-22.155-4.275-34.23-5.408c-7.034-10.017-14.323-19.124-21.64-27.008a160.789 160.789 0 0 1 5.888-5.4c18.9-16.447 36.564-22.941 
44.612-18.3ZM128 90.808c12.625 0 22.86 10.235 22.86 22.86s-10.235 22.86-22.86 22.86s-22.86-10.235-22.86-22.86s10.235-22.86 22.86-22.86Z"></path></svg>
@@ -0,0 +1,181 @@
 import { useRef, createContext, useState, useContext, useEffect, useCallback, useMemo, type ReactNode } from "react";
 type ModelStatus = "ready" | "starting" | "stopping" | "stopped" | "shutdown" | "unknown";
 const LOG_LENGTH_LIMIT = 1024 * 100; /* 100KB of log data */
 export interface Model {
  id: string;
  state: ModelStatus;
  name: string;
  description: string;
 }
 interface APIProviderType {
  models: Model[];
  listModels: () => Promise<Model[]>;
  unloadAllModels: () => Promise<void>;
  loadModel: (model: string) => Promise<void>;
  enableAPIEvents: (enabled: boolean) => void;
  proxyLogs: string;
  upstreamLogs: string;
 }
 interface LogData {
  source: "upstream" | "proxy";
  data: string;
 }
 interface APIEventEnvelope {
  type: "modelStatus" | "logData";
  data: string;
 }
 const APIContext = createContext<APIProviderType | undefined>(undefined);
 type APIProviderProps = {
  children: ReactNode;
 };
 export function APIProvider({ children }: APIProviderProps) {
  const [proxyLogs, setProxyLogs] = useState("");
  const [upstreamLogs, setUpstreamLogs] = useState("");
  const proxyEventSource = useRef<EventSource | null>(null);
  const upstreamEventSource = useRef<EventSource | null>(null);
  const apiEventSource = useRef<EventSource | null>(null);
  const [models, setModels] = useState<Model[]>([]);
  const modelStatusEventSource = useRef<EventSource | null>(null);
  const appendLog = useCallback((newData: string, setter: React.Dispatch<React.SetStateAction<string>>) => {
    setter((prev) => {
      const updatedLog = prev + newData;
      return updatedLog.length > LOG_LENGTH_LIMIT ? updatedLog.slice(-LOG_LENGTH_LIMIT) : updatedLog;
    });
  }, []);
  const enableAPIEvents = useCallback((enabled: boolean) => {
    if (!enabled) {
      apiEventSource.current?.close();
      apiEventSource.current = null;
      return;
    }
    let retryCount = 0;
    const maxRetries = 3;
    const initialDelay = 1000; // 1 second
    const connect = () => {
      const eventSource = new EventSource("/api/events");
      eventSource.onmessage = (e: MessageEvent) => {
        try {
          const message = JSON.parse(e.data) as APIEventEnvelope;
          switch (message.type) {
            case "modelStatus":
              {
                const models = JSON.parse(message.data) as Model[];
                setModels(models);
              }
              break;
            case "logData": {
              const logData = JSON.parse(message.data) as LogData;
              switch (logData.source) {
                case "proxy":
                  appendLog(logData.data, setProxyLogs);
                  break;
                case "upstream":
                  appendLog(logData.data, setUpstreamLogs);
                  break;
              }
            }
          }
        } catch (err) {
          console.error(e.data, err);
        }
      };
      eventSource.onerror = () => {
        eventSource.close();
        if (retryCount < maxRetries) {
          retryCount++;
          const delay = initialDelay * Math.pow(2, retryCount - 1);
          setTimeout(connect, delay);
        }
      };
      apiEventSource.current = eventSource;
    };
    connect();
  }, []);
  useEffect(() => {
    return () => {
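      proxyEventSource.current?.close();
      upstreamEventSource.current?.close();
      modelStatusEventSource.current?.close();
      // apiEventSource is the only source opened in this provider, so close it on unmount too
      apiEventSource.current?.close();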
    };
  }, []);
  const listModels = useCallback(async (): Promise<Model[]> => {
    try {
      const response = await fetch("/api/models/");
      if (!response.ok) {
        throw new Error(`HTTP error! status: ${response.status}`);
      }
      const data = await response.json();
      return data || [];
    } catch (error) {
      console.error("Failed to fetch models:", error);
      return []; // Return empty array as fallback
    }
  }, []);
  const unloadAllModels = useCallback(async () => {
    try {
      const response = await fetch(`/api/models/unload/`, {
        method: "POST",
      });
      if (!response.ok) {
        throw new Error(`Failed to unload models: ${response.status}`);
      }
    } catch (error) {
      console.error("Failed to unload models:", error);
      throw error; // Re-throw to let calling code handle it
    }
  }, []);
  const loadModel = useCallback(async (model: string) => {
    try {
      const response = await fetch(`/upstream/${model}/`, {
        method: "GET",
      });
      if (!response.ok) {
        throw new Error(`Failed to load model: ${response.status}`);
      }
    } catch (error) {
      console.error("Failed to load model:", error);
      throw error; // Re-throw to let calling code handle it
    }
  }, []);
  const value = useMemo(
    () => ({
      models,
      listModels,
      unloadAllModels,
      loadModel,
      enableAPIEvents,
      proxyLogs,
      upstreamLogs,
    }),
    [models, listModels, unloadAllModels, loadModel, enableAPIEvents, proxyLogs, upstreamLogs]
  );
  return <APIContext.Provider value={value}>{children}</APIContext.Provider>;
 }
 export function useAPI() {
  const context = useContext(APIContext);
  if (context === undefined) {
    throw new Error("useAPI must be used within an APIProvider");
  }
  return context;
 }
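The reconnect logic in `enableAPIEvents` doubles the wait after each failed attempt. The sketch below reproduces only that arithmetic so the resulting schedule is easy to see. `retryDelays` is a hypothetical helper, not part of the provider.

```typescript
// retryDelays lists the wait before each reconnect attempt, in milliseconds,
// using the same formula as the onerror handler: initialDelay * 2^(retryCount - 1).
function retryDelays(maxRetries: number, initialDelayMs: number): number[] {
  const delays: number[] = [];
  for (let retryCount = 1; retryCount <= maxRetries; retryCount++) {
    delays.push(initialDelayMs * Math.pow(2, retryCount - 1));
  }
  return delays;
}

// With the values used in the provider (3 retries, 1 second initial delay):
console.log(retryDelays(3, 1000)); // [1000, 2000, 4000]
```

As written in the provider, `retryCount` is not reset after a successful connection, so the three attempts appear to be shared across the lifetime of one `enableAPIEvents(true)` call.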
@@ -0,0 +1,33 @@
 import { createContext, useContext, useEffect, type ReactNode } from "react";
 import { usePersistentState } from "../hooks/usePersistentState";
 type ThemeContextType = {
  isDarkMode: boolean;
  toggleTheme: () => void;
 };
 const ThemeContext = createContext<ThemeContextType | undefined>(undefined);
 type ThemeProviderProps = {
  children: ReactNode;
 };
 export function ThemeProvider({ children }: ThemeProviderProps) {
  const [isDarkMode, setIsDarkMode] = usePersistentState<boolean>("theme", false);
  useEffect(() => {
    document.documentElement.setAttribute("data-theme", isDarkMode ? "dark" : "light");
  }, [isDarkMode]);
  const toggleTheme = () => setIsDarkMode((prev) => !prev);
  return <ThemeContext.Provider value={{ isDarkMode, toggleTheme }}>{children}</ThemeContext.Provider>;
 }
 export function useTheme(): ThemeContextType {
  const context = useContext(ThemeContext);
  if (context === undefined) {
    throw new Error("useTheme must be used within a ThemeProvider");
  }
  return context;
 }
@@ -0,0 +1,39 @@
 import { useState, useEffect, useCallback } from "react";
 export function usePersistentState<T>(key: string, initialValue: T): [T, (value: T | ((prevState: T) => T)) => void] {
  const [state, setState] = useState<T>(() => {
    if (typeof window === "undefined") return initialValue;
    try {
      const saved = localStorage.getItem(key);
      return saved !== null ? JSON.parse(saved) : initialValue;
    } catch (e) {
      console.error(`Error parsing stored value for ${key}`, e);
      return initialValue;
    }
  });
  const setPersistentState = useCallback(
    (value: T | ((prevState: T) => T)) => {
      setState((prev) => {
        const nextValue = typeof value === "function" ? (value as (prevState: T) => T)(prev) : value;
        try {
          localStorage.setItem(key, JSON.stringify(nextValue));
        } catch (e) {
          console.error(`Error saving value for ${key}`, e);
        }
        return nextValue;
      });
    },
    [key]
  );
  useEffect(() => {
    try {
      localStorage.setItem(key, JSON.stringify(state));
    } catch (e) {
      console.error(`Error saving value for ${key}`, e);
    }
  }, [key, state]);
  return [state, setPersistentState];
 }
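The initializer in `usePersistentState` falls back to `initialValue` both when the key is missing and when the stored string is not valid JSON. A minimal sketch of that read path outside React follows. `readStored`, `StringStore`, and the in-memory `fakeStore` are hypothetical stand-ins for illustration; the hook itself reads `localStorage` directly.

```typescript
// Minimal slice of the Web Storage interface that the sketch needs.
interface StringStore {
  getItem(key: string): string | null;
}

// readStored returns the parsed value for key, or fallback when the key is
// missing or the stored text cannot be parsed as JSON.
function readStored<T>(store: StringStore, key: string, fallback: T): T {
  try {
    const saved = store.getItem(key);
    return saved !== null ? (JSON.parse(saved) as T) : fallback;
  } catch {
    return fallback;
  }
}

// In-memory stand-in for localStorage (hypothetical, for illustration only).
const backing = new Map<string, string>([
  ["theme", "true"],
  ["broken", "{not json"],
]);
const fakeStore: StringStore = { getItem: (key) => backing.get(key) ?? null };

console.log(readStored(fakeStore, "theme", false)); // stored value wins
console.log(readStored(fakeStore, "missing", false)); // falls back
console.log(readStored(fakeStore, "broken", false)); // falls back on parse error
```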
@@ -0,0 +1,168 @@
@import "tailwindcss";
@custom-variant dark (&:where([data-theme=dark], [data-theme=dark] *));
@theme {
  --color-background: rgba(252, 252, 249, 1);
  --color-surface: rgba(255, 255, 253, 1);
  /* text colors */
  --color-txtmain: rgba(19, 52, 59, 1);
  --color-txtsecondary: rgba(98, 108, 113, 1);
  --color-navlink-active: rgba(245, 245, 245, 1);
  --color-primary: rgba(50, 184, 198, 1);
  --color-primary-hover: rgba(29, 116, 128, 1);
  --color-primary-active: rgba(26, 104, 115, 1);
  --color-secondary: rgba(94, 82, 64, 0.12);
  --color-secondary-hover: rgba(94, 82, 64, 0.2);
  --color-secondary-active: rgba(94, 82, 64, 0.25);
  --color-border: rgba(94, 82, 64, 0.3);
  --color-btn-primary-text: rgba(252, 252, 249, 1);
  --color-card-border: rgba(94, 82, 64, 0.12);
  --color-card-border-inner: rgba(94, 82, 64, 0.12);
  --color-error: rgba(192, 21, 47, 1);
  --color-success: rgba(33, 128, 141, 1);
  --color-warning: rgb(244, 155, 0);
  --color-info: rgba(98, 108, 113, 1);
  --color-focus-ring: rgba(33, 128, 141, 0.4);
  --color-select-caret: rgba(19, 52, 59, 0.8);
  --color-btn-border: rgba(94, 82, 64, 0.7);
 }
@layer theme {
  /* override theme for dark mode */
  [data-theme="dark"] {
    --color-background: rgba(31, 33, 33, 1);
    --color-surface: rgba(38, 40, 40, 1);
    /* text colors */
    --color-txtmain: rgba(245, 245, 245, 1);
    --color-txtsecondary: rgba(167, 169, 169, 0.7);
    --color-navlink-active: rgba(245, 245, 245, 1);
    --color-primary: rgba(33, 128, 141, 1);
    --color-primary-hover: rgba(45, 166, 178, 1);
    --color-primary-active: rgba(41, 150, 161, 1);
    --color-secondary: rgba(119, 124, 124, 0.15);
    --color-secondary-hover: rgba(119, 124, 124, 0.25);
    --color-secondary-active: rgba(119, 124, 124, 0.3);
    --color-border: rgba(119, 124, 124, 0.3);
    --color-error: rgba(255, 84, 89, 1);
    --color-success: rgba(50, 184, 198, 1);
    --color-warning: rgb(244, 155, 0);
    --color-info: rgba(167, 169, 169, 1);
    --color-focus-ring: rgba(50, 184, 198, 0.4);
    --color-btn-primary-text: rgba(19, 52, 59, 1);
    --color-card-border: rgba(119, 124, 124, 0.2);
    --color-card-border-inner: rgba(119, 124, 124, 0.15);
    --shadow-inset-sm: inset 0 1px 0 rgba(255, 255, 255, 0.1), inset 0 -1px 0 rgba(0, 0, 0, 0.15);
    --button-border-secondary: rgba(119, 124, 124, 0.2);
  }
 }
@layer base {
  body {
    /* example of using theme colors */
    @apply bg-background text-txtmain;
  }
  h1 {
    @apply text-4xl text-txtmain font-bold pb-4;
  }
  h2 {
    @apply text-3xl text-txtmain font-bold pb-4;
  }
  h3 {
    @apply text-2xl text-txtmain font-bold pb-4;
  }
  h4 {
    @apply text-xl text-txtmain font-bold pb-4;
  }
  h5 {
    @apply text-lg text-txtmain font-bold pb-4;
  }
  h6 {
    @apply text-base text-txtmain font-bold pb-4;
  }
 }
 /* define CSS classes here for specific types of components */
@layer components {
  .container {
    @apply px-4;
  }
  /* Navigation Header */
  .navlink {
    @apply text-txtsecondary hover:bg-secondary hover:text-txtmain rounded-lg p-2;
  }
  .navlink.active {
    @apply bg-primary text-navlink-active;
  }
  /* Card component */
  .card {
    @apply bg-surface rounded-lg border border-card-border shadow-sm overflow-hidden p-4;
  }
  .card:hover {
    @apply shadow-md;
  }
  .card__body {
    @apply p-4;
  }
  .card__header,
  .card__footer {
    @apply p-4 border-b border-card-border-inner;
  }
  /* Status Badges */
  .status {
    @apply inline-block px-2 py-1 text-xs font-medium rounded-full;
  }
  .status--ready {
    @apply bg-success/10 text-success;
  }
  .status--starting,
  .status--stopping {
    @apply bg-warning/10 text-warning;
  }
  .status--stopped {
    @apply bg-error/10 text-error;
  }
  /* Buttons */
  .btn {
    @apply bg-surface p-2 px-4 text-sm rounded-full border border-2 transition-colors duration-200 border-btn-border;
  }
  .btn:hover {
    cursor: pointer;
  }
  .btn--sm {
    @apply px-2 py-0.5 text-xs;
  }
  .btn:disabled {
    @apply opacity-50 cursor-not-allowed;
  }
 }
@layer utilities {
  .ml-2 {
    margin-left: 0.5rem;
  }
  .my-8 {
    margin-top: 2rem;
    margin-bottom: 2rem;
  }
 }
@@ -0,0 +1,18 @@
 export function processEvalTimes(text: string) {
  const lines = text.match(/^ *eval time.*$/gm) || [];
  let totalTokens = 0;
  let totalTime = 0;
  lines.forEach((line) => {
    const tokensMatch = line.match(/\/\s*(\d+)\s*tokens/);
    const timeMatch = line.match(/=\s*(\d+\.\d+)\s*ms/);
    if (tokensMatch) totalTokens += parseFloat(tokensMatch[1]);
    if (timeMatch) totalTime += parseFloat(timeMatch[1]);
  });
  const avgTokensPerSecond = totalTime > 0 ? totalTokens / (totalTime / 1000) : 0;
  return [lines.length, totalTokens, Math.round(avgTokensPerSecond * 100) / 100];
 }
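A short usage sketch for `processEvalTimes` follows. The function body is repeated so the block runs on its own, with an explicit return type added. The sample log lines are an assumption, loosely modeled on llama.cpp timing output; real upstream logs may differ in spacing and wording.

```typescript
function processEvalTimes(text: string): number[] {
  // Only lines that begin with optional spaces and then "eval time" are counted,
  // so "prompt eval time" lines are skipped.
  const lines: string[] = text.match(/^ *eval time.*$/gm) ?? [];
  let totalTokens = 0;
  let totalTime = 0;
  lines.forEach((line) => {
    const tokensMatch = line.match(/\/\s*(\d+)\s*tokens/);
    const timeMatch = line.match(/=\s*(\d+\.\d+)\s*ms/);
    if (tokensMatch) totalTokens += parseFloat(tokensMatch[1]);
    if (timeMatch) totalTime += parseFloat(timeMatch[1]);
  });
  const avgTokensPerSecond = totalTime > 0 ? totalTokens / (totalTime / 1000) : 0;
  return [lines.length, totalTokens, Math.round(avgTokensPerSecond * 100) / 100];
}

// Assumed sample input: one prompt line (ignored) and two eval lines.
const sampleLog = [
  "prompt eval time =     500.00 ms /    10 tokens",
  "       eval time =    1000.00 ms /    50 tokens",
  "       eval time =    3000.00 ms /   150 tokens",
].join("\n");

// 200 tokens over 4000 ms is 50 tokens per second.
console.log(processEvalTimes(sampleLog)); // [2, 200, 50]
```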
@@ -0,0 +1,13 @@
 import { StrictMode } from "react";
 import { createRoot } from "react-dom/client";
 import "./index.css";
 import App from "./App.tsx";
 import { ThemeProvider } from "./contexts/ThemeProvider";
 createRoot(document.getElementById("root")!).render(
  <StrictMode>
    <ThemeProvider>
      <App />
    </ThemeProvider>
  </StrictMode>
 );
@@ -0,0 +1,144 @@
 import { useState, useEffect, useRef, useMemo, useCallback } from "react";
 import { useAPI } from "../contexts/APIProvider";
 import { usePersistentState } from "../hooks/usePersistentState";
 const LogViewer = () => {
  const { proxyLogs, upstreamLogs, enableAPIEvents } = useAPI();
  useEffect(() => {
    enableAPIEvents(true);
    return () => {
      enableAPIEvents(false);
    };
  }, []);
  return (
    <div className="flex flex-col gap-5" style={{ height: "calc(100vh - 125px)" }}>
      <LogPanel id="proxy" title="Proxy Logs" logData={proxyLogs} />
      <LogPanel id="upstream" title="Upstream Logs" logData={upstreamLogs} />
    </div>
  );
 };
 interface LogPanelProps {
  id: string;
  title: string;
  logData: string;
  className?: string;
 }
 export const LogPanel = ({ id, title, logData, className }: LogPanelProps) => {
  const [isCollapsed, setIsCollapsed] = usePersistentState(`logPanel-${id}-isCollapsed`, false);
  const [filterRegex, setFilterRegex] = useState("");
  const [fontSize, setFontSize] = usePersistentState<"xxs" | "xs" | "small" | "normal">(
    `logPanel-${id}-fontSize`,
    "normal"
  );
  const [wrapText, setTextWrap] = usePersistentState(`logPanel-${id}-wrapText`, false);
  const textWrapClass = useMemo(() => {
    return wrapText ? "whitespace-pre-wrap" : "whitespace-pre";
  }, [wrapText]);
  const toggleFontSize = useCallback(() => {
    setFontSize((prev) => {
      switch (prev) {
        case "xxs":
          return "xs";
        case "xs":
          return "small";
        case "small":
          return "normal";
        case "normal":
          return "xxs";
      }
    });
  }, []);
  const fontSizeClass = useMemo(() => {
    switch (fontSize) {
      case "xxs":
        return "text-[0.5rem]"; // 0.5rem (8px)
      case "xs":
        return "text-[0.75rem]"; // 0.75rem (12px)
      case "small":
        return "text-[0.875rem]"; // 0.875rem (14px)
      case "normal":
        return "text-base"; // 1rem (16px)
    }
  }, [fontSize]);
  const filteredLogs = useMemo(() => {
    if (!filterRegex) return logData;
    try {
      const regex = new RegExp(filterRegex, "i");
      const lines = logData.split("\n");
      const filtered = lines.filter((line) => regex.test(line));
      return filtered.join("\n");
    } catch (e) {
      return logData; // Return unfiltered if regex is invalid
    }
  }, [logData, filterRegex]);
  // auto scroll to bottom
  const preTagRef = useRef<HTMLPreElement>(null);
  useEffect(() => {
    if (!preTagRef.current) return;
    preTagRef.current.scrollTop = preTagRef.current.scrollHeight;
  }, [filteredLogs]);
  return (
    <div
      className={`bg-surface border border-border rounded-lg overflow-hidden flex flex-col ${
        !isCollapsed && "h-full"
      } ${className || ""}`}
    >
      <div className="p-4 border-b border-border bg-secondary">
        <div className="flex flex-col md:flex-row md:items-center md:justify-between gap-4">
          {/* Title - Always full width on mobile, normal on desktop */}
          <div className="w-full md:w-auto" onClick={() => setIsCollapsed(!isCollapsed)}>
            <h3 className="m-0 text-lg">{title}</h3>
          </div>
          <div className="flex flex-col sm:flex-row gap-4 w-full md:w-auto">
            {/* Sizing Buttons - Stacks vertically on mobile */}
            <div className="flex flex-wrap gap-2">
              <button className="btn" onClick={toggleFontSize}>
                font: {fontSize}
              </button>
              <button className="btn" onClick={() => setTextWrap((prev) => !prev)}>
                {wrapText ? "wrap on" : "wrap off"}
              </button>
            </div>
            {/* Filtering Options - Full width on mobile, normal on desktop */}
            <div className="flex flex-1 min-w-0 gap-2">
              <input
                type="text"
                className="flex-1 min-w-[120px] text-sm border p-2 rounded"
                placeholder="Filter logs..."
                value={filterRegex}
                onChange={(e) => setFilterRegex(e.target.value)}
              />
              <button className="btn" onClick={() => setFilterRegex("")}>
                Clear
              </button>
            </div>
          </div>
        </div>
      </div>
      {!isCollapsed && (
        <div className="flex-1 bg-background font-mono text-sm p-3 overflow-hidden">
          <pre
            ref={preTagRef}
            className={`h-full p-4 overflow-y-auto min-h-0 ${textWrapClass} ${fontSizeClass}`}
          >
            {filteredLogs}
          </pre>
        </div>
      )}
    </div>
  );
 };
 export default LogViewer;
@@ -0,0 +1,113 @@
 import { useState, useEffect, useCallback, useMemo } from "react";
 import { useAPI } from "../contexts/APIProvider";
 import { LogPanel } from "./LogViewer";
 import { processEvalTimes } from "../lib/Utils";
 export default function ModelsPage() {
  const { models, unloadAllModels, loadModel, upstreamLogs, enableAPIEvents } = useAPI();
  const [isUnloading, setIsUnloading] = useState(false);
  useEffect(() => {
    enableAPIEvents(true);
    return () => {
      enableAPIEvents(false);
    };
  }, []);
  const handleUnloadAllModels = useCallback(async () => {
    setIsUnloading(true);
    try {
      await unloadAllModels();
    } catch (e) {
      console.error(e);
    } finally {
      // at least give it a second to show the unloading message
      setTimeout(() => {
        setIsUnloading(false);
      }, 1000);
    }
  }, []);
  const [totalLines, totalTokens, avgTokensPerSecond] = useMemo(() => {
    return processEvalTimes(upstreamLogs);
  }, [upstreamLogs]);
  return (
    <div>
      <div className="flex flex-col md:flex-row gap-4">
        {/* Left Column */}
        <div className="w-full md:w-1/2 flex items-start">
          <div className="card w-full">
            <h2 className="">Models</h2>
            <button className="btn" onClick={handleUnloadAllModels} disabled={isUnloading}>
              {isUnloading ? "Unloading..." : "Unload All Models"}
            </button>
            <table className="w-full mt-4">
              <thead>
                <tr className="border-b border-primary">
                  <th className="text-left p-2">Name</th>
                  <th className="text-left p-2"></th>
                  <th className="text-left p-2">State</th>
                </tr>
              </thead>
              <tbody>
                {models.map((model) => (
                  <tr key={model.id} className="border-b hover:bg-secondary-hover border-border">
                    <td className="p-2">
                      <a href={`/upstream/${model.id}/`} className="underline" target="_blank" rel="noopener noreferrer">
                        {model.name !== "" ? model.name : model.id}
                      </a>
                      {model.description !== "" && (
                        <p>
                          <em>{model.description}</em>
                        </p>
                      )}
                    </td>
                    <td className="p-2">
                      <button
                        className="btn btn--sm"
                        disabled={model.state !== "stopped"}
                        onClick={() => loadModel(model.id)}
                      >
                        Load
                      </button>
                    </td>
                    <td className="p-2">
                      <span className={`status status--${model.state}`}>{model.state}</span>
                    </td>
                  </tr>
                ))}
              </tbody>
            </table>
          </div>
        </div>
        {/* Right Column */}
        <div className="w-full md:w-1/2 flex flex-col" style={{ height: "calc(100vh - 125px)" }}>
          <div className="card mb-4 min-h-[250px]">
            <h2>Log Stats</h2>
            <p className="italic my-2">note: eval logs from llama-server</p>
            <table className="w-full border border-gray-200">
              <tbody>
                <tr className="border-b border-gray-200">
                  <td className="py-2 px-4 font-medium border-r border-gray-200">Requests</td>
                  <td className="py-2 px-4 text-right">{totalLines}</td>
                </tr>
                <tr className="border-b border-gray-200">
                  <td className="py-2 px-4 font-medium border-r border-gray-200">Total Tokens Generated</td>
                  <td className="py-2 px-4 text-right">{totalTokens}</td>
                </tr>
                <tr>
                  <td className="py-2 px-4 font-medium border-r border-gray-200">Average Tokens/Second</td>
                  <td className="py-2 px-4 text-right">{avgTokensPerSecond}</td>
                </tr>
              </tbody>
            </table>
          </div>
          <LogPanel id="modelsupstream" title="Upstream Logs" logData={upstreamLogs} />
        </div>
      </div>
    </div>
  );
 }
@@ -0,0 +1 @@
 /// <reference types="vite/client" />
@@ -0,0 +1,27 @@
 {
  "compilerOptions": {
    "tsBuildInfoFile": "./node_modules/.tmp/tsconfig.app.tsbuildinfo",
    "target": "ES2020",
    "useDefineForClassFields": true,
    "lib": ["ES2020", "DOM", "DOM.Iterable"],
    "module": "ESNext",
    "skipLibCheck": true,
    /* Bundler mode */
    "moduleResolution": "bundler",
    "allowImportingTsExtensions": true,
    "verbatimModuleSyntax": true,
    "moduleDetection": "force",
    "noEmit": true,
    "jsx": "react-jsx",
    /* Linting */
    "strict": true,
    "noUnusedLocals": true,
    "noUnusedParameters": true,
    "erasableSyntaxOnly": true,
    "noFallthroughCasesInSwitch": true,
    "noUncheckedSideEffectImports": true
  },
  "include": ["src"]
 }
@@ -0,0 +1,7 @@
 {
  "files": [],
  "references": [
    { "path": "./tsconfig.app.json" },
    { "path": "./tsconfig.node.json" }
  ]
 }
@@ -0,0 +1,25 @@
 {
  "compilerOptions": {
    "tsBuildInfoFile": "./node_modules/.tmp/tsconfig.node.tsbuildinfo",
    "target": "ES2022",
    "lib": ["ES2023"],
    "module": "ESNext",
    "skipLibCheck": true,
    /* Bundler mode */
    "moduleResolution": "bundler",
    "allowImportingTsExtensions": true,
    "verbatimModuleSyntax": true,
    "moduleDetection": "force",
    "noEmit": true,
    /* Linting */
    "strict": true,
    "noUnusedLocals": true,
    "noUnusedParameters": true,
    "erasableSyntaxOnly": true,
    "noFallthroughCasesInSwitch": true,
    "noUncheckedSideEffectImports": true
  },
  "include": ["vite.config.ts"]
 }
@@ -0,0 +1,20 @@
 import { defineConfig } from "vite";
 import react from "@vitejs/plugin-react";
 import tailwindcss from "@tailwindcss/vite";
 // https://vite.dev/config/
 export default defineConfig({
  plugins: [react(), tailwindcss()],
  base: "/ui/",
  build: {
    outDir: "../proxy/ui_dist",
    assetsDir: "assets",
  },
  server: {
    proxy: {
      "/api": "http://localhost:8080", // Proxy API calls to Go backend during development
      "/logs": "http://localhost:8080",
      "/upstream": "http://localhost:8080",
    },
  },
 });
Author	SHA1	Message	Date
Benson Wong	6a058e4191	Change fsnotify to watch config directory instead of file The fsnotify library suggests watching a directory and checking that the name matches the configuration file.	2025-07-02 10:23:52 -07:00
Benson Wong	1921e570d7	Add Event Bus (#184 ) Major internal refactor to use an event bus to pass event/messages along. These changes are largely invisible user facing but sets up internal design for real time stats and information. - `--watch-config` logic refactored for events - remove multiple SSE api endpoints, replaced with /api/events - keep all functionality essentially the same - UI/backend sync is in near real time now	2025-07-01 22:17:35 -07:00
Benson Wong	c867a6c9a2	Add name and description to v1/models list (#179 ) * Add support for name and description in v1/models list * add configuration example for name and description	2025-06-30 23:02:44 -07:00
Leoyzen	3bd1b23ce0	fix config hot-reload on k8s (#181 ) Co-authored-by: Leoyzen <leoyzen@gmial.com>	2025-06-27 11:49:31 -07:00
srevn	10606abf89	fix config hot-reload on macos (#180 ) Co-authored-by: srevn <srevn@github>	2025-06-26 09:20:50 -07:00
Benson Wong	fefd14903d	improve log display and add a small stats table in ui (#178 )	2025-06-25 12:27:49 -07:00
Benson Wong	717d64e336	update GUI image in README [skip ci]	2025-06-24 10:38:28 -07:00
Benson Wong	285191e655	Various UI improvements (#176 ) * add retry/backoff to reconnecting log streams * update favicons	2025-06-23 16:17:21 -07:00
Benson Wong	4236cec03a	Add Filters to Model Configuration (#174 ) llama-swap can strip specific keys in JSON requests. This is useful for removing the ability for clients to set sampling parameters like temperature, top_k, top_p, etc.	2025-06-23 10:52:29 -07:00
Alex O'Connell	756193d0dd	Load models in the UI without navigating the page (#173 ) * Load models in the UI without navigating the page * fix table layout for mobile	2025-06-19 14:39:07 -07:00
Benson Wong	a6b2e930d8	Update README.md [skip ci]	2025-06-18 11:47:08 -07:00
Benson Wong	9e02c22ff8	stopCmd should use same environment as p.cmd.Env (#171 , #172 )	2025-06-18 11:36:59 -07:00
Benson Wong	0bdbf2fdc1	fix more goreleaser deprecation warnings [skip ci]	2025-06-18 11:15:12 -07:00
Benson Wong	49035e2e8e	Append custom env vars instead of replace in Process (#171 ) Append custom env vars instead of replace in Process (#168, #169) PR #162 refactored the default configuration code. This introduced a subtle bug where `env` became `[]string{}` instead of the default of `nil`. In golang, `exec.Cmd.Env == nil` means to use the "current process's environment". By setting it to `[]string{}` as a default the Process's environment was emptied out which caused an array of strange and difficult to troubleshoot behaviour. See issues #168 and #169 This commit changes the behaviour to append model configured environment variables to the default list rather than replace them.	2025-06-18 11:09:13 -07:00
Benson Wong	9963ae18bf	fix? deprecation warning in .goreleaser.yaml [skip-ci]	2025-06-18 07:49:33 -07:00
Benson Wong	2ae48c713b	add debug output for start command	2025-06-18 07:43:23 -07:00
Benson Wong	54c519e365	update Makefile to install ui deps	2025-06-17 09:54:01 -07:00
Benson Wong	3fce9ee0e9	Update README.md [skip ci]	2025-06-17 09:53:22 -07:00
Benson Wong	5899ae7966	Update README.md [skip ci]	2025-06-17 09:52:47 -07:00
Benson Wong	591a9cdf4d	update release.yml	2025-06-16 16:50:25 -07:00
Benson Wong	9a3c656738	New UI (#157 , #164 ) - Add a react UI to replace the plain HTML one. - Serve as a foundation for better GUI interactions	2025-06-16 16:45:19 -07:00
Benson Wong	75015f82ea	fix bug caused by macro replacement order (#166 ) User defined macros should be applied before checking for ${PORT} constraint in model.cmd and model.proxy.	2025-06-16 15:32:09 -07:00
Thammachart Chinvarapon	cc33b6c270	restore intel docker builds (#163 )	2025-06-16 11:13:49 -07:00
Benson Wong	4fa12a429c	Refactor all default config values into config.go (#162 ) - Move all default values into one place. - Update tests to be more cross platform	2025-06-15 12:32:00 -07:00
Benson Wong	2dc0ca0663	improve llama-swap upstream process recovery and restarts (#155 ) Refactor internal upstream process life cycle management to recover better from unexpected situations. With this change llama-swap should never need to be restarted due to a crashed upstream child process. The `StateFailed` state was removed in favour of always trying to start/restart a process.	2025-06-05 16:24:55 -07:00
@@ -0,0 +1 @@
							<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" aria-hidden="true" role="img" class="iconify iconify--logos" width="35.93" height="32" preserveAspectRatio="xMidYMid meet" viewBox="0 0 256 228"><path fill="#00D8FF" d="M210.483 73.824a171.49 171.49 0 0 0-8.24-2.597c.465-1.9.893-3.777 1.273-5.621c6.238-30.281 2.16-54.676-11.769-62.708c-13.355-7.7-35.196.329-57.254 19.526a171.23 171.23 0 0 0-6.375 5.848a155.866 155.866 0 0 0-4.241-3.917C100.759 3.829 77.587-4.822 63.673 3.233C50.33 10.957 46.379 33.89 51.995 62.588a170.974 170.974 0 0 0 1.892 8.48c-3.28.932-6.445 1.924-9.474 2.98C17.309 83.498 0 98.307 0 113.668c0 15.865 18.582 31.778 46.812 41.427a145.52 145.52 0 0 0 6.921 2.165a167.467 167.467 0 0 0-2.01 9.138c-5.354 28.2-1.173 50.591 12.134 58.266c13.744 7.926 36.812-.22 59.273-19.855a145.567 145.567 0 0 0 5.342-4.923a168.064 168.064 0 0 0 6.92 6.314c21.758 18.722 43.246 26.282 56.54 18.586c13.731-7.949 18.194-32.003 12.4-61.268a145.016 145.016 0 0 0-1.535-6.842c1.62-.48 3.21-.974 4.76-1.488c29.348-9.723 48.443-25.443 48.443-41.52c0-15.417-17.868-30.326-45.517-39.844Zm-6.365 70.984c-1.4.463-2.836.91-4.3 1.345c-3.24-10.257-7.612-21.163-12.963-32.432c5.106-11 9.31-21.767 12.459-31.957c2.619.758 5.16 1.557 7.61 2.4c23.69 8.156 38.14 20.213 38.14 29.504c0 9.896-15.606 22.743-40.946 31.14Zm-10.514 20.834c2.562 12.94 2.927 24.64 1.23 33.787c-1.524 8.219-4.59 13.698-8.382 15.893c-8.067 4.67-25.32-1.4-43.927-17.412a156.726 156.726 0 0 1-6.437-5.87c7.214-7.889 14.423-17.06 21.459-27.246c12.376-1.098 24.068-2.894 34.671-5.345a134.17 134.17 0 0 1 1.386 6.193ZM87.276 214.515c-7.882 2.783-14.16 2.863-17.955.675c-8.075-4.657-11.432-22.636-6.853-46.752a156.923 156.923 0 0 1 1.869-8.499c10.486 2.32 22.093 3.988 34.498 4.994c7.084 9.967 14.501 19.128 21.976 27.15a134.668 134.668 0 0 1-4.877 4.492c-9.933 8.682-19.886 14.842-28.658 17.94ZM50.35 144.747c-12.483-4.267-22.792-9.812-29.858-15.863c-6.35-5.437-9.555-10.836-9.555-15.216c0-9.322 
13.897-21.212 37.076-29.293c2.813-.98 5.757-1.905 8.812-2.773c3.204 10.42 7.406 21.315 12.477 32.332c-5.137 11.18-9.399 22.249-12.634 32.792a134.718 134.718 0 0 1-6.318-1.979Zm12.378-84.26c-4.811-24.587-1.616-43.134 6.425-47.789c8.564-4.958 27.502 2.111 47.463 19.835a144.318 144.318 0 0 1 3.841 3.545c-7.438 7.987-14.787 17.08-21.808 26.988c-12.04 1.116-23.565 2.908-34.161 5.309a160.342 160.342 0 0 1-1.76-7.887Zm110.427 27.268a347.8 347.8 0 0 0-7.785-12.803c8.168 1.033 15.994 2.404 23.343 4.08c-2.206 7.072-4.956 14.465-8.193 22.045a381.151 381.151 0 0 0-7.365-13.322Zm-45.032-43.861c5.044 5.465 10.096 11.566 15.065 18.186a322.04 322.04 0 0 0-30.257-.006c4.974-6.559 10.069-12.652 15.192-18.18ZM82.802 87.83a323.167 323.167 0 0 0-7.227 13.238c-3.184-7.553-5.909-14.98-8.134-22.152c7.304-1.634 15.093-2.97 23.209-3.984a321.524 321.524 0 0 0-7.848 12.897Zm8.081 65.352c-8.385-.936-16.291-2.203-23.593-3.793c2.26-7.3 5.045-14.885 8.298-22.6a321.187 321.187 0 0 0 7.257 13.246c2.594 4.48 5.28 8.868 8.038 13.147Zm37.542 31.03c-5.184-5.592-10.354-11.779-15.403-18.433c4.902.192 9.899.29 14.978.29c5.218 0 10.376-.117 15.453-.343c-4.985 6.774-10.018 12.97-15.028 18.486Zm52.198-57.817c3.422 7.8 6.306 15.345 8.596 22.52c-7.422 1.694-15.436 3.058-23.88 4.071a382.417 382.417 0 0 0 7.859-13.026a347.403 347.403 0 0 0 7.425-13.565Zm-16.898 8.101a358.557 358.557 0 0 1-12.281 19.815a329.4 329.4 0 0 1-23.444.823c-7.967 0-15.716-.248-23.178-.732a310.202 310.202 0 0 1-12.513-19.846h.001a307.41 307.41 0 0 1-10.923-20.627a310.278 310.278 0 0 1 10.89-20.637l-.001.001a307.318 307.318 0 0 1 12.413-19.761c7.613-.576 15.42-.876 23.31-.876H128c7.926 0 15.743.303 23.354.883a329.357 329.357 0 0 1 12.335 19.695a358.489 358.489 0 0 1 11.036 20.54a329.472 329.472 0 0 1-11 20.722Zm22.56-122.124c8.572 4.944 11.906 24.881 6.52 51.026c-.344 1.668-.73 3.367-1.15 5.09c-10.622-2.452-22.155-4.275-34.23-5.408c-7.034-10.017-14.323-19.124-21.64-27.008a160.789 160.789 0 0 1 5.888-5.4c18.9-16.447 36.564-22.941 
44.612-18.3ZM128 90.808c12.625 0 22.86 10.235 22.86 22.86s-10.235 22.86-22.86 22.86s-22.86-10.235-22.86-22.86s10.235-22.86 22.86-22.86Z"></path></svg>