Compare commits

14 Commits

Author SHA1 Message Date
Benson Wong 6439ab1515 ui: add peer:true in package-lock.json 2026-01-22 08:43:36 -08:00
dependabot[bot] f94226122c build(deps-dev): bump tar from 7.5.3 to 7.5.6 in /ui (#477)
Bumps [tar](https://github.com/isaacs/node-tar) from 7.5.3 to 7.5.6.
- [Release notes](https://github.com/isaacs/node-tar/releases)
- [Changelog](https://github.com/isaacs/node-tar/blob/main/CHANGELOG.md)
- [Commits](https://github.com/isaacs/node-tar/compare/v7.5.3...v7.5.6)

---
updated-dependencies:
- dependency-name: tar
  dependency-version: 7.5.6
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-21 22:55:02 -08:00
Ryan Voots 7493618fdc Add count_tokens api proxying (#476) 2026-01-20 09:34:42 -08:00
Benson Wong 205efd40a1 proxy: extend /running endpoint with additional process data (#474)
Extend the /running endpoint to return more details about running
processes beyond just model and state.

- add cmd field to show the command being executed
- add proxy field to show the proxy URL
- add ttl (UnloadAfter) for automatic unloading configuration
- add name and description for model metadata
- update tests to verify new fields are returned correctly

fixes #471
2026-01-19 17:37:00 -08:00
Benson Wong 14207f8492 ui: npm security update 2026-01-18 21:56:32 -08:00
Benson Wong 4e850c2834 config: refactor macro substitution in configuration (#470)
This commit simplifies substitution of environment variables into the configuration. There was a lot of repetitive code substituting ${env.VAR_NAME} into different fields after the configuration was parsed into a config.Config. This refactor uses a string substitution of env vars into the YAML config before it is fully parsed. This eliminates a lot of logic while maintaining backwards compatibility.
2026-01-18 21:52:34 -08:00
Benson Wong 75fced579e config: support macros in peer apiKey and filters (#469)
* config: support environment variable macros in peer apiKeys

Add ${env.VAR_NAME} substitution for peer apiKey fields, consistent
with existing env macro support for model fields and global apiKeys.

- Add env macro substitution for peers.{name}.apiKey in LoadConfigFromReader
- Add tests for peer apiKey env substitution
- Update config.example.yaml to show env macro usage

* config: support macros in peer apiKey and filters

Extend macro substitution to peer configuration fields:
- peers.{name}.apiKey supports both global macros and env macros
- peers.{name}.filters.stripParams supports both macro types
- peers.{name}.filters.setParams supports both macro types

Also renamed validateMetadataForUnknownMacros to validateNestedForUnknownMacros
for reuse across model metadata and peer filters validation.
2026-01-16 23:10:50 -08:00
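The nested validation mentioned in this commit can be pictured with the sketch below. It is only an illustration of the idea (walk a nested value and report any `${name}` macro that survived substitution), not the project's validateNestedForUnknownMacros. The function name `findUnknownMacro`, the error wording, and the sample path are hypothetical. The pattern is copied from macroPatternRegex in the config diff.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as macroPatternRegex in the config diff.
var macroPattern = regexp.MustCompile(`\$\{([a-zA-Z0-9_-]+)\}`)

// findUnknownMacro (hypothetical name) walks strings, maps and slices and
// returns an error naming a ${macro} that is still present after substitution.
// Map iteration order is random, so with several leftovers any one is reported.
func findUnknownMacro(v any, where string) error {
	switch val := v.(type) {
	case string:
		if m := macroPattern.FindStringSubmatch(val); m != nil {
			return fmt.Errorf("%s: unknown macro '${%s}'", where, m[1])
		}
	case map[string]any:
		for k, child := range val {
			if err := findUnknownMacro(child, where+"."+k); err != nil {
				return err
			}
		}
	case []any:
		for i, child := range val {
			if err := findUnknownMacro(child, fmt.Sprintf("%s[%d]", where, i)); err != nil {
				return err
			}
		}
	}
	// Numbers, booleans and nil cannot contain macros.
	return nil
}

func main() {
	// Hypothetical setParams value with one macro that was never defined.
	setParams := map[string]any{
		"provider": map[string]any{"zdr": true, "tag": "${missing}"},
	}
	fmt.Println(findUnknownMacro(setParams, "peer filters.setParams"))
}
```

Passing the location as a plain string is what lets one walker serve both model metadata and peer filters, which is the reuse the rename in this commit points at.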
Benson Wong b73f367f22 config-schema.json,config.example.yaml: Update examples and schema 2026-01-16 22:43:25 -08:00
Benson Wong 8f2137c72b config: support environment variable macros in apiKeys (#467)
Add substituteEnvMacros support for apiKeys configuration field,
allowing API keys to be loaded from environment variables using
the ${env.VAR_NAME} syntax.

- Apply env macro substitution before validation
- Add tests for env macro substitution in apiKeys
2026-01-16 22:41:14 -08:00
Benson Wong 124007cc98 config: add environment variable macros (#466)
* config: add environment variable macros

Add support for ${env.VAR_NAME} syntax to pull values from system
environment variables during config loading.

- env macros processed before regular macros (allows macros to reference env vars)
- works in cmd, cmdStop, proxy, checkEndpoint, filters.stripParams, metadata
- returns error if env var is not set
- add comprehensive tests

fixes #462

* docs: add env macro example to config.example.yaml
2026-01-16 22:25:20 -08:00
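A minimal sketch of the `${env.VAR_NAME}` behaviour described in this commit, assuming a plain regex replacement over a string. It is not the project's substituteEnvMacros: the function name `expandEnvMacros`, the injectable `lookup` parameter, and the error wording are all assumptions made for illustration. The pattern is copied from envMacroRegex in the config diff.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
)

// Same pattern as envMacroRegex in the config diff: ${env.VAR_NAME}.
var envMacro = regexp.MustCompile(`\$\{env\.([a-zA-Z_][a-zA-Z0-9_]*)\}`)

// expandEnvMacros (hypothetical name) replaces every ${env.VAR} in s with the
// value returned by lookup and fails if a referenced variable is not set.
// lookup is injectable so the function can be exercised without touching the
// real process environment; os.LookupEnv has the matching signature.
func expandEnvMacros(s string, lookup func(string) (string, bool)) (string, error) {
	missing := ""
	out := envMacro.ReplaceAllStringFunc(s, func(m string) string {
		name := envMacro.FindStringSubmatch(m)[1]
		val, ok := lookup(name)
		if !ok {
			if missing == "" {
				missing = name
			}
			return m // keep the macro as-is; the error is returned below
		}
		return val
	})
	if missing != "" {
		return "", fmt.Errorf("environment variable %q is not set", missing)
	}
	return out, nil
}

func main() {
	// Mirrors the models_dir example from config.example.yaml.
	out, err := expandEnvMacros(`"models_dir": "${env.HOME}/models"`, os.LookupEnv)
	fmt.Println(out, err)
}
```

Because the dot in `${env.NAME}` is not part of the regular macro pattern, running this step first leaves ordinary `${macro}` references untouched for the later macro pass.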
Benson Wong eb5bfff0b0 proxy: unify filtering for local models and peers
This unifies the filtering capabilities for models and peers

- stripParams: removes params in the request
- setParams: sets params in the request

fixes #453
2026-01-15 18:59:43 -08:00
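The two filters can be illustrated with the sketch below, which applies them to a JSON request body. This is an assumption-laden illustration rather than the proxy's code: the function name `applyFilters`, the strip-then-set order, and the `protected` set are hypothetical. The schema text in this compare only states that protected params like "model" cannot be overridden.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// protected lists parameters the filters never touch. Only "model" is named in
// the schema description; modelling it as a set is an assumption of this sketch.
var protected = map[string]bool{"model": true}

// applyFilters (hypothetical name) removes the comma separated stripParams keys
// from a JSON request body, then sets or overrides the setParams keys.
func applyFilters(body []byte, stripParams string, setParams map[string]any) ([]byte, error) {
	var req map[string]any
	if err := json.Unmarshal(body, &req); err != nil {
		return nil, err
	}
	if req == nil { // body was the JSON literal null
		req = map[string]any{}
	}
	for _, p := range strings.Split(stripParams, ",") {
		p = strings.TrimSpace(p)
		if p != "" && !protected[p] {
			delete(req, p)
		}
	}
	for k, v := range setParams {
		if !protected[k] {
			req[k] = v
		}
	}
	return json.Marshal(req)
}

func main() {
	in := []byte(`{"model":"m1","temperature":1.5,"top_k":40}`)
	out, err := applyFilters(in, "temperature, top_k",
		map[string]any{"temperature": 0.7, "model": "other"})
	fmt.Println(string(out), err)
}
```

In this sketch the client's temperature and top_k are dropped, temperature is then pinned to 0.7, and the attempt to override model is ignored.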
Benson Wong 3edb180c08 ci: free up disk space before ROCm container build (#460) 2026-01-14 22:03:42 -08:00
Benson Wong 66d555e625 Improve container build reliability (#457)
* docker: add .env usage in build-container.sh
* .github,docker: add rocm, improve logging
* .github,CLAUDE.md: fix workflow and update guidelines

Update containers workflow to only push images when triggered
manually or on schedule, not on workflow file changes.

- add push trigger for workflow file changes in containers.yml
- update push condition to skip on regular push events
- update CLAUDE.md commit message guidelines

* docker: remove comma in build-container.sh

* .github,docker: improve container build workflow

Add pagination support for fetching llama.cpp tags and improve debugging.

- add build-container.sh to workflow trigger paths
- implement fetch_llama_tag() with pagination support
- replace .env with local testing instructions
- add DEBUG_ABORT_BUILD flag for testing
2026-01-10 22:14:33 -08:00
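The pagination approach in fetch_llama_tag() can be restated as the sketch below. It mirrors the loop in the shell function from this compare (stop on an empty page, return the last dash-separated field of the highest-sorting matching tag, give up after a page limit), but the function name `findLatestTag`, the injected `fetchPage` callback, and the sample tag names are hypothetical stand-ins for the real GitHub API call and its data.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"strings"
)

// findLatestTag (hypothetical name) pages through tag lists until one page
// contains a tag starting with prefix. fetchPage stands in for the GitHub
// packages API request and returns the tags found on the given page.
func findLatestTag(prefix string, maxPages int, fetchPage func(page int) ([]string, error)) (string, error) {
	for page := 1; page <= maxPages; page++ {
		tags, err := fetchPage(page)
		if err != nil {
			return "", err
		}
		if len(tags) == 0 {
			return "", errors.New("no more pages")
		}
		var matches []string
		for _, t := range tags {
			if strings.HasPrefix(t, prefix) {
				matches = append(matches, t)
			}
		}
		if len(matches) > 0 {
			// Equivalent of `sort -r | head -n1` followed by awk's $NF.
			sort.Sort(sort.Reverse(sort.StringSlice(matches)))
			best := matches[0]
			return best[strings.LastIndex(best, "-")+1:], nil
		}
	}
	return "", fmt.Errorf("reached pagination safety limit (%d pages)", maxPages)
}

func main() {
	// Made-up pages: the matching tags only appear on the second page.
	pages := map[int][]string{
		1: {"light-b7001", "full-b7001"},
		2: {"server-rocm-b7000", "server-rocm-b6999"},
	}
	ver, err := findLatestTag("server-rocm", 50, func(p int) ([]string, error) {
		return pages[p], nil
	})
	fmt.Println(ver, err)
}
```

Injecting the page fetcher keeps the loop testable without a token or network access, which serves the same purpose as the DEBUG_ABORT_BUILD flag for local runs.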
Benson Wong 4f863fd9fc CLAUDE.md: tweak instructions 2026-01-09 21:42:06 -08:00
18 changed files with 1338 additions and 161 deletions
+21 -2
@@ -10,17 +10,36 @@ on:
# Allows manual triggering of the workflow # Allows manual triggering of the workflow
workflow_dispatch: workflow_dispatch:
# Run on workflow file changes (without pushing)
push:
paths:
- '.github/workflows/containers.yml'
- 'docker/build-container.sh'
jobs: jobs:
build-and-push: build-and-push:
runs-on: ubuntu-latest runs-on: ubuntu-latest
strategy: strategy:
matrix: matrix:
platform: [intel, cuda, vulkan, cpu, musa] platform: [intel, cuda, vulkan, cpu, musa, rocm]
fail-fast: false fail-fast: false
steps: steps:
- name: Checkout code - name: Checkout code
uses: actions/checkout@v4 uses: actions/checkout@v4
- name: Free up disk space
if: matrix.platform == 'rocm'
run: |
echo "Before cleanup:"
df -h
sudo rm -rf /usr/share/dotnet
sudo rm -rf /usr/local/lib/android
sudo rm -rf /opt/ghc
sudo rm -rf /opt/hostedtoolcache/CodeQL
sudo docker system prune -af
echo "After cleanup:"
df -h
- name: Log in to GitHub Container Registry - name: Log in to GitHub Container Registry
uses: docker/login-action@v2 uses: docker/login-action@v2
with: with:
@@ -31,7 +50,7 @@ jobs:
- name: Run build-container - name: Run build-container
env: env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: ./docker/build-container.sh ${{ matrix.platform }} true run: ./docker/build-container.sh ${{ matrix.platform }} ${{ github.event_name != 'push' }}
# note make sure mostlygeek/llama-swap has admin rights to the llama-swap package # note make sure mostlygeek/llama-swap has admin rights to the llama-swap package
# see: https://github.com/actions/delete-package-versions/issues/74 # see: https://github.com/actions/delete-package-versions/issues/74
+10 -7
@@ -5,14 +5,16 @@ llama-swap is a light weight, transparent proxy server that provides automatic m
## Tech stack ## Tech stack
- golang - golang
- typescript, vite and react for UI (ui/) - typescript, vite and react for UI (located in ui/)
## Workflow Tasks ## Workflow Tasks
- when summarizing changes only include details that require further action - when summarizing changes only include details that require further action
- just say "Done." when there is no further action - just say "Done." when there is no further action
- use `gh` to create PRs and load issues - use `gh` to create PRs and load issues
- do not mention "created by claude" in commit messages - do include Co-Authored-By or created by when committing changes or creating PRs
- keep PR descriptions short and focused on changes.
- never include a test plan
## Testing ## Testing
@@ -39,8 +41,9 @@ fixes #123
- use three levels High, Medium, Low severity - use three levels High, Medium, Low severity
- label each discovered issue with a label like H1, M2, L3 respectively - label each discovered issue with a label like H1, M2, L3 respectively
- High severity are must fix issues: - High severity are must fix issues (security, race conditions, critical bugs)
- Medium severity are recommended improvements (coding style, missing functionality, inconsistencies)
- security issues - Low severity are nice to have changes and nits
- Include a suggestion with each discovered item
- Medium are recommended improvements - Limit your code review to three items with the highest priority first
- Double check your discovered items and recommended remediations
+1
@@ -27,6 +27,7 @@ Built in Go for performance and simplicity, llama-swap has zero dependencies and
- `v1/images/edits` - `v1/images/edits`
- ✅ Anthropic API supported endpoints: - ✅ Anthropic API supported endpoints:
- `v1/messages` - `v1/messages`
- `v1/messages/count_tokens`
- ✅ llama-server (llama.cpp) supported endpoints - ✅ llama-server (llama.cpp) supported endpoints
- `v1/rerank`, `v1/reranking`, `/rerank` - `v1/rerank`, `v1/reranking`, `/rerank`
- `/infill` - for code infilling - `/infill` - for code infilling
+27 -1
@@ -188,11 +188,17 @@
"default": "", "default": "",
"pattern": "^[a-zA-Z0-9_, ]*$", "pattern": "^[a-zA-Z0-9_, ]*$",
"description": "Comma separated list of parameters to remove from the request. Used for server-side enforcement of sampling parameters." "description": "Comma separated list of parameters to remove from the request. Used for server-side enforcement of sampling parameters."
},
"setParams": {
"type": "object",
"additionalProperties": true,
"default": {},
"description": "Dictionary of parameters to set/override in requests. Useful for enforcing specific parameter values. Protected params like 'model' cannot be overridden. Values can be strings, numbers, booleans, arrays, or objects."
} }
}, },
"additionalProperties": false, "additionalProperties": false,
"default": {}, "default": {},
"description": "Dictionary of filter settings. Only stripParams is supported." "description": "Dictionary of filter settings. Supports stripParams and setParams."
}, },
"metadata": { "metadata": {
"type": "object", "type": "object",
@@ -320,6 +326,26 @@
"minLength": 1 "minLength": 1
}, },
"description": "A list of models served by the peer." "description": "A list of models served by the peer."
},
"filters": {
"type": "object",
"properties": {
"stripParams": {
"type": "string",
"default": "",
"pattern": "^[a-zA-Z0-9_, ]*$",
"description": "Comma separated list of parameters to remove from the request. Useful for removing parameters that the peer doesn't support."
},
"setParams": {
"type": "object",
"additionalProperties": true,
"default": {},
"description": "Dictionary of parameters to set/override in requests to this peer. Useful for injecting provider-specific settings. Protected params like 'model' cannot be overridden. Values can be strings, numbers, booleans, arrays, or objects."
}
},
"additionalProperties": false,
"default": {},
"description": "Dictionary of filter settings for peer requests. Supports stripParams and setParams."
} }
} }
}, },
+54 -12
@@ -70,16 +70,6 @@ sendLoadingState: true
# all fields except for Id so chat UIs can use the alias equivalent to the original. # all fields except for Id so chat UIs can use the alias equivalent to the original.
includeAliasesInList: false includeAliasesInList: false
# apiKeys: require an API key when making requests to inference endpoints
# - optional, default: []
# - when empty (the default) authorization will not be checked as llama-swap is default-allow
# - each key is a non-empty string
apiKeys:
- "sk-hunter2"
# hint, one liner: printf "sk-%s\n" "$(head -c 48 /dev/urandom | base64 )"
- "sk-gyCPiKUcIfPlaM4OSMZekkprgijPx6+OsmQs8Rsg0xZ9qpy6gKWsIKqHOk+cgXVx"
- "sk-+QtIn0Zjj4UHjiaZYiZEnru4mrwKM9RzhmJeK5SobNXLl8QMFXxGz1/2lEuvQpkb"
# macros: a dictionary of string substitutions # macros: a dictionary of string substitutions
# - optional, default: empty dictionary # - optional, default: empty dictionary
# - macros are reusable snippets # - macros are reusable snippets
@@ -90,6 +80,9 @@ apiKeys:
# - macro names must not be a reserved name: PORT or MODEL_ID # - macro names must not be a reserved name: PORT or MODEL_ID
# - macro values can be numbers, bools, or strings # - macro values can be numbers, bools, or strings
# - macros can contain other macros, but they must be defined before they are used # - macros can contain other macros, but they must be defined before they are used
# - environment variables can be referenced with ${env.VAR_NAME} syntax
# - env macros are substituted first, before regular macros
# - if the env var is not set, config loading will fail with an error
macros: macros:
# Example of a multi-line macro # Example of a multi-line macro
"latest-llama": > "latest-llama": >
@@ -102,6 +95,24 @@ macros:
# but they must be previously declared. # but they must be previously declared.
"default_args": "--ctx-size ${default_ctx}" "default_args": "--ctx-size ${default_ctx}"
# Example of environment variable macros
# - ${env.VAR_NAME} pulls the value from the system environment
# - useful for paths, secrets, or machine-specific configuration
"models_dir": "${env.HOME}/models"
# apiKeys: require an API key when making requests to inference endpoints
# - optional, default: []
# - when empty (the default) authorization will not be checked as llama-swap is default-allow
# - each key is a non-empty string
apiKeys:
- "sk-hunter2"
# tip, one liner: printf "sk-%s\n" "$(head -c 48 /dev/urandom | base64 )"
- "sk-gyCPiKUcIfPlaM4OSMZekkprgijPx6+OsmQs8Rsg0xZ9qpy6gKWsIKqHOk+cgXVx"
# use environment variable macros to keep secrets out of the config
- "${env.API_KEY_1}"
- "${env.API_KEY_2}"
# models: a dictionary of model configurations # models: a dictionary of model configurations
# - required # - required
# - each key is the model's ID, used in API requests # - each key is the model's ID, used in API requests
@@ -185,7 +196,7 @@ models:
# filters: a dictionary of filter settings # filters: a dictionary of filter settings
# - optional, default: empty dictionary # - optional, default: empty dictionary
# - only stripParams is currently supported # - same capabilities as peer filters (stripParams, setParams)
filters: filters:
# stripParams: a comma separated list of parameters to remove from the request # stripParams: a comma separated list of parameters to remove from the request
# - optional, default: "" # - optional, default: ""
@@ -195,6 +206,16 @@ models:
# - recommended to stick to sampling parameters # - recommended to stick to sampling parameters
stripParams: "temperature, top_p, top_k" stripParams: "temperature, top_p, top_k"
# setParams: a dictionary of parameters to set/override in requests
# - optional, default: empty dictionary
# - useful for enforcing specific parameter values
# - protected params like "model" cannot be overridden
# - values can be strings, numbers, booleans, arrays, or objects
setParams:
# Example: enforce specific sampling parameters
temperature: 0.7
top_p: 0.9
# metadata: a dictionary of arbitrary values that are included in /v1/models # metadata: a dictionary of arbitrary values that are included in /v1/models
# - optional, default: empty dictionary # - optional, default: empty dictionary
# - while metadata can contains complex types it is recommended to keep it simple # - while metadata can contains complex types it is recommended to keep it simple
@@ -365,7 +386,8 @@ peers:
# - optional, default: "" # - optional, default: ""
# - if blank, no key will be added to the request # - if blank, no key will be added to the request
# - key will be injected into headers: Authorization: Bearer <key> and x-api-key: <key> # - key will be injected into headers: Authorization: Bearer <key> and x-api-key: <key>
apiKey: sk-your-openrouter-key # - can be a string or a macro
apiKey: ${env.OPENROUTER_API_KEY}
models: models:
- meta-llama/llama-3.1-8b-instruct - meta-llama/llama-3.1-8b-instruct
- qwen/qwen3-235b-a22b-2507 - qwen/qwen3-235b-a22b-2507
@@ -373,3 +395,23 @@ peers:
- z-ai/glm-4.7 - z-ai/glm-4.7
- moonshotai/kimi-k2-0905 - moonshotai/kimi-k2-0905
- minimax/minimax-m2.1 - minimax/minimax-m2.1
# filters: a dictionary of filter settings for peer requests
# - optional, default: empty dictionary
# - same capabilities as model filters (stripParams, setParams)
filters:
# stripParams: a comma separated list of parameters to remove from the request
# - optional, default: ""
# - useful for removing parameters that the peer doesn't support
# - the `model` parameter can never be removed
stripParams: "temperature, top_p"
# setParams: a dictionary of parameters to set/override in requests to this peer
# - optional, default: empty dictionary
# - useful for injecting provider-specific settings like data retention policies
# - protected params like "model" cannot be overridden
# - values can be strings, numbers, booleans, arrays, or objects
setParams:
# Example: enforce zero-data-retention for OpenRouter
provider:
data_collection: "deny"
zdr: true
+79 -14
@@ -2,21 +2,37 @@
cd $(dirname "$0") cd $(dirname "$0")
# use this to test locally, example:
# GITHUB_TOKEN=$(gh auth token) LOG_DEBUG=1 DEBUG_ABORT_BUILD=1 ./docker/build-container.sh rocm
# you need read:package scope on the token. Generate a personal access token with
# the scopes: gist, read:org, repo, write:packages
# then: gh auth login (and copy/paste the new token)
log_debug() {
if [ "$LOG_DEBUG" = "1" ]; then
echo "[DEBUG] $*"
fi
}
log_info() {
echo "[INFO] $*"
}
ARCH=$1 ARCH=$1
PUSH_IMAGES=${2:-false} PUSH_IMAGES=${2:-false}
# List of allowed architectures # List of allowed architectures
ALLOWED_ARCHS=("intel" "vulkan" "musa" "cuda" "cpu") ALLOWED_ARCHS=("intel" "vulkan" "musa" "cuda" "cpu" "rocm")
# Check if ARCH is in the allowed list # Check if ARCH is in the allowed list
if [[ ! " ${ALLOWED_ARCHS[@]} " =~ " ${ARCH} " ]]; then if [[ ! " ${ALLOWED_ARCHS[@]} " =~ " ${ARCH} " ]]; then
echo "Error: ARCH must be one of the following: ${ALLOWED_ARCHS[@]}" log_info "Error: ARCH must be one of the following: ${ALLOWED_ARCHS[@]}"
exit 1 exit 1
fi fi
# Check if GITHUB_TOKEN is set and not empty # Check if GITHUB_TOKEN is set and not empty
if [[ -z "$GITHUB_TOKEN" ]]; then if [[ -z "$GITHUB_TOKEN" ]]; then
echo "Error: GITHUB_TOKEN is not set or is empty." log_info "Error: GITHUB_TOKEN is not set or is empty."
exit 1 exit 1
fi fi
@@ -32,25 +48,74 @@ LS_REPO=${GITHUB_REPOSITORY:-mostlygeek/llama-swap}
# have to strip out the 'v' due to .tar.gz file naming # have to strip out the 'v' due to .tar.gz file naming
LS_VER=$(curl -s https://api.github.com/repos/${LS_REPO}/releases/latest | jq -r .tag_name | sed 's/v//') LS_VER=$(curl -s https://api.github.com/repos/${LS_REPO}/releases/latest | jq -r .tag_name | sed 's/v//')
# Fetches the most recent llama.cpp tag matching the given prefix
# Handles pagination to search beyond the first 100 results
# $1 - tag_prefix (e.g., "server" or "server-vulkan")
# Returns: the version number extracted from the tag
fetch_llama_tag() {
local tag_prefix=$1
local page=1
local per_page=100
while true; do
log_debug "Fetching page $page for tag prefix: $tag_prefix"
local response=$(curl -s -H "Authorization: Bearer $GITHUB_TOKEN" \
"https://api.github.com/users/ggml-org/packages/container/llama.cpp/versions?per_page=${per_page}&page=${page}")
# Check for API errors
if echo "$response" | jq -e '.message' > /dev/null 2>&1; then
local error_msg=$(echo "$response" | jq -r '.message')
log_info "GitHub API error: $error_msg"
return 1
fi
# Check if response is empty array (no more pages)
if [ "$(echo "$response" | jq 'length')" -eq 0 ]; then
log_debug "No more pages (empty response)"
return 1
fi
# Extract matching tag from this page
local found_tag=$(echo "$response" | jq -r \
".[] | select(.metadata.container.tags[]? | startswith(\"$tag_prefix\")) | .metadata.container.tags[] | select(startswith(\"$tag_prefix\"))" \
| sort -r | head -n1)
if [ -n "$found_tag" ]; then
log_debug "Found tag: $found_tag on page $page"
echo "$found_tag" | awk -F '-' '{print $NF}'
return 0
fi
page=$((page + 1))
# Safety limit to prevent infinite loops
if [ $page -gt 50 ]; then
log_info "Reached pagination safety limit (50 pages)"
return 1
fi
done
}
if [ "$ARCH" == "cpu" ]; then if [ "$ARCH" == "cpu" ]; then
# cpu only containers just use the server tag LCPP_TAG=$(fetch_llama_tag "server")
LCPP_TAG=$(curl -s -H "Authorization: Bearer $GITHUB_TOKEN" \
"https://api.github.com/users/ggml-org/packages/container/llama.cpp/versions" \
| jq -r '.[] | select(.metadata.container.tags[] | startswith("server")) | .metadata.container.tags[]' \
| sort -r | head -n1 | awk -F '-' '{print $3}')
BASE_TAG=server-${LCPP_TAG} BASE_TAG=server-${LCPP_TAG}
else else
LCPP_TAG=$(curl -s -H "Authorization: Bearer $GITHUB_TOKEN" \ LCPP_TAG=$(fetch_llama_tag "server-${ARCH}")
"https://api.github.com/users/ggml-org/packages/container/llama.cpp/versions" \
| jq -r --arg arch "$ARCH" '.[] | select(.metadata.container.tags[] | startswith("server-\($arch)")) | .metadata.container.tags[]' \
| sort -r | head -n1 | awk -F '-' '{print $3}')
BASE_TAG=server-${ARCH}-${LCPP_TAG} BASE_TAG=server-${ARCH}-${LCPP_TAG}
fi fi
# Abort if LCPP_TAG is empty. # Abort if LCPP_TAG is empty.
if [[ -z "$LCPP_TAG" ]]; then if [[ -z "$LCPP_TAG" ]]; then
echo "Abort: Could not find llama-server container for arch: $ARCH" log_info "Abort: Could not find llama-server container for arch: $ARCH"
exit 1 exit 1
else
log_info "LCPP_TAG: $LCPP_TAG"
fi
if [[ ! -z "$DEBUG_ABORT_BUILD" ]]; then
log_info "Abort: DEBUG_ABORT_BUILD set"
exit 0
fi fi
for CONTAINER_TYPE in non-root root; do for CONTAINER_TYPE in non-root root; do
@@ -68,7 +133,7 @@ for CONTAINER_TYPE in non-root root; do
USER_HOME=/app USER_HOME=/app
fi fi
echo "Building $CONTAINER_TYPE $CONTAINER_TAG $LS_VER" log_info "Building $CONTAINER_TYPE $CONTAINER_TAG $LS_VER"
docker build -f llama-swap.Containerfile --build-arg BASE_TAG=${BASE_TAG} --build-arg LS_VER=${LS_VER} --build-arg UID=${USER_UID} \ docker build -f llama-swap.Containerfile --build-arg BASE_TAG=${BASE_TAG} --build-arg LS_VER=${LS_VER} --build-arg UID=${USER_UID} \
--build-arg LS_REPO=${LS_REPO} --build-arg GID=${USER_GID} --build-arg USER_HOME=${USER_HOME} -t ${CONTAINER_TAG} -t ${CONTAINER_LATEST} \ --build-arg LS_REPO=${LS_REPO} --build-arg GID=${USER_GID} --build-arg USER_HOME=${USER_HOME} -t ${CONTAINER_TAG} -t ${CONTAINER_LATEST} \
--build-arg BASE_IMAGE=${BASE_IMAGE} . --build-arg BASE_IMAGE=${BASE_IMAGE} .
+128 -64
@@ -87,6 +87,7 @@ type GroupConfig struct {
var ( var (
macroNameRegex = regexp.MustCompile(`^[a-zA-Z0-9_-]+$`) macroNameRegex = regexp.MustCompile(`^[a-zA-Z0-9_-]+$`)
macroPatternRegex = regexp.MustCompile(`\$\{([a-zA-Z0-9_-]+)\}`) macroPatternRegex = regexp.MustCompile(`\$\{([a-zA-Z0-9_-]+)\}`)
envMacroRegex = regexp.MustCompile(`\$\{env\.([a-zA-Z_][a-zA-Z0-9_]*)\}`)
) )
// set default values for GroupConfig // set default values for GroupConfig
@@ -183,8 +184,16 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
if err != nil { if err != nil {
return Config{}, err return Config{}, err
} }
yamlStr := string(data)
// default configuration values // Phase 1: Substitute all ${env.VAR} macros at string level
// This is safe because env values are simple strings without YAML formatting
yamlStr, err = substituteEnvMacros(yamlStr)
if err != nil {
return Config{}, err
}
// Unmarshal into full Config with defaults
config := Config{ config := Config{
HealthCheckTimeout: 120, HealthCheckTimeout: 120,
StartPort: 5800, StartPort: 5800,
@@ -193,13 +202,11 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
LogToStdout: LogToStdoutProxy, LogToStdout: LogToStdoutProxy,
MetricsMaxInMemory: 1000, MetricsMaxInMemory: 1000,
} }
err = yaml.Unmarshal(data, &config) if err = yaml.Unmarshal([]byte(yamlStr), &config); err != nil {
if err != nil {
return Config{}, err return Config{}, err
} }
if config.HealthCheckTimeout < 15 { if config.HealthCheckTimeout < 15 {
// set a minimum of 15 seconds
config.HealthCheckTimeout = 15 config.HealthCheckTimeout = 15
} }
@@ -224,55 +231,46 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
} }
} }
/* check macro constraint rules: // Validate global macros
- name must fit the regex ^[a-zA-Z0-9_-]+$
- names must be less than 64 characters (no reason, just cause)
- name can not be any reserved macros: PORT, MODEL_ID
- macro values must be less than 1024 characters
*/
for _, macro := range config.Macros { for _, macro := range config.Macros {
if err = validateMacro(macro.Name, macro.Value); err != nil { if err = validateMacro(macro.Name, macro.Value); err != nil {
return Config{}, err return Config{}, err
} }
} }
// Get and sort all model IDs first, makes testing more consistent // Get and sort all model IDs for consistent port assignment
modelIds := make([]string, 0, len(config.Models)) modelIds := make([]string, 0, len(config.Models))
for modelId := range config.Models { for modelId := range config.Models {
modelIds = append(modelIds, modelId) modelIds = append(modelIds, modelId)
} }
sort.Strings(modelIds) // This guarantees stable iteration order sort.Strings(modelIds)
nextPort := config.StartPort nextPort := config.StartPort
for _, modelId := range modelIds { for _, modelId := range modelIds {
modelConfig := config.Models[modelId] modelConfig := config.Models[modelId]
// Strip comments from command fields before macro expansion // Strip comments from command fields
modelConfig.Cmd = StripComments(modelConfig.Cmd) modelConfig.Cmd = StripComments(modelConfig.Cmd)
modelConfig.CmdStop = StripComments(modelConfig.CmdStop) modelConfig.CmdStop = StripComments(modelConfig.CmdStop)
// validate model macros // Validate model macros
for _, macro := range modelConfig.Macros { for _, macro := range modelConfig.Macros {
if err = validateMacro(macro.Name, macro.Value); err != nil { if err = validateMacro(macro.Name, macro.Value); err != nil {
return Config{}, fmt.Errorf("model %s: %s", modelId, err.Error()) return Config{}, fmt.Errorf("model %s: %s", modelId, err.Error())
} }
} }
// Merge global config and model macros. Model macros take precedence // Build merged macro list: MODEL_ID + global macros + model macros (model overrides global)
mergedMacros := make(MacroList, 0, len(config.Macros)+len(modelConfig.Macros)) mergedMacros := make(MacroList, 0, len(config.Macros)+len(modelConfig.Macros)+1)
mergedMacros = append(mergedMacros, MacroEntry{Name: "MODEL_ID", Value: modelId}) mergedMacros = append(mergedMacros, MacroEntry{Name: "MODEL_ID", Value: modelId})
// Add global macros first
mergedMacros = append(mergedMacros, config.Macros...) mergedMacros = append(mergedMacros, config.Macros...)
// Add model macros (can override global) // Add model macros (override globals with same name)
for _, entry := range modelConfig.Macros { for _, entry := range modelConfig.Macros {
// Remove any existing global macro with same name
found := false found := false
for i, existing := range mergedMacros { for i, existing := range mergedMacros {
if existing.Name == entry.Name { if existing.Name == entry.Name {
mergedMacros[i] = entry // Override mergedMacros[i] = entry
found = true found = true
break break
} }
@@ -282,23 +280,20 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
} }
} }
// First pass: Substitute user-defined macros in reverse order (LIFO - last defined first) // Substitute remaining macros in model fields (LIFO order)
// This allows later macros to reference earlier ones
for i := len(mergedMacros) - 1; i >= 0; i-- { for i := len(mergedMacros) - 1; i >= 0; i-- {
entry := mergedMacros[i] entry := mergedMacros[i]
macroSlug := fmt.Sprintf("${%s}", entry.Name) macroSlug := fmt.Sprintf("${%s}", entry.Name)
macroStr := fmt.Sprintf("%v", entry.Value) macroStr := fmt.Sprintf("%v", entry.Value)
// Substitute in command fields
modelConfig.Cmd = strings.ReplaceAll(modelConfig.Cmd, macroSlug, macroStr) modelConfig.Cmd = strings.ReplaceAll(modelConfig.Cmd, macroSlug, macroStr)
modelConfig.CmdStop = strings.ReplaceAll(modelConfig.CmdStop, macroSlug, macroStr) modelConfig.CmdStop = strings.ReplaceAll(modelConfig.CmdStop, macroSlug, macroStr)
modelConfig.Proxy = strings.ReplaceAll(modelConfig.Proxy, macroSlug, macroStr) modelConfig.Proxy = strings.ReplaceAll(modelConfig.Proxy, macroSlug, macroStr)
modelConfig.CheckEndpoint = strings.ReplaceAll(modelConfig.CheckEndpoint, macroSlug, macroStr) modelConfig.CheckEndpoint = strings.ReplaceAll(modelConfig.CheckEndpoint, macroSlug, macroStr)
modelConfig.Filters.StripParams = strings.ReplaceAll(modelConfig.Filters.StripParams, macroSlug, macroStr) modelConfig.Filters.StripParams = strings.ReplaceAll(modelConfig.Filters.StripParams, macroSlug, macroStr)
// Substitute in metadata (recursive) // Substitute in metadata (type-preserving)
if len(modelConfig.Metadata) > 0 { if len(modelConfig.Metadata) > 0 {
var err error
result, err := substituteMacroInValue(modelConfig.Metadata, entry.Name, entry.Value) result, err := substituteMacroInValue(modelConfig.Metadata, entry.Name, entry.Value)
if err != nil { if err != nil {
return Config{}, fmt.Errorf("model %s metadata: %s", modelId, err.Error()) return Config{}, fmt.Errorf("model %s metadata: %s", modelId, err.Error())
@@ -307,18 +302,14 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
} }
} }
// Final pass: check if PORT macro is needed after macro expansion // Handle PORT macro - only allocate if cmd uses it
// ${PORT} is a resource on the local machine so a new port is only allocated
// if it is required in either cmd or proxy keys
cmdHasPort := strings.Contains(modelConfig.Cmd, "${PORT}") cmdHasPort := strings.Contains(modelConfig.Cmd, "${PORT}")
proxyHasPort := strings.Contains(modelConfig.Proxy, "${PORT}") proxyHasPort := strings.Contains(modelConfig.Proxy, "${PORT}")
if cmdHasPort || proxyHasPort { // either has it if cmdHasPort || proxyHasPort {
if !cmdHasPort && proxyHasPort { // but both don't have it if !cmdHasPort && proxyHasPort {
return Config{}, fmt.Errorf("model %s: proxy uses ${PORT} but cmd does not - ${PORT} is only available when used in cmd", modelId) return Config{}, fmt.Errorf("model %s: proxy uses ${PORT} but cmd does not - ${PORT} is only available when used in cmd", modelId)
} }
// Add PORT macro and substitute it
portEntry := MacroEntry{Name: "PORT", Value: nextPort}
macroSlug := "${PORT}" macroSlug := "${PORT}"
macroStr := fmt.Sprintf("%v", nextPort) macroStr := fmt.Sprintf("%v", nextPort)
@@ -326,10 +317,8 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
modelConfig.CmdStop = strings.ReplaceAll(modelConfig.CmdStop, macroSlug, macroStr) modelConfig.CmdStop = strings.ReplaceAll(modelConfig.CmdStop, macroSlug, macroStr)
modelConfig.Proxy = strings.ReplaceAll(modelConfig.Proxy, macroSlug, macroStr) modelConfig.Proxy = strings.ReplaceAll(modelConfig.Proxy, macroSlug, macroStr)
// Substitute PORT in metadata
if len(modelConfig.Metadata) > 0 { if len(modelConfig.Metadata) > 0 {
var err error result, err := substituteMacroInValue(modelConfig.Metadata, "PORT", nextPort)
result, err := substituteMacroInValue(modelConfig.Metadata, portEntry.Name, portEntry.Value)
if err != nil { if err != nil {
return Config{}, fmt.Errorf("model %s metadata: %s", modelId, err.Error()) return Config{}, fmt.Errorf("model %s metadata: %s", modelId, err.Error())
} }
@@ -339,7 +328,7 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
nextPort++ nextPort++
} }
// make sure there are no unknown macros that have not been replaced // Validate no unknown macros remain
fieldMap := map[string]string{ fieldMap := map[string]string{
"cmd": modelConfig.Cmd, "cmd": modelConfig.Cmd,
"cmdStop": modelConfig.CmdStop, "cmdStop": modelConfig.CmdStop,
@@ -353,35 +342,27 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
for _, match := range matches { for _, match := range matches {
macroName := match[1] macroName := match[1]
if macroName == "PID" && fieldName == "cmdStop" { if macroName == "PID" && fieldName == "cmdStop" {
continue // this is ok, has to be replaced by process later continue // replaced at runtime
} }
// Reserved macros are always valid (they should have been substituted already)
if macroName == "PORT" || macroName == "MODEL_ID" { if macroName == "PORT" || macroName == "MODEL_ID" {
return Config{}, fmt.Errorf("macro '${%s}' should have been substituted in %s.%s", macroName, modelId, fieldName) return Config{}, fmt.Errorf("macro '${%s}' should have been substituted in %s.%s", macroName, modelId, fieldName)
} }
// Any other macro is unknown
return Config{}, fmt.Errorf("unknown macro '${%s}' found in %s.%s", macroName, modelId, fieldName) return Config{}, fmt.Errorf("unknown macro '${%s}' found in %s.%s", macroName, modelId, fieldName)
} }
} }
// Check for unknown macros in metadata
if len(modelConfig.Metadata) > 0 { if len(modelConfig.Metadata) > 0 {
if err := validateMetadataForUnknownMacros(modelConfig.Metadata, modelId); err != nil { if err := validateNestedForUnknownMacros(modelConfig.Metadata, fmt.Sprintf("model %s metadata", modelId)); err != nil {
return Config{}, err return Config{}, err
} }
} }
// Validate the proxy URL.
if _, err := url.Parse(modelConfig.Proxy); err != nil { if _, err := url.Parse(modelConfig.Proxy); err != nil {
return Config{}, fmt.Errorf( return Config{}, fmt.Errorf("model %s: invalid proxy URL: %w", modelId, err)
"model %s: invalid proxy URL: %w", modelId, err,
)
} }
// if sendLoadingState is nil, set it to the global config value
// see #366
if modelConfig.SendLoadingState == nil { if modelConfig.SendLoadingState == nil {
v := config.SendLoadingState // copy it v := config.SendLoadingState
modelConfig.SendLoadingState = &v modelConfig.SendLoadingState = &v
} }
@@ -389,18 +370,17 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
} }
config = AddDefaultGroupToConfig(config) config = AddDefaultGroupToConfig(config)
// check that members are all unique in the groups
memberUsage := make(map[string]string) // maps member to group it appears in // Validate group members
memberUsage := make(map[string]string)
for groupID, groupConfig := range config.Groups { for groupID, groupConfig := range config.Groups {
prevSet := make(map[string]bool) prevSet := make(map[string]bool)
for _, member := range groupConfig.Members { for _, member := range groupConfig.Members {
// Check for duplicates within this group
if _, found := prevSet[member]; found { if _, found := prevSet[member]; found {
return Config{}, fmt.Errorf("duplicate model member %s found in group: %s", member, groupID) return Config{}, fmt.Errorf("duplicate model member %s found in group: %s", member, groupID)
} }
prevSet[member] = true prevSet[member] = true
// Check if member is used in another group
if existingGroup, exists := memberUsage[member]; exists { if existingGroup, exists := memberUsage[member]; exists {
return Config{}, fmt.Errorf("model member %s is used in multiple groups: %s and %s", member, existingGroup, groupID) return Config{}, fmt.Errorf("model member %s is used in multiple groups: %s and %s", member, existingGroup, groupID)
} }
@@ -408,7 +388,7 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
} }
} }
// clean up hooks preload // Clean up hooks preload
if len(config.Hooks.OnStartup.Preload) > 0 { if len(config.Hooks.OnStartup.Preload) > 0 {
var toPreload []string var toPreload []string
for _, modelID := range config.Hooks.OnStartup.Preload { for _, modelID := range config.Hooks.OnStartup.Preload {
@@ -420,19 +400,54 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
toPreload = append(toPreload, real) toPreload = append(toPreload, real)
} }
} }
config.Hooks.OnStartup.Preload = toPreload config.Hooks.OnStartup.Preload = toPreload
} }
// check api keys validatity // Validate API keys (env macros already substituted at string level)
for _, apikey := range config.RequiredAPIKeys { for i, apikey := range config.RequiredAPIKeys {
if apikey == "" { if apikey == "" {
return Config{}, fmt.Errorf("empty api key found in apiKeys") return Config{}, fmt.Errorf("empty api key found in apiKeys")
} }
if strings.Contains(apikey, " ") { if strings.Contains(apikey, " ") {
return Config{}, fmt.Errorf("api key cannot contain spaces: `%s`", apikey) return Config{}, fmt.Errorf("api key cannot contain spaces: `%s`", apikey)
} }
config.RequiredAPIKeys[i] = apikey
}
// Process peers with global macro substitution
for peerName, peerConfig := range config.Peers {
// Substitute global macros (LIFO order)
for i := len(config.Macros) - 1; i >= 0; i-- {
entry := config.Macros[i]
macroSlug := fmt.Sprintf("${%s}", entry.Name)
macroStr := fmt.Sprintf("%v", entry.Value)
peerConfig.ApiKey = strings.ReplaceAll(peerConfig.ApiKey, macroSlug, macroStr)
peerConfig.Filters.StripParams = strings.ReplaceAll(peerConfig.Filters.StripParams, macroSlug, macroStr)
// Substitute in setParams (type-preserving)
if len(peerConfig.Filters.SetParams) > 0 {
result, err := substituteMacroInValue(peerConfig.Filters.SetParams, entry.Name, entry.Value)
if err != nil {
return Config{}, fmt.Errorf("peers.%s.filters.setParams: %w", peerName, err)
}
peerConfig.Filters.SetParams = result.(map[string]any)
}
}
// Validate no unknown macros remain
if matches := macroPatternRegex.FindAllStringSubmatch(peerConfig.ApiKey, -1); len(matches) > 0 {
return Config{}, fmt.Errorf("peers.%s.apiKey: unknown macro '${%s}'", peerName, matches[0][1])
}
if matches := macroPatternRegex.FindAllStringSubmatch(peerConfig.Filters.StripParams, -1); len(matches) > 0 {
return Config{}, fmt.Errorf("peers.%s.filters.stripParams: unknown macro '${%s}'", peerName, matches[0][1])
}
if len(peerConfig.Filters.SetParams) > 0 {
if err := validateNestedForUnknownMacros(peerConfig.Filters.SetParams, fmt.Sprintf("peers.%s.filters.setParams", peerName)); err != nil {
return Config{}, err
}
}
config.Peers[peerName] = peerConfig
} }
return config, nil return config, nil
@@ -565,20 +580,26 @@ func validateMacro(name string, value any) error {
return nil return nil
} }
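The peer loop above walks `config.Macros` from the last entry to the first. A minimal sketch of that reverse-order substitution follows, assuming (this is not confirmed by the diff) that the ordering exists so a later macro can expand into a reference to an earlier one. `macroEntry` and `substituteLIFO` are hypothetical stand-ins, not names from the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// macroEntry is a hypothetical stand-in for the MacroEntry type used in the diff.
type macroEntry struct {
	Name  string
	Value any
}

// substituteLIFO replaces ${NAME} references in s, visiting macros from last
// to first. A later macro whose value mentions an earlier macro is expanded
// first, so the earlier macro is still resolved on a subsequent iteration.
func substituteLIFO(s string, macros []macroEntry) string {
	for i := len(macros) - 1; i >= 0; i-- {
		slug := fmt.Sprintf("${%s}", macros[i].Name)
		s = strings.ReplaceAll(s, slug, fmt.Sprintf("%v", macros[i].Value))
	}
	return s
}

func main() {
	macros := []macroEntry{
		{Name: "BASE", Value: "/data"},
		{Name: "MODELS", Value: "${BASE}/models"}, // refers to the earlier macro
	}
	fmt.Println(substituteLIFO("${MODELS}/test.gguf", macros))
}
```

Walking the list front to back would leave `${BASE}` unresolved in this example, because `BASE` would already have been processed by the time `MODELS` expands.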
// validateMetadataForUnknownMacros recursively checks for any remaining macro references in metadata // validateNestedForUnknownMacros recursively checks for any remaining macro references in nested structures
func validateMetadataForUnknownMacros(value any, modelId string) error { func validateNestedForUnknownMacros(value any, context string) error {
switch v := value.(type) { switch v := value.(type) {
case string: case string:
matches := macroPatternRegex.FindAllStringSubmatch(v, -1) matches := macroPatternRegex.FindAllStringSubmatch(v, -1)
for _, match := range matches { for _, match := range matches {
macroName := match[1] macroName := match[1]
return fmt.Errorf("model %s metadata: unknown macro '${%s}'", modelId, macroName) return fmt.Errorf("%s: unknown macro '${%s}'", context, macroName)
}
// Check for unsubstituted env macros
envMatches := envMacroRegex.FindAllStringSubmatch(v, -1)
for _, match := range envMatches {
varName := match[1]
return fmt.Errorf("%s: environment variable '%s' not set", context, varName)
} }
return nil return nil
case map[string]any: case map[string]any:
for _, val := range v { for _, val := range v {
if err := validateMetadataForUnknownMacros(val, modelId); err != nil { if err := validateNestedForUnknownMacros(val, context); err != nil {
return err return err
} }
} }
@@ -586,7 +607,7 @@ func validateMetadataForUnknownMacros(value any, modelId string) error {
case []any: case []any:
for _, val := range v { for _, val := range v {
if err := validateMetadataForUnknownMacros(val, modelId); err != nil { if err := validateNestedForUnknownMacros(val, context); err != nil {
return err return err
} }
} }
@@ -645,3 +666,46 @@ func substituteMacroInValue(value any, macroName string, macroValue any) (any, e
return value, nil return value, nil
} }
} }
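The body of `substituteMacroInValue` is not shown in this diff, but the peer tests expect `max_tokens: "${MAX_TOKENS}"` to come back as the integer `4096`. A rough sketch of one way to get that type-preserving behaviour is below. `substituteTyped` is a hypothetical name, and the rule "a string that is exactly the macro reference takes the macro's own type" is an assumption, not the project's confirmed logic.

```go
package main

import (
	"fmt"
	"strings"
)

// substituteTyped is a hypothetical, simplified take on type-preserving
// substitution. A string that is exactly "${NAME}" is replaced by the macro
// value itself (keeping its type); a string that merely contains the
// reference gets a textual replacement. Maps and slices are walked recursively.
func substituteTyped(value any, name string, macroValue any) any {
	slug := "${" + name + "}"
	switch v := value.(type) {
	case string:
		if v == slug {
			return macroValue
		}
		return strings.ReplaceAll(v, slug, fmt.Sprintf("%v", macroValue))
	case map[string]any:
		out := make(map[string]any, len(v))
		for k, val := range v {
			out[k] = substituteTyped(val, name, macroValue)
		}
		return out
	case []any:
		out := make([]any, len(v))
		for i, val := range v {
			out[i] = substituteTyped(val, name, macroValue)
		}
		return out
	default:
		// Numbers, booleans and other scalars carry no macro references.
		return v
	}
}

func main() {
	params := map[string]any{
		"max_tokens": "${MAX_TOKENS}",
		"note":       "limit ${MAX_TOKENS}",
	}
	result := substituteTyped(params, "MAX_TOKENS", 4096).(map[string]any)
	fmt.Printf("%T %v\n", result["max_tokens"], result["max_tokens"])
	fmt.Println(result["note"])
}
```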
// substituteEnvMacros replaces ${env.VAR_NAME} with environment variable values
// Returns error if any env var is not set or contains invalid characters
func substituteEnvMacros(s string) (string, error) {
result := s
matches := envMacroRegex.FindAllStringSubmatch(s, -1)
for _, match := range matches {
fullMatch := match[0] // ${env.VAR_NAME}
varName := match[1] // VAR_NAME
value, exists := os.LookupEnv(varName)
if !exists {
return "", fmt.Errorf("environment variable '%s' is not set", varName)
}
// Sanitize the value for safe YAML substitution
value, err := sanitizeEnvValueForYAML(value, varName)
if err != nil {
return "", err
}
result = strings.ReplaceAll(result, fullMatch, value)
}
return result, nil
}
// sanitizeEnvValueForYAML ensures an environment variable value is safe for YAML substitution.
// It rejects values with characters that break YAML structure and escapes quotes/backslashes
// for compatibility with double-quoted YAML strings.
func sanitizeEnvValueForYAML(value, varName string) (string, error) {
// Reject values that would break YAML structure regardless of quoting context
if strings.ContainsAny(value, "\n\r\x00") {
return "", fmt.Errorf("environment variable '%s' contains newlines or null bytes which are not allowed in YAML substitution", varName)
}
// Escape backslashes and double quotes for safe use in double-quoted YAML strings.
// In unquoted contexts, these escapes appear literally (harmless for most use cases).
// In double-quoted contexts, they are interpreted correctly.
value = strings.ReplaceAll(value, `\`, `\\`)
value = strings.ReplaceAll(value, `"`, `\"`)
return value, nil
}
+500
@@ -809,3 +809,503 @@ func TestConfig_APIKeys_Invalid(t *testing.T) {
}) })
} }
} }
func TestConfig_APIKeys_EnvMacros(t *testing.T) {
t.Run("env substitution in apiKeys", func(t *testing.T) {
t.Setenv("TEST_API_KEY", "secret-key-123")
content := `apiKeys: ["${env.TEST_API_KEY}"]`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, []string{"secret-key-123"}, config.RequiredAPIKeys)
})
t.Run("multiple env substitutions in apiKeys", func(t *testing.T) {
t.Setenv("TEST_API_KEY_1", "key-one")
t.Setenv("TEST_API_KEY_2", "key-two")
content := `apiKeys: ["${env.TEST_API_KEY_1}", "${env.TEST_API_KEY_2}", "static-key"]`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, []string{"key-one", "key-two", "static-key"}, config.RequiredAPIKeys)
})
t.Run("missing env var in apiKeys", func(t *testing.T) {
content := `apiKeys: ["${env.NONEXISTENT_API_KEY}"]`
_, err := LoadConfigFromReader(strings.NewReader(content))
assert.Error(t, err)
// With string-level env substitution, error only includes var name
assert.Contains(t, err.Error(), "NONEXISTENT_API_KEY")
})
t.Run("env substitution results in empty key", func(t *testing.T) {
t.Setenv("TEST_EMPTY_KEY", "")
content := `apiKeys: ["${env.TEST_EMPTY_KEY}"]`
_, err := LoadConfigFromReader(strings.NewReader(content))
assert.Error(t, err)
assert.Equal(t, "empty api key found in apiKeys", err.Error())
})
}
func TestConfig_EnvMacros(t *testing.T) {
t.Run("basic env substitution in cmd", func(t *testing.T) {
t.Setenv("TEST_MODEL_PATH", "/opt/models")
content := `
models:
test:
cmd: "${env.TEST_MODEL_PATH}/llama-server"
proxy: "http://localhost:8080"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "/opt/models/llama-server", config.Models["test"].Cmd)
})
t.Run("env substitution in multiple fields", func(t *testing.T) {
t.Setenv("TEST_HOST", "myserver")
t.Setenv("TEST_PORT", "9999")
content := `
models:
test:
cmd: "server --host ${env.TEST_HOST}"
proxy: "http://${env.TEST_HOST}:${env.TEST_PORT}"
checkEndpoint: "http://${env.TEST_HOST}/health"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "server --host myserver", config.Models["test"].Cmd)
assert.Equal(t, "http://myserver:9999", config.Models["test"].Proxy)
assert.Equal(t, "http://myserver/health", config.Models["test"].CheckEndpoint)
})
t.Run("env in global macro value", func(t *testing.T) {
t.Setenv("TEST_BASE_PATH", "/usr/local")
content := `
macros:
SERVER_PATH: "${env.TEST_BASE_PATH}/bin/server"
models:
test:
cmd: "${SERVER_PATH} --port 8080"
proxy: "http://localhost:8080"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "/usr/local/bin/server --port 8080", config.Models["test"].Cmd)
})
t.Run("env in model-level macro value", func(t *testing.T) {
t.Setenv("TEST_MODEL_DIR", "/models/llama")
content := `
models:
test:
macros:
MODEL_FILE: "${env.TEST_MODEL_DIR}/model.gguf"
cmd: "server --model ${MODEL_FILE}"
proxy: "http://localhost:8080"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "server --model /models/llama/model.gguf", config.Models["test"].Cmd)
})
t.Run("env in metadata", func(t *testing.T) {
t.Setenv("TEST_API_KEY", "secret123")
content := `
models:
test:
cmd: "server"
proxy: "http://localhost:8080"
metadata:
api_key: "${env.TEST_API_KEY}"
nested:
key: "${env.TEST_API_KEY}"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "secret123", config.Models["test"].Metadata["api_key"])
nested := config.Models["test"].Metadata["nested"].(map[string]any)
assert.Equal(t, "secret123", nested["key"])
})
t.Run("env in filters.stripParams", func(t *testing.T) {
t.Setenv("TEST_STRIP_PARAMS", "temperature,top_p")
content := `
models:
test:
cmd: "server"
proxy: "http://localhost:8080"
filters:
stripParams: "${env.TEST_STRIP_PARAMS}"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "temperature,top_p", config.Models["test"].Filters.StripParams)
})
t.Run("env in cmdStop", func(t *testing.T) {
t.Setenv("TEST_KILL_SIGNAL", "SIGTERM")
content := `
models:
test:
cmd: "server --port ${PORT}"
cmdStop: "kill -${env.TEST_KILL_SIGNAL} ${PID}"
proxy: "http://localhost:${PORT}"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Contains(t, config.Models["test"].CmdStop, "-SIGTERM")
})
t.Run("missing env var returns error", func(t *testing.T) {
content := `
models:
test:
cmd: "${env.UNDEFINED_VAR_12345}/server"
proxy: "http://localhost:8080"
`
_, err := LoadConfigFromReader(strings.NewReader(content))
if assert.Error(t, err) {
assert.Contains(t, err.Error(), "UNDEFINED_VAR_12345")
assert.Contains(t, err.Error(), "not set")
}
})
t.Run("missing env var in global macro", func(t *testing.T) {
content := `
macros:
PATH: "${env.UNDEFINED_GLOBAL_VAR}"
models:
test:
cmd: "server"
proxy: "http://localhost:8080"
`
_, err := LoadConfigFromReader(strings.NewReader(content))
if assert.Error(t, err) {
assert.Contains(t, err.Error(), "UNDEFINED_GLOBAL_VAR")
assert.Contains(t, err.Error(), "not set")
}
})
t.Run("missing env var in model macro", func(t *testing.T) {
content := `
models:
test:
macros:
MY_PATH: "${env.UNDEFINED_MODEL_VAR}"
cmd: "server"
proxy: "http://localhost:8080"
`
_, err := LoadConfigFromReader(strings.NewReader(content))
if assert.Error(t, err) {
assert.Contains(t, err.Error(), "UNDEFINED_MODEL_VAR")
assert.Contains(t, err.Error(), "not set")
}
})
t.Run("missing env var in metadata", func(t *testing.T) {
content := `
models:
test:
cmd: "server"
proxy: "http://localhost:8080"
metadata:
key: "${env.UNDEFINED_META_VAR}"
`
_, err := LoadConfigFromReader(strings.NewReader(content))
if assert.Error(t, err) {
assert.Contains(t, err.Error(), "UNDEFINED_META_VAR")
assert.Contains(t, err.Error(), "not set")
}
})
t.Run("env combined with regular macros", func(t *testing.T) {
t.Setenv("TEST_ROOT", "/data")
content := `
macros:
MODEL_BASE: "${env.TEST_ROOT}/models"
models:
test:
cmd: "server --model ${MODEL_BASE}/${MODEL_ID}.gguf"
proxy: "http://localhost:8080"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "server --model /data/models/test.gguf", config.Models["test"].Cmd)
})
t.Run("multiple env vars in same string", func(t *testing.T) {
t.Setenv("TEST_USER", "admin")
t.Setenv("TEST_PASS", "secret")
content := `
models:
test:
cmd: "server --auth ${env.TEST_USER}:${env.TEST_PASS}"
proxy: "http://localhost:8080"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "server --auth admin:secret", config.Models["test"].Cmd)
})
t.Run("env value with newline is rejected", func(t *testing.T) {
t.Setenv("TEST_MULTILINE", "line1\nline2")
content := `
models:
test:
cmd: "server --config ${env.TEST_MULTILINE}"
proxy: "http://localhost:8080"
`
_, err := LoadConfigFromReader(strings.NewReader(content))
if assert.Error(t, err) {
assert.Contains(t, err.Error(), "TEST_MULTILINE")
assert.Contains(t, err.Error(), "newlines")
}
})
t.Run("env value with carriage return is rejected", func(t *testing.T) {
t.Setenv("TEST_CR", "line1\rline2")
content := `
models:
test:
cmd: "server --config ${env.TEST_CR}"
proxy: "http://localhost:8080"
`
_, err := LoadConfigFromReader(strings.NewReader(content))
if assert.Error(t, err) {
assert.Contains(t, err.Error(), "TEST_CR")
assert.Contains(t, err.Error(), "newlines")
}
})
t.Run("env value with quotes is escaped for YAML", func(t *testing.T) {
t.Setenv("TEST_QUOTED", `value with "quotes"`)
content := `
models:
test:
cmd: "server --arg \"${env.TEST_QUOTED}\""
proxy: "http://localhost:8080"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
// Quotes are escaped before YAML parsing, then YAML unescapes them
// Final result preserves the original value with quotes
assert.Contains(t, config.Models["test"].Cmd, `"quotes"`)
})
t.Run("env value with backslash is escaped for YAML", func(t *testing.T) {
t.Setenv("TEST_BACKSLASH", `path\to\file`)
content := `
models:
test:
cmd: "server --path \"${env.TEST_BACKSLASH}\""
proxy: "http://localhost:8080"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
// Backslashes are escaped before YAML parsing, then YAML unescapes them
// Final result preserves the original value with backslashes
assert.Contains(t, config.Models["test"].Cmd, `path\to\file`)
})
}
func TestConfig_PeerApiKey_EnvMacros(t *testing.T) {
t.Run("env substitution in peer apiKey", func(t *testing.T) {
t.Setenv("TEST_PEER_API_KEY", "sk-peer-secret-123")
content := `
peers:
openrouter:
proxy: https://openrouter.ai/api
apiKey: "${env.TEST_PEER_API_KEY}"
models:
- llama-3.1-8b
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "sk-peer-secret-123", config.Peers["openrouter"].ApiKey)
})
t.Run("missing env var in peer apiKey", func(t *testing.T) {
content := `
peers:
openrouter:
proxy: https://openrouter.ai/api
apiKey: "${env.NONEXISTENT_PEER_KEY}"
models:
- llama-3.1-8b
`
_, err := LoadConfigFromReader(strings.NewReader(content))
assert.Error(t, err)
// With string-level env substitution, error only includes var name
assert.Contains(t, err.Error(), "NONEXISTENT_PEER_KEY")
})
t.Run("static apiKey unchanged", func(t *testing.T) {
content := `
peers:
openrouter:
proxy: https://openrouter.ai/api
apiKey: sk-static-key
models:
- llama-3.1-8b
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "sk-static-key", config.Peers["openrouter"].ApiKey)
})
t.Run("multiple peers with env apiKeys", func(t *testing.T) {
t.Setenv("TEST_PEER_KEY_1", "key-one")
t.Setenv("TEST_PEER_KEY_2", "key-two")
content := `
peers:
peer1:
proxy: https://peer1.example.com
apiKey: "${env.TEST_PEER_KEY_1}"
models:
- model-a
peer2:
proxy: https://peer2.example.com
apiKey: "${env.TEST_PEER_KEY_2}"
models:
- model-b
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "key-one", config.Peers["peer1"].ApiKey)
assert.Equal(t, "key-two", config.Peers["peer2"].ApiKey)
})
t.Run("global macro substitution in peer apiKey", func(t *testing.T) {
content := `
macros:
API_KEY: sk-from-global-macro
peers:
openrouter:
proxy: https://openrouter.ai/api
apiKey: "${API_KEY}"
models:
- llama-3.1-8b
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "sk-from-global-macro", config.Peers["openrouter"].ApiKey)
})
t.Run("global macro in peer filters.stripParams", func(t *testing.T) {
content := `
macros:
STRIP_LIST: "temperature, top_p"
peers:
openrouter:
proxy: https://openrouter.ai/api
models:
- llama-3.1-8b
filters:
stripParams: "${STRIP_LIST}"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "temperature, top_p", config.Peers["openrouter"].Filters.StripParams)
})
t.Run("global macro in peer filters.setParams", func(t *testing.T) {
content := `
macros:
MAX_TOKENS: 4096
peers:
openrouter:
proxy: https://openrouter.ai/api
models:
- llama-3.1-8b
filters:
setParams:
max_tokens: "${MAX_TOKENS}"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, 4096, config.Peers["openrouter"].Filters.SetParams["max_tokens"])
})
t.Run("env macro in peer filters.setParams", func(t *testing.T) {
t.Setenv("TEST_RETENTION_POLICY", "deny")
content := `
peers:
openrouter:
proxy: https://openrouter.ai/api
models:
- llama-3.1-8b
filters:
setParams:
data_collection: "${env.TEST_RETENTION_POLICY}"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "deny", config.Peers["openrouter"].Filters.SetParams["data_collection"])
})
t.Run("env macro in peer filters.stripParams", func(t *testing.T) {
t.Setenv("TEST_STRIP_PARAMS", "frequency_penalty, presence_penalty")
content := `
peers:
openrouter:
proxy: https://openrouter.ai/api
models:
- llama-3.1-8b
filters:
stripParams: "${env.TEST_STRIP_PARAMS}"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "frequency_penalty, presence_penalty", config.Peers["openrouter"].Filters.StripParams)
})
t.Run("unknown macro in peer apiKey fails", func(t *testing.T) {
content := `
peers:
openrouter:
proxy: https://openrouter.ai/api
apiKey: "${UNDEFINED_MACRO}"
models:
- llama-3.1-8b
`
_, err := LoadConfigFromReader(strings.NewReader(content))
assert.Error(t, err)
assert.Contains(t, err.Error(), "peers.openrouter.apiKey")
assert.Contains(t, err.Error(), "unknown macro")
})
t.Run("unknown macro in peer filters.setParams fails", func(t *testing.T) {
content := `
peers:
openrouter:
proxy: https://openrouter.ai/api
models:
- llama-3.1-8b
filters:
setParams:
value: "${UNDEFINED_MACRO}"
`
_, err := LoadConfigFromReader(strings.NewReader(content))
assert.Error(t, err)
assert.Contains(t, err.Error(), "peers.openrouter.filters.setParams")
assert.Contains(t, err.Error(), "unknown macro")
})
}
+81
@@ -0,0 +1,81 @@
package config
import (
"slices"
"sort"
"strings"
)
// ProtectedParams is a list of parameters that cannot be set or stripped via filters
// These are protected to prevent breaking the proxy's ability to route requests correctly
var ProtectedParams = []string{"model"}
// Filters contains filter settings for modifying request parameters
// Used by both models and peers
type Filters struct {
// StripParams is a comma-separated list of parameters to remove from requests
// The "model" parameter can never be removed
StripParams string `yaml:"stripParams"`
// SetParams is a dictionary of parameters to set/override in requests
// Protected params (like "model") cannot be set
SetParams map[string]any `yaml:"setParams"`
}
// SanitizedStripParams returns a sorted list of parameters to strip,
// with duplicates, empty strings, and protected params removed
func (f Filters) SanitizedStripParams() []string {
if f.StripParams == "" {
return nil
}
params := strings.Split(f.StripParams, ",")
cleaned := make([]string, 0, len(params))
seen := make(map[string]bool)
for _, param := range params {
trimmed := strings.TrimSpace(param)
// Skip protected params, empty strings, and duplicates
if slices.Contains(ProtectedParams, trimmed) || trimmed == "" || seen[trimmed] {
continue
}
seen[trimmed] = true
cleaned = append(cleaned, trimmed)
}
if len(cleaned) == 0 {
return nil
}
slices.Sort(cleaned)
return cleaned
}
// SanitizedSetParams returns a copy of SetParams with protected params removed
// and keys sorted for consistent iteration order
func (f Filters) SanitizedSetParams() (map[string]any, []string) {
if len(f.SetParams) == 0 {
return nil, nil
}
result := make(map[string]any, len(f.SetParams))
keys := make([]string, 0, len(f.SetParams))
for key, value := range f.SetParams {
// Skip protected params
if slices.Contains(ProtectedParams, key) {
continue
}
result[key] = value
keys = append(keys, key)
}
// Sort keys for consistent ordering
sort.Strings(keys)
if len(result) == 0 {
return nil, nil
}
return result, keys
}
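This file only sanitizes the filter settings; how the proxy applies them to a request is not part of the diff. As a rough illustration, the sketch below feeds values shaped like the outputs of `SanitizedStripParams` and `SanitizedSetParams` into a hypothetical `applyFilters` helper that edits a decoded JSON body. The strip-then-set order is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// applyFilters is a HYPOTHETICAL helper, not taken from the repository.
// strip is the list returned by SanitizedStripParams; set and setKeys are the
// map and sorted key list returned by SanitizedSetParams. Parameters are
// removed first, then overrides are written in sorted key order.
func applyFilters(body map[string]any, strip []string, set map[string]any, setKeys []string) {
	for _, name := range strip {
		delete(body, name)
	}
	for _, key := range setKeys {
		body[key] = set[key]
	}
}

func main() {
	body := map[string]any{
		"model":       "llama-3.1-8b",
		"temperature": 1.2,
		"top_k":       40,
	}
	// "model" never appears here because the sanitizers drop protected params.
	applyFilters(body,
		[]string{"top_k"},
		map[string]any{"temperature": 0.7},
		[]string{"temperature"},
	)
	out, _ := json.Marshal(body)
	fmt.Println(string(out))
}
```

Because both sanitizers filter out `ProtectedParams`, a helper like this never needs its own check for `model`.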
+168
@@ -0,0 +1,168 @@
package config
import (
"testing"
"github.com/stretchr/testify/assert"
)
func TestFilters_SanitizedStripParams(t *testing.T) {
tests := []struct {
name string
stripParams string
want []string
}{
{
name: "empty string",
stripParams: "",
want: nil,
},
{
name: "single param",
stripParams: "temperature",
want: []string{"temperature"},
},
{
name: "multiple params",
stripParams: "temperature, top_p, top_k",
want: []string{"temperature", "top_k", "top_p"}, // sorted
},
{
name: "model param filtered",
stripParams: "model, temperature, top_p",
want: []string{"temperature", "top_p"},
},
{
name: "only model param",
stripParams: "model",
want: nil,
},
{
name: "duplicates removed",
stripParams: "temperature, top_p, temperature",
want: []string{"temperature", "top_p"},
},
{
name: "extra whitespace",
stripParams: " temperature , top_p ",
want: []string{"temperature", "top_p"},
},
{
name: "empty values filtered",
stripParams: "temperature,,top_p,",
want: []string{"temperature", "top_p"},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
f := Filters{StripParams: tt.stripParams}
got := f.SanitizedStripParams()
assert.Equal(t, tt.want, got)
})
}
}
func TestFilters_SanitizedSetParams(t *testing.T) {
tests := []struct {
name string
setParams map[string]any
wantParams map[string]any
wantKeys []string
}{
{
name: "empty setParams",
setParams: nil,
wantParams: nil,
wantKeys: nil,
},
{
name: "empty map",
setParams: map[string]any{},
wantParams: nil,
wantKeys: nil,
},
{
name: "normal params",
setParams: map[string]any{
"temperature": 0.7,
"top_p": 0.9,
},
wantParams: map[string]any{
"temperature": 0.7,
"top_p": 0.9,
},
wantKeys: []string{"temperature", "top_p"},
},
{
name: "protected model param filtered",
setParams: map[string]any{
"model": "should-be-filtered",
"temperature": 0.7,
},
wantParams: map[string]any{
"temperature": 0.7,
},
wantKeys: []string{"temperature"},
},
{
name: "only protected param",
setParams: map[string]any{
"model": "should-be-filtered",
},
wantParams: nil,
wantKeys: nil,
},
{
name: "complex nested values",
setParams: map[string]any{
"provider": map[string]any{
"data_collection": "deny",
"allow_fallbacks": false,
},
"transforms": []string{"middle-out"},
},
wantParams: map[string]any{
"provider": map[string]any{
"data_collection": "deny",
"allow_fallbacks": false,
},
"transforms": []string{"middle-out"},
},
wantKeys: []string{"provider", "transforms"},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
f := Filters{SetParams: tt.setParams}
gotParams, gotKeys := f.SanitizedSetParams()
assert.Equal(t, len(tt.wantKeys), len(gotKeys), "keys length mismatch")
for i, key := range gotKeys {
assert.Equal(t, tt.wantKeys[i], key, "key mismatch at %d", i)
}
if tt.wantParams == nil {
assert.Nil(t, gotParams, "expected nil params")
return
}
assert.Equal(t, len(tt.wantParams), len(gotParams), "params length mismatch")
for key, wantValue := range tt.wantParams {
gotValue, exists := gotParams[key]
assert.True(t, exists, "missing key: %s", key)
// Simple comparison for basic types
switch v := wantValue.(type) {
case string, int, float64, bool:
assert.Equal(t, v, gotValue, "value mismatch for key %s", key)
}
}
})
}
}
func TestProtectedParams(t *testing.T) {
// Verify that "model" is protected
assert.Contains(t, ProtectedParams, "model")
}
+7 -27
@@ -3,8 +3,6 @@ package config
import ( import (
"errors" "errors"
"runtime" "runtime"
"slices"
"strings"
) )
type ModelConfig struct { type ModelConfig struct {
@@ -74,16 +72,15 @@ func (m *ModelConfig) SanitizedCommand() ([]string, error) {
return SanitizeCommand(m.Cmd) return SanitizeCommand(m.Cmd)
} }
// ModelFilters see issue #174 // ModelFilters embeds Filters and adds legacy support for strip_params field
// See issue #174
type ModelFilters struct { type ModelFilters struct {
StripParams string `yaml:"stripParams"` Filters `yaml:",inline"`
} }
func (m *ModelFilters) UnmarshalYAML(unmarshal func(interface{}) error) error { func (m *ModelFilters) UnmarshalYAML(unmarshal func(interface{}) error) error {
type rawModelFilters ModelFilters type rawModelFilters ModelFilters
defaults := rawModelFilters{ defaults := rawModelFilters{}
StripParams: "",
}
if err := unmarshal(&defaults); err != nil { if err := unmarshal(&defaults); err != nil {
return err return err
@@ -104,25 +101,8 @@ func (m *ModelFilters) UnmarshalYAML(unmarshal func(interface{}) error) error {
return nil return nil
} }
// SanitizedStripParams wraps Filters.SanitizedStripParams for backwards compatibility
// Returns ([]string, error) to match existing API
func (f ModelFilters) SanitizedStripParams() ([]string, error) { func (f ModelFilters) SanitizedStripParams() ([]string, error) {
if f.StripParams == "" { return f.Filters.SanitizedStripParams(), nil
return nil, nil
}
params := strings.Split(f.StripParams, ",")
cleaned := make([]string, 0, len(params))
seen := make(map[string]bool)
for _, param := range params {
trimmed := strings.TrimSpace(param)
if trimmed == "model" || trimmed == "" || seen[trimmed] {
continue
}
seen[trimmed] = true
cleaned = append(cleaned, trimmed)
}
// sort cleaned
slices.Sort(cleaned)
return cleaned, nil
} }
+32
@@ -72,3 +72,35 @@ models:
assert.True(t, *config.Models["model2"].SendLoadingState) assert.True(t, *config.Models["model2"].SendLoadingState)
} }
} }
func TestConfig_ModelFiltersWithSetParams(t *testing.T) {
content := `
models:
model1:
cmd: path/to/cmd --port ${PORT}
filters:
stripParams: "top_k"
setParams:
temperature: 0.7
top_p: 0.9
stop:
- "<|end|>"
- "<|stop|>"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
modelConfig := config.Models["model1"]
// Check stripParams
stripParams, err := modelConfig.Filters.SanitizedStripParams()
assert.NoError(t, err)
assert.Equal(t, []string{"top_k"}, stripParams)
// Check setParams
setParams, keys := modelConfig.Filters.SanitizedSetParams()
assert.NotNil(t, setParams)
assert.Equal(t, []string{"stop", "temperature", "top_p"}, keys)
assert.Equal(t, 0.7, setParams["temperature"])
assert.Equal(t, 0.9, setParams["top_p"])
}
+2
@@ -11,6 +11,7 @@ type PeerConfig struct {
ProxyURL *url.URL `yaml:"-"` ProxyURL *url.URL `yaml:"-"`
ApiKey string `yaml:"apiKey"` ApiKey string `yaml:"apiKey"`
Models []string `yaml:"models"` Models []string `yaml:"models"`
Filters Filters `yaml:"filters"`
} }
func (c *PeerConfig) UnmarshalYAML(unmarshal func(interface{}) error) error { func (c *PeerConfig) UnmarshalYAML(unmarshal func(interface{}) error) error {
@@ -19,6 +20,7 @@ func (c *PeerConfig) UnmarshalYAML(unmarshal func(interface{}) error) error {
Proxy: "", Proxy: "",
ApiKey: "", ApiKey: "",
Models: []string{}, Models: []string{},
Filters: Filters{},
} }
if err := unmarshal(&defaults); err != nil { if err := unmarshal(&defaults); err != nil {
+70
@@ -137,3 +137,73 @@ func searchSubstring(s, substr string) bool {
} }
return false return false
} }
func TestPeerConfig_WithFilters(t *testing.T) {
yamlData := `
proxy: https://openrouter.ai/api
apiKey: sk-test
models:
- model_a
filters:
setParams:
temperature: 0.7
provider:
data_collection: deny
`
var config PeerConfig
err := yaml.Unmarshal([]byte(yamlData), &config)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if config.Filters.SetParams == nil {
t.Fatal("Filters.SetParams should not be nil")
}
if config.Filters.SetParams["temperature"] != 0.7 {
t.Errorf("expected temperature 0.7, got %v", config.Filters.SetParams["temperature"])
}
provider, ok := config.Filters.SetParams["provider"].(map[string]any)
if !ok {
t.Fatal("provider should be a map")
}
if provider["data_collection"] != "deny" {
t.Errorf("expected data_collection deny, got %v", provider["data_collection"])
}
}
func TestPeerConfig_WithBothFilters(t *testing.T) {
yamlData := `
proxy: https://openrouter.ai/api
apiKey: sk-test
models:
- model_a
filters:
stripParams: "temperature, top_p"
setParams:
max_tokens: 1000
`
var config PeerConfig
err := yaml.Unmarshal([]byte(yamlData), &config)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
// Check stripParams
stripParams := config.Filters.SanitizedStripParams()
if len(stripParams) != 2 {
t.Errorf("expected 2 strip params, got %d", len(stripParams))
}
if stripParams[0] != "temperature" || stripParams[1] != "top_p" {
t.Errorf("unexpected strip params: %v", stripParams)
}
// Check setParams
if config.Filters.SetParams == nil {
t.Fatal("Filters.SetParams should not be nil")
}
if config.Filters.SetParams["max_tokens"] != 1000 {
t.Errorf("expected max_tokens 1000, got %v", config.Filters.SetParams["max_tokens"])
}
}
+14
@@ -106,6 +106,20 @@ func (p *PeerProxy) HasPeerModel(modelID string) bool {
return found
}
// GetPeerFilters returns the filters for a peer model, or empty filters if not found
func (p *PeerProxy) GetPeerFilters(modelID string) config.Filters {
pp, found := p.proxyMap[modelID]
if !found {
return config.Filters{}
}
// Get the peer config using the peerID
peer, found := p.peers[pp.peerID]
if !found {
return config.Filters{}
}
return peer.Filters
}
func (p *PeerProxy) ListPeers() config.PeerDictionaryConfig {
return p.peers
}
+44 -1
@@ -282,6 +282,8 @@ func (pm *ProxyManager) setupGinEngine() {
pm.ginEngine.POST("/v1/completions", pm.apiKeyAuth(), pm.proxyInferenceHandler)
// Support anthropic /v1/messages (added https://github.com/ggml-org/llama.cpp/pull/17570)
pm.ginEngine.POST("/v1/messages", pm.apiKeyAuth(), pm.proxyInferenceHandler)
// Support anthropic count_tokens API (Also added in the above PR)
pm.ginEngine.POST("/v1/messages/count_tokens", pm.apiKeyAuth(), pm.proxyInferenceHandler)
// Support embeddings and reranking
pm.ginEngine.POST("/v1/embeddings", pm.apiKeyAuth(), pm.proxyInferenceHandler)
@@ -650,13 +652,49 @@ func (pm *ProxyManager) proxyInferenceHandler(c *gin.Context) {
}
}
// issue #453 set/override parameters in the JSON body
setParams, setParamKeys := pm.config.Models[modelID].Filters.SanitizedSetParams()
for _, key := range setParamKeys {
pm.proxyLogger.Debugf("<%s> setting param: %s", modelID, key)
bodyBytes, err = sjson.SetBytes(bodyBytes, key, setParams[key])
if err != nil {
pm.sendErrorResponse(c, http.StatusInternalServerError, fmt.Sprintf("error setting parameter %s in request", key))
return
}
}
pm.proxyLogger.Debugf("ProxyManager using local Process for model: %s", requestedModel)
nextHandler = processGroup.ProxyRequest
} else if pm.peerProxy != nil && pm.peerProxy.HasPeerModel(requestedModel) {
pm.proxyLogger.Debugf("ProxyManager using ProxyPeer for model: %s", requestedModel)
modelID = requestedModel
nextHandler = pm.peerProxy.ProxyRequest
// issue #453 apply filters for peer requests
peerFilters := pm.peerProxy.GetPeerFilters(requestedModel)
// Apply stripParams - remove specified parameters from request
stripParams := peerFilters.SanitizedStripParams()
for _, param := range stripParams {
pm.proxyLogger.Debugf("<%s> stripping param: %s", requestedModel, param)
bodyBytes, err = sjson.DeleteBytes(bodyBytes, param)
if err != nil {
pm.sendErrorResponse(c, http.StatusInternalServerError, fmt.Sprintf("error stripping parameter %s from request", param))
return
}
}
// Apply setParams - set/override specified parameters in request
setParams, setParamKeys := peerFilters.SanitizedSetParams()
for _, key := range setParamKeys {
pm.proxyLogger.Debugf("<%s> setting param: %s", requestedModel, key)
bodyBytes, err = sjson.SetBytes(bodyBytes, key, setParams[key])
if err != nil {
pm.sendErrorResponse(c, http.StatusInternalServerError, fmt.Sprintf("error setting parameter %s in request", key))
return
}
}
nextHandler = pm.peerProxy.ProxyRequest
}
if nextHandler == nil {
@@ -894,6 +932,11 @@ func (pm *ProxyManager) listRunningProcessesHandler(context *gin.Context) {
runningProcesses = append(runningProcesses, gin.H{
"model": process.ID,
"state": process.state,
"cmd": process.config.Cmd,
"proxy": process.config.Proxy,
"ttl": process.config.UnloadAfter,
"name": process.config.Name,
"description": process.config.Description,
})
}
}
+12
@@ -674,6 +674,11 @@ func TestProxyManager_RunningEndpoint(t *testing.T) {
Running []struct {
Model string `json:"model"`
State string `json:"state"`
Cmd string `json:"cmd"`
Proxy string `json:"proxy"`
TTL int `json:"ttl"`
Name string `json:"name"`
Description string `json:"description"`
} `json:"running"`
}
@@ -721,6 +726,11 @@ func TestProxyManager_RunningEndpoint(t *testing.T) {
// Is the model loaded?
assert.Equal(t, "ready", response.Running[0].State)
// Verify extended fields are present
assert.NotEmpty(t, response.Running[0].Cmd, "cmd should be populated")
assert.NotEmpty(t, response.Running[0].Proxy, "proxy should be populated")
assert.Equal(t, 0, response.Running[0].TTL, "ttl should default to 0")
})
}
@@ -966,7 +976,9 @@ func TestProxyManager_ChatContentLength(t *testing.T) {
func TestProxyManager_FiltersStripParams(t *testing.T) {
modelConfig := getTestSimpleResponderConfig("model1")
modelConfig.Filters = config.ModelFilters{
Filters: config.Filters{
StripParams: "temperature, model, stream",
},
}
config := config.AddDefaultGroupToConfig(config.Config{
+80 -25
@@ -75,6 +75,7 @@
"integrity": "sha512-bXYxrXFubeYdvB0NhD/NBB3Qi6aZeV20GOWVI47t2dkecCEoneR4NPVcb7abpXDEvejgrUfFtG6vG/zxAKmg+g==", "integrity": "sha512-bXYxrXFubeYdvB0NhD/NBB3Qi6aZeV20GOWVI47t2dkecCEoneR4NPVcb7abpXDEvejgrUfFtG6vG/zxAKmg+g==",
"dev": true, "dev": true,
"license": "MIT", "license": "MIT",
"peer": true,
"dependencies": { "dependencies": {
"@ampproject/remapping": "^2.2.0", "@ampproject/remapping": "^2.2.0",
"@babel/code-frame": "^7.27.1", "@babel/code-frame": "^7.27.1",
@@ -1593,6 +1594,66 @@
"node": ">=14.0.0" "node": ">=14.0.0"
} }
}, },
"node_modules/@tailwindcss/oxide-wasm32-wasi/node_modules/@emnapi/core": {
"version": "1.4.3",
"dev": true,
"inBundle": true,
"license": "MIT",
"optional": true,
"dependencies": {
"@emnapi/wasi-threads": "1.0.2",
"tslib": "^2.4.0"
}
},
"node_modules/@tailwindcss/oxide-wasm32-wasi/node_modules/@emnapi/runtime": {
"version": "1.4.3",
"dev": true,
"inBundle": true,
"license": "MIT",
"optional": true,
"dependencies": {
"tslib": "^2.4.0"
}
},
"node_modules/@tailwindcss/oxide-wasm32-wasi/node_modules/@emnapi/wasi-threads": {
"version": "1.0.2",
"dev": true,
"inBundle": true,
"license": "MIT",
"optional": true,
"dependencies": {
"tslib": "^2.4.0"
}
},
"node_modules/@tailwindcss/oxide-wasm32-wasi/node_modules/@napi-rs/wasm-runtime": {
"version": "0.2.10",
"dev": true,
"inBundle": true,
"license": "MIT",
"optional": true,
"dependencies": {
"@emnapi/core": "^1.4.3",
"@emnapi/runtime": "^1.4.3",
"@tybys/wasm-util": "^0.9.0"
}
},
"node_modules/@tailwindcss/oxide-wasm32-wasi/node_modules/@tybys/wasm-util": {
"version": "0.9.0",
"dev": true,
"inBundle": true,
"license": "MIT",
"optional": true,
"dependencies": {
"tslib": "^2.4.0"
}
},
"node_modules/@tailwindcss/oxide-wasm32-wasi/node_modules/tslib": {
"version": "2.8.0",
"dev": true,
"inBundle": true,
"license": "0BSD",
"optional": true
},
"node_modules/@tailwindcss/oxide-win32-arm64-msvc": { "node_modules/@tailwindcss/oxide-win32-arm64-msvc": {
"version": "4.1.8", "version": "4.1.8",
"resolved": "https://registry.npmjs.org/@tailwindcss/oxide-win32-arm64-msvc/-/oxide-win32-arm64-msvc-4.1.8.tgz", "resolved": "https://registry.npmjs.org/@tailwindcss/oxide-win32-arm64-msvc/-/oxide-win32-arm64-msvc-4.1.8.tgz",
@@ -1707,6 +1768,7 @@
"integrity": "sha512-JeG0rEWak0N6Itr6QUx+X60uQmN+5t3j9r/OVDtWzFXKaj6kD1BwJzOksD0FF6iWxZlbE1kB0q9vtnU2ekqa1Q==", "integrity": "sha512-JeG0rEWak0N6Itr6QUx+X60uQmN+5t3j9r/OVDtWzFXKaj6kD1BwJzOksD0FF6iWxZlbE1kB0q9vtnU2ekqa1Q==",
"dev": true, "dev": true,
"license": "MIT", "license": "MIT",
"peer": true,
"dependencies": { "dependencies": {
"csstype": "^3.0.2" "csstype": "^3.0.2"
} }
@@ -1767,6 +1829,7 @@
"integrity": "sha512-qwxv6dq682yVvgKKp2qWwLgRbscDAYktPptK4JPojCwwi3R9cwrvIxS4lvBpzmcqzR4bdn54Z0IG1uHFskW4dA==", "integrity": "sha512-qwxv6dq682yVvgKKp2qWwLgRbscDAYktPptK4JPojCwwi3R9cwrvIxS4lvBpzmcqzR4bdn54Z0IG1uHFskW4dA==",
"dev": true, "dev": true,
"license": "MIT", "license": "MIT",
"peer": true,
"dependencies": { "dependencies": {
"@typescript-eslint/scope-manager": "8.33.1", "@typescript-eslint/scope-manager": "8.33.1",
"@typescript-eslint/types": "8.33.1", "@typescript-eslint/types": "8.33.1",
@@ -2018,6 +2081,7 @@
"integrity": "sha512-NZyJarBfL7nWwIq+FDL6Zp/yHEhePMNnnJ0y3qfieCrmNvYct8uvtiV41UvlSe6apAfk0fY1FbWx+NwfmpvtTg==", "integrity": "sha512-NZyJarBfL7nWwIq+FDL6Zp/yHEhePMNnnJ0y3qfieCrmNvYct8uvtiV41UvlSe6apAfk0fY1FbWx+NwfmpvtTg==",
"dev": true, "dev": true,
"license": "MIT", "license": "MIT",
"peer": true,
"bin": { "bin": {
"acorn": "bin/acorn" "acorn": "bin/acorn"
}, },
@@ -2126,6 +2190,7 @@
} }
], ],
"license": "MIT", "license": "MIT",
"peer": true,
"dependencies": { "dependencies": {
"caniuse-lite": "^1.0.30001718", "caniuse-lite": "^1.0.30001718",
"electron-to-chromium": "^1.5.160", "electron-to-chromium": "^1.5.160",
@@ -2392,6 +2457,7 @@
"integrity": "sha512-BhHmn2yNOFA9H9JmmIVKJmd288g9hrVRDkdoIgRCRuSySRUHH7r/DI6aAXW9T1WwUuY3DFgrcaqB+deURBLR5g==", "integrity": "sha512-BhHmn2yNOFA9H9JmmIVKJmd288g9hrVRDkdoIgRCRuSySRUHH7r/DI6aAXW9T1WwUuY3DFgrcaqB+deURBLR5g==",
"dev": true, "dev": true,
"license": "MIT", "license": "MIT",
"peer": true,
"dependencies": { "dependencies": {
"@eslint-community/eslint-utils": "^4.8.0", "@eslint-community/eslint-utils": "^4.8.0",
"@eslint-community/regexpp": "^4.12.1", "@eslint-community/regexpp": "^4.12.1",
@@ -3271,9 +3337,9 @@
} }
}, },
"node_modules/minizlib": { "node_modules/minizlib": {
"version": "3.0.2", "version": "3.1.0",
"resolved": "https://registry.npmjs.org/minizlib/-/minizlib-3.0.2.tgz", "resolved": "https://registry.npmjs.org/minizlib/-/minizlib-3.1.0.tgz",
"integrity": "sha512-oG62iEk+CYt5Xj2YqI5Xi9xWUeZhDI8jjQmC5oThVH5JGCTgIjr7ciJDzC7MBzYd//WvR1OTmP5Q38Q8ShQtVA==", "integrity": "sha512-KZxYo1BUkWD2TVFLr0MQoM8vUUigWD3LlD83a/75BqC+4qE0Hb1Vo5v1FgcfaNXvfXzr+5EhQ6ing/CaBijTlw==",
"dev": true, "dev": true,
"license": "MIT", "license": "MIT",
"dependencies": { "dependencies": {
@@ -3283,22 +3349,6 @@
"node": ">= 18" "node": ">= 18"
} }
}, },
"node_modules/mkdirp": {
"version": "3.0.1",
"resolved": "https://registry.npmjs.org/mkdirp/-/mkdirp-3.0.1.tgz",
"integrity": "sha512-+NsyUUAZDmo6YVHzL/stxSu3t9YS1iljliy3BSDrXJ/dkn1KYdmtZODGGjLcc9XLgVVpH4KshHB8XmZgMhaBXg==",
"dev": true,
"license": "MIT",
"bin": {
"mkdirp": "dist/cjs/src/bin.js"
},
"engines": {
"node": ">=10"
},
"funding": {
"url": "https://github.com/sponsors/isaacs"
}
},
"node_modules/ms": { "node_modules/ms": {
"version": "2.1.3", "version": "2.1.3",
"resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz", "resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz",
@@ -3517,6 +3567,7 @@
"resolved": "https://registry.npmjs.org/react/-/react-19.1.0.tgz", "resolved": "https://registry.npmjs.org/react/-/react-19.1.0.tgz",
"integrity": "sha512-FS+XFBNvn3GTAWq26joslQgWNoFu08F4kl0J4CgdNKADkdSGXQyTCnKteIAJy96Br6YbpEU1LSzV5dYtjMkMDg==", "integrity": "sha512-FS+XFBNvn3GTAWq26joslQgWNoFu08F4kl0J4CgdNKADkdSGXQyTCnKteIAJy96Br6YbpEU1LSzV5dYtjMkMDg==",
"license": "MIT", "license": "MIT",
"peer": true,
"engines": { "engines": {
"node": ">=0.10.0" "node": ">=0.10.0"
} }
@@ -3526,6 +3577,7 @@
"resolved": "https://registry.npmjs.org/react-dom/-/react-dom-19.1.0.tgz", "resolved": "https://registry.npmjs.org/react-dom/-/react-dom-19.1.0.tgz",
"integrity": "sha512-Xs1hdnE+DyKgeHJeJznQmYMIBG3TKIHJJT95Q58nHLSrElKlGQqDTR2HQ9fx5CN/Gk6Vh/kupBTDLU11/nDk/g==", "integrity": "sha512-Xs1hdnE+DyKgeHJeJznQmYMIBG3TKIHJJT95Q58nHLSrElKlGQqDTR2HQ9fx5CN/Gk6Vh/kupBTDLU11/nDk/g==",
"license": "MIT", "license": "MIT",
"peer": true,
"dependencies": { "dependencies": {
"scheduler": "^0.26.0" "scheduler": "^0.26.0"
}, },
@@ -3791,17 +3843,16 @@
} }
}, },
"node_modules/tar": { "node_modules/tar": {
"version": "7.4.3", "version": "7.5.6",
"resolved": "https://registry.npmjs.org/tar/-/tar-7.4.3.tgz", "resolved": "https://registry.npmjs.org/tar/-/tar-7.5.6.tgz",
"integrity": "sha512-5S7Va8hKfV7W5U6g3aYxXmlPoZVAwUMy9AOKyF2fVuZa2UD3qZjg578OrLRt8PcNN1PleVaL/5/yYATNL0ICUw==", "integrity": "sha512-xqUeu2JAIJpXyvskvU3uvQW8PAmHrtXp2KDuMJwQqW8Sqq0CaZBAQ+dKS3RBXVhU4wC5NjAdKrmh84241gO9cA==",
"dev": true, "dev": true,
"license": "ISC", "license": "BlueOak-1.0.0",
"dependencies": { "dependencies": {
"@isaacs/fs-minipass": "^4.0.0", "@isaacs/fs-minipass": "^4.0.0",
"chownr": "^3.0.0", "chownr": "^3.0.0",
"minipass": "^7.1.2", "minipass": "^7.1.2",
"minizlib": "^3.0.1", "minizlib": "^3.1.0",
"mkdirp": "^3.0.1",
"yallist": "^5.0.0" "yallist": "^5.0.0"
}, },
"engines": { "engines": {
@@ -3856,6 +3907,7 @@
"integrity": "sha512-M7BAV6Rlcy5u+m6oPhAPFgJTzAioX/6B0DxyvDlo9l8+T3nLKbrczg2WLUyzd45L8RqfUMyGPzekbMvX2Ldkwg==", "integrity": "sha512-M7BAV6Rlcy5u+m6oPhAPFgJTzAioX/6B0DxyvDlo9l8+T3nLKbrczg2WLUyzd45L8RqfUMyGPzekbMvX2Ldkwg==",
"dev": true, "dev": true,
"license": "MIT", "license": "MIT",
"peer": true,
"engines": { "engines": {
"node": ">=12" "node": ">=12"
}, },
@@ -3908,6 +3960,7 @@
"integrity": "sha512-p1diW6TqL9L07nNxvRMM7hMMw4c5XOo/1ibL4aAIGmSAt9slTE1Xgw5KWuof2uTOvCg9BY7ZRi+GaF+7sfgPeQ==", "integrity": "sha512-p1diW6TqL9L07nNxvRMM7hMMw4c5XOo/1ibL4aAIGmSAt9slTE1Xgw5KWuof2uTOvCg9BY7ZRi+GaF+7sfgPeQ==",
"dev": true, "dev": true,
"license": "Apache-2.0", "license": "Apache-2.0",
"peer": true,
"bin": { "bin": {
"tsc": "bin/tsc", "tsc": "bin/tsc",
"tsserver": "bin/tsserver" "tsserver": "bin/tsserver"
@@ -3986,6 +4039,7 @@
"integrity": "sha512-+Oxm7q9hDoLMyJOYfUYBuHQo+dkAloi33apOPP56pzj+vsdJDzr+j1NISE5pyaAuKL4A3UD34qd0lx5+kfKp2g==", "integrity": "sha512-+Oxm7q9hDoLMyJOYfUYBuHQo+dkAloi33apOPP56pzj+vsdJDzr+j1NISE5pyaAuKL4A3UD34qd0lx5+kfKp2g==",
"dev": true, "dev": true,
"license": "MIT", "license": "MIT",
"peer": true,
"dependencies": { "dependencies": {
"esbuild": "^0.25.0", "esbuild": "^0.25.0",
"fdir": "^6.4.4", "fdir": "^6.4.4",
@@ -4076,6 +4130,7 @@
"integrity": "sha512-M7BAV6Rlcy5u+m6oPhAPFgJTzAioX/6B0DxyvDlo9l8+T3nLKbrczg2WLUyzd45L8RqfUMyGPzekbMvX2Ldkwg==", "integrity": "sha512-M7BAV6Rlcy5u+m6oPhAPFgJTzAioX/6B0DxyvDlo9l8+T3nLKbrczg2WLUyzd45L8RqfUMyGPzekbMvX2Ldkwg==",
"dev": true, "dev": true,
"license": "MIT", "license": "MIT",
"peer": true,
"engines": { "engines": {
"node": ">=12" "node": ">=12"
}, },