Compare commits

..

11 Commits

Author SHA1 Message Date
Benson Wong 75fced579e config: support macros in peer apiKey and filters (#469)
* config: support environment variable macros in peer apiKeys

Add ${env.VAR_NAME} substitution for peer apiKey fields, consistent
with existing env macro support for model fields and global apiKeys.

- Add env macro substitution for peers.{name}.apiKey in LoadConfigFromReader
- Add tests for peer apiKey env substitution
- Update config.example.yaml to show env macro usage

* config: support macros in peer apiKey and filters

Extend macro substitution to peer configuration fields:
- peers.{name}.apiKey supports both global macros and env macros
- peers.{name}.filters.stripParams supports both macro types
- peers.{name}.filters.setParams supports both macro types

Also renamed validateMetadataForUnknownMacros to validateNestedForUnknownMacros
for reuse across model metadata and peer filters validation.
2026-01-16 23:10:50 -08:00
Benson Wong b73f367f22 config-schema.json,config.example.yaml: Update examples and schema 2026-01-16 22:43:25 -08:00
Benson Wong 8f2137c72b config: support environment variable macros in apiKeys (#467)
Add substituteEnvMacros support for apiKeys configuration field,
allowing API keys to be loaded from environment variables using
the ${env.VAR_NAME} syntax.

- Apply env macro substitution before validation
- Add tests for env macro substitution in apiKeys
2026-01-16 22:41:14 -08:00
Benson Wong 124007cc98 config: add environment variable macros (#466)
* config: add environment variable macros

Add support for ${env.VAR_NAME} syntax to pull values from system
environment variables during config loading.

- env macros processed before regular macros (allows macros to reference env vars)
- works in cmd, cmdStop, proxy, checkEndpoint, filters.stripParams, metadata
- returns error if env var is not set
- add comprehensive tests

fixes #462

* docs: add env macro example to config.example.yaml
2026-01-16 22:25:20 -08:00
Benson Wong eb5bfff0b0 proxy: unify filtering for local models and peers
This unifies the filtering capabilities for models and peers

- stripParams: removes params in the request
- setParams: sets params in the request

fixes #453
2026-01-15 18:59:43 -08:00
Benson Wong 3edb180c08 ci: free up disk space before ROCm container build (#460) 2026-01-14 22:03:42 -08:00
Benson Wong 66d555e625 Improve container build reliability (#457)
* docker: add .env usage in build-container.sh
* .github,docker: add rocm, improve logging
* .github,CLAUDE.md: fix workflow and update guidelines

Update containers workflow to only push images when triggered
manually or on schedule, not on workflow file changes.

- add push trigger for workflow file changes in containers.yml
- update push condition to skip on regular push events
- update CLAUDE.md commit message guidelines

* docker: remove comma in build-container.sh

* .github,docker: improve container build workflow

Add pagination support for fetching llama.cpp tags and improve debugging.

- add build-container.sh to workflow trigger paths
- implement fetch_llama_tag() with pagination support
- replace .env with local testing instructions
- add DEBUG_ABORT_BUILD flag for testing
2026-01-10 22:14:33 -08:00
Benson Wong 4f863fd9fc CLAUDE.md: tweak instructions 2026-01-09 21:42:06 -08:00
Benson Wong 267c030457 ui: update react-router-dom to 7.12.0 (#456)
Update react-router-dom from 7.6.2 to 7.12.0 to address security vulnerability.

- Updated dependency in package.json
- Regenerated package-lock.json
- Verified build passes successfully
- Confirmed 0 vulnerabilities with npm audit

Co-authored-by: Claude <noreply@anthropic.com>
2026-01-08 16:13:09 -08:00
Benson Wong c19309fe7e CLAUDE.md: small instruction tweaks 2026-01-07 21:34:23 -08:00
Benson Wong 4413881b2d proxy: actually add /v1/responses endpoint (#449)
ref: #448
2026-01-01 13:35:45 -08:00
18 changed files with 1270 additions and 108 deletions
+21 -2
View File
@@ -10,17 +10,36 @@ on:
# Allows manual triggering of the workflow # Allows manual triggering of the workflow
workflow_dispatch: workflow_dispatch:
# Run on workflow file changes (without pushing)
push:
paths:
- '.github/workflows/containers.yml'
- 'docker/build-container.sh'
jobs: jobs:
build-and-push: build-and-push:
runs-on: ubuntu-latest runs-on: ubuntu-latest
strategy: strategy:
matrix: matrix:
platform: [intel, cuda, vulkan, cpu, musa] platform: [intel, cuda, vulkan, cpu, musa, rocm]
fail-fast: false fail-fast: false
steps: steps:
- name: Checkout code - name: Checkout code
uses: actions/checkout@v4 uses: actions/checkout@v4
- name: Free up disk space
if: matrix.platform == 'rocm'
run: |
echo "Before cleanup:"
df -h
sudo rm -rf /usr/share/dotnet
sudo rm -rf /usr/local/lib/android
sudo rm -rf /opt/ghc
sudo rm -rf /opt/hostedtoolcache/CodeQL
sudo docker system prune -af
echo "After cleanup:"
df -h
- name: Log in to GitHub Container Registry - name: Log in to GitHub Container Registry
uses: docker/login-action@v2 uses: docker/login-action@v2
with: with:
@@ -31,7 +50,7 @@ jobs:
- name: Run build-container - name: Run build-container
env: env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: ./docker/build-container.sh ${{ matrix.platform }} true run: ./docker/build-container.sh ${{ matrix.platform }} ${{ github.event_name != 'push' }}
# note make sure mostlygeek/llama-swap has admin rights to the llama-swap package # note make sure mostlygeek/llama-swap has admin rights to the llama-swap package
# see: https://github.com/actions/delete-package-versions/issues/74 # see: https://github.com/actions/delete-package-versions/issues/74
+28 -24
View File
@@ -1,5 +1,3 @@
# Project: llama-swap
## Project Description: ## Project Description:
llama-swap is a light weight, transparent proxy server that provides automatic model swapping to llama.cpp's server. llama-swap is a light weight, transparent proxy server that provides automatic model swapping to llama.cpp's server.
@@ -7,7 +5,16 @@ llama-swap is a light weight, transparent proxy server that provides automatic m
## Tech stack ## Tech stack
- golang - golang
- typescript, vite and react for UI (ui/) - typescript, vite and react for UI (located in ui/)
## Workflow Tasks
- when summarizing changes only include details that require further action
- just say "Done." when there is no further action
- use `gh` to create PRs and load issues
- do include Co-Authored-By or created by when committing changes or creating PRs
- keep PR descriptions short and focused on changes.
- never include a test plan
## Testing ## Testing
@@ -16,30 +23,27 @@ llama-swap is a light weight, transparent proxy server that provides automatic m
- Use `make test-dev` after running new tests for a quick over all test run. This runs `go test` and `staticcheck`. Fix any static checking errors. Use this only when changes are made to any code under the `proxy/` directory - Use `make test-dev` after running new tests for a quick over all test run. This runs `go test` and `staticcheck`. Fix any static checking errors. Use this only when changes are made to any code under the `proxy/` directory
- Use `make test-all` before completing work. This includes long running concurrency tests. - Use `make test-all` before completing work. This includes long running concurrency tests.
## Workflow Tasks ### Commit message example format:
### Plan Improvements ```
proxy: add new feature
Work plans are located in ai-plans/. Plans written by the user may be incomplete, contain inconsistencies or errors. Add new feature that implements functionality X and Y.
When the user asks to improve a plan follow these guidelines for expanding and improving it. - key change 1
- key change 2
- key change 3
- Identify any inconsistencies. fixes #123
- Expand plans out to be detailed specification of requirements and changes to be made. ```
- Plans should have at least these sections:
- Title - very short, describes changes
- Overview: A more detailed summary of goal and outcomes desired
- Design Requirements: Detailed descriptions of what needs to be done
- Testing Plan: Tests to be implemented
- Checklist: A detailed list of changes to be made
Look for "plan expansion" as explicit instructions to improve a plan. ## Code Reviews
### Implementation of plans - use three levels High, Medium, Low severity
- label each discovered issue with a label like H1, M2, L3 respectively
When the user says "paint it", respond with "commencing automated assembly". Then implement the changes as described by the plan. Update the checklist as you complete items. - High severity are must fix issues (security, race conditions, critical bugs)
- Medium severity are recommended improvements (coding style, missing functionality, inconsistencies)
## General Rules - Low severity are nice to have changes and nits
- Include a suggestion with each discovered item
- when summarizing changes only include details that require further action (action items) - Limit your code review to three items with the highest priority first
- when there are no action items, just say "Done." - Double check your discovered items and recommended remediations
+27 -1
View File
@@ -188,11 +188,17 @@
"default": "", "default": "",
"pattern": "^[a-zA-Z0-9_, ]*$", "pattern": "^[a-zA-Z0-9_, ]*$",
"description": "Comma separated list of parameters to remove from the request. Used for server-side enforcement of sampling parameters." "description": "Comma separated list of parameters to remove from the request. Used for server-side enforcement of sampling parameters."
},
"setParams": {
"type": "object",
"additionalProperties": true,
"default": {},
"description": "Dictionary of parameters to set/override in requests. Useful for enforcing specific parameter values. Protected params like 'model' cannot be overridden. Values can be strings, numbers, booleans, arrays, or objects."
} }
}, },
"additionalProperties": false, "additionalProperties": false,
"default": {}, "default": {},
"description": "Dictionary of filter settings. Only stripParams is supported." "description": "Dictionary of filter settings. Supports stripParams and setParams."
}, },
"metadata": { "metadata": {
"type": "object", "type": "object",
@@ -320,6 +326,26 @@
"minLength": 1 "minLength": 1
}, },
"description": "A list of models served by the peer." "description": "A list of models served by the peer."
},
"filters": {
"type": "object",
"properties": {
"stripParams": {
"type": "string",
"default": "",
"pattern": "^[a-zA-Z0-9_, ]*$",
"description": "Comma separated list of parameters to remove from the request. Useful for removing parameters that the peer doesn't support."
},
"setParams": {
"type": "object",
"additionalProperties": true,
"default": {},
"description": "Dictionary of parameters to set/override in requests to this peer. Useful for injecting provider-specific settings. Protected params like 'model' cannot be overridden. Values can be strings, numbers, booleans, arrays, or objects."
}
},
"additionalProperties": false,
"default": {},
"description": "Dictionary of filter settings for peer requests. Supports stripParams and setParams."
} }
} }
}, },
+54 -12
View File
@@ -70,16 +70,6 @@ sendLoadingState: true
# all fields except for Id so chat UIs can use the alias equivalent to the original. # all fields except for Id so chat UIs can use the alias equivalent to the original.
includeAliasesInList: false includeAliasesInList: false
# apiKeys: require an API key when making requests to inference endpoints
# - optional, default: []
# - when empty (the default) authorization will not be checked as llama-swap is default-allow
# - each key is a non-empty string
apiKeys:
- "sk-hunter2"
# hint, one liner: printf "sk-%s\n" "$(head -c 48 /dev/urandom | base64 )"
- "sk-gyCPiKUcIfPlaM4OSMZekkprgijPx6+OsmQs8Rsg0xZ9qpy6gKWsIKqHOk+cgXVx"
- "sk-+QtIn0Zjj4UHjiaZYiZEnru4mrwKM9RzhmJeK5SobNXLl8QMFXxGz1/2lEuvQpkb"
# macros: a dictionary of string substitutions # macros: a dictionary of string substitutions
# - optional, default: empty dictionary # - optional, default: empty dictionary
# - macros are reusable snippets # - macros are reusable snippets
@@ -90,6 +80,9 @@ apiKeys:
# - macro names must not be a reserved name: PORT or MODEL_ID # - macro names must not be a reserved name: PORT or MODEL_ID
# - macro values can be numbers, bools, or strings # - macro values can be numbers, bools, or strings
# - macros can contain other macros, but they must be defined before they are used # - macros can contain other macros, but they must be defined before they are used
# - environment variables can be referenced with ${env.VAR_NAME} syntax
# - env macros are substituted first, before regular macros
# - if the env var is not set, config loading will fail with an error
macros: macros:
# Example of a multi-line macro # Example of a multi-line macro
"latest-llama": > "latest-llama": >
@@ -102,6 +95,24 @@ macros:
# but they must be previously declared. # but they must be previously declared.
"default_args": "--ctx-size ${default_ctx}" "default_args": "--ctx-size ${default_ctx}"
# Example of environment variable macros
# - ${env.VAR_NAME} pulls the value from the system environment
# - useful for paths, secrets, or machine-specific configuration
"models_dir": "${env.HOME}/models"
# apiKeys: require an API key when making requests to inference endpoints
# - optional, default: []
# - when empty (the default) authorization will not be checked as llama-swap is default-allow
# - each key is a non-empty string
apiKeys:
- "sk-hunter2"
# tip, one liner: printf "sk-%s\n" "$(head -c 48 /dev/urandom | base64 )"
- "sk-gyCPiKUcIfPlaM4OSMZekkprgijPx6+OsmQs8Rsg0xZ9qpy6gKWsIKqHOk+cgXVx"
# use environment variable macros to keep secrets out of the config
- "${env.API_KEY_1}"
- "${env.API_KEY_2}"
# models: a dictionary of model configurations # models: a dictionary of model configurations
# - required # - required
# - each key is the model's ID, used in API requests # - each key is the model's ID, used in API requests
@@ -185,7 +196,7 @@ models:
# filters: a dictionary of filter settings # filters: a dictionary of filter settings
# - optional, default: empty dictionary # - optional, default: empty dictionary
# - only stripParams is currently supported # - same capabilities as peer filters (stripParams, setParams)
filters: filters:
# stripParams: a comma separated list of parameters to remove from the request # stripParams: a comma separated list of parameters to remove from the request
# - optional, default: "" # - optional, default: ""
@@ -195,6 +206,16 @@ models:
# - recommended to stick to sampling parameters # - recommended to stick to sampling parameters
stripParams: "temperature, top_p, top_k" stripParams: "temperature, top_p, top_k"
# setParams: a dictionary of parameters to set/override in requests
# - optional, default: empty dictionary
# - useful for enforcing specific parameter values
# - protected params like "model" cannot be overridden
# - values can be strings, numbers, booleans, arrays, or objects
setParams:
# Example: enforce specific sampling parameters
temperature: 0.7
top_p: 0.9
# metadata: a dictionary of arbitrary values that are included in /v1/models # metadata: a dictionary of arbitrary values that are included in /v1/models
# - optional, default: empty dictionary # - optional, default: empty dictionary
# - while metadata can contains complex types it is recommended to keep it simple # - while metadata can contains complex types it is recommended to keep it simple
@@ -365,7 +386,8 @@ peers:
# - optional, default: "" # - optional, default: ""
# - if blank, no key will be added to the request # - if blank, no key will be added to the request
# - key will be injected into headers: Authorization: Bearer <key> and x-api-key: <key> # - key will be injected into headers: Authorization: Bearer <key> and x-api-key: <key>
apiKey: sk-your-openrouter-key # - can be a string or a macro
apiKey: ${env.OPENROUTER_API_KEY}
models: models:
- meta-llama/llama-3.1-8b-instruct - meta-llama/llama-3.1-8b-instruct
- qwen/qwen3-235b-a22b-2507 - qwen/qwen3-235b-a22b-2507
@@ -373,3 +395,23 @@ peers:
- z-ai/glm-4.7 - z-ai/glm-4.7
- moonshotai/kimi-k2-0905 - moonshotai/kimi-k2-0905
- minimax/minimax-m2.1 - minimax/minimax-m2.1
# filters: a dictionary of filter settings for peer requests
# - optional, default: empty dictionary
# - same capabilities as model filters (stripParams, setParams)
filters:
# stripParams: a comma separated list of parameters to remove from the request
# - optional, default: ""
# - useful for removing parameters that the peer doesn't support
# - the `model` parameter can never be removed
stripParams: "temperature, top_p"
# setParams: a dictionary of parameters to set/override in requests to this peer
# - optional, default: empty dictionary
# - useful for injecting provider-specific settings like data retention policies
# - protected params like "model" cannot be overridden
# - values can be strings, numbers, booleans, arrays, or objects
setParams:
# Example: enforce zero-data-retention for OpenRouter
provider:
data_collection: "deny"
zdr: true
+79 -14
View File
@@ -2,21 +2,37 @@
cd $(dirname "$0") cd $(dirname "$0")
# use this to test locally, example:
# GITHUB_TOKEN=$(gh auth token) LOG_DEBUG=1 DEBUG_ABORT_BUILD=1 ./docker/build-container.sh rocm
# you need read:package scope on the token. Generate a personal access token with
# the scopes: gist, read:org, repo, write:packages
# then: gh auth login (and copy/paste the new token)
log_debug() {
if [ "$LOG_DEBUG" = "1" ]; then
echo "[DEBUG] $*"
fi
}
log_info() {
echo "[INFO] $*"
}
ARCH=$1 ARCH=$1
PUSH_IMAGES=${2:-false} PUSH_IMAGES=${2:-false}
# List of allowed architectures # List of allowed architectures
ALLOWED_ARCHS=("intel" "vulkan" "musa" "cuda" "cpu") ALLOWED_ARCHS=("intel" "vulkan" "musa" "cuda" "cpu" "rocm")
# Check if ARCH is in the allowed list # Check if ARCH is in the allowed list
if [[ ! " ${ALLOWED_ARCHS[@]} " =~ " ${ARCH} " ]]; then if [[ ! " ${ALLOWED_ARCHS[@]} " =~ " ${ARCH} " ]]; then
echo "Error: ARCH must be one of the following: ${ALLOWED_ARCHS[@]}" log_info "Error: ARCH must be one of the following: ${ALLOWED_ARCHS[@]}"
exit 1 exit 1
fi fi
# Check if GITHUB_TOKEN is set and not empty # Check if GITHUB_TOKEN is set and not empty
if [[ -z "$GITHUB_TOKEN" ]]; then if [[ -z "$GITHUB_TOKEN" ]]; then
echo "Error: GITHUB_TOKEN is not set or is empty." log_info "Error: GITHUB_TOKEN is not set or is empty."
exit 1 exit 1
fi fi
@@ -32,25 +48,74 @@ LS_REPO=${GITHUB_REPOSITORY:-mostlygeek/llama-swap}
# have to strip out the 'v' due to .tar.gz file naming # have to strip out the 'v' due to .tar.gz file naming
LS_VER=$(curl -s https://api.github.com/repos/${LS_REPO}/releases/latest | jq -r .tag_name | sed 's/v//') LS_VER=$(curl -s https://api.github.com/repos/${LS_REPO}/releases/latest | jq -r .tag_name | sed 's/v//')
# Fetches the most recent llama.cpp tag matching the given prefix
# Handles pagination to search beyond the first 100 results
# $1 - tag_prefix (e.g., "server" or "server-vulkan")
# Returns: the version number extracted from the tag
fetch_llama_tag() {
local tag_prefix=$1
local page=1
local per_page=100
while true; do
log_debug "Fetching page $page for tag prefix: $tag_prefix"
local response=$(curl -s -H "Authorization: Bearer $GITHUB_TOKEN" \
"https://api.github.com/users/ggml-org/packages/container/llama.cpp/versions?per_page=${per_page}&page=${page}")
# Check for API errors
if echo "$response" | jq -e '.message' > /dev/null 2>&1; then
local error_msg=$(echo "$response" | jq -r '.message')
log_info "GitHub API error: $error_msg"
return 1
fi
# Check if response is empty array (no more pages)
if [ "$(echo "$response" | jq 'length')" -eq 0 ]; then
log_debug "No more pages (empty response)"
return 1
fi
# Extract matching tag from this page
local found_tag=$(echo "$response" | jq -r \
".[] | select(.metadata.container.tags[]? | startswith(\"$tag_prefix\")) | .metadata.container.tags[] | select(startswith(\"$tag_prefix\"))" \
| sort -r | head -n1)
if [ -n "$found_tag" ]; then
log_debug "Found tag: $found_tag on page $page"
echo "$found_tag" | awk -F '-' '{print $NF}'
return 0
fi
page=$((page + 1))
# Safety limit to prevent infinite loops
if [ $page -gt 50 ]; then
log_info "Reached pagination safety limit (50 pages)"
return 1
fi
done
}
if [ "$ARCH" == "cpu" ]; then if [ "$ARCH" == "cpu" ]; then
# cpu only containers just use the server tag LCPP_TAG=$(fetch_llama_tag "server")
LCPP_TAG=$(curl -s -H "Authorization: Bearer $GITHUB_TOKEN" \
"https://api.github.com/users/ggml-org/packages/container/llama.cpp/versions" \
| jq -r '.[] | select(.metadata.container.tags[] | startswith("server")) | .metadata.container.tags[]' \
| sort -r | head -n1 | awk -F '-' '{print $3}')
BASE_TAG=server-${LCPP_TAG} BASE_TAG=server-${LCPP_TAG}
else else
LCPP_TAG=$(curl -s -H "Authorization: Bearer $GITHUB_TOKEN" \ LCPP_TAG=$(fetch_llama_tag "server-${ARCH}")
"https://api.github.com/users/ggml-org/packages/container/llama.cpp/versions" \
| jq -r --arg arch "$ARCH" '.[] | select(.metadata.container.tags[] | startswith("server-\($arch)")) | .metadata.container.tags[]' \
| sort -r | head -n1 | awk -F '-' '{print $3}')
BASE_TAG=server-${ARCH}-${LCPP_TAG} BASE_TAG=server-${ARCH}-${LCPP_TAG}
fi fi
# Abort if LCPP_TAG is empty. # Abort if LCPP_TAG is empty.
if [[ -z "$LCPP_TAG" ]]; then if [[ -z "$LCPP_TAG" ]]; then
echo "Abort: Could not find llama-server container for arch: $ARCH" log_info "Abort: Could not find llama-server container for arch: $ARCH"
exit 1 exit 1
else
log_info "LCPP_TAG: $LCPP_TAG"
fi
if [[ ! -z "$DEBUG_ABORT_BUILD" ]]; then
log_info "Abort: DEBUG_ABORT_BUILD set"
exit 0
fi fi
for CONTAINER_TYPE in non-root root; do for CONTAINER_TYPE in non-root root; do
@@ -68,7 +133,7 @@ for CONTAINER_TYPE in non-root root; do
USER_HOME=/app USER_HOME=/app
fi fi
echo "Building $CONTAINER_TYPE $CONTAINER_TAG $LS_VER" log_info "Building $CONTAINER_TYPE $CONTAINER_TAG $LS_VER"
docker build -f llama-swap.Containerfile --build-arg BASE_TAG=${BASE_TAG} --build-arg LS_VER=${LS_VER} --build-arg UID=${USER_UID} \ docker build -f llama-swap.Containerfile --build-arg BASE_TAG=${BASE_TAG} --build-arg LS_VER=${LS_VER} --build-arg UID=${USER_UID} \
--build-arg LS_REPO=${LS_REPO} --build-arg GID=${USER_GID} --build-arg USER_HOME=${USER_HOME} -t ${CONTAINER_TAG} -t ${CONTAINER_LATEST} \ --build-arg LS_REPO=${LS_REPO} --build-arg GID=${USER_GID} --build-arg USER_HOME=${USER_HOME} -t ${CONTAINER_TAG} -t ${CONTAINER_LATEST} \
--build-arg BASE_IMAGE=${BASE_IMAGE} . --build-arg BASE_IMAGE=${BASE_IMAGE} .
+188 -8
View File
@@ -87,6 +87,7 @@ type GroupConfig struct {
var ( var (
macroNameRegex = regexp.MustCompile(`^[a-zA-Z0-9_-]+$`) macroNameRegex = regexp.MustCompile(`^[a-zA-Z0-9_-]+$`)
macroPatternRegex = regexp.MustCompile(`\$\{([a-zA-Z0-9_-]+)\}`) macroPatternRegex = regexp.MustCompile(`\$\{([a-zA-Z0-9_-]+)\}`)
envMacroRegex = regexp.MustCompile(`\$\{env\.([a-zA-Z_][a-zA-Z0-9_]*)\}`)
) )
// set default values for GroupConfig // set default values for GroupConfig
@@ -237,6 +238,17 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
} }
} }
// Process environment variable macros in global macro values first
for i, macro := range config.Macros {
if strVal, ok := macro.Value.(string); ok {
newVal, err := substituteEnvMacros(strVal)
if err != nil {
return Config{}, fmt.Errorf("global macro '%s': %w", macro.Name, err)
}
config.Macros[i].Value = newVal
}
}
// Get and sort all model IDs first, makes testing more consistent // Get and sort all model IDs first, makes testing more consistent
modelIds := make([]string, 0, len(config.Models)) modelIds := make([]string, 0, len(config.Models))
for modelId := range config.Models { for modelId := range config.Models {
@@ -252,6 +264,48 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
modelConfig.Cmd = StripComments(modelConfig.Cmd) modelConfig.Cmd = StripComments(modelConfig.Cmd)
modelConfig.CmdStop = StripComments(modelConfig.CmdStop) modelConfig.CmdStop = StripComments(modelConfig.CmdStop)
// Substitute environment variable macros in model fields
modelConfig.Cmd, err = substituteEnvMacros(modelConfig.Cmd)
if err != nil {
return Config{}, fmt.Errorf("model %s cmd: %w", modelId, err)
}
modelConfig.CmdStop, err = substituteEnvMacros(modelConfig.CmdStop)
if err != nil {
return Config{}, fmt.Errorf("model %s cmdStop: %w", modelId, err)
}
modelConfig.Proxy, err = substituteEnvMacros(modelConfig.Proxy)
if err != nil {
return Config{}, fmt.Errorf("model %s proxy: %w", modelId, err)
}
modelConfig.CheckEndpoint, err = substituteEnvMacros(modelConfig.CheckEndpoint)
if err != nil {
return Config{}, fmt.Errorf("model %s checkEndpoint: %w", modelId, err)
}
modelConfig.Filters.StripParams, err = substituteEnvMacros(modelConfig.Filters.StripParams)
if err != nil {
return Config{}, fmt.Errorf("model %s filters.stripParams: %w", modelId, err)
}
// Substitute env macros in model-level macro values
for i, macro := range modelConfig.Macros {
if strVal, ok := macro.Value.(string); ok {
newVal, err := substituteEnvMacros(strVal)
if err != nil {
return Config{}, fmt.Errorf("model %s macro '%s': %w", modelId, macro.Name, err)
}
modelConfig.Macros[i].Value = newVal
}
}
// Substitute env macros in metadata
if len(modelConfig.Metadata) > 0 {
result, err := substituteEnvMacrosInValue(modelConfig.Metadata)
if err != nil {
return Config{}, fmt.Errorf("model %s metadata: %w", modelId, err)
}
modelConfig.Metadata = result.(map[string]any)
}
// validate model macros // validate model macros
for _, macro := range modelConfig.Macros { for _, macro := range modelConfig.Macros {
if err = validateMacro(macro.Name, macro.Value); err != nil { if err = validateMacro(macro.Name, macro.Value); err != nil {
@@ -362,11 +416,18 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
// Any other macro is unknown // Any other macro is unknown
return Config{}, fmt.Errorf("unknown macro '${%s}' found in %s.%s", macroName, modelId, fieldName) return Config{}, fmt.Errorf("unknown macro '${%s}' found in %s.%s", macroName, modelId, fieldName)
} }
// Check for unsubstituted env macros
envMatches := envMacroRegex.FindAllStringSubmatch(fieldValue, -1)
for _, match := range envMatches {
varName := match[1]
return Config{}, fmt.Errorf("environment variable '%s' not set (found in %s.%s)", varName, modelId, fieldName)
}
} }
// Check for unknown macros in metadata // Check for unknown macros in metadata
if len(modelConfig.Metadata) > 0 { if len(modelConfig.Metadata) > 0 {
if err := validateMetadataForUnknownMacros(modelConfig.Metadata, modelId); err != nil { if err := validateNestedForUnknownMacros(modelConfig.Metadata, fmt.Sprintf("model %s metadata", modelId)); err != nil {
return Config{}, err return Config{}, err
} }
} }
@@ -424,8 +485,14 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
config.Hooks.OnStartup.Preload = toPreload config.Hooks.OnStartup.Preload = toPreload
} }
// check api keys validatity // check api keys validity and substitute env macros
for _, apikey := range config.RequiredAPIKeys { for i, apikey := range config.RequiredAPIKeys {
apikey, err = substituteEnvMacros(apikey)
if err != nil {
return Config{}, fmt.Errorf("apiKeys[%d]: %w", i, err)
}
config.RequiredAPIKeys[i] = apikey
if apikey == "" { if apikey == "" {
return Config{}, fmt.Errorf("empty api key found in apiKeys") return Config{}, fmt.Errorf("empty api key found in apiKeys")
} }
@@ -435,6 +502,62 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
} }
} }
// substitute macros and env macros in peer fields
for peerName, peerConfig := range config.Peers {
// Substitute global macros first (LIFO order like models)
for i := len(config.Macros) - 1; i >= 0; i-- {
entry := config.Macros[i]
macroSlug := fmt.Sprintf("${%s}", entry.Name)
macroStr := fmt.Sprintf("%v", entry.Value)
peerConfig.ApiKey = strings.ReplaceAll(peerConfig.ApiKey, macroSlug, macroStr)
peerConfig.Filters.StripParams = strings.ReplaceAll(peerConfig.Filters.StripParams, macroSlug, macroStr)
// Substitute in setParams
if len(peerConfig.Filters.SetParams) > 0 {
result, err := substituteMacroInValue(peerConfig.Filters.SetParams, entry.Name, entry.Value)
if err != nil {
return Config{}, fmt.Errorf("peers.%s.filters.setParams: %w", peerName, err)
}
peerConfig.Filters.SetParams = result.(map[string]any)
}
}
// Substitute env macros
peerConfig.ApiKey, err = substituteEnvMacros(peerConfig.ApiKey)
if err != nil {
return Config{}, fmt.Errorf("peers.%s.apiKey: %w", peerName, err)
}
peerConfig.Filters.StripParams, err = substituteEnvMacros(peerConfig.Filters.StripParams)
if err != nil {
return Config{}, fmt.Errorf("peers.%s.filters.stripParams: %w", peerName, err)
}
if len(peerConfig.Filters.SetParams) > 0 {
result, err := substituteEnvMacrosInValue(peerConfig.Filters.SetParams)
if err != nil {
return Config{}, fmt.Errorf("peers.%s.filters.setParams: %w", peerName, err)
}
peerConfig.Filters.SetParams = result.(map[string]any)
}
// Validate no unknown macros remain
if matches := macroPatternRegex.FindAllStringSubmatch(peerConfig.ApiKey, -1); len(matches) > 0 {
return Config{}, fmt.Errorf("peers.%s.apiKey: unknown macro '${%s}'", peerName, matches[0][1])
}
if matches := macroPatternRegex.FindAllStringSubmatch(peerConfig.Filters.StripParams, -1); len(matches) > 0 {
return Config{}, fmt.Errorf("peers.%s.filters.stripParams: unknown macro '${%s}'", peerName, matches[0][1])
}
if len(peerConfig.Filters.SetParams) > 0 {
if err := validateNestedForUnknownMacros(peerConfig.Filters.SetParams, fmt.Sprintf("peers.%s.filters.setParams", peerName)); err != nil {
return Config{}, err
}
}
config.Peers[peerName] = peerConfig
}
return config, nil return config, nil
} }
@@ -565,20 +688,26 @@ func validateMacro(name string, value any) error {
return nil return nil
} }
// validateMetadataForUnknownMacros recursively checks for any remaining macro references in metadata // validateNestedForUnknownMacros recursively checks for any remaining macro references in nested structures
func validateMetadataForUnknownMacros(value any, modelId string) error { func validateNestedForUnknownMacros(value any, context string) error {
switch v := value.(type) { switch v := value.(type) {
case string: case string:
matches := macroPatternRegex.FindAllStringSubmatch(v, -1) matches := macroPatternRegex.FindAllStringSubmatch(v, -1)
for _, match := range matches { for _, match := range matches {
macroName := match[1] macroName := match[1]
return fmt.Errorf("model %s metadata: unknown macro '${%s}'", modelId, macroName) return fmt.Errorf("%s: unknown macro '${%s}'", context, macroName)
}
// Check for unsubstituted env macros
envMatches := envMacroRegex.FindAllStringSubmatch(v, -1)
for _, match := range envMatches {
varName := match[1]
return fmt.Errorf("%s: environment variable '%s' not set", context, varName)
} }
return nil return nil
case map[string]any: case map[string]any:
for _, val := range v { for _, val := range v {
if err := validateMetadataForUnknownMacros(val, modelId); err != nil { if err := validateNestedForUnknownMacros(val, context); err != nil {
return err return err
} }
} }
@@ -586,7 +715,7 @@ func validateMetadataForUnknownMacros(value any, modelId string) error {
case []any: case []any:
for _, val := range v { for _, val := range v {
if err := validateMetadataForUnknownMacros(val, modelId); err != nil { if err := validateNestedForUnknownMacros(val, context); err != nil {
return err return err
} }
} }
@@ -645,3 +774,54 @@ func substituteMacroInValue(value any, macroName string, macroValue any) (any, e
return value, nil return value, nil
} }
} }
// substituteEnvMacros replaces ${env.VAR_NAME} with environment variable values
// Returns error if any env var is not set
func substituteEnvMacros(s string) (string, error) {
result := s
matches := envMacroRegex.FindAllStringSubmatch(s, -1)
for _, match := range matches {
fullMatch := match[0] // ${env.VAR_NAME}
varName := match[1] // VAR_NAME
value, exists := os.LookupEnv(varName)
if !exists {
return "", fmt.Errorf("environment variable '%s' is not set", varName)
}
result = strings.ReplaceAll(result, fullMatch, value)
}
return result, nil
}
// substituteEnvMacrosInValue recursively substitutes env macros in nested structures
func substituteEnvMacrosInValue(value any) (any, error) {
switch v := value.(type) {
case string:
return substituteEnvMacros(v)
case map[string]any:
newMap := make(map[string]any)
for key, val := range v {
newVal, err := substituteEnvMacrosInValue(val)
if err != nil {
return nil, err
}
newMap[key] = newVal
}
return newMap, nil
case []any:
newSlice := make([]any, len(v))
for i, val := range v {
newVal, err := substituteEnvMacrosInValue(val)
if err != nil {
return nil, err
}
newSlice[i] = newVal
}
return newSlice, nil
default:
return value, nil
}
}
+436
View File
@@ -809,3 +809,439 @@ func TestConfig_APIKeys_Invalid(t *testing.T) {
}) })
} }
} }
func TestConfig_APIKeys_EnvMacros(t *testing.T) {
t.Run("env substitution in apiKeys", func(t *testing.T) {
t.Setenv("TEST_API_KEY", "secret-key-123")
content := `apiKeys: ["${env.TEST_API_KEY}"]`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, []string{"secret-key-123"}, config.RequiredAPIKeys)
})
t.Run("multiple env substitutions in apiKeys", func(t *testing.T) {
t.Setenv("TEST_API_KEY_1", "key-one")
t.Setenv("TEST_API_KEY_2", "key-two")
content := `apiKeys: ["${env.TEST_API_KEY_1}", "${env.TEST_API_KEY_2}", "static-key"]`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, []string{"key-one", "key-two", "static-key"}, config.RequiredAPIKeys)
})
t.Run("missing env var in apiKeys", func(t *testing.T) {
content := `apiKeys: ["${env.NONEXISTENT_API_KEY}"]`
_, err := LoadConfigFromReader(strings.NewReader(content))
assert.Error(t, err)
assert.Contains(t, err.Error(), "apiKeys[0]")
assert.Contains(t, err.Error(), "NONEXISTENT_API_KEY")
})
t.Run("env substitution results in empty key", func(t *testing.T) {
t.Setenv("TEST_EMPTY_KEY", "")
content := `apiKeys: ["${env.TEST_EMPTY_KEY}"]`
_, err := LoadConfigFromReader(strings.NewReader(content))
assert.Error(t, err)
assert.Equal(t, "empty api key found in apiKeys", err.Error())
})
}
func TestConfig_EnvMacros(t *testing.T) {
t.Run("basic env substitution in cmd", func(t *testing.T) {
t.Setenv("TEST_MODEL_PATH", "/opt/models")
content := `
models:
test:
cmd: "${env.TEST_MODEL_PATH}/llama-server"
proxy: "http://localhost:8080"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "/opt/models/llama-server", config.Models["test"].Cmd)
})
t.Run("env substitution in multiple fields", func(t *testing.T) {
t.Setenv("TEST_HOST", "myserver")
t.Setenv("TEST_PORT", "9999")
content := `
models:
test:
cmd: "server --host ${env.TEST_HOST}"
proxy: "http://${env.TEST_HOST}:${env.TEST_PORT}"
checkEndpoint: "http://${env.TEST_HOST}/health"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "server --host myserver", config.Models["test"].Cmd)
assert.Equal(t, "http://myserver:9999", config.Models["test"].Proxy)
assert.Equal(t, "http://myserver/health", config.Models["test"].CheckEndpoint)
})
t.Run("env in global macro value", func(t *testing.T) {
t.Setenv("TEST_BASE_PATH", "/usr/local")
content := `
macros:
SERVER_PATH: "${env.TEST_BASE_PATH}/bin/server"
models:
test:
cmd: "${SERVER_PATH} --port 8080"
proxy: "http://localhost:8080"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "/usr/local/bin/server --port 8080", config.Models["test"].Cmd)
})
t.Run("env in model-level macro value", func(t *testing.T) {
t.Setenv("TEST_MODEL_DIR", "/models/llama")
content := `
models:
test:
macros:
MODEL_FILE: "${env.TEST_MODEL_DIR}/model.gguf"
cmd: "server --model ${MODEL_FILE}"
proxy: "http://localhost:8080"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "server --model /models/llama/model.gguf", config.Models["test"].Cmd)
})
t.Run("env in metadata", func(t *testing.T) {
t.Setenv("TEST_API_KEY", "secret123")
content := `
models:
test:
cmd: "server"
proxy: "http://localhost:8080"
metadata:
api_key: "${env.TEST_API_KEY}"
nested:
key: "${env.TEST_API_KEY}"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "secret123", config.Models["test"].Metadata["api_key"])
nested := config.Models["test"].Metadata["nested"].(map[string]any)
assert.Equal(t, "secret123", nested["key"])
})
t.Run("env in filters.stripParams", func(t *testing.T) {
t.Setenv("TEST_STRIP_PARAMS", "temperature,top_p")
content := `
models:
test:
cmd: "server"
proxy: "http://localhost:8080"
filters:
stripParams: "${env.TEST_STRIP_PARAMS}"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "temperature,top_p", config.Models["test"].Filters.StripParams)
})
t.Run("env in cmdStop", func(t *testing.T) {
t.Setenv("TEST_KILL_SIGNAL", "SIGTERM")
content := `
models:
test:
cmd: "server --port ${PORT}"
cmdStop: "kill -${env.TEST_KILL_SIGNAL} ${PID}"
proxy: "http://localhost:${PORT}"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Contains(t, config.Models["test"].CmdStop, "-SIGTERM")
})
t.Run("missing env var returns error", func(t *testing.T) {
content := `
models:
test:
cmd: "${env.UNDEFINED_VAR_12345}/server"
proxy: "http://localhost:8080"
`
_, err := LoadConfigFromReader(strings.NewReader(content))
if assert.Error(t, err) {
assert.Contains(t, err.Error(), "UNDEFINED_VAR_12345")
assert.Contains(t, err.Error(), "not set")
}
})
t.Run("missing env var in global macro", func(t *testing.T) {
content := `
macros:
PATH: "${env.UNDEFINED_GLOBAL_VAR}"
models:
test:
cmd: "server"
proxy: "http://localhost:8080"
`
_, err := LoadConfigFromReader(strings.NewReader(content))
if assert.Error(t, err) {
assert.Contains(t, err.Error(), "UNDEFINED_GLOBAL_VAR")
assert.Contains(t, err.Error(), "not set")
}
})
t.Run("missing env var in model macro", func(t *testing.T) {
content := `
models:
test:
macros:
MY_PATH: "${env.UNDEFINED_MODEL_VAR}"
cmd: "server"
proxy: "http://localhost:8080"
`
_, err := LoadConfigFromReader(strings.NewReader(content))
if assert.Error(t, err) {
assert.Contains(t, err.Error(), "UNDEFINED_MODEL_VAR")
assert.Contains(t, err.Error(), "not set")
}
})
t.Run("missing env var in metadata", func(t *testing.T) {
content := `
models:
test:
cmd: "server"
proxy: "http://localhost:8080"
metadata:
key: "${env.UNDEFINED_META_VAR}"
`
_, err := LoadConfigFromReader(strings.NewReader(content))
if assert.Error(t, err) {
assert.Contains(t, err.Error(), "UNDEFINED_META_VAR")
assert.Contains(t, err.Error(), "not set")
}
})
t.Run("env combined with regular macros", func(t *testing.T) {
t.Setenv("TEST_ROOT", "/data")
content := `
macros:
MODEL_BASE: "${env.TEST_ROOT}/models"
models:
test:
cmd: "server --model ${MODEL_BASE}/${MODEL_ID}.gguf"
proxy: "http://localhost:8080"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "server --model /data/models/test.gguf", config.Models["test"].Cmd)
})
t.Run("multiple env vars in same string", func(t *testing.T) {
t.Setenv("TEST_USER", "admin")
t.Setenv("TEST_PASS", "secret")
content := `
models:
test:
cmd: "server --auth ${env.TEST_USER}:${env.TEST_PASS}"
proxy: "http://localhost:8080"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "server --auth admin:secret", config.Models["test"].Cmd)
})
}
func TestConfig_PeerApiKey_EnvMacros(t *testing.T) {
t.Run("env substitution in peer apiKey", func(t *testing.T) {
t.Setenv("TEST_PEER_API_KEY", "sk-peer-secret-123")
content := `
peers:
openrouter:
proxy: https://openrouter.ai/api
apiKey: "${env.TEST_PEER_API_KEY}"
models:
- llama-3.1-8b
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "sk-peer-secret-123", config.Peers["openrouter"].ApiKey)
})
t.Run("missing env var in peer apiKey", func(t *testing.T) {
content := `
peers:
openrouter:
proxy: https://openrouter.ai/api
apiKey: "${env.NONEXISTENT_PEER_KEY}"
models:
- llama-3.1-8b
`
_, err := LoadConfigFromReader(strings.NewReader(content))
assert.Error(t, err)
assert.Contains(t, err.Error(), "peers.openrouter.apiKey")
assert.Contains(t, err.Error(), "NONEXISTENT_PEER_KEY")
})
t.Run("static apiKey unchanged", func(t *testing.T) {
content := `
peers:
openrouter:
proxy: https://openrouter.ai/api
apiKey: sk-static-key
models:
- llama-3.1-8b
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "sk-static-key", config.Peers["openrouter"].ApiKey)
})
t.Run("multiple peers with env apiKeys", func(t *testing.T) {
t.Setenv("TEST_PEER_KEY_1", "key-one")
t.Setenv("TEST_PEER_KEY_2", "key-two")
content := `
peers:
peer1:
proxy: https://peer1.example.com
apiKey: "${env.TEST_PEER_KEY_1}"
models:
- model-a
peer2:
proxy: https://peer2.example.com
apiKey: "${env.TEST_PEER_KEY_2}"
models:
- model-b
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "key-one", config.Peers["peer1"].ApiKey)
assert.Equal(t, "key-two", config.Peers["peer2"].ApiKey)
})
t.Run("global macro substitution in peer apiKey", func(t *testing.T) {
content := `
macros:
API_KEY: sk-from-global-macro
peers:
openrouter:
proxy: https://openrouter.ai/api
apiKey: "${API_KEY}"
models:
- llama-3.1-8b
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "sk-from-global-macro", config.Peers["openrouter"].ApiKey)
})
t.Run("global macro in peer filters.stripParams", func(t *testing.T) {
content := `
macros:
STRIP_LIST: "temperature, top_p"
peers:
openrouter:
proxy: https://openrouter.ai/api
models:
- llama-3.1-8b
filters:
stripParams: "${STRIP_LIST}"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "temperature, top_p", config.Peers["openrouter"].Filters.StripParams)
})
t.Run("global macro in peer filters.setParams", func(t *testing.T) {
content := `
macros:
MAX_TOKENS: 4096
peers:
openrouter:
proxy: https://openrouter.ai/api
models:
- llama-3.1-8b
filters:
setParams:
max_tokens: "${MAX_TOKENS}"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, 4096, config.Peers["openrouter"].Filters.SetParams["max_tokens"])
})
t.Run("env macro in peer filters.setParams", func(t *testing.T) {
t.Setenv("TEST_RETENTION_POLICY", "deny")
content := `
peers:
openrouter:
proxy: https://openrouter.ai/api
models:
- llama-3.1-8b
filters:
setParams:
data_collection: "${env.TEST_RETENTION_POLICY}"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "deny", config.Peers["openrouter"].Filters.SetParams["data_collection"])
})
t.Run("env macro in peer filters.stripParams", func(t *testing.T) {
t.Setenv("TEST_STRIP_PARAMS", "frequency_penalty, presence_penalty")
content := `
peers:
openrouter:
proxy: https://openrouter.ai/api
models:
- llama-3.1-8b
filters:
stripParams: "${env.TEST_STRIP_PARAMS}"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
assert.Equal(t, "frequency_penalty, presence_penalty", config.Peers["openrouter"].Filters.StripParams)
})
t.Run("unknown macro in peer apiKey fails", func(t *testing.T) {
content := `
peers:
openrouter:
proxy: https://openrouter.ai/api
apiKey: "${UNDEFINED_MACRO}"
models:
- llama-3.1-8b
`
_, err := LoadConfigFromReader(strings.NewReader(content))
assert.Error(t, err)
assert.Contains(t, err.Error(), "peers.openrouter.apiKey")
assert.Contains(t, err.Error(), "unknown macro")
})
t.Run("unknown macro in peer filters.setParams fails", func(t *testing.T) {
content := `
peers:
openrouter:
proxy: https://openrouter.ai/api
models:
- llama-3.1-8b
filters:
setParams:
value: "${UNDEFINED_MACRO}"
`
_, err := LoadConfigFromReader(strings.NewReader(content))
assert.Error(t, err)
assert.Contains(t, err.Error(), "peers.openrouter.filters.setParams")
assert.Contains(t, err.Error(), "unknown macro")
})
}
+81
View File
@@ -0,0 +1,81 @@
package config
import (
"slices"
"sort"
"strings"
)
// ProtectedParams is a list of parameters that cannot be set or stripped via filters
// These are protected to prevent breaking the proxy's ability to route requests correctly
var ProtectedParams = []string{"model"}
// Filters contains filter settings for modifying request parameters
// Used by both models and peers
type Filters struct {
// StripParams is a comma-separated list of parameters to remove from requests
// The "model" parameter can never be removed
StripParams string `yaml:"stripParams"`
// SetParams is a dictionary of parameters to set/override in requests
// Protected params (like "model") cannot be set
SetParams map[string]any `yaml:"setParams"`
}
// SanitizedStripParams returns a sorted list of parameters to strip,
// with duplicates, empty strings, and protected params removed
func (f Filters) SanitizedStripParams() []string {
if f.StripParams == "" {
return nil
}
params := strings.Split(f.StripParams, ",")
cleaned := make([]string, 0, len(params))
seen := make(map[string]bool)
for _, param := range params {
trimmed := strings.TrimSpace(param)
// Skip protected params, empty strings, and duplicates
if slices.Contains(ProtectedParams, trimmed) || trimmed == "" || seen[trimmed] {
continue
}
seen[trimmed] = true
cleaned = append(cleaned, trimmed)
}
if len(cleaned) == 0 {
return nil
}
slices.Sort(cleaned)
return cleaned
}
// SanitizedSetParams returns a copy of SetParams with protected params removed
// and keys sorted for consistent iteration order
func (f Filters) SanitizedSetParams() (map[string]any, []string) {
if len(f.SetParams) == 0 {
return nil, nil
}
result := make(map[string]any, len(f.SetParams))
keys := make([]string, 0, len(f.SetParams))
for key, value := range f.SetParams {
// Skip protected params
if slices.Contains(ProtectedParams, key) {
continue
}
result[key] = value
keys = append(keys, key)
}
// Sort keys for consistent ordering
sort.Strings(keys)
if len(result) == 0 {
return nil, nil
}
return result, keys
}
+168
View File
@@ -0,0 +1,168 @@
package config
import (
"testing"
"github.com/stretchr/testify/assert"
)
func TestFilters_SanitizedStripParams(t *testing.T) {
tests := []struct {
name string
stripParams string
want []string
}{
{
name: "empty string",
stripParams: "",
want: nil,
},
{
name: "single param",
stripParams: "temperature",
want: []string{"temperature"},
},
{
name: "multiple params",
stripParams: "temperature, top_p, top_k",
want: []string{"temperature", "top_k", "top_p"}, // sorted
},
{
name: "model param filtered",
stripParams: "model, temperature, top_p",
want: []string{"temperature", "top_p"},
},
{
name: "only model param",
stripParams: "model",
want: nil,
},
{
name: "duplicates removed",
stripParams: "temperature, top_p, temperature",
want: []string{"temperature", "top_p"},
},
{
name: "extra whitespace",
stripParams: " temperature , top_p ",
want: []string{"temperature", "top_p"},
},
{
name: "empty values filtered",
stripParams: "temperature,,top_p,",
want: []string{"temperature", "top_p"},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
f := Filters{StripParams: tt.stripParams}
got := f.SanitizedStripParams()
assert.Equal(t, tt.want, got)
})
}
}
func TestFilters_SanitizedSetParams(t *testing.T) {
tests := []struct {
name string
setParams map[string]any
wantParams map[string]any
wantKeys []string
}{
{
name: "empty setParams",
setParams: nil,
wantParams: nil,
wantKeys: nil,
},
{
name: "empty map",
setParams: map[string]any{},
wantParams: nil,
wantKeys: nil,
},
{
name: "normal params",
setParams: map[string]any{
"temperature": 0.7,
"top_p": 0.9,
},
wantParams: map[string]any{
"temperature": 0.7,
"top_p": 0.9,
},
wantKeys: []string{"temperature", "top_p"},
},
{
name: "protected model param filtered",
setParams: map[string]any{
"model": "should-be-filtered",
"temperature": 0.7,
},
wantParams: map[string]any{
"temperature": 0.7,
},
wantKeys: []string{"temperature"},
},
{
name: "only protected param",
setParams: map[string]any{
"model": "should-be-filtered",
},
wantParams: nil,
wantKeys: nil,
},
{
name: "complex nested values",
setParams: map[string]any{
"provider": map[string]any{
"data_collection": "deny",
"allow_fallbacks": false,
},
"transforms": []string{"middle-out"},
},
wantParams: map[string]any{
"provider": map[string]any{
"data_collection": "deny",
"allow_fallbacks": false,
},
"transforms": []string{"middle-out"},
},
wantKeys: []string{"provider", "transforms"},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
f := Filters{SetParams: tt.setParams}
gotParams, gotKeys := f.SanitizedSetParams()
assert.Equal(t, len(tt.wantKeys), len(gotKeys), "keys length mismatch")
for i, key := range gotKeys {
assert.Equal(t, tt.wantKeys[i], key, "key mismatch at %d", i)
}
if tt.wantParams == nil {
assert.Nil(t, gotParams, "expected nil params")
return
}
assert.Equal(t, len(tt.wantParams), len(gotParams), "params length mismatch")
for key, wantValue := range tt.wantParams {
gotValue, exists := gotParams[key]
assert.True(t, exists, "missing key: %s", key)
// Simple comparison for basic types
switch v := wantValue.(type) {
case string, int, float64, bool:
assert.Equal(t, v, gotValue, "value mismatch for key %s", key)
}
}
})
}
}
func TestProtectedParams(t *testing.T) {
// Verify that "model" is protected
assert.Contains(t, ProtectedParams, "model")
}
+7 -27
View File
@@ -3,8 +3,6 @@ package config
import ( import (
"errors" "errors"
"runtime" "runtime"
"slices"
"strings"
) )
type ModelConfig struct { type ModelConfig struct {
@@ -74,16 +72,15 @@ func (m *ModelConfig) SanitizedCommand() ([]string, error) {
return SanitizeCommand(m.Cmd) return SanitizeCommand(m.Cmd)
} }
// ModelFilters see issue #174 // ModelFilters embeds Filters and adds legacy support for strip_params field
// See issue #174
type ModelFilters struct { type ModelFilters struct {
StripParams string `yaml:"stripParams"` Filters `yaml:",inline"`
} }
func (m *ModelFilters) UnmarshalYAML(unmarshal func(interface{}) error) error { func (m *ModelFilters) UnmarshalYAML(unmarshal func(interface{}) error) error {
type rawModelFilters ModelFilters type rawModelFilters ModelFilters
defaults := rawModelFilters{ defaults := rawModelFilters{}
StripParams: "",
}
if err := unmarshal(&defaults); err != nil { if err := unmarshal(&defaults); err != nil {
return err return err
@@ -104,25 +101,8 @@ func (m *ModelFilters) UnmarshalYAML(unmarshal func(interface{}) error) error {
return nil return nil
} }
// SanitizedStripParams wraps Filters.SanitizedStripParams for backwards compatibility
// Returns ([]string, error) to match existing API
func (f ModelFilters) SanitizedStripParams() ([]string, error) { func (f ModelFilters) SanitizedStripParams() ([]string, error) {
if f.StripParams == "" { return f.Filters.SanitizedStripParams(), nil
return nil, nil
}
params := strings.Split(f.StripParams, ",")
cleaned := make([]string, 0, len(params))
seen := make(map[string]bool)
for _, param := range params {
trimmed := strings.TrimSpace(param)
if trimmed == "model" || trimmed == "" || seen[trimmed] {
continue
}
seen[trimmed] = true
cleaned = append(cleaned, trimmed)
}
// sort cleaned
slices.Sort(cleaned)
return cleaned, nil
} }
+32
View File
@@ -72,3 +72,35 @@ models:
assert.True(t, *config.Models["model2"].SendLoadingState) assert.True(t, *config.Models["model2"].SendLoadingState)
} }
} }
func TestConfig_ModelFiltersWithSetParams(t *testing.T) {
content := `
models:
model1:
cmd: path/to/cmd --port ${PORT}
filters:
stripParams: "top_k"
setParams:
temperature: 0.7
top_p: 0.9
stop:
- "<|end|>"
- "<|stop|>"
`
config, err := LoadConfigFromReader(strings.NewReader(content))
assert.NoError(t, err)
modelConfig := config.Models["model1"]
// Check stripParams
stripParams, err := modelConfig.Filters.SanitizedStripParams()
assert.NoError(t, err)
assert.Equal(t, []string{"top_k"}, stripParams)
// Check setParams
setParams, keys := modelConfig.Filters.SanitizedSetParams()
assert.NotNil(t, setParams)
assert.Equal(t, []string{"stop", "temperature", "top_p"}, keys)
assert.Equal(t, 0.7, setParams["temperature"])
assert.Equal(t, 0.9, setParams["top_p"])
}
+5 -3
View File
@@ -11,14 +11,16 @@ type PeerConfig struct {
ProxyURL *url.URL `yaml:"-"` ProxyURL *url.URL `yaml:"-"`
ApiKey string `yaml:"apiKey"` ApiKey string `yaml:"apiKey"`
Models []string `yaml:"models"` Models []string `yaml:"models"`
Filters Filters `yaml:"filters"`
} }
func (c *PeerConfig) UnmarshalYAML(unmarshal func(interface{}) error) error { func (c *PeerConfig) UnmarshalYAML(unmarshal func(interface{}) error) error {
type rawPeerConfig PeerConfig type rawPeerConfig PeerConfig
defaults := rawPeerConfig{ defaults := rawPeerConfig{
Proxy: "", Proxy: "",
ApiKey: "", ApiKey: "",
Models: []string{}, Models: []string{},
Filters: Filters{},
} }
if err := unmarshal(&defaults); err != nil { if err := unmarshal(&defaults); err != nil {
+70
View File
@@ -137,3 +137,73 @@ func searchSubstring(s, substr string) bool {
} }
return false return false
} }
func TestPeerConfig_WithFilters(t *testing.T) {
yamlData := `
proxy: https://openrouter.ai/api
apiKey: sk-test
models:
- model_a
filters:
setParams:
temperature: 0.7
provider:
data_collection: deny
`
var config PeerConfig
err := yaml.Unmarshal([]byte(yamlData), &config)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if config.Filters.SetParams == nil {
t.Fatal("Filters.SetParams should not be nil")
}
if config.Filters.SetParams["temperature"] != 0.7 {
t.Errorf("expected temperature 0.7, got %v", config.Filters.SetParams["temperature"])
}
provider, ok := config.Filters.SetParams["provider"].(map[string]any)
if !ok {
t.Fatal("provider should be a map")
}
if provider["data_collection"] != "deny" {
t.Errorf("expected data_collection deny, got %v", provider["data_collection"])
}
}
func TestPeerConfig_WithBothFilters(t *testing.T) {
yamlData := `
proxy: https://openrouter.ai/api
apiKey: sk-test
models:
- model_a
filters:
stripParams: "temperature, top_p"
setParams:
max_tokens: 1000
`
var config PeerConfig
err := yaml.Unmarshal([]byte(yamlData), &config)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
// Check stripParams
stripParams := config.Filters.SanitizedStripParams()
if len(stripParams) != 2 {
t.Errorf("expected 2 strip params, got %d", len(stripParams))
}
if stripParams[0] != "temperature" || stripParams[1] != "top_p" {
t.Errorf("unexpected strip params: %v", stripParams)
}
// Check setParams
if config.Filters.SetParams == nil {
t.Fatal("Filters.SetParams should not be nil")
}
if config.Filters.SetParams["max_tokens"] != 1000 {
t.Errorf("expected max_tokens 1000, got %v", config.Filters.SetParams["max_tokens"])
}
}
+14
View File
@@ -106,6 +106,20 @@ func (p *PeerProxy) HasPeerModel(modelID string) bool {
return found return found
} }
// GetPeerFilters returns the filters for a peer model, or empty filters if not found
func (p *PeerProxy) GetPeerFilters(modelID string) config.Filters {
pp, found := p.proxyMap[modelID]
if !found {
return config.Filters{}
}
// Get the peer config using the peerID
peer, found := p.peers[pp.peerID]
if !found {
return config.Filters{}
}
return peer.Filters
}
func (p *PeerProxy) ListPeers() config.PeerDictionaryConfig { func (p *PeerProxy) ListPeers() config.PeerDictionaryConfig {
return p.peers return p.peers
} }
+38 -1
View File
@@ -277,6 +277,7 @@ func (pm *ProxyManager) setupGinEngine() {
// Set up routes using the Gin engine // Set up routes using the Gin engine
// Protected routes use pm.apiKeyAuth() middleware // Protected routes use pm.apiKeyAuth() middleware
pm.ginEngine.POST("/v1/chat/completions", pm.apiKeyAuth(), pm.proxyInferenceHandler) pm.ginEngine.POST("/v1/chat/completions", pm.apiKeyAuth(), pm.proxyInferenceHandler)
pm.ginEngine.POST("/v1/responses", pm.apiKeyAuth(), pm.proxyInferenceHandler)
// Support legacy /v1/completions api, see issue #12 // Support legacy /v1/completions api, see issue #12
pm.ginEngine.POST("/v1/completions", pm.apiKeyAuth(), pm.proxyInferenceHandler) pm.ginEngine.POST("/v1/completions", pm.apiKeyAuth(), pm.proxyInferenceHandler)
// Support anthropic /v1/messages (added https://github.com/ggml-org/llama.cpp/pull/17570) // Support anthropic /v1/messages (added https://github.com/ggml-org/llama.cpp/pull/17570)
@@ -649,13 +650,49 @@ func (pm *ProxyManager) proxyInferenceHandler(c *gin.Context) {
} }
} }
// issue #453 set/override parameters in the JSON body
setParams, setParamKeys := pm.config.Models[modelID].Filters.SanitizedSetParams()
for _, key := range setParamKeys {
pm.proxyLogger.Debugf("<%s> setting param: %s", modelID, key)
bodyBytes, err = sjson.SetBytes(bodyBytes, key, setParams[key])
if err != nil {
pm.sendErrorResponse(c, http.StatusInternalServerError, fmt.Sprintf("error setting parameter %s in request", key))
return
}
}
pm.proxyLogger.Debugf("ProxyManager using local Process for model: %s", requestedModel) pm.proxyLogger.Debugf("ProxyManager using local Process for model: %s", requestedModel)
nextHandler = processGroup.ProxyRequest nextHandler = processGroup.ProxyRequest
} else if pm.peerProxy != nil && pm.peerProxy.HasPeerModel(requestedModel) { } else if pm.peerProxy != nil && pm.peerProxy.HasPeerModel(requestedModel) {
pm.proxyLogger.Debugf("ProxyManager using ProxyPeer for model: %s", requestedModel) pm.proxyLogger.Debugf("ProxyManager using ProxyPeer for model: %s", requestedModel)
modelID = requestedModel modelID = requestedModel
nextHandler = pm.peerProxy.ProxyRequest
// issue #453 apply filters for peer requests
peerFilters := pm.peerProxy.GetPeerFilters(requestedModel)
// Apply stripParams - remove specified parameters from request
stripParams := peerFilters.SanitizedStripParams()
for _, param := range stripParams {
pm.proxyLogger.Debugf("<%s> stripping param: %s", requestedModel, param)
bodyBytes, err = sjson.DeleteBytes(bodyBytes, param)
if err != nil {
pm.sendErrorResponse(c, http.StatusInternalServerError, fmt.Sprintf("error stripping parameter %s from request", param))
return
}
}
// Apply setParams - set/override specified parameters in request
setParams, setParamKeys := peerFilters.SanitizedSetParams()
for _, key := range setParamKeys {
pm.proxyLogger.Debugf("<%s> setting param: %s", requestedModel, key)
bodyBytes, err = sjson.SetBytes(bodyBytes, key, setParams[key])
if err != nil {
pm.sendErrorResponse(c, http.StatusInternalServerError, fmt.Sprintf("error setting parameter %s in request", key))
return
}
}
nextHandler = pm.peerProxy.ProxyRequest
} }
if nextHandler == nil { if nextHandler == nil {
+3 -1
View File
@@ -966,7 +966,9 @@ func TestProxyManager_ChatContentLength(t *testing.T) {
func TestProxyManager_FiltersStripParams(t *testing.T) { func TestProxyManager_FiltersStripParams(t *testing.T) {
modelConfig := getTestSimpleResponderConfig("model1") modelConfig := getTestSimpleResponderConfig("model1")
modelConfig.Filters = config.ModelFilters{ modelConfig.Filters = config.ModelFilters{
StripParams: "temperature, model, stream", Filters: config.Filters{
StripParams: "temperature, model, stream",
},
} }
config := config.AddDefaultGroupToConfig(config.Config{ config := config.AddDefaultGroupToConfig(config.Config{
+18 -14
View File
@@ -12,7 +12,7 @@
"react-dom": "^19.1.0", "react-dom": "^19.1.0",
"react-icons": "^5.5.0", "react-icons": "^5.5.0",
"react-resizable-panels": "^3.0.4", "react-resizable-panels": "^3.0.4",
"react-router-dom": "^7.6.2" "react-router-dom": "^7.12.0"
}, },
"devDependencies": { "devDependencies": {
"@eslint/js": "^9.25.0", "@eslint/js": "^9.25.0",
@@ -2232,12 +2232,16 @@
"license": "MIT" "license": "MIT"
}, },
"node_modules/cookie": { "node_modules/cookie": {
"version": "1.0.2", "version": "1.1.1",
"resolved": "https://registry.npmjs.org/cookie/-/cookie-1.0.2.tgz", "resolved": "https://registry.npmjs.org/cookie/-/cookie-1.1.1.tgz",
"integrity": "sha512-9Kr/j4O16ISv8zBBhJoi4bXOYNTkFLOqSL3UDB0njXxCXNezjeyVrJyGOWtgfs/q2km1gwBcfH8q1yEGoMYunA==", "integrity": "sha512-ei8Aos7ja0weRpFzJnEA9UHJ/7XQmqglbRwnf2ATjcB9Wq874VKH9kfjjirM6UhU2/E5fFYadylyhFldcqSidQ==",
"license": "MIT", "license": "MIT",
"engines": { "engines": {
"node": ">=18" "node": ">=18"
},
"funding": {
"type": "opencollective",
"url": "https://opencollective.com/express"
} }
}, },
"node_modules/cross-spawn": { "node_modules/cross-spawn": {
@@ -3559,9 +3563,9 @@
} }
}, },
"node_modules/react-router": { "node_modules/react-router": {
"version": "7.6.2", "version": "7.12.0",
"resolved": "https://registry.npmjs.org/react-router/-/react-router-7.6.2.tgz", "resolved": "https://registry.npmjs.org/react-router/-/react-router-7.12.0.tgz",
"integrity": "sha512-U7Nv3y+bMimgWjhlT5CRdzHPu2/KVmqPwKUCChW8en5P3znxUqwlYFlbmyj8Rgp1SF6zs5X4+77kBVknkg6a0w==", "integrity": "sha512-kTPDYPFzDVGIIGNLS5VJykK0HfHLY5MF3b+xj0/tTyNYL1gF1qs7u67Z9jEhQk2sQ98SUaHxlG31g1JtF7IfVw==",
"license": "MIT", "license": "MIT",
"dependencies": { "dependencies": {
"cookie": "^1.0.1", "cookie": "^1.0.1",
@@ -3581,12 +3585,12 @@
} }
}, },
"node_modules/react-router-dom": { "node_modules/react-router-dom": {
"version": "7.6.2", "version": "7.12.0",
"resolved": "https://registry.npmjs.org/react-router-dom/-/react-router-dom-7.6.2.tgz", "resolved": "https://registry.npmjs.org/react-router-dom/-/react-router-dom-7.12.0.tgz",
"integrity": "sha512-Q8zb6VlTbdYKK5JJBLQEN06oTUa/RAbG/oQS1auK1I0TbJOXktqm+QENEVJU6QvWynlXPRBXI3fiOQcSEA78rA==", "integrity": "sha512-pfO9fiBcpEfX4Tx+iTYKDtPbrSLLCbwJ5EqP+SPYQu1VYCXdy79GSj0wttR0U4cikVdlImZuEZ/9ZNCgoaxwBA==",
"license": "MIT", "license": "MIT",
"dependencies": { "dependencies": {
"react-router": "7.6.2" "react-router": "7.12.0"
}, },
"engines": { "engines": {
"node": ">=20.0.0" "node": ">=20.0.0"
@@ -3705,9 +3709,9 @@
} }
}, },
"node_modules/set-cookie-parser": { "node_modules/set-cookie-parser": {
"version": "2.7.1", "version": "2.7.2",
"resolved": "https://registry.npmjs.org/set-cookie-parser/-/set-cookie-parser-2.7.1.tgz", "resolved": "https://registry.npmjs.org/set-cookie-parser/-/set-cookie-parser-2.7.2.tgz",
"integrity": "sha512-IOc8uWeOZgnb3ptbCURJWNjWUPcO3ZnTTdzsurqERrP6nPyv+paC55vJM0LpOlT2ne+Ix+9+CRG1MNLlyZ4GjQ==", "integrity": "sha512-oeM1lpU/UvhTxw+g3cIfxXHyJRc/uidd3yK1P242gzHds0udQBYzs3y8j4gCCW+ZJ7ad0yctld8RYO+bdurlvw==",
"license": "MIT" "license": "MIT"
}, },
"node_modules/shebang-command": { "node_modules/shebang-command": {
+1 -1
View File
@@ -14,7 +14,7 @@
"react-dom": "^19.1.0", "react-dom": "^19.1.0",
"react-icons": "^5.5.0", "react-icons": "^5.5.0",
"react-resizable-panels": "^3.0.4", "react-resizable-panels": "^3.0.4",
"react-router-dom": "^7.6.2" "react-router-dom": "^7.12.0"
}, },
"devDependencies": { "devDependencies": {
"@eslint/js": "^9.25.0", "@eslint/js": "^9.25.0",