Install uv after the cpp tool binaries are copied and before the
llama-swap binary, enabling `uv run` usage for Python-based inference
backends like vLLM.
- add python3-pip to runtime apt installs
- add `pip install uv --break-system-packages` after cpp installs
Fixes #628
Co-authored-by: Claude <noreply@anthropic.com>
Expose CMAKE_CUDA_ARCHITECTURES as a Docker build ARG so users can
customize CUDA architectures via --build-arg without editing the
Dockerfile.
- convert hardcoded ENV to ARG with default, feeding into ENV
- replace silent fallback defaults (:-) in scripts with :? guards
to fail fast if the env var is missing
- add usage example to Dockerfile header
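
The difference between the two parameter expansions can be sketched as below. This is a minimal illustration, not the actual build scripts: the function names and the `89` fallback value are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers; the real scripts are not reproduced here.

# Old behaviour: ':-' silently substitutes a default when the
# variable is unset or empty, hiding a misconfigured build.
arch_with_fallback() {
    echo "${CMAKE_CUDA_ARCHITECTURES:-89}"   # "89" is an illustrative default only
}

# New behaviour: ':?' prints the message to stderr and aborts when
# the variable is unset or empty, so the build fails fast.
arch_with_guard() {
    echo "${CMAKE_CUDA_ARCHITECTURES:?CMAKE_CUDA_ARCHITECTURES must be set}"
}
```

With the ARG in place, a build would presumably be invoked along the lines of `docker build --build-arg CMAKE_CUDA_ARCHITECTURES="86;89" .`, where the architecture list is only an example value.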
Follow up to: #624
https://claude.ai/code/session_01EWiUe7jNABX7Uz95dUGJqK
Co-authored-by: Claude <noreply@anthropic.com>
Multiple fixes to the Vulkan build:
- use Ubuntu 26.04 for compatibility with AMD 395+ (Strix Halo) hardware
- add a home directory in the container
- fix the stable-diffusion install so that it actually enables Vulkan
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- set up a GHA scheduled job to build the container nightly
- enable pushing llama-swap:unified and llama-swap:unified-Y-M-D
images to ghcr.io
- tidy up the Dockerfile to use a non-root user and llama-swap as the
entrypoint
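
The two tags from the nightly job can be sketched as below. This is a rough illustration only: the `IMAGE` path is a placeholder, and it assumes Y-M-D means a zero-padded UTC date; the workflow itself may compute the tag differently.

```shell
#!/bin/sh
# Illustrative only: IMAGE is a placeholder, not the project's actual registry path.
IMAGE="ghcr.io/OWNER/llama-swap"

# Assumption: Y-M-D is a zero-padded UTC date, e.g. 2025-01-31.
DATE_TAG="unified-$(date -u +%Y-%m-%d)"

# The rolling tag and the dated tag that the nightly build would push.
echo "${IMAGE}:unified"
echo "${IMAGE}:${DATE_TAG}"
```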