llama-swap

Author	SHA1	Message	Date
Benson Wong	66639e83f7	proxy: replace fsnotify with stat-poll watcher and add SIGHUP reload (#685) The fsnotify-based config watcher does not work reliably when the config file is bind-mounted into a Docker container as an individual file, and mishandles k8s ConfigMap projections (atomically swapped symlinks). Replace it with a small os.Stat-polling watcher and add SIGHUP as an explicit reload signal. - new proxy/configwatcher package: 2s os.Stat poller, follows symlinks, fires on mtime/size change and on missing -> present transitions - SIGHUP triggers reload unconditionally (works without --watch-config) via the same ConfigFileChangedEvent pipeline so the UI sees identical state transitions - watcher goroutine now exits cleanly on shutdown via a context - drop github.com/fsnotify/fsnotify dependency fixes #682	2026-04-21 23:21:48 -07:00
Benson Wong	5e3c646829	proxy: compress captures with zstd (#668) The previous captures were saved uncompressed in memory. In agentic workflows there can be many turns with each request containing the previous context in the body with a lot of redundant data. Use zstd to compress the request and response data before keeping a copy in memory. Results: - Average Percentage Saved: 73.19% - Average Compression Factor: ~6.77:1	2026-04-17 23:29:37 -07:00
Benson Wong	a3725e7d09	Update go.mod to 1.26.1 (#593)	2026-03-20 16:09:58 +09:00
Benson Wong	15bd55d3a9	proxy, ui-svelte: add /sdapi/v1 endpoint support (#587) Add proxy routes for stable-diffusion.cpp's /sdapi/v1/txt2img, /sdapi/v1/img2img, and /sdapi/v1/loras endpoints. POST endpoints use proxyInferenceHandler (model in JSON body), GET /loras uses proxyGETModelHandler (model in query param). Update the image playground with a dual-mode UI supporting both OpenAI and SDAPI backends. In SDAPI mode, loras are fetched first to prime the server-side cache, and all txt2img parameters are exposed (negative prompt, steps, cfg_scale, seed, batch_size, clip_skip, sampler, scheduler, lora selection with multipliers). - Add 3 sdapi route registrations in proxymanager.go - Add sdApi.ts client with generateSdImage and fetchSdLoras - Add SDAPI types (SdApiTxt2ImgRequest, SdApiResponse, etc.) - Add /sdapi to vite dev proxy config - Add backend tests for sdapi routing - Support batch image display in gallery grid https://claude.ai/code/session_0186MGX6NXdHVBTv2KH45fqn --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-03-19 22:08:31 +09:00
Benson Wong	bccce5fa19	go.mod,ui/package-lock.json: dependency and security updates (#418)	2025-11-29 22:27:22 -08:00
Benson Wong	6299c1b874	Fix High CPU (#189) * vendor in kelindar/event lib and refactor to remove time.Ticker	2025-07-15 18:04:30 -07:00
Benson Wong	1921e570d7	Add Event Bus (#184) Major internal refactor to use an event bus to pass event/messages along. These changes are largely invisible user facing but sets up internal design for real time stats and information. - `--watch-config` logic refactored for events - remove multiple SSE api endpoints, replaced with /api/events - keep all functionality essentially the same - UI/backend sync is in near real time now	2025-07-01 22:17:35 -07:00
Benson Wong	d7b390df74	Add GH Action for Testing on Windows (#132) * Add windows specific test changes * Change the command line parsing library - Possible breaking changes for windows users!	2025-05-14 21:51:53 -07:00
Sam	bc652709a5	Add config hot-reload (#106) introduce --watch-config command line option to reload ProxyManager when configuration changes.	2025-05-11 17:37:00 -07:00
Benson Wong	8404244fab	Moderate security update for golang/x/net -> v0.38.0	2025-04-24 09:58:40 -07:00
Benson Wong	671c1a5a7b	update deps	2025-03-13 14:00:15 -07:00
Benson Wong	b3d331da0d	Properly strip profile name slug from models fixes (#62) The profile slug in a model name, `profile:model`, is specific to llama-swap. This strips `profile:` out of the model name request so upstreams that expect just `model` work and do not require knowing about the profile slug.	2025-03-09 12:41:52 -07:00
Benson Wong	db6715bec3	update golang.org/x/net -> v0.33.0 for dependabot	2024-12-20 11:28:32 -08:00
Benson Wong	d4e22cceaa	Fix security vulnerability with golang.org/x/crypto - does not affect the project as llama-swap does not use the crypto libraries - good practice to keep security deps updated!	2024-12-14 10:20:22 -08:00
Benson Wong	9fc5d5b5eb	improve cmd parsing (#22) Switch from using a naive strings.Fields() to shlex.Split() for parsing the model startup command into a string[]. This makes parsing much more reliable around newlines, quotes, etc.	2024-12-01 09:02:58 -08:00
Benson Wong	c3b4bb1684	use gin for http server	2024-11-18 15:30:16 -08:00
Benson Wong	be82d1a6a0	Support multiline cmds in YAML configuration Add support for multiline `cmd` configurations allowing for nicer looking configuration YAML files.	2024-10-19 20:06:59 -07:00
Benson Wong	4fae7cf946	update docs	2024-10-04 21:11:08 -07:00
Benson Wong	ef05c05f9c	renaming to llama-swap	2024-10-04 20:21:11 -07:00
Benson Wong	844615bfcc	rename to llamagate	2024-10-04 11:09:36 -07:00
Benson Wong	f44faf5a93	move config to its own package	2024-10-03 21:08:11 -07:00
Benson Wong	b63b81b121	first commit	2024-10-03 20:20:01 -07:00

22 Commits