llama-swap

Author	SHA1	Message	Date
Benson Wong	7931212d3e	proxy: add v1/images/edits API endpoint (#447) Updates #433	2026-01-01 12:43:06 -08:00
Benson Wong	37d74efc2d	proxy: add /v1/images/generations (#443) Add support for the /v1/images/generations endpoint Updates #433 Closes #191	2025-12-30 21:04:58 -08:00
Benson Wong	53b32f3601	proxy: add API key support (#436) Add configuration support for api keys that are enforced by llama-swap. Keys are stripped before sending them to upstream servers. Updates: #433, #50 and #251	2025-12-23 23:39:33 -08:00
Benson Wong	d3f329f924	proxy: Improve logging performance and allow separate log streaming (#421) Replace container/ring.Ring with a custom circularBuffer that uses a single contiguous []byte slice. This fixes the original implementation which created 10,240 ring elements instead of 10KB of storage. GetHistory is now 139x faster (145μs → 1μs) and uses 117x less memory (1.2MB → 10KB). Allocations reduced from 2 to 1 per write operation. Create a LogMonitor per proxy.Process, replacing the usage of a shared one. The buffer in LogMonitor is lazy allocated on the first call to Write and freed when the Process is stopped. This reduces unnecessary memory usage when a model is not active. The /logs/stream/{model_id} endpoint was added to stream logs from a specific process.	2025-12-18 21:49:25 -08:00
Benson Wong	7b3b0f5eae	move header images around [skip ci]	2025-12-02 19:40:42 -08:00
Benson Wong	021ccceef1	README: update hero image	2025-12-02 19:37:03 -08:00
Benson Wong	f03871c50a	Update README.md - add supported anthropic API - add example for docker hot reload support	2025-12-02 19:03:01 -08:00
Ryan Steed	dc00d17abe	docs: add documentation for non-root container images and security considerations (#416) * docs: add documentation for non-root container images and security considerations * docs: move container security section to dedicated file and update README links	2025-12-02 08:52:26 -08:00
Benson Wong	3567b7df08	Update image in README.md for web UI section	2025-11-08 15:29:37 -08:00
Benson Wong	7ff50631e0	Update README for setup instructions clarity [skip ci]	2025-10-19 14:55:23 -07:00
Benson Wong	9fc0431531	Clean up and Documentation (#347) [skip ci] * cmd,misc: move misc binaries to cmd/ * docs: add docs and move examples/ there * misc: remove unused misc/assets dir * docs: add configuration.md * update README with better structure Updates: #334	2025-10-19 14:53:13 -07:00
Artur Podsiadły	558801db1a	Fix nginx proxy buffering for streaming endpoints (#295) * Fix nginx proxy buffering for streaming endpoints - Add X-Accel-Buffering: no header to SSE endpoints (/api/events, /logs/stream) - Add X-Accel-Buffering: no header to proxied text/event-stream responses - Add nginx reverse proxy configuration section to README - Add tests for X-Accel-Buffering header on streaming endpoints Fixes #236 * Fix goroutine cleanup in streaming endpoints test Add context cancellation to TestProxyManager_StreamingEndpointsReturnNoBufferingHeader to ensure the goroutine is properly cleaned up when the test completes.	2025-09-09 16:07:46 -07:00
Benson Wong	954e2dee73	Remove `cmdStart` from README [skip ci] cmdStart was in the README but it doesn't exist. Fixed the typo. Oops.	2025-09-04 11:57:28 -07:00
Benson Wong	2457840698	Update README.md [skip ci]	2025-08-28 23:44:37 -07:00
Benson Wong	7f55494151	Update README.md [skip ci]	2025-08-28 22:47:28 -07:00
Yandrik	977f1856bb	add /completion endpoint (#275) * feat: add /completion endpoint * chore: reformat using gofmt	2025-08-28 21:41:02 -07:00
Benson Wong	57803fd3aa	Support llama-server's /infill endpoint (#272) Add support for llama-server's /infill endpoint and metrics gathering on the Activities page.	2025-08-27 08:36:05 -07:00
Benson Wong	5dc6b3e6d9	Add barebones but working implementation of model preload (#209, #235) Add barebones but working implementation of model preload * add config test for Preload hook * improve TestProxyManager_StartupHooks * docs for new hook configuration * add a .dev to .gitignore	2025-08-14 10:27:28 -07:00
Benson Wong	a186318892	Update Readme, Add screenshot for Activities page [skip ci]	2025-08-08 13:39:46 -07:00
Benson Wong	c4e4d5e1e9	Update Readme UI Screenshot [skip ci]	2025-08-08 13:33:47 -07:00
Benson Wong	701476c0c4	Update README.md - remove contributor block [skip ci] Contributor information available on the Github page's sidebar. Redundant.	2025-08-06 11:11:47 -07:00
Martin Garton	8be5073c51	Fix typo (#223) [skip ci] Fix typo `lama-swap` -> `llama-swap`	2025-08-06 10:02:38 -07:00
Ryein Goddard	ba0a81937a	Update README.md (#216) Update git clone protocol to https	2025-08-01 19:48:09 -07:00
Benson Wong	5172cb2e12	Update docs in Readme [skip ci]	2025-07-30 11:51:14 -07:00
Ian Sebastian Mathew	bbaf172956	add trigger to rebuild homebrew formula (#210)	2025-07-30 10:12:21 -07:00
Gaël James	8c693e7fcf	Add endpoint aliases for reranking models (#201) * Add endpoint aliases for reranking models * Add MetricsMiddleware to the previous reranking endpoint * Fix the embeddings endpoint not having model set	2025-07-24 08:32:47 -07:00
Benson Wong	accd65294b	add contributors to README [skip ci]	2025-07-21 23:16:48 -07:00
Benson Wong	7472a25864	Update README.md [skip ci] update screenshot for web UI	2025-07-21 23:08:19 -07:00
Benson Wong	717d64e336	update GUI image in README [skip ci]	2025-06-24 10:38:28 -07:00
Benson Wong	a6b2e930d8	Update README.md [skip ci]	2025-06-18 11:47:08 -07:00
Benson Wong	3fce9ee0e9	Update README.md [skip ci]	2025-06-17 09:53:22 -07:00
Benson Wong	5899ae7966	Update README.md [skip ci]	2025-06-17 09:52:47 -07:00
Benson Wong	4d02ccd26a	Update README.md [skip ci]	2025-05-30 09:38:45 -07:00
Benson Wong	dfd47eeac4	Readme updates [skip ci]	2025-05-30 09:19:08 -07:00
Benson Wong	1ac6499c08	Add macros to Configuration schema (#149) * Add macros to Configuration schema * update docs	2025-05-29 21:51:25 -07:00
Benson Wong	25f3dc25e7	small doc update [skip ci]	2025-05-26 16:03:27 -07:00
Benson Wong	8422e4e6a1	move some docs to the wiki [no-ci]	2025-05-26 15:46:08 -07:00
Benson Wong	02ee29d881	increase default healthCheckTimeout to 120s	2025-05-26 09:57:53 -07:00
Yuta Hayashibe	fb44cf4e08	Fix typos (#143)	2025-05-23 08:40:15 -07:00
Benson Wong	6e2ff28d59	improve cmdStop docs [no ci]	2025-05-16 13:52:04 -07:00
Benson Wong	a8b81f2799	Add stopCmd for custom stopping instructions (#136) Allow configuration of how a model is stopped before swapping. Setting `cmdStop` in the configuration will override the default behaviour and enables better integration with other process/container managers like docker or podman.	2025-05-16 13:48:42 -07:00
Benson Wong	f9ee7156dc	update configuration examples for multiline yaml commands #133	2025-05-16 11:45:39 -07:00
Sam	bc652709a5	Add config hot-reload (#106) introduce --watch-config command line option to reload ProxyManager when configuration changes.	2025-05-11 17:37:00 -07:00
Benson Wong	5c5a5da664	Update README.md removed extra section.	2025-05-06 06:59:15 -07:00
Benson Wong	09e52c0500	Automatic Port Numbers (#105) Add automatic port numbers assignment in configuration file. The string `${PORT}` will be substituted in model.cmd and model.proxy for an actual port number. This also allows model.proxy to be omitted from the configuration.	2025-05-05 17:07:43 -07:00
Benson Wong	448ccae959	Introduce Groups Feature (#107) Groups allows more control over swapping behaviour when a model is requested. The new groups feature provides three ways to control swapping: within the group, swapping out other groups or keep the models in the group loaded persistently (never swapped out). Closes #96, #99 and #106.	2025-05-02 22:35:38 -07:00
Benson Wong	1f7aa359b1	Update header image AI has finally made my dreams of llamas in funny clothing and stuck in a claw machine waiting to be picked come true!	2025-04-23 13:02:12 -07:00
Benson Wong	b138d6cf25	fix starhistory in README	2025-04-15 20:23:46 -07:00
Benson Wong	b8f888f864	Logging Improvements (#88) This change revamps the internal logging architecture to be more flexible and descriptive. Previously all logs from both llama-swap and upstream services were mixed together. This makes it harder to troubleshoot and identify problems. This PR adds these new endpoints: - `/logs/stream/proxy` - just llama-swap's logs - `/logs/stream/upstream` - stdout output from the upstream server	2025-04-04 21:01:33 -07:00
Benson Wong	5565fca3ac	add some badges to README	2025-03-19 11:25:06 -07:00
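
Commit d3f329f924 in the log above describes replacing container/ring.Ring with a circular buffer backed by a single contiguous []byte slice that is allocated lazily on the first Write. The following is a minimal sketch of that general idea only, not the code from the commit: the type name `circularBuffer` is borrowed from the commit message, while the fields, the constructor `newCircularBuffer`, and the `History` and `Reset` methods are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// circularBuffer is an illustrative fixed-capacity byte ring backed by one
// contiguous slice. The layout is an assumption, not llama-swap's own code.
type circularBuffer struct {
	data     []byte // allocated lazily on the first Write
	capacity int    // maximum number of bytes retained
	head     int    // index where the next byte will be written
	size     int    // number of valid bytes (never exceeds capacity)
}

func newCircularBuffer(capacity int) *circularBuffer {
	return &circularBuffer{capacity: capacity}
}

// Write appends p, overwriting the oldest bytes once the buffer is full.
func (b *circularBuffer) Write(p []byte) (int, error) {
	n := len(p)
	if b.capacity <= 0 {
		return n, nil
	}
	if b.data == nil {
		b.data = make([]byte, b.capacity)
	}
	// Only the last `capacity` bytes of an oversized write can survive.
	if len(p) > b.capacity {
		p = p[len(p)-b.capacity:]
	}
	// Copy in at most two segments: up to the end of the slice, then wrap.
	first := copy(b.data[b.head:], p)
	copy(b.data, p[first:])
	b.head = (b.head + len(p)) % b.capacity
	b.size += len(p)
	if b.size > b.capacity {
		b.size = b.capacity
	}
	return n, nil
}

// History returns a copy of the retained bytes, oldest first.
func (b *circularBuffer) History() []byte {
	out := make([]byte, b.size)
	if b.size == 0 {
		return out
	}
	start := (b.head - b.size + b.capacity) % b.capacity
	n := copy(out, b.data[start:])
	copy(out[n:], b.data[:b.size-n])
	return out
}

// Reset drops the backing slice so the memory can be reclaimed.
func (b *circularBuffer) Reset() {
	b.data, b.head, b.size = nil, 0, 0
}

func main() {
	buf := newCircularBuffer(8)
	buf.Write([]byte("hello "))
	buf.Write([]byte("world"))
	fmt.Printf("%q\n", buf.History()) // prints "lo world"
}
```

With a capacity of 8 bytes, writing "hello " and then "world" keeps only the most recent 8 bytes. A single backing slice means storage is bounded by the capacity itself rather than by a per-element node structure, which is the contrast the commit message draws.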


100 Commits