Benson Wong
ec0348e431
Reduce stale time for issues
2025-04-29 21:16:34 -07:00
Benson Wong
06eda7f591
tag all process logs with their Process ID ( #103 )
...
Makes it easier to identify which Process a log message came from
v106
2025-04-25 12:58:25 -07:00
Benson Wong
5fad24c16f
Make checkHealthTimeout Interruptible during startup ( #102 )
...
Interrupt and exit Process.start() early if the upstream process exits prematurely or unexpectedly.
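
A minimal sketch of the idea, not the project's actual code: a health-check wait loop that returns as soon as the upstream process exits instead of waiting for the full timeout. The names `waitHealthy`, `errUpstreamExited`, `errHealthTimeout`, and the `exited` channel are hypothetical and chosen for illustration only.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical error values; the real project may use different ones.
var (
	errUpstreamExited = errors.New("upstream process exited before becoming healthy")
	errHealthTimeout  = errors.New("health check timed out")
)

// waitHealthy polls check() every interval until it reports healthy.
// It returns early if exited is closed (the upstream process died)
// or if timeout elapses first.
func waitHealthy(exited <-chan struct{}, check func() bool, interval, timeout time.Duration) error {
	deadline := time.After(timeout)
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	for {
		if check() {
			return nil
		}
		select {
		case <-exited:
			return errUpstreamExited
		case <-deadline:
			return errHealthTimeout
		case <-ticker.C:
			// Poll again on the next loop iteration.
		}
	}
}

func main() {
	exited := make(chan struct{})
	go func() {
		// Simulate an upstream process that crashes shortly after launch.
		time.Sleep(50 * time.Millisecond)
		close(exited)
	}()

	err := waitHealthy(exited, func() bool { return false }, 10*time.Millisecond, 5*time.Second)
	fmt.Println(err)
}
```

In this sketch the caller learns about the crash after roughly 50 ms rather than after the 5 second timeout.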
v105
2025-04-24 14:39:33 -07:00
Benson Wong
8404244fab
Moderate security update for golang/x/net -> v0.38.0
2025-04-24 09:58:40 -07:00
Benson Wong
712cd01081
fix confusing INFO message [no ci]
2025-04-24 09:56:20 -07:00
Benson Wong
1f7aa359b1
Update header image
...
AI has finally made my dreams of llamas in funny clothing and stuck in
a claw machine waiting to be picked come true!
2025-04-23 13:02:12 -07:00
Benson Wong
b138d6cf25
fix starhistory in README
2025-04-15 20:23:46 -07:00
Benson Wong
fb7c808082
add timing for Process start, stop, total request time ( #91 )
v104
2025-04-14 14:34:59 -07:00
Benson Wong
a7e640b0f7
add aider example
2025-04-07 12:37:14 -07:00
Benson Wong
593604dfdc
Show proxy and upstream logs in separate columns in logs UI
v103
2025-04-05 10:36:54 -07:00
Benson Wong
b8f888f864
Logging Improvements ( #88 )
...
This change revamps the internal logging architecture to be more flexible and descriptive. Previously, all logs from both llama-swap and upstream services were mixed together, which made it harder to troubleshoot and identify problems. This PR adds these new endpoints:
- `/logs/stream/proxy` - just llama-swap's logs
- `/logs/stream/upstream` - stdout output from the upstream server
v102
2025-04-04 21:01:33 -07:00
Benson Wong
192b2ae621
Remove no longer needed test
v101
2025-04-04 14:46:01 -07:00
Benson Wong
b7f8cb5094
Limit Access-Control-Allow-Origin to OPTIONS preflight requests #85
2025-04-04 14:44:35 -07:00
Benson Wong
a23da6eb57
Sanitize CORS headers ( #85 )
...
Add a sanitization step for `Access-Control-Allow-Headers` when echoing back user-supplied headers
v100
2025-04-01 08:43:53 -07:00
Grigorii Khvatskii
4c3aa40564
add graceful process termination on windows ( #82 )
v99
2025-03-25 15:26:33 -07:00
Benson Wong
84e2c07a7e
Refactor wildcard out of CORS headers ( #81 )
...
Changes to CORS functionality:
- `Access-Control-Allow-Origin: *` is set for all requests
- for pre-flight OPTIONS requests:
  - specify methods: `Access-Control-Allow-Methods: GET, POST, PUT, PATCH, DELETE, OPTIONS`
  - if the client sent `Access-Control-Request-Headers`, echo back the same value in `Access-Control-Allow-Headers`; if no `Access-Control-Request-Headers` were sent, send back a default set
  - set `Access-Control-Max-Age: 86400`, which may improve performance
- Add CORS tests to the proxy-manager
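
The behavior listed above could be sketched as a small Go middleware. This is an illustration under assumptions, not the code from the PR: `corsHeaders`, `withCORS`, the contents of `defaultAllowHeaders`, and the 204 status for preflight responses are all hypothetical. The requested headers are echoed back unmodified here, as this commit describes.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// defaultAllowHeaders is an assumed default set; the real list may differ.
const defaultAllowHeaders = "Content-Type, Authorization"

// corsHeaders returns the CORS response headers for a request with the
// given method and Access-Control-Request-Headers value.
func corsHeaders(method, requestedHeaders string) map[string]string {
	h := map[string]string{"Access-Control-Allow-Origin": "*"}
	if method != http.MethodOptions {
		return h
	}
	// Preflight-only headers.
	h["Access-Control-Allow-Methods"] = "GET, POST, PUT, PATCH, DELETE, OPTIONS"
	if requestedHeaders != "" {
		h["Access-Control-Allow-Headers"] = requestedHeaders
	} else {
		h["Access-Control-Allow-Headers"] = defaultAllowHeaders
	}
	h["Access-Control-Max-Age"] = "86400"
	return h
}

// withCORS applies corsHeaders to every response and answers preflight
// requests directly (204 is an assumption of this sketch).
func withCORS(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for k, v := range corsHeaders(r.Method, r.Header.Get("Access-Control-Request-Headers")) {
			w.Header().Set(k, v)
		}
		if r.Method == http.MethodOptions {
			w.WriteHeader(http.StatusNoContent)
			return
		}
		next.ServeHTTP(w, r)
	})
}

func main() {
	handler := withCORS(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	}))

	req := httptest.NewRequest(http.MethodOptions, "/v1/chat/completions", nil)
	req.Header.Set("Access-Control-Request-Headers", "X-Custom")
	rec := httptest.NewRecorder()
	handler.ServeHTTP(rec, req)

	fmt.Println(rec.Code, rec.Header().Get("Access-Control-Allow-Headers")) // prints "204 X-Custom"
}
```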
2025-03-25 15:24:43 -07:00
Benson Wong
680af28bcc
Allow very permissive CORS headers ( #77 )
v98
2025-03-20 15:50:21 -07:00
Benson Wong
d94db42ffe
fix bug checking incorrect error
2025-03-20 15:49:36 -07:00
Benson Wong
93cd83c55c
add override for windows ( #76 )
2025-03-20 13:23:04 -07:00
Benson Wong
5565fca3ac
add some badges to README
2025-03-19 11:25:06 -07:00
Benson Wong
d625ab8d92
Refactor process state management ( #70 ) ( #73 )
...
* add isValidStateTransition helper function
* Replace Process.setState() with Process.swapState()
* Refactor locking logic in Process
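
A rough sketch of what a guarded state swap can look like, for readers unfamiliar with the pattern. Only the names `isValidStateTransition` and `swapState` come from this commit; the signatures, the transition table, the error values, and the `Process` fields below are assumptions made for illustration. The four state names are the ones listed in the `/running` commit message.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

type ProcessState string

const (
	StateStopped  ProcessState = "stopped"
	StateStarting ProcessState = "starting"
	StateReady    ProcessState = "ready"
	StateStopping ProcessState = "stopping"
)

// Hypothetical error values for this sketch.
var (
	ErrExpectedStateMismatch  = errors.New("current state does not match expected state")
	ErrInvalidStateTransition = errors.New("invalid state transition")
)

// isValidStateTransition encodes an assumed transition table; the
// project's real table may allow different moves.
func isValidStateTransition(from, to ProcessState) bool {
	switch from {
	case StateStopped:
		return to == StateStarting
	case StateStarting:
		return to == StateReady || to == StateStopping
	case StateReady:
		return to == StateStopping
	case StateStopping:
		return to == StateStopped
	}
	return false
}

type Process struct {
	mu    sync.Mutex
	state ProcessState
}

// swapState moves to next only if the current state equals expected and
// the transition is allowed. It returns the state after the call.
func (p *Process) swapState(expected, next ProcessState) (ProcessState, error) {
	p.mu.Lock()
	defer p.mu.Unlock()

	if p.state != expected {
		return p.state, ErrExpectedStateMismatch
	}
	if !isValidStateTransition(expected, next) {
		return p.state, ErrInvalidStateTransition
	}
	p.state = next
	return p.state, nil
}

func main() {
	p := &Process{state: StateStopped}
	fmt.Println(p.swapState(StateStopped, StateStarting)) // prints "starting <nil>"
	fmt.Println(p.swapState(StateStopped, StateStarting)) // prints the mismatch error
}
```

Compared with an unconditional setter, a compare-and-swap style call makes concurrent callers state what they believe the current state is, so a stale caller fails instead of silently overwriting a newer state.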
v96
2025-03-15 17:14:03 -07:00
Benson Wong
a3f82c140b
tidy up config examples in README
2025-03-15 10:36:45 -07:00
Benson Wong
5c97299e7b
Add support for sending a custom model name to upstream ( #69 ) ( #71 )
...
* add test for splitRequestedModel()
* Add `useModelName` parameter to model configuration
* add docs to README
v95
2025-03-14 21:07:52 -07:00
Benson Wong
671c1a5a7b
update deps
v94
2025-03-13 14:00:15 -07:00
Benson Wong
52c0196e0f
clean up feature list in readme
2025-03-13 13:55:20 -07:00
Benson Wong
3201a68a04
Add /v1/audio/transcriptions support ( #41 )
...
* add support for /v1/audio/transcriptions
2025-03-13 13:49:39 -07:00
Florin-Gabriel Dumitru
3ac94ad20e
Adds an endpoint '/running' ( #61 )
...
* Adds an endpoint '/running' that returns either an empty JSON object if no model has been loaded so far, or the last model loaded (model key) and its current state (state key). Possible state values are: stopped, starting, ready and stopping.
* Improves the `/running` endpoint by allowing multiple entries under the `running` key within the JSON response.
Refactors the `/running` method name (listRunningProcessesHandler).
Removes the unlisted filter implementation.
* Adds tests for:
  - no model loaded
  - one model loaded
  - multiple models loaded
* Adds simple comments.
* Simplified code structure as per 250313 comments on PR #65.
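
For illustration, a response with the `running`, `model`, and `state` keys described above could be encoded as follows. This is a sketch, not the handler from the PR: the type names, `encodeRunning`, and the choice to return an empty list when nothing is loaded are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// runningEntry describes one loaded model and its current state.
type runningEntry struct {
	Model string `json:"model"`
	State string `json:"state"`
}

type runningResponse struct {
	Running []runningEntry `json:"running"`
}

// encodeRunning serializes the entries. A nil slice is replaced with an
// empty one so the output is [] rather than null (an assumption about
// the no-model case).
func encodeRunning(entries []runningEntry) (string, error) {
	if entries == nil {
		entries = []runningEntry{}
	}
	b, err := json.Marshal(runningResponse{Running: entries})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, _ := encodeRunning([]runningEntry{{Model: "llama", State: "ready"}})
	fmt.Println(out) // prints {"running":[{"model":"llama","state":"ready"}]}
}
```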
---------
Co-authored-by: FGDumitru|B <xelotx@gmail.com>
2025-03-13 13:42:59 -07:00
Benson Wong
60355bf74a
fix a potentially confusing Process.start() comment
2025-03-11 11:00:45 -07:00
Benson Wong
9b2ed244e2
Improve continuous integration and fix concurrency bugs ( #66 )
...
- improvements to the continuous integration GH actions
- fix edge-case concurrency bugs with Process.start() and state transitions discovered while setting up CI.
v93
2025-03-11 10:39:14 -07:00
Benson Wong
eeb72297f7
add first version of CI for go
2025-03-11 08:45:56 -07:00
Benson Wong
eabfe70cc6
add GH action to close inactive issues
2025-03-09 19:51:48 -07:00
Benson Wong
29cd98878d
better container build logic when upstream containers do not exist
2025-03-09 13:02:06 -07:00
Benson Wong
b3d331da0d
Properly strip profile name slug from models (fixes #62)
...
The profile slug in a model name, `profile:model`, is specific to
llama-swap. This strips `profile:` out of the model name request so
upstreams that expect just `model` work and do not require knowing about
the profile slug.
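
A minimal sketch of the stripping step, assuming the slug is everything up to the first `:`. The helper name `stripProfileSlug` is hypothetical and is not the function used in the project. A model name that itself contains a colon would need extra handling that this sketch omits.

```go
package main

import (
	"fmt"
	"strings"
)

// stripProfileSlug removes a leading "profile:" prefix from a requested
// model name. Names without a colon are returned unchanged.
func stripProfileSlug(requested string) string {
	if _, model, found := strings.Cut(requested, ":"); found {
		return model
	}
	return requested
}

func main() {
	fmt.Println(stripProfileSlug("coding:qwen")) // prints "qwen"
	fmt.Println(stripProfileSlug("qwen"))        // prints "qwen"
}
```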
v92
2025-03-09 12:41:52 -07:00
Benson Wong
62275e078d
add examples to restart on config change #59
2025-03-06 10:50:29 -08:00
Benson Wong
88916059e1
add /unload to docs
2025-03-03 10:44:16 -08:00
Benson Wong
082d5d0fc5
Add /unload endpoint ( #58 ) to unload all currently running models
v91
2025-03-03 10:33:36 -08:00
Benson Wong
53338938bd
increase health check to a minimum of 5 seconds
2025-03-03 10:04:08 -08:00
Benson Wong
af653347ae
Update README.md w/ starhistory graph
2025-02-27 16:43:34 -08:00
Benson Wong
1e25b44a06
add workflow_dispatch to release action
2025-02-18 17:27:43 -08:00
Benson Wong
0815bb4cc3
Add windows to goreleaser #54
2025-02-18 17:26:43 -08:00
daschiller
7187cfe52e
add Windows build support to Makefile ( #54 )
2025-02-18 17:24:31 -08:00
Benson Wong
24089d2d9c
remove "no musa container" note from README
2025-02-18 16:38:48 -08:00
Benson Wong
ebabe55ff3
Delete untagged packages after build and push ( #55 )
2025-02-18 10:32:32 -08:00
Benson Wong
41a338297c
deletion of untagged containers happens after build-and-push
2025-02-18 10:11:59 -08:00
Benson Wong
7e3353efeb
add action step to remove untagged containers
2025-02-18 10:08:41 -08:00
Benson Wong
4ed58fb173
update container build action
2025-02-18 09:59:06 -08:00
Benson Wong
f5a2be698d
revert package src until new ggml-org has them
2025-02-15 18:23:58 -08:00
Benson Wong
f5e6ec3b7a
fix package src in containerfile
2025-02-15 18:20:35 -08:00
Benson Wong
3f462da146
switch package source from ggerganov to ggml-org
2025-02-15 18:18:49 -08:00
Benson Wong
48bd766536
Update README.md
2025-02-14 22:05:52 -08:00