fix: runtime capability detection for backends #6149
base: master
Conversation
Signed-off-by: Sertac Ozercan <[email protected]>
✅ Deploy Preview for localai ready!
Thanks! This makes sense; just a few small nits here and there for consistency.
core/gallery/backends.go
Outdated
// ListSystemBackendsSelected lists system backends and, when multiple concrete backends share the same alias
// (e.g., cpu-llama-cpp and cuda12-llama-cpp both alias to "llama-cpp"), selects the optimal one based on the
// detected system capability (GPU vendor/platform). Concrete backend names are always included.
func ListSystemBackendsSelected(systemState *system.SystemState) (SystemBackends, error) {
I think at this point it would make sense to modify ListSystemBackends directly:
LocalAI/core/gallery/backends.go
Line 284 in 21faa41
func ListSystemBackends(systemState *system.SystemState) (SystemBackends, error) {
Its usage in the code is quite limited: https://github.com/search?q=repo%3Amudler%2FLocalAI%20ListSystemBackends&type=code
Otherwise, it would make sense to reuse it as much as possible to avoid code duplication.
core/gallery/backends.go
Outdated
	return backends, nil
}

func selectBestCandidate(systemState *system.SystemState, cands []backendCandidate) backendCandidate {
This is probably better placed in the capabilities code, to keep the capability logic well isolated.
Could it maybe just be a method of SystemState?
Description
This fixes a regression introduced when llama-cpp became a modular backend. Previously, we detected capabilities at runtime and fell back to CPU if the highest-priority meta backend could not run.
The issue occurs when a container has both CUDA (or other) and CPU llama-cpp backends. Because the llama-cpp alias covers both meta backends, it might pick the CUDA runtime, which we may not be able to run depending on the container and host capabilities. We need to detect the platform at runtime so we can fall back gracefully instead of expecting the user to set the appropriate value.
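To illustrate the intended behavior, here is a minimal, self-contained sketch of alias resolution with a CPU fallback. It is not the code in this PR: the `Capability`, `Candidate`, and `SystemState` types, the capability labels, and the `SelectBest` method are all hypothetical names chosen for the example, and the real capability detection is replaced by a plain field.

```go
package main

import "fmt"

// Capability is a hypothetical label for what the host can run.
type Capability string

const (
	CapNvidia Capability = "nvidia"
	CapAMD    Capability = "amd"
	CapCPU    Capability = "cpu"
)

// Candidate is one concrete backend that an alias can resolve to.
type Candidate struct {
	Name       string     // concrete backend name, e.g. "cuda12-llama-cpp"
	Capability Capability // capability this backend requires
}

// SystemState stands in for the detected runtime state (assumption:
// detection has already happened and produced a single capability).
type SystemState struct {
	Detected Capability
}

// SelectBest picks the candidate that matches the detected capability.
// If none matches, it falls back to the first CPU candidate, and if there
// is no CPU candidate either, to the first candidate in the list.
// The boolean is false only when the candidate list is empty.
func (s SystemState) SelectBest(cands []Candidate) (Candidate, bool) {
	if len(cands) == 0 {
		return Candidate{}, false
	}
	var cpu *Candidate
	for i := range cands {
		if cands[i].Capability == s.Detected {
			return cands[i], true
		}
		if cpu == nil && cands[i].Capability == CapCPU {
			cpu = &cands[i]
		}
	}
	if cpu != nil {
		return *cpu, true
	}
	return cands[0], true
}

func main() {
	// Both concrete backends share the "llama-cpp" alias.
	cands := []Candidate{
		{Name: "cuda12-llama-cpp", Capability: CapNvidia},
		{Name: "cpu-llama-cpp", Capability: CapCPU},
	}
	for _, detected := range []Capability{CapNvidia, CapAMD, CapCPU} {
		best, _ := SystemState{Detected: detected}.SelectBest(cands)
		fmt.Printf("detected=%s -> %s\n", detected, best.Name)
	}
}
```

Writing the selection as a method on the state type keeps the capability logic in one place, which is the direction suggested in the review comments; the fallback order shown here is an assumption for the sketch.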
Notes for Reviewers
Signed commits