
Conversation

sozercan
Collaborator

@sozercan sozercan commented Aug 26, 2025

Description

This fixes a regression introduced when llama-cpp became a modular backend. Previously, we detected capabilities at runtime and fell back to CPU if the highest-priority meta backend could not run.

The issue occurs when a container ships both a CUDA (or other accelerated) and a CPU llama-cpp backend. Since the llama-cpp alias has both meta backends, it might pick the CUDA runtime, which we may not be able to run depending on the container and host capabilities. We need to detect the platform at runtime so we can fall back gracefully instead of expecting the user to set the appropriate value.
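
The fallback described above can be sketched roughly as follows. This is not the PR's code: the `candidate` type, the capability labels ("nvidia", "cpu"), and `pickBackend` are hypothetical names made up for illustration. The idea is only to prefer the backend whose capability matches what was detected at runtime, and otherwise fall back to the CPU build.

```go
package main

import "fmt"

// candidate is a hypothetical stand-in for one concrete backend behind a shared alias.
type candidate struct {
	Name       string // concrete backend name, e.g. "cuda12-llama-cpp"
	Capability string // capability it needs at runtime (assumed labels)
}

// pickBackend returns the first candidate whose capability equals detected.
// If none matches, it falls back to the first CPU candidate.
// The boolean is false when neither a match nor a CPU candidate exists.
func pickBackend(detected string, cands []candidate) (candidate, bool) {
	var cpu *candidate
	for i := range cands {
		c := &cands[i]
		if c.Capability == detected {
			return *c, true
		}
		if c.Capability == "cpu" && cpu == nil {
			cpu = c // remember the fallback, keep looking for an exact match
		}
	}
	if cpu != nil {
		return *cpu, true
	}
	return candidate{}, false
}

func main() {
	cands := []candidate{
		{Name: "cuda12-llama-cpp", Capability: "nvidia"},
		{Name: "cpu-llama-cpp", Capability: "cpu"},
	}
	for _, detected := range []string{"nvidia", "amd", "cpu"} {
		c, ok := pickBackend(detected, cands)
		fmt.Println(detected, "->", c.Name, ok)
	}
}
```

With this shape, a host that reports an unsupported capability still resolves to the CPU backend rather than failing on a runtime it cannot load.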

Notes for Reviewers

Signed commits

  • Yes, I signed my commits.

@sozercan sozercan requested a review from mudler August 26, 2025 18:02

netlify bot commented Aug 26, 2025

Deploy Preview for localai ready!

Name Link
🔨 Latest commit 3b673ff
🔍 Latest deploy log https://app.netlify.com/projects/localai/deploys/68b686fd614895000889d74f
😎 Deploy Preview https://deploy-preview-6149--localai.netlify.app

Signed-off-by: Sertac Ozercan <[email protected]>
@mudler
Owner

mudler commented Aug 27, 2025

Thanks! This makes sense; just a few small nits here and there for consistency.

// ListSystemBackendsSelected lists system backends and, when multiple concrete backends share the same alias
// (e.g., cpu-llama-cpp and cuda12-llama-cpp both alias to "llama-cpp"), selects the optimal one based on the
// detected system capability (GPU vendor/platform). Concrete backend names are always included.
func ListSystemBackendsSelected(systemState *system.SystemState) (SystemBackends, error) {
Owner

@mudler mudler Aug 27, 2025


I think at this point it would make sense to modify ListSystemBackends directly:

func ListSystemBackends(systemState *system.SystemState) (SystemBackends, error) {

Its usage in the code is quite limited: https://github.com/search?q=repo%3Amudler%2FLocalAI%20ListSystemBackends&type=code

Otherwise, it would make sense to reuse it as much as possible to avoid code duplication.
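
As a rough illustration of folding the selection into a single listing function, here is a minimal sketch. The `backend` struct, its fields, and `listBackends` are hypothetical and much simpler than the project's real `SystemBackends` type, whose definition is not shown in this conversation. Concrete names are always registered, and each alias resolves to the capability match or, failing that, to the CPU backend.

```go
package main

import (
	"fmt"
	"sort"
)

// backend is a hypothetical, simplified record for an installed backend.
type backend struct {
	Name       string // concrete name, e.g. "cpu-llama-cpp"
	Alias      string // shared alias, e.g. "llama-cpp" (may be empty)
	Capability string // assumed capability label, e.g. "nvidia" or "cpu"
}

// listBackends maps every usable name to a concrete backend name.
// Concrete names always map to themselves. An alias points at the backend
// whose capability equals detected, or at the CPU backend if nothing matches;
// it is omitted when there is neither a match nor a CPU backend.
func listBackends(detected string, installed []backend) map[string]string {
	out := make(map[string]string)
	for _, b := range installed {
		out[b.Name] = b.Name
		if b.Alias == "" {
			continue
		}
		_, seen := out[b.Alias]
		switch {
		case b.Capability == detected:
			out[b.Alias] = b.Name // an exact capability match always wins
		case !seen && b.Capability == "cpu":
			out[b.Alias] = b.Name // CPU is only a fallback, never an override
		}
	}
	return out
}

func main() {
	installed := []backend{
		{Name: "cuda12-llama-cpp", Alias: "llama-cpp", Capability: "nvidia"},
		{Name: "cpu-llama-cpp", Alias: "llama-cpp", Capability: "cpu"},
	}
	resolved := listBackends("amd", installed)
	names := make([]string, 0, len(resolved))
	for name := range resolved {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Println(name, "->", resolved[name])
	}
}
```

Keeping one entry point means callers do not have to choose between a "plain" and a "selected" listing, which is the duplication the comment above is concerned about.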

return backends, nil
}

func selectBestCandidate(systemState *system.SystemState, cands []backendCandidate) backendCandidate {
Owner

@mudler mudler Aug 27, 2025


This is probably better placed in the capabilities code, to keep the capability logic well isolated.

Could it maybe just be a method of SystemState?

https://github.com/mudler/LocalAI/blob/21faa4114bf6c8980fc612e7db5a2a13b62e8d23/pkg/system/capabilities.go
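
To make the suggestion concrete, here is a minimal sketch of capability logic living on a state type. The `hostState` struct, its `GPUVendor` field, and both methods are assumptions for illustration only; the real `SystemState` in `pkg/system` may look quite different.

```go
package main

import "fmt"

// hostState is a hypothetical stand-in for the project's system state type.
type hostState struct {
	GPUVendor string // assumed: empty when no usable GPU was detected
}

// Capability reports the label used to match backends, defaulting to "cpu".
// It is safe to call on a nil receiver.
func (s *hostState) Capability() string {
	if s == nil || s.GPUVendor == "" {
		return "cpu"
	}
	return s.GPUVendor
}

// CanRun reports whether a backend requiring the given capability can run
// on this host. CPU backends are treated as always runnable.
func (s *hostState) CanRun(required string) bool {
	return required == "cpu" || required == s.Capability()
}

func main() {
	host := &hostState{GPUVendor: "nvidia"}
	for _, required := range []string{"nvidia", "amd", "cpu"} {
		fmt.Println(required, host.CanRun(required))
	}
}
```

With the checks on the state type, the backend-listing code only needs to ask the state whether a candidate can run, and the capability rules stay in one place.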
