llm-d incubation
Incubating components of llm-d, a Kubernetes-native high-performance distributed LLM inference framework
Popular repositories
- llm-d-modelservice (Public): Helm charts for deploying models with llm-d
Repositories
Showing 6 of 6 repositories
- llm-d-fast-model-actuation (Public)
- inferno-autoscaler (Public)
- ig-wva (Public): Workload Variant Autoscaler, a service that computes the cost-optimal provisioning of heterogeneous accelerators for inference workloads with varying request latency objectives