Closed
Labels
feature request: New feature or request. This includes new model, dtype, functionality support
Description
🚀 The feature, motivation and pitch
Currently, CUDA graphs are always captured up to the highest value in cuda_graph_batch_sizes.
If the user sets a max_batch_size higher than that value, execution fails on an assertion in CapturedGraph, because the batch size dimension of the input exceeds the largest value in cuda_graph_batch_sizes. We need to capture graphs up to the maximum value of cuda_graph_batch_sizes and fall back to the non-captured graph when a larger batch size occurs.
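The requested behavior could be sketched roughly as follows. This is a minimal, CPU-only illustration and not the project's actual implementation: `GraphDispatcher`, `_capture`, and `run` are hypothetical names, and graph capture is simulated with a plain Python closure that pads the batch to the captured size. Only `cuda_graph_batch_sizes` comes from the issue text.

```python
from bisect import bisect_left
from typing import Callable, Dict, List, Sequence, Tuple


class GraphDispatcher:
    """Hypothetical dispatcher: replay a captured graph when one fits, else run eagerly."""

    def __init__(self, eager_fn: Callable[[List[int]], List[int]],
                 cuda_graph_batch_sizes: Sequence[int]) -> None:
        if not cuda_graph_batch_sizes:
            raise ValueError("cuda_graph_batch_sizes must not be empty")
        self.eager_fn = eager_fn
        self.captured_sizes = sorted(set(cuda_graph_batch_sizes))
        # Capture only up to the largest configured size, never up to max_batch_size.
        self.max_captured = self.captured_sizes[-1]
        self.graphs: Dict[int, Callable[[Sequence[int]], List[int]]] = {
            size: self._capture(size) for size in self.captured_sizes
        }

    def _capture(self, size: int) -> Callable[[Sequence[int]], List[int]]:
        # Stand-in for real CUDA graph capture: a closure fixed to one batch size.
        def replay(batch: Sequence[int]) -> List[int]:
            padded = list(batch) + [0] * (size - len(batch))
            return self.eager_fn(padded)[:len(batch)]
        return replay

    def run(self, batch: Sequence[int]) -> Tuple[str, List[int]]:
        n = len(batch)
        if n > self.max_captured:
            # Larger than any captured graph: fall back instead of asserting.
            return "eager", self.eager_fn(list(batch))
        # Smallest captured size that can hold this batch.
        size = self.captured_sizes[bisect_left(self.captured_sizes, n)]
        return f"graph[{size}]", self.graphs[size](batch)


dispatcher = GraphDispatcher(lambda xs: [x * 2 for x in xs], [1, 2, 4, 8])
small_path, small_out = dispatcher.run([1, 2, 3])
large_path, large_out = dispatcher.run(list(range(10)))
```

In this sketch a batch of 3 is padded to the captured size 4, while a batch of 10 exceeds the largest captured size (8) and takes the eager path rather than failing an assertion.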
Alternatives
No response
Additional context
No response
Metadata
Status
Done