Commit e9a1be0
committed
CUDA: Simplify and improve CUDA graphs through use of indirect copy pointers
Previously there was complexity in the CUDA graphs implementation due
frequently changing parameters to copy kernels associated with K and V
cache pointers. This patch simplifies by using indirection to avoid
such parameters frequently changing, avoiding the need for frequent
graph updates.
Fixes #121521 parent ba76543 commit e9a1be0
File tree
4 files changed
+96
-107
lines changed- ggml
- include
- src/ggml-cuda
4 files changed
+96
-107
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
349 | 349 | | |
350 | 350 | | |
351 | 351 | | |
| 352 | + | |
| 353 | + | |
| 354 | + | |
352 | 355 | | |
353 | 356 | | |
354 | 357 | | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
711 | 711 | | |
712 | 712 | | |
713 | 713 | | |
714 | | - | |
| 714 | + | |
715 | 715 | | |
716 | 716 | | |
717 | 717 | | |
| |||
0 commit comments