Commit 15b1937
Offloading tensors based on total VRAM budget and offloading policy (ggml-org#6)
* deprecate ffn_b
* get tensor offloading levels
* wip: split tensor loading
* wip: framework of loading sparse model tensors
* save and flush gpu alloc buffer
* vram budget will fall back to remaining free memory
* minor: remove vram safety margin
* add options for vram budget; clean old env vars
* minor: bugfix
1 parent b89a0b7, commit 15b1937
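The diff body is not reproduced here, so the following is only a minimal sketch of the idea the commit message describes: pick tensors to place in VRAM under a total budget, and fall back to the remaining free memory when no budget is given. It is written in Python for brevity and is not the commit's code. The names `Tensor`, `resolve_budget`, `plan_offload`, and the integer `priority` field are hypothetical, and the greedy placement by priority is an assumed policy.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Tensor:
    name: str
    size_bytes: int
    priority: int  # hypothetical offloading level: lower value is placed first


def resolve_budget(requested: Optional[int], free_vram: int) -> int:
    """Return the VRAM budget in bytes.

    Assumption: when no budget is requested, fall back to the free VRAM
    reported by the device, with no safety margin subtracted.
    """
    if requested is None or requested <= 0:
        return free_vram
    return requested


def plan_offload(tensors: List[Tensor], budget: int) -> Tuple[List[str], List[str]]:
    """Greedily assign tensors to the GPU in priority order until the budget is spent."""
    on_gpu: List[str] = []
    on_cpu: List[str] = []
    remaining = budget
    for t in sorted(tensors, key=lambda t: t.priority):
        if t.size_bytes <= remaining:
            on_gpu.append(t.name)
            remaining -= t.size_bytes
        else:
            on_cpu.append(t.name)  # does not fit, stays in host memory
    return on_gpu, on_cpu


tensors = [
    Tensor("attn", 400, 0),
    Tensor("ffn_up", 500, 1),
    Tensor("ffn_down", 300, 2),
]
budget = resolve_budget(None, free_vram=800)  # no explicit budget, so use free memory
gpu_names, cpu_names = plan_offload(tensors, budget)
```

A greedy pass keeps the sketch short; it skips a tensor that does not fit and still tries smaller, lower-priority ones, which may or may not match the policy implemented in the commit.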
6 files changed, +417 -119 lines changed