Commit 573694e
kv-cache : avoid modifying recurrent cells when setting inputs (ggml-org#13834)
* kv-cache : avoid modifying recurrent cells when setting inputs
* kv-cache : remove inp_s_mask
It was replaced with equivalent and simpler functionality
with rs_z (the first zeroed state) and the already-existing inp_s_copy.
* kv-cache : fix non-consecutive token pos warning for recurrent models
The problem was apparently caused by how the tail cells were swapped.
* graph : simplify logic for recurrent state copies
* kv-cache : use cell without src refs for rs_z in recurrent cache
* llama-graph : fix recurrent state copy
The `state_copy` shuffle assumes everything is moved at once,
which is not true when `states_extra` is copied back to the cache
before copying the range of states between `head` and `head + n_seqs`.
This is only a problem if any of the cells in [`head`, `head + n_seqs`)
have an `src` in [`head + n_seqs`, `head + n_kv`),
which does happen when `n_ubatch > 1` in the `llama-parallel` example.
Changing the order of the operations avoids the potential overwrite
before use, although when copies are avoided (like with Mamba2),
this will require further changes.
* llama-graph : rename n_state to state_size in build_recurrent_state
This naming should reduce confusion between the state size
and the number of states.
1 parent 97d02ff commit 573694e
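The overwrite-before-use hazard described under "llama-graph : fix recurrent state copy" can be illustrated with a toy model. This is a minimal sketch, not the llama.cpp implementation: each cache cell holds a single `int` instead of a state tensor, `src` holds absolute cell indices, and the helper names (`gather`, `write_back`, `active_states_extra_first`, `active_states_first`) are hypothetical, introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (assumption, not llama.cpp code): one int per cache cell.
// src[i] is the absolute index of the cell whose state cell (head + i) should receive.

// Gather cache[src[i]] for i in [begin, end) into a temporary buffer.
static std::vector<int> gather(const std::vector<int> & cache,
                               const std::vector<int> & src,
                               int begin, int end) {
    std::vector<int> out;
    for (int i = begin; i < end; ++i) {
        out.push_back(cache[(size_t) src[i]]);
    }
    return out;
}

// Write vals into the cache, starting at cell `first`.
static void write_back(std::vector<int> & cache, int first, const std::vector<int> & vals) {
    for (size_t i = 0; i < vals.size(); ++i) {
        cache[(size_t) first + i] = vals[i];
    }
}

// Hazardous order: the extra states are copied back to the cache first,
// so the later gather for [head, head + n_seqs) may read overwritten cells.
static std::vector<int> active_states_extra_first(std::vector<int> cache,
                                                  const std::vector<int> & src,
                                                  int head, int n_seqs) {
    const int n_kv = (int) src.size();
    write_back(cache, head + n_seqs, gather(cache, src, n_seqs, n_kv));
    return gather(cache, src, 0, n_seqs);
}

// Safe order: gather the states for [head, head + n_seqs) before
// the extra states are written back.
static std::vector<int> active_states_first(std::vector<int> cache,
                                            const std::vector<int> & src,
                                            int head, int n_seqs) {
    const int n_kv = (int) src.size();
    const std::vector<int> active = gather(cache, src, 0, n_seqs);
    write_back(cache, head + n_seqs, gather(cache, src, n_seqs, n_kv));
    return active;
}
```

With a cache of `{0, 10, 20, 30, 40}`, `head = 1`, `n_seqs = 2`, and `src = {3, 2, 4, 1}`, the first active cell has its `src` in [`head + n_seqs`, `head + n_kv`). The safe order returns `{30, 20}`, while the extra-first order returns `{40, 20}` because cell 3 was overwritten before it was read.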
4 files changed in src: +133 -156 lines changed