Commit 67ff796
kv-cache : avoid modifying recurrent cells when setting inputs (ggml-org#13834)
* kv-cache : avoid modifying recurrent cells when setting inputs
* kv-cache : remove inp_s_mask
It was replaced with equivalent and simpler functionality
with rs_z (the first zeroed state) and the already-existing inp_s_copy.
* kv-cache : fix non-consecutive token pos warning for recurrent models
The problem was apparently caused by how the tail cells were swapped.
* graph : simplify logic for recurrent state copies
* kv-cache : use cell without src refs for rs_z in recurrent cache
* llama-graph : fix recurrent state copy
The `state_copy` shuffle assumes everything is moved at once,
which is not true when `states_extra` is copied back to the cache
before copying the range of states between `head` and `head + n_seqs`.
This is only a problem if any of the cells in [`head`, `head + n_seqs`)
have an `src` in [`head + n_seqs`, `head + n_kv`),
which does happen when `n_ubatch > 1` in the `llama-parallel` example.
Changing the order of the operations avoids the potential overwrite
before use, although when copies are avoided (like with Mamba2),
this will require further changes.
* llama-graph : rename n_state to state_size in build_recurrent_state
This naming should reduce confusion between the state size
and the number of states.

1 parent 663365d commit 67ff796
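The overwrite-before-use hazard described under `llama-graph : fix recurrent state copy` can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual graph code. Each state is a single `int`, the window starts at cell 0 (so `head == 0`), and `src[i]` plays the role of the `state_copy` entry for cell `i`. The helper names (`gather`, `write_extra`, `states_extra_first`, `states_output_first`) are hypothetical and do not exist in llama.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (illustrative names only): one int per recurrent state, head == 0.
using States = std::vector<int>;

// Gather cache[src[i]] for i in [begin, end) into a temporary buffer.
static States gather(const States & cache, const std::vector<std::size_t> & src,
                     std::size_t begin, std::size_t end) {
    States out;
    for (std::size_t i = begin; i < end; ++i) {
        assert(src[i] < cache.size());
        out.push_back(cache[src[i]]);
    }
    return out;
}

// Write the gathered extra states back into cells [n_seqs, n_seqs + extra.size()).
static void write_extra(States & cache, const States & extra, std::size_t n_seqs) {
    for (std::size_t i = 0; i < extra.size(); ++i) {
        cache[n_seqs + i] = extra[i];
    }
}

// Problematic order: the extra states are written back to the cache first,
// so a cell in [0, n_seqs) whose source lies in [n_seqs, n_kv) reads a
// value that has already been overwritten.
static States states_extra_first(States cache, const std::vector<std::size_t> & src,
                                 std::size_t n_seqs) {
    const std::size_t n_kv = src.size();
    write_extra(cache, gather(cache, src, n_seqs, n_kv), n_seqs);
    return gather(cache, src, 0, n_seqs);
}

// Reordered: the states for [0, n_seqs) are gathered before the cache is
// modified, so every source is read while it still holds its original value.
static States states_output_first(States cache, const std::vector<std::size_t> & src,
                                  std::size_t n_seqs) {
    const std::size_t n_kv = src.size();
    const States out = gather(cache, src, 0, n_seqs);
    write_extra(cache, gather(cache, src, n_seqs, n_kv), n_seqs);
    return out;
}
```

With a cache of `{10, 20, 30, 40}`, `src = {2, 1, 3, 0}` and `n_seqs = 2`, cell 0 sources from cell 2, which lies in the extra range. `states_output_first` returns `{30, 20}`, while `states_extra_first` returns `{40, 20}` because cell 2 was overwritten before it was read.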
6 files changed in src: +117 -180 lines changed