Commit a1631e5
llama : simplify Mamba with advanced batch splits (ggml-org#8526)
* llama : advanced batch splits
This includes equal-sequence-length batch splits which are useful
to simplify recurrent model operators.
* llama : always make recurrent state slots contiguous
* ggml : simplify mamba operators
* llama : fix integer signedness mixing
* llama : logits_all has priority over batch->logits
Otherwise, the server embeddings tests failed.
This was likely an existing problem but was only detected here
because of an additional assertion.
* llama : apply suggestions
Co-authored-by: Georgi Gerganov <[email protected]>
* llama : fix t5 segfault
* llama : fix Mamba session save and restore
* llama : minor cosmetic changes
* llama : rename llama_reorder_outputs to llama_output_reorder
Also move it closer to llama_output_reserve.
* llama : fix pooled embeddings when using batches with equal_seqs
* minor : add struct members for clarity
ggml-ci
* llama : fix T5 segfault again
* llama : fix Mamba pooled embeddings with multiple sequences
Until the pooled embeddings are refactored to allow splitting
across ubatches for causal embeddings,
recurrent models can only process a single sequence per ubatch
when calculating pooled embeddings.
* llama : add llama_model_is_recurrent to check whether a model is recurrent
This should make it easier to support RWKV-v6 and Mamba-2 cleanly.
* llama : fix simple splits when the batch contains embeddings
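
For illustration, the equal-sequence-length splitting idea from the first item can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code in this commit: `seq_span`, `ubatch` and `split_equal` are hypothetical names, and sequences are reduced to plain token counts. Each ubatch takes the same number of tokens from every sequence it includes, which is the property that lets recurrent operators treat a ubatch as a regular (n_seqs x n_seq_tokens) block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical types for illustration only; not the llama.cpp structures.
struct seq_span {
    int    seq_id; // index of the sequence the tokens come from
    size_t offset; // first token taken, relative to the start of that sequence
    size_t length; // number of tokens taken from that sequence
};

// All spans inside one ubatch have the same length.
using ubatch = std::vector<seq_span>;

// Split sequences of differing lengths into ubatches in which every
// participating sequence contributes the same number of tokens.
// seq_lens[i] is the token count of sequence i; n_ubatch is the
// maximum number of tokens per ubatch.
std::vector<ubatch> split_equal(const std::vector<size_t> & seq_lens, size_t n_ubatch) {
    assert(n_ubatch > 0);

    std::vector<size_t> done(seq_lens.size(), 0); // tokens consumed so far, per sequence
    std::vector<ubatch> out;

    while (true) {
        // Pick sequences that still have tokens left, at most n_ubatch of them
        // so that each one can contribute at least one token.
        std::vector<size_t> active;
        for (size_t i = 0; i < seq_lens.size() && active.size() < n_ubatch; ++i) {
            if (done[i] < seq_lens[i]) {
                active.push_back(i);
            }
        }
        if (active.empty()) {
            break;
        }

        // The common length is bounded by the shortest remaining sequence
        // and by an even share of the token budget.
        size_t len = n_ubatch / active.size();
        for (size_t i : active) {
            len = std::min(len, seq_lens[i] - done[i]);
        }

        ubatch ub;
        for (size_t i : active) {
            ub.push_back({ (int) i, done[i], len });
            done[i] += len;
        }
        out.push_back(ub);
    }

    return out;
}
```

With sequence lengths {5, 3, 3} and a budget of 6 tokens, this sketch yields three ubatches: 3 sequences x 2 tokens, 3 sequences x 1 token, then the remaining 2 tokens of sequence 0 alone. A real implementation would likely order sequences by length to reduce the number of small ubatches; that refinement is left out here.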
---------
Co-authored-by: Georgi Gerganov <[email protected]>
1 parent fc54ef0, commit a1631e5
4 files changed, +1137 -678 lines changed