Commit 97bdd26
Refactor lora adapter support (#8332)
* lora: load to device buft
* add patch tensor function
* correct tensor patch
* llama_lora_adapter_apply
* correct ggml_backend_tensor_copy
* add llm_build_mm
* fix auto merge
* update based on review comments
* add convert script
* no more transpose A
* add f16 convert
* add metadata check
* add sanity check
* fix ftype
* add requirements
* fix requirements
* fix outfile
* conversion: only allow selected models
* fix types
* cuda : do not use dmmv if the tensor does not have enough cols
* llama : lora fixes
* do not disable mmap with lora
Co-authored-by: slaren <[email protected]>
* llm_build_lora_mm_id
* convert_lora : MoE LoRA conversion support
* convert_lora : prefer safetensors, similarly to convert_hf
* convert_hf : simplify modify_tensors for InternLM2
* convert_lora : lazy conversion
* llama : load and use alpha from LoRA adapters
* llama : use llm_build_lora_mm in most model graphs
* auto scale
* Revert "auto scale"
This reverts commit 42415a4.
* remove redundant params
* Apply suggestions from code review
Co-authored-by: slaren <[email protected]>
* change kv metadata
* move add_type to __init__
* convert_hf : move add_type to main()
* convert_lora : use the GGUFWriter from Model instead of overwriting it
---------
Co-authored-by: slaren <[email protected]>
Co-authored-by: Francis Couture-Harpin <[email protected]>
1 parent 4db8f60 commit 97bdd26
File tree
12 files changed, +944 -511 lines changed
- common
- ggml/src
- gguf-py/gguf
- include
- requirements
- src