Tutorial: How to convert HuggingFace model to GGUF format [UPDATED] #7927
Replies: 6 comments 2 replies
- There is a problem……
- If the model directory before conversion contains the Hugging Face `config.json`, can I still use it with the model after conversion to GGUF format?
- Also, how can we convert a Mistral model using `convert_hf_to_gguf.py`?
- Hey guys, I am trying to convert the extended-context Llama model from https://huggingface.co/togethercomputer/LLaMA-2-7B-32K to GGUF. When I ran the above instructions, I got this error: Traceback (most recent call last):
- Bro, this script is driving me crazy. It was so easy to convert to GGUF a year back: `python convert_hf_to_gguf.py llama-3-1-8b-samanta-spectrum --outfile neural-samanta-spectrum.gguf --outtype f16`
- Python 3.12
I wanted to make this tutorial because of the latest changes made over the last few days in this PR, which change the way you have to tackle the conversion.
Download the Hugging Face model
Source: https://www.substratus.ai/blog/converting-hf-model-gguf-model/
This part hasn't changed, so you can still use the old method; here is a link for how to do this part.
For this example I will be using the Bloom 3b model.
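As a sketch of the download step (this tutorial defers to the linked guide; the repo id `bigscience/bloom-3b` and the use of `git-lfs` are assumptions here, not part of the original instructions):

```shell
# Sketch: fetch the HF model into a local Bloom-3b folder (assumes git-lfs is installed)
git lfs install
git clone https://huggingface.co/bigscience/bloom-3b Bloom-3b
```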
Convert the model
Here is where things changed quite a bit from the last tutorial.
llama.cpp comes with a script that does the GGUF conversion from either a GGML model or an HF model (Hugging Face model).
First, start by cloning the repository:
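The repository lives on GitHub, so cloning it looks like:

```shell
git clone https://github.com/ggerganov/llama.cpp.git
```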
Install the Python libraries:
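A minimal sketch of the install step; the top-level `requirements.txt` in llama.cpp pulls in the per-script requirement files discussed below (the virtual environment is optional and assumed here):

```shell
python -m venv venv && source venv/bin/activate   # optional but recommended
pip install -r llama.cpp/requirements.txt
```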
Important: if the install works just fine, that's good; but if you face some problems, try changing the `numpy` package version in `requirements-convert-legacy-llama.txt` from `numpy~=1.24.4` to `numpy~=1.26.4`. And if you get another error saying it can't download the `2.1.1` version of `torch`, then change `torch~=2.1.1` to `torch~=2.2.1` in both `requirements-convert-hf-to-gguf-update.txt` and `requirements-convert-hf-to-gguf.txt`. These files can be found in the `requirements` folder.

Now go to the `convert_hf_to_gguf_update.py` file and add your model to the models array; you will find it at around line 64.

In this same file, make sure that the function calls `convert_py_pth.read_text()` and `convert_py_pth.write_text(convert_py)` at around line 217 have the `encoding` parameter set to `utf-8`. Remark: for some people this won't change anything, but others will face problems later on if this is not set.
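A sketch of the two edits to `convert_hf_to_gguf_update.py` (line numbers drift between versions; the Bloom entry and the `"BPE"` string are illustrative stand-ins — the real file uses a `TOKENIZER_TYPE` enum, and the demo below writes to a temp file rather than the actual script):

```python
from pathlib import Path
import tempfile

# 1) Add your model to the `models` array (around line 64 at the time of writing).
#    Entries pair a name with a tokenizer type and a HF repo URL, roughly like this
#    (hypothetical entry; check the existing entries in the file for the exact shape):
models = [
    # ... existing entries ...
    {"name": "bloom", "tokt": "BPE", "repo": "https://huggingface.co/bigscience/bloom-3b"},
]

# 2) Make sure the read_text()/write_text() calls (around line 217) pass encoding="utf-8".
#    Demonstrated here on a temp file standing in for the real convert script path:
convert_py_pth = Path(tempfile.mkstemp(suffix=".py")[1])
convert_py_pth.write_text("# demo contents", encoding="utf-8")  # was: write_text(convert_py)
convert_py = convert_py_pth.read_text(encoding="utf-8")         # was: read_text()
```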
Make sure that you have already executed this command before doing the next step
Now execute the command shown at the start of the `convert_hf_to_gguf_update.py` file.
Remark: replace `<huggingface_token>` with your actual Hugging Face account token; here is how to do it if you still don't have one.
Finally, you can run this command to create your `.gguf` file:
- `llama.cpp/convert-hf-to-gguf.py`: the path to the `convert-hf-to-gguf.py` file, relative to the current directory of the terminal
- `Bloom-3b`: the path to the HF model folder, relative to the current directory of the terminal
- `--outfile Bloom-3b.gguf`: the output file; it needs to have the `.gguf` extension at the end
- `--outtype q8_0`: the quantization method
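Assembled from the parameters described above, the two invocations might look like this (script file names vary between llama.cpp versions; newer checkouts use underscores, e.g. `convert_hf_to_gguf.py`):

```shell
# Run the update script first; substitute your real token for <huggingface_token>
python llama.cpp/convert_hf_to_gguf_update.py <huggingface_token>

# Then convert the downloaded HF folder to GGUF with q8_0 quantization
python llama.cpp/convert_hf_to_gguf.py Bloom-3b --outfile Bloom-3b.gguf --outtype q8_0
```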
Go to the output directory to see if the `.gguf` file was created.

IMPORTANT: in case the downloaded model doesn't have the `config.json` file in it, you will probably get an error saying that it can't be found. If the model is a Llama model, you can use the same command above but replace `llama.cpp/convert-hf-to-gguf.py` with `llama.cpp/examples/convert-legacy-llama.py` instead, and hopefully it should work.

If you get a `raise BadZipFile(f"Overlapped entries: {zinfo.orig_filename!r} (possible zip bomb)")` error, try downgrading from Python 3.12 to Python 3.10.
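For the `config.json` fallback described above, the command stays the same with only the script path swapped; a sketch assuming the same folder layout as before:

```shell
# Legacy converter for Llama-family models missing config.json
python llama.cpp/examples/convert-legacy-llama.py Bloom-3b --outfile Bloom-3b.gguf --outtype q8_0
```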