subprocess.CalledProcessError: Command '['python3', 'convert.py', '/home/ubuntu/volume_2k/Capstone_5/src/grag/quantize/test_data/llama.cpp/models/Meta-Llama-3-8B-Instruct/']' returned non-zero exit status 1.
I encountered the above error while quantizing Llama-3-8B because the model uses a BPE tokenizer. To quantize such models, convert.py has to be invoked with the BPE vocab type: python convert.py models/mymodel/ --vocab-type bpe
Requested changes:
- Add support for such models.
- Better error handling for models with different quantization mechanisms.
- Validate whether LlamaCpp supports quantization for the given model.
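The error-handling request above could be sketched roughly as follows: run convert.py normally, and if it exits non-zero, retry with --vocab-type bpe before surfacing the failure. The function name and the injectable runner parameter are hypothetical illustrations, not part of the actual codebase.

```python
import subprocess

def convert_with_fallback(convert_script, model_dir, runner=subprocess.run):
    """Hypothetical sketch: run convert.py, retrying with --vocab-type bpe.

    `runner` defaults to subprocess.run and is injectable for testing.
    """
    base_cmd = ["python3", convert_script, str(model_dir)]
    try:
        # First attempt: the default invocation, which fails for
        # BPE-tokenizer models such as Meta-Llama-3.
        return runner(base_cmd, check=True)
    except subprocess.CalledProcessError:
        # Retry assuming the model uses a BPE tokenizer.
        return runner(base_cmd + ["--vocab-type", "bpe"], check=True)
```

A fuller implementation would inspect the model's tokenizer config up front instead of retrying blindly, but the retry keeps the sketch short.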
sanchitvj changed the title from "Quantization not working for models with BPW tokenizer" to "Quantization not working for models with BPE tokenizer" on Apr 25, 2024.
Meta Llama-3 cannot be quantized with the default command because it uses a BPE tokenizer.