Quantization not working for models with BPE tokenizer #102

Closed
sanchitvj opened this issue Apr 25, 2024 · 2 comments · Fixed by #120

Comments

@sanchitvj (Collaborator)

Meta Llama-3 cannot be quantized because it uses a BPE tokenizer.

@sanchitvj (Collaborator, Author)

subprocess.CalledProcessError: Command '['python3', 'convert.py', '/home/ubuntu/volume_2k/Capstone_5/src/grag/quantize/test_data/llama.cpp/models/Meta-Llama-3-8B-Instruct/']' returned non-zero exit status 1.
Encountered the above error while quantizing Llama-3-8B because it uses a BPE tokenizer. To quantize such a model, the conversion has to be run with the following command line:
python convert.py models/mymodel/ --vocab-type bpe

  • Add support for such models.
  • Better error handling for models with different quantization mechanisms (a possible approach is sketched below).
  • Validate whether llama.cpp supports quantization for the given model.

@sanchitvj changed the title from "Quantization not working for models with BPW tokenizer" to "Quantization not working for models with BPE tokenizer" on Apr 25, 2024
@sanchitvj (Collaborator, Author)

This issue is resolved by PR #118. LLMs both with and without a BPE tokenizer are now supported.
