This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

Update gemma_utils.cpp
intellinjun authored Mar 22, 2024
1 parent 055f840 commit bb59f1f
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion neural_speed/models/gemma/gemma_utils.cpp
@@ -223,7 +223,7 @@ class gemma_quant_layer : public quant_layer_base {
     if ((layername.find("embedding") != std::string::npos) ||
         (layername == "token_embd.weight" || layername == "model.embed_tokens.weight")) {
       // special layer process, can be loaded by config file
-      return quant_params_internal();  // q40
+      return quant_params_internal{quant_bits::q8};  // q80
     }
     quantize &= (ne.size() == 2);
     if (quantize) {
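For readers without the rest of the source at hand, the effect of the one-line change can be sketched as below. This is only an illustration: the `quant_bits` enum and `quant_params_internal` struct here are simplified, hypothetical stand-ins, not the real neural_speed definitions. The sketch assumes that the default-constructed parameters select 4-bit quantization (as the old `// q40` comment suggests) and that `bits` is the first member, so that brace initialization with `quant_bits::q8` overrides only the bit width. `is_embedding_layer`, `embedding_params_before`, and `embedding_params_after` are helper names introduced for this example. It requires C++14 or later, where a struct with default member initializers is still an aggregate.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the real neural_speed types (assumed, simplified).
enum class quant_bits { q4, q8 };

struct quant_params_internal {
  quant_bits bits = quant_bits::q4;  // assumed default, per the old "// q40" comment
  int group_size = 32;               // assumed default block size
};

// Mirrors the layer-name test shown in the hunk.
bool is_embedding_layer(const std::string& layername) {
  return layername.find("embedding") != std::string::npos ||
         layername == "token_embd.weight" || layername == "model.embed_tokens.weight";
}

// Before the commit: value-initialization keeps every default (4-bit here).
quant_params_internal embedding_params_before() { return quant_params_internal(); }

// After the commit: aggregate initialization sets the first member to q8
// and leaves the remaining members at their defaults.
quant_params_internal embedding_params_after() { return quant_params_internal{quant_bits::q8}; }

// Small self-check that states the intended difference explicitly.
void check_embedding_params() {
  assert(embedding_params_before().bits == quant_bits::q4);
  assert(embedding_params_after().bits == quant_bits::q8);
}
```

Under these assumptions, the commit changes only the bit width used for the token-embedding tensor; every other field keeps whatever default the struct declares.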
