Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:13<00:00, 3.28s/it]
Repo card metadata block was not found. Setting CardData to empty.
Token indices sequence length is longer than the specified maximum sequence length for this model (132274 > 16384). Running this sequence through the model will result in indexing errors
AWQ: 0%| | 0/27 [00:05<?, ?it/s]
Traceback (most recent call last):
File "/testspace/repo/deepseek/AutoAWQ/tests/deepseek_quantize.py", line 33, in
model.quantize(tokenizer, quant_config=quant_config, calib_data=load_wikitext())
File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/testspace/repo/deepseek/AutoAWQ/awq/models/base.py", line 232, in quantize
self.quantizer.quantize()
File "/testspace/repo/deepseek/AutoAWQ/awq/quantize/quantizer.py", line 166, in quantize
scales_list = [
File "/testspace/repo/deepseek/AutoAWQ/awq/quantize/quantizer.py", line 167, in
self._search_best_scale(self.modules[i], **layer)
File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/testspace/repo/deepseek/AutoAWQ/awq/quantize/quantizer.py", line 330, in _search_best_scale
best_scales = self._compute_best_scale(
File "/testspace/repo/deepseek/AutoAWQ/awq/quantize/quantizer.py", line 391, in _compute_best_scale
self.pseudo_quantize_tensor(fc.weight.data)[0] / scales_view
File "/testspace/repo/deepseek/AutoAWQ/awq/quantize/quantizer.py", line 76, in pseudo_quantize_tensor
assert org_w_shape[-1] % self.group_size == 0
AssertionError
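The assertion at awq/quantize/quantizer.py:76 fires when a weight matrix's last dimension (the Linear layer's in_features) is not evenly divisible by the configured AWQ group size, so at least one projection in DeepSeek-Coder-V2-Lite-Instruct has a width the chosen group size does not divide. A minimal diagnostic sketch to locate such layers without loading any weights; the repo id and group_size of 128 are assumptions, since the actual quant_config is not shown in the log:

```python
import torch
from accelerate import init_empty_weights
from transformers import AutoConfig, AutoModelForCausalLM

model_path = "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct"  # assumed HF repo id
group_size = 128  # assumed; use the q_group_size from your quant_config

# Build the model on the meta device so no weights are downloaded or allocated.
config = AutoConfig.from_pretrained(model_path, trust_remote_code=True)
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)

# The failing check (org_w_shape[-1] % self.group_size == 0) requires every
# quantized Linear layer's in_features to be divisible by the group size,
# so list every layer where that does not hold.
for name, module in model.named_modules():
    if isinstance(module, torch.nn.Linear) and module.in_features % group_size != 0:
        print(f"{name}: in_features={module.in_features} (not divisible by {group_size})")
```

Any layer printed by this sketch will trip the assertion during `_compute_best_scale`, regardless of the calibration data used.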
tohnee changed the title from "DeepSeek-Coder-V2-Lite-Instruct quantization failed" to "DeepSeek-Coder-V2-Lite-Instruct Error!" on Oct 30, 2024
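For reference, the failing call at tests/deepseek_quantize.py:33 is presumably set up along these lines. Only the `model.quantize(tokenizer, quant_config=quant_config, calib_data=load_wikitext())` call is taken from the traceback; the repo id, quant_config values, and the `load_wikitext()` helper below are assumptions for illustration:

```python
# Hypothetical reconstruction of the quantization script from the traceback.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer
from datasets import load_dataset

model_path = "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct"  # assumed

def load_wikitext():
    # Assumed calibration loader: non-empty text rows from wikitext-2.
    data = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
    return [t for t in data["text"] if t.strip()]

# Assumed config; a q_group_size that does not divide every Linear layer's
# in_features is what trips the assertion shown above.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Raises AssertionError in pseudo_quantize_tensor for this model.
model.quantize(tokenizer, quant_config=quant_config, calib_data=load_wikitext())
```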