
Output of Compressor unable to be loaded by latest HF Transformers #865

Open · bug (Something isn't working)

hyaticua opened this issue Oct 23, 2024 · 2 comments

hyaticua commented Oct 23, 2024

Describe the bug
When using the preset W8A8 recipe from llm-compressor, the resulting model's config.json fails validation when loaded by HF Transformers (a dev version of Transformers that includes the PRs from Neural Magic adding support for loading compressor models). The issue seems to be that quantization_config.config_groups.group_0.input_activations.observer is set to None. Diffing against the configs of the checkpoints Neural Magic has uploaded to the HF hub, the main difference is that observer is set to "memoryless" there. Manually making that change allows the model to load and evaluate properly; a patch sketch follows.
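
As a stopgap, here is a minimal sketch of that manual change, assuming the config layout quoted above (the path is a placeholder, and "memoryless" is simply the value seen in the Neural Magic hub configs):

```python
import json

# Stopgap sketch: give every input_activations block an explicit observer,
# mirroring the "memoryless" value in Neural Magic's hub configs.
config_path = "path/to/compressed-model/config.json"  # placeholder path

with open(config_path) as f:
    config = json.load(f)

# config_groups maps names like "group_0" to quantization schemes.
for group in config["quantization_config"]["config_groups"].values():
    input_acts = group.get("input_activations")
    if input_acts and input_acts.get("observer") is None:
        input_acts["observer"] = "memoryless"

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```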

Pinging @robertgshaw2-neuralmagic at his request 😄

Errors

  File "REDACTED/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
  File "REDACTED/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3604, in from_pretrained
    config.quantization_config = AutoHfQuantizer.merge_quantization_configs(
  File "REDACTED/lib/python3.10/site-packages/transformers/quantizers/auto.py", line 169, in merge_quantization_configs
    quantization_config = AutoQuantizationConfig.from_dict(quantization_config)
  File "REDACTED/lib/python3.10/site-packages/transformers/quantizers/auto.py", line 99, in from_dict
    return target_cls.from_dict(quantization_config_dict)
  File "REDACTED/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 1158, in from_dict
    return super().from_dict(config_dict, return_unused_kwargs=return_unused_kwargs, **kwargs)
  File "REDACTED/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 101, in from_dict
    config = cls(**config_dict)
  File "REDACTED/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 1114, in __init__
    self.quantization_config = QuantizationConfig.parse_obj(
  File "REDACTED/lib/python3.10/site-packages/pydantic/main.py", line 1162, in parse_obj
    return cls.model_validate(obj)
  File "REDACTED/lib/python3.10/site-packages/pydantic/main.py", line 596, in model_validate
    return cls.__pydantic_validator__.validate_python(
pydantic_core._pydantic_core.ValidationError: 2 validation errors for QuantizationConfig
config_groups.group_0.QuantizationScheme.input_activations.observer
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.9/v/string_type
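
For context, nothing beyond a plain from_pretrained call on the saved checkpoint is needed to trigger this (the path is a placeholder):

```python
from transformers import AutoModelForCausalLM

# Loading the compressed checkpoint triggers the ValidationError above
# while transformers parses quantization_config from config.json.
model = AutoModelForCausalLM.from_pretrained("path/to/compressed-model")
```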

Additional context
my_config.json (attached)

Using the llm-compressor v0.2.0 tag; the quantization flow was the preset W8A8 path, roughly as sketched below.
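
A minimal sketch of that flow, assuming the standard oneshot W8A8 recipe from the llm-compressor README (the model id, dataset, and output path are placeholders, not the actual values used):

```python
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
from llmcompressor.transformers import oneshot

# Preset W8A8 (int8 weights and activations) recipe, per the README examples.
recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
]

oneshot(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder model id
    dataset="open_platypus",                      # placeholder calibration set
    recipe=recipe,
    output_dir="path/to/compressed-model",        # placeholder output path
    max_seq_length=2048,
    num_calibration_samples=512,
)
```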

hyaticua added the bug label Oct 23, 2024
robertgshaw2-neuralmagic (Collaborator) commented

We are taking a look (sorry for the delay)

dsikka self-assigned this Oct 25, 2024
dsikka (Collaborator) commented Oct 25, 2024

Hi @hyaticua - could you confirm what version of compressed-tensors you're using?
Upgrading to 0.7.1 should solve this problem.
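
For reference, that upgrade is `pip install --upgrade compressed-tensors` (the package name as published on PyPI), or pin `compressed-tensors==0.7.1` to take exactly that release.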
