
Output of Compressor unable to be loaded by latest HF Transformers #865

Open · bug (Something isn't working)

hyaticua opened this issue Oct 23, 2024 · 2 comments

hyaticua commented Oct 23, 2024

Describe the bug
When using the preset W8A8 recipe from llm-compressor, the resulting model's config.json fails validation when loaded by HF Transformers (a dev version of Transformers that includes the PRs from Neural Magic adding support for loading compressor models). The issue seems to be that quantization_config.config_groups.group_0.input_activations.observer is set to None. Diffing against the configs of the checkpoints Neural Magic has uploaded to the HF hub, the main difference is that observer is set to "memoryless" there. Manually making that change allows the model to load and evaluate properly; a patch sketch follows.
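
As a stopgap, here is a minimal sketch of that manual change, assuming the config layout quoted above (the path is a placeholder, and "memoryless" is simply the value seen in the Neural Magic hub configs):

```python
import json

# Stopgap sketch: give every input_activations block an explicit observer,
# mirroring the "memoryless" value in Neural Magic's hub configs.
config_path = "path/to/compressed-model/config.json"  # placeholder path

with open(config_path) as f:
    config = json.load(f)

# config_groups maps names like "group_0" to quantization schemes.
for group in config["quantization_config"]["config_groups"].values():
    input_acts = group.get("input_activations")
    if input_acts and input_acts.get("observer") is None:
        input_acts["observer"] = "memoryless"

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```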

Pinging @robertgshaw2-neuralmagic at his request 😄

Errors

  File "REDACTED/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
  File "REDACTED/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3604, in from_pretrained
    config.quantization_config = AutoHfQuantizer.merge_quantization_configs(
  File "REDACTED/lib/python3.10/site-packages/transformers/quantizers/auto.py", line 169, in merge_quantization_configs
    quantization_config = AutoQuantizationConfig.from_dict(quantization_config)
  File "REDACTED/lib/python3.10/site-packages/transformers/quantizers/auto.py", line 99, in from_dict
    return target_cls.from_dict(quantization_config_dict)
  File "REDACTED/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 1158, in from_dict
    return super().from_dict(config_dict, return_unused_kwargs=return_unused_kwargs, **kwargs)
  File "REDACTED/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 101, in from_dict
    config = cls(**config_dict)
  File "REDACTED/lib/python3.10/site-packages/transformers/utils/quantization_config.py", line 1114, in __init__
    self.quantization_config = QuantizationConfig.parse_obj(
  File "REDACTED/lib/python3.10/site-packages/pydantic/main.py", line 1162, in parse_obj
    return cls.model_validate(obj)
  File "REDACTED/lib/python3.10/site-packages/pydantic/main.py", line 596, in model_validate
    return cls.__pydantic_validator__.validate_python(
pydantic_core._pydantic_core.ValidationError: 2 validation errors for QuantizationConfig
config_groups.group_0.QuantizationScheme.input_activations.observer
  Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
    For further information visit https://errors.pydantic.dev/2.9/v/string_type
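
For context, nothing beyond a plain from_pretrained call on the saved checkpoint is needed to trigger this (the path is a placeholder):

```python
from transformers import AutoModelForCausalLM

# Loading the compressed checkpoint triggers the ValidationError above
# while transformers parses quantization_config from config.json.
model = AutoModelForCausalLM.from_pretrained("path/to/compressed-model")
```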

Additional context
my_config.json (attached)

Using the llm-compressor v0.2.0 tag; the quantization flow was the preset W8A8 path, roughly as sketched below.
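
A minimal sketch of that flow, assuming the standard oneshot W8A8 recipe from the llm-compressor README (the model id, dataset, and output path are placeholders, not the actual values used):

```python
from llmcompressor.modifiers.quantization import GPTQModifier
from llmcompressor.modifiers.smoothquant import SmoothQuantModifier
from llmcompressor.transformers import oneshot

# Preset W8A8 (int8 weights and activations) recipe, per the README examples.
recipe = [
    SmoothQuantModifier(smoothing_strength=0.8),
    GPTQModifier(targets="Linear", scheme="W8A8", ignore=["lm_head"]),
]

oneshot(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder model id
    dataset="open_platypus",                      # placeholder calibration set
    recipe=recipe,
    output_dir="path/to/compressed-model",        # placeholder output path
    max_seq_length=2048,
    num_calibration_samples=512,
)
```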

hyaticua added the bug label Oct 23, 2024
robertgshaw2-neuralmagic (Collaborator) commented

We are taking a look (sorry for the delay)

dsikka self-assigned this Oct 25, 2024
dsikka (Collaborator) commented Oct 25, 2024

Hi @hyaticua - could you confirm what version of compressed-tensors you're using?
Upgrading to 0.7.1 should solve this problem.
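
For reference, that upgrade is `pip install --upgrade compressed-tensors` (the package name as published on PyPI), or pin `compressed-tensors==0.7.1` to take exactly that release.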
