Basically, as the title says: how can I view the integer-valued quantized weights per layer so that I can export them for use outside of OpenVINO? I followed this tutorial: https://docs.openvino.ai/latest/notebooks/302-pytorch-quantization-aware-training-with-output.html
@V0XNIHILI note that knowing the quantized weight values themselves (the INT8 values) is not enough to specify the weights - you also have to know the quantization scale and its zero-point. Note also that this is not exactly the use case we are covering with NNCF right now. The `nncf.torch.quantization.algo.QuantizationController` object should have all the info you need to build a quantized representation. Here's some code to get you started (it would currently only work on the develop branch):
```python
import nncf.torch  # importing nncf.torch enables the NNCF PyTorch backend
import torch
from nncf.torch import create_compressed_model
from nncf import NNCFConfig
from nncf.torch.quantization.algo import QuantizationController


# A toy model with a single weighted layer, just to have something to quantize.
class MyWeightedModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv2d = torch.nn.Conv2d(5, 5, 1)
        torch.nn.init.normal_(self.conv2d.weight)

    def forward(self, x):
        return self.conv2d(x)


config = NNCFConfig.from_dict({
    "input_info": {"sample_size": [1, 5, 5, 5]},
    "compression": {
        "algorithm": "quantization",
        "initializer": {
            "range": {"num_init_samples": 0},
            "batchnorm_adaptation": {"num_bn_adaptation_samples": 0}
        }
    },
    "target_device": "TRIAL"
})

ctrl, model = create_compressed_model(MyWeightedModel(), config)
assert isinstance(ctrl, QuantizationController)

for w_qid, w_qinfo in ctrl.weight_quantizers.items():
    weight = w_qinfo.quantized_module.weight
    quantizer = w_qinfo.quantizer_module_ref
    # Quantization parameters in the format used by torch fake-quantize ops.
    qmin, qmax, scale, zp = quantizer.get_parameters_for_torch_fq()
    scale = scale.reshape(quantizer.scale_shape)
    zp = zp.reshape(quantizer.scale_shape)
    # Integer-valued representation of the floating-point weight tensor.
    qweight = zp + (weight / scale).round().clip(min=qmin, max=qmax)
    print(f"layer: {w_qid.target_node_name}\nscale: {scale}\nzp: {zp}\n"
          f"qmin: {qmin} qmax: {qmax}\nqweight: {qweight}\n\n")
```
If you convert the quantized model into OpenVINO IR, the weights will be quantized, while some operations such as BatchNorm are fused into adjacent layers. You can see the constants in the Netron tool, for example.
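Netron gives a visual view; if you would rather pull the constants out programmatically, the OpenVINO Python API can read the IR and walk its operations. A rough sketch follows, assuming a converted IR at `model.xml` (a hypothetical path) and that `get_data()` is available on `Constant` nodes, as in the ngraph-derived Python bindings; treat it as a starting point rather than a verified recipe.

```python
from openvino.runtime import Core

core = Core()
model = core.read_model("model.xml")  # hypothetical path to the converted IR

# Walk the graph and print every constant; after INT8 conversion these
# include the quantized weight tensors of the (possibly fused) layers.
for node in model.get_ops():
    if node.get_type_name() == "Constant":
        data = node.get_data()  # assumption: returns a numpy view of the data
        print(node.get_friendly_name(), data.dtype, data.shape)
```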