I want to quantize a model to FP8 W8A16 (since I am on Ampere). In the quantization_w8a8_fp8 example, it says no calibration is needed for FP8 W8A8. Is this also possible for FP8 W8A16? I did not find any information on this.
Also, if possible, can you give me an example on how to do this (like the FP8 W8A8 example)? Thanks in advance.
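For reference, here is a sketch of what an FP8 weight-only (W8A16) recipe might look like, modeled on the `QuantizationModifier` used in the quantization_w8a8_fp8 example but with only the `weights` section specified, so activations stay in 16-bit. This is an assumption on my part, not a documented preset: the group name `group_0`, the `channel` strategy, and the idea that omitting `input_activations` yields weight-only quantization are my guesses based on the compressed-tensors config format, and I have not verified that vLLM can load the result on Ampere.

```yaml
# Hypothetical llm-compressor recipe for FP8 weight-only (W8A16) quantization.
# Like the FP8 W8A8 example, static weight-only FP8 should not need calibration
# data, since the scales are computed from the weights themselves.
quant_stage:
  quant_modifiers:
    QuantizationModifier:
      ignore: ["lm_head"]          # keep the output head in higher precision
      config_groups:
        group_0:
          targets: ["Linear"]      # quantize all Linear layers
          weights:
            num_bits: 8
            type: float            # FP8 (E4M3) weights
            strategy: channel      # per-output-channel scales (assumed)
            symmetric: true
            dynamic: false
          # no input_activations section -> activations remain 16-bit (W8A16)
```

If this recipe is valid, it would be passed to `oneshot(...)` the same way the W8A8 example does, just without a calibration dataset.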