Is it possible to quantize to FP8 W8A16 without calibration data #858

us58 · 2024-10-21T07:55:00Z

I want to quantize a model to FP8 W8A16 (since I am on Ampere). In the quantization_w8a8_fp8 example, it says no calibration is needed for FP8 W8A8. Is this also possible for FP8 W8A16? I did not find any information on this.

Also, if this is possible, could you give me an example of how to do it (like the FP8 W8A8 example)? Thanks in advance.

okwinds · 2024-10-22T01:16:45Z

A single calibration sample should work.
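
For context on why weight-only FP8 can plausibly get by with little or no calibration data: in W8A16 the activations stay in 16-bit, so the only scales needed are weight scales, and those can be computed from the weight tensor itself. The sketch below illustrates that idea in plain Python. It is not llm-compressor's API; the function names are hypothetical, per-output-channel scaling is an assumption, and actual rounding to FP8 is omitted. Whether the library exposes a calibration-free FP8 W8A16 path is not confirmed in this thread.

```python
# Minimal sketch (assumed names, not library code): per-channel weight scales
# for FP8 E4M3, derived from the weights alone with no calibration samples.

FP8_E4M3_MAX = 448.0  # largest finite magnitude representable in FP8 E4M3


def per_channel_scales(weight):
    """Return one scale per output channel (row) using the row's abs-max."""
    scales = []
    for row in weight:
        absmax = max(abs(v) for v in row)
        # Avoid a zero scale for an all-zero row.
        scales.append(absmax / FP8_E4M3_MAX if absmax > 0 else 1.0)
    return scales


def scale_to_fp8_range(weight, scales):
    """Divide each row by its scale so values fit in [-448, 448].

    Rounding to the nearest FP8 value is intentionally left out.
    """
    return [[v / s for v in row] for row, s in zip(weight, scales)]


# Toy 2x3 weight matrix (made-up values, for illustration only).
weight = [[0.5, -2.0, 1.0], [0.0, 0.25, -0.125]]
scales = per_channel_scales(weight)
scaled = scale_to_fp8_range(weight, scales)
```

Nothing here depends on input data, which is the difference from static activation quantization, where activation ranges have to be observed on samples.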

us58 added the enhancement label on Oct 21, 2024
