Is it possible to quantize to FP8 W8A16 without calibration data #858

Open
us58 opened this issue Oct 21, 2024 · 1 comment
Labels: enhancement (New feature or request)

Comments


us58 commented Oct 21, 2024

I want to quantize a model to FP8 W8A16 (since I am on Ampere). In the quantization_w8a8_fp8 example, it says no calibration is needed for FP8 W8A8. Is this also possible for FP8 W8A16? I did not find any information on this.

Also, if it is possible, can you give me an example of how to do this (like the FP8 W8A8 example)? Thanks in advance.

us58 added the enhancement label Oct 21, 2024

okwinds commented Oct 22, 2024

A single sample should work
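
To illustrate why weight-only FP8 may not need a real calibration set: in a W8A16 scheme the activations stay in 16-bit, so the only quantization scales are the weight scales, and those can be computed from the weight tensor itself. The snippet below is a minimal pure-Python sketch of that idea, assuming per-output-channel max-abs scaling and the E4M3 format (largest finite value 448). It is not the library's implementation or API; the names `round_to_e4m3`, `quantize_weights_fp8`, and `dequantize` are hypothetical helpers made up for this example.

```python
import math

FP8_E4M3_MAX = 448.0  # largest finite value in the FP8 E4M3 format


def round_to_e4m3(x: float) -> float:
    """Simulate rounding a Python float to the nearest FP8 E4M3 value."""
    if x == 0.0:
        return 0.0
    # Saturate to the representable range instead of overflowing.
    x = max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x))
    mag = abs(x)
    # Exponent of the leading bit; magnitudes below 2**-6 are subnormal.
    exp = max(math.frexp(mag)[1] - 1, -6)
    # 3 mantissa bits give 8 steps per power of two.
    step = 2.0 ** (exp - 3)
    return math.copysign(round(mag / step) * step, x)


def quantize_weights_fp8(weight):
    """Per-output-channel quantization that reads only the weights.

    `weight` is a list of rows (one row per output channel).
    Returns (quantized rows, per-row scales). No activations are involved.
    """
    quantized, scales = [], []
    for row in weight:
        amax = max(abs(v) for v in row)
        # Map the largest weight in the row onto the FP8 maximum.
        scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
        scales.append(scale)
        quantized.append([round_to_e4m3(v / scale) for v in row])
    return quantized, scales


def dequantize(quantized, scales):
    """Recover approximate 16-bit-style weights for the matmul."""
    return [[q * s for q in row] for row, s in zip(quantized, scales)]


# Hypothetical toy weight matrix: 2 output channels, 3 inputs each.
w = [[0.5, -1.0, 0.25], [3.0, 0.1, -2.0]]
q, s = quantize_weights_fp8(w)
w_hat = dequantize(q, s)
```

Nothing in this sketch looks at input data, which is consistent with the suggestion that a single (placeholder) sample is enough when a quantization pipeline insists on receiving a dataset. Whether a given toolkit exposes a ready-made FP8 weight-only scheme is a separate question that this example does not answer.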
