Fix hybrid quantization reg issue #27687
Commits on Nov 5, 2024
-
[GPU] Fixes for hybrid quantization (openvinotoolkit#27127)
### Details:
- Set LPT callbacks to handle compression and avoid constant folding for it (taken from openvinotoolkit#20973)
- Allow u8/i8 output data type for compressed onednn FC
- Disable Dequantize propagation through Transpose if it is a dependency of SDPA, to keep the Transpose+SDPA fusion
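The last point can be pictured as a small graph callback. A minimal, hypothetical Python sketch, assuming a toy node model; the `Node` class and `can_propagate_dequantize_through` helper are illustrative, not the actual OpenVINO GPU plugin code:

```python
# Hypothetical sketch (not the real OpenVINO code): decide whether a
# Dequantize op may be propagated through a Transpose. Propagation is
# skipped when the Transpose feeds Scaled-Dot-Product-Attention, so the
# Transpose+SDPA fusion stays intact.

class Node:
    def __init__(self, op_type, consumers=None):
        self.op_type = op_type
        self.consumers = consumers or []

def can_propagate_dequantize_through(transpose):
    """Allow propagation only if no consumer of the Transpose is an SDPA op."""
    if transpose.op_type != "Transpose":
        return False
    return all(c.op_type != "SDPA" for c in transpose.consumers)

t_into_sdpa = Node("Transpose", [Node("SDPA")])
t_plain = Node("Transpose", [Node("Add")])
print(can_propagate_dequantize_through(t_into_sdpa))  # False: keep the fusion
print(can_propagate_dequantize_through(t_plain))      # True
```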
Commit: 2d3c917
-
[GPU] Fix hybrid quantization regression issue.
Many daily int8 models show a performance regression (incorrect convolution data type and bias). Fix the kernel selection issue.
Signed-off-by: hyunback <[email protected]>
Commit: 6ed5c74
Commits on Nov 11, 2024
-
Fix the daily int8 regression model issues.
Convolution is expected to use the int8 data type in int8 models, but when mixed weight compression occurs it runs in fp16 instead.
Signed-off-by: hyunback <[email protected]>
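The intended behavior can be illustrated with a small, hypothetical data-type selection rule. The function name and logic below are illustrative assumptions, not the plugin's real code:

```python
# Hypothetical illustration: pick the execution data type for a convolution.
# In a quantized (int8) model the convolution should stay on the int8 path
# even when mixed weight compression appears elsewhere in the graph; only
# non-quantized convolutions take the fp16 path.

def conv_exec_dtype(input_dtype, is_quantized_model):
    if is_quantized_model and input_dtype in ("i8", "u8"):
        return input_dtype  # keep the int8 path (and matching bias handling)
    return "f16"            # default half-precision path

print(conv_exec_dtype("i8", True))    # i8
print(conv_exec_dtype("f16", False))  # f16
```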
Commit: 2e62681
Commits on Nov 12, 2024
-
Disable quantized int8 onednn convolution in dynamic mode.
Dynamically quantized int8 onednn convolution currently has a functional issue and falls back to the reference convolution, so run it in fp16 mode instead.
Signed-off-by: hyunback <[email protected]>
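The fallback rule can be sketched as follows. This is a hedged, hypothetical illustration; `select_conv_impl` and its return convention are assumptions for the sake of the example, not the plugin's actual selection code:

```python
# Hypothetical sketch of the workaround: dynamically shaped quantized int8
# convolutions currently hit a functional issue (they land on the slow
# reference kernel), so such cases are forced onto the fp16 path instead.

def select_conv_impl(dtype, has_dynamic_shape):
    if dtype in ("i8", "u8"):
        if has_dynamic_shape:
            return ("onednn", "f16")  # workaround: fp16 in dynamic mode
        return ("onednn", dtype)      # static int8 stays quantized
    return ("onednn", "f16")

print(select_conv_impl("i8", False))  # ('onednn', 'i8')
print(select_conv_impl("i8", True))   # ('onednn', 'f16')
```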
Commit: f6448ff
-
Fix the failure with the int4 weight compression pattern in Llama 3.2.
Signed-off-by: hyunback <[email protected]>
Commit: 0ec2a87
Commits on Nov 13, 2024
-
Signed-off-by: hyunback <[email protected]>
Commit: 0fc98bb
-
Signed-off-by: hyunback <[email protected]>
Commit: a03dc6a
Commits on Nov 17, 2024
-
Apply the code review comments.
Signed-off-by: hyunback <[email protected]>
Commit: 49ad601