Fix hybrid quantization reg issue #27687
Commits on Nov 5, 2024
-
[GPU] Fixes for hybrid quantization (openvinotoolkit#27127)
### Details:
- Set LPT callbacks to handle compression and avoid constant folding for it (taken from openvinotoolkit#20973)
- Allow u8/i8 output data type for compressed onednn FC
- Disable Dequantize propagation through Transpose if it is a dependency of SDPA, to keep the Transpose+SDPA fusion
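The last point can be pictured as a small graph callback. A minimal, hypothetical Python sketch, assuming a toy node model; the `Node` class and `can_propagate_dequantize_through` helper are illustrative, not the actual OpenVINO GPU plugin code:

```python
# Hypothetical sketch (not the real OpenVINO code): decide whether a
# Dequantize op may be propagated through a Transpose. Propagation is
# skipped when the Transpose feeds Scaled-Dot-Product-Attention, so the
# Transpose+SDPA fusion stays intact.

class Node:
    def __init__(self, op_type, consumers=None):
        self.op_type = op_type
        self.consumers = consumers or []

def can_propagate_dequantize_through(transpose):
    """Allow propagation only if no consumer of the Transpose is an SDPA op."""
    if transpose.op_type != "Transpose":
        return False
    return all(c.op_type != "SDPA" for c in transpose.consumers)

t_into_sdpa = Node("Transpose", [Node("SDPA")])
t_plain = Node("Transpose", [Node("Add")])
print(can_propagate_dequantize_through(t_into_sdpa))  # False: keep the fusion
print(can_propagate_dequantize_through(t_plain))      # True
```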
Commit: 2d3c917
-
[GPU] Fix hybrid quantization regression issue.
Many daily int8 models show a performance regression (incorrect convolution data type and bias). Fix the kernel selection issue.
Signed-off-by: hyunback <[email protected]>
Commit: 6ed5c74
Commits on Nov 11, 2024
-
Fix the daily int8 regression model issues.
Convolution is expected to use the int8 data type in int8 models, but when mixed weight compression occurs it runs in fp16 instead.
Signed-off-by: hyunback <[email protected]>
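The intended behavior can be illustrated with a small, hypothetical data-type selection rule. The function name and logic below are illustrative assumptions, not the plugin's real code:

```python
# Hypothetical illustration: pick the execution data type for a convolution.
# In a quantized (int8) model the convolution should stay on the int8 path
# even when mixed weight compression appears elsewhere in the graph; only
# non-quantized convolutions take the fp16 path.

def conv_exec_dtype(input_dtype, is_quantized_model):
    if is_quantized_model and input_dtype in ("i8", "u8"):
        return input_dtype  # keep the int8 path (and matching bias handling)
    return "f16"            # default half-precision path

print(conv_exec_dtype("i8", True))    # i8
print(conv_exec_dtype("f16", False))  # f16
```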
Commit: 2e62681
Commits on Nov 12, 2024
-
Disable quantized int8 onednn convolution in dynamic mode.
Dynamically quantized int8 onednn convolution currently has a functional issue and falls back to the reference convolution, so run it in fp16 mode instead.
Signed-off-by: hyunback <[email protected]>
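The fallback rule can be sketched as follows. This is a hedged, hypothetical illustration; `select_conv_impl` and its return convention are assumptions for the sake of the example, not the plugin's actual selection code:

```python
# Hypothetical sketch of the workaround: dynamically shaped quantized int8
# convolutions currently hit a functional issue (they land on the slow
# reference kernel), so such cases are forced onto the fp16 path instead.

def select_conv_impl(dtype, has_dynamic_shape):
    if dtype in ("i8", "u8"):
        if has_dynamic_shape:
            return ("onednn", "f16")  # workaround: fp16 in dynamic mode
        return ("onednn", dtype)      # static int8 stays quantized
    return ("onednn", "f16")

print(select_conv_impl("i8", False))  # ('onednn', 'i8')
print(select_conv_impl("i8", True))   # ('onednn', 'f16')
```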
Commit: f6448ff
-
Fix the failure with the int4 weight compression pattern in Llama 3.2.
Signed-off-by: hyunback <[email protected]>
Commit: 0ec2a87
Commits on Nov 13, 2024
-
Signed-off-by: hyunback <[email protected]>
Commit: 0fc98bb
-
Signed-off-by: hyunback <[email protected]>
Commit: a03dc6a
Commits on Nov 17, 2024
-
Apply the code review comments.
Signed-off-by: hyunback <[email protected]>
Commit: 49ad601