Support for int8 matrix multiplication #287
hchoi-moveworks started this conversation in General
Replies: 1 comment
- This work is ongoing. Performance is good, but not yet on par with CUTLASS (roughly 10-20% slower). We are working on a full pipeline to support quantization out of the box.
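For readers unfamiliar with what int8 matmul support entails, here is a minimal sketch in NumPy of the idea being discussed: symmetrically quantize float operands to int8, multiply with int32 accumulation, and rescale the result. This is an illustration of the general technique only, not this project's implementation; the function names and the per-tensor quantization scheme are assumptions for the example.

```python
import numpy as np

def quantize(x, num_bits=8):
    """Symmetric per-tensor quantization of a float matrix to int8 (illustrative)."""
    scale = np.abs(x).max() / (2 ** (num_bits - 1) - 1)  # map max |x| to 127
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matmul(a, b):
    """Quantize both operands, multiply in int32 to avoid overflow, then dequantize."""
    qa, sa = quantize(a)
    qb, sb = quantize(b)
    acc = qa.astype(np.int32) @ qb.astype(np.int32)  # int32 accumulation
    return acc.astype(np.float32) * (sa * sb)        # rescale back to float

np.random.seed(0)
a = np.random.randn(64, 128).astype(np.float32)
b = np.random.randn(128, 32).astype(np.float32)
approx = int8_matmul(a, b)
exact = a @ b
# The quantized result tracks the float result within quantization error.
```

Production libraries such as CUTLASS run the int8 multiply on tensor-core hardware instructions rather than in software, which is where the performance comparison in this thread comes from.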
- Would there be support for optimizing models that leverage int8 matmul?