Hello, why the transformation in ipynb for the matrix multiplication can make cache more friendly? #13
mazdarx7fc3s asked this question in Q&A
-
It's because the transformation raises the cache hit rate. Please see https://tvm.apache.org/docs/how_to/optimize_operators/opt_gemm.html#blocking for the details of blocking.
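A minimal sketch of the blocking (tiling) idea from the linked tutorial. The sizes here are illustrative (a small N so the pure-Python loops finish quickly, and an assumed tile size T that divides N), not the values TVM would pick:

```python
import random

N = 64        # matrix dimension (small so the pure-Python loops finish quickly)
T = 16        # hypothetical tile size; assumed to divide N evenly

A = [[random.random() for _ in range(N)] for _ in range(N)]
B = [[random.random() for _ in range(N)] for _ in range(N)]

# Naive order: for each output C[i][j] we stream a full row of A and a
# full column of B; the column of B is long gone from cache by the time
# the next i needs it again.
C_naive = [[0.0] * N for _ in range(N)]
for i in range(N):
    for j in range(N):
        for k in range(N):
            C_naive[i][j] += A[i][k] * B[k][j]

# Blocked order: exactly the same N*N*N multiply-adds, reordered so each
# T x T tile of A and B is reused T times while it is still resident in cache.
C_blocked = [[0.0] * N for _ in range(N)]
for i0 in range(0, N, T):
    for j0 in range(0, N, T):
        for k0 in range(0, N, T):
            for i in range(i0, i0 + T):
                for j in range(j0, j0 + T):
                    for k in range(k0, k0 + T):
                        C_blocked[i][j] += A[i][k] * B[k][j]

# Same arithmetic, same result -- only the memory access order changed.
assert all(abs(C_naive[i][j] - C_blocked[i][j]) < 1e-9
           for i in range(N) for j in range(N))
```

Both versions do the same number of multiply-adds; the blocked one is faster on real hardware only because each tile is loaded into cache once and reused many times.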
-
Usually, reuse of a small chunk of data helps cache-friendliness. Consider the buffer accesses performed by the inner loops.
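Back-of-the-envelope arithmetic on what the inner loops touch (the 1024 dimension comes from the question; the tile size and float32 element size are my assumptions):

```python
N = 1024          # matrix dimension from the question
T = 16            # hypothetical tile size
BYTES = 4         # assumed float32 element size

# Naive inner loop (fixed i, j): streams one full row of A plus one
# full column of B to produce a single output element.
naive_stream = 2 * N * BYTES            # bytes touched per output element

# Blocked inner loops (fixed i0, j0, k0): one T x T tile of A plus one
# T x T tile of B stay resident and are reused across T * T outputs.
blocked_tiles = 2 * T * T * BYTES       # bytes resident per tile of work

print(naive_stream, blocked_tiles)      # prints: 8192 2048
```

The blocked working set (2 KB here) fits comfortably in L1 cache and is reused many times before eviction, which is the reuse of a small chunk of data described above.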
-
before transformation:
after transformation:
In my view, before the transformation we need a 1024*1024*1024-iteration loop nest, and after the transformation we still need a 1024*1024*1024-iteration loop nest. Why does the time cost decrease so much?