Complex calculation using AVX2 is 3 times slower than normal #880
Sorry, I cannot reproduce your timings. Here is the test program I've been using
compiled with I get a consistent 2.5x speedup with xsimd on... same for
I used your code and got the same result, even after changing to n=1e8, p=1. But when I replace the data with my own input
the timings are very different. With n=1e8, p=1: 12 s versus 4 s. With n=1e6, p=1e2: 1.1 s (with xsimd AVX2) versus 2.4 s (normal). I think there may be two reasons: 1. the input data, 2. the data size. Either could affect the time cost.
Here's a godbolt with the alternatives: https://godbolt.org/z/TosvEr9fz Both show the 2.5x speedup.
The code you used has n=1e6, p=1e2. With these parameters the speedup is real, and I get the same result as you. But with n=1e8, p=1 the timing is different, as I described above. When I change your code
unfortunately it cannot be executed. With n = std::atoi("1000000") and p = std::atoi("100") the timing certainly matches yours. So if you make n large enough, you will find that xsimd takes more time. While debugging I found that the internal xsimd::sin or xsimd::cos is the time-consuming part. If the data size is small, such as n=1e6, the difference is not large, but with n=1e8 xsimd::cos and xsimd::sin have a large impact and the result is very different.
Using your example and modifying only the operation. The std::vector<double, xsimd::aligned_allocator<double>> size is 1e8.
xsimd::sqrt((xsimd::cos(ba) + xsimd::sin(bb)) / 2) takes 12 s, which is slower than
std::sqrt((std::cos(a[i]) + std::sin(b[i])) / 2), which takes 4.4 s.
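For anyone trying to reproduce the scalar side of this comparison, here is a minimal timing sketch that uses only the standard library. It is not the test program from this thread. The helper names `make_input` and `time_scalar`, the uniform random input, and the seed parameter are all assumptions made for illustration. Varying `n` and the input range independently is one way to check whether data size or input values drive the timing.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: n doubles drawn uniformly from [lo, hi).
std::vector<double> make_input(std::size_t n, double lo, double hi, unsigned seed) {
    std::mt19937_64 gen(seed);
    std::uniform_real_distribution<double> dist(lo, hi);
    std::vector<double> v(n);
    for (auto& x : v) x = dist(gen);
    return v;
}

// Runs the scalar kernel quoted above p times over all elements
// and returns the elapsed wall-clock time in seconds.
double time_scalar(const std::vector<double>& a, const std::vector<double>& b,
                   std::vector<double>& out, int p) {
    const auto t0 = std::chrono::steady_clock::now();
    for (int r = 0; r < p; ++r) {
        for (std::size_t i = 0; i < a.size(); ++i) {
            out[i] = std::sqrt((std::cos(a[i]) + std::sin(b[i])) / 2);
        }
    }
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```

A possible use is to call `time_scalar` once with inputs from `make_input(n, 0.0, 1.0, seed)` and once with a much wider range, keeping `n` and `p` fixed, then repeat with a different `n`. The xsimd variant would need the same treatment with the batch kernel substituted for the inner loop.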