Add code measuring CPU frequency #125
Comments
Nice. It's just that we already have better measurements than gettimeofday, and on Linux you can just ask the kernel. It deviates constantly, btw.
As I wrote earlier, it seems that the best code for measuring with clock-cycle precision is inside the t1ha benchmark. It supports x86, arm64, ppc64, s390x, e2k, ia64, etc. I was planning to rearrange this code as a separate "mera" library, but I don't have time for this yet. PPC64:
ARM64:
s390x:
AMD64:
It seems that you are both talking about measuring time intervals, while the code I provided measures the effective CPU frequency, using any of the abovementioned ways to measure the time interval. My point is that using rdtsc to count CPU cycles has been broken for about 10 years, because it reports cycles of a fixed base frequency (such as the 2 GHz in the reports provided in the encode.su thread). So instead I wrote a small piece of code for which we know how many CPU cycles it takes to execute; by measuring the time spent on it, we can easily compute the frequency. Moreover, the method works on almost any superscalar CPU. Using this approach, we can finally report correctly how many CPU cycles are spent on each hashing operation.
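A minimal sketch of the idea described above, not the snippet from the encode.su thread: time a chain of dependent integer additions and divide the number of additions by the elapsed time. It assumes GCC or Clang (for the empty inline-asm barrier), an optimized build such as `-O2`, and a CPU where a register-to-register add has a latency of one cycle. The names `dependent_adds` and `estimate_cpu_hz` are made up for illustration.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic wall-clock time in seconds. */
static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* One add that depends on the previous value of x. The empty asm with a
   "+r" constraint stops the compiler from merging consecutive adds. */
#define ADD_STEP(x, k) do { (x) += (k); __asm__ volatile("" : "+r"(x)); } while (0)

/* Executes blocks * 8 dependent adds. ASSUMPTION: each add costs one cycle
   and the loop bookkeeping runs in parallel on a superscalar core. */
static uint64_t dependent_adds(uint64_t blocks, uint64_t k) {
    uint64_t x = k;
    for (uint64_t i = 0; i < blocks; i++) {
        ADD_STEP(x, k); ADD_STEP(x, k); ADD_STEP(x, k); ADD_STEP(x, k);
        ADD_STEP(x, k); ADD_STEP(x, k); ADD_STEP(x, k); ADD_STEP(x, k);
    }
    return x;
}

/* Returns the highest cycles-per-second estimate seen over `repeats` runs.
   Taking the best run discards runs slowed down by interrupts or preemption. */
static double estimate_cpu_hz(uint64_t blocks, int repeats) {
    assert(blocks > 0 && repeats > 0);
    volatile uint64_t sink = 0;
    double best = 0.0;
    for (int r = 0; r < repeats; r++) {
        double t0 = now_sec();
        sink = dependent_adds(blocks, (uint64_t)r + 1);
        double dt = now_sec() - t0;
        if (dt > 0.0) {
            double hz = (double)(blocks * 8) / dt;
            if (hz > best) best = hz;
        }
    }
    (void)sink;
    return best;
}
```

Calling `estimate_cpu_hz(1000000, 5)` times 8 million adds per run. In an unoptimized build `x` lives in memory, so each add costs several cycles and the estimate comes out too low.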
Yes, I know these loop-counting tricks from gamers, who use them to calculate the frame rate. It's a rather stable way to do it. I'll check whether rdtsc with cpuid is better or worse. But "better" would be reading the frequency from the kernel via /proc.
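For reference, asking the Linux kernel could look roughly like the sketch below. It is an assumption about how one might do it, not code from this project: it first tries the cpufreq sysfs file `scaling_cur_freq` (reported in kHz) and falls back to the first `cpu MHz` line in `/proc/cpuinfo`, which exists on x86 but not on every architecture. The function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Parses one /proc/cpuinfo line. Returns 1 and stores the value in *mhz
   if the line is a "cpu MHz : <number>" entry, otherwise returns 0. */
static int parse_cpu_mhz_line(const char *line, double *mhz) {
    if (strncmp(line, "cpu MHz", 7) != 0) return 0;
    const char *colon = strchr(line, ':');
    if (!colon) return 0;
    return sscanf(colon + 1, "%lf", mhz) == 1;
}

/* Returns the frequency in Hz that the kernel reports, or -1.0 if neither
   source is available (e.g. in some containers or VMs). */
static double kernel_cpu_hz(int cpu) {
    char path[128];
    snprintf(path, sizeof path,
             "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_cur_freq", cpu);
    FILE *f = fopen(path, "r");
    if (f) {
        double khz;
        int ok = fscanf(f, "%lf", &khz) == 1;
        fclose(f);
        if (ok) return khz * 1e3;   /* sysfs value is in kHz */
    }

    /* Fallback: first "cpu MHz" entry, which is not necessarily `cpu`. */
    f = fopen("/proc/cpuinfo", "r");
    if (!f) return -1.0;
    char line[256];
    double mhz, hz = -1.0;
    while (fgets(line, sizeof line, f)) {
        if (parse_cpu_mhz_line(line, &mhz)) { hz = mhz * 1e6; break; }
    }
    fclose(f);
    return hz;
}
```

Both sources are snapshots taken at the moment of the read, so the value can differ from the frequency the core runs at while the benchmark executes.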
not hardcoded to 3 GHz. Some code is based on GH #125, but this result is not really good. On linux I found an easy way.
Frequency switching on a modern core usually takes microseconds. AMD's Precision Boost is pretty crazy: my CPU will be anywhere between 4.5 and 5.1 GHz with single-core boost, constantly changing due to power demand etc. I kind of doubt you can get accurate readings through anything that is not atomic with the execution of the code. Real-world time is also important, especially since older Intel AVX-512 implementations will clock a system down below "base" (Zen 4 doesn't have this penalty), potentially hiding some of the performance penalty, because a user might think 30 cycles at 2 GHz is better than 40 cycles at 3 GHz. There are also other things to consider: I'm pretty sure some AVX units can take upwards of 200 cycles just to turn on, which might not be measured here if the unit is already hot. https://blog.cloudflare.com/on-the-dangers-of-intels-frequency-scaling/
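To spell out the arithmetic behind the 30-versus-40-cycle example, wall-clock time is cycles divided by frequency:

$$
t_1 = \frac{30\ \text{cycles}}{2 \times 10^{9}\ \text{Hz}} = 15\ \text{ns},
\qquad
t_2 = \frac{40\ \text{cycles}}{3 \times 10^{9}\ \text{Hz}} \approx 13.3\ \text{ns}
$$

So the variant with the higher cycle count is the faster one in real time, which a cycles-only report would not show.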
I've drafted a patch that approaches the issue from a different angle. I usually know the CPU frequency of the machine I'm working with, and there might be some easy way to query it, so the frequency itself is of purely informational value to me. However, I don't always know whether the cycle counter code is correct, as #241 and #292 highlight, so some "visual control" is handy. So I decided to combine the tick counter and the real-time clock in ae7ccd9, which produces output similar to the following:
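The cross-check of a tick counter against the real-time clock could look roughly like the sketch below. This is an illustration under stated assumptions, not the code from ae7ccd9: `read_ticks` and `ticks_per_second` are hypothetical names, the inline asm assumes GCC or Clang, and on targets other than x86 and AArch64 it simply falls back to nanoseconds from `clock_gettime`.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdint.h>
#include <time.h>

/* Monotonic wall-clock time in seconds. */
static double now_sec(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Reads a free-running tick counter. Note that on x86 the TSC ticks at a
   fixed rate, and on AArch64 cntvct_el0 is the generic timer, so neither
   is guaranteed to match the core's effective clock. */
static inline uint64_t read_ticks(void) {
#if defined(__x86_64__) || defined(__i386__)
    uint32_t lo, hi;
    __asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#elif defined(__aarch64__)
    uint64_t v;
    __asm__ volatile("mrs %0, cntvct_el0" : "=r"(v));
    return v;
#else
    /* Fallback for other targets: nanoseconds, i.e. a 1 GHz "counter". */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}

/* Busy-waits for about interval_sec and returns counter ticks per second
   of real time. Unsigned subtraction tolerates one wrap of a 64-bit
   counter; a 32-bit counter would need 32-bit arithmetic instead. */
static double ticks_per_second(double interval_sec) {
    double t0 = now_sec();
    uint64_t c0 = read_ticks();
    double t1;
    do {
        t1 = now_sec();
    } while (t1 - t0 < interval_sec);
    uint64_t c1 = read_ticks();
    return (double)(c1 - c0) / (t1 - t0);
}
```

Printing `ticks_per_second(0.1)` next to the known frequency of the machine gives the kind of visual check described above: a value far from the expected clock suggests the counter is not counting core cycles.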
Our cycle counter code is correct for Intel. In fact, it is one of the only ones that is actually correct, following an Intel paper.
I think so. I'm mostly focused on MIPS (which has a 32-bit cycle counter) and ARM at the moment. The output above comes from my go-to MIPS32 router and reflects its frequency correctly. The code is basically a
I just wrote a little snippet that measures the actual frequency of the CPU core executing it: https://encode.su/threads/3389-Code-snippet-to-compute-CPU-frequency
Please consider using it to correctly compute the number of CPU cycles spent by hash functions, instead of RDTSC, whose fakeness was discussed here a few years ago.