optimize ping pong with graviton3 intrinsics #2

Open
welly87 opened this issue Aug 11, 2022 · 2 comments

Comments

welly87 commented Aug 11, 2022

Current result without ARM compiler flags:

#[Mean    =       11.498, StdDeviation   =       21.781]
#[Max     =    14901.247, Total count    =     10000000]
#[Buckets =           24, SubBuckets     =         2048]
Throughput of 85,475.067415 RTTs/sec
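
To compare runs like this one programmatically, the summary block can be parsed into named fields. The following is only a minimal sketch: the `parse_summary` helper and its regular expressions are illustrative assumptions, not part of the benchmark tool, and they rely only on the line layout visible in the output above.

```python
import re

# Summary block copied verbatim from the comment above.
SUMMARY = """\
#[Mean    =       11.498, StdDeviation   =       21.781]
#[Max     =    14901.247, Total count    =     10000000]
#[Buckets =           24, SubBuckets     =         2048]
Throughput of 85,475.067415 RTTs/sec
"""

# "Name = number" pairs inside the #[...] lines; a name may contain a space
# (for example "Total count").
FIELD = re.compile(r"([A-Za-z][A-Za-z ]*?)\s*=\s*([0-9.]+)")
# The throughput line uses comma thousands separators.
THROUGHPUT = re.compile(r"Throughput of ([0-9,.]+) RTTs/sec")


def parse_summary(text):
    """Return a dict of field name -> float for one summary block (hypothetical helper)."""
    stats = {}
    for line in text.splitlines():
        if line.startswith("#["):
            for name, value in FIELD.findall(line):
                stats[name.strip()] = float(value)
        else:
            match = THROUGHPUT.search(line)
            if match:
                stats["Throughput"] = float(match.group(1).replace(",", ""))
    return stats


stats = parse_summary(SUMMARY)
print(stats)
```

The units of the latency fields are not stated in the output, so the sketch keeps them as plain numbers.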

welly87 commented Aug 11, 2022

before adding flags

from vscode

#[Mean    =        9.599, StdDeviation   =       18.778]
#[Max     =    12918.783, Total count    =     10000000]
#[Buckets =           24, SubBuckets     =         2048]
Throughput of 103,119.784649 RTTs/sec
#[Mean    =        9.470, StdDeviation   =       18.620]
#[Max     =     9052.159, Total count    =     10000000]
#[Buckets =           24, SubBuckets     =         2048]
Throughput of 104,518.685563 RTTs/sec
#[Mean    =        9.808, StdDeviation   =       21.577]
#[Max     =    20103.167, Total count    =     10000000]
#[Buckets =           24, SubBuckets     =         2048]
Throughput of 100,940.492482 RTTs/sec

after adding flags

#[Mean    =       16.056, StdDeviation   =       23.649]
#[Max     =     8388.607, Total count    =     10000000]
#[Buckets =           24, SubBuckets     =         2048]
Throughput of 60,870.374780 RTTs/sec

using release run

#[Mean    =        9.839, StdDeviation   =       17.878]
#[Max     =    12017.663, Total count    =     10000000]
#[Buckets =           24, SubBuckets     =         2048]
Throughput of 100,630.186374 RTTs/sec

once again

#[Mean    =        9.691, StdDeviation   =       16.324]
#[Max     =     5853.183, Total count    =     10000000]
#[Buckets =           24, SubBuckets     =         2048]
Throughput of 102,155.608562 RTTs/sec
#[Mean    =        9.448, StdDeviation   =       16.813]
#[Max     =     5849.087, Total count    =     10000000]
#[Buckets =           24, SubBuckets     =         2048]
Throughput of 104,747.479048 RTTs/sec
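
The throughput figures in this comment can be compared side by side with a few lines of arithmetic. This is a rough sketch: the numbers are copied from the runs above and grouped under the labels that precede them, while the `summarize` helper and the choice of "before adding flags" as the baseline are assumptions made for illustration. With one to three runs per label, the differences between the non-flag runs may well be within noise.

```python
from statistics import mean

# Throughput figures (RTTs/sec) copied from the runs above, keyed by the
# label that precedes them in the comment.
runs = {
    "before adding flags": [103119.784649, 104518.685563, 100940.492482],
    "after adding flags": [60870.374780],
    "using release run": [100630.186374],
    "once again": [102155.608562, 104747.479048],
}


def summarize(runs, baseline):
    """Return label -> (mean throughput, percent change vs the baseline label)."""
    base = mean(runs[baseline])
    rows = {}
    for label, values in runs.items():
        avg = mean(values)
        rows[label] = (avg, (avg / base - 1.0) * 100.0)
    return rows


# Baseline choice is an assumption for this sketch.
summary = summarize(runs, "before adding flags")
for label, (avg, delta) in summary.items():
    print(f"{label:22s} {avg:12.1f} RTTs/sec  {delta:+6.1f}% vs baseline")
```

On these numbers the run after adding flags comes out roughly 40% below the baseline mean, while the release runs stay within a few percent of it.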

welly87 commented Aug 11, 2022

another try with a different process

#[Mean    =       16.920, StdDeviation   =       18.266]
#[Max     =    11706.367, Total count    =     10000000]
#[Buckets =           24, SubBuckets     =         2048]
#[Mean    =       18.682, StdDeviation   =       31.254]
#[Max     =    22904.831, Total count    =     10000000]
#[Buckets =           24, SubBuckets     =         2048]

Repository owner deleted a comment from yehyabelhadad Feb 22, 2024
Repository owner deleted a comment from skodumur Feb 23, 2024