Tracking read performance issues #121

Open
reythia opened this issue Aug 21, 2023 · 4 comments
Labels
good first issue Good for newcomers help wanted Extra attention is needed performance

Comments

@reythia

reythia commented Aug 21, 2023

I've observed on multiple systems and OSes that profiler speed results never exceed approx. 3 GiB/s, regardless of nonces and available hardware.

Edit: I have observed the same limitation on production nodes during the cycle gap.

Running multiple instances of the profiler allows for cumulative read speeds that scale to the system's CPU and IO limits, but still no more than ~3 GiB/s per instance.

Tests were run with --data-size=32.

@github-project-automation github-project-automation bot moved this to 📋 Backlog in Dev team kanban Aug 30, 2023
@poszu poszu added performance good first issue Good for newcomers help wanted Extra attention is needed labels Aug 30, 2023
@pigmej
Member

pigmej commented Sep 13, 2023

@poszu maybe it's about buffer sizes?

@reythia
Author

reythia commented Sep 13, 2023

> @poszu maybe it's about buffer sizes?

I didn't PR this because I suspect it would impact weaker machines and general home user accessibility. I also use 16-32 GiB post files, which are not necessarily representative of most users' setups.

That said, 8x the read buffer is the sweet spot for me.

8x got me to about 4 GiB/s, roughly in line with what I'd expect from a QD1 / single-threaded read, but still leaving a lot on the table for a RAID0 NVMe.
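For anyone who wants to reproduce the buffer-size effect outside the profiler, here is a minimal sketch of a single-threaded sequential read benchmark. It is not the profiler's code: the function names, the 1 MiB `base_size`, and the multiplier list are all assumptions chosen for illustration, since the actual default read buffer size isn't stated in this thread. Keep in mind that repeated runs over the same file can be served from the OS page cache, so numbers only mean something on files larger than RAM or after dropping caches.

```python
import time


def read_throughput(path, buf_size):
    """Read `path` sequentially in chunks of up to `buf_size` bytes.

    Returns (bytes_read, GiB_per_second). Single-threaded, one read in
    flight at a time, i.e. roughly a QD1 access pattern.
    """
    buf = bytearray(buf_size)
    total = 0
    start = time.perf_counter()
    # buffering=0 yields a raw file object, so each readinto() call is
    # a single read request of at most buf_size bytes.
    with open(path, "rb", buffering=0) as f:
        while True:
            n = f.readinto(buf)
            if not n:  # 0 at end of file
                break
            total += n
    elapsed = time.perf_counter() - start
    rate = total / (1 << 30) / elapsed if elapsed > 0 else float("inf")
    return total, rate


def sweep(path, base_size=1 << 20, multipliers=(1, 2, 4, 8, 16)):
    """Measure throughput for base_size * m for each multiplier m.

    base_size is a hypothetical baseline, not the profiler's real default.
    """
    return {m: read_throughput(path, base_size * m)[1] for m in multipliers}
```

Calling `sweep("/path/to/some/large/file")` returns a dict mapping each multiplier to the measured GiB/s, which makes it easy to see where the gains flatten out on a given machine.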

It would be ideal if readers could be partitioned to process multiple postdata_bin files in parallel, but I suspect very few users will benefit from it when running separate nodes has the same effect.
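To make the partitioning idea concrete, here is a rough sketch that gives each file its own sequential reader and runs several readers at once. This only illustrates the IO side and is not how the prover is implemented: the function names, the 8 MiB buffer, and the worker count are hypothetical, and a real reader would hand each chunk to the proving work rather than just counting bytes. Threads are enough here because blocking file reads in CPython release the GIL.

```python
from concurrent.futures import ThreadPoolExecutor


def read_file(path, buf_size):
    """Sequentially read one file and return the number of bytes read."""
    buf = bytearray(buf_size)
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            n = f.readinto(buf)
            if not n:  # 0 at end of file
                break
            total += n  # a real reader would process buf[:n] here
    return total


def read_files_parallel(paths, buf_size=8 << 20, workers=4):
    """Read several files concurrently, one sequential reader per file.

    Returns a dict mapping each path to the bytes read from it.
    buf_size and workers are illustrative defaults, not tuned values.
    """
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sizes = pool.map(lambda p: read_file(p, buf_size), paths)
        return dict(zip(paths, sizes))
```

With `workers` set to the number of files, this behaves much like running one node per file as far as disk access goes, which is why the benefit over separate instances may be small.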

@lrettig
Member

lrettig commented Oct 17, 2023

See this related thread on discord: https://discord.com/channels/623195163510046732/1163187703496585267

@lrettig
Member

lrettig commented Oct 23, 2023
