
chore(flamegraph): Lower number of concurrent reads #537

Merged
3 commits merged into main from txiao/chore/lower-number-of-concurrent-reads on Dec 2, 2024

Conversation

Zylphrex
Member

The new flamegraph endpoint is very likely the culprit behind the OOMs we've been seeing. Examining the old flamegraph endpoint, we found that we used to spawn ~25 goroutines per request, but at a lower throughput. So having 100 concurrent workers may be the cause of the OOMs, as we try to read too many profiles at once.

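For illustration, a minimal sketch of the kind of bounded shared pool this change moves to. The names here (`readProfile`, the ID source, the worker count constant) are hypothetical, not the endpoint's actual code:

```go
package main

import (
	"fmt"
	"sync"
)

// maxConcurrentReads caps how many profiles are fetched at once.
// The PR lowers the pool from 100 workers to 25 to bound peak memory.
const maxConcurrentReads = 25

// readProfile is a hypothetical stand-in for the storage read each
// worker performs (the real endpoint reads profile payloads).
func readProfile(id int) string {
	return fmt.Sprintf("profile-%d", id)
}

func main() {
	ids := make(chan int)
	results := make(chan string)

	// A fixed pool of workers shared by all in-flight requests: no
	// matter how many flamegraph requests arrive, at most
	// maxConcurrentReads profiles are being read (and held in memory)
	// at any one time.
	var wg sync.WaitGroup
	for w := 0; w < maxConcurrentReads; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range ids {
				results <- readProfile(id)
			}
		}()
	}

	go func() {
		for id := 0; id < 300; id++ {
			ids <- id
		}
		close(ids)
		wg.Wait()
		close(results)
	}()

	count := 0
	for range results {
		count++
	}
	fmt.Println("read", count, "profiles")
}
```

The key property is that the cap is global rather than per request: all requests draw from the same 25 workers.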
@Zylphrex Zylphrex requested a review from a team as a code owner November 29, 2024 20:58
@viglia
Contributor

viglia commented Dec 2, 2024

> The new flamegraph endpoint is very likely the culprit behind the OOMs we've been seeing. Examining the old flamegraph endpoint, we found that we used to spawn ~25 goroutines per request, but at a lower throughput. So having 100 concurrent workers may be the cause of the OOMs, as we try to read too many profiles at once.

a few clarifications:

  • previously we spawned a dedicated set of goroutines for each request, meaning that each request was guaranteed those routines to download its profiles. With the new pool approach that's not the case: we have a shared pool, and different requests compete for the routines to download chunks
  • we were not limited to ~25 goroutines per request. We had a dynamic allocation based on the number of profiles, roughly 5 goroutines for every 100 profiles (see the sketch after this comment). Given that we capped the number of profiles used in aggregation at 600, we happened to land around that number, but it could have been more or less depending on the profile count
  • that limit (600) was later lowered to 300 in this PR. Quite frankly, I don't remember why it was lowered, as there is no comment/description on the PR.

With that said, let's try lowering the pool to 25 and see how it behaves. It's hard to tell in advance, but keep in mind that this pool is shared among different requests.
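As a rough illustration of the old per-request sizing described above, a hypothetical reconstruction (the function name and the minimum-of-one floor are assumptions, not the actual code):

```go
package main

import "fmt"

// numGoroutines reconstructs the old sizing described above: roughly
// 5 goroutines for every 100 profiles, with the profile count capped
// (originally 600, later 300). Hypothetical, not the actual code.
func numGoroutines(numProfiles, maxProfiles int) int {
	if numProfiles > maxProfiles {
		numProfiles = maxProfiles
	}
	n := numProfiles / 100 * 5
	if n < 1 {
		n = 1 // assumed floor so small requests still get a worker
	}
	return n
}

func main() {
	fmt.Println(numGoroutines(1000, 600)) // 30, near the ~25 cited above
	fmt.Println(numGoroutines(1000, 300)) // 15 after the cap was lowered
	fmt.Println(numGoroutines(40, 600))   // 1
}
```

Under the original 600-profile cap this tops out around 30 goroutines per request, in the same ballpark as the ~25 cited in the description; under the later 300-profile cap, around 15.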

@Zylphrex
Member Author

Zylphrex commented Dec 2, 2024

The other difference is that the old flamegraph endpoint was used much less frequently than this one, so we need to restrict this one more carefully.

@Zylphrex Zylphrex merged commit bca3dba into main Dec 2, 2024
12 checks passed
@Zylphrex Zylphrex deleted the txiao/chore/lower-number-of-concurrent-reads branch December 2, 2024 15:27