rgw/sfs: check number of file descriptors on start #752
Comments
JFTR, if FDs are exhausted when making requests, you see things like this in the logs:
Or maybe this:
After that, subsequent requests will tend to just hang (or, if you're lucky, maybe fail with "access denied") |
I can't remember whether we discussed this detail, but is there any reason not to use |
This is somewhat arbitrary, but the idea is that we potentially need at least 4 FDs per worker thread (two for the sqlite db and its WAL, and another two to accommodate files that may be being read or written), plus about 40 for various pipes, sockets and other things that appear in /proc/$(pgrep radosgw)/fd before anything interesting happens, so let's round that 40 up to 64 just in case. Fixes: https://github.com/aquarist-labs/s3gw/issues/752 Signed-off-by: Tim Serong <[email protected]>
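The arithmetic in that commit message can be sketched as follows. This is only an illustration in Python, not the code from the actual change; the function name `estimate_required_fds` is hypothetical, and the figure of 512 worker threads is an assumption that is not stated anywhere in this thread.

```python
def estimate_required_fds(worker_threads, fds_per_thread=4, baseline=64):
    """Rough upper bound on file descriptors needed at peak.

    fds_per_thread=4: sqlite db + WAL, plus two for files being
    read or written (per the commit message).
    baseline=64: the ~40 pipes/sockets seen at startup, rounded up.
    """
    return worker_threads * fds_per_thread + baseline


if __name__ == "__main__":
    # 512 threads is an assumed value, used only for illustration.
    print(estimate_required_fds(512))
```

With the assumed 512 threads this comes to 2112, which is well above a soft limit of 1024.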
I think we just didn't know about that, or it didn't occur to anyone. If that works for the intended purpose, all the better. @irq0 thoughts? |
I only found out about it today when I did some further digging :-) |
No limit / high limit doesn't mean they are actually free and there is no other mechanism that limits them 🙃 Some interesting more info in https://0pointer.net/blog/file-descriptor-limits.html - I think we should follow the advice at the end about soft / hard limits. |
Fascinating. Thanks for the link. Most straightforward then is to do what Lennart says and bump the soft limit to the hard limit (which I can confirm is 524288 on my Tumbleweed desktop), but also maybe double check that the hard limit is nice and high, just out of paranoia. I've attempted to confirm that there's no use of |
We potentially need at least 4 FDs per worker thread (two for the sqlite db and its WAL, and another two to accommodate files that may be being read or written), plus about 40 for various pipes, sockets and other things that appear in /proc/$(pgrep radosgw)/fd before anything interesting happens. That's more than two thousand FDs, but the default soft FD limit is only 1024. The most straightforward and probably safest thing to do is just bump the RLIMIT_NOFILE soft limit (1024) to the hard limit (which these days should be 524288) on startup. In case the hard limit is somehow low, this commit also includes a check to see if it's at least as high as what we imagine we need. See https://0pointer.net/blog/file-descriptor-limits.html for discussion on bumping RLIMIT_NOFILE. Fixes: https://github.com/aquarist-labs/s3gw/issues/752 Signed-off-by: Tim Serong <[email protected]>
OK, I've updated aquarist-labs/ceph#229 to try to bump the soft limit (1024) to the hard limit (which should be 524288 on any reasonably modern system). Under the circumstances, given that limit is huge, I don't know that we need to try to actually allocate the couple thousand FDs we suspect we actually need at maximum. |
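For readers unfamiliar with the soft/hard limit approach, here is a minimal sketch of it using Python's standard `resource` module. It is not the code from aquarist-labs/ceph#229 (which lives in radosgw and is not Python); the function name `raise_nofile_soft_limit` and the error message are hypothetical.

```python
import resource


def raise_nofile_soft_limit(required_fds):
    """Raise the RLIMIT_NOFILE soft limit to the hard limit.

    Fails early if the hard limit is lower than what we think we need.
    Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # RLIM_INFINITY means "no limit", so only compare real numbers.
    if hard != resource.RLIM_INFINITY and hard < required_fds:
        raise RuntimeError(
            f"hard FD limit {hard} is below the {required_fds} we may need"
        )

    # Raising the soft limit up to the hard limit needs no privileges.
    if soft != hard:
        resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))

    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print(raise_nofile_soft_limit(2112))
```

The hard-limit check is the "paranoia" step mentioned above: on a system where the hard limit is already large it should never trigger.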
We should ensure the process is able to allocate more than just the default 1024 file descriptors, because otherwise we could run into problems once the file descriptors are exhausted.
The proposal is to allocate a bunch of file descriptors on start and ensure that we can do so. If we cannot, die with a message to the user; otherwise, continue. The expectation is that this would prevent potential problems down the line, at the cost of a few hundred milliseconds on start.
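The startup probe proposed here could look roughly like the sketch below. It is an assumption-laden illustration in Python rather than anything from the project: the helper name `can_allocate_fds` is hypothetical, and opening the null device is just one convenient way to consume a descriptor.

```python
import errno
import os


def can_allocate_fds(count):
    """Try to hold `count` descriptors open at once, then release them.

    Returns False if the per-process (EMFILE) or system-wide (ENFILE)
    limit is hit first; any other error is re-raised.
    """
    fds = []
    try:
        for _ in range(count):
            fds.append(os.open(os.devnull, os.O_RDONLY))
        return True
    except OSError as exc:
        if exc.errno in (errno.EMFILE, errno.ENFILE):
            return False
        raise
    finally:
        # Always give the probe descriptors back.
        for fd in fds:
            os.close(fd)


if __name__ == "__main__":
    if not can_allocate_fds(2112):
        raise SystemExit("not enough file descriptors available; raise the limit")
```

A caller would run this once at startup and exit with a clear message on failure, which is the "die with a message to the user" behaviour described above.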