atproto_firehose subscriber falls behind when atproto-hub is busy #1641

Open
snarfed opened this issue Dec 20, 2024 · 0 comments

snarfed commented Dec 20, 2024

Started seeing an ugly failure mode in atproto-hub yesterday: our firehose consumer's throughput dropped below the Bluesky relay's event rate, so we started falling behind. atproto-hub is CPU bound, and we had 8-10 other clients subscribed to our own firehose at the same time, which is high. I dropped ROLLBACK_WINDOW from 200k seqs to 50k and added a second core to atproto-hub, which seemed to help, but we were still occasionally falling behind for a bit and then catching back up. Odd.

[image]

...and then this morning it got worse. I added snarfed/lexrpc@22d9fee to shed load by denying additional connections from the same IP after the first, which helped, so we're now out of the woods:

[image]
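
For anyone curious, the load-shedding idea is roughly the following. This is a minimal sketch only, not the actual lexrpc change: `PerIPLimiter`, `try_acquire`, and `release` are hypothetical names, and it assumes the server can see each subscriber's IP and gets a callback when a subscriber disconnects.

```python
import threading
from collections import Counter


class PerIPLimiter:
    """Allows at most `max_per_ip` concurrent connections per client IP.

    Hypothetical helper for illustration; not part of lexrpc.
    """

    def __init__(self, max_per_ip=1):
        self.max_per_ip = max_per_ip
        self._counts = Counter()       # ip => number of open connections
        self._lock = threading.Lock()  # subscribers may connect from many threads

    def try_acquire(self, ip):
        """Records a new connection and returns True, or returns False to reject it."""
        with self._lock:
            if self._counts[ip] >= self.max_per_ip:
                return False
            self._counts[ip] += 1
            return True

    def release(self, ip):
        """Call when a subscriber disconnects so that its IP can connect again."""
        with self._lock:
            if self._counts[ip] > 0:
                self._counts[ip] -= 1
            if self._counts[ip] == 0:
                del self._counts[ip]  # don't let the table grow without bound


limiter = PerIPLimiter()
first = limiter.try_acquire('203.0.113.7')   # first connection from this IP
second = limiter.try_acquire('203.0.113.7')  # second one while the first is still open
```

Here `first` is accepted and `second` is rejected. The server would close the rejected connection right away instead of serving it, so each extra subscriber from the same IP costs almost no CPU.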

I still don't fully understand the failure mode though.

snarfed added the infra label Dec 20, 2024