Started seeing an ugly failure mode in atproto-hub yesterday: our firehose consumer slowed to below the Bluesky relay's event rate, so we started falling behind. atproto-hub is CPU bound, and we had many (8-10) other firehose clients consuming our firehose at the same time, which is high. I dropped `ROLLBACK_WINDOW` from 200k seqs to 50k and added a second core to atproto-hub, which seemed to help, but we were still occasionally falling behind for a bit and then catching back up. Odd.
...and then this morning it got worse. I added snarfed/lexrpc@22d9fee to shed load by denying additional connections from the same IP after the first, which helped, so we're now out of the woods.
I still don't fully understand the failure mode though.
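
For reference, here is a rough sketch of the kind of per-IP load shedding described above. It is not the code in snarfed/lexrpc@22d9fee; `ConnectionLimiter`, `try_acquire`, and `release` are hypothetical names made up for illustration, and it assumes the server can see each client's IP and runs a hook when a connection opens and closes.

```python
import threading


class ConnectionLimiter:
    """Tracks open connections per client IP and rejects extras.

    Hypothetical helper for illustration only, not part of lexrpc.
    """

    def __init__(self, max_per_ip=1):
        self.max_per_ip = max_per_ip
        self._counts = {}  # ip -> number of currently open connections
        self._lock = threading.Lock()

    def try_acquire(self, ip):
        """Returns True if this IP may open another connection."""
        with self._lock:
            current = self._counts.get(ip, 0)
            if current >= self.max_per_ip:
                return False  # shed load: this IP is already at its limit
            self._counts[ip] = current + 1
            return True

    def release(self, ip):
        """Call when a connection from this IP closes."""
        with self._lock:
            remaining = self._counts.get(ip, 0) - 1
            if remaining > 0:
                self._counts[ip] = remaining
            else:
                # Drop the entry so the dict doesn't grow without bound.
                self._counts.pop(ip, None)


limiter = ConnectionLimiter(max_per_ip=1)
first = limiter.try_acquire('203.0.113.7')   # first connection is accepted
second = limiter.try_acquire('203.0.113.7')  # second from the same IP is denied
```

A subscription handler would call `try_acquire` before starting to stream and `release` in a `finally` block, so a dropped connection frees its slot. Behind a proxy or load balancer, the IP would have to come from a forwarded header rather than the socket address.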