We (Beeper) just had a fun adventure upgrading to 1.88.
Looks like back in 1.74 a background update, populate_user_directory_process_users, was added, and unbeknownst to us we've been grinding away at it ever since we upgraded back in January. That's mostly because we have 28 million users in our users table thanks to all the appservice users, and because https://github.com/matrix-org/synapse/pull/15435/files#r1281053946 caused us to miss an index, so we could only process a single user every few seconds. This also means that none of the more recently added background updates have run, most importantly profiles_full_user_id_key_idx and room_membership_user_room_index. We updated to 1.85 in mid-June, but the background update was still chugging along and we were still missing the indexes. When we upgraded to 1.88 earlier today it ended up blowing up our DB, since queries like SELECT displayname FROM profiles WHERE full_user_id = '@brad:beeper.com' would just scan the 28 million row table. We resolved the issue by adding the index by hand, which only took a couple of minutes to complete, after which our Synapse instance was usable again.
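To illustrate why the missing index hurt so much, here is a minimal sketch that compares query plans for the profile lookup with and without an index. It uses an in-memory SQLite database as a stand-in for our Postgres instance, a simplified two-column profiles table, and made-up user IDs. The index definition (a plain index on full_user_id) is an assumption inferred from the query, not copied from Synapse's schema.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable steps SQLite would use to run `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the plan description.
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")

# Simplified stand-in for the profiles table (assumption: real schema has more columns).
conn.execute("CREATE TABLE profiles (full_user_id TEXT, displayname TEXT)")
conn.executemany(
    "INSERT INTO profiles VALUES (?, ?)",
    ((f"@user{i}:example.com", f"User {i}") for i in range(1000)),
)

lookup = "SELECT displayname FROM profiles WHERE full_user_id = ?"
args = ("@user42:example.com",)

# Without the index, the only option is to walk the whole table.
plan_without_index = query_plan(conn, lookup, args)

# Assumed index definition, named after the background update in question.
conn.execute(
    "CREATE INDEX profiles_full_user_id_key_idx ON profiles (full_user_id)"
)

# With the index in place, the lookup becomes an index search.
plan_with_index = query_plan(conn, lookup, args)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

On Postgres, the equivalent check would be running EXPLAIN on the same query before and after creating the index.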
At the time of the update, our background_updates table looked like this:
After some discussion in #synapse-dev:matrix.org, it's clear that having these indexes added by a background update is a good thing, as they can be added without a long blocking migration and in a release before they're required. However, this leaves a gap: if for whatever reason the background updates that are assumed to be there don't complete before the upgrade that depends on them, you end up with an extremely poorly performing or, worse, broken Synapse installation.
In an ideal world, Synapse at startup could check that the background updates that it needs were completed and explode gracefully with a nice error message. Ideally ideally, this would happen prior to migrations being applied, so as to make it easier to roll back the upgrade. This would require completed background updates to remain in the table with something like a completed_at column, but that's probably nice for auditability after the fact.
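A rough sketch of what such a startup check could look like, to make the proposal concrete. Everything here is hypothetical: the completed_at column does not exist today, the function and exception names are invented for illustration, some_finished_update is a made-up row, and SQLite stands in for the real database. It is not how Synapse currently works.

```python
import sqlite3


class IncompleteBackgroundUpdatesError(RuntimeError):
    """Raised when updates this release depends on have not finished."""


def find_incomplete_updates(conn, required):
    """Return the names in `required` that are not marked as completed.

    Assumes a hypothetical completed_at column that is NULL until the
    update finishes. Updates missing from the table count as incomplete.
    """
    if not required:
        return []
    placeholders = ",".join("?" for _ in required)
    rows = conn.execute(
        "SELECT update_name FROM background_updates "
        f"WHERE completed_at IS NOT NULL AND update_name IN ({placeholders})",
        list(required),
    ).fetchall()
    completed = {row[0] for row in rows}
    return [name for name in required if name not in completed]


def check_required_updates(conn, required):
    """Fail fast, before any migrations run, if a dependency is missing."""
    missing = find_incomplete_updates(conn, required)
    if missing:
        raise IncompleteBackgroundUpdatesError(
            "Refusing to start: these background updates must complete "
            "on the previous version first: " + ", ".join(missing)
        )


# Demo data modelled loosely on the situation described above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE background_updates (update_name TEXT PRIMARY KEY, completed_at INTEGER)"
)
conn.executemany(
    "INSERT INTO background_updates VALUES (?, ?)",
    [
        ("some_finished_update", 1690000000),  # hypothetical completed update
        ("populate_user_directory_process_users", None),  # still running
        ("profiles_full_user_id_key_idx", None),  # queued behind it
    ],
)

required = ["profiles_full_user_id_key_idx", "room_membership_user_room_index"]

try:
    check_required_updates(conn, required)
except IncompleteBackgroundUpdatesError as exc:
    print(exc)
```

Running the check before schema migrations are applied is the important part: a failure at that point leaves the database untouched, so rolling back is just a matter of starting the old version again.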
This issue has been migrated from #16047.