Updated Synapse versions should fail to start if required background updates have not run yet. #16047

matrixbot · 2023-12-21T06:32:39Z

This issue has been migrated from #16047.

We (Beeper) just had a fun adventure upgrading to 1.88.

It looks like a background update, populate_user_directory_process_users, was added back in 1.74, and unbeknownst to us we've been grinding away at it ever since we upgraded in January. That's mostly because we have 28 million users in our users table thanks to all the appservice users, and because https://github.com/matrix-org/synapse/pull/15435/files#r1281053946 caused us to miss an index, so we could only process a single user every few seconds. This also means that none of the more recently added background updates have run, most importantly profiles_full_user_id_key_idx and room_membership_user_room_index.

We had updated to 1.85 in mid-June, but the background update was still chugging along and we were still missing the indexes. When we upgraded to 1.88 earlier today it ended up blowing up our DB, since queries like SELECT displayname FROM profiles WHERE full_user_id = '@brad:beeper.com' would just scan the 28 million row table. We resolved the issue by adding the index by hand, which only took a couple of minutes to complete, after which our Synapse instance was usable again.
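To illustrate why the missing index hurt so much, here is a minimal sketch that compares the query plan for that lookup with and without an index on full_user_id. It uses an in-memory SQLite table purely so it is self-contained; the two-column profiles schema, the sample rows, and the non-unique index definition are assumptions for illustration, not the real Synapse schema or the statement we actually ran.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's human-readable plan for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each EXPLAIN QUERY PLAN row is the plan detail.
    return " | ".join(row[-1] for row in rows)


def plans_before_and_after_index():
    conn = sqlite3.connect(":memory:")
    # Hypothetical, simplified stand-in for the profiles table.
    conn.execute("CREATE TABLE profiles (full_user_id TEXT, displayname TEXT)")
    conn.executemany(
        "INSERT INTO profiles VALUES (?, ?)",
        [(f"@user{i}:example.com", f"User {i}") for i in range(1000)],
    )

    sql = "SELECT displayname FROM profiles WHERE full_user_id = ?"
    params = ("@user42:example.com",)

    # Without an index the only option is to read every row.
    before = query_plan(conn, sql, params)

    # Index name taken from the issue; the definition here is an assumption.
    conn.execute(
        "CREATE INDEX profiles_full_user_id_key_idx ON profiles (full_user_id)"
    )
    after = query_plan(conn, sql, params)
    return before, after


if __name__ == "__main__":
    before, after = plans_before_and_after_index()
    print("before:", before)
    print("after: ", after)
```

The first plan is a full table scan and the second is an index search. On a 28 million row table that difference is what decides whether each profile lookup is cheap or reads the whole table.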

At the time of the update, our background_updates table looked like this:

After some discussion in #synapse-dev:matrix.org, it's clear that having these indexes be added by a background update is a good thing, as they can be added without a long blocking migration and done in a release before they're required. However, this leaves open a gap: if for whatever reason the background updates that were assumed to be complete haven't finished before the upgrade that depends on them, you end up with an extremely poorly performing or, worse, broken Synapse installation.

In an ideal world, Synapse could check at startup that the background updates it needs have completed, and explode gracefully with a nice error message if not. Ideally ideally, this would happen before migrations are applied, to make it easier to roll back the upgrade. This would require completed background updates to remain in the table with something like a completed_at column, but that's probably nice for auditability after the fact anyway.
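A rough sketch of what such a startup check could look like, not how Synapse actually implements anything. It assumes that a row still present in background_updates means the update has not finished, and that the table has update_name and progress_json columns. The REQUIRED_UPDATES set, the exception class, and the function names are all hypothetical; SQLite is used only to keep the example self-contained.

```python
import sqlite3

# Update names are taken from the issue; which updates a given release
# truly depends on is an assumption made for illustration.
REQUIRED_UPDATES = {
    "profiles_full_user_id_key_idx",
    "room_membership_user_room_index",
}


class PendingBackgroundUpdatesError(RuntimeError):
    """Raised when required background updates are still queued."""


def find_pending_required(conn, required):
    """Return the required update names that are still in the queue."""
    # Assumption: a row in background_updates means "not finished yet".
    rows = conn.execute("SELECT update_name FROM background_updates").fetchall()
    queued = {name for (name,) in rows}
    return sorted(queued & set(required))


def check_background_updates(conn, required=REQUIRED_UPDATES):
    """Fail loudly, before doing anything else, if prerequisites are missing."""
    pending = find_pending_required(conn, required)
    if pending:
        raise PendingBackgroundUpdatesError(
            "Refusing to start: required background updates have not "
            "completed: " + ", ".join(pending)
        )


def demo_connection():
    """Build an in-memory table resembling the situation in this issue."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE background_updates ("
        "update_name TEXT PRIMARY KEY, progress_json TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO background_updates VALUES (?, ?)",
        [
            ("populate_user_directory_process_users", "{}"),
            ("profiles_full_user_id_key_idx", "{}"),
            ("room_membership_user_room_index", "{}"),
        ],
    )
    return conn


if __name__ == "__main__":
    try:
        check_background_updates(demo_connection())
    except PendingBackgroundUpdatesError as exc:
        print(exc)
```

Note that with only the presence of a row to go on, this check cannot tell "completed" apart from "never scheduled", which is one more argument for keeping finished updates in the table with something like a completed_at column.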

matrixbot closed this as completed Dec 21, 2023

matrixbot changed the title ~~Dummy issue~~ Updated Synapse versions should fail to start if required background updates have not run yet. Dec 22, 2023

matrixbot added S-Major O-Occasional T-Enhancement labels Dec 22, 2023

matrixbot reopened this Dec 22, 2023

matrixbot mentioned this issue Dec 22, 2023

Bail during start-up if there are old background updates. #16397

Open

3 tasks
