Updated Synapse versions should fail to start if required background updates have not run yet. #16047

Open
matrixbot opened this issue Dec 21, 2023 · 0 comments


matrixbot commented Dec 21, 2023

This issue has been migrated from #16047.


We (Beeper) just had a fun adventure upgrading to 1.88.

Back in 1.74, a background update named populate_user_directory_process_users was added. Unbeknownst to us, we have been grinding away at it ever since we upgraded in January. That is mostly because we have 28 million users in our users table thanks to all the appservice users, and because https://github.com/matrix-org/synapse/pull/15435/files#r1281053946 caused us to miss an index, so we could only process a single user every few seconds. It also means that none of the more recently added background updates have run, most importantly profiles_full_user_id_key_idx and room_membership_user_room_index.

We updated to 1.85 in mid-June, but the background update was still chugging along and we were still missing the indexes. When we upgraded to 1.88 earlier today, it blew up our DB, since queries like SELECT displayname FROM profiles WHERE full_user_id = '@brad:beeper.com' would just scan the 28 million row table. We resolved the issue by adding the index by hand, which took only a couple of minutes, after which our Synapse instance was usable again.
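For anyone wanting to see the failure mode in miniature, here is a rough sketch of how the query plan for that lookup changes once the index exists. It uses an in-memory SQLite database purely so it runs anywhere. The table layout and the assumption that the index covers only the full_user_id column are mine, not taken from Synapse's schema, and a real Postgres deployment will print its plans differently.

```python
import sqlite3


def plan_for_profile_lookup(conn):
    """Return SQLite's query plan text for the profile lookup from the issue."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT displayname FROM profiles WHERE full_user_id = ?",
        ("@brad:beeper.com",),
    ).fetchall()
    # Each plan row is (id, parent, notused, detail); keep the detail text.
    return " ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")

# Hypothetical, simplified stand-in for the real profiles table.
conn.execute(
    "CREATE TABLE profiles (user_id TEXT, full_user_id TEXT, displayname TEXT)"
)
conn.executemany(
    "INSERT INTO profiles VALUES (?, ?, ?)",
    [(f"user{i}", f"@user{i}:beeper.com", f"User {i}") for i in range(1000)],
)

# Without the index, the only option is to walk the whole table.
plan_before = plan_for_profile_lookup(conn)

# Assumption: the index is on full_user_id alone.
conn.execute(
    "CREATE INDEX profiles_full_user_id_key_idx ON profiles (full_user_id)"
)

# With the index, the lookup becomes an index search.
plan_after = plan_for_profile_lookup(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

On a large Postgres table, building the index by hand would normally be done with CREATE INDEX CONCURRENTLY so that writes are not blocked while it builds.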

At the time of the update, our background_updates table looked like this:
[Image: screenshot of the background_updates table]

After some discussion in #synapse-dev:matrix.org, it's clear that adding these indexes via a background update is a good thing: they can be added without a long blocking migration, and in a release before they're required. However, this leaves a gap: if, for whatever reason, the background updates that are assumed to be there don't complete before the upgrade that depends on them, you end up with an extremely poorly performing or, worse, broken Synapse installation.

In an ideal world, Synapse could check at startup that the background updates it needs have completed and, if not, explode gracefully with a nice error message. Ideally ideally, this would happen before migrations are applied, to make it easier to roll back the upgrade. This would require completed background updates to remain in the table with something like a completed_at column, but that's probably nice for auditability after the fact.
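A minimal sketch of what such a startup check might look like, assuming the proposed completed_at column exists. None of this is existing Synapse code: the function names, the exception class, the REQUIRED list and the simplified table layout are all hypothetical, and SQLite is used only to keep the example self-contained.

```python
import sqlite3


class MissingBackgroundUpdatesError(RuntimeError):
    """Hypothetical error raised when required updates have not completed."""


def find_incomplete_updates(conn, required):
    """Return the required update names with no completed row in the table."""
    rows = conn.execute(
        "SELECT update_name, completed_at FROM background_updates"
    ).fetchall()
    completed = {name for name, completed_at in rows if completed_at is not None}
    # An update with no row at all is treated as incomplete too.
    return sorted(set(required) - completed)


def check_required_background_updates(conn, required):
    """Refuse to start, with a readable message, if anything is outstanding."""
    incomplete = find_incomplete_updates(conn, required)
    if incomplete:
        raise MissingBackgroundUpdatesError(
            "Refusing to start: required background updates have not "
            "completed: " + ", ".join(incomplete)
        )


# Demo with a simplified table; completed_at is the column proposed above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE background_updates ("
    "update_name TEXT PRIMARY KEY, "
    "progress_json TEXT NOT NULL, "
    "completed_at INTEGER)"
)
conn.executemany(
    "INSERT INTO background_updates VALUES (?, ?, ?)",
    [
        # Still running: completed_at is NULL.
        ("populate_user_directory_process_users", "{}", None),
        # Finished: completed_at holds an illustrative timestamp.
        ("room_membership_user_room_index", "{}", 1690000000),
    ],
)

# Hypothetical list of updates a given release would depend on.
REQUIRED = ["profiles_full_user_id_key_idx", "room_membership_user_room_index"]

startup_error = None
try:
    check_required_background_updates(conn, REQUIRED)
except MissingBackgroundUpdatesError as exc:
    startup_error = str(exc)

print(startup_error)
```

The check only reads from the table, so in principle it could run before any schema migrations are applied, which is what would make rolling back straightforward.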

matrixbot changed the title from "Dummy issue" to "Updated Synapse versions should fail to start if required background updates have not run yet." Dec 22, 2023
matrixbot reopened this Dec 22, 2023