-
Notifications
You must be signed in to change notification settings - Fork 176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix relfrozenxid, relminmxid and relallvisible on table data files swap #157
Conversation
@dvarrazzo @MasahikoSawada do you guys have an opinion about this fix? Thanks for posting this @bashtanov , sorry it is taking us so long to review. |
It is probably a problem that must be fix. But the entire swapping has to be fast-forwarded about 10 years. I started working on it at the time @bashtanov sent his MR so it was conflicting with what I was doing too, but then I didn't get around progressing with that work so much. The behaviour in #152 is described as bad (see Robert Haas tranchant response in the thread) So we should likely merge this. But I still wonder why it was that way in first place. For me, I'd call it an oversight of the original error and I'd put back the xids the way they were meant to be. This MR also handles |
FWIW, my read of that email is that Robert might think the person's "custom addition to update the relfrozenxid" on top of pg_repack is really bad. Seems a little unclear anyway. I'll comment on #152 as well, which seems to be a bigger thread - but it seems to me that if repack is not processing visibility (ie. copying dead rows) then the relfrozenxid needs to be preserved. OTOH, if repack is processing visibility information as it copies data (skipping dead rows, essentially vacuuming as it copies) then it should be safe to update relfrozenxid afterwards. Needs a close look of course, but that makes sense to me at a high level. |
I've been meaning to add to this discussion as a few months ago we used repack as part of an xid wraparound avoidance operation, which caused us to look into this a bit deeper than most. One of the early things we reasoned was that due to the way repack creates the new table and inserts data, all rows would be written with a current xid within the system. This is fairly evident even from the SQL level and we were able to run a number of checks to confirm this. I should maybe mention that the table in question was over 1TB in size, and had hundreds of millions of rows, and roughly 20 different indexes. Once we were happy from the SQL level, we did some relevant investigation of the Postgres source to see if we could find any code areas that we should be concerned about, but again because we had recreated the table and all indexes entirely during the operation, we saw no such cases. Furthermore, new rows and updates rows also behaved as expected, in fact the only piece that seemed missing was pg_class.relfrozenxid which contained the value from the original pre-repacked table. At this point we felt confident we could update pg_class.relfrozenxid to match the earliest xid within the data itself and be safe, but before that, we reached out to our upstream cloud provider to have their postgres internals team (yes, we have an usually good relationship with them) also review our findings, and their team concurred it would be safe. Once the system catalog was updated, we saw no issues with performance nor system errors, and a subsequent vacuum verbose and vacuum freeze of the table raised no issues, and we have not seen any further issues having now run for several months. So the summary is, consider me a +1 for updating relfrozenxid of the repacked table. As mentioned elsewhere in this discussion, I think that was the original intent of the reorg authors, and at least as of v14 I see no reason why this would cause issues, with the only caveat being I have not looked into the nearby logical replication issue to understand how it might affect that, but in the common case this would be solid improvement for existing repack users, imho. |
@bashtanov Since this PR changes quite a few internal values, is it possible to check that it is working as expected? Other than that, it looks good. |
/* | ||
* Swap relfrozenxid and relminmxid, as they must be consistent with the data | ||
*/ | ||
swaptemp = relform1->relfrozenxid; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
swaptempxid not swaptemp should be initialized here
relform2->relfrozenxid = swaptempxid; | ||
|
||
#if PG_VERSION_NUM >= 90300 | ||
swaptemp = relform1->relminmxid; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
swaptempxid not swaptemp should be initialized here
I believe it doesn't as it uses an uninitialized variable. I'll test it when I have a moment but it may take similarly long. |
Here is the follow up PR: #377 |
This PR is follow up of #157. - Fixed the typo: #157 (comment) - Removed `PG_VERSION_NUM >= 90300` check since we don't support such old versions anyway - Added check `relform1->relkind != RELKIND_INDEX` since `relfrozenxid` and `relminmxid` is non-zero only for tables --------- Co-authored-by: Alexey Bashtanov <[email protected]>
This commit was merged together with the PR #377. |
Currently pg_repack sets both
relfrozenxid
s to the older one (the code comment says "larger", butTransactionIdFollows
acts as>
, so it essentially works the opposite). It implies excessive anti-wraparound vacuums on tables that have been repacked recently. It seems correct to me to swap this attribute along with the data, regardless whether the xid is normal or not. Same for relminmxid, and for relallvisible, which is a part of statistics.See also the discussion at #152.