Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(checkpoint-postgres): store channel_values correctly #577

Conversation

benjamincburns
Copy link
Contributor

@benjamincburns benjamincburns commented Oct 10, 2024

Prior to this change, PostgresSaver wouldn't store channel_values if newVersions wasn't passed to the put method. This is due to a filter check in _dumpBlobs that checks each channel name to ensure it appears in the channel versions object prior to serializing it for storage.

Upon inspecting the other CheckpointSaver types, I noticed that MemorySaver, MongoDBSaver, and SqliteSaver) all ignore the newVersions argument to the BaseCheckpointSaver.put method, so I think this is the correct fix. From what I can see by searching references, no test code (apart from the tests this PR changes) passes the newVersions argument.

That said, the one reference to BaseCheckpointSaver.put that I found in actual shipped code is in PregelLoop._putCheckpoint (via _checkpointerPutAfterPrevious), and this does pass the newVersions argument.

So I have to ask, are all of the other CheckpointSaver implementations, and all of the existing tests implemented incorrectly, or is this the correct fix?

Found this while working on #541, thanks to the getTuple tests (all of them failed when I wired them up to PostgresSaver).

@benjamincburns benjamincburns force-pushed the fix-postgressaver-channel_values branch from c8ae5dd to 91210ac Compare October 10, 2024 06:32
@benjamincburns benjamincburns changed the title fix: PostgresSaver doesn't store channel_values correctly fix: make PostgresSaver store channel_values correctly Oct 10, 2024
@hgoona
Copy link

hgoona commented Oct 10, 2024

@benjamincburns are you saying the current PostgresSaver implementation is faulty? (Trying to understand which checkpoint Saver to use as a reference to develop my own one. It's so very confusing. I'm still trying to fix mine and hitting errors.. )

@benjamincburns
Copy link
Contributor Author

I don't know whether it is faulty or not. It looks to me like either it is faulty, or the other three are. It seems unlikely that it would be the other three, especially since this one was just introduced, but I can't say for certain, which is why I asked.

@benjamincburns benjamincburns force-pushed the fix-postgressaver-channel_values branch from 91210ac to 54c5311 Compare October 11, 2024 00:46
@benjamincburns benjamincburns changed the title fix: make PostgresSaver store channel_values correctly fix(PostgresSaver): store channel_values correctly Oct 11, 2024
@benjamincburns benjamincburns changed the title fix(PostgresSaver): store channel_values correctly fix(checkpoint-postgres): store channel_values correctly Oct 11, 2024
@benjamincburns benjamincburns force-pushed the fix-postgressaver-channel_values branch from 54c5311 to f2818bb Compare October 11, 2024 00:58
@jacoblee93
Copy link
Collaborator

jacoblee93 commented Oct 11, 2024

Hey @benjamincburns, respecting newVersions is the correct thing to do. Thanks so much for flagging this!

Let's close this one, if you can open an issue for making other implementations respect this that would be helpful, otherwise I can.

@jacoblee93 jacoblee93 closed this Oct 11, 2024
@benjamincburns
Copy link
Contributor Author

@jacoblee93 ah, glad I asked! I'll take a crack at it and reply back here if I'm unable to make it work for some reason.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants