Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MongoDBSaver.*list yields nothing when options.filter is defined, and fixing it requires a breaking change #581

Open
benjamincburns opened this issue Oct 11, 2024 · 3 comments · May be fixed by #625

Comments

@benjamincburns
Copy link
Contributor

benjamincburns commented Oct 11, 2024

For #541 I decided the best course of action for testing CheckpointSaver list implementations would be to write a combinatorial test that spams list with a bunch of different argument combinations. For MongoDBSaver, every test that involved specifying values under options.filter failed.

The options.filter argument appears to be meant to constrain the result based on values in the stored CheckpointMetadata. The MongoDBSaver attempts to do this here, with the following code:

    if (filter) {
      Object.entries(filter).forEach(([key, value]) => {
        query[`metadata.${key}`] = value;
      });
    }

Unfortunately because of how checkpoint and metadata are serialized in the put method (the output of this.serde.dumpsTyped is of type [string, UInt8Array]), metadata is stored as a BSON Binary field, so any time options.filter is passed in, list yields no results.

At first glance you might think that this would be fixable without a breaking change by applying the filter in the client (in the for (const doc of result) loop). That would work, but only if you also applied the limit constraint client-side as well. Otherwise any time a limit constraint is applied, you'll likely miss items that would've matched your filter.

The same problem exists on the SqliteSaver as well, however that's solvable without a breaking change by joining the query with a CTE that deserializes the stored metadata object (will PR that change tomorrow). I'm not a MongoDB aficionado, but I don't think it has facilities for transforming documents in any way prior to applying filters.

The simple fix is to persist the serialized fields as objects nested into the document object, rather than byte arrays. Unfortunately, doing this without a migration will break existing collections of checkpoint history.

At present there's no mechanism baked into either the langgraph-checkpoint library or the MongoDBSaver to check for out of date document structures, or to define or kick off migrations like these, so the design of this will likely require a bit of care. It would be easy enough to write an automatic migration that runs on start, but that gets ugly in production environments when there are multiple concurrent processes. As a result, I'd recommend making it so the updated MongoDBSaver fails on initialization if the migration is required for an existing db. That way the migration can be applied out of band, and there's no risk that someone will blow away their checkpoint storage as a result of an npm update.

For now I've pruned the filter checks in my list tests in order to prevent this from blocking progress on #541.

@benjamincburns benjamincburns changed the title Support for options.filter is broken on MongoDBSaver.list and can't be fixed without a breaking change MongoDBSaver.*list yields nothing when options.filter is defined, and fixing it requires a breaking change Oct 11, 2024
@jacoblee93
Copy link
Collaborator

Thanks for flagging and for the detailed writeup - we can make a minor bump with breaking changes.

@benjamincburns
Copy link
Contributor Author

benjamincburns commented Oct 11, 2024

Easy enough. I'll see if I can't submit a PR later today my time.

@benjamincburns benjamincburns changed the title MongoDBSaver.*list yields nothing when options.filter is defined, and fixing it requires a breaking change MongoDBSaver.*list yields nothing when options.filter is defined, and fixing it requires a breaking change Oct 14, 2024
@benjamincburns
Copy link
Contributor Author

benjamincburns commented Oct 14, 2024

Easy enough. I'll see if I can't submit a PR later today my time.

Yeah, that was wildly optimistic on my part. I'm going to get the tests into PR tonight with exclusions for the issues that I found while writing them (just raised the last of those), and then I'll circle back and submit some more PRs for those issues. 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants