-
Notifications
You must be signed in to change notification settings - Fork 906
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validate datasets versions #4347
Conversation
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, and all makes sense 👍
One thing I was wondering is if this could somehow break existing workflows that use the DataCatalog
?
Signed-off-by: Elena Khaustova <[email protected]>
Signed-off-by: Elena Khaustova <[email protected]>
That's a good question. Technically, we always required The alternative is to apply these changes just to |
Yes so technically speaking this is breaking for people who have been passing instantiated dataset objects to catalog constructor with different save versions. It's a very rare thing to do and I agree this is actually improving the expected behaviour. So let's just keep it like this, but then at least if someone does complain we know what's going on. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
Description
Solves #4327
Merge before #4329
Development notes
Added
_validate_versions
function to ensure all datasets in a catalog adhere to a versioning scheme - we allow single load version per dataset in the catalog and one save version for all datasets in the catalog. The function automatically updates the provided load versions based on the versions specified for the individual datasets. It also ensures all versioned datasets in the catalog share the same save version. If a conflict arises, aVersionAlreadyExistsError
is raised.Validation is applied to both
DataCatalog
andKedroDataCatalog
when a catalog is created or the dataset is added.Developer Certificate of Origin
We need all contributions to comply with the Developer Certificate of Origin (DCO). All commits must be signed off by including a
Signed-off-by
line in the commit message. See our wiki for guidance.If your PR is blocked due to unsigned commits, then you must follow the instructions under "Rebase the branch" on the GitHub Checks page for your PR. This will retroactively add the sign-off to all unsigned commits and allow the DCO check to pass.
Checklist
RELEASE.md
file