-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Proposal for refreshes needed metadata #67
Changes from 9 commits
771d56d
574208d
dfd12f7
e56fc45
8e3e1a1
37a3de0
066604d
29223c2
fba5ce8
e3ce32e
c762ef3
912eec5
2ab6964
db58c46
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -464,6 +464,8 @@ definitions: | |
- stream | ||
- sync_mode | ||
- destination_sync_mode | ||
- current_generation_id | ||
- sync_id | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. are there any backwards compatibility concerns that we need to worry about by making these required? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That's a good point, there will be if a migrated connector is running on a non migrated platform. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. my intuition is that we are actually safe, because i can't think of a time that a connector ever gives a configured catalog to the platform. it's only ever the platform giving it to a connector. so maybe it's safe? I'd have to dig through the code a bit more to sanity check it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My concern is when the platform gives a catalog to a connector with those fields. If the connector are migrated, those fields will be required but not present. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. When would that happen though? If the platform knows about though fields it should also populate them, right? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think that it can happen in OSS where we can have the connectors running on an old platform. The platform won't know about the fields but the connector will do. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. ah. good point. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So, does the solution which extract those fields in a dedicated object and make it optional in the configuredAirbyteStream sound good? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i'm not excited about that. I think it's letting a migration details dictate the design. i'd rather make them all nullable but specify that they are nullable for backwards compatibility reasons and if they are null what the default behavior is. @evantahler wdyt? relates to your comment below. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not ideal, but I think it's the best we can do. Added to https://docs.google.com/document/d/1wnnyYT7nzEyrvTiCOQGImfbgCyiTaBU9QIzUmojeozg/edit |
||
properties: | ||
stream: | ||
"$ref": "#/definitions/AirbyteStream" | ||
|
@@ -485,6 +487,15 @@ definitions: | |
type: array | ||
items: | ||
type: string | ||
generation_id: | ||
description: Monotically increasing numeric id representing the current generation of a stream. This id can be shared across syncs | ||
type: integer | ||
minimum_generation_id: | ||
description: The minimum generation id which is needed in a stream. If it is present, the destination will try to delete the data that are part of a generation lower than this property. If it is absent, the destination won't try to delete data | ||
type: integer | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why would this be null? I think on the first sync both generation_id and minimum_generation_id both be "0", and non-null There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is set to null for regular sync where clearing the data is not needed. Having the minimum generation id won't be useful for the destination in this case. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is it a problem if we always include it? Allowing null implies a 3rd state that is a little confusing. Unless there is a reason we must allow it to be nullable, I think it makes most sense to always set it. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. No, the behavior if we set it to 0 will be the same than setting it at null. It will mean that we will not perform a cleaning of the data. FYI @edgao There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. kk. if that's the case. i agree with evan. it shouldn't be nullable. |
||
sync_id: | ||
description: Monotically increasing numeric id representing the current sync id. This is aimed to be unique per sync. | ||
type: integer | ||
SyncMode: | ||
type: string | ||
enum: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's
generation_id
below, notcurrent_generation_id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update following the discussion in #67 (comment)