-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Proposal for refreshes needed metadata #67
Conversation
This PR was made as a support for the writing of the specs. The change in the protocol will go through the normal proposal process which is describe here: https://www.notion.so/Protocol-Change-Process-64877a3d60564884bda07dc5dea8b4af |
Add |
…ric/protocol-change-for-refreshes
…ric/protocol-change-for-refreshes
There's a second file to update as well: https://github.com/airbytehq/airbyte-protocol/blob/main/protocol-models/src/main/resources/airbyte_protocol/airbyte_protocol.yaml |
@@ -485,6 +497,20 @@ definitions: | |||
type: array | |||
items: | |||
type: string | |||
StreamMetadata: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be added into ConfiguredAirbyteStream?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not to get too into bike shedding, but at the point that it is in ConfiguredAirbyteStream
does it need its own struct at all? can't these all just be fields directly in ConfiguredAirbyteStream
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
…ric/protocol-change-for-refreshes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a second file to update as well: https://github.com/airbytehq/airbyte-protocol/blob/main/protocol-models/src/main/resources/airbyte_protocol/airbyte_protocol.yaml
minimum_generation_id: | ||
description: The minimum generation id which is needed in a stream. If it is present, the destination will try to delete the data that are part of a generation lower than this property. If it is absent, the destination won't try to delete data | ||
type: integer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would this be null? I think on the first sync both generation_id and minimum_generation_id both be "0", and non-null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is set to null for regular sync where clearing the data is not needed. Having the minimum generation id won't be useful for the destination in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it a problem if we always include it? Allowing null implies a 3rd state that is a little confusing. Unless there is a reason we must allow it to be nullable, I think it makes most sense to always set it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the behavior if we set it to 0 will be the same than setting it at null. It will mean that we will not perform a cleaning of the data. FYI @edgao
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kk. if that's the case. i agree with evan. it shouldn't be nullable.
Updated and I also bumped the version which was missing. |
protocol-models/src/main/resources/airbyte_protocol/airbyte_protocol.yaml
Show resolved
Hide resolved
- current_generation_id | ||
- sync_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are there any backwards compatibility concerns that we need to worry about by making these required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, there will be if a migrated connector is running on a non migrated platform.
@cgardens Is it ok if I move it back to a dedicated object with the 3 fields being required and the new object will be optional in the ConfiguredAirbyteStream
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my intuition is that we are actually safe, because i can't think of a time that a connector ever gives a configured catalog to the platform. it's only ever the platform giving it to a connector. so maybe it's safe? I'd have to dig through the code a bit more to sanity check it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is when the platform gives a catalog to a connector with those fields. If the connector are migrated, those fields will be required but not present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When would that happen though? If the platform knows about though fields it should also populate them, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that it can happen in OSS where we can have the connectors running on an old platform. The platform won't know about the fields but the connector will do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah. good point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, does the solution which extract those fields in a dedicated object and make it optional in the configuredAirbyteStream sound good?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm not excited about that. I think it's letting a migration details dictate the design. i'd rather make them all nullable but specify that they are nullable for backwards compatibility reasons and if they are null what the default behavior is.
@evantahler wdyt? relates to your comment below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not ideal, but I think it's the best we can do. Added to https://docs.google.com/document/d/1wnnyYT7nzEyrvTiCOQGImfbgCyiTaBU9QIzUmojeozg/edit
@@ -464,6 +464,8 @@ definitions: | |||
- stream | |||
- sync_mode | |||
- destination_sync_mode | |||
- current_generation_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's generation_id
below, not current_generation_id
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update following the discussion in #67 (comment)
This is the implementation of https://docs.google.com/document/d/1MAReO2FwV7l-ETAZTLScrcX4DZgEP7N_aCtOC9keAiU/edit