You mean as opposed to streaming updates? If so, yes. This gets hit when a pipeline tries to upgrade a single transform to a new Beam version (via TransformService, for example).
What would you like to happen?
We ran into several regressions where updates to the construction schema of I/O transforms broke transform upgrades via TransformService (from older Beam versions).
The pattern is:
(1) We add a new field/property to the I/O transform
(2) We update the schema in the corresponding IOTranslation class (for example, [1] [2])
(3) We update the schema.
If we do not take special mitigation actions, this might result in breakages during upgrade. This is because the Row objects we try to parse might actually come from older Beam versions where the new field does not exist.
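The failure mode described above can be sketched in plain Java. This is only an illustration: a `Map` stands in for the config Row, and the class, method, and field names (`ConfigRowReadSketch`, `strictRead`, `readOrDefault`, `topic`, `new_option`) are hypothetical, not Beam APIs.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a config Row: field name -> value.
class ConfigRowReadSketch {

  // Strict read: assumes every field of the current schema is present,
  // so it fails on rows produced before the field was added.
  static Object strictRead(Map<String, Object> row, String field) {
    if (!row.containsKey(field)) {
      throw new NoSuchElementException("Field not in row: " + field);
    }
    return row.get(field);
  }

  // Tolerant read: falls back to a default when the row predates the field.
  static Object readOrDefault(Map<String, Object> row, String field, Object defaultValue) {
    return row.containsKey(field) ? row.get(field) : defaultValue;
  }

  public static void main(String[] args) {
    // Row as an older version might have written it: no "new_option" field.
    Map<String, Object> oldRow = new HashMap<>();
    oldRow.put("topic", "events");

    System.out.println(readOrDefault(oldRow, "new_option", false));
    try {
      strictRead(oldRow, "new_option");
    } catch (NoSuchElementException e) {
      System.out.println("strict read failed: " + e.getMessage());
    }
  }
}
```

The guarded read only helps if the default reproduces the behavior the older version had before the field existed; that has to be decided per field.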
Example regressions/fixes: [3] [4]
Potential strategies to prevent similar issues in the future:
(1) Develop utils to safely migrate the schema
(2) More integration/compat tests
(3) Move to proto
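As a rough sketch of strategies (1) and (2), a migration helper could fill in defaults for fields that an older row lacks, and a compatibility check could run rows shaped like those of older versions through it. Everything here is an assumption for illustration: a `Map` stands in for the config Row, and `ConfigRowMigrationSketch`, `migrate`, `isCompatible`, and the field names are hypothetical rather than existing Beam utilities.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ConfigRowMigrationSketch {

  // Returns a copy of `row` that contains exactly the fields of the current
  // schema. Values present in the row win; missing fields take the default.
  // Fields unknown to the current schema are dropped (a simplifying assumption).
  static Map<String, Object> migrate(
      Map<String, Object> row, Map<String, Object> currentDefaults) {
    Map<String, Object> upgraded = new LinkedHashMap<>(currentDefaults);
    for (Map.Entry<String, Object> e : row.entrySet()) {
      if (currentDefaults.containsKey(e.getKey())) {
        upgraded.put(e.getKey(), e.getValue());
      }
    }
    return upgraded;
  }

  // Compatibility check: every older row must migrate to a row that has
  // all fields of the current schema.
  static boolean isCompatible(
      List<Map<String, Object>> olderRows, Map<String, Object> currentDefaults) {
    for (Map<String, Object> row : olderRows) {
      if (!migrate(row, currentDefaults).keySet().equals(currentDefaults.keySet())) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    Map<String, Object> defaults = new LinkedHashMap<>();
    defaults.put("topic", "");
    defaults.put("new_option", false);

    Map<String, Object> oldRow = new LinkedHashMap<>();
    oldRow.put("topic", "events");

    System.out.println(migrate(oldRow, defaults));
    System.out.println(isCompatible(List.of(oldRow), defaults));
  }
}
```

In a real compatibility test, the older rows would come from recorded output of previous releases rather than being built by hand.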
[1] https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTranslation.java
[2] beam/sdks/java/io/kafka/upgrade/src/main/java/org/apache/beam/sdk/io/kafka/upgrade/KafkaIOTranslation.java (line 70 in 911c525)
[3] #30551
[4] #31685
Issue Priority
Priority: 2 (default / most feature requests should be filed as P2)