BigQueryIO's CDC ingestion requires the RowMutationInformation class. This class has two pairs of methods for setting and returning the change sequence number. The recently deprecated pair, "public static RowMutationInformation of(MutationType mutationType, long sequenceNumber)" and "public abstract Long getSequenceNumber();", no longer works correctly: the sequence number passed to the first method is no longer returned by the second, due to this code. This breaks existing pipelines that haven't been converted to the newly introduced methods.
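To make the expected behavior concrete, here is a minimal sketch of a backward-compatible pair, where the deprecated factory keeps the long so the deprecated getter can still return it. `MutationInfoSketch` is a hypothetical stand-in, not Beam's actual class, and the conversion of the long to a hexadecimal string for the new-style value is an assumption made for illustration only.

```java
/** Hypothetical stand-in for illustration; this is NOT Beam's RowMutationInformation. */
final class MutationInfoSketch {
    enum MutationType { UPSERT, DELETE }

    private final MutationType mutationType;
    private final Long sequenceNumber;         // populated only by the deprecated factory
    private final String changeSequenceNumber; // populated by both factories

    private MutationInfoSketch(MutationType mutationType, Long sequenceNumber,
                               String changeSequenceNumber) {
        this.mutationType = mutationType;
        this.sequenceNumber = sequenceNumber;
        this.changeSequenceNumber = changeSequenceNumber;
    }

    /** Deprecated-style factory: retains the long so getSequenceNumber() round-trips. */
    @Deprecated
    static MutationInfoSketch of(MutationType mutationType, long sequenceNumber) {
        // ASSUMPTION: the string form is the hex rendering of the long.
        return new MutationInfoSketch(
            mutationType, sequenceNumber, Long.toHexString(sequenceNumber));
    }

    /** New-style factory: only the string form is known. */
    static MutationInfoSketch of(MutationType mutationType, String changeSequenceNumber) {
        return new MutationInfoSketch(mutationType, null, changeSequenceNumber);
    }

    MutationType getMutationType() {
        return mutationType;
    }

    /** Returns the value given to the deprecated factory, or null for new-style instances. */
    @Deprecated
    Long getSequenceNumber() {
        return sequenceNumber;
    }

    String getChangeSequenceNumber() {
        return changeSequenceNumber;
    }
}
```

With this shape, a pipeline that still calls the deprecated pair reads back the value it wrote, while callers of the new pair are unaffected.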
Additionally, the new method performs compute-intensive validation of the sequence number's format. Is it possible that the underlying Storage Write API performs the same validation, so there is no need to do it twice?
Also, using the "checkArgument" function in the pipeline's runtime code means that a single row with an incorrect RowMutationInformation can cause a streaming pipeline to fail, unless the developer explicitly catches the IllegalArgumentException that checkArgument throws. Such a pipeline will have to be cancelled and cannot be drained.
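As a rough sketch of the defensive handling a developer would currently need, the snippet below catches the exception per row and sets the bad value aside instead of letting it propagate. `SequenceNumberGuard`, `validate`, and `partition` are hypothetical names, and the format pattern (one to four hexadecimal sections separated by `/`) is an assumption for illustration, not the check Beam actually performs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Beam. */
final class SequenceNumberGuard {
    // ASSUMPTION: illustrative format only (1 to 4 hex sections separated by '/').
    private static final Pattern FORMAT =
        Pattern.compile("[0-9A-Fa-f]{1,16}(/[0-9A-Fa-f]{1,16}){0,3}");

    /** Stand-in for a factory that validates its input checkArgument-style. */
    static String validate(String changeSequenceNumber) {
        if (changeSequenceNumber == null
                || !FORMAT.matcher(changeSequenceNumber).matches()) {
            throw new IllegalArgumentException(
                "Malformed change sequence number: " + changeSequenceNumber);
        }
        return changeSequenceNumber;
    }

    /** Holds the values that passed validation and the ones that were rejected. */
    static final class Result {
        final List<String> valid = new ArrayList<>();
        final List<String> rejected = new ArrayList<>();
    }

    /** Validates each candidate on its own so one bad value cannot fail the whole batch. */
    static Result partition(List<String> candidates) {
        Result result = new Result();
        for (String candidate : candidates) {
            try {
                result.valid.add(validate(candidate));
            } catch (IllegalArgumentException e) {
                // Set the bad value aside instead of rethrowing.
                result.rejected.add(candidate);
            }
        }
        return result;
    }
}
```

In a real pipeline the rejected values would more likely be routed to a dead-letter output than collected in a list, but the per-element try/catch is the same idea.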
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components
Component: Python SDK
Component: Java SDK
Component: Go SDK
Component: Typescript SDK
Component: IO connector
Component: Beam YAML
Component: Beam examples
Component: Beam playground
Component: Beam katas
Component: Website
Component: Infrastructure
Component: Spark Runner
Component: Flink Runner
Component: Samza Runner
Component: Twister2 Runner
Component: Hazelcast Jet Runner
Component: Google Cloud Dataflow Runner