[Bug]: Newly created Storage API write streams do not recognize the previously updated table schema #33238
Comments
After investigating a little, I found this doesn't actually have to do with dynamic destinations. The issue occurs when the sink creates a new StreamWriter after the schema has been updated. It looks like we create the StreamWriter with a fixed schema, and a StreamWriter returns an updated schema only if it was created before the update. A writer created after the update therefore never pulls the updated schema and keeps using the old schema it was created with.
Say a streaming pipeline has been running for a while, and then the table's schema gets updated. If the pipeline creates new write streams after this (e.g. autosharding determines we need more shards), we create those new write streams based on the original schema. We never tell the new shards that we are now writing with a new schema. See the following code references: Lines 514 to 530 in 288c156 and Lines 933 to 943 in 288c156.
We do always attempt to fetch the StreamWriter's updated schema: Lines 927 to 928 in 85bff0d
But looking at the implementation in the Storage API code base, we see that the updated schema is returned only if the StreamWriter was created before the schema update operation: https://github.com/googleapis/java-bigquerystorage/blob/f090c8eb91c1fad9e7f13850b367cca64c0afe5c/google-cloud-bigquerystorage/src/main/java/com/google/cloud/bigquery/storage/v1/StreamWriter.java#L576-L591
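To make the sequence concrete, here is a minimal, self-contained sketch of the behavior described above. It does not use the real Beam or Storage API classes: `Table`, `Writer`, and the logical timestamps are all hypothetical stand-ins, and the timestamp comparison is only my reading of the linked `StreamWriter` code, not a copy of it.

```java
import java.util.Optional;

class SchemaUpdateSketch {
    /** Hypothetical stand-in for the table: current schema plus the time it last changed. */
    static final class Table {
        String schema;
        long schemaUpdatedAt;

        Table(String schema, long now) {
            this.schema = schema;
            this.schemaUpdatedAt = now;
        }

        void updateSchema(String newSchema, long now) {
            this.schema = newSchema;
            this.schemaUpdatedAt = now;
        }
    }

    /** Hypothetical stand-in for a writer that is pinned to the schema it was built with. */
    static final class Writer {
        final Table table;
        final String schemaAtCreation;
        final long createdAt;

        Writer(Table table, String schemaAtCreation, long now) {
            this.table = table;
            this.schemaAtCreation = schemaAtCreation;
            this.createdAt = now;
        }

        /** Assumed rule: report an update only if it happened after this writer was created. */
        Optional<String> getUpdatedSchema() {
            return createdAt < table.schemaUpdatedAt
                    ? Optional.of(table.schema)
                    : Optional.empty();
        }
    }

    public static void main(String[] args) {
        Table table = new Table("v1", 0L);
        String sinkCachedSchema = table.schema;   // the sink remembers the original schema

        Writer before = new Writer(table, sinkCachedSchema, 1L);
        table.updateSchema("v2", 2L);             // schema changes while the pipeline runs

        // New shard created after the update, but still built from the cached schema.
        Writer after = new Writer(table, sinkCachedSchema, 3L);

        System.out.println("writer created before update sees: " + before.getUpdatedSchema());
        System.out.println("writer created after update sees:  " + after.getUpdatedSchema());
        System.out.println("writer created after update writes with: " + after.schemaAtCreation);
    }
}
```

In this sketch the writer created before the update learns about `v2`, while the writer created afterwards gets nothing back and keeps writing with `v1`, which matches the symptom in this issue.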
What happened?
#33231 added a test showing that the combination of these two features does not work.
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components