During a large FILE_LOADS BQ write, the WriteRename DoFn currently handles both the copy job and the deletion of temp tables (the latter happening at finishBundle). This implementation is risky because the copy step and the table deletion step are retried together rather than separately. If an error occurs partway through deletion and the whole bundle retries, we may attempt to copy from tables that no longer exist.
Example:
Say we have 10 temp tables and we receive a BQ exception when deleting the 5th table. WriteRename will retry and attempt to copy from all 10, but at this point 4 tables no longer exist, so the copy fails on every attempt. This bundle will continue retrying forever.
Solution:
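Follow the way temp file deletion is implemented: there, the two steps (load and delete) are separated into two different DoFns. Copy and temp table deletion should be split in the same way, so that a failed deletion retries only the deletion.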
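The failure mode and the proposed split can be illustrated with a small simulation. This is only a sketch in plain Java, not Beam code: `FakeBigQuery`, `copyAndDelete`, `copyOnly`, `deleteOnly` and `runWithRetries` are hypothetical names invented for this example, and the sketch assumes that deleting an already-deleted table is a harmless no-op.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CopyDeleteRetrySketch {

  /** Hypothetical in-memory stand-in for BigQuery; tracks which temp tables still exist. */
  static final class FakeBigQuery {
    final Set<String> tables = new HashSet<>();
    int failOnDeleteCall; // 1-based delete call that throws once (simulated BQ error)
    int deleteCalls = 0;

    /** Copy fails if any source table is missing, as in the scenario above. */
    void copy(List<String> sources) {
      for (String source : sources) {
        if (!tables.contains(source)) {
          throw new IllegalStateException("Not found: " + source);
        }
      }
    }

    /** Assumption of this sketch: deleting a missing table is a no-op. */
    void delete(String table) {
      deleteCalls++;
      if (deleteCalls == failOnDeleteCall) {
        throw new IllegalStateException("Simulated error deleting " + table);
      }
      tables.remove(table);
    }
  }

  static FakeBigQuery newBackend(List<String> temps, int failOnDeleteCall) {
    FakeBigQuery bq = new FakeBigQuery();
    bq.tables.addAll(temps);
    bq.failOnDeleteCall = failOnDeleteCall;
    return bq;
  }

  /** Combined step: copy and delete are retried as one unit. */
  static void copyAndDelete(FakeBigQuery bq, List<String> temps) {
    bq.copy(temps);
    for (String t : temps) {
      bq.delete(t);
    }
  }

  /** Split steps: each one is retried on its own. */
  static void copyOnly(FakeBigQuery bq, List<String> temps) {
    bq.copy(temps);
  }

  static void deleteOnly(FakeBigQuery bq, List<String> temps) {
    for (String t : temps) {
      bq.delete(t);
    }
  }

  /** Runs a step up to maxAttempts times; returns true if any attempt succeeded. */
  static boolean runWithRetries(Runnable step, int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        step.run();
        return true;
      } catch (IllegalStateException e) {
        // Swallow the error and retry the whole step, like a bundle retry.
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<String> temps = new ArrayList<>();
    for (int i = 1; i <= 10; i++) {
      temps.add("temp_" + i);
    }

    // Combined: the 5th delete fails, then every retry fails at the copy.
    FakeBigQuery combined = newBackend(temps, 5);
    boolean combinedOk = runWithRetries(() -> copyAndDelete(combined, temps), 5);
    System.out.println("combined succeeded: " + combinedOk
        + ", tables left: " + combined.tables.size());

    // Split: the copy succeeds once; only the delete step is retried.
    FakeBigQuery split = newBackend(temps, 5);
    boolean copyOk = runWithRetries(() -> copyOnly(split, temps), 5);
    boolean deleteOk = runWithRetries(() -> deleteOnly(split, temps), 5);
    System.out.println("split succeeded: " + (copyOk && deleteOk)
        + ", tables left: " + split.tables.size());
  }
}
```

In the combined variant the first attempt deletes 4 tables before failing, so the copy can never succeed again. In the split variant the copy has already completed by the time deletion fails, and retrying the deletion alone is safe under the no-op assumption stated above.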
Issue Priority
Priority: 2
Issue Component
Component: io-java-gcp
ahmedabu98 changed the title from "(Java BigQueryIO) Copy jobs and temp table deletion occurs in the same DoFn, making the operation vulnerable during retries" to "(Java BigQueryIO) Copy job and temp table deletion occurs in the same DoFn, making the operation vulnerable during retries" on Aug 26, 2022