Fix cleanup timer timestamp to not exceed max allowed timestamp #33037
This fixes an exception during drain on jobs with GlobalWindows + AllowedLateness > 24h + @OnExpiredWindows callback
This fixes #20203. I think this is a safe change. Not 100% sure if an element with timestamp
R: @scwhittle
Run PreCommit Java
    .maxTimestamp()
    .plus(windowingStrategy.getAllowedLateness())
    .plus(Duration.millis(1L));
return cleanupTime.isAfter(BoundedWindow.TIMESTAMP_MAX_VALUE)
It would be nice to add processTimers coverage to SimpleDoFnTest to verify this.
Added a test that checks the cleanup timer timestamp for the GlobalWindow case where window max timestamp + allowed lateness exceeds the max allowed timestamp.
This fixes an exception during drain on jobs with GlobalWindows + AllowedLateness > 24h + @OnExpiredWindows callback
Before the fix, the computed state cleanup timer timestamp could be greater than BoundedWindow.TIMESTAMP_MAX_VALUE.
Before the timer is committed, logic in https://github.com/apache/beam/blob/81f35ab62298a2ec9fadeded82461b363b6401db/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/WindmillTimeUtils.java caps the timestamp to BoundedWindow.TIMESTAMP_MAX_VALUE.
When the timer then fires at TIMESTAMP_MAX_VALUE, the check in SimpleParDoFn.java that compares the timer timestamp with earliestAllowableCleanupTime() fails, because the uncapped expected time is later than the time at which the timer actually fired.
See beam/runners/google-cloud-dataflow-java/worker/src/main/java/org/apache/beam/runners/dataflow/worker/SimpleParDoFn.java, line 397 in e598df7.
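For readers who want to see the mismatch in isolation, here is a rough, self-contained sketch using plain `long` milliseconds instead of Beam and Joda types. Everything in it is illustrative: the class name `CleanupTimerSketch` and its methods are hypothetical, the constants are assumed to mirror `BoundedWindow.TIMESTAMP_MAX_VALUE` and the global window's max timestamp (taken here to be one day earlier), and the check is assumed to require that the timer fired no earlier than the earliest allowable cleanup time. The actual code in SimpleParDoFn.java may differ.

```java
import java.util.concurrent.TimeUnit;

public class CleanupTimerSketch {
    // Assumption: mirrors BoundedWindow.TIMESTAMP_MAX_VALUE, expressed in milliseconds.
    public static final long TIMESTAMP_MAX_MILLIS =
        TimeUnit.MICROSECONDS.toMillis(Long.MAX_VALUE);

    // Assumption: the global window's max timestamp is one day before the max timestamp.
    public static final long GLOBAL_WINDOW_MAX_MILLIS =
        TIMESTAMP_MAX_MILLIS - TimeUnit.DAYS.toMillis(1);

    // Pre-fix behaviour: window max timestamp + allowed lateness + 1 ms, with no cap.
    public static long uncappedCleanupTime(long windowMaxMillis, long allowedLatenessMillis) {
        return windowMaxMillis + allowedLatenessMillis + 1L;
    }

    // Post-fix behaviour: the same sum, capped at the max allowed timestamp.
    public static long cappedCleanupTime(long windowMaxMillis, long allowedLatenessMillis) {
        return Math.min(
            uncappedCleanupTime(windowMaxMillis, allowedLatenessMillis), TIMESTAMP_MAX_MILLIS);
    }

    // Hypothetical stand-in for the check: the timer must not fire before the
    // earliest allowable cleanup time.
    public static boolean cleanupCheckPasses(long firedAtMillis, long earliestAllowableMillis) {
        return firedAtMillis >= earliestAllowableMillis;
    }

    public static void main(String[] args) {
        long lateness = TimeUnit.DAYS.toMillis(2); // allowed lateness > 24h
        long firedAt = TIMESTAMP_MAX_MILLIS;       // the runner caps the committed timer

        long before = uncappedCleanupTime(GLOBAL_WINDOW_MAX_MILLIS, lateness);
        long after = cappedCleanupTime(GLOBAL_WINDOW_MAX_MILLIS, lateness);

        System.out.println("check passes before fix: " + cleanupCheckPasses(firedAt, before));
        System.out.println("check passes after fix:  " + cleanupCheckPasses(firedAt, after));
    }
}
```

With an allowed lateness of 24h or less the uncapped sum stays at or below the cap under these assumptions, which is consistent with the failure only showing up for AllowedLateness > 24h.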