Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ArithmeticException in toJavaDuration. #1950

Merged
merged 4 commits into from
Dec 14, 2023
Merged

Conversation

chronos-tachyon
Copy link
Contributor

@chronos-tachyon chronos-tachyon commented Dec 6, 2023

Fixes #1938 aka SDK-1319.

What was changed

Eliminate risk of ArithmeticException due to long overflow in toJavaDuration.

Why?

Make the API exception-safe for extreme duration values, especially for values which are legal in the protobuf wire format but not for Java's Duration type.

Checklist

  1. Closes GitHub Issue #1938.

  2. How was this tested: created new tests in temporal-sdk module that captured existing behavior with small values to prove that the visible behavior was unchanged for already-legal values.

@chronos-tachyon chronos-tachyon requested a review from a team as a code owner December 6, 2023 00:44
@CLAassistant
Copy link

CLAassistant commented Dec 6, 2023

CLA assistant check
All committers have signed the CLA.

@CLAassistant
Copy link

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@Quinn-With-Two-Ns
Copy link
Contributor

One thing I called out in the issue, but may not have mentioned is we should maintain the precision of the conversion.

We should probably convert the seconds directly and truncate the nanoseconds to millisecond precision to maintain consistency

changing the conversion precision might subtly break users tests using the time skipping test server.

@Quinn-With-Two-Ns
Copy link
Contributor

For the CI failure you just need to rebase

@cretz
Copy link
Member

cretz commented Dec 6, 2023

I am not sure this change is backwards compatible. While I understand the reasoning for preventing overflow, the loss of precision I think is intentional or at least safe. No use of proto duration by the Java SDK or the Temporal server is expected to be more precise than the millisecond level.

Durations.toMillis rounds. What if my workflow run timeout is 1.0005 seconds and I have this code:

if (Workflow.getInfo().getWorkflowRunTimeout().toMillis() > 1000) {
    myActivities.doThing();
}

This PR changes that conditional I think. Now, I admit this is a contrived example and I'm not that worried about this incompatibility, but it should be known.

I think we should stick with millisecond precision for all Temporal duration use (this doesn't apply to users' use of protos correct?). Can we just protect against overflow (convert to max long?) but otherwise leave the code as is?

@chronos-tachyon
Copy link
Contributor Author

One thing I called out in the issue, but may not have mentioned is we should maintain the precision of the conversion.

We should probably convert the seconds directly and truncate the nanoseconds to millisecond precision to maintain consistency

changing the conversion precision might subtly break users tests using the time skipping test server.

Whoops, I overlooked that. Fixed.

@chronos-tachyon
Copy link
Contributor Author

I think we should stick with millisecond precision for all Temporal duration use (this doesn't apply to users' use of protos correct?). Can we just protect against overflow (convert to max long?) but otherwise leave the code as is?

I'm now doing the rounding myself. Any use of toAnything() is going to truncate to 64-bit precision or less, which if we ask for millis means that we're losing about 10 bits off the top of the 64-bit seconds count.

Copy link
Member

@cretz cretz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To confirm, with the current changes, the behavior is exactly as it is today for normal positive durations that don't overflow and only handles overflow but not the loss of precision? The PR title could be confusing if we removed the fix-loss-of-precision requirement, which is fine.

May want a test or two.

@chronos-tachyon chronos-tachyon changed the title Fix loss of precision in Java <-> Proto Duration conversions. Fix ArithmeticException in toJavaDuration. Dec 13, 2023
@chronos-tachyon chronos-tachyon marked this pull request as draft December 13, 2023 19:55
public class ProtobufTimeUtilsTest {

@Parameters
public static Collection<Object[]> data() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we confirm all the tests that didn't throw have the same behaviour before and after your change?

Copy link
Contributor Author

@chronos-tachyon chronos-tachyon Dec 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I specifically tested that in the first commit after the rebase + force-push. Commit c7854d1e641e contains no changes to the logic (just the Objects.isNull cosmetic change) and passes all tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happily, the belief that Durations.toMillis does round to nearest instead of truncation was mistaken, as that would have meant that there was an asymmetry in the conversion. Both Duration.toMillis (Java) and Durations.toMillis (protobuf) truncate.

@chronos-tachyon chronos-tachyon marked this pull request as ready for review December 14, 2023 18:10
@chronos-tachyon
Copy link
Contributor Author

To confirm, with the current changes, the behavior is exactly as it is today for normal positive durations that don't overflow and only handles overflow but not the loss of precision? The PR title could be confusing if we removed the fix-loss-of-precision requirement, which is fine.

May want a test or two.

It now has tests and I've retitled the pull request + interactive rebased the commit messages to better reflect what's going on.

Copy link
Contributor

@Quinn-With-Two-Ns Quinn-With-Two-Ns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, first commit 🎉

@chronos-tachyon
Copy link
Contributor Author

Durations.toMillis rounds. What if my workflow run timeout is 1.0005 seconds and I have this code:

if (Workflow.getInfo().getWorkflowRunTimeout().toMillis() > 1000) {
    myActivities.doThing();
}

This PR changes that conditional I think.

This turns out to be factually false, but the documentation is confusingly worded. The Durations.toMillis JavaDoc states: "The result will be rounded towards 0 to the nearest millisecond. E.g., if the duration represents -1 nanosecond, it will be rounded to 0." The word "nearest" is causing confusion here, I think.

@chronos-tachyon chronos-tachyon merged commit 7ee8f96 into master Dec 14, 2023
8 checks passed
@chronos-tachyon chronos-tachyon deleted the dking/SDK-1319 branch December 14, 2023 22:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

toProtoDuration may throw an ArithmeticException if Duration is too large
4 participants