Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Fix integer overflow in date_parser #529

Merged
merged 1 commit into from
Jun 7, 2024
Merged

Conversation

eejbyfeldt
Copy link
Contributor

Which issue does this PR close?

This resolves one of the overflows discussed in #481

Rationale for this change

The date_parser was introduced in #383 and is mostly a direct port of code in Spark. In Spark the code uses Scala which has defined integer overflow as two's complement wrapping.

What changes are included in this PR?

The proposed fixed is to use std::num::Wrapping to get the same wrapping behavior in rust. The overflown value will still be rejected in a later check that uses current_segment_digits so allowing the overflow does not lead to correctness issues.

How are these changes tested?

New and existing unit tests.

The date_parser was introduced in apache#383 and is mostly a direct port of
code in Spark. Since the code uses the JVM it has defined integer
overflow as wrapping. The proposed fixed is to use std::num::Wrapping to
get the same wrapping behavior in rust.  The overflown value will still
be disgarded in a later check that uses `current_segment_digits` so
allowing the overflow does not lead to correctness issues.

This resolves one of the overflows discussed in apache#481
@eejbyfeldt eejbyfeldt changed the title Fix overflow in date_parser bug: Fix integer overflow in date_parser Jun 6, 2024
Comment on lines +1840 to +1841
// Overflows i32
Some("-4607172990231812908"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

@@ -19,6 +19,7 @@ use std::{
any::Any,
fmt::{Debug, Display, Formatter},
hash::{Hash, Hasher},
num::Wrapping,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware of Wrapping. TIL.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder if this is also faster since there is no overflow check?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we usually use some methods like wrapping_add, etc.

@codecov-commenter
Copy link

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 54.68%. Comparing base (7ab37eb) to head (e2f738a).
Report is 5 commits behind head on main.

Additional details and impacted files
@@              Coverage Diff              @@
##               main     #529       +/-   ##
=============================================
+ Coverage     34.23%   54.68%   +20.45%     
+ Complexity      803      795        -8     
=============================================
  Files           105      102        -3     
  Lines         38483     9688    -28795     
  Branches       8560     1845     -6715     
=============================================
- Hits          13173     5298     -7875     
+ Misses        22555     3433    -19122     
+ Partials       2755      957     -1798     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -19,6 +19,7 @@ use std::{
any::Any,
fmt::{Debug, Display, Formatter},
hash::{Hash, Hasher},
num::Wrapping,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder if this is also faster since there is no overflow check?

@viirya viirya changed the title bug: Fix integer overflow in date_parser fix: Fix integer overflow in date_parser Jun 6, 2024
Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @eejbyfeldt.

Thanks for the review @parthchandra

@andygrove andygrove merged commit c3e27c0 into apache:main Jun 7, 2024
45 checks passed
himadripal pushed a commit to himadripal/datafusion-comet that referenced this pull request Sep 7, 2024
The date_parser was introduced in apache#383 and is mostly a direct port of
code in Spark. Since the code uses the JVM it has defined integer
overflow as wrapping. The proposed fixed is to use std::num::Wrapping to
get the same wrapping behavior in rust.  The overflown value will still
be disgarded in a later check that uses `current_segment_digits` so
allowing the overflow does not lead to correctness issues.

This resolves one of the overflows discussed in apache#481
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants