-
Notifications
You must be signed in to change notification settings - Fork 169
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Support bitwise aggregate functions #197
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #197 +/- ##
============================================
+ Coverage 33.30% 33.51% +0.20%
Complexity 767 767
============================================
Files 107 107
Lines 35372 35436 +64
Branches 7657 7707 +50
============================================
+ Hits 11782 11877 +95
+ Misses 21144 21076 -68
- Partials 2446 2483 +37 ☔ View full report in Codecov by Sentry. |
@@ -186,6 +186,13 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde { | |||
} | |||
} | |||
|
|||
private def bitwiseAggTypeSupported(dt: DataType): Boolean = { | |||
dt match { | |||
case _: IntegerType => true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Spark supports all Integral
type as input type which includes other types such as long and short. Maybe we can add them later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added all the other IntegralType
"SELECT BIT_AND(col1), BIT_OR(col1), BIT_XOR(col1), col3 FROM test GROUP BY col3", | ||
expectedNumOfCometAggregates) | ||
|
||
checkSparkAnswerAndNumOfAggregates( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add the comments why this test is needed, it looks like the same as one before that, just MIN/COUNT added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments were added. Thanks
private def bitwiseAggTypeSupported(dt: DataType): Boolean = { | ||
dt match { | ||
case _: IntegerType | LongType | ShortType | ByteType => true | ||
case _ => false | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the limitation due to DataFusion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If Spark bit aggregate function supports more types than DataFusion, can you add a comment here and open a ticket at DataFusion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spark bitwise aggregates only supports INTEGRAL
type. If I try other types, Spark throws Exception:
Cannot resolve "bit_and(col1)" due to data type mismatch: Parameter 1 requires the "INTEGRAL" type, however "col1" has the type ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm thanks @huaxingao
Merged, thanks! |
Thank you all! |
Which issue does this PR close?
Closes #.
Support bitwise aggregate functions such as
BitAnd
,BitOr
,BitXor
Rationale for this change
What changes are included in this PR?
How are these changes tested?