feat: Add support of TakeOrderedAndProjectExec in Comet #88
LGTM, except one minor comment.
if isCometNative(s.child) && isCometOperatorEnabled(conf, "takeOrderedAndProjectExec")
&& isCometShuffleEnabled(conf) &&
CometTakeOrderedAndProjectExec.isSupported(s.projectList, s.sortOrder, s.child) =>
// TODO: support offset for Spark 3.4
One minor comment: this might be a correctness issue for Spark 3.4 with offset > 0, as we simply ignore the possible offset field.
One possible way to address that would be to define a getOffset method in ShimCometSparkSessionExtensions to access the offset field via reflection.
Of course, it should be addressed in another PR, and for the other xxLimitExec operators too.
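The reflection idea in this comment could look roughly like the sketch below. It is only an illustration: `PlanWithOffset`, `PlanWithoutOffset`, and `OffsetShim` are hypothetical stand-ins rather than Spark or Comet classes, and a real shim would take a Spark plan node instead of `AnyRef`. It assumes the newer plan node exposes a public zero-argument `offset` accessor and that a missing accessor means no offset.

```scala
import scala.util.Try

// Hypothetical stand-ins: one plan node shaped like Spark 3.4 (has `offset`),
// one shaped like older versions (no `offset`). Not real Spark classes.
case class PlanWithOffset(limit: Int, offset: Int)
case class PlanWithoutOffset(limit: Int)

object OffsetShim {
  // Look up a public zero-arg `offset` method by name at runtime.
  // If the method does not exist (older Spark), treat the offset as 0.
  def getOffset(plan: AnyRef): Int =
    Try(plan.getClass.getMethod("offset").invoke(plan).asInstanceOf[Int])
      .getOrElse(0)
}
```

With this shape, the same calling code compiles against both Spark versions, since `offset` is never referenced statically.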
We can probably fall back to Spark when offset > 0?
Yeah, we need to fall back for offset > 0.
Yes. That's a minimal fix. However, if we can access offset, we can implement the offset logic in Comet as well.
Let me do the fallback in this PR. I will address the offset > 0 case in follow-ups.
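As a rough illustration of the fallback agreed on here, the guard could be expressed as below. The types `TakeOrdered`, `Decision`, and `FallbackRule` are hypothetical and simplified for the example; the actual rule in the PR matches on Spark plan nodes and checks more conditions (native child, shuffle enabled, supported expressions).

```scala
// Hypothetical, simplified model of the plan node and the planner's choice.
final case class TakeOrdered(limit: Int, offset: Int)

sealed trait Decision
case object UseComet extends Decision
case object FallBackToSpark extends Decision

object FallbackRule {
  // Convert to the native operator only when there is no offset to honor;
  // any offset > 0 keeps the original Spark operator so results stay correct.
  def decide(plan: TakeOrdered, cometEnabled: Boolean): Decision = plan match {
    case TakeOrdered(_, offset) if cometEnabled && offset == 0 => UseComet
    case _ => FallBackToSpark
  }
}
```

Keeping the offset check in the guard means an unsupported plan is left untouched instead of being converted and silently dropping the offset.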
I was wondering why there is a test failure, as I didn't see it locally. But after rebased with
Found the cause. We need
LGTM
Merged. Thanks.
Which issue does this PR close?
Closes #89.
Rationale for this change
TakeOrderedAndProjectExec is a common operator in Spark. In Comet, we should support it to increase query coverage.
What changes are included in this PR?
How are these changes tested?