-
Notifications
You must be signed in to change notification settings - Fork 166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
build: Add CI for TPCDS queries #99
Conversation
I've seen this locally too. We probably should add it in here. |
Yea, added it to exclude list. |
@@ -26,7 +26,8 @@ import org.apache.comet.CometConf | |||
class CometTPCDSQuerySuite | |||
extends { | |||
// This is private in `TPCDSBase`. | |||
val excludedTpcdsQueries: Seq[String] = Seq("q34", "q64") | |||
val excludedTpcdsQueries: Seq[String] = | |||
Seq("q34", "q66", "q64", "q71", "q88", "q90", "q96") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is weird that we have more queries that are not same with Spark, compared with what we have internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to figure out what changes are missed or messed during open sourcing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we know what are the error messages for these additional queries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the additional queries, looks like their outputs are different to Spark actually. We can see the difference in CI log.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the failed CI run https://github.com/apache/arrow-datafusion-comet/actions/runs/8025460291/job/21926164052.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example:
- q96 *** FAILED ***
java.lang.Exception: Expected "[0]", but got "[888]" Result did not match
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- q90 *** FAILED ***
java.lang.Exception: Expected "[NULL]", but got "[0.58933333333333333333]" Result did not match
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- q88 *** FAILED ***
java.lang.Exception: Expected "[0 0 0 0 0 0 0 0]", but got "[2358 4664 4828 7447 6997 3886 4073 4923]" Result did not match
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- q71 *** FAILED ***
java.lang.Exception: Expected "[]", but got "[10016002 corpamalgamalg #x 19 7 23183.08
10005002 scholarunivamalg #x 8 8 17109.63
8014006 edu packmaxi #x 9 53 16420.75
3003002 exportiexporti #x 17 44 16337.23
3001001 amalgexporti #x 17 46 16088.53
4001001 amalgedu pack #x 9 28 15557.83
...
The actual (wrong) result is pretty long, I ignore it above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- q66 *** FAILED ***
java.lang.Exception: Expected "[]", but got "[Just good amou 933435 Midway Williamson County TN United States DHL,BARIAN 2001 9246702.65 4973596.11 8243830.24 11536095.04 8848441.04 6589907.46 14946163.85 24260830.90 23156311.42 21897174.69 33046376.63 24591398.12 9.906102353136 5.328272573880 8.831713231238
...
Same here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Merged, thanks! |
Which issue does this PR close?
Closes #101.
Rationale for this change
We need a CI pipeline that could help us verify Comet query correctness.
What changes are included in this PR?
This patch enables
CometTPCDSQuerySuite
in CI pipeline that runs TPCDS queries with Comet enabled and checks the results with Spark.How are these changes tested?