[EPIC] Add Spark expression coverage #240
Comments
The list of Spark expressions can be found at https://spark.apache.org/docs/latest/api/sql/index.html
Since this is an umbrella issue, if you are going to make a list of frequently used expressions, maybe you can add the hash expressions (for which I created #205 earlier) as one of the categories.
I added hash expressions. Feel free to edit the expression list in the issue description to add more expressions.
I was thinking to add Spark OneRowRelation support. Once it's done, we can just download all the queries from https://spark.apache.org/docs/latest/api/sql/index.html, run them automatically, and see the coverage. How does that sound?
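A minimal sketch of that kind of automated coverage run, assuming Comet is on the classpath and enabled through the plugin and config names from its README (spark.plugins=org.apache.spark.CometPlugin, spark.comet.enabled, spark.comet.exec.enabled), and using a "Comet" substring check on the executed plan as a rough heuristic for native execution. The object name and the hard-coded query list are placeholders for illustration:

```scala
import scala.util.Try

import org.apache.spark.sql.SparkSession

object CometCoverageRunner {

  // Heuristic (an assumption, not an official Comet API): treat a query as
  // "covered" if the executed physical plan mentions a Comet operator.
  def isNativelyExecuted(spark: SparkSession, query: String): Boolean = {
    val plan = spark.sql(query).queryExecution.executedPlan.toString
    plan.contains("Comet")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .config("spark.plugins", "org.apache.spark.CometPlugin")
      .config("spark.comet.enabled", "true")
      .config("spark.comet.exec.enabled", "true")
      .getOrCreate()

    // In the real tool these queries would be scraped from the SQL reference
    // or taken from each expression's documented examples.
    val queries = Seq("SELECT abs(-1)", "SELECT md5('Spark')")

    // Queries that fail to plan or run simply count as not covered.
    val results = queries.map(q => q -> Try(isNativelyExecuted(spark, q)).getOrElse(false))
    val covered = results.count(_._2)
    println(s"Covered $covered of ${queries.size} example queries")

    spark.stop()
  }
}
```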
I am adding RowToColumnar support in #206. Once it's done, I think it's trivial to add support for RDDScanExec (which is what OneRowRelation is translated to as a PhysicalPlan).
That sounds like a great idea.
Another potential solution is to transform OneRowRelation and use DF.
Of course, that would be more performant and straightforward.
It sounds good if we can automatically test expression coverage, although I'm not sure if it is easy to do.
I have an idea for how to do that and am planning to create a draft soon. It will be easier if Comet supports OneRowRelation, but even without it there is a workaround. Once all built-in function queries are done, there should be some HTML report with the total results.
Hmm, I think you can get an expression's example usage directly from its annotated class.
Great idea. I think I'm able to fetch it as done here: https://github.com/apache/spark/blob/6fdf9c9df545ed50acbce1ec874625baf03d4d2e/sql/core/src/test/scala/org/apache/spark/sql/expressions/ExpressionInfoSuite.scala#L166
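A minimal sketch of harvesting those annotated examples in the same spirit as the linked ExpressionInfoSuite: walk the session's FunctionRegistry and read each ExpressionInfo's Examples block, which comes from the expression's @ExpressionDescription annotation. The object name and the line-based parsing (which skips multi-line examples) are simplifications for illustration:

```scala
import org.apache.spark.sql.SparkSession

object ExpressionExampleHarvester {

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("harvest-expression-examples")
      .master("local[1]")
      .getOrCreate()

    val registry = spark.sessionState.functionRegistry
    val catalog  = spark.sessionState.catalog

    // Each ExpressionInfo carries the Examples block from the expression's
    // @ExpressionDescription annotation; example queries start with "> SELECT".
    val examples: Seq[(String, String)] = registry.listFunction().flatMap { funcId =>
      val info = catalog.lookupFunctionInfo(funcId)
      info.getExamples
        .split("\n")
        .map(_.trim)
        .filter(_.startsWith("> SELECT"))
        .map(sql => funcId.funcName -> sql.stripPrefix("> ").stripSuffix(";"))
    }

    // Print a small sample; the coverage runner would execute each of these.
    examples.take(10).foreach { case (name, sql) => println(s"$name: $sql") }
    spark.stop()
  }
}
```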
What is the problem the feature request solves?
This is an umbrella ticket for the list of unsupported Spark expressions. It is not necessarily a comprehensive list of all Spark expressions, because there are too many; we can start with the frequently used ones.
Hash expressions #205 (crypto_expressions includes blake3, which cannot be built on the Mac platform, and we cannot separately enable only md5 in DataFusion)
Datetime expressions
Conditional expressions
Arithmetic expressions
Bitwise expressions
String expressions
Math expressions
Predicates
Null expressions
Aggregate expressions
Others
...
Describe the potential solution
No response
Additional context
No response