Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Fallback to Spark for LIKE with custom escape character #478

Merged
merged 3 commits into from
May 28, 2024

Conversation

sujithjay
Copy link
Contributor

@sujithjay sujithjay commented May 27, 2024

Fallback to Spark for LIKE with custom escape character because currently, LIKE with custom escape character produces incorrect results.

Which issue does this PR close?

Closes #462

Rationale for this change

What changes are included in this PR?

An if clause is introduced for LIKE in exprToProtoInternal in QueryPlanSerde.scala. It ensures that we fallback to Spark in case LIKE has a custom escape character.

How are these changes tested?

A test was added to org.apache.comet.CometExpressionSuite.

Currently, LIKE with custom escape character produces incorrect results.
@andygrove andygrove changed the title Fallback to Spark for LIKE with custom escape character fix: Fallback to Spark for LIKE with custom escape character May 28, 2024
@@ -981,7 +981,7 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde {
None
}

case Like(left, right, _) =>
case Like(left, right, escapeChar) if escapeChar == '\\' =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tihnk it would be better to have a check for escape char within the match and then call withInfo if the escape char is not \\ so that we can give the user a specific error such as escape char not supported otherwise I think we fall through to the catchall logic and the user will get an error like is not supported, which could be confusing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andygrove I have made this change in bfac08a. Please review.

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you @sujithjay


// Filter column having values that include underscores
val queryDefaultEscape = sql("select id from names where name like '%\\_%'")
checkAnswer(queryDefaultEscape, Row(2) :: Row(3) :: Nil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking, checkSparkAnswerAndOperator() to be used to make sure native comet is actually executed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. @sujithjay could you make that change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in 496e885. Please review @kazuyukitanimura @andygrove.

Copy link
Contributor

@kazuyukitanimura kazuyukitanimura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Pending CI

@andygrove andygrove merged commit fd90a28 into apache:main May 28, 2024
43 checks passed
@sujithjay sujithjay deleted the sujithjay/like-custom-escape branch May 29, 2024 00:25
himadripal pushed a commit to himadripal/datafusion-comet that referenced this pull request Sep 7, 2024
)

* Fallback to Spark for LIKE with custom escape character

Currently, LIKE with custom escape character produces incorrect results.

* For custom escape character, provide user with specific info message

* Test case for default escape char with checkSparkAnswerAndOperator

(cherry picked from commit fd90a28)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: LIKE with custom escape char produces incorrect results
4 participants