Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support save WithCTE for insertRepartitionBeforeWrite #6783

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

ic4y
Copy link
Contributor

@ic4y ic4y commented Oct 28, 2024

🔍 Description

Issue References 🔗

First, I'd like to thank @wForget for the help with this issue.

When using the "save to HDFS" feature, queries ending with an ORDER BY sometimes lose their sort order in the results. Upon investigating the code, I discovered that when using WITH statements and saving SQL results with toDF.write.save, a WithCTE node is generated after the Sort node. This causes the canInsertRepartitionByExpression check to fail, leading to an incorrect Repartition node insertion after the Sort node, which ultimately disrupts the sort order.

However, this issue does not occur when using INSERT INTO TABLE with WithCTE nodes.

The provided unit test can reproduce this issue, but after using toDF.write.save, I am unable to access the complete execution plan to assert whether a Repartition node is present. Therefore, the current test is ineffective.

Hope someone can help figure out how to write this unit test.

Describe Your Solution 🔧

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Types of changes 🔖

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Test Plan 🧪

Behavior Without This Pull Request ⚰️

Behavior With This Pull Request 🎉

Related Unit Tests


Checklist 📝

Be nice. Be informative.

@codecov-commenter
Copy link

codecov-commenter commented Oct 28, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 0.00%. Comparing base (d3520dd) to head (00fc1fa).
Report is 2 commits behind head on master.

Additional details and impacted files
@@          Coverage Diff           @@
##           master   #6783   +/-   ##
======================================
  Coverage    0.00%   0.00%           
======================================
  Files         687     687           
  Lines       42439   42441    +2     
  Branches     5793    5793           
======================================
- Misses      42439   42441    +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pan3793 pan3793 requested a review from ulysses-you November 4, 2024 06:29
Copy link
Contributor

@ulysses-you ulysses-you left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm if tests pass

@ic4y ic4y changed the title [bugfix] Fix Incorrect Insertion of Repartition Node When WithCTE Node Exists Before Sort Node [bugfix] Support save WithCTE for insertRepartitionBeforeWrite Nov 5, 2024
@ic4y ic4y changed the title [bugfix] Support save WithCTE for insertRepartitionBeforeWrite Support save WithCTE for insertRepartitionBeforeWrite Nov 5, 2024
@wForget
Copy link
Member

wForget commented Nov 5, 2024

There is an issue with the logic of check write in base unit test. Please wait #6793

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants