-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Enhancement] improve sql digest for massive compound predicates #53207
base: main
Are you sure you want to change the base?
[Enhancement] improve sql digest for massive compound predicates #53207
Conversation
Signed-off-by: Murphy <[email protected]>
exprs.add((SlotRef) x); | ||
} | ||
}); | ||
return "$massive_compounds[" + exprs.stream().map(SlotRef::toSqlImpl) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SlotRefs are query-specific so unstable, suggest to use original SlotRefs(slotref references to table column) and convert original slotrefs into table columns then sort these columns lexicographically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea.
if (pair.first + pair.second >= MASSIVE_COMPOUND_LIMIT) { | ||
// Only record de-duplicated slots if there are too many compounds | ||
Set<SlotRef> exprs = Sets.newHashSet(); | ||
traverse(node, x -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not use node.collectAllSlotRefs()
directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
well, I should use it. but it seems I should improve the performance of that method first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it constructs a lot of intermediate list,, i'm afraid the performance can be terrible
@Override | ||
public String visitCompoundPredicate(CompoundPredicate node, Void context) { | ||
Pair<Integer, Integer> pair = countCompound(node); | ||
if (pair.first + pair.second >= MASSIVE_COMPOUND_LIMIT) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about Expr#flattenPredicate;
[FE Incremental Coverage Report]✅ pass : 33 / 33 (100.00%) file detail
|
Signed-off-by: Murphy <[email protected]>
Quality Gate passedIssues Measures |
[Java-Extensions Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
[BE Incremental Coverage Report]✅ pass : 0 / 0 (0%) |
Why I'm doing:
What I'm doing:
In cases where an OR predicate is dynamically constructed with a fixed column, the SQL digest varies due to a differing number of predicates. To address this, we consolidate extensive compound predicates into a compact format, ensuring consistent SQL digests.
Fixes #issue
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: