Problem
Currently, we execute joins in the order the user wrote them. A poorly chosen order can cause an unnecessary explosion of intermediate results and, with it, a significant performance penalty.
Solution
We should automatically optimize the join ordering. Ideally, we have cardinality estimates for this from Parquet files or a metadata store, but we should also try to optimize join ordering without meaningful statistics. Possible approaches here include optimization based on partition counts or equivalence sets (https://blobs.duckdb.org/papers/tom-ebergen-msc-thesis-join-order-optimization-with-almost-no-statistics.pdf).
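To make the effect of join ordering concrete, here is a minimal sketch of a greedy, left-deep ordering heuristic driven by cardinality estimates. It is only an illustration, not the proposed implementation and not the algorithm from the linked thesis. The function names (`join_size`, `intermediate_sizes`, `greedy_join_order`), the table names, the row counts, and the selectivities are all hypothetical, and the size model (product of row counts times predicate selectivities) is a deliberate simplification.

```python
from itertools import combinations


def join_size(current_rows, joined, candidate, rows, selectivity):
    """Estimate the result size of joining `candidate` onto the `joined` set."""
    size = current_rows * rows[candidate]
    for other in joined:
        # A missing predicate means a cross product (selectivity 1.0).
        size *= selectivity.get(frozenset((other, candidate)), 1.0)
    return size


def intermediate_sizes(order, rows, selectivity):
    """Estimated size of each intermediate result for a left-deep join order."""
    current, sizes = rows[order[0]], []
    for i in range(1, len(order)):
        current = join_size(current, order[:i], order[i], rows, selectivity)
        sizes.append(current)
    return sizes


def greedy_join_order(rows, selectivity):
    """Pick the cheapest starting pair, then always add the cheapest next table.

    Assumes at least two relations.
    """
    first, second = min(
        combinations(sorted(rows), 2),
        key=lambda p: join_size(rows[p[0]], [p[0]], p[1], rows, selectivity),
    )
    order = [first, second]
    current = join_size(rows[first], [first], second, rows, selectivity)
    remaining = set(rows) - {first, second}
    while remaining:
        best = min(
            sorted(remaining),
            key=lambda r: join_size(current, order, r, rows, selectivity),
        )
        current = join_size(current, order, best, rows, selectivity)
        order.append(best)
        remaining.remove(best)
    return order


# Hypothetical estimates, e.g. as they might come from Parquet metadata.
rows = {"orders": 1_000_000, "customers": 10_000, "nations": 25}
selectivity = {
    frozenset(("orders", "customers")): 1 / 10_000,
    frozenset(("customers", "nations")): 1 / 25,
}

user_order = ["orders", "nations", "customers"]  # order as written by the user
optimized = greedy_join_order(rows, selectivity)

print("user order:     ", user_order, intermediate_sizes(user_order, rows, selectivity))
print("optimized order:", optimized, intermediate_sizes(optimized, rows, selectivity))
```

In this made-up example the user's order joins `orders` with `nations` first, which has no join predicate and therefore produces a cross product, while the greedy order defers the largest table to the end. A greedy heuristic like this can still miss the best plan and depends entirely on the quality of the estimates, which is why approaches that work without meaningful statistics remain relevant.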
Previous work
We have already experimented with this, but never reached a working solution that we could merge (e.g., #809).