QueryPlan Optimizer aims to run TPCH 100x faster #116

Draft
wants to merge 392 commits into
base: main-23.10-20240617

Conversation

auxten
Member

@auxten auxten commented Oct 3, 2023

Still in a very early stage. This PR is only for tracking progress.

auxten and others added 30 commits August 15, 2023 03:50
* PR_SET_NAME workaround

AWS Lambda (and other virtualized platforms) lacks support for PR_SET_NAME, which causes a blocking exception. Pending an upstream PR or fix in ClickHouse, this patch allows the function to fail harmlessly. The resulting executable has been tested on various platforms without drawbacks and was discussed in ClickHouse issue [29378](ClickHouse/ClickHouse#29378).

$ sed -i '/Cannot set thread name/c\' /ClickHouse/src/Common/setThreadName.cpp
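
The idea behind the workaround, letting thread naming fail without aborting, might be sketched as follows. This is a minimal illustration under stated assumptions, not ClickHouse's actual `setThreadName()` and not the patched result of the `sed` command: `try_set_thread_name` and its `saved_errno` parameter are hypothetical names, and only the Linux `prctl(PR_SET_NAME, ...)` path is covered.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstring>

#if defined(__linux__)
#include <sys/prctl.h>
#endif

// Hypothetical helper (an assumption for illustration, not ClickHouse code).
// Tries to name the calling thread and reports failure through the return
// value instead of throwing, so an unsupported platform cannot block startup.
bool try_set_thread_name(const char * name, int * saved_errno = nullptr)
{
    // Linux thread names hold at most 15 characters plus the terminating NUL.
    constexpr std::size_t max_len = 15;
    if (name == nullptr || std::strlen(name) > max_len)
    {
        if (saved_errno)
            *saved_errno = EINVAL;
        return false;
    }

#if defined(__linux__)
    if (prctl(PR_SET_NAME, name, 0, 0, 0) != 0)
    {
        // The call may be unsupported or not permitted in some sandboxes;
        // record the reason and carry on.
        if (saved_errno)
            *saved_errno = errno;
        return false;
    }
    return true;
#else
    // Other platforms are out of scope for this sketch.
    if (saved_errno)
        *saved_errno = ENOSYS;
    return false;
#endif
}
```

A caller would typically ignore or log a `false` result, since a missing thread name only affects diagnostics.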

* Disable AVX2 support
Proposal to enable S3 features to improve remote file reading performance compared to plain url engine.
Although we lock llvm@15, different GitHub Actions runners may have llvm 15.0.7 or 15.0.7_1.
This causes a lot of ccache misses. Hopefully brew update will keep llvm at the latest version.
    1. Add missing transforms
    2. Disable IQueryPlanStep operator==
    3. virtual void updateOutputStream() throw Exception
    4. Remove ASTSerDerHelper
    1. Throw exception on serialize or deserialize
    2. Fix JoinStep and JoiningTransform
    3. Drop PlanCache
    4. Fix chdbVersion
@auxten auxten marked this pull request as draft October 3, 2023 05:47
@djouallah

djouallah commented Oct 4, 2023

Wait, are you going to build a new query optimizer from scratch?

@auxten
Member Author

auxten commented Oct 4, 2023

> Wait, are you going to build a new query optimizer from scratch?

No, I'm transplanting the optimizer from ByConity, which was forked from ClickHouse v21.8.

@alanpaulkwan

Is there any way to implement this in the main branch of ClickHouse as well? That would be huge.

@auxten
Member Author

auxten commented Oct 4, 2023

> Is there any way to implement this in the main branch of ClickHouse as well? That would be huge.

Perhaps later. Even though it has already taken me months, I am unsure whether this will be successful. 😅

@djouallah

@auxten you can do it!!!

@alexey-milovidov

It makes sense to also send this PR to the main repository. It would be interesting to see whether it passes the tests, although I expect it is going to be difficult...

auxten added 4 commits October 4, 2023 13:32
    1. Fix all ASOF::Inequality to ASOFJoinInequality
    2. Remove Exchange and RemoteExchange stuff
    3. Remove DynamicFilter
    4. Remove ProjectionStep
@alexey-milovidov

One downside: if it is not tested by ClickHouse CI, it will most likely contain a ton of bugs, and having these bugs will lead to reputational risks for ClickHouse. It means - we have to send a PR to the main ClickHouse repository.

@lmangani
Contributor

lmangani commented Oct 8, 2023

> One downside: if it is not tested by ClickHouse CI, it will most likely contain a ton of bugs, and having these bugs will lead to reputational risks for ClickHouse. It means - we have to send a PR to the main ClickHouse repository.

For sure. This will be considered highly experimental, and it won't have mainstream ambitions unless we all agree the results are overwhelmingly positive. On the other hand, as a fork, we can afford to take some early risks the mainline project cannot 😉

@auxten
Member Author

auxten commented Oct 8, 2023

Absolutely agree with @alexey-milovidov. For a data tool, quality is the most important thing. This branch is very, very experimental. I never expected it would draw so much attention.
I'm not even confident that this can run at all. The quality of chdb depends mostly on ClickHouse. If this experiment runs (god bless), I will try to make a PR to the ClickHouse main repo. If I can get it to pass all the ClickHouse tests and get it reviewed (god bless again), then we can say it works.
It may take years, but I am happy working on this.

    1. Remove FinishSortingStep
    2. Fix FilterStep
    3. Fix ProjectionMatchContext
    4. Fix TableScanExecutor::TableScanExecutor
    5. Just keep serialize/deserialize throw Exception
@CLAassistant

CLAassistant commented Dec 3, 2023

CLA assistant check
All committers have signed the CLA.

@alexey-milovidov

@auxten Rebase?

@auxten
Member Author

auxten commented Oct 9, 2024

> @auxten Rebase?

I haven't worked on this for a long time. It's still very far from generally available.
Maybe I should just close this PR.
