[GLUTEN-7028][CH][Part-2] Refactor: Move MergeTree related UT to mergetree module #7279

Merged
baibaichen merged 7 commits into apache:main from feature/MergeTreeUT
Sep 19, 2024

Conversation

baibaichen
Contributor

@baibaichen baibaichen commented Sep 19, 2024

What changes were proposed in this pull request?

This PR does two things:

  1. Move MergeTree-related UTs to the mergetree module, so that they can be run by package name: org.apache.gluten.execution.mergetree
  2. Introduce the implicit class CHConf.GlutenCHConf, which simplifies setting configs; see the following picture.

image
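Since the screenshot is not reproduced here, the following is a minimal sketch of how an implicit class can shorten config setup. It is not the PR's actual code: `SimpleConf` is a hypothetical stand-in for Spark's `SparkConf`, and `CHConfSketch`, `RichCHConf` and `setCHConfig` are illustrative names chosen for this example. Only the key prefix and the `runtime_config.path` key are taken from the PR text.

```scala
import scala.collection.mutable

// Hypothetical stand-in for SparkConf so the sketch runs without Spark on the classpath.
final class SimpleConf {
  private val entries = mutable.LinkedHashMap.empty[String, String]

  // Store a key/value pair and return this conf so calls can be chained.
  def set(key: String, value: String): SimpleConf = {
    entries(key) = value
    this
  }

  def get(key: String): Option[String] = entries.get(key)
}

object CHConfSketch {
  // Prefix quoted in the PR; the "runtime_config." segment comes from the key the PR mentions.
  val ConfPrefix = "spark.gluten.sql.columnar.backend.ch."
  val RuntimeConfigPrefix: String = ConfPrefix + "runtime_config."

  // Importing CHConfSketch._ adds setCHConfig to every SimpleConf (illustrative name).
  implicit class RichCHConf(conf: SimpleConf) {
    def setCHConfig(key: String, value: String): SimpleConf =
      conf.set(RuntimeConfigPrefix + key, value)
  }
}
```

With `import CHConfSketch._` in scope, `new SimpleConf().setCHConfig("path", "/data")` stores the value under the full key `spark.gluten.sql.columnar.backend.ch.runtime_config.path`, so test suites no longer need to spell out the long prefix.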

Other

  1. Set "spark.gluten.sql.columnar.backend.ch.runtime_config.path" to "/data" for GlutenClickHouseMergeTreeCacheDataSuite, GlutenClickHouseMergeTreeWriteOnHDFSSuite, GlutenClickHouseMergeTreeWriteOnHDFSWithRocksDBMetaSuite and GlutenClickHouseMergeTreeWriteOnS3Suite. Previously these suites used the default value '/', which is bad for local testing.

(Fixes: #7028)

How was this patch tested?

Using existing UTs


Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:


Run Gluten Clickhouse CI

@baibaichen baibaichen marked this pull request as ready for review September 19, 2024 07:07

Run Gluten Clickhouse CI

@baibaichen baibaichen changed the title from "[CH] Minor: Refactor UT" to "[GLUTEN-7028][CH][Part-2] Refactor UTs" Sep 19, 2024

#7028

@baibaichen baibaichen changed the title from "[GLUTEN-7028][CH][Part-2] Refactor UTs" to "[GLUTEN-7028][CH][Part-2] Refactor: Move MergeTree related UT to mergetree module" Sep 19, 2024

Run Gluten Clickhouse CI

 - Use CHConf
 - use CHConf.prefixOf() instead of "spark.gluten.sql.columnar.backend.ch."
 - settingsKey => runtimeSettings
 - configKey => runtimeConfig
 - CH => CONF_PREFIX
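The renames listed above suggest small key-building helpers. The sketch below shows one way such helpers could look; it is an assumption, not the code from the commit. The object name `CHKeys` is hypothetical, the signatures are guessed from the names in the list, and the `runtime_settings.` segment is assumed by analogy with the `runtime_config.` segment that the PR quotes.

```scala
object CHKeys {
  // Prefix quoted in the commit notes (CH => CONF_PREFIX).
  val ConfPrefix = "spark.gluten.sql.columnar.backend.ch."

  // Build a full key from a short suffix instead of repeating the prefix literal.
  def prefixOf(key: String): String = ConfPrefix + key

  // "runtime_config." appears in the key quoted in this PR.
  def runtimeConfig(key: String): String = prefixOf("runtime_config." + key)

  // "runtime_settings." is an assumption made by analogy with runtime_config.
  def runtimeSettings(key: String): String = prefixOf("runtime_settings." + key)
}
```

Under these assumptions, `CHKeys.runtimeConfig("path")` yields the same key that the PR description sets to "/data" for the HDFS and S3 suites.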

Run Gluten Clickhouse CI

@baibaichen baibaichen merged commit 4e30ed1 into apache:main Sep 19, 2024
6 checks passed
@baibaichen baibaichen deleted the feature/MergeTreeUT branch September 19, 2024 14:23
@GlutenPerfBot
Contributor

===== Performance report for TPCDS SF2000 with Velox backend, for reference only ====

query log/native_master_09_19_2024_time.csv log/native_master_09_18_2024_f1f6c2cf75_time.csv difference percentage
q1 13.43 13.63 0.198 101.47%
q2 15.77 14.78 -0.992 93.71%
q3 5.18 4.50 -0.683 86.82%
q4 71.65 70.37 -1.282 98.21%
q5 8.45 10.13 1.679 119.87%
q6 2.07 4.37 2.304 211.35%
q7 7.30 6.11 -1.190 83.70%
q8 3.31 5.04 1.740 152.63%
q9 24.61 23.48 -1.135 95.39%
q10 9.33 8.87 -0.456 95.11%
q11 37.83 37.46 -0.365 99.03%
q12 1.40 1.37 -0.024 98.27%
q13 6.49 6.68 0.193 102.97%
q14a 47.99 45.57 -2.420 94.96%
q14b 40.67 42.62 1.951 104.80%
q15 2.65 3.30 0.647 124.42%
q16 48.44 47.73 -0.715 98.52%
q17 4.87 4.66 -0.208 95.72%
q18 7.04 6.85 -0.191 97.29%
q19 2.12 2.21 0.087 104.11%
q20 1.48 1.86 0.376 125.44%
q21 1.13 1.03 -0.107 90.53%
q22 7.85 7.65 -0.195 97.51%
q23a 104.40 105.75 1.347 101.29%
q23b 128.59 127.24 -1.354 98.95%
q24a 104.00 114.90 10.906 110.49%
q24b 106.38 107.02 0.638 100.60%
q25 4.27 4.27 0.001 100.02%
q26 3.94 4.01 0.065 101.64%
q27 4.97 4.82 -0.149 97.00%
q28 30.99 32.69 1.707 105.51%
q29 10.75 12.68 1.932 117.97%
q30 4.80 4.55 -0.248 94.84%
q31 7.15 8.59 1.440 120.13%
q32 1.22 1.16 -0.057 95.34%
q33 4.24 4.15 -0.090 97.87%
q34 3.90 3.72 -0.186 95.24%
q35 7.96 7.39 -0.579 92.73%
q36 5.39 4.84 -0.553 89.75%
q37 6.01 4.88 -1.124 81.28%
q38 14.19 13.86 -0.331 97.67%
q39a 3.14 3.04 -0.100 96.82%
q39b 2.90 3.25 0.349 112.03%
q40 4.06 3.96 -0.102 97.50%
q41 0.64 0.57 -0.063 90.03%
q42 0.96 0.84 -0.122 87.36%
q43 4.54 4.29 -0.251 94.47%
q44 9.23 8.86 -0.375 95.94%
q45 3.31 3.16 -0.144 95.64%
q46 3.75 4.18 0.428 111.42%
q47 17.70 17.32 -0.376 97.87%
q48 4.84 4.85 0.012 100.25%
q49 7.38 6.89 -0.493 93.31%
q50 21.55 21.45 -0.104 99.52%
q51 9.70 9.18 -0.519 94.64%
q52 1.06 1.05 -0.009 99.15%
q53 2.17 2.10 -0.061 97.18%
q54 3.75 3.65 -0.095 97.45%
q55 1.04 1.05 0.007 100.66%
q56 4.10 4.06 -0.039 99.05%
q57 10.44 10.78 0.335 103.21%
q58 2.52 2.44 -0.077 96.95%
q59 10.55 10.19 -0.365 96.54%
q60 4.06 4.00 -0.062 98.47%
q61 3.94 4.18 0.239 106.07%
q62 4.48 4.58 0.093 102.08%
q63 2.16 2.14 -0.020 99.06%
q64 62.97 63.06 0.090 100.14%
q65 18.05 16.98 -1.069 94.08%
q66 4.41 4.49 0.084 101.90%
q67 396.63 383.93 -12.697 96.80%
q68 4.07 3.87 -0.200 95.07%
q69 5.47 8.56 3.088 156.46%
q70 12.82 10.03 -2.796 78.20%
q71 2.33 2.28 -0.048 97.94%
q72 213.99 215.06 1.070 100.50%
q73 2.27 2.23 -0.049 97.86%
q74 22.88 23.21 0.332 101.45%
q75 26.57 26.62 0.047 100.18%
q76 12.76 12.34 -0.413 96.76%
q77 2.24 2.04 -0.190 91.48%
q78 49.17 49.72 0.547 101.11%
q79 3.87 3.80 -0.070 98.19%
q80 11.10 11.35 0.246 102.22%
q81 4.47 4.51 0.036 100.81%
q82 6.53 6.63 0.107 101.63%
q83 1.50 1.63 0.130 108.71%
q84 2.83 2.79 -0.040 98.58%
q85 7.41 7.39 -0.025 99.66%
q86 4.15 4.16 0.004 100.10%
q87 13.95 13.32 -0.639 95.42%
q88 16.79 17.30 0.510 103.04%
q89 3.13 3.15 0.020 100.63%
q90 2.80 2.70 -0.101 96.38%
q91 2.37 1.90 -0.475 79.98%
q92 1.32 1.28 -0.037 97.17%
q93 40.66 45.22 4.557 111.21%
q94 26.28 26.35 0.064 100.24%
q95 89.68 90.73 1.050 101.17%
q96 2.70 2.70 -0.001 99.97%
q97 18.01 17.66 -0.344 98.09%
q98 1.86 2.00 0.141 107.59%
q99 10.95 11.12 0.171 101.56%
total 2177.18 2180.94 3.761 100.17%
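
The report does not define its last two columns. Judging from the rows, "difference" appears to be the second log's time minus the first, and "percentage" the second divided by the first, so values above 100% mean the second log's time is larger. The sketch below encodes that reading; it is an inference from the numbers, not the bot's actual code, and `PerfColumns` is a hypothetical name.

```scala
object PerfColumns {
  // Assumed definition: time in the second log minus time in the first log.
  def difference(first: Double, second: Double): Double = second - first

  // Assumed definition: second log's time as a percentage of the first log's time.
  def percentage(first: Double, second: Double): Double = second / first * 100.0
}
```

Applied to the total row (2177.18 and 2180.94), these functions agree with the reported 3.761 and 100.17% up to the rounding of the displayed inputs.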

@GlutenPerfBot
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_master_09_19_2024_time.csv log/native_master_09_18_2024_f1f6c2cf75_time.csv difference percentage
q1 52.97 52.97 -0.003 99.99%
q2 29.76 28.49 -1.268 95.74%
q3 53.52 53.10 -0.424 99.21%
q4 41.89 41.00 -0.883 97.89%
q5 105.69 105.64 -0.050 99.95%
q6 12.00 13.43 1.437 111.98%
q7 112.13 112.97 0.840 100.75%
q8 116.72 118.64 1.921 101.65%
q9 172.75 172.90 0.150 100.09%
q10 67.80 67.13 -0.672 99.01%
q11 27.88 26.03 -1.853 93.35%
q12 31.38 31.76 0.382 101.22%
q13 53.54 52.67 -0.873 98.37%
q14 24.32 24.71 0.392 101.61%
q15 50.33 48.45 -1.887 96.25%
q16 17.24 17.28 0.037 100.21%
q17 126.55 127.91 1.359 101.07%
q18 199.39 200.11 0.727 100.36%
q19 28.16 29.27 1.105 103.93%
q20 42.03 44.53 2.494 105.93%
q21 328.91 337.10 8.188 102.49%
q22 15.72 15.78 0.053 100.34%
total 1710.66 1721.84 11.176 100.65%

sharkdtu pushed a commit to sharkdtu/gluten that referenced this pull request Nov 11, 2024
…etree module (apache#7279)

* Add CHConf

* Move MergeTree related UT to mergetree module

* fix scala stye

* spark32 spark33 spark35

* More CH Conf

* update per apache#7265

 - Use CHConf
 - use CHConf.prefixOf() instead of "spark.gluten.sql.columnar.backend.ch."
 - settingsKey => runtimeSettings
 - configKey => runtimeConfig
 - CH => CONF_PREFIX

* fix due to apache#7263
Development

Successfully merging this pull request may close these issues.

[CH] Fully Support writing parquet and mergetree in spark 3.5.x with delta protocol
3 participants