feat: distributed execution of update statement #13971

Merged
@BohuTANG merged 11 commits into databendlabs:main on Jan 4, 2024

SkyFan2002
Member

@SkyFan2002 SkyFan2002 commented Dec 11, 2023

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

Summary

  1. Introduce distributed execution of the UPDATE statement
  2. Fix: the deduplicate label does not take effect in distributed mode

Benchmark against v1.2.250-nightly on a medium-size warehouse (wall-clock times are noted in the comments):

# This PR (#13971)
bendsql --query='drop table update_target;'
bendsql --query='create table update_target(c1 int,c2 int);'
bendsql --query='insert into update_target select number,number from numbers(5000000);'
bendsql --query='insert into update_target select number,number from numbers(5000000);'
bendsql --query='ALTER warehouse update_distributed suspend;'
bendsql --query='use warehouse update_distributed;'
time bendsql --query='update update_target set c2 = 1 where c1 % 2 = 0;' # 0m5.577s
bendsql --query='select count(*) from update_target where (c1 % 2 = 0) and (c2 != 1);'

# v1.2.250-nightly
bendsql --query='drop table update_target;'
bendsql --query='create table update_target(c1 int,c2 int);'
bendsql --query='insert into update_target select number,number from numbers(5000000);'
bendsql --query='insert into update_target select number,number from numbers(5000000);'
bendsql --query='ALTER warehouse single suspend;'
bendsql --query='use warehouse single;'
time bendsql --query='update update_target set c2 = 1 where c1 % 2 = 0;' # 0m7.747s
bendsql --query='select count(*) from update_target where (c1 % 2 = 0) and (c2 != 1);'
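
For a quick read of the two timings above, the relative difference can be worked out directly. This is only a sketch based on the single run of each variant recorded in the comments (5.577s for this PR, 7.747s for v1.2.250-nightly); the helper names `speedup` and `reduction_pct` are made up for illustration and are not part of bendsql or Databend.

```shell
# Wall-clock times copied from the `time` comments in the two runs above.
distributed=5.577   # this PR, warehouse update_distributed
baseline=7.747      # v1.2.250-nightly, warehouse single

# speedup BASELINE NEW: prints BASELINE / NEW rounded to two decimals.
# (hypothetical helper, for illustration only)
speedup() {
  awk -v b="$1" -v n="$2" 'BEGIN { printf "%.2f", b / n }'
}

# reduction_pct BASELINE NEW: prints the share of the baseline time saved,
# as a percentage rounded to one decimal.
# (hypothetical helper, for illustration only)
reduction_pct() {
  awk -v b="$1" -v n="$2" 'BEGIN { printf "%.1f", (b - n) / b * 100 }'
}

echo "speedup: $(speedup "$baseline" "$distributed")x"
echo "time saved: $(reduction_pct "$baseline" "$distributed")%"
```

With these two numbers the update runs roughly 1.39x faster, or about 28% less wall-clock time. A single run per variant is a rough indication rather than a rigorous benchmark.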

@github-actions github-actions bot added the pr-feature this PR introduces a new feature to the codebase label Dec 11, 2023
@SkyFan2002 SkyFan2002 added the ci-cloud Build docker image for cloud test label Dec 11, 2023
Contributor

Docker Image for PR

  • tag: pr-13971-5f9ca8b

note: this image tag is only available for internal use,
please check the internal doc for more details.

Contributor

github-actions bot commented Dec 11, 2023

Pull request description must contain CLA like the following:

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

## Summary

Summary about this PR

- Close #issue

@SkyFan2002 SkyFan2002 added ci-cloud Build docker image for cloud test and removed ci-cloud Build docker image for cloud test labels Dec 11, 2023
Contributor

Docker Image for PR

  • tag: pr-13971-4108d97

note: this image tag is only available for internal use,
please check the internal doc for more details.

@SkyFan2002 SkyFan2002 marked this pull request as ready for review December 11, 2023 16:43
@SkyFan2002 SkyFan2002 marked this pull request as draft December 12, 2023 03:21
@SkyFan2002
Member Author

After discussing with @zhang2014, we agreed that #13981 is better solved in a separate PR, so I'm temporarily converting this PR to draft.

# Conflicts:
#	src/query/service/src/schedulers/fragments/fragmenter.rs
@SkyFan2002 SkyFan2002 marked this pull request as ready for review January 4, 2024 03:49
@SkyFan2002
Member Author

This PR now reads deduplicate_label early so that it works in distributed mode, which sidesteps #13981 for now. Let's merge this PR. cc @zhang2014 @dantengsky

@SkyFan2002 SkyFan2002 requested a review from zhang2014 January 4, 2024 03:57
Member

@dantengsky dantengsky left a comment

LGTM

@BohuTANG
Member

BohuTANG commented Jan 4, 2024

Can we add an explain of the distributed update in another PR?

@BohuTANG BohuTANG merged commit 2048297 into databendlabs:main Jan 4, 2024
75 checks passed
@SkyFan2002
Member Author

Can we add an explain of the distributed update in another PR?

Of course

Labels
ci-cloud: Build docker image for cloud test
pr-feature: this PR introduces a new feature to the codebase
Development

Successfully merging this pull request may close these issues.

Feature: distributed execution of update statement
5 participants