chore: test merge into runtime filter performance #14208

Closed
wants to merge 78 commits into from

JackTan25 (Contributor) commented Jan 3, 2024

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Briefly describe what this PR aims to solve. Include background context that will help reviewers understand the purpose of the PR.

Fixes #[Link the issue here]

Tests

  • Unit Test
  • Logic Test
  • Benchmark Test
  • No Test - Explain why

Type of change

  • Bug Fix (non-breaking change which fixes an issue)
  • New Feature (non-breaking change which adds functionality)
  • Breaking Change (fix or feature that could cause existing functionality not to work as expected)
  • Documentation Update
  • Refactoring
  • Performance Improvement
  • Other (please describe):


sundy-li and others added 30 commits December 20, 2023 13:23
databendlabs#14078)

* refactor drop table

* fix lint

* remove code

* add more tests

* add UNKNOWN_CATALOG and UNKNOWN_DATABASE

* modify tests

* correct result

* add empty line

* refactor codes
* Add auto_compaction_threshold setting

* fix test

* make lint

* resolve conflict

---------

Co-authored-by: dantengsky <[email protected]>
…#14097)

* chore: fix drop share database error

* chore: fix drop share database error

* chore: fix drop share database error

* ci
* refactor create agg index

* refactor create agg index

* fix sqllogic tests

* add some comments
* ci: add lychee links checker with cache

Signed-off-by: Chojan Shang <[email protected]>

* chore: fix links

Signed-off-by: Chojan Shang <[email protected]>

* chore: fix fmt

Signed-off-by: Chojan Shang <[email protected]>

---------

Signed-off-by: Chojan Shang <[email protected]>
* refactor: rename full_match to all_pruned.

* feat: delta engine store partition columns to tableMeta options.

* delta: partition add field partition_values.

* delta: add utils about delta partition columns.

* delta: add partition values to partition.

* feat: support reading partitioned delta table.

* store delta partition in TableMeta.engine_options instead of options.

* fix serde.

* pruner check partition columns.

* add util fn Expr::fill_const_column

* delta pruner support partition columns.

* ci: rename test data dir name.

* ci: test read partitioned delta table.

* ci: test read partitioned delta table.

* chore: add some comments.

* chore: fix clippy.

* chore: fix unit tests.

* chore: fix format.

* chore: fix test.

* chore: fix test.
* add metrics to record the duration of http request

* fix lint
…s#14091)

* chore: the final merge sort output block directly.

* Fix corner case.
* improve concat boolean types

* use offset_from
)

* chore(query): round the results for decimal division

* chore(query): round the results for decimal division

* chore(query): round the results for decimal division

* chore(query): round the results for decimal division
…dlabs#14116)

chore: rename CompactOptions::limit to num_segment_limit
* Add limit for compact hook

* resize to 3
…n_threshold to auto_compaction_imperfect_blocks_threshold (databendlabs#14117)

* chore: remove the compact hook warn message

* auto_compaction_threshold -> auto_compaction_imperfect_blocks_threshold and collapsed hook_compact setting check

* fix auto_compaction_imperfect_blocks_threshold setting in the test
* chore: fix renamed method in the comment guide

* docs: add compatibility to meta/README.md
* feat: create function support lambda

* check lambda

* make lint

* improve test case

* improve test case(2
…endlabs#14122)

* chore(ast): support expr in the position of function parameter

* add test

* fix
* fix stream table limit error

* add stream statistics

* add test
zhang2014 and others added 26 commits December 27, 2023 23:03
…4163)

* remove misused exchange operator

* fix tests

* add more tests

* fix tests
* fix: remove runtime filtered FusePart

so that the number of PartInfoPtr and DataSource passed to
downstream will be equal to each other

* refactor: dedup code

* more contextual info for deserialization error

* sqllogic test for issue databendlabs#14165

* add new ci job "standalone-minio"

run sql logic test dir "query" with
- mysql and http handlers
- minio as backend storage
…#14181)

Revert "chore(query): remove some onlyif mysql in test (databendlabs#14168)"

This reverts commit 493bff4.
chore(executor): add test for schedule queue

Signed-off-by: Liuqing Yue <[email protected]>
Co-authored-by: Bohu <[email protected]>
* fix(query): fix parse string to JSON value

* fix
* fix(executor): fix broken query profiling graph

* fix(executor): fix broken query profiling graph
…ad of PUBLIC (databendlabs#14112)

* default as account_admin

* fix cargo check

* allow list in system tables

* tune comments

* remove verify_ownership parameter in validate_access_db and _table

* fix typo

* fix lint

* add test for rewritten

* add stateless tests

* fix stateless error

* fix set_role

* add result file

* rename validate_ownership to has_ownership

* fix typo
* feat: using runtime filter to reduce data block

* fix

* fix

* fix build

* add bloom filter

* cargo fix

* fix lint

* fix build column construct

* optimize bloom filter generate

* refactor build generate filter

* refactor build generate filter

* fix lint

* refactor

* fix lint

* optimize native format by runtime filter

* rebase main

* move native runtime filter to prune page

* support min max runtime filter

* remove inlist filter from native

* add more interface for runtime filter

* fix lint

* fix conflict

* fix native skip

* remove duplicate progress scan

* use reduce to merge rf filters

* reduce duplicate code

* fix offset for internal column
* add stream_status http api

* remove unused code

---------

Co-authored-by: zhihanz <[email protected]>
* new filter execution framework

* refine

* refine

* remove useless

* refine test

* vectorized

* fix

* make lint

* fix

* improve process_and and process_or

* introduce different strategy to generate datablock

* support column ref and constant

* remove selection_op_ref

* fix BooleanConstant

* remove println

* fix cast

* refactor

* add util

* fix Generic

* fix function return type

* fix process_or and process_and

* reuse selection_range

* fix when process_or and process_and if count == 0

* support case when

* merge

* fix test

* fix generics

* support LambdaFunction

* support tuple and fix test

* fix test

* fix test

* move

* support native

* fix block size

* add test and fix

* fix select_scalars

* improve test

* refine test

* refactor

* rename selection_op

* add more comments

* rename selection_op

* remove unreachable

* format

* fix test

* use databend_xxx

* rename idx

* add comments for SelectExpr

* remove useless code

* rename helper.rs to selection_op.rs

* make lint

* rename selection_op to select_op

* remove data_type() and use infer_data_type()

* use select_value_type to refine code

* remove useless code

* rename count to filtered_count

* improve codes

* improve codes

* refine code

* fix select_scalars

* add more test

* add default compare operation for types

* remove useless code

* improve select

* add # Safety comments

* merge

* merge main

* fix typo

* refine compare

* adapt runtime filter

---------

Co-authored-by: sundy-li <[email protected]>
* feat: support abs for decimal types

* feat: support abs for decimal types

* feat: support abs for decimal types
…tabendlabs#14187)

use binaryfuse16 and disable feature uniform-random to reduce build bloom filter cost
…atabendlabs#14196)

* refactor: Implement exponential backoff in meta-service txn retries

Introduce an exponential backoff strategy for meta-service transaction
retries to mitigate race conditions. Upon a transaction failure, the
system now initiates a delay before the next attempt, reducing the
likelihood of concurrent transaction conflicts.

The backoff intervals follow an exponential pattern, calculated as
`1.4^i * 10 milliseconds`, where `i` is the current retry count.

This change aims to improve the robustness of the transaction handling
by spacing out retries in the face of transient failures.

The maximum number of retry attempts is increased from 10 to 60.

* refactor: Add logging for transaction retries

Add logging for transaction retries by differentiating between
initial and subsequent failures. Now, the first transaction failure
triggers an info log. After four failed attempts, further retries are
logged at the warning level to highlight potential issues.

Logs are emitted both before and after the backoff sleep period.
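
The backoff rule described in the commit message above (`1.4^i * 10 milliseconds`, up to 60 attempts) can be sketched roughly as follows. This is only an illustration, not the code from the PR: the names `backoff_delay`, `retry_with_backoff`, and the constants are hypothetical, and treating `i` as 0-based is an assumption. The sleep function is passed in so the sketch stays independent of any particular async runtime.

```rust
use std::time::Duration;

// Values taken from the commit message; the constant names are hypothetical.
const BASE_DELAY_MS: f64 = 10.0;
const GROWTH_FACTOR: f64 = 1.4;
const MAX_RETRIES: u32 = 60;

/// Delay before the next attempt: 1.4^i * 10 ms.
/// Assumption: `i` is the 0-based count of failures so far.
fn backoff_delay(i: u32) -> Duration {
    let millis = BASE_DELAY_MS * GROWTH_FACTOR.powi(i as i32);
    Duration::from_secs_f64(millis / 1000.0)
}

/// Hypothetical retry driver. `attempt` runs one transaction try and
/// returns `Err` on a conflict; `sleep` waits for the given delay.
fn retry_with_backoff<T, E>(
    mut attempt: impl FnMut(u32) -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut last_err = None;
    for i in 0..MAX_RETRIES {
        match attempt(i) {
            Ok(value) => return Ok(value),
            Err(err) => {
                // Space out the next try so concurrent writers are less
                // likely to collide again.
                last_err = Some(err);
                sleep(backoff_delay(i));
            }
        }
    }
    // All attempts failed: surface the most recent error.
    Err(last_err.expect("MAX_RETRIES is non-zero"))
}
```

In a blocking context the `sleep` argument could simply be `std::thread::sleep`. The commit message does not say whether the delay is capped, so the sketch applies the formula exactly as written.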
…tion (databendlabs#14197)

better handling of nullable columns during filter execution
* fix(query): fix hook refresh virtual columns without license

* fix

* fix

* fix

* fix
The `Compatible` error was added in 0.9.41 to accept two types of errors
returned from meta-service: `KVAppError` and `MetaAPIError`.

This compatibility layer can be removed because meta-client's minimum
compatible meta-service version has since been raised to a version that
no longer sends `KVAppError`.
@github-actions github-actions bot added the pr-chore this PR only has small changes that no need to record, like coding styles. label Jan 3, 2024
@JackTan25 JackTan25 added the ci-cloud Build docker image for cloud test label Jan 3, 2024
@JackTan25 JackTan25 closed this Jan 3, 2024