[pull] main from datafuselabs:main #43

Closed
wants to merge 946 commits into from

pull[bot]

@pull pull bot commented Oct 13, 2023

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

@github-actions

Pull request description must contain CLA like the following:

I hereby agree to the terms of the CLA available at: https://databend.rs/dev/policies/cla/

## Summary

Summary about this PR

- Close #issue

@github-actions

github-actions bot commented Oct 13, 2023

This pull request's title does not fulfill the requirements. @pull[bot] please update it 🙏.

Valid format:

fix(query): fix group by string bug
  ^         ^---------------------^
  |         |
  |         +-> Summary in present tense.
  |
  +-------> Type: rfc, feat, fix, refactor, ci, docs, chore

Valid types:

  • rfc: this PR proposes a new RFC
  • feat: this PR introduces a new feature to the codebase
  • fix: this PR patches a bug in the codebase
  • refactor: this PR changes the codebase without new features or bugfixes
  • ci: this PR changes build/testing/ci steps
  • docs: this PR changes the documents or websites
  • chore: this PR only has small changes that do not need to be recorded
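
The title format described above can be checked mechanically. The following is a minimal sketch, not the bot's actual implementation: `parse_pr_title` is a hypothetical helper, and treating the `(scope)` part as optional is an assumption, since the comment only shows one example with a scope.

```rust
/// Types accepted by the title check, copied from the bot comment above.
const VALID_TYPES: [&str; 7] = ["rfc", "feat", "fix", "refactor", "ci", "docs", "chore"];

/// Splits a title shaped like `type(scope): summary` into (type, scope, summary).
/// Returns `None` when the shape does not match or the type is not in `VALID_TYPES`.
/// Hypothetical helper: the real check may be stricter or looser.
pub fn parse_pr_title(title: &str) -> Option<(&str, Option<&str>, &str)> {
    // The summary is everything after the first ": ".
    let (head, summary) = title.split_once(": ")?;
    let summary = summary.trim();
    if summary.is_empty() {
        return None;
    }
    // Assumption: the scope in parentheses is optional.
    let (kind, scope) = match head.split_once('(') {
        Some((kind, rest)) => {
            let scope = rest.strip_suffix(')')?;
            if scope.is_empty() {
                return None;
            }
            (kind, Some(scope))
        }
        None => (head, None),
    };
    if VALID_TYPES.iter().any(|t| *t == kind) {
        Some((kind, scope, summary))
    } else {
        None
    }
}
```

Under these assumptions, `fix(query): fix group by string bug` is accepted, while this PR's own title, `[pull] main from datafuselabs:main`, is rejected because it has no `type: summary` prefix.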

@pull pull bot added the ⤵️ pull label Oct 13, 2023
ariesdevil and others added 27 commits January 10, 2024 11:53
* add basic sort support for agg index rewrite

* add more tests

---------

Co-authored-by: sundyli <[email protected]>
* feat(query): grant ownership need modify role privilege

* skip built in roles

* get_ownership part inside the revoke_ownership()

* merge grant_ownership and transfer_ownership
* feat: return latest leader addr if request is forwarded

New feature, compatible change.

When a meta-service follower receives a request, it forwards it to the
leader. This mechanism introduces additional latency.

In this commit:

- If the meta-service node is a follower, it returns the leader address
  in the gRPC header (response metadata)
  "x-databend-meta-leader-grpc-endpoint".

- A meta-client that receives such response metadata then updates the
  address used for further connections to this perceived leader.

- Leader endpoint response is added to two APIs: `transaction()` and
  `kv_read_v1()`, which are the most used by `SchemaApi` and are very
  latency sensitive.

  `upsert()` and `kv_read()` (v0 API) are not changed: `upsert()` will be
  replaced with `transaction()`, and `kv_read()` is only used by former
  versions of databend-query, which are not capable of parsing the leader
  endpoint response.

Detailed Changes

- Add `Endpoints` as a container that stores all of the meta-service
  endpoints for a meta-client.

- Updating the leader endpoint is done in `EstablishedClient`, which is a
  wrapper for the underlying gRPC client. `HandleRPCResult::update_client()`
  is responsible for updating the leader endpoint and marking a client as
  **error**.

- The `endpoints` is now shared between `MetaChannelManager` and
  `EstablishedClient`: `MetaChannelManager` uses `endpoints` for
  connecting, `EstablishedClient` uses it for updating.

- The old health-based node choosing algorithm is discarded, because the
  responded leader must be healthy; otherwise it would step down.

- Add timing and logging to meta-service API.
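
The client-side half of the mechanism described above can be sketched as follows. This is an illustration under stated assumptions, not the code from this commit: the `Endpoints` struct here is a simplified stand-in for the real container, `target()` and `observe_response()` are hypothetical method names, and the response metadata is modeled as a plain `HashMap` rather than gRPC metadata. Only the header name comes from the commit message.

```rust
use std::collections::HashMap;

/// Header name taken from the commit message.
pub const LEADER_HEADER: &str = "x-databend-meta-leader-grpc-endpoint";

/// Simplified, hypothetical stand-in for the `Endpoints` container.
#[derive(Debug, Default)]
pub struct Endpoints {
    nodes: Vec<String>,
    current: Option<String>,
}

impl Endpoints {
    pub fn new(nodes: impl IntoIterator<Item = impl Into<String>>) -> Self {
        Self {
            nodes: nodes.into_iter().map(Into::into).collect(),
            current: None,
        }
    }

    /// Address to dial next: the perceived leader if known, else the first configured node.
    pub fn target(&self) -> Option<&str> {
        self.current
            .as_deref()
            .or_else(|| self.nodes.first().map(String::as_str))
    }

    /// Records the leader address reported in response metadata.
    /// Returns true if the dial target changed.
    pub fn observe_response(&mut self, metadata: &HashMap<String, String>) -> bool {
        match metadata.get(LEADER_HEADER) {
            Some(leader) if self.current.as_deref() != Some(leader.as_str()) => {
                // Assumption: an address not in the configured list is still accepted.
                if !self.nodes.contains(leader) {
                    self.nodes.push(leader.clone());
                }
                self.current = Some(leader.clone());
                true
            }
            // No header (the node was the leader) or the leader is unchanged.
            _ => false,
        }
    }
}
```

In this sketch a response without the header leaves the current target untouched, which matches the description that only followers attach the leader address.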

---

- Fix: #14224

* refactor: Endpoints does not have to store index

* chore: update tests according to Endpoints change
* add select tests for agg index

* add select tests for agg index
* fix(query): remove topk optimization in parquet2

* fix(query): remove topk optimization in parquet2
* support adaptive filter reorder

* chore: add more comments

* feat: add Expr::Cast for build_select_expr

* chore: add more comments
* chore: enable runtime filter for right join

* add no_shuffle to group by shuffle mode

* x

* add network cost

* x

* fix join order

* fix join order

* update

* update

* add semi to inner join rule

* rebase

* fix explain

* fix explain native

* fix tests
* refactor: use binary for non-utf8 string type

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix bendsql

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* try again

* fix

* update test case

* fix test cases

* fmt

* fix test cases

* dbg

* workaround #14296

* update test

* update test
* Add huggingface config

Signed-off-by: Xuanwo <[email protected]>

* Add meta api

Signed-off-by: Xuanwo <[email protected]>

* Add test for huggingfaceconfig

Signed-off-by: Xuanwo <[email protected]>

* Allow parse huggingface

Signed-off-by: Xuanwo <[email protected]>

* Add create huggingface stage test

Signed-off-by: Xuanwo <[email protected]>

* Allow querying from huggingface

Signed-off-by: Xuanwo <[email protected]>

* Add comments for fields

Signed-off-by: Xuanwo <[email protected]>

---------

Signed-off-by: Xuanwo <[email protected]>
* chore: update testing parquet assets for stage test

* fix
)

chore: fix system table ctx warning and log more errors
chore: add cardinality to CteScan explain
* Update 02_0063_function_vector.test

* feat(query): add float64 version of distance function overload

* feat(query): add float64 version of distance function overload

* fix the verctor.rs use style

---------

Co-authored-by: Bohu <[email protected]>
* chore: update testing parquet assets for stage test

* fix

* refactor: add BinaryColumn, BinaryColumnBuilder

* fixgp

* fix

* fix

* --wip-- [skip ci]

* fix
* delete owner in table/db meta

get owner info in meta

* add meta test

---------

Co-authored-by: Bohu <[email protected]>
* refactor unload.

* feat: CopyIntoLocation support output.

* update tests.

* fix test.

* fix test.
* feat: query level data cache metrics

* show data cache metrics in query_log table

* update ut golden files

* Update src/query/catalog/src/statistics/data_cache_statistics.rs

Co-authored-by: Bohu <[email protected]>

* rename fields

* stop gather data from disk cache if in memory cache hits

* rename query_log table fields

* update test golden file

* cleanup

* remove logging

---------

Co-authored-by: Bohu <[email protected]>
* fix(query): add comment in table meta

* fix(query): add comment in table meta
* feat: CSV format add option `binary_format` and `output_header`.

* ci: add tests.

* fix clippy.

* fix tests.
* chore(query): add log prefix filter config

* chore(query): add log prefix filter config
everpcpc and others added 29 commits February 6, 2024 21:57
* refactor: Add KVPbApi::list_pb_values() returns values without keys

* refactor: add KVPbApi::upsert_pb() to update or insert

* refactor: Simplify UdfMgr::add_udf()
* improve hash join code

* fix: from left single join to inner join

* chore: improve left join

* fix: right single to inner
…e) (#14637)

* reduce small block writes, improve s3 latency

* add random test again
* refactor: extract ensure_non_builtin()

* chore: move KVPbApi impl to separate files
…urning ErrorCode (#14640)

This way the caller can distinguish between a logic error and a backend
meta-service error.
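
One common way to let a caller tell the two error kinds apart is a nested `Result`, where the outer layer carries backend errors and the inner layer carries logic errors. The sketch below illustrates that pattern only; the commit title is truncated, so the actual signature is not known here, and `BackendError`, `AlreadyExists`, and `add_name` are hypothetical names.

```rust
/// Backend failure, e.g. the meta-service is unreachable (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct BackendError(pub String);

/// Logic failure: the request was processed but rejected (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct AlreadyExists(pub String);

/// Outer `Result` carries backend errors, inner `Result` carries logic errors.
/// `backend_up` simulates whether the backend can be reached.
pub fn add_name(
    store: &mut Vec<String>,
    name: &str,
    backend_up: bool,
) -> Result<Result<(), AlreadyExists>, BackendError> {
    if !backend_up {
        return Err(BackendError("meta-service unavailable".to_string()));
    }
    if store.iter().any(|n| n.as_str() == name) {
        return Ok(Err(AlreadyExists(name.to_string())));
    }
    store.push(name.to_string());
    Ok(Ok(()))
}
```

With this shape a caller can propagate the outer error with `?` and then match on the inner result to handle the logic error locally.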
…pprox_count_distinct` function (#14609)

* refactor(query): support Additional param to set the error rate in approx_count_distinct function

* update

* update

* update

* fix
`UdfApi` has only one implementation `UdfMgr` and the abstraction layer
is the underlying `KVApi`. There is no need to introduce another
abstraction.
* refactor(query): disallowed thread::spawn

* refactor(query): disallowed tokio::Handle::spawn

* refactor(query): disallowed tokio::Handle::spawn_blocking

* refactor(query): disallowed tokio block_on

* refactor(query): disallowed tokio block_on

* refactor(query): disallowed tokio block_on

* refactor(query): location future

* refactor(query): location future

---------

Co-authored-by: Bohu <[email protected]>
refine plan id generation

Co-authored-by: Winter Zhang <[email protected]>
* chore(executor): remove catalog ident name for profile

* chore(meta): remove tenant for table info

* chore(meta): remove tenant for table info

* chore(meta): remove tenant for table info
* feat: add create or replace masking policy support

* feat: add create or replace masking policy support
@TCeason TCeason closed this Feb 11, 2024