
[pull] main from datafuselabs:main #47

Merged
merged 273 commits into from
Jun 21, 2024

Conversation

@pull pull[bot] commented May 21, 2024

See Commits and Changes for more details.


Created by pull[bot]

Can you help keep this open source service alive? 💖 Please sponsor : )

TCeason and others added 30 commits May 13, 2024 18:28
* chore(query): tracking query log for mysql handler

* feat(query): tracking query log for mysql handler
refactor: polish error message when failing to decode a row of NDJSON to JSON.
* feat(query): ensure git is installed in pyenv

* refactor(query): pyenv to 3.9
* feat(query): ensure git is installed in pyenv

* refactor(query): pyenv to 3.9

* refactor(query): pyenv to 3.12

* refactor(query): pyenv to 3.12

* refactor(query): pyenv to 3.12

* refactor(query): pyenv to 3.12

* refactor(query): revert workflow

* refactor(query): revert workflow
recluster disable sort spill
…15509)

* refactor: remove databend-common-meta-app from databend-common-ast

* fix

---------

Co-authored-by: Bohu <[email protected]>
Co-authored-by: sundyli <[email protected]>
* chore: adjust elapsed times of various logs

Adjust the time fields in the logs to use a Duration instead of a fixed
precision (e.g., whole seconds), to avoid imprecise log output such
as '0s'.

* chore: add timing log for some schema api

- update_table_meta
- update_multi_table_meta
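
A minimal sketch of the logging style described above, assuming a plain `Instant`/`Duration` pair (names are illustrative, not the actual databend logging code): Debug-formatting the `Duration` keeps sub-second precision instead of printing a rounded '0s'.

```rust
use std::time::Instant;

fn main() {
    let start = Instant::now();
    // ... do work, e.g. update_table_meta ...
    // Debug formatting of Duration prints e.g. "1.234567ms" rather than "0s".
    println!("update_table_meta done, elapsed: {:?}", start.elapsed());
}
```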
* feat(query): support unset session setting

Databend now supports the `unset <settings>` command, which is designed to remove global settings from the metasrv. We plan to expand this functionality with additional unset commands:

```sql
unset [session] <setting>; -- Only unset a setting at the session level.
```

For example:
```
-- load_file_metadata_expire_hours default is 248

set global load_file_metadata_expire_hours=12;
-- show settings like '%load_file_metadata_expire_hours%';
-- 12 global-level

set load_file_metadata_expire_hours=13;
-- show settings like '%load_file_metadata_expire_hours%';
-- 13 session-level

unset session load_file_metadata_expire_hours;
-- show settings like '%load_file_metadata_expire_hours%';
-- 12 global-level
```

* optimize code

* fix test

---------

Co-authored-by: Bohu <[email protected]>
* feat: support explain and profile multi table insert statement

* rm unused code

* fix plan id

* polish profile
* refactor(query): allow udf server in insert source

* refactor(query): allow udf server in insert source
fix: explain ast insert first/all
…untime (#15504)

* refactor: `SessionContext::current_tenant` should not be changed at runtime

Due to the design principle that `SessionContext::current_tenant` should
not be modified at runtime, this commit transitions
`SessionContext::current_tenant` from a shared `RwLock<Option<Tenant>>`
to a non-shared `Option<Tenant>`.

Additionally, the process of constructing a `Session` has been divided
into two distinct steps:

1. Initially build an instance and set up initial values, including
   `current_tenant`.

2. Convert the `Session` instance into a readonly `Arc<Session>` to
   facilitate sharing across `QueryContext` and other components.

Other Changes:

- Methods of `Session` are now bound to `&self` instead of `&Arc<Self>`.

- The method `SessionCtx::set_current_tenant()` has been made private
  within its module.

- The function `SessionManager::register_session()` has been extracted from
  `SessionManager::create_with_settings()`, because `Session` creation
  and registration must be two distinct steps.

* chore: return Err instead of using `unwrap()`
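
A minimal sketch of the two-step construction described above, with illustrative type names rather than the actual databend ones: the instance is built and populated first, then frozen into an `Arc<Session>` so methods only need `&self`.

```rust
use std::sync::Arc;

struct Tenant(String);

struct Session {
    id: String,
    // Plain Option, no RwLock: fixed after construction, never changed at runtime.
    current_tenant: Option<Tenant>,
}

impl Session {
    // Step 1: build the instance and set initial values, including current_tenant.
    fn create(id: &str, tenant: Option<Tenant>) -> Session {
        Session { id: id.to_string(), current_tenant: tenant }
    }

    // Methods take &self, so they work through a shared Arc<Session>.
    fn current_tenant(&self) -> Option<&Tenant> {
        self.current_tenant.as_ref()
    }
}

fn main() {
    let session = Session::create("s-1", Some(Tenant("t1".to_string())));
    // Step 2: convert into a read-only Arc<Session> for sharing across contexts.
    let shared: Arc<Session> = Arc::new(session);
    if let Some(Tenant(name)) = shared.current_tenant() {
        println!("session {}: tenant {}", shared.id, name);
    }
}
```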
* chore: Upgrade minitrace related crates to 0.6.5

* Update cargo.lock
Prior to this commit, the snapshot writer would attempt to commit the
snapshot upon closure of the input channel, operating under the
assumption that the snapshot was fully written. This assumption is
flawed as the closure of the channel might be due to a process shutdown,
not completion of the snapshot.

To address this issue, this commit introduces a change where items sent
to the snapshot writer are encapsulated within a `WriteEntry`, which can
be either `Data(T)` or `Commit`. The snapshot is only considered
complete and ready for commit when a `Commit` variant is received. This
adjustment ensures that premature snapshot commits are avoided in cases
of early channel closures.
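
A simplified sketch of the `WriteEntry` protocol described above (channel and error types are illustrative): the writer commits only when it receives `Commit`, and treats a channel closed without one as an aborted snapshot.

```rust
use std::sync::mpsc;

// Items sent to the snapshot writer are either payload or an explicit commit marker.
enum WriteEntry<T> {
    Data(T),
    Commit,
}

fn write_snapshot(rx: mpsc::Receiver<WriteEntry<String>>) -> Result<Vec<String>, &'static str> {
    let mut buf = Vec::new();
    for entry in rx {
        match entry {
            WriteEntry::Data(d) => buf.push(d),
            // Only an explicit Commit makes the snapshot complete and safe to commit.
            WriteEntry::Commit => return Ok(buf),
        }
    }
    // Channel closed without a Commit: e.g. the process is shutting down.
    Err("input channel closed before Commit; snapshot discarded")
}

fn main() {
    let (tx, rx) = mpsc::channel();
    tx.send(WriteEntry::Data("entry-1".to_string())).unwrap();
    tx.send(WriteEntry::Commit).unwrap();
    drop(tx);
    println!("{:?}", write_snapshot(rx));
}
```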
* fix(query): view query use default dialect

* fix
* fix(query): convert to arrow column remove ignore inside nullable

* fix

* fix

* fix

* fix

* tmp

* fix

* fix
…ry log (#15531)

* feat(query): add query_hash and query_parameterized_hash to query log

* fix fmt check

* fix test

* add query_hash test

* use md5 to generate the hash

* fix check

* only hash select statement;

In the query log, the query_text may look like this:

```
query_text: INSERT INTO products (id, name, description) VALUES (?,?,?)
```

It is not a normal query and cannot be formatted.

* refactor: attach hash value into query ctx

* extract method;
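
A minimal sketch of the hashing idea from the commits above, assuming the `md5` crate and that parameterization (replacing literals with placeholders) happens before hashing; the real implementation lives in the query log code.

```rust
// Cargo.toml: md5 = "0.7"

fn query_hash(sql: &str) -> String {
    // Hex-encoded md5 fingerprint of the statement text.
    format!("{:x}", md5::compute(sql.as_bytes()))
}

fn main() {
    // query_hash: fingerprint of the statement as written.
    println!("{}", query_hash("SELECT a FROM t WHERE id = 1"));
    // query_parameterized_hash: same idea, computed after literals are replaced
    // with placeholders upstream, so variants of one query share a hash.
    println!("{}", query_hash("SELECT a FROM t WHERE id = ?"));
}
```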
youngsofun and others added 29 commits June 13, 2024 18:19
…#15751)

* fix(query): create ownership object only checks current_role privilege

* rename create_object to check_current_role_only;

* extract get current role logic from get effective roles

* modify test: if drop current role, current role set to public role
* improve substr domain

* fix test

* fix test

* fix test
* feat: support dynamic enable of rust backtrace

* test: add test

* fix: support to set backtrace for clusters

Signed-off-by: Liuqing Yue <[email protected]>

* chore: change syntax to add `SYSTEM` prefix

Signed-off-by: Liuqing Yue <[email protected]>

* refactor: apply review suggestions to split off system kind

Signed-off-by: Liuqing Yue <[email protected]>

* fix: apply review suggestions

Signed-off-by: Liuqing Yue <[email protected]>

* fix: apply review suggestions

Signed-off-by: Liuqing Yue <[email protected]>

* refine some words

Signed-off-by: Liuqing Yue <[email protected]>

* fix: related stmt part not refactored

Signed-off-by: Liuqing Yue <[email protected]>

* chore: clean comments

Signed-off-by: Liuqing Yue <[email protected]>

---------

Signed-off-by: Liuqing Yue <[email protected]>
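
A hedged sketch of how a runtime backtrace toggle can be wired, assuming a process-wide atomic flag consulted on error paths; the actual databend statement (the `SYSTEM`-prefixed syntax mentioned above) and its cluster-wide propagation are not shown here.

```rust
use std::backtrace::Backtrace;
use std::sync::atomic::{AtomicBool, Ordering};

// Flag that can be flipped at runtime, e.g. by a SYSTEM command handler.
static BACKTRACE_ENABLED: AtomicBool = AtomicBool::new(false);

fn maybe_backtrace() -> Option<Backtrace> {
    if BACKTRACE_ENABLED.load(Ordering::Relaxed) {
        // Capture regardless of RUST_BACKTRACE, since the user asked explicitly.
        Some(Backtrace::force_capture())
    } else {
        None
    }
}

fn main() {
    BACKTRACE_ENABLED.store(true, Ordering::Relaxed);
    if let Some(bt) = maybe_backtrace() {
        println!("{bt}");
    }
}
```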
* fix(query): list user stage not check privilege

* fix clippy
Building databend-meta does not rely on databend-query.
Moving it out of `src/binaries` speeds up the databend-meta build.
Use `limited_get_log_entries()` to limit the log entries in an
`AppendEntries` request by size.
With this approach, every AppendEntries request stays under 2MB.
This addresses the issue that a statically configured entry count for an
`AppendEntries` request cannot bound the RPC size: an oversized AppendEntries
RPC may time out.

Other changes:

- Only reuse a connection if the last RPC returned no error.
  When a raft RPC finishes, the connection is pushed back for the next RPC if it
  returned `Ok`. Otherwise the connection is dropped and the next RPC
  re-creates a new one. This approach ensures no problematic connection
  is reused.

- Improve raft connection handling and logging filters. Support multiple log prefix filters.

- Add `DisplayOption` to improve the display of raft log ids:
  `Option<LogId>::display()` returns a `Display` implementation.

- Log elapsed time for critical steps of `AppendEntries` RPC.
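
A simplified sketch of the size-based batching described above, with illustrative types: entries are accumulated until the 2MB budget would be exceeded, rather than stopping at a fixed entry count.

```rust
// Budget for a single AppendEntries request.
const MAX_RPC_BYTES: usize = 2 * 1024 * 1024;

// Take entries for one batch, stopping before the serialized size exceeds the budget.
fn limited_batch(entries: &[Vec<u8>]) -> (Vec<&[u8]>, usize) {
    let mut batch = Vec::new();
    let mut size = 0;
    for e in entries {
        if !batch.is_empty() && size + e.len() > MAX_RPC_BYTES {
            break; // the rest goes into the next AppendEntries request
        }
        size += e.len();
        batch.push(e.as_slice());
    }
    (batch, size)
}

fn main() {
    let entries = vec![vec![0u8; 1_500_000], vec![0u8; 1_500_000], vec![0u8; 100]];
    let (batch, size) = limited_batch(&entries);
    println!("first batch: {} entries, {} bytes", batch.len(), size);
}
```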
* ci: test load parquet unloaded by databend.

* fix
* fix: fix sequence used in function calls

* make clippy happy
recluster final ignore error
reduce recluster depth threshold
* test

* remove redundant code in mergeinto

* remove

---------

Co-authored-by: jw <[email protected]>
fix: multi-tbl insert with lateral flatten

no need to project before building the pipeline of `Duplicate` plan
This test starts a version-A databend-meta as the leader, feeds some data,
builds a snapshot, and then joins a version-B databend-meta to the cluster,
feeds some more data, and ensures these two nodes can work
together correctly by checking the data exported from the two running
nodes.
* feat: support recursive cte used in multiple places

* cargo fix

* remove useless code
not completely solved, but the error cases are limited;
* feat(query): support is distinct from for expression scan

* chore: add test
* remove addRowNumber if source is physical table

* fix test

* fix

* remove code

* add test

* fix review

---------

Co-authored-by: jw <[email protected]>
* chore(query): show view just display view name

* add test
* feat(meta): support getting table/db name by id

add get_table_name_by_id and get_db_name_by_id

* remove extras codes

* stream source table and database can support rename

* fix

* make lint

* fix

* fix review comment

* add test

* fix review comment

* fix review comment

* fix

* fix

* fix test

---------

Co-authored-by: taichong <[email protected]>
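
A hypothetical sketch of the name-by-id lookups mentioned above (`get_table_name_by_id`, `get_db_name_by_id`), using an in-memory map instead of the actual databend-meta store: because the id stays stable across renames, the current name can always be resolved from it.

```rust
use std::collections::HashMap;

// Illustrative stand-in for the meta store; not the real databend-meta API.
struct MetaStore {
    table_names: HashMap<u64, String>,
    db_names: HashMap<u64, String>,
}

impl MetaStore {
    fn get_table_name_by_id(&self, table_id: u64) -> Option<&String> {
        self.table_names.get(&table_id)
    }
    fn get_db_name_by_id(&self, db_id: u64) -> Option<&String> {
        self.db_names.get(&db_id)
    }
}

fn main() {
    let mut store = MetaStore { table_names: HashMap::new(), db_names: HashMap::new() };
    // A table was renamed: the id still maps to its current name.
    store.table_names.insert(42, "t_renamed".to_string());
    store.db_names.insert(7, "db1".to_string());
    println!("{:?} {:?}", store.get_table_name_by_id(42), store.get_db_name_by_id(7));
}
```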
@TCeason TCeason merged commit 24a3041 into TCeason:main Jun 21, 2024
1 of 4 checks passed