Skip to content

Commit

Permalink
Merge branch 'master' into agg_value_modify1
Browse files Browse the repository at this point in the history
  • Loading branch information
cjj2010 authored Sep 12, 2024
2 parents 6f0e0b9 + c71514d commit 7fa27f7
Show file tree
Hide file tree
Showing 3,038 changed files with 137,822 additions and 90,282 deletions.
The diff you're trying to view is too large. We only load the first 3000 changed files.
4 changes: 0 additions & 4 deletions .asf.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -52,9 +52,7 @@ github:
- Clang Formatter
- CheckStyle
- P0 Regression (Doris Regression)
- P1 Regression (Doris Regression)
- External Regression (Doris External Regression)
- cloud_p1 (Doris Cloud Regression)
- cloud_p0 (Doris Cloud Regression)
- FE UT (Doris FE UT)
- BE UT (Doris BE UT)
Expand Down Expand Up @@ -111,10 +109,8 @@ github:
strict: false
contexts:
- License Check
- Clang Formatter
- CheckStyle
- P0 Regression (Doris Regression)
- P1 Regression (Doris Regression)
- External Regression (Doris External Regression)
- FE UT (Doris FE UT)
- BE UT (Doris BE UT)
Expand Down
1 change: 1 addition & 0 deletions .github/CODEOWNERS
Validating CODEOWNERS rules …
Original file line number Diff line number Diff line change
Expand Up @@ -15,5 +15,6 @@
# limitations under the License.
#
be/src/io/* @platoneko @gavinchou @dataroaring
be/src/agent/be_exec_version_manager.cpp @BiteTheDDDDt
fe/fe-core/src/main/java/org/apache/doris/catalog/Env.java @dataroaring @CalvinKirs @morningman
**/pom.xml @CalvinKirs @morningman
7 changes: 6 additions & 1 deletion .github/workflows/clang-format.yml
Original file line number Diff line number Diff line change
Expand Up @@ -64,9 +64,14 @@ jobs:
git clone https://github.com/DoozyX/clang-format-lint-action .github/actions/clang-format-lint-action
pushd .github/actions/clang-format-lint-action &>/dev/null
git checkout 6adbe14579e5b8e19eb3e31e5ff2479f3bd302c7
git checkout c71d0bf4e21876ebec3e5647491186f8797fde31 # v0.18.2
popd &>/dev/null
- name: Install Python dependencies
uses: actions/setup-python@v5
with:
python-version: '3.10' # Adjust if needed

- name: "Format it!"
if: ${{ steps.filter.outputs.changes == 'true' }}
uses: ./.github/actions/clang-format-lint-action
Expand Down
2 changes: 1 addition & 1 deletion .github/workflows/code-checks.yml
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ jobs:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
sh_checker_comment: true
sh_checker_exclude: .git .github ^docker ^thirdparty/src ^thirdparty/installed ^ui ^docs/node_modules ^tools/clickbench-tools ^extension ^output ^fs_brokers/apache_hdfs_broker/output (^|.*/)Dockerfile$ ^be/src/apache-orc ^be/src/clucene ^pytest
sh_checker_exclude: .git .github ^docker ^thirdparty/src ^thirdparty/installed ^ui ^docs/node_modules ^tools/clickbench-tools ^extension ^output ^fs_brokers/apache_hdfs_broker/output (^|.*/)Dockerfile$ ^be/src/apache-orc ^be/src/clucene ^pytest ^samples

preparation:
name: "Clang Tidy Preparation"
Expand Down
48 changes: 43 additions & 5 deletions .github/workflows/comment-to-trigger-teamcity.yml
Original file line number Diff line number Diff line change
Expand Up @@ -21,22 +21,31 @@ on:
issue_comment:
types: [created, edited]

permissions:
contents: read
pull-requests: write
statuses: write

jobs:
check-comment-if-need-to-trigger-teamcity:

# This job only runs for pull request comments, and comment body contains 'run'
if: ${{ github.event.issue.pull_request && contains(github.event.comment.body, 'run') }}
if: ${{ github.event.issue.pull_request && (contains(github.event.comment.body, 'run') || contains(github.event.comment.body, 'skip buildall')) }}

runs-on: ubuntu-latest
env:
COMMENT_BODY: ${{ github.event.comment.body }}
COMMENT_USER_ID: ${{ github.event.comment.user.id }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

steps:
- name: "Parse PR comment"
id: parse
run: |
COMMENT_BODY=$(echo "${COMMENT_BODY}" | xargs)
PULL_REQUEST_NUM="$(echo "${{ github.event.issue.pull_request.url }}" | awk -F/ '{print $NF}')"
COMMIT_ID_FROM_TRIGGER="$(curl -s -H "Authorization:Bearer ${{ secrets.GITHUB_TOKEN }}" "https://api.github.com/repos/${{ github.repository }}/pulls/${PULL_REQUEST_NUM}" | jq -r '.head.sha')"
TARGET_BRANCH="$(curl -s -H "Authorization:Bearer ${{ secrets.GITHUB_TOKEN }}" "https://api.github.com/repos/${{ github.repository }}/pulls/${PULL_REQUEST_NUM}" | jq -r '.base.ref')"
if [[ "${COMMENT_BODY}" == *'run buildall'* ||
"${COMMENT_BODY}" == *'run compile'* ||
"${COMMENT_BODY}" == *'run beut'* ||
Expand All @@ -50,15 +59,28 @@ jobs:
"${COMMENT_BODY}" == *'run arm'* ||
"${COMMENT_BODY}" == *'run performance'* ]]; then
echo "comment_trigger=true" | tee -a "$GITHUB_OUTPUT"
echo "comment_skip=false" | tee -a "$GITHUB_OUTPUT"
elif [[ "${COMMENT_BODY}" == *'skip buildall'* ]]; then
if [[ "${COMMENT_USER_ID}" == '27881198' ||
"${COMMENT_USER_ID}" == '37901441' ]]; then
echo "comment_trigger=false" | tee -a "$GITHUB_OUTPUT"
echo "comment_skip=true" | tee -a "$GITHUB_OUTPUT"
echo "COMMENT_USER_ID ${COMMENT_USER_ID} is allowed to skip buildall."
elif [[ "${COMMENT_USER_ID}" == '9208457' && "${TARGET_BRANCH}" == *'branch-2.1'* ]]; then
echo "COMMENT_USER_ID ${COMMENT_USER_ID} is allowed to skip buildall for branch-2.1"
echo "comment_trigger=false" | tee -a "$GITHUB_OUTPUT"
echo "comment_skip=true" | tee -a "$GITHUB_OUTPUT"
else
echo "COMMENT_USER_ID ${COMMENT_USER_ID} is not allowed to skip buildall."
exit
fi
else
echo "comment_trigger=false" | tee -a "$GITHUB_OUTPUT"
echo "comment_skip=false" | tee -a "$GITHUB_OUTPUT"
echo "find no keyword in comment body, skip this action."
exit
fi
PULL_REQUEST_NUM="$(echo "${{ github.event.issue.pull_request.url }}" | awk -F/ '{print $NF}')"
COMMIT_ID_FROM_TRIGGER="$(curl -s -H "Authorization:Bearer ${{ secrets.GITHUB_TOKEN }}" "https://api.github.com/repos/${{ github.repository }}/pulls/${PULL_REQUEST_NUM}" | jq -r '.head.sha')"
TARGET_BRANCH="$(curl -s -H "Authorization:Bearer ${{ secrets.GITHUB_TOKEN }}" "https://api.github.com/repos/${{ github.repository }}/pulls/${PULL_REQUEST_NUM}" | jq -r '.base.ref')"
echo "PULL_REQUEST_NUM=${PULL_REQUEST_NUM}" | tee -a "$GITHUB_OUTPUT"
echo "COMMIT_ID_FROM_TRIGGER=${COMMIT_ID_FROM_TRIGGER}" | tee -a "$GITHUB_OUTPUT"
echo "TARGET_BRANCH='${TARGET_BRANCH}'" | tee -a "$GITHUB_OUTPUT"
Expand All @@ -71,7 +93,7 @@ jobs:
echo "COMMENT_REPEAT_TIMES=${COMMENT_REPEAT_TIMES}" | tee -a "$GITHUB_OUTPUT"
- name: "Checkout master"
if: ${{ fromJSON(steps.parse.outputs.comment_trigger) }}
if: ${{ fromJSON(steps.parse.outputs.comment_trigger) || fromJSON(steps.parse.outputs.comment_skip) }}
uses: actions/checkout@v4

- name: "Check if pr need run build"
Expand Down Expand Up @@ -364,3 +386,19 @@ jobs:
"performance" \
"${{ steps.parse.outputs.COMMENT_REPEAT_TIMES }}"
fi
- name: "Skip buildall"
if: ${{ fromJSON(steps.parse.outputs.comment_skip) }}
run: |
source ./regression-test/pipeline/common/teamcity-utils.sh
skip_build "${{ steps.parse.outputs.COMMIT_ID_FROM_TRIGGER }}" feut
skip_build "${{ steps.parse.outputs.COMMIT_ID_FROM_TRIGGER }}" beut
skip_build "${{ steps.parse.outputs.COMMIT_ID_FROM_TRIGGER }}" compile
skip_build "${{ steps.parse.outputs.COMMIT_ID_FROM_TRIGGER }}" p0
skip_build "${{ steps.parse.outputs.COMMIT_ID_FROM_TRIGGER }}" p1
skip_build "${{ steps.parse.outputs.COMMIT_ID_FROM_TRIGGER }}" external
skip_build "${{ steps.parse.outputs.COMMIT_ID_FROM_TRIGGER }}" performance
skip_build "${{ steps.parse.outputs.COMMIT_ID_FROM_TRIGGER }}" arm
skip_build "${{ steps.parse.outputs.COMMIT_ID_FROM_TRIGGER }}" cloud_p0
skip_build "${{ steps.parse.outputs.COMMIT_ID_FROM_TRIGGER }}" cloud_p1
skip_build "${{ steps.parse.outputs.COMMIT_ID_FROM_TRIGGER }}" cloudut
4 changes: 3 additions & 1 deletion .github/workflows/labeler/scope-label-conf.yml
Original file line number Diff line number Diff line change
Expand Up @@ -25,4 +25,6 @@ meta-change:
- gensrc/proto/*

doing:
- '**'
- base-branch: 'master'
- changed-files:
- any-glob-to-any-file: '**'
2 changes: 2 additions & 0 deletions .licenserc.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -92,4 +92,6 @@ header:
- "pytest/qe"
- "pytest/sys/data"
- "pytest/deploy/*.conf"
- "tools/jeprof"
- "tools/FlameGraph/*"
comment: on-failure
10 changes: 9 additions & 1 deletion LICENSE.txt
Original file line number Diff line number Diff line change
Expand Up @@ -725,4 +725,12 @@ Apache 2.0, Copyright 2023 SAP SE or an SAP affiliate company, Johannes Bechberg

This project is maintained by the SapMachine team at SAP SE

----------------------------------------------------------------------------------
----------------------------------------------------------------------------------

be/tools/FlameGraph/*.pl: COMMON DEVELOPMENT AND DISTRIBUTION LICENSE Version 1.0

Unless otherwise noted, all files in this distribution are released
under the Common Development and Distribution License (CDDL).
Exceptions are noted within the associated source files.

----------------------------------------------------------------------------------
92 changes: 92 additions & 0 deletions be/src/agent/be_exec_version_manager.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,92 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

#include "agent/be_exec_version_manager.h"

namespace doris {

const std::map<int, const std::set<std::string>> AGGREGATION_CHANGE_MAP = {
{AGGREGATION_2_1_VERSION,
{"window_funnel", "stddev_samp", "variance_samp", "percentile_approx_weighted",
"percentile_approx", "covar_samp", "percentile", "percentile_array"}}};

Status BeExecVersionManager::check_be_exec_version(int be_exec_version) {
if (be_exec_version > max_be_exec_version || be_exec_version < min_be_exec_version) {
return Status::InternalError(
"Received be_exec_version is not supported, be_exec_version={}, "
"min_be_exec_version={}, max_be_exec_version={}, maybe due to FE version not "
"match with BE.",
be_exec_version, min_be_exec_version, max_be_exec_version);
}
return Status::OK();
}

void BeExecVersionManager::check_agg_state_compatibility(int current_be_exec_version,
int data_be_exec_version,
std::string function_name) {
if (current_be_exec_version > AGGREGATION_2_1_VERSION &&
data_be_exec_version <= AGGREGATION_2_1_VERSION &&
AGGREGATION_CHANGE_MAP.find(AGGREGATION_2_1_VERSION)->second.contains(function_name)) {
throw Exception(Status::InternalError(
"agg state data with {} is not supported, "
"current_be_exec_version={}, data_be_exec_version={}, need to rebuild the data "
"or set the be_exec_version={} in fe.conf",
function_name, current_be_exec_version, data_be_exec_version,
AGGREGATION_2_1_VERSION));
}
}

/**
* When we have some breaking change for execute engine, we should update be_exec_version.
* NOTICE: The change could only be dont in X.Y.0 version. and if you introduced new version number N,
* remember remove version N-1's all REUSEABLE changes in master branch only. REUSEABLE means scalar or agg functions' replacement.
* If not, the old replacement will happens in the new version which is wrong.
*
* 0: not contain be_exec_version.
* 1: start from doris 1.2.0
* a. remove ColumnString terminating zero.
* b. runtime filter use new hash method.
* 2: start from doris 2.0.0
* a. function month/day/hour/minute/second's return type is changed to smaller type.
* b. in order to solve agg of sum/count is not compatibility during the upgrade process
* c. change the string hash method in runtime filter
* d. elt function return type change to nullable(string)
* e. add repeat_max_num in repeat function
* 3: start from doris 2.0.0 (by some mistakes)
* a. aggregation function do not serialize bitmap to string.
* b. support window funnel mode.
* 4/5: start from doris 2.1.0
* a. ignore this line, window funnel mode should be enabled from 2.0.
* b. array contains/position/countequal function return nullable in less situations.
* c. cleared old version of Version 2.
* d. unix_timestamp function support timestamp with float for datetimev2, and change nullable mode.
* e. change shuffle serialize/deserialize way
* f. shrink some function's nullable mode.
* g. do local merge of remote runtime filter
* h. "now": ALWAYS_NOT_NULLABLE -> DEPEND_ON_ARGUMENTS
*
* 7: start from doris 3.0.0
* a. change the impl of percentile (need fix)
* b. clear old version of version 3->4
* c. change FunctionIsIPAddressInRange from AlwaysNotNullable to DependOnArguments
* d. change some agg function nullable property: PR #37215
* e. change variant serde to fix PR #38413
*/
const int BeExecVersionManager::max_be_exec_version = 7;
const int BeExecVersionManager::min_be_exec_version = 0;

} // namespace doris
69 changes: 13 additions & 56 deletions be/src/agent/be_exec_version_manager.h
Original file line number Diff line number Diff line change
Expand Up @@ -20,24 +20,27 @@
#include <fmt/format.h>
#include <glog/logging.h>

#include "common/exception.h"
#include "common/status.h"

namespace doris {

constexpr inline int BITMAP_SERDE = 3;
constexpr inline int USE_NEW_SERDE = 4; // release on DORIS version 2.1
constexpr inline int OLD_WAL_SERDE = 3; // use to solve compatibility issues, see pr #32299
constexpr inline int AGG_FUNCTION_NULLABLE = 5; // change some agg nullable property: PR #37215
constexpr inline int VARIANT_SERDE = 6; // change variant serde to fix PR #38413
constexpr inline int AGGREGATION_2_1_VERSION =
5; // some aggregation changed the data format after this version

class BeExecVersionManager {
public:
BeExecVersionManager() = delete;

static Status check_be_exec_version(int be_exec_version) {
if (be_exec_version > max_be_exec_version || be_exec_version < min_be_exec_version) {
return Status::InternalError(
"Received be_exec_version is not supported, be_exec_version={}, "
"min_be_exec_version={}, max_be_exec_version={}, maybe due to FE version not "
"match with BE.",
be_exec_version, min_be_exec_version, max_be_exec_version);
}
return Status::OK();
}
static Status check_be_exec_version(int be_exec_version);

static void check_agg_state_compatibility(int current_be_exec_version, int data_be_exec_version,
std::string function_name);

static int get_newest_version() { return max_be_exec_version; }

Expand All @@ -46,50 +49,4 @@ class BeExecVersionManager {
static const int min_be_exec_version;
};

/**
* When we have some breaking change for execute engine, we should update be_exec_version.
* NOTICE: The change could only be dont in X.Y.0 version. and if you introduced new version number N,
* remember remove version N-1's all REUSEABLE changes in master branch only. REUSEABLE means scalar or agg functions' replacement.
* If not, the old replacement will happens in the new version which is wrong.
*
* 0: not contain be_exec_version.
* 1: start from doris 1.2.0
* a. remove ColumnString terminating zero.
* b. runtime filter use new hash method.
* 2: start from doris 2.0.0
* a. function month/day/hour/minute/second's return type is changed to smaller type.
* b. in order to solve agg of sum/count is not compatibility during the upgrade process
* c. change the string hash method in runtime filter
* d. elt function return type change to nullable(string)
* e. add repeat_max_num in repeat function
* 3: start from doris 2.0.0 (by some mistakes)
* a. aggregation function do not serialize bitmap to string.
* b. support window funnel mode.
* 4: start from doris 2.1.0
* a. ignore this line, window funnel mode should be enabled from 2.0.
* b. array contains/position/countequal function return nullable in less situations.
* c. cleared old version of Version 2.
* d. unix_timestamp function support timestamp with float for datetimev2, and change nullable mode.
* e. change shuffle serialize/deserialize way
* f. shrink some function's nullable mode.
* g. do local merge of remote runtime filter
* h. "now": ALWAYS_NOT_NULLABLE -> DEPEND_ON_ARGUMENTS
*
* 5: start from doris 3.0.0
* a. change the impl of percentile (need fix)
* b. clear old version of version 3->4
* c. change FunctionIsIPAddressInRange from AlwaysNotNullable to DependOnArguments
* d. change some agg function nullable property: PR #37215
* e. change variant serde to fix PR #38413
*/
constexpr inline int BeExecVersionManager::max_be_exec_version = 6;
constexpr inline int BeExecVersionManager::min_be_exec_version = 0;

/// functional
constexpr inline int BITMAP_SERDE = 3;
constexpr inline int USE_NEW_SERDE = 4; // release on DORIS version 2.1
constexpr inline int OLD_WAL_SERDE = 3; // use to solve compatibility issues, see pr #32299
constexpr inline int AGG_FUNCTION_NULLABLE = 5; // change some agg nullable property: PR #37215
constexpr inline int VARIANT_SERDE = 6; // change variant serde to fix PR #38413

} // namespace doris
Loading

0 comments on commit 7fa27f7

Please sign in to comment.