[KYUUBI #6545] Deprecate and remove building support for Spark 3.2
# 🔍 Description

This pull request removes building support for Spark 3.2, while keeping engine support for Spark 3.2 at runtime.

Mailing list discussion: https://lists.apache.org/thread/l74n5zl1w7s0bmr5ovxmxq58yqy8hqzc

- Remove the Maven profile `spark-3.2` and its references in docs, release scripts, etc.
- Keep the cross-version verification to ensure that the Spark SQL engine built against the default Spark version (3.5) still works on a Spark 3.2 runtime (a local reproduction sketch follows this list).
- Merge `kyuubi-extension-spark-common` into `kyuubi-extension-spark-3-3`
- Remove `log4j.properties`, since Spark moved to Log4j2 in 3.3 (SPARK-37814)
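
For reference, the retained cross-version check can be sketched locally. This is a minimal sketch, not part of this patch: the Spark 3.2.4 archive name, the engine module path, and the excluded test tag are illustrative assumptions modeled on the GHA matrix entries in `.github/workflows/master.yml`.

```bash
# Build against the default Spark profile (3.5); the spark-3.2 Maven profile
# no longer exists after this change.
build/mvn clean install -DskipTests -Pspark-3.5

# Re-run the engine tests on a Spark 3.2 runtime fetched from the Apache
# archive, the same way the verify-on-spark-binary jobs pass
# -Dspark.archive.mirror / -Dspark.archive.name to swap the Spark distribution
# under test. The module path and the 3.2.4 artifact name are illustrative.
build/mvn test -pl externals/kyuubi-spark-sql-engine -Pspark-3.5 \
  -Dspark.archive.mirror=https://archive.apache.org/dist/spark/spark-3.2.4 \
  -Dspark.archive.name=spark-3.2.4-bin-hadoop3.2.tgz \
  -Dmaven.plugin.scalatest.exclude.tags=org.scalatest.tags.Slow
```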

## Types of changes 🔖

- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [x] Breaking change (fix or feature that would cause existing functionality to change)

## Test Plan 🧪

Pass GHA.

---

# Checklist 📝

- [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)

**Be nice. Be informative.**

Closes #6545 from pan3793/deprecate-spark-3.2.


54c1725 [Cheng Pan] fix
f4602e8 [Cheng Pan] Deprecate and remove building support for Spark 3.2
2e083f8 [Cheng Pan] fix style
458a92c [Cheng Pan] nit
929e1df [Cheng Pan] Deprecate and remove building support for Spark 3.2

Authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
pan3793 committed Jul 22, 2024
1 parent 895755f commit 063a192
Showing 65 changed files with 62 additions and 1,755 deletions.
.github/workflows/license.yml (1 addition & 1 deletion)
@@ -45,7 +45,7 @@ jobs:
- run: >-
build/mvn org.apache.rat:apache-rat-plugin:check
-Ptpcds -Pkubernetes-it
-Pspark-3.2 -Pspark-3.3 -Pspark-3.4 -Pspark-3.5
-Pspark-3.3 -Pspark-3.4 -Pspark-3.5
- name: Upload rat report
if: failure()
uses: actions/upload-artifact@v3
.github/workflows/master.yml (5 deletions)
@@ -48,7 +48,6 @@ jobs:
- 8
- 17
spark:
- '3.2'
- '3.3'
- '3.4'
- '3.5'
@@ -81,10 +80,6 @@
spark-archive: '-Pscala-2.13 -Dspark.archive.mirror=https://archive.apache.org/dist/spark/spark-4.0.0-preview1 -Dspark.archive.name=spark-4.0.0-preview1-bin-hadoop3.tgz'
exclude-tags: '-Dmaven.plugin.scalatest.exclude.tags=org.scalatest.tags.Slow,org.apache.kyuubi.tags.DeltaTest,org.apache.kyuubi.tags.IcebergTest,org.apache.kyuubi.tags.PaimonTest,org.apache.kyuubi.tags.SparkLocalClusterTest'
comment: 'verify-on-spark-4.0-binary'
exclude:
# SPARK-33772: Spark supports JDK 17 since 3.3.0
- java: 17
spark: '3.2'
env:
SPARK_LOCAL_IP: localhost
steps:
.github/workflows/publish-snapshot-nexus.yml (4 additions & 8 deletions)
@@ -30,19 +30,15 @@ jobs:
matrix:
branch:
- master
- branch-1.7
- branch-1.8
- branch-1.9
profiles:
- -Pflink-provided,spark-provided,hive-provided,spark-3.2
- -Pflink-provided,spark-provided,hive-provided,spark-3.3,tpcds
- -Pflink-provided,spark-provided,hive-provided,spark-3.3
- -Pflink-provided,spark-provided,hive-provided,spark-3.4,tpcds
include:
- branch: master
profiles: -Pflink-provided,spark-provided,hive-provided,spark-3.4
- branch: master
profiles: -Pflink-provided,spark-provided,hive-provided,spark-3.5
- branch: branch-1.8
profiles: -Pflink-provided,spark-provided,hive-provided,spark-3.4
- branch: branch-1.8
- branch: branch-1.9
profiles: -Pflink-provided,spark-provided,hive-provided,spark-3.5
steps:
- uses: actions/checkout@v4
.github/workflows/style.yml (4 additions & 4 deletions)
@@ -34,7 +34,7 @@ jobs:
strategy:
matrix:
profiles:
- '-Pflink-provided,hive-provided,spark-provided,spark-3.5,spark-3.4,spark-3.3,spark-3.2,tpcds,kubernetes-it'
- '-Pflink-provided,hive-provided,spark-provided,spark-3.5,spark-3.4,spark-3.3,tpcds,kubernetes-it'

steps:
- uses: actions/checkout@v4
@@ -65,10 +65,10 @@ jobs:
if: steps.modules-check.conclusion == 'success' && steps.modules-check.outcome == 'failure'
run: |
MVN_OPT="-DskipTests -Dorg.slf4j.simpleLogger.defaultLogLevel=warn -Dmaven.javadoc.skip=true -Drat.skip=true -Dscalastyle.skip=true -Dspotless.check.skip"
build/mvn clean install ${MVN_OPT} -Pflink-provided,hive-provided,spark-provided,spark-3.2,tpcds
build/mvn clean install ${MVN_OPT} -pl extensions/spark/kyuubi-extension-spark-3-3,extensions/spark/kyuubi-spark-connector-hive -Pspark-3.3
build/mvn clean install ${MVN_OPT} -Pflink-provided,hive-provided,spark-provided,tpcds
build/mvn clean install ${MVN_OPT} -pl extensions/spark/kyuubi-extension-spark-3-3 -Pspark-3.3
build/mvn clean install ${MVN_OPT} -pl extensions/spark/kyuubi-extension-spark-3-4 -Pspark-3.4
build/mvn clean install ${MVN_OPT} -pl extensions/spark/kyuubi-extension-spark-3-5 -Pspark-3.5
build/mvn clean install ${MVN_OPT} -pl extensions/spark/kyuubi-extension-spark-3-5,extensions/spark/kyuubi-spark-connector-hive -Pspark-3.5
- name: Scalastyle with maven
id: scalastyle-check
build/release/release.sh (5 deletions)
@@ -110,11 +110,6 @@ upload_svn_staging() {
}

upload_nexus_staging() {
# Spark Extension Plugin for Spark 3.2
${KYUUBI_DIR}/build/mvn clean deploy -DskipTests -Papache-release,flink-provided,spark-provided,hive-provided,spark-3.2 \
-s "${KYUUBI_DIR}/build/release/asf-settings.xml" \
-pl extensions/spark/kyuubi-extension-spark-3-2 -am

# Spark Extension Plugin for Spark 3.3
${KYUUBI_DIR}/build/mvn clean deploy -DskipTests -Papache-release,flink-provided,spark-provided,hive-provided,spark-3.3 \
-s "${KYUUBI_DIR}/build/release/asf-settings.xml" \
dev/kyuubi-codecov/pom.xml (5 additions & 15 deletions)
@@ -155,21 +155,6 @@
</build>

<profiles>
<profile>
<id>spark-3.2</id>
<dependencies>
<dependency>
<groupId>org.apache.kyuubi</groupId>
<artifactId>kyuubi-extension-spark-3-2_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.kyuubi</groupId>
<artifactId>kyuubi-spark-authz_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
</dependencies>
</profile>
<profile>
<id>spark-3.3</id>
<dependencies>
@@ -203,6 +188,11 @@
<artifactId>kyuubi-spark-connector-hive_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>org.apache.kyuubi</groupId>
<artifactId>kyuubi-spark-authz_${scala.binary.version}</artifactId>
<version>${project.version}</version>
</dependency>
</dependencies>
</profile>
<profile>
dev/reformat (1 addition & 1 deletion)
@@ -20,7 +20,7 @@ set -x

KYUUBI_HOME="$(cd "`dirname "$0"`/.."; pwd)"

PROFILES="-Pflink-provided,hive-provided,spark-provided,spark-3.5,spark-3.4,spark-3.3,spark-3.2,tpcds,kubernetes-it"
PROFILES="-Pflink-provided,hive-provided,spark-provided,spark-3.5,spark-3.4,spark-3.3,tpcds,kubernetes-it"

# python style checks rely on `black` in path
if ! command -v black &> /dev/null
docs/connector/spark/hudi.rst (3 additions & 6 deletions)
@@ -45,7 +45,7 @@ The **classpath** of Kyuubi Spark SQL engine with Hudi supported consists of

1. kyuubi-spark-sql-engine-\ |release|\ _2.12.jar, the engine jar deployed with a Kyuubi distribution
2. a copy of Spark distribution
3. hudi-spark<spark.version>-bundle_<scala.version>-<hudi.version>.jar (example: hudi-spark3.2-bundle_2.12-0.11.1.jar), which can be found in the `Maven Central`_
3. hudi-spark<spark.version>-bundle_<scala.version>-<hudi.version>.jar (example: hudi-spark3.5-bundle_2.12-0.15.0.jar), which can be found in the `Maven Central`_

In order to make the Hudi packages visible for the runtime classpath of engines, we can use one of these methods:

Expand All @@ -61,15 +61,12 @@ To activate functionality of Hudi, we can set the following configurations:

.. code-block:: properties
# Spark 3.2
# Spark 3.2 to 3.5
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.spark.HoodieSparkKryoRegistrar
spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension
spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog
# Spark 3.1
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension
Hudi Operations
---------------

docs/contributing/code/building.md (1 deletion)
@@ -63,7 +63,6 @@ Since v1.1.0, Kyuubi support building with different Spark profiles,

| Profile | Default | Since |
|-------------|---------|-------|
| -Pspark-3.2 | | 1.4.0 |
| -Pspark-3.3 | | 1.6.0 |
| -Pspark-3.4 | | 1.8.0 |
| -Pspark-3.5 | ✓ | 1.8.0 |
docs/deployment/migration-guide.md (7 additions & 6 deletions)
@@ -21,16 +21,17 @@

* Since Kyuubi 1.10, `beeline` is deprecated and will be removed in the future, please use `kyuubi-beeline` instead.
* Since Kyuubi 1.10, the support of Spark engine for Spark 3.1 is removed.
* Since Kyuubi 1.10, the support of Spark engine for Spark 3.2 is deprecated, and will be removed in the future.
* Since Kyuubi 1.10, the support of Flink engine for Flink 1.16 is removed.

## Upgrading from Kyuubi 1.8 to 1.9

* Since Kyuubi 1.9.0, `kyuubi.session.conf.advisor` can be set as a sequence, Kyuubi supported chaining SessionConfAdvisors.
* Since Kyuubi 1.9.0, the support of Derby is removal for Kyuubi metastore.
* Since Kyuubi 1.9.0, the support of Spark SQL engine for Spark 3.1 is deprecated, and will be removed in the future.
* Since Kyuubi 1.9.0, the support of Spark extensions for Spark 3.1 is removed, please use Spark 3.2 or higher versions.
* Since Kyuubi 1.9.0, `kyuubi.frontend.login.timeout`, `kyuubi.frontend.thrift.login.timeout`, `kyuubi.frontend.backoff.slot.length`, `kyuubi.frontend.thrift.backoff.slot.length` are removed.
* Since Kyuubi 1.9.0, the support of Flink engine for Flink 1.16 is deprecated, and will be removed in the future.
* Since Kyuubi 1.9, `kyuubi.session.conf.advisor` can be set as a sequence, Kyuubi supported chaining SessionConfAdvisors.
* Since Kyuubi 1.9, the support of Derby is removal for Kyuubi metastore.
* Since Kyuubi 1.9, the support of Spark SQL engine for Spark 3.1 is deprecated, and will be removed in the future.
* Since Kyuubi 1.9, the support of Spark extensions for Spark 3.1 is removed, please use Spark 3.2 or higher versions.
* Since Kyuubi 1.9, `kyuubi.frontend.login.timeout`, `kyuubi.frontend.thrift.login.timeout`, `kyuubi.frontend.backoff.slot.length`, `kyuubi.frontend.thrift.backoff.slot.length` are removed.
* Since Kyuubi 1.9, the support of Flink engine for Flink 1.16 is deprecated, and will be removed in the future.

## Upgrading from Kyuubi 1.8.0 to 1.8.1

docs/deployment/spark/gluten.md (2 additions & 2 deletions)
@@ -18,7 +18,7 @@

# Gluten

[Gluten](https://oap-project.github.io/gluten/) is a Spark plugin developed by Intel, designed to accelerate Apache Spark with native libraries. Currently, only CentOS 7/8 and Ubuntu 20.04/22.04, along with Spark 3.2/3.3/3.4, are supported. Users can employ the following methods to utilize the Gluten with Velox native libraries.
[Gluten](https://oap-project.github.io/gluten/) is a Spark plugin developed by Intel, designed to accelerate Apache Spark with native libraries. Currently, only CentOS 7/8 and Ubuntu 20.04/22.04, along with Spark 3.3/3.4, are supported. Users can employ the following methods to utilize the Gluten with Velox native libraries.

## Building(with velox Backend)

@@ -30,7 +30,7 @@ Git clone gluten project, use gluten build script `buildbundle-veloxbe.sh`, and
git clone https://github.com/oap-project/gluten.git
cd /path/to/gluten

## The script builds two jars for spark 3.2.x, 3.3.x, and 3.4.x.
## The script builds two jars for spark 3.3.x, and 3.4.x.
./dev/buildbundle-veloxbe.sh
```

docs/extensions/engines/spark/rules.md (1 addition & 1 deletion)
@@ -46,7 +46,7 @@ And don't worry, Kyuubi will support the new Apache Spark version in the future.
| Kyuubi Spark SQL extension | Supported Spark version(s) | Available since | EOL | Bundled in Binary release tarball | Maven profile |
|----------------------------|----------------------------|------------------|-------|-----------------------------------|---------------|
| kyuubi-extension-spark-3-1 | 3.1.x | 1.3.0-incubating | 1.8.0 | 1.3.0-incubating | spark-3.1 |
| kyuubi-extension-spark-3-2 | 3.2.x | 1.4.0-incubating | N/A | 1.4.0-incubating | spark-3.2 |
| kyuubi-extension-spark-3-2 | 3.2.x | 1.4.0-incubating | 1.9.0 | 1.4.0-incubating | spark-3.2 |
| kyuubi-extension-spark-3-3 | 3.3.x | 1.6.0-incubating | N/A | 1.6.0-incubating | spark-3.3 |
| kyuubi-extension-spark-3-4 | 3.4.x | 1.8.0 | N/A | 1.8.0 | spark-3.4 |
| kyuubi-extension-spark-3-5 | 3.5.x | 1.8.0 | N/A | 1.9.0 | spark-3.5 |
docs/extensions/engines/spark/z-order.md (2 deletions)
@@ -78,8 +78,6 @@ This feature is inside Kyuubi extension, so you should apply the extension to Sp
- add extension jar: `copy $KYUUBI_HOME/extension/kyuubi-extension-spark-3-5* $SPARK_HOME/jars/`
- add config into `spark-defaults.conf`: `spark.sql.extensions=org.apache.kyuubi.sql.KyuubiSparkSQLExtension`

Due to the extension, z-order only works with Spark 3.2 and higher version.

### Optimize history data

If you want to optimize the history data of a table, the `OPTIMIZE ...` syntax is good to go. Due to Spark SQL doesn't support read and overwrite same datasource table, the syntax can only support to optimize Hive table.
extensions/spark/kyuubi-extension-spark-3-2/pom.xml (157 deletions)

This file was deleted.
