
Commit

[SPARK-48152][BUILD] Make spark-profiler as a part of release and publish to maven central repo

### What changes were proposed in this pull request?
This PR aims to:
- make the module `spark-profiler` a part of the Spark release
- publish the module `spark-profiler` to the Maven Central repository
- add instructions on how to build with `spark-profiler` support to `docs/building-spark.md`

### Why are the changes needed?
1. The modules released in the current daily `spark-4.0.0` snapshots do not include `spark-profiler`. Under the current logic, `spark-profiler` would also be missing from future official Spark releases.

2. It aligns `docs/building-spark.md` with the build instructions already given for other modules, e.g.:
<img width="935" alt="image" src="https://github.com/apache/spark/assets/15246973/7ecb5cf2-6ab4-46f5-ae2f-76ef944bdda4">

### Does this PR introduce _any_ user-facing change?
Yes. It lets users pick up `spark-profiler` from future Spark releases instead of manually compiling it from source.

### How was this patch tested?
- Pass GitHub Actions (GA).
- Observe whether the daily snapshot `spark-profiler_2.13` artifacts are generated at
https://repository.apache.org/content/repositories/snapshots/org/apache/spark/spark-profiler_2.13/4.0.0-SNAPSHOT/

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes apache#46402 from panbingkun/jvm_profiler.

Authored-by: panbingkun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
panbingkun authored and dongjoon-hyun committed May 8, 2024
1 parent 5e49665 commit 553e1b8
Showing 7 changed files with 23 additions and 9 deletions.
10 changes: 5 additions & 5 deletions .github/workflows/maven_test.yml
@@ -190,18 +190,18 @@ jobs:
export ENABLE_KINESIS_TESTS=0
# Replace with the real module name, for example, connector#kafka-0-10 -> connector/kafka-0-10
export TEST_MODULES=`echo "$MODULES_TO_TEST" | sed -e "s%#%/%g"`
./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pspark-ganglia-lgpl -Pkinesis-asl -Djava.version=${JAVA_VERSION/-ea} clean install
./build/mvn $MAVEN_CLI_OPTS -DskipTests -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pjvm-profiler -Pspark-ganglia-lgpl -Pkinesis-asl -Djava.version=${JAVA_VERSION/-ea} clean install
if [[ "$INCLUDED_TAGS" != "" ]]; then
./build/mvn $MAVEN_CLI_OPTS -pl "$TEST_MODULES" -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pspark-ganglia-lgpl -Pkinesis-asl -Djava.version=${JAVA_VERSION/-ea} -Dtest.include.tags="$INCLUDED_TAGS" test -fae
./build/mvn $MAVEN_CLI_OPTS -pl "$TEST_MODULES" -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pjvm-profiler -Pspark-ganglia-lgpl -Pkinesis-asl -Djava.version=${JAVA_VERSION/-ea} -Dtest.include.tags="$INCLUDED_TAGS" test -fae
elif [[ "$MODULES_TO_TEST" == "connect" ]]; then
./build/mvn $MAVEN_CLI_OPTS -Dtest.exclude.tags="$EXCLUDED_TAGS" -Djava.version=${JAVA_VERSION/-ea} -pl connector/connect/client/jvm,connector/connect/common,connector/connect/server test -fae
elif [[ "$EXCLUDED_TAGS" != "" ]]; then
./build/mvn $MAVEN_CLI_OPTS -pl "$TEST_MODULES" -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pspark-ganglia-lgpl -Pkinesis-asl -Djava.version=${JAVA_VERSION/-ea} -Dtest.exclude.tags="$EXCLUDED_TAGS" test -fae
./build/mvn $MAVEN_CLI_OPTS -pl "$TEST_MODULES" -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pjvm-profiler -Pspark-ganglia-lgpl -Pkinesis-asl -Djava.version=${JAVA_VERSION/-ea} -Dtest.exclude.tags="$EXCLUDED_TAGS" test -fae
elif [[ "$MODULES_TO_TEST" == *"sql#hive-thriftserver"* ]]; then
# To avoid a compilation loop, for the `sql/hive-thriftserver` module, run `clean install` instead
./build/mvn $MAVEN_CLI_OPTS -pl "$TEST_MODULES" -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pspark-ganglia-lgpl -Pkinesis-asl -Djava.version=${JAVA_VERSION/-ea} clean install -fae
./build/mvn $MAVEN_CLI_OPTS -pl "$TEST_MODULES" -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Phadoop-cloud -Pjvm-profiler -Pspark-ganglia-lgpl -Pkinesis-asl -Djava.version=${JAVA_VERSION/-ea} clean install -fae
else
./build/mvn $MAVEN_CLI_OPTS -pl "$TEST_MODULES" -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Pspark-ganglia-lgpl -Phadoop-cloud -Pkinesis-asl -Djava.version=${JAVA_VERSION/-ea} test -fae
./build/mvn $MAVEN_CLI_OPTS -pl "$TEST_MODULES" -Pyarn -Pkubernetes -Pvolcano -Phive -Phive-thriftserver -Pspark-ganglia-lgpl -Phadoop-cloud -Pjvm-profiler -Pkinesis-asl -Djava.version=${JAVA_VERSION/-ea} test -fae
fi
- name: Clean up local Maven repository
run: |
2 changes: 1 addition & 1 deletion connector/profiler/README.md
@@ -23,7 +23,7 @@ Code profiling is currently only supported for
To get maximum profiling information set the following jvm options for the executor :

```
-XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints -XX:+PreserveFramePointer
spark.executor.extraJavaOptions=-XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints -XX:+PreserveFramePointer
```

For more information on async_profiler see the [Async Profiler Manual](https://krzysztofslusarski.github.io/2022/12/12/async-manual.html)
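In practice, the executor JVM options above would be set together with enabling the profiler plugin itself. A hedged sketch of a `spark-defaults.conf` fragment follows — the plugin class name and `spark.executor.profiling.enabled` key are assumptions based on this module's documentation, not part of this diff:

```
# Sketch only — the plugin class and profiling key below are assumptions,
# not introduced by this commit.
spark.plugins                     org.apache.spark.executor.profiler.ExecutorProfilerPlugin
spark.executor.profiling.enabled  true
spark.executor.extraJavaOptions   -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints -XX:+PreserveFramePointer
```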
6 changes: 5 additions & 1 deletion connector/profiler/pom.xml
@@ -31,6 +31,9 @@
</properties>
<packaging>jar</packaging>
<name>Spark Profiler</name>
<description>
Enables code profiling of executors based on the async profiler.
</description>
<url>https://spark.apache.org/</url>

<dependencies>
@@ -44,7 +47,8 @@
<dependency>
<groupId>me.bechberger</groupId>
<artifactId>ap-loader-all</artifactId>
<version>3.0-9</version>
<version>${ap-loader.version}</version>
<scope>provided</scope>
</dependency>
</dependencies>
</project>
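Once the module is published to Maven Central, downstream users could depend on it directly. A hypothetical consumer-side declaration — the coordinates follow this PR's artifactId, but the version shown is illustrative:

```xml
<!-- Hypothetical consumer POM fragment: artifactId spark-profiler_2.13
     matches this PR; the version is illustrative only. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-profiler_2.13</artifactId>
  <version>4.0.0</version>
</dependency>
```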
2 changes: 1 addition & 1 deletion dev/create-release/release-build.sh
@@ -201,7 +201,7 @@ SCALA_2_12_PROFILES="-Pscala-2.12"
HIVE_PROFILES="-Phive -Phive-thriftserver"
# Profiles for publishing snapshots and release to Maven Central
# We use Apache Hive 2.3 for publishing
PUBLISH_PROFILES="$BASE_PROFILES $HIVE_PROFILES -Pspark-ganglia-lgpl -Pkinesis-asl -Phadoop-cloud"
PUBLISH_PROFILES="$BASE_PROFILES $HIVE_PROFILES -Pspark-ganglia-lgpl -Pkinesis-asl -Phadoop-cloud -Pjvm-profiler"
# Profiles for building binary releases
BASE_RELEASE_PROFILES="$BASE_PROFILES -Psparkr"

2 changes: 1 addition & 1 deletion dev/test-dependencies.sh
@@ -31,7 +31,7 @@ export LC_ALL=C
# NOTE: These should match those in the release publishing script, and be kept in sync with
# dev/create-release/release-build.sh
HADOOP_MODULE_PROFILES="-Phive-thriftserver -Pkubernetes -Pyarn -Phive \
-Pspark-ganglia-lgpl -Pkinesis-asl -Phadoop-cloud"
-Pspark-ganglia-lgpl -Pkinesis-asl -Phadoop-cloud -Pjvm-profiler"
MVN="build/mvn"
HADOOP_HIVE_PROFILES=(
hadoop-3-hive-2.3
7 changes: 7 additions & 0 deletions docs/building-spark.md
@@ -117,6 +117,13 @@ where `spark-streaming_{{site.SCALA_BINARY_VERSION}}` is the `artifactId` as def

./build/mvn -Pconnect -DskipTests clean package

## Building with JVM Profile support

./build/mvn -Pjvm-profiler -DskipTests clean package

**Note:** The `jvm-profiler` profile builds the assembly without including the `ap-loader` dependency;
you can download it manually from the Maven Central repository and use it together with `spark-profiler_{{site.SCALA_BINARY_VERSION}}`.
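Since `ap-loader` has `provided` scope, its jar must be fetched separately. A minimal sketch, assuming the standard Maven Central directory layout and the coordinates from `connector/profiler/pom.xml` (`me.bechberger:ap-loader-all` at `<ap-loader.version>`):

```shell
# Compose the Maven Central download URL for ap-loader
# (assumption: standard repo1.maven.org directory layout).
GROUP_PATH=me/bechberger
ARTIFACT=ap-loader-all
VERSION=3.0-9   # keep in sync with <ap-loader.version> in pom.xml
URL="https://repo1.maven.org/maven2/${GROUP_PATH}/${ARTIFACT}/${VERSION}/${ARTIFACT}-${VERSION}.jar"
echo "$URL"
# Fetch it and ship it alongside the profiler module, e.g.:
#   curl -fLO "$URL"
#   spark-submit --jars ${ARTIFACT}-${VERSION}.jar ...
```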

## Continuous Compilation

We use the scala-maven-plugin which supports incremental and continuous compilation. E.g.
3 changes: 3 additions & 0 deletions pom.xml
@@ -297,6 +297,9 @@
<mima.version>1.1.3</mima.version>
<tomcat.annotations.api.version>6.0.53</tomcat.annotations.api.version>

<!-- Version used in Profiler -->
<ap-loader.version>3.0-9</ap-loader.version>

<CodeCacheSize>128m</CodeCacheSize>
<!-- Needed for consistent times -->
<maven.build.timestamp.format>yyyy-MM-dd HH:mm:ss z</maven.build.timestamp.format>
