From 49d84f88cc6a083fca90a477a210296fded3fcba Mon Sep 17 00:00:00 2001 From: comphead Date: Tue, 5 Mar 2024 17:22:09 -0800 Subject: [PATCH 1/3] doc: Add initial doc how to expand Comet exceptions --- DEBUGGING.md | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 56 insertions(+) diff --git a/DEBUGGING.md b/DEBUGGING.md index e348b7215..fa2425cfe 100644 --- a/DEBUGGING.md +++ b/DEBUGGING.md @@ -30,6 +30,62 @@ LLDB or the LLDB that is bundled with XCode. We will use the LLDB packaged with _Caveat: The steps here have only been tested with JDK 11_ on Mac (M1) +# Expand Comet exception details +By default, Comet outputs the exception details specific for Comet. There is a possibility of extending the exception +details by leveraging Datafusion [backtraces](https://arrow.apache.org/datafusion/user-guide/example-usage.html#enable-backtraces) + +```scala +scala> spark.sql("my_failing_query").show(false) + +24/03/05 17:00:07 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)/ 1] +org.apache.comet.CometNativeException: Internal error: MIN/MAX is not expected to receive scalars of incompatible types (Date32("NULL"), Int32(15901)). 
+This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker + at org.apache.comet.Native.executePlan(Native Method) + at org.apache.comet.CometExecIterator.executeNative(CometExecIterator.scala:65) + at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:111) + at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:126) + +``` +To do that with Comet it is needed to enable `backtrace` in https://github.com/apache/arrow-datafusion-comet/blob/main/core/Cargo.toml + +``` +datafusion-common = { version = "36.0.0", features = ["backtrace"] } +datafusion = { default-features = false, version = "36.0.0", features = ["unicode_expressions", "backtrace"] } +``` + +Then build the Comet as [described](https://github.com/apache/arrow-datafusion-comet/blob/main/README.md#getting-started) + +Start Comet with `RUST_BACKTRACE=1` + +```commandline +RUST_BACKTRACE=1 $SPARK_HOME/spark-shell --jars spark/target/comet-spark-spark3.4_2.12-0.1.0-SNAPSHOT.jar --conf spark.sql.extensions=org.apache.comet.CometSparkSessionExtensions --conf spark.comet.enabled=true --conf spark.comet.exec.enabled=true --conf spark.comet.exec.all.enabled=true +``` + +Get the expanded exception details +```scala +scala> spark.sql("my_failing_query").show(false) +24/03/05 17:00:49 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0) +org.apache.comet.CometNativeException: Internal error: MIN/MAX is not expected to receive scalars of incompatible types (Date32("NULL"), Int32(15901)) + +backtrace: 0: std::backtrace::Backtrace::create +1: datafusion_physical_expr::aggregate::min_max::min +2: ::update_batch + 3: as futures_core::stream::Stream>::poll_next +4: comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}} +5: _Java_org_apache_comet_Native_executePlan +. 
+This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker + at org.apache.comet.Native.executePlan(Native Method) +at org.apache.comet.CometExecIterator.executeNative(CometExecIterator.scala:65) +at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:111) +at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:126) + +... +``` +Note: +- The backtrace coverage in Datafusion is still improving. So there is a chance the error still not covered, feel free to file a [ticket](https://github.com/apache/arrow-datafusion/issues) +- The backtrace doesn't come for free and therefore intended for debugging purposes + ## Debugging for Advanced Developers Add a `.lldbinit` to comet/core. This is not strictly necessary but will be useful if you want to From 04d7ac70d5afe3067c0cd14bec07576009063d62 Mon Sep 17 00:00:00 2001 From: comphead Date: Wed, 6 Mar 2024 10:01:10 -0800 Subject: [PATCH 2/3] doc: Add initial doc how to expand Comet exceptions. refmt --- DEBUGGING.md | 115 ++++++++++++++++++++++++++------------------------- 1 file changed, 59 insertions(+), 56 deletions(-) diff --git a/DEBUGGING.md b/DEBUGGING.md index fa2425cfe..b2c2c5c1e 100644 --- a/DEBUGGING.md +++ b/DEBUGGING.md @@ -30,62 +30,6 @@ LLDB or the LLDB that is bundled with XCode. We will use the LLDB packaged with _Caveat: The steps here have only been tested with JDK 11_ on Mac (M1) -# Expand Comet exception details -By default, Comet outputs the exception details specific for Comet. 
There is a possibility of extending the exception -details by leveraging Datafusion [backtraces](https://arrow.apache.org/datafusion/user-guide/example-usage.html#enable-backtraces) - -```scala -scala> spark.sql("my_failing_query").show(false) - -24/03/05 17:00:07 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)/ 1] -org.apache.comet.CometNativeException: Internal error: MIN/MAX is not expected to receive scalars of incompatible types (Date32("NULL"), Int32(15901)). -This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker - at org.apache.comet.Native.executePlan(Native Method) - at org.apache.comet.CometExecIterator.executeNative(CometExecIterator.scala:65) - at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:111) - at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:126) - -``` -To do that with Comet it is needed to enable `backtrace` in https://github.com/apache/arrow-datafusion-comet/blob/main/core/Cargo.toml - -``` -datafusion-common = { version = "36.0.0", features = ["backtrace"] } -datafusion = { default-features = false, version = "36.0.0", features = ["unicode_expressions", "backtrace"] } -``` - -Then build the Comet as [described](https://github.com/apache/arrow-datafusion-comet/blob/main/README.md#getting-started) - -Start Comet with `RUST_BACKTRACE=1` - -```commandline -RUST_BACKTRACE=1 $SPARK_HOME/spark-shell --jars spark/target/comet-spark-spark3.4_2.12-0.1.0-SNAPSHOT.jar --conf spark.sql.extensions=org.apache.comet.CometSparkSessionExtensions --conf spark.comet.enabled=true --conf spark.comet.exec.enabled=true --conf spark.comet.exec.all.enabled=true -``` - -Get the expanded exception details -```scala -scala> spark.sql("my_failing_query").show(false) -24/03/05 17:00:49 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0) -org.apache.comet.CometNativeException: Internal error: MIN/MAX is not expected to receive scalars of 
incompatible types (Date32("NULL"), Int32(15901)) - -backtrace: 0: std::backtrace::Backtrace::create -1: datafusion_physical_expr::aggregate::min_max::min -2: ::update_batch - 3: as futures_core::stream::Stream>::poll_next -4: comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}} -5: _Java_org_apache_comet_Native_executePlan -. -This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker - at org.apache.comet.Native.executePlan(Native Method) -at org.apache.comet.CometExecIterator.executeNative(CometExecIterator.scala:65) -at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:111) -at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:126) - -... -``` -Note: -- The backtrace coverage in Datafusion is still improving. So there is a chance the error still not covered, feel free to file a [ticket](https://github.com/apache/arrow-datafusion/issues) -- The backtrace doesn't come for free and therefore intended for debugging purposes - ## Debugging for Advanced Developers Add a `.lldbinit` to comet/core. This is not strictly necessary but will be useful if you want to @@ -150,3 +94,62 @@ https://mail.openjdk.org/pipermail/hotspot-dev/2019-September/039429.html Detecting the debugger https://stackoverflow.com/questions/5393403/can-a-java-application-detect-that-a-debugger-is-attached#:~:text=No.,to%20let%20your%20app%20continue.&text=I%20know%20that%20those%20are,meant%20with%20my%20first%20phrase). + +# Verbose debug +By default, Comet outputs the exception details specific for Comet. + +```scala +scala> spark.sql("my_failing_query").show(false) + +24/03/05 17:00:07 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)/ 1] +org.apache.comet.CometNativeException: Internal error: MIN/MAX is not expected to receive scalars of incompatible types (Date32("NULL"), Int32(15901)). 
+This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
+	at org.apache.comet.Native.executePlan(Native Method)
+	at org.apache.comet.CometExecIterator.executeNative(CometExecIterator.scala:65)
+	at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:111)
+	at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:126)
+
+```
+
+There is a verbose exception option by leveraging Datafusion [backtraces](https://arrow.apache.org/datafusion/user-guide/example-usage.html#enable-backtraces)
+This option allows to append native Datafusion stacktrace to the original error message.
+To enable this option with Comet it is needed to include `backtrace` feature in [Cargo.toml](https://github.com/apache/arrow-datafusion-comet/blob/main/core/Cargo.toml) for Datafusion dependencies
+
+```
+datafusion-common = { version = "36.0.0", features = ["backtrace"] }
+datafusion = { default-features = false, version = "36.0.0", features = ["unicode_expressions", "backtrace"] }
+```
+
+Then build Comet as [described](https://github.com/apache/arrow-datafusion-comet/blob/main/README.md#getting-started)
+
+Start Comet with `RUST_BACKTRACE=1`:
+
+```commandline
+RUST_BACKTRACE=1 $SPARK_HOME/spark-shell --jars spark/target/comet-spark-spark3.4_2.12-0.1.0-SNAPSHOT.jar --conf spark.sql.extensions=org.apache.comet.CometSparkSessionExtensions --conf spark.comet.enabled=true --conf spark.comet.exec.enabled=true --conf spark.comet.exec.all.enabled=true
+```
+
+Get the expanded exception details:
+```scala
+scala> spark.sql("my_failing_query").show(false)
+24/03/05 17:00:49 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
+org.apache.comet.CometNativeException: Internal error: MIN/MAX is not expected to receive scalars of incompatible types (Date32("NULL"), Int32(15901))
+
+backtrace:
+   0: std::backtrace::Backtrace::create
+   1: datafusion_physical_expr::aggregate::min_max::min
+   2: 
::update_batch + 3: as futures_core::stream::Stream>::poll_next + 4: comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}} + 5: _Java_org_apache_comet_Native_executePlan + ... +This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker + at org.apache.comet.Native.executePlan(Native Method) +at org.apache.comet.CometExecIterator.executeNative(CometExecIterator.scala:65) +at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:111) +at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:126) + +... +``` +Note: +- The backtrace coverage in Datafusion is still improving. So there is a chance the error still not covered, feel free to file a [ticket](https://github.com/apache/arrow-datafusion/issues) +- The backtrace doesn't come for free and therefore intended for debugging purposes From c2f2d9b4106c1c2204c9088151e816fcfc153339 Mon Sep 17 00:00:00 2001 From: comphead Date: Wed, 6 Mar 2024 10:18:28 -0800 Subject: [PATCH 3/3] doc: Add initial doc how to expand Comet exceptions. refmt --- DEBUGGING.md | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/DEBUGGING.md b/DEBUGGING.md index b2c2c5c1e..754316ad5 100644 --- a/DEBUGGING.md +++ b/DEBUGGING.md @@ -111,9 +111,9 @@ This was likely caused by a bug in DataFusion's code and we would welcome that y ``` -There is a verbose exception option by leveraging Datafusion [backtraces](https://arrow.apache.org/datafusion/user-guide/example-usage.html#enable-backtraces) -This option allows to append native Datafusion stacktrace to the original error message. 
-To enable this option with Comet it is needed to include `backtrace` feature in [Cargo.toml](https://github.com/apache/arrow-datafusion-comet/blob/main/core/Cargo.toml) for Datafusion dependencies
+A more verbose exception message is available by leveraging DataFusion [backtraces](https://arrow.apache.org/datafusion/user-guide/example-usage.html#enable-backtraces).
+This option appends the native DataFusion stack trace to the original error message.
+To enable it in Comet, include the `backtrace` feature in [Cargo.toml](https://github.com/apache/arrow-datafusion-comet/blob/main/core/Cargo.toml) for the DataFusion dependencies:
 
 ```
 datafusion-common = { version = "36.0.0", features = ["backtrace"] }
 datafusion = { default-features = false, version = "36.0.0", features = ["unicode_expressions", "backtrace"] }
@@ -141,15 +141,16 @@ backtrace:
    3: as futures_core::stream::Stream>::poll_next
    4: comet::execution::jni_api::Java_org_apache_comet_Native_executePlan::{{closure}}
    5: _Java_org_apache_comet_Native_executePlan
-   ...
+   (reduced)
+
 This was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker
 	at org.apache.comet.Native.executePlan(Native Method)
 at org.apache.comet.CometExecIterator.executeNative(CometExecIterator.scala:65)
 at org.apache.comet.CometExecIterator.getNextBatch(CometExecIterator.scala:111)
 at org.apache.comet.CometExecIterator.hasNext(CometExecIterator.scala:126)
+(reduced)
 
-...
 ```
 Note:
-- The backtrace coverage in Datafusion is still improving. So there is a chance the error still not covered, feel free to file a [ticket](https://github.com/apache/arrow-datafusion/issues)
-- The backtrace doesn't come for free and therefore intended for debugging purposes
+- Backtrace coverage in DataFusion is still improving, so a given error may not be covered yet; if so, feel free to file a [ticket](https://github.com/apache/arrow-datafusion/issues)
+- Backtrace evaluation comes with a performance cost and is intended mostly for debugging purposes
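
The patches above involve two separate switches: DataFusion's `backtrace` Cargo feature controls whether a `std::backtrace::Backtrace` is captured when an error is constructed, while the `RUST_BACKTRACE` environment variable tells the Rust standard library whether `Backtrace::capture` actually records frames. A minimal standalone sketch of that gating, using only `std` (no DataFusion dependency; this illustrates Rust's documented `std::backtrace` semantics, not Comet's code):

```rust
use std::backtrace::{Backtrace, BacktraceStatus};

fn main() {
    // `capture()` honors RUST_BACKTRACE / RUST_LIB_BACKTRACE: if neither
    // is set, no frames are recorded and status() reports `Disabled`.
    let gated = Backtrace::capture();
    println!("capture():       {:?}", gated.status());

    // `force_capture()` records frames regardless of the environment,
    // so its status is never `Disabled`.
    let forced = Backtrace::force_capture();
    assert!(!matches!(forced.status(), BacktraceStatus::Disabled));
    println!("force_capture(): {:?}", forced.status());
}
```

Running this without `RUST_BACKTRACE` set and then with `RUST_BACKTRACE=1` shows the first status flip from `Disabled` to `Captured`, which is exactly why the spark-shell command above exports the variable.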