
minor: refactor to move decodeBatches to broadcast exchange code as private function #1195

Merged
1 commit merged into apache:main on Dec 22, 2024

Conversation

@andygrove (Member) commented on Dec 22, 2024

Which issue does this PR close?

N/A

Rationale for this change

This is a small refactor extracted from #1192.

What changes are included in this PR?

  • Remove the function executeColumnarCollectIterator and its associated test because it is not used anywhere else.
  • Move decodeBatches from CometExec to a private function in CometBroadcastExchangeExec.scala, since that is the only place it is needed (see the sketch after this list).
  • Add some comments
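
For context, here is a minimal sketch of what the moved helper could look like as a private method on CometBroadcastExchangeExec. The exact signature, the use of Spark's CompressionCodec, and the ArrowReaderIterator helper are assumptions based on the description above, not the merged code:

```scala
// Hypothetical sketch only: the signature and the ArrowReaderIterator helper
// are assumptions, not the merged implementation.
import java.io.DataInputStream
import java.nio.channels.Channels

import org.apache.spark.SparkEnv
import org.apache.spark.io.CompressionCodec
import org.apache.spark.sql.vectorized.ColumnarBatch
import org.apache.spark.util.io.ChunkedByteBuffer

private def decodeBatches(bytes: ChunkedByteBuffer, source: String): Iterator[ColumnarBatch] = {
  if (bytes.size == 0) {
    // Nothing was serialized for this broadcast block, so there is nothing to decode.
    Iterator.empty
  } else {
    // Wrap the serialized buffer in a (possibly compressed) input stream and
    // let an Arrow IPC reader yield ColumnarBatch instances lazily.
    val codec = CompressionCodec.createCodec(SparkEnv.get.conf)
    val in = new DataInputStream(codec.compressedInputStream(bytes.toInputStream()))
    new ArrowReaderIterator(Channels.newChannel(in), source)
  }
}
```

Making the helper private keeps the decoding logic next to its only call site and shrinks the public surface of CometExec.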

How are these changes tested?

Existing tests

@codecov-commenter

Codecov Report

Attention: Patch coverage is 71.42857% with 2 lines in your changes missing coverage. Please review.

Project coverage is 34.75%. Comparing base (ea6d205) to head (65f46a1).
Report is 1 commit behind head on main.

Files with missing lines                                  Patch %   Lines
...e/spark/sql/comet/CometBroadcastExchangeExec.scala     71.42%    1 Missing and 1 partial ⚠️
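
For reference, a patch coverage of 71.42857% with 2 uncovered lines corresponds to 5 of the 7 changed lines being hit (5 / 7 ≈ 71.43%).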
Additional details and impacted files
@@              Coverage Diff              @@
##               main    #1195       +/-   ##
=============================================
- Coverage     55.28%   34.75%   -20.53%     
- Complexity      868      958       +90     
=============================================
  Files           112      115        +3     
  Lines         10969    43623    +32654     
  Branches       2116     9517     +7401     
=============================================
+ Hits           6064    15161     +9097     
- Misses         3826    25514    +21688     
- Partials       1079     2948     +1869     


@andygrove andygrove changed the title minor: refactor to move decodeBatches to broacast exchange code as private function minor: refactor to move decodeBatches to broadcast exchange code as private function Dec 22, 2024
@andygrove andygrove merged commit 639fa2f into apache:main Dec 22, 2024
77 checks passed
@andygrove andygrove deleted the minor-refactor-decodeBatches branch December 22, 2024 19:25
dharanad pushed a commit to dharanad/datafusion-comet that referenced this pull request Jan 1, 2025
andygrove added a commit that referenced this pull request Jan 2, 2025
* feat: add support for array_contains expression

* test: add unit test for array_contains function

* Removes unnecessary case expression for handling null values

* chore: Move more expressions from core crate to spark-expr crate (#1152)

* move aggregate expressions to spark-expr crate

* move more expressions

* move benchmark

* normalize_nan

* bitwise not

* comet scalar funcs

* update bench imports

* remove dead code (#1155)

* fix: Spark 4.0-preview1 SPARK-47120 (#1156)

## Which issue does this PR close?

Part of #372 and #551

## Rationale for this change

To be ready for Spark 4.0

## What changes are included in this PR?

This PR fixes the new test SPARK-47120 added in Spark 4.0

## How are these changes tested?

tests enabled

* chore: Move string kernels and expressions to spark-expr crate (#1164)

* Move string kernels and expressions to spark-expr crate

* remove unused hash kernel

* remove unused dependencies

* chore: Move remaining expressions to spark-expr crate + some minor refactoring (#1165)

* move CheckOverflow to spark-expr crate

* move NegativeExpr to spark-expr crate

* move UnboundColumn to spark-expr crate

* move ExpandExec from execution::datafusion::operators to execution::operators

* refactoring to remove datafusion subpackage

* update imports in benches

* fix

* fix

* chore: Add ignored tests for reading complex types from Parquet (#1167)

* Add ignored tests for reading structs from Parquet

* add basic map test

* add tests for Map and Array

* feat: Add Spark-compatible implementation of SchemaAdapterFactory (#1169)

* Add Spark-compatible SchemaAdapterFactory implementation

* remove prototype code

* fix

* refactor

* implement more cast logic

* implement more cast logic

* add basic test

* improve test

* cleanup

* fmt

* add support for casting unsigned int to signed int

* clippy

* address feedback

* fix test

* fix: Document enabling comet explain plan usage in Spark (4.0) (#1176)

* test: enabling Spark tests with offHeap requirement (#1177)

## Which issue does this PR close?

## Rationale for this change

After #1062 we have not been running Spark tests for native execution

## What changes are included in this PR?

Removed the off-heap requirement for testing

## How are these changes tested?

Bringing back Spark tests for native execution

* feat: Improve shuffle metrics (second attempt) (#1175)

* improve shuffle metrics

* docs

* more metrics

* refactor

* address feedback

* fix: stddev_pop should not directly return 0.0 when count is 1.0 (#1184)

* add test

* fix

* fix

* fix

* feat: Make native shuffle compression configurable and respect `spark.shuffle.compress` (#1185)

* Make shuffle compression codec and level configurable

* remove lz4 references

* docs

* update comment

* clippy

* fix benches

* clippy

* clippy

* disable test for miri

* remove lz4 reference from proto

* minor: move shuffle classes from common to spark (#1193)

* minor: refactor decodeBatches to make private in broadcast exchange (#1195)

* minor: refactor prepare_output so that it does not require an ExecutionContext (#1194)

* fix: fix missing explanation for then branch in case when (#1200)

* minor: remove unused source files (#1202)

* chore: Upgrade to DataFusion 44.0.0-rc2 (#1154)

* move aggregate expressions to spark-expr crate

* move more expressions

* move benchmark

* normalize_nan

* bitwise not

* comet scalar funcs

* update bench imports

* save

* save

* save

* remove unused imports

* clippy

* implement more hashers

* implement Hash and PartialEq

* implement Hash and PartialEq

* implement Hash and PartialEq

* benches

* fix ScalarUDFImpl.return_type failure

* exclude test from miri

* ignore correct test

* ignore another test

* remove miri checks

* use return_type_from_exprs

* Revert "use return_type_from_exprs"

This reverts commit febc1f1.

* use DF main branch

* hacky workaround for regression in ScalarUDFImpl.return_type

* fix repo url

* pin to revision

* bump to latest rev

* bump to latest DF rev

* bump DF to rev 9f530dd

* add Cargo.lock

* bump DF version

* no default features

* Revert "remove miri checks"

This reverts commit 4638fe3.

* Update pin to DataFusion e99e02b9b9093ceb0c13a2dd32a2a89beba47930

* update pin

* Update Cargo.toml

Bump to 44.0.0-rc2

* update cargo lock

* revert miri change

---------

Co-authored-by: Andrew Lamb <[email protected]>

* update UT

Signed-off-by: Dharan Aditya <[email protected]>

* fix typo in UT

Signed-off-by: Dharan Aditya <[email protected]>

---------

Signed-off-by: Dharan Aditya <[email protected]>
Co-authored-by: Andy Grove <[email protected]>
Co-authored-by: KAZUYUKI TANIMURA <[email protected]>
Co-authored-by: Parth Chandra <[email protected]>
Co-authored-by: Liang-Chi Hsieh <[email protected]>
Co-authored-by: Raz Luvaton <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>