fix: address failure caused by method signature change in SPARK-48791 #693

Merged
4 commits merged on Jul 20, 2024

Conversation

parthchandra
Contributor

Which issue does this PR close?

Closes #692

Rationale for this change

A private method in TaskMetrics that Comet calls changed its signature in SPARK-48791, which breaks Comet.

What changes are included in this PR?

Use reflection to call the method
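
For readers unfamiliar with the technique, here is a minimal, Spark-free sketch of why a reflective call can survive this kind of change. It assumes the change is to the method's return type. `OldMetrics`, `NewMetrics`, and `lastAccumulator` are hypothetical stand-ins for illustration, not the code in this PR. A directly compiled call is bound to the full method descriptor, return type included, while a reflective lookup matches only the name and parameter types.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Optional;

public class ReflectiveCall {

  // Hypothetical stand-in for the library class before the change:
  // the private method returns a List.
  static class OldMetrics {
    private List<String> externalAccums() {
      return new ArrayList<>(Arrays.asList("a", "b"));
    }
  }

  // Hypothetical stand-in after the change: same name, different return type.
  static class NewMetrics {
    private Collection<String> externalAccums() {
      return Arrays.asList("x", "y", "z");
    }
  }

  // Looks the method up by name at runtime, so the caller is not compiled
  // against a specific return type. Returns the last element, if any.
  static Optional<Object> lastAccumulator(Object metrics) {
    try {
      Method m = metrics.getClass().getDeclaredMethod("externalAccums");
      m.setAccessible(true); // the method is private
      Object result = m.invoke(metrics);
      if (result instanceof Collection) {
        Object last = null;
        for (Object item : (Collection<?>) result) {
          last = item;
        }
        return Optional.ofNullable(last);
      }
      return Optional.empty(); // unknown return type: give up quietly
    } catch (ReflectiveOperationException e) {
      return Optional.empty(); // method missing or not invocable
    }
  }

  public static void main(String[] args) {
    System.out.println(lastAccumulator(new OldMetrics())); // Optional[b]
    System.out.println(lastAccumulator(new NewMetrics())); // Optional[z]
  }
}
```

The trade-off is that the compiler no longer checks the call, so the sketch falls back to an empty result when the method is missing or returns something unexpected.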

How are these changes tested?

Built Spark locally with the breaking change, then ran HiveParquetSuite against it.

@@ -33,4 +37,24 @@ object ShimBatchReader {
Array.empty[String],
0,
0)

def getTaskAccumulator(taskMetrics: TaskMetrics): Option[AccumulatorV2[_, _]] = {
Member

This looks good, but it seems to be the same implementation in all of the shims? Does it even need to be in the shim layer?

Contributor Author

I could move this into the BatchReader itself. Perhaps that is cleaner.

import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.execution.datasources.PartitionedFile
import org.apache.spark.util.AccumulatorV2

object ShimBatchReader {
Member

nit: the shimmed method gets an accumulator from TaskMetrics and does not seem specific to BatchReader, so maybe it should be in a dedicated ShimTaskMetric object?

Contributor Author

I believe we tend to put methods in the class where they are used. Anyway, let me move this to BatchReader and avoid the code duplication.

@parthchandra
Contributor Author

@andygrove removed the shims and moved the code into BatchReader.

@codecov-commenter

Codecov Report

Attention: Patch coverage is 0% with 12 lines in your changes missing coverage. Please review.

Project coverage is 33.69%. Comparing base (b558063) to head (3aa57ec).
Report is 6 commits behind head on main.

Files Patch % Lines
...ain/java/org/apache/comet/parquet/BatchReader.java 0.00% 12 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main     #693      +/-   ##
============================================
+ Coverage     33.63%   33.69%   +0.06%     
- Complexity      821      827       +6     
============================================
  Files           109      109              
  Lines         42529    42567      +38     
  Branches       9343     9360      +17     
============================================
+ Hits          14304    14343      +39     
- Misses        25262    25268       +6     
+ Partials       2963     2956       -7     


@andygrove andygrove merged commit 9728c02 into apache:main Jul 20, 2024
74 checks passed
himadripal pushed a commit to himadripal/datafusion-comet that referenced this pull request Sep 7, 2024
…apache#693)

* fix: address failure caused by method signature change in SPARK-48791

* fix build

* remove shim

* add comment
Successfully merging this pull request may close these issues.

address failure caused by method signature change in SPARK-48791
4 participants