feat: Improve CometHashJoin statistics #309

Merged 10 commits on Apr 24, 2024

planga82
Contributor

Which issue does this PR close?

Closes #308 .

Rationale for this change

Expose all the statistics (metrics) that DataFusion's HashJoinExec node provides.

What changes are included in this PR?

All available metrics

/// Total time for collecting build-side of join
pub(crate) build_time: metrics::Time,
/// Number of batches consumed by build-side
pub(crate) build_input_batches: metrics::Count,
/// Number of rows consumed by build-side
pub(crate) build_input_rows: metrics::Count,
/// Memory used by build-side in bytes
pub(crate) build_mem_used: metrics::Gauge,
/// Total time for joining probe-side batches to the build-side batches
pub(crate) join_time: metrics::Time,
/// Number of batches consumed by probe-side of this operator
pub(crate) input_batches: metrics::Count,
/// Number of rows consumed by probe-side of this operator
pub(crate) input_rows: metrics::Count,
/// Number of batches produced by this operator
pub(crate) output_batches: metrics::Count,
/// Number of rows produced by this operator
pub(crate) output_rows: metrics::Count
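
To illustrate what the count-style metrics above measure, here is a minimal sketch of a toy hash join that updates them during its build and probe phases. It is not DataFusion's or Comet's implementation: `Count`, `JoinMetrics`, and `hash_join` are hypothetical stand-ins written with the standard library only, and the timers and the memory gauge are left out for brevity.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical stand-in for a counter metric (not DataFusion's metrics::Count).
#[derive(Default)]
struct Count(AtomicUsize);

impl Count {
    fn add(&self, n: usize) {
        self.0.fetch_add(n, Ordering::Relaxed);
    }
    fn value(&self) -> usize {
        self.0.load(Ordering::Relaxed)
    }
}

/// Hypothetical subset of the join metrics: counters only.
#[derive(Default)]
struct JoinMetrics {
    build_input_batches: Count,
    build_input_rows: Count,
    input_batches: Count,
    input_rows: Count,
    output_batches: Count,
    output_rows: Count,
}

/// Toy inner hash join on integer keys; each batch is a Vec of keys.
fn hash_join(build: &[Vec<i64>], probe: &[Vec<i64>], m: &JoinMetrics) -> Vec<Vec<(i64, i64)>> {
    // Build phase: count occurrences of every build-side key.
    let mut table: HashMap<i64, usize> = HashMap::new();
    for batch in build {
        m.build_input_batches.add(1);
        m.build_input_rows.add(batch.len());
        for &key in batch {
            *table.entry(key).or_insert(0) += 1;
        }
    }

    // Probe phase: emit one output batch per probe batch.
    let mut out = Vec::new();
    for batch in probe {
        m.input_batches.add(1);
        m.input_rows.add(batch.len());
        let mut joined = Vec::new();
        for &key in batch {
            if let Some(&matches) = table.get(&key) {
                // One output row per matching build-side row.
                for _ in 0..matches {
                    joined.push((key, key));
                }
            }
        }
        m.output_batches.add(1);
        m.output_rows.add(joined.len());
        out.push(joined);
    }
    out
}

fn main() {
    let m = JoinMetrics::default();
    let build = vec![vec![1, 2], vec![2, 3]];
    let probe = vec![vec![2, 4, 3]];
    let out = hash_join(&build, &probe, &m);
    println!(
        "batches out = {}, build rows = {}, probe rows = {}, output rows = {}",
        out.len(),
        m.build_input_rows.value(),
        m.input_rows.value(),
        m.output_rows.value()
    );
}
```

In this sketch the build-side counters are updated once per build batch and the probe-side and output counters once per probe batch, which mirrors the split between the `build_*` fields and the `input_*`/`output_*` fields in the list above.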

How are these changes tested?

Unit testing and manual testing

Comment on lines 336 to 337
CometConf.COMET_EXEC_ENABLED.key -> "true",
CometConf.COMET_EXEC_ALL_OPERATOR_ENABLED.key -> "true") {
Contributor

nit: I think these two confs are already enabled in CometTestBase. Not sure there is anything special about restating them here.

Member

Yea, can be removed, I think.

Contributor Author

Changed, thank you!

@viirya viirya merged commit 869da2d into apache:main Apr 24, 2024
28 checks passed
Member

viirya commented Apr 24, 2024

Merged. Thanks @planga82

himadripal pushed a commit to himadripal/datafusion-comet that referenced this pull request Sep 7, 2024
* HashMergeJoin metrics

* HashMergeJoin metrics test

* Fix test

* Fix format

* Fix descriptions

* Fix imports

* Update spark/src/test/scala/org/apache/comet/exec/CometExecSuite.scala

Co-authored-by: Liang-Chi Hsieh <[email protected]>

* delete conf

* Fix

---------

Co-authored-by: Liang-Chi Hsieh <[email protected]>
Development

Successfully merging this pull request may close these issues.

Improve CometHashJoin statistics
4 participants