[GLUTEN-8010][CORE] Don't generate native metrics if transformer don't generate relNode #8011

Merged
merged 8 commits into from
Nov 25, 2024

Conversation

zml1206
Contributor

@zml1206 zml1206 commented Nov 20, 2024

What changes were proposed in this pull request?

(Fixes: #8010)

How was this patch tested?

Local test.
Before PR:
image
After PR:
image

@github-actions github-actions bot added CORE works for Gluten Core VELOX labels Nov 20, 2024
#8010

Run Gluten Clickhouse CI on x86
@zhztheplayer
Member

Thanks!

It feels like a workaround, though, since there is no real filter node in the Velox query plan. Do we have any better options?

@zml1206
Contributor Author

zml1206 commented Nov 21, 2024

It feels like a workaround, though, since there is no real filter node in the Velox query plan. Do we have any better options?

How about setting metricsUpdater to MetricsUpdater.None when getRemainingCondition is null?
That way, the filter shows no metrics information.
image

@zhztheplayer
Member

zhztheplayer commented Nov 21, 2024

How about setting metricsUpdater to MetricsUpdater.None when getRemainingCondition is null?

Sounds fair. Let's see if it works.

BTW, I think one promising approach is to exclude the filter node from the Spark query plan as well, though I remember there were some issues with a related attempt.

@zml1206
Contributor Author

zml1206 commented Nov 21, 2024

BTW, I think one promising approach is to exclude the filter node from the Spark query plan as well, though I remember there were some issues with a related attempt.

We can't exclude the filter node, because its output may differ from its child's output.

@github-actions github-actions bot removed the VELOX label Nov 21, 2024
@zml1206 zml1206 changed the title [GLUTEN-8010][VL] Inherit child metrics when FilterExecTransformer's remainingCondition is null [GLUTEN-8010][VL] Don't generate native metrics if filter's remainingCondition is null Nov 21, 2024
@github-actions github-actions bot added the VELOX label Nov 21, 2024
@github-actions github-actions bot removed the CORE works for Gluten Core label Nov 21, 2024
@@ -54,9 +54,6 @@ object MetricsUtil extends Logging {
MetricsUpdaterTree(
smj.metricsUpdater(),
Seq(treeifyMetricsUpdaters(smj.bufferedPlan), treeifyMetricsUpdaters(smj.streamedPlan)))
case t: TransformSupport if t.metricsUpdater() == MetricsUpdater.None =>
assert(t.children.size == 1, "MetricsUpdater.None can only be used on unary operator")
treeifyMetricsUpdaters(t.children.head)
Contributor Author

@zml1206 zml1206 Nov 21, 2024

Handling this case here causes operatorId inconsistencies. It should be handled in updateTransformerMetricsInternal instead. cc @zhztheplayer

@zml1206
Contributor Author

zml1206 commented Nov 21, 2024

@zhztheplayer Can you help take a look again? Thanks.

@@ -219,6 +216,7 @@ object MetricsUtil extends Logging {
})

mutNode.updater match {
case MetricsUpdater.None =>
Member

Are we sure about this?

This looks like what MetricsUpdater.Todo does. As I remember, MetricsUpdater.Todo and MetricsUpdater.None don't have the same semantics.

Did you check the UI? Are all the metrics still normal with this change?

Contributor Author

MetricsUpdater.Todo means that native metrics exist but are not updated. MetricsUpdater.None means that no native metrics exist and updateNativeMetrics is not supported, so this case is needed.
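
The distinction described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `UpdaterSketch`, `NoneSketch`, `TodoSketch`, `RealSketch` and `applyMetrics` are hypothetical names invented for illustration and are not Gluten's classes or signatures. It assumes native metrics arrive as an ordered list with one entry per operator that actually exists on the native side.

```scala
// Toy model only: these types are NOT Gluten's real classes.
sealed trait UpdaterSketch
// No native metrics exist for this operator, so there is nothing to consume.
case object NoneSketch extends UpdaterSketch
// Native metrics exist, but nothing is written back to the Spark side.
case object TodoSketch extends UpdaterSketch
// Native metrics exist and are copied into a named Spark-side counter.
final case class RealSketch(name: String) extends UpdaterSketch

object UpdaterSemantics {
  /** Walks the updaters in order; returns (applied updates, unconsumed native entries).
    * Assumes the list holds enough entries for every Todo/Real updater. */
  def applyMetrics(
      updaters: List[UpdaterSketch],
      nativeRows: List[Long]): (Map[String, Long], List[Long]) =
    updaters.foldLeft((Map.empty[String, Long], nativeRows)) {
      case ((acc, rows), NoneSketch) => (acc, rows) // no entry exists: consume nothing
      case ((acc, rows), TodoSketch) => (acc, rows.drop(1)) // entry exists: consume and discard
      case ((acc, rows), RealSketch(n)) => (acc + (n -> rows.head), rows.tail)
    }

  def main(args: Array[String]): Unit = {
    val (updated, leftover) = applyMetrics(
      List(RealSketch("scan"), NoneSketch, TodoSketch, RealSketch("project")),
      List(10L, 7L, 3L))
    println(updated)
    println(leftover)
  }
}
```

In this model, treating a `NoneSketch` node like a `TodoSketch` node would consume an entry that belongs to a later operator, which is one way the two could end up misaligned.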

Member

I see. My impression of this part of the code is:

  • When registerEmptyRelToOperator is called, a fake operator creation is emulated, so MetricsUpdater.Todo should be considered
  • When child.asInstanceOf[TransformSupport].transform(context) is used, MetricsUpdater.None should be considered

Would you like to check whether these two rules apply?

And if we no longer need MetricsUpdater.None, we can just remove this type of metrics updater.

@github-actions github-actions bot added CORE works for Gluten Core and removed VELOX labels Nov 22, 2024
@zml1206 zml1206 changed the title [GLUTEN-8010][VL] Don't generate native metrics if filter's remainingCondition is null [GLUTEN-8010][CORE] Don't generate native metrics if transformer don't generate relNode Nov 22, 2024
@@ -112,13 +115,12 @@ case class ExpandExecTransformer(

override protected def doTransform(context: SubstraitContext): TransformContext = {
val childCtx = child.asInstanceOf[TransformSupport].transform(context)
val operatorId = context.nextOperatorId(this.nodeName)
if (projections == null || projections.isEmpty) {
if (metricsUpdater == MetricsUpdater.None) {
Member

Hi @zml1206

Can we create a new utility method, e.g., isNoop, that can be called by both doTransform and metricsUpdater in each operator? Thanks.
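
A minimal sketch of that suggestion, assuming a much simplified operator model: one `isNoop` predicate drives both the plan-side transform and the metrics-side updater choice, so the two cannot disagree. All names here (`TransformerSketch`, `IdContext`, `NoUpdater`, `NamedUpdater`, `ScanSketch`, `FilterSketch`) are hypothetical stand-ins, not Gluten's actual classes, and skipping the operator id for a no-op node is an assumption of this toy model.

```scala
// Hypothetical, simplified stand-ins; not Gluten's actual classes or signatures.
sealed trait Updater
case object NoUpdater extends Updater
final case class NamedUpdater(operator: String) extends Updater

// Hands out operator ids in plan order.
final class IdContext {
  private var next = 0L
  def nextOperatorId(): Long = { val id = next; next += 1; id }
}

trait TransformerSketch {
  def name: String
  def child: Option[TransformerSketch]

  /** Single source of truth: true when this operator emits no plan node of its own. */
  def isNoop: Boolean = false

  // Metrics side: derived from isNoop.
  final def metricsUpdater: Updater = if (isNoop) NoUpdater else NamedUpdater(name)

  // Plan side: derived from the same isNoop, so a skipped node allocates no id.
  final def transform(ctx: IdContext): List[(Long, String)] = {
    val below = child.map(_.transform(ctx)).getOrElse(Nil)
    if (isNoop) below else below :+ (ctx.nextOperatorId() -> name)
  }
}

final case class ScanSketch(name: String = "scan") extends TransformerSketch {
  val child: Option[TransformerSketch] = None
}

final case class FilterSketch(remaining: Option[String], input: TransformerSketch)
    extends TransformerSketch {
  val name = "filter"
  val child: Option[TransformerSketch] = Some(input)
  // A filter with no remaining condition contributes nothing of its own.
  override def isNoop: Boolean = remaining.isEmpty
}

object IsNoopDemo {
  def main(args: Array[String]): Unit = {
    val pushedDown = FilterSketch(None, ScanSketch())
    println(pushedDown.metricsUpdater)
    println(pushedDown.transform(new IdContext))
  }
}
```

Because each operator overrides only `isNoop`, the condition is written once per operator instead of being repeated in two methods.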

Member

@zhztheplayer zhztheplayer left a comment

Other code looks good to me.

Member

@zhztheplayer zhztheplayer left a comment

One nit. Thanks!

@@ -61,7 +61,9 @@ abstract class FilterExecTransformerBase(val cond: Expression, val input: SparkP
case _ => false
}

override def metricsUpdater(): MetricsUpdater = if (getRemainingCondition == null) {
override def isLoop: Boolean = getRemainingCondition == null
Member

s/isLoop/isNoop/

@@ -48,7 +48,9 @@ case class ExpandExecTransformer(
AttributeSet.fromAttributeSets(projections.flatten.map(_.references))
}

override def metricsUpdater(): MetricsUpdater = if (projections == null || projections.isEmpty) {
override def isLoop: Boolean = projections == null || projections.isEmpty
Member

s/isLoop/isNoop/

@@ -44,7 +44,9 @@ case class SortExecTransformer(
@transient override lazy val metrics =
BackendsApiManager.getMetricsApiInstance.genSortTransformerMetrics(sparkContext)

override def metricsUpdater(): MetricsUpdater = if (sortOrder == null || sortOrder.isEmpty) {
override def isLoop: Boolean = sortOrder == null || sortOrder.isEmpty
Member

s/isLoop/isNoop/

override def metricsUpdater(): MetricsUpdater = if (
windowExpression == null || windowExpression.isEmpty
) {
override def isLoop: Boolean = windowExpression == null || windowExpression.isEmpty
Member

s/isLoop/isNoop/

@zml1206 zml1206 merged commit e3682fd into apache:main Nov 25, 2024
46 checks passed
Labels
CLICKHOUSE CORE works for Gluten Core
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Metrics of FilterExecTransformer are incorrect when remainingCondition is null
2 participants