-
Notifications
You must be signed in to change notification settings - Fork 504
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-11640. Deletion services in SCM, OM and DN should have consistent metrics #7426
base: master
Are you sure you want to change the base?
Conversation
public void incrSuccessCountInLastIteration(long successCountInLastIteration) { | ||
this.successCountInLastIteration.incr(successCountInLastIteration); | ||
} | ||
|
||
public void incrSuccessBytesInLastIteration(long successBytesInLastIteration) { | ||
this.successBytesInLastIteration.incr(successBytesInLastIteration); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @Tejaskriya for the patch.
Parameters can be renamed to delta
to fix checkstyle failures:
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/helpers/BlockDeletingServiceMetrics.java
234: 'successCountInLastIteration' hides a field.
238: 'successBytesInLastIteration' hides a field.
https://github.com/Tejaskriya/ozone/actions/runs/11813260671/job/32909976492
@@ -127,12 +128,14 @@ public Long getNumBlocksToDelete() { | |||
public BackgroundTaskQueue getTasks() { | |||
BackgroundTaskQueue queue = new BackgroundTaskQueue(); | |||
|
|||
metrics.resetLastIterationCounts(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If someone queries the metrics after the resetLastIterationCounts
is done and before we populate all the counters again, won't we get inconsistent result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have changed it now, to store the counter and update the metrics only at the end of each iteration. COuld you please take a look at the update?
@errose28 @ashishkumar50 could you please review the PR? |
...ervice/src/main/java/org/apache/hadoop/ozone/container/common/impl/BlockDeletingService.java
Outdated
Show resolved
Hide resolved
...va/org/apache/hadoop/ozone/container/keyvalue/statemachine/background/BlockDeletingTask.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, this PR consists of including last Iteration metric capture, for char purpose in prometheus / grafana, but it can be done without this, and its redundant. So this is not requried.
totalTime); | ||
metrics.setStartTimeOfLastIteration(startTime); | ||
metrics.setDurationOfLastIteration(totalTime); | ||
metrics.setNumBlockDeletionCommandSentInLastIteration(commandsSent); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its not useful to capture LastIteration metrics, as chart can be prepared with existing metrics itself. This metrics is not required.
@@ -524,7 +527,12 @@ public void commitTransactions( | |||
} | |||
try { | |||
deletedBlockLogStateManager.removeTransactionsFromDB(txIDsToBeDeleted); | |||
metrics.incrBlockDeletionTransactionCompleted(txIDsToBeDeleted.size()); | |||
int numTransactionsCompleted = txIDsToBeDeleted.size(); | |||
metrics.incrBlockDeletionTransactionCompleted(numTransactionsCompleted); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These metrics are not required, total can be derived from sucess+failure, and last iteration additional metrics not required.
What changes were proposed in this pull request?
Currently, SCM and DN have only total metrics, they don't track the numbers for each iteration of the deleting services. In order to view the progress of deletion using metrics, iteration specific metrics will useful.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-11640
How was this patch tested?
Tested Manually