
Disambiguate the resource usage node removal eligibility messages #6223

Merged
merged 1 commit into kubernetes:master from shapirus:master on Nov 10, 2023

Conversation

shapirus
Contributor

What type of PR is this?

/kind feature

What this PR does / why we need it:

Improves user experience as described in #6222

Which issue(s) this PR fixes:

#6222

Special notes for your reviewer:

I would also change the %f to %.2f in the messages, to remove the extra significant digits that add no value to the output. Please let me know in the comments if this is a good idea, and I'll update my commit if needed.
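For illustration only, a minimal Go sketch of the difference between the two verbs (the value is made up):

```go
package main

import "fmt"

func main() {
	util := 0.703314159 // hypothetical utilization value, for illustration only

	fmt.Printf("%f\n", util)   // prints 0.703314 (the default %f keeps six decimal places)
	fmt.Printf("%.2f\n", util) // prints 0.70     (%.2f trims the extra digits)
}
```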

Does this PR introduce a user-facing change?

Disambiguated wording in the log messages related to node removal ineligibility caused by high resource allocation

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 24, 2023
@linux-foundation-easycla

linux-foundation-easycla bot commented Oct 24, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Oct 24, 2023
@k8s-ci-robot
Contributor

Welcome @shapirus!

It looks like this is your first PR to kubernetes/autoscaler 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/autoscaler has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Oct 24, 2023
@k8s-ci-robot k8s-ci-robot added area/cluster-autoscaler cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Oct 24, 2023
@jayantjain93
Contributor

jayantjain93 commented Nov 2, 2023

/lgtm

I see the use of different messaging for UX, but I don't see a need to make the decimal precision more restrictive.
The value is specifically a ratio, and for larger machines this may be a "noticeable" value. I believe these finer margins of the value are used by developers to analyse issues with resource prediction/bad scheduling, and losing that visibility is not something I would like to do.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 2, 2023
@jayantjain93
Contributor

@x13n, could you review and approve this change?

@shapirus
Contributor Author

shapirus commented Nov 2, 2023

I believe these finer margins of the value are used by developers to analyse issues with resource prediction/bad scheduling, and losing that visibility is not something I would like to do.

I agree, this justifies having them.

@x13n
Member

x13n commented Nov 2, 2023

I can see how utilization may be confusing in this context, but then resource allocation is not a well-defined concept either, I think. Maybe make it a bit more verbose? For example:

cpu request to node allocatable ratio is too high (99.9%)

If going the verbose way, might be useful to also extend it like this:

cpu request to node allocatable ratio (excluding DaemonSets) is too high (99.9%)

(in case when skipDaemonSetPods is true). WDYT?

@shapirus
Contributor Author

shapirus commented Nov 2, 2023

If we are ok to change it a little more, what about this?

klog.V(4).Infof("Node %s unremovable: %s request allocation %.2f%% is above the threshold", node.Name, utilInfo.ResourceName, utilInfo.Utilization * 100)
  • reworded to avoid bloating it too much; however, it's still 7 bytes longer than the original (I tried hard to make it shorter; happy to accept suggestions for reducing the length of the message without losing its readability, meaning, and the useful information it provides)
  • a percentage instead of the as-is float, for better readability
  • a percentage also implies (not absolutely clearly, but, I think, clearly enough that extra words would be unjustified) that the number shown is a share of the total allocatable resource
  • %.2f for the value to avoid adding extra digits while retaining the original precision, since the number is multiplied by 100 to get a percentage
  • make it clear that there is a threshold we compare against, and that we don't just randomly decide the value is too high whenever we feel like it

This makes it a small self-contained change aiming to solve one particular problem.
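For reference, a quick sketch of what this format string renders to (klog.Infof uses fmt-style verbs; the node name, resource, and value below are made up):

```go
package main

import "fmt"

func main() {
	// Hypothetical inputs, purely for illustration.
	nodeName, resourceName, utilization := "i-0123456789abcdef0", "memory", 0.703314
	fmt.Printf("Node %s unremovable: %s request allocation %.2f%% is above the threshold\n",
		nodeName, resourceName, utilization*100)
	// Output: Node i-0123456789abcdef0 unremovable: memory request allocation 70.33% is above the threshold
}
```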

On to your second suggestion, regarding the DaemonSet logic: I think it should be a matter for another PR, because it would introduce new logic at the point where this log message is produced (checking whether daemonsets are skipped or not), and thus it should not block the initially proposed change.

Speaking of further improvements, it would benefit the user's experience to include the actual value of the mentioned threshold in the message.
That is not quite trivial, because we don't have its value in the context where this log message is produced. It's calculated (and, to make things worse, conditionally, depending on the presence of a GPU) in isNodeBelowUtilizationThreshold(). To have it printed in the log, the calculation code would need to be moved into a separate function that could then be called by both isNodeBelowUtilizationThreshold() and unremovableReasonAndNodeUtilization(); see the rough sketch below.
This probably warrants a separate PR as well (which I'll be happy to submit).
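A rough sketch of that refactor idea; every name below is hypothetical and does not match the actual CA code:

```go
// Purely illustrative: extract the threshold computation into one helper
// that both call sites can use. The type and field names are made up.
type scaleDownOptions struct {
	UtilizationThreshold    float64
	GpuUtilizationThreshold float64
}

// utilizationThresholdFor returns the applicable scale-down threshold,
// picking the GPU-specific one when the node has a GPU.
func utilizationThresholdFor(opts scaleDownOptions, nodeHasGpu bool) float64 {
	if nodeHasGpu {
		return opts.GpuUtilizationThreshold
	}
	return opts.UtilizationThreshold
}
```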

@x13n
Member

x13n commented Nov 3, 2023

I'm in favor of improving the readability of logs, but I don't think introducing new concepts is the right approach. In Kubernetes, there's a number of well-defined quantities: pod request, pod limit, node allocatable, etc. My main point here is that the current message is not using these concepts (i.e. "request allocation" as a percentage isn't one of them). I don't have a strong opinion on percentage vs fraction, or on how many digits the number has.

@shapirus
Contributor Author

shapirus commented Nov 3, 2023

What about the following:

klog.V(4).Infof("Node %s unremovable: requested %s (%.2f%% of allocatable) is above the scale-down utilization threshold", node.Name, utilInfo.ResourceName, utilInfo.Utilization * 100)

?

this will produce e.g.:

Node i-dfa1aa285b1240e88 unremovable: requested memory (70.33% of allocatable) is above the scale-down utilization threshold

@jayantjain93
Contributor

I'm in favor of improving the readability of logs, but I don't think introducing new concepts is the right approach. In Kubernetes, there's a number of well-defined quantities: pod request, pod limit, node allocatable, etc. My main point here is that the current message is not using these concepts (i.e. "request allocation" as a percentage isn't one of them). I don't have a strong opinion on percentage vs fraction, or on how many digits the number has.

@x13n, I do have a preference on how many digits. I don't think we should lower the accuracy. However, no preference on percentage vs fraction. :)

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 3, 2023
@shapirus
Contributor Author

shapirus commented Nov 3, 2023

(meanwhile, I force-pushed the commit to the latest version I ended up with)

@shapirus shapirus force-pushed the master branch 2 times, most recently from f86706a to 112abe9, on November 3, 2023 16:03
@shapirus
Contributor Author

shapirus commented Nov 3, 2023

...one more update: %.6g instead of %.2f (which I used for percentages in the previous version). This will output 6 significant digits: 0.123456%, 1.23456%, 12.3456%, etc. This is no less precise than the original version.

However, I personally think that such precision makes at least the two least-significant digits meaningless. There is no way to make any practical use of them; such precision is far beyond the practical measurement accuracy.

4 significant digits (12.34%, 1.234%) should be more than enough for any practical use case, and it will be more readable.

@x13n what do you think of having 4 significant digits for it?
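For illustration, a minimal sketch comparing the two precisions (the values are made up):

```go
package main

import "fmt"

func main() {
	// Hypothetical percentage values, purely for illustration.
	for _, v := range []float64{0.123456789, 1.23456789, 12.3456789} {
		fmt.Printf("%%.6g gives %.6g%%, %%.4g gives %.4g%%\n", v, v)
	}
}

// Output:
// %.6g gives 0.123457%, %.4g gives 0.1235%
// %.6g gives 1.23457%, %.4g gives 1.235%
// %.6g gives 12.3457%, %.4g gives 12.35%
```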

@x13n
Member

x13n commented Nov 3, 2023

What about the following:

klog.V(4).Infof("Node %s unremovable: requested %s (%.2f%% of allocatable) is above the scale-down utilization threshold", node.Name, utilInfo.ResourceName, utilInfo.Utilization * 100)

?

this will produce e.g.:

Node i-dfa1aa285b1240e88 unremovable: requested memory (70.33% of allocatable) is above the scale-down utilization threshold

Yup. Looks good to me. With one caveat that it is actually not true when skipDaemonSetPods is true 🙂 DS requests are then excluded from the denominator instead of being added to the numerator.

...one more update: %.6g instead of %.2f (which I used for percentages in the previous version). This will output 6 significant digits: 0.123456%, 1.23456%, 12.3456%, etc. This is no less precise than the original version.

However, I personally think that such precision makes at least the two least-significant digits meaningless. There is no way to make any practical use of them; such precision is far beyond the practical measurement accuracy.

4 significant digits (12.34%, 1.234%) should be more than enough for any practical use case, and it will be more readable.

@x13n what do you think of having 4 significant digits for it?

Yeah, I struggle to find a use case for keeping 6 significant digits. @jayantjain93 - do you have a specific use case for when high precision would actually help? Does 4 sound reasonable to you?

@shapirus
Contributor Author

shapirus commented Nov 4, 2023

With one caveat that it is actually not true when skipDaemonSetPods is true 🙂 DS requests are then excluded from the denominator instead of being added to the numerator.

So we're talking about this code here:

```go
func CalculateUtilizationOfResource(nodeInfo *schedulerframework.NodeInfo, resourceName apiv1.ResourceName, skipDaemonSetPods, skipMirrorPods bool, currentTime time.Time) (float64, error) {
	nodeAllocatable, found := nodeInfo.Node().Status.Allocatable[resourceName]
	if !found {
		return 0, fmt.Errorf("failed to get %v from %s", resourceName, nodeInfo.Node().Name)
	}
	if nodeAllocatable.MilliValue() == 0 {
		return 0, fmt.Errorf("%v is 0 at %s", resourceName, nodeInfo.Node().Name)
	}
	podsRequest := resource.MustParse("0")
	// if skipDaemonSetPods = true, DaemonSet pods' resources will be subtracted
	// from the node allocatable and won't be added to the pods' requests;
	// the same applies to mirror pods.
	daemonSetAndMirrorPodsUtilization := resource.MustParse("0")
	for _, podInfo := range nodeInfo.Pods {
		// factor daemonset pods out of the utilization calculations
		if skipDaemonSetPods && pod_util.IsDaemonSetPod(podInfo.Pod) {
			for _, container := range podInfo.Pod.Spec.Containers {
				if resourceValue, found := container.Resources.Requests[resourceName]; found {
					daemonSetAndMirrorPodsUtilization.Add(resourceValue)
				}
			}
			continue
		}
		// factor mirror pods out of the utilization calculations
		if skipMirrorPods && pod_util.IsMirrorPod(podInfo.Pod) {
			for _, container := range podInfo.Pod.Spec.Containers {
				if resourceValue, found := container.Resources.Requests[resourceName]; found {
					daemonSetAndMirrorPodsUtilization.Add(resourceValue)
				}
			}
			continue
		}
		// ignore pods that should be terminated
		if drain.IsPodLongTerminating(podInfo.Pod, currentTime) {
			continue
		}
		for _, container := range podInfo.Pod.Spec.Containers {
			if resourceValue, found := container.Resources.Requests[resourceName]; found {
				podsRequest.Add(resourceValue)
			}
		}
	}
	return float64(podsRequest.MilliValue()) / float64(nodeAllocatable.MilliValue()-daemonSetAndMirrorPodsUtilization.MilliValue()), nil
}
```

As far as I understand, they are excluded from both the numerator and the denominator, which essentially means that the message will still be correct, provided it is understood that requests and allocatable mean what they mean to CA in this context, rather than the raw values from k8s. But we obviously can't add these words as is.
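To make the effect of the flag concrete, a worked example with made-up numbers (not from any real cluster):

```go
package main

import "fmt"

func main() {
	// Made-up numbers, for illustration only: node allocatable = 10 CPU,
	// DaemonSet pods request 2 CPU, all other pods request 4 CPU.
	allocatable, dsRequests, otherRequests := 10.0, 2.0, 4.0

	// skipDaemonSetPods = false: DS pods fall through to the regular loop,
	// so their requests are added to the numerator.
	fmt.Printf("skipDaemonSetPods=false: %.2f\n", (otherRequests+dsRequests)/allocatable) // 0.60

	// skipDaemonSetPods = true: DS requests are excluded from the numerator
	// and subtracted from the denominator instead.
	fmt.Printf("skipDaemonSetPods=true:  %.2f\n", otherRequests/(allocatable-dsRequests)) // 0.50
}
```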

We could use something like "Node i-dfa1aa285b1240e88 unremovable: effective requested memory (70.33% of effective allocatable) is above the scale-down utilization threshold", but I think it will look ugly.

To be strictly correct there, we could say ...[, DaemonSet [and mirror] pods excluded], but would that really warrant the added verbosity of the message and the complication of the code? I'm not sure.

That being said, I agree, it would be nice to indicate that the way that CA calculates utilization depends on certain settings, and then it'll be up to the user to understand what configuration options they set. If only we could make it simple and concise.

Any ideas on possible wording that avoids adding extra logic (which would partially duplicate what's already done in CalculateUtilizationOfResource())? How do we say "..., where the meaning of requests and allocatable depends on which pods CA considers when calculating the usage percentage" in one or two words? :)

@x13n
Member

x13n commented Nov 6, 2023

Well, all of that was neatly hidden behind the word "utilization" in the existing code. I understand it is easy to confuse with other notions of utilization, so maybe it would suffice to alter the term a bit? "Allocatable utilization" is almost correct, modulo the caveats discussed. How about a completely different term like "node efficiency"? It'd probably make sense to extend the FAQ to explain what that means, but it would definitely resolve the current confusion problem.

Naming is hard...

@shapirus
Contributor Author

shapirus commented Nov 6, 2023

The sole purpose of the proposed change was to make the message clear enough for the user to understand at a glance what's going on, without opening a web browser to search for documentation or explanation (after looking at kubectl top node and seeing much lower usage than that reported by CA).

I believe it serves that purpose, while staying sufficiently correct for (arguably) most use cases. Even if the actual number reported may not match the number calculated outside of CA when it's affected by the daemonset-related condition, the message will still serve its main purpose and not confuse the user:

  • ok, why are my nodes not scaling down? let me see the logs!
  • oh, CA is saying that too much of this resource is requested on this node; I need to check my pods and adjust the resource allocation settings.

The difference caused by the different states of the skipDaemonSetPods attribute will come into play, if ever, only at a deeper level of debugging, at which point the user will very likely be reading the documentation anyway.

(it must be noted here that a significant number of users run CA deployed from a default set of manifests, or even deployed automatically by tools like kops, with all the default settings, and never read the documentation.)

The question is what our goal is here: to provide a better user experience with some technical caveats (until someone comes up with a good wording to cover both), or to be strictly correct in the terminology, but make the logs significantly more verbose and potentially more confusing?

@x13n
Member

x13n commented Nov 6, 2023

Ok, I think the current phrasing is strictly better in the confusion dimension, so I'm ok to merge as-is.

/lgtm
/approve
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 6, 2023
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 6, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: shapirus, x13n

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 6, 2023
@x13n
Member

x13n commented Nov 6, 2023

Whoops, I actually don't think the hold is needed. gofmt seems to be needed though.

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 6, 2023
@shapirus
Contributor Author

shapirus commented Nov 6, 2023

gofmt seems to be needed though.

Can you please elaborate on this? Do I need to change something?
(I'm rather new to the Go universe, so I may be missing something obvious)

@x13n
Member

x13n commented Nov 7, 2023

Yes - if you follow the Details link next to test-and-verify check, you'll see logs explaining that the linter thinks the code requires formatting. It should be sufficient to run gofmt -s -w ./cluster-autoscaler/core/scaledown/eligibility/eligibility.go and force push the changes. There are also some unrelated errors from another component (balancer) - if they persist, a rebase might be required.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 10, 2023
@shapirus
Contributor Author

@x13n that's fixed now.

@x13n
Member

x13n commented Nov 10, 2023

Thanks!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 10, 2023
@k8s-ci-robot k8s-ci-robot merged commit 9d4cc86 into kubernetes:master Nov 10, 2023
6 checks passed