
Disambiguate the resource usage node removal eligibility messages #6223

Merged
merged 1 commit into kubernetes:master from shapirus:master on Nov 10, 2023

Conversation

shapirus
Contributor

What type of PR is this?

/kind feature

What this PR does / why we need it:

Improves user experience as described in #6222

Which issue(s) this PR fixes:

#6222

Special notes for your reviewer:

I would also change the %f to %.2f in the messages, to remove the extra significant digits that add no value to the output. Please let me know in the comments if this is a good idea, and I'll update my commit if needed.
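For illustration only, a minimal Go sketch of the difference between the two verbs (the value is made up):

```go
package main

import "fmt"

func main() {
	util := 0.703314159 // hypothetical utilization value, for illustration only

	fmt.Printf("%f\n", util)   // prints 0.703314 (the default %f keeps six decimal places)
	fmt.Printf("%.2f\n", util) // prints 0.70     (%.2f trims the extra digits)
}
```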

Does this PR introduce a user-facing change?

Disambiguated wording in the log messages related to node removal ineligibility caused by high resource allocation

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 24, 2023
@linux-foundation-easycla

linux-foundation-easycla bot commented Oct 24, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Oct 24, 2023
@k8s-ci-robot
Contributor

Welcome @shapirus!

It looks like this is your first PR to kubernetes/autoscaler 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/autoscaler has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Oct 24, 2023
@k8s-ci-robot k8s-ci-robot added area/cluster-autoscaler cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Oct 24, 2023
@jayantjain93
Contributor

jayantjain93 commented Nov 2, 2023

/lgtm

I see the use of different messaging for UX, but I don't see a need to make the decimal precision more restrictive.
The value is specifically a ratio, and for larger machines this may be a "noticeable" value. I believe these finer margins of the value are used by developers to analyse issues with resource prediction/bad scheduling, and losing that visibility is not something I would like to do.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 2, 2023
@jayantjain93
Contributor

@x13n, could you review and approve this change?

@shapirus
Contributor Author

shapirus commented Nov 2, 2023

I believe these finer margins of the value are used by developers to analyse issues with resource prediction/bad scheduling, and losing that visibility is not something I would like to do.

I agree, this justifies having them.

@x13n
Member

x13n commented Nov 2, 2023

I can see how utilization may be confusing in this context, but then resource allocation is not a well-defined concept either, I think. Maybe make it a bit more verbose? For example:

cpu request to node allocatable ratio is too high (99.9%)

If going the verbose way, might be useful to also extend it like this:

cpu request to node allocatable ratio (excluding DaemonSets) is too high (99.9%)

(in case when skipDaemonSetPods is true). WDYT?

@shapirus
Contributor Author

shapirus commented Nov 2, 2023

If we are ok to change it a little more, what about this?

klog.V(4).Infof("Node %s unremovable: %s request allocation %.2f%% is above the threshold", node.Name, utilInfo.ResourceName, utilInfo.Utilization * 100)
  • reworded to avoid bloating it too much; however, it's still 7 bytes longer than the original (I tried hard to make it shorter; happy to accept suggestions for reducing the length of the message without losing its readability, meaning, and the useful information it provides)
  • a percentage instead of the as-is float, for better readability
  • a percentage also implies (not absolutely clearly, but, I think, clearly enough that extra words would be unjustified) that the number shown is a share of the total allocatable resource
  • %.2f for the value to avoid adding extra digits while retaining the original precision, since the number is multiplied by 100 to get a percentage
  • make it clear that there is a threshold we compare against, and that we don't just randomly decide the value is too high whenever we feel like it

This makes it a small self-contained change aiming to solve one particular problem.
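For reference, a quick sketch of what this format string renders to (klog.Infof uses fmt-style verbs; the node name, resource, and value below are made up):

```go
package main

import "fmt"

func main() {
	// Hypothetical inputs, purely for illustration.
	nodeName, resourceName, utilization := "i-0123456789abcdef0", "memory", 0.703314
	fmt.Printf("Node %s unremovable: %s request allocation %.2f%% is above the threshold\n",
		nodeName, resourceName, utilization*100)
	// Output: Node i-0123456789abcdef0 unremovable: memory request allocation 70.33% is above the threshold
}
```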

On to your second suggestion, regarding the DaemonSet logic: I think it should be a matter for another PR, because it would introduce new logic at the point where this log message is produced (checking whether daemonsets are skipped or not), and thus it should not block the initially proposed change.

Speaking of further improvements, it would benefit the user's experience to include the actual value of the mentioned threshold in the message.
That is not quite trivial, because we don't have its value in the context where this log message is produced. It's calculated (and, to make things worse, conditionally, depending on the presence of a GPU) in isNodeBelowUtilizationThreshold(). To have it printed in the log, the calculation code would need to be moved into a separate function that could then be called by both isNodeBelowUtilizationThreshold() and unremovableReasonAndNodeUtilization(); see the rough sketch below.
This probably warrants a separate PR as well (which I'll be happy to submit).
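A rough sketch of that refactor idea; every name below is hypothetical and does not match the actual CA code:

```go
// Purely illustrative: extract the threshold computation into one helper
// that both call sites can use. The type and field names are made up.
type scaleDownOptions struct {
	UtilizationThreshold    float64
	GpuUtilizationThreshold float64
}

// utilizationThresholdFor returns the applicable scale-down threshold,
// picking the GPU-specific one when the node has a GPU.
func utilizationThresholdFor(opts scaleDownOptions, nodeHasGpu bool) float64 {
	if nodeHasGpu {
		return opts.GpuUtilizationThreshold
	}
	return opts.UtilizationThreshold
}
```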

@x13n
Member

x13n commented Nov 3, 2023

I'm in favor of improving the readability of logs, but I don't think introducing new concepts is the right approach. In Kubernetes, there's a number of well-defined quantities: pod request, pod limit, node allocatable, etc. My main point here is that the current message is not using these concepts (i.e. "request allocation" as a percentage isn't one of them). I don't have a strong opinion on percentage vs fraction, or on how many digits the number has.

@shapirus
Contributor Author

shapirus commented Nov 3, 2023

What about the following:

klog.V(4).Infof("Node %s unremovable: requested %s (%.2f%% of allocatable) is above the scale-down utilization threshold", node.Name, utilInfo.ResourceName, utilInfo.Utilization * 100)

?

this will produce e.g.:

Node i-dfa1aa285b1240e88 unremovable: requested memory (70.33% of allocatable) is above the scale-down utilization threshold

@jayantjain93
Contributor

I'm in favor of improving the readability of logs, but I don't think introducing new concepts is the right approach. In Kubernetes, there's a number of well-defined quantities: pod request, pod limit, node allocatable, etc. My main point here is that the current message is not using these concepts (i.e. "request allocation" as a percentage isn't one of them). I don't have a strong opinion on percentage vs fraction, or on how many digits the number has.

@x13n, I do have a preference on how many digits. I don't think we should lower the accuracy. However, no preference on percentage vs fraction. :)

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 3, 2023
@shapirus
Contributor Author

shapirus commented Nov 3, 2023

(meanwhile, I force-pushed the commit to the latest version I ended up with)

@shapirus shapirus force-pushed the master branch 2 times, most recently from f86706a to 112abe9, on November 3, 2023 16:03
@shapirus
Contributor Author

shapirus commented Nov 3, 2023

...one more update: %.6g instead of %.2f (which I used for percentages in the previous version). This will output 6 significant digits: 0.123456%, 1.23456%, 12.3456%, etc. This is no less precise than the original version.

However, I personally think that such precision makes at least the two least-significant digits meaningless. There is no way to make any practical use of them; such precision is far beyond the practical measurement accuracy.

4 significant digits (12.34%, 1.234%) should be more than enough for any practical use case, and it will be more readable.

@x13n what do you think of having 4 significant digits for it?
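For illustration, a minimal sketch comparing the two precisions (the values are made up):

```go
package main

import "fmt"

func main() {
	// Hypothetical percentage values, purely for illustration.
	for _, v := range []float64{0.123456789, 1.23456789, 12.3456789} {
		fmt.Printf("%%.6g gives %.6g%%, %%.4g gives %.4g%%\n", v, v)
	}
}

// Output:
// %.6g gives 0.123457%, %.4g gives 0.1235%
// %.6g gives 1.23457%, %.4g gives 1.235%
// %.6g gives 12.3457%, %.4g gives 12.35%
```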

@x13n
Member

x13n commented Nov 3, 2023

What about the following:

klog.V(4).Infof("Node %s unremovable: requested %s (%.2f%% of allocatable) is above the scale-down utilization threshold", node.Name, utilInfo.ResourceName, utilInfo.Utilization * 100)

?

this will produce e.g.:

Node i-dfa1aa285b1240e88 unremovable: requested memory (70.33% of allocatable) is above the scale-down utilization threshold

Yup. Looks good to me. With one caveat that it is actually not true when skipDaemonSetPods is true 🙂 DS requests are then excluded from the denominator instead of being added to the numerator.

...one more update: %.6g instead of %.2f (which I used for percentages in the previous version). This will output 6 significant digits: 0.123456%, 1.23456%, 12.3456%, etc. This is no less precise than the original version.

However, I personally think that such precision makes at least the two least-significant digits meaningless. There is no way to make any practical use of them; such precision is far beyond the practical measurement accuracy.

4 significant digits (12.34%, 1.234%) should be more than enough for any practical use case, and it will be more readable.

@x13n what do you think of having 4 significant digits for it?

Yeah, I struggle to find a use case for keeping 6 significant digits. @jayantjain93 - do you have a specific use case for when high precision would actually help? Does 4 sound reasonable to you?

@shapirus
Contributor Author

shapirus commented Nov 4, 2023

With one caveat that it is actually not true when skipDaemonSetPods is true 🙂 DS requests are then excluded from the denominator instead of being added to the numerator.

So we're talking about this code here:

```go
func CalculateUtilizationOfResource(nodeInfo *schedulerframework.NodeInfo, resourceName apiv1.ResourceName, skipDaemonSetPods, skipMirrorPods bool, currentTime time.Time) (float64, error) {
	nodeAllocatable, found := nodeInfo.Node().Status.Allocatable[resourceName]
	if !found {
		return 0, fmt.Errorf("failed to get %v from %s", resourceName, nodeInfo.Node().Name)
	}
	if nodeAllocatable.MilliValue() == 0 {
		return 0, fmt.Errorf("%v is 0 at %s", resourceName, nodeInfo.Node().Name)
	}
	podsRequest := resource.MustParse("0")
	// if skipDaemonSetPods = true, DaemonSet pods' resources will be subtracted
	// from the node allocatable and won't be added to the pods' requests;
	// the same applies to mirror pods.
	daemonSetAndMirrorPodsUtilization := resource.MustParse("0")
	for _, podInfo := range nodeInfo.Pods {
		// factor daemonset pods out of the utilization calculations
		if skipDaemonSetPods && pod_util.IsDaemonSetPod(podInfo.Pod) {
			for _, container := range podInfo.Pod.Spec.Containers {
				if resourceValue, found := container.Resources.Requests[resourceName]; found {
					daemonSetAndMirrorPodsUtilization.Add(resourceValue)
				}
			}
			continue
		}
		// factor mirror pods out of the utilization calculations
		if skipMirrorPods && pod_util.IsMirrorPod(podInfo.Pod) {
			for _, container := range podInfo.Pod.Spec.Containers {
				if resourceValue, found := container.Resources.Requests[resourceName]; found {
					daemonSetAndMirrorPodsUtilization.Add(resourceValue)
				}
			}
			continue
		}
		// ignore pods that should be terminated
		if drain.IsPodLongTerminating(podInfo.Pod, currentTime) {
			continue
		}
		for _, container := range podInfo.Pod.Spec.Containers {
			if resourceValue, found := container.Resources.Requests[resourceName]; found {
				podsRequest.Add(resourceValue)
			}
		}
	}
	return float64(podsRequest.MilliValue()) / float64(nodeAllocatable.MilliValue()-daemonSetAndMirrorPodsUtilization.MilliValue()), nil
}
```

As far as I understand, they are excluded from both the numerator and the denominator, which essentially means that the message will still be correct, provided it is understood that requests and allocatable mean what they mean to CA in this context, rather than the raw values from k8s. But we obviously can't add these words as is.
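To make the effect of the flag concrete, a worked example with made-up numbers (not from any real cluster):

```go
package main

import "fmt"

func main() {
	// Made-up numbers, for illustration only: node allocatable = 10 CPU,
	// DaemonSet pods request 2 CPU, all other pods request 4 CPU.
	allocatable, dsRequests, otherRequests := 10.0, 2.0, 4.0

	// skipDaemonSetPods = false: DS pods fall through to the regular loop,
	// so their requests are added to the numerator.
	fmt.Printf("skipDaemonSetPods=false: %.2f\n", (otherRequests+dsRequests)/allocatable) // 0.60

	// skipDaemonSetPods = true: DS requests are excluded from the numerator
	// and subtracted from the denominator instead.
	fmt.Printf("skipDaemonSetPods=true:  %.2f\n", otherRequests/(allocatable-dsRequests)) // 0.50
}
```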

We could use something like "Node i-dfa1aa285b1240e88 unremovable: effective requested memory (70.33% of effective allocatable) is above the scale-down utilization threshold", but I think it will look ugly.

To be strictly correct there, we could say ...[, DaemonSet [and mirror] pods excluded], but would that really warrant the added verbosity of the message and the complication of the code? I'm not sure.

That being said, I agree, it would be nice to indicate that the way that CA calculates utilization depends on certain settings, and then it'll be up to the user to understand what configuration options they set. If only we could make it simple and concise.

Any ideas on possible wording that avoids adding extra logic (which would partially duplicate what's already done in CalculateUtilizationOfResource())? How do we say "..., where the meaning of requests and allocatable depends on which pods CA considers when calculating the usage percentage" in one or two words? :)

@x13n
Member

x13n commented Nov 6, 2023

Well, all of that was neatly hidden behind the word "utilization" in the existing code. I understand it is easy to confuse with other notions of utilization, so maybe it would suffice to alter the term a bit? "Allocatable utilization" is almost correct, modulo the caveats discussed. How about a completely different term like "node efficiency"? It'd probably make sense to extend the FAQ to explain what that means, but it would definitely resolve the current confusion problem.

Naming is hard...

@shapirus
Contributor Author

shapirus commented Nov 6, 2023

The sole purpose of the proposed change was to make the message clear enough for the user to understand at a glance what's going on, without opening a web browser to search for documentation or explanation (after looking at kubectl top node and seeing much lower usage than that reported by CA).

I believe it serves that purpose, while staying sufficiently correct for (arguably) most use cases. Even if the actual number reported may not match the number calculated outside of CA when it's affected by the daemonset-related condition, the message will still serve its main purpose and not confuse the user:

  • ok, why are my nodes not scaling down? let me see the logs!
  • oh, CA is saying that too much of this resource is requested on this node; I need to check my pods and adjust the resource allocation settings.

The difference caused by the different states of the skipDaemonSetPods attribute will come into play, if ever, only at a deeper level of debugging, at which point the user will very likely be reading the documentation anyway.

(it must be noted here that a significant number of users run CA deployed from a default set of manifests, or even deployed automatically by tools like kops, with all the default settings, and never read the documentation.)

The question is what our goal is here: to provide a better user experience with some technical caveats (until someone comes up with a good wording to cover both), or to be strictly correct in the terminology, but make the logs significantly more verbose and potentially more confusing?

@x13n
Member

x13n commented Nov 6, 2023

Ok, I think the current phrasing is strictly better in the confusion dimension, so I'm ok to merge as-is.

/lgtm
/approve
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 6, 2023
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 6, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: shapirus, x13n

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 6, 2023
@x13n
Member

x13n commented Nov 6, 2023

Whoops, I actually don't think the hold is needed. gofmt seems to be needed though.

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 6, 2023
@shapirus
Contributor Author

shapirus commented Nov 6, 2023

gofmt seems to be needed though.

Can you please elaborate on this? Do I need to change something?
(I'm rather new to the Go universe, so I may be missing something obvious)

@x13n
Member

x13n commented Nov 7, 2023

Yes - if you follow the Details link next to test-and-verify check, you'll see logs explaining that the linter thinks the code requires formatting. It should be sufficient to run gofmt -s -w ./cluster-autoscaler/core/scaledown/eligibility/eligibility.go and force push the changes. There are also some unrelated errors from another component (balancer) - if they persist, a rebase might be required.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 10, 2023
@shapirus
Contributor Author

@x13n that's fixed now.

@x13n
Member

x13n commented Nov 10, 2023

Thanks!

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 10, 2023
@k8s-ci-robot k8s-ci-robot merged commit 9d4cc86 into kubernetes:master Nov 10, 2023
6 checks passed