
Fix: Check MinSize in fix nodegroup size #6650

Open · wants to merge 8 commits into base: master

Conversation

daimaxiaxie (Contributor)

What type of PR is this?

/kind bug

What this PR does / why we need it:

When an ASG is out of stock and the instance count falls below MinSize (especially with spot instances), the entire Cluster Autoscaler gets stuck because deleteCreatedNodesWithErrors and fixNodeGroupSize return errors.

Cluster Autoscaler cannot handle this situation well.
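
For illustration, a rough sketch of the failure mode (a hypothetical helper, not the actual AWS provider code): when the ASG cannot launch instances and the running count is already below MinSize, the attempt to shrink the target size back down is rejected, so the loop fails on every iteration.

```go
package sketch

import "fmt"

// decreaseTargetSize mimics a provider-side guard (e.g. an ASG wrapper) that
// refuses to set a target size below MinSize. Hypothetical sketch used only to
// illustrate why fixNodeGroupSize keeps failing in this situation.
func decreaseTargetSize(target, delta, minSize int) (int, error) {
	newTarget := target + delta // delta is negative for a decrease
	if newTarget < minSize {
		// fixNodeGroupSize propagates an error like this, so every autoscaler
		// loop fails at the same point and CA appears stuck.
		return target, fmt.Errorf("cannot decrease target to %d: below MinSize %d", newTarget, minSize)
	}
	return newTarget, nil
}
```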

Which issue(s) this PR fixes:

Fixes #6601

Special notes for your reviewer:

Respect MinSize.

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. area/cluster-autoscaler labels Mar 22, 2024
@k8s-ci-robot k8s-ci-robot requested review from vadasambar and x13n March 22, 2024 04:46
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 22, 2024
@k8s-ci-robot (Contributor)

Hi @daimaxiaxie. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Mar 22, 2024
@towca (Collaborator)

towca commented Apr 19, 2024

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 19, 2024
@MaciekPytel (Contributor)

This PR fixes the problem for ASGs, but at the same time breaks it for other cloud providers. We generally don't assume min size is enforced on the provider side (i.e. the expectation is that if CA decides to delete a node, it can do so), and the current behavior of going below it is the desired one. The reason is that there is no point keeping a VM that cannot register; it's not really providing capacity to the kubernetes cluster. Min size is there to guarantee a minimum capacity of the cluster, but a node that cannot register shouldn't really be counted.

We need a more nuanced solution to the problem, one that allows CA to delete unregistered nodes below min size, while giving a chance to providers to handle it. IIRC we discussed extending cloudprovider to support this with @Bryce-Soghigian - do I remember correctly that you have a fix for this that you wanted to upstream?

@MaciekPytel (Contributor)

Actually, a good question is why we have such a check in https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/core/static_autoscaler.go#L785. That feels really inconsistent: why are we OK with going below min in deleteCreatedNodesWithErrors, but not in removeOldUnregisteredNodes?

Sure, in the case of deleteCreatedNodesWithErrors we're probably deleting some placeholders and not actual VMs, but what exactly those placeholders are differs by provider, and at the end of the day we're changing the target size to be less than min size. If we say that's OK - and I think we should, since per my comment above the meaningful interpretation of min size is "minimum number of nodes" and a VM that cannot register is not really a node - we should do it in all paths where VMs fail to register as nodes. Thoughts @x13n @towca @BigDarkClown ?
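
As a rough paraphrase of the asymmetry being questioned here (hypothetical helper names, not the actual static_autoscaler.go code): removeOldUnregisteredNodes skips deletions that would take the group below MinSize, while deleteCreatedNodesWithErrors has no such guard.

```go
package sketch

// Hypothetical paraphrase of the asymmetry discussed above; not the actual
// CA core code.

// removeOldUnregisteredNodes-style guard: skip the deletion once it would take
// the group below MinSize.
func mayRemoveUnregistered(targetSize, alreadyRemoved, minSize int) bool {
	return targetSize-alreadyRemoved > minSize
}

// deleteCreatedNodesWithErrors-style behavior: nodes that failed to create are
// deleted regardless, even if the resulting target size ends up below MinSize.
func mayRemoveCreateError() bool {
	return true
}
```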

@Bryce-Soghigian (Member)

Yep @MaciekPytel #6658

The idea is that when the node count is below min count, we currently never remove unregistered nodes. This draft PR will add a respectMinCount argument to the delete nodes interface that will allow us to make min-count-violating deletions for unregistered nodes.

I need to sit down and finish it. I wanted to upstream the change so that all cloud providers could benefit from it as well, but that involves touching a lot more code, so it's taking longer than I thought it would :)

@Bryce-Soghigian (Member)

> Actually, a good question is why we have such a check in https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/core/static_autoscaler.go#L785. That feels really inconsistent: why are we OK with going below min in deleteCreatedNodesWithErrors, but not in removeOldUnregisteredNodes?

Another thing to note is that not all cloud providers enforce min count at the DeleteNodes() level either. Some of them rely on these higher-level checks. (They shouldn't?)

> the meaningful interpretation of min size is "minimum number of nodes" and a VM that cannot register is not really a node - we should do it in all paths where VMs fail to register as nodes.

For things like IsClusterHealthy or IsNodegroupHealthy, the opposite is true. We only represent real nodes, not these partial VMs (VMs that never join the API server), in the health signal. IMO, if the bootstrap is failing, it's more indicative of a health problem than not-ready nodes are.

@daimaxiaxie (Contributor, Author)

> The reason is that there is no point keeping a VM that cannot register; it's not really providing capacity to the kubernetes cluster. Min size is there to guarantee a minimum capacity of the cluster, but a node that cannot register shouldn't really be counted.

I very much agree with this view. Thank you all, @MaciekPytel @Bryce-Soghigian.

@daimaxiaxie (Contributor, Author)

daimaxiaxie commented Apr 22, 2024

> Actually, a good question is why we have such a check in https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/core/static_autoscaler.go#L785. That feels really inconsistent: why are we OK with going below min in deleteCreatedNodesWithErrors, but not in removeOldUnregisteredNodes?

I also have this question; this code comes from @kawych. I think removeOldUnregisteredNodes and deleteCreatedNodesWithErrors should be consistent because they face the same situation.
When the number of instances is <= MinSize, if the create-error nodes need to be deleted (even placeholders), the unregistered nodes also need to be deleted.

@towca (Collaborator)

towca commented Apr 25, 2024

@daimaxiaxie I think the logic was the same even before @kawych's changes, FWIW.

@MaciekPytel I have no idea why the behavior is inconsistent between the two places. I agree that CA should be able to delete the VMs in both scenarios, even if it means going below the MinSize.

@Bryce-Soghigian brought up a good point - if we expect the limit enforcement to happen outside of the cloud provider, it should also mean that cloud providers shouldn't enforce the limits within DeleteNodes, right? @MaciekPytel, the GCE cloud provider does so, for example; should we change it?

@Bryce-Soghigian (Member)

One could also argue that having the cloud provider enforce min count, in case some upper-level logic gets it wrong, is helpful. It means that no matter what, we can't violate min count or scale beyond max count.

That being said, there are cases (like unregistered nodes) where we want to be able to force a deletion.

@x13n (Member)

x13n commented Apr 26, 2024

I agree we should be able to go below min in case of long unregistered nodes and creation errors. There's no point in keeping such nodes around.

However, I'm not convinced that complicating CA logic to pass an "ignore min size" flag down to cloud providers is the right thing to do. When CA is managing a node group, it should be the source of truth for determining group size. When CA tries to delete a node, cloud provider code should actuate this decision (or return an error). If we're allowing force delete in certain cases, the additional safeguard on the cloud provider side isn't really going to provide a lot of additional protection, I think? What is the use case for sometimes honoring min size internally in cloud provider code? Is it meant as an extra layer of protection in case CA goes amok, or is there more to it?

@Bryce-Soghigian (Member)

Yeah, exactly. It serves as a safeguard if someone adds a new deletion call. Really, if we didn't have that cloud provider code I could just delete the unregistered nodes outright. But with the cloud provider safeguards, I cannot just delete the unregistered nodes.

Since min count is a fundamental principle and people expect it never to be violated, it makes sense to have that extra safety in the cloud provider. But I would also like to be able to remove nodes that are bad. Either modifying the DeleteNodes() interface or adding a new method called UnsafeDeleteNodes() might make sense.

@x13n (Member)

x13n commented Apr 29, 2024

That makes sense, thanks. Btw, ForceDeleteNodes()? Or DeleteNodes(force bool)? I see your PR currently has a respectMinCount arg, which is quite specific, and I would really like to avoid adding yet another arg or method to respect other possible constraints.
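
For reference, a minimal sketch of the two shapes being weighed here, written against the apiv1.Node type that the existing DeleteNodes call uses. The method names and signatures are assumptions about the eventual design, not the final cloudprovider.NodeGroup interface.

```go
package sketch

import (
	apiv1 "k8s.io/api/core/v1"
)

// Option 1: a dedicated method that may take the group below MinSize.
type forceDeleter interface {
	ForceDeleteNodes(nodes []*apiv1.Node) error
}

// Option 2: a generic force flag on the existing deletion call, so future
// constraints do not each need their own argument.
type nodeDeleter interface {
	DeleteNodes(nodes []*apiv1.Node, force bool) error
}
```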

@Bryce-Soghigian (Member)

Force makes a lot of sense, let's do that instead.

@x13n (Member)

x13n commented May 7, 2024

Thanks, so I think we can close this one then.

/close

@k8s-ci-robot (Contributor)

@x13n: Closed this PR.

In response to this:

Thanks, so I think we can close this one then.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@daimaxiaxie (Contributor, Author)

Hi @x13n, I updated the code and it now only respects MinSize in fixNodeGroupSize. I hope it can be reopened.

Respecting MinSize is necessary in fixNodeGroupSize, even with ForceDeleteNodes, because fixNodeGroupSize does not remove nodes. Thanks.
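
For context, a minimal sketch of the kind of clamp this implies, consistent with the possibleDelta line visible in the diff below; the helper and its surrounding use are assumptions, not the exact patch:

```go
package sketch

// clampFixDelta bounds the (negative) delta that fixNodeGroupSize would apply
// so the resulting target size never drops below the node group's MinSize.
// Hypothetical helper: currentSize/expectedSize correspond to
// incorrectSize.CurrentSize/ExpectedSize in CA core.
func clampFixDelta(currentSize, expectedSize, minSize int) int {
	delta := currentSize - expectedSize
	// Smallest delta still allowed: expectedSize + delta >= minSize.
	if possibleDelta := minSize - expectedSize; delta < possibleDelta {
		delta = possibleDelta
	}
	return delta
}
```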

@daimaxiaxie (Contributor, Author)

/reopen

@k8s-ci-robot k8s-ci-robot reopened this Jun 21, 2024
@k8s-ci-robot (Contributor)

@daimaxiaxie: Reopened this PR.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: daimaxiaxie
Once this PR has been reviewed and has the lgtm label, please assign towca for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

```diff
@@ -728,6 +728,17 @@ func fixNodeGroupSize(context *context.AutoscalingContext, clusterStateRegistry
 			incorrectSize.ExpectedSize,
 			incorrectSize.CurrentSize,
 			delta)

+		possibleDelta := nodeGroup.MinSize() - incorrectSize.ExpectedSize
```
Member

Do we really want this check? MinSize is there to protect Cluster Autoscaler against scaling down below MinSize. If we want to fix the target size to something below minSize, it means that the nodegroup started below minSize, we tried to scale up, but nodes are not registering correctly. I would expect that adjusting the target size of the group even to a value below minSize is something that we want to be able to do; we're not violating the minSize constraint, just reacting to a failed scale-up situation.

Contributor Author

From the user's perspective, MinSize is the minimum value of the node group target, and it cannot be less than MinSize at any time.

On the other hand, many cloud vendors do not allow you to set a value lower than MinSize.

Member

I don't think this makes sense in CA core. For cloud providers that prevent setting the target size below min, adjusting the size can happen on the cloud provider side. Otherwise it is going to prevent cloud providers that do allow going below min from working correctly. Note that the incorrect size in CSR may stay incorrect if the target size is adjusted up to match min size, so subsequent loops will run this code again.

@better0332

I also encountered this bug

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 9, 2024
Labels
  • area/cluster-autoscaler
  • cncf-cla: yes (Indicates the PR's author has signed the CNCF CLA.)
  • kind/bug (Categorizes issue or PR as related to a bug.)
  • lifecycle/stale (Denotes an issue or PR has remained open with no activity and has become stale.)
  • ok-to-test (Indicates a non-member PR verified by an org member that is safe to test.)
  • size/S (Denotes a PR that changes 10-29 lines, ignoring generated files.)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cluster-autoscaler gets stuck with "Some nodes that failed to create were removed"
9 participants