This repository has been archived by the owner on Feb 9, 2022. It is now read-only.

WIP: providerid controller #17

Open
wants to merge 9 commits into
base: master
from

Conversation

rgolangh

  • Increase leader election lease time
  • Add controller to reconcile node's providerID

Danil-Grigorev and others added 8 commits June 23, 2020 11:29
MOTIVATION
This fix addresses network address reconciliation problems that lead to
mishandled status and phase reporting and to a lack of linkage, which causes the
machine to appear in the wrong state.

The main issue is network reconciliation, so a major refactoring was made
to address it. See [1] below.

RESULT
While the VM is not up it has no IP addresses, and this should be reported as nil addresses.
If the VM is up but has no addresses, we have to report an error, which triggers a future reconciliation.
A change in the node object, which happens when a VM goes down for example, triggers reconciliation,
and that should change the list of addresses and the state accordingly. When the VM boots, the reverse
happens: the machine boots with no IP addresses until the node object changes because
kubelet starts responding. That triggers a reconciliation that will detect the addresses correctly. If
it does not, reconciliation should keep firing until it does.

MODIFICATION
[1] structural changes:
Instead of piling changes into one or two methods, every change to
the status or spec has its own function.
The high-level breakdown of the handling is now:
- create or delete the VM; updates are a no-op on the VM
- reconcile provider id
- reconcile network addresses
- reconcile annotations
- reconcile providerStatus
- update machine
- update machine/status

Signed-off-by: Roy Golan <[email protected]>
Bug 1817853: Reconcile network addresses according to VM status
Implement leader election for manager
The machine-api-controller components refresh their leases more often
than all other components combined. Bringing this to 90s each will
decrease etcd writes at idle.
BUG 1858400: [Performance] Lease refresh period for machine-api-controllers is too high, causes heavy writes to etcd at idle
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rgolangh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot

@rgolangh: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@rgolangh rgolangh changed the title providerid controller WIP: providerid controller Jul 28, 2020
@rgolangh rgolangh force-pushed the providerid-controller branch 3 times, most recently from 079c7b6 to 5321dbf Compare July 30, 2020 10:43
This controller is a replacement for a cloud provider. Its purpose is
to set the node's spec.providerID by querying the oVirt/RHV API, where the
node name is the VM name.

This will make autoscaling work.

Signed-off-by: Roy Golan <[email protected]>
@rgolangh rgolangh force-pushed the providerid-controller branch from 5321dbf to a865d06 Compare July 30, 2020 11:13
@openshift-ci-robot

@rgolangh: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/images | a865d06 | link | /test images |
| ci/prow/sanity | a865d06 | link | /test sanity |
| ci/prow/vet | a865d06 | link | /test vet |
| ci/prow/fmt | a865d06 | link | /test fmt |

Full PR test history. Your PR dashboard.


4 participants