Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ feat: add support for cluster class #261

Conversation

salasberryfin
Copy link
Contributor

@salasberryfin salasberryfin commented Feb 8, 2024

What this PR does / why we need it:

This moves the changes proposed in #226 to the recently created v1beta1 api version.

This PR adds RKE2ControlPlaneTemplate, which will be used by ClusterClass. It also includes a quick start example to create a sample ClusterClass and a workload Cluster based on it.

Which issue(s) this PR fixes:
Fixes #75

Special notes for your reviewer:

The example ClusterClass makes use of the new configurable bootstrap timeout in CAPD, which is pending release after being merged here. The default timeout is set to 3m, which is sometimes not enough for machine to be provisioned. Upstream CAPI is expected to release this change in April but we can consider this as a non-blocker as it is an existing issue when using CAPRKE2 + CAPD. In any case, it can be added to the release notes until it is ultimately released.

Supporting ClusterClass meant extra fields had to be added to RKE2ControlPlaneSpec, including Version. Since the provider already has a Version field in RKE2AgentConfigSpec, there's a follow up issue, #229, to track work on addressing how these fields are managed, with the goal of deprecating RKE2AgentConfigSpec.Version in a future API version.

For now, only the existing RKE2AgentConfigSpec.Version is supported and the new RKE2ControlPlaneSpec.Version is only a placeholder to support using ClusterClass. A following PR will add the changes to support either of the two fields but not both.

This PR superseds #226

Checklist:

  • squashed commits into logical changes
  • includes documentation
  • adds unit tests
  • adds or updates e2e tests

@salasberryfin salasberryfin added the kind/feature New feature or request label Feb 8, 2024
@salasberryfin salasberryfin force-pushed the v1beta1-feat-add-support-for-clusterclass branch 5 times, most recently from 49fcc9d to 2ff7488 Compare February 13, 2024 14:49
@salasberryfin salasberryfin changed the title WIP: feat: add support for cluster class feat: add support for cluster class Feb 13, 2024
@salasberryfin salasberryfin requested a review from a team February 13, 2024 15:08
@salasberryfin salasberryfin changed the title feat: add support for cluster class ✨ feat: add support for cluster class Feb 16, 2024
@richardcase richardcase added this to the v0.3.0 milestone Feb 19, 2024
Copy link
Contributor

@richardcase richardcase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good @salasberryfin 👍 Thanks for this.

@@ -147,6 +147,7 @@ type RKE2AgentConfig struct {
LoadBalancerPort int `json:"loadBalancerPort,omitempty"`

// Version specifies the rke2 version.
// This field will be deprecated in newer versions of the API and RKE2ControlPlaneSpec.Version will be used instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use the +kubebuilder:deprecatedversion:warning="" annotation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Until we resolve the way we manage the new Version field (here), we're still using this and the other is just a placeholder required by the ClusterClass implementation. The deprecation warning may push people to use the new field while it is technically not supported yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could modify the code to use one or the other, and then have the deprecation warning. Initially didn't want to include this in the same PR, but we can do it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say lets do it or create an issue to do it straight afterwards so that its in the API bump release.

bootstrap/api/v1beta1/rke2config_webhook.go Outdated Show resolved Hide resolved
bootstrap/api/v1beta1/rke2configtemplate_webhook.go Outdated Show resolved Hide resolved
bootstrap/config/crd/kustomization.yaml Outdated Show resolved Hide resolved
controlplane/api/v1beta1/rke2controlplane_webhook.go Outdated Show resolved Hide resolved
controlplane/api/v1beta1/rke2controlplanetemplate_types.go Outdated Show resolved Hide resolved
Copy link
Contributor

@furkatgofurov7 furkatgofurov7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I am missing something but should not you add RKE2ClusterTemplate CRD to enable CC?

// In future iterations, this field overrides the RKE2 Version specificied in RKE2ConfigSpec.AgentConfig.Version
// which will be deprecated in newer versions of the API.
// +optional
Version string `json:"version"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this version accounted anywhere? In upstream this Version field is slightly unnessesary, as this only may differentiate the control plane machines version from worker machine versions. It might be simpler to unify those. Just thinking outloud, may be wrong

README.md Show resolved Hide resolved
@salasberryfin
Copy link
Contributor Author

Maybe I am missing something but should not you add RKE2ClusterTemplate CRD to enable CC?

@furkatgofurov7 the way I understand ClusterClass works in this scenario, we define a topology with RKE2ControlPlaneTemplate and RKE2ConfigTemplate and then for the Cluster resource we just choose the infrastructure provider, similar to what we do in config/samples. With this approach I don't see a need for RKE2ClusterTemplate. Does it make sense?

@salasberryfin salasberryfin force-pushed the v1beta1-feat-add-support-for-clusterclass branch 3 times, most recently from 8fea3a0 to 655c6eb Compare February 20, 2024 20:06
@salasberryfin
Copy link
Contributor Author

Thanks for the reviews and comments @Danil-Grigorev @richardcase @furkatgofurov7. Resolved most of them and will be looking at the remaining ones tomorrow.

@salasberryfin salasberryfin force-pushed the v1beta1-feat-add-support-for-clusterclass branch 2 times, most recently from 8f0fb74 to 43f5e60 Compare February 21, 2024 08:38
Danil-Grigorev
Danil-Grigorev previously approved these changes Feb 22, 2024
Copy link
Contributor

@Danil-Grigorev Danil-Grigorev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, one nit about linter rules, seems to be added by mistake?

.golangci.yml Outdated Show resolved Hide resolved
@@ -423,6 +423,8 @@ func autoConvert_v1beta1_RKE2ControlPlaneSpec_To_v1alpha1_RKE2ControlPlaneSpec(i
return err
}
out.Replicas = (*int32)(unsafe.Pointer(in.Replicas))
// WARNING: in.Version requires manual conversion: does not exist in peer-type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that version needs to be stored as annotation, and restored manually, if for some reason the RKE2 provider version is downgraded. Maybe needs addressing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a manual conversion here that handles the new fields Version and MachineTemplate.

Comment on lines +160 to +183

func Convert_v1beta1_RKE2ControlPlaneSpec_To_v1alpha1_RKE2ControlPlaneSpec(in *controlplanev1.RKE2ControlPlaneSpec, out *RKE2ControlPlaneSpec, s apiconversion.Scope) error {
// Version was added in v1beta1.
// MachineTemplate was added in v1beta1.
return autoConvert_v1beta1_RKE2ControlPlaneSpec_To_v1alpha1_RKE2ControlPlaneSpec(in, out, s)
}

func Convert_v1beta1_RKE2ControlPlaneStatus_To_v1alpha1_RKE2ControlPlaneStatus(in *controlplanev1.RKE2ControlPlaneStatus, out *RKE2ControlPlaneStatus, s apiconversion.Scope) error {
return autoConvert_v1beta1_RKE2ControlPlaneStatus_To_v1alpha1_RKE2ControlPlaneStatus(in, out, s)
}

func Convert_v1alpha1_RKE2ControlPlaneStatus_To_v1beta1_RKE2ControlPlaneStatus(in *RKE2ControlPlaneStatus, out *controlplanev1.RKE2ControlPlaneStatus, s apiconversion.Scope) error {
return autoConvert_v1alpha1_RKE2ControlPlaneStatus_To_v1beta1_RKE2ControlPlaneStatus(in, out, s)
}

func Convert_v1beta1_RKE2ControlPlaneTemplateSpec_To_v1alpha1_RKE2ControlPlaneTemplateSpec(in *controlplanev1.RKE2ControlPlaneTemplateSpec, out *RKE2ControlPlaneTemplateSpec, s apiconversion.Scope) error {
return autoConvert_v1beta1_RKE2ControlPlaneTemplateSpec_To_v1alpha1_RKE2ControlPlaneTemplateSpec(in, out, s)
}

func Convert_v1alpha1_RKE2ControlPlaneTemplateSpec_To_v1beta1_RKE2ControlPlaneTemplateSpec(in *RKE2ControlPlaneTemplateSpec, out *controlplanev1.RKE2ControlPlaneTemplateSpec, s apiconversion.Scope) error {
return autoConvert_v1alpha1_RKE2ControlPlaneTemplateSpec_To_v1beta1_RKE2ControlPlaneTemplateSpec(in, out, s)
}

func Convert_v1alpha1_RKE2ControlPlaneTemplateStatus_To_v1beta1_RKE2ControlPlaneStatus(in *RKE2ControlPlaneTemplateStatus, out *controlplanev1.RKE2ControlPlaneStatus, s apiconversion.Scope) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously there were no RKE2ControlPlaneTemplates in use. I think if there are issues with this conversion code, we can remove it.

richardcase
richardcase previously approved these changes Feb 22, 2024
Copy link
Contributor

@richardcase richardcase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, just a nit.

// In future iterations, this field overrides the RKE2 Version specificied in RKE2ConfigSpec.AgentConfig.Version
// which will be deprecated in newer versions of the API.
// +optional
Version string `json:"version"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets follow up after this merges and before we release on the version fields.

controlplane/api/v1beta1/rke2controlplane_types.go Outdated Show resolved Hide resolved
@salasberryfin
Copy link
Contributor Author

salasberryfin commented Feb 22, 2024

Thanks @richardcase @Danil-Grigorev. Will address these nits before merging. May need to think a bit more about the Version deprecation, as the field is required in the control plane provider to satisfy the ClusterClass implementation but we're also using RKE2AgentConfig.Version for worker nodes that won't have the RKE2ControlPlane.Version, so it looks like it's not as easy as having one or the other.

@salasberryfin salasberryfin force-pushed the v1beta1-feat-add-support-for-clusterclass branch from 43f5e60 to af19f6c Compare February 22, 2024 19:07
@salasberryfin
Copy link
Contributor Author

Thanks everyone for the reviews. Just fixed most recent nits. The follow-up on how Version is configured is #229.

richardcase
richardcase previously approved these changes Feb 23, 2024
Copy link
Contributor

@richardcase richardcase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is good to go, we can follow up on the version etc

Danil-Grigorev
Danil-Grigorev previously approved these changes Feb 23, 2024
furkatgofurov7
furkatgofurov7 previously approved these changes Feb 23, 2024
Copy link
Contributor

@furkatgofurov7 furkatgofurov7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

README.md Outdated Show resolved Hide resolved
// In future iterations, this field overrides the RKE2 Version specificied in RKE2ConfigSpec.AgentConfig.Version
// which will be deprecated in newer versions of the API.
// +optional
Version string `json:"version"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@salasberryfin salasberryfin force-pushed the v1beta1-feat-add-support-for-clusterclass branch from 15f7962 to 0b62188 Compare February 23, 2024 17:13
@richardcase richardcase merged commit ae0ecc3 into rancher:main Feb 27, 2024
7 checks passed
Copy link

@mjura mjura left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add ClusterClass support
5 participants