[BOP-86] Add status & events for Helm Chart Addons; Improve Status for Manifests #18
Conversation
- list
- patch
- update
- watch
Does the operator need permissions for all of these? If we only use `watch`/`get`/`list` etc., then we should only ask for those permissions.
Good point. I copied the setup from kubebuilder, but I think those extra verbs were only needed in their specific example. I updated the operator to request only `watch`/`get`/`list`.
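For reference, trimming the verbs comes down to editing the kubebuilder RBAC markers and regenerating. The group/resource names below are illustrative assumptions, not copied from this PR (the status subresource typically still needs `update`/`patch` so the controller can write status):

```go
// Sketch of restricted RBAC markers; `make manifests` regenerates the
// ClusterRole from these, so verbs removed here disappear from the RBAC.
//+kubebuilder:rbac:groups=boundless.mirantis.com,resources=addons,verbs=get;list;watch
//+kubebuilder:rbac:groups=boundless.mirantis.com,resources=addons/status,verbs=get;update;patch
```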
return ctrl.Result{}, err
}

result, err := r.setOwnerReferenceOnManifest(ctx, logger, instance, m)
question: I am wondering how calling `setControllerReference()` differs from setting the `Owns` relationship in `SetupWithManager`:

return ctrl.NewControllerManagedBy(mgr).
	For(&boundlessv1alpha1.Addon{}).
	Owns(&boundlessv1alpha1.Manifest{}).
	Watches(
		&source.Kind{Type: &batch.Job{}},
		handler.EnqueueRequestsFromMapFunc(r.findAddonForJob),
		builder.WithPredicates(predicate.ResourceVersionChangedPredicate{}),
	).
	Complete(r)

Also, if we set the `Owns()` reference in `SetupWithManager()`, why do we still need to set the controller reference on each instance of the CR?
`setControllerReference` sets the below in the metadata of the manifest:

"ownerReferences": [
  {
    "apiVersion": "boundless.mirantis.com/v1alpha1",
    "blockOwnerDeletion": true,
    "controller": true,
    "kind": "Addon",
    "name": "calico",
    "uid": "7d7a3f66-46c1-4a46-a3be-866d105d480e"
  }
]

Then `Owns(&boundlessv1alpha1.Manifest{})` tells the manager to watch each instance of `boundlessv1alpha1.Manifest` and issue a reconcile against the resource in the `ownerRef`, if one is present. Without `setControllerReference`, no reconcile is issued, since there is no object in the `ownerRef`.
controllers/addon_controller.go
Watches(
	&source.Kind{Type: &batch.Job{}},
	handler.EnqueueRequestsFromMapFunc(r.findAddonForJob),
	builder.WithPredicates(predicate.ResourceVersionChangedPredicate{}),
Please add a comment here explaining which Jobs it will watch. Does it watch all Jobs in the namespace, or a subset of them?
It watches all Jobs in the cluster, and then runs the `MapFunc` we provide (`findAddonForJob`) on each Job. The map function returns a list of reconcile requests (if any) that the manager should send. In our case, the manager watches all Jobs in the cluster; for each Job it attempts to find an Addon that has an index value of `jobNamespace-jobName`. If it finds one, it reconciles that Addon; if not, it does nothing. So while it watches all Jobs in the cluster, only the Jobs that have the same namespace and name as one specified in an Addon spec will trigger a reconcile.
Will add a similar comment in the code.
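The lookup described above can be sketched without the controller-runtime machinery. The `jobNamespace-jobName` key format is from the comment; the `addonIndex` type and everything else here are hypothetical stand-ins (in the real controller the index is maintained by the manager's field indexer):

```go
package main

import "fmt"

// addonIndex maps "jobNamespace-jobName" to the name of the Addon whose
// spec references that Job. Stand-in for the manager's field index.
type addonIndex map[string]string

// findAddonForJob returns the Addon to reconcile for a given Job,
// or "" if no Addon references that Job (no reconcile is triggered).
func findAddonForJob(idx addonIndex, jobNamespace, jobName string) string {
	return idx[fmt.Sprintf("%s-%s", jobNamespace, jobName)]
}

func main() {
	idx := addonIndex{"boundless-system-install-grafana": "my-grafana"}
	// A Job named in an Addon spec maps to that Addon.
	fmt.Println(findAddonForJob(idx, "boundless-system", "install-grafana"))
	// An unrelated Job maps to nothing, so no reconcile request is emitted.
	fmt.Println(findAddonForJob(idx, "default", "unrelated-job") == "")
}
```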
// updateHelmchartAddonStatus checks the status of the associated helm chart job and updates the status of the Addon CR accordingly
func (r *AddonReconciler) updateHelmchartAddonStatus(ctx context.Context, logger logr.Logger, namespacedName types.NamespacedName, job *batch.Job, addon *boundlessv1alpha1.Addon) error {
	logger.Info("Updating Helm Chart Addon Status")
	if job.Status.CompletionTime != nil && job.Status.Succeeded > 0 {
refactor: No change needed at the moment, but this if/else can be simplified by labeling the conditions. For example:

type JobStatus int

const (
	JobStatusSuccess JobStatus = iota
	JobStatusFailed
	JobStatusProgressing
)

func getJobStatusCondition(job *batch.Job) JobStatus {
	if job.Status.CompletionTime != nil && job.Status.Succeeded > 0 {
		return JobStatusSuccess
	}
	...
}
// If no errors are found, we check if any deployments/daemonsets are still progressing and set the manifest status to Progressing
// Otherwise set the manifest status to Available
// This is not comprehensive and may need to be updated as we support more complex manifests
func (r *ManifestReconciler) checkManifestStatus(ctx context.Context, logger logr.Logger, namespacedName types.NamespacedName, objects []boundlessv1alpha1.ManifestObject) error {
refactor suggestion: This function can be broken down into smaller pieces, and we can add unit tests for it. Currently it is hard to read. Suggestions:
- Filter for Deployment and DaemonSet objects to remove the switch statement
- Use labels (for the manifest status conditions) to make them easier to parse
- Add unit tests

This does not need to be done in this PR.
Agreed, we can break it down and make it easy to also add support for other resources in the future.
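The filtering step suggested above can be sketched as below. The `manifestObject` type and its fields are illustrative stand-ins for `boundlessv1alpha1.ManifestObject`, not the PR's actual types:

```go
package main

import "fmt"

// manifestObject is a stand-in for boundlessv1alpha1.ManifestObject.
type manifestObject struct {
	Kind string
	Name string
}

// filterWorkloads keeps only the kinds whose rollout status we inspect
// (Deployments and DaemonSets), replacing a switch inside the status loop
// and giving a small, unit-testable function.
func filterWorkloads(objs []manifestObject) []manifestObject {
	var out []manifestObject
	for _, o := range objs {
		if o.Kind == "Deployment" || o.Kind == "DaemonSet" {
			out = append(out, o)
		}
	}
	return out
}

func main() {
	objs := []manifestObject{
		{Kind: "Namespace", Name: "metallb-system"},
		{Kind: "Deployment", Name: "controller"},
		{Kind: "DaemonSet", Name: "speaker"},
	}
	// Only the pod-deploying workloads survive the filter.
	fmt.Println(len(filterWorkloads(objs)))
}
```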
Description/Summary
Addresses https://mirantis.jira.com/browse/BOP-86
This PR adds status and events to Helm Chart Addons. The status changes are done by telling the addon controller to watch (https://book.kubebuilder.io/reference/watching-resources/operator-managed) the Jobs which are created by the helm controller for each HelmChart custom resource. This means Addons will display the status of the helm chart install Job. The limitation here is that the Job succeeds once all the resources specified in the helm chart are created (i.e. deployments), but if there is a failure in the creation of pods by that deployment, it will not bubble up. We could look into doing something more here, similar to manifests below, but it will not be as simple since we don't know beforehand what pods, if any, are created by the helm chart.

Additionally, this PR reworks how I previously implemented some of the events/status for manifest addons. I think the previous approach was brittle and not as robust: it updated the status of the manifest after certain events, but then didn't update it past a certain point. I kept some of the status/events from the previous PR (such as when the addon fails in an early stage of setting up the CR), but have additionally set up the manifest controller to watch Deployment and DaemonSet resources and update status off of those. These resources were chosen because they have status fields we can update off of, and they are the main pod-deploying resources that we currently support in Manifest. We will definitely need to update this once we support more complex manifests, but I think it is a good starting point for manifest status.

Testing
Test 1 : Happy Path
Using the below addons:
After deploying the above blueprint we can watch the status of the addons:
Addons start with a Progressing status. For Helm chart addons, the status gets updated as the associated Job progresses. Once the Job finishes, the status is set to Available.
For Manifest addons, we see metallb is Available almost instantly; this is because the manifest is just a namespace. On the other hand, the calico manifest stays in Progressing for several minutes, until the deployments and daemonsets in the manifest successfully create pods.
Eventually the addons become Available.
If we describe any addon, we can see a more detailed status and some events that have been emitted, e.g. for my-grafana:
Test 2 : Deploy a manifest that fails to install
And we can get details by describing the manifest, which are also bubbled up to the addon: