[BOP-86] Add status & events for Helm Chart Addons; Improve Status for Manifests #18

tppolkow · 2023-11-29T23:14:45Z

Description/Summary

Addresses https://mirantis.jira.com/browse/BOP-86

This PR adds status and events to Helm Chart Addons. The addon controller now watches (https://book.kubebuilder.io/reference/watching-resources/operator-managed) the Jobs that the helm controller creates for each HelmChart custom resource, so an Addon displays the status of its Helm chart install Job. The limitation is that the Job succeeds as soon as all the resources specified in the Helm chart (e.g. Deployments) are created; if a Deployment then fails to create its pods, that failure does not bubble up. We could do something more here, similar to the manifest handling below, but it will not be as simple because we don't know beforehand which pods, if any, a Helm chart creates.
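To illustrate the idea, here is a minimal sketch of how a watched Job's conditions could be mapped to an addon status. It is not the code in this PR: `JobCondition` and `AddonStatus` are simplified, hypothetical stand-ins for `batchv1.JobCondition` and the Addon status type, and the Unhealthy mapping and failure reason text are assumptions.

```go
package main

import "fmt"

// StatusType mirrors the status values shown in this PR's test output.
type StatusType string

const (
	Progressing StatusType = "Progressing"
	Available   StatusType = "Available"
	Unhealthy   StatusType = "Unhealthy"
)

// JobCondition is a simplified, hypothetical stand-in for batchv1.JobCondition.
type JobCondition struct {
	Type    string // "Complete" or "Failed", as in batch/v1
	Status  string // "True", "False" or "Unknown"
	Message string
}

// AddonStatus is a hypothetical status shape, not the real CRD type.
type AddonStatus struct {
	Type    StatusType
	Reason  string
	Message string
}

// statusFromJob derives an addon status from the install Job's conditions.
// A Job with no terminal condition set to "True" is treated as still running.
func statusFromJob(jobName string, conds []JobCondition) AddonStatus {
	for _, c := range conds {
		if c.Status != "True" {
			continue
		}
		switch c.Type {
		case "Complete":
			return AddonStatus{
				Type:   Available,
				Reason: fmt.Sprintf("Helm Chart %s successfully installed", jobName),
			}
		case "Failed":
			// Assumption: a failed install Job is reported as Unhealthy.
			return AddonStatus{
				Type:    Unhealthy,
				Reason:  fmt.Sprintf("Helm Chart %s install failed", jobName),
				Message: c.Message,
			}
		}
	}
	return AddonStatus{
		Type:   Progressing,
		Reason: fmt.Sprintf("Helm Chart %s install in progress", jobName),
	}
}

func main() {
	s := statusFromJob("helm-install-grafana", []JobCondition{{Type: "Complete", Status: "True"}})
	fmt.Println(s.Type, "-", s.Reason)
}
```

Because the status only reflects the Job, a chart whose pods later crash would still be reported as Available here, which is the limitation described above.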

Additionally, this PR reworks how I previously implemented some of the events and status for manifest addons. The previous approach was brittle: it updated the manifest status after certain events but stopped updating it past a certain point. I kept some of the status and events from the previous PR (such as when the addon fails at an early stage of setting up the CR), and additionally set up the manifest controller to watch Deployment and DaemonSet resources and update the status based on them. These resources were chosen because they have status fields we can derive a status from, and they are the main pod-deploying resources that Manifest currently supports. We will definitely need to revisit this once we support more complex manifests, but I think it is a good starting point for manifest status.
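As a rough sketch of the manifest side, the status can be thought of as an aggregate over the watched workloads. The `WorkloadStatus` type and `manifestStatus` function below are hypothetical and only flatten the counters involved (for example `status.availableReplicas` on a Deployment or `status.numberAvailable` on a DaemonSet); the actual controller may use different fields or rules.

```go
package main

import "fmt"

// WorkloadStatus is a hypothetical, flattened view of a watched workload.
type WorkloadStatus struct {
	Kind      string // "Deployment" or "DaemonSet"
	Name      string
	Desired   int32 // e.g. spec.replicas or status.desiredNumberScheduled
	Available int32 // e.g. status.availableReplicas or status.numberAvailable
}

// manifestStatus returns "Progressing" while any workload has fewer available
// pods than desired, and "Available" otherwise. A manifest with no workloads
// (such as one that only creates a namespace) is Available immediately.
func manifestStatus(workloads []WorkloadStatus) (status, reason string) {
	for _, w := range workloads {
		if w.Available < w.Desired {
			return "Progressing", fmt.Sprintf("%s %s has %d/%d pods available",
				w.Kind, w.Name, w.Available, w.Desired)
		}
	}
	return "Available", "all watched workloads are available"
}

func main() {
	// Illustrative values only.
	status, reason := manifestStatus([]WorkloadStatus{
		{Kind: "Deployment", Name: "calico-kube-controllers", Desired: 1, Available: 1},
		{Kind: "DaemonSet", Name: "calico-node", Desired: 3, Available: 1},
	})
	fmt.Println(status, "-", reason)
}
```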

Testing

Test 1 : Happy Path

Using the addons below:

    addons:
      - name: calico
        kind: manifest
        enabled: true
        manifest:
          url: https://raw.githubusercontent.com/projectcalico/calico/v3.26.3/manifests/calico.yaml
      - name: metallb
        kind: manifest
        enabled: true
        manifest:
          url: https://raw.githubusercontent.com/kubernetes/website/main/content/en/examples/admin/namespace-dev.yaml
      - name: my-grafana
        enabled: true
        kind: chart
        namespace: monitoring
        chart:
          name: grafana
          repo: https://grafana.github.io/helm-charts
          version: 6.58.7
          values: |
            ingress:
              enabled: true

After deploying the above blueprint, we can watch the status of the addons:

tpolkowski@tpolkowski-MBP16-1947 boundless-cli % k get addon -n boundless-system -w
NAME     STATUS
calico
metallb
calico
metallb
my-grafana
calico       Progressing
metallb      Progressing
my-grafana
calico       Progressing
my-grafana   Progressing
calico       Available
metallb      Available
calico       Progressing
my-grafana   Available
calico       Progressing
calico       Available

Addons start with a Progressing status. For Helm chart addons, the status is updated as the associated Job progresses. Once the Job finishes successfully, the status is set to Available.

For manifest addons, metallb becomes Available almost instantly because its manifest is just a namespace. The calico manifest, on the other hand, stays in Progressing for several minutes, until the Deployments and DaemonSets in the manifest successfully create pods.

Eventually all the addons become Available:

tpolkowski@tpolkowski-MBP16-1947 boundless-cli % k get addon -n boundless-system
NAME         STATUS
calico       Available
metallb      Available
my-grafana   Available

If we describe any addon, we can see a more detailed status and the events that have been emitted, e.g. for my-grafana:

Status:
  Last Transition Time:  2023-12-05T03:31:52Z
  Reason:                Helm Chart helm-install-grafana successfully installed
  Type:                  Available
Events:
  Type    Reason            Age                    From              Message
  ----    ------            ----                   ----              -------
  Normal  SuccessfulCreate  5m46s (x2 over 5m46s)  addon controller  Created Chart Addon monitoring/my-grafana

Test 2 : Deploy manifest that fails to install

tpolkowski@tpolkowski-MBP16-1947 boundless-cli % k get addon -A
NAMESPACE          NAME                STATUS
boundless-system   blackbox-exporter   Unhealthy
boundless-system   calico              Available

tpolkowski@tpolkowski-MBP16-1947 boundless-cli % k get manifest -A
NAMESPACE          NAME                STATUS
boundless-system   blackbox-exporter   Unhealthy
boundless-system   calico              Available

We can get details by describing the manifest:

k describe manifest blackbox-exporter -n boundless-system

Status:
  Last Transition Time:  2023-12-05T22:04:20Z
  Message:               failed to update manifest  : yaml: line 199: mapping values are not allowed in this context
  Reason:                failed to update manifest
  Type:                  Unhealthy
Events:
  Type     Reason        Age                      From                 Message
  ----     ------        ----                     ----                 -------
  Warning  FailedCreate  8m15s                    manifest controller  failed to create objects for the manifest boundless-system/blackbox-exporter : yaml: line 199: mapping values are not allowed in this context
  Warning  FailedCreate  8m13s (x3 over 8m14s)    manifest controller  failed to update manifest crd while update operation boundless-system/blackbox-exporter : Operation cannot be fulfilled on manifests.boundless.mirantis.com "blackbox-exporter": the object has been modified; please apply your changes to the latest version and try again
  Warning  FailedCreate  8m11s (x3 over 8m16s)    manifest controller  failed to update manifest object with finalizer boundless-system/blackbox-exporter
  Warning  FailedCreate  2m20s (x231 over 8m15s)  manifest controller  failed to update manifest boundless-system/blackbox-exporter : yaml: line 199: mapping values are not allowed in this context

These errors are also bubbled up to the addon:

Status:
  Last Transition Time:  2023-12-05T22:04:08Z
  Message:               failed to update manifest  : yaml: line 199: mapping values are not allowed in this context
  Reason:                failed to update manifest
  Type:                  Unhealthy
Events:
  Type     Reason        Age                    From              Message
  ----     ------        ----                   ----              -------
  Warning  FailedCreate  8m45s (x25 over 9m3s)  addon controller  Failed to Create Manifest Addon default/blackbox-exporter : Operation cannot be fulfilled on manifests.boundless.mirantis.com "blackbox-exporter": the object has been modified; please apply your changes to the latest version and try again

ranyodh · 2023-12-08T15:23:57Z

config/rbac/role.yaml

+  - list
+  - patch
+  - update
+  - watch


Does the operator need permissions for all of these? If we only use watch/get/list, then we should only ask for those permissions.

Good point. I copied the setup from kubebuilder, but I think the extra verbs were only needed in their specific example. I updated it to only use watch / get / list.
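For reference, if role.yaml is generated from kubebuilder RBAC markers, read-only access to the watched resources could look roughly like the fragment below. The grouping of resources and the file the markers live in are assumptions, not taken from this PR's diff.

```go
// Assumed markers; controller-gen would turn these into rules in config/rbac/role.yaml.
//+kubebuilder:rbac:groups=batch,resources=jobs,verbs=get;list;watch
//+kubebuilder:rbac:groups=apps,resources=deployments;daemonsets,verbs=get;list;watch
```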

ranyodh · 2023-12-08T15:33:38Z