Reintroduce missing events for helmChart reconciliation failures #907

Merged
souleb merged 1 commit into main from add-reconciliation-events
Mar 7, 2024

Conversation

souleb
Member

@souleb souleb commented Mar 6, 2024

fixes 4649

If implemented, this PR reintroduces events for some failing actions during the reconciliation process, related to chart retrieval and to the loading of the chart and its values.

We do not reintroduce all events (the rest are still available in the logs), only those where it is desirable to have notifications in place.

@souleb souleb force-pushed the add-reconciliation-events branch 3 times, most recently from 77ba08c to 57a15dd Compare March 6, 2024 12:03
Member Author

souleb commented Mar 6, 2024

Testing with a faulty ConfigMap gives us the following event and status:

{"level":"info","ts":"2024-03-06T11:09:39.678Z","logger":"event-server","msg":"dispatching event","eventInvolvedObject":{"kind":"HelmRelease","namespace":"backend","name":"redis","uid":"50f43354-d991-4149-bd7f-31074eed9a83","apiVersion":"helm.toolkit.fluxcd.io/v2beta2","resourceVersion":"625589"},"message":"could not resolve ConfigMap chart values reference 'backend/application-values-g9g5b268dh' with key 'values.yaml': key not found"}
Status:
  Conditions:
    Last Transition Time:  2024-03-06T11:10:17Z
    Message:               Fulfilling prerequisites
    Observed Generation:   8
    Reason:                Progressing
    Status:                True
    Type:                  Reconciling
    Last Transition Time:  2024-03-06T11:10:17Z
    Message:               could not resolve ConfigMap chart values reference 'backend/application-values-g9g5b268dh' with key 'values.yaml': key not found
    Observed Generation:   8
    Reason:                ValuesError
    Status:                False
    Type:                  Ready

If implemented, this PR reintroduces events for some failing actions
during the reconciliation process, related to the helmChart retrieval
and loading of chart and values.

Signed-off-by: Soule BA <[email protected]>
@souleb souleb force-pushed the add-reconciliation-events branch from 57a15dd to e283ead Compare March 6, 2024 14:52
@souleb souleb requested a review from darkowlzz March 6, 2024 21:46
Member

@stefanprodan stefanprodan left a comment

LGTM

Thanks @souleb 🥇

@stefanprodan stefanprodan changed the title Reintroduce missing events for helmChart reconciliation Reintroduce missing events for helmChart reconciliation failures Mar 7, 2024
@souleb souleb merged commit b79cad0 into main Mar 7, 2024
6 checks passed
@souleb souleb deleted the add-reconciliation-events branch March 7, 2024 12:07
@@ -259,6 +259,7 @@ func (r *HelmReleaseReconciler) reconcileRelease(ctx context.Context, patchHelpe
conditions.MarkStalled(obj, aclv1.AccessDeniedReason, err.Error())
conditions.MarkFalse(obj, meta.ReadyCondition, aclv1.AccessDeniedReason, err.Error())
conditions.Delete(obj, meta.ReconcilingCondition)
r.Eventf(obj, eventv1.EventSeverityError, aclv1.AccessDeniedReason, err.Error())
Contributor

@darkowlzz darkowlzz Mar 7, 2024

The event type seems incorrect for our event recorder. Our event recorder implements the Kubernetes EventRecorder interface, which only supports Normal and Warning events. In our implementation, we also accept the Trace type and convert it to the Normal type; see https://github.com/fluxcd/pkg/blob/0cf0546cb4ded7301cf3f8476c2d868c8d27a187/runtime/events/recorder.go#L225 .
eventv1.EventSeverityError seems to be meant for the notification-controller event severity instead.

Because of how our event recorder is written, this doesn't result in any error or failure, but the argument value seems incorrect. The same is done in line 240 above. Everywhere else in this repository, the proper event type is passed to the event recorder.

Member Author

The way this was fixed in helm-controller was with this function:

func (r *HelmReleaseReconciler) event(_ context.Context, hr v2.HelmRelease, revision, severity, msg string) {
...
  eventType := corev1.EventTypeNormal
  if severity == eventv1.EventSeverityError {
    eventType = corev1.EventTypeWarning
  }
  r.EventRecorder.AnnotatedEventf(&hr, eventMeta, eventType, severity, msg)
}

and then in fluxcd/pkg/runtime/events/recorder.go we have this function:

func eventTypeToSeverity(eventType string) string {
	switch eventType {
	case corev1.EventTypeWarning:
		return eventv1.EventSeverityError
	case eventv1.EventTypeTrace:
		return eventv1.EventSeverityTrace
	default:
		return eventv1.EventSeverityInfo
	}
}

This code ensures that you always pass Warning or Normal to the recorder, which can then map the type back to the right severity for notification-controller (NC).

Trace events can be passed as-is to the recorder, which expects them and does not send them to NC:
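
To make the round trip concrete, here is a minimal, self-contained sketch of the two mappings quoted above. The constants are local stand-ins, and their string values are assumptions meant to mirror corev1 and eventv1; check the real packages before relying on them. `severityToEventType` is a hypothetical name for the controller-side logic.

```go
package main

import "fmt"

// Local stand-ins for the constants used in this thread. The string values
// are assumptions that mirror corev1 and eventv1, not imports of them.
const (
	eventTypeNormal  = "Normal"  // stands in for corev1.EventTypeNormal
	eventTypeWarning = "Warning" // stands in for corev1.EventTypeWarning
	eventTypeTrace   = "Trace"   // stands in for eventv1.EventTypeTrace

	severityInfo  = "info"  // stands in for eventv1.EventSeverityInfo
	severityError = "error" // stands in for eventv1.EventSeverityError
	severityTrace = "trace" // stands in for eventv1.EventSeverityTrace
)

// severityToEventType mirrors the controller-side helper: only an error
// severity becomes a Warning, everything else is Normal.
func severityToEventType(severity string) string {
	if severity == severityError {
		return eventTypeWarning
	}
	return eventTypeNormal
}

// eventTypeToSeverity mirrors the recorder-side function quoted above.
func eventTypeToSeverity(eventType string) string {
	switch eventType {
	case eventTypeWarning:
		return severityError
	case eventTypeTrace:
		return severityTrace
	default:
		return severityInfo
	}
}

func main() {
	// Round trip: error severity -> Warning type -> error severity.
	fmt.Println(eventTypeToSeverity(severityToEventType(severityError)))

	// A severity passed where an event type is expected matches no case
	// and falls through to the default branch.
	fmt.Println(eventTypeToSeverity(severityError))
}
```

With these stand-in values, the second call returns the info severity. If the recorder derives the NC severity this way, passing eventv1.EventSeverityError as the event type would not fail, but the event could be reported with a lower severity than intended.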

	// Do not send trace events to notification controller,
	// traces are persisted as Kubernetes events only as normal events.
	if severity == eventv1.EventSeverityTrace {
		r.EventRecorder.AnnotatedEventf(object, annotations, corev1.EventTypeNormal, reason, messageFmt, args...)
		return
	}

I think we can make the recorder infer the right severity and final event type from the passed event type directly in fluxcd/pkg/runtime/events/recorder.go. I'll give it a try.
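
One way such an inference could look is sketched below. This is purely illustrative and is not the change that was made in fluxcd/pkg: `normalize` is a hypothetical helper, and the constant values are assumed stand-ins for the corev1 and eventv1 constants.

```go
package main

import "fmt"

// Assumed stand-in values; the real constants live in corev1 and eventv1.
const (
	typeNormal  = "Normal"
	typeWarning = "Warning"
	typeTrace   = "Trace"

	sevInfo  = "info"
	sevError = "error"
	sevTrace = "trace"
)

// normalize is a hypothetical helper. It accepts either a core event type or
// an NC severity and returns the Kubernetes event type to record together
// with the NC severity to report.
func normalize(typeOrSeverity string) (eventType, severity string) {
	switch typeOrSeverity {
	case typeWarning, sevError:
		return typeWarning, sevError
	case typeTrace, sevTrace:
		// Traces are persisted as Normal Kubernetes events.
		return typeNormal, sevTrace
	default:
		return typeNormal, sevInfo
	}
}

func main() {
	for _, in := range []string{typeWarning, sevError, typeTrace, typeNormal} {
		t, s := normalize(in)
		fmt.Printf("%q -> type=%s severity=%s\n", in, t, s)
	}
}
```

The design choice here is to tolerate both vocabularies at the boundary so that a caller passing a severity by mistake still gets a Warning event. The alternative is to reject unknown types, which surfaces the mistake earlier.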

Contributor

I believe the severity exists so that NC can filter notifications by severity. The event type that the event recorder accepts is one of the upstream core types, Normal and Warning. Since we needed a way to send only Kubernetes events and skip NC events, we introduced the Trace type as well. So the event recorder's event types are different from the NC event severities and serve a different purpose.
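
The distinction can be modelled with a small sketch that uses two in-memory sinks in place of the Kubernetes API and NC. The `recorder` and `sink` types are hypothetical, and the string values are assumptions; the sketch only illustrates the routing described in this thread, not the real recorder.

```go
package main

import "fmt"

// sink collects the events it receives; a stand-in for a real backend.
type sink struct{ events []string }

func (s *sink) send(label, msg string) {
	s.events = append(s.events, label+": "+msg)
}

// recorder is a hypothetical model with one sink per destination.
type recorder struct {
	kube *sink // Kubernetes events, labelled by event type
	nc   *sink // notification-controller events, labelled by severity
}

// event routes by event type: Trace is stored as a Normal Kubernetes event
// and never reaches NC; other types go to both sinks.
func (r *recorder) event(eventType, msg string) {
	if eventType == "Trace" {
		r.kube.send("Normal", msg)
		return
	}
	severity := "info"
	if eventType == "Warning" {
		severity = "error"
	}
	r.kube.send(eventType, msg)
	r.nc.send(severity, msg)
}

func main() {
	r := &recorder{kube: &sink{}, nc: &sink{}}
	r.event("Trace", "fetched chart")
	r.event("Warning", "values key not found")
	fmt.Println("kubernetes:", r.kube.events)
	fmt.Println("nc:", r.nc.events)
}
```

In this model the event type decides where an event goes, while the severity only labels what NC receives, which matches the separation of purposes described above.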


Successfully merging this pull request may close these issues.

flux 2.1.1 to 2.2.2 / 2.2.3 some alerts notification are missing
3 participants