-
Notifications
You must be signed in to change notification settings - Fork 790
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: report ArgoCD errors via .status.conditions field #1608
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @jparsai, one additional request in addition to what is described below.
Part of this story is ensuring that the error messages that we are setting are user friendly.
AFAICT if you search for the following strings in the code, you will find cases where we are creating new errors:
errors.New
fmt.Errorf
I did that, and one thing I noticed is there are cases where there are error log statement are very user-friendly, and then below them is an error which is very user unfriendly:
reqLogger.Info(fmt.Sprintf("the ArgoCD instance '%s' does not match the label selector '%s' and skipping for reconciliation", request.NamespacedName, r.LabelSelector)) log.Error(err, fmt.Sprintf("Arg %s is already part of the default command arguments", arg))
In these cases, I think we should make the error returned just as 'friendly' as the log statement logging the error case. AFAICT there is no reason to not have the error message be the same as the logged error message, as the logger error message provided much more datail.
It would also be beneficial to look at the error messages in general, by searching for the strings mentioned above. See if you can decide if they can be improved to provide better context for what the Argo CD operator was doing when the error occurred. (where possible)
@@ -117,11 +117,26 @@ func (r *ReconcileArgoCD) Reconcile(ctx context.Context, request ctrl.Request) ( | |||
labelSelector, err := labels.Parse(r.LabelSelector) | |||
if err != nil { | |||
reqLogger.Info(fmt.Sprintf("error parsing the labelSelector '%s'.", labelSelector)) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we reduce the amount of code duplication and/or places in the code where we need to update the .status.conditions
fields, in this file? reconcileRolloutsManager
function in Rollouts operator would be an example of this.
For example:
func (r *ReconcileArgoCD) Reconcile(ctx context.Context, request ctrl.Request) (ctrl.Result, error) {
// (...)
result, err := internalReconcile(...)
if err := updateStatusConditionOfArgoCD(ctx, createCondition(err.Error()), argocd, r.Client, log); err != nil {
log.Error(error, "unable to update status of ArgoCD")
return reconcile.Result{}, error
}
return result, err
}
func (r *ReconcileArgoCD) internalReconcile(ctx context.Context, request ctrl.Request) (ctrl.Result, error) {
// the contents of the existing Reconcile function
}
Or you could do it how we do it in Argo Rollouts operator, in reconcileRolloutsManager
, which is basically:
func (r *ReconcileArgoCD) internalReconcile(ctx context.Context, request ctrl.Request) (metav1.Condition, error) {
// the contents of the existing Reconcile function
}
@@ -10,6 +10,11 @@ metadata: | |||
name: example-argocd | |||
status: | |||
phase: Available | |||
conditions: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there a few failing cases that we can trigger and check, as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried it before, but in existing logic it is not possible to intentionally fail the reconciliation. At few places we can make invalid configs in CR, which will log error, but operator doesn't return it, since in next cycle it will fail again until user changes the CR configs. I checked existing test cases as well, but there are no negative test cases.
assert.Equal(t, a.Status.Conditions[0].Type, argoproj.ArgoCDConditionType) | ||
assert.Equal(t, a.Status.Conditions[0].Reason, argoproj.ArgoCDConditionReasonErrorOccurred) | ||
assert.Equal(t, a.Status.Conditions[0].Message, "the ArgoCD instance 'argocd/argo-test-2' does not match the label selector 'foo=bar' and skipping for reconciliation") | ||
assert.Equal(t, a.Status.Conditions[0].Status, metav1.ConditionFalse) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we verify that fixing the error, and then calling reconcile again, will remove the condition?
} | ||
|
||
// insertOrUpdateConditionsInSlice is a generic function for inserting/updating metav1.Condition into a slice of []metav1.Condition | ||
func insertOrUpdateConditionsInSlice(newCondition metav1.Condition, existingConditions []metav1.Condition) (bool, []metav1.Condition) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need unit tests for this, you can likely use the ones from argo-rollouts-manager, and port them here.
401cc52
to
e1224cb
Compare
Signed-off-by: Jayendra Parsai <[email protected]>
What type of PR is this?
/kind enhancement
What does this PR do / why we need it:
This PR is to add a new
Conditions
field inStatus
of ArgoCD CR which would contain a single condition of 'Type: Reconciled' for any error that occurs.Which issue(s) this PR fixes:
Red Hat issue tracker :- https://issues.redhat.com/browse/GITOPS-5876
How to test changes / Special notes to the reviewer:
.status.conditions
field should be updated based on error or success.