Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

validate if cluster FSID is empty #185

Merged

Conversation

umangachapagain
Copy link
Contributor

When generating replicationID in CreateUniqueReplicationId(), ensure that at least 2 non-empty cluster FSID strings are available. Also, refactor to include sorting logic to enure consistent hashing.

Copy link
Contributor

openshift-ci bot commented Nov 29, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: umangachapagain

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

}

if len(fsids) < 2 {
return "", fmt.Errorf("replicationID can not be generated due to missing cluster FSID")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add which cluster ID we are actually missing. It should be present as the key in clusterFSIDs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There could be multiple and I didn't want to do a bunch of string manipulation. Since this is a very corner case, can we keep that as a follow up item?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

String manipulation may not be required if you add it in the above loop. We can check if something is empty and just log that this spoke cluster's fsid is nil.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or it can done in the place where we are fetching the clusterFSIDs. It should be easier and will make more sense to log that

fsids = append(fsids, v)
replicationId, err := utils.CreateUniqueReplicationId(clusterFSIDs)
if err != nil {
return err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning an error will keep it in continous reconcilation and will keep printing the logs.
Do we want to log the info and reconcile at a later interval ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error loop is fine I guess, since we do not expect it to be for long. Also, better to error out that a manual restart.

When generating replicationID in CreateUniqueReplicationId(), ensure
that at least 2 non-empty cluster FSID strings are available. Also,
refactor to include sorting logic to enure consistent hashing.

Signed-off-by: Umanga Chapagain <[email protected]>
In some cases, FSID can be nil and later be populated. For such cases,
we need to ensure that ManifestWork for VRC is updated with newly
generated replication ID.

Signed-off-by: Umanga Chapagain <[email protected]>
@ShyamsundarR
Copy link

LGTM! Looping on the reconcile should be fine, is there a max reconcile exponential backoff set for this reconciler? Asking to see if this could result in a backoff due to cluster claims not present for a larger interval.

@umangachapagain
Copy link
Contributor Author

LGTM! Looping on the reconcile should be fine, is there a max reconcile exponential backoff set for this reconciler? Asking to see if this could result in a backoff due to cluster claims not present for a larger interval.

@ShyamsundarR We use the defaults. Not sure what those values are.

@vbnrh
Copy link
Member

vbnrh commented Nov 29, 2023

/lgtm
/hold

@umangachapagain
Copy link
Contributor Author

/unhold

@openshift-merge-bot openshift-merge-bot bot merged commit 1b647ef into red-hat-storage:main Nov 29, 2023
11 checks passed
@umangachapagain umangachapagain deleted the fsid-mismatch branch November 29, 2023 15:29
@umangachapagain
Copy link
Contributor Author

/cherry-pick release-4.14

@openshift-cherrypick-robot

@umangachapagain: new pull request created: #186

In response to this:

/cherry-pick release-4.14

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants