Fix triggering replication when multiple destinations are set #331

Open · wants to merge 1 commit into base: development/1.15
Conversation

Kerkesni
Contributor

@Kerkesni Kerkesni commented Dec 19, 2024

  • Issue 1: In Artesca, "SITE_NAME" is never used, so we always
    trigger objects that are replicated to the first storageClass in
    the replication rule.

  • Issue 2: We check the global replication status when verifying
    whether or not an object should be retriggered. This does not always
    work, especially when replicating to multiple destinations:
    if one destination fails, the global status becomes FAILED, which,
    for example, makes it impossible to retrigger objects with a
    COMPLETED status.

  • Issue 3: The replication info is completely overwritten when it does
    not contain info about a specific site. This causes an issue when
    replicating to multiple destinations, as the script can only be
    launched for one site at a time: for an object with non-initialized
    replication info, we would not be able to set the replication info
    properly for all destinations.
Issue: S3UTILS-184
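To make Issue 2 concrete, here is a minimal sketch (plain JavaScript; the field names are modeled on Zenko object metadata, not the actual s3utils code) of why the retrigger decision has to look at the per-site backend entry rather than the aggregated global status:

```javascript
// With multiple destinations, the global status can be FAILED while an
// individual site is COMPLETED, so checking the global status alone either
// retriggers sites that already completed or skips sites that failed.
function shouldRetrigger(replicationInfo, site) {
    if (!replicationInfo || !replicationInfo.status) {
        // no replication info yet: always eligible
        return true;
    }
    const backend = (replicationInfo.backends || [])
        .find(b => b.site === site);
    // retrigger only when this specific site is missing or FAILED,
    // regardless of the aggregated global status
    return !backend || backend.status === 'FAILED';
}

const info = {
    status: 'FAILED', // global status aggregated over all sites
    backends: [
        { site: 'aws-location', status: 'COMPLETED' },
        { site: 'azure-blob', status: 'FAILED' },
    ],
};
console.log(shouldRetrigger(info, 'aws-location')); // false
console.log(shouldRetrigger(info, 'azure-blob'));   // true
```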

@bert-e
Contributor

bert-e commented Dec 19, 2024

Hello kerkesni,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options
name description privileged authored
/after_pull_request Wait for the given pull request id to be merged before continuing with the current one.
/bypass_author_approval Bypass the pull request author's approval
/bypass_build_status Bypass the build and test status
/bypass_commit_size Bypass the check on the size of the changeset TBA
/bypass_incompatible_branch Bypass the check on the source branch prefix
/bypass_jira_check Bypass the Jira issue check
/bypass_peer_approval Bypass the pull request peers' approval
/bypass_leader_approval Bypass the pull request leaders' approval
/approve Instruct Bert-E that the author has approved the pull request. ✍️
/create_pull_requests Allow the creation of integration pull requests.
/create_integration_branches Allow the creation of integration branches.
/no_octopus Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead
/unanimity Change review acceptance criteria from one reviewer at least to all reviewers
/wait Instruct Bert-E not to run until further notice.
Available commands
name description privileged
/help Print Bert-E's manual in the pull request.
/status Print Bert-E's current status in the pull request TBA
/clear Remove all comments from Bert-E from the history TBA
/retry Re-start a fresh build TBA
/build Re-start a fresh build TBA
/force_reset Delete integration branches & pull requests, and restart merge process from the beginning.
/reset Try to remove integration branches unless there are commits on them which do not appear on the source branch.

Status report is not available.

@scality scality deleted a comment from bert-e Dec 19, 2024
@bert-e
Contributor

bert-e commented Dec 19, 2024

Request integration branches

Waiting for integration branch creation to be requested by the user.

To request integration branches, please comment on this pull request with the following command:

/create_integration_branches

Alternatively, the /approve and /create_pull_requests commands will automatically
create the integration branches.

@bert-e
Contributor

bert-e commented Dec 19, 2024

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • 2 peers

The following options are set: create_pull_requests

@Kerkesni Kerkesni changed the base branch from development/1.14 to development/1.15 December 20, 2024 13:46
@bert-e
Contributor

bert-e commented Dec 20, 2024

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • 2 peers

The following options are set: create_pull_requests

@scality scality deleted a comment from bert-e Dec 20, 2024
@scality scality deleted a comment from bert-e Dec 20, 2024
-    if (!objMD.getReplicationInfo()
-        || !objMD.getReplicationSiteStatus(storageClass)) {
+    let replicationInfo = objMD.getReplicationInfo();
+    if (!replicationInfo || !replicationInfo.status) {
         const { Rules, Role } = repConfig;
         const destination = Rules[0].Destination.Bucket;
Contributor

@francoisferrand francoisferrand Dec 20, 2024


This may not be correct if there is more than one destination...

Contributor Author


crrExistingObjects doesn't currently support triggering replication for multiple sites at once, so this works: the replication info is initialized when triggering the first site, and the other sites' info is appended to the fields when they are triggered.

Contributor


Even without triggering multiple sites at once: if a user requests replication to the second "site", this will initialize the info with the 1st destination (--> trigger replication!), and then we will add the second one...

When there is no replicationInfo (i.e. typically when it is empty), should we not just initialize an empty replicationInfo, and let the next block ("Update replication info with site specific info") fill the details for the requested destination?
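A minimal sketch of that suggestion (the field names follow the Zenko replication-info shape, but everything here is illustrative, not the actual crrExistingObjects code): initialize an empty skeleton when there is no replication info at all, and let the site-specific update append exactly one destination per run:

```javascript
// Initialize an empty replicationInfo skeleton instead of pre-filling it
// with the first rule's destination (which would trigger that destination).
function initEmptyReplicationInfo(role) {
    return {
        status: '',
        backends: [],     // one entry per destination, filled lazily
        content: ['DATA', 'METADATA'],
        destination: '',  // left empty instead of Rules[0].Destination.Bucket
        storageClass: '',
        role,
        storageType: '',
    };
}

// The "Update replication info with site specific info" step, per site:
// append the requested site only, preserving already-triggered sites.
function setSiteInfo(replicationInfo, site, storageType) {
    replicationInfo.backends.push({ site, status: 'PENDING' });
    replicationInfo.status = 'PENDING';
    replicationInfo.storageClass = replicationInfo.storageClass
        ? `${replicationInfo.storageClass},${site}` : site;
    replicationInfo.storageType = replicationInfo.storageType
        ? `${replicationInfo.storageType},${storageType}` : storageType;
    return replicationInfo;
}

// One script invocation per site, run twice:
const info = initEmptyReplicationInfo('arn:aws:iam::root:role/s3-replication');
setSiteInfo(info, 'aws-location', 'aws_s3');
setSiteInfo(info, 'azure-blob', 'azure');
console.log(info.storageClass); // 'aws-location,azure-blob'
```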

Contributor

@francoisferrand francoisferrand Dec 26, 2024


in cloudserver, it seems this is initialized to bucketMD.replicationConfig.destination : should we do the same?

Contributor Author

@Kerkesni Kerkesni Dec 26, 2024


This is a weird one... We currently only support having a single destination bucket for all replication rules of a bucket. When creating the rules/workflows via the UI it's even worse: the destination bucket becomes the name of the bucket we are replicating from.
The value of this field is not used in Zenko. When creating a location, Cloudserver initializes a client class for the respective backend (aws/azure/...) that keeps the name of the destination bucket in memory; that is the value used when replicating, not what is in the replication rule (only the storageClass is used to know which client to use).

Comment on lines +197 to +199
     storageClass: '',
     role: Role,
-    storageType: this.storageType,
+    storageType: '',
Contributor


what are these 2 fields (storageClass and storageType) used for?

Contributor Author


storageClass is what each location's replication queue processor uses to check whether it should replicate an object or not.
storageType is used by Cloudserver in the Backbeat routes to do some pre-checks (check that versioning is supported on the backend and that the location is valid).

Contributor

@francoisferrand francoisferrand Dec 26, 2024


That is how these fields are used... my question is really what information they actually store, what these fields represent (for example, the storageClass of the object is already known and stored in .location[] and dataStoreName; and this cannot be the storageClass of the 'remote' object, since there may be multiple destinations...).

Same for destination, btw: I don't understand why this field is initialized with a single bucket name :-/ Or maybe this is a left-over from the first replication implementation (which did not support multi-targets)?

Contributor

@francoisferrand francoisferrand Dec 26, 2024


the code which initializes these in cloudserver:

function _getReplicationInfo(rule, replicationConfig, content, operationType,
    objectMD, bucketMD) {
    const storageTypes = [];
    const backends = [];
    const storageClasses = _getStorageClasses(rule);
    if (!storageClasses) {
        return undefined;
    }
    storageClasses.forEach(storageClass => {
        const storageClassName =
              storageClass.endsWith(':preferred_read') ?
              storageClass.split(':')[0] : storageClass;
        const location = s3config.locationConstraints[storageClassName];
        if (location && replicationBackends[location.type]) {
            storageTypes.push(location.type);
        }
        backends.push(_getBackend(objectMD, storageClassName));
    });
    if (storageTypes.length > 0 && operationType) {
        content.push(operationType);
    }
    return {
        status: 'PENDING',
        backends,
        content,
        destination: replicationConfig.destination,
        storageClass: storageClasses.join(','),
        role: replicationConfig.role,
        storageType: storageTypes.join(','),
        isNFS: bucketMD.isNFS(),
    };
}

--> it seems these fields (along with role) are not multi-destination/rule aware?
--> these should be mostly useless for Zenko's multibackend replication (otherwise we probably have bugs), but may still be needed for CRR (and may cause bugs if multiple rules/destinations were used in that case)

Contributor Author

@Kerkesni Kerkesni Dec 26, 2024


storageClass and storageType are multi-destination aware (although their names suggest otherwise): they contain the list of all storage classes and storage types we are replicating to.
Example:

storageClass: "aws-location,azure-blob"
storageType: "aws_s3,azure"

In Zenko, these fields are duplicates of information we already have, as backbeat/cloudserver have a list of all location information, we could just use the storage class stored in the rules to get the info we want. I think these are more of a relic from S3C that we can't really remove right now as S3C uses them.

In CRR, the role is also a list. I don't think this works in Zenko, though.
The destination field is a weird one; I explained how it works in the previous comment.
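To illustrate the comma-joined aggregate fields described above (the location names and backend types are made-up examples, mirroring the shape of the _getReplicationInfo code quoted earlier):

```javascript
// storageClass/storageType aggregate every destination of the replication
// config into comma-separated strings, one entry per destination.
const locationConstraints = {
    'aws-location': { type: 'aws_s3' },
    'azure-blob': { type: 'azure' },
};
const storageClasses = ['aws-location', 'azure-blob']; // from the rules
const fields = {
    storageClass: storageClasses.join(','),
    storageType: storageClasses
        .map(sc => locationConstraints[sc].type)
        .join(','),
};
console.log(fields.storageClass); // 'aws-location,azure-blob'
console.log(fields.storageType);  // 'aws_s3,azure'
```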

- Issue 1: In Artesca, "SITE_NAME" is never passed, so we always
trigger objects that are replicated to the first storageClass in
the replication rule.

- Issue 2: We check the global replication status when verifying
whether or not an object should be retriggered. This does not always
work, especially when replicating to multiple destinations:
if one destination fails, the global status becomes FAILED, which,
for example, makes it impossible to retrigger objects with a
COMPLETED status.

- Issue 3: The replication info is completely overwritten when it does
not contain info about a specific site. This causes an issue when
replicating to multiple destinations, as the script can only be
launched for one site at a time: for an object with non-initialized
replication info, we would not be able to set the replication info
properly for all destinations.

Issue: S3UTILS-184
4 participants