[FR] Download multipart parts #749

Open
0xavi0 opened this issue Oct 6, 2023 · 2 comments
Labels
area/rgw-sfs RGW & SFS related area/upstream Related to an upstream project kind/feature New functionality or support for something triage/proposal for closure There are reasons for this issue to be closed

Comments

@0xavi0
Contributor

0xavi0 commented Oct 6, 2023

This was discovered while running the warp multipart test, which always complains about the size.
warp uploads parts and then tries to download them individually (not the whole object).

When a single part of a multipart object is requested, we return the whole object instead.

Example:

// create a dummy part (we'll use the same for every part upload)
$ dd if=/dev/random of=/tmp/part bs=1M count=6

// create the multipart upload (take note of the uploadId)
$ aws --endpoint=http://127.0.0.1:7480 s3api create-multipart-upload  --bucket test --key test-mp


// upload parts
$ aws --endpoint=http://127.0.0.1:7480 s3api upload-part --bucket test --key test-mp --body /tmp/part --upload-id 20231006T151146.621816566Z --part-number 1
$ aws --endpoint=http://127.0.0.1:7480 s3api upload-part --bucket test --key test-mp --body /tmp/part --upload-id 20231006T151146.621816566Z --part-number 2
$ aws --endpoint=http://127.0.0.1:7480 s3api upload-part --bucket test --key test-mp --body /tmp/part --upload-id 20231006T151146.621816566Z --part-number 3
$ aws --endpoint=http://127.0.0.1:7480 s3api upload-part --bucket test --key test-mp --body /tmp/part --upload-id 20231006T151146.621816566Z --part-number 4
$ aws --endpoint=http://127.0.0.1:7480 s3api upload-part --bucket test --key test-mp --body /tmp/part --upload-id 20231006T151146.621816566Z --part-number 5
$ aws --endpoint=http://127.0.0.1:7480 s3api upload-part --bucket test --key test-mp --body /tmp/part --upload-id 20231006T151146.621816566Z --part-number 6
$ aws --endpoint=http://127.0.0.1:7480 s3api upload-part --bucket test --key test-mp --body /tmp/part --upload-id 20231006T151146.621816566Z --part-number 7
$ aws --endpoint=http://127.0.0.1:7480 s3api upload-part --bucket test --key test-mp --body /tmp/part --upload-id 20231006T151146.621816566Z --part-number 8
$ aws --endpoint=http://127.0.0.1:7480 s3api upload-part --bucket test --key test-mp --body /tmp/part --upload-id 20231006T151146.621816566Z --part-number 9
$ aws --endpoint=http://127.0.0.1:7480 s3api upload-part --bucket test --key test-mp --body /tmp/part --upload-id 20231006T151146.621816566Z --part-number 10
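The ten upload-part calls differ only in the part number, so they can be written as a loop. A minimal sketch, assuming a POSIX shell; `upload_parts` is a hypothetical helper name, and the endpoint, bucket, key and body path are simply the ones used in this report:

```shell
# Hypothetical helper: upload /tmp/part as parts 1..COUNT of one multipart upload.
# Usage: upload_parts UPLOAD_ID COUNT
upload_parts() {
    upload_id=$1
    count=$2
    n=1
    while [ "$n" -le "$count" ]; do
        # Stop at the first failed upload instead of continuing blindly
        aws --endpoint=http://127.0.0.1:7480 s3api upload-part \
            --bucket test --key test-mp --body /tmp/part \
            --upload-id "$upload_id" --part-number "$n" || return 1
        n=$((n + 1))
    done
}
```

For the upload above this would be called as `upload_parts 20231006T151146.621816566Z 10`.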

// now list all the parts 
$ aws --endpoint=http://127.0.0.1:7480 s3api list-parts --bucket test --key test-mp --upload-id 20231006T151146.621816566Z  > /tmp/parts

// create the parts JSON file, for example:
$ cat /tmp/mps 
{
    "Parts": [
        {
            "PartNumber": 1,
            "ETag": "\"3394c11c5f93f2aa754ffb8bc6f2d02d\""
        },
        {
            "PartNumber": 2,
            "ETag": "\"3394c11c5f93f2aa754ffb8bc6f2d02d\""
        },
        {
            "PartNumber": 3,
            "ETag": "\"3394c11c5f93f2aa754ffb8bc6f2d02d\""
        },
        {
            "PartNumber": 4,
            "ETag": "\"3394c11c5f93f2aa754ffb8bc6f2d02d\""
        },
        {
            "PartNumber": 5,
            "ETag": "\"3394c11c5f93f2aa754ffb8bc6f2d02d\""
        },
        {
            "PartNumber": 6,
            "ETag": "\"3394c11c5f93f2aa754ffb8bc6f2d02d\""
        },
        {
            "PartNumber": 7,
            "ETag": "\"3394c11c5f93f2aa754ffb8bc6f2d02d\""
        },
        {
            "PartNumber": 8,
            "ETag": "\"3394c11c5f93f2aa754ffb8bc6f2d02d\""
        },
        {
            "PartNumber": 9,
            "ETag": "\"3394c11c5f93f2aa754ffb8bc6f2d02d\""
        },
        {
            "PartNumber": 10,
            "ETag": "\"3394c11c5f93f2aa754ffb8bc6f2d02d\""
        }
    ]
}
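Because every part here was uploaded from the same /tmp/part file, all ten ETags are identical, so a file like this can be generated rather than typed. A minimal sketch, assuming a POSIX shell; `make_parts_json` is a hypothetical helper, not something the aws CLI provides:

```shell
# Hypothetical helper: print a complete-multipart-upload "Parts" document
# for COUNT parts that all share one ETag (only valid when every part was
# uploaded from the same body, as in this reproduction).
# Usage: make_parts_json ETAG COUNT
make_parts_json() {
    etag=$1
    count=$2
    printf '{\n    "Parts": [\n'
    i=1
    while [ "$i" -le "$count" ]; do
        # The ETag's own double quotes must be escaped inside the JSON string
        printf '        {\n            "PartNumber": %d,\n            "ETag": "\\"%s\\""\n        }' "$i" "$etag"
        # Comma after every entry except the last
        if [ "$i" -lt "$count" ]; then printf ','; fi
        printf '\n'
        i=$((i + 1))
    done
    printf '    ]\n}\n'
}
```

Here that would be `make_parts_json 3394c11c5f93f2aa754ffb8bc6f2d02d 10 > /tmp/mps`. With parts uploaded from different bodies the ETags differ, and the values from the list-parts output would have to be used instead.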

// finish the multipart upload
$ aws --endpoint=http://127.0.0.1:7480 s3api complete-multipart-upload --bucket test --key test-mp --upload-id 20231006T151146.621816566Z --multipart-upload file:///tmp/mps

// now try to download part 1
$ aws --endpoint=http://127.0.0.1:7480 s3api  get-object --bucket test --key test-mp --part-number 1 /tmp/download-part
{
    "AcceptRanges": "bytes",
    "LastModified": "2023-10-06T15:22:12+00:00",
    "ContentLength": 62914560,
    "ETag": "\"01485cb8fa95e2c925132431fa5c484d-10\"",
    "VersionId": "3xI6p27uWobQZKLkSbUzOmmh4792IYM",
    "ContentType": "binary/octet-stream",
    "Metadata": {}
}

It returns the whole object: ContentLength, LastModified, ETag, etc. are the attributes of the whole object, not of part 1.
I suspect the --part-number parameter is simply ignored in our code right now, but I haven't checked yet (will do).
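The size mismatch can be checked mechanically by comparing the downloaded file with the part that was uploaded. A small sketch, assuming a POSIX shell; `same_size` is a hypothetical helper name:

```shell
# Hypothetical helper: succeed only if both files have the same byte count.
# Usage: same_size FILE_A FILE_B
same_size() {
    # Arithmetic expansion strips the padding some wc builds print
    a=$(($(wc -c < "$1")))
    b=$(($(wc -c < "$2")))
    [ "$a" -eq "$b" ]
}
```

In this reproduction `same_size /tmp/part /tmp/download-part` fails: the part is 6 MiB (6291456 bytes), while the download has the full ContentLength of 62914560 bytes, i.e. all ten parts.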

@0xavi0 0xavi0 added kind/bug Something isn't working kind/feature New functionality or support for something labels Oct 6, 2023
@0xavi0 0xavi0 added this to S3GW Oct 6, 2023
@github-project-automation github-project-automation bot moved this to Backlog in S3GW Oct 6, 2023
@github-actions github-actions bot added the triage/waiting Waiting for triage label Oct 6, 2023
@irq0
Contributor

irq0 commented Oct 10, 2023

The get-object request becomes a regular RGW get obj handled by our SFSObject::SFSReadOp. The RGW Ops layer doesn't seem to care about the "PartNumber" parameter, nor does the read op params struct have anything related to it.
The only reference I found is in PutObj. I think RGW simply doesn't support that feature.

I think we should file this as a not supported feature.

Should we ever want to support it: when SSE is enabled we already store multipart manifests that keep the parts/offsets around for decryption. The same could be used to support GETs with a part number, at the expense of a larger object version attr.
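To illustrate the idea only, not the s3gw implementation: if a manifest records each part's size in order, the byte extent of part N follows from a running sum over the preceding parts. The sketch below assumes a POSIX shell and takes the sizes as plain arguments; `part_extent` and this argument layout are hypothetical and do not reflect the actual manifest format.

```shell
# Hypothetical helper: print "OFFSET LENGTH" for part N (1-based).
# Usage: part_extent N SIZE_1 SIZE_2 ...
part_extent() {
    want=$1
    shift
    # After the shift, $# is the number of parts in the "manifest"
    if [ "$want" -lt 1 ] || [ "$want" -gt "$#" ]; then
        return 1
    fi
    offset=0
    idx=1
    for size in "$@"; do
        if [ "$idx" -eq "$want" ]; then
            printf '%d %d\n' "$offset" "$size"
            return 0
        fi
        # Each earlier part pushes the requested part further into the object
        offset=$((offset + size))
        idx=$((idx + 1))
    done
}
```

For the object in this report (ten parts of 6291456 bytes each), part 1 would map to offset 0 and length 6291456, and part 10 to offset 56623104 with the same length, which together add up to the reported ContentLength of 62914560.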

@jecluis
Copy link
Contributor

jecluis commented Oct 16, 2023

Agreed. I think this can go on the unsupported pile. This needs an entry in the compatibility matrix. If RGW itself does not support this, we should consider contributing this behavior upstream at some point.

Keeping this issue around, pushing it to a later milestone, and tagging it for upstream contribution.

@jecluis jecluis added area/upstream Related to an upstream project area/rgw-sfs RGW & SFS related and removed kind/bug Something isn't working triage/waiting Waiting for triage labels Oct 16, 2023
@jecluis jecluis added this to the v0.25.0 milestone Oct 16, 2023
@jecluis jecluis added the priority/2 To be prioritized according to impact label Oct 16, 2023
@jecluis jecluis added triage/proposal for closure There are reasons for this issue to be closed and removed priority/2 To be prioritized according to impact labels Mar 20, 2024
@jecluis jecluis removed this from the v0.25.0 milestone Mar 20, 2024
@jecluis jecluis added this to s3gw Mar 21, 2024

3 participants