Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/cldsrv 546 post object #5601
base: epic/RING-45960-postObject-api
Are you sure you want to change the base?
Feature/cldsrv 546 post object #5601
Changes from all commits
efc79b2
6f82826
41b839c
9d9c4ae
b740e13
6ff75db
8d59e1c
b954a45
3443d71
0d40c42
99b0c26
252205a
7c18326
88403e4
7018ba1
d09a42d
2a7d4cf
3aadbde
29ebf3f
2a41094
cb7b4fe
95140e3
92d9273
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some more details about
multipart/form-data
, parsing bybusboy
and aws behavior with some links to documentation:Content-Type
Documentation for the Content-Type, with the boundary directive and example with body: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Type
Again,
busboy
will acceptapplication/x-www-form-urlencoded
(https://github.com/fastify/busboy/blob/master/lib/types/urlencoded.js#L9)That will unnecessarily read the whole body and trigger an error
400 InvalidArgument
instead of the412 Precondition Failed
returned by aws and stopping the request early.Content-Disposition
The documentation about
Content-Disposition
gives you details for its usage inmultipart/form-data
: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-DispositionIn the body, a part will look like
Here
busboy
will parse this and emit afield
event withfieldname
beingf1
You can send this with:
<input type="text" name="f1">
curl -F f1=value1
curl -F "f1=</etc/hostname"
(this put the content of the file but not the filename)form-data
or rawYou can also have
You can send this with:
<input type="file" name="f2">
curl -F "[email protected]"
(filename will be example.txt)form-data
or rawNotice the
filename
directive, this will makebusboy
emit afile
event withfieldname
beingf2
.As described in busboy documentation: https://www.npmjs.com/package/@fastify/busboy#busboy-methods with the default config
isPartAFile
.You can notice in AWS documentation about the
file
form field: https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectPOST.html#RESTObjectPOST-requests-form-fieldsSo you can either send:
Content-Disposition: form-data; name="file"
(with an input of type text)Content-Disposition: form-data; name="file"; filename="whatever.txt"
Which means by default the
file
form field can be emitted either by thefield
orfile
busboy event.Also by testing on AWS, it accepts other form fields like
key
,acl
having afilename
directive (meaning coming from an input type="file"), example this should be possible:Content-Disposition: form-data; name="key"; filename="whatever.txt"
Content-Disposition: form-data; name="acl"; filename="whatever.txt"
So any form field can be emitted by either the
field
orfile
busboy event.The processing / validation should be performed based on the
fieldname
and not the busboy event.Note that it could be interesting to replace the default behavior of busboy and defining
isPartAFile
to consider a file only thefieldname === 'file'
(with potentially presence offilename
).When doing that there should also be some limits defined on busboy, if a form field other than
file
is being a big file and now parsed as a string instead of a stream, limit would help busboy to stop reading it. (The defaultlimit.fieldSize
is 1MiB but AWS accepts only 20KB for the form-data exluding file).${filename} replacement
About
${filename}
replacement, it's not only on the key, but all form fields: https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-HTTPPOSTForms.htmlAlso the documentation defines how
filename
should be sanitized (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Disposition#filename)Busboy should sanitize it, but you could add some tests with filename containing
../xxx
,a/b/c
, etc to ensure the${filename}
works as aws.And validation on form fields, like
key
length, should then be performed after${filename}
replacement as it could impact the value.Additional validation
You can receive a
Content-Disposition: form-data
without name, that would make an undefinedfieldname
and crash:TypeError: Cannot read properties of undefined (reading 'toLowerCase')
Also by testing on AWS, they seem to trim the
fieldname
of whitespaces:name=" \n key \n "
will should matchkey
.It's also possible to have multiple parts with the same
name
.For example with this:
or
curl -F f1=value1 -F f1=value2
We should ensure, just like AWS that we don't have multiple times the same form field, whether busboy emits it via
field
orfile
.Maybe the file should not be read if the parsing failed and an error was caught.
And when reading (pipe) the file stream, we need to catch error, otherwise it can crash:
Error: Part terminated early due to unexpected end of multipart data
It seems the
finish
event is triggered once the file is completely read, and there is a validation logic in there, the logic could happen earlier before writting file to disk.There is a validation on
Content-Length
as well:The
key
should be validated just like it's done for regularPutObject
.See https://github.com/scality/Arsenal/blob/7eb2701f21aba91d45c2454ee8c5c257fea1c0db/lib/s3routes/routes.ts#L65-L73
Maybe there could be some blacklisted object prefix on cloudserver
You should find attached a javascript script to write the raw body form-data easier than using `nc`. It allows to test invalid bodies that are usually well formated by lib like `form-data` or by `curl` or `postman`.
Also you can add a delay between writes to test server timeout behavior.
Generating the body raw without a lib can help you test any kind of invalid body to make sure you use busboy correctly and validate all input before using them to prevent a crash of the server.
Busboy already takes care of all the parsing, we need to configure it correctly (cf busboy readme) to match AWS spec and validate inputs to prevent a crash.
Ideally you could extract the parsing / validation (parseFormData function) with busboy into its own isolated class to make it easier for testing. And if you want to be consistent with callbacks usage instead of promises and async await, you can pass a callback as input of your parsing function to call once parsing /validation is done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
startsWith
will allow things likeacl abc
. Known fields should be checked with equality and the specialx-amz-meta-
can check checked aside