-
Notifications
You must be signed in to change notification settings - Fork 188
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a hook or callback for cleaning up on disk file uploads #264
Comments
I totally agree and have to admit, that I completely forgot about this behavior in my merge request. But why should a custom cleanup function from the calling application take care about the cleanup. In my opinion, the library should always cleanup all uploaded files. Why not put it into the destructor of the file_info class? Of course there is a use case (i.e. the use case, I implemented this feature for), where we want to have these uploaded files be preserved longer than the actual request. |
This is what I suspected. Some users want to keep those files, others don't, others maybe only want to keep some files. What about a default cleanup method built into libhttpserver, which would remove all files, and a possibility to override that cleanup method by the application? Could look somewhat like the already present custom defaulted error messages or custom logging callbacks? |
What would a custom cleanup mean? Problem I see is, that one would have to keep track of the files, which should not be deleted in the application. And the handler can be called completely out of the render functions. Personally, I would rather prefer a solution, where you could command the library to delete them or not. So, if the library sends an error message (i.e. 404) or if the render function (i.e. for render put) does not take care about the files, the library should automatically delete the files in the end. |
What I had in mind was a rather global function registered on webserver creation.
Yes. But I see this as a feature. The render functions are bound to a registered resource. File uploads (think of malicious requests from client with bad intent) can happen to any resource, especially resources where the webserver would answer with 404.
Mkay. Do you have an idea how this could be realized from API point of view? |
Actually it might really get difficult to find a good solution for setting the cleanup on every request individually (at least I have no quick idea on that). So yes, a callback or something like that is probably a better solution. Maybe a little bit more radical approach: |
What I had in mind was the whole map.
Yes, I mean I could live with that, because it would not interfere at all with my usecase. For the case where those files are needed later, that copy might be expensive, depending on how it is done, but you could argue that's business of the user. Just moving (renaming) a file would make sense for the case where the lib generates the filenames anyways, but I did not thought of this in the first place, because I considered the lib responsible for those files. Either way, it would need an update of documentation, to clearly state what the library does and what not. It's a little hard to guess what user expectations are in this case. 😕 |
I am fine if the cleanup function would get the whole map. This was merely a question. Actually the "radical" approach would just be the same if the user of the lib would not provide a custom cleanup function. Then the lib would again always delete all uploaded files and a user could simply preserve files by copying or renaming. As a default cleanup is needed anyway I offering the possibility for a custom cleanup function is a good approach. Anyway I think, this functionality (a default cleanup function at least) is really necessary for a released version. As using the library as it is at the moment could be used for some kind of DoS attack if a malicious client could simply fill up the whole file system of a device. |
Love the thread. The general approach of the library with managing uploaded content is that the content is "owned" by the library until it leaves and it is accessed by the structures of the library itself. Along those lines, if you use an "in memory" approach, the content uploaded goes away as soon as the The fact that we store the file on disk is just an extension of memory to me; the library still "owns" that content until the user doesn't take explicit ownership of it. As such, having a deletion by default is a good strategy (akin to freeing a dangling pointer and preventing a leakage). To keep the library usable this is what I'd do:
On 2. I am working on a more general system of overrides at resource level; so keeping it at server level for now is fine - it will all go down one layer when I am finally close on this. |
Okay, what I take from this for now: we can start with the cleanup (1.) already, because it's more or less independent from the actual solution we come up with for hook or override (2.) or whatever it will be called. |
If the user needs those files, she should copy/move theme in the request handler. References: etr#264 Signed-off-by: Alexander Dahl <[email protected]>
For that minimal "radical" approach, see #266. |
* http_request: Remove uploaded files in destructor If the user needs those files, she should copy/move theme in the request handler. References: #264 Signed-off-by: Alexander Dahl <[email protected]> * test: integ: file_upload: Remove redundant file delete Not needed anymore. http_request destructor does cleanup now. * test: integ: file_upload: Check uploaded files are deleted * test: integ: file_upload: Stop webserver earlier The webserver runs in a different thread than the test, therefor deleting the uploaded files and checking if they are deleted might lead into a race condition, and thus into tests failing sometimes, but not always. Moving the webserver stop and destructor behind the curl action should be safe, every information needed for the test is copied to or in the resource handler.
Is your feature request related to a problem? Please describe.
Since 3fd99c8 it's possible to let libhttpserver store files uploaded with forms (mime-type multipart/form-data) to disk, see the pull request #257 for reference. This whole new feature request is about cases where file upload target is set to
FILE_UPLOAD_DISK_ONLY
orFILE_UPLOAD_MEMORY_AND_DISK
.In certain circumstances it is necessary for an application to not store those files permanently, but remove them from disk again, after processing. Doing that in a render method associated with a registered resource is possible. However the file is always stored to disk, regardless if that render method is called at all. Multiple code paths in
webserver::finalize_answer()
, mostly error cases, are possible leading tomr->dhrs
assigned by something else than the render callback supplied by the application. Simplest case is sending a (possibly forged) request to a not registered resource (HTTP status 404), but others are possible, too. That would mean the cleanup in your render method is not executed.The actual target we use libhttpserver on is an embedded device with usable RAM smaller than files uploaded, so we have to store those files on disk. Those files can and should be removed again on every request, otherwise the filesystem would fill up quickly. This is especially true for manipulated requests with bad intent. A full filesystem can be considered as a denial of service situation.
Describe why the feature or enhancement you are proposing fits the library.
I propose some extension to libhttpserver for cleaning up all uploaded files of a request at the end of request processing. Thinking of the possibility of optionally hooking up a callback function provided by the application, where user can decide how to cleanup, and call that in an appropriate place.
Describe the solution you'd like
Extend the
create_webserver
options by a function to set a custom cleanup function. That function would not be called if not set by the application. However if set it could be executed next toMHD_destroy_response(raw_response);
at the end ofwebserver::finalize_answer()
?Describe alternatives you've considered
With the current API of libhttpserver always cleaning up would only be possible by providing custom methods for all of these:
And you had to put that same cleanup code into every render method in your application, no matter if it is supposed to handle file uploads at all.
This would lead to lots of duplicated code with high potential to miss some path.
Additional context
Same problem applies to the draft #246 not followed (for good reasons) in favor of #257.
The text was updated successfully, but these errors were encountered: