-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Design question — multiple storage backends or locations #255
Comments
A couple of comments about some of the design choices:
I think the issue you point to is that it's hard today in Invenio to use multiple Invenio |
As I understand it, the current design/implementation has the following:
FILES_REST_STORAGE_FACTORY
, the provided implementation beingpyfs_storage_factory
. As far as I can tell, this implementation isn't particularly/actually PyFS-specific.FILES_REST_DEFAULT_STORAGE_CLASS
)Bucket.default_storage_class
,FileInstance.storage_class
(both adb.Column(db.String(1))
, which implies it's never been used? see source)FileInstance.uri
. The location is not used after that, and so file URIs for filesystem-stored files are absolute paths on disk.invenio_files_rest.tasks:merge_multipartobject
callsmp.merge_parts()
, which has a**kwargs
that gets passed all the way through toFileInstance.storage()
, which defaults tostorage_class
being the one specfied byFILES_REST_DEFAULT_STORAGE_CLASS
. In this scenario, thestorage_class
on the FileInstance gets ignored.This seems a bit complicated, and doesn't yet necessarily meet its objectives of having a tidy interface and support for multiple storage backends.
What I would have expected:
Location
defines a storage class and a base path, e.g. (abstractly)(pyfs, /some/path)
,(s3, /bucket-name/some-path)
or(s3bucket, /some/path)
Bucket
provides for a default location for new files, defaulting to the global default locationFileInstance
has aLocation
and a path. The location defaults to the bucket at creation time. The base path for the bucket and the path for the file instance are combined and used by the storage backend defined on the locationFileInstance.storage()
takes no**kwargs
, and creates the storage class instance based on the above. Maybe it's a (cached?) property. Its behaviour is contained in a function that replaces the storage class factory and can be overridden, but generally an implementer would leave this alone unless they want to do something special.This GitHub issue came about because I was trying to work out how one would support hooking up to multiple S3 buckets, and my assumption was that each would be a location for each bucket, but it wasn't immediately clear how to integrate this into the existing framework.
Could we improve on how this all works before we have a public release?
The text was updated successfully, but these errors were encountered: