Explore the idea of a Reverse Manifest Download of an Object/Version #233

Open
1 of 3 tasks
terrywbrady opened this issue Feb 7, 2020 · 4 comments
@terrywbrady
Contributor

terrywbrady commented Feb 7, 2020

For object/version retrievals, would it be possible for Merritt to deliver a reverse manifest object containing links to individual file downloads? Once a reverse manifest has been downloaded to a user's desktop, could software be written to download and assemble the component files directly from the user's desktop?

As Merritt introduces signed URL downloads for files, the creation of a reverse manifest would eliminate the need to assemble (and duplicate) content in order to serve up an object or version download.

The reverse manifest download tool would behave like Box/Google Drive/Dropbox synchronization of a folder.

Have any other preservation systems implemented a feature like this?

  • Ask the OCFL community if this is under consideration
  • Ask APTrust if they have implemented something like this
  • Ask John K if he knows of any efforts in this area
@terrywbrady
Contributor Author

Conversation with John K:...

If I understand correctly, it may already be solved by anyone with a library that implements bag (BagIt) fetching via the fetch.txt file. Merritt would ship a skimpy "holey" bag, metaphorically filled with "holes", that the recipient would fill by running through the received fetch.txt file (the reverse manifest).
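
To make the "holey" bag idea concrete, here is a minimal sketch of reading a fetch.txt file. It assumes the BagIt layout of one entry per line (URL, length in bytes or "-" when unknown, then the file path inside the bag), separated by whitespace. The `FetchEntry` and `parse_fetch_txt` names and the example.org URLs are illustrative only, not part of Merritt or any existing library, and the sketch does not handle percent-encoded characters in paths.

```python
from typing import List, NamedTuple, Optional


class FetchEntry(NamedTuple):
    url: str
    length: Optional[int]  # None when fetch.txt gives "-" (size unknown)
    path: str              # path relative to the bag root, e.g. data/...


def parse_fetch_txt(text: str) -> List[FetchEntry]:
    """Turn the text of a fetch.txt file into a list of entries."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        # Split on the first two runs of whitespace only, so a path
        # containing spaces stays in one piece.
        url, length, path = line.split(None, 2)
        entries.append(
            FetchEntry(url, None if length == "-" else int(length), path.strip())
        )
    return entries


# Hypothetical fetch.txt content; the signed URLs are placeholders.
sample = (
    "https://example.org/signed/a.tif?sig=PLACEHOLDER 1024 data/a.tif\n"
    "https://example.org/signed/b.xml?sig=PLACEHOLDER - data/sub dir/b.xml\n"
)
for entry in parse_fetch_txt(sample):
    print(entry.path, entry.length)
```

Each parsed entry is one "hole": a file the recipient still has to download to the given path before the bag is complete.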

A post about Wellcome Collection, which describes their approach to building a storage service layer on top of Amazon S3 with BagIt.

One specific thing that caught my eye was their use of the fetch.txt to do versioning of bags:

@terrywbrady
Contributor Author

Doing a quick scan, I have not found client software to resolve fetch.txt entries.

Under what conditions could we ask an end user to use client software to resolve fetch.txt files?

  • Make this mandatory for any object/version retrieval
  • Make this mandatory for any object/version retrieval above a size threshold

What would make this useful for an end user?

  • For certain types of objects/versions, processing the fetch.txt file on the client side might be easier to throttle than one really large download
  • Recovery/retries would be easier to perform
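
As a rough illustration of the throttling and recovery points above, a client could walk the fetch.txt entries one file at a time, pause between requests, retry failed downloads, and skip files that are already present with the expected size. This is only a sketch under stated assumptions: `fill_holes` and `http_get` are hypothetical names, entries are plain `(url, length, path)` tuples, and each file is read fully into memory, which would not suit very large files.

```python
import time
import urllib.request
from pathlib import Path
from typing import Callable, Iterable, List, Optional, Tuple

# (url, expected length in bytes or None, path relative to the bag root)
Entry = Tuple[str, Optional[int], str]


def http_get(url: str) -> bytes:
    """Default downloader: fetch the whole response body."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def fill_holes(
    entries: Iterable[Entry],
    root: str,
    get: Callable[[str], bytes] = http_get,
    retries: int = 3,
    pause: float = 0.0,
) -> List[str]:
    """Download missing files into root; return the paths that were fetched."""
    base = Path(root)
    fetched = []
    for url, length, rel in entries:
        rel_path = Path(rel)
        # Refuse paths that would land outside the bag directory.
        if rel_path.is_absolute() or ".." in rel_path.parts:
            raise ValueError(f"unsafe path in fetch.txt: {rel}")
        target = base / rel_path
        # Recovery: a rerun skips files that already have the expected size.
        if target.is_file() and (length is None or target.stat().st_size == length):
            continue
        for attempt in range(1, retries + 1):
            try:
                data = get(url)
                if length is not None and len(data) != length:
                    raise OSError(f"size mismatch for {rel}")
                break
            except OSError:
                if attempt == retries:
                    raise  # give up on this run; a later rerun can resume
                time.sleep(pause)
        else:
            continue  # retries < 1: nothing was attempted for this entry
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
        fetched.append(rel)
        time.sleep(pause)  # throttle: one request at a time with a pause
    return fetched
```

Because completed files are skipped, an interrupted retrieval can be resumed by running the same fetch.txt again, which is harder to do with a single large download.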
