Custom S3 Buckets for Orgs #578

Open
Tracked by #580
ldko opened this issue Feb 7, 2023 · 2 comments · May be fixed by #2093
Assignees
Labels
back end Requires back end dev work

Comments

@ldko

ldko commented Feb 7, 2023

I have been trying out local deployments of Browsertrix Cloud with microk8s. It would be helpful to be able to configure the local storage path where WACZ/WARC files are written for crawls, so that I can put them in a location dedicated to storage rather than in the same place where the microk8s data is generally stored.

@Shrinks99 Shrinks99 added the back end Requires back end dev work label Feb 7, 2023
@Shrinks99 Shrinks99 changed the title from "Configure storage path for local deployments" to "Configurable storage path per org on back end" Feb 7, 2023
@Shrinks99
Member

Shrinks99 commented Feb 7, 2023

This is planned :) We'll use this ticket to track the ability to configure the storage location on the back end on a per-org basis with an enforceable default set by the server admin.

Requirements

  • Main storage path and backup storage path can be changed to other S3-compatible locations in the org settings
    • The backup path must always be set. If nothing is set, it will use the Webrecorder default.
    • Users can set any number of backup paths
    • When a user changes their main storage path or backup storage path, their data will move to the new bucket.
  • Org Quotas superadmin panel can enable or disable the ability to set custom bucket locations
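
As a rough illustration of the fallback rule in these requirements (not the Browsertrix implementation), the sketch below resolves an org's effective storage locations. `StorageRef`, `OrgStorageSettings`, `resolve_storage`, and the `allow_custom` flag are hypothetical names invented for this example, and the server defaults are passed in as placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class StorageRef:
    """Hypothetical reference to an S3-compatible storage location."""
    name: str
    custom: bool = False  # True if the org supplied this bucket itself


@dataclass
class OrgStorageSettings:
    """Hypothetical per-org settings; None or an empty list means 'not set'."""
    primary: Optional[StorageRef] = None
    backups: List[StorageRef] = field(default_factory=list)


def resolve_storage(
    settings: OrgStorageSettings,
    default_primary: StorageRef,
    default_backup: StorageRef,
    allow_custom: bool = True,
) -> Tuple[StorageRef, List[StorageRef]]:
    """Return (primary, backups), falling back to the server defaults.

    Assumption: when custom buckets are disabled for the org, any custom
    entries are ignored. The backup list is never empty.
    """
    def usable(ref: Optional[StorageRef]) -> bool:
        return ref is not None and (allow_custom or not ref.custom)

    primary = settings.primary if usable(settings.primary) else default_primary
    backups = [ref for ref in settings.backups if usable(ref)]
    if not backups:
        # Nothing usable is configured, so use the default backup location.
        backups = [default_backup]
    return primary, backups
```

With an org that sets only a custom primary bucket, `resolve_storage` would return that bucket plus the default backup; with `allow_custom=False` it would return the defaults for both.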

@Shrinks99 Shrinks99 mentioned this issue Feb 7, 2023
4 tasks
@Shrinks99 Shrinks99 added this to the Beta Tasks milestone Feb 8, 2023
@Shrinks99 Shrinks99 mentioned this issue Aug 3, 2023
10 tasks
@tw4l tw4l assigned tw4l and ikreymer and unassigned tw4l Aug 9, 2023
@Shrinks99 Shrinks99 modified the milestones: Beta Tasks, v1.10.0 Feb 20, 2024
@Shrinks99 Shrinks99 changed the title from "Configurable storage path per org on back end" to "Custom S3 Buckets for Orgs" Feb 21, 2024
@Shrinks99 Shrinks99 removed this from the v1.10.0 milestone Apr 30, 2024
@tw4l
Member

tw4l commented Sep 19, 2024

Tasks:

  • Ensure code to add and remove custom storages works as expected
  • Add ability to set custom storage as primary and/or replica storage locations (ensure there's always one replica location set if any are configured, use default if org custom replica storage isn't set)
  • Ensure downloads and uploads with custom storage work as expected
  • Add tests
  • Add documentation
  • Add background job to move files from existing S3 bucket to new S3 bucket and update database accordingly (by modifying prefix)
    • start the job when primary storage is changed: set the org to read-only and wait for crawls to complete before kicking this off
    • when replica location is added, don't set to read-only but instead start background jobs to replicate all files to new replica location
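
The background-job behavior described in the last task could be sketched roughly as below. This is only an illustration under stated assumptions: `FileRecord`, `rewrite_prefix`, `plan_storage_change`, and the step names are hypothetical, the example paths are made up, and the final step that lifts read-only mode is an assumption that the task list does not state.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileRecord:
    """Hypothetical database record for one stored crawl file."""
    path: str  # full storage path, beginning with the bucket prefix


def rewrite_prefix(record: FileRecord, old_prefix: str, new_prefix: str) -> FileRecord:
    """Return a copy of the record that points at the new bucket prefix."""
    if not record.path.startswith(old_prefix):
        raise ValueError(f"{record.path!r} is not under {old_prefix!r}")
    return FileRecord(path=new_prefix + record.path[len(old_prefix):])


def plan_storage_change(kind: str, running_crawls: int) -> List[str]:
    """List the steps for a storage change, in order.

    kind is "primary" (files are moved) or "replica" (files are copied).
    """
    if kind == "primary":
        steps = ["set_org_read_only"]
        if running_crawls > 0:
            # Let in-progress crawls finish before any files are moved.
            steps.append("wait_for_crawls")
        steps += ["move_files", "update_db_prefixes"]
        # Assumption: read-only mode is lifted once the move is done.
        steps.append("unset_org_read_only")
        return steps
    if kind == "replica":
        # Adding a replica does not make the org read-only.
        return ["replicate_files"]
    raise ValueError(f"unknown storage change: {kind!r}")
```

Keeping the prefix rewrite as a pure function makes the database update easy to test separately from the actual S3 copy.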

In the initial pass, adding/removing custom storages and changing primary or replica storage on an org will be done through superadmin-only API endpoints. In a future iteration, we could add an admin UI for this, similar to the org quota and proxy settings.

Projects
Status: In Review
4 participants