Current Situation
We are able to move sequences between projects in ENA.
However, a sequence's visibility only changes at the moment its parent project is released, and each project can be released only once. This rules out our planned model, in which each user has a private and a public project and we simply transfer sequences between them at the point of release.
Proposal
My proposal is that on any day when we want to release sequences, we:
create a new "release project" for that day's sequences (potentially across all users)
move sequences from their user's private projects to the release project
release the release project, triggering release of the sequences
some time later move the sequences to the users' public project
This could result in us creating 365 projects per year, but that doesn't actually seem too bad. And initially it wouldn't be that many.
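The four steps above can be sketched as follows. This is only a minimal illustration, not the real ENA API: `FakeEna` is a hypothetical in-memory stand-in that encodes just the two rules described in this issue (visibility changes only when a project is released, and a project can be released only once), and the `release-YYYY-MM-DD` naming scheme is an assumption.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Project:
    name: str
    released: bool = False
    sequences: set[str] = field(default_factory=set)


class FakeEna:
    """Hypothetical in-memory stand-in for ENA, NOT the real API."""

    def __init__(self) -> None:
        self.projects: dict[str, Project] = {}
        self.public: set[str] = set()

    def create_project(self, name: str) -> Project:
        self.projects[name] = Project(name)
        return self.projects[name]

    def move(self, seq: str, src: str, dst: str) -> None:
        # Moving never changes visibility, even if dst is already released.
        self.projects[src].sequences.remove(seq)
        self.projects[dst].sequences.add(seq)

    def release(self, name: str) -> None:
        project = self.projects[name]
        if project.released:
            raise ValueError(f"{name} was already released")
        project.released = True
        # Visibility changes only here, for sequences present at release time.
        self.public |= project.sequences


def release_day(ena: FakeEna, day: date, due: dict[str, tuple[str, str]]) -> str:
    """Steps 1-3. `due` maps sequence id -> (private project, public project)."""
    release_name = f"release-{day.isoformat()}"  # assumed naming scheme
    ena.create_project(release_name)
    for seq, (private, _public) in due.items():
        ena.move(seq, private, release_name)
    ena.release(release_name)
    return release_name


def settle(ena: FakeEna, release_name: str, due: dict[str, tuple[str, str]]) -> None:
    """Step 4, run some time later: move sequences to the users' public projects."""
    for seq, (_private, public) in due.items():
        ena.move(seq, release_name, public)
```

With one release project per day, the project name is derived from the date alone, so a second run on the same day would need to reuse the existing project rather than create a new one; the sketch does not handle that case.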
I think this sounds like a potential solution - great idea @theosanderson!
However, as this will be a large code change, I think we should run a test before implementing it. Sadly, it will have to be on ENA production (with our non-broker account), as we can't test well on dev, so we will have to ask them to clean up afterwards.
We should create two private projects and one public project.
First, submit the sequence (assembly) to the first private project and wait for it to be accessioned.
Move the sequence from the first to the second private project, and then release the second project (i.e. make it public).
Make sure the sequence has in fact gone public.
Move the sequence to the initial public project -> check that the correct bioproject is linked on sequence view pages and on NCBI Virus.
As I was thinking about this anyway, these are the steps required to implement the feature:
Update: #2893 will need to also be resolved for this to work.
[minutes] Modify cronjob to also notify us about new private sequences to submit
[1 day - hours if we choose to stay with Slack notifications] Modify cronjob to create PRs on GitHub with private and public sequences (maybe on different pages)
[minutes] Add new public/private status and release date columns to the sequence and project tables in the DB
[hours] Change the read-in process from GitHub to distinguish between public and private
[hours] Upload private sequences to private ENA repo (code reusable until upload to Loculus, just change project creation to private)
[1+ days - tbd] Do not add accession to Loculus public page: either send the submission group an email or make it visible only on the sequence details page for group members (if a private sequence is publicly disclosed, ENA has a process for making it public) - specifics should be discussed before starting; this could take much longer
[hours] If a sequence's release date is changed, modify the table (I think the cronjob will have to be updated to also send these updates, and the read-in process will also have to be modified)
[1 day] On the release date, move sequences to a different private folder and make that folder public
[1 day] Once public (check the API to see if there is a good endpoint for checking this), move to the group's public project.
[hours] Update the sequence's status to open, change the bioproject to the new public project and make this data visible on Loculus (with the upload external metadata endpoint)
Other thoughts - restricted-use sequences will also be visible in a GitHub repo - do we need to add a banner or something similar to make sure they are not misused?
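To make the status tracking in the steps above concrete, here is a minimal sketch of how a sequence could move through the release stages and how the sequences due on a given day could be selected. All names here (`ReleaseStatus`, `SequenceRow`, `due_for_release`, `advance`, and the intermediate `RELEASING` and `PUBLIC` states) are hypothetical and do not reflect the existing schema; only the final "open" status is taken from the list above.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class ReleaseStatus(Enum):
    PRIVATE = "private"      # uploaded to a private ENA project
    RELEASING = "releasing"  # moved to the release project, release triggered
    PUBLIC = "public"        # confirmed public, not yet in the group's public project
    OPEN = "open"            # in the group's public project and visible on Loculus


# Allowed forward transitions; OPEN is terminal.
NEXT_STATUS = {
    ReleaseStatus.PRIVATE: ReleaseStatus.RELEASING,
    ReleaseStatus.RELEASING: ReleaseStatus.PUBLIC,
    ReleaseStatus.PUBLIC: ReleaseStatus.OPEN,
}


@dataclass
class SequenceRow:
    accession: str
    status: ReleaseStatus
    release_date: Optional[date]  # None if no release date is set


def due_for_release(rows: list[SequenceRow], today: date) -> list[SequenceRow]:
    """Return private sequences whose release date has been reached."""
    return [
        row
        for row in rows
        if row.status is ReleaseStatus.PRIVATE
        and row.release_date is not None
        and row.release_date <= today
    ]


def advance(row: SequenceRow) -> None:
    """Move a row to the next stage; raise if it is already open."""
    if row.status not in NEXT_STATUS:
        raise ValueError(f"{row.accession} is already {row.status.value}")
    row.status = NEXT_STATUS[row.status]
```

Because `due_for_release` compares against the stored date on every run, a changed release date takes effect as soon as the table is updated, as long as the sequence is still private.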
Totally agreed that we need to do the tests. I am quite uncertain whether the last move to the correct project will propagate to NCBI Virus. Even if it doesn't, this may still be a useful stopgap compared to the previous behaviour, perhaps with us contacting the ENA helpdesk every month or so to trigger a sync.