Galaxy - RNA Seq workflow misc. issues #592

Open
CaseyRichards92 opened this issue Jun 24, 2020 · 9 comments

@CaseyRichards92

Add issues or preferences for RNA Seq workflow https://hardwoodgenomics.org/content/whole-thing-617

@CaseyRichards92
Author

CaseyRichards92 commented Jun 24, 2020

Step 2

  • Change the sentence "Only files of the following types are listed: fastqsanger, fastqsanger.gz, fastqillumina, fastqillumina.gz, fastqsolexa, fastqsolexa.gz." to "Only .fastq and .fastq.gz files are accepted, from the following platforms: Illumina, Sanger, Solexa."
  • Change "Paired files" to "Paired-end files"
  • The form doesn't seem to accept .fastq or .fastq.gz files
  • Reword the description under Data Collection: "You may select from previously uploaded files, or upload new files."
  • Todo: 64 GB is a lot of data to allow for each user.

@mestato

mestato commented Jun 26, 2020

Alteration from above:

  • Change the sentence "Only files of the following types are listed: fastqsanger, fastqsanger.gz, fastqillumina, fastqillumina.gz, fastqsolexa, fastqsolexa.gz." to "Only .fastq and .fastq.gz files are accepted, from the Illumina platform."

@CaseyRichards92
Author

featureCounts results from the test run (image)

@mestato

mestato commented Aug 14, 2020

  • StringTie step: add a note telling users that they can supply a GTF file by renaming it with a .gff extension (yes, this is a hack, but it is fast and will work)
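
For anyone preparing files outside the form, the rename can be sketched in Python as below. This is only an illustration of the hack described above; the function name is hypothetical, and the file contents stay GTF (only the extension changes).

```python
import shutil
from pathlib import Path


def copy_gtf_as_gff(gtf_path: str) -> Path:
    """Copy a GTF file next to itself with a .gff extension.

    The contents are left untouched; only the file name changes, which is
    what the upload form looks at. A copy is made so the original is kept.
    """
    src = Path(gtf_path)
    dest = src.with_suffix(".gff")  # annotation.gtf -> annotation.gff
    shutil.copyfile(src, dest)
    return dest
```

Calling `copy_gtf_as_gff("annotation.gtf")` would leave `annotation.gtf` in place and create `annotation.gff` beside it.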

@florence-77

User data quota lowered to 20 GB. Is this large enough to be usable, but small enough to prevent the site from being overloaded?

@mestato

mestato commented Aug 21, 2020

Update tool descriptions:

  • StringTie merge (version 2.1.1): Merge transcripts across samples into a nonredundant set
  • featureCounts (version 1.4.6.p5): Count reads per transcript

Also:

  • Grey out presets using CSS
  • Indent lines under the bolded headings (also a CSS thing)

@Ferrisx4

Workflow Logic

Option 1 (Preferable, but not guaranteed to work)

  • Allow Tripal Galaxy to read from Galaxy when tools specify that they want a Genome from the site (Data managers)
    • This will allow us to take advantage of Galaxy's built-in genomes and their respective index files. All future workflows should work this way. We already have indexes for many types; all we need to do is add them to the .loc files in Galaxy. See the diagram on this page for an explanation.
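
A rough sketch of what adding an index to a .loc file could look like. It assumes the common four-column, tab-separated layout used by Galaxy index .loc files (unique build id, dbkey, display name, index base path); the actual columns for a given tool should be checked against that tool's .loc.sample file. The function name and the example values are hypothetical.

```python
def loc_entry(build_id: str, dbkey: str, display_name: str, index_base_path: str) -> str:
    """Format one tab-separated .loc line.

    Assumed column order (check the tool's .loc.sample before relying on it):
    unique build id, dbkey, display name, index base path.
    """
    fields = [build_id, dbkey, display_name, index_base_path]
    for field in fields:
        # A tab or newline inside a field would shift or split the columns.
        if "\t" in field or "\n" in field:
            raise ValueError(f"field contains a tab or newline: {field!r}")
    return "\t".join(fields)


# Hypothetical genome and path, for illustration only.
line = loc_entry(
    "example_genome_v1",
    "example_genome_v1",
    "Example genome v1",
    "/path/to/indexes/example_genome_v1",
)
```

The resulting line would be appended to the relevant .loc file, after which Galaxy generally needs to reload its data tables before the genome shows up in the tool form.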

OR

Option 2 (Should work, more time needed to re-wrap hisat and to wrap hisat-build)

  • Allow the HISAT2 tool to accept indexed reference genomes from history
  • Wrap hisat-build as a tool, insert it into Workflow before HISAT
    • This is to avoid having HISAT re-index the reference genome for every pair of input files (bad)
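
The intent of Option 2, index once and then align every pair against that index, can be sketched as a command plan. This is not the Galaxy wrapper itself, only an illustration: `build_commands` and the output file naming are made up here, while `hisat2-build <fasta> <index_base>` and `hisat2 -x/-1/-2/-S` are the standard command-line forms.

```python
from typing import List, Tuple


def build_commands(
    genome_fasta: str,
    index_base: str,
    pairs: List[Tuple[str, str]],
) -> List[List[str]]:
    """Return the commands to run: one indexing step, then one alignment per pair."""
    # Index the reference exactly once, up front.
    commands = [["hisat2-build", genome_fasta, index_base]]
    for i, (reads_1, reads_2) in enumerate(pairs, start=1):
        # Every pair reuses the same index via -x instead of re-indexing.
        commands.append([
            "hisat2",
            "-x", index_base,
            "-1", reads_1,
            "-2", reads_2,
            "-S", f"sample_{i}.sam",  # hypothetical output name
        ])
    return commands
```

With N pairs of input files this yields N + 1 commands, only one of which is the expensive indexing step. Each list could be passed to `subprocess.run` on a machine where HISAT2 is installed.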

@Ferrisx4

Modified Galaxy webform logic to accept certain file aliases but also a compressed version of any file: tripal/tripal_galaxy@d77d14b

This is necessary because Drupal seems to be unable to filter correctly on multipart file extensions.
Ideally we can trust our users to be able to read the form and supply the correct type of file.
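
To illustrate the logic (this is a Python sketch, not the code in the commit above, which lives in the Drupal module): treat a trailing compression suffix as optional and check the extension underneath it. The allowed extensions and the `is_accepted` name are assumptions for the example.

```python
from pathlib import PurePosixPath

# Hypothetical allow-lists; the real form configuration may differ.
ALLOWED = {"fastq", "fq"}
COMPRESSED = {"gz"}


def is_accepted(filename: str) -> bool:
    """Accept an allowed extension, optionally followed by a compression suffix."""
    suffixes = [s.lstrip(".").lower() for s in PurePosixPath(filename).suffixes]
    # Drop one trailing compression suffix so "x.fastq.gz" is judged as "x.fastq".
    if suffixes and suffixes[-1] in COMPRESSED:
        suffixes = suffixes[:-1]
    return bool(suffixes) and suffixes[-1] in ALLOWED
```

Under these assumptions `sample_R1.fastq.gz` and `sample.fastq` pass, while `sample.gz` and `sample.bam.gz` do not, which is the multipart case a single-extension filter cannot express.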


7 participants