I believe that fastp would improve the current align-DNA pipeline and workflow.
fastp is an all-in-one FASTQ preprocessor. It performs read filtering, base correction, quality control, and adapter trimming. It also produces a variety of QC plots that can be used to make decisions around sample inclusion/exclusion in further analysis.
Currently, fastp is only offered in the align-RNA pipeline, where I find it very useful because it removes the time spent running the software separately. Offering fastp as a configurable option in align-DNA would create feature parity between the pipelines and also save users significant time and storage.
Today, I run fastp before each align-DNA run and store a separate set of FASTQ files on top of the ones already registered. Multiplied across many projects, this extra storage becomes non-negligible. If adapter trimming were done as part of the pipeline and the trimmed FASTQs were deleted each time a run concluded, the lab would save that storage space.
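For illustration, the manual pre-trimming step I run today looks roughly like the sketch below. The helper name `build_fastp_command`, the sample name, and all file paths are hypothetical; the flags are standard fastp options, but the exact set a pipeline integration would expose is an open question.

```python
from pathlib import Path


def build_fastp_command(sample, r1, r2, out_dir, threads=4):
    """Build a fastp command line for one paired-end sample.

    Hypothetical helper: names and output layout are assumptions,
    not part of align-DNA or align-RNA.
    """
    out = Path(out_dir)
    return [
        "fastp",
        "--in1", str(r1),
        "--in2", str(r2),
        # Trimmed reads: the extra copy of the data that currently has to be stored.
        "--out1", str(out / f"{sample}_R1.trimmed.fastq.gz"),
        "--out2", str(out / f"{sample}_R2.trimmed.fastq.gz"),
        # Detect adapters automatically for paired-end data.
        "--detect_adapter_for_pe",
        # QC reports in JSON and HTML form.
        "--json", str(out / f"{sample}.fastp.json"),
        "--html", str(out / f"{sample}.fastp.html"),
        "--thread", str(threads),
    ]


cmd = build_fastp_command(
    "sampleA", "sampleA_R1.fastq.gz", "sampleA_R2.fastq.gz", "trimmed"
)
print(" ".join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)`. Inside the pipeline, the two trimmed FASTQ outputs would be intermediate files that are removed once alignment finishes, while the JSON and HTML reports are kept.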
For QC, yes we will be developing sample- and cohort-level QC pipelines.
For hard-clipping, aligners typically soft-clip reads contaminated by adapters. Given the potential compute and storage costs, I don't think we would need this option for most of our datasets, although it would be helpful to see benchmarking results that weigh the compute costs against downstream data accuracy.