addition of a workflow for purging duplicates in a single haplotype
Delphine-L committed Feb 7, 2024
1 parent aa85d71 commit 62f6d5d
Showing 4 changed files with 3,250 additions and 0 deletions.
@@ -0,0 +1,7 @@
# Changelog

## [0.1] - 2024-02-07

### Added

- Workflow for purging duplications from the contigs of a single haplotype
@@ -0,0 +1,76 @@
- doc: Test outline for Purging-duplicates-one-haplotype-VGP6b
  job:
    Genomescope model parameters:
      class: File
      location: https://www.dropbox.com/scl/fi/6a6b9xc6jih3rtfrfy7ch/Genomescope-model-parameters.tabular?rlkey=2d86kp0wsdlzk5bxzvs36ry4a&dl=1
      filetype: tabular
    Assembly to purge:
      class: File
      location: https://www.dropbox.com/scl/fi/7hbeju59zkbc5d28lpp19/hap1.fasta?rlkey=xii6j4iq0y48f5qulu0nti8uu&dl=1
      filetype: fasta
    Meryl Database:
      class: File
      location: https://www.dropbox.com/scl/fi/gpkhbn9onssi5z776kwzy/Meryl-Database.meryldb?rlkey=2ozjdiego6ew9nwaxjgmw3a79&dl=1
      filetype: meryldb
    Estimated genome size - Parameter File:
      class: File
      location: https://www.dropbox.com/scl/fi/s1jieqs34p5ol3xjcez39/Estimated-genome-size-Parameter-File.expression.json?rlkey=bamnfmxo7mqk7y3gq59v5rrzb&dl=1
      filetype: expression.json
    Assembly to leave alone (need this for merqury):
      class: File
      location: https://www.dropbox.com/scl/fi/xz7v21ora7n86iisopr8i/hap2.fasta?rlkey=91vdm0bes0a77kaz17rwv759y&dl=1
      filetype: fasta
    Pacbio Reads Collection - Trimmed:
      class: Collection
      collection_type: list
      elements:
      - class: File
        identifier: yeast_reads_sub1.fastq.gz
        location: https://www.dropbox.com/scl/fi/1xmp7kajhdp5cnm4nc8qj/Trimmed_yeast_reads_sub1.fastq.gz?rlkey=q7e54x26fm6fp732kp2lz6bxf&dl=1
    Name of un-altered assembly: Hap2
    Name of purged assembly: Hap1
  outputs:
    Removed haplotigs:
      asserts:
        has_n_lines:
          n: 14
    Purged assembly:
      asserts:
        has_n_lines:
          n: 158
    Purged assembly (GFA):
      asserts:
        has_n_lines:
          n: 160
    Assembly statistics:
      asserts:
        has_text:
          text: "Average contig length 23,851.86 24,476.28"
    Cutoffs:
      asserts:
        has_text:
          text: "1 15 15 16 16 48"
    'Busco on Purged Primary assembly: short summary':
      asserts:
        has_text:
          text: "C:1.1%[S:1.1%,D:0.0%],F:0.5%,M:98.4%,n:3354"
    Purged assembly statistics:
      asserts:
        has_text:
          text: "# scaffolds 79"
    Nx Plot:
      asserts:
        has_size:
          value: 57000
          delta: 5000
    Size Plot:
      asserts:
        has_size:
          value: 84000
          delta: 5000
    'Merqury on Phased assemblies: stats':
      element_tests:
        output_merqury.completeness:
          asserts:
            has_text:
              text: "both all 1244957 1300032 95.7636"
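
Reviewer note: the assertions in the test definition above appear to follow the Galaxy workflow test format. The Python sketch below shows roughly what the three assertion kinds used here (`has_n_lines`, `has_text`, `has_size`) check. The functions are simplified stand-ins written for illustration under that assumption; they are not Galaxy's or Planemo's actual implementation, and the sample inputs are made up.

```python
# Simplified, hypothetical stand-ins for the three assertion kinds
# used in the test file. Not the real Galaxy/Planemo code.

def has_n_lines(text: str, n: int, delta: int = 0) -> bool:
    """True when the line count of `text` is within `delta` of `n`."""
    return abs(len(text.splitlines()) - n) <= delta


def has_text(text: str, needle: str) -> bool:
    """True when `needle` occurs anywhere in `text`."""
    return needle in text


def has_size(data: bytes, value: int, delta: int = 0) -> bool:
    """True when the byte length of `data` is within `delta` of `value`."""
    return abs(len(data) - value) <= delta


if __name__ == "__main__":
    # Made-up sample data, only to exercise the checks.
    fake_plot = b"\x00" * 60_000
    print(has_size(fake_plot, value=57_000, delta=5_000))
    print(has_n_lines("contig_1\nACGT\n", n=2))
    print(has_text("# scaffolds 79\n# contigs 79\n", "# scaffolds 79"))
```

The `delta` on the two plot outputs gives a tolerance window (52,000 to 62,000 bytes for the Nx plot, 79,000 to 89,000 bytes for the size plot), which is a common way to test image outputs whose exact byte size can vary between runs.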