#final assembly result# #57

Open
AliBasuony2022 opened this issue Jan 29, 2024 · 2 comments

Comments

@AliBasuony2022

Dear Christoph,

Where can I find the final assembly result in the MITObim output?

In the tutorial (see https://github.com/chrishah/MITObim), you mention that it is at the following path:

-bash-4.1$ less iteration8/testpool-Salpinus_mt_genome_assembly/testpool-Salpinus_mt_genome_d_results/testpool-Salpinus_mt_genome_out.unpadded.fasta

However, the log file from my MITObim run says:

Final assembly result will be written to file: /mnt/scratch/c1845371/whole_genome/mitochondrial_genome/mitobim/mitobim_375_trimmed/iteration2/testpool-375_1-it2_noIUPAC.fasta

I'm confused because these two assembly outputs have different lengths for the same sample.

Kind regards,
Ali

@chrishah
Owner

Dear Ali,
The final result is the file named in the log: *noIUPAC.fasta. Sorry for the confusion; the information on GitHub isn't up to date.

As for why you would get assemblies of different lengths: this can happen if you use different seed sequences or change other settings. Also note that the final assembly isn't circularised, i.e. the start and the end of your assembly may well overlap, and the exact extent of the overlap will depend on the starting point of the assembly. If you want to check whether your result is likely circular, there is a script that comes with the MITObim repository:

circules.py -f your_final_assembly.fasta 

You'll get a report that tells you whether it has found patterns that would indicate circularity and suggests clipping points, for example -c 0,15893. If you then run it again:

circules.py -f your_final_assembly.fasta -c 0,15893

this will remove the overhangs, and the resulting assembly is expected to be circular. After this step, the results from different seeds should be the same.
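
To illustrate the idea of a start/end overlap in a linear assembly of a circular genome, here is a minimal Python sketch. It is not the algorithm used by circules.py; the function names `find_terminal_overlap` and `clip_overhang`, the `min_len` parameter, and the toy sequence are all hypothetical and chosen for illustration only. It looks for the longest suffix of the sequence that is identical to its prefix (exact matches only, so sequencing errors in the overlap would defeat it) and removes one copy.

```python
def find_terminal_overlap(seq, min_len=10):
    """Return the length of the longest suffix of `seq` that exactly
    matches its prefix, or 0 if no such overlap of at least `min_len`
    exists. Overlaps longer than half the sequence are not considered."""
    seq = seq.upper()
    n = len(seq)
    max_len = n // 2
    # Try the longest candidate first so the first hit is the longest overlap.
    for k in range(max_len, min_len - 1, -1):
        if seq[:k] == seq[n - k:]:
            return k
    return 0


def clip_overhang(seq, min_len=10):
    """Remove the redundant copy of the overlap from the end of `seq`."""
    k = find_terminal_overlap(seq, min_len)
    return seq[:len(seq) - k] if k else seq


# Toy example: a 24 bp "genome" whose first 12 bases were assembled twice.
genome = "ACGTTGCAAGGCTTAACCGGATCG"
linear_assembly = genome + genome[:12]

overlap = find_terminal_overlap(linear_assembly, min_len=5)
clipped = clip_overhang(linear_assembly, min_len=5)
```

In this toy case the clipped sequence has the length of the original genome regardless of how long the duplicated overhang was, which is why assemblies started from different seeds are expected to agree in length only after clipping.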

Best wishes,
Christoph

@AliBasuony2022
Author

Dear Christoph,

Thanks so much for your quick response.

As I mentioned in my previous message, the difference in length was for the same sample in the same run (I mean it was the same script and the same settings). Anyway, it's clear to me now.

I'm studying the performance of de novo and reference-based approaches for extracting the mitogenome from whole-genome re-sequencing data.
As you know, most of the problems with de novo assemblies arise from the tandem repeat regions in the D-loop. This could explain the differences in recovered mtDNA length between de novo and reference-based approaches. Is it possible to get a BAM file for the reconstructed mitogenome from MITObim, please? I want to submit it as a supplementary file, just to convince the reviewers that the tandem repeats are the problem and nothing else.

Below is some helpful information for your reference.
For de novo assembly: I got 16521 bp from NOVOPlasty and 17842 bp from MITObim after clipping.
For reference mapping: the reference genome is 16813 bp long (of course this is biased).
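
For anyone wanting to reproduce this kind of length comparison, the following is a minimal sketch of how sequence lengths could be read from FASTA output using only the Python standard library. The function name `fasta_lengths` and the record names in the example are hypothetical; the sketch is not part of MITObim or NOVOPlasty and simply counts every character on the sequence lines of each record.

```python
import io


def fasta_lengths(handle):
    """Return a dict mapping each FASTA record name to its sequence length.

    `handle` is any iterable of text lines, e.g. an open file."""
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            parts = line[1:].split()
            name = parts[0] if parts else "unnamed"
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


# In-memory example; with real data, pass an open file handle instead.
example = io.StringIO(">contig_a some description\nACGTAC\nGT\n>contig_b\nACG\n")
lengths = fasta_lengths(example)
```

With real assemblies, the same function could be called on each output file in turn, for example `with open(path) as fh: fasta_lengths(fh)`, where `path` points to the FASTA file of interest.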

Best regards,
Ali
