Continue crashed analysis from tree inference step #61
To be clear, you ran PASTA using the alignment from the previous stage as
input (-i)? And it still tried to do an alignment?
Also, are you planning to do one iteration or more? If you want to have
only one iteration, there is no reason to run FastTree inside PASTA. You
can just run it outside.
…On Mon, Aug 9, 2021 at 12:36 PM Diego Alonso Marquez Palacios < ***@***.***> wrote:
Hello Siavash. Hoping you are well.
I'm reaching the final steps to build a tree from the Silva 13.8 dataset.
Unfortunately, it crashed during the tree inference step due to the
outdated fasttreeMP (which I did not expect to be used again in the
processing when preparing a new server)
The only iteration for the realignment step took a bit more than 3 weeks.
I was checking the wiki for a way to continue with the alignment produced
with this step, but apparently the --aligned option still goes over a
realignment step.
I did some time estimations with subsets of the Silva database and it
should take about 3 more days before finishing the tree, only if we manage
to skip the realignment, otherwise it would be 3 more weeks again.
I was wondering if I'm missing an option from the wiki to continue from
this substep of the iteration. Otherwise, I can try to modify the code to
provide the last alignment to the first iteration. If that's the case, I
will need to kindly ask you to refer me to the involved files in this
change or any development documentation to aid in solving this situation.
Thanks beforehand for your help.
Siavash Mirarab
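Last month I started pasta with -i pastajob_temp_iteration_initialsearch_seq_alignment.txt --aligned
The last file produced in the folder today was pastajob_temp_iteration_0_seq_alignment.txt
So is it possible to just obtain the final tree with fasttreeMP from ...iteration_0_seq_alignment.txt? From the subset test logs, I'm guessing the fasttreeMP command recorded there would be useful for this last step (assuming input.fasta == ...iteration_0_seq_alignment.txt)?
Edit: Yes, it did perform a realignment step (one iteration).
Thank you!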
Hi Diego,
Yes, you can just obtain the final tree with fasttreeMP from
...iteration_0_seq_alignment.txt.
Note that default PASTA does 3 iterations, but most of the advantage comes
from the first iteration. The result of running fasttree on this alignment
will give you the result of the first iteration. I think given the size of
your dataset, one iteration is reasonable and sufficient. You can do one
more iteration if you want, but I don't think it is necessary.
Three more caveats.
1. ...iteration_0_seq_alignment.txt is already masked to remove super gappy
sites. The unmasked file is also available
(...temp_iteration_0_seq_unmasked_alignment.gz) but you don't want to give
that to FastTree. Also, it is in a format that needs translation (more on
that below). However, you may want to see how long
...iteration_0_seq_alignment.txt is and you may decide to mask even a bit
more; the default is to mask a site if it is a gap in >99.9% of species.
2. the ...iteration_0_seq_alignment.txt file uses PASTA's internal names
for species. These names can be translated back to the original names using
a simple text file (._temp_name_translation.txt) and a command that I will
send you.
3. If you are going to use the FastTree tree as your final tree, you may
want to eliminate the starting tree
(/home/ec2-user/.pasta/pastajob/tempBaXNdl/step0/mincluster/tempfasttreeGOBzhJ/start.tre).
It will take a bit longer, but that should be fine.
For both 1 and 2, I have scripts that are shipped as part of PASTA. Let me
write a quick markdown file and describe these. In the meantime, you can
start your FastTree run. I hope to get to this in a day or so.
Thanks
Siavash
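As an illustration of the masking rule in caveat 1, here is a minimal Python sketch. It is not the script shipped with PASTA. It drops every alignment column whose gap fraction exceeds a threshold; the default of 0.999 mirrors the ">99.9% of species" rule described above. The function name mask_gappy_sites and the in-memory dict layout are assumptions made for this example, and an alignment the size of Silva would need a streaming or array-based version instead.

```python
from typing import Dict


def mask_gappy_sites(alignment: Dict[str, str],
                     max_gap_fraction: float = 0.999) -> Dict[str, str]:
    """Return a copy of the alignment without columns that are mostly gaps.

    A column is dropped when the fraction of sequences holding '-' at that
    position is strictly greater than max_gap_fraction.
    """
    if not alignment:
        return {}
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    n = len(seqs)
    # Indices of the columns that survive masking.
    keep = [
        col for col in range(length)
        if sum(s[col] == "-" for s in seqs) / n <= max_gap_fraction
    ]
    return {name: "".join(seq[c] for c in keep)
            for name, seq in alignment.items()}


if __name__ == "__main__":
    toy = {"a": "A-C-", "b": "A-G-", "c": "AT--"}
    # A stricter threshold than the default, to show more columns removed.
    print(mask_gappy_sites(toy, max_gap_fraction=0.5))
```

Lowering max_gap_fraction corresponds to "masking even a bit more", which shortens the alignment handed to FastTree.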
…On Mon, Aug 9, 2021 at 1:10 PM Diego Alonso Marquez Palacios < ***@***.***> wrote:
Last month I started pasta with -i
pastajob_temp_iteration_initialsearch_seq_alignment.txt --aligned
The last file produced in the folder today was
pastajob_temp_iteration_0_seq_alignment.txt
So is it possible to just obtain the final tree with fasttreeMP from
...iteration_0_seq_alignment.txt ?
From the subset tests logs, I'm guessing the command below would be useful
for this last step?:
/home/ec2-user/pasta-code/pasta/bin/fasttreeMP -quiet -nt -gtr -gamma -fastest -intree /home/ec2-user/.pasta/pastajob/tempBaXNdl/step0/mincluster/tempfasttreeGOBzhJ/start.tre -log /home/ec2-user/.pasta/pastajob/tempBaXNdl/step0/mincluster/tempfasttreeGOBzhJ/log /home/ec2-user/.pasta/pastajob/tempBaXNdl/step0/mincluster/tempfasttreeGOBzhJ/input.fasta
(assuming input.fasta == ...iteration_0_seq_alignment.txt)
Thank you!
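For running this step outside PASTA, a small Python wrapper along the following lines could assemble the same call, with the starting tree made optional so that it can be left out as caveat 3 suggests. This is only a sketch: the flags are copied from the logged command above, while the helper names fasttree_command and run_fasttree and the file names in the usage example are hypothetical. FastTree prints the resulting tree to standard output, so the wrapper redirects it to a file.

```python
import subprocess
from typing import List, Optional


def fasttree_command(binary: str, alignment: str, log_path: str,
                     start_tree: Optional[str] = None) -> List[str]:
    """Build the FastTree argument list seen in the PASTA debug log.

    If start_tree is None, the -intree option is omitted entirely.
    """
    cmd = [binary, "-quiet", "-nt", "-gtr", "-gamma", "-fastest"]
    if start_tree is not None:
        cmd += ["-intree", start_tree]
    cmd += ["-log", log_path, alignment]
    return cmd


def run_fasttree(cmd: List[str], tree_path: str) -> None:
    """Run FastTree and write the tree it prints on stdout to tree_path."""
    with open(tree_path, "w") as handle:
        subprocess.run(cmd, stdout=handle, check=True)


if __name__ == "__main__":
    # Placeholder file names; substitute the real alignment and log paths.
    print(" ".join(fasttree_command("fasttreeMP", "input.fasta", "fasttree.log")))
```

Passing the argument list directly, without going through a shell, avoids quoting problems with long temporary paths.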
Siavash Mirarab
Thanks so much Siavash. I'll let you know about how fasttree goes.
Diego,
I added information about getting the unmasked alignment from the PASTA
temporary files and name mapping here:
https://github.com/smirarab/pasta/blob/master/pasta-doc/pasta-tutorial.md#step-6-using-run_seqtoolspy
and in particular
https://github.com/smirarab/pasta/blob/master/pasta-doc/pasta-tutorial.md#restart-pasta-from-the-previous-runs
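To make the name mapping idea concrete, here is a rough Python sketch of translating internal names in a FASTA file back to the original ones. It is not run_seqtools.py and does not reproduce its options. It ASSUMES the translation file stores records of three lines each (internal name, original name, blank line); please check that against your own file and the tutorial before relying on it. The function names parse_name_map and rename_fasta are made up for the example.

```python
from typing import Dict


def parse_name_map(text: str) -> Dict[str, str]:
    """Parse a translation file into {internal_name: original_name}.

    ASSUMED layout: repeated groups of three lines, namely the internal
    name, the original name, and an empty separator line.
    """
    lines = text.splitlines()
    mapping: Dict[str, str] = {}
    for i in range(0, len(lines) - 1, 3):
        internal, original = lines[i].strip(), lines[i + 1].strip()
        if internal:
            mapping[internal] = original
    return mapping


def rename_fasta(fasta_text: str, mapping: Dict[str, str]) -> str:
    """Replace FASTA header names found in mapping; leave others untouched."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            name = line[1:].strip()
            out.append(">" + mapping.get(name, name))
        else:
            out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    names = parse_name_map("S1\nEscherichia coli\n\nS2\nBacillus subtilis\n\n")
    print(rename_fasta(">S1\nAC-G\n>S2\nA--G\n", names), end="")
```

For the real files, the scripts described in the tutorial links above remain the reference.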
Siavash Mirarab
Hi Siavash. Thanks for the added steps on the tutorial. Thank you very much.
Hi Diego, There are two potential reasons.
Thanks
Hi Siavash, thanks for the response. I will let you know about this
Hello Siavash. Hoping you are well.
I'm reaching the final steps to build a tree from the Silva 13.8 dataset.
Unfortunately, it crashed during the tree inference step.
The only iteration for the realignment step took a bit more than 3 weeks. I was checking the wiki for a way to continue with the alignment produced with this step, but apparently the --aligned option still goes over a realignment step.
I did some time estimations with subsets of the Silva database and it should take about 3 more days before finishing the tree, only if we manage to skip the realignment, otherwise it would be 3 more weeks again.
I was wondering if I'm missing an option from the wiki to continue from this substep of the iteration. Otherwise, I can try to modify the code to provide the last alignment to the first iteration. If that's the case, I will need to kindly ask you to refer me to the involved files in this change or any development documentation to aid in solving this situation.
Update: I found out that the inference step consists of a call to fasttreeMP - the debug output shows the exact args to execute the binary with. I'm thinking that the final steps would involve running a modified version of treeholder.py
Thanks beforehand for your help.