SCAPP running for a long time and then, segfault #19
Hi, thanks for your interest in using SCAPP. Run time varies with the sample and can be very long for large, complex samples, because the core of the algorithm scales as O(n^3) for n nodes in a component of the assembly graph (in practice it is not that extreme). It is hard to know how complex your graph is (number of nodes in the largest component, node degrees, lengths of potential paths, etc.), but the file size and number of contigs should give a rough estimate. For reference, we have run SCAPP on files larger than the one you report here in less than a day (16 threads). It is strange that it segfaulted, that it mostly uses 1 thread, and that it takes so long.
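One rough way to gauge the graph complexity mentioned above is to count the nodes in each connected component directly from the FASTG headers. The sketch below is not part of SCAPP. It assumes SPAdes-style headers of the form `>EDGE_<id>_length_<len>_cov_<cov>[']:<neighbor>,<neighbor>;`, and it treats an edge and its reverse complement (marked with a trailing `'`) as a single node. The function names are made up for illustration.

```python
from collections import Counter


def parse_fastg_header(line):
    """Split a SPAdes-style FASTG header into (node, [neighbors]).

    Assumed layout: '>NODE:NEIGHBOR1,NEIGHBOR2;' (the neighbor list is optional).
    """
    body = line.strip().lstrip(">").rstrip(";")
    node, _, nbrs = body.partition(":")
    return node, [n for n in nbrs.split(",") if n]


def component_sizes(header_lines):
    """Return connected-component sizes (largest first) using union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for line in header_lines:
        node, nbrs = parse_fastg_header(line)
        # Strip the reverse-complement mark so both strands map to one node.
        root = find(node.rstrip("'"))
        for n in nbrs:
            other = find(n.rstrip("'"))
            if other != root:
                parent[other] = root

    counts = Counter(find(x) for x in list(parent))
    return sorted(counts.values(), reverse=True)
```

To use it on a real file, pass only the header lines, for example `component_sizes(l for l in open("assembly_graph.fastg") if l.startswith(">"))`, where the file name is a placeholder. The first element of the result is the size of the largest component, which is the n that the O(n^3) bound refers to.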
I can look into debugging the problem with this specific sample. You could try removing short contigs from the assembly graph before running SCAPP, but this will likely degrade the performance, and SCAPP should be able to run on a graph of this size. If you have a really huge assembly graph (it doesn't sound like it from your description), you could also try dividing the reads, creating a few smaller assemblies, and running SCAPP on each of them.
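For the short-contig removal suggested above, a minimal sketch of a FASTG length filter follows. It is not a SCAPP or SPAdes utility. It assumes SPAdes-style headers (`>EDGE_<id>_length_<len>_cov_<cov>[']:<neighbors>;`), reads each edge's length from its name, and also drops references to removed edges from the neighbor lists of the edges that remain. The function names and the 1000 bp default are illustrative assumptions.

```python
import re

# Assumption: the edge length is encoded in the name as '_length_<digits>_'.
LENGTH_RE = re.compile(r"_length_(\d+)_")


def edge_length(name):
    """Extract the length encoded in a SPAdes-style edge name."""
    match = LENGTH_RE.search(name)
    if match is None:
        raise ValueError(f"no length field in edge name: {name!r}")
    return int(match.group(1))


def filter_fastg(lines, min_len=1000):
    """Return FASTG lines with edges shorter than min_len removed.

    Neighbor lists are pruned so that no kept edge points to a removed one.
    """
    out = []
    keep = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            node, _, nbrs = line[1:].rstrip(";").partition(":")
            keep = edge_length(node) >= min_len
            if keep:
                kept = [n for n in nbrs.split(",")
                        if n and edge_length(n) >= min_len]
                suffix = ":" + ",".join(kept) if kept else ""
                out.append(">" + node + suffix + ";")
        elif keep:
            # Sequence line belonging to an edge that was kept.
            out.append(line)
    return out
```

Writing the result with `"\n".join(filter_fastg(open("assembly_graph.fastg")))` (placeholder file name) gives a smaller graph. Paths that passed through a removed edge are broken by this filter, which is one reason the results can get worse.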
Hello, thanks a lot for your answer. I've already sent the requested data. If you don't see anything wrong, I will try the steps you suggest.
Hello,
I am trying to run SCAPP on my dataset. For some samples it runs fast (around 1 day), but for others it takes much longer. One has been running for more than 15 days. The "scapp.log" file was being updated the whole time, so I assumed it was running properly. However, today it failed (segfault).
Is it normal for it to take that long? I set the program to run on 30 threads, but it uses only 1 almost all the time. The file I'm running SCAPP on is around 816 Mb, obtained from metaSPAdes (up to k99), with 387,000 contigs (67,000 above 1000 bp).
Is there any way to subset the fastg file from metaSPAdes (to remove short contigs) before using it as input for SCAPP?
Thank you,
Best regards