How to merge the TEsorter repeat libraries #52
Comments
Yes. In the output library |
Okay, thanks for answering the second part of my question. But I still have doubts about merging the two libraries. RepeatModeler provides a consensus library with far fewer sequences than the input genome FASTA, whereas TEsorter outputs the same number of sequences as the input genome FASTA. So I am wondering: can I merge both libraries into one and then cluster the merged library with a tool like CD-HIT? |
I do not understand. Are you using |
Thank you for the prompt reply. Yes, I used the -genome option to screen for the TEs in my genome. However, I was not aware that we can also input the library obtained from RepeatModeler. |
You are right. Please note that the |
Okay. I am using TEsorter v1.4.6, and I did get the *.cls.lib by using the -genome option. |
It is strange. How did you install it? Is it the latest version from GitHub? |
I installed it in a conda environment. |
I tested the conda version, but it outputs only four files:
|
Oh, it must be because I did not pass my genome with the -genome parameter; instead I used something else. |
Yes. |
Further on this: I ran TEsorter on the RepeatModeler output consensi.fa, and it took only one minute to produce the *.cls.lib output, with the following output on screen. Now I am wondering: did the pipeline work or not? |
It works. It is fast for a small TE library. |
Hi, I have run RepeatMasker, and I am getting many repeats classified as "unknown", which I want to reduce. I am attaching the RepeatMasker output for my genome from both workflows: RepeatModeler ---> RepeatMasker and RepeatModeler ---> TEsorter ---> RepeatMasker. Do you have any suggestions on how I can reduce the number of "unknown" TEs? Further, I am also attaching the headers of the *.cls.lib file that I obtained after running TEsorter and used as input to RepeatMasker.
|
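To see how many entries in a library are still unclassified before and after TEsorter, the header classifications can be tallied. The following is only a rough sketch: it assumes headers use the `>name#Class/Superfamily` convention that RepeatMasker reads from custom libraries, and it counts a header without a `#` part as "Unknown". The function name `classification_counts` is hypothetical and not part of TEsorter or RepeatModeler.

```python
from collections import Counter


def classification_counts(fasta_lines):
    """Count FASTA headers per classification (the text after '#').

    Assumption: headers look like '>name#Class/Superfamily [description]'.
    Headers with no '#' part are counted as 'Unknown'.
    """
    counts = Counter()
    for line in fasta_lines:
        if not line.startswith(">"):
            continue  # skip sequence lines
        fields = line[1:].split()
        token = fields[0] if fields else ""
        _, _, cls = token.partition("#")
        counts[cls or "Unknown"] += 1
    return counts


if __name__ == "__main__":
    # Tiny in-memory example; real use would pass an open file handle.
    example = [">fam1#LTR/Gypsy", "ACGT", ">fam2#Unknown", "GGCC", ">fam3", "TTAA"]
    for cls, n in sorted(classification_counts(example).items()):
        print(cls, n)
```

Running this on both libraries gives a quick comparison of how many "Unknown" entries each one carries.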
You may use the union set of non-unknown TEs from RepeatModeler and TEsorter. |
I did not follow. Are you suggesting that I keep only those sequences that are annotated by both RepeatModeler and TEsorter (the latter obtained by running TEsorter on the RepeatModeler library)? |
I mean you may replace the unknown classifications by TEsorter with the known classifications by RepeatModeler, like:
It is just to reduce the number of "unknown" TEs. |
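The replacement described above could be scripted roughly as follows. This is a minimal sketch, not the maintainer's method: it assumes both libraries use `>name#Class/Superfamily` headers, that the part before `#` is the same for a given sequence in both files, and that a classification starting with "Unknown" means unclassified. The function names (`read_fasta`, `split_header`, `merge_classifications`) are hypothetical helpers.

```python
def read_fasta(lines):
    """Yield (header_line, sequence) pairs from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line, []
        elif header is not None:
            seq.append(line.strip())
    if header is not None:
        yield header, "".join(seq)


def split_header(header):
    """Split '>name#Class/Superfamily ...' into (name, classification)."""
    fields = header.lstrip(">").split()
    token = fields[0] if fields else ""
    name, _, cls = token.partition("#")
    return name, cls or "Unknown"


def is_unknown(cls):
    # Assumption: unclassified entries are labelled 'Unknown' (any case).
    return cls.lower().startswith("unknown")


def merge_classifications(tesorter_lines, repeatmodeler_lines):
    """Keep TEsorter's labels, but where TEsorter says Unknown and
    RepeatModeler has a known label for the same name, use RepeatModeler's."""
    rm_known = {}
    for header, _ in read_fasta(repeatmodeler_lines):
        name, cls = split_header(header)
        if not is_unknown(cls):
            rm_known[name] = cls

    merged = []
    for header, seq in read_fasta(tesorter_lines):
        name, cls = split_header(header)
        if is_unknown(cls) and name in rm_known:
            cls = rm_known[name]
        merged.append((f">{name}#{cls}", seq))
    return merged


if __name__ == "__main__":
    tesorter = [">fam1#Unknown", "ACGT", ">fam2#LTR/Gypsy", "GGCC"]
    repeatmodeler = [">fam1#DNA/hAT", "ACGT", ">fam2#Unknown", "GGCC"]
    for header, seq in merge_classifications(tesorter, repeatmodeler):
        print(header)
        print(seq)
```

Entries that are unknown in both libraries stay unknown; the sketch only relabels headers and does not change any sequence.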
Okay, Thanks! |
Hey, thanks for the tool. How can I merge the output library of TEsorter with the RepeatModeler repeat library to run RepeatMasker? Further, can I directly input the output library of TEsorter into RepeatMasker?