[Hi-C maps] How to manage heterotype duplications? #83
Hi Quentin, I'm not sure how the green rectangles in the contact maps were generated. Despite the weak Hi-C signals in this short fragment, it seems like it should be relocated to the interior of the final large green rectangle. If this large green rectangle was generated by some tools, there may be an error. Best regards,
Hello, I generated them with the GreenHill utility that converts a FASTA
file to a Juicebox assembly file. It draws a green block at each gap
region, which is very useful for moving contigs. However, it ended up
giving my assembly names of the form Group:::fragment ...
In the end I could use the assembly-to-FASTA script from the
juicebox_scripts folder to get a final.fa and a final.agp. After that I
renamed the chromosomes and remapped the Hi-C reads. Everything is fine now.
However, the assembly after HapHiC from my polished assembly had 700
contigs, whereas the initial assembly had 500, and the quality was also
different.
I wonder whether I should continue with the 700-contig assembly. I think
it would be better because I have corrected many structural errors.
Meanwhile, I am running HapHiC (it's running now) on the p_utg sequences
and the p_utg GFA.
I wonder whether the p_utg will have a better quality than the
haplotype-resolved contigs in terms of QV. Also, how do I get the opposite
haplotype? Is it just a consensus assembly like the one wtdbg2 outputs?
Also, for the haplotype-resolved assembly, how should I supply
.hic.hap1.p_ctg.fa and .hic.hap2.p_ctg.fa to HapHiC?
Should I run cat file1 file2 > combined?
Or should I use a wildcard like hap*.fa?
I ask because your wiki explains p_ctg but not p_utg.
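For reference, the concatenation asked about above can be sketched in Python. This is only a minimal illustration, not HapHiC's documented input procedure: whether HapHiC expects a single combined file is for the maintainer to confirm. The function name `combine_fastas` is hypothetical. Note that in a shell, `cat hap*.fa > combined` and `cat file1 file2 > combined` give the same result when the wildcard matches exactly those two files, since the shell expands the pattern before `cat` runs. The sketch adds one safety check that plain `cat` lacks: it refuses duplicate sequence names across the two haplotype files.

```python
def combine_fastas(paths, out_path):
    """Concatenate FASTA files into out_path and return the sorted sequence names.

    Raises ValueError if the same sequence name appears twice, because
    duplicated names would be ambiguous in downstream mapping.
    """
    seen = set()
    with open(out_path, "w") as out:
        for path in paths:
            with open(path) as handle:
                for line in handle:
                    if line.startswith(">"):
                        # The name is the first whitespace-delimited token.
                        fields = line[1:].split()
                        name = fields[0] if fields else ""
                        if name in seen:
                            raise ValueError(f"duplicate sequence name: {name}")
                        seen.add(name)
                    # Guarantee a newline so records from two files never fuse.
                    out.write(line if line.endswith("\n") else line + "\n")
    return sorted(seen)
```

A call such as `combine_fastas(["asm.hic.hap1.p_ctg.fa", "asm.hic.hap2.p_ctg.fa"], "combined.fa")` (file names assumed for illustration) writes both haplotypes to one file.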
Another issue I found: I want to be able to look at the same map at
MAPQ 0, so I filtered the Hi-C reads and ran quick view mode to
generate Hic.filtered_0.bam.
Then I got the out_JBAT and the .hic files, but when I use it as a control
in JBAT the scaffolds are not sorted as in the .hic map at MAPQ 1.
Would it be possible to add an option like --mapq "auto(1)",
where we could choose 0, 30, or all? It would then make three .hic maps,
at MAPQ 1, 0 and 30, so we can visualise the telomeres or repeats at
MAPQ 0 and, at MAPQ 30, see what is unique between sequences.
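To make the request concrete, here is a minimal sketch of splitting one set of alignments by several MAPQ thresholds. It works on SAM text lines, where MAPQ is the fifth tab-separated column. The function names are hypothetical, and the --mapq option above is a proposal, not an existing HapHiC flag. The sketch filters each record on its own, so it does not keep Hi-C mates together; a real implementation would probably need to.

```python
def filter_sam_by_mapq(sam_lines, min_mapq):
    """Yield header lines, plus alignment lines whose MAPQ is >= min_mapq."""
    for line in sam_lines:
        if line.startswith("@"):
            # Header lines are always kept so the output stays valid SAM.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # A SAM alignment has 11 mandatory columns; MAPQ is column 5.
        if len(fields) >= 11 and int(fields[4]) >= min_mapq:
            yield line


def split_by_thresholds(sam_lines, thresholds=(0, 1, 30)):
    """Return {threshold: filtered lines} for each requested MAPQ cutoff."""
    sam_lines = list(sam_lines)  # allow several passes over the input
    return {t: list(filter_sam_by_mapq(sam_lines, t)) for t in thresholds}
```

On a BAM file, the same per-record filter is what `samtools view -q 30` applies.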
MAPQ 0 is important because if we see a gap in the Hi-C map at that
threshold, it means that there is no Hi-C mapping at all. If there is no
Hi-C mapping and we overlay another technology, such as
HIFI.winnowmap.aligned.sorted.wig,
we could directly cut the useless parts in JBAT instead of calling a
consensus with bcftools consensus --no-ref
These are just suggestions. Maybe also a converter for BED files,
because many times I had annotations of the scaffold.fa that was generated
after HapHiC and I wanted to plot HiFi read mapping (minimap2), gaps
(detgaps from asset), QV errors (from Merqury) and telomeres (from seqkit
locate). In the end I could do all of that, but you could add all of
these functions directly to HapHiC. Maybe tomorrow I'll send you the
scripts if you're interested.
You may think that the most important thing is to resolve the genome
assembly / scaffolding problem, and that you have already provided the
answer to that. But I believe a single pipeline could be useful,
especially to save time. At the same time, maybe it is better to
just leave HapHiC as is.
However, I find the reordering of contigs by name disturbing. I am used to
contigs being ordered by size. What is the reason for ordering by
name? Is it arbitrary, or does it come from hifiasm?
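The two orderings being contrasted can be illustrated with a small sketch. It says nothing about how HapHiC or hifiasm actually order their output; the function names are hypothetical and only show what "by size" and "by name" mean for the same set of contigs.

```python
def read_fasta_lengths(lines):
    """Return {sequence name: length} from an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the name.
            name = (line[1:].split() or [""])[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def order_by_size(lengths):
    """Longest first; ties are broken by name so the order is reproducible."""
    return sorted(lengths, key=lambda n: (-lengths[n], n))


def order_by_name(lengths):
    """Plain lexicographic order of the sequence names."""
    return sorted(lengths)
```

Reordering an existing FASTA by size with `order_by_size` changes only the record order, not the sequences themselves.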
See you and have a wonderful day.
Your tool is very powerful. Haha
Quentin.
On Tue, Oct 22, 2024, 4:12 PM Xiaofei Zeng wrote:
Hi Quentin,
I'm not sure how the green rectangles in the contact maps were generated.
Despite the weak Hi-C signals in this short fragment, it seems like it
should be relocated to the interior of the final large green rectangle. If
this large green rectangle was generated by some tools, there may be an
error.
Best regards,
Xiaofei
Hello,
I would like to ask what you usually do when you encounter this type of pattern in the scaffolds:
Is it caused by under-collapsed heterozygosity? If so, should I move it to debris?
Thank you in advance
Quentin.