A Bloom filter is overkill, but position-based duplicate assignment is perhaps not ideal either. A hybrid approach seems likely to offer a middle ground:
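| method | mapped data | unmapped data | CPU | memory | specificity | complexity |
| --- | --- | --- | --- | --- | --- | --- |
| Bloom filter | ✓ | ✓ | +++ | +++ | ++ | +++ |
| Start position | ✓ | - | + | + | + | + |
| Positional hash | ✓ | - | ++ | ++ | +++ | ++ |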
The "positional hash" can be anchored by a genomic location (perhaps a single linear genomic coordinate rather than a `(chrom, pos)` tuple), and the hash function can be weak, with its output truncated to the first few significant characters. Because the hash only has to distinguish reads that share the same anchor, collisions will be rare as long as the hash function is uniform.
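A minimal sketch of how such a key might be built, under several assumptions not stated above: the hash is taken over the read sequence, CRC32 stands in for the "weak" hash, the digest is truncated to its first four hex characters, and the chromosome lengths in `CHROM_LENGTHS` are made up for illustration. The names `positional_key` and `mark_duplicates` are hypothetical.

```python
import zlib

# Hypothetical chromosome lengths, used only to turn (chrom, pos)
# into a single linear genomic coordinate.
CHROM_LENGTHS = {"chr1": 1000, "chr2": 800}


def chrom_offsets(lengths):
    """Return the cumulative start offset of each chromosome."""
    offsets, total = {}, 0
    for chrom, length in lengths.items():
        offsets[chrom] = total
        total += length
    return offsets


OFFSETS = chrom_offsets(CHROM_LENGTHS)


def positional_key(chrom, pos, seq, hash_chars=4):
    """Build a (linear coordinate, truncated hash) key for one mapped read.

    Assumption: the hash covers the read sequence. CRC32 is used here
    only as an example of a cheap, reasonably uniform hash.
    """
    linear = OFFSETS[chrom] + pos
    digest = format(zlib.crc32(seq.encode("ascii")), "08x")
    return linear, digest[:hash_chars]


def mark_duplicates(reads):
    """Flag each read whose positional key has been seen before."""
    seen = set()
    flags = []
    for chrom, pos, seq in reads:
        key = positional_key(chrom, pos, seq)
        flags.append(key in seen)
        seen.add(key)
    return flags


reads = [
    ("chr1", 100, "ACGT"),
    ("chr1", 100, "ACGT"),  # same anchor and sequence as the first read
    ("chr2", 100, "ACGT"),  # same sequence, different anchor
]
flags = mark_duplicates(reads)
```

Here the second read is flagged as a duplicate of the first, while the third is not, because its linear coordinate differs. Two different sequences at the same anchor are separated only if their truncated hashes differ, which is where the uniformity of the hash matters.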