-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Get nerd-sniped and do some micro-optimizations. #41
base: master
Are you sure you want to change the base?
Conversation
Use std::unique_ptr in more places. Replace c-style arrays with c++ std::array. Use defaulted constructors/destructors where appropriate. Make it easier for the compiler to vectorize comparisons in the inner loop of stitchAlignToTranscript.
We're linking statically, and this lets the compiler be better at dead-code elimination.
src/lib.rs
Outdated
let (chr, pos, cigar, cigar_ops) = if record.len() > 0 { | ||
let rec = &record[0]; | ||
(rec.tid().to_string(), rec.pos().to_string(), format!("{}", rec.cigar()), rec.cigar().len().to_string()) | ||
} else { | ||
("NA".to_string(), "NA".to_string(), "NA".to_string(), "NA".to_string()) | ||
}; | ||
println!("{:?},{:?},{:?},{},{},{},{}", std::str::from_utf8(&read).unwrap(), new_now.duration_since(now), record.len(), chr, pos, cigar, cigar_ops); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of unnecessary memcpy going on here.
let (chr, pos, cigar, cigar_ops) = if record.len() > 0 { | |
let rec = &record[0]; | |
(rec.tid().to_string(), rec.pos().to_string(), format!("{}", rec.cigar()), rec.cigar().len().to_string()) | |
} else { | |
("NA".to_string(), "NA".to_string(), "NA".to_string(), "NA".to_string()) | |
}; | |
println!("{:?},{:?},{:?},{},{},{},{}", std::str::from_utf8(&read).unwrap(), new_now.duration_since(now), record.len(), chr, pos, cigar, cigar_ops); | |
if record.len() > 0 { | |
let rec = &record[0]; | |
println!("{:?},{:?},{:?},{},{},{},{}", std::str::from_utf8(&read).unwrap(), new_now.duration_since(now), record.len(), rec.tid(), rec.pos(),rec.cigar(), rec.cigar().len()) | |
} else { | |
println!("{:?},{:?},{:?},NA,NA,NA,NA", std::str::from_utf8(&read).unwrap(), new_now.duration_since(now), record.len()) | |
}; |
Though I guess it doesn't really matter if this is just temporary benchmarking code.
But, maybe this should be an actual benchmark?
Hey @adam-azarchs, yeah agreed it doesn't matter since this is just quick benchmarking code. Just for context, we're spending a bunch of time in In general, on a per read basis we take about ~280 microseconds to align a read, for a throughput of about 3.5K reads a second, which means that with datasets with ~380M reads we're taking almost 30 core hours to get our alignments out. Obviously, with cloud that's nothing, but I suspect we could probably do much better for stand alone users and consume less energy. I benchmarked this branch again, and as before found that although it's better for reasons of readability and I kind of want to merge it in just for that, it didn't move the needle on speed (actually was ~4% slower on average, and although not shown, that wasn't statistically significant, so it may have been a little faster or a little slower but wasn't a game changer). The plot below is the timings for 10K of the same reads to align, we have a lot of reproducible variation between reads but not so much between branches. The plot shows how long it took to align each read on master and your branch. There's a lot of heuristics at work in this code, but stepping through the code it looks like we should have a much higher throughput than we do, so am looking into that a bit more. |
0e34e5c
to
b34d0ff
Compare
I think I know why there was maybe a performance regression, because the loop vectorizer was having trouble figuring out that comparing two |
I'm wondering whether #53 will make any difference to what we see here. |
I think at next |
Low utilization means I/O bound. There's definitely low-hanging fruit to be found there. To start with, I think we need to give another try to the mmap strategy, just this time a less buggy implementation. |
Use std::unique_ptr in more places.
Replace c-style arrays with c++ std::array.
Use defaulted constructors/destructors where appropriate.
Make it easier for the compiler to vectorize comparisons in the inner
loop of stitchAlignToTranscript.
Muck around a little with compiler flags.