expression filter #57
This looks really nice. I made some specific code comments, but overall the only thing I see missing is the ability to apply multiple filters consecutively.
I am picturing expressions separated by a ";" (maybe), which would be parsed and applied in order to the fsd object.
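To make the ";" idea concrete, here is a minimal sketch of the splitting step only. The helper name `split_filters` is hypothetical and not part of the PR, and the ";" separator is just the one floated above, not a settled syntax.

```rust
/// Split a combined filter string into individual expressions.
/// Hypothetical helper: the name and the ';' separator are assumptions
/// taken from the review suggestion, not from the PR itself.
pub fn split_filters(filter: &str) -> Vec<String> {
    filter
        .split(';')
        // Trim each piece so "a; b" and "a;b" behave the same.
        .map(|part| part.trim().to_string())
        // Ignore empty pieces, e.g. from a trailing ';'.
        .filter(|part| !part.is_empty())
        .collect()
}
```

Each returned string could then be handed to the existing single-expression parser in order, so the parser itself would not need to know about the separator.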
@@ -21,12 +21,19 @@ pub struct FiberFilters {
        help_heading = "BAM-Options"
    )]
    pub bit_flag: u16,
    /// Filtering expression to use for filtering records
You can extend this by adding more lines starting with ///.
It would be great if you used that space to describe the syntax of the parser.
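As an illustration of what such a doc comment could look like, here is a sketch. The field name `filter_expression` and the doc text for `bit_flag` are hypothetical, and the clap attributes are left out so the snippet compiles on its own. The syntax described is only what can be read off `parse_filter` in this PR: the error message refers to a `len` function, the accepted features are msp, nuc, m6a and 5mC, and ranges are restricted to the "=" operator.

```rust
/// Simplified stand-in for the struct in the diff. In the PR the fields sit
/// under clap `#[arg(...)]` attributes, which are omitted here.
pub struct FiberFilters {
    /// BAM bit flag (doc text assumed; the original is not shown in the hunk)
    pub bit_flag: u16,
    /// Filtering expression to use for filtering records
    ///
    /// Syntax: `len(<feature>) <operator> <value>`
    ///   - `<feature>` is one of: msp, nuc, m6a, 5mC
    ///   - `<operator>` is one of: !=, >=, <=, >, <, =
    ///   - `<value>` is a number, or a range `min:max` (only with `=`)
    ///
    /// Whitespace is ignored. Examples: `len(msp)>100`, `len(nuc)=100:200`
    pub filter_expression: Option<String>, // hypothetical field name
}
```

With clap's derive API, doc comments on a field are used as its help text, so the extra lines would show up in the command's help output.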
Please add a description of the syntax of the parser.
See db994b8
src/utils/ftexpression.rs
Outdated
pub fn parse_filter(filter_orig: &str) -> ParsedExpr {
    let mut filter = filter_orig.to_string();
    filter.retain(|c| !c.is_whitespace());

    let func_name_end = filter.find('(').unwrap_or(filter.len());
    let func_name = filter[..func_name_end].trim().to_string();

    let gnm_feat_start = filter.find('(').unwrap_or(filter.len()) + 1;
    let gnm_feat_end = filter.find(')').unwrap_or(filter.len());
    let gnm_feat = filter[gnm_feat_start..gnm_feat_end].to_string();
    if !["msp", "nuc", "m6a", "5mC"].contains(&gnm_feat.as_str()) {
        eprintln!("Invalid argument for len function: {}", gnm_feat);
        std::process::exit(1);
    }

    let rest = &filter[gnm_feat_end + 1..].trim();

    let operators = ["!=", ">=", "<=", ">", "<", "="];
    let mut operator = "".to_string();
    let mut threshold = None;
    let mut range = None;

    for &op in operators.iter() {
        if let Some(pos) = rest.find(op) {
            operator = op.to_string();
            let threshold_str = rest[pos + op.len()..].trim();
            if threshold_str.contains(':') {
                let range_parts: Vec<&str> = threshold_str.split(':').collect();
                if range_parts.len() == 2 {
                    range = Some((
                        range_parts[0].trim().parse::<f64>().unwrap(),
                        range_parts[1].trim().parse::<f64>().unwrap(),
                    ));
                }
            } else {
                threshold = Some(threshold_str.parse::<f64>().unwrap());
            }
            break;
        }
    }

    if let Some((_, _)) = range {
        if operator != "=" {
            eprintln!("Range thresholds can only be used with the '=' operator.");
            std::process::exit(1);
        }
    }

    let threshold_value = match range {
        Some((min, max)) => Threshold::Range(min, max),
        None => Threshold::Single(threshold.unwrap()),
    };
I think it is important to be able to apply multiple filters at once in a single command, such that this function returns a Vec of ParsedExpr, which can then be iteratively applied to every fiber-seq record.
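A rough sketch of what applying a Vec of ParsedExpr in order might look like. Everything here is simplified and assumed: the fields of `ParsedExpr`, the `keep` and `apply_filters` helpers, the inclusive range check, and the modelling of a record as a list of (feature kind, length) pairs are illustrative only and do not reflect the actual types in the PR. Only the `Threshold::Single` and `Threshold::Range` variants appear in the code above.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Threshold {
    Single(f64),
    Range(f64, f64),
}

/// Assumed shape of a parsed filter; the real struct may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct ParsedExpr {
    pub feature: String,  // e.g. "msp"
    pub operator: String, // one of "!=", ">=", "<=", ">", "<", "="
    pub threshold: Threshold,
}

/// Evaluate one filter against a single feature length.
pub fn keep(value: f64, expr: &ParsedExpr) -> bool {
    match (&expr.threshold, expr.operator.as_str()) {
        // Treating the range as inclusive is an assumption.
        (Threshold::Range(min, max), "=") => value >= *min && value <= *max,
        (Threshold::Single(t), "=") => value == *t,
        (Threshold::Single(t), "!=") => value != *t,
        (Threshold::Single(t), ">=") => value >= *t,
        (Threshold::Single(t), "<=") => value <= *t,
        (Threshold::Single(t), ">") => value > *t,
        (Threshold::Single(t), "<") => value < *t,
        // Unknown operator, or a range with an operator other than "=".
        _ => false,
    }
}

/// Apply every filter in order. A filter only removes features of its own
/// kind; features of other kinds pass through untouched.
pub fn apply_filters(
    features: Vec<(String, f64)>,
    filters: &[ParsedExpr],
) -> Vec<(String, f64)> {
    let mut kept = features;
    for expr in filters {
        kept.retain(|(kind, len)| kind != &expr.feature || keep(*len, expr));
    }
    kept
}
```

Because each filter narrows the output of the previous one, the order on the command line would be the order of application, which matches the "applied in order" behaviour requested earlier in the review.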
This looks great. Thanks!
No description provided.