[sambamba view] Filter expression syntax
Sambamba-view supports custom filtering of alignment records. This wiki page describes the syntax of filter expressions, which the user provides with the `--filter` (`-F`) command-line option. Fields and flags are described in the SAM specification.
A filter expression is a number of basic conditions linked by the logical operators `and`, `or`, `not`, and enclosed in parentheses where needed.
A basic condition tests a single record field, tag, or flag.
You can use the comparison operators `==`, `!=`, `>`, `<`, `>=`, `<=` for both integers and strings.
Strings are delimited by single quotes. If you need a single quote inside a string, escape it with `\`.
Reduce a BAM file to one that contains only the reads on chr2, the second reference sequence listed in the SAM header (ref_id is zero-based, so the second reference has ref_id 1):
sambamba view -F "ref_id==1" -f bam HG01375.mapped.ILLUMINA.bwa.CLM.low_coverage.20120522.bam > HG01375.mapped.ILLUMINA.bwa.CLM.low_coverage.20120522_chr2.bam
Show all reads whose names start with ERR:
sambamba view -F "read_name =~ /^ERR/" HG01375.mapped.ILLUMINA.bwa.CLM.low_coverage.20120522_chr1.bam
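When a filter is assembled by a script, passing it as a single argument avoids shell quoting problems with characters such as `'`, `^`, and spaces. The following is a minimal sketch, not part of sambamba itself: `build_view_command` is a hypothetical helper and `input.bam` is a placeholder path. It uses only the `-F` and `-f` options that appear in the commands above.

```python
import shlex


def build_view_command(filter_expr, bam_path, output_format=None):
    """Return the argument list for a `sambamba view` call with a filter.

    Hypothetical helper: it only assembles arguments and does not run anything.
    """
    argv = ["sambamba", "view", "-F", filter_expr]
    if output_format is not None:
        # e.g. "bam", as in the chr2 example above
        argv += ["-f", output_format]
    argv.append(bam_path)
    return argv


# "input.bam" is a placeholder file name.
argv = build_view_command("read_name =~ /^ERR/", "input.bam")

# shlex.join renders the list as a shell-safe command line for logging.
print(shlex.join(argv))
```

The resulting list can be handed to `subprocess.run(argv, ...)` directly, so the filter expression never passes through a shell.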
mapping_quality >= 30 and ([RG] =~ /^abcd/ or [NM] == 7)
read_name == 'abc\'def'
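If string values come from another program, the escaping rule can be applied mechanically. This is a small sketch under one assumption: only single quotes are escaped, because that is the only rule this page states, and backslashes in the input are left untouched. `quote_string` is a hypothetical helper name.

```python
def quote_string(value):
    """Wrap value in single quotes, escaping embedded single quotes with a backslash.

    Assumption: no other characters need escaping.
    """
    return "'" + value.replace("'", "\\'") + "'"


condition = "read_name == " + quote_string("abc'def")
print(condition)  # prints: read_name == 'abc\'def'
```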
The following flag names are recognized:
- paired
- proper_pair
- unmapped
- mate_is_unmapped
- reverse_strand
- mate_is_reverse_strand
- first_of_pair
- second_of_pair
- secondary_alignment
- failed_quality_control
- duplicate
- supplementary
- chimeric
not (unmapped or mate_is_unmapped) and first_of_pair
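To see what such a flag expression selects, it can help to evaluate it against a numeric SAM FLAG value. The sketch below assumes that each flag name corresponds to the FLAG bit of the same meaning in the SAM specification; `chimeric` is left out because it is not a single FLAG bit in that specification. It also assumes the expression groups the same way in Python as in sambamba. `decode_flags` and `keep` are hypothetical names.

```python
# Assumed mapping from flag names to SAM FLAG bits (per the SAM specification).
SAM_FLAG_BITS = {
    "paired": 0x1,
    "proper_pair": 0x2,
    "unmapped": 0x4,
    "mate_is_unmapped": 0x8,
    "reverse_strand": 0x10,
    "mate_is_reverse_strand": 0x20,
    "first_of_pair": 0x40,
    "second_of_pair": 0x80,
    "secondary_alignment": 0x100,
    "failed_quality_control": 0x200,
    "duplicate": 0x400,
    "supplementary": 0x800,
}


def decode_flags(flag):
    """Return a dict of flag name -> True/False for a numeric FLAG value."""
    return {name: bool(flag & bit) for name, bit in SAM_FLAG_BITS.items()}


def keep(flag):
    """Python rendering of: not (unmapped or mate_is_unmapped) and first_of_pair"""
    f = decode_flags(flag)
    return (not (f["unmapped"] or f["mate_is_unmapped"])) and f["first_of_pair"]


print(keep(99))  # True: paired, proper pair, mate on reverse strand, first of pair
print(keep(77))  # False: both the read and its mate are unmapped
```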
Conditions for integer and string fields are supported.
List of integer fields:
- ref_id
- position
- mapping_quality
- sequence_length
- mate_ref_id
- mate_position
- template_length
List of string fields:
- read_name
- sequence
- cigar
- strand ('+'/'-')
- ref_name
- mate_ref_name
ref_id == 3 and mapping_quality >= 50 and sequence_length >= 80
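A script that generates filters can check field names and operators before calling sambamba. The following sketch rebuilds the expression above from its parts. `integer_condition` is a hypothetical helper, and the set of field names is copied from the integer field list on this page.

```python
# Integer field names as listed on this page.
INTEGER_FIELDS = {
    "ref_id", "position", "mapping_quality", "sequence_length",
    "mate_ref_id", "mate_position", "template_length",
}
COMPARISON_OPERATORS = {"==", "!=", ">", "<", ">=", "<="}


def integer_condition(field, op, value):
    """Format one basic condition on an integer field, validating its parts."""
    if field not in INTEGER_FIELDS:
        raise ValueError(f"not an integer field: {field}")
    if op not in COMPARISON_OPERATORS:
        raise ValueError(f"unknown comparison operator: {op}")
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("value must be an integer")
    return f"{field} {op} {value}"


# Link the basic conditions with the `and` operator.
expr = " and ".join([
    integer_condition("ref_id", "==", 3),
    integer_condition("mapping_quality", ">=", 50),
    integer_condition("sequence_length", ">=", 80),
])
print(expr)  # prints: ref_id == 3 and mapping_quality >= 50 and sequence_length >= 80
```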
Tags are denoted by their names in square brackets, for instance `[RG]` or `[Q2]`. They support conditions for both integers and strings, i.e. the tag must also hold a value of the corresponding type.
To filter on the presence of a particular tag, use the special `null` value.
[RG] != null and [AM] == 37
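Tag conditions can be generated the same way. The sketch below assumes tag names follow the two-character pattern from the SAM specification (a letter followed by a letter or digit). `tag_present` and `tag_equals` are hypothetical helpers, and string values are quoted with the single-quote rule described earlier on this page.

```python
import re

# Two-character SAM tag name: a letter, then a letter or digit.
TAG_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]")


def tag_ref(tag):
    """Return the bracketed form of a tag name, e.g. RG -> [RG]."""
    if not TAG_NAME.fullmatch(tag):
        raise ValueError(f"not a valid tag name: {tag}")
    return f"[{tag}]"


def tag_present(tag):
    """Condition that is true when the tag exists on the record."""
    return f"{tag_ref(tag)} != null"


def tag_equals(tag, value):
    """Equality condition; the literal type should match the tag's stored type."""
    if isinstance(value, str):
        literal = "'" + value.replace("'", "\\'") + "'"
    elif isinstance(value, int) and not isinstance(value, bool):
        literal = str(value)
    else:
        raise TypeError("value must be an integer or a string")
    return f"{tag_ref(tag)} == {literal}"


expr = tag_present("RG") + " and " + tag_equals("AM", 37)
print(expr)  # prints: [RG] != null and [AM] == 37
```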