fixed a crash of the SFA module in case the list of cInDels is empty
andreas-rempel committed Jan 5, 2024
1 parent 3707925 commit 0382a93
Showing 5 changed files with 41 additions and 42 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -6,6 +6,7 @@ Variant calling is a process to identify differences between strains, accessions
Sequence variants can be analyzed to predict their functional effects on encoded proteins based on available structural annotations.
Many existing tools operate on a variant-by-variant basis, and per-variant functional impact predictions are generally considered accurate.
However, challenging cases arise where multiple neighboring variants must be considered simultaneously, especially when these variants influence each other's functional impact.

The Neighborhood-Aware Variant Impact Predictor (NAVIP) addresses this problem by considering all variants that may affect a coding sequence (CDS) during the prediction process.
This comprehensive approach increases the accuracy of predicting functional consequences by considering the broader genomic context surrounding the target variants.
To use NAVIP, users must provide a Variant Call Format (VCF) file, a genomic FASTA file, and an associated Gene Feature Format (GFF3) file as input.
@@ -14,11 +15,10 @@ The tool is also freely available on our web server at: https://pbb-tools.de/NAV
# Usage

For the main program, there are no strict dependencies other than Linux, [Python 3](https://www.python.org), and [matplotlib](https://matplotlib.org).
-For most use cases, it is sufficient to download the source code / clone the git repository and
-run the script ***runnavip.sh*** with the required arguments, which will guide the user through the entire NAVIP processing pipeline:
+For most use cases, it is sufficient to download the source code / clone the git repository and run the script ***runnavip.sh*** with the required arguments, which will guide the user through the entire NAVIP processing pipeline:

```
-Usage: runnavip.sh --i <invcf> --g <ingff> --f <infasta> --o <outpath>
+Usage: runnavip.sh -i <invcf> -g <ingff> -f <infasta> -o <outpath>
where:
-i | --invcf Specify the input VCF file
63 changes: 29 additions & 34 deletions Readme.txt
@@ -1,52 +1,47 @@
NAVIP has three existing modules: \
1) VCF preprocessing,
2) the NAVIP main program, \
3) and one simple first analysis of the created data.

NAVIP has three existing modules:
1) VCF preprocessing,
2) the NAVIP main program,
3) and one simple first analysis of the created data.

You can choose the module with "--mode <module>".
The module shortcuts are "pre","main" and "sfa" and there is a vcf-format-check "vcfc".
The module shortcuts are "pre", "main" and "sfa" and there is a vcf-format-check "vcfc".

For the VCF-Check: "--mode vcfc --invcf <path_with_file>"

VCF preprocessing needs two more arguments:
The VCF-Check needs one argument:
"--invcf <path_with_file>"

"--invcf <path_with_file>" and "--outpath <path_to_folder>"
VCF preprocessing needs two arguments:
"--invcf <path_with_file>" and "--outpath <path_to_folder>"

Please be aware, that no new folder will be created.
Please be aware, that no new folder will be created.

The NAVIP main program needs four arguments:
"--invcf <path_with_file>", "--ingff <path_with_file>", "--infasta <path_with_file>" and "--outpath <path_to_folder>"

"--invcf <path_with_file>", "--ingff <path_with_file>", "--infasta <path_with_file>" and "--outpath <path_to_folder>"

The best possible output will be available, when the VCF file is 'corrected' by the preprozessing. However, NAVIP will still be able to deal with most of the 'normal' VCF data and will do its best.
The best possible output will be available, when the VCF file is 'corrected' by the preprozessing.
However, NAVIP will still be able to deal with most of the 'normal' VCF data and will do its best.

The SFA module needs three arguments:
"--innavipvcf <path_with_file>", "--innavipfasta <path_with_file>" and "--outpath <path_to_folder>"

"--innavipvcf <path_with_file>","--innavipfasta <path_with_file>" and "--outpath <path_to_folder>"

Example: VCF preprocessing:

"python3 navip.py \
--mode pre \
--invcf /.../small_variants.vcf \
--outpath /.../VCF_Preprocessing/"

python3 navip.py \
--mode pre \
--invcf /.../small_variants.vcf \
--outpath /.../VCF_Preprocessing/

Example: NAVIP main:
"python3 navip.py \
--mode main \
--invcf VCF_Preprocessing/first.vcf \
--ingff /.../Araport11_GFF3_genes_transposons.201606.gff \
--infasta /.../TAIR10.fa \
--outpath /.../NAVIP_Main_Output/"

python3 navip.py \
--mode main \
--invcf VCF_Preprocessing/first.vcf \
--ingff /.../Araport11_GFF3_genes_transposons.201606.gff \
--infasta /.../TAIR10.fa \
--outpath /.../NAVIP_Main_Output/

Example: SFA:
"python3 navip.py \
--mode sfa \
--innavipvcf /.../All_VCF.vcf \
--innavipfasta /.../all_transcripts_data.fa \
--outpath /.../SFA_Output/"


python3 navip.py \
--mode sfa \
--innavipvcf /.../All_VCF.vcf \
--innavipfasta /.../all_transcripts_data.fa \
--outpath /.../SFA_Output/
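
The pairing of modules and required arguments described in Readme.txt can be summarized as a small lookup table. The following is only an illustrative sketch: `REQUIRED_ARGS`, `missing_arguments` and `outpath_is_usable` are hypothetical names and are not part of navip.py, whose actual argument handling is not shown in this commit.

```python
import os

# Required arguments per module, transcribed from the Readme.txt text above.
# The dictionary and both helpers are hypothetical, for illustration only.
REQUIRED_ARGS = {
    "vcfc": ["invcf"],
    "pre": ["invcf", "outpath"],
    "main": ["invcf", "ingff", "infasta", "outpath"],
    "sfa": ["innavipvcf", "innavipfasta", "outpath"],
}


def missing_arguments(mode, given):
    """Return the required argument names that are absent or empty for `mode`."""
    if mode not in REQUIRED_ARGS:
        raise ValueError(f"unknown mode: {mode!r}")
    return [name for name in REQUIRED_ARGS[mode] if not given.get(name)]


def outpath_is_usable(outpath):
    """Readme.txt notes that no new folder is created, so it must already exist."""
    return os.path.isdir(outpath)
```

For instance, `missing_arguments("pre", {"invcf": "small_variants.vcf"})` would report that `outpath` is still missing.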
7 changes: 5 additions & 2 deletions compensating_indels.py
@@ -484,7 +484,8 @@ def find_all_cindels_v2(navip_vcf_file_link: str, mod_or_not: bool, outputfolder
table_output_file.write(description + "\n".join(normal_output_with_zeros))
table_output_file.close()

-do_magic_plotting(normal_output_with_zeros,outputfolder, table_outputname,formats, max_x_axis_bpr )
+if normal_output_with_zeros:
+    do_magic_plotting(normal_output_with_zeros,outputfolder, table_outputname,formats, max_x_axis_bpr )

### write all tids and stuff
involved_tid_list_output = []
@@ -557,7 +558,9 @@ def find_all_cindels_v2(navip_vcf_file_link: str, mod_or_not: bool, outputfolder
table_output_file = open(outputfolder + table_outputname, 'w')
table_output_file.write(description + "\n".join(normal_output_with_zeros))
table_output_file.close()
-do_magic_plotting(normal_output_with_zeros, outputfolder, table_outputname,formats, max_x_axis_bpr)
+
+if normal_output_with_zeros:
+    do_magic_plotting(normal_output_with_zeros, outputfolder, table_outputname,formats, max_x_axis_bpr)

### write all tids and stuff
involved_tid_list_unique = []
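
Both hunks in compensating_indels.py apply the same pattern: the table is still written, but plotting is skipped when the list is empty. A minimal, self-contained sketch of that guard follows. `plot_table` is a hypothetical stand-in for `do_magic_plotting`, and the assumption that plotting fails on empty input is inferred from the commit title, not from the plotting code itself.

```python
def plot_table(rows):
    # Hypothetical stand-in for do_magic_plotting. It raises ValueError on
    # empty input because max() of an empty sequence is undefined.
    return max(len(row) for row in rows)


def write_and_plot(rows):
    # Joining an empty list yields an empty string, so writing is always safe.
    table_text = "\n".join(rows)
    plotted = None
    # Same truthiness check as in the commit: an empty list skips plotting.
    if rows:
        plotted = plot_table(rows)
    return table_text, plotted
```

With this guard, `write_and_plot([])` returns an empty table and no plot instead of raising.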
2 changes: 1 addition & 1 deletion runnavip.sh
@@ -1,7 +1,7 @@
#!/bin/sh

help="
-Usage: $(basename "$0") --i <invcf> --g <ingff> --f <infasta> --o <outpath>
+Usage: $(basename "$0") -i <invcf> -g <ingff> -f <infasta> -o <outpath>
where:
-i | --invcf Specify the input VCF file
5 changes: 3 additions & 2 deletions sfa2.py
@@ -10,5 +10,6 @@ def sfa2_main(navip_vcf_file_link:str, mod_or_not:bool, outputfolder:str ,format
#def find_all_cindels_v2(navip_vcf_file_link: str, mod_or_not: bool, outputfolder: str, formats:str):
print("Starting compensating_indels script")
time = datetime.now()
-cindels.find_all_cindels_v2(navip_vcf_file_link, mod_or_not, outputfolder, formats, max_x_axis_bpr)
-print("Finished in: " +str(datetime.now() - time))
+try: cindels.find_all_cindels_v2(navip_vcf_file_link, mod_or_not, outputfolder, formats, max_x_axis_bpr)
+except: import sys; print("Warning: List of cInDels is empty, skipping the analysis...", file=sys.stderr)
+print("Finished in: " +str(datetime.now() - time))
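
Note that the bare `except:` above reports an empty cInDel list for any failure, including unrelated errors and keyboard interrupts. One possible alternative, sketched here under assumptions and not taken from the commit, is to catch only a dedicated exception. `EmptyCindelList` and `run_analysis` are hypothetical names that do not exist in NAVIP.

```python
import sys


class EmptyCindelList(Exception):
    """Hypothetical exception type; NAVIP does not define it."""


def run_analysis(analysis, *args):
    """Run `analysis` and report only the empty-list case as a warning."""
    try:
        analysis(*args)
    except EmptyCindelList:
        print("Warning: List of cInDels is empty, skipping the analysis...",
              file=sys.stderr)
        return False
    # Any other exception propagates, so unrelated bugs stay visible.
    return True
```

The trade-off is that the called function would have to raise this exception itself, which the commit as shown does not do.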
