Spark4VCF

Spark4VCF is a scalable, high-performance toolkit for the analysis, annotation, and prioritization of genomic variants.

Introduction

Spark4VCF was created by the software development team at Vinbigdata's Biomedical Information center. It leverages Spark parallelism to shorten the data processing time of genomic tools such as VEP, GATK, and PyPGx. Its simple architecture makes integrating these tools with Spark easy and effective, and the results of the integration are remarkable. The architecture of Spark4VCF is shown in the following figure:

Spark4VCF integration flow
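
To give a rough feel for the general idea of running a genomic tool in parallel with Spark, here is a minimal Python sketch. It is not Spark4VCF's actual implementation: the function names (split_vcf, run_tool_on_partition, annotate_with_spark) are hypothetical, and it assumes the wrapped tool reads VCF text on stdin and writes VCF text to stdout, which real VEP or GATK invocations may not do without extra options.

```python
import subprocess
from typing import Iterable, Iterator, List, Sequence, Tuple


def split_vcf(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Separate VCF header lines (starting with '#') from variant records."""
    header: List[str] = []
    records: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        (header if line.startswith("#") else records).append(line)
    return header, records


def run_tool_on_partition(
    header: Sequence[str], records: Iterable[str], command: Sequence[str]
) -> Iterator[str]:
    """Pipe one partition (header + its records) through an external tool.

    Assumption: `command` reads VCF on stdin and writes VCF on stdout.
    Only non-header output lines are returned.
    """
    records = list(records)
    if not records:
        return iter(())  # nothing to process in an empty partition
    payload = "\n".join(list(header) + records) + "\n"
    result = subprocess.run(
        command, input=payload, capture_output=True, text=True, check=True
    )
    return (
        line
        for line in result.stdout.splitlines()
        if line and not line.startswith("#")
    )


def annotate_with_spark(
    vcf_path: str, command: Sequence[str], num_partitions: int = 8
) -> List[str]:
    """Hypothetical driver: split a VCF into partitions and run the tool on each."""
    # Imported lazily so the helpers above work without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("vcf-parallel-sketch").getOrCreate()
    lines = spark.sparkContext.textFile(vcf_path)
    # The header is small, so it is collected once and shipped to every task.
    header = lines.filter(lambda l: l.startswith("#")).collect()
    # Note: repartition() shuffles records, so output order is not preserved.
    records = lines.filter(lambda l: l and not l.startswith("#")).repartition(
        num_partitions
    )
    return records.mapPartitions(
        lambda part: run_tool_on_partition(header, part, command)
    ).collect()
```

Each partition receives a copy of the header so that the external tool sees a well-formed VCF. A real integration would also need to handle sorting, the tool's own header additions, and tool-specific input and output options.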

Installation