
vinhdc10998/Spark4VCF

 
 


Spark4VCF

Spark4VCF is a scalable, high-performance toolkit for the analysis, annotation, and prioritization of genomic variants.

Introduction

Spark4VCF was created by the software development team at Vinbigdata's Biomedical Information center. It leverages Spark parallelism to reduce the processing time of genomic tools such as VEP, GATK, and PyPGx. Its simple architecture makes integrating these tools with Spark easy and effective, and the results of the integration are remarkable. The architecture of Spark4VCF is shown in the following figure:

Figure: Spark4VCF integration flow
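The general idea behind this kind of integration can be illustrated with a minimal sketch: split the variant records of a VCF into partitions, run a per-partition tool on each one in parallel, then merge the outputs in order under the original header. This is not Spark4VCF's actual code or API. It uses only Python's standard library as a stand-in for Spark so that it runs without a cluster, and `annotate_chunk`, `run_parallel`, and the `ANNOTATED` flag are hypothetical names invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_vcf(lines):
    """Separate header lines (starting with '#') from variant records."""
    header = [line for line in lines if line.startswith("#")]
    records = [line for line in lines if not line.startswith("#")]
    return header, records


def partition(records, n):
    """Split records into at most n contiguous chunks, preserving order."""
    size = max(1, -(-len(records) // n))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def annotate_chunk(header, chunk):
    """Hypothetical stand-in for running a tool such as VEP on one chunk.

    A real tool would need the header to interpret the records; here we
    only append a marker to the INFO column (index 7) of each record.
    """
    annotated = []
    for line in chunk:
        fields = line.split("\t")
        fields[7] = fields[7] + ";ANNOTATED"
        annotated.append("\t".join(fields))
    return annotated


def run_parallel(lines, n_partitions=4):
    """Scatter records across workers, then gather results in order."""
    header, records = split_vcf(lines)
    chunks = partition(records, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        # map() yields results in the same order as the input chunks
        results = list(pool.map(lambda c: annotate_chunk(header, c), chunks))
    merged = [line for chunk in results for line in chunk]
    return header + merged


if __name__ == "__main__":
    vcf = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "1\t100\t.\tA\tG\t50\tPASS\tDP=10",
        "1\t200\t.\tC\tT\t60\tPASS\tDP=12",
        "2\t300\t.\tG\tA\t70\tPASS\tDP=8",
    ]
    for line in run_parallel(vcf, n_partitions=2):
        print(line)
```

In a Spark setting, the partitions would be distributed across executors rather than threads, but the scatter, per-partition processing, and ordered merge steps follow the same pattern.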

Installation

