
Apache Spark is a framework for distributed, in-memory computation across clusters of compute nodes. Computations have to be expressed as map/reduce-style jobs.
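As a minimal sketch of what "adapting a computation to map/reduce" looks like in Spark, the PySpark word-count below maps each line to (word, 1) pairs and reduces by key; the input and output paths are placeholders, not paths from our environment.

```python
from pyspark import SparkContext

sc = SparkContext(appName="WordCount")

# Map phase: split each line into words and emit (word, 1) pairs.
# Reduce phase: sum the counts for each word.
counts = (
    sc.textFile("hdfs:///data/input.txt")        # illustrative input path
      .flatMap(lambda line: line.split())
      .map(lambda word: (word, 1))
      .reduceByKey(lambda a, b: a + b)
)

counts.saveAsTextFile("hdfs:///data/word_counts")  # illustrative output path
sc.stop()
```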

The informatics team has developed Docker containers for bringing up a Spark cluster and has begun testing various workflows on it, as sketched below.
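A hedged sketch of running a job against such a containerized cluster: the master URL and hostname (`spark://spark-master:7077`) are assumptions and should be replaced with the address exposed by the Docker-based Spark master container.

```python
from pyspark import SparkConf, SparkContext

# Point the driver at the standalone master running in Docker.
# The hostname and port here are placeholders, not our actual deployment.
conf = (
    SparkConf()
      .setAppName("cluster-smoke-test")
      .setMaster("spark://spark-master:7077")
)
sc = SparkContext(conf=conf)

# Trivial job to confirm the executors respond.
print(sc.parallelize(range(1000)).sum())
sc.stop()
```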
