Why Scala and functional programming for Kernalytics?

Why not R?

Scala is statically typed and compiled, which helps when writing code: a lot of validation is performed before anything is run.
Scala is faster: it is compiled and runs on the JVM.
Scala is a more coherent language, and its support for functional programming is more developed.

Why not C++?

External opinions

Scala vs C++ (Slant)
What are the advantages of Scala over C++ and Haskell? (Quora)
C++ vs Rust vs Scala (stackshare)

Specific arguments

automatic memory management (garbage collection)
use of immutable collections
objects are handled through references, so passing them to a function never copies them
being able to manipulate functions directly to structure the code
avoiding global state, uninitialized variables and uncontrolled states
pattern matching in Scala is very useful
learning functional constructs will be useful to parallelize code, or to move it to Spark for example

Functional vs imperative programming

Most of the code is written in a functional style, using immutable collections. This avoids the errors typically found in code where data is not initialized, where global state is abused, or where the program can go through invalid states. However, the pseudocode algorithms in the articles are usually written in an imperative style. Translating them to a functional style can make the code harder to read, and might reduce performance.

The currently recommended balance between the two paradigms is:

To use functional programming for the overall architecture. See for example the KernelGenerator where functions are passed as arguments, or Learn, where the Try monad helps manage the errors.
To use imperative programming, mutable collections and var for local computations. For example the Incomplete Cholesky Decomposition in IncompleteCholesky.scala. Here an imperative style is used and the code is much closer to the article pseudocode.
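
The balance described above can be sketched as follows. This is a hypothetical illustration, not code taken from Kernalytics: LocalImperative, cumulativeSum and cumulativeSumFunctional are names invented for the example. The method takes and returns immutable collections, while the computation itself uses a local var and a mutable array that never escape the method.

```scala
object LocalImperative {
  // Seen from the caller: immutable input, immutable output, no side effect.
  def cumulativeSum(xs: Vector[Double]): Vector[Double] = {
    val out = new Array[Double](xs.length) // mutable buffer, local to this method
    var acc = 0.0                          // local mutable accumulator
    var i = 0
    while (i < xs.length) {
      acc += xs(i)
      out(i) = acc
      i += 1
    }
    out.toVector // converted back to an immutable collection before returning
  }

  // The same computation in a purely functional style, for comparison.
  def cumulativeSumFunctional(xs: Vector[Double]): Vector[Double] =
    xs.scanLeft(0.0)(_ + _).tail
}
```

Both versions return the same result. The imperative one reads like typical article pseudocode, which is the reason for allowing it in local computations.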

Basics to learn to understand the Kernalytics code:

A beginner should grasp the basics of Scala from the introduction on the Scala website.

The following methods for basic collections should be understood:

map
filter
flatMap
foreach
reduce
fold, foldLeft
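
As a quick reference, the methods listed above can be sketched on a small immutable List. The object and value names are chosen for this example only.

```scala
object CollectionBasics {
  val xs: List[Int] = List(1, 2, 3, 4)

  val doubled: List[Int] = xs.map(_ * 2)                     // apply a function to each element
  val evens: List[Int] = xs.filter(_ % 2 == 0)               // keep the elements matching a predicate
  val mirrored: List[Int] = xs.flatMap(x => List(x, -x))     // map to lists, then concatenate them
  val sum: Int = xs.reduce(_ + _)                            // combine the elements pairwise
  val product: Int = xs.fold(1)(_ * _)                       // like reduce, with an initial value
  val joined: String = xs.foldLeft("")((acc, x) => acc + x)  // accumulator type may differ from element type

  // foreach is used only for its side effect and returns Unit.
  def printAll(): Unit = xs.foreach(println)
}
```

Note that reduce throws an exception on an empty collection, whereas fold and foldLeft return their initial value.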

As well as the following constructs:

pattern matching using case classes, similar to the end of this tutorial
monads (described below)
for / yield (described below)
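
The first construct in the list, pattern matching on case classes, can be sketched as follows. Shape, Circle, Rectangle and area are hypothetical names used for illustration and do not come from the Kernalytics code.

```scala
object PatternMatchingBasics {
  // A sealed trait lets the compiler warn when a match does not cover every case.
  sealed trait Shape
  case class Circle(radius: Double) extends Shape
  case class Rectangle(width: Double, height: Double) extends Shape

  // Each case both tests the type and extracts the fields of the case class.
  def area(s: Shape): Double = s match {
    case Circle(r)       => math.Pi * r * r
    case Rectangle(w, h) => w * h
  }
}
```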

The Try Monad

This monad is the basis of the error management code. It allows the code to be written concisely, by providing mechanisms for composing error-prone computations.

There are formal introductions available online, such as Demystifying the Monad in Scala.

This section describes the minimum required to understand the Kernalytics code.

The Try[A] class (A being a type parameter) has two subclasses, Success and Failure. Success[A] is a wrapper around an A object, while Failure is a wrapper around a Throwable. The main idea is that every function called during initialization returns a Try[_] of some sort. When an error occurs at any step, the corresponding Failure becomes the overall result.
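
A minimal sketch of these two cases, assuming a hypothetical initialization step that parses a size from text (parseSize and describe are illustrative names, not Kernalytics functions):

```scala
import scala.util.{Failure, Success, Try}

object TryBasics {
  // Try(...) catches a non-fatal exception thrown by its argument and wraps it in a Failure.
  def parseSize(s: String): Try[Int] = Try(s.trim.toInt)

  // Pattern matching gives access to the wrapped value or to the wrapped exception.
  def describe(t: Try[Int]): String = t match {
    case Success(n) => s"size = $n"
    case Failure(e) => s"error: ${e.getClass.getSimpleName}"
  }

  val ok: Try[Int] = parseSize("42")    // a Success wrapping 42
  val ko: Try[Int] = parseSize("forty") // a Failure wrapping a NumberFormatException

  // Later steps applied to a Failure are skipped: the Failure stays the result.
  val stillKo: Try[Int] = ko.map(_ * 2)
}
```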

For a given Try[A], the map method takes a function A => B, while the flatMap method takes a function A => Try[B]. A function A => B is one that is not expected to fail, while a function A => Try[B] can fail, in which case it returns a Failure wrapping the exception instead of throwing it.

Here is a simple example:

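import scala.util.Try

// math.sqrt returns NaN for a negative input and 1.0 / 0.0 evaluates to Infinity,
// so invalid inputs are rejected explicitly with require, which throws inside the Try.
def safeSqrt(x: Double): Try[Double] = Try { require(x >= 0.0, "negative input"); math.sqrt(x) }
def computeInverse(x: Double): Try[Double] = Try { require(x != 0.0, "division by zero"); 1.0 / x }
def addTwo(x: Double): Double = x + 2.0

val a = 9.0
val b = safeSqrt(a)
val c = b.flatMap(computeInverse)
val d = c.map(addTwo)

If a < 0, b is a Failure; otherwise it is a Success[Double]. Here b is Success(3.0). The explicit require is needed because math.sqrt alone does not throw on a negative input.
If b is Success(0.0), then c is a Failure. If b is already a Failure, computeInverse is not called and c is that same Failure. Neither is the case here, and c is approximately Success(0.333).
d is approximately Success(2.333). Since addTwo does not return a Try[Double] but simply a Double, it is not expected to fail. This is why c.map has been called, and not c.flatMap.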

The advantage of this syntax is that composition is very simple. All the tests are implicit: they are not written by the programmer. The Try monad takes care of everything, and the programmer just provides the functions to be applied.

The for / yield syntax is syntactic sugar, useful when a multi-level composition of flatMap and map becomes difficult to read. This article explains it using the flatMap method from List. List[_] is a monad just the way Try[_] is. Consider a List[A]: its flatMap method takes as argument a function A => List[B], and its map method takes a function A => B. Therefore the explanation is relevant to the current documentation.
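
To make the correspondence concrete, here is a small sketch that writes the same computation twice, once with flatMap and map and once with for / yield. The names parse, ratioExplicit and ratioFor are hypothetical and only serve the example.

```scala
import scala.util.Try

object ForYieldDemo {
  // toDouble throws a NumberFormatException on malformed input, which Try turns into a Failure.
  def parse(s: String): Try[Double] = Try(s.toDouble)

  // Explicit composition: flatMap for every step but the last, map for the last one.
  def ratioExplicit(a: String, b: String): Try[Double] =
    parse(a).flatMap(x => parse(b).map(y => x / y))

  // The same computation with for / yield; the compiler rewrites it into the form above.
  def ratioFor(a: String, b: String): Try[Double] =
    for {
      x <- parse(a)
      y <- parse(b)
    } yield x / y
}
```

With more than two steps, the nested flatMap calls keep indenting to the right, while the for / yield version stays flat.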

Additional resources

Monads - Another way to abstract computations in Scala (Medium article)
Functional Programming in Scala (book)
Scala with Cats (book)
Functional Programming Principles in Scala (Coursera online course)
