- Scala is statically typed and compiled, which helps when writing code: a lot of validation is performed before anything is run.
- Scala is faster: it is compiled and runs on the JVM.
- Scala is much more coherent, and its support for functional programming is more developed.
- Scala vs C++ (Slant)
- What are the advantages of Scala over C++ and Haskell? (Quora)
- C++ vs Rust vs Scala (stackshare)
- automatic object management
- use of immutable collections
- everything passed by reference
- being able to manipulate functions directly to structure the code
- avoiding global state, uninitialized variables and uncontrolled states
- pattern matching in Scala is very useful
- learning functional constructs will be useful to parallelize code or to move it to Spark, for example
Most of the code is written in a functional style, using immutable collections. This removes the errors found in code where data is not initialized, where global state is abused, or where the program can go through invalid states. However, the pseudocode algorithms in the articles are usually written in an imperative style. Translating them to a functional style would make the code difficult to read, and might reduce performance.
The currently recommended balance between the two paradigms is:
- To use functional programming for the overall architecture. See for example the KernelGenerator, where functions are passed as arguments, or Learn, where the Try monad helps manage the errors.
- To use imperative programming, mutable collections and var for local computations. For example, the Incomplete Cholesky Decomposition in IncompleteCholesky.scala is written in an imperative style, which keeps the code much closer to the article pseudocode.
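As a rough illustration of this balance, here is a minimal sketch. It is not taken from KernelGenerator or IncompleteCholesky.scala: the names `MixedStyleSketch`, `Kernel`, `gram` and `linear` are hypothetical. The outer interface is functional (the kernel is passed as an argument and errors are reported through a `Try`), while the inner loop uses `var` and a mutable array.

```scala
import scala.util.Try

object MixedStyleSketch {
  // Hypothetical alias: a kernel is just a function of two points.
  type Kernel = (Double, Double) => Double

  // Functional on the outside: the kernel is an argument, the result is an
  // immutable value wrapped in a Try.
  def gram(points: Vector[Double], k: Kernel): Try[Vector[Vector[Double]]] = Try {
    require(points.nonEmpty, "no data points")
    val n = points.length
    // Imperative on the inside: a mutable array filled by index-based loops,
    // close to how matrix pseudocode is usually written.
    val g = Array.ofDim[Double](n, n)
    var i = 0
    while (i < n) {
      var j = 0
      while (j <= i) {
        val v = k(points(i), points(j))
        g(i)(j) = v
        g(j)(i) = v // the Gram matrix is symmetric
        j += 1
      }
      i += 1
    }
    // The mutable array never escapes: callers only see immutable Vectors.
    g.map(_.toVector).toVector
  }

  val linear: Kernel = (x, y) => x * y
}
```

The mutable state stays local to `gram`, so the rest of the program can still reason about it as a pure function.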
A beginner should grasp the basics of Scala from the introduction on the Scala website.
The following methods for basic collections should be understood:
- map
- filter
- flatMap
- foreach
- reduce
- fold, foldLeft
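The following sketch shows each of these methods on a small `List`. The object name `CollectionBasics` and the values are only illustrative.

```scala
object CollectionBasics {
  val xs: List[Int] = List(1, 2, 3, 4)

  // map: apply a function to every element
  val doubled: List[Int] = xs.map(_ * 2)                     // List(2, 4, 6, 8)
  // filter: keep the elements that satisfy a predicate
  val evens: List[Int] = xs.filter(_ % 2 == 0)               // List(2, 4)
  // flatMap: map each element to a collection, then flatten the result
  val mirrored: List[Int] = xs.flatMap(x => List(x, -x))     // List(1, -1, 2, -2, 3, -3, 4, -4)
  // reduce: combine the elements pairwise (throws on an empty collection)
  val sum: Int = xs.reduce(_ + _)                            // 10
  // fold: like reduce, but with a start value of the element type
  val product: Int = xs.fold(1)(_ * _)                       // 24
  // foldLeft: the accumulator may have a different type than the elements
  val joined: String = xs.foldLeft("")((acc, x) => acc + x)  // "1234"

  // foreach: run a side effect on every element, returns Unit
  def printAll(): Unit = xs.foreach(println)
}
```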
As well as these constructs:
- pattern matching using case classes, similar to the end of this tutorial
- monads (described below)
- for / yield (described below)
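For readers who have not met pattern matching on case classes yet, here is a minimal sketch. The `Shape` hierarchy is a made-up example and is not part of Kernalytics.

```scala
object PatternMatchingSketch {
  // Hypothetical data model, used only for this illustration.
  sealed trait Shape
  final case class Circle(radius: Double) extends Shape
  final case class Rectangle(width: Double, height: Double) extends Shape

  // Each case both tests the runtime type and extracts the fields.
  def area(s: Shape): Double = s match {
    case Circle(r)       => math.Pi * r * r
    case Rectangle(w, h) => w * h
  }

  // Guards (the `if` part) refine a case; cases are tried from top to bottom.
  def describe(s: Shape): String = s match {
    case Rectangle(w, h) if w == h => "a square"
    case Rectangle(_, _)           => "a rectangle"
    case Circle(_)                 => "a circle"
  }
}
```

Because `Shape` is sealed, the compiler can warn when a match forgets one of the cases.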
The Try monad is the basis of the error management code. It allows the code to be written in a very concise manner, by providing mechanisms for the composition of error-prone computations.
There are formal introductions available online, such as Demystifying the Monad in Scala.
This section describes the minimum required to understand the Kernalytics code.
The Try[A] class (A being a type parameter) has two subclasses, Success and Failure. Success[A] is a wrapper around an A object, while Failure is a wrapper around a Throwable object. The main idea is that every function called during initialization returns a Try[_] of some sort. When an error occurs, it becomes the overall result.
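A minimal sketch of the two subclasses in action is given here. The helper names `parseInt` and `report` are hypothetical, chosen for the illustration only.

```scala
import scala.util.{Failure, Success, Try}

object TryBasics {
  // Try(...) evaluates its argument and catches non-fatal exceptions:
  // the result is Success(value) or Failure(exception).
  def parseInt(s: String): Try[Int] = Try(s.trim.toInt)

  // Pattern matching distinguishes the two subclasses.
  def report(t: Try[Int]): String = t match {
    case Success(n) => s"parsed $n"
    case Failure(e) => s"failed with ${e.getClass.getSimpleName}"
  }
}
```

With these definitions, `TryBasics.report(TryBasics.parseInt("12"))` gives `"parsed 12"`, while an input such as `"abc"` gives `"failed with NumberFormatException"`.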
For a given object of type Try[A], the map method needs a function A => B, while the flatMap method needs a function A => Try[B]. A function A => B is one that is not expected to fail, while a function A => Try[B] can potentially fail, in which case it returns a Failure.
Here is a simple example:
import scala.util.Try

// Double arithmetic never throws (math.sqrt(-1.0) is NaN, 1.0 / 0.0 is Infinity),
// so invalid inputs are rejected explicitly with require.
def computeInverse(x: Double): Try[Double] = Try { require(x != 0.0, "division by zero"); 1.0 / x }
def addTwo(x: Double): Double = x + 2.0

val a = 9.0
val b = Try { require(a >= 0.0, "negative input"); math.sqrt(a) } // Success(3.0)
val c = b.flatMap(computeInverse) // Success(0.333...)
val d = c.map(addTwo) // Success(2.333...)
- If a < 0, b will be a Failure; otherwise, it will be a Success[Double]. Here b is equal to Success(3.0).
- If b were Success(0.0), then c would be a Failure. This is not the case here, and c is equal to Success(0.33333333).
- d is equal to Success(2.333333333). Since addTwo does not return a Try[Double], but simply a Double, it cannot fail. This is why c.map has been called, and not c.flatMap.
The advantage of this syntax is that composition is very simple. All the checks are implicit: they are not written by the programmer. The Try monad takes care of everything, and the programmer just provides the functions to be applied.
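To make the implicit checks concrete, here is a small sketch in which the same chain is run on valid and invalid inputs. The names `checkedSqrt`, `checkedInverse` and `pipeline` are hypothetical. Since floating-point operations on Double do not throw, the sketch assumes the checks are expressed with `require`.

```scala
import scala.util.Try

object ShortCircuitSketch {
  def checkedSqrt(x: Double): Try[Double] =
    Try { require(x >= 0.0, "negative input"); math.sqrt(x) }

  def checkedInverse(x: Double): Try[Double] =
    Try { require(x != 0.0, "division by zero"); 1.0 / x }

  // No explicit test between the steps: once a step yields a Failure,
  // the remaining flatMap / map calls simply pass that Failure along.
  def pipeline(a: Double): Try[Double] =
    checkedSqrt(a).flatMap(checkedInverse).map(_ + 2.0)
}
```

Here `pipeline(4.0)` is `Success(2.5)`, `pipeline(-1.0)` fails in the first step, and `pipeline(0.0)` fails in the second. In both failing cases the final `map` is never applied.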
The for / yield syntax is syntactic sugar, useful when multi-level composition of flatMap and map becomes difficult to read. This article explains it using the flatMap method from List. List[_] is a monad just the way Try[_] is. Consider a List[A]: its flatMap method takes as argument a function A => List[B], and map takes a function A => B. Therefore the explanation is relevant to the current documentation.
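The equivalence can be sketched as follows, first with Try and then with List. The helper names `parse`, `divide`, `ratioNested` and `ratioFor` are hypothetical. The two versions of each computation are intended to be equivalent, since the compiler rewrites a for / yield expression into flatMap and map calls.

```scala
import scala.util.Try

object ForYieldSketch {
  def parse(s: String): Try[Double] = Try(s.trim.toDouble)
  def divide(x: Double, y: Double): Try[Double] =
    Try { require(y != 0.0, "division by zero"); x / y }

  // Explicit composition: every additional step adds a level of nesting.
  def ratioNested(a: String, b: String): Try[Double] =
    parse(a).flatMap(x => parse(b).flatMap(y => divide(x, y).map(q => q * 100.0)))

  // The same computation with for / yield: each `<-` is a flatMap,
  // except the last one, which becomes the map that produces the yield.
  def ratioFor(a: String, b: String): Try[Double] =
    for {
      x <- parse(a)
      y <- parse(b)
      q <- divide(x, y)
    } yield q * 100.0

  // The same syntax works for List, where it enumerates all combinations.
  val sumsNested: List[Int] = List(1, 2).flatMap(x => List(10, 20).map(y => x + y))
  val sumsFor: List[Int] = for { x <- List(1, 2); y <- List(10, 20) } yield x + y
}
```

For example, `ratioFor("1", "4")` is `Success(25.0)`, and both list values are `List(11, 21, 12, 22)`.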