Skip to content

Latest commit

 

History

History
145 lines (106 loc) · 7.86 KB

monoid.md

File metadata and controls

145 lines (106 loc) · 7.86 KB

Monoid for structural transformations

Documentation

Golang uses imperative style for structural transformations. The usage of for loop is advertised by majority of language tutorials. Usage of for loops is an idiomatic replacement for well-known high-order functional constructs like map, filter and fold.

sum := 0
for _, x := listOfInt {
  sum = sum + x
}

Usage of for loops for anything else than primitive containers requires a lot of boilerplate, which means a lot of repeated code. For example, iteration over skip lists, trees or other complex data types requires exposure of type internals to client code, therefor making unnecessary coupling. Monoid is the correct and most basic content to apply a structural transformation for any data structure:

a monoid is an algebraic structure with a single associative binary operation and an identity element.

Monoid is just a scientific name for mostly used computer science concept. The example above uses "commutative monoid under addition (+) with identity element zero (0)". Any type that defines "identity element" and implements "associative binary operation" is monoid:

type Monoid interface {
  // the identity value for a particular monoid.
  Mempty() Monoid

  // an associative binary operation
  Mappend(x Monoid) Monoid
}

The for loop example becomes

type MInt int
func (v MInt) Mempty() Monoid { return MInt(0) }
func (v MInt) Mappend(x Monoid) Monoid { return m + x.(MInt) }

sum := MInt(0)
for _, x := listOfInt {
  sum = sum.Mappend(x)
}

Generic algorithm with monoid

Let's consider a typical structural transformation in Scala. It builds a new collection by applying a function to all elements of a container type. The data structure traversal algorithm is implemented only once, which is a huge benefit for complex data structures.

trait Seq[A] {
  def map[B](f: (A) => B): Seq[B]
} 

It is not possible to implement generic map in Golang due to absence of generics and type covariance. The code generation does not help, we are bloating source code due to high cardinality of A x B set. Fortunately, high-order structural transformation functions can be implemented in terms of monoid abstraction. Anyone is able to implement generic map, filter, fold, leaving concrete type B behind monoid interface. As an example, MapReduce programming model is a monoid with left folding. Many iterative structural transformations may be elegantly expressed by a monoid operation:

  • fold - analyse "recursive" data structure through use of monoid;
  • map - immutable transformation of the structure, preserving the shape but often altering a type;
  • filter - produces a new data structure which contains elements accepted by predicate function;
  • comprehension - builder notation as distinct from the use of map and filter functions.

As an example of generic transformation using monoid abstraction:

func (seq SeqA) Map(mB Monoid, f func (A) Monoid) Monoid {
  b := mB.Mempty()
	for _, x := range seq {
		b = b.Mappend(f(x))
	}
	return b
}

Efficiency of monoid in Go

Usage of monoid interface has its own cost - it makes code "slower". The benchmark of the original example using classical for loop and its monoid counterpart shows significant difference. Unfortunately, an interface abstraction and its conversion to concrete types and back introduces the overhead (there are good post that shed the light about this overhead: Research on the interface of golang and Why Go is Not My Favorite Programming Language).

cpu: Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
BenchmarkMonoid-4       	46114359	       136.40 ns/op	       0 B/op	       0 allocs/op
BenchmarkForLoop-4      	425828221	        14.12 ns/op	       0 B/op	       0 allocs/op

Showing nodes accounting for 1160ms, 100% of 1160ms total
      flat  flat%   sum%        cum   cum%
     380ms 32.76% 32.76%     1140ms 98.28%  github.com/fogfish/golem/pure_test.SeqA.MMap
     360ms 31.03% 63.79%      390ms 33.62%  runtime.convT64
     190ms 16.38% 80.17%      380ms 32.76%  github.com/fogfish/golem/pure_test.SeqB.Mappend
     150ms 12.93% 93.10%      360ms 31.03%  github.com/fogfish/golem/pure_test.convAtoB
      60ms  5.17% 98.28%       60ms  5.17%  runtime.newstack
      20ms  1.72%   100%     1160ms   100%  github.com/fogfish/golem/pure_test.BenchmarkMonoid

Showing nodes accounting for 1110ms, 100% of 1110ms total
      flat  flat%   sum%        cum   cum%
     470ms 42.34% 42.34%      880ms 79.28%  github.com/fogfish/golem/pure_test.forEach
     370ms 33.33% 75.68%      400ms 36.04%  github.com/fogfish/golem/pure_test.joinAtoB (inline)
     210ms 18.92% 94.59%     1110ms   100%  github.com/fogfish/golem/pure_test.BenchmarkForLoop

This naive examples has shown 10x difference between "traditional" code and monoid abstraction. The overhead becomes less visible with growth of types complexity. For example, the monoid abstraction of traditional slice is 5x slower.

Structural transformations with clojure

The advantage of Monoid interface is the ability to apply transformation to data structure of any shape. The disadvantage is the overhead of runtime interface conversion. Usage of clojure mitigates discussed issue and leverage transformations with type safety.

Go functions may be closures. A closure is a function value that references variables from outside its body. The function may access and assign to the referenced variables; in this sense the function is "bound" to the variables.

We can assume a generic transformation as application of type-safe closure over data structure elements:

func (seq SeqA) FMap(f func(A)) {
	for _, x := range seq.value {
		f(x)
	}
}

The monoid definition becomes less formal in the context of clojure. The identity value is the type constructor and associative binary function is what-ever combine operation.

// associative binary function
func (seq SeqB) Append(x B) {/* ... */}

// identity value
seqB = SeqB{}
seqA.FMap(func(x A) { seqB.Append(/* A -> B */) })

Usage of clojure shows comparable performance with for loops. It is still slower for naive transformations algorithms but this difference fades away with any real life scenario.

cpu: Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
BenchmarkMonoid-4       	46114359	       136.40 ns/op	       0 B/op	       0 allocs/op
BenchmarkForLoop-4      	425828221	        14.12 ns/op	       0 B/op	       0 allocs/op
BenchmarkClojure-4      	168989125	        35.48 ns/op	       0 B/op	       0 allocs/op

Afterwords

Monoid is an abstract concept in computer science that helps with validation and implementation of generic algorithms. Usage of monoid in Go makes sense mainly for structural transformations. The classical interface-based monoid abstraction introduces visible overhead which is not desired in highly loaded solutions. However, usage of clojure delivery monoid-like A ⟼ B type transformation while resolving issue concerning a runtime overhead.

Related articles

  1. Monoids for Gophers
  2. Foldable Go
  3. Cats: Monoid
  4. Functors, Applicative Functors and Monoids