Hands on Monoids in Scala – Applying categories to birdwatching

What do Monoids in Scala have to do with birdwatching?

Before I come to that I would like to mention that since I’m coming from F# and recently started with Scala, being able to use type classes in my code is new to me.

I find this really exciting which I’d like to share.

So this post is maybe not a complete cohesive tutorial an Monoids but I hope that it is a fun little teaser for looking into things like functional programming, Scala, Cats, Haskell, type classes or maybe even category theory.

Let’s start

Imagine we have to aggregate many lists of bird counts.

A single list of bird counts is represented by a map of type Map[(Date, Location, Species), Int].

The objective is to create a function that takes many of these maps and combines them into a single map of aggregated counts. For equal keys the counts should be summed up.

This is the signature of the function:

def aggregate(birdCounts: List[Map[(Date, Location, Species), Int]]) = ???

Here is an example input:

List(
  Map(
    (Date(2016, 1, 4), cologne, pigeon) -> 37,
    (Date(2016, 1, 11), bonn, starling) -> 64,
    (Date(2016, 1, 18), cologne, bullfinch) -> 54
  ),
  Map(
    (Date(2016, 1, 18), cologne, bullfinch) -> 60,
    (Date(2016, 1, 18), bonn, bullfinch) -> 2,
    (Date(2016, 1, 25), dusseldorf, bullfinch) -> 75
  ),
  Map(
    (Date(2016, 1, 4), cologne, pigeon) -> 30
  )
)

Expected output:

Map(
  (Date(2016, 1, 4), cologne, pigeon) -> 67,
  (Date(2016, 1, 11), bonn, starling) -> 64,
  (Date(2016, 1, 18), cologne, bullfinch) -> 114,
  (Date(2016, 1, 18), bonn, bullfinch) -> 2,
  (Date(2016, 1, 25), dusseldorf, bullfinch) -> 75
)

This looks like a nice little programming kata.

The direct approach

There are different approaches on how to solve this.

I don’t even want to think about the imperative way.

So let’s look at a declarative (or functional) way of solving this:

birdCounts
  .flatMap(_.toList)
  .groupBy(_._1)
  .map({ case (k,v) => (k, v.map(_._2).sum) })

A more general abstraction

The functional solution seems to work.

But there is a more general abstraction to this that we can take advantage of.

I will give it away, right away, this is the solution:

import cats.implicits._

birdCounts.combineAll

That’s it!

I think, this is pretty cool.

But why does this work?

It works because in this context:

  • The type Map[A, B] is a Monoid
  • The type Int is a Monoid
  • And there is a default implementation in the library Cats for these types and type classes

Cats and type classes

Cats is a library which provides abstractions for functional programming in Scala and a very big part of these abstractions are type classes.

Type classes originated in Haskell. A type class defines a set of operations that must be supported by all instances of that type class.

There are lots of existing and commonly used type classes but we can also define our own.

Instances of a type class are normal types that we usually work with like String, Int, List, … or any other type, whether built-in, custom or coming from some external library.

The cool thing is that these existing types can be extended with the type class functionalities without changing the original code. There is no need for inheritance, or subtyping.

How does this work and what is a Monoid?

A Monoid is such a type class and relates to category theory. (Also see Categories Great and Small).

Here comes a quick summary of what a Monoid is.

Recap on Monoids

A Monoid consists of a type together with:

  • A binary operation for combining values (sometimes called combine or append)
  • A value that doesn’t do anything when combined with others (sometimes called empty, identity or neutral element)

In addition to that the Monoid must satisfy the Monoid laws, associativity and left and right identity. (Which will not be further discussed here. If you are interested please refer to Monoid laws.)

Examples

A simple example of a Monoid is the type Int together with + as the binary operation and 0 as the neutral element.

Other examples are:

  • Type Int with operation * and neutral element 1
  • Type String with operation String.concat and neutral element String.empty
  • Type List with operation ++ and neutral element List.empty
  • Type Boolean with the operation && and the neutral element true
  • Type A => A with the operation andThen (function composition) and the neutral element (x: A) => x (identity function)

Hands on Monoids in Scala

Let’s explore that a bit.

Prerequisites to following along the next steps are that Scala and sbt are installed. Also Scala and sbt should be added to the system’s path variable.

Follow these steps to start the Scala REPL with the Cats library loaded:

  • Create a new directory
  • Create an empty file build.sbt inside this directory
  • Put this line inside build.sbt: libraryDependencies += "org.typelevel" %% "cats" % "0.8.0"
  • Open a terminal and navigate to the directory
  • Start sbt by typing sbt
  • Start the Scala REPL by typing console

First let’s import the needed dependencies:

scala> import cats.Monoid
import cats.Monoid

scala> import cats.implicits._
import cats.implicits._

Now we can try out some things.

Let’s start with the first Monoid example from above (Int, +, 0).

scala> Monoid[Int].empty
res1: Int = 0

scala> Monoid[Int].combine(1,2)
res2: Int = 3

We can combine a list of integers with combinAll:

scala> (1 to 10).toList.combineAll // = 0+1+2+3+4+5+6+7+8+9+10
res4: Int = 55

Under the covers combineAll is doing a fold over the list where Monoid[Int] is 0 and combine is addition:

scala> (1 to 10).fold(Monoid[Int].empty)((a,b) => a.combine(b))
res0: Int = 55

… which is the same as:

scala> (1 to 10).fold(0)(_ + _)
res1: Int = 55

Here is a diagram which shows how a left fold for a list of type List[Monoid[Int]] works. First the neutral element and the first element of the list are applied to the addition function. Then this result and the second element are added and so on.

            ------------ Monoid[Int].combine -------------
            |    |    |    |    |    |    |    |    |    |
            v    v    v    v    v    v    v    v    v    v
((((((((((0 + 1) + 2) + 3) + 4) + 5) + 6) + 7) + 8) + 9) + 10)
          ^
          |_ Monoid[Int].empty

With a Map type (as in the birdwatching problem) it works very similar:

scala> Monoid[Map[Long, Int]].empty
res5: Map[Long,Int] = Map()

scala> Monoid[Map[Long, Int]].combine(Map(1L -> 10), Map(1L -> 100, 2L -> 999))
res6: Map[Long,Int] = Map(1 -> 110, 2 -> 999)

It is important that the values of the map are of type Int. And that Int has a default Monoid implementation in Cats. Therefore the values for identical keys will be added together. This is exactly what we wanted in the opening example.

Creating our own instance of a Monoid

So far we have seen default implementations of Monoids that were provided by Cats.

Finally let’s take a somewhat more advanced example and see how an instance of a Monoid can be created in Scala.

Let’s define a class Point:

case class Point(x: Int, y: Int)

Imagine we have a function that represents a move from one point to another. It takes a starting point as an input parameter and returns the destination point. It has this signature:

Point => Point

We can define some values of that type:

scala> val up = (p: Point) => Point(p.x, p.y + 1)
up: Point => Point = <function1>

scala> val down = (p: Point) => Point(p.x, p.y - 1)
down: Point => Point = <function1>

scala> val right = (p: Point) => Point(p.x + 1, p.y)
right: Point => Point = <function1>

scala> val left = (p: Point) => Point(p.x - 1, p.y)
left: Point => Point = <function1>

scala> val stay = (p: Point) => p.copy()
none: Point => Point = <function1>

Now we can define the type Point => Point as an instance of Monoid[Point => Point] where the neutral element is stay and the combine operation is function composition andThen:

implicit val moveMonoid: Monoid[Point => Point] = 
  new Monoid[Point => Point] {
    def empty = stay

    def combine(m1: Point => Point, m2: Point => Point) = 
      m1 andThen m2
  }

(The code above can be executed in the REPL by using the paste mode. Type :paste. Then the code can be pasted. Press ctrl-D to quit paste mode.)

Using the new Monoid:

scala> val move = List(up, up, up, right, right, down, left, stay).combineAll
move: Point => Point = <function1>

scala> move(Point(0,0))
res6: Point = Point(1,2)

Conclusion

We have seen how Monoids can help with combining maps, a trivial but very common use case.

The concept of Monoids is very simple and powerful and it is very useful in many other situations.

They can be the building blocks of other more complex types such as the Writer Monad e.g.

Also Monoids are extensively used in data processing libraries like e.g. Twitter’s Algebird.

If this got you interested, follow along the hands-on, play around with it, use Cats in your next project or find out more about functional programming or category theory.

Have fun!