Identify Side Effects And Refactor Fearlessly

When we refactor code how can we be confident that we don't break anything?

Three of the most important things that allow us to refactor fearlessly are:

  • Side effect free - or pure - expressions
  • Statically typed expressions
  • Tests

In this article we will focus solely on side effects and, more specifically, on how to identify them. Being able to clearly identify side effects in our programs is the precondition for eliminating them.

Why avoid side effects?

Continue reading →

PureScript Case Study And Guide For Newcomers

Have you ever wanted to try out PureScript but were lacking a good way to get started?

If you

  • Have some prior functional programming knowledge - maybe you know Haskell, Elm, F#, Scala, etc.
  • Want to solve a small task with PureScript
  • And want to get started quickly

This post is for you!

In this post we will walk through setting up and implementing a small exemplary PureScript application from scratch.

Continue reading →

Elm And The Algorithm Of Music

In this article I would like to present a minimal implementation of a music data type and everything that is needed to turn that into audible sound from an Elm application.

We will see how to transcribe an existing composition - an excerpt from Chick Corea's Children's Songs No. 6 - and listen to the result right here, embedded in this article.

From a music data type to performance

My colleague Jonas recently pointed out the presentation Making Algorithmic Music by Donya Quick to me. Donya Quick shows how she uses the Haskell library Euterpea to produce algorithmic music.

It got me really excited about the idea of porting this to Elm and to be able to use this in web applications.

In the following we will see the core data types and algorithms from Euterpea ported to Elm. To focus on the core concepts the implementation is stripped down to the minimum that is required to transcribe and perform an existing polyphonic piece of music (for a single instrument).

Continue reading →

Interactive Command Line Applications In Scala – Well Structured And Purely Functional

This post is about how to implement well structured, purely functional command line applications in Scala using PureApp.

PureApp originated in an experiment while refactoring out some glue code of an interactive command line application. At the same time it was inspired by the Elm Architecture Pattern, scalaz's SafeApp, and scalm.

To show the really cool things we can do with PureApp, we will implement a self-contained example application from scratch.

This application translates texts from and into different languages. And it provides basic user interactions via the command line.

The complete source code is compiled with tut. Every output (displayed as code comments) is generated by tut.

Continue reading →

How To Use Applicatives For Validation In Scala And Save Much Work

In this post we will see how applicatives can be used for validation in Scala. It is an elegant approach. Especially when compared to an object-oriented way.

Usually when we have operations that can fail, we have them return types like Option or Try. We sequence operations and once there is an error the computation is short-circuited and the result is a None or a Failure.

Applicatives allow us to compose independent operations and evaluate each one. Even if an intermediate evaluation fails. This allows us to collect error messages instead of returning only the first error that occurred.

A classic example where this is useful is the validation of user input. We would like to return a list of all invalid inputs rather than aborting the evaluation after the first error.

Scala Cats provides a type that does exactly that. So let's dive into some code and see how it works.

Continue reading →

Parsers in Scala built upon existing abstractions

After some initial struggles, the chapter Functional Parsers from the great book Programming in Haskell by Graham Hutton, where a basic parser library is built from scratch, finally helped me understand the core ideas of parser combinators and how to apply them in programming languages other than Haskell.

When I recently revisited the material and started to port the examples to Scala, I wasn't able to define a proper monad instance for the type Parser[A].

The type alias Parser[A] was defined like this:

type Parser[A] = String => Option[(A, String)] // defined type alias Parser

To test the monad laws with discipline I had to provide an instance of Eq[Parser[A]]. Because Parser[A] is a function, equality could only be approximated by showing degrees of function equivalence, which is not a trivial task.

Also the implementation of tailRecM was challenging. (I couldn't figure it out.)

Using existing abstractions

Continue reading →

Strongly Typed Configuration Access With Code Generation

Most config libraries use a stringly typed approach.

Some handle runtime failures due to invalid configuration schemas by leveraging data types like Option or Result to represent missing values or errors. This allows us to handle these failures by either providing default values or by providing decent error messages.

This is a good strategy that we should definitely stick to.

However, the problem with default values is that we might not even notice if the configuration is broken. This could potentially fail in production. In any case, an error, e.g. due to a misspelled config property, will be observable at runtime at the earliest.

Wouldn't it be a great user experience (for us developers) if the compiler told us if the configuration schema is invalid? Even better, imagine we could access the configuration data in a strongly typed way like any other data structure, and with autocompletion.

Moreover, what if we didn't have to write any glue code, not even when the configuration schema changes?

This can be done at the cost of an initial setup that probably won't take more than about 5 minutes.

Continue reading →

Error and state handling with monad transformers in Scala

In this post I will look at a practical example where combining the state monad and the either monad through monad transformers can be very useful.

I won't go into much theory, but instead demonstrate the problem and then build up the solution step by step.

You don't have to be completely familiar with all the concepts as the examples will be easy to follow. Here is a very brief overview:

Continue reading →

Use lambdas and combinators to improve your API

If your API overflows with Boolean parameters, this is usually a bad smell.

Consider the following function call for example:

toContactInfoList(csv, true, true)

When looking at this snippet of code it is not very clear what kind of effect the two Boolean parameters will have exactly. In fact, we would probably be without a clue.

We have to inspect the documentation or at least the parameter names of the function declaration to get a better idea. But still, this doesn't solve all of our problems.

The more Boolean parameters there are, the easier it will be for the caller to mix them up. We have to be very careful.

Moreover, functions with Boolean parameters must have conditional logic like if or case statements inside. With a growing number of conditional statements, the number of possible execution paths will grow exponentially. It will become more difficult to reason about the implementation code.

Can we do better?

Sure we can. Lambdas and combinators come to the rescue and I'm going to show this with a simple example, a refactoring of the function from above.

This post is based on a great article by John A De Goes,Destroy All Ifs — A Perspective from Functional Programming.

I'm going to take John's ideas that he backed up with PureScript examples and present how the same thing can be elegantly achieved in Scala.

Continue reading →

Modelling API Responses With sbt-json – Print Current Bitcoin Price

I'm currently working on an sbt plugin that generates Scala case classes at compile time to model JSON API responses for easy deserialization especially with the Scala play-json library.

The plugin makes it possible to access JSON documents in a statically typed way including auto-completion. It takes a sample JSON document as input (either from a file or a URL) and generates Scala types that can be used to read data with the same structure.

Let's look at a basic example, an app that prints the current Bitcoin price to the console.

Continue reading →

12 Things You Should Know About Event Sourcing

Are you aware that storing and updating current state means losing important data?

Event sourcing is a way to solve this problem. It is the technique of storing state transitions rather than updating the current state itself.

Event sourcing has some more benefits:

  • Complete audit-proof log for free
  • Complete history of every state change ever
  • No more mapping objects to tables
  • Distribution support
  • CQRS (Command Query Responsibility Segregation) support
  • Natural fit for domain-driven design and functional programming
  • Be prepared for unanticipated use cases in the future (for free)

State transitions are an important part of our problem space and should be modelled within our domain -- Greg Young

When I first encountered the concept of event sourcing and CQRS and looked at some sample applications, I had the impression that it must be extremely difficult to implement. But later I found out that event sourcing is easier than I first thought, especially when it is expressed with functional programming.

Here are 12 things about event sourcing that should help you to get started today.

1. An event is something that has happened in the past

An event is a fact that happened in the past.

Events indicate that something within the domain has changed. They contain all the information that is needed to transform the state of the domain from one version to the next.

Events make the concept of side effects explicit. In an event sourced system, a state change is no longer an implicit result of a performed operation. Instead the state change is explicitly defined and expressed in the domain language.

Here are some simplified example events for an accounting system:

import java.util.UUID

// Domain events: each one records a fact that has already happened.
sealed trait Event
final case class OnlineAccountCreated(accountId: UUID) extends Event
final case class DepositMade(accountId: UUID, depositAmount: BigDecimal) extends Event
final case class MoneyWithdrawn(accountId: UUID, withdrawalAmount: BigDecimal) extends Event

Events vs. Commands

Commands on the other hand have the intent of asking the system to perform an operation. Commands are represented in the imperative mood (e.g. CreateOnlineAccount) and can be rejected by the application.
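To make the contrast concrete, here is a minimal sketch of how the commands used later in this article could be declared next to an event. The field names and the handle helper are assumptions made for illustration; only the command and event names come from the article.

```scala
import java.util.UUID

object CommandSketch {
  // Commands use the imperative mood: a request that may still be rejected.
  sealed trait Command
  final case class CreateOnlineAccount(accountId: UUID) extends Command
  final case class MakeDeposit(accountId: UUID, depositAmount: BigDecimal) extends Command
  final case class Withdraw(accountId: UUID, withdrawalAmount: BigDecimal) extends Command

  // Events use the past tense: a fact that has already happened.
  sealed trait Event
  final case class DepositMade(accountId: UUID, depositAmount: BigDecimal) extends Event

  // Hypothetical helper: a command is either rejected (Left) or turned into an event (Right).
  def handle(cmd: MakeDeposit): Either[String, Event] =
    if (cmd.depositAmount <= 0) Left("deposit amount must be positive")
    else Right(DepositMade(cmd.accountId, cmd.depositAmount))
}
```

A rejected command leaves no trace in the event stream, whereas an accepted one results in an event that is stored as a fact.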

2. An event should be presented as a verb in past tense

State transitions are an important part of the domain. Therefore they are explicitly recorded as events and should be named according to the ubiquitous language.

Because events are facts that have happened in the past, they should be presented as verbs in past tense.

3. State is a first level derivative of the event stream

Current state (e.g. the balance of a bank account) is the derivative of all the events that have happened up until now.

This means that an object is persisted as a stream of events. There is no need anymore to map its structural model with all its properties to tables in a relational database.

"There is not an impedance mismatch between events and the domain model." -- Greg Young

State is also often referred to as a projection of the event stream.

4. Every aggregate has its own event stream

An aggregate is a cluster of associated objects treated as a single unit. This could be e.g. an order and its line-items.

In the first example from above the aggregate is the bank account.

Every aggregate has its own event stream. Therefore every event must be stored together with an identifier for its aggregate. This ID is often called AggregateId, or StreamId.

5. The structure of an event and how to persist it

As just mentioned, an event must have an aggregate ID because we must be able to query events by their aggregate ID.

They also need either a timestamp or a sequence or version number. This is important because we have to be able to sort the event stream chronologically to be able to correctly replay the events.

The aggregate ID and the sort key do not necessarily have to be part of the domain event's data structure although they sure can be. I personally find it a little redundant to store the aggregate ID within every event payload. In the domain code the ID is always known within the context and can be passed explicitly if needed. But I guess it is the default for most implementations to be part of the event data.

We also need to store the event data itself. This is usually called Data or Payload and is done by serializing the domain event.

The minimum information to store per event is:

Column    Type
StreamId  Guid
Data      Blob
Version   Int

There is also some additional but optional information that can be stored if convenient or required by the business:

  • The event type (e.g. ShippingInformationAdded)
  • Event version (unique within context of a given stream)
  • Correlation ID
  • Timestamp
  • Other meta data (e.g. user, permission level, IP addresses)

When persisting events, the current version of the event stream (sometimes called StreamRevision) must be equal to the expected version. The expected version is the version of the stream on which the creation of the given events was based. If the current version is not equal to the expected version, there is a concurrency conflict. This check and the write must happen within a single transaction.
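The expected-version check can be sketched as follows. This is a simplified in-memory illustration, not the implementation of any particular event store: the Stream type, the append function, and the use of plain strings for the stored data are all assumptions. An empty stream is given version -1 so that the first event gets version 0.

```scala
object ConcurrencySketch {
  // Hypothetical in-memory stream; the serialized events are plain strings here.
  final case class Stream(events: Vector[String]) {
    // Version of the last stored event; -1 for an empty stream.
    def version: Int = events.length - 1
  }

  // Appends only if the caller based its decision on the latest version of the stream.
  def append(stream: Stream, expectedVersion: Int, newEvents: List[String]): Either[String, Stream] =
    if (stream.version != expectedVersion)
      Left(s"concurrency conflict: expected version $expectedVersion but stream is at ${stream.version}")
    else
      Right(Stream(stream.events ++ newEvents))
}
```

Because this sketch works on immutable values, the check and the append cannot be interleaved with another writer. A real store has to provide the same guarantee with a transaction.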

Note that the StreamId column should be indexed.

Additionally it can be useful to track all the aggregates currently in the system, e.g. in a separate table.

Suitable event stores include Event Store, Redis, or plain old relational databases. (I guess there are many more options that I haven't evaluated yet.)

You can find more detailed information on event storage in the CQRS Documents by Greg Young under Building an Event Storage.

6. Events are immutable

An event is a fact that happened in the past. So unless you've invented a time machine, it cannot be changed.

7. There is no delete

Just as events cannot be changed, they cannot be deleted either. A deletion is just another event, a compensating action sometimes called "reversal transaction".

Immutable events and streams have some benefits:

  • Append-only models distribute more easily than updating models. There are far fewer locks to deal with and horizontal partitioning with the aggregate ID as the partition key is easy.
  • No information is lost. This is extremely valuable if the business can derive a competitive advantage from the data.
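As a small illustration of "there is no delete", the sketch below corrects a mistaken deposit by appending a compensating event instead of removing the original one. The DepositReversed event and the balance function are hypothetical and not part of the article's domain model.

```scala
object ReversalSketch {
  sealed trait Event
  final case class DepositMade(amount: BigDecimal) extends Event
  // Hypothetical compensating event: it undoes the effect of an earlier deposit.
  final case class DepositReversed(amount: BigDecimal) extends Event

  // The balance is derived from the full history, including the reversal.
  def balance(events: List[Event]): BigDecimal =
    events.foldLeft(BigDecimal(0)) {
      case (total, DepositMade(amount))     => total + amount
      case (total, DepositReversed(amount)) => total - amount
    }

  val log: List[Event] = List(DepositMade(100), DepositMade(40))
  // The mistaken deposit of 40 stays in the log; a new event compensates for it.
  val corrected: List[Event] = log :+ DepositReversed(40)
}
```

The corrected log still contains all three events, so the history of the mistake and its correction remains available.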

8. The apply function

The apply function is the essence of event sourcing. It takes a state and an event and returns a new state.

Here is a code example:

def apply(account: Account, event: Event): Account = {
  (account, event) match {
    case (Uninitialized, OnlineAccountCreated(accountId)) =>
      OnlineAccount(accountId, 0)
    case (OnlineAccount(accountId, balance), DepositMade(_, depositAmount)) =>
      OnlineAccount(accountId, balance + depositAmount)
    case (OnlineAccount(accountId, balance), MoneyWithdrawn(_, withdrawalAmount)) =>
      OnlineAccount(accountId, balance - withdrawalAmount)
    case _ => account
  }
}

Note that the apply function is pure. There must be no side effects when applying events to recreate the aggregate.

9. The replay function

To recreate state from an event stream we need to replay all the events. This is simply done by folding the events given an initial state. The replay function takes an initial state and the event stream and returns the current state of the aggregate and its version.

It is convenient to also return the version to be able to detect concurrency conflicts. Here is the code:

def replay(initial: Account, events: List[Event]): (Account, Int) = {
  events.foldLeft((initial, -1)) {
    case ((state, version), event) => (apply(state, event), version + 1)
  }
}
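The following self-contained sketch puts the events, apply, and replay together and replays a short stream. The Account type is not shown in this article, so its definition here (Uninitialized and OnlineAccount) is an assumption inferred from how the functions above use it.

```scala
import java.util.UUID

object ReplaySketch {
  sealed trait Event
  final case class OnlineAccountCreated(accountId: UUID) extends Event
  final case class DepositMade(accountId: UUID, depositAmount: BigDecimal) extends Event
  final case class MoneyWithdrawn(accountId: UUID, withdrawalAmount: BigDecimal) extends Event

  // Assumed shape of the aggregate state.
  sealed trait Account
  case object Uninitialized extends Account
  final case class OnlineAccount(accountId: UUID, balance: BigDecimal) extends Account

  // Pure state transition: the same state and event always give the same new state.
  def apply(account: Account, event: Event): Account = (account, event) match {
    case (Uninitialized, OnlineAccountCreated(id))               => OnlineAccount(id, 0)
    case (OnlineAccount(id, balance), DepositMade(_, amount))    => OnlineAccount(id, balance + amount)
    case (OnlineAccount(id, balance), MoneyWithdrawn(_, amount)) => OnlineAccount(id, balance - amount)
    case _                                                       => account
  }

  // Folds the events over the initial state; the version starts at -1 for an empty stream.
  def replay(initial: Account, events: List[Event]): (Account, Int) =
    events.foldLeft((initial, -1)) {
      case ((state, version), event) => (apply(state, event), version + 1)
    }

  val id: UUID = UUID.randomUUID()
  val history: List[Event] =
    List(OnlineAccountCreated(id), DepositMade(id, 1000), MoneyWithdrawn(id, 500))

  // Three events yield a balance of 500 at version 2.
  val result: (Account, Int) = replay(Uninitialized, history)
}
```

Since apply is pure, replaying the same history any number of times yields the same state and version.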

10. The decide function

The decide function takes the current state and a command as inputs. It then applies the business rules to decide whether the operation requested by the command can be performed in the current state. If so, it returns one or more events.

Let's look at the code:

def decide(cmd: Command, state: Account): Either[Error, List[Event]] = {
  (state, cmd) match {
    case (Uninitialized, CreateOnlineAccount(accountId)) =>
      Right(List(OnlineAccountCreated(accountId)))
    case (OnlineAccount(accountId, balance), MakeDeposit(_, amount)) =>
      if (amount <= 0) {
        Left("deposit amount must be positive")
      } else {
        Right(List(DepositMade(accountId, amount)))
      }
    case (OnlineAccount(accountId, balance), Withdraw(_, amount)) =>
      if (amount <= 0) {
        Left("withdrawal amount must be positive")
      } else if (balance - amount < 0) {
        Left("overdraft not allowed")
      } else {
        Right(List(MoneyWithdrawn(accountId, amount)))
      }
    case _ =>
      Left(s"invalid operation $cmd on current state $state")
  }
}

Note that validation of the command is also done in the decide function.

Commands will be rejected if:

  • The deposit or withdrawal amount is less than or equal to 0
  • The account does not exist (will be handled by the default case)
  • A withdrawal would result in a negative account balance

11. Event sourcing: the complete pattern

Now we need to combine the replay operation, the decide function, and the event persistence to create a fully functional event sourcing application.

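Here is how it works: the command handler reads the aggregate's event stream from the store, replays the events to obtain the current state and its version, passes the state and the command to the decide function, and appends the resulting events to the stream using that version as the expected version.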

Let's create a Scala case class that represents the event store:

final case class EventStore(
  appendToStream: (String, Int, List[Event]) => Either[Error, Unit],
  readFromStream: String => Either[Error, List[Event]])

Now we can implement the integration code:

def handleCommand(store: EventStore)
                 (accountId: UUID, cmd: Command) : Either[Error, Unit] = {
  for {
    events <- store.readFromStream(accountId.toString)
    (state, version) = Domain.replay(Uninitialized, events)
    newEvents <- Domain.decide(cmd, state)
    _ <- store.appendToStream(accountId.toString, version, newEvents)
  } yield ()
}

Testing the application

To test the application we will create a console application:

val store = EventStore(
  appendToStream = InMemoryEventStore.appendToStream,
  readFromStream = InMemoryEventStore.readFromStream
)

// dependency injection by partial application
val handle = CommandHandling.handleCommand(store) _

def query(accountId: UUID) = {
  store
    .readFromStream(accountId.toString)
    .map(events => Domain.replay(Uninitialized, events))
}

val accountId = UUID.randomUUID()

val commands = List(
  CreateOnlineAccount(accountId),
  MakeDeposit(accountId, 1000),
  Withdraw(accountId, 500),
  Withdraw(accountId, 501)
)

commands.foreach(cmd => {
  val result = handle(accountId, cmd)
  println(cmd)
  println(s"${result.fold(err => s"[ERROR] $err", _ => "[SUCCESS]")}")
  println(s"current state: ${query(accountId).fold(err => "invalid", account => s"$account")}")
})

First we create an in-memory event store. The implementation is not shown for the sake of brevity. You can find it here. The in-memory implementation is only for testing and demonstration purposes; don't use it in production!

Then we inject the store into the handle function by partial application.

We create a query function that reads the event stream from the in-memory event store and replays the events.

Then we handle a few commands and print the results to the console:

CreateOnlineAccount(7c05de5a-a2bd-4c2a-8ca1-20438db94b9a)
[SUCCESS]
current state: (OnlineAccount(7c05de5a-a2bd-4c2a-8ca1-20438db94b9a,0),0)
MakeDeposit(7c05de5a-a2bd-4c2a-8ca1-20438db94b9a,1000)
[SUCCESS]
current state: (OnlineAccount(7c05de5a-a2bd-4c2a-8ca1-20438db94b9a,1000),1)
Withdraw(7c05de5a-a2bd-4c2a-8ca1-20438db94b9a,500)
[SUCCESS]
current state: (OnlineAccount(7c05de5a-a2bd-4c2a-8ca1-20438db94b9a,500),2)
Withdraw(7c05de5a-a2bd-4c2a-8ca1-20438db94b9a,501)
[ERROR] overdraft not allowed
current state: (OnlineAccount(7c05de5a-a2bd-4c2a-8ca1-20438db94b9a,500),2)

12. When to apply event sourcing?

Finally, let's discuss when to apply event sourcing.

Greg Young says:

"In fact every domain is a naturally transaction based domain when Domain Driven Design is being applied"

So this means that event sourcing can be applied to any application that is driven by the domain.

However, event sourcing comes at a cost: modelling every behavior explicitly takes more implementation effort, and storing every event takes more disk space than storing only the current state.

It's a trade-off. We have to decide whether these costs are justified by the return for the business. There's more information on the trade-offs in Greg Young's CQRS document.

Generally speaking, I suggest considering event sourcing when:

  • You want to leverage any of the benefits of event sourcing
  • The domain is behavior driven
  • The system is task based
  • The system is not CRUD based

Conclusion

We have discussed the most important ideas of event sourcing.

While doing this we created a fully functional event sourcing application by applying these ideas in practice.

Even with the extra overhead of explicitly modelling all state transitions, the domain could be implemented in an uncluttered single source file of less than 90 lines.

There are still a few open ends that we haven't covered:

  • Durability of commands and events
  • Performance issues that arise when there is a very large number of events that have to be processed
  • How to improve performance by implementing snapshots and CQRS
  • How to improve performance by keeping aggregates in memory e.g. with Akka
  • How to deal with asynchrony
  • Evolving and upgrading events
  • Error handling
  • Aggregates can only be queried by their ID; however, SQL-like querying can be implemented with CQRS and appropriate read models.
  • How to do inter bounded context communication (context maps and process managers)
  • How to implement sagas

While many of these aspects are non-trivial, it is always a good idea to start out simple, which is what I did in this post.

Don't introduce complex concepts until you've proven their need!

So if you are new to event sourcing, I hope this gives you a good starting point. Tell me how you are doing.

If you are an expert on event sourcing, and you have any comments or suggestions, please let me know.

The complete source code can be found on GitHub.

Resources