A generic implementation of a Nested Monte Carlo Search

When we refactor code how can we be confident that we don't break anything?

3 of the most important things that allow us to refactor fearlessly are:

Side effect free - or pure - expressions
Statically typed expressions
Tests

In this article we will solely focus on the aspect of side effects and strictly speaking on how to identify them. Being able to identify side effects in our programs clearly is the precondition for eliminating them.

Why avoid side effects?

Continue reading →

Have you ever wanted to try out PureScript but were lacking a good way to get started?

If you

Have some prior functional programming knowledge - maybe you know Haskell,Elm,F#,or Scala,etc.
Want to solve a small task with PureScript
And want to get started quickly

This post is for you!

In this post we will walk through setting up and implementing a small exemplary PureScript application from scratch.

Continue reading →

In this article I would like to present a minimal implementation of a music data type and everything that is needed to turn that into audible sound from an Elm application.

We will see how to transcribe an existing composition - an excerpt from Chick Corea's Children's Songs No. 6 - and listen to the result right here,embedded in this article.

From a music data type to performance

My colleague Jonas recently pointed out the presentation Making Algorithmic Music by Donya Quick to me. Donya Quick shows how she uses the Haskell library Euterpea to produce algorithmic music.

It got me really excited about the idea of porting this to Elm and to be able to use this in web applications.

In the following we will see the core data types and algorithms from Euterpea ported to Elm. To focus on the core concepts the implementation is stripped down to the minimum that is required to transcribe and perform an existing polyphonic piece of music (for a single instrument).

Continue reading →

This post is about how to implement well structured,and purely functional command line applications in Scala using PureApp.

PureApp originated in an experiment while refactoring out some glue code of an interactive command line application. At the same time it was inspired by the Elm Architecture Pattern,and scalaz's SafeApp,as well as scalm.

To show the really cool things we can do with PureApp,we will implement a self-contained example application from scratch.

This application translates texts from and into different languages. And it provides basic user interactions via the command line.

The complete source code is compiled with tut. Every output (displayed as code comments) is generated by tut.
Continue reading →

In this post we will see how applicatives can be used for validation in Scala. It is an elegant approach. Especially when compared to an object-oriented way.

Usually when we have operations that can fail,we have them return types like Option or Try. We sequence operations and once there is an error the computation is short circuited and the result is a None or a Failure.

Applicatives allow us to compose independent operations and evaluate each one. Even if an intermediate evaluation fails. This allows us to collect error messages instead of returning only the first error that occurred.

A classic example where this is useful is the validation of user input. We would like to return a list of all invalid inputs rather than aborting the evaluation after the first error.

Scala Cats provides a type that does exactly that. So let's dive into some code and see how it works.

Continue reading →

After some initial struggles,the chapter Functional Parsers from the great book Programming in Haskell by Graham Hutton,where a basic parser library is built from scratch,significantly helped me to finally understand the core ideas of parser combinators and how to apply them to other programming languages other than Haskell as well.

While I recently revisited the material and started to port the examples to Scala I wasn't able to define a proper monad instance for the type Parser[A].

The type Parser[A] alias was defined like this:

type Parser[A] = String =>Option[(A,String)] // defined type alias Parser

To test the monad laws with discipline I had to provide an instance of Eq[Parser[A]]. Because Parser[A] is a function,equality could only be approximated by showing degrees of function equivalence,which is not a trivial task.

Also the implementation of tailRecM was challenging. (I couldn't figure it out.)

Using existing abstractions

Continue reading →

Most config libraries use a stringly typed approach.

Some handle runtime failures due to invalid configuration schemas by leveraging data types like Option or Result to represent missing values or errors. This allows us to handle these failures by either providing default values or by providing decent error messages.

This is a good strategy that we should definitely stick to.

However,the problem with default values is that we might not even notice if the configuration is broken. This could potentially fail in production. In any case an error e.g. due to a misspelled config property will be observable at runtime at the earliest.

Wouldn't it be a great user experience (for us developers) if the compiler told us if the configuration schema is invalid? Even better,imagine we could access the configuration data in a strongly typed way like any other data structure,and with autocompletion.

Moreover,what if we didn't have to write any glue code,not even when the configuration schema changes?

This can be done with the costs of an initial setup that won't take more than probably around 5 minutes.

Continue reading →

In this post I will look at a practical example where the combined application (through monad transformers) of the state monad and the either monad can be very useful.

I won't go into much theory,but instead demonstrate the problem and then slowly build it up to resolve it.

You don't have to be completely familiar with all the concepts as the examples will be easy to follow. Here is a very brief overview:

Continue reading →

If your API overflows with Boolean parameters,this is usually a bad smell.

Consider the following function call for example:

toContactInfoList(csv,true,true)

When looking at this snippet of code it is not very clear what kind of effect the two Boolean parameters will have exactly. In fact,we would probably be without a clue.

We have to inspect the documentation or at least the parameter names of the function declaration to get a better idea. But still,this doesn't solve all of our problems.

The more Boolean parameters there are,the easier it will be for the caller to mix them up. We have to be very careful.

Moreover,functions with Boolean parameters must have conditional logic like if or case statements inside. With a growing number of conditional statements,the number of possible execution paths will grow exponentially. It will become more difficult to reason about the implementation code.

Can we do better?

Sure we can. Lambdas and combinators come to the rescue and I'm going to show this with a simple example,a refactoring of the function from above.

This post is based on a great article by John A De Goes,Destroy All Ifs — A Perspective from Functional Programming.

I'm going to take John's ideas that he backed up with PureScript examples and present how the same thing can be elegantly achieved in Scala.

Continue reading →

I'm currently working on an sbt plugin that generates Scala case classes at compile time to model JSON API responses for easy deserialization especially with the Scala play-json library.

The plugin makes it possible to access JSON documents in a statically typed way including auto-completion. It takes a sample JSON document as input (either from a file or a URL) and generates Scala types that can be used to read data with the same structure.

Let's look at a basic example,an app that prints the current Bitcoin price to the console.

Continue reading →

A $15\times 15$ initial SameGame board has an upper bound of $1.97\times 10^{182}$ reachable board positions.

Therefore it can be extremely difficult to solve, even for computers.

A very promising approach to solving complex problems such as SameGame is the Nested Monte Carlo Search (NMCS). It is a very simple variation from the family of Monte Carlo Search algorithms, especially suited for single player games.

In this post we will see a generic implementation of the NMCS in Java that can easily be adapted to different problem domains.

But before we come to that, let's look at how the NMCS works.

Monte Carlo Tree Search

Monte Carlo Tree Search (MCTS) algorithms are extremely relevant today for state-of-the-art artificial intelligence implementations.

One recent famous example is Google's AI AlphaGo that mastered the game of Go. AlphaGo utilizes a MCTS for one of its core components.

The MCTS algorithms produce very good solutions to problems where other traditional methods fail.

These problems usually tend to have a very large search space such as games like Go, Chess or SameGame.

What's the general idea?

A MCTS is a heuristic algorithm which is based on running many random simulations.

During a simulation (sometimes also called playout, rollout or sample) random legal actions are applied to an initial state until a terminal state is reached.

When the number of simulations approaches infinity, the result will converge to an optimal solution.

Of course, this is impractical.

Therefore the simulation strategy is combined with a tree search to guide the search to more promising branches of the search tree. This often produces very good results within a reasonable amount of time.

Generic application

Monte Carlo Searches don't need any domain specific knowledge. They can be applied to any problem that can be described in terms of

state
list of legal actions
score
function that takes a state and an action and returns a new state
simulation strategy

Nested Monte Carlo Search

In this post we will focus on a very basic Monte Carlo algorithm called Nested Monte Carlo Search (NMCS).

The NMCS is especially suited for complex single player games such as SameGame.

Basic algorithm

A NMCS repeats the following steps until time runs out or until the search terminates.

Selection

During the first iteration the initial state (root) is selected.

Otherwise the action leading to the current best score is selected.

Simulation

For the previously selected state all legal actions are determined.

For each of these next actions a simulation is played out.

Backpropagation

The best result found during the previous simulation step is stored in memory.

Search levels

The specialty of the NMCS is that during the simulation step the simulations themselves can be nested applications of the NMCS algorithm to all the children of the current state.

The depth of nesting can be described in terms of search levels.

Recursive definition

Level	Description
`0`	One playout starting at the current node.
`1`	During simulation steps a level-0 search is conducted for all children.
`n>1`	During simulation steps a recursive level-(n-1) search is conducted for all children.

Example of a level-2 search step

The following diagram shows one iteration of a level-2 search:

Explanation:

Node 1 is selected.
Next, a complete 2 level deep subtree of the search space with 1 as its root is traversed. This subtree has 5 leafs, 4 - 8.
For each leaf a normal simulation is played out.
The best result is found by the simulation from node 7. The result is propagated back and the next node from the best result (node 3) is selected for the next iteration.

What's the optimal search level?

Finding a good search level is a matter of manual tuning and experimenting.

It varies from case to case and depends on parameters such as size of the search space, duration of the search, etc.

NMCSs with higher search levels will find better results during simulations, but they will take longer to get deep down the search path.

Playout policy

A level-0 search is a normal simulation and defined as one playout starting from a given node until a terminal state is reached.

On this path every action is chosen by a playout policy.

Playout policies could either randomly choose actions or they could use domain specific strategies to improve the results.

Generic implementation

The NMCS algorithm can be applied to every problem, puzzle or game that can be described in terms of the generic interface INmcsState<TState, TAction>.

Generic interface

The generic type TState represents the state of the problem, e.g. the board position.

The generic type TAction represents an action, e.g. a coordinate on a SameGame board, that can be applied to a state.

The implementation has to provide:

a score
a function to get all legal next actions
a function that takes a state and an action and returns a new state
an implementation of a playout policy for simulations

public interface INmcsState<TState, TAction> {
    double getScore();
    boolean isTerminalPosition(); // for convenience (same as findAllLegalActions().size() == 0)
    List<TAction> findAllLegalActions();
    INmcsState<TState, TAction> takeAction(TAction action);
    Pair<Double, List<TAction>> simulation();
}

The implementation must be immutable. That means none of the operations is allowed to change the state of the object that the operation is called on.

If the state has to be updated, instead of updating the internal state a new updated version will have to be returned.

Of course, the implementation could be a wrapper around an existing mutable class where a copy is created before mutating. This would be an application of the adapter design pattern.

Generic algorithm

The NMCS algorithm is defined by the function executeSearch.

It takes three parameters:

state: of type INmcsState<TState, TAction> which represents the current state
level: of type int which specifies the search level
isCanceled: of type Supplier<Boolean> which allows to inject code for gracefully stopping the execution, e.g. a timer

The return value is a tuple of a score of type Double and a list of actions of type List<TAction>. The list of actions describes the path from the root to the terminal state.

public static <TState, TAction> Pair<Double, List<TAction>> executeSearch(INmcsState<TState, TAction> state,
        final int level, final Supplier<Boolean> isCanceled) {

    // terminating case
    if (level <= 0)
        return state.simulation();

    Pair<Double, List<TAction>> globalBestResult = Pair.of(state.getScore(), new LinkedList<TAction>());
    final List<TAction> visitedNodes = new LinkedList<TAction>();

    while (!state.isTerminalPosition() && !isCanceled.get()) {

        Pair<Double, List<TAction>> currentBestResult = Pair.of(0.0, new LinkedList<TAction>());
        TAction currentBestAction = null;

        for (TAction action : state.findAllLegalActions()) {
            final INmcsState<TState, TAction> currentState = state.takeAction(action);
            // recursion
            final Pair<Double, List<TAction>> simulationResult = executeSearch(currentState, level - 1, isCanceled);

            if (simulationResult.item1 >= currentBestResult.item1) {
                currentBestAction = action;
                currentBestResult = simulationResult;
            }
        }

        if (currentBestResult.item1 >= globalBestResult.item1) {
            visitedNodes.add(currentBestAction);
            globalBestResult = currentBestResult;
            globalBestResult.item2.addAll(0, visitedNodes);
        } else {
            currentBestAction = globalBestResult.item2.get(visitedNodes.size());
            visitedNodes.add(currentBestAction);
        }

        state = state.takeAction(currentBestAction);
    }
    return globalBestResult;
}

The algorithm terminates either when time runs out or when a node is selected during the selection step that has no child node and is a terminal state.

Here is an example of calling the executeSearch function:

final int[][] board = BoardGenerator.generateBoard("1,1,1;2,2,2;3,3,3;");
final int level = 2;
final long maxRunningTimeMs = 2 * 60 * 1000;

final INmcsState<SGBoard, Point> state = new SGNmctsState(board);

final long endTimeMs = System.currentTimeMillis() + maxRunningTimeMs;
final Pair<Double, List<Point>> result = NestedMonteCarloSearch.executeSearch(state, level, () -> {
    return System.currentTimeMillis() > endTimeMs;
});

The complete code is available on GitHub.

Enhancements and improvements

Here is a selection of potential improvements:

Root parallelization
By running independent searches from the root, the risk of getting trapped in a local maximum is reduced.
Guided playout policy
For some problem domains guided playouts may lead to better results than random sampling.
More emphasis on exploration as opposed to exploitation
It could be profitable to try unpromising actions occasionally because they still may lead to the total maximum.
Parallelization and performance optimization
More efficient implementations will provide better results.
Choose randomly between equally good results
The implementation shown above is intentionally kept simple. If multiple playouts yield the same score, the last playout will always be selected. A random selection between equally good paths would be the superior strategy.

Resources

Single-Player Monte-Carlo Tree Search for SameGame
Monte-Carlo Tree Search Enhancements for One-Player and Two-Player Domains
Code from this post on GitHub

Leif Battermann

Functional Programming Fu

Identify Side Effects And Refactor Fearlessly

Why avoid side effects?

PureScript Case Study And Guide For Newcomers

Elm And The Algorithm Of Music

From a music data type to performance

Interactive Command Line Applications In Scala –Well Structured And Purely Functional

How To Use Applicatives For Validation In Scala And Save Much Work

Parsers in Scala built upon existing abstractions

Using existing abstractions

Strongly Typed Configuration Access With Code Generation

Error and state handling with monad transformers in Scala

Use lambdas and combinators to improve your API

Modelling API Responses With sbt-json –Print Current Bitcoin Price

Leif Battermann

Functional Programming Fu

A generic implementation of a Nested Monte Carlo Search for single player games

Monte Carlo Tree Search

What's the general idea?

Generic application

Nested Monte Carlo Search

Basic algorithm

Selection

Simulation

Backpropagation

Search levels

Recursive definition

Example of a level-2 search step

What's the optimal search level?

Playout policy

Generic implementation

Generic interface

Generic algorithm

Enhancements and improvements

Resources