Elm And The Algorithm Of Music

In this article I would like to present a minimal implementation of a music data type and everything that is needed to turn that into audible sound from an Elm application.

We will see how to transcribe an existing composition - an excerpt from Chick Corea's Children's Songs No. 6 - and listen to the result right here, embedded in this article.

From a music data type to performance

My colleague Jonas recently pointed out the presentation Making Algorithmic Music by Donya Quick to me. Donya Quick shows how she uses the Haskell library Euterpea to produce algorithmic music.

It got me really excited about the idea of porting this to Elm and to be able to use this in web applications.

In the following we will see the core data types and algorithms from Euterpea ported to Elm. To focus on the core concepts the implementation is stripped down to the minimum that is required to transcribe and perform an existing polyphonic piece of music (for a single instrument).

Music data type

Let's first define a type that represents a primitive musical value which can be either a note or a rest:

type alias Duration =
    Float

type alias Pitch =
    Int

type Primitive
    = Note Duration Pitch
    | Rest Duration

A Duration is expressed as a Float and represents the note value (the relative duration of a note). A quarter note e.g. can be defined by the expression 1/4 or the literal 0.25, an eighth note by 1/8 or 0.125, and so on. I prefer using Float over a - potentially more accurate - type for rational numbers because Float is simple and easy to work with in Elm.

Pitch is an integer that represents the number of semitones from C-1 (five octaves below C4 or Middle C). So C-1 corresponds to pitch 0, and C4 to 60, A4 to 69, and so on. This convention is also used by most MIDI implementations.

With the Primitive type we can now define lots of primitive musical values. However, we cannot yet express how they should be composed together to form a larger - more complex - musical value.

There are two fundamentally different ways of combining notes (and rests). They can be played either in sequence, or simultaneously. Let's implement this as a custom type:

type Music
    = Prim Primitive
    | Seq Music Music
    | Par Music Music

Music has three constructors:

  • Prim is a constructor for a primitive musical value (a note or a rest).
  • Seq takes two arguments of type Music that are played in sequence.
  • Par takes two arguments of type Music that are played in parallel.

Note that the definition of Music is recursive which allows us to easily combine musical values with each other to form larger structures that again can be composed together, and so on.


Example

Music Example 1

Here is one way to encode this small musical figure:

m : Music
m =
    Seq
        (Par (Prim (Note (1 / 4) 60)) (Prim (Note (1 / 4) 64)))
        (Par (Prim (Note (1 / 4) 62)) (Prim (Note (1 / 4) 65)))

Constructors for notes and syntactic sugar

The Music type is already pretty powerful and we can use it to describe quite complex pieces of music. However, the usability is not very good, yet. This is partly because the pitch numbers are not very readable.

The first thing we need for better syntax is a constructor for a note that returns a value of type Music:

note : Duration -> Pitch -> Music
note duration pitch =
    Prim (Note duration pitch)

After we define Octave as a synonym for Int we can implement functions to easily construct notes, given an octave and a duration:

type alias Octave =
    Int

c : Octave -> Duration -> Music
c octave duration =
    note duration (12 * (octave + 1) + 0)

cs : Octave -> Duration -> Music
cs octave duration =
    note duration (12 * (octave + 1) + 1)

df : Octave -> Duration -> Music
df octave duration =
    note duration (12 * (octave + 1) + 1)

-- etc. ...
-- ...

b : Octave -> Duration -> Music
b octave duration =
    note duration (12 * (octave + 1) + 11)

There is a function for each combination of a letter name ([c, d, e, f, g, a, b]) and an optional suffix that indicates an accidental, f for a flat (♭), or s for a sharp (♯).

Likewise it is useful to have constructors for durations (and rests), that will not be listed here for the sake of brevity.


Example

With these helper functions the example from above becomes much more readable:

qn : Duration
qn =
    1 / 4

m : Music
m =
    Seq
        (Par (c 4 qn) (e 4 qn))
        (Par (d 4 qn) (f 4 qn))

More convenient functions

To make it even more convenient to construct and combine Music values we define the functions rest, line, and times as follows:

{-| Create a rest where duration specifies the length of the pause.
-}
rest : Duration -> Music
rest duration =
    Prim (Rest duration)

{-| Create a sequence from a list of Music values.
-}
line : List Music -> Music
line =
    List.foldr Seq (rest 0)

{-| Repeat a Music value for a particular number of times.
-}
times : Int -> Music -> Music
times n music =
    List.repeat n music |> line

In anticipation of our goal to perform Children's Songs No. 6 we implement another function that makes it easy to create grace notes:

{-| Create a grace note with 1/8 of the duration of the principle note.
The pitch of the grace notes is n semitones higher (or lower) than the principle note.
-}
graceNote : Int -> Music -> Music
graceNote n music =
    case music of
        Prim (Note duration pitch) ->
            Seq
                (note (duration / 8) (pitch + n))
                (note ((duration / 8) * 7) pitch)

        _ ->
            music

Transcribing music

In only a few lines of code we have defined a data structure - and some convenient functions - that is powerful enough to represent a complex musical score.

So let's go ahead and transcribe an excerpt of Chick Corea's composition Children's Songs No. 6.

Here is the sheet music for the first 28 bars of Children's Songs No. 6:

Children's Songs No. 6

Base line

Let's start with the base line (the left hand voice).

The base line has a lot of repetitions. It consists of only three different phrases that are each repeated several times which is a perfect fit to apply the times function.

We can encode the complete 28 bars of the base line as follows:

dqn : Duration
dqn =
    3 / 8

bassLine : Music
bassLine =
    let
        b1 =
            line [ b 2 dqn, fs 3 dqn, g 3 dqn, fs 3 dqn ]

        b2 =
            line [ b 2 dqn, es 3 dqn, fs 3 dqn, es 3 dqn ]

        b3 =
            line [ as_ 2 dqn, fs 3 dqn, g 3 dqn, fs 3 dqn ]
    in
    line [ times 3 b1, times 2 b2, times 4 b3, times 5 b1 ]

Melody

The encoding of the melody follows the same principles:

melody : Music
melody =
    let
        ( en, qn, hn ) =
            ( 1 / 8, 1 / 4, 1 / 2 )

        ( dhn, triplet ) =
            ( 3 / 4, (2 / 3) * (1 / 8) )
    in
    line
        [ line
            [ times 3 (line [ a 4 en, e 4 en, d 4 en, fs 4 en, cs 4 en, b 3 en, e 4 en, b 3 en, graceNote -1 (d 4 qn), cs 4 en, b 3 en ])
            , cs 4 (dhn + dhn)
            ]
        , line [ d 4 dhn, f 4 hn, gs 4 qn, fs 4 (hn + en), g 4 en, fs 4 en, e 4 en, cs 4 en, as_ 3 en, a 3 dqn ]
        , line [ as_ 3 en, cs 4 en, fs 4 en, e 4 en, fs 4 en, g 4 en, as_ 4 en, cs 5 (hn + en) ]
        , line [ d 5 en, cs 5 en, e 4 en, rest en, as_ 4 en, a 4 en, g 4 en, d 4 qn, c 4 en, cs 4 en ]
        , line [ fs 4 en, cs 4 en, e 4 en, cs 4 en, a 3 en, as_ 3 en, d 4 en, e 4 en, fs 4 en, graceNote 2 (e 4 qn), d 4 en ]
        , line [ graceNote 2 (d 4 qn), cs 4 en, graceNote 1 (cs 4 qn), b 3 (en + hn), cs 4 en, b 3 en ]
        , line [ fs 4 en, a 4 en, b 4 (hn + qn), a 4 en, fs 4 en, e 4 qn, d 4 en, fs 4 en, e 4 hn, d 4 hn, fs 4 qn ]
        , line [ cs 4 triplet, d 4 triplet, cs 4 triplet, b 3 ((dhn * 3) + hn) ]
        ]

There are a few things to note:

  • The melody also has some repetitions in the beginning which is a good fit for times.
  • In bars 2, 4, 6, 18 and 19 the graceNote function is used.
  • The nested line applications give more structure and help readability. Alteratively all the values could have been in one large single list.
  • The duration of an eighth note triplet is specified as 2⁄3 of the duration of an eighth note. (There are more advanced ways of denoting triplets, but this is simple and it works.)

Putting it together

song : Music
song =
    Par bassLine melody

Ok, this part was really easy 🙂

Performance

With the types and functions we have created so far we are able to store musical information in a data structure. But we cannot really do much more with it. We especially do not have a mechanism, yet, to interpret the music values as actual audible sound.

To do this, we first need to transform a Music value into a time-ordered list of MusicEvent.

Music events

A MusicEvent is a type alias especially designed to carry the minimum information needed to produce sound.

type alias Time =
    Float

type alias MusicEvent =
    { time : Time
    , pitch : Pitch
    , duration : Time
    }

A MusicEvent describes a single note with the following properties:

  • time is the point in time (in seconds, relative to the beginning of the musical piece) when the note should be played
  • pitch is the number of semitones above C-1 (same as above)
  • duration absolute duration of a note measured in seconds

Transformation to music events

We are going to implement the helper function merge first. We need this function to merge events that are played in parallel into a single time-ordered list. Here is a generic algorithm for merging two sorted lists into one:

merge : (a -> comparable) -> List a -> List a -> List a
merge compare list1 list2 =
    case ( list1, list2 ) of
        ( [], ys ) ->
            ys

        ( xs, [] ) ->
            xs

        ( x :: xs, y :: ys ) ->
            if compare x < compare y then
                x :: merge compare xs (y :: ys)

            else
                y :: merge compare ys (x :: xs)

Next we define Tempo, a type alias to hold information about tempo which consists of the number of beats per minute and the beat duration:

type alias Tempo =
    { duration : Duration, bpm : Int }

Now we can transform a Music value into a list of MusicEvent with musicToMEvents:

musicToMEvents : Time -> Tempo -> Music -> ( List MusicEvent, Time )
musicToMEvents currentTime tempo music =
    let
        metro =
            60 / (toFloat tempo.bpm * tempo.duration)
    in
    case music of
        Prim (Note duration pitch) ->
            ( [ MusicEvent currentTime pitch (duration * metro) ]
            , duration * metro
            )

        Prim (Rest duration) ->
            ( [], duration * metro )

        Seq music1 music2 ->
            let
                ( events1, duration1 ) =
                    musicToMEvents currentTime tempo music1

                ( events2, duration2 ) =
                    musicToMEvents (currentTime + duration1) tempo music2
            in
            ( events1 ++ events2, duration1 + duration2 )

        Par music1 music2 ->
            let
                ( events1, duration1 ) =
                    musicToMEvents currentTime tempo music1

                ( events2, duration2 ) =
                    musicToMEvents currentTime tempo music2
            in
            ( merge .time events1 events2, max duration1 duration2 )

Let's examine the implementation of musicToMEvents a bit more closely:

Parameters

musicToMEvents takes three parameters:

  • currentTime of type Time is the state (or context) that represents the current time, and that is threaded through the computation. The initial value is the starting time of the music performance. When musicToMEvents is called recursively the time is updated if necessary.
  • tempo of type Tempo is used to calculate the scale factor metro to convert relative durations (expressed in beats) into absolute durations (expressed in seconds).
  • And finally music of type Music, the value we operate on.

Return value

musicToMEvents returns a list of MusicEvent and a value of type Time which indicates the absolute duration (in seconds) of the music.

Function body

The main part of the interpretation of a music value is a pattern match with four cases:

  1. Prim (Note duration pitch): A note is transformed into a singleton list of MusicEvent and an absolute duration that is calculated from the metro scale factor and the note's duration.
  2. Prim (Rest duration): A rest is interpreted as an empty list and an absolute duration that is calculated from the metro scale factor and the rest's duration.
  3. Seq music1 music2: Two music values that are played in sequence are interpreted recursively one at a time. However, the start time of the second value is the end time of the first which is calculated by adding the absolute duration of the first music value to the current time.
    In the sequential case the results are combined by appending the list of events and adding up their durations.
  4. Par music1 music2: Two music values that are played in parallel are also interpreted one at a time. Contrary to the sequential case the start times of two music values are the same.
    In the parallel case the results are combined by merging the lists of events (with the previously defined merge function) and by determining the maximum of the two durations.

For convenience we define a function perform that sets the time to begin the performance to 0 and returns only the music events:

perform : Tempo -> Music -> List MusicEvent
perform tempo music =
    musicToMEvents 0 tempo music |> Tuple.first

Example

Music Example 1

Here is interpretation result of the example from above as a list of music events (serialized as JSON):

[
    { "time":0,   "pitch":64, "duration":0.5 },
    { "time":0,   "pitch":60, "duration":0.5 },
    { "time":0.5, "pitch":65, "duration":0.5 },
    { "time":0.5, "pitch":62, "duration":0.5 }
]

Audio performance with JS interop

When it comes to producing real audible sound in the browser, there are a number of different possibilities. E.g. we could use external MIDI devices with the Web MIDI API, or we could use the Web Audio API with something like Tone.js.

However, in this post we will focus on WebAudioFont which uses a sample-based synthesis and works in most browsers including mobile.

JSON encoder and ports

To interact with the WebAudioFont JavaScript library we have to use ports and we have to encode a music event as a JSON value:

port play : Value -> Cmd msg

encode : MusicEvent -> Value
encode event =
    Encode.object
        [ ( "time", Encode.float event.time )
        , ( "pitch", Encode.int event.pitch )
        , ( "duration", Encode.float event.duration )
        ]

playback : List MusicEvent -> Cmd msg
playback =
    Encode.list encode >> play

The JavaScript side

Here are the ~30 lines of JavaScript code that uses the WebAudioFont API and produced real audible sound in the browser.

const AudioContextFunc = window.AudioContext || window.webkitAudioContext
const audioContext = new AudioContextFunc()
const player = new WebAudioFontPlayer()

app.ports.play.subscribe(function (events) {
  const fenderRhodes = 45
  const volume = 90 / 127
  const info = player.loader.instrumentInfo(fenderRhodes)
  player.loader.startLoad(audioContext, info.url, info.variable)

  player.loader.waitLoad(function () {
    player.cancelQueue(audioContext)

    const startTime = audioContext.currentTime + 0.2

    events.forEach(function (event) {
      const time = startTime + event.time
      player.queueWaveTable(
        audioContext,
        audioContext.destination,
        window[info.variable],
        time,
        event.pitch,
        event.duration,
        volume
      )
    })
  })

  return false
})

Let's hear it!

That's all.

So without further ado, let's listen to the excerpt from Chick Corea's Children's Songs No. 6. Just hit the the Play button!

Note that the output is produced by the Elm application embedded in this post. This is not prerecorded.

Summary

In roughly 200 lines of pure Elm code (with no other dependencies than Json.Encode) we defined the types and functions that make it pleasant to work with quite complex musical values.

With a little bit of JavaScript that music can actually be made audible in the browser.

All of this is based on very sound concepts and the work that went into the Haskell Euterpea library. And there are a lot of things that we haven't covered in this article.

I'd love to turn this into an Elm package at some point. If you have any ideas for potential use or application, or any other suggestions I'd love to hear them!

There is more to come!