# Elm And The Algorithm Of Music

In this article I would like to present a minimal implementation of a music data type and everything that is needed to turn that into audible sound from an Elm application.

We will see how to transcribe an existing composition - an excerpt from Chick Corea's Children's Songs No. 6 - and listen to the result right here, embedded in this article.

## From a music data type to performance

My colleague Jonas recently pointed out the presentation Making Algorithmic Music by Donya Quick to me. Donya Quick shows how she uses the Haskell library Euterpea to produce algorithmic music.

It got me really excited about the idea of porting this to Elm and to be able to use this in web applications.

In the following we will see the core data types and algorithms from Euterpea ported to Elm. To focus on the core concepts the implementation is stripped down to the minimum that is required to transcribe and perform an existing polyphonic piece of music (for a single instrument).

## Music data type

Let's first define a type that represents a primitive musical value which can be either a note or a rest:

``````type alias Duration =
Float

type alias Pitch =
Int

type Primitive
= Note Duration Pitch
| Rest Duration
``````

A `Duration` is expressed as a `Float` and represents the note value (the relative duration of a note). A quarter note e.g. can be defined by the expression `1/4` or the literal `0.25`, an eighth note by `1/8` or `0.125`, and so on. I prefer using `Float` over a - potentially more accurate - type for rational numbers because `Float` is simple and easy to work with in Elm.

`Pitch` is an integer that represents the number of semitones from C-1 (five octaves below C4 or Middle C). So C-1 corresponds to pitch 0, and C4 to 60, A4 to 69, and so on. This convention is also used by most MIDI implementations.

With the `Primitive` type we can now define lots of primitive musical values. However, we cannot yet express how they should be composed together to form a larger - more complex - musical value.

There are two fundamentally different ways of combining notes (and rests). They can be played either in sequence, or simultaneously. Let's implement this as a custom type:

``````type Music
= Prim Primitive
| Seq Music Music
| Par Music Music
``````

`Music` has three constructors:

• `Prim` is a constructor for a primitive musical value (a note or a rest).
• `Seq` takes two arguments of type `Music` that are played in sequence.
• `Par` takes two arguments of type `Music` that are played in parallel.

Note that the definition of `Music` is recursive which allows us to easily combine musical values with each other to form larger structures that again can be composed together, and so on.

#### Example

Here is one way to encode this small musical figure:

``````m : Music
m =
Seq
(Par (Prim (Note (1 / 4) 60)) (Prim (Note (1 / 4) 64)))
(Par (Prim (Note (1 / 4) 62)) (Prim (Note (1 / 4) 65)))
``````

### Constructors for notes and syntactic sugar

The `Music` type is already pretty powerful and we can use it to describe quite complex pieces of music. However, the usability is not very good, yet. This is partly because the pitch numbers are not very readable.

The first thing we need for better syntax is a constructor for a note that returns a value of type `Music`:

``````note : Duration -> Pitch -> Music
note duration pitch =
Prim (Note duration pitch)
``````

After we define `Octave` as a synonym for `Int` we can implement functions to easily construct notes, given an octave and a duration:

``````type alias Octave =
Int

c : Octave -> Duration -> Music
c octave duration =
note duration (12 * (octave + 1) + 0)

cs : Octave -> Duration -> Music
cs octave duration =
note duration (12 * (octave + 1) + 1)

df : Octave -> Duration -> Music
df octave duration =
note duration (12 * (octave + 1) + 1)

-- etc. ...
-- ...

b : Octave -> Duration -> Music
b octave duration =
note duration (12 * (octave + 1) + 11)
``````

There is a function for each combination of a letter name (`[c, d, e, f, g, a, b]`) and an optional suffix that indicates an accidental, `f` for a flat (♭), or `s` for a sharp (♯).

Likewise it is useful to have constructors for durations (and rests), that will not be listed here for the sake of brevity.

#### Example

With these helper functions the example from above becomes much more readable:

``````qn : Duration
qn =
1 / 4

m : Music
m =
Seq
(Par (c 4 qn) (e 4 qn))
(Par (d 4 qn) (f 4 qn))
``````

### More convenient functions

To make it even more convenient to construct and combine `Music` values we define the functions `rest`, `line`, and `times` as follows:

``````{-| Create a rest where duration specifies the length of the pause.
-}
rest : Duration -> Music
rest duration =
Prim (Rest duration)

{-| Create a sequence from a list of Music values.
-}
line : List Music -> Music
line =
List.foldr Seq (rest 0)

{-| Repeat a Music value for a particular number of times.
-}
times : Int -> Music -> Music
times n music =
List.repeat n music |> line
``````

In anticipation of our goal to perform Children's Songs No. 6 we implement another function that makes it easy to create grace notes:

``````{-| Create a grace note with 1/8 of the duration of the principle note.
The pitch of the grace notes is n semitones higher (or lower) than the principle note.
-}
graceNote : Int -> Music -> Music
graceNote n music =
case music of
Prim (Note duration pitch) ->
Seq
(note (duration / 8) (pitch + n))
(note ((duration / 8) * 7) pitch)

_ ->
music
``````

## Transcribing music

In only a few lines of code we have defined a data structure - and some convenient functions - that is powerful enough to represent a complex musical score.

So let's go ahead and transcribe an excerpt of Chick Corea's composition Children's Songs No. 6.

Here is the sheet music for the first 28 bars of Children's Songs No. 6:

### Base line

The base line has a lot of repetitions. It consists of only three different phrases that are each repeated several times which is a perfect fit to apply the `times` function.

We can encode the complete 28 bars of the base line as follows:

``````dqn : Duration
dqn =
3 / 8

bassLine : Music
bassLine =
let
b1 =
line [ b 2 dqn, fs 3 dqn, g 3 dqn, fs 3 dqn ]

b2 =
line [ b 2 dqn, es 3 dqn, fs 3 dqn, es 3 dqn ]

b3 =
line [ as_ 2 dqn, fs 3 dqn, g 3 dqn, fs 3 dqn ]
in
line [ times 3 b1, times 2 b2, times 4 b3, times 5 b1 ]
``````

### Melody

The encoding of the melody follows the same principles:

``````melody : Music
melody =
let
( en, qn, hn ) =
( 1 / 8, 1 / 4, 1 / 2 )

( dhn, triplet ) =
( 3 / 4, (2 / 3) * (1 / 8) )
in
line
[ line
[ times 3 (line [ a 4 en, e 4 en, d 4 en, fs 4 en, cs 4 en, b 3 en, e 4 en, b 3 en, graceNote -1 (d 4 qn), cs 4 en, b 3 en ])
, cs 4 (dhn + dhn)
]
, line [ d 4 dhn, f 4 hn, gs 4 qn, fs 4 (hn + en), g 4 en, fs 4 en, e 4 en, cs 4 en, as_ 3 en, a 3 dqn ]
, line [ as_ 3 en, cs 4 en, fs 4 en, e 4 en, fs 4 en, g 4 en, as_ 4 en, cs 5 (hn + en) ]
, line [ d 5 en, cs 5 en, e 4 en, rest en, as_ 4 en, a 4 en, g 4 en, d 4 qn, c 4 en, cs 4 en ]
, line [ fs 4 en, cs 4 en, e 4 en, cs 4 en, a 3 en, as_ 3 en, d 4 en, e 4 en, fs 4 en, graceNote 2 (e 4 qn), d 4 en ]
, line [ graceNote 2 (d 4 qn), cs 4 en, graceNote 1 (cs 4 qn), b 3 (en + hn), cs 4 en, b 3 en ]
, line [ fs 4 en, a 4 en, b 4 (hn + qn), a 4 en, fs 4 en, e 4 qn, d 4 en, fs 4 en, e 4 hn, d 4 hn, fs 4 qn ]
, line [ cs 4 triplet, d 4 triplet, cs 4 triplet, b 3 ((dhn * 3) + hn) ]
]
``````

There are a few things to note:

• The melody also has some repetitions in the beginning which is a good fit for `times`.
• In bars 2, 4, 6, 18 and 19 the `graceNote` function is used.
• The nested `line` applications give more structure and help readability. Alteratively all the values could have been in one large single list.
• The duration of an eighth note triplet is specified as 2⁄3 of the duration of an eighth note. (There are more advanced ways of denoting triplets, but this is simple and it works.)

### Putting it together

``````song : Music
song =
Par bassLine melody
``````

Ok, this part was really easy 🙂

## Performance

With the types and functions we have created so far we are able to store musical information in a data structure. But we cannot really do much more with it. We especially do not have a mechanism, yet, to interpret the music values as actual audible sound.

To do this, we first need to transform a `Music` value into a time-ordered list of `MusicEvent`.

### Music events

A `MusicEvent` is a type alias especially designed to carry the minimum information needed to produce sound.

``````type alias Time =
Float

type alias MusicEvent =
{ time : Time
, pitch : Pitch
, duration : Time
}
``````

A `MusicEvent` describes a single note with the following properties:

• `time` is the point in time (in seconds, relative to the beginning of the musical piece) when the note should be played
• `pitch` is the number of semitones above C-1 (same as above)
• `duration` absolute duration of a note measured in seconds

### Transformation to music events

We are going to implement the helper function `merge` first. We need this function to merge events that are played in parallel into a single time-ordered list. Here is a generic algorithm for merging two sorted lists into one:

``````merge : (a -> comparable) -> List a -> List a -> List a
merge compare list1 list2 =
case ( list1, list2 ) of
( [], ys ) ->
ys

( xs, [] ) ->
xs

( x :: xs, y :: ys ) ->
if compare x < compare y then
x :: merge compare xs (y :: ys)

else
y :: merge compare ys (x :: xs)
``````

Next we define `Tempo`, a type alias to hold information about tempo which consists of the number of beats per minute and the beat duration:

``````type alias Tempo =
{ duration : Duration, bpm : Int }
``````

Now we can transform a `Music` value into a list of `MusicEvent` with `musicToMEvents`:

``````musicToMEvents : Time -> Tempo -> Music -> ( List MusicEvent, Time )
musicToMEvents currentTime tempo music =
let
metro =
60 / (toFloat tempo.bpm * tempo.duration)
in
case music of
Prim (Note duration pitch) ->
( [ MusicEvent currentTime pitch (duration * metro) ]
, duration * metro
)

Prim (Rest duration) ->
( [], duration * metro )

Seq music1 music2 ->
let
( events1, duration1 ) =
musicToMEvents currentTime tempo music1

( events2, duration2 ) =
musicToMEvents (currentTime + duration1) tempo music2
in
( events1 ++ events2, duration1 + duration2 )

Par music1 music2 ->
let
( events1, duration1 ) =
musicToMEvents currentTime tempo music1

( events2, duration2 ) =
musicToMEvents currentTime tempo music2
in
( merge .time events1 events2, max duration1 duration2 )
``````

Let's examine the implementation of `musicToMEvents` a bit more closely:

#### Parameters

`musicToMEvents` takes three parameters:

• `currentTime` of type `Time` is the state (or context) that represents the current time, and that is threaded through the computation. The initial value is the starting time of the music performance. When `musicToMEvents` is called recursively the time is updated if necessary.
• `tempo` of type `Tempo` is used to calculate the scale factor `metro` to convert relative durations (expressed in beats) into absolute durations (expressed in seconds).
• And finally `music` of type `Music`, the value we operate on.

#### Return value

`musicToMEvents` returns a list of `MusicEvent` and a value of type `Time` which indicates the absolute duration (in seconds) of the music.

#### Function body

The main part of the interpretation of a music value is a pattern match with four cases:

1. `Prim (Note duration pitch)`: A note is transformed into a singleton list of `MusicEvent` and an absolute duration that is calculated from the metro scale factor and the note's duration.
2. `Prim (Rest duration)`: A rest is interpreted as an empty list and an absolute duration that is calculated from the metro scale factor and the rest's duration.
3. `Seq music1 music2`: Two music values that are played in sequence are interpreted recursively one at a time. However, the start time of the second value is the end time of the first which is calculated by adding the absolute duration of the first music value to the current time.
In the sequential case the results are combined by appending the list of events and adding up their durations.
4. `Par music1 music2`: Two music values that are played in parallel are also interpreted one at a time. Contrary to the sequential case the start times of two music values are the same.
In the parallel case the results are combined by merging the lists of events (with the previously defined `merge` function) and by determining the maximum of the two durations.

For convenience we define a function `perform` that sets the time to begin the performance to `0` and returns only the music events:

``````perform : Tempo -> Music -> List MusicEvent
perform tempo music =
musicToMEvents 0 tempo music |> Tuple.first
``````

#### Example

Here is interpretation result of the example from above as a list of music events (serialized as JSON):

``````[
{ "time":0,   "pitch":64, "duration":0.5 },
{ "time":0,   "pitch":60, "duration":0.5 },
{ "time":0.5, "pitch":65, "duration":0.5 },
{ "time":0.5, "pitch":62, "duration":0.5 }
]
``````

### Audio performance with JS interop

When it comes to producing real audible sound in the browser, there are a number of different possibilities. E.g. we could use external MIDI devices with the Web MIDI API, or we could use the Web Audio API with something like Tone.js.

However, in this post we will focus on WebAudioFont which uses a sample-based synthesis and works in most browsers including mobile.

#### JSON encoder and ports

To interact with the WebAudioFont JavaScript library we have to use ports and we have to encode a music event as a JSON value:

``````port play : Value -> Cmd msg

encode : MusicEvent -> Value
encode event =
Encode.object
[ ( "time", Encode.float event.time )
, ( "pitch", Encode.int event.pitch )
, ( "duration", Encode.float event.duration )
]

playback : List MusicEvent -> Cmd msg
playback =
Encode.list encode >> play
``````

#### The JavaScript side

Here are the ~30 lines of JavaScript code that uses the WebAudioFont API and produced real audible sound in the browser.

``````const AudioContextFunc = window.AudioContext || window.webkitAudioContext
const audioContext = new AudioContextFunc()
const player = new WebAudioFontPlayer()

app.ports.play.subscribe(function (events) {
const fenderRhodes = 45
const volume = 90 / 127

player.cancelQueue(audioContext)

const startTime = audioContext.currentTime + 0.2

events.forEach(function (event) {
const time = startTime + event.time
player.queueWaveTable(
audioContext,
audioContext.destination,
window[info.variable],
time,
event.pitch,
event.duration,
volume
)
})
})

return false
})
``````

## Let's hear it!

That's all.

So without further ado, let's listen to the excerpt from Chick Corea's Children's Songs No. 6. Just hit the the Play button!

Note that the output is produced by the Elm application embedded in this post. This is not prerecorded.

## Summary

In roughly 200 lines of pure Elm code (with no other dependencies than `Json.Encode`) we defined the types and functions that make it pleasant to work with quite complex musical values.

With a little bit of JavaScript that music can actually be made audible in the browser.

All of this is based on very sound concepts and the work that went into the Haskell Euterpea library. And there are a lot of things that we haven't covered in this article.

I'd love to turn this into an Elm package at some point. If you have any ideas for potential use or application, or any other suggestions I'd love to hear them!

There is more to come!