Procedurally Generated Music

by Thomas Kretzschmar

Generating music starts with generating data, and musical data is usually expressed in the MIDI standard. My first idea was to build a simple synthesizer in Unity with basic sine, square and triangle waveforms. Together with a white noise generator, these can function as different instruments. Features such as glide and portamento, as well as soundfonts and VSTs, would come at a later stage and are not strictly needed for procedural music.

Later in the project I switched to MIDI, because generating the audio with the synthesizer brought its own problems. Unity has no built-in MIDI functionality, so it has to be added. Given the time constraints of the project, and since five weeks is not nearly enough time to build that functionality in Unity myself, I used the DryWetMIDI library.

This project is based on 12-tone equal temperament (12-TET) with A = 440 Hz tuning. Other TET systems, xenharmonic tunings and just intonation are not part of this project.

Contents

  • Generating Audio
    • Simple Unity synthesizer
    • Metronome
  • Music logic
    • Notes and Modes
    • Chords
    • Chord Progressions
  • Converting the project to MIDI
    • DryWetMIDI setup
    • Generating songs through MIDI
  • Dungeon

Generating Audio

Starting this project I wanted to generate the audio myself. This means that I wanted to program a synthesizer that made beeps at certain frequencies, with different frequencies representing different notes. That meant I had to code not only my own synthesizer but also a sequencer to go with it.

Simple Unity Synthesizer

The logic behind the Unity synthesizer was made with the help of the "Unity3D analog style Synthesizer Tutorial" video listed in the sources.

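  using System;
  using UnityEngine;

  public class Oscillator : MonoBehaviour
  {
    public enum WaveForm
    {
      sineWave,
      squareWave,
      triangleWave,
      noise
    }

    public double frequency = 440.0; // A = 440 tuning
    public WaveForm waveform = WaveForm.sineWave;
    public float gain;
    public float volume = 0.1f;
    public int thisFreq;

    private double increment;
    private double phase;
    private double samplingFrequency = 48000.0;
    // System.Random is used because OnAudioFilterRead runs on the audio thread.
    private readonly System.Random noiseSource = new System.Random();

    private void Awake()
    {
      // Use the actual output rate instead of assuming 48 kHz.
      samplingFrequency = AudioSettings.outputSampleRate;
    }

    // Called by Unity for every audio buffer.
    private void OnAudioFilterRead(float[] data, int channels)
    {
      increment = frequency * 2.0 * Math.PI / samplingFrequency;
      for (int i = 0; i < data.Length; i += channels)
      {
        phase += increment;
        if (phase > 2.0 * Math.PI) // wrap the phase; resetting to 0 causes a small jump
        {
          phase -= 2.0 * Math.PI;
        }

        switch (waveform)
        {
          case WaveForm.sineWave:
            data[i] = gain * (float)Math.Sin(phase);
            break;
          case WaveForm.squareWave:
            // Square wave: the sign of the sine, scaled down by 0.6.
            data[i] = (Math.Sin(phase) >= 0.0 ? gain : -gain) * 0.6f;
            break;
          case WaveForm.triangleWave:
            // PingPong runs 0..1..0 over one period; rescale it to -1..1.
            data[i] = gain * (Mathf.PingPong((float)(phase / Math.PI), 1.0f) * 2.0f - 1.0f);
            break;
          case WaveForm.noise:
            // White noise: uniform random samples in -1..1.
            data[i] = gain * (float)(noiseSource.NextDouble() * 2.0 - 1.0);
            break;
          default:
            throw new NotImplementedException(waveform.ToString());
        }

        // Copy the first channel to the second one for stereo output.
        if (channels == 2)
        {
          data[i + 1] = data[i];
        }
      }
    }
  }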

As can be seen above, the synthesizer is merely a single oscillator that generates a single sound when the space bar is pressed (the key handling is not part of the listing). The waveform of the oscillator can be adjusted in the inspector.

The 3 different waveforms and a noise channel

The code block below shows a class with the frequencies of one octave, together with logic to derive the frequencies of the other 5 octaves. The << operator performs a left shift on a bit pattern, so 1 << n equals 2^n: moving up an octave doubles the frequency and moving down an octave halves it.

using System;

/// <summary>
/// Reference frequencies from middle octave
/// </summary>
static class Frequencies
{
  public const float C = 261.6256f;
  public const float Db = 277.1826f;
  public const float D = 293.6648f;
  public const float Eb = 311.1270f;
  public const float E = 329.6276f;
  public const float F = 349.2282f;
  public const float Gb = 369.9944f;
  public const float G = 391.9954f;
  public const float Ab = 415.3047f;
  public const float A = 440f;
  public const float Bb = 466.1638f;
  public const float B = 493.8833f;

  // Shifts a middle-octave (octave 4) frequency to the requested octave.
  public static float NoteInOctave(float freq, int octave)
  {
    // 1 << n equals 2^n, the factor between octaves that are n apart.
    int multiplier = 1 << Math.Abs(octave - 4);

    float result;
    if (octave < 4)
    {
      result = freq / multiplier;
    }
    else
    {
      result = freq * multiplier;
    }

    return result;
  }
}
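
The frequency table can also be derived instead of typed in. In 12-tone equal temperament every half step multiplies the frequency by 2^(1/12), so a note that is n half steps away from A4 has the frequency 440 * 2^(n/12). Below is a minimal sketch of that formula using MIDI note numbers, where 69 is A4. The class and method names are hypothetical and not part of the project.

```csharp
using System;

// Hypothetical helper, not part of the original project.
public static class EqualTemperament
{
    // MIDI note 69 is A4, tuned to 440 Hz.
    public const int A4MidiNote = 69;
    public const double A4Frequency = 440.0;

    // Every half step multiplies the frequency by 2^(1/12).
    public static double MidiNoteToFrequency(int midiNote)
    {
        return A4Frequency * Math.Pow(2.0, (midiNote - A4MidiNote) / 12.0);
    }
}
```

For MIDI note 60 (middle C) this gives roughly 261.6256 Hz, which matches the C constant in the table.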

Metronome

This metronome code is an example from the Unity documentation. I saw it as a good basis for a sequencer and a starting point for procedurally generating a musical score. The metronome ticks in a 4/4 pattern and plays each tick as a quarter note.

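using UnityEngine;

// Adapted from the metronome example in the Unity OnAudioFilterRead documentation.
public class Metronome : MonoBehaviour
{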

  public double bpm = 140.0F;
  public float gain = 0.5F;
  public int signatureHi = 4;
  public int signatureLo = 4;

  private double nextTick = 0.0F;
  private float amp = 0.0F;
  private float phase = 0.0F;
  private double sampleRate = 0.0F;
  private int accent;
  private bool running = false;
  Instrument _instrument;

  void Start()
  {
    _instrument = GetComponent<Instrument>();
    accent = signatureHi;
    double startTick = AudioSettings.dspTime;
    sampleRate = AudioSettings.outputSampleRate;
    nextTick = startTick * sampleRate;
    running = true;
  }

  void OnAudioFilterRead(float[] data, int channels)
  {
    if (!running)
      return;

    double samplesPerTick = sampleRate * 60.0F / bpm * 4.0F / signatureLo;
    double sample = AudioSettings.dspTime * sampleRate;
    int dataLen = data.Length / channels;
    int n = 0;
    while (n < dataLen)
    {
      float x = gain * amp * Mathf.Sin(phase);
      int i = 0;
      while (i < channels)
      {
        data[n * channels + i] += x;
        i++;
      }
      while (sample + n >= nextTick)
      {
        nextTick += samplesPerTick;
        amp = 1.0F;
        _instrument.generatedNote = _instrument.RandomNote();

        // Accent the first beat of every bar.
        if (++accent > signatureHi)
        {
          accent = 1;
          amp *= 2.0F;
        }
        Debug.Log("Tick: " + accent + "/" + signatureHi);
      }
      phase += amp * 0.3F;
      amp *= 0.993F;
      n++;
    }
  }
}
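
The timing in the metronome comes down to one formula: the number of audio samples between two ticks. The sketch below isolates that calculation so it can be checked by hand. The class and method names are hypothetical, and the 48 kHz sample rate in the example is an assumption; the real value comes from AudioSettings.outputSampleRate.

```csharp
using System;

// Hypothetical helper that isolates the samplesPerTick formula from the metronome.
public static class TickMath
{
    // sampleRate: samples per second; bpm: quarter-note beats per minute;
    // signatureLo: the note value that gets one tick (4 = quarter, 8 = eighth).
    public static double SamplesPerTick(double sampleRate, double bpm, int signatureLo)
    {
        return sampleRate * 60.0 / bpm * 4.0 / signatureLo;
    }
}
```

At 48000 Hz and 140 bpm a quarter-note tick lasts about 20571 samples. Setting signatureLo to 8 halves that, which is one way to get note lengths other than quarter notes.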

The sequencer made this way has a significant delay, caused by Unity's audio buffers. It is clearly audible in the video below.

Two metronomes with the same bpm and time signature, far enough out of sync to sound unmusical.

After a lot of time spent looking into things like Unity Burst and running native code, the fix turned out to be relatively simple: all instruments now use the same sequencer. That way there is only one metronome running, and every instrument takes its timing from it.
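
One way to picture the shared sequencer is a single clock object that raises an event on every tick, with each instrument subscribing to it. The sketch below is plain C# without Unity types. The names SharedSequencer, SequencedInstrument and Advance are hypothetical and not taken from the project.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: one clock, many listeners.
public sealed class SharedSequencer
{
    // Raised once per tick with the index of that tick.
    public event Action<int> Tick;

    private int tickIndex;

    // Called once per tick by whatever drives the timing (for example the metronome).
    public void Advance()
    {
        Tick?.Invoke(tickIndex);
        tickIndex++;
    }
}

public sealed class SequencedInstrument
{
    // Ticks this instrument has received, kept here only to show they stay in sync.
    public List<int> ReceivedTicks { get; } = new List<int>();

    public SequencedInstrument(SharedSequencer sequencer)
    {
        sequencer.Tick += tick => ReceivedTicks.Add(tick);
    }
}
```

Because every instrument reacts to the same tick, two instruments cannot drift apart the way two independent metronomes do.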

Notes & Modes

Now we start with the procedural music part. To start off, the sequencer generates a random note every quarter note. As seen in the video below, that's a huge mess.


We can let two instruments run completely randomly too, and that's an even bigger mess.


If we let the bass play in a lower register than the melody and with fewer notes, it already sounds more musical.



To make this a bit more pleasing to the ears we can start adding some music theory. We could hardcode a certain scale, but there is no fun in that. Instead there are templates for the modes, and a randomly generated number determines the root note of the mode or scale. (In music theory there is a difference between the two, but in this document I use them interchangeably.)

Generates a random root note.


Pro tip: make sure you generate the same root note for both instruments, or you will bust your head wondering why it sometimes generates more than the required 7 notes, as seen below.

To make sure both get the same root note, it must be generated only once and not once for each instrument it's on. We can do this in the sequencer's Awake method.

Personal source, my music cheat sheet document in reference to modes.

Above is a snapshot from the music cheat sheet I made for myself and use regularly. It shows the 7 most used modes in modern music.

The above table put into code. The previous version was 7 arrays, now compressed into one jagged array.

This means that we now pick a random mode from the jagged array, and then pick random notes from that mode to make up the melody as well as the bass.
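
As a rough illustration of the jagged array and the shared root note, the sketch below stores each mode as half steps above the root and picks the key once, so that every instrument draws from the same seven notes. It assumes the table lists the seven diatonic modes. The names Modes, SongKey and RandomNote are hypothetical and do not come from the project.

```csharp
using System;

public static class Modes
{
    // Half steps above the root for each of the seven diatonic modes.
    public static readonly int[][] Intervals =
    {
        new[] { 0, 2, 4, 5, 7, 9, 11 }, // Ionian (major)
        new[] { 0, 2, 3, 5, 7, 9, 10 }, // Dorian
        new[] { 0, 1, 3, 5, 7, 8, 10 }, // Phrygian
        new[] { 0, 2, 4, 6, 7, 9, 11 }, // Lydian
        new[] { 0, 2, 4, 5, 7, 9, 10 }, // Mixolydian
        new[] { 0, 2, 3, 5, 7, 8, 10 }, // Aeolian (natural minor)
        new[] { 0, 1, 3, 5, 6, 8, 10 }, // Locrian
    };
}

// Created once (for example by the sequencer) and shared by every instrument.
public sealed class SongKey
{
    public int RootNote { get; } // pitch class, 0 = C ... 11 = B
    public int[] Mode { get; }

    public SongKey(Random random)
    {
        RootNote = random.Next(0, 12);
        Mode = Modes.Intervals[random.Next(0, Modes.Intervals.Length)];
    }

    // Returns a random pitch class (0-11) that belongs to the key.
    public int RandomNote(Random random)
    {
        int step = Mode[random.Next(0, Mode.Length)];
        return (RootNote + step) % 12;
    }
}
```

If the melody and the bass each created their own SongKey, they could end up with different root notes, which is how more than 7 distinct notes show up.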

The randomly picked mode with random notes from this mode in action.

This is starting to sound a lot like music! However, it is at the same time too random and too restrictive: too restrictive because it only picks from one mode, and too random because there is no progression, leitmotif or any other method to this madness. The metronome also only works in quarter notes. A clear to-do list.

Music Logic

For there to be music we have to think about what components music consists of. Chords, notes, modes and song structure are the fundamentals of most music.

Chords

The first thing we'll work on are chords. Chords in essence are multiple notes played at the same time. This can be from several instruments, think of a big band where brass instruments together play several different notes at the same time, or from one instrument, think of a guitar strumming several notes at the same time.

To translate this into code we will turn our mono synth into a polysynth. In other words we will give the instrument class the ability to send data to different oscillators. To start off we will make a class containing intervals of some of the most used chords.

There are two ways this can be done: from the perspective of the respective mode, as most musicians view it, or as chromatic intervals. Take a simple major triad as an example: it is the 1, 3 and 5 of the major scale (Ionian mode). Counted on the chromatic scale (all notes) it is the 1, 5 and 8, that is 0, 4 and 7 half steps above the root. This is a difficult decision to make. The first option makes it easier to build on knowledge from a music perspective, but it creates an unnecessary dependency between the ChordIntervals and Modes classes.
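
The two options can be put side by side in a small sketch. It counts half steps from 0, so the chromatic "1, 5 and 8" becomes 0, 4 and 7. The class name ChordSketch and its members are hypothetical; the project's ChordIntervals class is not shown in this post and may look different.

```csharp
using System;

public static class ChordSketch
{
    // Ionian (major scale) as half steps above the root.
    public static readonly int[] Ionian = { 0, 2, 4, 5, 7, 9, 11 };

    // Option 1: 1-based scale degrees, resolved through a mode.
    public static int[] FromDegrees(int[] mode, params int[] degrees)
    {
        var halfSteps = new int[degrees.Length];
        for (int i = 0; i < degrees.Length; i++)
        {
            int index = degrees[i] - 1;
            // Degrees above 7 (9ths, 11ths, ...) continue in the next octave.
            halfSteps[i] = mode[index % mode.Length] + 12 * (index / mode.Length);
        }
        return halfSteps;
    }

    // Option 2: chromatic intervals stored directly, independent of any mode.
    public static readonly int[] MajorTriad = { 0, 4, 7 };
    public static readonly int[] MinorTriad = { 0, 3, 7 };
}
```

FromDegrees(ChordSketch.Ionian, 1, 3, 5) yields the same 0, 4, 7 as MajorTriad, but it needs a mode to do so, which is exactly the dependency described above.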

doesn't sound too pleasing

This is a pretty decent solution for modes, but for chords and extensions it might be easier to create several jagged arrays: one for major chords and their extensions, one for minor chords and their extensions, and so on. This also makes things easier from a music theory perspective: when the computer should generate a major chord, it can pick any chord from the major-chord jagged array and it will sound decent enough, which can't be said for substituting a minor chord.

Chord Progressions

Depending on the mode we are in, we have to use different chords for a chord progression. This is easily visualised in the graph I made.

Let's take a generic 1-5-6-4 progression. Different modes have different chords that fit in that mode.
An Ionian 6 will be a vi7, but a Dorian 6 will be a vi7b5. This is not as rigid as it seems at first glance. For example, if your verse is in Ionian it can easily make use of chords that are not in Ionian: you could take a dominant II chord and it can still fit the song. This is called modal interchange. For more information on this topic I recommend the Modal Interchange video by Music with Myles listed in the sources.

To start off with, we will stick to the diatonic chords of the mode we are in. This means we can use a switch statement on the mode, and pick a chord progression inside each case.
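
As an alternative illustration to a switch per mode, the diatonic chords can be computed from the mode itself by stacking every other note of the scale. This is a sketch of that general technique, not the project's implementation. The names DiatonicChords, Triad and Quality are hypothetical.

```csharp
using System;

public static class DiatonicChords
{
    // Builds the triad on a 1-based scale degree by taking every other note of the mode.
    // The result is in half steps above the root note of the mode.
    public static int[] Triad(int[] mode, int degree)
    {
        var notes = new int[3];
        for (int i = 0; i < 3; i++)
        {
            int index = (degree - 1) + 2 * i;
            // Notes past the seventh degree continue in the next octave.
            notes[i] = mode[index % mode.Length] + 12 * (index / mode.Length);
        }
        return notes;
    }

    // Names the triad quality from the sizes of its two stacked thirds.
    public static string Quality(int[] triad)
    {
        int lower = triad[1] - triad[0];
        int upper = triad[2] - triad[1];
        if (lower == 4 && upper == 3) return "major";
        if (lower == 3 && upper == 4) return "minor";
        if (lower == 3 && upper == 3) return "diminished";
        if (lower == 4 && upper == 4) return "augmented";
        return "other";
    }
}
```

For a 1-5-6-4 progression, the Ionian intervals { 0, 2, 4, 5, 7, 9, 11 } give major, major, minor, major, while the Dorian intervals { 0, 2, 3, 5, 7, 9, 10 } give minor, minor, diminished, major. The diminished triad on the Dorian 6 is the basis of the vi7b5 mentioned above.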

jaggedarray filled with modes
jaggedarray filled with chord progressions

Converting the project to midi

Unity's way of handling audio caused a lot of issues for my project, from audio buffers to generation issues. This is the reason I chose to start using the MIDI protocol. This could be done using FMOD, but I chose DryWetMIDI to avoid FMOD's steep learning curve.
The code for DryWetMIDI can be viewed on its GitHub page.

DryWetMidi Setup

The first issue we run into is that the library is very static. As seen in the image above, it relies heavily on static readonly members for things like intervals. Since we are building a MIDI file, we can't simply take an easy way out, such as a switch or if statements, to generate a random interval for us.
Let's start off by seeing what we can do without tinkering in the library itself.

.SetRootNote(Melanchall.DryWetMidi.MusicTheory.Note.Get((NoteName)NoteNames.GetValue(random.Next(0,NoteNames.Length)), 3))
 .Note(Interval.FromHalfSteps(random.Next(0, 12)))

Here we set a random root note and play a random note. The random root note is a random element taken from an enum.

Type type = typeof(NoteName);
Array NoteNames = type.GetEnumValues();

This way of making the pattern is very static and hard to work with. First we need to remove some of the beauty of this code. It might seem redundant to work this way, but it's necessary to be able to randomly generate parts of the code. We put the patternBuilder in front of every dot and terminate each statement.

    patternBuilder.SetRootNote(Melanchall.DryWetMidi.MusicTheory.Note.Get((NoteName)NoteNames.GetValue(random.Next(0, NoteNames.Length)), 3)); //Sets a random root note 
    //.SetRootNote(Melanchall.DryWetMidi.MusicTheory.Note.Get(NoteName.GSharp, 3));
    patternBuilder.Note(Interval.FromHalfSteps(random.Next(0, 12)));
    patternBuilder.Note(Interval.FromHalfSteps(random.Next(0, 12)));
    patternBuilder.Note(Interval.FromHalfSteps(random.Next(0, 12)));
    patternBuilder.Note(Interval.FromHalfSteps(random.Next(0, 12)));

Working this way means there's also room for things like switches and if statements, meaning we can bring some logic from the old structure into this new usage with the midi library. Think of the song structures, chord progressions and modes mentioned above.

Integrating the previous mode logic with the new midi functions, this took way longer than I'd like to admit

Generating songs through midi

To generate sound through MIDI we first have to figure out how to generate music. The image below illustrates the generation process.

Some notes on how the randomisation will work and what to visualise.
This checks the chord progression and the mode; depending on the mode, it plays a certain arpeggio.

The above code plays the 1 chord just fine, without hardcoding. If you want the 5 chord of a Lydian progression, you can easily code it as being 7 half steps removed from the root, since it's a similar chord. This is true for every mode except Locrian, whose fifth degree is flattened and sits 6 half steps above the root.

This means that instead of declaring the interval right after case 4: in the switch statement, we have to declare it in each if statement. This is true for all cases except 0, since there are no sharp or flat 1s, as seen in the graph above.

This turned into 4 bar loops.

2 for loops each representing 4 bars.

pseudo random bassline

The pseudo random bassline picks from three options: the root only, the root twice, or the root and the fifth of the current chord in the chord progression.
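
The three options can be sketched as a single random choice. The code below only produces half-step offsets for one bar and leaves out note lengths and the MIDI output. The names BasslineSketch, Bar and fifthInterval are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class BasslineSketch
{
    // chordRoot: half steps between the song's root note and the current chord's root.
    // fifthInterval: size of the chord's fifth (7 for a perfect fifth, 6 for a flattened one).
    public static List<int> Bar(Random random, int chordRoot, int fifthInterval = 7)
    {
        var notes = new List<int>();
        switch (random.Next(0, 3))
        {
            case 0: // root only
                notes.Add(chordRoot);
                break;
            case 1: // root twice
                notes.Add(chordRoot);
                notes.Add(chordRoot);
                break;
            default: // root and fifth
                notes.Add(chordRoot);
                notes.Add(chordRoot + fifthInterval);
                break;
        }
        return notes;
    }
}
```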

This worked. Later on we made a quick and dirty version, with a switch in a switch in a switch.

dirty chord generating.

We wanted to make it a bit less dirty by filling an array with the jagged arrays for each chord.

The problem arises when giving these chords sharps and flats, which makes the code more complicated while keeping the same structure.

I made three ways to play the chords: a downward arpeggio, an upward arpeggio, or randomly played notes from the chord. The first two always produce a decent and stable part. The random option almost never does, but it adds some flavour to something otherwise very stale. To prevent a mess, the code checks whether the previous pattern was random and, if so, does not pick random again. The downward arpeggio is made with a reversed for loop.
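
A minimal sketch of the three play styles and the "no random twice in a row" rule follows. The chord is given as half-step offsets, and the output is just the order in which the notes would be played. The names PlayStyle, ChordPlayer, NextStyle and Play are hypothetical and not taken from the project.

```csharp
using System;
using System.Collections.Generic;

public enum PlayStyle { Up, Down, RandomNotes }

public sealed class ChordPlayer
{
    private readonly Random random;
    private PlayStyle previous = PlayStyle.Up;

    public ChordPlayer(Random random)
    {
        this.random = random;
    }

    // Picks a style, but never RandomNotes twice in a row.
    public PlayStyle NextStyle()
    {
        var style = (PlayStyle)random.Next(0, 3);
        if (style == PlayStyle.RandomNotes && previous == PlayStyle.RandomNotes)
        {
            style = (PlayStyle)random.Next(0, 2); // Up or Down only
        }
        previous = style;
        return style;
    }

    public List<int> Play(int[] chord, PlayStyle style)
    {
        var notes = new List<int>();
        switch (style)
        {
            case PlayStyle.Up:
                for (int i = 0; i < chord.Length; i++) notes.Add(chord[i]);
                break;
            case PlayStyle.Down:
                // Reversed for loop: start at the last chord tone and walk back.
                for (int i = chord.Length - 1; i >= 0; i--) notes.Add(chord[i]);
                break;
            default:
                // Random chord tones; the same tone may be picked more than once.
                for (int i = 0; i < chord.Length; i++) notes.Add(chord[random.Next(0, chord.Length)]);
                break;
        }
        return notes;
    }
}
```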

three ways of playing a chord.

Dungeon

To show the proof of concept of having several songs generated we need at least two dungeon rooms. These rooms do not have to be randomly generated; only the music has to be. Together with a player object with very basic movement and collision with the rooms, we can make this happen.

Two dungeon rooms, a player cube in between and text as feedback on what is currently happening musically.

Sadly this breaks the game. DryWetMIDI does not allow the creation of multiple MIDI files. To clarify: since the device is already in use by the previous room, it can't play the MIDI file of the next room. The only way to get this done is by completely discarding the last room's file, which, unfortunately for this project, is not rendered live but pre-rendered. This means that there is no way to render a new MIDI file after the previous one has been discarded.

Reflecting

In the end the results are not what I expected going into this project. If I had to do it over, I would make the jump to FMOD right away. That would remove a lot of the restrictions I ran into during the project, and also save the time I spent trying to generate the music entirely from scratch.

Another thing I would do differently is to define up front what makes something procedurally generated. I learned that something does not need full randomness to count as procedurally generated, and that steering the generation in a certain direction is actually beneficial.

Sources

Overview | DryWetMIDI. (2022). Github.io. https://melanchall.github.io/drywetmidi/

Procedural Music Generation - PROCJAM Tutorials. (n.d.). Www.procjam.com. Retrieved November 7, 2022, from https://www.procjam.com/tutorials/en/music/

Unity - Scripting API: MonoBehaviour.OnAudioFilterRead(float[], int). (n.d.). Docs.unity3d.com. Retrieved November 7, 2022, from https://docs.unity3d.com/2017.1/Documentation/ScriptReference/MonoBehaviour.OnAudioFilterRead.html

Modal Interchange | Music with Myles. (2017). [YouTube Video]. In YouTube. https://www.youtube.com/watch?v=1dRA28cdt5c

Unity3D analog style Synthesizer Tutorial. (n.d.). Www.youtube.com. Retrieved November 7, 2022, from https://www.youtube.com/watch?v=GqHFGMy_51c
