Deep MIDI

Real-Time MIDI Pattern Analysis for Augmented Improvisation


Overview

The purpose of this project is to capture and analyse MIDI data in real time and use it to enhance live improvised performance. The data includes note-on/note-off events as well as any changes taking place in MIDI-enabled effects units (e.g. values coming from an LFO filter knob on a synth, or a level change on a guitar effects pedal). All data is recorded, timestamped, and analysed using real-time deep learning models. These models are used to create new MIDI data, which is fed back to provide musical stimulus during performances.

This project does not seek to automate performances or to mimic the human ability to improvise. Instead, the idea is to find ways to augment the improvisation experience for musicians in live settings. When musicians play in ensemble settings, they are in a constant process of perceiving their own and others' musical decisions and responding to them. This project is about finding ways to enhance that process using real-time data produced by the musicians themselves.

This is a multidisciplinary project bringing together improvisation practice, real-time sound design, and advanced mathematics for analysing streaming data and discovering the latent patterns within it. The end goal is to find new ways to think about interactive creative performance, and to reframe the MIDI protocol as a language that can assist creative interactions. It aims to enable a new kind of deep listening in which performers can interact with emergent patterns in the data as it is created, and in which data can become an active participant in the creative process.

The title of this project, Deep MIDI, is a nod to Pauline Oliveros' work in this field, and her commitment to creating sonic experiences through which participants (be they listeners or performers) can find new ways to explore and experience music.

Existing research projects using MIDI data

There are a number of research projects currently taking place in Music Information Retrieval (MIR) which utilise MIDI data. Existing studies focus on pattern mining or on modelling performance after the data has been captured. Some of these are listed below.

| Researcher / Group | Project / Dataset | Focus / Contribution | Reference (APA7) |
| --- | --- | --- | --- |
| Hawthorne, C. (Google Magenta) | MAESTRO Dataset | High-quality aligned audio–MIDI data from piano performances for transcription and generation tasks. | Hawthorne, C. et al. (2018). *Enabling factorized piano music modeling and generation with the MAESTRO dataset*. ISMIR. |
| Simon, I., Roberts, A. (Google Magenta) | PerformanceRNN / Magenta MIDI Dataset | Expressive performance modeling and real-time sequence generation using recurrent neural networks. | Oore, S. et al. (2018). *This time with feeling: Learning expressive musical performance*. Neural Computing and Applications. |
| Donahue, C. (UCSD) | MIDI-DDSP | Neural synthesis conditioned on MIDI control parameters for expressive sound generation. | Huang, R., & Donahue, C. (2021). *MIDI-DDSP: Detailed control of musical performance via hierarchical modeling*. ISMIR. |
| Raffel, C. (Columbia / Google) | Lakh MIDI Dataset | Large-scale collection of MIDI files aligned with the Million Song Dataset for symbolic–audio mapping. | Raffel, C. (2016). *Learning-based methods for comparing sequences, with applications to audio-to-MIDI alignment and matching*. PhD thesis, Columbia University. |
| Kim, J., & Bello, J. P. (NYU) | URMP Dataset | Synchronized audio–video–MIDI dataset of chamber ensembles for multi-modal music analysis. | Li, B., Kim, J., & Bello, J. P. (2018). *URMP: A dataset for multimodal music performance analysis*. ISMIR. |
| MIDI Toolbox (Eerola & Toiviainen) | MATLAB MIDI Toolbox | Early symbolic music analysis environment; basis for subsequent MIR dataset handling. | Eerola, T., & Toiviainen, P. (2004). *MIDI Toolbox: MATLAB tools for music research*. University of Jyväskylä. |
| Huang, C. A., & Yang, Y.-H. (Academia Sinica) | POP909 Dataset | 909 full pop songs in aligned MIDI format for melody, harmony, and structure analysis. | Wang, Z. et al. (2020). *POP909: A pop-song dataset for music arrangement generation*. ISMIR. |
| Choi, K. et al. (Spotify Research) | GiantMIDI-Piano | Automated transcription of piano music into a large-scale MIDI dataset for generative models. | Kong, Q. et al. (2020). *GiantMIDI-Piano: A large-scale MIDI dataset for classical piano music*. ISMIR. |

Background Reading

Some interesting background reading for this topic is listed below.

Example architecture and related code

The diagram below captures one possible setup for MIDI and audio routing, which is the one I am currently using. There are of course many ways to set this kind of thing up; this is simply a working example currently used to collect data for analysis.

                   ┌─────────────────────┐
                   │  Morningstar        │
                   │     MC8 Pro         │
                   │ 4 Omniports + 5-pin │
                   └───────┬─────────────┘
                           │
                           ├──▶ Blooper [Ring-Active MIDI IN]
                           ├──▶ Cloudburst [Standard Type A MIDI IN]
                           ├──▶ Mood MK II [Ring-Active MIDI IN]
                           ├──▶ Lost & Found [Ring-Active MIDI IN]
                           └──▶ Timeline [5-pin MIDI IN] ──▶ MIDI THRU (5-pin) ──▶ GT-1000 Core [3.5mm MIDI IN] ──▶ MIDI THRU (3.5mm) ──▶ Arturia MicroFreak [3.5mm MIDI IN]

──────────────────────────────────────────────────────────────
Audio Path
──────────────────────────────────────────────────────────────
[Guitar] ──▶ LS-2 Input
             ├─A Output──▶ FX Loop 1 ──▶ GT-1000 Core ──▶ Return A (LS-2)
             │
             └─B Output──▶ Sonic Cake ABY Box ──▶ FX Loop 2 ─▶ Timeline ─▶ Cloudburst ─▶ Mood MK II ─▶ Source Audio EQ2 ─▶ Return B (LS-2)

Arturia MicroFreak/Octatrack/Korg Modwave ───────────┘ (joined via Sonic Cake ABY Box for FX Loop 2)

LS-2 Output ──▶ Blooper ─▶ AER Alpha Amp

This project is supported by a number of interconnected codebases that work together to collect data, undertake real-time analysis, feed data back to performers for further interaction, and drive data visualisation. These repositories allow the system to capture performance data, run machine learning models live, render dynamic visual feedback, and enable distributed collaboration across networked musicians.

1. Data collection and data analysis
Link to related code

The purpose of this Python script is to manage all MIDI data collection, messaging, and real-time data management. Built with mido and python-rtmidi, this environment listens continuously for incoming messages, including note, CC, and control data, and records them with microsecond precision into an SQLite database. Deep learning models (e.g., PyTorch, TensorFlow) can then process these streams to predict or generate new patterns, supporting live adaptive improvisation.
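
The sketch below illustrates the general shape of such a capture loop, assuming mido with the python-rtmidi backend and a local SQLite file; the port name, database path, and table schema here are illustrative rather than taken from the actual repository.

```python
# Minimal capture sketch: read incoming MIDI messages with mido and log them
# to SQLite with a microsecond timestamp. Port name and schema are hypothetical.
import sqlite3
import time

import mido

DB_PATH = "midi_events.db"        # hypothetical database file
PORT_NAME = "MC8 Pro MIDI 1"      # hypothetical input port name

def main():
    conn = sqlite3.connect(DB_PATH)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS events (
               ts_us INTEGER,      -- timestamp in microseconds
               msg_type TEXT,      -- note_on, note_off, control_change
               channel INTEGER,
               data1 INTEGER,      -- note number or CC number
               data2 INTEGER       -- velocity or CC value
           )"""
    )
    with mido.open_input(PORT_NAME) as port:
        for msg in port:           # blocks, yielding messages as they arrive
            ts_us = int(time.time() * 1_000_000)
            if msg.type in ("note_on", "note_off"):
                row = (ts_us, msg.type, msg.channel, msg.note, msg.velocity)
            elif msg.type == "control_change":
                row = (ts_us, msg.type, msg.channel, msg.control, msg.value)
            else:
                continue           # ignore clock, sysex, etc. in this sketch
            conn.execute("INSERT INTO events VALUES (?, ?, ?, ?, ?)", row)
            conn.commit()

if __name__ == "__main__":
    main()
```

A production version would batch commits and hand the rows to the analysis models rather than writing one message at a time.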

2. Real-time performance templates
Link to related code

The SuperCollider layer manages sound generation and performance interaction. It receives interpreted data from the Python process and responds with synthesized gestures, rhythmic patterns, and evolving textures. This setup enables non-linear improvisation structures, linking analytical and auditory domains in real time.
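
As a rough illustration of how interpreted features might be handed to the SuperCollider layer, the following sketch sends them over OSC using the python-osc package. OSC on sclang's default port, the /deepmidi/gesture address, and the feature names are all assumptions made for illustration, not the project's actual interface.

```python
# Hypothetical Python -> SuperCollider bridge using OSC (python-osc package).
from pythonosc.udp_client import SimpleUDPClient

SC_HOST, SC_PORT = "127.0.0.1", 57120   # sclang's default OSC port

client = SimpleUDPClient(SC_HOST, SC_PORT)

def send_gesture(density: float, register: int, cc_trend: float) -> None:
    """Forward an interpreted feature bundle for the SuperCollider layer to pick up."""
    client.send_message("/deepmidi/gesture", [density, register, cc_trend])

# Example: a sparse, low-register gesture with a rising CC contour.
send_gesture(0.2, 36, 0.8)
```

On the SuperCollider side, an OSCdef listening on the same address could map these values onto pattern or synth parameters.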

3. Visualisation
Link to related code

A Node.js environment provides interactive visual feedback using JavaScript libraries such as D3.js and Three.js. It displays rhythmic lattices, gesture maps, and timing densities, offering musicians a visual interface to explore the evolving structure of improvisations. The visualiser connects directly to the Python data stream for low-latency rendering.
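
One way the Python process could expose that data stream to the Node.js front end is over WebSockets. The sketch below uses the websockets package (version 10.1 or later) with a made-up endpoint and message shape, purely to show the pattern; the real feed would come from the live capture and analysis pipeline.

```python
# Hypothetical WebSocket feed for the visualiser (websockets >= 10.1 assumed).
import asyncio
import json

import websockets

async def stream_events(websocket):
    """Push placeholder event summaries once per second; a real version would
    read from the live SQLite/analysis pipeline instead."""
    beat = 0
    while True:
        payload = {"beat": beat, "density": 0.5, "active_ccs": [74, 11]}
        await websocket.send(json.dumps(payload))
        beat += 1
        await asyncio.sleep(1.0)

async def main():
    async with websockets.serve(stream_events, "127.0.0.1", 8765):
        await asyncio.Future()  # run until cancelled

if __name__ == "__main__":
    asyncio.run(main())
```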

4. Jamulus (Low-Latency Collaboration Config)
Link to related code

Configuration files define Jamulus-based network setups for online improvisation and rehearsal. These allow distributed musicians to connect to the Deep Improvisation environment with minimal latency, integrating the analytical and visual systems into remote collaborative performance.
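
As a small illustration of wiring such a setup into the rest of the system, the snippet below launches a headless Jamulus server from Python. It assumes the Jamulus binary is on the PATH and uses the standard --server, --nogui, and --port options with Jamulus's default port; the actual project configuration files may differ.

```python
# Hypothetical launcher for a headless Jamulus server (Jamulus binary on PATH).
import subprocess

def start_jamulus_server(port: int = 22124) -> subprocess.Popen:
    """Start a headless Jamulus server that remote performers can connect to."""
    return subprocess.Popen(
        ["Jamulus", "--server", "--nogui", "--port", str(port)]
    )

if __name__ == "__main__":
    proc = start_jamulus_server()
    proc.wait()
```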

Getting involved in this project

Coming soon