This paper details a collaboration with Arca to advance generative MIDI models, resulting in a seven-hour MRP installation showcasing expressive, uncanny MIDI continuations created using Notochord.
Generative MIDI models are increasingly prevalent as creative compositional aids and musical performance tools, learning from large MIDI datasets to offer completions, continuations, in-painting and more. However, their expressive capabilities are limited by the low fidelity and dimensionality of the MIDI protocol itself, and they perform less well in unstructured improvisatory scenarios because they are trained mainly on structured, bar-aligned compositions. We present an approach to overcoming these limitations, which emerged from a collaboration with the artist Arca in which her Magnetic Resonator Piano (MRP) performances were turned into a seven-hour installation. The real-time probabilistic model Notochord learned in-context from Arca's playing and, under further structural constraints, improvised new MIDI continuations. Arca's continuous gestural data from the MRP were then re-mapped onto the new MIDI notes, creating endless renditions that retained gestural subtlety. We reflect on the various uncanny aspects of the installation and how this approach could be taken further, and share code1 and notebooks2 documenting our process for others to build on.
Installation art, Magnetic Resonator Piano, Notochord, Artificial intelligence, Deep learning, Generative AI, MIDI, Intelligent musical instruments
This paper presents a novel exploration at the intersection of generative AI and musical performance, specifically through a collaboration that integrates the unique qualities of the Magnetic Resonator Piano (MRP) [1] with the capabilities of generative MIDI models, in this case Notochord [2]. Highlighted in this project is an installation — MRP Day Activations — developed in collaboration with the artist Arca as part of her multifaceted event The Light Comes in the Name of the Voice, demonstrating the potential of integrating AI into instrumental performance to produce new forms of musical expression. The installation sought to elaborate on recordings of Arca’s MRP performances, through a seven-hour continuous experience. Arca’s event served as a platform for artistic exploration and also as a contribution to the ongoing dialogue about the role of AI in pop music and live performance settings. In sharing our experience of designing the MRP installation aspect of the event, we aim to shed light on the potential synergies between AI-generated content and live musical expression.
Our contributions are the following:
We contribute to the practice of creative AI for pop music, reporting on a unique installation with Arca combining generative AI and the magnetic resonator piano.
We enhance accessibility to advanced musical interfaces by releasing the tools for playing the MRP using the Notochord MIDI model, alongside open-source notebooks and code for the wider MRP community.
We report on our method of mapping expression curves to and from MIDI velocities to combine general MIDI models with unique interfaces like the MRP.
We provide reflections on the implications of our installation for the broader conversation on generative AI within the pop music landscape.
Three views of the MRP Day Activations installation at Rotunda, Bourse de Commerce, Paris. Video credit: Lewis Wolstanholme.
We begin the paper by describing the artistic context of the collaboration and our previous works with the MRP. We then describe the MRP in more detail, and review generative AI in the context of MIDI generation, in particular contrasting existing models with Notochord, the model used in this project. Following this, we describe the design and outcomes of the installation, and share our reflections and possible future directions.
This project marks our first collaboration with Arca. Our research group is dedicated to examining the role of artificial intelligence in the development of new musical instruments. The opportunity to work together arose in part due to us both having previously developed pieces for the MRP. In Arca’s case, this came in the form of her concert held in New York in October 2023. As the New York Times reported, “[Arca’s] piano — unlike Swift’s or Lady Gaga’s — is prepared with magnets that turn it into an electroacoustic machine of woozy, otherworldly lyricism, tinged with buzz.”3
Meanwhile, Armitage premiered the MRP-based installation Strengjavera at AIMC 2023. A second iteration was exhibited at Nordic House in Reykjavík, where the MRP was controlled by biomimetic artificial life simulations: “Bringing together the perfect intermingling of art and science, the new installation by composer, producer, performer and research Jack Armitage will have viewers asking big questions about humanity and technology while being marvelled by beautiful acoustic piano sounds created in fascinating ways.”4 Further recent experimentation has involved training RAVE neural audio models on recordings of the installation [3].
For The Light Comes in the Name of the Voice, Victor Shepardson joined the collaboration, providing expertise in deep learning for real-time symbolic music generation, via the project Notochord. Iterative prototyping happened across sites in Paris and Reykjavík. The group aimed to capture the essence of Arca’s MRP playing and extend it for a seven-hour installation in Bourse de Commerce’s Rotunda, supervised by onsite technician Lewis Wolstanholme. The group paid particular attention to elevating and honouring Arca’s artistic intentions over the MRP’s technical constraints, while also remaining open to the generative expressive potential of machine perturbation.
Below we provide an excerpt of the public event description5:
Arca presents her new musical exploration, “The Light Comes in the name of the Voice” at the Bourse de Commerce6. In a diffuse and ghostly manner, Arca previously inhabited the museum’s Rotunda in 2022 for the Echo2 installation by Philippe Parreno, presented on the occasion of the exhibition “Une seconde d’éternité.” The unique acoustics of this monumental space, formed by the union of the 19th-century Halle aux blés and Tadao Ando's concrete cylinder, offer her a distinctive echo chamber and a new playground.
Produced specifically for this space, “The Light Comes in the name of the Voice” begins as a minimal quest for sound transformation. It is first formed between Arca and a magnetic resonated piano -an electronically-augmented acoustic instrument that produces new acoustic sounds from the piano strings, inducing vibrations to create countless crescendos and harmonics, all controlled from the keyboard. During the daytime, as Arca disappears, the magnetic resonator piano (MRP) continues to play using AI technology trained on Arca's playing.
MRP inventor Andrew McPherson introduces his creation this way7:
The magnetic resonator piano (MRP) is an augmented piano which uses electromagnets to elicit new sounds from the strings of a grand piano. The MRP extends the sonic vocabulary of the piano to include infinite sustain, crescendos from silence, harmonics, pitch bends and new timbres, all produced acoustically without the use of speakers. The MRP is an electronic instrument, but the experience of playing it closely resembles the acoustic piano it inherits from. An optical scanner on the piano keyboard measures the continuous position of every key, enabling subtle, delicate gestures that would not produce sound on an ordinary piano, even as all the techniques of traditional piano playing remain available. The kit can be installed in any grand piano. The Augmented Instruments Laboratory website8 features more details about the research and technology behind the MRP.
Describing the origins of the MRP, McPherson recalled “what I had in mind was the idea that you would have something that was still recognizable as a piano but had a sort of extraordinary range of tone color, especially the idea that you can separate the timbre from the dynamics.” [4]
McPherson and Kim [5] detail the gesture-sound mapping of the MRP in a 2012 paper, reproduced below, which is crucial to understanding how to work with data produced by the MRP.
Unlike the majority of new musical instruments, the MRP features a custom recording format, and the ability to record and play back performances. The listing below shows an example of MRP OSC recording data. Here, two notes (MIDI 62 and 56) are played, their intensity and pitch vibrato parameters are modulated, and the timestamps (in seconds) show the fine resolution of the data:
0.144359 /mrp/midi iii 159 62 127
0.148741 /mrp/quality/intensity iif 15 62 0
0.150149 /mrp/quality/intensity iif 15 62 0
0.150253 /mrp/midi iii 159 56 127
0.151611 /mrp/quality/intensity iif 15 56 0
0.151668 /mrp/quality/intensity iif 15 62 0
0.154576 /mrp/quality/intensity iif 15 56 0
0.154654 /mrp/quality/intensity iif 15 62 0
0.157414 /mrp/quality/intensity iif 15 56 0
0.157479 /mrp/quality/intensity iif 15 62 0
0.160273 /mrp/quality/intensity iif 15 56 0
0.160335 /mrp/quality/intensity iif 15 62 0
0.163375 /mrp/quality/intensity iif 15 56 0
0.163547 /mrp/quality/intensity iif 15 62 0
0.166477 /mrp/quality/intensity iif 15 56 0
0.166749 /mrp/quality/intensity iif 15 62 1
0.168951 /mrp/quality/intensity iif 15 56 1
0.168989 /mrp/quality/intensity iif 15 62 1
0.171918 /mrp/quality/intensity iif 15 56 1
0.17199 /mrp/quality/intensity iif 15 62 1
0.174787 /mrp/quality/intensity iif 15 56 1
0.174844 /mrp/quality/intensity iif 15 62 1
0.177677 /mrp/quality/intensity iif 15 56 1
0.177761 /mrp/quality/intensity iif 15 62 1
0.179157 /mrp/quality/pitch/vibrato iif 15 62 0
0.179223 /mrp/quality/intensity iif 15 56 1
0.179245 /mrp/quality/intensity iif 15 62 1
0.180599 /mrp/quality/pitch/vibrato iif 15 62 0
0.180645 /mrp/quality/intensity iif 15 56 1
0.180665 /mrp/quality/intensity iif 15 62 1
0.182038 /mrp/quality/pitch/vibrato iif 15 62 0
0.182082 /mrp/quality/intensity iif 15 62 1
0.183462 /mrp/quality/pitch/vibrato iif 15 62 0
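As a minimal sketch of how such a recording could be read, the snippet below assumes that every line has the form "timestamp, OSC address, type tags, arguments" exactly as in the listing above. The names `OscEvent`, `parse_mrp_line` and `intensity_curves` are hypothetical and are not part of the MRP software or iimrp. Reading the first two integer arguments of the quality messages as channel and note number is our interpretation of the listing (159 is the MIDI status byte 0x9F, a NoteOn carrying the same channel index 15).

```python
from typing import NamedTuple

class OscEvent(NamedTuple):
    time: float    # seconds since the start of the recording
    address: str   # OSC address, e.g. "/mrp/midi"
    args: tuple    # arguments decoded according to the type tags

def parse_mrp_line(line: str) -> OscEvent:
    """Parse one whitespace-separated line: time, address, type tags, args."""
    time_str, address, tags, *raw = line.split()
    if len(tags) != len(raw):
        raise ValueError(f"expected {len(tags)} arguments, got {len(raw)}")
    casts = {"i": int, "f": float}  # only the tags seen in the listing
    args = tuple(casts[t](v) for t, v in zip(tags, raw))
    return OscEvent(float(time_str), address, args)

def intensity_curves(events):
    """Group intensity messages into per-note lists of (time, value)."""
    curves = {}
    for ev in events:
        if ev.address == "/mrp/quality/intensity":
            _channel, note, value = ev.args  # assumed argument order
            curves.setdefault(note, []).append((ev.time, value))
    return curves

# A few lines copied from the listing above.
recording = """\
0.144359 /mrp/midi iii 159 62 127
0.148741 /mrp/quality/intensity iif 15 62 0
0.150253 /mrp/midi iii 159 56 127
0.151611 /mrp/quality/intensity iif 15 56 0
0.166749 /mrp/quality/intensity iif 15 62 1
"""
events = [parse_mrp_line(l) for l in recording.splitlines() if l.strip()]
curves = intensity_curves(events)
```

Grouping messages by note number in this way yields one intensity curve per pitch; a fuller reader would also have to split those curves at note boundaries and handle the vibrato and other quality addresses.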
Due to the aforementioned gesture-sound mapping, by working with the recording data rather than the KeyScanner data itself, we are working a few levels of abstraction away from direct gestural data, mediated by the MRP software’s per-note state machines. The MRP’s subtle, micro-scale details [6] emerge from a complex interaction between the physical piano, the KeyScanner and its calibration, the software’s state machines, and the performer themselves.
Most MRP repertoire falls in the contemporary classical tradition, with pianists extending their existing techniques, and emulation software has been written to support this practice [7]. To facilitate artistic departure from existing MRP repertoire, we have developed iimrp9. The repository contains client software that implements the MRP’s protocols in Python, SuperCollider, Max/MSP and TidalCycles. The software has been developed to enable new ways of interacting with the MRP that go beyond traditional keyboard-based control. More examples of iimrp usage can be found in our examples repository10.
Probabilistic generative music has a long history with roots in the Markov models of the Illiac Suite and the stochastic music of Xenakis, and reaching all the way back to ancient methods of divination. In recent years, substantial research has sought to apply data-driven ‘deep learning’ models to generative modeling of symbolic music and musical signals. Much effort has focused on MIDI, since large amounts of MIDI music can be obtained as training data, yet the modeling task is generally simpler than for raw audio. Some efforts model full tracks with multiple parts, while others focus on solo performance or even monophonic generation. Meanwhile, some efforts have treated MIDI more like a score, imposing keys, time signatures and strict quantization of events, while other works attempt to model MIDI files as performances including variable timing and gesture. Ji et al. [8] call the latter “composing expressive performance” while Oore et al. [9] use the term “direct performance generation”. Amongst these, though some systems are designed explicitly for interactivity [10], most are not designed for low-latency generation. Live performance, if possible at all, tends to involve an asynchronous process of loop generation or variation.
The Notochord [2] model was designed to be the first deep generative MIDI performance model to meet three design goals at once:
1. Very low latency (~10 ms) processing of MIDI events appropriate for real-time performance
2. Multi-part modeling of full tracks
3. Constrained prediction of each note event attribute (pitch, velocity, instrument, timing) conditional on any other attributes
Goal 1 means that systems using Notochord can integrate into a real-time performance with little perceptible delay — for example, the model can react instantly to change of key or dynamics by a performer, without waiting for a bar boundary or indeed needing a notion of time signature. To achieve this, the model must run quickly on a CPU, and must represent notes using NoteOn and NoteOff events — note durations can’t be conditioned on with low latency, since they aren’t known until the note is over.
Goal 2 means that a single Notochord model can generate full polyphonic performances with multiple instruments. To achieve this, the model must represent the part, channel or instrument associated with each event.
Goal 3 means that a performer can take very fine-grained control over generation — for example, Notochord can predict pitches conditioned on the timing and velocity from a MIDI controller; or Notochord can play two instruments autonomously as the performer plays a third; or Notochord can generate music while a performer constrains the allowable pitches, degree of polyphony, dynamics and instrumentation.
In this project, we used a Notochord model trained on the Lakh MIDI Dataset11, a large collection of MIDI files found on the internet including representations of pop, classical and soundtrack music. The same model was used in the Notochord Arcs and Scrambled Signals performance at AIMC 2023 [11]. It often creates a feeling of being at the edge of sense, making errors and non-sequiturs while also developing and re-contextualising them, wandering between styles.
Notochord is distributed as a Python package via PyPI12 and is available on GitHub13. Notochord models can be used via a Python API or OSC server, and there are also several real-time MIDI processing apps.
The aim of this project was to create an AI system that could improvise in the style of Arca's performances for playback on the MRP, for a duration of around seven hours, during the daytime between the two evening concerts. Given the extremely tight timeline—less than three days from receiving performance data to the installation—and the remote nature of our collaboration, we faced several challenges. Below we describe these constraints and how they led to the conclusion that Notochord was an excellent fit for this project.
The first data we received was of Arca’s rehearsal with the MRP. Playing it back on our MRP felt uncanny, as the piano hammers, which would have been activated by Arca originally, were for us motionless. Upon analyzing the data, we recognized the difficulty in applying traditional melody- and meter-focused technologies due to the performance's complex polyphony and free timing. In addition, Arca specifically calibrates the MRP’s KeyScanner sensor bar in a unique way. Usually, calibration is performed by pressing each individual key down to the key bed, but Arca calibrates the MRP with only light touches, such that relatively small key depressions result in full intensity notes, enabling light and fluidic stroking gestures.
It would also be difficult to train new models given the short time available for iteration and the small amount of new training data. Instead, we looked for a pre-trained model which could learn from the new data in-context [12] and which would allow enough control for us to shape an aesthetically interesting result. Since we also worked remotely, we did not attempt to design a real-time system, but rather pre-rendered performances to be played back by the MRP. However, it was advantageous to iterate using a model capable of generating faster than real-time on a laptop, and with which we were already familiar. These considerations led us to the choice of Notochord as a generative model for the project. We considered other recent off-the-shelf MIDI models [13][14], but found that most assume structured compositions, which didn't suit the free-form nature of Arca's performance.
Notochord is a model for MIDI note events, but an MRP performance, particularly in Arca’s style of playing with the keys tuned to very high sensitivity, is as much in the expression of each note via intensity, brightness and vibrato gestures as in the notes themselves. Modeling expression data directly seemed impractically complex given our constraints. So, the challenge remained of mapping MRP expression onto a form that Notochord could interpret.
We used two strategies for shaping the Notochord performance: in-context learning, and structural constraints. First, we fed the whole original performance into a Notochord model as a prompt, so that the generated material would function as a continuation of the original performance. This served to inform Notochord implicitly about the general characteristics of the performance — harmony, tempo, timing, and so on. Second, we generated continuations by constraining each new MIDI event to correspond roughly to an event from the original sequence. Afterwards, we mapped expression curves back onto the new continuation.
Each event generated by Notochord was within, for example, one octave of the original, 32 velocity units, and half to double the elapsed time since the previous event. The sequence of NoteOn and NoteOff events was the same, though the model was free to match NoteOffs to NoteOns originally of different pitches. Specifically for the MRP, we also built in constraints to prevent magnets from overheating due to being held for too long. Probabilistic models like Notochord can incorporate such constraints seamlessly: at any given time, if there are pitches which have been activated for too much cumulative time, the model is constrained to choose only those pitches for NoteOff events, and prevented from choosing them for NoteOn, simply by setting certain probabilities to zero and normalizing the resulting distribution.
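The zero-and-renormalize step can be illustrated on a toy categorical distribution. This is a sketch of the general technique only, not Notochord's internal implementation: the function names `constrain` and `sample`, the four-pitch distribution and its probabilities are all invented for illustration.

```python
import random

def constrain(probs, allowed):
    """Zero the probability of disallowed outcomes, then renormalize."""
    masked = {k: (p if k in allowed else 0.0) for k, p in probs.items()}
    total = sum(masked.values())
    if total == 0:
        raise ValueError("constraints exclude every outcome with nonzero probability")
    return {k: p / total for k, p in masked.items()}

def sample(probs, rng=random):
    """Draw one outcome; zero-probability outcomes are never selected."""
    pitches = list(probs)
    weights = [probs[p] for p in pitches]
    return rng.choices(pitches, weights=weights, k=1)[0]

# Toy distribution over four pitches (values invented for illustration).
pitch_probs = {60: 0.4, 62: 0.3, 64: 0.2, 67: 0.1}
held, hot = {60}, {64}

# NoteOn: exclude pitches that are already held or whose magnet is too hot.
note_on = constrain(pitch_probs, set(pitch_probs) - held - hot)
next_pitch = sample(note_on, random.Random(0))
```

Because the remaining probabilities keep their relative proportions, the model's preferences among the permitted pitches are preserved; the constraint only removes options.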
The result of this method was a new improvisation following the rough texture of the original and complementing it in general harmony and timing characteristics. Below we provide a pseudocode version of this processing stage, corresponding to ‘Generate Continuation’ in the block diagram above, to complement the complete Python notebook:
def noto_variation(model, original_events, heat_thresh):
    model.reset()
    all_pitches = {e['pitch'] for e in original_events}
    events = []  # store generated events
    held_pitches = set()  # track which notes are playing
    pitch_heat = {p: 0 for p in all_pitches}  # track cumulative heat of magnets
    # For each event in the original performance:
    for event in original_events:
        hot_pitches = {
            p for p, h in pitch_heat.items()
            if h > heat_thresh}
        # if it is a NoteOn
        if event['vel'] > 0:
            # stay within an octave of the original pitch,
            # excluding pitches that are already held or overheated
            allowed_pitches = (
                all_pitches
                & set(range(event['pitch']-12, event['pitch']+13))
                - held_pitches
                - hot_pitches
            )
            new_event = model.query_feed(
                min_time=event['time']/2,
                max_time=event['time']*2,
                min_vel=max(1, event['vel']-16),
                max_vel=min(127, event['vel']+16),
                include_pitch=allowed_pitches
            )
        # if it is a NoteOff
        else:
            # release an overheated held pitch if there is one,
            # otherwise any held pitch
            allowed_pitches = (hot_pitches & held_pitches) or held_pitches
            new_event = model.query_feed(
                min_time=event['time']/2,
                max_time=event['time']*2,
                next_vel=0,
                include_pitch=allowed_pitches
            )
        # accumulate heat over the time elapsed since the previous event
        for p in held_pitches:
            pitch_heat[p] += new_event['time']
        for p in all_pitches - held_pitches:
            pitch_heat[p] -= new_event['time']
        # update held notes
        if new_event['vel'] > 0:
            held_pitches.add(new_event['pitch'])
        else:
            held_pitches.remove(new_event['pitch'])
        events.append(new_event)
    return events
Notochord models note velocities, but the MRP doesn’t use the velocities associated with each MIDI event; instead, dynamics are present in a continuous intensity curve. We reduced expression curves to a scalar ‘velocity score’ for each MRP note based on the average value of the intensity and other expression curves, and sent that to the Notochord model. When generating new notes from the model, MIDI velocity was mapped back to an original expression curve from the data by selecting the curve with the nearest velocity score. Finally, the selected expression curve was adapted to the length of the generated note by time-stretching while preserving the original attack speed, i.e. not stretching the first 200 milliseconds. This ensured that intensity curves would attack in a manner similar to the original performance, but vibrato curves wouldn’t be truncated from the end of a note, where they often bend toward an adjacent pitch in a legato gesture.
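The three steps described here (score, nearest-curve lookup, attack-preserving stretch) might be sketched as follows. This is a simplified illustration under stated assumptions rather than the code from our notebooks: curves are lists of (seconds since note onset, value) pairs with values in 0..1, the score uses the intensity curve only, and the function names are hypothetical.

```python
def velocity_score(curve):
    """Map the mean value of an intensity curve (0..1) to a 1..127 score."""
    mean = sum(v for _, v in curve) / len(curve)
    return max(1, min(127, round(mean * 127)))

def nearest_curve(curves, velocity):
    """Pick the recorded curve whose score is closest to a generated velocity."""
    return min(curves, key=lambda c: abs(velocity_score(c) - velocity))

def stretch(curve, new_duration, attack=0.2):
    """Rescale point times to new_duration, leaving the first `attack` seconds as they are."""
    old_duration = curve[-1][0]
    if old_duration <= attack or new_duration <= attack:
        # Too short to separate attack from tail: stretch uniformly.
        scale = new_duration / old_duration if old_duration > 0 else 0.0
        return [(t * scale, v) for t, v in curve]
    scale = (new_duration - attack) / (old_duration - attack)
    return [(t if t <= attack else attack + (t - attack) * scale, v)
            for t, v in curve]

# Two invented intensity curves standing in for recorded notes.
soft = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.3), (0.6, 0.3), (1.0, 0.0)]
loud = [(0.0, 0.0), (0.1, 0.8), (0.2, 1.0), (0.5, 1.0), (0.8, 0.4)]

chosen = nearest_curve([soft, loud], velocity=90)  # a generated note's velocity
fitted = stretch(chosen, new_duration=1.4)         # the generated note's length
```

Only points after the 200 ms attack are moved, so the onset shape is untouched while the end of the curve always lands on the end of the new note.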
The simple approach of mapping between note expressions and velocities was more expedient than building a parametric model for the MRP expression curves, while still allowing Notochord to make sense of dynamics. Reusing actual curves from the recording propagated some of the expressive feeling of Arca’s performance, if scrambled in the process of being made legible to the model. We queried Notochord for several variations of the original performance, using a variety of specific versions of the structural constraints and temperature parameters. The final installation stitched together several such generations into a piece of around one hour's duration, which was looped for the seven-hour installation.
There was one final challenge that emerged when we received the MRP recording data from the first concert, and were tasked with generating more continuations based on this before the installation opened. The concert data contrasted starkly with the rehearsal data in the following ways:
The data was temporally sparse, with gaps sometimes minutes in length between episodes.
Many notes had a duration of 0-100 ms and were unlikely to have been played physically by rapidly depressing and releasing a key.
Compositionally there were no long, held chords.
As it turned out, Arca’s MRP performance was integrated into a complex performance including multiple other instruments and effects pedals, and live recording and processing of the MRP across a spatial audio system. There were also bright stage lights potentially interfering with the MRP’s key height sensors, which may have also been interacting with Arca’s extra-sensitive MRP calibration setting. Fortunately, we were able to make use of this data because Notochord’s generations are resilient against sequences of extremely short notes. We detected and reduced the large gaps and filtered out events shorter than 50 ms, because these would not allow enough time for the magnet to resonate the string. For our final generations, we concatenated the rehearsal and concert data to provide more compositional variety, which Arca likened to a “synthesis of different self states”.
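A clean-up pass of this kind could look like the sketch below. It assumes notes are (onset, duration, pitch) tuples in seconds, sorted by onset. The 50 ms threshold comes from the text, whereas the function name `clean_notes` and the 5-second maximum gap are hypothetical choices made for illustration.

```python
def clean_notes(notes, min_duration=0.05, max_gap=5.0):
    """Drop very short notes, then shorten silences longer than max_gap.

    notes: list of (onset, duration, pitch), sorted by onset, in seconds.
    """
    kept = [n for n in notes if n[1] >= min_duration]
    cleaned = []
    shift = 0.0     # total time removed from the timeline so far
    last_end = 0.0  # end of the latest-sounding note, in shifted time
    for onset, duration, pitch in kept:
        gap = (onset - shift) - last_end
        if gap > max_gap:
            shift += gap - max_gap  # trim the silence down to max_gap
        new_onset = onset - shift
        cleaned.append((new_onset, duration, pitch))
        last_end = max(last_end, new_onset + duration)
    return cleaned

# Invented example: one 20 ms glitch and a silence of nearly two minutes.
notes = [(0.0, 1.0, 60), (1.5, 0.02, 62), (2.0, 0.5, 64), (120.0, 1.0, 67)]
cleaned = clean_notes(notes)
```

Filtering before measuring gaps means that silences created by removing spurious notes are shortened as well.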
Our system produced intimate yet uncanny generations of new MRP data to be played back on the hardware. Elements of the harmony and expressive timing of the original performance remained, draped across its gross structure, yet the machine performance seemed to bristle in every direction, played by many hands and many minds.
In MRP Day Activations, the AI-generated performance emerged as a distinct reinterpretation of the original, imbued with a sense of uncanniness. It sounded like a version of the original performance, but possessed an expressive set of gestural curves that, at times, defied logic, with vibratos bending in unexpected directions. Despite adhering to the overall form, it presented music beyond the capabilities of a single human pianist. Nevertheless, we felt it offered a peculiar temporal and expressive coherence of its own, while clearly not played by a person with two hands.
The transition from expressive curves to velocity scores resulted in a degree of abstraction and unpredictability in the conversion back to expressive parameters. This process also underscored various dimensions of uncanniness, and a disjunction of expression in the absence of temporal continuity. Given the choice to “clean” the performance data further by, for example, removing shorter notes, we opted not to, out of respect for Arca’s original brief and the performance itself. In the end we felt that these elements resonated with Arca's aesthetics, challenging traditional perceptions of musical performance and embodiment.
The project's reliance on remote collaboration and the specific format of the MRP's OSC recording underscored the potential for innovative and ecologically sensitive methodologies within the New Interfaces for Musical Expression (NIME) community. The existence of the recording format facilitated a unique dialogue over time and distance, between Arca's distinctive playing style and the generative capabilities of the Notochord model to refract decades of MIDI culture. Striking a balance between Notochord’s uncanny quality and the original's essence became a subtly artistic contribution, more than just a technical one. Left to its own devices, Notochord can at times spiral into drunken mind-wandering and myopic repetition, which we have explored artistically in other contexts described earlier. Being familiar with this, we became attuned to when the generated continuations became “too Notochord-y” and tweaked parameters accordingly.
There are several avenues for future exploration and development of the techniques described in this paper. These considerations stem not only from the project's outcomes but also from the evolving landscape of AI, particularly in instrumental musical expression [15].
Future developments could explore the refinement of input MIDI data processing, testing different techniques such as chunking, clustering, or hierarchical analysis to confer different representations of the musical data, resulting in different generated outcomes. A more nuanced approach may enable more sophisticated in-context learning mechanisms, thereby distinguishing further from offline fine-tuning by adapting dynamically to the evolving musical landscape within a performance.
Despite Notochord being a real-time model, the time constraints and remote collaboration in this scenario meant that an offline-first approach was preferable. The techniques we have described, however, are all applicable to low-latency real-time performance, with the exception of our method of time-stretching expression curves. In order to preserve the ends of expression curves, we could instead delay NoteOff events slightly, which we anticipate would work well for the MRP with its gradual release times. Or, when the legato gesture is triggered by playing adjacent notes, we could instead model vibrato curves of generated notes more deterministically following the MRP software.
Regardless of the preferred approach, a real-time version of our system could learn to ornament or continue a performance interactively while a performer is at the piano, allowing investigation of uncanniness from the perspective of the performer. Further, if a larger dataset of MRP performances can be collected, we might match expression curves to context based on more than a unidimensional velocity score, considering pitch or other aspects such as recent note duration or degree of polyphony.
As we contemplate the transition to real-time systems, the concept of dynamically adjusting generative constraints based on performance context becomes increasingly relevant. This adaptive approach could respond to various factors, such as performance dynamics, temporal shifts, or the evolving emotional landscape of a piece, increasing the fidelity and responsiveness of the performer’s dialogue with the model, instrument and audience.
This paper has detailed a novel approach to enhancing the expressive capabilities of generative MIDI models, particularly within unstructured improvisational contexts where existing models may struggle. Our collaboration with Arca, utilizing her magnetic resonator piano (MRP) performances, served as a fertile ground for this exploration. The project culminated in a seven-hour installation that showcased the potential of real-time probabilistic modeling to generate MIDI continuations that are inventive and expressive, while remaining faithful to a specific artist’s style.
By leveraging Notochord, a model adept at learning in-context from Arca's unique playing style, we introduced structural constraints that enabled the improvisation of new MIDI sequences. These were not mere algorithmic outputs; they were imbued with the essence of the original performances through the remapping of Arca's gestural data onto the generated MIDI notes. This process resulted in performances that, while uncanny, maintained a profound connection to the gestural nuances characteristic of Arca's MRP playing.
Thanks to Arca, Shaun MacDonald, Lewis Wolstanholme, Andrew McPherson, Bronze.ai, Google Arts & Culture and the onsite teams at Bourse de Commerce, Paris.
The Intelligent Instruments project (INTENT) is funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 101001848).
This project is funded by the European Research Council (ERC) and before embarking upon our research we sought ethical clearance from Iceland’s Science Ethics Committee. The committee considered our research questions, methods, recruitment strategies and treatment of personal data. All personal information is kept safely and anonymously.