
Agonistic Dialogue on the Value and Impact of AI Music Applications

Published on Aug 29, 2024

Abstract

This paper records the application of a critical and agonistic mode of inquiry to analyse and critique a specific application of artificial intelligence (AI) to music practice. It is constructed as a structured interdisciplinary dialogue between 1) a musicologist and social scientist and 2) a music informatics engineer and learner of Irish traditional music (ITM). Focusing on folk-rnn and ITM, the two authors debate the role of data ethics in AI music applications, the dynamics of inclusion and exclusion, and the nature of embedded value systems and power asymmetries inherent in applying AI to music. The paper argues that identifying the value of AI music applications is critical for ensuring research efforts make musical contributions along with academic and technical ones. Overall, this agonistic dialogue exemplifies how questions of right and wrong — the core of ethics — can be examined as AI is applied more and more to music practice.

Authors

Anna-Kaisa Kaila and Bob L. T. Sturm

KTH Royal Institute of Technology
Stockholm, Sweden

Keywords

AI music, interdisciplinarity, agonistic dialogue, Irish Traditional Music, ethics

Introduction

A provocation [1] and a keynote address [2] at AIMC 2023 argued that critical reflections around the value of AI music applications should be at the center of the research endeavors. This would enable our research efforts to “be of service to people and their music and other cultural expressions — not just the other way around” [1]. While there may be some consensus on this goal, it is less certain what such efforts would look like in practice. In this interdisciplinary debate paper, we will look broadly at the ethical state of music and AI, dissecting contentious issues of data use, value systems, and the power asymmetry in the dynamics of inclusion and exclusion. We do so via an agonistic mode of interdisciplinarity [3] using a method and a writing approach inspired by Morrison and McPherson [4].

Essentially, this paper records a structured email chain in which we (the two authors) respond to each other's discussion prompts. We are 1) a musicologist and social scientist (Anna) and 2) a music informatics engineer and learner of Irish traditional music (Bob). Both are members of two partly overlapping interdisciplinary research teams which regularly engage in spontaneous debates on issues related to AI and music, often involving participants from both engineering and humanist backgrounds.1 The initiative to engage in a systematic debate specifically about folk-rnn arose from notions presented in a critical discussion thread on AI found on the website http://thesession.org, and the provocation presented at AIMC in 2023 [1], which elicited divergent reactions from the authors.

Our discussion focuses on folk-rnn, a Long Short-Term Memory network (LSTM) trained to generate symbolic music sequences resembling transcriptions of Irish traditional dance music (ITM) [5]. folk-rnn was originally trained with a dataset of more than 23,000 transcriptions of ITM downloaded from the repository of http://thesession.org. This website is a crowd-sourced database of traditional tunes, which is continuously updated and expanded by an active community of users from around the world. For the purpose of accessibility, folk-rnn was made available as a server-based implementation that provides a selection of options for parameter tuning and a button for initiating the tune generation, and is linked to a growing archive of tunes generated by users of the system at https://themachinefolksession.org [6].

The method applied in this paper, inspired by Morrison and McPherson [4], worked as follows: the two authors took turns responding by email providing critical questions and responses to each other around the topic of folk-rnn [7] and ITM. While the purpose of the exercise was not specifically to provoke a dispute, there was no active effort made to avoid confrontation or to seek consensus, and any fractures in understanding or agreement were purposefully left visible. After this, the authors engaged in a collaborative process of editing and reworking the text as done by Morrison and McPherson [4].

The idea of not distilling the disagreements and discordance of the co-authors into a unanimous and univocal synthesis is not entirely new. Aucouturier and Bigand [8] imagine a dialogue between characters from two disciplines on the merits and shortcomings of music informatics research. Such a dialogue format has been applied in a conference keynote delivered by Rodgers and Sterne [9], which is cited as an inspiration by Morrison and McPherson [4]. Huang et al. [10] and Ferraro et al. [11] have also engaged in interdisciplinary or agonistic dialogues as part of the method of inquiry. One could of course trace the method all the way back to Socratic dialogue, as well as to efforts in philosophy of science (e.g., [12], [13]) to translate knowledge creation processes across disciplines, and to engage in critical and reflective introspection of their unspoken premisses and value propositions.

Interrogating folk-rnn

Where does folk-rnn come from and who needs it?

Anna: Sturm et al. [14] explain that the primary aim in developing folk-rnn was to casually test extending machine learning methods to an application area where they had not been widely used yet. It seems ITM was the domain of choice mostly by coincidence, not because there was a need for a service that could provide an unlimited supply of artificial music in this style. I am left wondering: besides a proof of concept for the research community, what is this system then needed for, and who needs it? Where is the value in applying AI to folk music?

Bob: Indeed, [14] presents the “origin story” for folk-rnn, the first version of which (v1, 2015) was motivated as a curious and humorous exercise that required very little effort to execute, instead of any noble pursuit of scientific knowledge. That it appeared to be as successful as it did at modeling the data we trained it on, and our experience of the reactions of people who came in contact with it, led us to ask the deeper questions that motivated other versions of folk-rnn and similar systems [15][16][17], its online implementation [18], an album (https://soundcloud.com/oconaillfamilyandfriends/sets/lets-have-another-gan-ainm [19]), concerts engaging composers and audiences with music and AI [20], and so on. Our focus on ITM came entirely from my (at that time) passing awareness of it, and the availability of a large dataset of textual representations of the music compatible with the code base we started with [21]. We did not consider the implications of that focus until later, but it has ended up illuminating a variety of interesting issues, such as value systems in traditional music and their collision with technology and innovation [22][23], and the responsible and ethical development of AI systems in spaces of cultural production [10][22][24].

In response to your questions, folk-rnn and its “descendants” [15][16][17] are not needed by anyone, and the same is true of all other systems developed in the application of AI to folk music (all of which, stretching from 1951 to 2019, are reviewed in [14]). And can't the same be said more broadly for any method of algorithmic composition [25]? Not one of these systems needs to exist or be available to the general public.2

Anna: I can appreciate that the value of research can be judged by what it has led to rather than by the premisses it started from. This issue, however, seems more systemic than specific to folk-rnn. Barnett [26] demonstrated in her literature review how negative societal impacts discussed in research papers on generative AI music systems are by far dwarfed by the discussions of positive impacts, indicating that critical reflections around the questions of “why”, “for whom”, “for what purpose” and “with what consequences” are generally not very common. In contrast, research and engineering tend to celebrate novelty [27]. This inevitably results in ever new systems being built, whether they are “needed” or not.

Data colonialism and power asymmetries

Anna: Returning to your notion about the data, I can see there were many reasons why ITM transcriptions were a convenient choice for setting up folk-rnn, from availability and suitable format to the low legal risks [28][29]. Unfortunately, as Huang et al. [10] note, such use of these hand-entered transcriptions from thesession.org was not anything the community asked for or gave consent to, nor does folk-rnn seem to serve their goals. The contributions of this research endeavor are also rather skewed towards the machine learning communities. For thesession.org contributors then, folk-rnn seems like a “solution in search of a problem” (p. 12 of [30]), or possibly a misuse of their commons. What is your view on that?

Bob: Let us look at the source of the data of folk-rnn, http://thesession.org, and read how the website positions itself. Whose data is it? On the front page it reads:3 “The Session is a community website dedicated to Irish traditional music. You can find tunes to play, find sessions to play them in, and join in discussions about the music. You can also find events (like concerts and festivals), or explore the track listings of recordings. You can contribute too.” I cannot find any information on the use or misuse of the data contributed by its users. The same goes for the periodic public backup of the website database by its owner: https://github.com/adactio/TheSession-data. In the FAQ there's only a few comments about what not to post in the discussions (no sports or foul language), and that some kinds of tunes are not allowed (no slow airs). Under the “Privacy” page there is a section on “Your contributions”, which only states: “All of the content on The Session—tune settings, sessions, events, discussions, etc.—is provided by members. Every contribution is attributed with a link back to the corresponding member profile.” The creator and owner of the website has even created an API to make it easy to access its data: https://thesession.org/api. I wonder why the creator of the website has made no proclamations of how the data he has collected can and cannot be used?

Anna: I would not agree that because the use of thesession.org data for training an AI model was not literally forbidden on the website, we should assume it was something the contributors of the database would not mind, or would welcome, had they been specifically asked. The power dynamics of the situation are a factor to consider here. Couldry and Mejias [31] describe activities where social resources inherent in data are extracted, appropriated and commodified for profit as data colonialism. In the case of ITM, such power asymmetry is of course further aggravated by the legacy of social and political colonialism that Ireland and Irish immigrants have been subjected to throughout their history, as Huang et al. [10] point out. Seen in this context, the act of appropriating “othered” data can be experienced as invasive.

Bob: I think it entirely appropriate to ask how folk-rnn and our work with it exhibit aspects of data colonialism. As a brief review of Couldry and Mejias [31], historical colonialism consists of four activities: extraction and appropriation of resources; enforcing unequal social and economic relations; unequally distributing benefits from the appropriation; and spreading ideologies in support of colonialism. Couldry and Mejias [31] argue the same four activities are a part of data colonialism, but here the resource that has been discovered and is being appropriated is data. I am curious how you see folk-rnn participating in these actions.

In one sense, building folk-rnn did not involve an extraction of resources since thesession.org had already done so. The data remains at thesession.org, and my duplication of a GitHub snapshot of it in 2015 does not threaten the existence of thesession.org. Even though thesession.org does not explicitly forbid the wholesale use of their database, using the data to train machines to imitate the syntax inherent to the transcriptions is arguably a use that was not foreseen when the website was established in 2001. However, just because some use is unintended, and some users protest, does not make it wrong or mean that harm has been caused. Couldry and Mejias [31] write, “Data colonialism is concerned with the external appropriation of data on terms that are partly or wholly beyond the control of the person to whom the data relates.” So I wonder, how “external” could I have been in this case when I collected the data from the weekly publicly available repository of thesession.org, a website created and maintained by someone who works as a computer scientist?4

I see no way that folk-rnn has directly contributed to enforcing unequal social and economic relations or spreading ideologies in support of colonialism.5 And there certainly has not been any profit generated from the development and deployment of folk-rnn [10], and really, we have never worked to make money from such a service.6,7 Benefits of our use of the data from thesession.org (as in grant money related to the development and study of the system in situ) have actually found their way to practitioners of ITM by us hiring them as experts to participate and reflect in our studies, as performers at concerts and in recording sessions, as discussants at workshops, and as judges in the AI Music Generation Challenges [32][33][34].8

Insiders and outsiders

Anna: To me, the act of prioritising technical novelty over aesthetic and social purpose, indifference towards power asymmetries in the data use, and unequal distribution of benefits reflect certain techno-positivist and extractivist value inclinations.9 The data use in itself might also invite other, even more harmful uses, simply because there is a precedent to be followed. In this way, I would argue that the act of framing the data as a freely available resource still carries colonialist flavours. Should it not be the ITM community who gets to decide whether their data can and should be used for machine learning on their own terms?

Even if folk music is a special case with its non-proprietary ownership and crediting structures, the reactions from communities and industries formed around other musical styles are not as permissive. In a recent study, two-thirds (64%) of German and French music authors considered the risks of AI use to outweigh its promises [35]. We see signs of the upcoming battles and counter-attacks of artist communities already in the visual domains (e.g., [36][37]), and music and audio are closely following suit ([38][39]). All this indicates that the unwarranted introduction of AI into a creative community can be a contentious act. Given the ban of AI-generated content from the thesession.org database (https://thesession.org/discussions/47876#comment974114), this aspect has not been lost on the ITM community.

Bob: First, I have to argue that there is no such thing as “The ITM Community”. There are, at most, communities of practice, and perhaps they might be conveniently lumped together as “The ITM Community” — but there really is no justification for doing so. Second, since there is no “The ITM Community” I have to question the meaningfulness of calling the data at thesession.org “their data”. I also want to question the assumption that thesession.org is representative of ITM, and even that it is serving a good and noble purpose for ITM. Frankly put, thesession.org is but a hodgepodge of derivative works of varied qualities and unidentified origins, and exists outside of some communities of ITM practice (e.g., those eschewing notation). Furthermore, the ban of AI-generated tunes on thesession.org was not done by some consensus of users, but only at the discretion of one person: he who created and maintains the website.10 However, the discussion around that decision does not show a consensus at all. Why should we only pay attention to those who are discomfited? And why should we ignore the lack of comment from many other communities of practice who couldn't care less about thesession.org and folk-rnn?

Machine folk as an invasive species

Anna: I agree there may be situations where the requirement to obtain consent for data use may be not only difficult, but outright impossible. That said, I do not think that such difficulty in identifying a unanimous community with a clear and authoritative opinion should grant us the right to abandon critical reflection on these issues. Even if there is no conclusive answer, the efforts to conduct a dialogue with the practitioners are in themselves valuable.

I would also argue that there is value in paying attention to what the critics are concerned about, regardless of whether they are in the majority or in the minority. One of the main reasons why AI models stir debate seems to be the question of volume. If we take the analogy of a new invasive species, the question is not whether the “AI-animal” will threaten to consume the native population11 to the verge of extinction, which it obviously cannot. Yet, with its enormous reproductive (generative) power and with no natural enemies in the ecosystem,12 its unwarranted presence can over time become excessive and harmful. This question is separate from the fear that the stream of automated sound-alikes starts blending with the original heritage repertoire. Could such events not over time disrupt the delicate network of legacy knowledge that holds the ITM repertoire together?

In more general terms, Huang et al. [10] and Mersch [40], and to some extent Attali [41], argue that drowning our cultural space with content of random quality in monstrous quantities is problematic, and there is already some concrete evidence of this (see, e.g., [42]). An uncontrollable flood of AI-generated content that no one will ever have the time and resources to curate is not the kind of future most of us are looking forward to. I wonder if that is a concern you share?

Bob: I agree that “The ITM Community” only existing in the abstract does not free us from trying to act in responsible ways or the duty to reflect on the various matters at hand. And I appreciate the analogy of the invasive species, and the allusion to Attali's “crisis of proliferation” [41], but I want to examine it a bit. A key assumption here is that ITM is some kind of pristine and fragile piece of nature; and, like in the real world, introducing a new species to this land could disrupt its balance and cause extinction. And so, the argument continues, we must work to make sure this land remains pure, protecting its “original heritage repertoire” from all invasive species, lest it be replaced and lost forever. There are at least two major problems with this argument. The first is that ITM is actually far from pristine and fragile. It is diverse, dynamic and thriving. Going to any Irish music festival during the summer in Ireland shows just how diverse and thriving it is, and how many young people are becoming masterful musicians of traditional music.13 There is great national support for traditional music in Ireland by the government of Ireland. Irish society has made a decision that it values its musical heritage enough to support it with public funding. Then add on top of that how widespread Irish music is around the world, absent direct support of the Irish government. There is no “delicate legacy network” here. It is extremely robust, and not threatened in the least by an AI generating even billions of great tunes.

The second problem with the argument is that it motivates exclusionary, retrograde and ultimately harmful inclinations. In the pursuit of purity and the preservation of the “original heritage repertoire”, where do we stop? Are we to police sessions to make sure those attending and listening are receiving “original heritage repertoire” delivered appropriately? Should we restrict sessions to only occur in places where adherence to tradition can be guaranteed? Are we to correct all mistakes, and make sure everyone listening knows how each real tune actually goes, lest an error makes its way into everyone's memory and thus possibly changes the tune forever? Should newcomers be properly vetted and carefully shielded from influences foreign to tradition? Should music recordings released under the moniker of ITM be quality assured and properly disinfected of any kind of new influence that might disturb the old order? And why place so much value on tunes that speak to a different time and place? Do we say to the people writing new tunes today, you are not allowed to write such tunes unless you cut turf or fish for a living? Calls for purity quickly move to a dark place where only certain people are allowed to participate in certain ways. In my mind it is not far to leap from, “Tunes generated by AI are not allowed” to “Tunes performed by non-Irish are not allowed.” Were the calls for purity in Irish music from the 1950s enforced, ITM wouldn't have the accordion and bouzouki; and it certainly wouldn't have been a part of the folk music revivals following in the decades afterwards. If we enforce purity today, what will we miss in the future? Forcing a tradition to be static will lead to its demise.

Malevolence and benevolence

Anna: Let me approach the question from a slightly different angle. Collingridge [43] argues that we cannot see the full impacts of technology before it has been fully implemented and adopted, but once it is established, its impacts become irreversible. In the context of our example, once machine-generated folk music has entered the musical ecosystems en masse, it may eventually become impossible to conclusively distinguish between human- and machine-made folk tunes. Whether that scenario is dystopic or not is of course up for debate, but I think that decision should be left for the (future) practitioners themselves to make. In an effort to retain the option of keeping the repertoires separate, we could for instance agree to provide systematic metadata labelling or watermarks for the machine-generated content, and carefully communicate the origin when the tunes are reproduced, performed and distributed.14 I do not think that being mindful has to mean purity policing or full abolishment.

Since we seem to disagree on the extent of harm inflicted, I would be curious to hear what you yourself think is at present the strongest argument against the work done with and around folk-rnn?

Bob: With respect to folk-rnn the system, I think the strongest argument against it is that any contribution it makes to ITM is still totally unclear. It may not be harmful to ITM (as I have argued), but it doesn't seem to benefit ITM either. This was actually an argument against our work posed by user Ergo on thesession.org on March 18 2017:15 “explain how this is going to contribute to ITM ... I can see how it might contribute to AI, or to your own resume. ... [The project] sounds like the project is intended to further the use of AI in creative domains. Fine - something new and cool to do with AI. But I’d simply like you to leave ITM out of it”. My only “meh” response was to point at my work as being “basic research” — that its value may become apparent later on. It is now seven years later, and I still can't see any noteworthy contribution of folk-rnn to ITM. Both thesession.org and TunePal [44] make clear contributions to ITM. folk-rnn simply does not (yet).

That being said, we see folk-rnn still being used. People are finding enough value in its tunes to submit them to https://themachinefolksession.org. Just on March 5, 2024 we see five tunes posted. folk-rnn was also used by composer Rob Laidlow in a movement of his 2022 work "Silicon".16 In March 2023, YouTuber Marcel Ardans (who plays and teaches bluegrass) made a video about using folk-rnn to trick a fellow musician.17 The concluding discussion of that video is interesting, including such observations as: 1) an AI learning from tunes and assembling them together to create new tunes is essentially what people do anyhow; 2) such an AI could help one break free of patterns of practice; and 3) AI is escalating the need to reform intellectual property. Still, folk-rnn itself was never designed or even thought of as a way to contribute to ITM; and I don't really see a way it can contribute to ITM. It's just a parlor trick. It's a stupid language model with a cheat code. It's at most a footnote in the history of ITM.

Finally, I want to separate folk-rnn as a tool from the research around it and other activities stemming from it. The variety of concrete (but rather personal) ways these have contributed to ITM are detailed by Huang et al. [10]. It was actually very early on that we started to collaborate with practitioners [45], which included applying for money together to support our activities. And our work and reflection has continued on these issues. It is also important to note that an unlimited supply of “perfect” Irish jigs and reels generated by any model far better than folk-rnn will not have any kind of negative impact on ITM simply because the tradition does not hinge on the “dots”, i.e., music notated in an impoverished way. Value in the tradition is simply not given to the particular notes one is playing.

Unanticipated contributions

Anna: Following on that thought, I would like to come back to your earlier question, “if we enforce purity today, what will we miss in the future?” I’m intrigued by this notion that future value may be lost if we do not allow novel machine folk traditions to grow and flourish. There is certainly inherent creative potential in these tools that is independent of their original training data and the “obvious” use cases bound by historical traditions. In the framing of what we have discussed so far, how do you propose we could support creative and unexpected uses of generative AI tools for (folk) music?

Bob: I’m not sure if I believe that “future value may be lost if we do not allow novel machine folk traditions to grow and flourish”. To make that claim assumes a utilitarian ethical position. But I am not saying anything and everything should be allowed or embraced. I think it is entirely the prerogative of a community of practice. Tony MacMahon's exhortations to stop the commercialization of ITM [46] were heeded by a few but ignored by many, many more. Seán Ó Riada's tirade against the accordion in ITM18 is laughable (then and today), but sincere.

As to supporting creative and unexpected uses of AI for folk music, I hope our work shows a variety of ways in which it can meaningfully contribute. It is one thing to open O'Neill's "1001" [47] (a collection that is loved but also ignored) and learn to play the dots that O'Neill (or really, his collaborator James Early, who did most of the transcriptions) felt are the most representative (along with titles of suspect origin); it is quite another to engage with a machine-generated tune and solve the puzzle of making it fit with the grammar of the tradition one is practicing, ultimately deciding whether to keep it in one's tunebox, share it with others, or forget it altogether. That I have led an ITM learners group in Stockholm since 2019 is a totally unanticipated outcome of my playful but naive explorations in 2015, and one that the many attending and enthusiastic members would argue contributes positively to their personal practice.19 I also hope our discussion and consideration of many issues herein, real or imagined, provide a glimpse of our contemporary thinking for those reading in 100 years' time, whether or not machine folk has become part of a musical fabric somewhere in the world.

Applying AI to music ethically

Bob: Here's a question for you. I was recently asked a question by a PhD student who is assembling a very large database of symbolic music (the size of which is an order of magnitude larger than what we assembled for folk-rnn): “How do I do this in an ethical way?” I didn't have a clear answer other than to say, do it carefully and with lots of reflection. What would you say?

Anna: That is indeed a tricky question. The uncertainty of what advice to give is probably a symptom of what Munn [48] quite provocatively called “the uselessness of AI ethics”. The issue is not that thinking about ethics is futile, but rather that it is counter-productive to approach it as a check-list of empty principles that can be filled in and then forgotten, and such half-baked efforts can easily lead to whitewashing [49]. Nevertheless, I empathize with the sincere effort of seeking guidance, and the frustration of not having a clear answer to provide. For that purpose, I would supplement your notions of carefulness and continuous critical reflection, and the range of participatory efforts that we have already discussed, with a recommendation to use some of the existing guidelines and tools for ethical analysis (e.g., [50][51][52][53]) and for model or data documentation (e.g., [54][55]). We can also look at commercial actors that have committed to adhere to ethical practices (e.g., the tech start-up XHail as pointed to by Clancy [56]), or Clancy’s initiative AI:OK, and see what kinds of steps they have taken to tackle or mitigate some of these issues, and seek to develop new ones ourselves. As partial as these solutions may be, they are probably the best ones we have at hand for the time being.

Interim Conclusion

This agonistic dialogue identifies several conflicting points about the use and misuse of data and cultural resources, the asymmetries in the concentration of value and power, and in the diverging perspectives on the future of AI systems applied to music. While the focus of the analysis and critique is on folk-rnn and ITM, many of the same concerns can and should be applied and debated in the context of other systems applied to music. Unburdened by the financial pressures driving commercial actors in the development of AI music systems, academic researchers have the freedom to critically reflect on the present and the future of AI music applications, and how they could be otherwise.20

We began our dialogue from the premise that it would be desirable for the efforts in AI music research to be beneficial not only for engineering pursuits but also for music communities. The exercise in the agonistic mode of interdisciplinarity proved well-suited for foregrounding points of tension, value conflicts, and how one might fall afoul of this premise. The method, predictably, did not result in concrete guidelines for how such points of tension could be addressed and mitigated in the course of research efforts. This is not just a symptom of the domain complexity but also of the difficulty of operationalising ethics. Conclusive deliberations of right and wrong will always be conducted in the specific context of the issue at hand, and the issues of ethical relevance shift and (re-)emerge as the society around us changes. Similarly, the debate about AI in music must continue.

Acknowledgments

This work was supported by Wallenberg AI, Autonomous Systems and Software Program – Humanities and Society (WASP-HS) funded by the Marianne and Marcus Wallenberg Foundation (Grant 2020.0102), and by the European Research Council under the European Union’s Horizon 2020 research and innovation programme (MUSAiC project, Grant agreement No. 864189).

Ethics Statement

This paper discusses a range of ethical issues in AI music applications and in their data use. It aims to raise awareness and promote the adoption of reflective and fair practices in the research conducted within the AI Music Creativity domain and adjacent fields. The list of issues addressed is not conclusive or complete, and many ethically relevant topics have been either left out of the scope of the paper or have been only partially covered. The analysis and critique are presented from the perspective of two Western researchers. The authors acknowledge that this position toward ITM in itself represents a power and information asymmetry not dissimilar to the position criticised in the article, and that the arguments presented should be interpreted and evaluated in this context. One of the authors, being directly involved in the development of the AI system interrogated herein, has a clear non-financial conflict of interest. The authors see this conflict mitigated by the fact that the purpose of the present paper is to critically examine said system and the research process around it via an agonistic mode of interdisciplinarity.

