As music generated using artificial intelligence (AI music) becomes more prevalent — originating not only from individuals but also commercial services — the need to study it and its impacts becomes important. How can this material and its sources be meaningfully studied and critically engaged with, especially considering the unprecedented scales possible with generative AI? The paper begins to answer this question by considering AI music along seven aspects: 1) the company providing an AI music service; 2) its founders and employees; 3) the use of the service; 4) the users; 5) the algorithms; 6) the music; and 7) the sustainability. We make our discussion more concrete by considering the contemporary AI music service Boomy. While our investigations are preliminary and focused on a single AI music service, we argue that they open several interesting avenues of exploration for many disciplines and their intersections to help prepare for the coming flood of AI music. This paper asks many more questions than it answers, which is a feature (not a bug) of it advocating for a new domain of study: AI Music Studies.
Artificial intelligence, machine learning, music, humanities, business, streaming, industry
The homepage of the AI music service Boomy reads: “Create original songs in seconds, even if you’ve never made music before. Submit your songs to streaming platforms and get paid when people listen. Join a global community of artists empowered by generative music.” A paying subscriber (user) of this service is able to direct it to create “songs” and compile “releases”, which Boomy then distributes to dozens of online services (e.g., Spotify, Amazon Music, and YouTube), potentially reaching many people. Streaming revenue is then collected by Boomy, and a portion is paid to the user.1 Since its inception in early 2019, Boomy appears to have been very active. As of March 16, 2024, 11h25 CET, text at the bottom of the homepage reads: “Boomy artists have created 19,069,450 original songs.”2
Boomy is not singular in the use of generative methods for creating and selling music. In 2014, the Melomics project [1] created “the world’s biggest music marketplace, built by [the supercomputer] Iamus” containing “1 billion songs, most genres”.3 This massive collection is no longer online, but a few examples remain.4 The UK startup Jukedeck (formed 2015), which provided a service for tailor-made and copyright-free AI music, was bought by ByteDance in 2019 [2]. A little more than a year later the customized music generation service Amper Music was acquired by the stock photography firm Shutterstock [3]. The AI music generator AIVA (Artificial Intelligence Virtual Artist), created by a company with a product similar to those of Jukedeck and Amper Music, acquired formal recognition as the first virtual music composer by the authors’ rights society SACEM [4]. The company Endel, focused on the generation of soundscapes for wellness purposes, stepped into yet another economic domain of artistic activity by signing the first deal with a major music label to publish a number of albums of AI-generated soundscapes [5]. Many other companies exist in this space now, including Harmonai, soundraw.io, Splash, Soundful, Aimi, Infinite Album, Loudly, mousika, Riffusion, Beatoven.ai, Amadeus Code, Soundry AI, WaveAI, suno and udio.5
A flood of AI music into music practice, culture and economics is coming. The companies mentioned above particularly target the music industry that is now formed around streaming revenue and monetization of content. In an April 2022 interview [6], Boomy founder and CEO Alex Mitchell predicts:
What does a hundred billion songs per year look like, and how does that shift the market? Whoever is there first and whoever is doing that — the spoils on the other side of that thing are just going to be ridiculous.
Boomy’s aim for endless music generation seems to promise a planetary-level musical spam event [7], where AI music completely swamps the available music online. The ironic result of this could be that for many of the traditional uses and meanings of music [8][9] AI music serves few human, relational purposes, and instead completely focuses on the generation of money for intellectual property owners. In addition, flooding online music distributors with AI-generated content risks aggravating the already stark winner-take-all tendencies that characterize the contemporary music industry [10] as it intensifies competition for listener attention.
Critical work relevant to the study of AI music has been done by Collins in the space of artificial musicians [11], artificial critics [12], large-scale algorithmic composition [13][14], and the subversion of music content retrieval services [15]. Other relevant work includes Bown and Britton [16], which discusses a creative experiment to compose albums of personalised variations at scale. The DarwinTunes project [17] investigated generatively evolving an infinite radio stream using listener preferences as its evolutionary selection criterion. Some have looked at the ethics of the application of AI to music and art in general [18][19][20][21][22][23]. We see a growing body of work related to intellectual property, such as suggestions of possible legal instruments suitable for protecting AI-based works, or arguments against introducing such protection [24][25][26]. Several authors have examined the potential role of copyright and related rights specifically in the domain of AI music [27][28][29][30][31][32][33][34][35], including a handful of critical reflections on what consequences current legal uncertainty brings to the industry and to musical practice at large [36][20][37]. Lastly, there have been calls to include matters of sustainability into the research agenda of AI arts and computational creativity [38][39] and initial attempts to, for example, quantify [40][41] the environmental costs of AI arts and music. However, the extent of this research is very limited, as the environmental sustainability of AI in general has only become an interest in recent years (e.g., [42][43][44][45][46]).
Where is AI music ending up and how is it impacting the world? Who and what are engaging with these materials, why, and how? How does AI music interact with the established music industry, and with musical practice at large? How can AI music systems and their vast bodies of “work” be identified, conceptualized, and critically examined in and as culture [47]? To begin to address some of these questions, this paper considers seven aspects of AI music: 1) the company providing an AI music service; 2) the founders and employees; 3) the use of the service; 4) the users; 5) the algorithms; 6) the music; and 7) the sustainability. While these are not the only aspects possible, they provide a starting point for a critical analysis of AI music. The next section breaks down each of these aspects into numerous questions, information sources and methodologies, revealing rich avenues for exploration. We then discuss these aspects more concretely by considering the contemporary AI music service Boomy. Ultimately, this paper advocates for a new kind of music studies, AI Music Studies, and attempts to outline what such a subject might look like.6
Each of the following seven subsections poses numerous questions related to one of the seven aspects of AI music and briefly discusses methodologies and information sources for such investigations. Our selection of these aspects was not done in reference to any existing taxonomy, but through deliberating on how our varied questions about AI music can be clustered together. These are not the only aspects possible, but we have found them useful for establishing a framework for our analyses.
A company is an organization focused on commercializing a service or product. When it comes to a company whose service or product is AI music, how does the company work, and how does AI music figure into it? How is the company making money from AI music, and what is the company’s relationship to music in general? What is its business model? What is driving investment in the company? How does the company treat issues around intellectual property? Answers to many of these questions are typically not openly available and so must be surmised from a variety of sources, e.g., the company website, job advertisements, industry news (e.g., Crunchbase), the terms of service, public talks from management, and back-of-the-envelope calculations.
Founders are entrepreneurs who establish an AI music company and solicit investments. Employees are paid laborers of the company. What are the visions and values of the founders, and what motivates them? What are the skills and values of the employees? How do the employees relate to the music ecosystem? How do the visions, values and skills shape the service? A detailed analysis of the culture within a company can help one understand why products are shaped the way they are. Examples include the ethnography of AI researchers developing medical support systems by Forsythe [48], and Seaver [47] investigating the mindsets of the employees of a music streaming company. Common to these and related works is that they deliver a fine-grained picture of the relations between the people developing technology, their values, and the strengths and shortcomings of the technology they produce. While it may be ideal to conduct a long-term ethnography, the investigation of textual sources such as interviews and social media discourse can provide valuable insights into the culture of an enterprise or institute, e.g., as has been done for Spotify by Eriksson et al. [49].
The use of an AI music service is its application to create and perhaps distribute music. How is an AI music service used? What is the procedure from the user perspective? What options are available, and how much control can be taken away from the automation? In what ways can this be seen as “musicking” [50], and how meaningful is it for music? Answers to these questions provide parameters for the analysis of the resulting music, e.g., dimensions of style, instrumentation, and rhythm. They also provide insight into the kinds of users targeted by an AI music service. Studying the use of such services can provide perspectives on how the role of cultural production in society clashes with the values of entrepreneurial efforts to automatically generate and distribute billions of new songs a day. Sources of information for this aspect can be one’s own use of the service (autoethnographic), discussion forums of users, and “how-to” posts from the service.
A user of an AI music service is one who engages with the service. It could also be one who listens to AI-generated music. Who are the users of AI music and services, and why do they use them? How do they use them? Who are the listeners of AI music, and how do they listen? How do users discuss the outputs they elicit from such a service? What is their scale of operation, and how is that reflected in the artefacts they produce? How is a user’s use of AI music entwined with cultural norms, values, and practices? Sources of information for this aspect can be discussion forums of users, and interviews (ethnographic). Important methodological considerations involve the typically anonymous online existence of many users. Accurate demographic details of users are difficult to ascertain online, and so necessitate live interviews.
An algorithm of an AI music service is a computational procedure it executes in creating music. What algorithms is an AI music service using to generate its output? How do the algorithms consider a user’s music knowledge? How do they treat specific musical dimensions, alone and in combination? If machine learning is being used, what is the training data, how was it assembled, and by whom? Does the service employ algorithms for plagiarism detection, both with reference to existing non-AI music, and AI music generated by the service and others? To what extent is “reverse engineering” allowed for a user, an academic doing research (e.g., see [49]), or a competitor? Sources of information for this aspect are typically not accessible given the closed nature of AI music services, and so their characteristics must be surmised through engaging with the service and analyzing its output.
The music of an AI music service is the material it generates with which one engages by listening, or using in other ways (e.g., composing). What are the qualities and conventions of this music? How diverse is this world of expression? How meaningful and useful are methods of (ethno)musicology, music theory, and sound studies, to name only a few disciplines, for answering these questions, where the scale will eventually involve the analysis of hundreds of thousands to millions of artifacts? Methods of music information retrieval (MIR) are applicable here [51]; however, major methodological problems revolve around data collection. Faced with AI music flooding streaming services each day, which tracks should be selected for study, how many, and how should they be selected? When an AI music service is undergoing development, how frequently should its music be sampled? And given the apparent impermanence of the AI music generated and then distributed by some AI music services, how should the music be treated? Should it be downloaded and archived immediately to save it from the apparent whims of digital service providers deciding to strike AI music from their catalogues [52]?
The sustainability of an AI music service refers to its impacts on environmental, economic and cultural dimensions of the wider world. Environmental sustainability of AI music is deeply intertwined with the above six aspects of analysis, but we isolate it here in order to highlight its considerable importance and unique related challenges. It is our stance that sustainability should be one critical lens to analyse and make sense of AI music, including questions of environmental ethics. Relevant questions include: what motivates the investments in an AI music business? Are they motivated by striving towards a more sustainable state of things in a wider, global perspective, or by something else? How are such businesses prioritizing values of sustainability across their organization, business models and products, if at all? There are varying social and material dimensions that require analysis in order to understand AI music from the perspective of sustainability. These dimensions touch upon social values and aesthetics, as well as how those are mediated by and reflected on in the designed materiality. These dimensions are interconnected in socio-material entanglements. The core question to guide such sustainability-oriented inquiries is: what is the nature of culture, practices, and materiality surrounding AI music, and are values, aesthetics, and practices that prioritize sustainability an integral part of them?
Let us now turn to a specific AI music service with respect to each of the seven aspects just discussed. We choose to focus on the contemporary AI music service Boomy for several reasons: 1) it provides a complete generation-to-distribution pipeline to its users; 2) it is one of the oldest and still-existing commercial AI music services (est. 2019); and 3) a large amount of material about it is available for study. We only discuss one question from each aspect of analysis due to space limitations.
One can surmise the extent to which Boomy is making money from streaming of the AI music it distributes. The Boomy Discord server7 has the channel “share-your-songs” where users post links to their releases. Looking through the posts between April 1st 2022 and March 31st 2023, we count 86 artists posting such links, and then record their monthly Spotify listener counts (as of April 11 2023). The total number of monthly listeners for these artists is 11,047.8 Let us optimistically assume that each of these monthly listeners listens to ten Boomy songs in a month. Given that the average revenue from a Spotify listen is USD0.004,9 this implies an income to Boomy of 20% of about USD400, or USD80. Assuming that Boomy distributes to 40 other streaming services paying the same, and that there are just as many listens on them of the same music (which is certainly not true, but nonetheless provides a best-case scenario for streaming revenue to Boomy), this makes a monthly streaming income to Boomy of about USD3,000 from the songs of these 86 artists.
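The back-of-the-envelope estimate above can be reproduced in a few lines. The sketch below is only illustrative: the function and variable names are ours, the inputs are the figures quoted in the text, and the count of 41 services (Spotify plus the 40 others assumed above) is our reading of that assumption. Carried through without rounding, the figures come out somewhat higher than the rounded values quoted in the text (roughly USD 442, USD 88 and USD 3,600), which does not change the order of magnitude.

```python
def streaming_income(listeners, plays_per_listener, revenue_per_stream,
                     service_share, n_services):
    """Best-case monthly streaming income under the assumptions in the text.

    Returns a tuple of (gross revenue on one streaming service,
    the AI music service's share on that one streaming service,
    that share summed over all streaming services).
    """
    gross_per_service = listeners * plays_per_listener * revenue_per_stream
    share_per_service = gross_per_service * service_share
    total_share = share_per_service * n_services
    return gross_per_service, share_per_service, total_share


# Figures quoted in the text; 41 = Spotify + 40 other services (our reading).
gross, share, total = streaming_income(
    listeners=11_047,          # summed monthly Spotify listeners of the 86 artists
    plays_per_listener=10,     # optimistic assumption from the text
    revenue_per_stream=0.004,  # average USD per Spotify listen, as quoted
    service_share=0.20,        # portion retained by Boomy, as quoted
    n_services=41,
)

print(f"Gross on Spotify:   USD {gross:,.2f}")
print(f"Share on Spotify:   USD {share:,.2f}")
print(f"Share, 41 services: USD {total:,.2f}")
```

Changing any single input (for instance the number of plays per listener) shows how sensitive the estimate is to the optimistic assumptions.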
It seems then that Boomy can obtain revenue from streams of the AI music it distributes; however, Mitchell discusses [53] Boomy “A&R” (artists and repertoire team) as listening to nearly all the songs being saved by users, approving their release, and curating playlists. This team appears to have at least two employees.10 Conceivably this work requires merely checking whether the metadata given to releases contains offensive material, but it could also entail auditioning the outputs, which would take much more time. One can then surmise that the cost of labor in listening to and approving the released music and metadata would exhaust the monthly income from streaming due to the sheer number of submissions by users — most of which will probably not contribute streaming revenue at all.
Of his vision for the company, Mitchell refers many times to the success of another product [53]: “What if you could do the iPhone 12 camera but for music, and then tie it into the global rights system so that people can make money from it?” He refers to Boomy as currently being “iPhone 3”, but on its way to the latest generation. Mitchell specifies:
the core principle of what we want to achieve is: I’m a person [with] a smartphone anywhere in the world with any internet connection, I can create music that is as good as what’s coming out of major record labels in 10 seconds.
Boomy’s appeal to accessibility (often termed “democratisation”) is common in the literature on computational creativity and creative AI. A great deal of academic literature in the history of the subject has considered accessibility a key objective for developing creative systems [54]. In the commercial sector, accessibility is of primary importance to investors because it appeals to a massively scalable business model. There is no doubt that AI is resulting in very interesting options for musicians and artists, but Mitchell’s claim ignores decades of different social and educational approaches to making music-making accessible, in everything from public school music programs to avant-garde programs aimed at expanding musical participation: work that gives people ownership of the music they make, and the right to determine the nature of their involvement. Rather, by significantly reducing the need for musical expertise, technical competence, and/or education on the part of users, all of which are cast as barriers to access, the service that Mitchell envisions is difficult to distinguish from a form of technologically-mediated deskilling [55]. In line with historical and contemporary forms of workplace automation [56], the offloading of certain forms of creative activity to the company’s proprietary system restructures the kinds of musical work that are left to users, with the monitoring of the AI outputs becoming one of their tasks.
To explore Boomy as a user with the “free” subscription plan, on March 25, 2023 at 10h35 CET we had Boomy create a song in the style “Electronic Dance”, and sub-style “Warehouse groove”. The only edit we performed was to add vocals, which involved recording 16 seconds of nonsensical vocalizations. The result is a 1m50s song (Audio 1), which we saved to our library and selected for release. We then entered a release title (“Stonks go up”), an artist name (“Stonky Stonks”), and entered a word to generate the album art (“stonks”). We selected “Misc / Undefinable” in the “genre” field and finally specified the lyrics as “clean” (rather than “explicit”). Two weeks after we submitted it for release, “Boomy Distribution” sent mail on April 9, 2023 at 00h08 CET stating that it had been approved and would soon be distributed.11 The email specified a link to a tracking page for the release, which provides a visualization of the “Trending Streams” for eight different streaming services.
Investigating our song on these different services soon after its release revealed some interesting details. The song appears on several of them with the date April 6, 2023. For the credits for the release on Spotify we find these details: “Performed by Stonky Stonks; Written by Arthur Mathias Wolf, JINSOO KIM; Produced by —; Source: Boomy Corporation.” Deezer also specifies for the song, “Composer: Jinsoo Kim - Arthur Mathias Wolf”. It seems that all songs distributed by Boomy to Spotify and Deezer have this peculiar attribution.12 While Boomy says that it distributes all releases to many companies, it does not claim that all releases will be made available on them. Some editorial efforts appear to be in effect. As of October 19, 2023 our release is no longer available anywhere.13
Of the users of the Boomy music service, Mitchell [53] claims “85% of [them] tell us it’s the first time they’ve ever made music.” The number of these Boomy users is not clear, but the abstract for Mitchell’s 2022 talk at GITEXPLUS, titled “Boomy’s AI case study: Enabling the next billion songs” mentions:14 “over 500,000 creators make, publish, and monetize instant songs”. The Boomy Discord server15 reveals other interesting details. With only a few messages per day on average, the server is quiet. The difference is quite stark considering the hundreds of posts per day seen on the Discord servers of other AI music services, e.g., suno.16 A core group of about a dozen Boomy users appears to be contributing most messages on the Discord, typically sharing their own Boomy music and providing feedback to others. Reading through other channels of the Boomy Discord server shows some sense of a community, resembling in this case what Shelemay [57] describes as an “affinity” musical community, that is, a collective formed when individuals engaged with a particular “musical style or tradition” are magnetically drawn to one another.
Hour-long “Weekly Workshops” organized by Boomy A&R17, which provide opportunities for users to receive feedback on their Boomy creations, occur on the Discord server, but only a few users actively participate at most. Regular competitions organised by Boomy appear to be well-received among some users. A competition called “BOOmy Spooktacular Contest” was won by the user Wobinn on October 31, 2023 for their song “I’m on fire”,18 which features them singing/rapping over the Boomy-generated track (see YouTube video below).19
Section 5 of the Boomy Terms of Use explicitly prohibits investigating how it is generating music: “In connection with the Site and Service, you must not: Reverse engineer, decompile or disassemble any portion of the Site or Service, except where such restriction is expressly prohibited by applicable law”. We may nevertheless surmise based on the high fidelity of its output that it is not generating audio via neural synthesis, such as done by AI music services like suno.ai or Stable Audio. The sounds appearing in Boomy music seem to come from more established synthesis methods, such as sample-based wavetables, and VST instruments. Based on listening to many songs generated with Boomy we might also surmise that it is employing some kind of template-based approach, where chord progressions and rhythms are selected from sets unique to each Boomy style, and assembled probabilistically (e.g., a Markov model). It is also clear that when a vocal sample is uploaded to a Boomy track (such as Stonky Stonks Audio 1), the service performs some equalization and mastering of the results to make it blend with the backing track. Much more investigation is necessary to uncover the details of Boomy’s music making algorithms.
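To make the surmise about template-based assembly concrete, the following sketch draws a chord progression from a first-order Markov chain whose states and transition weights belong to a single style template. Everything in it is hypothetical: the template contents, the chord names and the function name are invented for illustration, and nothing here is known to reflect how Boomy actually generates music.

```python
import random

# Hypothetical style template: a start chord plus weighted transitions.
# The chords and weights are invented for illustration only.
STYLE_TEMPLATES = {
    "warehouse_groove": {
        "start": "Am",
        "transitions": {
            "Am": {"F": 0.5, "G": 0.3, "Am": 0.2},
            "F": {"G": 0.6, "Am": 0.4},
            "G": {"Am": 0.7, "F": 0.3},
        },
    },
}


def generate_progression(style, n_bars, seed=None):
    """Sample one chord per bar from the style's first-order Markov chain."""
    if n_bars < 1:
        raise ValueError("n_bars must be at least 1")
    template = STYLE_TEMPLATES[style]
    rng = random.Random(seed)  # local generator, so a seed gives repeatable output
    chord = template["start"]
    progression = [chord]
    for _ in range(n_bars - 1):
        options = template["transitions"][chord]
        # Pick the next chord with probability proportional to its weight.
        chord = rng.choices(list(options), weights=list(options.values()), k=1)[0]
        progression.append(chord)
    return progression


print(generate_progression("warehouse_groove", n_bars=8, seed=1))
```

A system of this kind would produce many distinct outputs from a small set of hand-built templates, which is one way the variety heard within a style could arise alongside its recognisable sameness.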
We now want to explore the similarity between a collection of Boomy-generated music and a collection of human-crafted music, using computational procedures of music information retrieval (MIR) [51]. The two collections we compare are: 1) “500 Ways to Have Fun” by Paperboy Prince & The Boomy Community,20 which contains 500 tracks with a total duration of 14h49m (denoted as “Boomy500”); and 2) The 500 Greatest Songs of All Time,21 a list assembled by Rolling Stone magazine in 2021 (denoted as “RS500”), which has a total duration of 31h10m. Our choice of these two collections is driven by a few considerations: they are both curated; they have the same size (in terms of songs); and they are highly dissimilar and so provide a sanity check for the results. Furthermore, since this section merely serves to illustrate how one might apply an MIR pipeline to analyzing large collections of AI-generated music, the sensibility of the choice of data to compare is less important.
From the audio of all 1000 songs we compute features from the latent representations computed by the MusiCNN autotagging model trained on the MagnaTagATune dataset [58]. The penultimate layer of MusiCNN computes a length-200 vector for each 3-second segment of an input audio recording. We compute the mean and standard deviation of each dimension over all segment vectors extracted from an audio file, which results in a length-400 vector describing the recording. We compute these features for each track and then apply Uniform Manifold Approximation and Projection (UMAP) [59] with 5 neighbours, a minimum distance of 0.3, correlation as the distance metric, and a projection onto two components. Image 1 shows the results.
We see two clusters clearly emerge, each closely associated with one of the two collections. Some songs from each collection do appear in the other collection’s cluster, however. We identify nine Boomy500 songs that are within the cluster made up mostly of recordings in RS500, including the tracks titled, “Beak” by Fart 12 (track 239), “Deez Nuts” by Paperboy Prince (track 252), and “Dookie” by Paperboy Prince (track 414). We also identify three RS500 songs that are in the cluster made up mostly of recordings in Boomy500, including the tracks titled, “I’ll Take You There” by The Staple Singers, “I Shot The Sheriff” by Bob Marley, and “Little Wing” by Jimi Hendrix. These three songs do not sound significantly different from the rest of RS500, and are quite different from Boomy500. More investigation is needed to understand the results above.
At the time of this analysis, Boomy claims that users have generated 19,910,546 “original songs” through its platform.22 This raises concerns about the increased scale that AI technology enables for producing music, in comparison to less technologized forms of music-making. In the case of Boomy, it is not transparent to the user what the environmental impact of their actions is when generating music using the service. Such transparency could, however, be a core part of the design of the system if environmental sustainability were prioritized in the design processes of Boomy, as has already been brought up in the context of other AI arts technologies [38][60][61]. The uncomfortable truth is that if acquiring profits or catering to user needs are what dominantly motivates the development of AI music technology, dimensions of sustainability will be deprioritized. In the case of Boomy, its prospects for profitability would seem to depend on reducing music to “interchangeable intellectual property”. This raises troublesome questions relating to sustainability, e.g., how can the computational cost of such profit-oriented generation and exchange of digital data be justified? Similar concerns of energy consumption (and environmental impact in terms of CO2) have been raised earlier in the context of blockchain and cryptocurrencies [62] but are also centrally present in the case of AI music, and specifically Boomy in how it operates around a massive production and exchange of digital data.
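The pooling and projection steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the analysis: the function names are hypothetical, the segment embeddings are replaced by random numbers standing in for MusiCNN penultimate-layer outputs, the standard deviation is taken as the population value (the text does not specify), and the projection assumes the umap-learn package, imported lazily so that the pooling step can be run without it.

```python
import numpy as np


def pool_segment_embeddings(segment_embeddings):
    """Summarise one track: (n_segments, 200) array -> length-400 vector.

    The first half holds the per-dimension mean over segments and the
    second half the per-dimension (population) standard deviation.
    """
    x = np.asarray(segment_embeddings, dtype=float)
    if x.ndim != 2 or x.shape[0] == 0:
        raise ValueError("expected a non-empty 2-D array of segment embeddings")
    return np.concatenate([x.mean(axis=0), x.std(axis=0)])


def project_2d(track_features, random_state=0):
    """Project (n_tracks, 400) features to 2-D with the settings in the text."""
    import umap  # from the umap-learn package; assumed to be installed

    reducer = umap.UMAP(
        n_neighbors=5,
        min_dist=0.3,
        metric="correlation",
        n_components=2,
        random_state=random_state,  # our addition, for repeatable layouts
    )
    return reducer.fit_transform(track_features)


# Synthetic stand-in for one track: 37 three-second segments of 200 values.
rng = np.random.default_rng(0)
fake_track = rng.normal(size=(37, 200))
vec = pool_segment_embeddings(fake_track)
print(vec.shape)
```

Stacking one such vector per track gives the (1000, 400) matrix that `project_2d` would take as input; because UMAP is stochastic, the layout can vary between runs unless a random state is fixed.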
This paper has identified and provided preliminary investigations into the critical study of AI music, envisioning a new kind of music study, AI Music Studies, that draws on multiple disciplines, including engineering and computer science, (ethno)musicology, economics, sociology, science and technology studies, computer music, and business analytics. Many more questions have been asked than answered, but the overall aim of this paper is to argue that a new kind of music studies is needed, and to sketch what it can look like. We have focused on one particular AI music service to make our discussion more concrete. Through our analysis we see that Boomy is a company born out of the current music industry that mainly views music as a product to be sold and distributed, leveraging AI technology that shows the potential to exponentially accelerate, and possibly obliterate, established systems of music creation and consumption by its mere scalability. For AI to ultimately have a constructive role in music practice, culture and economics, it needs to be part of adopting a broader understanding of music as an active practice of “musicking” [50] rather than a commodity.
One might be motivated to think of AI as a new force in established “music ecosystems” [32]. The notion of music ecosystems has been developed in ethnomusicology to prioritize challenges of preserving musical diversity in a time of globalization, commodification and commercialization of music. According to Titon [63], a music ecosystem is defined by four aspects: diversity, limits to growth, interconnectivity, and stewardship. Schippers [64] further builds on Titon, and defines the music ecosystem after Tansley [65]:
the whole system, including not only a specific music genre, but also the complex of factors defining the genesis, development and sustainability of the surrounding music culture in the widest sense, including (but not limited to) the role of individuals, communities, values and attitudes, learning processes, contexts for making music, infrastructure and organisations, rights and regulations, diaspora and travel, media and the music industry [64].
Aligned with these notions of a music ecosystem are also underlying moral claims on their aspects. By diversity Titon refers to the ethnomusicological motivation of a global conservation of musical cultures, as the greater the diversity among species in the ecosystem the better chance of survival for any part of the ecosystem and the system as a whole. By limits to growth, he refers to the negative impact that continuous growth has on diversity. When resources are concentrated on certain parts of musical cultures others are suppressed. There need to be limits to growth to find a balanced yet continuously evolving relationship between the dimensions in the ecosystem. Interconnectivity stresses the need to recognize the reciprocal relationship between individuals and communities, and between different communities and multiple music ecosystems at large. Interventions cannot be made to one dimension without consideration of its interconnection with and effects on others. Finally, stewardship means caring for something that is not owned by anyone. It highlights that music has a value beyond ownership and commodification, for instance as seen in grassroots and amateur music practices.
The metaphor of the ecosystem in music might be convenient, but it might also be misleading. While the metaphor highlights important interrelations between the actors and dimensions involved in music as practice, culture and economics, it lacks a firm parallel in how these aspects are fulfilled in real life. Some scholars, like Clancy [32], view AI as an interfering force in the music ecosystem. This presumes that music practices were sustainable and diverse before the entrance of AI. With Titon’s four aspects in mind, and with Schippers’ identification of the different dimensions of a music ecosystem, we are far away from a music ecosystem in which AI can have a sustainable position. The music industry itself hardly fulfills these requirements considering the established infrastructure of music consumption and monetization on streaming that already disrupts musical diversity [66][67].
A challenge for AI Music Studies then is to find more appropriate metaphors and concepts to critically analyse AI music, the existing music practices in which it takes part, and the new practices that might come from AI technology. AI Music Studies needs to recognize the complexities of current music practices in terms of values and attitudes, aesthetics, relationships between individuals and communities, economics, and technological innovation. It needs to understand how AI is a product of our current time and reflect on how it latches on to already disruptive music practices. Morris [68] describes an aspect of this as the “optimization of culture”, where music is primarily treated as goods to be distributed and used on certain platforms such as Spotify, which forces musicians and creators to think as software engineers in order to compete for streams. The entrance of AI accelerates the optimization of culture by both scale and speed. On the other hand, AI Music Studies also needs to reflect on the potentials of AI to be a force working towards more diverse and sustainable ways of “musicking”. Engaging the interdisciplinary perspectives of AI Music Studies is a necessary step towards this vision, and a careful application of metaphors and concepts is required in order to prepare for the coming flood of AI music.
This work was supported by the European Research Council under the European Union’s Horizon 2020 research and innovation programme (MUSAiC project, Grant agreement No. 864189), and by the Wallenberg AI, Autonomous Systems and Software Program – Humanities and Society (WASP-HS) funded by the Marianne and Marcus Wallenberg Foundation (Grant 2020.0102).
Since we have focused exclusively on a single company to understand how it is operating, some may see this paper as a “hit job” on this company, e.g., that it is “exposing” the company for taking advantage of contemporary incentives of digital service providers (pay-per-play streaming). Our intent is not to castigate this company, but to honestly portray how one entity marketing AI music is operating in the broader business of content streaming. None of the authors of this paper have financial incentives in any AI music business, and thus do not have any conflicts of interest. None of the data used in this paper come from a purposeful deception of the company or of its users. Most of the information we cite is public facing and accessible and can be readily confirmed (the exception being our user experience in creating Stonky Stonks Audio 1). We make sure to be explicit where we are surmising in the paper, e.g., on the financial aspects of the company. Our motive in writing this paper, and using this particular company as a case study, is purely to motivate the critical study of AI music and its impacts on the world.