The Crowd Misses The Data Train [304]

Hi Crowd!

If your life intersects at all with any musicians, you have likely heard about this Atlantic piece on the millions of songs found in the datasets used to train a number of generative music AI models. The immediate outrage, of course, is that this was done without the knowledge or permission of any of the musicians whose music was used. The Atlantic even created a searchable database, so you can enter any band or musician name and see exactly which songs were found in the training data. One important detail: this isn’t comprehensive. Finding your music in there means it was used for training, but not finding your music in there does not mean it wasn’t. The database only covers the datasets that were found and explored, and there are certainly many others.

I wrote a lot about training data and the ethics of using work without permission in part 3 of my recent series, so I won’t repeat it here. Suffice to say this is a real concern people have, and one that isn’t going away anytime soon. A quick scan of the socials in the wake of that Atlantic piece shows a flood of people demanding compensation, asking about class action lawsuits, and expressing general outrage. Which is to be expected, and I get it. I also don’t imagine it will go anywhere, as that ship sailed long ago.

If you only read one follow-up to this, it should be Mat Dryhurst’s “So your music helped train an AI music model.” Side note: he annoyingly published it as an X article, though I guess if that’s where the conversation was happening, he chose to post it there to be a part of it rather than somewhere else that people might not have seen. But that’s a rant for another time. Mat makes a number of really good, thoughtful, and pragmatic observations, not the least of which is that a number of the people upset about this just want AI to go away, and any solution which isn’t that won’t be enough for them. (Spoiler: that’s not happening.) He also digs into the math of actually paying people for usage, and makes it fairly clear why no one will be happy with those numbers either.

And while the “without permission” detail is a real sticking point here, and it should be, “with permission” isn’t an easy fix either: 31 music organizations have just published an open letter warning against labels and publishers that are signing licensing deals in an attempt to at least have things happen above board. As someone who has been trying to navigate this objectively for a while now, I can say it still feels very messy, and a lot of that rests on the players involved having drastically different expectations and assumptions, and essentially speaking different languages.

Some positive context perhaps, from musician James Blake:

Also positive: by all accounts the 0 10 (zero ten) pavilion at Art Basel was a smashing success, and perhaps more history- and education-focused than the last incarnation, in an attempt to fully cement the current era of artists using blockchain/AI within the well-established lineage of artists playing with digital. In Basel, William Mapan (pictured above in the header) and 0xDEAFBEEF especially saw much well-deserved attention. And La Random published a wonderful interview with the Thoma Foundation. If you aren’t aware, Carl & Marilynn Thoma are some of the largest art collectors / patrons in the world, and noteworthy here for their significant attention to digital works.

If nothing else, it’s an interesting moment right now, and one worth paying attention to.

-s


June 22, 2026 Sean Bonner
