PIONEER: GenAI is revolutionizing audio and music production | Jay LeBoeuf
Generative AI heralds a step change for human creativity. A conversation with JAY LEBOEUF of Descript
Happy Thursday,
🎙️ Dropping into your inbox with another PIONEER conversation.
The story of Generative AI thus far has been dominated by image and text generation – with the latter emerging as a medium that is already an extreme disruptor. (Wow… I have so much to say about large language models and how the ‘text generation space’ is evolving… more on that in tomorrow’s newsletter.)
But today, I want to focus on audio because Generative AI will revolutionize the entire audio domain. There are three main strands of disruption:
Voice cloning (see my interview with Alex Serdiuk of Respeecher);
Audio Production (which we cover in-depth here);
And actual music generation (I will come back to this).
Synthesising information that hasn’t occurred
I was delighted to speak to a friend, colleague and true ‘sound’ pioneer on Generative AI and audio production.
🎶 Jay LeBoeuf is a musician, engineer and researcher. The Head of Business and Corporate Development at Descript, he is also a lecturer in music and AI at Stanford University. Jay has spent the past 20 years at the intersection of media, technology, and AI. He’s consistently witnessed how machine learning makes audio and music production easier — Generative AI, however, takes it to the next level.
Jay and I have both had ‘revelatory moments’ when we realised we were witnessing the birth of a new class of AI. For Jay, the epiphany came in 2017 (years before the term ‘Generative AI’ had been coined) when he encountered one of the earliest voice cloning companies. He came to the profound realisation that this technology could synthesise information “that has not occurred before.”
I define Generative AI as a synthesis of information that has not occurred before. And, it could be filling in the blank spaces that… just aren't there.
The death of creativity?
A key part of our discussion concerns the philosophical debate over whether AI heralds the death of human creativity. ⛔ We say nay!
While it’s easy to hear ‘AI-generated audio’ and imagine a future where machines replace creators, producers, and musicians – Jay and I think the opposite will happen. As AI takes over some of the slow, expensive and technical aspects of production, it will make the process easier, cheaper, better and more accessible.
⬆️ This in turn will allow more humans to create more and to create better, resulting in more producers, musicians and artists. While there is no doubt that AI will automate parts of creative workflows (leading to the loss of specific jobs), the overall effect will still be ‘more’.
I’ll use my own experience as an example.
I’ve been using Descript to help produce my video content (Jay’s been helping me navigate that). One of the magical things the platform does is use AI to generate transcripts of interviews.
Although they are not 100% accurate, AI can do most of the heavy lifting. So while I might not need a human transcriber, I have become a video creator myself. Moreover, I quickly realised that even with AI, I need to work with producers. (AI alone is not enough!)
🟰 The net effect? More creativity, more creators, and more work for creatives.
As the technical parts of production become less important, the benchmark for creativity will shift towards the human element: who’s got the best idea? Who’s got the creative vision? And with the playing field levelled in this way, we will also see the emergence of new voices that have not been heard before. As Jay notes, that’s an exciting prospect.
🎆 So rather than automating creativity — AI will unleash a Cambrian explosion of creativity. With no monopoly on creativity, I expect to see the following:
A massive expansion of the creator economy
More creative work and new creative jobs (e.g., AI Transcription Editor, AI Copy Editor, AI Video Editor, Creative Prompt Director)
Elevated storytelling and content
☯️ Those are all the good bits. But, as is a consistent theme in my analysis — with the Yin comes the Yang. The democratisation of creativity will, no doubt, also lead to:
More people using creative impulses to do ‘bad’ stuff (e.g., fake porn, fraud, impersonation)
More mediocre, poor and spammy content (not everyone will be ‘elevating’ their craft)
Social anxiety and upheaval due to economic transformation
Beyond creativity and automation, Jay and I have a wide-ranging conversation, from the technical aspects of AI-led audio production to the ethical implications of Generative AI more broadly.
I think you’ll enjoy it.
In this episode we cover
[00:25] - Background: Jay LeBoeuf
[01:40] - Machine Learning for Music
[05:34] - Generative AI
[08:27] - Lyrebird: Voice Cloning
[14:01] - Generative AI for Audio and Music Production
[16:01] - Interfaces for Control
[18:57] - Customer Reception of Generative AI
[24:20] - Democratization and Quality of Craft
[26:10] - Ethics
[27:51] - Authentication of Content
[30:37] - Integration within the traditional tool stack
[32:11] - Fine Tuning and Personalization
[41:47] - Software and hardware advances
See you tomorrow with Friday’s #EGAI newsletter.
Namaste,
Nina