Why We'll Get Good At Detecting AI
Hello friends,
And welcome free subscribers to your weekly preview. The main story is all about YouTube’s new project to simulate John Legend legally. And it prompts me to wonder if “AI music” is going to become an insult soon.
Paid subscribers keep going for some takes on Nvidia working with Microsoft on games, Apple MagSafe coming to Android and more.
Cheers,
Tom
Main Story
DeepMind and YouTube release Lyria, a gen-AI model for music, and Dream Track to build AI tunes
DeepMind announced a music generation model called Lyria that YouTube creators can try.
Creators will get two experimental toolsets. Dream Track can create music to be used in YouTube Shorts. And Music AI can turn your humming into music. DeepMind also announced SynthID, a way to watermark generated music so you know it's made by a machine.
Dream Track got the consent of some artists to let users create 30-second tracks in their voice and musical style. The artists on board include Alec Benjamin, Charlie Puth, Charli XCX, Demi Lovato, John Legend, Sia, T-Pain, Troye Sivan, and Papoose. Users enter a topic, choose the artist and the tool outputs the 30-second track. The tool is in a limited release to selected creators.
Music AI can create music based on an instrument you specify, take your humming and turn it into a track or use chords you give it to create a choir or ensemble. It can also create backing instrumentals and vocals for an existing vocal track you provide. Music AI Tools will come out later this year.
YouTube is neither the first nor the last. Ghostwriter is famously the source behind the spoofed Drake track from earlier this year. Meta open sourced its music generator in June and Stability AI launched one in September. TechCrunch notes that startups like Riffusion are also out there.
Then there's SynthID, a watermark embedded in music generated by DeepMind's tools. DeepMind says it will be inaudible to humans, but still detectable by machines even if an audio track is compressed, sped up or slowed down. DeepMind says it converts the audio wave into a 2D visualization, adds the watermark, and then converts it back into a one-dimensional waveform.
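DeepMind hasn't published SynthID's audio internals, but the round trip it describes can be sketched in a few lines of Python. This is a toy illustration only: the random pattern, the strength value and the correlation-based detector are my stand-ins (assuming numpy and scipy), not the real algorithm.

```python
import numpy as np
from scipy.signal import stft, istft

RATE = 16_000
STRENGTH = 0.02                    # tiny perturbation, aiming for "inaudible"
rng = np.random.default_rng(0)

def embed(audio):
    """1-D waveform -> 2-D time-frequency view -> add pattern -> back to 1-D."""
    _, _, Z = stft(audio, fs=RATE)            # complex spectrogram (2-D)
    pattern = rng.standard_normal(Z.shape)    # pseudo-random watermark pattern
    Z_marked = Z * (1 + STRENGTH * pattern)   # nudge each bin's magnitude slightly
    _, marked = istft(Z_marked, fs=RATE)      # back to a 1-D waveform
    return marked[:len(audio)], pattern

def detect(audio, pattern):
    """Score: correlate the spectrogram magnitudes with the known pattern."""
    _, _, Z = stft(audio, fs=RATE)
    mag = np.abs(Z)
    r = min(mag.shape[0], pattern.shape[0])
    c = min(mag.shape[1], pattern.shape[1])
    return np.corrcoef(mag[:r, :c].ravel(), pattern[:r, :c].ravel())[0, 1]

# A one-second 440 Hz test tone stands in for a generated track.
tone = np.sin(2 * np.pi * 440 * np.arange(RATE) / RATE)
marked, pattern = embed(tone)
```

The point of working in the 2-D spectrogram domain is that a pattern spread across time and frequency can survive compression or tempo changes far better than bits hidden in individual samples, while staying too small to hear.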
MY TAKEAWAY: This stuff is coming even faster than I thought. It's fascinating to see companies working as hard to reassure users of their responsible approach as to impress us with what they can do. But here's another thing I think. In a year we'll be complaining about the shortcomings of this stuff. Listen to that Charlie Puth track they used as an example. Then listen to Charlie Puth. Then go back and listen to the generated one again. You'll definitely start to teach yourself the difference. We're all about to become literate in discerning the fine points of what generated music sounds like. Just like we are learning to look for signs of generated images. I wouldn't be shocked if "generated" or "AI" becomes a derogatory term for actual art in the future, describing how it sounds or looks cheap.
Other Stories - Worth Your Attention
Keep reading with a 7-day free trial
Subscribe to Tom Merritt Tech Newsletter to keep reading this post and get 7 days of free access to the full post archives.