The day when technology can recreate your voice to say anything, no matter how insane, is pretty much here. A group of engineers has created an AI program that can convincingly mimic the voice of podcast host Joe Rogan.
You can hear the results in the video below. The voice not only sounds identical to Rogan's, but it also delivers almost every word with a natural-sounding rhythm.
However, the voice is artificial. The engineers at the AI company Dessa can get their fake Joe Rogan to say anything; all they have to do is type in the text. In this case, they made the AI-powered voice talk about sponsoring a hockey team full of chimpanzees and also tout the benefits of being a robot.
The video will probably both amuse and disturb you. And that’s the point. “It’s pretty f*cking scary,” Dessa wrote in a blog post about the achievement, which was created using a text-to-speech synthesis technology they’ve dubbed RealTalk.
“In the next few years (or even sooner), we’ll see the technology advance to the point where only a few seconds of audio are needed to create a life-like replica of anyone’s voice on the planet,” the company added.
#RealTalk: Thanks for lending us your voice, @JoeRogan! We knew you would get how serious AI really is, and appreciate your help spreading awareness about its potential, which is at once incredible but also dangerous. https://t.co/kY31spaBdJ pic.twitter.com/UYggRaj4eT
— Dessa (@dessa) May 17, 2019
The demo is just the latest example of how AI-powered algorithms could be exploited to spread misinformation and fool the public. Researchers have already shown you can use the technology to create realistic-looking but ultimately fake images and videos of people, including politicians and celebrities. US lawmakers are concerned these AI-generated “deepfakes” might one day open a new front in information warfare.
Dessa’s text-to-speech synthesis technology could be used for malicious purposes as well. Imagine scam phone callers with the ability to impersonate your family members. Or hackers creating audio fakes of a politician saying offensive things in order to interfere with an election.
Of course, the same technology could be used for good; as examples, Dessa points to automatic voice dubbing for movies in any language, as well as more natural-sounding smart assistants. However, it’s clear society needs to ensure synthetic speech technologies are developed responsibly, before they become a danger, the company said.
“It’s not outlandish to believe that the implications we mentioned (and of course, many more) will soon make their way into the fabric of society,” Dessa warned.
The scary implications are why Dessa is declining to release the technical details behind their AI-powered text-to-speech synthesis technology. “To work on things like this responsibly, we think the public should first be made aware of the implications that speech synthesis models present before releasing anything open source,” the company said.
That said, the company plans on posting a technical overview of their AI technology in the coming days. In the meantime, Dessa has created a website that tests whether you can pick out real audio clips of Rogan from the fake ones generated by their AI.
In response to the demo, Rogan himself tweeted: “This could become a real problem. I’m flattered and honored that they chose my voice as an example to let us know that we’re fucked.”