I’ve heard it said you shouldn’t do something just because you can. We humans find this difficult because we love to show off our abilities. Nowhere is this more prevalent than in tech. Every other day some ‘inventor’ shows off some useless product. We marvel at his cleverness and how far tech has come and then move on with our lives.
We have one such tech feature today. Amazon demoed a new Alexa feature that allows it to mimic the voices of users’ dead relatives. Alexa is Amazon’s AI voice assistant that competes against Google Assistant and Siri.
When voice assistants first came out, they had robotic voices. That improved over the years, to the point where Google showed off its Assistant calling humans who could not tell they were talking to an AI.
We celebrated how natural voice assistants were getting but Amazon didn’t think that was enough. They went ahead and taught Alexa how to mimic dead people’s voices.
Amazon showed a video of Alexa reading a book to a child in the voice of the kid’s dead grandmother. They seemed proud of this achievement and I don’t know why. I know I might be in the minority but I don’t want my tech trying to imitate my dead relatives.
How does Alexa do this?
It has been possible for AIs to learn to imitate voices for a while now. All they need is a sample of the voice to mimic. Alexa only needs one minute of recorded audio to do its thing.
I hope it didn’t need spelling out but Alexa doesn’t magically know what your dead grandfather sounded like. It doesn’t have a connection to the underworld and there are no seance shenanigans going on.
If you don’t have an audio recording of your deceased loved one, Alexa won’t be able to help you.
I mentioned that voice mimicry has been a thing for years. The tech has been used in movies and TV, such as for the young Luke Skywalker in The Mandalorian. You have no doubt come across deepfakes on social media too.
We have discussed what deepfakes are here, but the gist is that AI, through deep learning, learns the nuances of what a specific voice sounds like (sticking with the audio mimicry example we are working with). Once that is learnt, you can get the AI to utter any other words in the learnt voice.
You can check out Vocones, one of the more famous deepfake generators.
In film, these deepfakes can be used for language-dubbing and have been a cost saver for filmmakers by removing the need to fly in actors to re-record some minor lines.
Voice mimicry could come in handy should someone suddenly die, leaving behind a voice-protected vault. I don’t know if deepfakes are able to fool other AI but they can fool humans already.
Remember that time criminals used AI to impersonate a chief executive’s voice and stole US$243,000 from a UK firm? I don’t condone such crimes but I understand them.
Those use cases made sense to me. Why Alexa had to bring dead relatives into the picture, I’ll never know. Throughout the whole process, not a single person at Amazon stopped to ask, ‘should we be doing this?’
What do you think?
Are you excited about the direction Alexa is taking voice mimicry? We don’t know if this feature will make it to the public. So far Amazon has only demoed it without sharing when or if it’s on its way. It could have been a show-off exercise, kind of like a proof of concept. Let’s hope it’s that.
I’m only comfortable with deepfakes being used in film and for mischief. Someone please make deepfakes of President Mnangagwa saying stuff like ‘Ko nhai Mangwana, ndiani atora Super ranga riri pasi pedesk rangu iro?’ (roughly, ‘Say, Mangwana, who took the Super that was under my desk?’)
Man, these AIs need to chill. Just last week we were talking about a Google engineer claiming that their AI had come alive. Meanwhile, some Texans got their AI to create enzymes that break down plastic in days, as opposed to the centuries it takes to break down naturally.