Faking It 'Til You're Making It: The Rise of AI-Generated Voices

Faking It 'Til You're Making It: The Rise of AI-Generated Voices - Machines That Talk (And Sing!)

The ability of machines to mimic human voices has opened up exciting possibilities in various fields. While text-to-speech technology has been around for decades, recent advances in AI have enabled a greater level of realism and control. Now machines can not only speak, but sing with hauntingly human-like voices.

Voice synthesis technology allows us to give a voice to those who have lost their ability to speak, whether through injury or disease. Apps like ModelTalker showcase how neural networks can clone a voice with just a few minutes of sample audio. Users simply record themselves reading a passage, and the app creates a digital vocal doppelgänger. This "vocal avatar" can then speak any typed text in a scarily accurate recreation of the original voice.

The entertainment industry has also embraced synthetic voices. When filmmakers needed a realistic digital double for the actor Val Kilmer, who lost his voice to throat cancer, they turned to AI. The company Sonantic used archived footage to build a Kilmer voice AI that delivered new dialogue for the Top Gun sequel. Musicians like Holly Herndon are exploring the artistic potential of "singing" neural networks trained on their own vocals.

As the technology advances, legal and ethical pitfalls remain. Deepfakes that misrepresent or spoof real people have caused controversies. But when used responsibly, AI-generated voices grant those with speech limitations a chance to rediscover their true voice. These vocal avatars also let us posthumously resurrect voices from history. Imagine speeches read by MLK, songs sung by Freddie Mercury, or interviews with Ada Lovelace.

Faking It 'Til You're Making It: The Rise of AI-Generated Voices - Cloning Your Inner Morgan Freeman

Have you ever wanted that deep, soothing voice of Morgan Freeman to narrate your latest podcast, audiobook, or other audio project? With recent advances in AI voice cloning technology, that dream can become a reality. Apps like Resemble.AI and services from companies like Veritone allow anyone to clone a voice with just a few minutes of sample audio.

To create your own Morgan Freeman voice clone, you simply need to provide a short voice sample of the man himself reading aloud. This gives the AI model enough data to analyze his unique timbre, rhythm, and inflection. Once trained on Freeman's voice, the model can generate completely new speech in his signature calming baritone.

Matt Reed is an audiobook narrator who has used Resemble.AI to craft AI versions of celebrity voices like Freeman's. As he describes it, "The AI listens to the real human voice and figures out the hidden parameters that make that voice sound the way it does. Things like the pitch, the tone, the speed, the cadence." With just 90 seconds of Morgan Freeman audio, Reed was able to clone the actor's voice and have it convincingly read passages from public domain books.
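To make one of those "hidden parameters" concrete, here is a minimal sketch of estimating pitch from raw audio samples. It uses plain autocorrelation, a classic signal-processing technique, and is not a description of how Resemble.AI or any other cloning product works internally. The function names, the 8 kHz sample rate, and the 50 to 400 Hz search range are all assumptions chosen for illustration, and a synthetic sine tone stands in for a real voice recording.

```python
import math


def synth_tone(freq_hz, sample_rate=8000, duration_s=0.1):
    """Generate a pure sine tone as a stand-in for a voiced audio frame."""
    n = int(sample_rate * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def estimate_pitch(samples, sample_rate=8000, min_hz=50.0, max_hz=400.0):
    """Estimate the fundamental frequency (Hz) via autocorrelation.

    The signal is compared with delayed copies of itself; the delay (lag)
    at which it best matches corresponds to one period of the waveform.
    """
    min_lag = int(sample_rate / max_hz)  # shortest period considered
    max_lag = int(sample_rate / min_hz)  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(
            samples[i] * samples[i + lag] for i in range(len(samples) - lag)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


if __name__ == "__main__":
    tone = synth_tone(100.0)  # a low, roughly baritone-range pitch
    print(f"Estimated pitch: {estimate_pitch(tone):.1f} Hz")
```

Real speech is far messier than a sine wave, and a cloning model learns many interacting features at once rather than measuring each one separately. The sketch only shows that "pitch" is something a program can measure directly from a recording.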

Of course, there are still limitations. AI voice clones may get tripped up on unusual words or complex sentence structures. But for simple narration, the results can be eerily realistic. As the technology improves, AI voices will be able to tackle more advanced texts. Reed believes these synthetic voices open up new creative possibilities for content creators. Writers can cast any voice they want for stories without needing to hire famous actors.

YouTuber Cinefix put Freeman's AI voice clone to the test by having it narrate the entire teaser trailer for Christopher Nolan's Tenet. While not perfect, the digital Freeman voice shows how AI cloning can produce remarkably natural speech and intonation for a specific individual. As Cinefix says, "This technology still has a long way to go, but it's slowly getting there."

Faking It 'Til You're Making It: The Rise of AI-Generated Voices - Legal Quandaries of Vocal Replication

As AI voice cloning technology grows more advanced, legal questions around replicating someone's voice without permission are emerging. Laws have not kept pace with this technology, leaving gaps around ownership rights and misuse of synthesized voices.

Several high-profile cases have highlighted the legal uncertainties. In 2019, a Canadian insurance company used AI to clone a deceased client's voice so her son could hear a touching birthday message from his late mother. But the son never consented to the voice synthesis, sparking privacy concerns.

That same year, a museum created an AI exhibit featuring an interactive kiosk with a cloned voice of civil rights pioneer Rosa Parks. Parks' estate objected, arguing this unauthorized vocal replication infringed on Parks' publicity rights. The museum canceled the exhibit to avoid potential litigation.

The key legal issue is that voices contain identifiable characteristics unique to each person. Your voice is an extension of your identity. So does replicating it require consent? Can synthesized voices be trademarked to establish ownership rights?

Attorney Joseph C. Gratz notes that under current US law, voices themselves don't have standalone copyright protections. But the specific voice recording used to train an AI model may be copyrighted. This creates a gray area around AI clones based on copyrighted source material.

Regarding likeness rights, Nancy Wolff, partner at Cowan DeBaets Abrahams & Sheppard, explains that "the recording of someone's voice and the ability to mimic that voice could be considered an attribute of that person's identity. To use someone's 'synthetic' voice without permission raises right of publicity issues."

Since voice cloning often requires only a short sample of someone's speech, people may not even realize their voice is being copied for commercial applications. Wolff believes this technology calls for reassessing the right to privacy around our voices.

Some legal experts have proposed that synthesized voices should require consent from the original speaker or their estate before use. Others argue we cannot put the technological genie back in the bottle, so new regulations are needed.

Faking It 'Til You're Making It: The Rise of AI-Generated Voices - AI Voices Go Mainstream

As AI voice synthesis technology improves, uncannily human-sounding AI voices are moving beyond niche applications and into the mainstream. From customer service bots to audiobooks to video game characters, industries across the board are embracing the cost and time savings of automated voice actors.

Audiobook publisher Findaway Voices found they could tap into a wider talent pool by using AI voices. Aspiring authors now only need to record a short sample of their own voice for Findaway to synthesize into a digital narrator. This gives more authors a chance to personally narrate their own books rather than hiring expensive studio voice talent.

The video game industry has also embraced synthetic voice actors. Studios like Electronic Arts and Activision are turning to AI voice cloning platforms like Sonantic to craft realistic digital doubles of human actors. This lets studios add new dialogue or localized lines without recalling the original voice talent for more recording sessions. The resulting AI game characters can converse dynamically while maintaining consistent voices.

Even call centers and customer service phone lines are experimenting with AI voices. Companies like PlaytestCloud offer "vocal avatar" services that clone a real employee's voice to auto-generate phone tree messages and interactive dialogues. Callers get a more natural experience when they hear a familiar voice, which reduces hang-ups. The AI handles simple requests while passing complex issues to human agents.

Of course, responsible use remains paramount as these AI vocals enter daily life. Sonantic enforces strict contractual limits when licensing their AI voices to entertainment studios, stipulating permitted contexts and durations of use. Other vendors advise transparently identifying AI voices to avoid deceiving users.