OpenAI claims its software can clone your voice from 15 seconds of you talking

Super lab loves to big up things it says it couldn't possibly let loose on the world for now

OpenAI's latest trick needs just 15 seconds of audio of someone speaking to clone that person's voice – but don't worry, no need to look behind the curtain, the biz wants everyone to know it's not going to release this Voice Engine until it can be sure the potential for mischief has been managed. 

OpenAI describes Voice Engine as a "small model" that uses a 15-second clip and a text prompt to generate natural-sounding speech resembling the original vocalist, and said it has already been testing the system with several "trusted partners." It provided purported samples of Voice Engine's capabilities in marketing bumf emitted at the end of last month.

According to OpenAI, Voice Engine can be used to do things like provide reading assistance, translate content, support non-verbal people, help medical patients who've lost their voices regain the ability to speak in their own voice, and expand access to services in remote settings. OpenAI has demoed all of those use cases, which have been part of its work with early partners.

News of the existence of Voice Engine, which OpenAI said was developed in late 2022 to serve as the tech behind ChatGPT Voice, Read Aloud, and its text-to-speech API, comes as concerns over voice cloning have reached a fever pitch of late.

One of the most headline-grabbing voice cloning stories of the year came from the New Hampshire presidential primary in the US, during which AI-generated robocalls imitating President Biden's voice went out urging voters not to participate in the day's voting.

Since then, the FCC has formally declared AI-generated robocalls to be illegal, and the FTC has offered a $25,000 bounty to solicit ideas on how to combat the growing threat of AI voice cloning.

Most recently, former US Secretary of State, senator and First Lady Hillary Clinton has warned that the 2024 election cycle will be "ground zero" for AI-driven election manipulation. So why come forward with another potentially trust-shattering technology in the midst of such a debate? 

"We hope to start a dialogue on the responsible deployment of synthetic voices, and how society can adapt to these new capabilities," OpenAI said.

"Based on these conversations and the results of these small scale tests, we will make a more informed decision about whether and how to deploy this technology at scale," the lab added. "We hope this preview of Voice Engine both underscores its potential and also motivates the need to bolster societal resilience against the challenges brought by ever more convincing generative models." 

To assist in preventing voice-based fraud, OpenAI said it is encouraging others to phase out voice-based authentication, explore what can be done to protect individuals against such capabilities, and accelerate tech to track the origin of audiovisual content "so it's always clear when you're interacting with a real person or with an AI." 

That said, OpenAI also seems to accept that, even if it doesn't end up deploying Voice Engine, someone else will likely create and release a similar product – and it might not be someone as trustworthy as OpenAI, you know.

"It's important that people around the world understand where this technology is headed, whether we ultimately deploy it widely ourselves or not," OpenAI said. 

So consider this an oh-so-friendly warning that, even if OpenAI isn't the reason, you can't trust everything you hear on the internet nowadays. ®
