OpenAI unveils its Voice Engine tool that can replicate people’s voices

NBC News
30 Mar 2024 · 02:14

Summary

TLDR: OpenAI is leading the charge in AI innovation with its latest text-to-audio generator, Voice Engine, capable of creating realistic voice samples from just a 15-second clip. While the technology has the potential to revolutionize various industries, concerns about misuse are rising, with risks including the creation of fake messages and robocalls. OpenAI is cautiously working with a limited number of partners to ensure responsible development and is showcasing the technology's potential across multiple languages and creative applications, including short movies generated by its video model, Sora.

Takeaways

  • 🚀 OpenAI has played a pivotal role in advancing artificial intelligence with its text-generating tool, ChatGPT.
  • 🎨 The organization has also impressed with AI-generated visuals through DALL-E, showcasing its capabilities in visual arts.
  • 🗣️ OpenAI is now introducing a new text-to-audio generator called Voice Engine, capable of converting text to realistic human voice samples.
  • 🌐 The technology can be applied across various languages, as demonstrated by the phrase 'Friendship is a universal treasure' in Spanish and Japanese.
  • 🏢 Companies are expected to rush to integrate and update their platforms with OpenAI's new voice technology.
  • 🔊 OpenAI requires only a 15-second sample of a voice to generate a synthetic version, highlighting the efficiency of the technology.
  • 📝 There are existing concerns about the misuse of voice-cloning technology, such as in fake ransom messages and robocalls.
  • 📜 OpenAI acknowledges the risks of synthetic voice misuse and is working cautiously with a limited number of partners on the Voice Engine tool.
  • 🎥 The organization is collaborating with filmmakers to explore the potential of its video generator, Sora, in creating short movies.
  • 🌟 OpenAI's demonstrations aim to prepare society for the impact of emerging technologies and to showcase their potential for various applications.

Q & A

  • What significant contribution did OpenAI make in the field of artificial intelligence?

    -OpenAI kick-started a new era of artificial intelligence with its text-generating tool, ChatGPT.

  • What is DALL-E, and what does it specialize in generating?

    -DALL-E is an AI system developed by OpenAI that generates visuals from user prompts.

  • What is the new technology OpenAI is unveiling, and what does it do?

    -OpenAI is unveiling Voice Engine, a new text-to-audio generator that can convert text into AI-generated voice samples.

  • How long of a sample does OpenAI's Voice Engine require to generate a voice?

    -OpenAI's Voice Engine only needs a 15-second sample to generate a voice.

  • What are some potential risks associated with voice-generating technology?

    -Potential risks include the creation of fake ransom messages and robocalls, as well as misuse on various platforms in the absence of regulation.

  • How is OpenAI addressing the risks of synthetic voice misuse?

    -OpenAI is working on the tool with a limited number of partners and taking a cautious and informed approach to a broader release due to the potential for misuse.

  • What languages is OpenAI demonstrating the potential of its voice-generating technology in?

    -OpenAI is demonstrating its technology in multiple languages, including Spanish and Japanese.

  • How is OpenAI showcasing the capabilities of its video generator?

    -OpenAI is partnering with filmmakers to create short movies using its video generator, showcasing its potential in the film industry.

  • What is the significance of the phrase 'friendship is a universal treasure' in the script?

    -The phrase 'friendship is a universal treasure' is used to demonstrate the ability of OpenAI's voice-generating technology to convey meaningful messages across different languages.

  • What is the main concern expressed by the reporter regarding AI technologies?

    -The main concern expressed by the reporter is the lack of regulations for powerful AI technologies and the potential for misuse in various contexts.

  • How does OpenAI's approach to developing and releasing new technologies reflect its stance on ethical AI?

    -OpenAI's approach reflects a commitment to ethical AI development by working cautiously, partnering with a limited number of entities, and considering the potential risks and misuses of the technology.

Outlines

00:00

🤖 Introduction to OpenAI's Innovations

This paragraph introduces OpenAI's significant contributions to the field of artificial intelligence. It covers ChatGPT, the text-generating tool; DALL-E's AI-generated visuals; and the new text-to-video tool. The focus then shifts to the unveiling of OpenAI's Voice Engine, a text-to-audio generator that needs only a 15-second sample of a human voice to generate an AI version of it. The paragraph highlights the potential impact of this technology on various companies and platforms, while also acknowledging the risks associated with voice cloning and the misuse of synthetic voices. OpenAI's approach to addressing these concerns is also mentioned, emphasizing a cautious and informed strategy for a broader release of the technology.

Keywords

💡Artificial Intelligence

Artificial Intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think, learn, and problem-solve like humans. In the context of the video, AI is the driving force behind the innovations and breakthroughs achieved by OpenAI, particularly in the development of text-generating tools like ChatGPT and voice engines that can mimic human speech.

💡ChatGPT

ChatGPT is an AI-powered text-generating tool developed by OpenAI that is capable of producing human-like text based on the input it receives. It has been a significant milestone in the field of AI, demonstrating the technology's ability to understand and generate complex language. In the video, ChatGPT is highlighted as a key innovation that has contributed to the new era of AI.

💡Text-to-Audio Generator

A text-to-audio generator is a software application that converts written text into spoken words, effectively creating a voice from the text. In the video, OpenAI's new Voice Engine is an example of such a technology: it can take a real human voice sample and generate an AI version of it, showcasing the potential for highly realistic voice replication.

💡Voice Cloning

Voice cloning refers to the process of creating a synthetic version of a voice by imitating the unique characteristics of a person's speech. The technology allows for the duplication of voices without the individual's direct involvement. In the context of the video, voice cloning is a concern as it raises ethical and security issues, especially in the absence of regulations.

💡Regulations

Regulations are the rules and guidelines set by governing bodies to control and manage various aspects of society, including technology. In the video, the lack of regulations for AI technologies like voice cloning is a point of concern, as it may lead to misuse and unethical applications of these powerful tools.

💡Synthetic Voice Misuse

Synthetic voice misuse refers to the improper or unethical use of artificially generated voices to deceive, manipulate, or cause harm. This is a significant concern as AI technologies become more advanced, as they can create highly convincing fake voices that may be used for malicious purposes.

💡Language Diversity

Language diversity refers to the variety of languages spoken around the world and the ability of a technology to cater to multiple languages. In the video, OpenAI demonstrates Voice Engine's potential by showing the same phrase in English, Spanish, and Japanese, which highlights the technology's global applicability and the importance of addressing language-specific challenges and nuances.

💡Video Generator

A video generator is a tool or software that can create videos or visual content based on input data, such as text descriptions or other forms of content. In the context of the video, OpenAI's video generator, Sora, is used in partnership with filmmakers to create short movies, demonstrating the technology's potential in the field of video production and storytelling.

💡Emerging Technology

Emerging technology refers to new or rapidly developing technologies that have the potential to significantly impact society, economies, and industries. In the video, the term is used to describe AI advancements like text-to-audio generators and video generators, which are still in their early stages but already showing transformative potential.

💡Ethical Concerns

Ethical concerns are issues related to moral principles and values that arise when considering the development and application of certain technologies. In the video, ethical concerns are raised in relation to AI technologies like voice cloning and synthetic voices, which could be misused to deceive or manipulate people without their consent.

Highlights

OpenAI kick-started a new era of artificial intelligence with its text-generating tool ChatGPT.

AI-generated visuals through DALL-E amazed the public.

OpenAI unveiled a new text-to-audio generator called Voice Engine.

Voice Engine can turn a real human voice sample into an AI-generated one.

OpenAI needs only a 15-second sample to generate a voice.

The technology has potential applications across various companies and platforms.

There are concerns about the misuse of voice-cloning technology.

Other voice-generating programs have been used to create fake ransom messages and robocalls.

OpenAI is working cautiously on the tool with a limited number of partners.

The company is taking an informed approach to a broader release due to potential misuse.

OpenAI demonstrates the technology's potential by showing what it can do across languages.

The phrase 'Friendship is a universal treasure' is showcased in multiple languages.

OpenAI is partnering with filmmakers to use its video generator, Sora.

Short movies are being created with the help of OpenAI's video generator.

The technology is not limited to voices; it also includes video generation.

OpenAI's advancements are pushing society to prepare for the future of technology.

Transcripts

00:01

AND YOU'VE REALLY GOT TO HEAR IT TO BELIEVE IT. HERE'S NARISSA MAR -- MARISSA PARRA.

00:10

>> Reporter: OpenAI HELPED KICK-START THE NEW ERA OF ARTIFICIAL INTELLIGENCE WITH ITS TEXT-GENERATING TOOL ChatGPT. IT STUNNED US WITH ITS AI-GENERATED VISUALS THROUGH DALL-E AND AMAZED US WITH ITS TEXT-TO-VIDEO TOOL. NOW IT'S UNVEILING VOICE ENGINE, A NEW TEXT-TO-AUDIO GENERATOR THAT CAN TURN THIS REAL HUMAN VOICE SAMPLE --

00:26

>> FORCE IS A PUSH OR PULL THAT CAN MAKE AN OBJECT MOVE --

00:32

>> Reporter: -- INTO THIS AI-GENERATED ONE --

00:36

>> HAVE YOU EVER WONDERED WHY A SOCCER BALL SOARS THROUGH THE AIR --

00:38

>> Reporter: OpenAI NEEDS ONLY A 15-SECOND SAMPLE TO GENERATE A VOICE.

00:40

>> IT'S GOING TO GET A LOT OF COMPANIES RUSHING TO PERFECT AND TO UPDATE A LOT OF THEIR PLATFORMS.

00:48

>> Reporter: OpenAI IS NOT THE FIRST COMPANY TO DEMONSTRATE THE ABILITY TO CLONE VOICES. THE RISKS OF THE TECHNOLOGY ALREADY CLEAR.

00:54

>> DOES IT IMPRESS YOU OR CONCERN YOU?

00:57

>> I HAVE TO ADMIT IT'S IMPRESSIVE. I'M VERY CONCERNED ABOUT THE POSSIBILITIES OF THESE KINDS OF POWERFUL TECHNOLOGIES. THERE ARE NO REGULATIONS SO FAR.

01:11

>> Reporter: OTHER VOICE-GENERATING PROGRAMS HAVE BEEN USED TO CREATE FAKE RANSOM MESSAGES AND THIS FAKE ROBO CALL INTENDED TO SOUND LIKE PRESIDENT BIDEN.

01:17

>> IT'S IMPORTANT THAT YOU SAVE YOUR VOTE FOR THE NOVEMBER ELECTION.

01:20

>> Reporter: OpenAI ACKNOWLEDGING THE RISKS OF THIS EMERGING TECHNOLOGY, SAYING IN ITS BLOG THAT IT'S WORKING ON THE TOOL WITH A LIMITED NUMBER OF PARTNERS, AND THAT, QUOTE, WE ARE TAKING A CAUTIOUS AND INFORMED APPROACH TO A BROADER RELEASE DUE TO THE POTENTIAL FOR SYNTHETIC VOICE MISUSE. BUT THE COMPANY ALSO DEMONSTRATING ITS POTENTIAL, SHOWING WHAT IT CAN DO ACROSS LANGUAGES.

01:38

>> FRIENDSHIP IS A UNIVERSAL TREASURE.

01:40

>> Reporter: HERE'S THAT SAME PHRASE IN SPANISH --

01:42

[ SPEAKING IN A GLOBAL LANGUAGE ]

01:45

AGAIN IN JAPANESE --

01:46

[ SPEAKING IN A GLOBAL LANGUAGE ]

01:49

>> AND IT'S NOT JUST VOICES. OpenAI IS ALSO PARTNERING WITH SOME FILMMAKERS TO TRY OUT ITS VIDEO GENERATOR SORA CREATING SHORT MOVIES LIKE THIS ONE.

01:57

>> I AM LITERALLY FILLED WITH HOT AIR.

02:00

>> Reporter: OpenAI DEMONSTRATING ITS CAPABILITIES IN PART TO PUSH SOCIETY TO PREPARE FOR TECHNOLOGY THAT IS

Related Tags
AI Innovations, Text-to-Speech, Visual AI, Ethical Concerns, Technology Advancements, OpenAI Initiatives, Global Languages, Multimedia Applications, Regulatory Challenges, Future Impact