I Challenged My AI Clone to Replace Me for 24 Hours | WSJ

The Wall Street Journal
28 Apr 2023 | 07:34

TLDR

Joanna Stern, a journalist from the Wall Street Journal, challenges an AI clone to replace her for a day to explore the capabilities and implications of AI-generated voices and videos. She undergoes a process with Synthesia to create a realistic AI avatar and voice clone using her own recordings. Joanna then tests the AI in four scenarios: making phone calls, creating a TikTok video, passing bank biometric voice verification, and participating in video calls. While the AI voice successfully deceives some, the video avatar fails to convincingly mimic human movement and expression. The experiment raises concerns about the potential for misuse of such technology, as well as the need for vigilance to distinguish between real and AI-generated content.

Takeaways

  • 🎭 The experiment involved creating a realistic AI avatar of Joanna Stern, using AI tools to replicate her appearance and voice.
  • 📹 Joanna's AI avatar was created by Synthesia, a startup specializing in AI avatars, after recording her facial movements and voice.
  • 💬 To improve the AI's voice, ElevenLabs was used, which created a more accurate voice clone using two hours of Joanna's previous recordings.
  • 📞 In Challenge One, Joanna's AI voice successfully passed a phone call test with Evan Spiegel, CEO of Snap, and even fooled her sister.
  • 📱 For Challenge Two, an AI-generated TikTok script was created, but the final video failed to convince TikTok viewers due to the avatar's limited movements and expressions.
  • 🏦 In Challenge Three, the AI voice passed a bank's voice biometric security and was transferred to a representative without additional questions.
  • 📞 Challenge Four involved using the AI avatar in video calls, which failed as the participants noticed the unnatural movements and lack of humor.
  • 🤖 AI video clones are not yet convincing enough to fool people, but AI voice technology has advanced to a point where it can be quite convincing.
  • ⏰ The potential time-saving benefits of AI clones were acknowledged, but there are also concerns about their misuse, such as scamming through voice calls.
  • 🔒 Both Synthesia and ElevenLabs have measures in place to ensure that avatars and voice clones are used ethically and with proper consent.
  • 🌟 Joanna emphasizes the importance of staying human and being vigilant to distinguish between real and AI-generated content.

Q & A

  • What is the main challenge proposed by Joanna Stern in the video?

    -Joanna Stern challenges herself to see if she can be replaced by an AI clone for a day, using AI-generated voice and video to blur the lines between real and fake.

  • Which company created Joanna's AI avatar?

    -Joanna's AI avatar was created by a startup called Synthesia.

  • What was Joanna's experience like while creating her AI avatar?

    -Joanna recorded a series of head movements and read through a pre-written script at a professional studio. She also recorded another script for about an hour to create a custom voice.

  • How did ElevenLabs improve Joanna's AI voice?

    -ElevenLabs produced a better voice for Joanna's AI after her producer uploaded two hours of her previous recordings.

  • What are the costs associated with creating an avatar with Synthesia and a voice clone with ElevenLabs?

    -Synthesia charges at least $1,000 to create a custom avatar, while creating a voice clone with ElevenLabs costs $5 a month.

  • What was the outcome of the first challenge involving phone calls?

    -The first challenge, involving phone calls, was a success. Joanna's AI voice was able to conduct a call with Evan Spiegel, CEO of Snap, without raising suspicion.

  • How did TikTok users react to the AI-generated TikTok video?

    -TikTok users were less impressed with the AI-generated video. They noticed that the avatar did not move its arms, mouth movements did not always match the audio, and there was a lack of facial expression.

  • What was the result of the bank biometrics challenge?

    -The bank biometrics challenge was a success. The bank's system accepted the AI voice as Joanna's and transferred the call to a customer service representative without additional questions.

  • How did the video call challenge turn out?

    -The video call challenge was a failure. Participants in the video call noticed that the AI avatar looked like a hologram, had poor posture, and did not make jokes, leading them to believe it was not the real Joanna.

  • What are the potential risks and misuses of AI-generated voices and videos?

    -The potential risks include scammers using AI-generated voices to impersonate individuals when calling banks or their families. Misuse could also involve creating fake content that is difficult to distinguish from real content.

  • What precautions do Synthesia and ElevenLabs take to prevent misuse of their technology?

    -Synthesia requires those creating avatars to give verbal consent, and ElevenLabs requires users to check a box stating they have permission to use the voice. ElevenLabs also says it can identify its voices if they are misused.

  • What is Joanna Stern's final message to the viewers?

    -Joanna Stern's final message is to 'stay human' and to be on high alert to distinguish between real and AI-generated content.

Outlines

00:00

😀 Introduction to Creating an AI Avatar

The video begins with Joanna Stern expressing her excitement to host a video about creating an AI avatar that resembles her in appearance and movement. She discusses the challenge of differentiating between real and fake in the age of AI-generated text and images. Joanna's goal is to see if she can be replaced by an AI version of herself for a day, thus freeing up her time for personal activities. The process of creating her AI avatar is described, which involves recording various head movements and reading a script at a professional studio, followed by recording another script for an hour to create a custom voice. The AI neural networks then use this data for training.

05:01

📞 AI Voice and Avatar in Practical Scenarios

The second part of the video covers Joanna's experiment with AI technology in four real-life scenarios. She first tests the AI voice in a phone call with Evan Spiegel, CEO of Snap, about the potential impact of AI on human communication. The AI voice successfully deceives Evan and even Joanna's sister during a call about a personal matter. The next challenge involves creating a TikTok video using an AI-written script and Joanna's AI avatar. Joanna is impressed with the final product despite the avatar's limitations in movement and expression, but TikTok viewers notice the discrepancies. The third challenge is about bank biometrics, where the AI voice passes a voice verification system at Chase, highlighting the potential for misuse of AI technology. The final challenge involves video calls, where Joanna's AI avatar is inserted into a Google Meet session. However, the participants quickly notice the avatar's lack of realism. The video concludes with a reflection on the capabilities and risks of AI voice and video technology, emphasizing the need for vigilance to distinguish between real and AI-generated content.

Keywords

💡AI Clone

An AI Clone refers to an artificial intelligence model that is designed to mimic the behavior, speech, and appearance of a specific individual. In the video, Joanna Stern creates an AI clone of herself to see if it can replace her for a day, which is central to the theme of exploring the capabilities and limitations of current AI technology.

💡Synthesia

Synthesia is a startup company mentioned in the video that specializes in creating AI avatars. Joanna's AI avatar was made by Synthesia, which recorded her performing various head movements and reading a script to use as training data for the AI. This keyword is significant as it represents the technology enabling the creation of the AI clone in the video.

💡ElevenLabs

ElevenLabs is another company referenced in the video that produced a better AI voice for Joanna's clone after analyzing two hours of her previous recordings. This keyword is important as it highlights the advancements in AI voice generation and its role in creating a more convincing AI representation of a person.

💡Challenges

Challenges in the context of the video refer to the four tasks Joanna sets for her AI clone to test its capabilities. These include making phone calls, creating a TikTok video, passing bank biometrics, and conducting video calls. The challenges are a key part of the narrative as they put the AI's abilities to the test.

💡Voice Biometrics

Voice Biometrics is a technology that uses the unique characteristics of a person's voice to verify their identity. In the video, Joanna tests whether her AI voice clone can pass her bank's voice biometric check when calling in as her. It's a crucial concept as it explores the potential misuse of AI in security and authentication systems.
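
As a rough illustration of the idea, voice verification is often described as comparing a stored "voiceprint" against a new sample and accepting the caller if the two are similar enough. The sketch below is only an assumption-laden toy: the three-number voiceprints, the `verify_speaker` helper, and the 0.8 threshold are all hypothetical, and it does not describe how the bank's system or any real product works.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def verify_speaker(enrolled, sample, threshold=0.8):
    """Accept the caller if the sample is close enough to the enrolled voiceprint.

    The threshold is a made-up value for illustration only.
    """
    return cosine_similarity(enrolled, sample) >= threshold

# Hypothetical voiceprints; real systems use much longer feature vectors.
enrolled_voiceprint = [0.9, 0.1, 0.4]
close_clone = [0.88, 0.12, 0.41]      # a sample that closely mimics the enrolled voice
different_speaker = [0.1, 0.9, 0.2]   # a sample from an unrelated voice

print(verify_speaker(enrolled_voiceprint, close_clone))        # accepted
print(verify_speaker(enrolled_voiceprint, different_speaker))  # rejected
```

In this toy model, a clone that lands close enough to the enrolled voiceprint is accepted just like the real speaker, which is the risk the bank challenge in the video points to.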

💡TikTok

TikTok is a popular social media platform where users post short videos. Joanna asks ChatGPT to write a script for a TikTok video featuring her AI avatar. The keyword is significant as it demonstrates one of the practical applications of AI in content creation and social media.

💡AI-generated Voice

An AI-generated voice is a synthesized voice created by artificial intelligence to sound like a specific person or a generic human voice. In the video, Joanna explores the quality and realism of AI-generated voices, particularly in the context of phone calls and customer service interactions.

💡Custom Avatar

A custom avatar, as discussed in the video, is a digital representation of a person created for specific uses, such as in internal company videos or other virtual environments. Joanna's AI avatar is an example of a custom avatar, and the concept is integral to the video's exploration of AI's potential to replicate human presence.

💡Chatbot

A chatbot is an AI program designed to simulate conversation with humans. In the video, Joanna mentions 'My AI,' a chatbot within the Snap app. The keyword is relevant as it represents another form of AI technology that is becoming more integrated into everyday communication.

💡Authenticity

Authenticity in the video refers to the ability of the AI clone to convincingly replicate Joanna's behavior, speech, and appearance. The concept is central to the video's theme, as it examines how well AI can mimic human characteristics and whether it can be distinguished from the real person.

💡Misuse of AI

The potential misuse of AI is a concern raised in the video, particularly in the context of using AI voices to impersonate individuals for fraudulent purposes, such as calling banks or family members. This keyword is significant as it highlights the ethical and security implications of advanced AI technology.

Highlights

Joanna Stern challenges herself to be replaced by an AI clone for a day to see if it can convincingly replicate her actions and interactions.

AI tools are blurring the lines between real and fake, prompting Joanna to explore AI-generated voice and video capabilities.

Joanna's AI avatar is created by Synthesia, a startup that records head movements and uses AI neural networks for realistic replication.

ElevenLabs produces a more convincing AI voice after analyzing two hours of Joanna's previous recordings.

Synthesia and ElevenLabs let users type in text for the AI avatar or voice to speak, with Synthesia targeting corporate internal video creation.

Joanna's AI successfully impersonates her in a phone call with Evan Spiegel, CEO of Snap, without raising suspicion.

The AI voice passes a bank's voice biometric security, confirming its potential for misuse.

Creating a TikTok with AI narration reveals limitations in avatar movement and facial expressions.

AI's inability to replicate human-like nuances in video calls leads to its failure in fooling participants.

Joanna expresses concern about the potential for scammers to misuse AI voices and the need for vigilance in distinguishing real from AI.

Synthesia and ElevenLabs have measures in place requiring consent and permission for the use of avatars and voice clones.

Joanna concludes that while AI voices are impressive, video clones are not yet convincing enough to replace human interactions.

The experiment with AI clones raises ethical considerations and the importance of staying human amidst technological advancements.

AI tools are evolving rapidly, with potential benefits for efficiency but also risks if used maliciously.

The future of AI in personal and professional settings will require a balance between convenience and the authenticity of human connection.

Joanna's experience with AI clones emphasizes the need for ongoing development and responsible use of emerging AI technologies.