I Challenged My AI Clone to Replace Me for 24 Hours | WSJ
TLDR
Joanna Stern, a journalist from the Wall Street Journal, challenges an AI clone to replace her for a day to explore the capabilities and implications of AI-generated voices and videos. She undergoes a process with Synthesia to create a realistic AI avatar and voice clone using her own recordings. Joanna then tests the AI in four scenarios: making phone calls, creating a TikTok video, passing bank biometric voice verification, and participating in video calls. While the AI voice successfully deceives some, the video avatar fails to convincingly mimic human movement and expression. The experiment raises concerns about the potential for misuse of such technology, as well as the need for vigilance to distinguish between real and AI-generated content.
Takeaways
- 🎭 The experiment involved creating a realistic AI avatar of Joanna Stern, using AI tools to replicate her appearance and voice.
- 📹 Joanna's AI avatar was created by Synthesia, a startup specializing in AI avatars, after recording her facial movements and voice.
- 💬 To improve the AI's voice, Joanna's producer turned to ElevenLabs, which produced a more accurate voice clone from two hours of her previous recordings.
- 📞 In Challenge One, Joanna's AI voice successfully passed a phone call test with Evan Spiegel, CEO of Snap, and even fooled her sister.
- 📱 For Challenge Two, an AI-generated TikTok script was created, but the final video failed to convince TikTok viewers due to the avatar's limited movements and expressions.
- 🏦 In Challenge Three, the AI voice passed a bank's voice biometric security check and was transferred to a customer service representative without additional questions.
- 📞 Challenge Four involved using the AI avatar in video calls, which failed as the participants noticed the unnatural movements and lack of humor.
- 🤖 AI video clones are not yet convincing enough to fool people, but AI voice technology has advanced to a point where it can be quite convincing.
- ⏰ The potential time-saving benefits of AI clones were acknowledged, but there are also concerns about their misuse, such as scamming through voice calls.
- 🔒 Both Synthesia and ElevenLabs have measures in place to ensure that avatars and voice clones are used ethically and with proper consent.
- 🌟 Joanna emphasizes the importance of staying human and being vigilant to distinguish between real and AI-generated content.
Q & A
What is the main challenge proposed by Joanna Stern in the video?
-Joanna Stern challenges herself to see if she can be replaced by an AI clone for a day, using AI-generated voice and video to blur the lines between real and fake.
Which company created Joanna's AI avatar?
-Joanna's AI avatar was created by a startup called Synthesia.
What was Joanna's experience like while creating her AI avatar?
-Joanna recorded a series of head movements and read through a pre-written script at a professional studio. She also recorded another script for about an hour to create a custom voice.
How did ElevenLabs improve Joanna's AI voice?
-ElevenLabs produced a better voice for Joanna's AI after her producer uploaded two hours of her previous recordings.
What are the costs associated with creating an avatar with Synthesia and a voice clone with ElevenLabs?
-Synthesia charges at least $1,000 to create a custom avatar, while creating a voice clone with ElevenLabs costs $5 a month.
What was the outcome of the first challenge involving phone calls?
-The first challenge, involving phone calls, was a success. Joanna's AI voice was able to conduct a call with Evan Spiegel, CEO of Snap, without raising suspicion.
How did TikTok users react to the AI-generated TikTok video?
-TikTok users were less impressed with the AI-generated video. They noticed that the avatar did not move its arms, mouth movements did not always match the audio, and there was a lack of facial expression.
What was the result of the bank biometrics challenge?
-The bank biometrics challenge was a success. The AI voice was able to confirm Joanna's identity and transfer her to a customer service representative without additional questions.
How did the video call challenge turn out?
-The video call challenge was a failure. Participants in the video call noticed that the AI avatar looked like a hologram, had poor posture, and did not make jokes, leading them to believe it was not the real Joanna.
What are the potential risks and misuses of AI-generated voices and videos?
-The potential risks include scammers using AI-generated voices to impersonate individuals when calling banks or their families. Misuse could also involve creating fake content that is difficult to distinguish from real content.
What precautions do Synthesia and ElevenLabs take to prevent misuse of their technology?
-Synthesia requires those creating avatars to give verbal consent, and ElevenLabs requires users to check a box stating they have permission to use the voice. ElevenLabs also says it can identify its voices if they are misused.
What is Joanna Stern's final message to the viewers?
-Joanna Stern's final message is to 'stay human' and to be on high alert to distinguish between real and AI-generated content.
Outlines
😀 Introduction to Creating an AI Avatar
The video begins with Joanna Stern expressing her excitement to host a video about creating an AI avatar that resembles her in appearance and movement. She discusses the challenge of differentiating between real and fake in the age of AI-generated text and images. Joanna's goal is to see if she can be replaced by an AI version of herself for a day, thus freeing up her time for personal activities. The process of creating her AI avatar is described, which involves recording various head movements and reading a script at a professional studio, followed by recording another script for an hour to create a custom voice. The AI neural networks then use this data for training.
📞 AI Voice and Avatar in Practical Scenarios
The video follows Joanna's experiment with AI technology in four real-life scenarios. She first tests the AI voice in a phone call with Evan Spiegel, CEO of Snap, discussing the potential impact of AI on human communication. The AI voice successfully deceives Evan and even Joanna's sister during a call about a personal matter. The next challenge involves creating a TikTok video using an AI-written script and Joanna's AI avatar. Joanna is impressed with the final product despite the avatar's limitations in movement and expression, but TikTok viewers notice the discrepancies. The third challenge is about bank biometrics, where the AI voice passes a voice verification system at Chase, highlighting the potential for misuse of AI technology. The final challenge involves video calls, where Joanna's AI avatar is inserted into a Google Meet session. However, the participants quickly notice the avatar's lack of realism. The video concludes with a reflection on the capabilities and risks of AI voice and video technology, emphasizing the need for vigilance to distinguish between real and AI-generated content.
Keywords
AI Clone
Synthesia
ElevenLabs
Challenges
Voice Biometrics
TikTok
AI-generated Voice
Custom Avatar
Chatbot
Authenticity
Misuse of AI
Highlights
Joanna Stern challenges herself to be replaced by an AI clone for a day to see if it can convincingly replicate her actions and interactions.
AI tools are blurring the lines between real and fake, prompting Joanna to explore AI-generated voice and video capabilities.
Joanna's AI avatar is created by Synthesia, a startup that records head movements and uses AI neural networks for realistic replication.
ElevenLabs produces a more convincing AI voice after analyzing two hours of Joanna's previous recordings.
Both Synthesia and ElevenLabs let users type in text for the AI clone to speak, with Synthesia targeting corporate internal video creation.
Joanna's AI successfully impersonates her in a phone call with Evan Spiegel, CEO of Snap, without raising suspicion.
The AI voice passes a bank's voice biometric security, confirming its potential for misuse.
Creating a TikTok with AI narration reveals limitations in avatar movement and facial expressions.
AI's inability to replicate human-like nuances in video calls leads to its failure in fooling participants.
Joanna expresses concern about the potential for scammers to misuse AI voices and the need for vigilance in distinguishing real from AI.
Synthesia and ElevenLabs have measures in place requiring consent and permission for the use of avatars and voice clones.
Joanna concludes that while AI voices are impressive, video clones are not yet convincing enough to replace human interactions.
The experiment with AI clones raises ethical considerations and the importance of staying human amidst technological advancements.
AI tools are evolving rapidly, with potential benefits for efficiency but also risks if used maliciously.
The future of AI in personal and professional settings will require a balance between convenience and the authenticity of human connection.
Joanna's experience with AI clones emphasizes the need for ongoing development and responsible use of emerging AI technologies.