Expressive AI Avatars Launch Event | Synthesia

Synthesia
25 Apr 2024 · 21:25

Summary

TL;DR: Synthesia introduces Expressive Avatars, a significant upgrade to its AI video technology. The avatars can now understand the sentiment of a script and reflect emotion in speech and expression, providing a more natural and engaging experience. The launch also includes a real-time video preview feature and improved lip sync across multiple languages. Synthesia's platform offers a full suite for video creation, editing, and sharing, reflecting how important engaging video content has become in today's online economy.

Takeaways

  • 🌐 The event is global, with participants from various time zones.
  • 🎉 The launch is anticipated to be the most significant of the year, focusing on a major upgrade to Avatar technology.
  • 🆓 Attendees are offered the chance to be among the first to try expressive avatars without needing to sign up.
  • 🚀 Synthesia's mission is to simplify video content creation for everyone through a comprehensive AI video communication platform.
  • 🎭 The platform includes avatars, voices, a video editor, and collaborative features for real-time teamwork on videos.
  • 📈 Synthesia aims to boost user engagement through video, which is crucial for marketing, learning, and training due to increased retention and viewer attention.
  • 📊 There's a significant shift in online consumption habits, with a preference for video over text, aligning with our biological wiring for visual content.
  • 🤖 The evolution of Avatar technology at Synthesia began with looped videos, moved to fully synthetic generation, and now includes expressive avatars that understand and react to the text they're speaking.
  • 💬 Expressive avatars are a breakthrough, allowing for sentiment prediction, improved lip sync, and more natural voice modulations that match the avatar's expressions.
  • 🔍 The new technology is not just an upgrade but a transformation that enables avatars to perform like actors, making them more engaging and suitable for a wider range of content, including sensitive topics.
  • 🔮 Synthesia is also working on additional features like multilingual voice cloning, an AI screen recorder, and an improved animation system called 'triggers'.

Q & A

  • What is Synthesia's mission?

    -Synthesia's mission is to make it easy for everyone in the world to make video content by providing one platform for all AI video communication needs.

  • How does Synthesia help users create videos?

    -Synthesia assists users in creating videos by offering avatars and voices, a full video editor for adding screen recordings, background images, text, and animations, and a collaborative platform for working with teammates in real time.

  • What is the significance of the new avatar technology launch mentioned in the script?

    -The new avatar technology launch is significant as it represents a major upgrade to Synthesia's core avatar capabilities, promising more expressive and engaging avatars for users.

  • Why are expressive avatars considered an upgrade from previous avatar technologies?

    -Expressive avatars are an upgrade because they understand and respond to the sentiment of the text, allowing for more natural and emotive performances, unlike previous avatars that were limited to lip-syncing and basic movements.

  • How does the Express One AI model enable avatars to be more expressive?

    -The Express One AI model analyzes the text and predicts its sentiment, which then drives the avatar's facial expressions, lip sync, and voice so they match the intended emotion.

  • What are the three components of the new expressive avatar technology?

    -The three components of the new expressive avatar technology are sentiment prediction and facial expressions, better lip sync, and updated voices that match body language and emotion. (A minimal illustrative sketch of this pipeline follows this Q&A section.)

  • How does Synthesia plan to improve the avatar technology further in the future?

    -Synthesia plans to improve avatar technology by adding multilingual voice cloning, adding more expressive avatars, introducing an AI screen recorder, and enhancing the triggers system for animating content in videos.

  • What is Avatar Preview and how does it assist users?

    -Avatar Preview is a feature that allows users to see a real-time, low-resolution preview of their video before rendering it out, which is helpful for iterating on the script and content to achieve the desired expression with the avatar.

  • How can users try out the new expressive avatars?

    -Users can try out the new expressive avatars by visiting Synthesia's website and accessing the avatars feature without needing to sign up or provide an email address.

  • What are some use cases where expressive avatars can make a significant difference?

    -Expressive avatars can make a significant difference in use cases such as healthcare education, product and marketing sales content, and customer support, where emotional connection and engagement are crucial.
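To make the three components above concrete: Synthesia has not published how Express One is implemented, so the following is a minimal, purely illustrative Python sketch of the kind of pipeline the answers describe, predicting a sentiment for each script line and deriving expression and voice parameters from it. Every name here (the cue lists, predict_sentiment, plan_performance, the parameter fields) is a hypothetical stand-in, not Synthesia's API.

```python
from dataclasses import dataclass

# Illustrative cue lists, mirroring the event's advice that emotive words,
# punctuation, and emoji steer the avatar's performance.
POSITIVE_CUES = {"happy", "excited", "great", "love"}
NEGATIVE_CUES = {"upset", "frustrated", "sad", "annoying"}

@dataclass
class Performance:
    sentiment: str      # predicted emotional tone of the line
    expression: str     # facial-expression preset for the avatar
    voice_pitch: float  # relative pitch shift for the voice track
    speech_rate: float  # relative speaking rate

def predict_sentiment(line: str) -> str:
    """Toy stand-in for Express One's sentiment-prediction stage."""
    tokens = [t.strip(".,!?") for t in line.lower().split()]
    score = sum(t in POSITIVE_CUES for t in tokens)
    score -= sum(t in NEGATIVE_CUES for t in tokens)
    score += line.count("😊") + line.count("!") - line.count("☹")
    if score > 0:
        return "positive"
    return "negative" if score < 0 else "neutral"

def plan_performance(line: str) -> Performance:
    """Map the predicted sentiment to expression and voice settings, so the
    face, lip sync, and voice all reflect the same emotion."""
    sentiment = predict_sentiment(line)
    presets = {
        "positive": ("smile", 1.05, 1.05),
        "negative": ("concerned", 0.95, 0.92),
        "neutral": ("relaxed", 1.00, 1.00),
    }
    expression, pitch, rate = presets[sentiment]
    return Performance(sentiment, expression, pitch, rate)

# The three demo lines from the launch event.
for line in ["I am very happy! 😊", "I am frustrated.", "I am so upset."]:
    print(line, "->", plan_performance(line))
```

Running the sketch on the three demo lines prints a different expression and voice preset for each, which is the behavior the launch demo illustrates with the avatar Julia.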

Outlines

00:00

🌐 Introduction to Synthesia's Global Launch Event

The speaker welcomes the global audience to a live event, acknowledging their punctuality and inviting them to share their locations in the comments. The event is held in London, but the audience spans across different time zones, with some having breakfast and others dinner. The speaker hints at an exciting product launch, which is a significant upgrade to the core Avatar technology. They promise a link for early access to expressive avatars without the need for registration, encouraging viewers to stay until the end of the event. The speaker provides a brief overview of Synthesia, emphasizing its mission to simplify video content creation for everyone through AI-powered video communication tools. Synthesia offers a platform that includes avatars, voices, video editing, and collaborative features, aiming to be user-friendly for those without video editing experience.

05:01

🚀 Evolution of Avatar Technology at Synthesia

The speaker discusses the evolution of avatar technology since its inception in 2020. The initial avatars were simple, looping videos of real people with lip-syncing. However, these early avatars lacked body language and facial expressions that matched the voiceover, leading to unnatural movements. In 2022, Synthesia introduced fully synthetically generated avatars, which improved the avatar's movements but still lacked the expressiveness of a human. The speaker highlights the limitations of previous avatars, which could appear robotic and dull due to their lack of understanding of the content they were delivering. The new 'expressive avatars' are introduced as a significant leap forward, capable of understanding the sentiment behind the words and delivering performances that are more engaging and natural.

10:05

🎭 Demonstrating the Expressiveness of New Avatars

The speaker demonstrates the new 'expressive avatars' by showing a video of an avatar named Julia. The avatar responds to emotionally charged words and emojis, adjusting its facial expressions and body language accordingly. The AI system, called Express One, has been trained to understand the nuances of language and perform accordingly. The speaker emphasizes the improved lip-sync technology and the updated voices that match the avatar's body language and emotional state. The new avatars are shown to be capable of a range of human-like expressions, from happiness to frustration, making them more engaging and relatable.

15:12

🤖 The Future of Avatar Technology and Customer Experience

Jonathan Stark, the CTO of Synthesia, discusses the new 'Express One' model, which has learned to understand text and express non-verbal cues for more natural communication. The model is trained with data from professional actors to create more lifelike avatar performances. The speaker highlights potential future enhancements, such as placing avatars in various locations and utilizing more body language. The new avatars are expected to open up new use cases, particularly in sensitive areas like healthcare, where empathy is crucial. The avatars can now connect with viewers on an emotional level, which was not possible with previous technology. Examples are provided to illustrate how expressive avatars can be used in different contexts, such as healthcare education, product marketing, and customer support, each with a distinct tone and style.

20:15

📈 Upcoming Features and Closing Remarks

The speaker outlines upcoming features for Synthesia, including multilingual voice cloning, additional expressive avatars, an AI screen recorder, and an improved triggers system. They invite the audience to test the new expressive avatars on the Synthesia website without needing to provide an email address. The speaker expresses excitement for the audience's feedback and future developments, hinting at an even more impressive 'Next Generation' of avatar technology to be revealed later in the year. The event concludes with gratitude for the audience's participation and anticipation for the creative uses of the new expressive avatars.


Keywords

💡Synthesia

Synthesia is the name of the company being discussed in the video script, which specializes in AI video communication technology. The company's mission is to simplify the process of creating video content for users worldwide. Synthesia provides a platform that includes avatars, voices, video editing tools, and sharing capabilities, aiming to cover the entire lifecycle of video creation. In the context of the video, Synthesia is launching an upgrade to its avatar technology, which is a significant part of their service offering.

💡Avatar Technology

Avatar technology refers to the creation of digital representations of humans that can be used in various applications, such as video games, virtual reality, or, as discussed in the video, for generating synthetic video content. In the script, avatar technology is central to Synthesia's platform, allowing users to create videos with realistic, AI-generated human figures that can speak and emote in a lifelike manner.

💡Expressive Avatars

Expressive avatars are a new feature being introduced by Synthesia, which enhances the avatar technology to not only mimic lip movements but also to convey emotions and sentiments through facial expressions and body language. This upgrade allows avatars to perform more naturally and engage viewers on an emotional level, which is crucial for creating compelling video content. The script highlights the expressive avatars' ability to understand the text and express emotions accordingly, making them appear more human-like.

💡AI Video Assistant

The AI video assistant is a feature of Synthesia's platform that automates the video creation process. It can take various types of content, such as documents or blog posts, and transform them into video format by writing scripts, generating visuals, and producing the final video. This assistant is designed to make video production more accessible and efficient, especially for those without video editing experience.

💡Sentiment Prediction

Sentiment prediction in the context of the video refers to the AI's ability to analyze text and predict the emotional tone that should be conveyed by the avatar. This technology is a key component of expressive avatars, allowing them to match their facial expressions and tone of voice to the sentiment of the script. For example, the script mentions that the AI can detect if a sentence is happy, frustrated, or upset, and the avatar will perform accordingly.

💡Lip Sync

Lip sync is the process of matching an avatar's mouth movements to the audio track in a video. In the script, it is mentioned that Synthesia has improved lip sync technology, making it more accurate across different languages. This enhancement contributes to the overall naturalness and realism of the avatars in videos.

💡Engagement

Engagement, in the video script, refers to the level of interest and interaction that viewers have with the content. High engagement is important for various stakeholders, such as marketers or educators, as it can lead to better conversion rates or information retention. The script emphasizes that expressive avatars can enhance engagement by making videos more emotionally compelling and thus more effective in capturing and maintaining the viewer's attention.

💡Express One AI Model

The Express One AI model is the underlying technology that powers the expressive avatars. It has been trained to understand the nuances of human language and the relationship between what is said and how it is said. This model allows avatars to perform like actors, delivering lines with the appropriate emotions and expressions, making the avatars more engaging and lifelike. The script describes how this model has been developed by analyzing hours of human speech and performance.

💡Multimodal Communication

Multimodal communication refers to the use of multiple modes or channels of communication, such as text, voice, and visual cues. In the script, the upgraded avatar technology is described as a form of multimodal communication because it combines verbal language with non-verbal cues like facial expressions and body language. This integration is crucial for creating a more natural and human-like communication experience in videos.

💡Video Life Cycle

The video life cycle refers to the entire process of creating, editing, sharing, and distributing video content. In the script, Synthesia's platform is described as covering the entire video life cycle, from video creation with avatars and editing tools to collaborative features and sharing capabilities. This comprehensive approach aims to simplify video production and make it accessible to a broader range of users.

Highlights

Introduction to the anticipated launch of a major upgrade to Avatar technology.

Invitation for participants to share their locations, showcasing the global reach of the audience.

Overview of Synthesia's mission to simplify video content creation for everyone.

Description of the all-in-one platform for AI Video Communications provided by Synthesia.

Emphasis on the ease of use for those without video editing experience.

Introduction of AI video assistant for content generation from various sources.

Explanation of one-click translation feature for global content reach.

Discussion on the importance of video content in the online economy.

Analysis of how visual content engages audiences more effectively.

Historical context of Avatar technology development since 2020.

Introduction of the new Expressive Avatars with improved emotive capabilities.

Explanation of the Express One AI model that powers the new avatars.

Demonstration of sentiment prediction and its impact on avatar performance.

Showcasing improved lip sync technology across multiple languages.

Preview of the new human-like voices that enhance avatar expressiveness.

CTO Jonathan Stark's discussion on the generative model behind Expressive Avatars.

Market feedback and customer anticipation for more lifelike avatars.

Examples of how expressive avatars can be used in healthcare, marketing, and customer support.

Tips and tricks for using expressive avatars effectively.

Announcement of upcoming features like multilingual voice cloning and AI screen recorder.

Invitation to test the new expressive avatars and provide feedback.

Transcripts

00:05

All right, we're live! Thanks to all of you for being here on time, that's amazing. We're going to give people just a few more moments to trickle in, so let me know in the comments where you're calling in from. I'm here in the London HQ, but I know the Synthesia fam is global, so some of you are probably eating breakfast and some of you are having dinner. Let us know where you are; it's always fun to see where people are calling in from. In just a minute we'll get started. This is going to be, I think, the most anticipated launch we've had this year so far, and that makes sense: this is a huge upgrade to the core avatar technology, and I can't wait to show you all the cool things you'll be able to do with it. If you stick around to the end, we'll give you a link where you can be one of the first people in the world to try out expressive avatars; you don't even have to sign up. So make sure you stay until the end, and we'll drop that link for all of you here. That said, let's get started.

01:00

As we always do, for those of you who are new to Synthesia and haven't heard about us before, I want to spend just a few moments talking about what we do. At Synthesia our mission is pretty simple: we want to make it easy for everyone in the world to make video content, and we do that by giving you one platform for all your AI video communications. We help you make the video with our avatars and voices, which we'll be talking more about today. We give you an entire video editor where you can add in your screen recordings, your background images, text, animations, all the things you need to make a video end to end. We also give you a collaborative platform, so you can have a team on Synthesia: you can invite your colleagues, leave comments, and even work together in videos in real time, like you know it from Google Docs, where you can see each other's cursors. And once you're done with the video, we also help you share it with the world, whether through our sharing pages or via our video player, which you can put into your app, your website, or wherever you want it. So it's one platform for the entire video lifecycle, and it's really easy to use.

02:06

Being really easy to use matters especially for people who don't have any video editing experience but come from a background of creating content: maybe you've written documents, maybe you've made PowerPoints. We try to make the platform as easy to use as possible, and we do that in a few different ways. We have a huge bank of templates, where you can find templates for different use cases and different visual looks and feels to get you started really quickly. We have our AI video assistant, a very popular feature that we launched recently. With the AI video assistant we can take some of your existing content, which could be a PDF document, a PowerPoint, a blog post on your website (so a URL), or even an entirely free-form prompt idea, and generate a video for you: we'll take the content, write the script, and give you some basic visuals, so you have something you can finalize yourself. And once you've done all that, we also help you with one-click translation, so you can take your video and make it into as many different languages as you want.
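As an aside for developers: a workflow like this can also be scripted. Synthesia offers an API, but the endpoint and field names below are assumptions made for illustration, not confirmed documentation; treat this as a sketch of the document-to-video and one-click-translation flow described above.

```python
import requests

API = "https://api.synthesia.io/v2"          # base URL assumed for this sketch
HEADERS = {"Authorization": "YOUR_API_KEY"}  # placeholder credential

# Hypothetical request: turn an existing blog post into an avatar video,
# mirroring the AI video assistant flow described in the talk.
draft = requests.post(
    f"{API}/videos/fromSource",              # endpoint name is illustrative
    headers=HEADERS,
    json={
        "sourceUrl": "https://example.com/blog/launch-post",
        "avatar": "julia",                   # hypothetical avatar id
        "language": "en-US",
    },
    timeout=30,
)
video_id = draft.json()["id"]                # response shape is assumed

# Hypothetical one-click translation into several target languages.
requests.post(
    f"{API}/videos/{video_id}/translate",    # endpoint name is illustrative
    headers=HEADERS,
    json={"targetLanguages": ["de-DE", "fr-FR", "es-ES"]},
    timeout=30,
)
```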

03:06

Why is it important to make videos? For us, the way we think of this is that if you look at the online economy today, it's just very obvious that people want to watch and listen to content, and they don't want to read that much anymore. In our private lives, most of us (and I'm guessing a lot of you out there as well), when we want to learn something new, probably start on YouTube; maybe you listen to a podcast, you go on TikTok, and you may also buy the book, but usually that's something like step five down the line. That is definitely the pattern most people have, and it makes sense, because biologically we're just hardwired to better understand and remember visual content: it stimulates more senses and feels more like how we consume information in the real world. The byproduct of this is that engagement goes up significantly, and for all of you out there making videos for a variety of different stakeholders, engagement is super important. If you're doing sales and marketing content, you want engagement because it translates to better conversion rates; if you're doing learning and training content, you want engagement because it translates into higher information retention. So engagement is really important, and it's only getting more important, and video is only going to get more important in terms of how we communicate. If you look at the average attention span, it has shortened by 69% since 2004. That is a staggering number, and what it means for everyone who makes content is that you really need to be good at grabbing people's attention and keeping it. And of course you don't do that with a huge wall of text; you do that with really awesome video and audio content, and that's what we help you do.

04:38

This is of course what led to Synthesia. We developed the avatar technology back in 2020, and the insight we had back then was that humans just respond so much better to human faces and voices than to anything else when it comes to communication. Since then the avatar technology has come quite a long way, and today we're going to showcase the latest and greatest. But I wanted to take a moment to explain the progression of avatar technologies, what the limitations are, and all the cool new things you're going to be able to do with expressive avatars.

05:12

Back in 2020 we invented the first avatar platform, and most other avatar products you'll see are still using this technology today. It's actually pretty simple: you take a real video of a real person, you loop it in a smart way, and then you just change the lips so that they match a new voice track. This illusion can work pretty well, and you've probably all seen really good results of it, but it does break down, especially when you're using it to create a lot of content, because ultimately the body language, the facial expressions, and everything else just don't match what the avatar is saying; it's just a video playing in the background. So you might get avatars that do weird things with their head, or hands that are clearly not in beat with what's being said, and that's because it is, in fact, just a video playing in the background.

05:58

In 2022 we launched the first version of avatars that are fully synthetically generated, which means that everything in the video is generated by AI, not just the lips. What this enabled us to do back in 2022 was take out those weird things you don't want in a video: avatars doing strange things with their heads and hands. In 2023 we took that technology and began using it to build the performance back into the avatars, so you could add in gestures; we improved the voices, and overall the avatars just performed better and looked more real. But there has been a big problem with avatar technologies up until this day: even though the results are really good, avatars have no idea what they're actually saying. What does that mean? Well, if you look at what humans do when we talk, we change our tone of voice, we change our facial expressions, and we emote in our body language differently depending on what we say. When I'm talking right now, you'll see my eyebrows going up; I have a lot of micro-expressions, and my hands are in tune with what I'm actually saying. That's not something I'm conscious about; it's just what we do when we talk, and all of you out there do the exact same thing. It's a lot of very small things that, essentially, make us human. But avatars don't have this understanding today, and that's why a lot of avatars can look a bit robotic, a bit dull, and to some extent a little unengaging: there is something very robotic in the delivery of their lines. One way of thinking about this is that when you give a line or a script to a real actor, they will perform it; they won't just read out what's on the paper. Today's avatars are just reading out what's on the paper. With expressive avatars we have actually transformed them into actors that understand what they're saying and can deliver their lines in a way that is much more engaging and natural than before.

07:52

What you're seeing here in the background is the kind of emotions and micro-expressions the avatars are capable of: they can be happy, they can be a bit sad, they can even laugh a little. This is because we've built the new expressive avatar product on top of what we call our Express One AI model, essentially an AI model that has watched hours and hours of people talking and decoded the relationship between what we say and how we say it, so that we can make the avatars perform like that. The system has three components that you'll notice as a user of expressive avatars. The first is sentiment prediction and facial expressions; this is all in our Express One AI model. The second is better lip sync: making the lip sync better and better has of course been a continuous project for us, and you should see a significant uplift with this new release. And last but not least, we've also given the voices an update, so that they match the body language: they're emotive, with a high dynamic range of intonation, rhythm, and so on. The first piece is automatic sentiment prediction, which is really all about figuring out the relationship between a specific sentence and the performance of the avatar.

09:10

I'd love to show you how this actually works. If we jump into Studio, this is Julia, one of our new expressive avatars, on a nice basic background, and I'm going to put in a few sentences. The first one is "I am very happy!" with an exclamation mark and a happy smiley (we'll get back to that in a minute), then "I am frustrated" and "I am so upset". These are of course all emotionally laden words, and the reason I'm using them to demonstrate how this works is that what the AI system, what Express One, picks up on is the nuances in the language: it's trying to figure out how this should be performed. You can even do small fun things like adding an emoticon or emoji, just to re-emphasize for the AI model that this should be a happy sentence, and in this way there's a lot of fun experimentation to be had in getting the right performance out of your avatars. We've made a video with this script; let's watch what it looks like. "I am very happy. I am so upset. I am frustrated." So that's what it looks like with the new expressive avatars. As you can see, it's a huge difference from what we had before, and this is all because of our automatic sentiment prediction tech.

10:35

The next one is lip sync, which you're all very well aware of. We've given it a big improvement in this version, and it's not just in English; this works in any language. Let's see how some of our new avatars do with some tongue twisters: "Six sleek swans swam swiftly southwards." "Peter Piper picked a peck of pickled peppers." "How can a clam cram in a clean cream can?"

11:01

Third is the human-like voice. The voice is of course an incredibly important part of your video, both when you have an avatar and even more so if you're doing slides or scenes that don't have an avatar in them. Here again it's much more lifelike, much more natural, much more interesting to listen to. Let's take a quick listen to what traditional AI avatar voices sound like, and then we'll listen to the expressive voices after: "Can you hear that? I didn't realize it would transform the video so much. New voices really capture your attention." "Can you hear that? I didn't realize it would transform the video so much. New voices really capture your attention." As you can hear, those are two completely different ways of saying the same thing, the latter of course being much more natural and interesting to listen to.

11:57

I could talk all day about Express One, because it's such an exciting piece of technology, but I'd love to hand over to Jonathan Stark, our CTO, to talk a bit more about the model, how it was made, and what it means for you as our customers, both today and in the future as we continue to develop this technology.

12:07

Hi, I'm Jonathan Stark, CTO at Synthesia, and today we're here at our studio in London. Express One is a new type of model that has learned how to understand the text that's spoken, so that it can express the non-verbal cues of communication; when we speak, it's not just the words we say, it's how we say them. It's a generative model that has learned to distill knowledge about the world from the data it's trained on. We work with actors in London and New York at our production studios, something like 50 or 60 actors a week, and this is an ongoing program to record the best performances in the world so that we can distill them into the performances of our avatars. With avatar technology, a lot of what's been done to date is looping and replay: you've got an existing video and you reuse it. With Express One, we're generating performances. It's a purely generative performance; everything you see is completely new every single time. When we think about the future of this type of technology, we think about unlocking richer and richer content for users: being able to place avatars in locations, being able to use more body language in the way we communicate. These are all the sorts of things we'd like to bring to our users.
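To make Stark's contrast concrete: looping and replay reuses fixed footage no matter what is said, while a generative approach samples a fresh full-body performance conditioned on the script. Here is a toy Python sketch of the two ideas; the interfaces are entirely hypothetical, since Synthesia has not published Express One's architecture.

```python
import random

def looped_replay(base_frames: list, new_lip_frames: list) -> list:
    """Old approach: replay the same recorded body footage on a loop and
    swap in new lips, so the body language never depends on the words."""
    return [
        {"body": base_frames[i % len(base_frames)], "lips": lips}
        for i, lips in enumerate(new_lip_frames)
    ]

def generative_performance(script: str, seed: int | None = None) -> list:
    """Generative idea (conceptually): sample a fresh, full-body performance
    conditioned on the text, so motion can track meaning and every render
    is new."""
    rng = random.Random(seed)
    return [
        {
            "word": word,
            # stand-ins for sampled expression/gesture latents
            "expression_latent": rng.random(),
            "gesture_latent": rng.random(),
        }
        for word in script.split()
    ]

# Two renders of the same script differ, unlike a looped replay.
print(generative_performance("I am very happy!")[:2])
print(generative_performance("I am very happy!")[:2])
```

The point of the contrast is the conditioning: in the replay version the body frames are independent of the words, while in the generative version every frame is sampled anew from the text, so no two renders are identical.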

13:37

Awesome. So what does this actually mean for you as customers? The avatars are now much better and much more lifelike. For a lot of you I probably don't need to say that, because it's what everyone has been asking for for a really long time, and I think the results speak for themselves. But we did go out to the market: we talked to a lot of our customers, we looked at how we use it internally, and we showed it to a lot of people. What it really comes down to is that it opens up a lot of new use cases now that the avatars are more expressive and more natural, because for videos where you want to connect with the viewer on some emotional level, you actually can with these avatars, and that wasn't really possible with the previous generation. If you're talking about sensitive things like healthcare, for example, you want empathy in the voice; you want it to be pleasant to listen to. If you're doing product, marketing, or sales content, you probably want a bit more upbeat excitement in the voice, to really carry the storytelling. For customer support, you want a probably frustrated customer to feel like they're being listened to, to see a friendly video with a degree of understanding, not just a robot giving them tips on how to solve their problem. There are of course a million more use cases where this makes sense, but for us it's really about the fact that you can now begin to connect with the viewer in a very different way than you could before.

15:00

So we prepared a few examples of what that looks and sounds like. First off we've got healthcare education, and I think when you watch this video, it's just such a nice and pleasant experience: "Hi James, I'm Paloma. I'm here to help with your medication and share some comforting tips. It's normal to feel overwhelmed and anxious, but I promise you'll start feeling better soon. My first piece of advice: keep your medication next to something you use daily. Have you considered involving your friends or family? They can provide great support during this time. I hope these tips are helpful. Let's go through treatment together; let me know if you have any questions." Yeah, so as you can hear, it's much more natural, with that kind of empathetic voice; as a patient, you'd feel in safe hands watching this video.

15:52

Product and marketing is at the other end of the spectrum: you want excitement, you want optimism. And again, all of this is actually deduced by our model from just the text. It understands that for healthcare it should sound a bit more empathetic than if you put in the script of someone selling you something or making an ad; all of this is automatically determined by the system. Let's listen to the product marketing example, which should sound quite different: "Hello there! Let me introduce you to SpendSmartly's exciting new feature, Auto Savings. With it, saving money becomes a seamless and enjoyable part of your life. Let me explain: we round up to the nearest dollar and boost your savings whenever you shop. Coffee, groceries, every swipe adds up. I could buy a house already! Join SpendSmartly now and turn your everyday purchases into future savings." So as you can see, this ad is much more upbeat, much more excited and optimistic, and again that's inferred from just the script.

16:51

The third example is customer support, where you're dealing with someone who's probably a bit unhappy with your product; they've run into a problem and they're trying to solve it, so you want to meet them with a friendly face and a nice video that calms them down and helps them solve that problem. Let's watch this one as well: "Hello there, I'm Jazz. Seems like you are having some trouble with the internet connection. That's annoying, but no worries, we'll have it up and running again quickly. First, let's find your router. You'll notice some lights on it; go ahead and try unplugging it for a moment. All right, next up, let's take a quick look at your devices; just make sure they are all connected to the right network. Okay, next, let's give your connection a quick test: can you open a website or watch a video? How is it looking? Hope this was helpful and easy to follow; if not, please let us know via email." So that's a few examples of what expressive avatars can do and how they're different from the avatars you know today.

17:47

A few tips and tricks for how to use these avatars. Along with expressive avatars, we're also launching Avatar Preview. Avatar Preview is real-time previewing of your video before you've actually rendered it, which is quite powerful when you're putting together your performance and testing out different types of scripts and emotions. The way it works is you just hit the video preview button, and in something like ten seconds you get to see what the video looks like before you render it out. It's a bit lower resolution than the final result, but it's fantastic for iterating on the script and the content. To get expression out of the avatar, use emotive words; there are a few on the slide, but in general you want to use language that provokes emotion. You can also use punctuation, so exclamation marks and question marks, and you can even try putting smileys in there: everything you can do to guide the system towards the emotion you want in your video helps the system understand you better.
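Putting these tips together, here is a hedged sketch of the iterate-with-preview loop: steer the performance with emotive wording, punctuation, and an emoji, request a fast low-resolution preview, tweak, and repeat. As in the earlier sketch, the endpoint and field names are assumptions for illustration only, not confirmed API documentation.

```python
import requests

API = "https://api.synthesia.io/v2"          # same assumed base URL as earlier
HEADERS = {"Authorization": "YOUR_API_KEY"}  # placeholder credential

# Emotive wording, an exclamation mark, and an emoji all nudge the model
# toward the intended emotion, per the tips above.
script = "Fantastic news! 😊 Your account is ready, and I'm thrilled to show you around."

preview = requests.post(
    f"{API}/videos",                         # endpoint/fields are illustrative
    headers=HEADERS,
    json={
        "title": "Onboarding welcome (draft)",
        "preview": True,                     # hypothetical low-res preview flag
        "input": [{"scriptText": script, "avatar": "julia"}],
    },
    timeout=30,
)
print(preview.json())  # inspect the ~10-second preview, tweak wording, repeat
```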

18:53

To sum up the difference between old-school AI avatars and expressive avatars: the old-school avatars you all know, you've all seen the videos, and they work really well for some types of content, especially practical, how-to material like how to change your password. But there's a robotic demeanor and a bit of a robotic voice; the script, the facial expressions, and the body language are not really in sync, and that can give you these weird moments. Again, it works for some use cases. Expressive avatars, on the other hand, actually understand what they're saying. Our avatars are performing like an actor would, and this means they're much more pleasant to look at, longer-form videos work even better, and you can now begin to build a bit more of an emotional connection with your viewer. So if you're doing sensitive content, or content where you want to drive a bit more engagement and connection, expressive avatars are really superior in this regard.

19:47

Expressive avatars are very exciting, but we're of course working on a bunch of other things as well, and I just wanted to mention a few of them before we jump off. First, we've got multilingual voice cloning, which means your voice in a lot of different languages: a very, very cool feature, and you can hear yourself speak with a different accent, which in itself is a really fun experience. We're adding more expressive avatars, of course, so keep an eye out; in the coming weeks more of them will drop into the product. We have our AI screen recorder, which is going to be a really amazing addition to the product for those of you who work with screen recordings today; we'll soon be ready to lift the veil on what people are working on, but this will make your life a lot easier. The last one is triggers: we're renaming markers, the system we use for animating content in videos. We've simplified it and made it a bit more powerful, and for those of you who are super users, I think you'll really enjoy the direction this feature is taking next.

20:41

And of course, last but not least, test out the expressive avatars. All of you here on the call have a chance to be among the first in the world to play with this technology. If you go to synthesia.io/avatars you can try it out for free; you don't even have to put in your email. We also have a premium option, so if you want to try out the entire product, you can definitely do that as well. Let us know what you think of the avatars, what you want, and how we can make them better. And then I'll be very excited to talk about the next generation sometime later this year, which will be even more mind-blowing than what we're seeing today. Thank you so much for being here, I really appreciate it, and I cannot wait to see all the cool things you'll be creating with expressive avatars. Thank you!


Related Tags
Synthesia, Expressive Avatars, Video Technology, AI Video Communications, Content Creation, Engagement, Video Editing, Emotional AI, Text-to-Video, AI Generated