Create Your Own AI Animated Avatar: A Step-by-Step Guide

Prompt Engineering
4 Feb 2023 · 07:57

TLDR: In this video, Rachel, an AI avatar, walks viewers through creating a personalized AI avatar. The tutorial begins by generating an image with Midjourney, a platform that uses AI to produce images from text prompts. Next, the script for the video is written with ChatGPT, an AI language model that generates natural language text. ElevenLabs is then used to create a high-quality AI voice-over. Finally, the video is assembled in D-ID (transcribed as 'did' in the transcript), an AI video platform that animates the avatar image to match the audio. The step-by-step guide encourages viewers to experiment with these tools and create avatars of their own.


  • 🚀 Creating an AI Avatar involves using a combination of AI tools and creativity.
  • 🖼️ The image for the AI Avatar can be generated using platforms like Midjourney, which requires a specific prompt syntax.
  • 💬 The script for the AI Avatar is written using an AI language model, ChatGPT from OpenAI, which can generate natural language text.
  • 🎙️ The AI voice-over is provided by ElevenLabs, a company that specializes in high-quality AI voice synthesis.
  • 🎥 The video for the AI Avatar is created using the AI video platform D-ID (transcribed as 'did' in the transcript).
  • 📸 Midjourney's image generation process involves selecting an image and upscaling it for higher resolution.
  • 📝 The voice-over is created by copying the script text into the ElevenLabs platform and selecting voice settings.
  • 🔊 Special characters in the script may need to be replaced to ensure proper audio generation.
  • 🌟 D-ID offers pre-built avatars and the ability to upload custom ones, along with various voice styles to match the script.
  • 📈 Generating a video in D-ID consumes credits, and the platform shows how many credits are used.
  • 🔍 The final AI Avatar video can be downloaded and uploaded to platforms like YouTube for sharing.
  • 🤖 Although the AI Avatar's animation might appear robotic, it is considered a good result for a free tool.

Q & A

  • What is the name of the AI language model used to create the script for the AI avatar?

    -The AI language model used to create the script is ChatGPT, created by OpenAI.

  • Which company is credited for providing the AI voice-over for the AI avatar?

    -ElevenLabs is the company credited with the voice-over; it specializes in creating high-quality AI voice-overs.

  • What AI video platform was used to create the dynamic and engaging video?

    -The AI video platform used to create the video is D-ID (transcribed as 'did' in the transcript).

  • How can one generate an image for the AI avatar using Midjourney?

    -To generate an image using Midjourney, one needs to join the Discord server, find a newbies channel, and use a specific prompt syntax to describe the desired image characteristics.

  • What is the process of 'upscaling' an image in the context of Midjourney?

    -Upscaling an image in Midjourney produces a larger, higher-resolution version of the selected image, which may take some time to process.

  • How does one create a script for the AI avatar using ChatGPT?

    -To create a script using ChatGPT, one simply provides a prompt or request, and the AI generates natural language text based on that input.

  • What is the process of generating an audio narration using ElevenLabs?

    -After copying the script into ElevenLabs, one can select a voice and adjust voice settings, then generate the audio narration, which can be downloaded once completed.

  • How does D-ID animate the face of the AI avatar?

    -D-ID uses the uploaded audio to animate the face of the AI avatar, making it appear as though the avatar is speaking, although it may still look somewhat robotic.

  • What is the cost associated with generating a video using D-ID?

    -Each video generated on D-ID costs five credits, and the platform provides a certain number of credits for users to start with.

  • How can one obtain the final AI animated avatar video?

    -After the video is generated in D-ID, it can be downloaded and then uploaded to platforms like YouTube for sharing.

  • What is the purpose of the Discord server mentioned in the script?

    -The Discord server is used for accessing Midjourney to generate images for the AI avatar, and it must be joined before Midjourney can be used.

  • Why is the AI avatar created to look like a human but not be a real human?

    -The AI avatar is designed to look like a human to facilitate communication and engagement with users, even though it is not a real human, to provide a more relatable and interactive experience.



😀 Introduction to AI Avatar Creation

In this introductory section, Rachel, the AI host of the Prompt Engineering channel, welcomes viewers and states her purpose: to guide them through the process of creating their own AI avatar. She introduces herself as an AI avatar created using advanced AI tools and techniques. Rachel explains that the script for the video was generated using ChatGPT, an AI language model by OpenAI, which can produce natural language text. Her voice was provided by ElevenLabs, a company specializing in high-quality AI voice-overs. The video itself was made using an AI video platform called D-ID (transcribed as 'did' in the transcript). Rachel encourages viewers to use these tools and their creativity to create unique AI avatars and looks forward to seeing their creations.


🎨 Creating an AI Avatar Image with Midjourney

This section details the first step in creating an AI avatar: generating an image. Rachel instructs viewers to use Midjourney, an AI image generation tool, which is accessed through a Discord server. She shows viewers how to join the server, creating a Discord account first if they don't have one, and how to navigate to a 'newbies' channel. There, users are shown how to generate an image using a specific prompt syntax, which includes a description of the desired image, camera settings, and lighting conditions. Rachel demonstrates using a prompt found on Reddit to generate a series of candidate images. Viewers can then select a preferred image and upscale it for higher resolution; the upscaled image is saved for use in the rest of the avatar creation process.
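
As a rough illustration of the prompt structure described above (a description, then camera settings, then lighting), the following Python sketch assembles the three parts into one prompt string. The helper name `build_imagine_prompt` and the example wording are hypothetical and not taken from the video; `/imagine` is the Discord command that Midjourney uses to accept a prompt.

```python
def build_imagine_prompt(subject: str, camera: str, lighting: str) -> str:
    """Join the prompt parts into one comma-separated Midjourney-style prompt.

    Empty parts are skipped so the result never contains stray commas.
    """
    parts = [p.strip() for p in (subject, camera, lighting) if p.strip()]
    return "/imagine prompt: " + ", ".join(parts)


# Example values are illustrative only, not the prompt used in the video.
prompt = build_imagine_prompt(
    subject="portrait of a woman presenter looking at the camera",
    camera="85mm lens, shallow depth of field",
    lighting="soft studio lighting",
)
print(prompt)
```

The resulting line can be pasted into the Discord message box; keeping the three parts separate makes it easy to vary one (for example the lighting) while holding the others fixed.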

📝 Scripting with ChatGPT and Voice-over with ElevenLabs

This section covers the next steps in the AI avatar creation process: scripting and voice-over. Rachel shares that the video's script was generated using ChatGPT and then copied into ElevenLabs to create the narration. She explains that while ElevenLabs offers a free trial, signing up for an account allows for longer audio generations. She guides viewers on how to select a voice and generate the audio, replacing a special character in the script with her name, 'Rachel', before generating. Once the audio is generated, it is downloaded for use in the video creation step.
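
The script clean-up step mentioned above can be sketched in Python. This is a minimal example under stated assumptions: the placeholder `[Name]` and the helper `prepare_script` are hypothetical, since the summary does not say which special character appeared in the generated script.

```python
import re


def prepare_script(script: str, placeholder: str, name: str) -> str:
    """Replace a placeholder with the avatar's name and tidy whitespace."""
    filled = script.replace(placeholder, name)
    # Collapse runs of spaces and newlines so the pasted text has no odd gaps.
    return re.sub(r"\s+", " ", filled).strip()


# Hypothetical draft text; "[Name]" stands in for the unspecified character.
draft = "Hello, my name is [Name].\n\nWelcome to the channel!"
print(prepare_script(draft, "[Name]", "Rachel"))
# prints: Hello, my name is Rachel. Welcome to the channel!
```

Doing the replacement before pasting the text into the voice tool avoids having the placeholder read aloud literally.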

🎥 Video Creation and Animation with D-ID

In the final section, Rachel moves on to the video creation stage using D-ID (transcribed as 'did' in the transcript). She demonstrates how to create an account or start a free trial on the platform and guides viewers on how to upload their generated avatar image and script. The platform offers pre-built avatars and voice options, but Rachel chooses to upload the audio created earlier. She then shows how to generate the video, noting that it takes some time and uses credits within the platform. Once the video is ready, she mentions that it can be downloaded and shared on platforms like YouTube. Rachel concludes by expressing hope that viewers enjoyed the process and encourages them to subscribe for similar content.
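
To make the credit usage concrete, here is a small Python sketch of the arithmetic. The cost of five credits per video comes from this summary; the starting balance of 20 credits is a made-up figure for illustration, and the function names are hypothetical.

```python
CREDITS_PER_VIDEO = 5  # per-video cost as stated in this summary


def videos_affordable(credits: int, cost: int = CREDITS_PER_VIDEO) -> int:
    """Return how many whole videos a credit balance can pay for."""
    if credits < 0 or cost <= 0:
        raise ValueError("credits must be >= 0 and cost must be > 0")
    return credits // cost


def credits_after(credits: int, videos: int, cost: int = CREDITS_PER_VIDEO) -> int:
    """Return the balance left after generating the given number of videos."""
    needed = videos * cost
    if videos < 0 or needed > credits:
        raise ValueError("not enough credits for that many videos")
    return credits - needed


starting_balance = 20  # hypothetical starting balance, not from the video
print(videos_affordable(starting_balance))  # prints: 4
print(credits_after(starting_balance, 1))   # prints: 15
```

Because each generation spends credits whether or not the result is kept, it is worth checking the image and audio before generating.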



💡AI Animated Avatar

An AI Animated Avatar is a digital representation of a person or character that uses artificial intelligence to mimic human-like movements and speech. In the context of the video, Rachel is an example of an AI Animated Avatar created using advanced AI tools and techniques. The avatar can communicate and engage with viewers, showcasing the potential of AI in creating realistic, interactive digital characters.

💡Cutting-Edge AI Tools

Cutting-edge AI tools are the latest and most advanced applications of artificial intelligence technology. These tools are often at the forefront of innovation and are capable of performing complex tasks that were previously only possible for humans. In the video, such tools are used to create the AI Animated Avatar, emphasizing the role of modern AI in shaping new forms of digital interaction.

💡ChatGPT

ChatGPT is an AI language model developed by OpenAI that is designed to generate human-like text based on given prompts. It is a powerful tool for creating scripts, stories, or any form of written content. In the video script, ChatGPT is credited with writing the script that Rachel, the AI Animated Avatar, is using to communicate with the audience.

💡ElevenLabs

ElevenLabs is a company that specializes in creating high-quality AI voice-overs. Their technology enables the creation of natural and engaging voices for digital characters, which is crucial for making AI Animated Avatars seem more lifelike. In the video, ElevenLabs is used to give Rachel's avatar a voice, demonstrating the importance of vocal expression in avatar creation.

💡AI Video Platform

An AI Video Platform is a software solution that utilizes artificial intelligence to assist in the creation, editing, and enhancement of video content. These platforms often offer features like automated video generation, dynamic content insertion, and other time-saving tools. In the video, the AI video platform 'D-ID' is used to create the final video with the AI Animated Avatar, showcasing the ease and efficiency of modern video production with AI.

💡Midjourney

Midjourney is the AI image generation platform used in the video to generate images from text prompts. It is part of the process of creating a visual representation for the AI Animated Avatar. The video script describes using Midjourney to generate a potential image for the avatar, highlighting the use of AI in the visual design process.

💡Discord Server

A Discord Server is a community space within the Discord platform where users can communicate via text, voice, and video. In the context of the video, the Discord Server is used as a place to join the beta version of Midjourney, indicating the collaborative and community-driven nature of the AI avatar creation process.

💡Prompt Engineering

Prompt Engineering refers to the process of carefully crafting prompts to guide AI systems like ChatGPT or image generation tools to produce desired outputs. It is a form of AI interaction that requires creativity and understanding of the AI's capabilities. In the video, Rachel discusses prompt engineering in the context of creating the script and image for her AI Animated Avatar.

💡AI Language Model

An AI Language Model is a type of artificial intelligence designed to understand, interpret, and generate human language in a way that is coherent and contextually appropriate. ChatGPT is an example of an AI language model, and it is used in the video to generate the script for the AI Animated Avatar, Rachel, illustrating the model's role in content creation.

💡Dynamic and Engaging Videos

Dynamic and engaging videos are those that capture and maintain the viewer's attention through the use of motion, interaction, and compelling content. The video script discusses creating such videos using an AI video platform, emphasizing the role of AI in enhancing video engagement and viewer experience.

💡Video Generation

Video Generation is the process of creating a video, which can be manual or automated. In the context of the video, video generation is facilitated by an AI video platform that uses the uploaded image, script, and voice-over to produce a video of the AI Animated Avatar. This process is highlighted as being time-efficient and accessible, thanks to AI technology.


Introduction to the concept of creating an AI animated avatar using various tools.

Explanation of the use of ChatGPT for scriptwriting, demonstrating its application in generating natural language texts.

Mention of ElevenLabs for high-quality AI voice-over creation, highlighting its ability to produce natural and engaging audio.

Overview of the D-ID platform for creating dynamic and engaging AI videos.

Step-by-step guide on creating an AI avatar, starting from image generation with Midjourney.

Description of how to join Midjourney through its Discord server and participate in image creation.

Detailed guidance on using Midjourney's special syntax for image prompts.

Process of selecting and upscaling an image from Midjourney for use in the avatar.

How to leverage ChatGPT for scripting video content.

Usage of ElevenLabs to convert text into a spoken voice-over.

Explanation of signing up for ElevenLabs and generating voice narration.

Instructions on setting up a video project on the D-ID platform, including importing an avatar and audio.

Details on adding custom audio to the D-ID platform to personalize the avatar.

Final step of video generation on D-ID, including a mention of the platform's credit system.

Invitation to subscribe and engage further with the content for similar tutorials.