Create Your Own AI Animated Avatar: A Step-by-Step Guide

Prompt Engineering
4 Feb 2023 · 07:57

TLDR: In this video, Rachel, an AI animated avatar, walks viewers through creating their own AI avatar. First, an image is generated with Midjourney, an AI image generation tool accessed through its Discord server. Next, the script is written with ChatGPT, an AI language model by OpenAI. ElevenLabs then produces a natural, engaging voice-over from that script. Finally, the video is assembled in D-ID, an AI video platform that animates the avatar's face to match the audio. Rachel emphasizes how easy these tools are to use and how much room they leave for personalizing an avatar. The video closes with a demonstration of the final product, in which the avatar's facial expressions are animated in sync with the voice, though the result looks slightly robotic.


  • 🎭 Create an AI Avatar using a combination of AI tools and creativity.
  • 🖼️ Use Midjourney to generate an image for your avatar.
  • 💬 ChatGPT can generate natural language text for your avatar's script.
  • 🗣️ ElevenLabs is used to create high-quality AI voice-overs.
  • 🎥 D-ID is an AI video platform for creating dynamic videos.
  • 📸 Midjourney requires a special prompt syntax to generate images.
  • 📝 Copy the script from ChatGPT into ElevenLabs for audio narration.
  • 🔊 Customize voice settings in ElevenLabs for different narration styles.
  • 📦 Upload the generated image and audio to D-ID to create the video.
  • 🧑 Choose from pre-built avatars or upload a custom one in D-ID.
  • 📉 D-ID tracks the credits used for video generation, with each video costing five credits.
  • 🤖 The final video can animate the avatar's face based on the voice, though it may appear robotic.

Q & A

  • What is the purpose of the 'Prompt Engineering Channel'?

    -The purpose of the 'Prompt Engineering Channel' is to educate viewers on how to create their own AI Avatar using a combination of cutting-edge AI tools and techniques.

  • Who is the presenter of the video?

    -The presenter of the video is an AI animated Avatar named Rachel, created using AI tools and techniques.

  • What AI language model was used to write the script for the video?

    -The script for the video was written using ChatGPT, an AI language model created by OpenAI.

  • Which company provided the technology for the AI voice-over in the video?

    -The technology for the AI voice-over was provided by ElevenLabs (transcribed as '11 Labs'), a company that specializes in creating high-quality AI voice-overs.

  • How can one create dynamic and engaging videos as shown in the video?

    -One can create dynamic and engaging videos using an AI video platform called D-ID (transcribed as 'did'), which animates an uploaded image to match an audio track.

  • What is the first step in creating an AI Avatar like Rachel?

    -The first step is to create an image, which can be done with Midjourney by writing a prompt that follows the platform's syntax.

  • What does Midjourney require to generate an image?

    -Midjourney requires a prompt that follows a special syntax, including a description of the image, the camera type, lighting conditions, and parameters.

  • How does one upscale an image using Midjourney?

    -To upscale an image, one selects the desired variation from the generated options and instructs the tool to upscale it, which increases the image's size.

  • What is the process for creating the narration for the AI Avatar video?

    -The process involves taking the script generated by ChatGPT, copying it into ElevenLabs, selecting a voice setting, and generating the audio narration.

  • How does D-ID (transcribed as 'did') help in creating the final video?

    -D-ID lets users upload their avatar image and audio narration, then animates the avatar's face to match the voice, producing the final AI avatar video.

  • What is the final step in the process of creating an AI Avatar video?

    -The final step is to generate the video using the uploaded avatar and audio, and then download the completed video for sharing or uploading to platforms like YouTube.

  • What are the limitations of the free tools used in the process?

    -The limitations include the quality of the voice and the animation, which may appear robotic, and the length of the audio that can be generated for free.



🎭 Introduction to AI Avatar Creation

In the first paragraph, Rachel introduces the Prompt Engineering channel and herself as an AI avatar. She explains that she was created with a combination of AI tools: ChatGPT for script generation, ElevenLabs for a natural-sounding voice-over, and the AI video platform D-ID (transcribed as 'did') for the animated video. She invites viewers to follow along and outlines the steps for creating an AI avatar, starting with obtaining an image from Midjourney, an AI image generation tool.


🖼️ Creating an Image with Midjourney

The second paragraph details how to generate an image for the AI avatar with Midjourney. Rachel shows viewers how to join the Midjourney Discord server and use the platform's prompt syntax to create an image from a detailed prompt. She demonstrates selecting one of the generated options and upscaling it to a higher resolution. The paragraph concludes with saving the image, which is used later in the video creation process.
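The Q & A above describes a Midjourney prompt as an image description, a camera type, lighting conditions, and parameters. A minimal Python sketch of assembling such a prompt is shown below. The helper name `build_imagine_prompt`, the example wording, and the aspect-ratio value are assumptions made for illustration and are not the prompt used in the video; `/imagine` and `--ar` are assumed here to be the Midjourney command and aspect-ratio parameter.

```python
def build_imagine_prompt(description, camera, lighting, parameters=None):
    """Join the pieces of a prompt into one /imagine command string.

    Hypothetical helper for illustration; it only builds text and does
    not talk to Discord or Midjourney.
    """
    # Drop empty pieces so the prompt has no stray commas.
    parts = [p.strip() for p in (description, camera, lighting) if p and p.strip()]
    prompt = ", ".join(parts)
    # Parameters go last, each written as "--name value".
    for name, value in (parameters or {}).items():
        prompt += f" --{name} {value}"
    return f"/imagine prompt: {prompt}"


# Example values are made up for this sketch.
command = build_imagine_prompt(
    description="portrait of a friendly young woman, news presenter",
    camera="shot on a 50mm lens",
    lighting="soft studio lighting",
    parameters={"ar": "2:3"},
)
print(command)
```

The printed string is meant to be pasted into a Midjourney channel on Discord. Keeping the pieces separate makes it easy to vary one element, such as the lighting, while holding the rest of the prompt fixed.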



💡AI Animated Avatar

An AI Animated Avatar refers to a digital character that is created and controlled by artificial intelligence. In the context of the video, it is a character like Rachel, which is not a real human but can communicate and engage with people. The avatar is created using a combination of AI tools and techniques, as demonstrated in the video script.

💡Cutting Edge AI tools

Cutting Edge AI tools refer to the latest and most advanced artificial intelligence applications. In the video, these tools, which include natural language generation, voice synthesis, and video animation technologies, are used to create the AI avatar.

💡ChatGPT

ChatGPT is an AI language model developed by OpenAI. It generates natural language text, which can be used to write scripts like the one in the video. The script is a key element in defining how the AI avatar communicates.

💡ElevenLabs

ElevenLabs (transcribed as '11 Labs') is a company that specializes in creating high-quality AI voice-overs. Its technology gives the AI avatar a natural and engaging voice, which is crucial for the avatar's ability to communicate effectively with its audience.

💡AI Video Platform

An AI Video Platform, such as D-ID (transcribed as 'did' in the script), is a tool that simplifies the creation of dynamic and engaging videos. It is used in the video to animate the AI avatar and synchronize its movements with the generated voice-over.

💡Midjourney

Midjourney is the tool used in the video to generate images for the AI avatar. It requires a specific prompt syntax and produces several image variations from the provided description. In the script, it is used to create the visual representation of the AI avatar.

💡Discord Server

A Discord server is a community space on Discord, a platform for real-time communication through text, voice, and video. In the video, viewers are told to join the Midjourney Discord server to access the image generation tool.

💡Natural Language Generation

Natural Language Generation (NLG) is an AI system's ability to produce coherent, human-like text or speech. In the video, ChatGPT is used for NLG to create the script for the AI avatar's dialogue.


💡Upscaling

Upscaling in the context of the video refers to the process of increasing the size of a digital image or video while maintaining or enhancing its quality. Midjourney upscales the generated image of the AI avatar to make it suitable for video animation.

💡Video Generation

Video Generation is the process of creating a video, which in this case involves animating the AI avatar and synchronizing it with the voice-over. The D-ID platform (transcribed as 'did') generates the final video and deducts credits for each video created.
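Because each video is reported in the takeaways to cost five credits, it is easy to work out how many videos a given credit balance covers. The sketch below assumes that flat per-video cost; the function names and the example balance of 20 credits are hypothetical, and actual pricing on the platform may differ.

```python
CREDITS_PER_VIDEO = 5  # figure reported in the video summary; treated as an assumption


def videos_remaining(credit_balance, cost=CREDITS_PER_VIDEO):
    """Return how many whole videos the balance can pay for."""
    if credit_balance < 0 or cost <= 0:
        raise ValueError("balance must be non-negative and cost positive")
    return credit_balance // cost


def balance_after(credit_balance, videos, cost=CREDITS_PER_VIDEO):
    """Return the credits left after generating the given number of videos."""
    spent = videos * cost
    if spent > credit_balance:
        raise ValueError("not enough credits for that many videos")
    return credit_balance - spent


# Hypothetical starting balance, chosen only for the example.
print(videos_remaining(20))
print(balance_after(20, 3))
```

With the example balance, four videos can be generated in total, and five credits remain after the first three.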


💡YouTube

YouTube is a video-sharing platform where users can upload, share, and view videos. In the script, it is mentioned as the intended platform for uploading the final AI avatar video after it has been generated and downloaded.


Rachel introduces the process of creating an AI Avatar using advanced AI tools and techniques.

The script for the video was written using ChatGPT, an AI language model by OpenAI.

ElevenLabs provides high-quality AI voice-overs, enabling natural and engaging voices for AI Avatars.

D-ID is an AI video platform that simplifies the creation of dynamic and engaging videos.

To start, an image is needed, which can be generated using Midjourney, accessed through a Discord server.

A special syntax is used for image prompts in Midjourney, allowing for detailed image specifications.

The generated image can be upscaled to a larger size within Midjourney.

ChatGPT is used to create a script for the AI Avatar's video, which can then be copied for narration.

ElevenLabs allows for customization of voice settings, providing a range of voices and styles.

The generated audio is downloaded, ready to be used as the video narration.

D-ID is used to create the final video, with the option to upload custom avatars and audio.

D-ID offers pre-built avatars and voice styles, but also supports uploading custom content.

D-ID tracks the credits used for video generation and deducts credits for each video created.

Once generated, the video can be downloaded and shared on platforms like YouTube.

The AI Avatar's face can be animated using the voice, although the movements may still appear robotic.

The free tool provided by D-ID offers a good starting point for creating AI animated videos.

Rachel encourages viewers to subscribe for similar content and thanks them for watching.