How to AI Animate. AnimateDiff in ComfyUI Tutorial.

Sebastian Kamph
10 Nov 2023 · 27:46

TLDR: The video offers a comprehensive guide on creating animations using AI, covering workflows for both text-to-video and video-to-video. It discusses the requirements for the different options, such as needing a GPU for the free version and no special hardware for the very cheap option. The guide walks through installing custom nodes, setting up workflows, and adjusting parameters for optimal results. Tips on animation length, frame rate, and prompt scheduling are provided, along with troubleshooting advice for common errors. The script also covers installing FFmpeg for video preview generation.


  • 🎨 The video demonstrates how to create animations with AI in a few minutes, showcasing workflows for both text-to-video and video-to-video animation.
  • 🖌️ The free option requires a GPU with at least 8 to 10 GB of VRAM, while the very cheap option needs nothing beyond a computer or phone.
  • 📂 The tutorial uses the Inner Reflections guide and workflow, starting with the paid option before moving on to the free version.
  • 📈 The number of frames and the frame rate determine the duration and smoothness of the animation; film and TV typically use 24 or 25 FPS.
  • 🚫 AnimateDiff can only create animations up to 36 frames, but longer animations can be achieved by chaining them together.
  • 🎯 Custom nodes may need to be installed for the free version, which can be done through the ComfyUI Manager.
  • 🌟 The motion scale setting affects the intensity of the animation: higher values lead to more dynamic movement, lower values to slower animation.
  • 📝 The prompt is a key component defining what the AI should generate, with both positive (desired) and negative (unwanted) descriptions.
  • 🔄 The seed value determines the consistency of the generation; a fixed seed yields the same animation on repeated runs.
  • 🎥 The video-to-video workflow uses a local installation of ComfyUI and may require additional models and custom nodes.
  • 📅 Prompt scheduling allows dynamic changes by assigning different prompts to different frames, creating a sequence of scenes such as a transition through the seasons.
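The relationship between the frame count, frame rate, and clip length mentioned above is simple arithmetic; a minimal sketch (the helper name is illustrative, not part of any tool):

```python
def clip_duration(num_frames: int, fps: int) -> float:
    """Length in seconds of a clip with num_frames frames played at fps."""
    return num_frames / fps

# AnimateDiff's 36-frame cap at 12 fps yields a 3-second clip;
# at film's 24 fps the same frames last only 1.5 seconds.
```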

Q & A

  • What are the different workflows mentioned in the script for creating animations with AI?

    -The script mentions three main workflows: text to video, video to video, and prompt scheduling. The text to video workflow is for creating animations from textual descriptions, the video to video workflow is for transforming existing videos, and prompt scheduling allows for dynamic changes in the animation based on a sequence of prompts.

  • What hardware requirements are there for the free and cheap options of creating AI animations?

    -For the very cheap option, no specific hardware other than a computer or phone is required. For the free option, a GPU with at least 8 to 10 gigs of VRAM is needed.

  • What is the role of the 'Inner Reflections guide and workflow' in the process?

    -The 'Inner Reflections guide and workflow' is a set of instructions and steps that the speaker is using to demonstrate the process of creating animations with AI. It likely provides a structured approach to using the AI tools effectively.

  • What is the maximum number of frames that AnimateDiff can generate in one go?

    -AnimateDiff can generate a maximum of 36 frames in one animation. For longer animations, multiple segments of 36 frames each can be created and chained together.

  • How does the 'motion scale' setting in the AnimateDiff settings affect the animation?

    -The 'motion scale' setting determines the amount of movement or animation in the generated video. A higher value results in more dynamic and wild animations, while a lower value leads to slower and more subdued movements.

  • What is the purpose of the 'prompt' in the text to video workflow?

    -The 'prompt' is a textual description that guides the AI in generating the desired animation. It includes positive descriptions (what the user wants to see in the animation) and negative descriptions (what the user does not want to see).

  • What is the significance of the 'seed' value in the animation generation process?

    -The 'seed' value is a starting point for the AI's random number generation process. It ensures that the same animation can be consistently regenerated by keeping the seed fixed. Changing the seed results in different iterations of the animation.
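That behaviour can be illustrated with a toy stand-in for the sampler's seeded noise source (the function below is hypothetical, not part of ComfyUI):

```python
import random

def initial_noise(seed: int, n: int = 4) -> list:
    """Toy stand-in for the sampler's seeded noise: the same seed always
    produces the same values, so a fixed seed regenerates the same animation."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]

# Same seed -> identical result; a different seed -> a new iteration.
```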

  • How does the 'sampler' setting affect the output of the animation?

    -The 'sampler' setting determines the algorithm used to generate the animation. Different samplers, such as 'DPM++ 2M Karras' or 'Euler a', produce varying levels of divergence in the generated images, affecting the consistency and final look of the animation.

  • What is the 'frame load cap' and how does it influence the video to video workflow?

    -The 'frame load cap' is a setting that limits the number of frames the AI will use from the input video. This can be used to reduce the computational load or to focus on specific parts of the video by skipping certain frames.
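A rough sketch of that selection logic (the parameter names `skip_first_frames` and `select_every_nth` mirror common video-load node options and are assumptions here):

```python
def select_frames(total_frames, frame_load_cap=0, skip_first_frames=0, select_every_nth=1):
    """Indices of the input-video frames the workflow will actually load.
    frame_load_cap=0 means no limit."""
    indices = list(range(skip_first_frames, total_frames, select_every_nth))
    if frame_load_cap > 0:
        indices = indices[:frame_load_cap]
    return indices

# e.g. cap a 100-frame video to its first 10 frames, or take every 2nd
# frame after skipping the first two.
```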

  • What is 'prompt scheduling' and how does it change the animation?

    -Prompt scheduling is a feature that allows the user to set different prompts for different frames within the animation. This results in an animation that changes according to the sequence of prompts, transitioning between different scenes or themes over time.
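In the common batch-prompt-schedule node, the schedule is written as comma-separated `"frame": "prompt"` pairs; the frame numbers and prompts below are illustrative:

```
"0"  :"a tree in spring, cherry blossoms",
"12" :"a tree in summer, lush green leaves",
"24" :"a tree in autumn, orange falling leaves",
"36" :"a tree in winter, bare branches, snow"
```

Every entry except the last must end with a comma; a missing or stray comma is the usual cause of prompt-schedule errors.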

  • Why is it necessary to install FFmpeg when running the AI animation process locally?

    -FFmpeg is necessary for generating a preview of the animation or combining the frames into a video or GIF. It is a powerful tool for handling multimedia data and is used to process the output of the AI-generated frames into a playable format.



🎨 Introduction to AI Animation Workflows

The speaker introduces the concept of creating animations using AI in a few minutes. They plan to demonstrate various workflows, including text-to-video and video-to-video, along with tips and tricks for optimal results. The speaker mentions the availability of free and affordable options, requiring minimal hardware. They also discuss the need for a GPU with 8-10 GB of VRAM for the free option and provide a link to the workflows in the description below.


🚀 Setting Up the Animation Process

The speaker explains the setup process for creating animations, starting with the ThinkDiffusion platform. They guide the audience on selecting the machine, launching it, and understanding the interface. The focus is on the text-to-video workflow, with an emphasis on the importance of settings such as the number of frames, frame rate, and animation size. The speaker also addresses potential errors and how to select appropriate models for the animation.


🌟 Customizing Animation Settings

The speaker delves into the customization of animation settings, including the context length and context overlap for chaining animations. They discuss the motion module and its impact on the animation's movement. The prompt section is highlighted, explaining how it defines the desired outcome and what to avoid. The speaker also touches on the seed's role in iteration and the sampler's effect on image generation.
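The chaining described above can be pictured as a sliding context window over the full frame range; a minimal sketch of the window arithmetic only (the real scheduler also blends the overlapping frames):

```python
def context_windows(total_frames: int, context_length: int = 16, overlap: int = 4):
    """(start, end) frame ranges that overlapping contexts would cover
    when chaining an animation longer than a single context allows."""
    step = context_length - overlap
    starts = range(0, max(total_frames - overlap, 1), step)
    return [(s, min(s + context_length, total_frames)) for s in starts]

# 36 frames with a 16-frame context and 4-frame overlap need three windows.
```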


📹 Video-to-Video Animation Tutorial

The speaker transitions to a video-to-video animation tutorial, emphasizing the need for a local installation of ComfyUI and its Manager. They guide through the installation of missing custom nodes and the process of loading a video input. The speaker explains the use of ControlNet nodes and their influence on the final animation, including the strength and duration settings. The video's setup is demonstrated, with adjustments made for smoother animations and better visual quality.


🎭 Advanced Prompt Scheduling

The speaker introduces the concept of prompt scheduling, allowing for different prompts to be applied to each frame of the animation. They demonstrate how to set up a batch prompt schedule and the importance of commas in the prompt list. The speaker also addresses common errors and how to fix them. The tutorial showcases the power of prompt scheduling, with an example of an animation transitioning through different seasons and settings.


πŸ› οΈ Installing FFmpeg for Video Preview

The speaker concludes with a guide on installing FFmpeg for previewing animations as videos or GIFs. They provide a step-by-step process for downloading and installing 7-Zip and FFmpeg, including renaming the folder and placing it in the root directory. The speaker emphasizes the importance of adding the FFmpeg path to the system's environment variables and provides a command for doing so. The tutorial ends with a prompt to watch another video for more on generative AI.
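For reference, the Windows steps the guide describes boil down to a few commands; this is a sketch assuming the extracted folder was renamed to C:\ffmpeg (run the setx line in an administrator prompt):

```
:: after extracting the FFmpeg archive with 7-Zip and renaming it to C:\ffmpeg
setx /M PATH "%PATH%;C:\ffmpeg\bin"

:: open a new terminal, then verify the install:
ffmpeg -version
```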



💡AI Animations

AI Animations refer to the process of creating animated content using artificial intelligence. In the context of the video, the speaker is discussing the creation of animations through AI in a short amount of time, demonstrating how AI can be utilized to expedite the animation process and produce various types of animated content, such as text-to-video and video-to-video workflows.


💡Workflows

Workflows in this context refer to the step-by-step procedures or sequences of tasks used to create AI animations. The video provides an overview of different workflows such as text-to-video and video-to-video, which are methods for generating animated content from textual descriptions or existing video footage, respectively.


💡GPU

GPU stands for Graphics Processing Unit, a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer intended for output to a display device. In the video, the speaker mentions the need for a GPU with at least 8 to 10 GB of VRAM for the free option of creating AI animations, indicating the computational power required for such tasks.

💡Inner Reflections Guide

The Inner Reflections Guide is a reference to a specific guide or set of instructions used in the workflow for creating AI animations. It serves as a resource for users to understand and implement the processes involved in generating animations using AI technology.

💡ComfyUI

ComfyUI is a node-based user interface for Stable Diffusion, used here as the platform for creating AI animations. It is the tool into which workflows are loaded and run, with specific features and options for customization and control over the animation process.

💡AnimateDiff

AnimateDiff is a technique for generating animated content with Stable Diffusion models by adding a pretrained motion module to the image-generation process. Within ComfyUI it is the extension used in this tutorial to turn still-image generation into animation.


💡Checkpoints

In Stable Diffusion workflows, a checkpoint is a trained model file that is loaded to generate the images. The choice of checkpoint determines the visual style of the animation, and it must be compatible with the motion module and other settings used in the workflow.

💡Prompt Scheduling

Prompt Scheduling is a feature or method that allows users to set specific prompts or instructions for different frames or segments of an animation. This enables the creation of dynamic animations where the content and style can change over the course of the animation, reflecting the different prompts scheduled for each frame or segment.

💡Control Nets

Control Nets are tools or models used in AI animation workflows to influence the output based on certain predefined parameters or 'controls'. They act as a guide for the AI to generate content that adheres to specific criteria, ensuring that the final animation aligns closely with the desired outcome.


💡FFmpeg

FFmpeg is a free and open-source software suite for handling multimedia files, including conversion between video formats. In the context of the video, FFmpeg is recommended for users running the animation creation process locally, to help preview and combine frames into a video or GIF format.


The speaker demonstrates how to create animations using AI in minutes.

Multiple workflows for text-to-video and video-to-video animation creation are discussed.

The use of a GPU with at least 8 to 10 gigs of VRAM is recommended for the free option.

The paid option requires no additional hardware and is easy to use.

The basics of text-to-video workflow are explained, including setting the number of frames and frame rate.

AnimateDiff can only make animations of 36 frames at most, but longer animations can be created by chaining them together.

The motion module and motion scale settings are explored, affecting the animation's movement intensity.

The prompt settings are detailed, explaining how to define what is wanted and unwanted in the animation.

The process of generating an animation using ComfyUI and the importance of the sampler setting are discussed.

The video-to-video workflow is introduced, requiring a local installation of ComfyUI and custom nodes.

Control net nodes are explained, which use line art to influence the end result of the animation.

Prompt scheduling is introduced, allowing for dynamic changes in the animation based on frame.

The importance of commas in prompt scheduling for proper error-free execution is highlighted.

The guide provides a step-by-step process for installing FFmpeg to preview animations as videos or GIFs.

The tutorial concludes with a summary of the key points and encouragement for further exploration of generative AI.