Stable Diffusion Consistent Character Animation Technique - Tutorial

Tobias Fischer
3 Apr 2023 · 34:11

TLDR: The video outlines a technique for creating animations with Stable Diffusion, focusing on character development and animation. The creator shares their process of developing non-programmer-friendly scripts that assist in generating animation sprite sheets. They discuss the use of a turntable LoRA and OpenPose for consistent character creation, and detail the steps involved in refining the character's appearance and poses. The video also covers using Stable Diffusion to polish details and create a coherent animation loop, and concludes with the implementation of the final sprite sheet in a game engine.


  • The video outlines a technique for creating animations with Stable Diffusion, demonstrated through the development of character animations.
  • The creator spent hours developing and refining the techniques, resulting in an orc animation tutorial.
  • Non-programmer-friendly scripts were developed to assist in the creation of animations, making the process accessible to a wider audience.
  • The original idea was inspired by a technique shared on the Stable Diffusion subreddit, which was then modified and expanded upon.
  • The tutorial involves downloading the scripts and setting up a workspace with a specific folder structure and files for organizing the animation assets.
  • The process includes creating pose images and reference images, which are used to guide Stable Diffusion in generating character animations.
  • Iterative refinement is key, with the creator using paint software to make adjustments and improvements to the generated images.
  • A turntable pose image is used to maintain consistency in character appearance throughout the animation.
  • A settings file (settings.py) controls the parameters of the animation generation process.
  • The creator emphasizes the importance of patience and meticulous work in achieving the desired look for the character, as this influences the final animation quality.
  • The final step involves importing the generated sprite sheet into a game engine like Godot, showcasing the animation in a game environment.

Q & A

  • What is the main technique discussed in the video?

    -The main technique discussed in the video is creating animations using stable diffusion, which involves generating character animations and developing non-programmer friendly scripts to assist in the process.

  • How long did it take the creator to work on the character animation?

    -The creator spent a few hours working on the character animation, developing techniques and experimenting with different approaches.

  • What was the original inspiration for the animation technique?

    -The original inspiration for the animation technique came from a user on the Stable Diffusion subreddit who demonstrated a method using a turntable LoRA and OpenPose to achieve consistent characters without a specific character embedding.

  • What are the necessary Python packages for this technique?

    -The necessary Python packages include opencv, numpy, and rembg. These can be installed using pip install commands.

  • How does the creator handle the character's facial features in the animation?

    -The creator uses a combination of stable diffusion and manual editing in a program like MS Paint to refine and polish the facial features, ensuring they are detailed and aligned with the desired outcome.

  • What is the purpose of the 'turntable.png' pose picture?

    -The 'turntable.png' pose picture is used as a reference for the character's pose and is placed in the root of the workspace to guide the stable diffusion process in generating consistent character orientations.

  • What is the role of the '' file in the workspace?

    -The '' file is the main configuration file for the animation project. It defines the workspace, pose names, and other parameters that the scripts use to generate and process the animations.

  • How does the creator ensure smooth transitions between frames in the animation?

    -The creator uses a combination of stable diffusion iterations, manual adjustments in painting software, and an interpolation process to create smooth transitions and maintain coherency between frames.

  • What is the final output of the animation process?

    -The final output of the animation process is a sprite sheet that contains all the individual frames of the animation, which can then be implemented into a game engine like Godot for use in a game.

  • What is the recommended 'iterations' value for interpolation?

    -The recommended 'iterations' value for interpolation is between one and three, with two being the sweet spot, as it allows for a good spread of frames without overcomplicating the process.

  • How can the creator improve the animation further?

    -The creator can improve the animation further by spending more time on refining the poses, using better interpolation tools, and polishing the details to achieve a cleaner and more cohesive final product.



Introduction to Animation Creation with Stable Diffusion

The paragraph introduces the speaker's recent work on creating animations using stable diffusion, showcasing two distinct results: a character animation and an orc attack animation. The speaker explains the process involved, including the development of techniques and scripts to facilitate the creation of animations, emphasizing their user-friendliness for those without programming experience. The original idea is credited to a technique demonstrated by XYZ disk on the stable diffusion subreddit, which the speaker has modified and expanded upon, creating helper scripts for generating large-scale animation sprite sheets. The speaker provides a GitHub link to these scripts and sets the stage for a tutorial on how to use them.


Setting Up the Workspace and Initial Configuration

The speaker details the initial setup process for the animation project, including downloading the necessary code, installing required packages like opencv, numpy, and rembg, and creating a workspace folder structure. The process involves creating a 'workspaces' directory, a 'poses' folder within it, and a specific 'turntable.png' pose picture. The speaker also mentions setting up a reference image and emphasizes that the scripts depend on correct folder names and structure. The paragraph concludes with instructions on initializing the workspace using a Python script.
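The folder setup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exact layout expected by the tutorial's scripts is not spelled out here, so the per-pose subfolders, the file name `reference.png`, and the function names `create_workspace` and `missing_files` are hypothetical.

```python
# Packages named in the tutorial; on PyPI they are usually installed with:
#   pip install opencv-python numpy rembg
from pathlib import Path
import tempfile

# Files assumed to live in the workspace root. 'turntable.png' is named in the
# tutorial; 'reference.png' is a hypothetical name for the reference image.
EXPECTED_FILES = ("turntable.png", "reference.png")


def create_workspace(root: Path, name: str, pose_names: list[str]) -> Path:
    """Create workspaces/<name>/poses/<pose>/ folders and return the workspace path."""
    workspace = root / "workspaces" / name
    (workspace / "poses").mkdir(parents=True, exist_ok=True)
    for pose in pose_names:
        (workspace / "poses" / pose).mkdir(exist_ok=True)
    return workspace


def missing_files(workspace: Path) -> list[str]:
    """List the expected root files that have not been added yet."""
    return [f for f in EXPECTED_FILES if not (workspace / f).is_file()]


if __name__ == "__main__":
    # Demonstrate in a throwaway directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        ws = create_workspace(Path(tmp), "orc", ["idle", "attack"])
        print("created:", sorted(p.name for p in (ws / "poses").iterdir()))
        print("still missing:", missing_files(ws))
```

Checking for missing files before running any generation step gives an early, readable error instead of a failure deep inside a script.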


Crafting Character Appearance through Stable Diffusion

In this section, the speaker describes using Stable Diffusion to craft the appearance of the character for the animation. The speaker uses a specific LoRA and provides a prompt, explaining how positive and negative prompts refine the character's look. The speaker also discusses adjusting settings such as CFG scale, batch size, and sampling steps to iterate through different character variants. The process includes fine-tuning the character's features, such as the texture of the pants, using paint software and Stable Diffusion to clean up details and achieve the desired result. The speaker also addresses the challenge of making characters face in different directions and shares a technique for achieving this by masking and adjusting the denoising strength.


Refining Poses and Iterating the Animation

The speaker continues with the process of refining the character's poses and iterating the animation. The speaker recommends using an existing pose as a starting point for the next animation, adjusting the weight and denoising strength, and using paint software to modify the character's features. The speaker emphasizes the importance of maintaining the character's features while transitioning between poses. The process involves running the images through Stable Diffusion, refining the character's shape and features, and saving the satisfactory results in the respective pose folders. The speaker also discusses the use of an '' script to clean up images and increase coherency between poses.


Iterating and Polishing the Animation Sequence

The speaker explains the process of iterating and polishing the animation sequence using the '' script, which allows for control over the number of images to the left and right, as well as the order of the images. The speaker opts for a non-randomized order and a coherency pass to improve the smoothness of the animation. The speaker also discusses the use of stable diffusion to further refine the images, focusing on increasing coherency between frames and polishing up the character's appearance. The process includes multiple iterations, with the speaker adjusting settings such as sampling steps and denoising strength to achieve the desired look.


Extracting, Cleaning, and Interpolating Animation Frames

The speaker outlines the steps for extracting, cleaning, and interpolating the animation frames. The '' script is used to gather all animations into an 'extracted' folder, while the '' script removes the background and places the frames onto a clean background. The speaker then discusses the interpolation process, which involves running the '' script to find middle frames between each pair of frames, creating smoother transitions. The speaker also explains the option for loop interpolation for cyclical animations and shares a custom setup for interpolating only specific walk frames. The process concludes with the creation of a sprite sheet from the cleaned and interpolated frames, ready for implementation in a game engine like Godot.
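The interpolation step can be pictured with a small Python sketch. It assumes frames are NumPy arrays of equal shape and uses a plain 50/50 cross-fade as a stand-in for a real frame interpolator; the tutorial's script may work quite differently, and the names `midpoint` and `interpolate` are hypothetical. The point is how each pass inserts one middle frame per pair, and how the loop option also bridges the last frame back to the first.

```python
import numpy as np


def midpoint(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Blend two frames 50/50 (a naive cross-fade, not a motion-aware interpolator)."""
    mixed = (a.astype(np.float32) + b.astype(np.float32)) / 2.0
    return np.rint(mixed).astype(a.dtype)


def interpolate(frames: list, iterations: int = 1, loop: bool = False) -> list:
    """Insert a middle frame between each pair of frames, repeated `iterations` times."""
    frames = list(frames)
    if len(frames) < 2:
        return frames  # nothing to interpolate between
    for _ in range(iterations):
        out = []
        for a, b in zip(frames, frames[1:]):
            out.extend([a, midpoint(a, b)])
        out.append(frames[-1])
        if loop:
            # For cyclical animations, also bridge the last frame back to the first.
            out.append(midpoint(frames[-1], frames[0]))
        frames = out
    return frames


if __name__ == "__main__":
    # Four tiny grayscale "frames" with increasing brightness.
    key_frames = [np.full((2, 2), v, dtype=np.uint8) for v in (0, 40, 80, 120)]
    print(len(interpolate(key_frames, iterations=2)))             # 13
    print(len(interpolate(key_frames, iterations=2, loop=True)))  # 16
```

Each pass roughly doubles the frame count (n becomes 2n - 1, or 2n when looping), which is why a small number of iterations is usually enough.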


Implementing the Sprite Sheet in Godot and Final Thoughts

The speaker demonstrates the implementation of the created sprite sheet in the Godot game engine, discussing the settings used for resizing and arranging the sprite sheet. The speaker acknowledges some imperfections in the final result but is satisfied with the outcome given the time and effort invested. The speaker encourages viewers to spend more time refining the process in their own projects, suggesting the potential for cleaner animations with better interpolation tools. The tutorial ends with a call to action for viewers to like, subscribe, and comment if they enjoyed the video, and the speaker expresses anticipation for the next video.
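When a grid-shaped sprite sheet is imported into Godot, the sprite node is typically configured with the number of horizontal and vertical frames (`hframes`, `vframes`) and a `frame` index that counts row by row. The Python sketch below only illustrates that arithmetic; the helper names and the 64-pixel frame size are assumptions, not values from the video.

```python
import math


def grid_size(frame_count: int, hframes: int) -> tuple[int, int]:
    """Return (hframes, vframes) needed to hold frame_count frames in rows of hframes."""
    return hframes, math.ceil(frame_count / hframes)


def frame_region(index: int, hframes: int, frame_w: int, frame_h: int) -> tuple[int, int, int, int]:
    """Return (x, y, width, height) of a frame, counting frames row by row."""
    row, col = divmod(index, hframes)
    return col * frame_w, row * frame_h, frame_w, frame_h


if __name__ == "__main__":
    # Assumed example: 13 frames of 64x64 pixels, laid out 4 per row.
    print(grid_size(13, 4))             # (4, 4)
    print(frame_region(5, 4, 64, 64))   # (64, 64, 64, 64)
```

If the sheet is resized on import, the frame width and height scale with it, while the frame counts stay the same.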



Stable Diffusion

Stable Diffusion is an AI model that generates images from text prompts and, optionally, input images. In the video, it is the primary tool for creating the individual frames that make up the character animations and sprites. The technique involves using specific settings and prompts to guide the model toward the desired visual output, such as character poses and details.

Character Animation

Character Animation refers to the process of creating movement and lifelike motion for characters in digital media. In the context of the video, the creator develops a technique for animating characters with Stable Diffusion, starting from an exploratory first attempt and refining the animation through several steps.

Turntable LoRA

A LoRA (Low-Rank Adaptation) is a small add-on model that steers Stable Diffusion toward a particular style or subject without retraining the full model. The turntable LoRA mentioned in the video is used to depict the same character from multiple angles, as if on a turntable, which helps achieve consistent characters without a character-specific embedding.

Helper Scripts

Helper Scripts are custom scripts created to assist in simplifying and automating certain tasks within a larger project. In the video, these scripts are designed to aid in the creation of animations using Stable Diffusion, making the process more accessible to individuals without programming experience.

Workspace Initialization

Workspace Initialization is the process of setting up the necessary environment and files for a project. In the context of the video, it involves creating specific folders and files within the workspace to organize and guide the animation creation process using Stable Diffusion and helper scripts.


Interpolation

Interpolation is a mathematical method used to estimate values between known values. In the video, it refers to the process of creating intermediate frames between existing animation frames to create smoother and more continuous animations.

Sprite Sheets

Sprite Sheets are collections of images or 'sprites' that are used in video games and animations. They typically contain all the frames of an animation in a grid-like format. In the video, the creator generates a sprite sheet from the images produced by Stable Diffusion and helper scripts.
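To make the grid layout concrete, here is a minimal Python sketch that packs equally sized frames into one sheet with NumPy. It is not the tutorial's script: the function name `pack_sprite_sheet`, the row-by-row order, and the zero-filled (transparent for RGBA) padding cells are assumptions made for illustration.

```python
import math
import numpy as np


def pack_sprite_sheet(frames: list, columns: int) -> np.ndarray:
    """Pack equally sized H x W x C frames row by row into a single sheet array."""
    if not frames:
        raise ValueError("at least one frame is required")
    h, w, c = frames[0].shape
    rows = math.ceil(len(frames) / columns)
    # Unused cells in the last row stay zero (fully transparent for RGBA data).
    sheet = np.zeros((rows * h, columns * w, c), dtype=frames[0].dtype)
    for i, frame in enumerate(frames):
        if frame.shape != (h, w, c):
            raise ValueError(f"frame {i} has shape {frame.shape}, expected {(h, w, c)}")
        row, col = divmod(i, columns)
        sheet[row * h:(row + 1) * h, col * w:(col + 1) * w] = frame
    return sheet


if __name__ == "__main__":
    # Five tiny 2x3 RGBA frames, three per row -> a 4x9 sheet with one empty cell.
    demo = [np.full((2, 3, 4), i + 1, dtype=np.uint8) for i in range(5)]
    print(pack_sprite_sheet(demo, columns=3).shape)  # (4, 9, 4)
```

The resulting array can be saved as a PNG with an image library such as OpenCV (`cv2.imwrite`), keeping in mind that OpenCV expects channels in BGRA order.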

ControlNet

ControlNet is an extension for Stable Diffusion that conditions image generation on an additional input image, such as a pose skeleton, a depth map, or an edge map. In the video, it is used together with pose images so that the generated character follows the intended pose in each frame, which helps keep the animation consistent in style and appearance.

Denoising Strength (Denoising Threshold)

Denoising strength is an image-to-image parameter in tools like Stable Diffusion that controls how far the output may depart from the input image. A lower value keeps the result close to the input and preserves consistency, while a higher value gives the model more freedom to change the image.

OpenPose

OpenPose is a pose estimation technique that detects human body keypoints in images and video. In the video, OpenPose-style pose images are used to guide the character's pose, which helps generate consistent characters without a character-specific embedding.


Godot

Godot is an open-source game engine used for creating 2D and 3D games and interactive applications. In the video, the generated sprite sheet is implemented in Godot to see the final result of the character animation in a game-like environment.


The speaker has been working on a technique for creating animations using stable diffusion.

Two different results are showcased: a character animation and an orc attacking animation.

The character animation was an exploratory first attempt at working out the process, while the orc animation was created for the tutorial.

The process involved developing various techniques and using scripts to aid in animation creation.

The original idea came from a technique demonstrated on the stable diffusion subreddit by XYZ disk.

The technique involves using a turntable LoRA and OpenPose for consistent character generation.

The speaker modified and extended the technique, creating helper scripts for larger scale animation sprite sheets.

The GitHub repository mentioned contains all the scripts and a readme on how to run them.

The workspace initialization involves creating folders and setting up the environment with necessary packages.

The animation process includes creating a reference image and using it to guide the stable diffusion generation.

The speaker emphasizes the importance of taking time to refine the initial character image, as it informs the rest of the process.

The speaker uses a combination of stable diffusion and manual editing in paint to achieve desired results.

The process of generating the animations involves multiple iterations and adjustments for coherency and detail.

Interpolation is an optional step that can be used to smooth out animations and create more frames.

The speaker provides a detailed walk-through of the entire animation creation process, from concept to implementation in Godot.

The final result is a sprite sheet that can be implemented in a game, showcasing the potential practical application of the technique.

The speaker encourages viewers to spend more time refining the process for even better results in their own projects.