MORE Consistent Characters & Emotions In Fooocus (Stable Diffusion)

Jump Into AI
13 Mar 2024 · 17:05

TLDR: The video tutorial delves into character consistency, demonstrating how to create a character from start to finish using a face grid. It emphasizes using a grid of four larger, more detailed faces for realism over the nine smaller faces grid. The process involves image editing, using various AI models for detail and emotion generation, and inpainting for corrections. The tutorial also explores techniques for achieving consistency in face swaps, incorporating different emotions and lighting, and suggests ways to blend a specific face into existing images or create scenes around a face on a blank canvas.


  • 🎨 Use a grid of faces to maintain character consistency in image generation.
  • 🔍 For more detailed and realistic images, opt for a 4-face grid instead of a 9-face grid.
  • 🔗 Find reference images for different angles by searching online with specific phrases like 'different angle face reference sheet'.
  • 🖼️ Edit and create grids using simple software like Microsoft Paint.
  • 🖌️ When using Fooocus, remove Styles for more control over the image generation process.
  • 📈 Adjust stop and weight numbers to fine-tune the generated images to achieve the desired look.
  • 😄 Experiment with prompts that include emotions and detailed descriptions for more expressive character images.
  • 🌟 Use the array support function to generate multiple emotions in a single prompt.
  • 🎭 Inpaint and adjust individual facial features using the same model that generated the images.
  • 🖼️ Split the grid into individual images using basic image editing software.
  • 🔄 Use different images at various angles and expressions for a versatile character design.

Q & A

  • What is the main focus of the video?

    -The main focus of the video is to discuss character consistency in image generation, specifically using a face grid to create a character from start to finish.

  • Why is using a grid of four faces recommended over a grid of nine faces for realistic images?

    -A grid of four faces is recommended for realistic images because each image will be larger and more detailed, providing more to work with compared to the smaller faces in a grid of nine.

  • How does one find a suitable reference grid for different angle faces?

    -One can find a suitable reference grid by doing an image search on Google with phrases like 'different angle face reference sheet'. The found images can be edited in software like Microsoft Paint to create a grid.

  • What is the purpose of using the 'disable seed increment' option in the array support function?

    -The 'disable seed increment' option allows for creating multiple images with the exact same seed and prompt, with the only difference being a single word or phrase changed between them, which helps in generating images with different emotions.

  • Why is it important to maintain a high weight setting when trying to get an emotion to show in the generated images?

    -Maintaining a high weight setting keeps the generated images consistent with the original character's features, so that the face's characteristics are not lost while the prompt pushes the image toward a particular emotion.

  • How can one improve the detail of a face in an image?

    -One can improve the detail of a face by using the inpaint function and selecting the 'improve detail' setting in the image generation software. This adds detail and resolution to the face area.

  • What is the recommended approach for trying to get the exact face in a photo into an existing image?

    -The recommended approach is to create a transparent image of the face and use the image generation software to blend it into the existing photo, rather than trying to manipulate and rotate the image extensively in a program like Photoshop.

  • How can consistency be ensured across different subjects and emotions?

    -Consistency can be ensured by using the same model and weight settings across different subjects, and by using the 'disable seed increment' feature to generate multiple images with different emotions from the same seed and prompt.


  • What is the significance of using different emotions in the array support function?

    -Using different emotions in the array support function allows for the generation of a variety of facial expressions from a single prompt, which can be useful for creating a more dynamic and versatile character.

  • How does the video suggest dealing with images where the face is far away or the detail is poor?

    -The video suggests choosing an image where the face is closer and then using 'outpaint' to add detail and resolution, ensuring that the overall image has more detail due to the increased resolution in each section.



🎨 Character Consistency with Face Grids

The paragraph discusses the importance of character consistency in character design and expands on the concept introduced in the first video. It explains the process of using a face grid to create a character from start to finish, focusing on the transition from a basic grid to more detailed images for realistic portrayals. The speaker shares their experience with different grid sizes and suggests using larger, more detailed images for better results. They also provide a practical guide on finding reference images, using image editing software like Microsoft Paint to create and adjust grids, and the importance of maintaining aspect ratio and image size for consistency.
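The grid-size trade-off described above can be sketched with a little arithmetic. This is a minimal illustration, not something taken from the video: the 1024×1024 canvas size and the helper name `grid_boxes` are assumptions. It computes the crop box of each cell, which also gives the coordinates needed to split a finished grid into individual face images.

```python
def grid_boxes(width, height, rows, cols):
    """Return (left, top, right, bottom) crop boxes for each grid cell, row by row."""
    cell_w = width // cols
    cell_h = height // rows
    boxes = []
    for r in range(rows):
        for c in range(cols):
            left, top = c * cell_w, r * cell_h
            boxes.append((left, top, left + cell_w, top + cell_h))
    return boxes


# Hypothetical 1024x1024 canvas: compare a 4-face grid with a 9-face grid.
four = grid_boxes(1024, 1024, rows=2, cols=2)
nine = grid_boxes(1024, 1024, rows=3, cols=3)

print(four[0])  # (0, 0, 512, 512)
print(nine[0])  # (0, 0, 341, 341)
```

On the same canvas each face in the four-face grid gets roughly 512 pixels per side instead of about 341, which is why the larger cells leave more detail to work with. The boxes use the left, top, right, bottom order that common image libraries accept for cropping.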


🌟 Emotion and Expression Variations

This section delves into the nuances of creating characters with varied emotions and expressions. The speaker describes how to use Fooocus with the face grid to generate images with different emotions such as happiness, laughter, anger, and sadness. They explain the use of arrays to incorporate these emotions into the image generation process and suggest enhancing the text prompts for more pronounced emotional expressions. The paragraph also touches on the challenges of achieving consistency in facial features and suggests solutions like adjusting stop and weight numbers for a more realistic and less 'fake' appearance. The speaker emphasizes the importance of imperfections for realism and shares their approach to refining the character's look through iterative generation and inpainting.
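The effect of an emotion array combined with a fixed seed can be sketched as follows. This is only an illustration of the idea, not Fooocus's own code: it assumes the array is written in double square brackets inside the prompt, and the function name `expand_array_prompt`, the example prompt, and the seed value are all hypothetical.

```python
import re

# Matches one array of the assumed form [[a, b, c]] inside a prompt.
ARRAY = re.compile(r"\[\[([^\[\]]+)\]\]")


def expand_array_prompt(prompt):
    """Return one prompt per array entry; prompts without an array are returned as-is."""
    match = ARRAY.search(prompt)
    if match is None:
        return [prompt]
    options = [opt.strip() for opt in match.group(1).split(",")]
    return [prompt[:match.start()] + opt + prompt[match.end():] for opt in options]


seed = 12345  # hypothetical seed, kept fixed to mimic 'disable seed increment'
prompts = expand_array_prompt(
    "photo of a woman, [[happy, laughing, angry, sad]] expression"
)
for p in prompts:
    print(seed, p)
```

Each resulting prompt differs only in the emotion word while the seed stays the same, so differences between the generated images come from that one word rather than from new random noise.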


🖼️ Enhancing and Expanding Image Details

The paragraph focuses on techniques to enhance and expand the details of generated images. It introduces the concept of 'outpainting' to add resolution and detail to close-up images, explaining the process of selecting directions for expansion and generating new images. The speaker also addresses the challenges of creating images in different lighting conditions, such as nighttime shots, and suggests workarounds like using low light images or editing skills to achieve the desired effect. They provide guidance on using Google Images to find poses and the use of inpainting and face swap functions to refine and improve the images, emphasizing the importance of using the same model for consistency.


🔄 Integrating Specific Faces into Existing Images

The final paragraph discusses methods for integrating specific faces into existing images, whether for creating a consistent character across different scenes or for blending a particular face into a new context. The speaker outlines a process for creating a transparent image of the face, resizing and rotating it to fit over a base image, and using Fooocus to blend it in. They acknowledge the limitations of their own photoshopping skills and suggest alternative approaches, such as using other software for more precise image manipulation. The paragraph also explores the reverse process of creating a blank canvas with just the face and then outpainting the surroundings, offering a comprehensive guide on using masks, invert mask settings, and generating detailed scenes around the character. The speaker concludes by reminding viewers of the flexibility of these techniques across various image types and styles, encouraging experimentation and refinement to achieve the desired character look.
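The idea behind masking the face and then inverting the mask can be sketched on a tiny grid. This is a toy illustration under stated assumptions, not the software's implementation: the helper names `face_mask` and `invert`, the 6×4 canvas, and the convention that 1 marks pixels to regenerate are all hypothetical.

```python
def face_mask(width, height, box):
    """Build a binary mask with 1 inside the face box and 0 elsewhere."""
    left, top, right, bottom = box
    return [
        [1 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]


def invert(mask):
    """Flip the mask so everything except the face is selected."""
    return [[1 - v for v in row] for row in mask]


# Hypothetical 6x4 canvas with the face occupying a 2x2 block in the middle.
mask = face_mask(6, 4, (2, 1, 4, 3))
for row in invert(mask):
    print("".join("#" if v else "." for v in row))
```

After inversion the face pixels are left untouched (shown as dots) and the rest of the canvas is selected for generation, which matches the blank-canvas approach of keeping the face and building the scene around it.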



💡Character Consistency

Character consistency refers to maintaining a uniform appearance and personality traits of a character throughout different mediums or scenarios. In the context of the video, it involves creating a character from start to finish using a face grid, ensuring that the character's facial features and expressions remain recognizable and coherent across various angles and emotions.

💡Face Grid

A face grid is a tool used in character design and digital art to create multiple faces of a character at various angles. It helps artists maintain consistency in facial features and proportions. In the video, the creator uses a grid of four faces to achieve more detailed and realistic character images compared to a grid of nine faces.

💡Image Editing

Image editing involves altering or enhancing digital images using software tools. In the video, image editing is crucial for refining the character's face by resizing, adjusting details, and inpainting to achieve the desired look and consistency across different images.


💡Emotions

In character design, capturing emotions is essential for bringing the character to life and making them relatable. The video discusses using different emotions in the image prompt to generate a variety of facial expressions for the character, enhancing the character's depth and range of expression.


💡Inpainting

Inpainting is a digital image editing process that fills in missing or selected parts of an image with content that matches the surrounding area. In the video, inpainting is used to correct and enhance specific areas of the generated images, such as the eyes, to achieve a more realistic and detailed appearance.

💡Face Swap

Face swap is a technique in digital art and photography where the face of one image is replaced with another, often used for creating composite images or humorous effects. In the video, face swap is discussed as a method to integrate the generated character faces into different scenes or backgrounds, maintaining consistency in the character's appearance.


💡Outpainting

Outpainting is a feature in some image editing software that allows users to expand the borders of an image by generating new content that matches the original. It is used in the video to add more detail and resolution to the character's face by expanding the image beyond the original canvas, enhancing the overall quality and detail.

💡Random Seed

A random seed is a value used in generative algorithms to produce a reproducible sequence of random numbers. In the context of the video, controlling the random seed allows the creator to generate multiple images with the same starting conditions, ensuring consistency in the character's appearance across different images.
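The reproducibility that a seed provides can be shown with a small sketch. It uses Python's standard `random` module purely as a stand-in for an image generator's noise source; the function name `fake_latent_noise` and the seed values are hypothetical.

```python
import random


def fake_latent_noise(seed, n=4):
    """Return n pseudo-random values drawn from a generator with the given seed."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


same_a = fake_latent_noise(42)
same_b = fake_latent_noise(42)
other = fake_latent_noise(43)

print(same_a == same_b)  # True: the same seed reproduces the same sequence
print(same_a == other)   # a different seed gives a different sequence
```

With the starting noise held constant, any change in the output can be attributed to the prompt or settings that were altered.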

💡Weight Settings

Weight settings in AI image generation tools control how strongly a given input, such as a reference image used as an image prompt, influences the result. In the video, weight settings (together with the stop value) are adjusted to control how prominently each reference face shows up in the final output, so that the character's desired features stay dominant.

💡Detail Enhancement

Detail enhancement refers to the process of improving the clarity and sharpness of details in an image. In the video, detail enhancement is used during inpainting to add more definition to specific parts of the image, such as the eyes, to make the character's features more realistic and visually appealing.

💡Aspect Ratio

Aspect ratio refers to the proportional relationship between the width and height of an image or screen. Maintaining a consistent aspect ratio is important in image editing and design to ensure that the image's composition is preserved when resizing or cropping.
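As a worked example of preserving aspect ratio when resizing a reference face to fit a grid cell, here is a minimal sketch. The helper name `fit_within` and the pixel sizes are illustrative assumptions, not values from the video.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) to fit inside the given bounds, keeping the aspect ratio."""
    scale = min(max_width / width, max_height / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


# Hypothetical 1200x1600 portrait reference fitted into a 512x512 grid cell.
print(fit_within(1200, 1600, 512, 512))  # (384, 512)
```

Using the smaller of the two scale factors guarantees that neither side exceeds the cell, and the 3:4 proportion of the original is kept, so the face is not stretched.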


Exploring character consistency through the use of a face grid.

Using a grid of four faces for more detailed and realistic character creation.

Finding reference images through Google search using phrases like 'different angle face reference sheet'.

Editing reference images in simple software like Microsoft Paint to create a grid.

Adjusting image size and aspect ratio to fit the grid and maintain detail.

Using Fooocus and image prompts to generate character images with specific traits like 'Italian female'.

Refining the image generation process by adding details like age, hair description, and facial features.

Incorporating realism by adding skin imperfections and avoiding makeup for a more authentic look.

Utilizing the array support function to generate multiple emotions in a single prompt.

Fixing minor issues with the generated images using inpaint tools.

Creating a collection of faces at different angles and expressions for versatile character representation.

Applying inpainting to add or improve details like eye color and facial expressions.

Using different images as a base for face swap, maintaining consistency across generated images.

Adjusting weight settings and using multiple angles to achieve a desired balance in character appearance.

Outpainting close-up images to increase detail and resolution, enhancing the overall image quality.

Integrating images into existing scenes using inpaint and outpaint functions for seamless blending.

Creating transparent images of faces for blending onto other photos, leveraging Fooocus to achieve a natural merge.

Experimenting with various styles and expressions to build a diverse and dynamic character library.