the best REALISTIC models for Stable Diffusion

James Beltman
26 Jul 2023 08:44

TLDR: The video discusses the best models for creating highly realistic images with Stable Diffusion. The presenter's favorite model, Epic Realism, is praised for turning simple prompts into lifelike images, particularly in capturing facial details. The video offers tips for using the model effectively, such as avoiding certain keywords and fine-tuning parameters like steps and CFG scale. It also recommends the DPM++ SDE Karras or DPM++ 2M Karras samplers and hires upscalers such as NMKD Superscale or NMKD Faces for better resolution and detail. The Magic Mix model is highlighted for its strengths in dramatic and dark scenes, but with limitations in facial generation. Analog Madness is noted for its versatility in generating images of ordinary individuals, with vivid prompts being key to captivating results. Together, these tips form a practical guide to getting lifelike images out of all three models.

Takeaways

  • 🎨 **Epic Realism**: A favored model for generating lifelike images, especially good at capturing facial details.
  • 📈 **Parameter Tuning**: Keeping steps above 20 and setting the CFG scale to 5 helps maintain quality and realism.
  • 🖼️ **Image Enhancement**: Using a hires upscaler such as NMKD Superscale or NMKD Faces with a denoising strength of 0.35 can improve image detail.
  • 🚫 **Negative Keywords**: Adding words like 'cartoon' and 'painting' to the negative prompt helps maintain realism, and 'Asian, Chinese' can be added there to counter model biases.
  • 🌓 **Lighting Details**: The model captures light and shadow well, so extra lighting keywords are unnecessary.
  • 📉 **Over-Description**: Avoid overly describing the face to prevent less desirable results.
  • 🧙‍♂️ **Magic Mix Model**: Ideal for dramatic and dark scenes, but has limitations in facial generation, often generating East Asian women.
  • 🔍 **Sampler Selection**: For Magic Mix, the recommended samplers are Euler a, Euler, DPM++ 2M Karras, or DPM++ SDE Karras.
  • 🔧 **Upscaling Tips**: Using NMKD Faces or NMKD Superscale with hires steps set to 15 and a denoising strength between 0.1 and 0.5 enhances image quality.
  • 🌐 **Analog Madness**: Versatile model that can generate images of ordinary individuals, with output quality highly dependent on prompt vividness.
  • 📝 **Prompt Crafting**: Specific and pointed prompts work well with Analog Madness, but may not be as effective with other models.
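
As a rough illustration, the Epic Realism settings from the takeaways could be written as a request body for the AUTOMATIC1111 web UI API (the `/sdapi/v1/txt2img` endpoint, available when the UI is started with `--api`). This is a minimal sketch, not the presenter's workflow: the base resolution, the step count of 25, the exact sampler string, and the upscaler name are assumptions, since the upscaler entry depends on the file installed locally. No request is sent; the code only builds and prints the payload.

```python
import json


def epic_realism_payload(prompt: str) -> dict:
    """Build a txt2img request body from the Epic Realism takeaways."""
    return {
        "prompt": prompt,
        # Negatives from the takeaways: block stylised output, counter model bias.
        "negative_prompt": "cartoon, painting, Asian, Chinese",
        "steps": 25,  # any value above 20; 25 is an arbitrary pick
        "cfg_scale": 5,
        "sampler_name": "DPM++ SDE Karras",  # assumed UI name of the SDE sampler
        "width": 512,   # assumed base resolution, not stated in the summary
        "height": 768,  # assumed base resolution, not stated in the summary
        "enable_hr": True,  # turn on the hires pass
        "hr_scale": 2,
        "hr_upscaler": "4x_NMKD-Superscale-SP_178000_G",  # assumed local name
        "denoising_strength": 0.35,
    }


payload = epic_realism_payload("photo of a woman in a cafe")
print(json.dumps(payload, indent=2))
```

The prompt is kept deliberately short, in line with the advice not to over-describe the face or add lighting keywords.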

Q & A

  • What is the name of the model that excels in capturing facial detail?

    -Epic Realism is the model that excels in capturing facial detail, transforming simple prompts into stunningly lifelike results.

  • What are some of the keywords that should be avoided when using the Epic Realism model?

    -Keywords such as 'Masterpiece', 'best quality', and '8K' should be avoided as they do not add a noticeable difference to the outcome.

  • What is the recommended setting for the CFG scale when using the Epic Realism model?

    -The author recommends setting the CFG scale to five, as increasing it might compromise the realistic feel.

  • Which samplers are suggested for achieving an extra dose of realism with the Epic Realism model?

    -For an extra dose of realism, the DPM++ SDE Karras or DPM++ 2M Karras samplers are suggested. Other samplers such as DPM fast also work well.

  • What is the recommended denoising strength and upscale factor for using an upscaler with the Epic Realism model?

    -The recommended denoising strength is 0.35 and the recommended upscale factor is 2.

  • What is the limitation of the Magic Mix model when it comes to facial generation?

    -The Magic Mix model almost exclusively generates East Asian women and tends towards a uniform, unrealistic 'TikTok slim face filter' look.

  • What is the recommended range for the number of steps when using the Magic Mix model?

    -The sweet spot for the number of steps when using the Magic Mix model is between 20 and 40.

  • How does the Analog Madness model differ from other models in terms of the subjects it can generate?

    -Analog Madness differs by being able to generate images of ordinary individuals, offering a refreshing alternative to the supermodel renditions often produced by other popular models.

  • What is the recommended sampler for the Analog Madness model?

    -The DPM++ SDE Karras sampler is the ideal choice when working with the Analog Madness model.

  • What is the default setting for the CFG scale that offers the best results with the Analog Madness model?

    -The default CFG scale of 7 offers the best results with the Analog Madness model.

  • What are some keywords that can make an image more realistic in terms of color and composition with the Analog Madness model?

    -Keywords such as '3D Max', 'grotesque', and 'desaturated' work well to make the image more realistic in terms of color and general composition with the Analog Madness model.

  • What is the importance of effective use of negatives when using the Epic Realism model?

    -Effective use of negatives helps to add realism to the image and also helps to define what you don't want in your image, especially since many realistic models tend to be biased towards creating East Asian women.
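
The per-model recommendations in the answers above can be collected into a small lookup table with a sanity check. This is a sketch under stated assumptions: the `Preset` structure and the `check_settings` helper are hypothetical, the sampler strings are assumed to match the web UI's names, "above 20 steps" is treated as an inclusive lower bound of 20, and values the summary does not give (for example a CFG scale for Magic Mix) are left as `None` rather than guessed.

```python
from typing import NamedTuple, Optional


class Preset(NamedTuple):
    samplers: tuple
    min_steps: Optional[int]
    max_steps: Optional[int]
    cfg_scale: Optional[float]


# Values taken from the summary; None means the summary gives no value.
PRESETS = {
    "Epic Realism": Preset(
        ("DPM++ SDE Karras", "DPM++ 2M Karras", "DPM fast"), 20, None, 5
    ),
    "Magic Mix": Preset(
        ("Euler a", "Euler", "DPM++ 2M Karras", "DPM++ SDE Karras"), 20, 40, None
    ),
    "Analog Madness": Preset(("DPM++ SDE Karras",), None, None, 7),
}


def check_settings(model: str, sampler: str, steps: int, cfg_scale: float) -> list:
    """Return a list of warnings for settings that differ from the summary."""
    preset = PRESETS[model]
    warnings = []
    if sampler not in preset.samplers:
        warnings.append(f"sampler {sampler!r} is not among {preset.samplers}")
    if preset.min_steps is not None and steps < preset.min_steps:
        warnings.append(f"steps {steps} is below {preset.min_steps}")
    if preset.max_steps is not None and steps > preset.max_steps:
        warnings.append(f"steps {steps} is above {preset.max_steps}")
    if preset.cfg_scale is not None and cfg_scale != preset.cfg_scale:
        warnings.append(f"CFG scale {cfg_scale} differs from {preset.cfg_scale}")
    return warnings


for warning in check_settings("Epic Realism", "Euler a", 10, 7):
    print(warning)
```

A table like this makes it harder to carry one model's settings over to another by accident, for example using CFG 5 with Analog Madness.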

Outlines

00:00

🖼️ Epic Realism for Lifelike Image Generation

The first paragraph introduces the Epic Realism model, which is favored for its ability to transform simple prompts into highly realistic images, particularly in capturing facial details. The speaker advises on prompt construction, emphasizing simplicity and the inclusion of negative keywords to refine the output. Parameters such as steps, CFG scale, and sampler choice are discussed, with recommendations for settings to avoid image errors and artifacts. High-resolution upscaling techniques are also covered, with specific tools and settings suggested for enhancing image quality. The paragraph highlights the importance of using negatives to counteract model biases, and notes that the model captures intricate details such as light and shadow without extra effort. It concludes with instructions on how to download and use the Epic Realism model in Stable Diffusion.

05:00

🎭 Magic Mix for Dramatic and Dark Imagery

The second paragraph discusses the Magic Mix model, which excels in creating dramatic and dark scenes with a moody and mysterious atmosphere. However, the model has limitations, particularly in generating facial features, often defaulting to East Asian women with a slim face filter look. Optimal sampler options, step ranges, and upscaling recommendations are provided to enhance image quality. The CFG scale parameter is identified as crucial, with a recommended range for achieving the best results. The paragraph covers the positive prompts and terms that make a difference with Magic Mix, as well as the use of textual inversions to improve image quality and avoid common pitfalls like malformed anatomy or cartoony appearances. It concludes with a reminder of the model's strengths in creating images with striking lighting effects and atmospheric settings, while cautioning about its tendency towards a specific facial style.

Keywords

💡Epic Realism

Epic Realism is a model for stable diffusion that excels in creating highly realistic images, particularly in capturing facial details. It is favored for its ability to transform simple prompts into stunningly lifelike results. In the video, it is mentioned as the author's current favorite model, showcasing its effectiveness in generating images with fine details without the need for complex prompts.

💡Automatic 1111

Automatic 1111 (AUTOMATIC1111) is the widely used Stable Diffusion web UI in which the author demonstrates the image creation process. It is the interface where the author inputs prompts and adjusts parameters to generate the desired images, as shown in the video.

💡Prompts

Prompts are the descriptive inputs provided to the stable diffusion model to guide the generation of images. They are crucial for steering the output towards the desired outcome. The script emphasizes the importance of simplicity in prompts and avoiding certain keywords that do not affect the outcome, while including others that enhance realism.
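
To make the "keep it simple" advice concrete, here is a small sketch that strips the filler keywords the summary names as having no noticeable effect ('masterpiece', 'best quality', '8K') from a comma-separated prompt. The function name and the filler set are illustrative assumptions, not part of any Stable Diffusion tool.

```python
# Keywords the summary says make no noticeable difference with Epic Realism.
FILLER_KEYWORDS = {"masterpiece", "best quality", "8k"}


def simplify_prompt(prompt: str) -> str:
    """Drop filler keywords from a comma-separated prompt, keeping the order."""
    parts = [part.strip() for part in prompt.split(",")]
    kept = [part for part in parts if part and part.lower() not in FILLER_KEYWORDS]
    return ", ".join(kept)


print(simplify_prompt("Masterpiece, best quality, 8K, photo of an old fisherman"))
# prints: photo of an old fisherman
```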

💡Parameters

Parameters are the adjustable settings within the stable diffusion model that influence the image generation process. Fine-tuning parameters such as steps, CFG scale, and sampler is key to achieving a balance between quality and realism. The video provides specific recommendations for these parameters to optimize the use of the models discussed.

💡High Res Upscaler

High Res Upscaler is a tool used to enhance the resolution of generated images, improving their level of detail. The author recommends using either NMKD Superscale or NMKD Faces with a denoising strength of 0.35 and an upscale factor of 2 to achieve better image quality, as demonstrated in the comparison of upscaled and non-upscaled images.
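
The effect of the upscale factor on output size is simple arithmetic: each side of the base image is multiplied by the factor. A short sketch, assuming a 512×768 base image (the summary does not state the base resolution used in the video):

```python
def hires_size(width: int, height: int, upscale_factor: float) -> tuple:
    """Return the output size after upscaling both sides by the same factor."""
    return int(width * upscale_factor), int(height * upscale_factor)


print(hires_size(512, 768, 2))  # (1024, 1536)
```

With a factor of 2 the pixel count quadruples, which is why the second pass costs noticeably more time and memory than the first.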

💡Magic Mix

Magic Mix is another model for stable diffusion that has its unique strengths, particularly in creating dramatic and dark-lit scenes. It is noted for its tendency to generate images with a specific facial style, which may be preferred by some users. The video discusses how to optimize the use of Magic Mix with various samplers and steps to achieve the best results.

💡Analog Madness

Analog Madness is a versatile and dynamic model that stands out for its ability to generate images of ordinary individuals, offering a refreshing alternative to the typical supermodel renditions. The power of Analog Madness lies in the potency of the prompts provided, with more vivid and robust prompts leading to more captivating outputs. The video outlines the author's workflow with this model, emphasizing the importance of specific and pointed prompts.

💡Resolution

Resolution in the context of the video refers to the clarity and detail of the generated images. The author discusses the importance of achieving high resolution to ensure fine details, especially in facial features. The use of an upscaler is recommended to improve the resolution and detail of the images.

💡Denoising Strength

Denoising Strength is a parameter of the hires upscaling pass that controls how far the second pass may depart from the first-pass image. Lower values keep the upscaled result close to the original; higher values let the model redraw more, which can add detail but can also change the composition or introduce artifacts. The author suggests a denoising strength of 0.35 as a balance between preserving the image and adding detail.

💡Stable Diffusion Web UI

Stable Diffusion Web UI is the browser-based interface to Stable Diffusion. It is where users select the models they have downloaded, such as Epic Realism, and generate images. The video provides instructions on how to navigate and use the Stable Diffusion Web UI for model selection and image generation.

💡Bias

Bias in the context of the video refers to the tendency of the realistic models in stable diffusion to favor certain outcomes, such as generating East Asian women. The author discusses the need to use effective negatives in prompts to counteract this bias and achieve a desired image that aligns with the user's intent.

Highlights

Epic Realism is a favored model for creating highly realistic images, particularly excelling in capturing facial details.

For achieving lifelike results, it's advised to keep prompts simple and avoid adding extra keywords like 'masterpiece', 'best quality', or '8K'.

Including negative keywords such as 'cartoon', 'painting', and 'illustration' helps to maintain the realistic qualities of the generated images.

Fine-tuning parameters like steps, CFG scale, and sampler choice is crucial for balancing quality and realism in images.

The DPM++ SDE Karras or DPM++ 2M Karras samplers are recommended for an extra dose of realism in image generation.

Hires upscalers like NMKD Superscale or NMKD Faces with a denoising strength of 0.35 and an upscale factor of 2 can significantly improve image detail.

The model tends to generate images biased towards East Asian women; adding 'Asian, Chinese' to the negative prompts can help diversify the ethnicity.

Avoiding terms like 'hard light' or 'cinematic lighting' helps to achieve a more natural effect in the generated images.

Over-describing the face can lead to less desirable results, so it's better to keep facial descriptions minimal.

The 'Magic Mix' model excels in creating dramatic and dark lit scenes, bringing out moodiness and mystery.

Magic Mix has limitations, especially with facial generation, often generating East Asian women with a slim face filter look.

Samplers like Euler a, Euler, DPM++ 2M Karras, or DPM++ SDE Karras work well with Magic Mix.

For Magic Mix, a step range of 20 to 40 and a denoising strength between 0.1 and 0.5 are recommended for optimal image quality.

The 'Analog Madness' model is versatile and dynamic, capable of generating images of ordinary individuals with a high level of detail and complexity.

The effectiveness of Analog Madness lies in the potency of the prompts provided; vivid and robust prompts lead to more captivating outputs.

The DPM++ SDE Karras sampler is the ideal choice when working with Analog Madness for an optimal balance between detail and computational load.

The default CFG scale of 7 offers the best results for realism when working with Analog Madness.

Keywords like '3D Max', 'grotesque', and 'desaturated' work well to enhance the realism of color and composition in Analog Madness outputs.

Analog Madness's strength lies in generating realistic, non-modelesque figures, providing a refreshing take on AI image generation.