NEW Photorealism Model

Sebastian Kamph
25 Aug 2023 08:08

TLDR: The video discusses a Stable Diffusion model that improves photorealism, particularly in images of humans, where the base SDXL model lags behind. The speaker shares their positive experience with the new model, highlighting how well it creates realistic images, from a Viking to a post-apocalyptic setting and a cyberpunk scene. They recommend Juggernaut XL, a custom SDXL model that has been well received, and suggest that custom models are surpassing base models in quality. The video also explains how to download and use the new model, emphasizing ease of use for beginners and the potential for 'happy accidents' in generative AI.


Takeaways

  • 🌟 Introduction of a new Stable Diffusion model that improves photorealism, especially for human images.
  • 🚀 Comparison of the new model with the earlier Stable Diffusion 1.5 and 1.4 versions, highlighting advancements.
  • 🎨 Discussion of the limitations of the previous model, particularly in rendering human skin textures and details.
  • 📸 Presentation of various images generated by the new model, including a Viking, a post-apocalyptic man, a woman in the jungle, and a cyberpunk scene.
  • 🌍 Mention of the uploader's dissatisfaction with someone cutting in line and a playful remark about keeping an eye out.
  • 💡 Emphasis on the uploader's role in doing the research so viewers don't have to, encouraging subscriptions and engagement.
  • 🎥 Commentary on the photorealistic quality of the generated images, some of which could be mistaken for real photographs.
  • 🚀 Excitement about the potential of custom models like Juggernaut XL to outperform base models in future releases.
  • 📋 Instructions on how to download and install the new model, including the placement of files in specific folders for different UIs.
  • 🎉 Demonstration of the ease of use of the Fooocus UI, recommended for beginners interested in Stable Diffusion.
  • 🌌 Generation of diverse scenes showcasing the versatility of the model, from a Viking warrior to a sci-fi spaceship and a neon-lit street cat.

Q & A

  • What is the main topic of the video?

    -The main topic of the video is the exploration of a Stable Diffusion model that improves photorealism, particularly in images of humans.

  • What issue is mentioned with the base SDXL model?

    -The SDXL model lags in photorealism, especially when depicting people.

  • How does the speaker feel about the new model they are discussing?

    -The speaker feels that the new model is pretty good and has shown promising results in enhancing photorealism in generated images.

  • What is the speaker's opinion on the depiction of animals in AI-generated images?

    -The speaker believes that animals, especially the fur and details, look very realistic and fantastic in AI-generated images.

  • What is the speaker's view on the skin texture in the generated images?

    -The speaker thinks that the skin texture, particularly on women, could be improved as it often appears too smooth or has a shiny makeup effect.

  • What is Juggernaut XL and how is it related to the video's topic?

    -Juggernaut XL is a custom Stable Diffusion SDXL model that the speaker downloaded and found to be quite good at producing photorealistic images.

  • How does the speaker describe the process of using the new model?

    -The speaker explains that users need to download the model, place it in the correct folder for the UI they are using, and restart their Stable Diffusion application.

  • What is the significance of the Viking Warrior example in the video?

    -The Viking Warrior example is used to demonstrate the capabilities of the new model in generating high-quality, photorealistic images with simple prompts.

  • What advice does the speaker give to beginners using Stable Diffusion?

    -The speaker recommends the Fooocus interface for beginners due to its ease of use and suggests watching previous videos for more guidance.

  • How does the speaker feel about the unexpected results or 'happy accidents' in generative AI?

    -The speaker enjoys the unexpected results or 'happy accidents' that can occur with generative AI, as they can lead to beautiful and surprising image generations.

  • What is the speaker's final recommendation to the viewers?

    -The speaker encourages viewers to try out the new model, check the description for links, and share their best model tips in the comments section.
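
The install steps described in the Q & A (download the model, place it in the folder for your UI, restart the application) can be sketched in a few lines of Python. This is only an illustration: the summary does not name the folders, so the paths below are assumed defaults for AUTOMATIC1111, ComfyUI, and Fooocus and should be checked against your own install. The function name `install_checkpoint` and the file name in the usage comment are hypothetical.

```python
import shutil
from pathlib import Path

# ASSUMPTION: default checkpoint folders, relative to each UI's install
# directory. They are not stated in the video summary; verify them locally.
CHECKPOINT_DIRS = {
    "automatic1111": Path("models") / "Stable-diffusion",
    "comfyui": Path("models") / "checkpoints",
    "fooocus": Path("models") / "checkpoints",
}


def install_checkpoint(model_file: Path, ui_root: Path, ui: str) -> Path:
    """Copy a downloaded checkpoint into the folder the chosen UI scans.

    Hypothetical helper for illustration; returns the path of the copy.
    """
    key = ui.lower()
    if key not in CHECKPOINT_DIRS:
        raise ValueError(f"Unknown UI: {ui!r}")
    if not model_file.is_file():
        raise FileNotFoundError(model_file)

    target_dir = ui_root / CHECKPOINT_DIRS[key]
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    target = target_dir / model_file.name
    shutil.copy2(model_file, target)
    return target


# Example call (hypothetical paths and file name):
# install_checkpoint(Path("Downloads/juggernautXL.safetensors"),
#                    Path("Fooocus"), "fooocus")
```

After copying the file, restart the UI (or refresh its model list, if it offers that) so the new checkpoint shows up in the model selector.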



Outlines

🎨 Introduction to Stable Diffusion and Photorealism Enhancement

The speaker begins by introducing a new Stable Diffusion model that aims to improve photorealism, particularly in human images. They acknowledge the limitations of the base SDXL model and express optimism about the improvements the new model can bring. The speaker also mentions a personal grievance with someone who cut in line and encourages viewers to support their research through likes, subscriptions, and comments. They then discuss images generated by older versions of Stable Diffusion and share their excitement about the potential of the new model, Juggernaut XL, which they recently downloaded and experimented with.


🚀 Exploring the Capabilities of the New Stable Diffusion Model

The speaker explores the capabilities of the new Stable Diffusion model, Juggernaut XL. They showcase various images generated by the model, including a Viking, a post-apocalyptic man, a dystopian society, a woman in the jungle, a cyberpunk scientist, and an elderly woman in black and white. The speaker praises the model's ability to create photorealistic images, particularly noting the realistic flow of hair and animal fur, but also points out that skin texture could be improved. The speaker emphasizes the ease of use and the potential of custom models to outperform base models, encouraging viewers to explore and download these models for better results.



Keywords

💡Stable Diffusion

Stable Diffusion is an AI model that generates images from text prompts. In the video, it is the primary tool discussed for producing photorealistic AI-generated images, especially of humans and animals. The speaker mentions earlier versions of Stable Diffusion, such as 1.5 and 1.4, indicating an ongoing process of development and improvement.


💡Photorealism

Photorealism is a visual quality in which images or artwork closely resemble real-life photographs. In the video, the speaker evaluates how well the Stable Diffusion model can create images that look like actual photographs, particularly of humans and animals. The term describes the goal of achieving a high level of detail and realism in AI-generated content.


💡SDXL

SDXL (Stable Diffusion XL) is a base model in the Stable Diffusion series. The speaker mentions that while SDXL has been effective for many images, it lags in photorealism, particularly with human subjects. The term is used to compare the performance of different models in producing realistic images.

💡Juggernaut XL

Juggernaut XL is a custom Stable Diffusion model mentioned in the video as an example of a model that has been well received by users. It represents the trend of custom models outperforming base models in the Stable Diffusion series. The speaker discusses this model in the context of its ability to produce high-quality, photorealistic images.


💡Cinematic

Cinematic refers to the quality of images or scenes that resemble those found in movies or television shows. In the context of the video, the speaker is interested in the ability of the Stable Diffusion model to generate images that could be used in cinematic contexts, indicating a high level of detail, composition, and visual appeal.

💡Custom Models

Custom models in the context of the video are modified or specialized versions of the base Stable Diffusion models. They are created by users or developers to improve certain aspects of image generation, such as photorealism. The speaker notes that custom models are starting to outshine the base models, indicating a trend of user-driven innovation within the AI image generation community.

💡Viking Warrior

Viking Warrior is a subject used as a prompt for the Stable Diffusion model to generate an image. It refers to the historical figures of the Viking Age, known for their warrior culture. In the video, the speaker uses 'Viking Warrior' as an example prompt to demonstrate the model's ability to generate detailed and contextually relevant images.


💡Cyberpunk

Cyberpunk is a subgenre of science fiction that typically features advanced technology and science, often set in a dystopian future. In the video, the term is used to describe one of the themes of the generated images, indicating a visual style characterized by a blend of futuristic technology and gritty urban environments. The speaker appreciates the model's ability to capture the cyberpunk aesthetic in the generated images.

💡Fur Realism

Fur realism refers to the accurate and detailed depiction of animal fur in images, which is a challenging aspect of image generation for AI models. In the video, the speaker praises the new Stable Diffusion model for its ability to generate images with realistic-looking fur, indicating an improvement in the model's capacity to capture fine details.

💡Skin Rendering

Skin rendering is the process of creating realistic human skin textures and colors in digital images or 3D models. In the context of the video, the speaker critiques the model's performance in rendering skin, noting that while it is generally good, there is room for improvement, particularly in the depiction of women's skin which can sometimes appear overly smooth or shiny.

💡Happy Accidents

Happy accidents are unintended or unexpected positive outcomes that occur during the creative process. In the video, the term describes the surprising and delightful results that can emerge from generative AI such as Stable Diffusion, where the final image may surpass the creator's initial expectations.


Highlights

The discussion focuses on a Stable Diffusion model that improves photorealism, especially in human images.

The speaker mentions that while SDXL has been great, it noticeably lags in photorealism for people.

The speaker shares their opinion on the model's effectiveness and hints at a personal grudge.

The speaker emphasizes their role in doing research for the community and encourages subscribing and participating.

The transition to changing the background signifies moving on to the next topic of discussion.

Examples of images from Stable Diffusion 1.5 and 1.4 are mentioned, showcasing the variety of outputs.

The speaker expresses their admiration for the photorealistic quality of an AI-generated Viking image.

Images of a post-apocalyptic man with a skull mask and of a dystopian society are described as fantastic.

The speaker loves the realistic portrayal of hair and light in the image of a woman in the jungle.

The cyberpunk vibe and scientist with mechanical eyes image are mentioned, showcasing the model's versatility.

The speaker notes that animals in AI-generated images, like the snow leopard, look very realistic.

The speaker points out that skin depiction, particularly on women, could be improved.

A cinematic scene of a car generated by Juggernaut XL, a custom Stable Diffusion SDXL model, is highlighted.

The speaker provides instructions on where to place the model files for different user interfaces.

The speaker recommends the Juggernaut XL model and suggests that custom models are outshining base models.

The process of activating a new model in the Fooocus user interface is explained.

The speaker shares their excitement for the happy accidents and beautiful generations produced by generative AI.

The speaker concludes by encouraging viewers to share their model tips in the comments.