Best AI Photorealism yet? NEW Model!

Sebastian Kamph
17 Sept 2023 · 09:32

TLDR: The video discusses advances in generative AI for creating photorealistic images with Stable Diffusion. It introduces a new model trained on realism and shares tips for enhancing photorealism, such as adding textures and imperfections. The host demonstrates live renders and compares different models, highlighting the progress in achieving realistic results without extensive manual editing.


  • 🚀 The journey towards achieving photorealistic images with Stable Diffusion is ongoing, with significant progress being made.
  • 🎨 A new model is introduced that is specifically trained on realism, aiming to create more photorealistic images.
  • 👀 Improvements in eye rendering are highlighted, with the addition of 'detail eyes' to enhance the realism of the portraits.
  • 🌞 The importance of skin texture in achieving realism is discussed, with suggestions to use terms like 'dry skin' and 'visible skin hair' to improve the rendering result.
  • 🎬 The speaker shares their admiration for a model that produces plain, regular images, akin to typical stock photos, emphasizing the value of authenticity.
  • 🔄 The process of adding LoRAs to the model is explained to address common issues with generative AI, such as oily and plastic-looking skin.
  • 📸 A recommendation is made to use a model called 'Realistic Stock Photos' for close-up photos of people, which has been trained with stock photos for a more natural look.
  • 🌐 Instructions are provided on how to download and install the recommended models and LoRAs for use with various Stable Diffusion interfaces.
  • 🎭 The impact of different styles, such as 'Cinematic' and 'Analog Film', on the final image is demonstrated, showing how they can alter the vibe of the rendered images.
  • 👩 Examples of rendered images, including a portrait of a woman astronaut and a Viking woman warrior, are used to illustrate the capabilities and potential of the new models and techniques.
  • 📈 The progress of Stable Diffusion is praised, with the speaker noting a significant reduction in the need for manual inpainting compared to previous versions.

Q & A

  • What is the main focus of the video?

    -The main focus of the video is to discuss and demonstrate the process of creating photorealistic images using Stable Diffusion and various models, with an emphasis on achieving a more realistic style in portrait renderings.

  • What is the significance of the astronaut portrait in the video?

    -The astronaut portrait is highlighted as an example of a fantastic image created with the new model, which feels like it's straight out of the movie '2001: A Space Odyssey', showcasing the potential for high-quality photorealism in the generated images.

  • What are some of the live renders featured in the video?

    -The live renders featured in the video include a portrait of a woman with detailed eyes, a sunset at the beach, and an astronaut, all aiming to demonstrate the capabilities of the new model in achieving photorealistic results.

  • What specific improvements are being made to the skin texture in the images?

    -The video discusses adding features like dry skin, visible skin hair, and skin blemishes to improve the realism of the skin texture, addressing common issues with oiliness and plastic-like appearances in previous models.

  • What is the 'realistic stock photos' model mentioned in the video?

    -The 'realistic stock photos' model is a new model trained specifically on realism, using stock photos for close-up images of people. It aims to produce plain, regular images that resemble typical stock photos, focusing on authenticity and photorealism.

  • How does the video address the issue of imperfections in the generated images?

    -The video suggests adding prompts for imperfections such as dry skin, visible skin hair, and skin blemishes to make the images look more natural and realistic, moving away from the overly perfect and unrealistic appearances of some AI-generated images.

  • What are the different styles applied to the images in the video?

    -The video applies various styles to the images, including cinematic and analog film, to demonstrate how different styles can affect the final look and feel of the generated images, from a vintage old photo vibe to a more cinematic and dramatic appearance.

  • How does the video compare the new model's performance to Stable Diffusion 1.5?

    -The video compares the new model to Stable Diffusion 1.5 by noting that the Stable Diffusion 1.5 base model struggled to produce realistic images, whereas the new model achieves much better results, requiring less inpainting and producing good images more consistently.

  • What additional elements are used to enhance the eyes in the images?

    -The video mentions the use of a 'detail eyes' model to enhance the realism and detail of the eyes in the generated images, addressing common issues with the eyes appearing too shiny or not detailed enough in previous models.

  • What is the overall impression of the progress in photorealistic AI and generative AI?

    -The overall impression is positive, with the video expressing happiness and satisfaction with the direction the technology is taking and the rapid progress of Stable Diffusion in achieving more realistic and authentic photorealistic images.



🎨 Journey to Photorealism with AI

This paragraph introduces the quest for the best photorealistic images from generative AI, specifically Stable Diffusion, and mentions a new model that brings that goal closer. The speaker welcomes viewers and sets the stage for a session demonstrating how to create realistic images, including how to address common issues such as the oily, plastic appearance of skin in AI-generated images. The paragraph also highlights the live rendering of portraits and the intention to improve photorealism by focusing on skin texture and other realistic image elements.


👀 Enhancing Realism in AI-Generated Portraits

This paragraph delves into the specifics of improving the realism of AI-generated images, particularly focusing on the eyes and skin texture. It discusses the shortcomings of previous models in rendering eyes and the improvements brought by the new model. The speaker shares examples of images with better eye details and skin blemishes, emphasizing the importance of imperfections in achieving a more natural and authentic look. The paragraph also touches on the use of different styles and models to create a variety of realistic portraits, from a 17th-century woman to a Viking warrior in a coffee shop, and the challenges faced in getting the perfect result.




💡Photorealism

Photorealism refers to the creation of images that are extremely realistic and closely resemble photographs. In the context of the video, it is the primary goal the creator is striving for, using AI and specific models to generate images that look like they could have been taken with a camera. The video discusses techniques and models used to enhance photorealism, such as adding textures and imperfections to make the images more authentic.

💡Generative AI

Generative AI refers to artificial intelligence systems that are capable of creating new content, such as images, music, or text. In the video, generative AI is used to produce photorealistic images with Stable Diffusion. The creator discusses the journey of improving these AI-generated images to make them increasingly realistic and true to life.

💡Stable Diffusion

Stable Diffusion is a latent diffusion model that generates images from text prompts. It is the model used in the video to create photorealistic images. The creator discusses using and building on Stable Diffusion, through fine-tuned models and add-ons, to achieve more realistic results.

💡Model Training

Model training is the process of teaching a machine learning model to recognize patterns and make decisions based on data. In the context of the video, model training is crucial for achieving photorealism, as the AI models are trained on specific datasets, like stock photos of people, to generate realistic images.

💡Skin Texture

Skin texture refers to the detailed appearance of human skin, including elements like pores, blemishes, and hair. In the video, improving skin texture is a key aspect of achieving photorealism. The creator discusses adding details like dry skin, visible skin hair, and blemishes to make the AI-generated images more lifelike and authentic.
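
The prompt additions described above can be sketched as a small helper. This is a minimal illustration, not the creator's workflow: the three terms come from the summary, while the function name `add_skin_texture` and the example prompts are assumptions made here.

```python
# Terms mentioned in the video summary for more natural-looking skin.
SKIN_TERMS = ["dry skin", "visible skin hair", "skin blemishes"]


def add_skin_texture(prompt: str, terms=SKIN_TERMS) -> str:
    """Append skin-texture terms to a prompt, skipping terms already present.

    Hypothetical helper for illustration only.
    """
    base = prompt.strip().rstrip(",")
    present = base.lower()
    extra = [t for t in terms if t not in present]
    parts = ([base] if base else []) + extra
    return ", ".join(parts)


print(add_skin_texture("close-up photo of a woman, dry skin"))
```

The resulting string can be pasted into the positive prompt field of whichever Stable Diffusion interface is in use.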

💡Eye Details

Eye details refer to the intricate features of the eyes, such as the iris, pupil, and reflections, which contribute to the realism of an image. In the video, enhancing eye details is important for creating convincing portraits. The creator uses additional models, like 'detail eyes', to improve the quality and realism of the eyes in the AI-generated images.
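
As a rough sketch of how such an add-on might be referenced, the snippet below formats a prompt tag in the `<lora:name:weight>` syntax used by the AUTOMATIC1111 web UI. It assumes 'detail eyes' is distributed as a LoRA (the summary only calls it an additional model); the file name `detail_eyes` and the weight 0.8 are placeholders, and other interfaces may load add-ons differently.

```python
def lora_tag(name: str, weight: float = 1.0) -> str:
    """Format a LoRA reference in AUTOMATIC1111-style prompt syntax."""
    return f"<lora:{name}:{weight:g}>"


def with_loras(prompt: str, loras: dict) -> str:
    """Append one tag per LoRA to the prompt (hypothetical helper)."""
    tags = " ".join(lora_tag(name, weight) for name, weight in loras.items())
    return f"{prompt} {tags}".strip()


# "detail_eyes" is an assumed file name, not taken from the video.
print(with_loras("portrait of a woman astronaut", {"detail_eyes": 0.8}))
```

Lowering the weight reduces the add-on's influence, which can help if the eyes start to look overprocessed.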


💡Imperfections

Imperfections are minor flaws or irregularities that occur naturally and contribute to the authenticity of an image or object. In the context of the video, adding imperfections such as skin blemishes and visible hair is essential for achieving photorealism, as it makes the images appear more true to life and less artificially perfect.

💡CFG Scale

CFG Scale is the classifier-free guidance scale, a sampling setting that controls how strongly the generated image follows the text prompt: lower values give the model more freedom, while higher values push the result closer to the prompt. In the video, a CFG scale of three is recommended for achieving a certain level of photorealism in close-up photos of people.
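
CFG is commonly expanded as classifier-free guidance. A toy sketch of what the scale does numerically is given below: at each sampling step the model makes one prediction without the prompt and one with it, and the scale extrapolates from the first toward the second. The two-element lists stand in for real noise predictions and are made-up values; the function name `apply_cfg` is an assumption for illustration.

```python
def apply_cfg(uncond, cond, scale):
    """Combine unconditional and prompt-conditioned predictions.

    guided = uncond + scale * (cond - uncond)
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # toy prediction without the prompt
cond = [1.0, 3.0]    # toy prediction with the prompt

print(apply_cfg(uncond, cond, 1.0))  # [1.0, 3.0], same as the conditioned prediction
print(apply_cfg(uncond, cond, 3.0))  # [3.0, 7.0], pushed further toward the prompt
```

A scale of 1 reproduces the conditioned prediction unchanged, so a low value such as 3 keeps the prompt's pull on each step comparatively mild.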


💡Portraits

Portraits are a type of photography or artwork that focuses on depicting a person's face or figure. In the video, the creator is using generative AI to generate realistic portraits, and discusses various techniques and models to improve the realism of these images.

💡Cinematic Vibe

Cinematic vibe refers to the visual and emotional qualities that make an image or video feel like a scene from a movie. In the video, the creator discusses using certain models and styles to give the AI-generated images a cinematic feel, suggesting a more dramatic and engaging visual style.

💡Vintage Style

Vintage style refers to the visual aesthetic that resembles the look and feel of older photographs or films, often characterized by a certain warmth, graininess, and color palette. In the video, the creator attempts to achieve a vintage style by using specific settings and models, aiming to create images that evoke the feeling of a bygone era.
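
Style presets in Stable Diffusion interfaces are often implemented as prompt templates that wrap the user's prompt. The sketch below illustrates that idea only: the template wording is invented here and is not the preset text used in the video, and `apply_style` is a hypothetical helper.

```python
# Hypothetical style templates; the wording is illustrative,
# not the actual presets shown in the video.
STYLES = {
    "cinematic": "cinematic film still of {prompt}, shallow depth of field, dramatic lighting",
    "analog film": "analog film photo of {prompt}, faded colors, film grain, vintage look",
}


def apply_style(prompt: str, style: str) -> str:
    """Wrap a prompt in the chosen style template."""
    return STYLES[style].format(prompt=prompt)


print(apply_style("a Viking woman warrior in a coffee shop", "analog film"))
```

Because the subject stays the same and only the wrapper changes, switching the style key is enough to move one prompt between a cinematic and a vintage look.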


The pursuit of achieving photorealistic images with Stable Diffusion is ongoing, with significant progress being made.

A new model is introduced that is specifically trained on realism, aiming to enhance the photorealistic quality of generated images.

The importance of adding LoRAs to the model to address common failures, such as unrealistic eyes and skin texture, is emphasized.

The speaker shares their admiration for the movie '2001: A Space Odyssey', drawing a parallel between the quality of generated images and the aesthetics of the film.

Live renders of a portrait of a woman, detailed eyes, and a sunset at the beach are showcased to demonstrate the current capabilities of the technology.

The speaker critiques the common issue of skin appearing oily and plastic in AI-generated images, highlighting the progress made in this area with the new model.

An example of a live render is provided, showing the transition from a standard image to one with added skin blemishes for increased realism.

The model 'Realistic Stock Photos' is recommended for its ability to produce plain and regular images, akin to stock photos.

The process of downloading and installing the new model and LoRAs is explained to enhance the user's experience with photorealism.

The impact of different styles, such as 'cinematic' and 'analog film', on the final output of images is discussed, showing how they can change the vibe of the generated photos.

A live demonstration of changing a woman's image into a fashion model and then into a Viking woman warrior illustrates the versatility of the technology.

The comparison between the old Stable Diffusion 1.5 and the current progress shows a significant improvement in the realism of generated images.

The addition of imperfections such as dry skin, visible skin hair, and blemishes is suggested to enhance the realism of the images.

The speaker's satisfaction with the direction of progress in the field of generative AI and the potential it holds for the future is expressed.

The practical application of the technology is demonstrated through the creation of various images, showing its potential for professional use.

The concept of achieving 'dry skin' effect by using specific prompts is introduced, providing insights into the level of control over image generation.

The transition from a cinematic style to a vintage old photo style is demonstrated, showing the adaptability of the AI in capturing different aesthetics.