I Spent 1000 Hours Researching This - You Won't Believe What I Discovered About Stable Diffusion!

PromptGeek
28 Jul 2023 18:31

TLDR: In this video for Prompt Geek, the speaker jokes that with Stable Diffusion you can forget about expensive camera gear and still create photorealistic images from the comfort of home. The speaker has compiled a 182-page prompt look book with over 350 images and 200 prompt tags, all tested personally over hundreds of hours. It is available for free on Gumroad, with an optional $2 donation to support the creator's coffee fund. The video outlines the best settings for Stable Diffusion, introduces the models used, and provides examples from the book. The speaker shares findings on models like Universe Stable and Absolute Reality, discusses the importance of LoRAs for realistic skin and eyes, and offers tips on negative prompts and sampling methods. The guide also covers prompt structure: style of photography, subject details, pose, framing, background, lighting, camera angle, and camera properties. The speaker highlights styles such as documentary photography for realistic skin tones and candid photography for natural-looking images. The video concludes with an invitation to share community creations and a reminder to like, subscribe, and consider donating for the free guide.

Takeaways

  • 📷 Stable Diffusion allows you to create photorealistic images without expensive camera equipment.
  • 🎨 The speaker has created a comprehensive 182-page prompt look book with over 350 images and 200 prompt tags, tested over hundreds of hours.
  • 📚 The prompt look book is available for free on Gumroad, with an option to donate towards the creator's coffee fund.
  • 🔍 The look book includes the best settings for Stable Diffusion, the models used, and example prompts.
  • 🌌 Models like Universe Stable, Absolute Reality, and Photon are recommended for sci-fi, fantasy, and film grain effects.
  • 🧩 Using LoRAs such as 'detailed eyes' and 'polyhedron New Skin' can enhance the realism of skin textures and eyes.
  • ❌ Negative prompts like 'bad hands' and 'unrealistic dream' can be used to guide the AI away from generating unwanted features.
  • 🔄 The sampling method DPM++ SDE Karras with 30 sampling steps and high res fix is suggested for image generation.
  • 🖼️ High res steps set to 20 and a denoising strength of around 0.2 are recommended for image quality.
  • 🖌️ In-painting can be used to fix minor issues with generated images, such as eyes or mouths.
  • 📚 The structure of a perfect prompt includes style of photo, subject details, pose/action, framing, background, lighting, camera angle, camera properties, and photographer's style.
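
The sampler, step, and high-res settings listed above can be collected in one place. The sketch below is a minimal illustration that assumes the AUTOMATIC1111 web UI and the field names of its txt2img API request, which the summary does not name. The image size, CFG value, upscale factor, and example prompt are likewise assumptions, not the speaker's stated setup.

```python
import json


def build_txt2img_settings(prompt: str, negative_prompt: str,
                           denoising_strength: float = 0.2) -> dict:
    """Return a settings dict shaped like an AUTOMATIC1111 txt2img request (assumed target)."""
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        # From the summary; newer UI versions may split sampler and scheduler.
        "sampler_name": "DPM++ SDE Karras",
        "steps": 30,                      # from the summary
        "enable_hr": True,                # high res fix, from the summary
        "hr_upscaler": "4x-UltraSharp",   # from the summary; installed separately
        "hr_second_pass_steps": 20,       # "high res steps", from the summary
        "denoising_strength": denoising_strength,  # summary suggests 0.2 to 0.4
        "hr_scale": 2,     # assumption: the summary gives no upscale factor
        "width": 512,      # assumption: portrait size
        "height": 768,     # assumption
        "cfg_scale": 7,    # assumption: the summary gives no CFG value
    }


settings = build_txt2img_settings(
    prompt="documentary photography, portrait of an elderly fisherman",  # hypothetical
    negative_prompt="camera",
)
print(json.dumps(settings, indent=2))
```

Keeping the values in a single dictionary makes it easy to change one setting, such as the denoising strength, and compare results across runs.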

Q & A

  • What is the main topic of the video?

    -The video is about creating photorealistic images with Stable Diffusion, without the need for expensive camera equipment.

  • What is the resource the speaker has created to assist with creating realistic images?

    -The speaker has created a 182-page prompt look book with over 350 images and 200 prompt tags, which they have tested over hundreds of hours.

  • How can viewers access the speaker's prompt look book?

    -The prompt look book is available for free on Gumroad, with an option to donate $2 towards the speaker's coffee fund.

  • Which models does the speaker recommend for creating photorealistic images?

    -The speaker recommends the Universe Stable, Absolute Reality, and Photon models for creating photorealistic images with Stable Diffusion.

  • What are LORAs and how are they used in the process?

    -LoRAs are small add-on models, loaded alongside the main model and typically invoked from the prompt, that enhance specific features in the generated images. The speaker uses 'detailed eyes' and 'polyhedron New Skin' to improve the realism of eyes and skin textures.

  • What is the significance of including negative prompts in the process?

    -Negative prompts are used to avoid unwanted elements in the generated images, such as 'bad hands' or 'unrealistic dream', which can help refine the output to be more realistic.

  • What sampling method does the speaker recommend for stable diffusion?

    -The speaker recommends the DPM++ SDE Karras sampler with 30 sampling steps.

  • How does the speaker suggest improving the resolution of the generated images?

    -The speaker suggests using the 4x-UltraSharp upscaler and setting the high res steps to 20 for better resolution.

  • What is the role of the 'camera' tag in the prompt?

    -Adding 'camera' as a negative tag keeps the AI from generating images in which the subject is holding a camera, which is usually not desired.

  • What does the speaker suggest for fixing faces that do not turn out well in the initial image generation?

    -The speaker suggests using in-painting to fix faces that do not look right in the initial image generation.

  • How does the speaker determine the structure of the perfect prompt for image generation?

    -The speaker has developed a structure that includes the style of photo, subject details, pose or action, framing, background, lighting, camera angle, camera properties, and a photographer's name to evoke that photographer's style.

  • What is the speaker's advice on sharing the generated images with the community?

    -The speaker encourages viewers to share their generated images on Reddit or in the comments section of the video for others to see and provide feedback.

Outlines

00:00

📸 Introduction to Photorealistic Image Creation with Stable Diffusion

The speaker jokes that, despite owning expensive camera equipment, one can create photorealistic images with Stable Diffusion without ever leaving the basement. They introduce a free resource, a 182-page prompt look book with over 350 images and 200 prompt tags, tested over hundreds of hours. The speaker also mentions an upcoming project and asks for likes, subscriptions, and optional donations to the coffee fund. The video covers the best settings for Stable Diffusion, the models used, and examples from the book.

05:03

🖼️ Selecting the Right Models and Prompts for Realistic Imagery

The speaker discusses the models they have found most successful for images with a sci-fi or fantasy twist and for different backgrounds. They mention using LoRAs for realistic skin textures and eyes, and the importance of including negative prompts. The speaker also details the Stable Diffusion settings, including the sampling method, steps, upscaler, high res steps, and denoising strength. They touch on the use of ADetailer and the process of generating the first image, emphasizing how easy it is to achieve realistic results with the right prompts and settings.

10:04

🎨 Crafting the Perfect Prompt for AI Image Generation

The speaker provides guidance on constructing effective prompts for AI image generation. They explain the structure of a prompt, including the style of photo, subject details, pose or action, framing, background, lighting, camera angle, and camera properties. The speaker shares examples of different styles, such as abstract, candid, documentary, and large format photography, and shows how each influences the outcome. They also caution against focusing the prompt on hands and feet, and stress the importance of using adjectives to describe the character of the subject.

15:07

📹 Camera Properties, Filters, and Photographer Styles in AI Image Prompts

The speaker delves into camera properties that can be included in prompts, such as specific camera models and film types, and their impact on the image's realism. They note that technical terms like focal lengths and f-stops do not significantly affect the outcome, but lenses with distinctive qualities do. The speaker also discusses various filters that can be applied to images and the inclusion of different photographers' styles in prompts to achieve distinct visual results. They encourage the community to download the book, build their own images, and share their creations, while also supporting the channel through likes and subscriptions.

Keywords

💡Stable Diffusion

Stable Diffusion is a text-to-image AI model that generates images, including photorealistic ones, from textual descriptions. In the video, it is the core technology that enables the creation of realistic images without the need for traditional photography equipment. The host discusses how to use Stable Diffusion effectively to produce high-quality results.

💡Prompt Look Book

A Prompt Look Book is a resource that contains a collection of examples and guidelines for creating prompts, which are the textual descriptions used to direct AI image generation models like Stable Diffusion. The host has created a 182-page look book with over 350 images and 200 prompt tags, which is offered for free to help viewers understand how to construct effective prompts.

💡LORAs

LoRAs (Low-Rank Adaptations) are small add-on weight files that steer a Stable Diffusion model's output toward particular characteristics, such as skin texture and eye detail, without retraining the whole model. The host mentions using the 'detailed eyes' and 'polyhedron New Skin' LoRAs to enhance the realism of the generated images.
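
In web UIs such as AUTOMATIC1111 (an assumption here, since the summary does not name the interface), a LoRA is typically switched on by appending a tag of the form `<lora:filename:weight>` to the prompt. The helper below sketches that convention. The file names and weights are hypothetical placeholders, not the speaker's values.

```python
def lora_tag(name: str, weight: float = 1.0) -> str:
    """Format a LoRA activation tag in the <lora:name:weight> style (assumed syntax)."""
    if not name:
        raise ValueError("LoRA name must not be empty")
    return f"<lora:{name}:{weight:g}>"


def with_loras(prompt: str, loras: dict[str, float]) -> str:
    """Append one activation tag per LoRA to the end of the prompt."""
    tags = " ".join(lora_tag(name, weight) for name, weight in loras.items())
    return f"{prompt} {tags}".strip()


example = with_loras(
    "candid photography, close-up portrait of a woman",  # hypothetical prompt
    {"detailed_eyes": 0.6, "polyhedron_new_skin": 0.3},  # hypothetical file names and weights
)
print(example)
```

Lower weights apply the effect more gently, which can help when a LoRA starts to dominate the rest of the prompt.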

💡Negative Prompts

Negative prompts are terms or phrases included in the prompt to guide the AI away from including certain elements or qualities in the generated image. An example from the script is 'bad hands', which is used to prevent the AI from generating images with poorly rendered hands.
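
A negative prompt is just a second text field, so it can be assembled like any other string. The sketch below joins the terms mentioned in this summary and drops duplicates. The exact spelling of each term depends on which embedding files are installed, so treat the names as placeholders.

```python
def build_negative_prompt(*terms: str) -> str:
    """Join negative terms with commas, skipping blanks and case-insensitive duplicates."""
    seen = set()
    kept = []
    for term in terms:
        cleaned = term.strip()
        key = cleaned.lower()
        if cleaned and key not in seen:
            seen.add(key)
            kept.append(cleaned)
    return ", ".join(kept)


negative = build_negative_prompt("bad hands", "unrealistic dream", "camera", "Camera")
print(negative)  # bad hands, unrealistic dream, camera
```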

💡Sampling Method

The sampling method is the algorithm that turns random noise into the final image over a series of denoising steps. The host sets the sampling method to 'DPM++ SDE Karras' with 30 sampling steps.

💡High Res Fix

High Res Fix is a generation setting (not part of the model itself) that renders the image at a lower resolution first, then upscales and refines it in a second pass. The host always uses this setting to ensure the images are of high quality.

💡Upscale

Upscaling is the process of increasing the resolution of the generated image. The host uses the 4x-UltraSharp upscaler, which produces high-resolution images more quickly than the 8x NMKD Superscale upscaler suggested by the creator of Absolute Reality.

💡ADetailer

ADetailer is a tool that automatically refines details of generated images, such as faces. However, the host found that it sometimes produced repetitive faces, so they opted to fix faces manually afterwards with inpainting.

💡Prompt Structure

The Prompt Structure refers to the way in which the textual description, or prompt, is organized to guide the AI in generating the image. The host provides a detailed structure that includes elements like the style of photo, subject details, pose, framing, background, lighting, camera angle, and photographer's style.
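
The ordering described above can be sketched as a small data structure that joins whichever parts are filled in. This is only an illustration of the structure; the class name, field names, and example values are hypothetical and are not taken from the look book.

```python
from dataclasses import dataclass, fields


@dataclass
class PhotoPrompt:
    # Field order follows the prompt structure described in the summary.
    style: str = ""
    subject: str = ""
    pose: str = ""
    framing: str = ""
    background: str = ""
    lighting: str = ""
    camera_angle: str = ""
    camera_properties: str = ""
    photographer: str = ""

    def render(self) -> str:
        """Join the non-empty parts in declaration order, separated by commas."""
        parts = (getattr(self, f.name).strip() for f in fields(self))
        return ", ".join(p for p in parts if p)


prompt = PhotoPrompt(
    style="documentary photography",
    subject="a weathered fisherman in a wool sweater",
    pose="mending a net",
    framing="medium shot",
    background="a foggy harbor",
    lighting="chiaroscuro",
    camera_angle="eye level",
).render()
print(prompt)
```

Leaving a field empty simply omits it, so the same structure works for short prompts and fully specified ones.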

💡Camera Properties

Camera Properties are specific details about the type of camera or lens that can be included in the prompt to influence the style and quality of the generated image. The host discusses various camera models and lenses, emphasizing that certain properties, like '50 millimeter lens', do not significantly affect the outcome, while others, like 'fisheye lens', do.

💡Style of Photographer

The Style of Photographer refers to the inclusion of a specific photographer's name in the prompt to guide the AI towards emulating that photographer's distinctive style in the generated image. The host provides examples of photographers whose styles can be effectively incorporated into prompts to achieve particular visual effects.

Highlights

You can create photorealistic images using Stable Diffusion without expensive camera equipment.

The presenter has created a 182-page prompt look book with over 350 images and 200 prompt tags, tested over hundreds of hours.

The resource is available for free on Gumroad, with an option to donate to the presenter's coffee fund.

The video showcases the best settings for Stable Diffusion, including models and prompt examples from the book.

Three models discussed are Universe Stable, Absolute Reality, and Photon, each suitable for different types of images.

Using LoRAs like 'detailed eyes' and 'polyhedron New Skin' can enhance the realism of skin textures and eyes.

Negative prompts such as 'bad hands' and 'unrealistic dream' are crucial for refining the image generation process.

The presenter recommends DPM++ SDE Karras sampling with 30 steps for high-quality image generation.

High res fix with the 4x-UltraSharp upscaler is used for fast, high-quality results.

Denoising strength can be adjusted between 0.2 and 0.4 for optimal image quality.

The aspect ratio and CFG scale can be modified based on the desired image orientation and focus.

The presenter found that using ADetailer when generating many images led to repetitive faces, and suggests manual touch-ups instead.

The structure of an effective prompt includes the style of photo, subject details, pose/action, framing, background, lighting, camera angle, and camera properties.

Different styles like abstract, candid, and documentary photography yield distinct visual elements and skin tones.

The use of adjectives and focusing on the character's emotional state can lead to more expressive and authentic AI images.

Avoid focusing the prompt on hands and feet, or be prepared to make adjustments in post-processing.

ControlNet with the OpenPose model can be an alternative to prompt-based pose control.

Providing contextual details for the background without being overly prescriptive allows the AI to interpret the essence of the prompt.

Lighting tags like 'candlelight', 'chiaroscuro', and 'neon lighting' can significantly affect the mood and realism of the generated image.

Camera angle and properties, such as specific camera models and film types, can be invoked to influence the style and quality of the image.

The book includes a variety of photographers' styles that can be used to achieve different artistic outcomes.

The presenter encourages the community to use the provided resource, share their creations, and support the channel.