SDXL - BEST Build + Upscaler + Steps Guide

Olivio Sarikas
10 Jul 2023 · 12:15

TLDR This tutorial shows how to get strong results with SDXL (Stable Diffusion XL), an image generation model, using a build in ComfyUI. The guide covers the essential settings, the base and refiner stages, and a two-step upscaling process for enhancing image quality. It also compares different upscaling models and points to resources for further exploration and inspiration.

Takeaways

  • 😀 The tutorial is about achieving great results with SDXL, an image generation model, and about upscaling its output.
  • 🔍 The video script includes a discussion about the best settings for image quality and upscaling, as well as reviewing images online.
  • 📈 The script introduces a build in ComfyUI that works as a lab for experimenting with different methods and settings, and that can be set up for automatic testing.
  • 📝 Multiple instances of prompts are used to define the quality and style of the image, including both positive and negative prompts.
  • 🎨 The tutorial explains the use of a 'base model' and a 'refiner model' with different step counts for image generation, emphasizing a ratio of 80% base model steps to 20% refiner steps.
  • 🖼️ The base rendering process is detailed, including the use of CLIP text encoders for both positive and negative prompts, and the configuration of the KSampler.
  • 🔧 The refiner stage is described, which simplifies the process by using fewer inputs and focusing on technical aspects of the image.
  • 🌟 The script mentions the importance of the VAE Decode node in converting the latent image (data points) into a pixel image that can be viewed.
  • 🔍 A double upscaling process is introduced, first using a skin detailer and then a secondary upscaling method, with options for different upscaling models.
  • 🔗 Links to upscaling models, community-trained models, and Discord channels for inspiration and support are provided for further exploration.
  • 🖼️ The final image can be downloaded and loaded directly into ComfyUI for testing and experimentation with the provided build.

Q & A

  • What is the main topic of the video script?

    -The main topic of the video script is about achieving amazing results with the SDXL model, including the best settings for upscaling image quality and reviewing images online.

  • Who is Winston Wolf and what is his role in the video script?

    -Winston Wolf is a member of the speaker's Discord community. He helped the speaker during the live stream and also contributed to the script's content.

  • What is the purpose of 'ComfyUI' as mentioned in the script?

    -ComfyUI is used as a lab for experimenting with different methods to see how they work and for setting up automatic testing of various methods.

  • What are the positive and negative prompts in the script and what do they define?

    -Positive and negative prompts define the setup regarding the quality and style of the image. Positive prompts include terms like 'detailed photo' and 'realistic 8K UHD high quality', while negative prompts include '3D render', 'anime', 'blurry', 'low resolution', etc.

  • What is the significance of the 'base model' and 'refiner model' in the script?

    -The 'base model' and 'refiner model' are two stages of the image generation process. Out of 25 total steps, the base model runs the first 20, and the refiner model starts at step 20 and runs the remaining 5, which gives a ratio of 80% base model steps to 20% refiner steps.

  • What is the 'CFG scale' mentioned in the script and what is its purpose?

    -The 'CFG scale' (classifier-free guidance scale) is a setting that controls how strongly the generated image follows the prompt. In the build it is set alongside the image size and the seed; keeping these values fixed gives consistent results when testing other settings.

  • What is the role of the 'KSampler Advanced' in the script?

    -The 'KSampler Advanced' is a component that takes inputs from the model and the prompts to generate the initial image. It plays a crucial role in the base rendering stage before refinement.

  • What does the script mean by 'double upscale' and how is it used?

    -The 'double upscale' refers to a two-step process where the image is first refined using a specific skin detailer (a 1x upscaler) and then further upscaled using either the 4x NMKD Siax upscaler or the 4x UltraSharp upscaler.

  • What are the differences between the 'NMKD Siax upscaler' and the 'UltraSharp upscaler' as mentioned in the script?

    -The 'NMKD Siax upscaler' provides a nice upscale with some graininess but sharp details, while the 'UltraSharp upscaler' results in a softer image that can sometimes be blurry in certain parts but has sharp lines on the face.

  • How can viewers find more upscaling models and community-created models as mentioned in the script?

    -Viewers can find more upscaling models and community-created models by joining the official Stable Diffusion Discord, checking the 'show and tell XL' room, and following the provided links to the model database and Mega drives.

  • What is the final step for viewers to experiment with the image generation process as described in the script?

    -The final step for viewers is to download the provided image and drag it onto the canvas of ComfyUI, which will automatically load the complete build, provided they have downloaded the necessary SDXL models and upscalers.

Outlines

00:00

🖼️ Upscaling Image Quality with SDXL

The speaker begins by acknowledging the success of a recent live stream and the contributions of the community, particularly Winston Wolf. They plan to condense the insights from the stream into a tutorial. The focus is on experimenting with different methods to improve and upscale image quality using ComfyUI. The speaker introduces multiple prompts to define image quality and style, such as 'detailed photo' and 'realistic 8K UHD high quality', and contrasts these with negative prompts like '3D render' and 'low resolution'. They discuss the setup of the base model and refiner model, emphasizing the importance of the step ratio between the two. The speaker also explains the technical aspects of the image setup, including image size, CFG scale, and seed, and provides a download link for those interested in experimenting with the setup.

05:01

🎨 Refining Image Details with SDXL

This paragraph delves into the process of refining images using the SDXL model. The speaker explains the use of CLIP text encoders for both positive and negative prompts, detailing how these inputs are used in the model. They discuss the base rendering stage, emphasizing that its output is not yet refined and still contains noise. The speaker then moves on to the refiner stage, describing a simpler setup with a CLIP input and a text input. They explain how the latent image from the base render is used as input for the refiner model and how the total step count and start step are crucial for the refinement process. The speaker also discusses the use of a VAE Decode node to render the latent image into a pixel image. Finally, they introduce a double upscale process, comparing the results of two different upscaling methods and providing links to download the models for further experimentation.

10:01

🔍 Exploring Community Resources for Image Upscaling

In the final paragraph, the speaker provides guidance on how to access and utilize community resources for image upscaling. They instruct viewers on how to download and install upscaling models from a provided model database, which includes a variety of models trained for different purposes. The speaker also recommends joining the official Stable Diffusion Discord community for inspiration and examples of successful images. They highlight the 'show and tell XL' room and the 'Pantheon of winners' as valuable resources. The speaker concludes by offering their own image for download, which can be used to automatically load the complete build in ComfyUI, and reminds viewers to download the necessary SDXL models and upscalers for the process to work.
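Dragging an image into ComfyUI can restore a build because the workflow travels inside the image file. The sketch below is a minimal illustration, assuming the workflow is stored as JSON in a PNG text chunk under the key "workflow"; that key name and the helper names are assumptions made for this example, not details given in the video. It uses only the Python standard library and does not verify chunk checksums.

```python
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data: bytes) -> dict[str, str]:
    """Collect the key/value pairs stored in a PNG's uncompressed tEXt chunks."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    chunks: dict[str, str] = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte length, 4-byte type, body, 4-byte CRC.
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            key, _, value = body.partition(b"\x00")
            chunks[key.decode("latin-1")] = value.decode("latin-1")
        if chunk_type == b"IEND":
            break
        pos += 12 + length  # skip length, type, body, and CRC
    return chunks


def extract_workflow(data: bytes):
    """Return the embedded workflow as parsed JSON, or None if absent.

    Assumption: the workflow sits under the text key "workflow".
    """
    raw = read_png_text_chunks(data).get("workflow")
    return json.loads(raw) if raw is not None else None
```

To inspect a downloaded image, read it with open(path, "rb").read() and pass the bytes to extract_workflow. A result of None suggests the metadata was stripped, which can happen when an image is re-saved or re-compressed by a hosting site.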

Keywords

💡SDXL

SDXL stands for Stable Diffusion XL, a version of Stable Diffusion, an AI model that generates images from textual descriptions. It is used in two stages, with a base model and a refiner model. In the video, the creator discusses how to achieve optimal results using the SDXL model, which makes it the central topic of the tutorial.

💡Upscale

Upscaling in the context of the video refers to the process of increasing the resolution of an image while trying to maintain or enhance its quality. The script mentions different methods and models used for upscaling, such as the '4x NMKD Siax upscaler' and the '4x UltraSharp upscaler', showing the focus on improving image detail and clarity.
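As a rough illustration of what the double upscale does to resolution, the sketch below multiplies the image size by each stage's factor: a 1x detailer pass followed by a 4x upscaler. The 1024x1024 starting size is an assumption chosen for the example, and the function name is hypothetical.

```python
def upscaled_size(width: int, height: int, factors=(1, 4)) -> tuple[int, int]:
    """Apply each upscaler's scale factor in order and return the final size."""
    for factor in factors:
        width *= factor
        height *= factor
    return width, height


# Assumed 1024x1024 render, then a 1x detailer and a 4x upscaler.
print(upscaled_size(1024, 1024))  # (4096, 4096)
```

The 1x stage leaves the pixel count unchanged, so its job is to change detail rather than size; only the second stage enlarges the image.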

💡Prompts

Prompts are textual inputs provided to the AI model to guide the generation or modification of images. The script describes 'positive' and 'negative' prompts that define the desired qualities and styles of the images, as well as those that should be avoided.

💡Refiner Model

The refiner model in the video script is the second SDXL model, used to further enhance the image after the initial rendering. It is part of a two-stage process where the base model generates the image, and the refiner model improves upon it.

💡Steps

Steps in this context are the stages or iterations in the image generation or enhancement process. The script specifies different numbers of steps for the base model and the refiner, indicating a progression in the level of detail and quality of the image.
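The 80/20 split between base and refiner steps is simple arithmetic, sketched below. The function name and the rounding choice are assumptions for illustration; the video only gives the ratio and the example of 25 total steps with 20 on the base model.

```python
def split_steps(total_steps: int, base_fraction: float = 0.8) -> tuple[int, int]:
    """Return (base_steps, refiner_steps) for a two-stage base + refiner run."""
    if total_steps < 1:
        raise ValueError("total_steps must be at least 1")
    if not 0.0 <= base_fraction <= 1.0:
        raise ValueError("base_fraction must be between 0 and 1")
    base_steps = round(total_steps * base_fraction)
    return base_steps, total_steps - base_steps


base_steps, refiner_steps = split_steps(25)
print(base_steps, refiner_steps)  # 20 5
```

With 25 total steps the base model stops at step 20, and that same number is the step at which the refiner starts.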

💡CFG Scale

CFG Scale stands for classifier-free guidance scale, a parameter that controls how strongly the generated image follows the prompt. Higher values push the result closer to the prompt, while lower values give the model more freedom. The script mentions setting the CFG scale as part of the process to achieve the desired image quality.
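The usual classifier-free guidance formula can be sketched in a few lines. This is a generic illustration rather than code from the video: the two inputs stand for the model's noise predictions without and with the positive prompt, represented here as plain lists of floats. In most interfaces the negative prompt takes the place of the unconditioned prediction.

```python
def apply_cfg(uncond: list[float], cond: list[float], cfg_scale: float) -> list[float]:
    """Blend two noise predictions: uncond + cfg_scale * (cond - uncond)."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


# A scale of 1.0 returns the conditioned prediction unchanged;
# larger values move further away from the unconditioned one.
print(apply_cfg([0.0, 1.0], [1.0, 1.0], 1.0))  # [1.0, 1.0]
print(apply_cfg([0.0, 1.0], [1.0, 1.0], 2.0))  # [2.0, 1.0]
```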

💡Seed

In the context of the video, a seed is a value used to initialize the random number generator in the AI model, ensuring repeatability of results. A fixed seed allows for consistent outcomes when testing different settings.
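Why a fixed seed makes tests comparable can be shown with Python's own random number generator. The sketch below is only a stand-in: real image generation draws a much larger block of latent noise, and the function name and sample count are assumptions for illustration.

```python
import random


def make_noise(seed: int, count: int = 4) -> list[float]:
    """Draw `count` Gaussian samples from a generator initialized with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


# The same seed reproduces the same starting noise; a different seed does not.
print(make_noise(42) == make_noise(42))  # True
print(make_noise(42) == make_noise(43))  # False
```

Because the starting noise is identical between runs, any difference in the output can be attributed to the setting that was changed, such as the step count or the upscaler.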

💡CLIP Text Encoder

A CLIP Text Encoder is a component of the AI model that processes textual prompts. The script mentions it being used for both positive and negative prompts, indicating its role in interpreting the textual descriptions to influence the image generation.

💡K Sampler

The K Sampler in the script refers to the ComfyUI node (KSampler) that runs the sampling process: starting from a noisy latent image, it removes noise step by step, guided by the model and the prompts. It receives the positive and negative prompts as inputs, which makes it central to shaping the final image.
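The hand-off between the base and refiner samplers can be written down as two sets of settings. The sketch below uses plain dictionaries; the field names are assumed to mirror the KSampler Advanced node, and the CFG and seed values are placeholders, since the video summary does not state them. Only the 25 total steps and the hand-off at step 20 come from the tutorial.

```python
TOTAL_STEPS = 25  # from the tutorial
BASE_STEPS = 20   # 80% of the total

# Values shared by both samplers. cfg and noise_seed are placeholders.
shared = {"steps": TOTAL_STEPS, "cfg": 7.0, "noise_seed": 1234}

# Base stage: adds the initial noise and stops early, keeping leftover noise.
base_sampler = {
    **shared,
    "add_noise": "enable",
    "start_at_step": 0,
    "end_at_step": BASE_STEPS,
    "return_with_leftover_noise": "enable",
}

# Refiner stage: continues from the base latent without adding new noise.
refiner_sampler = {
    **shared,
    "add_noise": "disable",
    "start_at_step": BASE_STEPS,
    "end_at_step": TOTAL_STEPS,
    "return_with_leftover_noise": "disable",
}

print(base_sampler["end_at_step"], refiner_sampler["start_at_step"])  # 20 20
```

The point of the layout is that the refiner picks up exactly where the base stops, so both samplers share the same total step count and only the start and end steps differ.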

💡VAE Decoder

VAE stands for Variational Autoencoder, and a VAE Decoder is used to transform the latent space representation of an image into a pixel representation that can be visually interpreted. The script describes its role in the final stage of the image refinement process.

💡Discord Community

The Discord Community mentioned in the script refers to a group of users who interact and collaborate on the Discord platform. The creator acknowledges the help from the community, particularly Winston Wolf, in the live stream and script development.

Highlights

Introduction to the best settings for upscaling image quality with SDXL.

Review of yesterday's live stream discussing image upscaling and quality settings.

Comprehensive tutorial on achieving amazing results with SDXL.

Use of ComfyUI for experimenting with different methods and setting up automatic testing.

Introduction of multiple instances of prompts for defining image quality and style.

Explanation of positive and negative prompts for image attributes like detailed photo, wide angle, and realistic 8K UHD.

Discussion of the model loaders for the base model and the refiner model.

Explanation of the total steps and steps on base model, highlighting the 80-20 ratio.

Details on image size, CFG scale, and seed settings.

Description of the base rendering process and its components.

Introduction to the refiner stage and its simplified setup.

Explanation of the latent image and its role in the rendering process.

Process of using a double upscale for image refinement.

Comparison of the 4x NMKD Siax upscaler and the 4x UltraSharp upscaler.

Recommendation to download and experiment with various upscaling models.

Advice on joining the official Stable Diffusion Discord for inspiration and community challenges.

Offer to download the presenter's image for automatic loading in ComfyUI.

Final encouragement to like the video and join the community.