How to UPSCALE with Stable Diffusion. The BEST approaches.

Next Tech and AI
3 Dec 2023 · 22:16

TLDR: The video tutorial explains how to upscale images generated by StableDiffusion 1.5 models, which usually produce lower resolutions than StableDiffusion XL. It installs the epicRealism model and the Superscale upscaler, then walks through several upscaling methods and their advantages: the high-res fix, the 'nearest' and 'ESRGAN' upscalers, the SD upscale and 'Ultimate SD upscale' scripts, and ControlNet. It concludes with a comparison and recommends ESRGAN for quick results, the SD upscale script or Ultimate SD upscale for detailed images, and ControlNet for the best and most stable results, especially for repeated upscaling.

Takeaways

  • 🔍 StableDiffusion 1.5 has improved custom models that offer similar or better quality than StableDiffusion XL but with better performance and less memory usage.
  • 📈 The main limitation of these custom models is the resolution, typically trained with sizes of 512x512 or 768x768.
  • 🎨 Upscaling is a solution to increase the resolution of images generated by custom models.
  • 📚 Models for StableDiffusion 1.5 can be found on platforms like CivitAI and HuggingFace, with descriptions and usage advice.
  • 📁 To upscale, one can use the Superscale algorithm, which is a 4x upscaler and can be installed via a specific URL.
  • 🖼️ The epicRealism model is used for generating images, and parameters such as sampling steps, sampling method, height, and CFG scale are crucial for the generation process.
  • ⚙️ High-res fix can be used for upscaling, but it may have limitations depending on the GPU VRAM.
  • 🔧 For a simple upscale without high detail, the nearest pixel algorithm can be used, but for better quality, ESRGAN is recommended.
  • 🧩 The image-to-image tab in the UI can be used to improve upscaling by adjusting parameters and using the SD upscale script.
  • 📱 The Superscale upscaler can provide more detailed results when used with the tile size set to 512x512.
  • 📈 For even better results, the Ultimate SD upscale script can be used, which provides more detail and is more stable for multiple upscaling iterations.
  • 🔧 ControlNet is a neural network that can be used for upscaling, providing stable and detailed results, suitable for repeated upscaling tasks.

Q & A

  • What are some of the benefits of using StableDiffusion 1.5 models over StableDiffusion XL?

    -StableDiffusion 1.5 models offer improved customization and performance, deliver similar or better quality, and require much less memory than StableDiffusion XL.

  • Where can one typically find the specialized and improved models for StableDiffusion 1.5?

    -The specialized and improved models for StableDiffusion 1.5 can typically be found at CivitAI and HuggingFace.

  • What is the main challenge with the custom models for StableDiffusion 1.5?

    -The main challenge with the custom models for StableDiffusion 1.5 is the resolution, as they are usually trained with 512x512 or 768x768.

  • How can one upscale a picture generated by a StableDiffusion 1.5 custom model?

    -One can upscale a picture generated by a StableDiffusion 1.5 custom model by using different upscaling methods, such as ESRGAN, Superscale, and ControlNet, and comparing the results to see the advantages of each method.

  • What is the epicRealism model used for in this context?

    -The epicRealism model is used to generate pictures with high realism, and it is one of the models that can be used for upscaling purposes.

  • What are the steps to install the Superscale upscaler?

    -To install the Superscale upscaler, one needs to download it from the provided URL and move the file to the models/ESRGAN directory of the StableDiffusion installation, replacing an existing file of the same name if necessary.

  • How does the sampling method and steps affect the generation of images with StableDiffusion 1.5?

    -The sampling method and steps determine the quality and speed of image generation. A higher number of sampling steps (e.g., above 20) and a suitable sampling method can improve the quality of the generated images.

  • What is the purpose of the high-res fix in the context of upscaling?

    -The high-res fix is an upscaler that can be used to improve the resolution of generated images, despite its name suggesting a fix for high-resolution issues.

  • What are the recommended parameters for using the epicRealism model with upscaling?

    -The recommended parameters include setting the sampling steps to 25, CFG scale to 5, and using a denoising strength around 0.2 for upscaling by 2 or 4 times.

  • How does the ControlNet upscaler differ from other methods mentioned?

    -ControlNet is a neural network that can be used for various tasks, including upscaling. It provides more stability in the results and allows for repeated upscaling without significant loss in quality.

  • What are the advantages of using the ultimate SD upscale script over simpler methods?

    -The ultimate SD upscale script provides more detail in the upscaled images, making it suitable for situations where high levels of detail are required.

Outlines

00:00

🖼️ Introduction to Upscaling with Custom StableDiffusion Models

The paragraph contrasts StableDiffusion XL, which has a high memory requirement, with custom models for StableDiffusion 1.5, which offer comparable or better image quality with improved performance and lower memory usage. The main challenge with these custom models is their lower output resolution, since they are usually trained at 512x512 or 768x768, and the video addresses this through upscaling techniques. The paragraph also mentions the availability of these models on platforms like CivitAI and HuggingFace, and provides a brief guide on installing the epicRealism model and an upscaler called Superscale. The process of generating an image using the epicRealism model is outlined, including the need to adjust parameters such as sampling steps, height, and CFG scale.

05:04

🔍 Exploring Upscaling Techniques and Parameters

This paragraph delves into the details of upscaling an image generated by a custom model. It advises caution when using certain prompts due to the potential for the model to generate images with insufficient clothing details. The paragraph discusses the use of the high-res fix as an upscaler, with options to choose different upscale modules like ESRGAN. It also touches on the importance of setting the denoise strength and the limitations faced when working with limited GPU memory. The narrative then shifts to a simpler upscaling method, resizing the image by 4 times using the nearest pixel method and ESRGAN for better results. It concludes with a discussion on using the image-to-image tab for further upscaling, emphasizing the need to adjust parameters and prompts for optimal results.

10:07

📈 Advanced Upscaling with SD Upscale Script and Superscale

The paragraph focuses on advanced upscaling techniques using the SD upscale script and the Superscale upscaler. It emphasizes the importance of setting the correct parameters, including the sampling method, sampling steps, CFG, and denoising levels. The Superscale upscaler is chosen for its ability to generate more detailed and less pixelated images. The paragraph also discusses the benefits of using tile sizes that are compatible with the default settings of Stable Diffusion 1.5, which allows for efficient use of VRAM and the generation of upscaled images even on graphics cards with lower memory. The selection of the Superscale upscaler and the generation of the upscaled picture concludes the section, highlighting the improved detail and quality of the upscaled image.

15:17

🎨 Refining Upscaling with ControlNet for Ultimate Results

This section introduces ControlNet as a superior upscaling solution, which is a neural network capable of various tasks, including defining a person's pose. The paragraph outlines the installation process of ControlNet, which is similar to that of the ultimate upscaler. It also discusses the necessity of downloading specific models from HuggingFace and placing them in the appropriate directory for upscaling. The narrative then guides through the process of using ControlNet for upscaling, emphasizing the stability of the results and the potential for repeated upscaling without significant quality loss. The paragraph concludes with a comparison of the results obtained from using ControlNet and the ultimate SD upscale script, highlighting the increased detail and stability of the ControlNet output.

20:24

📊 Summarizing Upscaling Methods and Recommendations

The final paragraph summarizes the different upscaling methods discussed in the video. It recommends using ESRGAN for quick results without the need for detail, the SD upscale script or the ultimate SD upscale script for more detailed images, and ControlNet for the best results or when multiple upscaling iterations are required. The paragraph also provides a visual comparison of the upscaling results with and without the use of a script, emphasizing the superior detail retention and stability of the ControlNet method. It concludes with a call to action, encouraging viewers to like or comment if the video was helpful for upscaling their images.
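The recommendations above can be written down as a small decision helper. This is only a sketch of the summary's rule of thumb; the function name and its two flags are hypothetical and not part of any tool shown in the video.

```python
def recommend_upscaler(need_detail: bool, need_best_or_repeated: bool) -> str:
    """Map the video's closing recommendations to a method name."""
    if need_best_or_repeated:
        # Best and most stable results, or several upscaling passes in a row.
        return "ControlNet"
    if need_detail:
        # More detail than a plain upscaler can add.
        return "SD upscale script or Ultimate SD upscale"
    # Quick result where extra detail is not needed.
    return "ESRGAN"


for detail, best in [(False, False), (True, False), (True, True)]:
    print(f"detail={detail}, best/repeated={best} -> {recommend_upscaler(detail, best)}")
```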

Keywords

💡StableDiffusion XL

StableDiffusion XL is a high-resolution model used for generating images from text prompts. It is known for its ability to understand short prompts and produce high-quality images. However, the video suggests that newer models for StableDiffusion 1.5 offer similar or better quality with improved performance and less memory usage, making them more efficient despite typically having lower native resolutions.

💡Upscale

Upscaling is the process of increasing the resolution of an image while maintaining or enhancing its quality. In the context of the video, upscaling is a solution to the lower native resolutions of the newer StableDiffusion models. It involves using various techniques and tools to enlarge images generated by these models to higher resolutions without significant loss of detail or quality.
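To make the scale factors concrete, the following sketch computes the output size for the 2x and 4x factors mentioned in the video, starting from a 512x512 image. The helper name is hypothetical. The pixel count grows with the square of the factor, which is one plausible reason why upscaling a whole image at once can exhaust GPU memory.

```python
def upscaled_size(width: int, height: int, factor: int) -> tuple[int, int]:
    """Return the output dimensions after scaling both sides by `factor`."""
    return width * factor, height * factor


base = (512, 512)  # a typical StableDiffusion 1.5 training resolution
for factor in (2, 4):
    w, h = upscaled_size(*base, factor)
    ratio = (w * h) // (base[0] * base[1])
    print(f"{factor}x -> {w}x{h}, {ratio}x as many pixels")
```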

💡Custom Models

Custom models refer to specialized versions of StableDiffusion that have been improved for specific tasks or performance characteristics. These models are often trained with lower resolutions like 512x512 or 768x768 and are available through platforms like CivitAI and HuggingFace. The video emphasizes the importance of reading the usage advice and parameters provided with these models to ensure optimal results.

💡EpicRealism Model

The EpicRealism model is a specific custom model used in the video for generating images. It is chosen for its ability to produce realistic images. The video demonstrates how to install and use this model, including setting specific parameters like sampling steps and CFG scale for generating images before upscaling.
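The generation settings can be collected in one place, as in the sketch below. The video sets these values in the WebUI itself; the field names here are an assumption modeled on the WebUI's optional HTTP API for text-to-image requests and should be checked against your installation before use. The prompt, the sampler name "Euler a", and the 512x512 size are placeholders, not values taken from the video. The snippet only builds and prints the settings and does not contact a server.

```python
import json


def build_txt2img_payload(prompt: str, sampler_name: str, steps: int = 25,
                          cfg_scale: float = 5, width: int = 512,
                          height: int = 512) -> dict:
    """Collect generation settings; steps=25 and cfg_scale=5 follow the video."""
    return {
        "prompt": prompt,
        "sampler_name": sampler_name,  # assumed field name, placeholder value
        "steps": steps,                # the model advice asks for more than 20
        "cfg_scale": cfg_scale,
        "width": width,                # placeholder size
        "height": height,
    }


payload = build_txt2img_payload(
    prompt="portrait photo of a woman",  # hypothetical prompt
    sampler_name="Euler a",              # hypothetical sampler choice
)
print(json.dumps(payload, indent=2))
```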

💡ESRGAN

ESRGAN stands for Enhanced Super-Resolution Generative Adversarial Networks. It is an upscaler used in the video to improve the resolution of images. ESRGAN uses deep learning to intelligently upscale images, resulting in smoother and more detailed results compared to simpler methods like nearest-neighbor upscaling.
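For contrast, nearest-neighbor upscaling can be sketched in a few lines of plain Python. Each source pixel is simply repeated, so no new detail is created and edges turn blocky, which is why the video prefers a learned upscaler such as ESRGAN. The function below is an illustration on a tiny grayscale grid, not the WebUI's implementation.

```python
def nearest_upscale(pixels: list[list[int]], factor: int) -> list[list[int]]:
    """Upscale a 2D grid by repeating each pixel `factor` times in both directions."""
    out = []
    for row in pixels:
        # Repeat every value horizontally.
        wide_row = [value for value in row for _ in range(factor)]
        # Repeat the widened row vertically (copy so rows stay independent).
        out.extend(list(wide_row) for _ in range(factor))
    return out


tiny = [[0, 255],
        [255, 0]]
for row in nearest_upscale(tiny, 2):
    print(row)
```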

💡Superscale

Superscale is an upscaler tool mentioned in the video that is used to increase the resolution of images. It is one of the options for upscaling images generated by StableDiffusion models. The video discusses downloading and using the 4x Superscale version for upscaling images to achieve higher quality results.
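The installation step amounts to moving the downloaded file into the models/ESRGAN directory of the WebUI installation. A minimal sketch, assuming that layout: the install location and the file name below are hypothetical, and the demo runs in a temporary directory so it does not touch a real installation.

```python
import shutil
import tempfile
from pathlib import Path


def install_upscaler(downloaded: Path, webui_root: Path) -> Path:
    """Move a downloaded upscaler file into <webui_root>/models/ESRGAN."""
    target_dir = webui_root / "models" / "ESRGAN"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / downloaded.name
    if target.exists():
        target.unlink()  # replace an existing file of the same name
    shutil.move(str(downloaded), str(target))
    return target


# Demo in a throwaway directory with a placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "stable-diffusion-webui"         # hypothetical install location
    download = Path(tmp) / "4x_superscale_example.pth"  # hypothetical file name
    download.write_bytes(b"placeholder")
    installed = install_upscaler(download, root)
    print(installed.relative_to(root))
```

After moving the file, the WebUI usually needs a restart or a refresh before the new upscaler appears in its selection lists.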

💡ControlNet

ControlNet is a neural network used for various tasks, including upscaling images with more stability and detail. It is presented in the video as one of the best upscaling solutions when compared to other methods. ControlNet is used to upscale images based on the source image alone, providing a more stable result that can be used for multiple upscaling iterations.

💡Sampling Steps

Sampling steps refer to the number of iterations used in the process of generating an image from a model. In the context of the video, a higher number of sampling steps (e.g., 25) is recommended for generating images with the EpicRealism model to achieve better quality before upscaling.

💡CFG Scale

CFG Scale (classifier-free guidance scale) is a parameter that controls how closely the generated image follows the text prompt. Higher values make the model adhere to the prompt more strictly, while lower values leave it more freedom. The video uses a value of 5 for the EpicRealism model before applying upscaling techniques.

💡Denoising Strength

Denoising strength is a parameter that controls how much the source image is changed when it is reprocessed, for example during image-to-image upscaling. The video suggests a low value (around 0.2) so that the upscaled image stays close to the original; higher values let the model repaint more of the picture, which can alter or lose details of the source.

💡Tile Size

Tile size determines the dimensions at which the image is divided for processing, especially important for upscaling. In the video, a tile size of 512x512 is used, which is the default for StableDiffusion 1.5 and allows for efficient use of graphics cards with lower memory, facilitating the upscaling of images in smaller segments.
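The effect of tiling can be estimated with simple arithmetic. The sketch below counts how many 512x512 tiles cover an upscaled image; it is a simplification that ignores the overlap or padding a real upscale script may add between tiles, and the helper name is hypothetical.

```python
import math


def tile_grid(width: int, height: int, tile: int = 512) -> tuple[int, int, int]:
    """Return (columns, rows, total tiles) needed to cover the image."""
    cols = math.ceil(width / tile)
    rows = math.ceil(height / tile)
    return cols, rows, cols * rows


# A 512x512 image upscaled 4x becomes 2048x2048.
cols, rows, total = tile_grid(2048, 2048)
print(f"{cols} x {rows} grid, {total} tiles of 512x512")
```

Each tile is processed at the resolution the model was trained on, so peak memory depends on the tile size rather than on the full output size.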

Highlights

StableDiffusion 1.5 has improved custom models that offer the same or better quality than StableDiffusion XL with better performance and less memory usage.

The primary challenge with these custom models is their lower resolution, typically trained with 512x512 or 768x768.

Upscaling is a solution to increase the resolution of images generated by custom models.

Different upscaling methods will be compared to showcase their advantages.

StableDiffusion 1.5 models are available at CivitAI and HuggingFace with usage descriptions and parameters.

The epicRealism model is used for demonstration, emphasizing the importance of reading model advice for proper parameter settings.

Installing the model involves downloading the model's safetensors file and placing it in the model directory of the StableDiffusion installation.

An upscaler is recommended for use with the model, with a hint provided in the model description.

The Superscale 4x upscaler is identified as a necessary component for the upscaling process.

The Automatic1111 WebUI is used for the tutorial, applicable across Windows, Linux, and macOS.

Parameters for the epicRealism model include sampling steps above 20, specific sampling methods, and CFG scale adjustments.

The generation process with the custom model is faster compared to StableDiffusion XL.

High-res fix is mentioned as an upscaling option, but caution is advised due to potential memory issues.

Simple upscaling can be achieved by sending the image to the Extras tab for resizing.

Nearest pixel upscaling is shown to be less effective, with ESRGAN recommended for better results.

The image-to-image tab is used for more detailed upscaling, with parameters adjusted to match image generation settings.

The Superscale upscaler is chosen for its detail preservation capabilities.

The Ultimate SD upscale script is introduced for even more detailed results.

ControlNet is highlighted as the best upscaling solution, offering stability and detail.

ESRGAN is recommended for quick upscaling without the need for detail, while SD upscale scripts are better for detailed results.

ControlNet is suggested for the best results or when multiple upscaling iterations are required.

The final comparison shows the superiority of ControlNet in upscaling for detail and stability.