Stable Diffusion Beginner Tutorial: Getting Started with AI Painting in 1 Hour

10 Sept 2023 · 62:11

TL;DR: The video offers an in-depth tutorial on getting started with Stable Diffusion, a popular AI image-generation tool. It covers the system requirements, the installation process, and essential features such as model selection, prompt usage, and the application of Lora and VAE for enhancing image quality and style. The tutorial also shares tips for optimizing the generation process, including the CLIP skip (termination layer) setting, iteration steps, and seed values for reproducible results. By following the guide, users can create high-resolution, stylistically coherent images with Stable Diffusion.


  • 🎨 The video introduces two AI painting tools, MidJourney and Stable Diffusion: the former is user-friendly but paid, while the latter is open-source and free but has a steeper learning curve.
  • 💻 To run Stable Diffusion, CPU requirements are modest, but 16 GB+ of RAM and an NVIDIA GPU with at least 8 GB of VRAM (preferably an RTX 3060 or higher) are recommended.
  • 🔧 The video provides a detailed tutorial on installing the Stable Diffusion launcher on a Windows PC, including downloading the installation package and setting up the necessary dependencies.
  • 🖌️ The importance of using the correct GPU is emphasized, as using an AMD GPU can lead to slow rendering and overheating issues due to reliance on CPU computation.
  • 📂 Adequate disk space is crucial, with a suggestion of reserving over 100GB for the installation package and various models, which can range from 2GB to 5GB or more.
  • 🔍 The video demonstrates how to download and install models from websites such as Civitai (known as "C站" in the Chinese community) and domestic sites like LiblibAI (哩布哩布) and TusiArt (吐司art), and how to switch between them within the Stable Diffusion WebUI.
  • 🌟 The role of models (Checkpoints) in determining the style of the generated images is discussed, with recommendations on specific models for different styles like realism and cartoonish looks.
  • 📝 The concept of positive and negative prompts (正负向提示词) is introduced as the core of Stable Diffusion, with explanations of how to use them to guide the AI toward the desired image.
  • 🔄 The video covers the use of prompt plugins like 'prompt all-in-one' to simplify the process of adding common attributes to the generated images, such as clothing and accessories.
  • 🌐 The necessity of translating prompts from Chinese to English for Stable Diffusion to understand and execute them correctly is highlighted.
  • 🎭 The tutorial concludes with a brief introduction to Lora and VAE, explaining their functions in refining the style and color palette of the generated images, and how to download and apply them.

Q & A

  • What are the two AI painting tools mentioned in the script, and what are their main differences?

    -The two AI painting tools mentioned are MidJourney and Stable Diffusion. MidJourney is known for its low entry barrier and good image quality but requires payment. Stable Diffusion, on the other hand, is open-source and free but has a steeper learning curve.

  • What are the system requirements for using Stable Diffusion for drawing?

    -For Stable Diffusion, CPU requirements are minimal, since computation relies mainly on the GPU. An NVIDIA GPU is recommended, preferably an RTX 3060 or above with at least 8 GB of VRAM. Memory should be 16 GB or more, and at least 100 GB of disk space is advised to accommodate the installation package and additional models.

  • What happens if you use an AMD GPU with Stable Diffusion?

    -Stable Diffusion can still run with an AMD GPU, but drawing will be significantly slower: computation falls back to the CPU, generating considerable heat and potentially reaching temperatures as high as 98 degrees Celsius.

  • How can one download the Stable Diffusion launcher?

    -The Stable Diffusion launcher can be downloaded by searching for 'SD launcher' on Bilibili and finding a video by the author Qiu Ye. The video description or comment section should contain a link to a cloud storage service, such as Kuaiku, where the installation package can be downloaded.

  • What is the role of the 'prompt' in Stable Diffusion?

    -The 'prompt', also known as the '提示词' in Chinese, is a core element in Stable Diffusion. It is the input that guides the AI in generating the desired image. It can include positive and negative prompts to specify what to include or exclude in the image.

  • What are the recommended models for beginners to download for Stable Diffusion?

    -Two recommended models for beginners are '麦橘写实' (a realistic human style) and '大颗寿司' (a cartoon style). These models can be downloaded from websites like LiblibAI or TusiArt.

  • How can you switch between different models in Stable Diffusion?

    -To switch between models, go to the WebUI interface, click on the 'Stable Diffusion model' option at the top left, and select the desired model from the dropdown list. The model must be downloaded and installed in the correct directory within the SD installation path for it to appear in the list.

  • What is the significance of 'Lora' in the context of Stable Diffusion?

    -Lora is a feature in Stable Diffusion that allows users to input a specific style directly into the system without having to write a complex prompt. It is particularly useful when words are not sufficient to describe the desired style or when the user wants to emulate a specific aesthetic seen in a set of images.

  • How do you install and use 'Lora' in Stable Diffusion?

    -Lora models can be downloaded from websites like LiblibAI and installed in the 'models/Lora' folder within the SD WebUI directory. Once installed, a Lora can be selected in the Lora section of the WebUI, which adds its tag to the prompt field. The user can then adjust the Lora's weight and generate an image with the applied style.
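In the A1111-style WebUI, selecting a Lora inserts a tag of the form `<lora:name:weight>` into the prompt. A minimal sketch of helpers that build such tags (the function names are illustrative, not part of any API):

```python
def lora_tag(name: str, weight: float = 1.0) -> str:
    """Format an <lora:name:weight> tag as used in SD WebUI prompts."""
    return f"<lora:{name}:{weight:g}>"

def apply_lora(prompt: str, name: str, weight: float = 1.0) -> str:
    """Append a Lora tag to an existing prompt, comma-separated."""
    tag = lora_tag(name, weight)
    return f"{prompt}, {tag}" if prompt else tag
```

For example, `apply_lora("1girl, smile", "blindbox", 0.8)` yields `1girl, smile, <lora:blindbox:0.8>`; lowering the weight below 1 softens the Lora's influence on the output.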

  • What is 'VAE' and how does it affect the image generation in Stable Diffusion?

    -VAE, or Variational Autoencoder, is a feature that acts as a filter or color adjuster in Stable Diffusion. It can enhance the brightness and vividness of the generated images. Whether to use VAE or not depends on the base model being used, and the model's author will often provide recommendations on its usage.

  • How can you save and download the generated images in Stable Diffusion?

    -To save a generated image, click the 'Save' button in the WebUI interface. The image will be saved to the 'outputs' folder within the SD WebUI directory. From there, it can be easily accessed and copied to other locations on the computer.



🎨 Introduction to AI Painting and Stable Diffusion

The speaker discusses their recent exploration into AI painting, focusing on two prominent tools: MidJourney and Stable Diffusion. They highlight the ease of use and cost of MidJourney, contrasting it with the open-source, free, yet more complex Stable Diffusion. The speaker shares their experience learning Stable Diffusion and introduces an hour-long tutorial aimed at getting beginners started with the tool. They proceed to explain the hardware requirements for running Stable Diffusion, emphasizing the importance of a powerful NVIDIA GPU and a minimum of 16 GB of RAM. The speaker also advises on the necessary disk space and shares their personal computer configuration, highlighting the speed at which their setup can generate images.


🔧 Installation and Setup of Stable Diffusion

The speaker guides the audience through the process of downloading and installing the Stable Diffusion launcher, using a Windows computer as an example. They explain how to find the installation package on Bilibili and extract it. The speaker then walks through the steps of installing the launcher's dependencies and running the application. They also demonstrate how to fix common issues, such as the browser not opening the WebUI automatically, by manually entering the URL. The speaker emphasizes the importance of downloading and installing models for Stable Diffusion, recommending specific websites and models for beginners.


🖌️ Understanding and Using Models in Stable Diffusion

The speaker delves into the concept of models in Stable Diffusion, explaining that they define the 'painting style' of the generated images. They discuss where to download models, how to install them, and how to switch between them within the WebUI. The speaker recommends specific models for different styles and provides detailed instructions on placing the downloaded models in the correct directory. They also touch on the impact of models on the generated images, using examples to illustrate the difference in styles.


📝 The Role of Prompts in Stable Diffusion

The speaker explains the critical role of prompts (known as '提示词' in Chinese) in Stable Diffusion, detailing how they guide the generation of images. They cover the use of positive and negative prompts to include or exclude specific elements from the images. The speaker also discusses the syntax of prompts, including the use of parentheses and colons to adjust the weight of certain words. They introduce the concept of prompt plugins and demonstrate how to use them to enhance the image generation process.
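The weight syntax can be illustrated with two toy helpers. These assume the A1111 WebUI attention rules, where `(token:1.3)` sets an explicit weight and each plain pair of parentheses multiplies attention by 1.1; the helper names are my own, not anything from the video:

```python
def weighted(token: str, weight: float) -> str:
    """Render a prompt token with A1111-style attention weight.
    (token:1.3) raises emphasis; weights below 1 de-emphasize."""
    if weight == 1.0:
        return token
    return f"({token}:{weight:g})"

def nested_weight(depth: int) -> float:
    """Each layer of plain parentheses multiplies attention by 1.1,
    so ((token)) is roughly equivalent to (token:1.21)."""
    return round(1.1 ** depth, 4)
```

So `weighted("blue sky", 1.3)` produces `(blue sky:1.3)`, and the explicit-colon form is usually preferred over stacking parentheses because the weight is stated exactly.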


🌟 Advanced Prompt Techniques and Features

The speaker continues the discussion on prompts, focusing on advanced techniques and additional features within the Stable Diffusion interface. They explain the use of universal prompts, the importance of prompt syntax, and the functionality of prompt plugins. The speaker also introduces the concept of 'Lora' as a tool to enhance the precision of image generation when words alone are not sufficient. They provide a brief overview of how Lora works and its significance in achieving specific styles.


🎭 Utilizing Lora for Style Consistency

The speaker elaborates on the use of Lora, a feature that allows for consistent styling in generated images. They demonstrate how to download and install Lora models and integrate them into the Stable Diffusion workflow. The speaker also explains the importance of using a compatible base model with Lora and the concept of 'trigger words' that activate specific features within a Lora model. They discuss the impact of Lora weight on the final image and the process of setting the correct cover image for Lora models.


🌈 Exploring VAE for Image Colorization

The speaker introduces VAE (Variational Autoencoder) as a tool for colorization and filtering in Stable Diffusion. They explain the difference between using VAE and not using it, showcasing the enhanced vibrancy and color depth that VAE provides. The speaker guides the audience on how to download VAE models and integrate them into the system. They also discuss the recommendations from model authors regarding the use of VAE and how to select the appropriate VAE model based on the base model in use.
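Checkpoints, Loras, and VAEs each have their own destination folder in the WebUI install. The mapping below is a sketch based on the stock SD WebUI layout (verify the paths against your own installation); the helper itself is illustrative:

```python
from pathlib import Path

# Standard SD WebUI subdirectories per asset type
# (assumption based on the stock WebUI layout).
ASSET_DIRS = {
    "checkpoint": "models/Stable-diffusion",
    "lora": "models/Lora",
    "vae": "models/VAE",
    "embedding": "embeddings",
}

def install_path(sd_root: str, asset_type: str, filename: str) -> Path:
    """Return where a downloaded file should be placed so the WebUI finds it."""
    try:
        sub = ASSET_DIRS[asset_type]
    except KeyError:
        raise ValueError(f"unknown asset type: {asset_type!r}")
    return Path(sd_root) / sub / filename
```

For example, a downloaded VAE file would go to `<sd_root>/models/VAE/`, after which it becomes selectable in the WebUI's VAE dropdown.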


🛠️ Additional Features and Parameters in Stable Diffusion

The speaker covers various features and parameters in Stable Diffusion that influence the image generation process. They discuss the CLIP skip (termination layer) setting, iteration steps, and the sampling method, providing practical advice on the optimal settings for each. The speaker also explains the high-resolution fix feature and how to use it effectively to increase image quality without causing degradation. They touch on the importance of the prompt guidance coefficient (CFG scale) and the random seed in achieving desired, reproducible outcomes.
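The knobs discussed in this section can be collected into one small settings object. A hedged Python sketch follows: the field names mirror common WebUI parameters and the defaults are widely used starting points, not values taken from the video:

```python
from dataclasses import dataclass, asdict

@dataclass
class GenParams:
    """Core txt2img parameters; defaults are common starting points."""
    sampler: str = "DPM++ 2M Karras"  # sampling method
    steps: int = 20          # iteration steps: quality vs. speed trade-off
    cfg_scale: float = 7.0   # prompt guidance coefficient (CFG scale)
    clip_skip: int = 2       # CLIP skip / termination layer
    seed: int = -1           # -1 -> random; fix a value to reproduce an image
    enable_hr: bool = False  # high-resolution fix for upscaling
    hr_scale: float = 2.0    # upscale factor when hires fix is enabled

    def to_payload(self) -> dict:
        """Dict form, e.g. for an API request to a WebUI started with --api."""
        return asdict(self)
```

Locking `seed` to a fixed value while varying one parameter at a time is the usual way to see what each knob actually does.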


🎖️ Final Tips and Conclusion of the Tutorial

The speaker concludes the tutorial by summarizing the key points covered and offering final tips for using Stable Diffusion effectively. They demonstrate how to save and download generated images and emphasize the importance of experimenting with different settings to achieve the desired results. The speaker also mentions that more advanced topics, such as ControlNet, will be covered in future advanced courses. They sign off, indicating that the hour-long introductory course has come to an end.




Keywords

💡AI绘画 (AI Painting)

AI绘画 (AI painting) refers to the use of artificial intelligence to create visual art. In the context of the video, it specifically refers to the use of AI tools like MidJourney and Stable Diffusion for generating digital artwork. The video discusses the features and differences between these two AI painting tools, highlighting their ease of use, cost, and capabilities in producing images.

💡Stable Diffusion

Stable Diffusion is an open-source AI model used for generating images from textual descriptions. It is one of the two AI painting tools discussed in the video and is noted for its cost-free availability and the requirement of specific hardware configurations for optimal performance.


💡Hardware Configuration

Hardware configuration refers to the specifications and components of a computer system required to run a particular software or application effectively. In the video, it is emphasized that for running Stable Diffusion, specific hardware is needed, particularly a high-performance GPU from NVIDIA and a sufficient amount of RAM.


💡Launcher

A launcher, in the context of the video, refers to the application used to initiate and manage the operation of the Stable Diffusion AI model. The video provides instructions on how to download, install, and use the launcher, which is crucial for setting up the AI painting environment.


💡Model (Checkpoint)

In the context of AI and machine learning, a model refers to the underlying algorithmic structure that is trained to perform specific tasks, such as image generation in the case of Stable Diffusion. The video discusses different models, also known as Checkpoints, that can be used within Stable Diffusion to produce different artistic styles and outcomes.


💡Prompt (提示词)

Prompt words, or '提示词' in Chinese, are the textual descriptions or keywords that users input into AI models like Stable Diffusion to guide the generation of images. These prompts are crucial as they directly influence the style and content of the AI-generated artwork.


💡Lora

Lora, as mentioned in the video, refers to a type of model enhancement used in conjunction with the Stable Diffusion AI model. It is a set of stylistic parameters that can be applied to the generated images to achieve specific visual effects or styles, effectively refining the output based on a collection of images with a unified theme.


💡VAE

VAE, or Variational Autoencoder, is a type of generative AI model used for dimensionality reduction and generating new data points in a compressed space. In the context of the video, VAE functions as a filter or color adjustment tool for the images generated by the Stable Diffusion model, enhancing their visual appeal and color vibrancy.


💡WebUI

WebUI stands for Web User Interface and refers to the graphical interface through which users interact with the Stable Diffusion AI model. The video discusses the use of the WebUI to input prompts, select models, and generate images, as well as manage various settings and options.


💡教程 (Tutorial)

The term '教程' translates to 'tutorial' in English and refers to the instructional content provided in the video. The video itself is a tutorial aimed at teaching viewers how to use the Stable Diffusion AI painting tool, from installation to generating their first images.


💡Disk Space

Disk space refers to the storage capacity available on a computer's hard drive or solid-state drive. In the context of the video, it is emphasized that users should reserve ample disk space to accommodate the installation files of the Stable Diffusion launcher and the subsequent storage of models and generated images.


Highlights

The introduction of two AI painting tools, MidJourney and Stable Diffusion, with their respective features and pricing structures.

The tutorial's aim to provide an hour-long entry-level guide to getting started with Stable Diffusion.

The importance of having an NVIDIA GPU with at least 8 GB of VRAM for optimal performance with Stable Diffusion.

The recommendation to allocate at least 100 GB of disk space for the installation and operation of Stable Diffusion and its models.

The process of downloading and installing the Stable Diffusion launcher, including the use of specific web pages and net disk addresses.

The explanation of the Stable Diffusion model, its role in determining the painting style, and the process of downloading, installing, and switching between models.

The introduction to prompt (提示词) and its significance in guiding the AI's output in Stable Diffusion.

The demonstration of how to use positive and negative prompts to refine the AI-generated images.

The utilization of prompt syntax, such as parentheses and colons, to adjust the weight of prompts in Stable Diffusion.

The installation and use of the prompt all-in-one plugin to enhance the efficiency of inputting prompts.

The explanation of other prompt-related functions, including the use of preset styles and the import/export of prompt data.

The introduction to Lora and its role in capturing and applying specific styles to AI-generated images.

The process of downloading, installing, and using Lora models to achieve desired artistic styles.

The distinction between Lora's native and plugin forms, and their respective installation directories and usage methods.

The importance of using the correct base model when applying Lora to ensure the best results.

The demonstration of how to use VAE (Variational Autoencoder) as a filter or color adjustment tool in Stable Diffusion.

The recommendation to use the DPM++ 2M Karras sampling method for the best results in image generation.

The explanation of the CLIP skip (termination layer) setting and its impact on the quality of generated images.

The discussion on the iteration steps and their effect on the balance between image quality and generation speed.

The strategy for achieving high-resolution images without degradation using the high-resolution fix feature.

The significance of the prompt guidance coefficient (CFG scale) in controlling how closely the AI adheres to the input prompts.

The use of the random seed to recreate or vary AI-generated images, and the method to lock or reset the seed value.

The overview of additional features and parameters in Stable Diffusion, such as saving and downloading generated images.