【Stable Diffusion】LoRA Training ("Alchemy"): The Only Model-Training Guide You Need
TLDR
AI小王子's tutorial walks through how to train a LoRA model: understanding the difference between a checkpoint and a LoRA, the factors that affect image quality, why image selection matters, the training steps, and the parameter settings. It also covers GPU requirements, software installation, the training process itself, and post-training optimization, all aimed at helping users build their own AI model or IP character.
Takeaways
- 📝 Training a LoRA model starts with understanding how it differs from a checkpoint: a LoRA is like a design draft that can be applied on top of different base models.
- 🖌️ Rendered image quality is shaped by four factors: the checkpoint, the LoRA, the prompt keywords, and the parameters. The checkpoint drives the overall style, while the LoRA handles the details.
- 🏗️ A LoRA can be trained for many categories of subject, but what matters most is image selection quality: facial expressions, composition, character features, and overall image quality.
- 🔍 Aim for variety when selecting images: different expressions, angles, and backgrounds. Higher-resolution images are preferable, but be mindful of the longer processing time.
- 📸 Training images can come from your own photos, anime or movie screenshots, or renders from 3D software.
- 🤖 Training a LoRA requires a certain number of images: at least 15 for a simple subject, and at least 100 for a complex subject such as a building.
- 🚦 Training steps and epochs (training rounds) should be adjusted to the number and category of images so the AI learns the subject thoroughly.
- 💻 LoRA training has GPU requirements: Nvidia cards are recommended, and the training resolution should match your VRAM.
- 🛠️ Before installing the training software, make sure the dependencies are met: the correct Python version, Git, and Visual Studio.
- 🔧 Before training, preprocess the images: tag them and create mirrored copies so the AI can better understand and learn the image content.
- 📈 During training, monitor the loss rate and other metrics, and use tools such as TensorBoard to analyze training progress.
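The step-and-epoch arithmetic behind these takeaways can be sketched in a few lines (a minimal sketch of the kohya-style repeat convention; the repeat counts used below are illustrative assumptions, not fixed rules):

```python
def total_training_steps(num_images: int, repeats_per_image: int,
                         epochs: int, batch_size: int = 1) -> int:
    """Estimate total optimizer steps for a LoRA run.

    kohya-style trainers walk through each image `repeats_per_image`
    times per epoch, and a batch of N images counts as one step.
    """
    steps_per_epoch = (num_images * repeats_per_image) // batch_size
    return steps_per_epoch * epochs

# 15 images x 100 repeats x 1 epoch = 1500 steps,
# the lower bound suggested for a simple character subject
print(total_training_steps(15, 100, 1))                 # 1500
print(total_training_steps(15, 100, 4, batch_size=2))   # 3000
```

Raising the batch size reduces the step count proportionally, which is why steps and epochs have to be tuned together rather than in isolation.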
Q & A
What roles do LoRA and checkpoint play in AI model training?
-A LoRA can be thought of as a design draft or a designer's assistant, also called a small model; it can be applied on top of different base models. A checkpoint has the biggest influence on the overall style; think of it as the designer itself. Checkpoint files are large, typically over 2GB.
What do keywords and parameters represent when training a LoRA model?
-Keywords can be understood as the client's brief: they decide the direction and content of the model's output. Parameters are like the designer's experience and skill: they shape the model's behavior and final results.
What factors should be considered when selecting training images?
-Consider variety in facial expressions, composition, character features, backgrounds, scenes, and lighting, as well as image quality. Higher resolution is better, but it trades off against processing time.
How can you obtain high-quality training images?
-You can capture HD screenshots from anime, movies, or games; draw or photograph images yourself; or generate them with tools such as Midjourney or Stable Diffusion.
What image count and training steps are recommended when training a LoRA model?
-For a simple subject such as a character LoRA, at least 15 images and 1,500 to 6,000 training steps; for a complex subject such as a building or scene, at least 100 photos and correspondingly more steps.
What are the GPU requirements for training a LoRA model?
-An Nvidia GPU is preferred. AMD GPUs can work, but they are more error-prone and slower during training. The GPU and its VRAM determine the training resolution.
What dependencies must be in place before installing Kohya_ss?
-Make sure Python is version 3.10, Git is installed locally, and Visual Studio is installed.
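A quick sanity check for these dependencies might look like this (a hypothetical helper, not part of the Kohya_ss installer):

```python
import shutil
import sys

def check_dependencies() -> dict:
    """Report whether the local environment matches Kohya_ss's needs."""
    return {
        # Kohya_ss expects Python 3.10; other versions often break setup.
        "python_3_10": sys.version_info[:2] == (3, 10),
        # Git must be on PATH so the install/update scripts can run.
        "git": shutil.which("git") is not None,
    }

print(check_dependencies())
```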
How do you preprocess images with the stable diffusion webui?
-In the stable diffusion webui, open the image preprocessing feature, set the source and destination directories, then set the per-folder step count based on the number of images and your training needs.
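Kohya-style trainers read the per-folder repeat count from the folder name itself, in the form `<repeats>_<concept>` (e.g. `100_myface`). A small sketch of that convention, with hypothetical folder and concept names:

```python
import os

def make_concept_folder(root: str, repeats: int, concept: str) -> str:
    """Create a kohya-style image folder named '<repeats>_<concept>'.

    The leading number tells the trainer how many times each image
    in the folder is repeated per epoch.
    """
    path = os.path.join(root, f"{repeats}_{concept}")
    os.makedirs(path, exist_ok=True)
    return path

folder = make_concept_folder("train_data", 100, "myface")
print(os.path.basename(folder))  # 100_myface
```

Separate folders with different repeat counts let you weight parts of the dataset (e.g. face close-ups vs. full-body shots) differently within one run.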
How do you set the training parameters when training a LoRA in Kohya_ss?
-In the Kohya_ss GUI, pick a preset configuration file, set the source model and the training parameters, including batch size, epochs, and learning rate, and adjust the other related parameters as needed.
How do you evaluate the LoRA model after training?
-Watch the loss rate during training, use TensorBoard to inspect training status and model behavior, and ultimately judge the model by the quality of the images it generates.
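The raw loss curve is noisy; TensorBoard's scalar panel makes it readable by applying an exponential moving average, roughly like this (a simplified sketch with made-up loss values):

```python
def smooth_losses(losses, weight=0.9):
    """Exponential moving average, similar to the smoothing slider in
    TensorBoard's scalar panel. Higher weight = smoother curve."""
    smoothed, last = [], losses[0]
    for x in losses:
        last = weight * last + (1 - weight) * x
        smoothed.append(last)
    return smoothed

raw = [0.30, 0.10, 0.25, 0.08, 0.20, 0.07]  # noisy per-step losses
print([round(v, 3) for v in smooth_losses(raw)])
```

Judge the trend of the smoothed curve, not individual spikes; a curve that has flattened out is one signal (among others) that further steps add little.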
How do you pick the best model from several trained LoRAs?
-Use a comparison script together with the additional networks extension: vary the weight, observe the results at each weight, and pick the LoRA that best fits your needs.
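In the webui, a LoRA is invoked in the prompt with the `<lora:name:weight>` tag, so a weight comparison boils down to sweeping that number. A sketch (the model name is a placeholder):

```python
def lora_weight_sweep(lora_name: str, weights) -> list:
    """Build webui-style '<lora:name:weight>' prompt tags, one per
    weight, for an X/Y comparison across LoRA strengths."""
    return [f"<lora:{lora_name}:{w}>" for w in weights]

tags = lora_weight_sweep("myface_v1", [0.4, 0.6, 0.8, 1.0])
for t in tags:
    print(t)   # e.g. <lora:myface_v1:0.4>
```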
Outlines
🧙 Introduction to AI Modeling and Lora Training
The paragraph introduces the concept of AI modeling, specifically focusing on the training of Lora models. It discusses the common issues faced when training models and offers solutions to simplify complex parameters. The speaker, AI小王子, explains the difference between Lora and checkpoint models, their applications, and the factors that influence the quality of rendered images, such as checkpoint, Lora, keywords, and parameters. The importance of selecting diverse and high-quality images for training is emphasized, along with the idea of training Lora as a design draft that can be applied to various models.
🛠️ Training Steps and Image Selection
This section delves into the specifics of training Lora models, highlighting the importance of image selection for effective training. It outlines the need for a variety of facial expressions, compositions, character traits, and lighting conditions in the images used for training. The speaker provides guidelines on the number of images required for different types of subjects and the optimal training steps per image. The paragraph also discusses the use of different types of images, such as personal photos, screenshots from anime, and images generated from AI models like Stable Diffusion or 3D rendering software.
🔧 Software Requirements and Installation Process
The speaker outlines the necessary software and plugins required for Lora training, including Kohya_ss, additional networks plugin, and CUDNN for Nvidia GPUs. Detailed instructions are provided for the installation and setup process, including the required Python version, Git installation, and Visual Studio setup. The paragraph also covers the configuration process for Kohya_ss, including the command to execute in the terminal and the steps to follow for the Accelerate config setup. The speaker ensures that the audience understands the importance of each step and provides solutions for common issues that may arise during installation.
📸 Image Preprocessing and Tagging
This segment focuses on the image preprocessing and tagging process using Stable Diffusion's web interface. The speaker explains how to create folders for images, logs, and models, and how to name them appropriately. The process of creating image subsets for different parts of the model, such as body, head, and clothing, is discussed. The speaker also talks about the importance of determining the number of training steps per image based on the total number of images and the overall training steps. The use of blip and Deepbooru for tagging images is explained, with a preference for blip due to its sentence-based tagging for better AI comprehension.
🖥️ Configuration and Training Parameters
The paragraph discusses the configuration and training parameters in Kohya_ss for LoRA model training. It covers the selection of base models, the importance of choosing the correct version for compatibility, and the customization of folders for input images, output models, and logs. The speaker explains the regularization process to prevent overfitting and the naming of the output model. The paragraph provides a comprehensive overview of the training parameters, including the selection of learning rates, batch size, epochs, and the use of various learning-rate schedules such as constant, warm-up, and cosine annealing.
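The learning-rate schedules named here (constant, warm-up, cosine annealing) can be written down compactly; this is a generic sketch of the idea, not Kohya_ss's exact implementation:

```python
import math

def lr_at_step(step: int, total_steps: int, base_lr: float = 1e-4,
               warmup_steps: int = 0, schedule: str = "cosine") -> float:
    """Learning rate at `step` under a linear warm-up followed by
    either a constant hold or cosine annealing down to zero."""
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps     # linear warm-up ramp
    if schedule == "constant":
        return base_lr
    # cosine annealing over the remaining steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1 + math.cos(math.pi * progress))

print(lr_at_step(0, 1000, schedule="constant"))    # 0.0001
print(lr_at_step(1000, 1000))                      # 0.0 (fully annealed)
```

Warm-up protects the fresh LoRA weights from large early updates, while cosine annealing tapers the rate so late steps refine rather than overwrite what was learned.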
🎨 Advanced Configuration and Training Start
This section covers the advanced configuration settings in Kohya_ss, such as memory-efficient attention and gradient checkpointing for lower-VRAM GPUs. The speaker emphasizes the importance of enabling buckets for AI-driven image cropping and provides default values for various parameters. The process of printing the training command and starting the training is detailed, along with the expected outcomes and the interpretation of the loss rate. The speaker also mentions the use of TensorBoard for monitoring training progress and provides tips for post-training adjustments and model selection.
🚀 Conclusion and Next Steps
The speaker concludes the tutorial by encouraging the audience to practice the training of their own Lora models and to share their work for further learning and improvement. The importance of independent judgment in evaluating model quality is stressed, and the speaker expresses a desire for audience engagement through likes, subscriptions, and community interaction. The paragraph ends with an update on server benefits for premium users and a promise of more content to come.
Keywords
💡LoRA
💡炼丹 ("alchemy", slang for model training)
💡checkpoint
💡Keywords (prompts)
💡Parameters
💡Image selection
💡Training steps
💡Epoch
💡GPU requirements
💡Kohya_ss
Highlights
AI小王子 shares how to train your own LoRA model to give an AI model or IP character a more personal touch.
Training a LoRA or checkpoint makes rendered images match expectations more closely and reduces artifacts.
A checkpoint is like the designer, shaping the overall style, while a LoRA is like a design draft that can be applied on top of different base models.
Keywords and parameters play decisive roles in rendering, much like the client's brief and the designer's skill.
Image selection is critical when training a LoRA: you need varied expressions, compositions, and features, all at high quality.
The number of images and training steps depends on the complexity of the subject; a character LoRA needs at least 15 images.
GPU choice and VRAM size affect training quality and speed; Nvidia GPUs are recommended.
Introduces Kohya_ss, the recommended tool for training LoRA models, with a detailed walkthrough of its installation.
Before training a LoRA, preprocess the images: tag them and choose the training step count.
During LoRA training, adjust epochs and steps to control the number of training rounds and how deeply each image is learned.
Shows how to preprocess and tag images with the stable diffusion webui.
Explains in detail how to set and tune the training parameters in Kohya_ss.
After training, use TensorBoard to inspect training status and loss values, and learn how to pick the best model.
Provides fixes for common errors and encourages viewers to share their work in the comments or the community server.
The tutorial aims to build a deep understanding of LoRA model training and improve viewers' AI skills.
Stresses the importance of image quality and quantity when training a LoRA, and how tuning parameters optimizes results.
Shows how to use the additional networks extension for model blending and extension.
Finally, demonstrates how to compare the results of different LoRA models with a script and extensions.
AI小王子 encourages viewers to master LoRA training through practice and to share their progress and work.