【Stable Diffusion】LoRA Training ("Alchemy"): The Only Model-Training Guide You Need
TLDR
AI小王子's tutorial walks through how to train a LoRA model: understanding the difference between a checkpoint and a LoRA, the factors that affect image quality, why image selection matters, the training steps, and the parameter settings. It also covers GPU requirements, software installation, the training process itself, and post-training optimization, all aimed at helping users build their own AI model or IP character.
Takeaways
- 📝 Training a LoRA model starts with understanding how it differs from a checkpoint: a LoRA is like a design draft that can be applied on top of different base models.
- 🖌️ Rendered image quality is shaped by four factors: the checkpoint, the LoRA, the prompt keywords, and the parameters. The checkpoint drives the overall style, while the LoRA handles the details.
- 🏗️ A LoRA can be trained for many categories of subject, but what matters most is image selection quality: facial expressions, composition, character features, and overall image quality.
- 🔍 Aim for variety when selecting images: different expressions, angles, and backgrounds. Higher-resolution images are preferable, but be mindful of the longer processing time.
- 📸 Training images can come from your own photos, anime or movie screenshots, or renders from 3D software.
- 🤖 Training a LoRA requires a certain number of images: at least 15 for a simple subject, and at least 100 for a complex subject such as a building.
- 🚦 Training steps and epochs (training rounds) should be adjusted to the number and category of images so the AI learns the subject thoroughly.
- 💻 LoRA training has GPU requirements: Nvidia cards are recommended, and the training resolution should match your VRAM.
- 🛠️ Before installing the training software, make sure the dependencies are met: the correct Python version, Git, and Visual Studio.
- 🔧 Before training, preprocess the images: tag them and create mirrored copies so the AI can better understand and learn the image content.
- 📈 During training, monitor the loss rate and other metrics, and use tools such as TensorBoard to analyze training progress.
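The step-and-epoch arithmetic behind these takeaways can be sketched in a few lines (a minimal sketch of the kohya-style repeat convention; the repeat counts used below are illustrative assumptions, not fixed rules):

```python
def total_training_steps(num_images: int, repeats_per_image: int,
                         epochs: int, batch_size: int = 1) -> int:
    """Estimate total optimizer steps for a LoRA run.

    kohya-style trainers walk through each image `repeats_per_image`
    times per epoch, and a batch of N images counts as one step.
    """
    steps_per_epoch = (num_images * repeats_per_image) // batch_size
    return steps_per_epoch * epochs

# 15 images x 100 repeats x 1 epoch = 1500 steps,
# the lower bound suggested for a simple character subject
print(total_training_steps(15, 100, 1))                 # 1500
print(total_training_steps(15, 100, 4, batch_size=2))   # 3000
```

Raising the batch size reduces the step count proportionally, which is why steps and epochs have to be tuned together rather than in isolation.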
Q & A
What roles do LoRA and checkpoint play in AI model training?
-A LoRA can be thought of as a design draft or a designer's assistant, also called a small model; it can be applied on top of different base models. A checkpoint has the biggest influence on the overall style; think of it as the designer itself. Checkpoint files are large, typically over 2GB.
What do keywords and parameters represent when training a LoRA model?
-Keywords can be understood as the client's brief: they decide the direction and content of the model's output. Parameters are like the designer's experience and skill: they shape the model's behavior and final results.
What factors should be considered when selecting training images?
-Consider variety in facial expressions, composition, character features, backgrounds, scenes, and lighting, as well as image quality. Higher resolution is better, but it trades off against processing time.
How can you obtain high-quality training images?
-You can capture HD screenshots from anime, movies, or games; draw or photograph images yourself; or generate them with tools such as Midjourney or Stable Diffusion.
What image count and training steps are recommended when training a LoRA model?
-For a simple subject such as a character LoRA, at least 15 images and 1,500 to 6,000 training steps; for a complex subject such as a building or scene, at least 100 photos and correspondingly more steps.
What are the GPU requirements for training a LoRA model?
-An Nvidia GPU is preferred. AMD GPUs can work, but they are more error-prone and slower during training. The GPU and its VRAM determine the training resolution.
What dependencies must be in place before installing Kohya_ss?
-Make sure Python is version 3.10, Git is installed locally, and Visual Studio is installed.
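A quick sanity check for these dependencies might look like this (a hypothetical helper, not part of the Kohya_ss installer):

```python
import shutil
import sys

def check_dependencies() -> dict:
    """Report whether the local environment matches Kohya_ss's needs."""
    return {
        # Kohya_ss expects Python 3.10; other versions often break setup.
        "python_3_10": sys.version_info[:2] == (3, 10),
        # Git must be on PATH so the install/update scripts can run.
        "git": shutil.which("git") is not None,
    }

print(check_dependencies())
```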
How do you preprocess images with the stable diffusion webui?
-In the stable diffusion webui, open the image preprocessing feature, set the source and destination directories, then set the per-folder step count based on the number of images and your training needs.
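Kohya-style trainers read the per-folder repeat count from the folder name itself, in the form `<repeats>_<concept>` (e.g. `100_myface`). A small sketch of that convention, with hypothetical folder and concept names:

```python
import os

def make_concept_folder(root: str, repeats: int, concept: str) -> str:
    """Create a kohya-style image folder named '<repeats>_<concept>'.

    The leading number tells the trainer how many times each image
    in the folder is repeated per epoch.
    """
    path = os.path.join(root, f"{repeats}_{concept}")
    os.makedirs(path, exist_ok=True)
    return path

folder = make_concept_folder("train_data", 100, "myface")
print(os.path.basename(folder))  # 100_myface
```

Separate folders with different repeat counts let you weight parts of the dataset (e.g. face close-ups vs. full-body shots) differently within one run.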
How do you set the training parameters when training a LoRA in Kohya_ss?
-In the Kohya_ss GUI, pick a preset configuration file, set the source model and the training parameters, including batch size, epochs, and learning rate, and adjust the other related parameters as needed.
How do you evaluate the LoRA model after training?
-Watch the loss rate during training, use TensorBoard to inspect training status and model behavior, and ultimately judge the model by the quality of the images it generates.
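The raw loss curve is noisy; TensorBoard's scalar panel makes it readable by applying an exponential moving average, roughly like this (a simplified sketch with made-up loss values):

```python
def smooth_losses(losses, weight=0.9):
    """Exponential moving average, similar to the smoothing slider in
    TensorBoard's scalar panel. Higher weight = smoother curve."""
    smoothed, last = [], losses[0]
    for x in losses:
        last = weight * last + (1 - weight) * x
        smoothed.append(last)
    return smoothed

raw = [0.30, 0.10, 0.25, 0.08, 0.20, 0.07]  # noisy per-step losses
print([round(v, 3) for v in smooth_losses(raw)])
```

Judge the trend of the smoothed curve, not individual spikes; a curve that has flattened out is one signal (among others) that further steps add little.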
How do you pick the best model from several trained LoRAs?
-Use a comparison script together with the additional networks extension: vary the weight, observe the results at each weight, and pick the LoRA that best fits your needs.
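In the webui, a LoRA is invoked in the prompt with the `<lora:name:weight>` tag, so a weight comparison boils down to sweeping that number. A sketch (the model name is a placeholder):

```python
def lora_weight_sweep(lora_name: str, weights) -> list:
    """Build webui-style '<lora:name:weight>' prompt tags, one per
    weight, for an X/Y comparison across LoRA strengths."""
    return [f"<lora:{lora_name}:{w}>" for w in weights]

tags = lora_weight_sweep("myface_v1", [0.4, 0.6, 0.8, 1.0])
for t in tags:
    print(t)   # e.g. <lora:myface_v1:0.4>
```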
Outlines
🧙 Introduction to AI Modeling and Lora Training
The paragraph introduces the concept of AI modeling, specifically focusing on the training of Lora models. It discusses the common issues faced when training models and offers solutions to simplify complex parameters. The speaker, AI小王子, explains the difference between Lora and checkpoint models, their applications, and the factors that influence the quality of rendered images, such as checkpoint, Lora, keywords, and parameters. The importance of selecting diverse and high-quality images for training is emphasized, along with the idea of training Lora as a design draft that can be applied to various models.
🛠️ Training Steps and Image Selection
This section delves into the specifics of training Lora models, highlighting the importance of image selection for effective training. It outlines the need for a variety of facial expressions, compositions, character traits, and lighting conditions in the images used for training. The speaker provides guidelines on the number of images required for different types of subjects and the optimal training steps per image. The paragraph also discusses the use of different types of images, such as personal photos, screenshots from anime, and images generated from AI models like Stable Diffusion or 3D rendering software.
🔧 Software Requirements and Installation Process
The speaker outlines the necessary software and plugins required for Lora training, including Kohya_ss, additional networks plugin, and CUDNN for Nvidia GPUs. Detailed instructions are provided for the installation and setup process, including the required Python version, Git installation, and Visual Studio setup. The paragraph also covers the configuration process for Kohya_ss, including the command to execute in the terminal and the steps to follow for the Accelerate config setup. The speaker ensures that the audience understands the importance of each step and provides solutions for common issues that may arise during installation.
📸 Image Preprocessing and Tagging
This segment focuses on the image preprocessing and tagging process using Stable Diffusion's web interface. The speaker explains how to create folders for images, logs, and models, and how to name them appropriately. The process of creating image subsets for different parts of the model, such as body, head, and clothing, is discussed. The speaker also talks about the importance of determining the number of training steps per image based on the total number of images and the overall training steps. The use of blip and Deepbooru for tagging images is explained, with a preference for blip due to its sentence-based tagging for better AI comprehension.
🖥️ Configuration and Training Parameters
The paragraph discusses the configuration and training parameters in Kohya_ss for LoRA model training. It covers the selection of base models, the importance of choosing the correct version for compatibility, and the customization of folders for input images, output models, and logs. The speaker explains the regularization process to prevent overfitting and the naming of the output model. The paragraph provides a comprehensive overview of the training parameters, including the selection of learning rates, batch size, epochs, and the use of various learning-rate schedules such as constant, warm-up, and cosine annealing.
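The learning-rate schedules named here (constant, warm-up, cosine annealing) can be written down compactly; this is a generic sketch of the idea, not Kohya_ss's exact implementation:

```python
import math

def lr_at_step(step: int, total_steps: int, base_lr: float = 1e-4,
               warmup_steps: int = 0, schedule: str = "cosine") -> float:
    """Learning rate at `step` under a linear warm-up followed by
    either a constant hold or cosine annealing down to zero."""
    if warmup_steps and step < warmup_steps:
        return base_lr * step / warmup_steps     # linear warm-up ramp
    if schedule == "constant":
        return base_lr
    # cosine annealing over the remaining steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1 + math.cos(math.pi * progress))

print(lr_at_step(0, 1000, schedule="constant"))    # 0.0001
print(lr_at_step(1000, 1000))                      # 0.0 (fully annealed)
```

Warm-up protects the fresh LoRA weights from large early updates, while cosine annealing tapers the rate so late steps refine rather than overwrite what was learned.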
🎨 Advanced Configuration and Training Start
This section covers the advanced configuration settings in Kohya_ss, such as memory-efficient attention and gradient checkpointing for lower-VRAM GPUs. The speaker emphasizes the importance of enabling buckets for AI-driven image cropping and provides default values for various parameters. The process of printing the training command and starting the training is detailed, along with the expected outcomes and the interpretation of the loss rate. The speaker also mentions the use of TensorBoard for monitoring training progress and provides tips for post-training adjustments and model selection.
🚀 Conclusion and Next Steps
The speaker concludes the tutorial by encouraging the audience to practice the training of their own Lora models and to share their work for further learning and improvement. The importance of independent judgment in evaluating model quality is stressed, and the speaker expresses a desire for audience engagement through likes, subscriptions, and community interaction. The paragraph ends with an update on server benefits for premium users and a promise of more content to come.
Keywords
💡LoRA
💡炼丹 ("alchemy", slang for model training)
💡checkpoint
💡Keywords (prompts)
💡Parameters
💡Image selection
💡Training steps
💡Epoch
💡GPU requirements
💡Kohya_ss
Highlights
AI小王子 shares how to train your own LoRA model to give an AI model or IP character a more personal touch.
Training a LoRA or checkpoint makes rendered images match expectations more closely and reduces artifacts.
A checkpoint is like the designer, shaping the overall style, while a LoRA is like a design draft that can be applied on top of different base models.
Keywords and parameters play decisive roles in rendering, much like the client's brief and the designer's skill.
Image selection is critical when training a LoRA: you need varied expressions, compositions, and features, all at high quality.
The number of images and training steps depends on the complexity of the subject; a character LoRA needs at least 15 images.
GPU choice and VRAM size affect training quality and speed; Nvidia GPUs are recommended.
Introduces Kohya_ss, the recommended tool for training LoRA models, with a detailed walkthrough of its installation.
Before training a LoRA, preprocess the images: tag them and choose the training step count.
During LoRA training, adjust epochs and steps to control the number of training rounds and how deeply each image is learned.
Shows how to preprocess and tag images with the stable diffusion webui.
Explains in detail how to set and tune the training parameters in Kohya_ss.
After training, use TensorBoard to inspect training status and loss values, and learn how to pick the best model.
Provides fixes for common errors and encourages viewers to share their work in the comments or the community server.
The tutorial aims to build a deep understanding of LoRA model training and improve viewers' AI skills.
Stresses the importance of image quality and quantity when training a LoRA, and how tuning parameters optimizes results.
Shows how to use the additional networks extension for model blending and extension.
Finally, demonstrates how to compare the results of different LoRA models with a script and extensions.
AI小王子 encourages viewers to master LoRA training through practice and to share their progress and work.