Introducing Control-Net and its basic usage! Pose characters freely, or let the AI handle only the coloring! [Stable Diffusion]

9 Mar 2023 · 13:58

TLDR: The video introduces Control-Net, a breakthrough technology by lllyasviel that greatly simplifies generating character poses in the Stable Diffusion web UI. It walks through installing Mikubill's "SD-WebUI-ControlNet" extension and the accompanying model files from Hugging Face, then demonstrates the Open Pose function for pose reproduction and the CannyEdge function for line-art extraction. It emphasizes the efficiency gains for character design and potential applications in fields such as game development and VTuber content creation.


  • 🚀 The release of Control-Net by lllyasviel in February 2023 revolutionized the way poses are generated for characters, making the process significantly easier than previous methods.
  • 🤖 Mikubill's 'SD-WebUI-ControlNet' is an expansion that allows the use of Control-Net directly on the web UI, streamlining the process further.
  • 📋 The installation process involves downloading and installing the extension from GitHub, followed by the model files from Hugging Face, totaling around 6GB.
  • 🖌️ Control-Net's Open Pose function enables the reproduction of a pose from a simple stick-figure or an image, extracting the pose without the need for complex 3D software.
  • 🎨 The CannyEdge function is a line-extraction tool that can turn a rough source image into detailed line art, which can then be used for coloring and character design.
  • 🔄 Detected maps, which are the extracted stick-figure images, can be saved and used for reference, enhancing the efficiency of character design and illustration.
  • 🏠 For background and building designs, functions like MLSD, Depth, and Segmentation are particularly useful, offering detailed extraction for various purposes.
  • 👗 The Normal Map and Depth functions pair well with 3D tools such as Blender, helping carry complex surface detail and a sense of depth into 2D illustrations.
  • 🎭 Control-Net has a wide range of applications, from game development and character design to VTuber content creation and Live2D sample generation.
  • 🌐 The script provides a comprehensive guide on installing and operating Control-Net, highlighting its potential to change the landscape of image generative AI.

Q & A

  • What was the main challenge in generating character poses before the introduction of Control-Net?

    -Before Control-Net, generating character poses required either writing elaborate prompt 'spells' to coax out a pose, or building the pose in 3D drawing software and feeding the render through image-to-image, both of which were troublesome workflows.

  • What is Control-Net and how did it change the process of pose generation?

    -Control-Net is a revolutionary technology released by lllyasviel in February 2023 that simplifies the process of pose generation. It allows users to easily obtain a desired pose without complex methods like writing prompt 'spells' or using 3D drawing software.

  • How can one install and use Mikubill's 'SD-WebUI-ControlNet'?

    -To install Mikubill's 'SD-WebUI-ControlNet', open the Mikubill GitHub page, copy its URL, and paste it into the 'Install from URL' tab in the web UI's Extensions section. After installation, restart the web UI so the Control-Net script is loaded.

  • What is the significance of the 'Models' folder in the Control-Net installation process?

    -The 'Models' folder is crucial as it contains the necessary model files for the Control-Net to function. Users must download these files from Hugging Face and place them in the 'Models' folder within the Extensions folder in the Web UI Install folder.
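The folder layout described above can be sketched in a few lines. The root directory name here is an assumption (a typical local web UI install) and may differ on your machine:

```python
from pathlib import Path

# Assumed web UI install root -- adjust to your own setup.
webui_root = Path("stable-diffusion-webui")

# The extension lives under extensions/, and the model files
# downloaded from Hugging Face go into its models/ subfolder.
extension_dir = webui_root / "extensions" / "sd-webui-controlnet"
models_dir = extension_dir / "models"

print(models_dir.as_posix())
```

If the model files end up anywhere else, the extension's model dropdown will simply show nothing, which is the most common installation mistake.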

  • How does the 'Open Pose' function of Control-Net work?

    -The 'Open Pose' function allows users to reproduce a pose from an image. It can extract the pose from a stick-figure or an image and reflect it in the generated result, effectively allowing for the recreation of specific poses in the final image.
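As a toy illustration of the stick-figure input that Open Pose consumes, the sketch below rasterizes a made-up skeleton onto a blank canvas. The joint coordinates and bone list are invented for the example; the real extension draws a colored OpenPose skeleton with many more keypoints:

```python
import numpy as np

def draw_bone(canvas, p0, p1):
    """Rasterize one bone (line segment between two joints) in white."""
    n = int(max(abs(p1[0] - p0[0]), abs(p1[1] - p0[1]))) + 1
    xs = np.linspace(p0[0], p1[0], n).round().astype(int)
    ys = np.linspace(p0[1], p1[1], n).round().astype(int)
    canvas[ys, xs] = 255

# Invented joint positions (x, y) for a minimal figure.
joints = {
    "head": (16, 4), "hip": (16, 16),
    "l_hand": (6, 10), "r_hand": (26, 10),
    "l_foot": (10, 28), "r_foot": (22, 28),
}
bones = [("head", "hip"), ("head", "l_hand"), ("head", "r_hand"),
         ("hip", "l_foot"), ("hip", "r_foot")]

canvas = np.zeros((32, 32), dtype=np.uint8)
for a, b in bones:
    draw_bone(canvas, joints[a], joints[b])
```

Feeding an image like this to Control-Net with the Open Pose model (and no pre-processor) constrains the generated character to the drawn pose.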

  • What is the role of the 'CannyEdge' function in the Control-Net?

    -The 'CannyEdge' function is a line extraction tool that can create a strong sense of line art in the generated image. It can extract lines from a reference image and use them to guide the AI in generating a result based on the line art.
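What the Canny pre-processor does can be approximated in a few lines of NumPy. This is a deliberately simplified gradient-magnitude edge detector, not the actual Canny algorithm (which adds smoothing, non-maximum suppression, and hysteresis thresholding):

```python
import numpy as np

def edge_map(gray, threshold=0.5):
    """Simplified edge extraction via gradient magnitude -- a rough
    stand-in for the Canny pre-processor: bright pixels mark lines."""
    gx = np.zeros(gray.shape, dtype=float)
    gy = np.zeros(gray.shape, dtype=float)
    gx[:, 1:-1] = gray[:, 2:] - gray[:, :-2]  # horizontal gradient
    gy[1:-1, :] = gray[2:, :] - gray[:-2, :]  # vertical gradient
    mag = np.hypot(gx, gy)
    return (mag > threshold * mag.max()).astype(np.uint8)

# Toy image: dark left half, bright right half -> one vertical edge.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = edge_map(img)
```

The resulting binary map plays the role of the line art that guides generation: the AI keeps the extracted lines and fills in everything else.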

  • How does the 'Pre-processor' component interact with the 'Model' in the Control-Net?

    -The 'Pre-processor' works in conjunction with the 'Model' to preprocess the input image, extracting necessary features like poses or line art. Depending on the image used, different pre-processors like 'Open Pose' or 'CannyEdge' are required to prepare the image for the model to generate the desired output.
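The 'set' relationship can be expressed as a small lookup. The pairings follow the video's explanation; the model names are illustrative placeholders, since actual file names vary by release:

```python
# Which pre-processor to run, keyed by what the input image already is.
PREPROCESSOR_FOR_INPUT = {
    "stick_figure": "none",        # pose already extracted, no pre-processing
    "photo_with_pose": "openpose",  # extract the pose first
    "line_art_source": "canny",     # extract the lines first
}

# Illustrative pre-processor -> model pairings (file names vary by release).
MODEL_FOR_PREPROCESSOR = {
    "none": "control_openpose",     # a stick figure still uses the pose model
    "openpose": "control_openpose",
    "canny": "control_canny",
}

def controlnet_settings(input_kind):
    """Return the (pre-processor, model) pair for a given input type."""
    pre = PREPROCESSOR_FOR_INPUT[input_kind]
    return pre, MODEL_FOR_PREPROCESSOR[pre]
```

The key point the table encodes: the model always matches the kind of guidance map, while the pre-processor is only needed when that map must first be extracted from an ordinary image.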

  • What is the purpose of the 'Detected Map' and how can it be saved?

    -The 'Detected Map' is the pre-processor's output, such as the stick figure or line art extracted from an image. Users can enable the 'allow auto-saving' option in Control-Net's settings tab to automatically save these detected maps for future use.
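Reusing a detected map simply means writing the extracted image to disk. A minimal sketch, using the plain ASCII PGM format so no imaging library is needed (the extension itself saves PNGs):

```python
import numpy as np

def save_pgm(path, img):
    """Write a grayscale uint8 array as an ASCII PGM image file."""
    h, w = img.shape
    rows = "\n".join(" ".join(str(v) for v in row) for row in img)
    with open(path, "w") as f:
        f.write(f"P2\n{w} {h}\n255\n{rows}\n")

# Pretend the middle row is a line the pre-processor extracted.
detected_map = np.zeros((4, 4), dtype=np.uint8)
detected_map[1, :] = 255
save_pgm("detected_map.pgm", detected_map)
```

A saved map can later be fed back in with the pre-processor set to 'none', skipping re-extraction entirely.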

  • How can the Control-Net be beneficial for character design and game development?

    -The Control-Net can significantly aid in character design by allowing designers to quickly generate poses and line art, thus streamlining the design process. In game development, it can be used to rapidly prototype character designs and backgrounds, making iterative changes easier and more efficient.

  • What are some other functions of the Control-Net besides 'Open Pose' and 'CannyEdge'?

    -In addition to 'Open Pose' and 'CannyEdge', Control-Net has functions like MLSD for straight-line extraction, Normal Map for detecting surface unevenness, Depth for extracting image depth, and others like Holistically Nested Edge Detection, Pixel Difference Network, and Fake Scribble for various image-processing tasks.
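To give a feel for what the Depth and Normal Map inputs encode, here is a toy conversion from a depth image to unit surface normals via finite differences. This is a conceptual sketch only, not the pre-processor's actual estimator:

```python
import numpy as np

def depth_to_normals(depth):
    """Toy normal map from a depth image: flat areas yield normals
    pointing straight at the camera (0, 0, 1); slopes tilt them."""
    depth = depth.astype(float)
    dz_dx = np.gradient(depth, axis=1)
    dz_dy = np.gradient(depth, axis=0)
    normals = np.dstack((-dz_dx, -dz_dy, np.ones_like(depth)))
    # Normalize each pixel's vector to unit length.
    return normals / np.linalg.norm(normals, axis=2, keepdims=True)

flat = depth_to_normals(np.zeros((4, 4)))
```

In practice these maps let Control-Net preserve the 3D structure of a scene (walls, folds of clothing, furniture) while the prompt changes its appearance.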

  • How does the Control-Net represent a revolution in image generative AI?

    -The Control-Net represents a revolution in image generative AI by providing precise control over the generation process, allowing users to accurately trace and reproduce specific elements from reference images. This level of control was not previously possible with traditional image-to-image methods.



🚀 Introduction to Control-Net and Mikubill's SD-WebUI-ControlNet

This paragraph introduces the revolutionary Control-Net technology released by lllyasviel in February 2023, which simplifies the process of generating poses for characters. It contrasts this with the traditional, more cumbersome methods of achieving desired poses, such as writing prompt 'spells' or using 3D drawing software. The speaker then presents Mikubill's SD-WebUI-ControlNet, an extension that makes Control-Net usable directly in the web UI. The paragraph gives a step-by-step guide to downloading, installing, and applying Control-Net: accessing GitHub, using the extension tab in the web UI, and installing the necessary models from Hugging Face. The technology is then demonstrated with sample images, generating an image from a stick figure via the Open Pose function.


📚 Understanding the Control-Net's Pre-Processor and Model Relationship

This paragraph delves into the relationship between the pre-processor and the model within the Control-Net framework. It uses the analogy of ordering at a ramen shop to explain how additional specifications (like 'more harder or more oily') can influence the generation result. The paragraph clarifies that when using a stick-figure, no pre-processor is needed, but when using an image, a pre-processor is essential to extract the pose. The pre-processor and model are considered a 'set' depending on the type of image used. The paragraph also introduces the CannyEdge function, a line extraction tool, and explains how it can be used to save a 'detected map' for further use. The capabilities of the Control-Net are further highlighted by demonstrating how it can be used to generate images from line art, offering a new level of efficiency in character design and painting.


🎨 Exploring Additional Functions of the Control-Net

This paragraph surveys additional Control-Net functions: MLSD for straight-line extraction, Normal Map for detecting surface unevenness, Depth for extracting image depth, and Holistically Nested Edge Detection (HED) for outlines with varying line strength. It also covers Pixel Difference Network, a crisper variant of HED, and Fake Scribble for turning photos into scribble-style illustrations. The paragraph emphasizes the versatility of these functions for character illustrations, background designs, and material design. It concludes by highlighting the ease of use and the significant impact Control-Net has on the creative process, especially for designers and Live2D VTuber creators, by enabling efficient generation of poses and designs that were previously challenging to achieve.




💡Control-Net

Control-Net is a revolutionary technology introduced in the video that simplifies the process of generating images with specific poses. It allows users to achieve desired poses more easily than traditional methods, which involved writing 'spells' or using 3D drawing software. The technology is a breakthrough in the field of image generative AI, as it can extract the outline from a target image and reflect it in the generation result, acting like an additional order to the prompt.

💡Mikubill's SD-WebUI-ControlNet

Mikubill's 'SD-WebUI-ControlNet' is an expansion that enables the use of Control-Net directly on the web UI. It is a tool that allows users to install and run Control-Net models within the web interface, making the process of generating images with specific characteristics more accessible and user-friendly. This tool is showcased in the video as a means to download, install, and apply the Control-Net technology for image generation.

💡Hugging Face

Hugging Face is a platform mentioned in the video where users can access and download various models for the Control-Net. It serves as a repository for different types of models, including 'WebUI ControlNet Module SafeTensors', which are essential for the functionality of the Control-Net. Users can search for 'Control Net' on Hugging Face and download the necessary files to enhance their image generation capabilities.

💡Open Pose

Open Pose is a function of the Control-Net that enables the reproduction of a pose from an image. It is particularly useful for generating images where the pose is a critical element. The Open Pose function can read the pose from a sample image, such as a stick-figure or an illustration, and reflect it in the generated image, allowing for accurate pose reproduction.


💡CannyEdge

CannyEdge is another representative function of the Control-Net that focuses on line extraction. It is used to generate images with a strong sense of line art by extracting the lines from a reference image. This function is particularly useful for creating detailed illustrations or for applying line art to the generated images, enhancing the artistic quality of the output.


💡Pre-processor

A pre-processor in the context of the Control-Net is a function that performs pre-processing on the input image before it is used for generation. This includes tasks such as extracting lines or poses from the image. The pre-processor is essential for preparing the image so that the Control-Net can accurately reflect the desired features in the generated image.


💡Model

In the context of the video, a model refers to the specific algorithm or set of instructions used by the Control-Net to generate images. Different models are designed for different purposes, such as reproducing poses, extracting lines, or generating images based on line art. The models are integral to the Control-Net's ability to produce varied and complex images based on user input.

💡Detected Map

A detected map is an output generated by the Control-Net that represents the extracted features from the input image, such as lines or edges. This map serves as a guide for the AI to generate images that accurately reflect the original image's characteristics. The detected map is a crucial intermediate step in the process of using the Control-Net to create images with specific attributes.

💡Installation Procedure

The installation procedure refers to the steps required to set up and use the Control-Net on the web UI. This includes downloading and installing the necessary tools, such as Mikubill's 'SD-WebUI-ControlNet', and models from platforms like Hugging Face. The procedure ensures that the user has all the required components to utilize the Control-Net effectively.

💡Character Design

Character design is the process of creating the visual appearance and characteristics of a character for various forms of media, such as illustrations, animations, or games. In the context of the video, character design is facilitated by the Control-Net, which allows for the efficient generation of character poses and features, making the design process more accessible and streamlined.

💡Image Generation

Image generation refers to the process of creating new images using AI technology, such as the Control-Net. This process involves inputting specific instructions or characteristics, which the AI then uses to produce original images that match the given criteria. Image generation is a central theme of the video, as it showcases how the Control-Net can be used to generate images with specific poses, line art, and other features.


Introduction of Control-Net, a revolutionary technology for easier pose generation in character design.

Control-Net allows users to generate poses without complex methods such as writing 'spells' or using 3D drawing software.

Mikubill's "SD-WebUI-ControlNet" is an expansion that simplifies the use of Control-Net on web UI.

Installation process of the local AUTOMATIC1111 web UI and how the extension integrates with it.

Accessing Mikubill's GitHub page for the installation of the Control-Net extension.

The process of installing the Control-Net extension through the web UI's extension tab.

Explanation of the Open Pose function, which reproduces poses from an image.

Demonstration of pose reproduction using a stick-figure as input.

Introduction to the CannyEdge function, which extracts line art from an image.

Setting up the system to save the 'detected map' for further use.

Use of CannyEdge to generate images with a strong sense of line art.

The ability to color line art without breaking its structure, aiding in character design efficiency.

Explanation of the pre-processor and model relationship in the Control-Net, and their roles in pose reproduction.

Overview of additional functions of the Control-Net, including MLSD, Normal Map, Depth, and more.

The impact of Control-Net on the ease of character design and its potential in game development.

The practical application of Control-Net in VTuber content creation and Live2D.

The significance of Control-Net in revolutionizing image generative AI for character and background design.

Conclusion on the benefits of Control-Net for artists and designers, and its potential to streamline creative processes.