ComfyUI Workflow Build Text2Img + Latent Upscale + Model Upscale | ComfyUI Basics | Stable Diffusion

11 Jun 2024 23:38

TLDR: This tutorial video guides viewers through building a basic text-to-image workflow from scratch in ComfyUI, using the Stable Diffusion AUTOMATIC1111 web UI as a point of comparison. It covers adding a checkpoint node, prompt sections, and generation settings. The video then extends the workflow with latent upscale and model upscale stages and shows how to chain multiple LoRA nodes for more detailed image generation. The host invites feedback and suggestions for future tutorials.


  • The video tutorial explains how to build a basic text-to-image workflow from scratch in ComfyUI.
  • It also covers how to extend the workflow with latent upscale and model upscale stages for improved image quality.
  • The first step in building the workflow is to add a Load Checkpoint node, which supplies the model for the rest of the workflow.
  • The tutorial introduces the positive and negative prompt sections of the workflow, which guide the image generation.
  • The video demonstrates three methods to add nodes to the workflow: right-clicking, double-clicking, and dragging from the Load Checkpoint node.
  • The KSampler node is highlighted as the node that generates images from the prompts and settings.
  • The script describes how to connect nodes for image generation, including the width and height settings for the output.
  • Adding a LoRA node to the workflow is shown as a way to enhance the image generation process.
  • Duplicating nodes, such as LoRA, is explained as a way to build more complex image generation setups.
  • The tutorial includes a section on creating a latent upscale workflow, which involves additional nodes and connections for image enhancement.
  • The final part of the script discusses the model upscale workflow, showing how to further refine the image resolution and details.

Q & A

  • What is the main topic of the tutorial video?

    -The main topic of the tutorial video is building a basic text to image workflow on ComfyUI from scratch and enhancing it with latent and model upscaling techniques.

  • What are the two simple ways to get nodes in ComfyUI as mentioned in the video?

    -The two simple ways to get nodes in ComfyUI are: right-clicking anywhere on the blank space and selecting 'Add node', and double-clicking anywhere on the blank space to open a search bar where you can type in the node name.

  • What is the purpose of the 'positive prompt' and 'negative prompt' sections in the workflow?

    -The 'positive prompt' and 'negative prompt' sections in the workflow are used to guide the image generation process, with positive prompts encouraging desired features and negative prompts discouraging undesired ones.

  • How can you add a 'KSampler' node in ComfyUI?

    -You can add a 'KSampler' node in ComfyUI by double-clicking on the blank space, typing 'KSampler' in the search bar, and selecting 'KSampler' from the results.

  • What is the role of the 'latent upscale' in the workflow?

    -The 'latent upscale' in the workflow is used to enhance the resolution and details of the generated image by upscaling the latent representation of the image before decoding it back into a pixel space.

  • How can you connect multiple 'LoRA' nodes in the workflow?

    -You can connect multiple 'LoRA' nodes by duplicating the 'LoRA' node and then connecting the model and clip outputs from the previous 'LoRA' node to the corresponding inputs of the next 'LoRA' node in the sequence.

  • What is the difference between 'latent upscale' and 'upscale by model' as discussed in the video?

    -The 'latent upscale' works by upscaling the latent representation of the image, while 'upscale by model' uses a specific model to upscale the final image output, typically resulting in a larger and more detailed image.

  • How does the video suggest simplifying a complex workflow in ComfyUI?

    -The video suggests using 'Reroute' nodes to simplify a complex workflow in ComfyUI, which helps in organizing and cleaning up the connections between different nodes.

  • What are some of the parameters that can be adjusted in the 'KSampler' node for image generation?

    -Some of the parameters that can be adjusted in the 'KSampler' node for image generation include the number of sampling steps, the CFG scale, the seed, the sampler, and the scheduler.

  • How does the video demonstrate changing the resolution of the generated image?

    -The video demonstrates changing the resolution of the generated image by adjusting the width and height parameters in the 'Empty Latent Image' node connected to the 'KSampler' node.

  • What is the final step in the workflow for saving the generated image?

    -The final step in the workflow for saving the generated image is to connect the output of the 'VAE decode' node to a 'save image' node, which stores the final image output.
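The answer above on chaining LoRA nodes can be sketched in ComfyUI's API-format JSON, where a link is written as a `[node_id, output_index]` pair. This is a minimal illustration under stated assumptions, not the workflow from the video: the helper `chain_loras`, the file names, and the strengths are hypothetical, and the `LoraLoader` input names are assumed to match a stock ComfyUI install.

```python
def chain_loras(graph, model_ref, clip_ref, loras, first_id=10):
    """Append one LoraLoader node per entry; return the final (model, clip) links.

    Each LoRA node takes the model and clip outputs of the previous node,
    which mirrors duplicating a LoRA node and rewiring it in the editor.
    """
    for offset, (lora_name, strength) in enumerate(loras):
        node_id = str(first_id + offset)
        graph[node_id] = {
            "class_type": "LoraLoader",
            "inputs": {
                "lora_name": lora_name,        # placeholder file name
                "strength_model": strength,
                "strength_clip": strength,
                "model": model_ref,
                "clip": clip_ref,
            },
        }
        # Output 0 is assumed to be MODEL and output 1 CLIP.
        model_ref = [node_id, 0]
        clip_ref = [node_id, 1]
    return model_ref, clip_ref


graph = {
    "1": {
        "class_type": "CheckpointLoaderSimple",
        "inputs": {"ckpt_name": "example_model.safetensors"},  # placeholder
    }
}
model_ref, clip_ref = chain_loras(
    graph,
    ["1", 0],
    ["1", 1],
    [("style_a.safetensors", 0.8), ("detail_b.safetensors", 0.5)],
)
print(model_ref, clip_ref)
```

The returned links are what the prompt nodes and the sampler would connect to, so that every LoRA in the chain affects the result.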



Building a Basic Text-to-Image Workflow on ComfyUI

The video tutorial begins with an introduction to creating a basic text-to-image workflow from scratch in ComfyUI. The instructor starts by adding a checkpoint node and demonstrates two methods for adding nodes to the workflow: right-clicking to open the 'Add node' menu, or double-clicking to open a search bar. The tutorial also compares the workflow with the Stable Diffusion AUTOMATIC1111 web UI to show why positive and negative prompt sections are needed. The prompt nodes can be added in several ways, including dragging from the Load Checkpoint node or using the search bar. The instructor then connects the prompts to the Load Checkpoint node and renames them accordingly, setting the foundation for the text-to-image generation process.
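The wiring described here can also be written down in ComfyUI's API-format JSON, where each node is an entry keyed by an ID and a link is a `[node_id, output_index]` pair. The following is a minimal sketch rather than the workflow from the video: the checkpoint file name and prompt texts are placeholders, and the class names (`CheckpointLoaderSimple`, `CLIPTextEncode`) and the `_meta` title field are assumed to match a stock ComfyUI install.

```python
import json

# Placeholder: use a file that exists in your own checkpoints folder.
CHECKPOINT_NAME = "example_model.safetensors"


def build_prompt_nodes(positive_text, negative_text):
    """Return a graph with a checkpoint loader and two renamed prompt encoders."""
    return {
        "1": {
            "class_type": "CheckpointLoaderSimple",
            "inputs": {"ckpt_name": CHECKPOINT_NAME},
        },
        # ["1", 1] means output slot 1 of node "1", assumed to be the CLIP output.
        "2": {
            "class_type": "CLIPTextEncode",
            "inputs": {"text": positive_text, "clip": ["1", 1]},
            "_meta": {"title": "Positive Prompt"},
        },
        "3": {
            "class_type": "CLIPTextEncode",
            "inputs": {"text": negative_text, "clip": ["1", 1]},
            "_meta": {"title": "Negative Prompt"},
        },
    }


graph = build_prompt_nodes("a castle on a hill, detailed", "blurry, low quality")
print(json.dumps(graph, indent=2))
```

Both prompt nodes use the same node class; only the title and the sampler input they are later connected to distinguish positive from negative.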


Enhancing the Workflow with Additional Features

This section extends the basic text-to-image workflow with a 'KSampler' node, which is crucial for image generation. The instructor explains how to add this node and connect it to the positive and negative prompts. The tutorial also covers setting the width and height of the image using an 'Empty Latent Image' node. The workflow is then tested with specific parameters, and the results are shared, showing the initial output of a stylized image. The video continues with the integration of 'LoRA' nodes to further refine the image generation process, showing how to connect these nodes and the impact they have on the final image. The instructor also demonstrates how to duplicate LoRA nodes for more complex workflows.
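A complete basic text-to-image graph, including the sampler and the empty latent image, might look like the sketch below in API-format JSON. The parameter values (20 steps, CFG 7.0, `euler` sampler, 512x768) are illustrative assumptions and not the settings used in the video; the file name is a placeholder, and `dangling_links` is a hypothetical helper added only to sanity-check the wiring.

```python
def build_text2img(positive, negative, width=512, height=768, seed=0):
    """Return an API-format graph for a basic text-to-image workflow."""
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": "example_model.safetensors"}},  # placeholder
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"text": positive, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",
              "inputs": {"text": negative, "clip": ["1", 1]}},
        # Width and height of the output are set on the empty latent image.
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": width, "height": height, "batch_size": 1}},
        "5": {"class_type": "KSampler",
              "inputs": {
                  "model": ["1", 0],
                  "positive": ["2", 0],
                  "negative": ["3", 0],
                  "latent_image": ["4", 0],
                  "seed": seed,
                  "steps": 20,
                  "cfg": 7.0,
                  "sampler_name": "euler",
                  "scheduler": "normal",
                  "denoise": 1.0,
              }},
        # The sampler output is still a latent; decode it before saving.
        "6": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
        "7": {"class_type": "SaveImage",
              "inputs": {"images": ["6", 0], "filename_prefix": "text2img"}},
    }


def dangling_links(graph):
    """List (node_id, input_name) pairs whose link points at a missing node."""
    missing = []
    for node_id, node in graph.items():
        for name, value in node["inputs"].items():
            if isinstance(value, list) and value[0] not in graph:
                missing.append((node_id, name))
    return missing


graph = build_text2img("a castle on a hill", "blurry, low quality")
print(dangling_links(graph))
```

If a local ComfyUI server is running, a graph like this is typically queued by sending `{"prompt": graph}` as JSON to its `/prompt` endpoint; that step is left out so the sketch runs without a server.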


Adding Latent Upscale and Model Upscale Techniques

The script explains how to incorporate latent upscale and model upscale techniques into the existing workflow. The 'latent upscale by' node is introduced, which takes the latent samples output by the initial workflow as its input. The instructor guides the audience through connecting the necessary nodes for upscaling, including a second 'KSampler' for the upscaled latent image and the appropriate prompts. The workflow is then simplified using 'Reroute' nodes to organize and streamline the connections. The tutorial also covers adjusting the denoising strength to control how much the upscaled image changes, comparing the original and upscaled results, and selecting the right scale factor for the desired outcome.
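The latent upscale stage can be sketched as two extra nodes in API-format JSON: an upscaler that enlarges the latent, and a second sampler that re-samples it with a denoise value below 1.0. This is an illustration under assumptions: the helper names, node IDs, scale factor 1.5, and denoise 0.5 are hypothetical, the `LatentUpscaleBy` input names are assumed to match a stock ComfyUI install, and the size calculation assumes a VAE that maps 8 pixels to 1 latent cell.

```python
def add_latent_upscale(graph, latent_ref, model_ref, positive_ref, negative_ref,
                       scale_by=1.5, denoise=0.5, seed=0):
    """Append an upscaler and a second KSampler; return the new latent link."""
    graph["20"] = {
        "class_type": "LatentUpscaleBy",
        "inputs": {
            "samples": latent_ref,              # latent from the first sampler
            "upscale_method": "nearest-exact",
            "scale_by": scale_by,
        },
    }
    graph["21"] = {
        "class_type": "KSampler",
        "inputs": {
            "model": model_ref,
            "positive": positive_ref,
            "negative": negative_ref,
            "latent_image": ["20", 0],
            "seed": seed,
            "steps": 20,
            "cfg": 7.0,
            "sampler_name": "euler",
            "scheduler": "normal",
            # Lower values stay closer to the first image; higher values change more.
            "denoise": denoise,
        },
    }
    return ["21", 0]


def upscaled_size(width, height, scale_by, vae_factor=8):
    """Pixel size after scaling the latent, assuming an 8x VAE."""
    new_w = round(width // vae_factor * scale_by) * vae_factor
    new_h = round(height // vae_factor * scale_by) * vae_factor
    return new_w, new_h


# Node IDs "1", "2", "3", "5" are assumed to exist in the base text-to-image graph.
graph = {}
upscaled_ref = add_latent_upscale(graph, ["5", 0], ["1", 0], ["2", 0], ["3", 0])
print(upscaled_ref, upscaled_size(512, 768, 1.5))
```

The returned link is what a VAE decode node would take as its `samples` input to produce the upscaled image.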


Finalizing the Workflow with Model Upscale and Cleanup

The final part of the tutorial focuses on integrating the 'upscale by model' feature into the workflow. The instructor adds a node for this purpose and connects it to the image produced by the latent upscale stage. The settings for the upscale model are adjusted, and the results are compared to the original text-to-image output, highlighting the differences in resolution and quality. The workflow is then cleaned up for clarity, and the instructor guides the audience through simplifying the workflow and organizing it into groups. The video concludes with a summary of the workflow's components, including the basic text-to-image, LoRA nodes, latent upscale, and model upscale features.
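The model upscale stage works on decoded pixel images rather than latents, so in API-format JSON it sits after a VAE decode. The sketch below is illustrative only: `add_model_upscale`, the node IDs, and the upscaler file name are hypothetical, and the class and input names (`UpscaleModelLoader`, `ImageUpscaleWithModel`) are assumed to match a stock ComfyUI install.

```python
def add_model_upscale(graph, latent_ref, vae_ref, upscale_model_name):
    """Decode a latent, upscale the image with a model, and save it."""
    graph["30"] = {
        "class_type": "VAEDecode",
        "inputs": {"samples": latent_ref, "vae": vae_ref},
    }
    graph["31"] = {
        "class_type": "UpscaleModelLoader",
        "inputs": {"model_name": upscale_model_name},  # placeholder file name
    }
    graph["32"] = {
        "class_type": "ImageUpscaleWithModel",
        "inputs": {"upscale_model": ["31", 0], "image": ["30", 0]},
    }
    graph["33"] = {
        "class_type": "SaveImage",
        "inputs": {"images": ["32", 0], "filename_prefix": "model_upscale"},
    }
    return ["32", 0]


# Node "21" stands for the latent upscale sampler and node "1" for the
# checkpoint loader; both are assumed to exist in the full graph.
graph = {}
image_ref = add_model_upscale(graph, ["21", 0], ["1", 2], "example_4x_upscaler.pth")
print(image_ref)
```

The output size in this stage is determined by the upscale model itself (for example, a 4x model), not by a scale factor set in the workflow.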


Wrapping Up the Video with a Completed Workflow

In the conclusion of the video, the instructor recaps the entire process of building a comprehensive text-to-image workflow from scratch. The workflow includes a simple text-to-image setup, the addition of LoRA nodes, latent upscale techniques, and model upscale methods. The audience is encouraged to leave feedback and suggestions for future videos in the comments section. The instructor thanks viewers for watching and signs off.




ComfyUI

ComfyUI is a node-based graphical interface for building and running Stable Diffusion workflows. In the video, it is the software used to build a text-to-image workflow from scratch, which is the main theme of the tutorial.

Text-to-Image Workflow

A text-to-image workflow is a sequence of steps or processes used to generate images from textual descriptions. In the video, the creator is guiding viewers on how to build such a workflow on ComfyUI, which involves various nodes and settings to produce images based on text inputs.

Latent Upscale

Latent upscale enlarges the latent representation of an image before it is decoded back into pixel space, usually followed by a second sampling pass that refines the enlarged latent. The script adds a latent upscale stage to raise the resolution and detail of images produced by the text-to-image workflow.

Model Upscale

Model upscale refers to the process of increasing the resolution of an image using a specific model or algorithm. In the script, the term is used when discussing the final stage of the workflow where images generated by the text-to-image process are further upscaled for higher detail and resolution.

Checkpoint Node

The checkpoint node (Load Checkpoint) loads the saved Stable Diffusion model weights that the rest of the workflow depends on. The script describes how to add it and use it as the foundational element of the workflow, with the prompt nodes connected to it.

Positive Prompt

A positive prompt is a textual instruction or guide that helps direct the AI to generate a desired image. The script explains the importance of setting up a positive prompt section in the workflow to guide the image generation process positively.

Negative Prompt

A negative prompt is the opposite of a positive prompt, used to exclude certain elements or characteristics from the generated image. The video script describes setting up a negative prompt section to refine the image generation by specifying what should be avoided.


KSampler

The KSampler node runs the sampling (denoising) process that produces the latent image. It takes the positive and negative prompts as inputs, along with settings such as steps, CFG scale, seed, sampler, and scheduler, and is essential for generating images from the provided textual descriptions.

VAE Decode

VAE stands for Variational Autoencoder, and 'VAE decode' refers to the process of decoding or reconstructing an image from a compressed or encoded representation. In the script, a VAE decode node is used to convert the generated latent image into a viewable format.
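The relationship between pixel size and latent size can be made concrete with a little arithmetic. The sketch below assumes a Stable Diffusion 1.x or SDXL style VAE, which uses 4 latent channels and reduces each spatial dimension by a factor of 8; other model families may differ, and `latent_shape` is a hypothetical helper.

```python
def latent_shape(width, height, batch_size=1, channels=4, downscale=8):
    """Shape of the latent tensor for a given pixel size (batch, channels, h, w)."""
    return (batch_size, channels, height // downscale, width // downscale)


# A 512x768 image corresponds to a 64x96 latent under these assumptions.
print(latent_shape(512, 768))
```

This is why upscaling in latent space is cheap compared with working on pixels: the decode step expands each latent cell back into an 8x8 block of pixels.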


Reroute

A Reroute node passes a connection through unchanged so that long wires can be routed more tidily across the canvas. The script uses Reroute nodes to simplify the workflow and make its connections easier to follow.

Stable Diffusion

Stable Diffusion is a latent diffusion model that generates images from text descriptions. The script compares the ComfyUI workflow with the AUTOMATIC1111 web UI for Stable Diffusion, using it as a reference point for the components a text-to-image workflow needs.


Introduction to building a basic text to image workflow on ComfyUI from scratch.

Explanation of enhancing the workflow with latent upscale and model upscale.

Comparison with the Stable Diffusion AUTOMATIC1111 web UI to understand necessary workflow components.

Step-by-step guide on adding a checkpoint node in ComfyUI.

Three methods to obtain prompt nodes for positive and negative prompts in the workflow.

Importance of connecting prompt sections to the load checkpoint node.

Details on setting up the generation section with sampling methods and image dimensions.

How to add a KSampler node for image generation.

Connecting the KSampler node with positive and negative prompts for image generation.

Demonstration of generating an image using the basic text to image workflow.

Adding LoRA (Low-Rank Adaptation) to the workflow for enhanced image details.

Instructions on duplicating and connecting multiple LoRA nodes for advanced customization.

Incorporating a latent upscale workflow to improve image quality and resolution.

Building the latent upscale workflow with specific nodes and connections.

Adjusting denoising strength for better image results in the latent upscale process.

Integrating model upscale workflow to further enhance image resolution.

Final workflow review including text to image, LoRA nodes, latent upscale, and model upscale.

Conclusion and invitation for feedback on the tutorial video.