ComfyUI Workflow Build Text2Img + Latent Upscale + Model Upscale | ComfyUI Basics | Stable Diffusion

CHILDISH YT
11 Jun 2024 | 23:38

TLDR: This tutorial video guides viewers through building a basic text-to-image workflow in ComfyUI from scratch, then extending it with latent upscaling and model upscaling. It compares each step with AUTOMATIC1111, the Stable Diffusion web UI, detailing the setup of nodes for checkpoints, prompts, and image generation. The video also demonstrates adding LoRA for style enhancement and concludes with a cleaned-up workflow, showcasing the results of text-to-image, LoRA integration, latent upscale, and model upscale.

Takeaways

  • The tutorial video provides a step-by-step guide to building a basic text-to-image workflow from scratch in ComfyUI.
  • It extends the workflow with latent upscaling and model upscaling, comparing each step with AUTOMATIC1111, the Stable Diffusion web UI.
  • The first step in building the workflow is to add a checkpoint node, either by right-clicking on the blank space and using 'Add node' or by double-clicking and searching for the node.
  • ComfyUI needs separate positive and negative prompt nodes, and the video shows three methods for adding them.
  • The prompt nodes must be connected to the 'Load Checkpoint' node, without which the workflow cannot run.
  • Images are generated by connecting the K sampler node to the prompts and adjusting parameters such as steps, CFG scale, and sampling method.
  • A LoRA (Low-Rank Adaptation) node is added to the workflow, with details on how to connect it for enhanced image generation.
  • LoRA nodes can be duplicated and chained to further refine the image generation process.
  • The latent upscale workflow upscales images by connecting specific nodes and adjusting the denoising strength for better results.
  • The final part of the tutorial adds a model upscale node, which enhances the image further by increasing its resolution.
  • Reroute nodes are used to simplify and organize the workflow so that it is clearer and easier to manage.

Q & A

  • What is the main topic of the tutorial video?

    -The main topic of the tutorial video is building a basic text-to-image workflow in ComfyUI from scratch and extending it with latent upscale and model upscale stages.

  • What is the first step in building a workflow on ComfyUI according to the video?

    -The first step in building a workflow on ComfyUI is to get a checkpoint node. This can be done by right-clicking on the blank space and selecting 'Add node' or by double-clicking to open a search bar and typing 'checkpoint'.

  • How can you add a prompt node in ComfyUI?

    -You can add a prompt node in ComfyUI in three ways: by right-clicking on the blank space and selecting 'Add node', then 'Conditioning', then 'CLIP Text Encode (Prompt)'; by double-clicking and typing 'prompt' in the search bar; or by dragging from the 'CLIP' output of the 'Load Checkpoint' node.

  • What are the two types of prompt sections mentioned in the video?

    -The two types of prompt sections mentioned in the video are the positive prompt section and the negative prompt section.

  • What is the purpose of the 'K sampler' node in the workflow?

    -The 'K sampler' node is used for generating images in the workflow. It is connected with positive and negative prompts to influence the image generation process.

  • How can you adjust the resolution of the generated images?

    -You can adjust the resolution of the generated images by adding an 'Empty Latent Image' node and specifying the desired width and height.

  • What is the role of the 'Load LoRA' node in the workflow?

    -The 'Load LoRA' node adds a LoRA (Low-Rank Adaptation) to the workflow, which adjusts the loaded model toward specific styles or details.

  • How can multiple LoRA nodes be connected in the workflow?

    -Multiple LoRA nodes can be connected by duplicating the 'Load LoRA' node and then connecting the model and clip outputs of one LoRA node to the model and clip inputs of the next LoRA node in the sequence.

  • What is the purpose of the 'Latent Upscale' node in the workflow?

    -The 'Latent Upscale' node is used to enhance the resolution of the generated images by a certain scale factor, improving the details and quality of the images.

  • What is the final step in the workflow for generating images?

    -The final step in the workflow for generating images is to connect the output of the K sampler or the upscale nodes to a 'VAE decode' node, which decodes the information and converts it into an image, followed by saving or previewing the image.

  • How does the 'Upscale by Model' feature differ from 'Latent Upscale'?

    -The 'Upscale by Model' feature runs the image through a dedicated upscaling model, typically resulting in a much higher resolution and larger file size. 'Latent Upscale' instead enlarges the latent by a scale factor, after which a second K sampler pass refines the enlarged latent before it is decoded into an image.
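The size difference between the two upscale stages can be worked through with a little arithmetic. The sketch below is illustrative only: the 512x768 base size, the 1.5x latent scale, the 4x model factor, and the 8x VAE downscale factor are all assumptions chosen for the example, not values taken from the video.

```python
# Sketch: how image size changes through the two upscale stages.
# Assumed values (not from the video): 512x768 base image, 1.5x latent
# upscale, a 4x upscale model, and a VAE that downsamples by 8.

VAE_FACTOR = 8  # assumed: one latent cell covers an 8x8 pixel patch


def latent_size(width: int, height: int) -> tuple[int, int]:
    """Size of the latent grid for a given pixel resolution."""
    return width // VAE_FACTOR, height // VAE_FACTOR


def latent_upscale(width: int, height: int, scale_by: float) -> tuple[int, int]:
    """Pixel resolution after scaling the latent grid and decoding it."""
    lw, lh = latent_size(width, height)
    return round(lw * scale_by) * VAE_FACTOR, round(lh * scale_by) * VAE_FACTOR


def model_upscale(width: int, height: int, model_factor: int) -> tuple[int, int]:
    """Pixel resolution after an upscale model with a fixed factor."""
    return width * model_factor, height * model_factor


base = (512, 768)
after_latent = latent_upscale(*base, scale_by=1.5)
after_model = model_upscale(*after_latent, model_factor=4)

print("base:          ", base)
print("latent upscale:", after_latent)
print("model upscale: ", after_model)
```

Under these assumptions the model stage multiplies the pixel count by 16, which is consistent with the much larger file size noted in the answer above.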

Outlines

00:00

Building a Basic Text-to-Image Workflow in ComfyUI

This paragraph introduces the tutorial's objective: constructing a fundamental text-to-image workflow from scratch using ComfyUI. It also mentions the comparison with AUTOMATIC1111, the Stable Diffusion web UI, which is used throughout to show where each setting ends up in a ComfyUI workflow. The process begins with clearing the workspace and adding the checkpoint node, on which the rest of the workflow depends. Two methods for adding nodes are discussed: right-clicking to open the 'Add node' menu or double-clicking to search for a specific node such as 'checkpoint'. The paragraph concludes with a comparison of these methods and a transition to the next topic, AUTOMATIC1111's positive and negative prompt sections.

05:01

Understanding Prompt Nodes and Image Generation Settings

The second paragraph covers the creation of prompt nodes for both positive and negative prompts, which guide the image generation process in ComfyUI. It outlines three methods to add these nodes: using the 'Add node' option after a right-click, double-clicking to search, or dragging from the 'Load Checkpoint' node. The paragraph also covers renaming these nodes for clarity. It then walks through the image generation settings found in AUTOMATIC1111: sampling method, scheduler type, steps, width, height, CFG scale, and seed. The speaker then returns to ComfyUI to add a 'K sampler' node for image generation, explaining different ways to add it and connect it to the previously created prompt nodes.
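As a rough illustration of that comparison, the sketch below splits one AUTOMATIC1111-style settings panel into the inputs of two ComfyUI nodes: the sampler settings go to the K sampler (class name `KSampler`), while width and height go to the 'Empty Latent Image' node. The setting values and the name-translation tables are assumptions for the example; the input names follow ComfyUI's API-format workflow export as I understand it and should be checked against your installation.

```python
# Hypothetical AUTOMATIC1111-style settings; all values are placeholders.
a1111_settings = {
    "Sampling method": "DPM++ 2M",
    "Schedule type": "Karras",
    "Sampling steps": 25,
    "CFG Scale": 7.0,
    "Seed": 123456789,
    "Width": 512,
    "Height": 768,
}

# Assumed name translations; extend them for the samplers you actually use.
SAMPLER_NAMES = {
    "Euler": "euler",
    "Euler a": "euler_ancestral",
    "DPM++ 2M": "dpmpp_2m",
}
SCHEDULER_NAMES = {
    "Karras": "karras",
    "Exponential": "exponential",
}


def to_comfy_inputs(settings: dict) -> tuple[dict, dict]:
    """Split one settings panel into KSampler and Empty Latent Image inputs."""
    ksampler = {
        "seed": settings["Seed"],
        "steps": settings["Sampling steps"],
        "cfg": settings["CFG Scale"],
        "sampler_name": SAMPLER_NAMES[settings["Sampling method"]],
        "scheduler": SCHEDULER_NAMES[settings["Schedule type"]],
        "denoise": 1.0,  # full denoise for a plain text-to-image pass
    }
    empty_latent = {
        "width": settings["Width"],
        "height": settings["Height"],
        "batch_size": 1,
    }
    return ksampler, empty_latent


ksampler_inputs, latent_inputs = to_comfy_inputs(a1111_settings)
print(ksampler_inputs)
print(latent_inputs)
```

The split reflects the main structural difference between the two tools: resolution is not a sampler setting in ComfyUI but a property of the latent that is fed into the sampler.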

10:04

Generating Images and Enhancing the Workflow with LoRA

This paragraph focuses on the final steps of the basic text-to-image workflow, including the addition of an 'empty latent image' node to define width and height, and the connection of a 'VAE decode' node to convert information into an image. It describes the process of generating an image using specific settings and prompts, and then compares the results with different resolutions. The paragraph also introduces the integration of LoRA into the workflow to enhance image generation, explaining how to add and connect the 'load LoRA' node, and the importance of connecting its model and clip points to the existing workflow. The results of adding LoRA are demonstrated, showcasing the improved image detail and style.
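The wiring described here can be sketched as a plain Python dictionary in the style of ComfyUI's API-format workflow export, where each connection is written as `[source_node_id, output_index]`. This is a minimal sketch under assumptions: the class and input names are those I understand ComfyUI to use and should be verified locally, the file names, prompts, and sampler values are placeholders, and the readable node ids are labels chosen for this example (exported workflows use numeric strings).

```python
def build_txt2img(positive, negative, loras=(), width=512, height=512):
    """Build an API-style graph: node id -> {"class_type", "inputs"}."""
    graph = {
        "ckpt": {
            "class_type": "CheckpointLoaderSimple",
            "inputs": {"ckpt_name": "model.safetensors"},  # placeholder file
        }
    }
    # Load Checkpoint outputs: 0 = MODEL, 1 = CLIP, 2 = VAE.
    model, clip, vae = ["ckpt", 0], ["ckpt", 1], ["ckpt", 2]

    # Chain LoRA loaders: each takes MODEL and CLIP from the previous node.
    for i, (lora_name, strength) in enumerate(loras):
        node_id = f"lora{i}"
        graph[node_id] = {
            "class_type": "LoraLoader",
            "inputs": {
                "lora_name": lora_name,
                "strength_model": strength,
                "strength_clip": strength,
                "model": model,
                "clip": clip,
            },
        }
        model, clip = [node_id, 0], [node_id, 1]

    # Positive and negative prompts both read CLIP from the end of the chain.
    graph["pos"] = {"class_type": "CLIPTextEncode",
                    "inputs": {"text": positive, "clip": clip}}
    graph["neg"] = {"class_type": "CLIPTextEncode",
                    "inputs": {"text": negative, "clip": clip}}
    # Width and height live on the Empty Latent Image node.
    graph["latent"] = {"class_type": "EmptyLatentImage",
                       "inputs": {"width": width, "height": height,
                                  "batch_size": 1}}
    graph["sampler"] = {
        "class_type": "KSampler",
        "inputs": {
            "seed": 0, "steps": 20, "cfg": 7.0,  # placeholder settings
            "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0,
            "model": model, "positive": ["pos", 0], "negative": ["neg", 0],
            "latent_image": ["latent", 0],
        },
    }
    # VAE Decode turns the sampled latent into an image, which is then saved.
    graph["decode"] = {"class_type": "VAEDecode",
                       "inputs": {"samples": ["sampler", 0], "vae": vae}}
    graph["save"] = {"class_type": "SaveImage",
                     "inputs": {"filename_prefix": "txt2img",
                                "images": ["decode", 0]}}
    return graph


graph = build_txt2img(
    "a lighthouse at dusk", "blurry, low quality",
    loras=[("style_a.safetensors", 0.8), ("detail_b.safetensors", 0.5)],
)
for node_id, node in graph.items():
    print(node_id, "->", node["class_type"])
```

With an empty `loras` list, the sampler and prompt nodes connect straight to the checkpoint; with two entries, the second LoRA reads from the first and everything downstream reads from the second, which mirrors the chaining described in the video.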

15:05

Refining the Workflow with Latent Upscaling and Multiple LoRA Nodes

The fourth paragraph discusses the process of adding latent upscale functionality to the workflow. It explains how to incorporate a 'latent upscale by' node, connect it with the initial image output, and set up a new 'K sampler' for the upscaled image. The paragraph also covers the connection of model, positive prompt, and negative prompt nodes to the upscale process. The importance of adjusting the 'denoising strength' for better results is highlighted, and the speaker demonstrates the effect of this setting on the image output. The paragraph concludes with a comparison between the original image and the latent upscaled image, emphasizing the changes and potential disturbances caused by the denoising strength.
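The second pass described above can be sketched as a small helper that adds two nodes to an API-style graph dictionary. This is an illustration under assumptions rather than the video's exact setup: the class names `LatentUpscaleBy` and `KSampler` and their input names are what I understand ComfyUI to use, the upstream node ids (`sampler`, `ckpt`, `pos`, `neg`) are hypothetical labels for a first pass built elsewhere, and the `scale_by` and `denoise` values are placeholders to experiment with.

```python
def add_latent_upscale(graph, *, samples, model, positive, negative,
                       scale_by=1.5, denoise=0.5, seed=0):
    """Add a latent upscale and a second sampling pass to `graph`.

    Each link argument is [source_node_id, output_index].
    Returns the link to the refined, upscaled latent.
    """
    graph["latent_up"] = {
        "class_type": "LatentUpscaleBy",
        "inputs": {
            "samples": samples,  # latent from the first K sampler
            "upscale_method": "nearest-exact",
            "scale_by": scale_by,
        },
    }
    graph["sampler_2"] = {
        "class_type": "KSampler",
        "inputs": {
            "seed": seed, "steps": 20, "cfg": 7.0,  # placeholder settings
            "sampler_name": "euler", "scheduler": "normal",
            # Below 1.0 so the pass refines the enlarged latent instead of
            # replacing it; higher values drift further from the original.
            "denoise": denoise,
            "model": model,
            "positive": positive,
            "negative": negative,
            "latent_image": ["latent_up", 0],
        },
    }
    return ["sampler_2", 0]


# Hypothetical ids for nodes that a first text-to-image pass would define.
graph = {}
upscaled = add_latent_upscale(
    graph,
    samples=["sampler", 0],
    model=["ckpt", 0],
    positive=["pos", 0],
    negative=["neg", 0],
)
print(upscaled, sorted(graph))
```

The returned link would feed a 'VAE decode' node in the same way as the first sampler's output. The second sampler reuses the same model and prompt links as the first, which is why the paragraph above stresses reconnecting them.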

20:07

Finalizing the Workflow with Model Upscaling and Cleanup

In the final paragraph, the speaker wraps up the tutorial by adding a 'model upscale' node to the workflow, which is used to further enhance the resolution of the generated images. The process involves connecting the output of the latent upscale to the 'upscale by model' node and adjusting settings like the scale factor. The results of the model upscale are compared with the latent upscale and the original text-to-image output, showcasing the differences in resolution and image quality. The paragraph concludes with a cleanup of the workflow, organizing the nodes into groups for clarity, and reflecting on the completed workflow, which includes a simple text-to-image workflow, LoRA integration, latent upscale, and model upscale.
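A possible shape for this last stage, again as an API-style graph fragment, is sketched below. It rests on several assumptions: the class names `UpscaleModelLoader` and `ImageUpscaleWithModel` are what I understand ComfyUI to use, the model file name is a placeholder, and the upstream ids are hypothetical. As far as I know, the model upscale node takes a decoded image rather than a latent and its enlargement factor is fixed by the chosen model, so the sketch decodes the upscaled latent first.

```python
def add_model_upscale(graph, *, latent, vae, model_name="4x_upscaler.pth"):
    """Decode a latent, upscale the image with a model, and save it.

    `latent` and `vae` are links written as [source_node_id, output_index].
    `model_name` is a placeholder; use a file from your own models folder.
    """
    # The upscale model works on pixels, so decode the latent first.
    graph["decode_up"] = {
        "class_type": "VAEDecode",
        "inputs": {"samples": latent, "vae": vae},
    }
    graph["up_loader"] = {
        "class_type": "UpscaleModelLoader",
        "inputs": {"model_name": model_name},
    }
    graph["up_image"] = {
        "class_type": "ImageUpscaleWithModel",
        "inputs": {
            "upscale_model": ["up_loader", 0],
            "image": ["decode_up", 0],
        },
    }
    graph["save_up"] = {
        "class_type": "SaveImage",
        "inputs": {"filename_prefix": "model_upscale",
                   "images": ["up_image", 0]},
    }
    return ["up_image", 0]


# Hypothetical ids: "sampler_2" is the second-pass sampler, "ckpt" the loader.
graph = {}
final_image = add_model_upscale(graph, latent=["sampler_2", 0], vae=["ckpt", 2])
print(final_image, sorted(graph))
```

Keeping separate save nodes for the text-to-image, latent upscale, and model upscale outputs makes the side-by-side comparison described above straightforward.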

Keywords

ComfyUI

ComfyUI is a node-based graphical interface for Stable Diffusion in which a workflow is built by adding nodes and connecting them. In this video, it is the software used to build the text-to-image workflow, and the script refers to it throughout the tutorial.

Text-to-Image Workflow

A text-to-image workflow is a series of steps or processes that convert textual descriptions into visual images. The video script outlines how to build such a workflow from scratch using ComfyUI, which is essential for understanding the video's main theme of image generation based on text prompts.

Checkpoint

In the context of this video, a checkpoint is a saved Stable Diffusion model file, which is brought into the workflow with the 'Load Checkpoint' node. The script emphasizes adding this node in ComfyUI as the first step in building the workflow.

Latent Upscale

Latent upscale refers to a technique used to enhance the resolution and detail of an image by working with its latent space representation. The video script discusses adding a latent upscale node to the workflow, showing how to improve the quality of generated images.

Model Upscale

Model upscale is a process that increases the resolution of an image using a specific model designed for upscaling. The script mentions adding model upscale to the workflow, demonstrating an advanced technique to further refine the image quality beyond the initial generation.

Positive Prompt

A positive prompt is a text input that guides the image generation process towards desired outcomes. The video script describes setting up a positive prompt section in ComfyUI, which is crucial for directing the AI to create specific types of images.

Negative Prompt

A negative prompt is text input that specifies what should be avoided in the image generation process. The script explains adding a negative prompt section to ComfyUI, which helps in refining the image by excluding undesired elements.

K Sampler

The K sampler is a node in the workflow that handles the sampling process during image generation. The script details the addition and connection of a K sampler node, which is key for determining the steps and randomness in creating the final image.

VAE Decode

VAE stands for Variational Autoencoder, and 'VAE decode' refers to the process of decoding information from the latent space back into a visual image. The script mentions using a VAE decode node to convert the latent image data into a viewable image format.

Stable Diffusion

Stable Diffusion is an AI model that generates images from text descriptions. The script mentions it in the context of comparing the workflow built in ComfyUI with AUTOMATIC1111, a web interface for Stable Diffusion.

Reroute

In the context of the video, 'Reroute' refers to a ComfyUI node that passes a connection through unchanged, so that long wires can be routed along a tidier path. The script describes using Reroute nodes to clean up and streamline the workflow after adding multiple nodes and steps.

Highlights

Introduction to building a basic text-to-image workflow in ComfyUI from scratch.

Explanation of extending the workflow with latent upscaling and model upscaling.

Comparing the workflow with AUTOMATIC1111, the Stable Diffusion web UI, for clarity.

Step-by-step guide on adding a checkpoint node to the workflow.

Using the search bar to find and add nodes like the checkpoint node.

Importance of positive and negative prompt sections in the workflow.

Three methods to add prompt nodes to the ComfyUI workflow.

Connecting the prompt nodes to the load checkpoint node.

Details on the Generation section with parameters like sampling method and scheduler types.

How to add a K sampler node for image generation.

Connecting the K sampler node to the prompts for image generation.

Setting width and height with an 'Empty Latent Image' node connected to the K sampler node.

Finalizing the workflow with the addition of VAE decode and save image options.

Testing the workflow with a positive prompt and generating an image.

Adjusting resolution and testing the workflow with different settings.

Adding LoRA (Low-Rank Adaptation) to the workflow for enhanced image generation.

Demonstrating how to duplicate and connect multiple LoRA nodes.

Integrating the latent upscale workflow into the existing text-to-image process.

Adding a node for latent upscale by a specific factor.

Connecting the latent upscale node to the K sampler for enhanced image quality.

Incorporating upscale by model as the final step in the workflow.

Comparing the original text-to-image output with the latent upscale and model upscale results.

Finalizing the workflow with cleanup and organization of nodes.

Invitation for feedback and suggestions for further tutorial videos.