Creating Images with AI! A Basic Guide for First-Time Users (Follow Along, Free, stable-diffusion)

뉴럴닌자 (Neural Ninja) - AI Study
23 Jul 2023 · 18:34

TLDR: This video script introduces first-time users to Stable Diffusion WebUI, detailing the process of creating images using various models and settings. It explains the importance of model selection, the integration with Google Drive, and the use of prompts to guide AI in image creation. The script also delves into technical aspects like VAE, sampling methods, and the impact of steps on image quality. It further discusses batch creation, the significance of the CFG scale, and the role of seed values for consistent results. Additional features like extras, high-res fix, and face enhancement tools are also covered, providing a comprehensive guide for users to get started with Stable Diffusion WebUI.


  • 🖼️ The video introduces Stable Diffusion WebUI, a tool for first-time users to create images using AI.
  • 💻 The process can be executed on Google Colab, eliminating the need for high computer specifications.
  • 🔍 Users can select from a variety of models, and even add custom models via Google Drive.
  • 🎨 The model, or checkpoint, is crucial as it determines the overall image quality and style.
  • 🌈 VAE, the color model applied last in the pipeline, can be included or excluded, affecting the color quality of the generated images.
  • 📝 Prompts are essential; they describe the desired image and guide the AI in creating it.
  • 🚫 Negative prompts help to exclude undesired elements from the generated images.
  • 🎲 Sampling is an algorithm that refines images by reducing noise; different methods yield varying results.
  • 📐 The size of the generated image can be adjusted, but the aspect ratio and pixel size can impact the outcome.
  • 🔄 The CFG scale adjusts how strongly the prompt influences the final image, with higher values leading to more accurate depictions.
  • 🌟 Enhancing image quality and detail can be achieved through various settings like high-res fix, steps, and upscale values.

Q & A

  • What is the primary focus of the video?

    -The video focuses on teaching the basics for first-time users of Stable Diffusion WebUI, explaining the values set when creating an image and providing guidance on what to watch out for during the process.

  • Why is Google Colab used in the process?

    -Google Colab is used because it allows the process to be executed without being affected by the computer specifications, making it accessible to users with varying hardware capabilities.

  • How can users with a capable graphics card use an alternative to the Colab environment?

    -Users with a sufficiently powerful graphics card can install the necessary software directly on their own computer, bypassing the need for the Colab environment.

  • What is the significance of selecting a model in Stable Diffusion WebUI?

    -Selecting a model is crucial as it determines the overall image shape and quality. Different models have varying capacities and selecting the right one can greatly influence the output of the generated images.

  • How can additional models be added in the Stable Diffusion WebUI?

    -Additional models can be added through Google Drive, allowing users to utilize a wider range of models for their image creation.

  • What is a VAE in the context of the video?

    -VAE stands for Variational Autoencoder, the model applied last in the pipeline, which determines the coloration of the generated images. A VAE can be baked into a checkpoint, but checkpoints are often distributed without one, so setting a VAE separately helps avoid faded colors.

  • What are prompts in Stable Diffusion WebUI?

    -Prompts are descriptive text expressing the image a user wants to create. They guide the AI in generating an image based on the content and details described in the prompt.

  • What is the difference between positive and negative prompts?

    -Positive prompts specify what should be included in the image, while negative prompts specify what should be excluded. This helps in refining the output to match the user's vision more closely.

  • How does the sampling method affect image creation?

    -The sampling method determines how an image is created from noise. Different algorithms, such as Euler A, the DPM++ Karras variants, or DDIM, are used, each differing in speed and in the level of detail of the final image.

  • What is the role of the step in sampling?

    -The step refers to the number of times sampling occurs. More steps generally lead to more detailed images, but setting too high a number can result in quality deterioration.

  • What is the purpose of the CFG scale?

    -The CFG scale indicates how much the prompt should be applied to the image generation. Higher values strongly reflect the prompt content, while lower values result in weaker reflections, potentially ignoring or underestimating the prompt.

  • How can the 'high-res fix' feature enhance image quality?

    -The 'high-res fix' feature improves image quality by increasing the image size and detail. It allows for a more refined and higher resolution output, though it may also alter the original image significantly.



📚 Introduction to Stable Diffusion WebUI

This paragraph introduces viewers to the basics of using Stable Diffusion WebUI for first-time users. It emphasizes that the process will run on Google Colab, eliminating the need for high computer specifications. Users with capable graphics cards can install and use the software outside of the Colab environment. The video outlines the steps to select a model, the importance of model capacity, and the ability to add models via Google Drive. It also explains the concept of checkpoints and VAE, the color model applied last in the pipeline. The paragraph concludes with instructions on entering prompts to guide the AI in creating images, and the difference between positive and negative prompts.


🎨 Image Creation Process and Sampling

This section delves into the image creation process, explaining that images are generated from noise through sampling. It discusses various sampling methods, such as Euler A and SDE Karras, and their impact on image quality and detail. The paragraph also covers the importance of steps in sampling and the potential downsides of too many steps. It touches on the aspect of size in image creation, the common use of SD1.5 models, and the consequences of deviating from the standard 512-pixel training size. The paragraph further explores the use of word combinations versus sentences in prompts and the concept of creating multiple images through the batch settings.
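As a rough sketch of the size constraints described above (assuming an SD1.5-style model whose VAE compresses images 8× into latent space), a small checker can flag dimensions likely to cause problems. The function name and thresholds are illustrative, not part of the WebUI:

```python
def check_size(width, height, trained=512):
    """Sanity-check txt2img dimensions for an SD1.5-style model.
    The VAE compresses images 8x into latent space, so each side
    should be a multiple of 8; sizes far beyond the 512px training
    resolution tend to produce duplicated or distorted subjects."""
    notes = []
    if width % 8 or height % 8:
        notes.append("sides should be multiples of 8")
    if max(width, height) > trained * 1.5:
        notes.append("much larger than training size; expect artifacts")
    return notes or ["ok"]

print(check_size(512, 512))    # ['ok']
print(check_size(500, 512))    # ['sides should be multiples of 8']
print(check_size(1024, 1024))  # ['much larger than training size; expect artifacts']
```

The 1.5× threshold is only a rule of thumb; in practice the high-res fix described later is the usual way to reach large sizes safely.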


🔧 Adjusting Settings for Image Quality

This paragraph focuses on the various settings that can be adjusted to improve image quality. It discusses the role of CFG scale in reflecting prompt content and the potential distortion that can occur with high values. The seed value's function in generating initial noise values is explained, along with the consistency of images when using the same seed value. The paragraph also introduces the concept of Extras for creating slightly varied images and the High-Res Fix feature for enhancing image detail. The impact of denoising strength on image quality and the balance between detail and original image preservation are also covered.


👤 Enhancing Facial Features in Images

The final paragraph discusses techniques for enhancing facial features in generated images. It highlights the challenges of rendering smaller faces and provides a solution using the 'inpainting' method. The paragraph introduces the use of DDetailer, an extension that simplifies the process of redrawing faces at a 512px size for clarity. The video concludes by reiterating the aim of providing a comprehensive guide for first-time users of Stable Diffusion WebUI and expresses a hope that the information is helpful, ending with a farewell until the next video.
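The crop-upscale-redraw-paste idea behind inpainting a face (and behind DDetailer's automation of it) can be sketched in toy form. Here the actual diffusion redraw is replaced by a placeholder brightness tweak, and all names are illustrative:

```python
import numpy as np

def redraw_region(image, box, work_size=512):
    """Sketch of the DDetailer/inpaint idea: crop a small face,
    enlarge it to the model's comfortable 512px working size,
    redraw it there, then shrink it and paste it back. The
    'redraw' below is a stand-in for a real img2img pass."""
    y0, y1, x0, x1 = box
    crop = image[y0:y1, x0:x1]
    scale = work_size // max(crop.shape)
    big = np.kron(crop, np.ones((scale, scale)))  # nearest-neighbour upscale
    big = np.clip(big * 1.1, 0, 1)                # placeholder "redraw"
    small = big[::scale, ::scale]                 # shrink back to crop size
    out = image.copy()
    out[y0:y1, x0:x1] = small
    return out

img = np.zeros((64, 64))
img[10:18, 20:28] = 0.5                 # a tiny 8x8 "face"
result = redraw_region(img, (10, 18, 20, 28))
print(result[10:18, 20:28].mean())      # face brightened by the placeholder redraw
```

Only the face region changes; the rest of the image is untouched, which is exactly why this trick keeps small faces sharp without regenerating the whole picture.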



💡Stable Diffusion WebUI

Stable Diffusion WebUI is a user interface designed for utilizing the Stable Diffusion model, which is an AI-based system for image generation. In the context of the video, it is the platform through which first-time users will interact with the AI to create images. The script explains the process of using this interface, including model selection, prompt input, and image generation.

💡Google Colab

Google Colab is a cloud-based platform that allows users to run Python programs and access AI models without the need for high-end computer specifications. In the video, it is mentioned as the environment where the Stable Diffusion process will be executed, emphasizing that users can run the AI model regardless of their personal computer's capabilities.

💡Model Selection

Model selection refers to the process of choosing a specific AI model or 'checkpoint' from a list of available options. Each model has different capacities and can influence the overall image shape and quality. The video provides guidance on how to select and execute a model, which is a crucial step in using the Stable Diffusion WebUI.


💡VAE

VAE, or Variational Autoencoder, is a type of model used in the Stable Diffusion process that affects the color distribution of the generated images. It is often included in checkpoints and can be set or not depending on the user's preference. The VAE setting is important for the final appearance of the colors in the created images.


💡Prompt

A prompt is a descriptive input provided by the user that guides the AI in creating a specific image. It is a crucial element in the image generation process, as it communicates the desired content to the AI. Positive and negative prompts are used to include or exclude certain elements from the image, respectively.


💡Sampling

Sampling is the algorithmic process by which an AI model generates an image from noise. It involves progressively refining the image by removing noise and building up details according to the prompt and model's guidance. Different sampling methods, such as Euler A or SDE Karras, can be used to create images with varying levels of detail and speed.
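A toy sketch of this iterative refinement, assuming we could measure the remaining error directly (a real sampler instead uses a neural network's noise prediction at each step):

```python
import numpy as np

def toy_denoise(noisy, target, steps):
    """Toy illustration of iterative sampling: each step removes a
    fraction of the remaining noise, moving the image toward the
    target. Real samplers (Euler A, DPM++, DDIM) do not know the
    target; they predict the noise with a neural network."""
    img = noisy.copy()
    for _ in range(steps):
        img += (target - img) / 2.0  # halve the remaining error each step
    return img

rng = np.random.default_rng(0)
target = rng.random((4, 4))                 # stand-in for the "true" image
noisy = target + rng.normal(0, 1, (4, 4))   # pure-noise starting point

for steps in (1, 5, 20):
    err = np.abs(toy_denoise(noisy, target, steps) - target).mean()
    print(f"{steps:2d} steps -> mean error {err:.4f}")
```

The error shrinks with more steps, mirroring why more sampling steps give more detail, though in the real sampler the returns diminish and very high step counts can even degrade quality.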

💡CFG Scale

CFG Scale, or Classifier-Free Guidance scale, is a value that determines the influence of the prompt on the generated image. A higher CFG Scale means that the prompt's content will be more strongly reflected in the image, while a lower value results in a weaker reflection, potentially ignoring or underestimating the prompt's words.
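The guidance arithmetic itself is simple and can be shown directly; the unconditional term is what a negative prompt replaces in practice. This is a minimal sketch, not the WebUI's internals:

```python
import numpy as np

def apply_cfg(uncond, cond, cfg_scale):
    """Classifier-free guidance: push the prediction away from the
    unconditional (or negative-prompt) output and toward the
    prompt-conditioned output. Higher cfg_scale follows the
    prompt more strongly."""
    return uncond + cfg_scale * (cond - uncond)

uncond = np.array([0.0, 0.0])  # prediction ignoring the prompt
cond = np.array([1.0, -1.0])   # prediction following the prompt

print(apply_cfg(uncond, cond, 1.0))  # equals cond: prompt applied as-is
print(apply_cfg(uncond, cond, 7.0))  # prompt direction exaggerated 7x
print(apply_cfg(uncond, cond, 0.0))  # prompt ignored entirely
```

The exaggeration at high scales is also why very large CFG values distort images: the prediction is pushed far outside the range the model was trained on.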

💡Seed Value

The seed value is a starting point for the random number generation used in the image creation process. It determines the initial noise value from which the image is sampled. By changing the seed value, users can generate different images, even with the same prompt and model settings. Entering -1 allows the system to automatically input a random seed value.
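A minimal sketch of this seed behaviour, using NumPy's random generator in place of the real latent-noise sampler:

```python
import numpy as np

def initial_noise(seed, shape=(4, 4)):
    """The seed fixes the starting noise: the same seed always
    yields the same noise tensor, so the same seed + prompt +
    settings reproduce the same image. Seed -1 means 'pick one
    at random', mirroring the WebUI convention."""
    if seed == -1:
        seed = int(np.random.default_rng().integers(0, 2**32))
    return seed, np.random.default_rng(seed).normal(size=shape)

seed_a, noise_a = initial_noise(1234)
seed_b, noise_b = initial_noise(1234)
print(np.array_equal(noise_a, noise_b))  # True: same seed, same noise

used_seed, noise_c = initial_noise(-1)   # a fresh random seed each run
```

Returning the resolved seed is what makes the green recycling icon possible: the actual seed used for a -1 run is recorded so it can be reused later.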

💡High-Res Fix

High-Res Fix is a feature that enhances the quality and detail of the generated images by increasing their size and adjusting the denoising strength. This option allows users to improve the image resolution and detail level, though it may also alter the original image significantly if the settings are too high.
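A toy sketch of the two-pass idea, with plain nearest-neighbour enlargement standing in for an upscaler and injected noise standing in for the img2img redraw; the function and its parameters are illustrative:

```python
import numpy as np

def hires_fix(low_res, upscale=2, denoise=0.5, rng=None):
    """Toy sketch of the high-res fix pipeline: render small,
    enlarge, then re-run img2img on the big image. 'denoise'
    controls how much new noise is injected before redrawing:
    low values preserve the original, high values repaint it."""
    if rng is None:
        rng = np.random.default_rng(0)
    f = int(upscale)
    big = np.kron(low_res, np.ones((f, f)))  # simple upscale
    noised = (1 - denoise) * big + denoise * rng.normal(size=big.shape)
    return big, noised

low = np.random.default_rng(1).random((4, 4))
big, mild = hires_fix(low, upscale=2, denoise=0.3)
_, strong = hires_fix(low, upscale=2, denoise=0.9)
print(big.shape)                                           # (8, 8)
print(np.abs(mild - big).mean() < np.abs(strong - big).mean())  # True
```

The comparison shows the trade-off the video describes: a low denoising strength stays close to the original composition, while a high one can change the image significantly.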

💡Upscale Value

The Upscale Value determines the factor by which the generated image is enlarged. It is used to increase the size of the image, which can improve the overall quality but may also introduce blurriness if set too high. The right Upscale Value depends on the chosen upscaler and the desired balance between size and clarity.


💡Inpaint

Inpaint is a technique used to modify or enhance specific parts of an image, such as faces. It allows users to redraw or clarify details in a localized area of the image, often at a higher resolution, to improve the overall quality and clarity of that part.
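The paste-back step at the heart of inpainting is a masked blend, sketched here in NumPy (the redrawn pixels would come from a diffusion pass in the real tool):

```python
import numpy as np

def inpaint_blend(original, redrawn, mask):
    """The core paste step of inpainting: keep the original pixels
    where mask is 0 and take the newly drawn pixels where mask is 1.
    In the WebUI the 'redrawn' part comes from a diffusion pass over
    the masked area, usually at a higher working resolution."""
    return mask * redrawn + (1 - mask) * original

original = np.zeros((4, 4))
redrawn = np.ones((4, 4))
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1  # only repaint the centre

out = inpaint_blend(original, redrawn, mask)
print(out.sum())  # 4.0: exactly the four masked pixels changed
```

In practice the mask edge is feathered so the repainted region blends smoothly into its surroundings, but the principle is this same weighted mix.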


Introduction to Stable Diffusion WebUI for first-time users.

Explaining the values set when creating an image one by one.

Executing the process using Google Colab, eliminating the need for high computer specifications.

Option to install and run the software locally on a computer with a capable graphics card instead of using Colab.

Selecting a model and understanding its capacity and execution method.

Adding models through Google Drive for convenience.

The impact of the chosen model on the overall image shape.

Running Colab by pressing the blue button and understanding the different versions available.

Integration with Google Drive for immediate saving of created images or using saved models.

Explanation of the model, also known as a checkpoint, and its role in AI image creation.

Understanding VAE, the color model applied last in the pipeline, and its inclusion in checkpoints.

The concept and use of prompts to express the desired image for AI to create.

The difference between positive and negative prompts and their impact on the final image.

Setting up VAE for improved image quality and avoiding faded colors.

Explanation of sampling, the algorithm that creates an image from noise.

The importance of the sampling method and its effect on the speed and quality of image creation.

Understanding the step count in sampling and its relation to image detail and quality.

The role of size in image creation and the common use of SD1.5 models.

Creating multiple images by setting the batch count and batch size.

The significance of the CFG scale in determining how much the prompt influences the image.

Explanation of the seed value and its effect on generating unique images.

The use of the green recycling icon for reusing seed values in subsequent creations.

The function of the extra variation seed value and variation strength for creating slightly altered images.

The high-res fix feature for enhancing image quality and detail.

The impact of the denoising strength on the detail and overall appearance of the image.

Understanding the Hi-Res Step and its setting in relation to the sampling step.

The upscale value and its role in enlarging the image while maintaining its quality.

The use of different upscalers like Latent and ESRGAN series for enhancing image detail.

The method of enhancing the face in an image using inpainting and DDetailer.

Concluding the tutorial and expressing hope for the video's helpfulness.