How to UPSCALE with Stable Diffusion. The BEST approaches.

Next Tech and AI
3 Dec 2023 22:16

TLDR: The video provides a comprehensive guide on how to upscale images generated by StableDiffusion 1.5 using various methods. It highlights that while StableDiffusion XL offers high resolution, the newer models for StableDiffusion 1.5, available at CivitAI and HuggingFace, offer comparable quality with better performance and lower memory usage. The primary challenge with these models is their lower resolution outputs, typically 512x512 or 768x768. The video introduces the concept of upscaling as a solution, demonstrating different upscaling techniques such as using the nearest pixel method, ESRGAN, and Superscale, with a focus on the epicRealism model. It also covers the installation and use of the Superscale upscaler and the ultimate SD upscale script for more detailed results. The video concludes with a recommendation to use ControlNet for the best and most stable upscaling results, allowing for repeated upscaling without significant quality loss.

Takeaways

  • StableDiffusion 1.5 has improved custom models that offer similar or better quality than StableDiffusion XL, with better performance and less memory usage.
  • The main challenge with these custom models is their lower resolution, typically trained with 512x512 or 768x768.
  • Upscaling is a solution to increase the resolution of images generated by custom models.
  • Models and upscalers can be found on platforms like CivitAI and HuggingFace, with detailed instructions on their use.
  • The epicRealism model is used in the video for demonstration, and the Superscale upscaler is highlighted for its effectiveness.
  • Parameters such as sampling steps, sampling method, height, and CFG scale are crucial for generating and upscaling images.
  • Upscaling methods compared include high-res fix, nearest pixel, ESRGAN, and Superscale, with ESRGAN being suitable for most cases and a dedicated anime upscaler suggested for anime images.
  • For higher quality upscaling, the image-to-image tab can be used with the epicRealism model and specific parameters to enhance details.
  • The Superscale upscaler can be used in conjunction with the SD upscale script for more detailed results.
  • ControlNet is introduced as the best upscaling solution for achieving the most stable and detailed results, especially for repeated upscaling.
  • The Ultimate SD upscale script is another option for upscaling, offering a balance between detail and speed.

Q & A

  • What is the main advantage of using StableDiffusion 1.5 models over StableDiffusion XL?

    -The main advantage of using StableDiffusion 1.5 models over StableDiffusion XL is that they deliver the same or even better quality, perform better, and require much less memory.

  • What is the primary challenge with the custom models for StableDiffusion 1.5?

    -The primary challenge with the custom models for StableDiffusion 1.5 is the resolution, as they are usually trained with lower resolutions like 512x512 or 768x768.

  • How can one upscale a picture generated by a custom model?

    -One can upscale a picture generated by a custom model by using different upscaling methods, such as ESRGAN, Superscale, or ControlNet, and comparing the results to see the advantages of each method.

  • Where can one find the specialized and improved models for StableDiffusion 1.5?

    -One can find the specialized and improved models for StableDiffusion 1.5 at platforms like CivitAI and HuggingFace.

  • What is the recommended sampling step for the epicRealism model?

    -The recommended number of sampling steps for the epicRealism model is something above 20, for example, 25.

  • Why might one choose to use the high-res fix for upscaling?

    -One might choose to use the high-res fix for upscaling to improve the resolution of the generated images without significantly increasing the memory consumption.

  • What is the difference between using the nearest pixels algorithm and ESRGAN for upscaling?

    -The nearest pixels algorithm simply copies each new pixel from the nearest pixel of the original image, which can result in a less smooth image with visible pixels. In contrast, ESRGAN uses a deep learning network to scale images more intelligently, resulting in a smoother and more detailed upscaled image.

  • What parameters are important to consider when using the SD upscale script?

    -Important parameters to consider when using the SD upscale script include the sampling method, sampling step, CFG scale, denoising strength, and tile size.

  • How does ControlNet contribute to the upscaling process?

    -ControlNet is a neural network that can be used for various tasks, including upscaling. It contributes to the upscaling process by providing a more stable result and allowing for repeated upscaling without significant loss in quality.

  • What is the recommended approach for upscaling if one wants to maintain the most detail in the image?

    -For maintaining the most detail in the image, one should use the 'ultimate sd upscale' script in the image to image tab or opt for ControlNet for the best results.

  • What is the advantage of using the Superscale upscaler over other methods?

    -Superscale is an upscaler that provides more detail and a smoother result compared to the nearest pixels algorithm. It is particularly useful for upscaling images without introducing visible pixels or artifacts.

Outlines

00:00

🎨 Introduction to StableDiffusion 1.5 and Upscaling Techniques

The video starts by discussing the capabilities of StableDiffusion XL, highlighting its high resolution and prompt understanding. However, it points out that newer custom models for StableDiffusion 1.5 offer similar or better quality with improved performance and lower memory requirements. The main challenge with these models is their lower resolution outputs, typically 512x512 or 768x768. The video introduces the concept of upscaling as a solution to this issue and outlines various upscaling methods that will be explored. It also mentions the availability of specialized models on platforms like CivitAI and HuggingFace, and provides a brief guide on installing the epicRealism model and an upscaler called Superscale.

05:04

Exploring Upscaling Options and Image Quality

This section covers using the epicRealism model with specific parameters to generate images. It cautions users about potential issues with clothing depiction and discusses the limitations of upscaling using high-res fix and the ESRGAN upscaler due to memory constraints. The focus then shifts to a simpler upscaling method, resizing by 4 times, and comparing different upscalers, such as the nearest pixel method and ESRGAN, for their effectiveness. The video also demonstrates how to further enhance the image quality by using the image-to-image tab with a high level of detail in the prompt and maintaining the same parameters as during the initial image generation.
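To see why the nearest pixel method looks blocky, here is a minimal sketch of nearest-neighbour upscaling on a tiny grid of grayscale values. It is not code from the video or from the WebUI; the function name `nearest_upscale` and the sample grid are made up for illustration.

```python
def nearest_upscale(pixels, factor):
    """Upscale a 2D grid by repeating every pixel `factor` times in both directions.

    Each output pixel is a plain copy of the closest source pixel, so no new
    detail is created and edges turn into visible blocks.
    """
    out = []
    for row in pixels:
        # Repeat each value horizontally.
        wide = [value for value in row for _ in range(factor)]
        # Repeat the widened row vertically.
        out.extend(list(wide) for _ in range(factor))
    return out


tiny = [[1, 2],
        [3, 4]]
for row in nearest_upscale(tiny, 2):
    print(row)
```

A learned upscaler such as ESRGAN instead predicts plausible intermediate detail, which is why its results look smoother.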

10:07

Advanced Upscaling with Superscale and ControlNet

The video explains how to use the Superscale upscaler for improved results, guiding viewers through the process of selecting the appropriate script and parameters for upscaling. It emphasizes the importance of using a tile size of 512x512 to accommodate lower VRAM graphics cards. The section also introduces ControlNet as a superior upscaling solution, highlighting its ability to provide more detail and stability in the upscaled image. The process of installing and using ControlNet, including downloading necessary models from HuggingFace, is outlined, along with the recommended settings for using it effectively.

15:17

Utilizing ControlNet for Stable Upscaling Results

This section focuses on the advantages of using ControlNet for upscaling images, noting that it can produce more stable results compared to other methods. The video demonstrates how to use ControlNet with default values and suggests experimenting with control weight for different results. It also compares the results of using ControlNet with those of the ultimate SD upscale script, showing that ControlNet provides more details and a smoother appearance. The video concludes by recommending ESRGAN for quick upscaling without the need for detail, the SD upscale script or ultimate SD upscale script for detailed images, and ControlNet for the best results or when multiple upscaling iterations are required.
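The closing recommendation can be written down as a small decision helper. This is only a sketch of the rule of thumb summarized above; the function name and its two flags are hypothetical and not part of any tool.

```python
def recommend_upscaler(need_detail: bool, best_or_repeated: bool) -> str:
    """Map the video's closing advice to a method name (illustrative only)."""
    if best_or_repeated:
        # Best quality, or several upscaling passes in a row.
        return "ControlNet"
    if need_detail:
        # More detail than a plain upscaler provides.
        return "SD upscale script or ultimate SD upscale script"
    # Quick upscaling where extra detail is not needed.
    return "ESRGAN"


print(recommend_upscaler(need_detail=False, best_or_repeated=False))
print(recommend_upscaler(need_detail=True, best_or_repeated=False))
print(recommend_upscaler(need_detail=True, best_or_repeated=True))
```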

Keywords

StableDiffusion XL

StableDiffusion XL is a high-resolution image generation model that understands short prompts well and produces high-quality images. However, the video points out its higher memory consumption and notes that newer custom models for StableDiffusion 1.5 offer comparable or better quality with improved performance and less memory usage.

Upscale

Upscaling refers to the process of increasing the resolution of an image while maintaining or enhancing its quality. In the context of the video, upscaling is a solution to the lower resolution output of some StableDiffusion 1.5 models, allowing them to compete with the higher resolution capabilities of StableDiffusion XL.
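As a quick worked example of what a scale factor means for resolution, the sketch below computes the output size and the growth in pixel count. The helper name `upscaled_size` is made up for illustration; the 512x512 base size and the factor of 4 are the values mentioned in the video.

```python
def upscaled_size(width: int, height: int, factor: int) -> tuple[int, int]:
    """Return the output resolution after scaling both sides by `factor`."""
    return width * factor, height * factor


base = (512, 512)
out = upscaled_size(*base, factor=4)
# The pixel count grows with the square of the factor.
pixel_ratio = (out[0] * out[1]) // (base[0] * base[1])
print(out, pixel_ratio)  # (2048, 2048) 16
```

The quadratic growth in pixel count is one reason why large upscales are demanding on memory.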

Custom Models

Custom models in the video script refer to specialized versions of StableDiffusion 1.5 that have been improved for specific tasks or to generate certain types of images. These models are often found on platforms like CivitAI and HuggingFace and come with instructions on how to use them for optimal results.

epicRealism Model

The epicRealism model is a custom model used in the video for generating highly realistic images. It is one of the models that viewers are encouraged to try, and the video provides a tutorial on how to install and use this model for image generation and upscaling.

Superscale

Superscale is an upscaler tool used to increase the resolution of images generated by custom models. The video discusses downloading and using the 4x Superscale version to upscale images to a higher resolution, which is a crucial step in enhancing the detail of the generated images.

ESRGAN

ESRGAN stands for Enhanced Super-Resolution Generative Adversarial Networks. It is a deep-learning upscaler that the video uses to improve the quality of upscaled images. ESRGAN works well in most cases, although dedicated upscalers exist for certain types of images, such as anime.

Sampling Steps

Sampling steps refer to the number of iterations used in the image generation process. The video suggests setting the sampling steps to a value above 20, like 25, for the epicRealism model to achieve better image quality.
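For readers who script the Automatic1111 WebUI instead of clicking through it, the generation settings could be collected into a request body like the one below. This is a sketch under assumptions: it presumes the WebUI was started with its `--api` option and that the body is sent to the `/sdapi/v1/txt2img` endpoint, and the prompt is a placeholder. The helper `build_txt2img_payload` is hypothetical, and the block only builds and prints the payload without sending anything.

```python
import json


def build_txt2img_payload(prompt, steps=25, cfg_scale=5.0, width=512, height=512):
    """Collect generation settings into a dict (illustrative helper).

    steps=25 and cfg_scale=5 are the values mentioned in the video;
    512x512 is the usual training resolution of StableDiffusion 1.5 models.
    """
    return {
        "prompt": prompt,
        "steps": steps,
        "cfg_scale": cfg_scale,
        "width": width,
        "height": height,
    }


# Placeholder prompt, not the one used in the video.
payload = build_txt2img_payload("portrait photo, natural light")
print(json.dumps(payload, indent=2))
```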

CFG Scale

CFG scale (classifier-free guidance scale) is a generation parameter that controls how strongly the output follows the prompt. A higher CFG scale makes the image adhere more closely to the prompt at the cost of variety, while a lower one, such as the value 5 mentioned in the video, leaves the model more freedom.
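The effect of this parameter can be illustrated with the usual classifier-free guidance formula, in which the sampler moves from the prediction made without the prompt toward the prediction made with it. The sketch below uses plain lists of numbers as stand-ins for the model's noise predictions; it is not taken from the video, and the function name `guided_noise` is made up.

```python
def guided_noise(uncond, cond, cfg_scale):
    """Combine unconditional and prompt-conditioned predictions.

    result = uncond + cfg_scale * (cond - uncond)
    A scale of 1 returns the conditioned prediction unchanged; larger values
    push the result further in the direction the prompt suggests.
    """
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # toy prediction without the prompt
cond = [1.0, 1.0]    # toy prediction with the prompt
print(guided_noise(uncond, cond, 1.0))  # [1.0, 1.0]
print(guided_noise(uncond, cond, 5.0))  # [5.0, 1.0]
```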

Denoising Strength

Denoising strength controls how much an image-to-image or upscaling pass is allowed to change the input image: low values stay close to the original, while high values let the model repaint it. The video advises against setting this value too high, since that can lead to a loss of detail from the original image.

Tile Size

Tile size is a parameter that defines the dimensions of the smaller sections (tiles) into which the upscaled image is divided during the upscaling process. Using a tile size of 512x512, as mentioned in the video, is beneficial for managing VRAM usage, especially on systems with lower memory capacity.
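A rough way to picture the tiling is to count how many tiles a target resolution needs. The sketch below ignores any overlap or padding between tiles, which the real scripts may add, so treat the numbers as a lower bound; the function name `tile_grid` is made up for illustration.

```python
import math


def tile_grid(width: int, height: int, tile: int = 512) -> tuple[int, int, int]:
    """Return (columns, rows, total tiles) needed to cover the image.

    Simplification: tiles are assumed not to overlap.
    """
    cols = math.ceil(width / tile)
    rows = math.ceil(height / tile)
    return cols, rows, cols * rows


print(tile_grid(2048, 2048))  # (4, 4, 16)
```

Because each tile is processed on its own, VRAM use depends mainly on the tile size rather than on the final image size, while the number of tiles drives the processing time.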

ControlNet

ControlNet is a neural network used for various tasks, including upscaling images with more stability and detail. The video highlights ControlNet as one of the best methods for upscaling, as it can be used to upscale images multiple times with consistent results, making it suitable for achieving very high resolutions like 8k.
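To get a feel for what repeated upscaling means in numbers, the sketch below counts how many passes are needed to reach a target size. It assumes "8k" means 7680 pixels on the long side and a fixed scale factor per pass; both are assumptions for illustration, and `passes_to_reach` is a made-up helper.

```python
def passes_to_reach(start: int, target: int, factor: int) -> tuple[int, int]:
    """Return (number of passes, final side length) for repeated upscaling."""
    if factor < 2:
        raise ValueError("factor must be at least 2")
    passes = 0
    size = start
    while size < target:
        size *= factor
        passes += 1
    return passes, size


print(passes_to_reach(512, 7680, 2))  # (4, 8192)
print(passes_to_reach(512, 7680, 4))  # (2, 8192)
```

Since every pass works on the output of the previous one, a method that stays stable across passes matters more the further the image is scaled.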

Highlights

StableDiffusion 1.5 has improved custom models that deliver similar or better quality than StableDiffusion XL with better performance and less memory usage.

The main challenge with these custom models is the lower resolution, typically trained with 512x512 or 768x768.

Upscaling is a solution to increase the resolution of images generated by custom models.

Different upscaling methods will be compared to showcase their advantages.

The epicRealism model from CivitAI or HuggingFace is used as an example, with instructions on how to install and use it.

The Superscale upscaler is introduced as a tool for upscaling images, with a focus on the 4x Superscale version.

The Automatic1111 WebUI is used for the upscaling process, with steps outlined for Windows, Linux, and macOS.

Parameters such as sampling steps, sampling method, height, CFG scale, and prompts are crucial for generating images with the epicRealism model.

The high-res fix is mentioned as an alternative upscaling option; despite its name, it is used here to raise resolution rather than to repair an image.

ESRGAN is recommended for most cases, while a specific upscaler is suggested for anime images.

The nearest pixels algorithm is not recommended due to visible pixelation.

The image-to-image tab is used for further upscaling with the removal of specific details from the prompt for a more generalized result.

The SD upscale script is included in the default web UI installation for additional upscaling options.

Tile size is important for upscaling, with 512x512 being the default size for StableDiffusion 1.5.

The Superscale upscaler is chosen again for its quality in upscaling, providing more detail and improved results.

The ultimate SD upscale script is introduced for even more detail in the upscaled images.

ControlNet is presented as the best upscaling solution, a neural network capable of various tasks including defining a person's pose.

ControlNet provides a more stable result suitable for multiple upscaling iterations, easily scaling to 8k or higher.

ESRGAN is suitable for quick upscaling without needing details, while the SD upscale script or ultimate SD upscale script are recommended for more detailed results.

For the best results and when time is not a constraint, ControlNet is the preferred choice for upscaling.