SD3 - Local Install Guide! FASTEST Way to run the new Model - Stable Diffusion 3

Olivio Sarikas
12 Jun 2024 06:15

TLDR: This video guide provides a step-by-step tutorial on downloading and running Stable Diffusion 3 Medium, a cutting-edge AI model, on your computer. The host explains the process of acquiring a free license for non-commercial use from Hugging Face, choosing the right model file, and setting up the environment in Comfy UI. Viewers are also introduced to various workflows and demo prompts to test the model's capabilities. The guide includes troubleshooting tips for updating Comfy UI and loading workflows, showcasing the model's ability to interpret prompts and generate creative images.

Takeaways

  • 😀 Stable Diffusion 3 Medium is released and the guide will show you how to download, update, and run it on your computer.
  • 📷 The images shown are first-roll (first-attempt) renders with Stable Diffusion 3, and the prompts used were not tuned for better image quality.
  • 📝 To use Stable Diffusion 3, visit Hugging Face and sign a free license for non-commercial use; for commercial use, contact Stability AI.
  • 📚 Two main model versions are available: 'sd3_medium.safetensors', which has no text encoder, and 'sd3_medium_incl_clips.safetensors', which includes the CLIP text encoders and is recommended.
  • 💾 Download the recommended 6 GB model file into the models folder of AUTOMATIC1111 or Comfy UI.
  • 🔧 Different Comfy UI workflows are available to try out, including basic, multi-prompt, and upscaling workflows.
  • 🖼️ A file of demo prompts can be downloaded to test the model with a variety of prompts.
  • 🛠️ Update Comfy UI first to ensure compatibility with the new model, using the Manager extension or by manually updating dependencies.
  • 🔄 If Torch's CUDA support breaks after updating, fix it by running the 'update_comfyui_and_python_dependencies' file found in the Comfy UI Windows portable folder.
  • 🌐 Comfy UI's developer provides additional workflows for the different model versions, which can be loaded into Comfy UI by dragging the workflow images onto the canvas.
  • 🎨 The guide demonstrates a simple text-to-image workflow using the model, with settings suggested by Comfy UI's developer: the Euler sampler, the sgm_uniform scheduler, 30 steps, and a CFG value of 5.5.
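
The download step described in the takeaways can also be scripted. The following is a minimal sketch, not the method shown in the video: it assumes the `huggingface_hub` Python package is installed, that the repository id is `stabilityai/stable-diffusion-3-medium`, and that the files carry the names listed in `MODEL_FILES`. Verify all three on the model page before relying on them. The helper name `download_sd3` and the default target folder are hypothetical.

```python
from pathlib import Path

# Assumption: repository id and file names as they appear on the model page.
REPO_ID = "stabilityai/stable-diffusion-3-medium"
MODEL_FILES = {
    "no_text_encoder": "sd3_medium.safetensors",
    "with_clip": "sd3_medium_incl_clips.safetensors",  # the recommended ~6 GB file
}


def download_sd3(variant="with_clip", target_dir="ComfyUI/models/checkpoints", token=None):
    """Download one SD3 Medium file into target_dir and return its local path.

    The license must already be accepted on Hugging Face, and `token`
    must be an access token for that account (or None if already logged in).
    """
    filename = MODEL_FILES[variant]  # raises KeyError for an unknown variant
    # Imported lazily so the module loads even without the package installed.
    from huggingface_hub import hf_hub_download

    Path(target_dir).mkdir(parents=True, exist_ok=True)
    return hf_hub_download(
        repo_id=REPO_ID,
        filename=filename,
        local_dir=target_dir,
        token=token,
    )
```

Calling `download_sd3(token="...")` with your own access token would fetch the recommended file. Downloading through the browser, as the video does, gives the same result.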

Q & A

  • What is the Stable Diffusion 3 medium model?

    -Stable Diffusion 3 Medium is a text-to-image AI model from Stability AI, available for download on Hugging Face for non-commercial use. It is distributed as several files: one without the text encoder and others that bundle the text encoders with the model.

  • Why should I choose the 'sd3_medium_incl_clips.safetensors' file over 'sd3_medium.safetensors'?

    -The 'sd3_medium_incl_clips.safetensors' file is recommended because it includes the CLIP text encoders, which are not present in the 'sd3_medium.safetensors' version. The bundled file can therefore turn text prompts into images without any separately downloaded encoder files.

  • What is the purpose of signing the license on Hugging Face?

    -Signing the license on Hugging Face is necessary for non-commercial use of the Stable Diffusion 3 model. It ensures that the user agrees to the terms of use, which is a requirement for accessing and using the model.

  • Can I use the Stable Diffusion 3 model for commercial purposes?

    -Yes, but only with a commercial use license, which you must obtain by reaching out to Stability AI.

  • What is the size of the 'sd3_medium_incl_clips.safetensors' file?

    -The 'sd3_medium_incl_clips.safetensors' file is approximately 6 GB in size.

  • How can I access different workflows for using the Stable Diffusion 3 model in Comfy UI?

    -You can access the basic, multi-prompt, and upscaling workflows by opening the 'comfy_example_workflows' folder in the model's Hugging Face repository.

  • What is the purpose of the 'sd3 demo prompts.txt' file?

    -The 'sd3 demo prompts.txt' file contains a variety of text prompts that can be used to test the Stable Diffusion 3 model and generate images.

  • Why is it necessary to update Comfy UI before using the Stable Diffusion 3 model?

    -Updating Comfy UI ensures that you have a version compatible with the Stable Diffusion 3 model, so that the new workflows load and run without issues.

  • What should I do if Torch's CUDA support breaks after updating Comfy UI?

    -Go to the Comfy UI Windows portable folder, find the 'update_comfyui_and_python_dependencies' file, and run it to fix the issue.

  • How can I load the workflows provided by comfyanonymous?

    -Download the workflow images provided by comfyanonymous, the developer of Comfy UI, and drag them onto the Comfy UI canvas. This imports the workflow into Comfy UI for use with the Stable Diffusion 3 model.

  • What settings does comfyanonymous suggest for generating images with the Stable Diffusion 3 model?

    -The suggested settings are the Euler sampler with the sgm_uniform scheduler, 30 steps, and a CFG value of 5.5.
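
As an illustration of the sampler settings from the last answer, the sketch below collects them in the shape of a Comfy UI sampler node's inputs. It assumes the field names `steps`, `cfg`, `sampler_name`, `scheduler`, and `denoise`, and reads the sampler named in the video as Euler (`"euler"`). The helper `ksampler_inputs` is hypothetical, and the model, prompt, and latent connections that a real node also needs are left out.

```python
import json


def ksampler_inputs(seed=0, steps=30, cfg=5.5,
                    sampler_name="euler", scheduler="sgm_uniform", denoise=1.0):
    """Return the sampler settings suggested in the video as a plain dict.

    Assumption: the keys mirror the input names of Comfy UI's sampler node.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if cfg <= 0:
        raise ValueError("cfg must be positive")
    return {
        "seed": seed,
        "steps": steps,
        "cfg": cfg,
        "sampler_name": sampler_name,
        "scheduler": scheduler,
        "denoise": denoise,
    }


if __name__ == "__main__":
    # Print the settings so they can be compared with the node in the UI.
    print(json.dumps(ksampler_inputs(seed=42), indent=2))
```

Keeping the values in one place makes it easy to vary a single setting, such as the step count, while comparing renders of the same prompt.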

Outlines

00:00

😀 Introduction to Stable Diffusion 3 Medium

This section introduces the Stable Diffusion 3 Medium model and walks through how to download, update, and run it on a computer. The speaker notes that the images shown are first-roll renders made without refined prompts, and promises testing results and tips for better image quality in a follow-up video. The process begins with signing a free license for non-commercial use on Hugging Face, selecting the appropriate model file (the 6 GB 'sd3_medium_incl_clips.safetensors' file is recommended), and downloading it into the models folder of AUTOMATIC1111 or Comfy UI. The section also touches on the workflows and demo prompts available for testing the model in Comfy UI.
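
To check that the downloaded file ended up where the interface looks for it, a short script can help. This is a sketch under assumptions: it takes the checkpoint folders to be `models/checkpoints` for Comfy UI and `models/Stable-diffusion` for AUTOMATIC1111, relative to each install root, and the helper names are hypothetical.

```python
from pathlib import Path

# Assumption: default checkpoint folders relative to each install root.
CHECKPOINT_SUBDIRS = {
    "comfyui": Path("models") / "checkpoints",
    "automatic1111": Path("models") / "Stable-diffusion",
}


def checkpoint_dir(install_root, ui):
    """Return the folder where the given UI expects checkpoint files."""
    return Path(install_root) / CHECKPOINT_SUBDIRS[ui]


def find_checkpoints(install_root, ui):
    """List the .safetensors files in the UI's checkpoint folder.

    Returns an empty list if the folder does not exist.
    """
    folder = checkpoint_dir(install_root, ui)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.glob("*.safetensors"))


if __name__ == "__main__":
    # "ComfyUI" is a placeholder for your own install path.
    print(find_checkpoints("ComfyUI", "comfyui"))
```

If the model file does not appear in the printed list, it was probably saved to a different folder and will not show up in the checkpoint selector.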

05:03

😺 Setting Up and Using Stable Diffusion 3 Medium in Comfy UI

The second section focuses on setting up the Stable Diffusion 3 Medium model in Comfy UI. It explains why Comfy UI must be updated before using the new model and provides a fix for the case where Torch's CUDA support fails to start after the update. The speaker then shows how to load workflows developed by Comfy UI's developer, including specific settings for the 'sd3_medium_incl_clips.safetensors' model. An example prompt is given, 'cat holding a sign with the text I love you,' which results in a creative image that aligns with the text description. The section concludes with an invitation for viewers to like, subscribe, and look forward to more videos.
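
Before or after running the dependency update, it can be useful to see whether Torch can reach the GPU at all. The sketch below is not part of the video. It assumes it is run with the same Python interpreter that Comfy UI uses, and the function name `cuda_status` is hypothetical.

```python
def cuda_status():
    """Return a one-line description of the Torch and CUDA state."""
    try:
        import torch  # imported lazily so a broken install is reported, not fatal
    except Exception as exc:  # covers a missing package and failed native loads
        return f"torch could not be imported: {exc}"
    if torch.cuda.is_available():
        name = torch.cuda.get_device_name(0)
        return f"torch {torch.__version__} sees CUDA device: {name}"
    return f"torch {torch.__version__} is installed but cannot see a CUDA device"


if __name__ == "__main__":
    print(cuda_status())
```

A message saying that no CUDA device is visible after an update points to the situation the video describes, where the dependency update file needs to be run.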

Keywords

💡Stable Diffusion 3

Stable Diffusion 3 is an AI model that generates images from textual descriptions. It is the central topic of the video, in which the host guides viewers through downloading and running the model on their computers. It belongs to a series of AI models known for creating detailed and imaginative images from text prompts.

💡Hugging Face

Hugging Face is a company that provides a platform for sharing machine learning models. In the context of the video, it is the website where viewers are directed to sign a license agreement to use Stable Diffusion 3 for non-commercial purposes. It is a crucial step in the installation process as it ensures legal compliance with the model's usage terms.

💡License

A license in this context is a legal agreement that allows users to use the Stable Diffusion 3 model under certain conditions. The video mentions that viewers need to sign a free license for non-commercial use, or contact Stability AI for a commercial use license. It is an important aspect as it outlines the permissible use of the technology.

💡Model

In the field of AI, a model refers to a system that has been trained to perform specific tasks, such as image generation. The video discusses different versions of the Stable Diffusion 3 model, including one without a text encoder and another that includes additional components like CLIP and T5 XXL. The choice of model affects the capabilities and size of the AI system being installed.

💡CLIP

CLIP is an acronym for Contrastive Language–Image Pre-training, which is a neural network model that connects an image to the text that describes it. In the video, it is mentioned as part of the Stable Diffusion 3 model, enhancing its ability to understand and generate images based on textual prompts, such as 'cat holding a sign with the text I love you'.

💡T5 XXL

T5 XXL is a version of the Text-to-Text Transfer Transformer (T5) model that has been scaled up to a larger size, providing increased capabilities for text processing tasks. The video script mentions a version of Stable Diffusion 3 that includes T5 XXL, indicating a more advanced model that may offer better performance for certain tasks.

💡Comfy UI

Comfy UI is a node-based user interface for running and managing AI models like Stable Diffusion 3. The video provides a tutorial on how to use Comfy UI to load and run the Stable Diffusion 3 model, including updating the software and managing workflows for image generation.

💡Workflow

In the context of the video, a workflow refers to a series of steps or processes that are followed to accomplish a task within Comfy UI. The host discusses different workflows available for Stable Diffusion 3, such as a basic workflow, a multi-prompt workflow, and an upscaling workflow, which are used to generate images from text prompts.

💡Update

Updating, in the context of software, refers to the process of installing the latest version of a program to incorporate new features, improvements, or bug fixes. The video instructs viewers to update Comfy UI to ensure compatibility with the Stable Diffusion 3 model, which is essential for running the model successfully.

💡Checkpoint

A checkpoint in AI is a saved state of a model's training process. In the video, loading a checkpoint refers to selecting a specific version of the Stable Diffusion 3 model to use within Comfy UI. The host uses the term to describe the process of choosing the model file that will be employed for image generation.

💡Prompt

In the context of AI-generated images, a prompt is the textual description that guides the model in creating an image. The video provides examples of prompts, such as 'cat holding a sign with the text I love you,' which the model uses to generate a corresponding image, demonstrating the model's ability to interpret and visualize text.

Highlights

Stable Diffusion 3 medium is released and the guide will show how to download and run it on your computer.

Fun images rendered with Stable Diffusion 3 are showcased, all first roll and unimproved for initial testing.

The importance of signing the free license for non-commercial use on Hugging Face is emphasized.

For commercial use, reaching out to Stability AI for a license is necessary.

Different versions of the model are available, with the recommendation to use the 'sd3_medium_incl_clips.safetensors' file.

Instructions on downloading the model into the models folder of AUTOMATIC1111 or Comfy UI are provided.

Workflows and demo prompts are available in the Comfy UI example folder for testing the model.

The process of updating Comfy UI to use the new model is explained, including using the Comfy UI Manager.

A potential issue with Torch's CUDA support and the steps to fix it are described.

Loading workflows after updating is necessary to utilize the new model features.

Workflows developed by comfyanonymous are introduced for different model versions.

Instructions on how to load and customize workflows in Comfy UI are given.

Settings recommended by comfyanonymous for the workflow are detailed, including the use of the sgm_uniform scheduler and the Euler sampler.

A demonstration of creating an image with the prompt 'cat holding a sign with the text I love you' is shown.

The model's creative decision in interpreting the prompt and generating a heart in the image is highlighted.

An invitation to like and subscribe for more videos like this is extended.