Stable Diffusion 3 - How to use it today! Easy Guide for ComfyUI

Olivio Sarikas
18 Apr 2024 · 16:13

TLDR: This video is an easy guide to using Stable Diffusion 3, a new AI image generation model. The host begins by comparing Stable Diffusion 3's output with that of Midjourney and SDXL across various scenes and styles. The video demonstrates the model's ability to create cinematic, artful images and its adherence to color rules. It also highlights the model's strengths in generating detailed and expressive characters, despite some awkward compositions. The host then guides viewers through the installation process, explaining the need for a Stability API account and how to integrate the model with ComfyUI. The video concludes with a discussion of the pricing structure and a call to action for viewers to share their thoughts on the model.


  • 🚀 Stable Diffusion 3 has been released and is now available for use.
  • 🎨 The video compares the visual outputs of Midjourney, SDXL, and Stable Diffusion 3, highlighting their differences in aesthetics and artfulness.
  • 🖼️ Stable Diffusion 3 is praised for its closer alignment to the aesthetic of Midjourney, particularly in terms of color composition and creativity.
  • 📸 All the models are capable of producing high-quality images, but Stable Diffusion 3 shows a bit more artistic flair in certain comparisons.
  • 🌈 Stable Diffusion 3 follows the two-color rule effectively in its generated scenes, creating visually pleasing contrasts.
  • 🐺 A favorite image showcased is a wolf sitting in the sunset, demonstrating the model's ability to create artful compositions.
  • 🐯 Text incorporation in images is a notable feature of Stable Diffusion 3, as seen in its ability to render a tiger with the text 'I love you so much' in a pixel style.
  • 💃 The fashion shoot prompt results in a detailed and stylish portrayal of a poodle, highlighting the model's understanding of fashion and style.
  • 😺 Emotional expressions in cartoonish cats prove to be a challenge for the models, with Stable Diffusion 3 requiring more specific prompts for better results.
  • 🔥 The complex prompt of 'girls with big guns' results in dynamic and detailed images, showcasing the models' capability to handle intricate scenarios.
  • 📝 Installation of Stable Diffusion 3 involves creating an account with Stability AI, generating an API key, and following a straightforward GitHub setup process.

Q & A

  • What is the main topic of the video?

    -The main topic of the video is a guide on how to use Stable Diffusion 3, including comparisons with Midjourney and SDXL and installation instructions.

  • What is the significance of Stable Diffusion 3 in the context of the video?

    -Stable Diffusion 3 is presented as an advancement in image generation. The comparisons show that its aesthetic is closer to Midjourney's and that it can handle a wide variety of prompts.

  • How does the video demonstrate the capabilities of Stable Diffusion 3?

    -The video demonstrates the capabilities of Stable Diffusion 3 by showing generated images from various prompts, comparing them with those from Midjourney and SDXL, and discussing the results.

  • What are the key features of Stable Diffusion 3 that the video highlights?

    -The key features highlighted in the video include the aesthetic and artfulness of the generated images, adherence to color rules, and the ability to create detailed and expressive compositions.

  • How does the video compare Stable Diffusion 3 with Mid Journey SXL?

    -The video compares Stable Diffusion 3 with Midjourney and SDXL by generating images from the same prompts and discussing the differences in composition, color, and overall aesthetic.

  • What issues were noted with Stable Diffusion 3 in the video?

    -Some issues noted with Stable Diffusion 3 include occasional awkward compositions, difficulty with wider formats, and challenges with capturing emotional expressions in characters.

  • What is the process for installing Stable Diffusion 3 as described in the video?

    -The installation process involves creating an account with Stability, generating an API key, purchasing credits, cloning the GitHub project into the ComfyUI custom_nodes folder, configuring the API key in a JSON file, and adding the Stable Diffusion 3 node to ComfyUI.

  • What are the costs associated with using Stable Diffusion 3?

    -New accounts receive some free credits on signing up; further credits must be purchased, subject to a minimum purchase amount. Each image costs 6.5 credits with Stable Diffusion 3 and 4 credits with the Turbo model.

  • How does the video address the issue of text in images generated by Stable Diffusion 3?

    -The video shows examples where text is correctly incorporated into the images, as well as instances where the text is not included or has errors, suggesting that the model may need more detailed prompts to handle text accurately.

  • What are the different modes available in Stable Diffusion 3 as mentioned in the video?

    -The video mentions two modes in Stable Diffusion 3: text to image and image to image, with the latter requiring adjustments to settings for it to work properly.

  • How does the video guide viewers on setting up Stable Diffusion 3 in ComfyUI?

    -The video provides a step-by-step guide, starting from cloning the GitHub project to configuring the API key, and finally adding the Stable Diffusion 3 node to ComfyUI, including the necessary settings for using the model.



🎥 Introduction to Stable Diffusion 3 and Comparison with Midjourney and SDXL

The speaker introduces Stable Diffusion 3 and compares it with Midjourney and SDXL. They discuss the aesthetic and artfulness of the images generated by the models, highlighting the cinematic and beautiful results produced by Midjourney and how close Stable Diffusion 3 comes to this aesthetic. The speaker also comments on the color composition and style of the images, pointing out the awkwardness in some compositions. They further discuss the specifics of the Stable Diffusion 3 model, its improvements, and how effectively it follows the two-color rule.


🖌️ Artistic Comparison: SDXL vs RealVisXL Model

The speaker compares the artistic outputs of the SDXL and RealVisXL models. They analyze the composition, color, and style of the images, expressing their appreciation for the pink highlight and blue contrast in the SDXL result. The speaker also discusses the detailed and beautiful results from the RealVisXL model, noting that they prefer the hair in this model's image. They then present a second result, praising the cool and photogenic image produced by the SDXL model.


😺 Emotional Expressions in Cartoonish Cats and Complex Scenery

The speaker examines the ability of the models to capture emotional expressions in cartoonish cats and complex scenes. They note the variety and accuracy of facial expressions in the Midjourney result and the similarity between the characters in the Stable Diffusion 3 result. The speaker also discusses the challenges the models face in capturing the highly detailed anime style and the dynamic pose in a war zone, suggesting the need for more specific prompts for better results.


💎 Glitter Effect and Wizard on the Hill Prompt

The speaker explores the models' ability to render a glitter effect on a face and to create a scene with a wizard on a hill. They appreciate the beauty and detail in the results from Stable Diffusion 3 and Juggernaut XL, despite some deformation in the hands. The speaker also discusses the famous prompt from the Stable Diffusion 3 announcement, noting that while the image produced by Midjourney is beautiful, it lacks the text and the wizard's spell casting present in the Stable Diffusion 3 result.

🛠️ Installation and Setup of Stable Diffusion 3

The speaker provides a step-by-step guide to installing and setting up Stable Diffusion 3 through the Stability API. They explain the process of creating an account, generating an API key, and purchasing credits. Because the GitHub page is not in English, the speaker suggests translating it. They then detail the process of cloning the GitHub project, modifying the config.json file, and setting up the nodes in ComfyUI to render images with Stable Diffusion 3.
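The config.json step can be sketched in a few lines of Python. This is only an illustration, not the project's own tooling: the field name `STABILITY_KEY` and the helper `set_api_key` are assumptions, so check the config.json shipped with the cloned project for the actual key name before relying on it.

```python
import json
from pathlib import Path


def set_api_key(config_path, api_key, field="STABILITY_KEY"):
    """Store api_key under `field` in a JSON config file, keeping other entries.

    `field` is a placeholder name (an assumption); use whatever key the
    project's own config.json actually contains.
    """
    key = api_key.strip()
    if not key:
        raise ValueError("API key is empty")
    path = Path(config_path)
    # Load the existing config if present so unrelated settings survive.
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    config[field] = key
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `set_api_key(path_to_config, your_key)` has the same effect as opening the file in a text editor, pasting the key, and saving, which is what the video does by hand.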

🎬 Conclusion and Call to Action

The speaker concludes the video by inviting viewers to share their thoughts on the Stable Diffusion 3 models in the comments and to like and subscribe for more content. They also encourage viewers to explore other content on their channel before signing off.



💡Stable Diffusion 3

Stable Diffusion 3 is a new model for generating images from textual descriptions. It is part of the broader AI-driven image synthesis technology. In the video, it is showcased as an advancement in the field, with comparisons made to its predecessors and other models to highlight its capabilities and the quality of the images it produces.

💡Midjourney

Midjourney is another AI model for creating images from text prompts. The video uses it for comparison, to demonstrate the differences in aesthetic and quality between images produced by different models. Midjourney is noted for its cinematic and beautiful image outputs.


💡ComfyUI

ComfyUI is a node-based user interface for running AI image models such as Stable Diffusion 3. The video mentions it as being the first to get new features or updates, and walks through how to install and run the model within it.


💡Prompt

A prompt is a textual description or request given to an AI image generation model to guide the creation of an image. In the video, various prompts are used to demonstrate the AI's ability to interpret and visualize concepts, such as 'sci-fi movie scene' or 'cartoon cat wearing glasses with a series of expressions'.

💡API Key

An API key is a unique identifier used to authenticate a user, device, or application interacting with an API (Application Programming Interface). In the context of the video, the API key is necessary to use the Stability API for running Stable Diffusion 3, which requires users to sign up and generate an API key on the Stability website.
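How an API key typically travels with a request can be sketched as follows. This is a minimal illustration under stated assumptions: the bearer-token scheme, the `Accept: image/*` header, the environment variable name `STABILITY_API_KEY`, and both helper functions are assumptions rather than anything shown in the video, so consult Stability's API reference for the authoritative details.

```python
import os


def auth_headers(api_key=None, env_var="STABILITY_API_KEY"):
    """Build request headers carrying the API key as a bearer token.

    The header names and the env_var default are assumptions for illustration.
    """
    # Fall back to an environment variable so the key stays out of source files.
    key = (api_key or os.environ.get(env_var, "")).strip()
    if not key:
        raise ValueError(f"no API key given and {env_var} is not set")
    return {"Authorization": f"Bearer {key}", "Accept": "image/*"}


def mask_key(key, visible=4):
    """Hide all but the last `visible` characters of a key, e.g. for logs."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Because every generated image is billed against the account that owns the key, it is worth masking the key in logs and screenshots, which is what `mask_key` is for.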

💡Image-to-Image Rendering

Image-to-image rendering is a process where an AI model takes an existing image and transforms it according to a given prompt. It is mentioned in the video as a feature that is intended to be used with Stable Diffusion 3, although the presenter notes that it does not seem to be functioning correctly at the time of the video.
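The difference between the two modes can be illustrated with a small sketch of how the settings might be assembled. Everything here is an assumption for illustration: the function `build_fields`, the field names (`mode`, `aspect_ratio`, `strength`), and the model identifiers are modeled loosely on common REST conventions and are not taken from the video or confirmed against the ComfyUI node.

```python
MODES = ("text-to-image", "image-to-image")
MODELS = ("sd3", "sd3-turbo")  # assumed identifiers for the two models


def build_fields(prompt, mode="text-to-image", model="sd3",
                 negative_prompt="", aspect_ratio="1:1", strength=None):
    """Assemble generation settings, applying mode-specific rules.

    Field names are hypothetical; the real node or API may differ.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    fields = {"prompt": prompt, "mode": mode, "model": model}
    if negative_prompt:
        fields["negative_prompt"] = negative_prompt
    if mode == "text-to-image":
        # With no source image, the output shape comes from the aspect ratio.
        fields["aspect_ratio"] = aspect_ratio
    else:
        # With a source image, a strength value says how far to move away from it.
        if strength is None or not 0.0 <= strength <= 1.0:
            raise ValueError("image-to-image needs a strength between 0 and 1")
        fields["strength"] = strength
    return fields
```

The point of the sketch is that switching modes is rarely a single toggle: settings that make sense for one mode (an aspect ratio) may have to be swapped for others (a source image and a strength), which may be the kind of adjustment the presenter runs into.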

💡Text Embedding

Text embedding in the context of AI image generation refers to the AI's ability to include and interpret text within the generated image. The video demonstrates this with examples where the AI is asked to generate images with specific text, such as 'I love you so much' in a pixel art style.


💡Aesthetic

Aesthetic refers to the visual or sensory qualities that give pleasure or appeal through beauty. The video discusses the aesthetic qualities of the images produced by different AI models, comparing the artfulness and visual appeal of the results.

💡Negative Prompt

A negative prompt is a directive given to an AI image generation model to exclude certain elements or characteristics from the generated image. It is used in conjunction with positive prompts to fine-tune the output of the AI to the user's preferences.

💡Stable Diffusion 3 Turbo

Stable Diffusion 3 Turbo is a variant of the Stable Diffusion 3 model that presumably offers faster image generation. It is mentioned in the context of the credits required to use each model: Turbo costs 4 credits per image against 6.5 for the standard model, which leaves users with a cost-benefit decision.
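The cost-benefit decision comes down to simple arithmetic. The sketch below uses the figures quoted in the video (6.5 credits per Stable Diffusion 3 image, 4 per Turbo image, 23 free credits); the dictionary keys and the function name are illustrative labels, not official identifiers.

```python
import math

# Credits per image, as quoted in the video.
COST_PER_IMAGE = {"sd3": 6.5, "sd3-turbo": 4.0}


def images_affordable(credits, model):
    """Return (whole images that fit in the budget, credits left over)."""
    cost = COST_PER_IMAGE[model]
    count = math.floor(credits / cost)
    return count, credits - count * cost


for model in COST_PER_IMAGE:
    count, left = images_affordable(23, model)
    print(f"{model}: {count} images, {left} credits left")
# sd3: 3 images, 3.5 credits left
# sd3-turbo: 5 images, 3.0 credits left
```

On the free credits alone, that is three standard images or five Turbo images before a purchase is needed.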


💡Installation

Installation refers to the process of setting up or configuring software or models for use. In the video, the presenter walks through the steps required to install and run Stable Diffusion 3 within ComfyUI, which include obtaining an API key, cloning a GitHub project, and configuring settings within the user interface.


Stable Diffusion 3 has been released, offering new capabilities for image generation.

Comparisons between Midjourney, SDXL, and Stable Diffusion 3 showcase the improvements in aesthetics and artfulness.

Stable Diffusion 3 is noted for its closer resemblance to the aesthetic of Midjourney, with better color and composition.

The text 'I love you so much' is successfully incorporated into an image of a tiger, demonstrating the model's ability to handle text.

A poodle in a 1960s fashion shoot is rendered with impressive style and detail by both Stable Diffusion 3 and SDXL models.

Stable Diffusion 3 struggles with generating emotional expressions in cartoonish cats, requiring more detailed prompts.

Midjourney's image of a girl with big guns is highly detailed and well composed, setting a high standard.

Stable Diffusion 3's first attempt at the 'girl with guns' prompt has inaccuracies, but improvements are seen in subsequent attempts.

The 'wizard on the hill' prompt is effectively rendered by Stable Diffusion 3, including text and magical elements.

Stable Diffusion 3 requires an account with Stability AI and uses their API, with a straightforward installation process.

Users are provided with 23 free credits upon signing up with Stability AI, with the option to purchase more.

The installation guide for Stable Diffusion 3 on GitHub is initially in Chinese but can be easily translated to English.

The configuration for Stable Diffusion 3 involves editing a JSON file with the user's API key and saving the changes.

The 'Stable Diffusion 3' node in ComfyUI is used for image generation, with options for text-to-image or image-to-image rendering.

Users can adjust settings such as positive and negative prompts, aspect ratio, and model selection within the ComfyUI node.

The tutorial provides insights into the strengths and limitations of Stable Diffusion 3, encouraging experimentation and feedback.

The video concludes with a call to action for viewers to like, subscribe, and share their thoughts on the Stable Diffusion 3 models.