Best Practice Workflow for Automatic 1111 – Stable Diffusion

AIKnowledge2Go
26 Jun 2023
07:59

TLDR

In this video, the creator shares their preferred workflow for Stable Diffusion in the AUTOMATIC1111 web UI, with a focus on semi-realistic renderings. They offer tips on setting up the interface, choosing parameters for image quality, and using the fast Euler a sampler during prompt engineering. The video then shows how to refine images through image-to-image resampling with an adjusted denoising strength, and walks through creating a detailed scene featuring a female astronaut and an exploding space station.

Takeaways

  • 🎨 The video presents the creator's preferred workflow for Stable Diffusion with the ReV Animated model, which is known for its semi-realistic renderings.
  • 🔧 To show the clip skip slider, go to Settings > User Interface > Quicksettings list, add 'CLIP_stop_at_last_layers', then apply the settings and restart the UI.
  • 💡 The recommended first-pass settings are the Euler a sampler for prompt engineering and no more than 768 pixels on either side, for example 768x432 for a 16:9 image.
  • 📸 Batch size can be increased to render several images at once, giving a wider selection of outputs to choose from.
  • 🖌️ For image refinement, the video uses the 'Send to img2img' button so that the original composition is preserved.
  • 🔍 Changing the sampler to DPM++ 2M Karras enhances image details, especially facial features.
  • 👌 Denoising strength should be set between 0.4 and 0.7, depending on how much the final image should change.
  • 🖼️ The inpaint feature is used to fix errors in the image, such as adding a missing leg to the character, and requires careful attention to the mask settings.
  • 🔄 Resizing the image by a factor of two and then scaling it back up helps maintain detail and clarity in the final output.
  • 🎭 The R-ESRGAN 4x+ Anime6B upscaler is recommended for refining images rendered with the ReV Animated model.
  • 📺 The creator plans to release more tutorials, including one on inpainting hands and one on common mistakes with AUTOMATIC1111.

Q & A

  • What is the main topic of the video?

    -The main topic of the video is the creator's preferred workflow for Stable Diffusion in AUTOMATIC1111, using a semi-realistic model called ReV Animated.

  • What is the significance of the 'clip skip' setting mentioned in the video?

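    -Clip skip controls how many of the final CLIP text encoder layers are skipped when the prompt is encoded; the video sets it to 2. The slider is hidden by default and can be shown by going to Settings > User Interface > Quicksettings list and adding 'CLIP_stop_at_last_layers'.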

  • Why is Euler a recommended for prompt engineering in the video?

    -Euler a is recommended for prompt engineering because it is efficient and fast, making it suitable for the experimental phase of the workflow.

  • What are the recommended width and height settings for the model to avoid deformations?

    -The video recommends keeping both width and height at or below 768 pixels, which is the most that most models can handle without causing deformations.

  • How does the batch size affect the rendering process?

    -The batch size determines the number of images rendered at once. A higher batch size allows for the selection of nicer images from a larger pool.

  • Which sampler does the video switch to for the image-to-image process?

    -The sampler is changed to DPM++ 2M Karras for the image-to-image process.

  • What is the purpose of adjusting the denoising strength in the workflow?

    -Adjusting the denoising strength allows for control over the amount of change introduced to the image, ranging from significant changes to almost no changes.

  • Why is the resolution scaled down and then up again in the inpaint process?

    -The resolution is lowered for the inpainting pass, and scaling the image back up afterwards yields a crisp and detailed result.

  • How does the video creator ensure that the mask is correctly applied during the inpaint process?

    -The creator checks the settings to confirm that it says 'start drawing', which indicates the mask is active.

  • What upscaler method is recommended for refining the final image?

    -The R-ESRGAN 4x+ Anime6B upscaler is recommended for refining images rendered with the ReV Animated model because it suits the model's semi-realistic look.

  • What additional content does the video creator mention for future videos?

    -The video creator mentions upcoming tutorials on inpainting hands and on the seven common mistakes people make with AUTOMATIC1111, along with many more videos.

Outlines

00:00

🎨 Best Workflow for Stable Diffusion in Automatic 1111

This section introduces the speaker's preferred workflow for Stable Diffusion in AUTOMATIC1111, using ReV Animated, a semi-realistic model capable of beautiful renderings. The speaker sets clip skip to 2 and shows how to enable the clip skip slider through the user interface settings. A prepared prompt featuring a female astronaut, a space station, and an exploding station is mentioned, and the speaker intends to share it on their Patreon page. The speaker recommends the Euler a sampler for prompt engineering, explains which dimensions avoid deformations, and discusses batch size and the selection of images for further refinement.
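
The first-pass settings described above can be gathered into one request body. The sketch below assumes the web UI was started with its optional HTTP API enabled and that the `/sdapi/v1/txt2img` endpoint accepts the field names shown. The prompt string is a placeholder rather than the creator's actual prompt, and `build_txt2img_payload` is a hypothetical helper. Nothing is sent over the network; the function only builds the dictionary.

```python
import json


def build_txt2img_payload(prompt, batch_size=8):
    """Collect the first-pass settings from the summary into one request body."""
    return {
        "prompt": prompt,
        "sampler_name": "Euler a",  # fast sampler used while experimenting
        "width": 768,               # 768x432 keeps a 16:9 aspect ratio
        "height": 432,
        "batch_size": batch_size,   # images rendered together in one batch
        # Assumed per-request equivalent of moving the clip skip slider to 2.
        "override_settings": {"CLIP_stop_at_last_layers": 2},
    }


payload = build_txt2img_payload(
    "female astronaut, space station, exploding station in the background"
)
print(json.dumps(payload, indent=2))
```

Keeping the settings in one function makes it easy to vary only the prompt between experiments while the sampler and resolution stay fixed.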

05:02

🖌️ Refining the Image with Sampler and Denoising Settings

The speaker then refines the image using different settings. They explain the transition from initial rendering with Euler a to further experimentation with Hires. fix, and stress that the composition should be preserved when the sampler changes. For that reason the speaker uses the 'Send to img2img' function, so that the final image stays based on the original rendering. They discuss the effect of resizing the image and of adjusting the denoising strength to introduce the desired level of change. The section concludes with the speaker's satisfaction with the rendering results and a call to action for viewers to subscribe for more content.
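
The refinement pass can be sketched the same way as a request body for image-to-image. This assumes the web UI's optional `/sdapi/v1/img2img` endpoint and its field names. `build_img2img_payload` is a hypothetical helper, the doubled target size is an assumption based on the 768x432 first pass, and newer releases may expect the sampler and the Karras schedule as separate fields.

```python
import base64


def build_img2img_payload(image_bytes, prompt, denoising_strength=0.5, batch_count=3):
    """Build a refinement request that starts from an already rendered image."""
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    return {
        # The API expects the source image as a base64 string.
        "init_images": [base64.b64encode(image_bytes).decode("ascii")],
        "prompt": prompt,
        "sampler_name": "DPM++ 2M Karras",  # sampler named in the summary
        "denoising_strength": denoising_strength,
        "n_iter": batch_count,              # batch count: runs done in sequence
        "width": 768 * 2,                   # assumed: double the first-pass size
        "height": 432 * 2,
    }


payload = build_img2img_payload(b"placeholder image bytes", "female astronaut", 0.5)
print(payload["sampler_name"], payload["width"], payload["height"])
```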

Keywords

💡Stable Diffusion

Stable Diffusion is a type of artificial intelligence model used for generating images from textual descriptions. In the context of the video, it is the primary tool being discussed for creating digital art. The video provides tips on getting the most out of it in the AUTOMATIC1111 web UI, which is the basis of the workflow described.

💡ReV Animated

ReV Animated is a semi-realistic Stable Diffusion model capable of producing high-quality, beautiful renderings. The video recommends trying it for its visually appealing output, which is showcased through the examples generated during the walkthrough.

💡CLIP Skip

CLIP Skip is a setting that controls how many of the final layers of the CLIP text encoder are skipped when the prompt is encoded. Setting it to 2 means the output of the second-to-last layer is used instead of the last one, which can noticeably change how a model interprets the prompt. The video sets it to 2 for this workflow.
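
As a rough illustration of what the clip skip value selects, the sketch below treats the text encoder as an ordered list of layer outputs and picks one by counting from the end. The strings are stand-ins for real tensors, the count of twelve layers is an assumption, and `select_clip_layer` is a hypothetical helper rather than the web UI's own code.

```python
def select_clip_layer(layer_outputs, clip_skip=1):
    """Return the encoder layer output that a given clip skip value would use.

    layer_outputs is ordered from the first layer to the last. A clip skip
    of 1 selects the last layer, 2 selects the second-to-last, and so on.
    """
    if not 1 <= clip_skip <= len(layer_outputs):
        raise ValueError("clip_skip is out of range for this encoder")
    return layer_outputs[-clip_skip]


# Stand-ins for the outputs of twelve encoder layers (assumed count).
layers = [f"layer_{i}" for i in range(1, 13)]
print(select_clip_layer(layers, clip_skip=1))  # layer_12
print(select_clip_layer(layers, clip_skip=2))  # layer_11
```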

💡Euler a

Euler a is the sampler chosen for prompt engineering in the Stable Diffusion workflow. It is picked for its speed when experimenting with different settings, which makes it a good fit for the initial stages of image generation where rapid iteration is desired.

💡Resolution

Resolution refers to the dimensions of the generated image, with 768 pixels being the maximum width or height recommended to avoid deformations. The speaker also uses a 16:9 aspect ratio (768x432), a common display format that fits most screens.
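
The arithmetic behind a 16:9 image capped at 768 pixels can be sketched as follows. `fit_resolution` is a hypothetical helper, and rounding each side down to a multiple of 8 is an assumption that reflects the step size commonly used for Stable Diffusion image dimensions.

```python
def fit_resolution(aspect_w, aspect_h, max_side=768, multiple=8):
    """Largest size with the given aspect ratio whose long side fits max_side.

    Each side is rounded down to a multiple of `multiple`.
    """
    long_side = max(aspect_w, aspect_h)
    width = (aspect_w * max_side // long_side) // multiple * multiple
    height = (aspect_h * max_side // long_side) // multiple * multiple
    return width, height


print(fit_resolution(16, 9))  # (768, 432)
print(fit_resolution(9, 16))  # (432, 768)
```

The same function gives a portrait size by swapping the two aspect values, which keeps the long side at the recommended maximum.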

💡Batch Size

Batch Size in the context of AI art generation refers to the number of images generated at once. Increasing the batch size allows the user to have more options to choose from, which is useful when looking for the best image among multiple outputs.
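
Batch size and batch count multiply: batch size is how many images are rendered together in one batch, and batch count is how many batches run one after another. The small sketch below uses the values from the summary; the batch count of 1 in the first pass and the batch size of 1 during refinement are assumptions, and `total_images` is a hypothetical helper.

```python
def total_images(batch_size, batch_count):
    """Number of images produced by a single generation run."""
    if batch_size < 1 or batch_count < 1:
        raise ValueError("both values must be at least 1")
    return batch_size * batch_count


print(total_images(batch_size=8, batch_count=1))  # first pass: 8
print(total_images(batch_size=1, batch_count=3))  # refinement: 3
```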

💡Image-to-Image

Image-to-Image is a method used in the AI art generation process where the AI is instructed to create a new image based on an existing one. This technique is used to refine and improve the final image while maintaining the overall composition set by the initial prompt.

💡Denoising Strength

Denoising Strength is a parameter that controls how far the image-to-image pass may depart from the input image. A higher value (like 0.7) introduces more changes, while a lower value (like 0.4) keeps the result closer to the original. This setting balances preserving the original image's details against the desired level of modification.
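
One way to build intuition for this parameter is a simplified model in which only a fraction of the sampling schedule is re-run on the input image, in proportion to the denoising strength. The sketch below is that simplified model under stated assumptions, not the web UI's exact code; `effective_steps` and the cap of 0.999 are illustrative choices.

```python
def effective_steps(steps, denoising_strength):
    """Approximate number of sampling steps that actually alter the image."""
    # Clamp to [0, 0.999] so a strength of 1.0 still starts from the input.
    strength = min(max(denoising_strength, 0.0), 0.999)
    return int(strength * steps)


for strength in (0.0, 0.4, 0.5, 0.7, 1.0):
    print(strength, effective_steps(20, strength))
```

Under this model, lower strengths re-run fewer steps, which is why the result stays closer to the input image.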

💡Inpaint

Inpaint is the image-to-image mode in which the user paints a mask over part of the generated image so that only that region is re-rendered. It is used to fix or add elements, such as the missing leg on the character mentioned in the video, and is a key step in achieving a polished final result.
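
The role of the mask can be shown with a toy example in which images are small grids of values. Where the mask is 1 the newly generated pixel is used, and where it is 0 the original pixel is kept. This is a conceptual sketch only: `composite` is a hypothetical helper, and real tools typically also soften the mask edge.

```python
def composite(original, generated, mask):
    """Take generated pixels where the mask is set and original pixels elsewhere."""
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


original = [[1, 1], [1, 1]]
generated = [[9, 9], [9, 9]]
mask = [[0, 0], [0, 1]]  # only the bottom-right pixel is repainted
print(composite(original, generated, mask))  # [[1, 1], [1, 9]]
```

An empty mask leaves the image unchanged, which is why it matters to confirm that the mask has actually been applied before rendering.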

💡Upscale

Upscale refers to the process of increasing the resolution of an image, typically to enhance its quality and detail. In the video, the speaker uses the 'R-ESRGAN 4x+ Anime6B' upscaler to improve the final image, which suits images rendered with the ReV Animated model.
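
For completeness, the upscaling step can also be written as a request body. The sketch assumes the web UI's optional `/sdapi/v1/extra-single-image` endpoint and its field names. The scale factor of 2 is a placeholder because the summary does not state one, and `build_upscale_payload` is a hypothetical helper.

```python
import base64


def build_upscale_payload(image_bytes, scale=2, upscaler="R-ESRGAN 4x+ Anime6B"):
    """Build an upscaling request for a single finished image."""
    if scale < 1:
        raise ValueError("scale must be at least 1")
    return {
        # The API expects the source image as a base64 string.
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "upscaling_resize": scale,  # assumed factor, not taken from the video
        "upscaler_1": upscaler,     # upscaler named in the summary
    }


payload = build_upscale_payload(b"placeholder image bytes")
print(payload["upscaler_1"], payload["upscaling_resize"])
```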

💡Tutorial

A tutorial is an instructional video or guide that provides step-by-step information on how to perform a specific task or use a particular tool. In the context of the video, the speaker is creating tutorials to help others learn about AI art generation, including common mistakes and best practices.

Highlights

The video presents the creator's preferred workflow for Stable Diffusion in AUTOMATIC1111.

The recommended model for this workflow is ReV Animated, which is semi-realistic and capable of beautiful renderings.

CLIP skip is set to 2, a setting often requested by viewers in the comments.

To enable the CLIP skip slider, go to Settings > User Interface > Quicksettings list, add 'CLIP_stop_at_last_layers', then apply the settings and restart the UI.

The video provides a prepared prompt featuring a female astronaut, a space station, and an exploding station in the background.

Euler a is suggested for prompt engineering and experimentation due to its speed.

The dimensions 768x432 are recommended to avoid deformations while keeping a 16:9 aspect ratio.

Batch size is increased to 8 for selecting nice images from the rendering process.

The process uses 'Send to img2img' to maintain the composition while changing the sampler.

DPM++ 2M Karras is recommended as the sampler for enhancing details like facial features.

Denoising strength should be set between 0.4 and 0.7 depending on the desired level of change.

Batch count is set to three to provide a selection of images to choose from.

The video demonstrates the selection process of the best image from the rendered options.

Inpaint settings are adjusted to fix issues like the missing leg on the astronaut.

The importance of ensuring the mask is applied correctly in the settings is emphasized.

The final step involves upscaling the image using the R-ESRGAN 4x+ Anime6B upscaler for a semi-realistic look.

The video concludes with a preview of upcoming content, including tutorials on inpainting hands and on common mistakes with AUTOMATIC1111.