the best REALISTIC models for Stable Diffusion

James Beltman
26 Jul 2023 08:44

TLDR: The video discusses the best models for creating lifelike images with Stable Diffusion. The presenter's favorite model, Epic Realism, is highlighted for its exceptional capture of facial detail. Tips for using the model include keeping prompts simple, fine-tuning parameters such as steps and CFG scale, and choosing specific samplers for added realism. High-res upscalers such as NMKD Faces are recommended for better image detail. The video also covers Magic Mix, which is ideal for dramatic scenes but tends to generate East Asian women, and Analog Madness, praised for its versatility in generating ordinary individuals. The presenter emphasizes the importance of vivid prompts and provides a workflow for each model, including sampler choice and step count for optimal results.

Takeaways

  • 🎨 **Epic Realism**: A favorite model for creating lifelike images, especially good at capturing facial details.
  • 📈 **Simplicity is Key**: Avoid using keywords like 'Masterpiece' or '8K' in prompts as they don't affect the outcome.
  • πŸ” **Fine-Tuning Parameters**: Keeping steps above 20 and adjusting CFG scale to 5 can help maintain quality and realism.
  • 🧩 **Sampler Selection**: DPM++ SDE Karras or DPM++ 2M Karras are recommended for an extra dose of realism.
  • 📱 **High-Res Upscaling**: Using upscalers like NMKD Superscale or NMKD Faces with a denoising strength of 0.35 and an upscale factor of 2 improves detail.
  • ❌ **Negative Keywords**: Effective use of negatives helps refine the image and avoid biases like the tendency to generate East Asian women.
  • 💡 **Lighting Details**: There's no need for extra lighting keywords; the model captures light and shadow details well.
  • 🚫 **Avoid Over-Description**: Over-describing the face can lead to less desirable results.
  • 🌟 **Magic Mix Model**: Specializes in dramatic and dark scenes but has limitations in facial generation.
  • 🔄 **Versatility with Analog Madness**: Best for generating images of ordinary individuals with vivid and robust prompts.
  • πŸ“ **Crafting Prompts**: Specific and pointed prompts work extremely well with Analog Madness for detailed and unique images.

Q & A

  • What is the title of the video?

    -The video is titled 'the best REALISTIC models for Stable Diffusion'.

  • Which model is mentioned as the author's current favorite for creating lifelike images?

    -Epic Realism is mentioned as the author's current favorite model for creating lifelike images.

  • What is the key to maintaining a perfect balance between quality and realism in image generation?

    -The key to maintaining a perfect balance between quality and realism lies in fine-tuning several parameters such as steps, CFG scale, and choosing the right sampler.

  • What are some negative keywords that should be included to enhance the realistic qualities of the generated images?

    -Negative keywords such as 'cartoon', 'painting', and 'illustration' should be included to enhance the realistic qualities of the generated images.

  • What are the recommended settings for the high res upscaler to improve the level of detail on the generated image?

    -For the high res upscaler, a denoising setting of 0.35 and an upscale factor of 2 are recommended to improve the level of detail on the generated image.

  • What is the limitation of the Magic Mix model when it comes to facial generation?

    -The limitation of the Magic Mix model is that it tends to generate East Asian women almost exclusively and often leans towards a uniform, unrealistic 'TikTok slim face filter' look.

  • What are the recommended samplers for the Magic Mix model?

    -The recommended samplers for the Magic Mix model are Euler a, Euler, DPM++ 2M Karras, or DPM++ SDE Karras.

  • What is the suggested range for the number of steps when using the Magic Mix model?

    -The suggested range for the number of steps when using the Magic Mix model is between 20 and 40.

  • How does the Analog Madness model differ from other popular models?

    -The Analog Madness model differs from other popular models by its ability to generate images of ordinary individuals, offering a refreshing alternative to the supermodel renditions often produced by other models.

  • What is the recommended sampler for the Analog Madness model?

    -The recommended sampler for the Analog Madness model is DPM++ SDE Karras.

  • What is the significance of crafting specific and pointed prompts when using the Analog Madness model?

    -Crafting specific and pointed prompts is crucial for the Analog Madness model as it significantly influences the output, making the images more captivating and realistic.

  • What are some additional tips provided to harness the power of Epic Realism?

    -Some additional tips include making effective use of negatives, avoiding biases towards certain ethnicities, not over-describing the face, and using the Epic Realism helper tool by the same author for further enhancement of realism.

Outlines

00:00

πŸ–ΌοΈ Epic Realism for Lifelike Image Generation

The first paragraph introduces the 'Epic Realism' model, which is favored for its ability to convert simple prompts into highly realistic images and is particularly noted for its facial detail. The speaker recommends keeping prompts simple and adjusting parameters such as steps, CFG scale, and the choice of sampler for optimal results. The use of an upscaler to enhance image resolution and detail is also emphasized, with specific settings suggested for 'NMKD Superscale' and 'NMKD Faces'. The importance of using negatives to refine the image and counter biases, such as the tendency to generate East Asian women, is discussed. The paragraph concludes with instructions on how to download and install the Epic Realism model.

05:00

🎭 Magic Mix for Dramatic and Moody Scenes

The second paragraph discusses the 'Magic Mix' model, which is recognized for its strength in creating dramatic and darkly lit scenes with a moody and mysterious atmosphere. However, the model has limitations, particularly in facial generation, where it tends to default to a slim-faced East Asian woman look. The speaker shares their workflow with this model, including the preferred sampler, the optimal number of steps, and the recommended upscaler settings. The importance of adjusting the CFG scale and using specific terms in prompts to enhance image quality is highlighted. The paragraph concludes with a demonstration of the model's output and a reminder of its potential for crafting AI-generated artwork with striking lighting effects and atmospheric settings.

Keywords

💡Epic Realism

Epic Realism is a model for Stable Diffusion that is favored for its ability to transform simple prompts into highly realistic images. It excels at capturing facial details, which is often a challenge for other models. In the video, it is mentioned as the creator's current favorite due to its stunning lifelike results. It is used as an example to demonstrate how fine-tuning parameters can lead to high-quality realistic outputs.

💡Prompts

Prompts are the textual instructions given to the Stable Diffusion model to guide the generation of images. The script emphasizes the importance of keeping prompts simple and avoiding the addition of extra keywords that do not contribute to the outcome. Effective prompts are crucial for directing the model to produce desired results, as illustrated by the examples provided in the transcript.

💡Parameters

Parameters in the context of the video refer to the settings that can be adjusted within the Stable Diffusion model to achieve a balance between quality and realism. The script mentions steps, CFG scale, and the choice of sampler as key parameters. Fine-tuning these parameters allows for better control over the image generation process and can help to correct errors or artifacts in the generated images.

💡Sampler

A sampler in Stable Diffusion is the algorithm that drives the step-by-step denoising process that turns noise into an image. The video discusses different samplers such as DPM++ SDE Karras, DPM++ 2M Karras, and DPM fast, noting that each has its own strengths and can affect the level of realism in the output. The choice of sampler can be a personal preference based on the desired outcome.

💡High Res Upscale

High Res Upscale refers to a process used to enhance the resolution of generated images, making them more detailed and clearer. The video recommends using specific upscalers such as NMKD Superscale or NMKD Faces with a denoising strength of 0.35 and an upscale factor of 2 to improve the quality of the images. This technique is shown to be particularly effective in refining facial features and other intricate details.
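To make the effect of the upscale factor concrete, the small sketch below computes the final image size from a base size. The base size of 512x768 is an assumed example, not a value from the video, and snapping to a multiple of 8 is an assumption based on the fact that Stable Diffusion works on a latent image downsampled by a factor of 8.

```python
from typing import Tuple


def hires_target_size(width: int, height: int, scale: float = 2.0,
                      multiple: int = 8) -> Tuple[int, int]:
    """Return the upscaled size, snapped to the nearest multiple of `multiple`."""
    if scale <= 0:
        raise ValueError("scale must be positive")

    def snap(value: float) -> int:
        # Round to the nearest multiple, never going below one multiple.
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width * scale), snap(height * scale)


# Assumed base size of 512x768 with the recommended upscale factor of 2.
print(hires_target_size(512, 768))  # (1024, 1536)
```

A factor of 2 quadruples the pixel count, which is why the extra pass gives the model room to refine small features such as faces.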

💡Negatives

Negatives are terms or conditions included in the prompts to specify what should be avoided in the generated images. The script discusses the importance of using negatives effectively to increase realism and to define what is not desired in the output. An example given is adding 'Asian, Chinese' to the negatives if the aim is to avoid generating images biased towards East Asian women.
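As a rough illustration of managing negatives, the sketch below merges a base list of style negatives with extra terms into a single comma-separated negative prompt, dropping duplicates. The base terms 'cartoon', 'painting', and 'illustration' come from the video summary. The helper name `build_negative_prompt` and the extra term 'anime' are hypothetical and only there to show the merge.

```python
from typing import Iterable, List

# Style negatives named in the video summary.
REALISM_NEGATIVES: List[str] = ["cartoon", "painting", "illustration"]


def build_negative_prompt(base_terms: Iterable[str],
                          extra_terms: Iterable[str] = ()) -> str:
    """Join terms into one negative prompt, skipping blanks and case-insensitive repeats."""
    seen = set()
    ordered = []
    for term in list(base_terms) + list(extra_terms):
        cleaned = term.strip()
        key = cleaned.lower()
        if cleaned and key not in seen:
            seen.add(key)
            ordered.append(cleaned)
    return ", ".join(ordered)


# "Painting " repeats a base term and is dropped; "anime" is a hypothetical extra.
print(build_negative_prompt(REALISM_NEGATIVES, ["Painting ", "anime"]))
# cartoon, painting, illustration, anime
```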

💡Magic Mix

Magic Mix is another model for Stable Diffusion that is recognized for its strengths in creating dramatic and dark-lit scenes with moody and mysterious qualities. However, it has limitations, particularly with facial generation, often defaulting to East Asian women with a slim face filter look. The video provides optimization tips for using Magic Mix, including sampler options and step ranges.
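The sampler options and step range mentioned elsewhere in this summary for Magic Mix (Euler a, Euler, DPM++ 2M Karras, or DPM++ SDE Karras, with 20 to 40 steps) can be written as a simple check, sketched below. The sampler spellings are a reading of the names given in the summary, and the function name `check_magic_mix` is hypothetical.

```python
from typing import List

# Assumed spellings of the samplers named in the summary.
MAGIC_MIX_SAMPLERS = {"Euler a", "Euler", "DPM++ 2M Karras", "DPM++ SDE Karras"}
# The suggested range of 20 to 40 steps, inclusive.
MAGIC_MIX_MIN_STEPS = 20
MAGIC_MIX_MAX_STEPS = 40


def check_magic_mix(sampler: str, steps: int) -> List[str]:
    """Return a list of warnings; an empty list means the settings match the tips."""
    problems = []
    if sampler not in MAGIC_MIX_SAMPLERS:
        problems.append(f"sampler '{sampler}' is not among the recommended ones")
    if not MAGIC_MIX_MIN_STEPS <= steps <= MAGIC_MIX_MAX_STEPS:
        problems.append(f"steps={steps} is outside the 20 to 40 range")
    return problems


print(check_magic_mix("Euler a", 30))   # []
print(check_magic_mix("DPM fast", 10))  # two warnings
```

Returning warnings instead of raising keeps the check advisory, which fits tips that are recommendations and not hard limits.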

💡Analog Madness

Analog Madness is a versatile and dynamic model within Stable Diffusion that is capable of generating images of ordinary individuals, offering a refreshing alternative to the typical supermodel renditions. The power of this model lies in the strength of the prompts provided, with more vivid and robust prompts leading to more captivating outputs. The video outlines a workflow with this model, emphasizing the importance of specific and pointed prompts.

💡Resolution

Resolution in the context of image generation refers to the level of detail and clarity present in the resulting images. The video discusses the importance of achieving high resolution to ensure fine details, especially in facial features. The use of upscalers is recommended to improve resolution and to avoid issues like smudged faces in the generated images.

💡Bias

Bias in the context of AI models like Stable Diffusion refers to the tendency of the model to favor certain outcomes over others. The script notes that many realistic models in Stable Diffusion are biased towards creating East Asian women. Understanding and addressing such biases is important for creating a wider range of realistic and diverse images.

💡Textual Inversions

Textual Inversions are small trained embeddings that are referenced by name in a prompt, here in the negatives, to steer the model away from undesirable features or elements in the images. The video mentions ng_deepnegative and badhandv4 as examples, which help to prevent malformed anatomy and improve the realism of hands in the generated images, a common challenge for AI image generation models.

Highlights

Epic Realism is a favored model for creating lifelike images, particularly excelling in capturing facial details.

The model can transform simple prompts into stunningly realistic results.

Users can view and click on images on the download page to see prompts used by others.

For optimal results, avoid adding extra keywords like 'Masterpiece' or '8K' to prompts.

Including words like 'cartoon' or 'painting' in the negatives can help maintain realistic qualities.

Fine-tuning parameters such as steps and CFG scale is key to balancing quality and realism.

The author recommends setting the CFG scale to five for a realistic feel.

Samplers like DPM++ SDE Karras or DPM++ 2M Karras are best for an extra dose of realism.

High-res upscalers such as NMKD Superscale or NMKD Faces can significantly improve image detail.

Using a denoising strength of 0.35 and an upscale factor of 2 can enhance facial details.

Effective use of negatives can add realism and define what is not wanted in the image.

The model tends to be biased towards creating East Asian women, which can be adjusted with specific prompts.

Lighting keywords are unnecessary as the model captures light, shadows, and intricate details well.

Over-describing the face can lead to less desirable results.

Epic Realism can be further enhanced with additional tools such as the Epic Realism helper.

The Magic Mix model excels in dramatic and darkly lit scenes, enhancing moodiness and mystery.

Magic Mix has limitations with facial generation, often generating East Asian women with a slim face filter look.

Optimal samplers for Magic Mix include Euler a, Euler, DPM++ 2M Karras, or DPM++ SDE Karras.

The sweet spot for the number of steps with Magic Mix is between 20 and 40.

Analog Madness is a versatile model that can generate images of ordinary individuals.

The power of Analog Madness lies in the potency of the prompts provided.

The DPM++ SDE Karras sampler is the ideal choice for working with Analog Madness.

Keeping steps and CFG scale within the recommended parameters helps tap into Analog Madness's potential.