Stable Diffusion 2.1 Released!
TLDR
Stable Diffusion 2.1 introduces two new models with 512 and 768 resolution, trained on an improved dataset that enhances architecture, design, wildlife, and landscape quality while reducing adult content. This release refines the NSFW filters and builds upon the capabilities of version 2.0, delivering better anatomy and a wider range of art styles. Users can easily download and install the update, experiencing enhanced imagery across various prompts, including anime, surrealism, and hand anatomy.
Takeaways
- 🚀 Stable Diffusion 2.1 has been released, succeeding version 2.0.
- 🎨 Two new models are introduced in 2.1 - 512 and 768 resolution models.
- 🌐 The 2.1 version was trained on a new dataset, different from the one used for 2.0.
- 🔒 The previous release's 'not suitable for work' (NSFW) filter was set too high, which limited the dataset.
- 🏙️ The new release focuses on architecture, design, wildlife, and landscape scenes, improving quality in these areas.
- 🌟 Stable Diffusion 2.1 offers a balance, enhancing both architectural concepts and natural scenery rendering, as well as people and pop culture images.
- 🔍 The NSFW filters in 2.1 are less sensitive but still reduce most adult content.
- 📈 Anatomy is improved in 2.1, with hands in particular rendered better across various art styles.
- 🔧 Users need to download the new 2.1 768 non-EMA pruned checkpoint and the Stable Diffusion 2.1 config file for setup.
- 💻 Setup instructions are available, and the software can be run on Windows or Linux with the correct configurations.
- 📊 Comparisons between Stable Diffusion 2.0 and 2.1 show clear enhancements in various styles and details.
Q & A
What is the main improvement in Stable Diffusion 2.1 compared to version 2.0?
-Stable Diffusion 2.1 introduces two new models with 512 and 768 resolution, trained on a new dataset. Version 2.0's not suitable for work (NSFW) filter was set too high, which reduced the number of people in its dataset; the 2.1 dataset relaxes that filter while also improving the quality of architecture, design, wildlife, and landscape scenes.
How has the NSFW content been handled in the transition from version 2.0 to 2.1?
-In version 2.1, the NSFW filters are less sensitive than in version 2.0, but they still remove most adult content.
What are the benefits of the fine-tuning process from Stable Diffusion 2.0 to 2.1?
-The fine-tuning process allows Stable Diffusion 2.1 to combine the strengths of its predecessor, including the ability to render beautiful architectural concepts and natural scenery, with improvements in generating images of people and pop culture.
What specific improvements have been made to anatomy and art styles in Stable Diffusion 2.1?
-Stable Diffusion 2.1 has improved anatomy, particularly in hands, and can now produce a range of incredible art styles more effectively compared to version 2.0.
How can one obtain and install the Stable Diffusion 2.1 model?
-The Stable Diffusion 2.1 model can be downloaded from the Hugging Face site by opening the 'Files and versions' section, choosing the 2.1 768 non-EMA pruned checkpoint, and saving it into the Stable Diffusion models directory. The Stable Diffusion 2.1 config file should also be downloaded, saved next to the checkpoint, and given the same name as the model file.
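The naming rule in the answer above can be sketched in Python. This is a minimal illustration, not the web UI's own logic: the checkpoint file name, the models directory, and the `.yaml` extension for the config are all assumptions, and `config_path_for` and `install_config` are hypothetical helper names.

```python
from pathlib import Path
import shutil


def config_path_for(checkpoint: Path) -> Path:
    """Return the config path that shares the checkpoint's base name.

    Assumption: the config sits next to the checkpoint and uses a
    .yaml extension.
    """
    return checkpoint.with_suffix(".yaml")


def install_config(checkpoint: Path, downloaded_config: Path) -> Path:
    """Copy a downloaded config next to the checkpoint under the matching name."""
    target = config_path_for(checkpoint)
    shutil.copyfile(downloaded_config, target)
    return target


# Assumed file name and directory, for illustration only.
example_checkpoint = Path("models/Stable-diffusion/v2-1_768-nonema-pruned.ckpt")
print(config_path_for(example_checkpoint))
```

Calling `install_config(example_checkpoint, Path("downloads/some-config.yaml"))` would copy the downloaded file to the matching name; both paths are placeholders to adapt to your own setup.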
What should users do if they encounter black images when using Stable Diffusion 2.1?
-If users are getting black images, it may be because xformers is not installed. They can work around this by setting the environment variable ATTN_PRECISION to fp16, or by using the --no-half option if they are running the AUTOMATIC1111 web UI.
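One way to apply the environment-variable workaround from a Python launcher is sketched below. It assumes the variable is spelled `ATTN_PRECISION`, as in the Stability AI repository's release notes; `env_with_attention_precision` is a hypothetical helper, and the script name in the usage comment is a placeholder.

```python
import os


def env_with_attention_precision(base=None, precision="fp16"):
    """Return a copy of the environment with ATTN_PRECISION set.

    Assumption: the generation script reads ATTN_PRECISION at startup.
    The caller's own environment is left unmodified.
    """
    env = dict(os.environ if base is None else base)
    env["ATTN_PRECISION"] = precision
    return env


# Usage sketch (placeholder script name, not run here):
#   import subprocess, sys
#   subprocess.run([sys.executable, "your_script.py"],
#                  env=env_with_attention_precision())
```

Building a copy of the environment keeps the setting scoped to the launched process rather than changing it for the whole session.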
How does the quality of the hand anatomy in Stable Diffusion 2.1 compare to version 2.0?
-The hand anatomy in Stable Diffusion 2.1 has been thoroughly redone and improved, resulting in more realistic depictions of hands compared to version 2.0.
What types of prompts were used to test the capabilities of Stable Diffusion 2.1?
-A variety of prompts were used, including a rat in detailed plate armor, a matte acrylic face portrait of a space alien wearing a Tiara, an anime style illustration of a fantasy forest, a surrealism scene with a woman singing opera on the moon, and a normal hand waving goodbye.
What is the main difference in the outputs between Stable Diffusion 2.0 and 2.1 based on the tested prompts?
-The main difference is that Stable Diffusion 2.1 provides improved quality across a range of styles and subjects, including better handling of anatomy and a wider variety of art styles, compared to version 2.0.
How can users share their preferences between Stable Diffusion 2.0 and 2.1?
-Users can share their preferences by comparing the outputs of both versions on various prompts and providing feedback through comments or discussions in relevant forums or platforms.
Outlines
🚀 Introduction to Stable Diffusion 2.1
This paragraph introduces the new Stable Diffusion 2.1 release, highlighting its improvements over the previous 2.0 version. Two new models with 512 and 768 resolution have been added, and the 2.1 version was trained on a refined dataset that excluded inappropriate content, while still maintaining a focus on architecture, design, wildlife, and landscape scenes. The NSFW filters are less sensitive but still effective in reducing adult content. The 2.1 version is fine-tuned from the 2.0 model, offering the best of both worlds with enhanced capabilities for rendering architectural concepts, natural scenery, and detailed images of people and pop culture. The release also boasts improved anatomy and better handling of various art styles.
Keywords
💡Stable Diffusion 2.1
💡Models
💡Data Set
💡NSFW Filters
💡Fine-Tuned
💡Anatomy
💡Art Styles
💡Configuration File
💡Precision
💡Environment Variable
💡Prompts
Highlights
Stable Diffusion 2.1 release introduces two new models with 512 and 768 resolution.
The 2.1 release was trained on a new dataset, addressing the issue that the 2.0 release's NSFW filter was set too high.
The new data set for 2.1 includes more architecture, design, wildlife, and landscape scenes, improving the quality in these areas.
NSFW filters in 2.1 are less sensitive but still reduce the majority of adult content.
Stable Diffusion 2.1 is fine-tuned off the 2.0 version, combining the best aspects of both.
The new release improves anatomy rendering, particularly hands.
A variety of art styles are better represented in 2.1 compared to 2.0.
The AUTOMATIC1111 web UI is easy to download and install for using the new model.
Instructions for downloading and installing the 2.1 model and config file are available on the Stable Diffusion Hugging Face site.
The 2.1 release expects full precision; if you don't have xformers installed, you might get black images.
Options to address precision issues are suggested, such as setting the environment variable ATTN_PRECISION=fp16 or running with the --no-half option.
Comparisons between 2.0 and 2.1 versions show 2.1's enhanced capabilities in rendering detailed images like a rat in plate armor and a matte acrylic face portrait of a space alien.
2.1 has notably improved handling of anime styles and surrealism, such as an illustration of a village and a woman singing opera on the moon.
Hand anatomy in 2.1 has been thoroughly redone and improved, as demonstrated by a photograph of a normal hand.
There is still room for improvement in hand rendering, as noted by the comparison of hands in 2.0 and 2.1.
A test without any negative prompts showcases the raw capabilities of 2.1, allowing for a direct comparison with 2.0.
The preference between 2.0 and 2.1 is subjective, with the presenter favoring 2.1 for its comprehensive improvements.
Viewers are encouraged to share their preferences and to seek help with prompting on 2.0 if needed.