FREE Stable Diffusion 2.1 Is The Biggest Disappointment Yet?

9 Dec 2022 · 21:57

TLDR: The speaker, 'Overlord', discusses the release of Stable Diffusion 2.1 and compares it with previous versions. They have mixed feelings about the update, noting minor improvements but also raising concerns about the community's expectations and the company's communication. Overlord highlights the ability to generate wide images and better hands in 2.1, but still prefers the quality and versatility of version 1.5. They also mention the potential of future models and the community's reaction to the changes in artistic style and censorship.


  • 🎥 The speaker, Overlord, discusses the release of Stable Diffusion 2.1 and shares mixed feelings about it.
  • 🤖 Overlord considered not making a video at all because so little changed from version 2.0 to 2.1.
  • 📹 The video aims to be a simpler, less edited version focusing on the new release, community reactions, and the future of Stable Diffusion.
  • 💬 The speaker emphasizes that their opinions are subjective but covers various topics in the video.
  • 🔧 Installation instructions for Stable Diffusion 2.1 are provided, including downloading the 768 and 512 models and yaml files.
  • 🚀 The 2.1 version allows for super wide images, a feature not possible in previous versions.
  • 🖼️ The speaker compares the image quality of versions 1.5, 2.0, and 2.1, finding 1.5 superior in terms of realism and detail.
  • 👐 The 2.1 version reportedly improves hand generation, but the differences are minimal and dependent on the image type.
  • 🎨 Art styles are reintroduced in 2.1, but the speaker questions their utility and effectiveness compared to 1.5.
  • 💭 The community's expectations for continuous improvement are acknowledged, but the speaker feels that Stable Diffusion is lagging behind competitors like Midjourney.
  • 💬 Stability AI's communication strategies are criticized for being unclear and not adequately addressing community concerns.

Q & A

  • What is the main topic of the video?

    -The main topic of the video is the discussion of the newly released Stable Diffusion 2.1 and its comparison with previous versions, specifically 2.0 and 1.5.

  • What is the speaker's overall opinion on Stable Diffusion 2.1?

    -The speaker is not very pleased with Stable Diffusion 2.1, finding it only a minor improvement over the previous versions and not significantly better in terms of image generation quality.

  • How does the speaker describe the differences between Stable Diffusion 2.1 and 2.0?

    -The speaker describes the differences as minimal. Version 2.1 brings back some capabilities from 1.5, such as art styles and more recognizable celebrity images, but the speaker does not see it as a substantial upgrade.

  • What are some of the specific improvements mentioned in Stable Diffusion 2.1?

    -Improvements in Stable Diffusion 2.1 include the ability to generate super wide images, better-shaped hands, and slightly less aggressive filtering, which had been an issue in version 2.0.

  • What does the speaker think about the community's reaction to the new releases?

    -The speaker feels that the community is acting a bit entitled and should appreciate the free access to the technology, which did not exist a few months ago.

  • How does the speaker compare Stable Diffusion to other AI models like Midjourney?

    -The speaker believes that Stable Diffusion is currently far behind models like Midjourney in terms of ease of use and quality of generated images.

  • What issue does the speaker have with Stability AI's communication?

    -The speaker criticizes Stability AI for being secretive and not effectively communicating the reasons and details behind their model updates and decisions.

  • What is the speaker's recommendation for users interested in trying Stable Diffusion 2.1?

    -The speaker recommends that users can download and install Stable Diffusion 2.1 on their own computers for free to try it out and experience the differences themselves.

  • What does the speaker suggest about the future of Stable Diffusion?

    -The speaker suggests that with minor improvements over time, Stable Diffusion could eventually produce results as good as or better than current models like Midjourney, especially once DreamBooth 2.0 is released.

  • How does the speaker address the issue of censorship in Stable Diffusion models?

    -The speaker mentions that Stability AI, which was initially against censorship, succumbed to demands to censor their future models, causing a split in the community and disappointment among some users.

  • What is the speaker's final verdict on Stable Diffusion 2.1?

    -The speaker concludes that while 2.1 can produce some cool images, especially in landscape and ultra-realistic styles, it is still significantly behind other AI models in terms of immediate usability and image quality.



📺 Introduction to Stable Diffusion 2.1

The speaker begins by introducing the Stable Diffusion 2.1 release, expressing dissatisfaction with the new version. They debate whether to make a video about it due to the lack of significant changes from version 2.0. The speaker decides to create a simpler video to discuss the new release, its impact on the community, and the future of Stable Diffusion. They also mention plans to cover installation instructions for version 2.1 and the differences between the 2.1, 2.0, and 1.5 versions.


🔄 Comparison of Stable Diffusion Versions

The speaker compares the different versions of Stable Diffusion, stating that version 1.5 is superior for image creation. They express their belief that versions 2.0 and 2.1 do not offer significant improvements over 1.5, especially in terms of understanding prompts. The speaker provides examples of images generated by the different versions, emphasizing the realism and detail in 1.5's output compared to 2.0 and 2.1. They also discuss the minor improvements in 2.1, such as the ability to generate wider images and slightly better hands, but overall find the differences minimal.


🎨 Art Styles and Image Quality

The speaker discusses the art styles and image quality in Stable Diffusion 2.1. They argue that while 2.1 allows for more artistic styles, the results are not as impressive as one might hope. The speaker provides examples of images created with various art styles in 2.1 and compares them to those generated by 1.5, expressing a preference for the latter. They note that while 2.1 can produce some cool landscape images, the community's ability to create diverse and unique art styles seems to have been diminished with the newer versions.


💭 Community Reactions and Expectations

The speaker reflects on the community's reaction to the new Stable Diffusion releases. They express concern that the community has become too entitled and unappreciative of the free access to this technology. The speaker acknowledges the incredible opportunity for anyone with an internet connection to use Stable Diffusion and generate impressive images. However, they also express disappointment with the current state of Stable Diffusion compared to other AI models like Midjourney.


🚀 Future of Stable Diffusion and Communication Issues

The speaker discusses the future potential of Stable Diffusion, noting that minor improvements over time could lead to significant advancements. They express anticipation for the release of DreamBooth 2.0, which could allow the community to customize and enhance the 2.0 model. The speaker also criticizes Stability AI's communication strategies, noting the company's lack of transparency and clarity in explaining their decisions and the new features of their models.



💡Stable Diffusion

Stable Diffusion is an AI model used for text-to-image generation. In the video, the speaker discusses the release of Stable Diffusion 2.1 and compares it with previous versions, emphasizing its capabilities and limitations in creating images based on textual prompts.

💡Version 2.1

Version 2.1 is the latest release of Stable Diffusion at the time of the video. It is described as a minor improvement over version 2.0, with art styles from version 1.5 reintroduced and a new ability to generate wide images. However, the speaker expresses dissatisfaction with the overall improvements.


💡Installation

Installation refers to the process of downloading and setting up the Stable Diffusion 2.1 model on one's computer. The speaker provides a step-by-step guide on how to install the model, including the specific files required and adjustments to the webui-user.bat file.
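The file placement described above can be sketched in Python. This is only an illustration under stated assumptions: the `models/Stable-diffusion` subfolder and the convention that the yaml file shares the checkpoint's base name are assumptions about a typical web UI layout, not details confirmed in the summary, and `install_model` is a hypothetical helper.

```python
from pathlib import Path
import shutil


def install_model(checkpoint: Path, config: Path, webui_dir: Path) -> tuple[Path, Path]:
    """Copy a checkpoint and its yaml config into the web UI's model folder.

    The config is renamed to share the checkpoint's base name so that the
    two files sit side by side as a matching pair (assumed convention).
    """
    # Assumed folder layout; adjust if your web UI stores models elsewhere.
    models_dir = webui_dir / "models" / "Stable-diffusion"
    models_dir.mkdir(parents=True, exist_ok=True)

    ckpt_dest = models_dir / checkpoint.name
    yaml_dest = models_dir / (checkpoint.stem + ".yaml")

    shutil.copy2(checkpoint, ckpt_dest)
    shutil.copy2(config, yaml_dest)
    return ckpt_dest, yaml_dest
```

Calling `install_model(Path("model-768.ckpt"), Path("inference.yaml"), Path("stable-diffusion-webui"))` (all three names are placeholders) would leave `model-768.ckpt` and `model-768.yaml` next to each other in the model folder.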

💡Art Styles

Art styles refer to the different visual aesthetics that can be applied to the images generated by Stable Diffusion. The speaker mentions that version 2.1 reintroduces the ability to add art styles to images, but questions the overall improvement in quality compared to version 1.5.

💡Celebrities

Celebrities in the context of the video refers to generated images of well-known people. The speaker discusses the ability of Stable Diffusion 2.1 to generate recognizable likenesses of celebrities, but notes that this feature is not as effective as in version 1.5.


💡Hands

Hands are a specific detail in the generated images that the speaker critiques. The speaker notes that while version 2.1 may produce better-shaped hands, the differences are minimal and dependent on the type of image and seed used.


💡Community

Community refers to the group of users and enthusiasts who engage with and provide feedback on Stable Diffusion. The speaker expresses concern that the community may be too demanding and unappreciative of the free access to the technology.


💡Midjourney

Midjourney is another AI model for text-to-image generation, which the speaker compares favorably to Stable Diffusion. The speaker praises Midjourney for its ability to generate high-quality, stylized images and for its ease of use.


💡Censorship

Censorship in the context of the video refers to the changes made to Stable Diffusion in response to external demands, which limited the model's capabilities. The speaker discusses the controversy surrounding these changes and the impact on the user experience.


💡Communication

Communication refers to the way Stability AI, the company behind Stable Diffusion, interacts with its user community. The speaker criticizes the company for its lack of transparency and clear communication regarding updates and changes to the model.

💡Future of Stable Diffusion

The future of Stable Diffusion encompasses the potential improvements and developments expected for the AI model. The speaker discusses the possibility of future versions overcoming current limitations and the anticipation of the release of DreamBooth 2.0.


The speaker introduces themselves as Overlord and expresses dissatisfaction with the recent release of Stable Diffusion 2.1.

The speaker debates whether to make a video about Stable Diffusion 2.1 due to limited new information compared to version 2.0.

The speaker decides to make a simpler video with less editing, which they call a 'rant video'.

The video will cover the installation process of Stable Diffusion 2.1, differences between versions, and the speaker's opinions on the release.

The speaker emphasizes that the content will be presented early in the video to avoid negative comments about the video's pacing.

Instructions are provided for downloading the 768 and 512 models of Stable Diffusion 2.1, with a recommendation for the 768 model.

The speaker expresses concern over the presence of pickle imports in the model files, which they describe as typically a sign of potential viruses.

The process for installing the models involves downloading yaml files and placing the necessary files in the Stable Diffusion web UI folder.

The speaker recommends using the '--xformers' or '--no-half' arguments when running the model; '--no-half' runs it at full precision.

The differences between versions 2.1 and 2.0 are described as minor, with 2.1 allowing for wider image generation but requiring a powerful GPU.

The speaker compares the quality of images generated by versions 1.5, 2.0, and 2.1, finding that 1.5 often produces more realistic images.

The speaker notes that version 2.1 shows some improvement in generating better-shaped hands compared to previous versions.

The speaker criticizes the community for being entitled and reminds them that they receive all this technology for free.

The speaker expresses disappointment with the lack of significant advancements in Stable Diffusion compared to other AI models like Midjourney.

The speaker discusses the community's split reaction to the 2.0 release and the perceived betrayal by Stability AI.

The speaker praises Midjourney for its ease of use and higher quality image generation capabilities.

The speaker shares their personal experience of switching from Stable Diffusion to Midjourney for thumbnail creation due to superior results.

The speaker criticizes Stability AI's poor communication with the community and their cryptic messages.

The speaker remains hopeful for the future of Stable Diffusion, especially with the upcoming release of DreamBooth 2.0.

The speaker concludes by encouraging viewers to try Stable Diffusion 2.1 for themselves and shares gratitude for their supporters.
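
The pickle-import concern raised in the highlights can be illustrated with a short Python sketch. It uses the standard library's `pickletools` to list which modules and names a pickle stream would import when loaded, without actually loading it. The helper `list_pickle_imports` is hypothetical, works on raw pickle bytes (extracting those bytes from a checkpoint file is left out), and handles the `STACK_GLOBAL` opcode with a simple heuristic that can miss memoized names, so it should be read as a rough illustration rather than a security scanner.

```python
import pickletools


def list_pickle_imports(data: bytes) -> set[str]:
    """Return 'module.name' strings that a pickle stream references.

    Only inspects opcodes; the pickle is never executed or loaded.
    """
    imports: set[str] = set()
    recent_strings: list[str] = []  # string arguments seen so far, in order

    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name".
            module, name = arg.split(" ", 1)
            imports.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            # Heuristic: assume the two most recent strings are module and name.
            if len(recent_strings) >= 2:
                imports.add(f"{recent_strings[-2]}.{recent_strings[-1]}")
        elif isinstance(arg, str):
            recent_strings.append(arg)

    return imports
```

A stream containing only plain data such as dictionaries and numbers yields an empty set, while a stream that references classes or functions lists them, which is what makes unexpected entries worth a closer look.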