NEW Stable Diffusion 2.1 Tutorial - easy setup + what you need to know

Olivio Sarikas
7 Dec 2022, 06:33

TLDR: This video tutorial introduces Stable Diffusion 2.1, a new image generation model with improved portrait and landscape rendering and better handling of anatomy and hands. It explains how to install and use the model with Automatic1111, which requires the latest version of the web UI and the non-EMA model downloaded from the Hugging Face pages. It also stresses that negative prompts matter more for good results in versions 2.0 and 2.1. The presenter shares test renders to demonstrate the model's capabilities, including a portrait with and without face fix, and apocalyptic cityscapes using both the 512 and 768 versions of the model. The video concludes with a call to action for viewers to like and engage with the content.

Takeaways

  • 🆕 Stable Diffusion 2.1 has been released with improvements in image quality and additional features.
  • 💬 For support and community discussion, join the Discord group or the AI Revolution Facebook group with over 10,000 members.
  • 📸 The blog post provides prompts used to create test images, highlighting the positive and negative prompts for image generation.
  • 🔍 The colon notation (e.g., colon 2, colon -2, colon -4) is used for the Dream Studio page by Stability, not for Automatic1111.
  • 🚫 Importing Invoke AI models into Automatic1111 is not done in the same way as before due to different working mechanisms.
  • 🖼 New features in version 2.1 include better rendering of portraits, landscapes, and architecture, and less strict filtering of not-safe-for-work images.
  • 📐 Users can now create images with more extreme aspect ratios, provided the short side is at least 512 or 768 pixels, depending on the model version.
  • 💻 For installation, download the non-EMA model for either the 768 or 512 version from the Hugging Face pages and place it in the local Automatic1111 folder.
  • 📄 Download the correct YAML file for the model version and rename it to match the model file name.
  • ⚙️ Update the command-line arguments in the Automatic1111 webui-user.bat file to meet the new model's requirements.
  • 🖥 After installation, select the desired model from the web UI to start using Stable Diffusion 2.1.
  • 🎨 Test renders show differences between images generated with and without face fix, and between versions 1.5 and 2.1 of the model.
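
The aspect-ratio rule from the takeaways can be sketched as a small check. This is only an illustration of the rule as the video states it, not something Automatic1111 enforces; the function name `short_side_ok` and its `base` parameter are hypothetical.

```python
def short_side_ok(width: int, height: int, base: int = 768) -> bool:
    """Return True if the shorter side is at least the model's base size.

    `base` is 512 or 768, depending on which 2.1 model is loaded
    (an assumption taken from the video's description).
    """
    if base not in (512, 768):
        raise ValueError("base must be 512 or 768")
    return min(width, height) >= base


if __name__ == "__main__":
    # Wide panorama on the 768 model: short side is exactly 768.
    print(short_side_ok(2048, 768))            # True
    # Same idea, but the short side is too small for the 768 model.
    print(short_side_ok(1536, 512, base=768))  # False
```

The long side is left unconstrained here because, per the video, it is limited mainly by what the user's computer can handle.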

Q & A

  • What is the name of the latest version of Stable Diffusion discussed in the transcript?

    -The latest version discussed is Stable Diffusion 2.1.

  • What are the two types of prompts used in creating test images with Stable Diffusion 2.1?

    -The two types of prompts are the positive prompt and the negative prompt.

  • Why is it important to join the Discord group or AI Revolution Facebook group mentioned?

    -These groups provide a helpful community for discussing Stable Diffusion 2.1, getting support, and addressing any errors or issues.

  • What is the significance of the colon followed by a number (e.g., colon 2) in the context of Stable Diffusion 2.1?

    -The colon followed by a number is notation used in prompts on the Dream Studio page by Stability; it does not apply to Automatic1111.

  • How does the Stable Diffusion 2.1 version handle images that are not safe for work?

    -The filtering for not safe for work images is less strict, which should help with the accuracy of anatomy and hands in the generated images.

  • What is the minimum pixel requirement for the short side of an image when using extreme ratios in Stable Diffusion 2.1?

    -The short side of the image must be at least 512 or even 768 pixels.

  • What is the recommended approach to download the non-ema model for Stable Diffusion 2.1?

    -To avoid errors, one should click on the RAW button of the download link, which displays the content as a page, and then right-click and save as, ensuring the file is saved in the correct local folder structure.

  • What is the purpose of the YAML file in the context of installing Stable Diffusion 2.1 with Automatic1111?

    -The YAML file is required for the inference process and must be renamed to match the model file name for the installation to work correctly.

  • What modification is needed to the command-line arguments in Automatic1111 to accommodate the new model?

    -You need to add --no-half at the end of the command-line arguments to enable full precision, which is required by the new model.

  • How does the face fix feature in Stable Diffusion 2.1 affect the quality of portrait images?

    -The face fix feature improves the quality of portrait images by correcting minor imperfections such as the eyes, resulting in a more accurate and realistic depiction.

  • What is the difference between the 512 and 768 versions of Stable Diffusion 2.1 in terms of rendering?

    -The 768 version is generally expected to produce better results due to its higher resolution, but the 512 version can still generate quality images, as demonstrated in the apocalyptic city example.

  • Why are negative prompts more important in the 2.0 and 2.1 versions of Stable Diffusion compared to the 1.5 version?

    -Negative prompts are more crucial in the 2.0 and 2.1 versions because they play a significant role in guiding the model to avoid generating unwanted elements or styles in the output images.
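
The command-line change described in the Q & A above can be sketched in Python. The snippet assumes the webui-user.bat file contains a line starting with `set COMMANDLINE_ARGS=`; check your own file before relying on that. The helper name `add_flag` is hypothetical, and the sketch works on text only, so it does not touch any file on disk.

```python
def add_flag(bat_text: str, flag: str = "--no-half") -> str:
    """Append `flag` to the COMMANDLINE_ARGS line unless it is already present."""
    prefix = "set COMMANDLINE_ARGS="  # assumed layout of webui-user.bat
    result = []
    for line in bat_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            args = stripped[len(prefix):].split()
            if flag not in args:
                args.append(flag)
            line = prefix + " ".join(args)
        result.append(line)
    return "\n".join(result)


if __name__ == "__main__":
    # Illustrative file content, not copied from a real install.
    sample = "@echo off\nset COMMANDLINE_ARGS=\ncall webui.bat"
    print(add_flag(sample))
```

Running the helper twice leaves the line unchanged the second time, so the flag is never duplicated. Editing the file by hand in a text editor achieves the same result.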

Outlines

00:00

🚀 Introduction to Stable Diffusion 2.1 and Installation Guide

The video introduces the release of Stable Diffusion 2.1, emphasizing its improved features such as better rendering of portraits, landscapes, and architectures, and less strict filtering on not safe for work images. It also mentions the ability to handle more extreme image ratios, which can be beneficial for high-resolution outputs. The speaker provides guidance on how to join a Discord group or a Facebook group for further discussions and support. The installation process for the new model with Automatic1111 is detailed, including downloading the correct model files from Hugging Face Pages, handling YAML files, and adjusting settings in the Automatic1111 web UI for full precision.

05:02

🖼️ Test Renders and Comparison with Previous Versions

The speaker presents test renders comparing the output of Stable Diffusion 2.1 with and without face fix, demonstrating subtle improvements in image quality. They also showcase apocalyptic cityscapes generated with both the 512 and 768 versions of the model, noting personal preference for the 1.5 version's aesthetic but acknowledging potential superior results with 2.1. The importance of negative prompts in versions 2.0 and 2.1 for fine-tuning image generation is highlighted. The video concludes with a call to action for viewers to like the video and a farewell message, followed by an end screen suggesting further content to watch.

Keywords

Stable Diffusion 2.1

Stable Diffusion 2.1 is an updated version of a machine learning model used for generating images from textual descriptions. It is a significant topic in the video as it is the main subject being discussed. The improvements and features of this version are highlighted, such as better image quality and additional art styles.

Dream Studio Page

The Dream Studio page is a platform by Stability AI where users can create images with the Stable Diffusion models. The video mentions it as a place where features such as extreme aspect ratios can be used, although it is a paid service.

Positive and Negative Prompts

Positive and negative prompts are textual instructions provided to the AI model to guide the image generation process. Positive prompts are elements desired in the image, while negative prompts are aspects to be avoided. They are crucial for achieving the desired outcome in the generated images, as discussed in the video.
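
As a rough illustration, a positive and a negative prompt can be kept together as one request. The sketch below builds a JSON payload using field names from the Automatic1111 web API as I understand it (`prompt`, `negative_prompt`, `width`, `height`, `steps`); treat those names as assumptions to verify against your install. The prompt texts are invented examples, not the ones from the video, and nothing is sent over the network.

```python
import json


def build_txt2img_payload(prompt: str, negative_prompt: str,
                          width: int = 768, height: int = 768,
                          steps: int = 20) -> dict:
    """Bundle a positive and a negative prompt into one request body."""
    return {
        "prompt": prompt,                    # what should appear in the image
        "negative_prompt": negative_prompt,  # what the model should avoid
        "width": width,
        "height": height,
        "steps": steps,
    }


if __name__ == "__main__":
    payload = build_txt2img_payload(
        prompt="apocalyptic city, dramatic sky, highly detailed",  # hypothetical
        negative_prompt="blurry, low quality, deformed hands",     # hypothetical
    )
    print(json.dumps(payload, indent=2))
```

Keeping the negative prompt as its own field makes it easy to vary it between runs while the positive prompt stays fixed.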

Invoke AI

Invoke AI is a separate tool for running Stable Diffusion models, not a model itself. The script mentions that it works differently with models, so they cannot be imported in the same way as before.

Automatic1111

Automatic1111 is a web UI used to run the Stable Diffusion model locally. The video provides instructions on how to update it to the latest version and install the new model in it.

Hugging Face Pages

Hugging Face Pages are online repositories where users can find and download different versions of AI models, such as the Stable Diffusion 2.1 model. They are important for obtaining the necessary files to work with the AI model.

Non-EMA Model

The non-EMA model is a version of the model file whose weights are stored without the Exponential Moving Average (EMA) applied. The video specifies it as the version to download for use with Stable Diffusion 2.1.

YAML File

A YAML (YAML Ain't Markup Language) file is a type of file used for configuring software applications. In the context of the video, it is required to properly set up the Stable Diffusion model within Automatic 1111.
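
The renaming step can be sketched as follows: the YAML file is copied next to the model file and given the same base name. The function name `install_config` and the file names in the demo are hypothetical placeholders, not the real download names.

```python
import shutil
import tempfile
from pathlib import Path


def install_config(yaml_src: Path, checkpoint: Path) -> Path:
    """Copy the downloaded YAML next to the model file, matching its name."""
    # Same folder and base name as the model file, but with a .yaml extension.
    target = checkpoint.with_suffix(".yaml")
    shutil.copyfile(yaml_src, target)
    return target


if __name__ == "__main__":
    # Demo in a temporary folder so no real install is touched.
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        model = folder / "my-model-768.ckpt"          # placeholder name
        model.write_bytes(b"")
        downloaded = folder / "downloaded-config.yaml"  # placeholder name
        downloaded.write_text("model: {}\n")
        print(install_config(downloaded, model).name)   # my-model-768.yaml
```

Deriving the YAML name from the model file avoids typos, which matters because the two names must match for the setup to work.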

Full Precision

Full precision means running calculations in 32-bit floating point rather than 16-bit half precision. The video notes that the new model requires it, which is why the --no-half argument is added to the command-line arguments.
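
To make the term concrete, the sketch below stores the same number in 16-bit (half) and 32-bit (single) floating point using Python's standard `struct` module and compares the rounding error. It only illustrates what the precision difference means; it is not how the model itself runs, and the helper name `roundtrip` is hypothetical.

```python
import struct


def roundtrip(value: float, fmt: str) -> float:
    """Store `value` in the given IEEE 754 format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


if __name__ == "__main__":
    half = roundtrip(0.1, "e")    # "e" = 16-bit half precision
    single = roundtrip(0.1, "f")  # "f" = 32-bit single precision
    print("half error:  ", abs(half - 0.1))
    print("single error:", abs(single - 0.1))
```

The half-precision error is several orders of magnitude larger than the single-precision one, which is the trade-off behind running in full precision: more accurate numbers at the cost of more memory.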

Face Fix

Face Fix is a feature or process that improves the quality of generated faces in images. The video shows a comparison between images with and without the face fix feature, demonstrating its impact on the final output.

Anatomy and Hands

Anatomy and hands are specific details in generated images that can be challenging for AI models to render accurately. The video notes that the new Stable Diffusion 2.1 version has improvements in these areas.

Negative Prompts

Negative prompts are used to guide the AI model away from including certain elements in the generated image. The video emphasizes their increased importance in the 2.0 and 2.1 models of Stable Diffusion, suggesting that they require more experimentation to achieve the best results.

Highlights

Stable Diffusion 2.1 has been released with improvements in image quality and model capabilities.

Join the Discord or Facebook group for a helpful community and assistance with errors.

The blog post provides the prompts used to create the test images, showing both the positive and negative prompts.

Differences between colon 2 and colon minus two or four are explained, relating to the Dream Studio page by Stability AI.

Invoke AI works differently with models, requiring a different import method.

The 2.1 version introduces more extreme aspect ratios, dependent on the strength of the user's computer.

For wide images, the short side of the ratio must be at least 512 or 768 pixels.

Dream Studio page is used for rendering wide images, which requires payment for the service.

To install Stable Diffusion 2.1, the latest version of Automatic1111 is required.

Download the non-EMA model from Hugging Face Pages for Stable Diffusion 2.1 for both 768 and 512 versions.

The YAML file is necessary and should be downloaded from the provided link and renamed to match the model file.

Ensure the YAML file and model file have the same file name for compatibility.

Edit the webui-user.bat file to enable full precision for the new model.

After installation, users can select the desired model from the web UI.

Test renders demonstrate the difference between face fix on and off in portrait images.

Apocalyptic city scenes were rendered with both 512 and 768 versions of the 2.1 model.

The 2.0 and 2.1 models place more importance on negative prompts compared to the 1.5 version.

Experimentation with negative prompts is encouraged for optimal results with the 2.1 model.

The video concludes with an invitation to like and subscribe for more content.