* This blog post is a summary of this video.

Behind the Scenes Look at Stability AI's New SDXL Model Architecture and Release Plans

Author: Ai Flux
Time: 2024-03-23 15:40:00

Introducing the New SDXL Model for High Quality Image Generation

Stability AI has been working on a new AI image generation model called SDXL (Stable Diffusion Extra-Large). It builds on the previous Stable Diffusion models and produces higher-quality outputs across a wide range of categories.

The SDXL model is significantly larger than previous Stable Diffusion versions, with over 9 billion parameters. This allows it to capture more details and nuances, leading to images that are more coherent, consistent, and detailed.

Size and Performance of SDXL

At over 9 billion parameters, SDXL is considerably larger than the previous leading model, Stable Diffusion 2.1. However, thanks to optimizations by the research team, SDXL can still generate images efficiently on consumer GPUs with 8GB of VRAM. According to tests by the Stability AI team, it can also be trained at full 1024x1024 resolution on an Nvidia RTX 2070 with 8GB of VRAM. Despite its size, SDXL has been designed to remain accessible without expensive high-end hardware.
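To put the parameter count in context, the memory needed just to hold the weights can be estimated with some simple arithmetic. The sketch below is a rough illustration, not Stability AI's own method: the function name and the bytes-per-parameter table are assumptions made for this example, and the estimate ignores activations and any auxiliary components.

```python
# Back-of-the-envelope estimate of the memory needed to hold model weights.
# Assumption: memory = parameter count x bytes per parameter. Activations,
# intermediate buffers, and auxiliary components are ignored.

# Hypothetical lookup table for common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(num_params: float, precision: str = "fp16") -> float:
    """Return the approximate weight footprint in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / 1024**3


if __name__ == "__main__":
    params = 9e9  # "over 9 billion", the figure quoted in this post
    for precision in BYTES_PER_PARAM:
        print(f"{precision}: {weight_memory_gib(params, precision):.1f} GiB")
```

Under these assumptions, 9 billion half-precision weights alone would need more than 8GB, so running on an 8GB card would have to rely on optimizations such as lower precision or moving parts of the model off the GPU. The post credits optimizations by the research team but does not say which ones are used.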

Release Timeline

The SDXL 0.95 weights were initially released to a group of test researchers in December 2022. The aim is to make the first full public release, SDXL 1.0, available by mid-2023. That release will include updated terms of service, integration with Amazon Bedrock, and guides and tools from prominent community members to help people make the most of the new model.

Training and Developing SDXL

The SDXL model builds on extensive research and testing by Stability's research team over the past year. Many different architectures and data mixes have been tried, with the goal of finding the best balance of quality and performance.

Stability now has over 80 researchers working on developing these models across multiple specialized teams. These include experts focused on video, 3D, and other areas to continue advancing what's possible with AI image generation.

Using SDXL for Image Generation

With SDXL, even minimal prompts can produce detailed, high-quality images that surpass what took significant effort with previous Stable Diffusion models.

It also opens up new possibilities such as easier animation, finer control through techniques like ControlNet, and simplified workflows for tasks like inpainting.
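As an illustration of the minimal-prompt workflow, here is a short sketch using the Hugging Face `diffusers` library. The video summary does not mention `diffusers`, so treat the library choice, the model ID, the helper names, and the default step count as assumptions for this example rather than Stability AI's recommended setup.

```python
from typing import Any, Dict

# Assumption: the publicly hosted SDXL 1.0 base checkpoint on Hugging Face.
MODEL_ID = "stabilityai/stable-diffusion-xl-base-1.0"


def build_generation_kwargs(prompt: str, steps: int = 30, size: int = 1024) -> Dict[str, Any]:
    """Collect pipeline arguments for a square image (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "height": size,
        "width": size,
    }


def generate(prompt: str, out_path: str = "sdxl_output.png") -> str:
    """Generate one image and save it; needs torch, diffusers, accelerate, and a GPU."""
    # Heavy imports are kept local so the helper above works without them.
    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # half precision to reduce VRAM use
        variant="fp16",
        use_safetensors=True,
    )
    # Keeps submodules on the CPU until they are needed, trading speed for VRAM.
    pipe.enable_model_cpu_offload()

    image = pipe(**build_generation_kwargs(prompt)).images[0]
    image.save(out_path)
    return out_path
```

Calling `generate("a lighthouse at dusk")` would download the weights on first use and write a PNG to the working directory. Whether it fits on a particular GPU depends on the card and the installed library versions.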

Future Plans for Stability AI's Image Models

The public release of SDXL 1.0 is only the beginning. Now that this new foundation model is established, Stability AI will refine it, optimize it for efficiency, and build specialized models for areas like photorealism, art, and video.

They also plan to integrate language models for an end-to-end system allowing natural language guidance of the image generation process.

Competing with Midjourney and Other AI Companies

Stability AI aims to collaborate with and empower developers rather than view other companies as strict competitors. They were early investors in Midjourney and believe there is room for multiple high-quality models with different strengths.

Their focus is on democratizing access through efficiency optimization and partnerships to run these models on consumer devices rather than expensive specialized hardware.

Conclusion

The release of Stable Diffusion XL represents a new milestone in AI image generation that raises the bar for quality and creative possibility.

While work remains to streamline and refine it, SDXL provides a new foundation for Stability AI and talented community developers to build upon with specialized creative models. The next year promises rapid progress in areas like video consistency, 3D integration, and real-time workflows.

FAQ

Q: How big is the new SDXL model?
A: SDXL has over 9 billion parameters, making it much larger and more capable than previous Stability AI models like SD 1.5.

Q: What hardware do I need to run SDXL?
A: You can run SDXL on a GPU with at least 8GB of VRAM, such as an RTX 2070 or 3060 Ti. More VRAM allows training at higher resolutions.

Q: When will SDXL be publicly released?
A: The team is targeting mid-July 2023 for the initial SDXL 1.0 public release, but it may release a 0.95 version earlier if needed.