* This blog post is a summary of this video.

Supercharge Image Generation with Latent Consistency Models and Stable Diffusion XL Turbo

Author: MSP Media Network
Time: 2024-03-22 23:35:00

Introduction to Latent Consistency Models and Stable Diffusion XL Turbo

In the past couple of weeks we've seen the introduction of latent consistency models (LCMs) and now Stable Diffusion XL Turbo, two new ways of supercharging the image diffusion game. Today I'm going to show you how you can use these with Stable Diffusion through Automatic1111 and Comfy UI, and some of the amazing things that are possible with these new models - mainly generating in real time.

What's up everybody! I am Phil Buck, your host for AI Roundup, a weekly digest of everything AI related. The past few stories I've done have all been pretty serious stuff - lawsuits and the heavier side of what's going on in the AI industry. But today I had to get back to one of my favorite things to do: getting my hands dirty with some use cases for the new tools that are out there.

What are Latent Consistency Models (LCMs)?

We're going to be talking about two different new tools. The first is Latent Consistency Models. An LCM is a model just like any other model you would use with Stable Diffusion, but people have also been able to train what's called a LoRA as an LCM. What's really cool about that is you don't have to take a whole new LCM model and train it to do what you want - you can keep using all of your same models and just slot in the LCM LoRA. It takes the render time down to one to four steps.
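
To make that "slot it in" idea concrete, here's a minimal sketch using the Hugging Face diffusers library rather than the UIs covered below. This is my own illustration, not something from the video; the checkpoint and LoRA IDs are public Hugging Face repos:

```python
import torch
from diffusers import StableDiffusionPipeline, LCMScheduler

# Start from any Stable Diffusion 1.5-style checkpoint you already use.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Swap in the LCM scheduler and slot the LCM LoRA on top of the base model.
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

# 1-4 steps instead of the usual 20-50, with CFG turned way down.
image = pipe(
    "a cozy cabin in the woods at golden hour",
    num_inference_steps=4,
    guidance_scale=1.5,
).images[0]
image.save("lcm_lora_test.png")
```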

Key Capabilities of Stable Diffusion XL Turbo

But now Stability AI itself has announced a new model called SDXL Turbo, which is a little confusing because the SDXL prefix would make you think it's the XL model, right? The original SDXL that came out a while back is trained on 1024x1024 images and can make bigger, higher resolution images. SDXL Turbo is not that - it is trained on 512x512 images like most of the original Stable Diffusion models. What really sets this model apart is the claim that it can render images in one step. If you go check out the Stability AI site, they'll tell you all about the technology and point you to Clipdrop to test it out. You can start typing and it will immediately show you pictures that change as you update the prompt, which is one of the most exciting aspects of this. Be careful, though: I used up my credits extremely fast just typing stuff in and playing around!
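
If you'd rather test the one-step claim from a script than burn Clipdrop credits, here's a hedged diffusers sketch. The model ID is Stability AI's public sdxl-turbo repo; guidance is set to 0.0 because the turbo models are distilled to run without classifier-free guidance:

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# A single sampling step and no CFG: that's the whole trick.
image = pipe(
    "a raccoon reading a newspaper in a diner, photo",
    num_inference_steps=1,
    guidance_scale=0.0,
).images[0]
image.save("sdxl_turbo_test.png")
```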

Using LCMs with Automatic1111 and Comfy UI

For those that love to tinker, you can use these models extremely easily. I'll put some links in the show notes, but you can just go download these LoRAs, put them in the Lora subfolder of your models folder, and then, just like always, add them into your prompt.

What you'll notice, if you don't change any settings, is that your images get burnt really fast. That's because these LoRAs are set up to finish in about four steps max. So when you add them in, you're going to want to change your CFG and your steps way down.

Adjusting Settings for Fast Rendering

I would probably start around four steps and see how it looks; I've seen some people go all the way up to eight. And make sure you crank that CFG way down. Normally I have mine set at about seven, but when rendering with this LoRA I keep it somewhere between one and two. It is crazy what it can do! The LCM LoRAs are fantastic - they've been keeping me up playing with this stuff into the wee hours of the night.
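
If you want to find your own sweet spot rather than trust my numbers, a small grid sweep makes the trade-off visible: fix the seed so only the settings change between images. A sketch of that idea, again my own diffusers illustration:

```python
import torch
from diffusers import StableDiffusionPipeline, LCMScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

prompt = "portrait photo of an astronaut, studio lighting"

# Same seed everywhere, so differences come only from steps and CFG.
for steps in (4, 6, 8):
    for cfg in (1.0, 1.5, 2.0):
        generator = torch.Generator("cuda").manual_seed(42)
        image = pipe(
            prompt, num_inference_steps=steps,
            guidance_scale=cfg, generator=generator,
        ).images[0]
        image.save(f"lcm_steps{steps}_cfg{cfg}.png")
```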

SD Turbo: One-Step Rendering with Stable Diffusion 2.1

Just this week, even as late as Thursday (I'm recording this on Friday), Stability AI announced SD Turbo. The model based on the 2.1 version of Stable Diffusion now also has a turbo version, and it's really exciting as well. They claim it can render in a single step. Mileage may vary, and results may or may not meet your expectations.

Adding Turbo Models and LCM Samplers

When you play with it on Clipdrop you'll see exactly what I mean - lots of weird stuff with faces and eyes, the typical problems. But if you want to add these in, literally just go download the models, drop them in your models folder, and you'll be able to select them from your model directory once you refresh.

One thing worth noting: at the time of this recording, Automatic1111 does not have LCM built in as a sampler, but Comfy UI does let you use an LCM sampler. So Comfy UI opens up even more options for you.

Webcam Interpretation with Comfy UI

So now that you get the idea of how easy it is to get these into your workflow, let's talk about some of the crazy things people are doing with them. The first person I saw doing anything with these new models was the channel EnigmaticE. The guy's always got amazing tutorials - he's always used Stable Diffusion with a VFX frame of mind, so you see lots of cool stuff like taking TikTok videos and reinterpreting them as video game characters. About a week ago he did a tutorial on how to use Comfy UI with a webcam so it interprets the images from your webcam in "real time." I'm putting that in air quotes because if you watch the video, it's not 30 frames a second - it's maybe one frame every one or two seconds. So it's a little janky, not truly real time, but it's amazing that you can not only input a webcam and watch it interpret you, but also change the prompt as you're doing this and watch it update as you go, which brings me to another one...
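
His actual setup is a Comfy UI graph, but to give a feel for what's happening under the hood, here's a rough Python approximation of the same loop: grab webcam frames with OpenCV and push each one through an LCM-accelerated img2img pipeline. Treat every detail here as my own sketch rather than his workflow:

```python
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import AutoPipelineForImage2Image, LCMScheduler

pipe = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

prompt = "a cel-shaded video game character"
cap = cv2.VideoCapture(0)  # default webcam

while True:
    ok, frame = cap.read()
    if not ok:
        break
    # OpenCV gives BGR arrays; convert to an RGB PIL image at the model's native size.
    src = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)).resize((512, 512))
    out = pipe(
        prompt, image=src, num_inference_steps=4,
        strength=0.5, guidance_scale=1.0,
    ).images[0]
    cv2.imshow("LCM webcam", cv2.cvtColor(np.array(out), cv2.COLOR_RGB2BGR))
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
        break

cap.release()
cv2.destroyAllWindows()
```

Even on a fast GPU you'll land around that one-frame-every-second-or-two mark he mentions; the point is the interactivity, not the frame rate.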

Prompt Iteration with Multiple Renderers

Another creator whose Comfy UI techniques I really enjoyed is Scott Detweiler. He laid out four different renderers in Comfy UI, each with a different seed, and compared them as he typed the prompt - essentially doing what Clipdrop does, but on his own computer at home. He could see how the prompt slowly affected the rendered images; every word he typed changed them. It's pretty amazing!
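
You could approximate that four-renderer layout in plain Python, too: batch the same prompt across four fixed seeds and rerender whenever the prompt changes. Again a hedged sketch of the idea, not Scott's actual graph:

```python
import torch
from diffusers import StableDiffusionPipeline, LCMScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

SEEDS = [7, 42, 123, 2024]  # one fixed seed per "renderer"

def render(prompt: str):
    # Rerun this on every prompt edit to watch all four images shift together.
    generators = [torch.Generator("cuda").manual_seed(s) for s in SEEDS]
    return pipe(
        [prompt] * len(SEEDS), num_inference_steps=4,
        guidance_scale=1.5, generator=generators,
    ).images

for seed, img in zip(SEEDS, render("a knight in a neon-lit alley")):
    img.save(f"seed_{seed}.png")
```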

Drawing and Updating in Real-Time

And then finally, this is a really cool one from Theoretically Media, using another service called Krea. He has an open document where he's drawing, and it updates in real time. It's really cool to see how even just filling the entire canvas with a color starts generating things based on the prompt, and then you can outline the kinds of figures or structures you want to see. This is just mind blowing to me!

Practical Applications and Use Cases

I think these tools are really going to open things up to a whole new group of users who maybe haven't tried this stuff yet, and a whole new set of applications.

Sound off in the Discord if you're trying any of these out - let me know which ones you're liking best and which applications you think are practical to put into your workflow.

With that being said, that is our AI Roundup for today. If you enjoy what we're doing here on the show, please help me out by liking this video, go ahead and drop a comment, and of course sub to the channel. But also be sure to follow us on social media - you can find us at MSP Media TV everywhere.

If you'd like to reach out, use our email [email protected] or you can call us and leave a voicemail at 833-MSP-Network. All right everybody, I am Phil Buck and this has been your December 4th episode of AI Roundup. I hope you have a happy Monday and I'll see you next time! This has been a broadcast of the MSP Media Network.

Conclusion and Next Steps

In conclusion, latent consistency models and Stable Diffusion XL Turbo offer exciting new capabilities for real-time and interactive image generation. By adjusting settings in Automatic1111 or Comfy UI, you can leverage these models to interpret webcam feeds, iterate prompts across multiple renderers, draw images that update in real time, and more. This opens the door to practical use cases like VFX workflows, rapid prototyping of concepts, and beyond. Stay tuned for more developments in this quickly evolving space!

FAQ

Q: What are the key benefits of LCMs?
A: LCMs dramatically reduce image generation time to just 1-4 steps while retaining quality and coherence.

Q: How do I use LCMs with my existing SD workflow?
A: You can simply download an LCM LoRA and add it to the Lora subfolder of your models folder. Then reference it in your prompts and turn your steps and CFG way down for fast rendering.

Q: What can I create with Stable Diffusion XL Turbo?
A: You can generate high quality 512x512 images in just a single step with SDXL Turbo, enabling real-time iteration and concept development.

Q: What real-time applications are enabled by LCMs?
A: You can connect a webcam for live interpretation, instantly see prompt changes with multiple renderers, and draw images that update in real-time.

Q: Where can I learn more about using LCMs and SDXL Turbo?
A: Check out blogs and videos from channels like EnigmaticE, Scott Detweiler, and Theoretically Media to see use cases and get implementation tips.