Stable Diffusion Fix Hands Without ControlNet and Inpainting (Easy) | SDXL FREE! (Automatic1111)
TLDR
In this video, the presenter demonstrates a simple way to generate realistic hands with the Real V SDXL model without resorting to more involved techniques such as ControlNet or inpainting. The process uses two models: one for the initial generation and another for upscaling and enhancement. The video gives detailed settings for both, including the Midjourney Mimic strength, sampling steps, and CFG scale. The presenter also suggests negative prompts to avoid common problems such as blurry images and incorrect hand anatomy. The result is a more aesthetic and realistic image that suits most use cases short of professional-level detail.
Takeaways
- 🎨 **Using SDXL Model**: The Real V SDXL model is capable of generating decent hands but may not feel as realistic as desired.
- 🚫 **No ControlNet or Inpainting**: The process requires neither ControlNet nor inpainting.
- 👌 **Simple Hand Creation**: The method is designed for simple hand creation, not for complex hand poses.
- 📈 **Upscaling Process**: The generated image is upscaled in a second pass, and ADetailer is applied during that stage.
- 🌟 **Midjourney Mimic**: Midjourney Mimic is used at a strength of 0.5 to give an aesthetic feel without being too strong.
- 🔍 **Negative Prompting**: Basic negative prompting is used to avoid NSFW content, blurriness, and bad hands.
- 🔢 **Sampling Steps and Settings**: The initial model uses 50 sampling steps with the DPM++ 3M SDE Exponential sampler.
- 🔧 **Batch Count and CFG Scale**: A batch count of two and a CFG scale between 6 and 7 are found to be effective settings.
- 🔄 **Image to Image Transfer**: The generated image is then sent to img2img for further enhancement with a second model.
- 🌈 **DreamShaper Turbo Model**: The DreamShaper Turbo model is used for the second stage, with settings adjusted for better realism.
- ✅ **Realistic Results**: The final images are more realistic, with properly formed hands and fewer imperfections.
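The prompt-side takeaways above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's actual prompt: it assumes Midjourney Mimic is loaded as a LoRA (the summary only calls it a "setting"), the file name `midjourney_mimic` is hypothetical, and the wording of the positive and negative prompts is a guess based on the points listed. The `<lora:name:weight>` tag is the prompt syntax Automatic1111 uses for LoRA weights.

```python
def build_prompts(subject: str,
                  lora_name: str = "midjourney_mimic",  # hypothetical file name
                  lora_weight: float = 0.5):
    """Return (prompt, negative_prompt) for the first generation pass."""
    if lora_weight < 0:
        raise ValueError("lora_weight must be non-negative")
    # A full-body framing keeps the hands in the picture instead of a close-up.
    prompt = f"full body photo of {subject}, <lora:{lora_name}:{lora_weight}>"
    # Basic negative prompt covering the three issues named in the summary.
    negative_prompt = "nsfw, blurry, bad hands"
    return prompt, negative_prompt


if __name__ == "__main__":
    positive, negative = build_prompts("a woman waving at the camera")
    print(positive)
    print(negative)
```

Keeping the weight as a parameter makes it easy to try values around 0.5 if the aesthetic effect turns out too strong or too weak.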
Q & A
What is the main focus of the video?
-The main focus of the video is to demonstrate a method for generating realistic hands in images using the Stable Diffusion model without relying on ControlNet or inpainting techniques.
Which model is recommended for generating decent hands?
-The Real V SDXL model is recommended for generating decent hands, as mentioned in the video.
What is the purpose of using two models in the process?
-The purpose of using two models is to first generate an image with decent hand poses and then refine the image to make it more realistic by upscaling and denoising it with a different model.
What is the significance of the 'Midjourney Mimic' setting?
-The 'Midjourney Mimic' setting gives the image an aesthetic feel. Its strength is set to 0.5 so that the effect is not too strong.
What are the steps used for sampling in the first model?
-The video mentions using 50 sampling steps with the DPM++ 3M SDE Exponential sampler for the first model.
What is the role of the 'clip skip' setting?
-The 'clip skip' setting is used to improve the details in the generated image, with a value of two being personally preferred by the presenter.
How does the presenter suggest improving the realism of the generated hands?
-The presenter suggests using the 'DreamShaper Turbo' model with the SDE Karras sampler, an increased scale, and a batch count of four to improve the realism of the generated hands.
What is the recommended CFG scale for the turbo model?
-The recommended CFG scale for the turbo model is one, as it is found to be good for the desired outcome.
What additional feature is enabled to enhance the image?
-ADetailer is enabled to further enhance the image, and the 'FreeU Integrated' and 'Self-Attention Guidance Integrated' settings are also enabled for better results.
What is the presenter's opinion on the final outcome of the process?
-The presenter believes that the final outcome is quite realistic and should cover almost all use cases, especially for non-professional use.
Is there a suggestion for further improvement of the generated images?
-The presenter suggests that for further improvement, one could try inpainting on the generated images.
What is the presenter's final verdict on the process?
-The presenter concludes that the process is fairly simple and effective for generating images with proper hands, suitable for a wide range of applications.
Outlines
🎨 Creating Realistic Hands in Art with SDXL Model
The first section introduces the topic: generating realistic, non-disfigured hands with a simple process that needs neither ControlNet nor inpainting. The speaker uses the Real V SDXL model, which is known for producing decent hands but may not feel realistic enough. To enhance realism, the process employs two models, the first for the initial generation and the second for upscaling and detail enhancement. The speaker also stresses prompting for a full-body view so that the hands are captured in the desired poses. The settings for the initial model are 50 sampling steps, the DPM++ 3M SDE Exponential sampler, and a CFG scale of 6 to 7. The process also enables Self-Attention Guidance, which is integrated into the Forge web UI and so needs no additional extension. The section closes with a note on setting CLIP skip for slightly better detail, although this does not significantly affect how the hands look.
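For readers who drive the web UI through its HTTP API rather than the browser, the first-pass settings described here could be expressed as a `txt2img` request body. This is only a sketch under assumptions: the field names follow the Automatic1111 `/sdapi/v1/txt2img` endpoint (available when the UI is launched with `--api`), the 1024x1024 resolution is not stated in the video, and the exact sampler string can differ between web UI versions.

```python
import json


def first_pass_payload(prompt: str, negative_prompt: str,
                       cfg_scale: float = 6.5) -> dict:
    """Build a txt2img request body with the first-pass settings."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": 50,                                 # 50 sampling steps
        "sampler_name": "DPM++ 3M SDE Exponential",  # name may vary by version
        "cfg_scale": cfg_scale,                      # video suggests 6 to 7
        "n_iter": 2,                                 # "batch count" in the UI
        "width": 1024,                               # assumption, not from the video
        "height": 1024,                              # assumption, not from the video
        # CLIP skip of 2, applied as a per-request settings override.
        "override_settings": {"CLIP_stop_at_last_layers": 2},
    }


if __name__ == "__main__":
    body = first_pass_payload("full body photo of a woman waving",
                              "nsfw, blurry, bad hands")
    print(json.dumps(body, indent=2))
```

The Self-Attention Guidance toggle is left out because its API representation depends on the Forge build; in the browser it is a checkbox.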
🚀 Enhancing Image Realism with Turbo Model Upscaling
The second section covers improving the realism of the generated image by upscaling it with a turbo model. The speaker first reviews the initial result: the hands have the correct number of fingers, but they look plasticky and lack realism. The fix is to send the image to img2img, adjust the denoising strength, and let the turbo model refine the image. In the speaker's experience, turbo models tend to produce more realistic results. The speaker adds that inpainting could refine the image further for professional purposes, but the described method should suffice for most use cases. The speaker closes by hoping viewers found the process helpful and thanking them for watching.
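The refinement pass can be sketched the same way as an `img2img` request body. Again this is an illustration under assumptions: the field names follow the Automatic1111 `/sdapi/v1/img2img` endpoint, the checkpoint name `dreamshaperXL_turbo` is hypothetical, and the denoising strength and step count are placeholder values because the summary does not state them. The CFG scale of one and the batch count of four come from the video.

```python
import base64


def second_pass_payload(image_bytes: bytes, prompt: str, negative_prompt: str,
                        denoising_strength: float = 0.4,  # placeholder value
                        steps: int = 8) -> dict:          # placeholder value
    """Build an img2img request body for the turbo-model refinement pass."""
    if not 0.0 < denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be in (0, 1]")
    return {
        # The API expects the source image as a base64-encoded string.
        "init_images": [base64.b64encode(image_bytes).decode("ascii")],
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "denoising_strength": denoising_strength,
        "steps": steps,
        "cfg_scale": 1,  # CFG scale of one for the turbo model
        "n_iter": 4,     # batch count of four
        # Hypothetical checkpoint name; use the file name installed locally.
        "override_settings": {"sd_model_checkpoint": "dreamshaperXL_turbo"},
    }


if __name__ == "__main__":
    body = second_pass_payload(b"not a real image", "full body photo", "blurry")
    print(sorted(body.keys()))
```

A lower denoising strength keeps the hands from the first pass closer to intact, while a higher one gives the turbo model more room to change skin texture, so it is the main value to experiment with. ADetailer is omitted here and would be enabled separately.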
Keywords
💡Stable Diffusion
💡Hands
💡ControlNet
💡Inpainting
💡Real V SDXL Model
💡Midjourney Mimic
💡Negative Prompting
💡Sampling Steps
💡CFG Scale
💡Image to Image
💡Turbo Model
💡Denoising
Highlights
The video demonstrates a method to generate proper, non-disfigured hands using Stable Diffusion without more involved tools such as ControlNet or inpainting.
The process is simple and suitable for creating normal poses with decent hands, avoiding extra fingers or other anomalies.
The Real V SDXL model is used for its ability to generate decent hands, but with some lack of realism.
Two models are utilized in the process to enhance the realism of the generated hands.
The importance of a full-body prompt is emphasized to ensure hands are visible and not just a close-up shot.
Midjourney Mimic is used at a strength of 0.5 for an aesthetic feel without an overly strong effect.
Negative prompting includes avoiding NSFW content, blurriness, and bad hands.
The video does not use a detailer initially as upscaling with a different model is planned.
50 sampling steps and the DPM++ 3M SDE Exponential sampler are used with a batch count of two and a CFG scale between 6 and 7.
Self-Attention Guidance, which is integrated into the Forge web UI, is enabled for better results.
Clip skip is set to two for slightly better detail, though it does not affect the hands significantly.
The generated image may require rerunning to achieve the desired hand appearance.
The DreamShaper Turbo model is used for upscaling, with settings adjusted for optimal image quality.
The CFG scale is set to one for the turbo model, which is found to be sufficient for realistic results.
ADetailer is enabled for additional adjustments, and Self-Attention Guidance stays enabled for quality.
The turbo model helps in making the skin appear more realistic and less plasticky.
The final images show a significant improvement in hand realism compared to the initial generation.
Inpainting can be used for further improvement, but the method already covers most non-professional use cases.