Playground AI Beginner Guide to Image to Image & Inpainting in Stable Diffusion

Monzon Media
7 Jan 2023 11:18

TLDR: In this video, the host demonstrates how to use the 'Image to Image' feature in Playground AI, specifically focusing on the Stable Diffusion 1.5 model. They start by creating a composition with a prompt for a raccoon wearing a suit and top hat, and then use various techniques to refine the image, such as adjusting image strength and using negative prompts. The host also explores inpainting to add or correct details in the image, and shows how to use a reference image to create a new character with a unique look. Additionally, they create a landscape from scratch using a simple sketch and inpainting to guide the AI in generating a detailed scene. Throughout the video, different sampler methods are tested to achieve a range of artistic results, illustrating the flexibility and creativity possible with the 'Image to Image' feature.


Takeaways

  • 📸 **Image to Image Composition**: Use a simple prompt to generate an image with good composition, which can then be used for further image to image processing.
  • 🎨 **Anthropomorphic Prompts**: Adding 'anthropomorphic' to your prompt helps generate animal figures with human-like characteristics.
  • 🔍 **Image Strength**: Adjusting the image strength allows for control over how much the original image's characteristics are retained in the output.
  • 👌 **Fine-tuning Details**: By manipulating the image strength, you can fine-tune the level of detail and deviation from the original image.
  • 🎭 **Inpainting for Detailing**: The inpainting feature is useful for adding or correcting details in an image, such as enhancing the appearance of a hat.
  • ✍️ **Masking for Specificity**: Create a mask to specify which parts of the image you want to change, ensuring that only those areas are affected.
  • 🖌️ **Sketching for Scenery**: You can sketch a basic scene to guide the AI in generating a more detailed landscape.
  • 🌄 **Landscape Creation**: Start with simple shapes and colors when sketching a landscape, and let the AI fill in the details.
  • 🎨 **Filter Effects**: Using different filters, like the Playtoon filter, can give your generated images a specific aesthetic, such as a Pixar look.
  • 🔄 **Sampler Methods**: Experimenting with different sampler methods can yield varied results and help achieve the desired image outcome.
  • 🔍 **Prompt Guidance**: Increasing prompt guidance helps the AI more closely follow the instructions given, resulting in a more accurate output.
  • 🔧 **Iterative Adjustments**: Making iterative adjustments to the prompts, image strength, and other settings allows for gradual refinement of the generated image.

Q & A

  • What is the first method of using image to image in Playground AI as described in the transcript?

    -The first method of using image to image in Playground AI is for composition. The user enters a prompt, such as 'cute and adorable raccoon wearing a suit and Top Hat,' adds negative prompts in the 'remove from image' field, and generates images to pick one with a good composition.

  • What is the purpose of adding the word 'anthropomorphic' to the prompt?

    -The word 'anthropomorphic' is added to the prompt to help the AI generate images where animals have a human-like figure, which is useful when the user wants the animal to have a more human-like appearance.

  • What are the dimensions used for the image in the example?

    -The dimensions used for the image in the example are 512 by 768.

  • Why does the user lower quality and details to 35?

    -The user lowers quality and details to 35 because a high value is not needed at this stage. The goal is only a composition to use for image to image, so excessive detail is unnecessary.

  • What is the role of the 'image strength' slider in the image to image process?

    -The 'image strength' slider determines how much the generated image will deviate from the original image. A lower number results in a more random and less similar image, while a higher number retains more of the original image's details and characteristics.

  • How does the inpainting feature in Playground AI work?

    -Inpainting in Playground AI is used for adding details or correcting certain aspects of the image. The user creates a mask around the area they want to modify, and the AI focuses on changing only the areas within the mask based on the new prompt provided.

  • What is the purpose of using a reference image in the image to image process?

    -A reference image is used to create a new character or object that is influenced by the style and composition of the reference but is not a direct copy. This allows for the creation of original content inspired by existing images.

  • What is the significance of the 'Playtoon' filter in the context of the script?

    -The 'Playtoon' filter is used to achieve a specific visual style, described as a 'Pixar look' in the script. It's a creative tool that helps to stylize the generated images.

  • How does the user create a landscape from scratch in the script?

    -The user creates a landscape from scratch by using a simple drawing tool within the inpaint feature to sketch the elements of the landscape, such as the sky, mountains, and grass. The AI then uses this sketch to generate a more detailed and structured image.

  • What is the purpose of adjusting the 'prompt guidance' setting?

    -Adjusting the 'prompt guidance' setting allows the user to control how closely the AI follows the provided prompt. A higher setting makes the AI adhere more closely to the prompt, while a lower setting allows for more creative freedom.

  • Why does the user change the sampler method during the process?

    -Changing the sampler method can yield different results in the generated images. Different samplers may produce images with varying levels of detail, structure, and adherence to the prompt, allowing the user to find the best fit for their desired outcome.

  • How does the user ensure that the final image adheres closely to their vision?

    -The user ensures the final image adheres to their vision by iteratively refining the process. This includes adjusting the prompts, using inpaint to correct or add details, trying different sampler methods, and gradually building on the generated images to achieve the desired result.
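
The settings mentioned in the answers above can be collected in one place, which makes iterative adjustment easier to track. The following is a minimal sketch, not Playground AI's API: `GenerationSettings` is a hypothetical container, the negative prompt and prompt guidance values are placeholders the video does not specify, and treating 'quality and details' as a step count and image strength as a 0 to 100 value are assumptions. The divisible-by-8 check reflects how Stable Diffusion 1.x works on latents that are 8 times smaller per side.

```python
from dataclasses import dataclass
from typing import Tuple

VAE_DOWNSCALE = 8  # Stable Diffusion 1.x latents are 8x smaller per side


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container for the settings discussed in the Q & A."""
    prompt: str
    negative_prompt: str
    width: int
    height: int
    steps: int            # assumed equivalent of the quality and details value
    sampler: str
    image_strength: int   # assumed 0-100; higher keeps more of the source image
    prompt_guidance: float

    def validate(self) -> None:
        if self.width % VAE_DOWNSCALE or self.height % VAE_DOWNSCALE:
            raise ValueError("width and height must be multiples of 8")
        if not 0 <= self.image_strength <= 100:
            raise ValueError("image_strength must be between 0 and 100")
        if self.steps < 1:
            raise ValueError("steps must be positive")

    def latent_size(self) -> Tuple[int, int]:
        """Size of the latent grid the model actually denoises."""
        return self.width // VAE_DOWNSCALE, self.height // VAE_DOWNSCALE


settings = GenerationSettings(
    prompt="cute and adorable anthropomorphic raccoon wearing a suit and top hat",
    negative_prompt="blurry, deformed",  # placeholder, not the video's list
    width=512,
    height=768,
    steps=35,
    sampler="Euler Ancestral",
    image_strength=70,
    prompt_guidance=7.0,  # placeholder value
)
settings.validate()
print(settings.latent_size())  # (64, 96)
```

Keeping the values in a frozen dataclass means each iteration is a new, comparable record rather than a silent change to a shared object.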



Outlines

🎨 Image-to-Image Composition and In-Painting Techniques

This paragraph introduces the concept of using image-to-image techniques for creative composition in an AI platform called Playground. The speaker demonstrates how to generate an image of a 'cute and adorable raccoon wearing a suit and top hat' with anthropomorphic characteristics. They discuss the use of negative prompts, image dimensions, and the importance of image strength in determining the level of creativity versus adherence to the original image. The process also involves using in-painting to add or correct details within an image, such as enhancing the top hat's appearance with more intricate details.


🌃 Creating a Superhero Character and Scenery with Image-to-Image

The second paragraph focuses on creating a unique superhero character and a detailed background using image-to-image and in-painting methods. The speaker starts by generating a female superhero character with a 'Pixar look' and then moves on to creating a landscape with elements like mountains, waterfalls, and trees. They use a simple drawing tool to sketch the initial composition, which is then enhanced by the AI to create a more detailed and realistic image. The process involves playing with different settings such as prompt guidance, image strength, and using various filters to achieve the desired level of detail and artistic style.


🖼️ Refining and Finalizing the Image-to-Image Composition

In the final paragraph, the speaker discusses refining the generated image further by using different sampler methods and adjusting the prompts and image strength. They emphasize the ability to transform a simple initial image into a highly detailed and almost photo-realistic composition through a combination of image-to-image generation, in-painting, and careful tweaking of the AI's settings. The speaker concludes by expressing satisfaction with the results and promising to delve deeper into these techniques in future videos.



Keywords

💡Image to Image

Image to Image is a technique used in AI-generated art where an existing image is used as a starting point to create a new image. In the video, this technique is used to modify and enhance the composition of the generated images, such as changing the details of a raccoon wearing a suit and a top hat, or creating a landscape with mountains, waterfalls, and trees. It's a core concept in the video as it demonstrates how to evolve and refine AI-generated images based on user input and preferences.


💡Inpainting

Inpainting is a process in AI art generation where specific parts of an image are manually selected and redesigned to add details or correct elements. In the video, the author uses inpainting to give the raccoon's hat a more ornate look by masking out the hat and filling it with a new design. This technique is crucial for fine-tuning the details of the generated images to match the creator's vision.
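
The idea that a mask limits changes to one region can be shown with a toy example. This is only a sketch of the masking principle on tiny grayscale grids: real inpainting regenerates the masked area with the diffusion model, and how Playground AI implements it internally is not covered in the video. The function names `rectangle_mask` and `composite` are hypothetical.

```python
def rectangle_mask(width, height, left, top, right, bottom):
    """Return a height x width grid with 1 inside the rectangle and 0 outside."""
    return [
        [1 if left <= x < right and top <= y < bottom else 0 for x in range(width)]
        for y in range(height)
    ]


def composite(original, generated, mask):
    """Take generated pixels where the mask is 1 and original pixels elsewhere."""
    return [
        [g if m else o for o, g, m in zip(orig_row, gen_row, mask_row)]
        for orig_row, gen_row, mask_row in zip(original, generated, mask)
    ]


# Toy 4x4 grayscale images: the original is all 10s, the new content is all 99s.
original = [[10] * 4 for _ in range(4)]
generated = [[99] * 4 for _ in range(4)]

# Mask the top-middle 2x2 block, standing in for the area around the hat.
hat_mask = rectangle_mask(4, 4, left=1, top=0, right=3, bottom=2)
result = composite(original, generated, hat_mask)
for row in result:
    print(row)
```

Only the four masked cells take the new value; every pixel outside the mask is carried over unchanged.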

💡Stable Diffusion

Stable Diffusion is a text-to-image diffusion model; the video uses version 1.5. It is the underlying technology that enables the creation of detailed, high-quality images from textual prompts. The author uses Stable Diffusion to generate the initial images and to refine them through image to image and inpainting techniques.


💡Anthropomorphic

Anthropomorphic refers to attributing human characteristics or form to non-human entities, such as animals. In the video, the term is used in the prompt to guide the AI to generate a raccoon with a human-like figure, which is an essential aspect of creating the desired composition.

💡Euler Ancestral

Euler Ancestral is one of the samplers available for AI image generation. A sampler controls how the image is denoised step by step from noise toward the final result, and the ancestral variant adds fresh random noise at each step, so its outputs tend to vary more. The author uses this sampler in the video to generate images with a specific style and level of detail, contributing to the overall look and feel of the final artwork.
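
To illustrate what makes a sampler 'ancestral', the sketch below follows the step-splitting formula used in open-source k-diffusion-style samplers: each move between noise levels is divided into a deterministic part and a portion of fresh noise that is added back. Whether Playground AI's Euler Ancestral option uses exactly this formula is an assumption, and the noise levels shown are arbitrary illustration values.

```python
import math


def ancestral_step(sigma_from, sigma_to, eta=1.0):
    """Split a step between noise levels into (sigma_down, sigma_up).

    sigma_down is the level reached deterministically; sigma_up is the amount
    of fresh noise added afterwards. With eta=0 no noise is added, which
    reduces to a plain (non-ancestral) Euler step.
    """
    sigma_up = min(
        sigma_to,
        eta * math.sqrt(sigma_to**2 * (sigma_from**2 - sigma_to**2) / sigma_from**2),
    )
    sigma_down = math.sqrt(sigma_to**2 - sigma_up**2)
    return sigma_down, sigma_up


sigma_down, sigma_up = ancestral_step(10.0, 5.0)
print(sigma_down, sigma_up)

# The two parts always recombine to the target noise level.
print(math.isclose(sigma_down**2 + sigma_up**2, 5.0**2))

# eta=0 turns the ancestral step back into a deterministic one.
print(ancestral_step(10.0, 5.0, eta=0.0))
```

The re-injected noise is one reason an ancestral sampler can give noticeably different images when the step count changes, while a deterministic sampler tends to converge.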

💡Image Strength

Image strength is a parameter in AI image generation that controls the degree to which the generated image will resemble the original image used as a reference. A lower image strength results in a more random and less constrained image, while a higher value preserves more details from the original. In the video, the author adjusts image strength to balance creativity with the retention of certain characteristics from the reference image.
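
One way to build intuition for this setting: in open-source Stable Diffusion image-to-image pipelines (for example, the `strength` argument in Hugging Face diffusers), the source image is partially noised and only part of the step schedule is run on it. The sketch below assumes Playground AI's image strength is the inverse of that value on a 0 to 100 scale, which the video does not confirm; `steps_actually_run` is a hypothetical helper.

```python
def steps_actually_run(total_steps: int, image_strength: int) -> int:
    """Estimate how many denoising steps are applied to the source image.

    Assumptions: the slider runs 0-100 and is the inverse of the `strength`
    argument in open-source img2img pipelines, where only the last
    `strength` fraction of the schedule is run on the noised source.
    """
    if not 0 <= image_strength <= 100:
        raise ValueError("image_strength must be between 0 and 100")
    change_percent = 100 - image_strength       # how much freedom the model gets
    return total_steps * change_percent // 100  # integer math avoids float rounding


for image_strength in (30, 50, 70):
    print(image_strength, steps_actually_run(35, image_strength))
```

Under these assumptions, a higher image strength leaves fewer denoising steps, so the model has less opportunity to move away from the source image.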


💡Prompt

A prompt is a textual description or command that guides the AI in generating an image. It includes elements like the subject, desired style, and any specific details the creator wants to be included in the image. In the video, the author uses prompts to instruct the AI on generating images of a raccoon wearing a suit and a top hat, as well as a detailed landscape with mountains and waterfalls.

💡Negative Prompts

Negative prompts are terms or descriptions that the user wants the AI to avoid or exclude when generating an image. They are used to refine the output by specifying what should not be included. In the video, negative prompts are employed to ensure that the generated images meet the desired composition and style without unwanted elements.
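
In open-source Stable Diffusion implementations, negative prompts and the prompt guidance value work together through classifier-free guidance: the model makes one prediction for the prompt and one for the negative prompt, then pushes the result away from the negative one. The sketch below shows that arithmetic on toy numbers; the three-value lists stand in for real latent tensors, and it is an assumption that Playground AI applies the same formula.

```python
def guided_prediction(negative_pred, prompt_pred, guidance_scale):
    """Classifier-free guidance: start from the negative-prompt prediction and
    move toward the prompt prediction, scaled by guidance_scale."""
    return [n + guidance_scale * (p - n) for n, p in zip(negative_pred, prompt_pred)]


# Toy "noise predictions"; real ones are tensors produced by the model.
prompt_pred = [1.0, 2.0, 3.0]
negative_pred = [0.5, 2.0, 4.0]

for scale in (1.0, 7.0):
    print(scale, guided_prediction(negative_pred, prompt_pred, scale))
```

At a scale of 1.0 the result equals the prompt prediction; larger values exaggerate the difference between the two predictions, which matches the description of higher prompt guidance following the prompt more closely.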


💡Composition

Composition refers to the arrangement of visual elements within an image. It is a fundamental aspect of art and design that determines how the elements are organized to create a cohesive and aesthetically pleasing image. In the video, the author discusses the importance of selecting an image with a good composition as a starting point for further enhancements using image to image and inpainting techniques.

💡Playtoon Filter

The Playtoon filter is a style preset in Playground AI that gives generated images a particular look. In the video, the author uses the Playtoon filter to achieve a 'Pixar look' for the generated superhero character.


💡Sampler

A sampler in the context of AI image generation is an algorithm that determines the process by which the AI creates the final image based on the input prompt and other parameters. Different samplers can produce different results in terms of style and detail. In the video, the author experiments with different samplers, such as Euler Ancestral and DPM, to achieve the desired effects in the generated images.


Highlights

Explores the use of image-to-image in Playground AI for composition enhancement.

Demonstrates creating a raccoon wearing a suit and top hat with anthropomorphic characteristics.

Introduces the concept of negative prompts to refine image generation.

Details the use of Stable Diffusion 1.5 for generating images.

Adjusts quality and details to 35 for the initial image composition.

Explains the use of the Euler Ancestral sampler for generating images.

Discusses the importance of image strength in determining the randomness of the generated image.

Illustrates the process of using image-to-image for creating a composition with a lower image strength.

Shows how increasing image strength to 70 retains more details from the original image.

Introduces the inpaint feature for adding details or correcting images.

Shows how to use the inpaint mask to focus on specific areas of the image, such as enhancing a hat.

Demonstrates creating a superhero character using a reference image and the Playtoon filter for a Pixar look.

Explains the process of generating a detailed comic art image with negative prompts and randomization.

Shows how to create a landscape from scratch using inpaint and a simple sketch.

Details the use of the Warm Box filter and image strength to refine the generated landscape image.

Discusses the iterative process of using image-to-image and inpaint to achieve a desired result.

Provides insights on how to achieve a photorealistic look from a simple sketch through a combination of techniques.

Concludes with a demonstration of the final image, showcasing the power of image-to-image and inpaint features in Playground AI.