Stable diffusion prompt tutorial. NEW PROMPT BOOK released!

Sebastian Kamph
2 Nov 2022 · 30:07

TLDR: The video is an in-depth tutorial on crafting prompts that get Stable Diffusion models to generate the images you want. It introduces the OpenArt prompt book, a resource that offers tips and tricks for creating effective prompts. The host discusses 'prompt engineering', which starts with asking specific questions to pin down the desired image: subject, lighting, environment, and perspective. The tutorial covers modifiers that alter style or perspective, the impact of word order in prompts, and the role of specific details such as art styles or camera lenses. It also explores using artist names and art movements to influence the generated images, and mixing different styles for unique results. The video concludes with practical advice on prompt optimization, including 'magic words' that push the model toward more detailed output and the use of fixed seeds for consistent results. The host emphasizes that getting the right prompt takes experimentation and iteration.


  • 📚 OpenArt has released a new 'Prompt Book', a guide to crafting effective prompts for image generation.
  • 🤔 Start by asking questions to determine the type of image you want, such as whether it's a photo or painting, the subject, special details, lighting, environment, and point of view.
  • 🎨 Include specific art styles or references in your prompts to guide the AI towards the desired output, like '3D render' or 'Studio Ghibli movie poster'.
  • 📷 Understand the importance of the order of text in your prompts, as it can affect the weight given to different elements by the AI.
  • 🖼️ Modifiers like 'cinematic lighting' or 'bokeh' can change the style, format, or perspective of the generated image.
  • 🔍 Be specific with your choice of lenses and camera types if you have knowledge in that area, as it can influence the outcome.
  • 🌈 Pay attention to color schemes and lighting when engineering your prompts, as they can greatly affect the mood and quality of the image.
  • 👩‍🎨 Including artists in your prompts can significantly influence the style of the generated image, so research the artist's style before including them.
  • 🌞 The time of day can be an important aspect of your prompt, especially for landscape images, to set the right atmosphere.
  • 🎭 Emotions can set the tone for a scene, so consider including emotional descriptors in your prompts.
  • 🧙 Use 'magic words' like 'HDR', 'Ultra HD', and '64k' to push the model toward sharper, more detailed-looking images. These words steer style only; the actual pixel dimensions come from the resolution setting.
  • 🛠️ Be aware of the parameters you can adjust, such as resolution, CFG (classifier-free guidance), and step counts, to fine-tune the AI's output.
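
The parameters in the last takeaway can be kept together in one small settings object. The sketch below is illustrative only: `GenerationSettings`, its field names, and its default values are hypothetical and not taken from the video or from any particular tool. The multiple-of-8 check reflects a constraint that Stable Diffusion v1 implementations commonly impose; treat it as an assumption for other models.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GenerationSettings:
    """Hypothetical container for the adjustable parameters; not a real library API."""
    prompt: str
    width: int = 512             # output width in pixels (assumed default)
    height: int = 512            # output height in pixels (assumed default)
    cfg_scale: float = 7.5       # classifier-free guidance scale (assumed default)
    steps: int = 30              # number of denoising steps (assumed default)
    seed: Optional[int] = None   # None means "let the tool pick a random seed"

    def problems(self) -> List[str]:
        """Return readable warnings; an empty list means nothing looks wrong."""
        issues: List[str] = []
        if self.width % 8 or self.height % 8:
            issues.append("width and height should be multiples of 8")
        if self.cfg_scale <= 0:
            issues.append("cfg_scale must be positive")
        if self.steps < 1:
            issues.append("steps must be at least 1")
        return issues
```

Checking the settings before a long render is cheap, and keeping the seed in the same object makes it easier to reproduce a result later.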

Q & A

  • What is the purpose of the 'prompt book' mentioned in the transcript?

    -The 'prompt book' is a collection of tips and tricks for creating prompts that generate images using AI models like Stable Diffusion. It guides users on how to write prompts effectively to get desired outcomes from AI image generation.

  • What are some key factors to consider when crafting a prompt for AI image generation?

    -Key factors include deciding on the type of image (photo or painting), subject (person, animal, landscape), special details (lighting, environment), color scheme, point of view, and specific art styles or mediums if applicable.

  • How does the order of words in a prompt affect the AI's interpretation?

    -The order of words in a prompt can significantly influence the AI's interpretation and the resulting image. Placing more important aspects of the desired image earlier in the prompt can give them more weight in the AI's understanding.

  • What is the role of 'modifiers' in the context of AI image generation?

    -Modifiers are words that can alter the style, format, or perspective of the generated image. They can include terms related to photography, art styles, and other descriptive elements that refine the output.

  • How can mentioning specific artists in a prompt influence the generated image?

    -Mentioning specific artists can guide the AI to emulate the style of those artists in the generated image. However, it's important to research the artists' styles to ensure consistency in the desired outcome.

  • What is the significance of 'scale' in the context of a prompt's effectiveness?

    -The 'scale' or 'CFG value' determines how closely the AI adheres to the prompt. A higher scale value means the AI will follow the prompt more closely, while a lower value allows for more creative freedom but might result in less accurate representations.

  • What are some examples of 'magic words' that can be used to enhance the quality of the generated image?

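    -Examples of 'magic words' include 'HDR', 'Ultra HD', and '64k' to push the model toward sharper, more detailed-looking output, 'cinematic lighting' for specific lighting effects, and 'professional' to suggest a higher-quality outcome. These words steer style only; the actual pixel dimensions are set by the resolution parameters.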

  • How can the 'seed' parameter be used in AI image generation?

    -The 'seed' parameter is used to control the randomness in the AI's image generation process. A static seed ensures the same starting point for the image, allowing for consistent results when tweaking prompts.

  • What does the term 'bokeh' refer to in photography and how can it be described in a prompt?

    -Bokeh refers to the aesthetic quality of the out-of-focus areas in a photograph. It can be described in a prompt by mentioning 'bokeh' and can be influenced by the choice of camera lens.

  • How can the 'step count' parameter affect the AI's image generation process?

    -The 'step count' determines the number of iterations the AI goes through to generate an image. More steps can lead to more detailed images but also increase render times and computational resources required.

  • What is the importance of 'prompt token efficiency' when crafting prompts for AI image generation?

    -Prompt token efficiency matters because the text encoder only reads a limited number of tokens per prompt, about 75 in a standard Stable Diffusion setup; how anything past that limit is handled depends on the tool. Efficient prompts convey the desired image in as few tokens as possible, so that every important detail fits within the limit.
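
To make the token limit from the last answer concrete, here is a rough sketch of a budget check. It does not use the model's real tokenizer: `rough_token_count` simply counts words and punctuation marks, which usually undercounts because a real tokenizer can split uncommon words into several tokens. The function names and the `TOKEN_BUDGET` constant are hypothetical; the figure 75 is the one quoted in the video.

```python
import re
from typing import List

TOKEN_BUDGET = 75  # figure quoted in the video; actual limits depend on the tool


def rough_token_count(text: str) -> int:
    """Crude estimate: one token per word and per punctuation mark.

    This is only an approximation of what a real tokenizer would report.
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


def trim_to_budget(fragments: List[str], budget: int = TOKEN_BUDGET) -> str:
    """Join fragments with commas, in order, skipping duplicates.

    Stops at the first fragment that would push the estimate over the budget,
    so earlier (more important) fragments are the ones that survive.
    """
    kept: List[str] = []
    seen = set()
    for fragment in fragments:
        cleaned = fragment.strip()
        key = cleaned.lower()
        if not cleaned or key in seen:
            continue
        candidate = ", ".join(kept + [cleaned])
        if rough_token_count(candidate) > budget:
            break
        kept.append(cleaned)
        seen.add(key)
    return ", ".join(kept)
```

For an exact count you would run the prompt through the tokenizer that ships with the model you are using; the estimate above is only meant for quick sanity checks while drafting.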



🤔 Discovering the Secret to Writing Effective Prompts

The speaker humorously introduces the topic of writing prompts, suggesting the existence of a 'Secret Sauce' for creating compelling images. They mention finding an intriguing resource, the OpenArt prompt book, which presents its advice on crafting prompts as a slideshow. The speaker clarifies that the video is not sponsored and is keen to explore and share the content of the prompt book, which includes tips on creating prompts, the importance of asking questions, and considerations for subject matter, lighting, environment, and point of view. The discussion also touches on the concept of 'prompt engineering' and the influence of word order on AI-generated images.


📸 Exploring Modifiers and Photography Techniques in Prompts

This paragraph delves into the role of modifiers in altering the style, format, and perspective of an image. It provides examples of photography-related terms that can be used as modifiers, such as close-up, long shots, and wide shots. The importance of lighting is emphasized, along with the impact of different environments and lenses on the final image. The speaker also discusses the use of specific devices and the significance of the order of words in a prompt, using the example of a cat on a Martian table to illustrate the point.


🎨 Understanding Art Styles and Mediums in Prompt Crafting

The speaker moves on to discuss various art mediums and styles that can be incorporated into prompts, such as chalk, oil painting, watercolor, and fabric. They highlight the effectiveness of pencil drawings in AI-generated images and provide tips for using clay in animations. The inclusion of artists in prompts is also covered, with advice on researching artists' styles for more consistent results. The paragraph concludes with suggestions for describing landscapes, mixing artist styles, and using different art movements and aesthetics to inspire creative outcomes.


🌟 Harnessing the Power of Lighting and Emotion in Image Creation

Lighting is a recurrent theme, with the paragraph discussing how different lighting conditions can affect an image's mood and atmosphere. The speaker also talks about the use of emotions in prompts, such as 'sad' or 'cozy,' to set the scene's tone. Aesthetics are explored, with examples ranging from psychedelic lion to Miami 80s vibe, emphasizing the potential for creative expression through AI-generated art.


🔍 Fine-Tuning Prompts with Magic Words and Parameters

The paragraph introduces 'magic words' such as 'HDR', 'Ultra HD', and '64k' that push the model toward sharper, more detailed-looking images. It also covers the impact of studio lighting and the use of specific platforms like ArtStation in prompts. The speaker provides insights into the use of vivid colors and the importance of being specific with color choices. They also discuss the technical settings that accompany a prompt, such as resolution, CFG (classifier-free guidance), step counts, and seed values, offering practical advice for beginners and experienced users alike.
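
Two of these settings, the seed and the CFG scale, are easier to picture with a toy example. The sketch below is not how any specific tool is implemented: plain Python lists stand in for the model's noise tensors, and both function names are hypothetical. The blend in `apply_guidance` is the standard classifier-free guidance formula, uncond + scale * (cond - uncond).

```python
import random
from typing import List


def starting_noise(seed: int, size: int = 4) -> List[float]:
    """Toy stand-in for the initial noise: the same seed gives the same values."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def apply_guidance(uncond: List[float], cond: List[float], scale: float) -> List[float]:
    """Classifier-free guidance blend of two predictions.

    uncond: prediction made without the prompt
    cond:   prediction made with the prompt
    scale:  0 ignores the prompt, 1 uses the prompted prediction as is,
            larger values push further in the prompt's direction
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]
```

This is why a fixed seed is useful when tweaking a prompt: the starting point stays the same, so differences in the output come from the prompt or the scale rather than from fresh randomness.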


🔄 Refining and Iterating AI-Generated Images

The final paragraph focuses on strategies for refining AI-generated images, including the use of image-to-image variations and strength adjustments. The speaker explains how to iterate an image to achieve desired results and the importance of starting with a strong base image. They also mention the utility of conventional tools like face restoration for fixing imperfections in generated images. The video concludes with a showcase of various AI-generated art pieces, demonstrating the potential of AI in creating diverse and compelling visuals.
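
The role of the strength setting in image-to-image work can be sketched with a little arithmetic. The function below assumes, as several common implementations do, that strength selects the fraction of the denoising schedule that is actually run on the base image; the name `img2img_steps` is hypothetical, and a given tool may round or clamp differently.

```python
def img2img_steps(total_steps: int, strength: float) -> int:
    """Estimate how many denoising steps are applied to the base image.

    strength near 0 keeps the base image almost unchanged;
    strength of 1 effectively regenerates it from noise.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    return min(int(total_steps * strength), total_steps)
```

Under this assumption, a run configured for 50 steps at strength 0.5 applies about 25 steps to the base image, which is why low strengths suit small refinements and high strengths suit larger changes.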



💡Stable Diffusion

Stable Diffusion is a machine learning model that generates images from text descriptions. It belongs to the class of generative models known as diffusion models, which can produce high-quality images. In the video, the host discusses how to use prompts to guide Stable Diffusion toward the desired images, emphasizing the importance of prompt engineering.

💡Prompt Engineering

Prompt engineering is the process of carefully crafting text prompts to guide AI image generation models like Stable Diffusion to produce specific types of images. It involves understanding how the model interprets different words and phrases. The video provides tips on constructing effective prompts to achieve the desired visual outcomes, such as specifying the subject, lighting, environment, and artistic style.
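
The question-driven approach can be sketched as a small helper that turns answers into a prompt. The question keys and the function name below are hypothetical; they paraphrase the checklist described in the video (type of image, subject, details, lighting, environment, point of view, style) rather than quote the prompt book.

```python
from typing import Dict, List

# Hypothetical keys, listed in the order the answers should appear in the prompt.
QUESTION_ORDER: List[str] = [
    "medium",         # photo, oil painting, 3D render, ...
    "subject",        # person, animal, landscape, ...
    "details",        # special details about the subject
    "lighting",
    "environment",
    "point_of_view",
    "style",          # art style or artist reference
]


def build_prompt(answers: Dict[str, str]) -> str:
    """Join the answered questions with commas, skipping unanswered ones."""
    parts = [
        answers[key].strip()
        for key in QUESTION_ORDER
        if answers.get(key, "").strip()
    ]
    return ", ".join(parts)
```

For example, answering only the medium, subject, and lighting questions with "oil painting", "a red fox", and "golden hour" yields the prompt "oil painting, a red fox, golden hour", which can then be refined by answering the remaining questions.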

💡Order of Text

The order of text in a prompt is crucial because it can influence the weight the AI model gives to different elements when generating an image. For instance, if 'in the sky' is placed earlier in the prompt, the model is more likely to prioritize that aspect. The video script illustrates this with examples, showing how the position of words can change the final image significantly.
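
One way to apply this rule of thumb is to rank prompt fragments before joining them. The helper below is a hypothetical sketch: it only encodes the video's advice that earlier text tends to carry more weight, and how strongly position matters in practice depends on the model and tool.

```python
from typing import List, Tuple


def order_by_priority(fragments: List[Tuple[str, int]]) -> str:
    """Build a prompt with the most important fragments first.

    Each fragment is (text, priority); a lower number means more important.
    sorted() is stable, so fragments with equal priority keep their input order.
    """
    ranked = sorted(fragments, key=lambda item: item[1])
    return ", ".join(text for text, _ in ranked)
```

Swapping two priorities and regenerating with the same seed is a simple way to see how much the ordering changes the result for a given prompt.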


💡Modifiers

Modifiers are additional words or phrases that can alter the style, format, or perspective of the generated image. They can include terms that describe artistic styles, lighting conditions, or specific visual effects. In the context of the video, modifiers are used to add depth and specificity to prompts, helping to refine the AI's output to match the user's vision more closely.
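
Because the effect of a modifier is easiest to judge side by side, a minimal sketch is to generate one prompt variant per modifier and render each with the same settings. The function name is hypothetical and the example modifiers are the ones mentioned in the video.

```python
from typing import Iterable, List


def modifier_variants(base: str, modifiers: Iterable[str]) -> List[str]:
    """Return the base prompt followed by one variant per modifier."""
    base = base.strip()
    variants = [base]
    for modifier in modifiers:
        cleaned = modifier.strip()
        if cleaned:  # ignore empty entries
            variants.append(f"{base}, {cleaned}")
    return variants
```

Rendering each variant with a fixed seed keeps the comparison fair, since the only thing that changes between images is the modifier.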

💡Photography Terms

Photography terms such as 'close-up,' 'long shot,' 'Polaroid,' and 'long exposure' are used within prompts to guide the AI in generating images with specific photographic qualities. These terms help the model understand the desired style and mood of the image. The video discusses how specifying photography terms can lead to more consistent and desired results.

💡Art Styles

Art styles refer to the various visual aesthetics and techniques used in creating artwork. In the video, the host mentions styles like '3D render,' 'Studio Ghibli,' and 'anime' as examples of how specifying an art style can influence the AI's image generation. This is important for achieving a particular look or feel in the final image.

💡Camera Lenses

Camera lenses, such as 'tilt-shift' for creating miniature effects or 'macro' for close-up details, are mentioned as part of the prompt to influence the perspective and style of the generated image. The video highlights that understanding different lenses and how they affect a photo can help users create more nuanced and professional-looking prompts.


💡Artist References

Referring to specific artists or art styles in a prompt can help the AI generate images that resemble the work of those artists. The video gives examples like 'Greg Rutkowski' and 'Tim Burton,' suggesting that including artist names can be a powerful way to guide the AI towards a particular aesthetic or mood.


💡Emotion

Emotion refers to the feelings or mood that a piece of artwork might evoke. In the video, the host discusses how including emotional descriptors in prompts, such as 'sad,' 'happy,' or 'lonely,' can help set the tone for the generated image, making it more expressive and impactful.


💡Aesthetics

Aesthetics in the context of the video pertains to the visual appeal or sensory experience associated with the generated images. Terms like 'psychedelic,' 'vaporwave,' and 'Miami 80s Vibe' are used as modifiers to impart a certain visual style or cultural reference to the images. The video emphasizes the role of aesthetics in creating images that are not only visually pleasing but also culturally resonant.

💡Resolution and Image Quality

Resolution and image quality are important aspects of the final output. The video mentions '4K,' '8K,' and '64k' as examples of resolution terms that can be included in prompts to steer the AI towards higher-quality images, along with terms like 'Ultra HD' and 'HDR' to enhance apparent detail and clarity. These words influence how detailed the image looks; the actual pixel dimensions are controlled by the separate resolution setting.


A new prompt book has been released to help with creating prompts for generating images.

The OpenArt prompt book presents its advice on crafting effective prompts as a slideshow.

Prompt engineering involves writing text prompts for image generation, which can be enhanced by asking specific questions about the desired image.

The order of words in a prompt can significantly influence the AI's interpretation and the resulting image.

Modifiers can change the style, format, or perspective of the generated image.

Photography prompts can be made more specific by including details like close-up, long shots, or specific camera lenses.

The lighting in a prompt is crucial, with options like cinematic lighting or butterfly light offering different effects.

Including art styles or specific artists in the prompt can lead to more consistent and desired outcomes in the generated images.

The prompt's length and the order of details within it can affect the weight and impact of each element on the AI's output.

Using 'image to image' variations allows for iterative improvements on generated images, getting closer to the desired result with each iteration.

The seed can be set to a specific value for consistency or left random for varied outcomes.

Different samplers need different numbers of steps, and different amounts of time, to reach a usable image; some are faster than others.

The CFG (scale) value balances creativity against guided generation, with higher values ensuring closer adherence to the prompt.

Token efficiency is important as prompts are limited in length; shorter prompts can be more impactful.

Conventional tools like face restoration can be used to fix issues in generated images, such as distorted facial features.

The video provides a comprehensive guide to using the OpenArt prompt book for creating compelling and effective prompts for image generation.

The presenter shares personal experiences and examples of using the prompt book, including successes and areas for improvement.

An ultimate guide tutorial is mentioned for more in-depth information on prompt creation and other related topics.