[Stable Diffusion] A collection of spell prompts for composition, angle, and gaze

AIジェネ [Sharing information on AI illustration generation]
26 Aug 2023 08:23

TLDR: The video is a guide to controlling composition, camera angle, and gaze in AI-generated images. It stresses learning specific 'spells' (prompt keywords) such as 'head shot', 'cowboy shot', and 'full body' to frame the subject, along with angle spells for shots from above or below. It also covers image size adjustments and the use of 'looking viewer' and 'looking away' to control a character's line of sight. When prompts alone do not give the desired result, it suggests falling back on 'img2img' or 'openpose'.

Takeaways

  • 🎨 Understanding composition spells can help generate desired image compositions without multiple attempts.
  • 📸 Using 'head shot' can generate images from the chest area upwards, not just the face.
  • 🔲 Creating square image sizes like 512x512 can facilitate easier and higher quality compositions.
  • 👢 To avoid generating a hat in 'cowboy shot' images, add '(no hat:1.3)' to the prompt.
  • 🦵 'cowboy shot' frames the character from about mid-thigh up; a tall image size such as 512x1024 helps the framing.
  • 🧍 For full-body compositions, use the 'full body' prompt with a 512x1024 image size, which also improves eye quality.
  • 🔄 Adjusting image dimensions to 512x768 and using 'hires.fix' can enhance overall image quality.
  • 👀 Specifying angles like 'shoot from front', 'above', 'below', or 'behind' can control the perspective of the generated image.
  • 👀 Controlling a person's line of sight with weighted prompts like '(looking up:1.2)' or '(looking down:1.2)' can affect how they interact with the camera.
  • 🔄 If the desired composition or angle isn't achieved, using 'img2img', 'openpose', or changing the model can be alternative solutions.
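
The takeaways above combine three kinds of spell (composition, angle, gaze), some with a weight. A minimal Python sketch of assembling them into one prompt string follows. It assumes the `(tag:weight)` attention syntax used by the AUTOMATIC1111 Stable Diffusion web UI; the helper names `weighted` and `build_prompt` and the subject text are hypothetical and do not come from the video.

```python
def weighted(tag: str, weight: float = 1.0) -> str:
    """Format a tag, adding (tag:weight) emphasis only when the weight is not 1."""
    if weight == 1.0:
        return tag
    return f"({tag}:{weight:g})"


def build_prompt(subject, composition=None, angle=None, gaze=None):
    """Join a subject with optional composition, angle, and gaze spells.

    Each spell is either a plain string or a (tag, weight) tuple.
    """
    parts = [subject]
    for spell in (composition, angle, gaze):
        if spell is None:
            continue
        tag, weight = spell if isinstance(spell, tuple) else (spell, 1.0)
        parts.append(weighted(tag, weight))
    return ", ".join(parts)


# Placeholder subject; the spells are the ones named in the summary.
prompt = build_prompt(
    "a woman in a park",
    composition="cowboy shot",
    angle="shoot from above",
    gaze=("looking up", 1.2),
)
print(prompt)
```

Keeping the three spell types as separate arguments makes it easy to swap one (for example the angle) while holding the others fixed between attempts.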

Q & A

  • What is the main topic of the video?

    - The video explains how to control composition and camera angle in image generation using various spells, that is, prompt keywords.

  • How can one generate an image from the chest area upwards?

    - To generate an image from the chest area upwards, insert the 'head shot' spell in the prompt.

  • Why should certain spells like 'skirt' or 'denim' be removed from the prompt?

    - These spells describe clothing that falls outside an upper-body frame, so leaving them in the prompt can interfere with generating the desired composition. Remove them before generating an upper-body image.

  • What is the recommended image size for easier generation of high-quality images?

    - A square image size, such as 512 by 512 pixels, is recommended for easier generation of high-quality images.

  • How can you reduce the tendency of generating an image with a hat in a 'cowboy shot' composition?

    - You can reduce the tendency to generate a hat by adding '(no hat:1.3)' after 'cowboy shot' in the prompt.

  • What is the recommended image size for generating a full-body image?

    - For a full-body image, set the width to 512 and the height to 1024 pixels. The taller canvas improves the quality of the eyes and makes the whole body easier to generate.

  • What is the most recommended method for generating an image with the highest quality?

    - The most recommended method is to set the width to 512 and the height to 768 pixels and enable 'hires.fix'.

  • How can you generate an image from a specific angle?

    - You can generate an image from a specific angle by using spells like 'shoot from front', 'shoot from above', 'shoot from below', 'shoot from side', and 'shoot from behind'.

  • What spell can be used to make the generated character look at the camera?

    - The 'looking viewer' spell can be used to make the generated character look at the camera.

  • If the desired composition or angle is not achieved, what alternative methods can be used?

    - You can fall back on methods that work from a reference image, such as 'img2img', 'openpose', or 'openpose editor'.

  • What should one do if the model does not reflect the spell content?

    - If the model does not reflect the spell content, try changing the image size or switching to a different model.
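
The answers above pair each composition with an image size and recommend dropping lower-body spells for upper-body shots. That can be sketched as a small lookup in Python. The sizes and the two lower-body tags are the ones quoted in this summary; the function name `plan_generation` and the idea of encoding the advice as a table are assumptions made for illustration.

```python
# Sizes quoted in the summary; anything else falls back to 512x768,
# the size the video recommends together with hires.fix.
RECOMMENDED_SIZE = {
    "head shot": (512, 512),
    "full body": (512, 1024),
}
DEFAULT_SIZE = (512, 768)

# Lower-body spells the summary says to remove for upper-body compositions.
LOWER_BODY_TAGS = {"skirt", "denim"}


def plan_generation(composition, tags):
    """Return a prompt and image size for the requested composition."""
    width, height = RECOMMENDED_SIZE.get(composition, DEFAULT_SIZE)
    if composition == "head shot":
        # Drop tags that describe parts outside the frame.
        tags = [tag for tag in tags if tag not in LOWER_BODY_TAGS]
    return {
        "prompt": ", ".join([composition, *tags]),
        "width": width,
        "height": height,
    }


plan = plan_generation("head shot", ["white shirt", "skirt", "smile"])
print(plan)
```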

Outlines

00:00

🎨 Composition and Angle Viewpoints in Art

The first part of the video covers composition spells. It shows how 'head shot' generates images from the chest area upwards and how 'cowboy shot' frames the character from the middle of the thigh. It explains that a square image size makes generation easier and that adjusting the image dimensions gives higher-quality output, and it touches on 'hires.fix' for better eye quality in full-body images. The advice is practical throughout: pick the spell for the framing you want, then match the image size to it.

05:05

🔍 Fine-Tuning Art Prompts and Model Selection

The second part covers fine-tuning prompts to get the desired composition and angle. It addresses common problems, such as 'cowboy shot' tending to add a hat, and offers fixes like adding '(no hat:1.3)' to the prompt. It then walks through angle and gaze spells, including 'shoot from above', 'shoot from below', and 'looking viewer'. When the desired composition is still not achieved, it suggests adjusting the image size or using alternative methods like 'img2img' or 'openpose'. The video closes by stressing that knowing these spells speeds up image generation, followed by a call to subscribe for more content.

Keywords

💡composition

In the context of the video, 'composition' refers to the arrangement of elements within an image, specifically how the subject is positioned and framed. It is a crucial aspect of creating visually appealing and meaningful artwork. The video discusses various 'composition spells' or techniques, such as 'head shot' and 'cowboy shot', to achieve different compositions. For instance, generating a composition from the chest area upwards can be done by inserting 'head shot' into the prompt.

💡angle viewpoint

The term 'angle viewpoint' refers to the perspective from which an image is captured, which strongly influences how the viewer reads the image. The video emphasizes specifying the angle with spells such as 'shoot from above' or 'shoot from below' to achieve the desired visual effect. It also notes that spells can have side effects: 'cowboy shot', for example, often produces a character wearing a hat unless the prompt says otherwise.

💡image size

'Image size' refers to the dimensions of the image being generated, which affect both quality and framing. The video suggests that a square image size, like 512 by 512, makes it easier to generate high-quality images, especially for compositions from the chest up. Setting the width to 512 and the height to 1024 for a full-body composition improves the quality of the eyes and makes it easier to fit the whole body in the image.

💡hires.fix

'hires.fix' refers to the high-resolution upscaling option in the Stable Diffusion web UI. It is a generation setting rather than a prompt spell. The video recommends enabling it together with specific image size settings, 512 by 768 in particular, to get the highest quality output. It improves the eyes and overall detail throughout the image, which is especially useful for full-body compositions, at the cost of longer processing times.
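
For readers who drive generation from a script rather than the web UI, the 512 by 768 plus hires.fix recommendation could be expressed as a request payload. The sketch below only builds the payload and sends nothing. It assumes the AUTOMATIC1111 web UI API (the `/sdapi/v1/txt2img` endpoint with its `enable_hr`, `hr_scale`, `hr_upscaler`, and `denoising_strength` fields); the video does not show API usage, and the step count, scale, upscaler, and denoising values here are illustrative assumptions.

```python
import json


def txt2img_payload(prompt, width=512, height=768, use_hires_fix=True):
    """Build a txt2img request body; field names assume the AUTOMATIC1111 API."""
    payload = {
        "prompt": prompt,
        "width": width,
        "height": height,
        "steps": 20,  # assumed value, not from the video
    }
    if use_hires_fix:
        payload.update({
            "enable_hr": True,          # turn on the high-resolution second pass
            "hr_scale": 2,              # assumed upscale factor
            "hr_upscaler": "Latent",    # assumed upscaler choice
            "denoising_strength": 0.5,  # assumed value
        })
    return payload


payload = txt2img_payload("full body, looking viewer")
print(json.dumps(payload, indent=2))
```

Checking the field names against the `/docs` page of a running web UI instance is advisable, since available options differ between versions.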

💡prompt

In the context of the video, a 'prompt' is the text instruction given to the AI system to generate a specific image, and it directly shapes the outcome. The video discusses various spells or keywords that can be included in the prompt to control the composition and angle viewpoint of the generated image. For example, adding '(no hat:1.3)' to the prompt can reduce the likelihood of the character wearing a hat in the image.

💡cowboy shot

The 'cowboy shot' is a composition spell that frames the character from about mid-thigh up. The video describes it as a prompt commonly seen on the Civitai model distribution site. It also addresses a common side effect, namely that the character often ends up wearing a hat, and suggests adding '(no hat:1.3)' to the prompt to generate the image without one.

💡looking viewer

'looking viewer' is a term used in the video to describe a spell or instruction that directs the character in the image to look at the camera. This is important for creating images from the camera's perspective, enhancing the viewer's connection with the subject. The video suggests that while it's often possible to achieve this without specifying the 'looking viewer' spell, including it in the prompt can ensure the desired eye contact is maintained in the generated image.

💡img2img

The term 'img2img' refers to a method mentioned in the video for generating images based on a reference image. If the desired composition or angle is not achieved through the use of spells or prompts alone, 'img2img' can be used as an alternative approach to guide the AI in creating the image. This method allows for more precise control over the final output by providing a visual reference for the AI to follow.
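
As an illustration of how a reference image is handed to 'img2img', the sketch below builds a request body without sending it. It assumes the AUTOMATIC1111 web UI API, where `/sdapi/v1/img2img` accepts base64-encoded images in an `init_images` list; the function name, the denoising value, and the image size are assumptions, and the video itself works in the UI rather than through an API.

```python
import base64


def img2img_payload(prompt, reference_image: bytes, denoising_strength=0.6):
    """Build an img2img request body from raw image bytes.

    Lower denoising_strength keeps the result closer to the reference;
    0.6 is an assumed starting point, not a value from the video.
    """
    encoded = base64.b64encode(reference_image).decode("ascii")
    return {
        "prompt": prompt,
        "init_images": [encoded],
        "denoising_strength": denoising_strength,
        "width": 512,
        "height": 768,
    }


# Stand-in bytes; in practice this would be the contents of a reference PNG.
fake_image = b"not-a-real-png"
request_body = img2img_payload("cowboy shot, shoot from below", fake_image)
print(sorted(request_body))
```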

💡openpose

'openpose' is a pose-guidance technique discussed in the video for cases where the desired composition or angle cannot be obtained through prompts alone. Instead of relying on spells, it steers generation with a pose taken from a reference image or set up in 'openpose editor'. The video suggests it as a solution when the model may not have learned certain prompts or spells.

💡Ai Gene

'Ai Gene' (AIジェネ) is the name of the channel that published the video. It shares information on AI illustration generation, and the video encourages viewers to subscribe for more tutorials on AI-generated images.

💡spells

Throughout the video, 'spells' are used metaphorically to describe specific keywords or phrases that are input into the AI system to influence the generation of images. These 'spells' act as directives for the AI, guiding it to produce images with certain compositions, angles, and characteristics. The video provides numerous examples of such spells, like 'head shot', 'shoot from above', and 'looking up: 1.2', and emphasizes the importance of mastering these spells to efficiently create the desired images.
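
To make the idea of a spell with a weight concrete, here is a small Python sketch that splits a prompt into (spell, weight) pairs. It assumes comma-separated spells and the `(spell:weight)` emphasis form used by the AUTOMATIC1111 web UI; it is a simplified reader for illustration, not the web UI's actual parser, and it does not handle nested parentheses or commas inside them.

```python
import re

# Matches a single weighted spell such as "(looking up:1.2)".
_WEIGHTED = re.compile(r"^\((?P<tag>[^():]+):\s*(?P<weight>\d+(?:\.\d+)?)\)$")


def parse_spells(prompt):
    """Split a comma-separated prompt into (spell, weight) pairs."""
    spells = []
    for raw in prompt.split(","):
        token = raw.strip()
        if not token:
            continue
        match = _WEIGHTED.match(token)
        if match:
            spells.append((match.group("tag").strip(), float(match.group("weight"))))
        else:
            spells.append((token, 1.0))  # unweighted spells default to 1.0
    return spells


spells = parse_spells("head shot, shoot from above, (looking up:1.2)")
print(spells)
```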

Highlights

The importance of mastering composition and angle spells for efficient image generation.

Using 'head shot' to generate a composition from the chest area upwards.

The necessity of removing spells for lower body elements like 'skirt' or 'denim' before generating upper body compositions.

Creating a square image size, such as 512x512, for easier generation of high-quality images.

The common misreading of 'head shot' as face-only, when the results usually extend down to the chest area.

The possibility of generating images from the middle of the thigh using 'cowboy shot'.

Reducing the tendency to generate hats by adding '(no hat:1.3)' to the prompt.

The low probability of generating a composition from the top of the thigh with 'american shot'.

Optimizing the image size to 512x1024 for compositions from the upper thighs without hats.

Entering 'full body' to compose the entire body in the image.

The recommendation of using a 512x768 image size with 'hires.fix' for the highest quality.

The advantage of higher quality eyes and body generation with increased image size and processing time.

Entering 'shoot from front' for a frontal angle at the character's eye level.

Generating angles from above, below, or behind using respective spells.

Using 'looking viewer' to have the character look directly at the camera.

Adjusting the character's gaze direction with 'looking up', 'looking down', and 'looking side'.

The strategy of using 'looking away: 1.4' to make the character look elsewhere instead of the camera.

The potential need to change image size or use 'img2img' or 'openpose' when spells do not yield the desired composition.

The possibility of changing the model if the problem persists despite trying all the spells.

Ai Gene's role in sharing information about AI illustration generation.

The call to action for viewers to subscribe to the channel for more content.