Stable Diffusion 3 vs ChatGPT Dalle-3 vs Midjourney [NEW Best Image Generator?]

AI Andy
3 Mar 2024
20:50

TLDR: The video compares three AI image generators: Stable Diffusion 3, Midjourney, and DALL-E 3. Each is judged on detail, adherence to the prompt, and a subjective 'coolness' factor across a shared set of prompts: a cinematic photo of a red apple, a painting of an astronaut on a pig, a chameleon close-up, a 90s desktop computer, glass bottles with colored liquids, an embroidered cloth, a sports car, and a horse balancing on a ball. The reviewer concludes by preferring DALL-E 3 for its style and adherence to the prompts, despite Stable Diffusion 3's strong performance in text generation.

Takeaways

  • 📸 Comparison of three AI models: Stable Diffusion 3, Midjourney, and DALL-E 3 based on their ability to interpret and generate images from prompts.
  • 🎨 Ranking criteria include detail, adherence to the prompt, and coolness factor of the generated images.
  • 🍎 The first prompt was a cinematic photo of a red apple in a classroom with a motivational message on the blackboard.
  • 🚀 Stable Diffusion 3 was criticized for lacking in coolness but adhered well to the prompt.
  • 🌟 Midjourney produced images with a higher coolness factor but weaker detail clarity and text adherence.
  • 🖌️ DALL-E 3 demonstrated good clarity, detail, and a dramatic coolness factor in its interpretations.
  • 👨‍🚀 The second prompt involved a painting of an astronaut riding a pig, which showcased the adherence and style of each AI.
  • 🦎 A close-up studio photograph of a chameleon was used to evaluate the AI's ability to capture fine details and textures.
  • 🖥️ A prompt featuring a 90s desktop computer showcased the AI's ability to create nostalgic and detailed images.
  • 🏎️ A night photo of a sports car with text on the side was used to test the AI's understanding of motion and speed.
  • 🐎 The final prompt about a horse balancing on a ball in a field was to test the AI's ability to handle unrealistic but visually appealing scenarios.
  • 🏆 DALL-E 3 was favored for its style and ability to handle text and composition effectively, making it the preferred choice for the reviewer.

Q & A

  • What is the main focus of the video script?

    -The main focus of the video script is to compare and rank different AI-generated images based on the same prompts, evaluating them on detail, adherence to the prompt, and coolness factor.

  • Which AI models are being compared in the video?

    -The AI models being compared are Stable Diffusion 3, Midjourney, and DALL-E 3.

  • What are the three factors used to rank the AI-generated images?

    -The three factors used to rank the images are detail, adherence to the prompt, and coolness.

  • How does the video script describe the first prompt being used for the comparison?

    -The first prompt is a cinematic photo of a red apple on a table in a classroom with the words 'Go big or go home' written on the blackboard.

  • What criticism is mentioned about Stable Diffusion 3 in the video script?

    -The criticism of Stable Diffusion 3 is that it falls short on the coolness factor.

  • How does the video script describe the performance of Midjourney on the astronaut riding a pig prompt?

    -Midjourney adheres well to the prompt but renders it in a street art style that was not explicitly requested; it still scores high on coolness.

  • What is the reviewer's final preference for the AI models based on the video script?

    -The reviewer's final preference, based on the comparisons, is for ChatGPT and DALL-E 3, for their style and their ability to handle text and composition well.

  • What issue is noted with Midjourney's adherence to text in the prompts?

    -Text is not Midjourney's forte: it often fails to accurately include the text elements specified in the prompts.

  • How does the video script address the potential for future AI models?

    -The video script suggests that once Stable Diffusion becomes open-source, the community will be able to develop new models that could surpass the current offerings, providing the best outcomes as per user wishes.

  • What is the reviewer's critique of DALL-E 3's performance on the prompts?

    -The reviewer critiques DALL-E 3 for not always adhering to the text elements of the prompts, but appreciates its stylistic capabilities and the coolness factor of its outputs.

Outlines

00:00

🎨 Comparative Analysis of AI Art Generation

The paragraph introduces a comparison between three AI art generation models: Stable Diffusion 3, Midjourney, and DALL-E 3. All three receive the same prompt: a cinematic photo of a red apple on a table in a classroom with specific text written on the blackboard. The AI-generated images are evaluated on detail, adherence to the prompt, and coolness factor. The paragraph highlights the strengths and weaknesses of each model, noting that Stable Diffusion 3 may lack coolness, Midjourney has a higher coolness factor but weaker text adherence, and DALL-E 3 offers good clarity and detail with dramatic lighting.

05:02

🚀 Creative Interpretations of Challenging Prompts

This paragraph continues the comparison by presenting more prompts and the corresponding AI-generated images. It discusses the adherence to the prompt, style, and quality of the images produced by each model. It covers a variety of prompts, including a painting of an astronaut on a pig, a chameleon close-up, and a 90s desktop computer. The summary notes the unique interpretations and styles of each model, with Midjourney excelling in animal depictions and DALL-E 3 providing stylized and dramatic images.

10:05

🏎️ Evaluating AI's Ability to Handle Complex Scenes

The paragraph focuses on the AI models' ability to handle complex and detailed scenes, such as transparent glass bottles with colored liquids, an embroidered cloth with a tiger and a candle, and a sports car with text on the side. The evaluation criteria include the correct representation of colors, liquids, and text, as well as the overall aesthetic appeal of the images. The paragraph highlights the challenges faced by the models, particularly with text generation and adherence to the order of elements, and praises DALL-E 3 for its stylized and dramatic renderings.

15:06

🌄 Diverse Styles and Adherence in AI Art

This paragraph covers the diversity of styles and the adherence to prompts exhibited by the AI models. The prompts range from a horse balancing on a ball to an anime-style illustration of a newsstand. The paragraph discusses the realism, style, and creativity of the AI-generated images, noting that while some models excel in certain areas, they may fall short in others, such as text generation or physical accuracy. It concludes with a preference for the style of DALL-E 3 over the other models.

20:09

🌟 Final Thoughts on AI Art Generation Models

The final paragraph wraps up the comparison by discussing the potential for community-driven improvements in AI art generation models once they become open-source. It reflects the reviewer's personal preference for ChatGPT and DALL-E 3, citing their style and ability to handle text and complex scenes effectively. The video concludes with a call to action for viewers to find their preferred AI art generation prompt and to continue exploring the content.

Keywords

💡Stable Diffusion 3

Stable Diffusion 3 is a version of an AI model used for generating images based on text prompts. In the context of the video, it is one of the models being compared for its ability to adhere to prompts, detail, and coolness factor. The video provides examples of how it performs on various prompts, such as creating a cinematic photo of a red apple or illustrating a sports car with text on it.

💡Midjourney

Midjourney is another AI image generation model, evaluated alongside Stable Diffusion 3 and DALL-E 3. The video discusses its performance in terms of detail clarity, coolness factor, and adherence to the given prompts, with specific examples of the images it generates.

💡DALL-E 3

DALL-E 3 is the third AI image generation model compared in the video. It is assessed on the same criteria as the other models: detail, adherence to prompts, and coolness. The video provides examples of DALL-E 3's images and discusses its performance in comparison to Stable Diffusion 3 and Midjourney.

💡Prompts

In the context of the video, prompts are the text inputs provided to the AI models to generate specific images. The effectiveness of an AI model is judged by its ability to accurately and creatively respond to these prompts, producing images that match the requested details and themes.

💡Detail

Detail refers to the level of intricacy and clarity in the images generated by the AI models. The video assesses how well each model captures fine details in its outputs, such as the texture of the apple or the individual scales of the chameleon.

💡Adherence

Adherence measures how closely an AI model's output follows the specific instructions given in the prompt. The video evaluates the models based on their ability to accurately represent elements mentioned in the prompts, such as objects, text, and their arrangement.

💡Coolness

Coolness is a subjective measure of the appeal or stylistic quality of the AI-generated images. The video discusses the coolness factor in relation to the creativity and visual impact of the images, independent of their realism or detail.

💡Text Generation

Text generation refers to the AI models' ability to include and correctly place text within the generated images as specified by the prompt. The video evaluates how well each model integrates text into its outputs, noting instances where models succeed or fail in this aspect.

💡Image Comparison

Image comparison is the process of evaluating and contrasting the outputs of different AI models based on a set of criteria. The video script involves a detailed comparison of images produced by Stable Diffusion 3, Midjourney, and DALL-E 3, assessing their quality, adherence to prompts, and overall visual appeal.

💡AI Model Performance

AI model performance refers to how effectively an AI model can generate images based on text prompts. The video script provides a qualitative assessment of the performance of three AI models (Stable Diffusion 3, Midjourney, and DALL-E 3) based on factors like detail, adherence to prompts, and coolness.
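As a rough illustration of how a per-prompt assessment on the three criteria could be tallied, here is a minimal Python sketch. It is not the reviewer's method: the video ranks images by eye and reports no numeric scores. The 1 to 3 scale, the function names `total_scores` and `rank_models`, the placeholder model and prompt names, and every number below are assumptions invented for illustration.

```python
from collections import defaultdict

# The three criteria named in the video.
CRITERIA = ("detail", "adherence", "coolness")


def total_scores(ratings):
    """Sum each model's scores over all prompts and criteria.

    `ratings` maps prompt -> model -> {criterion: score}.
    """
    totals = defaultdict(int)
    for per_model in ratings.values():
        for model, scores in per_model.items():
            totals[model] += sum(scores[c] for c in CRITERIA)
    return dict(totals)


def rank_models(ratings):
    """Return model names ordered from highest to lowest total score."""
    totals = total_scores(ratings)
    return sorted(totals, key=lambda m: totals[m], reverse=True)


# Hypothetical placeholder data (1 = weakest, 3 = strongest).
# These are NOT scores from the video.
example = {
    "prompt A": {
        "model_x": {"detail": 3, "adherence": 3, "coolness": 1},
        "model_y": {"detail": 2, "adherence": 1, "coolness": 3},
    },
    "prompt B": {
        "model_x": {"detail": 2, "adherence": 3, "coolness": 2},
        "model_y": {"detail": 3, "adherence": 3, "coolness": 3},
    },
}

if __name__ == "__main__":
    print(total_scores(example))
    print(rank_models(example))
```

An unweighted sum treats the three criteria as equally important. A reviewer who cares more about coolness than detail, as the video's conclusion suggests, might weight the criteria differently.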

Highlights

Comparison of Stable Diffusion 3, Midjourney, and DALL-E 3 using the same prompt.

Ranking based on detail, adherence, and coolness factors.

Cinematic photo of a red apple in a classroom with the phrase 'Go big or go home'.

Stable Diffusion 3 criticized for lacking coolness factor.

Midjourney's apple image lacks detail clarity but has a higher coolness factor.

DALL-E 3's image has good clarity, detail, and coolness with dramatic lighting.

Painting of an astronaut riding a pig with a unique style and high adherence to the prompt.

Midjourney's street art style with good adherence and coolness factor.

DALL-E 3's creation of two images with different styles and quality.

Studio photograph of a chameleon with high detail and coolness factor.

Midjourney excels at creating detailed and stylized animal images.

DALL-E 3's stylized and dramatic photo with high coolness factor.

Photo of a 90s desktop computer with nostalgic graffiti and text.

Midjourney's steampunk street art style with a unique take on the prompt.

DALL-E 3's retro UI and cool style with a nostalgic vibe.

Transparent glass bottles with different colored liquids and reflections.

Midjourney's struggle with the order and color accuracy of the bottles.

DALL-E 3's correct order and stylized look with accurate color reflections.

Embroidered cloth with the text 'good night' and a baby tiger with dramatic lighting.

Midjourney's moody and cozy interpretation with a focus on style over text adherence.

DALL-E 3's detailed and textured embroidery with a well-lit candle and style.

Night photo of a sports car with the text 'sd3' and a road sign with 'faster'.

Midjourney's neon lights and high-quality photo with text adherence.

DALL-E 3's stylized composition with an incorrect placement of 'sd3' but a cool perspective.

Horse balancing on a colorful ball with a green grass field and mountain background.

Midjourney's unrealistic portrayal of the horse and ball with a focus on the field of flowers.

DALL-E 3's realistic and stylized depiction of the horse with a squished ball and a movie-like quality.

Anime style illustration of a newsstand with a stormy background and text.

Midjourney's vending machine interpretation with a cool style and rain effect.

DALL-E 3's creative and detailed newsstand with vines and a stormy cloud in anime style.

Final preference for DALL-E 3 and ChatGPT over Stable Diffusion 3 for their style and coolness factor.