Dall-E 3, Sora, & ChatGPT Plus: Stable Audio vs Suno v3 & New Video Generator!
TLDR: In this week's AI news, OpenAI introduces in-painting for DALL-E 3, a feature that arrives later than expected. Stability AI releases Stable Audio 2.0, offering free music generation, though it lags behind Suno in quality. ChatGPT 3.5 becomes accessible without login, and the first music video made with Sora, 'Worldweight,' is released. Additionally, AniPortrait emerges as a promising portrait animator, and Higgsfield, a new video generator, is on the horizon.
Takeaways
- 🚀 OpenAI has introduced an in-painting feature in DALL-E 3, albeit later than expected.
- 🎨 Users can now edit DALL-E 3 generated images directly, but the process is not as intuitive as expected.
- 🍞 An example given in the script is adding butter to a piece of toast generated by DALL-E 3, which results in an image with an excessive amount of butter.
- 💬 Stability AI released Stable Audio 2.0, capable of creating full musical tracks up to 3 minutes from a single prompt and offering 20 free credits per month.
- 🎵 Suno, another AI music generator, is considered superior in terms of audio quality and instrumentation, and also supports singing.
- 🆓 OpenAI now offers free access to ChatGPT 3.5 without the need for login, providing a still-capable model for public use.
- 🎶 A news article was used to generate an 8-line poem with ChatGPT, showcasing its creative capabilities.
- 📹 The first music video created with Sora, 'Worldweight' by August Kamp, has been released, featuring a consistent aesthetic and an ambient electronic track.
- 🌟 Sora's capabilities are compared to Haiper, a free tool that can generate similar outputs when combined with additional elements like overlays and textures.
- 🎭 AniPortrait is a new tool inspired by EMO, the emotive avatar talker, using a reference photo and video to generate high-quality character animations.
- 🔜 Higgsfield, a new video generator led by Alex Mashrabov, is in beta with a focus on improving video editing and character modification.
Q & A
What long-overdue feature has been added to DALL-E 3?
-The new feature added to DALL-E 3 is in-painting, which allows users to edit images by adding or changing elements directly within the image.
What was the speaker's initial impression of DALL-E 3's output aesthetic?
-The speaker was not a huge fan of DALL-E 3's output from an aesthetic standpoint; it did not resonate with them as much as they had expected.
How does the integration of DALL-E 3 with ChatGPT affect the user experience?
-The integration of DALL-E 3 with ChatGPT allows users to chat with their image generator, providing a more interactive and dynamic experience when creating and editing images.
What is the limitation when trying to add a small amount of butter to the toast in DALL-E 3?
-When asked to add a small amount of butter to the toast, DALL-E 3 produces an image with an excessive amount of butter, showing that the edit is not as intuitive or controllable as one might expect.
What is the unique feature of Stable Audio 2.0 compared to other AI-generated music platforms?
-Stable Audio 2.0's unique feature is that it allows users to add their own audio as a reference for the AI to create music, offering a more personalized and creative output.
How does the new video model, Sora, differ from other AI video generators like Haiper?
-The Sora music video shows a distinctive aesthetic, including long tracking shots and vintage film looks. However, it's noted that similar results can be achieved with other platforms like Haiper when combined with additional elements.
What is the significance of EMO, the emotive avatar talker, and the new AniPortrait technology?
-EMO and AniPortrait are significant because they improve upon the traditional bobblehead avatar lip-sync look by using reference videos to create more natural and emotive character animations.
What is Higgsfield, the new video generator mentioned in the script, planning to offer?
-Higgsfield plans to offer an improved video editor that will enable users to modify characters and objects in videos, and to train a more powerful video generation model, aiming to enhance the quality of and control over AI-generated videos.
What is the speaker's upcoming event related to AI filmmaking?
-The speaker will be attending the Curious Refuge AI filmmaking Mega party on April 15th, where they will be judging the world's first AI Esports tournament alongside other notable figures.
What does the speaker suggest as an alternative to Sora for creating similar video outputs?
-The speaker suggests using Haiper, a free platform that has upgraded its model to generate up to 4 seconds of video, as an alternative to Sora for creating similar video outputs.
How does the script highlight the rapid advancements in AI technology?
-The script highlights the rapid advancements in AI technology by discussing new features and updates across various platforms, such as DALL-E 3's in-painting, Stable Audio 2.0's music generation, and the upcoming Higgsfield video generator.
Outlines
🚀 OpenAI Updates and New Tools
This paragraph discusses recent updates in the AI field, focusing on OpenAI's new features and tools. It highlights the long-awaited in-painting feature in DALL-E 3, which allows users to edit images directly. The speaker shares their mixed feelings about DALL-E 3's aesthetics and functionality, especially regarding its interaction with ChatGPT. The paragraph also touches on the new audio update from Stability AI, which generates full musical tracks from a single prompt, and compares it with the capabilities of Suno, another AI music generation platform. The speaker provides a demo of the generated music and discusses the advantages of Stability AI's audio reference feature.
🎶 AI Music Generation and Sora News
The second paragraph covers AI-generated music, discussing the latest updates from Stability AI's Stable Audio 2.0 and comparing them to Suno's music generation capabilities. It also introduces the new feature of adding singing to Suno's output. The speaker then shifts focus to Sora, discussing the first music video created with this tool. The video, 'Worldweight' by August Kamp, is described along with the speaker's thoughts on the visual aesthetics and the potential for creative use of the tool. The paragraph ends with a brief mention of the speaker's upcoming events and a comparison between Sora and Haiper, a free AI video tool.
🎨 New Portrait Animators and Upcoming Video Model
This paragraph introduces AniPortrait, a new tool inspired by EMO, the emotive avatar talker, which uses a combination of reference photos and videos to create realistic animations. The speaker describes a use case from Visible Maker, where a character created in Midjourney was upscaled and had its voice generated by ElevenLabs. The paragraph concludes with news about an upcoming video generation platform called Higgsfield, led by Alex Mashrabov, former head of AI at Snap. Higgsfield aims to improve video editing by allowing modifications to characters and objects and by training a more powerful video generation model. The speaker expresses enthusiasm for Higgsfield's lean approach and provides information on how to sign up for the beta.
Keywords
💡AI news
💡DALL-E 3
💡Stable Audio 2.0
💡Suno
💡ChatGPT 3.5
💡Sora
💡Haiper
💡AniPortrait
💡Higgsfield
💡AI filmmaking
💡AI-generated music
Highlights
OpenAI introduces an in-painting feature in DALL-E 3, a long-awaited update.
DALL-E 3's in-painting is not as intuitive as one might expect, requiring manual selection and editing.
Despite the presenter's personal aesthetic reservations, the new in-painting feature represents a step forward for DALL-E 3.
Setting aside Stability AI's company drama, the focus is on Stable Audio 2.0, which generates full musical tracks up to 3 minutes long from a single prompt.
Stable Audio 2.0 is free, offering 20 credits per month for users to create music.
Suno, another AI music generator, is highlighted as the current leader in AI-generated music, surpassing Stable Audio in quality and features.
Stable Audio's unique feature of adding personal audio references for music generation is noted.
ChatGPT 3.5 can now be used for free without logging in, and it remains a capable model.
The first music video created with Sora, titled 'Worldweight' by August Kamp, is released.
The visual aesthetics of the 'Worldweight' music video are praised, and similar looks are shown to be achievable with Haiper.
A new video model, Higgsfield, is on the horizon, led by Alex Mashrabov, the former head of AI at Snap.
Higgsfield aims to build an improved video editor and a more powerful video generation model.
The AniPortrait animator is introduced, offering a new approach to character animation.
Visible Maker demonstrates a creative workflow combining various AI tools to create a character and voice.
AI technology continues to advance, with a quiet week suggesting an upcoming flood of new developments.
The presenter, Tim, will be attending the NAB show and judging the world's first AI Esports tournament.
A new emotive avatar tool, AniPortrait, is available, offering a different approach from EMO.
Higgsfield AI is in beta, with a focus on video editing and character modification.