Dall-E 3, Sora, & ChatGPT Plus: Stable Audio vs Suno v3 & New Video Generator!

Theoretically Media
4 Apr 2024 11:16

TLDR

In this week's AI news, OpenAI adds in-painting to DALL-E 3, a feature that arrived well after the initial rollout. Stability AI releases Stable Audio 2.0, which offers free music generation but lags behind Suno in quality. ChatGPT 3.5 becomes accessible without login, and the first music video made with Sora, 'Worldweight,' is released, showcasing the model's potential while raising questions about its unique value. Additionally, AniPortrait emerges as a promising tool, and Higgsfield AI, a new video generator, surfaces with a focus on character and object modification in videos.

Takeaways

  • 🎨 OpenAI has introduced an in-painting feature in DALL-E 3, allowing users to edit images by adding or changing elements within the generated scenes.
  • 🖌️ The in-painting process in DALL-E 3 is not as intuitive as expected and requires users to manually select and edit areas of the image using a selection brush.
  • 🍞 An example in the video demonstrates adding butter to a piece of toast in an image, highlighting the capabilities and limitations of DALL-E 3's image editing.
  • 🎶 Stability AI released Stable Audio 2.0, which can create full musical tracks up to 3 minutes long from a single prompt and offers 20 free credits per month for users.
  • 🎵 Suno, another AI music generation platform, is considered superior to Stable Audio in audio quality and genre adherence, and it also supports vocals.
  • 🆓 OpenAI now allows users to access ChatGPT 3.5 for free without logging in, making the technology accessible to a wider audience.
  • 🎬 The first music video created with Sora has been released, showcasing the platform's capabilities in generating visuals for music tracks.
  • 📈 A comparison between Sora and Haiper indicates that while Sora has generated a lot of buzz, Haiper's free model, combined with additional resources, may offer similar results.
  • 🤖 AniPortrait is a new tool that combines a reference photo and a reference video to generate high-quality character animations.
  • 🌐 Higgsfield AI, a new video generator, has emerged from stealth mode with a focus on improving video editing and character modification in AI-generated videos.

Q & A

  • What new feature has been added to DALL-E 3 that was previously implied but not delivered in the initial rollout?

    -The new feature added to DALL-E 3 is the in-painting capability, which allows users to edit images by adding or changing elements within the generated content.

  • What is the user's main criticism about DALL-E 3's output?

    -The user criticizes DALL-E 3's output for not being aesthetically pleasing from their personal standpoint, despite acknowledging that it has fans and use cases.

  • How does the integration of DALL-E 3 with ChatGPT affect the user's experience?

    -The integration of DALL-E 3 with ChatGPT allows users to chat with their image generator, which the user appreciates. However, they express frustration with the implementation of this feature.

  • What is the main difference between the two images generated by DALL-E 3 when prompted to add butter to a piece of toast?

    -The main difference is that the image generated by DALL-E 3 when prompted to add butter shows a piece of toast with an excessive amount of butter, which is not what the user requested.

  • What are the key features of Stable Audio 2.0?

    -Stable Audio 2.0 creates full musical tracks up to 3 minutes in length from a single prompt and offers 20 free credits per month for users to utilize the service.

  • How does the user describe the music generated by Stable Audio 2.0 compared to Suno?

    -The user describes the music generated by Stable Audio 2.0 as not bad, but says it pales in comparison to Suno, which they consider the current leader in AI-generated music due to its higher audio fidelity, better instrumentation, and composition choices.

  • What unique feature does Stable Audio offer that Suno does not?

    -Stable Audio allows users to add their own audio as a reference for generating music, which Suno does not offer.

  • What is the significance of the emotive avatar talker and the new AniPortrait in the context of character creation?

    -The emotive avatar talker and AniPortrait are significant because they provide new ways to create and customize characters with more realistic and expressive features, moving away from the bobblehead avatar lip-sync look.

  • What is the main goal of Higgsfield AI, the new video generator mentioned in the video?

    -Higgsfield AI aims to build an improved video editor that will enable users to modify characters and objects in videos, and to train a more powerful video generation model.

  • What is the user's opinion on the first music video created with Sora?

    -The user finds the first music video created with Sora to be cool, with a solid aesthetic consistency, but also notes that similar results can be achieved with other tools like Haiper.

  • What upcoming event is the speaker, Tim, excited about and why?

    -Tim is excited about the Curious Refuge AI filmmaking mega party and the world's first AI esports tournament, where he will be a judge. He looks forward to hanging out with the team from Curious Refuge and being part of these unique AI events.

Outlines

00:00

🤖 AI Updates and New Tools

This section discusses various updates in the AI field, highlighting the release of in-painting in OpenAI's DALL-E 3. It mentions the limitations of DALL-E 3's outputs and the presenter's aesthetic preferences, and compares the tool with other image generators. The section also covers Stability AI's recent release, Stable Audio 2.0, which generates full musical tracks from a single prompt, and compares it with Suno, another AI music generation platform. Additionally, it touches on the free availability of ChatGPT 3.5 and the creative potential of using AI in filmmaking and music.

05:01

🎶 Music Generation and Sora News

The focus of this section is on the advancements in AI-generated music and the capabilities of different platforms. It delves into the features of Stable Audio 2.0, its free credits, and how it compares to Suno in audio quality and genre adherence. The section also discusses the ability to add singing in Suno and to use audio references in Stable Audio. It then shifts to Sora, OpenAI's video generation model, and reviews the first music video created with it, comparing the result with Haiper, a free alternative.

10:02

🎥 Emerging AI Video and Animation Tools

This section introduces new tools and platforms in the AI video and animation space. It talks about AniPortrait, an AI tool for creating animated characters, and its application in a project by Visible Maker. The section also mentions an upcoming video generator called Higgsfield AI, led by Alex Mashrabov, and its aim to improve video editing and character modification. Lastly, it provides an update on the speaker's personal engagements, including an AI filmmaking event and an esports tournament.

Keywords

💡AI news

AI news refers to the latest updates and developments in the field of artificial intelligence. In the context of the video, it is used to set the stage for discussing recent advancements and events related to AI technology.

💡OpenAI

OpenAI is an organization that focuses on ensuring artificial general intelligence (AGI) benefits all of humanity. In the video, OpenAI is highlighted as a source of updates and new features, such as the in-painting feature in DALL-E 3.

💡DALL-E 3

DALL-E 3 is OpenAI's AI image generation tool. It creates images based on textual prompts. The video discusses the addition of an in-painting feature to DALL-E 3, which was previously implied but not initially included.

💡In-painting

In-painting is a technique used in image editing where missing or unwanted parts of an image are filled in or altered. In the context of the video, it refers to the new feature in DALL-E 3 that allows users to edit AI-generated images by adding elements like butter on toast.

💡Stability AI

Stability AI is the company known for the Stable Diffusion image models, and it also builds generative audio tools. In the video, Stability AI is mentioned in relation to its recent drama and the release of Stable Audio 2.0, a tool for creating full musical tracks from a single prompt.

💡Stable Audio 2.0

Stable Audio 2.0 is an AI-powered tool developed by Stability AI that generates full musical tracks based on a textual prompt. It represents an advancement in AI-generated music, offering users a free platform to create music with the potential for longer tracks.

💡Suno

Suno is an AI music generation platform that competes with Stability AI's music generation tools. It is known for its high-quality audio output, and its version 3 model is available even to users on the free plan.

💡Sora

Sora is OpenAI's text-to-video generation model. It was used to create the first music video made with the model, for the track 'Worldweight' by August Kamp. The video showcases the capabilities of Sora in producing visually appealing content.

💡Haiper

Haiper is a free AI tool that generates short video clips. It is mentioned as an alternative to Sora for creating videos with similar aesthetics and effects, emphasizing its accessibility and ease of use.

💡AniPortrait

AniPortrait is an AI tool inspired by the emotive avatar talker. It takes a different approach, using a reference photo and a reference video to create the final output. It represents an advancement in AI-generated portrait animation.

💡Higgsfield AI

Higgsfield AI is an upcoming AI video generation platform led by Alex Mashrabov, the former head of AI at Snapchat. It aims to build an improved video editor and a more powerful video generation model.

Highlights

OpenAI introduces an in-painting feature in DALL-E 3, a long-awaited update.

DALL-E 3's in-painting is not as intuitive as one might expect, requiring manual selection and editing.

Despite the presenter's personal aesthetic preferences, the new feature represents a step forward for DALL-E 3.

Stability AI releases Stable Audio 2.0, capable of creating full musical tracks up to 3 minutes from a single prompt.

Stable Audio 2.0 offers 20 free credits per month, making it an accessible tool for music creation.

Suno's AI-generated music surpasses Stable Audio's in audio fidelity and genre accuracy.

Stable Audio's unique feature allows users to add their own audio for reference, enhancing creativity.

OpenAI now offers free access to ChatGPT 3.5 without the need for login, expanding accessibility.

The first music video created with Sora, 'Worldweight' by August Kamp, showcases the tool's potential in visual storytelling.

The visual consistency and aesthetic of the Sora-generated video are reminiscent of high-quality productions.

Comparisons to Haiper suggest that similar results can be achieved with a combination of free tools and overlays.

Public opinion on Sora seems to be shifting, with some feeling excluded from the platform's exclusive offerings.

AniPortrait, inspired by the emotive avatar talker, uses a reference photo and video to generate high-quality animations.

Visible Maker demonstrates an innovative workflow combining various AI tools for character creation and voice generation.

Higgsfield AI, a new video generator led by former Snap AI head Alex Mashrabov, emerges from stealth mode.

Higgsfield AI aims to improve video editing by allowing modifications to characters and objects within videos.

Despite a slow week in AI news, the industry continues to advance rapidly, with numerous updates and releases.

The presenter, Tim, will be attending the Curious Refuge AI filmmaking event and judging the world's first AI Esports tournament.