Wild AI Video Workflow with Viggle, Kaiber, Leonardo, Midjourney, Gen-2, and MORE!

Theoretically Media
2 Apr 2024 · 11:58

TL;DR: In this video, Tim shares an AI filmmaking workflow that spans pre-production through short-film generation. Inspired by the making of the 2016 film 'Rogue One', Tim explores AI tools to create a hybrid storyboard/animatic animation. He demonstrates the process using clips from 'Gladiator' and 'John Carter of Mars', and discusses using Viggle, Midjourney, and Kaiber to generate characters, refine movements, and stylize backgrounds. The video highlights the potential of AI in filmmaking, offering a new perspective on pre-production and creative storytelling.


  • 🎬 The speaker shares an AI filmmaking workflow with potential uses from pre-production through generating short films.
  • 🚀 Inspiration came from the 2016 film Rogue One, particularly its use of a fully deep-faked character and the creative process behind it.
  • 📚 The speaker walks through creating a hybrid storyboard/animatic animation using AI tools.
  • 🎥 The workflow involves clipping reference footage, using Viggle for initial video creation, and refining with Midjourney for character design.
  • 🌟 The speaker emphasizes using a 16:9 format for image generation to ensure full-body shots.
  • 💻 Viggle 2.0 is noted for its improvements, but it struggles with camera movement.
  • 👾 The character design process uses prompts and references, such as a mix of John Carter and Warhammer 40K for the main character.
  • 🎞 Leonardo is used to fix issues with the generated footage, such as the 'invisible forearm' problem.
  • 🔄 The workflow includes Kaiber for additional video generation, using its Motion 3.0 feature for a more stylized, consistent output.
  • 🎨 Backgrounds are created with Gen-2 and Kaiber to match the character's style, with attention to adding movement for a dynamic feel.
  • 🎭 The final step composites character and background in a video editor, using chroma keying and other effects to blend them seamlessly.

Q & A

  • What is the main topic of the video?

    -The main topic of the video is an AI filmmaking workflow that spans pre-production through generating short films, using a combination of various AI tools.

  • What film inspired the creation of this workflow?

    -The 2016 film Rogue One, directed by Gareth Edwards, inspired the workflow because it was the first major film to feature a fully deep-faked character.

  • How did Colin Goudie's 2017 interview contribute to the idea?

    -In the 2017 interview, editor Colin Goudie described creating a feature-length story reel for Rogue One before the script was finished, cutting together footage from hundreds of movies to work out dialogue and timing. The idea stuck with the creator and inspired using AI tools for a similar process.

  • What AI tools are mentioned in the video?

    -The AI tools mentioned in the video include Viggle, Midjourney, Leonardo, Kaiber, and Gen-2.

  • What specific scene from a film was used as a reference for the AI generated content?

    -The 'Are you not entertained?' scene from Gladiator was used as a reference for the AI-generated content.

  • What was the process for creating the main character in the AI generated content?

    -The main character was created by using Midjourney to generate a model dressed like a character from John Carter of Mars, and then using Viggle's 2.0 update to animate the character.

  • What challenges were faced with Viggle's output?

    -Challenges with Viggle's output included stuttery movement, problems handling camera movement, and inconsistencies in the character's appearance, such as an invisible forearm in one shot.

  • How was the shaky footage issue resolved?

    -The shaky footage was fixed by taking a screenshot of Joaquin Phoenix doing the hand raise, using it in Leonardo as an image-to-image reference at a low image strength, and then bringing the result back into Viggle for further adjustments.

  • What was the final step in the workflow to enhance the cinematic feel?

    -The final step in the workflow was to bring the character and background into a video editor, such as Premiere or DaVinci, to composite them together using chroma keying, color correction, and additional cinematic touches like black bars for a faux letterbox format.

  • How was the audio for the AI generated content created?

    -The audio for the AI-generated content was created using a combination of a free site called audiogen for the crowd chanting, and another free source called Typecast for the dialogue, using the Frankenstein voice model for the voiceover.

  • What is the creator's overall assessment of this AI filmmaking method?

    -The creator believes that while the method is not perfect and may not be suitable for a full feature film, it works well for short films and can be a useful and productive tool for pre-production on large-scale projects or for independent filmmakers.
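The faux letterbox mentioned above is simple arithmetic: matte a 16:9 frame down to a wider ratio with a black bar top and bottom. A minimal Python sketch, assuming a 2.39:1 "cinemascope" target (the video doesn't specify which ratio Tim used):

```python
def letterbox_bar_height(frame_w, frame_h, target_ratio=2.39):
    """Pixel height of EACH black bar needed to matte a frame down to
    `target_ratio` (width / height). The 2.39:1 default is an assumed
    cinematic ratio, not one stated in the video.
    """
    picture_h = round(frame_w / target_ratio)  # visible picture height after matting
    if picture_h >= frame_h:
        return 0                               # frame is already at or wider than target
    return (frame_h - picture_h) // 2

bars = letterbox_bar_height(1920, 1080)  # -> 138 (px per bar on a 1080p frame)
```

In an editor this corresponds to two solid black rectangles of that height, or a pre-made letterbox overlay PNG.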



🎬 AI Filmmaking Workflow Introduction

The speaker introduces an AI-based filmmaking workflow that spans pre-production through generating short films. Inspired by Gareth Edwards' 2016 film 'Rogue One', the speaker shares a hybrid storyboard/animatic animation process. The workflow uses AI tools to create a unique storytelling experience, as demonstrated by examples from friends of the channel, and the speaker aims to save time for anyone interested in trying the method.


🌟 Developing AI-Enhanced Storyboards

The speaker discusses creating AI-enhanced storyboards from reference footage using AI tools such as Viggle and Midjourney. The goal is to generate a character model dressed in a style reminiscent of Sean Connery, and to produce a scene similar to 'Gladiator' mixed with elements from 'John Carter' and 'Warhammer 40K'. The speaker shares their experience with Viggle's 2.0 update, including its limitations with camera movement and the importance of a green-screen background for better results.


πŸ“Ή Refining AI Video Generation

The speaker elaborates on refining the AI-generated video using additional AI tools like Kaiber and Leonardo. They enhance the character model with Kaiber's Motion 3.0 feature and create a more cohesive look by integrating the background with the character. The speaker also shares tips on using video editors like Premiere and DaVinci Resolve to composite the character and background, adjust settings for better integration, and add cinematic touches like black bars and crowd-chanting audio from a free site.

🎭 Audio and Final Touches for AI Film

The speaker addresses the challenges of generating dialogue with AI, sharing unsuccessful attempts with speech-to-speech and text-to-speech methods. They eventually find success with a free source called Typecast, using the Frankenstein model to achieve a suitable voiceover. For the soundtrack, the speaker opts to create a custom 20-second loop in Ableton, despite the availability of free models from Suno. The speaker concludes by reflecting on the potential of this AI filmmaking method for pre-production and indie filmmakers, and teases upcoming workflow videos for those interested in this technique.
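Tim built his 20-second loop in Ableton rather than in code, but the thing that makes any music loop seamless, a crossfade between the clip's tail and its head, is easy to sketch. A toy Python illustration with arbitrary sample values:

```python
def make_seamless_loop(samples, fade_len):
    """Crossfade a clip's tail into its head so it loops without a click.

    `samples` is a list of float audio samples; the result is `fade_len`
    samples shorter and repeats seamlessly, because its first samples
    blend from exactly where its last sample leaves off.
    """
    body = list(samples[:-fade_len])   # everything except the tail
    tail = samples[-fade_len:]
    for i in range(fade_len):
        t = i / fade_len                           # 0 -> 1 across the fade
        body[i] = tail[i] * (1 - t) + body[i] * t  # tail fades out, head fades in
    return body

# Tiny demo on a ramp signal (real audio would be thousands of samples)
loop = make_seamless_loop([float(i) for i in range(10)], fade_len=2)
# loop -> [8.0, 5.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
```

The same idea is what a DAW's crossfade-at-loop-boundary tool does for you.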



💡AI filmmaking workflow

The term refers to the process of utilizing artificial intelligence tools and techniques to streamline and enhance various stages of film production, from pre-production to the generation of short films. In the context of the video, this workflow leverages AI to create a hybrid storyboard, animatic, and animation, demonstrating the potential of AI in generating compelling film content.


💡Deep fake

Deep fake technology uses machine learning to create realistic but fabricated audio or video in which people appear to say or do things they never did. The reference to the 2016 film Rogue One highlights the historical significance of deep fakes in major films, marking a shift toward AI-generated characters and scenes.


💡Storyboard

A storyboard is a sequence of illustrations or images displayed in the order of the scenes they represent in a film, video, or animation, serving as a visual guide for production. In the video, the speaker uses AI to create a hybrid storyboard and animatic, combining traditional methods with AI enhancements.


💡Animatic

An animatic is a preliminary version of an animation or film, often built from storyboards and basic animation to plan the timing and pacing of the final product. The video describes generating an animatic with AI, a significant advancement in the pre-production phase that allows for more dynamic and flexible planning.


💡Viggle

Viggle is an AI video tool that generates character animation from reference footage. The speaker uses its 2.0 update, which can produce dancing videos and other templated motion but struggles with camera movement, showcasing both the progress and the current limits of AI video generation.

💡Midjourney

Midjourney is the AI image generator used to create the main character model. It produces a detailed character design that fits the narrative needs of the film, a crucial step in bringing the story to life through AI-enhanced visuals.


💡Leonardo

Leonardo is an AI image tool used to enhance and refine the AI-generated content. Here it fixes problems in the video output (such as the invisible-forearm shot) via image-to-image generation, helping achieve a more polished, cinematic look for the final film.
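The "low image strength" setting used for the Leonardo fix typically controls how much of the diffusion schedule is re-run over the reference image. Leonardo's internals aren't public, so this is a sketch of the convention used by common open diffusion pipelines, not Leonardo's actual code:

```python
def img2img_steps(num_inference_steps, strength):
    """Which denoising steps actually run in an image-to-image pass.

    Convention used by common open diffusion pipelines (an assumption
    here): strength 0 returns the reference image untouched, strength 1
    regenerates it from pure noise.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    first = int(num_inference_steps * (1 - strength))  # skip the noisiest early steps
    return list(range(first, num_inference_steps))

# Low strength: only the last few refinement steps run, so the output
# stays close to the reference screenshot.
steps = img2img_steps(30, 0.2)  # -> [24, 25, 26, 27, 28, 29]
```

This is why a low strength preserves the pose and framing of the screenshot while still cleaning up details.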


💡Kaiber

Kaiber is an AI video generator known for its stylized, visually compelling output. The speaker uses Kaiber, specifically its Motion 3.0 model, to add a layer of stylization that gives the video the cohesive, unified look needed to maintain the film's aesthetic quality.

💡Chroma key

Chroma keying is a visual effects technique used to replace a specific color, usually a solid color of a green or blue screen, with another image or video. In the video, chroma keying is used to composite the AI-generated character onto a separate background, allowing for greater flexibility and creativity in the post-production process.
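As a toy illustration of what a keyer does under the hood (not what Premiere's Ultra Key actually implements), here is a minimal chroma key in pure Python; the distance threshold of 100 is an arbitrary assumption:

```python
import math

def chroma_key(foreground, background, key_color=(0, 255, 0), threshold=100.0):
    """Composite `foreground` over `background` by replacing every pixel
    whose Euclidean RGB distance to `key_color` is below `threshold`.

    Images are nested lists of (R, G, B) tuples of identical dimensions.
    Real keyers add soft edges, spill suppression, and matte cleanup.
    """
    out = []
    for fg_row, bg_row in zip(foreground, background):
        row = []
        for fg_px, bg_px in zip(fg_row, bg_row):
            if math.dist(fg_px, key_color) < threshold:
                row.append(bg_px)   # screen pixel: let the background through
            else:
                row.append(fg_px)   # subject pixel: keep the foreground
        out.append(row)
    return out

# 1x3 frame: pure green, near-green spill, and a red "subject" pixel
fg = [[(0, 255, 0), (10, 240, 10), (200, 50, 50)]]
bg = [[(128, 128, 128)] * 3]
comp = chroma_key(fg, bg)
# comp -> [[(128, 128, 128), (128, 128, 128), (200, 50, 50)]]
```

A hard threshold like this produces the crunchy edges that editor keyers smooth out with feathering and choke controls.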

💡Crowd chanting

Crowd chanting refers to the sound of a group of people shouting or cheering in unison, often used in films to create a sense of atmosphere and scale. In the video, crowd chanting is generated with a free site called audiogen, adding to the immersive, realistic feel of the AI-generated scene.

💡Text to speech

Text to speech is a technology that converts written text into spoken words, often used for voiceovers or automated announcements. In the video, the speaker attempts to use text to speech for dialogue generation but encounters challenges, ultimately opting for a different source to achieve the desired audio effect.


An AI filmmaking workflow is presented, covering pre-production through generating short films.

The inspiration comes from the 2016 film Rogue One, the first major film to feature a fully deep-faked character.

Editor Colin Goudie's 2017 interview discussed creating a feature-length story reel before the script was finished.

The idea is to use AI tools to create a hybrid storyboard/animatic animation.

The process involves clipping out reference footage and using Viggle 2.0 for the initial generation.

Viggle 2.0 can generate dancing videos but doesn't respond well to camera movement.

Midjourney is used to create a model for the main character, with a focus on full-body generation.

The output from Viggle is cleaned up using Kaiber's new Motion 3.0 feature.

Kaiber is praised as a unique AI video generator, especially with the Motion 3.0 model.

The character's arm raise was improved by using Leonardo with a screenshot and an image-to-image reference.

Backgrounds are made dynamic using Gen-2 and then stylized with Kaiber for a cohesive look.

Video editing is done in Premiere, with character and background comped together using chroma key and other adjustments.

Crowd chanting audio is generated using the free site audiogen.

Dialogue is created using Typecast, with the Frankenstein model providing a more suitable voice.

The method may not be perfect for full feature films but works well for short films and pre-production.

The presenter, Tim, is excited about incorporating other tools into this kit-bashing technique and plans to share more workflow videos.