Audio Reactive AI Animation Masterclass ft. Cerspense

17 Apr 2024 · 98:44

TLDR: In this masterclass, Tyler hosts Spence, a creative professional with a decade of experience in music visualization and audio reactive visuals. Spence, who works for Runway ML, shares his expertise in using AI for video generation and real-time visuals, particularly with TouchDesigner and Notch. He demonstrates how to create audio reactive animations by integrating AI-generated visuals with beat detection to synchronize with music. The session includes a sound check, an introduction to Spence's background, and a step-by-step guide through his creative process using various software. Spence also discusses his journey into AI, his work with GAN models, and his development of a fine-tuned model called ZeroScope. The class concludes with a Q&A, offering insights into Spence's workflow and the tools he recommends for real-time graphics creation.


Takeaways

  • 🎉 The session is an Audio Reactive AI Animation Masterclass featuring Spence, a prominent figure in the field.
  • 👋 Introduction to Spence's background in creating visuals for musical performances and his journey into technology and AI.
  • 🚀 Overview of Spence's workflow, including the use of Notch, ComfyUI, and TouchDesigner in his creative process.
  • 🎨 Discussion on the integration of AI with image generation workflows and the utilization of GPT-3 and StyleGAN models.
  • 💡 Insight into the use of Notch, a real-time visual effects software, for creating graphics quickly and intuitively.
  • 🎵 Explanation of how audio reactivity works in conjunction with visuals, enhancing the overall audio-visual experience.
  • 🌐 Mention of the availability of Spence's workflow page with downloadable resources for further exploration.
  • 📈 Spence's advice for those hesitant to start with node-based workflows, encouraging learning through tutorials and community interaction.
  • 🖌️ TouchDesigner's capabilities in real-time visuals, its node-based interface, and its potential for creative expression.
  • 🔗 Discussion on alternative software like Blender and the potential for using geometry nodes for creating intricate visuals.
  • 🎥 Spence's workflow demonstration, showing the step-by-step process of creating an audio reactive animation from scratch.

Q & A

  • What is the main topic of discussion in the provided transcript?

    -The main topic is creating audio reactive AI animations, with a focus on the tools and techniques used by the guest, Spence, who works with Runway ML and has experience in visual creation for musical performances.

  • What software does Spence use for real-time 3D modeling and animation?

    -Spence uses Notch for real-time 3D modeling and animation.

  • How does Spence integrate AI into his visual creation process?

    -Spence integrates AI into his visual creation process by driving image generation workflows from TouchDesigner, which automates them and brings them into the video space. He also uses AI models like Disco Diffusion, StyleGAN, and GPT-3 for various creative tasks.

  • What is the significance of using a node-based program like TouchDesigner in Spence's workflow?

    -TouchDesigner is significant in Spence's workflow because it is designed for real-time visuals and can interact with APIs, allowing for the generation of complex visuals and the automation of creative processes. It also enables the integration of multiple data streams and the creation of custom systems for expressing ideas.

  • Why is Notch an important tool for Spence's work?

    -Notch is important because it allows for the creation of graphics quickly and in real-time, which is more intuitive than game engines. It is also used in many of the biggest tours in the world, making it a valuable skill for professional visual creators.

  • How does Spence use the concept of 'masks' in his creative process?

    -Spence uses masks as depth maps in his creative process. Even though they are black and white, they are input as depth maps to create visually interesting effects and help generate the final composite.

  • What is the role of ControlNet in Spence's workflow with AI image generation?

    -ControlNet is used to smooth out the motion in the generated video, predict the next frame of video, and keep the colors in check to prevent them from blowing out or becoming grainy.

  • How does Spence approach the challenge of learning and using node-based software?

    -Spence suggests starting by looking at and tweaking existing workflows, then building a simple one from scratch. He emphasizes troubleshooting and understanding the connections between nodes rather than knowing what every single node does.

  • What is the purpose of the 'Vid Dir Iterator' node that Spence mentions?

    -The 'Vid Dir Iterator' node iterates through a directory of videos and provides the path of each video in it. This allows for automation in selecting and processing multiple videos.

  • How does Spence use audio reactivity in his animations?

    -Spence uses audio reactivity to modulate the speed of his animations based on the beat of the music. He uses the kick and snare channels from the audio analysis to trigger faster movements in the animation.

  • What are some alternative tools or software that Spence recommends for real-time graphics generation?

    -Spence recommends tools like Unreal Engine's Avalanche features, Cables, Toolbag, and Blender for real-time graphics generation, depending on the specific needs and goals of the project.
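
The kick- and snare-driven speed changes described in the Q&A above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Spence's TouchDesigner network: the function name, the boost amounts, and the decay factor are all hypothetical, and the kick and snare inputs are assumed to be per-frame trigger values (1 on a hit, 0 otherwise) that some audio analysis step has already produced.

```python
def playback_positions(kick, snare, base_speed=1.0,
                       kick_boost=3.0, snare_boost=1.5, decay=0.8):
    """Return the source-clip playhead position for each output frame.

    kick and snare are per-frame triggers (truthy on a hit). All parameter
    values are illustrative assumptions, not settings from the stream.
    """
    boost = 0.0      # extra speed currently added on top of base_speed
    position = 0.0   # playhead position, measured in source frames
    positions = []
    for k, s in zip(kick, snare):
        # A hit raises the boost to at least that channel's amount.
        if k:
            boost = max(boost, kick_boost)
        if s:
            boost = max(boost, snare_boost)
        # Advance the playhead faster while a boost is active.
        position += base_speed + boost
        positions.append(position)
        # Let the boost fade so the motion eases back to base speed.
        boost *= decay
    return positions
```

Each returned value could be used to pick which frame of a pre-rendered loop to show, so the loop surges forward on each hit and then settles back to its normal pace.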



Outlines

🎥 Introduction to the Guest Creator Stream

The video script begins with an introduction by Tyler, the host, who welcomes viewers to a special guest creator stream. He mentions that they will cover a lot of content with the guest, Spence, and encourages viewers to submit questions through the chat. Spence has prepared a presentation and workflow page, which includes ComfyUI workflows and a TouchDesigner file, all accessible via a link shared in the chat. Tyler shares his experience working with Spence and mentions Spence's work with Runway ML on audio reactive projects. The introduction ends with a sound check and Spence giving a brief background about his work in visual creation for musical performances, his journey with technology, and his recent exploration of AI for video generation.


🎨 Spence's Creative Process and Notch Software Overview

Spence delves into his creative process, starting with his use of Notch, a real-time visual effects and graphics software. He explains that Notch is used for major tours and is niche, which can lead to significant opportunities for those who master it. The paragraph also covers alternatives to Notch, such as Blender, and the process of creating loops and visuals that can be used in other software. Spence demonstrates creating a loop in Notch and discusses the workflow he uploaded, which includes masks for depth mapping in animations.


🤖 AI and Node-Based Workflows with TouchDesigner

The discussion shifts to AI and node-based workflows, with a focus on TouchDesigner, a node-based program for real-time visuals. Spence talks about his transition from Cinema 4D to TouchDesigner due to faster rendering times. He advises viewers on overcoming the initial overwhelm of node-based systems by starting with existing workflows and gradually building confidence through troubleshooting. Spence also shares his process of integrating AI into his creative work, hinting at the potential for expressive control and automation in artistry.


🌟 Exploring 3D Rendering and Real-Time Visual Effects

Spence showcases the capabilities of Notch for 3D rendering and real-time visual effects. He discusses the software's powerful rendering capabilities and its use for creative expression. The paragraph includes a demonstration of creating a loop with Notch, adjusting materials and lighting, and experimenting with different effects. Spence also addresses a question about loopable noises in Notch and shares his approach to creating continuous loops with the software.


🔄 High-Resolution Rendering and Post-Processing

The script describes the process of rendering high-resolution loops with Notch and post-processing them for better quality. Spence talks about his preference for using the PhotoJPEG format to maintain quality and the use of a high res fix script to upscale the images. He also shares a trick for using depth maps to enhance the rendering process. The paragraph concludes with a demonstration of how the rendering process works in Notch, emphasizing the speed and efficiency of the software.


🎥 Automating Content Creation with AI and Node-Based Systems

Spence demonstrates how to automate content creation using AI and node-based systems. He discusses using his own custom nodes to iterate through directories of videos and images, automatically generating content based on these sources. The paragraph details the process of setting up the system to produce loops and how it can be left running to create a large volume of content. Spence also mentions his use of a second computer to further increase content generation.
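
The directory iteration described above can be approximated outside any node graph. The following is a minimal sketch in plain Python, not the code of Spence's custom nodes: the function names, the extension list, and the wrap-around indexing are assumptions made for illustration.

```python
from pathlib import Path

# Assumed set of extensions; adjust to whatever formats the folder holds.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def list_videos(directory):
    """Return the video file paths in a directory, sorted by name."""
    return sorted(
        str(path)
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in VIDEO_EXTENSIONS
    )


def video_at(directory, index):
    """Return the path for a given run index, wrapping past the last file."""
    videos = list_videos(directory)
    if not videos:
        raise FileNotFoundError(f"no video files found in {directory}")
    # Wrapping lets an ever-increasing counter cycle through the folder.
    return videos[index % len(videos)]
```

Incrementing the index on every queued run is one way a system like this could be left unattended: each run picks the next source video, and the counter wraps once the folder is exhausted.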


🎛️ Audio Reactive Visuals with TouchDesigner

The script then turns to Spence's use of TouchDesigner for creating audio reactive visuals. He outlines the process of setting up an audio file in TouchDesigner and using the software's node-based interface to manipulate the visuals in real-time according to the audio. Spence also discusses the potential of TouchDesigner for artists, its capabilities, and the fact that the free version is quite capable for most uses. The paragraph ends with a note on the availability of the TouchDesigner file used in the stream for viewers to experiment with.
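
As a rough picture of what driving visuals from audio involves, the sketch below turns raw audio samples into one smoothed loudness value per video frame. It is plain Python rather than TouchDesigner's API, and every name and constant in it (`frame_levels`, `smooth`, the attack and release values) is a hypothetical choice for illustration; the stream does this with nodes, not with this code.

```python
import math


def frame_levels(samples, sample_rate, fps):
    """RMS loudness of the slice of audio that falls under each video frame."""
    hop = sample_rate // fps  # audio samples per video frame
    levels = []
    for start in range(0, len(samples) - hop + 1, hop):
        window = samples[start:start + hop]
        levels.append(math.sqrt(sum(x * x for x in window) / hop))
    return levels


def smooth(levels, attack=0.6, release=0.1):
    """Envelope follower: rise quickly on loud frames, fall back slowly."""
    envelope = 0.0
    smoothed = []
    for level in levels:
        # Use the fast coefficient when rising, the slow one when falling.
        coeff = attack if level > envelope else release
        envelope += coeff * (level - envelope)
        smoothed.append(envelope)
    return smoothed
```

The smoothed values could then be scaled into whatever range a visual parameter expects, such as brightness, scale, or displacement amount. A fast attack with a slow release is a common choice because it keeps hits punchy while avoiding flicker between them.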


📚 Sharing Resources and Continuing the Learning Process

Spence shares additional resources and tools that have excited him in the creative software space, such as Unreal Engine's Avalanche setup, Cables, and Toolbag. He also recommends Blender for its community and capabilities, suggesting that it has surpassed Cinema 4D in some respects. The paragraph emphasizes the importance of passion and continuous learning in the field of graphics. Spence encourages viewers to engage with communities, share their work, and seek out tutorials to deepen their understanding of the tools.


🖥️ The Full Scope of Spence's Creative Workflow

The final paragraph provides a comprehensive look at Spence's creative workflow, which involves a complex setup of various tools and software. He discusses the integration of MIDI controllers, the process of manually adjusting parameters, and the use of Notch for compositing and additional effects. Spence also talks about his approach to recording videos in real-time and the iterative process of building his creative system. The paragraph concludes with a discussion about the balance between the technical and artistic sides of his work and the importance of finding harmony between the two.


📡 Closing Remarks and Future Streams Preview

The script ends with closing remarks from the host, Tyler, who thanks Spence for the insightful session and encourages viewers to follow him on social media. Tyler provides information about upcoming streams, including a discussion with Noah Miller about AI in film and animation, and a session with Dot Simulate, the creator of a real-time Stable Diffusion renderer for TouchDesigner. The paragraph also includes information on where to find the workflow and resources shared during the stream, and a reminder about the platform's regular streams.



Keywords

💡Audio Reactive

Audio reactive refers to a type of animation or visual effect that responds to sound in real-time. In the context of the video, the presenter is discussing how to create audio reactive animations using various software tools. This is a key concept as it ties the visual output to the audio input, creating a synchronized audio-visual experience.


💡Notch

Notch is real-time visual effects software used for creating graphics quickly and intuitively. It is mentioned as one of the primary tools used by the presenter to generate visuals from scratch. It is significant because it allows for the creation of complex 3D animations that can be manipulated in real-time, which is essential for audio reactive animations.

💡TouchDesigner

TouchDesigner is a node-based visual development platform for creating interactive 3D graphics and visual effects. It is highlighted as a tool that integrates with APIs and allows for the use of Python scripting to generate custom visuals. The presenter uses it to automate image generation workflows and create custom systems for expressing ideas, which is central to the theme of the video.

💡AI Image Generation

AI image generation involves using artificial intelligence to create images, often with the help of models like StyleGAN or Stable Diffusion. The presenter discusses integrating AI image generation with TouchDesigner to automate and innovate the process of creating visuals. This is a core part of the video's message, showcasing the potential of AI in creative fields.

💡Runway ML

Runway ML is a platform that combines machine learning models with creative applications, allowing users to use AI models for various projects. The presenter mentions working for Runway ML and doing audio reactive work, indicating the company's role in facilitating the use of AI in creative processes.

💡Stable Diffusion

Stable Diffusion is an AI model that generates images from text descriptions. It is brought up when the presenter talks about his journey into AI and how he started using different AI models for video creation. This term is important as it represents the cutting-edge technology being utilized in the workflow.


💡GPT-3

GPT-3, or Generative Pre-trained Transformer 3, is a language model used for various natural language processing tasks. The presenter mentions using GPT-3 before the advent of ChatGPT, indicating the evolution of AI tools in creative applications.

💡Cinema 4D

Cinema 4D is a professional 3D modeling, animation, and rendering software. The presenter references using Cinema 4D in the past for creating visuals, particularly for musical performances. It is an important keyword as it shows the presenter's evolution from traditional 3D modeling to real-time and AI-driven graphics.

💡Silent Partner Studio

Silent Partner Studio is mentioned as the company for which the presenter created concert tour visuals and virtual production visuals. This keyword is significant as it provides context about the presenter's professional background and the scale of projects he has worked on.

💡Workflow Page

The workflow page is a resource provided by the presenter that includes downloadable content like ComfyUI workflows and a TouchDesigner file. It is a key component in the video as it allows viewers to engage with the material practically and experiment with the tools and techniques discussed.

💡Real-Time Visuals

Real-time visuals are graphical images or animations that are rendered and displayed without any noticeable delay. The presenter emphasizes the importance of real-time visuals in the context of live performances and the benefits of using software like Notch and TouchDesigner to achieve them. This concept is central to the video's theme of creating dynamic and responsive visual content.


Highlights

Spence, a Runway ML researcher, shares his expertise in audio reactive visuals and AI integration for creative projects.

He demonstrates the use of Notch, a real-time visual effects software, for creating 3D models and animations.

Spence discusses the transition from traditional rendering to real-time rendering for efficiency and creative freedom.

He introduces TouchDesigner, a node-based program for real-time visuals, and its application in various creative fields.

The presentation includes a workflow page with downloadable resources for participants to follow along.

Spence covers the process of integrating AI models like Stable Diffusion and GPT-3 for video generation.

He provides an overview of his creative process, from initial concept to final audio-visual composite.

The tutorial showcases how to use audio reactive beat detection to synchronize visuals with music.

Spence explains the use of different software components, such as ComfyUI and TouchDesigner, in a seamless workflow.

He addresses the challenges and solutions when working with node-based workflows and real-time rendering.

The masterclass includes a sound check and an introduction to Spence's background in visual creation.

Spence demonstrates how to create custom systems and technical integrations for concert visuals.

He shares his experience in training AI models and the role it plays in his creative process.

The session provides practical advice for artists looking to expand their skills into new technologies.

Spence discusses the potential for artists to find work on major projects by mastering niche skills like Notch.

The masterclass concludes with a Q&A session where Spence addresses questions from participants.