Audio Reactive AI Animation Masterclass ft. Cerspense

Civitai
17 Apr 2024 · 98:44

TLDR: In this comprehensive masterclass, Tyler hosts Spence, a creative professional from Runway ML, to explore the intersection of audio reactivity and AI animation. Spence shares his journey of creating visuals for musical performances, integrating technology, and diving deep into AI with tools like Disco Diffusion, StyleGAN models, and GPT-3. He demonstrates how to use Notch for real-time 3D modeling and animation, Comfy UI for rendering, and Touch Designer for audio reactive beat detection, showcasing a workflow that fuses these elements into dynamic visual experiences. Spence also discusses his experience with AI-generated music and the potential of tools like Blender, Unreal Engine's Avalanche, and open-source alternatives. The session is rich with practical advice: Spence encourages aspiring creators to find passion in their work and to lean on community resources to sharpen their craft.

Takeaways

  • 🎉 Spence, a guest on the Civitai Friday Guest Creator stream, showcased his expertise in audio reactive visuals and AI integration in creative work.
  • 🚀 Spence's work involves creating visuals for musical performances, concert tour visuals, and virtual production visuals, driven by his passion for music and technology.
  • 🤖 He is currently a researcher at Runway ML, where he explores AI applications in video generation and real-time visuals, using tools like Touch Designer and Notch.
  • 🌟 The presentation included a workflow page with downloadable resources such as Comfy UI workflows, a Touch Designer file, and loops for creative use.
  • 📚 Spence provided an overview of his background, including his shift toward AI in 2022, working with models like Disco Diffusion, StyleGAN, and Stable Diffusion.
  • 🎨 The live demonstration covered creating loops in Notch, a real-time visual effects software, and integrating them with AI-generated content in Comfy UI.
  • 🔍 Spence discussed the use of depth maps and masks in enhancing the AI image generation process, emphasizing the importance of experimentation.
  • 🎛️ He also touched on the challenges of learning node-based workflows and offered advice for overcoming the initial complexity and finding a balance between technical and creative aspects.
  • 🔊 The audio reactive component of the workflow was demonstrated using Touch Designer, highlighting the potential for creating visuals that respond to music beats.
  • 📈 Spence's approach involves a combination of pre-processing, real-time rendering, and post-processing to achieve high-quality, dynamic visual content.
  • 🌐 He encouraged the audience to explore various tools and platforms, such as Unreal Engine's Avalanche, Blender, and Max/MSP, to expand their creative possibilities.

Q & A

  • What is the main topic of the masterclass?

    -The main topic of the masterclass is audio reactive AI animation, featuring a guest Creator named Spence who discusses his work and provides insights into creating visuals for musical performances using technology.

  • What are some of the software tools mentioned in the transcript?

    -Some of the software tools mentioned include Notch, Touch Designer, Runway ML, and Cinema 4D, along with AI models like Disco Diffusion, Stable Diffusion, and GPT-3, and other tools for 3D modeling and animation. (Silent Partner Studio, also mentioned, is the studio Spence worked with rather than a tool.)

  • What does Spence do at Runway ML?

    -Spence works at Runway ML as a researcher and also creates art for himself and for clients. He is involved in creating audio reactive visuals and has developed custom systems and technical integrations for creative possibilities in various shows.

  • How does Spence use AI in his work?

    -Spence uses AI in his work by training and fine-tuning models like Stable Diffusion and GPT-3. He integrates image generation workflows with Touch Designer to automate processes and create custom systems to express his ideas.

  • What is the role of Notch in Spence's workflow?

    -Spence uses Notch for real-time 3D modeling and animation. It lets him create graphics quickly and in a way that is more intuitive and fun than working in a game engine. Notch is also widely used to create audiovisual graphics for big tours and shows.

  • What is the significance of the workflow page and files provided by Spence?

    -The workflow page and files provided by Spence include different Comfy UI workflows and a Touch Designer file. These resources let participants download and follow along with the masterclass and learn how to create their own audio reactive animations.

  • How does Spence approach learning and mastering new software tools?

    -Spence suggests starting by looking at existing workflows, tweaking them, and then building one from scratch. He emphasizes the importance of troubleshooting and understanding the system of how nodes connect and work together rather than knowing every single node.

  • What is the purpose of the audio reactive setup in Touch Designer?

    -The audio reactive setup in Touch Designer is used to create visuals that respond to audio cues, such as speeding up animations when a kick or snare is detected in the music. This allows for a synchronized audio-visual experience.
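
A minimal sketch of that idea in plain Python, assuming the librosa library rather than Spence's actual Touch Designer network: detect low-band (kick-like) onsets in a track and turn them into a playback-speed envelope. The file name, band limit, and decay constants are placeholders.

```python
import librosa
import numpy as np

# Load the track (placeholder file name).
y, sr = librosa.load("track.wav")

# Onset strength computed on a low-band mel spectrogram (~kick range);
# extra keyword arguments are forwarded to the mel spectrogram.
kick_env = librosa.onset.onset_strength(y=y, sr=sr, fmax=150)

# Treat strong envelope peaks as kick hits.
threshold = kick_env.mean() + 2 * kick_env.std()
hit_frames = np.flatnonzero(kick_env > threshold)
hit_times = librosa.frames_to_time(hit_frames, sr=sr)

# Build a speed curve: jump to 3x on each hit, then decay back to 1x.
fps = 30
n = int(librosa.get_duration(y=y, sr=sr) * fps)
speed = np.ones(n)
for t in hit_times:
    i = int(t * fps)
    decay = 3.0 * np.exp(-0.2 * np.arange(n - i))
    speed[i:] = np.maximum(speed[i:], decay)

print(f"{len(hit_times)} kicks detected, peak speed {speed.max():.1f}x")
```

In Touch Designer the equivalent shape comes from wiring the audio analysis operators into the video's speed parameter, as demonstrated in the stream.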

  • How does Spence use MIDI controllers in his setup?

    -Spence uses a MIDI controller to manually adjust parameters in real-time, allowing him to change the speed of animations, control the threshold of kick and snare detection, and even manually trigger effects to match the beat of the music.
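
As a hedged illustration of the MIDI side, here is a sketch using the mido library (not Spence's actual setup); the CC-to-parameter mapping is invented for the example, and the default input port is used.

```python
import mido

# Hypothetical mapping of controller knobs (CC numbers) to parameters.
CC_MAP = {1: "anim_speed", 2: "kick_threshold", 3: "snare_threshold"}
params = {name: 0.5 for name in CC_MAP.values()}

with mido.open_input() as port:  # opens the default MIDI input port
    for msg in port:
        if msg.type == "control_change" and msg.control in CC_MAP:
            # CC values arrive as 0-127; normalize to 0.0-1.0.
            params[CC_MAP[msg.control]] = msg.value / 127.0
            print(params)
```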

  • What is the advantage of using Spout protocol to send data from Touch Designer to Notch?

    -Spout shares GPU textures between applications on the same machine in real time. This lets Spence composite visuals and effects in Notch based on the audio reactive parameters set in Touch Designer.

  • How does Spence ensure that his visuals are in sync with the music?

    -Spence uses audio analysis in Touch Designer to detect beats, kicks, and snares. He then maps these detections to the speed and triggers of his animations to ensure that the visuals change and respond in sync with the music.

  • What is the importance of the community in learning and growing as a creative professional?

    -The community is crucial for sharing work, getting feedback, and staying inspired. Spence emphasizes the importance of posting work in the right communities, such as Facebook groups or forums related to the tools and techniques he uses, which can lead to opportunities and collaborations.

Outlines

00:00

🎥 Introduction to the Guest Creator Stream

The video begins with the host, Tyler, welcoming viewers to the Friday guest creator stream. He introduces Spence, a guest with a background in creating visuals for musical performances. Spence has worked with Tyler on personal projects and is currently employed at Runway ML, where he focuses on audio reactive projects. Tyler encourages viewers to submit questions through the chat and promises a comprehensive presentation on creative workflows involving AI and real-time visuals.

05:02

🎨 Spence's Creative Journey and Workflow Overview

Spence gives a brief introduction about his decade-long career in visual creation, primarily for musical performances. He discusses his transition from using Cinema 4D to creating concert tour visuals and virtual production for Silent Partner Studio. His interest in AI was piqued in 2022, leading him to explore tools like Disco Diffusion, StyleGAN models, and GPT-3. He also mentions his work with Touch Designer, a node-based program for real-time visuals, and his creation of a fine-tuned model called ZeroScope. The workflow includes using multiple programs, starting with Notch for 3D modeling and animation, then moving to Comfy UI for rendering, and finally, Touch Designer for audio reactivity and compositing.

10:03

🛠️ Exploring Notch and Creating Loops

The tutorial shifts to using Notch, a real-time visual effects software, to create 3D models and animations. Spence demonstrates how to generate graphics quickly and intuitively by using a cloner to duplicate objects like spheres and applying textures to them. He shows how to create loops by manipulating the movement of the image and discusses techniques for achieving randomness and scale. Spence also addresses the Twitch audience's questions about looping noises in Notch and provides alternatives to using the software.

15:05

🔄 Perfecting Loops and Overcoming Learning Curves

Spence continues to refine the loops in Notch, adjusting parameters to achieve the desired visual effects. He discusses the process of finding a good loop and the importance of rendering visuals quickly. The conversation touches on the challenges of learning node-based workflows and offers advice for overcoming the initial fear and complexity. Spence suggests starting with existing workflows, tweaking them, and gradually building confidence through troubleshooting and customization.

20:05

🎨 Notch's 3D Rendering and Exploring Effects

The host explores additional features of Notch, such as its 3D rendering capabilities and the ability to add 2D effects to 3D visuals. Spence demonstrates the software's potential for creative expression and its advantages over other rendering programs. He also discusses the benefits of using Notch, including its real-time rendering and the fun, intuitive experience it provides for creators.

25:06

📚 Rendering and Post-Processing with Comfy UI

After rendering loops in Notch, the process moves to Comfy UI for further refinement. Spence shows how to load videos, adjust settings for efficient loading and rendering, and use image references to guide the AI in creating visuals. He also discusses techniques for upscaling and enhancing the quality of the final output, emphasizing the importance of starting at a lower resolution for better composition and faster testing.

30:07

🤖 Controlling Creativity with ControlNets and Parameters

The workflow now involves using ControlNets to manage the AI's image generation process in Comfy UI. Spence explains the use of different ControlNets for depth and motion prediction, which help achieve smoother transitions and maintain color integrity. He also addresses a question about the 'clip skip' parameter and its effect on how the model interprets prompts. The goal is a setup that balances creativity with control, generating visuals that adhere closely to the reference images.
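
For readers outside Comfy UI, a rough equivalent of depth conditioning can be sketched with the diffusers library; the model IDs, prompt, and conditioning scale below are illustrative placeholders, not Spence's settings.

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Load a public depth ControlNet and attach it to an SD 1.5 pipeline.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

depth = load_image("depth_frame.png")  # placeholder depth frame
image = pipe(
    "glowing chrome spheres, studio lighting",
    image=depth,
    num_inference_steps=20,
    controlnet_conditioning_scale=0.8,  # lower = more creative freedom
).images[0]
image.save("frame_out.png")
```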

35:09

📈 High-Resolution Rendering and Efficiency

Spence demonstrates how to render high-resolution videos in Comfy UI, emphasizing efficient rendering settings. He uses a high-res fix script to upscale the visuals and applies a ControlNet to guide the upscaling pass. The video shows the speed and quality of the rendering, highlighting the capabilities of the AI model and the potential for detailed, high-quality output.
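
The high-res fix pattern itself is simple to sketch outside Comfy UI; assuming diffusers, a two-pass version looks like this (model ID, file names, prompt, and strength value are placeholders):

```python
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers.utils import load_image

# Second pass of the "hires fix": upscale the first-pass image, then
# run a low-denoise img2img pass to re-add detail at the new size.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

low_res = load_image("first_pass.png")  # e.g. a 512x512 first render
upscaled = low_res.resize((low_res.width * 2, low_res.height * 2))

final = pipe(
    "glowing chrome spheres, studio lighting",
    image=upscaled,
    strength=0.4,  # low denoise keeps the composition, adds detail
).images[0]
final.save("hires_out.png")
```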

40:09

🔁 Automating Content Creation with Custom Nodes

To automate the process of generating content, Spence introduces custom nodes from his CSP nodes pack, which allow for iterating through directories of videos and images. He sets up a system to automatically load and process multiple files, creating a continuous loop of content generation. This approach enables the rendering of numerous videos overnight, providing a wealth of material to review and select from later.
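
Stripped of the node wrapper, the iteration logic is just a directory walk. A minimal sketch, with process_video as a hypothetical stand-in for the actual Comfy UI render step and a placeholder directory:

```python
from pathlib import Path

def process_video(path: Path) -> None:
    # Stand-in for queueing the file through the rendering pipeline.
    print(f"queueing {path.name} for rendering ...")

source_dir = Path("renders/loops")  # placeholder directory of loops
for video in sorted(source_dir.glob("*.mp4")):
    process_video(video)  # runs unattended, e.g. overnight
```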

45:10

🎛️ TouchDesigner: The Power of Node-Based Real-Time Graphics

Spence transitions to TouchDesigner, a node-based program for real-time visuals. He explains its capabilities, such as controlling lasers and lights and talking to APIs, and how working in nodes parallels writing code. The host provides a link to download TouchDesigner and the workflow file shared during the stream. Spence loads an audio file and demonstrates how audio analysis can detect beats, which can be visualized and used to drive audio reactivity in the visuals.

50:12

🎧 Audio Reactive Visuals with TouchDesigner

The tutorial focuses on creating audio reactive visuals in TouchDesigner. Spence shows how to manipulate video playback speed based on the detection of specific audio elements like kicks and snares. He uses a combination of nodes to create an audio reactive setup that changes the speed and appearance of the visuals in sync with the music. The process involves mapping audio analysis output to video speed parameters and using additional nodes for visual effects like bloom and color correction.
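
Raw analysis values are too jumpy to drive a parameter directly, so setups like this typically smooth them with a fast-attack, slow-release filter (Touch Designer's Lag CHOP does this natively). A tiny sketch of that mapping, with invented constants:

```python
def lag(previous: float, target: float,
        attack: float = 0.6, release: float = 0.05) -> float:
    # Fast attack when the signal rises, slow release when it falls.
    k = attack if target > previous else release
    return previous + k * (target - previous)

# Example: a single kick hit followed by silence.
analysis = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
speed = 0.0
for value in analysis:
    speed = lag(speed, value)
    print(f"speed: {1.0 + 2.0 * speed:.2f}x")  # map 0-1 to 1x-3x
```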

55:12

📹 Finalizing the Audio Reactive Workflow

Spence concludes the audio reactive workflow by demonstrating how to composite different video elements based on audio cues. He uses a series of nodes to delay frames, create outlines, and apply effects that respond to the audio beat. The process results in a dynamic visual that changes in real-time with the music. Spence also shows how to render the final video with integrated audio and discusses the limitations of the free version of TouchDesigner.
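
The frame-delay-plus-outline trick can be sketched outside Touch Designer too; here is a rough OpenCV version, with the file name, buffer length, and edge-detection thresholds as placeholders:

```python
from collections import deque
import cv2

# Buffer frames, pull a delayed one, extract its edges, and add the
# outline over the live frame.
cap = cv2.VideoCapture("loop.mp4")  # placeholder file
buffer = deque(maxlen=12)           # roughly a 12-frame delay

while True:
    ok, frame = cap.read()
    if not ok:
        break
    buffer.append(frame)
    delayed = buffer[0]  # oldest frame in the buffer
    edges = cv2.Canny(cv2.cvtColor(delayed, cv2.COLOR_BGR2GRAY), 100, 200)
    outline = cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR)
    composite = cv2.add(frame, outline)  # additive blend of the outline
    cv2.imshow("composite", composite)
    if cv2.waitKey(33) == 27:  # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```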

1:00:12

🌟 Wrapping Up and Future Creative Software

The session wraps up with a discussion about other creative software that has potential for real-time graphics and audio reactivity. Spence recommends tools like Unreal Engine's Avalanche, Cables, and Toolbag3, and emphasizes the importance of finding a passion project to drive the learning process. He also encourages posting work in the right communities and continually practicing with various tools to gain proficiency. The host and Spence reflect on the complexity and creativity involved in building such comprehensive workflows.

1:05:14

📚 Providing Resources for Further Exploration

Spence provides information about the contents of the zip file available on the workflow page, which includes masks for use in Notch, an audio reactive time setup in TouchDesigner with detailed comments, and a more organized version of the workflow used during the stream. The host reminds viewers of upcoming streams and guests, encouraging them to follow Spence on social media and explore the tools and techniques demonstrated in the stream.

Keywords

Audio Reactive

Audio reactive refers to a type of animation or visual effect that responds to sound in real-time. In the video, Spence demonstrates how to create audio reactive animations using various software tools. It's a key concept as it ties the visuals to the audio, creating a synchronized and dynamic experience.

Notch

Notch is a real-time visual effects software used for creating graphics quickly and intuitively. It is mentioned as one of the primary tools Spence uses to generate visuals from scratch. Notch is significant in the video as it allows for the creation of complex 3D animations that can be further manipulated in other software for added effects.

Touch Designer

Touch Designer is a node-based program used for creating real-time visuals and interactive multimedia content. It is highlighted in the video as a tool that Spence uses to automate image generation workflows and create custom systems for expressing his ideas. It's integral to the workflow as it interacts with APIs and allows for the integration of AI models like GPT-3.

Runway ML

Runway ML is a platform that Spence works for, which focuses on machine learning models for creative applications. It is mentioned in the context of Spence's professional work, indicating his involvement with AI and machine learning in the creative process, which is a central theme in the video.

Stable Diffusion

Stable Diffusion is a type of AI model that Spence started using for image generation. It is part of the discussion on how AI is being integrated into the creative process for generating visuals. The term is significant as it represents the cutting-edge technology that enables the creation of complex visuals as showcased in the video.

Cinema 4D

Cinema 4D is a professional 3D modeling, animation, and rendering software package. Spence mentions using Cinema 4D in the past for creating visuals, particularly for musical performances. It's an important keyword as it represents the traditional tools that have been part of Spence's journey into more advanced and integrated systems.

AI Training

AI Training refers to the process of teaching AI models to perform tasks by providing them with data. Spence talks about training StyleGAN models, which is a form of AI training. This keyword is significant as it shows the depth of AI integration in the creative process, moving beyond ready-made tools to model development and machine learning.

Real-Time Visuals

Real-Time Visuals are graphics or animations that are rendered and displayed as they are being created, without any delay. This concept is central to the video as Spence discusses the importance of real-time rendering in his work and how it has influenced his choice of tools and software.

Workflow Page

The Workflow Page is a resource that Spence uploaded, containing different Comfy UI workflows and a Touch Designer file. It is mentioned as a way for viewers to download and explore the tools and setups used in the video. The Workflow Page is a practical resource that allows viewers to engage with the content on a deeper level.

Node-Based Program

A Node-Based Program is a type of software that uses a visual programming interface where operations are represented by nodes. Spence discusses Touch Designer being a node-based program, which is significant as it allows for a high level of customization and the ability to create complex systems for generating visuals.

Interpolation

Interpolation is a method of generating intermediate values between known points in a dataset. In the video, Spence uses a program called Flowframes to interpolate video frames, creating smooth slow-motion effects. This keyword is important as it demonstrates one of the techniques used to enhance the visual output.
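
For intuition, the simplest possible frame interpolation is a linear cross-fade between two frames; flow-based tools like Flowframes instead move pixels along estimated motion vectors, avoiding the ghosting a plain blend produces. A minimal numpy sketch:

```python
import numpy as np

def interpolate(frame_a: np.ndarray, frame_b: np.ndarray, t: float) -> np.ndarray:
    """Return the in-between frame at position t in [0, 1]."""
    blend = (1.0 - t) * frame_a.astype(np.float32) + t * frame_b.astype(np.float32)
    return np.clip(blend, 0, 255).astype(np.uint8)

black = np.zeros((4, 4, 3), dtype=np.uint8)
white = np.full((4, 4, 3), 255, dtype=np.uint8)
middle = interpolate(black, white, 0.5)  # a mid-grey frame
```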

Highlights

Spence, a guest on the stream, works for Runway ML and specializes in audio reactive projects.

Spence has created a presentation and workflow page with downloadable Comfy UI workflows and a Touch Designer file.

Spence's introduction to using technology for visual expression through music.

The use of AI in video creation, specifically integrating image generation workflows with Touch Designer.

An overview of Spence's professional experience, including concert tour visuals and virtual production.

The introduction of the Notch software and its application in real-time 3D modeling and animation.

Tips for learning node-based programs like Comfy UI and Touch Designer.

A live demonstration of creating a seamlessly loopable animation in Notch, including how to handle looping noise.

The transition from Notch to Comfy UI for rendering animations, followed by audio reactive beat detection in Touch Designer.

A detailed explanation of the Touch Designer interface and its capabilities.

The process of automating video creation using AI and node-based workflows.

Spence's recommendation of other creative software, such as Unreal Engine's Avalanche and Toolbag3.

Advice for beginners in learning creative coding and graphic software.