Audio Reactive AI Animation Masterclass ft. Cerspense
TLDR
In this comprehensive masterclass, Tyler hosts Spence, a creative professional from Runway ML, to explore the intersection of audio reactivity and AI animation. Spence shares his journey of creating visuals for musical performances, integrating technology into live shows, and his deep dive into AI using tools like Disco Diffusion, StyleGAN models, and GPT-3. He demonstrates how to use Notch for real-time 3D modeling and animation, ComfyUI for rendering, and TouchDesigner for audio reactive beat detection, showcasing a workflow that fuses these elements into dynamic visual experiences. Spence also discusses his experience with AI-generated music and the potential of tools like Blender, Unreal Engine's Avalanche, and open-source alternatives. The session is rich with practical advice, as Spence encourages aspiring creators to find passion in their work and leverage community resources to enhance their craft.
Takeaways
- 🎉 Spence, a guest on the Civitai Friday Guest Creator stream, showcased his expertise in audio reactive visuals and AI integrations in creative work.
- 🚀 Spence's work involves creating visuals for musical performances, concert tour visuals, and virtual production visuals, driven by his passion for music and technology.
- 🤖 He is currently a researcher at Runway ML, where he explores AI applications in video generation and real-time visuals, using tools like TouchDesigner and Notch.
- 🌟 The presentation included a workflow page with downloadable resources such as ComfyUI workflows, a TouchDesigner file, and loops for creative use.
- 📚 Spence provided an overview of his background, including his shift toward AI in 2022, working with models like Disco Diffusion, StyleGAN, and Stable Diffusion.
- 🎨 The live demonstration covered creating loops in Notch, a real-time visual effects software, and integrating them with AI-generated content in ComfyUI.
- 🔍 Spence discussed the use of depth maps and masks in enhancing the AI image generation process, emphasizing the importance of experimentation.
- 🎛️ He also touched on the challenges of learning node-based workflows and offered advice for overcoming the initial complexity and finding a balance between technical and creative aspects.
- 🔊 The audio reactive component of the workflow was demonstrated in TouchDesigner, highlighting the potential for creating visuals that respond to music beats.
- 📈 Spence's approach involves a combination of pre-processing, real-time rendering, and post-processing to achieve high-quality, dynamic visual content.
- 🌐 He encouraged the audience to explore various tools and platforms, such as Unreal Engine's Avalanche, Blender, and Max/MSP, to expand their creative possibilities.
Q & A
What is the main topic of the masterclass?
-The main topic of the masterclass is audio reactive AI animation, featuring guest creator Spence, who discusses his work and shares how he creates visuals for musical performances using technology.
What are some of the software tools mentioned in the transcript?
-Tools mentioned include Notch, TouchDesigner, Runway ML, and Cinema 4D for 3D modeling, animation, and real-time visuals, along with AI models like Disco Diffusion, Stable Diffusion, and GPT-3. Silent Partner Studio, also mentioned, is a studio Spence created virtual production visuals for rather than a tool.
What does Spence do at Runway ML?
-Spence works at Runway ML as a researcher and also creates art for himself and for clients. He is involved in creating audio reactive visuals and has developed custom systems and technical integrations for creative possibilities in various shows.
How does Spence use AI in his work?
-Spence uses AI in his work by training and fine-tuning models like Stable Diffusion and GPT-3. He integrates image generation workflows with TouchDesigner to automate processes and builds custom systems to express his ideas.
What is the role of Notch in Spence's workflow?
-Spence uses Notch for real-time 3D modeling and animation. It lets him create graphics quickly, in a way he finds more intuitive and fun than game engines, and it is also used to create audiovisual graphics for big tours and shows.
What is the significance of the workflow page and files provided by Spence?
-The workflow page and files provided by Spence include several ComfyUI workflows and a TouchDesigner file. These resources let participants download and follow along with the masterclass and learn to create their own audio reactive animations.
How does Spence approach learning and mastering new software tools?
-Spence suggests starting by looking at existing workflows, tweaking them, and then building one from scratch. He emphasizes the importance of troubleshooting and understanding the system of how nodes connect and work together rather than knowing every single node.
What is the purpose of the audio reactive setup in Touch Designer?
-The audio reactive setup in TouchDesigner is used to create visuals that respond to audio cues, such as speeding up animations when a kick or snare is detected in the music, producing a synchronized audio-visual experience.
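On the stream this happens in real time via TouchDesigner's audio analysis; purely as an offline sketch of the same idea, here is how onset-driven speed changes might be prototyped in Python with librosa (the file name and the boost/decay constants are assumptions):

```python
# Offline sketch of kick/snare-style onset detection driving playback speed.
# The masterclass does this live in TouchDesigner; this is just the concept.
import librosa
import numpy as np

y, sr = librosa.load("track.wav")  # hypothetical audio file

# Onset strength envelope, then pick peaks as "hits"
onset_env = librosa.onset.onset_strength(y=y, sr=sr)
onset_frames = librosa.onset.onset_detect(onset_envelope=onset_env, sr=sr)
onset_times = librosa.frames_to_time(onset_frames, sr=sr)

def playback_speed(t, hits, boost=3.0, decay=4.0):
    """Speed rests at 1.0; each detected hit adds an exponentially decaying boost."""
    speed = 1.0
    for h in hits:
        if t >= h:
            speed += boost * np.exp(-decay * (t - h))
    return speed

# Sample the speed curve shortly after the first detected hit
print(playback_speed(onset_times[0] + 0.05, onset_times))
```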
How does Spence use MIDI controllers in his setup?
-Spence uses a MIDI controller to manually adjust parameters in real time, allowing him to change the speed of animations, control the threshold of kick and snare detection, and even manually trigger effects to match the beat of the music.
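As a small illustration of the MIDI side, reading knob values with the mido library might look like the following sketch; the CC numbers and scaling are arbitrary assumptions, not Spence's actual mapping:

```python
# Sketch: polling MIDI control-change messages to drive live parameters.
import mido

SPEED_CC = 1       # hypothetical knob mapped to animation speed
THRESHOLD_CC = 2   # hypothetical knob mapped to kick/snare threshold

params = {"speed": 1.0, "threshold": 0.5}

with mido.open_input() as port:  # opens the default MIDI input device
    for msg in port:
        if msg.type == "control_change":
            if msg.control == SPEED_CC:
                params["speed"] = msg.value / 127 * 4.0   # scale to 0..4x
            elif msg.control == THRESHOLD_CC:
                params["threshold"] = msg.value / 127     # scale to 0..1
            print(params)
```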
What is the advantage of using Spout protocol to send data from Touch Designer to Notch?
-Spout is a protocol for sharing GPU textures between applications on the same Windows machine in real time. It lets Spence pipe TouchDesigner's output into Notch, where he composites visuals and effects based on the audio reactive parameters set in TouchDesigner.
How does Spence ensure that his visuals are in sync with the music?
-Spence uses audio analysis in TouchDesigner to detect beats, kicks, and snares. He then maps these detections to the speed and triggers of his animations to ensure that the visuals change and respond in sync with the music.
What is the importance of the community in learning and growing as a creative professional?
-The community is crucial for sharing work, getting feedback, and staying inspired. Spence emphasizes the importance of posting work in the right communities, such as Facebook groups or forums related to the tools and techniques he uses, which can lead to opportunities and collaborations.
Outlines
🎥 Introduction to the Guest Creator Stream
The video begins with the host, Tyler, welcoming viewers to the Friday guest creator stream. He introduces Spence, a guest with a background in creating visuals for musical performances. Spence has worked with Tyler on personal projects and is currently employed at Runway ML, where he focuses on audio reactive projects. Tyler encourages viewers to submit questions through the chat and promises a comprehensive presentation on creative workflows involving AI and real-time visuals.
🎨 Spence's Creative Journey and Workflow Overview
Spence gives a brief introduction about his decade-long career in visual creation, primarily for musical performances. He discusses his transition from using Cinema 4D to creating concert tour visuals and virtual production for Silent Partner Studio. His interest in AI was piqued in 2022, leading him to explore tools like Disco Diffusion, StyleGAN models, and GPT-3. He also mentions his work with TouchDesigner, a node-based program for real-time visuals, and his creation of a fine-tuned model called ZeroScope. The workflow spans multiple programs, starting with Notch for 3D modeling and animation, then moving to ComfyUI for rendering, and finally TouchDesigner for audio reactivity and compositing.
🛠️ Exploring Notch and Creating Loops
The tutorial shifts to using Notch, a real-time visual effects software, to create 3D models and animations. Spence demonstrates how to generate graphics quickly and intuitively by using a cloner to duplicate objects like spheres and applying textures to them. He shows how to create seamless loops by manipulating the movement of the image and discusses techniques for adding randomness and varying scale. Spence also answers Twitch audience questions about looping noise in Notch and suggests alternatives to the software.
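Notch offers loopable noise out of the box, but the trick behind it is worth knowing: sample 2D noise along a circle so the last frame lands exactly where the first one started. A minimal sketch, using the opensimplex package purely for illustration:

```python
# Seamless noise loop: walk a circle through 2D noise space so that
# t = 0 and t = 1 sample the identical point.
import math
import opensimplex

opensimplex.seed(42)

def looping_noise(t, radius=1.0):
    """t in [0, 1] over one loop; the output repeats seamlessly."""
    angle = 2.0 * math.pi * t
    return opensimplex.noise2(radius * math.cos(angle),
                              radius * math.sin(angle))

frames = 120
values = [looping_noise(i / frames) for i in range(frames)]
assert abs(looping_noise(0.0) - looping_noise(1.0)) < 1e-6  # same point
```

A larger radius traverses more of the noise field per loop, which reads as faster, busier motion.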
🔄 Perfecting Loops and Overcoming Learning Curves
Spence continues to refine the loops in Notch, adjusting parameters to achieve the desired visual effects. He discusses the process of finding a good loop and the importance of rendering visuals quickly. The conversation touches on the challenges of learning node-based workflows and offers advice for overcoming the initial fear and complexity. Spence suggests starting with existing workflows, tweaking them, and gradually building confidence through troubleshooting and customization.
🎨 Notch's 3D Rendering and Exploring Effects
The host explores additional features of Notch, such as its 3D rendering capabilities and the ability to add 2D effects to 3D visuals. Spence demonstrates the software's potential for creative expression and its advantages over other rendering programs. He also discusses the benefits of using Notch, including its real-time rendering and the fun, intuitive experience it provides for creators.
📚 Rendering and Post-Processing with Comfy UI
After rendering loops in Notch, the process moves to ComfyUI for further refinement. Spence shows how to load videos, adjust settings for efficient loading and rendering, and use image references to guide the AI in creating visuals. He also discusses techniques for upscaling and enhancing the quality of the final output, emphasizing the importance of starting at a lower resolution for better composition and faster testing.
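Since ComfyUI also exposes a local HTTP API, the low-resolution-first approach can be scripted: export the graph with "Save (API Format)", patch the resolution, and queue it. A hedged sketch (the node id and file name are assumptions; check your own export):

```python
# Queue a saved ComfyUI workflow over the local HTTP API for quick,
# low-resolution test renders. Assumes ComfyUI runs on its default port.
import json
import requests

with open("loop_render_api.json") as f:   # hypothetical API-format export
    workflow = json.load(f)

# Drop to a test resolution before queueing. Node id "3" is an assumption;
# find the EmptyLatentImage (or equivalent) node id in your own export.
workflow["3"]["inputs"]["width"] = 512
workflow["3"]["inputs"]["height"] = 512

resp = requests.post("http://127.0.0.1:8188/prompt", json={"prompt": workflow})
print(resp.json())  # contains a prompt_id you can use to track progress
```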
🤖 Controlling Creativity with Control Nets and Parameters
The workflow now involves using ControlNets to steer the AI's image generation in ComfyUI. Spence explains the use of different ControlNets for depth and motion prediction, which help achieve smoother transitions and maintain color integrity. He also addresses a question about the 'clip skip' parameter and its impact on how the model interprets prompts. The goal is a setup that balances creativity with control, generating visuals that adhere closely to the reference images.
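For readers more comfortable in code than in nodes, the same depth-guided idea can be sketched with the diffusers library; the checkpoints named here are common public ones, not necessarily what was used on stream:

```python
# Sketch: depth ControlNet guiding Stable Diffusion, roughly what the
# ComfyUI graph wires together with its ControlNet nodes.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from PIL import Image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

depth_map = Image.open("frame_depth.png")  # hypothetical depth frame

result = pipe(
    "chrome spheres, studio lighting, high detail",  # placeholder prompt
    image=depth_map,
    controlnet_conditioning_scale=0.8,  # lower = more creative freedom
    clip_skip=2,  # skips late CLIP layers (recent diffusers releases)
).images[0]
result.save("frame_styled.png")
```

Lowering `controlnet_conditioning_scale` is the code-side equivalent of relaxing a ControlNet's strength in the graph: the output drifts further from the reference but gains creativity.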
📈 High-Resolution Rendering and Efficiency
Spence demonstrates how to render high-resolution videos in ComfyUI, emphasizing the importance of efficient rendering settings. He uses a high-res fix script to upscale the visuals and applies a ControlNet to enhance the rendering process. The video shows the speed and quality of the rendering, highlighting the capabilities of the AI model and the potential for detailed and high-quality output.
🔁 Automating Content Creation with Custom Nodes
To automate the process of generating content, Spence introduces custom nodes from his CSP nodes pack, which allow for iterating through directories of videos and images. He sets up a system to automatically load and process multiple files, creating a continuous loop of content generation. This approach enables the rendering of numerous videos overnight, providing a wealth of material to review and select from later.
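The shape of that batch loop is simple to picture outside ComfyUI as well; a hypothetical sketch of what the directory iteration automates (folder names invented for illustration):

```python
# Batch over rendered clips and style references, in the spirit of the
# directory-iterator custom nodes shown on stream.
from pathlib import Path

VIDEO_DIR = Path("renders/loops")      # hypothetical folder of Notch loops
STYLE_DIR = Path("references/styles")  # hypothetical folder of style images

videos = sorted(VIDEO_DIR.glob("*.mp4"))
styles = sorted(STYLE_DIR.glob("*.png"))

for i, video in enumerate(videos):
    style = styles[i % len(styles)]  # cycle styles across the clip set
    print(f"queueing {video.name} with style {style.name}")
    # Each pair would be injected into the workflow JSON and queued,
    # e.g. via the /prompt endpoint sketched earlier, then left overnight.
```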
🎛️ TouchDesigner: The Power of Node-Based Real-Time Graphics
Spence transitions to TouchDesigner, a node-based program for real-time visuals. He explains its capabilities, such as controlling lasers, lights, and external APIs, and notes that working in it is effectively a form of visual programming. The host provides a link to download TouchDesigner and the workflow file shared during the stream. Spence sets up an audio file and demonstrates how to use audio analysis to detect beats, which can be visualized and used for audio reactivity in the visuals.
🎧 Audio Reactive Visuals with TouchDesigner
The tutorial focuses on creating audio reactive visuals in TouchDesigner. Spence shows how to manipulate video playback speed based on the detection of specific audio elements like kicks and snares. He uses a combination of nodes to create an audio reactive setup that changes the speed and appearance of the visuals in sync with the music. The process involves mapping audio analysis output to video speed parameters and using additional nodes for visual effects like bloom and color correction.
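Inside TouchDesigner this mapping is usually wired with CHOPs and parameter exports rather than code, but it can also be scripted. A hypothetical CHOP Execute DAT callback, with operator and channel names assumed for illustration:

```python
# CHOP Execute DAT sketch: spike a Movie File In TOP's speed on each kick.
# The 'kick' channel and 'moviefilein1' operator names are assumptions.

def onValueChange(channel, sampleIndex, val, prev):
    if channel.name == 'kick' and val > 0.5 and prev <= 0.5:
        # Rising edge on the kick channel: jump the playback speed.
        # A Lag CHOP (or a script timer) can ease it back toward 1.0.
        op('moviefilein1').par.speed = 4.0
    return
```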
📹 Finalizing the Audio Reactive Workflow
Spence concludes the audio reactive workflow by demonstrating how to composite different video elements based on audio cues. He uses a series of nodes to delay frames, create outlines, and apply effects that respond to the audio beat. The process results in a dynamic visual that changes in real-time with the music. Spence also shows how to render the final video with integrated audio and discusses the limitations of the free version of TouchDesigner.
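As a rough standalone approximation of the delayed-frame outline effect, here is an OpenCV sketch; TouchDesigner builds the same thing from operators such as Cache and Edge TOPs, and the clip path and delay length below are assumptions:

```python
# Frame-delay compositing sketch: overlay edges from a delayed frame
# onto the current one, echoing the stream's outline effect.
import collections
import cv2

cap = cv2.VideoCapture("render.mp4")   # hypothetical rendered clip
buffer = collections.deque(maxlen=15)  # ~0.5 s delay at 30 fps

while True:
    ok, frame = cap.read()
    if not ok:
        break
    buffer.append(frame)
    delayed = buffer[0]  # oldest frame in the buffer
    edges = cv2.Canny(cv2.cvtColor(delayed, cv2.COLOR_BGR2GRAY), 100, 200)
    frame[edges > 0] = (255, 255, 255)  # stamp delayed outline onto frame
    cv2.imshow("composite", frame)
    if cv2.waitKey(30) & 0xFF == 27:  # Esc to quit
        break

cap.release()
cv2.destroyAllWindows()
```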
🌟 Wrapping Up and Future Creative Software
The session wraps up with a discussion of other creative software with potential for real-time graphics and audio reactivity. Spence recommends tools like Unreal Engine's Avalanche, Cables, and Tooll 3, and emphasizes the importance of finding a passion project to drive the learning process. He also encourages posting work in the right communities and continually practicing with various tools to build proficiency. The host and Spence reflect on the complexity and creativity involved in building such comprehensive workflows.
📚 Providing Resources for Further Exploration
Spence describes the contents of the zip file available on the workflow page, which includes masks for use in Notch, the audio reactive TouchDesigner setup with detailed comments, and a more organized version of the workflow used during the stream. The host reminds viewers of upcoming streams and guests, encouraging them to follow Spence on social media and explore the tools and techniques demonstrated in the stream.
Keywords
Audio Reactive
Notch
TouchDesigner
Runway ML
Stable Diffusion
Cinema 4D
AI Training
Real-Time Visuals
Workflow Page
Node-Based Program
Interpolation
Highlights
Spence, a guest on the stream, works for Runway ML and specializes in audio reactive projects.
Spence has created a presentation and workflow page with downloadable ComfyUI workflows and a TouchDesigner file.
Spence's introduction to using technology for visual expression through music.
The use of AI in video creation, specifically integrating image generation workflows with TouchDesigner.
An overview of Spence's professional experience, including concert tour visuals and virtual production.
The introduction of the Notch software and its application in real-time 3D modeling and animation.
Tips for learning node-based programs like ComfyUI and TouchDesigner.
A live demonstration of creating a loopable animation in Notch without looping noise.
The transition from Notch to ComfyUI for rendering animations, then on to audio reactive beat detection.
A detailed explanation of the TouchDesigner interface and its capabilities.
The process of automating video creation using AI and node-based workflows.
Spence's recommendation of other creative software like Unreal Engine's Avalanche and Tooll 3.
Advice for beginners in learning creative coding and graphic software.