AnimateDiff ControlNet Animation v1.0 [ComfyUI]

Jerry Davos AI
5 Nov 2023 16:03

TLDR: This tutorial outlines a workflow for creating animations with AnimateDiff, ControlNet, and ComfyUI. It guides users through downloading the workflow JSON files, setting up the workspace, and using After Effects to downscale a reference video and export it as a JPEG image sequence. The process involves rendering ControlNet passes, organizing them into folders, and rendering frames in batches with a chosen model and settings. The final steps are fixing faces in Automatic 1111 and upscaling the images for a polished animation.


  • 🖥️ AnimateDiff ControlNet Animation v1.0 is a ComfyUI workflow for creating animations from imported JSON workflow files and a reference video.
  • 📂 Users need to have specific ComfyUI extensions installed and follow the tutorial links provided in the description for effective usage.
  • 💃 The reference video by Helen Ping is used as an example to demonstrate the workflow for creating animations.
  • 📸 To start, users should downscale their reference video and export it as a JPEG image sequence for the initial ControlNet passes.
  • 🎨 Import the JPEG images into ComfyUI using the 'Load Images from Directory' node and render Soft Edge and Open Pose passes from them.
  • 📁 Create two new folders for the Soft Edge and Open Pose images; a ControlNet passes JSON file is included for ease of use.
  • 🔄 Render all ControlNet images and sort them into their respective folders, checking that every image rendered correctly.
  • 🔧 Choose the desired animation style (realistic, anime, or cartoon) and set the resolution nodes to match the reference video's dimensions.
  • 🚦 Use the batch range and skip frames nodes to manage the rendering process effectively, adjusting based on PC capabilities and the number of images.
  • 🎭 Apply ControlNet passes using the purple node inputs and adjust settings according to the animation style and user preferences.
  • 🔍 After rendering, fix any issues with the animation, such as disproportionate faces, using Automatic 1111's Image to Image tab and the Detailer extension.

Q & A

  • What is the primary software used for the animation mentioned in the script?

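    -The animation itself is generated with AnimateDiff and ControlNet in ComfyUI. After Effects is used to prepare the reference frames and to assemble the final video, and Automatic 1111 is used to fix faces.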

  • What are the extensions required to use the ComfyUI workflow?

    -The specific ComfyUI extensions are not named, but the user must have certain extensions installed to use this workflow.

  • How is the reference video used in the animation process?

    -The reference video is dragged into After Effects, placed in a new composition, downscaled to a smaller resolution, and exported as JPEG images.

  • What is the purpose of exporting the video as JPEG image sequences?

    -The JPEG images are the input for the initial ControlNet passes in ComfyUI.

  • What are the two ControlNet passes needed and how are they named?

    -The two ControlNet passes needed are Soft Edge and Open Pose. They are saved with matching filename prefixes for better organization.

  • How many frames can the creator's RTX 3070 Ti laptop GPU handle for rendering?

    -The creator's RTX 3070 Ti laptop GPU can handle up to 150 frames per batch at the resolution used in the tutorial.

  • What is the purpose of the detailer extension in the animation process?

    -The detailer extension is used for enhancing the quality of the rendered images, particularly for fixing issues with the face by using negative embeddings and adjusting the denoising strength.

  • How is the final animation assembled and edited?

    -The final animation is assembled by sequencing all the batches in After Effects, adding color corrections, zooming the composition, and then rendering out the video.

  • What is the suggested method for fixing disproportionate faces in the animation?

    -The suggested method for fixing disproportionate faces is to lower the inpaint denoising strength and use the detailer with a larger denoising strength for the face.

  • How can users share their works created using this workflow?

    -Users can share their works by forwarding them to the creator on Discord or mentioning them in the comments section of the tutorial.

  • What is the recommended tool for upscaling the images after rendering?

    -The recommended tool for upscaling the images after rendering is Topaz Gigapixel AI.



🎨 Animation Workflow Setup

This paragraph outlines the initial steps for setting up an animation project using ComfyUI and AnimateDiff. It begins by instructing users to download the JSON files from the description and import them into the workspace. The process involves downscaling a reference video, exporting it as JPEG images, and using these images to create ControlNet passes. The paragraph emphasizes the need for specific extensions and provides a link to a tutorial for troubleshooting common issues. It concludes with a step-by-step guide on rendering ControlNet images and organizing them into folders for further use.
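The folder organization described above can also be scripted. The following is a minimal sketch, assuming the passes were saved into one render folder with the filename prefixes `softedge_` and `openpose_`; those prefixes and the folder names are hypothetical, so adjust them to whatever your save nodes actually write.

```python
import shutil
from pathlib import Path

# Hypothetical filename prefixes mapped to destination folder names.
PASS_FOLDERS = {"softedge_": "SoftEdge", "openpose_": "OpenPose"}


def sort_passes(render_dir, dest_root, pass_folders=PASS_FOLDERS):
    """Move rendered pass images into one folder per pass.

    Returns how many files were moved into each folder.
    """
    render_dir, dest_root = Path(render_dir), Path(dest_root)
    counts = {folder: 0 for folder in pass_folders.values()}
    # sorted() materializes the listing before any file is moved.
    for image in sorted(render_dir.iterdir()):
        if not image.is_file():
            continue
        for prefix, folder in pass_folders.items():
            if image.name.lower().startswith(prefix):
                target = dest_root / folder
                target.mkdir(parents=True, exist_ok=True)
                shutil.move(str(image), str(target / image.name))
                counts[folder] += 1
                break
    return counts
```

Files that match no prefix are left where they are, so the counts can be compared against the number of exported frames to spot a pass that did not finish rendering.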


🖌️ Customizing Animation Style and Parameters

The second paragraph delves into the customization of the animation style, whether it's realistic, anime, or cartoon. It details the process of selecting a model, setting dimensions according to the original video, and managing the batch range and skip frames for processing. The paragraph also explains the role of ControlNet passes and input nodes in the workflow. It provides guidance on testing animations and adjusting settings for optimal rendering, including the use of prompts and ControlNet images. The paragraph concludes with a note on the importance of rendering all frames and fixing faces in post-production.
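Setting the dimensions to match the original video comes down to simple arithmetic. The sketch below is an illustration, not the workflow's own logic: the function name and the `target_long_side` parameter are hypothetical, and rounding to a multiple of 8 is an assumption based on how Stable Diffusion models usually expect image sizes.

```python
def fit_resolution(src_w, src_h, target_long_side, multiple=8):
    """Scale (src_w, src_h) so the longer side equals target_long_side,
    keeping the aspect ratio and rounding each side to a multiple."""
    if min(src_w, src_h, target_long_side, multiple) <= 0:
        raise ValueError("all arguments must be positive")
    long_side = max(src_w, src_h)

    def scaled(value):
        exact = value * target_long_side / long_side
        # Round to the nearest multiple, but never below one multiple.
        return max(multiple, round(exact / multiple) * multiple)

    return scaled(src_w), scaled(src_h)


# A vertical 1080x1920 reference rendered with a 768-pixel long side.
width, height = fit_resolution(1080, 1920, 768)
print(width, height)  # 432 768
```

Rounding can shift the aspect ratio slightly for unusual source sizes, so compare the result against the reference video before rendering a full batch.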


🎵 Enhancing Animation with Music and Detailing

This paragraph focuses on the integration of music and the enhancement of animation quality. It discusses the process of rendering test frames and adjusting the final animation for better visual appeal. The paragraph mentions the use of an RTX 3070 Ti laptop GPU for rendering and provides specific frame count recommendations based on GPU capacity. It also introduces the use of Automatic 1111 for face detailing and suggests using Topaz Gigapixel AI for image upscaling. The paragraph concludes with a mention of additional color corrections and composition adjustments made in After Effects to finalize the animation.
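The batch range and skip frames values for a long sequence can be planned ahead of time. This is a minimal sketch under stated assumptions: the default limit of 150 frames is the figure the creator reports for an RTX 3070 Ti laptop GPU at the tutorial's resolution, and the function name is hypothetical. Each pair it returns would be entered by hand into the skip frames and batch range nodes for one render.

```python
def plan_batches(total_frames, max_batch=150):
    """Split a frame sequence into (skip_frames, batch_range) pairs,
    each at most max_batch frames long."""
    if total_frames < 0 or max_batch <= 0:
        raise ValueError("total_frames must be >= 0 and max_batch > 0")
    batches = []
    skip = 0
    while skip < total_frames:
        size = min(max_batch, total_frames - skip)
        batches.append((skip, size))
        skip += size
    return batches


# 400 frames with a 150-frame limit need three renders.
for skip, size in plan_batches(400):
    print(f"skip frames = {skip}, batch range = {size}")
```

For a quick test render, a single small pair such as `(0, 10)` serves the same purpose without planning the full sequence.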


💌 Sharing and Future Tutorials

The final paragraph invites users to share their creations made using the described workflow and offers support for any encountered issues. It encourages users to reach out via Discord or comments for further assistance and shares the creator's Discord username for direct interaction. The paragraph ends on a positive note, promising more tutorials and expressing gratitude for the community's love and support.




💡AnimateDiff

AnimateDiff is a motion module for Stable Diffusion models that generates a sequence of frames together so that they stay consistent over time, rather than producing each image independently. In the video, the creator uses it to animate a dance video by Helen Ping, with ControlNet passes guiding each frame so that the result follows the original movement.


💡ControlNet

ControlNet is a neural network that conditions a diffusion model on an extra guide image, such as an edge map or a pose skeleton. In the video, the creator renders ControlNet passes, one guide image per frame of the reference video, so that the animation follows the dancer's movement and poses. These passes are essential for reproducing the dance accurately in the final animation.


💡ComfyUI

ComfyUI is a node-based graphical interface for building and running Stable Diffusion workflows. Workflows can be saved as JSON files and loaded by dragging them into the workspace, which is how the tutorial's files are imported. The workflow also depends on several ComfyUI extensions, which the script does not name.

💡Automatic 1111

Automatic 1111 is the common name for AUTOMATIC1111's Stable Diffusion web UI. In the video it is used in the post-processing stage: the rendered frames are run through its Image to Image tab with a detailer extension that detects faces and re-renders them, so that the final animation has properly proportioned faces.

💡Dance Video

The dance video, as referenced in the script, is the source material for the animation project. It is a recording of Helen Ping dancing, which the creator uses as a reference for the dance movements. The video is downscaled and exported as a sequence of JPEG images, which are then used to create ControlNet passes and guide the animation process.
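Before rendering passes from the exported sequence, it can help to confirm that no frame is missing. The following sketch assumes the exported files end in a frame number followed by `.jpg` or `.jpeg` (for example `frame_00001.jpg`); that naming pattern and the function name are assumptions for illustration, not something the tutorial specifies.

```python
import re
from pathlib import Path

# Trailing digits right before the .jpg/.jpeg extension are the frame number.
FRAME_NUMBER = re.compile(r"(\d+)\.jpe?g$", re.IGNORECASE)


def missing_frames(folder):
    """Return frame numbers absent between the first and last exported frame."""
    numbers = sorted(
        int(match.group(1))
        for path in Path(folder).iterdir()
        if (match := FRAME_NUMBER.search(path.name))
    )
    if not numbers:
        return []
    present = set(numbers)
    return [n for n in range(numbers[0], numbers[-1] + 1) if n not in present]
```

An empty list means the sequence is contiguous. The same check can be pointed at the rendered ControlNet folders by widening the pattern to the image format used there.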

💡After Effects

After Effects is a widely used digital video editing and compositing application developed by Adobe Systems. In the context of the video, it is the software where the creator imports the reference video, makes a new composition, and exports the video as a sequence of JPEG images. It is also where the final animation is assembled and rendered.

💡Soft Edge

Soft Edge is one of the two ControlNet passes mentioned in the script. A Soft Edge preprocessor turns each frame into a map of smooth outlines, which tells the model where the subject's boundaries and shapes are. In the animation process described, this pass keeps the dancer's silhouette and contours consistent with the reference video.

💡Open Pose

Open Pose is the other ControlNet pass used in the animation process. It comes from OpenPose, a pose-estimation method that detects body keypoints and draws them as a skeleton for each frame. The Open Pose pass is essential for ensuring that the animation captures the correct body movements and postures from the dance video.

💡K Sampler

K Sampler is a node in the ComfyUI workflow. It performs the diffusion sampling itself: starting from a noisy latent, it denoises step by step using the model, the prompts, and settings such as seed, step count, CFG scale, and sampler. In this workflow it is the node that generates the frames once the ControlNet passes and prompts are connected.


💡Detailer

Detailer is an extension used in the post-processing stage of the animation. It is employed to enhance the quality and detail of the rendered images, particularly the faces. The script mentions using the Detailer with negative embeddings and different models to improve the results.


💡Upscaling

Upscaling is the process of increasing the resolution of an image or video. In the video script, the creator uses a tool called Topaz Gigapixel AI to upscale all the images after fixing the faces. This step keeps the animation sharp when the final video is rendered at a higher resolution.


The animation was created using AnimateDiff and ComfyUI.

Automatic 1111 was used to fix faces in the rendered frames.

Users can download JSON files from the description below.

The workflow involves dragging and dropping files into the ComfyUI workspace.

A reference dance video by Helen Ping is used for demonstration.

A new composition is created in After Effects with the video downscaled to a smaller resolution.

The video is exported as a JPEG image sequence.

Initial ControlNet passes are made from the images for Soft Edge and Open Pose.

The ControlNet images are organized into HED (Soft Edge) and Open Pose folders for easy access.

A ControlNet passes JSON file is included for convenience.

The animation workflow is broken down, explaining each green and purple node's function.

The model loader node allows selection of animation style: realistic, anime, or cartoon.

The dimension of the animation is set to match the reference video's aspect ratio.

Batching and skipping frames are used to manage PC capabilities and rendering efficiency.

ControlNet passes are applied without extra preprocessing because the images are already rendered.

Testing animations can be done quickly by adjusting the batch range and skip frames.

The final animation rendering takes into account the GPU's maximum handling capacity.

Face issues are fixed in Automatic 1111 using the Image to Image tab and various models.

The Detailer extension is used for enhancing results and fixing facial discrepancies.

Upscaling images is done using Topaz Gigapixel AI for improved quality.

The final video is assembled in After Effects with color corrections and other enhancements.

The workflow is an innovative method for creating animations, opening up endless possibilities for artistic expression.