Deep Learning for Music Generation - Training GPT-2 on a large music dataset - Lakh MIDI Dataset

Tristan Behrens

25 Feb 2022 06:22

Summary

TLDR: In this video, the creator thanks NVIDIA for access to the DGX2, a powerful AI system that has significantly advanced their music experiments. They cover its technical details and describe training a new AI music network on 150,000 songs with it. The video shows the network's progress through loss curves and demonstrates its capabilities by playing a generated string ensemble piece. The creator discusses how they train, evaluate, and refine AI-generated music, and hints at future projects involving more neural networks and interactive composing tools.

Takeaways

🙌 The speaker expresses gratitude to NVIDIA for providing access to the DGX2, a powerful AI tool that has significantly enhanced their music experiments.
🔍 The video delves into the technical details and inner workings of the NVIDIA DGX2, emphasizing its computational capabilities.
📈 The DGX2 is described as a robust piece of hardware introduced in 2018, notable for its heavy weight and high power consumption.
🎵 The speaker shares their experience training a new AI music network on the DGX2; the network was trained on 150,000 songs over seven days.
📊 Loss curves are discussed to illustrate the performance of the new AI music network, indicating slight overfitting but overall positive training progress.
🎼 The importance of subjective listening is highlighted, as there's currently no automatic measurement for the quality of AI-generated music.
💾 The speaker mentions their collection of trained AI networks and the dataset used, which consists of 150,000 MIDI files after some pre-processing.
🎧 An example of AI-generated music is played, demonstrating the network's ability to create harmonic string ensembles.
🔄 The process of training, downloading checkpoints, and experimenting with different versions of the AI network on a local machine is outlined.
🚀 Future plans include training more neural networks and integrating the new AI music network into a more interactive tool for composing music.
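
The loss-curve takeaway can be made concrete with a small sketch. The video does not show how the curves were analysed, so the function below is only one simple way to flag overfitting: it looks for the point after which validation loss stops improving while training loss keeps falling. The function name, the `patience` parameter, and the loss values are all illustrative assumptions, not numbers from the video.

```python
def overfitting_onset(train_loss, val_loss, patience=3):
    """Return the index of the best validation loss if validation then fails to
    improve for `patience` evaluations while training loss keeps falling.
    Return None if no such point is found."""
    best_idx = 0
    for i in range(1, len(val_loss)):
        if val_loss[i] < val_loss[best_idx]:
            # Validation is still improving: remember the new best point.
            best_idx = i
        elif i - best_idx >= patience and train_loss[i] < train_loss[best_idx]:
            # Validation has stalled while training loss kept dropping.
            return best_idx
    return None


# Made-up loss values for illustration only (not taken from the video).
train = [3.0, 2.4, 2.0, 1.7, 1.5, 1.35, 1.25]
val = [3.1, 2.6, 2.3, 2.2, 2.25, 2.3, 2.4]

print(overfitting_onset(train, val))
```

A check like this only narrows down which checkpoints are worth downloading; as the takeaways note, the final judgement still comes from listening.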

Q & A

What is the main topic discussed in the video?
-The main topic discussed in the video is the experience and technical details of working with NVIDIA's DGX2, a powerful AI hardware, and its impact on the creator's AI music experiments.
Why is the NVIDIA DGX2 referred to as a '3D powerhouse'?
-The NVIDIA DGX2 is referred to as a '3D powerhouse' because it is a high-performance AI system that significantly enhances the creator's AI music experiments, suggesting its capabilities in terms of data processing, deep learning, and overall computational power.
What does the creator appreciate about the NVIDIA DGX2?
-The creator appreciates the NVIDIA DGX2 for its computational power, which has helped elevate their AI music experiments to the next level.
What is the physical description of the NVIDIA DGX2 mentioned in the script?
-The NVIDIA DGX2 is described as being very heavy, weighing 160.3 kilograms, which is both an advantage and a disadvantage due to its power and the space it occupies.
How long did it take to train the new AI music network mentioned in the video?
-The new AI music network was trained for seven days, reaching 1 million global steps with 112 samples per batch.
What does the loss curve in the video represent?
-The loss curve represents the performance of the AI network during training, showing both the training and validation set loss curves, which help to identify overfitting or underfitting.
What was the dataset used for training the AI music network?
-The dataset used for training the AI music network consisted of 150,000 MIDI files, filtered down from an original dataset of 175,000 because of some issues in the pre-processing pipeline.
How does the creator assess the quality of music generated by AI?
-The creator assesses the quality of music generated by AI by listening to it, as there is currently no automatic measure of the quality of AI-generated music.
What is the purpose of the AI music network trained in the video?
-The purpose of the AI music network trained in the video is to generate music, specifically string ensembles, based on the conditioning provided during the training process.
What are the next steps for the creator after training the AI music network?
-The next steps for the creator include training more neural networks and integrating the newly trained network into a tool to enhance the composing experience, making it more interactive and user-friendly.
What is the creator's view on the current state of AI-generated music?
-The creator views AI-generated music positively, noting that while the network they trained has shown promise in generating music with proper syntax, it still struggles with harmony at times, indicating that there is room for improvement.
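
The figures quoted in the answers above (1 million global steps, 112 samples per batch, seven days, 150,000 of 175,000 files kept) can be turned into rough throughput numbers. This is a back-of-the-envelope sketch: it assumes the run was continuous for exactly seven days, which the video does not confirm, and it treats a "sample" as one training sequence rather than one song. The variable names are illustrative.

```python
# Figures quoted in the summary.
GLOBAL_STEPS = 1_000_000
BATCH_SIZE = 112
TRAINING_DAYS = 7
FILES_KEPT = 150_000
FILES_ORIGINAL = 175_000

# Total training samples processed across all steps.
samples_seen = GLOBAL_STEPS * BATCH_SIZE

# Assumption: the run was uninterrupted for exactly seven days.
seconds = TRAINING_DAYS * 24 * 60 * 60

steps_per_second = GLOBAL_STEPS / seconds
samples_per_second = samples_seen / seconds

# Share of the original MIDI files that survived pre-processing.
kept_fraction = FILES_KEPT / FILES_ORIGINAL

print(f"samples seen:       {samples_seen:,}")
print(f"steps per second:   {steps_per_second:.2f}")
print(f"samples per second: {samples_per_second:.1f}")
print(f"files kept:         {kept_fraction:.1%}")
```

Under these assumptions the run processed 112 million samples at roughly 1.65 steps per second, and about 86% of the original files made it through pre-processing.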