Advanced Settings Tutorial - Kits AI

Kits AI
29 Feb 2024 · 05:46

TLDR: This tutorial video from Kits AI guides viewers on how to optimize their AI voice conversions using advanced settings. It begins with the removal of instrumentals from full songs and offers options to clean up audio by removing reverb, delay, and backing vocals. The importance of pitch adjustment through 'pitch shift' is highlighted, noting its effect on the key of the audio. The video then covers the critical settings of conversion strength and volume blend, which directly affect the output's quality. It advises starting with a medium conversion strength and adjusting as needed to avoid mispronunciations. The tutorial also covers pre- and post-processing effects, such as cut noise, smooth volume, and the use of a compressor for better audio presence. Creative effects like chorus, reverb, and delay are briefly mentioned, with a suggestion to use them judiciously, especially when integrating the audio into other projects. The presenter demonstrates these settings using a clean studio recording and the M strange Rock model, showing the difference in dynamics and presence between the original and AI-converted audio. The video concludes by encouraging viewers to save their preferred settings as a preset for future use.


  • 🎙️ **Advanced Settings for Voice Conversion**: The video provides an in-depth guide on how to use advanced settings in Kits AI for better voice conversion.
  • 🔍 **Remove Instrumentals**: A feature to separate vocals from instrumentals in a full song, useful for clean vocal extraction.
  • 🎶 **Reverb and Delay Reduction**: Tools to clean up audio by reducing reverb and delay, enhancing the clarity of vocals.
  • 🎤 **Remove Backing Vocals**: An option to eliminate additional vocals like ad libs or backup singers for a more focused vocal track.
  • 🎛️ **Pitch Shifting**: Adjusting the pitch of the audio to match the range of the selected AI model. Note that shifting the pitch also changes the key of the audio.
  • 🔉 **Conversion Strength**: A setting that determines how much the AI voice will be applied to the input audio, affecting the conversion's authenticity.
  • 📈 **Volume Blending**: Balancing the AI model's volume with the original audio to either maintain dynamics or achieve a smoother, more polished sound.
  • 🛠️ **Pre-Processing Effects**: Subtle audio adjustments before conversion, including noise reduction and volume smoothing for improved audio quality.
  • ⚙️ **Post-Processing Effects**: Application of effects like compression, chorus, reverb, and delay after conversion to refine the final output.
  • 💡 **Start with Medium Settings**: It's recommended to begin with medium conversion strength and volume blend, then adjust according to the specific needs of the audio.
  • 📌 **Save Presets**: Once the desired settings are found, users can save them as presets for future use, streamlining the conversion process.
  • 📚 **Understanding for Better Conversions**: The tutorial aims to give users a better understanding of the program to achieve the best possible AI voice conversions.

Q & A

  • What is the purpose of the 'Remove Instrumentals' feature in Kits AI?

    -The 'Remove Instrumentals' feature is used to separate vocals from the instrumentals in a full song that includes vocals, melodies, bass, drums, etc., which is helpful when you want to isolate the vocal track for conversion.

  • How does the 'Remove Reverb and Delay' button help in the audio conversion process?

    -The 'Remove Reverb and Delay' button helps to clean up the audio by reducing or eliminating reverberation and delay effects that are common in vocal tracks, resulting in a clearer vocal for conversion.

  • What is the function of the 'Remove Backing Vocals' feature?

    -The 'Remove Backing Vocals' feature assists in eliminating additional vocal layers such as ad libs in hip-hop songs or backup singers, leaving only the primary vocals for conversion.

  • How does the 'Pitch Shift' tool work in Kits AI?

    -The 'Pitch Shift' tool adjusts the pitch of the audio to match the range of the selected AI model. If the audio's pitch is too high or too low for the model, you can use this tool to shift the pitch up or down in semitones. Keep in mind that the shift also changes the key of the audio.

  • What is the significance of 'Conversion Strength' in determining the output of the AI voice conversion?

    -The 'Conversion Strength' setting determines how much the input audio is altered to resemble the AI voice. A higher setting will result in a more pronounced AI voice character, but it may also increase mispronunciation of certain words.

  • How does the 'Volume Blend' setting affect the final audio?

    -The 'Volume Blend' setting controls the balance between the original audio levels and the AI voice conversion. A lower model volume maintains the original audio dynamics, while a higher model volume results in a smoother and more polished output, which is useful for recordings with varied audio levels.

  • What are the benefits of using the 'Cut Noise' pre-processing effect?

    -The 'Cut Noise' effect helps to mask or reduce static background noise, rumble, or harshness in the high end of the recording, leading to a cleaner input for conversion.

  • What is the role of the 'Smooth Volume' pre-processing effect?

    -The 'Smooth Volume' effect is used to even out recordings with varied volume levels, ensuring a more consistent audio input for the conversion process.

  • Why is the 'Compressor' post-processing effect recommended for most conversions?

    -The 'Compressor' effect is recommended because it helps to manage varied volumes and enhance the overall presence of the audio. It's particularly useful when the converted audio will be used in a track where consistency in volume levels is important.

  • What should be considered when using creative post-processing effects like 'Chorus', 'Reverb', and 'Delay'?

    -When using creative effects like 'Chorus', 'Reverb', and 'Delay', it's important to have a clear idea of what you want from your converted audio. These effects can be useful for quick and easy enhancements, but for more professional or flexible use, it might be better to apply your own plugins for these effects.

  • How can users save their preferred settings in Kits AI for future use?

    -Users can save their preferred settings as a preset in Kits AI, allowing them to quickly apply the same settings to future conversions without having to manually adjust them each time.

  • What is the recommended starting point for 'Conversion Strength' and 'Volume Blend' settings?

    -It is suggested to start with a medium 'Conversion Strength' and adjust as needed based on the audio. For 'Volume Blend', a higher volume blend is recommended for a smoother and more polished sound, while a lower blend is better for preserving the dynamics of the original recording.



🎙️ Advanced Voice Conversion Settings with Kits AI

This paragraph introduces the video's focus on advanced settings for converting voices using Kits AI. It explains the process of converting audio by selecting an AI model and delving into advanced settings to refine the conversion. Key features discussed include removing instrumentals, handling reverb and delay, and removing backing vocals. The paragraph also touches on pitch shifting for audio that doesn't match the model's range and emphasizes the importance of conversion strength and volume blend for achieving the desired output. It concludes with a brief mention of starting with medium settings and adjusting as needed.


🔊 Volume Blend and Pre-Processing Effects for Audio Quality

The second paragraph delves into the significance of the volume blend setting, explaining when to use high or low model volume for different types of recordings. It then introduces pre-processing effects such as cut noise for background noise, low/high shelf for frequency adjustments, and smooth volume for uneven audio levels. The paragraph also advises starting with low pitch correction and increasing as necessary. It concludes with a mention of post-processing effects like compression, chorus, reverb, and delay, noting the importance of understanding desired outcomes for creative use or flexibility in audio production.



💡Advanced Settings

Advanced Settings refers to the optional configurations that users can adjust to fine-tune the performance of a software application. In the context of the video, it pertains to the detailed options within Kits AI for optimizing the conversion of audio to AI voices. These settings allow for a more customized and higher quality output tailored to the user's specific needs.

💡Remove Instrumentals

Remove Instrumentals is a feature that allows users to separate vocals from the instrumental track in a song. In the video, it is mentioned as one of the first advanced settings that can be utilized when converting audio with Kits AI, which is particularly useful for audio that contains full musical compositions alongside vocals.

💡Reverb and Delay

Reverb and Delay are audio effects that simulate the persistence of sound in a particular space and the echo effect respectively. The video explains that these effects are common in vocals and songs, and Kits AI provides options to reduce or remove them for a cleaner audio conversion.

💡Pitch Shift

Pitch Shift is a tool used to adjust the pitch of an audio file. If the audio falls outside the range suitable for the selected AI model, the Pitch Shift feature can be used to move the pitch into the model's range. However, the video notes that this also changes the key of the audio, which is an important consideration.
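
To make the key change concrete, here is a minimal sketch of what a shift in semitones does, assuming standard 12-tone equal temperament. The function names are illustrative and are not part of Kits AI.

```python
# Sketch only: semitone arithmetic in 12-tone equal temperament.
# None of these names come from Kits AI; they are illustrative.

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def semitone_ratio(semitones: float) -> float:
    """Frequency multiplier for a shift of `semitones` (12 semitones = one octave)."""
    return 2.0 ** (semitones / 12.0)


def shifted_frequency(freq_hz: float, semitones: float) -> float:
    """Frequency of a tone after shifting it by `semitones`."""
    return freq_hz * semitone_ratio(semitones)


def transpose_key(key: str, semitones: int) -> str:
    """Name of the key root after shifting by a whole number of semitones."""
    index = NOTE_NAMES.index(key)
    return NOTE_NAMES[(index + semitones) % 12]


if __name__ == "__main__":
    # Shifting up 2 semitones moves a song in C to D, which is why any
    # instrumental has to be shifted (or re-recorded) to match.
    print(transpose_key("C", 2))
    print(round(shifted_frequency(440.0, 2), 2))
```

Because every frequency is multiplied by the same ratio, the whole vocal moves to a new key, so the backing track has to be shifted by the same amount to stay in tune.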

💡Conversion Strength

Conversion Strength is a parameter that determines the degree to which the input audio is altered to resemble the AI voice during the conversion process. A higher conversion strength can add more character to the AI voice but may also increase mispronunciation of words. The video suggests starting with a medium conversion strength and adjusting as needed.

💡Volume Blend

Volume Blend is a setting that controls the balance between the original audio levels and the AI voice conversion. A lower model volume maintains the original audio dynamics, whereas a higher model volume results in a smoother, more polished sound. It is context-dependent and chosen based on whether the user wants to preserve the original audio dynamics or achieve a more uniform sound.
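
The video does not describe how Volume Blend is implemented. One way to picture it, assuming it behaves like a linear mix between the loudness contour of the original take and a smoother contour produced by the model, is the sketch below. The `model_volume` parameter, its 0.0 to 1.0 range, and the envelope representation are all hypothetical.

```python
# Sketch only: a linear blend between two loudness envelopes.
# This is an assumed mental model, not Kits AI's actual algorithm.

def blend_envelope(original_env, model_env, model_volume):
    """Mix two per-frame loudness envelopes.

    model_volume = 0.0 keeps the original dynamics untouched;
    model_volume = 1.0 uses the model's (smoother) levels entirely.
    """
    if not 0.0 <= model_volume <= 1.0:
        raise ValueError("model_volume must be between 0.0 and 1.0")
    if len(original_env) != len(model_env):
        raise ValueError("envelopes must have the same number of frames")
    return [
        (1.0 - model_volume) * o + model_volume * m
        for o, m in zip(original_env, model_env)
    ]


if __name__ == "__main__":
    original = [0.2, 1.0, 0.3]   # uneven take: quiet, loud, quiet
    model = [0.6, 0.6, 0.6]      # flat, "polished" level
    for setting in (0.0, 0.5, 1.0):
        print(setting, blend_envelope(original, model, setting))
```

Under this assumption, a low setting suits a clean studio recording whose dynamics are worth keeping, and a high setting suits an uneven recording that benefits from being levelled out.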

💡Pre-processing Effects

Pre-processing Effects are audio treatments applied to the input audio before it is converted by the AI. These effects, such as cut noise and smooth volume, are used to clean up the input and prepare it for conversion. They help to address issues like background noise, volume inconsistencies, and pitch inaccuracies.

💡Post-processing Effects

Post-processing Effects are applied to the audio after it has been converted by the AI. The video mentions a compressor for balancing volume levels and creative effects like chorus, reverb, and delay for additional audio enhancement. These effects can be used for a more polished final output but may reduce flexibility if the audio is intended for further editing.
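
To show why a compressor evens out volume, the sketch below implements the textbook static gain curve of a downward compressor: levels above a threshold are reduced by a ratio. The threshold and ratio values are arbitrary examples, and nothing here reflects the settings Kits AI uses internally.

```python
# Sketch only: static (no attack/release) downward compression in dB.
# Threshold and ratio defaults are arbitrary example values.

def compress_db(level_db: float, threshold_db: float = -18.0, ratio: float = 4.0) -> float:
    """Output level in dB for a given input level.

    Below the threshold the signal passes unchanged; above it, every
    `ratio` dB of input yields only 1 dB of output.
    """
    if ratio < 1.0:
        raise ValueError("ratio must be at least 1.0")
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


if __name__ == "__main__":
    quiet, loud = -30.0, -6.0
    print("input range (dB): ", loud - quiet)
    print("output range (dB):", compress_db(loud) - compress_db(quiet))
```

The gap between the quiet and loud passages shrinks, which is what lets the converted vocal sit at a steadier level in a mix.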


💡Dynamics

Dynamics in audio refers to the range between the loudest and softest parts of a sound recording. Preserving the dynamics is important for maintaining the original expressiveness and impact of the recording. In the video, it's suggested to use a lower volume blend when the user wants to retain the dynamics of a clean, studio-quality recording.
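
As a rough illustration of that definition, the range between the loudest and softest parts can be expressed in decibels. The sketch below assumes you already have a list of positive linear amplitude levels (for example, one peak value per phrase); the function name is illustrative.

```python
# Sketch only: dynamic range in dB from linear amplitude levels.
import math


def dynamic_range_db(levels) -> float:
    """Ratio of the loudest to the softest level, in decibels.

    `levels` must be positive linear amplitudes (not dB values).
    """
    positive = [abs(x) for x in levels if x != 0]
    if not positive:
        raise ValueError("need at least one non-zero level")
    return 20.0 * math.log10(max(positive) / min(positive))


if __name__ == "__main__":
    # A take whose loudest phrase is ten times the amplitude of its softest.
    print(dynamic_range_db([0.1, 0.5, 1.0]))
```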

💡AI Cover

An AI Cover is a version of a song created using artificial intelligence to convert the original vocals into a different voice or style. The video discusses how to achieve a proper mix where the AI-converted vocals sit well with the original instrumentals, which is crucial for creating a harmonious AI cover.


💡Presets

Presets are saved combinations of settings that users can reuse in future projects. In the context of the video, once a user finds a combination of settings that yields satisfactory results with Kits AI, they can save it as a preset for convenience and consistency in subsequent audio conversions.
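
Kits AI stores presets inside the app, and the video does not show their format. If you also want to keep a record of your settings alongside a project, a small local file works. The sketch below is one way to do that; every field name and default value is hypothetical and only mirrors the settings discussed in the video.

```python
# Sketch only: keeping a local copy of conversion settings as JSON.
# Field names and defaults are hypothetical, not a Kits AI file format.
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class ConversionPreset:
    name: str
    pitch_shift_semitones: int = 0
    conversion_strength: float = 0.5   # "medium" starting point
    model_volume: float = 0.5          # volume blend
    cut_noise: bool = False
    smooth_volume: bool = False
    compressor: bool = True


def save_preset(preset: ConversionPreset, path: str) -> None:
    """Write the preset to `path` as human-readable JSON."""
    Path(path).write_text(json.dumps(asdict(preset), indent=2), encoding="utf-8")


def load_preset(path: str) -> ConversionPreset:
    """Read a preset previously written by save_preset."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return ConversionPreset(**data)


if __name__ == "__main__":
    preset = ConversionPreset(name="clean studio vocal", model_volume=0.2)
    save_preset(preset, "clean_studio_vocal.json")
    print(load_preset("clean_studio_vocal.json"))
```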


An instructional video on advanced settings for converting voices with Kits AI is presented.

The importance of using advanced settings for better AI voice conversions is emphasized.

The 'Remove Instrumentals' feature can separate vocals from the instrumental in a full song.

Reverb and Delay can be reduced or removed for cleaner audio.

The 'Remove Backing Vocals' feature aids in eliminating ad libs or backup singers.

Pitch Shift tool adjusts audio to match the AI model's range, but may change the key.

Conversion Strength alters how much the AI voice is applied to the input audio.

High Conversion Strength can exaggerate certain sounds and potentially mispronounce words.

Medium Conversion Strength is recommended as a starting point.

Volume Blend affects the smoothness and polish of the converted audio.

High Model Volume is suitable for recordings with varied audio levels or less than ideal conditions.

Low Model Volume preserves the dynamics of the original audio recording.

Pre-processing effects like Cut Noise and Smooth Volume can improve the input audio quality.

Post-processing with a Compressor can enhance volume consistency and presence.

Creative post-processing effects like Chorus, Reverb, and Delay can be used for specific audio needs.

It's crucial to know the desired outcome for the converted audio when using creative effects.

The video provides a practical example of converting audio using the M strange Rock model.

Saving presets allows for quick reuse of preferred settings for future conversions.

The tutorial concludes with a comparison between the original and AI-converted audio, showcasing the effectiveness of the advanced settings.