The Secrets Behind Voice Cloning & AI Covers

bycloud
8 Aug 2023, 16:54

Summary

TLDR: This video provides an overview of the current state of AI voice generation technologies. It explains the two main types: text-to-speech, which converts text into audio, and voice-to-voice conversion, which clones voices. It then covers the popular AI models used for each type, like Tacotron 2 and Tortoise for text-to-speech, and so-vits-svc and RVC for voice cloning. It discusses popular services like UberDuck, FakeYou, and ElevenLabs that leverage these models, comparing their capabilities, limitations, and use cases. Finally, it imagines creative applications, like translating content while retaining the original creator's voice, or combining multiple models to achieve both high quality and convenience.

Takeaways

  • 😃Two main types of voice cloning: text-to-speech synthesis and voice-to-voice conversion
  • 👂Popular text-to-speech backbones: Tacotron 2 and Tortoise TTS
  • 🎤Popular voice-to-voice conversion options: so-vits-svc and RVC
  • 🔊HiFiGAN is a commonly used vocoder for generating audio waveforms
  • 🚀Services: UberDuck, FakeYou, ElevenLabs offer text-to-speech and cloning
  • 🤖ElevenLabs offers easy 1 minute voice cloning with decent quality
  • ⚙️Tortoise + RVC offers high quality custom text-to-speech pipelines
  • 🎙Combining Tortoise and RVC enables fully AI generated narration
  • 💰Brilliant provides structured, interactive STEM learning content
  • 👍Video creator open to collaborating on custom voice pipelines

Q & A

  • What are the two main types of voice cloning technologies discussed?

    -The two main types discussed are pure text-to-speech synthesis and voice-to-voice conversion.

  • What is the difference between Tacotron 2 and Tortoise TTS?

    -Tacotron 2 is faster but lower quality, while Tortoise TTS is slower but higher quality. Tortoise also needs less training data and time.

  • What vocoder is commonly used with these voice synthesis technologies?

    -HiFiGAN is a commonly used vocoder because it can generate high quality and natural sounding speech quickly.

  • What are so-vits-svc and RVC?

    -So-vits-svc and RVC are two popular voice-to-voice conversion technologies, with RVC being a more recent advancement.

  • What services offer pre-trained voice models?

    -Services like Uberduck, FakeYou, and ElevenLabs offer access to pre-trained voice models, both text-to-speech and voice conversion.

  • What tools allow you to train custom voice models?

    -There are open source tools like Tortoise TTS GUI and RVC UI that allow training custom voice models locally.

  • What innovation combines Tortoise TTS and RVC?

    -Using Tortoise TTS output as the reference audio input for RVC allows high-quality text-to-speech without needing a recorded reference.

  • How was the narration in this video generated?

    -The narration was generated using Tortoise TTS trained on the narrator's voice, piped into RVC for smoothing and quality enhancement.

  • What are some applications of this technology?

    -Applications include translating content while keeping the original creator's voice, lip syncing, cloning voice actors, and more.

  • What sponsorship was included?

    -Brilliant.org sponsored the video, offering STEM courses like AI and coding in an intuitive format.

Outlines

00:00

📢 The Two Categories of Voice Synthesis/Generation: Text-to-Speech (TTS) and Voice-to-Voice Conversion

Paragraph 1 introduces and differentiates between the two main categories of voice synthesis: the more basic text-to-speech (TTS) synthesis like Siri, and the more advanced voice-to-voice conversion which enables AI singing by converting a person's vocal into another person's vocal, like the AI Drake song. It explains that TTS can't imitate styles/tones while voice-to-voice conversion can, since it uses audio references as input.

05:00

🔬 The Popular AI Backbones/Models Powering Voice Synthesis and Conversion

Paragraph 2 dives into the popular AI models powering voice synthesis and conversion: Tacotron 2 (used by AI SpongeBob) and Tortoise TTS for TTS, and so-vits-svc and RVC for voice-to-voice conversion. It compares their strengths and weaknesses in terms of quality, data/hardware requirements, and more. Another key component called vocoders (like HiFiGAN) which generate the final audio is also explained.

10:03

📱 Overview of Popular Services Offering Pre-trained Models and Custom Voice Cloning

Paragraph 3 provides an overview of popular services that offer pre-trained voice models and tools for custom voice cloning, covering Uberduck, FakeYou, ElevenLabs, and local UIs for Tacotron 2, Tortoise TTS, so-vits-svc and RVC. Their key features, capabilities, pricing, and more are compared and summarized.

15:08

🤯 Combining Tortoise TTS and RVC for High-Quality Unpaired Text-to-Speech

Paragraph 4 shares an innovative idea to combine Tortoise TTS and RVC to achieve high-quality unpaired text-to-speech without needing an audio reference. This is demonstrated through the narration of the current video, which uses Tortoise + RVC voices instead of the author's actual voice. ElevenLabs' new pro voice cloning feature is also tested and compared.

Keywords

💡Text-to-speech synthesis

This refers to AI systems that can convert text into human-like speech. The video discusses popular text-to-speech models like Tacotron 2 and Tortoise TTS that are used to power many voice cloning services. These are important for enabling AI assistants and audio narration without human voice acting.

💡Voice cloning

Voice cloning enables AI to replicate and mimic a human voice by learning from example audio data. Technologies like SVC and RVC convert an input voice into a target voice with high accuracy. This is used to create realistic fake voices and even auto-translate speeches while preserving the original vocal style.

💡Vocoder

A vocoder is the audio-generation module that turns the mel-spectrogram output of a text-to-speech or voice-conversion model into the final speech waveform. Vocoders like HiFiGAN largely define the realism and naturalness of synthesized speech.

💡Tacotron 2

Tacotron 2 is an influential neural text-to-speech model developed by Google in 2018. It converts text and punctuation into speech spectrograms which are converted into audio by a vocoder. The video praises it for its speed but notes quality limitations.

💡Tortoise TTS

Tortoise TTS is a high-quality but slower text-to-speech model useful for voice cloning. It can accurately clone voices with 30 minutes of data. The video uses Tortoise-generated narration piped through RVC voice conversion to achieve a fully AI-generated narration.

💡SVC (so-vits-svc)

SVC stands for SoftVC vits Singing Voice Conversion and is one of the most popular technologies for accurate voice conversion and singing voice synthesis. It powers many of the celebrity voice cloning apps.

💡RVC

RVC (Retrieval Based Voice Conversion) is a recently-developed voice cloning method that rivals and may surpass SVC in audio quality and training time. It produces highly accurate vocal mimicry given limited data.

💡Uberduck

Uberduck is a popular voice cloning app with thousands of Text-to-Speech and SVC voice models, mainly from user uploads. However, copyright issues have led it to remove user models in favor of officially licensed voices.

💡ElevenLabs

ElevenLabs provides easy voice cloning APIs, including 1-click cloning and advanced professional cloning. It likely uses the Tortoise TTS backbone. The video author compares its cloning quality and convenience to Tortoise + RVC.

💡AI narration

By piping Tortoise TTS outputs into RVC for smoother vocals, fully AI-generated narration for videos is possible without any actual human voice acting. This could save costs for creators wanting translated content in their own voices.

Highlights

There are two main types of voice cloning: text-to-speech synthesis and voice-to-voice conversion.

Tacotron 2 is a popular text-to-speech model developed by Google and NVIDIA, known for its speed but lower audio quality.

Tortoise TTS is a high-quality but slower text-to-speech model that needs less training data and time than Tacotron 2.

HiFiGAN is a commonly used vocoder that generates high-fidelity audio waveforms from mel spectrograms produced by text-to-speech models.

so-vits-svc and RVC are two popular voice-to-voice conversion models, with RVC being the likely technology behind the AI Drake song.

Uberduck, FakeYou, and ElevenLabs offer text-to-speech and voice cloning services, with varying features, constraints, and pricing models.

Custom UIs like the Tacotron 2 voice cloning app make it easy to clone voices locally using open-source text-to-speech models.

Chaining Tortoise TTS and RVC creates a high-quality customizable text-to-speech pipeline without needing voice recordings.

Potential creative applications include translating content while retaining the original speaker's voice.

ElevenLabs requires less effort but produces lower quality voice clones compared to chaining open source models.

The choice between services like ElevenLabs and custom solutions depends on priorities like quality, speed, effort, and cost.

Text-to-speech and vocoder models have different strengths - combining them can optimize for different priorities.

Cultural differences may drive varying priorities in AI speech research, with US focused on text-to-speech and China on voice conversion.

The quality and intelligibility of voice clones depend greatly on accent, with native English speakers working best currently.

Rapid innovation quickly makes information outdated; modular architectures allow combining the latest components.

Transcripts

play00:00

You might have seen the weird text-to-speech AI from AI SpongeBob

play00:03

or the insanely good AI Drake songs

play00:05

wondered what are all these AIs

play00:08

and how can they be so good and so bad at the same time?

play00:12

Well today, I will be here to demystify everything for you.

play00:15

Let's break down what all the AI technologies people are using

play00:18

what resources are actually available to you

play00:20

and how people have made Donald Trump and Joe Biden play Overwatch together.

play00:24

But first, in order to differentiate them easier, we would need to dig a bit into how they actually work and what makes all of them different from each other.

play00:32

Currently, voice, speech, saying, or whatever generation you might call it can be generalized into two main categories.

play00:38

The first one is the classic text-to-speech synthesis, or what I like to call pure text-to-speech.

play00:44

This is pretty straightforward, it's like Siri replying to you, or the TikTok text-to-speech for commentating videos, because the AI only uses

play00:51

text to generate the audio.

play00:53

The AIs I'm talking about today are the ones that let you generate your own custom voices, which is much cooler.

play00:58

But the coolest is definitely the second category, voice to voice conversion.

play01:02

This is the one that enables AI generated singing, which is how the AI Drake song was made.

play01:07

If you don't know the infamous AI Drake song, go search it real quick because it's surprisingly good.

play01:11

I just can't play it here because that song has become a DMCA landmine.

play01:14

The music record label did not take that well for sure.

play01:17

Anyways, this type of speech or even voice synthesis requires an audio reference as an input.

play01:21

So how AI Drake was made was that you basically take someone's vocal like Drake's

play01:26

throw it into the AI to train and learn his voice

play01:28

then have a person sing a song and the AI would be able to use that singing to convert the person's vocal into Drake's.

play01:33

The small but significant difference between the two is that pure text-to-speech doesn't let you imitate sounds or style of speech.

play01:40

So if you want to make sounds like ohhhhh

play01:42

or speak in a certain tone

play01:44

only voice-to-voice would be able to copy that

play01:46

since text-to-voice only reads text, and those things cannot be accurately expressed through text.

play01:51

So now that you know the two main types of voice cloning, you probably want to try them out.

play01:55

But before I start information-dumping you about the best apps or services

play01:59

let's talk about the backbones or basically the research papers that carry most of the voice cloning tools right now

play02:04

so you would understand your options a lot more.

play02:06

For text-to-voice synthesis, there are two main research papers that are currently the most popular.

play02:11

The first one is Tacotron 2, which is the one that you are listening to right now.

play02:15

The research was published by Google and the implementation built by Nvidia back in 2018, and it's the one that AI SpongeBob uses to do all the text-to-speech in the livestream.

play02:24

The main advantage is that it's fast, but the trade-off is that the quality is just not up there.

play02:29

What's worse for Tacotron 2 though is that it needs to be fine-tuned for around 2-3 days to copy someone's voice decently

play02:35

and people usually use around 3 hours of the target voice data so it would definitely be the last option for some.

play02:40

However, since Tacotron 2 is pretty old, it has one of the largest custom voice libraries online and I'll talk about those libraries or services later.

play02:48

Second one is Tortoise TTS, which is probably the most popular research or backbone for most of the voice cloning programs.

play02:54

Developed by James Betker, this bad boy can fit so many voices in it because it only needs around 30 minutes of data and 30 minutes of training time.

play03:01

And this is how 30 minutes of data and 30 minutes of training sounds.

play03:05

The main downside is that since Tortoise is diffusion based, it is much slower at generating voices, especially if you're not running it on a GPU.

play03:13

So if you want something high quality, the worst case could be taking 10 minutes to render just one sentence.

play03:18

However, the quality is much better than Tacotron 2, and it retains the consistency of the voice it's copying from much better.

play03:24

The problem is that it'll sometimes act up and reread a sentence twice if your training data is flawed, which is pretty funny.

play03:30

How can they be so good and so bad at the same time?

play03:33

And how can they be so good and so bad at the same time?

play03:38

But what about their frequencies and audio quality?

play03:40

Are they the same?

play03:41

Well, actually, there is something else that is handling this task.

play03:44

You see, these two AI models are the synthesis module, where the AI converts the text into audio spectrograms, more specifically, mel spectrograms.

play03:52

Audio frequencies are generated by a second module called the vocoder, where the AI generates audio waveform from those audio spectrograms.

play03:59

There are loads of vocoders like WaveNet or WaveGlow, but the most popular one right now is called HiFiGAN, which pretty much most synthesizers use.

play04:06

HiFiGAN is popular because of its superior performance in generating high-fidelity, natural-sounding speech.

play04:12

It is designed to create high-resolution, high-quality audio waveforms from mel spectrograms with incredible speed.

play04:21

Depending on what frequencies the synthesizer is trained on, it can generate from 22 kHz up to 48 kHz.

play04:24

It does this using a generator and discriminator architecture similar to GANs.

play04:28

but with several enhancements specifically for audio synthesis.
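Since the synthesizer hands the vocoder mel spectrograms rather than raw audio, it helps to see what one frame actually looks like. Below is a minimal numpy sketch of a mel filterbank; parameter values like 80 mel bands and a 1024-point FFT are common defaults, not anything HiFiGAN-specific:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels=80, n_fft=1024, sr=22050, fmin=0.0, fmax=None):
    """Triangular filters evenly spaced on the mel (perceptual pitch) scale."""
    fmax = fmax or sr / 2
    mel_pts = np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)  # rising slope
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)  # falling slope
    return fb

# A synthesizer like Tacotron 2 outputs frames of (n_mels,) values;
# the vocoder (e.g. HiFiGAN) learns the inverse mapping back to waveform samples.
fb = mel_filterbank()
power_spectrum = np.abs(np.fft.rfft(np.random.randn(1024))) ** 2
mel_frame = fb @ power_spectrum  # one mel-spectrogram frame
```

A GAN vocoder is trained so that, given a whole sequence of such frames, its generator emits a waveform whose own mel spectrogram matches them while its discriminator judges realism.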

play04:31

However, some people choose to use other vocoders for different synthesizers; there are other models like BigVGAN,

play04:37

which can sound better for text-to-speech synthesizers

play04:41

and there are also supersampling techniques to upscale audio from 22kHz to 32kHz

play04:46

but these are definitely a bit too complicated and a story for another day.
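For a bit of intuition on why supersampling needs AI at all, here is a plain numpy resample for comparison: it changes the sample rate but cannot invent the missing high-frequency content above the original Nyquist limit, which is exactly what the learned upscalers predict.

```python
import numpy as np

def naive_upsample(x, sr_in=22050, sr_out=32000):
    """Linear-interpolation resample: raises the sample rate but adds no new
    high-frequency content, unlike an AI audio supersampler."""
    n_out = int(len(x) * sr_out / sr_in)
    t_in = np.arange(len(x)) / sr_in
    t_out = np.arange(n_out) / sr_out
    return np.interp(t_out, t_in, x)

tone = np.sin(2 * np.pi * 440 * np.arange(22050) / 22050)  # 1 second at 22.05 kHz
upsampled = naive_upsample(tone)  # 1 second at 32 kHz, same audible content
```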

play04:50

These vocoders are also used in voice-to-voice conversions, with roughly the same logic.

play04:55

which is a perfect time for me to introduce you to the two main popular options for voice-to-voice conversions.

play05:00

These are the ones that can get insane audio quality, like up to 48kHz, because they don't use text to synthesize at all.

play05:06

The lore of these AIs is a bit complicated because they are like research on top of implementations on top of more implementations.

play05:13

So I'mma be lazy and choose the easy way to refer to them as AI software.

play05:17

The first one is so-vits-svc, which stands for SoftVC vits Singing Voice Conversion.

play05:22

As the name suggests, the main characters of this software are SoftVC and vits

play05:27

to achieve the voice-to-voice conversions, with a bunch of other side characters

play05:30

like parts of DiffSinger, Hubert, and HiFiGAN to support this function.

play05:34

With each update, they often reorganize or change the combinations and designs of the AI architecture, so it's really hard to go into details without getting outdated in like a month.

play05:43

For reference, this is VISVC, another type of implementation that combines vits with other AI architectures, and they take apart, recombine, or redesign them every other version.

play05:52

It can get really messy for any non-researchers looking at it, so yeah, let's not go into it.

play05:57

Right now, so-vits-svc is probably the most popular one, with over 16.8k stars on GitHub.

play06:02

However, the second one, RVC, short for Retrieval Based Voice Conversion, which is like a direct successor of so-vits-svc, may be a better choice.

play06:10

Released on March 23rd, 2023, a few days later than so-vits-svc, it can generate more consistent and accurate vocals, and is just hands down a better package with faster training time, needs less amount of data, and lower hardware requirements.

play06:23

It was technically better than SVC, especially quality-wise,

play06:26

back when SVC used an encoder called HuBERT; RVC instead uses ContentVec, an improved version of HuBERT,

play06:32

with the addition of a retrieval functionality within RVC that can reduce tone leakage.

play06:36

However, with the latest so-vits-svc 4.0 update, they both now use similar components and architectures.

play06:42

And some people would still prefer SVC because it has a bigger community surrounding it.

play06:46

I am pretty sure that RVC is the technology behind the AI Drake song

play06:50

Because back when it was made, SVC's quality was definitely not up to that level.

play06:54

What do you think about the audio quality difference?

play06:56

What's up guys, welcome back to another episode of the AI Timeline

play07:00

What's up guys, welcome back to another episode of the AI Timeline

play07:03

Where I cover the coolest AI

play07:04

developments in the past week.

play07:06

Where I cover the coolest AI

play07:07

developments in the past week.

play07:08

Pure singing and pure speech can make a difference too, so here's a quick singing comparison.

play07:12

♪ Legends never die ♪

play07:16

♪ Legends never die ♪

play07:20

♪ Can you hear them screaming out your name? ♪

play07:26

♪ Can you hear them screaming out your name? ♪

play07:32

It's also interesting to note that most of the top-tier text-to-speech research is made by US researchers

play07:39

and most of the top-tier voice conversions and related papers are all published or made by Chinese researchers.

play07:44

The SVC and RVC GitHub pages were originally in Chinese too.

play07:48

It's funny how there are different priorities for researchers with different cultures.

play07:53

What's that sound?

play07:54

Oh my god, it's TalkNet.

play07:55

It's TalkNet, guys.

play07:56

We forgot about it.

play07:57

TalkNet is a text-to-speech synthesis research, but with the ability to also input an arpabet, which is a pronunciation notation to specify a sound.
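As an illustration of what ARPAbet buys you, the two entries below follow CMUdict conventions (digits mark vowel stress). They are a hand-picked example, not a real lexicon: the heteronym "read" is ambiguous as text but unambiguous as phonemes, which is exactly the kind of hint TalkNet accepts.

```python
# ARPAbet spells pronunciation phoneme by phoneme. Illustrative entries only.
ARPABET_HINTS = {
    "read (present tense)": "R IY1 D",  # rhymes with "reed"
    "read (past tense)": "R EH1 D",     # rhymes with "red"
}

def phonemes(entry):
    """Split an ARPAbet string into the phoneme list a TTS model consumes."""
    return ARPABET_HINTS[entry].split()
```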

play08:05

I've made a video about it.

play08:06

You can check it out if you're interested.

play08:08

I'm not going to cover it in this video since it ain't the best now.

play08:11

All right, since we now know all these backbones, it's time to move on to services and the most popular ones to see what they are about.

play08:17

Oh, and I am not sponsored by any of them.

play08:20

UberDuck probably has the largest online library for both text-to-speech and voice-to-voice models.

play08:25

But unfortunately, in a recent update, they have removed all the user-uploaded models from their website and have turned into a commercial-friendly service instead.

play08:32

Potentially due to the amount of takedowns for their voice clone models from the Universal Music Group.

play08:37

So a moment of silence for UberDuck.

play08:42

And there's FakeYou, which is another similar service with 3,000+ Tacotron 2 TTS models and 400+ SVC models, and it has a pretty nice user interface.

play08:51

Its library is slightly smaller compared to UberDuck and it'll let you generate for free, but you would usually have to wait in a very long queue.

play08:57

Additionally, free TTS is only limited to 12 seconds.

play09:00

It also has a Wave2Lip Video lip-syncing service where you input an audio and it'll lip-sync the person in the video to that audio for you.

play09:07

So the paid subscriptions are pretty much getting you off the queue.

play09:10

They are planning to support RVC models very soon too.

play09:13

It's important to note that the two services above do not offer any voice cloning, and you would only be able to use the pre-trained models uploaded by other people.

play09:21

This may be because custom voice cloning with Tacotron 2 or SVC requires too much hardware, so they did not make it an available service.

play09:29

However, the last service does offer voice cloning, but not for singing, only text-to-speech.

play09:35

ElevenLabs, which you might have heard of, given its extreme ease of copying and cloning voices, took the internet by storm a few months ago.

play09:41

The infamous Biden, Trump, and Obama playing Overwatch was probably made with ElevenLabs too.

play09:46

I actually think its backbone is Tortoise-based, because just listen to these two audio clips.

play09:50

You might have seen the weird text-to-speech AI from AI SpongeBob.

play09:54

You might have seen the weird text-to-speech AI from AI SpongeBob.

play09:58

ElevenLabs offers a few basic things, but the highlight is basically the voice cloning function it has.

play10:03

In its instant voice cloning function, taking in just around 1 minute of voice as a sample, it can clone the voice pretty well, probably the best quality out of all the currently existing tools.
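For reference, here is a minimal sketch of how a cloned voice could be driven over ElevenLabs' HTTP API. The endpoint path, `xi-api-key` header, and JSON body reflect my understanding of the public API around this time; treat them as assumptions and check the current documentation before relying on them.

```python
import json
import urllib.request

def build_tts_request(api_key, voice_id, text):
    """Build (but do not send) a text-to-speech request for a cloned voice.
    Endpoint and header names are assumptions based on the public API docs."""
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}"
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
    )

req = build_tts_request("YOUR_API_KEY", "YOUR_VOICE_ID", "Hello from a cloned voice.")
# urllib.request.urlopen(req) would stream back the generated audio bytes.
```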

play10:13

However, there is a downside: the voice that you're cloning has to be fluent in English, or else, if a voice like mine is used, it can sound a bit weird and unnatural.

play10:21

They have recently just released an English V2 model so it definitely sounds a lot better compared to English V1.

play10:26

I think they are planning to release a new voice conversion function but the details are still unclear.

play10:31

They do have another special function called professional voice cloning where you would give them around 80 minutes of voice data and they would clone it for you.

play10:38

However, it is only limited to one voice for a 22 bucks per month subscription right now, and if you do not want to pay anything, especially if you just want to clone your voice, there are free local UIs you can use.

play10:49

Which is perfect if you have a computer that has at least 8GB of VRAM.

play10:53

For Tacotron 2, check out this voice cloning app.

play10:56

You can basically do most of the stuff about Tacotron 2 on here, and the wiki has made it pretty straightforward too.

play11:01

It includes tools to help you sort and organize your data before training, load other pre-trained models, and generate results.

play11:07

If you need a tutorial, check out this one made by the author of this GUI.

play11:11

I forgot to mention that all the training data for any of these requires the speaker's voice without any background noise.

play11:16

So if you do need to separate voice from music or noise, check out this tool.
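As a rough sanity check on your own clips (this is just an illustrative heuristic, not the separation tool mentioned above), you can estimate a recording's noise floor from its quietest frames: clean narration has near-silent gaps, while background music or hum keeps the floor high.

```python
import numpy as np

def noise_floor_db(audio, frame=2048):
    """Estimate the noise floor as the RMS of the quietest frame, in dBFS.
    A high floor suggests background noise or music that needs cleanup."""
    n = len(audio) // frame
    frames = audio[: n * frame].reshape(n, frame)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    return 20 * np.log10(max(rms.min(), 1e-10))  # guard against log(0)

rng = np.random.default_rng(0)
speech_like = np.concatenate([np.zeros(4096),                       # silent pause
                              0.3 * rng.standard_normal(40960)])    # "speech"
noisy = speech_like + 0.01 * rng.standard_normal(len(speech_like))  # added hiss
```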

play11:20

For Tortoise TTS, check out this repo called AI Voice Cloning made by MRQ.

play11:24

This is probably the most actively maintained Tortoise web UI on the internet, and it has most of the stuff you would also need including training and inferencing.

play11:31

Check out this tutorial to learn how to use it.

play11:33

For so-vits-svc, check out this GUI and this tutorial is made perfectly for it.

play11:38

For RVC, the official web UI is already good enough and is actively maintained, so definitely use that.

play11:43

You can check out this tutorial to learn how to use it too.

play11:46

Alright, that was probably the most boring information dump I've ever made, but hear me out, there's more good stuff coming.

play11:51

Since voice-to-voice conversion requires a reference audio

play11:54

what if you use the text-to-speech result as the reference audio

play11:57

so you can have a fully text-to-speech AI with way better quality without the need of a reference?

play12:02

The custom pipeline of combining Tortoise’s result and using it as a reference for RVC to generate an unpaired text-to-speech is completely possible.
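The chaining idea, sketched as code. The function names below are placeholders for whatever Tortoise and RVC front-ends you run, not real APIs; the sketch only shows the data flow from text to TTS audio to converted audio.

```python
def chained_tts(sentences, tortoise_generate, rvc_convert):
    """Hypothetical glue: Tortoise turns text into rough audio in the target
    voice, and that audio becomes the reference input RVC needs to produce
    the smoothed final clip. No recorded reference audio is required."""
    final_clips = []
    for sentence in sentences:
        rough_audio = tortoise_generate(sentence)     # text -> audio
        final_clips.append(rvc_convert(rough_audio))  # audio -> better audio
    return final_clips

# Stub stand-ins to show the flow; a real run would load the trained models.
demo = chained_tts(
    ["Hello there.", "This narration is fully generated."],
    tortoise_generate=lambda text: f"<rough:{text}>",
    rvc_convert=lambda audio: audio.replace("rough", "smooth"),
)
```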

play12:10

So if you take two voice clone models like Tortoise and RVC and use the Tortoise output as an input for RVC, then technically I do not need to narrate my script myself.

play12:19

I can have Tortoise read and maintain most of my speech style while RVC smooths the audio out, which is exactly what I did for this video's narration.

play12:26

Some of you might have already realized, but yes, none of the audio in this video is a recording of my voice.

play12:32

This is how my voice actually sounds.

play12:34

Pretty cool, right?

play12:35

One thing I should correct though: I used a Tortoise model that was trained on 4 hours of my voice with around 4,500 iterations, which is like 11 hours of training time on a 4080.

play12:45

Shoutout to Synthetic Voices for helping me out with this.

play12:48

Right now I think there are a lot of great ways to utilize this AI tech like helping content creators to translate their content to another language while maintaining their own voice.

play12:56

On top of that, syncing lips ain't that difficult too thanks to AI.

play13:00

And if you already have a voice actor, voice to voice might just do the perfect job.

play13:04

So with the team that I've worked with in this video, I think it's totally possible to create a custom tailored pipeline for everyone.

play13:10

If you're a content creator and are interested in this, shoot me a DM on Twitter and maybe I can make a few demos for you.

play13:15

Okay, just a few days before publishing this video, I received a message from Eleven Labs that the pro fine-tuned voice cloning, which is limited to only one voice per account, has been completed.

play13:25

This took somewhere around one and a half hours of my clear voice to fine-tune.

play13:30

Unlike its standard voice cloning, which only takes like one minute of reference voice.

play13:34

And to answer the age-old question, which one is the best?

play13:37

Well, in my opinion, Eleven Labs took way less effort to use since I can just paste a whole paragraph into it and it'll generate everything coherently without any bugs.

play13:46

While Tortoise + RVC needs to do two manual passes for each sentence.

play13:50

And you have to double check if they have generated correctly because it can screw up very easily.

play13:54

No need to tidy up transcriptions before training or whatsoever, so this is probably the most convenient option.

play14:00

I can't really say that it's also the faster option because I don't know how long it was trained on, and ElevenLabs gave me no specific time frame of when this would be done.

play14:08

If I recall correctly, I submitted this nearly 2 months ago, and only got it today too.

play14:13

However, the inference time seems to be really fast.

play14:16

I only had to wait like 10 seconds before it started generating and continuously expanding the voice audio for 3 paragraphs.

play14:22

However, if you are looking for the best quality and resemblance in pure text voice cloning, Tortoise + RVC combo is unbeatable.

play14:29

I think you might have picked up by now, and yes, the audio you are hearing for this additional comparison is fully generated by Eleven Labs' pro voice cloning, not the combo.

play14:37

So I do think quality wise, this is not as good as the open source and the free option right now.

play14:42

I think it may be a problem with the vocoder, which I mentioned before; it'll perform better on people with lighter accents, and it'll probably work best on native speakers.

play14:52

But the downside of the combo is that it takes too much data, too much time, and too much effort to get everything right, so it really depends on how much time you have.

play15:02

The better quality but longer and more effortful option would be the Tortoise + RVC combo,

play15:08

and the quicker, a lot more convenient option, with less superior quality, would be ElevenLabs pro voice cloning

play15:15

and this voice clone is only trained on one minute of Asmongold's voice so ElevenLabs may be better in this case

play15:21

Holy shit!

play15:22

This voice clone is just too real please subscribe to bycloud

play15:26

I'll give it back to him I think he'll talk about some sponsors or something

play15:29

And today's video is brought to you by Brilliant.org.

play15:31

You know, I've actually been asked a lot of times about where to start learning the fundamentals of AI and machine learning.

play15:36

And you know what I always say?

play15:38

YouTube videos because they are free, but free content comes with compromises.

play15:43

The quality of the learning content can be cheap, the learning materials can be hard to interact with, and a clear roadmap to master the field is not always there.

play15:51

So this is where Brilliant.org comes in.

play15:54

On top of that, Brilliant also provides a clear roadmap on different subjects for all knowledge levels.

play15:58

From basic algebra to advanced multivariable calculus, from programming with Python to artificial neural networks, Brilliant is full of STEM courses that are usually a pain to study in but made into a much friendlier and digestible format.

play16:11

I had actually used it a lot during my time in high school too.

play16:14

It helped me when I got introduced into the cursed world of calculus and somehow helped me to survive it because it made understanding the difficult concepts much easier with these very intuitive interactive lessons.

play16:24

So yeah, quickly get started on Brilliant by heading to brilliant.org slash bycloud for a free 30-day experience with Brilliant's ever-expanding interactive lessons, and the first 200 people to sign up will get 20% off an annual membership.

play16:38

Thank you Brilliant for sponsoring this video.

play16:40

A big shout out to (patreon names here) and many others that support me through Patreon or YouTube.

play16:51

Follow me on Twitter if you haven't, and I'll see you all in the next one.

