Unsupervised Learning: Crash Course AI #6

CrashCourse
20 Sept 2019 (12:35)

Summary

TLDR: Crash Course AI explores unsupervised learning, where AI models learn from data without teacher-provided labels. Key concepts include clustering, where AI groups similar data points, and representation learning, which helps AI understand and compare complex data like images. The video uses the K-means algorithm to demonstrate how AI can identify patterns in iris flowers, and discusses the challenges and importance of unsupervised learning in advancing AI capabilities.

Takeaways

  • 🤖 Unsupervised learning allows computers to learn from data without needing labeled examples, similar to how humans learn by observing patterns.
  • 🧠 The difference between supervised and unsupervised learning lies in the presence of a teacher or labels; supervised learning predicts answers with labeled data, while unsupervised learning models the world based on patterns.
  • 🌼 A practical example of unsupervised learning is clustering, where objects are grouped based on shared properties, as demonstrated by categorizing flowers by color, shape, or species.
  • 📊 K-means clustering is an algorithm used for unsupervised learning that involves predicting and correcting the model's understanding of data clusters based on observed patterns.
  • 🔍 Representation learning is a process in unsupervised learning where the AI learns to identify abstract patterns in data, such as recognizing features in images beyond individual pixels.
  • 🎨 The script uses the analogy of drawing from memory to explain how representation learning works, where the mind reconstructs an image based on remembered features.
  • 🌐 Unsupervised learning is crucial for AI as it mimics the human brain's ability to learn from the environment, which is essential for AI's grand ambitions.
  • 🏆 Yann LeCun, a Turing Award winner, emphasizes the importance of unsupervised learning for AI's future, suggesting it's the 'ultimate answer'.
  • 🌱 The script highlights the ongoing research in unsupervised learning, noting the complexity of designing AI systems that can effectively learn and recognize patterns like the human brain.
  • 🌐 The application of unsupervised learning extends to natural language processing, where AI systems are trained to find patterns in words and language, which will be explored in future episodes.

Q & A

  • What is the main difference between supervised and unsupervised learning?

    -Supervised learning requires labeled data and a teacher to guide the model, whereas unsupervised learning does not require labels or a teacher, and the model learns by finding patterns in the data.

  • How do humans naturally perform unsupervised learning?

    -Humans perform unsupervised learning by observing the world and identifying patterns without explicit instruction, such as recognizing different animals or understanding the rules of a game by watching.

  • What is unsupervised clustering?

    -Unsupervised clustering is the process of grouping similar objects together based on their properties without the use of predefined labels.

  • How does the K-means clustering algorithm work?

    -The K-means clustering algorithm works by choosing the number of clusters (K), starting from random cluster averages, labeling each data point with its nearest average, recalculating the average of each cluster, and repeating these two steps until the averages stop moving.

  • What are the two key questions that need to be answered when constructing a model for unsupervised learning?

    -The two key questions are: 1) What observations can we measure? and 2) How do we want to represent the world?

  • Why is it important to measure petal length and width in the iris flower example?

    -Petal length and width are chosen as measurements because they are distinguishing features that can help differentiate between different species of iris flowers.

  • What is representation learning in the context of unsupervised learning?

    -Representation learning is the process of finding meaningful patterns in data that are more abstract than individual data points, which helps in understanding and comparing the data.

  • How does the autoencoder neural network contribute to representation learning?

    -An autoencoder neural network contributes to representation learning by encoding the input data into a representation and then decoding it to reconstruct the original input, thereby learning meaningful patterns.

  • Why is unsupervised learning considered the 'ultimate answer' by Professor Yann LeCun?

    -Unsupervised learning is considered the 'ultimate answer' because it mimics how humans learn from the environment without explicit supervision, which is a key aspect of AI's grand ambitions to understand and interact with the world.

  • What is the challenge in designing AI systems for effective unsupervised learning?

    -The challenge lies in the fact that AI systems cannot learn exactly like humans do through simple observation and imitation; they require specifically designed models and guidance on how to find patterns.

  • How does the script suggest that unsupervised learning can be applied to natural language processing?

    -The script hints that unsupervised learning can be applied to natural language processing by finding patterns in words and language, similar to how it finds patterns in other types of data.

Outlines

00:00

🌟 Introduction to Unsupervised Learning

This paragraph introduces the concept of unsupervised learning, contrasting it with supervised learning. Unlike supervised learning, which relies on labeled data and a teacher to correct errors, unsupervised learning allows computers to find patterns in data without explicit guidance. The paragraph uses examples like recognizing different animals or understanding sports rules by observation alone. It also touches on the potential of unsupervised learning to utilize freely available data and the idea of clustering, which is grouping similar objects together based on shared properties.

05:01

🌼 Exploring K-means Clustering

The second paragraph delves into the practical application of unsupervised learning through the K-means clustering algorithm. It uses the example of different iris species to illustrate how the algorithm works. The process begins with an assumption about the number of clusters (K) and with random cluster centers (averages), so the first labels are effectively random. The algorithm then alternates between two steps: it recalculates each center as the average of the points labeled with it, and it relabels each data point with its nearest center. The goal is to achieve a stable model where the positions of cluster centers no longer change, representing a meaningful organization of the data.
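The predict-and-learn loop described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the video: the function names, the choice to start from K randomly picked data points, and the flower measurements are all assumptions made for the example.

```python
import math
import random


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def mean(points):
    """Coordinate-wise average of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, seed=0, max_iters=100):
    """Alternate predict and learn steps until the centers stop moving."""
    rng = random.Random(seed)
    # Assumption: use k randomly chosen data points as the starting averages.
    centers = rng.sample(points, k)
    labels = []
    for _ in range(max_iters):
        # Predict: label every point with its nearest center.
        labels = [nearest(p, centers) for p in points]
        # Learn: move each center to the mean of the points labeled with it.
        new_centers = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            # Keep the old center if no point was assigned to it.
            new_centers.append(mean(members) if members else centers[i])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Made-up (petal width, petal length) measurements in cm, for illustration only.
flowers = [
    (0.2, 1.4), (0.3, 1.5), (0.2, 1.3),
    (1.3, 4.2), (1.4, 4.5), (1.5, 4.6),
    (2.1, 5.8), (2.3, 6.0), (2.0, 5.7),
]
centers, labels = k_means(flowers, k=3)
```

Because the starting averages are random, different seeds can settle on different groupings, which is one reason the result still has to be checked against the real world, as the video does with the master gardener.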

10:02

🤖 Representation Learning and Autoencoders

The final paragraph discusses representation learning, a method for creating abstract representations of data that can capture complex patterns. It contrasts this with the simpler averaging used in K-means clustering, emphasizing the need for more sophisticated techniques when dealing with high-dimensional data like images. The paragraph introduces autoencoders, a type of neural network that learns to encode and decode data, effectively learning a compressed representation of the input. The discussion highlights the potential of unsupervised learning to mimic human learning processes and the ongoing challenges in developing AI systems capable of such learning.

Keywords

💡Unsupervised Learning

Unsupervised Learning is a type of machine learning where the model is trained on input data without any labeled responses. It is used to explore the underlying structure of the data and find patterns or relationships within it. In the video, unsupervised learning is compared to how humans learn by observing the world and identifying patterns without explicit instruction. The video uses the example of recognizing different types of flowers or understanding the rules of a sport by watching it being played.

💡Supervised Learning

Supervised Learning is a category of machine learning where the model is trained on labeled data. The model learns to predict outcomes based on the input data and the corresponding labels provided by a 'teacher'. It is contrasted with unsupervised learning in the video, where the need for labeled data is highlighted. Supervised learning is likened to a classroom setting where a teacher provides guidance and correct answers.

💡Clustering

Clustering is a method within unsupervised learning that groups a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups. The video explains clustering through the example of categorizing flowers based on their properties like color and petal shape, without any pre-existing labels.

💡K-means Clustering

K-means Clustering is a specific algorithm used for clustering in unsupervised learning. It partitions the data into K distinct non-overlapping subgroups (clusters) where each data point belongs to the cluster with the nearest mean. The video uses the K-means algorithm to demonstrate how to find patterns in iris flowers based on petal length and width, aiming to identify different species.
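In symbols, the two alternating steps can be written as follows. The notation is introduced here for illustration and does not appear in the video: x_i is the i-th data point, mu_j is the mean of cluster j, c_i is the label given to point i, and C_j is the set of points currently labeled j.

```latex
% Predict: label each point with its nearest cluster mean
c_i = \arg\min_{j \in \{1, \dots, K\}} \lVert x_i - \mu_j \rVert

% Learn: move each cluster mean to the average of its points
\mu_j = \frac{1}{\lvert C_j \rvert} \sum_{i \in C_j} x_i,
\qquad C_j = \{\, i : c_i = j \,\}
```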

💡Representation Learning

Representation Learning is the process of learning a representation for the data in a form that is useful for solving tasks. It is a concept within unsupervised learning where the model learns to identify meaningful patterns in the data. In the video, representation learning is discussed in the context of images, where the model must learn to recognize and compare complex features rather than individual pixels.

💡Autoencoder

An Autoencoder is a type of artificial neural network used to learn efficient data codings in an unsupervised manner. The network learns to encode the input into a hidden representation and then decode it to reconstruct the input. It is mentioned in the video as a method for unsupervised learning that can be used to process and reconstruct images, thus learning their internal representations.
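To make the encode-then-reconstruct idea concrete, here is a deliberately tiny sketch in plain Python. It is not the network from the video: it assumes two input features, a single hidden value, linear layers without biases, fixed starting weights, and made-up data, and it trains with ordinary gradient descent on the squared reconstruction error.

```python
def encode(x, w_enc):
    """Compress a 2-feature input into a single hidden value."""
    return w_enc[0] * x[0] + w_enc[1] * x[1]


def decode(h, w_dec):
    """Reconstruct the 2 features from the hidden value."""
    return (w_dec[0] * h, w_dec[1] * h)


def loss(data, w_enc, w_dec):
    """Mean squared reconstruction error over the data set."""
    total = 0.0
    for x in data:
        x_hat = decode(encode(x, w_enc), w_dec)
        total += (x_hat[0] - x[0]) ** 2 + (x_hat[1] - x[1]) ** 2
    return total / len(data)


def train(data, epochs=1000, lr=0.1):
    """Full-batch gradient descent on the reconstruction error."""
    # Fixed small starting weights keep the run reproducible (an assumption;
    # random starting weights are more usual).
    w_enc = [0.1, 0.1]
    w_dec = [0.1, 0.1]
    n = len(data)
    for _ in range(epochs):
        g_enc = [0.0, 0.0]
        g_dec = [0.0, 0.0]
        for x in data:
            h = encode(x, w_enc)
            x_hat = decode(h, w_dec)
            err = (x_hat[0] - x[0], x_hat[1] - x[1])
            # Gradient of the squared error with respect to the hidden value.
            g_h = 2 * (err[0] * w_dec[0] + err[1] * w_dec[1])
            for k in range(2):
                g_dec[k] += 2 * err[k] * h / n
                g_enc[k] += g_h * x[k] / n
        for k in range(2):
            w_enc[k] -= lr * g_enc[k]
            w_dec[k] -= lr * g_dec[k]
    return w_enc, w_dec


# Made-up inputs lying on a line, so one hidden value is enough to describe them.
data = [(0.1, 0.2), (0.2, 0.4), (0.3, 0.6), (0.4, 0.8)]
initial_loss = loss(data, [0.1, 0.1], [0.1, 0.1])
w_enc, w_dec = train(data)
final_loss = loss(data, w_enc, w_dec)
```

The hidden value plays the role of the learned representation: no labels are used, and the only training signal is how far each reconstruction is from its own input. Real image autoencoders use many more units and nonlinear layers.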

💡Pattern Recognition

Pattern Recognition is the ability to identify regularities or patterns in data. It is a fundamental concept in machine learning, especially in unsupervised learning, where the model must discover the underlying structure of the data. The video discusses how humans naturally recognize patterns, such as distinguishing between different animals or understanding sports rules, and how AI can be designed to mimic this ability.

💡Data Points

Data Points are individual entries or observations in a dataset. In the context of the video, data points are the measurements of iris flowers' petal lengths and widths, which are used as inputs for the K-means clustering algorithm. The video explains how these data points are grouped into clusters based on their similarities.

💡Model

A Model in machine learning refers to a mathematical representation of a system or process that is used to make predictions or decisions. The video discusses how models in unsupervised learning are used to predict and understand the structure of the world without the guidance of labeled data, similar to how humans learn from their environment.

💡Algorithm

An Algorithm is a set of rules or steps used to solve a problem or perform a computation. In the video, algorithms like K-means clustering are used to perform unsupervised learning tasks. The algorithm is described as a method that takes in data, makes predictions about the world, and then learns by correcting itself based on the observed data.

💡Feature

In machine learning, a Feature is an individual measurable property or characteristic of a phenomenon being observed. Features are used as inputs for the model. The video uses the example of petal length and width as features of iris flowers, which are then used by the K-means algorithm to categorize the flowers into clusters.

Highlights

Unsupervised learning allows computers to learn without a teacher by finding patterns in the world.

Humans can learn through observation, like recognizing different animals or understanding sports rules without being explicitly taught.

The key difference between supervised and unsupervised learning is the presence or absence of a teacher providing labels.

Unsupervised learning is useful for utilizing freely available data without the need for labeled information.

Unsupervised clustering is the process of recognizing different properties and creating categories without labels.

AI can be programmed to perform clustering by choosing properties of interest, such as color or shape.

The K-means clustering algorithm is used for unsupervised learning, requiring a method to compare observations and calculate averages.

The K-means algorithm involves predicting and learning steps to model the world and find clusters in data.

Unsupervised learning can be applied to real-world problems, such as identifying different species of iris flowers.

Representation Learning is about finding meaningful patterns in data that are more abstract than individual data points.

Autoencoders are a type of neural network that can be used for unsupervised learning and representation learning on images.

Unsupervised learning allows AI to learn from the world as its teacher, similar to how humans learn through observation.

Professor Yann LeCun considers unsupervised learning as the ultimate answer for AI's grand ambitions.

Building AI systems that can perform effective unsupervised learning is challenging due to the differences in human and AI learning processes.

The human brain's structure and pattern recognition capabilities are the result of billions of years of evolution.

Upcoming episodes will explore applying unsupervised learning concepts to AI systems for natural language processing.

Transcripts

play00:00

Thanks to Wix for supporting PBS Digital Studios.

play00:03

Hey, I’m Jabril and welcome to Crash Course AI!

play00:07

So far in this series, we’ve focused on artificial intelligence that uses Supervised

play00:11

Learning.

play00:13

These programs need a teacher to use labeled data to tell them “right” from “wrong.”

play00:17

And we humans have places where supervised learning happens, like classrooms with teachers,

play00:22

but that’s not the only way we learn.

play00:25

We can also learn lots of things on our own by finding patterns in the world.

play00:29

We can look at dogs and elephants and know they’re different animals without anyone

play00:32

telling us.

play00:33

Or we can even figure out the rules of a sport just by watching people play.

play00:38

This kind of learning without a teacher is called Unsupervised Learning and, in some

play00:43

cases, computers can do it too.

play00:46

INTRO

play00:54

The key difference between supervised and unsupervised learning is what we’re trying

play00:59

to predict.

play01:00

In supervised learning, we’re trying to build a model to predict an answer or label

play01:04

provided by a teacher.

play01:05

In unsupervised learning, instead of a teacher, the world around us is basically providing

play01:10

training labels.

play01:11

For example, if I freeze this video of a tennis ball RIGHT NOW, can you draw what could be

play01:16

the next frame?

play01:17

Unsupervised learning is about modeling the world by guessing like this, and it’s useful

play01:22

because we don’t need labels provided by a teacher.

play01:25

Babies do a lot of unsupervised learning by watching and imitating people, and we’d

play01:30

like computers to be able to learn like this as well.

play01:33

This lets us utilize lots of freely available data in the world or on the internet.

play01:38

In many cases, one of the easiest ways to understand how AI can use unsupervised learning

play01:43

is by doing it ourselves, so let’s look at a few photos of flowers with no labels.

play01:49

The most basic way to model the world is to assume that it’s made up of distinct groups

play01:53

of objects that share properties.

play01:55

So, for example, how many types of flowers are here?

play01:59

We could say there are two because there are two colors, purple and yellow.

play02:03

Or we could look at the petal shapes, and divide them into round petals and tall vertical ones.

play02:07

Or maybe we have some more experience with flowers and realize that two of these are

play02:12

tulips, one is a sunflower, and one is a daisy, so there are three categories.

play02:17

Immediately recognizing different properties like this and creating categories is called

play02:21

unsupervised clustering.

play02:23

We don’t have labels provided by a teacher, but we do have a key assumption about the

play02:28

world that we’re modeling: certain objects are more similar to each other than others.

play02:33

We can program computers to perform clustering too.

play02:35

But to do that, we need to choose a few properties of flowers we’re interested in looking at,

play02:41

like how we picked color or shape just now.

play02:43

For a more realistic example, let’s say I bought a packet of iris seeds to plant in

play02:48

my garden.

play02:49

After the flowers bloom though, it looks like there were several species of irises mixed

play02:54

up in that one packet.

play02:55

Now I’m no expert gardener, but I can use some AI to help me analyze my garden.

play03:00

To construct a model, we have to answer two key questions.

play03:04

First, what observations can we measure?

play03:08

All of these flowers are purple, so that’s probably not the best way to tell them apart.

play03:13

But different irises seem to have different petal lengths and widths, which we can measure

play03:19

and place on this graph with petal length on the Y axis and width on the X axis.

play03:24

And second, how do we want to represent the world?

play03:27

We’re going to stick to a very simple assumption here: there are clusters in our data.

play03:33

Specifically, we’re going to say there are some number of groups called K clusters, but

play03:38

we don’t know where they are.

play03:40

To help us, we’re going to use the K-means clustering algorithm.

play03:45

K-means clustering is a simple algorithm.

play03:47

All it needs is a way to compare observations, a way to guess how many clusters exist in

play03:52

the data, and a way to calculate averages for each cluster it predicts.

play03:56

In particular, we want to calculate the mean by adding up all data points in a cluster

play04:02

and dividing by the total number of points.
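As a small worked example of these two ingredients, the sketch below computes the mean of one cluster and then compares a flower to it by straight-line distance. The measurements are invented for illustration, and Euclidean distance is an assumption here, since the video does not name the comparison it uses.

```python
import math

# Hypothetical (petal width, petal length) measurements in cm for one cluster.
cluster = [(1.0, 4.0), (1.5, 4.5), (2.0, 5.0)]


def cluster_mean(points):
    """Add up each coordinate and divide by the number of points."""
    n = len(points)
    return (sum(w for w, _ in points) / n, sum(l for _, l in points) / n)


center = cluster_mean(cluster)            # (1.5, 4.5)
distance = math.dist((1.0, 4.0), center)  # how far the first flower is from the mean
```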

play04:05

Remember, unsupervised learning is about modeling the world, so our algorithm will have two

play04:10

steps:

play04:11

First, our AI will predict.

play04:14

What does the model expect the world to look like?

play04:16

In other words, which flowers should be clustered together because they’re the same species?

play04:21

Second, our AI will correct or learn.

play04:24

The model will update its beliefs to agree with its observation of the world.

play04:29

To start the process, we have to specify how many clusters the model should look for.

play04:33

I’m guessing there are three clusters in the data, so that becomes the model’s initial

play04:38

understanding of the world, and we’re looking for K=3 averages, or three types of irises.

play04:45

But to start, our model doesn’t really know anything, so the averages are random and so

play04:50

are its predictions.

play04:52

Each datapoint (which is a flower) is given a label as type1, type2, or type3, based on

play04:59

the algorithm’s beliefs.

play05:01

Next, our model tries to correct itself.

play05:03

The average of each cluster of datapoints should be in the middle, so the model corrects

play05:08

itself by calculating new averages.

play05:11

We can see those averages here, marked with Xs, which gives our updated model of the three

play05:16

(or so we guessed) types of irises.

play05:19

The graph is still pretty noisy.

play05:22

For example, it’s a little weird that there are type2 flowers so close to the average

play05:26

for type3.

play05:28

But we did start with a random model, so we can’t expect too much accuracy.

play05:33

Logically, we know that irises of the same species tend to have similar petals, so those

play05:39

datapoints should be clustered together.

play05:42

Since we just did a correction or learning step, we can repeat the process, starting

play05:46

with a new prediction step.

play05:48

Let’s predict new labels using the Xs that mark the averages of each label.

play05:52

We’ll give every datapoint the label of its closest X -- type1, type2, or type3 -- and

play05:59

then we’ll calculate new averages.

play06:02

That’s better, but still not the cleanest clusters, so we can repeat the process again:

play06:10

Predict, Learn, Predict, Learn.

play06:12

Eventually, the Xs will stop moving and we have a model of iris clusters created with

play06:18

unsupervised learning!

play06:20

Now the ultimate question is, did we find meaningful patterns about the world with our

play06:25

AI?

play06:27

We made an assumption that there were three types of irises, and we assumed that they

play06:31

have different petal lengths and widths.

play06:34

Was this true?

play06:35

Lucky for us, I have a friend who is a master gardener.

play06:38

I showed him the real-life flowers closest to each of the three averages and he said

play06:42

that type1 is Versicolor, type2 is Setosa and type3 is Virginica.

play06:49

Three different iris species!

play06:51

We learned about the world from observation, which is what makes this unsupervised learning,

play06:57

even though we relied a tiny bit on a teacher (the master gardener) for confirmation and help.

play07:03

Now that we’ve learned the basics, we can experiment with harder examples.

play07:07

Let’s say we want to use an unsupervised learning algorithm to sort a bunch of different

play07:10

photos, not just three iris species.

play07:13

First, what observations can we measure?

play07:16

How much green there is?

play07:17

Whether there’s a nose and fur?

play07:19

To have a computer make these observations, we need to measure thousands of red, green,

play07:24

and blue pixels in each image.

play07:26

Second, how do we want to represent the world?

play07:28

Before, we were only working with 2 features, so we could just use averages of the clustered

play07:33

datapoints and get meaningful abstraction from it.

play07:36

But when dealing with images, we can’t use the same method, because we won’t get much

play07:40

meaning out of averaging colored pixels for what we want to accomplish.

play07:45

Somehow, we need the model to create a representation that tells us if two images are similar.

play07:51

There are meaningful patterns in the data that are more abstract than individual pixels,

play07:56

and finding them across many images is called Representation Learning.

play08:01

These patterns help us understand what’s in the images and how to compare them to each

play08:06

other.

play08:06

Representation learning happens both in supervised and unsupervised learning models, so we can

play08:12

do it with or without labels to find patterns in the world.

play08:15

To understand the basic idea of representation learning, check out this experiment: I’m

play08:20

gonna look at a picture really fast and then try to draw it.

play08:23

Ready, Set, Go!

play08:32

Woah. That was 5 seconds?

play08:35

My eyes took in the picture and remembered important features, so I’m building a representation

play08:40

in my mind.

play08:41

But I can’t just show you my thoughts to get feedback on what parts I misremembered,

play08:46

so I have to produce a reconstruction, or draw the original image from memory.

play09:00

Alright, so this is what I’ve got.

play09:03

Now let’s compare my drawing to the original image.

play09:06

Let's see: round plate, triangle slice of pizza, some cheese, some crust, tablecloth. Pretty good.

play09:14

For an AI, making a reconstruction would mean producing all the right pixel values to make

play09:19

a reconstruction.

play09:20

Our K-means clustering algorithm from before predicted classes for flowers based on how

play09:24

close the datapoints were to the averages.

play09:27

For images, we will have learned image representations instead of averages.

play09:32

After that step, just like before, the AI will have to correct itself.

play09:36

Previously, we updated the K clusters based on how well our predicted labels fit the data.

play09:43

But for images, we’d have to update the model’s /internal representations/ based

play09:48

on its reconstructions.

play09:50

There are different ways to use unsupervised learning in combination with representation

play09:54

learning so that an AI can compare images.

play09:57

Like, for example, there’s a type of neural network called an autoencoder, which uses

play10:01

the same basic principles of weights and biases to process inputs, pass data onto hidden neuron

play10:07

layers, and finally to a prediction output layer.

play10:10

If John-Green-bot was programmed with an autoencoder, the input would be an image, the hidden layers

play10:15

would contain representations, and the output would be a full reconstruction of the original

play10:21

image (which gets more accurate the more we train his AI).

play10:25

Theoretically, I could give John-Green-bot a representation of a pizza and he could reconstruct

play10:30

the original pizza image.

play10:31

What’s so powerful about unsupervised learning is that the world is our teacher.

play10:36

By looking around, taking in a lot of data, and predicting what we’ll see and hear next,

play10:40

we learn about how the world works and how it should be represented.

play10:44

When asked how AI will fulfill its grand ambitions, 2018

play10:48

Turing Award Winner Professor Yann LeCun, said: “We all know that unsupervised learning

play10:53

is the ultimate answer.”

play10:55

So I guess we better keep working on it!

play10:58

Unsupervised learning is a huge area of active research.

play11:02

The human brain is specially designed for this kind of learning and has different parts

play11:06

for vision, language, movement, and so on.

play11:10

These structures and what kinds of patterns our brains look for were developed over billions

play11:16

of years of evolution.

play11:18

But it’s really tricky to build an AI that does unsupervised learning well because AI

play11:22

systems can’t learn exactly like humans often do, just by watching and imitating.

play11:27

Someone, like us, has to design the models and tell them how to look for patterns before

play11:32

letting them loose.

play11:33

Next time, we’ll look at applying similar concepts to AI systems that find patterns

play11:38

in words and language, in what’s called Natural Language Processing.

play11:43

See you then!

play11:44

Thanks to Wix for supporting PBS Digital Studios.

play11:47

Check out Wix.com if you’re looking to make your own website.

play11:50

Wix is a platform that allows you to build a personalized website for almost any purpose

play11:55

from promoting your business or creating an online shop to a place for you to test out

play12:00

new ideas.

play12:01

Their technology allows you to create something unique no matter your skill level with templates

play12:06

and all-in-one management.

play12:07

If you’d like to check it out you can go to wix.com/go/crashcourse

play12:12

Or click the link in the description.

play12:14

Crash Course AI is produced in association with PBS Digital Studios.

play12:18

If you want to help keep Crash Course free for everyone, forever, you can join our

play12:22

community on Patreon.

play12:24

And if you want to learn more about the math of k-means clustering, check out this video

play12:28

from Crash Course Statistics.
