Lec 04 - Introduction to AI Algorithms

IIT Roorkee July 2018
30 Nov 2023 · 28:32

Summary

TL;DR: This video script offers an insightful overview of AI algorithms in marketing, focusing on their definition, complexity, and function. It delves into the three primary learning patterns: supervised, unsupervised, and reinforcement learning, each with its distinct set of algorithms like decision trees, random forests, and neural networks. The script also touches on the Algorithmic Bill of Rights, emphasizing the importance of awareness, accountability, and validation in AI algorithm usage.

Takeaways

  • 🧠 AI algorithms are sets of instructions for computers to learn and operate independently, significantly more complex than general algorithms.
  • 📚 The learning patterns for AI include supervised learning, unsupervised learning, and reinforcement learning, each with distinct training and functioning methods.
  • 🌳 In supervised learning, algorithms are trained with labeled data to predict outcomes, akin to a student learning with a teacher's guidance.
  • 🔍 Unsupervised learning uses unlabeled data to find patterns and relationships within the data, without any prior guidance.
  • 🤖 Reinforcement learning algorithms learn from feedback in the form of rewards, improving actions based on the environment's responses.
  • 📊 Common supervised learning algorithms include Decision Trees, Random Forests, Support Vector Machines (SVM), Naive Bayes, and Logistic Regression.
  • 🔢 Linear Regression in supervised learning is used for continuous predictions, like sales forecasting, based on the relationship between variables.
  • 👥 Unsupervised learning examples include K-means clustering for grouping data points and Gaussian Mixture Models for more complex cluster shapes.
  • 🕵️‍♂️ K-Nearest Neighbors (KNN) is an algorithm used for both classification and anomaly detection, based on the proximity of data points.
  • 🧬 Neural networks mimic the human brain's functions, with interconnected nodes organized in layers, capable of pattern recognition and complex tasks.
  • 👮‍♂️ The Algorithmic Bill of Rights outlines principles for ethical AI use, emphasizing awareness, accountability, explanation, and validation to prevent biases and harm.

Q & A

  • What is an AI algorithm?

    -An AI algorithm is a set of instructions for a computer to learn and operate on its own. It is complex programming that determines the steps and learning capabilities of an AI program.

  • How do AI algorithms work?

    -AI algorithms work by taking in training data, which can be labeled or unlabeled, and using that information to learn and grow. They complete tasks using the training data as a basis.

  • What are the different types of learning patterns in AI algorithms?

    -The different types of learning patterns in AI algorithms are supervised learning, unsupervised learning, and reinforcement learning.

  • What is supervised learning in AI?

    -Supervised learning is a category of AI algorithms that work by taking in clearly labeled data during training to learn and predict outcomes for other data.

  • Can you explain the decision tree algorithm in supervised learning?

    -A decision tree algorithm is a supervised learning method that classifies data into nodes, with a root node and leaf nodes, using attribute selection measures like entropy and information gain.

  • What is the random forest algorithm and how does it differ from a single decision tree?

    -The random forest algorithm is a collection of multiple decision trees that are used to gain more accurate results. It differs from a single decision tree by adding randomness to the model and considering the majority vote or average of multiple trees for the final output.

  • How does the support vector machine (SVM) algorithm work?

    -The support vector machine algorithm works by plotting data in an N-dimensional space and finding the hyperplane that best separates the classes. It aims to maximize the margin between the nearest points of different classes.

  • What is the role of the Naive Bayes algorithm in AI?

    -The Naive Bayes algorithm is a classification algorithm that assumes the presence of a feature is unrelated to the presence of other features in the same class. It is used for making probabilistic predictions based on the likelihood of features.

  • What is the purpose of linear regression in AI?

    -Linear regression in AI is used for regression modeling to discover relationships between data points and make predictions or forecasts. It works by plotting data points and finding the best fit line that represents the relationship between variables.

  • How does logistic regression differ from linear regression?

    -Logistic regression differs from linear regression in that it estimates a binary outcome (0 or 1) rather than a continuous value. It is used when the dependent variable is categorical, such as in spam filtering or predicting the occurrence of an event.

  • What is unsupervised learning and how does it differ from supervised learning?

    -Unsupervised learning is a type of AI algorithm that is given unlabeled data and creates models to find patterns or relationships within the data. It differs from supervised learning in that it does not use labeled data and instead focuses on discovering inherent structures in the data.

  • What is the K-means clustering algorithm and how does it work?

    -The K-means clustering algorithm is an unsupervised learning method that partitions data into K pre-defined clusters. It works by iteratively assigning data points to the nearest centroid and recalculating the centroids based on the assigned clusters until it converges to the best clustering.

  • What is the role of the Gaussian Mixture Model (GMM) in AI?

    -The Gaussian Mixture Model is used in unsupervised learning for clustering data into groups. It is more versatile than K-means as it allows for clusters of various shapes, not just circular, and uses a probabilistic approach rather than a distance-based one.

  • What is the K-Nearest Neighbors (KNN) algorithm and its applications?

    -The K-Nearest Neighbors (KNN) algorithm is a simple AI algorithm that classifies new data points based on their similarity to existing data points. It can be used for both supervised and unsupervised learning, with applications in classification, regression, and anomaly detection.

  • How do neural networks function in AI?

    -Neural networks function by mimicking the human brain, consisting of interconnected nodes organized into layers. They process information by adjusting connection strengths (weights) during training to recognize patterns and make predictions.

  • What is reinforcement learning and how does it differ from other types of learning?

    -Reinforcement learning is a type of AI algorithm where an agent learns by taking actions in an environment and receiving feedback in the form of rewards. It differs from other types of learning as it focuses on learning from interactions and consequences rather than from labeled data.

  • What are the key components of the Algorithmic Bill of Rights?

    -The Algorithmic Bill of Rights includes principles such as awareness, access and redress, accountability, explanation, data provenance, auditability, and validation and testing. These principles aim to guide the ethical use of algorithms and ensure fairness and transparency.

Outlines

00:00

🧠 Introduction to AI Algorithms

This paragraph introduces the concept of AI algorithms in the context of a marketing course. It explains what AI algorithms are, emphasizing their complexity and importance in programming computers to learn and operate autonomously. The paragraph outlines the necessity of training data for AI algorithms and touches on different learning patterns such as supervised, unsupervised, and reinforcement learning. It also briefly mentions the Algorithmic Bill of Rights, setting the stage for a deeper dive into various AI algorithms.

05:01

🌲 Supervised Learning Algorithms

The second paragraph delves into supervised learning algorithms, which rely on labeled data for training. It describes how these algorithms use the data to predict outcomes, likening the process to a student learning with a teacher's guidance. The paragraph highlights decision trees, random forests, support vector machines, and Naive Bayes classifiers, explaining their structures and functions. It also touches on linear regression and logistic regression, detailing how they are used for modeling relationships and estimating binary outcomes.

10:03

🔍 Unsupervised Learning and K-Means Clustering

This paragraph focuses on unsupervised learning algorithms, which work with unlabeled data to discover patterns and relationships. It introduces K-means clustering, explaining how it assigns data points to clusters based on proximity to centroids. The paragraph outlines the steps of the K-means algorithm and contrasts it with Gaussian Mixture Models, which allow for more versatile cluster shapes. It also mentions the use of unsupervised learning for applications like sales forecasting and customer churn analysis.

15:04

🤖 K-Nearest Neighbors and Neural Networks

The fourth paragraph discusses the K-nearest neighbors (KNN) algorithm, a non-parametric method used for both classification and regression. It describes how KNN classifies new data points based on proximity to existing data. The paragraph then transitions to neural networks, which are complex AI algorithms that mimic the human brain. It explains the architecture of neural networks, including input, hidden, and output layers, and how they process information to recognize patterns and make predictions.

20:04

🛠️ Reinforcement Learning and Algorithmic Rights

This paragraph explores reinforcement learning algorithms, which learn from feedback in the form of rewards. It describes the components of reinforcement learning, including the agent and the environment, and how they interact through a cycle of actions and rewards. The paragraph also covers different approaches to reinforcement learning, such as policy-based and value-based methods. It concludes with an introduction to the Algorithmic Bill of Rights, a set of principles aimed at ensuring fairness and accountability in algorithmic decision-making.

25:05

📜 Algorithmic Bill of Rights and Conclusion

The final paragraph provides a detailed look at the Algorithmic Bill of Rights, a set of guiding principles for ethical algorithm use. It discusses seven key areas: awareness, access and redress, accountability, explanation, data provenance, auditability, and validation. The paragraph emphasizes the importance of these principles in mitigating biases and ensuring transparency in algorithmic decisions. It concludes the module by summarizing the discussion on AI algorithms, their applications in various learning techniques, and the importance of adhering to the Algorithmic Bill of Rights.

Keywords

💡AI Algorithms

AI Algorithms, or Artificial Intelligence algorithms, are programming instructions that enable computers to learn and operate autonomously. They are central to the video's theme, as they form the backbone of AI applications in marketing. The script discusses various types of AI algorithms, such as supervised, unsupervised, and reinforcement learning algorithms, and their specific uses in data analysis and prediction.

💡Supervised Learning

Supervised Learning is a type of machine learning where the algorithm is trained on labeled data to predict outcomes for new data. It is a key concept in the video, as it illustrates how AI algorithms learn from existing data to make informed decisions. Examples in the script include decision trees and random forests, which are supervised learning algorithms used for classification and regression tasks.

💡Unsupervised Learning

Unsupervised Learning is another machine learning paradigm where the algorithm works with unlabeled data to identify patterns or structures within the data. The video mentions this concept in the context of algorithms like K-means clustering and Gaussian Mixture Models, which group data based on similarities without prior labeling.

💡Reinforcement Learning

Reinforcement Learning is an area of AI where an agent learns to make decisions by receiving feedback in the form of rewards or penalties. The script explains this concept by describing how the algorithm interacts with an environment, learning from the outcomes of its actions to improve performance over time.
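
To make this concrete, below is a minimal illustrative sketch of one value-based approach, tabular Q-learning, on a made-up five-state corridor environment. The video does not name a specific algorithm or environment; the setup, reward values, and hyperparameters here are assumptions for demonstration, using Python with NumPy.

```python
# Illustrative sketch only: tabular Q-learning (a value-based RL method) on a
# hypothetical corridor of 5 positions. Reaching the rightmost position gives a
# reward of +1 and ends the episode (the environment's termination signal).
import numpy as np

n_states, n_actions = 5, 2            # positions 0..4; actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))   # action-value estimates the agent learns
alpha, gamma, epsilon = 0.1, 0.9, 0.1 # learning rate, discount, exploration rate
rng = np.random.default_rng(0)

for episode in range(500):
    state, done = 0, False
    while not done:
        # epsilon-greedy policy: mostly exploit current values, sometimes explore
        if rng.random() < epsilon:
            action = int(rng.integers(n_actions))
        else:
            action = int(np.argmax(Q[state]))
        next_state = min(state + 1, n_states - 1) if action == 1 else max(state - 1, 0)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        done = next_state == n_states - 1
        # update the value estimate from the reward feedback sent by the environment
        Q[state, action] += alpha * (reward + gamma * np.max(Q[next_state]) - Q[state, action])
        state = next_state

print(Q)  # after training, the "right" action should have the higher value in every state
```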

💡Decision Trees

Decision Trees are a supervised learning algorithm represented as an inverted tree structure, where the root node represents the training dataset, and the leaf nodes represent the outcomes. The script uses decision trees as an example to explain how AI algorithms classify data by following a series of decisions based on attribute selection measures like entropy and information gain.
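
As a minimal illustrative sketch (not from the video), the snippet below assumes Python with scikit-learn and a synthetic labeled dataset, using entropy as the attribute selection measure:

```python
# Illustrative sketch: a decision tree classifier trained on synthetic labeled data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# "entropy" corresponds to the information-gain attribute selection measure
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(X_train, y_train)                    # learn splits from the labeled data
print("test accuracy:", tree.score(X_test, y_test))
```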

💡Random Forest

Random Forest is an ensemble learning method that operates by constructing multiple decision trees and outputting the class that is the mode of the classes of the individual trees. The video script describes how Random Forest improves accuracy by combining multiple decision trees, adding randomness to the model to find the best feature among a random subset of features.
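
A comparable sketch for a random forest, again assuming scikit-learn and synthetic data (illustrative only):

```python
# Illustrative sketch: a random forest, i.e. an ensemble of decision trees whose
# majority vote gives the final class.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
# max_features="sqrt" picks the best split among a random subset of features,
# which is the added randomness the script mentions
forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
forest.fit(X, y)
print(forest.predict(X[:5]))  # classes chosen by the majority vote of the trees
```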

💡Support Vector Machines (SVM)

Support Vector Machines (SVM) are used for classification or regression by finding the hyperplane that best separates data into classes. The script explains how SVM works by maximizing the margin between support vectors to achieve the best possible segregation of data points.
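
A minimal sketch of a linear SVM finding a maximum-margin hyperplane, assuming scikit-learn and synthetic two-class data (illustrative only):

```python
# Illustrative sketch: a linear support vector machine separating two blobs.
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

X, y = make_blobs(n_samples=100, centers=2, random_state=0)
svm = SVC(kernel="linear", C=1.0)
svm.fit(X, y)
# the support vectors are the nearest points that define the maximum margin
print("number of support vectors:", len(svm.support_vectors_))
print("prediction for one point:", svm.predict(X[:1])[0])
```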

💡Naive Bayes

Naive Bayes is a classification algorithm based on Bayes' theorem that makes the simplifying ("naive") assumption that features are independent of one another within a class. The video script describes Naive Bayes as a probabilistic classifier that predicts the likelihood of an event based on the presence of certain features, making probabilistic predictions despite potentially overlapping attributes.
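
A minimal sketch of a Gaussian Naive Bayes classifier, assuming scikit-learn and synthetic data (illustrative only):

```python
# Illustrative sketch: Gaussian Naive Bayes, which applies Bayes' theorem under
# the "naive" assumption that features are independent given the class.
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
nb = GaussianNB()
nb.fit(X, y)
print(nb.predict(X[:3]))        # most likely class for each sample
print(nb.predict_proba(X[:3]))  # probabilistic predictions per class
```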

💡Linear Regression

Linear Regression is a statistical method used for predictive analysis, making it a fundamental concept in the video's discussion of AI algorithms. The script explains how linear regression models the relationship between a dependent variable and one or more independent variables, predicting continuous numeric values such as sales or age.
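
A minimal sketch of fitting a best-fit line to noisy points, loosely in the spirit of the sales-forecasting example; the data and numbers are made up, assuming scikit-learn and NumPy:

```python
# Illustrative sketch: linear regression on synthetic data, e.g. ad spend vs. sales.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))             # independent variable (e.g. ad spend)
y = 3.0 * X[:, 0] + 5.0 + rng.normal(0, 1, 100)   # dependent variable (e.g. sales) with noise

model = LinearRegression().fit(X, y)
print("slope:", model.coef_[0], "intercept:", model.intercept_)
print("forecast at x = 12:", model.predict([[12.0]])[0])
```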

💡Logistic Regression

Logistic Regression is an algorithm used when the dependent variable is categorical, often taking binary values like 0 or 1. The video script illustrates its use in applications like spam filters, where it estimates the probability of an event occurring based on a logistic function that maps inputs to a probability between 0 and 1.
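
A minimal sketch of logistic regression for a binary (0/1) outcome, in the spirit of the spam-filter example; assumes scikit-learn and synthetic data:

```python
# Illustrative sketch: logistic regression mapping inputs to a probability in [0, 1].
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=300, n_features=5, random_state=1)  # y is 0 or 1
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict(X[:5]))        # hard 0/1 decisions
print(clf.predict_proba(X[:5]))  # probabilities from the S-shaped logistic function
```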

💡K-means Clustering

K-means Clustering is an unsupervised learning algorithm that partitions data into K distinct clusters based on the proximity of data points. The script describes the iterative process of K-means, from selecting initial centroids to assigning data points to the nearest centroid, and adjusting centroids until the clusters are optimized.
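
A minimal sketch of K-means with a pre-defined K = 3, assuming scikit-learn and synthetic unlabeled data (illustrative only):

```python
# Illustrative sketch: K-means iteratively assigns points to the nearest centroid
# and recomputes the centroids until the clustering stabilizes.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)  # treat as unlabeled data
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X)
print("centroids:\n", km.cluster_centers_)
print("cluster of the first point:", km.labels_[0])
```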

💡Gaussian Mixture Models (GMM)

Gaussian Mixture Models (GMM) are a probabilistic model that assumes the data is generated from a mixture of several Gaussian distributions. The video script contrasts GMM with K-means clustering, noting that GMM can handle more complex cluster shapes than the circular patterns of K-means, allowing for greater clarity in data segmentation.
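
A minimal sketch of a Gaussian Mixture Model on the same kind of unlabeled data, showing its probabilistic (soft) cluster assignments; assumes scikit-learn:

```python
# Illustrative sketch: a Gaussian Mixture Model with full covariances, which can
# fit elongated (non-circular) clusters and assigns points probabilistically.
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=300, centers=3, random_state=42)
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=42).fit(X)
print(gmm.predict(X[:5]))        # hard cluster assignments
print(gmm.predict_proba(X[:1]))  # soft (probabilistic) membership of one point
```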

💡K-Nearest Neighbors (KNN)

K-Nearest Neighbors (KNN) is a non-parametric algorithm used for both classification and regression. The script describes KNN as a 'lazy learning' algorithm that stores data and classifies new instances based on the nearest neighbors in the training dataset, making it useful for anomaly detection and classification tasks.
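
A minimal sketch of KNN classifying a new point by the majority class among its k nearest stored points; assumes scikit-learn and synthetic data, with the query point made up for illustration:

```python
# Illustrative sketch: K-Nearest Neighbors; fitting mostly just stores the data
# ("lazy learning"), and classification happens when a new point is queried.
from sklearn.datasets import make_classification
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=2, n_redundant=0, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)
print("predicted class of a new point:", knn.predict([[0.5, -0.2]])[0])
```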

💡Neural Networks

Neural Networks are a collection of AI algorithms inspired by the human brain, consisting of interconnected nodes or neurons organized into layers. The video script explains how neural networks learn by adjusting connection strengths during training, enabling them to recognize patterns and make predictions, with applications in classification and pattern recognition.
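
A minimal sketch of a small multi-layer perceptron with input, hidden, and output layers; the layer sizes here are arbitrary assumptions, using scikit-learn and synthetic data:

```python
# Illustrative sketch: a multi-layer perceptron whose connection weights are
# adjusted during training (backpropagation) to recognize patterns.
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
mlp = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=1000, random_state=0)
mlp.fit(X, y)
print("training accuracy:", mlp.score(X, y))
```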

💡Algorithmic Bill of Rights

The Algorithmic Bill of Rights is a set of guiding principles aimed at ensuring ethical use of algorithms. The video script outlines seven areas covered by these principles, including awareness, access and redress, accountability, explanation, data provenance, auditability, and validation. These principles are crucial for understanding the responsible development and application of AI algorithms.

Highlights

Introduction to AI algorithms in marketing, discussing various learning patterns: supervised, unsupervised, and reinforcement learning.

Definition of an AI algorithm as a complex set of instructions for a computer to learn and operate on its own.

Importance of training data acquisition and labeling in distinguishing different AI algorithms.

Overview of supervised learning algorithms, emphasizing their reliance on clearly labeled data for training.

Explanation of decision trees, their structure, and how they use attribute selection measures for classification.

Introduction to Random Forest, a collection of decision trees that improve accuracy through diversity.

Support Vector Machines (SVM) for classification or regression by finding the optimal hyperplane for data separation.

Naive Bayes classifier, based on Bayes' theorem, and its use in large datasets with various classes.

Linear regression for predictive analysis and forecasting, emphasizing its simplicity and popularity.

Logistic regression for binary outcomes, using a logistic function to estimate probabilities.

Unsupervised learning algorithms and their use in creating models from unlabeled data to find patterns.

K-means clustering for dividing datasets into clusters based on data point proximity to centroids.

Gaussian Mixture Models for clustering with more versatile shapes than K-means, allowing elongated, non-circular cluster patterns.

K-Nearest Neighbors (KNN) algorithm for both supervised and unsupervised learning, focusing on data proximity.

Neural networks as complex AI algorithms mimicking the human brain, with applications still being discovered.

Reinforcement learning algorithms that learn from feedback in the form of rewards, involving an agent and an environment.

Algorithmic Bill of Rights, a set of guiding principles for ethical algorithm design and use.

Seven general areas of the Algorithmic Bill of Rights, including awareness, access, accountability, and auditability.

Conclusion summarizing the module's coverage of AI algorithms, learning techniques, and ethical considerations.

Transcripts

play00:00

[Music]

play00:09

[Music]

play00:25

welcome to this NPTEL online

play00:27

certification course on artificial

play00:29

intelligence in marketing and now we are

play00:31

discussing module 4 so we are talking

play00:34

about introduction to AI algorithms and

play00:37

we are in chapter 1 and module 4 so this

play00:40

is what we are talking about that is

play00:42

Introduction to artificial intelligence

play00:44

algorithms so to introduce the module we

play00:47

will talk about what are AI algorithms

play00:49

and how they

play00:50

work what are the various types of

play00:53

commonly used AI algorithms under the

play00:56

different kind of learning patterns and

play00:58

we have seen that the the various types

play00:59

of learning patterns are supervised

play01:01

learning unsupervised learning and

play01:03

reinforcement learning and then we will

play01:06

talk briefly on the algorithmic Bill of

play01:09

Rights so now to start with what is an

play01:11

AI algorithm the definition of algorithm

play01:15

is a set of instructions to be followed

play01:18

in calculation or other operations so it

play01:21

is a pure simple set of

play01:23

instructions this applies to both

play01:25

mathematics and computer science so thus

play01:28

at the essential level an AI algorithm

play01:31

is the programming that tells the

play01:32

computer how to learn to operate on its

play01:35

own and AI algorithm is much more

play01:38

complex than what most people learn

play01:40

about in algebra of course a complex set

play01:44

of rules drives AI programs, determining

play01:47

their steps and their ability to learn

play01:50

without an algorithm AI would not

play01:53

exist while a general algorithm can be

play01:56

simple AI algorithms are by Nature more

play02:00

complex AI algorithms work by taking in

play02:03

training data that helps the algorithm

play02:06

to learn how the data is acquired and is

play02:09

labeled marks the key difference between

play02:10

different types of AI algorithm so keep

play02:13

in mind that how that data is acquired

play02:15

and is

play02:16

labeled that gives the clear key

play02:20

difference between different types of AI

play02:22

at the code level an AI algorithm takes

play02:25

in training data labeled or unlabeled

play02:29

supplied by developers or acquired by

play02:31

the program itself and use that

play02:34

information to learn and grow then it

play02:38

completes its task using the training

play02:41

data as a basis so that training data is

play02:45

so important for this kind of AI some

play02:48

types of AI algorithms can be taught to

play02:50

learn on their own and taken new data to

play02:52

change and refine their processes others

play02:55

will need the intervention of

play02:56

programmers in order to streamline now

play02:59

let us look at the various types of

play03:01

AI algorithms so there are three major

play03:04

categories of AI algorithms that we have

play03:07

already learned in the previous module

play03:09

namely one was the supervised learning

play03:13

the second was unsupervised learning and

play03:15

the third was reinforcement learning the

play03:18

key difference between these algorithms

play03:20

are in how they are trained and how they

play03:23

function so their differences come

play03:27

from a how they are trained and B how

play03:31

they function under those

play03:33

categories there are dozens of different

play03:37

algorithms we will discuss about the

play03:39

most popular and commonly used from each

play03:43

category as well as where they are

play03:44

commonly used so the supervised learning

play03:47

algorithms the first and the most

play03:49

commonly used category of algorithm is

play03:51

supervised

play03:52

learning these work by taking in clearly

play03:56

labeled data while being trained and

play03:58

using that to learn and grow it uses the

play04:02

labeled data to predict outcomes for

play04:04

other data the name supervised learning

play04:06

comes from the comparison of a student

play04:08

learning in the presence of a teacher or

play04:10

expert so that is why it is called as

play04:12

supervised learning building a

play04:14

supervised learning algorithm that

play04:16

actually works takes a team of dedicated

play04:19

experts to evaluate and review the

play04:21

results, not to mention data scientists to

play04:24

test the models the algorithms created

play04:27

to ensure their accuracy against the

play04:29

original data and catch any errors from

play04:32

the artificial

play04:34

intelligence so this is the decision

play04:38

tree one of the most common supervised

play04:41

learning algorithm decision trees get

play04:42

their name because of their tree like

play04:45

structure even though the tree is

play04:48

inverted so it is the inverted tree the

play04:53

roots of the tree are the training data

play04:55

sets and they leads to specific

play04:57

nodes which denote a test attribute

play05:01

nodes often lead to other nodes and a

play05:03

note that doesn't lead onward is called

play05:06

a

play05:07

leaf so this is the decision node that

play05:11

is the root node then we have a sub tree

play05:13

decision node Leaf node this is again

play05:16

Leaf node because nothing flows from

play05:18

them here in this decision node again

play05:22

another decision node node and then this

play05:24

is leaf node this is leaf node and this

play05:26

is leaf node. Decision trees

play05:28

classify all the data into decision

play05:31

nodes it uses a selection criteria

play05:35

called attribute selection

play05:37

measures which takes into account

play05:39

various measures some examples would be

play05:41

entropy gain ratio Information Gain

play05:44

Etc using the root data and following

play05:47

the ASM the decision tree can classify

play05:49

the data it is given by following the

play05:51

training data into subnodes until it

play05:53

reaches the conclusion a decision tree

play05:56

diagram with root node decision node and

play05:59

Leaf node for better understanding so

play06:02

this is how root node with friends then

play06:05

yes windy cold yes no below par so this

play06:10

is the branch no is splitting walk or

play06:13

cart decision notes walk above Park cart

play06:17

cold Etc so this is the demonstration of

play06:20

this decision tree diagram another type

play06:22

is random Forest the random Forest

play06:24

algorithm is actually a broad collection

play06:26

of different decision trees leading to

play06:28

its name

play06:30

the random Forest builds different

play06:32

decision trees and connects them to gain

play06:34

more accurate results so that is the

play06:37

main advantage that it gives more

play06:39

accurate results this can be used for

play06:42

both classification and regression type

play06:44

of supervised learning while a solo

play06:47

decision tree has no outcome and a

play06:49

narrow range of groups the forest

play06:51

assures a more accurate result with a

play06:54

bigger number of groups and decisions it

play06:57

has the added benefit of adding

play07:01

randomness to the model by finding the best

play07:04

feature among a random subset of

play07:06

features overall these benefits create a

play07:09

model that has wide diversity that many

play07:12

data scientist

play07:13

favors so as we can see from the diagram

play07:16

the results of decision Tre 1 2 and

play07:18

three are combined which is then

play07:21

averaged out or the majority is

play07:22

considered as the final result so these

play07:25

are the data sets so this is decision

play07:28

tree 1, this is decision tree 2, and this is

play07:30

decision tree three so with the three

play07:33

with the same type of data sets

play07:34

there are three decision trees and the

play07:36

three results majority voting averaging

play07:39

and then we get the final results

play07:42

another is support vector

play07:45

machines the support Vector machine

play07:47

algorithm is another common AI algorithm

play07:49

that can be used for either

play07:51

classification or regression but is most

play07:54

often used for

play07:56

classification the support Vector

play07:58

machine

play08:01

works by plotting each piece of data on

play08:04

a chart in N Dimension space where n is

play08:06

the number of data points then the

play08:09

algorithm classifies the data point by

play08:10

finding the hyper plane that separates

play08:13

each class there can be more than one

play08:15

hyper plane the main objective of a

play08:18

support Vector machine is to segregate

play08:20

the given data sets in the best possible

play08:22

way the distance between either nearest

play08:25

points is known as the margin. The

play08:29

objective is to select a hyper plane

play08:31

with the maximum possible margin between

play08:34

support vectors in the given data set

play08:36

svm searches for the maximum marginal

play08:39

hyper plane so these are the two

play08:42

axes X1 and X2 and here it is a

play08:46

negative hyper plane then there are

play08:48

support vectors this is positive hyper

play08:50

plane so that is maximum margin and this

play08:53

is maximum margin hyper plane so

play08:56

generate hyperplanes which segregates

play08:59

the classes in the best way the figure

play09:01

on the top shows three hyper

play09:03

planes black blue and

play09:06

orange so these are the three hyper

play09:08

planes here the blue and orange have

play09:11

high classification errors but the black

play09:12

is separating the two classes

play09:14

correctly so so this black black is

play09:17

separating these two select the right

play09:20

hyper plane with the maximum segregation

play09:22

from the either nearest data points as

play09:25

shown in figure at the bottom then there

play09:27

is Naive Bayes

play09:30

the reason this algorithm is called Naive

play09:32

Bayes is that it is based on Bayes' theorem

play09:35

and also relies heavily on a large

play09:37

assumption that the presence of one

play09:39

particular feature is unrelated to the

play09:41

presence of other features in the same

play09:43

class. That major assumption is the naive

play09:46

aspect in the name. So Naive Bayes is useful

play09:51

for large data set with different

play09:53

classes it like many other supervised

play09:55

learning algorithm is a classification

play09:58

algorithm it is an algorithm that learns

play10:00

the probability of every object its

play10:02

features and which groups they belong to

play10:04

it is also known as

play10:07

probabilistic classifier for example you

play10:09

cannot identify a bird based on its

play10:11

feature and color as there are many

play10:13

birds with similar attributes but you

play10:16

make a probabilistic prediction about

play10:18

the same and that is where the Naive Bayes

play10:21

algorithm comes in so this is these are

play10:24

the three classifiers 1 2 3 and this is

play10:27

the Naive Bayes classifier

play10:29

and this is how they are classified into

play10:31

three different categories. Naive Bayes uses the following equation (Bayes' theorem): P(H|E) = (P(E|H) × P(H)) / P(E). Here P(H|E) denotes how often event H happens when event E takes place, P(E|H) represents how often event E happens when event H has taken place first, P(H) represents the probability of event H happening on its own, and P(E) represents the probability of event E happening on its own. So then comes

play11:03

linear regression is a supervised

play11:05

learning AI algorithm used for

play11:07

regression modeling it is mostly used

play11:09

for discovering the relationship between

play11:11

data points predictions and

play11:14

forecasting much like support Vector

play11:17

machines it works by plotting pieces of

play11:19

data on a chart with the

play11:21

x-axis as the independent variable and the

play11:24

y-axis as the dependent variable. The data

play11:26

points are then plotted

play11:29

in a linear fashion to determine their

play11:31

relationships and forecast possible

play11:33

future data linear regression is one of

play11:36

the easiest and the most popular machine

play11:37

learning algorithm so that is the best

play11:40

part that is it it is the easiest it is

play11:43

a statistical method that is used for

play11:45

predictive

play11:47

analysis linear regressions make

play11:50

predictions for continuous real or

play11:53

numeric values such as the sales salary

play11:56

age product prices Etc linear

play11:58

regression algorithm shows a linear

play12:01

relationship between a dependent that is

play12:03

y and one or more independent that is X

play12:06

variables hence that is called as a

play12:09

linear regression since linear

play12:11

regression shows the linear relationship

play12:14

which means it finds how the value of

play12:16

the dependent variable is changing

play12:18

according to the value of the

play12:19

independent variable the linear

play12:21

regression model provides a sloped

play12:24

straight line representing the

play12:25

relationships between the variables so

play12:28

this is these are the independent

play12:30

variables on the x-axis and here we have

play12:32

the dependent variable and then we have

play12:34

all these data points and in between is

play12:38

the line of regression so now it it

play12:40

tells that how it will happen here what

play12:44

what will the dependent variable look

play12:46

look like like at this level of

play12:49

independent variable then comes logistic

play12:52

regression a logistic regression

play12:54

algorithm usually uses a binary value (0 or 1)

play12:57

to estimate value from a set of

play12:59

independent

play13:01

variables. The output of logistic

play13:04

regression is either 1 or zero yes or no

play13:08

an example of this would be a spam

play13:10

filter in email the filter uses logistic

play13:13

regression to Mark whether the incoming

play13:16

mail is Spam zero or not one logistic

play13:21

regression is only useful when the

play13:23

dependent variable is

play13:25

categorical either yes or no so if it is

play13:29

not if it is somewhere in between then

play13:33

logistic regression will not work the

play13:36

logistic regression model is based on

play13:38

logistic function which is the type of s

play13:41

shaped curve that Maps any continuous

play13:44

input to the probability value between 0

play13:47

and 1 the logistic function allows us to

play13:50

model the relationship between the

play13:52

independent variables and the

play13:54

probability of dependent variable taking

play13:55

on the value of one the logistic

play13:58

regression model estimates the

play13:59

coefficient of the independent variables

play14:02

that are most predictive of the

play14:04

dependent variable these coefficients

play14:06

are used to create a linear equation

play14:09

that is then transformed by the logistic

play14:11

functions to produce a probability value

play14:13

for the dependent variable taking on the

play14:15

value one the logistic regression is

play14:18

commonly used in fields such as

play14:20

Healthcare marketing finance and social

play14:23

sciences to predict the likelihood of an

play14:25

event occurring such as whether a

play14:27

patient has a certain disease or or

play14:28

whether the customer will buy a product

play14:30

or not so now as you can see from this

play14:33

figure we have independent variable X

play14:37

and then we have dependent variable y

play14:40

now this dependent variable varies from

play14:43

0 to 1 independent variable can take any

play14:49

value but dependent variable can take

play14:52

values only between 0 to 1 so this

play14:55

predicts y lies within the 0 to 1

play14:58

range

play15:00

so that is why it is a s shaped

play15:03

curve

play15:05

so now you

play15:07

see at this level what will be the value

play15:11

of y at this level what will be the

play15:13

value of y and at this level what will

play15:15

be the value of y so as we have already

play15:17

studied before unsupervised learning

play15:20

algorithms are given data that is not

play15:22

labeled unsupervised learning algorithm

play15:25

use that unlabelled data to create

play15:27

models and evaluate the relationships

play15:29

between different data points in order

play15:32

to give more insights to the data the

play15:35

next comes K means clustering K means is

play15:38

an algorithm designed to perform the

play15:40

clustering

play15:41

function in unsupervised learning it

play15:44

does takes this by taking in the

play15:46

predetermined clusters and plotting out

play15:48

all the data regardless of the

play15:51

cluster it then plots a randomly

play15:53

selected piece of data as the centroid

play15:57

of each cluster. Think of it as a

play15:59

circle around each cluster with

play16:02

that piece of data as the exact center

play16:04

point from there it sorts the remaining

play16:07

data points into clusters based on their

play16:09

proximity to each other and the centroid data

play16:11

point for each cluster the algorithms

play16:14

takes the unlabel data sets as input

play16:17

divides the data set into K numbers of

play16:19

clusters and repeat the process until it

play16:21

does not find the best cluster the value

play16:24

of K should be predetermined in this

play16:27

algorithm the means clustering algorithm

play16:29

mainly perform two tasks the first is

play16:32

determine the best value of K center

play16:34

points or centroids by an iterative process

play16:38

the second is assign each data point to

play16:40

its closest K

play16:42

center. Those data points which are near

play16:44

to the particular K Center creates a

play16:46

cluster so the working of the K means

play16:49

algorithm is as follows first select the

play16:51

number of K to decide the number of

play16:54

clusters. Second is select random K

play16:58

points or

play16:59

centroids the third is assign each data

play17:02

point to the closest centroid, which

play17:04

will form the

play17:06

predefined K cluster the fourth is

play17:08

calculate the variance and place a

play17:11

new centroid of each cluster the fifth

play17:14

is repeat the third

play17:16

steps which means reassigning each data

play17:18

point to the new closest centroid for each

play17:22

cluster so this is how it works before

play17:24

the K means and this after K means how

play17:27

neatly they are clustered the next comes

play17:30

Gaussian mixture model. Gaussian mixture

play17:33

models are similar to K-means clustering

play17:35

in many ways both are concerned with

play17:37

sorting data into predetermined clusters

play17:39

based on proximity. However, Gaussian models

play17:42

are a little more versatile in the

play17:44

shapes of the Clusters they

play17:46

allow K means clustering only allows

play17:49

data to be clustered in circles with the

play17:51

centroid in the center of each cluster

play17:54

Gaussian mixtures can handle data that lies

play17:57

on the graph in more linear patterns

play17:59

allowing for oblong shaped

play18:02

structures this allows for greater

play18:05

Clarity in clustering of one data point

play18:08

lands inside the circle of another

play18:10

cluster the starting point and training

play18:12

process of the K means and GMM are the

play18:15

same. However, K-means uses a

play18:17

distance-based approach and GMM uses a

play18:19

probabilistic based approach there is

play18:22

one primary assumption in GMM the data

play18:24

set consists of multiple Gaussians; in other

play18:26

words, a mixture of Gaussians. It is used

play18:29

to forecast the sales of product

play18:31

understand customer churn through the

play18:32

length of different groups of customers

play18:34

some AI algorithms can use either

play18:37

supervised or unsupervised data input

play18:39

and still function they might have

play18:42

slightly different applications based on

play18:44

their

play18:45

status the next comes K nearest neighbor

play18:49

algorithm so K nearest neighbor that is

play18:52

K NN algorithm is a simplistic AI

play18:54

algorithm that assumes that all the data

play18:57

points provided are in proximity

play18:59

to each other and plots them onto a map

play19:01

to show the relationship between them

play19:03

then the algorithm can calculate the

play19:05

distance between data points in order to

play19:07

extrapolate the relationships and

play19:09

calculate the distance on the graph both

play19:11

supervised and unsupervised algorithm in

play19:14

supervised learning it can be used for

play19:16

either classification or regression

play19:18

applications in unsupervised learning it

play19:20

is popularly used for anomaly detection

play19:23

that is finding data that does not

play19:25

belong and removing it. KNN is a

play19:29

non-parametric algorithm which means it

play19:32

does not make any assumptions on

play19:34

underlying data it is also called a lazy

play19:37

learner algorithm because it does not

play19:39

learn from the training set immediately

play19:42

Instead at the training phase it just

play19:44

store the data sets and when it gets new

play19:46

data it classifies that data into a

play19:48

category that is much similar to the new

play19:51

data KNN algorithms can be used for

play19:53

regression as well as classification but

play19:55

mostly it is used for classification

play19:57

problems. Suppose there are two

play20:00

categories category a and category B and

play20:03

we have a new data point X1 so this data

play20:06

point will lie in which of these

play20:08

categories to solve this type of problem

play20:10

we need a KNN algorithm. With the help

play20:13

of KNN we can easily identify the

play20:15

category or class of a particular data

play20:17

set so now this was category a this was

play20:21

category B and now we have this new data

play20:24

now after KNN this new data point is

play20:27

assigned to category

play20:30

1 so now it becomes easier to deal with

play20:34

this data point rather than just

play20:36

having one data point standing alone. The

play20:40

next comes neural networks neural

play20:42

network algorithm is a term for a

play20:43

collection of AI algorithms that mimic

play20:45

the functions of a human brain so that

play20:48

mimics the functions of a human brain

play20:52

these tend to be more complex than many

play20:55

of the algorithms discussed above and

play20:57

have applications which are still

play20:59

being discovered so all those

play21:01

applications have not yet been

play21:02

discovered they are still in the process

play21:04

of Discovery in unsupervised and

play21:07

supervised learning it can be used

play21:09

for

play21:10

classification and pattern

play21:13

recognition it consists of

play21:15

interconnected nodes that is neurons

play21:18

organized into layers information flows

play21:21

through these

play21:22

nodes and the network adjusts the

play21:24

connection strengths that is weights

play21:26

during training to learn from data

play21:28

enabling it to recognize

play21:30

patterns make predictions and solve

play21:33

various tasks in machine learning and

play21:35

artificial intelligence and there are

play21:38

three levels in the network architecture

play21:41

the input layer the hidden layer which

play21:43

can be more than one and the output

play21:45

layer because of the numerous layers it

play21:47

is sometimes referred to as the MLP, that

play21:51

is multi-layer perceptron. It is possible

play21:53

to think of the Hidden layer as a

play21:56

distillation layer which extracts some of the

play21:58

most relevant patterns from the inputs

play22:01

and send them onto the next layer for

play22:03

further analysis it accelerates and

play22:05

improves the efficiency of the network

play22:07

by recognizing just the most important

play22:09

information from the inputs and

play22:11

discarding the Redundant

play22:14

information so this is how it works so

play22:17

this is the hidden layer then there are

play22:19

input layers Network layers feed forward

play22:22

so this is Network

play22:24

output and this is back propagation then

play22:27

comes reinforcement learning algorithm

play22:30

the last major type of AI algorithm is

play22:33

reinforcement learning algorithm which

play22:34

learns by taking in the feedback from

play22:36

the results of its action this is

play22:39

typically in the form of a reward the

play22:42

reinforcement algorithm is usually

play22:44

composed of two major

play22:46

parts the first is an agent that

play22:48

performs an

play22:50

action that is

play22:53

one and the second is the environment in

play22:56

which the action is performed so these

play22:58

are the two major parts of this

play23:00

algorithm the cycle begins when the

play23:03

environment sends a state signal to the

play23:05

agent that cues the agent to perform a

play23:08

specific action within the environment

play23:12

once the action is performed the

play23:13

environment sends the

play23:15

reward signal to the agent informing it

play23:19

on what happens so that the agent can

play23:22

update and evaluate its last

play23:25

actions then with that new information

play23:28

it can take the action again that cycle

play23:31

repeats until the environment sends a

play23:34

termination signal so there are two

play23:36

types of reinforcement the algorithm can

play23:39

use: either a positive reward or a

play23:44

negative reward in reinforcement

play23:47

algorithms there are slightly different

play23:49

approaches depending on what is being

play23:50

measured and how it is being measured

play23:54

here are some definitions of different

play23:56

models and measures so one is

play23:59

policy the approach that agent takes to

play24:02

determine the next action taken by the

play24:05

agent the second is model the situation

play24:08

and dynamics of the environment and the

play24:11

third is value the expected long-term

play24:13

results this is different from the

play24:15

reward which is the result of a single

play24:17

action within the environment the value

play24:19

is the long-term result of many actions

play24:22

the second is value based in this value

play24:26

based reinforcement algorithm the agent

play24:28

pushes towards an expected long-term

play24:31

return so that is important

play24:34

here instead of just focusing on the

play24:36

short-term reward the next is policy

play24:39

based a policy based reinforcement

play24:41

algorithm usually take one of the two

play24:44

approaches to determine the next course

play24:46

of

play24:47

action: either a standardized approach

play24:51

so that can be one approach where any

play24:53

state produces the same action or the

play24:56

dynamic approach

play24:59

where certain probabilities are mapped

play25:01

out and the probabilities calculated; each

play25:04

probability has its own policy

play25:07

reactions the next is the model based in

play25:10

this algorithm the programmers create a

play25:13

different Dynamic for each environment

play25:16

so the programmer create different

play25:18

Dynamics for each environment one

play25:20

environment one

play25:24

dynamics that way when the agent is put

play25:26

into each different model

play25:28

it learns to perform consistently under

play25:30

each condition now let us look at the

play25:33

algorithms algorithmic Bill of Rights in

play25:36

January 2017 the US public policy

play25:39

Council of the Association for

play25:40

computing

play25:41

Machinery which consist of Educators

play25:45

researchers and Professionals in the

play25:47

world of Information Technology outlined

play25:50

a set of guiding principles that could

play25:52

serve as a precursor for an algorithmic

play25:55

Bill of Rights these principles covers

play25:58

seven general areas which we will be

play26:00

discussing one by one so the first one

play26:03

is awareness those who design Implement

play26:08

and use algorithms must be aware of

play26:10

their potential biases and possible harm

play26:13

and take these into accounts in their

play26:17

practices the second is access and

play26:19

redress those who are negatively

play26:21

affected by algorithms must have systems

play26:24

that enable them to question the

play26:25

decisions and seek redress

play26:28

the third is accountability

play26:30

organizations that use algorithms must

play26:33

take responsibility for the

play26:35

decisions those algorithms reach even if

play26:39

it is not feasible to explain how the

play26:40

algorithms arrive at those decisions the

play26:44

fourth is explanation those affected by

play26:47

algorithm should be given explanations

play26:49

of the

play26:50

decisions and the procedures that

play26:52

generated them the fifth is data

play26:56

provenance those who design and use

play26:58

algorithms should maintain record of the

play27:01

data used to train the algorithm and

play27:03

make those records available to

play27:05

appropriate individuals to be studied

play27:08

for possible

play27:10

biases the sixth is auditability

play27:13

algorithms and data should be recorded

play27:16

so that they can be audited in case of

play27:19

possible harm the seventh is validation

play27:23

and testing organizations that use

play27:25

algorithms should test them regularly

play27:27

for biases and make the results publicly

play27:31

available so to conclude in this module

play27:33

we have briefly introduced AI algorithms

play27:36

and how do they work then we have

play27:39

discussed about the commonly used AI

play27:41

algorithms under the various learning

play27:43

techniques that is supervised

play27:44

unsupervised and reinforcement learning

play27:48

and then finally we have given a brief

play27:51

on the algorithmic Bill of Rights and

play27:55

these are the five references from which

play27:57

the material for this module was taken

play28:00

thank you

[Music]


Related Tags

Artificial Intelligence, Marketing, Online Course, AI Algorithms, Machine Learning, Supervised Learning, Unsupervised Learning, Reinforcement Learning, Algorithmic Ethics, Data Science