Hidden Markov Model Clearly Explained! Part - 5
Summary
TLDR: In this 'Normalized Nerd' video, the presenter delves into Hidden Markov Models (HMMs), a concept derived from Markov chains, which are widely used in fields like bioinformatics and natural language processing. The video offers both intuition and mathematical insights into HMMs, illustrating their workings with a hypothetical scenario involving weather and mood. It explains the model's components, including the transition and emission matrices, and demonstrates how to calculate the probability of a sequence of observed variables. The presenter also introduces the use of Bayes' theorem in determining the most likely sequence of hidden states, providing a clear and engaging explanation of the complex topic.
Takeaways
- 😀 Hidden Markov Models (HMMs) are an extension of Markov chains and are used in various fields such as bioinformatics, natural language processing, and speech recognition.
- 🔍 The video aims to explain both the intuition and the mathematics behind HMMs, including the role of Bayes' theorem in the process.
- 🌤️ The script uses a hypothetical town with three types of weather (rainy, cloudy, sunny) to illustrate the concept of a Markov chain where the weather tomorrow depends only on today's weather.
- 🤔 It introduces the concept of 'hidden' states by showing that while we can't observe the weather directly, we can infer it from Jack's mood, which is an observed variable dependent on the weather.
- 📊 The script explains the use of matrices to represent the probabilities of state transitions (transition matrix) and the probabilities of observed variables given the states (emission matrix).
- 📚 It emphasizes the importance of understanding the basics of Markov chains before diving into HMMs, suggesting viewers watch previous videos for a foundation.
- 🧩 The video provides a step-by-step example of calculating the joint probability of an observed mood sequence and a hypothetical weather sequence using the Markov property and matrices.
- 🔑 The concept of the stationary distribution of a Markov chain is introduced as necessary for calculating the probability of the initial state in the HMM.
- 🔍 The video poses the question of finding the most likely sequence of hidden states given an observed sequence, a common problem in applications of HMMs.
- 📈 The script explains the formal mathematical approach to solving HMM problems using Bayes' theorem and the joint probability distribution of hidden and observed variables.
- 📝 The video concludes by encouraging viewers to rewatch if they didn't understand everything, highlighting the complexity of the topic and the elegance of the mathematical solution.
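The joint-probability calculation mentioned in the takeaways can be sketched in a few lines of Python. The transition and emission matrices below are assumed for illustration (the summary only states the 90% chance of Jack being sad on a rainy day); they are chosen to be consistent with the 0.00391 figure from the video's worked example.

```python
import numpy as np

# Weather states: 0 = rainy, 1 = cloudy, 2 = sunny; moods: 0 = sad, 1 = happy.
# Only the 0.9 (sad | rainy) entry is stated explicitly in the video;
# the remaining values are assumed for illustration.
A = np.array([[0.5, 0.3, 0.2],    # transition matrix: P(weather_t | weather_t-1)
              [0.4, 0.2, 0.4],
              [0.0, 0.3, 0.7]])
B = np.array([[0.9, 0.1],         # emission matrix: P(mood | weather)
              [0.6, 0.4],
              [0.2, 0.8]])
pi = np.array([0.218, 0.273, 0.509])  # stationary distribution of A

def joint_probability(weather, moods):
    """Joint probability of a weather sequence and a mood sequence."""
    p = pi[weather[0]] * B[weather[0], moods[0]]
    for t in range(1, len(weather)):
        p *= A[weather[t - 1], weather[t]] * B[weather[t], moods[t]]
    return p

# The video's scenario: sunny/happy, cloudy/happy, sunny/sad.
p = joint_probability([2, 1, 2], [1, 1, 0])
print(round(p, 5))  # → 0.00391
```

The product has six factors for three days: one initial-state probability, two transition probabilities, and three emission probabilities, exactly as the video describes.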
Q & A
What is the main topic discussed in the video?
-The main topic discussed in the video is Hidden Markov Models (HMMs), their concept, intuition, and mathematics, with applications in bioinformatics, natural language processing, and speech recognition.
What is the relationship between Hidden Markov Models and Markov Chains?
-Hidden Markov Models are derived from Markov Chains. They consist of an ordinary Markov Chain and a set of observed variables, where the states of the Markov Chain are unknown or hidden, but some variables dependent on the states can be observed.
Why is the concept of 'emission matrix' important in HMMs?
-The 'emission matrix' in HMMs captures the probabilities corresponding to the observed variables, which depend only on the current state of the Markov Chain. It's essential for understanding the relationship between the hidden states and the observable outcomes.
What is the role of Bayes' Theorem in Hidden Markov Models?
-Bayes' Theorem is used in HMMs to find the most likely sequence of hidden states given a sequence of observed variables. It helps in rewriting the problem of finding the probability of hidden states given observed variables into a more manageable form.
How does the video script illustrate the concept of Hidden Markov Models using Jack's hypothetical town?
-The script uses Jack's town, where the weather (rainy, cloudy, sunny) and Jack's mood (sad, happy) are the variables. The weather is the hidden state, and Jack's mood is the observed variable. The script explains how these variables interact and how HMMs can be used to predict the weather based on Jack's mood.
What is the significance of the 'transition matrix' in the context of the video?
-The 'transition matrix' represents the probabilities of transitioning from one state to another in the Markov Chain. In the video, it shows how the weather changes from one day to the next, which is essential for modeling the sequence of states in HMMs.
What is the purpose of the 'stationary distribution' in calculating the probability of the first state in HMMs?
-The 'stationary distribution' is used to find the probability of the initial state in a Markov Chain. It is necessary because the probability of the first state cannot be directly observed and must be inferred from the long-term behavior of the system.
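Both methods the video names for obtaining the stationary distribution (normalized left eigenvectors and repeated matrix multiplication) can be sketched as follows; the transition matrix is an assumed example, not a transcription of the video's slide.

```python
import numpy as np

# Assumed transition matrix for the three weather states (rainy, cloudy, sunny).
A = np.array([[0.5, 0.3, 0.2],
              [0.4, 0.2, 0.4],
              [0.0, 0.3, 0.7]])

# Method 1: the stationary distribution is the normalized left eigenvector
# of A with eigenvalue 1, i.e. an eigenvector of A transposed.
vals, vecs = np.linalg.eig(A.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi /= pi.sum()                       # normalize so the entries sum to 1

# Method 2: repeated matrix multiplication; every row of A**k converges
# to the stationary distribution as k grows.
pi_power = np.linalg.matrix_power(A, 50)[0]

print(np.round(pi, 3))  # → [0.218 0.273 0.509]
```

Both approaches agree; the eigenvector route is exact, while the power method converges geometrically at the rate of the second-largest eigenvalue.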
How does the video script explain the computation of the probability of a given scenario in HMMs?
-The script explains the computation by breaking down the scenario into a product of terms from the emission matrix and transition matrix, and using the stationary distribution for the initial state probability. It provides a step-by-step approach to calculate the joint probability of the observed mood sequence and the weather sequence.
What is the most likely weather sequence for a given mood sequence according to the video?
-The video does not provide the specific sequence but explains the process of finding the most likely weather sequence for a given mood sequence by computing the probability for each possible permutation and selecting the one with the maximum probability.
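The brute-force search described in this answer can be sketched like this; the HMM parameters are assumed for illustration (only the sad-given-rainy probability of 0.9 is stated in the video).

```python
from itertools import product

import numpy as np

# Assumed HMM parameters (states: 0 = rainy, 1 = cloudy, 2 = sunny;
# moods: 0 = sad, 1 = happy).
A = np.array([[0.5, 0.3, 0.2],
              [0.4, 0.2, 0.4],
              [0.0, 0.3, 0.7]])
B = np.array([[0.9, 0.1],
              [0.6, 0.4],
              [0.2, 0.8]])
pi = np.array([0.218, 0.273, 0.509])

def joint_probability(weather, moods):
    p = pi[weather[0]] * B[weather[0], moods[0]]
    for t in range(1, len(weather)):
        p *= A[weather[t - 1], weather[t]] * B[weather[t], moods[t]]
    return p

moods = [1, 1, 0]  # observed: happy, happy, sad
# Enumerate all 3**3 = 27 weather sequences and keep the most probable one.
best = max(product(range(3), repeat=len(moods)),
           key=lambda w: joint_probability(w, moods))
print(best, round(joint_probability(best, moods), 5))
```

With these assumed parameters the maximizer is (2, 2, 1), i.e. sunny, sunny, cloudy, with joint probability ≈ 0.04105, matching the figure quoted in the video. Enumeration grows exponentially with sequence length, but is perfectly fine for three days.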
How does the video script help in understanding the formal mathematics behind HMMs?
-The script introduces symbols to represent the hidden states and observed variables, explains the use of Bayes' Theorem, and simplifies the problem into a form that can be maximized. It provides a clear explanation of the mathematical expressions involved in HMMs, making the formal mathematics more accessible.
Outlines
🌦️ Introduction to Hidden Markov Models
This paragraph introduces the concept of Hidden Markov Models (HMMs), explaining their derivation from Markov chains and their applications in various fields such as bioinformatics, natural language processing, and speech recognition. The speaker encourages viewers to watch previous videos for a better understanding of Markov chains, which are foundational to HMMs. The video promises to cover both the intuition and mathematical aspects of HMMs, including how Bayes' theorem is applied. The scenario of Jack's town with three types of weather and its effect on Jack's mood is used to illustrate the concept of hidden states and observed variables in HMMs. The transition and emission matrices are introduced as key components in modeling the system.
🔍 Analyzing a Scenario with Hidden Markov Models
This paragraph delves deeper into the mechanics of HMMs by analyzing a hypothetical scenario where Jack's mood sequence is observed over three days. The task is to determine the most likely weather sequence that corresponds to this mood sequence. The speaker explains the process of calculating the joint probability of the observed mood sequence and the weather sequence, using the transition and emission matrices. The concept of the stationary distribution is introduced to determine the initial probability of the weather states. The paragraph concludes with a teaser about the formal mathematics behind HMMs, hinting at the use of Bayes' theorem and the importance of understanding the joint probability distribution of the hidden states and observed variables.
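The formal step teased at the end of this outline can be written out. Using the video's notation (hidden sequence X, observed sequence Y), Bayes' theorem turns the problem into maximizing the joint probability:

```latex
\hat{X} = \arg\max_{X} P(X \mid Y)
        = \arg\max_{X} \frac{P(Y \mid X)\,P(X)}{P(Y)}
        = \arg\max_{X} P(Y \mid X)\,P(X),
% the denominator P(Y) does not depend on X and can be dropped.

P(Y \mid X) = \prod_{i=0}^{n} P(y_i \mid x_i)
\qquad\text{(entries of the emission matrix)}

P(X) = \pi(x_0)\,\prod_{i=1}^{n} P(x_i \mid x_{i-1})
\qquad\text{(stationary distribution and transition matrix)}
```

For three observed days this product has exactly the six factors used in the video's worked example: one initial-state term, two transition terms, and three emission terms.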
Keywords
💡Hidden Markov Model
💡Markov Chain
💡Bayes' Theorem
💡Observed Variables
💡Transition Matrix
💡Emission Matrix
💡Stationary Distribution
💡Joint Probability
💡Most Likely Sequence
💡Natural Language Processing
💡Bioinformatics
Highlights
Introduction to Hidden Markov Models (HMMs) and their derivation from Markov chains.
HMMs' applications in bioinformatics, natural language processing, and speech recognition.
Explanation of the intuition and mathematics behind Hidden Markov Models.
Importance of understanding Markov chains before diving into HMMs.
Use of a hypothetical town scenario to illustrate the concept of a Markov chain.
Description of the weather and mood states in the hypothetical town as a Markov chain model.
Introduction of the concept of hidden states in HMMs through the weather example.
Differentiation between the transition matrix and the emission matrix in HMMs.
Explanation of how observed variables depend only on the current state, not on previous states.
Calculation of the probability of a given mood sequence using the emission and transition matrices.
Discussion on finding the most likely sequence of hidden states given an observed sequence.
Use of the stationary distribution to determine the probability of the initial state.
Computational approach to finding the most probable sequence using a Python script.
Formal mathematical explanation involving Bayes' theorem in HMMs.
Derivation of the joint probability distribution of observed and hidden states.
Clarification of the process to maximize the probability of a sequence of hidden states given observed data.
Encouragement for viewers to rewatch the video for better understanding.
Call to action for comments, suggestions, and subscriptions to support the channel.
Transcripts
Hello people from the future, welcome to Normalized Nerd! Today I'm going to talk about hidden Markov models. This is a concept derived from our old friend, Markov chains, and it is very useful in bioinformatics, natural language processing, speech recognition, and many other domains. In this video I will go through both the intuition and the mathematics behind it, so make sure to watch this video till the end. And yes, I will also show you how Bayes' theorem helps us in this process. So if you find value in this video, please consider subscribing and hit the bell icon; that will help me a lot. So let's get started.

Well, to understand hidden Markov models, obviously you need to know what a Markov chain really is, so I would highly recommend you to watch my previous videos before continuing with this one.

Okay, let's start our discussion with our friend Jack, who lives in a hypothetical town. There exist only three kinds of weather in the town: rainy, cloudy, and sunny. On any given day only one of them will occur, and the weather tomorrow depends only on today's weather. Yes, we can model this as a simple Markov chain. Let me add the state transitions, and here are the transition probabilities. Also assume that on any given day Jack can have one of the two following moods: sad or happy. Moreover, his mood depends on the weather of that particular day. I'm going to represent it with red arrows; here are the corresponding probabilities. So according to our diagram, there's a 90% chance that Jack is sad given that it's a rainy day. Hopefully it's clear to you too.

Okay, so now our problem is: we don't live in Jack's town, so we can't possibly know what the weather is on a particular day. However, we can contact Jack over the internet and get to know his mood. So the states of the Markov chain are unknown, or hidden, from us, but we can observe some variables that are dependent on the states. And this, my friend, is called a hidden Markov model. In other words, a hidden Markov model consists of an ordinary Markov chain and a set of observed variables. Please note that Jack's mood, that is, the observed variable, depends only on today's weather, not on the previous day's mood.

To make things more clear, I am going to write the probabilities into matrices. You are already familiar with the green matrix; that is the transition matrix. And the red one captures the probabilities corresponding to the observed variables, which is also known as the emission matrix. Let me add the indices to make it more interpretable. This is the correct time to pause and convince yourself that you have understood everything so far.

Okay, let's consider a particular scenario. We will look at three consecutive days. Suppose on the first day it was sunny and Jack was happy, the next day was cloudy and Jack was happy, and on the third day it was sunny again but Jack was sad. Well, I know that we can't observe the hidden states, but just assume that this scenario happened; trust me, analyzing this will help us a lot. Okay, the question I want to ask here is: what is the probability of this scenario occurring? More precisely, what is the joint probability of the observed mood sequence and the weather sequence? Pause here if you would like to try.

By using the Markov property, we can compute this as a product of six terms: first the probability of a sunny day, then the probability of a happy mood given a sunny day, then the transition probability from sunny to cloudy, and so on. As for how we got these six terms, please bear with me till the end to understand the underlying mathematics. We can fill the red terms from the emission matrix and the green ones from the transition matrix. But what about the first term; how can we find the probability of the first state? Yes, we need the stationary distribution of the Markov chain for this purpose. You can use the normalized left eigenvectors or repeated matrix multiplication to compute the stationary distribution; I have explained both of them in my previous videos.
So as you can see, the sunny state has a probability of 0.509. Now we can compute the product, and the answer turns out to be 0.00391.
Okay, let me hide the states now, so we only have a sequence of the observed variable. Here I'm going to ask a more interesting question: what is the most likely weather sequence for this given mood sequence? There are many possible permutations, right? We can have rainy-cloudy-sunny, cloudy-rainy-sunny, rainy-cloudy-rainy, and so on. To find the most likely sequence, we need to compute the probability corresponding to each of them and find the one with the maximum probability. We can calculate the probability just like we did in the last case. I wrote a Python script to do the computations and found that this weather sequence maximizes the joint probability, and the probability is 0.04105.

If you are still watching this, I believe you would also like to know the formal mathematics behind this, so get ready for some symbols instead of emojis. I will represent the hidden states of our Markov chain by X and the observed variables by Y. We can rewrite our problem like this. This simply means: find that particular sequence of X for which the probability of X given Y is maximum. Please note that in a hidden Markov model we observe the sequence of Y; that's why I have written X given Y. Now, if you notice carefully, you will see there's no direct way to find this probability. Here comes Bayes' theorem. By the way, I have a video explaining Bayes' theorem; you can watch that if you want. So by using Bayes' theorem, we can rewrite this as the probability of Y given X times the probability of X, upon the probability of Y. For all practical purposes we can neglect the denominator, and the numerator is just the joint probability distribution of X and Y.

Now we are going to further simplify this. Let's take the first part, the probability of Y given X. According to our assumption, y_i depends only on x_i, so we can write the probability of Y given X as a product. We can fill all these terms from our red matrix. For the second term, the probability of X, we must use the Markov property, which says x_i depends only on x_(i-1), so we can convert P(X) into this product. Now there's one subtlety involved: for x_0 we must use the stationary distribution vector, as I already showed you earlier. Okay, so now replace these things into the above expression, and we have got the thing that we want to maximize. You can clearly see that this expression is just a product of 2n terms. I hope now you fully understand the origin of those six terms in the example that I computed earlier. Isn't it super elegant? Well, I guess it is.

I know it's not the easiest thing, so don't worry if you didn't get everything at once; you can always replay the video to remove your confusion. So that was all for this video, guys. Do comment below your thoughts and suggestions, and a subscription will be a major help to me. Goodbye guys, stay safe, and thanks for watching.