Human activity detection
Summary
TLDR: This video discusses the importance and implementation of a human activity detection model in the context of Big Data. Highlighting its significance in medical care and elderly support, the script details the dataset from the UCI repository, which includes eight attributes and eleven activity classes. After pre-processing, feature extraction, and model training using a decision tree, the model achieved over 73% accuracy in classifying activities based on attributes like Z coordinates and tag identifiers. The video concludes with suggestions for improving model performance through time series analysis.
Takeaways
- 🧑‍🔬 Activity detection is increasingly important in medical care, supporting elderly independent living and emergency assistance.
- 📈 Companies like Fitbit and Apple rely on activity detection for health monitoring and safety features, respectively.
- 📚 The data set used for the project was sourced from the UCI Machine Learning Repository and includes eight attributes and eleven classes.
- 👥 The data set was created by recording five people performing various activities over five sessions each.
- 📊 The data set attributes include sequence name, tag identifier, timestamp, date, and x, y, z coordinates, along with the activity classification.
- ⚙️ Pre-processing involved checking for missing values and removing highly correlated activities, resulting in three main classes: lying, walking, and sitting.
- 📈 Feature extraction and analysis identified highly correlated attributes like timestamp, tag identifier, and x, y, z coordinates.
- 📉 Box plots were used to visualize variations in attributes across different activities, aiding in attribute selection for the model.
- 🌐 A 70/30 train-test split was found to be optimal for model training, with 5-fold cross-validation repeated three times for parameter tuning.
- 🔑 The decision tree model was chosen for classification, using attributes like x, y, z coordinates, timestamp, and tag identifier.
- 📊 The model achieved over 73% accuracy in classifying activities, with the Z coordinate being particularly effective for distinguishing lying and walking.
- 🤖 Error analysis showed some confusion between lying and sitting, and walking and sitting, suggesting room for improvement in the model.
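The error analysis in the last takeaway can be sketched as a row-normalised confusion matrix: each row is divided by the number of true samples of that class, so the diagonal gives the per-class hit rate and the off-diagonal cells show which classes get mixed up. The counts below are invented for illustration and are not the video's results; `row_rates` and `overall_accuracy` are hypothetical helper names.

```python
# Hypothetical confusion counts (rows = true activity, columns = predicted).
# These numbers are made up for illustration; they are not the video's results.
LABELS = ["lying", "sitting", "walking"]
CONFUSION = [
    [80, 15, 5],   # true lying
    [10, 70, 20],  # true sitting
    [5, 10, 85],   # true walking
]


def row_rates(matrix):
    """Convert each row of counts into fractions of that true class."""
    rates = []
    for row in matrix:
        total = sum(row)
        rates.append([count / total for count in row])
    return rates


def overall_accuracy(matrix):
    """Share of all samples that fall on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


if __name__ == "__main__":
    for label, row in zip(LABELS, row_rates(CONFUSION)):
        cells = ", ".join(f"{pred}={rate:.2f}" for pred, rate in zip(LABELS, row))
        print(f"true {label}: {cells}")
    print(f"overall accuracy: {overall_accuracy(CONFUSION):.2f}")
```

Reading along a row shows where a given activity's samples ended up, which is how figures such as "lying confused with sitting" are usually obtained.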
Q & A
Why is activity detection important in the field of medical care?
-Activity detection is important in medical care because it can support the elderly for independent living and can be a life-saving feature, such as in fall detection systems that automatically alert emergency services if a person falls and doesn't get up for a while.
Which companies are mentioned in the script that rely on activity detection?
-Fitbit and Apple are mentioned as companies that rely on activity detection. Fitbit uses it for tracking physical activities, and Apple uses it for features like fall detection in its Apple Watch products.
What is the source of the data set used for the human activity detection model?
-The data set used for the human activity detection model was obtained from the UCI Machine Learning Repository.
How many people were involved in the creation of the data set and what were they asked to do?
-Five people were involved in the creation of the data set, and each was asked to perform a sequence of activities five times.
What are the eight attributes contained in the data set?
-The eight attributes in the data set are the sequence name, tag identifier, timestamp, date, and the x, y, z coordinates, followed by the activity classification.
What was the reason for removing certain activities during the pre-processing stage?
-Certain activities were removed during pre-processing because they were highly correlated. For example, activities such as lying and lying down had extremely similar variations in their features.
How many class labels were retained after the pre-processing of the data set?
-After pre-processing, three class labels were retained: lying, walking, and sitting.
What was the best train-test split ratio found for the model?
-The best train-test split ratio found for the model was 70/30.
Which model was used to classify the data based on the given attributes?
-A decision tree model was used to classify the data based on the given attributes.
What attributes were selected for training the decision tree model?
-The attributes selected for training the decision tree model were the x, y, z coordinates, along with the timestamp and the tag identifier.
What was the overall accuracy achieved by the decision tree model in classifying the activities?
-The decision tree model achieved an accuracy of over 73% in classifying the activities.
What was the main issue identified in the error analysis of the decision tree model?
-The main issue identified in the error analysis was that the model confused lying with sitting about 16% of the time, and walking with sitting about 12% of the time.
How can the performance of the decision tree be improved further?
-The performance of the decision tree can be improved by conducting a time series analysis of each attribute, which is likely to increase accuracy.
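One possible reading of the "time series analysis" suggested in the last answer is to add rolling-window features, so that the classifier sees how a coordinate has been changing rather than a single reading. The sketch below is an assumption about what such features could look like, not the video's method; the record layout `(sequence, tag, timestamp, z)`, the window size, and the function names are all hypothetical.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev


def rolling_features(values, window=3):
    """Return (mean, std) over the last `window` values, once the window is full."""
    recent = deque(maxlen=window)
    features = []
    for value in values:
        recent.append(value)
        if len(recent) == window:
            features.append((mean(recent), pstdev(recent)))
    return features


def features_per_stream(records, window=3):
    """Group (sequence, tag, timestamp, z) records by stream, sort by time, add features."""
    streams = defaultdict(list)
    for sequence, tag, timestamp, z in records:
        streams[(sequence, tag)].append((timestamp, z))
    result = {}
    for key, samples in streams.items():
        samples.sort()  # order each stream by timestamp before windowing
        result[key] = rolling_features([z for _, z in samples], window)
    return result


if __name__ == "__main__":
    # Made-up readings for one tag in one recording session.
    demo = [("A01", "CHEST", t, z) for t, z in enumerate([1.0, 1.0, 1.0, 0.4, 0.1])]
    for key, feats in features_per_stream(demo).items():
        print(key, [(round(m, 2), round(s, 2)) for m, s in feats])
```

Grouping by sequence and tag before windowing keeps readings from different people or sensors from being mixed into the same window.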
Outlines
📊 Human Activity Detection: Importance and Data Overview
The video introduces a human activity detection model developed as part of a Big Data course. It emphasizes the significance of activity detection in healthcare and elderly care, citing examples like Fitbit and Apple Watch's fall detection feature. The dataset used is from the UCI Machine Learning Repository, capturing data from five individuals performing various activities. The data includes attributes such as sequence name, tag identifier, timestamps, dates, and XYZ coordinates, along with activity labels. The video explains the pre-processing steps, which involved removing highly correlated activities to focus on three primary classes: lying, walking, and sitting. This decision was made to retain over 80% of the dataset while simplifying the model's complexity.
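The row layout and the class filtering described above can be sketched in a few lines. This is a minimal illustration, assuming a comma-separated file with the eight columns in the order listed; the sample rows, tag names, dates, and the helper names `parse_row` and `load_rows` are made up for the example and are not taken from the real data set.

```python
import csv
import io

# Column order as described in the video; the names themselves are assumptions.
COLUMNS = ["sequence", "tag", "timestamp", "date", "x", "y", "z", "activity"]
KEPT_ACTIVITIES = {"lying", "walking", "sitting"}

# Made-up rows for illustration only; values do not come from the real data set.
SAMPLE = """\
A01,CHEST,1000,01.01.2000 00:00:00,4.0,1.9,1.0,walking
A01,CHEST,1001,01.01.2000 00:00:01,4.1,1.9,0.9,falling
B03,ANKLE_LEFT,2000,01.01.2000 00:10:00,2.2,1.5,0.2,lying
"""


def parse_row(fields):
    """Turn one CSV row into a dict, splitting the sequence name into person and session."""
    row = dict(zip(COLUMNS, fields))
    row["person"] = row["sequence"][0]         # e.g. "A"
    row["session"] = int(row["sequence"][1:])  # e.g. 1
    for axis in ("x", "y", "z"):
        row[axis] = float(row[axis])
    return row


def load_rows(text):
    """Parse all rows and keep only the three retained activity classes."""
    reader = csv.reader(io.StringIO(text))
    rows = [parse_row(fields) for fields in reader if fields]
    return [row for row in rows if row["activity"] in KEPT_ACTIVITIES]


if __name__ == "__main__":
    for row in load_rows(SAMPLE):
        print(row["person"], row["session"], row["tag"], row["z"], row["activity"])
```

Comparing the number of rows before and after the filter gives the retained share of the data, which the video reports as over 80%.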
📈 Data Analysis and Model Training Strategy
This paragraph delves into the analysis of the dataset, highlighting the use of box plots to understand attribute variations across different activities. It discusses the selection of attributes for model training based on their correlation and variation. The training and testing strategy involves a 70/30 split and 5-fold cross-validation repeated three times to ensure robust parameter tuning. The chosen model is a decision tree, which uses attributes like XYZ coordinates, timestamp, and tag identifier to classify activities. The model's accuracy in classifying activities such as lying, walking, and sitting is reported, with a focus on the decision tree's ability to differentiate these activities based on the Z coordinate primarily.
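The training strategy described here (a 70/30 split, then 5-fold cross-validation repeated three times for tuning a decision tree) could look roughly like the following with scikit-learn. The video does not say which library or which parameters were tuned, so the stratified variant of the cross-validator, the `max_depth` grid, and the synthetic data are all assumptions made for this sketch.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: three features and three classes, not the real data set.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 100)
X = rng.normal(size=(300, 3))
X[:, 2] += y  # make the third feature informative about the class

# 70/30 train-test split, keeping class proportions equal in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# 5-fold cross-validation repeated three times gives 15 validation rounds.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 3, 5]},  # hypothetical tuning grid
    cv=cv,
)
search.fit(X_train, y_train)

print("best max_depth:", search.best_params_["max_depth"])
print("held-out accuracy:", search.score(X_test, y_test))
```

Tuning only on the training portion and scoring once on the held-out 30% keeps the reported accuracy independent of the parameter search.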
Keywords
💡Activity Detection
💡Data Analysis
💡Big Data
💡UCI Machine Learning Repository
💡Pre-processing
💡Feature Extraction
💡Correlation
💡Decision Tree
💡Accuracy
💡Confusion Matrix
💡Time Series Analysis
Highlights
The importance of activity detection in medical care and elderly support.
Companies like Fitbit and Apple utilize activity detection for health monitoring and safety features.
Introduction of a human activity detection model implemented as part of a Big Data course.
Data set obtained from the UCI Machine Learning Repository with eight attributes and eleven classes.
Description of the data set attributes including sequence name, tag identifier, timestamp, and coordinates.
The significance of the activity label in identifying the performed activity.
Pre-processing steps including checking for missing values and removing highly correlated activities.
Final class labels after pre-processing: lying, walking, and sitting.
Feature extraction and data visualization techniques used in the study.
High correlation found between timestamp, tag identifier, and XYZ coordinates.
Selection of attributes for model training based on their variation with activities.
70/30 train-test split and 5-fold cross-validation strategy for model fine-tuning.
Use of a decision tree model for activity classification based on selected attributes.
Achievement of over 73% accuracy in classifying activities using the decision tree model.
Error analysis revealing confusion between similar activities like lying and sitting.
Potential improvement of model performance through time series analysis of attributes.
Summary of the activity detection model's effectiveness and areas for future enhancement.
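Several of the highlights rest on the idea that a single attribute, such as the Z coordinate, can already separate some activities. A decision tree finds such a split by trying thresholds on a feature and keeping the one with the lowest weighted impurity. The sketch below shows that one step using Gini impurity; the function names and the sample values are hypothetical, and the video does not state which impurity measure its tree used.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def best_threshold(values, labels):
    """Return the (threshold, weighted_gini) split on one feature with the lowest impurity."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (threshold, score)
    return best


if __name__ == "__main__":
    # Made-up chest-tag heights: low values while lying, higher while walking.
    z = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1]
    activity = ["lying", "lying", "lying", "walking", "walking", "walking"]
    print(best_threshold(z, activity))
```

A full tree repeats this search over every attribute at every node, which is why the most discriminating attribute tends to appear at the root.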
Transcripts
Hi everyone. In this video we talk about the human activity detection model that we implemented as a part of our Introduction to Big Data course. But before we get into the implementation, let's first have a look at why activity detection is important. As we can see from the current trend, data analysis is becoming increasingly important in the field of medical care. Activity detection can be important for the purposes of supporting the elderly for independent living. Companies like Fitbit rely almost entirely on activity detection, while companies like Apple have introduced features like fall detection in the Apple Watch products, which could be a life-saving feature. For example, if a person falls and doesn't get up for a while, he's probably injured, so the watch automatically alerts the emergency services for assistance.

For these reasons, we have chosen the Localization Data for Person Activity as our data set, which we obtained from the UCI Machine Learning Repository. For the creation of this data set, five people were each asked to perform a sequence of activities five times, and the data set contains eight attributes and eleven classes, which we'll get into in just a minute.
Alright, let's have a look at the attributes in the data set. The data set has eight attributes, namely the sequence name, the tag identifier, timestamp, date, and x, y, and z coordinates, followed by the activity it is classified into. As we can see from the notebook on the right, the sequence name is composed mainly of nominal values from A01 to E05. Over here, A denotes the name of the person being recorded, while the number denotes the recording session of that person. The tag identifier helps us identify which tag sensor's information is actually being reported in the row. The timestamp tells us the time at which the recording was made, while the date gives us the date of the recording. The x, y, z coordinates, as we know, are simply the x, y, z coordinates of that tag. Finally, the activity label gives us which activity was actually being performed.

Let's have a look at an example instance. Here A01 is the sequence name: A is the name of the person performing the activity, while 01 is the first recording instance of that activity set. The second column, the tag identifier, corresponds to the chest tag, so we know that this row reports data mainly for the chest tag. Here's the timestamp and the x, y, and z coordinates for the data. As we can see, this particular row has been classified into the walking activity.

Next we get into pre-processing the data. As a part of our pre-processing, we checked for missing values and removed highly correlated activities. For example, activities such as lying and lying down had an extremely similar variation in terms of their features, so we removed such activities and ended up with three final class labels, namely lying, walking, and sitting. Since these are the most important activities generally performed by humans, we still managed to retain over 80% of our data set. We can see the cleaning being performed in the Jupyter notebook to the right.
Next we get into the feature extraction from our data set. As we can see from the right, we computed the summary and visualized the data set as follows. We found that lying, sitting, and walking comprised a major part of the data set. To see the correlation between all the attributes, we plotted their correlations, as can be seen in the notebook on the right. The observations from the plot can be seen below. We found, looking at the upper right triangle, that attributes like the timestamp, tag identifier, and x, y, and z coordinates were highly correlated, and this is what drove our attribute selection to train our model.

Let's have a look at the variation of each attribute activity-wise. As we can see from the box plots for the activity versus tag identifier, the timestamp, and the x, y, and z coordinates, we are able to make out how each attribute varies with activities. Thus, this gives us a good sense of which attributes to select for the purpose of training our model.

Next we dive into the train and test strategy we used for our model. After dividing the data into several ratios, we found the best split to be the 70/30 train-test split for our model. For the purposes of fine-tuning our parameters, we used a 5-fold cross-validation repeated three times. This helped us to work with the dependent and grouped data, as in the data set below. We can see in the notebook on the right how the data was split into train and test groups.
Let's have a look at the model we used to classify our data. We used the decision tree model for the purposes of binning our activities, based on the given attributes, into sitting, walking, and lying. As seen from the attribute correlations plotted above, we selected attributes such as the x, y, and z coordinates, along with the timestamp and the tag identifier, for the purposes of training our decision tree. As we can see from the decision tree in the notebook to the right, based simply on the z coordinate we are able to classify activities such as lying and walking with accuracy of 61 and 53% respectively.

Alright, let's try and evaluate our model now. We were able to achieve an accuracy of over 73% in classifying the attributes into activities. Let's analyze the performance of our decision tree. Based on attributes like the z coordinate, we were able to classify the data to a large extent. Next, it used the tag identifier attribute to be able to differentiate between activities like sitting and walking. Thirdly, the y coordinate was used to differentiate between activities like lying and walking.

Let's get into the error analysis of our decision tree. As we can see from the confusion matrix, we were able to correctly predict activities such as lying, sitting, and walking with an accuracy of over 82, 71, and 80%. However, our tree still got confused between activities such as lying and sitting about 16% of the time, and walking and sitting about 12% of the time.

Finally, let's summarize what we just saw. We saw how each activity varies with the attributes, mainly through the box plots of the activities versus attributes. This also validated our initial ideas that activities such as lying would have very little change in all three coordinates, whereas activities like walking would have no change in the chest z coordinate, especially while the person is walking in a straight line. To improve the performance of our tree, we can further do a time series analysis of each of these attributes, which is definitely bound to increase the accuracy. Thank you for your time and patience. Have a good day.