Data Analysis | Project 02: Call Center Data Analysis With SQL
Summary
TLDR: This video script walks through a data analysis process on a call center database table with 12 columns. It begins with a visual inspection, noting issues such as a month-day-year date format, spaces in a column name, and blank values. The script then uses SQL commands to correct the date format, alter data types, and rename the column. It replaces blank values with NULL and checks for duplicates. The analysis includes calculating call percentages by reason, identifying the busiest call days, and measuring call durations. It concludes with a count of calls by customer sentiment and a count of calls within, below, or above the service level for each call center.
Takeaways
- 📊 The script involves working with a database table, focusing on data issues and their resolutions.
- 🗓️ The date format in the 'call timestamp' column needs to be changed from month-day-year to year-month-day.
- 🚫 The 11th column has a name with spaces, which needs to be renamed for consistency.
- 🔄 The 'CSAT score' column holds numbers but is stored with the wrong data type and needs to be changed to 'integer'.
- 🔧 The 'call timestamp' column holds dates but is not stored as a 'date' type, which should be corrected.
- 🛠️ SQL safe updates are temporarily disabled before each UPDATE and re-enabled afterwards; the speaker describes this step as optional.
- 🔄 The 'call timestamp' column's date format is changed using the STR_TO_DATE function in SQL.
- 🔄 Data types are corrected for the 'call timestamp' and 'CSAT score' columns to ensure accuracy.
- 🔧 The 11th column, initially named with spaces, is renamed to 'CD in a minute' to avoid issues.
- ✅ Blank values in the 'CSAT score' column are replaced with NULL to maintain data integrity.
- 🔍 The script checks for duplicate values in the dataset and finds none, indicating data is ready for analysis.
- 📊 Various data analysis steps are performed, including calculating percentages of call reasons, identifying peak call days, and analyzing call durations.
- 📈 The analysis reveals that 'billing question' is the most common call reason with the highest percentage.
- 📉 Friday is identified as the day with the most calls, while Sunday has the least.
- 📊 The minimum, maximum, and average call durations are calculated, providing insights into call handling times.
- 🔎 Customer sentiment analysis shows a higher percentage of negative sentiments compared to positive.
- 📊 The final analysis examines call center performance by counting, for each call center, the calls whose response time was within, below, or above the service level.
Q & A
What was the issue with the date format in the database?
-The dates in the 'call timestamp' column were stored as month-day-year, but SQL expects year-month-day.
How was the date format corrected in the database?
-The date format was corrected by using the STR_TO_DATE function in SQL to convert the 'call timestamp' column to the desired format.
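The conversion can be sketched as follows. This is a minimal illustration, not the video's code: it uses Python's built-in `sqlite3` module so it runs without a MySQL server, the table name `calls` and the sample rows are hypothetical, and the MySQL format string in the comment is an assumption based on the month/day/year layout described.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (id TEXT, call_timestamp TEXT)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [("A1", "10/29/2020"), ("A2", "01/05/2020")],
)

# MySQL form described in the video (not run here; format string assumed):
#   UPDATE calls SET call_timestamp = STR_TO_DATE(call_timestamp, '%m/%d/%Y');
# SQLite has no STR_TO_DATE, so the string is rebuilt from fixed positions.
# This assumes every value is zero-padded MM/DD/YYYY.
conn.execute(
    "UPDATE calls SET call_timestamp = "
    "substr(call_timestamp, 7, 4) || '-' || "
    "substr(call_timestamp, 1, 2) || '-' || "
    "substr(call_timestamp, 4, 2)"
)

dates = [row[0] for row in conn.execute("SELECT call_timestamp FROM calls ORDER BY id")]
print(dates)
```

In MySQL the safe-update toggle (`SET SQL_SAFE_UPDATES = 0;` before, `= 1;` after) would wrap the UPDATE, since it has no WHERE clause.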
What was the problem with the 11th column in the database?
-The name of the 11th column contained spaces, which needed to be renamed for consistency and proper SQL practices.
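A rename of this kind might look like the sketch below. It runs against an in-memory SQLite database (version 3.25 or newer) rather than MySQL; the table name `calls`, the old column name, and the new name `call_duration_minutes` are all assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with a space-containing column name, as in the video.
conn.execute('CREATE TABLE calls (id TEXT, "call duration in minutes" INTEGER)')
conn.execute("INSERT INTO calls VALUES ('A1', 17)")

# MySQL form described in the video (not run here; names are assumptions):
#   ALTER TABLE calls CHANGE COLUMN `call duration in minutes`
#       call_duration_minutes INT NOT NULL;
# SQLite only needs the old and new names.
conn.execute(
    'ALTER TABLE calls RENAME COLUMN "call duration in minutes" '
    "TO call_duration_minutes"
)

column_names = [row[1] for row in conn.execute("PRAGMA table_info(calls)")]
duration = conn.execute("SELECT call_duration_minutes FROM calls").fetchone()[0]
print(column_names, duration)
```

A name without spaces no longer needs quoting in later queries, which is the practical reason for the rename.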
How were the data type issues in the 'CSAT score' and 'call timestamp' columns addressed?
-The data types were corrected by altering the table to change the 'CSAT score' to an integer and the 'call timestamp' to a date type.
What was the issue with the values in the fourth column?
-There were blank values in the fourth column, which were replaced with NULL to accurately represent missing data.
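A sketch of the replacement, under stated assumptions: the table `calls`, the column name `csat_score`, and the sample rows are hypothetical, and the code treats both an empty string and 0 as "missing" (the video filters on the value being equal to zero). It uses SQLite rather than MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (id TEXT, csat_score INTEGER)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [("A1", 7), ("A2", ""), ("A3", 0), ("A4", 3)],
)

# Assumption: an empty string or a 0 both mean "no score recorded".
# MySQL form described in the video (not run here):
#   UPDATE calls SET csat_score = NULL WHERE csat_score = 0;
conn.execute(
    "UPDATE calls SET csat_score = NULL "
    "WHERE csat_score = '' OR csat_score = 0"
)

scores = conn.execute("SELECT id, csat_score FROM calls ORDER BY id").fetchall()
missing = conn.execute("SELECT COUNT(*) FROM calls WHERE csat_score IS NULL").fetchone()[0]
print(scores, missing)
```

Storing NULL instead of 0 matters for later aggregates: functions such as AVG skip NULLs, whereas zeros would pull the average down.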
How were duplicates checked in the database?
-Duplicates were checked by counting the total number of rows and comparing it with the count of unique 'ID' and 'customer name' values.
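The comparison described here can be sketched in a single query. The table name `calls` and the sample rows are hypothetical; the video reports 32,941 for all three counts on the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (id TEXT, customer_name TEXT)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [("A1", "Ann"), ("A2", "Bob"), ("A3", "Cy")],
)

# Total rows versus distinct ids and distinct customer names.
total, unique_ids, unique_names = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT id), COUNT(DISTINCT customer_name) FROM calls"
).fetchone()

# If every count matches, no id or customer name is repeated.
has_duplicates = not (total == unique_ids == unique_names)
print(total, unique_ids, unique_names, has_duplicates)
```

Note that matching name counts is a stricter condition than strictly needed: two different customers could share a name without any row being a duplicate, so the id comparison is the more reliable signal.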
What was the most common reason for calls according to the analysis?
-The most common reason for calls was 'billing question', with a percentage of 71.2%.
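The per-reason percentage can be sketched as a grouped count divided by the overall row count. The table `calls`, the reason labels other than the billing question, and the sample data are hypothetical, so the numbers below are not the video's results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (id INTEGER, reason TEXT)")
reasons = ["Billing Question"] * 5 + ["Payments"] * 2 + ["Service Outage"]
conn.executemany("INSERT INTO calls VALUES (?, ?)", list(enumerate(reasons)))

# Count per reason, divided by the total row count (scalar subquery),
# scaled to a percentage and rounded to one decimal place.
reason_stats = conn.execute(
    "SELECT reason, COUNT(*) AS num_calls, "
    "ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM calls), 1) AS pct "
    "FROM calls GROUP BY reason ORDER BY num_calls DESC"
).fetchall()

for reason, num_calls, pct in reason_stats:
    print(f"{reason}: {num_calls} calls ({pct}%)")
```

Multiplying by `100.0` rather than `100` keeps the division in floating point; with integer operands SQLite would truncate the result.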
Which day of the week had the most calls?
-Friday had the most calls, while Sunday had the least.
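A sketch of the per-weekday count follows. The video uses MySQL's `DAYNAME`; SQLite has no such function, so this version groups on `strftime('%w', ...)` (0 = Sunday) and maps the number to a name in Python. The table `calls` and the sample dates are hypothetical.

```python
import sqlite3

DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (id INTEGER, call_timestamp TEXT)")
sample_dates = ["2020-10-30"] * 3 + ["2020-10-31"] * 2 + ["2020-11-01"]
conn.executemany("INSERT INTO calls VALUES (?, ?)", list(enumerate(sample_dates)))

# MySQL form described in the video (not run here):
#   SELECT DAYNAME(call_timestamp) AS day_of_call, COUNT(*) AS num_of_calls
#   FROM calls GROUP BY 1 ORDER BY 2 DESC;
rows = conn.execute(
    "SELECT CAST(strftime('%w', call_timestamp) AS INTEGER) AS dow, "
    "COUNT(*) AS num_of_calls FROM calls GROUP BY 1 ORDER BY 2 DESC"
).fetchall()

calls_per_day = [(DAY_NAMES[dow], count) for dow, count in rows]
print(calls_per_day)
```

`GROUP BY 1` and `ORDER BY 2` refer to the first and second columns of the SELECT list, the same positional shorthand the speaker uses.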
What were the minimum, maximum, and average call durations?
-The minimum call duration was 5 minutes, the maximum was 45 minutes, and the average call duration was 25 minutes.
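The three aggregates can be computed in one pass, as sketched below. The table `calls`, the column name `call_duration_minutes`, and the four sample durations are hypothetical; they were picked so the toy output happens to match the figures quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (id INTEGER, call_duration_minutes INTEGER)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [(1, 5), (2, 20), (3, 45), (4, 30)],
)

# MIN, MAX and a rounded AVG over the duration column.
min_duration, max_duration, avg_duration = conn.execute(
    "SELECT MIN(call_duration_minutes), MAX(call_duration_minutes), "
    "ROUND(AVG(call_duration_minutes), 1) FROM calls"
).fetchone()

print(min_duration, max_duration, avg_duration)
```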
What was the sentiment analysis result for the 'sentiment' column?
-The sentiment analysis showed that negative sentiments were higher compared to positive sentiments, with positive sentiments being very low.
How were calls within or above the service level checked for each call center?
-Calls within or above the service level were checked by grouping and counting the calls based on 'call center' and 'response time', then ordering the results by 'call center' and 'counts'.
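The grouping described here can be sketched as follows. The table `calls`, the response-time labels, the Chicago call center, and the sample rows are assumptions for illustration; only Baltimore is named in the transcript.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (call_center TEXT, response_time TEXT)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [
        ("Baltimore", "Within SLA"), ("Baltimore", "Within SLA"),
        ("Baltimore", "Below SLA"),
        ("Chicago", "Within SLA"), ("Chicago", "Within SLA"),
        ("Chicago", "Within SLA"), ("Chicago", "Above SLA"),
    ],
)

# Group by call center and response time (columns 1 and 2), then sort by
# call center and by the count (columns 1 and 3).
sla_counts = conn.execute(
    "SELECT call_center, response_time, COUNT(*) AS counts "
    "FROM calls GROUP BY 1, 2 ORDER BY 1, 3"
).fetchall()

for center, response, counts in sla_counts:
    print(f"{center:<10} {response:<11} {counts}")
```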
Outlines
🔍 Database and Data Table Inspection
The speaker begins by introducing the database and data table and previews the first 10 rows. They note issues such as the date format in the 'call timestamp' column, which is month-day-year but should be in year-month-day order, and a column name with spaces that needs renaming. Additionally, there are blank values in the fourth column that require attention. The speaker then checks the column data types before correcting the identified issues by updating the data and altering the table structure.
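The inspection step can be sketched as a row preview plus a schema listing. This is an illustrative stand-in using SQLite: the table `calls`, its column names, and the generated rows are hypothetical, and `PRAGMA table_info` plays the role of the `EXPLAIN database.table` statement used in the video.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature of the call center table; names are assumptions.
conn.execute(
    "CREATE TABLE calls (id TEXT, customer_name TEXT, sentiment TEXT, "
    'csat_score TEXT, call_timestamp TEXT, "call duration in minutes" TEXT)'
)
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?, ?, ?, ?)",
    [(f"ID-{i}", f"Customer {i}", "Neutral", "", "10/29/2020", "17") for i in range(25)],
)

# Preview only the first 10 rows, as in the initial inspection.
preview = conn.execute("SELECT * FROM calls LIMIT 10").fetchall()

# List each column's name and declared type to spot mismatches.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(calls)")]
for name, declared_type in columns:
    print(f"{name!r}: {declared_type}")
```

Here every column is declared as text, which mirrors the kind of mismatch the speaker looks for: a numeric score and a date both stored as strings.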
🛠️ Correcting Data Types and Renaming Columns
In this segment, the speaker focuses on correcting data types and renaming a column. They explain that the 'CSAT score' should be an integer and the 'call timestamp' should be a date type, and they modify both with ALTER TABLE statements. They then rename the 11th column from 'call duration in minutes' to 'CD in a minute'. Finally, the speaker addresses the blank values in the 'CSAT score' column, replacing zeros with null values, and checks for duplicates, confirming there are none.
📊 Analyzing Call Data and Customer Sentiments
The speaker moves on to data analysis, starting with the number of calls made for each reason and the corresponding percentages, computed with a grouped SQL query. The analysis continues with the number of calls per day of the week, revealing that Friday has the most calls. The speaker also calculates the minimum, maximum, and average call duration. Lastly, they count calls by customer sentiment, finding more negative sentiments than positive ones, and count the calls within, below, or above the service level for each call center.
📝 Conclusion and Project Details
In the final segment, the speaker concludes the data analysis. They mention that a download link and the project details will be available in the description below the video. The speaker thanks the viewers for watching and invites them to join the next project.
Keywords
💡Database
💡Data Table
💡Column
💡Data Type
💡SQL
💡Timestamp
💡Data Analysis
💡Null Values
💡Duplicate Values
💡Customer Sentiment
💡Service Level
Highlights
Introduction to the database and data table with 12 columns.
Identification of issues with date format and column naming.
Plan to change the date format from month-day-year to year-month-day.
Need to rename a column that contains spaces.
Addressing blank values in the fourth column.
Checking data types and identifying mismatches.
Conversion of data types for 'CSAT score' and 'Call timestamp'.
Disabling SQL safe updates for data manipulation.
Updating the 'Call timestamp' column to the correct date format.
Enabling SQL safe updates after the update.
Fixing data type issues for 'Call timestamp' and 'CSAT score'.
Renaming the 11th column from 'Call duration in minutes' to 'CD in a minute'.
Replacing zero values in the 'CSAT score' column with null.
Checking for duplicate values and confirming the data's uniqueness.
Analyzing call reasons and calculating the percentage of calls per reason.
Identifying the day with the most calls and the least.
Calculating minimum, maximum, and average call duration.
Analyzing customer sentiment with a focus on negative versus positive sentiments.
Checking the number of calls within, below, or above the service level for each call center.
Conclusion of data analysis and introduction to the next project.
Transcripts
This is our database and this is our data table, so let's check our table first to see how it looks.
We'll limit the results to 10 rows.
Let's check it now.
Okay, this is our data table and we have 12 columns here.
There are some problems here. The date format under call timestamp is month, day and then year, but in SQL we need the year first, then the month and then the day, so we need to change that.
In the 11th column, the name of the column contains spaces, so we need to rename it.
And in the fourth column there are some blank values. We need to fix that too.
So now in step number two, let's check the data types.
EXPLAIN, then we write the database name, dot, the data table name.
Let's check the result.
Okay, the problems we found are: the CSAT score is a number, but [unclear] returns a different data type, and the call timestamp, we know this is a date, but the data type is [unclear]. We need to change that too.
Okay, so let's convert the date format first.
So our step number three is to change the date format. To do this, first of all we need to disable SQL safe updates. This is optional, but it is there for safety.
SET SQL_SAFE_UPDATES equal to zero.
Now we update our table, the table name, and we set the column name, call timestamp, equal to STR_TO_DATE, parenthesis, the column name again, call timestamp, and the format we have: month, slash, day, slash, year.
Okay, fine. Now enable SQL safe updates again: copy this code and change 0 to 1.
Now let's check this.
Okay, we have an error here.
Now it will work.
Yeah, it worked.
Let's check our data table again. We'll limit the results to 10 rows.
Let's check it.
Yes, the date format is now year, month and day.
Now step number four: we are going to fix the data type issues. We have to change the data types to date for call timestamp and integer for CSAT score.
So ALTER TABLE, our table name, then MODIFY, or MODIFY COLUMN, and the column name is call timestamp. We put the data type DATE.
Okay, another one. Now ALTER TABLE, table name, then MODIFY COLUMN, the name is CSAT score, and we put the integer type [unclear].
Let's check.
Okay, the table is successfully updated. Now let's check the data types again.
EXPLAIN, our database name, dot, our data table name.
These are the updated types. All the types are now correct.
Now step number five: let's rename the 11th column, which was call duration in minutes. First of all, check the data table again.
Call duration in minutes: we are going to change this name.
So we ALTER our table, the name is CC data, now CHANGE the column, call duration in minutes, and the new name of this column will be CD in a minute.
Okay, the data type will be integer [unclear] and NOT NULL.
Okay, go.
The column header is successfully updated. Let's check it.
Now the blank values in the fourth column: we will replace those zero values with NULL.
So again, let's disable SQL safe updates: SET SQL_SAFE_UPDATES is equal to zero.
Now UPDATE our data table and SET our column name, CSAT score, equal to NULL WHERE CSAT score is equal to zero.
Now again enable SQL safe updates: copy this code and change the value to 1.
Okay, successfully updated. Let's check it now.
All the blank values are now replaced with NULL.
Okay, now we are going to check for duplicate values.
Let's count: SELECT COUNT everything AS row number FROM our data table.
So we have 32,941 total rows. Now let's check the unique row counts for ID and customer name.
SELECT COUNT DISTINCT ID and COUNT DISTINCT customer name.
Let's check it.
Okay, both of them have 32,941 rows, so we don't have any duplicate values here.
So our data is ready for the analysis.
Now, for our next analysis step: customers call [unclear] for various reasons, so let's count how many calls they made for each reason. We need to calculate the percentage too.
So SELECT reason, COUNT everything, and we round the percentage, so ROUND, COUNT everything divided by the total count, into 100, and we round to one digit.
Let's fix the parentheses.
AS percentage, FROM [unclear], and we will GROUP BY reason.
Okay, let's check it now.
So we got the highest percentage for the billing question reason, and it's 71.2.
Now step number nine: we find out which day has the most calls.
SELECT DAYNAME, the column name, AS day of call.
Now COUNT everything AS number of calls, FROM our table CC data, and we will GROUP BY 1, that means day name, which is day of call, and we ORDER BY the count, that means number of calls.
Let's check the output.
So we clearly see Friday has the most calls and Sunday has the least.
Step number 10: we will find the minimum, maximum and average call duration.
So SELECT MIN, the column name is CD in a minute, AS maximum, sorry, minimum duration.
MAX function for the column CD in a minute, AS maximum duration.
Now the average: we round the average value, so the ROUND function and then AVG for the column, and for the rounding value, one, AS average duration, FROM our data table.
Okay, check the result.
We got a minimum duration of 5 minutes, a maximum duration of 45, and an average duration of 25 minutes.
Now we are going to analyze the customer sentiments for the sentiment column.
SELECT sentiment and COUNT everything FROM our data table, GROUP BY sentiment.
Okay, let's check it.
Okay, we found the negative sentiments are high compared to the positive sentiments. The positive sentiments are very low.
Now step number 12, the last step: checking how many calls are within, below or above the service level for each call center.
So SELECT call center, then response time, COUNT everything AS counts, FROM our data table CC data.
Now we GROUP BY 1 and 2, that means call center and response time, and ORDER BY 1, 3, that means call center and counts.
Let's check the result.
We found the number of calls which are within, below or above the service level for each call center, like Baltimore, within service level [unclear].
That's it, we are done with our second data analysis project. The download link and the project details link will be in the description below.
Thanks for watching. See you in the next project.