What is Data Science?
Summary
TLDR: The video script delves into the realm of data science, highlighting its intersection with computer science, mathematics, and business expertise. It outlines the data science methods, ranging from descriptive to prescriptive analytics, each answering different business questions with varying complexity and value. The script also details the data science lifecycle, starting from business understanding to data mining, cleaning, exploration, and visualization. It discusses the roles of business analysts, data engineers, and data scientists, emphasizing the importance of collaboration among these roles to transform data into actionable business insights.
Takeaways
- 📊 Data science involves extracting knowledge and insights from noisy data and turning them into actionable steps for businesses.
- 🔄 Data science is at the intersection of computer science, mathematics, and business expertise, requiring collaboration across all three disciplines.
- 🔍 Descriptive analytics answers 'what happened,' diagnostic analytics answers 'why it happened,' predictive analytics answers 'what will happen,' and prescriptive analytics answers 'what should be done.'
- 🏢 The data science lifecycle begins with business understanding to ensure the right questions are being asked.
- 📥 Data mining is the process of gathering relevant data from various sources for analysis.
- 🧹 Data cleaning is essential to remove errors, duplicates, and missing values to prepare the data for analysis.
- 🔬 Data exploration helps analysts use various tools to answer questions, including advanced techniques like machine learning for prediction and recommendation.
- 📊 Visualization is critical in presenting insights from data analysis in a way that businesses can understand and act on.
- 🤝 Roles in the data science lifecycle include business analysts, data engineers, and data scientists, all of whom collaborate to cover different stages of the process.
- 💡 There's often overlap in roles, with business analysts, data engineers, and data scientists sharing tasks such as data exploration, machine learning, and visualization.
Q & A
What is the textbook definition of data science?
-Data science is the field of study that involves extracting knowledge and insights from noisy data, and then turning those insights into actions that a business or organization can take.
What are the three disciplines that intersect to form data science?
-Data science is the intersection of computer science, mathematics, and business expertise.
What is the first type of data science method mentioned in the script, and what does it involve?
-The first type of data science method mentioned is descriptive analytics, which is about understanding what is happening in the business and involves accurate data collection.
What is the difference between diagnostic and descriptive analytics?
-Diagnostic analytics focuses on why something happened, such as why sales went up or down, while descriptive analytics is about what is happening, like whether sales increased or decreased.
How does predictive analytics differ from descriptive and diagnostic analytics?
-Predictive analytics is about what is likely to happen next, using historical patterns to predict future outcomes, whereas descriptive analytics focuses on current happenings and diagnostic analytics on the root causes of past events.
What is prescriptive analytics and what kind of question does it answer?
-Prescriptive analytics is about recommending the best actions to achieve a particular outcome, such as what actions to take to improve sales by 10%.
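The four question types in the answers above can be sketched on a toy dataset. This is a minimal illustration, not the method used in the video: the sales figures, the candidate actions with their uplift and cost values, and all function names are hypothetical assumptions, and the predictive step swaps in a plain least-squares trend line for whatever model a real project would use.

```python
# Hypothetical quarterly sales by region (illustrative numbers, not from the video).
sales = {
    "north": [100.0, 110.0, 120.0, 130.0],
    "south": [80.0, 78.0, 75.0, 70.0],
}

# Hypothetical actions with ASSUMED uplift and cost; in practice these would
# come from experiments or a model, not from hand-written constants.
actions = [
    {"name": "discount", "uplift": 0.12, "cost": 5000},
    {"name": "ads", "uplift": 0.15, "cost": 8000},
    {"name": "newsletter", "uplift": 0.03, "cost": 500},
]


def totals(by_region):
    """Sum sales across regions for each quarter."""
    return [sum(quarter) for quarter in zip(*by_region.values())]


def descriptive(series):
    """What happened? Change in the latest quarter versus the previous one."""
    return series[-1] - series[-2]


def diagnostic(by_region):
    """Why did it happen? Drill down to the per-region change."""
    return {region: s[-1] - s[-2] for region, s in by_region.items()}


def predictive(series):
    """What is likely to happen next? Extrapolate a least-squares line one step."""
    n = len(series)
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series)) / sum(
        (x - mean_x) ** 2 for x in range(n)
    )
    return mean_y + slope * (n - mean_x)


def prescriptive(candidates, target_uplift=0.10):
    """What should be done? Cheapest action whose assumed uplift meets the target."""
    viable = [a for a in candidates if a["uplift"] >= target_uplift]
    return min(viable, key=lambda a: a["cost"])["name"] if viable else None


quarterly = totals(sales)
print("Change last quarter:", descriptive(quarterly))
print("Change by region:", diagnostic(sales))
print("Forecast next quarter:", predictive(quarterly))
print("Recommended action:", prescriptive(actions))
```

Each function answers a harder question than the one before it, and each needs more input: the prescriptive step only works because someone has already estimated what each action would achieve.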
What is the first step in the data science lifecycle?
-The first step in the data science lifecycle is business understanding, which is critical to ensure that the right questions are asked before proceeding with data science initiatives.
Why is collaboration across different roles in a data science project important?
-Collaboration is important because different roles such as business analysts, data engineers, and data scientists each contribute unique expertise and there is often overlap in their responsibilities, requiring them to work together effectively.
What role do data engineers play in the data science lifecycle?
-Data engineers help find, clean, and prepare data for analysis, playing a crucial role in the data mining and data cleaning stages of the data science lifecycle.
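As a rough sketch of the cleaning stage described here, the following drops rows with missing values and exact duplicate rows, the two issues the video names. The record layout (`order_id`, `region`, `amount`) and the `clean` function are hypothetical, and treating `None` or an empty string as "missing" is an assumption; real datasets often need other rules.

```python
# Hypothetical raw records; field names and values are illustrative only.
raw_rows = [
    {"order_id": 1, "region": "north", "amount": 120.0},
    {"order_id": 2, "region": "south", "amount": None},   # missing amount
    {"order_id": 1, "region": "north", "amount": 120.0},  # exact duplicate
    {"order_id": 3, "region": "", "amount": 75.0},        # missing region
    {"order_id": 4, "region": "south", "amount": 70.0},
]


def clean(rows):
    """Drop rows with missing values, then exact duplicates, keeping input order."""
    seen = set()
    cleaned = []
    for row in rows:
        # Assumption: None or an empty string marks a missing value.
        if any(value is None or value == "" for value in row.values()):
            continue
        # A hashable fingerprint of the row, independent of key order.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


clean_rows = clean(raw_rows)
print(f"Kept {len(clean_rows)} of {len(raw_rows)} rows")
```

Dropping incomplete rows is the simplest policy; whether to drop, fill, or flag missing values is a judgment call that depends on the business question.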
How does visualization fit into the data science process?
-Visualization is the step where insights and outcomes from the analysis are presented in a way that is understandable and useful for business decision-making.
What is the role of a business analyst in a data science project?
-A business analyst is involved in formulating questions, contributing domain expertise, and helping to visualize insights in a way that is useful for the business.
Outlines
📊 Introduction to Data Science and Its Disciplines
This paragraph introduces the concept of data science as a field that extracts knowledge and insights from noisy data to guide business actions. It emphasizes the intersection of computer science, mathematics, and business expertise as the core of data science. The paragraph also outlines different types of data science methods, including descriptive, diagnostic, predictive, and prescriptive analytics, each serving to answer questions of varying complexity and value. Descriptive analytics focuses on current business conditions, diagnostic seeks to understand why events occur, predictive anticipates future outcomes, and prescriptive recommends actions for desired outcomes. The data science lifecycle is introduced, starting with business understanding, followed by data mining, cleaning, and exploration.
🔍 Data Science Lifecycle and Roles
The second paragraph delves into the data science lifecycle, discussing the importance of using analytical tools to answer business questions and the progression from data exploration to advanced analytics using machine learning. It highlights the need for visualization to communicate insights effectively. The paragraph also outlines the roles within an organization that contribute to the data science process: business analysts who frame questions and visualize insights, data engineers who handle data procurement and cleaning, and data scientists who specialize in exploration and advanced techniques. The collaborative nature of these roles is emphasized, acknowledging the overlap and interdependence in their contributions to the data science lifecycle.
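The role overlap described above can be made concrete with a small lookup. The stage and role assignments below follow the transcript's description; the variable and function names are hypothetical, and real organizations will divide the work differently.

```python
# Lifecycle stages in the order given in the video.
STAGES = [
    "business understanding",
    "data mining",
    "data cleaning",
    "exploration",
    "visualization",
]

# Which role contributes to which stage, as described in the transcript.
ROLES = {
    "business analyst": ["business understanding", "visualization"],
    "data engineer": ["data mining", "data cleaning", "exploration"],
    "data scientist": ["exploration", "visualization"],
}


def roles_by_stage(stages, roles):
    """Invert the role -> stages mapping into stage -> roles."""
    return {
        stage: [role for role, covered in roles.items() if stage in covered]
        for stage in stages
    }


coverage = roles_by_stage(STAGES, ROLES)
# Stages handled by more than one role are where collaboration is needed.
shared = {stage: who for stage, who in coverage.items() if len(who) > 1}
# Stages nobody covers would be a gap in the team.
uncovered = [stage for stage, who in coverage.items() if not who]

for stage in STAGES:
    print(f"{stage}: {', '.join(coverage[stage])}")
```

The shared stages (exploration and visualization in this mapping) are where handoffs happen, which is why the video stresses collaboration.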
Keywords
💡Data Science
💡Predictive Analytics
💡Machine Learning
💡Descriptive Analytics
💡Diagnostic Analytics
💡Prescriptive Analytics
💡Data Mining
💡Data Cleaning
💡Exploration
💡Visualization
💡Business Understanding
Highlights
Data science is defined as the field of study that extracts knowledge and insights from noisy data to inform business actions.
Data science is an intersection of computer science, mathematics, and business expertise.
Data science initiatives require collaboration across computer science, mathematics, and business disciplines.
Descriptive analytics focuses on understanding what is happening in the business through accurate data collection.
Diagnostic analytics investigates the root cause of events, such as why sales went up or down.
Predictive analytics uses historical data patterns to forecast future outcomes, like next quarter's sales performance.
Prescriptive analytics recommends actions to achieve specific outcomes, such as increasing sales by a certain percentage.
The data science lifecycle begins with business understanding to ensure the right questions are asked.
Data mining is the process of procuring the necessary data for analysis.
Data cleaning involves preparing and cleaning data to remove issues like missing values or duplicates.
Data exploration uses analytical tools to answer business questions and may involve advanced techniques like machine learning.
Visualization is crucial for translating insights and analysis outcomes into understandable formats for business use.
Business analysts play a role in formulating questions, understanding the business, and visualizing insights.
Data engineers assist in finding, cleaning, and exploring data as part of the data science process.
Data scientists specialize in advanced exploration and machine learning techniques, contributing to the data science lifecycle.
Collaboration between business analysts, data engineers, and data scientists is essential for a successful data science initiative.
The data science lifecycle transforms noisy data into actionable knowledge and insights for business decisions.
Transcripts
Let's talk about data science and some of the other related terms you may have heard,
such as predictive analytics, machine learning, advanced analytics and others.
So let's start with the textbook definition of data science.
So data science is the field of study that involves extracting knowledge and insights from
noisy data, and then turning those insights into actions that a business or organization
can take. Okay. So let's dig into it a little bit more and discuss what are the different
areas that are covered by data science. So really data science is the intersection
between three different disciplines. We start with computer science, but then we also cover
the area of mathematics, and then what I think is the most important: business
expertise. So the intersection of these three disciplines is data science, and true data
science initiatives involve collaboration across all these three different areas.
Okay. So now let's touch on the different types of data science that you can do.
Now, what we need to understand here is that we have different data science
methods for different questions that we might ask in an organization. And these questions can
vary by complexity and the value that we get out of them. So let's chart them here
by complexity and value. Okay. So the first one that we have here is descriptive analytics.
So this is really about what is happening in my business, right. And it involves having
accurate data collection to make sure that we know what's happening. So a good question
we could ask here is, well, did sales go up or down? The next level is diagnostic analytics,
and this is more about why did something happen? So why did sales go up or down?
And it involves drilling down to the root cause of our problem. Now, the next one that we have is
predictive analytics. So this is about what is likely to happen next. Right. So what will
our sales performance be next quarter? And it involves using historical patterns in our
data to predict outcomes in the future. And then finally we have prescriptive analytics.
So this is about what do I need to do next? What is the recommended best action for a particular
outcome? So a question we could ask here is what do I need to do to improve sales by 10%?
Right. Okay. So now we can talk about how data science is done and who actually does it. So let's
look at the data science lifecycle. And the first thing that we always must start with is business
understanding.
So this is really critical to make sure that we're asking the right question before we go down a
lengthy data science initiative. And this is where you can see that having the business expertise and
the domain expertise can be incredibly critical to make sure that we're asking the right questions.
Okay. So once we've defined that, we can move on to data mining.
So this is the process of actually going out into our data landscape and procuring the data
that we need for our analysis. So once we've done that, we can move on to data cleaning.
So the reality of the marketplace is that when we find data, it's probably not in the best format
that we need it in. And it probably has some issues with it. Right. It might have rows
that have missing values. It might have duplicates in it. So there is some preparation and cleaning
that we have to do before it's ready for our analysis. So once we've done that cleansing,
we can move on to exploration.
Okay. So this is the part of the process that allows us to use different analytical tools that
can start helping us answer some of the types of questions that I mentioned here earlier.
And if we actually want to get into some of these higher value questions like predictive and
prescriptive, then we must start using advanced analytical tools such as machine learning tools
that leverage massive amounts of computing power and massive amounts of high quality data
to make predictions and prescribe actions for the future. Now once we've done our exploration
and perhaps our advanced analytics, what do we do next? Well, we need to visualize
our insights and outcomes of our analysis. Okay. Now I want to quickly touch on who does what
in this life cycle. So in an organization, you may have roles like a business analyst,
you might have data engineers. And then you might have data scientists.
So business analysts are obviously involved in formulating the questions. They have the
domain expertise. They can help with the business understanding, but they're also involved with
visualizing our insights in a way that's useful for the business. Right. And then we have folks
like data engineering folks. So these are the people that can help us find the data,
clean the data, and then also help with some of the exploration. Next, we move on to our
data scientists. So these are the people that will really help us with the exploration. They'll help
us with the advanced machine learning techniques. And they'll also assist in the visualization.
So you can see there's some overlap between the roles. And that's why it's critical
to have collaboration across these roles. And what you also start seeing nowadays in the marketplace
is that sometimes business analysts have to do some machine learning. They have to help out
with exploration. Data scientists sometimes need to go and find the data on their own.
So there's a lot of overlap, and these different roles must collaborate with each other.
Okay. So I hope you can see now how the data science lifecycle can help us take noisy data,
turn it into knowledge and insights, and then turn it into meaningful action for our business.
Thank you. If you have questions, please drop us a line below.
And if you want to see more videos like this in the future, please like and subscribe.