Starting a Career in Data Science (10 Things I Wish I Knew…)
Summary
TLDR: This video script serves as a guide for aspiring data scientists, highlighting common pitfalls to avoid. It emphasizes that coding is merely a tool and not the core of data science, which is rooted in statistics and machine learning. The speaker advises against the misconception that becoming a data analyst is a prerequisite to becoming a data scientist. They also caution against jumping into roles without understanding various career paths in data science, the importance of having a structured learning plan, and the necessity of treating job hunting as a project. The script warns against expecting quick job placement and relying solely on tutorials, advocating for hands-on practice and a well-rounded skill set beyond just technical abilities.
Takeaways
- 🔧 Start with the fundamentals: Focus on learning statistics, machine learning, and math as the core knowledge for data science, rather than jumping straight into coding.
- 🛠️ Coding is a tool: Understand that coding languages like Python and SQL are tools to apply data science concepts, not the entirety of data science itself.
- 🚫 Avoid the misconception: You don't need to become a data analyst first to become a data scientist; they are distinct roles with different skill sets.
- 🔍 Explore various roles: Research different roles within the data science domain to understand which might align better with your interests and goals.
- 📈 Have a plan: Create a roadmap for your learning journey, working backward from your target role to identify the necessary skills and knowledge.
- 📚 Practice, practice, practice: Treat job hunting as a project, dedicating time to building a portfolio, practicing coding, and preparing for interviews.
- 🤖 Embrace generative AI: Leverage generative AI tools to enhance your learning and project development in data science.
- 📊 Beyond regression: Recognize that while regression and other statistical concepts are important, a successful data scientist also needs domain knowledge, business understanding, and communication skills.
- 🧮 Math is important, but context-dependent: The depth of math knowledge required varies depending on the type of data science work you'll be doing.
- 🪟 Avoid the tutorial trap: Engage in hands-on work and practical application to solidify your understanding and avoid merely watching tutorials without applying the knowledge.
Q & A
What is the first mistake to avoid when learning to become a data scientist according to the video?
- The first mistake to avoid is starting with coding. The video suggests that coding is a tool for applying data science, not data science itself. The core knowledge for a data scientist should be statistics and machine learning.
Why does the video discourage learning Python or SQL as the initial step in becoming a data scientist?
- The video discourages this because Python and SQL are tools for applying data science, not the core of data science. The core should be statistics and machine learning, and if one does not enjoy these fundamentals, there is no need to invest time in learning coding languages.
What is the common misconception about the career path to becoming a data scientist that the video addresses?
- The common misconception is that one must become a data analyst before becoming a data scientist. The video clarifies that this is not necessary and could lead to a waste of time that could be spent directly on learning data science concepts.
Why does the video suggest not jumping into a specific data science role without research?
- The video suggests not jumping into a specific role without research because there are many roles in the data science domain, and one should understand what each role entails to ensure they are pursuing the right career path that aligns with their interests.
What is the importance of having a plan when learning data science, as emphasized in the video?
- Having a plan is important because it helps to stay on track, allocate time effectively, and ensures that the learning process is goal-oriented. The video recommends working backward from the target role to understand the requirements and build a tailored learning roadmap.
How does the video suggest using generative AI in the learning process of data science?
- The video suggests leveraging generative AI to teach new concepts or to help with coding tasks. It encourages incorporating generative AI into the learning curriculum to take advantage of this transformative technology.
What is the 'tutorial trap' mentioned in the video, and how can one avoid it?
- The 'tutorial trap' refers to the misconception that one has mastered a skill just by watching a tutorial. To avoid it, the video advises doing hands-on work, practicing, and not moving on until the concepts are thoroughly understood and can be applied independently.
How does the video differentiate the importance of math for different types of data scientists?
- The video states that math is crucial for those developing custom machine learning models. For those using pre-built models, math still matters but is less critical, which suggests a spectrum of math knowledge requirements.
What is the video's stance on the necessity of domain knowledge and communication skills in data science?
- The video emphasizes that beyond technical skills like regression, domain knowledge, business understanding, product management, and communication are essential for a successful data science career. These skills help in applying data science concepts in real-world scenarios and communicating findings effectively.
Why does the video compare job hunting to a project, and what does it suggest for preparation?
- The video compares job hunting to a project to highlight the need for dedicated time and effort. It suggests treating the job search as a project by building a portfolio, preparing for interviews with practice, and understanding the requirements of the target role.
What resources does the video recommend for learning Python for data analysis?
- The video recommends an intro to Python ebook created by HubSpot, which covers essential libraries like pandas, NumPy, and Matplotlib for data analysis with Python. It also provides coding snippets for beginners.
Outlines
🚀 Starting Your Data Science Journey
The speaker emphasizes that coding, while important, should not be the starting point for aspiring data scientists; the foundation should be the core knowledge of statistics and machine learning. The speaker also rejects the common misconception that one must become a data analyst first, noting that data science requires more than analytics skills, in particular a deeper understanding of statistics and machine learning. Finally, the speaker warns against jumping into a specific role without first understanding the broader data science domain and the other roles available, such as machine learning engineer or AI product manager.
📈 Avoiding the Tutorial Trap and Job Market Realities
The speaker discusses the importance of having a structured plan when learning data science, suggesting that learners work backward from the desired role to identify the necessary skills, then create a roadmap and stay on track with it. The speaker also cautions against the tutorial trap, meaning over-reliance on tutorials without hands-on practice. Finally, they touch on the realities of the job market, suggesting that job hunting should be treated as a project, with a focus on building a portfolio and preparing for interviews that cover coding, behavioral questions, and statistics and machine learning fundamentals.
🧠 Balancing Math Skills and Practical Application
In the final section, the speaker explains that the importance of math in data science depends on the specific role one aims to fill. For those developing custom machine learning models, a deep understanding of math is crucial; for those using pre-built models, math still matters but is less critical. The speaker also stresses the importance of domain knowledge, business understanding, and communication skills, arguing that these are essential for a successful data science career. They conclude by encouraging learners to avoid the tutorial trap and to engage in hands-on work to solidify their learning.
Keywords
💡Data Science
💡Coding
💡Statistics
💡Machine Learning
💡Data Analyst
💡Career Path
💡Generative AI
💡Regression
💡Tutorial Trap
💡Mathematics
💡Hands-On Work
Highlights
Focusing on avoiding common mistakes while learning to become a data scientist.
Disagreeing with the advice to start learning to code as the first step in data science.
Emphasizing that coding is a tool, not the core of data science, which is statistics and machine learning.
Suggesting to start with fundamentals in statistics, machine learning, and math.
Myth busting: You don't need to become a data analyst before becoming a data scientist.
Warning against spending too much time on data analytics skills at the expense of statistics and machine learning.
Personal advice from the speaker's experience about not jumping into a data science role without exploring other roles.
Importance of understanding different roles in the data science domain before deciding on a career path.
Advice on creating a plan and roadmap for learning data science.
Recommendation to work backward from the target role to understand the required skills.
Highlighting the importance of treating job search as a project and not just learning to code.
Stressing the need for practice, especially for coding interviews in data science.
Advocating the use of generative AI as a tool to aid in learning and project development.
Cautioning against stopping at technical fundamentals like regression, since there is a lot more to data science than that.
Stressing the importance of domain knowledge, business understanding, and communication skills for data scientists.
Discussing the varying importance of math depending on the type of data scientist one aims to be.
Warning against the tutorial trap and the importance of hands-on practice.
Transcripts
If you got to this video, that means you have probably watched a lot of videos about the data
science roadmap and how you can become a data scientist. This video will focus on mistakes
that you should avoid while learning to become a data scientist. These learnings
are based on my experience in the industry over the last 10 years working in the data science
domain, and especially becoming a data scientist in a non-traditional way.
This is the first thing that you should avoid in the journey of becoming a data scientist. There
will be many people who will tell you to start with coding, whether that is Python or something
else. I completely disagree with that advice. Coding is a tool to apply data science; it is not
data science in itself. I've seen many people make this mistake where they will jump into coding,
start learning Python, start learning SQL, and they think this is the right way to approach it.
Okay, maybe it is the right way to approach it if you are very new to coding and you have no
background and you just want to see if coding is something that you even enjoy, because it's such
a big part of data science. But what data science actually is, is statistics and machine learning.
That's the core knowledge that a data scientist needs to know, and coding is the tool a data
scientist uses to apply statistics and machine learning. So I'm going to say this again: coding is
a tool to apply statistics and machine learning in data science; it is not data science in itself.
So what I would like you to do is start learning the fundamentals in statistics, machine learning,
and math, and see if this is something that you enjoy doing, because if you start doing that and
you realize that you don't enjoy it, then I would want you to stop learning. Coding languages such
as Python, SQL, and R are very powerful languages that you can learn as a data scientist, but these are just tools to
apply data science.
Second, and this is one of my biggest pet peeves, and I'm going to say it: a lot
of people believe that they need to become a data analyst before becoming a data scientist. This is
not true. You do not need to become a data analyst before becoming a data scientist. Okay, I'm
going to say this again: you don't need to become a data analyst before becoming a data scientist.
I think this is such a big misunderstanding that a lot of people have. They will jump right
into data analytics; they will learn all the skills that you need to have as a data analyst, whether
that is data analysis, learning SQL, or building dashboards. Yes, data scientists and data analysts
have a lot of concepts in common, but data science is more than that. For example, as
a data analyst you're going to be spending a lot of time building dashboards; as a data scientist
you will likely not be using that skill set. Additionally, as a data scientist you require a lot
more statistics and machine learning knowledge. If you choose to become a data analyst, you're
going to miss out on a lot of time that you could have spent on learning concepts in statistics and
machine learning to solidify your knowledge. You could have used that time to build projects in
data science. You could have also used that time to get experience, whether that's internships,
personal projects, and whatnot. So if your end goal is to become a data scientist, please reconsider
your decision to become a data analyst first. Yes, there are many people who will transition from
data analyst to data scientist. Completely fine. And it's okay for people who don't know that
they want to become a data scientist but later realize they want to become a data scientist.
Totally fine. It's easier to make a transition from data analyst to data scientist; of course it
requires a lot of work. But if your end goal is to become a data scientist, then start with that
roadmap. One of my recent videos talks you through the entire data scientist roadmap, so definitely watch
that. I'm going to link it somewhere here.
Number three is the mistake that I personally made when
I started in the data science domain. I jumped right into the data science role. At that time I
did not know that there are so many other roles that existed. There is machine learning engineer,
AI engineer, data analyst, data engineer (although I did work as a data engineer for some time, but
that's a different story), and many other roles that exist in the data science domain, including a
recent one, AI product manager, which is a booming topic lately. Okay, so what I'm trying to say is:
don't be me, don't be Sundas. If you want to get into data science, don't just jump into one
career right away. Try to understand what different careers do. What does a data analyst do? What
does a machine learning engineer do? Maybe there is a role that you will learn about and realize,
"No, this actually sounds more interesting to me, and that's what I want to pursue." So do your
research before deciding, picking one, and moving forward with it, because it's going to
take a lot of work, and you want to be sure that this is exactly what you want to do before you put
in a lot of work.
The fourth mistake that I see a lot of people make when trying to learn data
science is not following a plan. This goes back to my first mistake, where you just jump right
into learning to code. Sure, that's fine, it's great, it's better than doing nothing, right? But if
you want to become a data scientist, just make sure that you are going forward with a plan. Create
a plan, create a roadmap. One of the things that I would highly recommend is for you to
work backward. This is something that I personally have done in my personal projects, in my
interview prep, in my job search, and whatnot. Figure out your target company and your target role,
what you want to work as. Then figure out what a data scientist at, let's say, Meta does. Then look
at the job description for somebody who works at Meta; look at the job description for an open
data scientist role at Meta. Understand what the requirements are. Then go to LinkedIn and
find somebody who is working as a data scientist at Meta or has worked as a data scientist at Meta.
Look at their projects, look at their education, try to understand what type of work they do and
what kind of background they have. This will give you a really good understanding and will help you
define the roadmap that you would need to follow in order to become a data scientist. So make sure
when you're starting to learn data science you go with a plan; that will help you make sure that
you stay on track and follow it. Put it on your calendar, whatever you need to do. Allocate whatever
time you need to allocate, over the weekend, after work, during the day, whatever you need to do.
Make sure you are creating a plan and following it.
On the topic of creating a plan, I found this intro to
Python ebook, which is basically a beginner's guide to learning Python for data analysis. This ebook
is created by HubSpot, who is also sponsoring this portion of the video. The ebook covers libraries
such as pandas, NumPy, and Matplotlib, which are some of the essential libraries for analyzing data
with Python. It also walks you through basic ideas and gives you coding snippets so you can plug and
play. It is available to download free, and I'm linking it in the description below. Now let's talk about
the next thing that you should be avoiding while learning to become a data scientist.
The fifth mistake that you should avoid is expecting to land a job after learning these things.
Over the last few months and years, the job market has become, what is the right word, stressful.
The job market has become stressful, and it takes a lot more than just learning how to code to land
a job. I like to think of landing a job as a project in itself, just like learning these skills is
a project in itself. Let's say you're able to do regression and analysis using Python, but you go
into a coding interview, you're in a time-pressure setting, there is another person sitting in
front of you, and it's possible that you might forget. So in order to truly be successful in these
job interview scenarios, you need to practice, practice, practice. You need to practice as much as
you can. I like to break data science into three buckets. One is coding, which a lot of people spend
a lot of time on. It's one of the buckets; there are other elements. Second is behavioral questions.
And third, which in my opinion is very important, is statistics and machine learning fundamentals,
where you will get a lot of hypothetical scenario-based questions. So make sure you're treating job
search as a project and fully dedicating time to building your portfolio, building projects that
you can speak about in your interviews, and also going into the interview with a lot of prep.
The sixth mistake that you should avoid is my favorite: generative AI. Generative AI is one of the
most transformative technologies that we have seen over the last decade in tech, data science, and
beyond, so don't avoid it. While you're learning these skills, make sure that you're leveraging
generative AI to your advantage, whether that is having generative AI teach you new concepts or
working on a project and having ChatGPT write basic code for you. I have done a few videos on
ChatGPT and actually have a playlist on how to use generative AI, specifically ChatGPT, Bard,
and other tools, for coding and data analysis. You can watch it somewhere here. So take generative
AI seriously and make it part of your curriculum.
Then mistake number eight is moving beyond regression,
and this is going to counter the first thing I mentioned. Regression is one of the most important
concepts in data science. Often we spend a lot of time on these concepts, on
learning regression, on learning classification, and so on, but there's a lot more to data science
than that. In order to be a great data scientist, you need to have good domain knowledge, good
business understanding, good PMing understanding (PMing as in product management and the product
development life cycle), and good communication. So regression is important, learning these
fundamentals and concepts is important, but how do you put that into the real world, where you're
going to be working with a lot of different stakeholders in different domains? How do you
communicate your findings, your results, your insights? Or how do you even understand what the
business needs? Having a good understanding of all of these things around the data science umbrella
will help you be successful in your career. So when you're defining your curriculum, make
sure that you're incorporating these things into your learning plan.
Ninth: math is important, but not really. Okay, this might be controversial, but the reason I say
this is that math is definitely important, but that also depends on what kind of data
scientist you're going to be. If you're going to be a data scientist who is developing
custom machine learning models, math is your friend and you should know math in and out. But if
you are going to be a data scientist who is using machine learning models that are already built,
for example regression analysis or a classification model, and you're not building your own custom
machine learning models, math is still important but not as much. So understand where you fall on
the spectrum of how much math you need to know so you don't fall into that trap.
And finally, my last and my biggest tip is: do not fall into the tutorial
trap. When I say tutorials, I basically mean you're watching a YouTube video or you're watching a
video in a course. Yes, those are all important, and often when you watch it, for example,
maybe you're watching this video and you're like, "Yes, I get it, I get all 10 points." But
applying them in practice is way more difficult. Similarly, when you're watching these tutorials,
let's say you're watching a tutorial in Python, it looks very easy, it looks very straightforward.
You finish that tutorial and you're like, "Yes, I got it, I can do it." But when you actually start
applying this on a real project, on a real data set, on your computer, the chances are that you're
going to run into issues. This is called the tutorial trap, because you watched the video
and you moved on. Don't move on right away. Spend some time doing the hands-on work, understanding
the concept in more detail before you move on, because the chances are that when you do
it yourself you're going to run into issues. When you run into issues, you learn more. Running into
issues is actually not bad. So that's why you need to do more hands-on work, so it sticks with you,
the knowledge sticks with you, and you're not becoming a target of the tutorial trap.
Okay, so these were the 10 things that I wanted to mention. Is there anything else that you would
like to share? Let me know in the comments, and if you liked this video, maybe I'll see you in the
next one. Have a great day. Bye.