Starting a Career in Data Science (10 Things I Wish I Knew…)

Sundas Khalid
23 Feb 2024 (10:41)

Summary

TLDR: This video script serves as a guide for aspiring data scientists, highlighting common pitfalls to avoid. It emphasizes that coding is merely a tool and not the core of data science, which is rooted in statistics and machine learning. The speaker advises against the misconception that becoming a data analyst is a prerequisite to becoming a data scientist. They also caution against jumping into a role without first understanding the various career paths in data science, and they stress the importance of a structured learning plan and of treating job hunting as a project. The script warns against expecting quick job placement and relying solely on tutorials, advocating for hands-on practice and a well-rounded skill set beyond just technical abilities.

Takeaways

  • 🔧 Start with the fundamentals: Focus on learning statistics, machine learning, and math as the core knowledge for data science, rather than jumping straight into coding.
  • 🛠️ Coding is a tool: Understand that coding languages like Python and SQL are tools to apply data science concepts, not the entirety of data science itself.
  • 🚫 Avoid the misconception: You don't need to become a data analyst first to become a data scientist; they are distinct roles with different skill sets.
  • 🔍 Explore various roles: Research different roles within the data science domain to understand which might align better with your interests and goals.
  • 📈 Have a plan: Create a roadmap for your learning journey, working backward from your target role to identify the necessary skills and knowledge.
  • 📚 Practice, practice, practice: Treat job hunting as a project, dedicating time to building a portfolio, practicing coding, and preparing for interviews.
  • 🤖 Embrace generative AI: Leverage generative AI tools to enhance your learning and project development in data science.
  • 📊 Beyond regression: Recognize that while regression and other statistical concepts are important, a successful data scientist also needs domain knowledge, business understanding, and communication skills.
  • 🧮 Math is important, but context-dependent: The depth of math knowledge required varies depending on the type of data science work you'll be doing.
  • 🪟 Avoid the tutorial trap: Engage in hands-on work and practical application to solidify your understanding and avoid merely watching tutorials without applying the knowledge.

Q & A

  • What is the first mistake to avoid when learning to become a data scientist according to the video?

    -The first mistake to avoid is starting with coding. The video suggests that coding is a tool for applying data science, not data science itself. The core knowledge for a data scientist should be statistics and machine learning.

  • Why does the video discourage learning Python or SQL as the initial step in becoming a data scientist?

    -The video discourages this because Python and SQL are tools for applying data science, not the core of data science. The core should be statistics and machine learning, and if one does not enjoy these fundamentals, there is no need to invest time in learning coding languages.

  • What is the common misconception about the career path to becoming a data scientist that the video addresses?

    -The common misconception is that one must become a data analyst before becoming a data scientist. The video clarifies that this is not necessary and could lead to a waste of time that could be spent directly on learning data science concepts.

  • Why does the video suggest not jumping into a specific data science role without research?

    -The video suggests not jumping into a specific role without research because there are many roles in the data science domain, and one should understand what each role entails to ensure they are pursuing the right career path that aligns with their interests.

  • What is the importance of having a plan when learning data science, as emphasized in the video?

    -Having a plan is important because it helps to stay on track, allocate time effectively, and ensures that the learning process is goal-oriented. The video recommends working backward from the target role to understand the requirements and build a tailored learning roadmap.

  • How does the video suggest using generative AI in the learning process of data science?

    -The video suggests leveraging generative AI to teach new concepts or to help with coding tasks. It encourages incorporating generative AI into the learning curriculum to take advantage of this transformative technology.

  • What is the 'tutorial trap' mentioned in the video, and how can one avoid it?

    -The 'tutorial trap' refers to the misconception that one has mastered a skill just by watching a tutorial. To avoid it, the video advises doing hands-on work, practicing, and not moving on until the concepts are thoroughly understood and can be applied independently.

  • How does the video differentiate the importance of math for different types of data scientists?

    -The video differentiates by stating that if one is developing custom machine learning models, math is crucial. However, for those using pre-built models, the importance of math is still significant but not as critical, suggesting a spectrum of math knowledge requirements.

  • What is the video's stance on the necessity of domain knowledge and communication skills in data science?

    -The video emphasizes that beyond technical skills like regression, domain knowledge, business understanding, product management, and communication are essential for a successful data science career. These skills help in applying data science concepts in real-world scenarios and communicating findings effectively.

  • Why does the video compare job hunting to a project, and what does it suggest for preparation?

    -The video compares job hunting to a project to highlight the need for dedicated time and effort. It suggests treating the job search as a project by building a portfolio, preparing for interviews with practice, and understanding the requirements of the target role.

  • What resources does the video recommend for learning Python for data analysis?

    -The video recommends an intro to Python ebook created by HubSpot, which covers essential libraries like pandas, numpy, and matplotlib for data analysis with Python. It also provides coding snippets for beginners.

Outlines

00:00

🚀 Starting Your Data Science Journey

The speaker emphasizes that coding, while important, should not be the starting point for aspiring data scientists. Instead, the core knowledge of statistics and machine learning should be the foundation. The speaker advises against the common misconception that one must become a data analyst first, pointing out that data science requires more than analytics skills, in particular a deeper understanding of statistics and machine learning. The speaker also warns against jumping into a specific role without understanding the broader data science domain and the various roles available, such as machine learning engineer or AI product manager.

05:03

📈 Avoiding the Tutorial Trap and Job Market Realities

The speaker discusses the importance of having a structured plan when learning data science and suggests working backward from the desired role to identify the necessary skills. They recommend creating a roadmap and staying on track with it. Additionally, the speaker addresses the tutorial trap, cautioning against over-reliance on tutorials without hands-on practice. They also touch on the realities of the job market, suggesting that job hunting should be treated as a project, with a focus on building a portfolio and preparing for interviews, which cover coding, behavioral questions, and statistics and machine learning fundamentals.

10:04

🧠 Balancing Math Skills and Practical Application

In the final segment, the speaker discusses the importance of math in data science, suggesting that its relevance depends on the specific role one aims to fill. For those developing custom machine learning models, a deep understanding of math is crucial, but for those using pre-built models, it is still important though less critical. The speaker also stresses the importance of domain knowledge, business understanding, and communication skills, arguing that these are essential for a successful data science career. They conclude by encouraging learners to avoid the tutorial trap and to engage in hands-on work to solidify their learning.

Keywords

💡Data Science

Data Science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data. In the video, the speaker emphasizes that data science is not just about coding but involves a deeper understanding of statistics and machine learning. The video's theme revolves around avoiding common misconceptions about entering the field of data science, highlighting the importance of focusing on these core areas rather than just learning to code.

💡Coding

Coding, specifically mentioned in relation to languages like Python and SQL, is presented as a tool applied in data science rather than the core of the field itself. The speaker advises against starting with coding, suggesting that it is a means to apply statistical and machine learning concepts rather than the foundational knowledge required for data science. This keyword is central to the video's message about the misconception of coding being synonymous with data science.

💡Statistics

Statistics is highlighted as a fundamental aspect of data science, essential for understanding patterns and making inferences from data. The video suggests that aspiring data scientists should focus on learning statistics before moving on to coding, as it forms the basis for many data science methodologies. This keyword is integral to the video's advocacy for a strong statistical foundation in data science education.
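As a small illustration of what "making inferences from data" can look like in practice, the sketch below estimates a mean and an approximate 95% confidence interval using only Python's standard library. It is not from the video: the function name, the sample values, and the use of a normal approximation are assumptions made for illustration (for a sample this small, a t-distribution would usually be the better choice).

```python
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Return an approximate confidence interval for the mean of `sample`.

    Assumption: uses a normal approximation, which is a simplification
    for small samples.
    """
    if len(sample) < 2:
        raise ValueError("need at least two observations")
    sample_mean = mean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    standard_error = stdev(sample) / len(sample) ** 0.5
    # Two-sided critical value, roughly 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return sample_mean - z * standard_error, sample_mean + z * standard_error


# Hypothetical toy measurements, made up for this example.
observations = [10.0, 12.0, 11.0, 13.0, 14.0]
low, high = mean_confidence_interval(observations)
print(f"mean = {mean(observations):.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

The point of the sketch is the statistical idea (a point estimate plus a statement of uncertainty); the code is only the tool used to apply it, which matches the video's framing.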

💡Machine Learning

Machine Learning is mentioned as a core component of data science, involving the development of algorithms that enable computers to learn and make predictions or decisions without being explicitly programmed. The video underscores the importance of machine learning knowledge for data scientists, suggesting that it is as crucial as statistics in the field.

💡Data Analyst

The role of a Data Analyst is discussed in contrast to that of a Data Scientist. The speaker clarifies that becoming a data analyst is not a prerequisite for becoming a data scientist, despite being a common misconception. This keyword is used to illustrate the different career paths within the data field and to advise viewers on the direct approach to a data science career.

💡Career Path

Career Path in the context of the video refers to the various roles and trajectories one can take within the data science domain. The speaker advises viewers to research different roles such as machine learning engineer, AI engineer, and data engineer before committing to one, to ensure it aligns with their interests and goals. This keyword is central to the video's guidance on career planning in data science.

💡Generative AI

Generative AI is discussed as a transformative technology that can be leveraged in the learning process of data science. The speaker encourages viewers to incorporate generative AI into their learning journey, suggesting it can aid in teaching new concepts and assisting with coding tasks. This keyword is used to highlight the evolving landscape of technology in the field of data science.

💡Regression

Regression is mentioned as a fundamental concept in data science, often used for predictive modeling. The video emphasizes the importance of understanding regression but also cautions against focusing solely on it, advocating for a broader understanding of data science that includes domain knowledge and communication skills. This keyword is used to balance the focus on core technical skills with the need for a holistic approach to data science.
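To make the idea of regression as predictive modeling concrete, here is a minimal sketch of simple linear regression fitted by ordinary least squares in plain Python. It is an illustration only, not something shown in the video: the function names and the toy data (hours of practice versus quiz score) are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x-y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predict y for a new x using the fitted line."""
    return slope * x + intercept


# Hypothetical toy data, made up for this example.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]

slope, intercept = fit_line(hours, scores)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"predicted score after 6 hours: {predict(slope, intercept, 6.0):.2f}")
```

In practice a library such as scikit-learn or statsmodels would typically handle the fitting; writing it out by hand once is one way to follow the video's advice to understand the concept rather than only the tool.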

💡Tutorial Trap

Tutorial Trap refers to the pitfall of relying too heavily on tutorials without applying the learned concepts in practical, hands-on work. The speaker warns against this trap, suggesting that true learning comes from practical application and problem-solving. This keyword is used to emphasize the importance of active learning and practical experience in mastering data science skills.

💡Mathematics

Mathematics is discussed in relation to its importance in data science, particularly for those involved in developing custom machine learning models. The speaker suggests that while math is crucial, its level of importance may vary depending on the specific role of the data scientist. This keyword is used to address the varying requirements of mathematical knowledge across different data science roles.

💡Hands-On Work

Hands-On Work is emphasized as a critical component of learning data science, where theoretical knowledge is applied to real-world problems. The video suggests that hands-on work is essential for solidifying understanding and avoiding the tutorial trap. This keyword is used to stress the practical aspect of learning data science, complementing theoretical studies.

Highlights

Focusing on avoiding common mistakes while learning to become a data scientist.

Disagreeing with the advice to start learning to code as the first step in data science.

Emphasizing that coding is a tool, not the core of data science, which is statistics and machine learning.

Suggesting to start with fundamentals in statistics, machine learning, and math.

Myth busting: You don't need to become a data analyst before becoming a data scientist.

Warning against spending too much time on data analytics skills at the expense of statistics and machine learning.

Personal advice from the speaker's experience about not jumping into a data science role without exploring other roles.

Importance of understanding different roles in the data science domain before deciding on a career path.

Advice on creating a plan and roadmap for learning data science.

Recommendation to work backward from the target role to understand the required skills.

Highlighting the importance of treating job search as a project and not just learning to code.

Stressing the need for practice, especially for coding interviews in data science.

Advocating the use of generative AI as a tool to aid in learning and project development.

Cautioning against stopping at fundamental concepts like regression, since data science requires more than these techniques alone.

Stressing the importance of domain knowledge, business understanding, and communication skills for data scientists.

Discussing the varying importance of math depending on the type of data scientist one aims to be.

Warning against the tutorial trap and the importance of hands-on practice.

Transcripts

play00:03

if you got to this video that means you have  probably watched a lot of videos about data  

play00:08

science road map and how you can become a data  scientist and this video will focus on mistakes  

play00:13

that you should be avoiding while learning  to become a data scientist these learnings  

play00:18

are based on my experience in the industry over  the last 10 years working in the data science  

play00:22

domain and especially becoming a data scientist  in a non-traditional way this is the first thing  

play00:28

that you should avoid in the Journey of becoming  data scientist there will be many people who will  

play00:33

tell you to start with coding whether that is  python or something else I completely disagree  

play00:39

with that advice coding is a tool to apply data  science it is not data science in itself and I've  

play00:46

seen many people make this mistake where they  will jump into coding start learning python  

play00:50

start learning SQL and they think like this is  the right way to approach it okay maybe it is  

play00:54

the right way to approach it if you are very  new to coding and you have no background and  

play00:58

you just want to see like if coding is something  that you even enjoy because it's such a big part  

play01:02

of data science but what data science actually  is is statistics and machine learning that's the  

play01:07

core knowledge that a data scientist needs to know and coding is the tool with which a data scientist

play01:13

will apply statistics and machine learning so I'm  going to say this again coding is a tool to apply  

play01:19

statistics and machine learning in data science  it is not data science in itself so what I would  

play01:24

like you to do is start learning the fundamentals  in statistics and machine learning and math and  

play01:29

see if this is something that you enjoy doing  because if you start doing that and you realize  

play01:32

that you don't enjoy it then I would want you to  stop learning coding languages such as python SQL  

play01:38

R are very powerful languages that you can learn  as a data scientist but these are just tools to  

play01:43

apply data science second and this is one of my  biggest pet peeve and I'm going to say it a lot  

play01:49

of people believe that they need to become a data  analyst before becoming a data scientist this is  

play01:55

not true you do not need to become a data analyst before becoming a data scientist okay I'm

play02:02

going to say this again you don't need to become  a data analyst before becoming a data scientist  

play02:06

and I think this is such a big misunderstanding  that a lot of people have they will jump right  

play02:10

into data analytics they will learn all the skills  that you need to have as a data analytics whether  

play02:15

that is like data analysis learning SQL building  dashboards yes data scientists and data analysts  

play02:20

have a lot of Concepts that they have in common  but data science is more than that for example as  

play02:25

a data analyst you're going to be spending a lot  of time building dashboards as a data scientist  

play02:29

you will likely not be using that skill set additionally as a data scientist you require a lot

play02:34

more statistics and machine learning knowledge  if you choose to become a data analyst you're  

play02:37

going to miss out on a lot of time that you could  have spent on learning Concepts in statistics and  

play02:42

machine learning to solidify your knowledge you could have used that time to build projects in

play02:45

data science you could have also used that time  to get experiences whether that's internships  

play02:50

personal projects and whatnot so if your end goal  is to become a data scientist please reconsider  

play02:55

your decision to become a data analyst first yes  there are many people who will transition from  

play02:59

data analyst to data scientist completely fine and it's okay for people who don't know that

play03:04

they want to become a data scientist but later  they realize they want to become a data scientist  

play03:07

totally fine it's easier to make a transition  from data analyst to data scientist of course it  

play03:11

requires a lot of work but if your end goal is to  become a data scientist then start with that road  

play03:16

map one of my recent videos talks you through the  entire data scientist road map so definitely watch  

play03:21

that I'm going to link it somewhere here number  three is the mistake that I personally made when  

play03:25

I started in the data science domain I jumped  right into the data science role at that time I  

play03:31

did not know that there are so many other roles  that existed there is machine learning engineer  

play03:36

AI engineer data analyst data engineer although  I did work as a data engineer for some time but  

play03:40

that's a different story many other roles that  exist in the data science domain including a  

play03:45

recent one AI product manager which is booming  topic lately okay so what I'm trying to say is  

play03:49

that don't be me don't be sundas if you want to  get into data science don't just jump into one  

play03:55

career right away try to understand what different  career do what does a data analyst do what does a  

play04:00

machine learning engineer do maybe there is a role  that you will learn about and realize like no this  

play04:04

sounds actually more interesting to me and that's  what I want to pursue so like do your research  

play04:08

before deciding and jumping into one picking one  and moving forward with it because it's going to  

play04:13

take a lot of work and you want to be sure that  this is exactly what you want to do before you put  

play04:17

in a lot of work the fourth mistake that I see  a lot of people make when trying to learn data  

play04:21

science is not following a plan and this goes  to my first mistake where you just jump right  

play04:26

into learning to code sure that's fine it's great  it's better than doing nothing right but if you  

play04:31

want to become a data scientist just make sure  that you are going forward with a plan create a  

play04:35

plan create a road map and one of the things  that I would highly recommend is for you to  

play04:39

work backward this is something that I personally  have done in my personal projects in my interview  

play04:45

preps and in my job search and whatnot figure  out your target company your target role what  

play04:49

you want to work as then figure out what does a  data scientist let's say at meta does then look  

play04:54

at the job description for somebody who works at  meta then look at the job description for open  

play04:58

role at meta for data scientist role understand  what the requirements are then go to LinkedIn and  

play05:03

find somebody who is working as a data scientist  at meta or have worked as a data scientist at meta  

play05:07

look at their projects look at their education  try to understand what type of work they do what  

play05:11

kind of background they have this will give you a  really good understanding and will help you define  

play05:15

your road map that you would need to follow in  order to become a data scientist so make sure  

play05:20

when you're starting to learn data science you  go with a plan that will help you make sure that  

play05:24

you stay on track and you follow it put it on your  calendar whatever you need to do allocate whatever  

play05:28

time you need to allocate over the weekend after  work during day whatever you need to do like make  

play05:33

sure you are creating a plan and following it on  the topic of creating a plan I found this intro to  

play05:38

python ebook which is basically a beginner guide  to learning python for data analysis this ebook  

play05:44

is created by HubSpot who is also sponsoring this  portion of the video the ebook covers libraries  

play05:49

such as pandas, NumPy, and matplotlib which are some of the essential libraries for analyzing data with

play05:55

python it also walks you through basic ideas and  gives you coding Snippets so you can plug and play  

play06:00

it is available to download free and I'm linking  it in the description below now let's talk about  

play06:04

the next thing that you should be avoiding while  learning to become a data scientist the fifth  

play06:08

mistake that you should avoid is expecting to  land a job after learning these things over the  

play06:13

last few months and years job market has become  what is the right word stressful job market has  

play06:21

become stressful and it takes a lot more than  just learning how to code to land a job I like  

play06:27

to think of Landing a job as a project in itself  just like learning these skills is a project in  

play06:32

itself let's say you're able to do regression and  Analysis using python but you go into a coding  

play06:36

interview you're in a time pressure setting there  are another person sitting in front of you and  

play06:40

it's possible that you might forget so in order  to truly be successful in these job interview  

play06:45

scenarios you need to practice practice practice  you need to practice it as much as you can and  

play06:50

I like to break data science into three buckets  one is coding which a lot of people spend a lot  

play06:55

of time on it's one of the buckets there's other  elements second is behavioral questions and third  

play06:59

which in my opinion is very important is statistics and machine learning fundamentals where

play07:03

you will get a lot of hypothetical scenario based  questions so make sure you're treating job search  

play07:08

as a project and fully dedicating time building  your portfolio building your projects that you  

play07:13

can speak about in your interviews and also going  into the interview with a lot of prep the sixth  

play07:18

mistake that you should avoid is my favorite  generative AI generative AI is one of the most  

play07:23

transformative technologies that we have seen over the last decade in tech, data science and

play07:28

Beyond so don't avoid it while you're learning  these skills make sure that you're leveraging  

play07:33

generative AI to your advantage whether that is having generative AI teach you new Concepts or

play07:39

working on a project and having ChatGPT write basic code for you I have done a few videos on ChatGPT

play07:44

and actually have a playlist on how to use generative AI specifically ChatGPT, Bard

play07:49

and other tools for coding and data analysis you can watch it somewhere here so take generative AI

play07:54

seriously and make it part of your curriculum then  number eight mistakes is moving beyond regression  

play07:58

and this is going to counter the first thing I mentioned regression is one of the most important

play08:03

Concepts in data science it is often that  we spend a lot of time on these Concepts on  

play08:07

learning regression on learning classification and  so on but there's a lot more to data science than  

play08:11

that in order to be a great data scientist you  need to have good domain knowledge good business  

play08:16

understanding good PMing understanding, PMing as in like product management and product development

play08:21

life cycle and good communication so regression  is important learning these fundamentals and  

play08:26

concepts are important but how do you put that  into the real world where you're going to be  

play08:30

working with a lot of different stakeholders  in different domains how do you communicate  

play08:34

your findings your results your insights or  how do you even understand what the business  

play08:38

needs so having a good understanding of all of  these things around the data science umbrella  

play08:43

will help you be successful in your career so when you're defining your curriculum make

play08:48

sure that you're incorporating these things into  your learning plan ninth math is important but not  

play08:54

really okay this might be controversial but the  reason I say this math is definitely important  

play08:58

but that also depends on what kind of data scientist you're going to be if you're going to

play09:02

be a data scientist who is going to be developing  custom machine learning models math is your friend  

play09:07

and you should be knowing math in and out but if  you are going to be a data scientist who is using  

play09:11

machine learning models that are already built  for example regression analysis classification  

play09:16

model and you're not building your own custom  machine learning models math is important still  

play09:21

but not as much so understand where you fall in  the spectrum of how much math you need to know so  

play09:27

you don't fall into that trap and finally my last  and my biggest tip is do not fall into tutorial  

play09:33

trap when I say tutorials I basically mean you're  watching a YouTube video or you watching a video  

play09:37

on the course yes those are all important and  it is often that when you watch it for example  

play09:42

maybe you're watching this video and you're like  yes I get it I get all 10 points but in order to  

play09:46

apply them in practice it's way more difficult  similarly when you're watching these tutorials  

play09:51

let's say you're watching a tutorial in Python it  looks very easy it looks very straightforward you  

play09:56

finished that tutorial and you were like yes I  got it I can do it but when you actually start  

play09:59

applying these on a real project on real data  set on your computer the chances are that you're  

play10:04

going to run into issues so this is called  tutorial trap because you watched the video  

play10:09

and you moved on don't move on right away spend  some time doing the Hands-On work understanding  

play10:14

the concept in more detail before you move  on because the chances are that when you do  

play10:19

it yourself you're going to run into issues when  you run into issues you learn more running into  

play10:23

issues is actually not bad so that's why you need  to do more Hands-On work so it sticks with you the  

play10:27

knowledge sticks with you and you're not becoming  Target of tutorial trap okay so these were the 10  

play10:32

things that I wanted to mention is there anything  else that you would like to share let me know in  

play10:36

comments and if you like this video maybe I'll  see you in the next one have a great day bye

Related Tags
Data Science, Career Advice, Machine Learning, Statistics, Coding Skills, Job Market, Generative AI, Data Analysis, Educational Tips, Industry Insights