How I'd Learn Data Science In 2024 (If I Could Restart) - The Ultimate Roadmap

Data Nash
4 Jan 2024 · 26:42

Summary

TLDR: This video script serves as an ultimate roadmap for aspiring data scientists, detailing a comprehensive learning path for 2024. It emphasizes the importance of understanding data science basics, acquiring key skills in Python and mathematics, and applying them through project-based learning. The guide also advises on job applications, further specialization, and staying updated with industry trends. It encourages building a strong foundation, exploring various data science domains, and networking within the community to excel in the field.

Takeaways

  • 🧭 Start with self-assessment: Understand what data science is, the skills required, and if it aligns with your interests before diving in.
  • 💡 Research and observe: Spend time researching data science, looking at its applications in various industries, and understanding a typical data scientist's workday.
  • 🛠️ Programming first: Prioritize learning programming over math, starting with Python due to its prevalence in job postings and community support.
  • 🐍 Choose Python: Focus on Python, which by the speaker's count is requested explicitly in about 60% of job postings (with a further 30% accepting either R or Python), giving a wider range of opportunities than R.
  • 💻 IDE selection: Use a popular and user-friendly IDE like Visual Studio Code (VS Code) for writing Python code.
  • 📚 Learn Python basics: Grasp fundamental Python concepts such as data types, variables, lists, dictionaries, and pandas data frames.
  • 📊 Visualization tools: Learn to use visualization libraries like Matplotlib, Seaborn, or Plotly to create charts and graphs for data representation.
  • Project-based learning: Apply newly learned skills through projects to solidify understanding and make learning more practical and memorable.
  • Specialize wisely: After mastering the basics, consider specializing in areas like machine learning, NLP, or computer vision based on job market demand and personal interest.
  • 🔗 Networking and community: Engage with data science communities and networks, both online and in-person, for support, collaboration, and knowledge sharing.

Q & A

  • What is the main goal of the video?

    -The main goal of the video is to provide a comprehensive blueprint for becoming a data scientist in 2024, including the most effective learning path, projects to work on, resources for learning, and ways to stand out among other data scientists.

  • Why does the video emphasize starting with programming rather than math?

    -The video emphasizes starting with programming because it forms the foundation for implementing mathematical concepts in data science. Even if one is not great at math, they likely have some familiarity with it from high school, whereas programming can feel like a new world and is essential for coding in data science.

  • What programming language does the video recommend learning first for data science?

    -The video recommends learning Python first because it is more widely requested in job postings and offers more employment opportunities compared to R.

  • What are the six basic concepts in Python that the video suggests learning?

    -The six basic concepts in Python suggested by the video are data types, variable assignment, lists, dictionaries, Pandas data frames, and basic control flows (IF statements, for loops, while loops).

  • Why is project-based learning emphasized in the video?

    -Project-based learning is emphasized because it helps solidify the knowledge gained from learning programming and mathematical concepts, making it easier to remember and apply these skills in real-life situations.

  • What are three beginner-friendly projects suggested in the video to solidify Python skills?

    -The three beginner-friendly projects suggested in the video are creating a simple contact book application, building an inventory management system, and writing a function to analyze an Excel file and return basic descriptive statistics.

  • How does the video advise approaching the job market while still learning?

    -The video advises gently prodding the job market by applying for entry-level roles and internships without full customization of the application. This helps to understand the job market's demands and to test and improve one's CV.

  • What are some areas of specialization mentioned in the video for data scientists?

    -The video mentions natural language processing, anomaly detection, predictive modeling, recommendation algorithms, marketing mix modeling, computer vision, and general machine learning as areas of specialization for data scientists.

  • Why is SQL considered a valuable skill for data scientists according to the video?

    -SQL is considered valuable for data scientists because it is used for querying and creating databases, which is a skill primarily used by data engineers and data analysts but is still beneficial for data scientists to understand and utilize.

  • What is the video's stance on the importance of having a digital footprint for data scientists?

    -The video suggests that having a digital footprint, such as posting about your journey on LinkedIn and Twitter, can help data scientists stand out as it shows their progression and engagement with the field, which can be appealing to potential employers.

Outlines

00:00

🚀 Introduction to Becoming a Data Scientist

The paragraph introduces the concept of becoming a data scientist and outlines the roadmap to achieve this goal. It emphasizes the importance of understanding what data science is, the skills required, and whether it aligns with one's interests. The speaker encourages research, looking into industries that utilize data science, and understanding the typical workday of a data scientist. The focus is on answering three key questions before diving into the field: What is data science? What skills are needed? And is it the right career choice? The paragraph also stresses the importance of learning programming, specifically Python, as a foundational skill for data scientists.

05:01

💻 Building a Solid Foundation in Programming

This section delves into the importance of programming, particularly Python, as the cornerstone of a data scientist's skill set. It advises against spreading efforts across multiple programming languages and instead recommends focusing on Python due to its prevalence in job postings. The paragraph introduces the concept of an IDE (Integrated Development Environment) and suggests Visual Studio Code as a user-friendly option. It then lists six fundamental Python concepts essential for data science: data types, variable assignment, lists, dictionaries, control flows, and functions. The speaker also advocates for project-based learning to solidify these concepts, suggesting beginner-friendly projects like a contact book application, an inventory management system, and a function to analyze Excel files.
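
As a rough illustration of the first few concepts listed above, the sketch below walks through data types, variable assignment, a list, and a dictionary, reusing the UK example from the video. The variable names and the sample values are illustrative choices, not code taken from the video.

```python
# Data types: whole numbers, decimals, and text.
country_count = 4            # int
pi_approx = 3.14             # float
greeting = "hello"           # str

# A list stores several items under one collective name.
uk = ["England", "Wales", "Scotland", "Northern Ireland"]

# A dictionary stores items in key/value pairs (country -> capital city).
capitals = {
    "England": "London",
    "Wales": "Cardiff",
    "Scotland": "Edinburgh",
    "Northern Ireland": "Belfast",
}

# Loop over the list and look each country up in the dictionary.
for country in uk:
    print(f"{country}: {capitals[country]}")
```

The list keeps the countries in order, while the dictionary adds one extra piece of information per country, which is the distinction the video draws between the two structures.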

10:01

📈 Advancing with Pandas and Data Visualization

The paragraph discusses the transition from basic Python skills to more advanced data manipulation with Pandas, focusing on data frames. It simplifies the concept of data frames as 'fancy tables' akin to those used in Excel. The speaker introduces three additional fundamental concepts: basic control flows, data visualization libraries like Matplotlib and Plotly, and functions. The importance of project-based learning is reiterated with the suggestion to apply these newly learned skills in practical projects. The paragraph concludes by encouraging self-congratulation for overcoming the initial challenges of learning to code, emphasizing that the acquired knowledge is useless unless applied through projects.
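
A minimal sketch of functions and basic control flows might look like the following. The function name `transform` and the sample inputs are hypothetical; the arithmetic (divide by two, multiply by five, subtract three) follows the first wording of the example in the transcript.

```python
def transform(number):
    """Divide by two, multiply by five, then subtract three."""
    return number / 2 * 5 - 3


# A for loop applies the same function to every value in a list.
results = []
for value in [2, 4, 10]:
    results.append(transform(value))

# An if/else statement branches on a condition.
labels = []
for result in results:
    if result > 5:
        labels.append("large")
    else:
        labels.append("small")

# A while loop repeats until its condition becomes false.
n = 100
steps = 0
while n > 1:
    n = n / 2
    steps += 1
```

Defining the calculation once and calling it in a loop avoids retyping the same three operations for every number.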

15:02

🔢 Integrating Mathematics into Data Science

This section introduces the integration of mathematical concepts into the data science learning journey. It emphasizes the importance of understanding statistical concepts, linear algebra, basic calculus and trigonometry, and probability. The paragraph suggests focusing on fundamental mathematical concepts that are relevant to data science and provides resources for learning, including YouTube channels and random Google articles. It also suggests project-based learning to solidify mathematical knowledge, proposing projects like calculating moving averages, implementing statistical functions, and exploring libraries like NumPy and SciPy.
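
One of the projects mentioned above, calculating a moving average, could be sketched with NumPy roughly as follows. The function name `moving_average` and the sample numbers are assumptions made for illustration; the video does not prescribe a particular implementation.

```python
import numpy as np


def moving_average(values, window):
    """Return the simple moving average over a sliding window."""
    values = np.asarray(values, dtype=float)
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the window fits entirely.
    return np.convolve(values, kernel, mode="valid")


prices = [1, 2, 3, 4, 5]  # made-up sample data
print(moving_average(prices, 3))
```

With a window of 3 over five values, the result has three entries, one for each position where a full window is available.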

20:02

🌐 Specializing and Enhancing Employability

The paragraph discusses the importance of specializing in certain areas of data science to enhance employability. It advises having a solid understanding of basic data pre-processing, feature engineering, and supervised learning. It then suggests exploring areas like natural language processing, anomaly detection, and machine learning for deeper knowledge. The speaker recommends starting to apply for jobs early in the learning process to test the job market and refine the CV. The paragraph also introduces additional skills like SQL, data visualization with Tableau, and the importance of community and networking. It concludes by encouraging the pursuit of a data science community for support and knowledge sharing.

25:03

📊 Mastering Data Presentation and Community Building

This section focuses on the importance of data presentation skills, particularly with Tableau, to create appealing dashboards that can help in job applications. It also touches on the value of community building and networking for moral support and problem-solving. The speaker suggests forming a community for dedicated learners and professionals to accelerate their data science journey through study sessions, calls, and mentorship. The paragraph concludes by encouraging the pursuit of additional skills like working with APIs, using GitHub, and learning streamlit, as well as maintaining a digital footprint on professional platforms to showcase one's journey and expertise.

📈 Staying Cutting-Edge in Data Science

The final paragraph emphasizes the importance of staying up-to-date with the latest trends in the rapidly evolving field of data science. It suggests following data science professionals on platforms like Medium, YouTube, Twitter, and LinkedIn for insights, tutorials, and discussions on new techniques. The speaker also encourages subscribing to newsletters and engaging with the data science community to continue learning and growing in the field. The paragraph concludes with a reassurance that the roadmap provided is comprehensive and encourages subscribers to access written resources for further guidance.

Keywords

💡Data Science

Data Science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data. In the video, it is the central theme, with the speaker providing a roadmap for aspiring data scientists. The video aims to clarify what data science entails and the skills required to excel in this field.

💡Python

Python is a high-level programming language that is widely used in data science for its simplicity and the powerful libraries it supports for data analysis and machine learning. The video emphasizes learning Python as a foundational skill for data scientists, noting that many job postings explicitly ask for Python proficiency.

💡IDE (Integrated Development Environment)

An Integrated Development Environment is a software application that provides comprehensive facilities to computer programmers for software development. In the context of the video, the speaker recommends using an IDE like VS Code for writing Python code, highlighting its popularity and utility among developers.

💡Pandas

Pandas is a Python library providing high-performance, easy-to-use data structures, and data analysis tools. The video mentions Pandas as a crucial library for data manipulation, particularly for working with data frames, which are akin to tables and are fundamental to handling data in Python.
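
To make the "data frame as a table" idea concrete, here is a small sketch in the spirit of the video's third beginner project (load a spreadsheet and return descriptive statistics). The function names and the inventory figures are made up for illustration, and `describe_excel` assumes an Excel engine such as openpyxl is installed.

```python
import pandas as pd


def describe_table(df):
    """Return basic descriptive statistics for the numeric columns."""
    return df.describe()


def describe_excel(path):
    """Load an Excel file into a data frame and summarise it.

    Reading .xlsx files requires an Excel engine such as openpyxl.
    """
    return describe_table(pd.read_excel(path))


# Made-up inventory data, standing in for the contents of a spreadsheet.
inventory = pd.DataFrame(
    {"item": ["pen", "notebook", "stapler"], "price": [1.0, 3.0, 8.0]}
)
summary = describe_table(inventory)
print(summary)
```

Splitting the work into two functions keeps the summary logic usable on any data frame, whether it came from Excel or was built in memory.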

💡Mathematics

Mathematics, particularly statistics, linear algebra, calculus, and probability, forms the backbone of many data science algorithms and techniques. The video underscores the importance of having a strong grasp of mathematical concepts to interpret and apply data science methods effectively.
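
As one small example from the probability side, the sketch below computes an expected value and applies Bayes' theorem using exact fractions. The function name `bayes` and all the numbers are hypothetical, chosen only to show the mechanics.

```python
from fractions import Fraction

# Expected value of a fair six-sided die: sum of outcome * probability.
expected_roll = sum(Fraction(x, 6) for x in range(1, 7))


def bayes(prior, likelihood, false_positive_rate):
    """Return P(H | E) for a binary hypothesis H via Bayes' theorem."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% prior, 90% hit rate, 5% false-positive rate.
posterior = bayes(Fraction(1, 100), Fraction(9, 10), Fraction(1, 20))
print(expected_roll, posterior)
```

Using `Fraction` instead of floats keeps the arithmetic exact, which makes it easier to check the result by hand.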

💡Project-Based Learning

Project-Based Learning is an educational method where students gain knowledge and skills by working for an extended period on investigations, research, and projects. The video advocates for this approach in learning data science, suggesting that applying newly learned skills to real-world projects solidifies understanding and makes the learning process more meaningful.
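
The first beginner project suggested in the video, a simple contact book, could be sketched along these lines using only dictionaries and functions. The function names, the fields stored, and the sample contact are assumptions for illustration rather than the video's own code.

```python
contacts = {}  # maps a contact's name to a dictionary of details


def add_contact(name, phone, email=""):
    """Create a contact and add it to the contact book."""
    contacts[name] = {"phone": phone, "email": email}


def find_contact(name):
    """Return the contact's details, or None if the name is unknown."""
    return contacts.get(name)


def update_contact(name, **changes):
    """Update one or more fields of an existing contact."""
    if name not in contacts:
        raise KeyError(f"no contact named {name!r}")
    contacts[name].update(changes)


def delete_contact(name):
    """Remove a contact; do nothing if the name is not present."""
    contacts.pop(name, None)


add_contact("Ada", "555-0100")
update_contact("Ada", email="ada@example.com")
print(find_contact("Ada"))
```

The same pattern of a dictionary plus a few small functions could be extended toward the inventory management project, with a running till balance as an extra variable.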

💡Statistical Concepts

Statistical Concepts such as mean, median, mode, variance, and standard deviation are essential in data analysis for summarizing and describing data. The video includes these concepts as part of the mathematical foundation necessary for data scientists to analyze and interpret data accurately.
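
These summary statistics can be computed with Python's standard `statistics` module, as in the sketch below. The sample data is made up, and population (rather than sample) variance is used here purely as an illustrative choice.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample

mean = statistics.mean(data)
median = statistics.median(data)
mode = statistics.mode(data)
variance = statistics.pvariance(data)  # population variance
std_dev = statistics.pstdev(data)      # population standard deviation

print(mean, median, mode, variance, std_dev)
```

For a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` would be the corresponding calls.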

💡Linear Algebra

Linear Algebra is a branch of mathematics that deals with linear equations, vectors, matrices, and linear transformations. The video mentions that understanding linear algebra is crucial for data scientists as it provides the framework for many operations and algorithms in the field.
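
A short NumPy sketch can tie several of these topics together: a system of linear equations, eigenvalues and eigenvectors, a distance calculation, and normalisation. The matrix and vectors are arbitrary examples, not values from the video.

```python
import numpy as np

# Solve the system  2x + y = 5,  x + 3y = 10  written as A @ x = b.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([5.0, 10.0])
solution = np.linalg.solve(A, b)

# Eigenvalues and eigenvectors of the symmetric matrix A.
eigenvalues, eigenvectors = np.linalg.eigh(A)

# Euclidean distance between two vectors, and normalisation to unit length.
u = np.array([3.0, 4.0])
v = np.array([0.0, 0.0])
distance = np.linalg.norm(u - v)
unit_u = u / np.linalg.norm(u)

print(solution, eigenvalues, distance, unit_u)
```

`np.linalg.eigh` is used because this particular matrix is symmetric; for a general square matrix, `np.linalg.eig` would be the usual choice.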

💡Visualization

Visualization refers to the presentation of data in a graphical or visual format, making it easier to understand and interpret. The video suggests learning visualization libraries like Matplotlib and Plotly to create charts and graphs that can help in presenting data insights more effectively.
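
A first chart with Matplotlib might look like the sketch below. The sales figures, labels, and output filename are invented for illustration, and the non-interactive "Agg" backend is selected only so the script runs without opening a window.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Made-up monthly sales figures, purely for illustration.
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [120, 135, 90, 160]

fig, ax = plt.subplots()
bars = ax.bar(months, sales)
ax.set_title("Monthly sales (sample data)")
ax.set_xlabel("Month")
ax.set_ylabel("Units sold")
fig.savefig("sales.png")  # writes the chart to the working directory
```

Seaborn and Plotly follow a similar pattern of passing data and labels to a plotting call, though their function names differ.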

💡SQL

SQL (Structured Query Language) is a domain-specific language for querying and managing data held in relational databases, and it is also widely used for data analysis. The video highlights SQL as an additional skill that data scientists should learn, as it is valuable for querying and managing databases.
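
To keep everything in one language, the sketch below runs a few SQL statements from Python using the standard `sqlite3` module. The table name, columns, and rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary in-memory database
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("North", 100.0), ("South", 250.0), ("North", 50.0)],  # made-up rows
)

# Query: total amount per region, largest first.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY total DESC"
).fetchall()
conn.close()

print(rows)
```

The `CREATE TABLE`, `INSERT`, and `SELECT ... GROUP BY` statements cover the database creation and querying skills mentioned above.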

💡Machine Learning

Machine Learning is a subset of artificial intelligence that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. The video touches on machine learning as an area where data scientists can specialize, and it is a field that is constantly evolving with new techniques and applications.

Highlights

By the end of this video, you will have a comprehensive blueprint to become a data scientist in 2024.

The video provides a detailed roadmap, including the most effective self-taught route, project ideas, and resources for learning data science.

It emphasizes the importance of answering three fundamental questions before starting to study data science.

Research is suggested to understand data science, its applications in various industries, and the typical workday of a data scientist.

Programming, particularly Python, is recommended as the first skill to learn for data science, ahead of mathematics.

Python is chosen over R for its broader demand in job postings; roles that require R alone have less competition but are much rarer.

Visual Studio Code (VS Code) is recommended as an Integrated Development Environment (IDE) for writing Python code.

Six fundamental Python basics are outlined as essential for data science, including data types, variables, lists, dictionaries, and control flows.

Pandas and data frames are introduced as important for working with data tables in Python.

Project-based learning is stressed as a method to solidify knowledge, with three beginner-friendly projects suggested.

The video discusses the benefits of structured learning through courses or boot camps over self-teaching via YouTube.

DataCamp is recommended for learning Python, SQL, and other data science skills.

Mathematics is introduced as a concurrent learning path, focusing on statistics, linear algebra, calculus, and probability.

Applying for jobs with a basic understanding of math and programming is advised, even before mastering all skills.

The importance of tailoring a CV to highlight data science skills and projects is emphasized for job applications.

Specialization areas in data science such as NLP, anomaly detection, and machine learning are suggested for deeper knowledge.

SQL is highlighted as a valuable skill for data scientists, with a focus on querying, database creation, and working with relational tables.

Visualization skills, particularly with Tableau, are recommended to present data science findings effectively.

The video concludes with advice on staying up-to-date with the latest trends in data science through platforms like Medium, YouTube, and following industry leaders.

Transcripts

play00:00

by the end of this video you will have

play00:01

the blueprint to become a data scientist

play00:03

in 2024 videos on this topic are usually

play00:06

optimized to be as digestible as

play00:08

possible for YouTube but a 12 minute 25

play00:11

second video will only allow you to walk

play00:13

away with a few bullet points and a lot

play00:15

of confusion today I will be giving you

play00:17

so much more than that the ultimate road

play00:19

map I'll not only tell you the most

play00:21

effective self-taught route to take but

play00:23

also give you different projects you can

play00:25

do and in what order at different stages

play00:27

of your journey the resources that I use

play00:30

to learn data and most importantly a

play00:32

bunch of different ways to stand out in

play00:34

a sea of data scientists strap yourself

play00:37

in thousands of people get into this

play00:40

journey they spend countless hours

play00:42

hunched in front of their laptop away

play00:43

from friends and family learning all

play00:45

these different programming languages

play00:47

just to get to the end and say yeah I

play00:49

studied data science but I realize it's

play00:51

not really for me that's what happens

play00:53

when you don't answer these three simple

play00:55

questions before you start studying data

play00:58

science do you know what data science is

play01:00

do you know the skill set you need to

play01:02

become a data scientist and does it

play01:04

sound like something that you want to do

play01:06

so how do you get the knowledge to

play01:07

answer these questions you need to do

play01:09

three different things research what is

play01:11

data science online just an hour or two

play01:13

is enough to give you the gist look up a

play01:15

couple of industries that you're

play01:16

interested in and see how they use data

play01:18

science and the third thing you need to

play01:20

do is look at what a typical day in the

play01:22

life of a data scientist looks like

play01:24

research the typical work days are you

play01:26

happy with the amount of coding they

play01:27

have to do the amount of stakeholder

play01:29

interaction they have and the amount of

play01:31

presenting they have to do for example

play01:32

give yourself the solid foundation

play01:34

before diving head first step two dive

play01:37

head first the two most important skills

play01:38

we need as data scientists are maths and

play01:41

programming so how do we pick which to

play01:43

learn first to me the answer is simple

play01:46

programming definitely programming

play01:48

that's because even if you're not great

play01:49

at maths you at least have a level of

play01:51

familiarity with it from your high

play01:53

school days but programming can feel

play01:55

like entering a whole new world

play01:56

programming is also how we'll often be

play01:58

implementing the math that we do you

play01:59

learn so it makes sense to have this

play02:01

solid foundation of programming before

play02:03

you begin the major programming

play02:05

languages we have as data scientists are

play02:07

R and Python and don't waste time on

play02:09

this I'll make it super simple for you

play02:11

python just pick python your end goal

play02:14

here is to be as employable as possible so

play02:16

just pick python why well when I look at

play02:19

job postings it's like this 60% ask for

play02:22

python explicitly 30% ask for r or

play02:25

Python and very few ask distinctly just

play02:28

for r with no alternative for another

play02:30

language so if you choose R you will be

play02:32

excellently positioned for these jobs

play02:34

because there is a lot less competition

play02:36

but the problem is those jobs are much

play02:38

rarer and you'll be hamstringing

play02:40

yourself for all of these jobs so just

play02:43

pick python please the next step is to

play02:46

pick a simple commonly used IDE a simple

play02:49

analogy to understand what IDEs are if

play02:52

tomorrow I decided I wanted to write a

play02:54

book in English I'll have to decide what

play02:56

software I would write that book in

play02:58

MS Word Google Docs Scrivener the software

play03:01

is the equivalent of an IDE so what IDE

play03:04

should you write your python code inside

play03:07

of again I'll make your life easy pick

play03:09

something people use or at least what I

play03:11

use vs code congratulations you now have

play03:14

your basics in order now what do we have

play03:17

to learn in Python good news I'm not

play03:19

just going to give you the basics in

play03:21

Python but also great mini projects that

play03:24

you can Implement to solidify your

play03:27

knowledge these six Basics are things

play03:29

that you use every day and some of them

play03:31

every single line of your coding Journey

play03:34

without a good grasp of these it is

play03:35

effectively impossible to code for data

play03:38

science so the first thing I want you to

play03:40

learn is data types and what you can and

play03:42

can't do with each one of these within

play03:44

python you probably intuitively know a

play03:46

few of these such as ints are basically

play03:48

whole numbers floats are decimals and

play03:51

strings are just like words and words

play03:53

and numbers and a combination of

play03:55

everything in between the second basic

play03:57

after that that I need you to get a hold

play03:58

of is assigning variables and again

play04:01

this is pretty straightforward assigning

play04:02

a variable is basically giving your

play04:04

object a code name that you will refer

play04:06

to it as throughout the rest of your

play04:08

code after that learn about lists next

play04:11

and lists really are super simple to

play04:14

understand they are effectively ways of

play04:16

storing different items together within

play04:18

python so let's say instead of always

play04:21

referring to items individually you can

play04:23

use a collective name a simple easy to

play04:25

understand use case for this is if you

play04:27

had four variables which were a country

play04:29

name England Wales Scotland and Northern

play04:32

Ireland instead of always writing them

play04:34

out one by one you could just put them

play04:36

into a list called the UK okay next up

play04:38

after that is dictionaries and how they

play04:40

work they really are sort of like lists

play04:43

except stored in pairs so if you wanted

play04:45

to include more information you could do

play04:47

that so instead of just storing the

play04:49

countries in the UK you could also store

play04:51

each country and its capital city okay

play04:53

I'm going to stop giving examples

play04:55

because it might start to get confusing

play04:57

but trust me it's super straightforward

play04:59

it might just be because this is

play05:00

your first exposure to these Concepts

play05:03

okay now we're on to the basics of

play05:04

pandas data frames if we want to be

play05:07

oversimplified a pandas data frame is

play05:09

essentially a fancy way of saying a

play05:11

table okay a data frame is a glorified

play05:14

table but you understand how tables work

play05:16

in Excel okay good well you're at least

play05:18

on your way to understanding how data

play05:20

frames work in Python okay are you

play05:22

feeling confused at this point if so

play05:24

it's absolutely fine these might be new

play05:27

Concepts to you and I promise you as

play05:28

soon as you start learning and getting

play05:30

the hang of these concepts you'll be like

play05:32

I remember when I used to struggle with

play05:33

what a list was trust me progress is

play05:36

inevitable okay now I need you to learn

play05:38

just three more things and we'll be done

play05:40

with the basics and you'll see just how

play05:42

much we can do with just the basics when

play05:44

we get to the projects okay the fourth

play05:46

basic I need you to learn are basic

play05:48

control flows these are if/else statements

play05:50

for Loops while loops and all of these

play05:53

are basically what they sound like it

play05:55

might take a little bit of grasping but

play05:57

you'll be fine after this I want you to

play05:58

learn a basic visualization Library such

play06:01

as Matplotlib or Seaborn but personally I

play06:04

prefer plotly it is just a little bit

play06:06

nicer looking and it just lets you make

play06:08

simple charts to visualize your data as

play06:10

you go along and the last last last

play06:13

absolute basic that I want you to learn

play06:15

are functions and how to define and

play06:17

create functions functions are basically

play06:20

predefined bits of code that you can

play06:22

call at any time to avoid writing the

play06:24

same code again and again so let's say

play06:27

for whatever reason in your code you're

play06:29

going to need to divide numbers by two

play06:31

then multiply by five then subtract

play06:33

three you could do that manually every

play06:36

single time which is super time

play06:38

consuming not to mention boring or you

play06:41

could write a function where you say hey

play06:43

every time I give you a number I want

play06:44

you to divide it by two multiply by

play06:46

three and then subtract four and then

play06:49

all you have to do is call that function

play06:51

and it does it for you much simpler okay

play06:53

okay I promised I'm going to stop giving

play06:55

examples because it could get a little

play06:57

bit confusing but anyway when you

play06:59

reach this stage give yourself a pat on

play07:01

the back because you've come through

play07:02

some of the toughest parts of learning

play07:04

how to code but all of this knowledge

play07:06

that you've just gained is absolutely

play07:08

useless unless you apply this one

play07:10

principle from now on every time you

play07:12

pick up two to three skills I want you

play07:14

to implement the principles over here

play07:16

and that principle is project-based

play07:18

Learning Without applying these skills

play07:20

to a project that represents real life

play07:22

you will instantly forget exactly what

play07:24

you've learned and you'll think why

play07:26

would anybody care about lists or

play07:27

dictionaries that mindset will change

play07:29

when you apply to a project so these are

play07:31

three beginner friendly projects to

play07:33

solidify your skills that you're already

play07:35

beginning to build the first one is to

play07:37

create a simple contact book application

play07:39

in other words within vs code I want you

play07:41

to create functions that will allow you

play07:43

to create a contact add that contact to

play07:45

a contact list find the details of a

play07:48

contact update a contact's details and

play07:50

then also delete a contact if you need

play07:52

to do so the only skills that you'll

play07:54

need are the ones that we've learned so

play07:56

far the second project actually Builds

play07:58

on what we've just learned but it's an

play07:59

inventory management system so I want

play08:01

you to be able to create an item with

play08:03

the price find the details of that item

play08:06

but I also want you to have a till

play08:07

balance that updates every time somebody

play08:09

makes a sale or a return and the last

play08:12

project is the simplest of the three I

play08:14

want you to write a function that takes

play08:15

in an Excel file converts it to a data

play08:18

frame and then Returns the basic

play08:20

descriptive statistics of that file and

play08:22

all of these projects we can do with

play08:24

just the previously discussed skills now

play08:26

you know what to learn how to apply it

play08:28

but where do you actually learn it if

play08:30

you are going self-taught there are two

play08:32

options each with its own pros and cons

play08:35

the first one might actually be the true

play08:37

self-taught route which would be looking

play08:39

each one of these up on YouTube and

play08:41

looking to learn that way it's super

play08:43

cheap and On Demand but there's a lot of

play08:45

drawbacks it lacks structure and when

play08:47

you learn A New Concept from one Creator

play08:49

they don't know what concept you already

play08:51

knew before that so they might refer to

play08:53

knowledge that you do not have yet

play08:55

that's why honestly I think it's worth

play08:56

just shelling out the money to get a course

play08:58

or a boot camp especially considering

play09:00

how much you will make once you become a

play09:02

data scientist the major advantages of

play09:04

this for me is the structure it will

play09:06

give you a lot of structure and

play09:08

importantly you can still get all the

play09:10

information that you would have gotten

play09:11

with this method over here and in that

play09:14

way you can supplement your knowledge

play09:15

from the course it also has

play09:17

disadvantages it's not free but still

play09:19

let's be honest way cheaper than a

play09:21

degree and trust me I would know and

play09:23

it's also gamified that was a big

play09:25

problem that I had with my course and

play09:27

what I used when I was learning was DataCamp

play09:29

as it has skill tracks in Python

play09:31

SQL and anything else that you can think

play09:34

of I do have my gripes with DataCamp

play09:36

but I still use it to this day so I will

play09:38

leave a link for it in the description

play09:40

for you to be able to check out the

play09:42

courses that they do offer and I'll be

play09:43

making a video in the next few weeks

play09:45

teaching you how to effectively learn

play09:47

using an online course so subscribe so

play09:50

that you don't miss that so everything

play09:52

that I'm saying might seem like a lot

play09:54

but if you subscribe to my newsletter

play09:55

you'll get a written road map of

play09:57

everything that I have shown to you but

play09:59

more importantly than just a written

play10:00

road map is the insights that I will be

play10:02

sending you one to two times a month

play10:05

insights that you can only get from

play10:06

working in the industry and the things

play10:08

that aren't great for the YouTube

play10:10

algorithm but will help you to not just

play10:12

land a data science job but improve as a

play10:15

data scientist head to datanash.co.uk

play10:18

pop your email into the box and you'll

play10:20

get a road map and a free subscription

play10:22

to my

play10:23

newsletter okay so now we have our base

play10:25

in programming we can introduce some

play10:27

concurrency into our learning that means

play10:29

learning two things side by side we'll

play10:32

be doing more complex work in Python

play10:34

that will be tailored to making you more

play10:35

employable and I'll touch on that in

play10:37

depth a little bit later because the

play10:39

other thing that we're learning for now

play10:41

is maths now I don't want you to download

play10:44

the entire contents of a math book into

play10:46

your mind so we want the best bang for

play10:48

the buck the fundamental mathematical

play10:50

Concepts that are asked for by all jobs

play10:53

so that math includes a basic

play10:55

understanding of some statistical

play10:57

Concepts like median mean the mode data

play11:00

standardization variance and standard

play11:03

deviation kurtosis skewness correlation

play11:06

and covariance and you can read the rest

play11:08

on the screen after that we also have

play11:10

these important linear algebra topics

play11:13

and linear algebra provides the

play11:14

framework for many data science

play11:16

operations and algorithms but the key

play11:18

Concepts you should focus on are systems

play11:20

of linear equations vectors matrices

play11:23

eigenvalues and eigenvectors

play11:25

normalization and distance calculations

play11:28

after that is some basic calculus and

play11:30

trigonometry and you can see the four

play11:32

major areas I want you to focus on

play11:33

differentiation integration limits and

play11:36

trigonometric functions and finally this

play11:38

is an important one we need to

play11:41

understand probability and the concepts

play11:43

in particular I want you to familiarize

play11:45

yourself with are hypothesis testing

play11:47

Bayesian probability conditional

play11:49

probability probability distribution and

play11:52

expected

play11:54

values take a breath it's just maths

play11:57

it's going to be okay and unlike High

play11:59

School where you had to write out pages

play12:01

and pages of maths by hand and get a

play12:03

nice thick red X every time you were

play12:05

wrong the goal here is mainly to

play12:07

understand the underlying mathematical

play12:09

Concepts so that you can interpret it

play12:12

and use it to Aid your decision making

play12:14

most of the time there will be python

play12:15

libraries that can do the implementation

play12:17

on your behalf and your job will mainly

play12:19

be structuring the code around it and

play12:21

interpreting those results and remember

play12:23

you don't have to be a master of these

play12:25

Concepts but just have a good grasp of

play12:27

the fundamentals you will be fine one

play12:30

step at a time remember so what can you

play12:32

use to help you learn the math well here

play12:34

are three excellent channels that I

play12:36

recommend in this respect StatQuest

play12:39

redlick mats and 3Blue1Brown

play12:41

and let's not forget my favorite

play12:43

resource which is random Google articles

play12:46

those are pretty good to teach you some

play12:47

maths now don't forget our number one

play12:49

principle Project based learning so now

play12:52

we will combine our math skills and our

play12:54

programming skills to solidify our

play12:56

knowledge so the first project can you

play12:58

code a function that will calculate the

play13:00

moving average of a series of numbers

play13:03

and plot the output in a graph in

play13:05

Matplotlib after that can you code a basic

play13:07

statistical function that takes in a

play13:09

list of numbers calculates the mean

play13:12

median and mode variance and standard

play13:14

deviation and for variance and standard

play13:16

deviation I want you to implement that

play13:18

manually just to make sure that you have

play13:20

a good understanding of those Concepts
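
The first two projects above could be sketched roughly like this. This is only one possible version, not the official answer: the function names (`moving_average`, `describe`, `plot_moving_average`) are made up for illustration, the variance is assumed to be the population variance (dividing by n rather than n - 1), and ties for the mode are broken by taking the first value seen. Only the plotting helper needs Matplotlib.

```python
from collections import Counter
from math import sqrt


def moving_average(values, window):
    """Return the simple moving average of `values` over `window` points."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def describe(values):
    """Mean, median, mode, variance and standard deviation of a list."""
    if not values:
        raise ValueError("values must not be empty")
    n = len(values)
    mean = sum(values) / n

    # Median: middle value, or the average of the two middle values.
    ordered = sorted(values)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

    # Mode: the most frequent value (assumption: first one seen wins ties).
    mode = Counter(values).most_common(1)[0][0]

    # Variance written out by hand (assumption: population variance, divide by n).
    variance = sum((x - mean) ** 2 for x in values) / n
    return {
        "mean": mean,
        "median": median,
        "mode": mode,
        "variance": variance,
        "std": sqrt(variance),
    }


def plot_moving_average(values, window):
    """Plot the raw series and its moving average (requires matplotlib)."""
    import matplotlib.pyplot as plt

    smoothed = moving_average(values, window)
    plt.plot(range(len(values)), values, label="raw")
    # The first average only exists once `window` points have been seen.
    plt.plot(range(window - 1, len(values)), smoothed, label=f"{window}-point average")
    plt.legend()
    plt.show()
```

Once this works, try checking your numbers against Python's built-in `statistics` module, which is a quick way to catch a slip in the manual variance.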

play13:22

the third thing code a function that

play13:24

calculates the dot product of two

play13:26

vectors for this one again I don't want

play13:27

you to use any libraries at all no

play13:30

NumPy okay for the next one can you code

play13:32

a function that takes in two matrices

play13:34

firstly checks if multiplication of

play13:36

those matrices is possible then if it is

play13:39

possible multiplies them otherwise it

play13:42

returns an informative error and then
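
For the dot product and matrix multiplication projects, a minimal library-free sketch might look like this. It assumes vectors are plain Python lists and matrices are lists of rows; the names `dot` and `matmul` and the wording of the error messages are just illustrative choices.

```python
def dot(u, v):
    """Dot product of two equal-length vectors, no libraries."""
    if len(u) != len(v):
        raise ValueError(f"vectors must have equal length, got {len(u)} and {len(v)}")
    return sum(a * b for a, b in zip(u, v))


def matmul(a, b):
    """Multiply two matrices given as lists of rows, or raise a clear error."""
    if not a or not b or not a[0] or not b[0]:
        raise ValueError("matrices must not be empty")
    cols_a, cols_b = len(a[0]), len(b[0])
    if any(len(row) != cols_a for row in a) or any(len(row) != cols_b for row in b):
        raise ValueError("every row in a matrix must have the same length")
    # Multiplication is only defined when columns of `a` equal rows of `b`.
    if cols_a != len(b):
        raise ValueError(
            f"cannot multiply a {len(a)}x{cols_a} matrix by a {len(b)}x{cols_b} matrix: "
            f"columns of the first ({cols_a}) must equal rows of the second ({len(b)})"
        )
    columns_b = list(zip(*b))  # transpose so each column of `b` is one tuple
    return [[dot(row, col) for col in columns_b] for row in a]
```

Each entry of the result is just the dot product of a row from the first matrix with a column from the second, which is why `matmul` can reuse `dot`.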

play13:44

besides that I want you to explore what

play13:46

you can do with these two libraries in

play13:47

particular NumPy and

play13:50

SciPy and of course let's not forget my

play13:53

favorite one the fourth one random

play13:55

Google articles which have saved me on

play13:57

more than one occasion we have now

play13:59

reached the whole goal of our data

play14:01

science journey and this is one of my

play14:04

more controversial takes but armed with

play14:06

just the basics of maths and programming

play14:08

you should start applying for jobs but

play14:11

maybe not in the way that you expect I

play14:13

want you to put together a CV that shows

play14:15

off your data skills and the projects

play14:17

that we've done so far but here's the

play14:19

key part we aren't spending 4 hours a

play14:21

day applying for jobs at this stage

play14:23

instead we're just gently prodding

play14:25

around the market and mainly applying for

play14:26

entry level roles and internships that

play14:29

do not require full customization of the

play14:32

application because right now we

play14:33

probably won't get the job but Nash

play14:35

what's the point of applying if we won't

play14:37

get the job good question two simple

play14:39

reasons and the second one might be more

play14:41

important than the first the academic

play14:43

year of my masters was a 10-month

play14:45

process but I secured my first

play14:47

internship to work as a data scientist

play14:49

in January of 2022 4 months into my

play14:52

masters when learning data science we

play14:54

often view the process like this I'll start

play14:56

off having no skill as a data scientist

play14:58

and nobody will want to hire me and then

play15:00

eventually I'll get through all the

play15:02

courses and get that one final skill and

play15:04

all of a sudden people will be dying to

play15:06

hire me the reality is that the skill

play15:08

acquisition chart looks more like this

play15:10

where as you gain skills your level of

play15:12

employability Rises you do not know

play15:15

where on this axis your first employer

play15:17

will be willing to take you on as a data

play15:18

scientist so by prodding around with your

play15:20

CV you get to get your first job as

play15:22

early in the process as possible the

play15:25

second reason is all about downloading

play15:27

data the last thing I want for you is to

play15:29

spend months picking up all of these

play15:30

skills and then applying for hundreds of

play15:32

jobs and never hearing anything back and

play15:35

trust me this will happen if your CV is

play15:37

awful by doing a bunch of simple

play15:39

applications as you learn you get to

play15:41

test your CV you might put in 20 to 30

play15:43

easy applications and hear nothing back

play15:46

and that is a sign to tweak your CV and

play15:49

see if your response rate improves maybe

play15:51

your education section needs to go below

play15:53

your work experience or change the

play15:55

length of your CV consistent easy

play15:57

applications will allow you to

play15:58

experiment until you have reached your

play16:00

optimal

play16:02

CV with the basics of Python mastered

play16:05

we can now have a lot more fun in this

play16:07

part of your journey I want you to have

play16:08

a lot more autonomy around which areas

play16:11

you dig into firstly by having a

play16:13

curiosity mindset in addition to this

play16:15

curiosity mindset I also want you to

play16:18

adopt just in time learning so as

play16:20

opposed to just thinking hm I don't know

play16:22

anything about web scraping let me just

play16:24

randomly learn how to web scrape the

play16:26

more effective way is to always be

play16:27

working on projects and then when there

play16:29

is a project that demands for you to

play16:31

learn how to scrape you put aside time

play16:33

to then learn it to move your project

play16:35

forward as opposed to learning things

play16:37

just in case but before digging into the

play16:39

potential areas of specialization I will

play16:42

make mention of these three areas that I

play16:44

advise you have a solid understanding of

play16:46

well four areas the first is basic data

play16:49

pre-processing and feature engineering

play16:51

but I also want you to know what

play16:52

supervised learning is and the basics of

play16:54

how to do that the same with

play16:56

semi-supervised learning and unsupervised

play16:58

learning after this common areas in which

play17:00

to specialize or get deeper knowledge in

play17:02

the first is natural language processing

play17:04

anomaly detection predictive modeling

play17:07

recommendation algorithms marketing mix

play17:09

modeling computer vision and general

play17:12

machine learning are all good areas in

play17:14

which to get deeper knowledge this

play17:16

doesn't mean you have to only specialize

play17:17

in one of these at this stage but these

play17:19

are the areas that you commonly see job

play17:21

postings for now with all of that

play17:23

knowledge you should be in a good

play17:24

position to actually get a job and the

play17:26

key is to have excellent projects that

play17:28

appeal to employers regardless of

play17:30

which specialization area you're looking

play17:32

for it can be difficult to think of new

play17:34

projects that appeal to employers and a

play17:36

platform that I actually recently

play17:38

discovered that allows me to think of

play17:39

great projects is called project Pro

play17:42

they literally have industry-leading

play17:43

standard projects and in my opinion are

play17:46

more advanced than what you can find

play17:47

online in general so whenever I need a

play17:49

new project that's the platform I go to

play17:51

and I will put a link in the description

play17:54

for it there are a lot of advantages as

play17:56

you can see listed but the one thing is

play17:58

that their subscription price does

play17:59

reflect the standard of the projects

play18:01

that it does contain so only do this if

play18:03

you have the funds to afford it and want

play18:05

those really really top projects to

play18:07

stand out quick interjection people I've

play18:10

actually spoken to the people at project

play18:12

Pro they've agreed to give you a 5%

play18:14

discount if you use the link below just

play18:16

to make it a little bit more affordable

play18:18

so I think first do free projects the

play18:20

ones mentioned in this road map see what

play18:23

you can do on your own time after that

play18:25

then when you really want to take your

play18:27

project to the next level if that's

play18:28

what's holding you back project Pro is a

play18:30

great place I'm using it mainly for

play18:32

learning how to implement LLM solutions

play18:35

but they have so much more than that so

play18:38

yeah link below 5% if that sounds

play18:40

interesting to you if you can't afford

play18:42

that there's plenty of cheaper and free

play18:44

options the first one is taking on the

play18:46

projects that are within your course

play18:48

this also has a lot of advantages but

play18:50

also a lot of disadvantages such as

play18:52

being quite generic focused on skill

play18:54

display rather than being employable and

play18:56

often times they spoon feed you to get

play18:58

through that project but from the free

play19:00

options what I would recommend is using

play19:02

your own internal knowledge and

play19:04

curiosity for example maybe you have a

play19:06

background in customer service and

play19:08

decide I think it would be useful to

play19:10

write some code that would tell you the

play19:11

sentiment of customer reviews about our

play19:13

product frequent questions that come up

play19:15

from customers and the frequently

play19:17

mentioned reasons for poor reviews now
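
As a very rough first pass at that customer-review idea, something like the sketch below would do. Everything in it is an assumption for illustration: the tiny `POSITIVE` and `NEGATIVE` word lists are made up, the function names are hypothetical, and a real project would swap the word counting for a proper sentiment model.

```python
import re
from collections import Counter

# Hypothetical mini-lexicons, only here to make the example run.
POSITIVE = {"great", "love", "fast", "excellent"}
NEGATIVE = {"broken", "late", "slow", "refund"}


def tokenize(text):
    """Lowercase a review and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def sentiment(review):
    """Label a review by counting positive versus negative words."""
    words = tokenize(review)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def common_complaints(reviews, top=3):
    """Most frequent negative words across the reviews labelled negative."""
    counts = Counter(
        word
        for review in reviews
        if sentiment(review) == "negative"
        for word in tokenize(review)
        if word in NEGATIVE
    )
    return counts.most_common(top)
```

Even a crude version like this gives you something to show, and it makes it obvious where a better model or a real review data set would slot in.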

play19:19

imagine that you are an employer who

play19:21

does e-commerce and you see that project

play19:23

from this person your mind will

play19:24

instantly think oh wow they could bring

play19:26

so much to my company if they can

play19:28

translate that to us because we want to

play19:30

know what our customers are thinking I

play19:32

have a whole video here explaining how

play19:34

to do effective projects to actually get

play19:36

employed so you can open that in a new

play19:38

tab or add it to your watch later and to

play19:40

do any project you will need data so

play19:42

familiarize yourself with kaggle.com

play19:44

which is a website where you can get

play19:45

free data to do your personal projects

play19:49

with okay so now we have good python

play19:51

good projects and good fundamentals but

play19:54

so do a lot of data scientists we now

play19:56

want to be hyper valuable and pick up

play19:58

additional skills that will put us head

play20:00

and shoulders above the competition the

play20:02

first of which is SQL which is a great

play20:04

querying and database creating language

play20:06

it's excellent and is mainly used by

play20:08

data engineers and data analysts but

play20:10

it's still very valuable for us as well
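
To make "querying and database creating" a bit more concrete, here is a small sketch using Python's built-in `sqlite3` module. The `customers` and `orders` tables and their rows are invented for the example, and the window function assumes your Python ships with SQLite 3.25 or newer (most recent installs do). It shows two related tables linked by a foreign key, a join, and a running total per customer.

```python
import sqlite3


def build_order_features():
    """Create a tiny two-table database in memory and query it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            city        TEXT NOT NULL
        );
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
            amount      REAL NOT NULL
        );
        INSERT INTO customers VALUES (1, 'Leeds'), (2, 'York');
        INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 35.0), (12, 2, 15.0);
        """
    )
    # Join the two tables and add a per-customer running total of spend.
    rows = conn.execute(
        """
        SELECT o.order_id,
               c.city,
               o.amount,
               SUM(o.amount) OVER (
                   PARTITION BY o.customer_id ORDER BY o.order_id
               ) AS running_total
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id
        ORDER BY o.order_id
        """
    ).fetchall()
    conn.close()
    return rows
```

Keeping the city on the customer rather than repeating it on every order is the kind of thing normalization is about, and the joined result is what you would hand to pandas or a model.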

play20:12

compared to learning the basics of

play20:13

Python learning the basics of SQL is

play20:15

pretty straightforward but a few key

play20:17

areas I want you to focus on are how to

play20:20

query how to create a database including

play20:22

reducing to 2NF and 3NF format working

play20:25

with relational tables and foreign Keys

play20:27

as well as elements like creating

play20:29

temporary tables and some easy window

play20:31

functions and partitioning again what I

play20:33

used to learn all of this when I was

play20:35

going through my self-taught phase was

play20:37

DataCamp and their SQL Developer track

play20:39

which was actually really good and once

play20:41

again we've picked up a new skill so

play20:43

what do we do Project based learning

play20:45

employers don't care about you telling

play20:47

them you can do SQL they want to see

play20:49

that you can do SQL so for these

play20:51

projects we can have a dedicated SQL

play20:53

project which is just you showing how to

play20:55

create a database nothing wrong with it

play20:57

perfectly fine but option b I think is

play21:00

integrating it with your existing data

play21:02

science projects so before we were

play21:04

building a predictive model on a Kaggle

play21:06

data set now firstly create a database

play21:08

for that data set reduce it to 3NF then

play21:11

do the necessary joins to get the

play21:12

columns you need to build your

play21:14

prediction model on top of that I'm

play21:16

linking this free kaggle data set down

play21:18

below so that you can do that if you

play21:20

wish to now the next secret weapon as

play21:22

data scientists we often do not pay

play21:24

enough attention to the front end the

play21:26

customer facing aspect of our

play21:28

projects we just concentrate on getting

play21:30

good at the coding and then leave it to

play21:32

the data analyst to make it look pretty

play21:34

but a lot of companies can't afford to

play21:35

have a dedicated data analyst so they're

play21:38

looking for a data scientist with the

play21:40

ability to present their findings and

play21:41

not just throw a random Jupyter notebook

play21:44

at them so the next thing that you

play21:45

should do is become competent with the

play21:47

visualization software and I do

play21:49

recommend Tableau when presenting your

play21:51

work to employers and recruiters you'll now

play21:53

be able to show it off both as code but

play21:55

also as a really appealing dashboard

play21:57

that crystallizes the work that you've

play21:59

done the best part is learning the

play22:00

basics of Tableau won't take you long at

play22:02

all so there's no reason not to take a

play22:04

weekend or two just to learn the basics

play22:06

now with Tableau Python and SQL in Your

play22:09

Arsenal and continued work on all three

play22:11

of these you should be well positioned

play22:13

to get your first job where before we

play22:15

were being casual in a job search we are

play22:18

now being really intentional with our

play22:20

job hunt really take the time to fix

play22:22

your CV now and have dedicated time to

play22:24

apply for entry roles that you think you

play22:26

can get it's not just easy apply anymore

play22:29

take the time to customize your CV

play22:30

where possible for different jobs list

play22:32

your experience and projects in a nicely

play22:34

ordered Manner and I will be doing a

play22:36

completely separate video on this but in

play22:38

the meantime here's some information on

play22:40

how to increase your odds of getting

play22:42

that first job as well as a couple more

play22:44

videos that I've done around this area

play22:46

that I will be linking down below as

play22:48

well listen you will feel stuck at

play22:51

different points during your data

play22:52

science journey and if you do go down

play22:54

this path solo it will get extremely

play22:56

lonely extremely quickly so you need to

play22:59

find a community of other people who are

play23:01

getting into data science or this area

play23:03

in general for moral support but also to

play23:05

discuss problems and look to solve these

play23:07

together you can look for communities

play23:09

online in the shape of forums provided

play23:11

by the courses you pay for social media

play23:13

groups and those sort of things and I'll

play23:15

be honest I don't have experience with

play23:16

either of those but I do have experience

play23:18

with networking in person which is an

play23:20

amazing resource that I've had great

play23:22

results with and I do have a video that

play23:24

exclusively discusses how to network

play23:26

effectively but the one thing that it does

play23:28

do is limit your community to those who

play23:30

are local to you and that's a huge

play23:32

missed opportunity which is why I'm

play23:34

looking to form a community to solve

play23:35

these problems that you can sign up for

play23:38

in this community I'll be having study

play23:39

sessions and regular calls with the

play23:41

members to provide more tailored advice

play23:43

and mentorship on accelerating your data

play23:45

science Journey it will be for dedicated

play23:47

fellow Learners and experienced

play23:49

professionals who don't just want to be

play23:51

mediocre data scientists but want to work

play23:53

their way up to being truly great it

play23:55

will provide an ecosystem for growth

play23:57

support knowledge sharing and it's

play23:59

just a space where you can ask questions

play24:01

share insights collaborate on projects

play24:03

and get feedback all of which are

play24:05

essential to accelerating your progress

play24:08

if that sounds like something you want

play24:09

to be a part of sign up for the

play24:11

waitlist below to get Early Access when this

play24:13

community does go

play24:15

live if we're thinking about the 80/20

play24:17

rule we're now definitely in the realm

play24:19

of nice to have things rather than

play24:21

Necessities but these definitely would

play24:24

make you one of the outstanding

play24:25

candidates on top of everything else

play24:27

that we've already learned is

play24:28

learning how to work with apis and use

play24:31

them to fetch data that can change

play24:33

dynamically instead of the static csvs

play24:35

we've been working with when downloading

play24:37

data off of Kaggle also learn the basics

play24:39

of GitHub and these Basics are getting

play24:41

your projects into your GitHub

play24:43

repository so that other people have

play24:45

access to your code as well and

play24:47

something else that I'm learning is

play24:48

streamlit which allows you to easily

play24:50

turn your code into an interactive web

play24:52

application that other people can use

play24:55

and more on that coming on the channel

play24:56

soon but that's a super useful

play24:58

skill and the last one this is very much

play25:01

an extra is posting about your

play25:03

journey onto platforms like LinkedIn and

play25:05

Twitter which are particularly useful

play25:07

cuz they're more professional at times

play25:10

and if you have a digital footprint it

play25:11

shows that you're slowly leveling up and

play25:14

it could help you to stand out but don't

play25:16

dedicate too much time to the

play25:18

documentation at this

play25:20

stage and now we are on to the final

play25:23

element which is The Cutting Edge data

play25:26

science is a field that is always in

play25:28

flux so you need to remain up to date

play25:30

with the latest trends and the three best

play25:32

ways that I look to do this firstly

play25:34

medium and towards data science these

play25:36

platforms are Treasure troves of

play25:38

Articles and tutorials insights and so

play25:40

much more by data science professionals

play25:42

and enthusiasts whether you're looking

play25:44

for in-depth tutorials case studies or

play25:46

thought-provoking discussions on the

play25:48

latest AI or machine learning techniques

play25:51

these are pretty good although you do

play25:52

have to pay a couple bucks a month a

play25:54

free alternative to this is YouTube

play25:56

which of course I'm quite biased because

play25:58

I am on YouTube I think there are a lot

play26:00

of smart data scientists on this

play26:02

platform who can give you so much

play26:04

information so subscribing to a few

play26:06

channels is always a good idea and the

play26:08

last thing is following experienced data

play26:10

science leaders on other platforms again

play26:12

mainly Twitter and Linkedin those are

play26:14

excellent resources in order to keep you

play26:16

on The Cutting Edge and there you have

play26:18

it the best freaking road map on this

play26:21

platform and yes this year I'm talking

play26:23

my so don't forget to subscribe to

play26:25

the newsletter to get written

play26:27

resources to everything that I've talked

play26:29

about and at this stage you might be

play26:30

feeling a little bit intimidated

play26:32

wondering if you have what it takes to

play26:34

become a data scientist I have this

play26:36

video over here that addresses whether

play26:38

you are too dumb to be a data scientist

play26:41

so click on screen now
