Python Pandas Tutorial 5: Handle Missing Data: fillna, dropna, interpolate

codebasics
17 Feb 2017 · 22:07

Summary

TLDR: This tutorial explores handling missing data in pandas, a Python library. It demonstrates the fillna, interpolate, and dropna methods on a dataset of NYC weather data with missing values. The video walks through converting string dates to a datetime index, replacing missing values with specified values or with forward/backward fill, and using interpolation for estimates. It also covers filling along an axis, the limit parameter for fills, and inserting missing dates by reindexing.

Takeaways

  • Handling missing data in pandas is crucial when working with datasets that have incomplete values.
  • The tutorial uses New York City's weather data as an example to demonstrate handling missing data.
  • Converting a string date column to a datetime column is done with the 'parse_dates' argument of 'read_csv'.
  • Setting a column as the DataFrame index is done with 'set_index'; pass 'inplace=True' to modify the DataFrame in place, otherwise a new DataFrame is returned.
  • Missing values can be handled using methods like 'fillna', 'interpolate', and 'dropna'.
  • 'fillna' can replace all NaN values with a single specified value, or take a dictionary that maps column names to fill values.
  • The 'ffill' method in 'fillna' carries forward the previous row's value (here, the previous day's) to fill missing data.
  • 'bfill' is another method in 'fillna' that uses the next row's value (here, the next day's) to fill missing data.
  • Interpolation methods such as linear and time can provide better estimates for missing values.
  • The 'dropna' method can be used to drop rows or columns with missing values, with options to specify conditions.
  • Missing dates can be inserted into the DataFrame using 'date_range' and 'reindex'.

Q & A

  • What is the main topic of the tutorial?

    -The main topic of the tutorial is how to handle missing data in pandas, a Python library for data analysis.

  • What kind of data does the CSV file in the tutorial contain?

    -The CSV file contains New York City's weather data with some missing values; it also has no rows at all for 2nd and 3rd January.

  • What are the three methods covered in the tutorial for dealing with missing data in pandas?

    -The three methods covered are fillna, interpolate, and dropna.

  • Why might the tutorial recommend converting a string column to a date column?

    -Converting a string column to a date column allows for better data manipulation and analysis, especially when setting the date as an index for a DataFrame.

  • What does the fillna method do in pandas?

    -The fillna method in pandas is used to replace missing values (NaNs) with a specified value or a method for estimation.

  • How can you specify different fill values for different columns using the fillna method?

    -You can specify different fill values for different columns by passing a dictionary to the fillna method, where the keys are the column names and the values are the fill values.

  • What does the forward fill method do when dealing with missing data?

    -The forward fill method carries forward the value from the previous day's non-missing data to fill in the missing values.

  • What is the purpose of the 'limit' parameter in the fillna method?

    -The 'limit' parameter in the fillna method restricts the number of consecutive NaNs to be filled with the specified fill value.

  • What is interpolation and how is it used in pandas to handle missing data?

    -Interpolation is a method used to estimate intermediate values between two known data points. In pandas, the interpolate method can be used to fill missing values with estimated values based on different interpolation methods like linear, quadratic, or time-based.

  • How can you drop rows with missing data in pandas?

    -You can drop rows with missing data in pandas using the dropna method. You can specify parameters like 'how' to determine if rows with any or all missing values should be dropped, and 'thresh' to define the minimum number of non-NA values required to keep a row.

  • What is the process of re-indexing in pandas and why might you need to do it?

    -Re-indexing in pandas is the process of conforming a DataFrame to a new set of labels for its index. You might need to re-index if you want to insert missing dates or align the data with a complete date range.

Outlines

00:00

Handling Missing Data in Pandas

This paragraph introduces the tutorial on managing missing data in pandas, a Python library for data analysis. The script uses a CSV file with New York City's weather data, which contains missing values. The primary focus is on three methods to address these missing values: filling with a specific value, interpolation, and dropping any rows with missing data. The tutorial begins with setting up a Python environment, typically using Jupyter Notebook, and importing the pandas library. It then demonstrates how to read a CSV file into a DataFrame and convert a string column into a date column, setting it as the DataFrame's index.
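The loading steps described above can be sketched as follows. This is a minimal illustration, not the video's exact notebook: the inline CSV text is a hypothetical stand-in for the tutorial's file, and the column names (`day`, `temperature`, `windspeed`, `event`) and values are assumptions modeled on what the transcript describes.

```python
import io

import pandas as pd

# Hypothetical stand-in for the tutorial's CSV file (assumed names and values).
csv_text = """day,temperature,windspeed,event
1/1/2017,32,6,Rain
1/4/2017,,9,Sunny
1/5/2017,28,,Snow
1/6/2017,,7,
1/7/2017,32,,Rain
1/8/2017,,,Sunny
1/9/2017,,,
1/10/2017,34,8,Cloudy
1/11/2017,40,12,Sunny
"""

# parse_dates converts the "day" strings into datetime values while reading.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["day"])

# set_index returns a new DataFrame; reassigning has the same effect as
# calling it with inplace=True.
df = df.set_index("day")

print(type(df.index[0]))  # each index label is now a pandas Timestamp
print(df.isna().sum())    # count of missing values per column
```

With a real file, `io.StringIO(csv_text)` would simply be replaced by the file path.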

05:01

Customizing Fill Values for Missing Data

The second paragraph discusses advanced techniques for filling in missing data with more accurate guesses. It explains how to use the `fillna` method with a dictionary to specify different fill values for different columns. The example given replaces missing temperature and wind speed values with zero, but uses 'no event' for the event column. The paragraph also touches on the limitations of using zero as a fill value and introduces the concept of forward filling, which carries the previous day's value to fill missing data points.
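A small sketch of the two `fillna` styles described above, using made-up data (the values and the column names are assumptions, not the tutorial's actual file). The scalar fill is applied only to the numeric columns here, since a 0 in a text column such as `event` carries no meaning.

```python
import numpy as np
import pandas as pd

# Illustrative data (assumed values, not the tutorial's CSV).
df = pd.DataFrame(
    {
        "temperature": [32.0, np.nan, 28.0, np.nan],
        "windspeed": [6.0, 9.0, np.nan, 7.0],
        "event": ["Rain", "Sunny", "Snow", None],
    },
    index=pd.to_datetime(["2017-01-01", "2017-01-04", "2017-01-05", "2017-01-06"]),
)

# One scalar for every missing cell in the selected (numeric) columns.
numeric_filled = df[["temperature", "windspeed"]].fillna(0)

# Per-column fill values: keys are column names, values are the replacements.
filled_by_column = df.fillna(
    {"temperature": 0, "windspeed": 0, "event": "no event"}
)

print(filled_by_column)
```

Neither call modifies `df`; each returns a new DataFrame, which matches how the video assigns the result to a new variable.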

10:02

Forward and Backward Filling with Limitations

This section delves deeper into the forward fill method, explaining how it can be used to propagate values from the previous day. It also introduces the backward fill method, which does the opposite by copying values from the next day. The paragraph highlights the potential issues with these methods, such as incorrect data representation, and discusses the 'axis' parameter, which allows for horizontal filling. Additionally, it introduces the 'limit' parameter, which can restrict the number of times a value is propagated to handle missing data points.
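The fill directions and the `limit` parameter can be illustrated with a short sketch. The video calls `fillna(method="ffill")`; recent pandas versions deprecate the `method` argument of `fillna` in favour of the dedicated `ffill()` and `bfill()` methods, so this sketch uses those. The data values are assumptions chosen to echo the transcript.

```python
import numpy as np
import pandas as pd

# Illustrative temperatures with a two-day gap (assumed values).
temps = pd.Series(
    [32.0, np.nan, np.nan, 34.0],
    index=pd.to_datetime(["2017-01-07", "2017-01-08", "2017-01-09", "2017-01-10"]),
    name="temperature",
)

forward = temps.ffill()              # copy the last valid value downward
backward = temps.bfill()             # copy the next valid value upward
forward_once = temps.ffill(limit=1)  # fill at most one consecutive NaN per gap

# axis="columns" fills across each row instead of down each column.
row_wise = pd.DataFrame({"temperature": [np.nan, 28.0], "windspeed": [9.0, np.nan]})
filled_across = row_wise.ffill(axis="columns")

print(pd.DataFrame({"raw": temps, "ffill": forward, "bfill": backward,
                    "ffill_limit_1": forward_once}))
print(filled_across)
```

Filling across columns only makes sense when neighbouring columns hold comparable quantities; copying a wind speed into a temperature cell, as in the video's demonstration, shows the mechanics rather than a sensible estimate.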

15:04

Interpolation for Estimating Missing Values

The fourth paragraph covers the interpolation method in pandas, which provides a more sophisticated way to estimate missing values. It describes linear interpolation and how it can be used to fill in missing temperature values by calculating intermediate values based on surrounding data points. The tutorial also mentions other interpolation methods such as quadratic, cubic, and piecewise polynomial, and introduces the 'time' method, which takes the date into account for a more accurate estimate.
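The difference between the default linear method and the 'time' method can be seen in a three-row sketch (the values are assumptions that mirror the transcript's example: 32 on 1 January, a gap on 4 January, 28 on 5 January). Linear interpolation treats the rows as evenly spaced and returns the midpoint, 30. The time method weights by elapsed days: 4 January is 3 of the 4 days from the 1st to the 5th, so the estimate is 32 + (28 - 32) * 3/4 = 29.

```python
import numpy as np
import pandas as pd

# Assumed values mirroring the transcript's 1st/4th/5th January example.
temps = pd.Series(
    [32.0, np.nan, 28.0],
    index=pd.to_datetime(["2017-01-01", "2017-01-04", "2017-01-05"]),
    name="temperature",
)

# Default method="linear": ignores the index and assumes equal spacing.
linear = temps.interpolate()

# method="time": requires a DatetimeIndex and weights by elapsed time.
by_time = temps.interpolate(method="time")

print(linear.iloc[1], by_time.iloc[1])
```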

20:08

πŸ—‘οΈ Dropping Rows with Missing Data

This paragraph discusses the use of the `dropna` method to remove rows that contain missing values. By default it drops every row with at least one missing value; the 'how' parameter ('any' or 'all') controls whether a row is dropped when any value is missing or only when all values are missing. The 'thresh' parameter (spoken as "threshold" in the video) sets the minimum number of non-missing values a row must contain to be kept. The paragraph concludes with a method for inserting missing dates into the DataFrame by reindexing with a complete date range.
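A compact sketch of the `dropna` variants, on made-up rows (assumed values and column names) chosen so that each option gives a different result. Note that the index is not counted as a value when rows are evaluated.

```python
import numpy as np
import pandas as pd

# Illustrative rows: complete, one valid value, fully empty, complete.
df = pd.DataFrame(
    {
        "temperature": [32.0, np.nan, np.nan, 34.0],
        "windspeed": [6.0, 7.0, np.nan, 8.0],
        "event": ["Rain", None, None, "Cloudy"],
    },
    index=pd.to_datetime(["2017-01-01", "2017-01-06", "2017-01-09", "2017-01-10"]),
)

complete_rows = df.dropna()             # default how="any": drop rows with any NaN
not_all_missing = df.dropna(how="all")  # drop only rows where every value is NaN
at_least_one = df.dropna(thresh=1)      # keep rows with >= 1 non-NaN value
at_least_two = df.dropna(thresh=2)      # keep rows with >= 2 non-NaN values

print(len(complete_rows), len(not_all_missing), len(at_least_one), len(at_least_two))
```

On this data `how="all"` and `thresh=1` keep the same rows, since "not entirely empty" and "at least one valid value" are the same condition.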

Keywords

Pandas

Pandas is an open-source Python library used for data manipulation and analysis. It provides data structures and functions needed to manipulate structured data, making it a fundamental tool for working with structured data in Python. In the video, Pandas is used to handle missing data in a dataset containing New York City's weather data.

Missing Data

Missing data refers to the absence of values in a dataset, which can occur for various reasons such as data collection errors or data entry omissions. The video discusses how to handle missing data in Pandas, which is crucial for maintaining data integrity and ensuring accurate analysis.

Fill NA

The `fillna` method in Pandas replaces missing values (NA/NaN) in a DataFrame with specified values. The video demonstrates using `fillna` to replace missing values with zeros or other meaningful values, depending on the context of the data.

Interpolate

Interpolation is a method used to estimate missing data points based on the surrounding known data points. In the video, the 'interpolate' method in Pandas is shown as a way to estimate temperatures for missing dates by using linear or time-based interpolation, providing a more accurate representation of the data.

Drop NA

The `dropna` method in Pandas removes rows or columns that contain missing values. The video explains how to use `dropna` either to remove all rows with any missing values or to be more selective by using parameters like 'how' and 'thresh' to determine which rows to drop.

CSV File

A CSV (Comma-Separated Values) file is a simple file format used to store tabular data, where each line represents a row and commas separate the values in each row. The video script mentions a CSV file containing weather data with missing values as an example dataset.

DataFrame

A DataFrame is a central data structure in Pandas, which is similar to a spreadsheet or SQL table, with rows and columns. The video script uses the term to refer to the structured data that the tutorial is working with, including operations to handle missing values within the DataFrame.

Index

In Pandas, an index is a mechanism to identify a particular location or row in a DataFrame. The video script describes setting the 'day' column as the index to organize the DataFrame, which is important for operations like reindexing and filling missing dates.

Reindex

Reindexing in Pandas is the process of conforming a DataFrame or Series to a new index. The video script mentions reindexing to insert missing dates into the DataFrame, ensuring that all dates are represented even if they originally had missing data.
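Inserting the missing dates might look like the sketch below (the temperatures are assumed values). The video wraps the range in `pd.DatetimeIndex`; `pd.date_range` already returns a `DatetimeIndex`, so it is passed directly here. As the video discovers, `reindex` has no `inplace` option, so the result is assigned back.

```python
import pandas as pd

# Illustrative frame with no rows for 2nd and 3rd January (assumed values).
df = pd.DataFrame(
    {"temperature": [32.0, 30.0, 28.0]},
    index=pd.to_datetime(["2017-01-01", "2017-01-04", "2017-01-05"]),
)

# Every calendar day in the span; the default frequency is daily.
full_range = pd.date_range("2017-01-01", "2017-01-05")

# reindex returns a new DataFrame; the added dates get NaN in every column.
df = df.reindex(full_range)

print(df)
```

The new rows can then be filled with any of the techniques above, for example `df.interpolate(method="time")`.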

Linear Interpolation

Linear interpolation is a method of curve fitting using linear polynomials to estimate unknown points between known points. In the context of the video, linear interpolation is used to estimate missing temperature values by calculating intermediate values based on the surrounding known temperatures.

Time-based Interpolation

Time-based interpolation is a specific type of interpolation that takes into account the time component of the data when estimating missing values. The video script explains how using time as a method in the 'interpolate' function can provide a more accurate estimate for the missing temperature by considering the dates.

Highlights

Tutorial overview on handling missing data in pandas.

Introduction to a CSV file with missing values in NYC's weather data.

Explanation of three primary methods to deal with missing values: fillna, interpolate, and dropna.

Starting a new Jupyter Notebook for the tutorial.

Importing pandas and reading the CSV file into a DataFrame.

Converting the 'day' column to a date column using pandas' `parse_dates` argument.

Setting the 'day' column as the DataFrame index.

Using `fillna()` to replace NaN values with a specified value.

Demonstration of replacing NaN values with different values for different columns.

Using forward fill to carry forward the previous day's value for missing data.

Explaining backward fill to copy the next day's value for missing data.

Introduction of the `axis` parameter to control the direction of value filling.

Utilizing the `limit` parameter to restrict the number of times a value is carried forward.

Interpolation method to estimate missing values based on surrounding data points.

Different interpolation methods available in pandas such as linear, quadratic, and time.

Using `dropna()` to remove rows with missing values based on different conditions.

Parameter `how` to specify dropping rows with all NaN values or at least one NaN.

Parameter `thresh` to determine the minimum number of non-NaN values required to keep a row.

Method to insert missing dates into the DataFrame by reindexing.

Error during reindexing (`reindex` does not accept `inplace`) and the fix: assign the result back to the DataFrame.

Conclusion of the tutorial with a teaser for the next part focusing on additional techniques for handling missing data.

Transcripts

play00:00

difference in this tutorial we are going

play00:01

to look at how to handle missing data in

play00:04

pandas now often when you are

play00:07

downloading data from the internet or let's

play00:09

say getting it from any other source it

play00:12

might have missing values as shown in

play00:15

this CSV file this file contains New

play00:19

York City's weather data and you can see

play00:22

that some of these cells are not having

play00:25

any value in it also it is missing the

play00:29

data for 2nd and 3rd January ok so when

play00:34

you're processing this kind of

play00:36

information in pandas we will see how

play00:40

you can deal with these missing values

play00:43

using the fillna interpolate and dropna

play00:47

methods I have more tutorials on how to

play00:51

handle missing data but this is just to

play00:53

start and we are only covering these

play00:54

three methods ok so as usual I'm going

play00:57

to start my Jupyter Notebook now if you

play01:02

don't know what Jupyter Notebook is I

play01:04

have a separate tutorial on it but you

play01:06

can also use any IDE of your choice such

play01:11

as PyCharm or Notepad++

play01:14

whatever you prefer

play01:16

I like Jupyter Notebook because it is

play01:18

great with data visualization ok so I'm

play01:21

going to click on new and start a new

play01:24

Python notebook and the first thing we

play01:28

do as usual is import pandas as pd and

play01:33

then I will read the CSV file that I

play01:41

just showed you ok and print the data

play01:45

frame the star that you were seeing here

play01:48

means it was processing it so it read

play01:52

this csv file successfully into a data

play01:56

frame now for the purpose of this

play01:59

tutorial i want to make my day a date

play02:04

column so let me show you what i mean by

play02:07

that so when i

play02:10

then you normally read CSV like this

play02:13

what if what it's gonna do is it's gonna

play02:16

read day as a string column you can see

play02:20

it is a string so whatever you are

play02:23

seeing here this is nothing but but a

play02:25

string it's not an excel file okay it's

play02:27

a CSV file so I want to first convert

play02:31

that column into a date column and for

play02:36

doing that you have to use the parse_dates

play02:41

argument and in that you can say that

play02:44

parse the day column as a date type okay and

play02:52

when you do that let's first print it

play02:56

you can see that it converted it now by

play03:00

looking at it you cannot probably figure

play03:02

out the type so what I do usually is

play03:05

just so you can see that now the type is

play03:11

timestamp okay so we're good all right

play03:14

so I got day as a date/time column now I

play03:20

want to make this an index for my data

play03:24

frame and in order to do that you can

play03:27

just say df.set_index with day as your

play03:31

index and inplace equal to True

play03:35

remember you have to do inplace equal

play03:37

to true otherwise it's not gonna modify

play03:39

the original data frame but instead it

play03:42

will return a new data frame okay and

play03:44

when you do that you got day as your index

play03:48

now if you have NA values and if you

play03:51

are processing this information then you

play03:54

have to do special handling you have to

play03:55

check like if value equal to na then do

play03:58

the special thing okay often it makes

play04:01

sense to replace these NA values with

play04:04

some meaningful value or a guess okay so

play04:09

in this case let's say I want to replace

play04:12

all NaN values with some other value

play04:17

okay so the first method that we are

play04:21

going to cover

play04:23

is fillna okay so what you can do is

play04:27

df.fillna okay and in brackets

play04:32

you can pass the value that you want NA

play04:36

to replace with okay and I'm not going

play04:41

to modify my original data frame but

play04:45

instead to get this back into a new data

play04:48

frame and when I run it you can see that

play04:52

all these NaN values that it had it

play04:56

replaced them with zero value you can

play05:00

see that everything is everything that

play05:02

was NA is zero now okay so this is good

play05:08

now sometimes having 0 is not probably

play05:14

the best guess so you want to come up

play05:17

with a better guess okay for example

play05:22

here in the case of event what does zero

play05:27

mean right so maybe you want to use

play05:31

fillna but you don't want to fill the entire

play05:35

data frame with this value maybe you

play05:38

want to specify different values for

play05:41

different columns okay so pandas

play05:44

supports that also so the way you do it

play05:48

is again I am going to receive it a new

play05:51

data frame and inside the fillna method

play05:55

now you can pass it a dictionary okay now

play05:59

what does this dictionary contain so

play06:01

dictionary contains name of the column

play06:05

okay now in temperature column let's say

play06:08

I want to replace all NA values with

play06:10

zero and in my day not day but wind

play06:15

speed column I want to replace it with

play06:18

again zero but my event I want to say no

play06:25

event okay

play06:28

and then print new data frame now as you

play06:32

can see here the temperature and wind

play06:35

speed is replaced with zero as you can

play06:38

see here but the event now I have no

play06:42

event okay so you can just use this

play06:47

dictionary to fill specific values for a

play06:50

specific column but still I am not happy

play06:54

with how I handle missing values here

play06:57

because see if you are calculating a

play06:59

mean or something for this temperature

play07:01

then mean is gonna come really horrible

play07:05

and if someone looks at the data they'll think

play07:07

okay on 1st January it was 32

play07:10

temperature and the second January it

play07:12

was zero Fahrenheit right some someone

play07:15

might think this the temperature went

play07:17

down by so much but in reality we

play07:22

actually don't know what was a

play07:24

temperature and all we are trying to do

play07:27

is come up with some estimate okay so

play07:31

then the other way of getting better

play07:34

estimate would be just to carry forward

play07:37

the temperature on 1st January here ok

play07:40

so whatever was the temperature on the

play07:42

previous day you carry forward and you

play07:45

do it in a similar way for other two

play07:48

data types okay so for that you can use

play07:52

again your fillna method okay but

play07:59

here what you will do is use a method

play08:02

equal to forward fill forward fill you

play08:07

can specify by typing ffill ffill

play08:10

means if I have an NA value then just

play08:16

carry forward previous day's value okay

play08:20

so

play08:22

let's print that okay cool

play08:26

now you can see that it just carry

play08:30

forward the value from the previous day

play08:32

so 4th January had an NA value but now

play08:35

it carried forward 1st January's

play08:38

value here so this looks little better

play08:41

than just having zero value okay same

play08:45

thing on 9th January I I didn't have any

play08:49

event so you look at 9 January now it is

play08:53

sunny because you carry forward previous

play08:56

days value you can also use backward

play09:00

fill meaning carry forward next day's

play09:04

value it's not really carry forward but

play09:06

you're copying instead of copying

play09:09

previous day's value you're copying next

play09:11

day's values so if you do that what's

play09:14

gonna happen is now 4th January has a

play09:17

value from 5th January so now it copied

play09:20

value from 5th to 4th

play09:22

ok so you can use the bfill method also

play09:27

now if you go to pandas documentation

play09:31

you can just Google pandas fillna

play09:35

it's gonna show the documentation for

play09:39

fillna and you can see that we used

play09:43

backfill bfill and ffill you can also

play09:47

use pad or like backfill okay

play09:51

so you can use all of that you also have

play09:56

this other argument called

play09:58

axis so let's see what axis can do for

play10:02

us so here if I say

play10:09

axis okay axis equal to columns when

play10:20

you do axis equal to columns what it is

play10:23

doing now is let me open this CSV file

play10:26

here so here previously when we were

play10:30

using backfill it was copying values

play10:34

vertically like it will go vertically

play10:36

and copy value from here to here but now

play10:39

with axis equal to columns it's copying

play10:43

values horizontally so it's going row by

play10:46

row and copying value from previous cell

play10:49

so here look at here it was NA

play10:52

and it copied that nine in to

play10:55

temperature so you can see this nine is

play10:57

copied here then the snow was copied

play11:01

here so you can see this was snow and

play11:03

this is also snow now so you can based

play11:07

on what kind of data you are dealing

play11:08

with you can copy it either horizontally

play11:11

or vertically okay now if you check the

play11:16

documentation of fillna it has another

play11:18

interesting property or argument called

play11:22

limit so let me show you what limit can

play11:25

do for you so here I am going to replace

play11:28

this with forward fill and just kind of

play11:32

show you so when you have forward fill

play11:35

let's say in the case of 7 January I had

play11:38

32 and it will just copy 32 to both of

play11:42

these missing data points okay now let's

play11:47

say due to some reason I want to carry

play11:49

forward this value only once okay so I

play11:54

want to copy it only here but not here

play11:56

in that case you can specify limit and

play11:59

you can say my limit is 1 as far as

play12:04

copying my valid value to missing value

play12:07

is concerned okay when you run this you

play12:10

can see that now 7 January value was 32

play12:15

it copied that to 8 but 9 still has

play12:20

NA because my limit is one I can copy

play12:23

it only once okay

play12:25

same thing here 6 January wind speed was

play12:29

7 miles per hour and it copied it to

play12:32

seventh so you can see that 7 January

play12:35

now also has that value but 8 & 9

play12:38

January has na ok

play12:41

if you change limit to be 2 you will

play12:44

notice that this 7 is copied here 2

play12:48

times right 7 & 7 but my 9th January is

play12:52

still NA ok so this is how you can use

play12:57

your limit parameter

play12:58

okay now I'm still not happy with the

play13:01

guess that I'm making because if the

play13:03

temperature on 1st January was 32 and on

play13:07

5th it was 28 it is likely that

play13:10

temperature on 4th was in between ok

play13:14

I mean it's not always guaranteed but

play13:17

that something you would consider a

play13:20

better guess okay so we have a method

play13:25

called interpolate in pandas so let me

play13:28

just create a new cell and by the way I

play13:31

am using the shortcuts you can you can

play13:33

you access all the shortcuts here

play13:36

so when you say insert cell below the

play13:39

shortcut is B so that's what I'm using

play13:41

so I'm here I'm pressing B it's creating

play13:43

a new cell for me okay

play13:45

so here df.interpolate okay

play13:53

so when you do df.interpolate it's

play13:57

gonna interpolate the values so if you

play14:00

look at your new data frame here you will

play14:03

notice that now for the 4th January it

play14:06

came up with a better guess which is a

play14:09

linear interpolation so if you have

play14:12

studied linear interpolation or you

play14:16

basically you will come up with this

play14:18

value 30 okay so it was 32 28 and you're

play14:22

gradually transitioning and and having

play14:26

this intermediate data point okay so

play14:29

this is probably a better guess

play14:32

okay and it did the same thing for these

play14:37

two cells also you can see that 32 34

play14:41

and here is 32.66 33.33 so

play14:46

it's somehow coming up with this it was

play14:50

33 point of the day so it's using

play14:52

interpolation linear interpolation and

play14:56

coming up with this values okay

play14:58

so again I'm going to go ahead and check

play15:01

the documentation for interpolate so in

play15:04

search bar you can type in interpolate

play15:06

and look at data frame dot interpolate

play15:09

documentation and you will notice that

play15:12

in a method if you don't specify

play15:16

anything it is by default linear but you

play15:18

can use so many other methods you can

play15:22

use quadratic cubic and piecewise

play15:25

polynomial there are so many methods to

play15:30

specify as far as your interpolation is

play15:33

concerned okay so I'm going to use time

play15:37

now so let's see what time can do for us

play15:40

so here before we do that you will see

play15:44

that using linear interpolation it came

play15:47

up with the middle value okay 32 and 28

play15:50

the middle value is 30 but look at the

play15:52

date okay the date is not in the middle okay

play15:55

then it is more near towards fifth

play15:58

January okay so I'm missing second and

play16:00

third January so 30 still doesn't look

play16:03

like a better guess it should be

play16:05

relatively near to this value 28 so when

play16:09

you use method equal to time you can see

play16:15

that now it came up with value 29

play16:17

because now it is considering this time

play16:23

this date also in coming up with this

play16:26

value it is realizing that 4th January

play16:28

is near to fifth hence the value should

play16:31

not be exactly in the middle but it should be

play16:34

more near to this value okay so this

play16:37

feature I found to be pretty powerful

play16:41

whenever you are making a guess or

play16:44

estimate for missing

play16:46

values okay so far so good

play16:50

sometimes based on the situation I just

play16:52

want to let's say drop all the rows with

play16:57

NA values in that case you can use this

play17:01

method called dropna so you can say df

play17:06

dot dropna okay and I'm just printing the

play17:12

new data frame so you can see that in my

play17:15

excel sheet whichever row had any NA

play17:18

value okay it dropped all of them so now

play17:21

I got only three rows which has a valid

play17:25

contained in all of the columns okay

play17:28

sometimes you want to drop the row if it

play17:34

has at least one NA

play17:37

okay so here what it is doing is

play17:40

activities doing that so here if you

play17:43

have at least one NA it is dropping it

play17:46

but let's say I want to drop only if it

play17:49

has all NA so for example I want to

play17:51

drop this row but I still want to

play17:55

preserve these rows because it has at

play17:58

least some data okay so for that you can

play18:01

use how parameter and you can say how is

play18:04

equal to all so now you don't see 9th

play18:10

January here in this data frame because

play18:12

it had all the values to be NA it has

play18:16

this date but this date is a index so it

play18:19

is not considering it is not considering

play18:21

that in the process of dropping okay and

play18:25

these rows have some NA

play18:29

cells but not everything is NA so it's

play18:31

not dropping that okay now what if I

play18:34

want to go by non-NA values so let's say

play18:37

I want to say that if I have at least

play18:41

one non-NA value then keep that row and

play18:46

drop any other rows so for that you can

play18:50

use the thresh parameter so you can say

play18:54

thresh equal to one thresh equal to

play18:56

one means if I have at least one non-NA

play18:59

value then keep the row okay so when you

play19:02

run that see what happens is again the

play19:05

same result 9th January got dropped

play19:08

because it doesn't have any valid value

play19:11

everything was NA

play19:12

okay now let's so it kept the six

play19:16

January value because it has at least

play19:18

one valid value so if I change thresh

play19:22

to be two what it means is all right so

play19:25

let's run this okay

play19:27

nine when I sit that's what you go to do

play19:29

it dropped this particular you can see

play19:32

it dropped that particular row because

play19:35

two means I need two valid values in

play19:40

order to keep the row but I don't have

play19:43

two valid values I have only one value

play19:46

the date is not counted because it is

play19:48

index okay so if I have one value I'm

play19:51

going to drop it okay so you can use

play19:54

thresh to drive your dropping

play19:58

process by number of valid values that

play20:01

you have okay last thing that we want to

play20:04

cover is how do you go about inserting

play20:07

the missing dates so I don't have 2nd

play20:09

and 3rd January here and I want to let's

play20:12

say insert those dates so for that you

play20:17

will do something like this so here you

play20:19

will create a date range and using the

play20:22

date range let's say I have a date range

play20:25

from 1st January to 11 January so first

play20:28

January to 11 January I created a date

play20:31

range so this is your date range and you

play20:34

pass that to DatetimeIndex and create

play20:37

a date time index and then you do

play20:41

re-indexing in your data frame so I'm

play20:43

saying df.reindex using that index

play20:46

and then you print your data frame again

play20:54

you have to do in place equal to true

play21:01

okay I'm getting some error here because

play21:05

reindex got an unexpected keyword argument

play21:09

okay so this is unexpected

play21:11

so let's see what's going on here okay

play21:17

so looks like reindex is not accepting

play21:20

inplace as a valid argument so what I

play21:22

have to do is df equal to df.reindex

play21:25

and when executed you'll see that

play21:29

I got 2nd and 3rd January rows now I

play21:32

have NA values but again you can use

play21:34

one of the fillna methods to fill

play21:37

them with some estimated values okay so

play21:42

that's all we had for this tutorial in

play21:45

the next tutorial we will continue on

play21:49

how to handle missing data using some

play21:53

other techniques okay until then thank

play21:57

you very much for watching and if you

play21:59

liked this tutorial please don't forget

play22:02

to give it a thumbs up below okay bye


Related Tags
Data Handling, Pandas Tutorial, Missing Values, Fill NA, Interpolate Data, CSV File, Weather Data, Data Visualization, Jupyter Notebook, Linear Interpolation