What are MCAR, MAR, and MNAR? 🧩

Data-Centric AI Community
5 Jun 2023 · 01:25

Summary

TLDR: The video discusses three mechanisms of missing data in research: Missing Completely at Random (MCAR), Missing at Random (MAR), and Missing Not at Random (MNAR). MCAR occurs when missingness is unrelated to any data values, observed or unobserved, as when a respondent accidentally skips a survey question. MAR means the missingness can be explained by observed variables, such as reporting rates for smoking habits differing by participant age. Finally, MNAR means the reason a value is missing is related to that value itself, such as heavy smokers being less forthcoming about how much they smoke.
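The three mechanisms can also be stated compactly in probability notation. The symbols below are not from the video; they are the usual textbook convention and are assumed here for illustration: R is the indicator of which values are missing, Y_obs the observed part of the data, and Y_mis the part that is missing.

```latex
\begin{align*}
\text{MCAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R) \\
\text{MAR:}\quad  & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) = P(R \mid Y_{\text{obs}}) \\
\text{MNAR:}\quad & P(R \mid Y_{\text{obs}}, Y_{\text{mis}}) \text{ still depends on } Y_{\text{mis}}
                    \text{ after conditioning on } Y_{\text{obs}}
\end{align*}
```

Read this way, MCAR is the special case of MAR in which missingness does not depend on the observed data either.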

Takeaways

  • 😀 There are three types of missing data to consider.
  • 😀 Missing Completely at Random (MCAR) means missingness is unrelated to any data values, observed or unobserved.
  • 😀 An example of MCAR is unintentionally skipping a survey question.
  • 😀 Missing at Random (MAR) indicates missingness can be explained by observed data features.
  • 😀 In a tobacco study, younger participants may report their smoking habits more often than older ones, so missingness depends on observed age, illustrating MAR.
  • 😀 Missing Not at Random (MNAR) means missingness is related to the missing data itself.
  • 😀 In the tobacco study, heavier smokers may choose not to disclose their smoking habits, demonstrating MNAR.
  • 😀 Understanding the type of missingness is crucial for data analysis.
  • 😀 Each type of missing data requires different handling methods for accurate analysis.
  • 😀 Accurate handling of missing data is essential for drawing valid conclusions from studies.
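The tobacco-study examples above can be made concrete with a small simulation. This is a minimal sketch on synthetic data, not anything shown in the video: the relationship between age and cigarettes per day, the missingness probabilities, and the names `simulate` and `observed_mean` are all assumptions chosen for illustration.

```python
import random
from statistics import mean

def simulate(n=20000, seed=0):
    """Build hypothetical (age, cigarettes_per_day) rows and three masked copies."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.randint(18, 80)
        # Assumption for illustration only: older participants smoke more on average.
        cigs = max(0.0, rng.gauss(5 + 0.2 * (age - 18), 4))
        rows.append((age, cigs))

    def mask(prob_missing):
        # Replace the smoking value with None using a row-specific probability.
        return [(a, None if rng.random() < prob_missing(a, c) else c) for a, c in rows]

    mcar = mask(lambda a, c: 0.3)                     # same chance for every row
    mar = mask(lambda a, c: 0.1 if a < 40 else 0.5)   # depends only on observed age
    mnar = mask(lambda a, c: 0.6 if c > 15 else 0.1)  # depends on the hidden value itself
    return rows, mcar, mar, mnar

def observed_mean(data):
    """Mean of the smoking values that were actually reported."""
    return mean(c for _, c in data if c is not None)

rows, mcar, mar, mnar = simulate()
true_mean = mean(c for _, c in rows)
print(f"true mean     : {true_mean:.2f}")
print(f"MCAR observed : {observed_mean(mcar):.2f}")
print(f"MAR observed  : {observed_mean(mar):.2f}")
print(f"MNAR observed : {observed_mean(mnar):.2f}")
```

Under these assumptions the MCAR mean stays close to the true mean, while the MAR and MNAR means come out too low. The difference is that the MAR bias can in principle be corrected using age, which is observed, whereas the MNAR bias depends on values nobody recorded.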

Q & A

  • What are the three types of missing data discussed in the transcript?

    -The three types of missing data are missing completely at random, missing at random, and missing not at random.

  • What does 'missing completely at random' mean?

    -It means that the missingness of data is completely unrelated to any observed or unobserved values in the dataset.

  • Can you give an example of missing completely at random?

    -An example is when a survey respondent unintentionally skips a question, resulting in missing data that has no relation to other data points.

  • What is meant by 'missing at random'?

    -Missing at random indicates that the missingness can be explained by some observed features in the dataset.

  • How might age influence missing data in a tobacco study?

    -In a tobacco study, younger participants might report their smoking habits more often than older participants, so whether a value is missing depends on age, an observed variable, rather than on the smoking value itself.

  • What does 'missing not at random' imply?

    -It implies that the reason for the missing data is related to the value of the missing data itself.

  • Can you provide an example of missing not at random in the context of smoking data?

    -In a tobacco study, participants who smoke more might be less likely to disclose their smoking habits, creating a missing not at random situation.

  • Why is it important to understand the type of missing data?

    -Understanding the type of missing data is crucial for selecting appropriate methods for handling it and for ensuring the validity of the analysis.

  • What impact does missing data have on research outcomes?

    -Missing data can lead to biased results and inaccurate conclusions if not addressed properly.

  • How can researchers address missing data effectively?

    -Researchers can use various techniques such as imputation, weighting, or sensitivity analysis to address missing data based on its type.
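As one illustration of the imputation approach mentioned in the last answer, the sketch below fills missing smoking values with the mean of reported values in the same age band. It is a simplified example on synthetic MAR data, not a method from the video: the data-generating assumptions and the helper names `make_mar_data`, `age_band`, and `impute_by_group` are hypothetical.

```python
import random
from collections import defaultdict
from statistics import mean

def make_mar_data(n=20000, seed=1):
    """Hypothetical tobacco survey where missingness depends only on observed age."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        age = rng.randint(18, 80)
        cigs = max(0.0, rng.gauss(5 + 0.2 * (age - 18), 4))  # assumed relationship
        p_missing = 0.1 if age < 40 else 0.5                   # assumed MAR mechanism
        reported = None if rng.random() < p_missing else cigs
        data.append((age, cigs, reported))
    return data

def age_band(age, width=10):
    """Group ages into bands of `width` years (18-19, 20-29, 30-39, ...)."""
    return age // width

def impute_by_group(data):
    """Fill each missing value with the mean of reported values in the same age band."""
    observed = defaultdict(list)
    for age, _, reported in data:
        if reported is not None:
            observed[age_band(age)].append(reported)
    band_mean = {band: mean(vals) for band, vals in observed.items()}
    return [reported if reported is not None else band_mean[age_band(age)]
            for age, _, reported in data]

data = make_mar_data()
true_mean = mean(actual for _, actual, _ in data)
complete_case = mean(r for _, _, r in data if r is not None)
group_imputed = mean(impute_by_group(data))
print(f"true mean            : {true_mean:.2f}")
print(f"complete cases only  : {complete_case:.2f}")
print(f"imputed by age band  : {group_imputed:.2f}")
```

Conditioning on age works here only because the simulated missingness depends on age alone. Under MNAR the same procedure would still be biased, which is where sensitivity analysis comes in. Single mean imputation also understates the variability of the data, so it is better suited to illustrating the idea than to final inference.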
