04. How to create a contingency table | SPSS Course

BIOESTADISTICO
20 Aug 2015 · 05:16

Summary

TLDR: This script explains the construction and significance of contingency tables, also known as cross-tabulations, used to analyze the relationship between two categorical variables. It uses the example of smoking habit and lung cancer to show how these tables can reveal associations, such as a higher proportion of smokers among people with lung cancer than among people without it. The script details the conventional arrangement of variables in the table, with the disease in columns and the risk factor in rows, and discusses the practical approach of building groups from diseased individuals (cases) and non-diseased individuals (controls) for comparative analysis.

Takeaways

  • 📊 Contingency tables, also known as cross-tabulation or double-entry tables, are used to analyze the relationship between two categorical variables.
  • 🚬 The example used in the script involves the relationship between smoking habit (a categorical variable) and lung cancer (another categorical variable).
  • 🔍 It is suspected that smokers have a higher frequency of lung cancer, and a contingency table is used to demonstrate this relationship.
  • 📋 The conventional presentation of a contingency table places the disease or outcome in the columns and the risk factor or independent variable in the rows.
  • 🏥 Obtaining a group of people with lung cancer is easier than finding a group of smokers or non-smokers within which to identify cancer cases.
  • 📈 The script describes how to calculate percentages for the table cells, with examples showing a higher percentage of smokers among those with lung cancer compared to those without.
  • 👥 The groups for analysis are constructed based on the presence of disease, with affected individuals called 'cases' and non-affected individuals 'controls'.
  • 🧐 The script emphasizes the ease of finding a control group without lung cancer compared to identifying cancer within groups of smokers or non-smokers.
  • 📝 The contingency table is a foundational tool for case-control study designs, where multiple characteristics are compared between affected and non-affected groups to identify potential risk factors.
  • ⚖️ The script highlights the importance of proper group selection and comparison in statistical analysis to accurately identify risk factors associated with diseases like lung cancer.
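The column percentages described above can be sketched in a few lines of Python. This is only an illustration, not the SPSS procedure shown in the video: the function name `column_percentages` and the dictionary layout are assumptions, and the two non-smoker counts (18 and 30) are derived by subtraction from the totals the video states.

```python
# Counts from the video's example, laid out by the convention described
# above: risk factor (smoking) in rows, disease (lung cancer) in columns.
# 18 and 30 are derived: 54 - 36 and 46 - 16.
table = {
    "smoker":     {"cancer": 36, "no cancer": 16},
    "non-smoker": {"cancer": 18, "no cancer": 30},
}

def column_percentages(table):
    """Express every cell as a percentage of its column total."""
    columns = list(next(iter(table.values())).keys())
    totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        r: {c: round(100 * n / totals[c], 1) for c, n in row.items()}
        for r, row in table.items()
    }

pct = column_percentages(table)
for label, row in pct.items():
    print(label, row)
# smoker {'cancer': 66.7, 'no cancer': 34.8}
# non-smoker {'cancer': 33.3, 'no cancer': 65.2}
```

Each column sums to 100%, so the two groups (with and without lung cancer) can be compared directly even though they differ in size.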

Q & A

  • What are contingency tables also known as?

    -Contingency tables are also known as double-entry tables or cross-tabulation tables.

  • What is the simplest way to construct a contingency table?

    -The simplest way to construct a contingency table is with two categorical variables, that is, a bivariate analysis, which corresponds to the relational level of research.

  • What is an example of a categorical variable?

    -An example of a categorical variable is the habit of smoking, as it categorizes individuals into those who smoke and those who do not.

  • Why is lung cancer chosen as the second variable in the contingency table example?

    -Lung cancer is chosen as the second variable because it is also a categorical variable, and there is a suspicion that it occurs more frequently in smokers than in non-smokers.

  • What is the conventional order for presenting a contingency table?

    -In a contingency table, the disease or consequence is always presented in the columns, and the risk factor or independent variable is presented in the rows.

  • Why is it easier to find a group of people with lung cancer than a group of smokers?

    -It is easier to find a group of people with lung cancer because a hospital can provide a list of such patients, whereas gathering a group of smokers and then finding out which of them have lung cancer is much harder.

  • What percentage of the 54 people with lung cancer in the example are smokers?

    -In the example, 66.7% of the 54 people with lung cancer are smokers.

  • What percentage of the 46 people without lung cancer in the example are smokers?

    -In the example, 34.8% of the 46 people without lung cancer are smokers.

  • What is the basis for constructing case-control studies mentioned in the script?

    -The basis for constructing case-control studies is to compare affected individuals (cases) with unaffected individuals (controls) on various characteristics that may be more frequent in the affected group.

  • Why are cases and controls used in the construction of contingency tables?

    -Cases and controls are used to identify risk factors by comparing the frequency of certain characteristics in groups with and without the disease.
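The figures quoted in the answers above are enough to rebuild the whole 2×2 table. The sketch below does that arithmetic in Python; the variable names are illustrative assumptions, and only the four stated counts (54, 36, 46, 16) come from the video.

```python
# Figures stated in the example
cases_total, cases_smokers = 54, 36        # people with lung cancer
controls_total, controls_smokers = 46, 16  # people without lung cancer

# Cells not stated explicitly, derived by subtraction
cases_nonsmokers = cases_total - cases_smokers
controls_nonsmokers = controls_total - controls_smokers

# Row totals (margins of the risk-factor variable)
smokers_total = cases_smokers + controls_smokers
nonsmokers_total = cases_nonsmokers + controls_nonsmokers

rows = [
    ("smoker", cases_smokers, controls_smokers, smokers_total),
    ("non-smoker", cases_nonsmokers, controls_nonsmokers, nonsmokers_total),
    ("total", cases_total, controls_total, cases_total + controls_total),
]
print(f"{'':<12}{'cancer':>8}{'no cancer':>11}{'total':>7}")
for label, with_cancer, without_cancer, total in rows:
    print(f"{label:<12}{with_cancer:>8}{without_cancer:>11}{total:>7}")
```

The row totals come out as 52 smokers and 48 non-smokers, which matches the group sizes mentioned later in the transcript.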

Outlines

00:00

📊 Introduction to Contingency Tables

This paragraph introduces contingency tables, also known as cross-tabulation or double-entry tables, which are used to analyze the relationship between two categorical variables. The example used is the relationship between smoking habits and lung cancer. The paragraph explains the construction of a contingency table with smoking as the independent variable in rows and lung cancer as the dependent variable in columns. It emphasizes the conventional presentation where the outcome (lung cancer) is in columns and the risk factor (smoking) is in rows. The paragraph also discusses the ease of obtaining a group of people with lung cancer compared to a group of smokers or non-smokers for statistical analysis. It concludes with a description of how to calculate percentages within the table and the rationale behind constructing groups based on disease status, termed as 'cases' and 'controls'.

05:02

🔍 Risk Factors in Case-Control Studies

The second paragraph delves into the concept of case-control studies, focusing on identifying risk factors that are more prevalent in the affected group compared to the non-affected group. It discusses the process of comparing multiple characteristics between these two groups to determine potential risk factors for a particular disease or condition. The paragraph highlights the importance of this comparative analysis in understanding the relationship between various factors and the occurrence of diseases.

Keywords

💡Contingency Tables

Contingency tables, also known as cross-tabulation or double-entry tables, are statistical tools used to analyze the relationship between two categorical variables. In the video, contingency tables are used to explore the connection between smoking habits and lung cancer. The script mentions constructing a table with 'smoking habit' in rows and 'lung cancer' in columns, which illustrates how these tables help in comparing how often a characteristic (smoking) appears in different groups (people with and without lung cancer).
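As a rough illustration of what a cross-tabulation does, the following Python sketch counts individual records into a rows-by-columns table. It is not how SPSS is implemented; the `crosstab` helper is hypothetical, and the per-person records are rebuilt from the video's aggregate counts, since the original dataset is not available.

```python
from collections import Counter

# One (smoking habit, lung cancer) pair per person. These records are
# reconstructed from the video's counts, not taken from a real data file.
records = (
    [("smoker", "cancer")] * 36
    + [("smoker", "no cancer")] * 16
    + [("non-smoker", "cancer")] * 18
    + [("non-smoker", "no cancer")] * 30
)

def crosstab(records):
    """Count how many records fall into each (row, column) combination."""
    counts = Counter(records)
    row_labels = sorted({r for r, _ in records})
    col_labels = sorted({c for _, c in records})
    return {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}

table = crosstab(records)
for label, row in table.items():
    print(label, row)
```

The first element of each pair becomes the row variable and the second the column variable, mirroring the rows and columns boxes of the dialog described in the video.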

💡Categorical Variables

Categorical variables are data points that represent groups or categories rather than numerical values. In the context of the video, both 'smoking habit' and 'lung cancer' are categorical variables because they classify individuals into distinct groups (smokers/non-smokers, cancer/no cancer). The video explains that contingency tables are particularly useful for analyzing relationships between categorical variables.

💡Risk Factor

A risk factor is a variable that increases the likelihood of a particular health outcome, such as disease. In the video, 'smoking' is identified as a potential risk factor for 'lung cancer.' The script discusses how a contingency table can show whether smoking is more frequent among people with lung cancer than among those without it, thereby suggesting smoking as a risk factor.

💡Descriptive Statistics

Descriptive statistics summarize and organize data to describe the main features of a dataset. In the video, the term appears as part of the SPSS menu path: the presenter goes to Analyze, then Descriptive Statistics, and selects the contingency tables option. The resulting counts and percentages describe how the two variables are distributed before any conclusion about their relationship is drawn.

💡Cases and Controls

In the context of the video, 'cases' refer to individuals with the disease (lung cancer), and 'controls' are those without the disease. The script explains that when constructing a contingency table, it's easier to start with a group of people with the disease (cases) and then look at the proportion who smoke. This approach is part of the case-control study design, which compares factors between those with and without the condition to identify potential risk factors.
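One way to see why a case-control table is read by columns is to change how many controls are recruited and watch which percentages move. The sketch below is an illustration under stated assumptions: the `percentages` helper is hypothetical, and the "doubled controls" scenario is invented for the comparison and does not appear in the video.

```python
def percentages(cases, controls):
    """Return (% smokers among cases, % smokers among controls,
    % with cancer among smokers), each rounded to one decimal."""
    col_cases = 100 * cases["smoker"] / sum(cases.values())
    col_controls = 100 * controls["smoker"] / sum(controls.values())
    row_smokers = 100 * cases["smoker"] / (cases["smoker"] + controls["smoker"])
    return round(col_cases, 1), round(col_controls, 1), round(row_smokers, 1)

cases = {"smoker": 36, "non-smoker": 18}      # counts from the video
controls = {"smoker": 16, "non-smoker": 30}   # counts from the video

# Hypothetical scenario: twice as many controls with the same smoking mix
more_controls = {habit: 2 * n for habit, n in controls.items()}

original = percentages(cases, controls)
doubled = percentages(cases, more_controls)
print(original)  # (66.7, 34.8, 69.2)
print(doubled)   # (66.7, 34.8, 52.9)
```

The column percentages stay the same, while the row percentage changes with the number of controls the researcher chose to recruit, so it does not describe how common lung cancer is among smokers.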

💡Percentages

Percentages are used in the video to express the proportion of individuals within a group that exhibit a certain characteristic. For example, the script mentions that of the 54 people with lung cancer, 66.7% smoke. Percentages are a key way to communicate the strength of associations found in contingency tables and are essential for understanding the prevalence of characteristics within different groups.

💡Hospital Data

The video script discusses obtaining data on lung cancer patients from a hospital, particularly an oncology department. This data is used to identify 'cases' for the contingency table. Hospital data is a common source for health-related research as it provides access to information on individuals with specific health conditions.

💡Non-Smokers

Non-smokers are individuals who do not smoke and serve as the comparison category in the video's contingency table. The script notes that finding lung cancer in a sample of 52 smokers is already unlikely, and that finding it in a sample of 48 non-smokers is harder still, which is why the groups are built from disease status rather than from smoking status.

💡Prevalence

Prevalence refers to the proportion of a population that has a certain condition or characteristic at a specific time. In the video, what is compared is the proportion of smokers among people with lung cancer (66.7%) and among people without it (34.8%), computed as column percentages of the contingency table. The higher proportion of smokers among the cases is the evidence used to point to smoking as a risk factor.

💡Factor of Risk

'Factor of risk' is a literal rendering of the Spanish term 'factor de riesgo' used in the video, and it means the same as risk factor: a characteristic or behavior associated with an increased chance of developing a disease. The script closes by noting that characteristics found more frequently among cases than among controls are the ones considered risk factors.

💡Statistical Analysis

Statistical analysis involves the use of statistical methods to analyze data and draw conclusions. The video script describes the process of using statistical analysis to explore the relationship between smoking and lung cancer through contingency tables. This analysis is crucial for understanding the potential impact of smoking on health outcomes.

Highlights

Contingency tables are also known as double-entry tables or cross-tabulations.

They are used to relate two categorical variables, such as smoking habit and lung cancer.

The smoking habit is a categorical variable because it divides people into smokers and non-smokers.

Lung cancer is also categorical, as it categorizes people into those who have it and those who do not.

It is suspected that smokers have a higher frequency of lung cancer compared to non-smokers.

The relationship is examined through the Descriptive Statistics menu in SPSS, which contains the contingency tables option.

In constructing a contingency table, the disease or outcome is always placed in the columns.

The risk factor or independent variable is placed in the rows, not the other way around.

This order facilitates the comparison between two groups, such as those with and without lung cancer.

It is easier to gather a group of people with lung cancer and count how many smoke than to gather a group of smokers and look for lung cancer among them.

Starting from a group of non-smokers and trying to identify those with lung cancer is harder still.

Hospitals can provide a group of people with lung cancer, making it easier to construct the 'cases' group.

The 'controls' group consists of people without the disease, who are even easier to find than the cases.

Cell percentages are requested vertically, that is, by columns, so that the proportion of smokers can be compared between the two groups.

Out of 54 people with lung cancer, 36 are smokers, which is 66.7% of that group.

Out of 46 people without lung cancer, 16 are smokers, which is 34.8% of that group.

The proportion of smokers is much higher in the group with lung cancer (66.7% versus 34.8%).

The construction of groups starts with the disease, with affected individuals called 'cases' and unaffected called 'controls'.

In case-control studies, multiple characteristics are compared between the affected and unaffected groups to identify risk factors.

Transcripts

00:01

Contingency tables are also called double-entry tables or cross-tabulations.

00:08

The simplest way to build a contingency table is with two variables: a bivariate analysis, which corresponds to the relational level of research.

00:28

The aim, then, is to relate two variables that are categorical in nature.

00:36

For example, smoking habit is a categorical variable because some people smoke and others do not.

00:43

Lung cancer, or bronchogenic carcinoma, is also a categorical variable because some people develop cancer and others do not.

00:58

It is suspected that people who smoke develop cancer more frequently than those who do not.

01:04

How do we demonstrate this with a contingency table? Let's see.

01:14

We go to Analyze, then Descriptive Statistics, and select the contingency tables option.

01:24

In the rows we place smoking habit, and in the columns lung cancer.

01:31

There is a conventional order for presenting a contingency table: the disease or outcome always goes in the columns, and the risk factor or independent variable goes in the rows, not the other way around.

01:52

This lets us compare the two groups.

01:55

For example, it is difficult to gather a group of people who smoke and find out which of them have lung cancer.

02:06

It is harder still to gather a group of people who do not smoke and identify which of them have lung cancer.

02:16

This second strategy is much easier: gather a group of people with lung cancer and see how many of them smoke; likewise, gather a group of people without lung cancer and see how many of them smoke.

02:34

This is much easier because one can go to a hospital and find a group of people who have this condition.

02:47

For this reason the comparison is made between people with and without cancer, and not between those who smoke and those who do not.

02:57

This reasoning is why we request the cell percentages vertically, that is, by columns. Then Continue and OK.

03:13

We see, then, that of the 54 people who have lung cancer, 36 smoke, and these 36 out of 54 make up 66.7 percent.

03:29

On the other hand, we have 46 people who do not have lung cancer, and among them 16 smoke; 16 out of 46 is 34.8%, a much smaller proportion than the one observed in the group of people who have lung cancer.

03:50

How did we obtain these 54 people with lung cancer? We go to the hospital, to the oncology center, and we can pull a list of these people, or patients.

04:04

It is even easier to find people who do not have lung cancer.

04:09

But if we start from the other side, looking for 52 people who smoke, it is unlikely that we will find lung cancer; worse still, if we gather 48 people who do not smoke, it will be much harder to detect anyone with lung cancer in that group.

04:28

For this reason the groups are built starting from the disease: those who are affected are called cases, and those who do not have the disease, or are not affected, are called controls.

04:45

This is the basis for building the case-control design, in which one looks for a group of sick people and a group of healthy people.

04:54

In these groups, not just one characteristic but several are compared: characteristics that may be more frequent in the affected group than in the unaffected one, and that will then be considered risk factors.
