04. How to Create a Contingency Table | SPSS Course
Summary
TL;DR: This script explains how contingency tables, also known as cross-tabulations, are built and why they matter for analyzing the relationship between two categorical variables. It uses smoking habit and lung cancer as the example to show how such a table can reveal an association, here a higher proportion of smokers among people with lung cancer. The script details the conventional arrangement of the table, with the disease in columns and the risk factor in rows, and explains why the groups are built in practice from diseased individuals (cases) and non-diseased individuals (controls).
Takeaways
- 📊 Contingency tables, also known as cross-tabulation or double-entry tables, are used to analyze the relationship between two categorical variables.
- 🚬 The example used in the script involves the relationship between smoking habit (a categorical variable) and lung cancer (another categorical variable).
- 🔍 It is suspected that smokers have a higher frequency of lung cancer, and a contingency table is used to demonstrate this relationship.
- 📋 The conventional presentation of a contingency table places the disease or outcome in the columns and the risk factor or independent variable in the rows.
- 🏥 Obtaining a group of people with lung cancer is easier than finding a group of smokers or non-smokers within which to identify cancer cases.
- 📈 The script describes how to calculate percentages for the table cells, with examples showing a higher percentage of smokers among those with lung cancer compared to those without.
- 👥 The groups for analysis are constructed based on the presence of disease, with affected individuals called 'cases' and non-affected individuals 'controls'.
- 🧐 The script emphasizes the ease of finding a control group without lung cancer compared to identifying cancer within groups of smokers or non-smokers.
- 📝 Building the groups from disease status is the basis of the case-control study design, where multiple characteristics are compared between affected and non-affected groups to identify potential risk factors.
- ⚖️ The script highlights the importance of proper group selection and comparison in statistical analysis to accurately identify risk factors associated with diseases like lung cancer.
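As a minimal sketch of the layout described in these takeaways, the table can be rebuilt from one record per person. The four cell counts come from the figures quoted in the video (36 of 54 cases and 16 of 46 controls smoke; the 18 and 30 non-smokers are obtained by subtraction). The label strings and variable names are illustrative assumptions, not part of the SPSS procedure.

```python
from collections import Counter

# One (smoking habit, lung cancer) record per person, rebuilt from the
# counts quoted in the video. 18 = 54 - 36 and 30 = 46 - 16 are derived.
records = (
    [("smoker", "cancer")] * 36
    + [("non-smoker", "cancer")] * 18
    + [("smoker", "no cancer")] * 16
    + [("non-smoker", "no cancer")] * 30
)

# Count how many people fall into each combination of categories.
counts = Counter(records)

row_labels = ["smoker", "non-smoker"]  # risk factor goes in the rows
col_labels = ["cancer", "no cancer"]   # disease (outcome) goes in the columns

# Nested list: one inner list per row, one entry per column.
table = [[counts[(r, c)] for c in col_labels] for r in row_labels]

# Print the table with a header line for the columns.
print(f"{'':<12}" + "".join(f"{c:>12}" for c in col_labels))
for label, row in zip(row_labels, table):
    print(f"{label:<12}" + "".join(f"{n:>12}" for n in row))
```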
Q & A
What are contingency tables also known as?
-Contingency tables are also known as double-entry tables or cross-tabulation tables.
What is the simplest way to construct a contingency table?
-The simplest way to construct a contingency table is with two categorical variables, that is, a bivariate analysis, which corresponds to the relational level of research.
What is an example of a categorical variable?
-An example of a categorical variable is the habit of smoking, as it categorizes individuals into those who smoke and those who do not.
Why is lung cancer chosen as the second variable in the contingency table example?
-Lung cancer is chosen as the second variable because it is also a categorical variable, and there is a suspicion that it occurs more frequently in smokers than in non-smokers.
What is the conventional order for presenting a contingency table?
-In a contingency table, the disease or consequence is always presented in the columns, and the risk factor or independent variable is presented in the rows.
Why is it easier to find a group of people with lung cancer than a group of smokers?
-It is easier to find a group of people with lung cancer because a hospital can provide a list of such patients, whereas starting from a group of smokers or non-smokers and then finding out which of them have lung cancer is much harder.
What percentage of the 54 people with lung cancer in the example are smokers?
-In the example, 66.7% of the 54 people with lung cancer are smokers.
What percentage of the 46 people without lung cancer in the example are smokers?
-In the example, 34.8% of the 46 people without lung cancer are smokers.
What is the basis for constructing case-control studies mentioned in the script?
-The basis for constructing case-control studies is to compare affected individuals (cases) with unaffected individuals (controls) on various characteristics that may be more frequent in the affected group.
Why are cases and controls used in the construction of contingency tables?
-Cases and controls are used to identify risk factors by comparing the frequency of certain characteristics in groups with and without the disease.
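The two percentages quoted in the answers above are column percentages: each cell divided by the total of its column. A small sketch of that arithmetic follows, assuming the table is held as a nested list with smoking in rows and lung cancer in columns; the function name `column_percentages` is hypothetical.

```python
def column_percentages(table):
    """Return every cell as a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [
        [100 * cell / total for cell, total in zip(row, col_totals)]
        for row in table
    ]

# Rows: smoker, non-smoker. Columns: lung cancer, no lung cancer.
table = [[36, 16], [18, 30]]
pct = column_percentages(table)

for label, row in zip(["smoker", "non-smoker"], pct):
    print(label, [f"{p:.1f}%" for p in row])
# smoker ['66.7%', '34.8%']
# non-smoker ['33.3%', '65.2%']
```

Each column sums to 100%, which is why the comparison runs between the cancer and no-cancer groups rather than between smokers and non-smokers.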
Outlines
📊 Introduction to Contingency Tables
This paragraph introduces contingency tables, also known as cross-tabulation or double-entry tables, which are used to analyze the relationship between two categorical variables. The example used is the relationship between smoking habits and lung cancer. The paragraph explains the construction of a contingency table with smoking as the independent variable in rows and lung cancer as the dependent variable in columns. It emphasizes the conventional presentation where the outcome (lung cancer) is in columns and the risk factor (smoking) is in rows. The paragraph also discusses the ease of obtaining a group of people with lung cancer compared to a group of smokers or non-smokers for statistical analysis. It concludes with a description of how to calculate percentages within the table and the rationale behind constructing groups based on disease status, termed as 'cases' and 'controls'.
🔍 Risk Factors in Case-Control Studies
The second paragraph delves into the concept of case-control studies, focusing on identifying risk factors that are more prevalent in the affected group compared to the non-affected group. It discusses the process of comparing multiple characteristics between these two groups to determine potential risk factors for a particular disease or condition. The paragraph highlights the importance of this comparative analysis in understanding the relationship between various factors and the occurrence of diseases.
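The comparison of several characteristics between cases and controls can be sketched as follows. This is only an illustration under stated assumptions: the function `compare_groups` and the dictionary layout are hypothetical, and smoking is the only characteristic for which the video reports counts, so it is the only key filled in.

```python
def compare_groups(case_counts, control_counts, n_cases, n_controls):
    """For each characteristic, return (% among cases, % among controls)."""
    return {
        name: (
            100 * case_counts[name] / n_cases,
            100 * control_counts[name] / n_controls,
        )
        for name in case_counts
    }

# Number of people showing each characteristic in each group.
# Only smoking is reported in the video; other characteristics
# would be added as further keys in both dictionaries.
case_counts = {"smoker": 36}     # out of 54 cases
control_counts = {"smoker": 16}  # out of 46 controls

summary = compare_groups(case_counts, control_counts, n_cases=54, n_controls=46)

for name, (p_cases, p_controls) in summary.items():
    note = "more frequent among cases" if p_cases > p_controls else "not more frequent among cases"
    print(f"{name}: {p_cases:.1f}% of cases vs {p_controls:.1f}% of controls ({note})")
```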
Keywords
💡Contingency Tables
💡Categorical Variables
💡Risk Factor
💡Descriptive Statistics
💡Cases and Controls
💡Percentages
💡Hospital Data
💡Non-Smokers
💡Prevalence
💡Statistical Analysis
Highlights
Contingency tables are also known as double-entry tables or cross-tabulations.
They are used to relate two categorical variables, such as smoking habit and lung cancer.
The smoking habit is a categorical variable because it divides people into smokers and non-smokers.
Lung cancer is also categorical, as it categorizes people into those who have it and those who do not.
It is suspected that smokers have a higher frequency of lung cancer compared to non-smokers.
Descriptive statistical analysis is used to demonstrate this relationship through contingency tables.
In constructing a contingency table, the disease or outcome is always placed in the columns.
The risk factor or independent variable is placed in the rows, not the other way around.
This order facilitates the comparison between two groups, such as those with and without lung cancer.
It is difficult to recruit a group of smokers and then find out which of them have lung cancer.
It is even more difficult to recruit a group of non-smokers and identify those with lung cancer.
Hospitals can provide a group of people with lung cancer, making it easier to construct the 'cases' group.
The 'controls' group consists of people without the disease, who are even easier to find.
Cell percentages are calculated vertically, that is, by column, to compare the proportion of smokers in each group.
Out of 54 people with lung cancer, 36 are smokers, which is 66.7% of that group.
Out of 46 people without lung cancer, 16 are smokers, which is 34.8% of that group.
The proportion of smokers is much higher in the group with lung cancer (66.7% versus 34.8%).
The construction of groups starts with the disease, with affected individuals called 'cases' and unaffected called 'controls'.
In case-control studies, multiple characteristics are compared between the affected and unaffected groups to identify risk factors.
Transcripts
Contingency tables are also called double-entry tables or cross-tabulations. The simplest way to build a contingency table is with two variables: a bivariate analysis, which corresponds to the relational level of research. The aim is to relate two variables that are both categorical. Smoking habit, for example, is a categorical variable because some people smoke and others do not. Lung cancer, or bronchogenic cancer, is also a categorical variable because some people develop it and others do not. It is suspected that people who smoke develop cancer more often than those who do not. How do we demonstrate this with a contingency table? Let's see.
We go to Analyze, then Descriptive Statistics, and select the Crosstabs option. In Rows we place smoking habit and in Columns lung cancer. There is a conventional order for presenting a contingency table: the disease or outcome always goes in the columns, and the risk factor or independent variable goes in the rows, not the other way around.
This lets us compare the two groups. It is difficult, for example, to recruit a group of people who smoke and find out which of them have lung cancer; it is harder still to recruit a group of people who do not smoke and identify which of them have lung cancer. The second strategy is much easier: recruit a group of people with lung cancer and see how many of them smoke, and likewise recruit a group of people without lung cancer and see how many of them smoke. This is much easier because one can go to a hospital and find a group of people with this condition. For this reason the comparison is made between people with and without cancer, and not between those who smoke and those who do not. This reasoning leads us to request the cell percentages vertically, that is, by column. Continue, then OK.
We see, then, that of the 54 people who have lung cancer, 36 smoke, and 36 out of 54 is 66.7 percent. On the other hand, there are 46 people who do not have lung cancer, of whom 16 smoke, and 16 out of 46 is 34.8 percent, a much smaller proportion than in the group with lung cancer. How did we get these 54 people with lung cancer? We go to the hospital, to the oncology center, and pull a list of these patients. Finding people who do not have lung cancer is easier still. If we started from the other side, however, looking for 52 people who smoke, we would be unlikely to find lung cancer among them, and it would be worse still with 48 people who do not smoke, among whom it would be much harder to detect anyone with lung cancer. For this reason the groups are built starting from the disease: those who are affected are called cases, and those who do not have the disease, or are not affected, are called controls. This is the basis of the case-control design, in which a group of sick people and a group of healthy people are sought. In these groups not just one characteristic but several are compared, characteristics that may be more frequent in the affected group than in the unaffected one and that will then be considered risk factors.
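The group sizes mentioned in the transcript (54 cases, 46 controls, 52 smokers, 48 non-smokers) are the marginal totals of the same table. A short sketch to check them, assuming the nested-list layout with smoking in rows and lung cancer in columns; the variable names are illustrative.

```python
# Rows: smoker, non-smoker. Columns: lung cancer (cases), no lung cancer (controls).
table = [[36, 16], [18, 30]]

row_totals = [sum(row) for row in table]        # smokers, non-smokers
col_totals = [sum(col) for col in zip(*table)]  # cases, controls
grand_total = sum(row_totals)

print("smokers / non-smokers:", row_totals)  # [52, 48]
print("cases / controls:", col_totals)       # [54, 46]
print("total:", grand_total)                 # 100
```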