Descriptive Statistics, Part 1
Summary
TLDR: In this tutorial, Amanda Rockson Appq introduces key concepts of descriptive statistics, such as distributions, measures of central tendency (mean, median, and mode), and dispersion (range, variance, and standard deviation). She uses examples to illustrate how to calculate these measures and when it is appropriate to report each one, stressing the importance of variation in the data. She also explains how to use SPSS to generate descriptive statistics and how to report them in APA format, with both narrative descriptions and examples of formatted tables.
Takeaways
- 📊 A distribution is a list of scores taken on a particular variable and is foundational to statistics.
- 🔢 The three measures of central tendency are the mean, the median, and the mode.
- 💡 The mean is the arithmetic average of a group of scores: the sum of the scores divided by the number of scores.
- 📈 The median is the midpoint of all the scores in a distribution ordered from lowest to highest.
- 📦 The mode is the value with the greatest frequency in the distribution.
- 📑 The type of data and the distribution of the data determine which measure of central tendency is reported.
- 🔄 Variation occurs in every distribution and is caused by different variables.
- 📋 Without knowing how the data are dispersed, measures of central tendency can be misleading.
- 📏 Dispersion is measured by the range, the variance, and the standard deviation.
- 📘 The variance is the average of the squared differences from the mean and is foundational to the standard deviation.
- 📊 The standard deviation is a commonly used measure of dispersion and is simply the square root of the variance.
- 📐 The standard deviation shows how scores are spread around the mean; a larger value indicates a larger spread.
Q & A
What is a distribution in statistics?
-A distribution is a list of scores obtained on a particular variable. It is foundational to statistical analysis because measures such as central tendency and dispersion are calculated from it.
What are the three measures of central tendency?
-The three measures of central tendency are the mean, the median, and the mode. The mean is the average, the median is the middle value in an ordered list, and the mode is the value that appears most frequently.
How is the mean calculated?
-The mean is calculated by adding all the scores and dividing that sum by the total number of scores. For example, for the ten scores 69, 77, 77, 77, 84, 85, 85, 87, 92, and 98, the sum is 831 and the mean is 831 / 10 = 83.1.
What does the median indicate in a data set?
-The median is the midpoint of a data set arranged in order. If the number of values is odd, it is the middle value; if it is even, it is the average of the two middle values. For the ten scores above, the median is (84 + 85) / 2 = 84.5.
What is the mode and how is it identified?
-The mode is the value that appears most frequently in a distribution. In the example from the transcript, 77 is the mode because it appears three times in the data set.
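The three measures can be checked with a few lines of Python. This is only a sketch using the standard-library `statistics` module on the ten test scores from the transcript; the variable names are illustrative and not part of the tutorial.

```python
from statistics import mean, median, mode

# The ten test scores from the tutorial's example distribution.
scores = [69, 77, 77, 77, 84, 85, 85, 87, 92, 98]

central_tendency = {
    "mean": mean(scores),      # sum of the scores divided by how many there are
    "median": median(scores),  # halfway between the two middle scores (84 and 85)
    "mode": mode(scores),      # the most frequent score
}

for name, value in central_tendency.items():
    print(f"{name}: {value}")
```

With these scores the sketch should reproduce the values given in the tutorial: a mean of 83.1, a median of 84.5, and a mode of 77.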
When are the mean, the median, and the mode used?
-The mean is used with interval- or ratio-level data, the median with ordinal data, and the mode with categorical data. However, when there are outliers or the distribution is skewed, the median may be more representative than the mean.
What is variation in a distribution?
-Variation is the degree to which the scores in a distribution differ from one another. It can stem from different factors, such as students' prior exposure to statistics or the time they spend studying.
What is the variance and how is it calculated?
-The variance is the average of the squared differences from the mean. It is calculated by subtracting the mean from each score, squaring each difference, and then averaging the results.
What is the standard deviation?
-The standard deviation is the square root of the variance. It measures how spread out the scores are around the mean: a larger value indicates a larger spread and a smaller value a smaller spread.
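The steps for the variance and standard deviation can be sketched in Python using the ten test scores from the transcript. Both divisors are shown, because the transcript describes dividing by 10 but quotes a variance of 70.54, which is the value obtained when dividing by n - 1 = 9; dividing by 10 gives about 63.49. The variable names are illustrative.

```python
import math

scores = [69, 77, 77, 77, 84, 85, 85, 87, 92, 98]
n = len(scores)

mean = sum(scores) / n                              # step 1: the mean (83.1)
squared_diffs = [(x - mean) ** 2 for x in scores]   # steps 2-3: subtract the mean, then square
sum_of_squares = sum(squared_diffs)                 # about 634.9

population_variance = sum_of_squares / n            # step 4, dividing by N
sample_variance = sum_of_squares / (n - 1)          # step 4, dividing by n - 1

population_sd = math.sqrt(population_variance)      # square root of the variance
sample_sd = math.sqrt(sample_variance)

print(f"divide by N:     variance {population_variance:.2f}, SD {population_sd:.2f}")
print(f"divide by n - 1: variance {sample_variance:.2f}, SD {sample_sd:.2f}")
```

Which divisor is appropriate depends on whether the ten scores are treated as a whole population or as a sample, a distinction the tutorial returns to in its five-student example.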
How are the results of a normal distribution interpreted in terms of the standard deviation?
-In a normal distribution, approximately 68% of the scores fall within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations.
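These percentages can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `coverage` is illustrative and not from the tutorial.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def coverage(k: float) -> float:
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {coverage(k):.1%}")
```

The result does not depend on the particular mean or standard deviation, which is why the same three percentages apply to any normally distributed variable.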
Outlines
🎓 Introduction to descriptive statistics
In this section, Amanda Rockson introduces the tutorial on descriptive statistics and explains that it is the first of two parts. It covers distributions, measures of central tendency, dispersion, and how to calculate these statistics in SPSS and report them in APA format. It also introduces the concept of a distribution as a list of scores on a particular variable.
📊 Measures of central tendency: mean, median, and mode
This section explains the three main measures of central tendency: the mean, the median, and the mode. A distribution of ten scores on a statistics test is used to illustrate how to calculate each measure. The mean is obtained by adding all the scores and dividing by the number of scores, the median is the midpoint of an ordered distribution, and the mode is the most frequent value.
🔍 Variation in scores: examples and causes
This section addresses variation in students' scores in a statistics course and the concept of dispersion. It gives examples of how scores vary because of different factors, such as prior exposure to statistics and study time. It introduces the range and the dispersion of scores, which are key to interpreting statistics correctly.
📉 Importance of variation and dispersion in data
This section highlights why it is important to understand dispersion, using an example based on family income as an indicator of socioeconomic status. It illustrates how measures of dispersion, such as the range, variance, and standard deviation, give a more complete picture of the data. The example shows that the mean can be misleading when outliers are present.
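The effect of outliers on the mean can be reproduced with the figures from the transcript's second scenario: a group of 20 students, three from families with a combined income of $1,000,000 and the other seventeen from families with a combined income of $60,000. This is a sketch only, and the variable names are illustrative.

```python
from statistics import mean, median

# Three outlying families at $1,000,000 and seventeen families at $60,000.
incomes = [1_000_000] * 3 + [60_000] * 17

mean_income = mean(incomes)      # pulled upward by the three outliers
median_income = median(incomes)  # the income of a typical family in the group

print(f"mean:   ${mean_income:,.0f}")
print(f"median: ${median_income:,.0f}")
```

The mean works out to $201,000, close to the roughly $200,000 quoted in the tutorial, while the median stays at $60,000, which describes most of the families far better.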
📏 Measures of dispersion: range, variance, and standard deviation
This section explains the measures of dispersion in detail: range, variance, and standard deviation. The calculations are shown step by step using a distribution of scores. The range is the difference between the highest and lowest values, the variance is the average of the squared differences from the mean, and the standard deviation is the square root of the variance.
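Written as formulas, the three definitions look as follows. The notation is an assumption made here for illustration and does not come from the tutorial: the x_i are the individual scores, N is the number of scores, and μ is their mean. The variance is shown in its population form (dividing by N), as the tutorial describes it.

```latex
\begin{aligned}
\text{range} &= x_{\max} - x_{\min} \\
\sigma^{2}   &= \frac{1}{N} \sum_{i=1}^{N} \left( x_i - \mu \right)^{2} \\
\sigma       &= \sqrt{\sigma^{2}}
\end{aligned}
```

For the ten test scores in the tutorial, the range is 98 - 69 = 29.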
🔬 Standard deviation and the normal distribution
This section explores how the standard deviation describes the spread of scores around the mean. It also introduces the bell curve, or normal distribution, in which approximately 68% of scores fall within one standard deviation of the mean, 95% within two, and 99.7% within three.
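As a rough illustration, the intervals for the ten test scores can be computed from the mean (83.1) and standard deviation (8.39) quoted in the tutorial. The percentages only apply if the scores are approximately normally distributed, which ten scores can at best approximate.

```python
mean, sd = 83.1, 8.39  # mean and standard deviation quoted in the tutorial

# Interval covered by k standard deviations on either side of the mean.
intervals = {
    k: (round(mean - k * sd, 2), round(mean + k * sd, 2))
    for k in (1, 2, 3)
}

for k, (lower, upper) in intervals.items():
    print(f"within {k} SD: {lower} to {upper}")
```

This gives 74.71 to 91.49 for one standard deviation, 66.32 to 99.88 for two, and 57.93 to 108.27 for three.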
📊 Worked example: variance and standard deviation
This section works through the calculation of the variance and standard deviation using five students' course points. The variance is calculated first and then the standard deviation. It also contrasts the calculation for a complete population (dividing by N) with the calculation for a sample (dividing by n - 1).
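The population and sample versions can be compared with Python's `statistics` module. The sketch assumes the five course-point totals are 600, 470, 170, 430, and 300, which reproduce the mean of 394 stated in the tutorial.

```python
from statistics import mean, pstdev, pvariance, stdev, variance

# Assumed course points for the five students (they give the stated mean of 394).
points = [600, 470, 170, 430, 300]

population = (pvariance(points), pstdev(points))  # divides by N = 5
sample = (variance(points), stdev(points))        # divides by n - 1 = 4

print(f"mean: {mean(points)}")
print(f"population: variance {population[0]}, SD {population[1]:.2f}")
print(f"sample:     variance {sample[0]}, SD {sample[1]:.2f}")
```

Dividing by n - 1 always gives the larger value, so treating the five students as a sample rather than a population raises both the variance and the standard deviation.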
🔍 Comparing distributions and the spread of the data
This section compares three distributions with different means and standard deviations. It shows how the spread of the scores affects the interpretation of the data: a larger standard deviation indicates more spread, and a smaller standard deviation indicates less.
💻 Using SPSS to calculate descriptive statistics
This section describes the three main SPSS functions for calculating descriptive statistics: 'Frequencies', 'Descriptives', and 'Explore'. Viewers are encouraged to explore these functions, which can calculate the measures of central tendency and dispersion that could otherwise be worked out by hand.
📝 How to report descriptive statistics in APA format
The final section explains how to report descriptive statistics in APA format, either in a narrative or in a table. It emphasizes notation such as M for the mean and SD for the standard deviation, which APA style sets in italics, and shows an example of reporting the results for a distribution of data.
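As a small illustration of the narrative style, the sketch below builds the statistical part of such a sentence. The helper name `apa_descriptives`, the assumed scores (600, 470, 170, 430, 300), and the choice of two decimal places are assumptions made here, not details from the tutorial; plain text also cannot show the italics that APA style expects for N, M, and SD.

```python
from statistics import mean, pstdev

# Assumed course points for the five students in the tutorial's example.
points = [600, 470, 170, 430, 300]

def apa_descriptives(values):
    """Return an 'N = ..., M = ..., SD = ...' fragment; italics must be applied separately."""
    # pstdev (dividing by N) matches the tutorial's worked value of about 147;
    # statistics.stdev could be swapped in when the data are treated as a sample.
    return f"N = {len(values)}, M = {mean(values):.2f}, SD = {pstdev(values):.2f}"

report = apa_descriptives(points)
print(report)
```

For the assumed scores this prints N = 5, M = 394.00, SD = 147.32.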
Keywords
💡distribution
💡central tendency
💡mean
💡median
💡mode
💡variation
💡dispersion
💡range
💡variance
💡standard deviation
💡SPSS
Highlights
Introduction to descriptive statistics, including definitions of distributions, central tendency, and dispersion.
Explanation of a distribution: a list of scores taken on a particular variable, with an example using 10 student test scores.
Introduction to measures of central tendency: mean, median, and mode, with clear examples of how to calculate each.
Demonstration of when to report different measures of central tendency based on data type: mean for interval/ratio data, median for ordinal data, and mode for categorical data.
Explanation of variation and why it's important in understanding data distribution, with examples from student scores and heart rates.
Introduction to measures of dispersion: range, variance, and standard deviation, with a focus on their importance in providing a complete picture of data.
Step-by-step guide to calculating variance and standard deviation, including an example using 10 test scores.
Discussion of how outliers can impact the mean and why the median might be more appropriate in certain cases.
Clarification of standard deviation's role in measuring the spread of scores around the mean, and its relevance in statistical analysis.
Use of a normal bell curve to explain how data distribution relates to standard deviation, with a breakdown of scores within 1, 2, and 3 standard deviations from the mean.
Detailed example calculating descriptive statistics for a group of five students' course points, demonstrating variance and standard deviation.
Important note on the difference between population and sample formulas for variance and standard deviation.
Explanation of how to calculate and interpret standard deviation and its influence on data spread using different distributions.
Introduction to using SPSS to calculate descriptive statistics, highlighting the key functions: frequencies, descriptives, and explore.
Guidance on reporting descriptive statistics in APA format, both in narrative form and through tables.
Transcripts
[Music]
welcome this is Amanda Rockson Appq and
in this tutorial we are going to talk
about descriptive
statistics this is part one of a
two-part tutorial in part one we are
going to identify and define
distributions measures of central
tendency dispersion and also briefly
look at how to calculate descriptive
statistics in SPSS
as well as how to report them in APA
format let's get
started a distribution is foundational
to everything we are going to talk about
in this tutorial and it's very
foundational to statistics a
distribution is a list of scores taken
on a particular variable for example the
following is a distribution of 10
students' scores on a statistics test
arranged from lowest to highest you'll see
that the distribution includes 69 77
77 77 84 85 85 87 92 and
98 a researcher can take a distribution
and calculate measures of central
tendency there are three measures of
central tendency and you see them listed
here the mean median and mode the mean
is the arithmetic average of a group of
scores or it's the sum of scores divided
by the number of scores for example in
our distribution of 10 scores we add 69
+ 77 + 77 + 77 + 84 + 85 + 85 + 87 + 92
+ 98 and we divide that by the number of
scores which is 10 and we find that the
mean is
83.1 now the median is the middle of all
the scores in the distribution when
they are arranged from highest to lowest
it's the midpoint of the distribution
when the distribution has an odd number
of scores or the number halfway
between the two middle
scores when the distribution has an
even number so here for example we
arrange our distribution of 10 test
scores from lowest to highest and we see
that the median is
84.5 finally we have the mode the mode
is the value with the greatest frequency
in the distribution so here for example
in our distribution we see that 77
is the mode because it's listed most
frequently it's listed three times in
this
distribution so in statistics the
measure of central tendency that is
reported by the researcher depends upon
the type of data as well as the
distribution of the data let's talk
about this for categorical scores the
researcher would report the mode the
score that occurs most frequently to
indicate what response is most typical
for ordinal data or ordinal scores the
researcher would use the median and for
interval or ratio level scores quantitative
data the researcher would use the mean
however sometimes the researcher would
also report the median to describe
central tendency and here she would do
this in cases where there may be
outliers or the distribution of data is
skewed in some manner because the mean
is really less robust in cases like this
and actually the median may then be
more useful we're going to look at an
example of this so that we can better
understand it however before we do that
let's talk a little bit about
variation now variation occurs in every
distribution before we define variance
more specifically let's talk a little
bit about what it is and why it exists I
want you to think for a moment if we
look at students' course points in one
online statistics course it's likely
that if we had 20 students all of their
course points would differ why would
this
be well maybe some students have had
more exposure to statistics than others
maybe some have put in more time
studying these are variables that cause
variation and the reason that we have
what we consider the range in scores
why let's say if the course points
ranged anywhere from 0 to 1,000
that's why we would have scores ranging
anywhere from let's say 700 to
1,000 let's look at another example
let's say
that we measure the heart rate of everyone in the
class again the score
values would differ across all of the
class members why do you think that the
heart rates would
differ well there's lots of reasons
heart rates would differ
um variables that might cause different
heart rates or cause variation in heart
rate across class members may include
things such as sex age body weight maybe
somebody is drinking a cup of coffee or
even just thinking about
statistics um maybe somebody recently
exercised so on and so forth again these
are variables that cause
variation it's the reason that we have
the range of
scores why then is understanding this
variation or this dispersion of scores
or the distribution
important without knowing something
about how the data is dispersed or how
it varies measures of central tendency
can actually be misleading let's say
that we know that socioeconomic status
influences course grades so a researcher
wants to know the socioeconomic status
of a group of 10 students the group of
10 students that we've been talking
about these 10 students let's say
come from families in which the mean
income is
$200,000 with very little variation from
the mean meaning that most of the
students come from families that make
around
$200,000 however this would be very
different if we had a group of 20
students who came from families in which
the mean score was $200,000 also
but what we knew about this group of
students is that
three of the students come
from families that make a combined
income of a million dollars and the other
students come from families that make a
combined income of
$60,000 measures of dispersion thus
provide a more complete picture of the
data set
here um so dispersion includes three
measures it includes range variance and
standard deviation and we're going to
talk about those next but before we move
on and talk about those I want to point
out that this example also provides a
good example of how the mean can be
deceiving when there are outliers in our
second scenario we have three outliers
the students who come from families that
make a million dollars really what we
see is the majority of the families
have a combined income of $60,000
our median here is then
$60,000 and really provides a better
picture of central tendency than the
mean does in this second
scenario we are now going to look at the
measures of dispersion range variance
and standard
deviation let's start with range I've
already talked about range so you may
have an idea of what range is but the
range is the distance between the
minimum score and the maximum score for
example in our distribution of 10 test
scores the range would be 29 because our
highest test score was 98 our lowest was
69 and 98 - 69 is
29 now let's talk about variance the
variance tells us about the variation of
the data and it's actually foundational
to the next term we're going to talk
about which is standard
deviation variance is defined as the
average of the squared differences from
the mean let's talk about how to
calculate variance so we can better
understand this definition to calculate
variance the first thing that you need
to do is calculate the mean we've
already done that we know it's
83.1 next for each number in the
distribution subtract the mean for
example for the score of 98
we would take 98 minus
83.1 for the score of 92 we take 92
minus 83.1 and so
on then we square that number or each of
those numbers and that's known as the
squared difference finally we calculate
the average of the squared differences
that is we add up the squared differences
and we divide them by the number of
scores which is 10 when we do this what
we get is
70.54 so the variance for the
distribution that we've been looking at
is 70.54 that brings me finally to
standard deviation
standard deviation is a commonly used
measure of dispersion and it's simply
the square root of the variance here we
know the square root of the variance of
70.54 is
8.39 let's talk
a little bit more about standard
deviation because this is a
really really important term
and it's also a measure that's used in a
lot of statistics that we're going to talk
about in other
tutorials standard deviation then
is a measure as I said of
the dispersion of scores and it measures
the dispersion of scores specifically
around the mean in other words another
way to think about standard deviation is
how spread out a set of
numbers is or how far numbers are from
what's normal or what's
average the larger the standard
deviation then the larger the
spread of scores around the mean and the
smaller the standard deviation the
smaller the spread of scores is going to
be around the
mean let's talk a little bit further
about standard deviation and look at it
in terms of a normal bell curve here we
have a picture of a normal bell curve
and for normally distributed data
values here we can see that
approximately
68% of the distribution falls within
one standard deviation of the mean
95% of the distribution falls within two
standard deviations of the mean and
99.7% of the distribution falls within
three standard deviations of the mean
now let's talk about this in terms of
the distribution that we've been looking
at if our 10 scores were evenly
distributed with the mean of
83.1 this would mean that 68% of our
scores would fall between
91.49 and
74.71
now before we go on how did I
get that number well let's talk
about
91.49 what I did was I knew that
our mean was 83.1 and if I add our
standard deviation of
8.39 what I find is that
scores that are one
standard deviation above the mean
equal 91.49 then if I take
our mean again of
83.1 and I want to know
a score that falls below one
standard deviation of the mean I'm going
to find that if I take 83.1 minus
8.39 that I'm going to get
74.71 okay so that's
how I'm getting those numbers so let's
go back so if our 10 scores again were
evenly distributed with the mean of
83.1 we would know then that 68% of our
scores would fall within one standard
deviation of the mean that means our
scores would fall between 91.49 and
74.71
then we would also know that 95%
of our scores would range between
99.88 and 66.32 and that 99.7% of our scores would fall between
108.27 and
57.93 now that we've defined
measures of central tendency and also
measures of dispersion we're
going to take a little bit closer look
at measures of dispersion just to make
sure that we have a clear understanding
of
them we're going to walk through this
example and calculate variance and
standard deviation while I'm aware that
software such as SPSS automatically
calculates these values for you you
still need to understand the statistic
and how it works and know
how to calculate it
because this will really help you
understand what the values represent
so that you can clearly describe and
interpret them when you get them in the
SPSS output so let's take a look
at this example an educational
statistics professor has a class of five
students the students' course points
range from 0 to 600 and the following
are the students' points there's a
student that has 600 points a
student that has 470 points a student
that has 170 points a student that has
430 points and a student that has 300
points now remember that variance is
foundational to standard deviation so we
first need to calculate the variance the
first step to calculating the variance
is to calculate the mean and remember we
calculate the mean by adding all of the
scores and dividing it by the number of
scores as you can see I've done here
here we find that the mean is
394 for each number in the distribution
we then subtract the mean and here you
can see all of the numbers with the mean
subtracted for example
600 -
394 =
206 and so on and so
forth once those numbers are
calculated we then square those numbers
remember those numbers are known as
squared differences
we then add the squared differences and
divide them by the number of scores in
this case remember we have five and
by doing this we find that the variance
is
21,704 now that we know the variance we
can calculate the standard deviation the
standard deviation remember is the
square root of the variance here the
square root of
21,704 is
147.3 so the standard deviation in this example is
147 now that we know the standard
deviation as well as the variance we as
educational researchers or social
science researchers can use the standard
deviation to determine which scores are
within one standard deviation of the
mean here again I've listed um just as a
reminder the distribution of scores we
know that the mean is 394 and the
standard deviation is 147 so how do
we use these scores well scores between
247 and
541 fall within one standard deviation
of the mean again in order to find the
247
one standard deviation below the mean
we take 394 and we subtract 147 which
is the mean minus the standard deviation
and then in order to calculate one
standard deviation above the mean we
take 394 plus 147 and we find
541 so again we know that the scores
that range between 247 and
541 fall within one standard deviation
of the mean and what we can see if we
look at our distribution is that we have
three scores that fall within one
standard deviation of the mean we can
also note if we look at our distribution
that one of the students is a really
high achiever with a score of 600
whereas one of the students is somewhat
of a low achiever with a score of
170 before we move on from this example
I want to discuss two more important
points you'll note that the formula that
we used to calculate both variance and
standard deviation assumed that we were
using a population of five
students so what we did when we
calculated variance is we divided by n
or five in this case however if the data
was from a sample for example let's say
that we had 10 students in the class and
we randomly selected five of them we
would have divided by n minus one or 5 -
one which would be four when
calculating the variance so if we did
this what we would find is that our
sample variance was
27,130 and our standard deviation was
164 so this is just an
important point to note that the
population formula here is different
from the sample formula and
you'll find that to be the case or to be
true in a number of the statistics we'll
discuss in these
tutorials I then also wanted to take
time to refer back to a point that I
made earlier remember when defining
standard deviation I said that the
larger the standard deviation the larger
the spread of scores around the mean and
each
other here we have an example of
three distributions you'll first see at
the top the distribution that we've been
talking about where the mean is 394 and
the standard deviation is 147 with
our scores ranging from 170 to
600 let's say that we looked at
another sample of scores and this
distribution ranged from 70 to 600 so
the range is a little bit larger what we
see here then is that the mean is
334 so it's a little bit
lower and that our standard deviation is
236 what we note is that the lower
score of 70 influenced the mean and we
also can note a
larger standard deviation which implies
as we can see by examining the scores
and as we talked about that the
scores are more spread
out let's say then
we take data from one more sample
and this
distribution is not as spread out the
range is only from 370 to 600 and
what we find is that the mean is 454
and the standard deviation is
89.61 in this case what we see is that
the mean is
larger in this case because the
scores are higher however we then
see that the standard deviation is
smaller implying that these scores are
not very spread out at all so so this is
an example of what I meant when I said
the larger the standard deviation the
larger the spread of scores and then the
opposite is also true the smaller the
standard deviation the smaller the
spread of scores
now that we understand
how to calculate measures of central
tendency as well as dispersion let's
take a look at how we can do that in
SPSS now there are three functions that
can be used when you want to calculate
descriptive statistics in SPSS
you can use frequencies descriptives or
explore so from Analyze
you go to Descriptive Statistics and
then you can choose Frequencies
Descriptives or
Explore each of these has different
features and different functions within
so I encourage you to explore them
however for the most part you can
calculate measures of central
tendency and dispersion using these
three functions
once you've calculated your descriptive
statistics using either SPSS or
calculating them by hand you are then
ready to report them when you're
reporting descriptive statistics you can
either use the name and say the mean is
or you can use notations here are the
notations for different descriptive
statistics so for the mean you can see
that it's a capital M for the standard
deviation you can use a
capital SD I will note here that
whenever you report notations per APA
they should be italicized so it's really
important to make sure that your
statistics are
italicized so using the example
distribution that we've been talking
about here's how I might report the
descriptive statistics for that example
I would say that I have a convenience
sample of undergraduate students
enrolled in a statistics course and as
you can see here N which means the
entire sample population equals
five and then I tell the reader that
the course points ranged from 300 to 600
now the measure of
central tendency that I'm going to
report is the mean because I'm dealing
with course points which are measured at
the ratio level of measurement so I
report the mean and then I also can
report the standard deviation so this is
an example of how I would report
descriptive
statistics using a
narrative then here is an example of how
I would report descriptive statistics
using a table now let's say that instead
of just having five undergraduate
students to look at I actually want to
look at two sets of undergraduate
students students who take maybe a
statistics course online and students
who take it
residentially and I'm not just going to
look at course points here maybe I want
to look at um their scores on a test
also so here I have a descriptive
statistics table disaggregated by the
type of course at the top at the title
you'll note that the um the the n
capital N equals 10 because that's my
entire sample population that I'm
looking at but then down below for on
line I have n equal 5 and you'll note
that that's a lowercase n because that
indicates that's a group within the
entire sample population and you can see
that I had I looked at five online
students and five residential students
you can then also see in this table that
we have the mean and the standard
deviations listed so this is an example
of an APA formatted table that could be
used to report descriptive
statistics this then concludes part one
of the tutorial on descriptive
statistics at this point you should be
able to Define what a distribution is
and explain what it is you should be
able to identify define and calculate
measures of central tendency as well as
dispersion know how to go about
calculating descriptive statistics in
SPSS and then also reporting them in APA
format