Descriptive Statistics, Part 1

The Doctoral Journey
26 Aug 2013 · 25:23

Summary

TLDR: In this tutorial, Amanda Rockson Appq introduces key concepts in descriptive statistics: distributions, measures of central tendency (mean, median, and mode), and measures of dispersion (range, variance, and standard deviation). She uses worked examples to show how to calculate each measure and when it is appropriate to report each one, and she stresses why variation in the data matters. She also explains how to use SPSS to generate descriptive statistics and how to report them in APA format, with both narrative descriptions and examples of formatted tables.

Takeaways

  • 📊 A distribution is a list of scores taken on a particular variable, and it is foundational to statistics.
  • 🔢 The three measures of central tendency are the mean, the median, and the mode.
  • 💡 The mean is the arithmetic average of a group of scores: the sum of the scores divided by the number of scores.
  • 📈 The median is the midpoint of all the scores in an ordered distribution.
  • 📦 The mode is the value with the greatest frequency in the distribution.
  • 📑 The type of data and the shape of its distribution determine which measure of central tendency is reported.
  • 🔄 Variation occurs in every distribution and is caused by other variables.
  • 📋 Without knowing how the data are dispersed, measures of central tendency can be misleading.
  • 📏 Dispersion is measured by the range, the variance, and the standard deviation.
  • 📘 The variance is the average of the squared differences from the mean, and it is the basis for the standard deviation.
  • 📊 The standard deviation is a commonly used measure of dispersion; it is simply the square root of the variance.
  • 📐 The standard deviation shows how scores spread around the mean: a larger value indicates a wider spread.

Q & A

  • What is a distribution in statistics?

    -A distribution is a list of scores obtained on a specific variable. It is foundational to statistical analysis because it lets the researcher identify patterns and calculate measures such as central tendency and dispersion.

  • What are the three measures of central tendency?

    -The three measures of central tendency are the mean, the median, and the mode. The mean is the average, the median is the middle value in an ordered list, and the mode is the value that appears most often.

  • How is the mean calculated?

    -The mean is calculated by adding all the scores and dividing that sum by the total number of scores. For example, for the ten scores 69, 77, 77, 77, 84, 85, 85, 87, 92, and 98, the sum is 831 and the mean is 831 / 10 = 83.1.

  • What does the median indicate in a data set?

    -The median is the midpoint of an ordered data set. If the number of values is odd, it is the middle value; if it is even, it is the average of the two middle values.

  • What is the mode and how is it identified?

    -The mode is the value that appears most often in a distribution. In the example from the transcript, 77 is the mode because it appears three times in the data set.

  • When are the mean, the median, and the mode used?

    -The mean is used with interval- or ratio-level data, the median with ordinal data, and the mode with categorical data. However, when there are outliers or the distribution is skewed, the median can be more representative than the mean.

  • What is variation in a distribution?

    -Variation is the degree to which the scores in a distribution differ from one another. It can stem from different factors, such as students' prior exposure to the subject or their study habits.

  • What is the variance and how is it calculated?

    -The variance is the average of the squared differences from the mean. It is calculated by subtracting the mean from each score, squaring each difference, and then averaging the results.

  • What is the standard deviation?

    -The standard deviation is the square root of the variance. It measures how spread out the scores are around the mean. A high value indicates a wider spread and a low value indicates a narrower spread.

  • How are the results of a normal distribution interpreted in terms of standard deviation?

    -In a normal distribution, approximately 68% of the scores fall within one standard deviation of the mean, 95% within two standard deviations, and 99.7% within three standard deviations.
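The three measures of central tendency described above can be checked with a short Python sketch. It uses the ten test scores quoted in the tutorial and the standard-library `statistics` module; the variable names are illustrative only and are not part of the tutorial.

```python
import statistics

# The ten test scores quoted in the tutorial.
scores = [69, 77, 77, 77, 84, 85, 85, 87, 92, 98]

# Mean: the sum of the scores divided by the number of scores.
mean = statistics.mean(scores)

# Median: with an even number of scores, the average of the two middle ones.
median = statistics.median(scores)

# Mode: the most frequent value.
mode = statistics.mode(scores)

print(f"mean={mean}, median={median}, mode={mode}")
# prints "mean=83.1, median=84.5, mode=77"
```

The results match the values stated in the tutorial: a mean of 83.1, a median of 84.5, and a mode of 77.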

Outlines

00:00

🎓 Introduction to descriptive statistics

In this section, Amanda Rockson introduces the tutorial on descriptive statistics and explains that it is the first of two parts. It covers distributions, measures of central tendency, dispersion, and how to calculate these statistics in SPSS and report them in APA format. It also introduces the concept of a distribution as a list of scores on a specific variable.

05:00

📊 Measures of central tendency: mean, median, and mode

This section explains the three main measures of central tendency: the mean, the median, and the mode. A distribution of ten scores on a statistics test is used to show how to calculate each measure. The mean is obtained by adding all the scores and dividing by the number of scores, the median is the midpoint of an ordered distribution, and the mode is the most frequent value.

10:01

🔍 Variation in scores: examples and causes

This section covers variation in students' scores in a statistics course and the concept of dispersion. It gives an example of how scores vary because of different factors, such as prior exposure and study time. It introduces the range and the dispersion of scores, which are key to interpreting statistics correctly.

15:03

📉 Why variation and dispersion matter

This section stresses the importance of understanding dispersion in the data, using a socioeconomic example. It shows how measures of dispersion, such as the range, the variance, and the standard deviation, give a more complete picture of the data. It also presents an example in which the mean is misleading because of outliers.
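The outlier effect described here can be illustrated with a minimal Python sketch. The income list below is an assumption built from the video's second scenario (a group of 20 students, three from families with a combined income of $1,000,000 and the rest from families at $60,000); the video does not list the individual values.

```python
import statistics

# Hypothetical incomes reconstructed from the video's second scenario:
# 3 families at $1,000,000 and 17 families at $60,000 (20 students in total).
incomes = [1_000_000] * 3 + [60_000] * 17

mean_income = statistics.mean(incomes)      # pulled upward by the three outliers
median_income = statistics.median(incomes)  # stays at the typical family income

print(f"mean   = ${mean_income:,.0f}")
print(f"median = ${median_income:,.0f}")
```

Under these assumptions the mean comes out near $200,000 even though 17 of the 20 families earn $60,000, which is why the median is the more representative figure in this scenario.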

20:04

📏 Measures of dispersion: range, variance, and standard deviation

This section explains the measures of dispersion in detail: range, variance, and standard deviation. It shows step-by-step calculations on a distribution of scores. The range is the difference between the highest and the lowest value, the variance measures how far the scores spread from the mean, and the standard deviation is the square root of the variance.
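The step-by-step calculation can be sketched in Python on the tutorial's ten test scores. One observation worth hedging: the figures quoted in the video (a variance of 70.54 and a standard deviation of about 8.39) match division by n - 1 = 9, while dividing by n = 10, as the narration describes, gives 63.49 and about 7.97. The sketch computes both so the difference is visible; the variable names are illustrative.

```python
import math

# The ten test scores quoted in the tutorial.
scores = [69, 77, 77, 77, 84, 85, 85, 87, 92, 98]
n = len(scores)

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Step 1: calculate the mean.
mean = sum(scores) / n

# Step 2: subtract the mean from each score and square the result.
squared_diffs = [(x - mean) ** 2 for x in scores]

# Step 3: average the squared differences.
sum_sq = sum(squared_diffs)
pop_variance = sum_sq / n            # population formula (divide by n)
sample_variance = sum_sq / (n - 1)   # sample formula (divide by n - 1)

# Step 4: the standard deviation is the square root of the variance.
pop_sd = math.sqrt(pop_variance)
sample_sd = math.sqrt(sample_variance)

print(f"range = {score_range}")
print(f"population: variance = {pop_variance:.2f}, sd = {pop_sd:.2f}")
print(f"sample:     variance = {sample_variance:.2f}, sd = {sample_sd:.2f}")
```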

25:05

🔬 Standard deviation and the normal distribution

This section explores how the standard deviation describes the spread of scores around the mean. It also introduces the bell curve, or normal distribution, and explains that about 68% of scores fall within one standard deviation of the mean, 95% within two, and 99.7% within three.
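The 68/95/99.7 figures can be reproduced with the standard library's `statistics.NormalDist`. The sketch below also computes the score bands for the tutorial's ten-score example, taking the mean of 83.1 and the standard deviation of 8.39 as stated in the video; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def share_within(k: float) -> float:
    """Share of a normal distribution lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

# Shares for 1, 2, and 3 standard deviations.
within = {k: share_within(k) for k in (1, 2, 3)}

# Score bands for the tutorial's example (values as stated in the video).
mean, sd = 83.1, 8.39
bands = {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}

for k in (1, 2, 3):
    low, high = bands[k]
    print(f"within {k} SD: {within[k]:.1%} of scores, from {low:.2f} to {high:.2f}")
```

The bands are 74.71 to 91.49, 66.32 to 99.88, and 57.93 to 108.27. They describe the scores only to the extent that the scores are approximately normally distributed, which ten test scores may not be.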

📊 Worked example: variance and standard deviation

This section presents a worked calculation of the variance and the standard deviation using the course points of five students. It explains how the variance is calculated first and the standard deviation second. It also compares the calculation for a full population with the calculation for a sample and highlights the difference between the two formulas.
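The five-student example can be sketched with the `statistics` module, which provides separate population (`pvariance`, `pstdev`) and sample (`variance`, `stdev`) functions. The fifth score is taken as 300 here; that is an assumption, chosen because it is the value consistent with the mean of 394 stated in the video, and the transcript text is unclear at that point.

```python
import statistics

# Course points for the five students. The last value (300) is an assumption
# consistent with the stated mean of 394.
points = [600, 470, 170, 430, 300]

mean = statistics.mean(points)

# Population formulas: divide the sum of squared differences by n.
pop_variance = statistics.pvariance(points)
pop_sd = statistics.pstdev(points)

# Sample formulas: divide by n - 1 instead.
sample_variance = statistics.variance(points)
sample_sd = statistics.stdev(points)

# Scores within one (population) standard deviation of the mean.
within_one_sd = [p for p in points if mean - pop_sd <= p <= mean + pop_sd]

print(f"mean = {mean}")
print(f"population: variance = {pop_variance}, sd = {pop_sd:.2f}")
print(f"sample:     variance = {sample_variance}, sd = {sample_sd:.2f}")
print(f"within one SD of the mean: {within_one_sd}")
```

With these inputs the population variance is 21,704 (standard deviation about 147) and the sample variance is 27,130 (standard deviation about 164). Three scores (470, 430, and 300) lie within one standard deviation of the mean.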

🔍 Comparing distributions and the spread of the data

This section compares three distributions with different means and standard deviations. It shows how the spread of the scores affects the interpretation of the data: a larger standard deviation indicates a wider spread and a smaller standard deviation indicates a narrower spread.

💻 Using SPSS to calculate descriptive statistics

This section explains the three main SPSS functions for calculating descriptive statistics: 'frequencies', 'descriptives', and 'explore'. Viewers are encouraged to explore these functions, and instructions are given for calculating measures of central tendency and dispersion both by hand and in SPSS.

📝 How to report descriptive statistics in APA format

The final section explains how to report descriptive statistics in APA format, either in narrative form or in tables. It emphasizes notation such as 'M' for the mean and 'SD' for the standard deviation, and it shows an example of reporting the results for a distribution in proper APA format.

Keywords

💡distribution

A distribution is a list of scores obtained on a specific variable. It is foundational to statistics and lets researchers analyze and understand data in a more structured way. In the video, a distribution of students' scores on a statistics test is used to explain concepts such as central tendency and dispersion.

💡central tendency

Central tendency is a statistical concept that describes a data set through values that indicate its 'center'. The video covers three measures of central tendency: the mean, the median, and the mode. These values are crucial for understanding the distribution of the data and for reporting its key characteristics.

💡mean

The mean is the arithmetic average of a group of scores, calculated by adding all the scores and dividing by the number of scores. In the video, an example of student scores is used to calculate the mean, which comes to 83.1. The mean is a measure of central tendency that indicates the average point of the data.

💡median

The median is the midpoint of an ordered distribution, dividing it into two equal halves. If the number of values is odd, the median is the middle value; if it is even, it is the average of the two middle values. In the video, the median of the students' scores is 84.5.

💡mode

The mode is the value that appears most often in a distribution. It is a measure of central tendency that indicates the most common value in the data. In the video, the score of 77 is the mode because it is repeated most often.

💡variation

Variation is the difference or diversity within a data set. It is fundamental to understanding how the data are dispersed and distributed. The video discusses how variation can be caused by different factors, such as study time or prior exposure to the subject in the case of the students' scores.

💡dispersion

Dispersion refers to how the data spread around the central tendency. It includes measures such as the range, the variance, and the standard deviation. In the video, dispersion is important for understanding the variability of the data and for interpreting measures of central tendency correctly.

💡range

The range is a measure of dispersion that represents the difference between the highest and the lowest score in a distribution. In the video, the range of the students' scores is calculated as 29 points (98 - 69).

💡variance

The variance is a measure of dispersion that indicates how much the data vary around the mean. It is calculated as the average of the squared differences between each value and the mean. In the video, the variance of the students' scores is given as 70.54.

💡standard deviation

The standard deviation is a measure of dispersion that indicates how the data spread around the mean. It is calculated as the square root of the variance. In the video, the standard deviation of the students' scores is given as 8.39.

💡SPSS

SPSS (Statistical Package for the Social Sciences) is statistical software used for data manipulation, statistical modeling, and reporting of analyses. The video mentions how to use SPSS to calculate measures of central tendency and dispersion, which simplifies the researcher's work and provides structured output for statistical analyses.

Highlights

Introduction to descriptive statistics, including definitions of distributions, central tendency, and dispersion.

Explanation of a distribution: a list of scores taken on a particular variable, with an example using 10 student test scores.

Introduction to measures of central tendency: mean, median, and mode, with clear examples of how to calculate each.

Demonstration of when to report different measures of central tendency based on data type: mean for interval/ratio data, median for ordinal data, and mode for categorical data.

Explanation of variation and why it's important in understanding data distribution, with examples from student scores and heart rates.

Introduction to measures of dispersion: range, variance, and standard deviation, with a focus on their importance in providing a complete picture of data.

Step-by-step guide to calculating variance and standard deviation, including an example using 10 test scores.

Discussion of how outliers can impact the mean and why the median might be more appropriate in certain cases.

Clarification of standard deviation's role in measuring the spread of scores around the mean, and its relevance in statistical analysis.

Use of a normal bell curve to explain how data distribution relates to standard deviation, with a breakdown of scores within 1, 2, and 3 standard deviations from the mean.

Detailed example calculating descriptive statistics for a group of five students' course points, demonstrating variance and standard deviation.

Important note on the difference between population and sample formulas for variance and standard deviation.

Explanation of how to calculate and interpret standard deviation and its influence on data spread using different distributions.

Introduction to using SPSS to calculate descriptive statistics, highlighting the key functions: frequencies, descriptives, and explore.

Guidance on reporting descriptive statistics in APA format, both in narrative form and through tables.

Transcripts

play00:00

[Music]

play00:09

welcome this is Amanda rockson appq and

play00:11

in this tutorial we are going to talk

play00:13

about descriptive

play00:14

statistics this is part one of a

play00:17

two-part tutorial in part one we are

play00:19

going to identify and Define

play00:22

distributions measures of central

play00:23

tendency dispersion and also briefly

play00:27

look at how to calculate descriptive

play00:28

statistics in SPSS

play00:30

as well as how to report them in APA

play00:32

format let's get

play00:35

started a distribution is foundational

play00:37

to everything we are going to talk about

play00:39

in this tutorial and it's very

play00:41

foundational to statistics a

play00:43

distribution is a list of scores taken

play00:46

on a particular variable for example the

play00:49

following is a distribution of 10

play00:51

students' scores on a statistics test arranged

play00:55

from highest to lowest you'll see

play00:57

that the distribution includes 69 77

play01:01

77 77 84 85 85 87 92 and

play01:08

98 a researcher can take a distribution

play01:11

and calculate measures of central

play01:13

tendency there are three measures of

play01:15

central tendency and you see them listed

play01:17

here the mean median and mode the mean

play01:20

is the arithmetic average of a group of

play01:23

scores or it's the sum of scores divided

play01:26

by the number of scores for example in

play01:29

our distribution of 10 scores we add 69

play01:32

+ 77 + 77 + 77 + 84 + 85 + 85 + 87 + 92

play01:40

+ 98 and we divide that by the number of

play01:43

scores which is 10 and we find that the

play01:45

mean is

play01:47

83.1 now the median is the middle of all

play01:51

the scores in the distribution when

play01:53

they are arranged from highest to lowest

play01:56

it's the midpoint of the distribution

play01:58

when the distribution has an odd number

play02:00

of scores or the number is halfway

play02:03

between the middle or two middle

play02:06

scores when the uh distribution is an

play02:08

even number so here for example we

play02:11

arrange our distribution of 10 test

play02:13

scores from lowest to highest and we see

play02:17

that the median is

play02:20

84.5 finally we have the mode the mode

play02:23

is the value with the greatest frequency

play02:26

in the distribution so here for example

play02:29

in our distribution we see that 77

play02:31

is the mode because it's listed most

play02:34

frequently it's listed three times in

play02:36

this

play02:38

distribution so in statistics the

play02:41

measure of central tendency that is

play02:43

reported by the researcher depends upon

play02:45

the type of data as well as the

play02:47

distribution of the data let's talk

play02:49

about this for categorical scores the

play02:51

researcher would report the mode the

play02:54

score that occurs most frequently to

play02:56

indicate what response is most typical

play03:00

for ordinal data or ordinal scores the

play03:02

researcher would use the median and for

play03:04

Interval ratio level scores quantitative

play03:07

data the researcher would use the mean

play03:10

however sometimes the researcher would

play03:12

also report the median to describe

play03:14

central tendency and here she would do

play03:17

this in cases where there may be

play03:18

outliers or the distribution of data is

play03:21

skewed in some manner because the mean

play03:24

is really less robust in cases like this

play03:27

and actually the median may then be

play03:30

more useful we're going to look at an

play03:33

example of this so that we can better

play03:35

understand it however before we do that

play03:37

let's talk a little bit about

play03:39

variation now variation occurs in every

play03:43

distribution before we Define um more

play03:46

specifically variance let's talk a little

play03:48

bit about what it is and why it exists I

play03:52

want you to think for a moment if we

play03:55

look at students course points in one

play03:58

online statistics course it's likely

play04:01

that if we had 20 students all of their

play04:04

course points would differ why would

play04:07

this

play04:09

be well maybe some students have had

play04:12

more exposure to statistics than others

play04:15

maybe some have put in more time

play04:16

studying these are variables that cause

play04:19

variation and the reason that we have

play04:21

what we consider the range in scores

play04:24

why let's say if the course points um

play04:27

ranged from anywhere from 0 to 1,000

play04:31

that's why we would have scores ranging

play04:33

anywhere from let's say 700 to

play04:36

1,000 let's look at another example

play04:39

let's say that if we would let's say

play04:41

that we measure everyone um in the

play04:45

class's heart rate again the score

play04:48

values would differ across all of the

play04:50

class members why do you think that the

play04:53

heart rates would

play04:56

differ well there's lots of reasons

play04:58

heart rates would differ

play05:00

um variables that might cause different

play05:03

heart rates or cause variation in heart

play05:05

rate across class members may include

play05:07

things such as sex age body weight maybe

play05:11

somebody is drinking a cup of coffee or

play05:13

even just thinking about

play05:15

statistics um maybe somebody recently

play05:18

exercised so on and so forth again these

play05:21

are variables that cause

play05:23

variation it's the reason that we have

play05:26

the range of

play05:28

scores why then is understanding this

play05:31

variation or this dispersion of scores

play05:34

or the distribution

play05:35

important without knowing something

play05:38

about how the data is dispersed or how

play05:40

it varies measures of central tendency

play05:43

can actually be misleading let's say

play05:46

that we know that socioeconomic status

play05:49

influences course grades so a researcher

play05:52

wants to know the socioeconomic status

play05:55

of a group of 10 students the group of

play05:58

10 students that we've been talking

play05:59

about these 10 students let's say

play06:02

come from families in which the mean

play06:04

income is

play06:06

$200,000 with very little variation from

play06:09

the mean meaning that most of the

play06:11

students come from families that make

play06:14

around

play06:16

$200,000 however this would be very

play06:20

different if we had a group of 20

play06:24

students who came from families in which

play06:26

the mean score was $200,000 also

play06:30

but what we knew about this group of

play06:32

students is that three of the students

play06:35

um parents or three of the students come

play06:37

from families that make a combined

play06:39

income of a million dollar and the other

play06:42

students come from families that make a

play06:44

combined income of

play06:48

$60,000 measures thus of dispersion

play06:51

provide a more complete picture of the

play06:53

data set

play06:55

here um so dispersion includes three

play06:59

measures it includes range variance and

play07:01

standard deviation and we're going to

play07:03

talk about those next but before we move

play07:05

on and talk about those I want to point

play07:08

out that this example also provides a

play07:10

good example of how the mean can be

play07:13

deceiving when there are outliers in our

play07:15

second scenario we have three outliers

play07:18

the students who come from families that

play07:20

make a million dollars really what we

play07:22

see is the majority of the families um

play07:26

have a combined income of $60,000

play07:30

our median here is then

play07:32

$60,000 and really provides a better

play07:35

picture of um central tendency than the

play07:39

mean does in this second

play07:42

scenario we are now going to look at the

play07:45

measures of dispersion range variance

play07:47

and standard

play07:49

deviation let's start with range I've

play07:52

already talked about range so you may

play07:53

have an idea of what range is but the

play07:55

range is the distance between the

play07:57

minimum score and the maximum score for

play08:00

example in our distribution of 10 test

play08:02

scores the range would be 29 because our

play08:04

highest test score was 98 our lowest was

play08:08

69 and 98 - 69 is

play08:11

29 now let's talk about variance the

play08:14

variance tells us about the variation of

play08:17

the data and it's actually foundational

play08:18

to uh the next term we're going to talk

play08:20

about which is standard

play08:22

deviation variance is defined as the

play08:25

average of the squared differences from

play08:27

the mean let's talk about how to

play08:29

calculate variance so we can better

play08:31

understand this definition to calculate

play08:33

variance the first thing that you need

play08:35

to do is calculate the mean we've

play08:37

already done that we know it's

play08:40

83.1 next for each number in the

play08:42

distribution subtract the mean for

play08:45

example we have the score of 98 98 minus

play08:49

we would take 98 minus

play08:51

83.1 for the score of 92 we take 92

play08:55

minus 83.1 and so

play08:57

on then we square that number or each of

play09:01

those numbers and that's known as the

play09:03

squared difference finally we calculate

play09:07

the average of the squared difference

play09:10

that is we add up the squared difference

play09:13

and we divide them by the number of

play09:15

scores which is 10 when we do this what

play09:18

we get is

play09:20

[Music]

play09:21

70.54 so the variance for the

play09:23

distribution that we've been looking at

play09:25

is 70.54 that brings me finally to

play09:28

standard deviation

play09:30

standard deviation is a commonly used

play09:33

measure of dispersion and it's simply

play09:36

the square root of the variance here we

play09:39

know the square root of the variance of

play09:41

70.54 is

play09:46

8.39 let's talk a little bit about more

play09:49

a little bit more about standard

play09:50

deviation because this is a

play09:53

really really important term and it's a

play09:55

and it's also a measure that's used in a

play09:58

lot of statistics that we're going to talk

play09:59

about in other

play10:01

tutorials standard deviation then is

play10:04

um the measure is a measure as I said of

play10:08

the dispersion of scores and it measures

play10:10

the dispersion of scores specifically

play10:13

around the mean in other words another

play10:15

way to think about standard deviation is

play10:18

how spread out a number of or a set of

play10:20

numbers are or how far numbers are from

play10:24

what's normal or what's

play10:26

average the larger the standard

play10:28

deviation then the larger the

play10:31

spread of scores around the mean and the

play10:34

smaller the standard deviation the

play10:36

smaller the spread of scores is going to

play10:38

be around the

play10:41

mean let's talk a little bit further

play10:43

about standard deviation and look at it

play10:46

in terms of a normal bell curve here we

play10:48

have a picture of a normal bell curve

play10:50

and for a normally distributed um data

play10:53

values here we can see that

play10:54

approximately

play10:56

68% of the distribution falls within

play10:59

one standard deviation of the mean

play11:02

95% of the distribution Falls within two

play11:05

standard deviations of the mean and

play11:08

99.7% of the distribution Falls within

play11:11

three standard deviations of the mean

play11:14

now let's talk about this in terms of

play11:16

the distribution that we've been looking

play11:17

at if our 10 scores were evenly

play11:21

distributed with the mean of

play11:25

83.1 this would mean that 68% of our

play11:29

scores would fall between

play11:33

91.49 and

play11:36

74.71

play11:38

now how um before we go on how did I

play11:44

get that number well I I got let's talk

play11:47

about um

play11:49

91.49 what I did was I knew that

play11:53

our mean was 83.1 and if I add our

play11:57

standard deviation of um

play12:00

8.39 what I find is is that the first

play12:03

standard or scores that are a one

play12:06

standard deviation above the mean are 8

play12:10

or 91 or equals 91.49 then if I take

play12:15

our mean again of

play12:18

83.1 and I want to know um scores that

play12:23

fall below one standard deviation of the

play12:25

mean or a score that falls below one

play12:27

standard deviation of the mean I'm going

play12:28

to find that if I take um 83.1 minus

play12:34

8.39 that I'm going to get

play12:36

74.71 okay so that's how I'm get that's

play12:40

how I'm getting those numbers so let's

play12:42

go back so if our 10 scores again were

play12:45

evenly distributed with the mean of

play12:48

83.1 we would know then that 68% of our

play12:51

scores would fall within one standard

play12:54

deviation of the mean that means our

play12:56

scores would fall between 91.49 and

play13:01

74.71

play13:02

then we would also know that 95%

play13:06

of our scores would range between

play13:17

99.88 and 66.32 and 99.7% of our scores would fall between

play13:21

108.27 and

play13:24

57.93 now that we've defined

play13:28

measures of central tendency and also

play13:31

measures of dispersion we're

play13:33

going to take a little bit closer look

play13:35

at measures of dispersion just to make

play13:37

sure that we have a clear understanding

play13:40

of

play13:41

them we're going to walk through this

play13:43

example and calculate variance and

play13:45

standard deviation while I'm aware that

play13:48

software such as SPSS automatically

play13:50

calculates these values for you you

play13:53

still need to understand the statistic

play13:55

and how it works and be able to uh know

play13:58

how to calculate it

play13:59

um because this will really help you

play14:02

understand um what the values represent

play14:05

so that you can clearly describe and

play14:07

interpret them when you get them in the

play14:09

SPSS output so let's um take a look

play14:13

at this example an educational

play14:15

statistics Professor has a class of five

play14:17

students the students um course points

play14:21

range from 0 to 600 and the following

play14:25

are the students um points there's a

play14:27

student that has 600 points a

play14:29

student that has 470 points a student

play14:32

that has 170 points a student that has

play14:36

430 points and a student that has 300

play14:41

points now remember that variance is

play14:44

foundational to standard deviation so we

play14:46

first need to calculate the variance the

play14:49

first step to calculating the variance

play14:52

is to calculate the mean and remember we

play14:55

calculate the mean by adding all of the

play14:57

scores and dividing it by the number of

play14:59

scores as you can see I've done here

play15:03

here we find that the mean is

play15:07

394 for each number in the distribution

play15:10

we then subtract the mean and here you

play15:13

can see all of the numbers with the mean

play15:16

subtracted for example

play15:19

600 -

play15:21

394 =

play15:23

206 and so on and so

play15:26

forth once those numbers are

play15:29

calculated we then Square those numbers

play15:33

remember those numbers are known as

play15:35

squared difference or squared differences

play15:38

we then add the squared differences and

play15:41

divide them by the number of scores in

play15:43

this case remember we have five and we

play15:46

by doing this we find that the variance

play15:48

is

play15:53

21,704 now that we know the variance we

play15:55

can calculate the standard deviation the

play15:58

standard deviation remember is the

play16:00

square root of the variance here the

play16:03

square root of

play16:06

21,704 is

play16:13

147.32 so the standard deviation in this example is

play16:18

147 now that we know the standard

play16:20

deviation as well as the variance we as

play16:24

educational researchers or social

play16:25

science researchers can use the standard

play16:28

deviation to determine which scores are

play16:30

within one standard deviation of the

play16:32

mean here again I've listed um just as a

play16:35

reminder the distribution of scores we

play16:38

know that the mean is 394 and the

play16:40

standard deviation is 147 so how do

play16:45

we use these scores well scores between

play16:48

220 and

play16:50

541 fall within one standard deviation

play16:54

of the mean again in order to find the

play16:57

220

play16:59

um one standard deviation below the mean

play17:02

we we take 394 and we subtract 147 which

play17:07

is the mean minus the standard deviation

play17:10

and then in order to calculate one

play17:12

standard deviation above the mean we

play17:14

take 394 plus 147 and we find

play17:18

541 so again we know that the scores

play17:21

that range between 247 and

play17:24

541 fall within one standard deviation

play17:27

of the mean and what we can see if we

play17:29

look at our distribution is that we have

play17:31

three scores that fall within one

play17:34

standard deviation of the mean we can

play17:38

also note if we look at our distribution

play17:40

that one of the students is a really

play17:43

high achiever with a score of 600

play17:45

whereas one of the students is somewhat

play17:48

of a low achiever with a score of

play17:53

170 before we move on from this example

play17:55

I want to discuss two more important

play17:57

points you'll note that the formula that

play18:00

we used to calculate both variance and

play18:03

standard deviation assumed that we were

play18:05

using a population of five

play18:08

students so what we did when we

play18:11

calculated variance is we divided by n

play18:13

or five in this case however if the data

play18:16

was from a sample for example let's say

play18:19

that we had 10 students in the class and

play18:21

we randomly selected five of them we

play18:24

would have divided by n minus one or 5 -

play18:27

one which would be four when

play18:30

calculating the variance so if we did

play18:33

this what we would find is that our

play18:34

sample variance was

play18:37

27,130 and our standard deviation was

play18:42

164 so this is just an

play18:45

important point to note that the

play18:47

population formula here is different

play18:51

from the sample formula and

play18:53

you'll find that to be the case or to be

play18:56

true in a number of the statistics we'll

play18:58

discuss in these

play19:00

tutorials I then also wanted to take

play19:02

time to refer back to a point that I

play19:04

made earlier remember when defining

play19:06

standard deviation I said that the

play19:08

larger the standard deviation the larger

play19:11

the spread of scores around the mean and

play19:12

each

play19:13

other here we have an example of

play19:17

three distributions you'll first see at

play19:19

the top the distribution that we've been

play19:21

talking about where the mean is 394 and

play19:24

the standard deviation is 147 with

play19:29

our scores ranging from 170 to

play19:32

600 let's say that we took another

play19:35

sample of scores or rather we looked at

play19:37

another sample of scores and this

play19:39

distribution ranged from 70 to 600 so

play19:43

the range is a little bit larger what we

play19:47

see here then is that the mean is

play19:49

334 so it's a little bit

play19:51

lower and that our standard deviation is

play19:57

236 what we note is that the lower

play20:00

score of 70 influenced the mean and we

play20:04

also can note a

play20:06

larger standard deviation which implies

play20:10

which we can see by examining the scores

play20:12

and which we talked about is that the

play20:14

scores are more spread

play20:16

out let's say then we have one more

play20:20

we take data from one more sample

play20:22

population and this

play20:25

distribution is not as spread out the

play20:27

range is only from 370 to 600 and

play20:31

what we find is that the mean is 454

play20:35

and the standard deviation is

play20:38

89.61 in this case what we see is that

play20:41

the mean is

play20:44

larger in this case because the

play20:47

scores are higher however we then

play20:50

see that the standard deviation is

play20:53

smaller implying that these scores are

play20:56

not very spread out at all so so this is

play20:59

an example of what I meant when I said

play21:01

the larger the standard deviation the

play21:03

larger the spread of scores and then the

play21:05

opposite is also true the smaller the

play21:07

standard deviation the smaller the

play21:09

spread of scores

play21:11

now that we understand

play21:13

how to calculate measures of central

play21:16

tendency as well as dispersion let's

play21:17

take a look at how we can do that in

play21:20

SPSS now there are three functions that

play21:23

can be used when you want to calculate

play21:26

descriptive statistics in SPSS

play21:29

you can use frequencies descriptives or

play21:32

explore so

play21:33

from Analyze

play21:36

you go to descriptive statistics and

play21:38

then you can choose frequencies

play21:40

descriptives or

play21:41

explore each of these has different

play21:44

features and different functions within

play21:47

so I encourage you to explore them

play21:50

however for the most part you can

play21:52

calculate measures of central

play21:54

tendency and dispersion using these

play21:57

three functions
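
The same measures can also be checked outside SPSS. What follows is a minimal Python sketch, not part of the original tutorial, using the standard library `statistics` module. The five scores are an assumption: they are not listed in this part of the transcript and were chosen to be consistent with the mean of 394, the range of 170 to 600, and the variances quoted earlier.

```python
import statistics

# ASSUMED scores (not listed in this part of the transcript); chosen to
# match the stated mean of 394 and the range of 170 to 600.
scores = [600, 470, 170, 430, 300]

# Measures of central tendency
mean = statistics.mean(scores)          # sum of scores / number of scores
median = statistics.median(scores)      # middle score after sorting

# Measures of dispersion
score_range = max(scores) - min(scores)
pop_var = statistics.pvariance(scores)  # population formula: divide by N
pop_sd = statistics.pstdev(scores)      # square root of population variance
samp_var = statistics.variance(scores)  # sample formula: divide by n - 1
samp_sd = statistics.stdev(scores)      # square root of sample variance

# Scores within one (population) standard deviation of the mean
low, high = mean - pop_sd, mean + pop_sd
within_one_sd = [x for x in scores if low <= x <= high]

print(f"mean={mean}, median={median}, range={score_range}")
print(f"population: variance={pop_var}, SD={pop_sd:.2f}")
print(f"sample:     variance={samp_var}, SD={samp_sd:.2f}")
print(f"within one SD ({low:.0f} to {high:.0f}): {within_one_sd}")
```

With these assumed scores, dividing by N gives the population figures and dividing by n - 1 gives the larger sample figures, which is the distinction made earlier in the tutorial.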

play22:00

once you've calculated your descriptive

play22:02

statistics using either SPSS or

play22:05

calculating them by hand you are then

play22:07

ready to report them when you're

play22:09

reporting descriptive statistics you can

play22:12

either use the name and say the mean is

play22:16

or you can use notations here are the

play22:19

notations for different descriptive

play22:23

statistics so for the mean you can see

play22:26

that it's a capital M for the standard

play22:28

deviation you can use a

play22:31

capital SD I will note here that

play22:34

whenever you report notations per APA

play22:37

they should be italicized so it's really

play22:40

important to make sure that your

play22:41

statistics are

play22:43

italicized so using the example

play22:48

distribution that we've been talking

play22:49

about here's how I might report the

play22:52

descriptive statistics for that example

play22:55

I would say that I have a convenience

play22:56

sample of undergraduate students

play23:00

enrolled in a statistics course and as

play23:02

you can see here that n which means the

play23:05

entire sample population equals

play23:08

five and then I tell the reader that

play23:12

the course points ranged from 170 to 600

play23:17

now the measure of

play23:18

central tendency that I'm going to

play23:20

report is the mean because I'm dealing

play23:22

with course points which are measured at

play23:25

the ratio level of measurement so I

play23:28

report the mean and then I also can

play23:31

report the standard deviation so this is

play23:34

an example of how I would report scores

play23:37

or specifically descriptive

play23:40

statistics using a

play23:44

narrative then here is an example of how

play23:46

I would report descriptive statistics

play23:49

using a table now let's say that instead

play23:52

of just having five undergrad

play23:55

students to look at I actually want to

play23:57

look at two sets of undergraduate

play24:00

students students who take maybe a

play24:02

statistics course online and students

play24:04

who take it

play24:05

residentially and I'm not just going to

play24:07

look at course points here maybe I want

play24:08

to look at their scores on a test

play24:11

also so here I have a descriptive

play24:13

statistics table disaggregated by the

play24:15

type of course at the top in the title

play24:17

you'll note that

play24:21

capital N equals 10 because that's my

play24:23

entire sample population that I'm

play24:25

looking at but then down below for

play24:28

online I have n equals 5 and you'll note

play24:31

that that's a lowercase n because that

play24:33

indicates that's a group within the

play24:35

entire sample population and you can see

play24:37

that I looked at five online

play24:39

students and five residential students

play24:42

you can then also see in this table that

play24:44

we have the mean and the standard

play24:45

deviations listed so this is an example

play24:48

of an APA formatted table that could be

play24:52

used to report descriptive

play24:55

statistics this then concludes part one

play24:58

of the tutorial on descriptive

play24:59

statistics at this point you should be

play25:01

able to define what a distribution is

play25:04

and explain what it is you should be

play25:06

able to identify define and calculate

play25:11

measures of central tendency as well as

play25:13

dispersion know how to go about

play25:16

calculating descriptive statistics in

play25:18

SPSS and then also reporting them in APA

play25:21

format
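
As a recap of the worked example, the two variance formulas can be written side by side. This is a sketch using the figures quoted in the tutorial. The sum of squared differences, 108,520, is not stated in the transcript; it is inferred here as 5 × 21,704.

```latex
% Population formulas: divide by N (here N = 5)
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}
         = \frac{108{,}520}{5} = 21{,}704,
\qquad
\sigma = \sqrt{21{,}704} \approx 147.32

% Sample formulas: divide by n - 1 (here n - 1 = 4)
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
    = \frac{108{,}520}{4} = 27{,}130,
\qquad
s = \sqrt{27{,}130} \approx 164.71
```

Because the same sum is divided by a smaller number, the sample variance and sample standard deviation are always somewhat larger than the population versions computed from the same scores.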
