Forms of Validity in Research and Statistics

Daniel Storage
24 Jun 2019 · 09:02

Summary

TLDR: This video delves into the concept of validity in research, complementing the previously discussed concept of reliability. Validity ensures the accuracy of measurements, crucial for drawing correct conclusions in inferential statistics. The presenter explains that while reliability can exist without validity, validity requires reliability. Three types of validity are explored: content, criterion, and construct validity, each with its nuances and examples. The video emphasizes the importance of these concepts for making accurate inferences about populations.

Takeaways

  • 🔍 Validity in research refers to the accuracy of measurements, whereas reliability is about the consistency of those measurements.
  • 📊 Reliability and validity are crucial for inferential statistics, which use sample data to make conclusions about populations.
  • 📚 You can have a reliable measure without validity, but for a measure to be valid, it must also be reliable.
  • 📝 Content validity ensures that a test or measurement tool covers a representative sample of the entire content area it aims to assess.
  • 🎯 Criterion validity assesses whether a test reflects abilities in a current or future setting, with two types: concurrent criterion validity (current abilities) and predictive validity (future abilities).
  • 🔮 Construct validity is about whether a test truly measures the theoretical construct it claims to, such as aggression or intelligence.
  • 📋 The script uses the example of an IQ test to illustrate why validity requires reliability: if the same person's scores vary widely between administrations, the test's validity is called into question.
  • 📐 The video script highlights that validity is not just theoretical but can be tested, for example, by correlating test scores with real-world behaviors.
  • 📖 The importance of validity and reliability is emphasized for making accurate conclusions about populations, which is the goal of inferential statistics.
  • 🔑 The script concludes by hinting at the next topic, hypothesis testing, which is a method for making conclusions about populations based on sample data.

Q & A

  • What is the main difference between reliability and validity in research?

    -Reliability refers to the consistency of measurements, ensuring that repeated measurements under the same conditions yield the same results. Validity, on the other hand, is about the accuracy of those measurements, ensuring that they truly measure what they are intended to measure.

  • Why are reliability and validity crucial for inferential statistics?

    -Reliability and validity are essential for inferential statistics because they ensure that the conclusions drawn from sample data accurately represent the population. Without reliable and valid measurements, the inferences made about the population could be incorrect, undermining the purpose of the research.

  • Can a study be reliable without being valid?

    -Yes, a study can be reliable without being valid. An example given in the script is measuring thumb size to assess intelligence, which could be consistent (reliable) but is not an accurate (valid) measure of intelligence.

  • What is content validity and why is it important?

    -Content validity is the degree to which a test or measurement tool covers all the relevant content within a specific domain. It is important because it ensures that the test items truly reflect the entire universe of possible items in that domain, providing a comprehensive assessment.

  • How does criterion validity differ from content validity?

    -Criterion validity assesses whether a test accurately reflects a set of abilities in a current or future setting, whereas content validity focuses on whether the test items represent the entire content area. Criterion validity can be further divided into concurrent criterion validity, which assesses current abilities, and predictive validity, which predicts future performance.

  • What is the purpose of concurrent criterion validity?

    -The purpose of concurrent criterion validity is to determine if a test accurately assesses a person's current level of ability or knowledge. It is used to see if the test results correlate with what the test-taker has learned up to the present moment.

  • How is predictive validity used in standardized testing?

    -Predictive validity is used in standardized testing to forecast a student's future academic performance. For instance, SATs and GREs aim to predict how well a student will perform in college or graduate school, respectively, based on their test scores.

  • What is a construct in psychology and why is it important for construct validity?

    -A construct in psychology is a group of interrelated variables that represent an abstract concept. It is important for construct validity because it ensures that the test or measurement tool is actually measuring the intended psychological construct, such as aggression or intelligence.

  • How can construct validity be empirically tested?

    -Construct validity can be empirically tested by correlating the results of a measurement tool with actual outcomes or behaviors. For example, if an aggression scale is developed, one would look for a correlation between the scale scores and real-life aggressive behaviors.

  • Why is it necessary to have both reliability and validity for accurate conclusions in research?

    -Both reliability and validity are necessary because reliability alone does not guarantee that the measurements are meaningful or accurate. Validity ensures that the measurements are meaningful, but without reliability, the measurements cannot be trusted to be consistent. Together, they allow researchers to draw accurate and meaningful conclusions about the population.

Outlines

00:00

🔍 Understanding Research Validity and Reliability

This paragraph introduces the concept of validity in research, distinguishing it from reliability. Reliability refers to the consistency of measurements, while validity is about the accuracy of those measurements. The speaker emphasizes the importance of both for inferential statistics, which uses sample data to draw conclusions about populations. The relationship between reliability and validity is explored, stating that while reliability can exist without validity, validity requires reliability. Two examples illustrate these concepts: thumb size is a reliable but invalid measure of intelligence, and an IQ test that yields inconsistent scores for the same person over time is unreliable and therefore cannot be valid.

05:03

📚 Exploring Types of Validity in Research

The paragraph delves into the three types of validity: content validity, criterion validity, and construct validity. Content validity assesses whether a test's sample of items accurately represents the entire content area. Criterion validity evaluates how well a test measures current abilities (concurrent criterion validity) or predicts future performance (predictive validity). Construct validity, which the speaker describes as the truest form of validity, concerns whether a test truly measures the theoretical construct it claims to, such as aggression. The speaker also mentions other forms of validity like face validity and stresses the importance of validity and reliability in making accurate conclusions about populations, which will be further explored through hypothesis testing.

Keywords

💡Validity

Validity in research refers to the accuracy of the measurements taken, ensuring that they truly reflect what they are intended to measure. In the video, the speaker emphasizes the importance of validity in making accurate conclusions about populations. For instance, the speaker discusses how a test must be a valid measure of intelligence to be useful, contrasting it with thumb size, which, while reliably measurable, is not a valid indicator of intelligence.

💡Reliability

Reliability pertains to the consistency of measurements in research. The video script explains that while reliability does not guarantee validity, it is a necessary condition for it. The speaker uses the example of an IQ test where the same test given on different occasions should yield similar results to be considered reliable, which is essential for the test to also be valid.

💡Inferential Statistics

Inferential statistics is the process of drawing conclusions about populations from sample data. The video script mentions that both reliability and validity are crucial for inferential statistics because without them, the conclusions drawn from data could be inaccurate, undermining the purpose of research.

💡Content Validity

Content validity is used to assess whether a test covers a representative sample of the content it is designed to measure. In the video, the speaker uses the example of a statistics test, where good content validity would mean the test includes a variety of questions that cover the full range of statistical concepts taught, rather than just focusing on calculating the mean.
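The statistics-test example can be sketched as a simple coverage check. This is only an illustration, not a formal content-validity procedure: the topic list `COURSE_TOPICS` and the helper `content_coverage` are hypothetical names, and the video mentions only some of these topics.

```python
# Hypothetical list of topics taught in the course (an assumption for illustration).
COURSE_TOPICS = {"mean", "standard deviation", "correlation", "t-test"}


def content_coverage(question_topics, course_topics=COURSE_TOPICS):
    """Return the fraction of course topics touched by at least one question."""
    covered = set(question_topics) & set(course_topics)
    return len(covered) / len(course_topics)


# An exam that only asks for the mean versus one that samples every topic.
narrow_exam = ["mean", "mean", "mean"]
broad_exam = ["mean", "standard deviation", "correlation", "t-test"]

print(content_coverage(narrow_exam))  # 0.25
print(content_coverage(broad_exam))   # 1.0
```

A low fraction flags an exam that samples only a small part of the content, which is the situation the video describes as bad content validity.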

💡Criterion Validity

Criterion validity assesses how well a test measures abilities in a current or future setting. The video explains two types: concurrent criterion validity, which measures current abilities, and predictive validity, which predicts future performance. The speaker illustrates this with the example of SATs and GREs, which are designed to predict future academic success.

💡Construct Validity

Construct validity is the degree to which a test measures the theoretical construct it is designed to assess. In the video, the speaker defines a construct as an abstract idea composed of interrelated variables, such as aggression, which includes behaviors like verbal insults and relational manipulation. The speaker emphasizes the importance of demonstrating that a test actually measures the construct it claims to, using the example of correlating an aggression scale with real-world aggressive behaviors.
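The video does not name a specific coefficient for this correlation. One common choice, assumed here for illustration, is the Pearson correlation, where $x_i$ is person $i$'s score on the aggression scale and $y_i$ is an observed measure of that person's real-world aggressive behavior:

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

A value of $r$ near 1 would support the claim that the scale tracks the behavior it is meant to capture, while a value near 0 would count against it.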

💡Test-Retest Reliability

Test-retest reliability is a measure of how consistently a test yields the same results when administered multiple times. The video script uses this concept to illustrate the difference between a reliable and unreliable test, explaining that an unreliable test cannot be valid because its inconsistent results prevent accurate measurement of the intended construct.
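Test-retest reliability is commonly quantified as the correlation between scores from two administrations of the same test. The sketch below assumes a Pearson correlation and uses made-up scores for five people; the helper `pearson_r` and all of the data are hypothetical and do not come from the video.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up IQ scores for the same five people on two occasions.
first_test = [120, 105, 98, 110, 90]
stable_retest = [118, 107, 97, 111, 92]     # scores barely move
unstable_retest = [94, 112, 109, 88, 104]   # scores jump around

print(round(pearson_r(first_test, stable_retest), 2))
print(round(pearson_r(first_test, unstable_retest), 2))
```

A correlation close to 1 for the stable retest indicates consistent measurement. A low or negative correlation, as with the unstable retest, mirrors the 120-then-94 example: the test is unreliable, so it cannot be valid.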

💡Construct

In psychology, a construct refers to an abstract concept that is made up of a group of interrelated variables. The video script explains that constructs are complex ideas like aggression, which encompass various behaviors and are not directly measurable, requiring the use of tests with construct validity to assess them.

💡Hypothesis Testing

Hypothesis testing is a process used to make conclusions about populations based on sample data. The video script mentions that, having covered validity and reliability, the speaker will turn to hypothesis testing next.

💡Standardized Tests

Standardized tests are assessments that are administered and scored in a consistent manner to all test-takers. The video script refers to tests like the SAT and GRE as examples of standardized tests designed to have predictive validity, aiming to forecast students' future academic performance.

Highlights

Validity in research is about the accuracy of measurements, in contrast to reliability which focuses on consistency.

Reliability and validity are crucial for inferential statistics, which uses sample data to make conclusions about populations.

A rule of thumb: you can have reliability without validity, but you need reliability for validity.

Example of a reliable but invalid measure: measuring thumb size as an indicator of intelligence.

Example of needing reliability for validity: an IQ test with inconsistent results over time.

Content validity ensures a sample of items reflects the entire universe of possible items on a topic.

Bad content validity example: a test that only asks about calculating the mean.

Good content validity example: a test that covers various statistical concepts like mean, standard deviation, and t-tests.

Criterion validity assesses if a test reflects abilities in a current or future setting.

Concurrent criterion validity is about assessing current abilities, like a midterm exam.

Predictive criterion validity predicts future performance, like SATs or GREs predicting college success.

Construct validity is about the degree to which a test measures the construct it claims to measure.

A construct in psychology is a group of interrelated variables that represent an abstract idea, like aggression.

Construct validity can be tested by correlating test results with real-world observations of the construct.

The importance of validity and reliability for making accurate conclusions about populations.

Hypothesis testing is the process through which conclusions about populations are made from sample data, which is why reliable and valid measurements matter.

Transcripts

play00:00

in this video we're gonna take just a

play00:02

few moments to talk about validity in

play00:04

research we've already spoken quite a

play00:06

bit about reliability which is all about

play00:08

the consistency of your measurements but

play00:11

now we're gonna focus on validity which

play00:12

is more about the accuracy of your

play00:14

measurements I'm gonna talk about three

play00:16

different forms of validity in a little

play00:18

bit but I want to take a step back and

play00:20

just talk a little bit more generally

play00:22

about reliability and validity and where

play00:24

these fit into each other and also into

play00:26

inferential statistics more broadly so

play00:29

remember we're headed toward inferential

play00:30

statistics which is all about using

play00:33

sample data to make conclusions about

play00:35

populations to answer questions about

play00:38

the world reliability and validity are

play00:40

really important for inferential

play00:41

statistics because if you have studies

play00:43

that are not reliable and or not valid

play00:46

you're gonna end up making inaccurate

play00:48

conclusions about the world on the basis

play00:51

of your data which defeats the whole

play00:53

purpose of what we're trying to do so I

play00:56

want to talk for a moment about the

play00:57

relationship between reliability and

play00:59

validity so here's the rule of thumb

play01:01

that you should kind of go by you can

play01:03

have reliability without validity but

play01:06

you need reliability for validity so let

play01:09

me illustrate with an example or two

play01:10

here's an example let's say I'm a

play01:12

researcher and I'm convinced that

play01:13

thumb size is the best way to measure

play01:16

intelligence well think about this this

play01:19

would be a pretty reliable sort of

play01:21

measure that we can use right think

play01:23

about test retest reliability I can

play01:24

measure your thumb now and I can measure

play01:26

it a week from now and a month from now

play01:28

and chances are I'll get the same exact

play01:30

measurement in centimeters or

play01:31

millimeters or whatever every time I

play01:33

measure your thumb however do we think

play01:37

that thumb size is is a valid I should

play01:39

say measure of intelligence is it an

play01:41

accurate measure of intelligence

play01:43

probably not so this is an example of a

play01:46

reliable study but a very invalid one

play01:49

but now think about the flipside needing

play01:52

reliability for validity we've seen an

play01:54

example of this already if I'm

play01:56

developing an IQ test and you take it

play01:58

tomorrow and you get a 120 and then I

play02:01

give it to you again and you take it and

play02:03

you get a 94 well this is really

play02:06

unreliable right we have very poor test

play02:09

retest reliability in this case and as a

play02:12

result you would probably say the test

play02:13

is not a valid measure of intelligence

play02:15

it's not a valid IQ test because I don't

play02:18

expect that you got 25 points or 30

play02:21

points or whatever less intelligent over

play02:23

time so again you can have reliability

play02:27

without validity but you need

play02:28

reliability for validity ok so now let's

play02:32

talk about those three different types

play02:34

of validity that I mentioned earlier

play02:36

first of all we're gonna talk about

play02:37

content validity next we'll talk about

play02:40

criterion validity and we'll end by

play02:42

talking about construct validity notice

play02:45

that all three forms of validity start

play02:47

with C this is kind of convenient

play02:49

because it makes it easier to remember

play02:50

but also kind of problematic because it

play02:52

makes it slightly more difficult to

play02:54

differentiate between them but I'll do

play02:56

my best to help you out in that

play02:57

department so let's start with content

play03:00

validity content validity is used when

play03:04

you want to know whether a sample of

play03:05

items truly reflects an entire universe

play03:08

of possible items in a certain topic all

play03:11

right that's pretty wordy but I can

play03:13

illustrate this very simple idea with an

play03:15

example let's say as a professor of

play03:17

statistics I want to design a test to

play03:19

measure my students' ability in

play03:21

statistics pretty reasonable right but

play03:23

we need to make sure that this test has

play03:25

good content validity so let me

play03:27

illustrate here's an example of a test

play03:30

that has bad content validity question

play03:32

one calculate the mean two calculate the

play03:35

mean 3 calculate the mean and so on

play03:37

think about our definition this doesn't

play03:39

accurately kind of sample a really

play03:42

representative set of questions that

play03:44

covers the entire content that we're

play03:47

learning about in statistics

play03:49

instead I might want to ask you some

play03:51

things like this on the final exam for

play03:53

example calculate the mean maybe do a

play03:55

standard deviation perhaps a correlation

play03:57

a t-test you know some other things and

play04:00

so on this is much better this has

play04:02

higher and greater content validity

play04:04

because I'm doing a better job of

play04:05

covering all the content we learned

play04:07

about in the test okay so next let's

play04:11

talk about criterion validity criterion

play04:14

validity is used when you want to assess

play04:16

whether a test reflects a set of

play04:18

abilities in a current or future setting

play04:20

and there are two types of criterion

play04:22

validity that are really important to

play04:24

know in order to fully understand what

play04:26

criterion validity is

play04:27

about first we have concurrent criterion

play04:30

validity so think about what concurrent

play04:32

means here it means here and now at the

play04:34

same time etc etc so concurrent

play04:37

criterion validity is all about this

play04:39

question here does this test accurately

play04:41

assess again validity accuracy does this

play04:45

test accurately assess my students'

play04:47

current level of ability so whenever

play04:49

you're given a final exam or you're

play04:51

given a you know midterm in a class it's

play04:53

all about concurrent criterion validity

play04:55

this is what your professors are

play04:56

interested in if you understand 95% of

play04:59

what we've learned do you get a 95% on

play05:02

the exam we would hope that you do and

play05:05

that would be a good example of solid

play05:07

concurrent criterion validity a

play05:09

different form of criterion validity

play05:11

however is more interested in predicting

play05:13

things about the future this is called

play05:16

predictive validity and it's kind of

play05:18

focused on this question here does this

play05:21

test accurately assess how my students

play05:23

will do in the future on the final exam

play05:26

for example can I predict how well

play05:28

you're gonna do in the future on the

play05:29

basis of this exam that I'm giving you

play05:31

right now you could think for example

play05:34

about SATs and GREs a lot of these

play05:36

standardized tests that are requirements

play05:39

for you know different levels in school

play05:41

that you're trying to enter into that's

play05:43

what these are all about we're trying to

play05:45

give you an SAT as a as a way as an

play05:47

extra method for us to sort of make a

play05:49

best guess about how you're gonna do in

play05:51

college for the GRE we're giving you you

play05:54

know this test to predict how well

play05:56

you're gonna do in graduate school now

play05:58

how well these standardized tests

play05:59

actually accomplish this goal is perhaps

play06:01

another issue altogether another topic

play06:03

of discussion but for now it's important

play06:05

to just understand that this is the goal

play06:07

behind the tests to have good predictive

play06:09

validity so we're gonna end by talking

play06:12

about construct validity I'll note by

play06:14

the way that these you know three forms

play06:17

of validity are by no means

play06:18

all-inclusive there are others like face

play06:20

validity and you know there's lots of

play06:22

others but these are by far the most

play06:23

common I would say so this is what I'm

play06:25

focusing on here but before I get to

play06:28

construct validity I have to take just a

play06:30

second or two to define what a construct

play06:32

is so in psychology a construct is a

play06:35

group of interrelated variables that you

play06:37

care about think about most things in

play06:40

psychology that you

play06:41

might be interested in measuring

play06:42

aggression for example aggression is a

play06:45

complicated thing there's all sorts of

play06:46

things that make up this construct this

play06:49

abstract idea of aggression it might

play06:51

involve your behaviors it might involve

play06:53

physical behaviors pushing shoving it

play06:56

might involve verbal behaviors verbal

play06:58

aggression

play06:59

you know name-calling insults there's

play07:01

also relational aggression where you

play07:04

know we're talking about manipulating

play07:05

friendships and ruining relationships

play07:07

and things like that there's all sorts

play07:10

of stuff that goes into aggression and

play07:12

this is what I mean by interrelated

play07:14

variables but the overall idea that's

play07:17

kind of behind these interrelated

play07:19

variables is your construct so construct

play07:22

validity is sort of the truest validity

play07:25

in in terms of how we typically think

play07:26

about it construct validity is the

play07:29

degree to which your test measures the

play07:31

construct it claims to measure for

play07:34

example am I really measuring aggression

play07:37

and this is so crucial people argue

play07:40

about this in the field all the time and

play07:42

when you write a paper you have to

play07:43

defend the construct validity of what

play07:45

you've done you have to prove or at

play07:47

least convince other people as much as

play07:48

possible that you know I'm actually

play07:50

measuring the thing that I'm trying to

play07:52

measure and that's an issue of construct

play07:54

validity just as one final note a way to

play07:57

sort of get at the construct validity is

play07:59

to correlate your scale for example or

play08:02

your test or whatever you did to actual

play08:05

outcomes if you developed an aggression

play08:07

scale that's designed to measure

play08:09

aggression do you find a correlation

play08:11

between how people do on your scale how

play08:13

aggressive they appear to be on your

play08:15

scale and how much aggression they

play08:17

actually show in real life can you

play08:19

correlate your scale with real tangible

play08:22

observations of behaviors and that's an

play08:24

example of how we might address

play08:25

something like this in most cases when

play08:28

we think about validity it's sort of

play08:30

something conceptual that we're focusing

play08:32

on and you know we're just arguing at a

play08:34

sort of theoretical level about whether

play08:36

what we're doing is valid or not but

play08:37

there are ways like this to sort of put

play08:39

some numbers to this and and test it so

play08:42

in general again why do we care about

play08:45

validity and why do we care about

play08:46

reliability because it helps us to make

play08:48

accurate conclusions about populations

play08:52

and that's where we're gonna head next

play08:55

how do we make those conclusions about

play08:56

populations and it's going to be through

play08:58

a process called

play08:59

hypothesis testing
