SEM Series (2016) 3. Exploratory Factor Analysis (EFA)
Summary
TL;DR: This video script details a comprehensive guide to conducting Exploratory Factor Analysis (EFA). It emphasizes the importance of reflective latent measures, excluding formative and categorical variables, and setting up the analysis with maximum likelihood extraction. The speaker discusses the significance of KMO and Bartlett's test, communalities, and factor extraction. They address issues like cross-loadings and low factor loadings, suggesting strategies for resolving them. The script concludes with reporting the final pattern matrix, factor correlations, and reliability analysis, ensuring convergent and discriminant validity.
Takeaways
- 📊 The speaker begins by emphasizing the importance of saving the dataset after making changes, saving it as 'trimmed and no missing'.
- 🔄 The process of reordering variables in the dataset is discussed: the variables displaced to the end by imputation are re-sorted into their original order, and key demographic variables are moved to the bottom.
- 🔍 The speaker conducts an exploratory factor analysis (EFA), highlighting the need to include only reflective latent measures and excluding formative measures, categorical variables, and demographics not part of a reflective latent construct.
- 🎯 The choice of extraction method is discussed, with the speaker preferring maximum likelihood estimation due to its use in subsequent confirmatory factor analysis (CFA).
- 🔄 Promax rotation is used; the speaker notes it is less forgiving and that they may have to switch rotation methods if issues arise.
- 📉 The speaker discusses the suppression of small coefficients, aiming to focus on loadings greater than 0.5 for meaningful results.
- 📊 The Kaiser-Meyer-Olkin (KMO) measure and the significance of the Bartlett's test are highlighted as part of assessing the adequacy of the factor analysis.
- 📈 The total variance explained by the factors is discussed, with the speaker aiming for over 60% as an ideal threshold.
- 🔍 The pattern matrix is scrutinized for high loadings and cross-loadings, with the speaker identifying and addressing issues such as low loadings and Heywood cases (loadings above one).
- 🔄 The speaker demonstrates how to refine the factor analysis by iteratively removing problematic items and rerunning the analysis until a satisfactory model is achieved.
- 📝 The final step involves reporting the KMO and Bartlett's test results, the total variance explained, non-redundant residuals, the pattern matrix, and factor correlations to assess convergent and discriminant validity.
Q & A
What is the first step the speaker takes in the exploratory factor analysis process?
-The first step the speaker takes is to save the dataset with the changes made, renaming it to 'trimmed and no missing'.
Why does the speaker reorder the variables in the dataset?
-The speaker reorders the variables to restore the original order disrupted by imputation, and then places age, ID, gender, frequency, and experience at the bottom for easier analysis.
What type of measures should be included in an Exploratory Factor Analysis (EFA) according to the speaker?
-The speaker emphasizes that only reflective, not formative, measures should be included in EFA. Categorical variables and demographics not part of a reflective latent construct should be excluded.
What extraction method does the speaker prefer to use in EFA?
-The speaker prefers to use the maximum likelihood extraction method because it is the same algorithm used in Amos for confirmatory factor analysis.
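For readers who script rather than point-and-click, a minimal sketch of the same setup is possible in Python with the factor_analyzer package (my assumption, not something the video uses; the file name and the idea that `df` holds only the reflective indicator columns are hypothetical):

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

# Hypothetical file; df should contain only the reflective indicator
# columns (no IDs, demographics, or formative/categorical variables).
df = pd.read_csv("trimmed_no_missing.csv")

# First pass without rotation, just to count factors with
# eigenvalues > 1 (mirroring "based on eigenvalues" in SPSS).
fa0 = FactorAnalyzer(rotation=None, method="ml")
fa0.fit(df)
eigenvalues, _ = fa0.get_eigenvalues()
n_factors = int((eigenvalues > 1).sum())

# Maximum likelihood extraction with Promax rotation, as in the video.
fa = FactorAnalyzer(n_factors=n_factors, rotation="promax", method="ml")
fa.fit(df)
pattern = pd.DataFrame(fa.loadings_, index=df.columns)  # pattern matrix
print(pattern.round(3))
```

Because Promax is an oblique rotation, the loadings here correspond to the pattern matrix that SPSS prints.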
What rotation method is suggested by the speaker for EFA?
-The speaker suggests the Promax rotation method, noting it is less forgiving; they may have to switch rotation methods if issues arise with the factor loadings.
Why does the speaker choose to suppress small coefficients at 0.3 in the analysis?
-The speaker suppresses coefficients below 0.3 because loadings under 0.5 are not considered meaningful, and because primary and cross-loadings need to differ by at least 0.2.
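SPSS's suppression option only hides small values in the printed matrix; nothing is removed from the solution. Continuing the hypothetical factor_analyzer sketch above, where `pattern` is the pattern matrix as a DataFrame, the same display trick in pandas might look like:

```python
# pattern: the pattern-matrix DataFrame from the earlier sketch.
# Blank out loadings below |0.3| for readability, like SPSS's
# "Suppress small coefficients" option; the values are only hidden.
suppressed = pattern.where(pattern.abs() >= 0.3, other="")
print(suppressed)
```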
What does the speaker look for in the KMO (Kaiser-Meyer-Olkin) measure to assess the adequacy of the factor analysis?
-The speaker looks for a KMO value above 0.7 and ideally about 0.9 to ensure the factor analysis is adequate.
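Both adequacy checks are available directly in the hypothetical factor_analyzer sketch from above (`df` is the indicator-only DataFrame):

```python
from factor_analyzer.factor_analyzer import (
    calculate_bartlett_sphericity,
    calculate_kmo,
)

# df: the indicator-only DataFrame from the earlier sketch.
kmo_per_item, kmo_overall = calculate_kmo(df)
chi_square, p_value = calculate_bartlett_sphericity(df)

print(f"KMO = {kmo_overall:.3f}")  # want > 0.7, ideally about 0.9
print(f"Bartlett: chi2 = {chi_square:.1f}, p = {p_value:.4f}")  # want p < .05
```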
How does the speaker interpret the total variance explained by the factors in the analysis?
-The speaker interprets the total variance explained by the factors as good if it is more than 60%, with 64.5% being a satisfactory result in this case.
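In the same sketch, the cumulative-variance column that SPSS prints can be read off the fitted model:

```python
# fa: the fitted FactorAnalyzer from the earlier sketch.
# get_factor_variance() returns (SS loadings, proportion of
# variance, cumulative proportion) per factor.
ss_loadings, proportion, cumulative = fa.get_factor_variance()
print(f"Total variance explained: {cumulative[-1]:.1%}")  # want > 60%
```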
What issue does the speaker identify with the pattern matrix and how does it affect the analysis?
-The speaker identifies a discriminant validity issue between decision quality and information acquisition, which affects the convergent validity of information acquisition.
What action does the speaker take to resolve the discriminant validity issue between decision quality and information acquisition?
-The speaker runs another factor analysis excluding all items except decision quality and information acquisition to isolate and resolve the issue.
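In script form, that focused re-run is just a column subset before refitting. A sketch under the same assumptions as above, with hypothetical decqual*/infoacq* column names:

```python
# df, FactorAnalyzer, and pd as in the earlier sketch;
# the decqual*/infoacq* column names are hypothetical.
subset = df[[c for c in df.columns if c.startswith(("decqual", "infoacq"))]]
fa2 = FactorAnalyzer(n_factors=2, rotation="promax", method="ml")
fa2.fit(subset)
print(pd.DataFrame(fa2.loadings_, index=subset.columns).round(3))
```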
How does the speaker address items with low loadings in the pattern matrix?
-The speaker drops problematic items one at a time (decision quality six, one, and eight) and re-runs the analysis after each removal to see the impact on the pattern matrix and the overall model.
What additional analysis does the speaker perform to assess the reliability of the items?
-The speaker performs a Cronbach's alpha scale reliability analysis to assess the internal consistency reliability of the items.
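In SPSS this is Analyze > Scale > Reliability Analysis with the 'Scale if item deleted' statistic. A minimal from-scratch sketch of both numbers, assuming `items` holds one construct's indicator columns (the column names are hypothetical):

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha from item variances and total-score variance."""
    k = items.shape[1]
    item_variance_sum = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)

# df as in the earlier sketch; hypothetical column names for the
# information acquisition items.
items = df[["infoacq1", "infoacq2", "infoacq3", "infoacq4", "infoacq5"]]
print(f"alpha = {cronbach_alpha(items):.3f}")

# "Cronbach's alpha if item deleted", as in SPSS's last column.
for col in items.columns:
    print(col, round(cronbach_alpha(items.drop(columns=col)), 3))
```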
Outlines
📊 Exploratory Factor Analysis Setup
The speaker begins by discussing the process of setting up an exploratory factor analysis (EFA). They emphasize the importance of saving the dataset with changes and reordering variables. The speaker then proceeds to conduct an EFA, highlighting the need to include only reflective latent measures and exclude formative measures, categorical variables, and demographics not part of a reflective latent construct. They also mention the importance of using maximum likelihood as the extraction method and Promax rotation. The speaker sets parameters for the analysis, such as suppressing small coefficients and allowing a certain number of iterations, and checks the Kaiser-Meyer-Olkin (KMO) measure and the significance value as part of the adequacy assessment. The goal is to identify the number of factors that explain the variance in the model.
🔍 Addressing Factor Analysis Issues
In this segment, the speaker identifies issues in the factor analysis, particularly cross-loadings between the decision quality and information acquisition factors. They rerun the EFA with only these two sets of items to isolate the problem. After examining the resulting pattern matrix, they find that certain items, such as decision quality six and one, are causing problems and eliminate them one at a time. The speaker also discusses Heywood cases, where a factor loading exceeds one, and chooses to ignore these until other issues are resolved, since they usually resolve themselves. The goal is to improve the discriminant and convergent validity of the factors.
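The clean-up routine described here, re-running a two-factor EFA and dropping the worst cross-loader until every item loads cleanly, can be expressed as a loop. A sketch under the same hypothetical assumptions as the earlier blocks, using the speaker's 0.2-difference rule:

```python
import pandas as pd
from factor_analyzer import FactorAnalyzer

# df and the decqual*/infoacq* column names are hypothetical,
# carried over from the earlier sketches.
remaining = [c for c in df.columns if c.startswith(("decqual", "infoacq"))]

while len(remaining) > 4:  # keep at least a couple of items per factor
    fa2 = FactorAnalyzer(n_factors=2, rotation="promax", method="ml")
    fa2.fit(df[remaining])
    loads = pd.DataFrame(fa2.loadings_, index=remaining)
    # An item cross-loads if its two loadings differ by less than 0.2.
    gap = (loads[0].abs() - loads[1].abs()).abs()
    if gap.min() >= 0.2:
        break                 # every item now loads cleanly on one factor
    worst = gap.idxmin()      # the most evenly split item
    print("dropping", worst)
    remaining.remove(worst)
```

In the video the same loop, performed by hand, removes decision quality six, then one, then eight.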
📉 Finalizing Factor Analysis and Reporting
The speaker concludes the EFA by rerunning the analysis with the revised set of items. They report on the KMO and communalities, indicating the adequacy of the analysis, and the total variance explained by the six-factor model. They also discuss the non-redundant residuals and the pattern matrix, noting improvements in factor loadings. The speaker addresses the issue of discriminant validity by examining the factor correlation matrix and ensuring that no factors share a majority of variance. They also perform a reliability analysis using Cronbach's alpha to assess the internal consistency of the scales. The speaker provides a detailed explanation of what to report from the EFA, including the pattern matrix, factor correlation matrix, and reliability analysis results.
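For the discriminant validity check on the factor correlations, factor_analyzer stores the matrix produced by an oblique rotation in the phi_ attribute (my assumption about the attribute name; `fa` is the fitted model from the earlier sketch):

```python
import numpy as np
import pandas as pd

# fa: the fitted Promax FactorAnalyzer from the earlier sketch.
phi = pd.DataFrame(fa.phi_)                      # factor correlation matrix
off_diagonal = phi.mask(np.eye(len(phi), dtype=bool))  # drop the 1s
worst = off_diagonal.abs().max().max()
print(phi.round(3))
print(f"largest off-diagonal correlation: {worst:.3f}")  # want < 0.7
```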
Keywords
💡Exploratory Factor Analysis (EFA)
💡Reflective Latent Measures
💡Formative Measures
💡Kaiser-Meyer-Olkin (KMO)
💡Maximum Likelihood Extraction
💡Promax Rotation
💡Cross-Loadings
💡Cronbach's Alpha
💡Convergent Validity
💡Discriminant Validity
Highlights
Exploratory Factor Analysis (EFA) begins with saving the dataset after the earlier minor changes.
Variables are reordered after imputation, with demographic variables moved to the bottom for clarity.
Factor analysis is conducted with all reflective latent measures, excluding formative or categorical variables.
Descriptives, KMO, and extraction methods are set, favoring maximum likelihood estimation for consistency with Amos.
Promax rotation is chosen even though it is less forgiving; the speaker may switch methods if issues arise.
Factor scores can be saved as new variables for further analysis.
KMO measure indicates the data's suitability for factor analysis, with a value above 0.9 being ideal.
Communalities above 0.3 in the extraction column indicate adequate extraction.
Six factors are extracted, aligning with the expected model.
Total variance explained by the model is over 64%, exceeding the minimum threshold of 60%.
Non-redundant residuals are below the 5% threshold, indicating good model fit.
Issues with discriminant validity between decision quality and information acquisition are identified.
A focused factor analysis is conducted on decision quality and information acquisition to resolve issues.
Items with cross-loadings or low factor loadings are considered for removal to improve model clarity.
Reliability analysis is performed to assess the impact of item removal on Cronbach's alpha.
Final model retains information acquisition five despite its low loading, since dropping it introduced a new cross-loading; remaining low loadings are expected to improve in confirmatory factor analysis.
Reporting includes KMO, communalities, total variance explained, non-redundant residuals, and the pattern matrix.
Cronbach's alpha values are reported to assess the reliability of the factors.
Factor correlation matrix is examined for evidence of discriminant validity.
Transcripts
all right moving right along next is
exploratory factor analysis so what I
would do first things first I would go
back to the data set and save it we've
made several minor changes here and
there so I'm going to hit save as and
I'm going to save it as trimmed and no
missing let's see
no missing there we go save so now all
of our changes are recorded and if we
make a mistake somewhere we're good to
go and I'm going to go ahead and reorder
these variables look we have these guys
in the end here because we imputed them
we replaced the missing values I'm going
to re-sort this column sort ascending
SPSS asks are you sure you want to do this the
answer is yes but do you want to save it
as a different thing no okay we're in
order now I'm just going to put age ID
gender frequency experience all at the
bottom again okay there we go
the next thing to do is a factor
analysis I'm excited I like factor
analyses so analyze dimension reduction
factor analysis what I'm going to stick
in there we're going to start with
everything throw it all in there all the
the reflective latent measures this is
critical to bear in mind you must have
reflective not formative reflective
latent meaning multi indicator measures
if you have formative measures don't
include them in the EFA if you have
categorical variables like gender don't
include them in the EFA if you have
demographics that are clearly not part
of a reflective latent construct don't
include them in the EFA or CFA they
don't belong in a reflective measurement
model you're only going to include
reflective latent factors hope that was
clear enough cool throw this in here if
you're not sure what reflective versus
formative means please refer to my
YouTube video called formative versus
reflective measures and a factor
analysis I think it's called that
something like that okay throw these all
over to descriptives I've done this many
times before reproduced KMO continue
extraction I like to use maximum
likelihood why well that's the same
algorithm that Amos is going to use when
we do the confirmatory factor analysis I
like to do it based on eigenvalues
instead of a fixed number of factors at
least initially just to see what it's
going to give me how many iterations do
we want to allow 25 is fine continued
rotation I like to use Promax it's less
forgiving but we might have to switch if
we have issues continue nothing in scores
although just FYI if you wanted to save
each of the factors as a variable it's
called factor scores then you click here
and save variables and that'll give you
however many factors you came up with in
your pattern matrix it will create that
many more new variables to represent
each of those factors in it anyway I'm
not going to check that okay cancel in
the options I am going to suppress small
coefficients at point three because I'm
really not interested in loadings less
than 0.5 and we need them to be at least
point two difference I've talked about
this in other videos I'm not going to go
into depth here okay this is a more
procedural video anyway we want to look
at KMO okay KMO looks good .935 you want
this above 0.7 um ideally about 0.9
you want the sig value to be
significant this is all part of adequacy
we're going to talk about adequacy
validity convergent discriminant and
reliability okay back here we look at
the extraction column and we want to see
if there's anything less than about 0.3
decision quality eight is right at
the line at about point three but we're
looking pretty good okay moving on and
we have total variance explained that we
will look at this cumulative column it
came up with six factors how many were
we expecting well if we go back to our
model we were expecting one two three
four five six it came up with exactly
what we wanted this rarely happens so
I'm kind of surprised
and this doesn't provide us an
opportunity to do mitigation strategies
so maybe see my other videos for
mitigation strategies okay
it explains sixty four and a half
percent of the variance in the model
that's good we want more than sixty
percent at a minimum we want more than
fifty percent but again above sixty
percent is ideal skip the factor matrix
skip the goodness of fit go down to the
reproduced we want a number here less
than about five percent we're looking
pretty good and pattern matrix is
looking stellar-ish ooh actually we do
have some issues we have a few issues
this is fun I don't like it when it just
works so let's take these one at a time
the first factor looks fabulous there are no
cross loadings anywhere um all the items
loaded on to a single factor factor five
decision quality not so fabulous
I mean still good but not fabulous look we
have 0.47 that's fairly low we
also have these two other items from
information acquisition that loaded with
all the decision quality items that's a
problem we'll have to resolve that
separately information acquisition
loaded onto its own factor but look at
those loadings they're awful so I'm not
sure what to do about that I'll have to
address that next joy looks joyful no
problems there whatsoever playfulness
looks incredible
and usefulness looks incredible as well
so the only real problem is this
discriminant validity issue between the
decision quality and information
acquisition which my guess is is causing
the convergent validity issue with
information acquisition so what would
you do here I would actually just run
another factor analysis but get rid of
everything except decision quality and
information acquisition there we go and
just run it again with just those two
sets of items and looking good looking
good really what I want to do is go down
to the pattern matrix good it did come
up with two factors that's what we
wanted but you can see there are some
issues decision quality six is loading
most equally on both sides that is the
first one I would delete so let's do
that factor analysis again decision
quality six sayonara okay run it again
jump down to the pattern matrix decision
quality one loading on both sides hey
look at these loadings though those are
looking better okay this one no good
decision quality one you may say hey
James wait wait wait what is this
it's above one it was last time too
we're going to ignore it this is called
a Heywood case we're going to ignore
this Heywood case until we resolve other
issues because it'll probably just
resolve itself so decision quality one
you are gone kicked off the island there
we go jump down the pattern matrix
looking better but look at this decision
quality eight not really contributing
very well I'm going to drop decision
quality eight and the pattern matrix is much
better this is borderline we might keep
it this is also borderline we might keep
it what I'm going to do at this point is
I'm going to recreate the larger pattern
matrix and see if everything is resolved
if not we can see where we'll go
probably decision quality seven and info
acquisition five will be the next to be
eliminated so back to the full factor
analysis we're going to throw everything
in there except decision quality one six
and eight yep run it again and I am
just going to do a few cursory things
I'm going to jump down here looks like
we still have six factors excellent good
variance explained actually better than
before and we have only three percent
non-redundant residuals this time and
here's the pattern matrix and it already
looks better okay decision quality that
looks really good information
acquisition also very good Wow actually
I wasn't expecting it to be that good um
and everything else looks just as good
as before well I might do is drop
information acquisition five it is still
fairly low and you can see these
loadings here point seven point seven
point seven
point six and point four these aren't going to
average out to above 0.7 which is a problem if I want to
verify this what I might do is do
reliability analysis analyze scale
reliability and just stick in those
information acquisition items here I'll
pull it over and then go to statistics
and do scale if item deleted continue
and okay and what this is going to tell
us is if dropping that item will
actually do us any good so let me go
back over here to the pattern matrix it
was information acquisition five now if
we go down to the reliability analysis
click here if you look at this last
column it says what our cronbach's alpha
would be if we deleted each of these
items the current Cronbach's alpha is
0.842 but if we deleted information
acquisition 5 it would go up to
0.846 this isn't a big difference and so
if I was struggling if I wanted to keep
all these items
I'm fully justified in keeping all these
items even though that is a low loading
most likely scenario is it will bump up
a little bit during the confirmatory
factor analysis in Amos so I can keep it
if I really don't care and these are
scales I made up myself and I had
the liberty to do so then I might just
drop information acquisition 5 which is
what I am going to do at this point so
I'm going to run this one more time drop
info acq 5 watch what happens see if it makes
big differences that's an uptick which
is good three percent is the same ooh okay so it
actually caused some problems it threw
in a new loading here above 0.3 what
what happened is information acquisition
5 helped distinguish us from decision
quality whereas now we're having a hard
time distinguishing them so I'm
going to retain information acquisition
5 even though this is a greater than 0.2
difference it did bring up that
discriminant sort of cross-loading issue
so my final pattern matrix is actually
going to be the one with information
acquisition 5 still in it here we go run
it
what do I report I report the kmo say
it's awesome I report the sig say it's
awesome these are all under adequacy
under communalities this is
another adequacy measure I look at the
extraction column and I say all of mine
all my communalities were above 0.3
looks like they are the lowest one is
this one at 0.397 and then
I'd say the six factor model explains
sixty six point three percent of the
variance which is good and then I'd say
we had less than three percent non
redundant residuals which is great and
here's the pattern matrix and I'd say as
evidence of convergent validity we have
all the loadings above 0.5 except this
one at point four which I'd mention
and then evidence of discriminant
validity is we had no strong cross
loadings another bit of evidence for
discriminant validity is this factor
correlation matrix we can look and see
at all these non diagonal values and
make sure they're not above 0.7 which
would indicate sharing a majority of the
variance so the closest one is this
factor four to factor six I'm guessing
that is information acquisition and
decision quality I'd go here check four
and six yep those are those two and they
are highly related but not so related
that they're sharing a majority of their
variance so that's the closest one
what would I report I would report the
pattern matrix at a minimum you may also
want to report this factor correlation
matrix okay that is adequacy convergent
validity discriminant validity if we
want to do reliability you just do like
I did before go do a cronbach's alpha
scale reliability analysis we did it for
information acquisition move those over
we'll do another one for decision
quality but not all decision quality
items we only have two three four five
and seven since we dropped one
six and eight so two three four five
and seven throw those in there okay and
report this number here 0.901
what I like to do is just stick it at
the top of my pattern matrix so this
0.901 I would go stick it right here
that was decision quality so I'd
replace this 4 with 0.901 and that
puts cronbach's alpha right over here
okay you want all those cronbach's
alphas to be greater than 0.7 if they're
not there's actually literature that
says it can go down to 0.6 particularly if
you have only a few items 2 or 3 and
that is the EFA