
Behavioural Processes 97 (2013) 90–95

Journal homepage: www.elsevier.com/locate/behavproc
Exploratory factor analysis with small sample sizes:
A comparison of three approaches
Sunho Jung
School of Management, Kyung Hee University, 1 Hoegi-dong, Dongdaemon-gu, Seoul, 130-872, Republic of Korea
Article info
Article history:
Received 23 February 2012
Received in revised form 13 August 2012
Accepted 29 November 2012
Keywords:
Exploratory factor analysis
Small sample size
Regularized exploratory factor analysis
Generalized exploratory factor analysis
Unweighted least-squares
Abstract
Exploratory factor analysis (EFA) has emerged in the field of animal behavior as a useful tool for determining and assessing latent behavioral constructs. Because the small sample size problem often occurs in this field, a traditional approach, unweighted least squares, has been considered the most feasible choice for EFA. Two new approaches were recently introduced in the statistical literature as viable alternatives to EFA when sample size is small: regularized exploratory factor analysis and generalized exploratory factor analysis. A simulation study is conducted to evaluate the relative performance of these three approaches in terms of factor recovery under various experimental conditions of sample size, degree of overdetermination, and level of communality. In this study, overdetermination and sample size are the meaningful conditions in differentiating the performance of the three approaches in factor recovery. Specifically, when there are a relatively large number of factors, regularized exploratory factor analysis tends to recover the correct factor structure better than the other two approaches. Conversely, when few factors are retained, unweighted least squares tends to recover the factor structure better. Finally, generalized exploratory factor analysis exhibits very poor performance in factor recovery compared to the other approaches. This tendency is particularly prominent as sample size increases. Thus, generalized exploratory factor analysis may not be a good alternative to EFA. Regularized exploratory factor analysis is recommended over unweighted least squares unless a small number of factors can be expected.
© 2013 Elsevier B.V. All rights reserved.
1. Introduction
Exploratory factor analysis (EFA) is a basic tool in the behavioral sciences. The attraction of EFA is its ability to investigate the nature of unobservable behavioral constructs (often called latent variables) that account for relationships among measured variables. Traditionally, unweighted least squares (ULS) and maximum likelihood (ML) have been two of the most popular estimation methods in EFA (e.g., Fabrigar et al., 1999). Despite the popularity of ULS and ML, new factor analysis procedures continue to be actively developed (e.g., Bentler and de Leeuw, 2011). Recently, two new approaches have been published in the statistical literature as viable alternatives to EFA in small sample size situations: one is regularized EFA (REFA) (Jung and Lee, 2011), and the other is generalized EFA (GEFA) (Trendafilov and Unkel, 2011).
Researchers in animal behavior have long recognized the benefit of studying an underlying latent variable and the importance of EFA. However, the small sample size problem often limits the general applicability of EFA in animal behavior studies (e.g., Budaev, 2010). Both REFA and GEFA have been shown to perform well, particularly when sample sizes are small (Jung and Lee, 2011; Trendafilov and Unkel, 2011). Thus, use of these approaches can be valuable in animal behavior research. In fact, our review found one instance of an application of REFA in the study of animal personality (Konečná et al., in press). However, the new approaches may still be novel to researchers in animal behavior. Moreover, previous research has not compared the performance of the new approaches to that of the traditional approach. It would be useful to know under what circumstances the new approaches are preferable to the traditional approach. The objective of this article, therefore, is to make the new approaches more accessible to researchers in animal behavior, and to evaluate the traditional and new approaches in terms of factor recovery capability using a Monte Carlo simulation study.
E-mail address: sunho.jung@khu.ac.kr
The article is organized as follows. We offer a general overview pointing to the relative strengths and weaknesses of the traditional and new approaches. Then, we describe the design of the simulation study and report results. An empirical study is conducted as well, to investigate factor recovery in a more realistic context. Finally, we discuss the implications of the study and also offer recommendations based on the factor recovery capability.
2. Background
In animal behavior research, small sample sizes may be the rule rather than the exception (e.g., Havstad and Olson-Rutz, 1991). The potential application of EFA in this research even involves those cases where the number of variables exceeds the sample size (e.g., Liu and Gershenfeld, 2003). In general, ML is most frequently recommended for use in EFA (e.g., Fabrigar et al., 1999). However, ML requires large sample sizes to yield reliable and stable solutions (e.g., MacCallum et al., 1999). Previous research has shown that ULS can yield high quality solutions with small sample sizes (below 50) under ideal conditions of few factors, highly reliable data, and high communalities (e.g., Preacher and MacCallum, 2002). Although these conditions often fail in practice, ULS has been the most feasible option for EFA in animal behavior research (Budaev, 2010). Here, only ULS is considered the traditional approach (hereafter referred to as ULSFA).
0376-6357/$ – see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.beproc.2012.11.016
ULSFA, REFA, and GEFA are based on a common factor model. This model partitions the variance of each measured variable into two parts: unique variance (uniqueness) due to a unique factor and common variance (communality) due to the common factors. Unique factors represent latent variables specific to only one measured variable, whereas common factors are shared among two or more measured variables and represent latent variables that can explain relationships among measured variables. Thus, factor analysis concerns only common variance to derive the patterns of relations between the common factors and the measured variables (i.e., factor loadings). However, this requires the computation of unique variances in some way. Broadly speaking, REFA and GEFA differ from ULSFA in that estimation of unique variances is separated from estimation of factor loadings. In REFA, only a single parameter needs to be determined to obtain estimates of unique variances. GEFA first derives unique factor scores and then uses them to estimate unique variances.
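For readers who prefer notation, the common factor model described above can be sketched compactly as follows. The symbols are chosen here for illustration and are assumptions, not notation taken from the article: x is the vector of p measured variables, \Lambda the p × f matrix of factor loadings, \xi the vector of common factors, \varepsilon the vector of unique factors, and \Psi the diagonal matrix of unique variances. The covariance form additionally assumes standardized, mutually uncorrelated common factors that are uncorrelated with the unique factors.

```latex
\[
\begin{aligned}
x &= \Lambda \xi + \varepsilon, \\
\Sigma &= \Lambda \Lambda^{\top} + \Psi, \qquad \Psi = \mathrm{diag}(\psi_1, \ldots, \psi_p), \\
h_j^2 &= \sum_{k=1}^{f} \lambda_{jk}^2 \quad \text{(communality of variable } j\text{)}.
\end{aligned}
\]
```

Under these assumptions the variance of a standardized variable j splits as h_j^2 + \psi_j = 1, which is the partition into communality and uniqueness described in the text.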
Compared to ULSFA, REFA and GEFA have several advantages. ULSFA relies on a complicated iterative algorithm, whereas REFA and GEFA simplify the estimation of parameters in the model. REFA assumes that the true unique variances are proportional to their tentative estimates. That is, REFA involves estimating a single parameter representing the proportion. Unique variances are estimated by multiplying their tentative quantities by an estimate of the proportion. After unique variance estimates have been obtained, factor loadings are then computed by a closed-form formula (e.g., Jöreskog, 1977). On the other hand, GEFA first derives the common and unique factor scores for each observation. From these latent variable scores, factor loadings and unique variances are then found, respectively, using separate linear regression analyses.
Another advantage results from the piece-by-piece approach that REFA and GEFA take to estimate parameters in the model. Data sets with small samples and many measured variables are common in behavioral research. With many parameters relative to sample size, iterative numerical methods would be feasible but computationally demanding. The implementation of these methods may present numerical difficulties (e.g., nonconvergence). To deal with this problem, methodologists have advocated subdividing the parameters into several subsets (e.g., Bollen, 1996).
However, compared to ULSFA, REFA and GEFA also have several limitations. For EFA, an overall model fit index can be used in determining how well the model fits a sample as a whole. ULSFA provides indices to assess the model's goodness of fit and also permits statistical significance testing (Browne, 1984). However, no overall fit indices are available for the two new approaches. Another limitation is the absence of software programs for implementation of the new approaches. The lack of statistical software could make applied researchers less motivated to understand and implement the new approaches. In contrast, ULSFA is readily available in many software programs, including SPSS, SAS, Mplus (Muthén and Muthén, 2006), CEFA (Browne et al., 1998), and FACTOR (Lorenzo-Seva and Ferrando, 2006).
Beyond the theoretical characteristics of the three approaches discussed above, the utility of a factor analysis procedure depends heavily on its ability to recover the underlying factor structure, which is required for good interpretation and inference. In the next section, we present a simulation study that evaluates the performance of REFA and GEFA relative to ULSFA in terms of factor recovery. Based on our findings, we provide guidelines in the Discussion and recommendations section for choosing among the three approaches.
3. Simulation design
In the present simulation study, we investigate the capability of the three approaches in population factor recovery under several experimental conditions of sample size, communality, and overdetermination. These conditions are commonly encountered in simulations based on exploratory factor analysis (e.g., Hogarty et al., 2005). Prior research has shown that level of communality moderates the impact of sample size on factor recovery (e.g., MacCallum et al., 1999). Degree of overdetermination was shown to greatly influence the quality of factor solutions, particularly with small samples (e.g., Preacher and MacCallum, 2002). Overdetermination is defined as the ratio of factors to measured variables. If the number of observed variables (p) is held constant, varying the number of factors (f) leads to different degrees of overdetermination. Fewer factors indicate a higher degree of overdetermination. With p held constant (p = 20), the Monte Carlo simulations involved manipulating four experimental conditions as follows.
1. Approach: REFA, GEFA, and ULSFA.
2. Sample size: N = 5, 10, 20, 30, and 50.
3. Communality: High (0.6–0.7), Low (0.2–0.4), and Wide (0.2–0.8).
4. Overdetermination: High (f = 3) and Low (f = 7).
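The design listed above can be enumerated in a few lines. The following is a minimal Python sketch for illustration only (the original simulations were run in MATLAB); the variable names are assumptions, and the value of 100 replications per cell is taken from the description of the simulation in the text.

```python
from itertools import product

# Design factors as listed in the text (p = 20 measured variables throughout).
APPROACHES = ("REFA", "GEFA", "ULSFA")
SAMPLE_SIZES = (5, 10, 20, 30, 50)
COMMUNALITY = ("high", "low", "wide")
N_FACTORS = {"high overdetermination": 3, "low overdetermination": 7}
REPLICATIONS = 100  # samples per cell, as reported in the text

# One data-generating cell per (sample size, communality, overdetermination).
cells = list(product(SAMPLE_SIZES, COMMUNALITY, N_FACTORS))

samples_per_approach = len(cells) * REPLICATIONS
total_fits = samples_per_approach * len(APPROACHES)

print(len(cells))            # number of data-generating cells
print(samples_per_approach)  # samples analyzed by each approach
print(total_fits)            # factor analyses run in total
```

The counts agree with the figures reported in the article: 3000 samples per approach and 9000 observations of g in total.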
Using a method developed by Tucker et al. (1969), six population correlation matrices were generated to correspond to the six combinations of communality and overdetermination. Individual-level multivariate normal data were drawn from N(0, Σ), where Σ is the population covariance matrix, using the Cholesky decomposition (Wijsman, 1959). One hundred samples were generated for each level of the experimental conditions, resulting in 3000 samples for each approach (5 sample sizes × 3 levels of communality × 2 degrees of overdetermination × 100 replications). All sample matrices were factor analyzed using each of the three approaches. All simulations were carried out using MATLAB (R2010a).
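To illustrate the sampling step, the sketch below draws multivariate normal data through a Cholesky factor, using only the Python standard library. It is not the author's MATLAB code: the function names are hypothetical, and the 3 × 3 matrix is a made-up example rather than one of the six population matrices generated with the Tucker et al. (1969) method.

```python
import math
import random

def cholesky(sigma):
    """Return lower-triangular L with L L^T = sigma (sigma symmetric positive definite)."""
    n = len(sigma)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                L[i][j] = (sigma[i][j] - s) / L[j][j]
    return L

def draw_sample(sigma, n_obs, rng):
    """Draw n_obs rows from N(0, sigma) as x = L z, with z standard normal."""
    L = cholesky(sigma)
    p = len(sigma)
    rows = []
    for _ in range(n_obs):
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        rows.append([sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(p)])
    return rows

# Hypothetical 3 x 3 population correlation matrix, for illustration only.
sigma = [[1.0, 0.5, 0.3],
         [0.5, 1.0, 0.4],
         [0.3, 0.4, 1.0]]
rng = random.Random(0)  # fixed seed so the draw is reproducible
sample = draw_sample(sigma, n_obs=5, rng=rng)
```

Because x = Lz has covariance L L^T = Σ when z has identity covariance, each row of `sample` is one observation from N(0, Σ); here `n_obs=5` mirrors the smallest sample size in the design.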
4. Simulation results
To assess the degree of factor pattern recovery under the
three approaches, we computed the root mean squared deviation
(denoted as g) (Velicer and Fava, 1998) as follows:
$$g = \left[\frac{\mathrm{trace}\left[(\hat{\Lambda} - \Lambda)^{\top}(\hat{\Lambda} - \Lambda)\right]}{pf}\right]^{1/2}, \tag{1}$$
where $\hat{\Lambda}$ is the p × f matrix of sample loadings and $\Lambda$ is the corresponding matrix of population loadings. This index represents the average difference between sample loadings and their corresponding population loadings. In this study, the value of g was computed after the sign of loading estimates was recovered using a procedure suggested by Cliff (1966). Because this value can be affected by rotational indeterminacy, the loadings produced by each of the three approaches were not rotated.
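As a minimal sketch of how the index g might be computed, the Python function below uses the fact that the trace of the cross-product of the difference matrix equals the sum of its squared elements. The function name and the small 3 × 2 loading matrices are hypothetical, and sign alignment (Cliff, 1966) is assumed to have been done beforehand.

```python
import math

def rmsd_g(sample_loadings, population_loadings):
    """Root mean squared deviation between two p x f loading matrices."""
    p = len(population_loadings)
    f = len(population_loadings[0])
    # trace((A - B)^T (A - B)) equals the sum of squared element-wise differences.
    total = sum(
        (sample_loadings[i][j] - population_loadings[i][j]) ** 2
        for i in range(p)
        for j in range(f)
    )
    return math.sqrt(total / (p * f))

# Hypothetical 3 x 2 loading matrices, for illustration only.
population = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.6]]
estimate = [[0.7, 0.1], [0.7, 0.0], [0.1, 0.6]]
g = rmsd_g(estimate, population)
```

Note that the denominator is pf, the number of loadings, so designs with more factors divide the accumulated squared error by a larger count.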
An ANOVA test was performed using the root mean squared deviation g as the dependent variable and the four experimental conditions as design factors. Table 1 presents the results of the ANOVA. As the table shows, all of the main and interaction effects were statistically significant. This should not be too surprising given the large number of observations (i.e., a total of 9000 observations). Thus, it is crucial to examine the effect size as well (e.g., Paxton et al., 2001). Following accepted practice, we focus on main and interaction effects whose sizes were at least medium (i.e., η² greater than 0.06) (Cohen, 1988).

Table 1
The results of an ANOVA test for the root mean squared deviation (g) (Sig. = significance, η² = effect size).
Source                   SS          DF     F       Sig.   η²
Approach (A)             1,020,833   2      23806   0.00   0.84
Communality (C)          1205        2      28      0.00   0.01
Overdetermination (O)    25576       1      1192    0.00   0.12
Sample size (N)          56418       4      657     0.00   0.23
A*C                      3044        4      35      0.00   0.02
A*O                      33353       2      111     0.00   0.15
A*N                      270325      8      1576    0.00   0.59
C*O                      1582        2      36      0.00   0.01
C*N                      2711        8      15      0.00   0.01
O*N                      1530        4      17      0.00   0.01
A*C*O                    3005        4      35      0.00   0.02
A*C*N                    6902        16     20      0.00   0.04
A*O*N                    7500        8      43      0.00   0.04
C*O*N                    2266        8      13      0.00   0.01
A*C*O*N                  2266        16     6       0.00   0.01
Error                    190,236     8910
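The article does not state which effect-size formula was used. The values reported in Table 1 are, however, consistent with partial eta squared, that is, the effect sum of squares divided by the sum of the effect and error sums of squares. The Python sketch below checks this reading against the sums of squares from Table 1; treating the index as partial η² is an inference from the numbers, not a statement from the source.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta squared: SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)

SS_ERROR = 190236  # error sum of squares from Table 1

# Sums of squares for the effects discussed in the text, copied from Table 1.
ss_effects = {
    "Approach (A)": 1020833,
    "Overdetermination (O)": 25576,
    "Sample size (N)": 56418,
    "A*O": 33353,
    "A*N": 270325,
}

effect_sizes = {
    source: round(partial_eta_squared(ss, SS_ERROR), 2)
    for source, ss in ss_effects.items()
}

for source, eta_sq in effect_sizes.items():
    print(f"{source}: {eta_sq:.2f}")
```

Under this reading the five effects come out at 0.84, 0.12, 0.23, 0.15, and 0.59, matching the η² column of Table 1.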
First, approach (η² = 0.84), overdetermination (η² = 0.12), and sample size (η² = 0.23) had large main effects. Here, approach was the most important determinant of the root mean squared deviation, followed in descending order by sample size and overdetermination. There are likely meaningful differences in the root mean squared deviation among the three approaches. It appears that there was little difference in the root mean squared deviation between REFA (7.2) and ULSFA (7.3), whereas GEFA exhibited a much higher level of the root mean squared deviation (29.9). Moreover, the levels of the root mean squared deviation were higher when the number of factors was small (16.5) than when the number of factors was large (13.1). This result is somewhat counterintuitive and needs some explanation. In the study, the condition with the larger number of factors (f = 7) involves more estimated parameters than the condition with the smaller number of factors (f = 3): the former has 140 unknown elements (p × f = 20 × 7), and the latter has 60 (20 × 3). Note that the number of unknown elements is used as the denominator in calculating the root mean squared deviation.
Second, the ANOVA test reveals two two-way interactions worthy of examination: that between approach and sample size (η² = 0.59) and that between approach and overdetermination (η² = 0.15). Fig. 1 displays the average values of the root mean squared deviation of the three approaches across sample sizes. On average, the loading estimates of ULSFA involved slightly smaller values of g than those under REFA when N ≥ 10, whereas REFA resulted in the smallest value of g at N = 5. GEFA exhibited the largest values of g across all sample sizes. Fig. 1 demonstrates the poor factor recovery of GEFA, particularly for larger sample sizes. In fact, GEFA had much higher values of g with N = 50 than with N = 5. Conversely, REFA and ULSFA performed better in factor recovery for larger sample sizes. Fig. 2 displays the average values of the root mean squared deviation of the three approaches under the two levels of overdetermination. When the number of factors is small (i.e., f = 3), ULSFA was associated with the smallest level of the root mean squared deviation (7.3) and REFA was associated with the second smallest (7.9). Conversely, when the number of factors is large, REFA involved the smallest level of the root mean squared deviation (6.6) and ULSFA had the second smallest (7.2). At both levels of overdetermination, GEFA yielded the largest level of the root mean squared deviation.
The above ANOVA test suggested that factor recovery of the three approaches differed between the two levels of overdetermination. To gain a greater understanding of the factor recovery capability of the three approaches under this condition, we further investigated the overall finite-sample properties of the estimated loadings of the three approaches across the overdetermination levels. Tables 2 and 3 present the average relative biases, standard deviations, and mean square errors of factor loading estimates obtained from the three approaches under the two levels of overdetermination. Among these properties, the mean square error (MSE) is the average squared difference between a parameter and its estimate, thereby indicating how far an estimate is, on average, from its parameter value, i.e., the smaller the mean square error, the closer the estimate is to the parameter. Specifically, the MSE is given by
$$\mathrm{MSE}(\hat{\theta}_j) = E[(\hat{\theta}_j - \theta_j)^2] = E[(\hat{\theta}_j - E(\hat{\theta}_j))^2] + (E(\hat{\theta}_j) - \theta_j)^2, \tag{2}$$
where $\hat{\theta}_j$ is an estimate of the parameter $\theta_j$. As shown in Eq. (2), the MSE of an estimate is the sum of its variance and squared bias. Thus, the MSE accounts for both bias and variability of the estimate (Mood et al., 1974).
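The decomposition of the MSE into variance plus squared bias can be checked numerically. The Python sketch below is illustrative only: the function name and the replicated estimates are hypothetical, and the variance is computed by dividing by the number of replications so that the identity holds exactly in the sample.

```python
from statistics import fmean

def mse_decomposition(estimates, true_value):
    """Return (mse, variance, bias) of replicated estimates of one parameter."""
    mean_estimate = fmean(estimates)
    # Variance around the mean estimate, dividing by the number of replications
    # so that MSE = variance + bias^2 holds exactly in the sample.
    variance = fmean([(e - mean_estimate) ** 2 for e in estimates])
    bias = mean_estimate - true_value
    mse = fmean([(e - true_value) ** 2 for e in estimates])
    return mse, variance, bias

# Hypothetical replicated estimates of a loading whose true value is 0.60.
estimates = [0.62, 0.71, 0.55, 0.68, 0.49]
mse, variance, bias = mse_decomposition(estimates, true_value=0.60)

# The two sides of the decomposition agree up to floating-point error.
print(abs(mse - (variance + bias ** 2)) < 1e-12)
```

An estimator can therefore have a small MSE either because it is nearly unbiased or because it is stable across samples, which is why the tables report bias, standard deviation, and MSE side by side.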
Fig. 1. The average values for the index of root mean squared deviation (g) of the loading estimates obtained from the three approaches across different sample sizes.
Table 2
Overall finite-sample properties of the loading estimates of the three approaches under high overdetermination (f = 3) across different sample sizes (RB = relative bias (%), SD = standard deviation, MSE = mean square error).
          GEFA                  ULSFA                 REFA
          RB     SD     MSE     RB     SD     MSE     RB     SD     MSE
N = 5     90     0.91   0.91    4      0.46   0.21    8      0.44   0.20
N = 10    195    1.04   1.41    6      0.34   0.11    4      0.34   0.12
N = 20    304    1.12   2.21    9      0.25   0.06    10     0.25   0.06
N = 30    419    1.13   2.95    3      0.20   0.04    2      0.21   0.04
N = 50    608    1.10   4.54    1      0.15   0.02    2      0.17   0.03

Table 3
Overall finite-sample properties of the loading estimates of the three approaches under low overdetermination (f = 7) across different sample sizes (RB = relative bias (%), SD = standard deviation, MSE = mean square error).
          GEFA                  ULSFA                 REFA
          RB     SD     MSE     RB     SD     MSE     RB     SD     MSE
N = 5     58     0.62   0.44    20     0.35   0.13    11     0.31   0.09
N = 10    217    0.82   0.93    11     0.32   0.11    4      0.27   0.07
N = 20    449    0.97   1.58    25     0.26   0.07    16     0.21   0.04
N = 30    498    1.05   2.12    8      0.24   0.06    7      0.19   0.03
N = 50    486    1.14   3.14    28     0.20   0.04    16     0.15   0.02
In the study, absolute values of relative bias greater than 10 percent may be singled out as unacceptable (Lei, 2009). As shown in Table 2, when few factors are retained, both REFA and ULSFA on average yielded loading estimates whose relative bias stayed within this 10 percent threshold across all sample sizes, whereas GEFA yielded severely and positively biased loading estimates regardless of sample size. In addition to relative bias, it is important to evaluate the stability of the three approaches. The stability of each approach was examined by computing the standard deviation. On average, REFA produced standard deviations of the estimates almost identical to those of ULSFA. As sample size increased, the standard deviations of the estimates tended to decrease for these two approaches. Conversely, GEFA was associated with much larger standard deviations, and its standard deviations tended to increase with larger sample sizes. Overall, ULSFA showed the smallest mean square errors of the loading estimates. However, the differences in the mean square errors of the estimates were negligibly small between ULSFA and REFA. On the other hand, GEFA clearly exhibits its inconsistency: a larger sample size does not help produce loading estimates closer to the true population values.
Fig. 2. The average values for the index of root mean squared deviation (g) of the loading estimates obtained from the three approaches across two levels of overdetermination (3 factors = high overdetermination, 7 factors = low overdetermination).
As Table 3 shows, when a large number of factors are retained, REFA was the least biased of the three approaches. REFA led to a tolerable degree of bias for estimates of loadings across sample sizes. The estimates under REFA were consistently associated with smaller standard deviations than those under the other approaches. On average, REFA had the smallest mean square errors of factor loading estimates across all sample sizes. In particular, this tendency of factor recovery was more apparent in the two smallest sample sizes. In general, under low overdetermination, REFA led to more accurate and stable solutions than the other approaches.
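The article reports relative bias in percent but does not spell out the formula. A common definition, assumed here, is 100 × (mean estimate − true value) / true value. The Python sketch below applies that definition together with the 10 percent cutoff cited from Lei (2009); the function names and the example estimates are hypothetical.

```python
from statistics import fmean

def relative_bias_percent(estimates, true_value):
    """Relative bias in percent; assumes the true value is nonzero."""
    return 100.0 * (fmean(estimates) - true_value) / true_value

def is_acceptable(rb_percent, threshold=10.0):
    """Flag relative bias whose absolute value does not exceed the threshold."""
    return abs(rb_percent) <= threshold

# Hypothetical replicated estimates of a loading whose true value is 0.60.
estimates = [0.66, 0.72, 0.69]
rb = relative_bias_percent(estimates, true_value=0.60)
print(is_acceptable(rb))
```

Because the definition divides by the true value, it is only meaningful for nonzero population loadings; how zero loadings were handled in the study is not described in the text.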
5. Empirical example
The simulation study presented in the previous section compared the performance of the three approaches under various experimental conditions commonly encountered in behavioral research. Nonetheless, the range of conditions in this study may still be limited in scope. Questions are often raised regarding the validity of results from a Monte Carlo simulation study. In this section, we provide concrete empirical evidence supporting results of the simulation study.
An empirical example is presented using the sensory attribute data collected by quantitative evaluation of fluid milk using a consumer panel (Chapman et al., 2001). The data set consisted of nine milk products described by eight sensory attributes (e.g., aroma, flavor, texture, etc.). Twelve sensory panelists were asked to rate each of the products on each attribute, using an intensity scale ranging from 0 (not present) to 10 (extremely strong). Because the attribute rating data collected from multiple panelists often reveal a lack of agreement, the attribute ratings of the panelists are aggregated into a single group composite value. The arithmetic mean of the individual ratings of the panelists was computed to aggregate the ratings (see Chapman et al. (2001), pp. 14–15). The aggregation of response data helps minimize individual-level biases and errors, and thus improves estimation accuracy (Van Bruggen et al., 2002).
Table 4
Factor loadings and unique variances for the four factors from the sensory attribute data (F = factors; the last column under each approach gives the unique variances).
Attributes             REFA (a)                                       ULSFA
                       F1      F2      F3      F4      Unique var.    F1      F2      F3      F4      Unique var.
Cooked aroma           0.965   0.001   0.027   0.228   0.230          0.979   0.008   0.040   0.209   0.000
Caramel aroma          0.476   0.529   0.615   0.222   0.262          0.524   0.576   0.556   0.292   0.151
Grainy aroma           0.949   0.015   0.251   0.024   0.251          0.975   0.007   0.224   0.021   0.035
Cooked flavor          0.681   0.556   0.090   0.383   0.268          0.727   0.591   0.073   0.341   0.144
Sweet flavor           0.030   0.107   0.865   0.192   0.403          0.046   0.081   0.987   0.147   0.000
Bitter flavor          0.189   0.008   0.242   0.815   0.433          0.187   0.008   0.191   0.965   0.000
Dry texture            0.007   0.888   0.094   0.102   0.371          0.022   0.992   0.094   0.083   0.186
Lingering aftertaste   0.002   0.740   0.332   0.422   0.358          0.023   0.813   0.424   0.398   0.264
(a) Factor names: F1: Cooked; F2: Drying/lingering; F3: Sweet; F4: Bitter.
Kaiser's criterion suggested a four-factor solution. To facilitate interpretation of the results, the factors were then orthogonally rotated using a varimax rotation. GEFA was not considered for the study because it showed the worst factor recovery across all experimental conditions. In this empirical study, REFA and ULSFA were applied to the mean attribute ratings. If Heywood cases (i.e., negative variance estimates) occur in the iteration process, ULSFA sets communalities to one (e.g., Dillon et al., 1987). The results of these analyses are presented in Table 4. The varimax rotated factor loadings produced by ULSFA were almost consistently larger than the corresponding REFA loadings. This was due to the existence of zero entries in the estimated unique variances under ULSFA. Some researchers assume that the Heywood case variable is explained entirely by the corresponding common factor. However, this assumption is unrealistic because of the presence of measurement error in behavioral research. It appears that ULSFA may overestimate factor loadings and therefore pose a more difficult interpretation problem. Results from ULSFA indicated that the attribute caramel aroma may cross-load (i.e., have large factor loadings on the second and third factors). Compared to ULSFA, REFA offered greater conceptual clarity on the extracted factors.
6. Discussion and recommendations
In this article, we investigated the performance of regularized exploratory factor analysis and generalized exploratory factor analysis relative to the more familiar unweighted least squares in exploratory factor analysis, using both simulated and empirical data. Based on our simulation results, we offer some recommendations for the applied researcher in animal behavior.
First, we recommend the adoption of regularized exploratory factor analysis as a sensible alternative to generalized exploratory factor analysis, a technique specifically designed to derive factors when the number of variables is larger than the number of observations. As we demonstrate, regularized exploratory factor analysis performed much better than generalized exploratory factor analysis in terms of factor recovery in small sample situations. Moreover, the loading estimates produced by generalized exploratory factor analysis never converge to the true population values as sample size increases (i.e., the estimates are inconsistent).
Second, if the number of expected factors is small, we recommend the use of unweighted least squares. On average, this approach performs better than or as well as regularized exploratory factor analysis in terms of factor recovery. The relatively good performance of unweighted least squares in situations involving fewer underlying factors has been previously reported in the literature (e.g., Preacher and MacCallum, 2002).
Finally, if researchers have little confidence that the true number of factors is small, they should use regularized exploratory factor analysis because it outperforms unweighted least squares in factor recovery when there are a relatively large number of factors. In this situation, all three approaches yielded biased estimates of factor loadings. However, regularized exploratory factor analysis had lower, and tolerable, levels of relative bias compared to the other two approaches. Moreover, regularized exploratory factor analysis resulted in more accurate estimates of loadings. The high performance level of regularized exploratory factor analysis was more prominent in very small samples.
Researchers routinely rely on factor retention criteria such as parallel analysis (Horn, 1965) to determine the number of factors to retain unless they have an a priori hypothesis about the number of factors to extract. Yet, no extant method for factor retention decisions can avoid erroneous conclusions. In addition, it is not straightforward to assess the degree of under- and overdetermination of the number of factors. Other things being equal, retaining a small number of factors is deemed preferable because retaining too many often hinders interpretability. In most cases, however, retaining too few factors may tend to reduce communalities, which in turn has an adverse effect on factor recovery in small samples (e.g., MacCallum et al., 1999). Unless researchers have strong confidence that the correct number of factors is small, we suggest that they consider regularized exploratory factor analysis as a complement to or substitute for unweighted least squares when sample size is small.
Our study provides a greater understanding of the three available approaches to exploratory factor analysis with small sample sizes. We hope that this study leads researchers in animal behavior to adopt regularized exploratory factor analysis in many situations, particularly those in which the researchers should expect to retain a relatively large number of factors. In many research domains, it is often unavoidable to have data with many measured variables and small sample size; some specific examples include behavior genetics and neuroimaging data analysis, among others. In these domains, use of regularized exploratory factor analysis can also be valuable.
Like other simulation studies, the present study is not free from limitations. Although the simulation study took into account various experimental conditions that are frequently used in Monte Carlo simulation studies in the domain of exploratory factor analysis, it may be necessary to contemplate a wider range of conditions for more careful investigations of the relative performance of the three approaches.
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.beproc.2012.11.016.
References
Bentler, P.M., de Leeuw, J., 2011. Factor analysis via components analysis. Psychometrika 76, 461–470.
Bollen, K.A., 1996. An alternative two stage least squares (2SLS) estimator for latent variable equations. Psychometrika 61, 109–121.
Browne, M.W., 1984. Asymptotically distribution-free methods for the analysis of covariance structures. British Journal of Mathematical and Statistical Psychology 37, 62–83.
Browne, M.W., Cudeck, R., Tateneni, K., Mels, G., 1998. CEFA: Comprehensive Exploratory Factor Analysis. WWW document and computer program. URL http://quantrm2.psy.ohio-state.edu/browne/
Budaev, S.V., 2010. Using principal components and factor analysis in animal behavior research: caveats and guidelines. Ethology 116, 472–480.
Chapman, K.W., Lawless, H.T., Boor, K.J., 2001. Quantitative descriptive analysis and principal component analysis for sensory characteristics of ultrapasteurized milk. J. Dairy Sci. 84, 12–20.
Cliff, N., 1966. Orthogonal rotation to congruence. Psychometrika 31, 33–42.
Cohen, J., 1988. Statistical Power Analysis for the Behavioral Sciences, 2nd ed. Lawrence Erlbaum Associates, Hillsdale, NJ.
Dillon, W.R., Kumar, A., Mulani, N., 1987. Offending estimates in covariance structure analysis: comments on the causes of and solutions to Heywood cases. Psychol. Bull. 101, 126–135.
Fabrigar, L.R., Wegener, D.T., MacCallum, R.C., Strahan, E.J., 1999. Evaluating the use of exploratory factor analysis in psychological research. Psychol. Methods 4 (3), 272–299.
Havstad, K.M., Olson-Rutz, K.M., 1991. Sample size determination for studying selected cattle foraging behaviors. Appl. Anim. Behav. Sci. 30, 17–26.
Hogarty, K.Y., Hines, C.V., Kromrey, J.D., Ferron, J.M., Mumford, K.R., 2005. The quality of factor solutions in exploratory factor analysis: the influence of sample size, communality and overdetermination. Educ. Psychol. Measur. 65, 202–226.
Horn, J.L., 1965. A rationale and test for the number of factors in factor analysis. Psychometrika 30, 179–185.
Jöreskog, K.G., 1977. Factor analysis by least square and maximum-likelihood methods. In: Enslein, K., Ralston, A., Wilf, H.S. (Eds.), Statistical Methods for Digital Computers. Wiley, New York, pp. 125–165, III.
Jung, S., Lee, S., 2011. Exploratory factor analysis for small samples. Behav. Res. Method 43, 701–709.
Konečná, M., Weiss, A., Lhota, S., Wallner, B. Personality in Barbary macaques (Macaca sylvanus): temporal stability and social rank. J. Res. Personal., http://dx.doi.org/10.1016/j.jrp.2012.06.004, in press.
Lei, P.-W., 2009. Evaluating estimation methods for ordinal data in structural equation modeling. Qual. Quant. 43 (3), 495–507.
Liu, X., Gershenfeld, H.K., 2003. An exploratory factor analysis of the tail suspension test in 12 inbred strains of mice and an F2 intercross. Brain Res. Bull. 60, 223–231.
Lorenzo-Seva, U., Ferrando, P.J., 2006. FACTOR: a computer program to fit the exploratory factor analysis model. Behav. Res. Methods, Instrum. Comp. 38, 88–91.
MacCallum, R.C., Widaman, K.F., Zhang, S., Hong, S., 1999. Sample size in factor analysis. Psychol. Methods 4, 84–99.
Mood, A.M., Graybill, F.A., Boes, D.C., 1974. Introduction to the Theory of Statistics. McGraw-Hill, New York.
Muthén, L.K., Muthén, B.O., 2006. Mplus. Statistical Analysis with Latent Variables. User's Guide, version 4.1. Los Angeles, California.
Paxton, P., Curran, P.J., Bollen, K., Kirby, J.B., Chen, F., 2001. Monte Carlo experiments: design and implementation. Struct. Equat. Model. 8, 287–312.
Preacher, K.J., MacCallum, R.C., 2002. Exploratory factor analysis in behavior genetics research: factor recovery with small sample sizes. Behav. Genet. 32, 153–161.
Trendafilov, N.T., Unkel, S., 2011. Exploratory factor analysis of data matrices with more variables than observations. J. Computation. Graph. Stat. 20, 874–891.
Tucker, L.R., Koopman, R.F., Linn, R.L., 1969. Evaluation of factor analytic research procedures by means of simulated correlation matrices. Psychometrika 34, 421–459.
Van Bruggen, G.H., Lilien, G.L., Kacker, M., 2002. Informants in organizational marketing research: why use multiple informants and how to aggregate responses. J. Mark. Res. 34 (November), 469–78.
Velicer, W.F., Fava, J.L., 1998. Effects of variable and subject sampling on factor pattern recovery. Psychol. Methods 3, 231–251.
Wijsman, R.A., 1959. Applications of a certain representation of the Wishart matrix. Ann. Math. Stat. 30, 597–601.
