Академический Документы
Профессиональный Документы
Культура Документы
th
The primary objective of this study was to examine the effect of missing data on goodness of fit
statistics in confirmatory factor analysis (CFA). For this aim, four missing data handling methods;
listwise deletion, full information maximum likelihood, regression imputation and expectation
maximization (EM) imputation were examined in terms of sample size and proportion of missing data. It
is evident from the results that when the proportions of missingness %1 or less, listwise deletion can
be preferred. For more proportions of missingness, full information maximum likelihood (FIML)
imputation method shows visible performance and gives closest fit indices to original fit indices. For
this reason, FIML imputation method can be preferred in CFA.
Key words: Missing data, goodness of fit, confirmatory factor analysis, incomplate data, and missing value.
INTRODUCTION
Educational and psychological scientists have improved
their ability to carry out quantitative analysis on large and
complex data bases with the use of computers. The goal
of the researcher is running the most precise analysis of
the data for making acceptable and effective deductions
about the population (Schafer and Garham, 2002).
Generally, scientists have ignored or have underestimated
some kind of research problems by reason of missing
data but with the help of improved technology and
computers, this problem can be handled easily. The term
missing data means that some type of interested
information about the phenomena is missing (Kenny,
2005). Missing data is one of the most common problems
in data analysis. The problem occurs as a result of
E-mail: i.alper.kose@gmail.com
Author agree that this article remain permanently open access under the terms of the Creative Commons
Attribution License 4.0 International License
Kse
Missing at random
Deletion Methods
209
210
Kse
0%
100
200
500
1000
20%
80
160
400
800
211
Method of analysis
reliability of measure,
..true variability,
..observed variability,
..unexplained variability in the measure.
It can be seen from the equation, as the error variance
increases, reliability decreases. With respect to missing
data, lost information can give rise to larger amounts of
error variance. Missing data has negative effects on
research results such as contribution to biased resultsand
making it difficult to make valid and efficient inferences
about a population, decreasing statistical power and
finally cause violation statistical assumptions (Schafer
and Garham, 2002; Kang et al., 2005; Kenny, 2005;
RESULTS
Results of analytical procedures described in the methods
section for the simulated data are presented in this
section. In order to see the effects of missing data handling methods, goodness of fit indices were calculated
separately for each original data samples first (Table 2).
Before looking at the relative performance of various
212
df
X /df
RMSEA
GFI
SRMR
CFI
IFI
AGFI
27,67
32,42
27,64
37,59
32
29
35
35
0,86
1,11
0,79
1,07
0,685
0,302
0,807
0,351
0,000
0,024
0,000
0,009
0,95
0,97
0,99
0,99
0,062
0,047
0,026
0,022
1,00
0,98
1,00
1,00
1,00
0,98
1,00
1,00
0,91
0,94
0,98
0,99
Kse
Sample
Size
MR
1%
5%
100
10%
20%
Method
X2
Df
X2/df
RMSEA
SRMR
CFI
GFI
AGFI
Original
LWD
FIML
RI
EMI
Original
LWD
FIML
RI
EMI
Original
LWD
FIML
RI
EMI
Original
LWD
FIML
RI
EMI
27,67
27,76
28,89
28,98
28,89
27,67
26,81
27,92
26,74
31,92
27,67
23,84
26,68
28,66
26,69
27,67
25,16
27,86
29,55
28,25
32
32
32
32
32
32
32
32
32
32
32
32
32
32
32
32
32
32
32
32
0,86
0,87
0,90
0,91
0,90
0,86
0,84
0,87
0,84
1,00
0,86
0,75
0,83
0,90
0,83
0,86
0,79
0,87
0,92
0,88
0,685
0,680
0,624
0,624
0,624
0,685
0,727
0,670
0,729
0,470
0,685
0,850
0,732
0,636
0,732
0,685
0,799
0,701
0,590
0,656
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,062
0,062
0,063
0,063
0,063
0,062
0,063
0,067
0,062
0,067
0,062
0,056
0,061
0,062
0,061
0,062
0,067
0,060
0,063
0,063
1,00
1,00
1,00
1,00
1,00
1,00
1,00
1,00
1,00
0,99
1,00
1,00
1,00
1,00
1,00
1,00
1,00
1,00
1,00
1,00
0,95
0,95
0,94
0,94
0,94
0,95
0,95
0,94
0,95
0,94
0,95
0,95
0,95
0,95
0,95
0,95
0,94
0,95
0,94
0,95
0,91
0,91
0,91
0,91
0,91
0,91
0,91
0,91
0,91
0,90
0,91
0,92
0,91
0,91
0,91
0,91
0,90
0,91
0,90
0,91
Sample
Size
MR
1%
5%
200
10%
20%
Method
df
X2/df
RMSEA
SRMR
CFI
GFI
AGFI
Original
LWD
FIML
RI
EMI
Original
LWD
FIML
RI
EMI
Original
LWD
FIML
RI
EMI
Original
LWD
FIML
RI
EMI
32,42
33,02
33,75
34,41
35,75
32,42
31,11
33,12
33,91
30,56
32,42
36,61
33,79
36,07
32,33
32,42
34,97
33,83
37,68
33,05
29
29
29
29
29
29
29
29
29
29
29
29
29
29
29
29
29
29
29
29
1,12
1,14
1,16
1,19
1,13
1,12
1,07
1,14
1,17
1,05
1,12
1,26
1,16
1,24
1,11
1,12
1,21
1,16
1,3
1,14
0,302
0,300
0,298
0,224
0,288
0,302
0,36
0,299
0,242
0,386
0,302
0,156
0,279
0,171
0,227
0,302
0,205
0,285
0,129
0,275
0,024
0,027
0,028
0,031
0,025
0,024
0,02
0,025
0,029
0,016
0,024
0,036
0,027
0,035
0,027
0,024
0,038
0,03
0,039
0,029
0,047
0,048
0,047
0,048
0,047
0,047
0,047
0,048
0,048
0,046
0,047
0,05
0,05
0,05
0,052
0,047
0,048
0,049
0,052
0,051
0,98
0,97
0,97
0,96
0,97
0,98
0,98
0,97
0,97
0,98
0,98
0,95
0,96
0,96
0,97
0,98
0,97
0,96
0,94
0,97
0,97
0,97
0,97
0,97
0,97
0,97
0,97
0,97
0,97
0,97
0,97
0,96
0,97
0,97
0,96
0,97
0,97
0,97
0,96
0,96
0,94
0,94
0,94
0,94
0,94
0,94
0,94
0,94
0,94
0,94
0,94
0,93
0,93
0,93
0,93
0,94
0,94
0,94
0,93
0,93
213
214
Sample
Size
MR
1%
5%
500
10%
20%
Method
X2
df
X2/df
RMSEA
SRMR
CFI
GFI
AGFI
Original
LWD
FIML
RI
EMI
Original
LWD
FIML
RI
EMI
Original
LWD
FIML
RI
EMI
Original
LWD
FIML
RI
EMI
27,67
28,00
26,95
26,77
26,92
27,67
29,89
26,89
31,39
26,89
27,67
31,43
28,82
39,94
28,88
27,67
26,49
28,69
37,20
28,93
35
35
35
35
35
32
32
32
32
32
32
32
32
32
32
32
32
32
32
32
0,79
0,80
0,77
0,76
0,77
0,86
0,93
0,84
0,98
0,84
0,86
0,98
0,90
1,25
0,90
0,86
0,83
0,89
1,16
0,90
0,807
0,566
0,350
0,320
0,366
0,807
0,713
0,835
0,643
0,835
0,807
0,641
0,759
0,259
0,758
0,807
0,849
0,821
0,367
0,755
0,000
0,008
0,009
0,010
0,008
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,000
0,017
0,000
0,000
0,000
0,000
0,011
0,000
0,026
0,022
0,022
0,022
0,022
0,026
0,028
0,026
0,029
0,026
0,026
0,029
0,027
0,032
0,027
0,026
0,029
0,027
0,031
0,027
1,00
1,00
1,00
1,00
1,00
1,00
1,00
1,00
1,00
1,00
1,00
1,00
1,00
0,99
1,00
1,00
1,00
1,00
0,99
1,00
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,98
0,99
0,99
0,99
0,99
0,99
0,99
0,98
0,99
0,99
0,99
0,99
0,98
0,98
0,98
0,98
0,98
0,98
0,98
0,98
0,98
0,98
0,98
0,98
0,98
0,98
0,98
Sample
Size
MR
1%
5%
1000
10%
20%
Method
X2
df
X2/df
RMSEA
SRMR
CFI
GFI
AGFI
Original
LWD
FIML
RI
EMI
Original
LWD
FIML
RI
EMI
Original
LWD
FIML
RI
EMI
Original
LWD
FIML
RI
EMI
37,59
35,18
32,18
51,08
32,42
37,59
41,80
37,20
43,79
38,15
37,59
40,32
38,94
50,68
42,83
37,59
55,08
35,18
33,26
34,42
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
35
1,07
1,05
0,92
1,46
0,92
1,07
1,19
1,06
1,25
1,09
1,07
1,15
1,11
1,45
1,22
1,07
1,57
1,05
0,95
0,98
0,351
0,453
0,604
0,038
0,502
0,351
0,199
0,368
0,146
0,380
0,351
0,246
0,296
0,042
0,170
0,351
0,048
0,304
0,553
0,502
0,009
0,008
0,000
0,021
0,007
0,009
0,014
0,008
0,016
0,008
0,009
0,013
0,011
0,021
0,015
0,009
0,022
0,000
0,000
0,007
0,020
0,020
0,020
0,030
0,020
0,020
0,020
0,020
0,020
0,020
0,020
0,020
0,020
0,030
0,020
0,020
0,030
0,020
0,020
0,020
1,00
1,00
1,00
0,99
1,00
1,00
0,99
0,99
0,99
1,00
1,00
1,00
1,00
1,00
0,99
1,00
0,99
1,00
1,00
1,00
0,99
0,99
0,99
0,99
0,99
0,99
0,99
1,00
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,99
0,98
0,99
0,99
0,99
1,00
0,99
0,99
0,99
0,99
0,99
0,98
0,99
0,99
0,98
0,99
0,99
0,99
force on statistical inference (1999) was against the application of traditional methods, like listwise and pairwise
Kse
Conflict of Interests
The author(s) have not declared any conflict of interests.
REFERENCES
Acock AC.(2005). Working with missing values. J.Marriage Fam.
67:1012-1028.
Allison PD (2003). Missing data techniques for structural equation
modeling. J. Abnormal Psychol. 112(4):545-557.
Byrne BM (2001). Structural equation modeling with AMOS, EQS, and
LISREL: Comparative approaches to testing for the factorial validty of
a measuring instrument. Int. J. Testing 1(1):55-86.
Cheema J (2012). Handling missing data in educational research
using SPSS. Unpublished Doctoral Dissertation. George Mason
University, USA.,
Chen SF, Wang S,Chen CY (2012). A simulation study usinf EFA and
CFA programs based on the impact of missing data on test
dimensionality. Expert Syst. Appl. 39:4026-4031.
Davey A,Savla J(2010). Statistical power Analysis with Missing Data.
A Structural Equation Modeling Approach. Routledge. Taylor &
Francis Group. New York, NY.
Enders CK (2001). A primer on maximum likelihood algorithims
available for use with missing data. Structural Equation Modeling: A
Multidisciplinary J. 8(1):128-141.
215