
Factor Analysis

By

Amir Iqbal

Introduction

Factor analysis is a data reduction technique for identifying the internal structure
of a set of variables.
Factor analysis is a decompositional procedure that identifies the underlying
relationships that exist within a set of variables.
Factor analysis forms groups of metric variables (interval or ratio scaled). These
groups are called factors.
These factors can be thought of as underlying constructs that cannot be measured
by a single variable (e.g. happiness).
Common factors have effects shared in common with more than one observed
variable.
Unique factors have effects that are unique to a specific variable.
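
To make this distinction concrete, here is a minimal numpy sketch of the common-factor model, in which each observed variable is the sum of a common part and a unique part. The loadings, the 0.5 noise scale, and the six-variable layout are invented for illustration:

import numpy as np

rng = np.random.default_rng(0)
n = 500                                   # number of observations

# Hypothetical loadings: variables 1-3 load on common factor 1,
# variables 4-6 on common factor 2.
L = np.array([[0.8, 0.0],
              [0.7, 0.1],
              [0.9, 0.0],
              [0.0, 0.6],
              [0.1, 0.7],
              [0.0, 0.8]])

F = rng.standard_normal((n, 2))           # common factor scores
U = 0.5 * rng.standard_normal((n, 6))     # unique factors, one per variable

X = F @ L.T + U                           # observed = common part + unique part
print(np.round(np.corrcoef(X, rowvar=False), 2))

The printed correlation matrix shows two blocks of highly inter-correlated variables, which is exactly the pattern factor analysis looks for.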

OBJECTIVES OF FACTOR ANALYSIS

To determine how many factors are needed to explain the set of variables.
To find the extent to which each variable is associated with each of a set of common factors.
To provide an interpretation of the common factors.
To determine the amount of each factor possessed by each observation (identified by the factor scores).

Assumptions
Linear Relationship
The variables used in factor analysis should be linearly related to each other. This can be checked by looking at scatterplots of pairs of variables.

Moderately Correlated
The variables must also be at least moderately correlated with each other; otherwise the number of factors will be almost the same as the number of original variables, and carrying out a factor analysis would be pointless. Both checks are sketched below.
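
A quick way to run both checks; a minimal sketch using pandas, with invented variable names v1-v3 and toy data standing in for a real dataset:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix

# Toy data: v1 and v2 share a common component, v3 is unrelated noise.
rng = np.random.default_rng(1)
base = rng.standard_normal(200)
df = pd.DataFrame({
    "v1": base + 0.5 * rng.standard_normal(200),
    "v2": base + 0.5 * rng.standard_normal(200),
    "v3": rng.standard_normal(200),
})

# Linearity check: pairwise scatterplots.
scatter_matrix(df, figsize=(6, 6))
plt.show()

# Moderate-correlation check: inspect the correlation matrix.
print(df.corr().round(2))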

Correlation matrix
It presents the inter-correlations between the studied variables.
The dimensionality of this matrix can be reduced by looking for variables that correlate highly with a group of other variables, but correlate very badly with variables outside of that group (Field 2000).
These variables with high inter-correlations could well measure one underlying variable, which is called a factor.

A hypothetical correlation matrix

In this matrix two clusters of variables with high inter-correlations are represented. These clusters of variables could well be manifestations of the same underlying variable (Rietveld & Van Hout 1993: 255). The data of this matrix could then be reduced down into these two underlying variables or factors.

      V1    V2    V3    V4    V5    V6
V1  1.00
V2  0.77  1.00
V3  0.66  0.87  1.00
V4  0.09  0.04  0.11  1.00
V5  0.12  0.06  0.10  0.51  1.00
V6  0.14  0.08  0.08  0.61  0.49  1.00
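
The two-cluster structure can also be seen numerically. A minimal numpy sketch that eigen-decomposes the matrix above (the V1-V6 labels are added for readability and are not part of the original slide):

import numpy as np

# The hypothetical correlation matrix, mirrored to its full symmetric form.
R = np.array([
    [1.00, 0.77, 0.66, 0.09, 0.12, 0.14],
    [0.77, 1.00, 0.87, 0.04, 0.06, 0.08],
    [0.66, 0.87, 1.00, 0.11, 0.10, 0.08],
    [0.09, 0.04, 0.11, 1.00, 0.51, 0.61],
    [0.12, 0.06, 0.10, 0.51, 1.00, 0.49],
    [0.14, 0.08, 0.08, 0.61, 0.49, 1.00],
])

eigenvalues = np.linalg.eigvalsh(R)[::-1]   # sorted largest first
print(np.round(eigenvalues, 2))
# Two eigenvalues clearly exceed 1, one per cluster of variables;
# the rest are small, so two factors suffice.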

Correlation matrix: two important things
The variables have to be inter-correlated,
but there must be no extreme multi-collinearity, as this would cause difficulties in determining the unique contribution of the variables to a factor (Field 2000: 444).

Number of factors to be retained

Retain only those factors with an eigenvalue larger than 1 (Guttman-Kaiser rule);
Keep the factors which, in total, account for about 70-80% of the variance;
Make a scree plot; keep all factors before the breaking point or elbow.

How many factors to include

Use one of the following methods:
The factors account for a particular percentage (e.g. 75%) of the total variability in the original variables.
Choose factors with eigenvalues over 1 (if using the correlation matrix).
Use the scree plot of the eigenvalues. This will indicate whether there is an obvious cut-off between large and small eigenvalues.
The second method, choosing eigenvalues over 1, is probably the most common one; the sketch below implements the first two rules.
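
A minimal numpy sketch of the first two rules (the 75% target and the example eigenvalues are illustrative):

import numpy as np

def n_factors_to_retain(eigenvalues, target_variance=0.75):
    """Apply two retention rules to the eigenvalues of a correlation matrix."""
    ev = np.sort(np.asarray(eigenvalues))[::-1]
    kaiser = int(np.sum(ev > 1.0))                  # count of eigenvalues over 1
    cum = np.cumsum(ev) / ev.sum()                  # cumulative proportion of variance
    by_variance = int(np.searchsorted(cum, target_variance) + 1)
    return kaiser, by_variance

# Illustrative eigenvalues for six variables (they sum to 6).
print(n_factors_to_retain([2.6, 2.1, 0.5, 0.4, 0.3, 0.1]))   # -> (2, 2)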

Interpreting factor loadings:

By one rule of thumb in confirmatory factor analysis, loadings should be .7 or higher to confirm that independent variables identified a priori are represented by a particular factor,
on the rationale that the .7 level corresponds to about half of the variance in the indicator being explained by the factor (.7 squared is about .49).
However, the .7 standard is a high one, and real-life data may well not meet this criterion.
Some researchers therefore use a lower level, such as .4 for the central factor and .25 for other factors, calling loadings above .6 "high" and those below .4 "low".
In any event, factor loadings must be interpreted in the light of theory, not by arbitrary cutoff levels.

Kaiser criterion:
The Kaiser rule is to drop all components with eigenvalues under 1.0.
The Kaiser criterion is the default in SPSS and most computer programs,
but it is not recommended as the sole cut-off criterion for estimating the number of factors.


Scree plot:
The Cattell scree test plots the components on the X-axis and the corresponding eigenvalues on the Y-axis.
As one moves to the right, toward later components, the eigenvalues drop.
When the drop ceases and the curve makes an elbow toward a less steep decline, Cattell's scree test says to drop all further components after the one starting the elbow.
This rule is sometimes criticised for being amenable to researcher-controlled "fudging":
because picking the "elbow" can be subjective (the curve may have multiple elbows, or may be a smooth curve), the researcher may be tempted to set the cut-off at the number of factors desired by his or her research agenda. A plotting sketch follows.
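
A minimal matplotlib sketch of such a plot, using illustrative eigenvalues (in practice, take them from your own correlation matrix):

import numpy as np
import matplotlib.pyplot as plt

ev = np.array([2.6, 2.1, 0.5, 0.4, 0.3, 0.1])   # illustrative eigenvalues
components = np.arange(1, len(ev) + 1)

plt.plot(components, ev, "o-")
plt.axhline(1.0, linestyle="--")   # Kaiser criterion reference line
plt.xlabel("Component")
plt.ylabel("Eigenvalue")
plt.title("Scree plot")
plt.show()

Here the elbow after the second component suggests retaining two factors.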

If Bartlett's test of sphericity is significant,
H0 (the inter-correlation matrix of these variables is an identity matrix) is rejected.
Thus, from the perspective of Bartlett's test, factor analysis is feasible.
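
The test statistic is the usual chi-square approximation; a minimal numpy/scipy sketch (the three-variable matrix and n = 200 are illustrative):

import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(R, n):
    """Bartlett's test of sphericity for a p x p correlation matrix R and n observations."""
    p = R.shape[0]
    statistic = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return statistic, chi2.sf(statistic, df)   # H0: R is an identity matrix

R = np.array([[1.00, 0.77, 0.66],
              [0.77, 1.00, 0.87],
              [0.66, 0.87, 1.00]])
print(bartlett_sphericity(R, n=200))   # tiny p-value: reject H0, proceed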

As a rule of thumb, KMO (the Kaiser-Meyer-Olkin measure of sampling adequacy) should be 0.60 or higher in order to proceed with a factor analysis.

Kaiser suggests 0.50 as a cut-off value, and a desirable value of 0.8 or higher.
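
KMO compares observed correlations with partial correlations. A minimal numpy sketch of the standard formula (the three-variable matrix is illustrative; packaged implementations also exist, e.g. calculate_kmo in the Python factor_analyzer package):

import numpy as np

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure from a correlation matrix R."""
    S = np.linalg.inv(R)
    d = np.sqrt(np.diag(S))
    Q = -S / np.outer(d, d)          # partial (anti-image) correlations
    np.fill_diagonal(Q, 0.0)
    r2 = np.sum((R - np.eye(R.shape[0])) ** 2)   # squared off-diagonal correlations
    q2 = np.sum(Q ** 2)                          # squared partial correlations
    return r2 / (r2 + q2)

R = np.array([[1.00, 0.77, 0.66],
              [0.77, 1.00, 0.87],
              [0.66, 0.87, 1.00]])
print(round(kmo(R), 2))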


A loading must satisfy certain criteria

A factor can be interpreted if at least 4 variables have a loading of more than 0.60. The variables with the highest loadings are the "marker variables".
A factor can be interpreted if at least 10 variables have a loading of more than 0.40. The variables with the highest loadings are the "marker variables".
If fewer than 10 variables have a loading of more than 0.40 and the sample size is less than 300, the loading structure is likely to be random.
Normative: a factor loading of less than 0.2 cannot be considered => such items are omitted and the analysis must be recalculated. A sketch applying these checks follows.
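
A minimal sketch applying these rules of thumb to one factor's loadings (the loading values and sample size are invented):

import numpy as np

def factor_interpretable(loadings, n_obs):
    """Rule-of-thumb interpretability check for one factor's loading vector."""
    a = np.abs(np.asarray(loadings))
    if np.sum(a > 0.60) >= 4:
        return "interpretable: at least 4 loadings above .60"
    if np.sum(a > 0.40) >= 10:
        return "interpretable: at least 10 loadings above .40"
    if n_obs < 300:
        return "loading structure likely random"
    return "not clearly interpretable"

print(factor_interpretable([0.69, 0.72, 0.65, 0.81, 0.12, 0.04], n_obs=200))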

Example: variable v01

Communality after extraction: 0.479 => 47.9% of the variance of v01 is explained by Factors 1 and 2.
0.479 = 0.691² + 0.039²
47.9% = 47.8% + 0.1%
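
The same arithmetic in numpy (loading values taken from the slide above):

import numpy as np

loadings_v01 = np.array([0.691, 0.039])   # loadings of v01 on Factors 1 and 2
communality = np.sum(loadings_v01 ** 2)   # communality = sum of squared loadings
print(round(communality, 3))              # -> 0.479, i.e. 47.9% of v01's variance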


THANKS
