
This article was downloaded by: [University of Otago]

On: 31 December 2014, At: 11:16


Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer
House, 37-41 Mortimer Street, London W1T 3JH, UK

The Journal of Experimental Education


Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/vjxe20

The Advanced Ravens Progressive Matrices


Steven M. Paul

University of California, Berkeley


Published online: 16 Apr 2014.

To cite this article: Steven M. Paul (1986) The Advanced Ravens Progressive Matrices, The Journal of Experimental
Education, 54:2, 95-100, DOI: 10.1080/00220973.1986.10806404
To link to this article: http://dx.doi.org/10.1080/00220973.1986.10806404





The Advanced Raven's Progressive Matrices: Normative Data for an
American University Population and an Examination of the Relationship
with Spearman's g
STEVEN M. PAUL
University of California, Berkeley

ABSTRACT
Normative data for the Advanced Raven's Progressive Matrices are presented based on 300 University
of California, Berkeley, students. Correlations with the
Wechsler Adult Intelligence Scale and the Terman Concept Mastery Test are reported. The relationship between the Advanced Raven's Progressive Matrices and
Spearman's g is explored.

The Raven's Progressive Matrices (RPM) are the best known and most widely used culture-reduced
tests of mental ability. British geneticist Lionel Penrose
and British psychologist J. C. Raven were the first to
present perceptual analogy and inductive reasoning
problems in the form of a matrix. In their matrices the
perceptual analogies simultaneously involve both horizontal and vertical transformations. The variety of figures, relationships, and transformations is virtually
limitless. Figures may increase or decrease in size; elements may be added or subtracted, shaded or unshaded,
flipped, rotated, mirror imaged, or show many other
progressive changes in pattern. In each case, the lower
right corner of the total matrix is missing, and the subject must select the best one of the six or eight multiple-choice alternatives to fill the empty corner.

Raven described the Progressive Matrices as "a test
of a person's present capacity to form comparisons, reason by analogy, and develop a logical method of thinking, regardless of previously acquired information"
(Raven, 1938, p. 12). He was responsible for publishing
the first Progressive Matrices Test and its subsequent
improvements and extensions (Raven, 1938, 1947,
1960). There are three forms of the RPM now in use:
Standard Progressive Matrices (SPM), Colored Progressive Matrices (CPM), and Advanced Progressive
Matrices (APM). Considerable research has been conducted involving the SPM and CPM, but little information is available concerning the APM. Adequate standardization norms are lacking in the United States. Research is notably absent in relation to university students, the type of population the APM is best suited to
measure.
The Standard Progressive Matrices consists of 60
items grouped in five sets (A, B, C, D, E) of 12 items
each. Each set involves different principles of matrix
transformation and within each set the items become
progressively more difficult. It was designed to cover
the widest possible range of mental ability and to be
equally useful with persons of all ages, whatever their
education, nationality, or physical condition. The scale
is intended to cover the entire range of intellectual development starting with the time a child is able to grasp the
idea of finding a missing piece to complete a pattern. It
is sufficiently long to assess a person's maximum capacity to form comparisons and reason by analogy without
being overly taxing or unwieldy. A person's total score
provides an index of his intellectual ability. The scores
obtained by adults tend to cluster in the upper half of
the scale.
The Colored Progressive Matrices, Sets A, Ab, and
B, were devised as a test for young children and old people, for anthropological studies, and for clinical work.
It can be used with people who, for whatever reason,
cannot understand or speak the English language, suffer
from physical disabilities, are intellectually subnormal,
or have deteriorated. To make the test independent of
verbal instructions, the problems are printed on colored
backgrounds and the scale is arranged so that it can be
presented in the form of illustrations printed in a book
or as boards with movable pieces. Success in Set Ab depends on the comprehension of discrete figures as spatially related "wholes" and in combination with Sets A
and B adequately covers the cognitive processes of which
children under 11 years of age are usually capable.
The Advanced Progressive Matrices, Sets I and II,
were constructed as a test of intellectual efficiency that
can be used with people of more than average intellectual ability and that will differentiate clearly between individuals of even superior ability. The difficulty level of
the APM is such as to make it unsuitable for persons
scoring below a raw score of about 50 on the SPM. For
the general adult population the APM has too small a
range of scores to be useful. The APM is intended for
intellectually superior youths and adults, university students, and others for whom the SPM is too easy.
The APM was originally created in 1943 for use at the
War Office Selection Boards. In 1947 a revision was
prepared for general use as a nonverbal test of the intellectual efficiency with which a person is able to form comparisons between figures and to develop a logical
method of reasoning. Based on Foulds's experimental work
with the 1947 edition (Foulds & Raven, 1950)
and an item analysis carried out by Forbes (1964), the
1962 edition of the APM dropped from Set II the 12 problems that
made no contribution to the score distributions for
adults of more than average intellectual ability
and arranged the remaining problems in order according to the frequency with which they were solved as
the total score on the revised set increased from 0 to 36.
Raven arranged the 1962 edition so that it could be
used without a time limit, in order to assess a person's
total capacity for observation and clear thinking, or
with a time limit, in order to assess the examinee's intellectual efficiency. It consists of two sets of tests. In
Set I there are 12 problems designed to introduce a person to the method of working and cover all the intellectual processes needed for success in Set II. The 36 problems in Set II are identical in presentation and argument
with those in Set I; they simply increase in difficulty more
steadily and become considerably more complex.

To assess a person's total capacity for observation
and clear thinking, Raven suggests that the examinee be
shown the problems of Set I as examples to explain the
principle of the test. The subject can then be allowed to
work through Set II at his own speed from beginning to
end without interruption. To assess a person's intellectual efficiency, Set I can be given as a short practice test
followed by Set II as a speed test. The most common
time limit is 40 minutes.
Examination of the literature reveals a preference for
the administration of the APM without a time limit.
Yates, in particular, states that even the shorter 1962
edition has not overcome the problem of power and
speed contamination when given a 40-minute time limit.
In a study involving 960 freshman university students,
he found that the number of persons not attempting to
solve problems increases with the later items of the test.
Consequently, the difficulty levels of the items cannot
be determined (Yates, 1966). Unlimited working time
virtually eliminates unattempted items
and enables a determination of the true difficulty of each
item, unconfounded by differences in speed of working.
This present study was undertaken to provide normative information about the APM administered to an
American university population. Comparisons to other
mental ability tests are presented and the relationship
between the APM and Spearman's g is explored.
Method

Subjects
Three hundred students (190 female, 110 male) from
the University of California, Berkeley, served as subjects. Their average age was 252 months (21 years) with
a standard deviation of 32 months.

Procedure
Each subject was tested individually. The basic procedure of the matrices test was explained by the experimenter using examples (problems A1 and C5) from
the SPM. Subjects were instructed to put some answer
down for every question and were given a loose time
limit of 1 hour. If the subject was not finished in an
hour an additional 10 to 15 minutes was given to complete the test. A subject's score was the total number of
items answered correctly.
One hundred fifty of the subjects were also individually given the Terman Concept Mastery Test (CMT), a
high level test of verbal ability. A different set of 62 subjects out of the 300 were also individually administered
the Wechsler Adult Intelligence Scale (WAIS).

Results
The mean total score for the sample of 300 students
was 27.0 with a standard deviation of 5.14. The median
total score was also 27.0. The mean total score of the
normative group of 170 university students presented by
Raven (1965) was only 21 (SD = 4). Gibson (1975) also
reported APM scores that were significantly higher
than the published university norms: the mean total score
of 281 applicants to a psychology honors course at Hatfield Polytechnic in Great Britain was 24.28 (SD = 4.67).
Table 1 presents the absolute frequency, cumulative
frequency percentile, t score, and normalized t score for
the total APM score values based on the sample of 300
students. The 95th percentile corresponds to a total
score between 34 and 35 for this sample. The 95th percentile value based on Raven's normative group with
similar ages is between 23 and 24. The Berkeley sample
scored much higher overall than the normative sample
of Raven's 1962 edition of the APM.
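The two derived scores named for Table 1 can be sketched as follows. This is a minimal illustration, assuming that "t score" means the conventional linear T score (mean 50, SD 10) and that the normalized version passes each score's percentile rank through the inverse normal distribution. The midpoint convention for percentile ranks and all function names are assumptions, not details given in the article.

```python
from statistics import NormalDist


def linear_t_score(raw, mean, sd):
    """Linear T score: rescale a raw score to mean 50 and SD 10."""
    return 50 + 10 * (raw - mean) / sd


def normalized_t_score(percentile):
    """Normalized T score from a percentile rank strictly between 0 and 100."""
    z = NormalDist().inv_cdf(percentile / 100)
    return 50 + 10 * z


def midpoint_percentiles(frequencies):
    """Percentile rank of each score, counting half of the cases at that score.

    `frequencies` maps raw score -> number of subjects (hypothetical input).
    The midpoint convention is an assumption made for this sketch.
    """
    total = sum(frequencies.values())
    below = 0
    ranks = {}
    for score in sorted(frequencies):
        count = frequencies[score]
        ranks[score] = 100 * (below + count / 2) / total
        below += count
    return ranks


# A raw score equal to the sample mean (27.0, SD 5.14) maps to T = 50.
t_at_mean = linear_t_score(27.0, 27.0, 5.14)
```

The linear score preserves the shape of the raw distribution, whereas the normalized score forces it toward a normal shape, which is why a table may report both.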
The internal consistency reliability based on the
Kuder-Richardson formula (KR-20) is .83. That is, approximately 83% of the variance in total test scores is
attributable to true score variance, i.e., to what the
APM is actually testing.
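For readers who wish to compute the same coefficient on their own item data, the following is a minimal sketch of the standard KR-20 formula for items scored 0 or 1. It assumes the population form of the total-score variance (dividing by n); the article does not state which form was used, and the toy responses are invented for illustration.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for dichotomous items.

    `responses` is a list of per-subject lists of 0/1 item scores.
    Uses the population variance of total scores (an assumption).
    """
    n = len(responses)
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    # Sum of item variances p * (1 - p), where p is the proportion passing.
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        pq_sum += p * (1 - p)
    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Invented toy data: 4 subjects by 3 items.
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
toy_kr20 = kr20(toy)
```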
There is strong agreement between the rank order of
the items, according to the frequency with which they
are solved, presented by Raven and those determined
for this sample (r = .94). However, there is one noteworthy exception. The item Raven ranked 13th turned
out to be much more difficult for the Berkeley students
than would have been expected. It ranked as only the
22nd most frequently solved item.
The item involves changes in three variables: object
shape (diamond, square, circle), number of internal
lines (one, two, three), and slant of internal lines (45°,
90°, 135°). The majority of subjects who did not choose
the correct response (#2) were attracted to a distractor
(#5) that ignored the necessary change in the slant of the
internal lines.
Information beyond what is provided by just total
score values can sometimes be found in an examination
of the incorrect responses to the APM (Thissen, 1976).
Selection of distractor items, incorrect multiple-choice
alternatives, for each of the problems of the APM was
examined to determine if patterns developed that would
aid in the discrimination between subjects. Two subgroups of the total sample of 300 were formed. The low
group comprised the bottom 24% of the sample, with
total scores of 23 or less (n = 72). The high
group comprised the top 26%, with
total scores of 31 or more (n = 78). A comparison was made between the two groups to see if distractors chosen by the high group were different from or
perhaps better (i.e., closer to the correct response) than
the incorrect responses chosen by the low group. No differences between the two groups were found.
TABLE 1-Absolute Frequency, Cumulative Frequency Percentile, t Score, and Normalized t Score for Total APM Score Values (N = 300). Columns: total score, absolute frequency, cumulative frequency percentile, t score, normalized t score.

Unlike most studies of the Raven's Progressive Matrices, a significant difference (α = .05) was found between the average total scores of males and females. In
this sample the males (M = 28.40, SD = 4.85, n = 110)
outscored the females (M = 26.23, SD = 5.11, n = 190).
Four percent of the variance in APM total scores can be
explained by the differences in sexes. The sex differences occasionally reported in the literature are thought
to be attributable to sampling errors. No true sex differences have been reliably demonstrated (Court & Kennedy, 1976).
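The "four percent" figure can be approximated from the reported group statistics alone. The sketch below assumes a pooled-variance independent-samples t test and the identity r² = t² / (t² + df). The article does not say how the percentage was obtained, so this is one plausible route rather than the author's documented computation.

```python
import math


def variance_explained_two_groups(m1, sd1, n1, m2, sd2, n2):
    """Return (t, r_squared) for two independent groups from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (m1 - m2) / standard_error
    # Squared point-biserial correlation between group membership and score.
    r_squared = t ** 2 / (t ** 2 + df)
    return t, r_squared


# Reported values: males M = 28.40, SD = 4.85, n = 110;
# females M = 26.23, SD = 5.11, n = 190.
t_sex, r2_sex = variance_explained_two_groups(28.40, 4.85, 110, 26.23, 5.11, 190)
```

Under these assumptions the proportion of variance rounds to .04, which agrees with the figure given in the text.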
One hundred fifty of the Raven's testees were also individually given the Terman Concept Mastery Test.
There was a moderate positive relationship (r = .44) between the total scores on the two tests (APM: M =
27.24, SD = 5.14; CMT: M = 81.69, SD = 32.80).
Sixty-two of the subjects were also administered the
WAIS. Full Scale IQ scores of the WAIS correlated .69
with the APM total scores. Correcting this correlation
for restriction of range, based on the population WAIS
IQ SD of 15, by the method given by McNemar (1949,
p. 127), the correlation becomes .84 (APM: M = 28.23,
SD = 5.08; WAIS: M = 122.84, SD = 9.30).
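The correction can be illustrated with the usual formula for direct restriction of range on one variable. This sketch assumes that McNemar's method is that standard formula; the assumption is supported only by the fact that it reproduces the reported value. The function and argument names are illustrative.

```python
import math


def correct_for_range_restriction(r, sd_restricted, sd_unrestricted):
    """Estimate the correlation in the unrestricted population.

    r: correlation observed in the restricted sample
    sd_restricted: SD of the selection variable in the sample
    sd_unrestricted: SD of the same variable in the population
    """
    k = sd_unrestricted / sd_restricted
    return r * k / math.sqrt(1 - r ** 2 + (r * k) ** 2)


# Observed r = .69, sample WAIS IQ SD = 9.30, population SD = 15.
corrected_r = correct_for_range_restriction(0.69, 9.30, 15.0)
```

With these inputs the estimate rounds to .84.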
In a similar study, McLaurin et al. (1973) reported a
correlation of .55 (.74 corrected for restriction of range)
between the APM and the WAIS based on 131 students
at the University of Alabama in Birmingham.
These results indicate that the APM, CMT, and the
WAIS are tapping some of the same general ability. The
possible nature of that ability is examined in the following section.


Spearman's g
One of the most solidly established phenomena in
psychology is that scores on all mental ability tests, no
matter how diverse the mental skills or areas they cover,
are positively intercorrelated when they are obtained in
a representative sample of the general population. It
was Spearman who first hypothesized that there is some
"general factor" of mental ability that is measured in
common by all of the intercorrelated mental tests. He
gave the label "g" to this general factor.
Spearman developed the mathematical method known
as factor analysis which enabled him to extract the g
from all the intercorrelations among a collection of diverse tests and show the correlation between each test
and the hypothetical general ability factor. The correlation of a particular test with the g factor common to all
tests in the analysis is called the test's g loading. The
square of a test's g loading indicates the proportion of
the total variance in the scores on the test that is due to
individual differences in this general ability.
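As a rough illustration of g loadings, the sketch below extracts the first principal component of a small correlation matrix by power iteration and scales the eigenvector by the square root of its eigenvalue, so that each loading is the correlation of a test with the component. The 3 x 3 matrix is invented, and principal components are used here as a simple stand-in for Spearman's own factor method.

```python
import math


def first_component_loadings(corr, iterations=500):
    """Loadings on the first principal component of a correlation matrix.

    Returns (loadings, eigenvalue). Assumes the matrix has a dominant
    eigenvalue, as when the correlations are all positive.
    """
    n = len(corr)
    v = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = math.sqrt(sum(x * x for x in w))
        v = [x / eigenvalue for x in w]
    return [math.sqrt(eigenvalue) * x for x in v], eigenvalue


# Invented example: three tests that all intercorrelate .50.
example_corr = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]
g_loadings, first_eigenvalue = first_component_loadings(example_corr)
# Squared loading = proportion of each test's variance shared with the component.
shared_variance = [loading ** 2 for loading in g_loadings]
```

In this invented case each test has a loading of about .82, so roughly two thirds of its variance is attributed to the common component.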
It is important to note that the g factor may not show
up on some tests given to highly selected groups, such as
the often tapped pool of university students, although
these tests show moderate g loadings when given to the
general population. The explanation is that these groups
have already been highly selected on g-loaded tests, such
as college entrance exams, and therefore their scores indicate less individual variation on the g factor. This
limits the intercorrelations among the various tests and
thereby prevents the g factor from showing up strongly
in a factor analysis of the matrix of intercorrelations.
Spearman originally hypothesized that each test measures only g plus some specific ability, s, which is tapped
only by the particular test. This theory that any given
test score is composed of only g + s, as well as measurement error, was soon refuted by the finding that there
are other common factors besides g in many mental
ability tests. However, they cannot be considered general factors because they do not enter into all tests, as
does g, but enter only into certain groups of tests. In
a factor analysis of a large number of various mental
tests, the first unrotated factor (or principal component)
is g or general mental ability. It usually accounts for
almost half of the total variance in a large battery of diverse tests. The several other smaller factors, the group
factors, show highly differential loadings on tests that
are often characterized as verbal, numerical, spatial, or
involving memory.
Factor analysis by itself does not and cannot explain
the basis for the existence of g. Spearman himself stated
that factor analysis cannot reveal the essential nature of
g but only reveals where to look for it. Examination of
the characteristics of a wide variety of tests in connection with their g loadings can provide some descriptive
generalizations about the common features that characterize tests that have relatively high g loadings as compared with tests that have relatively low g loadings.
Spearman originally tried to get at the psychological
nature of g by factor analyzing more than 100 tests,
each fairly homogeneous in content, and then comparing their g loadings (Spearman & Jones, 1950). He characterized the most g-loaded tests essentially as those requiring "the eduction of relations and correlates," that
is, perceiving relationships, inducing the general from
the particular, and deducing the particular from the
general. Such tests require inductive or inventive as contrasted to reproductive or rule-applying behavior. The
most g-loaded test in the whole battery was the Raven's
Progressive Matrices (RPM), which, as previously mentioned, depends almost entirely on perceiving key
features and relationships and discovering the abstract
rules that govern the differences among the elements in
the matrix.
There is much more test material available now than
was available to Spearman more than 50 years ago. This
has led to broader generalizations about g. The g factor
is manifested in tests to the degree that they involve
mental manipulation of the input elements, choice, decision, invention in contrast to selection, meaningful
memory in contrast to rote memory, long-term memory
in contrast to short-term memory, and distinguishing
relevant information from irrelevant information in
solving complex problems (Jensen, 1979).
Task complexity and the amount of conscious mental
manipulation required seem to be the most basic determinants of the g loading of a task. There are many examples in which a slight increase in task complexity is
accompanied by an increase in the g loading of the task.
Virtually any task involving mental activity that is complex enough to be recognized as involving some kind of
conscious mental effort is substantially g loaded. It is
the task's complexity rather than its content that is most
related to g. An almost infinite variety of test items, regardless of sensory modality, substantive or cultural
content, or the form of effector activity involved in the
required response, is capable of measuring g. This observation led to Spearman's principle of "the indifference of the indicator," meaning that the manifestation
of g is not limited to any particular types of information
or item types.
Previous research suggests that the Raven's Progressive Matrices administered to the general population
measures g and little else (Burke, 1958). The occasional
loadings found on other factors, independently of g, are
mostly trivial and inconsistent from one analysis to
another. Although many other tests measure g to a
similar extent, unlike the Raven, they also have loadings
on the major group factors such as verbal, numerical,
spatial, and memory. The RPM does not measure perceptual ability or spatial-visualization ability as is commonly believed. In fact, the Raven has very small loadings on these factors, when g is excluded.
Factor analysis of the RPM at the item level should
result in only a single factor. Some investigations that
have found more than one factor have employed improper orthogonal rotations of the principal components. This method can artificially create the appearance of several factors even in correlation matrices
that are artificially constructed so as to contain only one
factor plus random error (Jensen, 1980). Some of the
small spurious factors that emerge from factor analysis
of the inter-item correlations are not really ability factors at all but are "difficulty" factors, due to varying
degrees of restriction of variance on items of widely differing difficulty levels and to nonlinear regression of
item difficulties on age and ability (McDonald, 1965).
When these psychometric artifacts are taken into account, the RPM seems to measure only a single factor of
mental ability, which can be termed g.
The APM test results of this study were factor analyzed
at the item level. The first principal factor of the intercorrelation matrix of the 36 items, scored correct or incorrect, accounts for 15% of the total inter-item variance. A factor loading correlation matrix was created
from this first principal factor and subtracted from the
original inter-item correlation matrix. The resulting
residual matrix was tested and found not significantly
different from zero at α = .05. Therefore, the APM can
be considered to measure only one factor. That this factor only accounts for 15% of the total inter-item variance indicates that the variance of each item is due
mostly to uniqueness, that is, item specificity and error.
The items are not highly intercorrelated. However, what
they do have in common may indeed be Spearman's g.
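The subtraction step described above can be sketched as follows: the correlations implied by a single factor are the products of the item loadings, and the residual matrix is what remains after those products are removed from the observed correlations. The loadings and correlations here are invented, and the significance test applied to the residuals in the study is not reproduced because the article does not specify it.

```python
def residual_after_first_factor(corr, loadings):
    """Off-diagonal residual correlations after removing one factor.

    The correlation implied by a single factor for items i and j is
    loadings[i] * loadings[j]. Diagonal entries are set to zero.
    """
    n = len(corr)
    return [
        [corr[i][j] - loadings[i] * loadings[j] if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def proportion_of_variance(loadings):
    """Share of total standardized item variance carried by the factor."""
    return sum(l * l for l in loadings) / len(loadings)


# Invented one-factor example: loadings .8, .6, .5 imply these correlations.
toy_loadings = [0.8, 0.6, 0.5]
toy_corr = [
    [1.00, 0.48, 0.40],
    [0.48, 1.00, 0.30],
    [0.40, 0.30, 1.00],
]
toy_residual = residual_after_first_factor(toy_corr, toy_loadings)
largest_residual = max(abs(x) for row in toy_residual for x in row)
```

When a single factor fully accounts for the correlations, as in this constructed case, every residual is zero apart from rounding error.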
The first principal factor should not be considered
just a difficulty factor. The correlation between the item
loadings on the first principal factor and item difficulty
levels (percent passing) is - .36. (Only 13% of the variance in the first principal factor loadings can be explained
by differences in item difficulty.) The correlation between the item loadings on the first principal factor and
item variance is .41. (Sixteen percent of the variance in
the first principal factor loadings can be explained by
differences in item variance.) The correlation between
item difficulty and item variance is - 37. When item
variance is held constant, i.e., partialled out, the correlation between item loadings on the first principal factor and difficulty levels is .01.
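Holding a third variable constant in this way corresponds to the first-order partial correlation. The sketch below gives the standard formula; the input values are purely illustrative and are not the correlations from this study.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z partialled out of both."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# Illustrative values: the x-y correlation is entirely accounted for by z,
# so the partial correlation is zero.
illustrative_partial = partial_correlation(0.30, 0.60, 0.50)
```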
The loadings of each item with the first principal factor and the correlations of each item with total score are
shown in Table 2. The total score on the APM can be
considered a reasonable measure of general mental ability. This notion is at the very least intuitively appealing.
There is near perfect agreement between the correlations
of each item with total score and the correlations of each
item with the hypothesized g factor. The correlation between the 36 item-total point-biserial correlations and the
36 g loadings with the effect of item variance partialled
out is .99. This evidence supports the claim that the first
principal factor of this analysis is not just a difficulty
factor and that it measures general mental ability.
TABLE 2-APM Item Correlations with Total Score and First Principal Factor (N = 300)

Item   Total score   First principal factor   Item   Total score   First principal factor
  1        .08                .04               19       .30                .26
  2        .20                .24               20       .27                .23
  3        .25                .29               21       .50                .49
  4        .38                .43               22       .41                .36
  5        .34                .37               23       .56                .57
  6        .17                .17               24       .53                .50
  7        .24                .24               25       .37                .30
  8        .35                .37               26       .36                .30
  9        .34                .41               27       .50                .44
 10        .34                .37               28       .45                .38
 11        .30                .34               29       .50                .44
 12        .42                .46               30       .45                .38
 13        .24                .18               31       .47                .41
 14        .30                .29               32       .39                .33
 15        .24                .20               33       .43                .37
 16        .45                .47               34       .53                .51
 17        .39                .36               35       .49                .45
 18        .35                .31               36       .43                .37

Further evidence that the APM measures g comes
from a closer look at the relationship between the APM
and the WAIS. Two subgroups consisting of 13 items
each and matched on item variance were created from
the 36 APM test items. The high g group had an average
g loading of .46 and an average item variance of .15.
The low g group had an average g loading of .27 and an
average item variance of .15. Correlations were obtained
based on the 62 subjects who took both tests. The correlation between the high g items and WAIS Full Scale
IQ scores was .67. Low g items and WAIS Full Scale IQ
scores correlated .56. Although the two correlations are
not significantly different from each other (t = 1.39, α
= .05), a trend was apparent. Despite the fact that the
items of the APM and the WAIS are drastically different in content, those items correlating highest with
the hypothesized g factor derived from the APM show a
stronger relationship to WAIS Full Scale IQ scores than
those items with a low correlation with the hypothesized
APM g. Since WAIS Full Scale IQ scores have been
shown to be highly g loaded in previous research
(Matarazzo, 1972; Jensen, 1980), the pattern found here
can be interpreted to indicate that the hypothesized g of
the APM is the same g that is measured by the WAIS.
In summary then, the distribution of scores for a
large cross section of University of California students
on the Advanced Raven's Progressive Matrices is
markedly higher than the estimated score distribution of
university students that accompanies the 1962 version of
the test. A moderate positive association exists between
the APM and the Terman Concept Mastery Test. There
is an even stronger positive relationship between the
APM and the Wechsler Adult Intelligence Scale. Examination of the internal factor structure of the items of
the test indicates that the APM measures only one factor.
This factor is not just a difficulty factor. The results
support the notion that the APM provides a measure of
Spearman's g.


REFERENCES
Burke, H. R. (1958). Raven's Progressive Matrices: A review and critical evaluation. Journal of Genetic Psychology, 93, 199-228.
Court, J. H., & Kennedy, R. J. (1976). Sex as a variable in Raven's
Standard Progressive Matrices. Proceedings of the 21st International Congress of Psychology, Paris, France.
Forbes, A. R. (1964). An item analysis of the Advanced Matrices.
British Journal of Educational Psychology, 34, 1-14.
Foulds, G. A., & Raven, J. C. (1950). An experimental survey with
Progressive Matrices (1947). British Journal of Educational Psychology, 20, 4-10.
Gibson, H. B. (1975). Relations between performance on the Advanced Matrices and the EPI in high-intelligence subjects. British
Journal of Social and Clinical Psychology, 14, 363-369.
Jensen, A. R. (1979). g: Outmoded theory or unconquered frontier?
Creative Science and Technology, 2, 16-29.
Jensen, A. R. (1980). Bias in mental testing. New York: The Free
Press.
Matarazzo, J. D. (1972). Wechsler's measurement and appraisal of
adult intelligence (5th ed.). Baltimore: Williams & Wilkins.
McDonald, R. P. (1965). Difficulty factors and nonlinear factor analysis. British Journal of Mathematical and Statistical Psychology,
18, 11-23.
McLaurin, W. A., Jenkins, J. F., Farrar, W. E., & Rumore, M. C.
(1973). Correlation of IQ's on verbal and nonverbal tests of intelligence. Psychological Reports, 33, 821-822.
McNemar, Q. (1949). Psychological statistics. New York: Wiley.
Raven, J. C. (1938). Progressive Matrices: A perceptual test of intelligence, 1938, Individual form. London: H. K. Lewis.
Raven, J. C. (1947). Coloured Progressive Matrices. London: H. K.
Lewis.
Raven, J. C. (1960). Guide to the Standard Progressive Matrices.
London: H. K. Lewis.
Raven, J. C. (1965). Advanced Progressive Matrices Sets I and II.
London: H. K. Lewis.
Spearman, C., & Jones, L. L. W. (1950). Human ability. London:
Macmillan.
Thissen, D. M. (1976). Information in wrong responses to the Raven
Progressive Matrices. Journal of Educational Measurement, 13,
201-214.
Yates, A. J. (1966). A note on Progressive Matrices (1962). Australian
Journal of Psychology, 18, 281-283.
