
Developmental Psychology

1978, Vol. 14, No. 5, 555-560

Development of a More Reliable Version of the Matching Familiar Figures Test
ED CAIRNS AND TOMMY CAMMOCK
New University of Ulster
Coleraine, Northern Ireland

Because of the inadequate reliability of Matching Familiar Figures Test (MFFT)
error scores, a longer, more reliable version was developed. As a result of item
analysis of 30 MFFT-type items, 20 were selected on the basis of item-total error
correlations and ability to discriminate reflective from impulsive 11-year-old boys.
Two subsequent studies with subjects of similar ages suggested that split-half
correlations for the new 20-item MFFT over 2 weeks were .91 for latency and .89 for
errors, while test-retest correlations over 5 weeks were calculated as .85 and .77 for
latency and errors, respectively. These results plus those of a further study with 7-
and 9-year-old boys and girls led to the suggestion that the new MFF (the MFF20) is
suitable for use with children in the age range 7-11 years.

Although many test procedures have been cited as adequate measures of the reflection-impulsivity dimension, the Matching Familiar Figures Test (MFFT) is generally accepted as the "primary index."

Kagan developed three forms of the MFFT in the 1960s. Egeland and Weinberg (1976) point out that "none of the sources has discussed in detail the psychometric development of the tests; in fact, different primary references are often cited for the same form of the test" (p. 484). Form F of the MFFT is the most popular version, and Form S, which has generally been used as a posttest measure in pretest and posttest intervention designs, is no longer distributed. Form K, the younger children's version, also has 12 test items but an array of four instead of six alternatives.

The reported test-retest and internal consistency reliabilities of the MFFT tests (Ault, Mitchell, & Hartmann, 1976; Egeland & Weinberg, 1976), despite their statistical significance, have tended to be low to moderate by most psychometric standards. Ault et al. suggest increasing the test length and estimate that "a test of 80-96 items would be necessary to increase the reliability to .90 from its current average of .52" (p. 230). Of course, consideration of such factors as fatigue, boredom, etc., necessitates limiting the test length to not more than 24 items when testing children. Consequently, any attempt to enhance the reliability of MFFT error scores should concentrate on improving the discriminative power of the items, thus eliminating the need for a protracted version of the test. The most efficient way to select items that yield a good discriminative index is by item analysis based on the correlation of each item with the total test score (Nunnally, 1967). Nunnally argues that those items that correlate highest with the total test score should be selected because they are probably less ambiguous, cannot be extreme in difficulty in either direction, have more variance relating to the common factor among the items, and will tend to make the final test highly reliable.

The aim of this study was to improve the reliability of MFFT error scores by increasing the number of items and selecting, through item-total error performance, only those items providing an efficient discriminative index.

We are grateful to J. Kagan and T. Zelniker for providing the test items.
Copies of the MFF20 may be obtained, at cost, from the first author.
Requests for reprints should be sent to Ed Cairns, Department of Psychology, New University of Ulster, Coleraine, County Londonderry, Northern Ireland.
Copyright 1978 by the American Psychological Association, Inc. 0012-1649/78/1405-0535$00.75

Experiment 1

Method

Subjects. Subjects were 98 boys with an age range of 11.4-12.2 years attending a secondary school in a working class area. Boys were chosen because there are more data on boys' conceptual tempo and because in several studies researchers have found significant sex differences in responding to the MFFT (e.g., Lewis, Rausch, Goldberg, & Dodd, 1968). Based on Moray House Verbal IQ scores, obtained 8 months prior to this experiment, subjects were selected to represent the verbal IQ range 80-130.

Apparatus. The apparatus consisted of 32 MFFT items, 2 practice and 30 test items (Forms F and S of the MFFT tests and 6 other items supplied by T. Zelniker).

Procedure. The children were tested individually. Following the administration of two practice items, the 30 test items, which had been thoroughly shuffled prior to the testing of each subject, were presented according to standard directions (Kagan, 1965). Latency to first response and number of choices to correct response were noted for each item.

Results

The criterion for selection was highest item-total error correlation with the added provisions that the item clearly discriminate between reflectives' and impulsives' error scores and also that each target position be adequately represented. The item-total error correlations were calculated using Pearson's product-moment technique. Subjects were classified as reflective or impulsive using the double median split criterion. The median error and latency scores were 34 errors and 11.7 sec, respectively. This resulted in the classification of 42 reflectives, 33 impulsives, 13 fast-accurate, and 10 slow-inaccurate subjects. The mean error scores of reflectives and impulsives on each item were computed and t tests conducted to determine which items discriminated between the error scores of reflectives and impulsives. Coefficient alpha for the 30-item MFFT was .98 for latency and .81 for errors, both ps < .01.

Based on error scores only, five items failed to discriminate reflectives from impulsives (see Table 1), yet item-total error correlations were disappointing, ranging from .18 to .60 (comparable correlations for response times ranged from .70 to .91). In particular, those items in which the target occupied the center positions (i.e., Positions 2 or 5) resulted in low item-total error correlations. However, in order to ensure adequate representation of all target positions, thus maximizing response uncertainty, items were selected from these target position categories even though the item-total error correlations were low and one item, "leaf I," failed to discriminate between reflective and impulsive error scores.

Table 1: Item Analysis of the 30-Item Matching Familiar Figures Test (Errors Only)

                       Mean errors
Item         Form  TP  Reflectives  Impulsives  t        Total(a)  Selected
House        F     1   .36          1.30        4.11**   .43       Yes
T.V.         Z     1   .19          1.24        4.51**   .49       Yes
Aeroplane    S     1   .90          2.21        4.99**   .46       Yes
Soldier      S     1   .88          1.79        2.70**   .34       No
Bear         F     1   .24          1.42        4.77**   .49       Yes
Spaceship    Z     2   .31          .85         2.61*    .27       Yes
Ship         F     2   .33          .76         2.14*    .25       Yes
Lion         S     2   1.50         1.91        1.00     .21       No
Tree         F     2   .00          .09         .98      .18       No
Leaf         S     2   .69          1.15        1.48     .28       Yes
Glasses      S     3   .57          1.76        3.68**   .40       Yes
Dog          S     3   .83          1.36        1.96     .37       No
Cat          F     3   .62          1.33        2.43*    .40       Yes
Dress I      Z     3   .69          1.91        3.61**   .42       Yes
Telephone    F     3   .50          1.03        2.63*    .39       No
Dress II     S     4   1.76         2.79        3.34**   .41       No
Graph        S     4   .86          1.79        3.40**   .38       No
Cowboy       F     4   .64          2.09        5.16**   .60       Yes
Speedboat    Z     4   1.00         1.97        3.72**   .44       Yes
Giraffe      F     4   .29          1.64        5.92**   .47       Yes
Bed          S     5   1.02         1.33        1.04     .23       No
Flower       S     5   1.36         2.70        4.03**   .42       Yes
Lamp I       F     5   .14          .82         4.54**   .38       Yes
Dress III    F     5   .57          1.62        3.81**   .28       No
Duck         Z     5   .21          1.48        6.65**   .58       Yes
Leaf         F     6   .24          1.03        3.97**   .53       Yes
Baby         S     6   1.26         2.61        4.22**   .43       No
Wigwam       Z     6   .38          1.67        4.66**   .45       Yes
Lamp II      S     6   .33          1.52        5.75**   .50       Yes
Scissors     F     6   .05          1.18        5.64**   .59       Yes

Note. F, S, and Z are Forms F, S, and Zelniker items, respectively, of the Matching Familiar Figures Test. TP = correct response target position.
(a) Total = item score-total score correlation, errors only.
*p < .05.
**p < .01.

Experiment 2

The aim of this study was to obtain reliability coefficients for the MFF20 developed in Experiment 1 using the corrected correlation between split-halves given 2 weeks apart as recommended by Nunnally (1967). To calculate reliability using this technique, it was necessary to appoint each item an order position on the test and to divide the items into two comparable halves. The allocation of order positions to items on the test was also necessary to provide a standard presentation order for the reference of future investigators using the measure. The order positions of items on the test are illustrated in Table 2.

The allocation of items to order positions was random except for the provision that items with similar target positions were required to occupy equivalent order positions on the first 10 compared to the latter 10 order positions on the test. This provision ensured that items were assigned a fixed-order position and organized into two comparable halves (items 1-10 were Set A, and 11-20 were Set B).

Table 2: Item Order Position on the MFF20

Item         Form  Order position  Target position
Leaf         F     1               2
Scissors     F     2               6
Glasses      S     3               3
Cowboy       F     4               4
House        F     5               1
Spaceship    Z     6               2
Leaf         F     7               6
Giraffe      F     8               4
Aeroplane    S     9               1
Flower       S     10              5
Ship         F     11              2
Wigwam       Z     12              6
Cat          F     13              3
Speedboat    Z     14              4
T.V.         Z     15              1
Duck         Z     16              5
Lamp II      S     17              6
Dress I      Z     18              3
Bear         F     19              1
Lamp I       F     20              5

Note. F, S, and Z are Forms F, S, and Zelniker items, respectively, of the Matching Familiar Figures Test.

Method

Subjects. Subjects were 30 boys with an age range of 11.2-11.9 years attending a secondary school in a working class area. Based on Moray House Verbal IQ scores, obtained 8 months prior to this study, the subjects were selected to represent the verbal IQ range 85-125. This sample was equivalent to but independent of the subjects tested in Experiment 1.

Procedure. Subjects were assigned randomly to receive Set A or Set B first followed by the alternate set 10 days later. On each occasion two practice items were administered first, the standard procedure was followed (Kagan, 1965), and latency to first response and number of choices to correct response were noted for each item.

Results

Table 3: Means and Standard Deviations of Scores Obtained on the Delayed Split-Half Administration of the MFF20

                 Errors          RT
Group   Set      M       SD      M       SD
Test
1       A        13.6    5.9     9.7     5.2
2       B        10.2    7.2     14.1    8.2
Retest
1       B        12.6    5.2     8.4     3.9
2       A        10.9    5.8     12.4    8.0

Note. n = 15 for each group. RT = reaction time. MFF20 = 20-item form of the Matching Familiar Figures Test.

The delayed split-half (alternative set) product-moment correlations for errors and latency were estimated at .80 and .83, respectively (for both, df = 28, p < .01). As these correlations were derived from correlating scores on two subsets, each only one-half the length of the complete test, the reliability of the full-length test was obtained by applying the Spearman-Brown formula.
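
The Spearman-Brown step can be checked with a short sketch. The function name `spearman_brown` and its `length_factor` parameter are illustrative and do not come from the article; the inputs are the delayed split-half correlations reported for Experiment 2 (.80 for errors, .83 for latency), and a length factor of 2 is assumed because each set holds 10 of the 20 items.

```python
def spearman_brown(r_part: float, length_factor: float = 2.0) -> float:
    """Predicted reliability of a test lengthened by `length_factor`.

    r_part is the correlation between the shorter, comparable parts
    (here, Set A and Set B). With length_factor = 2 this reduces to
    the familiar split-half correction 2r / (1 + r).
    """
    return length_factor * r_part / (1.0 + (length_factor - 1.0) * r_part)


# Split-half correlations as reported in the text (assumed exact to 2 decimals).
errors_full = spearman_brown(0.80)
latency_full = spearman_brown(0.83)

print(f"errors:  {errors_full:.2f}")   # 0.89
print(f"latency: {latency_full:.2f}")  # 0.91
```

The general form with `length_factor` is kept so the same function can be used to ask how reliability would change for other test lengths; only the factor-of-2 case is used in the article.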
Application of this formula indicated that the 20-item MFFT should have reliabilities of .89 (errors) and .91 (latency).

Discussion

Coefficient alpha for the 30-item MFFT in Experiment 1 was .98 for latency and .81 for errors (both ps < .01). Thus reducing the test length from 30 to 20 items had no appreciable effect on its reliability; in fact, error score reliability improved slightly, testifying to the efficiency of the item analysis.

In order to assess if Set A and Set B items were of comparable difficulty, the mean total error and latency scores of the 30 subjects on each set were compared. On Set A items, there was a mean error score of 12.23 (SD = 5.90) and a mean latency score of 11.06 (SD = 6.76), compared with a mean error score of 11.40 (SD = 6.27) and a mean latency score of 11.25 sec (SD = 6.94) on Set B (see Table 3).

No significant differences between the means for total error, t(28) = .53, ns, and mean latency, t(28) = .12, ns, were revealed, indicating that the sets are of equivalent difficulty. These results therefore suggest that the MFF20 may be divided into two comparable halves, sufficiently equivalent for pretest and posttest research purposes, pending the development of an alternative 20-item form.

Experiment 3

According to Nunnally (1967) the corrected, delayed split-half correlation is a more appropriate measure of a test's reliability than the correlation yielded by the test-retest procedure. However, MFFT reliability has typically been determined using the latter design, and consequently the reliability coefficients derived in Experiment 2 have no comparison in the conceptual tempo literature. A third study was therefore conducted to investigate the test-retest reliability of the MFF20. This also provided an opportunity to examine the latency-error relationship.

Method

Subjects. Subjects were 37 boys, with an age range of 11.3-11.9 years, attending a secondary school in a working class area. Based on Moray House Verbal IQ scores, obtained 8 months prior to this experiment, subjects were selected to represent the verbal IQ range 85-125. This sample was equivalent to but independent of children tested in Experiments 1 and 2.

Procedure. Subjects were tested individually. Following two practice items, the 20 test items were presented in the fixed-order positions allocated in Experiment 2 (see Table 2) according to standard procedure (Kagan, 1965). Five weeks later, the same experimenter revisited the school and retested the subjects with the same 20 items.

Results

The mean total error and mean latency scores of the 37 boys on the first administration of the test were 18.84 errors (SD = 9.13) and 13.40 sec (SD = 7.44), respectively. At retest the mean total error and mean latency scores were 16.22 errors (SD = 9.15) and 12.2 sec (SD = 7.20), respectively. The product-moment stability coefficients were calculated at r(35) = .85 and r(35) = .77, both ps < .01, for latency and errors, respectively. For latency and error scores obtained on the first administration of the test, r(35) = -.67, p < .01; and on the second administration, r(35) = -.57, p < .01.

Discussion

The stability coefficient for errors on the MFF20 (.77) compares favorably with the best error stability coefficient (.36, p < .05) reported for the MFFT (Form F) when administered to a similar age group (Egeland & Weinberg, 1976). Furthermore, the results indicated that the latency-error correlation (r = -.67) is in accordance with the range of correlations (r = -.50 to -.60) that Kagan (Kagan & Messer, 1975) hitherto has considered adequate.

Kagan has always maintained that MFFT response time is "independent" of or "orthogonal" to verbal skills (see Kagan, 1965; Kagan & Kogan, 1970), and the evidence generally has supported this suggestion. In the present study Moray House Verbal IQ scores obtained some 8 months earlier yielded correlations of -.04 and -.06 (for both, df = 35, ns) for latency and error scores, respectively. Even if these correlations are corrected for attenuation due to low score reliability, the correlations rise to -.04 (latency IQ) and -.07 (errors IQ), both nonsignificant.

Experiment 4

This series of studies has outlined the construction of a more reliable version of the MFFT. So far, however, it has only been demonstrated that the new test, the MFF20, is suitable for use with boys in the 11- to 12-year age range. Because the items in the MFF20 were selected in order to be of optimum difficulty for the 11- to 12-year-olds, we undertook an examination of the reliability of the MFF20 with a younger age sample and one that included girls as well as boys in order to examine sex differences in performance on the MFF20.

Method

Subjects. Subjects included 63 boys (n = 31) and girls (n = 32) with a mean age of 7.6 years and 52 boys (n = 29) and girls (n = 23) with a mean age of 9.7 years, all of whom attended the same elementary school.

Procedure. The children were tested individually, and the MFF20 was administered following the usual two practice items and according to the standard procedures (Kagan, 1965). Two weeks later, the Verbal Meaning subtest of the Primary Mental Abilities test (Thurstone & Thurstone, 1963) was administered to the children in their classrooms.

Results

For the 9-year-old group the mean latency and mean total error scores were 11.14 sec (SD = 4.84) and 27.73 errors (SD = 11.89). Coefficients alpha for the two principal scores were .78 for errors and .94 for latency, with latency and error scores yielding r(50) = -.67, p < .01. Correlations between verbal intelligence test scores and latency and error scores were -.07 and -.06, respectively (df = 50, ns).

For the 7-year-old group the corresponding results were as follows: mean latency = 10.14 sec (SD = 9.54); mean total errors = 35.19 (SD = 11.59) with coefficient alpha calculated as .69 for errors and .92 for latency. The correlation between latency and errors was r(61) = -.62, p < .01, and these two measures yielded r(61) = .11 and r(61) = -.21, both nonsignificant, with intelligence.

In order to assess the influence of age and sex on MFF20 performance, two-way analyses of variance were performed on the latency and error scores. The analysis of the latency scores revealed no significant main effects or interactions. The analysis of error scores, however, revealed a significant main effect due to age, F(1, 114) = 12.19, p < .01, but no main effect due to sex nor any Sex x Age interaction.

Discussion

The results of the present study suggest that at the 9-year-old level the MFF20 displays the same qualities of reliability (for both latency and error scores), high negative latency-error correlation, and a lack of correlation with verbal intelligence as it did in the earlier studies with 11-year-olds. The results from the 7-year-old group are equally satisfactory except that the reliability of the error scores is somewhat lower than might be desired. However, error score reliability is still higher than that usually reported for the MFFT (Form F), especially for this age group (Ault et al., 1976).

Further, the analysis of variance suggests that sex differences are not important for MFF20 performance, while age differences are important but only for error scores. In fact, if one compares the MFF20 latency scores for the 11-year-olds in Experiment 3 (13.40 sec) and the 9-year-olds (11.14 sec) and 7-year-olds (10.14 sec) in the present study, it would appear that even between ages 7 and 11 years there is little increase in latency. Similarly, the respective error scores are 35.19, 27.73, and 18.14, which suggest a decrease in errors over the age range 7-11 years. These results are identical in trend to those obtained by Cairns (in press) for MFFT latency scores and to those obtained for error scores except that with the MFFT it appears that errors only decreased between ages 5 and 7 but not thereafter.

In conclusion the MFF20 would appear to be a test that can be safely recommended for use within the age range 9-11 years for both boys and girls. Caution must be exerted when the test is used for those younger than 9 years, and it is not recommended for use with children under 7 years.

REFERENCES

Ault, R. L., Mitchell, C., & Hartmann, D. P. Some methodological problems in reflection-impulsivity research. Child Development, 1976, 47, 227-231.
Cairns, E. Age and conceptual tempo. Journal of Genetic Psychology, in press.
Egeland, B., & Weinberg, R. A. The Matching Familiar Figures Test: A look at its psychometric credibility. Child Development, 1976, 47, 483-491.
Kagan, J. Impulsive and reflective children: Significance of conceptual tempo. In J. D. Krumboltz (Ed.), Learning and the educational process. Chicago: Rand-McNally, 1965.
Kagan, J., & Kogan, N. Individual variation in cognitive processes. In P. H. Mussen (Ed.), Carmichael's manual of child psychology (3rd ed.). New York: Wiley, 1970.
Kagan, J., & Messer, S. B. A reply to "Some misgivings about the Matching Familiar Figures Test as a measure of reflection-impulsivity." Developmental Psychology, 1975, 11, 244-248.
Kagan, J., Pearson, L., & Welsh, L. Conceptual impulsivity and inductive reasoning. Child Development, 1966, 37, 583-594.
Lewis, M., Rausch, M., Goldberg, S., & Dodd, C. Error response and time and IQ: Sex differences in cognitive style of preschool children. Perceptual and Motor Skills, 1968, 26, 563-568.
Nunnally, J. C. Psychometric theory. New York: McGraw-Hill, 1967.
Thurstone, L. L., & Thurstone, T. G. Primary mental abilities. Chicago: Science Research Associates, 1953.
Yando, R. M., & Kagan, J. The effects of teacher tempo on the child. Child Development, 1968, 39, 27-34.

(Received November 18, 1977)

