DOI 10.3758/s13421-014-0453-7
J. L. Little · M. A. McDaniel
Department of Psychology, Washington University, St. Louis, MO,
USA
M. A. McDaniel
Center for Integrative Research in Cognition, Learning, and
Education, Washington University, St. Louis, MO, USA
J. L. Little (*)
Department of Psychology, Hillsdale College, 33 E. College Street,
Hillsdale, MI 49242, USA
e-mail: jerilittle@gmail.com
Metamemory indices
In theory, in order to make good study decisions, it is necessary for learners to (a) have an accurate representation of what
they do and do not know and (b) use that knowledge appropriately to inform subsequent study, known within the
metacognitive/metamemory literature as monitoring and
control, respectively (e.g., Nelson & Narens, 1994).
Monitoring is commonly measured in two ways: absolute
and relative accuracy. Absolute accuracy measures the degree
to which learners can judge how much they learned. Relative
accuracy measures the degree to which learners can judge
which information was better or less well learned. Although
both measures can inform metacognitive control (e.g., restudy
decisions), it is sensible that relative accuracy would better
Mem Cogn
grade their tests against a key. They found that even with a key
to grade their responses, learners overestimated the accuracy
of their responses. One difficulty in learners' self-assessments
was that definitions included several idea units, and learners
often gave themselves full credit for a response when critical
idea units were missing. The implication here is that if learners
cannot accurately grade their responses when a key is provided, their likelihood of accurately judging retrieval-practice
performance without a key, let alone without their retrieval
responses, is even more unlikely (see Dunlosky, Rawson, &
Middleton, 2005). We suggest, however, that even if learners
greatly overestimate their overall memory for text, their absolute accuracy should still be better than that of learners who
reread, and retrieval practice may still provide learners with a
higher degree of relative accuracy than would rereading, as
they may have a good idea of the sections for which they
recalled the most information and the sections for which they
recalled the least.
The above expectation remains uninformed, however, as
no studies have directly investigated accuracy of metamemory
judgments as a consequence of retrieval practice versus rereading with text materials. Most related is a study in which
Roediger and Karpicke (2006b) had participants repeatedly
recall or reread a short passage and then rate their likelihood of
future recall. They found that participants gave higher ratings
in the repeated reading condition than in the repeated recall
condition (suggesting that learners underestimate the benefit
of testing relative to rereading overall); this study, however,
did not address the extent to which learners make accurate
predictions about what they know, either in terms of absolute
or relative accuracy.
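The two monitoring indices discussed above can be made concrete with a small sketch. It assumes that absolute accuracy is computed as the mean judgment minus mean recall (both on a 0-100 scale) and that relative accuracy is indexed by a Goodman-Kruskal gamma correlation across sections; both choices are common in the metamemory literature but are assumptions here, as are the function names and the data, which are hypothetical.

```python
from itertools import combinations


def absolute_accuracy(jols, recall):
    """Mean JOL minus mean recall (0-100 scale); positive values mean overconfidence."""
    return sum(jols) / len(jols) - sum(recall) / len(recall)


def gamma(jols, recall):
    """Goodman-Kruskal gamma: (concordant - discordant) / (concordant + discordant)."""
    concordant = discordant = 0
    for (j1, r1), (j2, r2) in combinations(zip(jols, recall), 2):
        product = (j1 - j2) * (r1 - r2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # Pairs tied on either variable are ignored.
    if concordant + discordant == 0:
        return None  # undefined when every pair is tied
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical learner: JOLs and percent recalled for six text sections.
jols = [80, 70, 60, 90, 50, 40]
recall = [50, 40, 45, 60, 20, 25]

overconfidence = absolute_accuracy(jols, recall)
resolution = gamma(jols, recall)
print(overconfidence, resolution)
```

In this illustration the learner overestimates overall recall by 25 points yet still orders the sections fairly well, which is the dissociation between absolute and relative accuracy that the text anticipates.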
Although little research has investigated metamemory with
text, much research has investigated metacomprehension with
text (i.e., predictions of future ability to answer inference
questions; e.g., Dunlosky et al., 2005; Rawson, Dunlosky, &
Thiede, 2000). Yet, none has done so in a manner that directly
contrasts retrieval practice with rereading. Most related,
Thiede and Anderson (2003; see also Anderson & Thiede,
2008) examined the effect of summarizing on
metacomprehension, demonstrating that summary generation (which shares some commonalities with retrieval practice) can improve metacomprehension accuracy as compared
to a no-reread control; a rereading control was not used.
Rereading is an important comparison condition for both
practical and theoretical reasons, however. In fact, although
students report using testing for diagnosing learning, Kornell
and Bjork (2007) found that 76 % of students reported using
rereading (either of whole chapters or highlighted portions)
for study (see also Hartwig & Dunlosky, 2012, for similar
results). Furthermore, rereading has been shown to improve
metacomprehension (Rawson et al., 2000). Theoretically, rereading improves metacomprehension because it leads
learners to allocate relatively more resources to the situation
Restudy allocation
Much research has examined restudy decisions following
metamemory judgments. Importantly, views vary regarding
how learners allocate future study (see, e.g., Metcalfe &
Kornell, 2005). Two prominent views are that (a) learners will
allocate attention toward information given the lowest judgments of learning (JOLs) or that they cannot recall (i.e., discrepancy reduction; e.g., Dunlosky & Hertzog, 1998; Dunlosky
& Thiede, 1998; Thiede et al., 2003), or (b) they will use a
combination of JOLs and likelihood of future learning, allocating the most time to the easiest unknown information or information given moderate JOL ratings (i.e., region of proximal
learning; Metcalfe & Kornell, 2005), perhaps not even allocating any study time to the most difficult items (e.g., Son &
Metcalfe, 2000; Thiede & Dunlosky, 1999; but see also Ariel,
Dunlosky, & Bailey, 2009, for agenda-based control in which
participants allocate study time on the basis of specific goals).
For the purposes of the present article, we will blend these
two views into the informed-decision hypothesis, a broader
view of Nelson and Leonesio's (1988) monitoring-affects-control hypothesis, which here encompasses JOLs and
retrieval-practice performance as predictors of study time
allocation. This hypothesis predicts that the correlation between JOLs and restudy time and the correlation between
retrieval-practice performance and restudy time in the
retrieval-practice condition will both be negative. The notion
that restudy time will be allocated on the basis of prior retrieval experience gains some support from previous research
(Koriat & Bjork, 2006; Mazzoni & Cornoldi, 1993;
Soderstrom & Bjork, 2014). In Mazzoni and Cornoldi's
Experiment 1, participants examined 40 sentences (e.g.,
"The bride is drinking coffee") and made JOLs, tried to recall
the sentences in a first free-recall test, restudied the sentences
at their own pace, made a second set of JOLs, and finally took
a free-recall test. Restudy time was predicted by test performance such that unrecalled sentences were studied longer than
recalled ones. In their Experiment 4 (but not Exp. 1), restudy
time was predicted by JOLs. If the informed-decision hypothesis has merit for coherent text, then in the present experiments, both JOLs and retrieval-practice performance would be
associated with restudy time decisions.
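The prediction of the informed-decision hypothesis can be illustrated with a short sketch. It assumes one common analysis: a Pearson correlation between JOLs and restudy times is computed across sections for each participant, and the resulting correlations are then tested against zero with a one-sample t test. The helper names and the three participants' data are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def one_sample_t(values, mu=0.0):
    """Return (mean, standard error, t) for a one-sample t test against mu."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    se = math.sqrt(variance / n)
    return mean, se, (mean - mu) / se


# Hypothetical data: (JOLs, restudy seconds) for four sections per participant.
participants = [
    ([20, 40, 60, 80], [50, 40, 30, 20]),
    ([20, 40, 60, 80], [40, 45, 30, 35]),
    ([20, 40, 60, 80], [45, 35, 40, 30]),
]

correlations = [pearson_r(jols, times) for jols, times in participants]
mean_r, se_r, t_value = one_sample_t(correlations)
print(correlations, mean_r, se_r, t_value)
```

A mean correlation reliably below zero, as in this toy data set, is the pattern the hypothesis predicts: sections given lower JOLs receive more restudy time.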
It is possible that the just-mentioned patterns might not
emerge, however. Because our sentences were contained within a coherent text (unlike Mazzoni & Cornoldi, 1993), it may
be less likely that participants would realize, upon reading a
sentence, that their memory for that sentence was low and
allocate time accordingly, especially if they had an overall
sense of fluency as a consequence of rereading the familiar
sentences within the context of other familiar information (Jacoby
& Whitehouse, 1989). Related to this idea is the notion that
participants can develop an illusion of knowing. Glenberg,
Wilkinson, and Epstein (1982), for example, showed that
Experiment 1
Method
Participants
A total of 81 participants (42 in the retrieval-practice
condition, 39 in the reread condition) from the
Washington University in St. Louis community participated individually for course credit or payment ($20 for
two 1-h sessions).
Materials
We used six passages about regions of the world (Norway,
Australia, Africa, Canada, Greenland, and Siberia), with each
passage containing one section about each of three topics:
geography, climate, and people (18 sections, average length
= 134 words, SD = 19.8). Each section contained eight facts.
Sections had an average coherence rating of .32 (SD = .12,
range = .16–.65; latent semantic analysis [LSA]: Landauer &
Dumais, 1997; see also Landauer, Foltz, & Laham, 1998)
and an average readability rating of 45.3 (SD = 10.55, range
= 30.8–64.7; Flesch–Kincaid index: Flesch, 1948). An example of the three sections pertaining to one region (i.e.,
Norway) is presented in the Appendix. To foreshadow, readability ratings, coherence ratings, and number of words in
each section did not predict average retrieval-practice performance, JOL ratings, or final-test performance for those sections, with one exception: the correlation between LSA scores
and retrieval-practice performance was significant, r(18) =
−.54, p = .02 (less coherent sections were associated with
better retrieval-practice performance).
Results
Scoring
On both the retrieval practice and the final test, each of the 144
facts received a score of 1, .5, or 0. Scores of .5 were awarded
to responses that were correct but vague or correct excepting
an omission of an important detail (e.g., stating that a large
sandstone rock exists in the middle of Australia without
providing the name of the formation, Uluru). In cases in which
a fact contained multiple parts, and the recall of one part
would diminish the likelihood of recalling the other part
(due, for example, to redundancy in information), 1 point was
awarded for a correct response (e.g., recalling that Africa has
over a billion people or 12 % of the Earth's population would
each earn 1 point). No points were awarded for responses that
clearly contained commission errors. The scoring guide for
the Norway passage is presented as Appendix Table 1.
The first author scored the data for 75 % of the
participants, and a research assistant scored the data
for 50 % of the participants, with the data for 25 %
of the participants being scored by both graders.
Interrater reliability for the scoring of the overlapping
retrieval-practice and final-test assessments was substantial, κ = .738, p < .001. The research assistant's scoring
was used for the twice-scored data, so that each scorer
would provide half of the data in the analyses.
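For readers unfamiliar with chance-corrected agreement, the sketch below shows one way such a reliability index can be computed. It assumes an unweighted Cohen's kappa over the three score categories (0, .5, 1); whether the reported statistic was weighted or unweighted is not stated in the text, and the function name and ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance from each rater's marginal proportions.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1:
        return None  # undefined when both raters use a single category
    return (observed - expected) / (1 - expected)


# Hypothetical scores (0, .5, or 1) given by two graders to the same ten facts.
grader_1 = [1, 1, 0.5, 0, 0, 1, 0.5, 0, 1, 0]
grader_2 = [1, 1, 0.5, 0, 0.5, 1, 0, 0, 1, 0]

kappa = cohens_kappa(grader_1, grader_2)
print(kappa)
```

Here the graders agree on 8 of 10 facts, but because 36 % agreement would be expected by chance, kappa is lower than the raw agreement rate.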
We found no significant effects (either main effects
or interactions) of the order in which information was
reviewed during rereading or retrieval practice (i.e., by
region or by topic) on judgments of learning, rereading
time, or performance, and thus, this variable will not be
discussed further.
Recall performance
Participants recalled 29.0 (SD = 11.8) facts during retrieval
practice. On the final test, participants in the retrieval-practice
condition recalled more facts (M = 33.7, SE = 2.2) than did
participants in the restudy condition (M = 25.1, SE = 2.0),
t(79) = 2.89, p < .01, d = 0.65.
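As a rough arithmetic check, the reported test statistic and effect size can be approximated from the summary statistics alone. The sketch assumes the group sizes given in the Participants section (42 and 39), recovers standard deviations as SE × √n, and uses the unpooled standard errors for t, so the results should match the reported values only to within rounding.

```python
import math

# Summary statistics reported in the text.
n_rp, n_rr = 42, 39              # retrieval-practice and reread group sizes
mean_rp, se_rp = 33.7, 2.2       # final-test recall, retrieval practice
mean_rr, se_rr = 25.1, 2.0       # final-test recall, reread

diff = mean_rp - mean_rr

# t approximated as the mean difference over the combined standard error.
t_approx = diff / math.sqrt(se_rp ** 2 + se_rr ** 2)

# Cohen's d using a pooled standard deviation recovered from the SEs.
sd_rp = se_rp * math.sqrt(n_rp)
sd_rr = se_rr * math.sqrt(n_rr)
pooled_sd = math.sqrt(
    ((n_rp - 1) * sd_rp ** 2 + (n_rr - 1) * sd_rr ** 2) / (n_rp + n_rr - 2)
)
d_approx = diff / pooled_sd

print(round(t_approx, 2), round(d_approx, 2))
```

The approximations land close to the reported t(79) = 2.89 and d = 0.65; small discrepancies are expected because the means and standard errors are themselves rounded.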
Metacognition
Judgments of learning. Participants in the retrieval-practice
condition provided lower estimates of future recall (M =
49.6, SE = 2.9) than did participants in the reread condition
(M = 79.7, SE = 3.3), t(79) = 6.91, p < .001, d = 2.7.
Restudy time. Total restudy time did not differ between the
retrieval-practice condition (M = 12.9 min, SE = 0.8) and the
reread condition (M = 12.5 min, SE = 0.9), t(78) = 0.39, p = .70.
Retrieval-practice performance and restudy time. The correlation between section-by-section retrieval-practice performance and restudy time was −.06 (SE = 0.03), which was
reliably different from zero, t(41) = 2.25, p = .03, indicating
that participants spent more time studying sections that were
associated with lower initial recall. This association was marginally weaker, however, than the correlation between JOLs
and restudy time reported above, t(42) = 1.77, p = .09. More
informative for the purposes of examining retrieval-practice
performance on restudy time, we also calculated the average
restudy times (ms/word in each sentence) for each fact as a
consequence of whether it earned 0, .5, or 1 point during
retrieval practice, and we averaged those restudy times for
each participant. Participants spent averages of 330 (SE = 21),
352 (SE = 27), and 372 ms/word (SE = 23) for facts that
Discussion
In Experiment 1, restudy was differentially allocated on
the basis of both JOLs and retrieval-practice performance,
Experiment 2
The goal of Experiment 2 was to extend the major
findings of Experiment 1, as well as to test more
directly our assumption that retrieval practice fosters
better metamemory monitoring than does rereading.
For this reason, we had participants make JOLs for all
of the sections, but they only restudied information for
half of the regions. This procedure also enabled us to
test the hypothesis that restudy is more effective following retrieval practice than following rereading.
Specifically, we predicted that the difference in final-test performance between the restudy condition and the
no-restudy condition would be greater following retrieval practice than following rereading.
To additionally extend and clarify the Experiment 1
findings, we modified the materials and procedure in
several other ways. First, a potential concern with our
analysis of restudy times in Experiment 1 is that we
transformed restudy times by dividing by the number of
words, but number of words does not alone influence
reading time. Thus, in Experiment 2 we adopted a
different approach: Participants studied the passages for
the first time at their own pace (section by section), so
that we could compare their restudy times (also section
by section) to their initial study times. Specifically, we
conducted regression analyses for each participant, with
the residuals serving as a measure of their restudy time
(see, e.g., Ferreira & Clifton, 1986, Exp. 3).
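The residual approach can be sketched as follows. The sketch assumes that, for each participant, restudy time per section is regressed on initial study time for the same section with ordinary least squares, so that a positive residual marks a section restudied longer than that participant's initial reading pace would predict. The function name and the times are hypothetical.

```python
def ols_residuals(initial_times, restudy_times):
    """Residuals from a simple linear regression of restudy time on initial study time."""
    n = len(initial_times)
    mx = sum(initial_times) / n
    my = sum(restudy_times) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(initial_times, restudy_times))
    sxx = sum((x - mx) ** 2 for x in initial_times)
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(initial_times, restudy_times)]


# Hypothetical participant: seconds spent on four sections.
initial = [30, 40, 50, 60]
restudy = [20, 28, 30, 42]

residuals = ols_residuals(initial, restudy)
print([round(r, 2) for r in residuals])
```

Because the regression is fit separately for each participant, the residuals are centered on zero within participant, which removes individual differences in overall reading speed from the restudy measure.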
Second, we provided participants with more time during
the review phase (24 min, as compared to 18 min in Exp. 1),
and participants were allowed to recall or reread the information in any order that they chose (rather than in the two orders
that we utilized in Exp. 1). Rereading or recalling in an
unconstrained manner should provide a more conservative
test of the benefit of retrieval practice because, in the case of
Experiment 2, participants would be able to move between
regions during rereading, which should provide them with a
chance to compare and contrast the material, a process that
might foster elaborative encoding. Finally, participants typed
their responses in both conditions. The increase in time, as
well as having participants type their responses, was intended
to maximize participants' ability to recall everything that they
remembered (nevertheless, performance remained about the
same as in Exp. 1).
Method
Participants
A total of 80 participants (40 in the retrieval-practice condition, 40 in the reread condition) from the Washington
University in St. Louis community participated individually
for course credit or payment ($20 for two 1-h sessions).
Materials
The materials from Experiment 1 were modified to control
better for coherence and readability. Specifically, for each
section, LSA ratings ranged from .25 to .35, Flesch–Kincaid
readability ratings ranged from 45 to 65, and the
numbers of words ranged from 93 to 122. Readability ratings,
coherence ratings, and numbers of words in each section did
not predict average retrieval-practice performance, JOL ratings, restudy time (residuals), or final-test performance for
those sections.
Procedure
Participants read the passages, self-paced, one section (e.g.,
Norway: Geography) at a time, with a maximum of 60 s per
section, and reading times were recorded. After participants
had read all of the passages once, they were randomly
assigned either to a retrieval-practice condition (i.e., free-recall test) or to a reread condition. Participants in the
retrieval-practice condition were provided with an electronic
document labeled with each region and the three topics for
each region (e.g., Geography) on the computer screen, and
they were told to type as much information as they could
remember but that they need not write in complete sentences.
They were given 24 min to do this task, and they were allowed
to recall information in any order. Participants in the reread
condition were given a six-page packet with the information
for each region printed on a single page, and they were told
that they would have 24 min to restudy the information in any
order. After the review activity, all participants made JOLs in
the same manner that was described in Experiment 1.
Next, all participants were told that they would have one
last chance to study half of the information; sections (e.g.,
Norway: Geography) would be presented one at a time on the
computer screen, and they could spend as much or as little
time as they wanted on each section, moving from one section
to the next by pressing Enter. Sections were presented in the
same order as in the initial reading (either the first or last three
passages, counterbalanced across participants), with a maximum of 75 s per section. Restudy time was recorded.
The final free-recall test took place 48 h after the first
session. Participants were provided with an electronic document with the name of each region on the computer screen.
General discussion
Testing is presumed to improve learning in educational contexts, in part because it gives the learner metacognitive
Discussion
Although the procedure changed in several ways between
Experiments 1 and 2 (i.e., initial study and restudy occurred
section by section; initial study was self-paced; participants could
retrieve/reread information in any order, and extended time was
provided to do so; and only half of the regions were restudied),
we generally obtained the same pattern of results. Specifically,
retrieval practice was better for metamemory accuracy (and recall
performance) than was rereading, and JOLs informed restudy
time consistent with the informed-decision hypothesis.
Metacognitive control
Metcalfe and Finn (2008) suggested that people use JOLs to
inform what they choose to restudy, even when such JOLs do
not reflect learning, and we found evidence consistent with
this notion. Across the two studies, we observed that restudy
time was predicted by JOLs (M = −.11, SE = .02), t(145) =
4.41, p < .001, in a manner consistent with the
informed-decision hypothesis, but that JOLs were not relied upon more
in the retrieval-practice condition (M = −.13, SE = .4) than in
the reread condition (M = −.09, SE = .03), t(143) = 0.64, p =
.53, even though JOLs were much more accurate in the
retrieval-practice than in the reread condition. We posited,
on the basis of these data as well as of the work by Metcalfe
and Finn, that participants use their JOLs to inform future
study, regardless of how accurate those JOLs are, and that they
practice without the effect being caused directly by the accuracy of the JOLs.
In sum, students report rereading (either full texts or
highlighted portions) as a favored study strategy, although
they also report using retrieval-based study techniques (e.g.,
flashcards, practice tests) for monitoring purposes. We found
that for text materials, retrieval practice improved learners'
ability to monitor their learning as compared to rereading, and
restudy was more effective following such practice.
Appendix
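Appendix Table 1. Scoring guide for the Norway passage (columns: Section, Fact, Answer; rows: facts 1–8 for each of the Geo, Clim, and Ppl sections; answer entries not available in this copy)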
To earn a full point for each fact, participants needed to recall the
information presented in the table above, with the following exceptions.
A slash mark indicates that participants would receive a full point for
recalling one piece of information. Recall of information contained within
parentheses was not necessary to earn a full point, but recalling that
information would earn half a point. For the 7th and 8th facts in the
People section, two of the three listed groups would need to be recalled
for a full point; the recall of one listed group would earn half a point. Geo,
Geography; Clim, Climate; Ppl, People
Norway
Geography
Norway comprises the western portion of the
Scandinavian Peninsula. Norway is larger than either
Italy or Great Britain. Much of the country is dominated
by mountainous or high terrain, making much of the land
uninhabitable. The country has one of the longest and
most rugged coastlines in the world. The rugged coastline
is marked by fjords, which are long, narrow inlets with
steep sides or cliffs, created in a valley carved out by
glacial activity. The longest fjord in Norway is the
Sognefjorden (127 miles). Additionally, there are some
50,000 islands off of the coastline. The capital, Oslo, is
located in the southeast of the country, at the end of the
Oslofjorden.
Climate
Norway's climate shows great variation, due in part to the 13-degree span in latitude and the rugged topology. Summers are
remarkably mild for the latitude; Norway's (reasonably) temperate climate is the result of the warming Gulf Stream. The
average high temperature in the capital during summer is 70
degrees Fahrenheit. The warmest temperature ever recorded in
Norway was 96 degrees Fahrenheit in Nesbyen (in the south).
The winters, however, can be very cold: The coldest temperature recorded was −61 degrees in the north. Rainfall is very
heavy in the west with an average of about 88 inches yearly.
Although it doesn't snow much along the western coast, the
References
Anderson, M. C. M., & Thiede, K. W. (2008). Why do delayed summaries improve metacomprehension accuracy? Acta Psychologica,
128, 110–118. doi:10.1016/j.actpsy.2007.10.006
Ariel, R., Dunlosky, J., & Bailey, H. (2009). Agenda-based regulation of
study-time allocation: When agendas override item-based monitoring. Journal of Experimental Psychology: General, 138, 432–447.
doi:10.1037/a0015928
Arnold, K. M., & McDermott, K. B. (2013). Test-potentiated learning:
Distinguishing between direct and indirect effects of tests. Journal
of Experimental Psychology: Learning, Memory, and Cognition, 39,
940–945. doi:10.1037/a0029199
Callender, A. A., & McDaniel, M. A. (2009). The limited benefits of
rereading educational texts. Contemporary Educational Psychology,
34, 30–41.
deWinstanley, P. A., & Bjork, E. L. (2004). Processing strategies and the
generation effect: Implications for making a better reader. Memory
& Cognition, 32, 945–955. doi:10.3758/BF03196872
Dunlosky, J., Hartwig, M. K., Rawson, K. A., & Lipko, A. R. (2011).
Improving college students' evaluation of text learning using idea-unit standards. Quarterly Journal of Experimental Psychology, 64,
467–484.
Dunlosky, J., & Hertzog, C. (1998). Training programs to improve
learning in later adulthood: Helping older adults educate themselves.
In D. J. Hacker, J. Dunlosky, & A. C. Graesser (Eds.),
Metacognition in educational theory and practice (pp. 249–275).
Mahwah: Erlbaum.
Dunlosky, J., & Matvey, G. (2001). Empirical analysis of the
intrinsic–extrinsic distinction of judgments of learning (JOLs): Effects of
relatedness and serial position on JOLs. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 27, 1180–1191.
doi:10.1037/0278-7393.27.6.1180
Dunlosky, J., & Nelson, T. O. (1992). Importance of the kind of cue for
judgments of learning (JOL) and the delayed-JOL effect. Memory &
Cognition, 20, 373–380.
Kornell, N., & Son, L. K. (2009). Learners' choices and beliefs about self-testing. Memory, 17, 493–501.
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem:
The latent semantic analysis theory of acquisition, induction, and
representation of knowledge. Psychological Review, 104, 211–240.
doi:10.1037/0033-295X.104.2.211
Landauer, T. K., Foltz, P. W., & Laham, D. (1998). An introduction to
latent semantic analysis. Discourse Processes, 25, 259–284.
doi:10.1080/01638539809545028
Lovelace, E. A. (1984). Metamemory: Monitoring future recallability
during study. Journal of Experimental Psychology: Learning,
Memory, and Cognition, 10, 756–766.
Maki, R. H., & Serra, M. (1992). The basis of test predictions for text
material. Journal of Experimental Psychology: Learning, Memory,
and Cognition, 18, 116–126. doi:10.1037/0278-7393.18.1.116
Mazzoni, G., & Cornoldi, C. (1993). Strategies in study time allocation:
Why is study time sometimes not effective? Journal of
Experimental Psychology: General, 122, 47–60.
McDaniel, M. A., & Masson, M. E. (1985). Altering memory representations through retrieval. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 11, 371–385.
doi:10.1037/0278-7393.11.2.371
Metcalfe, J., & Finn, B. (2008). Evidence that judgments of learning are
causally related to study choice. Psychonomic Bulletin & Review,
15, 174–179. doi:10.3758/PBR.15.1.174
Metcalfe, J., & Kornell, N. (2005). A region of proximal learning model
of study time allocation. Journal of Memory and Language, 52,
463–477.
Nelson, T. O. (1984). A comparison of current measures of accuracy of
feeling-of-knowing predictions. Psychological Bulletin, 95,
109–133. doi:10.1037/0033-2909.95.1.109
Nelson, T. O., & Dunlosky, J. (1991). When people's judgments of
learning (JOLs) are extremely accurate at predicting subsequent
recall: The delayed-JOL effect. Psychological Science, 2,
267–270.
Nelson, T. O., Dunlosky, J., Graf, A., & Narens, L. (1994). Utilization of
metacognitive judgments in the allocation of study during multitrial
learning. Psychological Science, 5, 207–213.
Nelson, T. O., & Leonesio, R. J. (1988). Allocation of self-paced study
time and the labor-in-vain effect. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 14, 676–686.
doi:10.1037/0278-7393.14.4.676
Nelson, T. O., & Narens, L. (1994). Why investigate metacognition? In J.
Metcalfe & A. P. Shimamura (Eds.), Metacognition: Knowing about
knowing (pp. 1–25). Cambridge, MA: MIT Press.
Rawson, K. A., Dunlosky, J., & Thiede, K. W. (2000). The rereading
effect: Metacomprehension accuracy improves across reading trials.
Memory & Cognition, 28, 1004–1010.
Rhodes, M. G., & Castel, A. D. (2008). Memory predictions are influenced by perceptual information: Evidence for metacognitive illusions. Journal of Experimental Psychology: General, 137, 615–625.
doi:10.1037/a0013684
Rhodes, M. G., & Tauber, S. K. (2011). The influence of delaying
judgments of learning on metacognitive accuracy: A meta-analytic