
Journal of Experimental Psychology: General

1975, Vol. 104, No. 3, 268-294

Depth of Processing and the Retention of Words in Episodic Memory
Fergus I. M. Craik and Endel Tulving
University of Toronto, Toronto, Ontario, Canada

SUMMARY

Ten experiments were designed to explore the levels of processing framework
for human memory research proposed by Craik and Lockhart (1972). The basic
notions are that the episodic memory trace may be thought of as a rather auto-
matic by-product of operations carried out by the cognitive system and that the
durability of the trace is a positive function of "depth" of processing, where depth
refers to greater degrees of semantic involvement. Subjects were induced to
process words to different depths by answering various questions about the words.
For example, shallow encodings were achieved by asking questions about type-
script; intermediate levels of encoding were accomplished by asking questions
about rhymes; deep levels were induced by asking whether the word would fit into
a given category or sentence frame. After the encoding phase was completed,
subjects were unexpectedly given a recall or recognition test for the words. In
general, deeper encodings took longer to accomplish and were associated with
higher levels of performance on the subsequent memory test. Also, questions lead-
ing to positive responses were associated with higher retention levels than questions
leading to negative responses, at least at deeper levels of encoding.
Further experiments examined this pattern of effects in greater analytic detail.
It was established that the original results did not simply reflect differential encod-
ing times; an experiment was designed in which a complex but shallow task took
longer to carry out but yielded lower levels of recognition than an easy, deeper
task. Other studies explored reasons for the superior retention of words associated
with positive responses on the initial task. Negative responses were remembered
as well as positive responses when the questions led to an equally elaborate encoding
in the two cases. The idea that elaboration or "spread" of encoding provides a
better description of the results was given a further boost by the finding of the
typical pattern of results under intentional learning conditions, and where each
word was exposed for 6 sec in the initial phase. While spread and elaboration
may indeed be better descriptive terms for the present findings, retention depends
critically on the qualitative nature of the encoding operations performed; a
minimal semantic analysis is more beneficial than an extensive structural analysis.
Finally, Schulman's (1974) principle of congruity appears necessary for a
complete description of the effects obtained. Memory performance is enhanced
to the extent that the context, or encoding question, forms an integrated unit with
the word presented. A congruous encoding yields superior memory performance
because a more elaborate trace is laid down and because in such cases the struc-
ture of semantic memory can be utilized more effectively to facilitate retrieval.
The article concludes with a discussion of the broader implications of these data
and ideas for the study of human learning and memory.
DEPTH OF PROCESSING AND WORD RETENTION 269

While information-processing models of human memory have been concerned largely with structural aspects of the system, there is a growing tendency for theorists to focus, rather, on the processes involved in learning and remembering. Thus the theorist's task, until recently, has been to provide an adequate description of the characteristics and interrelations of the successive stages through which information flows. An alternative approach is to study more directly those processes involved in remembering (processes such as attention, encoding, rehearsal, and retrieval) and to formulate a description of the memory system in terms of these constituent operations. This alternative viewpoint has been advocated by Cermak (1972), Craik and Lockhart (1972), Hyde and Jenkins (1969, 1973), Kolers (1973a), Neisser (1967), and Paivio (1971), among others, and it represents a sufficiently different set of fundamental assumptions to justify its description as a new paradigm, or at least a miniparadigm, in memory research. How should we conceptualize learning and retrieval operations in these terms? What changes in the system underlie remembering? Is the "memory trace" best regarded as some copy of the item in a memory store (Waugh & Norman, 1965), as a bundle of features (Bower, 1967), as the record resulting from the perceptual and cognitive analyses carried out on the stimulus (Craik & Lockhart, 1972), or do we remember in terms of the encoding operations themselves (Neisser, 1967; Kolers, 1973a)? Although we are still some way from answering these crucial questions satisfactorily, several recent studies have provided important clues.
The incidental learning situation, in which subjects perform different orienting tasks, provides an experimental setting for the study of mental operations and their effects on learning. It has been shown that when subjects perform orienting tasks requiring analysis of the meaning of words in a list, subsequent recall is as extensive and as highly structured as the recall observed under intentional conditions in the absence of any specific orienting task; further research has indicated that a "process" explanation is most compatible with the results (Hyde, 1973; Hyde & Jenkins, 1969, 1973; Walsh & Jenkins, 1973). Schulman (1971) has also shown that a semantic orienting task is followed by higher retention of words than a "structural" task in which the nonsemantic aspects of the words are attended to. Similar findings have been reported for the retention of sentences (Bobrow & Bower, 1969; Rosenberg & Schiller, 1971; Treisman & Tuxworth, 1974) and in memory for faces (Bower & Karlin, 1974). In all these experiments, an orienting task requiring semantic or affective judgments led to better memory performance than tasks involving structural or syntactic judgments. However, the involvement of semantic analyses is not the whole story: Schulman (1974) has shown that congruous queries about words (e.g., "Is a SOPRANO a singer?") yield better memory for the words than incongruous queries (e.g., "Is MUSTARD concave?"). Instruction to form images from the words also leads to excellent retention (e.g., Paivio, 1971; Sheehan, 1971).
The results of these studies have important theoretical implications. First, they demonstrate a continuity between incidental and intentional learning: the operations carried out on the material, not the intention to learn, as such, determine retention. The results thus corroborate Postman's (1964) position on the essential similarity of incidental and intentional learning, although the recent work is more usually described in terms of similar processes rather than similar responses (Hyde & Jenkins, 1973). Second, it seems clear that attention to the word's meaning is a necessary prerequisite of good retention. Third, since retrieval
_________________________
The research reported in this article was supported by National Research Council of Canada Grants A8261 and A8632 to the first and second authors, respectively. The authors gratefully acknowledge the assistance of Michael Anderson, Ed Darte, Gregory Mazuryk, Marsha Carnat, Marilyn Tiller, and Margaret Barr.
Requests for reprints should be sent to F. I. M. Craik, Erindale College, University of Toronto, Mississauga, Ontario, L5L 1C6, Canada.
270 FERGUS I. M. CRAIK AND ENDEL TULVING

conditions are typically held constant in the experiments described above, the differences in retention reflect the effects of different encoding operations, although the picture is complicated by the finding that different encoding operations are optimal for different retrieval conditions (e.g., Eagle & Leiter, 1964; Jacoby, 1973). Fourth, large differences in recall under different encoding operations have been observed under conditions where the subjects' task does not entail organization or establishment of interitem associations; thus the results seem to take us beyond associative and organization processes as important determinants of learning and retention. It may be, of course, that the orienting tasks actually do lead to organization as suggested by the results of Hyde and Jenkins (1973). Yet, it now becomes possible to entertain the hypothesis that optimal processing of individual words, qua individual words, is sufficient to support good recall. Finally, the experiments may yield some insights into the nature of learning operations themselves. Classical verbal learning theory has not been much concerned with processes and changes within the system but has concentrated largely on manipulations of the material or the experimental situation and the resulting effects on learning. Thus at the moment, we know a lot about the effects of meaningfulness, word frequency, rate of presentation, various learning instructions, and the like, but rather little about the nature and characteristics of underlying or accompanying mental events. Experimental and theoretical analysis of the effects of various encoding operations holds out the promise that intentional learning can be reduced to, and understood in terms of, some combination of more basic operations.
The experiments reported in the present paper were carried out to gain further insights into the processes involved in good memory performance. The initial experiments were designed to gather evidence for the depth of processing view of memory outlined by Craik and Lockhart (1972). These authors proposed that the memory trace could usefully be regarded as the by-product of perceptual processing; just as perception may be thought to be composed of a series of analyses, proceeding from early sensory processing to later semantic-associative operations, so the resultant memory trace may be more or less elaborate depending on the number and qualitative nature of the perceptual analyses carried out on the stimulus. It was further suggested that the durability of the memory trace is a function of depth of processing. That is, stimuli which do not receive full attention, and are analyzed only to a shallow sensory level, give rise to very transient memory traces. On the other hand, stimuli that are attended to, fully analyzed, and enriched by associations or images yield a deeper encoding of the event, and a long-lasting trace.
The Craik and Lockhart formulation provides one possible framework to accommodate the findings from the incidental learning studies cited above. It has the advantage of focusing attention on the processes underlying trace formation and on the importance of encoding operations; also, since memory traces are not seen as residing in one of several stores, the depth of processing approach eliminates the necessity to document the capacity of postulated stores, to define the coding characteristic of each store, or to characterize the mechanism by which an item is transferred from one store to another. Despite these advantages, there are several obvious shortcomings of the Craik and Lockhart viewpoint. Does the levels of processing framework say any more than "meaningful events are well remembered"? If not, it is simply a collection of old ideas in a somewhat different setting. Further, the position may actually represent a backward step in the study of human memory since the notions are much vaguer than any of the mathematical models proposed, for example, in Norman's (1970) collection. If we already know that the memory trace can be precisely represented as $l = \lambda e^{-\psi t^{1-\gamma}}$ (Wickelgren, 1973), then such woolly statements as "deeper processing yields a

more durable trace" are surely far behind us. Third, and most serious perhaps, the very least the levels position requires is some independent index of depth; there are obvious dangers of circularity present in that any well-remembered event can too easily be labeled deeply processed.
Such criticisms can be partially countered. First, cogent arguments can be marshaled (e.g., Broadbent, 1961) for the advantages of working with a rather general theory, provided the theory is still capable of generating predictions which are distinguishable from the predictions of other theories. From this general and undoubtedly true starting point, the concepts can be refined in the light of experimental results suggested by the theoretical framework. In this sense the levels of processing viewpoint will encourage rather different types of question and may yield new insights. A further point on the issue of general versus specific theories is that while strength theories of memory are commendably specific and sophisticated mathematically, the sophistication may be out of place if the basic premises are of limited generality or even wrong. It is now established, for example, that the trace of an event can be readily retrieved in one environment of retrieval cues, while it is retrieved with difficulty in another (e.g., Tulving & Thomson, 1973); it is hard to reconcile such a finding with the view that the probability of retrieval depends only on some unidimensional strength.
With regard to an independent index of processing depth, Craik and Lockhart (1972) suggested that, when other things are held constant, deeper levels of processing would require longer processing times. Processing time cannot always be taken as an absolute indicator of depth, however, since highly familiar stimuli (e.g., simple phrases or pictures) can be rapidly analyzed to a complex meaningful level. But within one class of materials, or better, with one specific stimulus, deeper processing is assumed to require more time. Thus, in the present studies, the time to make decisions at different levels of analysis was taken as an initial index of processing depth.
The purpose of this article is to describe 10 experiments carried out within the levels of processing framework. The first experiments examined the plausibility of the basic notions and attempted to rule out alternative explanations of the results. Further experiments were carried out in an attempt to achieve a better characterization of depth of processing and how it is that deeper semantic analysis yields superior memory performance. Finally, the implications of the results for an understanding of learning operations are examined, and the adequacy of the depth of processing metaphor questioned.

EXPERIMENTAL INVESTIGATIONS
Since one basic paradigm is used throughout the series of studies, the method will be described in detail at this point. Variations in the general method will be indicated as each study is described.

General Method
Typically, subjects were tested individually. They were informed that the experiment concerned perception and speed of reaction. On each trial a different word (usually a common noun) was exposed in a tachistoscope for 200 msec. Before the word was exposed, the subject was asked a question about the word. The purpose of the question was to induce the subject to process the word to one of several levels of analysis, thus the questions were chosen to necessitate processing either to a relatively shallow level (e.g., questions about the word's physical appearance) or to a relatively deep level (e.g., questions about the word's meaning). In some experiments, the subject read the questions on a card; in others, the question was read to him. After reading or hearing the question, the subject looked in the tachistoscope with one hand resting on a yes response key and the other on a no response key. One second after a warning "ready" signal the word appeared and the subject recorded his (or her) decision by pressing the appropriate key (e.g., if the question was "Is the word an animal name?" and the word presented was TIGER, the subject would respond yes). After a series of such question and answer trials, the subject was unexpectedly given a retention test for the words. The expectation was that memory performance would vary systematically with the depth of processing.
Three types of question were asked in the initial encoding phase. (a) An analysis of the physical structure of the word was effected by asking about the physical structure of the word

(e.g., "Is the word printed in capital letters?"). (b) A phonemic level of analysis was induced by asking about the word's rhyming characteristics (e.g., "Does the word rhyme with TRAIN?"). (c) A semantic analysis was activated by asking either categorical questions (e.g., "Is the word an animal name?") or "sentence" questions (e.g., "Would the word fit the following sentence: 'The girl placed the ____________ on the table'?"). Further examples are shown in Table 1. At each of the three levels of analysis, half of the questions yielded yes responses and half no responses.

TABLE 1
TYPICAL QUESTIONS AND RESPONSES USED IN THE EXPERIMENTS

The general procedure thus consisted of explaining the perceptual-reaction time task to a single subject, giving him a long series of trials in which both the type of question and yes-no decisions were randomized, and finally giving him an unexpected retention test. This test was either free recall ("Recall all the words you have seen in the perceptual task, in any order"); cued recall, in which some aspect of each word event was represented as a cue; or recognition, where copies of the original words were re-presented along with a number of distractors. In the initial encoding phase, response latencies were in fact recorded: A millisecond stop clock was started by the timing mechanism which activated the tachistoscope, and the clock was stopped by the subject's key response. Typically, over a group of subjects, the same pool of words was used, but each word was rotated through the various level and response combinations (CAPITALS?-yes; SENTENCE?-no, and so on). The general prediction was that deeper level questions would take longer to answer but would yield a more elaborate memory trace which in turn would support higher recognition and recall performance.
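The randomized encoding phase described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the function name `build_trial_block`, the level labels, the 60-word pool size, the seed, and the placeholder word names are all assumptions introduced for illustration. It shows only the two constraints stated in the text, namely that half of the questions at each level call for a yes response and half for a no response, and that question type and response are randomized across the trial series.

```python
import random
from collections import Counter

# Assumed labels for the three levels of analysis named in the General Method.
LEVELS = ("structural", "phonemic", "semantic")
RESPONSES = ("yes", "no")


def build_trial_block(words, seed=0):
    """Allocate words evenly to level x response cells, then shuffle the order.

    Hypothetical helper: the article does not specify an allocation algorithm.
    """
    cells = [(level, resp) for level in LEVELS for resp in RESPONSES]
    if len(words) % len(cells) != 0:
        raise ValueError("word count must be a multiple of the number of cells")
    rng = random.Random(seed)
    pool = list(words)
    rng.shuffle(pool)                      # random word-to-cell allocation
    per_cell = len(pool) // len(cells)
    trials = [
        {
            "word": word,
            "level": cells[i // per_cell][0],
            "response": cells[i // per_cell][1],
        }
        for i, word in enumerate(pool)
    ]
    rng.shuffle(trials)                    # random presentation order
    return trials


# Placeholder items; the real stimuli were common nouns.
block = build_trial_block([f"word{i:02d}" for i in range(60)])
counts = Counter((t["level"], t["response"]) for t in block)
```

With 60 placeholder words, `counts` holds six cells of ten trials each, so yes and no responses are balanced within every level.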

Experiment 1
Method. In the first experiment, single subjects were given the perceptual-reaction time test; this encoding phase was followed by a recognition test. Five types of question were used. First, "Is there a word present?" Second, "Is the word in capital letters?" Third, "Does the word rhyme with ________?" Fourth, "Is the word in the category ________?" Fifth, "Would the word fit in the sentence ________?" When the first type of question was asked ("Is there a word present?"), on half of the trials a word was present and on half of the trials no word was present on the tachistoscope card; thus, the subject could respond yes when he detected any wordlike pattern on the card. (This task may be rather different from the others and was not used in further experiments; also, of course, it yields difficulties of analysis: since no word is presented on the negative trials, these trials cannot be included in the measurement of retention.)
The stimuli used were common two-syllable nouns of 5, 6, or 7 letters. Forty trials were given; 4 words represented each of the 10 conditions (5 levels × yes-no). The same pool of 40 words was used for all 20 subjects, but each word was rotated through the 10 conditions so that, for different subjects, a word was presented as a rhyme-yes stimulus, a category-no stimulus and so on. This procedure yielded 10 combinations of questions and words; 2 subjects received each combination. On each trial, the question was read to the subject who was already looking in the tachistoscope. After 2 sec, the word was exposed and the subject responded by saying yes or no; his vocal response activated a voice key which stopped a millisecond timer. The experimenter recorded the response latency, changed the word in the tachistoscope, and read the next question; trials thus occurred approximately every 10 sec.
After a brief rest, the subject was given a sheet with the 40 original words plus 40 similar distractors typed on it. Any one subject had actually only seen 36 words as no word was presented on negative "Word present?" trials. He was asked to check all words he had seen on the tachistoscope. No time limit was imposed for this task. Two different randomizations of the 80 recognition words were typed; one randomization was given to each member of the pair of subjects who received identical study lists. Thus each subject received a unique presentation-recognition combination. The 20 subjects were college students of both sexes paid for their services.
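The rotation of words through conditions described in this Method section can be sketched as a cyclic (Latin-square style) assignment. The sketch below is one plausible reading, not the authors' documented scheme: the article says only that each word was rotated through the 10 conditions, yielding 10 question-word combinations with 2 subjects each. The function name `rotated_lists`, the cyclic shift rule, and the placeholder word names are assumptions.

```python
# Five question levels from Experiment 1, crossed with yes/no responses.
LEVELS = ("word present", "capitals", "rhyme", "category", "sentence")
CONDITIONS = [(level, resp) for level in LEVELS for resp in ("yes", "no")]


def rotated_lists(words, conditions=CONDITIONS):
    """Return one study list per rotation (hypothetical helper).

    In list k, word i is assigned condition (i + k) mod n, so every word
    meets every condition exactly once across the n lists.
    """
    n = len(conditions)
    return [
        {word: conditions[(i + k) % n] for i, word in enumerate(words)}
        for k in range(n)
    ]


# Placeholder items standing in for the 40 two-syllable nouns.
words = [f"word{i:02d}" for i in range(40)]
lists = rotated_lists(words)

# Two subjects per list gives the 20 subjects described in the text.
# Note: the 4 words in the ("word present", "no") cell of a list were not
# actually shown, which is why each subject saw only 36 words.
subject_to_list = {subject: subject // 2 for subject in range(20)}
```

Under this scheme every list contains 4 words per condition, and across the 10 lists each word appears once in each condition, so item differences are balanced over conditions.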
Results and discussion. The results are shown in Table 2. The upper portion shows response latencies for the different questions. Only correct answers were included in the analysis. The median latency was calculated for each subject; Table 2 shows mean medians. Although the five question levels were selected intuitively, the table shows that in fact response latency rises systematically as the questions necessitated deeper processing. Apart from the sentence level, yes and no responses took equivalent times. The median latency scores were subjected to an analysis of variance (after log transformation). The analysis showed a significant effect of level, F(4, 171) = 35.4, p < .001, but no effect of response type (yes-no) and no interaction. Thus, intuitively deeper questions (semantic as opposed to structural decisions about the word) required slightly longer processing times (150-200 msec).

TABLE 2
INITIAL DECISION LATENCY AND RECOGNITION PERFORMANCE FOR WORDS AS A FUNCTION OF INITIAL TASK (EXPERIMENT 1)

Table 2 also shows the recognition results. Performance (the hit rate) increased substantially from below 20% recognized for questions concerning structural characteristics, to 96% correct for sentence-yes decisions. The other prominent feature of the recognition results is that the yes responses to words in the initial perceptual phase were accompanied by higher subsequent recognition than the no responses. Further, the superiority of recognition of yes words increased with depth (until the trend was apparently halted by a ceiling effect). These observations were confirmed by analysis of variance on recognition proportions (after arc sine transformation). Since the first level (word present?) had only yes responses, words from this level were not included in the analysis. Type of question was a significant factor, F(3, 133) = 52.8, p < .001, as was response type (yes-no), F(1, 133) = 40.2, p < .001. The Question × Response Type interaction was also significant, F(1, 133) = 6.77, p < .001.
The results have thus shown that different encoding questions led to different response latencies; questions about the surface form of the word were answered comparatively rapidly, while more abstract questions about the word's meaning took longer to answer. If processing time is an index of depth, then words presented after a semantic question were indeed processed more deeply. Further, the different encoding questions were associated with marked differences in recognition performance: Semantic questions were followed by higher recognition of the word. In fact, Table 2 shows that initial response latency is systematically related to subsequent recognition. Thus, within the limits of the present assumptions, it may be concluded that deeper processing yields superior retention.
It is of course possible to argue that the higher recognition levels are more simply attributable to longer study times. This point will be dealt with later in the paper, but for the present it may be noted that in these terms, 200 msec of extra study time led to a 400% improvement in retention. It seems more reasonable to attribute the enhanced performance to qualitative differences in processing and to conclude that manipulation of levels of processing at the time of input is an extremely powerful determinant of retention of word events. The reason for the superior recognition of yes responses is not immediately apparent; it cannot be greater depth of processing in the simple sense, since yes and no responses took the same time for each encoding question. Further discussion of this point is deferred until more experiments are described. Experiment 2 is basically a replication of Experiment 1 but with a somewhat tidier design and with more recognition distractors to remove ceiling effects.

Experiment 2
Method. Only three levels of encoding were used in this study; questions concerning type-

script (uppercase or lowercase), rhyme questions, and sentence questions (in which subjects were given a sentence frame with one word missing). During the initial perceptual phase 60 questions were presented: 10 yes and 10 no questions at each of the three levels. Question type was randomized within the block of 60 trials. The question was presented auditorily to the subject; 2 sec later the word appeared in the tachistoscope for 200 msec. The subject responded as rapidly as possible by pressing one of two response keys. After completing the 60 initial trials, the subject was given a typed list of 180 words comprising the 60 original words plus 120 distractors. He was told to check all words he had seen in the first phase.
All words used were five-letter common concrete nouns. From the pool of 60 words, two question formats were constructed by randomly allocating each word to a question type until all 10 words for each question type were filled. In addition, two orders of question presentation and two random orderings of the 180-word recognition list were used. Three subjects were tested on each of the eight combinations thus generated. The 24 subjects were students of both sexes paid for their services and tested individually.

FIGURE 1. Initial decision latency and recognition performance for words as a function of the initial task (Experiment 2).

Results and discussion. The left-hand panel of Figure 1 shows that response latency rose systematically for both response types, from case questions to rhyme questions to sentence questions. These data again are interpreted as showing that deeper processing took longer to accomplish. At each level, positive and negative responses took the same time. An analysis of variance on mean medians yielded an effect of question type, F(2, 46) = 46.5, p < .001, but yielded no effect of response type and no interaction.
Figure 1 also shows the recognition results. For yes words, performance increased from 15% for case decisions to 81% for sentence decisions, more than a fivefold increase in hit rate for memory performance for the same subjects in the same experiment. Recognition of no words also increased, but less sharply, from 19% (case) to 49% (sentence). An analysis of variance showed a question type (level of processing) effect, F(2, 46) = 118, p < .001, a response type (yes-no) effect, F(1, 23) = 47.9, p < .001, and a Question Type × Response Type interaction, F(2, 46) = 22.5, p < .001.
Experiment 2 thus replicated the results of Experiment 1 and showed clearly that (a) different encoding questions are associated with different response latencies (this finding is interpreted to mean that semantic questions induce a deeper level of analysis of the presented word), (b) positive and negative responses are equally fast, (c)

recognition increases to the extent that the encoding question deals with more abstract, semantic features of the word, and (d) words given a positive response are associated with higher recognition performance, but only after rhyme and category questions.
The data from Figure 1 are replotted in Figure 2, in which recognition performance is shown as a function of initial categorization time. Both yes and no functions are strikingly linear, with a steeper slope for yes responses. This pattern of data suggests that memory performance may simply be a function of processing time as such (regardless of "level of analysis"). This suggestion is examined (and rejected) in this article, where we argue that level of analysis, not processing time, is the critical determinant of recognition performance.

FIGURE 2. Proportion of words recognized as a function of initial decision time (Experiment 2).

Experiments 3 and 4 extended the generality of these findings by showing that the same pattern of results holds in recall and under intentional learning conditions.

Experiment 3
Method. Three levels of encoding were again included in the study by asking questions about typescript (case), rhyme, and sentences. On each trial the question was read to the subject: after 2 sec the word was exposed for 200 msec on the tachistoscope. The subject responded by pressing the relevant response key. At the end of the encoding trials, the subject was allowed to rest for 1 min and was then asked to recall as many words as he could. In Experiment 3, this final recall task was unexpected (thus the initial encoding phase may be considered an incidental learning task), while in Experiment 4 subjects were informed at the beginning of the session that they would be required to recall the words.
Pilot studies had shown that the recall level in this situation tends to be low. Thus, to boost recall, and to examine the effects of encoding level on recall more clearly, half of the words in the present study were presented twice. In all, 48 different words were used, but 24 were presented twice, making a total of 72 trials. Of the 24 words presented once only, 4 were presented under each of the six conditions (three types of question × yes-no). Similarly, of the 24 words presented twice, 4 were presented under each of the six conditions. When a word was repeated, it always occurred as the 20th item after its first presentation: that is, the lag between first and second presentations was held constant. On its second appearance, the same type of question was asked as on the word's first appearance but, for rhyme and sentence questions, a different specific question was asked. Thus, when the word TRAIN fell into the rhyme-yes category, the question asked on its first presentation might have been "Does the word rhyme with BRAIN?" while on the second presentation the question might have been "Does the word rhyme with CRANE?" For case questions the same question was asked on the two occurrences since each subject was given the same question throughout the experiment (e.g., "Is the word in lowercase?"). This procedure was adopted as early work had shown that subjects' response latencies were greatly slowed if they had to associate yes responses to both uppercase and lowercase words.
A constant pool of 48 words was used for all subjects. The words were common concrete nouns. Five presentation formats were constructed in which the words were randomly allocated to the various encoding conditions. Four subjects were tested on each format: Two made yes responses with their right hand on the right response key while two used the left-hand key for yes responses. The 20 student subjects were paid for their services. They were told that the experiment concerned perception and reaction time; they were warned that some words would occur twice, but they were not informed of the final recall test.
Results and discussion. Response latencies are shown in Table 3. For each subject and each experimental condition (e.g., case-yes) the median response latency was calculated for the eight words presented on their first occurrence (i.e., the four words presented only once, and the first occurrence of the four repeated words). The median

latency was also calculated for the four repeated words on their second presentation. Only
correct responses were included in the calculation of the medians. Table 3 shows the mean
medians for the various experimental conditions. There was a systematic increase in response
latency from case questions to sentence questions. Also, response latencies were more rapid
on the word's second presentation; this was especially true for yes responses. These
observations were confirmed by an analysis of variance. The effect of question type was
significant, F(2, 38) = 14.4, p < .01, but the effect of response type was not (F < 1.0).
Repeated words were responded to reliably faster, F(1, 19) = 10.3, p < .01, and the Number
of Presentations × Response Type (yes-no) interaction was significant, F(1, 19) = 5.33,
p < .05.

TABLE 3
RESPONSE LATENCIES FOR EXPERIMENTS 3 AND 4
Note. Mean medians of response latencies are presented.

Thus, again, deeper level questions took longer to process, but yes responses took no longer
than no responses. The extra facilitation shown by positive responses on the second
presentation may be attributable to the greater predictive value of yes questions. For
example, the second presentation of a rhyme question may remind the subject of the first
presentation and thus facilitate the decision.

Figure 3 shows the recall probabilities for words presented once or twice. There is a marked
effect of question type (sentence > rhymes > case); retention is again superior for words
given an initial yes response, and recall of twice-presented words is higher than that of
once-presented words. An analysis of variance confirmed these observations. Semantic
questions yielded higher recall, F(2, 38) = 36.9, p < .01; more yes responses than no
responses were recalled, F(1, 19) = 21.4, p < .01; two presentations increased performance,
F(1, 19) = 33.0, p < .01. In addition, semantically encoded words benefited more from the
second presentation, as shown by the significant Question Level × Number of Presentations
interaction, F(2, 38) = 10.8, p < .01.

Experiment 3 thus confirmed that deeper levels of encoding take longer to accomplish and
that yes and no responses take equal encoding times. More important, semantic questions led
to higher recall performance and more yes response words were recalled than no response
words. These basic results thus apply as well to recall as they do to recognition.
Experiments 1-3 have used an incidental learning paradigm; there are good reasons to believe
that the incidental nature of the task is not critical for the obtained pattern of results
to appear (Hyde & Jenkins, 1973). Nevertheless, it was decided to verify Hyde and Jenkins'
conclusion using the present paradigm. Thus, Experiment 4 was a replication of Experiment 3,
but with the difference that subjects were informed of the final recall task at the
beginning of the session.

Experiment 4

Method. The material and procedures were identical to those in Experiment 3 except that
subjects were informed of the final free recall task. They were told that the memory task
was of equal importance to the initial phase and that they should thus attempt to remember
all words shown in the tachistoscope. A 10-min period was allowed for recall. The subjects
were 20 college

students, none of whom had participated in Experiment 1, 2, or 3.

FIGURE 3. Proportion of words recalled as a function of the initial task (Experiment 3).

Results and discussion. The response latencies are shown in Table 3. These data are very
similar to those from Experiment 3, indicating that subjects took no longer to respond under
intentional learning instructions. Analysis of variance showed that deeper levels were
associated with longer decision latencies, F(2, 38) = 27.7, p < .01, and that second
presentations were responded to faster, F(1, 19) = 18.9, p < .01. No other effect was
statistically reliable.

With regard to the recall results, the analysis of variance yielded significant effects of
processing level, F(2, 38) = 43.4, p < .01, of repetition, F(1, 19) = 69.7, p < .01, and of
response type (yes-no), F(1, 19) = 13.9, p < .01. In addition, the Number of Presentations ×
Level of Processing interaction, F(2, 38) = 12.4, p < .01, and the Number of Presentations ×
Response Type (yes-no) interaction, F(1, 19) = 7.93, p < .025, were statistically reliable.
Figure 4 shows that these effects were attributable to superior recall of sentence
decisions, twice-presented words and yes responses. Words associated with semantic questions
and with yes responses showed the greatest enhancement of recall after a second
presentation.

To further explore the effects of intentional versus incidental conditions, more
comprehensive analyses of variance were carried out, involving the data from both
Experiments 3 and 4. For the latency data, there was no significant effect of the
intentional-incidental manipulation, nor did the intentional-incidental factor interact with
any other factor. Thus, knowledge of the final recall test had no effect on subjects'
decision times. In the case of recall scores, intentional instructions yielded superior
performance, F(1, 38) = 11.73, p < .01, and the Intentional-Incidental × Number of
Presentations interaction was significant, F(1, 38) = 5.75, p < .05. This latter effect
shows that the superiority of intentional instructions was greater for twice-presented
items. No other interaction involving the incidental-intentional factor was significant. It
may thus be concluded that the pattern of results obtained in the present

experiments does not depend critically on incidental instructions.

FIGURE 4. Proportion of words recalled as a function of the initial task (Experiment 4).

The finding that intentional recall was superior to incidental recall, but that decision
times did not differ between intentional and incidental conditions, is at first sight
contrary to the theoretical notions proposed in the introduction to this article. If recall
is a function of depth of processing and depth is indexed by decision time, then clearly
differences in recall should be associated with differences in initial response latency.
However, it is possible that further processing was carried out in the intentional
condition, after the orienting task question was answered, and was thus not reflected in the
decision times.

Discussion of Experiments 1-4

Experiments 1-4 have provided empirical flesh for the theoretical bones of the argument
advanced by Craik and Lockhart (1972). When semantic (deeper level) questions were asked
about a presented word, its subsequent retention was greatly enhanced. This result held for
both recognition and recall; it also held for both incidental and intentional learning (Hyde
& Jenkins, 1969, 1973; Till & Jenkins, 1973). The reported effects were both robust and
large in magnitude: Sentence-yes words showed recognition and recall levels which were
superior to Case-no words by a factor ranging from 2.4 to 13.6. Plainly, the nature of the
encoding operation is an important determinant of both incidental and intentional learning
and hence of retention.

At the same time, some aspects of the present results are clearly inconsistent with the
depth of processing formulation outlined in the introduction. First, words given a yes
response in the initial task were better recalled and recognized than words given a no
response, although reaction times to yes and no responses were identical. Either reaction
time is not an adequate index of depth, or depth is not a good predictor of subsequent
retention. We will argue the former case. If depth of processing (defined loosely as
increasing semantic-associative analysis of the stimulus) is decoupled from processing time,
then on the one hand the independent index of depth has been lost, but on the other hand,
the results of Experiments 1-4 can be described in terms of qualitative differences in
encoding operations rather than simply in terms of increased processing times. The following
section describes evidence relevant to the question of whether retention performance is
primarily a function of "study time" or the qualitative nature of mental operations carried
out during that time.

The results obtained under intentional learning conditions (Experiment 4) are also not well
accommodated by the initial depth of processing notions. If the large differences in
retention found in Experiments 1-3 are attributable to different depths of processing in the
rather literal sense that only structural analyses are activated by the case judgment task,
phonemic analyses are activated by rhyme judgments, and semantic analyses activated by
category or sentence judgments, then surely under intentional learning conditions the
subject would analyse and perceive the name and meaning of the target word with all three
types of question. In this case equal retention should ensue (by the Craik and Lockhart
formulation), but Experiment 4 showed that large differences in recall were still found.

A more promising notion is that retention differences should be attributed to degrees of
stimulus elaboration rather than to differences in depth. This revised formulation retains
the important point (borne out by Experiments 1-4) that the qualitative nature of encoding
operations is critical for the establishment of a durable trace, but gets away from the
notions that semantic analyses necessarily always follow structural analyses and that no
meaning is involved in shallow processing tasks.

Discussion of the best descriptive framework for these studies will be resumed after further
experiments are reported; for the moment, the term depth is retained to signify greater
degrees of semantic involvement. Before further discussions of the theoretical framework are
presented, the following section describes attempts to evaluate the relative effects of
processing time and the qualitative nature of encoding operations on the retention of words.

PROCESSING TIME VERSUS ENCODING OPERATIONS

As a first step, the data from Experiment 2 were examined for evidence relating the effects
of processing time to subsequent memory performance. At first sight, Experiment 2 provided
evidence in line with the notion that longer categorization times are associated with higher
retention levels: Figure 2 demonstrated linear relationships between initial decision
latency and subsequent recognition performance. However, if it is processing time which
determines performance, and not the qualitative nature of the task, then within one task,
longer processing times should be associated with superior memory performance. That is, with
the qualitative differences in processing held constant, performance should be determined by
the time taken to make the initial decision. On the other hand, if differences in encoding
operations are critical for differences in retention, then memory performance should vary
between orienting tasks, but within any given task, retention level should not depend on
processing time.

This point was explored by analyzing the data from Experiment 2 in terms of fast and slow
categorization times. The 10 response latencies for each subject in each condition were
divided into the 5 fastest responses and the 5 slowest responses. Next, mean recognition
probabilities for the fast and slow subsets of words were calculated across all subjects for
each condition. The results of this analysis are shown in Figure 5; mean medians for the
response latencies in each subset are plotted against recognition probabilities. If
processing time were crucial, then the words which fell into the slow subset for each task
should have been recognized at higher levels than words which elicited fast responses.
Figure 5 shows that this did not happen. Slow responses were recognized little better than
fast responses within each level of analysis. On the other hand, the qualitative nature of
the task continued to exert a very large effect on recognition performance, suggesting again
that it is the nature of the encoding operations and not processing time which determines
memory performance.

FIGURE 5. Recognition of words as a function of task and initial decision time: Data
partitioned into fast and slow decision times (Experiment 2).

For both yes and no responses, slow case categorization decisions took longer than fast
sentence decisions. However, words about which subjects had made sentence decisions showed
higher levels of recognition: 73% as opposed to 17% for yes responses and 45% as opposed to
17% for no responses. No statistical analysis was thought necessary to support the
conclusion that task rather than time is the crucial aspect in these experiments. Since the
point is an important one, however, a further experiment was conducted to clinch the issue.
Subjects were given either a complex structural task or a simple semantic task to perform;
it was predicted that the complex structural task would take longer to accomplish but that
the semantic task would yield superior memory performance.

Experiment 5

Method. The purpose of Experiment 5 was to devise a shallow nonsemantic task which was
difficult to perform and would thus take longer than an easy but deeper semantic task. In
this way, further evidence on the relative contributions of processing time and processing
depth to memory performance could be obtained. In both tasks, a five-letter word was shown
in the tachistoscope for 200 msec and the subject made a yes-no decision about the word. The
nonsemantic decision concerned the pattern of vowels and consonants which made up the word.
Where V = vowel and C = consonant, the word brain could be characterized as CCVVC, the word
uncle as VCCCV, and so on. Before each nonsemantic trial the subject was shown a card with a
particular consonant-vowel pattern typed on it; after studying the card as long as
necessary, the subject looked into the tachistoscope and the word was exposed. The
experiment was again described as a perceptual, reaction time study concerning different
aspects of words and the subject was instructed to respond as rapidly as possible by
pressing one of two response keys. The semantic task was the sentence task from previous
studies in the series. In this case, the subject was shown a card with a short sentence
typed on it; the sentence had one missing word, thus the subject's task was to decide
whether the word on the tachistoscope screen would fit the sentence. Examples of
sentence-yes trials are: "The man threw the ball to the ____" (CHILD) and "Near her bed she
kept a ____" (CLOCK). On sentence-no trials an inappropriate noun from the general pool was
exposed on the tachistoscope. Again the subject responded as rapidly as possible. The
subjects were not informed of the subsequent memory test.

The pool of words used consisted of 120 high frequency, concrete five-letter nouns. Each
subject received 40 words on the initial decision phase of the task and was then shown all
120 words, 40 targets and 80 distractors mixed randomly, in the second phase. He was then
asked to recognize the 40 words he had been shown on the tachistoscope by circling exactly
40 words. Two forms of the recognition test were typed with the same 120 words randomized
differently. In all, 24 subjects were tested in the experiment. The pool of 120 words was
arbitrarily partitioned into three blocks of 40 words; the first 8 subjects received one
block of 40 as targets and

the remaining 80 words served as distractors; the second 8 subjects received the second
block of 40 words as targets and the third 8 subjects received the third block of 40; in all
cases the remaining 80 words formed the distractor pool. Within each group of 8 subjects who
received the same 40 target words, 4 received one form of the recognition test and 4
received the other form. Finally, within each group of 4 subjects, each word was rotated so
that it appeared (for different subjects) in all four conditions: nonsemantic yes and no and
semantic yes and no.

TABLE 4
DECISION LATENCY AND RECOGNITION PERFORMANCE FOR WORDS AS A FUNCTION OF THE INITIAL TASK
(EXPERIMENT 5)
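
The consonant-vowel coding and the rotation of words through the four conditions described in the Method of Experiment 5 can be illustrated with a minimal sketch. It is not the authors' procedure. The function names `cv_pattern` and `assign_conditions`, the treatment of "y" as a consonant, the placeholder word list, and the modular rotation rule are all assumptions made for illustration; the article states only that each word appeared in all four conditions across the subjects in a group of four.

```python
from collections import Counter

VOWELS = set("aeiou")  # assumption: "y" is treated as a consonant

CONDITIONS = (
    "nonsemantic-yes",
    "nonsemantic-no",
    "semantic-yes",
    "semantic-no",
)


def cv_pattern(word):
    """Return the consonant-vowel pattern of a word, e.g. 'brain' -> 'CCVVC'."""
    return "".join("V" if ch in VOWELS else "C" for ch in word.lower())


def assign_conditions(words, subject_index):
    """Map each word to a condition for one subject in a group of four.

    Hypothetical rotation: word i gets condition (i + subject_index) mod 4,
    so across subjects 0..3 every word appears once in every condition.
    """
    n = len(CONDITIONS)
    return {w: CONDITIONS[(i + subject_index) % n] for i, w in enumerate(words)}


# The two examples given in the Method section.
print(cv_pattern("brain"))  # CCVVC
print(cv_pattern("uncle"))  # VCCCV

# Placeholder targets standing in for one block of 40 five-letter nouns.
block = [f"word{i:02d}" for i in range(40)]
per_condition = Counter(assign_conditions(block, 0).values())
print(per_condition)  # 10 words under each of the four conditions
```

With 40 target words and four conditions, this rule gives each subject 10 words per condition, which matches the 10 trials per condition reported for the experiment.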
Each subject was tested individually. After the two tasks had been explained, he was given a
few practice trials, then received 40 further trials, 10 under each experimental condition.
The order of presentation of conditions was randomized. After a brief rest period the
subject was given the recognition list and told to circle exactly 40 words (those he had
just seen on the tachistoscope), guessing if necessary. The subjects were 24 undergraduate
students of both sexes, paid for their services.

Results. The results of the experiment are straightforward. Table 4 shows that the
nonsemantic task took longer to accomplish but that the deeper sentence task gave rise to
higher levels of recognition. Decisions about consonant-vowel structure of words were
substantially slower than sentence decisions (1.7 sec as opposed to .85 sec) and this
difference was significant statistically, F(1, 23) = 11.3, p < .01. Neither the response
type (yes-no) nor the interaction was significant. For recognition, the analysis of variance
showed that sentence decisions gave rise to higher recognition, F(1, 23) = 40.9, p < .001;
yes responses were recognized better than no responses, F(1, 23) = 10.6, p < .01, but the
Task × Response Type interaction was not significant.

Experiment 5 has thus confirmed the conclusion from the reanalysis of Experiment 2: that it
is the qualitative nature of the task (we argue, depth of processing) and not the amount of
processing time which determines memory performance. Figure 2 illustrates that a deep
semantic task takes longer to accomplish and yields superior memory performance, but when
the two factors are separated it is the task which is crucial, not processing time as such.

One constant feature of Experiments 1-4 has been the superior recall or recognition of words
given a yes response in the initial perceptual phase. This result has also been reported by
Schulman (1974). The reasons for the better retention of yes responses are not immediately
apparent; for example, it is not obvious that positive responses require deeper processing
before the initial perceptual decision can be made. This problem invites a closer
investigation of the yes-no difference and may perhaps force a further reevaluation of the
concept of depth.

POSITIVE AND NEGATIVE CATEGORIZATION DECISIONS

Why are words to which positive responses are made in the perceptual-decision task better
remembered? As discussed previously, it does not seem intuitively reasonable that words
associated with yes responses require deeper processing before the decision is made.
However, if high levels of retention are associated with "rich" or "elaborate" encodings of
the word (rather than deep encodings), the differences in retention between positive and
negative words become understandable. In cases where a positive response is made, the
encoding question and the target word can form a coherent, integrated unit. This integration
would be especially likely with semantic questions: for example, "A four-footed animal?"
(BEAR) or "The boy met a ____ on the street" (FRIEND). However, integration of the question
and target word would be much less likely in the negative case: "A four-footed animal?"

(CLOUD) or "The boy met a ____ on the street" (SPEECH). Greater degrees of integration (or,
alternatively, greater degrees of elaboration of the target word) may support higher
retention in the subsequent test. This factor of integration or congruity (Schulman, 1974)
between target word and question would also apply to rhyme questions but not to questions
about typescript: If the target word is in capital letters (a yes decision), the word's
encoding would be elaborated no more than if the word had been presented in lowercase type
(a no decision). This analysis is based on the premise that effective elaboration of an
encoding requires further descriptive attributes which (a) are salient, or applicable to the
event, and (b) specify the event more uniquely. While positive semantic and rhyme decisions
fit this description, negative semantic and rhyme decisions and both types of case decision
do not. In line with this analysis is the finding from Experiments 1-4 that while positive
decisions are associated with higher retention levels for semantic and rhyme questions,
words eliciting positive and negative decisions are equally well retained after typescript
judgments.

If the preceding argument is valid, then questions leading to equivalent elaboration for
positive and negative decisions should be followed by equivalent levels of retention.
Questions which appear to meet the case are those of the type "Is the object bigger than a
chair?" In this case both positive target words (HOUSE, TRUCK) and negative target words
(MOUSE, PIN) should be encoded with equivalent degrees of elaboration; thus, they should be
equally well remembered. This proposition was tested in Experiment 6.

Experiment 6

Method. Eight descriptive dimensions were used in the study: size, length, width, height,
weight, temperature, sharpness, and value. For each of these dimensions, a set of eight
concrete nouns was generated, such that the dimension was a salient descriptive feature for
the words in each set (e.g., size-ELEPHANT, MOUSE; value-DIAMOND, CRUMB). The words were
chosen to span the complete range of the relevant dimension (e.g., from very small to very
large; very hot to very cold). For each set an additional reference object was chosen such
that half of the objects represented by the word set were "greater than" the reference
object and half of the objects were "less than" the referent. The reference object was
always used in the question pertaining to that dimension; examples were "Taller than a man?"
(STEEPLE-yes; CHILD-no), "More valuable than $10?" (JEWEL-yes; BUTTON-no), and "Sharper than
a fork?" (NEEDLE-yes; CLUB-no). For half of the subjects, the question was reversed in
sense, so that words given a yes response by one group of subjects were given a no response
by the other group. Thus, "Taller than a man?" became "Shorter than a man?" (STEEPLE-no;
CHILD-yes).

Each subject was asked questions relating to two dimensions; he thus answered 16 questions,
4 yielding positive responses and 4 yielding negative responses for each dimension. Four
different versions of the questions and targets were constructed, with two different
dimensions being used in each version. Four subjects received each version: two received the
original questions (e.g., "heavier than . . ." "hotter than . . .") and two received the
questions reversed ("lighter than . . ." "colder than . . ."). Thus each subject received 16
questions; both question type and response type (yes-no) were randomized. Subjects were 16
undergraduate students of both sexes; they were paid for their services.

On each trial, the subject looked into a tachistoscope; the question was presented
auditorily, and 2 sec later the target word was exposed for 1 sec. The subject responded by
pressing the appropriate one of two keys. Subjects were again told that they had to make
rapid judgments about words; they were not informed of the retention test. After completing
the 16 question trials, subjects were asked to recall the target words. Each subject was
reminded of the questions he had been asked. Thus, in this study, memory was assessed in the
presence of the original questions.

Results. Again, the results are much easier to describe than the procedure. Words given yes
responses were recalled with a probability of .36, while words given no responses were
recalled with a probability of .39. These proportions did not differ significantly when
tested by the Wilcoxon test. Thus, when positive and negative decisions are equally well
encoded, the respective sets of target words are equally well recalled. The results of this
demonstration study suggest that it is not the type of response given to the presented word
that is responsible for differences in subsequent recall and recognition, but rather the
richness or elaborateness of the encoding. It is possible that negative decisions in
Experiments 1-4 were associated with rather poor encodings of the presented words: they did
not fit the encoding question and thus did not form an integrated unit with the question. On
the other hand, positive responses would be integrated with the question, and thus,
arguably, formed more elaborate encodings which supported better retention performance.

Experiment 7 was an attempt to manipulate encoding elaboration more directly. Only semantic
information was involved in this study. All encoding questions were sentences with a missing
word; on half of the trials the word fitted the sentence (thus all queries were congruous in
Schulman's terms). The degree of encoding elaboration was varied by presenting three levels
of sentence complexity, ranging from very simple, spare sentence frames (e.g., "He dropped
the ____") to complex, elaborate frames (e.g., "The old man hobbled across the room and
picked up the valuable ____ from the mahogany table"). The word presented was WATCH in both
cases. Although the second sentence is no more predictive of the word, it should yield a
more elaborate encoding and thus superior memory performance.

Experiment 7

Method. Three levels of sentence complexity were used: simple, medium, and complex. Each
subject received 20 sentence frames at each level of complexity; within each set of 20 there
were 10 yes responses and 10 no responses. The 60 encoding trials were randomized with
respect to level of complexity and response type. A constant pool of 60 words was used in
the experiment, but two completely different sets of encoding questions were constructed.
Words were randomly allocated to sentence level and response type in the two sets (with the
obvious constraint that yes and no words clearly fitted or did not fit the sentence frame,
respectively). Within each set of sentence frames, two different random presentation orders
were constructed. Five subjects were presented with each format thus generated and 20
subjects were tested in all. The words used were common nouns.

Examples of sentence frames used are: simple, "She cooked the ____" "The ____ is torn";
medium, "The ____ frightened the children" and "The ripe ____ tasted delicious"; complex,
"The great bird swooped down and carried off the struggling ____" and "The small lady
angrily picked up the red ____." The sentence frames were written on cards and given to the
subject. After studying it he looked into the tachistoscope with one hand on each response
key. After a ready signal the word was presented for 1.0 sec and the subject responded yes
or no by pressing the appropriate key. The words were exposed for a longer time in this
study since the questions were more complex. Subjects were again told that the experiment
was concerned with perception and speed of reaction and that they should thus respond as
rapidly as possible. No mention was made of a memory test. The 20 subjects were tested
individually. They were undergraduate students of both sexes, paid for their services.

After completing the 60 encoding trials, subjects were given a short rest and then asked to
recall as many words as they could from the first phase of the experiment. They were given 8
min for free recall. After a further rest, they were given the deck of cards containing the
original sentence frames (in a new random order) and asked to recall the word associated
with each sentence. Thus there were two retention tests in this study: free recall followed
by cued recall.

Results. Figure 6 shows the results. For free recall, there is no effect of sentence
complexity in the case of no responses, but a systematic increase in recall from simple to
complex in the case of yes responses. The provision of the sentence frames as cues did not
enhance the recall of no responses, but had a large positive effect on the recall of yes
responses; the effect of sentence complexity was also amplified in cued recall. These
observations were confirmed by analysis of variance. In free recall, a greater proportion of
words given positive responses were recalled than those given negative responses,
F(1, 19) = 18.6, p < .001; the overall effect of complexity was not significant,
F(2, 38) = 2.37, p > .05, but the interaction between complexity and yes-no was reliable,
F(2, 38) = 3.78, p < .05. A further analysis, involving positive responses only, showed that
greater sentence complexity was reliably associated with higher recall levels,
F(2, 38) = 4.44, p < .025. In cued recall, there were significant effects of response type,
F(1, 19) = 213, p < .001, complexity, F(2, 38) = 49.2, p < .001, and the Complexity ×
Response Type interaction, F(2, 38) = 19.2, p < .001. An overall analysis of variance,
incorporating both free and cued recall, was also carried out and this analysis revealed
significantly higher performance for greater complexity, F(2, 38) = 36.5, p < .001, for
positive target words, F(1, 19) = 139, p < .001, and for cued recall relative to free
recall, F(1, 19) = 100, p < .001. All the interactions were significant at the p < .01 level
or better; the description of these effects is provided by Figure 6.

FIGURE 6. Proportion of words recalled as a function of sentence complexity (Experiment 7).
(CR = cued recall, NCR = noncued recall.)

Experiment 7 has thus demonstrated that more complex, elaborate sentence frames do lead to
higher recall, but only in the case of positive target words. Further, the effects of
complexity and response type are greatly magnified by reproviding the sentence frames as
cues.

These results do not fit the original simple view that memory performance is determined only
by the nominal level of processing. In all conditions of Experiment 7 semantic processing of
the target word was necessary, yet there were still large differences in performance
depending on sentence complexity, the relation between target word and the sentence context,
and the presence or absence of cues. It seems that other factors besides the level of
processing required to make the perceptual decision are important determinants of memory
performance.

The notion of code elaboration provides a more satisfactory basis for describing the
results. If a presented word does not fit the sentence frame, the subject cannot form a
unified image or percept of the complete sentence, the memory trace will not represent an
integrated meaningful pattern, and the word will not be well recalled. In the case of
positive responses, such coherent patterns can be formed and their degree of cognitive
elaborateness will increase with sentence complexity. While increased elaboration by itself
leads to some increase in recall (possibly because richer sentence frames can be more
readily recalled), performance is further enhanced when part of the encoded trace is
reprovided as a cue. It is well established that cuing aids recall, provided that the cue
information has been encoded with the target word at presentation and thus forms part of the
same encoded unit (Tulving & Thomson, 1973). The present results are consistent with the
finding, but may also be interpreted as showing that a cue is effective to the extent that
the cognitive system can encode the cue and the target as a congruous, integrated unit.
Elaborate cues by themselves do not aid performance even if they were presented with the
target word at input, as shown by the poor recall of negative response words. It is also
necessary that the target and the cue form a coherent, integrated pattern.

Schulman (1974) reported results which are essentially identical to the results of
Experiment 7. He found better recall of congruous than incongruous phrases; he also found
that cuing benefited congruously encoded words much more than incongruous words. Schulman
suggests that congruent words can form a relational encoding with their context, and that
the context can then serve as an effective redintegrative cue at recall (Begg, 1972;
Horowitz & Prytulak, 1969). In these terms, Experiment 7 has added the finding that the
semantic richness of the context benefits congruent encodings but has no effect on the
encoding of incongruous words.

Is the concept of depth still useful in describing the present experimental results, or are
the findings better described in terms of the "spread" of encoding where spread refers to
the degrees of encoding elaboration or the number of encoded features? These
questions will be taken up in the general discussion, but in outline, we believe that
depth still gives a useful account of the major qualitative shifts in a word's encoding
(from an analysis of physical features through phonemic features to semantic properties).
Within one encoding domain, however, spread or number of encoded features may be better
descriptions. Before grappling with these theoretical issues, three final short
experiments will be described. The findings from the preceding experiments were so robust
that it becomes of interest to ask under what conditions the effects of differential
encoding disappear. Experiments 8, 9, and 10 were attempts to set boundary limits on the
phenomena.

FURTHER EXPLORATIONS OF DEPTH AND ELABORATION

The three studies described in this section were undertaken to examine further aspects of
depth of processing and to throw more light on the factors underlying good memory
performance. The first experiment explored the idea that the critical difference between
case-encoded and sentence-encoded words might lie in the similarity of encoding operations
within the group of case-encoded words. That is, each case-encoded word is preceded by the
same question, "Is the word in capital letters?", whereas each rhyme-encoded and
sentence-encoded word has its own unique question. At retrieval, it is likely that the
subject uses what he can remember of the encoding question to help him retrieve the target
word. Plausibly, encoding questions which were used for many target words would be less
effective as retrieval cues since they do not uniquely specify one encoded event in
episodic memory. This overloading of retrieval cues would be particularly evident for
case-encoded words. It is possible to extend the argument to rhyme-encoded words also;
although each target word receives a different rhyme question, phonemic differences may
not be so unique or distinctive as semantic differences (Lockhart, Craik, & Jacoby, 1975).
Some empirical support for these ideas may be drawn from two unpublished studies by
Moscovitch and Craik (Note 1). The first study used the same paradigm as the present
series and compared cued with noncued recall, where the cues were the original encoding
questions. It was found that cuing enhanced recall, and that the effect of cuing was
greater with deeper levels of encoding. Thus the encoding questions do help retrieval, and
their beneficial effect is greatest with semantically encoded words. The second study
showed that when several target words shared the same encoding question (e.g., "Rhymes
with train?" BRAIN, CRANE, PLANE; "Animal category?" LION, HORSE, GIRAFFE), the sharing
manipulation had an adverse effect on cued recall. Further, the adverse effect was
greatest for deeper levels of encoding, suggesting that the normal advantage to deeper
levels is associated with the uniqueness of the encoded question-target complex, and that
when this uniqueness is removed, the mnemonic advantage disappears.
These ideas and findings suggest an experiment in which a case-encoded word is made more
unique by being the one word in an encoding series to be encoded in this way. In this
situation the one case word might be remembered as well as a word which, nominally,
received deeper processing. Such an experiment in its extreme form would be expensive to
conduct, in that one word forms the focus of interest. Experiment 8 pursues the idea of
uniqueness in a less extreme form. Three groups of subjects each received 60 encoding
trials; each trial consisted of a case, rhyme, or category question. However, each group
of subjects received a different number of trials of each question type: either 4 case,
16 rhyme, and 40 category trials; 16, 40, and 4 trials; or 40, 4, and 16 trials,
respectively. The prediction was that while the typical pattern of results would be found
when 40 trials of one type were given, subsequent recognition performance would be
enhanced with smaller set sizes; this enhancement would be especially marked for the case
level of encoding.

TABLE 5
DESIGN AND RESULTS OF EXPERIMENT 8

Experiment 8
Method. Three groups of subjects were tested. Group 1 received 4 case questions, 16 rhyme
questions, and 40 category questions. Group 2 received 16, 40, and 4, respectively, while
Group 3 received 40, 4, and 16, respectively. At each level of encoding, half of the
questions were designed to elicit yes responses and half no responses. Thus each group
received 60 trials; question type and response type were randomized. The design is shown
in Table 5.
The subjects were tested individually. Each question was read by the experimenter while
the subject looked in the tachistoscope; the word was exposed for 200 msec and the subject
responded by pressing one of two response keys. The subjects were informed that the test
was a perceptual-reaction time task; the subsequent memory test was not mentioned. After
completing the 60 encoding trials, each subject was given a sheet containing the 60 target
words plus 120 distractors. He was told to check exactly 60 words, namely those words he
had seen on the tachistoscope.
The same pool of 60 common nouns was used as targets throughout the experiment. Within
each experimental group there were four presentation lists; in each case Lists 1 and 2
differed only in the reversal of positive and negative decisions (e.g., category-yes in
List 1 became category-no in List 2). Lists 3 and 4 contained a fresh randomization of the
60 words, but again Lists 3 and 4 differed between themselves only in the reversal of
positive and negative responses. In all, 32 subjects were tested in the experiment; 11
each in Groups 1 and 2, and 10 in Group 3. Two or three subjects were tested under each
randomization condition.
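
The allocation of trials described in the Method can be written out as a small sketch. The trial counts, the yes/no split, and the 60-target, 120-distractor recognition sheet come from the text; the dictionary layout and function names are illustrative assumptions, not part of the original procedure.

```python
# Sketch of the Experiment 8 design. Names and layout are hypothetical;
# only the numbers are taken from the Method section.

QUESTION_TYPES = ("case", "rhyme", "category")

# Number of encoding trials of each question type, per group.
TRIALS_PER_GROUP = {
    1: {"case": 4, "rhyme": 16, "category": 40},
    2: {"case": 16, "rhyme": 40, "category": 4},
    3: {"case": 40, "rhyme": 4, "category": 16},
}

N_TARGETS = 60
N_DISTRACTORS = 120


def yes_no_split(n_trials):
    """Half of the questions at each level elicit yes, half elicit no."""
    return {"yes": n_trials // 2, "no": n_trials // 2}


def set_sizes_by_question():
    """For each question type, the set sizes it receives across the groups."""
    return {
        q: sorted(TRIALS_PER_GROUP[g][q] for g in TRIALS_PER_GROUP)
        for q in QUESTION_TYPES
    }


def chance_proportion():
    """Expected proportion of targets checked if exactly N_TARGETS words
    are checked at random on the recognition sheet."""
    return N_TARGETS / (N_TARGETS + N_DISTRACTORS)


if __name__ == "__main__":
    for group, counts in TRIALS_PER_GROUP.items():
        splits = {q: yes_no_split(n) for q, n in counts.items()}
        print(group, sum(counts.values()), splits)
    print(set_sizes_by_question())
    print(round(chance_proportion(), 2))
```

Laid out this way, it is easy to see that every group receives 60 trials and that each question type appears once at each set size (4, 16, and 40) across the three groups, which is what allows the results to be organized either by group or by set size.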

Results. Table 5 shows the proportion recognized by each group. Each group shows the
typical pattern of results already familiar from Experiments 1-4; there is no evidence of
a perturbation due to set size. Table 5 also shows the recognition results organized by
set size; it may now be seen that set size does exert some effect, most conspicuously on
rhyme-yes responses. However, the differences previously attributed to different levels of
encoding were certainly not eliminated by the manipulation of set size; in general, when
set size was held constant (across groups), strong effects of question type were still
found.
To recapitulate, the argument underlying Experiment 8 was that in the standard experiment,
the encoding operation for case decisions is, in some sense, always the same; for rhyme
decisions, it is somewhat similar from word to word, and is most dissimilar among words in
the category task. If the isolation effect in memory (see Cermak, 1972) is a consequence
of uniqueness of encoding operations, then when similar encodings (e.g., "case decision"
words) are few in number, they should also be encoded uniquely, show the isolation effect,
and thus be well recalled. Table 5 shows that reducing the number of case-encoded words
from 40 to 4 did not enhance their recall, thus lack of isolation cannot account for their
low retention. On the other hand, a reduction in set size did enhance the recall of
rhyme-encoded words, thus isolation effects may play some part in these experiments,
although they cannot account for all aspects
of the results. Finally, it may be of some interest that recall proportions for rhymes–Set
Size 4 are quite similar to category–Set Size 40 (.90 and .70 vs. .88 and .70); this
observation is at least in line with the notion that when rhyme encodings are made more
unique, their recall levels are equivalent to semantic encodings.

TABLE 6
PROPORTION OF WORDS RECOGNIZED FROM TWO REPLICATIONS OF EXPERIMENT 9

Experiment 9: A Classroom Demonstration
Throughout this series of experiments, experimental rigor was strictly observed. Words
were exposed for exactly 200 msec; great care was exercised to ensure that subjects would
not inform future subjects that a memory test formed part of the experiment; subjects were
told that the experiments concerned perception and reaction time; response latencies were
painstakingly recorded in all cases. One of the authors, by nature more skeptical than the
other, had formed a growing suspicion that this rigor reflected superstitious behavior
rather than essential features of the paradigm. This feeling of suspicion was increased by
the finding of the typical pattern of results in Experiment 4, which was conducted under
intentional learning conditions. Accordingly, a simplified version of Experiment 2 was
formulated which violated many of the rules observed in previous studies. Subjects were
informed that the main purpose of the experiment was to study an aspect of memory; thus
the final recognition test was expected and encoding was intentional rather than
incidental. Words were presented serially on a screen at a 6-sec rate; during each 6-sec
interval subjects recorded their response to the encoding question. Indeed, the subjects
were tested in one group of 12 in a classroom situation during a course on learning and
memory; they recorded their own judgments on a question sheet and subsequently attempted
to recognize the target words from a second sheet. Reaction times were not measured.
The point of this study was not to attack experimental rigor, but rather to determine to
what extent the now familiar pattern of results would emerge under these much looser
conditions. If such a pattern does emerge, it will force a further examination of what is
meant by deeper levels of processing and what factors underlie the superior retention of
deeply processed stimuli.
Method. On a projection screen, 60 words were presented, one at a time, for 1 sec each
with a 5-sec interword interval. All subjects saw the same sequence of words, but
different subjects were asked different questions about each word. For example, if the
first word was COPPER, one subject would be asked, "Is the word a metal?", a second, "Is
the word a kind of fruit?", a third, "Does the word rhyme with STOPPER?", and so on. For
each word, six questions were asked (case, rhyme, category × yes-no). During the series of
60 words, each subject received 10 trials of each question-response combination, but in a
different random order. The questions were presented in booklets, 20 questions per page.
Six types of question sheet were made up, each type presented to two subjects. These
sheets balanced the words across question types. The subject studied the question, saw the
word exposed on the screen, then answered the question by checking yes or no on the sheet.
After the 60 encoding trials, subjects received a further sheet containing 180 words
consisting of the original 60 target words plus 120 distractors. The subjects were asked
to check exactly 60 words as "old." Two different randomizations of the recognition list
were constructed; this control variable was crossed with the six types of question sheets.
Thus each of the 12 subjects served in a unique replication of the experiment.
Instructions to subjects emphasized that their main task was to remember the words, and
that a recognition test would be given after the presentation phase. The materials used
are presented in the Appendix.
Results. The top of Table 6 shows that the results of Experiment 9 are quite similar to
those of Experiment 2, despite the fact that in the present study subjects knew of the
recognition test and words were presented at the rate of 6 sec each. The finding that
subjects show exactly the same pattern of results under those very different conditions
attests to the fact that the basic phenomenon under study is a robust one. It parallels
results from Experiment 4 and previous findings of Hyde and Jenkins (1969, 1973). Before
considering the implications of Experiment 9, a replication will be mentioned. This second
experiment was a complete replication with 12 other subjects. The results of the second
study are also shown in Table 6. Overall recognition performance was higher, especially
with case questions, but the pattern is the same.
The results of these two studies are quite surprising. Despite intentional learning
conditions and a slow presentation rate, subjects were quite poor at recognizing words
which had been given shallow encodings. Since subjects in this experiment were asked to
circle exactly 60 words, they could not have used a strict criterion of responding. Thus
their low level of recognition performance in the case task must reflect inadequate
initial registration of the information or rapid loss of registered information. Indeed,
chance performance in this task would be 33%; we have not corrected the data for chance in
any experiment. The question now arises as to why subjects do not encode case words to a
deeper level during the time after their judgment was recorded. It is possible that
recognition of the less well-encoded items is somehow adversely affected by well-encoded
items. It is also possible that subjects do not know how best to prepare for a memory test
and thus do no further processing of each word beyond the particular judgment that is
asked. A third hypothesis, that subjects were poorly motivated and thus simply did not
bother to rehearse case words in a more effective way, is put to test in the final
experiment. Here subjects were paid by results; in one condition the recognition of case
words carried a much higher reward than the recognition of category words.
In any event, Experiment 9 has demonstrated that encoding operations constitute an
important determinant of learning or retention under a wide variety of experimental
conditions. The finding of a strong effect under quite loosely controlled classroom
conditions, without the trappings of timers and tachistoscopes, is difficult to reconcile
with the view that was implicit in the initial experiments of the series: that processing
of an item is somehow stopped at a particular level and that an additional fraction of a
second would have led to better performance. This view is therefore now rejected. It seems
to be the qualitative nature of the encoding achieved that is important for memory,
regardless of how much time the system requires to reach some hypothetical level or depth
of encoding.

Experiment 10
The final experiment to be reported was carried out to determine whether subjects can
achieve high recognition performance with case-encoded words if they are given a stronger
inducement to concentrate on these items. Subjects were paid for each word correctly
recognized; also, they were informed beforehand that a recognition test would be given.
Correct recognition of the three types of word was differentially rewarded under three
different conditions. Subjects knew that case, rhyme, and category words carried either a
1¢, 3¢, or 6¢ reward.
Method. Subjects were tested under the same conditions as subjects in Experiment 9. That
is, 60 words were presented for 1 sec each plus 5 sec for the subject to record his
judgment. Each subject had 20 words under each encoding condition (case, rhyme, category)
with 10 yes and 10 no responses in each condition. As in Experiment 9, each word appeared
in each encoding condition across different subjects. After the initial phase, subjects
were given a recognition sheet of 180 words (60 targets plus 120 distractors) and
instructed to check exactly 60 words. There were three experimental groups. All subjects
were informed that the experiment was a study of word recognition, that they would be paid
according to the number of words they recognized, and therefore that they should attempt
to learn each word. The groups differed in the value associated with each class of word:
Group 1 subjects knew that they would be paid 1¢, 6¢, and 3¢ for case, rhyme, and category
words, respectively; Group 2 subjects were paid 3¢, 1¢, and 6¢, respectively; and Group 3
subjects were paid 6¢, 3¢, and 1¢, respectively. These conditions are summarized in
Table 7. Thus, across groups, each class of words was associated with each reward. There
were 12 undergraduate subjects in each of three groups.
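
The reward schedule in the Method of Experiment 10 amounts to a 3 × 3 counterbalancing scheme, which a short sketch can make explicit. The reward values and the 20 words per encoding condition are taken from the text; the variable and function names are hypothetical, and the maximum-payment figure is a derived illustration rather than something the article reports.

```python
# Sketch of the Experiment 10 reward schedule. Names are hypothetical;
# reward values (in cents) and words per class come from the Method.

WORD_CLASSES = ("case", "rhyme", "category")
WORDS_PER_CLASS = 20  # each subject had 20 words under each encoding condition

REWARDS_CENTS = {
    1: {"case": 1, "rhyme": 6, "category": 3},
    2: {"case": 3, "rhyme": 1, "category": 6},
    3: {"case": 6, "rhyme": 3, "category": 1},
}


def is_counterbalanced(rewards):
    """True if every group uses each reward once and every word class
    is paired with each reward once across groups."""
    values = {1, 3, 6}
    within_groups = all(set(row.values()) == values for row in rewards.values())
    across_groups = all(
        {row[word_class] for row in rewards.values()} == values
        for word_class in WORD_CLASSES
    )
    return within_groups and across_groups


def max_payment_cents(group):
    """Payment if all 60 targets were recognized (illustrative only)."""
    return sum(REWARDS_CENTS[group][c] * WORDS_PER_CLASS for c in WORD_CLASSES)


if __name__ == "__main__":
    print(is_counterbalanced(REWARDS_CENTS))
    print({g: max_payment_cents(g) for g in REWARDS_CENTS})
```

Under these assumptions the total attainable payment is the same in every group, so the groups differ only in which class of word is most valuable, not in how much could be earned overall.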

TABLE 7
PROPORTIONS OF WORDS RECOGNIZED UNDER EACH CONDITION IN EXPERIMENT 10

Results. Table 7 shows that while recognition performance was somewhat higher than the
comparable conditions of Experiment 9 (Table 6), the differential reward manipulation had
no effect whatever. An analysis of variance confirmed the obvious; there were significant
effects due to type of encoding, F(2, 22) = 90.7, p < .01, response type (yes-no),
F(1, 11) = 42.4, p < .01, and the Encoding × Response Type interaction, F(2, 22) = 4.13,
p < .05, but no significant main effect or interactions involving the differential reward
conditions.
Although this experiment yielded a null result, its results are not without interest. Even
when subjects were presumably quite motivated to learn and recognize case-encoded words,
they failed to reach the performance levels associated with rhyme or category words.
Subjects in Group 3 (6-3-1) reported that although they really did attempt to concentrate
on case words, the category words were somehow "simply easier" to recognize in the second
phase of the study.
Thus, Experiments 8, 9, and 10, conducted in an attempt to establish the boundary
conditions for the depth of processing effect, failed to remove the strong superiority
originally found for semantically encoded words. The effect is not due to isolation, in
the simple sense at least (Experiment 8), it does not disappear under intentional learning
conditions and a slow presentation rate (Experiment 9), and it remains when subjects are
rewarded more for recognizing words with shallower encodings (Experiment 10). The problem
now is to develop an adequate theoretical context for these findings and it is to this
task that we now turn.

GENERAL DISCUSSION
The experimental results will first be briefly summarized. Experiments 1-4 showed that
when subjects are asked to make various cognitive judgments about words exposed briefly on
a tachistoscope, subsequent memory performance is strongly determined by the nature of
that judgment. Questions concerning the word's meaning yielded higher memory performance
than questions concerning either the word's sound or the physical characteristics of its
printed form. Further, positive decisions in the initial task were associated with higher
memory performance (for more semantic questions at least) than were negative decisions.
These effects were shown to hold for recognition and recall under incidental and
intentional memorizing conditions. One analysis of Experiment 2 showed that recognition
increased systematically with initial categorization time, but a further analysis
demonstrated that it was the nature of the encoding operations which was crucial for
retention, not the amount of time as such. Experiment 5 confirmed that conclusion.
Experiments 6 and 7 explored possible reasons for the higher retention of words given
positive responses: it was argued that encoding elaboration provided a more satisfactory
description of the results than depth of encoding. Experiment 8 showed that isolation
effects could not by themselves give an account of the results, Experiment 9 demonstrated
that the main findings still occurred under much looser experimental conditions, and
Experiment 10 showed that the pattern of results was unaffected when differential rewards
were offered for remembering words associated with different orienting tasks.
This set of results confirms and extends the findings of other recent investigations,
notably the series of studies by Hyde, Jenkins, and their colleagues (Hyde, 1973; Hyde &
Jenkins, 1969, 1973; Till & Jenkins, 1973; Walsh & Jenkins, 1973) and by Schulman (1973,
1974). It is abundantly clear that what determines the level of recall or recognition of a
word event is not intention to learn, the amount of effort involved, the difficulty of the
orienting task, the amount of time spent making judgments about the items, or even the
amount of rehearsal the items receive (Craik & Watkins, 1973); rather it is the
qualitative nature of the task, the kind of operations carried out on the items, that
determines retention. The problem now is to develop an adequate theoretical formulation
which can take us beyond such vague statements as "meaningful things are well remembered."

Depth of Processing
Craik and Lockhart (1972) suggested that memory performance depends on the depth to which
the stimulus is analyzed. This formulation implies that the stimulus is processed through
a fixed series of analyzers, from structural to semantic; that the system stops processing
the stimulus once the analysis relevant to the task has been carried out, and that
judgment time might serve as an index of the depth reached and thus of the trace's
memorability.
These original notions now seem unsatisfactory in a number of ways. First, the postulated
series of analyzers cannot lie on a continuum since structural analyses do not shade into
semantic analyses. The modified view of "domains" of encoding (Sutherland, 1972) was
suggested by Lockhart, Craik, and Jacoby (1975). The modification postulates that while
some structural analysis must precede semantic analysis, a full structural analysis is not
usually carried out; only those structural analyses necessary to provide evidence for
subsequent domains are performed. Thus, in the case where a stimulus is highly predictable
at the semantic level, only rather minimal structural analysis, sufficient to confirm the
expectation, would be carried out. The original levels of processing viewpoint is also
unsatisfactory in the light of the present empirical findings if it is assumed that yes
and no responses are processed to roughly the same depth before a decision can be made,
since there are no differences in reaction times, yet there are large differences in
retention of the words.
Second, large differences in retention were also found when the complexity of the encoding
context was manipulated. Experiment 7 showed that elaborate sentence frames led to higher
recall levels than did simple sentence frames. This observation suggests that an adequate
theory must not focus only on the nominal stimulus but must also consider the encoded
pattern of "stimulus in context."
Third, and most crucial perhaps, strong encoding effects were found under intentional
learning conditions in Experiments 4 and 9; it is totally implausible that, under such
conditions, the system stops processing the stimulus at some peripheral level. Unless one
assumes complete perversity of subjects, it must be clear that the word is fully perceived
on each trial. Thus, differential depth of encoding does not seem a promising description,
except in very general terms. Finally, as detailed earlier, initial processing time is not
always a good predictor of retention. Many of the ideas suggested in the Craik and
Lockhart (1972) article thus stand in need of considerable modification if that processing
framework is to remain useful.

Degree of Encoding Elaboration
Is spread of encoding a more satisfactory metaphor than depth? The implication of this
second description is that while a verbal stimulus is usually identified as a particular
word, this minimal core encoding can be elaborated by a context of further structural,
phonemic, and semantic encodings. Again, the memory trace can be conceptualized as a
record of the various pattern-recognition and interpretive analyses carried out on the
stimulus and its context; the difference between the depth and spread viewpoints lies only
in the postulated organization of the cognitive structures responsible for pattern
recognition and elaboration, with depth implying that encoding operations are carried out
in a fixed
sequence and spread leading to the more flexible notion that the basic perceptual core of
the event can be elaborated in many different ways. The notion of encoding domains
suggested by Lockhart, Craik, and Jacoby (1975) is in essence a spread theory, since
encoding elaboration depends more on the breadth of analysis carried out within each
domain than on the ordinal position of an analysis in the processing sequence. However,
while spread and elaboration may indeed be better descriptive terms for the results
reported in this paper, it should be borne in mind that retention depends critically on
the qualitative nature of the encoding operations performed: a minimal semantic analysis
is more beneficial for memory than an elaborate structural analysis (Experiment 5).
Whatever the sequence of operations, the present findings are well described by the idea
that memory performance depends on the elaborateness of the final encoding. Retention is
enhanced when the encoding context is more fully descriptive (Experiment 7), although this
beneficial effect is restricted to cases where the target stimulus is compatible with the
context and can thus form an integrated encoded unit with it. Thus the increased
elaboration provided by complex sentence frames in Experiment 7 did not increase recall
performance in the case of negative response words. The same argument can be applied to
the generally superior retention of positive response words in all the present
experiments; for positive responses the encoding question can be integrated with the
target word and a more elaborate unit formed. In certain cases, however, positive
responses do not yield a more elaborately encoded unit: such cases occur when negative
decisions specify the nature of the attributes in question as precisely as positive
decisions. For example, the response no to the question "Is the word in capital letters?"
indicates clearly that the word is in lowercase letters; similarly a no response to the
question "Is the object bigger than a man?" indicates that the object is smaller than a
man. When no responses yield as elaborate an encoding as yes responses, memory performance
levels are equivalent. There is nothing inherently superior about a yes response;
retention depends on the degree of elaboration of the encoded trace.
Several authors (e.g., Bower, 1967; Tulving & Watkins, 1975) have suggested that the
memory trace can be described in terms of its component attributes. This viewpoint is
quite compatible with the notion of encoding elaboration. The position argued in this
section is that the trace may be considered the record of encoding operations carried out
on the input; the function of these operations is to analyze and specify the attributes of
the stimulus. However, it is necessary to add that memory performance cannot be considered
simply a function of the number of encoded attributes; the qualitative nature of these
attributes is critically important. A second equivalent description is in terms of the
"features checked" during encoding. Again, a greater number of features (especially deeper
semantic features) implies a more elaborate trace.
Finally, it seems necessary to bring in the principle of integration or congruity for a
complete description of encoding. That is, memory performance is enhanced to the extent
that the encoding question or context forms an integrated unit with the target word. The
higher retention of positive decision words in Schulman's (1974) study and in the present
experiments can be described in this way. The question immediately arises as to why
integration with the encoding context is so helpful. One possibility is that an encoded
unit is unitized or integrated on the basis of past experience and, just as the target
stimulus fits naturally into a compatible context at encoding, so at retrieval,
re-presentation of part of the encoded unit will lead easily to regeneration of the total
unit. The suggestion is that at encoding the stimulus is interpreted in terms of the
system's structured record of past learning, that is, knowledge of the world or "semantic
memory" (Tulving, 1972); at retrieval, the information provided as a cue again utilizes
the structure of semantic memory to reconstruct the initial encoding. An integrated or
congruous encoding thus yields better memory performance, first, because a more elaborate
trace is laid down and, second, because
richer encoding implies greater compatibility with the structure, rules, and organization of semantic memory. This structure, in turn, is drawn upon to facilitate retrieval processes.

Broader Implications

Finally, the implications of the present experiments and the related work reported by Hyde and Jenkins (1969, 1973), Schulman (1971, 1974) and Kolers (1973a; Kolers & Ostry, 1974) will be briefly discussed. All these studies conform to the new look in memory research in that the stress is on mental operations; items are remembered not as presented stimuli acting on the organism, but as components of mental activity. Subjects remember not what was "out there," but what they did during encoding.

In more traditional memory paradigms, the major theoretical concepts were traces and associations; in both cases their main theoretical property was strength. In turn, the subject's performance in acquisition, retention, transfer, and retrieval was held to be a direct function of the strength of associations and their interrelations. The determinants of strength were also well known: study time, number of repetitions, recency, intentionality of the subject, preexperimental associative strength between items, interference by associations involving identical or similar elements, and so on. In the experiments we have described here, these important determinants of the strength of associations and traces were held constant: nominal identity of items, preexperimental associations among items, intralist similarity, frequency, recency, instructions to "learn" the materials, the amount and duration of interpolated activity. The only thing that was manipulated was the mental activity of the learner; yet, as the results showed, memory performance was dramatically affected by these activities.

This difference between the old paradigm and the new creates many interesting research problems that would not readily have suggested themselves in the former framework. For example, to what extent are the encoding operations performed on an event under the person's volitional strategic control, and to what extent are they determined by factors such as context and set? Why are there such large differences between different encoding operations? In particular, why is it that subjects do not, or cannot, encode case words efficiently when they are given explicit instructions to learn the words? How does the ability of one list item to serve as a retrieval cue for another list item (e.g., in an A-B pair) vary as a function of encoding operations performed on the pair as opposed to the individual items? The important concept of association as such, the bond or relation between the two items, A and B, may assume a different form in the new paradigm. The classical ideas of frequency and recency may be eclipsed by notions referring to mental activity.

There are problems, too, associated with the development of a taxonomy of encoding operations. How should such operations be classified? Do encoding operations really fall into types as implied by the distinction between case, rhyme, and category in the present experiments, or is there some underlying continuity between different operations? This last point reflects the debate within theories of perception on whether analysis of structure and analysis of meaning are qualitatively distinct (Sutherland, 1972) or are better thought of as continuous (Kolers, 1973b).

Finally, the major question generated by the present approach is what are the encoding operations underlying "normal" learning and remembering? The experiments reported in this article show that people do not necessarily learn best when they are merely given "learn" instructions. The present viewpoint suggests that when subjects are instructed to learn a list of items, they perform self-initiated encoding operations on the items. Thus, by comparing quantitative and qualitative aspects of performance under learn instructions with performance after various combinations of incidental orienting tasks, the nature of learning processes may be further elucidated. The possibility of analysis and control of learning through its constituent mental operations opens up exciting vistas for theory and application.
REFERENCE NOTE

1. Moscovitch, M., & Craik, F. I. M. Retrieval cues and levels of processing in recall and recognition. Unpublished manuscript, 1975. (Available from Morris Moscovitch, Erindale College, Mississauga, Ontario, Canada).

REFERENCES

Begg, I. Recall of meaningful phrases. Journal of Verbal Learning and Verbal Behavior, 1972, 11, 431-439.
Bobrow, S. A., & Bower, G. H. Comprehension and recall of sentences. Journal of Experimental Psychology, 1969, 80, 55-61.
Bower, G. H. A multicomponent theory of the memory trace. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation (Vol. 1). New York: Academic Press, 1967.
Bower, G. H., & Karlin, M. B. Depth of processing pictures of faces and recognition memory. Journal of Experimental Psychology, 1974, 103, 751-757.
Broadbent, D. E. Behaviour. London: Eyre & Spottiswoode, 1961.
Cermak, L. S. Human memory: Research and theory. New York: Ronald, 1972.
Craik, F. I. M., & Lockhart, R. S. Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 1972, 11, 671-684.
Craik, F. I. M., & Watkins, M. J. The role of rehearsal in short-term memory. Journal of Verbal Learning and Verbal Behavior, 1973, 12, 599-607.
Eagle, M., & Leiter, E. Recall and recognition in intentional and incidental learning. Journal of Experimental Psychology, 1964, 68, 58-63.
Horowitz, L. M., & Prytulak, L. S. Redintegrative memory. Psychological Review, 1969, 76, 519-531.
Hyde, T. S. Differential effects of effort and type of orienting task on recall and organization of highly associated words. Journal of Experimental Psychology, 1973, 79, 111-113.
Hyde, T. S., & Jenkins, J. J. Differential effects of incidental tasks on the organization of recall of a list of highly associated words. Journal of Experimental Psychology, 1969, 82, 472-481.
Hyde, T. S., & Jenkins, J. J. Recall for words as a function of semantic, graphic, and syntactic orienting tasks. Journal of Verbal Learning and Verbal Behavior, 1973, 12, 471-480.
Jacoby, L. L. Test appropriate strategies in retention of categorized lists. Journal of Verbal Learning and Verbal Behavior, 1973, 12, 675-682.
Kolers, P. A. Remembering operations. Memory & Cognition, 1973, 1, 347-355. (a)
Kolers, P. A. Some modes of representation. In P. Pliner, L. Krames, & T. Alloway (Eds.), Communication and affect: Language and thought. New York: Academic Press, 1973. (b)
Kolers, P. A., & Ostry, D. J. Time course of loss of information regarding pattern analyzing operations. Journal of Verbal Learning and Verbal Behavior, 1974, 13, 599-612.
Lockhart, R. S., Craik, F. I. M., & Jacoby, L. L. Depth of processing in recognition and recall: Some aspects of a general memory system. In J. Brown (Ed.), Recognition and recall. London: Wiley, 1975.
Neisser, U. Cognitive psychology. New York: Appleton-Century-Crofts, 1967.
Norman, D. A. (Ed.). Models of human memory. New York: Academic Press, 1970.
Paivio, A. Imagery and verbal processes. New York: Holt, Rinehart & Winston, 1971.
Postman, L. Short-term memory and incidental learning. In A. W. Melton (Ed.), Categories of human learning. New York: Academic Press, 1964.
Rosenberg, S., & Schiller, W. J. Semantic coding and incidental sentence recall. Journal of Experimental Psychology, 1971, 90, 345-346.
Schulman, A. I. Recognition memory for targets from a scanned word list. British Journal of Psychology, 1971, 62, 335-346.
Schulman, A. I. Memory for words recently classified. Memory & Cognition, 1974, 2, 47-52.
Sheehan, P. W. The role of imagery in incidental learning. British Journal of Psychology, 1971, 62, 235-244.
Sutherland, N. S. Object recognition. In K. C. Carterette & M. P. Friedman (Eds.), Handbook of perception (Vol. 3). New York: Academic Press, 1972.
Till, R. E., & Jenkins, J. J. The effects of cued orienting tasks on the free recall of words. Journal of Verbal Learning and Verbal Behavior, 1973, 12, 489-498.
Treisman, A., & Tuxworth, J. Immediate and delayed recall of sentences after perceptual processing at different levels. Journal of Verbal Learning and Verbal Behavior, 1974, 13, 38-44.
Tulving, E. Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory. New York: Academic Press, 1972.
Tulving, E., & Thomson, D. M. Encoding specificity and retrieval processes in episodic memory. Psychological Review, 1973, 80, 352-373.
Tulving, E., & Watkins, M. J. Structure of memory traces. Psychological Review, 1975, 82, 261-275.
Walsh, D. A., & Jenkins, J. J. Effects of orienting tasks on free recall in incidental learning: "Difficulty," "effort," and "process" explanations. Journal of Verbal Learning and Verbal Behavior, 1973, 12, 481-488.
Waugh, N. C., & Norman, D. A. Primary memory. Psychological Review, 1965, 72, 89-104.
Wickelgren, W. A. The long and the short of memory. Psychological Bulletin, 1973, 80, 425-438.

(Received February 5, 1975)

APPENDIX
Each subject in Experiment 9 received the same 60 words in the same order, but six different "formats" were constructed, such that all six possible questions (case, rhyme, category × yes-no) were asked for each word (Table A1). Thus, for SPEECH, the questions were (a) Is the word in capital letters? (b) Is the word in small print? (c) Does the word rhyme with each? (d) Does the word rhyme with tense? (e) Is the word a form of communication? (f) Is the word something to wear? Each format contained 10 questions of each type. Negative questions were drawn from the pool of unused questions in that particular format.
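
The counterbalancing constraints stated above (60 words, six formats, every word paired with each of the six question types across formats, 10 questions of each type within a format) can be met in more than one way. The sketch below shows one possible scheme, a simple cyclic rotation of question types over word positions. The rotation rule, the label strings, and the function name `build_formats` are illustrative assumptions; the article does not report how question types were actually assigned to word positions.

```python
from collections import Counter

# Hypothetical labels for the six question types (case, rhyme, category x yes-no).
QUESTION_TYPES = [
    "case-yes", "case-no",
    "rhyme-yes", "rhyme-no",
    "category-yes", "category-no",
]
N_WORDS = 60  # the same 60 words, in the same order, for every subject


def build_formats(n_words=N_WORDS, question_types=QUESTION_TYPES):
    """Return one list of question types per format.

    Assumed rule (not taken from the article): the word at position w in
    format f receives question type (w + f) mod 6, i.e. a cyclic rotation.
    """
    k = len(question_types)
    return [
        [question_types[(w + f) % k] for w in range(n_words)]
        for f in range(k)
    ]


formats = build_formats()

# Within each format, count how often each question type occurs.
per_format_counts = [Counter(fmt) for fmt in formats]

# For each word position, collect the question types it receives across formats.
per_word_types = [{fmt[w] for fmt in formats} for w in range(N_WORDS)]
```

Under this rotation each format contains 10 questions of each type and each word is paired with all six question types across the six formats. A real design would probably also randomize the order of question types within a format, which this sketch omits.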

TABLE A1

WORDS AND QUESTIONS USED IN EXPERIMENT 9
