
Contemporary Educational Psychology 41 (2015) 172–187


Eye-movement modeling of integrative reading of an illustrated text: Effects on processing and learning

Lucia Mason a,*, Patrik Pluchino b, Maria Caterina Tornatora a
a Department of Developmental Psychology and Socialization, University of Padova, Padova, Italy
b Department of General Psychology, University of Padova, Padova, Italy

A R T I C L E  I N F O

Article history:
Available online 28 January 2015
Keywords:
Eye movements
Text processing
Text comprehension
Multimedia learning
Example-based learning
Video-based modeling

A B S T R A C T

Integrative processing of verbal and graphical information is crucial when students read an illustrated text to learn from it. This study examines the potential of a novel approach to support the processing of text and graphics. We used an eye movement modeling example (EMME) in the school context to model students' integrative processing of verbal and pictorial information by replaying a model's gazes while reading an illustrated text on a topic different from that of the learning episode. Forty-two 7th graders were randomly assigned to an experimental (EMME) or a control condition (No-EMME) and were asked to read an illustrated science text about the food chain. Online measures of text processing and offline measures of reading outcomes were used. Eye-movement indices indicated that students in the EMME condition showed more integrative processing than students in the No-EMME condition. They also performed better than the latter in the verbal and graphical recall, and in the transfer task. Finally, the relationship between the duration of reprocessing the graphical segments while rereading the corresponding verbal segments and transfer performance was stronger in the EMME condition, after controlling for individual differences in prior knowledge, reading comprehension, and achievement in science. Overall, the findings suggest the potential of eye-tracking methodology as an instructional tool.
© 2015 Elsevier Inc. All rights reserved.

1. Introduction
Students mainly rely on reading to learn new knowledge in the
school context. Regardless of the presentation format of their
information sources, either paper or digital, they should be able to
understand written texts. It is therefore not surprising that a fruitful line of research in educational psychology is that of learning from
text in content areas (Alexander, 2012; Sinatra & Broughton, 2011)
and several studies have investigated the effects of text type (e.g.,
refutation text) on conceptual understanding and change (Cordova,
Sinatra, Jones, Taasoobshirazi, & Lombardi, 2014; Diakidoy, Kendeou,
& Ioannides, 2003; Diakidoy, Mouskounti, & Ioannides, 2011;
Kendeou, Muis, & Fulton, 2011; Mason, Gava, & Boldrin, 2008).
In learning from texts students also encounter different types of
visualization as textbooks are accompanied by illustrations. It has
been documented that images enhance learning (Butcher, 2006;
Carney & Levin, 2002; Mayer, 1989), although not always (Mayer
& Gallini, 1990). The superiority of an illustrated text over a non-illustrated text depends on a successful integration of verbal
and graphical information (Mayer, 2009, 2014; Schnotz, 2002, 2014).

* Corresponding author. Department of Developmental Psychology and Socialization, University of Padova, via Venezia 8, Padova 35131, Italy. Fax: +39 049 827 6511.
E-mail address: lucia.mason@unipd.it (L. Mason).
http://dx.doi.org/10.1016/j.cedpsych.2015.01.004
0361-476X/© 2015 Elsevier Inc. All rights reserved.

Nevertheless, research has also indicated that students may pay little
attention to illustrations (Cromley, Snyder-Hogan, & Luciw-Dubas,
2010a; Hannus & Hyönä, 1999) and are often under the illusion that
they comprehend them (Schroeder et al., 2011).
It is therefore very important to help students integrate words and pictorial elements when reading, in order to enhance not only text comprehension but also learning from illustrated text, given that the association between reading performance and academic performance has been documented, especially in the domain of science (Cromley, 2009; Cromley, Snyder-Hogan, & Luciw-Dubas, 2010b).
Previous research has focused on the characteristics of learning materials that can better support the integration of text and pictures,
in particular the corresponding parts of the two types of external
representation, for example using labels and highlights as visual cues
(Bartholomé & Bromme, 2009; Florax & Ploetzner, 2010; Mason,
Pluchino, & Tornatora, 2013). However, cueing by making relevant
information more salient is not necessarily successful, as indicated in studies on learning from static (Bartholomé & Bromme,
2009) and animated visualizations (Lowe & Boucheix, 2011).
An alternative way to sustain readers' integration of verbal and graphical information is based on the opportunity of modeling the reader's processing behavior, that is, to show a novice student the behavior of an expert who reads an illustrated text. A very recent approach in research on learning and instruction supports students' orientation of attention in video-based modeling examples
by means of eye tracking (Jarodzka, van Gog, Dorr, Scheiter, & Gerjets, 2013). Eye tracking captures a person's eye position, which is linked to attention and information processing (Just & Carpenter, 1980;
Rayner, 1998, 2009). Eye-tracking methodology has recently received increased attention in educational research about multimedia
learning (van Gog & Scheiter, 2010) to examine the processing of text and static graphics (Eitel, Scheiter, Schüler, Nyström, & Holmqvist, 2013), especially the time course of this processing (Mason, Pluchino, Tornatora, & Ariasi, 2013), as well as complex graphics (Canham & Hegarty, 2010), animations (Boucheix & Lowe, 2010), and dynamic stimuli (Jarodzka, Scheiter, Gerjets, & van Gog, 2010). Modern technology related to eye movement recordings not only provides unique information regarding perceptual and cognitive processes underlying learning performance, but it also makes gaze replays available in the form of videos. In standard eye-tracking software, fixations on specific information are represented as solid dots: The larger a dot, the longer the fixation time on it. Videos of gaze replays can be used to model a learner's behavior. In this regard, Eye Movement Modeling Examples (EMME) is a recent instructional strategy
based on eye position recordings of a skillful expert, which are
replayed to less skillful students with the aim of helping them
acquire the desired skills (van Gog, Jarodzka, Scheiter, Gerjets, & Paas,
2009).
In the present study we used eye-tracking methodology in the real school context to model students' integration of text and graphics when interacting with the learning material. The aim was to extend current research on students' processing and comprehension of illustrated text, taking into account the main issues of two separate lines of research, one on the multimedia principle and the other on eye movement modeling examples. In the next sections, relevant issues of these lines of research are briefly reviewed as the foundation of the current investigation.

1.1. Comprehension of text and picture


The beneficial effects of supplementing texts with pictures have
been accounted for by the cognitive theory of multimedia learning (Mayer, 2009, 2014). This theory envisages three processes as
important for a successful comprehension of verbal and graphical
representations. The first is the selection of relevant words from
the text and relevant elements from the picture. The second is the
organization of selected information in which the material is
further processed to understand and retain the information.
Organization takes place separately for textual and pictorial
information; therefore a verbal model and a pictorial model are constructed. The third process is the integration of verbal and pictorial
models with the help of prior knowledge retrieved from longterm memory.
In his integrated model of text and picture comprehension,
Schnotz (2002, 2014; Schnotz & Bannert, 2003) developed Mayer's
theory to take into consideration the representational nature of a
text and a picture as two different sign systems, distinguishing
between the processing of descriptions and depictions. Texts are
considered to be descriptive representations with a higher representational power than depictive representations. Paivio's (1986)
dual-coding theory is applied to the processing of images and texts.
However, in contrast to this traditional theory, the integrated model
of text and picture comprehension posits that multiple representations are constructed during text and picture comprehension.
During text comprehension, the reader first generates a representation of the text surface structure, then a propositional representation of the semantic content, which is a representation of the ideas conveyed in the text at a conceptual level, and finally a mental model of the subject matter presented in the text. Propositional representations and mental models interact continuously through processes of model construction and model inspection, guided by schemata that have selective and organizational functions.


Similarly, in picture comprehension, an individual first generates a visual representation of the graphic visualization via perceptual
processing and then a mental model, as well as a propositional representation of the content through semantic processing (Schnotz,
2002, 2014; Schnotz & Bannert, 2003). Structural mapping processes are essential to the formation of a coherent mental model
of an illustrated text from the continuous interactions between the
propositional representation and the mental model, both in text
comprehension and picture comprehension. The mapping process
takes place when graphical entities are mapped onto mental
entities and spatial relations are mapped onto semantic relations.
The resulting mental mappings are integrated conceptually with prior
knowledge and enable the use of acquired knowledge in various
situations.
To exemplify, if a student reads that "Some migrant birds fly to the south of Europe for wintering" (Schnotz, 2014), she constructs a first representation of the text surface structure, which cannot be considered understanding, but allows repetition of the content read. A propositional representation derived from the surface representation leads to a conceptual organization of the content around the proposition "fly", which is independent of the sentence wording and syntax. Further, the reader constructs a mental model of the text content, for example a mental map of Europe with a north–south bird transfer. Similarly, if a student inspects a map of bird migration in Europe, an internal visual image of the map is formed first,
then a mental model of bird migration in Europe, complemented
by a propositional representation, as an effect of selection and elaboration of information through structure mapping (Schnotz, 2014).
According to both the cognitive theory of multimedia learning
and the integrated model of text and picture comprehension, integration processes are essential for learning from illustrated texts,
after selecting and organizing relevant information. How can integrative processing of verbal and graphical information be enhanced
to facilitate text comprehension and learning from text? To answer
this question, research on multimedia learning has examined various
characteristics of the learning material. For example, the potential
of visual cueing in the form of labeling has been investigated.
In a study with university students, labeling included either the
presentation of numerical labels to mark each central concept in
the text and the corresponding unit in the graphics, or colored highlights of the central areas of the texts and the corresponding areas
of the graphics (Bartholomé & Bromme, 2009). Numerical labeling was more effective than highlighting, at least in one of the various knowledge measures at posttest, a classification task, when other
prompts were not provided. In a study with lower secondary school
students, labeling referred to the presence of only one or two key
words placed near each part of a picture (Mason et al., 2013). Results
showed that only for the transfer performance did participants who
studied the text illustrated by a labeled picture outperform those
who interacted with the same text visualized by an unlabeled picture,
or text only. Moreover, the labeled illustration promoted stronger
integrative processing of the learning material, as revealed by the
eye-fixation index of the time spent refixating text segments while
reinspecting the illustration (look-from illustration to text) during
the second-pass reading and inspection. It is worth underlining
that this study examined the time course of text and picture
processing and indicated that their integration occurs during the
second-pass reading and is related to deeper learning. The latter outcomes confirmed those of a correlational study with fourth graders, which indicated that greater integrative processing of an illustrated text was associated with higher learning performance (Mason,
Tornatora, & Pluchino, 2013).
Another potentially advantageous characteristic of learning materials that has been examined is spatial contiguity. It entails placing texts and pictures close to, rather than far from, each other on the page or screen (Mayer, 2009, 2014). Spatial contiguity has been proven to enhance
retention and transfer in two of three studies with university students (Johnson & Mayer, 2012). Picture labeling in the form of words located near the graphical elements was investigated in relation to
spatial contiguity and text segmentation in another study with
university students. Findings revealed that retention, but not
comprehension, improved through segmentation of the verbal
representation and, to a lesser extent, through picture labeling
(Florax & Ploetzner, 2010).
Overall, although there is evidence that visual cueing in the form of labeling can be effective in multimedia learning, the results are not conclusive, especially regarding the level of learning, superficial or deeper, that can be enhanced.
Another approach to support an effective processing of text and
graphics is to focus on the learners, who can be empowered to interact more effectively with multiple representations, for example
by teaching them a learning strategy. In two outcome-focused
studies, sixth graders' selection, organization, and integration processes were supported through the direct verbal presentation of a
strategy to be used to learn from text and pictures (Schlag &
Ploetzner, 2011). Half the students were provided with written instructions on a worksheet regarding the various steps of the strategy
that they had to carry out. Findings revealed that the strategy
instructions were effective in promoting factual, conceptual, and
transfer knowledge.
Are only verbal instructions effective in supporting text and
picture integration? In the study reported below, we adopted a teaching strategy approach but in an indirect and innovative way, focusing
on both the process and outcomes of illustrated text reading. The
study was not based on explicitly teaching the various steps of a
successful strategy through written instructions as in previous
studies, but rather on giving learners the opportunity to observe
an example of how a successful reader processes an illustrated text
via the position of her/his eyes moving through the learning material. Relevant issues of research on example-based instruction,
especially on eye movement modeling examples, are now introduced to illustrate the innovative approach adopted in the study.

1.2. Eye movement modeling examples


Research has documented that example-based instruction is powerful. Providing students with examples that show how they should solve a problem or perform a given task helps their performance (see Atkinson, Derry, Renkl, & Wortham, 2000; van Gog
& Rummel, 2010, for reviews). The fruitful area of investigation on
worked-out examples has clearly documented that they substantially support novice learners (Renkl, 1997; Stark, Kopp, & Fischer,
2011; van Gog, Paas, & van Merriënboer, 2006). An important advantage of example-based instruction is that learners can save cognitive resources: they do not have to try out possible solutions but can instead concentrate on the correct solution, or way to perform the task, which is provided.
In example-based instruction, examples are not presented in a written format only; they are also presented as videos. Video-based modeling examples have been used increasingly in educational contexts,
for example to model writing performance (Braaksma, Rijlaarsdam,
& van den Bergh, 2002), problem solving (van Gog, 2011), and
creativity in verbal and visual domains (Groenendijk, Janssen,
Rijlaarsdam, & van den Bergh, 2013).
Video-based modeling is grounded on observational learning, which was first theorized by Bandura (1977) within social learning theory. He posited that individuals can learn much by observation if they "attend to, and perceive accurately" (p. 24) the relevant aspects of the modeled behavior. Observation is also an essential aspect of cognitive apprenticeship, in which an expert model unravels covert processes (Collins, Brown, & Newman, 1989).
In our study we combined the usefulness of instruction by video-based modeling with the benefits of eye-tracking technology to model strategic reading of an illustrated text, that is, the integrative processing of text and graphics, which is essential for successful learning.
It should be noted that the entire process of reading an illustrated text is cued in EMME. In the above-mentioned studies on the comprehension of texts and graphics, labels were used, for example, to make the correspondences between words and pictorial elements more salient. In EMME the entire process of reading given material, not only the important correspondences between representations, is displayed to show expert behavior throughout the execution of a task (i.e., reading an illustrated text).
Modeling the reading process by means of the eye movements
of a successful performer may be advantageous for novice learners as they are guided perceptually to direct attention during the
execution of the task. In this regard, the importance of a perceptual guide for the solution of a well-known problem, Duncker's radiation problem, has been demonstrated in an eye-tracking investigation by Grant and Spivey (2003). In a first study, using eye-fixation patterns, they identified the critical component of the diagram-based problem that was related to insight into problem solving. In the second study, the authors perceptually highlighted the identified critical diagram component, and this increased the frequency of correct solutions. This outcome provides evidence of the interactions between the visual environment, attention, and mental processes (Grant & Spivey, 2003). In our study we started from the theoretical consideration that it is possible to guide attention and related eye movements through perceptual emphasis on crucial features or components of a representational structure.
In the experimental psychology literature there are several studies
on the effectiveness of EMME in various areas of investigation, such
as problem solving and diagnosis in medical imaging. The effectiveness of seeing the eye movements of another person looking
for pulmonary nodules has been documented in a series of studies
with radiographers. An interesting outcome is that novice radiographers were better able to identify nodules after seeing the eye movements of either a naïve or an expert search behavior, but their performance did not increase when the model's eye movements were unrelated to the search behavior (Litchfield, Ball, Donovan, Manning, & Crawford, 2010).
To date, in research on learning and instruction, which is
pertinent to our study, there are only a very few studies that have
investigated the use of a model's eye movements to enhance students' performance. In a study with medical students, video recordings of the eye movements of the teacher were shown to the students to test whether observing these videos would produce more efficient gaze patterns when detecting task-relevant information on medical images. Outcomes confirmed that observation of the expert's eye movements improved students' performance (Seppänen & Gegenfurtner, 2012). In another study with medical students, their attention was guided through a recording of a model's eye movements superimposed on the case video of patients (Jarodzka et al., 2012). Students were required to learn the skills for an effective visual search for symptoms and interpretation of their observations. The model's eye movements were displayed in two ways: by highlighting the features focused on by the model with circles, or by blurring the features not focused on by the model, that is, reducing other information (spotlight condition) in a form of anti-cueing, as used in Lowe and Boucheix's (2011) study. Results revealed that the latter effectively guided students' search for relevant information compared with the former and the control condition. Moreover, in the spotlight condition students not only improved their visual search with videos of new patients, but also showed better clinical reasoning (Jarodzka et al., 2012).
In another recent study, EMME was used to guide students' attention in a visually complex perceptual task that required them to distinguish fish locomotion patterns in realistic and dynamic stimuli. The model's eye movements were displayed in two ways: through solid dots (adding information) or spotlights (reducing other information). In both conditions eye movements were accompanied by verbal explanations, which were necessary to explain why the information attended by the expert was relevant at a given moment. Results revealed that EMME enhanced both visual search and interpretation of relevant information for novel stimuli compared with the control group. In addition, the two displays of eye movements played a differential role: spotlight EMME sustained visual selection of information, while dot EMME enhanced organization and integration of information with prior knowledge (Jarodzka et al., 2013).

1.3. The current study


In all the above-mentioned studies that have focused on the effects of eye movement modeling examples, perceptual tasks have been modeled. To our knowledge, there is no investigation of the potential of this perceptual strategy in supporting the execution of tasks that are not purely perceptual, although they involve perceptual processes. To fill this gap and to contribute to the theory on multimedia learning, in the current study we focus on modeling the reading of an illustrated text to learn from it. As mentioned before, learning from words and pictures entails much more than perceptual processes, which have an important role during the selection of relevant information. Selection processes should be followed by organization and integration processes for the multimedia effect to occur (Mayer, 2009, 2014; Schnotz & Bannert, 2003). Learning from
text and pictures requires the integration of verbal and graphical
information. Modeling the crucial integrative processing of words
and pictorial elements during reading is particularly relevant, given
that learning from illustrated texts is one of the most common academic learning tasks. It seems important to investigate whether
modeling the entire process of reading an illustrated text only
perceptually, through the gaze replay of an expert, can have positive effects on various postreading outcomes.
From a practical perspective, it should also be noted that only
university students and professionals were involved in previous
studies on eye movement modeling examples. To extend current
research, it is therefore worth investigating whether EMME may also
be helpful for much younger students in the educational context
to guide them in successfully carrying out a fundamental task, such
as learning concepts from text and picture.
To sum up, the rationale for the study is theoretically grounded
on four issues derived from previous research, as mentioned
above: (1) attentional guidance is a way to improve cognitive
performance involving graphical representations; (2) video-based
modeling has been proven useful in educational contexts; (3) EMME
has been effectively used to model the execution of complex tasks,
although limited to a perceptual domain; and (4) multimedia learning has been successfully enhanced in young students by teaching
a strategy for integrating text and picture.
More specifically, the study investigated the effects of attentional guidance through eye movement modeling examples to support integrative processing and learning from text and picture in lower secondary school. A strategy derived from the Bartholomé and Bromme (2009) study was modeled for fostering the integrative processing. Based on the cognitive processes envisioned in the Mayer (2010) and Schnotz (2002) theoretical accounts (selection, organization, and integration), the strategy involves a three-step sequence of text and picture processing in which the latter is conceptually guided by the former. Text processing initially allows
readers to use text-based information to better focus their picture
inspection on the most relevant elements. Readers with low prior
knowledge in particular seem to adopt a text-guided processing
approach (Canham & Hegarty, 2010).
In concrete terms, in an eye movement modeling example to guide students in the application of this strategy, the model initially gains an overview of the whole text to identify central concepts. In the second step, the model uses text information to direct the picture inspection in order to identify the graphical counterparts of the central concepts of the text. The model therefore starts relating the text and picture to each other, shifting from one to the other representation to organize their corresponding parts. In the third step, the model continues relating the verbal and visual representations and then focuses on the verbal segments that are not depicted, since the mapping between text and picture is inevitably partial.
The following research questions guided the study:
(1) Do students with the opportunity of observing a model's eye movements while reading an illustrated text show greater integrative processing than students without this opportunity in their own reading of another illustrated text?
(2) Do the former also perform better than the latter in postreading tasks that measure recall, factual knowledge, and transfer of newly learned knowledge?
(3) Is there a link between online processing and offline measures of illustrated text reading?
For research question 1, we hypothesized that students who had the opportunity to observe the eye movements of a model strategically reading an illustrated text would show more integrative processing of verbal and pictorial information. Higher integrative processing should be evident in the fine-grained eye-fixation index of look from text to picture and from picture to text fixation time. This is the time a learner spends reinspecting picture segments while rereading text segments, and rereading text segments while reinspecting picture segments, during the second-pass processing. This index reflects a less automatic and more purposeful processing of the learning material (Hyönä, Lorch, & Kaakinen, 2002; Hyönä & Nurminen, 2006). More specifically, based on previous research on the time course of text and picture processing mentioned above (Mason, Pluchino et al., 2013 and Mason, Tornatora et al., 2013), we hypothesized that during the second-pass reading EMME students would show longer fixation times for look-from corresponding text segments to corresponding picture segments, and vice versa. In other words, they would refixate longer on the graphical elements that visualize the central text information after gaze shifting from text to picture. They would also refixate longer on the central verbal segments that are visualized in the illustration after gaze shifting from picture to text. Integration would therefore mainly refer to the correspondences between words and graphics, as highlighted in the model's gaze replay. In contrast, we did not hypothesize differences between EMME and No-EMME students either for the index of immediate and more automatic first-pass reading, or for the index of the delayed and less automatic second-pass reading, or look-back, within text and within picture.
For research question 2, we hypothesized that EMME students would also perform better than No-EMME students in the postreading transfer task. More than recall and factual knowledge, transfer, which reveals deeper conceptual understanding, requires greater integrative processing of text and graphics, an essential condition for successful learning from illustrated text (Mayer, 2009; Schnotz, 2002).
For research question 3, based on recent process-oriented research on learning from illustrated science texts in grade-level students (Mason, Pluchino et al., 2013 and Mason, Tornatora et al., 2013), we hypothesized that integrative processing of text and graphics would predict reading performance, in particular in regard to the use of EMME, after controlling for individual characteristics. Specifically, given the text-guided processing of the learning material in the model's gaze replay, the crucial index of look-from corresponding text segments to corresponding picture segments fixation time should be a predictor of deeper learning in particular, as revealed in the transfer task. In addition, we expected the interaction term to be a predictor of transfer, that is, a stronger relationship between the look-from corresponding text segments to corresponding picture segments fixation time and the transfer performance in participants who observed the gaze replay, compared with those who did not have the opportunity of the eye movement modeling example.
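The moderation analysis implied by this hypothesis can be sketched in Python as below. This is only an illustrative outline, not the authors' actual analysis script: the variable names, the data file, and the use of statsmodels are assumptions.

import pandas as pd
import statsmodels.formula.api as smf

# Assumed data layout: one row per student, with the log-transformed look-from index
# (corresponding text to corresponding picture), the control variables, and condition.
df = pd.read_csv("study_data.csv")  # hypothetical file name

# 'condition' coded 0 = No-EMME, 1 = EMME; '*' expands to both main effects plus
# the look_from_corr x condition interaction hypothesized to predict transfer.
model = smf.ols(
    "transfer ~ prior_knowledge + reading_comprehension + science_grade"
    " + look_from_corr * condition",
    data=df,
).fit()
print(model.summary())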

2. Method
2.1. Participants and design
Initially, 53 students attending 7th grade, the second year of lower secondary school, at two public schools in a north-eastern region of Italy were involved on a voluntary basis, with parental consent, during the second term of the 2011–2012 school year. Because of the poor eye calibration of 5 participants, 3 participants with learning disabilities, and the absence of 3 participants at one of the two sessions, we considered the data of 42 students (23 girls), with a mean age of 12.61 years (SD = .49). All were Caucasian, native-born Italians with Italian as their first language and shared a homogeneous middle-class social background. All had normal or corrected-to-normal vision.
At the start of the study, participants were randomly assigned to the experimental condition featuring an eye movement modeling example (EMME, n = 22) or to the control condition (No-EMME, n = 20). Both conditions involved a pretest–posttest design.

2.2. Materials
2.2.1. Eye tracking equipment
Eye movements were collected using the Tobii T120 eye-tracker. It is integrated into a 17-in. TFT monitor with a maximum resolution of 1280 × 1024 pixels. The eye-tracker embeds five near-infrared light-emitting diodes (NIR-LEDs) and a high-resolution camera with a charge-coupled device (CCD) sensor. The camera samples pupil location and pupil size at a rate of 120 Hz. The Tobii T120 does not require a head stabilization system. Data were recorded with Tobii Studio (1.7) software.

2.2.2. Eye movement modeling example


A short video showing the gaze replay of a model reading an
illustrated text was shown in the EMME condition only. The video
with the gaze trail lasted 2 minutes and 53 seconds. It was created
by instructing a model (a graduate student) to study a one-page
illustrated text on the water cycle while her eye movements were
recorded. The model behaved didactically to deliberately model an integrative reading strategy. Her eye movements were superimposed onto the illustrated text as moving solid red dots, the size of which changed dynamically according to fixation duration at a given position on the screen. Based on their positive effects in enhancing
organization and integration of knowledge (Jarodzka et al., 2013),
we used solid dots for attentional guidance. The eye movements were
shown at their original speed in the video.
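To make the procedure concrete, the sketch below shows one way a gaze replay of this kind could be rendered from recorded fixations, with solid red dots whose size grows with fixation duration. It is only an illustration under assumed inputs (the fixation list, image file, and timing are invented), not the software actually used to produce the EMME video.

import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from matplotlib.animation import FuncAnimation

# Hypothetical fixation export: (x_px, y_px, duration_ms), in temporal order.
fixations = [(310, 140, 250), (640, 150, 430), (820, 600, 610), (300, 610, 380)]

stimulus = mpimg.imread("illustrated_text_water_cycle.png")  # assumed screenshot of the page
fig, ax = plt.subplots()
ax.imshow(stimulus)
ax.axis("off")
dot = ax.scatter([], [], s=[], c="red", alpha=0.7)

def draw(i):
    x, y, dur = fixations[i]
    dot.set_offsets([(x, y)])
    dot.set_sizes([dur])  # larger dot = longer fixation, as in the video shown to students
    return (dot,)

anim = FuncAnimation(fig, draw, frames=len(fixations), interval=400, blit=True)
anim.save("gaze_replay.mp4", fps=2)  # requires ffmpeg; a real replay advances at the recorded speed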
The topic of the text read by the model, the water cycle, was different from the topic of the text used in the learning episode, to avoid participants in the EMME condition being advantaged in terms of exposure to the content to be learned. In this respect, it should be pointed out that the purpose of the video with gaze replay was to make the procedure of text and picture integration evident, regardless of the specific content of the learning material. In any case, the participants could read the text shown in the video.
Before presenting the video, learners were only told the following: "Now you will watch a video that shows the eye movements of a student who is reading an illustrated science text, that is, a text accompanied by a picture. This student learned a lot from reading the text and observing the picture. Look at the video carefully. In the video you will see red dots on the text and picture. Each of these red dots represents how long the student fixated the specific information. The larger the red dot, the longer the time spent on the information."
The video modeled a successful strategy for learning from an
illustrated text. As argued before, the text-guided strategy was
theoretically grounded (Bartholomé & Bromme, 2009). The gaze
replay showed that the model initially read the entire text to gain
an overview of the verbal part of the learning material. Text information then guided the model's picture inspection. The model thus
started connecting the text and picture to each other, shifting from
one to the other representation of the learning material to make
all the correspondences between them. In fact, the model shifted
24 times to relate the text segments to the picture and 24 times to
relate the picture to the text segments. The model also focused on
the text segments that were not depicted. The assumption is that
integration may take place after a reader has explored the text during
the earlier processing, and has then explored the picture conceptually guided by the text, making all correspondences between the
relevant elements in the verbal and graphical representations, and
reprocessing the verbal segments that are not depicted (Bartholomé
& Bromme, 2009).
It should be noted that the video with the model's gaze replay was shown without simultaneous verbal accompaniment for three reasons: (1) to avoid verbal instructions interfering with observational learning, that is, to prevent the desired effects from being attributed to the verbal explanations instead of to the modeling in itself; (2) to avoid verbal explanations being redundant with respect to the video modeling, which can have a detrimental effect in tasks that are perceptually simple, as happened in van Gog et al.'s (2009) study; (3) to avoid learners having difficulty attending to both verbal and visual processes simultaneously. In previous research in which the modeled task was a complex, purely perceptual task, verbal descriptions of the relevant elements to look at accompanied the model's gaze replays because observers would have had difficulty in visually searching for small relevant perceptual aspects of the stimuli (e.g., Jarodzka et al., 2013). In contrast, what was relevant in the video of this EMME study was not the perception of a specific small element fixated by the model (who read a text on a topic different from the learning topic), but rather a global perception that text and picture should be attended to, and their information integrated, by several shifts of attentional focus from one to the other part of the learning material.

2.2.3. Learning material


The same illustrated text (also used in another non-EMME study, Mason, Tornatora, & Pluchino, 2015) was used in both conditions (see Appendix). The topic was the food chain, which had not been previously presented in any of the science classes attended by the participants. The text was 214 words long (in Italian) and was illustrated by a picture (Figure 1). It should be noted that the text and picture in the video on the water cycle with the model's gaze replay matched the text and picture regarding the learning topic of the food chain.


Fig. 1. The learning material with text and picture on the topic of the food chain. Highlighted parts of the text and picture are the corresponding segments of the verbal
and graphical representations.

Specifically, the text and picture in the video were designed and appeared in the same positions, using one screen only, as the text and picture in the learning material.
The text and picture were divided into areas of interest (AOIs) for eye-fixation analyses. The text was divided into 12 sentences (AOIs). More specifically, 5 sentences were considered as corresponding AOIs (i.e., areas of interest that contain the same information depicted in the illustration) and 7 sentences were considered as non-corresponding AOIs (i.e., areas of interest that contain information about the food chain that was not depicted in the illustration). The illustration was also divided into corresponding AOIs (areas that visualize text information) and non-corresponding AOIs (areas that do not visualize text information).
In both the EMME and No-EMME conditions, the learning material appeared on one screen only. Both text and picture were saved as images in .tif format (1024 × 768 pixels). The text was written in Courier 13 font and presented with double line spacing.
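As an illustration of how such AOIs might be represented for the analyses, the sketch below uses invented names and pixel coordinates; the actual AOI definitions were created in the eye-tracking software.

from dataclasses import dataclass

@dataclass(frozen=True)
class AOI:
    name: str
    part: str            # "text" or "picture"
    corresponding: bool  # True = has a counterpart in the other representation
    box: tuple           # (x_min, y_min, x_max, y_max) in screen pixels

aois = [
    AOI("sentence_01", "text", True, (60, 40, 980, 80)),        # e.g., the producers sentence
    AOI("sentence_02", "text", False, (60, 85, 980, 125)),
    AOI("producers_area", "picture", True, (80, 420, 360, 700)),
    AOI("background_area", "picture", False, (600, 420, 1000, 700)),
]

def aoi_of(x, y):
    """Return the AOI containing fixation point (x, y), or None if outside all AOIs."""
    for a in aois:
        x0, y0, x1, y1 = a.box
        if x0 <= x <= x1 and y0 <= y <= y1:
            return a
    return None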
2.2.4. Pretest
The following measure was used before the learning episode.
2.2.4.1. Factual knowledge. Prior factual knowledge of the topic was assessed by nine questions: two open-ended and seven multiple-choice questions that also required a justification for the chosen option (α = .79).
2.2.5. Posttest
The following measures were used after the learning episode.

2.2.5.1. Verbal recall. Content retention was measured by asking the participants to write down everything they could remember from the text they had read, which included 23 information units.

2.2.5.2. Graphical recall. Participants were also asked to draw everything they could remember from the picture they observed.

2.2.5.3. Factual knowledge. Factual knowledge about the topic was measured using the same nine questions that were also asked before the learning episode (α = .86). Better answers at posttest require not only information directly presented in the text, but also combinations of pieces of information. For example, a correct answer to question no. 2a about a producer in a food chain (see the Appendix) implies the use of information that is explicitly conveyed in the text. In fact, the text states that the producers are water vegetation. A correct answer to question no. 7 (see the Appendix), instead, requires information that is not directly provided in the text, although it describes the role of decomposers. Readers should infer that the organisms that transform organic substances are the last link of a food chain.
2.2.5.4. Transfer. Participants' near transfer at posttest was measured using a task that focuses on the ability to apply the newly learned factual knowledge. The task included eight questions: four open questions and four multiple-choice questions that also required a justification for the chosen option (α = .79). Answers to the questions implied applying the acquired declarative knowledge to new situations or phenomena that were similar but not identical to those provided in the text (Haskell, 2001). Near transfer is the deepest level of reading outcome measured in the study.
2.2.6. Control variables
To ensure the equivalence of the groups in the two conditions,
other participant characteristics that could influence text processing and learning were also measured.
2.2.6.1. Reading comprehension ability. This was measured using the MT (Italian) test for seventh grade (Cornoldi & Colpo, 1995), which entails reading an informational text and answering 14 questions. Reliability of this instrument has been reported in the range of .73 to .82 (Cronbach's alpha). In the present study the reliability coefficient was α = .75.
2.2.6.2. Verbal working memory capacity. This was measured using the Italian version of Daneman and Carpenter's (1980) Reading Span Test (Pazzaglia, Palladino, & De Beni, 2000), which evaluates the simultaneous processing and storage of unrelated information and is, therefore, considered a complex span task. The split-half reliability coefficient for the index "words" (in Italian) has been reported as .76, based on a Cronbach's α of .72 (Pazzaglia et al., 2000). In the present study Cronbach's α was .70.
2.2.6.3. Visuo-spatial working memory capacity. This was measured using the Corsi Span Test (Corsi, 1972), which evaluates visuo-spatial memory span and implicit visuo-spatial learning. Test–retest reliability for this instrument has been reported as .74 (Mammarella, Toso, Pazzaglia, & Cornoldi, 2008). The reliability coefficient for Italian third and fourth graders has been reported as .79 (Mammarella, Pazzaglia, & Cornoldi, 2008). In the present study Cronbach's alpha was .77.
2.2.6.4. Spatial ability. This was measured using the Mental Rotation Test (Vandenberg & Kuse, 1978), which requires visuo-spatial ability to mentally rotate two- and three-dimensional objects quickly and accurately. The reported split-half reliability coefficient for subjects ranging from the 5th to the 13th grade is .80, and Cronbach's α = .87 (Geiser, Lehmann, & Eid, 2006). In the present study, Cronbach's α was .83.
2.2.6.5. Achievement in science. This was measured using students' most recent grade in this subject (midterm of the school year).
In the Italian school system, grades range from 1 to 10 (highest
grade = 10).
2.2.6.6. Perception of text easiness and interestingness. After reading the learning text, participants rated text easiness (1 = easy, 5 = difficult) and interestingness (1 = interesting, 5 = uninteresting) on a five-point Likert scale.

2.2.6.7. Baseline eye movements. While participants read an illustrated text on a different topic (the greenhouse effect), which had not yet been presented in the science classes they attended, their eye movements were also measured to obtain baseline data on text and picture processing. It should be pointed out that the text and picture used for the baseline measures had the same size and level of difficulty as those used in the learning session.
We computed the same eye-fixation indices described below for the learning text to measure participants' illustrated text processing at the start of the study. We also measured learners' prior knowledge of the topic of the greenhouse effect and their perception of text easiness and interestingness, which may influence text processing.
2.2.7. Interviews for manipulation check
The video was deliberately shown without any verbal accompaniment, and no information about its aim was explicitly given to participants. To ensure that they had perceived the integrative behavior of the model and the purpose of the gaze replay, they were asked three questions in an individual interview at the end of the posttest. The questions were: (1) "In your opinion, how did the model read the page?"; (2) "How did you understand this?"; and (3) "In your opinion, why did we show the video with the eye movements of a successful student reading?"
2.3. Data scoring
2.3.1. Eye-fixation indices
We computed indices (in milliseconds) of first- and second-pass fixation times (Hyönä & Lorch, 2004; Hyönä et al., 2002; Hyönä, Lorch, & Rinck, 2003; Hyönä & Nurminen, 2006).
2.3.1.1. First-pass fixation times. First-pass fixation time on text was computed by summing the duration of all fixations on the verbal part when it was read for the first time. First-pass fixation time on the picture was also computed by summing the duration of all fixations on the visualization during the first inspection.
2.3.1.2. Second-pass fixation times. We first computed the look-back on text and picture by summing the durations of all refixations on the verbal and graphical part, respectively. We then computed the look-from fixation time as an index of integrative processing. Look-from text to picture was computed for the corresponding and non-corresponding AOIs by summing the duration of all refixations that took off from a segment (AOI) of the text, either corresponding or non-corresponding, and landed on a segment (AOI) of the picture, either corresponding or non-corresponding. Similarly, look-from picture to text was computed by summing the durations of all reinspections that took off from a segment of the picture, either corresponding or non-corresponding, and landed on a segment of the text, either corresponding or non-corresponding. Look-from measures offer an index of the extent to which a text segment is used as an anchor point for processing the picture segments, or the picture is used as an anchor point for processing text segments, which is essential for integrating the two parts of the learning material. We also computed the average look-from fixation time by dividing the total duration by the number of transitions during the second pass, that is, the gaze shifts from one part of the learning material to the other. For the average look-from measures, transitions were likewise computed from the corresponding and non-corresponding text segments to the corresponding and non-corresponding picture segments, and vice versa.
All measures of eye movements (durations in milliseconds) were logarithmically transformed in order to control for the large inter-individual variance that leads to non-normal distributions.
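The logic of these indices can be summarized in the sketch below, which assumes an ordered sequence of fixations already mapped to text or picture AOIs (for instance with a helper like aoi_of above). It is a simplified illustration of first-pass, look-back, look-from, and average look-from measures and of the log transformation, not the authors' actual processing pipeline.

import math
from collections import defaultdict

# Hypothetical fixation records: (part, corresponding, duration_ms), in temporal order.
fixations = [
    ("text", True, 210), ("text", False, 190), ("picture", True, 240),
    ("text", True, 180), ("picture", False, 300), ("picture", True, 260),
]

def fixation_indices(fixations):
    first_pass = defaultdict(float)   # summed durations during the first visit to each part
    look_back = defaultdict(float)    # summed durations of all later refixations on each part
    look_from = defaultdict(float)    # keyed by (from_part, from_corr, to_part, to_corr)
    transitions = defaultdict(int)    # number of gaze shifts per look-from key
    visited, prev = set(), None
    for part, corr, dur in fixations:
        if prev is not None and prev[0] != part:
            visited.add(prev[0])      # leaving a part closes its first-pass window
        if part not in visited:
            first_pass[part] += dur
        else:
            look_back[part] += dur
            if prev is not None and prev[0] != part:
                key = (prev[0], prev[1], part, corr)
                look_from[key] += dur     # time spent on the segment landed on
                transitions[key] += 1
        prev = (part, corr)
    avg_look_from = {k: look_from[k] / transitions[k] for k in look_from}
    return first_pass, look_back, look_from, avg_look_from

fp, lb, lf, avg = fixation_indices(fixations)
log_first_pass_text = math.log(fp["text"])  # durations were log-transformed before analysis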


2.3.2. Pre- and posttests
Two independent raters scored the pre- and posttests. Disagreement between them was resolved in the presence of the first author. Answers to the open-ended questions to assess prior factual knowledge were awarded 0–2 points depending on their correctness and completeness. Answers to the multiple-choice questions were scored 1–2 only when a correct justification was given. Inter-rater reliability for scoring the former and the latter, as measured by Cohen's κ, was .90. Examples of questions to measure factual knowledge, as well as examples of scoring, are reported in the Appendix. Verbal recalls were scored according to the number of correct information units they reported. Inter-rater agreement was .91. Graphical recalls were scored 0–2 depending on the conceptual relevance of the depicted elements. No points were awarded when the producers, which are the basis of any food chain, were not included. One point was awarded when the consumers or decomposers were also included, as well as the producers. Two points were awarded when producers, consumers, and decomposers were included and correctly connected. Inter-rater reliability was .98. Examples of graphical recalls are reported in the Appendix.
Posttest answers to questions regarding factual knowledge were scored in the same way as those at pretest. Inter-rater reliability for coding the answers to the open-ended questions and the justifications for the answers to the multiple-choice questions, as measured by Cohen's κ, was .95. Answers to the open-ended questions for the transfer performance were also awarded 0–2 points depending on their correctness and completeness. Answers to the multiple-choice questions were scored 1–2 only when a correct justification was given. Inter-rater reliability for coding the justifications, as measured by Cohen's κ, was .96. Examples of questions to measure the transfer performance, as well as examples of scoring, are reported in the Appendix.
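For illustration, an agreement check of this kind can be computed as below; the two raters' scores are invented, and scikit-learn's implementation of Cohen's kappa is assumed in place of whatever tool the authors used.

from sklearn.metrics import cohen_kappa_score

# Hypothetical 0-2 scores assigned by two raters to the same ten answers.
rater_1 = [2, 1, 0, 2, 1, 1, 0, 2, 2, 1]
rater_2 = [2, 1, 0, 2, 1, 0, 0, 2, 2, 1]

print(round(cohen_kappa_score(rater_1, rater_2), 2))  # 1.0 would indicate perfect agreement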

2.3.3. Interviews for manipulation check
The responses to the three interview questions for the manipulation check were also coded qualitatively by the two independent raters. Their agreement, as measured by Cohen's κ for each question, ranged from .93 to .98. The response categories for the first question ("In your opinion, how did the model read the page?") were: (a) he connected text and picture (e.g., "He first read, then he read again and matched the images to the text, then he reread matching the images to the text, and then he looked at the things that he had not understood and matched them to the picture to understand more") and (b) he read the text and also looked at the picture (e.g., "He read everything the first time, then he reread jumping from one piece to another and also looked at the picture when rereading"). Responses in the first category emphasized the integrative process more. The response categories for the second question ("How did you understand this?") were: (a) from the correspondences between words and image (e.g., "He looked at both text and picture and made connections between text and picture, and re-fixated some parts to understand well. When he read some words, he shifted to the picture to see the correspondence") and (b) from repeated readings and looks at the picture (e.g., "He read the sentences, then reread and looked at the picture and then reread"). The response categories for the third question ("In your opinion, why did we show the video with the eye movements of a successful student reading?") were: (a) to see how to read and learn well (e.g., "You showed the video to make us understand a study method that we can use to learn something from reading a text and seeing a figure") and (b) to see how to read well (e.g., "You showed the video to make us understand how we should read a text with a figure").
To sum up, for the first question, 17 out of 22 participants in the EMME condition gave responses that emphasized the connection between text and picture, while 5 perceived in any case that the model read the text and looked at the picture. For the second question, 19 out of the 22 participants explicitly appealed to the various correspondences between words and pictorial elements made by the model to justify their perception that the model connected verbal and visual information. Only 3 students referred to multiple readings of the text and reinspections of the picture without pointing out the links between the two parts of the learning material. For the third question, 16 out of the 22 participants explicitly related both reading and learning when referring to the perceived purpose of the video they had watched, while 6 students explicitly mentioned only reading an illustrated text.
Overall, students' answers to the interview indicated that they had sufficiently perceived the crucial aspects of the model's visual behavior and the eye movement replay.
2.4. Procedure
Data collection took place in two sessions. In the first session, in the classroom, participants were collectively administered the pretest questions on the topic of the text used to record baseline eye movements, as well as on the topic of the text of the learning
episode. Reading comprehension and spatial ability tests were also
administered. This collective part took about 1 hour.
During the second session, which took place in a quiet room in
the school, the eye tracker was calibrated for each participant using the 9-point procedure. After calibration and before the learning
episode, each participant in both conditions read an illustrated text
on the topic of the greenhouse effect while her/his visual behavior was recorded to collect a sample of baseline eye movements.
Participants in both conditions then performed the Corsi Span Test, which also served as a distracter after the baseline reading task and
before the main reading task in the No-EMME condition, and before
showing the gaze replay in the EMME condition. Next, only in the
latter condition did participants observe a replay of the eye movements of a model who read an illustrated text on the topic of the
water cycle. In both conditions, the eye tracker was then calibrated again. After recalibration, each participant was instructed to read the material on the computer screen carefully and silently, as she or he would be asked to answer some questions. Participants
read the instructional material at their own pace while eye movements were recorded again. After reading, they rated the text easiness
and interestingness. Next, they performed the Reading Span Test, which also served as a distracter after the reading task and before the learning performance. Then, they carried out all posttests, that is,
the verbal recall, graphical recall, and answers to the questions for
learning and transfer. The posttest part of this session took about 30–40 minutes.
Finally, at the end of the second session all participants in the
EMME condition were individually interviewed to ensure that they
had perceived the model's integrative visual behavior and were aware that the gaze replay was presented to show how to read an illustrated text effectively. This session took 70–80 minutes.
3. Results
3.1. Preliminary analyses
3.1.1. Individual characteristics
The equivalence of the readers in the two conditions on all examined control variables was tested first. We performed a MANOVA that included condition (EMME vs. No-EMME) as the independent variable and students' scores for the various measures of individual characteristics as dependent variables. The main effect of condition did not emerge, F < 1. The two groups did not differ on any of the individual differences examined, that is, prior knowledge, reading comprehension, verbal working memory, visuo-spatial memory, spatial ability, grade in science, perception of text easiness and interestingness, and total learning time. In univariate tests, F values were < 1 except for visuo-spatial memory, F(1, 40) = 1.66, p = .204, spatial ability, F(1, 40) = 2.35, p = .133, and learning time, F(1, 40) = 1.98, p = .167.
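A sketch of this kind of equivalence check is given below, assuming the individual-difference measures are stored in a data frame with one row per student. Column names, the file, and the use of statsmodels are assumptions for illustration; this is not the authors' analysis code.

import pandas as pd
from statsmodels.multivariate.manova import MANOVA

df = pd.read_csv("pretest_measures.csv")  # hypothetical file with a 'condition' column

# All control variables entered jointly as dependent variables, condition as the factor.
manova = MANOVA.from_formula(
    "prior_knowledge + reading_comprehension + verbal_wm + visuospatial_wm"
    " + spatial_ability + science_grade + easiness + interest + learning_time ~ condition",
    data=df,
)
print(manova.mv_test())  # reports Wilks' lambda and related multivariate tests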


3.1.2. Baseline eye movements


Participants' baseline eye movements were recorded while they read an illustrated text on a different topic, the greenhouse effect. To ensure (as reasonably as possible) that at the onset of the learning episode, students in the EMME condition were not spontaneously more capable of integrative processing of text and graphics than learners in the No-EMME condition, we carried out on the various processing indices of this text the same analyses that are reported below (see data for research question 1) for the learning text about the food chain. No significant effect of condition emerged for the first-pass and second-pass fixation times on text and picture, Wilks' Lambda = .85, F(4, 37) = 1.63, p = .187. Similarly, condition did not differentiate the total look-from corresponding and non-corresponding text segments to corresponding and non-corresponding picture segments fixation times, and vice versa, F < 1, or the average duration of the refixations on the picture after transitions (gaze shifts) from the text and the average duration of the refixations on the text after transitions from the picture, F < 1.
The data on the baseline processing of an illustrated text indicated that participants in the EMME and No-EMME conditions did not differ in any of the examined eye-movement measures.
The easiness and interestingness of the text on the greenhouse effect were perceived at substantially the same level as those of the text used in the learning episode by learners in both the EMME and No-EMME conditions, as indicated by a MANOVA for repeated measures that revealed no differences for type of text or condition, all Fs < 1. In addition, participants' prior knowledge of the topic of the greenhouse effect, which might have influenced text processing, was also equivalent in the two conditions, F < 1.
Overall, these data indicate that at the onset of the study, participants in the EMME and No-EMME conditions were similar in the ways they read an illustrated science text, in their prior knowledge of the text topic, and in their perception of the easiness and interestingness of the text read.

Research question 1 asked whether students in the EMME condition would show better integrative processing than participants
who did not have the opportunity to observe the models eye
movements. To answer this question, we examined the various
eye-xation indices reported in Table 1. Our hypothesis was that
differences between the two conditions would emerge during the
more strategic and purposeful second-pass reading, as revealed by
integrative processing of verbal and graphical information, which
was emphasized in the models gaze replay.

3.2.1. First-pass and second-pass fixation times on text and picture
We carried out a MANOVA with condition as the independent variable and the indices of total first-pass and second-pass (look-back) fixation times for both text and picture as dependent variables. The main effect of condition did not emerge from the analysis, Wilks' Lambda = .80, F(4, 37) = 2.28, p = .078. Modeled and un-modeled learners did not differ in the time spent attending and re-attending each of the two parts of the learning material.

3.2.2. Look-from text to picture and from picture to text fixation times
When examining the integrative processing, the pattern of results was different. We considered the duration of look-from corresponding and non-corresponding text segments to corresponding and non-corresponding picture segments (and vice versa) fixation times. A MANOVA revealed a large effect of condition, Wilks' Lambda = .62, F(8, 33) = 2.43, p = .035, ηp2 = .37.
Table 1
Means and confidence intervals of eye-movement indices (frequency and durations in milliseconds) before log-transformation and means and standard deviations after log-transformation as a function of condition.

                                       No-EMME (n = 20)                                                              EMME (n = 22)
Index                                  M (95% CI)                            Avg. dur.a   Log M   Log SD             M (95% CI)                            Avg. dur.a   Log M   Log SD

First-pass fixation time on
  Text                                 12,396.45 (8536.39, 16,256.51)                      9.01    1.08              9242.32 (5561.89, 12,922.74)                        8.88     .80
  Picture                              5152.85 (3816.28, 6489.42)                          8.35     .74              4578.73 (3304.36, 5853.10)                          8.16     .79
Second-pass fixation time on
  Text                                 242,373.70 (168,426.70, 316,320.70)                12.14     .73              229,130.18 (158,624.49, 299,635.88)                12.20     .53
  Picture                              39,092.20 (10,699.76, 67,484.64)                    9.69    1.26              52,028.14 (24,957.01, 79,099.26)                   10.59     .76
Total look-from fixation time
  From C_TXT to C_PICT                 2272.10 (-2849.07, 7393.27)             517.41      6.25    2.46              11,211.56 (6328.70, 16,094.39)          1171.99     7.93    2.82
  From NC_TXT to C_PICT                1302.65 (-81.55, 2686.85)               371.34      3.69    3.89              3734.05 (2414.26, 5053.83)               948.87     7.50    1.41
  From C_TXT to NC_PICT                3035.30 (1329.42, 4741.18)             1742.14      5.45    3.84              2273.77 (647.28, 3900.27)               1230.82     6.64    2.34
  From NC_TXT to NC_PICT               1120.15 (306.16, 1934.14)               271.64      5.50    2.60              2230.45 (1454.35, 3006.56)               491.36     7.12    1.80
  From C_PICT to C_TXT                 11,356.50 (1700.56, 21,012.44)         3596.59      7.75    3.00              26,737.14 (17,530.56, 35,943.71)        2914.54     9.29    2.36
  From NC_PICT to C_TXT                28,735.80 (18,908.20, 38,563.40)       9286.52      8.32    3.85              17,426.64 (8056.39, 26,796.89)          5520.02     9.29    1.13
  From C_PICT to NC_TXT                8832.25 (71.59, 17,592.91)             2543.12      6.61    4.09              19,611.00 (11,258.04, 27,963.96)        3582.31     8.60    3.00
  From NC_PICT to NC_TXT               12,585.60 (5407.74, 19,763.46)         9247.29      7.03    3.53              14,197.00 (7353.18, 21,040.82)          8729.35     8.78    2.24

Note: C_TXT = corresponding text segments; NC_TXT = non-corresponding text segments; C_PICT = corresponding picture segments; NC_PICT = non-corresponding picture segments. Log M and Log SD = mean and standard deviation of the log-transformed values.
a The average duration of look-from fixation time is the total fixation time divided by the number of transitions (gaze shifts).


Fig. 2. Mean scores for the offline tasks (verbal recall, factual knowledge, and transfer) as a function of condition. Standard errors are represented by the error bars.

Univariate tests showed significant differences for the look-from corresponding text segments to corresponding picture segments fixation time, F(1, 40) = 4.14, MSE = 7.07, p = .048, η2 = .09, for the look-from non-corresponding text segments to corresponding picture segments fixation time, F(1, 40) = 18.42, MSE = 8.24, p < .001, η2 = .31, and for the look-from non-corresponding text segments to non-corresponding picture segments fixation time, F(1, 40) = 5.62, MSE = 4.91, p = .023, η2 = .12. All these look-from fixation times were longer for learners in the EMME condition. These outcomes indicate that learners who observed the model's eye movements spent more time refixating the relevant parts of the visualization while rereading the corresponding and non-corresponding text segments than learners who did not see the model's gaze replay. In addition, they also attended more to the non-corresponding picture segments while refixating the non-corresponding text segments, perhaps checking whether there were further depictions of verbal information.
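To make the look-from indices concrete, the following simplified Python sketch shows how one such index could be computed from a sequence of fixations already tagged with areas of interest (AOIs). It is not the authors' algorithm, and the AOI labels and fixation list are hypothetical.

# Simplified sketch of a look-from index: time spent refixating picture segments after
# gaze shifts from text segments, bucketed by (text AOI, picture AOI); data are invented.
fixations = [          # (AOI, fixation duration in ms)
    ("C_TXT", 250), ("C_PICT", 300), ("C_PICT", 220),   # shift from corresponding text to picture
    ("NC_TXT", 180), ("NC_PICT", 260),                  # shift from non-corresponding text
    ("C_TXT", 210),
]
TEXT = {"C_TXT", "NC_TXT"}
PICT = {"C_PICT", "NC_PICT"}

totals, transitions = {}, {}
source = None                                  # text AOI from which the picture was entered
for prev, curr in zip(fixations, fixations[1:]):
    if prev[0] in TEXT and curr[0] in PICT:    # a text-to-picture gaze shift
        source = prev[0]
        transitions[(source, curr[0])] = transitions.get((source, curr[0]), 0) + 1
    if curr[0] in TEXT:                        # gaze returned to the text
        source = None
    if source and curr[0] in PICT:             # accumulate refixation time on the picture
        key = (source, curr[0])
        totals[key] = totals.get(key, 0) + curr[1]

print(totals)       # e.g. {('C_TXT', 'C_PICT'): 520, ('NC_TXT', 'NC_PICT'): 260}
print(transitions)  # number of gaze shifts, used for the average duration in Table 1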
3.2.3. Average look-from text to picture and from picture to text fixation times
A MANOVA with the average duration (total duration divided by the number of transitions) of look-from text to picture and look-from picture to text fixation times for corresponding and non-corresponding verbal and graphical segments did not reveal an effect of condition, Wilks' Lambda = .77, F(8, 33) = 1.21, p = .322. This outcome indicates that in both conditions the length of look-from fixation time per transition was substantially the same, whether learners made many or few gaze shifts from text segments to picture segments and vice versa.
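Purely as an illustration of footnote a of Table 1, the sketch below derives the average look-from duration for one participant and one index, together with a log-transformed total. The use of ln(x + 1) is an assumption for the sake of the example, not a detail reported here.

# Illustrative only: average duration = total look-from time / number of transitions.
import numpy as np

total_look_from_ms = 520.0   # e.g. the C_TXT-to-C_PICT total from the sketch above
n_transitions = 1
average_duration = total_look_from_ms / n_transitions
log_total = np.log(total_look_from_ms + 1)   # assumed ln(x + 1), to cope with zero totals
print(average_duration, round(log_total, 2))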
3.3. Research question 2: effects on postreading outcomes
Research question 2 asked whether students who had the opportunity to see the model's eye movements would perform better at the posttests. Our hypothesis was that differences between the two conditions would emerge in the transfer task. A MANOVA with the scores from the postreading tasks of verbal recall, factual knowledge, and transfer was performed. The analysis revealed a large effect of condition, Wilks' Lambda = .71, F(3, 38) = 5.09, p = .005, ηp2 = .28. Univariate tests showed that students in the EMME condition outperformed students in the No-EMME condition for verbal recall (No-EMME: M = 8.00, SD = 4.42; EMME: M = 11.14, SD = 5.26), F(1, 40) = 4.31, p = .044, η2 = .09, and transfer of knowledge (No-EMME: M = 3.70, SD = 2.25; EMME: M = 5.45, SD = 3.05), F(1, 40) = 4.42, p = .042, η2 = .10. Only for factual knowledge did they not differ (No-EMME: M = 5.77, SD = 4.18; EMME: M = 6.10, SD = 3.73) (Figure 2).
A non-parametric Mann-Whitney U test on the scores for the graphical recall also revealed the superiority of students in the EMME condition, Z = 2.69, p = .007 (No-EMME: M = .70, SD = .57; EMME: M = 1.32, SD = .78).
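As a sketch only (the score lists are invented, not the study data), the Mann-Whitney comparison of graphical recall scores could be run with scipy as follows.

# Sketch of the non-parametric group comparison on the 0-2 graphical recall rubric.
from scipy.stats import mannwhitneyu

no_emme_graphical = [0, 1, 0, 1, 1, 0, 2, 1, 0, 1]   # hypothetical scores
emme_graphical = [1, 2, 1, 2, 1, 1, 2, 0, 2, 1]      # hypothetical scores

u_stat, p_value = mannwhitneyu(no_emme_graphical, emme_graphical, alternative="two-sided")
print(u_stat, p_value)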
3.4. Research question 3: relating online processing with offline outcomes
Research question 3 asked whether the index of integrative processing predicted the postreading outcomes (recall, factual knowledge, and transfer) after controlling for individual characteristics. We hypothesized that the main index of look-from corresponding text segments to corresponding picture segments would predict postreading outcomes, especially the deeper level of learning from text, as revealed in the transfer task, which requires integrative processing more than the other measures do. We also hypothesized that the magnitude of the relationship between the duration of the look-from corresponding text segments to corresponding picture segments and transfer performance would be significantly greater in the EMME condition.
We first carried out correlational analyses between the indices of integrative processing that differentiated the two conditions and the posttest measures. These correlations are reported in Table 2, which also shows the correlations with the individual characteristics. Regarding the latter, prior knowledge and achievement in science correlated with all three postreading measures, while reading comprehension correlated with factual knowledge and transfer, but not with recall. Regarding the eye-fixation indices of integrative processing, only the relevant time spent refixating the graphical information while rereading the corresponding text information correlated with the verbal recall and transfer, but not with factual knowledge. This index also correlated highly (r = .75) with the reverse, that is, the time spent rereading the text while reinspecting the picture, although the latter (and the other indices of look-from picture to text) was not associated with any postreading outcome.
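A zero-order correlation of the kind reported in Table 2 could be computed as in the sketch below (not the authors' code; the data file and column names are the hypothetical ones used above).

# Sketch of the correlational analyses behind Table 2.
import pandas as pd
from scipy.stats import pearsonr

df = pd.read_csv("students.csv")  # hypothetical file

r, p = pearsonr(df["look_from_ct_cp"], df["transfer"])   # one pairwise correlation and its p value
print(f"r = {r:.2f}, p = {p:.3f}")

# Correlation matrix of a subset of the variables:
cols = ["prior_knowledge", "reading_comprehension", "science_grade",
        "verbal_recall", "factual_knowledge", "transfer", "look_from_ct_cp"]
print(df[cols].corr().round(2))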

Table 2
Zero-order correlations for individual characteristics, condition, integrative processing, and postreading outcomes.

Variable                                          1      2      3      4      5      6      7      8      9      10     11     12     13     14
1. Prior knowledge
2. Reading comprehension                          .47**
3. Achievement in science                         .44**  .39*
4. Verbal recall                                  .51**  .23    .36*
5. Factual knowledge                              .81**  .72**  .57**  .52**
6. Transfer                                       .60**  .52**  .60**  .60**  .70**
7. Condition                                      .06    .07    .07    .31*   .04    .32*
8. Look-from C_TXT to C_PICT fixation time        .28    .05    .04    .43**  .22    .38*   .30
9. Look-from NC_TXT to C_PICT fixation time       .16    .01    .14    .21    .05    .23    .56**  .40**
10. Look-from C_TXT to NC_PICT fixation time      .04    .20    .24    .15    .27    .17    .19    .13    .17
11. Look-from NC_TXT to NC_PICT fixation time     .12    .11    .11    .24    .05    .16    .35*   .39*   .33*   .52**
12. Look-from C_PICT to C_TXT fixation time       .16    .21    .02    .19    .04    .16    .28    .75**  .38*   .11    .27
13. Look-from NC_PICT to C_TXT fixation time      .05    .13    .31*   .10    .23    .12    .17    .25    .03    .62**  .61**  .05
14. Look-from C_PICT to NC_TXT fixation time      .09    .15    .16    .12    .04    .09    .27    .35*   .53**  .10    .42**  .24    .28
15. Look-from NC_PICT to NC_TXT fixation time     .07    .04    .24    .12    .13    .01    .29    .27    .26    .40**  .72**  .05    .51**  .22

* p < .05, ** p < .01.


Therefore, for a theoretical reason, namely that picture processing is guided by text processing (Bartholomé & Bromme, 2009; Schnotz, 2002), and also for this empirical reason, in the subsequent regression analyses we considered only the duration of look-from corresponding text segments to corresponding picture segments fixation times as an index of integrative processing.
In the regression analyses, we also considered the individual characteristics that correlated with the postreading outcomes. However, for methodological reasons, given the limited number of participants and the consequent need to use few predictors in the regression equations, we computed a composite score for the three individual characteristics. In this regard, it is worth noting that they correlated with each other. The composite score is the standardized average score for prior knowledge, reading comprehension, and achievement in science.
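A sketch of how such a composite could be built (with the hypothetical column names used above, not the authors' code):

# Standardized average composite of the three individual-difference measures.
import pandas as pd

df = pd.read_csv("students.csv")  # hypothetical file
measures = ["prior_knowledge", "reading_comprehension", "science_grade"]

z_scores = (df[measures] - df[measures].mean()) / df[measures].std(ddof=1)  # z-score each measure
df["individual_differences"] = z_scores.mean(axis=1)                        # average the z-scores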
In the first step of the hierarchical regression analyses, the composite score for individual characteristics was entered. In the second step, condition and the eye-movement index of integrative processing were entered. In the third step, the interaction term was entered. Regression analyses are presented separately for each task.
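The three-step procedure can be sketched with statsmodels as below (again with hypothetical column names and not the authors' code); anova_lm on the nested models gives F-change tests like those reported in the next sections.

# Sketch of the hierarchical regression for one outcome (here, transfer).
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.read_csv("students.csv")  # hypothetical file containing the composite computed above

step1 = smf.ols("transfer ~ individual_differences", data=df).fit()
step2 = smf.ols("transfer ~ individual_differences + condition + look_from_ct_cp", data=df).fit()
step3 = smf.ols("transfer ~ individual_differences + condition * look_from_ct_cp", data=df).fit()

print(step1.rsquared, step2.rsquared, step3.rsquared)   # cumulative R2 at each step
print(anova_lm(step1, step2, step3))                    # F tests of the R2 change between steps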

3.4.1. Verbal recall
The regression model was significant after entering the composite score for individual characteristics, R2 = .27, Fchange(1, 40) = 14.95, p < .001. This score (β = .52) was a significant predictor. The addition of condition and the eye-movement index of integrative processing in the second step resulted in a statistically significant increase in the explained variance, R2 = .46, Fchange(2, 38) = 6.47, p = .004. In this step, individual differences (β = .49) and condition (β = .27) were significant predictors. The addition of the interaction term in the third step did not result in a significant increase in the explained variance, R2 = .46, Fchange < 1. These findings indicate that individual differences and condition predict content retention. However, the relationship between the look-from corresponding text segments to corresponding picture segments fixation time and recall was not greater for the EMME students. Results of the regression analyses of the verbal recall are reported in Table 3(a).

3.4.2. Factual knowledge
The regression model was significant after entering the individual characteristics, R2 = .80, F(1, 40) = 158.93, p < .001. The composite score for them (β = .89) was a significant predictor. The addition of condition and the index of integrative processing in the second step did not result in a statistically significant increase in the explained variance, R2 = .82, Fchange(2, 38) = 1.90, p = .163, and individual differences remained the predictor (β = .87). The addition of the interaction term in the third step did not contribute to the explained variance either, R2 = .82, Fchange(1, 37) = 1.58, p = .216. This means that the acquisition of factual knowledge was predicted only by individual differences (β = .85). Results of the regression analyses for factual knowledge are reported in Table 3(b).

3.4.3. Transfer of knowledge
For the transfer of knowledge, the regression model was significant after entering the composite score for individual characteristics, R2 = .51, F(1, 40) = 41.33, p < .001. This score (β = .71) was a significant predictor. The addition of condition and the look-from corresponding text segments to corresponding picture segments index resulted in a statistically significant increase in the explained variance, R2 = .67, Fchange(2, 38) = 8.96, p = .001. Individual differences (β = .71) and condition (β = .33) were significant predictors in the second step. The interaction term entered in the third step explained a further significant portion of variance, R2 = .71, Fchange(1, 37) = 5.58, p = .024.

Table 3
Results of hierarchical multiple regression analyses for variables predicting verbal recall, factual knowledge, and transfer (N = 42).

(a) Verbal recall
Predictor                                        B        SE       β         ΔR2
Step 1                                                                       .27***
  Individual differences                         3.12     .80      .52***
Step 2                                                                       .19**
  Individual differences                         2.95     .73      .49***
  Condition                                      2.76     1.27     .27*
  Look-from C_TXT to C_PICT                      1.32     .65      .26
Step 3                                                                       .00
  Individual differences                         2.92     .75      .49***
  Condition                                      2.80     1.29     .28*
  Look-from C_TXT to C_PICT                      .97      1.09     .19
  Condition × Look-from C_TXT to C_PICT          .53      1.35     .08

(b) Factual knowledge
Predictor                                        B        SE       β         ΔR2
Step 1                                                                       .80***
  Individual differences                         4.44     .35      .89***
Step 2                                                                       .002
  Individual differences                         4.35     .34      .87***
  Condition                                      .46      .56      .06
  Look-from C_TXT to C_PICT                      .55      .28      .14
Step 3                                                                       .00
  Individual differences                         4.24     .35      .85***
  Condition                                      .38      .56      .05
  Look-from C_TXT to C_PICT                      .06      .48      .01
  Condition × Look-from C_TXT to C_PICT          .77      .61      .15

(c) Transfer
Predictor                                        B        SE       β         ΔR2
Step 1                                                                       .51***
  Individual differences                         2.36     .37      .71***
Step 2                                                                       .16**
  Individual differences                         .27      .12      .71***
  Condition                                      1.84     .55      .33**
  Look-from C_TXT to C_PICT                      .39      .28      .14
Step 3                                                                       .04*
  Individual differences                         2.27     .30      .69***
  Condition                                      1.95     .52      .35**
  Look-from C_TXT to C_PICT                      .43      .44      .15
  Condition × Look-from C_TXT to C_PICT          1.30     .55      .36*

Note: Individual differences = standardized average composite score for prior knowledge, reading comprehension, and achievement in science. Condition coded as 0 = No-EMME and 1 = EMME. C_TXT = corresponding text segments and C_PICT = corresponding picture segments.
* p < .05.
** p < .01.
*** p < .001.

This means that for transfer the magnitude of the relationship between condition and integrative processing (β = .36) was greater in the EMME condition. Individual differences (β = .69) and condition (β = .35) were also significant predictors in the third step. Results of the regression analyses for the transfer of knowledge are reported in Table 3(c).
For a complete examination, to follow up on the significant interaction observed in the regression analysis, we performed a simple slope analysis (Aiken & West, 1991). It confirmed a significant slope for the EMME condition, t(19) = 2.14, p = .046, b = .91, and a non-significant slope for the No-EMME condition, t(17) < .1, b = .03. These data indicated that the effectiveness of the integrative processing of look-from corresponding text segments to corresponding picture segments was moderated by condition. Better transfer performance was significantly associated with the higher integrative processing of the EMME students, who had watched the gaze replay. For the No-EMME students, no differences emerged between the low and high look-from corresponding text segments to corresponding picture segments fixation times. The significant interaction of condition and integrative processing observed in the regression analysis using transfer as the dependent variable is represented graphically in Figure 3.

Fig. 3. Plot of the significant interaction effect of condition and integrative processing on transfer performance. Look-from fixation time is from corresponding text segments to corresponding picture segments; low and high values are plotted in z-scores.
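One simple way to probe such an interaction is to re-estimate the slope of the integrative-processing index within each condition, which is consistent with the degrees of freedom reported above. The sketch below (hypothetical column names, not the authors' code) only approximates the Aiken and West (1991) procedure because it does not use the pooled error term.

# Sketch of simple slopes with a binary moderator: refit within each condition.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("students.csv")  # hypothetical file: transfer, condition (0/1), predictors

for label, group in df.groupby("condition"):      # 0 = No-EMME, 1 = EMME
    fit = smf.ols("transfer ~ individual_differences + look_from_ct_cp", data=group).fit()
    b = fit.params["look_from_ct_cp"]
    p = fit.pvalues["look_from_ct_cp"]
    print(f"condition = {label}: b = {b:.2f}, p = {p:.3f}")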
4. Discussion
The aim of this study was to examine the effects of eye movement modeling examples (EMME) on the cognitive processing of text and graphics while reading an illustrated science text. We sought to extend current research by taking into account that it is possible to guide attention and related eye movements through perceptual emphasis on crucial components of a representation (Grant & Spivey, 2003). We focused on modeling a learning task that entails much more than perceptual processes, that is, learning from words and pictures. According to theoretical models of text and picture comprehension (Mayer, 2009; Schnotz, 2002), as well as outcome-focused (Bartholomé & Bromme, 2009; Schlag & Ploetzner, 2011) and process-focused studies (Mason, Pluchino et al., 2013; Mason, Tornatora et al., 2013), the integration of verbal and pictorial representations is essential to learning from an illustrated text. In order to foster students' integrative processing, a novel approach was adopted using eye-tracking technology. We used a replay of the eye positions of a model moving through an illustrated science text and showed it, without accompanying verbal instructions, to seventh graders before asking them to read similar material on their own. The video with the gaze replay emphasized the integrative processing of verbal and pictorial information through multiple shifts from one representation to the other after reading of the text, which guided the picture processing.
Three research questions guided the study. Research question 1 asked whether in the EMME condition students would show greater integrative processing of text and graphics while reading an illustrated science text than students who did not have the opportunity to observe the model's eye movements. As hypothesized, the benefits of EMME emerged for integrative processing during the second-pass reading. Specifically, the findings showed that students in the EMME condition spent longer reinspecting the depicted verbal information while rereading it, as expected. This index can be considered the main indicator of integrative eye movements as modeled in the gaze replay. After reading the text, the model started making all the correspondences between verbal and graphical information. In other words, this ocular behavior primed the integration of text and picture in the video model.
Participants in the EMME condition also spent more time refixating the depicted information while rereading the text information that did not correspond to it. This ocular behavior was also modeled in the video in a later phase, as it is also important for learning. To exemplify, when a student reads in the text about first order consumers, s/he may need to look at the depiction of the second order consumers to better understand the difference between the two orders, or to connect different but relevant segments of the two (verbal and graphical) representations. In any case, the index of look-from non-corresponding text segments to corresponding picture segments was not associated with any postreading outcome. This means that students in the EMME condition had a potentially more useful integrative ocular behavior when reading the verbal information that was not visualized, but this behavior was not related to the reading outcomes.
Moreover, students in the EMME condition explored for longer the less relevant parts of the picture while rereading the text segments that were not graphically represented. This is an index of ocular behavior that does not, in principle, facilitate an integration of the two types of information. A possible interpretation of this outcome refers to the EMME students' attempt to establish more correspondences between text and picture beyond those that were easily identifiable in the learning material. It is worth noting that this potentially unproductive eye-movement index was not associated with any postreading outcome.
All significant differences in integrative processing between the two conditions regard the text-to-picture direction. In this respect, it should be pointed out that, in accordance with previous research (Bartholomé & Bromme, 2009), text processing guided the model's picture processing during reading, as shown in the gaze replay. It is therefore legitimate that the video-modeled students used the text more as the anchoring part of the learning material compared with the un-modeled students.
Overall, the interesting issue is that the ocular behavior of participants in the EMME and No-EMME conditions differed only for the second-pass integrative processing of text and picture. In fact, the first- and second-pass processing within the text or within the picture did not vary from one condition to the other. As hypothesized, it can be said that modeling through gaze replay did not support a longer text rereading or picture reinspecting per se, but rather a stronger integrative processing of text and picture. This outcome confirms the findings of another recent study on eye movement modeling examples to support integration of verbal and graphical information, which indicates that EMME is particularly beneficial for students with lower reading comprehension skills when considering both the acquisition and transfer of knowledge (Mason, Pluchino, & Tornatora, in press).
It is also noteworthy that the total, not the average, look-from corresponding text segments to corresponding picture segments fixation time differentiated the two reading conditions. This outcome indicates that the EMME condition supported a longer overall refixation time on the picture while reading the text, which is related to a higher number of transitions (gaze shifts) from corresponding text segments to corresponding picture segments. We cannot say that in the EMME condition, on average, learners made longer reinspections of the relevant parts of the picture after gaze shifts. Fewer transitions with a longer duration of fixation would be more indicative of integrative processing than many transitions with short fixation durations. This is an important issue that needs to be investigated in future studies.
Research question 2 asked whether the benefits of eye movement modeling examples would also manifest in the offline task performances, especially at the deeper level of conceptual learning as revealed in a transfer task. Students in the EMME condition achieved higher scores than students in the No-EMME condition not only in the transfer task, as hypothesized, but also in the verbal and graphical recalls. These findings and those regarding the first research question lead us to maintain that the EMME condition supported more integrative processing of the learning material as well as better content retention and transfer of knowledge. The fact that the participants did not differ for the acquisition of factual knowledge is not clearly interpretable. An interpretation could be that the answers to the questions used to measure it did not require particular integration of verbal and graphical elements as shown in the EMME condition. However, retention, that is, the lower level of postreading outcomes, was different in the two situations, both verbal and graphical. Another related interpretation is that the questions may have captured the acquisition of factual knowledge only partially. Undoubtedly, the issue of the measurement of various levels of postreading outcomes deserves further investigation for a more solid and articulated picture of the benefits associated with the use of EMME in the educational context.
Research question 3 asked whether there was a link between illustrated text processing and the postreading measures. Data from the regression analyses only partially supported our hypothesis about the predictive role of integrative processing. The index of integrative processing of look-from corresponding text segments to corresponding picture segments fixation time did not predict the postreading outcomes, while the composite score for the individual differences of prior knowledge, reading comprehension, and achievement in science was a predictor of all three posttest measures. However, as expected, the magnitude of the relationship between condition and integrative processing for the transfer performance was stronger in the EMME condition, after controlling for individual characteristics. A slope analysis confirmed the significant interaction effect for the EMME condition only.
Overall, the findings regarding the crucial link between online text processing and offline measures of illustrated text reading highlight the variability of natural school environments. This variability is to some extent reduced in the EMME condition when considering the transfer task. This finding is substantially in line with previous eye-tracking studies with grade-level students, which revealed that longer integrative processing is related to more successful deep learning when reading an illustrated science text (Mason, Pluchino et al., 2013; Mason, Tornatora et al., 2013). The finding also substantially confirms outcome-focused studies on the effectiveness of teaching reading strategies for an illustrated text to lower secondary school students (Schlag & Ploetzner, 2011).
In sum, the main contribution of the study is to highlight the potential of eye movement modeling in supporting, only perceptually, the execution of a complex task that may be difficult for students, with no supplement of verbal instructions as was the case in previous investigations (Jarodzka et al., 2012, 2013; Seppänen & Gegenfurtner, 2012; van Gog et al., 2009). A learning task that requires the selection, organization, and integration of information, like reading an illustrated science text to learn from it, can be indirectly and quickly taught via a model's ocular behavior while executing the task. This adds to the literature by showing that a perceptual modeling intervention does not necessarily need to be extensive and supplemented by oral or written instructions to be effective in guiding visual attention for carrying out a task that goes far beyond the perception of relevant elements in the examined material.
4.1. Limitations
Some limitations should be taken into account when interpreting the results of the current study. The first limitation is the design with only two conditions. A stronger test of the effectiveness of EMME would imply another EMME condition. Of course, in the school context, a condition in which EMME is used to show poor reading behavior is not acceptable. However, it would be possible, for example, to use EMME in a condition that shows two contrasting reading strategies, one good and one poor. Alternatively, a second EMME condition could show a different, potentially successful, reading strategy (i.e., a brief exploration of the picture before reading the text, see Eitel et al., 2013). Furthermore, in a third condition, the processing strategy could be taught without the use of eye tracking. In this way, a more optimal design would make it possible to separate the strategy instruction from the particular method of delivery. The fact that in the No-EMME condition students were not involved in any activity, while the students in the EMME condition watched the model's gaze replay, although very briefly (about three minutes), is also a design concern that should be rectified in future studies.
The second limitation is the use of only one text and picture to test the EMME effects, mainly due to technical reasons. More solid results will be obtained by using multiple verbal and graphical representations. The third limitation regards the lack of a delayed test, which should be included in a stronger design aimed at examining long-term effects. A fourth limitation is that the study focuses only on the cognitive effects of eye movement modeling. Future investigations may examine whether EMME creates or refines metacognitive effects. This could be, for example, awareness of the strategies to be used when carrying out a given task for a given purpose, as participants' responses to the questions about their perception of the model's integrative processing lead us to speculate.
5. Conclusions
Despite these limitations, the current research has scientific significance as it suggests the potential of a novel approach to example-based instruction, namely eye movement modeling examples. A very short and simple manipulation of a learning condition through a model's gaze replay can contribute to deeper learning, which is related, at least to some extent, to the patterns of integrative processing behavior during reading. In this regard, eye-tracking methodology as a research tool offers a unique opportunity for examining the process of coherence formation when learning from multiple representations (Hyönä, 2010; Mayer, 2010).
The study also has educational significance as it indicates that eye-tracking methodology may be not only a research tool but also an instruction tool. It has the potential to be used to model students' visual behavior when performing school tasks, making them more aware of the essential processes that underlie a successful learning performance. Importantly, eye trackers have become smaller and less expensive over time, and will continue to do so in the future. However, it is worth noting that although the production of a video with the gaze replay requires an eye-tracking apparatus, the video is then usable without the apparatus to perceptually model the execution of an essential task for successful learning. This means that, within a framework of productive collaboration between researchers and teachers, modeling through eye movements can be easily implemented in schools to sustain effective processing in various learning tasks.
Acknowledgments
The study is part of a research project on learning difficulties in the science domain (STPD08HANE_001) funded by a grant to the first author from the University of Padova, Italy, under the funding program for Strategic Projects. We are very grateful to all the students involved in the study, their parents and teachers, and the school principals.
Appendix
The text read by all participants (translated from Italian)
Regarding their feeding, all living organisms are linked to one another like the rings in a chain. This series of links is known as the food chain. In any environment, including the marine, plants and animals form various food chains. Here, for example, is how a marine food chain is formed. It starts with the producers: water vegetation, such as different kinds of algae (for example the phytoplankton). These organisms take advantage of solar energy and, through photosynthesis, transform basic substances like water, carbon dioxide, and mineral substances into sugars and starch; thus they produce their own food and release oxygen into the environment. Other organisms, for example shrimps and larvae (namely zooplankton), which are first order consumers, feed on the water vegetation, that is, the producers in the food chain. There are also second order consumers: starfish, sea urchins, other fish, turtles, dolphins, and whales. Then there are third order consumers: seabirds, sharks, and killer whales. Bacteria and other organisms also belong to the food chain and are called decomposers. They live on the sea floor and transform food remains (dead plants and animals) into mineral substances, which are useful to water vegetation in order to photosynthesize. Breaking the equilibrium in one ring of the food chain affects the entire chain.


Examples of questions to measure prior (pretest) and factual (posttest) knowledge (translated from Italian) and examples of scoring
Question 2a: What is a producer in a food chain?
A small animal
A plant
A big animal
A human being
Question 2b: Explain your choice as clearly and completely as
possible.
1 point: A plant produces substances useful to first, second, and third order consumers, and the entire chain.
2 points: Producer means being able to produce its own food.
A plant is able to make its own food by means of photosynthesis.
Question 7: How does a food chain end? Why?
0 points: It ends with the biggest animal.
1 point: It ends with the decomposers that transform the
consumers.
2 points: The chain ends with the decomposers that transform
the remains of dead animals and then the transformed substances
are used by plants so that the chain starts again.

Example of a graphical recall scored 0
Example of a graphical recall scored 1
Example of a graphical recall scored 2

Examples of questions to measure transfer of knowledge and examples of scoring
Question 4a: Which of these marine food chains is correct?
Shrimp → whale → tuna → shark
Alga → shrimp → dolphin → tuna
Larva → shrimp → tuna → shark
Alga → shrimp → whale → shark
Question 4b: Explain your choice as clearly and completely as possible.
1 point: This is correct because algae are a producer and are eaten by shrimps, which are first order consumers which, in turn, are eaten by whales, a second order consumer that is eaten by sharks, a third order consumer.
2 points: It is correct because algae photosynthesize and produce their own food. A shrimp eats the algae, a whale eats the shrimp and, in the end, a shark eats the whale but is not eaten by anyone else. Then when it is dead, it is transformed by decomposers.
Question 7: What would happen to a food chain if there were 50% less sunlight per day?
0 points: A disaster, because if there were animals that need heat, they would die and then the chain would be destroyed.
1 point: The chain would break because the sun's energy is necessary to the producers to make their own food.
2 points: There would be fewer plants because sunlight is fundamental to plant photosynthesis, and if plants decrease, there would be very serious problems in all the links of the food chain.

References
Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.
Alexander, P. A. (2012). Reading into the future: Competence for the 21st century. Educational Psychologist, 47(4), 259-280. doi:10.1080/00461520.2012.72251.
Atkinson, R. K., Derry, S. J., Renkl, A., & Wortham, D. (2000). Learning from examples: Instructional principles from the worked examples research. Review of Educational Research, 70(2), 181-214. doi:10.3102/00346543070002181.
Bandura, A. (1977). Social learning theory. Englewood Cliffs, NJ: Prentice-Hall.
Bartholomé, T., & Bromme, R. (2009). Coherence formation when learning from text and pictures: What kind of support for whom? Journal of Educational Psychology, 101(2), 282-293. doi:10.1037/a0014312.
Boucheix, J. M., & Lowe, R. K. (2010). An eye tracking comparison of external pointing cues and internal continuous cues in learning with complex animations. Learning and Instruction, 20(2), 123-135. doi:10.1016/j.learninstruc.2009.02.015.
Braaksma, M. A. H., Rijlaarsdam, G., & van den Bergh, H. (2002). Observational learning and the effects of model-observer similarity. Journal of Educational Psychology, 94(2), 405-415. doi:10.1037/0022-0663.94.2.405.
Butcher, K. R. (2006). Learning from text with diagrams: Promoting mental model development and inference generation. Journal of Educational Psychology, 98(1), 182-197. doi:10.1037/0022-0663.98.1.182.
Canham, M., & Hegarty, M. (2010). Effects of knowledge and display design on comprehension of complex graphics. Learning and Instruction, 20(2), 155-166. doi:10.1016/j.learninstruc.2009.02.014.
Carney, R. N., & Levin, J. R. (2002). Pictorial illustrations still improve students' learning from text. Educational Psychology Review, 14(1), 5-26. doi:10.1023/A:1013176309260.
Collins, A., Brown, J. S., & Newman, S. E. (1989). Cognitive apprenticeship: Teaching the crafts of reading, writing, and mathematics. In L. B. Resnick (Ed.), Knowing, learning, and instruction (pp. 453-494). Hillsdale, NJ: Erlbaum.
Cordova, J. R., Sinatra, G. M., Jones, S. H., Taasoobshirazi, G., & Lombardi, D. (2014). Confidence in prior knowledge, self-efficacy, interest and prior knowledge: Influences on conceptual change. Contemporary Educational Psychology, 39(2), 164-174. doi:10.1016/j.cedpsych.2014.03.006.
Cornoldi, C., & Colpo, G. (1995). Nuove prove MT per la scuola media [New MT tests of reading comprehension for the middle school]. Florence, Italy: Organizzazioni Speciali.
Corsi, P. M. (1972). Human memory and the medial temporal region of the brain. Unpublished doctoral dissertation, McGill University, Montreal.
Cromley, J. G. (2009). Reading achievement and science proficiency: International comparisons from the Programme on International Student Assessment. Reading Psychology, 30(2), 89-118. doi:10.1080/02702710802274903.
Cromley, J. G., Snyder-Hogan, L. E., & Luciw-Dubas, U. A. (2010a). Cognitive activities in complex science text and diagrams. Contemporary Educational Psychology, 35(1), 59-74. doi:10.1016/j.cedpsych.2009.10.00.
Cromley, J. G., Snyder-Hogan, L. E., & Luciw-Dubas, U. A. (2010b). Reading comprehension of scientific text: A domain-specific test of the direct and inferential mediation model of reading comprehension. Journal of Educational Psychology, 102(3), 687-700. doi:10.1037/a0019452.
Daneman, M., & Carpenter, P. A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 19(4), 450-466. doi:10.1016/S0022-5371(80)90312-6.
Diakidoy, I. N., Kendeou, P., & Ioannides, C. (2003). Reading about energy: The effects of text structure in science learning and conceptual change. Contemporary Educational Psychology, 28, 335-356. doi:10.1016/S0361-476X(02)00039-5.
Diakidoy, I. N., Mouskounti, T., & Ioannides, C. (2011). Comprehension and learning from refutation and expository texts. Reading Research Quarterly, 46, 22-38. doi:10.1598/RRQ.46.1.2.
Eitel, A., Scheiter, K., Schüler, A., Nyström, M., & Holmqvist, K. (2013). How a picture facilitates the process of learning from text: Evidence for scaffolding. Learning and Instruction, 28, 48-63. doi:10.1016/j.learninstruc.2013.05.002.
Florax, M., & Ploetzner, R. (2010). What contributes to the split-attention effect? The role of text segmentation, picture labelling, and spatial proximity. Learning and Instruction, 20(3), 216-224. doi:10.1016/j.learninstruc.2009.02.021.
Geiser, C., Lehmann, W., & Eid, M. (2006). Separating rotators from non rotators in the Mental Rotations Test: A multigroup latent class analysis. Multivariate Behavioral Research, 41(3), 261-293. doi:10.1207/s15327906mbr4103_2.
Grant, E. R., & Spivey, M. J. (2003). Eye movements and problem solving: Guiding attention guides thought. Psychological Science, 14(5), 462-466. doi:10.1111/1467-9280.02454.
Groenendijk, T., Janssen, T., Rijlaarsdam, G., & van den Bergh, H. (2013). The effect of observational learning on students' performance, processes, and motivation in two creative domains. British Journal of Educational Psychology, 83(1), 3-28. doi:10.1111/j.2044-8279.2011.02052.x.
Hannus, M., & Hyönä, J. (1999). Utilization of illustrations during learning of science textbook passages among low- and high-ability children. Contemporary Educational Psychology, 24(2), 95-123. doi:10.1006/ceps.1998.0987.
Haskell, R. E. (2001). Transfer of learning: Cognition, instruction, and reasoning. San Diego, CA: Academic Press.
Hyönä, J. (2010). The use of eye movements in the study of multimedia learning. Learning and Instruction, 20(2), 172-176. doi:10.1016/j.learninstruc.2009.02.013.
Hyönä, J., & Lorch, R. F. (2004). Effects of topic headings on text processing: Evidence from adult readers' eye fixation patterns. Learning and Instruction, 14(2), 131-152. doi:10.1016/j.learninstruc.2004.01.001.
Hyönä, J., Lorch, R. F., & Kaakinen, J. K. (2002). Individual differences in reading to summarize expository text: Evidence from eye fixation patterns. Journal of Educational Psychology, 94(1), 44-55. doi:10.1037/0022-0663.94.1.44.
Hyönä, J., Lorch, R. F., & Rinck, M. (2003). Eye movement measures to study global text processing. In J. Hyönä, R. Radach, & H. Deubel (Eds.), The mind's eye: Cognitive and applied aspects of eye movement research (pp. 313-334). Amsterdam: Elsevier Science.
Hyönä, J., & Nurminen, A.-M. (2006). Do adult readers know how they read? Evidence from eye movement patterns and verbal reports. British Journal of Educational Psychology, 97(1), 31-50. doi:10.1348/000712605x53678.
Jarodzka, H., Balslev, T., Holmqvist, K., Nyström, M., Scheiter, K., Gerjets, P., et al. (2012). Conveying clinical reasoning based on visual observation via eye-movement modelling examples. Instructional Science, 40(5), 813-827. doi:10.1007/s11251-012-9218-5.
Jarodzka, H., Scheiter, K., Gerjets, P., & van Gog, T. (2010). In the eyes of the beholder: How experts and novices interpret dynamic stimuli. Learning and Instruction, 20(2), 146-154. doi:10.1016/j.learninstruc.2009.02.019.
Jarodzka, H., van Gog, T., Dorr, M., Scheiter, K., & Gerjets, P. (2013). Learning to see: Guiding students' attention via a model's eye movements fosters learning. Learning and Instruction, 25, 62-70. doi:10.1016/j.learninstruc.2012.11.004.
Johnson, C. I., & Mayer, R. E. (2012). An eye movement analysis of the spatial contiguity effect in multimedia learning. Journal of Experimental Psychology: Applied, 18(2), 178-191. doi:10.1037/a0026923.
Just, M. A., & Carpenter, P. A. (1980). A theory of reading: From eye fixations to comprehension. Psychological Review, 87(4), 329-354. doi:10.1037/0033-295X.87.4.329.
Kendeou, P., Muis, K., & Fulton, S. (2011). Reader and text factors in reading comprehension processes. Journal of Research in Reading, 34(4), 162-183. doi:10.1111/j.1467-9817.2010.01436.x.
Litchfield, D., Ball, L. J., Donovan, T., Manning, D. J., & Crawford, T. (2010). Viewing another person's eye movements improves identification of pulmonary nodules in chest X-ray inspection. Journal of Experimental Psychology: Applied, 16(3), 251-262. doi:10.1037/a0020082.
Lowe, R. K., & Boucheix, J. M. (2011). Cueing complex animations: Does direction of attention foster learning processes? Learning and Instruction, 21(5), 650-663. doi:10.1016/j.learninstruc.2011.02.002.
Mammarella, I. C., Pazzaglia, F., & Cornoldi, C. (2008). Evidence for different components in children's visuospatial working memory. British Journal of Developmental Psychology, 26(3), 337-355. doi:10.1348/026151007X236061.
Mammarella, I. C., Toso, C., Pazzaglia, F., & Cornoldi, C. (2008). Bvs-Corsi. Batteria per la valutazione della memoria visiva spaziale [Bvs-Corsi. A test battery for the evaluation of visual spatial memory]. Trento, Italy: Erickson.
Mason, L., Gava, M., & Boldrin, A. (2008). On warm conceptual change: The interplay of text, epistemological beliefs, and topic interest. Journal of Educational Psychology, 100, 291-309. doi:10.1037/0022-0663.100.2.291.
Mason, L., Pluchino, P., & Tornatora, M. C. (2013). Effects of picture labeling on illustrated science text processing and learning: Evidence from eye movements. Reading Research Quarterly, 48(2), 199-214. doi:10.1002/rrq.41.
Mason, L., Pluchino, P., & Tornatora, M. C. (in press). Using eye-tracking technology as an indirect instruction tool to improve text and picture processing and learning. British Journal of Educational Technology.
Mason, L., Pluchino, P., Tornatora, M. C., & Ariasi, N. (2013). An eye-tracking study of learning from science text with concrete and abstract illustrations. Journal of Experimental Education, 81(3), 1-29. doi:10.1080/00220973.2012.727885.
Mason, L., Tornatora, M. C., & Pluchino, P. (2013). Do fourth graders integrate text and picture in processing and learning from an illustrated science text? Evidence from eye-movement patterns. Computers & Education, 60(1), 95-109. doi:10.1016/j.compedu.2012.07.011.
Mason, L., Tornatora, M. C., & Pluchino, P. (2015). Integrative processing of verbal and graphical information during re-reading predicts learning from illustrated text: An eye-movement study. Reading and Writing, doi:10.1007/s11145-015-9552-5.
Mayer, R. E. (1989). Systematic thinking fostered by illustrations in scientific text. Journal of Educational Psychology, 81(2), 240-246. doi:10.1037/0022-0663.81.2.240.
Mayer, R. E. (2009). Multimedia learning (2nd ed.). New York: Cambridge University Press.
Mayer, R. E. (2010). Unique contributions of eye-tracking research to the study of learning with graphics. Learning and Instruction, 20(2), 167-171. doi:10.1016/j.learninstruc.2009.02.012.
Mayer, R. E. (2014). Cognitive theory of multimedia learning. In R. E. Mayer (Ed.), The Cambridge handbook of multimedia learning (2nd ed., pp. 43-71). New York: Cambridge University Press.
Mayer, R. E., & Gallini, J. K. (1990). When is an illustration worth ten thousand words? Journal of Educational Psychology, 82(4), 715-726. doi:10.1037/0022-0663.82.4.715.
Paivio, A. (1986). Mental representations: A dual-coding approach. New York: Oxford University Press.
Pazzaglia, F., Palladino, P., & De Beni, R. (2000). Presentazione di uno strumento per la valutazione della memoria di lavoro verbale e sua relazione con i disturbi della comprensione [An instrument for the assessment of verbal working memory and its relation with comprehension difficulties]. Psicologia Clinica dello Sviluppo, 3, 465-486.
Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124(3), 372-422. doi:10.1037/0033-2909.124.3.372.
Rayner, K. (2009). Eye movements and attention in reading, scene perception, and visual search. Quarterly Journal of Experimental Psychology, 62(8), 1457-1506. doi:10.1080/17470210902816461.
Renkl, A. (1997). Learning from worked-out examples: A study on individual differences. Cognitive Science, 21(1), 1-29. doi:10.1207/s15516709cog2101_1.
Schlag, S., & Ploetzner, R. (2011). Supporting learning from illustrated texts: Conceptualizing and evaluating a learning strategy. Instructional Science, 39(6), 921-937. doi:10.1007/s11251-010-9160-3.
Schnotz, W. (2002). Commentary: Towards an integrated view of learning from text and visual displays. Educational Psychology Review, 14(1), 101-120. doi:10.1023/A:1013136727916.
Schnotz, W. (2014). Integrated model of text and picture comprehension. In R. E. Mayer (Ed.), The Cambridge handbook of multimedia learning (2nd ed., pp. 72-103). New York: Cambridge University Press.
Schnotz, W., & Bannert, M. (2003). Construction and interference in learning from multiple representations. Learning and Instruction, 13(2), 141-156. doi:10.1016/S0959-4752(02)00017-8.
Schroeder, S., Richter, T., McElvany, N., Hachfeld, A., Baumert, J., Schnotz, W., et al. (2011). Teachers' beliefs, instructional behaviors, and students' engagement in learning from texts with instructional pictures. Learning and Instruction, 21(3), 403-415. doi:10.1016/j.learninstruc.2010.06.001.
Seppänen, M., & Gegenfurtner, A. (2012). Seeing through a teacher's eye improves students' imaging interpretation. Medical Education, 46(11), 1113-1114. doi:10.1111/medu.12041.
Sinatra, G. M., & Broughton, S. H. (2011). Bridging reading comprehension and conceptual change in science education: The promise of refutation text. Reading Research Quarterly, 46(4), 369-388. doi:10.1002/RRQ.005.
Stark, R., Kopp, V., & Fischer, M. R. (2011). Case-based learning with worked examples in complex domains: Two experimental studies in undergraduate medical education. Learning and Instruction, 21(1), 22-33. doi:10.1016/j.learninstruc.2009.10.001.
van Gog, T. (2011). Effects of identical example-problem and problem-example pairs on learning. Computers & Education, 57(2), 1775-1779. doi:10.1016/j.compedu.2011.03.019.
van Gog, T., Jarodzka, H., Scheiter, K., Gerjets, P., & Paas, F. (2009). Attention guidance during example study via the model's eye movements. Computers in Human Behavior, 25(3), 785-791. doi:10.1016/j.chb.2009.02.007.
van Gog, T., Paas, F., & van Merriënboer, J. J. G. (2006). Effects of process-oriented worked examples on troubleshooting transfer performance. Learning and Instruction, 16(2), 154-164. doi:10.1016/j.learninstruc.2006.02.004.
van Gog, T., & Rummel, N. (2010). Example-based learning: Integrating cognitive and social-cognitive research perspectives. Educational Psychology Review, 22(2), 155-174. doi:10.1007/s10648-010-9134-7.
van Gog, T., & Scheiter, K. (2010). Eye tracking as a tool to study and enhance multimedia learning. Learning and Instruction, 20(2), 95-99. doi:10.1016/j.learninstruc.2009.02.009.
Vandenberg, S. G., & Kuse, A. R. (1978). Mental rotations. A group test of three-dimensional spatial visualization. Perceptual and Motor Skills, 47(2), 599-604.
