
Fourth Edition

Engineering Psychology and Human Performance


Christopher D. Wickens
University of Illinois at Urbana-Champaign and Alion Science and Technology
Justin G. Hollands
Defence Research and Development Canada and University of Toronto
Simon Banbury
Looking Glass HF and Université Laval
Raja Parasuraman
George Mason University

First published 2013, 2000, 1992 by Pearson Education, Inc.

Published 2016 by Routledge


2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN
711 Third Avenue, New York, NY 10017, USA

Routledge is an imprint of the Taylor & Francis Group, an informa business

Copyright © 2013, 2000, 1992 Taylor & Francis. All rights reserved.

All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other
means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without
permission in writing from the publishers.

Notice:
Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to
infringe.
Credits and acknowledgments borrowed from other sources and reproduced, with permission, in this textbook appear on appropriate page
within text.

ISBN: 9780205021987 (hbk)

Cover Designer: Suzanne Behnke

Catalogue in Publication data available from the Library of Congress

Dedicated to Bill Howell: a pioneer and leader in engineering psychology

BRIEF CONTENTS

Preface
Chapter 1 Introduction to Engineering Psychology and Human Performance
Chapter 2 Signal Detection and Absolute Judgment
Chapter 3 Attention in Perception and Display Space
Chapter 4 Spatial Displays
Chapter 5 Spatial Cognition, Navigation, and Manual Control
Chapter 6 Language and Communication
Chapter 7 Memory and Training
Chapter 8 Decision Making
Chapter 9 Selection of Action
Chapter 10 Multitasking
Chapter 11 Mental Workload, Stress, and Individual Differences: Cognitive and Neuroergonomic
Perspectives
Chapter 12 Automation and Human Performance
Epilogue
References
Name Index
Subject Index

CONTENTS

Preface
Chapter 1 Introduction to Engineering Psychology and Human Performance
1. Definitions
1.1 Engineering psychology
1.2 Human performance
2. Research Methods
3. A Model of Human Information Processing
4. Pedagogy of the Book
Key Terms
Chapter 2 Signal Detection and Absolute Judgment
1. Overview
2. Signal Detection Theory
2.1 The signal detection paradigm
2.2 Setting the response criterion: optimality in SDT
2.3 Sensitivity
3. The ROC Curve
3.1 Theoretical representation
3.2 Empirical data
4. Fuzzy Signal Detection Theory
5. Applications of Signal Detection Theory
5.1 Medical diagnosis
5.2 Recognition memory and eyewitness testimony
5.3 Alarm and alert systems
6. Vigilance
6.1 Measuring vigilance performance
6.2 Theories of vigilance
6.3 Techniques to combat the loss of vigilance
6.4 Vigilance: inside and outside the laboratory
7. Absolute Judgment
7.1 Quantifying information
7.2 Single dimensions
7.3 Multidimensional judgment
8. Transition
Supplement: Information Theory
S.1 The quantification of information
S.2 Information transmission of discrete signals

S.3 Conclusion
Appendix: Computing d’ and Beta
Key Terms
Chapter 3 Attention in Perception and Display Space
1. Overview
2. Selective Visual Attention
2.1 Supervisory control: the SEEV model
2.2 Noticing and attentional capture
2.3 Visual search
2.4 Clutter
2.5 Directing and guiding attention
3. Parallel Processing and Divided Attention
3.1 Preattentive processing and perceptual organization
3.2 Spatial proximity
3.3 Object-based proximity
3.4 Applications of object-based attention
3.5 The proximity compatibility principle (PCP)
4. Attention in the Auditory Modality
4.1 Auditory divided attention
4.2 Focusing auditory attention
4.3 Cross-modality attention
5. Transition
Key Terms
Chapter 4 Spatial Displays
1. Graph Perception
1.1 Graph guidelines
1.2 Task dependency and the proximity compatibility principle
1.3 Minimize the number of mental operations: search, encode, and compare
1.4 Biases in graph reading
1.5 The data-ink ratio
1.6 Multiple graphs
2. Dials, Meters, and Indicators: Display Compatibility
2.1 The static component: pictorial realism
2.2 Color coding
2.3 Compatibility of display movement
2.4 Display integration and ecological interface design
3. The Third Dimension: Egomotion, Depth, and Distance
3.1 Direct and indirect perception
3.2 Perception of egomotion: ambient 3D
3.3 Judging and interpreting depth and three-dimensional structure: focal 3D
3.4 Illusions in 3D viewing

3.5 3D displays
3.6 Stereoscopic displays
4. Spatial Audio and Tactile Displays
5. Transition
Key Terms
Chapter 5 Spatial Cognition, Navigation, and Manual Control
1. Frames of Reference
1.1 Cognitive representation of space
1.2 Frame of reference (FOR) transformations in 2D mental rotation
1.3 3D mental rotation: the general FORT model
1.4 2D or 3D
1.5 Solutions to FOR problems
2. Applications to Map Design
2.1 Design of 2D maps
2.2 Design of 3D maps
2.3 Map scale
2.4 The role of clutter in map search
3. Environmental Design
4. Information Visualization
4.1 Tasks in visualization
4.2 Principles of visualization
5. Visual Momentum
6. Tracking, Travel, and Continuous Manual Control
6.1 Tracking to a fixed target
6.2 Tracking a moving target
6.3 What makes tracking difficult
6.4 Multi-axis tracking and control
7. Virtual Environments and Augmented Reality
7.1 Virtual environment characteristics
7.2 Uses of virtual environments
7.3 Augmented reality
8. Transition
Key Terms
Chapter 6 Language and Communication
1. Overview
2. The Perception of Print
2.1 Stages in word perception
2.2 Top-down processing: context and redundancy
2.3 Reading: from words to sentences
3. Applications of Unitization and Top-Down Processing
3.1 Unitization

3.2 Context-data tradeoffs
3.3 Code design: economy versus security
4. Recognition of Objects
4.1 Top-down and bottom-up processing
4.2 Pictures and icons
4.3 Sounds and earcons
5. Comprehension
5.1 Instructions
5.2 Context
5.3 Command versus status
5.4 Linguistic factors
5.5 Working memory load
6. Multimedia Instructions
6.1 The optimum medium
6.2 Redundancy and complementarity
6.3 Realism of pictorial material
7. Product Warnings
8. Speech Perception
8.1 Representation of speech
8.2 Units of speech perception
8.3 Top-down processing of speech
8.4 Applications of voice recognition research
8.5 Communications
8.6 Crew resource management and team situation awareness
9. Transition: Perception and Memory
Key Terms
Chapter 7 Memory and Training
1. Overview
2. Working Memory
2.1 Working memory interference
2.2 Working memory, the central executive, and executive control
2.3 Matching display with working memory code
2.4 Limitations of working memory: duration and capacity
3. Interference and Confusion
4. Expertise and Memory
4.1 Expertise
4.2 Expertise and chunking
4.3 Skilled memory and long term working memory
5. Everyday Memory
5.1 Prospective memory
5.2 Transactive memory

6. Situation Awareness
6.1 Working memory and expertise in situation awareness
6.2 Levels of SA and anticipation
6.3 Measuring SA and the role of awareness
7. Planning and Problem Solving
8. Training
8.1 Transfer of training
8.2 Training techniques and strategies
9. Long Term Memory: Representation, Organization, and Retrieval
9.1 Knowledge representation
9.2 Memory retrieval and forgetting
9.3 Skill retention
10. Transition
Key Terms
Chapter 8 Decision Making
1. Introduction
2. Classes and Features of DM
3. An Information Processing Model of Decision Making
4. What Is “Good” Decision Making?
5. Diagnosis and Situation Assessment in Decision Making
5.1 Estimating cues: perception
5.2 Evidence accumulation, selective attention: cue seeking and hypothesis formation
5.3 Expectations in diagnosis: the role of long-term memory
5.4 Belief changes over time
5.5 Implications of biases and heuristics in diagnoses
6. Choice of Action
6.1 Certain choice
6.2 Choice under uncertainty: the expected value model
6.3 Heuristics and biases in uncertain choice
6.4 The decision to behave safely
7. Effort and Meta-Cognition
7.1 Effort
7.2 Meta-cognition and (over) confidence
8. Experience and Expertise in Decision Making
9. Improving Decision Making
9.1 Training debiasing
9.2 Proceduralization
9.3 Displays
9.4 Automation and decision support tools
10. Conclusion and Transition
Key Terms

Chapter 9 Selection of Action
1. Variables Influencing Simple and Choice RT
1.1 Stimulus modality
1.2 Stimulus intensity
1.3 Temporal uncertainty
1.4 Expectancy
2. Variables Influencing Choice Reaction Time
2.1 The information theory model: the Hick-Hyman law
2.2 The speed-accuracy trade-off
2.3 Stimulus Discriminability
2.4 The repetition effect
2.5 Response factors
2.6 Practice
2.7 Executive control
2.8 S-R compatibility
3. Stages in Reaction Time
4. Serial Responses
4.1 The psychological refractory period
4.2 Decision complexity: The decision complexity advantage
4.3 Pacing
4.4 Response factors
4.5 Preview and transcription
5. Errors
5.1 Categories of human error: an information-processing approach
5.2 Human reliability analysis
5.3 Errors in an organizational context
5.4 Error remedies
6. Transition
Key Terms
Chapter 10 Multitasking
1. Overview
2. Effort and Resource Demand
3. Multiplicity
3.1 Stages
3.2 Processing codes
3.3 Perceptual modalities
3.4 Visual channels
3.5 A computational model
4. Executive Control, Switching, and Resource Management
4.1 Task switching
4.2 Interruption management

4.3 From interruption management to task management
5. Distracted Driving
5.1 Mechanisms of interference
5.2 Cell phone interference
6. Task Similarity, Confusion, and Crosstalk
7. Individual Differences in Time Sharing
7.1 Expertise and attention
7.2 Training expertise in time-sharing skills
7.3 Aging and attention skills
8. Conclusion and Transition
Key Terms
Chapter 11 Mental Workload, Stress, and Individual Differences: Cognitive and Neuroergonomic
Perspectives
1. Introduction
2. The Neuroergonomic Approach
3. Mental Workload
3.1 Workload overload
3.2 Reserve capacity region
3.3 Measures of mental workload and reserve capacity
3.4 Neuroergonomics of workload
3.5 Relationship between workload measures
3.6 Consequences of workload
4. Stress, Physiological Arousal, and Human Performance
4.1 Arousal theory
4.2 The Yerkes Dodson law
4.3 Transactional and cognitive appraisal theories of stress
4.4 Stress effects on performance
4.5 Stress component effects
4.6 Stress remediation
5. Individual Differences
5.1 Ability differences in multitasking
5.2 Differences in working memory
5.3 Molecular genetics and individual differences in cognition
5.4 Brain computer interfaces for healthy and disabled individuals
6. Conclusions and Transition
Key Terms
Chapter 12 Automation and Human Performance
1. Introduction
2. Examples and Purposes of Automation
2.1 Tasks that humans cannot perform
2.2 Human performance limitations

2.3 Augmenting or assisting human performance
2.4 Economics
2.5 Productivity
3. Automation-Related Incidents and Accidents
4. Levels and Stages of Automation
4.1 Information acquisition
4.2 Information analysis
4.3 Decision making and action selection
4.4 Action implementation
5. Automation Complexity
6. Feedback on Automation States and Behaviors
7. Trust in and Dependence on Automation
7.1 Over-trust
7.2 Mistrust and alarm false alarms
8. Adaptive Automation
8.1 What to adapt
8.2 How to infer
8.3 Who decides?
9. Designing for Effective Human-Automation Interaction
9.1 Feedback
9.2 Appropriate levels and stages of automation
9.3 Designing for human-automation “etiquette”
9.4 Calibrating operator trust: display design and training
10. Conclusions
Key Terms
Epilogue
References
Name Index
Subject Index

PREFACE

Each edition of this book (this is now the fourth) was written to address the gap between the problems of
system design and much of the excellent theoretical research in cognitive and experimental psychology and
human performance. Many human-machine systems do not work as well as they could because they impose
requirements on the human user that are incompatible with the way people attend, perceive, think, remember,
decide, and act: that is, the way in which people perform or process information. Over the past six decades,
tremendous gains have been made in understanding and modeling human information processing and human
performance. Our goal is to show how these theoretical advances have been, or might be, applied to
improving human-machine interaction.
Although engineers encountering system design problems may find some answers or guidelines either
implicitly or explicitly stated in this book, it is not intended to be a handbook of human factors engineering.
Many of the references in the text provide a more comprehensive tabulation of such guidelines as well as
practical guidelines on how to apply them. Instead, we have organized the book directly from the
psychological perspective of human information processing. The chapters generally correspond to the flow of
information as it is processed by a human being—from the senses, through the brain, to action—rather than
from the perspective of system components or engineering design concepts, such as displays, illumination,
controls, computers, and keyboards. Furthermore, although the following pages contain recommendations for
certain system design principles, many of these are based only on laboratory research and theory; they have
not been tested in real-world systems.
It is our firm belief that a solid grasp of theory will provide a strong base from which the specific
principles of good human factors can be more readily derived. Our intended audience, therefore, is: (1) the
student in psychology, who will begin to recognize the relevance to many areas in the real-world applications
of the theoretical principles of psychology that he or she may have encountered in other courses; (2) the
engineering student, who while learning to design and build systems with which humans interact, will come to
appreciate not only the nature of human limitations—the essence of human factors—but also the theoretical
principles of human performance and information processing underlying them; and (3) the actual practitioner
in engineering psychology, human performance, and human factors engineering, who can understand the
close cooperation that should exist between principles and theories of psychology and issues in system design.
The 12 chapters of the book span a wide range of human performance components. Following the
introduction in Chapter 1, in which engineering psychology is put into the broader framework of human
factors and system design, Chapters 2 through 8 deal with perception, attention, cognition (both spatial and
verbal), memory, learning, and decision making, emphasizing the potential applications of these areas of
cognitive psychology. Chapters 9 and 10 cover the selection and execution of control actions, error, and time-
sharing. Chapter 11 covers three more integrated concepts: workload, stress, and individual differences, much
from the perspective of the new field of neuroergonomics. Chapter 12 addresses topics of human-automation
interaction. Finally, a short epilogue highlights certain critical issues that transcend many of the prior chapters.
Although the 12 chapters are interrelated (just as are the components of human information processing),
we have constructed them in such a way that any chapter may be deleted from a course syllabus and still leave
a coherent body. Thus, for example, a course on applied cognitive psychology might include Chapters 1
through 8 and Chapter 10; and another emphasizing more heavily engineering applications might include
Chapters 1, 2, 4, 5, 9, 10, 11, 12, and Epilogue.

NEW TO THIS EDITION


Changes since the 3rd edition that appear throughout the text:

We have added two new co-authors, Raja Parasuraman and Simon Banbury
Greatly increased number of references to medical and health care applications
Greatly increased number of references to changes in cognition in the aging population
Increased emphasis on readability and common-sense examples
We have created 48 new figures

Citations to many new studies have been added

In addition to incorporating new experiments and studies where appropriate, we have made a number of
changes in the fourth edition that set it apart from the third. First, most prominently, we have added the new
chapter on neuroergonomics that integrates much of the material on stress, workload, and individual
differences. Second, we have substantially revised the chapters on spatial cognition, decision making,
automation, and multi-tasking. In the latter we have included sections on interruption management and
distracted driving, as both of these areas represent cogent examples of applications of engineering psychology
theory to problems in society. To compensate for these additions, we have removed much of the complex
material on both manual and process control. Third, we have populated the text throughout with many more
examples of how information processing changes in older adulthood.
A fourth obvious change is the addition of two talented co-authors to the team. Raja Parasuraman brings
expertise in automation and neuroergonomics, while Simon Banbury contributes his expertise in cognition,
memory, and auditory processing. Finally, with this influx of talent we bring a wealth of new literature to
update engineering psychology to the third millennium. Approximately 1,000 new references (about 50 percent of the citation list) have been added.

CHANGES PER CHAPTER

Chapter 2
Added section on new technique of fuzzy signal detection theory
New section on vigilance inside and outside the laboratory

Chapter 3
New treatment of models of selective visual attention and eye movements
New section on clutter. This is also extended to a later section on map clutter.
New material on the distraction of noise in the workplace and school

Chapter 4
New section on direct and indirect perception
New section on illusions in 3D viewing
New section on stereoscopic displays

Chapter 5
This chapter has been substantially rewritten and restructured from the 3rd edition, and is now titled Spatial Cognition, Navigation, and Manual Control. As such it integrates much of the material from the manual control
chapter in the previous edition, while removing many of the technical details from that earlier version. The
new chapter contains:

New sections on cognitive representation of space and frame of reference transformations
A new section on a computational model of spatial transformations
A new major section on applications to design of 2D and 3D maps
A new section on environmental design
A new section on the important display principle of visual momentum
Expanded section on virtual environments includes more material on augmented reality and problems
with virtual and augmented reality

Chapter 6
Contains a new section on auditory icons

Chapter 7
A new section on everyday memory
A new section on prospective memory
A new section on transactive memory, operating within groups of people
Expanded coverage of situation awareness from 2 pages to 6 pages

Chapter 8
Substantially revised this chapter on decision making, including many recent findings on loss aversion,
decision fatigue, and “intuitive” decision making. In addition, we have:

Added 2 new sections on the role of effort and meta-cognition in decision making
Added an integrative section on the role of experience and expertise

Chapter 9
This chapter on action selection has now integrated the material on human error which was in a separate
chapter in the previous edition. We have deleted some of the extensive material on stages in reaction time.

Chapter 10
This is now a substantially revised chapter focusing exclusively on multitasking. It includes:

A new section on computational models of multi-tasking
A new section on executive control
A new section on interruption management
A new major section on distracted driving, with an emphasis on the sources of, and solutions to, distraction from mobile phone use while driving
A new major section of 4 pages on individual differences in multi-tasking, with a focus on differences related to abilities, expertise, and aging

Chapter 11
Mental Workload, Stress, and Individual Differences: Cognitive and Neuroergonomic Perspectives. This chapter is new, although it contains much of the material in the previous edition on mental workload and stress. However, it now contains:

A new section describing the neuroergonomics approach, integrating human factors with human neurophysiology
Greatly expanded coverage of the neuroergonomics approach to workload measurement
A new section on individual differences, including differences in working memory and executive control, differences in molecular genetics and their relation to cognition, and differences resulting from disabilities, with a focus on the emerging study of brain computer interfaces

Chapter 12
Automation. Coverage of process control has been removed from this final chapter and partly distributed to other chapters. In its place, the chapter is devoted exclusively to human-automation interaction, more than doubling coverage from 12 pages in the previous edition to 27 in the current edition. This expanded coverage
includes new sections on:

Automation-related accidents and incidents
Levels and stages of automation
Automation complexity
Feedback on automation states and behaviors

Trust in and dependence on automation
Designing for human-automation “etiquette”

EPILOGUE
Finally, the book closes with a short chapter, or “epilogue,” that integrates several of the central and recurring themes of the book.

SUPPLEMENTS
Please visit the companion website at www.routledge.com/9780205021987

ACKNOWLEDGMENTS
In any project of this kind, one is indebted to numerous people for assistance. For all of us the list includes
several colleagues who have read and commented on various chapters, have provided feedback on the prior
editions, or have stimulated our thinking. In addition to all acknowledgments in the first two editions (the text
of which, of course, remains very much at the core of the current book), the first author would like to
acknowledge the contributions of faculty colleagues and countless students who, in one form or another, have
offered feedback regarding either bad or good sections of prior editions.
Christopher Wickens would like to acknowledge the contributions of four specific individuals who have
contributed to the development of his interest in engineering psychology: His father, Delos Wickens,
stimulated his early interest in experimental psychology; Dick Pew introduced him to academic research in
engineering psychology and human performance; the late Stanley Roscoe pointed out the importance of good
research applications to system design; and Emanuel Donchin continues to emphasize the importance of solid
theoretical and empirical research.
Justin Hollands would like to acknowledge his many colleagues in the Human Systems Integration
Section at Defence Research and Development Canada – Toronto, with whom he discussed various topics in
the book at different times. He would especially like to thank Linda Bossi for seeing the “big picture” and
supporting his efforts. He also thanks Stewart Harrison for tracking down many of the references. Finally, he
thanks his family, and especially his wife, Cindy, for their patience while this book was being written.
Simon Banbury would like to thank Chunyun Ma, Patrick Bickerton, and Erica Elderhorst for their help
conducting the literature searches, and Sébastien Tremblay for his critical review of the sections on working
memory and auditory attention. Thanks also to his wife, Jennifer, and daughters, Tess and Charlotte, for their
support.
Raja Parasuraman would like to acknowledge his colleagues, postdoctoral fellows, and graduate students
at George Mason University for extensive discussions of topics related to attention, automation, and
neuroergonomics. He also appreciates the support of his wife and George Mason colleague Carryl Baldwin.
Christopher D. Wickens
Justin G. Hollands
Simon Banbury
Raja Parasuraman

1 INTRODUCTION TO ENGINEERING
PSYCHOLOGY AND HUMAN PERFORMANCE
The field of human factors engineering, along with the closely related disciplines of human-systems
integration (Booher, 2003), human-computer interaction (Shneiderman & Plaisant, 2009; Sears & Jacko,
2009), and user-interface design (Buxton, 2007), addresses issues of how humans interact with technology. It
developed rapidly, following its approximate birth just after World War II when experimental psychologists
were called in to help understand why pilots were crashing perfectly good aircraft (Fitts & Jones, 1947), why
vigilance for enemy planes over the English Channel was sometimes wanting (Mackworth, 1948), or how
learning theory could be harnessed to better train military personnel (Melton, 1947). Since that time, over the
past 70 to 80 years, the field has grown and expanded into areas such as consumer products, business,
highway safety, telecommunications, and, most recently, health care (Kohn, Corrigan, & Donaldson, 1999).

1. DEFINITIONS
1.1 Engineering Psychology
Within the broader field of human factors lies the discipline of engineering psychology (Proctor & Vu,
2010), the focus of this book. Engineering psychology focuses on “human factors from the neck up,” in
contrast to many applications of human factors to issues “below the neck,” such as lower back injuries,
fatigue, work physiology and so forth. Much of this latter focus is covered in the general discipline of
ergonomics, the study of work, although classic ergonomics has itself spawned the study of cognitive
ergonomics and/or cognitive engineering, both of which naturally focus on human work “above the neck”
(Vicente, 1999; Jenkins, Stanton, et al., 2009). An additional contrast with the broader field of human factors
engineering (Wickens, Lee, Liu, & Gordon-Becker, 2004) is that human factors engineering focuses much
more heavily upon design (of products, workstations, etc.) and the evaluation of those designs than does
engineering psychology. Engineering psychology is, after all, a subdiscipline of psychology, and not
engineering.
Thus, engineering psychology can also be described within the broader discipline of psychology, and
within this the somewhat narrower discipline of applied psychology. In the latter, the study of behavior is
focused on the applications of those principles and theories of behavior and cognition to areas beyond the
laboratory, such as industry, schools, counseling, mental illness, and sports. Within this broader set of
applications, the focus of engineering psychology tends to be on performance in the workplace (expanded to
include transportation and some aspects of the home), hence characterizing its close linkage back to
ergonomics, the study of work, and particularly cognitive ergonomics.
But to highlight the uniqueness of engineering psychology again, what distinguishes it from cognitive
ergonomics is that the former has a strong and (some would say) necessary basis in theory: the theories of
brain, behavior, and cognition that are applicable to the workplace. The latter is certainly not devoid of theory,
but also broadens its focus to consider issues of task description and analysis, design, and principles of design
that may not directly translate to theory.
In distinguishing engineering psychology from branches of basic psychology (especially experimental
psychology), the former field must be concerned with the eventual applications of its theories and principles,
while the latter need not be. This has three implications for research in the two related disciplines. First,
experimental psychology is very concerned with the issues of experimental control. All variables should be
held constant except those manipulated in the experiment. Second, the concern for statistical significance
often dominates that of practical significance. That is, an effect measured in the laboratory of only 10
milliseconds can signal an exciting discovery, but such effects may be of limited usefulness in the workplace
beyond the laboratory. Third, in basic laboratory research the participant’s task is typically designed by the
experimenter for theoretical reasons.
In contrast, in engineering psychology, while there is still concern for control in its experimental
research, too much control may produce effects that, like the 10 millisecond effect above, would simply
“wash out” when the human performs in the workplace, with its many other competing influences on human
behavior. The second difference is related to the first. While engineering psychologists do pay attention to
statistics and statistical significance, they also realize that without considering practical significance, a

particular finding or principle will simply not scale up to the workplace, where it is to be passed on to the human factors engineer, who has the commitment to design. Third, in designing a task for experimental
participants, the engineering psychologist must always consider its relevance to tasks beyond the laboratory.
The engineering psychologist should know and understand the relevant real-world context and tasks, and this
knowledge should inspire the design of the experimental task.
Of course, in practice such distinctions are fuzzy rather than crisp. We have noted the fuzziness of
defining what is and is not the “workplace”; for example, highway safety is very much within the domain of
engineering psychology, but it does not matter whether the person is driving a truck for work or a car for
pleasure. As another example of this fuzziness, sometimes issues below the neck influence those above (as
when we are mentally distracted by the discomfort resulting from a poorly designed physical workplace).
Furthermore, many issues of design addressed by human factors depend on engineering psychology principles
(Peacock, 2009), and when designs are evaluated outside the laboratory, their results may lead to further
controlled experiments to refine the principles upon which those designs were based. And in this same way,
lessons learned, and challenges faced by engineering psychologists should always feed back to the basic
psychologist, to inform where new theory is needed, or old theory may be wanting. Experimental
psychologists are often interested in knowing the limitations of their models and results in real-world settings,
and by providing such feedback the engineering psychologist helps ensure that application is considered.

1.2 Human Performance


The second part of the title of the book, human performance, also deserves some explanation. Here, our
emphasis is on the quality of performance (e.g., better or worse), and we typically think of measures of
“the big three”:
• Speed (faster is better),
• Accuracy (higher is better) and
• Attentional demand (less is generally better).
Thus, we might think of the perfect principle in engineering psychology as one which, if applied to design,
will allow the user to perform a task more rapidly, more accurately, and with reduced attentional demand (so
that other tasks can be done concurrently).
Of course, as we will see, these measures may often trade off in practice. Furthermore, engineering psychologists are quite interested in many cognitive phenomena that are not directly reflected in
performance, such as the degree of learning or memory of a concept, the quality of a mental model about a
piece of equipment, the level of situation awareness about a process, or the level of overconfidence in a
decision. Still, all of these cognitive phenomena may ultimately be expressed in some measure of performance
in the workplace, and so long as they are, such intervening variables lie very much at the heart of human
performance theory.

2. RESEARCH METHODS
Many different research methods can be employed to help discover, formulate, and refine theory-based
principles regarding “what works” to support human performance. These can be roughly laid out on a
continuum, from laboratory experiments, to human-in-the-loop simulations, to field studies, to actual real-world observations (Wickens & Hollands, 2000). The latter may come from surveys of users, observational
studies, and case studies (analyses) of major accidents and serious incidents. In some professions like health
care and aviation, a corpus of minor incidents is also available to create a large database of human performance issues, like errors, that occur in the workplace. Each method has strengths and weaknesses.
There is no “best” technique, as attributes like cost, fidelity to the workplace and so forth trade off along the
continuum, and an effective engineering psychologist needs to be aware of the different methods, the various
studies that have been conducted in a particular domain, and be able to interpret their results appropriately.
To this arsenal of research methods, we add two that are becoming increasingly useful in engineering
psychology research. Both will be featured in the forthcoming chapters. First, meta-analyses (Egger & Smith, 1997; Glass, 1976; Rosenthal & DeMatteo, 2001; Wolfe, 1986) provide a way of extracting and integrating
quantitative data from a set of studies in order to derive the “collective wisdom” of those studies on a
particular research issue, such as whether one training method is better than another (and if so, how much
better). While these may be time-consuming to conduct, they avoid many of the complexities of collecting
data from human participants, and they often do a great job at capturing the quantitative flavor of past results.
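To give a flavor of the arithmetic involved, the following is a minimal sketch of an inverse-variance (fixed-effect) meta-analysis in Python; the three study effect sizes and variances are invented purely for illustration.

```python
# Minimal fixed-effect meta-analysis sketch (illustrative, invented numbers).
# Each hypothetical study contributes an effect size d and its sampling
# variance v; the pooled estimate weights each study by 1/v.
studies = [
    (0.45, 0.04),   # (effect size d, variance of d) -- hypothetical study 1
    (0.30, 0.09),   # hypothetical study 2
    (0.60, 0.02),   # hypothetical study 3
]

weights = [1.0 / v for _, v in studies]
pooled = sum(w * d for (d, _), w in zip(studies, weights)) / sum(weights)
pooled_se = (1.0 / sum(weights)) ** 0.5

print(f"Pooled effect size: {pooled:.2f} (SE = {pooled_se:.2f})")
```

Under this weighting, more precise studies (those with smaller variance) pull the pooled estimate more strongly toward their own effect size.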

Second, computational models (Gray, 2007; Pew & Mavor, 1998) are convenient ways of simulating human behavior and cognition through software. For relatively simple forms of human behavior, such as moving a mouse cursor to a target, or searching a list for a needed item, these can offer close approximations to human
performance without the requirement for data collection.
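For the mouse-pointing example, one familiar computational model is Fitts' law, which predicts movement time from target distance and width. The sketch below is only an illustration: the coefficients a and b are placeholder values that would normally be fit to observed data.

```python
import math

def fitts_movement_time(distance, width, a=0.1, b=0.15):
    """Predicted movement time (s) from Fitts' law: MT = a + b * log2(2D/W).
    The constants a and b are device- and user-specific; the defaults here
    are placeholders for illustration, not fitted values."""
    index_of_difficulty = math.log2(2.0 * distance / width)
    return a + b * index_of_difficulty

# Example: moving a cursor 300 pixels to a 20-pixel-wide button
print(f"Predicted movement time: {fitts_movement_time(300, 20):.2f} s")
```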

3. A MODEL OF HUMAN INFORMATION PROCESSING


Knowing the different dimensions of performance (e.g., speed and accuracy) that can be measured in different
research environments (e.g., lab, field studies) can assist the human factors engineer in understanding how
much performance is changed by system design or environmental differences. But such knowledge is not
always sufficient for the engineering psychologist, who is interested in why performance might be changed.
For example, a new interface for a car radio control might invite errors because the:
• Control cannot be touched without bumping another one,
• Control is too sensitive,
• Driver is confused about which way to adjust the control to increase frequency, or
• Driver cannot understand the icon on the control.
The distinctions between the different psychological and motor processes affected by design are of critical
importance because, on the one hand, they link to basic psychological theory and, on the other hand, they can
help identify different sorts of design solutions.
A model of human information processing stages, shown in Figure 1.1, provides a useful framework
for analyzing the different psychological processes used in interacting with systems and for carrying out a task
analysis, as well as a framework for the organization of the chapters in this book. The model depicts a series
of processing stages or mental operations that typically (but not always) characterizes the flow of information
as a human performs tasks. Consider as an example the task of driving toward an intersection. As shown on
the left of Figure 1.1, events in the environment are first processed by our senses—sight, sound, touch, etc.—
and may be held briefly in short term sensory store (STSS) for no more than about a second. Thus the driver
approaching the intersection will see the traffic light, the flow of the environment past the vehicle, and other
cars, and may hear the radio and the conversation of a passenger.
But sensation is not perception, and of this large array of sensory information only a smaller amount may
be actually perceived (e.g., perceiving that the light has turned yellow). Perception involves determining the
meaning of the sensory signal or event, and such meaning is, in turn, derived from past experience (a yellow
light means caution). As we see below, this past experience is stored in our long term memory of facts,
images, and understanding of how the world works.
After perception, our information processing typically follows either (or both) of two paths. At the
bottom, perceiving (understanding) a situation will often trigger an immediate response, chosen or selected
from a broader array of possibilities. Here the driver may choose to depress the accelerator or apply the brake,
a decision based on a variety of factors, but one that must be made rapidly. Then, following response
selection, the response is executed in stage 4 of our sequence in a manner that not only involves the muscles,
but also the brain control of those muscles.

Figure 1.1 A model of human information processing stages.

But perception and situation understanding do not always trigger an immediate response. Following the
upper path from perception, the driver may use working memory to temporarily retain the state of the light
(yellow), while scanning the highway and the crossing road ahead for additional information (e.g., an
approaching vehicle or a possible police car). In fact, in many cases an overt action does not follow perception
at all. As you sit in lecture you may hear an interesting fact from the lecturer, but choose not to take notes on
it (no response selection and execution), but rather to ponder it, rehearse it, and learn it. That is, to use
working memory to commit the information to long term memory, for future use on an exam or in
applications outside of the classroom. Thus, the function of working memory is not just to store information,
but also to think about it: the process of cognition.
At this point we note that the processes of perception and working memory are not as distinct from each
other as the separate boxes would suggest. There is a fuzzy boundary between them, and hence this second
stage—after sensation, but before response selection— can often be described as “cognition,” generically
describing the interpretation of sensed material, sometimes rapidly as the traffic light and sometimes more
slowly as the idea presented by the lecturer.
To this four-stage + memory model, we add two vital elements, feedback and attention. First, in many
(but not all) information processing tasks, an executed response changes the environment, and hence creates a
new and different pattern of information to be sensed, as shown by the feedback loop at the bottom. Thus, if
the driver applies the accelerator, this will not only increase the perceived speed of the car, but also may
reveal new sensory information (e.g., a police car is suddenly revealed waiting behind a sign), which in turn
may require a revision of the stop-go response choice.
Second, attention is a vital tool for much of information processing, and here it plays two qualitatively
different roles (Wickens & McCarley, 2008). In its first role as a filter of information that is sensed and
perceived, attention selects certain elements for further processing, but blocks others, as represented in Figure
1.1 by the smaller output from perception than input to it. Thus, the driver may focus attention fully on the
traffic light, but “tune out” the conversation of the passenger or fail to see the policeman. In the second role
attention acts as a fuel that provides mental resources or energy to the various stages of information
processing, as indicated by the dashed lines flowing from the supply of resources at the top. Some stages
demand more resources in some tasks than others. For example, peering at the traffic light through a fog will
require more effort for perception than seeing it on a clear, dark night. However, our supply of attentional
resources is limited, and hence the collective resources required for one task may not allow enough for a
concurrent one, creating a failure in multi-tasking.
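To make the structure of Figure 1.1 concrete, the following toy sketch (purely illustrative; the demand values are invented) treats attention as a limited pool of resources shared across stages: when the summed demand of the stages engaged by concurrent tasks exceeds the available supply, multi-tasking suffers.

```python
# Toy sketch of the Figure 1.1 stage model; the numbers are invented for
# illustration and do not come from any fitted or validated model.
ATTENTION_CAPACITY = 1.0  # attention as a limited supply of "fuel"

def total_demand(stage_demands):
    """Sum the resource demand imposed on each processing stage."""
    return sum(stage_demands.values())

# Hypothetical demands: driving toward an intersection in fog while also
# following a passenger's conversation.
demands = {
    "perception": 0.6,          # peering at the light through fog is effortful
    "working memory": 0.3,      # holding the light's state while scanning
    "response selection": 0.2,  # choosing brake versus accelerator
}

if total_demand(demands) > ATTENTION_CAPACITY:
    print("Demand exceeds capacity: expect degraded multi-tasking")
```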
While Figure 1.1 provides a useful framework for conceptualizing information processing (and the
organization of this book), it should not be taken too literally (Wickens & Carswell, 2012). Thus, although
the primary operations associated with the different stages are somewhat associated with different brain
structures (see Chapters 10 and 11), the association is not crisp, nor must the stages operate in strict sequence.
Thus, the student in a lecture may in parallel rehearse the lecturer’s words and write them down. And, of
course, the major feedback loop at the bottom means there is no fixed “start” and “end” point to the
information processing sequence. After all, a task might be initiated by an inspiration, thought, or intention to
do something, originating from long term memory, flowing to working memory, and then to response, with no
perceptual input whatsoever. Nevertheless, as we will see, the stage distinction is quite useful in analyzing
tasks, describing principles, recommending solutions and, in many cases, developing the theories upon which
engineering psychology is based.
The model shown in Figure 1.1 also provides a framework for organizing the chapters of the book. In
Chapter 2 we discuss basic perception in terms of the detection of signals and the classification of stimuli
varying along one or more dimensions. In Chapter 3 we consider the attention filter, the selective aspects of
attention. Chapters 4, 5, and 6 address the more complex aspects of perception and cognition that are relevant
to the design of displays for space and spatial operations (Chapter 4), including manual control (Chapter 5)
and language (Chapter 6). Chapter 7 addresses the role of cognition and both working memory and long term
memory and their relevance to learning and training. Chapters 8 and 9 address the selection of action. In
Chapter 8 this selection involves the deliberate process of decision making, which heavily involves working
memory. In Chapter 9 it represents more rapid actions such as that taken at the traffic light. Chapter 10
addresses the issues of multi-tasking, as various combinations of stages need to compete with each other for
the limited “fuel” of attention resources. In Chapter 11 we address issues of mental workload, stress, and
individual differences, from the perspective of neuroergonomics. In Chapter 12 we consider issues of human-
automation interaction, and a final short chapter summarizes some key themes.

4. PEDAGOGY OF THE BOOK
There are a few critical features that we would like to highlight to our readers before they jump into the
following chapters.
First, we have tried to cite a large amount of literature to indicate the wealth of research that lies behind
the concepts, principles, and findings that we present. In doing so, we have tried to emphasize “take home
messages” from the collective body of research, more than the specific methods and findings from a single
study. In so doing, we may have glossed over details of particular studies, but we think we have been true to
the studies’ main conclusions. Our extensive reference list will allow the curious reader to delve into greater
detail for any specific topic he or she desires. Many former students using previous editions of this text are
now engineering psychologists or human factors practitioners themselves; a common remark is that the book
remains a useful reference for their professional career, long after they have taken the course.
Second, the reader will detect a rich network of cross-references between chapters. We hope that any
distraction this may cause will be offset by a realization of the complexity of human performance, and how
interwoven the performance components are in their application to the workplace. As just one example, we
find that the cognitive phenomenon of overconfidence keeps re-appearing in different guises, across different
stages and types of human performance and cognition (and therefore different chapters).
Third, the reader will note the distinction between our use of boldface and italics. The former is used to
highlight new key terms or concepts [that are listed at the end of each chapter] whereas the latter is used to
emphasize a word or phrase that should already be familiar to the reader.
Finally, as befits the distinction between engineering psychology and human factors, we tend to
emphasize the general principles that support effective human performance (Peacock, 2009) more than
specific design examples (although we do not neglect the latter). It is hoped that the material in this book
provides an effective “hand-off” to those truly interested in design applications, who can then follow these up
in more applied human factors treatments (e.g., Salvendy, 2012; Wickens, Lee, Liu, & Gordon-Becker, 2004; Peacock, 2009; Proctor & van Zandt, 2008).
In summary, we hope that our approach provides a distinctive counterpoint to the existing literature. The
audience is intended to be graduate students or upper-division undergraduates, with a background in human
science (e.g., psychology, cognitive science, kinesiology) or applied science (engineering, computer science).
The science student may be more interested in how what is known about information processing and human
performance can be applied in real-world situations. The engineering student will likely be more interested in
knowing more about psychology and its theory and why it matters to the design of engineered products and
systems. We hope that both students find that the book has an appropriate balance of these qualities.

Key Terms
applied psychology 1
attention 5
cognition 5
cognitive engineering 1
cognitive ergonomics 1
computational models 3
engineering psychology 1
ergonomics 1
experimental control 2
feedback 5
filter 5
fuel 5

human information processing 4
human performance 2
intervening variables 3
learn 5
long term memory 4
major accidents 3
mental resources 5
meta-analyses 3
overconfidence 6
perception 4
response selection 4
senses 4
short term sensory store (STSS) 4
working memory 5

2 SIGNAL DETECTION AND ABSOLUTE
JUDGMENT

1. OVERVIEW
Information processing in most systems begins with the detection of an environmental event. In a major
catastrophe, the event is so noticeable that immediate detection is assured. However, there are many other
circumstances in which detection itself represents a source of uncertainty or a potential bottleneck in
performance because it is necessary to detect events that are near the threshold of perception. Will the
baggage inspector detect the utility knife in the suitcase? Will the radiologist detect the abnormal X-ray as it is
scanned?
This chapter will first deal with the signal detection situation in which an observer classifies the world
into one of two states: A signal is said to be present or absent. The detection process will be modeled within
the framework of signal detection theory (SDT), and we will show how the model can assist engineering
psychologists in understanding the complexities of the detection process, in diagnosing what goes wrong
when detection fails, and in recommending corrective solutions. We will also consider a few modern variants
of SDT, when they might best apply, and consider how that changes how we interpret the signal detection
situation.
When perception involves more than two states of categorization, we move into the realm of
identification. We will consider first the simplest form of multilevel categorization, the absolute judgment
task. Then, we shall examine the more complex multidimensional stimulus judgment. Finally, a supplement to
this chapter presents the more technical details of information theory, which describes an alternative way of
quantifying and modeling perceptual errors.

2. SIGNAL DETECTION THEORY


2.1 The Signal Detection Paradigm
Signal detection theory (SDT) is applicable in any situation in which there are two discrete states of the world
(signal and noise) that cannot easily be discriminated. Importantly, SDT can be applied equally well to the
analysis of detection performance by a human operator alone, by a machine or automated detector, or by both
human and machine (Parasuraman, 1987; Sorkin & Woods, 1985; Swets, 1998). The process of signal
detection results in two response categories: yes (the signal is present) and no (the signal is not present). This
simple situation turns out to underlie many occupational tasks. For example,
• the detection of a concealed weapon by an airport security guard (McCarley et al., 2004),
• the detection of a contact on a radar scope (Mackworth, 1948),
• the detection of a malignant tumor on an X-ray plate by a radiologist (Swets, 1998),
• the detection of a system malfunction by a nuclear plant supervisor (Sorkin & Woods, 1985),
• the identification of a target on a battlefield (Hollands & Neyedli, 2011),
• the detection of a critical event in air traffic control (Metzger & Parasuraman, 2001),
• the detection of an untruthful statement with a polygraph (Ben-Shakhar & Elaad, 2003),
• detecting when a driving situation is hazardous (Wallis & Horswill, 2007),
• determining whether it is safe to proceed through a railroad crossing (Yeh, Multer, & Raslear, 2009),
• detecting a crack on the body of an aircraft (Drury, 2001; Swets, 1998).
In each example there are two possible states of the world, and the fallible observer is responsible for deciding
which state has occurred.
The combination of states of the world and response categories produces the 2 × 2 table shown in Figure
2.1, generating four classes of joint events, labeled hits, misses, false alarms, and correct rejections. Perfect
performance occurs when no misses or false alarms occur. In many situations, however, it is not easy to

distinguish the signal from the non-signal (noise) state. The signal may not be that intense, the operator may
be suffering from fatigue, or the signal may be defined by a complex combination of cues. Thus, misses and
false alarms do occur, and so there are normally data in all four cells. In SDT these values are typically
expressed as probabilities, by dividing the number of occurrences in a cell by the total number of occurrences
in a column. Thus, if 20 signals were presented, and there were 5 hits and 15 misses, we would write P(Hit) = P(H) = 5/20 = 0.25, the hit rate. If 10 noise trials were presented and half of them resulted in “yes” responses, we would write P(FA) = 0.50, the false alarm rate. In some situations outside the laboratory, we do not know
the actual frequency of “noise trials.” In these cases, we look specifically at all situations where the operator
has said “yes,” and then determine if there was a signal presented. Thus, false alarm rate is defined as the
probability of a non-signal given a yes response (e.g., 75 percent of the non-signals were given a yes
response).
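The following small worked example (in Python, using the same counts as in the text) shows how the cell counts of Figure 2.1 convert into the hit and false alarm rates.

```python
# Convert the four signal-detection counts into rates (counts mirror the text).
hits, misses = 5, 15                       # 20 signal trials
false_alarms, correct_rejections = 5, 5    # 10 noise trials

p_hit = hits / (hits + misses)                              # 5/20 = 0.25
p_fa = false_alarms / (false_alarms + correct_rejections)   # 5/10 = 0.50

print(f"P(H) = {p_hit:.2f}, P(FA) = {p_fa:.2f}")
```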

FIGURE 2.1 The four outcomes of signal detection theory.

The SDT model (Green & Swets, 1966; Macmillan & Creelman, 2005; T. D. Wickens, 2002) assumes
that there are two stages of information processing in the task of detection: (1) sensory evidence is aggregated
concerning the presence or absence of the signal, and (2) a decision is made about whether this evidence came
from a signal or not. We label this evidence variable “X.” Therefore, on average X should be greater when a
signal is present than when it is absent. (We might think of X as the level of activity in some brain region.)
The activity increases in magnitude with stimulus intensity. Therefore, if there is enough activity, X exceeds a
critical threshold XC, and the operator decides “yes.” If there is too little, the operator decides “no.”
The value of X varies continuously even in the absence of a signal because of random variations in the
environment and in brain activity (e.g., the “noise” activity in the sensory channels and the brain). This
variation is shown in Figure 2.2. Therefore, even when no signal is present, X will sometimes exceed the
criterion XC as a result of random variations alone, and the observer will say “yes” (generating a false alarm at
point A of Figure 2.2). Correspondingly, even with a signal present, the random level of activity may be low,
causing X to be less than the criterion, and the observer will say “no” (generating a miss at point B of Figure
2.2). The smaller the difference in intensity between signals and noise, the greater these error probabilities
become because the amount of variation in X resulting from randomness increases relative to the amount of
energy in the signal. In Figure 2.2 the average level of X is increased slightly in the presence of a weak signal
and greatly when a strong signal is presented.
For example, consider the air traffic controller monitoring a noisy radar screen. Somewhere in the midst
of the random variations in stimulus intensity caused by reflections from clouds and rain, there is an extra
increase in intensity that represents the presence of the signal—an aircraft. The amount of noise will not be
constant over time but will fluctuate; sometimes it will be high, completely masking the stimulus, and
sometimes low, allowing the plane to stand out. In this example, “noise” varies in the environment. Now
suppose instead you were standing watch on a ship, searching the horizon on a dark night for a faint light. It
becomes difficult to distinguish the flashes that might be real lights from those that are just “visual noise” in
your own sensory system. In this case, the random noise is internal. Thus “noise” in SDT is a combination of
noise from external and internal sources.
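This two-stage account can be illustrated with a small Monte Carlo sketch (a simplification that assumes Gaussian noise of unit variance; the signal strength and criterion values are arbitrary): on each trial the evidence X is random noise plus, on signal trials, an added signal strength, and the observer says “yes” whenever X exceeds the criterion XC.

```python
import random

def detection_trial(signal_present, signal_strength=1.0, criterion=0.5):
    """One simulated trial: evidence = Gaussian noise (internal plus external,
    lumped together as a simplification) plus the signal strength if a signal
    is present. Respond "yes" when the evidence exceeds the criterion XC."""
    evidence = random.gauss(0.0, 1.0)
    if signal_present:
        evidence += signal_strength
    return evidence > criterion

# Estimate hit and false alarm rates over many simulated trials
n = 10_000
p_hit = sum(detection_trial(True) for _ in range(n)) / n
p_fa = sum(detection_trial(False) for _ in range(n)) / n
print(f"P(H) ~ {p_hit:.2f}, P(FA) ~ {p_fa:.2f}")
```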

FIGURE 2.2 The change in the evidence variable X caused by a weak and a strong signal. Notice that with the weak signal, there can
sometimes be less evidence when the signal is present (point B) than when the signal is absent (point A).

FIGURE 2.3 Hypothetical distributions underlying signal detection theory: (a) high sensitivity; (b) low sensitivity.

In SDT, we represent signal and noise as a pair of normal distributions. Figure 2.3 shows the probability
of observing a specific value of X, given that a noise trial (left curve) or signal trial (right curve) in fact
occurred. These data might have resulted from the evidence variable graph in Figure 2.2 by counting the
relative frequency of different X values during the intervals when the signal was off, creating the probability
curve on the left of Figure 2.3; then making a separate count of the probability of different X values while the
weak signal was on, generating the curve on the right of Figure 2.3. As the value of X increases, it is more
likely to have been generated while a signal was present.
When the probability that X was produced by the signal equals the probability that it was produced by
only noise, the signal and noise curves intersect. Let’s assume that the criterion value XC chosen by the

operator is set to this point. To represent this, we have drawn a vertical line at this location in Figure 2.3. All
X values to the right (X > XC) will cause the human operator to respond “yes.” All to the left generate “no”
responses. If a machine detection system is being analyzed, XC is set by another human, typically the system
designer (see also the section on “Alarms and Alerts” later in this chapter), but again the system will respond
with a “yes” response when X exceeds XC. Four areas under the curves are produced, representing the
probabilities of hits, misses, false alarms, and correct rejections. Since the total area within each curve is 1.0,
the two shaded regions within each curve must sum to 1.0. That is, P(H) + P(M) = 1 and P(FA) + P(CR) = 1.
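Under the equal-variance Gaussian picture of Figure 2.3, these four probabilities are simply areas under normal curves and can be computed directly. The sketch below uses SciPy's normal distribution; the means, standard deviation, and criterion are arbitrary illustrative values.

```python
from scipy.stats import norm

noise_mean, signal_mean, sd = 0.0, 1.5, 1.0   # arbitrary illustrative values
x_c = 0.75                                    # criterion XC between the curves

p_fa = 1 - norm.cdf(x_c, loc=noise_mean, scale=sd)    # noise area right of XC
p_hit = 1 - norm.cdf(x_c, loc=signal_mean, scale=sd)  # signal area right of XC
p_cr, p_miss = 1 - p_fa, 1 - p_hit                    # complements within curves

print(f"P(H)={p_hit:.2f}  P(M)={p_miss:.2f}  P(FA)={p_fa:.2f}  P(CR)={p_cr:.2f}")
```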

2.2 Setting the Response Criterion: Optimality in SDT


In any signal detection task, observers may vary in their response bias or criterion. For example, they may be
“liberal” or “risky”: prone to saying yes, and therefore detecting most of the signals that occur, but also
making many false alarms. Alternatively, they may be “conservative,” saying no most of the time and making
few false alarms but missing many of the signals.
Sometimes circumstances dictate whether a conservative or a risky strategy is best. For example, when
the radiologist scans the X-ray of a patient who has been referred because of other symptoms of illness, it is
better to be biased to say yes (i.e., “you have a tumor”) than when examining the X-ray of a healthy patient,
for whom there is no reason to suspect any malignancy (Swets & Pickett, 1982). Consider, on the other hand,
the monitor of the power-generating station who has been cautioned repeatedly by the supervisor not to shut
down a turbine unnecessarily, because of the resulting loss of revenue to the company. The operator will
probably become conservative in monitoring the dials and meters for a malfunction and may be prone to miss
(or delay responding to) a malfunction when it does occur.
In Figure 2.3, the decision criterion XC was placed in a neutral location where the two distributions meet.
If instead XC is placed to the right, then more evidence is required for it to be exceeded, and most responses
would be “no” (conservative responding). Such a strategy will result in few false alarms but at the potential
cost of fewer hits. If placed to the left, less evidence is required and most responses would be “yes”. This
strategy is more risky (produces more false alarms) but has the benefit of increasing the number of hits. An
important variable that is positively correlated with XC is beta, which can be defined as the ratio of neural
activity produced by signal and noise at XC:

beta = P(XC | S) / P(XC | N)    (2.1)

This is the ratio of the ordinate of the two curves of Figure 2.3, for a given level of XC. Thus both beta and XC
define the response bias or response criterion.
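
To make the relation between XC and beta concrete, the following minimal Python sketch (not part of the original analysis; all numbers are illustrative) computes beta as the ratio of the signal and noise ordinates at a criterion XC, assuming the equal-variance normal distributions of Figure 2.3 with the noise mean at 0 and the signal mean at d′:

import math

def normal_pdf(x, mean, sd=1.0):
    # Ordinate (height) of a normal distribution at x
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def beta_at_criterion(xc, d_prime):
    # Equation 2.1: ratio of the signal ordinate to the noise ordinate at Xc
    return normal_pdf(xc, d_prime) / normal_pdf(xc, 0.0)

# With d' = 1, the curves intersect at Xc = 0.5, where beta = 1 (neutral);
# moving Xc rightward (conservative) raises beta, leftward (risky) lowers it.
print(beta_at_criterion(0.5, 1.0))    # 1.0
print(beta_at_criterion(1.5, 1.0))    # about 2.7 (conservative)
print(beta_at_criterion(-0.5, 1.0))   # about 0.4 (risky)
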
An important contribution of SDT is that it can prescribe where the optimum beta should fall given (1)
the likelihood of observing a signal and (2) the costs and benefits (payoffs) of the four possible outcomes
(Green & Swets, 1966; Swets & Pickett, 1982). We shall first consider the effect of signal probability, then
payoffs, on the optimal setting of beta.

2.2.1 SIGNAL PROBABILITY In the situation in which signals occur just as often as they do not, it can be shown
that the particular symmetrical geometry of Figure 2.3 dictates that optimal performance will occur when XC
is placed at the intersection of the two curves; that is, when beta = 1. Any other placement produces more
errors in the long run. However, if a signal is more likely than not, the criterion should be lowered. For
example, if the radiologist has other information to suggest that a patient is likely to have a malignant tumor,
the physician should be more likely to categorize an abnormality on the X-ray as a tumor than to ignore it as
mere noise in the X-ray process. On the other hand, if signal probability is reduced, then beta should be
adjusted conservatively (increased). For example, suppose an inspector searching for defects in
microprocessors is told that the current batch has a low estimated fault frequency because the manufacturing
equipment just underwent maintenance. In this case, the inspector should be more conservative in searching
for defects. Formally, this adjustment of the optimal beta in response to changes in signal and noise
probability is represented by the prescription

βopt = P(N) / P(S)    (2.2)

This quantity will be reduced (made riskier) as P(S) increases, thereby moving the value of XC producing
optimal beta to the left of Figure 2.3. If this setting is adhered to, performance will maximize the number of
correct responses (hits and correct rejections). Note that the setting of optimal beta will not produce perfect
performance. There will still be false alarms and misses as long as the two curves overlap. However, optimal
beta is the best that can be expected for a given signal strength and a given level of human or machine
sensitivity.
The formula for beta (Equation 2.1) and the formula for optimum beta (Equation 2.2) are sometimes
confused. βopt defines where beta should be set and is determined by the ratio of the probability with which
noise and signals occur in the environment. In contrast, where beta is set by an observer is determined by the
ratio of probabilities of X given signal and noise. These values are inferred from empirical data (i.e., the
proportion of hits and false alarms produced by an observer in a given situation).

2.2.2 PAYOFFS The optimal setting of beta is also influenced by payoffs. In this case, optimal is no longer
defined as the value of beta that minimizes errors, but that which maximizes the expected value, which refers
to the total expected financial gains (or losses). If it were important for signals never to be missed, the
operator might be given high rewards for hits and high penalties for misses, leading to a low setting of beta.
This payoff would be in effect for a quality control inspector who is admonished by the supervisor that severe
costs in company profits (and the monitor’s own paycheck) will result if faulty microchips pass through the
inspection station. The monitor would therefore be more likely to discard good chips (a false alarm) in order
to catch all the faulty ones. Conversely, in different circumstances, if false alarms are to be avoided, they
should be heavily penalized. These costs and benefits can be translated into a prescription for the optimum
setting of beta by expanding Equation 2.2 to:

βopt = [P(N) / P(S)] × [(V(CR) + C(FA)) / (V(H) + C(M))]    (2.3)

where V is the value of desirable events (hit, H, or correct rejection, CR), and C is the cost of undesirable
events (false alarm, FA, or miss, M), both entered as positive magnitudes, so that larger costs increase the
value of the numerator or denominator, as the case may be. An increase in the denominator will decrease the
optimal beta and should lead to risky responding. Conversely, an increase in the numerator should lead to
conservative responding. Notice also that the value and probability portions of the function combine
independently. An event like the malfunction of a turbine may occur only occasionally, thereby raising the
optimum beta as determined by probabilities; however, if the cost of a miss in detecting the malfunction was
severe, the net effect might still be to set optimal beta to a relatively low value, as cost dominates probability
in this example. That is, many false alarms are optimal in such circumstances if that avoids misses.
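
A short sketch (with made-up probabilities and payoffs) shows how Equations 2.2 and 2.3 combine this information; as in the equation above, V and C are entered as positive magnitudes:

def beta_opt(p_signal, v_cr=1.0, c_fa=1.0, v_hit=1.0, c_miss=1.0):
    # Equation 2.3; with all payoffs equal it reduces to Equation 2.2
    return ((1.0 - p_signal) / p_signal) * ((v_cr + c_fa) / (v_hit + c_miss))

# A turbine malfunction occurring on 1 percent of monitoring intervals,
# considered by probability alone, prescribes a very conservative criterion:
print(beta_opt(0.01))                 # 99.0
# If a missed malfunction is 100 times as costly as a false alarm, the
# optimal criterion becomes far riskier, as cost dominates probability:
print(beta_opt(0.01, c_miss=100.0))   # about 1.96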

2.2.3 HUMAN PERFORMANCE IN SETTING BETA. The actual value of beta that an operator uses can be computed
from the number of hits and false alarms obtained from a series of detection trials. The Appendix shows how
to compute beta (and sensitivity, to be discussed subsequently). Therefore, we may ask how well humans
actually perform in setting their criteria in response to changes in payoffs and probabilities, relative to optimal
beta. Humans do adjust beta as dictated by changes in these quantities. However, laboratory experiments have
shown that beta is not adjusted as much as it should be. That is, subjects demonstrate a sluggish beta, as
shown in Figure 2.4. They are less risky than they should be if the optimal beta is low, and less conservative
than they should be if the optimal beta is high. As shown in Figure 2.4, the sluggishness is found to be more
pronounced when beta is manipulated by probabilities than by payoffs (Green & Swets, 1966).
A number of explanations have been proposed to account for why beta is sluggish in response to
probability manipulations. Laming (2010) suggests that observers tend to probability match, rather than
actually set an objective criterion. This means that they try to balance their errors so that P(FA) = P(Miss),
even when signals become very unlikely. Further, participants can remember events from only
about two to three trials earlier (see Chapter 7 when we talk about working memory). Sluggish beta may
therefore result from observers making errors, obtaining feedback, and then adjusting their responding trial to
trial to minimize the likelihood of making that error again. Another explanation may be that the operator
misperceives probabilistic data. There is evidence that people tend to overestimate the probability of rare
events and underestimate that of frequent events (Erlick, 1964; Hollands & Dyre, 2000; Peterson & Beach,
1967; Sheridan & Ferrell, 1974). This behavior, to be discussed in more detail in Chapter 8, would produce
the observed shift of beta toward unity.

30
FIGURE 2.4 The relationship between obtained and optimal decision criteria. Illustrates the phenomenon of sluggish beta.

There is evidence for sluggish beta in the world beyond the laboratory. Harris and Chaney (1969),
describing performance of inspectors in a Kodak plant, report that as the defect rate falls below about 5
percent, inspectors fail to raise beta as much as they should, very clearly demonstrating a sluggish beta. Karsh et al.
(1995) had soldiers judge the identity of target vehicles, which were either friendly (US Army) or enemy
tanks. Although the optimal beta was very low (0.1), soldiers did not reduce their beta nearly as much as this
optimal (Hollands and Neyedli, 2011). Chi and Drury (1998) had observers scan the identification codes on
integrated circuit boards, and varied the likelihood of defective boards, as well as the costs and rewards
associated with the different outcomes. Sluggish beta was again evident, with the slope of empirical values
about half of the optimal slope (i.e., the solid diagonal line in Figure 2.4).
In an examination of criterion setting in product inspection, Botzer et al. (2010) had participants familiar
with the quality control process explicitly set the threshold for an automated alarm system based on a given
set of probabilities and payoffs. The threshold settings were generally non-optimal, with clear indication of a
sluggish beta pattern. The participants reported using a strategy in which they set a threshold limiting error
rates to around .05 (a level commonly used for statistical significance). This may have contributed to the
sluggish beta result. Importantly, Botzer et al. also found that if predictive value information was provided
(e.g., the probability that the product is faulty given an alarm), participants adjusted their criterion more
optimally than if diagnostic value information was provided (e.g., the probability of an alarm given a faulty
product). What this means is that it is better to give the user concrete information about the actual situation in
the plant, given that an alarm has occurred, rather than provide information about how diagnostic the alarm is.

2.3 Sensitivity
An important contribution of SDT is that it has made a clear distinction between response bias and an
operator’s sensitivity, the keenness or resolution of the detection mechanisms. It can distinguish whether
misses result because of high beta or low sensitivity.
Sensitivity refers to the separation of noise and signal distributions along the X axis of Figure 2.3. If the
separation is large (top of figure), sensitivity is high. A given value of X is quite likely to be generated by
either S or N but not both. If the separation is small (bottom of figure), sensitivity is low. Since the curves
represent hypothetical brain activity, their separation could be reduced either by physical properties of the
signal (e.g., a reduction in its intensity or salience) or by properties of the observer (e.g., a loss of hearing for
an auditory detection task or a lack of training of a medical student for the task of detecting complex tumor
patterns on an X-ray, or simply a poor memory of what the signal looked like). The formal sensitivity measure
therefore corresponds to the separation of the means of two distributions expressed in units of their standard
deviations, and is called d′. For most signal detection theory applications d′ varies between 0.5 and 2.0.
Extensive tables for both d′ and beta can be found in Macmillan and Creelman (2005). The Appendix
describes how to compute the measures.
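
As a sketch of the computation described in the Appendix (assuming equal-variance normal distributions), d′ and beta can be obtained from an observed hit rate and false-alarm rate with the standard normal transform Z, since d′ = Z(H) − Z(FA) and beta is the ratio of the ordinates at the criterion; the rates used below are hypothetical:

from statistics import NormalDist

def d_prime_and_beta(p_hit, p fa):
    pass  # placeholder removed below

from statistics import NormalDist

def d_prime_and_beta(p_hit, p_fa):
    std_normal = NormalDist()
    z_hit = std_normal.inv_cdf(p_hit)     # Z(H)
    z_fa = std_normal.inv_cdf(p_fa)       # Z(FA)
    d_prime = z_hit - z_fa
    beta = std_normal.pdf(z_hit) / std_normal.pdf(z_fa)   # ratio of ordinates at the criterion
    return d_prime, beta

print(d_prime_and_beta(0.80, 0.20))   # d' about 1.68, beta = 1.0 (neutral)
print(d_prime_and_beta(0.60, 0.05))   # d' about 1.90, beta about 3.7 (conservative)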

3. THE ROC CURVE


3.1 Theoretical Representation
A graph known as the receiver operating characteristic (ROC) curve is useful for understanding the joint
effects of sensitivity and response bias on data from a signal detection analysis. In Figure 2.1 we presented the
four outcomes that can occur in a SDT experiment. Of the four values, only two are critical. These are
normally P(H) and P(FA) (since P(M) and P(CR) are then redundantly specified as 1-P(H) and 1-P(FA),
respectively). The ROC curve plots P(H) against P(FA) for different settings of the response criterion, at a
constant level of sensitivity. As the criterion is moved to different locations along the X axis of Figure 2.3, a
different set of values will be generated in the matrix of Figure 2.1. Each of the boxes in Figure 2.5 shows the
relation between a matrix of data and the signal and noise distributions. More importantly, Figure 2.5 also
shows the relation between the data matrix (Figure 2.1), the distributions (Figure 2.3), and the ROC curve.
Each signal detection condition (each matrix) generates one point on the ROC. If the signal strength and
the observer’s sensitivity remain constant, changing beta from one condition to another (either through
changing payoffs or varying signal probability) will produce a curved set of points called an ROC curve, or
alternatively an isosensitivity curve (because points falling on the curve have the same sensitivity). Points in
the lower left of Figure 2.5 represent conservative responding; points in the upper right represent risky
responding. One can see from the figure that sweeping the criterion placement X in Figure 2.3 across the
distributions from left to right produces progressively more “no” responses and moves us along the ROC
curve from upper right to lower left.

FIGURE 2.5 The ROC (receiver operating characteristic) curve. For the three boxes on the left, sensitivity is high, and the criterion is shifted
from a low to a high value. These are mapped to their respective positions on the ROC curve. On the right, the box showing one point of lower
sensitivity is similarly mapped to its position in ROC space.

It is time-consuming to carry out the same signal detection experiment several times, each time changing
only the response criterion by a different payoff or signal probability. A more efficient means of collecting
data from several criterion settings is to have the observer provide a rating of confidence that a signal was
present (Green & Swets, 1966). If three confidence levels are employed (e.g., “1” = confident that no signal
was present, “2” = uncertain, and “3” = confident that a signal was present) the data may be analyzed twice in
different ways, as shown in Table 2.1. During the first analysis, levels 1 and 2 are classified as “no” responses
and level 3 as a “yes” response. This classification corresponds to a conservative beta setting (roughly two-
thirds of the responses would be called “no”). In the second analysis, level 1 is considered a “no” response,
and levels 2 and 3 are considered “yes” responses. This classification corresponds to a risky beta setting.
Thus, two beta settings are available from only one set of detection trials. An economy of data collection is
realized because the subject conveys more information on each trial. This confidence level approach can be
generalized to any number of levels.
Formally, the value of beta (the ratio of ordinate heights in Figure 2.3) at any given point along the ROC
curve is equal to the slope of a tangent drawn to the curve at that point. As shown in Figure 2.5, this slope
(and therefore beta) will be equal to 1 at points that fall along the negative diagonal (shown by the dotted
line). If the hit and false-alarm values of these points are determined, we will find that P(H) = 1-P(FA), as can
be seen for the two points on the negative diagonal of Figure 2.5. Performance here is equivalent to
performance at the point of intersection of the two distributions in Figure 2.3. Note also that points on the
positive diagonal of Figure 2.5, running along the straight line between lower left and upper right corners,
represent chance performance: No matter how the criterion is set, P(H) always equals P(FA), and the signal
cannot be discriminated at all from the noise. A visual signal detector might as well have closed his eyes. A
representation of Figure 2.3 that gives rise to chance performance and corresponds to the points on the
positive diagonal would be one in which the signal and noise distributions were perfectly superimposed.
Finally, points in the lower right region of the ROC space represent worse than chance performance. Here, the
observer says “signal” more often when no signal is presented than when a signal is presented. Either the
observer is misinterpreting the task or is playing a joke on the experimenter!
Regarding sensitivity, Figure 2.5 shows that the ROC curve for a more sensitive observer is bowed,
being located closer to the upper left corner. In contrast, for the less sensitive observer the curve is located
closer to the positive diagonal (which was chance performance). The ROC space in Figure 2.5 is plotted on a
linear probability scale, and therefore shows a typically bowed curve. An alternative way of plotting the curve
is to use z-scores (Figure 2.6). Constant units of distance along each axis represent constant numbers of
standard scores of the normal distribution. This representation has the advantage that the bowed lines of
Figure 2.5 now become straight lines parallel to the chance diagonal. For a given point, d′ is then equal to
Z(H)-Z(FA), reflecting the number of standardized scores that the point lies to the upper left of the chance
diagonal.
TABLE 2.1 Analysis of confidence ratings in signal detection tasks

                            Stimulus Presented        How Responses Are Judged
Subject’s Response          Noise      Signal         Conservative      Risky
“1” = “No Signal”             4           2               No              No
“2” = “Uncertain”             3           2               No              Yes
“3” = “Signal”                1           4               Yes             Yes
Total No. of Trials           8           8

Conservative criterion: P(FA) = 1/8, P(HIT) = 4/8
Risky criterion: P(FA) = 4/8, P(HIT) = 6/8

FIGURE 2.6 The ROC curve becomes a straight line when hit and false alarm rate are transformed to z-scores.

3.2 Empirical Data


It is important to realize the distinction between the theoretical, idealized curves in Figures 2.3, 2.5, and 2.6
and actual empirical data collected in an experiment or field investigation. Figures 2.5 and 2.6 show
continuous, smooth curves, whereas empirical data consist of one or more discrete points. More importantly,
empirical data do not always fall along the 45-degree slope shown in Figure 2.6 (equivalent to a line of
constant bowedness in Figure 2.5), but often the slope is slightly shallower. Theoretically, this situation arises
because the distributions of noise and signal shown in Figure 2.3 are not in fact precisely normal and of equal
variance. This might occur if there is some variability in the signal itself. The flattening of the slope presents
some difficulties for the use of d′ as a sensitivity measure. If d′ is the distance of the line from the chance axis
in Figure 2.6, and this distance varies as a function of the criterion setting, then bias and sensitivity are no
longer independent. Sensitivity cannot be measured independently of bias in this situation and one must
consider both measures to characterize performance.
Although it is desirable to generate multiple points on the ROC curve (through changing payoffs or
probabilities, or by using rating scales), it is difficult to do so in many real-world contexts. In such cases, the
experimenter is reduced to using the data available from only a single stimulus-response matrix. This does not
necessarily present a problem: collection of a full set of ROC data may not be necessary if bias is minimal
(Macmillan & Creelman, 2005). Nonetheless, if there are only one or two points in the ROC space and there
is evidence for strong risky or conservative bias, another measure of sensitivity should be used.
Under these circumstances, a measure of the area under the ROC curve (called A′) provides an alternative
sensitivity measure (Kornbrot, 2006; Macmillan & Creelman, 2005). The A′ measure represents the triangular
area formed by connecting the lower left and upper right corners of the ROC space to the measured data point
(plus the area under the positive diagonal). The value of the A′ measure does not explicitly depend on
assumptions about the shape of underlying signal and noise distributions, and so is sometimes referred to as
“non-parametric” or “parameter free”. A′ can be a convenient measure to use when there are only one or two
points of the ROC available. The measure A′ may be calculated from the formula:

A′ = 0.5 + [(P(H) − P(FA)) (1 + P(H) − P(FA))] / [4 P(H) (1 − P(FA))], for P(H) ≥ P(FA)    (2.4)

Alternative measures of bias also exist. For example, the measure C locates the criterion relative to the
intersection of the two distributions. The intersection point is the zero point, and then distance from this
criterion is measured in z-units. Thus, conservative biases produce positive C values; risky biases produce
negative values. C is related to beta, as shown in the Appendix. Summaries of bias measures suggest that C
offers a better measure of bias than beta, because it is less sensitive to changes in d′ (See, Warm, Dember, &
Howe, 1997; Snodgrass & Corwin, 1988). Nonparametric measures of bias are also available, and are
described in Macmillan and Creelman (1990) and See et al. (1997).
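
As a sketch (assuming, as is usual, that the hit rate exceeds the false-alarm rate), Equation 2.4 and the criterion measure C can both be computed directly from a single (P(FA), P(H)) point; the rates below are hypothetical:

from statistics import NormalDist

def a_prime(p_hit, p_fa):
    # Equation 2.4, for P(H) >= P(FA)
    return 0.5 + ((p_hit - p_fa) * (1 + p_hit - p_fa)) / (4 * p_hit * (1 - p_fa))

def criterion_c(p_hit, p_fa):
    # Criterion location in z-units relative to the intersection of the distributions:
    # positive values are conservative, negative values are risky
    z = NormalDist().inv_cdf
    return -(z(p_hit) + z(p_fa)) / 2.0

print(a_prime(0.80, 0.20), criterion_c(0.80, 0.20))   # 0.875, 0.0 (neutral bias)
print(a_prime(0.60, 0.05), criterion_c(0.60, 0.05))   # about 0.87, about 0.70 (conservative)
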
Finally, we note that under circumstances when beta is near 1, and there are no differential costs for
misses versus FAs, nor differential benefits for Hits versus CRs, a simple measure of accuracy (percent
correct) is adequate to characterize sensitivity.

4. FUZZY SIGNAL DETECTION THEORY


SDT is typically used to analyze human performance in laboratory studies in which the experimenter decides
what is signal and what is noise and instructs the participant accordingly. In a recognition memory study, for
example, the signal is typically defined as a face that has been shown to the participant during a prior study
period, and other previously unseen faces represent noise. Such “crisp” definitions of signal and noise are
possible in everyday or work environments, yet more often than not the definition of what is or is not a signal
is fuzzy. For example, the legal (“crisp”) definition of a signal (a “conflict”) in air traffic control (ATC) is
when the flight paths of two aircraft come within five nautical miles (nm) horizontally and 1,000 feet
vertically of each other. However, the separation distances that the controller will consider a signal requiring
action generally exceed these minimum values, depend also on factors such as the complexity of the traffic
and the time until separation is lost, and are therefore not crisp.
When the definition of a signal or category is not clear-cut, it can nevertheless be represented
mathematically using fuzzy logic (Zadeh, 1965). Parasuraman et al. (2000) developed equations for fuzzy
SDT by combining SDT and fuzzy logic. Fuzzy logic permits an event to belong to more than one set: rather
than categorizing something as either black or white, fuzzy logic allows for shades of grey. For example, the
range of room temperatures that you would consider “comfortable” might be between 55 °F and 85 °F (13 °C
and 30 °C). A “crisp” set would allocate all temperatures in this range to the set “comfortable” and all others
to “uncomfortable.” In reality, however, most people will feel relatively uncomfortable at 56 °F or at 84 °F
(14 °C and 29 °C). It is more appropriate to distinguish between probabilities of comfort rather than to assign
every temperature to either the “comfortable” or “uncomfortable” sets. We could instead develop a function
that permitted a temperature’s membership in the set “comfortable” to be somewhere between no and yes,
between zero and one (e.g., 0.1 for 56 or 84 °F and 0.8 for 75 °F). The function describing the degree of
membership is called the mapping function.
An event in fuzzy SDT can therefore belong to the set “signal” (s) with some degree between 0 and 1.
Similarly, the response can belong to the set “response” (r) with membership in the range 0 to 1. Once s and r
are mapped onto the range [0,1] using an appropriate mapping function, event membership in the four fuzzy
outcome categories Hit, Miss, FA, and CR can be computed. Parasuraman et al. (2000) proposed the
following formulas:

Hit: H = min (s, r)


Miss: M = max (s-r, 0)
False alarm: FA = max (r-s, 0)
Correct rejection: CR = min (1-s, 1-r)

To illustrate how these formulas are used to compute event membership values, suppose that s = .8 and r
= .9. Here the state of the world strongly but not absolutely points to a signal, and the observer strongly (but
not absolutely) responds that a signal is present. Applying the formulas, the resulting category memberships
are H = .8, M = 0, FA = .1, and CR = .1. Hence the outcome strongly points to a hit, but unlike conventional
SDT, there is also some membership in the FA category, representing the fact that the response was stronger
than what was called for by the signal. The CR category is also non-zero, reflecting the small membership of
the event in the “noise” category and the fact that an unequivocal “yes” response was not made.
Once event membership values have been computed, it is a simple matter to calculate the fuzzy hit and
FA rates. The fuzzy hit rate is the sum over all trials of the H values divided by the sum of membership values
of signal (s). Similarly, the FA rate is the sum of FA membership values divided by the sum of membership
values of noise (1-s). Once the fuzzy Hit and FA rates have been computed, measures of sensitivity and bias
can be calculated, just as in conventional SDT. Note that either signal event or response can be classified as
either discrete categories or fuzzy membership. The equations of fuzzy SDT are such that if membership
values are not fuzzy, the equations revert to those of SDT.
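
A brief sketch of the computation just described (the trial values are hypothetical): each trial’s (s, r) pair is converted to the four outcome memberships, and the memberships are then summed to give the fuzzy hit and false-alarm rates.

def fuzzy_outcomes(s, r):
    # Outcome memberships from Parasuraman et al. (2000)
    hit = min(s, r)
    miss = max(s - r, 0.0)
    fa = max(r - s, 0.0)
    cr = min(1.0 - s, 1.0 - r)
    return hit, miss, fa, cr

print(fuzzy_outcomes(0.8, 0.9))   # (0.8, 0.0, 0.1, 0.1) -- the worked example above

def fuzzy_rates(trials):
    # trials: list of (s, r) pairs; returns (fuzzy hit rate, fuzzy FA rate)
    hits = sum(fuzzy_outcomes(s, r)[0] for s, r in trials)
    fas = sum(fuzzy_outcomes(s, r)[2] for s, r in trials)
    total_signal = sum(s for s, _ in trials)          # summed signal membership
    total_noise = sum(1.0 - s for s, _ in trials)     # summed noise membership
    return hits / total_signal, fas / total_noise

print(fuzzy_rates([(0.8, 0.9), (0.2, 0.1), (0.6, 0.7)]))
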
Since its development (Parasuraman et al., 2000), there have been a growing number of applications of
fuzzy SDT to diverse issues in engineering psychology and human factors. Three examples are provided here.
Masalonis and Parasuraman (2003) used fuzzy SDT to compute sensitivity and response bias measures
from data obtained in two studies of air traffic control: a field evaluation of an automated conflict detection
system and a laboratory study of controller performance under so-called free flight conditions. Each event was
defined as a signal (conflict) to some fuzzy degree by mapping the distance between aircraft pairs into the
range [0, 1]. Compared to conventional SDT, the fuzzy SDT analysis gave lower values of sensitivity and
higher (more conservative) response bias. Conflicts just outside the conflict criterion used in conventional
SDT were defined by fuzzy SDT as a signal worthy of some attention. Masalonis and Parasuraman (2003)
concluded that fuzzy SDT provided a more complete picture of performance in conflict detection tasks than
conventional SDT.
Wallis and Horswill (2007) examined the relation between the ability of drivers to perceive hazards on
the road and driving safety. Hazard perception ability is negatively correlated with driver crash involvement,
but there is a need for more sensitive evaluation of its inter-individual variation, such as variation between
beginner and experienced drivers. Wallis and Horswill tested two models: that (1) novice drivers have lower
sensitivity in discriminating hazardous situations than experienced drivers or (2) that they have a higher
threshold for perceiving a situation as dangerous. Use of a fuzzy SDT analysis which considers degree of
membership in safe versus dangerous driving supported the second hypothesis. The authors also showed that
training novices to anticipate environmental cues for potential hazards improved their criterion placement,
indicating that training led them to use the same model that experienced drivers employ.
A final example comes from a study of baseball umpires. Whether a pitch should be classified as a strike
is a quintessential example of a real-world fuzzy signal. MacMahon and Starkes (2008) had umpires, players,
and people with no baseball experience call balls and strikes in video clips. Consistent with the expectations
of fuzzy SDT, in which the definition of a signal is context-dependent, participants called target pitches closer
to the strike end of the scale when viewed after definite balls (low strike membership signal) than when they
followed definite strikes (high membership). Moreover, this contextual effect was of similar strength in all
participants, irrespective of baseball experience.

5. APPLICATIONS OF SIGNAL DETECTION THEORY


SDT has had a large impact on experimental psychology, and its concepts are highly applicable to many
engineering psychology problems as well (Fisher, Schweickert, & Drury, 2006). It has two general benefits:
(1) It provides the ability to compare sensitivity and therefore the quality of performance between conditions
or between operators that may differ in response bias. (2) By partitioning performance into bias and sensitivity
components, it provides a diagnostic tool that implies different corrective actions depending on whether a
change in performance results from a sensitivity loss or a response bias shift (Swets & Pickett, 1982).
The implications of the first benefit are clear. The performance of two operators (or the hit rate obtained
from two different pieces of inspection equipment) is compared. If A has a higher hit rate but also a higher
false-alarm rate than B, which is superior (i.e., higher sensitivity)? Unless the explicit mechanism for
separating sensitivity from bias is available, this comparison is impossible. SDT provides the mechanism.
The importance of the second benefit—the diagnostic value of SDT—will be evident as we consider
some actual examples of applications of SDT to real-world tasks. In the many environments where the
operator must detect an event and does so imperfectly, the existence of these errors presents a challenge for
the engineering psychologist: Why do they occur, and what corrective actions can prevent them? Three areas
of application (medical diagnosis, eyewitness testimony and alarm design), will be considered, followed by a
more extensive discussion of vigilance.

5.1 Medical Diagnosis


The realm of medical diagnosis is a fruitful environment for the application of SDT (Lusted, 1976; McFall &
Treat, 1999; Swets, 1998). Abnormalities (diseases, tumors) are either present in the patient or they are not,
and the physician’s initial task is often to make a yes or no decision. The strength of the signal (and therefore
the sensitivity of the human operator) is related to factors such as the salience of the abnormality, the number
of converging symptoms, and the training of the physician to focus on relevant cues.
Swets (1998; see also Getty et al., 1988) was interested in improving the radiologists’ sensitivity in
discriminating a cancerous tumor from a benign cyst. Mammograms on X-rays require skill to interpret, with
multiple features to examine and evaluate. For example, if the abnormal mass has an irregular border or shape
it is more likely to indicate a malignant growth. The researchers developed a “reading aid,” a checklist of the
types of features that should be considered, along with a numerical scale assessing how confident the
radiologist is that the feature is present. Radiologists who were not experienced in mammography showed
greater sensitivity (across a range of confidence levels, or beta values) when they used the aids than when they
did not. Swets noted that, for 100 patients with cancer, the aids would permit detection of cancers in about 13
additional patients (increasing hit rate), and moreover, for patients without cancer, the aids would permit
avoidance of 12 unnecessary biopsies (decreasing FA rate).
Response bias meanwhile can and should be influenced by disease prevalence and whether the patient is
examined in initial screening (probability of disease low, beta high) or referral (probability higher, beta
lower). Lusted (1976) has argued that physicians’ detections generally tend to be less responsive to variation
in the disease prevalence rate, P(S), than optimal. Parasuraman (1985) found that radiology residents
were not responsive enough to differences between screening and referral in changing beta. Both of these
results illustrate the sluggish beta phenomenon.
Although payoffs (in terms of values and costs) may influence medical decision making, it is difficult to
quantify the consequences of hits (e.g., a detected malignancy leads to its surgical removal), false alarms (an
unnecessary operation with associated hospital costs and possible consequences), and misses. Assigning costs
and values to these events based on financial costs of surgery, malpractice suits, and intangible costs of human
life and suffering is clearly difficult. Yet there is little doubt that they do have an important influence on a
physician’s detection rate (Lusted, 1976; Swets, 1998). Rather than consider all four outcomes individually,
Swets (1998) suggests that a physician might simply quantify a ratio of benefits and costs such as “I would
rather be right twice as often when cancer is present than when it is not present.” So, the ratio of the right side
of Equation 2.3 becomes ½, and a liberal criterion results. Alternatively, one can define a criterion that
satisfies a limit on false alarms. This is similar to setting the alpha level in significance testing. Swets suggests
that the false alarm (false positive) rate is typically around .10 in medical contexts. To keep FAs low, one
needs to raise beta, producing a conservative criterion.
Finally, one can examine biopsy rates to get a sense of where the criterion is set empirically. For
example, in the United States the biopsy yield (that is, the probability of a signal given a “Yes” response) is
20 to 30 percent, whereas in England it is around 50 percent (Swets, 1998). This means that when a biopsy is
conducted, a positive (cancer) result occurs less often in the United States than in England. The implication is
that in the United States the criterion is lower, and biopsies are called for more often when the risk is lower.
This might better meet the desires of the individual patient and physician, but is also more expensive for the
system considered as a whole (Swets, 1998).

5.2 Recognition Memory and Eyewitness Testimony


When applying SDT to recognition memory, the participant is not assessing whether or not a physical signal
is present but rather decides whether or not a physical stimulus (e.g., a name, an object, or a person’s face)
was seen or heard at an earlier time (Wixted, 2007). One important application of SDT to recognition memory
is found in the study of eyewitness testimony (e.g., Meissner, Tredoux et al., 2005; Wells & Olson, 2003;
Wright & Davies, 2007; Brewer & Wells, 2011), which represents a subset of the growing applications of
psychology to the law (Wargo, 2011). The witness to a crime may be asked to recognize or identify a suspect
as the perpetrator. The four kinds of joint events in Figure 2.1 can readily be specified. The suspect examined
by the witness either is (signal) or is not (noise) the same individual actually perceived at the scene of the
crime. The witness in turn can either say, “That’s the one” (Y) or “No, it’s not” (N).
In this case, the joint interests of criminal justice and protection of society are served by maintaining a
high level of sensitivity while keeping beta neither too high (many misses, with criminals more likely to go
free) nor too low (a high rate of false alarms, so that innocent individuals will be prosecuted). The common
method for conducting a lineup involves simultaneous presentation of all lineup members. The witness is
shown a lineup of five or so individuals, one of whom is the suspect detained by the police and the others are
“foils.” Hence, the lineup decision may be considered a two-stage process: Is the suspect in the lineup? If so,
which one is it?
In applying SDT to this procedure, investigators are interested in characteristics of the lineup process that
might affect how the witness responds. These include variables like presentation method, instructions,
content, and behavioral influence (Wells & Olson, 2003). With a simultaneous lineup, there is the risk of a
relative judgment strategy: selecting the line-up member most similar to the memory of the culprit (Lindsay,
1999). This works well if the culprit is in the lineup, but can lead to false alarms when this is not the case.
However, it has been shown that a process in which lineup members are shown sequentially, and the witness
makes a judgment about each one, makes it less likely that the witness will choose an innocent lineup
member; that is, it will encourage a more conservative response criterion (Lindsay & Wells, 1985; Steblay et
al., 2001). Reflecting this shift in beta, the sequential lineup also reduces the hit rate (Gronlund et al., 2009;
Meissner et al., 2005), with no difference in overall sensitivity.
Simple instructions can affect how eyewitnesses respond (Wells & Olson, 2003). For example, merely
informing the witness that the suspect might not be in the lineup has been shown to reduce false alarms or
mistaken identification (Malpass & Devine, 1981) and increase overall sensitivity. This false alarm reduction
has been shown to be considerable in culprit-absent lineups (42 percent), and although there is also a
reduction of accurate identification in culprit-present lineups, it is minimal (2 percent; Steblay, 1997). As a
result the U.S. Department of Justice has added this instruction to a set of guidelines for law enforcement
(Technical Working Group for Eyewitness Evidence, 1999).
It is common for an investigator to tell eyewitnesses that they selected the suspect from a lineup after the
choice has been made. This type of behavioral influence clearly affects eyewitnesses, who tend to become
more certain of their judgment after being told they selected the suspect by the lineup administrator. Wells and
Bradfield (1998) found that post-identification suggestions led to a “false certainty” that they had identified
the culprit correctly. Even when the true culprit was not in the lineup, a false alarm produced by a lower beta
was accompanied by overconfidence in the accuracy of judgment. The problem with this situation is that
eyewitnesses appear at trial convinced they have identified the criminal, and juries are in turn more easily
convinced by the greater confidence of the eyewitness’s testimony (Wells & Bradfield, 1998). As a result,
Wells and Olson (2003) suggest that the lineup be administered by an individual who does not know which
lineup member is the suspect. The issue of whether confidence correlates with sensitivity of recognition
memory is important, but it remains unresolved in eyewitness testimony (Brewer & Wells, 2006). The
correlation of confidence with accuracy is greater than zero, but far less than 1.0 (Brewer & Wells, 2011).
This critical issue of confidence (and overconfidence) in all sorts of judgments will be discussed in detail in
Chapter 8.

5.3 Alarm and Alert Systems


A clear application of signal detection theory is in designing alert or alarm systems, a form of automation
designed to capture human attention when some “danger variable” approaches a criterion level, like vertical
collision proximity less than 500 feet for an air traffic controller or temperature in a building greater than 100
°F (40 °C). As shown in Figure 2.7 (top), this represents a signal detection issue on two levels: the alert system
itself and the human-alert system combined (Botzer et al., 2010; Hollands & Neyedli, 2011; Parasuraman,
1987; Sorkin & Woods, 1985).
In considering alarm systems or “automated diagnosis,” a critical design decision is how to set the
threshold or response criterion (beta). As shown at the bottom of Figure 2.7, this may be set at any range of
values, as it receives the time-varying “raw data” of the danger signal (e.g., the combined heat and particulate
or smoke level in a building). If beta is set high, random variability within the danger state may occasionally
cause a miss. If it is set low, however, then the same random variability occurring within the safe state may
generate a false alarm. These two events or “automation errors” are depicted at the bottom of the figure.
Most alerting systems have a low beta threshold because, as discussed above in the context of optimal
beta, the costs of misses are typically much greater than the costs of false alarms (consider the fire alarm that
fails to alert a true fire, versus the false fire alarm). However, the base-rate of dangerous events is typically
very low (P(Signal) <<<1.0), and because of the high expected costs of setting an optimally high beta (based
on miss-costs), designers do not typically adjust the alarm system beta fully upward to a level that would be
appropriate to reduce false alarms. Hence the lower beta setting (to guard against misses) will inevitably lead
to a very high FA rate (Parasuraman, Hancock, & Olofinboba, 1997).
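
The base-rate argument above can be made concrete with Bayes’ rule; in the sketch below, the hit and false-alarm rates of the alarm system and the base rate of the dangerous event are all hypothetical.

def p_true_alarm(p_signal, p_hit, p_fa):
    # Posterior probability that the danger state is present, given that the alarm sounded
    p_alarm = p_hit * p_signal + p_fa * (1.0 - p_signal)
    return (p_hit * p_signal) / p_alarm

# An alarm with a 95 percent hit rate and only a 5 percent false-alarm rate,
# monitoring an event present on 1 trial in 1,000, still cries wolf most of the time:
print(p_true_alarm(0.001, 0.95, 0.05))   # about 0.02 -- roughly 98 percent of alarms are false
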
The problem is amplified when a worker receives alerts from several different independent systems, like
the medical professional (Xiao, Seagull, et al., 2004) or the nuclear power plant operator. Kesting, Miller, and
Lockhart (1988) estimated that in the typical operating room, an alarm was triggered every 4.5 minutes, with a
majority being “false alarms.”

FIGURE 2.7 Signal detection theory and warning signals. At the top, these two error types are shown in the context of the SDT matrix
representing the alarm system alone (left) and the human basing his/her judgment on both the alarm information and perception of the raw data.
At the bottom is shown the danger variable fluctuating above or below a low threshold (low beta) or a high threshold (high beta), with the
consequent two types of errors. The danger variable is the predicted threat of a mid air collision for air traffic control.

This situation becomes somewhat more complex when we consider that in some circumstances, the
human has access to the same “raw data” upon which the automation is making its decision (Parasuraman,
1987; Getty, Swets, et al., 1995; Wang, Jamieson, & Hollands, 2009; Wickens, Rice, et al., 2009), as shown
at the upper right of Figure 2.7. When the human only has access to the alarm, s/he may have little choice but
to comply with it, even if it is false. However, when the human can process the raw data in parallel, a high
false alarm rate can have two major negative consequences (Dixon, McCarley, & Wickens, 2007),
particularly in the multi-task environment when alerts are found to have their greatest potential benefits
(Parasuraman et al., 1997; Wickens & Dixon, 2007). First, frequent false alarms may impose frequent
interruptions of the ongoing concurrent tasks, in order for the human to cross check the raw data, ensure that
the alert is indeed false, or, in some cases, take unneeded actions to address the “non-event.” Second, and
more serious, after excessive alarm false alarms, people may develop a “cry wolf ” syndrome (Breznitz,
1983; Sorkin, 1989) in which alerts (including those few that may be true) are either responded to late or
ignored altogether. Formally, this amounts to the human adjusting his/her beta far upward to compensate for
the extremely low beta of the alerting system. We address this issue more in Chapter 12 when we discuss trust
in automation.
As one example, in 1997 air traffic controllers in Guam disabled a minimum safe altitude warning
(MSAW) system because it had issued too many false alerts. It had “cried wolf ” once too often. As a result
the controllers did not detect the low altitude descent of a commercial flight. The aircraft crashed into a
mountain short of the runway, leading to over 100 fatalities. In another example, in 2002 the AARC Joint
Commission (2002) issued a report stating that 22 percent of incidents (death or coma) from medical
ventilator problems had delayed or no response to alarms as a root cause. Here again, an excessive false alarm
rate led to many events being ignored.
Several potential solutions can be offered to address the problems of mistrust caused by alarm false
alarms.

1. Use multiple alarm levels. Likelihood alarms (Sorkin, Kantowitz, & Kantowitz, 1993; see also Neyedli,
Hollands, & Jamieson, 2011; St. John & Manes, 2002; Wickens & Colcombe, 2007) can be used in which
the alarm system issues three or more graded levels representing the likelihood of a danger state. This is
similar to the process manifest in fuzzy SDT described earlier, such that the alarm signals its own
“confidence” in the state of the world. This technique has the advantage that many of the alarm errors,
when they occur, are not viewed as bad; hence these errors are less likely to lead the operator to mistrust
or ignore the alarm.
2. Raise automated beta slightly. Sometimes the beta level of the automated system can be raised a little
without seriously jeopardizing safety, when the human has access to the raw data and monitors these
data in parallel.
3. Keep the human in the loop. For an increased alarm threshold to be successful, it is important that the
human observer can easily monitor the raw data in parallel with the alert system, just as an air traffic
controller monitors the radar display for potential conflicts, even as the conflict alert system is doing
the same thing (Wickens, Rice, et al., 2009).
4. Improve operator understanding of alarm false alarms. Train operators to understand the statistical
necessity of a high false alarm rate, particularly in circumstances that combine a high cost of misses
with a very low base rate of events (Parasuraman, Hancock, & Olofinboba, 1997). For many
predictive warnings operators can be instructed that most false alarms do not represent “bad”
automation failures, but instances in which the alert system has lowered its threshold to guard against
misses (Lees & Lee, 2007). In cases when the operator also has access to the raw data, operators can
view the alerting system as one that can reinforce the operator’s own judgments. Indeed Wickens,
Rice, et al. (2009) did not find evidence for the “cry wolf ” effect shown by air traffic controllers using
a conflict alerting tool and suggested that such reinforcement was the reason.

6. VIGILANCE
The vigilance paradigm is one of the most common applications of SDT. The earliest laboratory studies
investigating vigilance were conducted by Mackworth (1948), who was trying to determine why World War
II radar operators were missing signals on their displays that signified the presence of enemy submarines. In
his experiments, the observer monitored a clock hand or pointer that moved in small jumps around the face of
the clock, simulating the radar task. These were non-target events. Occasionally the hand underwent a double
tick (the target event), moving twice the angle of the non-target events. Mackworth found that an operator’s
ability to detect the target event signal decreased over time, a finding since replicated many times.
In the vigilance paradigm the operator is required to detect signals over a long period of time (referred to
as the watch), and the signals are intermittent, unpredictable, very rare, and usually of low salience, as in the case of
the airport security inspector who examines X-rayed carry-on luggage or the quality control inspector who
examines a stream of products (e.g., sheet metal, circuit boards) to detect and remove the rare defective or
flawed items.
Two general conclusions emerge from the analysis of operators’ performance in the vigilance situation.
First, the steady-state level of vigilance performance, known as the vigilance level, is often lower
than desirable. Second, the vigilance level declines steeply during the first half hour or so of the watch.
phenomenon has been experimentally replicated numerous times and has been observed in industrial
inspectors (e.g., Harris & Chaney, 1969; Parasuraman, 1986). This decrease in vigilance level over time is
known as the vigilance decrement.

6.1 Measuring Vigilance Performance


A large number of investigations of the factors affecting the vigilance level and the vigilance decrement have
been conducted over the last half century, with many experimental variables in various paradigms. An
exhaustive listing of all of the experimental results of vigilance studies is beyond the scope of this chapter, but
extensive treatments are available (e.g., Davies and Parasuraman, 1982; Parasuraman, 1986; See et al., 1995;
Warm, 1984; Warm and Dember, 1998). In SDT terms, the vigilance decrement can arise either as a result of a
decrease in sensitivity (e.g., Mackworth & Taylor, 1963) or as a shift to a more conservative criterion (e.g.,
Broadbent & Gregory, 1965), depending on the task and experimental situation.
Several influences on sensitivity are:
1. Sensitivity decreases, and the sensitivity decrement increases, as a target’s signal strength is reduced,
which can be done by reducing the intensity or duration of a target or otherwise making it more similar
to non-target events (Mackworth & Taylor, 1963; Teichner, 1974).
2. Sensitivity decreases when there is uncertainty about the time or location at which the target signal
will appear. This uncertainty is particularly great if there are long intervals between signals
(Mackworth & Taylor, 1963; Warm et al., 1992).
3. For inspection tasks, which have defined non-target events, the sensitivity level decreases and the
decrement increases when the event rate is increased (Baddeley & Colquhoun, 1969; See et al.,
1995), such as by speeding up the conveyer belt in an inspection situation. Note that this keeps the
ratio of target to non-targets constant, and therefore event rate should not be confused with target
probability (see below).
4. The sensitivity level is higher for simultaneous tasks when target and non-target can be seen at once
than for successive tasks when only one can be seen at a time (Parasuraman, 1979). Event rate and
spatial uncertainty interact with task type, such that these factors impair performance to a greater
degree for successive than simultaneous tasks (Warm & Dember, 1998).
Changes in bias also occur, and the more salient results are as follows:
1. Target probability affects response bias, with higher probabilities decreasing beta (more hits and
false alarms) and lower probabilities increasing it (more misses and correct rejections; Loeb &
Binford, 1968; See et al., 1997; Wolfe, Horowitz, et al., 2007), although sluggish beta is evident
(Baddeley & Colquhoun, 1969; See et al., 1997). Note that a decrease in target probability can occur
if non-target events are more densely spaced between targets.
2. Payoffs affect response bias as in the signal detection task (e.g., Davenport, 1968; See et al., 1997),
although the effect of payoffs is less consistent and less effective than manipulating probability
(Maddox, 2002; see Davies & Parasuraman, 1982). This stands in contrast to the relative effects of
manipulating probability and payoffs in non-vigilance signal detection as described earlier (Figure
2.4), where signals are more common.

6.2 Theories of Vigilance


Traditionally, the vigilance decrement was thought to be caused by a decline in arousal (e.g., Frankmann &
Adams, 1962). In this view, the repetitive and monotonous characteristics of vigilance tasks suppress the
neural activity necessary to maintain alertness. Arousal theory (Welford, 1968) postulated that in a prolonged
low-event environment, the “evidence variable” X shrinks while the criterion stays constant. This is illustrated
in Figure 2.8. The shrinking results from a decrease in neural activity (both signal and noise) with decreased
arousal. Figure 2.8 reveals that such an effect will reduce both hit and false-alarm rates (a change in beta)
while keeping the separation of the two distributions—as expressed in standard scores—at a constant level (a
constant d′).
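
A small numerical sketch of the arousal account (all values hypothetical): compressing both distributions toward zero while the criterion Xc stays fixed lowers P(H) and P(FA) and raises beta, yet leaves d′, expressed in standard-deviation units, unchanged.

from statistics import NormalDist

def hit_fa_rates(signal_mean, noise_mean, sd, xc):
    # Proportion of each distribution falling above the fixed criterion xc
    p_hit = 1.0 - NormalDist(signal_mean, sd).cdf(xc)
    p_fa = 1.0 - NormalDist(noise_mean, sd).cdf(xc)
    return round(p_hit, 3), round(p_fa, 3)

xc = 1.0                                   # criterion stays fixed
print(hit_fa_rates(2.0, 0.0, 1.0, xc))     # alert: (0.841, 0.159); d' = 2
print(hit_fa_rates(1.0, 0.0, 0.5, xc))     # low arousal, activity halved: (0.5, 0.023); d' still 2
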
In contrast, sustained demand theory proposes that the vigilance task imposes substantial demands on the
observer’s information-processing resources, and rather than decreasing arousal, the task leads to increased
arousal and stress (Warm, Parasuraman, & Matthews, 2008) and that maintaining this supply of resources is
difficult to sustain over time. This theory has recently gained traction with a broad array of brain imaging and
neurophysiological measures supporting it (see Chapter 11). The sustained attentional demand of the vigilance
task has long been recognized. Broadbent (1971) argued that sustained attention was necessary to fixate the
Mackworth (1948) clock hand or somehow detect the target signal. Indeed, sometimes the vigilance task is
referred to as a sustained attention task (Parasuraman, 1979).

FIGURE 2.8 An illustration of arousal theory. Arousal decreases over time on watch, the SDT distributions are compressed relative to Xc,
P(H) and P(FA) are reduced, with the net result that a more conservative criterion is used (beta increases).

More recently, investigators have concluded that a vigilance task imposing a sustained load on working
memory (e.g., having to continuously remember what the target signal looks or sounds like, as in a successive
task) will demand the continuous supply of processing resources (Deaton & Parasuraman, 1988;
Parasuraman, 1979). Ratings of mental workload (see Chapter 11) show that the workload of vigilance tasks
is generally high and that the vigilance decrement is accompanied by an increase in subjective workload over
time (Warm, Dember, & Hancock, 1996). A further implication of the resource-demanding nature of
vigilance tasks is their susceptibility to interference from concurrent tasks. For instance, Caggiano and
Parasuraman (2004) showed that when both the vigilance task and a secondary task drew on spatial working
memory resources, a greater vigilance decrement was observed. The concept of mental resources will be
discussed further in Chapter 10.
Thus, situations demanding greater processing resources produce a vigilance decrement (e.g., when the
target is difficult to detect, when there is uncertainty about the target’s location or when it will occur, when
the event rate is fast, when the observer has to remember what the target looks or sounds like, when the target
is not a familiar stimulus). The finding that the sensitivity level is higher, and the sensitivity decrement
eliminated, when observers can detect the target automatically with little effort is also consistent with
sustained demand theory since a characteristic of automatic processing is that it produces little resource
demand (Schneider & Shiffrin, 1977).
Sometimes the vigilance decrement can be shown to be due to a shift in the criterion rather than, or in
addition to, the sensitivity decrement predicted by the sustained-demand theory. The expectancy theory
proposed by Baker (1961) attributes the vigilance decrement to an upward adjustment of the response
criterion in response to a reduction in the perceived frequency (and therefore expectancy) of target events.
Accordingly, expectancy theory explains a criterion shift as follows. It is assumed that the observer sets beta
on the basis of a subjective perception of signal frequency, Ps(S). In many vigilance situations with low
salience signals, it is impossible for the observer to attend to and detect all signals, even with an optimal
monitoring strategy (Moray & Inagaki, 2000). Then if a signal is missed for any reason, subjective
probability Ps(Signal) is reduced because the observer believes that one less signal occurred. This reduction in
turn causes an upward adjustment of beta, which further increases the likelihood of a miss. The consequent
increase in the miss rate decreases Ps(S) further, and so on, a phenomenon Broadbent (1971) labeled the
vicious circle hypothesis. The vicious circle leads to an upward spiraling of beta and a downward spiraling of
P(H). Although this behavior could lead to an infinite beta and a negligible hit rate, in practice other factors
will operate to level off the criterion at a stable but higher value.
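
The vicious circle can be illustrated with a deliberately simple simulation; every quantitative assumption here (the d′ value, the initial subjective probability, and the rule that scales subjective probability by the proportion of signals detected) is purely illustrative rather than drawn from the theory’s original formulation.

import math
from statistics import NormalDist

def hit_rate(d_prime, beta):
    # Criterion location implied by beta for unit-variance normal distributions
    # (noise mean 0, signal mean d'): beta = exp(d' * (xc - d'/2))
    xc = math.log(beta) / d_prime + d_prime / 2.0
    return 1.0 - NormalDist(d_prime, 1.0).cdf(xc)

p_subjective = 0.10        # initial subjective signal probability (hypothetical)
d_prime = 1.0
for period in range(3):
    beta = (1.0 - p_subjective) / p_subjective   # criterion set from subjective probability
    p_hit = hit_rate(d_prime, beta)
    p_subjective *= p_hit                        # missed signals make signals seem rarer
    print(period, round(beta, 1), round(p_hit, 3))
# beta spirals upward and the hit rate collapses, until other factors level it off
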
When the signal probability is lowered, it should serve to decrease the expectation of the signal, and
therefore increase beta (Wolfe, Horowitz, et al., 2007). Payoffs may have similar, but less pronounced effects.
Since the vicious circle depends on signals being missed in the first place, it stands to reason that the kinds of
variables that reduce sensitivity (short, low-intensity signals) should also increase the expectancy effect in
vigilance, as noted above.

6.3 Techniques to Combat the Loss of Vigilance


In many vigilance situations, performance reflects some combination of shifts in sensitivity and response bias.
Like the theories of vigilance, techniques for combating the vigilance decrement may be categorized into those that enhance sensitivity and those that shift the response criterion in a more optimal (typically lower) direction.

6.3.1 INCREASING SENSITIVITY There are several techniques for improving sensitivity in a vigilance task.
1. Show target examples (reduce memory load). A logical implication of the sustained demand theory is
that any technique aiding or enhancing the subject’s memory of signal characteristics should reduce
sensitivity decrements and preserve a higher overall level of sensitivity. Hence, the availability of a
“standard” representation of the target should help by turning a successive task into a simultaneous
one. For example, Kelly (1955) reported a large increase in detection performance when quality
control operators could look at images of idealized target stimuli. Furthermore, a technique that helps
reduce the beta increment caused by expectancy may also combat a loss in sensitivity. The
introduction of false signals, as described in the next section, could improve sensitivity by refreshing
memory.
  Childs (1976) observed an improvement in performance when subjects were told specifically
what the target stimuli were rather than what they were not. The implication is that inspectors should
have access to visual representations of possible defectives rather than simply representations of
those that are normal.
2. Increase target salience. Various artificial techniques of signal enhancement are possible, although the
specific technique will depend on the nature of the signal. A trivial example of such a technique is, of
course, simply to amplify the energy (e.g., increase luminance or volume) characterizing each event.
But this approach may magnify the noise as much as the signal and therefore will do nothing to change
the overall signal-to-noise ratio. More ingenious solutions capitalize on procedures that will
differentially influence signal and non-signal stimuli. For example, Drury et al. (2001) successfully
used a binocular rivalry technique in which a standard board and a defective board are shown
simultaneously using a stereoscope. When two images contain large areas that are similar, the two
images appear as one and areas of the image that are different appear to shimmer. The shimmering
areas therefore represent a potential defect. Liuzzo and Drury (1978) developed a similar signal-enhancement technique known as "blinking," which rapidly and alternately projects, at a single location, images of two items: a known good prototype and the item to be inspected. If
the item contains a malfunction (e.g., a gap in wiring), the gap location will appear to blink on and off
in salient fashion as the displays are alternated. Another approach is to transform the events to another
sensory modality (e.g., add an auditory signal to a visual display). This technique takes advantage of
the redundancy gain that occurs when the signal is presented in two modalities at once (e.g., Doll &
Hanna, 1989).
3. Reduce the event rate. As Saito (1972) showed in a study of bottle inspectors, a reduction of the
inspection rate from 300 to less than 200 bottles per minute markedly improved inspection efficiency.
Allowing observers to control event rate is also effective: Scerbo, Greenwald, and Sawin (1993)
showed that giving observers such control improves sensitivity and lowers the sensitivity decrement.
4. Train observers. A technique closely related to the enhancement of signals through display
manipulations is one that emphasizes operator training. Fisk and Schneider (1981) demonstrated that
the magnitude of a sensitivity decrement could be greatly reduced by training subjects to respond
consistently and repeatedly to the target elements. This technique of developing automatic processing
of the stimulus (described further in Chapter 6) tends to make the target stimulus “jump out” of the
train of events, just as one’s own name is easily heard even in a noisy crowd. Fisk and Schneider note
that the critical stimuli must consistently appear only as target stimuli and that the probability of a target must be high during the training session.

6.3.2 SHIFT IN RESPONSE CRITERION The following methods may be useful in shifting the criterion to an optimal
level.
1. Instructions. An inspector may not properly understand the relative costs of errors if these are not
made explicit. In quality control, for example, the inspector may believe that it is better to detect more
defects and not worry about falsely rejecting good parts when it would be more cost-effective to
maintain a higher criterion because the probability of a defective part is low. Sometimes simple
instructions in industrial or company policy can adjust beta to an appropriate level. Thus, for example,
in airline security inspection, increased stress placed on the seriousness of misses (failing to detect a
weapon smuggled through the inspection line) could cause a substantial decrease in the number of
misses (at the cost of a possible increase in false alarms).
2. Knowledge of Results. Where possible, knowledge of results (KR) should be provided to allow an
accurate estimation of the true P(S) (Mackworth, 1950). It appears that KR is most effective in low-
noise environments (Becker, Warm, Dember, & Hancock, 1995). In particular, using realistic airport
security X-ray images of luggage, Wolfe et al. (2007) found that feedback coupled with a temporary
high signal rate created by introducing false signals (see below) was an effective way of shifting and
then preserving the response criterion at a lower level.
3. False Signals. Baker (1961) and Wilkinson (1964) have argued that introducing false signals should
keep beta low. This introduction will raise Ps(S). Furthermore, if the false signals are physically
similar to the real signals, then, by refreshing the standard held in memory, this procedure should also improve
sensitivity and reduce the sensitivity decrement by reducing the sustained demand of the task, as
discussed earlier. For example, as applied to the quality control inspector, a certain number of
predefined defectives might be placed on the inspection line. These would be “tagged,” so that if
missed by the inspector they would still be removed. Their presence in the inspection stream should
guarantee a higher Ps(S) and therefore a lower beta than would be otherwise observed. There is,
however, a danger in employing this technique if the actions that the operator would take after
detection would have undesirable consequences for an otherwise stable system. An extreme example
would occur if false warnings were introduced into a chemical process control plant and these led the
operator to shut down the plant unnecessarily.
4. Confidence levels. Allowing the operator to report signal events with different confidence levels reduces the vigilance decrement (Broadbent & Gregory, 1965; Rizy, 1972). This is like the likelihood alert technique and is amenable to fuzzy SDT analysis as described earlier in the chapter. If, rather than classifying each event as target or non-target, the operator can say "target," "uncertain," or "non-target" (or choose from a wider range of response options), beta should not increase as quickly, since the observer would say "non-target" less often and the subjective perception of signal frequency, Ps(S), should not decrease as quickly.

6.3.3 OTHER TECHNIQUES Other techniques to combat the decrement have focused more directly on arousal and
fatigue. Parasuraman (1986) noted that rest periods can have beneficial effects. Presumably rest periods
interrupt the continued demand of the vigilance situation. Interestingly, meditation training (Shamatha
training) has been shown to improve sensitivity in a vigilance task and reduce the decrement relative to a
control group which had no such training (MacLean et al., 2010). It would appear that meditation improves
the ability to concentrate for long periods, helping the observer meet the sustained demands of the vigilance
task. Atchley and Chan (2011) found that periodically engaging drivers in a verbal task (thereby increasing
their arousal levels) during long periods of simulated driving could reduce the vigilance decrement. Similar
findings were reported in a study of simulated future air traffic control (“NextGen”), in which aircraft
maintained separation from each other without controller involvement. Increasing task engagement by
requiring observers to explicitly acknowledge entry of an aircraft into a flow corridor reduced the vigilance
decrement in detecting occasional failures of aircraft self-separation (Pop et al., 2012).

6.4 Vigilance: Inside and Outside the Laboratory


There has been a plethora of vigilance experiments in the laboratory, with a wealth of experimental data. It is
challenging, however, to capture real-world system dynamics in a laboratory environment. In the laboratory
tasks, fairly simple stimuli with known location and form have been employed relative to the more complex
stimuli existing in the real world. The monitor of the nuclear power plant, for example, does not know
precisely what configuration of warning indicators will signal the onset of an abnormal condition, but it is
unlikely that it will be the appearance of a single near-threshold light in direct view. There are also differences
in signal frequency between laboratory data and real vigilance situations. In the laboratory, signal rates may
range from one per hour to as high as three or four per minute—low enough to show decrements, and lower
than fault frequencies in industrial inspection, but far higher than rates observed in the performance of reliable
aircraft, chemical plants, or automated systems, in which failures occur at intervals of weeks or months. This
difference in signal frequency, discussed in Chapter 3 in the context of attention capture, may well interact
with differences in motivational factors between the subject in the laboratory, performing a well-defined task
and responsible only for its performance, and the real-time system operator confronted with a number of other
competing activities and a level of motivation potentially influenced by the large costs of misses and false
alarms.
There is some evidence to suggest that more realistic vigilance situations are more likely to produce a
vigilance decrement. Donald (2008) has noted that real-world occupations requiring vigilance (e.g., closed-
circuit television surveillance operators) involve detecting a large variety of signal types, in complex
(cluttered) naturalistic settings, in a continuous (rather than discrete) manner, in successive fashion. For
example, the surveillance operator monitors the output of numerous cameras, and must understand the
relationship between different cameras and scenes, each of which may show streets, railway platforms,
or factory floors (Donald, 2008). Sustained demand theory and the experimental evidence in vigilance would
suggest that more complex working conditions like these should be more likely to produce a vigilance
decrement than laboratory experiments with simple stimuli. Donald also noted the need for effective
situational awareness to perform these tasks, especially the ability to think about and predict future states. We
will return to the concept of situation awareness in Chapter 7.
Thus, the variables affecting vigilance performance uncovered in the laboratory will likely affect
detection performance in the real world, although the magnitude of the effect may be attenuated or enhanced.
Data have been collected and vigilance phenomena observed in real or highly realistic simulated environments: in automated systems (Parasuraman & Riley, 1997); in driving (St John & Risser, 2009); in aviation (Molloy & Parasuraman, 1996; Ruffle-Smith, 1979), including studies with experienced general aviation pilots (Wiggins,
2010); in air traffic control (Pop et al., 2012); and in NORAD aircraft surveillance (Pigeau et al., 1995). A
recent study also found evidence of a vigilance decrement for observers required to detect a threatening action—reaching for a gun in order to shoot—in dynamic video scenes containing more common nonthreatening
acts such as reaching for the gun in order to transport it, or reaching for a hair dryer (Parasuraman et al.,
2009). It is clear that vigilance effects reliably occur in many situations beyond the laboratory.

7. ABSOLUTE JUDGMENT
When humans detect signals, they make a simple two-alternative choice along a continuum of sensory
evidence. Performance may be poor if signal energy is low. However, when humans must identify or classify
three or more stimuli at different levels along a sensory continuum, a task called absolute judgment, we find
that performance is relatively poor even when there may be substantial physical differences between the levels
(in contrast to SDT). Such a task might involve recognizing the pitch of a warning tone, discriminating among
several brightness or color levels, or comparing the sweetness of a set of soft drink formulas.
Absolute judgment, like detection, is an example of a task in which the human transmits information
from stimulus to response. Because much of the quantification of absolute judgment performance depends
upon understanding the formal language by which information is described and quantified, we present a brief
introduction to this language below. More details are found in the supplement on information theory at the
end of this chapter.

7.1 Quantifying Information


Information is potentially available in a stimulus or event any time there is some uncertainty as to what the
stimulus will be. How much information is delivered by a stimulus event depends in part on the number of
possible stimuli that could occur in that context. If the same stimulus occurs on every trial, its occurrence
conveys no information. In contrast, if more than one possible stimulus event could occur, then the amount of
information conveyed by one of these events when it does occur, expressed in bits (binary digits), is simply equal to the base 2 logarithm of the number of possible (equally likely) events. If four events could possibly occur, for
example, then when one event does occur we obtain two bits of information, since log2 4 = 2 bits. So we say
that HS, the amount of information in the stimulus, is 2 bits. If there were but two alternatives, the information
conveyed by the occurrence of one of them is HS = log2 2 = 1 bit. A completely certain or predictable event
conveys 0 bits (log2 1 = 0).
In human performance we are less interested in the amount of information in a stimulus than in the
amount transmitted by the human operator to the response, a quantity designated HT. While the formal
technique for computing information transmission will be described in the supplement, an intuitive
description will be given here. Obviously, if our hypothetical operator responds correctly to every stimulus drawn from a set of four alternatives, then two bits of information are transmitted. If the operator
ignores the stimuli and responds randomly, zero bits are transmitted. If the operator makes some errors,
performance ranges between these two limits. Thus the number of alternative stimuli, and therefore the
amount of information in the input (HS), places an upper bound on the maximum amount of information that
an operator can transmit: HT ≤ HS. The amount by which HT is less than HS (HS-HT) is information loss
(HLoss).
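As a simple numerical illustration of these definitions (the transmitted value below is hypothetical, chosen only to show the arithmetic), consider the following Python sketch:

    import math

    # Illustrative numbers only: stimulus information, an assumed measured H_T, and the loss.
    h_s = math.log2(4)     # four equally likely stimuli: 2 bits available in the stimulus
    h_t = 1.7              # hypothetical transmitted information measured in an experiment
    h_loss = h_s - h_t     # 0.3 bits lost to errors; H_T can never exceed H_S
    print(h_s, h_t, h_loss)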

7.2 Single Dimensions


In absolute judgment tasks, the human must identify or “label” the level of a stimulus along a sensory
continuum, as when an inspector of wool quality must categorize a given specimen into one of several quality
levels. Our discussion of absolute judgment will first describe performance when stimuli vary on only a single
physical dimension. We will then consider absolute judgment along two or more physical dimensions that are
perceived simultaneously and discuss the implications of these findings for principles of display coding.

7.2.1 EXPERIMENTAL RESULTS For a typical absolute judgment experiment, a stimulus continuum (e.g., tone
pitch, light intensity, or texture roughness) and a number of discrete levels of the continuum (e.g., four tones
of different frequencies) are selected. These stimuli are then presented randomly to the subject one at a time,
and the subject is asked to associate a different response to each one. For example, four warning signals each
with a different pitch might be called A, B, C, and D. The extent to which each response matched the
presented stimulus can then be assessed. When four discriminable stimuli (two bits) are presented,
information transmission (HT) is usually perfect—at two bits. Then the stimulus set is enlarged, additional
data are collected with five, six, seven, and more discrete stimulus levels, and HT is computed each time by
using the procedures described in the chapter supplement. Errors begin to appear when about five to six
stimulus levels are used, and the error rate increases as the number of stimuli increases further. The larger
stimulus sets have somehow saturated the subject’s capacity to transmit information about the magnitude of
the stimulus. This suggests that the subject has a maximum channel capacity (Miller, 1956).
Figure 2.9 shows the information transmitted (HT) as a function of the number of absolute judgment
stimulus alternatives, HS. The 45-degree slope of the dashed line indicates perfect information transmission,
and the “leveling” of the function takes place at the region in which errors began to occur (i.e., HT < HS). The
level of the flat part or asymptote of the function indicates the channel capacity of the operator: somewhere
between 2 and 3 bits (between 4 and 8 stimulus levels). George Miller (1956), in a classic paper entitled “The
Magical Number Seven, Plus or Minus Two," noted the similarity of the asymptote level (around 7) across a number of different sensory continua (e.g., tone pitches, loudnesses, saltiness, pointer position, points in a
square, line curvature, line length, line slope, color). The limit does vary somewhat from one continuum to
another; it is less than 2 bits for saltiness of taste and about 3.4 bits for judgments of position on a line or hues
along a rainbow scale. Absolute judgments are also subject to the bow effect (Luce et al., 1982):
Stimuli located in the middle of the range are generally identified with poorer accuracy than those at the
extremes. The limit is not sensory because the senses can discriminate thousands of levels of stimuli (e.g.,
1,800 tone pitches; Mowbray & Gebhard, 1961). The implication is that the limited span reflects the accuracy
of the perceiver’s working memory for the stimuli (Siegel & Siegel, 1972): that is, for example, remembering
what the pitch “sounds like.”
There is some flexibility in our limited span. Sensory continua for which we demonstrate good absolute
judgments are those for which such judgments in real-world experience occur relatively often (and therefore
are better learned). For example, judgments of position along a line (3.4 bits) are routinely made when reading
rulers. High performance in absolute judgment also seems to be correlated with professional experience with a
particular sensory continuum in industrial tasks (Welford, 1968) and is demonstrated by the noteworthy
association of absolute pitch with skilled musicians (Shepard, 1982; Takeuchi & Hulse, 1993).
Many attempts to model performance in absolute judgment tasks are similar to SDT (e.g., Brown, Marley
et al., 2008; Petrov & Anderson, 2005), adapting elements of that theory to situations where there are more than two stimulus possibilities. In these approaches, each stimulus is assumed to give rise to a distribution of "perceptual effects" along the sensory continuum (Thurstone, 1927; Torgerson, 1958). The observer partitions
the continuum into response regions using a set of decision criteria instead of the single criterion used in the
simple signal detection situation. It is proposed that the variability of these distributions increases with the
number of stimuli, which would make it more difficult to absolutely identify each stimulus. Such models have
been shown to account for the bow effects as well (variability greater for the intermediate stimuli).

FIGURE 2.9 Typical human performance in absolute judgment tasks.

7.2.2 APPLICATIONS The conclusions drawn from research in absolute judgment are highly relevant to the
performance of any task that requires operators to sort or classify objects into levels along a physical
continuum, particularly for industrial inspection tasks in which products must be sorted into various classes
for pricing or marketing (e.g., fruit quality) or for different uses (e.g., steel or glass quality). The data from the
absolute judgment paradigm indicate the kind of performance limits that can be anticipated and suggest the
potential role of training. Bow effects suggest that inspection accuracy will be better for extreme stimuli. One
potential method for improving performance would be to have different inspectors sort different levels of the
dimension in question. This would lead to different extreme stimulus categories for each inspector, thereby
creating more “ends” where absolute judgment performance is superior. Absolute judgment data are equally
relevant to the issue of coding, in which the level of a stimulus dimension is given some particular meaning,
and the operator must judge that meaning. For example, computer monitors can display a very large range of
colors (about 16.8 million levels) and software designers are sometimes tempted to use the large available
range to code variables. As we have seen, however, it is clear that people cannot reliably classify colors
beyond about seven levels. Thus, if it is important for the color coding to be interpreted in an absolute sense, a
large number of colors will not be accurately judged (see Chapter 4).
In many cases some physical or conceptual continuum of importance in the performance of a task will be
coded for display by variation along a displayed sensory continuum. For example, the size of socket wrenches
may be coded by color so that they can be easily differentiated even when the digital size indicator cannot be
read (Pond, 1979). A more conceptual dimension, such as the hierarchical level of a personnel unit in an
organization or the security level in a particular environment, may be coded into a number of different levels.
It is, of course, possible to use letters or digits to identify the various levels, but in conditions of low visibility,
high visual clutter, or high stress, these may not be read accurately. For example, in energy management
systems, the continuous voltage levels in an electrical power system can be classified into several contour
levels represented by color coding (Overbye, Wiegmann, Rich, & Sun, 2002). Overbye et al. showed that such
contouring led to faster detection of voltage violations (relative to numeric codes). Basic data on the number
of conceptual categories that can be employed without error are highly relevant to the development of such
nonverbal display codes.
Finally, Moses, Maisano, and Bersh (1979) have cautioned that any conceptual continuum should not be
arbitrarily assigned to a given physical dimension. It should be compatible with the meaning of the
continuum. The issue of display compatibility will receive more discussion in the following two chapters.

7.3 Multidimensional Judgment


If our limits of absolute judgment are severe and can only be overcome by extensive training, how is it that we
can recognize stimuli in the environment so readily? A major reason is that most of our recognition is based
on the identification of some combination of two or more stimulus dimensions rather than levels along a
single dimension. When stimuli can vary on two (or more) dimensions at once, we make an important
distinction between orthogonal and correlated dimensions.
When dimensions of a stimulus are orthogonal, the level of the stimulus on one dimension can take on
any value, independent of the other—for example, the weight and hair color of a person. When dimensions are
correlated, the level on one constrains the level on another to varying degrees—for example, height and
weight, since tall people tend to weigh more than short ones.

7.3.1 ORTHOGONAL DIMENSIONS The fact that multidimensional stimuli increase the total amount of information
transmitted in absolute judgment has been repeatedly demonstrated (Garner, 1974). Egeth and Pachella (1969)
showed that subjects could correctly classify only 10 levels of dot position on a line (3.4 bits of information).
However, when two such dimensions were combined to form a square, so that subjects classified the spatial position of a dot within the square, they could then correctly classify 57 levels (5.8 bits). Note that this improvement does not
represent a perfect addition of channel capacity along the two dimensions. If processing along each dimension
were independent and unaffected by the other, the predicted amount of information transmitted would be 3.4
+ 3.4 = 6.8 bits, or around 100 positions (10 × 10) in the square. Egeth and Pachella’s results suggest that
there is some loss of information transmitted along each dimension resulting from the requirement to transmit
information along the other.
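These figures can be checked with a few lines of arithmetic. The Python sketch below simply converts between bits and the equivalent number of perfectly identified levels (log2 10 is 3.32, which the text rounds to 3.4).

    import math

    # Bits and equivalent levels for the Egeth and Pachella (1969) example.
    bits_one_dimension = math.log2(10)                 # 10 dot positions on a line: about 3.3 bits
    independence_prediction = 2 * bits_one_dimension   # about 6.6 bits if the dimensions did not interact
    equivalent_levels = 2 ** independence_prediction   # about 100 (10 x 10) positions in the square
    observed = math.log2(57)                           # 57 positions actually classified: about 5.8 bits
    print(bits_one_dimension, independence_prediction, equivalent_levels, observed)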
Going beyond the two-dimensional case, Pollack and Ficks (1954) combined six dimensions of an
auditory stimulus (e.g., loudness, pitch) orthogonally. As each successive dimension was added, subjects
showed a continuous gain in total information transmitted but a loss of information transmitted per dimension.
These relations are shown in Figure 2.10a, with a maximum of about seven bits of total information transmitted. The
reason why people with absolute pitch are superior at classifying pitches does not lie in greater discrimination
along a single continuum. Rather, those with absolute pitch make their judgments along at least two
dimensions: the pitch of the octave and the value of a note within the octave. They have created a
multidimensional stimulus from a stimulus that most treat as unidimensional (Shiffrin & Nosofsky, 1994;
Takeuchi & Hulse, 1993).

7.3.2 CORRELATED DIMENSIONS The previous discussion and the data shown in Figure 2.10a suggest that
combining stimulus dimensions orthogonally leads to a loss in information transmitted. As noted, however,
dimensions can be combined in a correlated (i.e., redundant) fashion. For example, the position and color of
an illuminated traffic light are redundant dimensions. When the top light is illuminated, it is always red.

FIGURE 2.10 Absolute judgment of multidimensional auditory stimuli. (a) Orthogonal dimensions. As more dimensions are added, more total
information is transmitted, but less information is transmitted per dimension. (b) Correlated dimensions. As more dimensions are added, the
security of the channel improves, but HS limits the amount of information that can be transmitted.

In such cases, HS, the information in the stimulus, is no longer the sum of HS across dimensions. In the
traffic light example, the correlation between dimensions is 1.0 and, as such, the total HS is equal to the HS on
one dimension alone (since the other dimension is completely redundant). If there are many dimensions, and
they are all perfectly correlated, the HS for all dimensions is the same as HS for any one dimension. Clearly
the maximum possible HS for all dimensions in combination is less than its value would be in the orthogonal
case. HT cannot be greater than this value; thus, HT is limited by HS (a channel cannot transmit more
information than there is in the stimulus).
Eriksen and Hake (1955) found that by progressively combining more dimensions redundantly, the
information loss (HS-HT) is much less for a given value of HS than it is when they are combined orthogonally
(sensitivity is increased), and the information transmitted (HT) is greater than it would be along any single
dimension. As illustrated in Figure 2.10b, HS represents a limit on information transmitted with correlated
dimensions, and as the number of redundant or correlated dimensions increases, HT will approach that limit
and information loss will be minimized.
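A brief sketch makes the contrast concrete; the two four-level dimensions here are hypothetical.

    import math

    # Stimulus information H_S for two hypothetical four-level dimensions combined
    # orthogonally versus in a perfectly correlated (fully redundant) fashion.
    h_one_dimension = math.log2(4)            # 2 bits available on a single dimension

    h_s_orthogonal = 2 * h_one_dimension      # independent dimensions: H_S values add (4 bits)
    h_s_correlated = h_one_dimension          # perfectly correlated dimensions: H_S stays at 2 bits

    # H_T can never exceed H_S, so correlation caps what can be transmitted,
    # but the redundant dimension protects (secures) the bits that are there.
    print(h_s_orthogonal, h_s_correlated)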
It should be noted that the value of the correlation can range from 0 to 1.0. Such correlation may result
from natural variation in the stimulus material (e.g., height and weight of a person, or perhaps hue and
brightness of a color swatch). Alternatively, as with our traffic light example, it may result from artificially
imposed constraints by the designer, in which case the correlation is usually 1.0, indicating complete
redundancy. Other examples include the various security or threat advisory systems (e.g., the Homeland
Security Advisory System and the defense readiness condition (DEFCON)) used to communicate threat
levels, which commonly assign a color or numeric code to a particular level, such as yellow indicating an
elevated or “significant risk of terrorist attacks.” The colors and numeric codes are perfectly correlated with
the various threat level names, which helps to maximize the likelihood that a particular threat level is
accurately communicated.
In summary, we can see that orthogonal and correlated dimensions accomplish two different objectives
in absolute judgment of multidimensional stimuli. Orthogonal dimensions maximize HT, the efficiency of the
channel; correlated dimensions minimize Hloss; that is, maximize the security of the channel. For example, the
designer of a demographic map can increase the amount of information communicated to the user by
increasing the number of perceptual dimensions (e.g., color, size, shape), treating each dimension orthogonally. The concern is that, as she does this, less and less information is transmitted per
added dimension (since more errors will be made on each). To maximize security—to ensure that information
is consistently and effectively transmitted—the designer needs to redundantly code information by ensuring
that the information carried by each dimension is perfectly correlated. The concern here is that there is a
restriction on the total amount of information that can be transmitted since new dimensions do not increase
HS.
Sometimes designers allow users to set the coding for different purposes. For example, a smart phone
designer might use several different ring settings. For one setting, an auditory ring tone, a tactile vibration,
and a red flashing light co-occur whenever a new email or text message appears. This maximizes the
likelihood that the user will detect the signal when it occurs (maximizes the security of the channel).
However, because the alarm tone can be distracting or can annoy others, smart phone manufacturers typically
provide other settings in which the type of signal is used to indicate the type of message (e.g., a ring is only
heard when there is an incoming telephone call, and a buzz or light is used to indicate an email or text
message). When the signal appears, more information is transmitted (the user is told both that there is a
message and also what type of message—voice or text), but because there is only one code per message type,
it is possible for the user to miss the message (e.g., not feel the buzz or see the light). Information is
communicated less securely with the orthogonal dimensions (ring versus buzz) than when both are used
redundantly.

7.3.3 DIMENSIONAL RELATIONS: INTEGRAL AND SEPARABLE Orthogonal and correlated refer to the relationship
between a pair of dimensions. As such, they are properties of the information conveyed by a multidimensional
stimulus, not the physical form of the stimulus. However, the nature of the physical relationship between the
two dimensions is also important.
In particular, Garner (1974) made the important distinction between an integral and a separable pair of
physical dimensions. When people classify stimuli, they do so quite differently with integral and separable
pairs. To reveal the different implications on human performance of integral versus separable pairs,
experiments are performed in which subjects categorize different levels of one dimension for a set of stimuli
(a Garner sort task). In the control condition, participants sort on one varying dimension while the other
dimension is held constant. For instance, they might sort rectangles by height while the width remains
constant. In the orthogonal condition, they sort on one varying dimension while ignoring variation in the
other dimension. Thus, as in Figure 2.11a, they might sort on rectangle height as rectangle width also varies,
even though the width is irrelevant to their task and hence should be ignored. Finally, in the correlated
(redundant) condition, the two dimensions are perfectly correlated. An example would be sorting rectangles
whose height and width covary. The subject is told to judge one of the dimensions (e.g., height) only.
However, the taller rectangles would always be wider, and short rectangles would always be narrower.
An experiment by Garner and Felfoldy (1970) revealed that when subjects performed this sorting task
with rectangle height and width as described above, performance in the orthogonal condition was impaired
relative to the control condition. This effect has since been dubbed Garner interference. However,
performance in the correlated condition was better than the control condition. One could refer to this result as
a failure of focused attention (see Chapter 3). The interference and facilitation effects are the hallmark of
integral dimensions. In contrast, when sorting with the dot position stimuli in Figure 2.11b, which vary in
vertical and horizontal position, performance is little helped by redundancy and little hurt by the orthogonal
variation of the irrelevant dimension. These results suggest that vertical and horizontal dot position are
separable dimensions. The differences between integral and separable dimensions are observed
whether performance is measured by accuracy (i.e., HT), when several levels of each dimension are used, or
speed, when only two levels of each dimension are used and accuracy is nearly perfect. Table 2.2 lists
examples of integral and separable pairs of dimensions, as determined by Garner’s classification
methodology.
As Table 2.2 shows, judgments involving dimensions of sound are generally integral. Pitch (whether a
sound is high or low) has an integral relation with loudness, and with timbre (the quality of sound that allows
us to distinguish a piano and a guitar playing the same note, for example); and timbre itself has three key dimensions, all of which have been shown to be integral with each other (Caclin, Giard, et al.,
2007). Finally, there is good evidence that pitch and location (azimuth) show integral relations (Dyson &
Quinlan, 2010; Mondor, 1998).

FIGURE 2.11 Examples of integral and separable dimensions. (a) integral dimensions: rectangle height and width. (b) separable dimensions:
dot position.

TABLE 2.2 Pairs of Integral and Separable Dimensions


Integral Dimensions                          Separable Dimensions
height of rectangle / width of rectangle     vertical position / horizontal position
lightness / color saturation                 size (area) / color saturation
hue / color saturation                       size (area) / brightness
pitch / timbre                               shape / color saturation
pitch / loudness                             shape (and letter shape) / color
timbre dimensions                            duration / location
pitch / location                             orientation (angle) / size
spatial location / temporal order
facial features / spacing of features        facial features

7.3.4 CONFIGURAL DIMENSIONS With some pairs of dimensions, it matters which level of one dimension is paired
with which level of the other. An example is shown in Figure 2.12(a). When the height and width of
rectangles are positively correlated, it creates rectangles of different sizes but constant shape. Classification
performance is not as good in this situation as when dimensions are negatively correlated (Figure 2.12b),
which creates rectangles of different shapes (Lockhead & King, 1977; Weintraub, 1971). Pairs of dimensions
for which the pairing of particular levels makes a difference are referred to as configural (Wickens &
Carswell, 1995; Carswell & Wickens, 1996). Thus, the height and width of a rectangle configure to produce
the emergent feature of shape. Such emergent features will have important implications for object displays,
to be discussed in the next chapter.

7.3.5 SUMMARY Before we discuss the practical implications of multidimensional absolute judgment, we will
briefly summarize the concepts. The information conveyed by a set of stimuli may vary independently on
several dimensions at once. If the human operator is asked to classify all of these dimensions, more total
information can be transmitted than if stimuli vary on only one dimension, but the information transmitted per
dimension will be less (Hloss will increase). The information conveyed by stimulus dimensions may be
correlated, which produces a redundancy gain in the speed and accuracy of information transmission.
The relation between particular perceptual or coding dimensions also matters. If two dimensions have an
integral relationship and both vary, it is difficult to judge one of the dimensions in isolation. However, it is
easier to judge a single dimension if the other dimension covaries with it. With separable dimensions,
variation in one dimension has little effect upon judgments of the other. Finally, how the levels of stimulus
dimensions are combined sometimes matters, because they can configure to create a new emergent property.

7.3.6 IMPLICATIONS OF MULTIDIMENSIONAL ABSOLUTE JUDGMENT As with single dimensions, multidimensional
absolute judgment applies to both sorting and code interpretation tasks. There are occupations that require an
operator to sort or classify natural or manufactured stimuli over whose physical form the human factors
engineer and system designer has little control. Examples include fabric inspection (Mursalin et al., 2008),
glass bottle inspection (Carrasco et al., 2010), or welding quality (Liao, 2003).
The industrial inspector needs to sort items into categories along some dimension (e.g., smoothness). In general, the quality of the sort will be worse if the stimuli vary along some other dimension in addition to the
relevant one for sorting and the two dimensions are integral. For example, a poorer sort could result if the
inspector is looking for bumps or cracks in the sheet metal surface and the sheet metal varies not only in
smoothness but also in its shininess. In contrast, if particular perceptual dimensions are found to covary in the
natural stimuli (e.g., fabric with a rougher texture is also darker in color), then inspectors can be trained or
encouraged to look for these associations and take advantage of their covariation, which should speed up the
inspection process.

FIGURE 2.12 Example of configural dimensions for the heights and widths of rectangles. In part (a), heights and widths are positively
correlated; in part (b), they are negatively correlated, producing different shapes (an emergent feature).

In the coding task, the operator is confronted by an artificial symbol, created on a display by the system
designer, which is to be interpreted as representing levels on two or more information dimensions. When
coding dimensions have been shown to be separable, this implies that variation in one dimension will not
affect perception of the other.
In contrast, when coding dimensions have been shown to be integral, this will provide certain
disadvantages (and certain advantages). It will now be more difficult to consider each dimension in isolation,
but if the two dimensions are related, then it will be easier to detect and respond to that covariation. For
example, a designer interested in using auditory displays, cues, or warnings needs to be aware of the integral
nature of sound dimensions, and that identification decisions about the level of one dimension (e.g., high or
low pitch) will be affected by variations in another (e.g., loudness). The designer of auditory alerts for a
cockpit might need to use different volume levels when the aircraft is in flight versus taxiing or when
stationary. These changes in volume levels will affect the discriminability of the pitches used to code different
alert types. Pitch categorization will also be impaired when pitches are compared across different spatial
locations, versus in a single location. However, the use of three-dimensional (3D) audio to “place” sounds in
space may have other benefits, as will be discussed in Chapter 3.
In the visual domain, Rothrock et al. (2006) showed that the integral dimensions of height and width can be used to create a display showing a cross section of a steel I-beam used in construction. Because the dimensions are integral, and the task is to optimize the height-to-width ratio (different shapes are best for the different weights the I-beam will be used to support), they found that a display depicting the dimensions in spatially integrated form was more effective than a display showing the dimensions separately. In contrast, Hollands (2003)
showed that when judgments of individual dimensions are required, it is better to use a display arrangement in
which angle and area can be judged independently (a pie chart) than one in which a dimension must be computed from two areas that both vary (a stacked bar graph).
The display designer should also be sensitive to the natural relationships between the variables being
depicted in a display. For example, since higher altitudes are typically associated with less vegetation, the
electronic map designer can use integral dimensions to express that relationship. Indeed, many topographical
maps use different colors for high altitudes than low (light brown versus green), and lower altitudes also tend
to be darker areas on the map. Here color hue is being used to code altitude, but the brightness also reflects the
presence of vegetation (and water) at low altitudes, and their relative absence at high altitudes. This natural
correlation is reflected in the choice of integral dimensions (hue and brightness). This issue will be revisited in
the context of information visualization in Chapter 5.
If it is critical to avoid information loss on any particular variable, or the conditions of viewing or
hearing are degraded, then complete redundancy is the answer to display coding. Thus, to indicate an alarm
condition, an auditory warning system could covary the loudness of the horn with pitch to indicate the
seriousness of the alarm. These dimensions are integral since a pitch must have a level of loudness.
The redundancy gain will probably be greater if the dimensions are integral but will be realized even
with separable dimensions. For example, Kopala (1979) found that Air Force pilots were better able to encode
information concerning the level of threat of displayed targets when it was presented redundantly by shape
and color than when it was presented by either dimension alone. Stoplights are also a good example of
redundant coding of critical safety information that simply cannot be lost in transmission: The position of the
light is perfectly correlated with the color.
One final application of the integral-separable distinction in multidimensional absolute judgment was
demonstrated by Jacob et al. (1994). They showed that the relationship between perceptual dimensions
affected the choice of control input devices for graphical interactive tasks on computers. A separable control
device constrains movements to a city-block pattern; movement takes place along a single dimension at a
time. With an integral device, movement can be directed along multiple dimensions at once (in Euclidean
space). Jacob et al. found that when dimensions were integral (location and size), the integral control device
(3D Polhemus tracker) was superior, but when stimulus dimensions were separable (location and color), the
separable device (mouse with button press for mode change) was superior. We will further consider the
relationship between perception and action in Chapter 9. We will continue to see the value of the distinction
between integral and separable dimensions throughout this book, especially when we consider visual attention
in Chapter 3 and spatial displays in Chapter 4. Indeed, in those chapters we will see that the distinction
underlies an important principle for display and interface design.

8. TRANSITION
In this chapter, we have seen how people classify stimuli into two levels along one dimension, several levels
along one dimension, and several levels along several dimensions. In our discussion of the first of these tasks,
we saw that signal detection was characterized by the probabilistic element of decisions under uncertainty;
this characteristic will be addressed in much more detail in Chapter 8 as we discuss more complex forms of
decision making. In our treatment of multidimensional absolute judgment, we saw that more information
could be transmitted as dimensions were combined, and indeed most patterns that we encounter and classify
in the world are multidimensional. We discuss these elements of pattern recognition in Chapters 4, 5, and 6,
and the integration of multiple redundant cues in instructional design in Chapter 6 and in decision making in
Chapter 8. Finally, in the context of Section 7.3 above, it is apparent that when human operators transmit
information along all dimensions of a stimulus with two (or more) dimensions, they must divide attention between
the dimensions. When they are asked to process one dimension and ignore changes on the others, they are
focusing attention. These concepts of divided and focused attention will be considered in much more detail in
the next chapter, when we will consider their broader relevance to the issue of attention to events and objects
in the world as well as to dimensions.
One final word on SDT. We will return to SDT concepts as we progress through the topics in this book.
SDT has been successfully applied to a range of search tasks involving attention, and SDT concepts also
extend to how we exert executive control on attention (Logan, 2004), which has implications for how we time-share attention when performing tasks simultaneously. These topics—of obvious relevance to cognitive work—will be discussed in more detail in Chapters 3 and 10, respectively. So the concepts of sensitivity and bias
as characterized in SDT extend well beyond detection of simple signals and noise, and enjoy a breadth of
application across a range of cognitive tasks, as we will discuss later.

SUPPLEMENT: INFORMATION THEORY
S.1 The Quantification of Information
The discussion of SDT was our first direct encounter with the human operator as a transmitter of information:
An event (signal) occurs in the environment; the human perceives it and transmits this information to a
response. Indeed, a considerable portion of human performance theory revolves around the concept of
transmitting information. In any situation in which the human operator either perceives changing
environmental events or responds specifically to events that have been perceived, the operator is encoding or
transmitting information. A car driver processes visual signals from traffic signs, from other vehicles, from
the dashboard, and from in-vehicle map displays, and must also process auditory signals (e.g., a truck horn). A
fundamental issue in engineering psychology is how to quantify this flow of information so that different
tasks confronting the human operator can be compared. Information theory provides a metric to compare
human performance across a wide range of different tasks.
Information is potentially available in a stimulus event any time there is some uncertainty about what the
stimulus will be. How much information a stimulus delivers depends in part on the number of possible events
that could occur in that context. If the same event occurs on every trial, its occurrence conveys no
information. If two stimuli (events) are equally likely, the amount of information conveyed by one of them
when it occurs, expressed in bits, is simply equal to the base 2 logarithm of the number of events; for example, with
two events, log2 2 = 1 bit. If there were four alternatives, the information conveyed by the occurrence of one
of them would be log2 4 = 2 bits.
Formally, information is defined as the reduction of uncertainty (Shannon & Weaver, 1949). Before the
occurrence of an event, you are less sure of the state of the world (you possess more uncertainty) than after.
When the event occurs, it has conveyed information to you, unless it is entirely expected. The statement
“There was a terrorist attack on the United States this morning” conveys quite a bit of information. Your
knowledge and understanding of the world are probably quite different after hearing the statement than they
were before. On the other hand, the statement “the sun rose this morning” conveys little information because
you could anticipate the event before it occurred. Information theory formally quantifies the amount of
information conveyed by a statement, stimulus or event. This quantification is influenced by three variables:
1. The number of possible events that could occur, N;
2. The probabilities of those events; and
3. Their sequential constraints, or the context in which they occur.
We will now describe how each of these three variables influences the amount of information conveyed by an
event.

S.1.1 NUMBER OF EVENTS Before the occurrence of an event (which conveys information), a person has a state of
knowledge that is characterized by uncertainty about some aspect of the world. After the event, that
uncertainty is normally less. The amount of uncertainty reduced by the event is defined to be the average
minimum number of true-false questions that would have to be asked to reduce the uncertainty. For example,
the information conveyed by the statement "Obama won" after the 2008 election is 1 bit because the answer
to one true-false question (e.g., “Did Obama win?”—“True” or “Did McCain win?”—“False”) is sufficient to
reduce the previous uncertainty. If, on the other hand, there were four major candidates, all running for office,
two questions would have to be answered to eliminate uncertainty. In this case one question, Q1, might be
“Was the winner from the liberal (or conservative) pair?” After this question was answered, Q2 would be
“Was the winner the more conservative (or liberal) member of the pair?” Thus, if you were simply told the
winner, that statement would formally convey 2 bits of information. This question-asking procedure assumes
that all alternatives are equally likely to occur. Formally, then, when all alternatives are equally likely, the
information conveyed by an event, HS, in bits, can be expressed by the formula

HS = log2 N    (2.5)

where N is the number of alternatives.


Because information theory is based on the minimum number of questions and therefore arrives at a
solution in a minimum time, it has a quality of optimal performance. Clearly, human performance is often not
optimal (consider, for example, the data shown in Figure 2.4). However, it is often the difference between real
human performance and an optimal value that gives us a clearer understanding of how humans process
information.

S.1.2 PROBABILITY In fact, events in the world do not always occur with equal frequency or likelihood. If you
lived in the Arizona desert, much more information would be conveyed by the statement “It is raining today”
than the statement “It is sunny.” Your certainty of the state of the world is changed very little by knowing that
it is sunny, but it is changed quite a bit (uncertainty is reduced) by hearing of the low-probability event of
rain. In the example of the four election candidates, less information would be gained by learning that the
favored candidate won than by learning that the Green Party or Libertarian candidate won. The probabilistic
element of information is quantified by making rare events convey more bits. This, in turn, is accomplished
by revising Equation 2.5 for the information conveyed by event i to be

Hi = log2 (1/Pi)    (2.6)

where Pi is the probability of occurrence of event i. This formula increases H for low-probability events. Note
that if N events are equally likely, each event will occur with probability 1/N. In this case, Equations 2.5 and
2.6 are equivalent.
As noted, information theory is based on a prescription of optimal behavior. This optimum can be
prescribed in terms of the order in which the true-false questions should be asked. If some events are more
common or expected than others, then we should ask the question about the common event first. In our four-
candidate example, we will do best (ask the minimum number of questions on average) by first asking,
“Is the winner Obama?” or “Is the winner McCain?” assuming that these two candidates would have the
highest probability of winning. If instead we asked “Is the winner an independent?” or “Is the winner from
one of the minor parties?” we have “wasted” a question since the answer is likely to be no, and our
uncertainty would be reduced by only a small amount.
The information conveyed by a single event of known probability is given by Equation 2.6. However,
psychologists are often more interested in measuring the average information conveyed by a series of events
with differing probabilities that occur over time—for example, a series of warning lights on a panel or a series
of communication commands. In this case the average information conveyed is computed as

Have = Σ Pi [log2 (1/Pi)]    (2.7)

In this formula, the quantity within the square brackets is the information per event as given in Equation 2.6.
This value is now weighted by the probability of that event, and these weighted information values are
summed across all events. Accordingly, frequent low-information events will contribute heavily to this
average, whereas rare high-information events will not. If the events are equally likely, this formula will
reduce to Equation 2.5.
An important characteristic of Equation 2.7 is that if the events are not equally likely, Have will always be
less than its value if the same events are equally probable. For example, consider four events, A, B, C, and D,
with probabilities of 0.5, 0.25, 0.125, and 0.125. The computation of the average information conveyed by each event in a series of such events would proceed as follows:

Have = 0.5 log2 (1/0.5) + 0.25 log2 (1/0.25) + 0.125 log2 (1/0.125) + 0.125 log2 (1/0.125)
     = 0.5(1) + 0.25(2) + 0.125(3) + 0.125(3) = 1.75 bits
This value is less than log2 4 = 2 bits, which is the value derived from Equation 2.5 when the four events
are equally likely. In short, although low-probability events convey more information because they occur
infrequently, the fact that they do occur infrequently causes their high-information content to contribute less to
the average.
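The same computation can be written as a few lines of Python, applying Equation 2.7 to the four-event example above:

    import math

    # Average information (Equation 2.7) for four events with probabilities
    # 0.5, 0.25, 0.125, and 0.125.
    probabilities = [0.5, 0.25, 0.125, 0.125]
    h_ave = sum(p * math.log2(1 / p) for p in probabilities)
    print(h_ave)   # 1.75 bits, less than the 2 bits obtained when all four events are equally likely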

S.1.3 SEQUENTIAL CONSTRAINTS AND CONTEXT In the preceding discussion, probability has been used to reflect
the long-term frequencies, or steady-state expectancies, with which events occur. However, there is a third
contributor to information that reflects the short-term sequences of events, or their transient expectancies. A
particular event may occur rarely in terms of its absolute frequency. Given a particular context, however, it
may be highly expected, and therefore its occurrence conveys very little information in that context. In the
example of rainfall in Arizona, we saw that the absolute probability of rain is low. But if we heard that there
was a large front moving eastward from California, our expectancy of rain, given this information, would be
much higher. That is, information can be reduced by the context in which it appears. As another example, the
letter u in the alphabet is not terribly common and therefore normally conveys quite a bit of information when
it occurs; however, in the context of a preceding q, it is almost totally predictable and therefore its information
content, given that context, is nearly 0 bits.
Contextual information is frequently provided by sequential constraints on a series of events. In the
series of events ABABABABAB, for example, P(A) = P(B) = 0.5. Therefore, according to Equation 2.7, each
event conveys 1 bit of information. But the next letter in the sequence is almost certainly an A. Therefore, the
sequential constraints reduce the information the same way a change in event probabilities reduces
information from the equiprobable case. Formally, the information provided by an event, given a context, may
be computed in the same manner as in Equation 2.6, except that the absolute probability of the event Pi is now
replaced by a contingent probability Pi|X (the probability of event i given context X).

S.1.4 REDUNDANCY In summary, three variables influence the amount of information that a series of events can
convey. The number of possible events, N, sets the maximum possible number of bits, which is attained only when all events are equally likely. Making event probabilities unequal and increasing sequential constraints both serve to
reduce information from this maximum. The term redundancy formally defines this potential loss in
information. Thus, for example, the English language is highly redundant because of two factors: All letters
are not equiprobable (e’s versus x’s), and sequential constraints such as those found in common digraphs like
qu, ed, th, or nt reduce uncertainty.
Formally, the percent redundancy of a stimulus set is quantified by the formula

% redundancy = (1 – Have/Hmax) × 100   (2.8)


where Have is the actual average information conveyed taking into account all three variables (approximately
1.5 bits per letter for the alphabet) and Hmax is the maximum possible information that would be conveyed by
the N alternatives if they were equally likely (log2 26 = 4.7 bits for the alphabet). Thus, the redundancy of the
English language is (1 – 1.5/4.7) × 100 = 68 percent. Wh-t th-s sug-est- is t-at ma-y of t-e le-ter- ar- not ne-
ess-ry fo- com-reh-nsi-n. However, to stress a point that will be emphasized in Chapter 6, this fact does not
negate the value of redundancy in many circumstances. We saw in our discussion of vigilance that
redundancy gain can improve performance when perceptual judgments are difficult. We also saw its value in
absolute judgment tasks.

S.2 Information Transmission of Discrete Signals


In much of human performance theory, investigators are concerned not only with how much information is
presented to an operator but also with how much is transmitted from stimulus to response, the channel
capacity, and how rapidly it is transmitted, the bandwidth. Using these concepts, the human being is
sometimes represented as an information channel, an example of which is shown in Figure 2S.1. Consider the
typist typing up some handwritten notes. First, information is present in the stimuli (the notes written on the
page). This value of stimulus information, HS, can be computed by the procedures described, taking into
account probabilities of different letters and their sequential constraints. Second, each response on the
keyboard is an event, and so we can also compute response information, HR, in the same manner. Finally, we
ask if each letter on the page was appropriately typed on the keyboard. That is, was the information faithfully
transmitted, HT? If it was not, there are two types of mistakes: First, there could be a loss of information in the stimulus, HL, which would be the case if a certain letter was not typed. Second, letters may be typed that were
not in the original text. This is referred to as noise. Figure 2S.1a illustrates the relationship among these five
information measures. Notice that it is possible to have a high value of both HS and HR but to have HT equal
to zero. This result would occur if the typist ignored the printed text, creating a new message, as shown in
Figure 2S.1b.

FIGURE 2S.1 Information transmission and the channel concept: (a) information transmitted through the system; (b) no information
transmitted.

We will now compute HT in the context of a four-alternative forced choice (4AFC) task rather than the
more complex typing task. In the 4AFC task the subject is confronted by four possible events (e.g., one of
four lights flashes), any of which may appear with equal probability, and must make a corresponding response
for each (e.g., push one of four buttons).
When deriving a quantitative measure of HT, it is important to realize that for an ideal information
transmitter, HS = HT = HR. In optimal performance of the reaction time task, for example, each stimulus
(conveying 2 bits of information if equiprobable) should be processed (HT = 2 bits) and should trigger the
appropriate response (HR = 2 bits). As we saw, in information-transmitting systems this ideal state is rarely
obtained because of the occurrence of equivocation (loss of stimulus information, HL) and noise.
The computation of HT is performed by setting up a stimulus-response matrix, such as that shown in
Figure 2S.2, and converting the various numerical entries into three sets of probabilities: the probabilities of
events, shown along the bottom row; the probabilities of responses, shown along the right column; and the
probabilities of a given stimulus-response pairing. These latter values are the probability that an entry will fall
in each filled cell, where a cell is defined jointly by a particular stimulus and a particular response. In Figure
2S.2a, there are four filled cells, with P = 0.25 for each entry. Each of these sets of probabilities can be
independently converted into the information measures by Equation 2.7.
Once the quantities HS, HR, and HSR are calculated, the formula

HT = HS + HR – HSR   (2.9)

allows us to compute the information transmitted. The rationale for the formula is as follows: The variable HS
establishes the maximum possible transmission for a given set of events and so contributes positively to the
formula. Likewise, HR contributes positively. However, to guard against situations such as that depicted in Figure 2S.1b, in which events are not coherently paired with responses, HSR, a measure of the dispersion or lack of organization within the matrix, is subtracted. If each stimulus consistently generates only one response (Figure 2S.2a), the cell entries in the matrix will equal the corresponding row and column totals. In this case, HS = HR = HSR, which means that HS = HT. However, if there is greater dispersion within the matrix, there are more
bits within HSR. In Figure 2S.2b this is shown by eight equally probable stimulus-response pairs, or 3 bits of
information in HSR. Therefore, HSR > HS, and HT = 1 bit, less than the 2 bits of information in the stimulus.
The relation between all six quantities (the five types of H values and noise) is summarized in a Venn diagram
in Figure 2S.3.

FIGURE 2S.2 Two examples of the calculation of information transmission.

Sometimes we are interested in the information transmission rate expressed in bits/second rather than
just the quantity HT. To find this rate, HT is computed over a series of stimulus events, along with the average
time for each transmission (i.e., the mean reaction time, RT). Then the ratio HT/RT is taken to derive a
measure of the bandwidth of the communication system in bits/second. This is a useful metric because it
represents processing efficiency by taking into account both speed and accuracy, and allows comparison of
efficiencies across tasks. For example, measures of processing efficiency can be obtained for typing or
monitoring tasks and the bandwidths can be compared. We describe the relationship between speed and
accuracy of response in Chapter 9.
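As a rough illustration of these calculations (the count matrices below are our own examples, laid out with stimuli as columns and responses as rows to match the description of Figure 2S.2), the following Python sketch computes HS, HR, HSR, and HT from a stimulus-response matrix, and then converts HT to a bandwidth estimate given an assumed mean RT of 0.5 seconds.

import math

def entropy(probs):
    # Average information (Equation 2.7) for a set of probabilities
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

def information_transmitted(matrix):
    # HT = HS + HR - HSR (Equation 2.9); rows are responses, columns are stimuli
    total = float(sum(sum(row) for row in matrix))
    n_rows, n_cols = len(matrix), len(matrix[0])
    h_s = entropy([sum(matrix[r][c] for r in range(n_rows)) / total for c in range(n_cols)])
    h_r = entropy([sum(row) / total for row in matrix])
    h_sr = entropy([cell / total for row in matrix for cell in row])
    return h_s + h_r - h_sr

# Perfect 4AFC performance: each stimulus always evokes its own response (cf. Figure 2S.2a)
perfect = [[25, 0, 0, 0], [0, 25, 0, 0], [0, 0, 25, 0], [0, 0, 0, 25]]
# Dispersed responding: eight equally likely stimulus-response pairings (cf. Figure 2S.2b)
dispersed = [[12, 12, 0, 0], [0, 12, 12, 0], [0, 0, 12, 12], [12, 0, 0, 12]]

print(information_transmitted(perfect))        # 2.0 bits per trial
print(information_transmitted(dispersed))      # 1.0 bit per trial
print(information_transmitted(perfect) / 0.5)  # bandwidth: 4.0 bits/second at RT = 0.5 s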

S.3 Conclusion
In conclusion, it should be noted that information theory has clear benefits. It provides a single combined
measure of information transmission that is generalizable across tasks. The bit measure provides a useful
method to understand performance in absolute judgment, as we saw earlier in this chapter. Further,
information theory provides a useful heuristic for characterizing human cognition: that the human in the loop
is an information transmitter, and that by perceiving, thinking, deciding, and acting, we are serving to reduce
uncertainty.

FIGURE 2S.3 Information transmission represented in terms of Venn diagrams.

Nonetheless, information theory also has its limitations when applied to human information processing
(Laming, 2001; Luce, 2003; Wickens, 1984). Unlike computer networks, humans do not process sequential
bits of information independently, as the theory assumes. As we shall see in Chapter 9, when we consider how the number of possible response alternatives affects response time, non-informational factors like stimulus repetition and practice also affect response times (when information theory says they should not). Further, it
appears that when observers perform long sequences of repetitive judgments (as for example in a signal
detection or vigilance task), they try to balance their error types (false alarms and misses) to be approximately
equal (probability matching) trial to trial rather than set a fixed criterion based on stimulus probabilities, as
information theory would suggest (Laming, 2001, 2010). The “sluggish beta” data in Figure 2.4 show this
pattern. Put another way, it is not so much the capacity limitations of a channel that limit human performance; rather, it is that human beings process information in ways that depart from the optimal.

Furthermore, HT measures only whether responses are consistently associated with events, not whether
they are correctly associated, and the measure does not take into account the magnitude of an error.
Sometimes larger errors are more serious, such as when the stimulus and response scales lie along a
continuum (e.g., driving a car on a winding road, tracking a moving target). Information theory can be applied
to such continuous tasks, and Wickens (1992) describes methods for doing this. However, an alternative is to
use either a correlation coefficient or some measure of the integrated error across time, as discussed in
Chapter 9. Further discussion of HT and its relation to measures of d′ and percentage correct can be found in
Wickens (1984).

APPENDIX: COMPUTING d′ AND BETA


To compute any SDT parameter, you need two values: P(H) and P(FA). For sensitivity, you need to compute
the z-scores corresponding to P(H) and P(FA). You might remember doing something similar in a statistics
course, when you computed the area under the standard normal curve for a particular z-score. This is the
opposite process, going from the probability (the area under the curve to the right of the criterion) to the z-score.
Sensitivity is defined as:
d′ = z(H) – z(FA)
Table A1 shows values of d′ for some combinations of P(H) and P(FA) values.
When P = .5, z = 0; z is positive when P > .5; otherwise z is negative. In most situations, P(H) will be greater than .5 and P(FA) will be less, so z(H) will be positive and z(FA) will be negative; subtracting the negative z(FA) means that, in effect, you add the two z magnitudes.

TABLE A1 Some Values of d′

Selected values from Signal Detection and Recognition by Human Observers (Appendix 1, Table 1) by J. A. Swets, 1969, New York: Wiley.
Copyright 1969 by John Wiley & Sons, Inc. Reproduced by permission.

To compute beta, things are a little more complicated. First, compute the value C (similar to XC as
described in the text). The equation is
C = -0.5 (z(FA) + z(H))
Note that when P(FA) = P(Miss) (neutral) then z(FA) = z(1-H), z(FA) + z(H) = 0, and C = 0 (zero bias).
When P(FA) > P(Miss) (liberal) then z(FA) > z(1-H) and C < 0. When P(FA) < P(Miss) (conservative) then
z(FA) < z(1-H) and C > 0.
Now that you have computed C, it is quite simple to compute beta:
ln(beta) = C × d′
which means that
beta = exp(C × d′).
Essentially, compute the product of C and d′, and take the antilog of that value (using natural logarithms)
to obtain beta.
There are web pages that can compute SDT parameters from a set of P(H) and P(FA) values. Google it!
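If you would rather compute these values yourself, here is a minimal Python sketch using the standard library's inverse-normal function; the hit and false-alarm rates in the example are arbitrary.

import math
from statistics import NormalDist

def sdt_parameters(p_hit, p_fa):
    # z-scores corresponding to P(H) and P(FA)
    z = NormalDist().inv_cdf
    z_hit, z_fa = z(p_hit), z(p_fa)
    d_prime = z_hit - z_fa            # d' = z(H) - z(FA)
    c = -0.5 * (z_fa + z_hit)         # C = -0.5 [z(FA) + z(H)]
    beta = math.exp(c * d_prime)      # ln(beta) = C × d'
    return d_prime, c, beta

print(sdt_parameters(p_hit=0.84, p_fa=0.16))  # approximately d' = 2.0, C = 0, beta = 1.0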

Key Terms
absolute judgment 32
alarm false alarms 25
arousal theory 27
automatic processing 29
bandwidth 44
beta 12
bits (binary digits) 32
bow effect 33
channel capacity 33
compatibility 34
confidence levels 17
configural 38
contingent probability 44
correct rejections 9
correlated (redundant) condition 37
correlated dimensions 34
disease prevalence 21
emergent feature 38
event rate 26
expectancy theory 28
expected value 13
false alarm rate 9
false alarms 9
fuzzy SDT 19
Garner interference 37
Garner sort task 37
hits 9
hit rate 9
identification 8
information theory 32
information transmission rate 46
integral dimensions 37
isosensitivity curve 16
knowledge of results (KR) 30
mapping function 19
minimum safe altitude warning 25
misses 9
negative diagonal 17
noise 49
non-target events 26
optimal beta 13
orthogonal condition 37
payoffs 27
percent redundancy 44
positive diagonal 17
reaction time 46
receiver operating characteristic (ROC) curve 16
redundancy 44
relative judgment 22
response bias or criterion 12
ROC curve 16
sensitivity or d′ (dee prime) 15
separable 37
separable dimensions 37
sequential constraints 44
signal detection 8
signal strength 26
simultaneous tasks 26
situational awareness 31
sluggish beta 14
successive tasks 26
sustained attention 27
sustained demand theory 27
target event 26
target probability 26
uncertainty 26
vigilance decrement 26
vigilance level 26
watch 26

3 ATTENTION IN PERCEPTION AND DISPLAY
SPACE

1. OVERVIEW
Driving is a skill that challenges one’s attention. Of the approximately 40,000 lives lost each year to
automobile accidents in the United States, it is estimated that more than half are caused, at least in part, by distraction (Lee et al., 2009); that is, by an attentional failure. Drivers must deploy attention broadly across the
roadway and within the vehicle to select that which is most relevant to safe driving. Some features of the
environment can capture attention and usefully guide it to objects (e.g., a warning light or a sign). On the
other hand, other features within our environment can inadvertently capture our attention and guide it to
unwanted irrelevant features, hampering our focused attention on the task at hand (e.g., distraction from
children arguing in the back seat).
Drivers must also divide attention between various sources of information. For example, the driver
attending to the turn directions from a navigational device must be aware of the relevant landmarks for the
turn along the highway while also monitoring traffic, keeping in the lane, and maintaining speed. Finally,
drivers must sometimes sustain attention for long periods (vigilance), as when driving hour after hour on an
empty highway at night.
Of course these four aspects of attention—selecting, focusing, dividing, and sustaining—are critical in nearly all aspects of life within and outside of the workplace (Johnson & Proctor, 2004; Kramer, Wiegmann, & Kirlik, 2007; Wickens & McCarley, 2008). The skilled basketball player must select the open teammate for a
pass, divide attention between the teammate and the close defender, and avoid the unwanted distraction of the
hostile crowd. Attention is also important for the design of many workplace displays. In the last chapter we
discussed the attention-capturing properties of alarms, and the challenges of sustained attention in the
workplace, for which alarms are designed to compensate. Displays can be designed to support divided
attention across the many elements of complex systems, without producing the clutter that hinders the focus of
attention.
Using the metaphor of the flashlight, selective attention is deployed over time, as if attention were the
flashlight beam, illuminating (selecting) different parts of the external and internal environment in turn.
Focused attention can be described as the width of the beam, narrow enough to prevent distraction from
unwanted elements. Divided attention also defines the width of the beam, but in the opposite sense: the beam
should be sufficiently wide to accommodate two or more desired channels of information. Sometimes when
we get too focused on information inside the beam, we fail to notice other important information beyond it:
the issue of attentional narrowing. Of course, sustained attention (as required for vigilance tasks) may be
represented by the flashlight battery maintaining illumination over long periods of time.
Consistent with the human information processing framework of the book, we separate the treatment of
attention into two chapters. Here we focus on attention in sensation and perception—the earliest stages of
information processing. We address selective attention, focused attention, and divided attention between
channels of sensory information. In Chapter 10, after we have discussed all of the stages of the information
processing model, we address how the various stages are utilized for divided attention between tasks, or
multi-tasking.
The current chapter focuses first on three important tasks for visual selective attention—supervisory
monitoring, noticing, and visual search—before considering how engineering design can guide attention.
Then we consider characteristics that can either aid or hinder divided visual attention between sources. Finally
we address attention to other modalities, particularly audition.

2. SELECTIVE VISUAL ATTENTION


Selective visual attention can engage in any of six different task types, as it is deployed across the visual
workspace. These are:

1. General orientation and scene scanning as might occur upon looking at a picture (Yarbus, 1967) or
encountering a new web page while browsing (Cockburn & McKenzie, 2001).
2. Supervisory control, as might describe the scan path of the pilot, vehicle driver, or the anesthesiologist,
assuring that certain dynamic variables are within bounds, and if they are not, exercising some form of
manual control (see Chapter 5) to bring them back. This task is highly goal directed.
3. Noticing, which involves monitoring for, and particularly responding to, somewhat unexpected events.
(Such events do not include the changes of parameters to be controlled in the supervisory control task.)
4. Searching for specific, usually pre-defined targets.
5. Reading.
6. Confirming that some control action has taken place (e.g., processing feedback).
Many tasks are clearly hybrids of some of the above. For example, following instructions when operating
equipment, reading graphs, or interpreting maps often involves some combination of searching and reading.
Given the somewhat ill-defined task goals of the first task, and the very specific processing aspects of the
fifth, we will not treat them in this chapter. Reading will be dealt with extensively in Chapter 6, and feedback
confirmation will be treated in Chapter 9. Our focus here then will be on understanding the role of selective
visual attention via eye movements in the three tasks of supervisory control, noticing, and searching.

2.1 Supervisory Control: The SEEV model


To understand the role of visual attention in supervisory control, it is necessary first to define the concept of
area of interest (AOI). The AOI is a physical location where specific task-related information can be found.
Examples would include the speedometer in a car or the surgical cavity of a patient in the operating room. It is
typically large enough so that eye movement equipment can reliably discriminate fixation on one AOI versus
another. Importantly, a single AOI may serve more than one task, just as the view out the windshield of a car
simultaneously serves the tasks of lane keeping and hazard monitoring (Wickens & Horrey, 2009).
Correspondingly, a given task can be served by more than one AOI, just as speed monitoring is served by the
view out the windshield (the optic flow field of the road, discussed in Chapter 4) and the speedometer. The maximum scanning rate is roughly three fixations or dwells per second, and so the dwell time on an AOI in this case would be about one-third of a second. Dwell times may occasionally be shorter, but they are often much longer than this when the AOI contains a lot of information or contains information that is difficult to perceive (e.g., a map in low illumination might produce dwell times of several seconds).
Research on visual attention in supervisory control has generally identified four factors—salience, effort,
expectancy and value—that determine where the eye is looking (which AOI is being attended) at any given
time in the visual workspace (Moray, 1986; Wickens & McCarley, 2008; Wickens, Goh, et al., 2003).
Salience refers to the extent to which the AOI stands out from the background (or from other AOIs) by
virtue of its size, color, intensity, or contrast. Salient AOIs attract attention, like the flashing of a light
surrounding an altimeter in a plane (Wickens, 2012; Wickens et al., 2003), or the distracting advertisement on
a web page (Simola et al., 2011).
Effort defines the cost of moving attention from one AOI to another. Eye movements are “cheap” but
not “free.” That is, while we are not continuously aware of the effort of scanning our eyes around our
workplace, extensive scanning is fatiguing. It is for this reason (in part) that head-up displays have been
implemented in vehicles—to reduce the amount of scanning between an instrument panel and the world
beyond, as we discuss later in this chapter. Furthermore, we know that dual task loading competes for effort
and thus reduces the overall breadth or distribution of scanning (Recarte & Nunes, 2000). Importantly, the
effort cost of scanning is not linear with the spatial separation between AOIs, but instead shows the roughly
three-segment pattern shown in Figure 3.1. On the left, for two AOIs within foveal vision, attention can be
moved without eye movements. In the center, for separations less than around 20 degrees, scanning requires
eye movements only (the “eye field” of visual scanning; Sanders & Houtmans, 1985). Pure eye movements
involve little effort: any major effort cost is incurred in initiating the scan, with little further cost for longer
scans. For the right segment of Figure 3.1, a head movement (neck rotation) is required to bring the new AOI
into focus. This is the so called “head field.” Longer movements within the head field impose progressively
greater effort (unlike the eye field). Finally, there comes a point (around 90 degrees) when even a head
rotation cannot bring the next AOI into attention. (Consider checking the blind spot in a car before lane
switching.) Here partial or full body rotation is required. Given our effort-conserving tendencies (see Chapter
10), people will be progressively more reluctant to make scans of distances when neck rotation and
particularly body movement is involved (the right side of the function in Figure 3.1).

FIGURE 3.1 Information access effort (IAE) as a function of the visual angle separation between two AOIs. The separation between the pairs
of dots represents differences in visual angle.

Expectancy. We tend to look more at places where there is a lot of “action.” Consider how we must
concentrate on the lane edges when driving a curvy road on a gusty day, or on the locations of other vehicles
in heavy traffic. In such situations, we look at these places because we expect changes to occur frequently,
changes that will affect our own driving actions. Generally, the more things change, the more we expect them
to change. Actual change is a physical property of the environment that can be measured and is often
expressed in bandwidth (changes/unit time). A well-developed mental model of environmental dynamics will
represent that bandwidth in the form of an expectancy that drives visual sampling (Senders, 1964, 1980). In
Chapter 12, we discuss the operator’s occasional failure to monitor highly reliable automation, a phenomenon
known as “complacency.” This tendency is mediated by the very low frequency of reliable automation
failures, and hence the low expectancy of these automation events (Moray & Inagaki, 2000). However, while
expectancy is typically driven by bandwidth, it can also be driven by specific contextual cues, as when a
collision warning directs attention to what may normally be a low bandwidth region (e.g., while driving down
a straight road on a windless day, a collision warning sounds while you are tuning the radio, and you quickly
look up).
Value. Value may be described as the usefulness (importance) of the information (that is, relevance of an
AOI to a task, weighted by the relative importance of the task). It is important to detect pending collisions
visible in the road ahead, so the roadway (windshield) AOI is valuable, even on a straight freeway with little
traffic (low expectancy). In contrast, value can be low but expectancy high: consider driving a highway with
many advertising signs. The value of scanning to the roadside is low, but the expectancy (bandwidth) of the
signs flashing by is high. Value can either be defined positively (the value or relevance of the visual
information for the task) or negatively (e.g., the cost of missing a turn sign).
Salience and effort may be grouped as “bottom up” influences on selective attention that can be
objectively characterized by physical environmental measures (e.g., intensity of light from an AOI and visual
angle between AOIs, respectively). Expectancy and value in contrast are said to be “top down” influences,
embodied in the supervisor’s mental model of environmental changes and task priorities. Furthermore, if these
latter two are combined, they can be said to define the “expected value” of an AOI in a way that maps closely
to optimal models of decision making, discussed in Chapter 8. We will describe this mapping below.
Together, the four factors have been combined in an additive scanning model called SEEV (Salience,
Effort, Expectancy, Value; Wickens, 2012; Wickens, Hooey, et al., 2009; Steelman-Allen, McCarley, &
Wickens, 2011). The foundation of the model is based jointly on optimal visual scanning models from
engineering (Senders, 1964, 1980; Carbonell, Ward, & Senders, 1968; Sheridan, 1970; Moray, 1986) and
psychological models of salience (Itti & Koch, 2000) and visual attention (Bundesen, 1990). The model has
been found to do a good job of predicting visual scanning patterns in environments such as driving (Horrey,
Wickens, & Consalus, 2006), flying (Wickens, Goh, et al., 2003), and the hospital operating room (Koh, Park,
et al., 2011).
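The additive logic of the model can be sketched in Python as follows; note that the coefficient weights, the 0-to-1 ratings, and the driving AOIs below are invented for illustration and are not parameter values from the studies cited. Each AOI receives a score of salience plus expectancy plus value minus effort, and the scores are normalized to yield a predicted share of dwell time.

def seev_scores(aois, w_s=1.0, w_ef=1.0, w_ex=1.0, w_v=1.0):
    # Additive SEEV-style score for each AOI: salience + expectancy + value - effort,
    # normalized into a predicted proportion of dwell time (assumes all scores are positive)
    raw = {name: w_s * a["salience"] - w_ef * a["effort"]
                 + w_ex * a["expectancy"] + w_v * a["value"]
           for name, a in aois.items()}
    total = sum(raw.values())
    return {name: score / total for name, score in raw.items()}

# Hypothetical driving example with ratings on arbitrary 0-1 scales
aois = {
    "roadway":     {"salience": 0.5, "effort": 0.1, "expectancy": 0.9, "value": 1.0},
    "speedometer": {"salience": 0.3, "effort": 0.3, "expectancy": 0.4, "value": 0.6},
    "radio":       {"salience": 0.6, "effort": 0.5, "expectancy": 0.2, "value": 0.1},
}
print(seev_scores(aois))  # largest predicted share of dwells on the roadway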
The SEEV model predicts not only the distribution of attention, but also periods of neglect, by
predicting specific scan paths. That is, SEEV predicts the distribution of times during which a high-value AOI
is not attended. This might occur because the AOI has low bandwidth, or is relegated to the periphery of the
workspace (and so there is high effort to attend there). This neglect of visual attention is an important
predictor of change blindness, which we discuss below, and has major safety implications for in-vehicle
technology in predicting the duration of head down glances (Horrey & Wickens, 2007; Wickens & Horrey,
2009; see also Chapter 10).
In combination, the SEEV parameters also provide guidance for optimal display layout (Wickens, Vincow, et al., 1997). More valuable AOIs should be made salient, and the distance between AOIs (which
determines effort) should be inversely related to the frequency of use (bandwidth). A third related influence
on optimal layout is the frequency of sequential use (integration property of a pair of AOIs). That is,
displays that must be used in sequence should be placed close together. This idea is incorporated into a useful
principle of display design called the proximity compatibility principle (to be discussed in Section 3.5
below).
While operators scan in order to supervise and control, they should also be prepared to notice and
respond to unexpected events: the jaywalking pedestrian, the failure of equipment, or the subtle deterioration
of the weather. People are fairly poor at this task of noticing the unexpected, and we address this deficiency
below in two sections. The first focuses on the failure to detect change, and the second addresses it from a
more optimistic perspective, modeling the successes rather than the failures of attentional capture.

2.2 Noticing and Attentional Capture

2.2.1 FAILURES: CHANGE BLINDNESS In general, the human perceptual system is sensitive to change in the
environment. The natural visual transients associated with the change (e.g., onset, flickering, or motion) are
easy to detect. These same properties are commonly exploited in the design of visual alerts. However, this is
not always the case, and the term change blindness is used to describe those situations when changes in the
environment are not noticed. In the laboratory, change blindness is typically demonstrated when the change is
accompanied by some form of disruption, such as a blink (O’Regan, Deubel, et al., 2000), a blank screen
(Rensink, 2002), a physical object occluding the scene (Simons & Levin, 1998), or a saccade away from the
change location (Stelzer & Wickens, 2006). These serve to mask the natural visual transients that normally
make the change salient.
Outside the laboratory, change blindness has been observed with drivers failing to notice changes in
street signs (Martens, 2011) or pilots failing to notice changes of the flight mode indicator light (Sarter,
Mumaw, & Wickens, 2007). In 2006, a mid-air collision between two aircraft over Brazil was partly attributed
to the fact that one pilot did not notice a display change signaling that the system broadcasting his airplane’s
position had been switched off (Wickens, 2009). Change blindness can even occur during face-to-face
conversations. In a classic study by Simons and Levin (1998), an interviewer initiated a conversation with a
pedestrian on a college campus. The interviewer was surreptitiously replaced by another interviewer when a
pair of workmen carrying a wooden door moved between the interviewer and the unsuspecting participant.
About half of the participants did not notice that they had continued their conversation with a complete
stranger!
Research has shown that there are a number of factors that decrease the likelihood that changes are
detected. We summarize these findings below (see also Rensink, 2002):
1. Change blindness is more likely under high task load and when viewers are engaged in attention-demanding concurrent tasks (e.g., conducting a phone conversation while driving: McCarley, Vais, et al., 2004; Lee, Lee, & Boyle, 2007; or flying with an engaging 3D display: Wickens, Hooey, et al., 2009), and in tasks that demand the central executive of working memory (Fougnie & Marois, 2007).
2. Change blindness is less likely to occur when the changing stimulus is more salient. For example,
changes that involve an increase in luminance contrast (e.g., a warning light turns on) are more
noticeable than those that do not (e.g., changing a word from "on" to "off" or changing a digital value
from 100 to 000; Yantis, 1993).
3. Detectability of the change is a function of the peripheral eccentricity of the event from the current
fixation point; in other words, the greater the visual angle between the location of the change and the
fovea, the less likely the change will be detected (Steelman, McCarley, & Wickens, 2011; Wickens,
Hooey et al., 2009; Nikolic, Orr, & Sarter, 2004).
4. Change blindness is much more likely when the changing element is completely outside the field of
view (a “completed change”) than when it is within the field of view (a “dynamic change”; Rensink,
2002). A completed change characterized the occluding door used by Simons and Levin. In other
words, a change based on memory is harder to detect than one based on perception.
5. Change blindness is less likely if the event is probable, and hence expected. Wickens, Hooey, et al.
(2009) found that the proportion of missed changes in realistic flight simulations was quite high,
around 40 percent, when these changes were so-called “black swan” events (Taleb, 2007), totally out
of the realm of expectation.
6. Finally, independent of whether fixation is on the location of the change at the time of the change, detection is greater to the extent that more attention is focused there before and after the event (Beck, Peterson, & Angelone, 2007; Martens, 2011).
From a practical point of view, our inability to notice fully-visible unexpected events while performing an
attention-demanding task, such as driving a car or flying an aircraft, has clear implications for safety. This is
especially true given that we often remain oblivious to our own poor performance, and overestimate the
degree to which we can detect changes in our environment. Levin, Momen, et al. (2000) referred to this
manifestation of overconfidence (a topic discussed further in Chapter 8) as change blindness blindness.
To combat change blindness, the engineering psychologist should certainly advocate increasing the
salience of more important changes that signal dangerous circumstances (i.e., warnings). Martens (2011)
suggests that in order to increase the likelihood of drivers noticing changes to road signs, it is important to
make the difference between the new and old signs as explicit as possible, and also make these cues clearly
distinct from the former situation. The expectancy of seeing an event at a particular location can be increased
through training (Richards, Hannon, & Derakshan, 2010).

2.2.2 A MODEL OF NOTICING: THE N-SEEV MODEL Given the importance of change blindness to designing safety-
critical displays and for operators in heavily visual environments, a computational model was developed that
identifies and quantifies the variables that enhance or degrade noticing (i.e., modulate the magnitude of
change blindness for unexpected events). This is the N-SEEV model (Noticing–SEEV; Wickens, Hooey, et
al., 2009; Steelman-Allen, McCarley, & Wickens, 2011; Wickens, 2012). Because it applies to those visual
environments in which scanning occurs, the SEEV model of supervisory control discussed in Section 2.1
describes the visual context in which the change occurs. We next identify factors that influence the N
(noticeability) of the to-be-noticed-event (TBNE) within the context of the ongoing scan path predicted by
SEEV.
SEEV calculates how much time the eye will spend in each AOI (e.g., in driving, 50 percent forward on
the road, 30 percent to the road signs and 20 percent head down). This in turn determines the eccentricity of
each of these locations from the location of the TBNE. For example, a TBNE warning light onset that is 30
degrees head down has an eccentricity from the roadway of about 30 degrees. Given the well-known functions
that define the loss of detectability with peripheral eccentricity (e.g., McKee & Nakayama, 1983; Mayeur,
Bremond, & Bastien, 2008), we can predict that detectability of the TBNE will be a function of these three
eccentricities (roadway, roadsigns, head down) weighted by the proportion of time that the AOIs associated
with each eccentricity are attended.
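The prediction described in this paragraph can be sketched as a weighted sum: detectability given fixation in each AOI falls off with the event's eccentricity from that AOI, and these detectabilities are weighted by the proportion of time spent in each AOI. The exponential fall-off, its scale constant, and two of the three eccentricities below are illustrative assumptions rather than values taken from the model papers.

import math

def p_notice(dwell_shares, eccentricities_deg, scale_deg=20.0):
    # Illustrative noticing estimate: detectability declines with eccentricity (assumed
    # exponential fall-off), weighted by the share of time the eye spends in each AOI
    return sum(share * math.exp(-ecc / scale_deg)
               for share, ecc in zip(dwell_shares, eccentricities_deg))

# Driving example from the text: 50% roadway, 30% road signs, 20% head down;
# a warning light 30 degrees below the forward view (other eccentricities assumed)
print(p_notice([0.5, 0.3, 0.2], [30.0, 35.0, 5.0]))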
Beyond eccentricity itself, there are three additional factors that influence this eccentricity function on
noticeability (Steelman-Allen, McCarley, & Wickens, 2011; Wickens, 2012). These are:
• The expectancy of events. As we saw above, the detectability of very unusual, low frequency, or
“black swan” events is extremely low, even when these may be quite important and not too far in the
periphery (Wickens, Hooey, et al., 2009).
• The salience of events can be objectively characterized by functions derived from computational
vision (Itti & Koch, 2000; Steelman et al., 2011), related to contrast, color, and dynamic properties
(e.g., flashing, moving; Simola, Kuisma, et al., 2011).
• Combining the two influences above, salience can be "tuned" to certain perceptual properties based on expectancy (Folk, Remington, & Johnston, 1992; Most & Astur, 2007). For example, in the context
of existing engine problems, the pilot may be more tuned to noticing a red onset than under normal
conditions (here also, SEEV would predict a higher expectancy for sampling the AOI containing
engine information). Car drivers are more likely to notice motorcyclists if they themselves are also
motorcyclists (Roge, Douissenbekov & Vienne, 2012).
While these factors are similar to those that drive steady state scanning (supervisory control) in SEEV, they
are distinct in driving a single scan in noticing. Thus factors in the N are clearly linked to temporal events
rather than spatial channels. From calculations based on these factors, N-SEEV can predict the delay between when the TBNE occurs and the first fixation on its location, a fixation which often corresponds to conscious noticing. Such predictions have proven accurate within the real-world environment of the aircraft cockpit (Wickens, Hooey, et al., 2009).

2.2.3 INATTENTIONAL BLINDNESS Although it seems unlikely that we should fail to notice our car keys when we
are looking right at them (e.g., eccentricity is zero), there is a growing body of research evidence to suggest that we do. In other words, we can look, but fail to see, even if the unexpected event involves a large, unusual,
dynamic object, which is fully visible for several seconds. This failure of attention, known as inattentional blindness (Mack & Rock, 1998) and considered a subset of change blindness, has been the subject of a growing body of research using some extremely innovative approaches. For example, Simons and Chabris (1999)
asked participants to watch a video of actors playing basketball and count the number of passes between them
(a primary task similar to supervisory control). During the video, another actor dressed in a gorilla suit
proceeded to walk across the frame, stopped for a moment, beat his chest, and then exited the frame.
Surprisingly, the gorilla in the midst of the basketball players went unnoticed by participants more than half
the time!
Inattentional blindness is thus the failure to notice something when the observer directly looks at it. Even
when looking within one degree of the location of the change, over 40 percent of participants do not notice
display changes if they are engaged in other tasks (O’Regan et al., 2000).
Simons and Chabris (1999) argued that the level of inattentional blindness is related to both the difficulty
of the primary task and the degree of visual similarity between the unexpected event and the primary task. We
treat each of these factors in turn.
In the gorilla-and-basketball-passing simulation, expert basketball players were more likely to notice
the gorilla (Memmert, 2006). Their heightened expertise in tracking passes likely made the primary task
easier. Similarly, Seegmiller, Watson, and Strayer (2011) found that individuals with greater working memory
capacity were more likely to report seeing the gorilla (67 percent) relative to those with reduced capacity (36
percent). As we will discuss in Chapter 7, one of the main functions of working memory is attentional control
(Kane & Engle, 2002): our ability to maintain task goals in an active state in the presence of interfering
information. Thus, individuals with greater working memory capacity are better able to maintain the primary
goal of the study (counting passes) and have enough residual attentional control to spontaneously monitor the
environment for any unexpected event (the gorilla; Seegmiller et al., 2011).
The primary task thus becomes easier for individuals with greater domain expertise or attentional
capacity (Fougnie & Marois, 2007). In contrast, when attentional capacity is reduced, the risk of inattentional
blindness is greater. For example, intoxicated individuals are more likely to demonstrate inattentional
blindness than their sober counterparts (Clifasefi, Takarangi, & Bergman, 2006). A similar effect has been
noted for individuals talking on a cell phone while walking, compared to those just walking (Hyman et al.,
2010). These effects can be attributed to a reduction in attentional capacity sufficient to maintain performance
on the primary task, but little else (see Chapter 10), making it less likely that the unexpected event is noticed.
Regarding the second factor (degree of visual similarity), participants were more likely to notice the
gorilla when the basketball players wore black shirts, the same color as the gorilla (Simons & Chabris, 1999).
It appears that adopting a task strategy that focuses on cues shared between the primary task and the unexpected event—a black humanoid shape—frees up attentional capacity that can be directed elsewhere and, in doing so, increases the likelihood of noticing the gorilla. Inattentional blindness is thus subject to top-down
or strategic processes. Indeed, Rattan and Eberhardt (2010) showed that inattentional blindness is reduced
when the unexpected event is associated with a socially meaningful concept (e.g., racism).
From our review of these two related phenomena, change blindness and inattentional blindness, it is clear
that without attention even visual information of great consequence may not reach conscious perception.

2.3 Visual Search


Visual search involves finding something—a target—with our eyes, as we move selective attention across a
search field. In the search task the target is typically defined in advance, as distinguished from the noticing
task. Search not only pervades everyday behavior (finding a set of car keys), but is also a critical component
of many specialized tasks. Accordingly, human factors researchers have studied search intensively and across
a variety of domains, including driving (e.g., Ho, Scialfa, Caird, & Graw, 2001; Mourant & Rockwell, 1972);
map reading (e.g., Yeh & Wickens, 2001; Beck, Lohrenz, & Trafton, 2010); medical image interpretation
(e.g., Kundel & LaFollette, 1972); menu search (Fisher, Coury, et al., 1989); baggage x-ray screening (e.g.,
McCarley, Vais, et al., 2004; McCarley, 2009); human-computer interaction (e.g., Fleetwood & Byrne, 2006;
Fisher & Tan, 1989; Ling & Van Schaik, 2004); industrial inspection (e.g., Drury, 1990, 2006); photo
interpretation (e.g., Leachtenauer, 1978); airborne rescue (Stager & Angus, 1978); and sports (e.g., Williams
& Davids, 1999). A deadly train accident in England (the Ladbroke Grove rail incident) was caused in part by
the considerable time it took supervisors to search through a traffic display and identify which train was
causing a collision alarm to sound (Stanton & Baber, 2008).

Many cognitive psychologists have used the search task to examine fundamental properties of visual
information processing and perceptual representation (e.g., Treisman & Gelade, 1980; Wolfe, 2007). Thus,
researchers have not only garnered an extensive applied knowledge of visual search, but have grounded that
knowledge on a strong theoretical foundation.
Visual search is closely related to the sequence of eye movements used to conduct the search. We can
also speak of search in other modalities, as when we search through an auditory phone menu to find the option
we want (Commarford et al., 2008). However, visual search is usually carried out by moving the eyes more or
less systematically across the search field. The separation between the centers of fixation of consecutive eye
movements is used to define the diameter of the useful field of view (UFOV). The UFOV is defined as the
visual angle within which a target can be detected if it is present, or a non-target identified if it is not. A
careful and systematic visual search will “blanket” the search field with UFOVs. Visual search can be a
precursor to signal detection, as when an industrial inspector or radiologist searches for faint targets and then,
upon locating a candidate, decides whether it is a target (signal) or noise (Drury, 1975, 1990, 2006).
We present a simple model that captures many important aspects of visual search, called the serial self-
terminating search (SSTS) model (Sternberg, 1966) and based on the data of Neisser (1963). Then we will
show how this model has been refined and qualified over the past 50 years and use it as a baseline to identify
the characteristics that make search easier or more difficult.

2.3.1 THE SERIAL SELF-TERMINATING SEARCH (SSTS) MODEL In a visual search task, the person searches for the
target (here the letter “K”) among distractors or non-targets within the search field. The target location is
unknown to the person and varies with each search, and we assume nothing about the order in which the
person searches through the search field. As shown at the bottom of Figure 3.2, the search fields can vary in
size. In the figure the set size N is 4, 8, or 12. When the target is found, the search is “self terminated” (the
remaining items not inspected) and the person responds “yes.” If it is not found after the full array is searched,
the response is “no.” In each case, the total search time is recorded. The graph at the top of Figure 3.2 depicts
the typical result predicted by the SSTS model (Sternberg, 1966). When the target is present (solid line),
search time (ST) is a linear function of N. The function describing this line is:
ST = ap + bN/2,

Here b is the time to inspect each non-target item and decide it is not the target. The search time, bN, is
divided by 2 because, on average over repeated trials, the target will be found halfway through the search
field. The intercept constant ap represents the residual non-search components of the response when a target is
present. The dashed line in the figure depicts the predicted ST when the target, K, is absent (see the rightmost array of 12 items in Figure 3.2). For these trials, ST = aa + bN. There is no division by 2, since all of the items must be searched before concluding that the target is absent. The intercept aa may be longer than ap because the searcher may double-check if no target is located.

FIGURE 3.2 Search time as a function of set size, as predicted by the serial self-terminating search model (after Sternberg, 1966, and Neisser,
1963). Different search fields are shown at the bottom of the figure.

Finally, in the special case when we can assume that the items in a display are searched in predictable
serial order—as, for example, might occur with names in a list—the SSTS model also predicts serial order
effects, with search times being proportionately shorter for targets in earlier list positions than for those in later
positions (Neisser, 1963; Nunes, Wickens, & Yin, 2006).
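A minimal Python sketch of the SSTS predictions follows; the intercepts ap and aa and the per-item inspection time b are made-up values used only to illustrate the shape of the functions in Figure 3.2.

def ssts_time(n, target_present, a_p=0.4, a_a=0.5, b=0.05):
    # Serial self-terminating search time (seconds):
    #   target present: ST = a_p + b * N / 2  (found, on average, halfway through the field)
    #   target absent:  ST = a_a + b * N      (every item must be inspected)
    return a_p + b * n / 2 if target_present else a_a + b * n

for n in (4, 8, 12):
    print(n, round(ssts_time(n, True), 2), round(ssts_time(n, False), 2))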

The equations of the SSTS model have been shown to do a reasonably good job describing search times
in realistic environments (e.g., air traffic control: Nunes, Wickens, & Yin, 2006; Remington, Johnston, et al.,
2000; cluttered map search: Yeh & Wickens, 2001; Beck, Lohrenz, & Trafton, 2010). However, there are
exceptions and elaborations that can account for variations in search speed and accuracy. We detail some of
these below.

2.3.2 QUALIFICATIONS OF SSTS: BOTTOM UP FACTORS


a. Search is not always self-terminating. Sometimes there are several targets present and all should be
found (e.g., inspecting an X-ray image for nodules; Barclay, Vicari, et al., 2006; Swets, 1998). In
these cases, exhaustive search occurs (i.e., all items in the search field are examined). The serial
exhaustive search function will resemble the target absent function in Figure 3.2 (dashed line).
However, the intercept of the function aa increases in proportion to the number of targets located,
since each positive identification will be associated with some overt or covert response.
b. Search is not always serial. This is probably the most important qualification and departure from the
SSTS model. Parallel search typically occurs when the target is defined on a single salient level along
one dimension (Treisman, 1986; Treisman & Gelade, 1980). For example, the search task in Figure 3.2
would be little affected by the number of items if the target letter was uniquely colored red. Thus
parallel search produces a “flat slope” in the context of Figure 3.2, with coefficient b approaching 0.
This is sometimes referred to as target popout since the uniquely colored target appears to “pop out”
of the search field. This highly efficient search shows the clear benefits of color highlighting. Eye
movements correlate with the performance data, showing greater search efficiency for parallel than
serial search (Williams et al., 1997). Some visual search models (e.g., Treisman & Gelade, 1980;
Wolfe, 1994, 2007) propose that parallel search of this type is preattentive (requiring few attentional
resources) and can be done across the entire visual field, whereas serial search requires attentional
resources, and can only be done over a limited portion of the visual field (i.e., the UFOV).
 As an important aside, it should be noted that auditory phone menus must always be searched in a
serial fashion; diminishing their efficiency further is the fact that the processing time per item, b, is
always going to be long—the time required for the machine to speak each option.
c. In serial search, the time per item b increases when targets are defined by a conjunction of features
(e.g., color and shape: a red X in a sea of multi-colored letters; Treisman, 1986). This situation is
called conjunction search.
d. Serial search is more likely when the target is difficult to discriminate from the distractors (the non-
target items in the search field; Geisler & Chou, 1995). Nagy and Sanchez (1992) found that search
times increased with number of distractors when the luminance or color difference between target and
distractor was small (serial), but search times did not increase when the difference was large (parallel).
Larger UFOVs occur when the target is more discriminable, producing more efficient search.
e. Search is easier if distractors are homogeneous (e.g., all identical) than if they are heterogeneous
(Duncan & Humphreys, 1989). For example, it is easier to search for K in the array LLLLKLLL than
in the array BJRKITRG.
f. Search is easier if the target is defined by having a feature present rather than absent. For example,
Treisman and Souther (1985) showed that parallel search occurred when subjects searched for a Q
among Os [OOOOOOQOO], but serial search occurred when searching for an O among Qs
[QQQQQOQQQ]. In the first case, the bar on the Q is a feature present in the target. In the second
case it is a feature absent. This effect is similar to the “target-present” advantage noted in the vigilance
situation in Chapter 2 (e.g., Schoenfeld & Scerbo, 1997).
g. It matters little if the elements are closely spaced, requiring little scanning, or are widely dispersed
(Drury & Clement, 1978; Teichner & Mocharnuk, 1979). The increased scanning that is required with
wide dispersal lengthens the search time slightly. However, the high density of non-target elements
(e.g., clutter) also lengthens search times slightly when items are crowded together. Thus scanning
distance and visual clutter trade off with one another as target dispersion is varied.
h. Searching for several different target types is generally slower than searching for only one (Craig,
1981). An example in Figure 3.2 would be “Search for a K or an F.” However, an exception occurs
when the set of targets can be discriminated from the distractors by a single common feature. For
example, if the instructions were to “search for an L or a T”, in the array OUSLXUSO people can
learn that the target letters are the only letters in the field containing vertical lines, leading to efficient
search (Neisser, Novick, & Lazar, 1964). Thus, in industrial inspection, we can predict an advantage for operators trained to focus on the features common to all faults.
i. Extensive training in target search can bring performance to a level of automaticity, when search time
is unaffected by the number of targets and is therefore presumably done in parallel (Fisk, Oransky, &
Skedsvold, 1988; Schneider & Shiffrin, 1977). Generally speaking, automaticity results when, over a
set of trials, targets are consistently treated as targets and never appear as non-target stimuli
(consistent mapping; Schneider and Shiffrin, 1977). This is contrasted with varied mapping, when
targets sometimes appear as non-targets. We will discuss the concept of automaticity further in
Chapter 6 in the context of reading; in Chapter 7 in the context of training; and again in Chapter 10, in
the context of time-sharing.

2.3.3 GUIDED SEARCH AND TOP DOWN FACTORS So far, we have emphasized characteristics of the search domain
that influence search in “bottom up” fashion. However the concept of guided search, embodied in models
developed over the past two decades by Wolfe (1994, 2007; Wolfe & Horowitz, 2004), shows how top-down
factors influence search efficiency by guiding visual attention to likely target candidates. For example,
suppose one is scanning a cluttered map to find a target that is large and red, and only a few of the elements
are large, while many are red. Given that this conjunction search will be serial (see above), it makes sense to
first narrow the search to all large items (a parallel search), and then search this greatly reduced subset for
those that are red, instead of the other way around. Thus, search can be “tuned” for particular features in
top-down fashion to increase efficiency (Most & Astur, 2007).
Perhaps the most salient feature to which search can be tuned is the target’s spatial location. In both
structured and unstructured search fields, people can learn where the target is likely to be found, and then
search those regions first. For example, skilled radiologists search for tumors or fractures by first inspecting
those locations likely to be abnormal; novices do not (Kundel & LaFollette, 1972). Expert drivers are better at
searching where hazards may appear than are novices (Pradhan et al., 2006). When creating structured search
fields like lists or computer menus (Lee & MacGregor, 1985), designers can place the most frequently sought
menu items at the top of the list. The SSTS model shown in Figure 3.2 predicts a reduction in overall search
time if this is done (since earlier list positions produce shorter search times).
When applying the SSTS model to searching computer menus, we must also account for the time
required to switch between menu pages or screens along with the time to search items within a screen. Lee and
MacGregor (1985) have developed computational models that predict the time needed to locate a target item
as a function of reading speed and computer response speed with embedded, multi-level menus. Their model
predicts that the optimal number of words per menu is about 7 ± 2, similar to the limits of absolute judgment
(Chapter 2) and working memory (Chapter 7). Their model and data are consistent with others to be described
in Chapter 9 and also highlight the cost of many embedded levels of short menus (i.e., narrow and deep menu
structures are generally ill advised).

2.3.4 THE USEFUL FIELD OF VIEW The size of the UFOV affects search performance because it determines how
carefully the observer must scrutinize the search field. A large UFOV enables the observer to process a greater
portion of the image with each gaze, ensuring that fewer eye movements will be required to blanket the field
(Kraiss & Knäeuper, 1982). Accordingly, UFOV size is correlated with search efficiency among photo-
interpreters (Leachtenauer, 1978), industrial inspectors (Gramopadhye et al., 2002), and older adult drivers
(Owsley et al., 1998).
The relationship between UFOV size and search performance has led to the suggestion that it is possible
to improve search efficiency through training to expand the UFOV. Gramopadhye et al. (2002) found that a
training protocol designed to increase UFOV size produced positive transfer on a mock industrial inspection
task. Training to expand the UFOV may also improve driving performance in older adults (Roenker et al.,
2003).

2.3.5 SEARCH ACCURACY We have focused above on mechanisms that affect search time. Of equal importance
are processes that determine search accuracy. Not surprisingly there is a large trade off between speed and
accuracy in visual search (Drury, 1996): accurate searches tend to be slow, and rapid searches tend to produce
errors, usually misses (failing to find a target that is present; see Chapter 2). However, as we will discuss in
Chapter 9 with regard to response time, many factors that slow search (e.g., high target-distractor similarity)
also tend to create errors.
The miss errors in search are typically of two classes. First, although it is more likely that a target will be found if it is fixated than if not, it is also common for targets like an X-ray abnormality or a well-camouflaged
weapon in luggage to be overlooked, even when it falls within a scanning UFOV (Kundel & Nodine 1978;
McCarley et al., 2004; McCarley, 2009). This is the inattentional blindness phenomenon described earlier.
Miss rates for fixated targets can be as high as 30 to 70 percent (Wickens & McCarley, 2008).
Second, many searches do not fully blanket the search region with UFOVs to ensure that all areas are
fixated before the search is terminated, increasing the miss rate still further. This describes the “stopping
policy” of the searcher. The importance of an appropriate stopping policy is illustrated by a study of
colonoscopy screening. Barclay, Vicari, et al. (2006) found a strong correlation (r = .90) between polyp
detection rates and search times for those trials on which no lesion was detected. That is, physicians who
employed a more conservative stopping policy, taking longer on average to reach a no-polyp judgment,
showed higher rates of successful polyp detection than those who terminated search earlier (Barclay et al.,
2006).
What factors lead to a premature stopping policy? On the one hand, people may stop after the most likely
regions have been searched, and fail to find a target in an unlikely region (Theeuwes, 1996). On the other
hand, the expectancy that the target is present at all exerts a powerful influence on how long a person will
continue a search that has not yet turned up a target (Wolfe, Horowitz, & Kenner, 2005; Wolfe, Horowitz, et
al., 2007). If expectancy is low, there is a greater likelihood of an early stop, before the space is fully
blanketed with UFOVs. Wolfe et al. (2005) found that as the target frequency decreased from 50 percent to 1
percent, miss rates increased from 7 percent to 30 percent, in a manner reminiscent of expectancy-driven
setting of beta in signal detection theory (SDT). These results imply that the introduction of occasional mock
targets (Wilkinson, 1964) will improve target detection rates in tasks like airport baggage screening, where
true threats are rarely encountered. Aviation security agencies have in fact begun to use such methods. As
another illustration of how SDT applies to search, the miss-rate costs imposed by low expectancy can be offset if the targets are known to be of high value (Chun & Wolfe, 1996; Drury & Chi, 1995). This is similar
to the SDT payoff concept described in Chapter 2.
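The link between expectancy, payoffs, and the response criterion can be made concrete using the standard expression for the optimal criterion from signal detection theory (Chapter 2): optimal beta is the prior odds of noise multiplied by the payoff ratio. The sketch below uses illustrative prevalence and payoff values only; the numbers are not data from the studies cited above.

```python
def optimal_beta(p_signal, value_hit=1.0, cost_miss=1.0,
                 value_correct_rejection=1.0, cost_false_alarm=1.0):
    """Optimal likelihood-ratio criterion: prior odds of noise times the payoff ratio."""
    prior_odds = (1 - p_signal) / p_signal
    payoff_ratio = (value_correct_rejection + cost_false_alarm) / (value_hit + cost_miss)
    return prior_odds * payoff_ratio

if __name__ == "__main__":
    print(optimal_beta(p_signal=0.50))   # 1.0: neutral criterion when targets are common
    print(optimal_beta(p_signal=0.01))   # 99.0: very conservative criterion, so more misses
    # Raising the value of a hit (and the cost of a miss) for rare, high-value targets
    # pulls the optimal criterion back down toward neutral.
    print(optimal_beta(p_signal=0.01, value_hit=50, cost_miss=50))   # ~2.0
```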

2.4 Clutter
Search is closely related to clutter, which impedes both selective and focused attention, and can be measured
either subjectively (Kaber et al., 2011) or objectively by metrics that may quantify any or all of the four
factors below. We label these according to sources of clutter.
• Numerosity clutter (N in Figure 3.2) hinders selective attention, as predicted by the SSTS model.
• Proximity or readout clutter hinders the focus of attention. Once a tentative target or nontarget is
located, nearby distractors within around 1 degree of visual angle slow the further readout or
inspection (Broadbent, 1982). This is particularly true if there is zero separation producing partial
masking. Minimal separation is more likely with miniaturized hand-held displays, or with display
overlay, as found with head-up displays or database overlays (Kroft & Wickens, 2003; Beck et al.,
2010). Numerosity and readout clutter have been referred to as global density and local density
clutter, respectively (Tullis, 1988; see also Beck et al., 2010; Wickens, Vincow, et al., 1997).
• Disorganizational clutter describes the random location of distractors in search fields that are not
“structured.” Examples of structured and unstructured search fields are shown in the left and right
panels respectively, of Figure 3.3.
• Heterogeneous clutter refers to the heterogeneity of the non-target background features (like color,
shape or size) which we saw above was an impediment to visual search.
All these clutter factors are manifest in the use of maps, whose design we discuss in more detail in Chapter 5. Various researchers have either quantified individual factors (Yeh & Wickens, 2001) or developed metrics that combine the factors (Beck, Lohrenz, & Trafton, 2010; Rosenholtz, Li, & Nakano, 2007) in order to produce clutter models that predict visual search time.

FIGURE 3.3 (a) Structured search field, organized by Gestalt principles of display organization. (b) Unstructured search field.

2.5 Directing and Guiding Attention


When discussing change blindness, we saw how people can overlook important events in their environment;
we also saw in our discussion of NSEEV how salient events in the environment can capture attention. Linking
these two phenomena together, we can appreciate how designer-imposed events can guide attention to critical events in spatial environments. This might include an alert that directs an air traffic controller to the location of two conflicting aircraft on a busy display (Remington, Johnston, et al., 2001), a collision alert in a car that
directs the driver’s attention to the roadway (Victor, 2011), or an alert on a head-mounted display that directs
the soldier to the location of a possible enemy (Yeh et al., 2003). Indeed, the tragic Ladbroke Grove train
accident described above might have been averted had the cluttered signaller’s display been equipped with
attention guidance to show the location of the developing conflict.
Attentional guidance is typically performed by some form of automation, in which an intelligent agent
assumes that the human should be informed of the location of the critical event (see Chapter 12). But the
automation may be wrong: as we discussed in the previous chapter, alarms that inform the user that something
is wrong are often incorrect. So what are the benefits when the attention guidance is correct and the costs
when it is wrong?
The background for understanding the costs and benefits of visual attention guidance is provided by a
series of studies carried out on attention cueing (e.g., Posner & Snyder, 1978; Posner, 1986). In those
experiments, people were required to respond to a single imperative stimulus located at an unpredictable
place in the visual field. Prior to the imperative stimulus, people were cued as to where it was likely to occur.
Two features of the cue are important: its location and its reliability. We treat each of these in turn in the next
two sections.

2.5.1 CUE LOCATION A central cue is positioned at or near the center of fixation and is often represented as an
arrow pointing in the direction of the imperative stimulus. In controlled laboratory research the central cue
would typically be placed at the “fixation cross” in the center of the display. A peripheral cue is usually
placed at the imperative stimulus location and away from the fovea, and might take the form of a bar or flash.
Across many experiments, researchers have identified important differences between these two types of
cues (e.g., Posner, 1986; Egeth & Yantis, 1997; Muller & Rabbitt, 1989). Central cues (e.g., the pointing
arrow) are more cognitively driven. They take a little longer to process; hence, to be effective in guiding
attention and shortening the response to the imperative stimulus, they need to appear a little earlier. They
produce pronounced benefits when they are correct, but costs when they are wrong, and both benefits and costs are largely eliminated when central cues offer only chance accuracy.
In contrast, peripheral cues appear more perceptually driven and automatic in orienting the person toward
their location. They are faster acting and, importantly, even when their general validity is zero (chance
accuracy over multiple trials), they will still provide a benefit in responding to the imperative stimulus for
those trials when they do indicate the correct location. In general, responses to peripheral cues tend to be
more accurate (Cheal & Lyon, 1991).
The distinction between peripheral and central cueing is quite relevant when attentional guidance is
employed outside the laboratory (e.g., guiding a pilot’s attention to a potential conflict aircraft, or directing a
driver to look toward a potential pedestrian hazard). Here the central cue would be placed near the typical
focus of fixation (e.g., the view forward down the highway) or at the center of a head-mounted display (see
Chapter 5). Peripheral cues have some costs. First, peripheral cues cannot be seen if they are too far into the
visual periphery (e.g., beyond about 90 degrees of visual angle) no matter how intense (big, bright) they are.
Within that range, they should certainly be made salient, for example by using multiple onsets (flashing) rather than a single onset (Wickens & Rose, 2001). Second, peripheral cues superimposed on a potential real-world target (e.g., the
conflict aircraft) must not be made so intense that they mask a non-salient target (Yeh, Merlo, et al., 2003).
This masking is a particularly important concern if it is necessary to identify or interpret the target to confirm
its identity. For instance, imagine that a military target peripheral cue masks those features distinguishing
between a friendly vehicle and a foe to be targeted or evaded. Masking does not occur with central cueing
(arrows), as by definition the arrow will be separated from the target. However, central cues are less precise in
designating target location (Yeh, Wickens, & Seagull, 1999) and do require more conscious processing.

2.5.2 CUE RELIABILITY In some situations, cues can be 100 percent reliable (always indicating the correct
location of the event). In others they are imperfect or unreliable, having an accuracy less than 100 percent.
Imperfectly reliable cueing can have a validity as low as chance.
In unreliable cueing, it is necessary to distinguish between those instances (e.g., 90 percent) when
automation is correct and those in which automation is wrong (e.g., 10 percent) because the two obviously
have different implications for the human following the guidance. The general research in this area (Yeh et al.,
1999, 2003; Yeh & Wickens, 2001), following the more basic research on attentional cueing (e.g., Posner &
Snyder, 1978; Jonides, 1980; Egeth & Yantis, 1997; Rabbitt, 1989; Posner, 1986; Posner, Nissen, & Ogden,
1978), yields several conclusions:
• When the cue is 100 percent reliable, it provides greater benefits than when it is less than 100 percent,
even for those instances in the latter case when the cueing automation is correct.
• When the cue is wrong, obvious penalties are incurred, as the person looks first to the cue, and then,
finding nothing (or perhaps identifying an incorrect target) looks elsewhere without the aid of the cue.
• When there is imperfect cueing, both the benefits when it is correct and the costs when it is wrong are
increased as the reliability increases towards 100 percent. This increase describes a phenomenon of
automation over-trust or automation complacency (discussed further in Chapter 12).
• Associated with the increased reliability of spatial cueing is a phenomenon that we will call
attentional narrowing or tunneling. That is, the more correctly the cue indicates the location of an
important target, the less likely the observer is to examine other areas of space, even though these may
sometimes contain critical information, of which the automation is not cognizant. This was
demonstrated in target cueing studies conducted with soldiers by Yeh and her colleagues (1999,
2001b, 2003). When a cued target was present, along with a more dangerous target elsewhere in the
scene, soldiers were likely to miss the latter, even though they knew that such dangers could be
present. We revisit the issue of attentional tunneling in our discussion of multi-tasking in Chapter 10.
• Cue characteristics that benefit cueing when it is correct also amplify costs when cueing is wrong. For
example, peripheral cueing, known to be more accurate, amplifies the extent of attentional narrowing,
compared to central cueing (Yeh, Wickens, & Seagull, 1999). So does cueing within a virtual
environment, relative to cueing on a hand-held display (Yeh et al., 2003; Yeh & Wickens, 2001b).
• Attentional guidance through cueing is closely related to the issues of highlighting in the design of
lists and menus to be searched. Here too, the highlighting placed on a subset of items in the search
field, inferred by some agent to be more important to the user, is sometimes in error (at a proportion
inversely related to its validity), degrading search in these cases (Fisher & Tan, 1989; Fisher et al.,
1989).
The tradeoff of costs and benefits in attentional guidance is observed in situations well beyond those
specific to automated guidance. A particularly intriguing example is in the “weapons effect” in eyewitness
testimony (Hope & Wright, 2007). Here a well-known phenomenon is that when a crime is committed with an
obvious weapon (e.g., a gun), eyewitnesses are much less proficient at recognizing the suspect. The presence
of the salient weapon in the visual scene captures attention, guides it to the weapon like a peripheral cue, and
draws attention away from important information like the suspect’s facial features. We also will see that cue
reliability is vital to trust in imperfectly reliable automation, to be discussed in Chapter 12.

3. PARALLEL PROCESSING AND DIVIDED ATTENTION
3.1 Preattentive Processing and Perceptual Organization
Many psychologists have proposed that the visual processing of a multiple-element world has two main
phases: a preattentive phase that automatically organizes the visual world into objects and groups of objects
(Li et al., 2002) and selective attention to certain objects within the preattentive array for further elaboration
(Kahneman, 1973; Neisser, 1967). These two processes are associated with short-term sensory store and
perception, respectively, in the model of information processing presented in Figure 1.3. Thus, distinguishing
between figure and background is preattentive. So also is the grouping together of similar items on the display
shown in Figure 3.3a. Gestalt psychologists (e.g., Wertheimer) identified a number of basic principles that
cause stimuli to be preattentively grouped together on the display (e.g., proximity, similarity, common fate,
good continuation, closure; see Palmer, 1992). Displays constructed according to these principles have high
redundancy (Garner, 1974). That is, knowledge of where one item is on the display will allow an accurate
guess of the location of other display items in a way that is more difficult with the disorganized arrangement
shown in Figure 3.3b. (Think back to Chapter 2 when we discussed the usefulness of redundant information in
maximizing the security of an information channel.) Because all items of an organized display must be
processed together to reveal the organization, preattentive processing is sometimes called global or holistic
processing, in contrast to the local processing of a single object within the display.
The concepts of global and local processing are closely related to the emergent features concept we
discussed in Chapter 2 when treating multidimensional judgment. An emergent feature is a global property of
a set of stimuli (or displays) not evident as each is seen in isolation. Consider the two sets of aircraft engine
dials shown in Figure 3.4. Engine dials for two-engine aircraft are arranged in a layout similar to that of
Figure 3.4a; dials for the left and right engine are paired together for each of the eight engine parameters. In
checking that the dial reading is within normal limits, a common strategy is to detect deviations from a
particular position, rather than reading the precise value. In the case of Figure 3.4a, the normal operating
position varies with each pair of dials. However, by rotating all of the dials so that the normal values are in the
12 o’clock position (Figure 3.4b) the vertical alignment of the dials allows more rapid detection of the
divergent reading because of the emergent features–four columns of pointers oriented vertically.
Because global or holistic processing is preattentive and automatic, it can reduce attentional demand as
an operator processes a multi-element display. But this savings is only realized under two conditions. First,
the Gestalt principles (e.g., proximity, symmetry) or related information principles like redundancy must be
used to produce groupings or emergent features. Second, the organization formed by the spatial proximity of
different elements on the display panel must be compatible with the physical systems they represent, and the
user’s mental representation of them. Thus, for example, in Figure 3.4 the layout of the dial columns within
the panel does not match the physical layout of engines on the aircraft; instead the two left columns are the
primary instruments for the left and right engines, and the two right columns contain the secondary
instruments. We will discuss such spatial display compatibility further in Chapter 4. Banbury, Selcon, and
McCrerie (1997) found a four-fold increase in check-reading errors for this type of engine panel arrangement,
compared to a redesigned panel which grouped both the primary and the secondary left engine dials on the left
side, and the right engine dials on the right side. We will touch upon principles relating to the compatibility
between display and task requirements later on in this chapter.

FIGURE 3.4 Local (a) and global (b) perception in aircraft engine dials. Determining whether the pointer indicates a normal position in (a)
requires separate examination of each dial. In (b) the dials have been rotated so that the normal state is straight up. This arrangement means that
each column of pointers creates an emergent feature (a series of vertical lines). Deviation from the vertical is easy to detect with this
arrangement. The arrows indicate the mapping of display column to aircraft engine (discussed further in text).

3.2 Spatial Proximity


Two fundamental theories dominated early research in visual attention. Space-based attention theories (e.g.,
Eriksen & Eriksen, 1974; Posner, 1980) propose that the fundamental dimension of attention is the visual
angle of space, as in the flashlight metaphor. In contrast, object-based attention theories (Kahneman &
Treisman, 1984; Scholl, 2001) propose that we allocate attention to objects, not regions of space. As we
describe below, both perspectives are valid and not mutually exclusive.
As noted earlier, we can use the metaphor of the spotlight to characterize the spatial nature of attention.
Placing visual information close together in space (within the spotlight) will support parallel processing of
that information (and therefore help divided attention). This is a useful characteristic of human attention that a
display designer can exploit. For example, the head-up display (HUD) places critical instrument readings on
the glass windscreen of the cockpit, superimposed on the forward field of view (FFOV), as shown in Figure
3.5a (Wickens, Ververs, & Fadden, 2004; Fadden, Ververs, & Wickens, 1998, 2001). Similar displays have
been introduced into the automobile (Liu & Wen, 2004). HUD imagery is often specially designed so that the
user does not have to accommodate (shift visual focus from near to far) when switching from HUD imagery
to the FFOV. The HUD therefore places information sources into high spatial proximity with the FFOV. This
has the advantage of increasing the likelihood that events in the scene will be detected, relative to standard,
head-down instrumentation in the cockpit or on the dashboard. Placing the information together reduces the
need for visual scanning. Multiple studies have shown HUD advantages relative to head-down presentation of
the same information (e.g., Charissis et al., 2009; Fadden, Ververs, & Wickens, 1998, 2001; Liu & Wen,
2004; Wickens & Long, 1995). Thus, the HUD facilitates parallel processing of scene and symbology.
However, some tasks have shown HUD costs (e.g., Fadden et al., 1998, 2001; Fischer, Haines, & Price,
1980; Hagen et al., 2007; Jarmasz et al., 2005; Wickens & Long, 1995; Zheng et al., 2007). For instance,
Wickens and Long found that an unexpected obstacle, an airplane crossing the runway, was detected more
poorly with the HUD because of readout clutter than with the head-down configuration. The airplane can be
seen, poised to “move out,” in Figure 3.5b. As we saw earlier, placing information sources together does not
necessarily guarantee that both will be processed (remember inattentional blindness and the gorilla and the
basketball players). This apparent contradiction hinges on the expectations of the observer. We are likely to
see advantages to the HUD format when the observer expects the events in the superimposed background; that
is, when they are likely to occur (high expectancy). However, the HUD format will impair performance for
the detection of an unexpected stimulus (low expectancy). Placing information close together can lead to
interference, a disruption of focused attention.

FIGURE 3.5 (a) Head up display (HUD) used in aviation. (b) Head up display with conformal imagery. Note the airplane on the ground by the
runway. (c) Head up display with conformal imagery (runway overlay). Source: (3.5a) Richard Baker/Corbis.

Findings of focused attention failures with close spatial proximity, illustrated with the proximity or
readout clutter in Section 2, have been examined in closely controlled laboratory tasks in what is termed the
“flanker paradigm” (Eriksen & Eriksen, 1974). Here, imagine a rapid response task in which you respond with
the right hand to the letter “R” and the left to the letter “L.” Compared to a baseline, single letter response
time (RT), when the target letter is flanked by irrelevant letters (e.g., [N R S] or [S L K]), RT to the central
target is slowed by this perceptual competition. All letters fall within the flashlight beam. However, when
the relevant letter is flanked by the incompatibly mapped letter (i.e., [R L R] or [L R L]), then RT to the task-
related central letter is slowed by a much larger amount. This is called response conflict, and both perceptual
competition and response conflict grow as the flankers are moved progressively closer to the central letter.
In contrast, when the flanking letters are identical to the central target (e.g., LLL), there is redundancy
gain, with faster RTs than the control condition with either a single letter or irrelevant flankers. Response
conflict and redundancy gain are two sides of the same coin. If two perceptual channels are close together,
they will both be processed (failure of focused attention); then they will impair or facilitate, depending upon
their implications for action.
In displays outside the laboratory, we are more likely to see perceptual competition and redundancy gain
effects as display clutter increases, with the greatest effect when proximity is less than around one degree of
visual angle (Broadbent, 1982). However, flankers can still have effects two to three degrees out (Murphy &
Eriksen, 1987). Indeed, Mori and Hayashi (1995) showed that a task performed in one window of a computer
display was affected by the number of peripheral windows, which suggests that interference can occur across
greater distances. However, flanker effects can be substantially reduced by cueing observers about target
position (Yantis & Johnson, 1990). From a display design perspective, if focused attention on a particular
display element is required, cueing expected target location (if known) reduces the deleterious effect of
display clutter on performance.

Just as close spatial proximity can inhibit the focus of attention, so, too, can spatial separation inhibit
divided attention between two visual sources (Wickens, Dixon, & Seppelt, 2002; Wickens, 1993). This
divided attention cost to spatial separation does not appear to be linear, instead following the spatial distance
function shown in Figure 3.1.
A display designer can take advantage of these attentional effects. To the extent that the user needs to
divide attention among several display elements, reducing the distance between elements will improve
performance, just as increasing it will degrade performance. However, the non-linear function in Figure 3.1
suggests that sometimes the display designer can improve divided attention by moving display elements
closer together down to about one degree, without significant perceptual competition.
The role of distance in the attentional spotlight can be readily translated to the third dimension of depth.
Whereas objects viewed in the same stereo depth plane, and overlapping in the XY plane, impose challenges
to focused attention, separating them in depth by stereopsis allows easier focus on one and filtering of the
other (Chau & Yeh, 1995; Theeuwes, Atchley, & Kramer, 1998). For example, air and ground objects on a
radar screen might be displayed at different depths to help the air traffic controller distinguish an air target from ground objects that would otherwise be distracting. Thus, separating information sources in depth
reduces the likelihood of a failure of focused attention. We will discuss the role of depth perception in
attention in more detail in Chapters 4 and 5.

3.3 Object-Based Proximity


We have seen that moving display elements together in space will aid their parallel processing and increase
the likelihood of interference for focused attention. What if the display elements were combined into a single
stimulus object, the focus of research on object-based attention (Scholl, 2002)? The classic laboratory
demonstration of this phenomenon is called the Stroop effect (Stroop, 1935; MacLeod, 1992). In the Stroop
task, the participant is asked to report the ink color of a set of stimuli. In a control condition, the participant is
shown a row of four Xs (XXXX). Each row is a different ink color, and the participant must report the color
of each row. This is analogous to the single letter control of the flanker task. In the critical response conflict
condition, the stimuli are color names, printed in ink that does not match (e.g., the word BLUE is printed in
red ink). The results are dramatic: Reporting ink color is slow and error prone relative to the control condition.
When participants err, they read the word instead of reporting the ink color. There is response conflict
between the word and the color, slowing processing. Like the flanker paradigm, irrelevant information
interferes with the production of the correct response. The difference, however, is that the relevant and
irrelevant information are part of the same stimulus object in the Stroop effect.
The effect is not limited to words and ink color. Similar examples occur with judgments of an arrow’s
direction (pointing up or down) and its location on a display (high or low) (Clark & Brownell, 1975); stating
whether words are on the left or right of a display when the words themselves were “left” or “right” (Rogers,
1979); and classifying whether a number was large or small when the size of the numeral used to portray it
varies (Algom et al., 1996).
The Stroop effect is part of a large body of evidence suggesting that there is another dimension, besides
space, that can affect both focused and divided attention. This is whether or not an element B belongs to the
same object as element A. If B is to be ignored and belongs to the same object, processing of A will be
hindered compared to the case where B belongs to a separate object.
In contrast to the strong costs for focused attention of the Stroop effect, Duncan (1984) illustrated the
benefits of belonging to the same object for divided attention. He used the stimuli shown in Figure 3.6. One
object was a box, the other was a line. The box was either large or small, and had a gap on one side or the
other. The line was either dashed or solid, and slanted either left or right. Duncan found that judgments of two
attributes (divided attention) were better when both attributes belonged to the same object (e.g., box size and
gap side) than when one belonged to each object (e.g., box size and line orientation). Importantly, the amount
of visual scanning (separation by space) was equivalent in both conditions. Kahneman and Treisman (1984)
have integrated such evidence as we have discussed above to propose the object file theory of attention. The
theory postulates that perceptual processing is parallel within the features of a single object, but serial across
different objects.

3.4 Applications of Object-Based Attention


U.S. Supreme Court Justice Potter Stewart described pornography as difficult to define, but you know it when
you see it. One might say the same about an object. Three features that characterize an object are: (1)
connectedness or surrounding contours between parts; (2) rigidity of motion of the parts, relative to other
scene elements; and (3) familiarity. None is a truly defining feature, but the more of these features the object
has, the more object-like it becomes. We consider two examples: conformal symbology and object displays.

FIGURE 3.6 Stimuli used in the experiment by Duncan (1984).

Earlier we mentioned a study by Wickens and Long (1995) showing that the head-up display could
improve control of aircraft position during landing. Importantly, this result occurred when the HUD
symbology was conformal; that is, the position of HUD objects corresponded to the position of related
objects in the outside scene. For example, in Figure 3.5c the HUD runway was superimposed on the physical
runway, and was moved on the display whenever the plane changed heading, in order to maintain alignment. In
a sense, this is a form of augmented reality (discussed in detail in Chapter 5), in that the real runway scene is
augmented by computer-generated imagery. Wickens and Long’s result is consistent with the object-based
concepts discussed previously: Having the two components (real and HUD runways) superimposed using
conformal imagery creates one object for the attentional system, adhering to the Gestalt principle of common
fate, as the aircraft moves and rotates in the airspace (Jarmasz, Herdman, & Johannsdottir, 2005). This helps
to ensure that the aircraft is in the correct position for landing. We shall discuss these ideas in greater detail
when we consider augmented reality displays in Chapter 5. With conformal imagery, parallel processing
between the display and the world beyond is improved, and the clutter problem causing a failure to focus is
resolved, compared to the non-conformal images of Figure 3.5a.
Designers have also capitalized on the parallel processing of object features to create multidimensional
object displays (Barnett & Wickens, 1988; Hughes & MacRae, 1994). In these displays, multiple information
sources are encoded as the stimulus dimensions of a single object. Figure 3.7 provides several examples.
Figure 3.7a shows the safety parameter display for nuclear power reactor operators designed by
Westinghouse, in which the values of eight key parameters are indicated by the length of imaginary “spokes”
extending from the center of the display and connected by line segments to form a polygon (Woods, Wise, &
Hanes, 1981). The shape of the object denotes a particular system state. When it is symmetrical in all respects,
it indicates “situation normal;” furthermore, each asymmetrical configuration of the polygon indicates a
particular type of system problem. Thus, we can say that the shape of the polygon is an emergent feature of
the object, as defined earlier in this chapter (and in Chapter 2, when we discussed multidimensional absolute
judgment).
Another example of an object display, this time for a medical application, is shown in Figure 3.7b (Cole,
1986). This rectangular display represents the oxygen exchange between patient and respirator. One rectangle
represents the ventilator, the other the patient. The width represents the rate of breathing, and the height
represents the depth of breathing (amount of oxygen supplied on each breath). Thus, the size (area) of the
rectangle indicates the total amount of oxygen exchanged, a critical variable to be monitored. This is true
because oxygen amount = rate × depth (just as rectangle area = width × height). Furthermore, the style of
patient breathing (shallow short panting versus slow deep breaths) can be rapidly determined from the shape
of the rectangle (a second emergent feature). Determining total oxygen exchanged and the style of breathing
are both tasks requiring information integration. Each depends upon dividing attention between breathing rate
and depth, which can easily be discerned by examining the emergent features of size and shape. Such
rectangle displays are found to be quite effective (Barnett & Wickens, 1988).
Figure 3.7c shows an example of a graphical cardiovascular object display for anesthesia (Drews &
Westenskow, 2006). The display was constructed based on the anesthesiologist’s mental model of the
cardiovascular system. The left part of the figure shows normal values, whereas the asymmetric shape in the
right part of the figure indicates myocardial ischemia (a pathological state underlying heart disease). In
addition to the asymmetry, the small, crinkled heart shape shown on the right side is an emergent feature
indicating reduced cardiac output that occurs with the ischemia.

FIGURE 3.7 Figure a) from “An Evaluation of Nuclear Power Plant Safety Parameter Display Systems”, by David D. Woods, John A. Wise,
and Lewis F. Hanes. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, October 1981 vol. 25 no. 1, 110–114.
Reprinted with permission of Human Factors and Ergonomics Society. Figure b) from Cole, W. G. (1986). Medical cognitive graphics. In
Proceedings of the ACM-SIGCHI: Human Factors in Computing Systems (91 – 95). Boston: SIGCHI. Figure c) from “The Right Picture Is
Worth a Thousand Numbers: Data Displays in Anesthesia", by Frank A. Drews and Dwayne R. Westenskow, Human Factors: The Journal of the Human Factors and Ergonomics Society, Spring 2006, vol. 48, no. 1, 59–71.

The object display concept can be applied to text as well. When a graphic designer places text on a
display, there are various techniques for ensuring that its content is associated with the object it identifies
(e.g., spatial proximity, arrows, similar colors). The ultimate method for ensuring association between
elements is achieved by making multiple elements part of the same object. As one clever (and artistic)
example, consider the map shown in Figure 3.8, which uses words (the names of streets) to represent the
streets themselves.

FIGURE 3.8 Words as streets: an integrated object display. Source: “Chicago Typographic Map” by Axis Maps.

We are beginning to see a pattern here: there is a relationship between the choice of a display
representation and a particular set of task demands (what we might call a task representation; Smith,
Bennett, & Stone, 2006; Zhang & Norman, 1994). We can talk then of a task compatibility between the
design of a display and the task requirements. This is illustrated in Figure 3.9. In the next section, we specify
the precise nature of this compatibility in the form of a principle for display design.

3.5 The Proximity Compatibility Principle (PCP)


In Figure 3.10, we summarize one key implication of what we have just discussed. On the bottom of the
figure, two (or more) elements on a display (or in the natural environment) can be either “distant” from each
other or “close,” where closeness can be defined by either spatial proximity (the circles) or belonging to the
same object (“objectness,” the rectangles). This distinction forms the x-axis of the graph above. On the y-axis
we have plotted performance; the quality of performance improves as you move up (as a display designer you
want to be working at the top of the graph). On the right of the graph are two labels of lines that represent
tasks requiring either divided attention between elements or focused attention on one element while ignoring
others. We say that the first (divided attention) task has high task proximity or mental proximity, since
multiple elements are required and attention must be divided between them, and the second has low task
proximity since only one element is required, and others must be kept separate or filtered by focused attention.
For high task proximity (divided attention task), performance will tend to improve as display proximity
increases (dashed line). For low task proximity (focused attention task), performance will degrade as display
proximity increases (solid line). The simple interaction plotted in Figure 3.10 conveys the key idea of the
proximity compatibility principle (PCP; Wickens & Carswell, 1995, 2012). This principle will be elaborated
considerably regarding the concepts of both the display proximity and task proximity below.

FIGURE 3.9 Task compatibility refers to the relationship between display and task representations.

FIGURE 3.10 An illustration of the proximity compatibility principle. The graph at the top of the figure shows performance as a function of
display and task proximity. The bottom part of the figure shows two ways to manipulate display proximity: by distance or by objectness.

In particular, it is important to distinguish between two different types of divided attention task. The first
involves mental information integration, where attention must be divided between multiple elements, but
both are mapped onto a single task (cognitive or motor response), and so their combined implications must be
mentally integrated. The second is dual task processing, where each display element is associated with a
separate response and goal, such as dialing a cellphone and maintaining a car’s heading on the roadway. The
PCP is designed to account for performance in information integration tasks (in contrast to focused attention
tasks), but not dual task processing. Divided attention in the dual task context will be discussed in Chapter 10.
Figure 3.11 shows various ways of manipulating display proximity. These elaborate the two primary
categories of space- and object-based attention. Each method is represented by a different row in Figure 3.11.
The methods can be broadly classified into three groups: sensory/perceptual similarities; common object; and
emergent features. We describe these below, identified by row number in the figure.

FIGURE 3.11 Dimensions of display proximity.

3.5.1 SENSORY/PERCEPTUAL SIMILARITIES


    1. Close proximity in space (or spatial contiguity; Ginns, 2006). As we have discussed, space-based
proximity is related strongly to the effort required to move attention (and particularly the eyes) from
one location to another. One example is the book designer’s goal to keep a figure on the same page
as the text that refers to it, as opposed to requiring the reader to turn pages in search of the figure.
The attentional resources required to access the figure compete with the resources required to retain
information from the text (Liu & Wickens, 1992). Another example is the co-location of product and
hazard information within the same text area, a design that increases warning compliance (Frantz,
1994).
    2. Close proximity in color. When two objects have the same color, they tend to be processed similarly
(Yeh & Wickens, 2001a). It is relatively easy to mentally integrate a group of similarly-colored
objects in an otherwise cluttered visual field (Wickens, Alexander, et al., 2004), and it is also easy to
divide attention among them. For example, this use of color has been suggested as an aid to air
traffic controllers. Two same-color techniques can be employed to aid in information integration.
First, all aircraft flying at a given altitude can appear in the same color (Remington, Johnston, et al.,
2001), making it easier to integrate mentally those aircraft that represent potential collision threats.
Second, a pair of aircraft on a conflict trajectory could be colored red, making it easier for the
controller to notice the pair and understand their joint trajectory, an integration task. Both concepts
have been used in the design of air traffic control displays. Linking by color was employed by
Moertl et al. (2002) in a design concept that helped controllers link the spatial position of an aircraft
with its identity and flight parameters in a spatially separated table, by jointly flashing the two
representations in the same color.

3.5.2 COMMON OBJECT

    3. Connections. Two spatially separated objects can be cognitively linked with a line creating a single
object (see Figure 3.4). Attention appears to be drawn relatively automatically along the line
(Jolicoeur & Ingleton, 1991). Thus in an air traffic control conflict display described above, the two
conflict airplanes are not only the same color, but are also joined by a line. As another example, it
helps when a printed sentence is linked to a pictorial rendering in device instructions: For example,
text instructions about how to manipulate a particular control are connected with a line to a picture of
the control in its location on the equipment (Tindall-Ford, Chandler, & Sweller, 1997, see Chapter
6). The clutter of these additional links can be minimized by using reduced contrast for the linking
lines, or by using dashed or dotted lines, while ensuring they are still visible (Wickens, Alexander, et
al., 2004).
    4. Abutment. Having the contours of two objects touch or “abut” can improve their integration, while
still allowing them to be perceived as separate objects. Figure 3.11 shows that if bars in a bar graph
are abutted, creating a single object, the emergent feature of co-linearity is extremely salient, since
its absence will be signaled by the break in the line across the top. Detecting such a break draws on vernier acuity, a sensory capability to which humans are extremely sensitive (McKee & Nakayama, 1983).
    5. Heterogeneous features. Row 5 of Figure 3.11 shows two objects created by three heterogeneous
features: size, brightness, and shape. We call these features heterogeneous because they are
processed relatively independently in different perceptual analyzers or channels (Treisman, 1986).
Heterogeneous features are more likely to be separable dimensions (see Chapter 2) than
homogeneous features. Heterogeneous features are often used to show the characteristics of a town
on a demographic map (e.g., symbol size, color and shape represent population, political leaning, and
mean income) (see Chapter 5). The stimuli used in the Stroop task are heterogeneous objects, with a
semantic and a color dimension.
6 and 7. Homogeneous features. Rows 6 and 7 show two homogeneous featured objects, each object defined
by a horizontal and vertical measure; the XY position of a point on a graph (row 6) and the width
and height of a rectangle (row 7). These are said to be homogeneous because a single perceptual
analyzer–of spatial distance–defines both. A display designer who wants to represent two aspects of
a single entity will need to know whether to use heterogeneous or homogeneous features. The
answer appears to lie in both the kind and degree of integration (task proximity) that is required
(Wickens & Carswell, 1995; Carswell & Wickens, 1996). As we saw in Chapter 2, with integral
dimensions it becomes more difficult to filter out one dimension and only process the other.
Homogeneous features are similar to such integral dimensions.
If the user must consider both aspects of the entity at once in a Boolean logical operation (e.g., is a given city
both large in population and politically conservative?), then heterogeneous featured objects are the ideal
choice because they best allow parallel processing of the two dimensions (Lappin, 1967). Indeed,
heterogeneous object features are a highly economical way of presenting lots of information in a space
containing multiple objects (e.g., a map showing many cities), because all attributes of the single object can be
processed in parallel, the processing being divided between the different analyzers. Thus heterogeneous
features are good display clutter reducers! Heterogeneous feature objects also support redundancy gain (as
discussed in Chapter 2, and earlier in this chapter), where all features lead to a common response. For
example, the stop sign with color (red), shape (octagon), and word meaning (STOP) has three redundant
heterogeneous features as part of a single object.
If instead the integration goal is arithmetic or comparative, then heterogeneous features no longer
provide the same benefit since each feature is expressed in its unique “perceptual currency,” which cannot
easily be compared or combined. For example, an aircraft pilot who wants to compare actual and desired
speeds in order to arithmetically compute the difference (error) does not want one to be expressed spatially
and the other expressed by color code. Instead, it is better for both to be spatial, perhaps as the height of two
connected bar graphs (Row 4). Many integration tasks involve mental multiplication where the rate of some
operation (e.g., rate of travel) is multiplied by the duration or elapsed time of operation to produce a total
quantity measure (e.g., distance traveled). Homogeneous features represent the integration task of
multiplication better, and this is particularly true for the height and width of a rectangle (Row 7), where the
area of the rectangle display is equal to the product of the two variables (Barnett & Wickens, 1988; see also
Figure 3.7b). The user does not have to multiply numbers because the size of the rectangle is easily perceived.

3.5.3 EMERGENT FEATURES


8. Homogeneous features (again). To be useful in display design, emergent features should be mapped to
those quantities needing to be integrated (Bennett & Flach, 1992). The shape of the medical display
for monitoring patient respiration described in Figure 3.7 serves as one example (Cole, 1986; Drews &
Westenskow, 2006).
Importantly, research (and intuition) has suggested that emergent features need not be created by an object
display (Sanderson, Flach, et al., 1989). Figure 3.12 provides an example of two bar graphs (separate objects).
Let’s assume that each represents the desired and actual temperature of fluid in a tank, and that the user is to
perform an integration task: determine whether the two temperatures (desired and actual) are equal. To the
left, it is easy to perceive that the system is operating normally: the height of the two bars is identical. To the
right, the same integration judgment appears more difficult. The reason? Aligning the two bars to a common
baseline produces an emergent feature on the left: now when the tops are aligned, it signals equivalence. One
could imagine a ruler laid flat across the top (shown by the dashed line). In the same way, the common
orientation of the engine needles in Figure 3.4b provides an emergent feature (parallel verticality) signaling
“all is well”.

FIGURE 3.12 The effect of common baseline alignment on detecting a system state in which two parameters are equal.

FIGURE 3.13 An interaction and two main effects shown as a bar graph and as a line graph.

Another important emergent feature is the slope of a line that connects two objects in a graph. For
example, consider the graphs in Figure 3.13. On the left, it is easy to determine that the bars have different
heights, and inspection of the four means will indicate whether an interaction is present. However, when the
same data points are connected by lines as on the right, the presence of the interaction is more salient given
that it is explicitly represented by the slopes of the lines connecting the values. The fact that the slopes differ
is now expressed visually by the emergent feature of the angle between the two lines. Indeed, when the two
variables are additive, the parallel aspect of the two lines serves as the emergent feature, as shown at the
bottom of the figure. We will describe perception of graphical information in much more detail in the next
chapter.
9. Polygon displays and symmetry. A final emergent feature that can often be created and exploited in
display design is the symmetry of an object or a configuration of objects. Visual attention is highly
sensitive to symmetry and its absence (Garner, 1974; Palmer, 1999; Pomerantz & Pristach, 1989), and
so if a symmetrical configuration can be directly mapped to a critically important display state, then a
well-conceived emergent feature display will be achieved. An oft-cited example is the polygon or
object display (Beringer & Chrisman, 1995, Gurushanthaiah, Weinger, & Englund, 1995; Hughes &
MacRae, 1994; Peebles, 2008; Woods, Wise, & Hanes, 1981), as shown in Figure 3.7a and in Row 9
of Figure 3.11. Here the normal operating level of four system parameters is represented by a fixed
(and constant) length of each side of the quadrilateral (or the length of the four radii from the center).
When all four are at this normal level, a perfect square results, as shown on the left. The square is both
vertically and horizontally symmetric and easily perceived as a square. When any variable departs
from normality, symmetry is broken and the deviation is obvious, as shown on the right.
In closing our discussion of information integration, we note two important aspects of emergent
(homogeneous) featured object displays. First, the creation of such displays can involve considerable
creativity on the part of the designer (some would say as much art as science). Even given the constraints
described above, there are typically many possible display configurations from which to choose.
The second aspect is an obvious point that may have already been noted by the reader: Suppose the
display user does not care about the emergent integration quantity, but instead needs to focus attention on the
precise value of a particular underlying dimension (e.g., what is the patient’s rate of breathing). Would this
focused attention task be hurt by an object rendering? In other words, does close proximity always hurt
focused attention? We consider this topic in the next section.

3.5.4 COSTS OF FOCUSED ATTENTION: IS THERE A FREE LUNCH? The PCP proposes that there is an interaction
between display and task proximity, as depicted in Figure 3.10. In its purest form, it predicts that closer
display proximity, however achieved, will improve performance on integration tasks, and disrupt performance
on focused attention tasks. The negative effects (or diminished benefits) of high (close) task proximity on
focused attention are well documented as seen with both the Stroop task and overlay (readout) clutter
(Wickens & Carswell, 1995), even if this effect is typically smaller in its magnitude compared to the benefits
of close display proximity for integration (Bennett & Flach, 2010).
However, there are certain circumstances in which closer display proximity to aid integration does not
hurt focused attention. As we noted earlier, costs to focused attention typically emerge only when spatial
separation is decreased below about one degree of visual angle (and are then amplified when overlap occurs).
However, the costs of increased separation to divided attention are relatively monotonic across a wide range
of angles above one degree (see Figure 3.1). Thus, decreasing separation from 20 to 2 degrees generally helps
integration but does not hurt focused attention. Similarly, rendering two items in a cluttered display the same
color (or intensity) will not hurt the focus of attention on either item (and will aid search performance if the
pair are uniquely colored; Wickens, Alexander, et al., 2004). Furthermore, using a line to connect two dots on
a line graph (see Row 3 of Figure 3.11) will produce an emergent feature (line slope) that will help the user
detect any difference between the values (the integration task). However, this connection will not hinder a
focused attention task like extrapolating the line’s position to the X axis, compared to a bar graph.
In short, sometimes there is a free lunch (or at least a cheap one!) if proximity is used with care. A
designer who aims to support an array of focused and integration tasks may, by careful selection of different
proximity metrics, support both tasks. We shall reconsider some of these points when we discuss graph design
in Chapter 4.

4. ATTENTION IN THE AUDITORY MODALITY


The auditory modality is different from the visual modality in three important respects relevant to attention.
First, the auditory sense can take input from any direction and so there is no analog to visual scanning as an
index of selective attention (i.e., there is no “earball”). We say sound is omnidirectional. Second, the auditory
modality has the capacity to receive information at almost all times, in darkness or even while we sleep. There is no "earblink." Third, most auditory input is transient. A word or tone is heard and then it ends, in contrast to most visual input, which tends to be continuously available. Hence, the preattentive characteristics of auditory processing–those required to "hold on" to a stimulus before it is gone–are more critical in audition than in vision. As discussed briefly in Chapter 1, reflecting this demand, short-term auditory store is longer than short-term visual store.
As we found with visual attention, it is impossible to focus our attention on all aspects of what we hear in
everyday life, so we try to either divide our attention among a limited number of auditory events (e.g., an
operator listening to multiple communication channels simultaneously), or attend selectively to one specific
auditory event, while trying to ignore others (e.g., an operator listening to one communication channel and
ignoring others). The ubiquitous nature of the auditory channel (we cannot close our ears or move our earball)
can be exploited in the design of auditory warnings. A warning presented over the auditory channel stands a
good chance of capturing the operator’s attention, even if the operator is otherwise engaged. However,
auditory attention will also be captured by sounds with no relevance or significance, such as when we are
distracted by noise in a busy open-plan office (Banbury, Macken, et al., 2001).

4.1 Auditory Divided Attention


Consider a situation in which we listen to two talkers speaking simultaneously and try to identify key words in
their sentences. This is a difficult task, but it is made easier when the two voices are placed in different spatial
locations, or when the two voices have different fundamental frequencies (e.g., male and female voices)
(Humes, Lee, & Coughlin, 2006). Separating the voices avoids confusion analogous to that created by visual overlay (readout) clutter (Section 2.4).
Given the difficulty of attending to multiple auditory streams at once, we often adopt a strategy of
switching between them. A general model of auditory attention (see Norman, 1968; Keele, 1972) proposes
that an unattended channel of auditory input remains in preattentive short-term auditory store for 3 to 6 seconds (see
Chapter 7). The transient contents of this store can be examined if a conscious switch of attention is made.
Thus, if your attention wanders while someone is talking to you, it is possible to switch back and “hear” the
last few words the person spoke, even if you were not attending to them when they were uttered.
Information in unattended channels may make contact with long-term memory. That is, words in the
unattended channel are not just meaningless “blobs” of sound, but their meaning is analyzed at a preattentive
level. If the unattended material is sufficiently pertinent, it will often become the focus of attention (i.e.,
attention will be switched to the unattended channel). For example, a loud sound will almost always grab our
attention as it signals a sudden environmental change that may need to be addressed. Our own name also has a
continued pertinence, and so we will sometimes shift attention to it when spoken, even if we are listening to
another speaker (Moray, 1959; Wood & Cowan, 1995). So also does material semantically related to the topic
that is the current focus of attention (Treisman, 1964a).
Designers can capitalize on this tendency to switch attention to contextually pertinent material to design
quieter, less noxious alerts. Although loud tones call attention to themselves, they can annoy and startle, and
their intensity can increase stress, leading to poor information processing (Wiese & Lee, 2004; see Chapter
11). If a pilot is landing an airplane, for example, it may not be necessary to have loud alarm signals for
operations relevant to landing. Since one has a low attentional threshold for one’s own name, personalized
alerts prefaced with the operator’s name may also attract attention without high volume. These attention-
grabbing, but quieter, auditory warnings have been called attensors (Hawkins & Orlady, 1993; Sarter, 2009).
In our discussion of visual attention, we saw that close proximity, particularly as defined by objectness,
was key to supporting the successful divided attention necessary in an information integration task. We also
saw that the same manipulations of proximity that allowed success in divided attention were responsible for
the failure of focused attention. These manipulations and observations have their counterparts in audition. We
define an auditory object as a sound (or series of sounds) with several dimensions. These auditory dimensions
seem to enjoy the same parallel processing benefits as do the dimensions of a visual object. For example, we
can attend to both the words and melody of a song (Gordon, Schön, et al., 2010), or to the meaning and voice
inflections of a spoken sentence. In Chapter 2, we discussed how the basic dimensions of sound (pitch,
loudness, and timbre) were integral dimensions. Auditory warning alerts–such as the ‘earcons’ described in
Chapter 6–have been designed to capitalize on the parallel processing of these integral dimensions to convey
additional meaning, such as perceived urgency (Edworthy & Loxley, 1990; Hellier et al., 2002; Marshall, Lee, &
Austria, 2001; Wiese & Lee, 2004).
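
To make the idea of mapping integral sound dimensions to urgency more concrete, here is a minimal Python sketch (NumPy only) that synthesizes a pulsed warning whose pitch, repetition rate, and level all rise together with a single urgency parameter. The specific parameter ranges are our own illustrative assumptions, not the mappings validated in the studies cited above.

```python
import numpy as np

def urgency_warning(urgency, fs=44100):
    """Synthesize a simple pulsed warning tone whose pitch, pulse rate, and
    level all rise with a single urgency parameter in [0, 1].
    All parameter ranges are illustrative, not values from the cited studies."""
    f0 = 500 + 1500 * urgency            # carrier pitch: 500-2000 Hz
    rate = 2 + 6 * urgency               # pulses per second: 2-8
    level = 0.2 + 0.6 * urgency          # relative amplitude: 0.2-0.8
    pulse_dur, total_dur = 0.08, 1.0     # seconds
    t = np.arange(int(fs * total_dur)) / fs
    tone = np.sin(2 * np.pi * f0 * t)                          # carrier tone
    gate = (np.mod(t, 1.0 / rate) < pulse_dur).astype(float)   # on/off pulses
    return level * tone * gate

low_priority = urgency_warning(0.2)    # slow, low-pitched, quiet burst
high_priority = urgency_warning(0.9)   # fast, high-pitched, loud burst
```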

4.2 Focusing Auditory Attention


Focused auditory attention involves attending to one source of auditory information while excluding all
others; for example, a radio operator must concentrate on a single message while ignoring conversation and
background noise in the room.
We can attend selectively to auditory messages even from similar locations. The cocktail party effect
describes our ability to attend to one speaker at a noisy party and selectively filter out other conversations
coming from similar spatial locations. In physical terms, sound is a jumble of undifferentiated pressure
changes; whereas in perceptual terms it is organized into streams of relatively stable and distinct auditory
objects (see Bregman, 1990). This notion of auditory streaming explains how we use physical characteristics
of the sound to focus our attention selectively. For example, one such characteristic is pitch; it is easier to
attend selectively to one of two voices if the voices are of opposite sexes (and thereby typically have different pitch) than if the
two voices are of the same sex (Treisman, 1964b). We experience auditory streaming when we listen
selectively to different instruments played within an orchestra; we are able to do this with ease, despite the
physical complexity of the sound pressure changes arriving at our ears.
The organization of sound into perceptually-distinct auditory objects is mediated by a number of factors,
including pitch, timbre, spatial location, and timing (Jones et al., 1999). For example, in Figure 3.14a we
perceive two tones with a small pitch separation as one coherent stream of alternating tones. If we were to
increase the pitch separation of the tones gradually, we would, at some point, perceive the fission of the single
alternating stream into two distinct streams of repeating tones (Figure 3.14b). Similarly, if we were to increase

the rate of presentation of the tones, we would also perceive the fission of the single alternating stream into
two distinct streams (Figure 3.14c). The phenomenon of auditory streaming has been exploited for centuries
in classical music by way of polyphony: the creation of two or more melodic voices played simultaneously.
(For a review of the role of attention in music, see Bregman, 1990.)

FIGURE 3.14 Illustration of the auditory streaming phenomenon. A small pitch separation between two alternating tones results in their fusion
into one stream (a), whereas larger pitch separation (b) or quicker presentation (c) results in their fission into two perceptually-distinct streams.
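
As a rough sketch of the stimuli underlying Figure 3.14, the following Python code (NumPy only; the frequencies, durations, and counts are illustrative assumptions) generates an A-B-A-B tone sequence; enlarging the pitch separation or speeding the presentation rate should produce the fission of the single alternating stream into two.

```python
import numpy as np

def alternating_tones(low_hz, high_hz, tone_dur, n_pairs, fs=44100):
    """Build an A-B-A-B... sequence of pure tones. Small pitch separations and
    slow rates tend to be heard as one alternating stream; large separations
    or fast rates split ('fission') into two streams (cf. Figure 3.14)."""
    t = np.arange(int(fs * tone_dur)) / fs
    tone_a = np.sin(2 * np.pi * low_hz * t)
    tone_b = np.sin(2 * np.pi * high_hz * t)
    return np.tile(np.concatenate([tone_a, tone_b]), n_pairs)

fused = alternating_tones(500, 550, tone_dur=0.25, n_pairs=10)   # panel (a): one stream
split = alternating_tones(500, 900, tone_dur=0.25, n_pairs=10)   # panel (b): two streams
rapid = alternating_tones(500, 550, tone_dur=0.08, n_pairs=30)   # panel (c): two streams
```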

In vision, we saw that using close proximity to facilitate parallel processing was a double-edged sword
because it disrupted the ability to focus attention. In the auditory modality, too, we find that focused attention
on one channel is disrupted when two messages appear to come from the same spatial location. For example,
in monaural (mono) listening, two messages are presented by headphones with equal relative intensity to
both ears. This is similar to what you would experience when listening to two speakers both directly in front
of you. In dichotic (stereo) listening, the headphones deliver one message to the left ear, and the other to the
right. Here, you would hear one voice in each ear. Multiple studies show that there are large benefits of
dichotic over monaural listening in terms of our ability to filter out the unwanted channel (Egan, Carterette, &
Thwing, 1954; Humes, Lee and Coughlin, 2006; Treisman, 1964b).
By moving the eyes to a location, our visual system can selectively attend to the information at that
location and ignore other information sources. Although there is no earball, three-dimensional (3D) audio
technology (or 3D audio, discussed further in Chapter 4) can direct auditory attention by cueing, just as visual
attention can be directed without eye movement. By simulating the cues we use to determine the spatial
location of sound, 3D audio can be used to project auditory cues to the user in the full 360° volume of space,
even through traditional stereo headphones. Thus, one can use spatial audio to help direct attention of the pilot
(or car driver) to identify targets of interest in the environment. In applied settings, the cueing of attention
through the auditory modality confers a number of advantages. These include the use of an alternative channel
to multiple visual information sources, and the ability to present a cue anywhere within the full 360° volume
of space. Further, unlike visual cueing, the time needed to make the attentional shift does not vary with the
distance to the cue (Mondor & Zatorre, 1995).
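
As a minimal sketch of how spatial audio cueing could be approximated over ordinary stereo headphones, the Python code below (NumPy only) lateralizes a cue tone using a crude interaural time and level difference. The delay and attenuation values are rough assumptions; a real 3D audio system would instead convolve the signal with head-related transfer functions.

```python
import numpy as np

def lateralize(signal, azimuth_deg, fs=44100):
    """Crudely place a mono cue at azimuth_deg (-90 = far left, +90 = far right)
    by delaying and attenuating the far-ear channel. Values are illustrative;
    true 3D audio uses head-related transfer functions (HRTFs)."""
    frac = np.sin(np.radians(azimuth_deg))              # lateralization, -1..+1
    delay = int(abs(frac) * 0.0007 * fs)                # up to ~0.7 ms interaural delay
    near = signal
    far = np.concatenate([np.zeros(delay), signal])[:len(signal)] * (1 - 0.5 * abs(frac))
    left, right = (far, near) if frac > 0 else (near, far)
    return np.column_stack([left, right])               # N x 2 stereo array

t = np.arange(44100) / 44100
cue = np.sin(2 * np.pi * 1000 * t)                       # 1 kHz cue tone
stereo_cue = lateralize(cue, azimuth_deg=-45)            # cue heard to the left

```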
In addition to directing attention, the display designer can take advantage of these various effects to
create auditory streams. For example, by presenting one radio network to each ear, and a third presented with
equal intensity to both ears (thereby appearing to originate from the midplane of the head), a radio operator is
better able to monitor all three networks as compared to monaural presentation (i.e., all presented to the center
channel only). In this case, a spatial separation of the three radio networks promotes the formation of three
distinct auditory streams (left, right, and center). This should make it easier for the operator to select one
network and ignore the others. Other factors that promote the formation of auditory streams, like pitch
difference, can also be utilized to make each stream even more perceptually distinct. Thus, airplane pilots
might have available several distinct audio channels (messages from co-pilot, from air traffic control, and
from nearby aircraft). These could be presented in their actual positions relative to the pilot. Synthesized voice
warnings from own aircraft could also be placed in appropriate locations (e.g., a left engine failure alert could
be heard coming from the left and to the rear). In addition, single word low-priority warnings could be

presented as moving left to right across the channels (for example, “ca”-“bin” “hot” to create the warning
“cabin hot”), an approach that reduces distraction while preserving intelligibility (Banbury et al., 2003).

4.3 Cross-Modality Attention


Section 4 has so far focused exclusively on attention within a modality. But in many real-life situations we are
confronted with parallel inputs across modalities. Consider when we drive and our passenger gives us verbal
directions, or when the pilot landing an aircraft monitors the visual environment, listens to the copilot’s
spoken messages regarding key velocities, and obtains somatosensory and kinesthetic feedback from the
shaking of the rudder control. As we will discuss in Chapter 6, we can view text or pictures and hear audio
information simultaneously when we visit a web site or engage in computer-based training. The construction
of virtual environments, to be discussed in Chapter 5, often requires the proper integration of visual, auditory,
and often haptic information. There are advantages to using multiple modalities: as we will discuss in
Chapters 6 and 10, redundantly coding a target across modalities (e.g., coupling a visual warning with an
auditory beep) improves the accuracy of processing (Wickens, Prinet, et al., 2011).
Visual, auditory, and even proprioceptive attention have been shown to draw upon common spatial
processes. Attentional cueing to a location in one perceptual modality has been shown to produce a reduction
in response time to targets in a different modality (Driver & Spence, 2004; Spence, McDonald, & Driver,
2004; see also discussion of cross-modality links in Chapter 4). For example, when asked to monitor a stream
of speech for target words while driving, monitoring performance was better when the speech was presented
in front of the driver–near the focus of visual attention–compared to when speech was presented to the
driver’s side (Spence & Read, 2003). Similar effects have been observed between visual and proprioceptive
attention, and between auditory and proprioceptive attention (for a review see Sarter, 2007). Crucially, the
spatial links between auditory, visual, and proprioceptive attention seem to be obligatory (i.e., beyond
conscious control). This helps explain why auditory stimuli are so useful in alerting–not only do they summon
auditory attention, they also cue the operator’s visual attention.
However, capturing an operator’s attention through another modality is a double-edged sword. Recent
research on the irrelevant sound effect has focused on identifying those task and sound factors that can lead a
person to be distracted while undertaking relatively complex mental tasks (for reviews see Banbury et al.,
2001, and Beaman, 2005); that is, the failure of focused attention. In general, it appears that working memory
(discussed in Chapter 7) is susceptible to interference by irrelevant sound. In particular, it is the maintenance
of item order (e.g., remembering the sequence of digits in a telephone number) that is most affected (Jones,
Hughes, & Macken, 2010). The disruptive effect of irrelevant sound on tasks involving the maintenance of
order, such as memory for prose and mental arithmetic, has been found to be as much as 60 percent (Banbury
& Berry, 1998; Szalma & Hancock, 2011, provide a meta-analysis). On balance, the evidence suggests that
acoustic change is the main disruptive factor (Jones, 1999), particularly for mental activities that rely on
working memory (see Chapter 7) to keep information in order (Banbury et al., 2001; Beaman, 2005). For
example, sounds, tones, and speech utterances are most disruptive if they show appreciable acoustic variation
over time (Jones & Macken, 2003). Tremblay & Jones (2001) also found that these types of irrelevant sound
were particularly disruptive of the processing of sequential information. Verbal tasks (either visual or
auditory-based) were disrupted by irrelevant speech, but so were visual-spatial tasks. Taken together, these
results suggest that activities that require the order of items in memory to be kept intact are particularly
susceptible to interference by changing irrelevant sounds, even if we try to ignore them, and even if they
access working memory through different modalities. This finding has implications for noise abatement in
applied settings.
Indeed, the basic research on the irrelevant sound effect has been extended to examine the effects of
background noise on performance in the office (Banbury & Berry, 1998; 2005), the flight deck (Banbury et
al., 1998; Hodgetts et al., 2005), the classroom (Stansfeld, Berglund, et al., 2005; Dockrell & Shield, 2006),
the lecture theatre (Shelton, Elliott, et al., 2009; End et al., 2010), and even when doing homework in front of
the television (Pool, Koolstra, & van der Voort, 2003). The results of these studies are striking: background
noise significantly impairs performance on cognitive activities across the range of industrial and educational
settings. For example, office workers who had been moved from private office rooms to open-plan offices
reported increased distraction, increased concentration difficulties, and a two-fold increase in loss of work
performance due to background noise (Kaarlela-Tuomaala, Helenius, et al., 2009). In educational settings,
long-term exposure to aircraft noise has been found to impair children’s reading comprehension (Stansfeld,
Berglund, et al., 2005). Children are particularly susceptible to auditory distraction, with younger children
being more susceptible (Elliott, 2002). Unfortunately, long-term exposure to background sound does not

reduce its disruptive effects; if habituation does occur, relatively short periods of quiet can cause rapid
dishabituation to the sound (Banbury & Berry, 1997).
Fortunately, an understanding of auditory attention and the irrelevant sound effect can help create policy
and interventions to reduce the impact of background noise on worker productivity in industrial environments,
open-plan offices, and schools. It is not enough to simply reduce the level of the sound, as background noise
can be disruptive even when quiet (Tremblay & Jones, 1999). Rather, the research suggests that reducing the
variability of the sound–the main determinant of its disruptive effects–seems the most promising strategy,
especially for tasks involving the maintenance of order in memory. This can be accomplished through the
acoustic treatment of the workplace to minimize sound variability. For example, continuous white noise that
partially masks the cues necessary for the segmentation of background speech has been found to reduce its
disruption to cognitive performance (Venetjoki, Kaarlela-Tuomaala, et al., 2006).
Although instrumental music does not show the same ameliorating effect as white noise, participants do
report that they would prefer music to continuous noise in office environments (Schlittmeier & Hellbrück,
2009). This is an example of a common phenomenon: users sometimes want what is not necessarily best for
them from a performance perspective (Andre & Wickens, 1995). In cases in which the task requires the
maintenance of order, instrumental music has been shown to cause some disruption (Salamé and Baddeley,
1989); however, for tasks that are less reliant on these processes (such as reading comprehension), it does not
(Martin, Wogalter, & Forlano, 1988). The circumstances under which disruption by background sound takes
place therefore need to be fully understood by the Human Factors engineer before the redesign of the task and
task environment can take place. An analysis of the cognitive requirements of the task should reveal the extent
to which it relies on working memory processes for the maintenance of order, which in turn will indicate the
susceptibility of the task to disruption by background sound.
Other potential ways of reducing the acoustic variability of background sound (and thereby reducing its
detrimental impact on cognitive performance) include the installation of sound-absorbing materials on ceilings
and partitions. This has the effect of reducing the intelligibility of the sound (Schlittmeier, Hellbrück, et al.,
2008) or varying the reverberation time of the sound (Perham, Banbury, & Jones, 2007). However, the
acoustic treatment of sound in the workplace illustrates how designers can face competing goals. On one
hand, there is a need to reduce the detrimental impact of background sound through masking with continuous
noise or by the acoustic treatment of the workspace. On the other hand, as we will discuss in Chapter 6, there
is also need to preserve good speech communication and intelligibility within the workspace. Clearly,
compromises are necessary, as it is difficult for good speech communication and good speech privacy to
coexist in a single physical environment.

5. TRANSITION
In this chapter we have described attention as a filter to the environment. Sometimes the filter narrows to
decrease irrelevant visual or auditory input, and sometimes it broadens to take in parallel streams of
environmental information for integration or multi-tasking. The effective breadth of the filter is dictated by the
limits of our senses (e.g., foveal vision), task demands, the differences and similarities between stimulus
channels, and the strategies and understanding of the human operator. What happens, then, when material
passes through the filter of attention? We saw in Chapter 2 that material may be provided with a simple yes-
no classification (signal detection) or categorized into a level on a continuum (absolute judgment). But more
often the material is given a more sophisticated and complex interpretation. This interpretation is the subject
of several subsequent chapters.
In Chapter 4, we present cognition-based principles of spatial display design that are intended to
maximize the likelihood of correct interpretation of attended information. In Chapter 5, we focus most heavily
on conveying information for navigation and spatial interaction tasks, and in Chapter 6, information for
comprehension of language. Finally, in Chapter 10, we revisit the concept of attention in the context of multi-
tasking.

Key Terms
3D audio 80
area of interest (AOI) 50

attentional capture 53
attentional cueing 63
attentional narrowing or tunneling 63
auditory object 78
auditory streaming 79
augmented reality 69
automaticity 59
automation complacency 63
central cue 62
change blindness 53
change blindness blindness 54
clutter 61
cocktail party effect 79
conformal 69
conformal symbology 69
conjunction search 58
consistent mapping 59
dichotic 80
disorganizational clutter 61
display representation 71
distractors 57
divided attention 49
dual task processing 72
dwells 51
effort 51
emergent features 64
exhaustive search 58
focused attention 49
forward field of view (FFOV) 65
frequency of sequential use 53
global density clutter 61
global or holistic processing 64
guided search 59
head-up display (HUD) 65
heterogeneous clutter 61
heterogeneous features 74
highlighting 63
imperative stimulus 62
inattentional blindness 55
information integration 72
irrelevant sound effect 81

local density clutter 61
mental model 52
monaural 80
multi-tasking 50
numerosity clutter 61
object displays 69
object file theory of attention 68
object-based attention 65
omnidirectional 77
parallel search 58
perceptual analyzers 74
perceptual competition 67
periods of neglect 52
peripheral cue 62
polyphony 80
preattentive 58
proximity compatibility principle 53
proximity or readout clutter 61
redundancy 64
redundancy gain 67
response conflict 67
salience 51
SEEV model 52
selective attention 49
serial self-terminating search (SSTS) model 57
space-based attention 65
Stroop effect 68
sustained attention 49
target popout 58
task compatibility 71
task proximity 72
task representation 71
useful field of view (UFOV) 56
varied mapping 59
vernier acuity 74
weapons effect 64

4 SPATIAL DISPLAYS

When we drive a car, we derive information about the depth and position of other objects in the world from
the scene through the windshield. Similarly, when we examine a bar graph or check a speedometer, we derive
information about the state of the world from a spatial array. The sizes of objects or the distances between
them are used to communicate the relevant information. Human performance in such spatial judgments
depends on accurate judgments of distance, extent, and depth. Our ability to perceive and understand such
spatial relations will be the focus of this chapter.
Generally, large spatial or physical differences are more important or significant than small ones.
Consider reading a graph or an analog meter. A small change in position reflects a small change in the
underlying dimension. In contrast, consider reading a digital meter or a word. In a digital meter the spatial
difference in the physical representation between, say, 79999 and 80000 is substantial—every digit is
changed. But the difference in meaning between these two values is small. An analog display preserves some
of the inherent properties of the dimension it represents: in this sense, it is an analog of its physical
counterpart.
In this chapter, we consider a variety of spatial displays. We first discuss the perception and
understanding of graphs. Then we address the role of motion as we consider the design of common displays
such as meters and dials. In doing so, we highlight the importance of compatibility between the dimension
portrayed and display elements. We consider compatibility in both static and dynamic senses. Space, of
course, is also three dimensional (3D). Our perception of a 3D environment is determined by the information
we obtain about its structure as we move through it. We thus consider the various types of information we can
obtain through movement, and their implication for display design. We are also concerned with perceptual
judgment of depth and distance. We discuss the implications of such judgment on perception of real-world
environments and for representing a 3D space on a 2D display surface. We close the chapter with a brief
discussion of spatial displays that use other sensory modalities. In Chapter 5 we will expand on some of these
topics while examining navigation and interaction with real and virtual environments.

1. GRAPH PERCEPTION
Unlike many of the displays discussed in this book, a graph is something most of us will, at one time or another, design ourselves. In
the process, we must make decisions about graph type, assign variables to axes, code variables using symbols,
and so on. This makes the graph a good place to start a discussion of display design. We define a graph as a
paper or electronic representation of numeric analog data with multiple data points. Some everyday examples
—bar graphs, line graphs, and pie charts—are shown in Figure 4.1. The distinction between graphs and
analog displays has become blurred in recent years due to developments in information visualization, where
graphs can, for example, dynamically change from one format to another (Heer & Robertson, 2007; see
Chapter 5), but one remaining difference is that with graphs, the data typically do not change as the user views
them, whereas with information displays the data shown can change in real time as the user performs tasks
and monitors the outcome.

FIGURE 4.1 An example of a bar graph, a line graph, and a set of pie charts. Each graph type depicts the same data: the production of two
factories, A and B, over four years. Four graph reading tasks that could be performed with each graph are also described.

The graphic display of data dates back to the pioneering work of Playfair (1786), who first
realized the power of using analog representations (e.g., bar graph, pie chart) to represent quantitative data.
For spatial judgments (e.g., which variable is decreasing more quickly?), performance is better with graphs
than tables (e.g., Kirschenbaum & Arruda, 1994; Vessey, 1991). As noted above, for spatial judgments large
differences between values are more significant than small ones. It comes as no surprise, therefore, that an
analog representation like a graph is more effective for the spatial judgment than a digital display. In contrast,
reading a precise value is generally performed better with tables of digits (Lalomia, Coovert, & Salas, 1992;
Meyer, Shinar, & Leiser, 1997; Vessey, 1991).
In Chapter 1, we introduced a model of human information processing. When considering the processing
of graphs, we are looking primarily at the perception, attention, and working memory stages shown in that
model. Long-term memory will also play a role in influencing familiarity with the data being depicted or the
underlying graphical form. These are essentially the same as the bottom-up and top-down influences on visual
information sampling described in the SEEV model in Chapter 3. Salience and effort are primarily influenced
by perceptual, attentional, and working memory stages; expectancy and value are influenced by working
memory and long-term memory processes. In general, we will see that less effective task-graph combinations
require a longer sequence of mental operations rather than having key task variables represented using easily
perceived geometric characteristics.

1.1 Graph Guidelines


We provide five general guidelines for the construction of graphs here. We discuss evidence for each
guideline in turn. Further guidelines can be found in Gillan, Wickens, Hollands, and Carswell (1998).
1. Consider the task. The relative effectiveness of various graph types depends on the task. The graph
designer should choose a graphical form that corresponds to task demands.
2. Minimize the number of mental operations. The graph designer should try to reduce the number of
operations required by choosing an appropriate graph type (e.g., bar graph, pie chart) and arranging
information within the graph appropriately.
3. Use physical dimensions judged without bias. Perceptual illusions, biases in the judgements of some
perceptual continua, and misjudgments of depth can produce error in judgment.
4. Keep the data-ink ratio high. Keep the amount of ink that does not depict actual data to a low level.
5. Code multiple graphs consistently. Graphs within a set should be designed in a consistent manner.

1.2 Task Dependency and the Proximity Compatibility Principle


There are a large number of tasks people perform with graphs. A convenient taxonomy is shown at the bottom
of Figure 4.1 (Carswell, 1992a). In point reading, the observer estimates the value of a single graph element.
For a local comparison the observer compares two values directly shown in the graph. For a global
comparison, the observer compares quantities that must be derived from other quantities shown in the graph.
Finally, for a synthesis judgment, the observer needs to consider all data points and make a general, integrative

judgment.
In Chapter 3 we introduced the notion of compatibility between the arrangement of multiple information
sources on a display, and the task requirements. We saw that this display-cognitive compatibility could be
defined in part by the proximity compatibility principle (PCP; Wickens & Carswell, 1995). Tasks requiring
integration of information are better served by more integral, objectlike displays. The PCP also applies to
graphs, as revealed by a meta-analysis conducted by Carswell (1992a). The meta-analysis integrated the
results of studies in which different graphic formats were compared. Integrated graph types (e.g., a line graph)
were compared with more separable formats (e.g., a bar graph or pie chart), as shown in Figure 4.1. Each
study was classified by its task demands into one of the four task categories described above, defining a
continuum of task proximity. The continuum thus represented the extent to which the integration of all
variables was necessary to carry out the task (see Chapter 3, Section 3.5, and Figure 3.9). Figure 4.2 shows the
proportion of studies in each category that showed better performance with the integrated graphs (relative to
separated formats), and those that showed the reverse effect. The Figure shows the increasing benefit of
integrated graphs as the task required more integration. The comparison of relative effectiveness of tables and
graphs (described above) can also be viewed in this manner—a table is highly effective for point reading
(focused attention), but less effective for integrative judgments, relative to graphs (Speier, 2006; Vessey,
1991).
As a specific example of the proximity compatibility principle, using the graphs in Figure 4.1, consider
this question: How is the rate of growth different between the two factories? Each object (line) of the line
graph offers an emergent feature—its slope—which can be directly perceived and directly maps to the task
(trend estimation). In contrast, the series of pie charts depicts the same data, but no single object represents the
rate of growth. The rate must be inferred by comparisons of individual slices over the several years. However,
judgments of specific proportion values can be made as well or better with the pie chart than with the line
graph.
The PCP also applies to the question of how to label data in a graph. Examine the line graph in Figure
4.1 and ask yourself whether Factory A’s production increased from 2010 to 2011. To perform this task, you
must first identify the line that represents Factory A. This is not too difficult because labels have been placed
in close proximity to the lines. In contrast, if you look at the bar graph or the pie charts, you need to look for a
legend, determine which shading level is assigned to which factory, and remember the coding when you
examine the graph again. Several additional mental operations are needed. Thus, a general recommendation is
that labels should be placed close to their referents (Gillan et al., 1998).

FIGURE 4.2 Proportion of studies showing an object-display advantage (solid line) or disadvantage (dashed line) as a function of task type
(focused, left; integrated, right). The figure illustrates the proximity compatibility principle. Source: History and applications of perceptual
integrality theory and the proximity compatibility hypothesis. University of Illinois Technical Report ARL88-2/AHEL-88-1 Technical
Memorandum 8-88.

When a graph shows many variables, direct labels are less feasible. In this case, it is helpful if the order
of variables in the legend (going from top to bottom) corresponds to the order in the graph: that is, that the

graph and legend are spatially compatible. Huestegge and Philipp (2011) have examined the effect of such
compatibility in an experiment in which the eye movements of their participants were measured. Participants
were shown a declarative statement (e.g., “In general, people spend more time in front of the computer than
the TV”) followed by the graph, and their task was to decide if the data shown in the graph were consistent
with the statement. They found that when the graph and legend were spatially compatible, less time was
required to make the decision.

1.3 Minimize the Number of Mental Operations: Search, Encode, and Compare
When a graph reader examines a graph to accomplish a task, a sequence of perceptual or cognitive operations
is performed. Various graphical perception models postulate a general process of search (drawing upon
attentional processes of visual search as described in Chapter 3), followed by the encoding of variables, and
ultimately comparison of perceived elements with values stored in working memory (e.g., Casner, 1991;
Gillan, 1995, 2009; Gillan & Lewis, 1994; Hollands & Spence, 1992, 1998, 2001; Lohse, 1993; Peebles &
Cheng, 2003; Pinker, 1990). Each operation is assumed to take time, and have some probability of error. More
operations will take more time and will increase the likelihood of error in graph interpretation.
Consider a simple example. Hollands and Spence (1998) found that increasing the number of slices
depicted within a pie chart had no effect on response times for judging proportion, whereas increasing the
number of bars shown in a bar graph did. The graph reader needs to estimate the whole with the bar graph
because no single object represents it. Determining this estimate requires mentally summing the bars: the
more bars, the more summation operations, the more time required to perform the task. (Error also increased
with more bars.) In contrast, with the pie chart, the entire pie represents the whole and so there is no need for
summation operations.
By conducting many studies of this type with particular task-graph combinations, researchers have
worked towards general graphical perception models. For example, Gillan (2009) has proposed particular sets
of arithmetic and perceptual operations (or mental operations) when tasks require simple comparisons, or
estimates of differences, sums, ratios, or means, using bar graphs, line graphs, pie charts, and star (object)
charts. Gillan summarizes the results of a large number of empirical tests of the model’s predictions. Once
validated, these general models can then be used to make specific predictions about the time required (or
likelihood of error) for a specific judgment.
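
As a hedged sketch of how such a model generates predictions, the Python fragment below estimates response time as the sum of the durations of the component operations. Both the per-operation times and the operation sequences are hypothetical placeholders, not the fitted values reported by Gillan (2009) or Hollands and Spence (1998); the point is only that operation counts translate directly into time and error predictions.

```python
# Hypothetical per-operation durations in seconds (real models fit these to data).
OP_TIME = {"search": 0.4, "encode": 0.3, "sum": 0.5, "compare": 0.6}

def predicted_rt(operations):
    """Predicted response time is the sum of the component operation times."""
    return sum(OP_TIME[op] for op in operations)

def proportion_ops(graph_type, n_elements):
    """Operations assumed for judging 'what proportion of the whole is element X?'."""
    ops = ["search", "encode"]                  # locate and encode the target element
    if graph_type == "bar":
        # No single object represents the whole, so every other bar must be
        # encoded and mentally added before the part-whole comparison.
        ops += ["encode", "sum"] * (n_elements - 1)
    ops.append("compare")                       # compare the part with the whole
    return ops

print(predicted_rt(proportion_ops("pie", 8)))   # constant: the whole pie represents the whole
print(predicted_rt(proportion_ops("bar", 8)))   # grows with the number of bars

```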
Visual scanning behavior provides a good measure of the sequences of mental operations. Computational
models of graph reading have been developed based on sequences of mental or visual scanning operations
(e.g., Chandrasekaran & Lele, 2010; Peebles & Cheng, 2003). The formal aspect of the models also helps in
comparing human performance to some optimal level. For example, Peebles and Cheng found that their
participants unnecessarily revisited certain graph locations as they executed the task, demonstrating non-
optimal scanning patterns. It would appear that as the users scanned the graph they forgot information
accessed from the graph earlier (a failure to adequately encode a value). A redesigned graph might avoid this
problem.
In many everyday situations, the graph reader’s task might simply be to ask, “What is this graph
saying?”; that is, to synthesize the graph’s message as a whole. Such integration tasks have been shown to be
carried out in steps (Carpenter & Shah, 1998; Ratwani, Trafton, & Boehm-Davis, 2008). In particular, eye
movement and verbal protocols indicate that people segregate the graph into chunks or visual clusters (e.g.,
light and dark bars in the bar graph in Figure 4.1). Eye movements are often focused on the boundaries
between the clusters, segregating the graph into different parts, which can then be compared. The result of the
comparison often leads to a cognitive integration of the graph’s message (e.g., Factory B’s production
advantage keeps getting bigger). Such processing can be aided by ensuring that visual clusters are easily
distinguishable (e.g., by using color coding or shading, discussed in Section 2), but the graph should not
encourage the formation of too many visual clusters by having too many uniquely coded variables (Ratwani et
al., 2008).
In summary, the graph designer should always strive to reduce the number of operations by first
choosing an appropriate graph type and then arranging information within the graph appropriately. In PCP
terms, reducing the number of operations reduces information access cost. Various models indicate how this
should be done.

1.4 Biases in Graph Reading


In particular situations, the judgments people make in extracting information from graphs are biased (Gillan et

al., 1998). That is, people systematically overestimate (or underestimate) quantities relative to their true
values. Some biases are related to optical illusions that distort our sense of perception. For example, when
viewing the Poggendorf illusion, shown in Figure 4.3a, people “flatten” the sloping lines horizontally. The
same illusion tends to flatten the slope of a line in a line graph, as indicated by the arrows in Figure 4.3b.
Thus, a point far from the axis (e.g., a point on the right side of the line shown in the figure) will tend to be
underestimated (Figure 4.3b; Poulton, 1985). Poulton found that the illusion is greatly reduced if a graduated
axis is provided on each side (Figure 4.3c). Gridlines placed within the graph are also helpful in reducing the
bias (Amer, 2005).

FIGURE 4.3 (a) The Poggendorf illusion: the two diagonal lines actually connect. (b) A line graph susceptible to “bending” from the
Poggendorf illusion. (c) Debiasing of the Poggendorf illusion by marked edges on both sides. Source: E. C. Poulton, “Geometric Illusions in
Reading Graphs,” Perception & Psychophysics, 37 (1985), 543. Reprinted with permission of Psychonomic Society, Inc.

A second example of bias occurs when comparing differences between two lines of different slope
(Cleveland & McGill, 1984). The vertical difference between the two curves in Figure 4.4 is actually smaller
on the left. Yet perceptually the difference appears smaller on the right because judgments of differences
along the y-axis are biased by the visual separation (or Euclidean distance) between the two curves rather than
the vertical separation. One solution is to plot the differences directly (Figure 4.4, bottom).
Other biases result from perceptual limitations in judging areas and volumes, which are commonly used
to represent quantity in graphs. Volume is becoming especially prevalent given the frequent use of 3D
graphical formats (Carswell, Frankenberger, & Bernhard, 1991; Siegrist, 1996; Spence, 2004). Based on a set
of experiments they conducted, Cleveland and McGill (1984, 1985, 1986) proposed that our ability to make
comparative judgments of two quantities in a graph progressively degrades in the order shown in Figure 4.5.
The best comparative judgments are made with the evaluation of two linear scales, aligned to the same
baseline. (We made a similar point in Chapter 3, when we considered how aligning bars to the same baseline
created the emergent feature of slope.) The poorest judgments occur when people compare two areas,
volumes, or color patches. The Cleveland and McGill ranking shown in Figure 4.5 provides a useful
framework for a graph designer and corresponds to the predictions of the PCP for focused tasks like local
comparison and point reading (Carswell, 1992b).

FIGURE 4.4 Biases in perceiving differences between pairs of lines f1(x) and f2(x) with changing slopes. The bottom curve plots the difference
f1(x) – f2(x), which is larger on the right than on the left.

FIGURE 4.5 Seven graphical methods for presenting quantities to be compared. The graphs are arrayed from most (top) to least effective
(bottom).

The ranking in Figure 4.5 is likely related to observed biases in judging perceptual continua (types of
stimuli). When people estimate magnitudes by assigning numbers to various sizes of objects (the magnitude
estimation procedure developed by Stevens, 1957), they show certain biases. Some continua, like area and
volume, produce response compression: each unit increase in physical magnitude causes less and less
increase in perceived magnitude. Other stimuli, such as color saturation, tend to show response expansion:
each increase in physical magnitude causes incrementally greater increases in perceived magnitude. Lengths
tend to be judged with little bias. Stevens (1957, 1975) found that the relation between physical and perceived
magnitude can be expressed by the power function called Stevens’ law, with the exponent representing the
amount of response compression or expansion. When the exponent is less than 1.0, response compression
occurs; when it is greater than 1.0, response expansion occurs; when it is equal to 1.0, no bias occurs.
Estimates of the areas and volumes shown in graphs are thus subject to response compression so that large
areas and particularly volumes will tend to be underestimated. In general, the use of areas, volumes, color
saturation, and other perceptual continua whose Stevens’ exponents differ from unity should be avoided in
graphs.
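
In symbols (the notation here is ours), Stevens’ law relates perceived magnitude ψ to physical magnitude φ through a power function with scaling constant k and a continuum-specific exponent n:

```latex
\psi = k\,\varphi^{\,n},
\qquad
\begin{cases}
n < 1: & \text{response compression (e.g., area, volume)}\\
n = 1: & \text{little or no bias (e.g., length)}\\
n > 1: & \text{response expansion (e.g., color saturation)}
\end{cases}
```
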
Moreover, the bias described by Stevens’ law affects more complex judgments where multiple quantities
are involved, such as judgments of proportion (e.g., what proportion is A of B?; Hollands & Dyre, 2000).
Suppose you were asked to divide a horizontal line into two parts corresponding to two slices of a pie, as
shown in Figure 4.6a. When judging graphs (e.g., pie charts, stacked bar graphs) depicting proportion, people
tend to show cyclical bias patterns (e.g., overestimation from 0–.25, underestimation from .25–.50,
overestimation from .50–.75, and underestimation from .75 to 1). The “amplitude” of the cyclical pattern is
determined by the Stevens’ exponent (for the pie charts shown in the figure, the estimated exponent was less
than 1.0), and the “frequency” of the bias pattern was determined by the number of available tick marks
(compare the upper panels of Figure 4.6). When tick marks are added to the graph, as shown in Figure 4.6b,
the bias frequency doubles, reducing error. Intermediate reference points (the tick marks) are used by
observers to subdivide the graph into components, which has the beneficial side effect that error is reduced,
even as the Stevens exponent stays constant.
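
A sketch of how this follows from Stevens’ law (the notation is ours; see Hollands & Dyre, 2000, for the full cyclical power model): for a part-whole judgment with no intermediate reference points, applying the power function to each component gives the judged proportion below. With an exponent below 1.0 this overestimates small proportions and underestimates large ones; adding reference points repeats the same pattern within each subdivision, producing more (but smaller) bias cycles.

```latex
\hat{p} \;=\; \frac{\varphi_A^{\,n}}{\varphi_A^{\,n} + \varphi_B^{\,n}}
```
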
In summary, bias in making relative judgments with graphs can be reduced by: 1) avoiding continua
whose Stevens exponents differ from 1.0; and/or 2) making reference points available (e.g., adding tick
marks). It is possible to make less effective perceptual continua (e.g., area) more effective by adding reference
points to the graph.

1.5 The Data-Ink Ratio

As we noted earlier, graph readers naturally scan or search through the available graphical elements. This is
especially true in situations where the reader is unfamiliar with the graph type or is otherwise inexperienced
(Peebles & Cheng, 2003). In Chapter 3 we learned that unnecessary visual elements (clutter) will slow visual
search. The greater the number of visual elements in the graph, the greater the number of scans required.
Graph designers should therefore strive to eliminate those extraneous elements of the graph that do not carry
information (Wang, 2011). In an influential book, Tufte (2001) distinguished between the ink in a graph used
to portray data and superfluous non-data ink. He argued for a data-ink ratio principle, which states that the
amount of ink that does not depict data points should be kept to a minimum (Tufte, 2001).
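
Tufte’s ratio can be written directly (this is the standard formulation; the wording is ours):

```latex
\text{data-ink ratio} \;=\; \frac{\text{ink used to portray data}}{\text{total ink used to print the graph}} \;\le\; 1
```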

FIGURE 4.6 Patterns of cyclical bias in judging graphs. (a) Bias as a function of true proportion for pie charts. (b) Bias as a function of true
proportion when tick marks are added. The bias pattern changes from two to four cycles and overall error is reduced. The curved functions show
the predictions of the cyclical power model (Hollands & Dyre, 2000), derived from Stevens’ law.

In line with the principle, techniques have been developed to modify graphical elements so that more
data can be portrayed in the same amount of space, without sacrificing judgment accuracy (Heer, Kong, &
Agrawala, 2009). The higher the data-ink ratio (i.e., more ink associated with data and less unnecessary ink),
the faster the time to make a variety of judgments, and the greater the accuracy (Gillan & Richman, 1994). In
addition, integration tasks (e.g., global comparison, synthesis judgments) appear to be more affected than
focused tasks by the data-ink ratio. Gillan and Richman’s results also suggest that the use of pictorial
backgrounds (e.g., the picture of a bank behind a bar graph depicting financial data, in the typical USA Today-
style graph) is particularly damaging, especially for more integrated judgments. Similarly, Renshaw, Finlay,
et al. (2004) compared a 2D line graph with a 3D ribbon graph (lines were represented as ribbons viewed from
an oblique angle), and found performance advantages for the 2D format, which had a much higher data-ink
ratio. Ratwani et al. (2008) found that task-irrelevant labels required extra fixations and increased
comprehension time; when the labels were removed the extra fixations and time penalty were eliminated.
Thus, there is good evidence to suggest that the use of high data-ink ratios will, by reducing distraction
(failure of focused attention), make a graph more effective, especially for integration tasks, and that non-data
ink should be eliminated from graphs. This is especially important to remember given that people appear to
prefer graphs with more non-data ink (Inbar, Tractinsky, & Meyer, 2007).
It is possible to carry the data-ink ratio principle too far, however (Carswell, 1992b; Wickens, Lee, Liu,
& Gordon-Becker, 2004). The lines connecting points within a line graph represent non-data ink (data are
fully represented by the points). But deletion of the lines is not always a good idea because, as we saw in
Chapter 3 and also in Figures 4.1 and 4.2, the line slope serves as an emergent feature. Limited use of non-
data ink can be useful in helping the user interpret graphical elements (Gillan & Sorensen, 2009). If the non-
data imagery is linked to the content of the graph, it can be effective in making the graph more distinctive, and
therefore more memorable (Bateman et al., 2010). In general, then, non-data ink should be avoided, but if
used judiciously some non-data ink may assist in graph comprehension.

1.6 Multiple Graphs


The previous discussion has focused on the ideal, compatible properties of single graphs. An equally
important issue lies in the presentation of linked or multiple graphs, which may show related sets of data (e.g.,
one graph shows the prevalence of several diseases for men, the other for women). This is analogous to the

interactive display or information visualization situation where the data are complex enough to require
viewing in multiple formats or windows to understand their interrelation (Chen et al., 2007). Here the graph
designer should consider the relationship between successively viewed graphic formats, in addition to the
optimization of each format by itself. Four specific concerns can be identified.
1. Coding Variables. Shah and Carpenter (1995) have shown that our mental representation of coded
variables (different lines) is qualitative or nominal, whereas our representation of variables placed
along the x-axis of the graphs is in quantitative metric terms. This has two implications for multiple
graph construction: 1) Build the graphs so that quantitative variables are placed on the x-axis; and 2) If
all variables are qualitative, build the graphs so that the most important differences are encoded as the
variables represented by the two (or more) points on each line along the x-axis, since we seem to be
most sensitive to these changes. In this way, the variable’s effect is directly represented by an
emergent feature—the slope—of the constructed graphs. The differences in slope (the angle between
two lines) serves as an emergent feature, as noted in Chapter 3.
2. Consistency. When the same data are plotted in different ways, it is important to maintain consistency
across graphs (Gillan et al., 1998). For example, the variable coded by line type (e.g., dashed versus
dot) in one graph, should, where possible, be coded by the same physical distinction in all graphs. If
such consistency is needlessly violated, the reader will need to exert greater cognitive effort (i.e., a
longer sequence of mental operations) to switch from one graph to the other. High consistency creates
good visual momentum as the eye moves from graph to graph (Woods, 1984), a concept considered
in the next chapter.
3. Highlighting differences. When related material is presented, it becomes critical to highlight the
changes from graph to graph, either prominently in the legend or in the symbols themselves. For
example, a series of graphs presenting different Y variables as a function of the same X variable
should highlight the Y label. This system allows the same cognitive set to be transferred from graph to
graph, while the single mental revision that is necessary is prominently displayed. The time- and
effort-consuming visual search necessary to locate the changed element is minimized (Gillan et al.,
1998), reducing information access cost.
4. Short distinct legends. Legends of similar graphs should highlight the distinct features, not bury them
as a single word that is nearly hidden in the middle or end of otherwise identical multiline legends.
Unfortunately, word processors make it all too easy to copy a long legend from one graph to another,
making it difficult to detect each graph’s unique features. The caption for each graph should be written
in short, efficient language that highlights the differences among graphs.
In conclusion, even though graphs are relatively simple, static displays, meant to be interpretable by the
layperson, there are a number of significant design issues to consider. We shall see many parallels when we
consider interactive information displays in the next section and next chapter, because the same digital or
analog representations are often used. The analog representations often take similar forms, with geometric and
spatial elements being used to represent the value of variables of interest in a similar manner. Thus,
overarching principles (such as the proximity compatibility principle and consistency) will reappear as we
consider such displays. However, with information displays the situation is dynamic, the data are real-time or
close to it, and the operator is often in the position of controlling some of the variables being portrayed in the
display (or overseeing automation that is controlling the variables). This was the supervisory control task
described in Chapter 3. The control of such variables often requires significant training or experience (e.g.,
controlling a nuclear power plant, flying an aircraft). In contrast, graphs are usually designed to be
interpretable by the layperson. Thus, in terms of our information processing model (Figure 1.1), the use of
feedback from the environment (after control actions) takes on an important role. Displays need to represent
the right variables in an intuitive manner to provide a useful guide for action (Bennett & Flach, 2011). We
consider these topics in the next section.

2. DIALS, METERS, AND INDICATORS: DISPLAY COMPATIBILITY
Many dynamic systems controlled by human operators present information in dynamic analog form, using
dials, meters, or other changing elements, to represent the momentary state of some part of the system. It is
important that dials and meters be compatible with the operator’s mental model of the system. The mental
model, a concept we will discuss further in Chapter 7, forms the basis for understanding the system,
predicting its future behavior, and controlling its actions (Gentner & Stevens, 1983; Moray, 1998; Park &

Gittelman, 1995; St-Cyr & Burns, 2001). As a consequence, there are three levels of representation that must
be considered in designing display interfaces, as shown in Figure 4.7: (1) the physical system itself; (2) the
user’s mental model; and (3) the interface between these two, the display surface on which changes in the
system are presented to the operator, and which help form the basis for control action and decision (Bennett &
Flach, 2011). It is important to maintain a high degree of compatibility among all three representations.
In achieving this compatibility, it is first important that the properties of the interface accurately reflect
the dynamics of the physical system, a correspondence referred to as ecological compatibility (Vicente, 1990,
1997). This will help the operator’s mental model to correspond better to the physical system dynamics (St-
Cyr & Burns, 2001; Vicente, 1997). Such correspondence will be aided by displays that show the key physical
parameters in effective and intuitive ways, as well by good operator training, discussed in Chapter 7. Second,
display compatibility is achieved by display representations whose structure and organization are compatible
with the user’s mental model.
Given the increase in the use of automation in complex systems (discussed in Chapter 12), the physical
representation includes not only the system performing the physical work, but also any automated system
controlling the process. Thus, for example, the physical system for an aircraft includes not only the rudder,
engines, elevators, and ailerons but also the automated systems used to control those aircraft components. It is
important for the mental model to reflect the automated systems correctly in order to maintain appropriate
awareness should the system fail. For example, Sarter (2008) noted that gaps and misconceptions in pilots’
mental models of flight deck automation in the Boeing B737 and Airbus A320 contributed to errors made by
those pilots. Recent aviation accidents like Colgan Air Flight 3407 near Buffalo, New York (Sorensen, 2011)
were at least partially attributable to the pilots’ lack of understanding of what the automation was doing when
control of the aircraft was lost.

FIGURE 4.7 Representations of a physical system. Two types of compatibility are portrayed: that between the physical system and a display
(ecological compatibility: EC) and that between the display and the user’s mental model (display compatibility: DC). The Figure also highlights
the importance of training to the influence of the physical representation upon the mental representation.

When considering display compatibility, it is important to distinguish between analog or continuous systems
and digital or discrete systems. In general, analog systems are those whose behavior is governed by
the laws of physics, and therefore change continuously over time (e.g., controlling an aircraft, a ground
vehicle, or an energy conversion process). The physics defines an ecology, and hence makes ecological
compatibility important. In considering analog systems, it is important to distinguish between static and
dynamic components of display compatibility. We now consider each in turn.

2.1 The Static Component: Pictorial Realism


The principle of pictorial realism (PPR; Roscoe, 1968) has two parts. The first part can be defined as
follows: if a variable’s physical representation is analog, then its display representation should also be analog
(Roscoe, 1968). The representation of aircraft altitude is a typical instance. Physically, altitude is an analog
quantity, with large changes in altitude more important than small changes. Conceptually, the pilot likely
represents altitude in analog form. Therefore, to achieve compatibility, a display of altitude (i.e., an altimeter)
should be in analog format (e.g., a needle position changing on the display to indicate a change in altitude)
rather than digital. The human transformation of symbolic digital information to analog conceptual
representation imposes an extra cognitive processing step, leading to longer visual fixations, increasing
processing time, and increasing the likelihood of error (Grether, 1949).
There are, of course, other factors that influence the choice of analog or digital representations of altitude

or of other continuously varying quantities. The nature of the user’s behavioral response—which is often
driven by task requirements—matters. Miller and Penningroth (1997) had participants read analog and digital
clocks and report the time in different ways. When they were asked to read the time as exact numbers (e.g.,
2:40→“two forty”) the digital format was found to be superior. On the other hand, the need to estimate at a
glance the distance of that variable from some limit by stating minutes before the hour (e.g., 2:40→twenty
minutes to three) favored the analog format. Similarly, perceiving the magnitude of a variable when it is
rapidly changing or determining rate-of-change or event onset information favors an analog representation
(Proctor & Van Zandt, 2008; Schwartz & Howell, 1985). Given the flexibility of electronic displays, it is
common to use both formats within a single display. This meets the needs of multiple tasks. For example, in
general an analog representation is effective for representing heading to a soldier using a head-mounted
display while wayfinding in an unfamiliar environment (Kumagai & Massel, 2005). This follows the principle
of pictorial realism. Nonetheless, it is useful for the display to show additionally the specific heading to a
waypoint digitally, to aid the soldier who is verbally communicating a heading to another soldier.
There are many variables whose internal representations are likely analog (e.g., temperature, pressure,
speed, power, or direction). In addition, some conceptual dimensions have the characteristic of an ordered
quantity with multiple levels (e.g., degree of danger or readiness status); these will also likely benefit from
analog representation.
The second part of the PPR is that the direction and shape of the display representation should be
compatible with the mental (and physical) representations. Consider a violation in direction: an altimeter that
places high altitudes low on the display, and vice versa. While this would still be an analog representation, our
mental model of altitude mimics the physical variable itself: high altitudes are up and low altitudes are down.
Therefore, the altimeter should present high altitudes at the top of the scale and low ones at the bottom.
Analogously, high temperatures should be placed higher, and low temperatures lower, on a display.
Display compatibility may be violated in terms of shape if a circular altimeter (pointer or dial) represents
the vertical and linear conception of altitude (Grether, 1949). The PPR is also violated by dissecting a single,
continuous variable into separate parts. Grether reported that operators had a more difficult time extracting
altitude information from three concentric pointers (indicating units of 100, 1,000, and 10,000 feet) than from
a single pointer. In sum, displayed quantities should correspond to the operator’s mental model of them,
which in turn reflects characteristics of the physical world. The concept of static compatibility may also be
applied to systems that are not inherently analog, but have some ordered spatial component, such as an expert
system’s decision logic, or a circuit diagram.
When we talk of pictorial realism in the PPR, it is important to understand that we are not arguing for
blind acceptance of realism in displays; that is, to assume that realism is always a good thing. Smallman and
St John (2005; Hegarty, Smallman, & Stull, 2012) labeled this misplaced faith in realistic information display
as naïve realism. Smallman and Cook (2010) showed users photorealistic three-dimensional terrain models,
as well as less realistic topographic maps of the same terrain. Their participants rated the models as more
realistic than the topographic maps, and also thought that they would perform better with the more realistic
displays. However, the participants actually performed worse with the more realistic terrain models because
the greater realism meant that extraneous data were shown along with task-relevant information. The user is
faced with the burden of additional cognitive effort to extract the task-relevant information from the
extraneous data (or alternatively, filter out the nonrelevant data). In contrast, here we have been arguing that
the display representation should be compatible with the user’s mental model as she performs a task. Any
particular task in an analog system will demand that certain parameters are attended to while others are not
relevant. The PPR argues for analog representation of these key parameters, whereas naïve realism would
argue that all domain parameters be explicitly represented, even those that are not relevant to the current task
activity.

2.2 Color Coding


Before turning to a discussion of dynamic aspects of display compatibility, it is important to consider another
static form of display compatibility: the role of color in display design. We discussed color coding in Chapter
2, in terms of absolute judgment, and in Chapter 3 in terms of its attentional impact in visual search and the
proximity compatibility principle, and we will reconsider color coding when we discuss its role in information
visualization (Chapter 5). We summarize here several characteristics of color that have practical implications
for display design.
• A unique color stands out from a monochrome background, and as we saw in visual search, also allows for rapid parallel search for a target (Christ, 1975).
• Color hue is useful for coding categorical or qualitative information (e.g., blue and red symbols on a
map to show friendly and hostile forces). However, like other sensory continua, color is subject to the
limits of absolute judgment (see Chapter 2). Thus, the system designer should probably use no more
than about seven hues in a display (Carter & Cahill, 1979; Flavell & Heath, 1992). In conditions where
ambient light varies (e.g., in a cockpit or hand-held display), absolute judgment performance will
likely be impaired (Stokes et al., 1990) and fewer than seven levels are strongly recommended.
• Color hue is effective for segregating categories of objects within a display (Yamani & McCarley,
2010), and for showing discrete state changes (Smith & Thomas, 1964; Van Laar & Deshe, 2007).
• Certain colors have well-established symbolic meaning within a population (e.g., red is often used to
indicate danger, or stop; green signals safety, or go). Because these sometimes vary across culture
(Courtney, 1986), such coding is often referred to as a population stereotype, discussed further in
Chapter 9. Coding levels should not conflict with population stereotypes (e.g., assigning red to “go” or
“safe”).
• Color hue does not generate a natural ordering (i.e., from “most” to “least”) in a way that lends itself to analog displays (Merwin, Vincow, & Wickens, 1994). Red is not perceived as “more” or “less” than green. Thus, color hue is not effective for relative judgment or comparison tasks in which users are comparing values along a continuous or ordinal scale, such as deciding which value is greater or less,
which is of course important for the representation of analog variables. Color saturation is more
effective for this purpose (Bertin, 1983; Kaufmann & Glavin, 1990). Ordered brightness scales have
also been shown to be more effective than scales based on hue variation (Breslow, Trafton, &
Ratwani, 2009; Spence & Efendov, 2001; Spence, Kutlesa, & Rose, 1999) for relative judgment tasks.
There is evidence to suggest that judicious combinations of hue and brightness can be effective for both
identification and comparison tasks. For example, Spence et al. (1999) showed that ordered color scales in
which brightness was covaried with hue produced more accurate comparison judgments than brightness
variation alone. An algorithm called Motley has been developed to produce color scales varying in both hue
and brightness (Breslow, Trafton, et al., 2010). These authors showed that both identification and relative
comparison tasks were well served by Motley’s ordering. Thus, by clever combination and selection of
display elements, it is possible to design a display that serves multiple purposes well. We shall return to this
hybrid display concept when we discuss display movement in the next section.
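
To make the idea of covarying hue and brightness concrete, here is a minimal sketch of an ordered color scale. It is not the Motley algorithm itself (whose details are given by Breslow, Trafton, et al., 2010); the function name, hue endpoints, and lightness range are illustrative assumptions only.

```python
# A minimal sketch (not the Motley algorithm): an ordered color scale in which
# hue and brightness covary, so that both identification (hue) and relative
# comparison (brightness) tasks are supported. Parameter values are illustrative.
import colorsys

def ordered_scale(n_levels=7, hue_start=0.66, hue_end=0.0,
                  light_start=0.25, light_end=0.85):
    """Return n_levels RGB triples (0-1), ordered from 'least' to 'most'.

    Hue sweeps from blue (0.66) toward red (0.0) while lightness increases,
    so larger values are both redder and brighter.
    """
    colors = []
    for i in range(n_levels):
        t = i / (n_levels - 1)                                # position on scale
        hue = hue_start + t * (hue_end - hue_start)           # categorical component
        light = light_start + t * (light_end - light_start)   # ordered component
        colors.append(colorsys.hls_to_rgb(hue, light, 1.0))
    return colors

# Example: print the seven scale colors as 8-bit RGB values.
for level, (r, g, b) in enumerate(ordered_scale()):
    print(level, tuple(round(c * 255) for c in (r, g, b)))
```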

2.3 Compatibility of Display Movement


If motion is occurring in the physical system itself, it can be useful to represent that motion by display motion
(rather than by using static displays) to produce an appropriate mental model of the situation (Park &
Gittelman, 1995). Beyond that, however, the compatibility of direction between the display and the mental
model is also important. Roscoe (1968) and Roscoe, Corl, and Jensen (1981) proposed the principle of the
moving part (PMP)—that the direction of movement of an indicator on a display is compatible with the
direction of movement of an operator’s mental model of the variable. In the case of the mercury thermometer,
this principle is typically adhered to because a rise in the height of the mercury column indicates a rise in
temperature. There are, however, circumstances in which the PMP and the PPR operate in opposition, and so
one or the other must be violated.
An example of this violation is shown in Figure 4.8, which could represent an altimeter. In the moving-
pointer display (Figure 4.8a) both principles—moving part and pictorial realism—are satisfied. High altitude
is at the top and an increase in altitude is indicated by an upward movement of the moving element on the
display. However, this simple arrangement can only show a small range of altitudes or requires an extremely
compressed scale where motion would be barely visible. One solution is to have a fixed pointer and move the
display scale when necessary to show only the relevant part (a moving-scale display; Figures 4.8b and c). If
the moving scale is designed to follow the PPR, high altitudes should be at the top of the display (Figure
4.8b). However, this means that the scale must move downward to indicate an increase in altitude—a
violation of the PMP. If the labeling is reversed to conform to the PMP (Figure 4.8c) this change will reverse
the orientation and display high altitude at the bottom, violating the PPR! A disadvantage for both moving-
scale displays is that scale values become difficult to read when the variable is changing rapidly since the
digits themselves are moving.

FIGURE 4.8 Display movement. (a) Moving-pointer altimeter; (b) and (c) are moving-scale or fixed-pointer altimeters. The dashed arrows
show the direction of display movement to indicate an increase in altitude.

A possible solution here is to employ a hybrid display. The pointer moves as in Figure 4.8a, but only a
restricted portion of the scale is exposed. When the pointer approaches the top or bottom of the window, the
scale shifts more slowly in the opposite direction to bring the pointer back toward the center of the window,
and expose the newer, more relevant region of the scale. Thus the pointer moves at higher frequencies in
response to the more salient aircraft motion and the scale shifts at lower frequencies as needed. This way both
principles—pictorial realism and moving part—are satisfied. Head-up displays (described in Chapter 3) often
use this approach to show altitude.
Or consider the traditional aircraft attitude indicator (or artificial horizon display), which shows the
aircraft’s orientation in space (an aircraft’s attitude includes roll, pitch, and yaw, but here we will concentrate
on roll, when the wings dip left or right). Here, a stable aircraft is positioned relative to a moving horizon (see
Figure 4.9a). This looks like what the pilot sees through the aircraft window (because of this, it is sometimes
referred to as an inside-out display), and therefore conforms to the PPR. But when the plane rotates (rolls or
banks) it is the horizon, not the aircraft, that moves. This violates the PMP because pilots perceive the world as stable and the aircraft moving through it (Johnson & Roscoe, 1972). Furthermore, the horizon will rotate in the direction opposite to the aircraft's roll, hence inviting confusion and an incompatible response (Roscoe, 2004). As
above, constructing the display so that the aircraft moves and the horizon is stationary (an outside-in display)
produces the opposite problem. It violates the PPR, since the static picture that is drawn (horizontal horizon,
tilted airplane) is incompatible with what the pilot perceives through the window (tilted horizon, horizontal
airplane).
A hybrid display called the frequency separated display (Figure 4.9(c); Lintern, Roscoe, & Sivier,
1990), like the hybrid altitude scale above, captures the best of both worlds, conforming in different ways to
both principles. Rapid movement of the aileron (controlling roll or bank) will cause the aircraft symbol to roll in the same direction as the control, conforming to the PMP. However, following a relatively sustained roll, or a slow roll back to level, the horizon slowly rotates to the new orientation while the plane symbol rotates back toward horizontal, hence restoring the correct “picture” of what the pilot sees when looking forward: conforming to
the PPR. Thus the rapid motion conforms to the PMP, while the slower “steady state” conforms to the PPR.
Evaluations with skilled pilots have shown the success of frequency separation over displays that follow a
single principle (Beringer, Williges, & Roscoe, 1975; Ince, Williges, & Roscoe, 1975; Roscoe & Williges,
1975). Thus, the frequency-separated display illustrates a more general principle: sometimes clever design can
produce a system that adheres to two apparently contradictory principles with effective results.
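
The logic of frequency separation can be sketched as a simple filtering operation: a low-pass filter assigns the slow, steady-state component of the bank angle to the horizon, and the fast residual drives the aircraft symbol. The sketch below is a first-order illustration under assumed time constants; it is not the implementation used in the evaluations cited above.

```python
# A minimal sketch of frequency separation: a first-order low-pass filter assigns
# the slow (steady-state) part of the bank angle to the horizon, and the fast
# residual to the aircraft symbol. The time constant and sample interval are
# illustrative values only.
def frequency_separated(bank_angles, dt=0.05, tau=3.0):
    """Split a sequence of bank angles (deg) into (horizon, symbol) rotations."""
    alpha = dt / (tau + dt)          # low-pass smoothing factor
    horizon = 0.0                    # slowly rotating horizon (low frequency)
    frames = []
    for bank in bank_angles:
        horizon += alpha * (bank - horizon)   # horizon drifts toward the true bank
        symbol = bank - horizon               # fast residual shown on the symbol
        frames.append((horizon, symbol))
    return frames

# A rapid roll to 30 degrees: at first the aircraft symbol carries most of the
# bank (principle of the moving part); over a sustained roll the horizon takes
# it over and the symbol returns toward level (pictorial realism).
print(frequency_separated([0, 30, 30, 30, 30, 30])[:3])
print(frequency_separated([30] * 200)[-1])
```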

FIGURE 4.9 Aircraft attitude display. (a) inside-out, (b) outside-in, and (c) frequency-separated display. All displays show an aircraft banking
left. Low-frequency return to steady state is indicated by arrows in (c).

Another type of frequency-separated display is called a tethered display (Wickens & Prevett, 1995).
Consider a gaming environment in which a user controls a virtual avatar in a three-dimensional world. It is
quite common in such environments to have the viewpoint placed behind and above the avatar, and connected
to it, so that when the avatar moves the viewpoint moves with it in “tethered” fashion. The use of similar
technologies is being explored for remote vehicle control (Hollands & Lamb, 2011; Wang & Milgram, 2009).
Wang and Milgram developed a virtual tether with dynamic properties so that there is gradual adjustment of
the camera’s viewing position after a movement by the avatar. Importantly, a dynamic tether can be
constructed so that, like the two hybrids described above, the tether acts first as an inside-out display, with the
control motion first affecting the avatar motion in the same direction, and then a compensatory motion of the
surrounding scene occurs (outside-in display). Dynamic tethers based on this frequency-separated principle
were shown by Wang and Milgram to be superior to rigid tethers (which would be likened to an inside-out
display) for controlling the motion of a virtual aircraft through a curved tunnel.

2.4 Display Integration and Ecological Interface Design


The PPR suggests that an array of displays should be spatially compatible or congruent with the array of
physical components that they represent, as illustrated in Figure 4.7. However, as discussed in Chapter 3,
there are other ways of integrating information on displays to be compatible with the operator’s need to
mentally integrate that information, such as the proximity compatibility principle (Wickens & Carswell,
1995). We also noted that many creative design solutions can configure display elements to produce emergent
features, when those elements change in certain critical ways that are relevant to the operator’s task. When
this configuration is done in a way to reflect the constraints of the natural physical system being represented,
the resulting displays are called ecological interfaces (Vicente & Rasmussen, 1992; Vicente, 2002),
conforming to ecological compatibility. In this section we will focus on such interfaces.
Interfaces based on the principles of ecological interface design have been developed and assessed in a
large variety of work domains. These include nuclear process control (Burns et al., 2008, Burns &
Hajdukiewicz, 2004), petrochemical systems (Jamieson, 2007), medical anesthesia (Jungk, Thull, Hoeft, &
Rau, 2001), semiconductor manufacturing (Upton & Doherty, 2007), military command and control (Bennett,
Posey, & Shattuck, 2008), and the separation of aircraft in free flight (Van Dam, Mulder, & van Paassen,
2008). One of the key features of ecological displays is that they are the result of a process in which the work
domain is analyzed not just in terms of its physical form (e.g., pipes and valves), but also in terms of function
(what is the purpose of the system), and at an abstract level (what is the physics of the system). Key variables
that the operator needs to consider become apparent to the human factors designer through this work domain
analysis (Burns & Hajdukiewicz, 2004; Vicente, 1999).
For example, in the context of nuclear process control, Burns et al. (2008) compared an ecological
display to a traditional display. The traditional display showed the equipment (turbines, valves, pipes) with
individual process values (pressure readings, valve positions) in numeric form. In contrast, the ecological
display mapped important conceptual variables like mass flow balance to emergent features of the display. For
example, as shown in Figure 4.10a, two bars were used to represent the masses of two fluids. A line was
drawn between the bars, with the center of the line indicated by a hatch mark; a bubble was placed on the line
that acted like the bubble on a carpenter’s level. If the two masses were equal then the bubble was found at the
hatch mark; if the mass on the left was less than the right, the bubble moved to the right away from the hatch
mark, and vice versa (Lau et al., 2008). Furthermore, the emergent feature of the line slope represented the
mass balance: when the mass output from one subsystem was equal to the total mass pumped, the line was level, but if the flow balance for a set of valves was greater or less than a critical value, the line sloped to the
left or right at an angle proportional to the disparity. Burns et al. showed that these ecological displays were
more effective than the traditional displays for detecting unexpected system failures.
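
The mapping from process variables to emergent features can be illustrated with a small sketch. The function below is a simplified illustration of the kind of mapping Lau et al. and Burns et al. describe, collapsing the two balances into a single pair of mass values; the function name and scaling constants are illustrative assumptions, not the actual display code.

```python
# A simplified illustration (not the Lau et al. implementation) of mapping two
# mass values onto emergent display features: the bubble offset on the connecting
# line and the slope of that line. Scaling constants are arbitrary.
def emergent_features(mass_left, mass_right, gain=0.5, max_slope_deg=30.0):
    """Return (bubble_offset, line_slope_deg) for a carpenter's-level display.

    bubble_offset: 0 when the masses are equal; positive values shift the bubble
        to the right (mass_left < mass_right), negative values to the left.
    line_slope_deg: 0 when the masses balance; otherwise the line tilts by an
        angle proportional to the imbalance, capped at max_slope_deg.
    """
    imbalance = mass_right - mass_left
    bubble_offset = gain * imbalance
    slope = max(-max_slope_deg, min(max_slope_deg, gain * imbalance))
    return bubble_offset, slope

print(emergent_features(100.0, 100.0))   # balanced: bubble centered, line level
print(emergent_features(90.0, 100.0))    # left mass low: bubble and slope deviate
```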
As another example, Seppelt and Lee (2007) developed an ecological interface for an adaptive cruise
control (ACC) system. These systems adjust the brake or throttle to maintain a constant distance from the
driver’s vehicle to a vehicle in front. ACC systems have braking and sensor limitations, which means that the
driver must intervene (i.e., hit the brakes) in some situations. The display developed by Seppelt and Lee
mapped the physical variables that the driver must monitor and control to certain characteristics of the display.
The physical variables included the difference between the velocities of the two vehicles, the distance between
the vehicles (scaled to the velocity of the driver’s vehicle), and the estimated time to collision (which we
discuss later in the chapter). The particular mapping they used meant that the shape of the display changed
depending on whether the situation was potentially hazardous or not. If the driver’s vehicle was approaching
the vehicle in front too quickly, a triangular shape (like a yield sign) was produced; if the vehicle in front was
traveling more quickly than the driver’s vehicle, the display looked like a trapezoid (empty road ahead)
instead, as shown in Figure 4.10b. Thus, the emergent feature of shape was directly mapped onto the driver’s
task of working with the automation to ensure an appropriate following distance. Seppelt and Lee showed that
having this ecological display helped drivers maintain the correct following distance (relative to without the
display) in situations with both rain and traffic.
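
The variables driving the Seppelt and Lee display can be computed directly from the vehicle states. The sketch below computes time headway (THW) and time to collision (TTC) as defined in the Figure 4.10 caption and uses TTC to choose between the two display shapes; the threshold value and the simple shape rule are illustrative assumptions, not the parameters used by Seppelt and Lee.

```python
# Computes time headway (THW = gap / own speed) and time to collision
# (TTC = gap / closing speed) and picks a display shape. The shape rule and the
# 5-second threshold are illustrative; they are not the Seppelt and Lee (2007) values.
def acc_display_state(gap_m, own_speed_mps, lead_speed_mps, ttc_threshold_s=5.0):
    thw = gap_m / own_speed_mps if own_speed_mps > 0 else float("inf")
    closing_speed = own_speed_mps - lead_speed_mps      # > 0 means we are gaining
    ttc = gap_m / closing_speed if closing_speed > 0 else float("inf")
    shape = "triangle (yield: brake)" if ttc < ttc_threshold_s else "trapezoid (safe)"
    return thw, ttc, shape

# Closing on a slower lead vehicle 30 m ahead at 5 m/s: TTC = 6 s, still "safe".
print(acc_display_state(gap_m=30.0, own_speed_mps=30.0, lead_speed_mps=25.0))
# Same gap but closing at 10 m/s: TTC = 3 s, so the display shows the yield triangle.
print(acc_display_state(gap_m=30.0, own_speed_mps=30.0, lead_speed_mps=20.0))
```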

FIGURE 4.10 Examples of ecological displays. (a) Carpenter’s level display (Lau et al., 2008). When the two bars are not equal (as they should
be), the bubble deviates by shifting away from the hatch mark. (b) Adaptive cruise control display (Seppelt & Lee, 2007). The triangular yield
shape on the left indicates that the driver should brake; the trapezoid on the right indicates a safe following distance. TTC = time to collision.
THW = time headway (distance from car in front divided by own car velocity).

There has been considerable effort put into how to generate the most effective displays based on the
principles of ecological compatibility. While ecological interface design (EID) provides general guidelines for
displays, there are often multiple display options that could meet the guidelines. Indeed, Vicente (2002) has
argued that the benefit of EID is attributable not only to the specifics of the functional form, but also to the fact that important functional information is made available to support the operator’s cognitive activities. Jessa and Burns
(2007) evaluated particular ecological display options for three different display-reading activities:
determining target levels, determining a change in direction, and interpreting proportions. They found that for
target value indication, a bull’s eye shape (an object display in which a solid circle was centered in a larger
empty circle) was most effective; for changing direction, a display that showed values on either side of a
vertical zero line was most effective, and for depicting the ratios between quantities a bar graph (in which
smaller values were shown in proportion to a set of larger values) was most effective.
Jessa and Burns showed that the effectiveness of their ecological displays was determined by the
judgment task being performed: integrated tasks (e.g., determining overall status or ratios among various
variables) were performed best by displays that integrated those values into a single object, and a focused task
(determining whether multiple individual variables were greater or less than zero) was best performed with a separated
format. These results are consistent with the proximity compatibility principle.
Given the importance of proximity compatibility and the form of the task representation (Zhang &
Norman, 1994) to display design, we have modified Figure 4.7 to incorporate proximity compatibility as well,
as shown in Figure 4.11. The set of compatibility principles shown in Figure 4.11—display, ecological, and
proximity compatibility—offer in combination a validated set of display guidelines, one of the most powerful
frameworks in the engineering psychology of display design. We will revisit compatibility in the context of
information visualization in Chapter 5, display modality in Chapters 6, 7, and 9, and motor responses in
Chapter 9. Displays that are compatible in these various respects are read more rapidly and accurately than
incompatible ones under normal conditions. More important, their advantages increase under conditions of
stress (see Chapter 11). The four representations in Figure 4.11 are tightly intertwined in a successful system;
this congruence is most likely to occur when the three types of display compatibility are met.

FIGURE 4.11 This Figure augments Figure 4.7 with a task representation. The proximity compatibility principle (PCP) states that the display
representation should be compatible with the task representation. The Figure also suggests that the physical system influences the task
representation, which influences the user’s mental model in turn.

3. THE THIRD DIMENSION: EGOMOTION, DEPTH, AND DISTANCE

3.1 Direct and Indirect Perception

Much of our previous discussion has focused on two-dimensional (2D) displays. However, there are situations
in which a third depth dimension is represented, such that objects in a three-dimensional (3D) scene are
represented at various distances from the observer along an axis perpendicular to the plane of the display.
These displays are intended to represent three dimensions of Euclidean space, and they will be the focus of the
current section. Such displays may be developed for one of two general purposes. First, the three displayed
dimensions can represent the three spatial dimensions of physical space, as when a display is constructed to
guide the pilot in a flight path, or to plan the trajectory of a robot arm for manipulating hazardous material.
Second, the display may use the third (depth) dimension to represent another (non-distance) quantity.
Examples of this usage are found in many 3D graphics packages, discussed earlier (see also Chapter 5).
Psychologists have reached broad consensus that there are two qualitatively different systems for
perceiving 3D space (DeLucia, 2008). As shown in Table 4.1, these systems have different names, functions,
and pathways in the brain (Goodale & Milner, 2005; Patterson, 2007). Importantly for engineering
psychologists, they also have different implications for design and multi-tasking (see also Chapter 10).
We describe first a system for direct perception, which functions somewhat automatically and is
designed for perceiving nearby objects and surfaces as we move through the 3D world, a process called
egomotion. It is sometimes said to characterize ambient vision (Leibowitz, 1988; Previc, 1998, 2002), and its
visual receptors are distributed more or less equally all across the visual field (and retina), both in the fovea
and periphery. It employs dorsal visual pathways leading to the cortex. Its operation in egomotion does not
depend heavily upon higher cognitive inference, and so its properties are well represented by the dynamic
geometry of the visual image. Because of this anchoring of direct perception in the environment, it is closely
associated with ecological psychology (Gibson, 1979; Warren, 2004).
In contrast, a system for indirect perception is much more dependent on inference and higher-level
cognition. This system is useful for more explicit, deliberate judgments of depth and distance of objects,
including those objects that are relatively far away from the observer. For instance, this system might be used
to judge which of two distant airplanes is closer to a ground observer, or the direction in which one of the planes is
pointing. It makes use of focal (usually foveal) vision, using ventral visual pathways, as opposed to ambient
(or peripheral) vision (Previc, 1998, 2000, 2004; Previc & Ercoline, 2007). Because of the use of higher-level
cognition, indirect 3D perception relies on top-down processing and expectancies to make depth and distance inferences, imposing an added processing burden. This stands in contrast to the relatively automatic processing used for
direct perception. Thus, indirect perception places greater demand on attentional resources (see Chapter 10)
than direct perception.

TABLE 4.1 Two perceptual systems


Direct Perception              | Indirect Perception
Relatively automatic           | Cognitive inference
Egomotion (close to observer)  | Object perception (all distances)
Ambient (peripheral) vision    | Focal (foveal) vision
Dorsal pathways                | Ventral pathways
Ecological                     | Information processing

When we consider our perception of a 3D environment, both types of perception—direct and indirect—
are important. To structure the remainder of this chapter, we will focus first on direct perception and
egomotion and its importance for vehicular control. Then we consider the importance of indirect perception
and deliberate perceptual judgment for the design of spatial displays.

3.2 Perception of Egomotion: Ambient 3D


As we move through an environment, whether in a plane, an automobile, or on foot, our judgments of the
direction and speed with which we are moving depend on information distributed across the visual field, not
just in the area of foveal vision (Geisler, 2007; Schaudt, Caufield, & Dyre, 2002). Thus, good drivers who
primarily fixate far down the center of the highway are still making effective use of the flow of texture beside
the highway as viewed in peripheral vision. As a consequence, engineering psychologists have argued that
conventional aircraft navigation instruments (like the attitude display indicator shown in Figure 4.9) are not
fully effective for controlling egomotion because they are restricted to foveal vision. Indeed, it has been
shown that the pilot’s perception of flight information can be augmented by peripheral displays. One example
is the Malcolm horizon display, which extends a visible horizon all the way across the pilot’s field of view
using laser projection (Comstock et al., 2003; Malcolm, 1984). Comstock et al. showed that attitude control
was much more accurate with the Malcolm display than without.
A second problem with the conventional aircraft instrument panel is that the information necessary for
the pilot to obtain a good sense of location and motion is contained in several separate instruments (Figure
4.12), which must then be mentally integrated. One solution to this integration problem is achieved through
the development of integrated 3D displays as described briefly in the last chapter. Another solution lies in the
design of ecological displays, which capitalize on the visual cues humans naturally use to perceive their
motion through the environment—the cues of direct perception that will support egomotion (Bulkley et al.,
2009; Gibson, 1979; Larish & Flach, 1990; Warren et al., 2001). Augmented reality displays (see Chapter 5)
can provide optical texture to the peripheral scene (Schaudt et al., 2002). In fact, the cockpits of fifth-
generation fighter aircraft (such as the F-35 joint strike fighter) make use of such cues and allow the pilot to
see sensor imagery “through the floor” using a head-mounted display
(http://en.wikipedia.org/wiki/Lockheed_Martin_F-35_Lightning_II, 2011).
What information is provided by the external environment as we move through it? Gibson (1979)
identified a set of environmental properties that the visual system can detect to assist in control of egomotion.
These properties have sometimes been referred to as optical invariants because they represent properties of
the light rays that reach the eye (or any surface) and have an invariant or unchanging relationship to the
location and heading of the observer, whether walking, driving, or flying. It is perhaps useful to think of each
invariant as a mathematical function that holds true across various visual environments. Gibson (1979)
identified a number of such invariants, and six are described below.
1. Texture gradient (compression). The compression of a textured surface indicates the relative distances
of different parts of the scene from the observer. The change in the compression signals a change in
altitude or the angle of slant with which the observer is viewing the surface, as is evident when you
compare the left and right panels of Figure 4.13.

FIGURE 4.12 A traditional flight instrument panel.

FIGURE 4.13 Splay and compression. Splay is defined by the angle of the two receding lines. Compression is defined by the gradient of
separation between the horizontal lines from the front (bottom) to the back (top). On the left, the perception is of being high above the field
looking down. On the right, the observer is at low altitude, looking forward. Note how both splay and compression change with altitude.

2. Splay. Parallel receding lines signal a change in altitude as given by the angle between the lines—the
splay. This can again be seen by contrasting the two panels of Figure 4.13.
  Experimental evidence has established the value of both splay and compression in helping the
pilot to control altitude (Flach et al., 1992; Flach et al., 1997; Gray et al., 2008). These cues present
altitude in a natural, “ecological” fashion, and there is evidence that they are processed automatically
by direct perception, leaving attentional resources available for other tasks (Weinstein & Wickens,
1992). Perception of altitude change is particularly important for the airplane pilot to initiate the final
stages of landing; pilots make use of the splay of the runway to help determine the altitude (Palmisano
et al., 2008).
3. Optical flow. Optical flow refers to the relative velocity of points across the visual scene (and
therefore across the retina) as we move through the world. This velocity is indicated by the arrows in
Figure 4.14. The expansion point is that place where there is no flow but from which all flow radiates,
and it indicates the direction of momentary heading (Warren, 2004).
  Optical flow is an important cue for the perception of heading (Dyre & Anderson, 1997).
Observers can accurately determine heading even if optical flow is the only available cue in the scene
(Warren & Hannon, 1990). For the pilot, the expansion point is critical because if it is below the
horizon, its position forecasts an impact with the ground unless corrections are made. Furthermore, the
relative rate of flow away from the expansion point, above, below, left, or right, gives a good cue
regarding the slant of a surface relative to the path of motion. A flow that is of uniform rate on all
sides indicates a heading straight into the surface, such as a parachutist would see when descending
straight down to the earth. In Figure 4.14, we see that the aircraft is angling into the surface because
the optical flow is greater below than above the expansion point. Finally, the rate of expansion signals
the distance to a surface.
  Greater optical texture density (i.e., more moving points in the scene, more visual detail) generally leads to better control of heading (e.g., Li & Chen, 2010; Warren et al., 2001). Thus, if the visual environment is impoverished in terms of optical flow, heading perception will suffer. Kim et al. (2010; see also Palmisano et al., 2008) showed that pilots in a simulator made larger glideslope control errors when landing at night than in daylight, when terrain texture provides good optical flow.

FIGURE 4.14 Optical flow. The arrows indicate the momentary velocity of texture across the visual field that the pilot would perceive on
approach to landing.

  When landing an aircraft at night over featureless terrain (e.g., when landing over water, darkened
areas, or snow), a situation called the black hole illusion (Gibb, 2007; Kraft, 1978) can arise in which
the pilot thinks he is flying higher than he actually is and descends too quickly, producing a crash or a landing short of the runway. Through simulation work, Kraft found that, in the absence of the
normal textural gradient of the approach terrain (visible on a lighted surface or in daylight, and
providing global optic flow), pilots would inappropriately reduce altitude, flying on a dangerously low
trajectory that invited ground collision. Several aviation accidents during landing have been directly or
indirectly caused by this illusion (Gibb, 2007). One solution lies in the use of virtual imagery on a
head-up display (or HUD, described more extensively in Chapter 3) to provide the texture: a
peripherally located virtual speed indicator using optical flow on the HUD has been shown to be more
effective for controlling speed or altitude than conventional cockpit displays (Bulkley et al., 2009;
Schaudt et al., 2002).
  Consider what happens when we drive in snow (or hail or heavy rain). We have two patterns of
optical flow in the environment: one created by our vehicle’s motion along the road, and one created
by the snow (both by wind and by gravity). The driver’s task is to attend to the first optical flow field
and ignore the second. However, this is more challenging than it might appear, especially in heavy
snow conditions with limited visibility of the roadside. Studies in simulators have shown that drivers
tend to drift toward the point of expansion of the snow, rather than that defined by the road and
surrounding ground texture. Improved visibility of a simulated roadway has been shown to help
drivers maintain course (Dyre & Lew, 2005; Lew et al., 2006). Increased illumination, paint, or
signage could be used to produce the same effect on roads subject to heavy snow conditions.
4. Time-to-contact (tau). Tau specifies the time remaining until an observer makes contact with an object,
assuming that the speed of the observer or the object is constant (DeLucia, 2007; Grosz, Rysdyk, et al.,
1995; Lee, 1976). It is given optically by the ratio of an object’s visual angle to the rate at which that angle is expanding. Whereas object size and distance are ambiguous (we might be viewing a large object far away or a small object relatively close), the time remaining until contact is unambiguously specified by dynamic information in the visual scene (a numerical sketch of tau appears just before Table 4.2).
  It is clear that observers are sensitive to tau and can make use of it to stop, catch a ball, or take
evasive action (Schiff & Oldak, 1990). However, judgments based on tau are affected by other factors, such as whether the
objects are of a familiar size, whether they are partially occluded, or how high the objects are in the
visual field (DeLucia, 2004, 2005; DeLucia et al., 2003). These studies suggest that indirect perception
can moderate the effects of a directly perceived invariant. We shall return to these ideas when we
consider the influence of higher-order cognitive processes on rear-end collisions in the next section.
5. Global optical flow. The total rate of flow of optical texture past the observer (Larish & Flach, 1990)
is determined both by the observer’s velocity over the ground and height above the ground. Thus,
global optical flow will increase as we travel faster and also as we travel closer to the ground.
  Our subjective perception of speed is heavily determined by global optical flow (Dyre, 1997). A
potential bias in human perception occurs because perceived speed can appear to increase as height or
altitude decreases, even though the actual speed is the same. For example, we feel as if we are
traveling faster in a sports car than in a large sedan or bus, in part because the sports car is closer to the
ground. When the Boeing 747 was first introduced, pilots often taxied the aircraft too fast and
occasionally damaged the landing gear while turning on or off the runway. The reason for this error, in
terms of global optical flow, was simple. The 747 cockpit was about twice as far above the runway as
the cockpits in other jets. For the same taxiing speed, the global optical flow was half as fast. Pilots
accelerated to obtain a global optical flow that matched their perception of the appropriate taxiing
speed established through prior experience. As a result they achieved a true velocity that was unsafe
(Owen & Warren, 1987); a numerical sketch of this relationship appears just before Table 4.2. Similar effects have been found using simulations: observers respond to
altitude changes as if they are changes in speed (Wotring et al., 2008). Observers tend to be more
sensitive to the global optical flow of the ground when controlling speed, even when they are required
to direct their attention elsewhere (e.g., to scan for aircraft above the horizon, Adamic et al., 2010).
6. Edge rate. Edge rate can be defined as the number of edges or discontinuities that pass across the
observer’s visual field per unit time. As edge rate increases (texture is finer), the traveler perceives a
faster velocity. Global optical flow and edge rate are typically correlated, but edge rate is affected by systematic changes in texture density (e.g., when flying over sparse trees that give way to dense forest),
whereas global optical flow is not. Global optical flow and edge rate contribute additively to
perception of self-motion (Bennett et al., 2006; Dyre, 1997).
The edge rate cue was exploited by Denton (1980), who was concerned with automobile drivers in Great
Britain who approached traffic circles (roundabouts) at an excessive rate of speed. His solution was to
decrease the spacing between road markers gradually and continuously as the distance to the roundabout
decreased. A driver not slowing down appropriately would see the edge rate as increasing. Believing the
vehicle to be accelerating, the driver would compensate by imposing a more appropriate degree of braking or
slowing. Denton’s solution was imposed on the approach to a particularly dangerous roundabout in Scotland.
Not only was the average approach speed slower following introduction of the markers, but the rate of fatal
accidents was also reduced.
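
Two of the invariants just described, tau and global optical flow, lend themselves to simple arithmetic. The sketch below computes tau as the ratio of an object’s visual angle to its rate of expansion (Lee, 1976) and global optical flow as speed scaled in eyeheights (speed divided by eye height), working through a version of the 747 taxiing example. The specific object sizes, speeds, and cockpit heights are illustrative values, not data from the studies cited.

```python
# A numerical sketch of two optical invariants (values are illustrative).
def tau(visual_angle_rad, angle_rate_rad_per_s):
    """Time to contact: optical angle divided by its rate of expansion (Lee, 1976)."""
    return visual_angle_rad / angle_rate_rad_per_s

def global_optical_flow(ground_speed_mps, eye_height_m):
    """Flow rate scaled in eyeheights per second: speed divided by eye height."""
    return ground_speed_mps / eye_height_m

# Tau: an object 2 m wide at 50 m, approached at 10 m/s, subtends 0.04 rad and
# expands at 0.008 rad/s, so tau = 5 s -- the true time to contact (50 m / 10 m/s).
print(tau(2 / 50, (2 * 10) / 50**2))

# Global optical flow and the 747 example: at the same 10 m/s taxi speed, a
# cockpit 8 m above the runway yields half the flow of one 4 m above it, so the
# pilot of the higher cockpit perceives a slower taxi speed than is actually flown.
print(global_optical_flow(10, 4), global_optical_flow(10, 8))
```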
Table 4.2 summarizes our list of optical invariants. As noted earlier, there is increasing evidence that
such invariants are most important at smaller distances (less than about 30 m; DeLucia, 2008). If you examine
Figure 4.14 carefully, it is evident that points closer to the observer move a greater distance across the retina
than far points. At shorter distances, depth information guides action and how we interact with the environment. As already discussed, this matters for vehicular control; it also matters for the design of virtual environments (as discussed with regard to the black hole illusion). An important
implication is that there needs to be sufficient optical texture in the scene to allow detection of the invariants.
At longer distances, indirect perception becomes more important for interpreting depth. This is the topic of the
next section.

TABLE 4.2 List of optical invariants and what each indicates about egomotion
Invariant: tells you about
Texture: distance, altitude
Splay: altitude
Optical Flow: heading (slant)
Global Optical Flow: velocity (rate)
Edge Rate: velocity (rate)
Tau: contact

3.3 Judging and Interpreting Depth and Three-Dimensional Structure: Focal 3D


To understand the three-dimensional (3D) structure of space, it is important that we can judge the relative
depths or distances accurately. The accurate perception of depth and distance is accomplished through the
operation of various 3D perceptual depth cues. We will describe each of these cues briefly. Readers wishing
more detail about the cues should refer to an introductory perception text such as Goldstein (2010). Some cues
are characteristics of the object or world we perceive, and others are properties of our own visual system. We
refer to these as object-centered and observer-centered cues respectively.

3.3.1 OBJECT-CENTERED CUES Object-centered cues are sometimes called pictorial cues because they are the
kinds of cues that an artist could use in a picture to convey a sense of depth. Figure 4.15 shows a 3D scene
that incorporates eight of the following cues:
1. Linear perspective. When we see two converging lines we assume that they are two parallel lines
receding in depth (the road). This cue is analogous to splay.
2. Occlusion. When the contours of one object occlude (block) the contours of another, we assume that
the occluded object is more distant (on the right, the front building occludes part of the rear building).
3. Height in the plane (relative height). We normally view objects from above; when this is the case
objects higher in the visual field are farther away (compare the two trucks).
4. Light and shadow. When objects are lighted from one direction, they normally have shadows that
offer some clues about their orientation, 3D shape, and distance (the buildings and trucks). Although
not shown in the figure, lighted surfaces can produce reflectances that indicate the depth of the
reflecting object.

FIGURE 4.15 Contains object-centered cues for depth, as described in the text.
5. Relative (familiar) size. If two objects are known to be the same true size, the one subtending a
smaller visual angle (smaller area of the retina) is assumed to be farther away (compare the two
trucks).
6. Textural gradients. As noted when we discussed invariants, the grain on a textured surface grows
finer as distance increases (the field on the left and the center line of the road).
7. Proximity-luminance covariance. Objects and lines are typically brighter as they are closer to us.
The reductions in illumination and intensity with distance therefore signal receding distance (the road
lines).
8. Aerial perspective. More distant objects often tend to be “hazier” and less clearly defined (the corn
field).
9. Motion parallax. We use motion information to judge the distances of different objects in the scene.
For instance, when we look out a window on a moving train, objects that are closer to us show greater
relative motion than those that are more distant. Hence, our perceptual system assumes that distance
from us is inversely related to the degree of motion.
10. Structure through motion. Motion can be used as a cue to the three-dimensional shape of objects.
For example, the cloud of points in Figure 4.16 does not appear to be three-dimensional. Yet if these
were points of light on a rotating cylinder, they would show a pattern of motion—slow near the edges,
fastest at the center—that leads to an unambiguous interpretation of a rotating three-dimensional
cylinder (Braunstein, 1990).

3.3.2 OBSERVER-CENTERED CUES Three sources of information about depth are functions of characteristics of the
human visual system.
1. Binocular disparity (stereopsis). The images received by the two eyes, located at slightly different
points in space, are disparate. Objects at different distances stimulate disparate pairs of points on the
retina. The degree of disparity, inversely correlated with object distance, provides a basis for the
judgment of distance. Three-dimensional movies and televisions (stereoscopic displays, discussed in detail in Section 3.6) use various artificial methods to present different information to each eye based on this principle (the geometry relating disparity to distance is sketched after this list).
2. Convergence. The “cross-eyed” pattern of the eyes, required to focus on objects as they are brought
close to the observer, brings the image onto the detail-sensitive fovea of both eyes. Proprioceptive
messages from the eye muscles to the brain indicate the degree of convergence, and therefore the
object’s distance.

FIGURE 4.16 Potential stimulus for recovery of structure through motion. If the horizontal motion of the dots were proportional to the velocity
vectors at the top of the figure, the flat surface would be perceived as a three-dimensional rotating cylinder.
3. Accommodation. Like convergence, accommodation is a cue provided to the brain by the eye
muscles. The muscles adjust the shape of the lens to bring the image into focus on the retina. The
amount of adjustment indicates the approximate distance of the object from the eye.
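
The geometry behind the first of these cues, binocular disparity, can be made explicit with a small calculation. Under the small-angle approximation, the disparity between two objects is the difference in the vergence angles each requires, which falls off roughly with the square of viewing distance. The interocular separation and object distances below are illustrative values only.

```python
# A minimal sketch of the geometry relating binocular disparity to distance
# (small-angle approximation). The 0.065 m interocular separation is a typical
# illustrative value.
import math

def disparity_rad(dist_near_m, dist_far_m, interocular_m=0.065):
    """Angular disparity between two objects at different distances.

    The vergence angle to an object at distance d is about I/d radians (small
    angles); the disparity is the difference of those angles, so it falls off
    roughly with the square of distance.
    """
    return interocular_m / dist_near_m - interocular_m / dist_far_m

# A 1 m depth difference produces a large disparity at 2 m but a tiny one at 30 m,
# which is why stereopsis loses effectiveness beyond personal and action space.
print(math.degrees(disparity_rad(2.0, 3.0)) * 3600)    # arc seconds at about 2 m
print(math.degrees(disparity_rad(30.0, 31.0)) * 3600)  # arc seconds at about 30 m
```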

3.3.3 EFFECT OF DISTANCE ON CUE EFFECTIVENESS The various cues are not all equally effective, and their
effectiveness depends on the viewing distance, as shown in Figure 4.17 (Cutting and Vishton, 1995). The
figure separates the continuum of depth into three regions: personal, action, and vista space. Some cues are
effective regardless of distance: for example, occlusion and relative size. Other cues tend to be more effective
in the different spaces. For example, accommodation and convergence operate only within personal space;
within both personal and action space (< 30 m) motion parallax and binocular disparity are important cues for
depth. However, as distance increases, the effectiveness of these cues decreases, and pictorial cues, such as relative size and aerial perspective, become more important, as illustrated in Figure 4.17.
The range depicted in the figure is based on natural viewing situations. With artificial displays, it is
possible to make cues more or less effective at different distances. For example, stereoscopic displays can artificially represent differences in the distances of objects that are miles away (Allison, Gilliam, & Vecellio,
2009). Furthermore, there are interactions among the cues: while a cue like stereopsis might not play a
primary role at large distances, its presence improves visual performance and it appears to validate available
monocular cues at large distances (Allison et al., 2009).

FIGURE 4.17 Effectiveness of various depth cues as a function of distance from the observer.

3.4 Illusions in 3D Viewing


In different ways, Figures 4.15 and 4.17 portray the multiple depth cues that people can use to judge depth
and distances in a natural viewing environment. Normally, multiple, redundant cues are available to provide a
compelling sense of three dimensionality. In general, the more cues available, the more compelling the sense
of depth along the viewing axis (Domini et al., 2011; Wickens, Todd, & Seidler, 1989); however, illusions of
depth and distance exist. To understand when depth judgments succeed and fail, it is important to consider
how the cues are integrated in the brain, an integration that is well explained by the weighted linear cue
model (WLCM; Bruno & Cutting, 1988; Ichikawa & Saida, 1996; Knill, 2007; Young, Landy, & Maloney,
1993). The model essentially describes the cues as varying in the reliability and precision with which they
convey depth information, and through experience with the 3D environment (both short- and long-term; Westheimer, 2011), humans learn to give more weight to the more reliable, and hence more dominant, cues. In this regard, research on depth perception has indicated that three cues in particular tend to be dominant and powerful: relative motion, stereopsis, and occlusion (Wickens et al., 1989); these have high weightings in the WLCM.
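
A minimal sketch of the weighted linear combination idea follows. The cue names, depth estimates, and weight values are illustrative assumptions chosen to show how a dominant cue pulls the combined percept; they are not parameter values from the studies cited above.

```python
# A minimal sketch of a weighted linear cue model: perceived depth is a weighted
# average of the depth signaled by each available cue, with weights reflecting
# cue reliability. The cue names, estimates, and weights are illustrative only.
def combined_depth(cue_estimates_m, cue_weights):
    """Weighted average of per-cue depth estimates over the cues present."""
    total_weight = sum(cue_weights[cue] for cue in cue_estimates_m)
    return sum(cue_weights[cue] * depth
               for cue, depth in cue_estimates_m.items()) / total_weight

weights = {"stereopsis": 0.8, "relative_size": 0.3, "height_in_plane": 0.3}

# With only weak pictorial cues available, an erroneous relative-size estimate
# pulls the percept strongly; adding a dominant cue (stereopsis) corrects it.
print(combined_depth({"relative_size": 12.0, "height_in_plane": 9.0}, weights))
print(combined_depth({"relative_size": 12.0, "height_in_plane": 9.0,
                      "stereopsis": 6.0}, weights))
```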
To illustrate the effects of weighting and cue dominance, consider the two objects A and B in Figure
4.18 (top left). Assume that A and B are the same true size. Only a single cue is present, relative size, which
suggests that B is farther away (but there is little indication of how much farther it is). In Figure 4.18 bottom
left, the cue of height in the plane is added, and the sense of depth/distance is more compelling. Now look at
Figure 4.18 (bottom right). The identical positions and sizes are used as on the left, but now the near contours of B
occlude those of A, presenting a clear indication that B is closer. The high dominance of occlusion is
demonstrated here (occlusion beats height in plane and relative size).
The importance of the cues summarized in Figure 4.17 is revealed in natural-world situations where safety is compromised. This can occur when cues are insufficient or misleading. We will discuss each of these situations in turn.
When depth cues are missing, there is insufficient perceptual information to provide a compelling sense
of depth (we say that the depth scene is impoverished). In such cases, like the top panel of Figure 4.18, the brain can
impose hypotheses on what the depth differences should be, based on past experience and expectancies (Enns
& Lleras, 2008; Gregory, 1997; Palmer, 1999). For example, in Figure 4.15 we hypothesize or “assume” that
the two trucks in the visual field are the same true size, and therefore the one with the smaller-sized retinal
image is farther away. These hypotheses and assumptions are relatively automatic and unconscious. Another
example is the black hole illusion, which we described earlier in the context of optical flow (Gibb, 2007;
Gillingham & Previc, 1993). When the pilot is flying over dark featureless terrain, there are few cues to the
distance of the runway from the cockpit, and the pilot hypothesizes that the aircraft is too high, leading to an
aggressive descent.

FIGURE 4.18 Illustrating the weighted linear cue model (WLCM). The Figure illustrates the added sense of depth by added cues, and the role
of cue dominance by occlusion.

Even when depth cues are available, they can often be misleading. The hypotheses based upon such cues
will end up being just plain wrong. An example is provided by Eberts and MacMillan’s (1985) assessment of
why small cars tended to get rear-ended more often on the highway than their larger counterparts. The authors
hypothesized, and confirmed with a simulation experiment, the following. The driver behind judges separation based in part on the relative size of the vehicle in front, compared to the expected size of a typical vehicle, in order to maintain a safe headway. A smaller car will thus be perceived to be farther away than the expected norm; the following driver will then inappropriately correct by pulling too close, cutting the
headway to an unsafe margin … too close to avoid collision if the small car should suddenly brake. A similar
explanation can be offered for why pilots landing at a smaller than expected runway (often a landing strip)
will land fast and hard, sometimes overshooting the runway’s end (Gillingham, 1993; O’Hare & Roscoe,
1983).

3.5 3D Displays
Understanding 3D perception, and how depth cues combine to provide a compelling sense of depth, is
important for the design of 3D displays, especially for those displays that use any and all of the 3D cues in
Figure 4.15 to represent depth and distance of real space. The choice of such displays is of course influenced
by the principle of pictorial realism (Roscoe, 1968), discussed above. As a result, 3D displays can be very
effective formats for representing real space, and we will discuss those success stories first. As discussed
earlier, however, the PPR is not the same as naïve realism, which is the commonly held belief that because a
3D display of 3D space is more “realistic,” it will always be more effective for spatial tasks (Smallman &
Cook, 2010; Smallman & St. John, 2005). People like and want “3D” even when it does not support the most
effective task performance. Thus, we will also consider the shortcomings of 3D displays in this section.

3.5.1 3D DISPLAYS OF REAL SPACE One example of such a 3D display is the so-called 3D
highway in the sky (HITS) display that shows a pilot’s commanded route through and actual position within
the sky (Figure 4.19; Haskell & Wickens, 1993; Jensen, 1978; Prinzel & Wickens, 2009). The role of relative
size and linear perspective in signaling the depth component of the command path is clearly evident in Figure
4.19a, in a way that is missing in the “tri-planar” presentation of the same information in 4.19b. Figure 4.19c
presents an example of such a display to be found in emerging versions of corporate aircraft. Several
evaluations of this concept have proven it to be more effective than separated tri-planar displays (Prinzel &
Wickens, 2008). Within the context of the proximity compatibility principle, the advantage can be seen
because flying an aircraft clearly requires integration of motion across all three axes. Hence such an
integration task is best supported by the integrated display (Haskell & Wickens, 1993).
In non-aviation domains, 3D displays have also proven superior for tasks in which integration across all
three axes of space is required, such as the appreciation of 3D shape, position and trajectory. This would
include robotics, industrial or architectural design (Liu, Zhang, & Chaffin, 1997), medical imaging (Hu &
Multhaner, 2007), and terrain layout (Hollands, Pavlovic, et al., 2008; St. John, Cowen, et al., 2001; Wickens,
Thomas, & Young, 2002). For example, Hu and Multhaner found that resident physicians were better able to
determine whether or not to remove a lung tumor using 3D displays of thoracic cavities than they were
reading 2D CT images. Tasks requiring shape understanding, such as judging the layout of terrain, or the
general shape of 3D objects, are best performed with realistic 3D perspective displays. In Figure 4.20, if you were asked whether you could see point A from point B, you could generally answer more accurately with the realistically shaded 3D perspective view display (right) than with the plan-view topographic map (left) (Hollands et al.,
2008; St John et al., 2001).

FIGURE 4.19 (a) Highway in the sky (HITS) display. (b) tri-planar representation of the same information. (c) operational HITS display
(image courtesy of Erik Theunissen).

But 3D displays are not invariably better than their 2D co-planar or tri-planar counterparts (Wickens,
2000a, 2000b). Consider the air traffic displays shown schematically in Figure 4.21. These displays could be
used in the air traffic control terminal or as a cockpit display of traffic information (CDTI), which is being
introduced into the next generation of aircraft (Alexander, Merwin, & Wickens, 2005; Thomas & Wickens,
2007). Figure 4.21(a) shows a 3D traffic representation. Figure 4.21(b) shows the same information in co-
planar form, with the map location of the two planes in the upper panel (X-Y) and the vertical representation
of the two in the bottom panel (Z-Y). Here research has shown that the 3D representation of the airspace is
inferior for air traffic controllers (May, Campbell, & Wickens, 1996; Wickens, Miller, & Tham, 1996), and
either inferior (Wickens, Liang, et al., 1996) or no better (Alexander, Wickens, & Merwin, 2005; Thomas &
Wickens, 2007) for pilots. The experimental tasks required controllers or pilots to make judgments of the
proximity or collision risk of aircraft pairs. Such inferiority is observed in spite of the fact that (a) airspace is three-dimensional, and hence the 3D display conforms to the principle of pictorial realism; and (b) the judgment of collision risk can be thought of as an integration task, and the 3D display clearly integrates the values on all three spatial dimensions into a single location in space.

FIGURE 4.20 2D topographic map and a 3D perspective representation of the same terrain. Source: 2012 Her Majesty the Queen in Right of
Canada, as represented by the Minister of National Defense.

From Figure 4.21, the reason for the inferiority of the 3D ATC display is obvious. The position of the
two aircraft is inherently ambiguous given that the three spatial dimensions have been collapsed onto a 2D
viewing surface (McGreevy & Ellis, 1986). In spite of the added complexity of the co-planar display, the
ambiguity is eliminated, and it is possible to judge precisely the XY distance in the upper panel (as the crow flies, over the map) as well as the altitude separation in the lower panel. In addition, the strength of the co-planar advantage for air
traffic controllers is related to the fact that controllers do not really perform an integration task as they judge
separation. Rather, they approach separation more as a two-stage judgment: XY (map) separation, and altitude
separation are judged separately. Hence theirs really is a focused attention task. Research in other domains too
has established the inferiority of 3D displays for precise judgments along axes requiring focused attention,
where the 3D display is ambiguous (e.g., Hollands et al., 1998, 2008; Liu, Zhang, & Chaffin, 1997; Wickens,
Thomas, & Young, 2000).

FIGURE 4.21 Three representations of a traffic conflict display, portraying the relative position in 3D space of two possibly conflicting
aircraft. (a) 3D, (b) co-planar, and (c) 3D with artificial frameworks.

We will unpack the concept of line of sight ambiguity (LOS ambiguity) here, using Figure 4.22. At the
top of the figure, we represent a volume of space, and the observer’s eyeball, viewing this volume from right
to left. The space contains three different letter-objects, all approximately equidistant from each other, but A
is farther away from the observer than C and B. This is the true 3D geometry. Now consider what the observer
would actually see looking at the display (lower panel). Here A and B look very close to each other, compared
to their distance from C, a clear departure from the 3D reality. Now suppose the viewer uses the cue of
relative size, assuming the letters to be the same true size. Then, seeing the slightly smaller A compared to B,
the viewer might realize that A is indeed farther away along the depth or distance axis. But how much farther
away? It is impossible to judge, since there are many (indeed, an infinite number of) locations of A along the
depth axis and the vertical axis that could produce the same relative position of A and B from the viewer's perspective.
As would be apparent from the weighted linear cue model (WLCM) discussed above, part of the solution to this LOS ambiguity problem is to provide more depth cues in the image. While this is helpful, when the 3D scene is portrayed on a flat surface, as with a photograph or computer monitor, the benefits of additional depth cues are
mitigated somewhat by flatness cues (Domini et al., 2011; Young et al., 1993). Here certain features of the
viewing environment (e.g., the display frame, reflectance from the screen) signal loud and clear to the
observer that this is indeed a 2D image. This awareness has a way of perceptually “re-orienting” the perceived
depth plane, from one along the line of sight, to one that is progressively more parallel to the viewing screen
as depth cues are reduced. This is indicated by the two angled arrows in Figure 4.22 (top). Indeed, if there
were no depth cues at all, viewers would perceive all objects to be arrayed vertically on the flat vertical
surface. The prominent role of cues to flatness is revealed when those cues are removed. When viewers can no
longer see the screen boundaries, or when reflectance is minimized as when viewing the image in a virtual
reality simulator, the sense of depth becomes much more compelling, as we describe in the next chapter.

FIGURE 4.22 Top: The relative depth along the viewing axis of three objects, A, B, and C, as viewed by the observer on the right. Bottom: The relative position of the images on the viewing screen as they would be seen by the observer (represented in the foreground).

There is also a second cost to 3D displays, closely related, but not identical to LOS ambiguity, and this is
compression along the depth axis. Such compression can easily be seen in Figure 4.22 (top). Here the distance
between A and B, as viewed on the screen (e.g., in pixels or visual angle), is far less than (is compressed
relative to) the distance between B and C. Even when the AB distance is well above threshold, its
compression will still degrade the resolution with which differences can be judged (Stelzer & Wickens, 2006)
and, for dynamic displays, will reduce the extent to which changes (movement) and changes in changes (rate
increases or decreases) can be perceptually resolved. It is of course the low resolution of movement in depth
that is responsible for the difficulty in detecting loss of headway in driving, as the car ahead slows down.
Similarly, DeLucia and Griswold (2011) showed problems with compression when using multiple camera
views in simulated laparoscopic surgery. Performance was poor when participants used a camera view that was parallel to the movement trajectory of the laparoscopic probe.
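Both line-of-sight ambiguity and depth compression fall directly out of the geometry of projecting a 3D scene onto a flat image. As a rough illustration (our own sketch, with arbitrary focal length and point coordinates, not code from any display system discussed above), the following Python fragment projects points through a simple pinhole model: points at many different depths map to the same screen position, and equal separations in depth shrink on the screen as distance increases.

    # Illustrative pinhole projection: why 3D perspective displays are ambiguous and compressed.
    # The viewer is at the origin looking along +z; all numbers are arbitrary example values.

    def project(point, focal_length=1.0):
        """Project a 3D point (x, y, z) onto the 2D viewing plane at z = focal_length."""
        x, y, z = point
        return (focal_length * x / z, focal_length * y / z)

    # Line-of-sight ambiguity: every point on this viewing ray projects to the same screen location.
    for z in (5.0, 10.0, 20.0):
        print(z, project((2.0 * z, 1.0 * z, z)))     # identical 2D image for all three depths

    # Depth compression: equal 10-unit steps in depth produce ever smaller screen separations.
    a, b, c = (1.0, 0.0, 10.0), (1.0, 0.0, 20.0), (1.0, 0.0, 30.0)
    xa, xb, xc = project(a)[0], project(b)[0], project(c)[0]
    print(xa - xb, xb - xc)                          # the more distant pair is compressed on screen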

3.5.2 3D DISPLAYS OF SYNTHETIC SPACE 3D displays can be used to represent
conceptual spaces as well as real spaces. In this case, the three spatial dimensions X, Y, and Z are used to
represent conceptual variables. Examples would include a 3D scatterplot, a 3D graph like that shown in Figure
4.23, or many of the 3D data visualizations that we will describe in the next chapter. Under such
circumstances, while the same limitations of LOS ambiguity and compression apply for focused attention
tasks along a single axis, their consequences to performance may not be as serious if precise metric judgments
of distance or size are not required. Here the object integration quality of the 3D representation can provide an
advantage that outweighs the other costs. For example, when the complex shape of a 3D surface needs to be
understood, 3D scatterplot displays have been shown to be superior to separated 2D scatterplots (Kumar &
Benbasat, 2004; Wickens, Merwin, & Lin, 1994). However, when precise judgments are required, the costs of the 3D format become evident. For example, if asked to judge the relative heights of two bars in the 3D graph shown in Figure 4.23(a), it is difficult to do this accurately, and the error increases with the distance
between the bars in the simulated depth plane (Hollands et al., 2002).

3.5.3 3D DISPLAY SOLUTIONS: ENHANCING DEPTH AND RESOLVING AMBIGUITIES Several remedies to 3D ambiguity can be offered. First, the WLCM suggests that the more depth cues used, the better, and this is clearly supported by research that has varied their number (e.g., Ware & Mitchell, 2008; Sollenberger & Milgram,
1993). Furthermore, given the particularly compelling influence of occlusion, stereopsis, and motion parallax,
these should be incorporated whenever possible. Stereo will be discussed in detail in the following section,
and motion parallax can be accommodated by allowing the viewer to “rock” or “tilt” the entire displayed
volume, much as one might tilt a real 3D transparent volume (like a doll house; Thomas & Wickens, 2007).
Flatness cues can be reduced by dimming ambient lighting (eliminating reflection off the display surface),
making the display frame less visible, or using immersive VR technology, as described in the next chapter.
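To make the cue-combination idea concrete, the sketch below expresses a weighted-linear-style combination of depth estimates in the spirit of the WLCM; the cue names, weights, and numbers are invented for illustration and are not parameters of the model itself. Adding more heavily weighted depth cues, and reducing the weight of flatness cues, pulls the combined estimate toward the depth specified by the pictorial cues.

    # Hedged sketch of a weighted linear combination of depth cues (illustrative values only).

    def combined_depth(cue_estimates, cue_weights):
        """Weighted average of per-cue depth estimates; weights stand in for assumed cue reliability."""
        total_weight = sum(cue_weights.values())
        weighted_sum = sum(cue_weights[name] * depth for name, depth in cue_estimates.items())
        return weighted_sum / total_weight

    # Each cue "reports" a depth (in metres) for the same object; flatness cues signal a flat screen (0 m).
    estimates = {"occlusion": 12.0, "relative_size": 10.0, "motion_parallax": 11.0, "flatness": 0.0}

    few_cues  = {"occlusion": 1.0, "relative_size": 0.5, "motion_parallax": 0.0, "flatness": 2.0}
    rich_cues = {"occlusion": 1.0, "relative_size": 0.5, "motion_parallax": 2.0, "flatness": 0.5}

    print(combined_depth(estimates, few_cues))    # ~4.9 m: flatness dominates, depth looks shallow
    print(combined_depth(estimates, rich_cues))   # ~9.8 m: richer cues pull the estimate toward ~11 m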

FIGURE 4.23 (a) Perceptual distortions produced by 3D graphics. On the left, the two bars are the same height, but the perception of depth
makes the more distant bar appear larger. On the right, the rear bar is smaller than the close bar, but perspective makes them appear the same.
Measure the bars to make these comparisons. (b) The same bars are shown with tick marks added. It is now clearer that the two bars on the left
are the same size, and on the right, that the bar in front is in fact larger than the bar in the rear.

Second, artificial frameworks can be added. The tick marks placed on the bars of Figure 4.23(b) provide a framework that helps judgments of extent (height, in this case). Also, any framework that highlights how differences vary precisely along the 3D orthogonal axes of a volume (lateral, longitudinal, and vertical) can
help. For example, referring now to Figure 4.21c, placing gridlines on the surface and placing the aircraft atop
vertical “posts” can help disambiguate their 3D location (Ellis, McGreevy, & Hitchcock, 1987).
Finally, careful task analysis is essential. As we discuss in Chapter 5, what kinds of cognitive and motor
judgments are to be made on the basis of the displayed information? If only holistic judgments or general
impressions of space are required (Wickens & Prevett, 1995, call this global situation awareness), 3D
displays will be superior. But whenever precise judgments along one or more axes are required, co-planar
displays should be considered; or the 3D displays should be augmented with an artificial framework. Effective
design must accommodate the balance of principles that influence performance of the task required by the user.

3.6 Stereoscopic Displays


As noted above, stereopsis is one of the three dominant cues for 3D depth perception. Indeed many people
consider stereo as the defining aspect of “3D.” We resist this simplistic classification, because motion cues
provide a compelling sense of depth when one eye is closed (i.e., without stereo), and indeed monocular
viewing can provide a powerful sense of 3D richness from the 10 object-centered cues. Nevertheless, given
the importance of the stereo cue, and the technology necessary to generate it artificially, we provide some
detail here.
Stereopsis arises when slightly different images are presented to the two eyes (Patterson, 2007; Westheimer, 2011). This
can be done artificially in a variety of ways. One method is to use glasses with optical shutters that open and
close in rapid succession (e.g., at 120 Hz), synchronized with the image shown on the monitor. Another
method uses polarized glasses, with one lens polarized horizontally and the other vertically. The display surface depicts two images, each with the corresponding polarization. This is the
most common method used for 3D movies. The use of different colored lenses works on a similar principle, at
the cost of impairing the colors that can be perceived in the scene. Perhaps you have seen 3D bookmarks,
cards, or mouse pads in which stereopsis is simulated from a particular viewing angle. These use lenticular printing technology, in which special lenses direct the light to either the left or
right eye. In holographic and volumetric displays, the image is truly 3D, and binocular parallax is preserved in
the different directions of light from the display (Patterson, 2007). However, these last methods are
challenging to build and require considerable computational power, and as a result are not widely used
relative to stereoscopic methods.
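The disparities that such technologies must reproduce can be estimated from viewing geometry. A standard small-angle approximation gives the relative disparity between two objects as roughly I * Δd / D², where I is the interocular separation, Δd the depth difference between the objects, and D the viewing distance. The sketch below uses that textbook approximation with illustrative numbers of our own choosing; it shows how steeply disparity falls off with viewing distance, one reason stereopsis is most effective within near, reachable space.

    import math

    # Small-angle approximation for relative binocular disparity (in radians).
    # The specific numbers below are illustrative, not drawn from any cited study.
    def relative_disparity(interocular_m, depth_difference_m, viewing_distance_m):
        return interocular_m * depth_difference_m / viewing_distance_m ** 2

    IOD = 0.065        # typical interocular distance, about 6.5 cm
    delta_d = 0.10     # two objects separated by 10 cm in depth

    for D in (0.5, 2.0, 10.0):                          # near, room-scale, and far viewing distances
        disparity_arcmin = math.degrees(relative_disparity(IOD, delta_d, D)) * 60
        print(D, round(disparity_arcmin, 2))            # ~89, ~5.6, and ~0.22 arc minutes respectively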
As we saw earlier, the amount of disparity can provide a direct, unambiguous cue for depth, and it
dominates most other cues with which it is placed in competition. Comparative evaluations generally reveal
that stereopsis enhances performance (Getty & Green, 2007; Muhlbach, Bocker, & Prussog, 1995;
Sollenberger & Milgram, 1993; Tsirlin et al., 2008; Van Beurden et al., 2009; Ware & Mitchell, 2008;
Wickens, Merwin, & Lin, 1994). Stereopsis appears important for the control of limb movement given its
high efficacy at short viewing distances. For example, Servos et al. (1992) showed that grasping movements
to a target were faster with binocular relative to monocular viewing. In Chapter 3, we talked about the
influence of display clutter on visual search and attention; stereopsis can be used as a method for filtering
information shown on a display. Kooi (2011) has shown that observers can easily segregate a visual scene on
the basis of portrayed depth using stereopsis, which has the net effect of reducing display clutter.
Within the medical community there is great interest in the use of 3D stereoscopic displays for a number
of purposes, including diagnosis, preoperative planning, minimally invasive surgery, and medical training
(Van Beurden et al., 2009). In general, the advantages of stereo are greatest when visibility is degraded, when
there is high scene complexity, and when there are few monocular depth cues. One particular problem for
medical imaging systems (e.g., ultrasound, X-rays) is that transparent and translucent surfaces are common
and their depiction on a 2D display can be confusing. For example, it can be hard to tell which object is in
front (Tsirlin et al., 2008). Indeed, Getty and Green (2007) have shown clear stereo advantages for
detection rate in breast imaging, reducing both false alarms (false positives) and misses (false negatives).
In preoperative planning, the precise analysis of distances, volumes, and angles is of high importance
(Van Beurden et al., 2009). Visualizing multiple intersecting radiation beams to treat a cancerous tumor serves
as one example. Again, stereopsis shows clear advantages: the optimal path for radiation therapy was determined more effectively using stereoscopic than monoscopic imagery (Hubbold et al., 1997).
The advantages of stereopsis for minimally invasive (laparoscopic) surgery appear to be greatest in more
complex environments, with more complex tasks, and with inexperienced users (Falk et al., 2001;
Votanopoulos et al., 2008). Beyond medical applications, stereoscopic displays will likely be useful for other
domains where precise limb positioning and relative position understanding in personal space is necessary.
In summary, stereoscopic displays appear to provide an effective method for increasing the precision of
relative position judgments. By reducing ambiguity of depth, they reduce some of the problems observed with
3D displays. However, there are certainly limitations to stereoscopic displays. First, as noted above, they
typically require specialized eyewear, which usually produces a drop in the intensity and spatial resolution of
an image (McKee et al., 1990; Smallman & Cook, 2010). Second, not all people can accurately use
stereoscopic cues. Third, when a richer set of monocular pictorial cues is available (including texture
gradient), the advantages of stereopsis can be eliminated (Kim et al., 1987; Ware & Mitchell, 2008). A display designer must balance the added cost of the three-dimensional stereoscopic display against the performance
benefits that it provides in a particular task context.

4 SPATIAL AUDIO AND TACTILE DISPLAYS


So far in this chapter we have concentrated on the use of visual displays to depict spatial information. Perhaps
this is not surprising, for as we will see when we discuss mental resources in Chapter 11, there is a natural
mapping between the visual and the spatial. However, it is certainly possible to use the auditory modality to
communicate spatial information. An everyday example is the use of stereo headphones, where one musical
instrument is placed in the left channel, and another in the right. Tactile displays also have an inherent spatial
component. In this section, we briefly address the use of 3D spatial audio technology and tactile displays.
In Chapter 10 we will discuss the role of auditory displays in presenting the operator with information
through an alternative channel in order to mitigate the effects of excessive visual workload. Recent advances
in computing technology—most notably in the form of head-related transfer function filtering techniques—
allow sounds presented to the listener via everyday stereo headphones to seem to originate from a
specific location in 3D space. Under normal listening conditions, we estimate the spatial location of a sound
using cues derived from a single ear (monaural cues) and by comparing cues received at both ears (binaural
cues). Similar to the combination of visual depth cues, the monaural and binaural cues are used in
combination to determine the location of a sound. If we consider the simple case of the horizontal plane, the
auditory system can use differences in both the intensity and timing of the sound as it arrives at each ear: a sound wave approaching from the left side will reach the left ear earlier, and with greater amplitude (it will sound louder), than the right ear. These interaural differences are binaural cues. On the vertical plane, monaural
spectral cues determined by the shape of the pinna are used (Bremen, van Wanrooij, & Van Opstal, 2010).
The precise vertical location of a sound is more difficult to determine, although it is mediated by the acoustic
context of the sound (Getzmann, 2003). It is through the use of such cues in combination that consumer
products with 3D audio technology can reproduce the 3D aspects of the auditory environment, and 3D
auditory alerting systems are able to project a sound to a specific location in space, even when the listener is
wearing traditional stereo headphones.
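The size of the binaural timing cue on the horizontal plane can be approximated with the classic Woodworth formula, ITD ≈ (r/c)(θ + sin θ), where r is the head radius, c the speed of sound, and θ the azimuth of the source. The fragment below applies that standard approximation with nominal values (a head radius of about 8.75 cm); it is only meant to convey the scale of the timing differences a 3D audio system must recreate, and is not the filtering algorithm used in head-related transfer function implementations.

    import math

    # Woodworth approximation for the interaural time difference (ITD) on the horizontal plane.
    # head_radius_m and speed_of_sound are nominal values; azimuth is in degrees (0 = straight ahead).
    def itd_seconds(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
        theta = math.radians(azimuth_deg)
        return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

    for azimuth in (0, 15, 45, 90):
        print(azimuth, round(itd_seconds(azimuth) * 1e6), "microseconds")
        # roughly 0, 133, 381, and 656 microseconds: small but reliably detectable differences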
The application of 3D audio technologies to aviation has met with considerable success in terms of
enhancing performance and reducing workload on a range of tasks, such as target detection and acquisition
(Nelson, Bolia, & Tripp, 2001). For example, response times to Traffic Advisory Warning alerts are reduced
by 25 percent when 3D audio cues are available (Simpson, Brungart, et al., 2004). We have a natural tendency to attend visually to loud and distinct sounds, a phenomenon known as the orientation reflex (Perrott, Saberi, Brown, & Strybel, 1990). 3D auditory displays can take advantage of this reflex, leading to significant decreases in visual search times and improvements in head movement efficiency and effective search area. Such
alerting effects are robust for both static and moving targets and require relatively short training sessions
(McIntire, Havig, et al., 2010), are resistant to the effects of sustained high accelerative (gravitational, or G)
forces (Nelson, Bolia, & Tripp, 2001), and can also improve the intelligibility of the audio messages
themselves (Carlander, Kindström, & Eriksson, 2005). Spatial audio cues can be used to improve the speed of
visual search (Pavlovic, Keillor, et al., 2009). The location of the auditory cue has to be precise, however, especially for
targets located on the horizontal plane. Even four degrees of error between the target and the sound cue leads
to significantly longer search times (Bertolotti & Strybel, 2011).
One advantage of spatial audio is that it is more resistant to cognitive load than spoken language. Klatzky,
Morrison et al. (2006) guided blindfolded participants along virtual paths. Information was provided to the
participant about the azimuth direction of the next waypoint, either using virtual sound or spatial language. At
the same time, the participants had to perform a cognitive task (an N-back task, to be described in Chapter 7).
This task generated a cognitive load for the participants as they tried to navigate between waypoints using the
cues. Participants showed better performance while navigating with virtual sound than with spatial language.
Over the last decade tactile displays have been developed to present spatial information to operators
using tactile actuators. Tactile displays can help direct visual spatial attention, and enhance spatial awareness
under degraded visual conditions (Hale, Stanney, & Malone, 2009). Like 3D auditory displays, tactile displays
capitalize on the orientation reflex. Tactile displays can reduce spatial disorientation in aviation
environments when visual and vestibular cues are missing or misleading (McGrath, Estrada et al., 2004).
Tactile displays have also been shown to improve obstacle avoidance (Lam, Mulder, & van Paassen, 2007),
facilitate target acquisition for unmanned aerial vehicle operators (Gunn et al., 2005), provide drift
information to helicopter pilots during hover (van Veen & van Erp, 2003), and facilitate aircraft upset recovery (Wickens, Small et al., 2008). Like 3D audio, tactile displays are also resistant to the effects of
sustained high G forces (van Erp et al., 2007).
The integration of tactile displays with existing visual and auditory displays presents a number of
challenges to the designer. One decision relates specifically to whether the tactile cue should provide status
information (such as the location of an obstacle) or command information (telling the operator to avoid the obstacle). Salzer, Oron-Gilad, et al. (2011) found that for tactile displays used in the cockpit, command displays
were preferred over status displays. A related topic (discussed in Chapter 2 in the context of information
theory and in Chapter 6 in the context of communications) is the use of redundancy to improve performance.
Many studies have shown a benefit from simultaneous presentation of the same information through different
modalities (for a review, see Wickens, Prinett, et al., 2011). We will revisit many of these topics when we
discuss communications in Chapter 6.
In summary, we can see that auditory and tactile displays offer useful methods for presenting spatial
information to an operator, if well coordinated with available visual information.

5 TRANSITION
This chapter has described issues related to the design of spatial or analog displays. We began with a
discussion of graphs and noted several factors that can make a graph more effective. We then examined
graphical displays such as meters and dials, and emphasized the concept of compatibility between the display
and the cognitive domain. Then, after introducing two types of perception (direct and indirect) we considered
how each contributes to our understanding of 3D space. First, we considered characteristics of a three
dimensional environment that provide information about egomotion and how this guides navigation. Then we
examined how we deliberately judge and interpret depth and three dimensional structure and discussed how
3D displays might best be designed to effectively convey information. Finally, we briefly considered spatial
displays that use other sensory modalities. In the next chapter we will focus on interactive displays that are
also spatial, so that chapter forms a natural continuation of many of the topics discussed here. In particular, we
build upon and elaborate the discussion of 3D displays. We will address similar topics when we discuss
spatial working memory in Chapter 7, and the compatibility between a display and working memory and
response in Chapters 7 and 9, respectively. However, as we are well aware, spatial information plays only a
partial role in our interactions with other systems, including people. In Chapter 6 we will discuss the
complementary role of verbal and linguistic information in such interaction.

Key Terms
Accommodation 111
Aerial perspective 110
ambient vision 103
binaural cues 120
Binocular disparity (stereopsis) 110
black hole illusion 107
brightness 97
color 96
color hue 96
color saturation 97
compression 104
Convergence 110
cue dominance 112
data-ink ratio 92
depth cues 109
direct perception 103
display compatibility 94
dorsal visual pathways 103
ecological compatibility 94
ecological interfaces 100
ecological psychology 103
Edge rate 108
egomotion 103
expansion point 106
flatness cues 117
focal vision 103
frequency separated display 98
global optical flow 107
global situation awareness 119
head-related transfer function 120
Height in the plane (relative height) 109
hybrid display 97
indirect perception 103
inside-out display 98
Light and shadow 109
line of sight ambiguity 116
Linear perspective 109
magnitude estimation 90
Malcolm horizon display 104
mental model 94
mental operations 88
meta-analysis 86
monaural cues 120
Motion parallax 110
moving-pointer display 97
moving-scale display 97
naïve realism 96
object-centered cues 109
observer-centered cues 109
Occlusion 109
optical flow 106
optical invariants 104
orientation reflex 121
outside-in display 98
perceptual continua 90
pictorial cues 109
Poggendorf illusion 88
population stereotype 97
principle of pictorial realism 95
principle of the moving part 97
proximity compatibility principle 86
Proximity-luminance covariance 110
Relative (familiar) size 110
relative judgment or comparison 97
response compression 90
response expansion 90
splay 106
stereoscopic display 110
Stevens’ law 90
Structure through motion 110
tactile displays 121
tethered display 99
Textural gradients 110
ventral visual pathways 103
visual momentum 93
weighted linear cue model 112
work domain analysis 100

5 SPATIAL COGNITION, NAVIGATION, AND MANUAL CONTROL

The mountain hiker had summited the peak on a beautiful morning and now left the descending ridge to
plunge into the wooded valley below, leading to his destination at the distant roadway. The noonday sun gave
him a clear orientation along his northbound course. By 1 PM, he had descended below timberline, the sun was
now hidden by low clouds, and his GPS unexpectedly gave out. With no compass for a backup, he consulted
his guidebook, which indicated that he should take a right turn before the creek drainage. But where was the
creek? In a break in the trees he looked upward to find the ridge from which he had descended, but the
mountain was now obscured in clouds. He could not match the dim silhouette of the mountain peak with the
many humps shown in his map in the guidebook. He thrashed through the trees, came at last to a dirt road, and
decided to follow it down. But in the level forest in which he now found himself, which way was "down"?
Much of the material in the previous chapter addressed analog or spatial displays, which are useful for
showing continuous differences, such as the slope of a line on a graph, or the position of a pointer on a
display. The current chapter also considers issues of continuous representation of spatial information, but does
so in the context of location in and movement through space (Shah & Miyake, 2005; Taylor, Brunye, &
Taylor, 2008). Such movement may be direct, as when walking through a building, along a wooded trail, or
hiking a mountain like our lost climber. This movement may also be indirect, as when controlling a bicycle,
car, or even a "virtual viewpoint" in virtual reality.
Whether direct or indirect, the movement typically requires some or all of the four primary stages of
information processing:
1. A scene or a map must be perceived and attended in order to find one’s current location and goals;
2. The space in which one is traveling must often be understood, a process heavily dependent on spatial
working memory (Chapter 7). For example: “From what I see, which way is north?” or “Where is the
nearest exit?”;
3. A direction is chosen to meet some task-specific goals, a choice that is often based upon the spatial
awareness represented in the second stage;
4. The choice is executed through locomotion, either via a simple automated natural method (e.g.,
walking) or one that may manifest considerable complexity (controlling a large aircraft or submarine
in 3D space).
Within this context, the sections of this chapter deal with several related concepts. We begin by
describing the cognitive representation of space and in particular, the importance of the frames-of-reference
concept in spatial thinking (Wickens, 1999; Wickens, Vincow, & Yeh, 2005). In this context, we describe a
few important categories of tasks that depend upon this spatial representation. We address human factors tools
designed to support these spatial tasks, focusing on the design of maps, issues of clutter and frame-of-
reference, and the challenges of 3D maps. We then address an area closely related to spatial representations: information visualization and visual momentum. The next section focuses more explicitly on Stage
4, the execution of spatial movements in the tracking or manual control task. Here a primary focus is on
vehicle control. In the final section we consider the human performance issues of virtual and augmented
reality.

1. FRAMES OF REFERENCE
To set the stage for our discussion, we present a matrix (Figure 5.1) showing three classes of tasks involving
space—travel, understanding, and precision judgments—crossed with four different ways that 3D spatial
information can be represented: from an egocentric or exocentric perspective, in co-planar form (e.g., from
two orthogonal perspectives), or verbally. We will “fill in” this matrix during the first sections of this chapter,
and we note that, regardless of task, some form of transformation of frame of reference will be required, the
concept that began our discussion.

1.1 Cognitive Representation of Space
Space can be represented in three Euclidean dimensions, generally labeled X, Y, and Z. However, the three
spatial dimensions are often represented more concretely in either of two different frames of reference (FOR).
In a human-centered, egocentric, or ego-referenced frame the three dimensions are left-right, front-back, and
up-down (Franklin & Tversky, 1990; Previc, 1998). In an exocentric, or world referenced frame (sometimes
called “allocentric”), the dimensions are: east-west, north-south, and (again) up-down, respectively. Of course
there are many other possible sets of reference frames. We can talk about the front or back of a room (which
may not correspond to what is in front of an observer or to north), or the frame of reference of a three-axis
controller (which may not necessarily align with the human’s head or trunk; Chan & Hoffman, 2010).

Display Frame of Reference

Task Co-planar Exocentric Egocentric Verbal

Travel

Understand

Precise judgment

FIGURE 5.1 Spatial task X display matrix

One general characteristic of the axes of the frames, particularly egocentric frames, is the salience, or
degree of “marking” of endpoints. For example, there is a clear ecological distinction between “up” and
“down,” representing not only the distinction between sky and ground, but also the force of gravity. There is
also a distinction between front, where things (e.g., hazards) can easily be seen, and back, where they cannot. In contrast, there is far less differential marking between left and right, and hence there is more
opportunity to confuse these (Previc, 1998). It is likely that these differences in perceptual salience of the axes
have some fundamental biological basis.

1.2 Frame of Reference (FOR) Transformations in 2D: Mental Rotation


The alignment of a pair of FORs supports human performance. For example, people have an easier time navigating with a map when its orientation is aligned with their forward direction of travel. When FORs are not aligned, tasks
often require a FOR transformation, or FORT. A FORT requires time to perform, increases the likelihood of
error, and leads to increased cognitive load (e.g., Pavlovic, Keillor, et al., 2008; Wickens, 1999; Wickens,
Keller, & Small, 2010). Because FORTs impact human performance, we describe them in some detail. In the
context of the tasks in Figure 5.1, we emphasize first the task of navigation.
The most familiar FORT is mental rotation (Shepard & Cooper, 1982; Aretz, 1991; Gugerty & Brooks,
2004; Stransky, Wilcox, & Dubrowski, 2010). The original mental rotation studies required people to
determine if rotated letters or geometric objects matched the identity of an upright target letter (or shape), in
two or three dimensions (e.g., Shepard & Cooper, 1982). More recent studies have examined mental rotation
in the context of map use (e.g., Crundall, Crundall, et al., 2011; Williams, Hutchinson, & Wickens, 1996;
Wickens, Vincow, & Yeh, 2005). We begin by describing FORT in the navigation or travel task: getting from
the current location to a destination. For example, driving southward while holding a map in a north-up
orientation can be challenging because in deciding upon turns, or evaluating landmarks, the driver must often
“mentally rotate” the map into a south-up orientation so that there is congruence between the forward view
and objects on the map. Of course, some of us physically rotate the map to the south-up orientation.
Unfortunately, this means that text and symbols on the map are inverted. It also presents other cognitive
challenges, described next. It is important to note that 2D mental rotation can support either a discrete cognitive decision (e.g., which way to turn) or a continuous manual response, as when a remotely controlled model airplane is flown by a controller from the ground.
A general function representing the cost of 2D mental rotation is seen in Figure 5.2. The Y axis (cost)
can represent time, error likelihood, or mental workload, and in different circumstances any or all of these
may emerge. The figure can be decomposed into four "regions" across the X axis. For small mental rotations (or small misalignments), there are minimal costs, and they do not grow much with increased rotation until a
90-degree point is approached. This second region is critical because here, left on the map no longer
corresponds to left in the forward view. Such ambiguity leads to an increased need for mental rotation. For
angles above 90 degrees (region 3) there is in fact an incompatibility such that right on the map corresponds to
left in the world, an issue we address in detail in Chapter 9. As the misalignment approaches 180 degrees,
however, there is an interesting “dip” in the peak such that perfect misalignment does not impose as much of a
cost as might otherwise be predicted by a single spatial-mental rotation mechanism (Gugerty & Brooks, 2004;
Macedo et al., 1998; Aretz, 1991), although there is still a cost relative to upright. This relative advantage
appears to be mediated by a verbal “left is right” strategy that can often be deployed (Cizarre, 2007). The
curve is then relatively symmetric, returning to 0 at angles of misalignment above 180 degrees.

FIGURE 5.2 Two-dimensional mental rotation costs as a function of angle. The four regions are described in the text.

The implications of the figure are straightforward. Human performance will generally be more proficient
if maps rotate in the direction of travel (“track up” or “heading up”), relative to a fixed (usually north-up)
orientation. With track-up maps, the 2D FORT cost is minimized; with electronic maps, this can normally be
done while still keeping the text in an upright (and therefore legible) orientation. For the so-called “you are
here” (YAH) map (Levine, 1983; Figure 5.3), a navigational aid often found in malls, parks, airports, or urban
environments, this congruence of alignment may be accomplished by rotating the map to the appropriate orientation before affixing it to the signpost.
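Rotating an electronic map into a track-up orientation is itself a frame-of-reference transformation, but one carried out by the display rather than by the user. The sketch below is a generic illustration (not drawn from any particular navigation system): world-referenced waypoint coordinates are rotated about the ownship position by the current heading, so that the direction of travel always points toward the top of the screen while text labels can still be drawn upright.

    import math

    # Rotate world (east, north) coordinates into a track-up display frame centred on ownship.
    # Generic illustration; a real electronic map would add scaling, clipping, and label placement.
    def world_to_trackup(point_en, ownship_en, heading_deg):
        """heading_deg is measured clockwise from north; returns (right, up) screen coordinates."""
        east = point_en[0] - ownship_en[0]
        north = point_en[1] - ownship_en[1]
        h = math.radians(heading_deg)
        right = east * math.cos(h) - north * math.sin(h)
        up = east * math.sin(h) + north * math.cos(h)
        return right, up

    # Ownship at the origin heading due east (090): a waypoint 1 km to the east plots straight ahead (up).
    print(world_to_trackup((1000.0, 0.0), (0.0, 0.0), 90.0))   # approximately (0.0, 1000.0)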

FIGURE 5.3 A “You-are-here” (YAH) map. Notice that the forward-field-of-view corresponds to the orientation of the map. Some YAHs
violate this convention. Note also that a visually prominent landmark is highlighted which will illustrate visual momentum.

At this point, it is appropriate to note three potential costs to a rotating map display. First, for single
users, when the map is continuously rotating (as for example while traversing a city in a winding path), the
lack of consistency makes it more difficult to build up a “mental model” of the environment (the task of
“understanding”) and indeed research has shown that people are less able to re-construct the environment (by
drawing a map) after having operated with a rotating map (Aretz, 1991; Munzer, Zimmer, & Baus, 2012;

126
Williams, Hutchinson, & Wickens 1996; Wickens, Liang et al., 1996). Thus, it may disrupt the
understanding task shown in Figure 5.1.
Second, there are substantial individual differences in mental rotation ability (Gugerty & Brooks, 2004;
Carlson, 2010; Hegarty & Waller, 2005; Crundall, Crundall, et al., 2011), which may be why some people are
quite comfortable navigating with a north-up map, showing minimal costs, and indeed why many pilots prefer
to hold paper maps in a north-up orientation (Williams, Hutchinson, & Wickens, 1996). Third, when
communication is required between people who may not share the same momentary frames, world-referenced
language (north-south-east-west) is more universal and less ambiguous than ego-referenced language (van der Kleij & Brake, 2010). Consider, for example, an aerial fire-fighting tanker being directed by a commander in another
aircraft to fly a particular course toward a water-drop. It is for these reasons that well-designed electronic
maps preserve a fixed (usually north-up) orientation mode that can be selected.

1.3 3D Mental Rotation: The General FORT Model


Humans must often navigate in 3D space. Figure 5.2 illustrated a non-linear rotation function for a map,
which generally represents information lying along a 2D horizontal plane (north-south-east-west). However,
more complex spatial understanding and navigation is often required in 3D space as well, whether this be the
pilot flying a 3D trajectory to an airport (Wickens & Prevett, 1995), the shopper or museum visitor in a
complex multi-level building (Carlson, 2009), the surgeon performing endoscopic surgery, maneuvering a
probe through a twisting vessel (Stransky, Wilcox, & Dubrowski, 2010; Zhang & Cao, 2010), or the operator
positioning herself in space on a 3D platform relative to a telephone pole. In such cases, FOR transformations must take place between three orthogonal 2D planes (shown by the curved arrows in Figure 5.4), contributing additional costs. There is some cost for transforming between planes oriented at 90 degrees
to each other. The user may need to ask questions like, “In order to move the probe up, do I move my control
forward, or backward?” We address this issue of control-display compatibility further in Chapter 9.
Furthermore, performance costs sometimes become evident when the user must make comparative
judgments between images. The viewer of a contour map or satellite depiction of the ground, when comparing
with a forward field of view, must in essence rotate the map 90 degrees upwards to envision how that image
corresponds with a forward view, in order to judge congruence. That is, “is what I see, what I should see?”
This upward mental rotation of the environmental view (or forward mental rotation of the map; rotation A in
Figure 5.4) also imposes time and error costs (Hickox & Wickens, 1999; Aretz & Wickens, 1992). A nice
solution to this problem is to adopt a 45-degree downward viewing perspective such as shown in the YAH
map of Figure 5.3 (Hickox & Wickens, 1999). This has the advantage of reducing the extent of forward
mental rotation while still preserving some of the desirable topological features of a god's-eye "map" view (What roads go where? How are distances judged?). Such a view seems highly suitable for situations in which
mapped information needs to be compared with that available in the environment (e.g., YAH maps, electronic
maps). An added advantage is that images and objects depicted in the map look more like their real-world
counterparts (visible in the FFOV) than they would if depicted from a top-down view.

FIGURE 5.4 3D frames of reference transformations. The figure depicts 2D movements (the thin straight arrows) in each of three planes
(frontal, top, and side), relative to the human controller at the front. The curved arrows describe the different orthogonal transformations between the three planes that might be required by a task. The difficulty of these is described in the text.

While forward mental rotation (A in Figure 5.4; envisioning what a contour map looks like in 3D) imposes some costs, it also turns out that not all plane-pairs impose equal difficulty in their 90-degree transformations. A general finding is that transformations between the frontal and horizontal planes (A) are less demanding ("up and over the top") than those between either of these planes and the side (vertical) plane ("around the side"; B and C in Figure 5.4) (Chan & Hoffman, 2010; DeLucia & Griswold, 2011).
The reason for this difference appears to lie in the difficulty of translating left-right, in part because of the lack of "marking" of the lateral axis, described earlier (Franklin & Tversky, 1990). In particular, the
easier mapping between the frontal and horizontal plane (rotation A) always keeps the left-right axis
consistent. Left is left and right is right no matter whether you are looking forward at a vertical screen or
downward at a horizontal one. As a consequence we have little difficulty using a YAH map even if it is
mounted vertically, while the terrain depicted is horizontal. But this consistency does not exist with rotations
B and C (Wickens, Vincow, & Yeh, 2005).
Finally, we see in Figure 5.5 the challenges of multiple FOR transformations. The pilot sees the world
through the windshield (top left), but must compare it with her estimated position on the north up 2D map
(bottom left). The two right panels illustrate the two transformations required to judge if “where I am is where
I should be.” The wedge on the bottom left panel will be explained later.

FIGURE 5.5 An illustration of dual maps, and visual momentum to aid comparison between them. The display at the top left is a rotating 3D
egocentric map, such as appears in the synthetic vision displays of some modern aircraft, mimicking the real world. The map at the bottom
left is a top down 2D north up map, depicting the aircraft flying southward. The boxes to the right illustrate the two FOR transformations—
lateral and vertical mental rotation—that the pilot accomplishes to compare the two views. The wedge depicted in the 2D map provides visual
momentum since it depicts the FOV seen in the upper map.

1.4 2D or 3D
“3D” or perspective displays were discussed in Chapter 4. The answer to the challenge of minimizing FORT
transformations in forward navigation is often found to be the 3D forward view display such as the “tunnel in
the sky” (Fig 4.19 in Chapter 4) or a 3D YAH map (Figure 5.3). The advantages of 3D maps notwithstanding,
they are hindered by two “spawned costs.” Before these costs are described, a context for understanding them
is the distinction between the egocentric and exocentric viewpoint of the 3D display (Hollands & Lamb, 2011;
Wickens & Prevett, 1995). In the former, the viewpoint of the display is the same as the eyes of the viewer if
s/he were immersed in the displayed 3D environment (the immersed view). The highway in the sky display
shown in Figure 4.19 provides an example. In the latter, like the YAH map of Figure 5.3, the viewer can see a depiction of his or her own position (sometimes called an avatar) from above and behind within the display.
Given this context, understanding the 3D space through which one travels can be greatly hindered by the
keyhole properties of the immersed view (Woods, 1984), the first spawned cost. More generally, the
accentuation of the forward travel path, while aiding travel and navigation, can hinder understanding. A 3D
egocentric display with a narrow field of view will hide (not display) non-forward information. A narrow
perceptual focus can lead to a narrow attentional focus. If one’s attention is drawn to what is in front, little
attention is available to consider objects and landmarks to the side, above, below, and behind (Wickens,
Thomas, & Young, 2000; Wickens, 1999; Olmos, Wickens, & Chudy, 2000). What is not seen cannot easily
be understood. Hence the egocentrism of an immersed 3D display degrades the understanding task in a way
that is analogous to that of the rotating 2D map, as discussed above.
The lack of understanding fostered by the 3D egocentric view is amplified still further in the fourth
source of navigational information represented in Figure 5.1, the verbal route list, a form of command display
that simply tells the traveler when, where, and which direction to turn (e.g., the driving instructions in Google
Maps). With the emphasis on forward actions, attention is diverted from considering those landmarks and
features unrelated to the forward path (Bartram, 1980).
The second spawned cost of 3D displays, line of sight (LOS) ambiguity, was described in the previous
chapter (see Figures 4.21 and 4.22). The location and movement of objects in 3D space are ambiguous when
presented on a flat viewing surface, and position differences along the line of sight are highly compressed.
Such costs are particularly prominent in the 3D exocentric display because not only are object locations
ambiguous, but so is the location of you, the viewer (or the avatar from which spatial judgments must be made; Wickens & Prevett, 1995; Wickens, 1999; Wickens, Vincow, & Yeh, 2005).
Yet despite these costs, 3D exocentric displays preserve two clear benefits. Unlike the coplanar display,
displayed objects look like their real-world counterparts, and unlike the 3D immersive display, a large space
around you can be seen, hence mitigating the unwanted keyhole effect and increasing what we call global
situation awareness (see Chapter 7); that is, the ability to understand “the lay of the land” (Wickens,
Thomas, & Young, 2000).
Figure 5.6 presents a filled-in version of Figure 5.1, which now identifies more clearly the costs and
benefits imposed by different information processing mechanisms for each task-frame-of-reference combination (Wickens, 1999). Each cell of the table can be characterized by a set of factors that makes the
viewpoint at the top either more (+) or less (-) suitable to serve the task at the left. Thus for the navigational
travel task, the 2D map imposes a cost because the symbols or icons used to designate landmarks (useful for
deciding where to turn) do not look like their real-world counterparts. For the 3D exocentric display, this cost
is reduced, and a benefit for landmark comparison (between the world and the map) is observed with the 3D-
immersed display.

Display Frame of Reference

Nav travel: 2D co-planar, Landmark comparison –; 3D immersed, Landmark comparison +; Verbal route list, Landmark comparison –

Understanding: 2D co-planar, Broad FOV + and Landmark comparison –; 3D exocentric, Broad FOV + and Landmark comparison +; 3D immersed, Keyhole –; Verbal route list, Landmark comparison +

Precise judgment: 2D co-planar, Linear distance +; 3D exocentric, Double LOS ambiguity –; 3D immersed, LOS ambiguity –

FIGURE 5.6 Matrix of costs and benefits of tasks and display frames of reference

For the task of understanding, maintaining global SA, or developing a mental model of the world
(functions which often support navigational planning), there are two sources of influence: again, landmark
similarity, but also visibility of the broader array, required to gain an understanding of the relative and
absolute location of map features beyond the route of travel. These two features are identified by + and -
across the cells, where the 3D immersed display is heavily penalized by the keyhole phenomenon. For the
third task of precision judgment, the key feature is the ease of perception of linear distance. The plan view (or
co-planar) map possesses a consistent scale that applies across all regions of the map/display. In its absence,
as with 3D displays, ambiguity is present. The exocentric 3D display is doubly penalized because the
ambiguity is applied both to assessing one’s own position and to the position of other elements.

1.5 Solutions to FOR Problems


Whenever a single task is to be served, it is usually possible to pick an optimal viewpoint or FOR. But often
multiple tasks must be served simultaneously. For example, the driver must navigate when driving through an
unfamiliar area, but she should also understand the area traversed so that a wrong turn or incorrect guidance
does not get her lost. Two solutions are described: one a design solution and the other a training-based
solution.

1.5.1 DESIGN: MULTIPLE MAPS An obvious solution is to provide two (or more) different maps, either
simultaneously viewable or sequentially accessible (such as an electronic map that can toggle between track-
up or north-up options). The effectiveness of such a solution can generally be enhanced if techniques of visual
momentum are employed to show how the area depicted in one map, display, or view relates to that depicted
in the other (Aretz, 1991; Bennett & Flach, 2012; Woods, 1984; Hochberg & Brooks, 1978). While this issue
is addressed in detail later in this chapter, one brief example can serve here. Consider the situation in Figure 5.5. Now suppose that the forward view is, instead of a real view, a 3D egocentric synthetic map (also known as a synthetic-vision-system display; Prinzel & Wickens, 2008; Alexander, Wickens, & Hardy, 2005). This
will help guidance (particularly if coupled with a highway in the sky, as shown in Figure 4.19). The 2D map
will help planning, understanding, and communication with air traffic control, so the dual maps together will support both tasks. Now the field of view in the immersed display can be depicted as a "wedge" on the plan
view map at the lower left of Figure 5.5. This allows the navigator to rapidly see how the terrain depicted in
the forward view display is represented in the 2D map view. In terms of the proximity compatibility principle,
discussed in Chapter 3, the common element (FOV representation) helps the navigator integrate the two
sources of information. It guides selective attention gracefully between the common elements of the two
displays. The presence of such visual momentum tools has proven beneficial for performance when flying
with north-up maps (Aretz, 1991; Olmos, Liang, & Wickens, 1997).

1.5.2 TRAINING: STAGES OF NAVIGATIONAL KNOWLEDGE There is now good evidence that different aspects of
spatial/cognitive skills can be trained, both in terms of raw processing speed (mental rotation fluency;
Stransky, Wilcox, & Dubrowski, 2010) and general strategic approaches. An important element here is the
process of acquiring geographical knowledge, developed as a person learns about a particular area: a city, a
mountain range, a neighborhood, or a complex 3D building.
Researchers have identified three general stages of knowledge acquired as familiarity with an
environment increases. Possession of all forms of spatial knowledge helps to optimize navigational fluency
and understanding (Thorndyke & Hayes-Roth, 1982).
1. Landmark knowledge generally develops first, and is characterized by highly visual representations
of key salient landmarks—the atrium in a building or the distinct statue and river in a city.
2. Route knowledge is characterized by knowing how to get from one location to another; it is often
represented verbally in terms of specific navigational decisions, such as “turn right at the church.”
Route knowledge thus links together information about the relative location of landmarks.
3. Survey knowledge or the “mental map” is represented by the ability to reconstruct an accurate
rendering of the area. Answers to questions about spatial relations, such as “What street is north of the
statue?” or “How far is it from X to Y?” (when these are not along a route already traversed) also
constitute a form of survey knowledge. Survey knowledge is helpful for the lost or disoriented
traveler, whereas route knowledge is less useful in this case (since the traveler is likely off the route).
Research suggests that the paths to the latter two forms of knowledge are somewhat different. The most
direct path to route knowledge is through navigational practice in either the real environment or a virtual
rendering (virtual reality environments, as discussed later in the chapter). The most direct route to survey
knowledge is through map study. However, there is some asymmetry between the two kinds of training in that
extensive navigation will also eventually develop fluent survey knowledge, but extensive map study is less effective in developing route knowledge because it does not support the visual landmark recognition obtained from the forward 3D view of the navigator (Williams, Hutchinson, & Wickens, 1996; Thorndyke &
Hayes-Roth, 1982).
It is important to realize that the path to the two levels of advanced geographical knowledge, route and survey, is not linear and sequential. Both can develop concurrently, and Montello (2005) has suggested that it
is the increasing precision of metric properties, more than the qualitative change in type of knowledge, that
develops with increased navigational experience. Liben (2009) has also noted the increased precision of
spatial knowledge, a hallmark of spatial environmental learning that is acquired with experience.

2. APPLICATIONS TO MAP DESIGN


In the previous section, we discussed a number of factors that influence the design of maps, based on how
they are used. Here we summarize some of the main implications of our discussion, and augment these with
two additional considerations.

2.1 Design of 2D Maps


2D maps used for navigation should rotate with the direction of travel or orientation, but should also offer a fixed (north-up) option to allow for improved spatial understanding.
• Heading-up maps may be dynamic (in vehicle mounted maps) or achieved through appropriate
directional placement (in YAH maps). Such design will improve the congruence between the map and the forward view and ensure that the direction of turn decisions are spatially compatible with the
visualized map information.
• If relatively precise vertical information is important (e.g., in trail maps, air space maps, or
architectural and construction blueprints), then a set of co-planar 2D views is useful (e.g., linear and
uncompressed depiction of the vertical).

2.2 Design of 3D Maps


Guidance is given in Figure 5.6 above, as well as in the discussion in Chapter 4; making map viewpoints optimal for the task may involve the creation of two maps. The coupling of the two maps (or views) will be
assisted by applications of principles of visual momentum as discussed earlier, in the context of Figure 5.5
(see also Section 5 below).

2.3 Map Scale


This refers to the ratio of a distance on the map surface to the corresponding distance in the world. That is, a 1:1000 scale indicates that 1 m on the map corresponds to 1000 m in the world. If the second
number in the ratio is small (e.g., 1:10), then this corresponds to a “large scale” or “zoomed in” map. If it is
large (e.g., 1:100,000), then it is a “small scale” or “zoomed out” map. In 3D maps, scale can be defined by
the geometric field of view or GFOV, which includes angular and distance (scale) components, like the lens
shape and zoom on a camera (Hollands & Lamb, 2011).
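Because the ratio convention is a frequent source of confusion (a larger second number means a smaller scale), a small worked example may help; the function below simply applies the definition given above, with made-up distances.

    # Map scale arithmetic: a 1:N scale means 1 unit on the map represents N units in the world.

    def world_to_map_distance(world_m, scale_denominator):
        """Return the map distance (in metres) for a world distance at scale 1:scale_denominator."""
        return world_m / scale_denominator

    print(world_to_map_distance(500, 1_000))      # 1:1,000 ("zoomed in"): 500 m in the world spans 0.5 m of map
    print(world_to_map_distance(500, 100_000))    # 1:100,000 ("zoomed out"): the same 500 m spans only 5 mm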
• The best map scale is task-dependent, with smaller scale maps generally better supporting global
understanding (since the relative location of more features can be apprehended in a single glance), and
larger scale maps generally better supporting navigation, since details along the route of travel will be
better represented. Note particularly the parallel between small and large scale with the exocentric and
egocentric 3D displays respectively.
• In 3D displays, a large GFOV will generally support global understanding, and reduce the keyhole
phenomenon (Alexander, Wickens, & Hardy, 2005). However a smaller GFOV, by magnifying and
enhancing the visibility of landmarks along the forward path, will help navigation and travel. Also
because it presents fewer landmarks or objects on the screen (it is less “compressed”), the smaller
GFOV will generally produce less clutter, the issue we address in the following section.

2.4 The Role of Clutter in Map Search

2.4.1 CAUSES OF MAP CLUTTER Clutter was discussed in some detail in Chapter 3 as an impediment to selective attention (visual search) and focused attention (readout clutter). Clutter plays an important role in map use,
where search for an item is typically followed by a more focused readout of properties of the located item. It
also bears on issues discussed previously relating to map scale and compression. We focus our discussion here
on two forms of clutter introduced in Chapter 3.
• Search or numerosity clutter is created by two factors in map design:
1. Adding more information. Display designers will provide status information about the objects
shown in a display. For example, in air traffic control, there is often a desire to include digital
data tags for each aircraft. Heterogeneous-featured object displays, described in Chapter 3,
provide a method for increasing information without increasing clutter.
2. Scale. Increasing the GFOV of a 3D display, or decreasing the map scale of a 2D display, will
increase “N” in the search task.
• Proximity or readout clutter, which challenges focused attention, is increased by three factors:
3. More items. Increasing the number of items, reducing scale, or increasing GFOV will “scrunch”
items together, greatly increasing the likelihood that an unwanted item will be within one degree
of visual angle of a wanted relevant one, and hence disrupting focused attention on the latter.
4. Display miniaturization, such as a handheld display with a tiny screen, will have the same effect
(Stelzer & Wickens, 2006; Yeh, Merlo, et al., 2003). (By reducing text size, this will also
decrease visual resolution and make map reading hard in a way unrelated to clutter.)
5. Data base overlay (Kroft & Wickens, 2003). This change produces a complex set of effects that
we describe next.

2.4.2 DATA BASE OVERLAY An example of data base overlay is provided by comparing the two images in Figure
5.7, which represent an integrated hazard display for pilots (Kroft & Wickens, 2003). In the right panel, two separate maps are shown: traffic and air routes on the left, and terrain and weather on the right. In the left panel, the two maps are overlaid in an integrated display. The overlay clutter produced on the left is quite
apparent. The terrain features make the traffic information harder to read. The separate data base solution on
the right solves this problem.
Yet there are two obvious costs to the separate displays on the right given the same physical “display real
estate:”
1. The separated maps are of reduced size and lower resolution relative to the integrated map on the left,
leading to greater legibility problems.
2. Whenever judgments on the two data bases require integration, such overlay or close spatial proximity
is favored by the proximity compatibility principle, addressed in Chapter 3. The separated displays on
the right make it difficult to perform integration judgments like “How can I fly a safe route that avoids
both traffic and terrain?” (Kroft & Wickens, 2003).

FIGURE 5.7 Left: a map cluttered by data base overlay. Right: a version of the map in 5.7 (left), now decluttered by separating the air routes from the terrain data base. These two maps are half the size of their representation in 5.7 (left) because of the need to place them in the same display screen.

2.4.3 CLUTTER SOLUTIONS Just as we saw problems with chart junk in graph design (Chapter 4), it is important
not to put too much “stuff ” (i.e., extraneous information) on a map! Beyond this simple recommendation,
however, we present some more sophisticated solutions to map clutter, addressing each of the two clutter-cost
categories in turn.
• To address search or numerosity clutter, highlighting can be employed using pre-attentive feature
differences to segregate different aspects of the data; for example, two-level discrete category color
coding of high versus low altitude aircraft in an ATC display. Both color and intensity highlighting
can be successfully employed here (Yeh & Wickens, 2001a; Wickens, Alexander, et al., 2004; Nunes,
Wickens, & Yin, 2006; Remington et al., 2001), although color tends to be more effective. If the target
is known to be in one particular coding class, that class can be searched first; and if the feature
differentiating the searched class from the other classes is one that is pre-attentively processed
(Treisman, 1986, 1988; see Chapter 3), then the restricted set size of the targeted (and highlighted)
class allows the search to be carried out as if the other elements were not present at all. Even some
aspects of shape can serve as a pre-attentive filter: in Figure 5.7a, it is easy to see the difference
between the “line shapes” (air routes), and the “blob shapes” (weather patterns; Yeh & Wickens,
2001a).
• Both aspects of map clutter costs can be addressed by “decluttering” tools, in which a keystroke or two
can “hide” pre-designated aspects of the data bases (readout clutter) or sets of elements (search
clutter). While the benefits of decluttering to search and readout are evident, these may be offset by the
time (and added workload) required to use the keys appropriately, in a way that reduces any net benefit
of such techniques (Kroft & Wickens, 2003; Yeh & Wickens, 2001a). In addition, sequential displays (e.g., toggling between two data bases) often impose a working memory cost if their contents need to be compared, again consistent with the proximity compatibility principle. Furthermore, dynamic maps carry the danger that changes to a hidden element will go unnoticed, the so-called “out of sight, out of mind” phenomenon typified by change blindness (Chapter 3; Wickens, Alexander, et al., 2005b). Finally, we note that one person’s clutter may be another person’s information, a major concern for decluttering tools when a display is shared between users.

3. ENVIRONMENTAL DESIGN
The design of urban environments and large public buildings like hospitals or transportation stations has much
in common with the design of maps, particularly the 3D immersed view, discussed in 1.4 above, which
captures the essence of being inside a complex building. Such design is often challenged because designing
for effective navigation and understanding is sometimes at odds with the aesthetics of the creative architect
(Carlson, Holscher, et al., 2010). In Chapter 4 we talked of the compatibility between a user’s mental model
and a display representation. We identify below three prominent characteristics of people’s mental models of
3D environments that are important to consider when designing features of 3D environmental design.
Canonical orientation. Most 3D environments have a canonical or favored orientation: the direction of
the main entrance to a building or the view upon a city from a scenic lookout (Sholl, 1987). However, even in
environments within which one has often navigated, the canonical orientation is likely to be north-up
(Frankenstein, Mohler, et al., 2012).
Landmark prominence. Much of environmental learning is highlighted and facilitated by prominent
landmarks, which stand out from the surround because of their size and distinctiveness. These “anchors”
facilitate navigational performance and acquisition of landmark knowledge. Salience is not sufficient,
however, and when multiple landmarks are identical or highly similar, their use can be detrimental and cause
confusion, as will be discussed below.
Rectilinear normalization. People tend to think and spatially reason in the orthogonal “3D grid”
discussed as part of FORT theory at the beginning of this chapter. For example, directional judgments are
faster when aligned with compass headings (Maki, Maki, & Marsh, 1977), and people tend to “straighten”
curved features in their mental model (reconstructed from drawings), like a curved road or river (Milgram &
Jodelet, 1976). They tend to align an oblique (nonperpendicular) intersection with a square grid (Chase & Chi,
1979). The alignment tendency is so prominent that a large sample of residents and workers in downtown
Boston reconstructed the Boston Common (an asymmetrical pentagon) as a normal rectangle, eliminating the
fifth side. Another example of rectilinear normalization is applied to spatial reasoning of relative directions.
When people are asked to judge the position of Montreal relative to Seattle, they will often report Montreal to
be further north. (Seattle is actually further north.) Their reasoning is based on this simple grid-like logic:
Montreal is in Canada; Seattle is in the United States; Canada is north of the United States; therefore,
Montreal is north of Seattle.
It is noteworthy that this categorical topological reasoning is assumed to be more “primitive,” developing
earlier in human cognitive growth than is the more accurate spatial analog reasoning (Liben, 2009). We will
see a similar pattern when we consider how people store semantic knowledge of various types (in Chapter 7).
The above three characteristics dictate some principles of good environmental design. In practical terms,
they are most easily applied to the design of building interiors, because unlike a city, a single team of
contractors and architects is responsible for incorporating desirable cognitive features (see Carlson, Holscher,
et al., 2010).
• Landmark prominence and discriminability. As noted, 3D environments should contain landmarks
—not too many (or else these will lose their distinctiveness)—but also not too few. Ideally one
landmark should be visible from most places in the environment. Landmarks should be discriminable
from each other in some form, even while those sharing a common function (e.g., the lounge by a
stairwell on different floors) will share common visual features. In addition, three recommendations for landmark creation are:
1. Glass windows can support landmark use when a building lies within a geographical region characterized by some unique directional view (a mountain range, park, or body of water);
2. Landmarks are particularly valued on YAH maps;
3. There are considerable advantages to “intervisibility” whereby one landmark can be seen from
another (Carlson, Holscher, et al., 2010).

• Consistency of orientation. People have an expectation of consistency, particularly across floors of multi-level buildings. Visitors to one modern museum were quite troubled by a slightly altered angle of orientation of each subsequent floor relative to north (Carlson, Holscher, et al., 2010; Holscher et al., 2011).
• Consistency of elements or 3D structure is not the same as uniformity. Functionally important
differences need to be identified by prominent differences in environmental features. These may be
landmarks differing in appearance, as discussed above, or distinctive features like the different color of
an east wing and a west wing of a building.
• Consistency of rectilinear normalization can be achieved both internally (by designing 90-degree
corners and four-way intersections) or externally (orientation of building features with the external
grid of surrounding streets).
• It is important that functionality of design is compatible with the visitor’s task (Carlson, Holscher, et
al., 2010). As a negative example, building features that facilitate survey knowledge will not
necessarily facilitate the route knowledge important for navigation. A building’s 3D layout may be
visible and understandable from a viewpoint in the main lobby (good survey knowledge), but if there
is little indication of the location of stairwells, or if a visible escalator cannot readily access all visible
floors, navigation will be hindered.
• Individual differences in geographical knowledge should be accommodated. Most visitors to a
museum or hospital will have visited only a few times and will have only minimal route and
(particularly) survey knowledge; they will need to be better supported by features such as consistency.
This stands in contrast to long-term occupants of an office building.
In conclusion, it is interesting to note the commonalities of good cognitive human factors of
environmental design, and good human factors of workplace and display layout design, in terms of such
concepts as consistency and confusability. Cognition in large and small-scale spaces has much in common.

4. INFORMATION VISUALIZATION
In many ways, the topic of information visualization integrates the topics of graphs and 3D displays,
discussed in the previous chapter, and navigation and maps, which we have just discussed. The graph material
is directly relevant because the task of the visualizer is generally to integrate or “make sense” of a set of data
(usually numbers), but in contrast to graphs the amount of data in visualization is typically vast, well beyond
the 4 to 20 data points in a typical graph. It may consist of thousands or millions of connections between e-
mail users, or the temperature and humidity readings of thousands of 3D locations, or the spectral and
movement qualities of millions of stars. There is, of course, no sharp dichotomy between the domain of
graphs and that of visualization. Furthermore, the supervisors of complex systems for air traffic control,
process control, disaster management, or battlefield management can benefit from techniques of visualization,
capitalizing on properties of human visual perception in order to find data and draw insight from them.
The topics of navigation and spatial cognition are relevant here because the data being visualized
typically either have an intrinsic spatial structure (e.g., 3D weather patterns, stars in the universe) or can be
represented within some spatial context, like a three-way correlation scatter plot or a network or hierarchy
with nodes “close to,” “farther from,” “above,” or “below” each other. Furthermore, the space of visualized
information is often so large that it must be “explored,” like an unfamiliar city or mountain range. Hence the
metaphors of travel, orientation, and getting lost are quite relevant.

4.1 Tasks in Visualization


Vincow and Wickens (1998) and McCormick and Wickens (1998) have identified three broad task categories
used by the visualizer of a large data base.
1. Search tasks involve finding one particular entity within the data base, like a single file in a large
computer.
2. Comparison tasks involve integrating or otherwise comparing a small set of entities, such as
determining the change in concentration of a pollutant at a given location across time.
3. Insight (North, 2006; Robertson, Czerwinski, et al., 2009; Brown & Gallimore, 1995) or
“sensemaking” (Klein, Moon, & Hoffman, 2006), in which the data are examined in order to discover
relationships not previously known. Such is the task that dominates the subfield of scientific
visualization (Card, Mackinlay, & Shneiderman, 1999). This last category is challenging to model
and assess but is one of the greatest uses of visualization.
Task categories 2 and 3 require the integration of information, and hence the proximity compatibility
principle discussed in Chapter 3 and 4 is again relevant (Robertson, Czerwinski, et al., 2009).

4.2 Principles of Visualization


Integrating some of the limited amount of empirical research on visualization, (e.g., Chen & Czerwisnski,
2000; Shneiderman & Plaisant, 2005), with principles extrapolated from other research on complex system
display (Smith, Bennett, & Stone, 2006), and with the creative thinking of key systems developers in the field
(e.g., Tufte, 2001; Robertson, Czerwinski, et al., 2009; North, 2006; Card, Mackinlay, & Shneiderman, 1999;
Ware, 2005), it is possible to identify a set of human factors principles and challenges for designing useful
visualization tools. We describe each of these in detail in the next several sections.

4.2.1 COMPATIBLE MAPPING OF DIMENSIONS As discussed in Chapters 2 and 4, certain conceptual dimensions
have a more natural or compatible mapping to rendered dimensions than others. Data can be qualitative,
ordered, or quantitative in form (Stevens, 1946). For the purposes of display design, these different data types
are best represented by different visual variables (Bertin, 1983; Upton & Doherty, 2007). Brightness or
texture, for example, is more compatibly mapped to continuous variables with a clear “greater” and “lesser”
aspect than is color hue (Bertin, 1983; Merwin & Wickens, 1993). This is because it is not clear which end of
the color spectrum means “greater” or “less”; and color hue has more of a qualitative aspect, with strong
stereotypes (Spence & Efendov, 2001; Merwin, Wickens, & Vincow, 1994). In contrast, one might use shape
or color to distinguish different nominal categories on a display. As we know from studies of visual search
(Chapter 3), objects having identical colors tend to be associated together, even when they are spatially
separated. Furthermore, a unique color tends to stand out. It is also the case that space is compatibly mapped
to space so that visualization of geographic areas (e.g., a pollution map) is best accomplished when the
dimensions of the displayed space correspond to the dimensions of the geographic space being represented.
Figure 5.8 shows a chart crossing the different data types with the particular visual variables that can be
used to denote values on a two-dimensional display surface. Whereas all visual variables can be used to group
or associate different nominal categories, fewer are appropriate for ordinal variables, and only the two spatial
dimensions, size, and brightness are recommended for quantitative variables.

FIGURE 5.8 Relationship between data representation and display representation. “Yes” indicates a good map. Source: Redrawn and modified
from Bertin (1983).
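As a concrete illustration, the Bertin-style recommendations summarized above can be captured in a simple lookup table. The Python sketch below is only an approximation of Figure 5.8 (the exact cell entries of the figure are not reproduced here, and the set of visual variables is abbreviated), but it shows how a designer might encode such guidance when checking a proposed mapping.

recommended_variables = {
    # Any visual variable can be used to group or associate nominal categories.
    "nominal": {"position", "size", "brightness", "texture", "color hue", "shape"},
    # Fewer variables preserve an ordered "greater/lesser" reading.
    "ordinal": {"position", "size", "brightness", "texture"},
    # Only spatial position, size, and brightness are recommended for quantities.
    "quantitative": {"position", "size", "brightness"},
}

def suitable(data_type, visual_variable):
    """Return True if the visual variable is recommended for the data type."""
    return visual_variable in recommended_variables[data_type]

print(suitable("quantitative", "color hue"))  # False: hue has no clear "more/less" end
print(suitable("nominal", "color hue"))       # True: hue separates categories well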

Since visualization involves data points of three or more conceptual dimensions, a natural choice is the
use of three-dimensional Euclidean space displays. However, caution should be exercised due to the problems
of 3D line-of-sight ambiguity discussed above (Wickens, Merwin, & Lin, 1994) and in Chapter 4. Another
important concept is time. Time, like space, is compatibly mapped to display dimensions, often advancing
from left (past) to right (future). However, time can often be directly mapped to display time via animation
(Robertson, Czerwinski et al., 2009). Here a display changes in real time, or time is “dragged” forward or
backward by moving a slider, an issue we discuss below.

4.2.2 COMPATIBLE MAPPING OF DATA STRUCTURE Beyond data type, we can also distinguish between four main
categories of data structure, as shown in Figure 5.9 (Durding, Becker, & Gould, 1977). Tabular data (a) have
categorical attributes determined by their column and row, as seen for example with spreadsheets.
Dimensional data (b) are more characteristic of many graphs, as axes tend to have continuous ratio or at least
ordinal scales. Dimensional data also include maps, such as a 3D temperature map. Network data (c) consist
of nodes connected by links, such as a communications diagram of who talks to whom. Finally, hierarchical
data (d) are a form of network data clearly defined by a hierarchy such that a few “higher” nodes link to a
greater number of lower level nodes.
Certain kinds of display organization are more compatibly mapped to each of these data classes than
others. For example, it makes sense to represent a network in terms of visual nodes and links (the letters
connected by lines in Figure 5.9c, top) rather than force it into a tabular format. However, there may be
circumstances in which insight into a network can be gained by examining it in tabular form, where for
example a cell represents a node, and adjacent cells are linked. This is shown in Figure 5.9c, bottom. Figure
5.9(d) also shows two alternative renderings of hierarchical data.
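The contrast between the link and tabular renderings of a network can be made concrete with a small example. The Python sketch below builds the same invented network twice: once as a node-link (adjacency list) structure and once as a tabular adjacency matrix; the node names and links are illustrative only and do not reproduce the network of Figure 5.9.

# The same small, invented network in two forms.
links = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "F")]
nodes = sorted({n for link in links for n in link})

# Node-link form: each node maps to the nodes it is directly linked to.
adjacency = {n: [] for n in nodes}
for a, b in links:
    adjacency[a].append(b)
    adjacency[b].append(a)

# Tabular form: a cell holds 1 when its row node and column node are linked.
matrix = [[1 if (r, c) in links or (c, r) in links else 0 for c in nodes]
          for r in nodes]

print(adjacency)   # F is reachable from A only through C, i.e., it is more "distant"
for label, row in zip(nodes, matrix):
    print(label, row)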

FIGURE 5.9 Four categories of data base structure for information visualization: (a) tabular (b) dimensional (maps) (c) networks shown in
either link form (above) or tabular form (below) (d) hierarchical networks shown in either link form (above) or Venn form (below). In the
network (c), the item F, which shares no attribute with the item A, is connected to A by two links. That is, it is more “distant.”

In summary, these ideas reflect an additional type of compatibility that is important for the display
designer to consider: data type compatibility (DTC). DTC reflects a compatibility between the display
representation and characteristics of the data being visualized. We have augmented a figure we used in
Chapter 4 with a new data representation box and linked it to the user’s internal representation (mental model)
(Figure 5.10). We note also that the data representation may be influenced by other representations, including
the physical and task representations (shown by links in Figure 5.10).
Of course there is no absolute immutable assignment of rendered variables to rendering dimensions, and
sometimes great insight can be achieved if dimensional assignments are swapped to provide alternative ways
of looking at data. A classic example is the parallel coordinate graph (Inselberg, 1999) shown in Figure
5.11. On the top, in a 3D graph, a 3D data point can either be represented in conventional Euclidean space
(left), or as a line drawn to points along three parallel axes, the parallel coordinate graph. Different features
emerge from these two representations: a single location in space versus a “profile.” On the bottom, the
potential advantage of parallel coordinates is shown when the number of dimensions for each data point
grows beyond the three that can easily be represented in Euclidean space and when two data points are presented.
Here the six dimensional values of the two data points can be easily visualized, and the similarity between
these profiles can be easily perceived; that is, they are close together in 6D space. The importance of
flexibility of views is related to our discussion earlier in the chapter about the importance of dual maps with
different frames of reference and is also related to our next principle of visualization.
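The following Python sketch (using matplotlib, with invented values assumed to be already rescaled to a common 0 to 1 range) draws two six-dimensional data points as profiles across parallel axes, in the spirit of Figure 5.11; similar profiles appear as lines that track one another closely.

import matplotlib.pyplot as plt

# Two hypothetical six-dimensional data points, pre-scaled to the 0-1 range.
points = {
    "point 1": [0.2, 0.8, 0.5, 0.6, 0.3, 0.9],
    "point 2": [0.25, 0.75, 0.55, 0.5, 0.35, 0.85],
}
dims = ["D1", "D2", "D3", "D4", "D5", "D6"]

fig, ax = plt.subplots()
for label, values in points.items():
    # Each data point becomes a profile line running across the parallel axes.
    ax.plot(range(len(dims)), values, marker="o", label=label)

ax.set_xticks(range(len(dims)))
ax.set_xticklabels(dims)
ax.set_ylabel("Normalized value")
ax.legend()
plt.show()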

FIGURE 5.10 Data type compatibility. We have augmented Figures from Chapter 4 with a data representation (on the lower right), and have
linked it to the display representation. When the mapping between the data representation and the display representation is good, we say there is data type compatibility (DTC).
PCP = proximity compatibility principle. DC = display compatibility. EC = ecological compatibility.

FIGURE 5.11 Parallel coordinate graphs. A data point is shown on the left in a 3D spatial graph. On the right, the representation of this same
data point is now the line running through the three (now) parallel axes. On the bottom are shown two “data points” from a six-dimensional data base in parallel coordinate form. Source: After Inselberg, 1999.

4.2.3 MULTIPLE VIEWS There is a consensus in visualization research that multiple views offer many advantages
and should generally include both a global (zoomed out) view with a stable world frame of reference, and one
or more local (zoomed in) views. Such a consensus is consistent with Shneiderman and Plaisant’s (2005)
advice that the best approach for designing visualization tools is to allow the user to: “Overview first, then
zoom and filter, then details on demand.” We will unpack this important sequence in the following sections,
but we note here that the initial global overview provides spatial stability in understanding the data structure
(Robertson, Czerwinski, et al., 2009) as well as a context in which subsequent local views can be examined:
like a large-scale map, this stable context can prevent a user from “getting lost” in the data.
Just as an initial overview is highly desirable (North, 2006) there is also merit in preserving this
contextual display throughout the subsequent phases of visualization (zoom & filter; Risden, Czerwinski,
Mayer, & Cook, 2000). This will help prevent the keyhole phenomenon discussed earlier that becomes
prevalent when, for example, one is scrolling a list (Robertson, Czerwinski, et al., 2009). Part of the
continuous context awareness can be achieved by retaining the large-scale view, akin to the dual maps
discussed earlier in the chapter. Visual momentum can be preserved by highlighting the current location in the
small scale world view, as zooming and exploration are conducted in the local view, as shown by the two
examples in Figure 5.12.
Yet another way of preserving context and preventing lostness is through the fisheye view (Furnas, 1986;
Sarkar & Brown, 1994). A fisheye view expands and displays in full detail information concerning a specific
item of interest, but provides progressively less information about items as their distance from the item of
interest increases. The fisheye view appears to be an effective representation for a variety of tasks (e.g.,
estimating best routes in a network, Hollands, Carey, et al., 1989; displaying aircraft maintenance data, Mitta
& Gunning, 1993), showing Java source code (Jakobsen & Hornbaek, 2006), or fitting web pages onto mobile
devices with small screens (Gutwin & Fedak, 2004).
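Furnas’s (1986) fisheye idea rests on a degree-of-interest (DOI) function in which an item’s displayed detail grows with its a priori importance and shrinks with its distance from the current focus. The Python sketch below is a minimal illustration of that idea; the items, importance values, distances, and display threshold are all invented for the example.

def degree_of_interest(api, distance_from_focus):
    """DOI = a priori importance (API) minus distance from the focus item."""
    return api - distance_from_focus

items = {                  # item: (a priori importance, distance from current focus)
    "focus item": (5, 0),
    "near neighbor": (3, 1),
    "peripheral item": (4, 6),
}

for name, (api, dist) in items.items():
    doi = degree_of_interest(api, dist)
    # An arbitrary threshold decides how much detail the item receives on screen.
    detail = "full detail" if doi >= 2 else "abbreviated or hidden"
    print(f"{name}: DOI = {doi} -> {detail}")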

FIGURE 5.12 Two illustrations of visual momentum. Top: the flashing star on the right, shown within the context-setting global map of a full menu, marks the particular page being viewed in the local view on the left. On the bottom, the small map at the upper left depicts the context within which the local-view triangular area, shown on the right, is situated.

4.2.4 INTERACTION Whether supported or not by context, users often “visit” or “travel” to different parts of the
data base, as when “zooming.” Such travel can be carried out in a variety of ways. Direct travel involves
“flying” through the data base with a joystick or other control device, much like the tracking task we will
discuss later in this chapter (Section 6). In contrast, indirect travel can be accomplished by a point-and-click
system where targeted data base areas are expanded to provide “details on demand” in an attached pop up
window. There is little conclusive empirical evidence that one approach is superior to the other (North, 2006);
however, there is a danger in providing too much interactivity. “Flying” in 3D space is not a skill natural to
human evolution, and if three axes of travel are added to three axes of viewing orientation, the six-axis control
problem can become complex (and possibly unstable). As we discuss in section 6, stability in such flight
depends upon avoiding a control gain that is too high (leading to overshoots in getting to one’s goal) or too
low (leading to long delays in traveling). For these reasons, there are sometimes advantages for discrete point
and click systems when three (or higher) dimensional data bases are involved.
One simple form of interaction that has proven successful is “brushing” (Becker & Cleveland, 1987), in
which a single dimension of the data base is “traveled” at one time. For example, Figure 5.13 (top) depicts a
pollution map of a hypothetical “square-state.” It is a four-dimensional data base in that the two spatial
dimensions are augmented by color hue (type of pollution) and saturation (outbreak intensity). A fifth
dimension (time) can be added by a “brushing” interaction whereby a slider is moved along a timeline. The
slider movement will paint different regions according to the type and intensity of pollution during the year in
question. Thus, traveling one dimension at a time, in the constrained fashion imposed by the linear slider, can
help to prevent spatial orientation or control problems that might be imposed by “flying” a time axis.
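A minimal interactive sketch of this kind of brushing, written with matplotlib’s Slider widget and entirely invented data, is shown below: dragging the slider “travels” only the time dimension, repainting the spatial map for the selected year.

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider

# Invented intensity values on a 10 x 10 spatial grid for each of 11 years.
rng = np.random.default_rng(0)
years = np.arange(2000, 2011)
intensity = rng.random((len(years), 10, 10))

fig, ax = plt.subplots()
plt.subplots_adjust(bottom=0.2)            # leave room for the slider
image = ax.imshow(intensity[0], vmin=0, vmax=1)
ax.set_title(f"Pollution intensity, {years[0]}")

slider_ax = fig.add_axes([0.2, 0.05, 0.6, 0.04])
year_slider = Slider(slider_ax, "Year", years[0], years[-1],
                     valinit=years[0], valstep=1)

def update(_value):
    # The slider constrains travel to the single time dimension.
    idx = int(year_slider.val) - years[0]
    image.set_data(intensity[idx])
    ax.set_title(f"Pollution intensity, {int(year_slider.val)}")
    fig.canvas.draw_idle()

year_slider.on_changed(update)
plt.show()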

FIGURE 5.13 Illustrates visualization interactivity and proximity compatibility. Top 2 panels: interactive slider “brushing” time. The two
views of Colorado represent time 10 years apart. Bottom: a means of creating proximity compatibility to compare pollution with possible
diseases caused. The single slider will make correlated changes in the two variables evident by shared intensity and color changes. Note the
increase in both variables in the northeast corner of the State.

A third form of interaction involves re-arranging different views or re-mapping different axes (North,
2006). For example, as we have noted, manual interaction or toggling can restructure a graph from its
conventional format to a parallel coordinate format (see Figure 5.11), because sometimes insight is gained
differentially in different views.
Finally, interaction is generally necessary to carry out the “filter” aspects of the action sequence proposed
by Shneiderman and Plaisant (2006). Such filtering may involve tailoring by asking Boolean queries about the
data (“show me only CO2 pollution”) or may be accomplished by various decluttering techniques as discussed
earlier in the chapter (e.g., highlighting the relevant portions).

4.2.5 PROXIMITY COMPATIBILITY As noted at the outset of this section, both comparison and insight tasks involve
information integration, across multiple parts of the data base. As in simpler domains, such integration is
fostered by display proximity (Robertson, Czerwinski, et al., 2009; Liu & Wickens, 1992b) that makes to-be-
compared features more similar (closer spatial or object-based proximity). However, in visualization it is not
always apparent a priori which aspects of the data need to be compared or integrated. Indeed, if it were clear to
the visualization tool designer, then insight could be said to have already occurred! Nevertheless there are
ways in which display proximity can be created to foster integration in such tasks, even without knowing what
needs to be integrated.
First, for multiple points in any sort of 3D graph or data base, “mesh” can be used to connect the points,
revealing the trends of the surface (Liu & Wickens, 1992). In this way, the mesh operates analogously to the
way that a line connecting points in 2D line graphs aids trend identification and creates emergent features, as
discussed in Chapter 4.
Second, just as we have emphasized the need for multiple views, there is often a need for the user to
integrate the representation of spatially separated elements across the views. Physical features can foster this
mental integration. In Chapter 3 (section 3.5), we saw how the use of a common color could accomplish this
objective. As another example, synchronous change in different spatial locations facilitates integration
because of the high sensitivity of the visual system to such changes (Meortl et al., 2012). This is illustrated in
the lower two images of Figure 5.13, which depicts the same square state as the top two images, but now with
the frequency (intensity) of different diseases (color) illustrated. Now a single time slider can reveal potential
correlations of pollution (top images) and disease (bottom images) over time that highlight a likely causal
relation between them. For example, increases in intensity occurring at the same location, the northeast corner
of the State, might occur in both maps as the slider is moved.
Finally, as noted before, designating a common element, the ego-location within a local and global view
by some shared feature cognitively links the two displays. In this way, it demonstrates both proximity
compatibility and visual momentum.

4.2.6 ANIMATION The use of animation in visualization displays, described in some detail by Robertson, Czerwinski, et al. (2009), has met with only mixed success. (This concern is distinct from the helpful role of motion
in dynamic vehicle displays as in Chapter 4.) On the one hand, there is little evidence that the animation of
movement per-se (e.g., a streaming, flowing river) is any more effective than a simple static arrow (whose

140
direction and length can convey most necessary information) (Tversky Morrison & Betrancourt, 2002). This
negative finding has parallels in the questionable value of animation in instructional programs (Mayer, 2009;
see Chapter 7). On the other hand, however, animation can help in viewpoint switching (Hollands, Pavlovic,
et al., 2008), as discussed in the next section, and interactive animation, such as that accomplished through
brushing as discussed above, can be extremely valuable. Within this framework, interactive animation along
the time axis is analogous to single dimensional travel.

5. VISUAL MOMENTUM
As we noted in the previous sections, people become disoriented as they navigate within large spaces, whether
real, synthetic, or information. The concept of visual momentum represents an engineering design solution to
the problem of becoming cognitively lost as the user traverses through multiple displays pertaining to
different aspects of the same system or data base (Watts-Perotti & Woods, 1999; Woods, 1984; Woods,
Patterson, & Roth, 2002; Bennett & Flach, 2012). The concept was originally borrowed from film editors, as a
technique to give the viewer an understanding of how successively viewed film cuts relate to one another
(Hochberg & Brooks, 1978; Wise & Debons, 1987). When applied to the viewing of successive display
frames, either of virtual space (e.g., maps) or of conceptual space (e.g., topologically-related components in a
process control plant, nodes in a menu or data base, or graphic representations of data), visual momentum
could be created by following four basic guidelines.
1. Use consistent representations. As noted in earlier discussions of graphs (Chapter 4), it is important to
keep display elements consistent across displays, unless there is an explicit rationale for a change.
However, when it is necessary to show new data or a new representation of previously viewed data,
display features should show the relationship of the new data to the old. The next three guidelines
indicate how this may be accomplished.
2. Use graceful transitions. When changes in representation will be made over time, abrupt
discontinuities may be disorienting. On an electronic map, for example, the transition from a small-
scale, wide-angle map to a large-scale close-up will be cognitively less disorienting if this change is
made by a rapid but continuous blowup, or at least presentation of a few intermediate frames. Two examples are animating the enlargement of a web page when selected (Bederson et al., 1998) and rotating a network to bring a selected node closer to the observer (Robertson, Card, & Mackinlay, 1993). In 3D environments, animating the
switch between viewpoints has been shown to be helpful. Hollands, Pavlovic, et al. (2008) found that a
smooth rotation between a 2D and 3D view of the same terrain supported better spatial decision
making than did an abrupt switch between views. Similarly, Keillor, Trinh, et al. (2007) found that
smooth rotation between two task-relevant viewpoints on terrain was superior to a discrete shift in
viewpoint.
3. Highlight anchors. An anchor may be described as an invariant feature of the displayed “world,”
whose identity and location is prominently highlighted on successive displays. For example, in aircraft
attitude displays that might be viewed successively in various orientations, the direction of the vertical
(or horizon) should always be prominently highlighted. In map displays, which may be reconfigured
from inside-out to outside-in to accommodate different task demands, a salient and consistent color
code might highlight both the northerly direction and the heading direction (Andre, Wickens, et al.,
1991). As noted earlier, Aretz (1991) and Olmos, Liang, and Wickens (1997) successfully used an
anchor by rendering the angle subtended by the forward field of view on a top-down north-up map
(Figure 5.5). In displays designed for examining the components of a complex chemical or electrical
process, the direction of causal flow (input-output) could be prominently highlighted. In the YAH map
(Figure 5.3) a prominent landmark highlighted in the map serves as a useful anchor.
A corollary principle is that when successive display frames are introduced over time, each new
frame should include overlapping areas or features with the previous frame (e.g., in computer menu
navigation, Tang, 2001), and these common landmarks should be prominently highlighted (here again,
color is an effective highlight). Visual momentum anchors are also beneficial for teams, working with
separate maps, to highlight shared reference points between the teams (van der Kleij & Brake, 2010).
4. Display continuous world maps. Here we refer to a continuously viewable map of the “world,” always presented from a fixed perspective, as discussed for the context-setting dual map design in section 1.5.1 (see
Figure 5.12). Within this map the current identity of the active display is always highlighted. This is a
feature of the topographic maps produced by the U.S. Geological Survey, in which a small map of the
state is always viewable in the upper left-hand corner, with the currently displayed quadrant
highlighted in black.

6. TRACKING, TRAVEL, AND CONTINUOUS MANUAL CONTROL


Two sections of this chapter have described the cognitive aspects of navigation, and the perceptual and
cognitive aspects of understanding large scale spatial data bases. In both cases, physical or virtual travel has
been a necessary component. In this section we describe the perceptual-motor components of such travel,
whether driving a car, flying a plane, flying through a 3D visualized data base, moving through a virtual world
(see section 7), or moving a probe along a blood vessel in endoscopic surgery. Common to all of these are
intermittent decisions of which way to move, guided jointly by pursuing goals (target) of the task and
avoiding obstacles or compensating for disturbances in the environment. Sometimes the decisions are made only periodically, each followed by a simple linear trajectory. For example, when I am text editing with a mouse I
will only periodically move the cursor, on a straight line, to the word I wish to edit. However, sometimes
control decisions are nearly continuous, as when we are driving a car down a curvy highway on a windy day.
These two examples represent different forms of the tracking task, used to represent manual control
(Jagacinski & Flach, 2003; Wickens, 1986; Wickens & Hollands, 2000). In the tracking task, a control device
(the mouse, or steering wheel) is used to maneuver the system output, often represented as a “cursor” (the
blip on the computer screen, or the visualized heading of the car) through space, over time, to reach a target
(the desired word on the computer screen, or a middle-of-the-lane car location). In other words, the goal is to
reduce the error between the target and the cursor (system output). You can imagine a loop connecting the
human operator, the control device, the system being controlled, and a display (either the real world or an
artificial display) that allows the human operator to see the current system state (and the results of their
actions). When the human continuously monitors the system output and adjusts in response to the feedback
provided, this is called closed-loop tracking.
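The closed loop described here can be sketched in a few lines of Python. In this minimal illustration (not a validated model of human tracking), the “operator” samples the displayed error on each cycle and moves a zero-order (position) control by a fixed fraction of that error; the target, gain, and number of cycles are arbitrary.

target = 10.0   # desired system state (e.g., centre of the lane)
cursor = 0.0    # current system output shown on the display
gain = 0.4      # fraction of the perceived error corrected on each cycle

for step in range(10):
    error = target - cursor      # what the operator perceives on the display
    control = gain * error       # control movement chosen to reduce the error
    cursor += control            # zero-order dynamics: position in, position out
    print(f"step {step}: error = {error:.2f}, cursor = {cursor:.2f}")
# Because each correction is based on freshly displayed feedback, the error
# shrinks on every cycle; this is what makes the loop "closed."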
During this process two things can happen that create error and thus force decisions to be made to change
the trajectory. First, the target may move. This is what happens when the car encounters a curve: the desired
position bends to the left or right. Second, a disturbance may deviate the cursor from the desired trajectory.
One may overshoot the desired word with the mouse, or wind gusts may blow the car from its desired
heading. Finally, the relationship between how a control is moved, and how the cursor moves—the system
dynamics—can vary greatly with the complexity of the tracking system, in a manner elaborated below. As
three contrasting examples, the cursor typically responds to the mouse movement in a very straightforward
way. But the car changes its location on the lane in a more complex fashion, and the relation between a pilot’s
stick movement and the 3D position of the aircraft in the sky is still more complex.

6.1 Tracking to a Fixed Target


When the target is fixed in space, like the position of a word on a screen to be edited, then movement of the
cursor to the target follows a well-established law in human performance known as Fitts’ Law (Fitts, 1954;
Card, English, & Burr, 1978). This law predicts that movement time is proportional to both the distance
traveled and the required precision of the target (smaller target, greater precision). More specifically, there is a
linear relation between movement time and the Index of Difficulty (ID) of the movement, such that:
ID = Log2(2A/w),

where A is the amplitude of the movement and w is the width of the target. This law can be used effectively to
predict values like the time required to move a mouse to a “button” on a computer screen as a function of the
button’s size or to predict the time to move the foot to pedals of different sizes and separation (Drury, 1975).
More information regarding movement precision is discussed in Chapter 9 (see also Peacock, 2009).
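A minimal Python sketch of a Fitts’ law prediction is given below. The intercept a and slope b are empirical constants that must be estimated for a particular device and task; the values used here are purely illustrative, not published parameters.

import math

def index_of_difficulty(amplitude, width):
    """ID = log2(2A / W), as defined in the text."""
    return math.log2(2 * amplitude / width)

def movement_time(amplitude, width, a=0.10, b=0.15):
    """Predicted movement time (seconds): MT = a + b * ID (illustrative a, b)."""
    return a + b * index_of_difficulty(amplitude, width)

# Example: moving 20 cm to a 2 cm wide button versus a 0.5 cm wide button.
print(round(movement_time(20, 2.0), 2))   # larger target: lower ID, shorter time
print(round(movement_time(20, 0.5), 2))   # smaller target: higher ID, longer time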

6.2 Tracking a Moving Target


In many situations people need to track a target that changes over time, such as when the road curves while
driving, or the blood vessel bends during endoscopic surgery, when the plane is blown off heading, or when
the wide receiver (the target of the football quarterback about to throw the pass) is in continuous motion. Fitts’
law can be applied when the target moves, but there are several additional factors that influence the difficulty
of human performance in such tasks. In the following section we discuss those factors and illustrate them with
examples.

6.3 What Makes Tracking Difficult?

1. Bandwidth. Increasing the frequency with which the target moves and required trajectory decisions are
made imposes added demands in tracking, and usually increases tracking error (the net deviation between
target and cursor). This is known as the bandwidth of the tracking input. When driving a car down a curvy
road, for example, increasing the speed of travel will increase the bandwidth. Bandwidth in tracking is often
measured as the highest frequency (in cycles per second, or Hz) present in the tracking input.
2. Gain. The gain of a tracking task is defined as the ratio of movement of the cursor (system output) to
movement of the control that produced it. When we use a touch screen, as on an iPad, our finger is both a
control and a cursor, and so the gain is naturally =1. When designing a mouse for a computer screen, gain can
be set to any level, but a value between 1 and 3 is recommended as optimal (Baber, 1997). The steering gain
of a sports car is typically high, so a large angle turn can be accomplished by a fairly small turn of the steering
wheel, while the gain of the large truck is lower. The relation between gain and tracking difficulty is that of a
U-shaped function, with penalties when gain is too low or too high (Wickens, 1986). When gain is too low,
the control dynamics are effortful: a lot of control movement is necessary to change course (imagine spinning
the steering wheel rapidly with little change in your car’s direction). When the gain is too high, the control
system is too sensitive (imagine any tiny deviation in steering wheel angle leading to a massive change in
direction). With high gain the system becomes unstable, with overshoots as the target is approached. High
gain interacts with time-delay or system lag, as we describe below.
3. System Lag. In any control system, lag refers to the delay between when the control is moved and
when the cursor moves in response. Lag has two forms: transmission lag and control order:
• Transmission lag occurs when there is a direct delay in the signal from the control reaching the
cursor. An extreme example is controlling a lunar vehicle from a workstation on the earth. When the
steering control is exercised, there will be a delay of several seconds before the vehicle starts to
turn, and a delay of twice that time before the operator can see its movement change on an earth-
located display. Closer to home, we can describe a transmission lag as the delay between pressing
the accelerator of a car and the forward acceleration, as we try to maintain a constant headway to the
car in front of us.
• Control Order. As shown in Figure 5.14, control order describes the way that the system responds
to a direct change of position of the control. In zero order or position control, like the mouse, a
change in control position produces a change in cursor position. In first order or velocity control,
like some analog radio station tuners, a change in control position produces a constant velocity of
the cursor. The relationship between steering wheel position and vehicle heading is a first order
control, since a given constant deflection of the wheel will cause the car to change heading at a
constant rate (defined by the radius of the curve). The greater the steering wheel deflection, the
sharper the turn. In second order, or acceleration control, describing the relationship between
steering wheel position and the car’s lateral position in the lane, a constant change in control
position causes an acceleration (increasing rate of change) of cursor position over time. As shown in
the Figure, increasing control order causes an increased lag between control deflection and system
(cursor) output.
Whichever the source of lag (transmission, control order, or both), longer-lagged systems,
said to be more “sluggish,” are harder to control because of the cognitive demands required to
anticipate where the system (cursor) will be in the future (Wickens, 1986). In particular, with very
long lags, closed-loop tracking is no longer possible. For a novice, such systems are difficult, if not
impossible to control. However, experienced users can engage in open loop tracking. This means
that when they perceive an error, they know the amount of control necessary to eliminate it and
impose that control with confidence that it was correct. Thus, they need not continuously sample the
delayed error feedback as is required with closed loop tracking.

FIGURE 5.14 Control dynamics and control order in tracking. The figure depicts four time-graphs. At the top, the control is suddenly displaced. The second row represents the resulting output of a position (zero-order) controller. (Note that the gain here is about 1.8.) The third row shows the resulting output of velocity (first-order) dynamics, and the fourth row the resulting output of acceleration (second-order) dynamics. The arrows represent the increasing lag, for the system output to change its position, as the control order increases.

4. Instability. Together, gain and lag combine to determine system stability (or its converse, instability).
If the system has a high gain (small movement produces a large correction), but also a high lag, then the
tracker will not see the resulting large cursor change from an error-reducing control movement until it is “too late,” causing an overshoot. Once seeing the overshoot, the tracker will correct in the opposite direction, but here again the delay will cause an overshoot in that opposite direction, ultimately causing an oscillation around the desired target position (a simple simulation of this gain-lag interaction is sketched after this list). In aircraft with high gain and long lags, this sometimes-observed behavior is called a pilot-induced oscillation. We will see this form of instability again when we discuss the concept of adaptive automation in Chapter 12.
5. Prediction. Of necessity, some tracking systems have long lags given their physical or thermodynamic
characteristics. This occurs in energy and chemical processes. When the human operator applies heat (the
control input) to a large vat of material, there will be a long lag until a change in temperature (the system
output) is observed. When there are lags, in order to obtain stable tracking and avoid overshoot, you would
like to anticipate and start correcting an error in advance, so the correction will be realized in system output
(after the lag) when it is desired. Thus we need to anticipate or predict future inputs by whatever means is
possible. Across dynamic systems, the longer the lag, the greater is the need for anticipation. When driving a
large ship, there will be a lag of several minutes between changing the helm control and the change of heading
of the ship (van Breda, 1999), as sadly revealed to the captain of the Titanic.
Mental prediction, whether of the future inputs or the future system output, is hard. People (particularly
novices) don’t do it well, and such prediction imposes high mental workload (Wickens, 1986; Wickens,
Gempler, & Morphew, 1999). Hence an extremely valuable tool in tracking systems is the development of
predictor displays, in which automation makes an inference about future target position and future system
output (cursor position) and portrays this graphically along a time axis. Figure 5.15 provides two examples
(Wickens, 1986; van Breda, 1999; Roth & Woods, 1992). Of course such prediction is limited by the
reliability of the automation; in most circumstances the reliability of prediction degrades with the span of
prediction or look-ahead time (Wickens, 1986; Xu, Wickens, & Rantanen, 2007). We will discuss the
implications of human predictive limits in Chapter 7 (level 3 situation awareness) and Chapter 8 (predictive
inferences).
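The sketch below (referred to in point 4 above) is a minimal, purely illustrative Python simulation of how gain and lag combine to determine stability. The simulated system has first-order (velocity) dynamics plus a transmission delay, and the “operator” simply issues corrections proportional to the currently displayed error; none of the parameter values are meant to describe any real vehicle or controller.

def track(gain, delay_steps, n_steps=40, dt=0.1, target=1.0):
    """Closed-loop tracking of a step change in the target, with lagged control."""
    position = 0.0
    pending = [0.0] * delay_steps       # corrections still "in transit" (the lag)
    history = []
    for _ in range(n_steps):
        error = target - position
        pending.append(gain * error)    # correction issued now...
        velocity = pending.pop(0)       # ...takes effect only after the delay
        position += velocity * dt       # first-order dynamics: control sets velocity
        history.append(position)
    return history

well_behaved = track(gain=1.0, delay_steps=4)
unstable = track(gain=5.0, delay_steps=4)
print(round(max(well_behaved), 2))   # closes on the 1.0 target with little overshoot
print(round(max(unstable), 2))       # same lag, higher gain: overshoot and oscillation

With the same delay, raising only the gain turns a smooth approach to the target into the overshooting, oscillating behavior described above.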

6.4 Multi-Axis Tracking and Control


Tracking demands information processing. Hence it is not surprising that doubling the number of variables
being tracked simultaneously will increase mental demand, and cause interference (see Chapter 10). In the
natural world, there are many examples of multi-axis tracking. The driver on the busy freeway is tracking both
laterally (lane keeping) and longitudinally (headway tracking relative to the vehicle in front). In our discussion
of visualization, we hinted at the challenges of multi-axis tracking when people were given several degrees of
freedom to control “flying through” an information space.
Controlling more than one axis of a single entity (e.g., one aircraft, one vehicle, one endoscopic probe)
can be difficult, but the difficulty does not generally grow linearly with the number of axes. For example, it is
not much harder to move a mouse in two axes across the page than in one axis along a line of text. However,
control of multiple entities (as opposed to multiple axes) is almost always much harder than controlling a
single entity such as an unmanned vehicle (UV) (Dixon, Wickens, & Chang, 2005). Here, difficulty generally
grows rapidly with the number of entities (Cummings & Guerlaine, 2007). This increase in difficulty is very
much in evidence with the control of multiple robots, ground or air vehicles (Cummings, Bruni, & Mitchell,
2010; Nehme et al., 2010; Dixon, Wickens, & Chang, 2005). Continuous control (tracking) of each entity
becomes very difficult.

FIGURE 5.15 Predictive Displays. The upper panel shows a typical predictor of temperature for process control in the application of heat to a
furnace. The bottom panel shows the predictor display in the highway in the sky (HITS) discussed in Chapter 4. The tunnel is a preview of where
the aircraft should fly. The small aircraft symbol is a 3D (perspective) prediction of where the aircraft will be about 5 seconds into the future.
Hence 5 seconds is the span of prediction or the look-ahead time.

To compensate, various forms of automation are used. Here the human operator provides supervision or
oversight of the automated processes rather than being engaged in continuous control (Sheridan &
Parasuraman, 2006). The robot or UV automation can engage in simple behavior, like tracking a straight line
path or circular “loiter pattern” and automatically correcting for disturbances. To provide oversight, the
human operator needs displays that show the position of the various entities in the 3D space. We discuss these
ideas further in Chapter 12.
As with other aspects of human cognition, a critical need in multiple UV control and supervision is to
support both global perception (awareness of the fleet), and local perception of the viewpoint and control
needs of a particular UV (Hunn, 2006). Furthermore, because individual control demands can grow rapidly
with the number of systems, an alternative to a serial control model (in which all UVs except the one currently controlled remain autonomous and unattended) is collective control. For example, in one proposed
concept the controller can virtually “lasso” a set of UVs and control them as a cohort (Micire, 2010; see
http://www.youtube.com/watch?v=HSOziHgQedA). Individual local views from each robot are superimposed
on a map representation (the global world view), relating the two viewpoint types (Micire, 2010). This follows
the principles of visual momentum described in the last section.

7. VIRTUAL ENVIRONMENTS AND AUGMENTED REALITY


7.1 Virtual Environment Characteristics
A virtual environment or virtual reality (VE or VR, respectively) can be defined as a computer-generated
environment that gives a user the experience of being in a particular location, different from where the user
actually is. Typically this is done in multi-sensory fashion (visual, auditory, haptic), often with stereoscopic
visual information. Importantly, a virtual environment can be interactively experienced and manipulated by
the observer (Stanney & Zyda, 2002), making it different from watching, say, a movie with computer-
generated imagery. Many video games employ VR technology. A related technology called augmented
reality (AR) shows simulated imagery on a transparent head mounted display, superimposed over real-world
objects or environments. For example, wearing an AR device, a construction worker can see planned
infrastructure like pipelines or electrical conduits running through the actual construction site (Schall et al.,
2009).
In Chapter 3 we considered head-up displays (HUDs); in Chapter 4 we discussed 3D displays, and earlier
in this chapter we examined frames of reference, visualization, and tracking. All these concepts pertain to the
design of virtual and augmented environments; in this section we will focus on what VR and AR
environments are and how they are being developed and applied across a variety of domains. As we will see,
such environments have tremendous utility and potential: VR and AR have changed the shape of many fields
(e.g., training, telepresence, teleoperation), and are driving technological development such that VE and AR
elements have been incorporated into many everyday applications, including Internet gaming environments
(e.g., Second Life) and map planning tools (e.g., Google Earth).
A discussion of virtual environments requires discussion of the term presence (Sheridan, 1996), defined
as the extent to which a user interacting with a virtual environment is convinced of being in the virtual
environment. Sometimes the term immersion is used. The concept is not unlike what we experience when we
become engrossed in a good movie—momentarily forgetting the fact that we are sitting in a theater and
becoming totally engaged in the situation. However, a person experiencing immersion would later report the
sense of having been in the virtual environment, but the person watching a movie would report having been in
the theater (Slater & Usoh, 1994).
A virtual environment is a combination of multiple features (Sherman & Craig, 2003; Furness &
Barfield, 1995; Wickens & Baker, 1995). As each feature is added, the experience of a real environment—and
therefore the sense of presence—generally becomes more compelling. Seven typical features of virtual
environments are described below.
1. Three-dimensional viewing. Since space is three dimensional, a display representation that preserves
that characteristic is more realistic than a 2D representation. A 3D model of a house provides a more
realistic view than can be obtained through a set of 2D elevations. Three dimensionality can be
enhanced by the powerful depth cue of stereopsis through the use of 3D eyewear (see Chapter 4).
2. Dynamic. Perceptually, we experience time as a continuous variable; thus, we perceive a video or
movie as more realistic than a set of static images. As discussed in Chapter 4, relative motion,
achieved through image movement is a second powerful depth cue for three dimensionality. A virtual
display allows the user to view (and control) events dynamically in real time.
3. Closed-loop interaction. When we act upon objects in the real world, there is typically very little delay
from the time the action is initiated until motion occurs. The virtual world should respond quickly to
control inputs (hand, mouse, joystick movements) so that there is little lag.
4. Ego-centered frame of reference. As discussed in section 1.4 of this chapter, an egocentric frame of
reference presents the image of the world from the point of view that is being controlled by the user
(the FFOV).
5. Head or eye motion tracked. Many VE systems incorporate a head-mounted display (HMD) and
motion sensors. These allow changes in head position to control the view on the virtual environment in
the same way that changes in head position change the visual scene in the real world. The entire 3D
scene can be shown on an HMD, or on a set of display surfaces in what is called a CAVE (Cave Automatic Virtual Environment), which is like being inside a room with the room’s walls, floor, and
ceiling depicting the virtual environment. The displayed view is continuously adjusted based on head
position (head tracking).
6. Multimodal interaction. In real-world interaction, we do not simply view stimuli. Imagine what
happens when your alarm clock goes off in the morning. You localize the clock by its sound, reach for
it, pick it up (it has weight, shape, and texture) and push a button or two (so you can go back to
sleep!). If a virtual environment is to give a sense of presence it needs to capture this type of
multimodal experience. This could include auditory feedback using 3D localized sound techniques
(e.g., Kapralos et al., 2008), and proprioceptive, kinesthetic, force, and tactile feedback using haptic
gloves or force-reflecting joysticks (e.g., Biggs & Srinivasan, 2002; Vicentini & Botturi, 2009), or
with specialized robotic manipulators (e.g., Taati, Tahmasebi, & Hashtrudi-Zaad, 2008).
7. Objects and agents. Virtual environments contain objects that can be manipulated. The physics
applying to these objects is defined by the parameters of the VE. The environment can also include
simulated humans or agents, which exhibit certain limited human behaviors in the context of certain
tasks, such as acting as the helicopter pilot in a virtual environment that trains the landing signals
officer working on the deck of a frigate (Cain, Magee, & Kersten, 2011).
Many of these features of VR have been incorporated for years in vehicle training simulators, particularly
with outside world graphics, and indeed many of these simulators can be considered examples of virtual
reality. Many of the design issues and problem solutions addressed with flight simulator design have been
applied to VR design.
Dynamic motion of and interaction with virtual objects are often seen as being of particular importance
in creating a sense of presence (e.g., Lee, 2004; Sadowski & Stanney, 2002). Lee argues that “people respond
to mediated or simulated objects as if they were real” (p. 499). In this view, virtual environments in which the
objects and agents act in expected ways will be most likely to achieve presence. High image fidelity is often
not necessary. In terms of concepts described in Chapter 4, presence may be more a function of realistic action
and egomotion than of resolution and perceptual judgment. Social interaction can also play a role: an
environment in which the user can interact and communicate with other users or agents may be highly
immersive even if pictorial realism is low (Sadowski & Stanney, 2002).
It is important to realize that including all of the above features can make a VE system expensive.
Adding more feature elements increases initial costs, implementation costs (a more elaborate system is
more difficult to construct), and maintenance costs, and it increases vulnerability to error during experiments and
demonstrations. The additional expense needs to be justified in terms of improved human performance on the
tasks to be performed in the VE system, and not just the increased sense of presence.

7.2 Uses of Virtual Environments


Virtual environments have been shown to be useful for a range of tasks; overviews of application areas are
provided by Stone (2002) and Sherman and Craig (2003). For instance, VEs are highly effective training
tools, especially for tasks that are expensive or dangerous to train in a real environment. In general, VEs can
be viewed as useful tools for rehearsing controlled actions in a benign environment, in preparation for
performance in a real environment where the consequences of incorrect actions (either to the user or other
parties) are more severe (Wickens & Baker, 1995). Some examples include surgical simulation (e.g.,
Vicentini & Botturi, 2009), welding skills (Stone, Watts, et al., 2011), air traffic control (Ellis, 2006),
maneuvering a spacecraft (Grunwald & Ellis, 1993), deploying a parachute in free fall (Hogue et al., 2001), and
rehearsing a flight prior to a dangerous mission (Bird, 1993; Williams, Wickens, & Hutchinson, 1994). We
next consider five specific application areas for VE: training and education, online comprehension of 3D
space, therapeutic applications, social applications, and ubiquitous computing.

7.2.1 TRAINING APPLICATIONS Exploring a virtual environment that simulates a real-world environment has
generally been shown to be beneficial (Darken & Peterson, 2002). Such training could be useful for military
mission planning or reconnaissance. For example, training in a virtual environment of a large multistory
building transfers well to the physical building (e.g., Wilson, Foreman, & Tlauka, 1997). In summarizing the
results of several such studies, Darken and Peterson concluded that for short exposure times, training with
maps is more effective than training within a VE. However, the map is only useful up to a point, and if there is
sufficient training time (more than about 30 minutes), the added information in a VE leads to superior
performance relative to the map.
In recent years, the use of virtual or synthetic environments for training surgical procedures has received
a lot of attention (e.g., Johnson, Guediri, et al., 2011). When a patient’s health is on the line, it is important
that the haptic properties of the human body are properly simulated in synthetic environments used for
surgical training. Haptic perception refers to the ability to detect the shape of an object by actively moving
fingers across its surface. For example, Sowerby et al. (2010) modeled the force feedback from the tympanic
membrane (eardrum) to create an accurate interactive simulation of middle ear surgery. Similarly, Misra,
Ramesh, and Okamura (2008) emphasized the importance of capturing the elastic and frictional properties of
various organ walls (e.g., liver, kidney, uterus) as well as the properties of the surgical tools and the operation
(e.g., deformation, puncture, cutting) to characterize tool-tissue interaction properly.
In addition to being less dangerous, training in a virtual environment is generally less expensive than
training in the real world. For example, training flight skills in the air (or ship-handling skills at sea) incurs
fuel and personnel costs that are eliminated by training in a virtual environment (Orlansky et
al., 1997). The use of virtual environments for training purposes will be discussed further in Chapter 7.
Closely related to training, e-learning (electronic learning) technologies (Clark & Kwinn, 2007) seek to
provide a compelling learning experience for students who are not physically located in a classroom with
other students. One method for making this environment more compelling is to immerse the student into a
virtual classroom, which could include interactive tutorials to demonstrate concepts; white boards that can be
annotated by students or instructor; student windows that list names of individuals participating virtually;
polling buttons that students can use to respond to multiple-choice questions; and chat or direct messaging
areas where students can communicate with one another. Here, the features of virtual classroom technology seek to make
the physical distance separating the various participants transparent.

7.2.2 ONLINE COMPREHENSION Virtual environments can also be useful for online comprehension tasks
(Wickens & Baker, 1995), and indeed here is where VR, as a tool, links closely with scientific visualization as
a process (see section 4 of this chapter). The intent here is to assist the user in gaining insight about the
structure of an environment. This is similar to points made earlier in this chapter about visualization tools
providing insight into the structure of a large database. It differs from spatial navigation in that the user
cannot explore the real environment directly. Typically the insight is gained while the interaction is taking
place. For example, comprehension may be achieved by displaying the atomic structure of a fractured ceramic
composite to a scientist (Nakano et al., 2001). By using a virtual environment an engineer designing
nanorobots (robots at the scale of nanometers) may gain improved understanding of the molecular structure
of the robots, and haptic feedback can be used to represent the adhesive forces involved when the nanorobot
approaches other microscopic objects (Sharma et al., 2005). A virtual robot has been shown to be useful for
exploring 3D medical images of the cochlea used for surgical planning (Ferrarini, 2008). The concept of
immersive journalism represents another form of online comprehension in which the user can obtain first-
person experiences of events or situations described in news stories (de la Peña et al., 2010).

7.2.3 THERAPEUTIC APPLICATIONS Virtual environments have also been used for therapeutic purposes by
exposing patients with phobias and other anxiety disorders to simulated aversive situations. For example, a patient with
acrophobia (fear of heights) can be placed at the top of a virtual cliff or on a high virtual platform (Juan &
Perez, 2009). With repeated exposure, this would have the ultimate effect of desensitizing or habituating the
patient to the aversive situation. Emmelkamp et al. (2002) produced virtual versions of real-world
environments (e.g., shopping mall, fire escape). They compared exposure in the virtual environments to that
obtained in the real environments and found that the virtual exposure was just as effective in reducing
acrophobic patients’ anxiety and avoidance.
For this therapy to work, it is important that the aversive situation actually produces an anxiety response
in the patient. In a sense this is one type of presence—does the patient really find the situation compelling?
Juan and Perez found that a CAVE was more compelling than a head-mounted display for inducing a sense of
presence or anxiety in non-phobic users. Virtual environments have similarly been constructed to treat other
anxiety disorders like fear of flying (Rothbaum et al., 2006), and to address post-traumatic stress after 9/11
(Difede et al., 2007) and in Vietnam War veterans (Krijn et al., 2004).
Virtual environments have also been used to rehabilitate stroke patients with some success. For example,
Jack et al. (2001) had chronic stroke patients (who had difficulty moving their right hands) attempt to perform
repetitive hand exercises while viewing a virtual environment (e.g., movement of the patient’s hand served to
reveal an attractive landscape hidden beneath a fogged window). Objective measurements showed significant
improvement during this training, which transferred to real-world tasks involving the affected hand (e.g.,
buttoning a shirt).

7.2.4 SOCIAL APPLICATIONS: GAMING, MULTI-AGENT ENVIRONMENTS, AND COLLABORATIVE NETWORKING The
distinction between game environments and virtual environments is becoming blurred. Game environments
have many of the properties of virtual environments, and when networked they are also social environments,
with multiple users simultaneously interacting with each other, and with virtual agents, in the same virtual
world. This makes for a compelling, immersive gaming experience. Users treat other human users in the
virtual gaming environment differently from virtual agents. In studies using Second Life, users tended to be
more influenced by and more likely to obey other users’ avatars than virtual agents, mimicking their behavior
(Harris et al., 2009). Thus, when multiple users share the same VE, it becomes a method to “virtually move”
people at various locations into a shared environment.
Just as in a multi-player gaming environment, avatars can be used to represent individual users in a
virtual environment, and intelligent agents or virtual humans incorporating real-time character engines (Gillies
& Spanlang, 2010) can also be represented. Avatars or agents can affect user navigation; for example, they
can guide a novice user through the 3D VE (de Araujo et al., 2010). They can also affect how a user allocates
mental and physical resources (effort): virtual competitors have been shown to influence rowers training in a
simulator, for example (Wellner et al., 2010).
With collaborative networking technologies, the aim is to achieve telepresence: that is, that we perceive
that the persons with whom we are communicating at a distance are “in the room” with us (Kirk, Sellen, &
Cao, 2010). Video technologies in support of work practices have met resistance, whereas similar broadly
available technologies like Skype have become very popular. This difference is likely due to the different
purposes of the communication (Kirk et al., 2010). Work-related conversations are often object-focused
(Whittaker, 2003), so that discussions center on the content of a document, for example. Video
technologies often show “talking heads” instead of this content. In contrast, systems like Skype are popular
for personal communication, typically among persons who know one another (e.g., family members) and
serve to mediate “closeness” among the users, rather than for shared work purposes. The point is that the
nature of telepresence will vary with the nature of the interaction (task demands).

7.2.5 UBIQUITOUS COMPUTING Many aspects of VR are relevant for ubiquitous computing, the examination of
how computing should work within everyday environments and activities, such as a table in a restaurant. For
example, with one novel technology (Microsoft Surface; Dietz & Eidelson, 2009), virtual objects are selected
and transferred from one electronic device to another using a shared tabletop display. For example, one user’s
camera is placed on the tabletop, and photos in the camera appear on the tabletop. Another user can drag those
photos to her smart phone. A map is shown on the tabletop; the user selects a nightclub location, purchases
tickets to a show, and drags the tickets to his smart phone. Diners at a restaurant split the bill by selecting
icons on the tabletop representing what they had ordered, and dragging their selection to their credit card
(Dietz & Eidelson, 2009; see Microsoft Surface video on http://www.youtube.com/watch?v=6VfpVYYQzHs&feature=related).
In one sense this application can be thought of as a linkage of VR and
AR, since while real-world objects (camera, phone, credit card) are being placed on a real table top surface,
they are linked in a virtual manner to each other and to the virtual imagery shown on the table top. Thus we
turn now to a discussion of augmented reality.

7.3 Augmented Reality


Augmented reality (AR) supplements reality, rather than replacing it with a virtual world (Azuma, 2001). We
might think of augmented and virtual reality as lying along a continuum, the virtual ruler shown in Figure
5.16 (Milgram & Colquhoun, 1999). A real environment (e.g., your bedroom) is on the far left, and a
completely virtual environment (e.g., a virtual model of your bedroom) is on the far right. Augmented reality
applications are on the left part of the ruler (e.g., you are looking at your bedroom wearing a set of lenses on
which information is portrayed, such as an arrow pointing to the location of your keys) and what Milgram and
Colquhoun call augmented virtuality applications on the right (e.g., the lighting in the virtual bedroom
model is updated based on the natural light in your bedroom). The continuum thus represents different forms
of mixed reality (MR).
In some AR situations, the user sees the real world directly through the transparent eyewear of a head-
mounted display (HMD), and synthetic imagery is presented on the same display surface while the user performs a
manual task directly. This is not unlike the head-up display (HUD) with conformal imagery discussed in
Chapter 3, except that here the image moves with the head rotation, rather than with the vehicle rotation
(Wickens, Ververs, & Fadden, 2004). For example, Henderson and Feiner (2009) describe an optical AR
system in which the mechanic conducting maintenance in an armored personnel carrier wears a transparent
head-worn display. The mechanic can see turret components directly through the eyewear, but in addition
imagery is shown on the display to label parts (e.g., WEAPON DRIVE), and show virtual tools and objects
(e.g., socket wrench and bolt) in the correct position to complete the task.
From an attentional perspective (Chapter 3), AR adheres to the proximity compatibility principle,
facilitating the integration of information between the display and the real world beyond, a display proximity
fostered both by co-location (overlay) and the common fate of shared movement when the head rotates. As
such, AR has proven superior to presenting the same information on a head-down, hand-held display (Yeh,
Merlo, et al., 2003). But also like the HUD, the overlaid imagery creates clutter, and so hinders the processing
of non-salient information in the real world beyond (Yeh, Merlo, et al., 2003).

FIGURE 5.16 The Virtual Ruler. Mixed reality lies on a continuum between a completely real environment and a completely virtual one. See
text for details. Source: Redrawn and modified from original figure in Milgram & Colquhoun, 1999.

FIGURE 5.17 Imagery from an augmented reality “x-ray vision system.” An early prototype version is shown on the left; note how the imagery
(from cameras mounted behind the wall) appears to sit in front of the wall. On the right, the imagery properly appears to be behind the brick
edges. Source: Reprinted, with permission, from IEEE Virtual Reality 2009 Proceedings (pp. 79–82).

Instead of adding new imagery, augmented reality can also be used to remove or replace real-world
objects in the virtual scene. For example, Avery et al. (2009) describe an AR “x-ray vision” system that
allows the user to see through walls into the rooms beyond (using imagery generated by cameras mounted
there). Here, imagery is provided to represent a brick wall, but the bricks themselves are semi-transparent so
that the user also sees the room behind them. Avery et al. found that when the room imagery was displayed
with an opaque brick wall, the room ended up appearing to be in front of the wall instead of behind it (an
illusion often observed with AR systems). To get around this problem, Avery et al. rendered just the edges of
objects in the foreground (in this case, the mortar of the virtual bricks). The objects in the room are visible
behind the bricks, and are only partially occluded by the thin brick edges, as shown in Figure 5.17. Recall
from Chapter 4 that occlusion is a powerful cue for representing depth, and we see its useful application
here. Similar techniques have been developed for visualizing organs within a patient or the position of the
engine block within a car (Kalkofen et al., 2009).
In some AR applications, it is very important to provide haptic and proprioceptive feedback to the user to
achieve presence. Jeon and Choi (2009) extend the virtual ruler continuum in two dimensions, one for vision,
and one for touch (see Figure 5.18). Thus we can have varying degrees of virtuality or reality in both visual
and haptic senses. For teleoperation applications like telesurgery, an effective AR solution will likely need to
provide the user with synthetic haptic stimuli based on the user’s actions in a remote world. Without such
contact feedback, the operator can break instruments, strip threads or puncture surfaces when the controlled
element is capable of applying strong forces.
Augmented reality offers great potential for a range of application domains. It appears to be most useful for
tasks for which information in the real and virtual environments needs to be integrated. In this sense, therefore,
augmented reality displays offer a type of integrated display in the PCP sense. The most useful developments
will likely take a task-oriented approach, ensuring that the right information is integrated to address task
demands.

FIGURE 5.18 Multimodal virtual ruler. At top is the virtual ruler shown in Figure 5.16. At bottom is a two-dimensional representation, with
separate dimensions for vision and touch modalities. Shaded areas indicate mixed reality.

7.3.1 PROBLEMS FOR VIRTUAL AND AUGMENTED REALITY ENVIRONMENTS. Although the increased sense of
immersion that occurs with an ego-centered frame of reference and a dynamic, three-dimensional environment
can be beneficial, there are also associated costs. We shall discuss five challenges for virtual and augmented
reality environments.
1. Cost. Fully immersive simulation in large-scale facilities (e.g., moving-platform flight
simulators and CAVE environments) is still quite expensive, and the benefits of such environments
need to be well established for particular applications to justify the expense. However, large-scale
simulation is not always necessary, and often a compelling sense of presence is provided with
inexpensive hardware. The designer is therefore faced with the choice of balancing the cost of large
scale simulation against its potential benefits, relative to “desktop VR” options.
2. Lag. System latency or lag is a problem for both VEs and AR environments. With a head-mounted
display, the viewpoint needs to shift with each change in head position (head tracking). Delays in head
orientation measurement lead to errors in presented visual direction (Ellis et al., 2004). If there are lags
in an AR system, then there will be a discrepancy between feedback from the real world and from
augmented reality components, with the visual instability leading to impaired performance. The lag
does not have to be great to have an effect: 15–20 ms is enough when head movements are frequent
and rapid (Ellis et al., 2004); a back-of-envelope sketch of the resulting angular error follows this list.
In addition, lag reduces presence (Snow & Williges, 1997), and reduces
the effectiveness of multiuser collaborative environments (Jay et al., 2007). With longer lags (e.g., in
teleoperation environments), performance can be impaired because observers cannot use feedback
from their earlier actions to help plan the current action (making the system open-loop instead of
closed-loop).
In VEs, lag can be reduced by simplifying imagery being rendered during motion. As we learned
in Chapter 4, for interaction with the immediate environment, it is more important to have motion be
accurate than have imagery at ultra-high resolution (another example of “naïve realism” described in
Chapter 4; Smallman & Cook, 2011).
Finally, lag due to communication delays has been shown to affect conversation negatively,
increasing the risk of interruptions (Geelhoed et al., 2009).
3. Biases and distortions. Viewing a virtual environment can lead to biases or distortions in perception of
the virtual environment. These are generally due to the problems of a narrow field of view (which
head-mounted displays tend to possess). Thus, a level surface in a VE appears to slope uphill away
from the observer (Wickens & Baker, 1995; Perrone, 1993), and a downhill slant looks shallower than
it really is (Li & Durgin, 2009). This is probably related to the fact that judgments of elevation (up-
down) tend to be poor relative to judgments of azimuth (judgments of angle on the horizon) in virtual
environments (Barfield, Hendrix, & Bjorneseth, 1995). Distances in depth appear to be underestimated
in VEs (Witmer & Kline, 1998), and the virtual space appears smaller than it actually is (Durgin & Li,
2010; Willemsen et al., 2009). Care needs to be taken to avoid pincushion distortion with head-
mounted displays (where the space is stretched or compressed at the extremes of the display; Kuhl et
al., 2009); such distortion leads to poorer depth perception with stereo viewing within a head-mounted
display, but can be eliminated through proper calibration (Durgin & Li, 2010).
4. Lostness and disorientation. If the virtual environment is large, then it is possible for the user to get
lost or disoriented in that space (just like in real space), leading to the need for navigational aids, like
maps. Luo et al. (2010) showed that users with a floor map navigated a multi-floor virtual subway
station better than those without a map. But the problem with this fix is that maps require high
resolution to be readable; high levels of resolution can be difficult to achieve with an HMD. Studies of
wayfinding with soldiers suggest that moving compass displays with landmarks may provide a better
solution for low-resolution head-mounted applications than maps (Bos & Tack, 2005).
5. Cybersickness. As we saw earlier in the chapter with the design of vehicle simulators, motion sickness
(sometimes referred to as cybersickness in VE) can be a problem in virtual environments. It is
common for participants to drop out of VE studies due to motion sickness symptoms such as nausea,
dizziness, and disorientation (e.g., Ehrlich & Kolasinski, 1998; Ehrlich, Singer, & Allen, 1998).
Cybersickness is often produced by display lag (Ellis et al., 2004) or by a gain mismatch
between head movement and display movement (i.e., the angle of head motion and the angular
change of the display are not equal; Draper, 1998). Cybersickness appears to be less likely if the direction of the
observer’s gaze corresponds to the direction of motion (Ehrlich et al., 1998).
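The lag and gain-mismatch problems (items 2 and 5 above) can be made concrete with simple arithmetic: during a head turn, the angular error between where the head points and what the display shows is roughly the head's angular velocity multiplied by the system latency, plus any error introduced by an incorrect display gain. The numbers below are illustrative assumptions only (the 200 degree-per-second head velocity is not a value taken from the studies cited above).

    def lag_error_deg(head_velocity_dps, latency_ms):
        """Angular error due to latency: the display shows where the head was
        pointing latency_ms ago, so error ~ velocity x latency."""
        return head_velocity_dps * (latency_ms / 1000.0)

    def gain_error_deg(head_rotation_deg, display_gain):
        """Angular error due to gain mismatch: a head turn of X degrees produces
        a display rotation of only gain * X degrees."""
        return abs(1.0 - display_gain) * head_rotation_deg

    # Assumed values for illustration: a brisk 200 deg/s head turn with 20 ms of
    # latency, and a 30-degree head turn rendered with a display gain of 0.9.
    print(lag_error_deg(200, 20))     # 4.0 degrees of momentary error
    print(gain_error_deg(30, 0.9))    # 3.0 degrees of error

Even a few degrees of error, repeated on every head movement, are enough to produce the visual instability and cybersickness symptoms described above.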
In spite of these problems, the potential for virtual and augmented reality displays is great. It is quite likely
that the remaining technological challenges will be addressed by the time the next edition of this book is
published. The use of virtual elements will become more prevalent, and virtual objects provided using AR
technologies on mobile platforms will lead to a closer and closer link between the real and the virtual in our
careers and everyday lives. The greater engineering psychology challenge will be to ensure that the
technologies are designed to meet human needs, to show data in appropriate ways, and to ensure that display
representations are compatible with task demands, with the user’s mental representation, with the ecology of
the work domain, and with the form of the user’s control action.

8. TRANSITION
Chapter 5 has integrated and elaborated much of the more generic material presented about attention in
perception (Chapter 3) and visual display design (Chapter 4) in order to understand the human performance
challenges of integrating and controlling lots of information that is fundamentally (but not exclusively)
spatial. We considered the tasks of navigation, spatial awareness, visualizing data and manual control, and the
technology of virtual and augmented reality. In so doing, we went well beyond issues of perception, to
consider those of cognition, working memory and action.
In the next chapter, we shift gears to more verbal, linguistic information involving words, language, and
communications. We consider how people understand symbols, meaning, and procedures. We again consider
working memory, but here of a verbal sort, before we delve into the concept of working memory and
cognition in depth in Chapter 7.

Key Terms
a synthetic-vision-system 131
adaptive automation 148
agents 151
augmented reality (AR) 150

augmented virtuality 155
avatar 129
bandwidth 147
Canonical orientation 135
CAVE (computer automatic virtual environment) 151
closed-loop tracking 146
consistency of elements 136
consistency of orientation 136
Consistency of rectilinear normalization 136
control device 146
control order 147
cursor 146
cybersickness 158
disturbance 146
egocentric frame 124
ego-referenced frame 124
e-learning 153
error 146
exocentric frame 124
first order 147
fisheye view 141
Fitts’ Law 146
FORT 125
frames-of-reference 124
gain 147
gain mismatch 158
geometric field of view 133
global situation awareness 130
Haptic perception 152
head-mounted display (HMD) 151
highlighting 135
immersed view 129
immersion 150
immersive journalism 153
Index of Difficulty (ID) 146
Instability 148
keyhole 129
keyhole phenomenon 141
lag 147
landmark knowledge 132
landmark prominence 135

landmark prominence and discriminability 136
line of sight (LOS) ambiguity 130
look-ahead time 149
mixed reality 155
nanorobots 153
open loop tracking 148
parallel coordinate graph 140
pilot-induced oscillation 148
pincushion distortion 158
Prediction 148
predictor displays 148
presence 150
rectilinear normalization 136
route knowledge 132
route list 130
second order 147
span of prediction 149
survey knowledge 132
system dynamics 146
System lag 147
system output 146
system stability 148
target 146
telepresence 154
tracking or manual control 124
ubiquitous computing 154
understanding 127
unstable 147
virtual environment 150
virtual reality 150
virtual ruler 155
visual momentum 144
world referenced frame 124
zero order 147

6 LANGUAGE AND COMMUNICATION

1. OVERVIEW
The smooth and efficient operation of human-machine systems often depends on the efficient processing of
written and spoken language, whether in reading instructions, comprehending labels, or exchanging
information with a fellow crew member. Not all communication is language-based—gestures and nonverbal
cues can convey information—and not all instructions need to be verbal—symbols, pictures, and icons are
sometimes helpful. The fundamental tie linking the material in this chapter is the role of language and symbol
representation. The symbol—whether a letter or word, or an icon or earcon—stands for something other than
itself.
It is easy for us to recall instances in which our ability to understand instructions and messages has
failed. Sometimes terms or abbreviations are used whose meanings are unclear; in longer instructions the
meaning of each word may be clear, but the way in which they are strung together makes little sense, or
imposes tremendous mental effort to understand.
In this chapter, we will first consider the perception of printed language—letters, words, and sentences.
We will see how these units are processed both hierarchically and automatically, and we will consider the role
of context and redundancy in their perception. After considering applications to print format and code design,
we will discuss similar principles in the recognition of pictures and iconic symbols. Next we address cognitive
factors involved in comprehending instructions, procedures, and warnings, and consider their effects on
older members of the population. After discussing the perception of speech, we will conclude
with speech and non-verbal communications in multi-person systems.

2. THE PERCEPTION OF PRINT


2.1 Stages in Word Perception
The perception of printed material is hierarchical in nature. When we read and understand the meaning of a
sentence (a categorical response), we must first analyze its words. Each word, in turn, depends on the
perception of letters, and each letter is itself a collection of elementary features (lines, angles, and curves).
These hierarchical relations are shown in Figure 6.1 (see Neisser, 1967). Most models of visual word
recognition start with a stage commonly described as “feature processing” in which the activation of features
leads to the activation of letters, which in turn leads to the activation of words. In other words, the activation
of individual features within a letter forms the basis of letter and word recognition. In the classic
“Pandemonium” bottom-up theory of letter recognition developed by Lindsay and Norman (1972), a
hierarchy of “demons” is activated by specific features within an individual letter. Thus, in describing the
perception of words we may refer to feature units, letter units, or word units. A given unit at any level will
become active if the corresponding stimulus is physically available to foveal vision and the perceiver has had
repeated experience with the stimulus in question.

FIGURE 6.1 Hierarchical process of perception of the visual word “work.”

We will consider, first, the evidence provided for the unit at each level of the hierarchy and the role of
learning and experience in integrating higher-level units from experience with the repeated combination of
lower-level units. Then we will consider the manner in which our expectancies guide perceptual processing
from the “top down.” After we describe the theoretical principles of visual pattern recognition, we then
address their practical implications for system design.

2.1.1 THE FEATURES AS A UNIT: VISUAL SEARCH The features that make up letters are represented as vertical or
diagonal lines, angles, and curved segments of different orientations, as shown at the bottom of Figure 6.1.
The importance of features in letter recognition is demonstrated by the visual search task developed by
Neisser, Novick and Lazar (1964) and discussed in Chapter 3. The researchers demonstrated how the search
for a target letter (e.g., K) in a list of nontarget “noise” letters was greatly slowed if the latter shared similar
features (N, M, X), but it was not slowed if the features were distinct (O, S, U) from those of the target.
Lanthier et al. (2009) found that deleting vertices of letters is more detrimental to letter identification than
deleting mid-segments of letters. Vertices are more important than mid-segments because the former carry
information about the relations between different features, whereas the latter carry information only about a
single feature.

2.1.2 THE LETTER AS A UNIT: AUTOMATIC PROCESSING There is strong evidence that a letter is more than simply a
bundle of features. LaBerge (1973) revealed that subjects could process letters like b or d preattentively, or
automatically, even while their attention was directed elsewhere. In contrast, symbols like ↿ or ⇂, made up of features
that were no more complex but not familiarly grouped in past experience, required focal attention in order to
be processed. The concept of automaticity—processing that does not require attentional resources—is a key
one in understanding human skilled performance. We will encounter it again in our treatment of training in
Chapter 7 and again when we discuss attention in Chapter 10. Here we focus on the development of
automaticity in language processing.
What produces the automaticity that we use to process letters and other familiar symbols? Familiarity
and extensive perceptual experience are necessary; but research summarized by Schneider and Shiffrin (1977)
suggests that experience alone is not sufficient; symbols must be consistently mapped to the same response.
Inconsistent responding, when a letter (or other symbol) is sometimes relevant and sometimes not, is
less likely to produce automaticity, even if the symbol is seen the same number of times as consistently mapped
symbols.
to the vigilance decrement (Schneider & Fisk, 1984).
Subsequent research has expanded the list of categorization processes for which automaticity can be
developed through consistency. For example, Schneider and Fisk (1984) showed how consistently responding
to members of a category (like vehicles) can show the features of automatic processing (fast and preattentive)
when each category member is presented, even if that member itself has not been seen frequently.
2.1.3 THE WORD AS A UNIT: WORD SHAPE There is evidence that familiar words can be directly perceived as
units, just as LaBerge’s (1973) experiment provided evidence that letters were perceived as units because of
the familiar co-occurrence of their features. Thus the pattern of full-line ascending letters (h, b), descenders (p,
g), and half-line letters (e, r) in a familiar word such as “the” forms a global shape that can be recognized and
categorized as “the” even if the individual letters are obliterated or blurred to such an extent that each is
illegible. Broadbent and Broadbent (1977, 1980) propose that the mechanism of spatial frequency analysis is
responsible for this crude analysis of word shape.
The analysis based on word shape is more holistic in nature than the detailed feature analysis described
above. The role of word shape, particularly with such frequent words as “and” and “the” for which unitization is
likely to have occurred, is revealed in the analysis of proofreading errors (Haber & Schindler, 1981; Healy,
1976). Haber and Schindler had subjects read passages for comprehension and proofreading at the same time.
They observed that misspellings of short, function words of higher frequency (“the” and “and”) were difficult to
detect. The role of word shape contributing to these shortcomings was suggested because errors in these
words were concealed most often if the letter change that created the error was one that substituted a letter of
the same class (ascender, descender, or half-line) and thereby preserved the same word shape. An example
would be “anl” instead of “and”. If all words were only analyzed letter by letter, these confusions should be just as
hard to detect in long words as in short ones. As Haber and Schindler observe, they are not.
Increasing age appears to influence unitization. Thus Allen and colleagues found that older adults (mean
age of 70 years), as compared to younger adults (mean age of 24 years), were more biased toward processing
words in larger processing units (Allen, Groth, et al., 2002). It seems that this more efficient strategy is used
to offset the effects of generalized slowing of cognitive functioning due to age.

2.2 Top-Down Processing: Context and Redundancy


In the system shown in Figure 6.1, “lower-level” units (features and letters) feed into “higher-level” ones
(letters and words). As we saw, sometimes lower-level units may be bypassed if higher-level units are
unitized, and automaticity can result. This process then is sometimes described as bottom-up or data-driven
processing. There is also strong evidence that much of our perception proceeds in a “top-down,” expectancy
or context-driven manner (Lindsay & Norman, 1972).
More specifically, in the case of reading, hypotheses are formed concerning what a particular word should be
given the context of what has appeared before, and this context enables our perceptual mechanism to “guess”
the nature of a particular letter within that word, even before its bottom-up feature-to-letter analysis may have
been completed. Thus the ambiguous word in the sentence “Move the lever to the rxxxx” can be easily and
unambiguously perceived, not because of its shape or its features but because the surrounding context limits
the alternatives to only a few (e.g., right or left) and the apparent features of the first letter eliminate all but the
“right” alternative.
In a corresponding fashion, top-down processing can work on letter recognition, whereby knowledge of
surrounding letters may guide the interpretation of ambiguous features. Top-down processing of this sort,
normally of great assistance in reading, can actually prove to be a source of considerable frustration in
proofreading, in which allowing context to fill in the gaps is exactly what is not required. All words must be
analyzed to their full-letter level to perform the task properly.
The foundations of top-down processing and its basis on knowledge-based expectancy were established
in the discussion of signal detection theory, redundancy, and information in Chapter 2. Top-down processing
in fact is only possible (or effective) because of the contextual constraints in language that allow certain
features, letters, or words to be predicted by surrounding features, letters, words, or sentences. When the
redundancy of a language or a code is reduced, the contribution to pattern recognition of top-down, relative to
bottom-up, processing is reduced as well (Tulving, Mandler, & Baumal, 1964).
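Redundancy here can be given the quantitative sense introduced in the Chapter 2 supplement: redundancy = 1 - H/Hmax, where H is the average information actually conveyed per symbol and Hmax is the information that would be conveyed if all symbols were equally likely. The short sketch below computes these quantities for a toy four-letter alphabet; the letter probabilities are invented for illustration and are not estimates for English.

    import math

    def entropy(probs):
        """Shannon entropy H = -sum p * log2(p), in bits per symbol."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    def redundancy(probs):
        """Redundancy = 1 - H / Hmax, where Hmax = log2(number of symbols)."""
        h_max = math.log2(len(probs))
        return 1 - entropy(probs) / h_max

    # Toy four-letter alphabet with unequal letter frequencies (illustrative only)
    probs = [0.5, 0.25, 0.15, 0.10]
    print(round(entropy(probs), 2))      # about 1.74 bits, versus Hmax = 2 bits
    print(round(redundancy(probs), 2))   # about 0.13, i.e., roughly 13 percent redundant

The sequential constraints of real language (letters predicting letters, words predicting words) raise redundancy well above this single-letter figure, and it is that surplus that top-down processing exploits.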
In addition to redundancy, there is a second form of top-down or learning-based processing, in which the
letters within a word mutually facilitate one another’s analysis so that a letter that appears in a word can
sometimes be processed more rapidly than the letter by itself. In other words, letters can be either identified
directly on the basis of activity at the level of letter representation or inferred on the basis of word
identification. This word superiority effect (Reicher, 1969) has important implications for models of how
people read (Rumelhart & McClelland, 1986), and its implications for engineering psychology are
straightforward. The letters in a word are processed more accurately under time constraints than a similar
number of unrelated letters. Indeed, this mutual facilitation of processing units (i.e., letters) within a familiar
sequence (words) supports automaticity and complements the word shape effect.
The word superiority effect has also been demonstrated, albeit to a lesser extent, using familiar acronyms
(Laszlo & Federmeier, 2007); acronyms are better recognized, relative to unfamiliar non-word letter strings.

Similarly, Eichstaedt (2002) found that previous experience of a given category of word also facilitates the
visual recognition process of other words belonging to that category. Eichstaedt found that among Macintosh
users, a preactivation of the category “Macintosh” by briefly showing the word leads to faster recognition of
subsequently presented Macintosh related words (e.g., Sad Mac) compared to “Windows” related words (e.g.,
Ctrl, Alt, Del). Similarly, preactivation of the category “Windows” among Windows users leads to faster
recognition of Windows related words compared to Macintosh related words. The applied implications are
that the processing of information can be optimized through the use of preactivated categories deliberately
exploited by the design (e.g., categorizing the results of a web search using keywords from the original search
string).
The pattern of analysis of word perception described up to this point may be best summarized by
observing that top-down and bottom-up processing are continuously ongoing at all levels in a highly
interactive fashion (Navon, 1977; Neisser, 1967; Rumelhart, 1977). Sensory data suggest alternatives, which
in turn provide a context that helps interpret more sensory data. This interaction is represented schematically
in Figure 6.2. The conventional bottom-up processing sequence of features to letters to words is shown by the
upward flowing arrows in the middle of hierarchy. The dashed lines on the left indicate that automatic
unitization at the level of the letter and the common word may occur as a consequence of the repeated
processing of these units. Thus unitization may identify a blurred word by word shape alone, even when
features, letters, and context arent available (Broadbent & Broadbent, 1980). Unitization does not necessarily
replace or bypass the sequential bottom-up chain but operates in parallel. Represented on the right of Figure
6.2 are the two forms of top-down processing: those that reduce alternatives through context and redundancy
(solid lines) and those that actually facilitate the rate of lower-level analysis (dotted lines).

FIGURE 6.2 Bottom-up processing (analysis and unitization) versus top-down processing.

2.3 Reading: From Words to Sentences


The previous analysis has focused on the recognition of words. Yet in most applied contexts word recognition
occurs in the context of reading a string of words in a sentence. We have already suggested that sentences
must be processed to provide the higher-level context that supports top-down processing for word recognition.
In normal reading, sentences are processed by visually scanning across the printed page. Scanning occurs by a
series of fixations, joined by the discrete saccadic eye movements discussed in Chapter 3. The average
fixation in reading is around 225–250 msec, and the average saccade length is between 7 and 9 letter spaces for
English readers. Regressions (saccades that move backwards in the text) occur around 10 percent to 15
percent of the time in skilled readers. For difficult text, fixations get longer, saccades get shorter, and
regressions are more common (Rayner, 2009).
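These eye-movement parameters allow a rough, back-of-envelope estimate of skilled reading speed. The calculation below is illustrative only; the assumption of about five letters plus a space per average English word is ours, and regressions are treated simply as fixations that do not advance through the text.

    # Rough reading-rate estimate from typical eye-movement parameters
    fixation_ms = 250          # average fixation duration (225-250 msec)
    saccade_span_chars = 8     # average forward saccade (7-9 letter spaces)
    regression_rate = 0.12     # 10-15 percent of saccades move backwards
    chars_per_word = 6         # assumed: about 5 letters plus a space

    fixations_per_min = 60_000 / fixation_ms                  # about 240
    words_per_fixation = saccade_span_chars / chars_per_word  # about 1.3
    raw_wpm = fixations_per_min * words_per_fixation          # about 320 words/min
    net_wpm = raw_wpm * (1 - regression_rate)                 # about 280 words/min
    print(round(raw_wpm), round(net_wpm))

The result falls in the range commonly cited for skilled silent reading, and it makes clear why the longer fixations, shorter saccades, and more frequent regressions observed for difficult text translate directly into slower reading.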
During each fixation, there is some degree of parallel processing of the letters within the fixated word.
Whereas the meaning of an isolated word can normally be determined during fixations as short as the
minimum fixation value of 200 msec, fixations made during continuous reading are sometimes considerably
longer (Just & Carpenter, 1980; McConkie, 1983). The extra time is required to integrate the meaning of the
word into the ongoing sentence context, to process more difficult words, and to extract some information from
the words to the right of the fixated word. Both the absolute duration of fixations and the frequency of
fixations along a line of text vary greatly with the difficulty of the text (McConkie, 1983); fixation duration
and frequency are higher for words of low frequency and predictability (for a review, see Rayner & Juhasz,
2004). This notion is supported by the finding that fixation time and the number of regressions are lower for
content words—nouns, verbs, and adjectives, which tend to be more concrete and easier to process—
compared to other words in the sentence that are used to represent the grammatical relationships between
content words (Schmauder, Morris, & Poynor, 2000).
When a given word is fixated, information understood from the preceding words provides context for
top-down processing. Pilotti, Chodorow, and Schauss (2009) used eye tracking to show that the frequency and
predictability of text affect the balance between top-down and bottom-up processes during proofreading.
Their findings suggest that proofreading is slower and less accurate for high-frequency words and for
highly constrained (i.e., predictable) sentences, which would favor top-down processing.
Similarly, cognitive processing is also observed for words beyond the one currently fixated. McConkie and
his colleagues (see McConkie, 1983) found that different kinds of information were processed at different
regions surrounding foveal vision. As far out as 10 to 14 characters to the right of the fixated letter, very
global details pertaining to word boundaries may be perceived for the purpose of directing the saccade to the
next fixation. Some processing of word shape may occur somewhat closer to the fixated letter. Individual
letters, however, are only processed within a fixation span of roughly 10 letters: four to the left and six to the
right. More recent studies reviewed by Rayner and Juhasz (2004) have found that about 30 percent of words
do not receive a direct fixation when reading. Although these words are skipped, they are still processed by
the brain; short words are skipped more frequently, as are words of high predictability and frequency. Thus
non-fixated words are processed at a deeper level, at which the meaning of the word is understood enough for
the eye to skip it.
The question remains, however, regarding the extent to which words in a sentence are processed serially
or in parallel; this is an area of ongoing research (Starr & Rayner, 2004). Recently, Reichle, Liversedge, et al.
(2009) have argued that the processing of several words in parallel when reading is implausible. Anecdotally,
it is interesting to note that our own inner voice that says “out loud” each word rarely says words out of order,
as might be predicted by a parallel model. This illustrates the important role of phonetics in reading.
Phonetics (internal speech) must be sequential, just as external speech must also be.
Just as important as the visual processes involved in reading are the cognitive processes involved in
understanding text. These can be described in terms of the reader integrating, across sentences, the meaning of
sets of propositions (Kintsch & van Dijk, 1978). For example, the sentence “turn the top switch to on” consists
of two propositions: switch → on and switch → top. Because of limitations in working memory, to be
described in the next chapter, readers can typically carry only about four propositions from a previous
sentence over to the next sentence in order that the former can easily help to interpret the newly encountered
information (Kintsch & van Dijk, 1978). This characteristic has important implications for the readability
of instructions, as we will discuss in section 5.
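The propositional account can be made concrete by treating each proposition as a simple (object, attribute) pair and the working-memory limit as a cap of roughly four such pairs carried from one sentence to the next. The sketch below is only an illustration of this bookkeeping, not an implementation of Kintsch and van Dijk's model.

    # "Turn the top switch to on" -> two propositions
    sentence_props = [("switch", "on"), ("switch", "top")]

    WM_CAPACITY = 4   # roughly four propositions carried over between sentences

    def carry_over(previous_props, capacity=WM_CAPACITY):
        """Keep only the most recent propositions to help interpret the next sentence."""
        return previous_props[-capacity:]

    prior = [("lever", "right"), ("lever", "lower"), ("panel", "rear"),
             ("light", "green"), ("switch", "top")]
    print(carry_over(prior))   # only the last four survive to support integration

Instructions whose sentences each depend on more than a handful of earlier propositions will therefore force rereading, a point taken up in section 5.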

3. APPLICATIONS OF UNITIZATION AND TOP-DOWN PROCESSING


Referring back to Figure 6.2, which distinguishes top-down from bottom-up processing (although both
factors may be operating simultaneously), two primary dimensions underlie the relative importance of one or
the other. The first contrasts sensory quality against context and redundancy, as these trade off in bottom-up
versus top-down processing. The second contrasts the relative contribution of higher-level unitization to
hierarchical analysis in bottom-up processing. This contribution is determined by the familiarity and
consistent mapping of the lower-level units. These two dimensions will be important as a framework for later
discussion of the applications of pattern recognition.
The research on recognition of print is, of course, applicable to system design in contexts in which
warning signs are posted or maintenance and instruction manuals are read. These contexts will be discussed
later in the chapter. It also applies to the acquisition of verbal information from computer displays. In
designing such displays the goal is to present information in such a manner that it can be read rapidly,
accurately and without high cognitive load. In addition, certain critical items of information (one’s own
identification code, for example, or critical diagnostic or warning information) should be recognized
automatically, with a minimal requirement to invest conscious processing. We will discuss two broad classes
of practical implications of the research that generally align themselves with the two dimensions of pattern
recognition described: applications that capitalize on unitization and applications that are related to the trade-
off between top-down and bottom-up processing.

3.1 Unitization
Automatic processing confers a number of advantages that can be exploited in applied settings—automatic
processing is fast and parallel, requires little cognitive effort, operates in high workload situations, and is
robust to the adverse effects of fatigue and stress (Schneider & Chein, 2003). As we have seen,
training and repetition, especially consistent and extended repetition, will lead to automatic processing.
Some of this training is the consequence of a lifetime’s experience (e.g., recognition of letters), and it is
possible to unlearn automatic processes that have been developed as a consequence of a lifetime of learning.
Dulaney and Marks (2007) found that some automatic processing can be unlearned, although over 10,000
training trials were needed before this effect was observed; a finding that, from an applied point of view,
demonstrates the difficulty of requiring operators to “unlearn” overlearned abilities and tendencies.
Conversely, LaBerge (1973) and Schneider and Shiffrin (1977) clearly demonstrated that the special status of
automatic processing of critical key targets can also be developed within a relatively short period of practice.
These findings suggest that when a task environment is analyzed, it is important to identify critical
signals (and these need not necessarily be verbal) that should always receive immediate priority if they are
present. For medical personnel, these might be a pattern of patient symptoms that require immediate response,
or for the air traffic controller they might be the trajectories of two aircraft that define a collision course.
Training regimes should then develop the automatic processing of those signals. In such training, operators
should be presented with a mixture of the critical signals and others and should always make the same
consistent responses to the critical signals (see Schneider, 1985; and Rogers, Rousseau, & Fisk, 1999; see also
Chapter 7).
In this regard, there is an advantage in calling attention to critical information by developing automatic
processing rather than by simply increasing the physical intensity of the stimulus. First, as we described in the
discussion of alarms in Chapter 2, loud or bright stimuli may be distracting and annoying and may not
necessarily ensure a response. Second, physically intense stimuli are intense to all who encounter them.
Stimuli that are “subjectively intense” by virtue of automatic processing, such as the sight or sound of one’s
own name, may be “personalized” to alert only those for whom the alert is relevant.
At any level of perceptual processing it should be apparent that the accuracy and speed of recognition
will be greatest if the displayed stimuli are presented in a physical format that is most compatible with the
visual representation of the unit in memory. For example, the prototypal memory units of letters and digits
preserve the angular and curved features as well as the horizontal and vertical ones. As a consequence,
“natural” letters that are not distorted into an orthographic grid should be recognized more easily than letters
formed with only horizontal and vertical strokes. These suggestions were confirmed in recognition studies
comparing digits constructed in right-angle grids with digits containing angular and curved strokes (e.g., Ellis
& Hill, 1978; Plath, 1970). This advantage was enhanced at short exposure durations, as might be typical of
time-critical environments.
A similar logic applies to the use of lowercase print in text. Since lowercase letters contain more variety
in letter shape, there is more variety in word shape and so there is a greater opportunity to use this information
as a cue for holistic word-shape analysis. Tinker (1955) found that subjects could read text in mixed case
BETTER THAN IN ALL CAPITALS. However, the superiority of lowercase over uppercase letters appears
to hold only for printed sentences. For the recognition of isolated words, the words appear to be better
processed in capitals than in lowercase (Vartabedian, 1972); a finding recently replicated using modern
computer displays (Sheedy, Subbaram, et al., 2005). These findings would seemingly dictate the use of capital
letters in display labeling (Grether & Baker, 1972), where only one or two words are required, but lowercase
in longer segments of verbal material.
More recently, capital letters and lowercase letters have been used in the same word for drug labeling.
‘Tall Man’ lettering, which was developed in the late 1990s for the display of similar-looking drug
names which are considered to be confusable, highlights differences between similar drug names by
capitalizing dissimilar letters (e.g., cefUROXime and cefTAZIDime). Darker, Gerret, et al. (2011) examined
the effectiveness of “Tall Man” lettering and found that compared to lowercase text, the use of Tall Man
lettering does improve the perception of drug names. However, no difference was seen between uppercase
text and Tall Man lettering. Similar to the studies of Vartabedian, it appears that the advantage of Tall Man
lettering is through the larger size and greater visibility of uppercase letters themselves within the Tall Man
lettering scheme, and not through their creation of more distinct (unitization) word shapes of the drug name.

Van Overschelde and Healy (2005) examined whether a blank space is an important cue in the perception of
written words; specifically, the space separating letters within a word and the space separating lines of text.
Increasing the space between letters disrupts the unitization of words, which in turn slows down the reading
process and improves the identification of letters within words. Increasing the space between lines of text
speeds up the reading process as well as the identification of individual letters and words (see also Paterson &
Jordan, 2010). The latter finding has important pedagogical implications; for example, increasing the spaces
between lines in children’s books will assist the child’s ease of reading.
It appears that this benefit may carry over to the processing of unrelated material such as alphanumeric
strings by defining high-order visual “chunks” (see Chapter 7). Klemmer (1969) argues that there is an
optimum size of such chunks for encoding unrelated material. In Klemmer’s experiment strings of digits were
to be entered as rapidly as possible into a keyboard. In this task the most rapid entry was achieved when the
chunks between spaces were three or four digits long. Speed declined with either smaller or larger groups.
Using a data-entry task, Fendrich and Arengo (2004) found evidence of flexible chunking strategies related to
both the length of the string and repetitions within it. In their experiment, subjects had a tendency to evaluate
the length of a string when planning their keystrokes and transcribed the string in chunks based on both the
length of the string and any repetitions of digits within it. These findings have
important implications for deciding on formats for various kinds of displayed material—license plates,
identification codes, or data to be entered on a keyboard. For example, repeated digits in telephone numbers
should make them easier to remember.
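Klemmer's finding translates directly into a simple formatting rule: break unrelated digit strings into chunks of three or four characters. The function below is a minimal sketch of such a formatter; the chunk sizes come from the findings above, but the function itself is only an illustration.

    def chunk_digits(digits, chunk_size=4):
        """Insert spaces so an unrelated digit string is displayed in chunks of
        3-4 characters, the group sizes Klemmer (1969) found fastest to key in."""
        digits = digits.replace(" ", "")
        return " ".join(digits[i:i + chunk_size]
                        for i in range(0, len(digits), chunk_size))

    print(chunk_digits("4929731265418703"))         # 4929 7312 6541 8703
    print(chunk_digits("738214956", chunk_size=3))  # 738 214 956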
As a result of unitization, words are both perceived faster and understood better than are abbreviations or
acronyms. This difference would suggest that words should be used instead of abbreviations, except when
space is at an absolute premium (Norman, 1981). This guideline is also based on the tremendous variety,
across people, in their conception of how a given word should be abbreviated (Landauer, 1995). The cost of a
few extra letters is surely compensated for by the benefits of better understanding and fewer blunders. Where
abbreviations are used, as for email addresses for example, Norman (1981) suggests that at a minimum
relatively uniform abbreviating principles should be employed (i.e., all abbreviations of common length) and
that the abbreviated term should be as logical and meaningful to the user as possible.
Moses and Ehrenreich (1981) have summarized an extended evaluation of abbreviation techniques and
conclude that the most important principle is to employ consistent rules of abbreviation. In particular, they
find that truncated abbreviations, in which the first letters of the word are presented, are understood better
than contracted abbreviations, in which letters within the word are deleted (see also Ehrenreich, 1985). For
example, reinforcement would be better abbreviated by reinf than by rnfnt. This finding makes sense in terms
of our discussion of reading since truncation preserves at least part of any unitized letter sequence. Ehrenreich
(1982) concludes that whatever rule is used to generate abbreviations, rule-generated abbreviations are
superior to subject-generated ones, in which the operator decides the best abbreviations for a given term.
Similar findings of rule-based consistency should apply to the standardization of personal email
addresses; for example, the consistent use of the first initial and up to seven truncated (not contracted) letters
of the last name. The use of middle initials makes little sense, since senders are not likely to know them, and the use of a hyphen or underscore adds no information but only invites confusion and uncertainty. Rau and
Salvendy (2001) have developed a number of principles for the design of email addresses based on these and
other findings to make email addresses containing abbreviations of location, organization, and so on, more
memorable.
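A minimal sketch of such a rule-based scheme is given below (Python). The "first initial plus up to seven truncated letters of the last name" rule follows the suggestion in the text; the function names, parameter defaults, and example names are our own illustrative assumptions.

def truncate(word: str, length: int = 5) -> str:
    """Truncated abbreviation: keep the first `length` letters (e.g., 'reinf')."""
    return word[:length].lower()

def email_local_part(first_name: str, last_name: str, max_surname: int = 7) -> str:
    """Consistent email scheme: first initial plus up to seven truncated letters
    of the last name, with no hyphens, underscores, or middle initials."""
    return (first_name[:1] + last_name[:max_surname]).lower()

print(truncate("reinforcement"))                      # 'reinf'
print(email_local_part("Elizabeth", "Smith"))         # 'esmith'
print(email_local_part("Alexandra", "Richardson"))    # 'arichard'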

3.2 Context-Data Tradeoffs


The distinction between bottom-up and top-down processing is important for the design of text displays and
code systems. An example of the trade-off of design considerations between bottom-up and top-down
processing can be seen when a printed message is to be presented in a display in which space is at a premium
(e.g., the display on a hand held device). Given certain conditions of viewing (high stress or vibration), the
sensory qualities of the perceived message may be far from optimal. A choice of designs is thereby offered as
shown in Figure 6.3: (1) Present large print, thus improving the bottom-up sensory quality but restricting the number of words that can be viewed simultaneously on the screen (and thereby limiting top-down processing). (2) Present more words in smaller print, enhancing top-down processing at the expense of
bottom-up processing. Naturally the appropriate text size will be determined by an evaluation of the relative
contribution of these two factors. For example, if there is more redundancy in the text, smaller text size is
indicated. However, if the display contains random strings of alphanumeric symbols, or relatively
unpredictable sequences, there is little opportunity for top-down processing, and larger presentation of fewer
characters is advised.

FIGURE 6.3 The trade-off between top-down and bottom-up processing in display of limited size. The two dashed lines represent different
amounts of contextual redundancy: (a) high context of printed text; (b) low context of isolated word strings.

For example, if limited page space is available for a short article about a basketball game, it is better to
expand the size of the box score (in which little context is available to “guess” the numerical values) and
reduce the size (by reducing the font) of the narrative story, which of course has context. Similarly, on a phone list, it is better to reduce the size of the name and increase the size of the digits in the phone number or the less familiar e-mail address (e.g., Smith, E. 4456-2874, esmith@XXX.edu). If the display or viewing quality is extremely poor, larger size is again suggested. It is important that the system designer be aware of the factors that influence the trade-off between data-driven and context-driven processing, so that the optimum point on the trade-off can be selected.
Finally, top-down processing may also be greatly aided through the simple technique of restricting a
message vocabulary. With fewer possible alternatives to consider, top-down hypothesis forming (i.e.,
guessing an unreadable word) becomes far more efficient.

3.3 Code Design: Economy Versus Security


The trade-off between top-down and bottom-up processing is demonstrated by the fact that messages of
greater probability (and therefore less information content) may be transmitted with less sensory evidence. We
have already encountered one example of this tradeoff in the compensatory relation between d’ and beta in
signal detection theory (Chapter 2). We learned there that as a signal becomes more frequent, thereby offering
less information, beta is lowered and the signal can be detected with less sensory evidence (i.e., at a lower d'). It is fortunate that this trade-off in human performance corresponds quite nicely with a
formal specification of the optimum design of codes, referred to as the Shannon-Fano principle (Sheridan &
Ferrell, 1974). In designing any sort of code or message system in which short strings of alphanumeric or
symbolic characters are intended to convey longer ideas, the Shannon-Fano principle dictates that the most
efficient, or economic, code will be generated when the length of the physical message is proportional to the
information content of the message. The principle is violated if all messages are of the same length. Thus,
high-probability, low-information messages should be short, and low-probability ones should be longer.
In fact, all natural languages roughly follow the Shannon-Fano principle. Words that occur frequently (a,
of, or the) are short, and ones that occur rarely tend to be longer. This relation is known as Zipf’s law (Ellis &
Hitchcock, 1986). The relevant finding from the viewpoint of human performance is that adherence to such a
code reinforces our natural tendencies to expect frequent signals and therefore require less sensory evidence
for those signals to be recognized. In other words, high-probability messages should be short, whereas low-
probability messages should be longer. For example, in an efficient code designed to represent engine status,
the expected normal operation might be represented by N (one unit), whereas HOT (three units) should
designate a less-expected, lower-probability overheated condition.
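To make the principle concrete, the sketch below (Python) computes each message's information content, -log2(p), and assigns it a code length of roughly that many symbols, so that frequent, low-information messages get short codes. The message set and probabilities are invented for illustration, and the length rule is a Shannon-Fano-style approximation rather than a full coding algorithm.

import math

# Hypothetical engine-status messages and their probabilities.
messages = {"normal": 0.90, "hot": 0.07, "failed": 0.03}

for msg, p in messages.items():
    info = -math.log2(p)                 # information content in bits
    code_len = max(1, math.ceil(info))   # length proportional to information content
    print(f"{msg:>7}: p={p:.2f}  info={info:.2f} bits  code length ~ {code_len}")

# Output (approximately):
#  normal: p=0.90  info=0.15 bits  code length ~ 1
#     hot: p=0.07  info=3.84 bits  code length ~ 4
#  failed: p=0.03  info=5.06 bits  code length ~ 6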
Rau and Salvendy (2001) explored the utility of the Shannon-Fano principle to provide guidance for the
design of email addresses. For generating email addresses that are both memorable and meaningful, they
recommend that for geographical information, countries should be short and cities or states (conveying more
information) should be longer. For example, using codes like telephone area codes (three digits), whose length
is similar to the length of national domains, is better than using longer codes such as postal ZIP codes (five
digits). Furthermore, for organizational information, organizations should be short and departments should be
longer. Several other properties of a useful code design have been summarized by Bailey (1989).
In the context of information theory, there is a second critical factor in addition to efficiency that must be
considered when a code or message system is designed. This is security. The security factor illustrates again
the trade-offs often encountered in human engineering. As we saw above, the Shannon-Fano principle is
intended to produce maximum processing efficiency, which is compatible with perceptual processing biases.
However, it may often be the case that relatively high-frequency (and therefore short) messages of a low
information content are in fact very important. It is therefore essential that they be perceived with a high
degree of security. In these instances enhanced data quality should be sought and the principle of economy
should be sacrificed by including redundancy, as discussed in Chapter 2. Wickens, Prinett, et al. (2011) found
that redundant text and speech increased accuracy (security) but reduced speed (efficiency). The security advantage of redundancy is particularly pronounced when sensory processing is degraded.
Redundancy is accomplished by allowing a number of separate elements of the code to transmit the same
information. For voice communications, the use of a communications-code alphabet in which alpha, bravo,
and charlie are substituted for a, b, and c is a clear example of such redundancy for the sake of security. The
second syllable in each utterance conveys information that is highly redundant with the first. Yet this
redundancy is advantageous because of the need for absolute security (communication without information
loss) in the contexts in which this alphabet is employed. It is possible to look on the trade-off between
efficiency and security in code design as an echo of the trade-off discussed in Chapter 2 between maximizing
information transmission and minimizing information loss. Certain conditions (orthogonal dimensions and
adherence to the Shannon-Fano principle) will be more efficient, and other conditions emphasizing
redundancy will be more secure.

4. Recognition of Objects
4.1 Top-Down and Bottom-Up Processing
The combination of bottom-up and top-down processing involved in word perception characterizes the
perception of everyday objects as well (as we explored in the context of object processing in Chapter 3). For
example, just as letters are perceived, in part, through feature analysis, so Biederman (1987) has proposed that
humans recognize objects in terms of combinations of a small number of basic features, which consist of
simple geometric solids (e.g., straight and curved cylinders and cones). An example is shown in Figure 6.4.
Biederman’s theory suggests that the designers of three-dimensional graphics displays might well capitalize
on these basic features by fabricating objects that can be easily recognized without needing to incorporate
excessive detail. This work has been extended into the field of machine vision in which the identity of real-
world scenes can be inferred from aggregated statistics of low-level features (Oliva & Torralba, 2007).
The role of top-down processing in object recognition is as important as it is in word recognition. Despite
the complexity of the object recognition we perform on a daily basis, such as interpreting road signs while driving in traffic, our experience tells us that our brains solve this problem very efficiently and with seemingly little cognitive effort. Object recognition based solely on bottom-up processing is problematic, given that even minor changes in lighting, or the presence of shadows, occlusions, or reflections, can make bottom-up recognition almost impossible. In this regard, top-down processing plays a
pivotal role in how we recognize objects within a complex visual scene. Recent research suggests that we
employ a blend of both bottom-up and top-down processes, whereby top-down processes predict and guide processing of the visual scene, bottom-up information is used to validate or refute these predictions, and the predictions are refined in turn, and so on
(Kveraga, Ghuman, & Bar, 2007). It is this notion of a “proactive brain,” whereby the brain is continually
generating predictions that facilitate perception and cognition (Bar, 2007) that we will explore under the
auspices of Situation Awareness in Chapter 7.

FIGURE 6.4 Proposed set of primitive geometric features, or geons, used in object recognition. The attributes or dimensions that distinguish
each geon from others in the list are shown on the right.

Given the role of top-down processing in generating predictions about the visual scene, it is not surprising that objects presented in a familiar context are faster to recognize and localize (Oliva &
Torralba, 2007). The contextual relationships between objects in a visual scene can be either physical (e.g.,
inter-relationship with other objects) or semantic (e.g., a fire hydrant has a well-defined orientation and size);
both have an impact on object recognition within a visual scene. The strength of the contextual relationship
between objects in a scene can change; for example, a dinner plate would be expected to be on a table, not on
the floor (except for the dog!). The accuracy of object recognition should therefore be a function of the
strength of relationship between the context and the object (Oliva & Torralba, 2007).
Biederman, Mezzanotte, et al. (1981) had subjects recognize rapidly exposed objects which were in
either appropriate or inappropriate contexts, where appropriateness was defined in terms of several expected
properties of the objects (e.g., the object must be supported, and it should be of the expected size given the
background). The researchers found that if the object was appropriate, it was detected equally well at visual
angles out to three degrees of peripheral vision. If it was not, performance declined rapidly with increased
visual angle from fixation.

4.2 Pictures and Icons


The fact that pictures can be recognized as rapidly as words leads to the potential application of pictorial
symbols or icons to represent familiar concepts. Highway symbols and signs in public buildings are familiar
examples of pictures being used to represent or replace words. In a similar way, icons have become a standard
feature of computer displays and, more recently, hand held devices (Figure 6.5), where their value over words
in allowing rapid processing has been demonstrated (Camacho, Steiner, & Berson, 1990).
Icons attempt to represent objects, concepts, or functions by relying on our ability to learn the meaning of the icon using our pre-existing knowledge (Isherwood, 2009), in much the same way that children learn language (McDougall, Forsythe, et al., 2009). When designing an icon set for a specific application, it is
important that the meaning of each icon is clear (and if not immediately obvious, its meaning can be learned
quickly), and not confusable with other icons in the set. One can imagine then that ensuring a consistent
interpretation of icons across a user community varying in age, culture, and preexisting knowledge presents a
considerable challenge to the human factors engineer. Understandably then, much research has been
conducted to identify the factors that are important in determining the usability of icons; for example,
Isherwood, McDougall, and Curry (2007) examined the effects of icon concreteness, visual complexity,
semantic distance, and familiarity on an icon identification task.
The concreteness of an icon relates to the extent to which it depicts a real-life object or person (Figures
6.5a and 6.5b), as opposed to more abstract depictions using lines and arrows (Figures 6.5c and 6.5d).
Intuitively, we might expect concreteness to be the most important determinant of an icon's usability.
Although that might be true when the icon is unfamiliar, the effect of concreteness diminishes over time as
users gain more experience with them (Isherwood et al., 2007). However, the diminishing effect of
concreteness due to user experience is not found for mobile phone icons (Schröder & Ziefle, 2008), but the
icons used in Schröder and Ziefle’s study were approximately one-third of the size of those icons used by
Isherwood et al. As we have already discussed, these nuances highlight the importance of comprehensive
usability testing in design: in this case, an icon set used for computer-based applications might not readily transfer to hand-held applications, where the small visual angle may render critical details obscure.

FIGURE 6.5 Typical icons for a computer display. Source: Isherwood, S. J., McDougall, S. J. P., and Curry, M. B. (2007). Icon Identification
in Context: The Changing Role of Icon Characteristics With User Experience. Human Factors, 49(3), 465–476.

The visual complexity of an icon relates to the degree of detail or intricacy in an icon; for example Figure
6.5b has a high level of visual complexity, whereas Figure 6.5d does not. Although more detailed depictions
of real-life objects should allow users to access their pre-existing knowledge more quickly to infer meaning,
research has shown that complexity increases visual search times, even after considerable training
(McDougall, Curry, & de Bruijn, 2000). In addition, when icons are complex, the features that discriminate them will be hard to discern on small handheld displays. The background upon which an icon is presented should also be
considered; in general a higher contrast ratio between the icon and its background results in quicker search
times (Huang, 2008).
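The contrast ratio mentioned here can be quantified. The sketch below (Python) uses the WCAG definition of relative luminance and contrast ratio, which is one common way to check a foreground glyph against its background; the particular colors are arbitrary examples, and the WCAG measure is only a proxy for the display contrast studied in this research.

def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG formula)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb) -> float:
    r, g, b = (_linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg) -> float:
    """WCAG contrast ratio, ranging from 1:1 (identical) to 21:1 (black on white)."""
    lighter, darker = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

# A dark icon glyph on a light background versus on a mid-gray background:
print(round(contrast_ratio((20, 20, 20), (250, 250, 250)), 1))   # high contrast (about 17-18)
print(round(contrast_ratio((20, 20, 20), (128, 128, 128)), 1))   # lower contrast (about 4.7)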
Semantic distance refers to the degree of closeness of the relationship between the icon itself and its
meaning. For example, Figure 6.5e shows a direct, closely-coupled relationship. Figure 6.5f shows a
relationship in which the meaning needs to be inferred from the icon, and Figure 6.5g shows one in which the relationship between the icon and its meaning is completely arbitrary and must be learned. As
such, semantic distance has been found to be an important determinant of the usability of novel icons;
especially when icon-meaning relationships are being established (Isherwood et al., 2007).
Familiarity can be defined both in terms of a user’s experience of the icon itself and of the object that is
depicted by the icon. For example, a user might be familiar with the object depicted in Figure 6.5b (a book)
but might not be familiar with its meaning of "library." Familiarity has been found to be as important a determinant of icon usability as semantic distance; however, its effects are longer-lasting, a finding attributed to generally easier access to long-term memory representations (Isherwood et al., 2007).
These findings suggest that although concreteness is an important consideration in the design of rarely
encountered icons, other characteristics, particularly semantic distance and familiarity, need to be considered
when designing icons that are frequently used, especially for the older members of the population (Schröder &
Ziefle, 2008). Although low concreteness, low familiarity, and high semantic distance all degrade icon
processing, the latter two particularly degrade processing for older people.
The effects of aesthetic appeal on the usability of icons have also been examined. McDougall, Reppa, et
al. (2009) demonstrated that for complex icons, perceived aesthetic appeal has a beneficial effect on search
times. In other words, the detrimental impact of visual complexity can be reduced using more aesthetically
pleasing icons. These are the sorts of relationships and tradeoffs that always confront the human factors
engineer.

4.3 Sounds and Earcons


Just as print has an auditory analogy in the spoken word, so visual icons have an auditory analogy in the
sound of notifications and alerts. The design of speech and non-speech auditory notifications is well established in domains such as power plants, operating theatres, aircraft cockpits, and motor vehicles, to name but a few
(Noyes, Hellier, & Edworthy, 1996; Marshall, Lee, & Austria, 2007). More recently, the ubiquitous presence
of mobile devices has brought with it a similar prevalence in the use of auditory notifications to mitigate their
limited display real-estate. In all cases, auditory notifications may have particular value when visual
processing is engaged by other aspects of the task (see Chapter 11).
Earcons refer to abstract musical tones that can be structured in combinations varying in intensity, pitch
and timbre. As such, earcons are flexible in that they can be attributed to any function or object. Earcons can
also be designed in families so that they represent hierarchies of related options, such as items within menu
categories. Of course, a problem with this flexibility is that users have to memorize the relationship between each earcon and its meaning, just as with their visual counterparts (Garzonis, Jones, et al., 2009). In
addition, when earcons are presented concurrently, even in limited numbers, the accurate identification of
individual notifications is rapidly diminished (McGookin & Brewster, 2004).
Auditory icons, unlike earcons, use auditory metaphors to relate to their meaning; for example, the sound of crumpling paper accompanying a file being deleted (e.g., moved into a recycle bin). However, auditory
metaphors for more abstract concepts, such as copying a file, are more difficult to find. From an engineering
psychology perspective, another limitation is that they might be confused with actual environmental sounds,
such as a tire-skidding auditory icon used as a vehicle collision warning.
Garzonis and colleagues (2009) compared the effectiveness of earcons and auditory icons for use in
mobile devices in terms of their intuitiveness, learnability, memorability, and user preference. They found that
auditory icons perform significantly better than earcons across all four measures. The implications of these
results for the design of auditory icons for mobile devices are strikingly similar to those from the field of
visual icons. Where possible for commonly identified sounds, or those that require little training, auditory
icons should be used, especially for those notifications that are rarely encountered. For applications in which
more abstract sounds are required, research has found that the learnability of a set of earcons is greatly
enhanced by avoiding similar temporal patterns for two or more members of the set, and increasing the range
of type of sounds used (Edworthy, Hellier, et al., 2011).
While voice displays have been used for decades, more recent developments in these technologies have
focused on a hybrid approach to speech and auditory icons—spearcons (Walker & Kogan, 2009). Spearcons
are created by speeding up a spoken phrase like a menu item (so that its duration is about 250 msecs) without
an associated change in pitch so that the meaning of the item can be extracted in much less time than with
normal speech. Due to the compressed nature of the sound, the spearcon might not be completely
comprehensible. However, a short learning session is all that is required to associate the spearcon with its
meaning (Walker & Kogan, 2009). Spearcons have been successfully used in hierarchical auditory menus for
visually impaired users as a direct replacement for textual menu items (Sodnik, Jakus, & Tomazic, 2011), to rapidly search contact menus on mobile devices (Jeon & Walker, 2009), and to present mathematical material verbally (Bates & Fitzpatrick, 2010).
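A spearcon of this general kind can be approximated with off-the-shelf audio tools. The sketch below (Python) assumes the librosa and soundfile libraries are available and that a recorded menu phrase exists at the hypothetical path "save_message.wav"; the 250 ms target follows the description above, and this is only a rough illustration, not the procedure used by Walker and Kogan.

import librosa
import soundfile as sf

TARGET_DURATION_S = 0.25   # roughly 250 ms, as described for spearcons

# Load the spoken menu phrase (hypothetical file).
phrase, sr = librosa.load("save_message.wav", sr=None)
duration = len(phrase) / sr

# Time-stretching with a phase vocoder compresses duration without shifting pitch.
rate = duration / TARGET_DURATION_S        # rate > 1 speeds the phrase up
spearcon = librosa.effects.time_stretch(phrase, rate=rate)

sf.write("save_message_spearcon.wav", spearcon, sr)
print(f"{duration:.2f} s phrase compressed to ~{len(spearcon) / sr:.2f} s")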

5. Comprehension
Whether presented by voice or by print, words are normally combined into sentences whose primary function
is to convey a message to the receiver. So far our discussion has considered how the meaning of the isolated
symbols, words, and word combinations is extracted and how the visual system processes strings of words in
text. In this section we will consider properties of the word strings themselves, and not just their physical
representation, that influence the ease of comprehension (Broadbent, 1977). A major focus of this section is the ease of comprehending instructions. It turns out that many of the principles discussed here
are equally relevant to the ease of encoding and storing material in working memory, to be discussed in the
next chapter. Indeed, the border between comprehension of instructions and storage in memory is a fuzzy one.

5.1 Instructions
Good instructions are not only those that can be followed while the instructions are being consulted, but also
those that can be easily memorized (and hence remembered), even for the short period of time during which
the gaze may be diverted from the visual text on which they are written or displayed, or attention may fade from the spoken instruction of what to do.
Instructions and procedures vary dramatically in the ease with which they may be understood and of
course this has critical importance for society. For example, Laskowski and Redish (2006) found that typical
ballots in the United States violate many basic principles of instruction design. Some examples are shown in
Figure 6.6. The U.S. government has recently passed a law mandating simplification of the “fine print” on
many consumer documents, such as credit card information.
Wordy phrases that are difficult to understand are also encountered in legal documents and instructions.
Jury instructions have often been criticized because they contain convoluted sentence structure, complicated and confusing legal jargon, and words with multiple interpretations. Astonishingly, it appears that tests of
juror comprehension of instructions sometimes reveal chance accuracy (Miles & Cottle, 2011). Consider this
example by Peter Tiersma, who was a member of a task force recently charged with drafting more comprehensible instructions in California:

FIGURE 6.6 Examples of good and bad practice for voting instructions from various ballots in the United States (1998–2004). Source:
Adapted from Laskowski, S. J. and J. Redish (2006)

“Failure of recollection is a common experience, and innocent misrecollection is not uncommon.”

His suggested rewording attempts to explain in simple terms a critical instruction to the jury:
“People often forget things, or they may honestly believe that something happened even though it turns out later that they were
wrong.”

In writing instructions or procedures that are easy to understand, such as the rewritten example above, it
is often sufficient to follow a set of straightforward, common sense principles similar to those outlined by
Tiersma (2006):
1. Keep the audience in mind.
2. Adopt an appropriate style and tone.
3. Use logical organization. For example, number and physically separate the different points to be made
(or procedural steps to be taken), as has been done here rather than combining them in a single
narrative.
4. Be as concrete as possible.
5. Use pronouns when appropriate. Lawyers tend to avoid using “I” and “you,” which can seem rather
pompous to most audiences. However, ambiguous pronouns like “it” or “this” that refer to nouns
identified much earlier in the text could cause confusion (Bailey, 1989).
6. Try to use verbs instead of nouns. For example, rather than telling jurors to “take into consideration,”
it is better to ask them to “consider something.”
7. Keep grammatical constructions simple and straightforward, use ordinary word order (i.e., subject-verb-object), and avoid the passive voice.
The use of these types of guidelines to modify juror instructions appears to have met with modest
success; a number of studies have shown minor improvements in juror comprehension. (For a review, see
Miles & Cottle, 2011). The writing of understandable procedures and instructions may also be aided by a
number of readability formulas (Bailey, 1989). These formulas take into account such factors as the average
word and sentence length, in order to make quantitative assessments of the likelihood that a passage will be
correctly understood by a readership with a given educational level.
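One widely used formula of this kind is the Flesch Reading Ease score, computed in the sketch below (Python). This is offered only as an example of the class of formulas the text describes, not necessarily one of those summarized by Bailey, and the syllable counter is a crude vowel-group heuristic rather than a dictionary-based count.

import re

def count_syllables(word: str) -> int:
    """Crude heuristic: count groups of consecutive vowels (minimum of one)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words).

    Higher scores indicate easier text (roughly 90-100 for very easy prose,
    below 30 for very difficult prose)."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / sentences) - 84.6 * (syllables / len(words))

legalese = ("Failure of recollection is a common experience, "
            "and innocent misrecollection is not uncommon.")
plain = ("People often forget things, or they may honestly believe that something "
         "happened even though it turns out later that they were wrong.")

print(round(flesch_reading_ease(legalese), 1))   # lower score (harder to read)
print(round(flesch_reading_ease(plain), 1))      # higher score (easier to read)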
However, the simplification of instructions at solely the word or sentence level does have its critics, especially among those working in the domain of health literature (Zarcadoolas, 2010). Rather, efforts to make instructions more understandable should also focus on the overall linguistic structure and function of the text. For example, reducing the proportion of function words (words that do not add direct meaning) makes the content words in the document easier to assimilate, as they are in closer proximity to each other (Leroy, Helmreich, et al., 2008). In addition, the cognitive activities of the reader should be taken into account (Crossley, Greenfield, & McNamara, 2008), and visual aids should accompany the instructions (Friedman & Hoffman-Goetz, 2006). Returning to
the juror instructions, Miles and Cottle (2011) argue that comprehension can be greatly enhanced by
connecting the personal experiences of jurors with legal concepts and procedures through the use of analogies.
For example, the legal concept of “reasonable possibility” can be explained through the analogy of balls being thrown at a pane of glass: “A tennis ball had ‘a chance,’ a steel ball was ‘almost certain,’ and a baseball had a ‘reasonable possibility’” of breaking the glass (Brewer, Harvey, & Semmler, 2004).
Taking the findings from both the legal and health literature domains together, it seems that simplifying
instructions solely at the word and sentence level does not ensure that the instructions are easier to understand.
Instead, connecting with the reader at a deeper, semantic level appears to have a greater impact on the
probability that the text will be understood.
However useful and necessary these guidelines may be, they do not consider some other important
characteristics of comprehension that are directly related to principles in cognitive psychology and
information processing. We will consider five general categories: context, command versus status
information, linguistic factors, working memory, and the role of pictures.

5.2 Context
The important role of context in comprehension is to influence the perceiver to encode the material in the
manner that is intended. This top-down processing influence was considered in two different forms in Chapter
2: the influence of probability on response bias and the influence of context on information. Furthermore,
context should provide a framework on which details of the subsequent verbal information may be hung.
Bransford and Johnson (1972) have demonstrated the dramatic effect that the context of a descriptive picture
or even a thematic title can exert on comprehension. In their experiment, the subjects read a series of
sentences that described a particular scene or activity (e.g., the procedures for washing clothes). The subjects
were asked to rate the comprehensibility of the sentences and were later asked to recall them. Large
improvements in both comprehensibility and recall were found for subjects who had been given a context for
understanding the sentences prior to hearing them. This context was in the form of either a picture describing
the scene, or a simple title of the activity. For those subjects who received no context, there was little means
of organizing or storing the material, and performance was poor.
For context to aid in recall or comprehension, however, it should be made available before the
presentation of the verbal material and not after (Bower, Clark, et al., 1969; see also Laskowski and Redish,
2006 and Figure 6.6). Like a good filing system, context can organize material for comprehension and
retrieval if it is set up ahead of time.

5.3 Command Versus Status


Another issue in the delivery of instructions is related to the distinction between status and command
information. Should a display simply inform the operator of an existing status, such as the aircraft attitude
directional indicators discussed in the previous chapter (Figure 5.8) or a verbal statement (“Your speed is too
high”), or should a display command an action to be carried out (“Lower your speed”)?
Arguments can be made on both sides of the issue, and the data are not altogether consistent. For
example, in designing flight path displays to help pilots recover from unusual aircraft states (e.g., inverted),
Taylor and Selcon (1990) found that a display that told the pilot what direction to fly in order to recover was
more effective than one that showed the aircraft’s current status. In a similar task, Wickens, Self, et al. (2007)
found that a command icon reduced both the time taken to initiate the recovery and the number of flight
control errors relative to a status display. However, Barnett (1990) observed no difference in performance on a
decision-aiding task between status and command displays. Similarly, Sauer, Wastell and Schmeink (2008)
found no differences in performance and subjective judgment of usability between command and status
displays used to manage a central heating system. Finally, studies by Crocoll and Coury (1990) and Sarter and
Schroeder (2001) obtained results that favored status displays, particularly when the information presented
was not totally reliable (as is often the case with automated decision aids).
What conclusions can be drawn from these studies? First, it is probably true that under conditions of high
stress and time pressure, like the aircraft recovery studies, a command display is superior to a status display,
as the latter will require an extra cognitive step to go from what is, to deciding what should be done. Second,
these guidelines might be modified if time pressure is relaxed and/or the source of the status or command
information is not fully reliable. Since the command display is a form of automation, this finding is relevant to
the issue of imperfect automation— an issue addressed in Chapter 12.
Finally, as is so often the case in human performance, a strong argument can be made for redundancy,
presenting both status and command information. This is an approach reflected in the design of the Traffic
Alert and Collision Avoidance System (TCAS) installed in most commercial aircraft. A command tells the
pilot what to do to avoid a collision (“pull up”) while a status display presents the relative location of the
threatening traffic (Wickens, 2003). Redundancy of this sort, however, should be introduced only if any
possible confusion between what is status and what is command is avoided by making the two sources as
different from one another as possible. For example, in the case of the TCAS system, the voice command will
be easily distinguished from a pictorial status. But if such confusion is not guarded against, as in the case of directional information, a user's confusion of status (“you are left”) with command (“turn left”) could lead to disaster.

5.4 Linguistic Factors


5.4.1 NEGATIVES Statements that contain negatives always take longer to verify than those that do not.
Therefore, where possible, instructions should contain only positive assertions (i.e., “Check to see that the
power is off ”) rather than negative ones (“Check to see that the power is not on”). An added reason to avoid
negatives is that the not can sometimes be missed, overlooked, obliterated, or forgotten if the instructions are
read or heard in degraded or hurried circumstances. The conclusion to avoid negatives has also been
confirmed in applied environments. Newsome and Hocherlin (1989) observed this advantage in computer
operating instructions. In highway traffic-regulation signs, experiments have suggested that prohibitive signs,
whether verbal (“no left turn”) or symbolic, are more difficult to comprehend than permissive signs such as
“right turn only” (Dewar, 1976; Whitaker & Stacey, 1981). In designing forms to be filled out, such negative
phrases as “Do not delay returning this form even if you do not know your insurance number” are harder to
comprehend than positive phrases such as “Return this form at once even if you do not know your insurance
number” (Wright & Barnard, 1975).
5.4.2 ABSENCE OF CUES People are generally better at noticing that something unexpected is present than that
something expected is missing. The dangers that result when people must extract information from the
absence of cues or information are somewhat related to the recommendation to avoid negatives in instructions.
Fowler (1980) stated this point in his analysis of an airplane crash near the airport at Palm Springs, California.
He noted that the absence of an R symbol on the pilot’s airport chart in the cockpit was the only indication of
the critical information that the airport did not have radar. Since terminal radar is something pilots come to
depend on and the lack of radar is highly significant, Fowler argues that it is far more logical to call attention
to the absence of this information by the presence of a visible symbol than it is to indicate the presence of this
information with a symbol. In general, when there is something that an operator needs to know, it should be
indicated by the presence of a symbol, rather than its absence.
There is also a connection to more basic attention research in the guideline that important information
should be conveyed by the presence of displayed symbology. As noted in Chapter 3, changes will be better
noticed if signaled by the onset of displayed symbology (i.e., a light or line of text appears; the new state is a
presence) than by the offset of that same symbology (the new state is an absence; Yantis, 1993).
5.4.3 CONGRUENCE AND ORDER REVERSALS Instructions are often intended to convey a sense of ordered events.
This order is often in the time domain (procedure X is followed by procedure Y). When instructions are to
convey a sense of order, it is important that the elements in those instructions are congruent with the desired
order of events (DeSoto, London, & Handel, 1965). This would dictate that procedural instructions should
read, “Do A, then do B,” rather than “Prior to B, do A,” since the former preserves a congruence between the
actual sequencing of events and the ordering of statements on the page (Bailey, 1989). For example, a
procedural instruction should read, “If the light is on, start the component,” rather than, “Start the component
if the light is on.” Congruence can decrease working memory load, as we now discuss briefly, before the
deeper treatment of working memory in Chapter 7.

5.5 Working Memory Load


Poor instructions often reflect a structure that imposes unnecessarily on working memory (see Chapter 7) to
maintain information until it can be either used, or incorporated into the developing meaning of the text. As a
simple example, in the incongruent instructions from the previous paragraph (“start the component if the light
is on”), the user must hold the proposition “start component” in working memory until after the contingency
“light-on” is encountered.
As we introduced earlier in this chapter, the model of sentence comprehension proposed by Kintsch and Van Dijk (1978) characterizes such comprehension in terms of the number of propositions that need to be maintained in working memory, or retrieved from long-term memory, in order to integrate new information into the evolving script or schema conveyed by a string of sentences. On the one hand, more propositions lead to greater working memory demand, and hence poorer comprehension. On the other hand, given the assumption of Kintsch's model that the capacity of working memory in text comprehension is roughly four propositions, any new proposition that depends for its interpretation on the meaning of a proposition encountered more than four propositions back will require reinstatement of information no longer in working memory, either by a time-consuming memory search or (in the case of visual text) by rereading. Using the previous example, the instructions should reiterate the proposition "start the component if the light is on" if, for whatever reason, several other propositions have intervened.
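The working memory bookkeeping implied by this account can be illustrated with a toy sketch (Python). The four-proposition capacity follows the assumption cited above, but representing a "proposition" as a simple label and counting reinstatements with a rolling buffer are our own simplifications, not the full Kintsch and Van Dijk model.

from collections import deque

CAPACITY = 4   # approximate working memory capacity assumed in the text

def count_reinstatements(text_propositions, references):
    """Count how often a referenced proposition is no longer in the rolling
    four-proposition buffer and must be reinstated (re-read or retrieved).

    `text_propositions` is the ordered list of propositions in the text;
    `references` maps a proposition to the earlier proposition (if any)
    on which its interpretation depends. Both are toy representations.
    """
    buffer = deque(maxlen=CAPACITY)   # holds only the most recent propositions
    reinstatements = 0
    for prop in text_propositions:
        antecedent = references.get(prop)
        if antecedent is not None and antecedent not in buffer:
            reinstatements += 1       # costly memory search or rereading
        buffer.append(prop)
    return reinstatements

# "start component if light on" ... several intervening propositions ... a later check
props = ["P1_light_on", "P2_start_component", "P3", "P4", "P5", "P6", "P7",
         "P8_check_component_started"]
refs = {"P2_start_component": "P1_light_on",
        "P8_check_component_started": "P2_start_component"}

print(count_reinstatements(props, refs))   # 1: P2 has scrolled out of the buffer by P8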

6. MULTIMEDIA INSTRUCTIONS
We have described the role of text and pictures for presenting instructions and other information. Voice
synthesis allows voice to be added to the two visual media for presenting instructions. Considering the
strengths and limitations of human information processing enables three important guidelines to be proposed for using the three media to present instructions, related to the optimal medium, redundancy gain, and realism. The role of multimedia in presenting more elaborate educational material for longer-term retention
will be discussed in Chapter 7, and the implications for multitasking are discussed in Chapter 10.

6.1 The Optimal Medium


Text and pictures should logically be tailored to their respective strengths. Pictures or graphics can best
convey analog spatial relations and complex spatial patterns. Verbal material (whether print or speech) can best
convey more abstract information, including action verbs that do not have a strong spatial component (e.g.,
“read,” “comply”). If verbal information is lengthy, it should be visual (text) rather than auditory (speech),
because of the greater permanence of visual information, and the higher working memory demands required
to understand speech. While there is some evidence for advantages of providing different media to individuals
with different cognitive strengths (e.g., spatial graphics for those with higher spatial abilities), the strength of
this effect does not appear to be great (Yallow, 1980; Landauer, 1995; Pashler et al., 2008), and it is better to
choose the medium as a function of the material, and the task, based on an understanding of how people learn
from words and pictures (Mayer, 2012, in press).

6.2 Redundancy and Complementarity


Instead of considering each medium in isolation, design guidelines usually suggest that pairs of media should
be used in combination, to capitalize both on redundancy and complementarity, and upon the particular
strengths of each. True redundancy, whereby different channels provide identical information, is distinct from
complementarity, whereby two channels are used to convey complementary, but not identical information
(e.g., when integrating pictures and words, or integrating video and narrative in instructions).

Much of the research on audio-visual redundancy of presentation will be covered in Chapter 7 (multimedia
learning) and Chapter 10 (multitask performance). Here we emphasize one general point: Redundant
presentation of text and speech, such as simultaneous text and voice instructions from an air traffic controller
to a pilot (Helleberg & Wickens, 2003) is likely to improve the accuracy (security) of comprehending the
message, but may delay the processing time, because twice as much information needs to be processed
(Wickens et al., 2011). This finding is consistent with that articulated in Chapter 2 (information theory).
Much of the historical evaluation of combined media in instructions has evaluated the complementary
use of pictures (graphics) and text. Three investigations point to the advantage of picture-text
complementarity, even as they emphasize the relative strengths of different formats.
Booher (1975) evaluated subjects who were mastering a series of procedures required to turn on a piece
of equipment. Two of these combinations were complementary: one code was emphasized and the other
provided supplementary cues. Two others were related: the non-emphasized mode gave related but not
redundant information to the emphasized mode. Booher found the worst performance with the printed
instructions and the best with the pictorial emphasis/complementary print format. Although the picture was of
primary benefit in this condition, the complementary print clearly provided useful information that was not
extracted in the picture-only condition.
Schmidt and Kysor (1987) studied the comprehension of airline passenger safety cards, using samples
from 25 of the major air carriers. They found that those cards using mostly words were least well understood,
those employing mostly diagrams fared better, but the best formats were those in which words were directly
integrated with diagrams. The authors describe the value and use of arrows as attention-focusing and
attention-directing devices to facilitate this integration.
In the third study, Stone and Gluck (1980) compared subjects’ performance in assembling a model using
pictorial instruction, text, or a completely complementary presentation of both. Like Booher (1975), Stone and
Gluck found the best performance in the complementary condition. In the complementary condition, they
found that five times as much time was spent fixating the text as the picture. This finding is consistent with a
conclusion drawn by both Booher and by Stone and Gluck: The picture provides an overall context or “frame”
within which the words can be used to fill in the details of the procedures or instructions (see also Mayer,
2001). The importance of context was, of course, emphasized earlier in this chapter.
As we will learn in the next chapter, even short delays of a second or two can disrupt the quality of
information retained in working memory, and can impose a high cognitive load that may interfere with
comprehension. Guidelines for complementarity derived from the theory of cognitive load in instructions,
developed by Sweller and his colleagues (Sweller & Chandler, 1994; Sweller, Chandler, et al., 1990; Tindall-
Ford, Chandler, & Sweller, 1997), suggest the importance of integrating text with pictures as closely as
possible (rather than sequencing) in order to reduce the demands on working memory of retaining the textual
information until the relevant figures are located; or retaining the graphic information in working memory
until the relevant textual information is encountered. Mayer and Johnson (2008) suggest that complementarity
supports learning when the text is short, when the text highlights key points in the narration, and when the text
is placed next to the portion of the graphic that it describes. Under these conditions, unnecessary cognitive
processing is minimized.
The latter guideline, that of placing text next to its corresponding graphic, is consistent with the principle of spatial contiguity (Mayer, in press; Johnson & Mayer, 2012), which is a more specific case of the proximity compatibility principle discussed in Chapter 3 (Wickens & Carswell, 1995). For example, Tindall-Ford et al. found considerably worse comprehension of instructions offered in the separated format of Figure 6.7a than with those in the integrated format of Figure 6.7b. Jang, Schunn, and Nokes (2011) found that spatially distributed instructions, whereby multiple sources of information are placed side by side, both reduced cognitive load and improved task performance. Similarly, Holsanova, Holmberg, and Holmqvist
(2009) found that integrating text and graphics in order to reduce the physical distance between them on a
page makes it easier for readers to find the correspondences between text and illustration, and to mentally
integrate information from the two different sources. Finally, in order to reduce such extraneous processing
further, Mayer (in press) also recommends that spoken text and corresponding graphics be presented at the same time, in line with the temporal contiguity principle.

FIGURE 6.7a An example of visual-only instructions with text separated. Source: S. Tindall-Ford, P. Chandler, & J. Sweller, “When Two Sensory Modes Are Better Than One,” Journal of Experimental Psychology: Applied, 3(4) (1997), pp. 257–287. Reprinted by permission.

FIGURE 6.7b An example of visual-only instructions with text integrated. Source: S. Tindall-Ford, P. Chandler, & J. Sweller, “When Two Sensory Modes Are Better Than One,” Journal of Experimental Psychology: Applied, 3(4) (1997), pp. 257–287. Reprinted by permission.

Of course, the verbal information can be presented auditorily as well as in text form (e.g., the sound track of an instructional video). Here some research suggests an advantage for the auditory-pictorial combination over the text-pictorial combination (Tindall-Ford et al., 1997; Wetzel et al., 1994; Nugent, 1987; Mayer, in press). This
advantage can be related to cognitive load due to the freeing up of capacity in the visual channel (thus
allowing more processing of the pictures) by offloading some of the processing demands onto the auditory
channel (Mayer, in press).
Naturally, however, any effort to present verbal material in auditory form must be sensitive to the limits of that modality: nonpermanence implies that long, difficult material should not be presented aurally, and an auditory presentation that is related to pictures or graphics must ensure that the linkage to the particular picture (or part thereof) is made clear, in a manner analogous to the arrows in Figure 6.7b.
To summarize then:
1. Pictures and words have different strengths that can serve complementary interests (e.g., pictures
conveying spatial relations and concrete objects, words conveying abstract concepts and action verbs).
2. To be effective in this complementing, words and pictures should be linked. With visual text this is
easy through spatial proximity or lines (see Chapter 3). But it is more challenging with voice since one
needs to present voice at the time when one knows that the picture is being examined.
3. Voice, however, has the advantage of more facile concurrent processing with the visual picture (than is
the case with text) and can be particularly advantageous if the voice message is short, and timing with
the picture fixation can be assured. An example might be when the user clicks a command to expose a picture, and the click activates the voice content.
4. Use of different modalities in instruction should take into account concurrent task and environmental
activity. This would favor an auditory (voice) presentation in poor visibility, or when eyes may need to
look at places other than the picture, but would favor visual (text) presentation in a noisy environment or one loaded with verbal communication.
5. If the environmental and task context is uncertain, redundancy of text and voice
(complementing pictures) seems advisable.

6.3 Realism of Pictorial Material


If pictures and graphics do indeed contribute to the effectiveness of instructions, how realistic should those
graphics be? The consensus of research seems to be that more is not necessarily better (Spencer, 1988; Wetzel
et al., 1994). Simple line drawings appear to do just as well if not better than more elaborate artwork (Dwyer,
1967) or pictures (Schmidt and Kysor, 1987), which capture detail that is not necessary for understanding.
This is certainly consistent with some of the findings on icon realism discussed above.
While more realistic animations have been used to convey complex dynamic information successfully
(for example, Kühl, Scheiter, et al., 2011), studies have shown that animations can fail to be as efficient for
learning as static graphics. Amadieu, Mariné, and Laimay (2011) found that animations can cause high
extraneous cognitive load unless cues are used in the animation to guide the attention of the learner to the
key points of the animation. As such, designers should consider carefully whether animated models or static
visualizations are the most appropriate learning material (for a review of guidance, see Wouters, Paas, & van
Merriënboer, 2008). These findings will have some parallels in our discussion of unnecessary simulator fidelity
in Chapter 7.

7. PRODUCT WARNINGS
Nowhere has the study of comprehension had greater importance for the human factors community than in the
design of effective product warnings (for a recent review, see Wogalter & Laughery, 2006), which includes
drug and prescription warning labels. Manufacturers of products must provide their consumers with an
adequate warning of the dangers associated with their product’s use and instructions on how to avoid related
risk of injury. However, within the last two decades the number of legal cases in which plaintiffs have alleged
a failure-to-warn defect has increased exponentially (Dutcher, 2006; for a review of adverse drug events in
healthcare, see Morrow, North and Wickens, 2006). This has led to an increase in the number of warnings on a
product as manufacturers and pharmaceutical companies try to avoid litigation costs associated with product
liability lawsuits. This has occasionally led to some absurd results, such as a label on a baby stroller that reads
“Remove child before folding” (Dutcher, 2006). However, despite a large volume of research on the topic,
relatively little has been conducted on unsuspecting subjects, using accident data or real-world measures of
safety outcome (Ayres, 2006), or on all aspects of the warning process in a holistic manner (Mayer, Boron, et
al., 2007).
Existing standards for warnings indicate that a warning should have four components (e.g., ANSI Z535):
a signal word (e.g., “caution,” “warning,” “danger”), a statement of the nature of the hazard (e.g., “toxic
material”), an instruction statement (e.g., “use a respirator when using the product”), and a consequence
statement (e.g., “could cause death if inhaled”). However, from a human performance perspective, the goal of
product warnings is to get the user to comply with the warning and therefore use the product in a safe way, or
avoid unsafe behavior. For such compliance to succeed however, at least four information processing
activities must be carried out successfully (Wickens, Lee, et al., 2004). If any of these stages break down,
ultimate compliance will be compromised.
First, the warning must be noticed, an activity that depends upon the fundamental properties of selective
attention, as discussed in Chapter 3. It is for this reason that auditory warnings are more noticeable than visual
ones (Wogalter, Godfrey, et al., 1987), and when visual warnings are used, certain design principles should be employed to ensure that they capture visual attention. Visual warnings should be located so that
they will be “encountered” as the user carries out actions that are a necessary part of the equipment use. For
example, they might be close to a “power on” switch. Edworthy, Hellier, et al. (2004) found that placing
warning labels at the point where they are relevant (as opposed to a separate precautions section) improves
compliance. Williams and Noyes (2007) suggest the use of “smart warnings” that present warnings when the
individual is confronted by a risky situation. By tailoring warnings to specific user and situational
characteristics, habituation and desensitization to warnings can be reduced (Wogalter & Conzola, 2002).
Second, warnings must be read. However, anyone who has ever gazed at the product warnings on the side of a small medicine container realizes that readability is often thwarted by very fine print, just as it is also thwarted by the clutter of an excessive number of warnings. We have already discussed the Food and Drug Administration's recommendation of the use of 'Tall Man' lettering for drug labeling. In a similar vein,
Morrow, Weiner, et al. (2007) found that using larger font sizes (12 to 14 point versus 8 to 10 point) and
presenting less information in instructions for the use of chronic heart disease medication was, in part,
responsible for greater acceptance by patients with lower health literacy. Given that health literacy often
predicts compliance with medication, we can see that even subtle changes to how medication instructions are presented can have a large impact on the well-being of individuals. Smith and Wogalter (2010) found that
behavioral compliance with a warning label can be increased through a combination of five minutes exposure
to the user manual and accompanying warning labels on the product itself. Although the product label does
not present all the warning information, it does serve as a memory cue for information previously learned from
reading the user manual.
Third, warnings must be understood. Here all of the material on comprehension, discussed in the previous
pages, is critical. We have already discussed the use of simple language to improve the readability of the
instructions and the provision of text and pictorial formats to improve comprehension. Morrow and colleagues
also used simple language to improve the readability of chronic heart disease medication instructions, together
with organizing the information in a manner consistent with how patients conceptualize taking medication
(i.e., identify medication, take medication, possible outcomes). The outcome, as we have mentioned before,
was that these instructions were preferred over more standardized instructions by patients with lower medical literacy, especially for learning about adherence information, such as the schedule for taking the medication.
Smith and Wogalter (2010) found that a general warning describing a nonspecific hazard has a relatively
low compliance rate compared to an ANSI-style warning that informs the reader explicitly what to do.
Similarly, Edworthy, Hellier, et al. (2004) found that expressing safety information in probabilistic form (e.g.,
“May be harmful to people without gloves and a respirator mask”) is less effective than instructions in a non-
probabilistic form (e.g., “You must wear gloves and a respirator mask”). Note also that the former is like a status display with a negative, whereas the latter is a simple positive command display. In addition, instructions that use
personal pronouns (e.g., “You should …”) are highly effective. The effective use of readable text, and
integrated pictures, can be important in conveying information regarding the seriousness of the consequences,
which can influence compliance (Zeitlin, 1994), as well as the behavior to avoid, or deal with the hazard.
Similarly, Taylor and Bower (2004) found that product instructions that include an explanation as to why
failure to follow instructions might lead to negative consequences (i.e., process-cause information) can
increase behavioral compliance.
Another key issue is the calibration of the seriousness, which is fairly accurately conveyed by the three
words “danger,” “warning,” and “caution,” each indicating progressively lower risk in a manner that is
generally well understood by the English speaking population (Wogalter & Silver, 1995). Similarly, Munoz,
Chebat and Suissa (2010) found that the level of threat of a warning affected compliance to warnings about
the risks of gambling. Strong warning statements such as “excessive gambling may drive you to intense
distress and suicidal thoughts” are more effective than weaker statements such as “beware of excessive
gambling.” Seriousness can also be redundantly encoded by other properties associated with the signal word,
such as color (red-orange-black-blue-green defines a scale of progressively lower risk), print size (Braun & Silver, 1995), pictures (such as those found next to health warnings on cigarette packets; Kees, Burton, et al.,
2006), or even the source of the information (such as from medical sources; Munoz et al., 2010).
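To make the idea of redundant coding concrete, the following minimal sketch (in Python) pairs each signal word with a color and a print size that decrease together with risk, following the orderings just described. The specific point sizes, and the pairing of particular words with particular colors, are illustrative assumptions rather than values taken from the studies cited above.

# Hypothetical sketch of redundant hazard coding: signal word, color, and
# print size all scale with seriousness. Point sizes and word-color pairings
# are illustrative assumptions, not prescriptions from the cited research.
HAZARD_LEVELS = [
    ("DANGER",  "red",    18),   # highest risk
    ("WARNING", "orange", 16),
    ("CAUTION", "black",  14),   # lowest of the three signal words
]

def format_warning(level, message):
    """Return a display specification that encodes seriousness redundantly."""
    word, color, size = HAZARD_LEVELS[level]
    return {"signal_word": word, "color": color, "point_size": size,
            "text": word + ": " + message}

print(format_warning(0, "High voltage inside. Do not open."))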

Fourth, unfortunately (and sometimes tragically), even a well-understood warning will not guarantee compliance (Zeitlin, 1994), even with professional users (Edworthy, Hellier, et al., 2004). As will be discussed again in Chapter 8, the choice to comply (or the decision to behave in an unsafe manner) can often be analyzed as a decision based upon balancing the risks of not complying with the cost of compliance. This cost of behaving safely may be reflected in terms of time, discomfort, or mental or physical effort, and as this decision process is analyzed in detail in Chapter 8, we will see the critical importance of reducing the cost wherever possible to induce safe behavior.
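This framing can be sketched as a simple expected-cost comparison. The fragment below is only an illustration of that framing, not the formal analysis presented in Chapter 8, and the numerical values are invented.

# Illustrative sketch: comply when the expected cost of ignoring the warning
# exceeds the cost of compliance (time, discomfort, effort). Values invented.
def compliance_is_worthwhile(p_harm, cost_of_harm, cost_of_compliance):
    return p_harm * cost_of_harm > cost_of_compliance

print(compliance_is_worthwhile(0.01, 5000.0, 20.0))  # True: rare but severe harm
print(compliance_is_worthwhile(0.01, 100.0, 20.0))   # False: compliance cost dominates

Note that lowering the cost of compliance (the third argument) is what flips the second case toward safe behavior, which is exactly why designers try to reduce that cost.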
The design of effective product warnings poses a number of challenges for the human factors engineer.
On one hand, too much information provided on product instructions might go unread by consumers. On the
other hand, too little information may not provide consumers with sufficient information to use the product
safely, opening the manufacturer up to the risk of litigation (Taylor & Bower, 2004). In order to design
effective product warnings, the human factors engineer needs to know the hazard, associated aspects of the
situation, warning design principles, and important characteristics of the target audience (Wogalter &
Conzola, 2002).

8. SPEECH PERCEPTION
In 1977 a tragic event occurred at the Tenerife airport in the Canary Islands: A KLM Royal Dutch Airlines
747 jumbo jet, accelerating for takeoff, crashed into a Pan American 747 taxiing on the same runway.
Although poor visibility was partially responsible for the disaster, in which 583 lives were lost, the major
responsibility lay with the confusion between the KLM pilot and air traffic control regarding whether
clearance had been granted for takeoff. Air traffic control, knowing that the Pan Am plane was still on the
runway, was explicit in denying clearance. The KLM pilot misunderstood and, impatient to take off before the
deteriorating weather closed the runway, perceived that clearance had been granted. In the terms described
earlier, the failure of communications was attributed both to less-than-perfect audio transmission resulting
from static and “clipped” messages—poor-quality data or bottom-up processing—and to less-than-adequate
message redundancy, so that context and top-down processing could not compensate. The disaster, described
in more detail in Hawkins (1993) and fully documented by the Spanish Ministry of Transportation and
Communications (1978), calls attention to the critical role of speech communications in engineering
psychology. In engineering psychology applications, we are equally concerned with the recognition of synthesized speech in increasingly sophisticated auditory displays and with speech in team activities, even as the latter application (communications dialogue) is increasingly manifest in human-computer interaction.
Human perception of speech shares some similarities but also a number of pronounced contrasts with the
perception of print, described at the beginning of this chapter. In common with reading, the perception of
speech involves both bottom-up hierarchical processing and top-down contextual processing. Corresponding
to the reading sequence of features to letters to words, the units of speech go from phonemes to syllables to
words. In contrast to reading, on the other hand, the physical units of speech are not so nicely segregated from
one another as are the physical units of print. Instead, the physical speech signal, like the cursive line but in
contrast to print, is continuous, or analog, in format. The perceptual system must undertake some analog to
digital conversion to translate the continuous speech wave form into the discrete units of speech perception.
To understand the way in which these units are formed and their relationship to the physical stimulus, it is
necessary first to understand the representation of speech. We will consider the difference between the time
and frequency representations of continuous analog signals.

8.1 Representation of Speech


Physically, the stimulus of speech is a continuous variation or oscillation of the air pressure reaching the
eardrum, represented schematically in Figure 6.8a. As with any time-varying signal, the speech stimulus can
be analyzed by using the principle of Fourier analysis into a series of separate sine wave components of
different frequencies and amplitudes. Figure 6.8b is the Fourier-analyzed version of the signal in Figure 6.8a.
We may think of the three sinusoidal components in Figure 6.8b as three features of the initial stimulus. A
more economical portrayal of the stimulus is in the spectral representation in Figure 6.8c. Here the frequency
value (number of cycles per second, or Hertz) is shown on the x axis, and the mean amplitude or power
(square of amplitude) of oscillation at that particular frequency is on the y axis. Thus, the raw continuous
wave form of Figure 6.8a is now represented quite economically by only three points in Figure 6.8c.
Because the frequency content of articulated speech does not remain constant but changes very rapidly

and systematically over time, the representation of frequency and amplitude shown in Figure 6.8c must also
include the third dimension of time. This is done in the speech spectrograph, an example of which is shown in
Figure 6.8d. Here the added dimension of time is now on the x axis. Frequency, which was originally on the x
axis of the power spectrum in Figure 6.8c, is now on the y axis, and the third dimension, amplitude, is
represented by the width of the graph. Thus, in the representation of Figure 6.8d one tone starts out at a high
pitch and low intensity and briefly increases in amplitude while it decreases in pitch, reaching a steady-state
level. At the same time a lower-pitched tone increases in both pitch and amplitude to a higher and louder
steady level. In fact, this particular stimulus represents the spectrograph that would be produced by the sound
da. The two separate tones are called formants.

FIGURE 6.8 Different representations of the speech signal: (a) time domain; (b) frequency components; (c) power spectrum; (d) speech spectrograph.
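To make these representations concrete, the following sketch computes a power spectrum and a spectrogram for a synthetic three-component signal. It assumes the numpy and scipy libraries; the component frequencies and amplitudes are arbitrary choices meant only to mimic the schematic of Figure 6.8, not measurements of real speech.

# Illustrative sketch of the Figure 6.8 representations for a synthetic signal.
import numpy as np
from scipy import signal

fs = 8000                                    # sampling rate (Hz)
t = np.arange(0, 1.0, 1.0 / fs)              # one second of signal
wave = (1.00 * np.sin(2 * np.pi * 100 * t)   # three sinusoidal "features,"
        + 0.50 * np.sin(2 * np.pi * 200 * t) # as in Figure 6.8b
        + 0.25 * np.sin(2 * np.pi * 300 * t))

# Power spectrum (Figure 6.8c): frequency on the x axis, power on the y axis.
freqs, power = signal.periodogram(wave, fs=fs)

# Spectrogram (Figure 6.8d): time on x, frequency on y, amplitude as intensity.
f_bins, times, intensity = signal.spectrogram(wave, fs=fs, nperseg=256)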

8.2 Units of Speech Perception


8.2.1 PHONEMES The phoneme, analogous in many respects to the letter unit in reading, represents the basic
unit of speech because changing a phoneme in a word will change its meaning (or change it to a nonword).
Thus, the 38 English phonemes roughly correspond to the letters of the alphabet plus distinctions such as
those between long and short vowels and representations of sounds such as th and sh. The letters s and soft c
(as in ceiling) are mapped into a single phoneme. Although the phoneme in the linguistic analysis of speech is
quite analogous to the printed letter, there is a sense in which it is quite different from the letter in its actual
perception. This is because the physical form of a phoneme is highly dependent on the context in which it
appears (the invariance problem). The speech spectrograph of the phoneme k as in kid is quite different from
that of k as in lick (whereas visually the letter k has the same physical form in both words). Also, the physical
spectrograph of a consonant phoneme differs according to the vowel that follows it.
8.2.2 SYLLABLES Two or more phonemes generally combine to create the syllable as the basic unit of speech
perception. This definition is in keeping with the notion that although a following vowel (V) seems to define
the physical form of the preceding consonant (C), the syllabic unit (CV) is itself relatively invariant in its
physical form. The syllable in fact is the smallest unit with such invariance (Huggins, 1964); something that
people are particularly dependent on in speech perception (Neisser, 1967).
8.2.3 WORDS Although the word is the smallest cognitive or semantic unit of meaning, like the phoneme it
shows a definite lack of correspondence with the physical speech sound. This lack of correspondence defines
the segmentation problem (Neisser, 1967). In a speech spectrograph of continuous speech, there are
identifiable breaks or gaps in the continuous record. However, these physical gaps show relatively little
correspondence with the subjective pauses at word boundaries that we seem to hear. For example, the
spectrograph of the four-word phrase “She uses st⋆and⋆ard oil” would show the two physical pauses marked
by ⋆, neither one corresponding to the three word-boundary gaps that are heard subjectively. The

segmentation issue then highlights another difficulty encountered by automatic speech-recognition systems
that function with purely bottom-up processing. If speech is continuous, it is virtually impossible for the
recognition system to know the boundaries that separate the words in order to perform the semantic analysis
without knowing what the words are already.

8.3 Top-Down Processing of Speech


The description presented so far has emphasized the bottom-up analysis of speech. However, top-down
processing in speech recognition is just as essential as it is in reading, as recent neuroscientific evidence
suggests (Eulitz and Hannemann, 2010). The two features that contrast speech perception with reading
discussed above—the invariance problem and the segmentation problem— make it difficult to analyze the
meaning of a physical unit of speech (bottom-up) without having some prior hypothesis concerning what that
unit is likely to be. To make matters more difficult, the serial and transient nature of the auditory message
prevents a more detailed and leisurely bottom-up processing of the physical stimulus. That is, one cannot re-
evaluate previous spoken words as easily as one can glance back to an earlier portion of text. This restriction
therefore forces a great reliance on top-down processing.
Demonstrations of top-down or context-dependent processing in speech perception are quite robust. In
one experiment, Miller and Isard (1963) compared recognition of degraded word strings between random
word lists (“loses poetry spots total wasted”), lists that provided context by virtue only of their syntactic
(grammatical) structure but had no semantic content (“sloppy poetry leaves nuclear minutes”), and full
semantic and syntactic context (“A witness signed the official document”). The three kinds of lists were
presented under varying levels of masking noise. Miller and Isard’s data suggested the same trade-off between
signal quality and top-down context that was observed by Tulving, Mandler, and Baumal (1964) in the
recognition of print. Less context, resulting from the loss of either grammatical or semantic constraints,
required greater signal strength to achieve equal performance (Zekveld, Heslenfelda, et al., 2006).
Older adults, who often have more difficulty listening in challenging environments, can overcome these
difficulties by deploying compensatory top-down cognitive processing; for example, using knowledge about
the context within which the communication takes place. These results suggest that we shift between
automatic processing of speech to more effortful controlled processing when the listening conditions or task
demands become sufficiently challenging (Pichora-Fuller, 2008). The fact that bilinguals are better able to
perceive speech in the presence of noise in their native language compared to their non-native language,
suggests a specific contribution of top-down semantic processing to native language processing only
(Golestani, Rosen and Scott, 2009).
It is apparent that the perception of speech proceeds in a manner similar to the perception of print,
through a highly complex, iterative interplay between higher-level linguistic knowledge and bottom-up
perceptual processes, such as perceptual grouping, lexical segmentation, perceptual learning, and categorical
perception (Davis and Johnsrude, 2007). While lower-level analyzers at the acoustic-feature and syllable level
progress in a bottom-up fashion, the context provided at the semantic and syntactic levels generates
hypotheses concerning what a particular speech sound should be. The subjective gaps that are heard between
word boundaries of continuous speech also give evidence for the dominant role of knowledge-driven top-down processing. Since such gaps are not present in the physical stimulus, they must result from the top-down processes
that decide when each word ends and the next begins. Interestingly, the influence of top-down processing on
speech recognition seems to be less apparent when listening to synthetic speech. Whereas context improves
the accuracy of word identification for everyone in natural speech, Roring, Hines and Charness (2007) found
that providing context with synthetic speech does not improve performance for older adults, to the same
extent found with younger and middle-aged adults. An important implication of these findings is that the
fidelity of synthetic speech must be improved to a point similar to natural speech before it can become truly
useful for older adults. In the meantime, systems that need to use synthetic speech should avoid presenting
words in isolation, and provide a rich context for critical words or phrases whenever possible.

8.4 Applications of Voice Recognition Research


Research and theory of speech perception have contributed to two major categories of applications. First,
understanding of how humans perceive speech and employ context-driven top-down processing in recognition
has aided efforts to design speech-recognition systems that perform the same task (Scharenborg, 2007). Such
systems are becoming increasingly desirable for conveying responses when the hands might be busy and

unavailable (such as in the case of hand-held devices), or when visual feedback is not available to guide a
manual response. They have the potential to replace keyboard typing, as we discuss in Chapters 9 and 10, or
even as a speech therapy tool (Hailpern, Karahalios, et al., 2009).
The second major contribution has been to measure and predict the effects on speech comprehension of
various kinds of distortion, which was a source of the Tenerife disaster. Such distortion may be extrinsic to
the speech signal—for example, in a noisy environment like an industrial plant. Alternatively, the distortion
may be intrinsic to the speech signal when the acoustic wave form is transformed in some fashion, either
when synthesized speech is used in computer-generated auditory displays or when a communication channel
for human speech is distorted. The following will describe how the disruptive effects of speech distortion are
represented and will identify some possible corrective techniques.
As discussed earlier, natural speech is conveyed by the differing amplitudes of the various phonemes
distributed across a wide range of frequencies. Thus, it is possible to construct a spectrum of the distribution
of power at different frequencies generated by “typical” speech. The effects of noise on speech
comprehension will clearly depend on the spectrum of the noise involved. A noise that has frequencies
identical to the speech spectrum will disrupt understanding more than a noise that has considerably greater
power but occupies a narrower frequency range than speech. Engineers are often interested in predicting the
effects of background noise on speech understanding. The articulation index (Kryter, 1972) accomplishes this
objective by dividing the speech frequency range into bands and computing the ratio of speech power to noise
power within each band. These ratios are then weighted according to the relative contribution of a given
frequency band to speech, and the weighted ratios are summed to provide the articulation index (AI).
However, hearing is not the same as comprehension. From our discussions of bottom-up and top-down
processing it is apparent that the AI provides a measure of only bottom-up stimulus quality. A given AI may
produce varying levels of comprehension, depending on the information content or redundancy available in
the material and the degree of top-down processing used by the listener.
To accommodate these factors, measures of speech intelligibility are derived by delivering vocal
material of a particular level of redundancy over the speech channel in question and computing the percentage
of words understood correctly. The speech intelligibility index (SII) is computed by dividing the spectrum
into 20 bands contributing equally to intelligibility and estimating the weighted average of the signal-to-noise
ratio in each band (ANSI, 1997). Naturally, for a given signal-to-noise ratio (defining signal quality and
therefore the articulation index) the intelligibility will vary as a function of the redundancy or information
content of the stimulus material. A restricted vocabulary produces greater intelligibility than an unrestricted
one; words produce greater intelligibility than nonsense syllables; high-frequency words produce greater
intelligibility than low-frequency words; and sentence context provides greater intelligibility than no context
(for recent advances in the measurement of speech intelligibility in noise, see Ma, Hu, & Loizou, 2009). Some
of these effects on speech understanding are shown in Figure 6.9, which presents data analogous to those
concerning print.

FIGURE 6.9 The relationship between the articulation index (AI) and the intelligibility of various speech test materials made up of phonetically
balanced (PB) words and sentences. Source: K. D. Kryter, “Speech Communications,” in Human Engineering Guide to System Design, ed. H. P. Van Cott and R. G. Kinkade (Washington, DC: U.S. Government Printing Office).
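The band-by-band logic shared by the AI and the SII can be sketched in a few lines. The fragment below is purely illustrative: the number of bands, the speech and noise levels, and the weights are invented placeholders, not the band definitions or importance weights specified by Kryter (1972) or by the ANSI (1997) SII standard.

# Illustrative sketch of an articulation-index-style computation: per-band
# speech-to-noise ratios are clipped to a usable range, weighted by the
# band's assumed importance to speech, and summed. (The SII variant described
# above would use 20 bands chosen to contribute equally.)
import numpy as np

def band_snr_index(speech_db, noise_db, weights, snr_range=30.0):
    snr = np.clip(np.asarray(speech_db, float) - np.asarray(noise_db, float),
                  0.0, snr_range)              # usable signal-to-noise per band (dB)
    w = np.asarray(weights, float)
    w = w / w.sum()                            # relative contribution of each band
    return float(np.sum(w * snr / snr_range))  # 0 = unintelligible, 1 = clear channel

speech = [62, 65, 60, 55]   # hypothetical speech power per band (dB)
noise  = [50, 58, 59, 54]   # hypothetical noise power per band (dB)
print(band_snr_index(speech, noise, weights=[1, 2, 2, 1]))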

As we have already discussed, it is important to realize that limitations in signal quality can be
compensated for by augmenting top-down processing—creating a context which affords the ability to “guess”
the message without actually (or completely) hearing it. In noisy environments this may be accomplished by
restricting the message set size (i.e., by using standardized vocabulary) or by providing redundant “carrier” sentences to convey a particular message. The latter procedure is analogous to the use of the redundant
carrier syllables of the communications-code alphabet (alpha, bravo, charlie, etc.) to convey information
concerning a single alphabetic character (a, b, c). A high level of redundancy in the message from air traffic
control to the KLM pilot would probably have stopped the premature takeoff and so averted the disaster.
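A tiny sketch of such redundant carrier coding appears below; the dictionary covers only a few letters, using the carrier words named in the text, and is meant only as an illustration of the principle.

# Sketch of redundant "carrier" coding: each single character is conveyed by a
# longer, more discriminable word, so context can compensate for noise.
CARRIER = {"a": "alpha", "b": "bravo", "c": "charlie", "d": "delta", "e": "echo"}

def encode_callsign(chars):
    return " ".join(CARRIER.get(c.lower(), c) for c in chars)

print(encode_callsign("ace"))   # -> "alpha charlie echo"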

8.5 Communications
Intuition as well as formal experiments tell us that there is more to communications than simply
understanding the words and sentences in speech. For example, characteristics of the speech itself, such as
frequency, repetition, and rate can affect the perceived urgency of a spoken message; a phenomenon that has
been exploited in the design of speech warnings (Hellier, Edworthy, et al., 2002). Being able to see the
speaker face to face also greatly improves communications, particularly when signal quality is low (Olson,
Olson, and Meader, 1995).
8.5.1 NONVERBAL COMMUNICATIONS There are four possible causes of differences between the two modes of
verbal interaction (face-to-face communications and voice-only communications). All of these causes can
influence the efficiency of information exchange.
1. Visualizing the mouth. There is little doubt that being able to see a speaker’s mouth move and form
words is a useful redundant cue—particularly one that can fill in the gaps when voice quality is low.
This skill of lipreading is often of critical importance to the hearing impaired, but it is also important
to understand our own speech perception, especially given the growing use of avatars in computer-
mediated communications. Gong and Nass (2007) examined the effects of human voices being paired
with computer-generated humanoid faces. A mismatched pairing of a human voice with a humanoid
face (and vice versa) leads to more negative attitudes, reduces trust, and causes longer processing time.
2. Nonverbal cues. Being able to see the speaker allows an added range of information conveyance—
pointing and gesturing as well as facial cues such as the puzzled look or the nod of acknowledgment
that cannot be seen over a conventional auditory channel (e.g., a telephone line). With the emergence
of internet-based, 3D virtual environments and avatars to represent users, there has been renewed

interest in the role of nonverbal cues in discourse between users within these environments.
Antonijevic (2008) examined the role of nonverbal communication in the Second Life virtual
environment—particularly those relating to gestures, postures, and facial expressions—and found that
such cues enhance interactions between users within the virtual world.
3. Disambiguity. The availability of extra nonverbal cues may resolve ambiguous messages by allowing
the speaker to follow up on a puzzled look or other cues suggesting that the listener may have
misinterpreted the message. Nonverbal cues and disambiguity appear to combine in allowing face-to-face conversation to be more flexible and less formal. This difference is reflected in the greater frequency of formal “turn taking” with audio-only dialogues, as well as a greater overall number of words spoken (Boyle, Anderson, et al., 1994; Olson, Olson, et al., 1995).
4. Shared knowledge of action. In coordinated team performance, such as that typifying the flight crew of
an aircraft on a landing approach, a great amount of information is exchanged and shared simply by
seeing the actions that a team member has taken (or failed to take), even if this information is totally
unrelated to the contents of oral communications (Segal, 1995). For example, the copilot, seeing that
the pilot has turned on the autopilot, will be likely to adopt a different mental set as a consequence.
The shared knowledge gained by knowing where each member is looking, reaching, and switching
potentially contributes a great deal to the smooth functioning of a team (Shaffer, Hendy, & White,
1988). We will touch upon this again later in the chapter when we discuss training designed
specifically to support this kind of shared awareness.
To the extent that this shared knowledge facilitates communications, changes in the physical
configuration of the workspace can affect team performance. For example, the repositioning of flight controls
from their position in front of the pilot to the side (the so-called side-stick controller used on some modern
aircraft) reduces the amount of shared knowledge about control activity between the pilot and copilot since the
control activity of one can no longer be easily seen by the other (Segal, 1995). Conversely, the central and
shared location of the engine thrust levers in the cockpit allows both pilots to develop and share their
understanding of which pilot has control (and when) of the thrust levers using, in part, actual physical contact
with the levers (Nevile, 2002). The advances of modern technology, in which spatially distributed dials and
keys may be replaced by centralized displays and keyboards, may also inhibit the shared knowledge of action
by reducing both the amount of head and hand movement that can be seen by the coworker (Wiener, 1989).
8.5.2 VIDEO MEDIATED COMMUNICATIONS The advantages of face-to-face over auditory-only communications suggest the benefits that could be achieved by allowing video to accompany the voice. Wheatley and Basapur (2009) compared user experiences of face-to-face
communication, television-based video calling (which shows the head to waist) and computer-based webcam
(which shows only the head to shoulders). Experience of the television-based video calling was judged to be
very similar to face-to-face communication; however, subjects’ experience using the webcam was judged to
be significantly worse. The wider head-to-waist view enables greater non-verbal expression, giving a rich
communication experience; a finding also replicated by Nguyen and Canny (2009), who found that video-
based systems that preserve both gaze and upper-body cues are as effective as face-to-face meetings.
Recent advances in networking and telecommunications have led to a proliferation of teams that do not work face-to-face, but instead interact using computer-mediated communication. Credé and Sniezek (2003) found that computer-mediated group decision making was similar to face-to-face communication in terms of decision quality, group confidence, and group members’ individual commitment to the group decision, although groups meeting face-to-face expressed more confidence in the group decision. Nguyen and Canny (2007) found that
videoconferencing systems that do not adequately represent the spatial seating arrangements of team members
negatively affect trust formation in the team. Video mediated communication is also likely to affect the status
structure of the team by blocking the transmission of status information (Driskell, Radkte, & Salas, 2003).
Despite the close approximation to face-to-face communication, there are other factors that need to be
taken into account by the human factors engineer when implementing video mediated communication for
remote workers. Remote working can cause professional isolation which in turn can have a negative effect on
job performance (Golden, Viega, & Dino, 2008), and have detrimental effects on relationships with co-
workers (but interestingly not with supervisors) (Gajendren & Harrison, 2007).

8.6 Crew Resource Management and Team Situation Awareness


In the 1970s, a series of major airline accidents occurred that could be attributed directly to a breakdown in
communications (Foushee, 1984). Indeed, we saw one such example with the tragic collision of the two jumbo

jets in the Canary Islands. Another example occurred when a co-pilot, noticing that the plane was running out of fuel, failed to speak up to a dominant captain. The co-pilot’s lack of assertiveness contributed to a situation in which eventual fuel exhaustion led to the crash.
At that time, and thanks to the input of psychologists, the commercial aviation community began to realize that insufficient attention had been given to these breakdowns in team social and communication behavior (Foushee, 1984; Helmreich & Merritt, 1998) and adopted a concept called crew resource management, or CRM, a major component of which emphasized the non-vocabulary aspects of communications. These include reducing the “authority gradient” in which junior members are unlikely to speak up to dominant senior members, even when the former know something is wrong. They also include an emphasis on feedback and the avoidance of ambiguity, strong influences on the efficiency of system performance and sources of the problems that led to the Canary Islands disaster.
The programs are seen by many as an invaluable countermeasure to the inevitable occurrence of human
communications breakdowns within the cockpit. CRM broadly means management of the team’s resources,
including of course those of individuals, but has a larger focus on the emergent behavior (beyond the
individual) of the team. CRM courses place emphasis on training non-technical skills, such as communication,
briefing, backup behavior, mutual performance monitoring, team leadership, decision making, task-related
assertiveness, and team adaptability.
In terms of team communications, Helmreich and colleagues (Foushee & Helmreich, 1988; Sexton &
Helmreich, 2000) found that aircrew that communicated using fewer and shorter words and language in the
first person plural (“we,” “our,” and “us”) had improved communications (i.e., more efficient communication
and fewer errors), while aircrew that used larger words (more than six letters long) showed degraded
communications. In addition, aircrew that communicated in a more assertive and less tentative manner
(irrespective of the experience or rank of the team members) were more effective.
Training programs in CRM have demonstrated success, indicating the important contribution of human
factors to system safety. The impact of these CRM programs on flight safety has been well documented
(Diehl, 1991). A recent meta-analysis of CRM training effectiveness demonstrates the positive impact of CRM courses in terms of subjects’ knowledge, and especially their attitudes and behaviors (O’Connor, Campbell, et al., 2008), despite some difficulties with institutionalizing and evaluating the effectiveness of the course (Salas, Wilson, et al., 2006). An operational example of the successful application of crew resource management principles is provided by the analysis of US Airways Flight 1549, which hit geese shortly after takeoff from LaGuardia Airport, causing both engines to lose power. Without engine power, the crew decided that an emergency landing in the Hudson River was necessary. Due to expert crew performance, all 155 people aboard survived the flight. Analysis of the incident demonstrated that non-technical skills related to CRM, ingrained through aviation training, may have been equally (if not more) important to the successful outcome (Eisen & Savel, 2009).
Given their successes in aviation, it is perhaps not surprising to see attempts to transition lessons learnt
from aviation-oriented CRM programs to other domains, such as intensive care physicians (Eisen & Savel,
2009), anesthetists (Flin, Fletcher, et al., 2003), surgeons (Helmreich, 2000), nuclear control centers
(Harrington & Kello, 1991), and the off-shore oil industry (Flin, 1997). In particular, in the operating room
the relationship between surgeon and nurse, with a strong authority gradient running from the former to the
latter, often parallels the traditional relationship in the 1960s cockpit between the senior pilot and the junior
co-pilot, with the latter often afraid to speak up upon noticing a mistake by the former.
In addition to communications, another important component of current concepts of CRM is the concept
of shared situation awareness (Salas, Wilson, et al., 2006). Shared, or team situation awareness (TSA), has
received a great deal of research interest in the last decade. We will discuss the concept of situation
awareness, and the cognitive factors that determine how we acquire and maintain it, in Chapter 7. However,
we will briefly touch upon TSA here given its recent evolution to improve our understanding of teamwork and
team training.
TSA relies on cognitive processes, such as perception, comprehension and projection, and additional and
unique activities such as communication and coordination to support the shared understanding of a situation
among team members (Endsley, 1995). This shared understanding allows the team to perceive changes to the
structure of the team environment, which in turn allows the team to identify and exploit new opportunities for
behavior. In other words, TSA allows the team to dynamically self-organize when confronted by
changes to its environment or the team itself, or by discovering new and better ways of working (Cooke &
Gorman, 2006).

TSA is more than just the sum of each team member’s individual SA (Gorman, Cooke, & Winner, 2006);
additional team-related processes are required to acquire and maintain it (e.g., coordination, information
sharing, and cross-checking information). These team-related processes are especially important for team
performance when teams are not co-located (Garbis & Artman, 2004) and are not sufficiently supported by
shared tools (Bolstad & Endsley, 2000).
As we discussed at the start of this section, social and organizational factors can also influence TSA (Endsley & Jones, 2001); these are factors that CRM programs attempt to take into account. Like training for
CRM communications, there is also solid evidence that training for TSA can be successful. For example, a
European consortium comprising several airline and research organizations developed a comprehensive
training solution for TSA and threat management, which went through a full-scale simulator evaluation
program (Hörmann, Banbury, et al., 2004). The study demonstrated the effectiveness of TSA and threat management training methods on flight crew performance, particularly in terms of positive impact on threat avoidance, briefings (sharing SA), and distraction management during approach and landing phases. The
training also instructed aircrews to become vigilant for losses of SA, both one’s own and by others, and to act
on that knowledge. Cues for loss of SA include confusion or uncertainty not being resolved, fixation on a
single task, or dwelling on past events (ESSAI, 2001). Once again, we see the importance of nonverbal cues
in effective communication.
In summary, the research on communications suggests clearly that the performance of the whole multi-
operator team is greater than the sum of the parts. This conclusion comes as no surprise to those who have
seen a sports team with a collection of superstars fail to meet its expectations because of poor teamwork. The
data reemphasize one theme introduced in Chapter 1: The design of effective systems for information display
and control with the single operator is a necessary but not sufficient condition for effective human
performance.

9. TRANSITION: PERCEPTION AND MEMORY


Our discussion in the previous chapters has been presented under the categories of spatial and verbal
processes in perception. Yet it is quite difficult to divorce these processes from those related to memory.
There are five reasons for this close association:
1. Perceptual categorizations, as we saw, were guided by expectancy as manifest in top-down processing.
Expectancy was based on both recent experience—the active contents of working memory—and the
contents of permanent or long-term memory. Indeed the rules for perceptual categorization themselves
are formed only after repeated exposure to a stimulus. These exposures must be remembered to form
the categories.
2. In many tasks when perception is not automatic, such as those related to navigation and
comprehension, perceptual categorization must operate hand in hand with activities in working
memory.
3. The dichotomy that distinguished codes of perceiving into spatial and verbal categories has a direct
analog in terms of two codes of working memory.
4. The distinction between one-time instruction (for example, on how to activate a piece of equipment using an instruction manual) and long-term learning of this process is fuzzy; similar variables influence working memory, whether for retention while the procedures are being carried out (in the first case) or for learning, and thus affect the two processes in a similar manner.
5. Perception, comprehension, and understanding are necessary precursors for new information to be
permanently stored in long-term memory—the issue of learning and training.
In the following chapter, we discuss these topics of memory and learning in detail.

Key Terms
articulation index 190
bottom-up processing 162
cognitive load 181
cost of compliance 186

crew resource management 194
data-driven processing 162
formants 188
invariance problem 188
phonetics 165
readability formulas 177
segmentation problem 189
spatial contiguity 182
speech intelligibility 191
speech intelligibility index (SII) 191
team situation awareness 195
temporal contiguity principle 183
top-down 162
word superiority effect 163
Zipf’s law 170

7 MEMORY AND TRAINING

1. OVERVIEW
Failures of memory often plague us. These may be as simple and trivial as forgetting a phone number we have
just looked up or as involved as forgetting the procedures to run a word-processing application. Operators
may forget to perform a critical item in a checklist (Degani & Wiener, 1990), or an air traffic controller may
forget a “temporary” command issued to a pilot (Danaher, 1980). In 1915, a railroad switchman at the
Quintinshill Station in Scotland forgot that he had moved a train to an active track, thereby permitting two
oncoming trains to use the same track. In the resulting crash over 200 people were killed (Rolt, 1978; Reason,
2008). In 1996, a ramp agent forgot to check the contents of cargo boxes for a ValuJet DC-9. The boxes
contained uncapped, full oxygen generators (mechanics had forgotten to put safety caps on them). One of the
generators engaged while the DC-9 was in flight, causing a fire that sent the airplane into the Everglades,
killing more than 100 people (Langewiesche, 1998).
When we use a computer system to access information, we may find that information we need while
inputting information on one screen can only be found on another. Thus, we have to hold information in
memory while we switch between screens, introducing the possibility of error. Even as we gaze forward in the
car, we may forget that we saw a car in the adjoining lane the last time we glanced at the mirror, and we pull over
directly in front of it.
Clearly, then, the success or failure of human memory can have a major impact on the usefulness and
safety of a system. As noted in Chapter 1, memory may be thought of as the store of information. In this
chapter we will focus on two different storage systems with different durations: working memory and long-
term memory. Working memory is the temporary, attention-demanding store that we use to retain new
information (like a new phone number) until we use it (dial it). We also use working memory as a kind of a
“workbench” of consciousness where we examine, evaluate, transform, and compare different mental
representations. We might use working memory, for example, to carry out mental arithmetic or a mental
simulation of what will happen if we schedule jobs in one way instead of in another. Finally, working memory
is also used to hold new information until we can give it a more permanent status in memory; that is, until we
encode it into long-term memory. Long-term memory thus is our storehouse of facts about the world and
about how to do things.
Both of these levels of memory may be thought of in the context of a three-stage representation, shown
in Figure 7.1. The first stage, encoding, describes the process of putting things into the memory system.
Encoding can take two forms shown in the diagram: encoding into working memory, or transferring
information from working memory into long-term memory. We use the terms learning or training to refer to
this latter transfer of information. Learning describes the various ways in which the transfer can occur,
whereas training refers to explicit and intentional techniques used by designers and teachers to maximize the
efficiency of learning. Our concern will be primarily with training.

FIGURE 7.1 A representation of memory functions.

Storage, the second stage, refers to the way in which information is held or represented in the two
memory systems. The terms that we use to describe it are different for working memory, in which we
emphasize spatial versus verbal codes, than it is for long-term memory, in which we emphasize declarative
and procedural knowledge, episodes, and mental models. Storage is also characterized by the length of store,
before retrieval takes place, and by cognitive activity that takes place during storage.
The third stage, retrieval, refers to our ability to get things successfully out of memory. Here we contrast
successful retrieval with the various causes of retrieval failure, or forgetting. Sometimes material simply
cannot be retrieved. At other times it is retrieved incorrectly, as when we mix up the steps in a memorized
procedure.
In this chapter, we will first describe the properties of working memory, its spatial and verbal
representations, and its limited capacity. We shall then discuss the concept of chunking and how it helps deal
with working memory’s limited capacity. Chunking is tied to expertise in a domain, which leads naturally to a
discussion of expertise. We will discuss both how expertise interacts with working memory to produce what is
called skilled memory, and how working memory is involved in situation awareness, planning, and problem-solving tasks. Finally, we will describe long-term memory, focusing heavily on the issue of
encoding through a discussion of training. Particular emphasis will be given to the transfer of training—how
the skills and knowledge acquired in one domain are transferred to another. We will then discuss a number of
different ways in which knowledge representation in long-term memory has been described, and conclude
with a discussion of retrieval and forgetting from long-term memory.

2. WORKING MEMORY
Working memory is typically defined as having three core components, or subsystems (Baddeley, 1986,
1995). The phonological store represents information in linguistic form, typically as words and sounds. The
information can be rehearsed by articulating those words and sounds, either vocally or subvocally, using a
phonological loop. In contrast, the visuo-spatial sketch pad represents information in an analog, spatial
form, often typical of visual images (Logie, 1995). Each of these components stores information in a
particular form, or code. Use of the spatial, dynamic displays discussed in Chapters 4 and 5 would typically
involve activity in the visuo-spatial sketch pad; in contrast, much of the processing of language, the topic of
Chapter 6, would involve the phonological store.
The third component of Baddeley’s model is the central executive, which is used to control working
memory activity, assign attentional resources to the other subsystems and resist distractions. The topic of
executive control in selecting responses and time sharing will also be discussed in Chapter 9 and Chapter 11.
More recently, Baddeley and colleagues have supplemented this model with a fourth component—the
episodic buffer (Baddeley, 2007). This component provides a temporary, passive store in which the various
components of working memory can interact both with each other (e.g., the binding of different perceptual
features to form one perceptual object, scene or episode; see Karlsen, Allen, et al., 2010), and with
information from perception and long-term memory (Baddeley, Hitch, et al., 2009). The buffer is accessible
through conscious awareness.
The research of Baddeley and colleagues (Baddeley, 1986, 1995, 2007; Baddeley & Hitch, 1974; see also
Logie, 1995, 2011) has contributed substantially to the understanding of this dichotomy, in terms of both the
kind of material that is manipulated within working memory (spatial-visual or verbal-phonetic), and the
separate processing resources used by each. Generally speaking, we seem to have two forms of working
memory. Each is used to process or retain qualitatively different kinds of information (spatial and visual
versus temporal, verbal, and phonetic).
A number of tasks thought to measure the capacity of working memory—reading span, operation span,
and counting span—have been found to predict performance on a number of real-world tasks such as reading
and listening comprehension, academic performance, multi-tasking, language comprehension, ability to
follow directions, vocabulary learning, note taking, writing, reasoning, learning to write computer programs
and making complex aviation decisions (Miyake, Friedman, et al., 2000; Engle, 2001; Kane & Engle, 2002;
Logie, 2011; Causse, Dehaise, & Pastor, 2011). Working memory span has also been found to decrease with
age. Taylor et al. (2005) found that older pilots were less accurate at remembering and executing air traffic
messages due to an age-associated decrease in their working memory span.
Working memory also plays a role in moral control. Moore, Clark, and Kane (2008) asked participants to
judge how morally acceptable it would be for them to kill one person in order to save others. They

manipulated the judgments in terms of the personal or impersonal nature of inflicted harm, the benefit to the
agent, and the inevitability of victims’ deaths. The results showed that participants with higher working
memory capacity were more likely to condone killing only when the victim’s death was inevitable. Moore
and colleagues argued that this effect demonstrates that working memory capacity is part of a larger
selectively engaged and voluntary reasoning system.
Working memory is therefore thought to reflect a basic attentional control capability that is critical to a
wide range of cognitive tasks (Kane, Bleckley, et al., 2001). In particular, the central executive (or controlled
attention) component of the working memory system is not really about storage per se, but more about the
capacity for controlled, sustained attention in the face of interference and distraction (Engle, 2002). For
example, McVay and Kane (2009) found that the propensity for our mind to wander and neglect the task at
hand is negatively related to our working memory capacity. They argue that these task-neglect failures stem,
in part, from momentary failures of attentional control.
The practical implications of the distinction drawn between the different working memory components
are provided primarily by three different phenomena: (1) the sketch pad and phonological store appear to be
independent from one another and are therefore susceptible to interference from different sorts of concurrent
activities, which has implications for the design of tasks performed simultaneously; (2) the control and
management activities of the central executive are also susceptible to interference, which has implications for
concurrent task performance; and (3) the relationship of codes to display modalities has implications for
auditory versus visual displays and verbal versus spatial displays. We discuss each of these implications in
turn.

2.1 Working Memory Interference

2.1.1 CODE INTERFERENCE The verbal-phonetic and visual-spatial codes of working memory appear to function
more cooperatively than competitively. Posner (1978), for example, has argued that both may be activated in
parallel by certain kinds of material (e.g., pictures of common objects). For example, Johannsdottir and
Herdman (2010) found that both working memory subsystems play an important role in remembering the
location of surrounding traffic; specifically, visuo-spatial codes are used to encode highway traffic located in
the forward view, whereas phonological codes are used to encode traffic located in the rear view (to maintain
information about symbols and objects that are not continuously in view; Baddeley, Chincotta, & Adlam,
2001). One implication of this cooperation is that the two codes do not compete for the same limited
processing resources or attention. That is, if two tasks employ different working memory codes, they will be
time-shared more efficiently than if they share a common code, a theme to be discussed in more detail in
Chapter 10.
The general findings in the literature, to be summarized more in Chapter 10, are that the verbal/sequential
characteristics of verbal working memory are more disrupted by concurrent verbal tasks than by concurrent
spatial tasks (e.g., Vergauwe, Barrouillet, & Camos, 2010), and that spatial working memory is more
disrupted by concurrent spatial than verbal tasks. Furthermore, even irrelevant environmental inputs have this
differential disrupting influence. Consider background music in the workplace, for example. Both Salamé and Baddeley (1989) and Martin, Wogalter, and Forlano (1988) found that music with lyrics (words) disrupted verbal working memory tasks, while similar music without lyrics did not.
As discussed in Chapter 3, Tremblay and Jones (2001) found that speech, even when it was irrelevant (to
be ignored), was particularly disruptive of the processing of sequential information. Verbal tasks (either visual
or auditory-based) were disrupted by irrelevant speech, but so were visual-spatial tasks. Similar to our
discussion of auditory intrusions on focused attention in Chapter 3, it seems that activities that require the
order of items be maintained in working memory (as well as the items themselves) are particularly susceptible
to interference by concurrent activities, even if we try to ignore them and even if they access working memory
through different modalities.
Thus the implication is that the working memory demands of the task should be carefully analyzed and,
where possible, both irrelevant environmental information (e.g., sounds, distracting visuals) and relevant
concurrent tasks (e.g., spatial driving or verbal linguistic speech) that will amplify code interference should be
minimized.

2.1.2 INTERFERENCE IN THE CENTRAL EXECUTIVE While the two subsystems (the visual-spatial sketch pad and the
phonological loop) are both susceptible to code, or resource-specific interference, the central executive is

more disrupted by concurrent task activities of higher general demands; tasks that are performed using
controlled, rather than more automated processes (Baddeley, 1996; see Chapter 10). Baddeley has proposed
that a pure central executive task is a random generation task (e.g., the subject types a random sequence of
letters). Even after a lot of practice this task demands attention; Baddeley has shown that random generation is
interfered with by a category generation task such as producing as many items as possible from a particular
semantic category (e.g., animals or fruit). However, the random generation task is not interfered with by
articulatory suppression (e.g., counting repeatedly from 1 to 6), presumably because that task can be
performed in verbal working memory (in particular, the phonological loop).
In terms of the visuo-spatial sketch pad, Bruyer and Scailquin (1998) found that the random generation
task (requiring central executive resources) interfered with subjects’ ability to perform a mental rotation task
(a central executive operation on the contents of the visuo-spatial sketch pad), but did not interfere with a task
involving the passive maintenance of an image (pure visual-spatial sketch pad). Finally, we note here that
functionally, in most real-world tasks, use of either the phonological loop or the visuo-spatial sketch pad is always
coupled with the central executive. Hence below, we will simply refer to verbal or spatial working memory,
assuming the resource-demanding contribution of the central executive to each.

2.2 Working Memory, the Central Executive, and Executive Control


Working memory thus consists of the four components—the two subsystems, the episodic buffer, and the
central executive. Baddeley has proposed four roles for the central executive: (1) to temporarily hold and
manipulate information stored in long-term memory; (2) to change retrieval strategies from long-term
memory; (3) to coordinate performance on multiple tasks; and (4) to attend selectively to stimuli. The first
two of these are directly related to memory. The third can be seen as indirectly related as, for example when
doing mental multiplication of two digit numbers, one must hold sub-sums in working memory while also
performing multiplication operations. We will see more about the linkage between multi-tasking, attention
control and task management in Chapters 10 and 11. The fourth involves a form of attention control,
discussed in Chapter 3. All four of these directly or indirectly are related to working memory capacity on
tasks when material must be retained, while other effort-demanding cognitive operations are ongoing; that is,
cognitive operations demanding controlled processing, rather than automatic processing.
The roles of the central executive as one of the components of working memory, and of executive
control (Banich, 2009) are closely related, but not identical. First, executive control functions are clearly
associated with specific brain areas, particularly in the prefrontal cortex (Banich, 2009) while the central
executive is less well specified anatomically. Second, executive control functions are not as tightly linked to
working memory functions as is the central executive. For example, executive control may be involved in
sequential task switching (Miyake, Friedman, et al., 2000, see also Chapters 9 and 10) involving no working
memory, as well as focused attention tasks such as the Stroop task (Chapter 3), where better developed
executive control is better equipped to suppress Stroop interference or other distractions. In particular, it is
noteworthy that Miyake, Friedman et al. found no relation between individual differences in task switching
and inhibiting a dominant response (as in Stroop), and individual differences in working memory capacity.
Still, despite the distinctions, there are many commonalities between the two concepts of the central
executive (as the “commander of working memory”) and executive control, and certainly both have been
found to operate in complex tasks outside the laboratory.

2.3 Matching Display with Working Memory Code


In Chapter 4 we discussed the general issues of display compatibility. Wickens, Sandry, and Vidulich (1983)
have described the principle of stimulus/central-processing/response compatibility that prescribes the best
association of display formats to the codes of working memory used by a task. In this S-C-R compatibility
principle S (stimulus) refers to display modality (auditory and visual), C (central processing) to the two
possible central-processing codes (verbal and spatial), and R to the two possible response modalities (manual
and vocal). In this section, we will discuss the optimum matching between stimulus (display) and central
processing or cognitive codes. The compatibility between stimuli and response (S-R compatibility) will be
dealt with in Chapter 9.

FIGURE 7.2 Optimum assignment of display format to working memory code.

Figure 7.2 shows four different formats for information display as defined by code (verbal, spatial) and
modality (visual, auditory). Experimental data suggest that the assignment of formats to memory codes should
not be arbitrary. The shaded cells in Figure 7.2 indicate the optimum combinations of code and modality. The
visual spatial format is the preferred format for spatial information; for example, a map for understanding
where things are. Words (whether spoken or text) are less proficient when the spatial relations are at all
complex.
In contrast, tasks that demand verbal working memory are more readily served by speech; especially if
the verbal material can only be displayed for a short interval (Wickens, Sandry, & Vidulich, 1983). This is
because echoic memory (a short-term sensory store that retains auditory information for three to four
seconds) has a slower decay than iconic memory (the visual analog of echoic memory); speech has obligatory
access to the phonological store, and speech is more compatible with the vocalization used in rehearsal. This
guideline is supported by laboratory studies showing that short sequences of verbal material are better retained
for short periods when presented by auditory rather than visual means (e.g., Nilsson, Ohlsson, & Ronnberg,
1977).
This observation has considerable practical importance when verbal material is to be presented for
temporary storage (e.g., navigational entries presented to the aircraft pilot, or the outcome of diagnostic tests
presented to the physician). Such information will be less susceptible to short-term loss when presented by
auditory channels (either spoken or through speech synthesis). However, auditory presentation is much less
effective when the message is relatively long (i.e., longer than four to five unrelated words or letters) because
of the decay of WM over time, as discussed next. In this case, there is a need to physically prolong the
message—an optimal format would be one in which auditory delivery is “echoed” by redundant visual
information (e.g., Helleberg & Wickens, 2003; see Chapter 6), or at least can be repeated by a simple user
request.
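To make these guidelines concrete, the sketch below (Python) encodes the mapping of Figure 7.2, together with the message-length caveat, as a simple recommendation function of the kind a designer might consult during a task analysis. It is a minimal illustration only; the four-item cutoff and the function name are our assumptions drawn from the discussion above, not a published algorithm.

```python
# Illustrative sketch of the stimulus/central-processing compatibility guideline
# discussed above. The thresholds and labels are assumptions for illustration,
# not part of the S-C-R compatibility principle itself.

def recommend_display(wm_code: str, n_items: int = 1) -> str:
    """Suggest a display format given the working memory code a task uses.

    wm_code  -- 'spatial' or 'verbal' (the central-processing code)
    n_items  -- approximate number of unrelated verbal items in the message
    """
    if wm_code == "spatial":
        # Spatial tasks (e.g., understanding where things are) are best served
        # by a visual-spatial format such as a map.
        return "visual-spatial display (e.g., map or graphic)"
    if wm_code == "verbal":
        if n_items <= 4:
            # Short verbal messages held for temporary storage benefit from
            # speech: echoic memory decays more slowly than iconic memory.
            return "auditory (speech), optionally echoed by text"
        # Longer messages exceed what echoic memory and rehearsal can hold,
        # so prolong the message physically with a visual (text) display.
        return "visual text, optionally echoed by redundant speech"
    raise ValueError("wm_code must be 'spatial' or 'verbal'")


if __name__ == "__main__":
    print(recommend_display("spatial"))
    print(recommend_display("verbal", n_items=3))   # e.g., a short clearance
    print(recommend_display("verbal", n_items=8))   # e.g., a long clearance
```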

2.4 Limitations of Working Memory: Duration and Capacity

2.4.1 DURATION In the late 1950s experiments conducted by Brown (1959) and Peterson and Peterson (1959)
used similar techniques to determine the duration of working memory. How long does information in working
memory last if it is not rehearsed? In the Brown-Peterson paradigm, subjects are asked to retain a simple
sequence of three random letters in memory for short intervals. To prevent subjects from rehearsing the letters,
they are asked to count backward aloud by threes from a designated number, presented just after the item to be
remembered. This is sometimes called a “filler task.” On hearing a recall cue, the subject stops the count and
attempts to retrieve the appropriate item. The researchers found that retention dropped to nearly zero after
only 20 seconds when rehearsal was prevented in this manner. This decay function is shown schematically by
the three item curve of Figure 7.3.
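The paradigm itself is simple enough to sketch. The fragment below (Python; the counting pace is an assumed value, chosen only for illustration) generates one Brown-Peterson trial: a random three-letter item, a backward-counting filler sequence that fills the retention interval, and a recall cue.

```python
# Minimal sketch of one Brown-Peterson trial as described above.
# The 1.5-second-per-count pace is an assumption for illustration only.
import random
import string

def brown_peterson_trial(retention_interval_s: float, count_pace_s: float = 1.5):
    """Return the to-be-remembered item and the backward-counting filler task."""
    # A simple sequence of three random letters to be retained.
    item = "".join(random.sample(string.ascii_uppercase, 3))
    # Counting backward aloud by threes from a designated number prevents rehearsal.
    start = random.randint(300, 999)
    n_counts = int(retention_interval_s / count_pace_s)
    filler = [start - 3 * i for i in range(n_counts + 1)]
    return item, filler

if __name__ == "__main__":
    item, filler = brown_peterson_trial(retention_interval_s=18)
    print("Remember:", item)
    print("Count aloud:", filler)
    print("RECALL CUE -> report the letters")
```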
The transient characteristic of working memory has been demonstrated repeatedly in numerous variations
of the Brown-Peterson paradigm. The various estimates generally suggest that in the absence of continuous
rehearsal, little information is retained beyond 10 to 15 seconds. Visuospatial information is subject to similar
decay. For navigational information (Loftus, Dark, & Williams, 1979) and information used by radar
controllers (Moray, 1986), decay functions similar to those in Figure 7.3 have been obtained. Indeed, this
notion that our memory inexorably decays over time is an important part of recent models of working
memory (e.g., Barrouillet, Bernardin, & Camos, 2004; Burgess & Hitch, 2006). However, Lewandowsky,
Oberauer, and Brown (2009) caution that decay is not purely a function of time; rather, decay is related to
interference by other factors, including both the filler task and the material that is being remembered. In sum,
the findings suggest that this transience applies to both spatial and verbal working memory and
presents a serious problem for a number of work domains when to-be-remembered information cannot be
rehearsed due to intervening tasks.
As noted, an apparent solution to the problem of such memory failures is to augment the initial transient
stimulus (whether visual or auditory) with a longer-lasting visual display—a visual echo of the message a
pilot receives from air traffic control, for example. Interestingly, current trends in ground-air communications
are to present those communications directly on a text-only display called digital data link, bypassing
traditional radio communications (Kerns, 1999; see Chapter 6). Now the issue is whether these visual displays
should themselves be echoed with auditory synthetic speech, so that redundant presentation is used (see
Chapter 6).

FIGURE 7.3 Effect of retention interval on recall from working memory with rehearsal prevented.

2.4.2 CAPACITY Working memory is also limited in its capacity (the amount of information it can hold), and
this limit interacts with time. The one and five item curves of Figure 7.3 represent decay functions in a
Brown-Peterson paradigm that would be generated by one- and five-letter items, respectively (Melton, 1963).
Not surprisingly, faster decay is observed when more items are held in working memory, mainly because
rehearsal itself (covert speech by the articulatory loop) is not instantaneous. With more items to be rehearsed
in the phonological store, there will be a longer delay between successive rehearsals of each item. This delay
increases the chance that a given item will have decayed below some minimum retrieval threshold before it is
next encountered in the rehearsal sequence. In fact, the speed of rehearsal, as dictated either by the length of
time it takes to say different items (longer → slower) or by differences between people, seems to influence
directly the capacity of working memory (Baddeley, 1986, 1990). The faster the speed, the larger the capacity.
For example, Chinese spoken words for digits are shorter than those words in English, whereas the
corresponding Welsh words are longer. The difference in the time needed to rehearse Chinese and Welsh
words, compared to English, causes an increase in span for the (shorter) Chinese words (9.9 digits; Hoosain &
Salili, 1988) and a decrease in span for the (longer) Welsh words (5.8 digits; Ellis & Hennelly, 1980).
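This relationship can be made concrete with the common working assumption, following Baddeley, that the articulatory loop holds roughly what can be articulated in about two seconds. The per-digit articulation times used below are illustrative guesses rather than measured values, but they show how rehearsal speed alone can reproduce span differences of the size reported for Chinese, English, and Welsh.

```python
# Rough worked example: span ~ loop duration / articulation time per item.
# The ~2 s loop duration follows Baddeley's account; the per-digit
# articulation times below are illustrative assumptions, not measured values.

LOOP_DURATION_S = 2.0

articulation_time_s = {   # assumed seconds to say one digit aloud
    "Chinese": 0.20,
    "English": 0.30,
    "Welsh":   0.35,
}

for language, t in articulation_time_s.items():
    predicted_span = LOOP_DURATION_S / t
    print(f"{language:8s}: ~{predicted_span:.1f} digits")

# Faster articulation -> more items can be refreshed before they decay,
# consistent with the larger digit span reported for Chinese (9.9) than
# for Welsh (5.8).
```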
The limiting case occurs when a number of items cannot be successfully recalled even immediately after
their presentation and with full attention allocated to their rehearsal, as in the seven item curve in Figure 7.3.
This limiting number is sometimes referred to as the memory span. As we have already discussed, working-
memory span is measured by requiring some form of cognitive processing (e.g., reading sentences or simple
arithmetic), coupled with remembering the final words of the sentences, arithmetic totals, or unrelated words
(e.g., Turner & Engle, 1989). Memory span is simply the maximum number of items that are recalled
correctly.

In a classic paper discussed previously in Chapter 2 in the context of absolute judgment, Miller (1956)
identifies the limit of memory span as “the magical number seven plus or minus two” (the title of his paper).
Thus, somewhere between five and nine items defines the maximum capacity of working memory when full
attention is deployed to rehearsal. However, subsequent research has downgraded this estimate to three
(Broadbent, 1975) or four items (Cowan, 2001). It appears to be particularly restrictive in the so-called “N-
back” task, where one hears a random series of letters, digits or words, and responds with the item that was
heard N-items ago. Furthermore, as we discussed earlier in this section, the length of time it takes to say
different items seems to influence directly the capacity of working memory (Baddeley, 1986, 1990).
Even though this 7 ± 2 ‘limit’ should not be taken too literally (or might be “reset” to, say, 5 ± 2), it does
provide important guidance for system design. When presenting auditory or visual information, tasks that
encroach on the limits of five to nine items should be avoided. For auditory information, we might consider
the length of strings of navigational information that are issued to a pilot. For example, the message “Change
heading to 155 and speed to 240 knots when you reach flight level 180” approaches or exceeds the limits. Or
consider the number of options to be selected from a computer menu. If all alternatives must be compared
simultaneously with one another to select the best, the choice will be easier if the number does not exceed
working memory limits (Mayhew, 1992).

2.4.3 CHUNKING To this point, we have spoken loosely of an “item” in working memory, defining it explicitly
as a letter in the Brown-Peterson paradigm. However, Miller (1956) proposed that the capacity of working
memory is 7 ± 2 chunks of information. A chunk can be defined as a set of adjacent stimulus units that are
tied together by associations in the subject’s long-term memory. Thus seven three-letter words will define the
capacity of working memory, even though this represents 21 letters, because the letter trigrams (cat, dog, etc.)
are each familiar sequences to the subject—repeatedly experienced together—and so the three letters within
each are stored together in long-term memory. The 21 letters thereby define seven chunks. Furthermore, if the
seven words are combined in a familiar sequence so that the rules that combine the units are also stored in
long-term memory (“London is the largest city in England”), the entire string consists only of a single chunk.
Thus, the family of decay curves shown in Figure 7.3 describes equally well a string of 1, 3, 5, or 7
unrelated letters, words, or familiar phrases (although working memory capacity is somewhat reduced for
more complex, higher-order chunks like familiar phrases). In each case, the items within each chunk are
bound together by the glue of associations in long-term memory; a process which takes place in the episodic
buffer component of Baddeley’s model of working memory (Baddeley, Hitch, & Allen, 2009). Recoding
information by semantically associating low-level elements is called chunking, and is a valuable technique
for maintaining information in working memory (a concept which we will further elaborate on in our
discussion of skilled memory and expertise).
Chunking may be hindered or helped by properties of the to-be-memorized material. System designers
should exploit this difference by forming codes to facilitate chunking. “Vanity” license plates in many
American states contain words—473 HOG—a strategy that takes advantage of this principle. Commercial
phone numbers often use familiar alphabetic strings in place of digits (“Dial 263 HELP”). In general, letters
allow better chunking than digits because of the greater number and meaningfulness of their possible sequential
associations.
Chunking may also be facilitated by parsing; that is, by physically separating likely chunks. The
sequence 4149283141865 is probably less easily encoded than 4 1492 8 314 1865, which is parsed to
emphasize five chunks (“for Columbus ate pie at Appomattox”). For an imaginative reader these five chunks
may be “chunked” in turn as a single visual image. Loftus, Dark, and Williams (1979) investigated pilots’
memory of air traffic control information and observed that four-digit codes were better retained when parsed
into two chunks (27 84) than when presented as four digits (2 7 8 4). Bower and Springston (1970) presented
sequences of letters that contained familiar acronyms and found that memory was better if pauses separated
the acronyms (FBI JFK TV) than if they did not (FB IJF KTV). Finally, Wickelgren (1964) found that our
recall of telephone numbers is optimal if numbers are grouped into chunks of three digits. Results such as
these have led to the general recommendation that the optimum size of grouping for any arbitrary
alphanumeric strings used in codes is three to four (Bailey, 1989).
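Because the recommendation is so simple, it is easy to build into a display. The sketch below (Python; the function name and default group size are ours) parses an arbitrary alphanumeric code into small groups for presentation; where meaningful boundaries are known (acronyms, dates), those should of course take precedence over fixed-size grouping.

```python
# Illustrative sketch: parse an arbitrary code string into groups of three
# to four characters to support chunking, per the recommendation above.

def parse_for_display(code: str, group_size: int = 3) -> str:
    """Insert spaces so an arbitrary alphanumeric string is shown in small groups."""
    code = code.replace(" ", "")
    groups = [code[i:i + group_size] for i in range(0, len(code), group_size)]
    return " ".join(groups)

if __name__ == "__main__":
    print(parse_for_display("FBIJFKTV"))   # -> 'FBI JFK TV'
    print(parse_for_display("2634357"))    # -> '263 435 7'
```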

3. INTERFERENCE AND CONFUSION


In addition to the forgetting that occurs because of the passage of time and the overload of capacity, material
to be remembered (MTBR) is also lost from working memory through interference from information learned
at another time. In fact, dealing with the effects of interference from previous memories is one of the primary
functions of executive control within working memory (e.g., Anderson, 2003), and such interference operates
similarly, whether retention and forgetting are from working memory or long term memory. In both
memories, it is important to distinguish two different kinds of interference in terms of the time sequence
between presentation of interfering material and the MTBR.

FIGURE 7.4 Effects of RI and PI on forgetting of material to be remembered (MTBR). Dialing a phone number will produce PI for memory of
the next phone number. A conversation after the second number has been looked up will produce RI.

Figure 7.4 depicts a time sequence during which the operator engages in some activity, is given the
MTBR, performs some further activity, and finally retrieves, or “dumps,” the MTBR. Proactive interference
(PI) occurs when activity engaged in prior to encoding the MTBR disrupts its retrieval (Keppel &
Underwood, 1962; Jonides & Nee, 2006). For example, prior mugshot exposure decreases eyewitness
accuracy at a subsequent lineup (Deffenbacher, Bornstein, & Penrod, 2006). The effects of PI can be
pronounced, especially when the operator must engage in a series of memory tasks with little time between
them (e.g., air-traffic control; Hopkin, 1980), when engaged in another task (Kane & Engle, 2000), and for
people with low working memory capacity (Kane & Engle, 2000; Whitney, Arnett, et al., 2001). Using verbal
material characteristic of pilots and air traffic controllers, Loftus, Dark, and Williams (1979) found that at
least ten seconds’ delay was necessary before material presented in an exchange no longer disrupted memory
for a subsequent exchange.
Whereas PI arises as a result of previous learning or activity, retroactive interference (RI) arises as a
result of new learning or activity interfering “backwards in time.” For example, after using a new phone
number for a while, we find it hard to remember our old telephone number—even one which we had used for
years. The Brown-Peterson paradigm described above also demonstrates retroactive interference in working
memory from the counting filler task. Many studies have shown that memory for verbal information (a list of
words to be remembered) is interfered with by the subsequent presentation of other verbal information (e.g.,
McGeoch, 1936; see Anderson, 2003, for a review).
Retroactive interference has also been observed for the identification of crime suspects in a lineup
(Chapter 2); target identification can be impaired when the target person is not included among mugshots and
no one in a mugshot is present in the subsequent lineup (Davies, Shepherd, & Ellis, 1979). Hole (1996)
showed that to-be-remembered spatial information can be interfered with by the subsequent presentation of
other spatial information. Indeed, these results appear similar to the interference seen between two concurrent
tasks discussed in 2.1.1, except that here, the two activities occur at different times. Like concurrent
interference, retroactive interference can be reduced or eliminated if the two sources of information are coded
to use different working memory components (e.g., Haelbig, Mecklinger, et al., 1998).
Items in working memory are sometimes forgotten because they are confused with other items held at the
same time because of their similarity in content, and not just in code. Intuitively, we can see how this
confusion will be most likely to occur if the items are similar to one another. When an air traffic controller
must deal with a number of aircraft from a fleet having similar identification codes (e.g., AI3404, AI3402,
AI3401), the interference caused by the similarity of the items makes it difficult for the controller to maintain
their separate identity in working memory (Fowler, 1980). The controller must maintain in working memory
the identity of separate aircraft along some ordered continuum (e.g., projected time of arrival or position in
airspace).
Similarity also increases the degree of retroactive and proactive interference, as the MTBR become more
confused with subsequently or previously encountered material when they share more features in common with that
material. For example, one reason a letter code followed by a number code (HTR 4728) will be better retained
than, say, the sequence 273 4728 is that the three letters (in the first case) produce less PI than the three digits (in
the second case) on recall of the four-digit string.
Space, and in particular whether items share a location or occupy distinct ones, also exerts a strong influence on
confusion and interference in memory. Consider, for example, two different display layouts for keeping track of changes in
attributes of four different systems (for example, location and status of four different robots or unmanned
vehicles; see Chapter 5). In one layout, there is a single window in which changes to the parameters are
signaled for the identified unmanned agent under supervision. In the other layout there are four different
(spatially separated) windows. The first layout is more economical of space. But research by Hess, Detweiler,
and Ellis (1999; Hess & Detweiler, 1996) indicates that the spatially distributed display, by eliminating the
source of spatial confusion (shared location) and/or making location an important source of
discrimination, improves memory in this keeping-track task. As we discuss later in the chapter, this will
improve situation awareness of the dynamic state of the fleet of unmanned agents. Other sources of difference
may enhance further the benefit of this spatial distinction, such as distinct colors for the status of each.
The implications of memory interference and confusion for system design are five-fold: When designing
coding systems the designer should: (1) avoid creating codes with large strings of similar-sounding chunks;
(2) use different codes (verbal vs. spatial) for the different sources of information; (3) ensure that the intervals
before, during, and after storage are free of any unnecessary activity that uses the same code (spatial or
verbal), and particularly the same material (e.g., all digits) as the stored information; (4) use different scales or
scale labels for attributes, or separate and unique spatial locations for the objects that must be monitored; and
(5) in any new system design, include a working memory analysis as a vital component of the more general task
analysis, to determine the circumstances in which the operator might need to retain information without visual
backup for any period, even one as short as a few seconds.
In closing our treatment of working memory, we note that, in some real-world systems, recent
information is kept available on a display, and does not have to be remembered. For example, the air traffic
controller has the status of relevant aircraft continuously visible and so can respond on the basis of perceptual
rather than memory data. However, the principles described above should still apply to these systems. As
discussed in Chapter 6, an efficiently updated memory will ease the process of perception through top-down
processing and will unburden the operator when perception may be directed away from the display (i.e.,
scanning). Furthermore, if a system failure occurs, display information may be eliminated—not a trivial
occurrence in air traffic control. In this case, an accurate working memory becomes essential and not just
useful.

4. EXPERTISE AND MEMORY


In the previous section, we discussed how the capacity and decay limitations of working memory could be
reduced by chunking material whenever possible. It is clear that effective chunking will make use of
information stored in long-term memory. In this section we first describe expertise, and then relate it to the
chunking concept. After that, we describe the concepts of skilled memory and long-term working memory,
which provide a theoretical understanding of the relationship between working memory and long-term
knowledge.

4.1 Expertise
Expertise is, almost by definition, inextricably linked to both memory and learning. Experts, through learning
and training, are assumed to remember things about their domain that novices do not, whether the memory is
explicit, like facts about the task, or implicit, like the procedural skills necessary to use a piece of equipment.
Expertise is domain specific (Cellier, Eyrolle, & Mariné, 1997); that is, being an expert does not provide
general performance advantages but rather advantages in a specified domain (e.g., a sport, a game, a particular
occupation). Cellier et al. note the following general characteristics of expertise:
1. It is acquired through practice or training in a domain;
2. It generally provides a measurable performance advantage; and
3. It may involve specialized, rather than generic, knowledge.
Attempting to define expertise is one thing; actually determining who is an expert is much harder. One
might assume that peer-nomination, extended domain experience and high levels of training and education
would all indicate high levels of expertise. However, Ericsson and Ward (2007) found that performance of
these so-called experts was not reliably better than their less-experienced colleagues. Citing recent reviews in
medicine, they argue that education and clinical experience are often unrelated to the quality of treatment
outcomes, and that performance can actually decrease without continued training. (In Chapter 8, we present
similar findings regarding expertise in decision making and prediction.) This last point is an important one;
high levels of expertise are not acquired simply through experience or innate “gifts” or abilities; rather, they
are the result of intense and deliberate practice over many years (Ericsson, 2006; Gobet, 2005).
Being an expert can have corollary benefits. A task that defines the domain of expertise is called
intrinsic (e.g., playing a chess game); a task that is not central to the domain of expertise, but greater
expertise in the domain improves performance nonetheless, is called a contrived task (e.g., better recall of
pieces of a chessboard after a game; Vicente & Wang, 1998). It is these contrived tasks that provide
researchers with the means to examine the memory structures involved in high levels of expertise, because
such tasks are novel to both novices and experts. For example, expert chess players are unlikely to have
deliberately practiced memorizing chess positions, but are nonetheless far better at recalling chess positions
than less skilled players (Chase & Simon, 1973). Experts’ success on contrived tasks is also common in many
other domains: process control (Vicente, 1992); aviation (Wiggins & O’Hare, 1995); and nursing (Hampton,
1994). For example, Vicente (1992) showed that experts had much better recall of the state of a simulated
thermal–hydraulic process plant when the process variables worked normally and when a fault occurred
(intrinsic tasks) but experts also performed better than novices even when process variables were driven in
random fashion (a contrived task).
Thus, although expertise tends to be specific to a domain of skill, it is more general than just that
information provided at training or that experienced directly. In the next section, we will discuss how
expertise facilitates the use of chunking. Then we will describe a theoretical framework that specifies the
mechanism underlying experts’ improved performance. In Chapter 8, we describe expertise in decision
making.

4.2 Expertise and Chunking


One of the more enduring models of expert memory has been Chase and Simon’s (1973) chunking theory
which posits that long-term memory information can be grouped together in a meaningful way and that it is
encoded as a single perceptual unit, or chunk (see also Section 2.4.3). More recently, Gobet and Clarkson
(2004) proposed their template theory as a refinement to the chunking model whereby frequently-
encountered chunks develop into higher-level structures (templates) that allow information to be rapidly
encoded into long-term memory. This refinement to the chunking theory explains the relatively small effect of
interfering stimuli between presentation and recall of chess positions (Charness, 1976); experts can rapidly
encode chess positions to long-term memory, whereas novices have to rely on working memory which is
more susceptible to interference.
Chunking strategies can be acquired through expertise. Chase and Ericsson (1981) examined the memory
spans of expert runners, and found that they used grouping principles based on running statistics. In the same
manner, we can group sets of digits or letters in license plate numbers using codes from domains with which we
are familiar. In fact, a conclusion from several studies of expert behavior in a variety of domains is that
the expert is able to perceive and store the relevant stimulus material in working memory in terms of its
chunks rather than its lowest-level units (Anderson, 1996). Such domains include computer programming
(Barfield, 1997; Vessey, 1985; Ye & Salvendy, 1994), chess (Chase & Simon, 1973; deGroot, 1965; Gobet,
1998), planning (Ward & Allport, 1997), medicine (Patel & Groen, 1991), air traffic control (Seamster,
Redding, et al., 1993), and flying (Sohn & Doane, 2004).
Barfield (1997) performed a study in which expert and novice programmers viewed a short program
organized in executable order, in random chunks, or random lines. The eye movements of the programmers
were monitored when they examined the program. Expert programmers encoded more lines of the program
per glance than did novices whether the program was presented in order or randomly ordered chunks, but not
in random lines. When asked to recall the program later, expert programmers recalled more lines of organized
code if it had been in order or in random chunks, but not random lines. The fact that expert programmers
could encode more lines of program per glance when the program was organized suggests that they were
encoding chunks into working memory, rather than the individual lines encoded by the novice programmers.
Ye and Salvendy (1994) and Vessey (1985) found similar results, both finding a relationship between
chunking ability and programming expertise. In addition, Ye and Salvendy found that novices’ chunks tended
to be smaller than experts’.

4.3 Skilled Memory and Long-Term Working Memory


Consider yourself reading the text on this page. To perform this task well, you must maintain access to large
amounts of information. For example, to understand what “this task” refers to in the previous sentence you
must retain some knowledge of the first sentence. As you read through this chapter, you retain some
information from previous paragraphs in order to properly integrate the current topic with earlier topics.
Although we don’t think of it as such, text reading is a skilled activity, requiring years of training. Clearly
such skilled tasks must involve working memory, but as we have already touched upon, there are two aspects
of performance in skilled tasks that are difficult for the traditional chunking-based view of working memory
to account for. The first is that skilled activities can be interrupted, and later resumed, with little effect on
performance (Ericsson & Kintsch, 1995). If working memory only stores information temporarily, how can it
account for this result?
The second aspect is that performance in skilled tasks requires quick access to a large amount of
information. However, we know there are very strict limits on the amount of information that can be
maintained in working memory, and so skilled performance defies the concept of a limited capacity. One
could argue that such information is retrieved from long-term memory, but access to this information appears
to be faster than typical retrieval times for information in long-term memory (usually several seconds;
Ericsson & Kintsch, 1995).
For these reasons, Ericsson and Kintsch (1995) propose that working memory includes another
mechanism based on skilled use of storage in long-term memory. They refer to this mechanism as long-term
working memory (LT-WM). Information in LT-WM is stable, but is accessed through temporarily active
retrieval cues in working memory. LT-WM has a longer time constant than the several seconds of working
memory. The waiter who relies upon memory to associate dinner orders with customers (Ericsson & Polson,
1988) would be doomed if only WM were used. Yet customer order information is clearly retained for far less time
than the several hours (minimum) more typical of LTM. Hence LT-WM is used here.
The notion of temporarily active retrieval cues in LT-WM is related to Gobet and Clarkson’s template theory of
chunking discussed above, which posits that the high-level templates provide a retrieval structure to support
rapid encoding to long-term memory. As people acquire domain-specific skills, they acquire
retrieval structures, which in turn can extend their working memory for that particular skilled activity. These
retrieval structures allow experts to place the to-be-remembered information in LT-WM rather than working
memory. This would explain why, for experts, reduced interference occurs when performing another task
(verbal or spatial) simultaneously with a memory task. Presumably this is because experts store task-related
information in a LT-WM (or templates) retrieval structure; if the information was only stored in the expert’s
working memory, another task should have interfered with it (Ericsson & Kintsch, 1995). Note that these
retrieval structures are acquired for particular skill domains (medical diagnosis, waiting tables, and mental
arithmetic). There is not an improvement in the general capacity of working memory, and the expert
physician, waiter, or calculator is reduced to normal performance in most other situations (Ericsson &
Kintsch, 1995).
An example of a retrieval structure supporting LT-WM is one used by the waiter JC studied by Ericsson
and Polson (1988). JC would link all items of a food category, such as starches, in a pattern linked to table
locations. Going around a table, for example, JC might remember a reversing pattern like rice, fries, fries, rice
(Ericsson & Kintsch, 1995). The retrieval structure underlies common mnemonic techniques (Wenger &
Payne, 1995) and may account for results showing that aircraft importance affected air traffic controllers’
memory for flight data. Thus Gronlund, Ohrt, et al. (1998) found that as incoming flight information was
provided, air traffic controllers classified aircraft in terms of importance and used this classification for later
recall.

5. EVERYDAY MEMORY
In this section, we will explore recent research on memory phenomena which underpin the performance
of tasks that are commonplace in our daily routine, such as remembering to take medication at a particular
time, or knowing which friend to ask about a particular topic. First, we will discuss how we are able to
remember to do a particular task in the future (and why we sometimes forget). These prospective memory
tasks range from the mundane (remembering to take out the garbage) to the essential (remembering to take
medication). Second, we will discuss how shared experiences often lead us to encode, store and retrieve
experiences as a group. Such a transactive memory system allows us to locate and retrieve information from
a group that might otherwise be unavailable to us.

5.1 Prospective Memory
Every day we all form intentions to perform particular actions at some point in the
future, and often fail to carry them out. These efforts to “remember to remember” pervade our social, domestic, and working lives, and the
implications of failure to remember these intentions can be dire (e.g., forgetting to call your mother on her
birthday) or even life-threatening (e.g., a pilot failing to remember to deploy the landing gear on approach to
landing). The role of attention and memory processes underlying this phenomenon has been the subject of
much research conducted under the rubric of prospective memory (PM; Einstein & McDaniel, 1996;
Dismukes, 2010).
As might be expected with retrospective memory, research has shown that, in general, the greater
the delay between the formation of the intention (i.e., encoding) and the point in time that the action should
take place (i.e., recall) the greater the reduction in PM performance (for a review see Martin, Brown, & Hicks,
2011). Similarly, McBride, Beckner, and Abney (2011) found that even for relatively short delays (less than
20 minutes) we show a decline in PM performance over the first few minutes of the delay, especially when we
are engaged in tasks that are not related to the intended future action.
However, if we are involved in a task that is related to the intended future action, then little or no decline
in PM performance is observed (McDaniel, Einstein, et al. 2004). This finding suggests that we are more
likely to remember to do something, if that something is related to what we are currently doing. For example,
for pilots forgetting to lower the landing gear of the aircraft or an air traffic controller forgetting that a plane is
positioned on an active runway, spontaneous retrieval will be a function of the degree of relatedness between
the tasks. In other words, it is not enough to be flying or controlling aircraft; rather the related task must
actually involve the landing gear controls, or, for the controller, operations involving that particular aircraft. In
fact, in the specific case of related tasks, the longer the delay the more likely it is that we spontaneously
rehearse the intention or are reminded of it by cues in the environment (Martin et al., 2011). For
example, Hicks, Marsh, and Russell (2000) found PM performance was actually increased for a 15 minute
delay compared to a five minute delay. In summary, a delay hurts less (or may even help) when the task is
related to the intention because it can act as a reminder.
Motivation has also been found to influence whether or not an intention to act is remembered (Kliegel,
Martin, et al., 2004). For example, we are more likely to remember to call our mother on her birthday,
compared to remembering to return a library book. Penningroth, Scott, and Freuen (2011) examined the
effect of social obligation and its contribution to the perceived importance of PM tasks. They found that
PM tasks rated as social by participants were also rated as more important. In addition, social PM tasks were
also more likely to be remembered than non-social ones. Putting it another way, a key determinant of
remembering to complete a PM task is the extent to which others are affected. Finally, 12 hours of sleep,
rather than 12 hours of wakefulness, has been found to improve prospective memory for an action whose intention was
formed 12 hours before (Scullin & McDaniel, 2010).
Yet despite our best efforts, we sometimes fail to remember our intention for future action. As a result we
often adopt cognitive strategies—either intentionally or not—to improve our chances of remembering. We
often use cues in our environment to trigger our intention to act. For example, if we see a garbage truck in
our street on the day that the garbage is collected, this cue will be a powerful reminder for us to take out the
garbage. Knight, Meeks, et al. (2011) found that such cues are highly effective, even if the cues occur out of
context (e.g., we see the garbage truck on a different day, and parked in a parking lot). Einstein and McDaniel
(1990) have shown that the degree to which an intention-related cue stands out (is salient) relative to the
other non-related cues has a positive impact on prospective memory. For example, uncommon words or
unusual events—seeing a garbage truck out of context—make better cues. As we have seen in Chapter 3, it
seems that salient items involuntarily capture our attention which results in an evaluation of the item’s
significance to our current and future activities. The strength of association between the cue and the intended
action has also been found by McDaniel et al. (2004) to increase PM. McDaniel et al. argue that once a strong
association between cue and intention has been formed, subsequent encounters with the cue will result in an
automatic retrieval of the associated intention, requiring little or no conscious effort.
After taking the same medication for a while, seeing the familiar shape of the medicine bottle on the
counter is a very strong reminder to take your medication. The impact on PM of such a strong association
between a cue and its related intention underlies the effectiveness of one of the more robust strategies to guard
against forgetting future intentions—the so-called implementation intention (Gollwitzer, 1999). This
strategy comprises two important components of the intention to act: the intended action itself (i.e., the
“what”), and the future situation in which the intention must be executed (i.e., the “where” and the
“when”). This strategy manifests itself in the development of a verbal association in the form of “in the event
of X, I will do Y.” McFarland and Glisky (2011) found that both forming the implementation intention and
imagining the circumstances of implementation improved PM; although either activity alone worked as well
as both together.
These findings show our PM performance can be greatly enhanced by improving cue saliency and
strengthening cue-intention associations through the simple act of forming implementation intentions, or
imagining ourselves doing the intended action under specific circumstances in the future. We have also seen
that cues in our environment can spontaneously trigger the intention, even when they are seen out of context.
We often try to help this along by deliberately introducing salient cues into our environment at the same time
the intention is formed. For example, by putting a garbage can by the front door the night before collection,
the intention “If I see a garbage can by the front door, I will put out the garbage for collection” is triggered
when we go to leave the house the next morning.
Sometimes we use prospective memory to remember to do a task in the future because it is simply not
appropriate to do it now (“I shouldn’t be taking my medication now at 9:00 AM when I am supposed to
remember to take it at 10:00 AM”). However, at other times the delay is because we are interrupted by another
task, and can resume the interrupted task only after the interrupting task has been completed. This particular
issue of interruption management will be discussed extensively in Chapter 10 as a part of multitasking. But
it is vital to note the role of PM in interrupted task resumption, and indeed interruption management and PM
are close cousins (Dismukes, 2010).
In summary, we have seen the importance of providing support to the user; particularly in work settings
where PM failures are costly or have critical consequences, such as medicine, aviation, and air traffic control.
Fortunately, for the Human Factors engineer there appear to be several strategies for training or interface
design that can increase the likelihood of “remembering to remember.” For example, electronic versions of
the paper-based “sticky notes” have been implemented on several desktop and mobile-based computing
platforms (see also Chapter 10 for technological solutions relating to the management of interruptions).
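As one illustration of how such support might be built, the sketch below (Python; entirely hypothetical and not a description of any existing application) stores implementation intentions as explicit cue-action pairs, so that a device can surface the intended action whenever the associated cue (a time, a place, or an event) is detected.

```python
# Hypothetical sketch of an electronic "implementation intention" store:
# each entry pairs a triggering cue (the "when/where") with an intended
# action (the "what"), mirroring the "in the event of X, I will do Y" form.
from dataclasses import dataclass

@dataclass
class Intention:
    cue: str       # the future situation, e.g. "garbage can by the front door"
    action: str    # the intended action, e.g. "put out the garbage"

class ReminderStore:
    def __init__(self):
        self.intentions: list[Intention] = []

    def form(self, cue: str, action: str) -> None:
        self.intentions.append(Intention(cue, action))

    def on_event(self, observed_cue: str) -> list[str]:
        """Return intended actions whose cue matches an observed event."""
        return [i.action for i in self.intentions if i.cue == observed_cue]

if __name__ == "__main__":
    store = ReminderStore()
    store.form("10:00 AM", "take medication")
    store.form("garbage can by front door", "put out the garbage")
    print(store.on_event("10:00 AM"))   # -> ['take medication']
```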

5.2 Transactive Memory


As we have seen so far, research on memory has focused largely on how individuals encode, store and retrieve
knowledge. However, in real life we often supplement our own limited memories with those of our family,
friends and co-workers (Wegner, Giuliano, & Hertel, 1985). As we have seen in Chapter 6 (within the context
of team situation awareness on the flight deck), and will discuss later on in this chapter (within the context of
collaborative problem solving), the effectiveness of teams is highly dependent on the efficient sharing of
information and knowledge between team members. This sharing of information and knowledge can be
described in terms of a transactive memory system (TMS) comprising two components: the knowledge
stored by each individual, and knowing what knowledge (i.e., meta-memory) each individual has in their
possession. A TMS provides group members with information regarding the knowledge they have access to
within the group, and in doing so, greatly increases the amount of information that they have at their disposal,
as well as the speed at which it can be accessed. This concept of shared awareness of “who knows what”
within a group has already been discussed in Chapter 6 within the context of team situation awareness (e.g.,
Gorman, Cooke, & Winner, 2006; Cannon-Bowers & Salas, 2001).
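A minimal sketch of the meta-memory component (Python; the topics and names are invented for illustration) is simply a directory of “who knows what” that a group member can consult before searching his or her own memory or polling the whole group.

```python
# Hypothetical sketch of the meta-memory ("who knows what") component of a
# transactive memory system: a directory mapping knowledge areas to the group
# members believed to hold that expertise.

who_knows_what = {
    "radio wiring":    ["Dana"],
    "antenna tuning":  ["Lee"],
    "casing assembly": ["Dana", "Kim"],
}

def whom_to_ask(topic: str) -> list[str]:
    """Return the members credited with expertise on a topic, if any."""
    return who_knows_what.get(topic, [])

if __name__ == "__main__":
    print(whom_to_ask("antenna tuning"))   # -> ['Lee']
    print(whom_to_ask("scheduling"))       # -> [] (no one credited; ask the group)
```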
The benefits of groups having transactive memory are well established, having been researched in
laboratory (e.g., Liang, Moreland, & Argote, 1995) and field settings (e.g., Michinov and Michinov, 2009).
Research shows that groups that possess a well-developed TMS perform better than those groups that do not.
A TMS comprises three dimensions: the specialization of expertise across members of the group; the
coordination between members of the group; and the credibility of each group member’s expertise on a given
task (Liang, Moreland, & Argote, 1995; Lewis, 2003).
More recently, Michinov and Michinov (2009) investigated the relationship between these three
dimensions and the academic performance of students working in small study groups. Students completed a
series of group learning tasks during the semester followed by a self-report questionnaire on transactive
memory at the end of the course. The results showed a significant positive relationship between self-report
measures of transactive memory and learning task performance based on coordination and credibility within
the group. In addition, learning performance increased as a function of members developing specializations
within the group. These results suggest that over time members of a group increasingly specialize and in
doing so perform better as a collective. In order to carry out tasks within the group, team members coordinate
their individual efforts to perform the assigned task, and this coordination process has a positive impact on
their overall performance.
The benefits of specialization are also apparent when the group is asked to encode and retrieve
information collaboratively. The level of transactive memory within a group will determine how successful
they are at doing that; for groups with little or no transactive memory two or more persons recalling at once
do not produce any more new items compared to when they recalled on their own. This phenomenon of
collaborative inhibition (Weldon & Bellinger, 1997) appears to be related to a disruption in retrieval strategy
through hearing another group member’s recalled items (Dahlström, Danielsson, et al., 2011). For groups with
an established TMS, collaborative inhibition is reduced because each member is responsible for the encoding,
storage and retrieval of information related to his or her own area of expertise. Dahlström, Danielsson, et al.
argue that this level of specialization allows more information to be recalled by the group by distributing it
across group members in a non-redundant fashion. Collaborative inhibition is also reduced for groups of
friends (compared to strangers) and experts (compared to novices).
Michinov and Michinov’s (2009) research suggests that the development of a TMS within family
members or close-friends, or even study groups working together over the course of a semester, is a function
of the amount of time spent living and working together. But beyond simple time together, it appears that
explicit training is also beneficial; an issue that is particularly timely, given the recent trend for organizations
to form agile project groups which come together for only a short duration to tackle a particular task or goal.
Thus group performance can indeed be enhanced if group members have undertaken team skills training
(Prichard, Bizo, & Stratford, 2011), have been trained to work together (Liang, Moreland, & Argote, 1995), or
have received information about each team member’s respective skills (Moreland & Myaskovsky, 2000).
Team skills training that includes topics directly related to the dimension of transactive memory (e.g.,
agreeing roles, distributing work, cooperation, and so on) has been found to both lower the workload reported
by team members and improve team performance on collaborative tasks (Prichard, Bizo, & Stratford, 2011).
Liang, Moreland, and Argote (1995) examined the effects of training team members individually, or together,
on how to assemble a radio. They found that members of a team that were trained together were more likely to
recall different aspects of the assembly task (i.e., specialize), trust each other’s expertise (i.e., credibility), and
coordinate their activities within the team. They argue that these improvements to the TMS led to the team
being able to recall more about the assembly procedure and produce better-quality radios. Using an identical
radio-assembly task, Moreland and Myaskovsky (2000) found similar effects even if members of a team
were trained individually but were also given feedback on one another’s performance before they worked
together as a team.
Team performance can also be improved by giving individuals the opportunity to work with other teams.
Gorman and Cooke (2011) examined the effects of breaking up existing Uninhabited Aerial Vehicle (UAV)
mission teams for short (three to six weeks) or long (10 to 13 weeks) durations on communication and
performance after re-forming the team. They found that after a break of 10 to 13 weeks, mixing team
membership leads to greater shared knowledge about the task and improved communication, which in turn
leads to greater performance, compared to teams that were left intact. Their results show that team learning
and performance are supported by providing team members with new opportunities to gain experience in
interacting with other individuals. In doing so, team members are able to further refine their knowledge
structure within the team’s TMS to support their specific role in the team (in this case a pilot, navigator or
photographer), which in turn provides a more coordinated system of specialty knowledge within the team.
Learning to work effectively in a group is an important facet of our working life, and as we have seen,
knowing the limits of our expertise—knowing what we know—is an important facet of that. The following
section will explore this notion further as we consider one of the most important applications of memory to
dynamic environments: situation awareness.

6. SITUATION AWARENESS
One of the more pervasive topics within the study of Human Factors has been the concept of Situation
Awareness (SA) (Endsley, 1995a; Endsley & Garland, 2001; Banbury & Tremblay, 2004; Durso &
Sethumadhavan, 2008; Tenney & Pew, 2007). Indeed, in the last 15 years or so the concept has received
considerable attention from engineering psychologists (Wickens, 2008) because of its relevance to both
designing displays to support SA and understanding the causes of disasters and accidents in which SA has
been lost. Probably the most popular definition of SA is that of Endsley: the perception of critical elements in
the environment, the comprehension of their meaning, and the projection of their status into the future
(Endsley, 1988). Or as paraphrased by Tenney and Pew (2007): What? So what? What now?

Possessing good levels of SA is critical to efficient task performance for operators within a wide range of dynamic and
safety-critical occupations, including air traffic controllers, pilots, surgeons, nuclear power plant operators,
and military commanders (Endsley, 1995a; Durso and Gronlund, 1999). Even minor problems
can quickly snowball into disasters when operators do not fully comprehend the evolving situation. For
example, Air France Flight 447 stalled at 38,000 feet over the Atlantic and crashed killing all 228 persons on
board. Initial analysis of the cockpit voice recorder revealed that the flight crew was gripped by confusion as
they tried to diagnose and respond to what should have been a manageable emergency (Sorensen, 2011).
Researchers readily agree on the importance of having SA for successful task performance; however they
are less clear on what SA actually is, how we acquire it, and why we occasionally lose it (Rousseau,
Tremblay, & Breton, 2004). Research on SA has taken many different perspectives (for an overview see
Durso & Sethumadhavan, 2008). For example, Endsley (1995a) distinguishes between SA as a state of
knowledge (or product) and the cognitive processes that are used to achieve that state; such processes are often
referred to as situation assessment. We will touch upon situation assessment again in Chapter 8 when we
discuss the deliberate process of acquiring information to support a particular decision. To avoid confusion, it
is important that we differentiate between one-time situation assessments in decision making, and the ongoing
and continuous process of acquiring and maintaining situation awareness in time-critical, dynamic
environments such as aviation and driving. Having good ongoing SA will facilitate making a rapid and
accurate situation assessment should the latter be called on to support a decision.
Rousseau et al. (2004) also differentiate SA research between an operator-focused approach which is
concerned with the set of cognitive processes that support the production of a mental representation
corresponding to the SA state (Endsley, 1995a), and a situation-focused approach that views SA as
determined by the task environment (and the events, objects, other persons, and their mutual interactions that
it comprises) (Pew, 2000; Flach, Mulder, & van Paassen, 2004; Patrick & James, 2004).
Given that the focus of this book is the application of psychological theory to system design, we will
concentrate on reviewing research undertaken to understand SA from an operator-focused perspective.
However, we do acknowledge that an understanding of both the operator (i.e., cognitive capabilities and
limitations) and the situation (e.g., environment, system, goals, and other crew members) is essential to
system design. For example, sources of SA are distributed and can be held by both human and non-human
agents (e.g., displays). Thus from a distributed cognition perspective, the operator does not need to remember
all of the information details; rather he or she just needs to refer to the information as required (Garbis &
Artman, 2004; Stanton, Salmon, et al., 2010; Sorenson, Stanton, et al., 2011).
The information processing framework, as described in Chapter 1, has underpinned several attempts to
identify the cognitive processes underlying our ability to acquire and maintain SA, particularly processes such
as attention and memory (Endsley, 1995a, 2004; Adams, Tenney, & Pew, 1995; Banbury, Croft, et al., 2004).
In Chapter 3, we discussed how we are able to focus or divide our attention to monitor multiple objects in our
environment, and how stimuli that we are not attending to can capture our attention or be missed (e.g., a subtle
change in the sound of an engine can prompt a pilot to look at the engine status panel; Endsley, 1995a). We
will discuss in Chapters 10 and 11 that we have a limited attentional capacity and when the demands on our
attention are excessive, our task performance suffers as a consequence. Thus, limitations on our attentional
capacity and our susceptibility to distraction are major limits on SA; complex and dynamic environments can
quickly exceed an operator’s capacity resulting in information overload and losses in SA (for a discussion of
the range of factors affecting SA acquisition see Banbury, Dudfield, et al., 2007).

6.1 Working Memory and Expertise in Situation Awareness


The linkage between SA and working memory is direct. Much of our current awareness of any evolving
situation resides in working memory; once perceived, information must be held in working memory in order
to develop an understanding of the situation from it (Durso & Gronlund, 1999; Endsley, 1995a). Indeed,
holding information active for processing is viewed by many researchers as critical for air traffic control (e.g.,
Gronlund, Ohrt, et al., 1998; O’Brien & O’Hare, 2007); driving (e.g., Gugerty and Tirre, 2000; Johannsdottir
& Herdman, 2010); flying (e.g., Carretta, Perry, & Ree, 1996; Sohn & Doane, 2004; Sulistyawati, Wickens, &
Chui, 2011); and process control (Gonzalez & Wimisberg, 2007), and in a variety of other real-world tasks
(Endsley, 1995). The effective monitoring of displays or system parameters over time requires that the
temporal order of this information be kept intact in working memory (Banbury, Fricker, et al., 2003).
The notion that working memory is an important determinant of successfully acquiring and maintaining
SA has been supported by a number of empirical studies. For example, Carretta, Perry, et al. (1996) found that
verbal and spatial working memory were good predictors of 31 supervisory/peer ratings on the U.S. Air
Force’s SA battery. Gugerty and colleagues found that working memory correlated with SA measures in a
driving task (Gugerty & Tirre, 2000; Gugerty, Brooks, & Treadaway, 2004). Durso, Bleckley, and Dattel
(2006) found that participants with larger working memory for spatial information made fewer errors on an
air-traffic control task. Durso and Gronlund (1999) argue that correlations between working memory and SA
are due to the processing of information rather than the storage of information (Baddeley & Hitch, 1974; see
Section 2).
As with other cognitive processes, the ability to maintain SA improves with domain experience. In
explaining how this develops, Durso and Gronlund propose (as we have done earlier in this chapter) that
experts rely less on working memory and more on LT-WM (see Section 4.3; Ericsson & Kintsch, 1995) in
which pointers in working memory activate information stored in long-term memory, facilitating the rapid and
efficient storage and retrieval of situational information. However, in the case of novices, or when the
situation is suitably novel, these LT-WM structures cannot be brought to bear, necessitating real-time
computational processes heavily dependent on pure working memory (Endsley, 1997). For example, in a
situation recall task analogous to the Chase and Simon (1973) chess study, Sohn and Doane (2003, 2004)
found that spatial and verbal memory span (i.e., memory capacity) and performance on reconstructing
plausible and implausible cockpit configurations (i.e., memory skill) correlated to performance on predicting
future states of the cockpit instrumentation (i.e., SA). However, this effect was a function of the participants’
level of expertise; working memory capacity was critical for novice pilots, while memory skill was more
important for expert pilots. Sohn and Doane argue that both memory mechanisms play a significant role in
complex task performance; experts with higher LT-WM skills rely less on working memory capacity during
complex task performance compared to novices whose LT-WM structures have yet to develop. Similarly,
Gonzalez and Wimisberg (2007) found that the relationship between SA and working memory diminished as
a function of expertise on a process control task.

6.2 Levels of SA and Anticipation


As we have noted above, Endsley proposed that SA has three levels: perception (noticing), comprehension
(understanding), and projection (anticipation). To a large extent, these three levels can be accommodated
within the framework of this book. First, perception directly relates to the material discussed in Chapter 3
(selective attention and noticing) and Chapter 6 (fundamentals of perception). In a dynamic world, unless the
dynamic changes are noticed, and given a basic perceptual interpretation, no awareness of the changes is
possible. Thus the air traffic controller must first notice that two planes are at the same altitude, to ultimately
be aware of their conflict potential. St John and Smallman (2008a) highlight the direct linkage between
change blindness and SA.
At the second level, understanding or diagnosis of the situation requires the integration of information,
and a higher level inference of what is happening. In the next chapter we devote a lot of time to these
cognitive processes in diagnosis, inference and situation assessment as a precursor to decision making. The
process is working-memory intensive, but as we have noted, it also invokes LT-WM. In our air traffic control
example, the controller, having noticed the two aircraft now integrates their two trajectories, with their co-
altitude status, and understands them to be on a potential conflict course.
The third level is anticipation, projection, or prediction. The controller must now make a projection of
the time remaining until the closest passage of the two planes, and assess whether that future separation will
be under the minimum allowable limits. This projection is hard; people don't do it very well, and it is under-
represented in research, but it is perhaps the most critical element of SA. Before we describe level 3 in detail
however we note that all three levels of SA are pre-response. That is, the SA construct generally does not
address issues of decision choice and action selection, as discussed in the last half of Chapter 8 and in
Chapter 9. Thus, just as it is critical to describe what SA is, it is also important to describe what it is not.
The critical importance of level 3 SA in human performance is highlighted by the fact that SA is most
relevant for dynamically evolving situations, like the dynamic quality of the young forest fire, or progression
of the possible engine abnormality in the aircraft or power plant. When such situations require human
intervention, it is simply a fact that corrective actions cannot be effectively achieved immediately. For
example, it takes time to steer the Titanic away from the iceberg, time for the air traffic controller to steer the
aircraft away from the potential conflict, and time for the anesthesiologist to gather all the information about a
deteriorating patient in the operating room, before understanding the cause of the crisis. In Chapter 5, we
referred to this time delay, when applied to system dynamics as a system lag, and we saw how predictive
displays were useful and often essential for control, by explicitly displaying this estimated future system state;
in Chapter 8, we will describe the cognitive challenges of more long term predictions.
Here, we emphasize that given that changes in SA cannot be effectively addressed by action
instantaneously (i.e., the moment those changes are noticed or even understood), then it becomes essential that
people can predict those changes, so that action can be initiated before the situation has reached a crisis state
(see the Titanic above). We also emphasize that in order to be prepared for the possible crisis, it is important
for the operator to be anticipating all the time—to maintain level 3 SA—even if that SA may not be required
most of the time for effective routine performance (Wickens, 2000). After all, typically the operator does not
anticipate the crisis situation, but must nevertheless be prepared for it. Such preparation is attained in part by
the continuous maintenance of level 3 SA.
So how is level 3 SA accomplished? At least five mechanisms have been proposed, none mutually exclusive of the others. First, anticipation can be achieved by carefully focusing attention on the most relevant
leading indicators, typically sources of information in the environment. For example, certain economic indicators are more valid predictors of future trends in the economy (a dynamic system) than others; and in the aircraft, the vertical speed display is a better source of level 3 altitude awareness than is the altimeter itself
(Bellenkes, Wickens, & Kramer, 1997). Sometimes this focus may simply involve attending to the rate of
change of a single display, or even display acceleration, rather than the level of the display (Yin, Wickens, et
al., 2011). Here again, skill and experience are necessary, to know what indicators are more or less important
to attend to (Bellenkes, Wickens, & Kramer, 1997; Sohn & Doane 2004; Jackson, Chapman, & Kramer,
2009).
Second, as we discuss in the next chapter, some experts employ “mental simulation” to anticipate the future, using working memory to literally run possible scenarios in their mind of what might occur (Klein & Crandall, 1995).
Third, Endsley argues that the acquisition of higher levels of SA (e.g., both understanding and
anticipation) can be achieved through a process of ‘pattern-matching’ with previous experience (e.g., Endsley,
1995a, 2000). Long-term memory structures (i.e., mental models) are utilized to construct current SA
(Endsley, 2000). Mental models allow operators to “generate descriptions of system purpose and form,
explanations of system functioning and observed system states, and predictions of future states” (Rouse &
Morris, 1985). Similarly, Durso and Gronlund (1999) argue that situation models are the momentary
instantiation of LT-WM that allow, amongst other things, predictions into the near future. For example, an
accurate mental representation of an industrial control process, such as a water purification plant, will allow a
process control operator to simulate mentally the outcome of hypothetical faults or operator-initiated actions.
Fourth, Banbury, Croft, et al. (2004) have recently argued that the cognitive streaming framework of
Jones (1993) that has been used to explain a number of phenomena associated with selective attention (see
Chapter 3) and working memory might also provide useful insights into SA, particularly those associated with
anticipation. A key concept of cognitive streaming is that of transitional probabilities: the likelihood that
certain types of events will occur following the occurrence of other events. Banbury, Croft, et al. argue that
the use of transitional probabilities is the mechanism by which we are able to anticipate. For example, the
transitional information present for vehicles approaching a familiar intersection leads to a high transitional
probability that a vehicle will show a particular behavior (e.g., after moving from the center to the right-hand lane, a vehicle will likely turn right at the intersection). Even an object with low transitional probabilities can
be understood and anticipated more readily by “grafting” transitional probabilities onto it through “pattern
matching” from long-term memory structures (e.g., previous experience of an aircraft’s capabilities and likely
maneuvers) as we have discussed previously.
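To make the notion of transitional probabilities concrete, the sketch below (an illustration of ours, not part of the cognitive streaming framework or of any cited study; the event labels are hypothetical) estimates first-order transition probabilities from an observed sequence of events and uses them to anticipate the most likely next event.

    from collections import defaultdict

    def estimate_transitions(event_sequence):
        """Estimate first-order transitional probabilities P(next event | current event)
        from an observed sequence of discrete events."""
        counts = defaultdict(lambda: defaultdict(int))
        for current, nxt in zip(event_sequence, event_sequence[1:]):
            counts[current][nxt] += 1
        return {
            current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
            for current, followers in counts.items()
        }

    def anticipate(transitions, current_event):
        """Return the most probable next event, or None if there is no basis to anticipate."""
        followers = transitions.get(current_event)
        if not followers:
            return None  # low transitional probability: no anticipation possible
        return max(followers, key=followers.get)

    # Hypothetical events observed for vehicles at a familiar intersection
    observed = ["approach", "move_to_right_lane", "turn_right",
                "approach", "move_to_right_lane", "turn_right",
                "approach", "stay_in_center_lane", "go_straight"]
    transitions = estimate_transitions(observed)
    print(anticipate(transitions, "move_to_right_lane"))  # -> 'turn_right'
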
Fifth, differences in cognitive ability certainly play a role. Sulistyawati, Wickens, and Poon (2011), studying
the different levels of SA in fighter pilots, found that cognitive reasoning, presumably used to extrapolate the
future, was important in predicting those with better level 3 SA, but not with better level 2 SA, the latter being
predicted better by spatial ability.
Finally, while we emphasize here the importance of anticipating the future, the next requirement for the human operator who has maintained level 3 SA is to select the actions to address the anticipated future
situation. This critical issue of planning will be discussed in detail in Section 7.

6.3 Measuring SA and the Role of Awareness


One of the more memorable quotes by a politician in recent times was from the former US Defense Secretary,
Donald Rumsfeld, given at a NATO press conference in 2002: “There are known knowns. These are things we
know that we know. There are known unknowns. That is to say, there are things that we know we don’t know.
But there are also unknown unknowns. There are things we don’t know we don’t know.” Although Mr.
Rumsfeld might not be familiar with the concept of measuring SA, he did a fairly decent job of describing how
we might do it.
A large proportion of the SA measures have been designed to access the operator’s conscious knowledge (for a recent review, see Salmon, Stanton, et al., 2006). For example, the situation awareness global assessment technique (SAGAT; Endsley, 1995b) comprises a set of memory-based queries to assess SA
across all three of Endsley’s levels of SA. In SAGAT, the queries are presented during “freezes” in a
simulation of the task under investigation. During these “freezes” all displays are blanked and the operator is
required to answer each query based upon his or her knowledge of the situation at the point of the freeze.
While SAGAT is often used, it imposes a degree of disruption on the performance of the
very task whose SA it is measuring: the problem of intrusiveness, discussed more in the context of workload
in Chapter 11. As we saw above, SA is working memory-dependent, and working memory is quite vulnerable
to interruptions such as filler tasks. Furthermore, we might expect some confound when SAGAT is used to
differentiate SA between experts and novices because, as we saw, experts rely less on working memory and
more on LT-WM, a system that we saw in Section 4.3 is less disrupted by interruptions. Consistent with this
differential effect, McGowan and Banbury (2004) found that interruption-based measures of SA actually
reduced young drivers’ anticipation of road hazards during a simulated driving test.
In contrast to SAGAT, the situation present assessment measure (SPAM; Durso & Dattel, 2004)
presents the queries to the operator while the situation remains present and while he or she continues to perform the
task. SPAM also records the operator’s response time and accuracy, which are both used to infer SA. As we
have already touched upon, SPAM takes a distributed cognition perspective (e.g., the operator does not need
to remember all of the information; rather he or she just needs to refer to the information as required). Based
on the RT to respond to the queries, inferences can then be made about the processes underlying an operator’s
SA. For example, a rapid response to a query would indicate that the knowledge is held in active memory;
whereas a slower response would indicate that the operator needs to access the information from artifacts
within the task environment.
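As a rough sketch of how SPAM-style response data might be interpreted, the example below classifies each query response by its accuracy and latency. The latency threshold, field names, and classification labels are illustrative assumptions of ours, not part of the published SPAM procedure.

    from dataclasses import dataclass

    @dataclass
    class SpamResponse:
        query: str
        correct: bool
        rt_seconds: float  # response time while the displays remain available

    def interpret(response, fast_threshold=3.0):
        """Crudely interpret a SPAM-style probe response.

        fast_threshold is an arbitrary illustrative cutoff: fast, correct answers
        suggest the knowledge was held in active memory; slow, correct answers
        suggest the operator retrieved it from the task environment."""
        if not response.correct:
            return "knowledge unavailable or situation misassessed"
        if response.rt_seconds <= fast_threshold:
            return "likely held in active (working) memory"
        return "likely retrieved from displays or artifacts in the environment"

    print(interpret(SpamResponse("Which two aircraft are co-altitude?", True, 1.8)))
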
Up until this point, we have covered SA from the perspective of working memory whereby the operator
is consciously aware of the relevant knowledge required for successful task performance. Returning to Mr
Rumsfeld’s quote, what about the “unknown unknowns” and the “unknown knowns”? There are certainly
cases where people “don’t know what they don’t know” (unknown unknowns). Thus fighter pilots
(Sulistyawati et al., 2011), soldiers (Matthews, Eid, et al., 2011), and military commanders (Rousseau,
Tremblay, et al., 2010) often believe they have better SA than what is assessed using probe-based or observer-
based measurement techniques: an example of such an “unknown unknown” is a pilot not knowing there is an
enemy on his tail but nevertheless reporting good SA (an issue we will address in terms of overconfidence in
Chapter 8). But in the case of “unknown knowns,” it is also possible that experts who genuinely possess good levels of SA are not able to articulate it in an explicit or verbalizable manner that can be readily measured using probe-based or observer-based techniques.
In fact, many researchers have argued that SA is not simply the momentary knowledge of which an
operator is aware, or any verbal report of consciousness about a situation (Smith & Hancock, 1995; Rousseau,
Tremblay, et al., 2004). Rather, the process of acquiring and maintaining SA also involves implicit
components (Durso & Sethumadhavan, 2008) which by their very nature are inaccessible to conscious introspection (for a review, see Croft, Banbury, et al., 2004). This can be problematic given that the emphasis on conscious “awareness” has been reflected in the development of measures of SA; a large proportion of the measures currently in use have been designed to access the operator’s explicit conscious knowledge (Croft, Banbury, et al., 2004).
Instead, Croft et al. (2004) argue that SA measures must account better for the implicit, non-conscious
acquisition of information, rather than the explicit recall (SAGAT) or retrieval (SPAM) of information.
Implicit performance-based measures of situation awareness essentially impose an abnormal or unexpected
event into the flow of routine performance (Wickens, 2000). If SA is high, this event will be handled fluently.
If it is low, it will not be. For example, in driving, a sudden braking of a lead vehicle will require an evasive
action to avoid a collision. One who has poor SA of traffic behind might swerve to the adjoining lane to avoid
the collision, but run into another car in the blindspot. One with good SA of that traffic would aggressively
brake instead. Yet such SA need not be based upon conscious awareness.
In conclusion, the research discussed in this section has provided an overview of the attempt to isolate
and understand the cognitive processes that underpin our ability to acquire and maintain SA of the critical
elements in our environment. Such an understanding has a practical value insofar as training programs can be
developed (for a review, see Endsley, 2004), displays to support SA can be designed (St John & Smallman,
2008a; see Chapters 4 and 5), and the ways in which automation often degrades SA can be anticipated (as discussed in Chapter 12).

7. PLANNING AND PROBLEM SOLVING


The concepts of planning and problem solving are intertwined. A plan can be considered as a strategy for
solving a problem. Generally, planning and problem solving are presumed to draw upon resources from the
central executive subsystem of working memory (Baddeley, 1993; see also Allport, 1993). Therefore, we
should expect to see working-memory limitations play a role in planning and problem solving tasks, and we
should expect a decrease in planning performance in situations where there is increased working memory
load. Indeed, that is what has been observed (Ward & Allport, 1997).
Planning and problem solving are not synonymous, however. In terms of situation awareness, problem
solving is more about addressing level 2 issues (understanding), whereas planning is more related to level 3
(prediction). That is, problem solving requires an understanding of the current situation and its direct
implications, with more emphasis on the short term, whereas planning is about developing more general
strategies over a longer time horizon. Given their similarities we will generally treat them together, but we
specifically note when we are describing one or the other in particular.
It has been said that a person attempting to solve a problem is analogous to an ant working its way across
the sand on a beach towards its home (Simon, 1981). The ant’s path on the beach is determined as much by
the features of the beach (bumps formed by waves, the dryness of the sand) as by the goals of the ant. The
analogy is therefore that human planning is determined as much by environmental constraints as by the
operator. The success of a route chosen to avoid rush-hour traffic will partially be determined by the
constraints of the environment: traffic density, weather, routes chosen by other drivers, accident likelihood,
and so on. Indeed, in the context of flight planning, Casner (1994) found that nearly half the variability in
pilots’ problem-solving behavior is due to environmental features.
What makes a planning task more difficult? First, fewer constraints and more choices actually increase
planning difficulty. Ward and Allport (1997) had their subjects solve a five-disk “Tower of Hanoi” puzzle
which requires changing the position of the disks on three vertical poles to a particular goal position in as few
moves as possible. They found that the time to prepare a planned solution to this task was affected by the
number of competing alternative choices at critical steps. Second, difficulty also increases when the choice options are of roughly equal preference. As a result, the problem solver equivocates, leading to longer planning times.
This result has also been observed in the context of en-route flight planning by Layton, Smith, and McCoy
(1994; see also Anderson, 1993).
The human problem solver tends to satisfice—that is, he or she selects the current best plan with no
guarantee that it is the absolute best plan (Anderson, 1991; O’Hara & Payne, 1998; Simon, 1990). The reason
for this is that continued search of the problem space takes place at increasing cost (Simon, 1978). Thus,
potential plans will be generated until the expected improvement over the current plan no longer justifies the
cognitive effort to generate further plans.
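The stopping rule implied by this account can be sketched in code; this is a simplified formalization of ours, not a published model, and all function names and values are hypothetical.

    import random

    def satisfice(generate_candidate, value_of, effort_cost, expected_gain, max_candidates=100):
        """Generate candidate plans only while the expected improvement of the next
        candidate still justifies the cognitive effort of generating it."""
        best = generate_candidate()
        for _ in range(max_candidates - 1):
            if expected_gain(best) <= effort_cost:
                break  # accept the current best plan: it is "good enough"
            candidate = generate_candidate()
            if value_of(candidate) > value_of(best):
                best = candidate
        return best

    # Hypothetical example: candidate routes described only by their travel time (minutes)
    best_route_time = satisfice(
        generate_candidate=lambda: random.uniform(30, 60),  # sample a new candidate route
        value_of=lambda t: -t,                               # shorter routes are better
        effort_cost=1.0,                                     # cost of generating one more plan
        expected_gain=lambda best: 0.3 * (best - 30),        # less room to improve as best nears 30 min
    )
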
When people are engaged in the planning process, they demonstrate a strategy labeled opportunistic
planning. This is similar to satisficing as used in problem solving. Thus, when Vinze et al. (1993) studied
managers performing real-world planning tasks (e.g., auditing, production planning), they found that the
managers tended to choose the most promising leads at any point in time. While opportunistic planning is
often successful, it can lead to solutions that are not optimal. For example, Layton et al. (1994) described a
case where a pilot engaged in flight planning solved each step in route selection accurately—following the
apparently best strategy at any given stage—but produced a route that was not optimal in a general sense.
Thus opportunistic planning—which has the advantage of reducing cognitive load—leads to focused solutions
that may not be globally optimal.
Planning is often done in context of external displays (Casner, 1994; O’Hara & Payne, 1998; Payne,
1991; Moertl, Canning, et al., 2002), and different display designs affect the environmental constraints,
leading to different problem solutions (O’Hara & Payne, 1998). The displays may be as simple as notes on a
piece of paper or may be part of a larger complex system (e.g., dynamic graphical map displays for flight
planning; Layton, Smith, & McCoy, 1994). The design of a computer interface has been shown to impose
constraints that affect the plan chosen by the human problem solver (O’Hara & Payne, 1998).
In some cases, planning can be characterized as a comparison between the conceptual model of the user
and the external display representation (a situation model). A particular display representation is therefore
more or less useful on the basis that it affords such comparison. The CECA (Critique, Explore, Compare, and
Adapt) model (Bryant, 2003) of operational planning in command and control uses such ideas to characterize
the commander’s mental model. The military commander needs to validate a solution against the situation
model, typically represented by an external display of some type. The more the external representation
facilitates the comparison, the more effective the plan should be.
External display representations can help problem solving (Moertl et al., 2002); at other times they hinder it.
For example, Zhang and Norman (1994) found that particular display representations for the Tower of Hanoi
problem affected the quality of problem solving performance. Graphical methods that used ordinal coding for
ordered aspects of the problem space (e.g., a small ring can only be placed on a larger one) were more
successful than those using nominal coding (e.g., assigning a particular color or shape to represent each ring).
The ordinal external representation reduced cognitive load, whereas using nominal coding, the user had to
maintain the ordinal relation in working memory. However, simply providing pictorial representations is not
necessarily helpful. Berends and van Lieshout (2009) examined whether illustrations helped users solve
arithmetic word problems. They found that illustrations containing irrelevant or redundant sources of
information did not aid problem solving performance. The illustrations appeared to increase the cognitive
load, rather than help the problem solving process.
You may have heard of the traveling salesman problem. In this problem, the goal is to find the shortest
path passing through each of a set of points (e.g., the salesman needs to visit a set of cities on his route).
Because the number of possible paths increases rapidly with the number of points, finding the shortest path
using an exhaustive algorithm is not feasible (MacGregor, Chronicle, & Ormerod, 2004). However, without training, humans can solve this problem more optimally than computers, and in less time (MacGregor, 2010)!
Why are people so good at this problem? It turns out that the way the problem is presented (the display
representation, to use our term from Chapter 4) is critical: a visual representation is necessary, one that shows
the points laid out as if on a map. When this is the case, humans naturally tend to pick a sequence called the
convex hull, which can be intuitively visualized as those cities that would be touched by an elastic band
stretched around the geographic space (i.e., the boundary points; MacGregor & Ormerod, 1996). Humans also
naturally manage to avoid having arcs cross each other in their solutions (van Rooij, Stege, & Schachtman,
2003). When humans are given tables with distances between cities instead of a graphical map layout they
perform much worse (Garling, 1989). As discussed in Chapter 4, the task representation is the same, but a
different display representation has a large impact on human performance. Chapter 5 discussed the importance
of display type in supporting complex visualizations.
The more general, as yet untested, implication is that very complex problems can potentially be solved
faster by humans than by algorithmic approaches, if given the right display format, an implication that is
important for the process of task allocation (whether or not to assign a task to human or machine, discussed in
Chapter 12). A human problem solver draws upon heuristics—strategies that are not guaranteed to give a
perfect or optimal solution, but are fast and correct most of the time. The traveling salesman problem shows
the utility of heuristics that human problem solvers naturally adopt. We shall treat heuristics again when
discussing decision making in Chapter 8.
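A minimal sketch of a convex-hull-style tour heuristic, assuming cities are given as (x, y) coordinate tuples: the hull forms the initial tour, and interior cities are then inserted where they lengthen the tour least (a cheapest-insertion rule). This is just one simple way to mimic the human strategy described above, not the procedure used in the cited studies.

    import math

    def convex_hull(points):
        """Andrew's monotone chain: return the hull vertices in counter-clockwise order."""
        pts = sorted(set(points))
        if len(pts) <= 2:
            return pts
        def cross(o, a, b):
            return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
        lower, upper = [], []
        for p in pts:
            while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
                lower.pop()
            lower.append(p)
        for p in reversed(pts):
            while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
                upper.pop()
            upper.append(p)
        return lower[:-1] + upper[:-1]

    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    def hull_insertion_tour(points):
        """Start from the convex hull (the 'elastic band' boundary), then insert each
        interior city at the position that adds the least extra tour length."""
        tour = convex_hull(points)
        remaining = [p for p in points if p not in tour]
        while remaining:
            best = None  # (extra_length, city, insert_position)
            for city in remaining:
                for i in range(len(tour)):
                    a, b = tour[i], tour[(i + 1) % len(tour)]
                    extra = dist(a, city) + dist(city, b) - dist(a, b)
                    if best is None or extra < best[0]:
                        best = (extra, city, i + 1)
            _, city, pos = best
            tour.insert(pos, city)
            remaining.remove(city)
        return tour

    cities = [(0, 0), (4, 0), (4, 3), (0, 3), (2, 1), (1, 2)]
    print(hull_insertion_tour(cities))
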
The use of explicit visualizations and external representations has also been shown to help team problem
solving (Smith, Bennett, & Stone, 2006). Dong and Hayes (2011) had their teams solve engineering design
problems (e.g., designing a robot arm), for which uncertainty was a key element. The team needed to assess
whether or not they had enough information to identify the best design candidate. Their results showed that
having visual depictions of uncertainty helped the teams in their problem solving. Similarly, Rosen, Salas, et
al. (2009) argue for the value of external representation for teams, proposing that high-quality external
representations reduce the need for team members to exchange information. Rosen et al. liken this advantage
to the carpenter’s jig (a physical mockup of a part that is correct in its dimensions). Having a jig available
reduces cognitive load for the carpenter (Kirsh, 1995, cited in Rosen et al.): the carpenter’s knowledge of
correct dimensions has been offloaded to the jig. Similarly, having effective external representations of a
cognitive domain allows the members of the team to share that representation (including relevant terminology,
concepts, etc.), reducing the cognitive load of team members.
The differing training or experience of team members affects the problem solving approaches they take.
Canham, Wiley, and Mayer (2011) found that when two team members had the same training (homogeneous
pairs), they tended to perform accurately on standard problems, but were weaker on new transfer problems,
relative to two team members who had undergone different lesson training (heterogeneous pairs). As in our
discussion of instruction redundancy in Chapter 6, complementarity helps. The homogeneous pairs spent a
larger proportion of time communicating about low-level details; the heterogeneous pairs spent more time
discussing solution development. Working with a non-human partner (an automated system) has been shown
to have similar pros and cons; it was useful when there were multiple solutions, but the problem solver’s
exploratory search of the problem space tended to decrease, and consideration of uncertainty in the decision
decreased (Layton, Smith, & McCoy, 1994). Some of these issues will be discussed further in the complex
systems and automation chapter (Chapter 12).
In summary, humans can be quite good at problem solving, particularly when supported with effective
displays. But they are far from perfect, and well-designed automation can provide effective support in this
endeavor, as discussed in Chapter 12. Also, many of the human imperfections will be revisited in the next
chapter, in the context of diagnosis and trouble shooting.

8. TRAINING


Memory, training, and learning are closely linked in engineering psychology. We naturally learn a lot of information about the environment of, say, a workplace. But when it is essential that tasks and skills there be
well learned, they can be explicitly trained, and once trained, they are less likely to be lost from memory.
In this section, we will first concentrate on transfer of training—how knowledge learned in one context
facilitates the learning of new material, and how to measure the improvement in performance. Then we will
consider various training methods, and their effectiveness, not only in training, but in resistance to forgetting.

8.1 Transfer of Training


Information can be learned in a variety of ways—formal classroom teaching, practice, on-the-job training,
focus on principles, theory, and so on. The engineering psychologist who develops a new training procedure
or device is concerned with these issues: What procedure (or device) provides the best learning in the shortest
time, leads to the longest retention (resists forgetting), and is cheapest? Together these criteria define the issue
of training efficiency—the greatest level of proficiency per dollar invested.
A critical factor in skill acquisition is the extent to which learning a new skill, or a skill in a new
environment, can capitalize on what has been learned before. This is called transfer of training (Salas, Wilson,
et al., 2006; Singley & Anderson, 1989). How well, for example, do lessons learned in a driving simulator
transfer to performance on the highway? Or how much does learning one word-processing program help (or
hinder) learning another? Measures of transfer of training are normally used to evaluate the effectiveness of
different training strategies, to be discussed later in this chapter (Acta Psychologica, 1989; Healy & Bourne,
2012).

8.1.1 MEASURING TRANSFER Although there are many ways to measure transfer, the most typical is illustrated in
Figure 7.5. The top row represents a control group, who learns the target task in its normal setting. This group
achieves some satisfactory performance criterion after a certain time—in this example, 10 hours. Suppose you
propose a new training technique or strategy with the purpose of shortening the time needed to learn the target
task. A transfer group is given some practice with the new technique and then is transferred to the target task.
In the second row, we see that the transfer group trains with the new technique for four hours and then learns
the target task faster than the control group, a savings of two hours. Hence, some information in the training
period carried over to the effective performance (or learning) of the target task. Because there were savings,
we say that transfer was positive. In row 3, we see that a second training technique had no relevance to the
target task (no savings, zero transfer). In row 4, a third training condition was employed, and we see that this
training inhibited learning the target task. That is, people would have learned the target task faster without the
training! We say here that transfer was negative.

FIGURE 7.5 The measurement of transfer performance.

While the simplest transfer measure is to just compute the ratio of performance by the transfer group to
that of the control group during the first few transfer trials, this does not really account for benefits in the
speed of learning the transfer task that may have resulted from prior training. To do this, a common formula
for expressing transfer presents the amount of savings as a percentage of the control group learning time:

    percent transfer = 100 × (control group learning time − transfer group time on the target task) / (control group learning time)
The results of these calculations are shown in Figure 7.5 for the three training conditions.
Positive transfer is generally desirable, but it is not always clear how much positive transfer is necessary
to be effective. Consider the following example, which might have produced the hypothetical data shown in
row 2 of Figure 7.5. A driving simulator is developed that produces 20 percent positive transfer to training on
the road. That is, trainees who use the simulator can reach satisfactory performance on the road in 20 percent
fewer road lessons than trainees who do all their training on the road. This sounds good, but notice that to get
the 20 percent transfer (the two-hour savings) the simulator group had to spend four hours in the simulator.
Therefore, they spent 12 total hours, compared to the 10 hours spent by the control group. Hence the
simulator, while transferring positively, is less efficient in terms of training time than the actual vehicle.
This relative efficiency is expressed by the transfer effectiveness ratio (TER) (Povenmire & Roscoe,
1973):

    TER = (control group learning time − transfer group time on the target task) / (time spent in the training program)
Examining this formula, we see that if the amount of time spent in the training program (the
denominator) is equal to the amount of savings (the numerator), then TER = 1. If the total training for the
transfer group (training and practice on the target task) is less efficient than for the control group, as is the
case with all three groups in Figure 7.5, the TER will be less than 1. (In row 2 TER = 0.50). If training is more
efficient, the TER is greater than 1. A TER less than 1 does not mean that the experimental training program
is worthless because two factors may make such programs advantageous: (1) They may be safer (it is clearly
safer to train a driver in the simulator than on the road), and (2) they may be cheaper. In fact, a major
determinant of whether a company will invest in a particular training program or device should depend on the training cost ratio (TCR; Povenmire & Roscoe, 1973):

    TCR = (cost per hour of training on the target task or actual equipment) / (cost per hour of the training device)
In short, the cheaper the training device, the lower the TER can be. The cost-effectiveness of a training
program may be assessed by multiplying TER by TCR. If TER × TCR > 1, the program is cost effective. If
the product is less than 1.0, the program is not cost effective. Even if a program is not cost effective, however,
safety considerations may be important to consider.
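To tie the three quantities together, the sketch below works through the hypothetical numbers from row 2 of Figure 7.5 (control group: 10 hours; transfer group: 4 hours in the training device, then 8 hours on the target task). The hourly cost figures are invented purely for illustration.

    def percent_transfer(control_hours, transfer_group_target_hours):
        """Savings expressed as a percentage of the control group's learning time."""
        savings = control_hours - transfer_group_target_hours
        return 100 * savings / control_hours

    def transfer_effectiveness_ratio(control_hours, transfer_group_target_hours, training_hours):
        """TER: savings on the target task per hour spent in the training program."""
        savings = control_hours - transfer_group_target_hours
        return savings / training_hours

    def training_cost_ratio(cost_per_hour_target, cost_per_hour_device):
        """TCR: how much cheaper, per hour, the training device is than the target system."""
        return cost_per_hour_target / cost_per_hour_device

    # Row 2 of Figure 7.5: control = 10 h; transfer group = 4 h in the device + 8 h on the target task
    print(percent_transfer(10, 8))                    # 20.0 percent positive transfer
    ter = transfer_effectiveness_ratio(10, 8, 4)      # 0.5
    tcr = training_cost_ratio(300, 50)                # 6.0, with hypothetical hourly costs
    print("cost effective" if ter * tcr > 1 else "not cost effective")
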
There is often a diminishing efficiency of training devices with increased training time. In the example in
row 2 of Figure 7.5, four hours of training were given, and a TER of 0.5 was obtained. But now consider row
5, in which the same device was used for only one hour. Although the savings is now only one hour (half of
what it was before) the training time was reduced by 75 percent, and so the TER is 1.0. The general result is
shown in Figure 7.6. TERs typically decrease as more training is given, although for very short amounts of
training TERs are typically greater than 1 (Povenmire & Roscoe, 1973). The point at which training should
stop and transfer to the target task should begin will depend, in part, on the training cost ratio (TCR). In fact,
the amount of training at which TER × TCR = 1 is the point beyond which the training program is no longer
cost effective. As noted, however, the training program may still be safety effective for even longer amounts
of training.
What causes transfer to be positive, negative, or zero? Generally, positive transfer occurs when a training
program and target task are similar (in fact, if they are identical, transfer is usually about as positive as it can
be, although there are some exceptions). Extreme differences between training and target task typically
produce zero transfer. Learning to type, for example, does not help learning to swim or drive an automobile.
Negative transfer arises from a particular set of circumstances relating to perceptual and response
aspects of the task, to be described later. We will first consider the similarity between training device and
target task: the issue of training system fidelity. Last, we will consider negative transfer between old and
new tasks.

FIGURE 7.6 Relationship between time in training and transfer effectiveness ratio (TER).

8.1.2 TRAINING SYSTEM FIDELITY We stated that maximum positive transfer would generally occur if all
elements of the training task were identical to those of the target task. Does this mean that training simulators should resemble
the real world as closely as possible? In fact, the answer to this question is no for a number of reasons
(Schneider, 1985), an answer that reminds us of the concept of naïve realism, discussed in Chapter 4
(Smallman & St. John, 2008b). First, highly realistic simulators tend to be expensive, but their added realism
may add little to their TER (Hawkins & Orlady, 1993). Druckman and Bjork (1994) note that there are
multiple studies that show no training advantage for real equipment or realistic simulators over cheap
cardboard mockups or drawings. Second, in some cases, high similarity, if it does not achieve complete
identity with the target environment, may be detrimental by leading to incompatible response tendencies or
strategies. For example, there is little evidence that motion in flight simulators, which cannot approach the
actual motion of an aircraft, offers positive transfer benefits (Burki-Cohen et al., 2011, Hawkins & Orlady,
1993). Finally, if high realism increases complexity, it may increase workload and divert attention from the
skill to be learned so that learning is inhibited (Druckman & Bjork, 1994).
Instead of total fidelity in training, researchers have emphasized understanding which components of
training should be made similar to the target task (Druckman & Bjork, 1994; Holding, 1987; Singley &
Anderson, 1989). For example, training simulators for a sequence of procedures may be of low fidelity yet effective as long as the sequences of steps are compatible (Hawkins & Orlady, 1993). Sometimes the training
situation need not even be superficially similar to the transfer situation. Gopher, Weil, and Bareket (1994)
trained two different groups of air force cadets on the Space Fortress game, a complex videogame task that
demands working memory resources and controlled attention. While Space Fortress had little superficial
similarity to fighter aircraft flight, the generic attentional skills taught in the videogame transferred positively.
As we discussed in detail in Chapter 5, particular attention has been paid to the value of virtual reality
trainers that can simulate large portions of the training environment with reasonable fidelity and much less
expense than many physical simulators. Hence, while their TER may be less than 1, their TCR is often much
greater than 1.0.
In summary, some departures from full fidelity do not have the detrimental impact on transfer that would
be predicted from the view that maximum similarity produces maximum transfer. Furthermore, departures
from full fidelity can actually enhance transfer if they focus the trainees’ attention on critical task components,
processing demands, or task-relevant visual elements.

8.1.3 NEGATIVE TRANSFER Negative transfer is an important concern, as the continued emergence of new
technology and different system designs require operators to switch systems. What causes skills acquired in
one setting to inhibit performance in another? A history of research in this area (see Holding, 1976) reveals
that the critical conditions for negative transfer are related to stages of processing. When two situations have
similar (or identical) stimulus elements but different response or strategic components, transfer will be
negative, particularly if new and old responses are incompatible with one another (i.e., they cannot easily be
performed at the same time). The relationship between the similarity of stimulus and response elements and
transfer is shown in Table 7.1.
Many real-world tasks involve the transfer of many different components, most of them producing
positive transfer. Hence, given similar tasks, most transfer is positive. However, the designer should focus on
the differences between training and transfer (or between an old and new system) that do involve incompatible
responses or inappropriate strategies as the airlines do when transferring the pilot from one aircraft to a similar
one (Lyell & Wickens, 2005). For example, consider two word-processing systems that present identical
screen layouts but require a different set of key presses to accomplish the same editing commands. A high
level of skill acquired through extensive training on the first system will inhibit transfer to the second
(identical in appearance, different in response), even though overall transfer will be positive.
Negative transfer is also a concern for an operator who switches between two systems. Consider two
control panels in different parts of a plant that both require a lever movement. In one panel the lever must be
pushed up, and in the other it must be pushed down to accomplish the same function. Negative transfer is
inevitable as the operator moves from one panel to the other resulting from the lack of consistent S-R
mapping (Andre & Wickens, 1992; see Chapter 9). The designer should be concerned about such error in
many contexts: when a company installs a new word-processing system, or when it changes an operating
procedure. In commercial aviation, a concern relates to the number of different types of aircraft a pilot may be
allowed to fly (transfer between) without undergoing an entirely new training program (Braune, 1989). The
lack of standardization in the control arrangements for light aircraft can also lead to serious problems of
negative transfer.
Sometimes different systems can yield very positive transfer. As shown in Table 7.1, two systems may
differ in their display characteristics but positive transfer may be observed if there is identity in the response
elements. For example, there will be high positive transfer between two automobiles with identical control
layouts and movements, even with different dashboard displays. Furthermore, if the responses for two systems
are different and incompatible, Table 7.1 suggests that the amount of negative transfer may be reduced by
actually increasing the display differences. For example, the operator confronting the two control levers with
incompatible motion directions will have fewer problems if the appearance of the handles (both visual and
tactile stimulus elements) is quite distinct.

Table 7.1 Relationship between Old and New Task

Stimulus Elements    Response Elements            Transfer
Same                 Same                         ++
Same                 Different                    ‒
Same                 Different (incompatible)     ‒‒
Different            Same                         +
Different            Different                    0

8.2 Training Techniques and Strategies

8.2.1 COGNITIVE LOAD THEORY There are a wide variety of training techniques or “strategies” that have been
advocated to maximize transfer of training in complex skills (e.g., Healy & Bourne, 2012; Acta Psychologica,
1989; Wickens, Hutchins, et al., 2011). These strategies vary in their cost of implementation, their overall
effectiveness, and in the other variables that modulate or modify their effectiveness. Many of these techniques
can be understood within the context of cognitive load theory (CLT; Sweller, 1999; Paas, Renkl, & Sweller, 2003; Mayer, 2005, 2009, 2011, 2012; Mayer & Moreno, 2003; Paas & van Gog, 2009), and so this
theoretical framework will be presented first, followed by an individual discussion of several of the different
strategies.
CLT asserts that the attention demands or mental workload of the learner can be partitioned into three
distinct elements:
• Intrinsic load describes mental workload imposed by the task to be learned. For example, learning to
fly an aircraft is more complex than learning to drive a car, because of the number of axes in which it
can move and rotate, and because of the complex coupling between axes (e.g., its relational complexity) (Halford, Wilson, & Baker, 1996). Working memory load also contributes. The higher the
intrinsic load of the task, the more of the limited resources of the learner it requires simply to perform
the task, leaving fewer available to learn the task. The issue of mental workload will be discussed in
more detail in Chapters 10 and 11.
• Germane load describes the demand for resources necessary to learn the task itself. While it may
seem that germane and intrinsic load are indistinguishable (Kalyuga, 2011), this is not necessarily the case. A learner pilot may be struggling so hard just to keep the plane flying in a straight line that he or she cannot even think about, and hence learn, the critical relationship between flight axes and the need for anticipatory control that will ultimately support the skill in question. In some circumstances during training, it may be better not to try to perform the task perfectly (maximum resources allocated to intrinsic load), but to sacrifice performance just a bit, in order to think about, understand, and
rehearse (i.e., learn) the relationships and strategies necessary to perform the task adequately. In short,
perfect performance during learning does not necessarily translate to optimal learning (Bjork, 1999).
• Extraneous load describes the source of resource demands unrelated to either of the above. It is a
nuisance and will compete with both intrinsic and germane load in inhibiting both performance and
learning. An example might be a poor interface, or technical difficulties in a computer-based learning
environment (Sitzman, Ely, et al., 2010), or the need for the learner to go to a manual and look up the
meaning of acronyms that appear on the screen of the technology device to be trained, or even
distracting the learner with unrelated information, jokes, or stories (Mayer, Griffeth, et al., 2008).
Given these three sources of load, or resource competition, strategies of training should seek to minimize
extraneous load and try (by altering the task during learning) to keep intrinsic load from being too high, so
ample resources are available to allocate to germane load. While this overall “meta strategy” appears intuitive
and straightforward, it is complicated by the fact that some strategies, if not implemented correctly, will
inadvertently produce extraneous load; and some strategies have what we call “spinoff effects” that can hinder
learning in ways not addressed by CLT and offset these advantages (Wickens, Hutchins, Carolan, et al.,
2012a). As we describe the successes (and moderating factors) for the several training strategies below, we
identify both of these mitigating spinoff factors where relevant.

8.2.2 TRAINING SUPPORT AND ERROR PREVENTION: REDUCING INTRINSIC LOAD Several researchers have examined
training strategies variously known as “training wheels” (Carroll, 1992; Catrambone & Carroll, 1987),
worked examples (Paas & van Gog, 2009; van Gog & Rummel, 2010), or “scaffolding” (Pea, 2004) in which
support for the learner guides the correct skill performance, but is gradually withdrawn as learning progresses. Such guidance explicitly lowers the intrinsic load as the learner does not constantly have to think
and decide “what do I do next, and how do I do it?” Furthermore, such support can also avoid “thrashing” or
the unpleasant and often time-consuming consequences of making “bad” errors (such as pressing the delete
key while learning a text editing system or—using the training wheels metaphor literally—the child falling off
the bike and badly skinning her knee). These consequences are clear contributors to extraneous load. A meta-
analysis reveals that these error prevention techniques are generally quite effective, offering an approximate
50 percent advantage in transfer of training (e.g., 50 percent better performance relative to a control on the
first transfer trial; Wickens, Hutchins, et al., 2011).
However, some caution needs to be exercised. For example, in many environments it is not only
advisable, but essential, for learners to make some errors (but not too many) so that, in this spin-off effect, the process of error recognition and correction can itself be learned (Keith & Frese, 2008). Support for this
position is provided by the finding that training wheels techniques are less effective when inappropriate
behavior is totally “locked out” than when it is not (and appropriate behavior is simply recommended or
guided; Wickens, Hutchins, et al., 2011).

8.2.3 TASK SIMPLIFICATION: REDUCING INTRINSIC LOAD While training wheels essentially provides a “crutch” to
prevent performance failure, another way to do this is to alter the task itself in some way that makes it simpler,
hence reducing its intrinsic load early on, availing more resources for germane load, but gradually increasing
the difficulty as learning and automaticity progress, to reach the full difficulty of the target (transfer) task (Wightman & Lintern, 1989). Importantly, such an increase can be implemented either on the same schedule for all learners, or adaptively, according to the momentary level of skill development of each individual
learner. The latter is referred to as adaptive training (Mane, Gopher & Donchin, 1989).
A meta-analysis reveals that task simplification and increasing difficulty yield neither costs nor benefits
relative to fixed difficulty training (Wickens, Hutchins, et al., 2012b), but several variables moderate this lack
of effect. In particular, when difficulty increases (from the simplified training to the complex transfer task
version) are implemented adaptively, positive transfer is observed. When they are not (e.g., the identical fixed
difficulty increase schedule for all learners), slight negative transfer is observed. The reason for this negative
transfer is likely to be a spin off effect. In many cases, the simplified version actually involves different skills
from the target task at its full level of difficulty. For example if tracking a higher order lagged system is the
target task (see Chapter 5) , then earlier simplified versions containing no lag will teach trackers to rapidly
react to any existing error signal. But this skill transfers negatively to high lagged systems, where the
necessary skill is, instead, slower, smoother anticipation of future error (Naylor & Briggs, 1962).

8.2.4 PART TASK TRAINING: REDUCING INTRINSIC LOAD The intrinsic load of a complex multipart task can be
reduced by dividing it into parts, and training each part individually before re-integrating them. Thus a
difficult piano piece might be learned by training first on the left hand (one part) and the right hand (another
part) individually before combining these. Alternatively, the skill might be acquired by training both hands
together, but only on the most difficult passages (one part), before combining these passages into the whole
piece, with earlier and later passages (the other parts). These two techniques are labeled fractionation (by
task) and segmentation (by time), respectively (Wightman & Lintern, 1985). This distinction is important
because fractionation (by concurrent task part) in general produces negative transfer, with those trainees
suffering a rough 20 percent cost relative to the control group, whereas segmentation (by sequential parts)
shows neither cost nor benefit (Wickens, Hutchins, et al., 2012b). The reason for the cost of fractionation is
related to another spin-off effect: the time-sharing skill that is necessary when the two concurrent tasks are
combined in the whole task transfer trials (Damos & Wickens, 1980), a concept that will be discussed in depth
in Chapter 10. If a part task training group never has the opportunity to practice this skill during training, they
will be at a disadvantage for transfer trials, even as they did benefit from reduced intrinsic load during
training. Fortunately, a variation of fractionation can eliminate its cost, and actually produce a benefit. This is
the concept of variable priority training (Gopher, Weil, & Siegel, 1989), in which the parts are always
practiced together; but with differing levels of emphasis on one or the other, as training progresses.

8.2.5 ACTIVE LEARNING: INCREASING GERMANE LOAD When people make active choices, they are more likely to retain information about those choices than when they passively witness another agent (whether human or machine) making those choices. This advantage is known as the
generation effect (Slamecka & Graf, 1978), a concept that we will revisit in our discussion of automation in
Chapter 12. As applied to training, it simply indicates that active learning will be more successful than
passive learning. These active choices are a source of germane load. Another related example is the
distinction between rote rehearsal and semantic rehearsal, one that Craik and Lockhart (1972) have
associated with shallow processing and deep processing respectively. The latter forces more active
consideration of the meaning of the concept to be rehearsed or learned, relating working memory to long term
memory via the episodic buffer, while the former simply attends to the phonetic sound in the articulatory loop.
Deep processing is more effortful, but this effort is invested into productive germane load.
Examples of the benefits of active learning abound. Meta-analyses have documented the modest advantage in transfer (Kraiger & Jerden, 2007; Keith & Frese, 2008; Wickens, Hutchins, et al., 2011; Carolan, Hutchins, & Wickens, 2012). More specific examples can be found in the benefits to learning a navigational route when actually driving (or flying) the route and making active choices about turns, than
when being a passive passenger. They can be found in the advantage of study strategies that involve taking
practice tests (Roediger & Karpicke, 2006; Roediger, Agarwal, et al., 2011) or answering questions about the material (knowledge retrieval practice; Karpicke, 2012; Weinstein, McDermott, & Roediger, 2010), or reciting the material (McDaniel, Howard, & Einstein, 2009), or giving computer-based learners active choice over what material to study or what feedback to process (Kraiger & Jerden, 2007; Wickens, Hutchins, et al., 2011).
Yet, here too spinoff effects can sometimes mitigate and offset the advantages of active choice. In
particular, providing the learner with too much choice or exploration of the material, without guidance, can
lead the learner to make bad choices, become immersed in material that has little to do with the ultimate skill or knowledge to be acquired, and possibly become “lost” in a very complex database, hence creating added extraneous load. It is for this reason that, relative to a full control condition, the advantages of learner control
strategies that have some form of guidance are significantly greater than any advantages of total learner
control (advantages that are tenuous at best; Wickens et al., 2011). Guidance, but not mandating, is helpful, just as we saw in 8.2.2 that guidance about what not to do during training is more effective than lockouts, which prevent an inappropriate action altogether.

8.2.6 MULTIMEDIA INSTRUCTION: DECREASING EXTRANEOUS LOAD Multimedia instruction typically involves some combination of speech, text, and pictures (or animation/video; Mayer, 2005, 2009, 2012). The advantages of
multimedia redundancy presentation were discussed in some detail in Chapter 6. For the purposes of learning
and skill acquisition, the advantages of multimedia instruction lie in the well-validated dual coding principle of Paivio (1971, 1986), and the idea that material is better retained (and more likely to be retrieved) if it has
multiple different representations in the brain. The dual coding principle in particular highlights the
advantages of both pictorial (spatial) and verbal representation of the same material. Yet “use multimedia,” like any other principle or training strategy, must be qualified and carefully applied by considering the occasional
spin-off downsides. As we see below, these downsides are generally reflected in attentional factors causing
extraneous load. The following are subprinciples extracted from the work of Mayer (2009, 2012; Mayer & Moreno, 2002) and closely related to attentional phenomena discussed in Chapters 3, 6, and 10.
1. Modality combinations. As we described in Chapter 6, a general conclusion is that pictures (or video)
tied to words via speech (auditory) is more effective than pictures tied to text (Tindall-Ford, Chandler
& Sweller, 1997). The reason for this advantage, sometimes called the “split attention effect,” is based
on multiple resources theory discussed in Chapter 10. The extraneous load of dividing visual attention
(e.g., scanning) between two spatial locations is imposed with visual-visual learning but is reduced in visual-auditory learning.
2. Temporal contiguity. When speech and pictures (particularly video) are employed, it is important that
the time of the heard phrase is closely linked to the time of the viewed image or picture. In the absence
of such contiguity, the working memory load of retaining the first information until the second arrives
is a clear source of extraneous load.
3. Spatial contiguity or linking. If dual visual channels are to be employed (e.g., because audio is
unavailable, as is the case with text books), then, as we discussed in the context of the proximity
compatibility principle in Chapter 3, text and related pictures should be adjacent (Johnson & Mayer,
2012); not, for example, on different pages of a textbook, with the latter creating the extraneous load
of visual search or page turning. When possible, visual linking should be employed (Chapters 3 and 6).
4. Highlighting. As discussed in Chapter 3, highlighting the most critical and important details of instructions, thereby directing attention to them, provides an advantage.
5. Filtering irrelevant material. Several studies have indicated that material that is irrelevant to the contents to be learned, as a source of extraneous load, can detract from that learning. While this seems self-evident, such material is often imposed in the learning environment in an effort to invite
“engagement” and interest. It may take the form of jokes (in a classroom), interesting (but barely
related) anecdotes (Mayer, Griffeth, et al., 2008) or even animation in computer-based instruction
(Mayer, Hegarty, et al., 2005). Such engagement (see also Chapter 10), if it leads to resource investment in germane load, is of course effective for training, but not when engagement invites investment into interesting but unrelated sources of extraneous load.

8.2.7 FEEDBACK Presenting feedback is not really itself a training strategy so much as it is an important property
of the training environment, and it can either be a source of extraneous or germane load, depending on how it
is delivered. The timing of feedback delivery relative to the skill performance to which the feedback pertains is critical, and can be divided into three categories. Concurrent feedback is delivered while the skill is being
performed. Temporally adjacent feedback is delivered immediately after the skill is performed, and delayed
feedback is delivered only after an interval that can be as short as a few seconds, but as long as days, weeks or
even months. It is apparent and well known that delayed feedback, like the lack of temporal contiguity in
8.2.6, is heavily subject to memory failures. The learner simply cannot recall what was done (or not done)
during the skill performance episode in question, against which the feedback is supposed to provide a standard
of comparison.
In contrast, concurrent feedback, particularly if it is offered in the same perceptual modality as the
primary source of performance-related information, will produce perceptual dual task interference (and may
not be processed at all if the skill to which it pertains is heavily engaging). And concurrent feedback can
produce cognitive dual task interference, to the extent that the feedback and/or the skill itself is cognitively
demanding. Such interference is obviously a major source of extraneous load. Unless such interference can be
avoided (by simplifying feedback, using alternative modalities, or integrating it with the task; see Chapter 10),
then by default, the optimal time for delivering feedback is temporally adjacent, mitigating both the spin-off
effects of memory failure and of dual task interference.

8.2.8 PRACTICE AND OVERLEARNING The expression “practice makes perfect” is one that we are all familiar with, but the issue of how much practice is needed is not always obvious. Generally, skills continue to improve after days,
months, and even years of practice (Proctor & Dutta, 1995; Healy & Bourne, 2012). Such improvement may
not be evident in measures of correctness, for with many skills, such as typing or using a piece of equipment,
errorless performance can be obtained after a relatively small number of practice trials. However, two other
characteristics of performance continue to develop long after performance errors have been eliminated: The
speed of performance will continue to increase at a rate proportional to the logarithm of the number of trials
(Anderson, 1981), and the attention or resource demand will continue to decline, allowing the skill to be
performed in an automated fashion (Fisk, Ackerman & Schneider, 1987; Schneider 1985). (Overlearning will
also decrease the rate of forgetting of the skill, as discussed later in this chapter.) These characteristics make it
clear that training programs in which training stops after the first or second errorless trial will shortchange an
important part of the automaticity of skill development.
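The speed-up with practice described above can be sketched as a simple logarithmic function of the trial number; the parameter values below are arbitrary and serve only to illustrate the diminishing returns of continued practice.

    import math

    def performance_time(trial, initial_time=20.0, learning_rate=4.0, floor=2.0):
        """Illustrative log-law of practice: time per trial declines with the logarithm
        of the trial number, approaching an asymptotic floor (all parameters arbitrary)."""
        return max(floor, initial_time - learning_rate * math.log(trial))

    for trial in (1, 10, 100, 1000):
        print(trial, round(performance_time(trial), 2))
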
It is important to note that making errors (and hence their absence in error-free performance) is a much
more salient symptom of learning than is the minor increase of speed (following a logarithmic trend) or
reduced attention demand. Hence, giving learners complete control over when they may terminate learning or study invites overconfidence that a skill is fully mastered, because this self-evaluation is heavily dominated by the high salience of error-free performance: “Hey, I got it perfect. I’m done!!!” (Bjork, 1999).
We described above the importance of overlearning (beyond error-free performance) in moving a task toward automaticity. Of course, in training for skills that are subsequently used on a daily basis (like driving or word processing), such overlearning will occur during subsequent performance. But because skills related to emergency response procedures, for example, will not receive this same level of on-the-job practice, their retention will greatly benefit from overlearning (Logan & Klapp, 1991).

8.2.9 THE EXPERTISE EFFECT One of the strongest tests of cognitive load theory comes from what is called the
expertise effect in training strategies (Kalyuga, Chandler & Sweller, 1998; Paas & van Gog, 2009; Pollack et
al., 2002; van Merriënboer et al., 2006; Rey & Buchwald, 2010). Put simply, learners more experienced with the task (compared to novices) either receive reduced benefits (or even costs) from load-reducing training strategies, or benefit more from the germane load-increasing strategies of active learning (Wickens, Hutchins, et al., 2012a, in press).
The basis of this effect in cognitive load theory is that the task to be mastered imposes less intrinsic load
for a learner of higher experience. Hence, with more resources already available for germane load for the high-experience learner, additional simplifying techniques designed to increase resources available for germane load (e.g., lower difficulty, error prevention, training in parts) are simply unnecessary. Indeed, when
deployed for the experts who do not need those extra resources for germane load, such techniques may simply
amplify their spin-off costs that were described earlier in the chapter (e.g., developing inappropriate strategies
from simplification, failure to learn time sharing skills in part task training). Correspondingly, with more
resources available, the more experienced can benefit more from the added germane load of active choice.
Although there are pronounced differences in the benefits (or costs) of strategies for high- versus low-experience learners, it is important to note that little consistent evidence exists for differences in training effectiveness for learners of different qualitative cognitive abilities (e.g., spatial versus verbal ability). This phenomenon, if observed, is called an aptitude X treatment interaction (Pashler et al., 2008; see Chapter 6).

8.2.10 DISTRIBUTION OF PRACTICE How practice sessions are distributed over time can have a significant impact
on training effectiveness. In general, distributing practice over multiple sessions leads to better skill
acquisition than massed practice (Cepeda, Pashler, et al., 2006; Donovan & Radosevich, 1999; Healy &
Bourne, 2012), and increasing the interval between the practice sessions themselves supports retention over longer intervals (Cepeda et al., 2006). When training a complex task, there is often a need to train task components, and the order in which these components are trained then becomes an issue. The acquisition of a motor skill has been shown to be slower with a random schedule than with a blocked schedule, but retention is ultimately better with the random schedule. With extended practice, a blocked-repeated schedule (ABCABCABC instead of AAABBBCCC) has been shown to lead to the best acquisition and retention (Gane & Catrambone, 2011).
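As a concrete illustration of these scheduling terms, the short sketch below builds blocked, random, and blocked-repeated practice orders for three hypothetical task components A, B, and C; the component labels and the number of repetitions are arbitrary choices for the example.

import random

components = ["A", "B", "C"]   # hypothetical task components
reps = 3                       # practice trials per component

# Blocked: all trials of one component before the next (AAABBBCCC).
blocked = [c for c in components for _ in range(reps)]

# Random: the same trials in a shuffled order.
random_order = blocked.copy()
random.shuffle(random_order)

# Blocked-repeated: cycle through the components on every pass (ABCABCABC).
blocked_repeated = [c for _ in range(reps) for c in components]

print("blocked:          ", "".join(blocked))
print("random:           ", "".join(random_order))
print("blocked-repeated: ", "".join(blocked_repeated))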

8.2.11 TRAINING-TRANSFER DISSOCIATION Our prior discussion has focused exclusively on the effects of training
strategies on transfer, not on the training/learning experience itself. This difference is intentional because it is
becoming clear that several variables that may make training easier (or more rapid) may not necessarily
increase transfer effectiveness, and may in fact degrade it through spin-off effects (Schmidt & Bjork, 1992).
We saw above that such was the case with part task training and with training wheels (if guidance is not
carefully removed). This phenomenon is described as training-transfer dissociation.
Such dissociation has implications beyond the fact that training strategy merits should be based upon
transfer and not training performance. In particular, Bjork (1999) has noted that people intuitively evaluate the
ease of learning, training, and practice as a proxy for the quality and effectiveness of that learning: They
erroneously think that if learning is easy, it is effective, and memory for what is learned will therefore be
strong. This is an illusion. People using this heuristic (ease of learning = quality of learning) will often study material less than they should, or choose an inappropriately easy training technique (e.g., relying upon training wheels, or pure reading rather than practice testing), reflecting overconfidence in their knowledge and skill
gain. The general phenomenon of overconfidence is discussed in the next chapter.
This meta-cognitive illusion also has implications beyond the self-choices of training strategy and
practice time. If learners enjoy a particular training device or strategy because of its favorable impact on
performance during training (and other enjoyable aspects that may create extraneous load), this positive affect
will reflect favorably on the instructor or training environment in which that strategy is employed. Vendors
who sell that strategy (or an instructional curriculum or simulator device based on it) will benefit in sales and
marketing because of this favorable attitude. Bjork hypothesizes that the proof of effectiveness must lie in
transfer, which may not be correlated with (or may even be negatively correlated with) performance and
enjoyment in training.

9. LONG TERM MEMORY: REPRESENTATION, ORGANIZATION, AND RETRIEVAL
9.1 Knowledge Representation
Once information is encoded into long-term memory (LTM) through learning and training, its representation
can take on a variety of forms. Some knowledge is procedural (how to do things), and other knowledge is
declarative (knowledge of facts). Procedural knowledge is often said to be implicit in the sense that people
possessing this knowledge (often experts) are unable to express it verbally, but it is implicit in their actions
(Reder, 1996). (This was discussed in section 6.3, when we considered situation awareness measures.)
Procedural knowledge is therefore sometimes referred to as implicit memory. We can subdivide declarative
knowledge into two further categories: general knowledge of facts or concepts, like word meaning (semantic
memory), and memory for specific events in a person’s own life (episodic memory), the sort that is critical in

eye witness testimony. There is good evidence to suggest that these three LTM systems (implicit, semantic,
and episodic) exist independently in the brain (Poldrack & Packard, 2003; Tulving & Schacter, 1990). In the
next three sections, we treat the topics of: (1) knowledge representation, (2) memory retrieval and forgetting,
and (3) skill retention. These topics roughly correspond to semantic memory, episodic memory, and implicit
memory systems, respectively.
Human knowledge is incomplete and vague (Cohen, 2008). Most of what we know takes the form of
relational knowledge (e.g., Germany is east of France) rather than absolute knowledge like specific
quantitative longitudes (Nickerson, 1977). This often meets the needs of everyday life. In most situations
rough, imprecise estimates will suffice. Traditionally, we use the external environment to validate and check
our imperfect relational assumptions. We record information (write things down) in order to make use of
precise information when we need it. Information technology is most valuable when it provides the absolute
information that our semantic memory lacks, in a task appropriate form.
LTM is not simply a passive repository of information. Evidence from the grounded cognition approach
has emphasized the active and perceptual characteristics of memory (Barsalou, 2008). That is, when we think
conceptually, we activate sensory areas in the brain related to LTM concepts (e.g., color, size, shape, spatial
relations). So when we draw upon LTM knowledge (e.g., how does my bike work), we often simulate sensory
and motor elements (we visualize the gears, imagine how the pedals turn, or remember how much resistance a
certain gear produces). The simulation could represent an average of different instances, and can then be used
to test abstract predictions (e.g., with this gear controller, I push forward to obtain a higher gear). This is
similar to the concept of a mental model that we briefly discussed in Chapter 4, and we will consider mental
models in more detail shortly.
From an engineering psychology perspective, the way knowledge is represented and organized in long-
term memory is important primarily because it has implications for interface design, and the design and
organization of tasks within a real-world work domain. We consider how we make use of our relational
knowledge to solve specific problems in such domains, in the form of mental models about elements within
the domain. It is also important to consider how one can extract domain knowledge from a subject-matter
expert, and represent that knowledge in some form, a process called knowledge elicitation. The
representation can be useful in designing an interface or making recommendations for the organization of a
work place. With these applications in mind, we consider knowledge representation in terms of three
subtopics: the organization of the knowledge in memory, the concept of a mental model, and the methods for
representing long-term knowledge.

9.1.1 KNOWLEDGE ORGANIZATION We have long known that information is not stored in LTM as a random
collection of facts. Rather, that information has specific structure and organization, defining the ways in which
items are associated with one another. In particular, systems designed to allow the operator to use knowledge
from a domain will be well served if their features are congruent with the operator’s organization of that
knowledge. There is good evidence for a hierarchy of conceptual knowledge, such that we store different
types of information for broader concepts than for narrow instances (Collins & Quillian, 1969). When we
obtain specific expertise in a domain, this affects the nature of that hierarchy. Consider the index for a book
on engineering psychology. The psychologist might search for information relevant to visual display design
under the heading “Perception, visual,” whereas the engineer might look under “display design.” To support a
variety of users, book indexes should be relatively broad and redundant, with items of information accessible
under different categories (Bailey, 1989; Roske-Hofstrand & Paap, 1986).
Internet search engines like Google provide ultimate flexibility in this regard: users can search for any
concept terms desired. Indeed, this “random access” is one of the great advantages to the electronic storage of
information. However, the empty search box does not provide any hint to the user about how information can
or should be organized. When the domain is the complete set of indexed information on the internet, this is
probably the best approach. With a more limited set of information (e.g., the commands available in an
application), the interface designer will likely want to provide assistance to the user in the form of an
organizational structure.
The design of a menu system serves as one example, typically representing a set of commands that can
be used within an application. If the categories and the structures defined by the menu system do not
correspond to the user’s mental organization of them, the search for a particular item will require time-
consuming serial search, which will likely be frustrating for the user. That is, the user must start at the first
item on the list and scan down until the target is reached (as discussed in visual search in Chapter 3), or with

auditory phone menus, listen to each item before making a decision about which less-than-satisfactory option
could be chosen. Seidler and Wickens (1992) showed that when subjectively related items were placed closer together in a menu system, menu search was faster than when menu items were structured randomly.
Similar results have been obtained in other studies (e.g., Durding, Becker & Gould, 1977; Roske-Hofstrand &
Paap, 1986). Thus, when providing an organizational structure for information, it is important to understand
the mental representation of the typical user.
As users gain knowledge about the information stored in the database (domain knowledge), they also
become more flexible in how they can access information (e.g., Hollands & Merikle, 1987; Salterio, 1996;
Smelcer & Walker, 1993). For example, Hollands and Merikle provided participants with a definition and
asked them to find the corresponding term in a hierarchy of menus. Experts were equally effective searching
with an alphabetic or semantic organization; however, novices were better using the semantic organization for
this task. Thus, the information represented in the expert’s semantic memory allows flexibility in the use of
different organizational structures.
Sometimes the knowledge associated with expertise can interfere with how information is presented
(Kalyuga & Renkl, 2010). In the context of instructional design, we saw earlier in this chapter that methods that reduce cognitive load during learning help novices acquire new concepts more than they help experts. As a
specific example tied to knowledge structures, more experienced learners benefit less, and in fact suffer a cost,
when knowledge is accessed by multiple redundant sources (e.g., text and diagrams; Kalyuga, Chandler, et al.,
2001). So interface designers need to understand that knowledge can be a double-edged sword, sometimes
making users more flexible (in the case of organizational structures) and sometimes making them less flexible
(in the case of redundant instructional materials).

9.1.2 MENTAL MODEL A mental model has been defined as a mental structure that reflects the user’s
understanding of a system (Carroll & Olson, 1987) and therefore is a source of expectancies about how a
system will respond. It can be conceived as knowledge about the system sufficient to permit the user to
mentally try out or simulate actions before choosing one (Moray, 1999). A mental model may be created
spontaneously by the user or carefully formed and structured through training.
A mental model develops over time through experience with a physical system (Moray, 1999). The model is necessarily incomplete and may be incorrect; the complexity of the domain is simplified. There may be different mental models for a given physical system, of different types and levels of abstraction (more
general or more specific). Thus, for example, we do not have a single mental model of a car, but we may have
a model for how the car’s electrical system works (e.g., that the battery is recharged when the engine is on)
and a more specific model of the light systems or the starting motor. Different models are applied in different
contexts (e.g., when the car fails to start, the driver remembers that the battery provides power to the starting
motor and hypothesizes that the battery must therefore be dead). The user essentially picks the model that
seems most useful at a given moment, in a given context. The expert will generally have more refined and
accurate mental models than the novice.
Experts also demonstrate greater flexibility in their use of mental models than do novices. It is easier for
an expert to switch among the various models of the physical system than a novice. In a case study by
Williams, Hollan, and Stevens (1983), their expert subject switched between models while reasoning about a
heat exchanger (a device that cools or heats one fluid using another of different temperature). Williams et al.
argued that the use of multiple models, and the ability to switch between them, are crucial features of human
reasoning. Further, experts are generally able to adjust their strategic approach under stress better than
novices. Switching among mental models to find the most appropriate one would be one manifestation of the
strategic superiority associated with expertised.
As we noted in Chapter 4, and earlier in this chapter when discussing problem solving, the features of the
display representation and more generally the human-machine interface will shape the development of a
mental model, and how it is used. For instance, Sanderson (1990) found an interaction between physical
topology and reasoning: the same problem can be made more or less difficult by changing the physical layout.
Moreover, in light of the arguments made above by Moray (1999), one of the things that a good interface does
is improve the selection of the mental model. Ecological interface design is based heavily upon compatibility
of display representation with the mental model of the expert (which should correspond well to the underlying
physics of the system, as discussed in Chapter 4).
As discussed in Section 6.2, mental models can be used in prediction. The user can perceive an
environmental input to a physical system, “run” the mental model of system dynamics based on that input,

and predict the system output, which is then likely confirmed by the system’s response to the actual input. Yet mental models may sometimes be inaccurate, and when they are, breakdowns in performance occur and errors are committed (Doane, Pellegrino, & Klatzky, 1990). It might therefore be advantageous to create a correct
mental model by explicit training on the underlying causal structure and principles operating in a system—the
principles that underlie the procedures used to operate the system and its visible controls and displays. There
is some evidence that training through a mental model has benefits (Fein, Olson, & Olson, 1993; Taatgen,
Huss, et al., 2008).
However, explicit training is generally not feasible for consumer products and everyday household
devices (the assumption is that these products should be usable without formal training). People often have
erroneous mental models for everyday devices like the refrigerator and the furnace. Norman (1992) showed that while mental models of such devices contain errors, they can still be used for relatively accurate
prediction in some cases (i.e., the model is often accurate enough). An aid to the formation of a correct mental
model of a system is the concept of visibility (Norman, 1992). A device is said to have visibility if by looking
at it one can immediately tell the state of the device and the possibilities for action. The relation between
operator actions and state changes can be immediately seen (and thereby more easily learned). The concept of
visibility also refers to the ability of a system to display intervening variables between user action and system
response. For example, a thermostatic system that shows the state of the system generating or removing heat,
as well as the momentary temperature, has good visibility. Visibility is often reduced by high levels of
automation (discussed further in Chapter 12).
In sum, if the correct mental model for the operation of a device is provided to the user, either through
training or by design, better performance should result. A flawed mental model will likely lead to error in
some circumstances. Device visibility may help the user form a more accurate mental model. The main
advantage to a correct mental model is that it allows the user to make correct predictions about untested
situations, a useful characteristic for large and small systems alike.

9.1.3 METHODS FOR REPRESENTING LONG-TERM KNOWLEDGE The engineering psychologist sometimes wants to
gain access to the organization of an expert user’s knowledge. This information may be useful for training
programs or to improve the design of an interface. There are various knowledge elicitation techniques
available to do this (Cooke, 1994). Some of these are listed below.
1. Scaling techniques (see Kraiger, Salas & Cannon-Bowers, 1995, or Rowe, Cooke, Hall, & Halgren, 1996, for examples). These methods show how closely related domain concepts are to one another, usually by having experts rate the similarities of pairs of concepts.
2. Protocol analysis. Experts perform typical tasks with a system and speak their thoughts as they do so
(think aloud technique). Their behavior can be recorded on video and coded using process tracing
methods (Cooke, 1994). Like observation (below), this emphasizes the observable aspects of the task.
3. Interviews with subject-matter experts (self-reports).
4. Observation of experts in their work environment.
5. Structured knowledge elicitation. These techniques are typically part of cognitive task analysis, and are organized around an account of a specific situation or event in the expert’s experience.
6. Document analysis techniques. Traditional document analysis involves reviewing manuals and other
procedural documents used within a work domain. While not an explicit representation of the user’s
long-term knowledge per se, such documents often represent the constraints imposed by the physical
and social systems with which the user interacts (the user’s work domain). As we discussed in
Chapter 4, this physical representation impacts the user’s mental representation (long-term knowledge
and mental models).
Although all these techniques have some value, we note that structured knowledge elicitation techniques
based on cognitive task analysis have been shown to produce better instructional materials than expert self-reports, which tend to be incomplete or inaccurate (Feldon, 2007). Structured knowledge elicitation involves a
set of query methods that can be used to elaborate the expert’s experience in terms of time, depth and
richness, and breadth by asking “what if ” questions (Hoffman, Crandall, & Shadbolt, 1998; Schraagen,
Chipman, & Shute, 2000). The problem with self-report data from unstructured interviews is related to the
distinction between declarative and procedural memory, noted above. That is, the validity of a self-report
procedure is based on the assumption that experts have direct conscious access to their relevant knowledge.
However, once a perceptual or procedural skill is acquired, it is often difficult to explain how one

accomplishes that task (Cooke, 1994). Simply put, an expert cannot accurately introspect on procedural
knowledge. Structured interviews provide a set of probes and queries to address this problem (Crandall, Klein,
et al., 1994; Hoffman, 1995; Randel, Pugh, & Reed, 1996; Schraagen, Chipman, & Shalin, 2000).
Once the user’s knowledge is acquired, how is it best represented? One technique that has been especially
successful with respect to training is conceptual graph analysis (CGA; Gordon, Schmierer, & Gill, 1993). A
conceptual graph uses nodes and links of different types to characterize the user’s knowledge of a system.
Gordon et al. used CGA to develop an instructional text for a topic in engineering dynamics. First, a document written by an expert was represented as a conceptual graph. The graph was then translated into a standard text format. Students using this knowledge-engineered text solved more dynamics problems than
students who received the original text.
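To illustrate the kind of structure a conceptual graph captures, the fragment below encodes a few typed nodes and typed links in a minimal form; the node contents and link types are hypothetical examples and are not drawn from Gordon, Schmierer, and Gill (1993).

# Minimal sketch of a conceptual graph: typed nodes joined by typed links.
# Node contents and link types are hypothetical examples only.
nodes = {
    1: ("concept", "force"),
    2: ("concept", "acceleration"),
    3: ("statement", "F = m * a"),
    4: ("goal", "predict the motion of the block"),
}

links = [            # (from-node, link type, to-node)
    (3, "relates", 1),
    (3, "relates", 2),
    (4, "requires", 3),
]

def neighbours(node_id):
    """Nodes reachable from node_id, with the link type that connects them."""
    return [(kind, nodes[dst]) for src, kind, dst in links if src == node_id]

print(neighbours(4))   # the goal node points to the statement it requires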
Document analysis techniques have become computational, with large text corpora classified automatically using computational models of semantic memory (e.g., Griffiths, Steyvers, & Tenenbaum,
2007; Landauer & Dumais, 1997) and the resulting classification represented as a dynamic three-dimensional
visualization (Kwantes, 2005). Thus, there is an emerging application for computational knowledge
representation: the representation can become the interface to a semantic database. For example, a military
intelligence analyst can interact with this visualization tool to quickly ascertain key events in the domain (as
opposed to reading thousands of pages of intelligence reports) (Figure 7.7).
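The core step in such computational document analysis can be sketched simply: each term is represented by its pattern of occurrence across documents, and relatedness is the angle between the resulting vectors. The toy corpus below is invented; real systems such as latent semantic analysis or topic models add dimensionality reduction or probabilistic inference on top of this basic co-occurrence step.

import math

# Toy corpus: each "document" is just a list of terms (invented example).
docs = [
    ["convoy", "route", "checkpoint"],
    ["convoy", "attack", "checkpoint"],
    ["market", "meeting", "elder"],
]

def term_vector(term):
    """Term-by-document count vector, the raw material of LSA-style models."""
    return [doc.count(term) for doc in docs]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms

# Terms that co-occur in the same documents end up with similar vectors.
print(cosine(term_vector("convoy"), term_vector("checkpoint")))  # 1.0
print(cosine(term_vector("convoy"), term_vector("elder")))       # 0.0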
Finally, we should consider the concept of an ontology. This term, from computer science, represents a
systematic classification of “what exists” in a domain. In an ontology, all concepts within a domain, and the
relationships among those concepts, are formally defined (Brewster & O’Hara, 2007). For example, an air
traffic control ontology might include object classes for types of aircraft, various types of radios or
communications equipment, different occupational classes, and so on. Certain types of aircraft may or may
not have a particular type of sensor, or communications equipment. This relational information would also be
represented in the ontology. Given what we know about the incomplete nature of our declarative knowledge,
it is doubtful that our LTM would be well modeled in the form of an ontology. However, the concept is useful
as a method for representing what exists objectively in a particular work domain, and also provides a formal
method of knowledge representation for the design of interfaces, or the design of intelligent systems (agents)
working within the domain (Brewster & O’Hara, 2007).
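As a concrete, if hypothetical, illustration of what an ontology encodes, the fragment below lists a few classes, an is-a hierarchy, and typed relations of the sort an air traffic control ontology might contain; all names are invented for the example and are not taken from Brewster and O’Hara (2007).

# Hypothetical ontology fragment: classes, an is-a hierarchy, and typed
# relations between classes. All names are invented examples.
is_a = {
    "PassengerJet": "Aircraft",
    "CargoJet": "Aircraft",
    "VHFRadio": "CommunicationsEquipment",
}

relations = [   # (subject class, relation, object class)
    ("PassengerJet", "equipped_with", "VHFRadio"),
    ("Controller", "communicates_with", "Aircraft"),
]

def ancestors(cls):
    """Walk the is-a hierarchy upward from a class."""
    chain = []
    while cls in is_a:
        cls = is_a[cls]
        chain.append(cls)
    return chain

print(ancestors("PassengerJet"))                        # ['Aircraft']
print([r for r in relations if r[0] == "PassengerJet"]) # its typed relations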

FIGURE 7.7 A multidimensional representation of concepts associated with insurgent activities in Afghanistan. The representation is based
upon the co-occurrence of terms from a large text corpus of intelligence information. Image courtesy of Defence Research and Development
Canada.

9.2 Memory Retrieval and Forgetting
Knowing that a fact or skill has been learned and is therefore stored in LTM does not guarantee that it will be
retrieved when needed. Hence engineering psychologists must also be concerned with the sources of memory
failure. Failures of memory lead to human performance errors. As with working memory, retroactive and
proactive interference play a role. Also the similarity of to-be-remembered information to other information
stored in long-term memory is a factor. As we saw in the discussion of negative transfer, a set of procedures
learned for one word-processing system can very easily become confused in memory with a set of procedures
for a different system, particularly if many other aspects of the two systems are identical. Finally, the mere
passage of time causes forgetting. We remember best those things that have happened most recently, an important phenomenon in the discussion of information integration in decision making in Chapter 8. To
understand such memory failures better, we distinguish between two forms of retrieval: recall and recognition.

9.2.1 RECALL AND RECOGNITION Recall describes the situation in which you must generate information stored in
memory. For instance, you might make a mental shopping list, and then try to remember what was on that list
when you go shopping. Recall may be assisted by any number of cues, such as those designed to
facilitate prospective memory to recall a specific intent (see Section 8). Before I go shopping I might say to
myself that I have two dairy items to purchase. When I get to the dairy section and recall this cue, it reduces
the chance that I buy one and forget the other. Recognition involves classifying an item as something you
either have or have not seen or heard before. (Indeed you might recognize “recognition” from our discussion
of eyewitness testimony in Chapter 2!) While shopping, you might recognize a friend. Or the eyewitness
might recognize the suspect as the perpetrator of a crime. Typically this takes the form of a yes-no judgment.
However, we might ask the witness to provide a confidence rating instead (how confident were you that that
was your friend in the supermarket, or that the suspect committed the criminal act).
The important difference among these tasks is the need to retrieve the item from memory. In a recall task
you must retrieve the item; in cued recall you are provided a cue to help you retrieve the item; in recognition
you only need to decide whether the item is familiar (no retrieval required). These are all different ways to
assess episodic memory. In general, recognition is easier than recall (Cabeza, Craik, et al., 1997), with cued
recall somewhere between. That is, recognition is typically the most sensitive measure. Even though we may
no longer be able to recall things we often recognize them as familiar once we see or hear them.
The contrast between recognition and recall is evident in the design of computer software. Recalling
uncued procedures on a computer is hard for the novice user. Recall failures are a source of frustration with
command-based interfaces such as Linux (Soegaard, 2010) or scripting languages. Providing some sort of cue
makes it easier; but supporting recognition memory through a menu in which all you need to do is recognize
the option you want, and click it, is easiest of all. To use Norman’s (1992) terms, recall requires “knowledge in the head”; recognition places some “knowledge in the world” instead.
Novices tend to prefer menus of commands, because they can scroll through the list until a particular
command is recognized. In contrast, for experts, using command menus can be frustrating because the expert
must scroll through various menus to make selections. To deal with these problems, command menus for most
current software applications also allow the user to press a sequence of keystrokes (keyboard shortcuts) to
accomplish the same functions. Thus, expert users can recall the sequence and recognition can be used when
recall fails (Grossman, Dragicevic, & Balakrishnan, 2007; Ryu & Monk, 2009). For instance, I type the
sequence <alt> <o> <f> <b> to subscript text in MS Word, but if I forget or don’t
know the sequence I can search the menu options. Based on such ideas, models of human-computer
interaction explicitly incorporate recall and recognition as types of mental process important in interface
design (e.g., Ryu & Monk, 2009). For example, an interface might require a user to recall that they put the
system into a particular state (less effective), or it might provide an icon to allow the user to recognize the
system state (more effective, although it can produce display clutter if not done carefully, as described in
Chapters 3 and 5).
As we saw in Chapter 2, recognition can be represented by signal detection, with “yes, familiar” corresponding to “I see the signal” and “no, not familiar” corresponding to “no signal.” When applied to recognition memory, research shows that as the criterion becomes more liberal (more “familiar” responses), sensitivity actually declines. This occurs because the signal distribution (old, familiar items) tends to
have greater variability than the noise distribution (new, unfamiliar items) (Wixted, 2007). Thus, in a
recognition context, to maximize sensitivity it is important to maintain a high (conservative) criterion. This is
important when sensitivity is a high priority, as for example with eyewitness testimony.
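The link between criterion placement and measured sensitivity can be illustrated with a small unequal-variance signal detection sketch. The distributional parameters below are invented for demonstration; only the qualitative pattern, that a more liberal criterion yields a lower measured d' when old items are more variable than new ones, reflects the account attributed to Wixted (2007).

from statistics import NormalDist

# Hypothetical unequal-variance recognition model: "old" items are both
# stronger and more variable than "new" items (parameters invented).
new_items = NormalDist(mu=0.0, sigma=1.0)
old_items = NormalDist(mu=1.5, sigma=1.3)
z = NormalDist().inv_cdf   # inverse of the standard normal CDF

for criterion in (1.5, 1.0, 0.5, 0.0):   # lower criterion = more liberal
    hit_rate = 1 - old_items.cdf(criterion)
    fa_rate = 1 - new_items.cdf(criterion)
    d_prime = z(hit_rate) - z(fa_rate)
    print(f"criterion {criterion:3.1f}: hits {hit_rate:.2f}, "
          f"false alarms {fa_rate:.2f}, measured d' = {d_prime:.2f}")

# As the criterion becomes more liberal, measured d' falls, which is why a
# conservative criterion preserves sensitivity in recognition memory.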

Consistency is very important in recognition. For example, dynamic menus (menus that change based on
recent selections) slow users down because commonly sought items are not where they are expected (Mitchell
& Shneiderman, 1989). Consistency of presentation of search engine results is also important. Teevan (2008)
found that people recalled very little explicitly about the contents of previously viewed search results lists, but
they often recognized a list as one they had seen before. When users believe that a results list has changed,
they have trouble reusing the old content on the list, and are less likely to find what they are looking for.
Importantly, they often falsely recognize a result list as one they thought they had seen before even when it is
different. Thus, it appears human users have a liberal (low) criterion for identifying the list as old. As we have
just seen, in recognition memory a liberal criterion is associated with low sensitivity. The implication is that
ensuring consistency in the order of a sequence of results lists will generally improve search performance.
One of the advantages of expertise is that it is easier for experts to identify a particular situation (e.g., the
experienced firefighter detecting the type of fire from another floor). The set of available cues that
characterize a situation is recognized, with implications for decision making, as discussed in the next chapter.
We might say that what we naively think of as intuition is actually recognition of a previously experienced
situation (Seligman & Kahana, 2009). While it might be difficult for the expert to recall a situation explicitly
and describe its characteristics, given the appropriate situational cues they know how to respond. Indeed,
empirical studies show that it is possible for a person to “know” or be familiar with an item, without explicitly
remembering it (Cohen, Rotello, & Macmillan, 2008). In studies where participants are asked to explicitly say
if they remembered something versus whether they are familiar with it (the remember-know paradigm), it is
possible to affect each type of response without affecting the other (Gardiner & Richardson-Klavehn, 2000).
That is, familiarity is independent from explicit remembering. This type of familiarity process underlies many
of our day-to-day behaviors, and likely underlies performance in high-stress situations when decisions need to
be made quickly, like the firefighter noted above.
In many everyday situations, information in the external world provides retrieval cues that help in the
recall of procedural steps from long-term memory. When those cues are absent, forgetting can result. For
example, suppose I want to operate a gasoline pump. Without retrieval cues, I may occasionally err in the
sequence of activities I perform, such as selecting the type of gasoline before removing the nozzle. In contrast,
if the numbers (1), (2), (3) . . . are printed next to the controls, this provides a sequence of retrieval cues for
me and I am less likely to err. In commercial aviation, the checklist has become the predominant method for
minimizing error in flight activities. It does so by providing retrieval cues that activate information about the
sequence of activities the pilot must perform (Degani & Wiener, 1990; Reason, 1990).
Providing retrieval cues within a task structure or interface design has a myriad of benefits. They have
been shown to address action slips, a type of error discussed at length in Chapter 9. Retrieval cues are
obviously quite beneficial when procedures must be carried out in a fixed order, and they can thereby prevent
deviation from that order. Also, as we describe in detail in Chapter 10 in our discussion of interruption
management, retrieval cues placed within a sequence of activities can remind the person of where they were in
a sequence when they were interrupted, so that return to the ongoing task will be fluent. Loft, Smith, and
Bhaskara (2009) found that retrieval cues were most effective at those times when deviation from routine was
required. Users working with an air traffic control simulation were required to press a specific response key
when accepting target aircraft into their sector. A memory aid that was constantly available had no effect;
however, the same information shown at just the right time increased the likelihood that the user would press
the key. Retrieval cues actually produce a second benefit: other related associations are more likely to be
forgotten (retrieval-induced forgetting; Coman, Manier, & Hirst, 2009). So if I follow the checklist items, I am less likely to associate cues with inappropriate actions in the future.

9.2.2 EVENT MEMORY In Chapter 2, we described biases that affect the recognition memory of eyewitness
testimony. Beyond simple yes-no recognition, we may be interested in the accuracy of episodic memory when
people recall or describe a situation that has happened to them (e.g., when providing a narrative of a sequence
of events). This might occur with the witness in a judicial proceeding (Loftus, 2005) or the system operator
questioned in an investigation following an industrial accident. Two sources of bias emerge for this event
memory: the loss of knowledge about the event (forgetting), and the tendency to include new information that
did not occur at the time of the event. Thus, witnesses are likely to “fill in” details of an event to make them consistent with the way the world typically runs, even though those details were not explicitly observed. Top-down
processing (Chapter 6) operates on one’s memory for events. Indeed, replacing or augmenting details of a
specific event is characteristic of the expert (Lewandowsky, Little, & Kalish, 2007), who has a large amount
of domain-relevant knowledge.

Events occurring subsequent to an initial event can be absorbed into one’s memory of the initial event
(Loftus, 1979; Wells & Loftus, 1984). Since witnesses tend to be unaware of this, they tend to be
overconfident about their memory’s accuracy. In one study, Okado and Stark (2005) had subjects view a
staged video showing a man stealing a girl’s wallet. Some subjects were given misinformation about the event
(the girl’s arm was hurt in the process). Then all subjects were asked to describe the original event. Many of
those subjects given the misinformation after the event then claimed that they specifically saw the
misinformation (the girl’s arm being hurt) in the original event. Loftus (2005) refers to this result as the
misinformation effect. Indeed, Loftus, Coan, and Pickrell (1996) went so far as to show that not only can
post-event information change an existing memory, but memory for an event can be produced when there was
no actual event! By including a story about being lost in a shopping mall in a set of stories about other events that had occurred in their subjects’ lives, many subjects claimed to have been lost in that mall even though that had never happened to them.
DNA testing technology has shown that many individuals have been convicted of crimes that they did
not commit (Wargo, 2011). In most of these cases, eyewitness testimony was involved and was considered the
primary evidence during trial (Scheck, Neufeld, & Dwyer, 2003; Wright & Loftus, 2005). Since human
testimony remains a necessary source of information in judicial proceedings or accident investigations, the
jury or board of investigation should be aware that: (1) information occurring after the event can be
incorporated into the memory for an event; and (2) individuals can recall events that did not occur.

9.3 Skill Retention


In many work situations, operators are frequently called upon to perform a particular skill they have learned.
This procedural memory is different from the recall (or recognition) of specific episodes. Very often, the
skilled performance of procedures is accurate and effortless. For example, we do not forget how to ride a bike,
even though we might not have done so for a few days, weeks, or even years. But sometimes the problem of
skill forgetting is a substantial one, in particular when the person did not thoroughly learn the skill in the first
place, or when the person has only limited opportunity to practice it (e.g., first aid procedures). The
commercial airline industry is sufficiently concerned with pilots forgetting skills not often practiced (e.g.,
recovery from emergencies; see Section 8.2.7) that recurrency training is required every six months.
Physicians trained in simulators for laparoscopic surgery are assessed half a year later to determine their skill
retention, and undergo maintenance training to ensure that their skills are maintained (Stefanidis, Korndorffer,
et al., 2005, 2006).
It is important for the engineering psychologist to have some way to predict what skills will be forgotten
at what rate in order to know how often operators should be required to participate in recurrency training. The
following three factors are important in determining how well skilled performance is remembered.
1. Skill type. Different skill types have different lengths of skill retention (Adams & Hufford, 1962;
Arthur, Bennett, et al., 1998; Rose, 1989). Perceptual-motor skills involving continuous responses,
such as driving, flight control, and most sports skills, show very little forgetting over long periods of
time. In contrast, cognitive skills, which require a sequence of discrete steps, such as how to use a
word processor, are more rapidly forgotten. (The skill distinction is similar to the distinction between
procedural and declarative memory noted in the previous section, with perceptual-motor skills being
stored in procedural memory and cognitive skills being stored in declarative memory). For cognitive
skills, it is likely that the linkage between consecutive steps in a process is the source of the forgetting.
So-called digital skills (those skills necessary to work with tactical command and control systems) are
a type of cognitive skill and as such are subject to forgetting, a situation of concern to military
organizations (e.g., Adams, Webb, et al., 2003; Goodwin, 2006).
 The issue therefore is how to ensure that cognitive skills are not forgotten. Forgetting can to some
extent be addressed by the use of retrieval cues such as checklists as noted above. Consistency of
practice is important in maintaining cognitive skill, leading to automaticity (Schneider, 1985). Raskin
(2000) has emphasized the importance of consistency in interface design. If a single sequence of
keystrokes is used for the same function across software applications (i.e., there is a consistent
mapping to function), the sequence becomes automatized. Similarly, applications that support the
transition from a recognition-based to a recall-based interface with practice generally tend to be very
effective in developing and maintaining a procedural skill. Zhai, Kristensson, et al. (2012; see also
Zhai, 2008) developed an application called Shapewriter that allows an iPhone user to draw the shape
of a sequence of keystrokes on a QWERTY keyboard. Users learn the shape of each word (e.g.,
imagine typing the word “the” versus “and” and consider the shape formed by the letter arrangement).

The consistent motor movement associated with the shape is therefore implicitly learned each time the
user enters text, essentially turning the cognitive skill into a perceptual-motor skill.
2. Sequence of practice. Many complex tasks have different types of task components, including both
procedural and declarative elements. Clawson, Healy, et al. (2001) showed that for a task having both
procedural (perceptual-motor) and declarative (cognitive) components (translating Morse code), it is
better to train the procedural component first. Skill retention is greater in this case, presumably
because the more robust nature of the perceptual-motor skill “anchors” the cognitive components.
3. Individual differences. Faster learners tend to show better retention than slower learners. Rose (1989)
suggests that this difference may be related to chunking skills. As we have seen, better chunking will
lead to faster acquisition as well as more effective and efficient storage in long-term memory. A larger
working memory capacity has been shown to improve the ability to utilize feedback during learning
(Kelley & McLaughlin, 2008).

10. TRANSITION
In this chapter we have discussed at length the separate components of verbal and spatial working memory
and long-term memory. Each has different properties and different codes of representation, yet all are
characterized by stages of encoding, storage, and retrieval. Failures of each of these processes result in
forgetting, which is a critical point of breakdown in human-system interaction. Techniques of system and task
design and procedures to facilitate memory storage (training) were discussed.
In the next chapter, we discuss decision making, coupling the memory box in Figure 1.1 with the forward
flow of information processing to include the selection of decision choices. Our treatment of decision making,
however, depends on an understanding of memory and learning in three respects. First, many decisions place
heavy loads on working memory. The costs imposed by these loads often lead to mental shortcuts, or
heuristics, which produce systematic biases in decision performance. Second, other decisions are affected by
long-term memory and experience. We decide upon an action because the circumstances correspond to a memory of a similar situation in which we made the same decision and its outcome was successful. Finally,
we will learn that the decision-making task has unique features, which cause learning and expertise in
decision making to be somewhat different from that in other skills.

Key Terms
active learning 230
adaptive training 229
aptitude X treatment interaction 233
binding 199
central executive 199
checklist 241
chunk 205
chunking 205
cognitive load theory (CLT) 228
cognitive skills 243
cognitive streaming 218
collaborative inhibition 213
conceptual graph analysis 238
contrived task 208
data link 204
declarative memory 243

digital skills 243
dual coding principle 231
echoic memory 202
encoding 197
episodic buffer 199
episodic memory 234
event memory 242
executive control 201
expertise effect 233
extraneous load 228
fractionation 230
generation effect 230
germane load 228
grounded cognition 234
iconic memory 202
implementation intention 212
implicit memory 234
implicit performance-based measures 220
interference 205
interruption management 212
intrinsic load 228
intrusiveness 219
knowledge elicitation 235
learning 197
long-term memory 197
long-term working memory 210
memory 197
memory span 204
mental model 236
misinformation effect 242
ontology 238
opportunistic planning 221
parsing 205
passive learning 230
perceptual-motor skills 243
phonological loop 198
phonological store 198
planning 198
proactive interference 206
problem solving 198
procedural skills 208

prospective memory 211
recall 239
recognition 240
remember-know paradigm 241
retrieval 198
retrieval cues 241
retrieval-induced forgetting 241
retroactive interference 206
satisfice 221
scaffolding 229
segmentation 230
semantic memory 234
sequence of practice 243
situation assessment 215
situation awareness 198
skill type 243
skilled memory 198
stimulus/central-processing/response compatibility 201
storage 198
system lag 217
template theory 209
the situation present assessment measure 219
think aloud technique 237
time sharing skill 230
training 197
training cost ratio 225
training system fidelity 226
training-transfer dissociation 233
transactive memory 211
transfer effectiveness ratio 225
transfer of training 198
variable priority training 230
visual echo 204
visuo-spatial sketch pad 198
work domain 238
worked examples 229
working memory 197
working memory analysis 207
working memory capacity 199

8 DECISION MAKING

1. INTRODUCTION
Lauren had loved mountain climbing since she was a young girl, and in her twenties was now an
accomplished climber. She decided to organize her own mini-expedition to climb a remote peak in the
Northern Himalayas. To finance the expedition she took out a considerable loan on credit, and then turned to
the choice of what mountain to tackle. There were so many options, varying in remoteness, altitude,
challenge, uniqueness, beauty, possible weather, and information available. And then, once the peak, Mt. Heuristic-Ri, was chosen, the choice of team members was equally hard: how many, and whom? Friends she could trust, or climbers of excellent reputation? And of her friends, good humor or strength, or organizational skills?
After a long trek in, they arrived at the foot of the mountain, but now were confronted by additional
decisions: what route to take? What and how much equipment—was a higher camp necessary, or would they
go for the summit in one long 24-hour shot—and what was the weather forecast? Unfortunately, it was rainy and cloudy for three days as they waited at base camp, until at last the weather began to clear.
On the night prior to departure, the forecast, while iffy, indicated better weather on the way, so they
decided to proceed with a 1:00 am departure the next morning. As they proceeded upward, the dawn was murky with clouds remaining over much of the sky; however, a band of clear sky in the west gave them hope and they continued onward. But the band never widened.
Leading the climb high on the mountain, Lauren was confronted with another choice above her: to veer
to the left up a steep ridge of hard but solid rock, or to continue up an easier snow slope, burdened with new
snow from the past several days of bad weather. The team was tired, and the snow looked good while the rock
looked steep. Recalling her recent fall on a rock climb in Wyoming, Lauren chose the snow route. That choice
almost proved disastrous; as the last climber neared the top of the slope, the large slab of snow below him
started to cascade off. Fast action by the belayer above caught the climber before he was dragged down.
Topping the slope, they stopped to gaze at the sky, and noticed that the blue patch they had counted on
was not opening, and indeed the ominous clouds behind them had grown. The summit was just about half a mile beyond along the ridge, and Lauren huddled the team, saying “We are almost there. It might be risky to
continue, but the summit is not far, and we have put so much into this expedition that we can’t afford certain
failure by turning back.” Her teammate promptly rephrased the option: “If we turn back now, we’ll surely get
back safely, but if we continue there is still only a possibility that the weather will hold for us to make the
summit.” The team discussed the options briefly, and decided on descent as the safer course of action. While
descending safely, Lauren remained somewhat dissatisfied. The weather had not turned worse, and she could
only say “if only …”
Many serious accidents in which human error has been involved can be attributed to faulty operator
decision making: The decision to launch the space shuttle Challenger, which later exploded because the cold temperature at launch time destroyed the seals, is one example; another is the 1988 decision of personnel on board the USS Vincennes to fire on an unknown aircraft, which turned out to be a civilian Iranian transport rather than a hostile fighter (U.S. Navy, 1988). However, a contrasting tragic decision was made a year before by those on board the USS Stark, cruising in the Persian Gulf: not to fire on an approaching target, which turned out to be hostile and launched a missile that cost several lives on board the Stark. Of course these
and other decisions gain notoriety because they generated unfortunate or tragic outcomes.
In the same manner we can recall better our own personal decisions that went awry: the class we chose to
take that we failed; the poor investment we made, or Lauren’s decision to take the snow slope. However in
terms of frequency, our lives are far more dominated by the less salient decisions that went right. In this
chapter we consider the processes that underlie decisions of both kinds, and the characteristics of the
information and choice that can either improve the likely outcome, or make the decision more difficult and the
choice more likely to produce an unwanted result.

2. CLASSES AND FEATURES OF DM
From an information processing perspective, decisions typically represent a many-to-few mapping of
information to responses. That is, a lot of information is typically perceived and evaluated in order to produce
a single choice. The following are some key features:
Uncertainty. An important feature of any decision is the degree of uncertainty of the consequences.
Such uncertainty is generally a result of the probabilistic nature of the world in which we live, in which a
given choice may lead to one sort of outcome if certain characteristics of the world are in effect or will come
to pass, and a different outcome otherwise. If some of the possible but uncertain outcomes are unpleasant or
costly ones, we usually consider the uncertainty of the decision as involving risk. The decision to purchase
one of two possible vehicles is generally low risk if one has done advanced research on product quality, since
the probable outcomes of one purchase or the other are known. But the decision to proceed with a flight in
uncertain weather may have a high amount of risk, since it is difficult to predict in advance what impact the
weather will have on the safety of the flight.
Time. Time plays at least two important roles in influencing the decision process. First, we may contrast
“one shot” decisions like the choice of a purchase, with evolving decisions like those involved in treating an
uncertain disease, in which test is followed by medication which may be followed by further tests and further
treatment. Secondly, time pressure has a critical influence on the nature of the decision process (Svenson &
Maule, 1993).
Familiarity and Expertise. Decision making changes with experience in several ways (Lipshitz &
Cohen, 2005; Montgomery, Lipshitz, & Brenner, 2005; Weiss & Shanteau, 2003). As we discuss later,
experts can often look at a decision problem and intuitively, nearly instantly pick the correct choice, whereas
novices may ponder the problem for some time, and perhaps make a poor choice. This distinction parallels (although it is not identical to) a dichotomy that research has drawn between holistic and analytical decision types (Hammond et al., 1987), or between decision system 1 (more holistic) and system 2 (more analytic)
(Evans, 2007; Kahneman & Klein, 2009; Kahneman, 2003; Sloman, 2002). Indeed these two systems appear
to rely on different brain structures (Leher, 2010). In short, system 1 operates relatively automatically and
effortlessly, reflecting “skilled expertise,” and hence obviously develops fluency as the decision maker gains
familiarity with a domain. System 2 is much more analytical and deliberative, generally relying heavily on
working memory capacity in its deliberations. The two systems often work interactively, in that system 2 may
monitor and cross check the quick intuitive decision made by system 1. As we will discuss below, the two are
also somewhat associated with two different schools of decision analysis and research, naturalistic decision
making (Zsambok & Klein, 1997, high skill and expertise: system 1) and the heuristics/biases approach to
decision making (system 2; Kahneman & Klein, 2009).
Classes of decision-making research. Certain of the features of decision making described above have
played a prominent role in distinguishing three important classes of decision-making research. The study of
rational or normative decision making (e.g., Edwards, 1987) has focused its efforts on how people should
make decisions according to some optimal framework; for example, one that will maximize the expected gain
or minimize the expected loss. Efforts here are often focused on the departures of human decision making
from these optimal prescriptions. We considered a simple example of this in the context of setting the
“optimal beta” for signal detection decisions in Chapter 2 and will discuss it in more detail in Section 6 below.
The cognitive or information processing approach to decision making focuses more directly on the
sorts of biases and processes that reflect limitations in human attention, working memory, or strategy choice,
as well as on common decision routines, known as heuristics, that work well most of the time, but
occasionally lead to undesirable outcomes (Kahneman, Slovic, & Tversky, 1982; Herbert, 2010; Hogarth,
1987; Gilovich, Griffin, & Kahneman, 2002; Kahneman & Klein, 2009). Less emphasis here is placed on
departures from optimal choice per se, and more on understanding the causes of such biases in terms of the
structure and limits of the human as an information processing system. Finally, the naturalistic decision
making approach (Kahneman & Klein, 2009; Mosier and Fischer, 2010; Zsambok & Klein, 1997, see Section
8) places its greatest emphasis on how people (usually experts) make decisions in naturalistic environments
(i.e., outside of the laboratory), where they possess expertise in the domain and where the decisions have
many of the aspects of complexity (evolving time, time pressure, multiple cues) that may be absent in
laboratory studies of decision making (Mosier & Fischer, 2010).

3. AN INFORMATION PROCESSING MODEL OF DECISION MAKING

Figure 8.1 presents a model of the information processing components that are involved in decision making,
elaborating the information processing model presented in Chapter 1 while deemphasizing some components (e.g.,
sensory processing, response execution).

FIGURE 8.1 An information processing model of decision making. The general information processing model is shown in the upper left.

Beginning at the left, the decision maker seeks cues or information from the environment. However, we
note that in decision making (unlike much of pattern recognition), these cues are often processed through the
“fuzzy haze” of uncertainty, and hence, may be ambiguous or interpreted incorrectly. In our opening story,
Lauren was required to process the fuzzy uncertain weather forecast in making her decision to proceed.
Selective attention of course plays a critical role in decision making, in choosing which cues to process (of
higher perceived value) and which to filter out. Such selection is based on past experiences (long-term
memory) and requires effort or attentional resources.
The cues that are then selected and perceived now form the basis of an understanding, awareness, or
assessment of “the situation” confronting the decision maker (see Chapter 7), a process that is sometimes
labeled diagnosis (Rasmussen & Rouse, 1981). Here the decision maker entertains hypotheses about what
might be the current and future state of the world, upon which a decision should be based. For example, the
physician must diagnose a disease before deciding upon a treatment, or the student may wish to assess an
instructor’s quality prior to choosing to enroll in a course. This diagnosis or assessment is based upon
information provided from two sources, the external cues filtered by selective attention (bottom up
processing) and long-term memory. The latter can offer the decision maker both various possible hypotheses
of system state (e.g., the physician’s knowledge of possible diseases and their associated symptoms or cures)
and estimates of the likelihood or expectancy that each state might be true (top down processing). What makes
decision making distinct from many other aspects of information processing is that diagnosis or situation
assessment is often incorrect, because of the uncertain nature of the cues, their ambiguous mapping to possible
states, or because of vulnerabilities in the cognitive processing of the decision maker related to selective
attention (Chapter 3) and working memory (Chapter 7).
Many decisions are iterative in the sense that initial hypotheses will trigger the search for further
information to either confirm or refute them. Troubleshooting a system failure will often trigger repeated tests
to confirm or refute possible hypotheses (Hunt & Rouse, 1981). This characteristic defines the important
feedback loop to cue filtering, labeled “confirmation” in Figure 8.1. The entire process of cue seeking and
situation assessment has been labeled the “front end” of the decision process (Mosier & Fischer, 2010).
Following from the front end stages of cue seeking and situation assessment (or diagnosis), the third
principal stage in decision making is the choice of an action, described as the “back end” of decision making
(Mosier & Fischer, 2010). From long term memory the decision maker can generate a set of possible courses
of action or decision options; but if the diagnosis of the state of the world is uncertain (as it is in much
decision making), then the possible consequences of the different choices define their risks. Consideration of
risk requires the explicit or implicit estimation of two quantities: (1) the probability or likelihood that different
outcomes will come to pass and (2) values, the extent to which those outcomes are “good” or “bad.” This is
directly analogous to the discussion in Chapter 2, where the decisions made in signal detection theory
depended upon both probability and the values (costs and benefits) imposed on different outcomes (hits, false
alarms, misses, correct rejections). Thus the physician will probably consider the values and costs of various
outcomes before she decides which treatment (do nothing, drugs, surgery) to recommend for a patient’s
abnormality of uncertain identity.
The overall distinction between front-end and back-end processes is critical to understanding decision
failures (Hoffman et al., 1998). For example, very different solutions may be applied to remedy environments
where decisions fail because of poor information and situation assessment, compared to those where failures
result from inappropriate (e.g., too risky) choices in the face of a well-diagnosed situation (Wiegmann, Goh, &
O’Hare, 2002).
Three additional components characterize our model. First, many of the components of decision making
demand effort or resources (see Chapter 10). Sometimes people choose decision strategies that impose
reduced effort demands as they conserve this effort, such as choosing a diagnostic strategy that does not
require them to hold many alternatives in working memory. Indeed such effort-conserving choices form a
basis of many of the heuristics that we will discuss below.
Second, the figure depicts the role of meta-cognition (Reder, 1996). This process, discussed further in
Section 7—awareness and knowledge of one’s own knowledge, effort, and thought processes—is one that is
closely linked with situation assessment (in this case, the “situation” involves the evolving decision process)
and turns out to have an important influence on the overall quality of decision making: is one aware of the
limitations in one’s own decision process? Does the decision maker know that he does not possess all of the
information necessary to make a good decision and hence seek more?
Finally, the major feedback loop as shown at the bottom of Figure 8.1 critically illustrates the iterative
nature of decision making. First, feedback of decision outcomes is sometimes used to assist in refining a
diagnosis as we described above in troubleshooting. Second, meta-cognitive evaluation may trigger the search
for more information. Third, feedback may be employed in a learning sense, to improve the quality of overall
decisions (i.e., learning from one’s mistakes); this feedback (although often delayed) may eventually be
processed in long-term memory in order for the decision maker to revise his internal rules of decision making
or the estimates of risks (see Section 8). That is, to learn decision-making skills.

4. WHAT IS “GOOD” DECISION MAKING?


The previous section has emphasized the several information processing components involved in decision
making, such as cue perception, selective attention, and working memory. In previous chapters we have
discussed many of these components in detail and have outlined some of the limitations (as well as the
strengths) of all of them, such as the limited capacity of working memory. Hence, it is not surprising that the
decision process may often fall short of “perfect” or “optimal” performance. Mistakes are made. Yet at the
same time, the concept of what really is “good” decision making has proven to be elusive (Kahneman &
Klein, 2009; Lipshitz, 1997; Shanteau, 1992), in contrast to other aspects of human performance, where speed
and accuracy have a clearly defined status of quality. In fact, at least three different characterizations have
been offered of “good” decision making, not all of which are in perfect agreement with each other.
First, early decision research of the normative school offered the expected value of a decision as the
“gold standard.” That is, the decision that would produce the maximum value if repeated numerous times
(Edwards, 1987; see Section 6.1). However, defining expected value depends upon assigning universally
agreed upon values to the various possible outcomes of a choice; values are often personal, making this a
difficult undertaking. Even if values could be agreed upon, the choice that might be optimal if the decision is
repeated time and again with plenty of time for weighing all the cues will not necessarily be optimal for a
single choice, particularly one made under time pressure with little time to fully diagnose the situation and
consider all possible outcomes (Zsambok & Klein, 1997). Furthermore, for a single decision, the decision
maker may be more concerned about, say, minimizing the maximum loss (worst case) rather than maximizing
expected long-term gain, which after all can only be realized as a long-term average of the outcomes of
several decisions.
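As a rough numerical illustration of this distinction (not drawn from the text; the payoffs and probabilities below are purely hypothetical), the following short Python sketch contrasts a choice made by maximizing expected value with one made by minimizing the maximum loss:

    # Hypothetical one-shot decision: each option lists (probability, payoff) outcomes.
    options = {
        "risky":    [(0.9, +100), (0.1, -500)],   # higher expected value, large possible loss
        "cautious": [(1.0, +20)],                 # modest but certain gain
    }

    def expected_value(outcomes):
        return sum(p * v for p, v in outcomes)

    def worst_case(outcomes):
        return min(v for p, v in outcomes)

    for name, outcomes in options.items():
        print(name, "EV =", expected_value(outcomes), "worst case =", worst_case(outcomes))

    # Maximizing expected value favors "risky" (EV = 40 versus 20), but a decision
    # maker concerned with the worst case of a single choice prefers "cautious".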
Second, one may say that “good” decisions are those that produce “good” outcomes and bad decisions
conversely produce bad outcomes, such as the decision to launch the Challenger space shuttle, to fire on the
Iranian airliner from the USS Vincennes, Lauren’s decision to climb the snow slope that triggered the
avalanche, or the decision of a jury to convict a suspect who subsequently is found innocent. Yet we also
know that in a probabilistic world, where cues are uncertain, it may only be with the 20/20 vision of
hindsight that the decision can be labeled “bad” (Woods et al., 1994). After all, considering the USS
Vincennes case (a “bad” decision), the decision makers on board the ship must also have considered that the
decision made a year earlier on board the USS Stark, not to fire upon an approaching contact, turned out also
to be “bad,” leading to the loss of life on the Stark. This tendency to label a decision as good or bad only after
the outcome is known is sometimes called the hindsight bias.
A third approach to decision quality has been based upon the concept of expertise (Zsambok & Klein,
1997; Kahneman & Klein, 2009; Brehmer, 1981; Shanteau, 1992; see Chapter 7). Since experts in other fields
(e.g., chess, physics) are known to produce “good” and sometimes exceptional performance, why not consider
that expert decision makers do the same? The problem here is that several analyses of decision making have
shown that experts in certain domains do not necessarily make better decisions than novices (Brehmer, 1981;
Dawes, 1979; Garling, 2009; Kahneman & Klein, 2009; Shanteau, 1992; Taleb, 2007; Tetlock, 2005; Serfaty,
MacMillan, et al., 1997; see Section 8), and several “bad” decisions, according to our second criterion, have
indeed been made by highly trained experts.
We adopt the approach here that, to the extent that all three of the characteristics described above
converge, then it becomes increasingly easy to discriminate good from bad decision making. But when they do
not, then such discrimination is often fruitless, and it is much more appropriate simply to look at the
qualitative ways in which different environmental and informational characteristics influence the nature of the
processing operations and outcomes of the decision process. This is the framework shown in Figure 8.1,
within which we treat the material below, first considering how people accumulate and assess evidence
bearing on a diagnosis (front end: Section 5), then how they use that assessment to choose an action (back
end; Section 6), and then the explicit role of effort and meta-cognition (Section 7).

5. DIAGNOSIS AND SITUATION ASSESSMENT IN DECISION MAKING

Accurate situation assessment is necessary (although not sufficient) for good decision making. Pilots who are
good decision makers (by the various criteria above) actually take longer in understanding a situation or
decision problem, even as they select and execute the choice more rapidly (Orasanu & Fischer, 1997). As
shown in Figure 8.1, we can however distinguish four different information-processing components, each
with their limitations, that can influence the quality of assessment and diagnosis: the role of perception in
estimating a cue, the role of attention in selecting and integrating the information provided by the cues, the
role of long-term memory in providing background knowledge to establish possible hypotheses or beliefs,
and finally the role of working memory as the “workbench” for updating and revising beliefs or hypotheses
on the basis of newly arriving information.

5.1 Estimating Cues: Perception


On the whole, people are reasonably accurate in estimating the mean and variance of a set of observations
(Sniezek, 1980; Wickens & Hollands, 2000). However, systematic biases have been observed in perceiving
and estimating three other characteristics of the environment: proportions, projections, and randomness.

5.1.1 PROPORTIONS With regard to proportions, when perceiving a set of dichotomous observations (e.g., faulty
versus normal parts on an inspection line; see Chapter 2), people do a reasonably accurate job of estimating
the proportion so long as proportion values fall within the midrange of the scale (e.g., between around .05 and
.95); however, with more extreme proportions, their estimates often tend to be “conservative,” biased away
from the extremes of 0 and 1.0 (Varey, Mellers, & Birnbaum, 1990). Such biases may result from an inherent
conservative tendency (“never say never”), or alternatively they may result from the greater salience,
noticeability or impact of the single outlying observation (which is, by definition, the infrequent event) in the
sea of more frequent events. For example, if I have seen 99 normal parts, then detecting the one abnormal part
will make more of an impact on my consciousness than detecting a 100th normal one. Its greater impact could
well lead me to overestimate its relative frequency in hindsight, even as the rarity of the abnormal part will
make me less likely to detect it in the first place if its abnormality is not salient (see Chapter 3).

However, a very important exception to this overestimation bias occurs when the estimate of the
frequency of very rare events (e.g., causing a rear-end collision by following too closely) is based on personal
experience, rather than description (Hertwig & Erev, 2009). Here the person’s sample of events is insufficient,
so they may never actually experience the event in question, and underestimation is observed; that is, they may act
as if the event is impossible (for them) rather than just improbable. This finding has important implications for
safety as we discuss in Section 6.
The tendency to overestimate the frequency of rare events from description (versus experience, as above)
has important implications for choice behavior. For example, people appear to show little difference in
behavior (e.g., purchasing lottery tickets) whether the odds of an event (winning) are 1/1,000 or 1/10,000,
thereby implicitly overestimating the probability of the latter (Slovic, Finucane, et al., 2002). They consider
both as equal evidence for the possibility of winning rather than as different evidence for the probability of
winning. In Chapter 2, we saw how this tendency could affect the setting of the response criterion, as manifest
in a “sluggish beta.” Later in this chapter, we see how it affects risky decision making.
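A simple back-of-the-envelope calculation (with a hypothetical prize value) makes the neglected difference concrete:

    # Expected winnings per ticket for two lottery odds (hypothetical $1,000 prize).
    prize = 1_000.0
    for p_win in (1 / 1_000, 1 / 10_000):
        print(f"p = {p_win}: expected winnings per ticket = ${prize * p_win:.2f}")
    # $1.00 versus $0.10: a tenfold difference in expected value that is ignored
    # when both odds are treated merely as "a chance of winning".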

5.1.2 PROJECTIONS With regard to projection, humans are not always effective in extrapolating non-linear
trends. As shown in Figure 8.2, they often bias their estimates toward the more linear extrapolation of the
tangent where the data end (Wagenaar & Sagaria, 1975; Wickens, 1992). This parallels the challenges people
have in predicting the dynamic behavior of systems to be tracked, as discussed in Chapter 5. Thus, for
example, in predicting the future temperature of a process on the basis of historical trend data showing
exponential growth, people would be likely to underestimate its future values. Like the bias in estimating
proportions, this can be thought of as a “conservative” one, inferring that the quantity will be less extreme
than the statistical data would suggest. However, such prediction is, by definition, an inference, and so the
conservative bias in extrapolation can possibly be explained on the basis of a further inference based upon
past experience. This is the inference that most exponentially increasing quantities do eventually encounter
self-correcting mechanisms that slow the rate of growth. For example, exponential population increases will
encounter natural (e.g., disease) or artificial (e.g., birth control) means that lower the rate of growth.
Exponentially increasing temperatures will often trigger fire extinguishing efforts, or opening pressure relief
valves that will reduce the rate of growth. So the long-term memory of experience will lead the decision
maker—accurately—to infer that the rapidly growing quantity will eventually slow its rate of growth.
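A brief sketch (in Python, with an arbitrary illustrative growth rate and look-ahead time) shows how extrapolating along the tangent at the end of an exponential series falls short of the true future value:

    import math

    rate = 0.5                      # illustrative exponential growth rate
    t_now, t_future = 10, 14        # current time and forecast time (arbitrary units)

    value_now = math.exp(rate * t_now)
    slope_now = rate * value_now    # tangent (instantaneous slope) at t_now

    linear_forecast = value_now + slope_now * (t_future - t_now)
    true_value = math.exp(rate * t_future)

    print(f"linear (tangent) forecast: {linear_forecast:,.0f}")
    print(f"true exponential value:    {true_value:,.0f}")
    # The tangent-based forecast (about 445) falls well short of the true value
    # (about 1,097): the "conservative" underestimation described above.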
At the same time, other research indicates that people (e.g., stock analysts) may sometimes be overly
risky or extreme in their projection of quantities that are not exponentially growing as above (De Bondt &
Thaler, 2002), leading to an overreaction in their trading (e.g., choice) behavior. Indeed, they tend to be even
more so when making longer-range forecasts, as if discounting the lower reliability of the greater look-ahead
time, a point to which we return below (De Bondt & Thaler, 2002; see also Chapter 5). Finally, as we discuss
in Section 7.2, people are not always effective in planning for the future.

FIGURE 8.2 Conservatism in extrapolation.

5.1.3 RANDOMNESS People do not do a good job in perceiving (or understanding) randomness in the
environment (Tversky & Kahneman, 1971). This is best illustrated by the gambler’s fallacy in observing (or
acting on) a series of dichotomous events, like coin tosses or wins and losses in a gamble. People tend to think
that “random” implies a heavy bias toward alternation between the two outcomes. When generating a random
series of, say, heads and tails, people will tend to avoid generating a sequence like HHH or TTTT, even though this
sequence of three (or four) identical events is no less likely than any other sequence. In particular, when
people witness a series of dichotomous events, the more consecutive observations of one event (e.g., losses)
they see, the more they expect the next one to be the other event (a win). This is true despite the fact that in a
random process, each event is independent of the prior one. The chance of a head following four heads is still
50 percent, not higher, as people’s predictions would suggest.
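The independence claim is easy to verify by simulation; a minimal Python sketch (one million simulated fair coin tosses, values chosen purely for illustration):

    import random

    random.seed(1)
    tosses = [random.random() < 0.5 for _ in range(1_000_000)]   # True = heads

    # Outcomes immediately following a run of four consecutive heads
    after_streak = [tosses[i] for i in range(4, len(tosses)) if all(tosses[i - 4:i])]
    print(f"P(heads | four heads just occurred) ~ {sum(after_streak) / len(after_streak):.3f}")
    # The estimate stays near .50: each toss is independent of the streak before it.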
This bias in the perception of random events is shown clearly in the “hot hand” effect in basketball
(Gilovich, Vallone, & Tversky, 2002). Here, many players and coaches are convinced that after a few
consecutive baskets (usually from outside) the player has a “hot hand” and should continue shooting (often at
the expense of distributing the ball to teammates). Yet careful statistical analyses of such “streaks” indicate
that they are no more likely than is the series of, say four “heads” in a coin toss. The next shot has a
probability of success that is no greater than the player’s long-term shooting percentage. Indeed, if anything
the streak could lead to the opponents’ more aggressive defense against the hot hander, hence decreasing her
likelihood of hitting the next shot.
Poor perception of randomness is also reflected in people’s resistance to perceiving outliers in a
distribution as legitimate components of the tails of an otherwise random distribution. They interpret them
instead to be significant trends (Tversky & Kahneman, 1981). People search for what they perceive to be
systematic trends, and therefore they often see “patterns” in data which are, in fact, nothing more than random
organization.
The previous discussion of biases in the perceptual estimation of quantities spawns one important design
message. When possible, systems should directly display the parameters estimated from separate
observations (e.g., computer generated predictions), rather than requiring the human to estimate or infer those
quantities. The format in which these parameters should be displayed (e.g., digital, graphical) was an issue
discussed in earlier chapters of the book, and also has important implications for decision-making displays, as
will be discussed toward the end of this chapter.

5.2 Evidence Accumulation and Selective Attention: Cue Seeking and Hypothesis Formation
As shown in Figure 8.3, we can represent the diagnostic stage of decision making as a process by which the
decision maker receives a series of cues, symptoms, or sources of information as shown near the bottom,
bearing on the true (or predicted) state of the world, and attends to some or all of these with the goal of using
those cues to influence the cognitive belief in one of several alternative hypotheses shown at the top right. In
many instances, we can represent this as a “belief scale,” between two alternative hypotheses, H1 and H2, as
shown in the figure. Thus, we may think of the physician diagnosing a tumor as benign or malignant, the
planner (for a flight, a hike, a picnic) predicting that the weather will be either clear or rainy, the investment
broker predicting that the stock in a company will either climb or dive, or intelligence agents diagnosing the
presence or absence of weapons of mass destruction in Iraq (Isikoff & Corn, 2006).

FIGURE 8.3 Representation of the process of information integration (from bottom to top) to form a belief or diagnosis related to one of two
hypotheses.

Each cue that potentially bears on the hypothesis can also be characterized by three important properties:
1. Cue diagnosticity formally refers to how much evidence a cue should offer regarding one or the other
hypothesis. Thus, if one sees rain drops falling, this is a 100 percent diagnostic cue that it is
raining; on the other hand, a forecast of “a 50 percent chance of showers” is a totally undiagnostic cue
for precipitation. Dark clouds on the horizon are relatively diagnostic (e.g., 75 percent), but not
perfectly so. The diagnosticity of any cue can be expressed both in terms of its discriminating value
(high or low) as well as its polarity (i.e., which hypothesis the cue favors).
2. Cue reliability or credibility refers to the likelihood that the physical cue can be believed. This
feature is independent of diagnosticity. Thus an eyewitness to a crime may state categorically that “the
suspect did it” (high diagnosticity); but if the witness is a notorious liar, his or her reliability is low.
Collectively, both diagnosticity and reliability can be expressed on scales of 0 to 1.0, and then their
product can reflect the information value of a cue (a simple numerical sketch of this product follows
this list). If the decision maker views a cue with an information value = 1 (d = 1 × r = 1), then that
single cue is all that needs to be processed to make an error-free diagnosis. However, most diagnostic
problems have cues with information value less than 1.0, and hence can produce circumstances in which
cues conflict. (Consider opposing witnesses for the defense and prosecution in a legal trial.)
3. The physical features of a cue that make it conspicuous or salient have an important bearing on the
selective attention and subsequent processing that it receives.
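The product described in item 2 can be illustrated with a minimal Python sketch; the reliability values below are hypothetical, and the diagnosticity values simply echo the weather examples given in item 1:

    # Information value of a cue as the product of diagnosticity (d) and
    # reliability (r), both expressed on 0-1 scales (illustrative values only).
    cues = {
        "rain drops falling":          {"d": 1.00, "r": 1.00},
        "dark clouds on the horizon":  {"d": 0.75, "r": 0.90},
        "50 percent chance forecast":  {"d": 0.00, "r": 0.70},
    }
    for name, c in cues.items():
        info_value = c["d"] * c["r"]
        print(f"{name}: information value = {info_value:.2f}")
    # Only the first cue (information value = 1.0) could support an error-free
    # diagnosis by itself; the others must be integrated with further evidence.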
How then should the multiple cues be integrated to form a belief that correlates with the true state of the
world? Here we can consider four information-processing operations, three of them having parallels with our
discussion of perception in earlier chapters. First, selective attention must be deployed to process the different
cues, ideally giving different weight according to their information value. Second, the cue values—raw
perceptual information—must be integrated, analogous to the bottom up processing of perceptual features in
pattern recognition. Third, expectancies or prior beliefs may play a role in biasing one hypothesis or belief to
be favored over the other, analogous to the way that expectancies stored in long term memory influence the
top down processing in perceptual pattern recognition and signal detection (Chapters 6 and 2). Fourth, an
operation that is not paralleled in perceptual pattern recognition is the iterative testing and retesting
of the initially formed belief, to attain the final belief which is the basis for choice.
Having established the role of reliability and diagnosticity in determining the information value of a cue,
we are then in a position to establish the optimal degree of belief in one hypothesis or another on the basis of
multiple cues.
The process of attending to and integrating multiple cues typically located at different places and/or
delivered at different times along various sensory channels presents a major challenge to human selective
attention and hence can be a source of four major vulnerabilities, as we discuss below.

5.2.1 INFORMATION CUES ARE MISSING A decision maker may not have all the information at hand to make an
accurate diagnosis. An operator cannot be blamed for the judgment to turn on a faulty piece of equipment if the
operator was not informed by maintenance personnel of the equipment failure. At the same time, however, a
decision maker may sometimes be blamed if a decision is made in the absence of critical information that
he or she should know is essential. But thwarting this process is the fact that present cues can be perceived,
while realizing the existence of absent cues depends upon memory, a process that we learned in the previous
chapter is often prone to error. One quality of good decision makers is that they will often be aware (meta-
cognition) of what they do not know (i.e., missing cues) and may proceed to seek these cues before making a
firm diagnosis (Orasanu & Fischer, 1997). Thus, the effective planner of a mission will attempt to obtain, and
rely on, only the most recent weather data, and, if the available forecast is outdated, may postpone a decision
until a weather diagnosis can be made on more recent data.

5.2.2 CUES ARE NUMEROUS: INFORMATION OVERLOAD As we have noted, when the information value of any cue
is known to be 1.0 (both reliability and diagnosticity = 1.0), then other information need not be sought. But
this is rarely the case, and so effective diagnosis will rely upon integrating multiple cues. However, this can
present a selective attention challenge, as we discussed in Chapter 3. The operators monitoring any nuclear
plant in the face of a major failure may be confronted with literally hundreds of indicators, illuminated or
flashing (Rubenstein & Mason, 1979). Which of these should be attended to first, as the operator tries
to form a diagnosis as to the nature of the fault?
When several different information sources are available, each with less-than-perfect information value,
the likelihood of a correct diagnosis can increase as more cues are considered. In practice, however, as the
number of sources grows beyond two, people generally do not use the greater information to make
proportionately better, more accurate decisions (Allen, 1982; Dawes, 1979; Dawes & Corrigan, 1974; Lehrer,
2009; Malhotra, 1982; Schroeder & Benbassat, 1975). Oskamp (1965), for example, observed that when more
information was provided to psychiatrists, their confidence in their clinical judgments increased but the
accuracy of their judgments did not. Allen (1982) observed the same finding with weather forecasters. The
limitations of human attention and working memory seem to be so great that an operator cannot easily
integrate simultaneously the diagnostic impact of more than a few sources of information. In fact, Wright
(1974) found that under time stress, decision-making performance deteriorated when more rather than less
information was provided.
Despite these limitations, people have an unfortunate tendency to seek far more information than they
can absorb adequately. The admiral or executive, for example, will demand “all the facts” (Samet, Weltman,
& Davis, 1976). In the field of medical imaging, Jarvik et al. (2003) have noted that with the emergence of the
MRI, surgeons began to recommend a large number of unnecessary back surgeries, compared to the
recommended rate when only lower quality X-rays were available. The extensively greater amount of
information available in the MRI did not lead to improved diagnosis, and apparently degraded it (Lehrer,
2009).
To account for the finding that more information may not improve decision making, we must assume that
the decision maker employs a selective filtering strategy to process informational cues. When few cues are
initially presented, this filtering is unnecessary. When several sources are present, however, the filtering
process is required, and it competes for the time (or other resources) available for the integration of
information. Thus, more information leads to more time-consuming filtering at the expense of diagnostic
quality.

5.2.3 CUES ARE DIFFERENTIALLY SALIENT As we discussed with the SEEV model in Chapter 3, the salience of a
cue, its attention-attracting properties or ease of processing, can influence the extent to which it will be
attended and weighted in information integration (Payne, 1980). For example, loud sounds, bright lights,
underlined or highlighted information, abrupt onsets of intensity or motion, and spatial positions in the front
or top of a visual display are all examples of salient information cues and are likely to be given greater weight,
particularly under time pressure (Wallsten & Barton, 1982). Negative, unpleasant information is found to be
more salient (attention capturing) than positive, in influencing decisions (Yechiam, 2012).
These findings lead us to expect that in any diagnostic situation, the brightest flashing light or the meter
that is largest, is located most centrally, or changes most rapidly will cause the operator to process its
diagnostic information content over others: the salience bias. When integrating testimony from witnesses, it
may be the loudest or most articulate voice that is attended to the best. It is important for a system designer to
realize, therefore, that the goals of alerting (high salience) are not necessarily compatible with those of
diagnosis in which salience should be directly related to the information value of the cue in making a
diagnosis, not just in detecting a fault.
In contrast to salience, which may lead to “overprocessing,” research also suggests that information that
is difficult or effortful to interpret or integrate, because it requires arithmetic calculations or contains
confusing language, will tend to be ignored, or at least underweighted (Bettman, Johnson, & Payne, 1990;
Johnson, Payne, & Bettman, 1988). For example, Stone, Yates, and Parker (1997) found that presenting risk
information in digital form led to less appropriate processing than presenting it in the analog form of stick
figures, whose salient numerosity represented the magnitude of risk.
An extreme case of low salience relates to the absence of a cue. There are often circumstances in which
a hypothesis can gain credibility on the basis of what is not seen as well as what is seen. For example, the
computer or automotive troubleshooter may be able to eliminate one hypothesized cause of failure on the
basis of a symptom that is NOT observed. Yet people are relatively poor in using the absence of cues to assist
in diagnosis in fields such as medicine (Balla, 1980) or logical troubleshooting (Hunt & Rouse, 1981). It
should be noted that the absence of a cue is not quite the same as the missing information described in 5.2.1
because there are circumstances in which the fact that something is NOT observed (absence of a cue) can
provide a great deal of diagnostic information. It’s just that people do not use that information very well.
The observation that cue salience influences the impact of cue processing is a part of the more general
observation that the physical format or array of information relevant to a decision problem can influence the
nature of the decision processes (Smith, Bennett & Stone, 2006), an issue we discuss in Section 7, and it also
has relevance to the benefits of ecological interface displays with salient emergent features to the diagnosis of
abnormal states in complex systems (Burns & Hajdukiewicz, 2008; see Chapter 4).

5.2.4 PROCESSED CUES ARE NOT DIFFERENTIALLY WEIGHTED BY INFORMATION VALUE While people will tend to
overprocess cues of greater salience, there is also good evidence that people tend to overprocess cues of lesser
information value relative to those of greater value (e.g., Koehler, Brenner, & Griffin, 2002). That is, people do
not effectively modulate the amount of weight given to a cue based upon its information value, whether the
latter is influenced by diagnosticity or reliability. Instead, they tend to treat all cues as if they were more or
less of equal value (Cavenaugh, Spooner, & Samet, 1973; Schum, 1975). This as-if heuristic thereby reduces
the cognitive effort which would otherwise be required to consider differential weights when integrating
information. It is a heuristic which, like others we discuss below, will not generally do damage to the
diagnosis (Dawes, 1979), but under certain circumstances, particularly when a low value cue happens to be
quite salient, its use can invite a wrong diagnosis.

FIGURE 8.4 Demonstration of the as-if heuristic. The function shows the relationship of the validity of cues to the optimal and obtained
weighting of cues in prediction.

Kahneman and Tversky (1973) have demonstrated that even those well trained in statistical theory do not
down-weight less reliable information sources when making “intuitive” predictions. In Figure 8.4, the optimal
diagnostic weighting of a predictive variable is contrasted with the weights as inferred from subjects’
predictive performance. Optimally, the information extracted, or how much weight is given to a cue, should
vary as a linear function of the variable’s correlation with the criterion. In fact, the weighting varies in more of
an “all or none” fashion, as shown in the figure.
Numerous examples of the as-if heuristic can be identified, downweighting differences in information
value. As one example, Griffin and Tversky (1992) found that evaluators, forming impressions of an applicant
on the basis of letters of recommendation, tended to give more weight to the tone or enthusiasm of the letter
(a salient feature) than to the credibility or reliability of the source (the letter writer). Koehler, Brenner, and
Griffin (2002) found that when people make predictions, they generally neglect to consider differences in the
quality of evidence, overrelying upon evidence when its quality is low, and under-relying when its quality is
high. Rossi and Madden (1979) found that trained nurses were not influenced by the degree of diagnosticity of
symptoms in their decision to call a physician. This decision was based only on the total number of symptoms
observed.
A particularly dangerous situation occurs when less than perfectly informative information is passed
from observer to observer. The lack of perfect reliability or diagnosticity may become lost as the information
is transmitted, and what originated with uncertainty might end with certain conviction. There is some feeling,
for example, that in the USS Vincennes incident in which the Iranian airliner was targeted, the uncertain status
of the identity of the radar contact may have become lost as the fact of its presence was relayed up the chain
of command (U.S. Navy, 1988).
Another potential cause of unreliable data whose limits are discounted in information integration occurs
when the sample size of data used to draw an inference is small. A political poll based on 10 people is a far
less reliable indicator of voter preferences than one based on 100. Yet these differences tend to be ignored by
people when contrasting the evidence for a hypothesis provided by the two polls (Fischhoff & Bar-Hillel,
1984; Tversky & Kahneman, 1971, 1974). In the context of Figure 8.3, information regarding reliability can
be said to be less accessible to cognition than the actual diagnostic content of that information, and hence is
ignored (Kahneman, 2003).
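The point about sample size can be made concrete with the standard error of a sample proportion; a short Python sketch (the observed proportion of .55 is hypothetical):

    import math

    p = 0.55                                  # hypothetical observed support in the poll
    for n in (10, 100):
        se = math.sqrt(p * (1 - p) / n)       # standard error of a sample proportion
        print(f"n = {n:3d}: estimate = {p:.2f} +/- {1.96 * se:.2f} (approximate 95% margin)")
    # n = 10 gives roughly +/-.31, n = 100 roughly +/-.10: the smaller poll is far
    # less reliable, a difference decision makers tend to ignore.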
The insensitivity to differences in predictive validity or cue reliability (i.e., the failure to weight cues optimally) should
make people ill-suited for performing tasks in which diagnosis or prediction involves multiple cues of
different information value. In fact, a large body of evidence (e.g., Dawes & Corrigan, 1974; Dawes, Faust, &
Meehl, 1989; Kahneman & Tversky, 1973; Kleinmuntz, 1990; Meehl, 1954) does indeed suggest that humans,
compared to machines, make relatively poor intuitive or clinical predictors. In these studies, subjects are given
information about a number of attributes of a particular case. The attributes vary in their weights, and the
subjects are asked to predict some criterion variable for the case at hand (e.g., the likelihood of success in a
program or the diagnosis of a patient). Compared with even a crude statistical system that knows only the
polarity of cue diagnosticity (e.g., higher test scores will predict higher criterion scores) and assumes equal
weights for all variables, the human predicts relatively poorly. This observation has led Dawes, Faust, and
Meehl (1989) to propose that the optimum role of the human in prediction should be to identify relevant
predictor variables, determine how they should be measured and coded, and identify the polarity of their
relationship to the criterion. At this point a computer-based statistical analysis should take over and be given
the exclusive power to integrate information and derive the criterion value (Fischhoff, 2002).
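A minimal sketch of the kind of crude statistical system just described: a unit-weight (equal-weight) linear model that knows only the polarity of each cue. The cue scores and polarities are hypothetical, standardized values invented for illustration:

    # Polarities give the known direction of each cue's relation to the criterion.
    polarities = [+1, +1, -1]                 # e.g., test score, interview rating, error rate
    cases = {
        "applicant A": [1.2, 0.3, -0.5],
        "applicant B": [0.4, 0.9,  0.8],
    }

    def unit_weight_score(values):
        # Equal (unit) weights applied in the known direction of each cue
        return sum(sign * v for sign, v in zip(polarities, values))

    for name, values in cases.items():
        print(name, "->", round(unit_weight_score(values), 2))
    # Studies of clinical versus statistical prediction find that even such
    # unit-weight models typically out-predict unaided intuitive judgment.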
Why do people demonstrate the as-if heuristic in prediction and diagnosis? The heuristic seems to be an
example of cognitive simplification or effort conservation, in which the decision maker reduces the load
imposed on working memory by treating all data sources as if they were of essentially equal reliability. Thus,
a person avoids the differential weighting or mental multiplication across cue values that would be necessary
to implement the most accurate diagnosis. When people are asked to estimate differences in reliability of a cue
directly, they can clearly do so. However, when this estimate must be used as part of a larger mental
aggregation using working memory, the values become distorted in this simplifying pattern.

5.3 Expectations in Diagnosis: The Role of Long-Term Memory


When cues are integrated, such integration is influenced in two important respects by long term memory
(based on past experience), as related to cue correlation and to expectancy. Each generates its own unique
heuristic.

5.3.1 REPRESENTATIVENESS The foundation of the representativeness heuristic (Kahneman & Frederick, 2002;
Tversky & Kahneman, 1974) is that cues for a diagnostic state are often correlated. Thus, for example, bad
weather is diagnosed by both clouds and low pressure. The flu is diagnosed by nausea, fever, and aches. The
correlation between these cues or symptoms may be less than perfect. So there exists a difference between the
ideal “prototype” (all cues present) and its actual expression in each real world “case.” Some cues may be
absent or weak, and possibly some extra cues may be present. When making a diagnosis, people tend to match
the observed case pattern against one of a few possible patterns of symptoms (one for each diagnosis) learned
from past experience and stored in long-term memory. If a match is made, that diagnosis is chosen. As we
will see in Section 8, this is behavior typical of skilled decision making, or visual pattern recognition (Chapter
6).
There is nothing really wrong with following this heuristic except that people tend to use
representativeness without adequately considering the base rate, probability, or likelihood that a given
hypothesis or diagnosis might actually be observed (Koehler et al., 2002). For example, following the
representativeness heuristic, a physician observing a patient who matches four out of five symptoms typical of
disease X, and three out of five typical of disease Y will be likely to diagnose disease X as being most
representative of the patient’s symptoms, even if X occurs very rarely in the population, compared to disease
Y.
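A hedged numerical sketch (all probabilities are hypothetical, and the two diseases are treated as the only candidate hypotheses) shows why the better symptom match can still be the less probable diagnosis once base rates are considered:

    # How well each disease fits the observed symptoms, and its prevalence.
    fit       = {"X": 0.80, "Y": 0.60}    # P(symptom pattern | disease)
    base_rate = {"X": 0.01, "Y": 0.20}    # prior probability in the population

    unnormalized = {d: fit[d] * base_rate[d] for d in ("X", "Y")}
    total = sum(unnormalized.values())
    for d, v in unnormalized.items():
        print(f"P({d} | symptoms) = {v / total:.2f}")
    # X fits the symptoms better (.80 versus .60), yet its posterior probability is
    # only about .06 because Y is twenty times more prevalent.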
In a manner similar to the failure to differentially weight cues, discussed above, Christensen-Szalanski
and Bushyhead (1981) have observed that physicians are insufficiently aware of disease prevalence rates
(base rate) in making diagnostic decisions. Balla (1980, 1982) confirmed the limited use of prior probability
information by both medical students and senior physicians in a series of elicited diagnoses of hypothetical
patients. Furthermore, the sluggish beta adjustment in response to signal probability, described in Chapter 2,
in which decision-making criteria are not adjusted sufficiently on the basis of signal frequency information, is
another example of this failure to account for base-rate information. So too is the relative insensitivity to
differences in proportion described in Section 5.1.1.
Representativeness may be thought to reflect another example of the distorting effects of salience or
accessibility in decision making (Kahneman & Frederick, 2002; Kahneman, 2003). Symptoms are salient,
accessible, and visible; probability is abstract and mental, and hence seems to be “discounted” when placed in
competition with a pattern of perceivable symptoms. As Griffin and Tversky (1992) put it, “people pay more
attention to the salient, representative strength of evidence (e.g., the difference between two means, or the
warmth of description of an applicant in a letter) than they do to the reliability of evidence” (which is more
abstract).
The prevalence of the representativeness heuristic does not mean that people ignore probability or base
rates altogether in reaching diagnoses. It only means that physical similarity of expressed cues to a prototype
hypothesis dominates probability consideration when the two are integrated to determine the most likely
hypothesis, on the basis of both past experience and the physical evidence (Griffin & Tversky, 1992). If, on
the other hand, the physical evidence is itself ambiguous (or missing), then people will use probability. They
will be quite likely to diagnose the hypothesis which, in their mind, has the greatest probability of being true
(Fischhoff & Bar-Hillel, 1984). However, this mental representation of probability may also be imperfect, as
reflected in the second important heuristic in evidence consideration, the availability heuristic.

5.3.2 THE AVAILABILITY HEURISTIC Availability refers to “the ease with which instances or occurrences [of a
hypothesis] can be brought to mind” (Tversky & Kahneman, 1974; Schwarz & Vaughn, 2002) and is closely
related to the construct of accessibility discussed briefly above (Kahneman, 2003; Kahneman & Frederick,
2002). This heuristic can be employed as a convenient means of approximating prior probability, in that more
frequently experienced events or conditions in the world generally are recalled more easily. Therefore, people
typically entertain the hypotheses that are most available in memory.
Unfortunately, other factors strongly influence the availability of a hypothesis that may be quite
unrelated to its absolute frequency or prior probability. As we noted in our discussion of long-term memory
(Chapter 7), recency is one such factor. An operator trying to diagnose a malfunction may have encountered a
possible cause recently, either in a true situation, in training, or in a description just studied in an operating
manual. This recency factor makes the particular hypothesis or cause more available to memory retrieval, and
thus it may be the first one to be considered. Lauren’s recent fall on the rock in Wyoming led her to diagnose
the rock route as more dangerous.
Availability also may be influenced by hypothesis simplicity. For example, a hypothesis that is easy to
represent in memory (e.g., a single failure) will be entertained more easily than one that places greater
demands on working memory (a compound double failure). Another factor influencing availability is the
elaboration in memory of the past experience of the event. For example, in an experiment simulating the job
of an emergency service dispatcher, Fontenelle (1983) found that those emergencies that were described in
greater detail to the dispatcher were recalled as having occurred with greater frequency.
Availability and accessibility are closely related to the phenomenon of attribute substitution
(Kahneman, 2003), in which the intuitive (system 1) decision process substitutes highly accessible
attributes for the more effort-demanding attributes employed by the analytic (system 2) process when
resources are scarce. Thus, highly accessible attributes like similarity, averages, and change are contrasted
with (and often substitute for) more abstract, less accessible, but often more optimal attributes such as
likelihood (influenced by probability) and absolute amount. As one simple example, when people make
choices in a gamble, they are often heavily influenced by the probability of winning or losing between two
options, rather than the expected value of the two options (an issue that will be discussed later in the chapter).
Probability is bounded (by 0 and 1) and so is easily accessed, compared, and discriminated between options
(Slovic, Finucane, et al., 2002).
Interestingly, representativeness (the pattern of data) and availability (estimating frequency of
hypothesis) are two commodities that are integrated together in the Bayesian approach to optimal decision
making (Edwards, Lindman, & Savage, 1963). In this approach, the prior probability is multiplied by
P(data pattern | hypothesis) to estimate the probability of each hypothesis given the data. The interplay between
availability and representativeness in human cognition approximates this process, as we saw too in signal
detection judgments. In contrast, however, classical statistics fails to consider the prior probability (odds),
focusing only on the “p value,” or p(data | hypothesis). As we see by considering representativeness and
availability, the human as an “intuitive statistician” considers both, but does so heuristically.
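The multiplication described above can be written compactly in odds form; a small sketch with hypothetical numbers (posterior odds = prior odds × likelihood ratio):

    # Prior odds reflect availability (estimated base rate of H1 versus H2);
    # the likelihood ratio reflects representativeness (fit of the data pattern).
    prior_odds = 0.2 / 0.8                 # hypothetical prior belief in H1 versus H2
    likelihood_ratio = 0.9 / 0.3           # P(data | H1) / P(data | H2)

    posterior_odds = prior_odds * likelihood_ratio
    posterior_p_h1 = posterior_odds / (1 + posterior_odds)
    print(f"posterior P(H1) = {posterior_p_h1:.2f}")
    # Looking only at the likelihood ratio (as a "p value" analysis does) favors H1
    # by 3 to 1; weighting the low prior pulls the posterior back to about .43.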

5.4 Belief Changes Over Time


As we have noted, many diagnoses are not short, “one-shot” pattern classifications, but rather take place
over time as an initial tentative hypothesis may be formed, and more evidence is sought (or arrives) to confirm
or refute it. Indeed most troubleshooting seems to work this way, in which various tests are performed,
specifically designed to provide new cues or evidence in an effort to identify the “true” state. Jurors in a
criminal trial also may form an initial hypothesis or degree of belief in the guilt or innocence of the suspect,
but find these beliefs altered as further evidence is presented. Scientists form hypotheses and then design
experiments and use subsequent data to either strengthen or weaken (usually the former; see 5.4.2) their belief
in the hypothesis. In this process of refining beliefs over time, we can identify two important characteristics
that can sometimes work against the most accurate estimate of the “truth”: the anchoring heuristic and the
confirmation bias. Later in the chapter we will also show how the overconfidence bias amplifies these two
influences.

5.4.1 ANCHORING HEURISTIC The anchoring heuristic (Einhorn & Hogarth, 1982; Chapman & Johnson, 2002;
Joslyn et al., 2011; Kahneman & Tversky, 1973; Mosier, Sethi, et al., 2007) describes how, when cues bearing
on a hypothesis, or information sources bearing on a belief, arrive over time, the initially chosen hypothesis
tends to be favored, as if we have attached a “mental anchor” to that hypothesis and do not easily shift it away
to the alternative. If evidence a favors hypothesis A and b favors B, then receiving the evidence in the order
a→b should lead to a favoring of A, but receiving it in the order b→a will favor B. Such a tendency is
consistent with the general observation that “first impressions are lasting.”
One clear implication of the anchoring heuristic is that the strength of belief in one hypothesis over
another will be different, and may even reverse depending on the order in which evidence is perceived
(Adelman et al., 1996; Hogarth & Einhorn, 1992; Ricchiute, 1998). Allen (1982) has observed such reversals
as weather forecasters study meteorological data on the probability of precipitation, and Einhorn and Hogarth
(1982) have considered similar reversals as people hear evidence that is either supporting or damaging to a
particular hypothesis about an event, such as jurors hearing different pieces of evidence for the guilt or
innocence of a suspect (Ruva & McEvoy, 2008; Kahneman & Klein, 2009).
It should be noted that while anchoring represents a sort of primacy in memory (see Chapter 7), there is
also sometimes a recency effect in cue integration, in that the most recently encountered of a set of cues may,
temporarily, have a strong weighting on the diagnosis (Rieskamp, 2006). Thus the lawyer who “goes second”
in presenting closing arguments to a jury may well leave the jury with a bias toward that side, in making their
judgment of guilt or innocence (Davis, 1984).
Indeed, a careful review of studies and a program of experiments carried out by Hogarth and Einhorn
(1992) revealed that a number of factors tend to moderate the extent to which primacy (anchoring) versus
recency is observed when integrating information for a diagnosis. For example, primacy is dominant when
information sources are fairly simple (e.g., a numerical cue rather than a page of an intelligence report), and
the integration procedure is one that calls for a single judgment of belief after receiving all of the evidence,
rather than a revision of belief after each piece of evidence. However, to the extent that the sources are more
complex and hence often require an explicit updating after each source is considered, then recency tends to be
more likely.
To add to the complexity of this analysis, a case can be made that in many dynamic circumstances
recency is in fact more optimal (and anchoring less so) to the extent that the reliability of a given piece of
sampled information declines over time. Thus in a sequence of patient health status reports, those encountered
first, perhaps several hours old, should be somewhat discounted. Yet people do not do much of this age-
related discounting (Wickens, Ketels, et al., 2010), still showing primacy and anchoring.
Whether primacy or recency is observed, researchers arguing for such innovations as integrated graphics displays
for decision support (Bettman, Payne, & Staelin, 1986; Cook & Smallman, 2008; MacGregor & Slovic, 1986;
see also Chapter 12) or simultaneous displays of unit/price information for a number of comparable products
(Russo, 1977) have made a convincing case that, where possible, evidence that is available
simultaneously should be presented simultaneously and not sequentially (Einhorn & Hogarth, 1981). A
simultaneous format cannot guarantee that simultaneous processing will occur, which of course depends on
the breadth of attention and the operator’s own processing strategies. At least, however, it gives the operator
the option of dealing with the information in parallel if attentional capabilities allow or of alternating between
and revisiting different information sources, if they do not. In this manner, one information source is not given
automatic primacy (or recency) over others.

5.4.2 THE CONFIRMATION BIAS Evidence bearing on a hypothesis or belief may be either passively received or
actively sought. The confirmation bias describes a tendency for people to seek information and cues that
confirm the tentatively held hypothesis or belief, and not seek (or discount) those that support an opposite
conclusion or belief. Ambiguous cues (information that is totally undiagnostic within the framework
presented in Section 5.1) will be interpreted in a manner that supports the favored belief (Cook & Smallman,
2008; Einhorn & Hogarth, 1978; Herbert, 2010; Hope, Memon, & George, 2004; Mynatt, Doherty, &
Tweney, 1977; Nickerson, 1998; Schustack & Sternberg, 1981). This bias produces a sort of “cognitive tunnel
vision” in which operators fail to encode or process information that is contradictory to or inconsistent with
the initially formulated hypothesis, hence conferring even greater rigidity to the anchor.
The investigation into the USS Vincennes incident in the Persian Gulf revealed the confirmation bias at
work. Operators of the radar system hypothesized early on that the approaching aircraft was hostile, and they
did not interpret the contradictory (and as it turned out, correct) evidence offered by the radar system about
the aircraft’s neutral status (U.S. Navy, 1988). The analysis of the Three Mile Island incident also reveals a
confirmation bias for the operators to confirm their belief in the erroneous hypothesis of a high-water level in
the reactor (Rubenstein & Mason, 1979).
Arkes and Harkness (1980) demonstrated the selective biasing of memory induced by the confirmation
bias. They presented subjects with several symptoms related to a particular clinical abnormality (experiment
1) or to the state of a hydraulic system (experiment 2). Arkes and Harkness found that if the subject held a
hypothesis or made a positive diagnosis, the symptoms they had observed that were consistent with that
diagnosis were readily remembered, whereas inconsistent symptoms were more easily forgotten. Furthermore,
subjects erroneously reported seeing symptoms that they actually had not seen but that were consistent with
the diagnosis. Similar observations of false memories for consistent cues were made in a study of aviation
fault diagnosis by Mosier, Skitka, et al. (1998).
In a comprehensive review of the confirmation bias, Nickerson (1998) identified several possible reasons
for this failure to seek disconfirmatory evidence:
1. People have less cognitive difficulty dealing with positive information than with negative information
(Clark & Chase, 1972, see Chapter 6), and with the presence of information (a present cue that
supports what you already believe) than with its absence (the absence of a cue which, if present, would
support your belief), also reflecting cognitive effort. The process required to change hypotheses—
abandon an old one and reformulate a new one—requires a higher degree of effort than does the
repeated acquisition of information consistent with an old hypothesis (Einhorn & Hogarth, 1981).
Given a certain “cost of thinking” (Shugan, 1980) and the tendency of operators, particularly when
under stress, to avoid troubleshooting strategies that impose a heavy workload on limited cognitive
resources (Rasmussen, 1981), operators tend to retain an old hypothesis rather than go to the trouble of
formulating a new one, or even entertaining two hypotheses at one time, so long as accepting “the
chosen one” is consistent with most of the evidence (e.g., close to the truth).
2. There is a motivational factor related to the desire to believe. The high value that people place on
consistency of evidence leads them to see all (or most) evidence supporting one or the other belief, and
that belief is usually the one initially formulated.
3. A second motivational factor results when people focus more on the consequences of the logical
choice of action that would follow from the initially favored hypothesis, rather than the truth of that
hypothesis itself (Bastardi, Uhlman, & Ross, 2011). As we will see below, choices are inherently value
laden, given the likelihood of positive and negative outcomes that can flow from those choices in an
uncertain world. Lauren was inclined to believe that the weather would clear because the
consequences would be summiting and success of the expedition. Hence people may be inclined to
stick with (and try to confirm) the belief supporting choices whose outcomes, if the belief is true, are
less negative and more positive. As Nickerson says, “when using a truth seeking strategy [trying to
disconfirm] would require taking a perceived risk, survival is likely to take precedence over truth
finding.” Often, finding one’s beliefs to be wrong can be embarrassing.
4. In some instances it may be possible for operators to influence the outcome of actions taken on the
basis of the diagnosis, which will increase their belief that the diagnosis was correct. This is the idea of
the “self-fulfilling prophecy” (Einhorn & Hogarth, 1978). It might describe a teacher who, diagnosing
a child as “gifted,” will provide that child with sufficient extra opportunities and motivation so that
high academic performance will be almost guaranteed. It might also describe the scientist who,
believing a theory to be correct, will now design and carry out experiments that are most likely to
produce confirming evidence.
The issue is how to force a diagnostician simultaneously to entertain alternative hypotheses and to seek
disconfirming evidence or at least attend to it if it arrives—in short, to break through the cognitive tunnel.
This represents a major challenge to the designer of systems in which troubleshooting will be required.
Finally, we note in the context of both the confirmation bias and anchoring, the insidious role of the
overconfidence bias in amplifying the distorting influence of both. While this bias will be discussed in detail
later in Section 7.2, for now we consider that to the extent that people are more confident than they have a
right to be in their existing beliefs, then they will be even less likely to seek evidence that those beliefs may be
wrong, creating a sort of vicious cycle or “perfect storm” of these biases. This scenario was played out in the
conviction that Iraq possessed weapons of mass destruction, leading up to the Iraq war (Isakoff & Corn,
2006).

5.4.3 DECISION FATIGUE A third influence on decision making over time is known as decision fatigue (Tierney,
2011). Repeated decisions can often lead to decreased effort invested in accuracy and analysis. This
phenomenon was illustrated dramatically in an analysis of parole board decisions carried out by Danziger, Levav, and Pesso (2011), who observed that the probability of granting parole declined from 75 percent early in the morning to approximately 25 percent later in the day. Stated simply, the effort or cognitive resources required for careful decision analysis were depleted over time, such that the “effort-lite” default strategy of denying parole (essentially deciding not to decide) began to dominate.

5.5 Implications of Biases and Heuristics in Diagnoses


The previous sections may have painted a fairly pessimistic picture of the accuracy of the human as a
diagnostician, full of biases and heuristics that force beliefs away from “the truth.” Although such departures
are often observed and records are replete with examples of incorrect diagnoses (jury verdicts that have later
been found incorrect; Three Mile Island, USS Vincennes, misdiagnosed diseases), several qualifications need
to be applied to the view that humans are just “a bundle of biases.” First, as we noted above, many of the
heuristics are highly adaptive, for a decision maker who must work rapidly and cannot afford to invest a large
amount of mental effort (and/or time) to consider all the symptoms and all possible hypotheses (Payne,
Bettman, & Johnson, 1993). Indeed, heuristics are so often used by people precisely because most of the time
they do provide a correct or at least satisfactory outcome (Gigerenzer et al., 2002; Gigerenzer, 2002). If they were wrong more often than right, people would eventually abandon them (although see Section 8 below). Secondly, using the shortcuts offered by heuristics is often a necessity given the time constraints of a decision
environment. For example, the fire captain must depend upon the speed of the representativeness heuristic in
certain time-critical situations, when a delay in selecting an action can result in loss of life. And the
confirmation bias can at times provide a very useful and adaptive way of gathering information (Klayman &
Ha, 1987).
Finally, for all of the biases and heuristics described above, decision research has examined certain
conditions under which they may be modulated or eliminated entirely. For example, overconfidence in
forecasting appears to be eliminated from the forecasts offered by meteorologists (Murphy & Winkler, 1984;
but not by experts in many other professions, Shanteau, 1992; see Section 8). Anchoring may be reduced or
eliminated by properties of the cues (Hogarth & Einhorn, 1992). And there are great differences between
circumstances and people in the amount of overconfidence in diagnostic estimates (Paese & Sniezek, 1991).
What is most critical from the perspective of this book is that analysis of these sorts of biases can lead to
suggested training, procedural, and design remediations which can lessen their degrading impact on diagnosis
in the circumstances when those impacts may be severe, or safety compromising. We discuss these
remediations in the final section of this chapter.

6. CHOICE OF ACTION
Up to this point our discussion of decision making has focused on a collection of processes involved with
estimating the state of the world and diagnosing or making a situation assessment. These processes are
necessary to sustain effective decision making, but are not sufficient. As represented in Figure 8.1, the output
of decision making must also include a choice of some action. In this regard, the dichotomy of state
assessment and action choice is analogous to that discussed in Chapter 2 (signal detection), between the
evidence variable (representing the likelihood of a signal), and the response criterion (by which the evidence
variable was transformed into a dichotomous choice). Lauren, our climbing leader, assessed the difficulty and
safety of rock versus snow, and then chose the snow course of action.
One key feature of this choice, which is not relevant for diagnosis but was clearly represented in signal detection theory, is the value that the decision maker places on different possible outcomes. We consider below how people “should” combine, and how they actually do combine, information on value and probability to make
decisions, just as, in our discussion of signal detection theory, we considered how they combined information
on values and probabilities in setting beta for the decision of whether a signal was present or not. We discuss
first the nature of decisions that consider values only; then we consider the added complexity of combining
probability with value when examining decision making under uncertainty.

6.1 Certain Choice


When choosing which product to buy, or making Lauren’s choice of teammates for the expedition, the choice can often be conceptualized as in Figure 8.5, in which an array of possible objects (e.g., products) is compared,
each with varying attributes. For example, the set of personal computers to purchase may vary in their attributes of price, usability, maintainability, warranty, and so forth. In making such a choice that will maximize the consumer’s overall satisfaction, the decision maker should carry out the following steps:
1. Rank order the importance of each attribute (highest number, greatest importance). In Figure 8.5, the left attribute (price) is least important (1), the next attribute across (warranty) is number 4, and so forth.
2. Assess the value of each object on each attribute (highest number, greatest value). For example, the highest number would be for the least expensive product, the best warranty, etc.
3. For each object, assess the sum of the products of (value × importance), as is shown in the bottom of the figure.
4. Choose to purchase the object with the highest sum of products. As the calculation shows, in the example of Figure 8.5, this turns out to be object A.

FIGURE 8.5 Choice under certainty. The calculations at the bottom are based on a choice between only two objects, although the extended
rows and columns suggest that the procedure could generalize to many more objects and attributes.

This decision process is known as a compensatory one, in that a product which may be low on the most
important attribute (an expensive computer, when cost is most important), can still be chosen if this deficiency
is compensated for by high values on many other attributes of lesser importance. For example, the most expensive computer may have far and away the best user interface, the most reliable maintenance record, and the best warranty, allowing these strengths to compensate for the weakness in price.
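To make the arithmetic of these steps concrete, the following minimal Python sketch applies the compensatory (weighted-additive) rule to two hypothetical computers; the attribute names, importance ranks, and values are invented for illustration and are not the numbers shown in Figure 8.5.

```python
# Minimal sketch of the compensatory (weighted-additive) choice rule.
# Attribute importances and object values below are hypothetical examples.

importance = {"price": 1, "warranty": 4, "usability": 3, "maintainability": 2}

# Value of each object on each attribute (higher = better; e.g., the
# least expensive product gets the highest "price" value).
objects = {
    "A": {"price": 2, "warranty": 5, "usability": 4, "maintainability": 3},
    "B": {"price": 5, "warranty": 2, "usability": 3, "maintainability": 2},
}

def weighted_additive_score(values, weights):
    """Sum of (value x importance) across all attributes."""
    return sum(values[attr] * w for attr, w in weights.items())

scores = {name: weighted_additive_score(vals, importance)
          for name, vals in objects.items()}
best = max(scores, key=scores.get)
print(scores, "-> choose", best)
```

With these invented numbers object A happens to win, but the point is the procedure, not the particular outcome.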
While people may, in the long run, best satisfy their own expressed values by following the prescriptions
of the compensatory method, many choices in everyday life are made with much less systematic analysis,
following heuristics or other shortcuts (Lehrer, 2010). For example, the rule of satisficing (Simon, 1955) is one in which the decision maker does not go through the mental work to choose the best option, but rather chooses one that is “good enough” (Lehto, 1997), and this is often the strategy employed in real-world naturalistic decision
making, when there is time pressure (Klein, 1989, 1997; Mosier & Fischer, 2010).
A more systematic heuristic that people sometimes employ when the number of attributes and objects is
quite large, is known as elimination-by-aspects (EBA; Tversky, 1972). Here, for example, the most
important attribute is first chosen, then any product that does not lie within the top few along this attribute
(aspect) is eliminated from consideration, and then the remaining products are evaluated by comparing more
of the aspects of the remaining few objects. As a heuristic, this technique greatly reduces the cognitive effort of needing to compare all attributes across all objects. And it will usually prove satisfactory, failing to pick a good choice only if an object that is low on the most important attribute (and hence eliminated)
happens to be near the top on all others. Understandably, the EBA heuristic is one that begins to dominate
over time, as people suffer the effort depletion of decision fatigue (Tierney, 2011).
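A minimal sketch of how an elimination-by-aspects pass might look is shown below; the particular cutoff (keep only the top two objects on the most important aspect, then score the survivors with the compensatory rule) and the data are illustrative assumptions, since EBA can be implemented with many different cutoff rules.

```python
# Minimal sketch of an elimination-by-aspects (EBA) style shortcut.
# The "keep the top k" cutoff and the data are hypothetical illustrations.

importance = {"price": 1, "warranty": 4, "usability": 3, "maintainability": 2}
objects = {
    "A": {"price": 2, "warranty": 5, "usability": 4, "maintainability": 3},
    "B": {"price": 5, "warranty": 2, "usability": 3, "maintainability": 2},
    "C": {"price": 4, "warranty": 4, "usability": 2, "maintainability": 5},
}

def eba_then_compare(objects, importance, keep=2):
    # Most important attribute (the first "aspect").
    top_attr = max(importance, key=importance.get)
    # Eliminate any object not in the top `keep` on that aspect.
    survivors = dict(sorted(objects.items(),
                            key=lambda kv: kv[1][top_attr],
                            reverse=True)[:keep])
    # Only the survivors are compared on the remaining aspects
    # (here, with the full weighted-additive rule).
    scores = {name: sum(vals[a] * importance[a] for a in importance)
              for name, vals in survivors.items()}
    return max(scores, key=scores.get), scores

choice, scores = eba_then_compare(objects, importance)
print("survivors scored:", scores, "-> choose", choice)
```

Note that only the survivors need to be examined on the remaining aspects, which is where the saving in effort relative to the full compensatory comparison comes from.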

6.2 Choice Under Uncertainty: The Expected Value Model


Unlike those choices discussed in the previous section in which the consequences of the choice were relatively
well known, many decisions are made in the face of uncertainty regarding their future consequences. Such
uncertainty may result because we do not know the current state of the world; for example a physician may
choose a particular treatment, but be uncertain about the diagnosis. Lauren was uncertain of the avalanche
conditions of the snow route. Other uncertainties may result because the future cannot be foretold with certainty. Stock brokers, for example, are certainly limited in their ability to accurately predict future market forces prior to making investment
decisions (De Bondt & Thaler, 2002; Kahneman & Klein, 2009; Taleb, 2007).
Indeed we can often represent decision making under uncertainty as shown in Figure 8.6, by providing
the possible states of the world (A, B, C, …) across the top of a matrix, each associated with their estimated
probability or likelihood, and the possible decision options (1, 2, …) down the rows. The representation in
Figure 8.6 echoes three other analyses considered earlier. First, the estimated probabilities of states of the
world, can be thought of as being “passed on” from the degree of belief in one of two or more hypotheses, as
represented in Figure 8.3 and now shown at the top of Figure 8.6. Second, the matrix shares an analogous
form with the certain choice matrix shown in Figure 8.5, and indeed the computations for the optimal choice
are similar between the two matrices. Third, the matrix is in fact a direct analog to the signal detection theory
decision matrix discussed in Chapter 2, with its two states of the world and two choices. However, in the
context of the present chapter there may be more than two states of the world and more than two decision
options.
As you will recall, a key aspect of the discussion of signal detection theory was the setting of the optimal
beta, in a formula that was determined by the probability of the two states of the world, and by the costs and values of the outcomes associated with the four joint events. In Figure
8.6, these costs and values are represented by a value (V) (which can be either positive or negative) of the
outcome associated with the consequence of each decision option made in each state of the world. One might
consider for example the costs and benefits to shutting down a large power generating plant, under the
alternative states that either nothing is wrong (and a large expense is incurred in re-starting the plant, and
enduring a temporary power loss), or that the plant is failing and will suffer major damage if it continues in
operation.

FIGURE 8.6 Decision making under uncertainty. The decision option with the highest expected value will be that which maximizes Σ(V × P).

In the analysis of decision making under uncertainty, the exact same procedures as in signal detection
theory can be applied for maximizing the expected value of a choice, as long as the probabilities of the
different states of the world can be estimated, and as long as values can be placed within the different cells of
the matrix (there will be more than four cells if there are more than 2 states or 2 outcomes). The process by
which the optimum choice can be proposed involves following calculations analogous to those discussed in
the context of Figure 8.5:
1. The probability of each state of the world (P_S) is multiplied by the outcome value (V_XY) in each cell, assigning positive values to “good” outcomes and negative values to “bad” ones.
2. These [probability × value] products are summed across the states of the world to produce the expected value of each option.
3. The decision alternative with the greatest expected value is chosen.
To the extent that this option is chosen repeatedly over multiple opportunities to exercise the choice, and
that values are objective and known, then the algorithm will, over the long run, provide the greatest payoff.
Such an algorithm, for example, is well suited to apply to a gambling scenario, in which these conditions are
met; and it is indeed such an algorithm that is used by gambling casinos to guarantee that they receive a profit
(and therefore guarantee that the long term expected value for the gambling consumer is a loss).
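The following minimal sketch carries out these three steps for a hypothetical two-state, two-option version of the plant-shutdown example discussed above; the probabilities and dollar values are invented purely for illustration.

```python
# Minimal sketch of expected-value maximization (a Figure 8.6 style matrix).
# States of the world, probabilities, and outcome values are hypothetical.

p_state = {"nothing_wrong": 0.9, "plant_failing": 0.1}

# value[option][state]: positive = good outcome, negative = bad outcome (in $)
value = {
    "continue_running": {"nothing_wrong": 100_000, "plant_failing": -2_000_000},
    "shut_down":        {"nothing_wrong": -150_000, "plant_failing": -150_000},
}

def expected_value(option):
    """Steps 1 and 2: sum of probability x value across states of the world."""
    return sum(p_state[s] * value[option][s] for s in p_state)

evs = {opt: expected_value(opt) for opt in value}
best = max(evs, key=evs.get)   # Step 3: pick the option with greatest EV
print(evs, "-> choose", best)
```

With these invented numbers, continuing to run has the less negative expected value; a higher assumed probability of failure would reverse the choice.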

While expected value maximization is clear, simple, and objective, there are several factors that
complicate the picture when it is applied to most human decisions under uncertainty. First, it is not necessarily
the case that people want to maximize their winnings (or minimize their expected losses) over the long run.
For example, they may wish to minimize the maximum loss (i.e., avoid picking the option which has a
catastrophic negative outcome value). This is, of course, one reason why people purchase fire insurance and
avoid the decision option of “no purchase”, even though the expected value of the purchase option is negative
in the long run (if it were positive for the consumer, the insurance company would soon be out of business!).
Second, in many decisions it is not easy to assign objective values like money to the different outcomes. Cases in point are decisions regarding safety, in which consequences may be human injury, suffering, or the
loss of life. Third, as we discuss in the following section, people do not treat their subjective estimates of costs
and values as linearly related to objective values (i.e., of money). Fourth, people’s estimates of probability do
not always follow the objective probabilities that will establish long term costs and benefits.
In spite of these many departures from the maximum expected value choices in Figure 8.6, departures
which we discuss in more detail below, it remains important that we understand the optimal prescription of
expected value choices, given that, like the optimal beta, this prescription establishes a benchmark against
which the causes of different human departures can be evaluated (Kahneman, 1991), and given the high
frequency with which humans make decisions under uncertainty or risk. A few examples are:
• Does the company institute a costly safety program, or does it take gambles that its factory will not be
inspected and that an accident will not occur at the workplace?
• Do you purchase the expensive expanded warranty option for your new computer system, given the likely possibility that it may never fail?
• Does Lauren choose the snow over the rock route?
• Does the pilot continue flying through bad weather, or turn back?
• Does the student decide not to read the chapter, gambling that its material will not be covered on the
exam?
All of these are examples of risky decision making for which, if probabilities and values are known, the
procedures in Figure 8.6 could be applied. We now explore some of the departures or reasons why people
make choices that do not agree with the expected value model.

6.3 Heuristics and Biases in Uncertain Choice


Whether a choice is between two risky outcomes, or between a risk and a “sure thing” (i.e., an option for
which the outcome is known with certainty), decision-making research has revealed a number of ways in
which choices depart from the optimum payoff, prescribed by expected value theory. As with diagnosis
heuristics, these are not necessarily “bad,” and, indeed, some can be shown to be optimal under certain
circumstances. Understanding the variables that can moderate the strength of influences on subjective values
and probability perception can provide important guidance in improving decision making. We consider below
first a shortcut or heuristic related to direct retrieval that totally bypasses the explicit considerations of risk,
and then the forms of influence of human perception of value and of probability, which have been incorporated into a theory of choice known as prospect theory (Kahneman & Tversky, 1984).

6.3.1 DIRECT RETRIEVAL As we have noted in Section 2, many skilled decisions are made without much
conscious thought given to risks (probabilities and values). Choices of action may sometimes be implemented
simply on the basis of past experience. If the conditions are similar to those confronted in a previous
experience, and an action worked in that previous case, it may now be selected in the present case with
confidence that it will again produce a satisfactory outcome. This direct retrieval strategy is a hallmark of naturalistic decision making, to be discussed below. As well, it is a hallmark of operant conditioning. Indeed,
studies of decision makers in high stress realistic environments such as fire fighting (Klein, 1997; Klein et al.,
1996) reveal the prevalence of such decision making strategies. So long as the domain is familiar to the
decision maker, and the diagnosis of the state of the world is clear and unambiguous, the comparative risks of
alternatives need not be explicitly considered. Sometimes such an approach may be coupled with a mental
simulation (Klein & Crandall, 1995), in which the anticipated consequences of the choice are simulated in the
mind, to assure that they produce a satisfactory outcome. Good arguments can be made that such a direct
retrieval strategy like recognition primed decision making is in fact a highly adaptive one in a familiar domain
and if time pressure is high (Svenson & Maule, 1993).

6.3.2 DISTORTIONS OF VALUES AND COSTS: LOSS AVERSION As we have noted, expected value theory is based upon optimizing some function which, in the economic framework that has been used to analyze much of human decision making, uses money or objective value as its fundamental currency. But the way that people
actually make decisions suggests that they do not view money as a linear function of worth. Instead, much
human decision making can be better understood if it is assumed that humans are trying to maximize an
expected utility rather than expected value (Edwards, 1987), in which utility is the subjective value of
different expected outcomes. Within this context, the important principle of loss aversion specifies that
people are more concerned about the loss of a given amount of value (a greater loss in utility) than they appreciate a gain of the same amount (a smaller increase in utility) (Garling, 1989; McGraw et al., 2010). This
difference is explicitly represented as one important component of the prospect theory of decision making,
proposed by Kahneman and Tversky (1984) as shown in Figure 8.7, which relates objective value on the x-
axis to subjective utility on the y-axis. To the right, the figure represents the functions for utility gains
(receiving money or other valuable items). To the left, it represents the functions for losses. Certain features of
this curve nicely account for some general tendencies in human decision making.

FIGURE 8.7 The hypothetical relationship between value and utility.

The prominent difference in the slope of the positive (gains) and negative (losses) segments of the
function represents loss aversion: a potential loss of a given amount is perceived as having greater subjective
consequences, and therefore exerts a greater influence over decision-making behavior than does a gain of the
same amount. As an example to illustrate this difference, suppose you are given a choice between refusing or
accepting a gamble that offers a 50 percent chance to win or lose $1. Most people would typically decline the
offer because the potential $1 loss is viewed as more negative than the $1 gain is viewed as positive. As a
result, the expected utility of the gamble (as shown in Figure 8.6, the sum of the probability of outcomes times
their utilities) is a loss. Another example of loss aversion is what is called the “endowment effect” in which
people charge more for selling a product (they will lose the product, and their charge is the utility of the loss)
than they are willing to pay for it (the utility of the gain, Garling, 1989). The distinct asymmetry between
losses and gains appears to reflect operations within different regions of the brain (Lehrer, 2009).
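One way to see how an asymmetric utility function produces rejection of the even-odds gamble just described is to compute its expected utility directly. The functional form and parameters in the sketch below (a gain exponent near 0.88 and a loss-aversion coefficient near 2.25) are illustrative values commonly used in prospect-theory demonstrations, not values asserted in this chapter.

```python
# Minimal sketch: a loss-averse value (utility) function in the spirit of
# Figure 8.7, applied to a 50/50 gamble to win or lose $1.
# Functional form and parameters are illustrative assumptions.

def v(x, alpha=0.88, lam=2.25):
    """Concave for gains, convex and steeper for losses (loss aversion)."""
    return x ** alpha if x >= 0 else -lam * ((-x) ** alpha)

# Expected utility of the gamble: 0.5 * v(+1) + 0.5 * v(-1)
eu_gamble = 0.5 * v(1.0) + 0.5 * v(-1.0)
eu_refuse = v(0.0)   # refusing the gamble leaves utility unchanged

print(f"EU(gamble) = {eu_gamble:.2f}, EU(refuse) = {eu_refuse:.2f}")
# EU(gamble) is negative (about -0.62 here), so the gamble is declined even
# though its expected *value* in dollars is exactly zero.
```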
It is important to note that loss aversion is not consistently found, and that the greater impact of losses
can sometimes be accounted for by the greater attention paid to and arousal caused by information that
anticipates losses (Yechiam & Hoffman, 2012).
A second characteristic of the function in Figure 8.7 is that both positive and negative limbs are curved
toward the horizontal as they depart from zero, each showing that equal changes in value produce
progressively smaller changes in utility the farther one is from the zero point. This property makes intuitive
sense. The gain of $10 if we have nothing at all is more valued than the gain of the same $10 if we already
have $100. Similarly, we notice the first $10 we lose, more than an added $10 penalty to a loss that is already
$100. Thus, this property captures Weber’s Law of Psychophysics as applied to perceived value.

6.3.3 TEMPORAL DISCOUNTING Differences between value and utility are also reflected in a phenomenon known
as temporal discounting. Here people often make decisions or choose options that maximize the short term
gains (an immediate positive experience) rather than postponing them (a delayed utility) for an option that
may result in equal or even greater long term gains; this behavior reflects an implicit belief that the passage of time “discounts” those gains (Mischel, Shoda, & Rodriguez, 1989). Such behavior seems to explain the
attractiveness of borrowing on credit, to obtain an immediate goal (short term gain; Garling, 1989), rather than
postponing the goal’s receipt until cash is in hand. Temporal discounting appears to differ substantially
between people (Ersner-Herschfield et al., 2009). Of course there may be good legitimate reasons to
downweight the expected utility of postponed outcomes, in particular because the future is usually uncertain,
and less reliably predicted than is the present or immediate future (see discussion of prediction in Chapters 5
and 7). If the probability of future gains is less than of present gains, this difference can offset the greater
utility of future gains.
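Temporal discounting is often summarized with a simple discount function; the hyperbolic form and discount rate in the sketch below are illustrative assumptions rather than values given in the text.

```python
# Minimal sketch of temporal discounting: the present (discounted) utility of
# a delayed reward. The hyperbolic form and k value are illustrative assumptions.

def discounted_utility(amount, delay_days, k=0.05):
    """Hyperbolic discounting: subjective worth shrinks as delay grows."""
    return amount / (1.0 + k * delay_days)

now = discounted_utility(100, delay_days=0)      # $100 today
later = discounted_utility(120, delay_days=30)   # $120 in a month

print(f"today: {now:.1f}, in 30 days: {later:.1f}")
# Here the larger-but-later reward is worth less subjectively (120/2.5 = 48),
# so the smaller-but-sooner option is chosen, illustrating the short-term bias.
```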

6.3.4 PERCEPTION OF PROBABILITY We have noted at least three times previously that people’s perception of
probability is not always accurately calibrated. The “sluggish Beta” phenomenon discussed in Chapter 2, and
the representativeness heuristic discussed in this chapter, both illustrated a tendency to downweight the
influences of probability in detection and diagnosis, respectively, and we introduced the biases in judging proportions in Section 5.1.1. Consistent with these biases, in prospect theory Kahneman and Tversky (1984)
have suggested a function relating true (objective) probability to subjective probability (as the latter is inferred
to guide risky decision making) that is shown in Figure 8.8.
Four different aspects of this function are critical for understanding risky choice. The first, addressed in
Section 5.1.1, is the way in which the probability of rare events is often overestimated, which accounts for
two important departures from decision making to maximize expected value: (1) Why do people purchase
insurance (choosing a sure loss of money—the cost of the policy—over the risky loss of an accident or
disaster, which probably won’t happen), and (2) why do people gamble (sacrificing the sure gain of holding
onto money for the risky gain of winning)? The answer is that in both cases the risky events are quite rare (the
disaster covered by insurance or the winning ticket in the lottery), and hence as shown in Figure 8.8 their
probability is subjectively overestimated: The image of winning a gamble looms large, as does the possibility
of the disaster for which insurance is purchased. With a larger estimated probability input to the subjective
expected utility decision making function, the decision option which anticipates the objectively improbable
outcome is more likely to be made.

FIGURE 8.8 A hypothetical weighting function. The solid line represents estimates of subjective probability compared to the perfect
calibration of the dashed line.
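The qualitative shape of Figure 8.8 can be approximated with a one-parameter weighting function of the kind used in prospect-theory research; the particular functional form and parameter below are illustrative assumptions, chosen only because they reproduce the overweighting of rare events and the underweighting of larger probabilities described here.

```python
# Minimal sketch of a probability weighting function with the qualitative shape
# of Figure 8.8: rare events overweighted, most other probabilities underweighted.
# Functional form and gamma are illustrative assumptions.

def w(p, gamma=0.61):
    """Inverse-S weighting: w(p) > p for small p, w(p) < p for larger p."""
    return p ** gamma / ((p ** gamma + (1 - p) ** gamma) ** (1 / gamma))

for p in (0.01, 0.10, 0.50, 0.90):
    print(f"p = {p:.2f}  ->  subjective weight = {w(p):.2f}")
# 0.01 is weighted well above its true value, while 0.50 and 0.90 are weighted
# below theirs, matching the over/underestimation pattern in the text.
```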

We do note, however, as discussed in Section 5.1.1, that the probability of very rare events may be
underestimated if that subjective probability is derived primarily from experience rather than description, and
the event in question, because of its rarity, is never personally experienced (Hertwig & Erev, 2009). This
second aspect is reflected by the disconnect at the far left of the solid line in Figure 8.8.
The third feature is the relatively low (less than 1.0) slope of the function at its low probability end. This
“flat slope” characterizes the reduced sensitivity to probability changes underlying the “sluggish Beta” as well
as the representativeness heuristic and ignorance of base rates discussed in Section 5.3.2.
The fourth feature of the function in Figure 8.8 is the fact that for most of its range (i.e., except for the
very infrequent events discussed above), the function shows perceived probability as less than actual
probability. If the perceived probability that influences one’s decision is less than the true probability, then
when choosing between two options with positive outcomes, one risky and one certain, the probability of gain
associated with the positive risky outcome will be underestimated, and this will also cause the expected gain
of the risky option to be underestimated; therefore the bias will be to choose the sure thing. When choosing
between negative outcomes, the probability of the risky negative outcome will also seem less, the expected
loss of this option will be underestimated, and it will now be more likely to be chosen over the certain loss. It is this fourth feature which can be used to account for a very important effect or bias in choice, referred to as the framing effect or framing bias (Garling, 1989; Kahneman & Tversky, 1984; Mellers, Schwartz, & Cooke, 1998; Munichor, Erev, & Lotem, 2006), which we now discuss in detail.

6.3.5 THE FRAMING EFFECT In its simplest version, the framing effect accounts for how people’s preferences for outcomes and objects change as a function of how their description is framed (Tversky & Kahneman, 1981). For example, the same ground beef product will seem more attractive if it is described as 80 percent lean than
if it is described as 20 percent fat, even though the product is identical in the two descriptions. People will be
more likely to choose the beef (over some other meat) with the former description, framed in the positive, than
the negative. More seriously, a physician considering treatment of a severely ill patient may have the
treatment outcomes listed as a 98 percent chance of survival or a 2 percent chance of mortality. Again, both
options describe the same probabilistic outcome. But skilled medical personnel will tend to choose the
treatment (over the option of, for example, doing nothing) more often with the former positive frame than with
the 2 percent negative frame (McNeil, Pauker, et al., 1982).
In the above example, we considered the decision to use the treatment (which had a risky, probabilistic
outcome) versus doing nothing, whose outcome may be certain. Indeed the framing effect accounts for
people’s preferences when faced with a choice between a risk and a sure thing. A classic example, faced by
most of us at some time or another, is when we choose between adhering to some time- (or cost-) consuming
safety procedure (a sure loss), or adopting the risk of avoiding the procedure (driving too fast, running the red
light, failing to wear safety glasses) because the cost of compliance outweighs our expected benefits of
enhanced safety (avoiding the unexpected accident which the safety procedure is designed to prevent). The
framing effect as derived from Figure 8.8 accounts for the risk seeking bias when the choice is between the
negatives (risk and sure thing), but a risk aversion bias when the choice is between the positives (risk and sure
thing; Munichor, Erev, & Lotem, 2006; Simonsohn, 2009).
As a simpler example, if given the choice between winning $1.00 for sure (no risk) and taking a gamble
with a 50/50 chance of winning $2.00 or nothing at all (risky)—as we saw above—people typically choose
the certain option. They tend to “take the money and run.” However, suppose the word “winning” was
replaced by “losing,” so that the choice is between losses. This choice produces a so-called avoidance-
avoidance conflict, characteristic of the safety decision described above, and people here tend to choose the
risky option. They are risk seeking when choosing between losses.
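Reusing the kind of loss-averse, diminishing-sensitivity value function sketched in Section 6.3.2 (again an illustrative assumption, not the chapter’s own formula), the reversal just described falls out directly:

```python
# Minimal sketch of the framing (reflection) effect: risk averse for gains,
# risk seeking for losses. The value function below is an illustrative assumption.

def v(x, alpha=0.88, lam=2.25):
    """Concave for gains, convex (and steeper) for losses."""
    return x ** alpha if x >= 0 else -lam * ((-x) ** alpha)

# Gain frame: $1 for sure vs. a 50/50 chance of $2 or nothing.
sure_gain  = v(1.0)
risky_gain = 0.5 * v(2.0) + 0.5 * v(0.0)

# Loss frame: lose $1 for sure vs. a 50/50 chance of losing $2 or nothing.
sure_loss  = v(-1.0)
risky_loss = 0.5 * v(-2.0) + 0.5 * v(0.0)

print(f"gains:  sure = {sure_gain:.2f},  gamble = {risky_gain:.2f}")   # sure thing preferred
print(f"losses: sure = {sure_loss:.2f}, gamble = {risky_loss:.2f}")    # gamble preferred
```

Because the gains limb is concave and the losses limb is convex, the sure thing has the higher utility in the gain frame while the gamble has the higher (less negative) utility in the loss frame.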
The importance of these differences between perceived losses and gains is that a given change in value
(or expected value) may often be viewed either as a change in loss or a change in gain, depending on what is
considered to be the neutral point or frame of reference for the decision making; hence the title of the
framing effect. As we saw at the beginning of the chapter, Lauren saw her decision to abandon the summit as
a choice between losses. Her teammate gently rephrased this as a choice between gains and this reversed her
decision preference. As another example, a tax cut may be perceived as a reduction in loss if the neutral point
is “paying no taxes” or as a positive gain if the neutral point is “paying last year’s taxes” (Tversky &
Kahneman, 1981). As a consequence, different frames of reference used to pose the same decision problem
may produce fairly pronounced changes in decision-making behavior (Garling, 1989; Tversky & Kahneman,
1981). Puto, Patton, and King (1985) and Schurr (1987) noted that this kind of bias described the behavior of
professional buyers, given hypothetical investment decisions, just as aptly as it described the behavior of
typical laboratory subjects. McNeil, Pauker, et al., (1982) found that it also characterized the choices
physicians made between safer and riskier treatments.
The effects of framing in an engineering context can be illustrated by considering a process control
operator choosing between two courses of action after diagnosing a potentially damaging failure in a large
industrial process: continue to run while further diagnostic tests are performed or shut down the operation
immediately. The first action may be perceived to lead to a very large financial cost (serious damage to the
equipment) with some probability much less than 1.0. The second action will produce a substantial cost that is
almost certain but of lesser magnitude (start up costs, and lost production time). According to the framing
effect, when the choice is framed in this fashion, as the choice between losses, the operator would tend to
select the higher-risk alternative (continue to run) over the low-risk alternative (shut down) as long as the
expected utilities of the two actions are perceived to be similar. On the other hand, if the operator’s
perceptions were based on a framework of profits to the company (i.e., gains), the first, risky alternative
would be perceived as a probability mix of a full profit if nothing is wrong and a substantially diminished
profit if the disastrous event occurs. The second alternative would be perceived as a certain large (but not
maximum) profit. Within this positive frame, the choice would be biased toward the second, sure thing
alternative: shut the plant down.
The framing effect can also be used to account for the sunk cost bias (Arkes & Blumer, 1985;
Bazerman, 1998; Molden & Hui, 2011). Here, if we have made a bad decision, perhaps a poor investment,
and have already lost a great deal, then when confronted with the choice of whether to “get out” and cut the
losses, rather than continue with the investment, people will be more likely to continue (“throw good money
after bad”), even when it is in their economic interest to withdraw (a lower expected loss). Rationally, the
previous history of investment should not enter into the decision for the future. Yet it does. People faced with
the exact same choice but when they were not responsible for the initial investment decision (that had lost
utility) will be far more inclined to cut their losses and choose to terminate the investment (a sure loss). We
can see how this was illustrated by Lauren’s initial decision to push on toward the summit.
The interpretation of the sunk cost bias within the framing context is straightforward. For the investor
whose previous decision was poor, the choice is between a sure loss (get out now) and a risky loss (the bad
investment may turn good in the future, but is more likely to continue to worsen). For the newcomer,
encountering the same situation, but whose own utility had not been diminished by the bad decision, the “sure
thing” option is neither loss nor gain. Hence the choice is between 0 utility and an expected loss; a
circumstance that fairly easily leads to a bias to choose to terminate the investment.

6.4 The Decision to Behave Safely


The phenomenon of framing applies to a wide variety of risky choices made by people in society. As we have
noted, a common choice is whether or not to adhere to a particular safety regulation; wearing a seatbelt, a
protective helmet or harness, or some other behavior in the workplace. The sure “cost of compliance” is
always explicitly or implicitly compared against the expected negative utility of the more risky behavior. In
making such choices, it is important to bear in mind the influence of the framing effect—to the extent that
outcomes are viewed as negatives, the risky behavior may be chosen more often—as well as the two related
heuristics discussed in Section 5.3 which influence diagnosing the state of the world:
The availability heuristic indicates that the perceived frequency of different negative consequences of
unsafe behavior will be based not on their actual frequency (objective risk), but upon their salience in
memory, if those consequences were either directly experienced or learned through description. When these
do not correspond, risks can be seriously misestimated. The representativeness heuristic (and base rate
ignoring) suggests that we may not be very sensitive to the probability of disastrous consequences at all; and
indeed a study by Young, Wogalter, and Brelsford (1992) found that the perceived severity of a hazard has a
greater impact on risk estimation than does the probability of the hazard. Finally, it is the case that both
perceived severity and probability will be abstract experiences in making the choice, only possibly perceived
in the future. As temporal discounting suggests (see 6.3.3), their expected costs may be diminished. In contrast, the cost of compliance imposes a direct, tangible, and present experience (e.g., the discomfort of wearing a safety device or the inconvenience of adhering to a safety procedure), an experience that is highly accessible
(Kahneman & Frederick, 2002). This analysis suggests that risk mitigation efforts should be directed heavily
to reducing the cost of compliance more than increasing the perceived negative risks of the accident.
In addition to the “sure-thing versus risk” choice to behave unsafely, people also allow risks to enter into
their everyday safety decisions by balancing perceived risks, for example in their choice of transportation modes, of foods to eat, or of whether to behave in a way that is sensitive to climate change (Dotta, 2011). In analyzing such behavior, it is important to realize the substantial departures between people’s perception of relative risks and the true measures of risk (as, for example, defined by probability of death). As one example, death from a fall in the home is far more probable than death from an airplane crash; but people’s perceptions of these risks are often reversed (Combs & Slovic, 1979).
At least three factors appear to be responsible for the fact that people elevate their estimate of risk above
the true “objective” values associated with, for example probability of death. The first is the fact that
publicity, for example from the news media, tends to make certain risks more available to memory than
others (Combs and Slovic, 1979). Hence we observe the high perceived risks of well publicized events (like a
major plane crash or a terrorist bombing). Second, people’s perception of risk is driven upward by what is
described as a “dread factor” (uncontrollable, catastrophic consequences, unavoidable), and third, perceived
risk is inflated by an “unknown” factor, which characterizes the risk of new technology, such as genetic
manipulations and many aspects of automation (Slovic, 1987).
It is important for policy makers to consider these influences on the risk perceived by the public. But it is
equally important for all people who make choices based upon risk to consider the consequences of those
choices on scarce resource allocation (Keeney, 1988). For example the choice to allocate a large amount of
money to reduce one particular risk, whose objective risk is small (but perceived riskiness is large), may be
made at a cost of pulling those resources away from mitigating a much larger objective risk, whose subjective
perception is smaller.
An important way to mitigate risky behavior that occurs because the negative event is very rare (and hence never personally experienced), even though its negative consequences may be severe, is through “gentle reminders” (Hertwig & Erev, 2009). This technique imposes minor penalties—a gentle
reminder—for the risk-producing behavior (e.g., failing to heed a safety precaution) which will be
experienced much more frequently than the rare severe consequences. Such a technique has proven effective
in inducing more safety compliant behavior in hospitals.
In conclusion, we note that risk perception, and risk seeking are influenced by a host of other factors,
besides the framing of negative outcomes. For example, time stress appears to lead to more risk seeking (Chandler & Pronin, 2012), and Figner and Weber (2011) discuss other contextual and individual difference
factors that influence risk seeking.

7. EFFORT AND META COGNITION


Our treatment of decision making up to now has focused mostly on the external drivers of decision making—
problem structure, risk, values, and probability—as filtered by human cognition. However, as shown in Figure
8.1, there are two critical inputs to the decision process emanating from the decision maker himself or herself:
effort and meta-cognition. Because these two are interrelated, we treat them together here, even though meta-cognition was discussed in the previous chapter, and effort in Chapter 10.

7.1 Effort
In our discussion of decision fatigue, we emphasized that effective decision making often requires effort.
Resource-dependent working memory is necessary to diagnose and evaluate options. Decision making
competes for those resources with concurrent tasks (e.g., Sarno & Wickens, 1995; see Chapter 10), and
sustained decision making depletes that pool of resources or cognitive effort (Tierney, 2011). Indeed, it has
been shown that repeated decision making competes with the effort required for exerting self control in other
aspects of life (e.g., resisting temptations; Tierney, 2011). Not surprisingly then, the variety of decision-
making strategies will vary in their effort requirements (Bettman, Johnson, & Payne, 1990; Johnson & Payne,
1985; Payne, Bettman, & Johnson, 1993). In particular, heuristics, such as elimination-by-aspects or
representativeness, can be viewed as “effort-lite” versions of the more accurate, full compensatory choice
model (section 6.1) or base rate consideration (Section 5.4.1), respectively.
The effort required by, and accuracy observed with, these two classes of DM strategies are reflected
schematically in Figure 8.9, which indeed previews the concept of the performance-resource function, to be
discussed in Chapter 10. Within this context, effort itself can be viewed as a valuable resource to be
conserved. For example, as more resources are invested, performance with both elimination-by-aspects
heuristic and the compensatory algorithm will improve. However with a small investment of resources, the
“efficiency” of decision making (accuracy per resources invested) will be greater with the heuristic; and
greater efficiency can be considered as more optimal, when time or resources are scarce. Time pressure will
place greater premium on effort conservation. Thus the pilot who dithers in deciding what to do, as the plane
heads toward a hillside or is running out of fuel, will surely be considered to be non-optimal (Orasanu &
Fischer, 1997). The contingent model of decision strategies developed by Payne, Bettman, and Johnson (1993) predicts how different strategies will be chosen, contingent upon the available time (resources).

FIGURE 8.9 Effort, performance and heuristics in decision making. The figure shows the improvement in decision performance as a function
of more effort invested into the decision process for heuristics (solid line) and algorithms (dashed line). With small effort investment, heuristics
can produce better performance than algorithms.

Another important example of this contingency of decision strategy choice upon effort and accuracy
requirements is in the choice of whether to terminate a diagnosis or seek further (often confirmatory)
evidence, given the effort required for further information access (see also search termination discussed in
Chapter 3 Section 2.1). For example, in deciding whether or not a particular set of findings warrant inclusion
as a general principle in this text, the authors make decisions on whether it is worth the effort and time to go
back to the library and do further information search regarding the findings in question. What will be the
perceived gain in seeking more information (MacGregor, Fischhoff, & Blackshaw, 1987)? How much time will
it take me to do so? How confident am I now that I have made an appropriate diagnosis of the state of human
performance already, to include the principle in question as part of a chapter?
Of course the tradeoff between accuracy and effort in choosing a strategy is not always based on the
actual level of these variables, but instead is based on the anticipated accuracy and effort (Fennema &
Kleinmuntz, 1995; Seagull Xiao & Plasters, 2004). In this regard, research has revealed that people are not
fully calibrated, in relating the anticipation of accuracy and effort, to the actual accuracy achieved and effort
experienced (Fennema & Kleinmuntz, 1995).

7.2 Meta-cognition and (Over)confidence


The issues of anticipated effort and accuracy, and the conscious choice of a decision strategy, bring us to the important role of meta-cognition in decision making. What does the decision maker know (or think) about the accuracy of his or her diagnosis and choice? How does this anticipation influence the choice of strategy and subsequent decision-making behavior (including the choice not to decide at all, as in the case of the parole boards discussed in Section 5.4.3)? As Kahneman and Klein (2009) note, this is the role of the type 2 system:
to oversee, review, and audit the more automatic decision-making behavior of the type 1 system.
It turns out that one of the most critical and enduring influences on meta-cognition is the confidence in
assessing one’s own diagnosis and judgment. Such confidence is often unrealistically high, as manifested in the
overconfidence bias (Nickerson, 1998). In diagnosis, confidence judgments will influence the extent to which
we jump into action (choice), rather than seek more evidence, or prepare for the case in which the assessment
may have been wrong. In choice, confidence assessments will influence the extent to which we plan for
alternative actions (to the extent that we think our chosen action might have been wrong). In both cases, as
Griffin and Tversky (1992) state: “although overconfidence is not universal, it is prevalent, often massive, and
difficult to eliminate”. Several examples from different walks of life may be cited:
• The average driver estimates him/herself to be within the top 25th percentile of safe drivers (Brehmer,
1981). By definition, if confidence were calibrated, this should be 50 percent.
• Fischhoff (1977) and Fischhoff and MacGregor (1982) asked people to make predictions about future
events (e.g., elections, winners of athletic contests), and noted that, typically, whereas predictions
might turn out to be 60 percent accurate (evaluated after the event took place), the confidence offered
as to prediction accuracy would be more like 80 percent.
• Such overconfidence is not confined to novices in a field, as Tetlock (2005) performed a long term
study of experts in political forecasting, and observed similar overconfidence. This was just as
prevalent and severe as in novices making similar predictions.
• OC is well documented in the planning fallacy (Buehler, Griffin, & Ross, 2002). Here people are
eternally optimistic in their projections of how long it will take (or how many resources will be
required) to do something, from achieving a personal goal (like turning in an assignment on time), to
completing massive construction projects like the Denver International Airport or the Sydney Opera House. Indeed, in one study, students expressed 84 percent confidence that they would complete an assignment on time, whereas in fact only 40 percent did so (Buehler et al., 2002).
• Scientists are notoriously overconfident about the precision of their estimates of various physical
constants, such as the speed of light (Henrion & Fischhoff, 2002).
• Sulistyawati, Wickens, and Chui (2011) observed that those pilots who showed more overconfidence
in their situation awareness estimates were in fact less accurate in those estimates.
We have also encountered OC in other chapters: in Chapter 3, this was illustrated by the phenomenon of
“change blindness blindness” (Levin, Momen, et al., 2000), which describes people’s overconfidence in their
ability to detect unexpected events. In Chapter 5, we considered their overconfidence in detecting hazards at
night, leading to overspeeding (Leibowitz, Post et al., 1982). In eye witness testimony, discussed in Chapters
2 and 7, we learned of people’s general tendency to be overconfident in the accuracy of their own recognition
memory (Brewer & Wells, 2006; Wells, Lindsay, & Ferguson, 1979), and in learning itself (Chapter 7) people
tend to allow the ease of learning to act as a proxy for the ease of later recall (which it is not), and hence be
overconfident in the accuracy of their predicted level of recall (how well they will do on the test), thereby
underestimating their need for study (Bjork, 1999). In Chapter 10 we will encounter OC again in the context
of people’s confidence in their ability to time share while driving (Horrey, Lesch, & Garabet, 2009).

FIGURE 8.10 How confidence and overconfidence are driven by reliability. Each arrow represents the effect of some task variable on both decision accuracy and confidence, as described in the text.

Of course there is great variability between individuals and circumstances in the extent to which OC is
manifest, and we describe below some key moderating variables. First however we can formally represent OC
within the context of the accuracy-confidence calibration space as shown in Figure 8.10. When confidence
is expressed by predicted or judged accuracy (e.g., how well do you think you did on the test), then the two
variables, actual and predicted performance, can be presented on the same scale, and this graph can define the
region of OC as shown above and to the left of the diagonal line of perfect calibration. Furthermore Figure
8.10 illustrates a relatively common phenomenon by the dashed arrows, in which a variable that diminishes
accuracy fails to produce a parallel loss of confidence, and we see that this phenomenon often (top dashed
arrow) but not invariably (bottom dashed arrow) leads to OC. Somewhat less prevalent is the pattern
represented by the solid arrow, in which a variable influences confidence, even as accuracy is little changed.
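One simple way to locate a judge in the calibration space of Figure 8.10 is to compare mean confidence with mean accuracy across a set of judgments; the data and the difference score below are hypothetical illustrations rather than a standard named measure from the chapter.

```python
# Minimal sketch: locating a judge in the confidence-accuracy calibration space.
# Confidence ratings and correctness outcomes below are hypothetical data.

confidence = [0.9, 0.8, 0.8, 0.7, 0.9, 0.6]   # judged probability of being correct
correct    = [1,   0,   1,   0,   1,   0]     # whether each judgment was right

mean_confidence = sum(confidence) / len(confidence)
mean_accuracy = sum(correct) / len(correct)

# Positive values fall above the diagonal of Figure 8.10 (overconfidence),
# negative values below it (underconfidence).
overconfidence = mean_confidence - mean_accuracy
print(f"confidence = {mean_confidence:.2f}, accuracy = {mean_accuracy:.2f}, "
      f"over/underconfidence = {overconfidence:+.2f}")
```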
Research has now identified several variables that create overconfidence including the following in
particular:
1. Diagnostic or problem difficulty. This effect can be described in different ways. For example when
two hypotheses become less discriminable (more ambiguous cues), accuracy of diagnosis decreases, but confidence does not, echoing the pattern shown in the upper dashed arrow in Figure 8.10 (Fischhoff,
1977; Koehler, Brenner, & Griffin, 2002). Evaluating pilots’ diagnosis of aviation problems, Mosier et
al. (2007) found a relation paralleling that line. In domains where prediction is hard to make accurately
because of many uncertainties (stock brokers, politics, mental health), overconfidence is prevalent
(even by experts) whereas it is less so in more predictable domains like weather forecasting
(Kahneman & Klein, 2009; Taleb, 2007; Tetlock, 2005). So too, poorer drivers (for whom driving is,
by definition, a more difficult task) show more OC than better drivers (Kidd & Monk, 2009).
2. Evidence reliability. People are not very sensitive to differences in evidence reliability (as we saw with
the as-if heuristic; Griffin & Tversky, 1992) and are guided more by the strength of evidence than by
its reliability. Thus when reliability and performance decline (e.g., with samples of smaller N), people’s confidence in the message provided by this less reliable (lower information value) evidence does not decline accordingly. These changes all reflect differences along the upper dashed arrow.
3. In a pattern reflecting the solid arrow of Figure 8.10, when people rely on progressively more sources
of correlated information, they gain confidence (Kahneman & Klein, 2009). The problem is, when
information is highly correlated, errors (unreliability) in one source will typically co-occur in other
sources (e.g., a common failure may underlie both), and so confidence should not proportionately
increase. For example, consider two witnesses both depending on the same, unreliable source of
hearsay evidence.
4. Progressively more sources of information (whether correlated or not) will typically increase
confidence in a diagnosis. But as we discussed in Section 5.2.2, this often does not lead to an increase
in diagnostic accuracy.
In the above discussion of OC, we have examined differences in conditions that may differentially influence
confidence and accuracy. But we can also ask about differences between people. Are there certain classes of
people whose performance tends to occupy the upper left portion of the space? This issue is of particular
relevance to assessments of the accuracy of judicial eye witness testimony (Hope et al., 2004).

8. EXPERIENCE AND EXPERTISE IN DECISION MAKING


As we discussed earlier in this chapter, experts often (but not always) make better decisions than novices. As
we have noted above, this phenomenon is well captured by the study of naturalistic decision making
(Kahneman & Klein, 2009; Mosier & Fischer, 2010; Montgomery, Lipshitz, & Brenner, 2005; Zsambok &
Klein, 1997), which captures the experience-related differences associated with the two major stages of
decision making. In front end decision making (diagnosis), experts typically manifest recognition primed
decision making (RPDM). Here through repeated exposure to the same set of correlated cues, leading to the
same state assessment, experts are able to automatically classify the appropriate state, almost the same as the
automatic pattern recognition discussed in Chapter 6. Hammond et al. (1987) refer to this as holistic decision
making, a function associated with decision system 1 (Kahneman & Klein, 2009). Schriver, Morrow,
Wickens, and Talleur (2008) found that expert pilots were better able to exploit correlated cues in airplane
fault diagnosis than were novices. Their decision advantage was less pronounced when cues were
uncorrelated.
Also, as noted in Section 6.3.1, in back end decision making, experts can accomplish direct retrieval of
choices from long-term memory quite rapidly. What often worked before (given an RPDM situation assessment and a good outcome) will work again. This phenomenon was observed in the more rapid responses shown by expert pilots in the study by Schriver et al. (2008), with no sacrifice of accuracy. And yet, as we have seen, the success of expertise in DM is far from guaranteed (Tetlock, 2005). Cues may be uncorrelated, overconfidence may shortchange meta-cognitive monitoring, and rapid pattern-recognition classification may overlook a single outlying
cue.
Furthermore, as we have considered before, practice in decision making does not necessarily make
perfect, as it does in other skills. Expertise in some decision-making tasks does not guarantee immunity to
certain biases and heuristics (Kahneman & Klein, 2009; Taleb, 2007; Tetlock, 2005). Some assistance in
solving the puzzle as to why experienced decision makers are not perfect, and are sometimes no better than novices, is provided by Shanteau’s (1992) careful classification of the domains, and properties of those domains, that distinguish when expertise does develop from practice and when it does not (Table 8.1).
Kahneman and Klein (2009) in particular, have highlighted the extent to which expertise in decision
making (where experience helps) only emerges in domains such as weather forecasting, in which the pattern
of cue correlations is relatively strong, and different predicted states can be well discriminated.

Table 8.1 From Shanteau (1992)
Domains of “Good” Decision Making | Domains of “Poor” Decision Making
Weather Forecasting | Clinical Psychologists
Chess Masters | Personnel Selectors
Physicians | Parole Officers
Photointelligence Analysts | Stock Brokers
Accountants | Court Judges
Characteristics of the Domains:
Dynamic | Static
Decisions About Things | Decisions About People
Repetitive | Less Predictable
Feedback Available | Less Feedback
Decomposable Decision Problems | Not Decomposable

So why does decision making not improve much with experience in these other cases? Einhorn and
Hogarth (1978) have added insight to understanding the problems of learning in decision making,
characteristic of the right side of Table 8.1, by addressing the role of feedback in the typical decision-making
problem. As we noted in Chapter 7, feedback is critical for nearly any form of learning or skill acquisition.
Yet several characteristics of decision making prevent it from offering its usual assistance.
1. Feedback is often ambiguous, in a probabilistic or uncertain world. That is, sometimes a decision
process will be poorly executed, but because of good luck will produce a positive outcome; at other
times, a decision process can follow all of the best procedures, but bad luck produces a negative
outcome. In the first case, the positive reinforcement will increase reliance on the bad process, whereas
in the second case, the punishment, realized by the bad outcome, will extinguish the effective
processing that went into the decision.
2. Feedback is often delayed. In many decisions, such as those made in investment, or even prescribing
treatment in medicine, the outcome may not be realized for some time. As we discussed in Chapter 7,
added delay in feedback beyond a few minutes is rarely of benefit. In decision making the reason is
that, when the feedback finally arrives, the decision maker may have forgotten the processes and
strategies used to make the decision in the first place, and therefore may fail to either reinforce those
processes (if the feedback was good) or correct them (if the feedback was bad). Furthermore, because
feedback is delayed, decision makers may well have turned their attention to other problems, and will
devote less attention to processing the feedback than they would if it arrived immediately after the decision. Finally,
in a phenomenon that we know as “Monday morning quarterbacking” or “hindsight bias,” Fischhoff
(1977) and Woods et al. (1994) have documented the extent to which, after an outcome is known, we
revise our memory of what we knew before the decision was made in such a way as to downplay our
“surprise” at its outcome (“I knew it all along”). If we do not consider ourselves surprised by the
outcome (in hindsight), then we will foresee less reason to revise our decision process (i.e., learn from
the outcome).
3. Feedback is processed selectively. Einhorn and Hogarth have considered the learning of a decision
maker who is classifying applicants as either accepted into or rejected from a program, and who is learning
from feedback regarding the outcome of those who were selected (see Figure 8.11). In processing this
feedback, the decision maker will typically have available only the outcomes of those who were
admitted (and succeeded or failed), rarely learning whether the people excluded by his or her
decision-making rule would have succeeded had they been admitted.
Furthermore, the confirmation bias will tend to lead people to focus more attention on those who were
admitted and succeeded (therefore confirming that the decision rule was correct), than those who were
admitted and failed (therefore disconfirming the validity of the decision rule). As shown in Figure
8.11b, they may provide extra assistance to those admitted by their rule—influencing the outcome of
the decision to provide further confirmation of the correctness of the rule.

FIGURE 8.11 (a) Source of unwarranted confidence in prediction. A predicted score of applicants, reflecting the decision maker’s rule, is
shown on the x-axis. The actual measure of success is shown on the y-axis. (b) The influence of extra assistance to those admitted to the
program. Source: H. J. Einhorn and R. M. Hogarth, “Confidence in Judgment: Persistence of the Illusion of Validity,” Psychological Review, 85
(1978), p. 397. Copyright 1978 by the American Psychological Association. Adapted by permission of the authors.

9. IMPROVING DECISION MAKING


In reviewing the material we have covered in this chapter, one may characterize human decision making as
either generally “good” (by focusing on its many successes) or “faulty” (by focusing on its failures). While we
have no interest in taking a stand on this scale of evaluating human decision making, we believe that as long
as there is evidence that some decision making can be improved in some circumstances, it is the responsibility
of engineering psychology to recommend possible ways of supporting that improvement. We consider four
such techniques in this chapter related to training, proceduralization, displays, and automation.

9.1 Training Debiasing


As we saw above, pure practice at decision making is not necessarily an effective or efficient way of
improving its quality. Instead, research has focused on more targeted practice and instructions to remove or
reduce many of the biases discussed above, a technique known as debiasing (Fischhoff, 1977; 2002; Larrick,
2006; Lipshitz & Cohen, 2005). In a review of the debiasing literature, Larrick (2006) concluded that pure
instructions or exhortations to avoid biases are ineffective. Correspondingly, he found little evidence that
simply teaching people about biases (e.g., reading this chapter) is effective. This may produce "inert
knowledge," which can be understood but not transferred to practice. Instead, effective techniques focus not
only on instructing the nature of the particular bias in question, but also on providing specific examples and
practicing the debiasing strategies (Fong et al., 1991). The following are some specific examples of success.
Hunt and Rouse (1981) have succeeded in training operators to extract diagnostic information from the
absence of cues. In sequential cue information integration tasks, Lopes (1982) and Wickens, Ketels, et al.
(2010) successfully reduced anchoring through training, the latter instructing subjects about the reduced
reliability of older information (see 5.5.1).

Some success in reducing the confirmation bias has also been observed with the training strategy of
"consider the opposite" (Mussweiler et al., 2000). For example, Koriat, Lichtenstein, and Fischhoff (1980)
and Cohen, Freeman, and Thompson (1997) both found that forcing forecasters to entertain reasons why
their forecasts might not be correct reduced their bias toward overconfidence in the accuracy of the forecast.
Also successful is a kind of training aid designed to provide more comprehensive and immediate
feedback in predictive and diagnostic tasks, so that operators are forced to attend to the degree of success or
failure of their rules. We noted that the feedback given to weather forecasters is successful in reducing the
tendency for overconfidence in forecasting (Murphy & Winkler, 1984). Jenkins and Ward (1965)
demonstrated that providing decision makers simultaneously with data on all four outcomes of a decision like
that represented in Figure 8.11, instead of simply the hit probability, improves their appreciation of predictive
relations. Where selection tasks or diagnostic treatments are prescribed, box scores should be maintained to
integrate data in as many cells of the matrix as possible (Einhorn & Hogarth, 1978; Goldberg, 1968). Tversky
and Kahneman (1974) suggested that decision makers should be taught to encode events in terms of
probability rather than frequency since probabilities intrinsically account for events that did not occur
(negative evidence) as well as those that did.
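Because the benefit of seeing all four cells of the outcome matrix may be easier to appreciate with numbers, the following minimal sketch contrasts the feedback a selective decision maker actually sees with the full contingency table. The counts are invented for illustration (a hypothetical selection program), not data from the studies cited above.

```python
# Hypothetical sketch of why feedback restricted to admitted applicants can
# sustain an "illusion of validity" (cf. Figure 8.11 and Jenkins & Ward, 1965).
# All counts below are invented for illustration.

def success_rate(successes, total):
    """Proportion of successes, guarding against an empty cell."""
    return successes / total if total else float("nan")

# Full 2x2 outcome matrix: admitted vs. rejected x succeeded vs. failed.
admitted_success, admitted_fail = 70, 30   # feedback usually available
rejected_success, rejected_fail = 60, 40   # feedback usually NOT available

# Selective feedback: only the admitted column is ever seen.
seen_rate = success_rate(admitted_success, admitted_success + admitted_fail)

# Complete feedback: compare admitted and rejected success rates.
unseen_rate = success_rate(rejected_success, rejected_success + rejected_fail)

print(f"P(success | admitted) = {seen_rate:.2f}")    # looks impressive: 0.70
print(f"P(success | rejected) = {unseen_rate:.2f}")  # 0.60: the rule adds little
print(f"Apparent validity of the rule = {seen_rate - unseen_rate:.2f}")
```

Seeing only the admitted column (a 70 percent success rate) invites confidence in the selection rule; only the full matrix reveals how small the rule's contribution actually is.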
Finally, in an interesting take on debiasing training, Fischhoff (2002) described the success of some
training programs designed to reduce the prevalence of teens engaging in risky behavior (drinking, speeding).
Here he makes the point that such behavior, while actually not very frequent, is highly salient, compared to
the prevalence of safe behavior. As we have noted above, salient but rare described events may be
overestimated in their frequency. If training programs instead emphasize the high frequency of teens engaging
in safe behavior, the peer-pressure tendency to imitate that behavior (i.e., to behave safely) is increased.

9.2 Proceduralization
While debiasing is a form of training that often focuses people’s awareness directly on understanding the
sources of their cognitive limitations, proceduralization simply outlines prescriptions of techniques that
should be followed to improve the quality of decision making (Bazerman, 1998). This may include, for
example, prescriptions to follow the decision decomposition steps of diagnosis and choice theory, as shown
in Figures 8.5 and 8.6 (Larrick, 2006). Such a technique has been employed successfully in certain real-world
decisions which are easily decomposable into attributes and values, such as selecting the location of the
Mexico City airport (Keeney, 1973), or assisting land developers and environmentalists to reach a
compromise on coastal development policy (Gardner & Edwards, 1975). The formal representation of fault
tree and failure modes analysis (Kirwan & Ainsworth, 1992; Wickens, Lee, et al., 2004) is a procedure that
can assist the decision maker in diagnosing the possibility of different kinds of system failures. A study of
auditors by Ricchiute (1998) recommended a procedure by which evidence accumulated by a junior
auditor is compiled and presented to a senior auditor who makes decisions, in such a way as to avoid the
sequential biases often encountered in processing information (see Section 5.4).
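To make the idea of decision decomposition concrete, the following minimal sketch computes a weighted additive value for each option, in the spirit of the multi-attribute procedures referenced above. The attributes, weights, and option names are invented for illustration; they are not the attributes used in the airport or coastal-policy studies.

```python
# Hypothetical multi-attribute decomposition: each option is scored on each
# attribute (0-1), weights express relative importance, and the weighted sum
# guides the choice. All values below are invented for illustration only.

weights = {"cost": 0.4, "safety": 0.4, "noise_impact": 0.2}

options = {
    "site_A": {"cost": 0.6, "safety": 0.9, "noise_impact": 0.3},
    "site_B": {"cost": 0.8, "safety": 0.5, "noise_impact": 0.7},
}

def weighted_value(scores, weights):
    """Weighted additive value across the decomposed attributes."""
    return sum(weights[attr] * scores[attr] for attr in weights)

for name, scores in options.items():
    print(name, round(weighted_value(scores, weights), 2))
# site_A 0.66, site_B 0.66: a near tie, which usefully shifts the debate from
# gut impressions of the options to the weights placed on the attributes.
```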
In a way that integrates debiasing training and proceduralization, Lehrer (2010) has summarized research
to suggest five strategies for effective decision making:
1. Simple problems require reasoning. (Using one's "gut," reflected in the type 1 system, may be a part of
this, but type 2 system analysis can almost always help.)
2. Novel problems require reasoning. Given the type 1 system may not be available here, it is important
to examine past experience analytically to determine how these past decisions might advise a current,
complex decision.
3. Embrace uncertainty. Always entertain competing hypotheses. Always remind yourself of what you
don’t know.
4. You know more than you know. Once you have developed some level of expertise in an area, then it is
OK to trust your emotions and your “gut”, which can reflect the massively parallel processes in the
brain to suggest that certain choices may “seem right” and others are troublesome. But the type 2
system needs to audit these gut calls.
5. Think about thinking: the advocacy of meta-cognition.

9.3 Displays
There is good evidence that effective displays can support the front end of decision processes (cue integration
and diagnosis), by assisting the deployment of selective attention (Mosier & Fischer, 2010). For example,
Stone, Yates, and Parker (1997) observed that pictorial representations of risk data supported more calibrated
risk decisions than did numerical or verbal statements. Schkade and Kleinmuntz (1994), studying the decision
processes of loan officers, found that the format in which information regarding the attributes of different loan
applicants was structured influenced the nature of the judgments in a way suggesting that people minimized
the amount of attentional effort required for information integration. Cook and Smallman (2008) found that an
integrated graphical display of intelligence cues shown to professional intelligence analysts reduced the
confirmation bias, relative to a text-based presentation which implicitly suggested a sequential ordering (and
hence invited sequential biases).
The proximity compatibility principle (Wickens & Carswell, 1995), described in Chapter 3, is relevant to
effective decision making, prescribing that sources of information that need to be integrated in diagnosis be
made available simultaneously (not sequentially) and in close display proximity to each other, so that all can
be accessed with minimal effort. Emergent features of object displays can sometimes facilitate the integration
process in diagnosis (Barnett & Wickens, 1988). In this regard, we also saw in Chapter 4 that ecological
displays assisted professionals in the diagnosis stage of process control fault management, corresponding to
front end decision making (Burns et al., 2008).

9.4 Automation and Decision Support Tools


Finally, automation and expert systems have offered promise in supporting human decision making. This is
described in much more detail in Chapter 12, but to provide a link here, such support can be roughly
categorized into front end (diagnosis and situation assessment) and back end (treatment, choice, and course-
of-action recommendations) support. This dichotomy is well illustrated in the two major classes of medical
decision aids (Garg et al., 2005; Morrow, Wickens, & North, 2005), where both have enjoyed some modest
success. We also note here that procedures whereby humans estimate weights and cue values for diagnostic
problems, but computers perform the integration of those values (e.g., Dawes & Corrigan, 1974; Fischhoff,
2002) dictate a preferred allocation of function between human and automation in a cooperative human-
automation decision endeavor.

10. CONCLUSION AND TRANSITION


In conclusion, we see that decision making is complex and interactive, with different components invoking
common cognitive and information processing mechanisms (e.g., overconfidence in both diagnosis and
choice). The topic also links to earlier topics of attention, perception, and memory, as well as the topic of
limited resources that we will discuss in Chapter 10. At this time it is appropriate to turn our attention to
decisions of a more rapid and automatic sort, often studied in the laboratory in the context of reaction time.
Thus our focus now in Chapter 9 will be on the decisions that select and execute rapid actions, under some
degree of time pressure.

Key Terms
absence of a cue 256
accessibility 259
accuracy-confidence calibration space 277
anchoring heuristic 260
as-if heuristic 256
attribute substitution 260
availability heuristic 259
base rate 259
Bayesian 260
choice 248
choice of action 262
confirmation bias 261
cost of compliance 272
cue diagnosticity 254
cue reliability 254
debiasing 281
decision fatigue 263
diagnosis 248
elimination-by-aspects 265
endowment effect 269
expected value 249
extrapolating non-linear trends 251
frame of reference 272
framing effect 271
gambler’s fallacy 252
heuristics/biases 246
hindsight bias 250
holistic decision making 278
information processing 247
information value 254
loss aversion 268
mental simulation 268
meta-cognition 249
naturalistic decision making 246
normative decision making 247
overconfidence bias 263
performance-resource function 274
planning fallacy 276
prevalence rates 259
primacy 261
proceduralization 282
prospect theory 268
representativeness heuristic 258
risk 246
salience bias 256
satisficing 265
sunk cost bias 272
temporal discounting 270
uncertainty 246
utility 268

9 SELECTION OF ACTION

The previous chapter discussed the front end decision processes of diagnosis or situation assessment. These in
turn often lead to the back end process of action choice. In Chapter 8, this choice was generally deliberative,
slow, and often made in the face of uncertainty of its outcomes. Much attention was paid to its accuracy, but
not much was paid to how long the choice took to implement. This is typical of what Rasmussen (1986) has
described as knowledge-based behavior. However, we noted that, particularly in naturalistic decision making, the
choice is sometimes relatively more rapid and made without extensive deliberation. This type of choice,
characteristic of many routine medical or aviation decisions, illustrates rule-based behavior. Here an action is
selected by bringing into working memory a hierarchy of if-then rules: 'If X occurs, then do Y.' After
mentally scanning these rules and comparing them with the stimulus conditions, the decision maker will
initiate the appropriate action.
The current chapter focuses on actions selected by a third type of behavior, known as skill-based
behavior (Rasmussen, 1981). Here, following a relatively rapid perception of a stimulus or event (rather than
effortful scanning of multiple cues), with little uncertainty as to the state of the world, there is a rapid choice
of action (with generally little uncertainty as to its consequences). Skill-based behavior is typified by applying the brake of
a car upon seeing a yellow light, shutting down a piece of equipment when the emergency alert goes off, or
pressing a key (or set of keys) on a keyboard after seeing (or hearing) an element of the message that is to be
transcribed. Our quick-acting belayer in the story that opened Chapter 8 certainly demonstrated skill-based
behavior. Accuracy and errors are still important in skill-based behavior. (Consider the unfortunate sprinter
who errs in the skill-based response to the starting gun by committing a false start.) However, much greater
emphasis in skill-based behavior is placed on response time (RT). In the laboratory, this is often measured as
'reaction time,' although in this chapter we consider the former term to be a more generic one that characterizes
action in many applied workplaces.
Many different variables influence RT both inside and outside of the laboratory (Fitts & Posner, 1967;
Woodworth & Schlosberg, 1965). One of the most important is the degree of uncertainty about what stimulus
event will occur and therefore the degree of choice in the action to make. For the sprinter at the starting line of
a race, there is no uncertainty about the stimulus—the sound of the starting gun—nor is there a choice of what
response to make: to get off the blocks as fast as possible. On the other hand, for the driver of an automobile,
wary of potential obstacles in the road, there is both stimulus uncertainty and response choice. An obstacle
could be encountered on the left, requiring a swerve to the right; on the right, requiring a swerve to the left; or
perhaps at dead center, requiring that the brakes be applied. The situation of the sprinter illustrates the simple
RT task, the vehicle driver the task of choice RT.
Examples of simple RT do not frequently occur outside of the laboratory—the sprinter’s start or an
operator supervising a dangerous robotics operation, ready to shut down if anything goes wrong, are
examples. But the simple RT task is important for the following reason: all of the variables that influence RT
can be dichotomized into those that depend in some way on the choice of a response and those that do not;
that is, those that influence only choice RT and those that affect all reaction times. When the simple RT task is
examined in the laboratory, it is possible to study the second class of variables more precisely because the
measurement of response speed cannot be contaminated by factors related to the degree of choice. Hence in
the following treatment we will consider the variables that influence both choice and simple RT before
discussing those variables unique to the choice task.
After both sets of variables are discussed, we will consider what happens when several reaction times are
strung together in a series—the serial RT task and its manifestations beyond the laboratory. Finally, we will
address the causes of human error in responding.

1. VARIABLES INFLUENCING SIMPLE AND CHOICE RT


In the laboratory, simple RT is investigated by providing the subject with one response to make as soon as a
stimulus occurs. The subject may or may not be warned prior to the appearance of the stimulus. Four major
variables—stimulus modality, stimulus intensity, temporal uncertainty, and expectancy—influence response
speed in this paradigm.

1.1 Stimulus Modality


Several investigators have reported that simple RT to auditory stimuli is about 30 to 50 msec faster than to
visual stimuli presented in foveal vision (roughly 130 msec and 170 msec, respectively; Woodworth &
Schlosberg, 1965). This difference has been attributed to differences in the speed of sensory processing
between the two modalities. It should be noted that in most real-world designs, the auditory modality is more
favored for simple alerts because of its omnidirectionality; it can be processed with equal speed no matter how
the head is oriented. However, the nature of the environment and concurrent tasks must also be considered in
choosing between modalities, as discussed in Chapter 10.

1.2 Stimulus Intensity


Simple RT decreases with increases in intensity of the stimulus to an asymptotic value, following a function
as shown in Figure 9.1. Simple RT reflects the latency of a decision process that something has happened
(Fitts & Posner, 1967; Teichner & Krebs, 1972). This decision is based on the aggregation over time of
evidence in the sensory channel until a criterion is exceeded.

FIGURE 9.1 Relationship between stimulus intensity and simple reaction time.

In this sense, the simple RT is conceived as a two-stage process, as in signal detection theory discussed
in Chapter 2. Aggregation of stimulus evidence may be fast or slow, depending on the intensity of the
stimulus, and the criterion can be lowered or raised, depending on the ‘set’ of the subject. In the example of
the sprinter, a lowered criterion might well induce a false start if a random noise from the crowd exceeded the
criterion. After one false start, the runner will raise the criterion and be slower to start on the second gun in
order to guard against the possibility of being disqualified. This model then attributes the only source of
uncertainty in simple RT to be time or temporal uncertainty.

1.3 Temporal Uncertainty


The degree of predictability of when the stimulus will occur is called temporal uncertainty. This factor can be
manipulated by varying the warning interval (WI) occurring between a warning signal and the imperative
stimulus to which the subject must respond. In the case of the sprinter, two warning signals are provided:
‘Take your mark’ and ‘Set.’ The gunshot then represents the imperative stimulus. If the warning interval (WI)
is short and remains constant over a block of trials, then the imperative stimulus is highly predictable in time
and the RT will be short. In fact, if the WI is always constant at around 0.5 seconds, the subject can shorten
simple RT to nearly 0 seconds by synchronizing the response with the predictable imperative stimulus. On
the other hand, if the warning intervals are long or variable, RT will be long (Klemmer, 1957). Warrick et al.
(1964) investigated variable warning intervals as long as two and a half days! The subjects were secretaries
engaged in routine typing. Occasionally they had to respond with a key press when a red light on the
typewriter was illuminated. Even with this extreme degree of variability, simple RT was prolonged only to
around 700 msec.
Temporal uncertainty thus results from increases in the variability and the length of the WI. When the
variability of the WI is increased, this uncertainty is in the environment. When the mean length of the WI is
greater, the uncertainty is in the subjects’ internal timing mechanism since the variability of their estimates of
time intervals increases linearly with the mean duration of those intervals (Fitts & Posner, 1967).
Although warning intervals should not be too long, neither should they be so short that there is not
enough time for preparation. This characteristic is illustrated in a real-world example: the duration of the
yellow light on a traffic signal, the time that a driver has to prepare to make a decision of whether or not to
stop when the red signal occurs. In a study of traffic behavior at a number of intersections in the Netherlands,
Van Der Horst (1988) concluded that the existing warning interval (yellow light duration) was too short to
allow adequate preparation. When the duration was lengthened by one second at two selected intersections,
over a period of one year the frequency of red-light violations was reduced by half, with the obvious
implications for traffic safety. At the same time, Van Der Horst warns against excessively long warning
intervals because of the temporal uncertainty they present. This uncertainty, he notes, is a contributing cause of
the many warning signal violations at drawbridges, where a 30-second warning signal precedes the lowering
of the gate.

1.4 Expectancy
We saw that when a constant warning interval is long, RT is longer than when it is short, and when the
warning interval is varied over trials, mean RT is longer than when it is constant. But if we look at individual
RTs to different warning intervals within the varied set, then RT following a short WI is longer than that
following a longer WI (Drazin, 1961). This difference is due to expectancy. The longer you wait, the more
‘primed’ you are for an action (the lower the criterion), and so when the signal occurs, you act faster; but at
the possible cost of an error (the false start of the sprinter after a long pause before the starting gun).
The role of expectancy and warning intervals in RT is critical in many real-world situations. As we have
noted, yellow traffic lights provide warnings for the red light to come, and many cautionary road signs
(‘STOP AHEAD’) provide the same function. In his study of traffic behavior, Van Der Horst (1988)
compared constant-timed lights to lights with vehicle-controlled timing. The latter lights tend to remain green
when an approaching driver is sensed, and hence they maintain a more continuous flow of traffic. However,
they also increase the oncoming driver’s expectancy that the light will remain green. Consistent with the
predictions of the underlying expectancy principle, Van Der Horst found that such lights increase by a full
second the time at which the driver will stop when a yellow light does appear at any point prior to the
intersection. That is, lower expectancy of yellow seems to add a full second to the stop-response RT.
In all of the circumstances described above, RT was measured given the person’s expectancy that the
imperative stimulus (red light, starting gun) could indeed occur, even if its time of arrival was not expected. In
the real world, however, there is another class of events that appear to be so unexpected that the operator
simply does not envision their occurrence. Taleb (2007) describes these as ‘black swan’ events. Here,
response times are extremely long, on the order of several seconds (Wickens, Hooey, et al., 2009). One
example of the response to such a ‘truly surprising’ event might be the ‘emergency stop RT’—the time
required for a driver to press the brake following the sudden appearance of a totally unexpected roadway
obstacle. Such RTs are estimated to be in the range of two to four seconds, with some individuals responding
considerably more slowly (Summala, 1981; Dewar, 1993). Moreover, as we saw in Chapters 2 and 3, the stimuli for
such very rare events are often missed altogether by perception.

2. VARIABLES INFLUENCING CHOICE REACTION TIME


When actions are chosen in the face of environmental uncertainty, a host of additional factors related to the
choice process itself influences the speed of action. In the terms described in Chapter 2, the operator is
transmitting information from stimulus to response. This characteristic has led several investigators to use
information theory to describe the effects of many of the variables on choice reaction time.

2.1 The Information Theory Model: The Hick-Hyman Law


It is intuitive that more complex decisions or choices require a longer time to initiate. A straightforward
example is the difference between simple RT and choice RT, in which there is uncertainty about which
stimulus will occur and therefore about which action to take. More than a century ago, Donders (1869, trans.
1969) demonstrated that choice RT was longer than simple RT. The actual function that related the amount of
uncertainty or degree of choice to RT was first presented by Merkel (1885). He found that RT was a
negatively accelerating function of the number of stimulus-response alternatives. Each added alternative
increases RT, but by a smaller amount than the previous alternative.
The theoretical importance of this function remained relatively dormant until the early 1950s, when in
parallel developments Hick (1952) and Hyman (1953) applied information theory to quantify the uncertainty
of stimulus events. Recall from Chapter 2 that three variables influence the information conveyed by a
stimulus: the number of possible stimuli, the probability of a stimulus, and its context or sequential
constraints. These variables were also found by Hick and Hyman to affect RT in a predictable manner. First,
both investigators found that choice RT increased linearly with stimulus information—log2 N, where N is the
number of alternatives—in the manner shown in Figure 9.2a. RT increases by a constant amount each time N
is doubled or, alternatively, each time the information in the stimulus is increased by one bit. When a linear
equation is fitted to the data in Figure 9.2a, RT can be expressed by the equation RT = a + bHs, a relation
often referred to as the Hick-Hyman law. The constant b reflects the slope of the function—the amount of
added processing time that results from each added bit of stimulus information to be processed. The constant
a describes the sum of those processing latencies that are unrelated to the reduction of uncertainty. These
would include, for example, the time taken to encode the stimulus and to execute the response.

FIGURE 9.2 The Hick-Hyman law of choice reaction time: RT = a + bHs. (a) RT as a function of the number of alternatives, (b) RT for two
alternatives of different probabilities.

If the Hick-Hyman law is valid in a general sense, a function similar to that in Figure 9.2 should be
obtained when information is manipulated by various means, as described in Chapter 2. Both Hick (1952) and
Hyman (1953) varied the number of stimulus-response alternatives, N. Thus the points representing 1, 2, and
3 bits of information on the x-axis of Figure 9.2a could be replaced by the values log2 2, log2 4, and log2 8,
respectively. Hyman further demonstrated that the function was still linear when the average information
transmitted by stimuli during a block of trials was manipulated by varying the probability of stimuli and their
sequential expectancy. If probability is varied, then when N alternatives are equally likely, as described in
Chapter 2, information is maximum (i.e., four alternatives yield two bits). When the probabilities are
imbalanced, the average information is reduced. Hyman observed that the mean RT for a block of trials is
shortened by this reduction of information in such a way that the new, faster data point still lies along the
linear function of the Hick-Hyman law.
Choice RT is also strongly influenced by expectancy (which, in turn, is influenced by the probability of
the stimulus event). If we expect to make a right turn because we always do, we will be fast in initiating that
action and slow when a left turn is suddenly signaled. In information theory terms, the expected event
contains less information than the surprising one. If there are two events, the occurrence of an expected one
(e.g., that which occurs 80 percent of the time) conveys less than one bit, whereas the surprising one conveys
more than one bit. But if we measure RT to each of these events, the RT measure will still fall directly on the
line predicted by the Hick-Hyman law as in Figure 9.2b.
Thus, the Hick-Hyman Law seems to capture the fact that, in many circumstances, the human has a
relatively constant rate of processing information, defined by the inverse slope (1/b) of Figure 9.2: a constant
number of bits/second.
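As a concrete illustration, the following minimal sketch computes the average stimulus information Hs for equally and unequally probable alternatives and the RT predicted by the Hick-Hyman law. The intercept and slope values are assumed for illustration only; they are not empirical constants from the studies above.

```python
import math

def stimulus_information(probabilities):
    """Average information Hs (bits) = sum over events of p * log2(1/p)."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

def hick_hyman_rt(probabilities, a=0.2, b=0.15):
    """Predicted RT (sec) = a + b * Hs; a and b are illustrative values only."""
    return a + b * stimulus_information(probabilities)

# Four equally likely alternatives: Hs = 2 bits.
equal = [0.25] * 4
# Four alternatives with one highly expected event: Hs < 2 bits, so mean RT is faster.
unequal = [0.70, 0.10, 0.10, 0.10]

print(round(stimulus_information(equal), 2), round(hick_hyman_rt(equal), 3))      # 2.0, 0.5
print(round(stimulus_information(unequal), 2), round(hick_hyman_rt(unequal), 3))  # ~1.36, ~0.40
# A single expected event (p = .8) conveys -log2(.8), about 0.32 bits; a surprising
# event (p = .2) conveys about 2.32 bits, so responses to it are predicted to be slower.
```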

2.2 The Speed-Accuracy Trade-off


In RT tasks, and in speeded performance in general, people often make errors. Furthermore, they tend to make
more errors as they try to respond more rapidly. This reciprocity between time and errors is referred to as the
speed-accuracy trade-off (Drury, 1994; Fitts, 1966; Pachella, 1974; Wickelgren, 1977). In previous chapters
we saw this very clearly manifest in visual search tasks, such as luggage X-ray screening (McCarley, 2009),
where speed stress will terminate visual sampling, and hence lead to missed targets. Indeed, the SATO is fairly
ubiquitous in human performance, shown robustly in many visual tasks (Drury, 1996), decision tasks (Mosier
et al., 2007; Orasanu & Fischer, 1997), motor tasks (Fitts & Deininger, 1954; see Chapter 5), skimming text
(Duggan & Payne, 2009), and sports tasks (Beilock et al., 2008), as well as in everyday life (e.g., completing
an assignment). At a macro level, one can think of the tradeoff in many industries between safety and
productivity. Safety is generally preserved by preventing errors, whereas productivity is typically achieved by
working fast (Drury, 1996). And company or organizational policy can often induce a shift in the work force
from one to the other, although the tradeoff is far from inevitable. The SATO is manifest differently across
different kinds of tasks, and Drury (1994) has noted how visual search is the task that most strongly expresses
a SATO.

2.2.1 THE SPEED-ACCURACY OPERATING CHARACTERISTIC RT and error rate represent two dimensions of the
efficiency of processing information. These dimensions are analogous in some respects to the dimensions of
hit and false-alarm rate in signal detection (Chapter 2). Furthermore, just as operators can adjust their response
criterion in signal detection, so they can also adjust their set for speed versus accuracy to various levels
defining ‘optimal’ performance under different occasions, as the preceding examples demonstrated. The
speed-accuracy operating characteristic, or SAOC, is a function that represents RT performance in a
manner analogous to the receiver operating characteristic (ROC) representation of signal detection
performance.
Conventionally, the SAOC may be shown in one of two forms. In Figure 9.3, the RT is plotted on the x-
axis and some measure of accuracy (the inverse of error rate) on the y-axis (Pachella, 1974). The four
different points in the figure represent mean accuracy and RT data collected on four different blocks of trials
when the speed-accuracy set is shifted. From the figure, it is easy to see why information transmission is
optimal at intermediate speed-accuracy sets. When too much speed emphasis is given, accuracy will be at
chance, and no information will be transmitted at all. When too much accuracy stress is given, performance
will be greatly prolonged with little gain in accuracy. Indeed, investigations by Fitts (1966) and Rabbitt
(1989), using RT, and by Seibel (1972), employing typing, also conclude that performance efficiency reaches
a maximum value at some intermediate level of speed-accuracy set. These investigators conclude furthermore
that operators left to their own devices will seek out and select the level of set that achieves the maximum
performance efficiency (Howell & Kreidler, 1964).

FIGURE 9.3 The speed-accuracy trade-off.

This characteristic has an important practical implication concerning the kind of accuracy instructions
that should be given to operators in speeded tasks such as typing or keypunching. Performance efficiency will
be greatest at intermediate levels of speed-accuracy set. It is reasonable to tolerate a small percentage of errors
in order to obtain efficient performance, and it is probably not reasonable to demand zero defects, or perfect
performance. We can see why this is so by examining the speed-accuracy trade-off plotted in Figure 9.3.
Forcing the operator to commit no errors whatsoever could induce intolerably long RTs.
An important warning to experimenters emphasized by Pachella (1974) and Wickelgren (1977) is also
implied by the form of Figure 9.3. If experimenters instruct their subjects to make no errors, they are forcing
them to operate at a region along the SAOC in which very small changes in accuracy generate very large
differences in latency since the slope of the right-hand portion of Figure 9.3 is almost flat at that level. Hence,
RT will be highly variable, and the reliable assessment of its true value will be a difficult undertaking.
From an applied human factors perspective, one important aspect of the speed-accuracy trade-off is its
usefulness in deciding what is ‘best.’ Suppose, for example, that lines A and B in Figure 9.4 described the
performance of operators on two data entry devices in which accuracy has been transformed to the log of the
odds of a correct response. (This transformation makes the curve of figure 9.3 into a linear function, Pew,
1969). From the graph, there is no doubt that A supports better performance than B. But suppose the
evaluation had only compared one level on the SAOC of each device and produced the data of point 1 (for
system B) and point 2 (for system A). If the evaluator examined only response time (or data-entry speed), he
or she would conclude that B is the better device because it has shorter RT. Even if the evaluator looked at
both speed and accuracy, any conclusion about which is the superior device would be difficult because there is
no way of knowing how much of a trade-off there is between speed and accuracy, unless the trade-offs are
actually manipulated. If SAOCs are not actually created, it is critical to keep the error rate (or the latency) of
the two systems at equivalent levels to one another and to the real-world conditions in which the systems (and
their operators) are expected to operate.
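The log-odds transformation mentioned above (after Pew, 1969) is easy to compute. The sketch below uses invented accuracy values to show how each accuracy level would be replotted on the linearized SAOC; the specific proportions are for illustration only.

```python
import math

def log_odds(p_correct):
    """Transform proportion correct into log odds, log(p / (1 - p))."""
    return math.log(p_correct / (1.0 - p_correct))

# Invented accuracy levels from four speed-accuracy instruction conditions.
for p in (0.60, 0.75, 0.90, 0.99):
    print(f"accuracy {p:.2f} -> log odds {log_odds(p):.2f}")
# Near-perfect accuracy (0.99) maps to a large log-odds value (about 4.6), which is
# why small accuracy changes near ceiling correspond to large shifts along the SAOC.
```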

FIGURE 9.4 The speed-accuracy operating characteristic (SAOC). Lines A and B represent two different SAOCs. Points 1 and 2 are different
‘styles’ of responding along the SAOCs.

System designers should also be aware that certain design features seem automatically to shift
performance along the SAOC. For example, redundant presentation across modalities (simultaneous text and
speech) appears to improve accuracy, but sometimes slows the speed of processing (Wickens, Prinet, et al.,
2011). Presenting more information, of greater precision, on a visual display will often lead to more accurate
performance (assuming that information is used by the operator) but at a greater cost of time. For example,
magnifying the displayed error in a target-aiming task will prolong the aiming response, as we discussed in
Chapter 5. Touch screens gain speed but at the cost of accuracy (Baber, 1997). Using SAOC analysis, Strayer,
Wickens, and Braune (1989) showed that older adults were less rapid in responding than younger ones, but
they also operated at a more conservative, accuracy-emphasis portion of the SAOC.
The stress induced by emergency conditions sometimes leads to a speed-accuracy trade-off such that
operators are disposed to take rapid but not always well-conceived actions. It is for this reason that regulations
in some nuclear power industries require controllers to stop and take no action at all for a specified time
following a fault, thereby encouraging an accuracy set on the speed-accuracy trade-off. In aviation decision
making, experts were found to be slower (in diagnosis) than novices, but more accurate (Orasanu and Strauch,
1994). Orasanu & Fischer (1997) note that pilots who are good decision makers are more effective than poor
decision makers in moderating their speed-accuracy set based upon external conditions and time availability.
There is an important exception to the SATO, which might be described as the speed-accuracy trade-on
(SATON). For example, good design can produce faster and more accurate performance than poor design
(e.g., stimulus-response compatibility violations as we see later in this chapter). Beilock et al. (2008) studied
the SATON as reflected in the expertise effect in sports. Here experts (but not novices) may be more accurate
if less time is given for an action (e.g., golf putting). We will see further examples of the SATON when we
discuss the micro-trade-off in the following section.

2.2.2 THE SPEED-ACCURACY MICRO-TRADE-OFF The general picture of the SATO presented above suggests that
conditions or sets in which speed is emphasized tend to produce more errors. A different way of looking at the
speed-accuracy relationship is to compare the accuracy of fast and slow responses within a block of trials,
using the same system (or experimental condition). (Alternatively one can compare the mean RT of correct
and error responses.) This comparison describes the speed-accuracy micro-trade-off. Its form depends on
what varies most from trial to trial. On the one hand, when the criterion varies, this produces a pattern typical
of the macro trade-off (faster responses are more error prone). Indeed, sometimes the criterion can be so low
that a response is essentially a ‘fast guess’ in which a random response is initiated as soon as the stimulus is
detected (Gratton et al., 1988; Pachella, 1974). The nature of this fast guess is usually that of the most
probable response. This positive micro-trade-off between reaction time and accuracy seems to be
characteristic of most speeded tasks when RTs are generally short and stimulus quality is good.
In contrast, Wickens (1984) concludes that when stimulus evidence is relatively poor (as in many signal
detection tasks) or processing is long and imposes a working memory load (as in many decision tasks), the
opposite form of the micro-trade-off is more likely to be observed. Fast responses are no longer more error-
prone and may even be more likely to be correct. When there is generally poor signal quality, the responses on
some trials will be longer because more processing is required to identify the signal; but this poor quality also
makes an error more likely. When decision tasks impose memory load, anything that delays processing
imposes a greater (longer) memory load, which yields poorer decision quality. Hence, the SATON form of the
micro-trade-off is observed: error responses tend to be slower than correct ones.

2.3 Stimulus Discriminability


RT is lengthened as the stimuli in a set are made less discriminable from one another (Vickers, 1970). Tversky
(1977) has argued that we judge the similarity or difference between two stimuli on the basis of the ratio of
shared features to total features within a stimulus, and not simply on the basis of the absolute number of
shared (or different) features. Thus, the numbers 4 and 7 are quite distinct, but the numbers 721834 and
721837 are quite similar, although in each case only one digit differentiates the pair. Discriminability
difficulties in RT, like confusions in memory (see Chapter 7), can be reduced by deleting shared and
redundant features where possible. In Chapter 4, we saw this to be the case with graph labels.
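Tversky's feature-ratio idea can be illustrated with a small sketch. Treating each digit position as a feature is a simplification, and the shared-over-total measure below is only a rough stand-in for his full account, but it shows why the long strings are harder to tell apart.

```python
def feature_ratio_similarity(a, b):
    """Rough similarity: shared digit positions divided by total positions."""
    assert len(a) == len(b)
    shared = sum(1 for x, y in zip(a, b) if x == y)
    return shared / len(a)

print(feature_ratio_similarity("4", "7"))            # 0.0: fully distinct
print(feature_ratio_similarity("721834", "721837"))  # ~0.83: highly similar
# In both pairs only one digit differs, but the long strings share five of six
# features, which is why they are slower to discriminate in an RT task.
```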

2.4 The Repetition Effect


Several investigators have noted that in a random stimulus series, the repetition of a stimulus-response (S-R)
pair yields a faster RT to the second stimulus than does an alternation. For example, if the stimuli were
designated A and B, the response to A following A will be faster than to A following B (e.g., Hyman, 1953).
Thus, we may see the mail sorter in a post office becoming progressively faster as each letter encountered has
the same zip code. The advantage of repetitions over alternations, referred to as the repetition effect, appears
to be enhanced by increasing N (the number of S-R alternatives), by decreasing S-R compatibility (see
below), and by shortening the interval between each response and the subsequent stimulus (Kornblum, 1973).
Research by Bertelson (1965) and others (see Kornblum, 1973 for a summary) suggests that the response to
repeated stimuli is speeded both by the repetition of the stimulus and by the repetition of the response.
There are two important circumstances in which the repetition effect is not observed. (1) As summarized
by Kornblum (1973), the repetition effect declines with long intervals between stimuli and may sometimes be
replaced by an alternation effect (faster RTs to a stimulus change). In this case, it appears that the gambler’s
fallacy discussed in Chapter 8 takes over. People do not expect a continued run of stimuli of the same sort,
just as gamblers believe that they are ‘due for a win’ after a string of losses. (2) As we discuss later in the
chapter, in some transcription tasks, such as typing, rapid repetition of the same digit or even digits on the
same hand will be slower than alternations (Sternberg, Kroll, & Wright, 1978).

2.5 Response Factors


Two characteristics of the response appear to influence RT. (1) RT is lengthened as the confusability between
the responses is increased. Thus, for example, Shulman and McConkie (1973) found that two-choice RTs
executed by two fingers on the same hand were slower than those executed by the fingers on opposite hands,
the former pair being less discriminable from one another. Similarly, distinct shape and feel of a pair of
controls reduces the likelihood of their being confused. (2) RT is lengthened by the complexity of the
response. For example, Klapp and Irwin (1976) showed that the time to initiate a vocal or manual response is
directly related to the duration of the response. Sternberg, Kroll, and Wright (1978) found that it takes
progressively longer to initiate the response of typing a string of characters as the number of characters in the
string is increased.

2.6 Practice
Consistent results suggest that practice decreases the slope of the Hick-Hyman law function relating RT to
information (i.e., increases the information transmission rate). In fact, compatibility (to be discussed below)
and practice appear to trade-off reciprocally in their effect on this slope. This trade-off is nicely illustrated by
comparing three studies. Leonard (1959) found that no practice was needed to obtain a flat slope with the
highly compatible mapping of finger presses to tactile stimulation. Davis, Moray, and Treisman (1961)
required a few hundred trials to obtain a flat slope with the slightly lower compatibility task of naming a heard
word. Finally, Mowbray and Rhoades (1959) examined an RT mapping of slightly lower (but still high)
compatibility. The subjects depressed keys adjacent to lights. For one unusually stoic subject, 42,000 trials
were required to produce a flat slope. Recent findings suggest that training and practice on video games can
shorten the perceptual component of RT (without sacrificing accuracy in a SATO) in traditional lab-based RT
tasks (Dye, Green, & Bavelier, 2009).

2.7 Executive Control

Any speeded response task must be characterized by a rule by which responses or actions are associated with
stimuli or events. It appears to take some time to ‘load’ or activate these rules when they are first used, much
as it takes time to load a program on a computer, or shift from one program to another. Such rule loading in
human performance is assumed to be the function of executive control (Jersild, 1927; Rogers & Monsell,
1995) discussed extensively in Chapters 7 and 11, which also accomplishes functions like shifting the speed-
accuracy trade-off. A paradigm that nicely illustrates the time costs of executive control is one in which
speeded responses are made following one rule, like discriminating between high and low digits, and then
abruptly shift to a different rule, like discriminating between odd and even digits (Jersild, 1927; Rogers &
Monsell, 1995). Here the first RT following the switch is longer than the following ones, reflecting the switch
cost of executive control. While such costs will be greater when the switch is not expected (Allport, Styles, &
Hsieh, 1994), switching still requires some time even when the new task is anticipated (Rogers & Monsell, 1995). We
will discuss the role of switching further, when we discuss dual-task performance in Chapter 10.

2.8 S-R Compatibility


In June 1989, the pilots of a commercial aircraft flying over the United Kingdom detected a burning engine
but mistakenly shut down the good engine instead. When their remaining engine (the burning one) eventually
lost power, leaving the plane with no engines, it crashed, with a large loss of life. Why? Analysis suggests that
a violation of stimulus-response compatibility in the display control relation may have been a contributing
factor (Flight International, 1990).
We have already encountered the concept of compatibility in earlier chapters. In Chapter 3, we discussed
the compatibility of proximity between display elements and information processing; in Chapter 4, we
described the compatibility between a display and the static or dynamic properties of the operator’s mental
model of the displayed elements. In Chapter 5, we described compatibility in terms of FORT transformations.
Here, we will discuss compatibility between a display location or movement and the location or movement of
the associated operator response. We devote considerable space to this topic because of its historic
prominence in engineering psychology research and because of its tremendous importance in system design.
As suggested, S-R compatibility has both static elements (where response devices should be located to
control their respective displays) and dynamic elements (how response devices should move in order to
control items in the workplace, and their associated dynamic displays). We refer to these as locational and
movement compatibility, respectively. Much of compatibility describes spatially oriented actions (e.g., the
location of switches in space or the movement of switches and continuous controls in space), but it can also
characterize other mappings between displays and responses. More compatible mappings require fewer
mental transformations from display to response. We will also examine compatibility in terms of modalities of
control and display. What is common about all of these different types of S-R compatibility, however, is the
importance of mapping. There is no single best display configuration or control configuration. Rather, each
display configuration will be compatible only when it is appropriately mapped to certain control
configurations.

2.8.1 LOCATION COMPATIBILITY The foundations of location compatibility are provided in part by the human’s
intrinsic tendency to move or orient toward the source of stimulation (Simon, 1969). Given the predominance
of this effect, it is not surprising that compatible relations are those in which controls are located next to the
relevant displays, a characteristic that defines the colocation principle. The touch-screen CRT display is an
example of designs that maximize S-R compatibility through colocation (but see Chapter 5 for some
limitations of this concept). Point and click cursor controls achieve colocation somewhat indirectly, to the
extent that the cursor is viewed as a direct extension of the hand. However, many systems in the real world
often fail to adhere to the colocation principle, for example, the location of stove burner controls (Chapanis
& Lindenbaum, 1959; Hoffman & Chan, 2011). Controls colocated beside their respective burners (Figure
9.5a) are compatible and will of course eliminate the possible confusions caused by arrays shown in Figure
9.5 (b and c), which are more typical.
Unfortunately the principle of colocation is not always possible to achieve. Operators of some systems
may need to remain seated, with controls at their fingertips that activate a more distant array of displays. In
combat aircraft, the high gravitational forces encountered in some maneuvers may make it impossible to move
the hands far to reach controls that are co-located with front-mounted displays. Even the colocation of Figure
9.5a may require the chef to reach across an active (hot) burner to adjust a control. Where colocation cannot
be obtained, two important compatibility principles are congruence and rules.

FIGURE 9.5 Possible arrangements of stove burner controls. (a) Controls adhere to colocation principle, (b) and (c) Controls exhibit less
compatible mapping, (d) Controls solve the compatibility problem by the visual linkages.

FIGURE 9.6 Each of the three stimulus panels on the left was assigned to one of the three response panels across the top. The natural
compatibility assignments are seen down the negative diagonal and indicated by an asterisk (*). Source: P. M. Fitts and C. M. Seeger, ‘S-R
Compatibility: Spatial Characteristics of Stimulus and Response Codes’ Journal of Experimental Psychology, 46 (1953), p. 203.

The general principle of congruence is based on the idea that the spatial array of controls should be
congruent with the spatial array of displays. This principle was illustrated in a study by Fitts and Seeger
(1953), who evaluated RT performance when each of the three patterns of light stimuli on the left in Figure
9.6 was assigned to one of the three response mappings (moving a lever) indicated across the top. In each case
an eight-choice RT task was imposed. In stimulus array Sa, any one of the eight lights could illuminate (and
for Ra the eight lever positions could be occupied). In Sb, the same eight angular positions could be defined
by the four single lights and the four combinations of adjacent lights. In Rb, the eight shaded lever positions
could be occupied. In Sc, the eight stimuli were defined by the four single lights and four pairwise
combinations of one light from each panel. In Rc, each or both levers could be moved to either side. Fitts and
Seeger found that the best performance for each stimulus array was obtained from the spatially congruent
response array: Sa to Ra, Sb to Rb, and Sc to Rc. This advantage is indicated by both faster responses and
greater accuracy.
A stove-top array such as that shown in Figure 9.5c would also achieve this congruence (Hoffman &
Chan, 2011). Notice in b and d that there is no possible congruent mapping of the linear array of controls to
the square array of burners (displays). The only way to bypass this lack of compatibility is through the drawn
links as shown in Figure 9.5d (Hoffman & Chan, 2011).
Congruence is often defined in terms of an ordered array (e.g., left-right or top-down). In the 1989
airplane crash over England discussed above, a violation of location compatibility resulted because the
relevant indicator of malfunction of the burning engine, which was the left engine, was located on the right
side of the cockpit midline (see also Figure 3.4).
Why are incongruent systems difficult to map? In an analysis of S-R compatibility effects, Kornblum,
Hasbroucq, and Osman (1990) argue that if the response dimension can be physically mapped to any
dimension along which the stimuli are ordered (e.g., both are linear arrays), the onset of a stimulus in an array
automatically activates a tendency to respond at the associated location. If this is not the correct location, a
time-consuming process is required to suppress this response tendency and activate the rule for the correct
response mapping instead.

This discussion brings us to the second feature of location compatibility—the importance of rules when
congruence is not obtained (Payne, 1995). Simple rules should be available to map the set of stimuli to the set
of responses (Kornblum et al. 1990). This feature is illustrated in a study by Fitts and Deininger (1954), who
compared three mappings between a linear array of displays and a linear array of controls. One mapping was
congruent; the second was reversed, so that the leftmost display was associated with the rightmost control and
so forth; the third mapping involved a random assignment of controls to displays. Fitts and Deininger found,
as expected, that performance was best in the first array, but also was considerably better in the reversed than
in the random array. In the reversed array, a single rule can provide the mapping, but there is no simple rule
for the random mapping. Haskell, Wickens, and Sarno (1990) showed that the number of rules necessary to
specify a mapping between linear arrays of four displays and four controls was a strong predictor of RT.
Payne (1995) notes that the contribution of such rules is often underestimated if users are simply allowed to
rate the estimated S-R compatibility of different mappings that are shown to them. Performance is a more
reliable indicator of good (and bad) HF design than are user ratings.
There are times when even congruence is difficult to achieve. Consider a linear array of switches that
must be positioned along an armrest to control (or respond to) a vertical array of displays. Since a congruent,
vertical array of switches on the armrest would be difficult to implement (and an anthropometrically poor
design), the axis of switch orientation must be incongruent with the display axis. However, there are rules to
guide the designer. These rules describe how ordered quantities, from least to most, map onto space: increases run from left to right, from aft to forward, clockwise (for a circular array), and (to a lesser extent) from bottom to top. Hence, a far-right control should be mapped to a top display when a left-right
array is mapped to a vertically oriented display (Weeks & Proctor, 1990).
It is unfortunate, however, that the vertical ordering is not strong. On the one hand, high values are
compatible with top locations (as noted in Chapter 5; see also the typical calculator keyboard). On the other
hand, the order of counting (1, 2, 3, …), following the order of reading in English, is from top to bottom (see
the push-button telephone). These conflicting stereotypes suggest that vertical display (or control) arrays that
are not congruent with control (display) arrays should only be used with caution, an issue we will see echoed
with movement compatibility (Chan & Hoffman, 2010). An important design solution that can resolve any
potential mapping ambiguity is to put a slight cant, or angling, of one array in a direction that is congruent
with the other, as shown in Figure 9.7. If this cant is as great as 45°, then reaction time can be as fast as if the
control and display axes are parallel (Andre, Haskell, & Wickens, 1991), echoing the minimal FORT costs of
such alignment discussed in Chapter 5 (Figure 5.2).

2.8.2 MOVEMENT COMPATIBILITY The best way to conceptualize movement compatibility is to imagine a user
with an intention to move something in the world in a particular direction. How should a control move to
make this happen most fluently and automatically? Most world movements belong to one of two kinds.
Spatial movements were discussed in Chapter 5 and can be represented in terms of either world referenced
(north-south, east-west) or ego-referenced (left-right, front-back, up-down) spatial coordinates. Conceptual
movements involve the increase or decrease in a quantity, such as risk or money or energy; while these are not
directly mapped onto space, we typically think of ‘more’ as higher, and so there is a natural or compatible
mapping.

FIGURE 9.7 Solutions of location compatibility problems by using cant. (a) The control panel slopes downward slightly (an angle greater than
90 degrees), so that control A is clearly above B, and B is above C, just as they are in the display array. (b) The controls are slightly angled from left to right across the panel, creating a left-right ordering that is congruent with the display array.

There are several variables that influence the compatibility between control movement and the
movement of the controlled entity. These include:
1. Population stereotypes. There is a very strong stereotype for moving a control upward to increase.
Somewhat less strong, but still pronounced, are rightward to increase and, if a control is a dial,
clockwise to increase. The forward to increase stereotype is weaker yet, but it does exist. Such
stereotypes have emerged from a long history of research by Chan and colleagues (e.g., Chan & Chan,
2007a,b, 2008; Chan & Hoffman, 2010, 2011; Hoffman, 1997).
2. Congruence of display movement. Most dynamic controls are (or should be) coupled by feedback
displays that indicate that the movement was accomplished in the direction intended. Alternatively, in
many tracking tasks, the display movement may signal a movement of the controlled agent that
requires a compensatory or pursuit control action (see Chapter 5).
  Here, as discussed with regard to FORT in Chapter 5, maximum compatibility is achieved when
the display moves in a direction congruent with the control. For example, a linear moving vertical
control (e.g., joystick or slider) should be coupled with a vertical display such that upward movement
of the control produces upward movement of the displayed element. Similar congruence of course can
be obtained with fore-aft, left-right, or circular controls. In the case of controls that are not themselves
spatially moved (like depressing a radio frequency tuner for a longer time to increase the frequency),
a display will often serve as a proxy for the control.
3. Mismatching dimensions. Sometimes physical constraints may limit application of perfect congruence.
For example, a rotary control may be more stable to adjust than a linear one in a dynamic, vibrating, or
unsupported environment, even as the feedback display is a linear one. There is some penalty for
mismatching dimensions. But when there is a mismatch, the strength of the ‘increase’ stereotype (1
above) can serve as a guide. For example, the clockwise control rotation should produce the upward
(or rightward) display movement (or vice versa, for a linear control with a rotary display). In these
cases of orthogonal mapping, Chan & Hoffman (2010) (and Burgess-Limerick et al., 2010) find that
vertically moving controls are not well mapped to horizontally moving displays (whether the latter
move fore-aft or left-right). Here the mental rotation function shown in Chapter 5 (Figure 5.1)
becomes a guiding compatibility map. As that figure indicates, it becomes important to preserve some
common vector in parallel between control and display motion, particularly on the lateral (left-right)
axis such that, for example, a rightward control movement is associated with a display movement that
also has some rightward component, even if most of that movement is (for example) forward, or
upward. This is equivalent to providing the cant in adhering to location congruence, as described
above (Figure 9.7).
4. Constrained versus unconstrained controls. When controls are constrained or ‘channeled’ to only
move along pure X, Y, or Z axes, it is easy to control along one axis at a time. However, when
controls and displays are free to move along any combination of axes, then such pure mapping
becomes more difficult. As an example of an unconstrained control, try to control a mouse cursor
when the mouse is oriented at an angle to the mouse pad. Here, too, the mental rotation function of Chapter 5 (Figure 5.1) and the common-vector guideline described in point 3 apply: control and display motion should share at least some parallel component, particularly on the lateral (left-right) axis.
5. Frame of Reference modifications. When analyzing movement in a display with regard to the
controlled element in the world, a critical distinction, discussed in Chapter 4 along with the advantages and costs of each, is whether the display depicts the moving element against a stable display frame or a stable element within a moving frame. From the viewpoint of movement compatibility, there is a general preference for the moving
element on the display to represent what moves in the world and to move in the same direction as the
control. That is, the principle of the moving part dictates congruent control-display movement
directions. Nevertheless, it must be recognized that there are times when a moving world or inside-out
display is called for, particularly when that display is designed to represent direct vision, as in the case
of VR systems and realistic 3D forward looking flight displays in the airplane (see Chapter 4).
6. Compensatory status displays versus pursuit command displays. The distinction between inside-out and outside-in is closely related to the distinction between compensatory and pursuit displays in
tracking. In a compensatory display, an increase in error, signaled by a leftward movement of the error
cursor, should trigger a rightward (compensatory) movement of the control. In a pursuit display, a
leftward movement of the target to be followed should trigger a leftward movement of the control.
From what we know about movement compatibility, the pursuit display should be a more compatible
display-control relationship, and indeed research in tracking suggests this to be the case (Roscoe, Corl,
& Jensen, 1981; Wickens, 1986). In the same way, spatial displays that provide a directional command
as to which way to move to reduce an error (like a moving target above) are more compatible than
those that provide a status of the state of error (Andre, Wickens, & Goldwasser, 1990). (See also
Chapter 6.)
7. The Warrick principle relates control location to display movement and is satisfied whenever the control moves
in the same direction as the closest moving element of the display (Hoffman, 1990, 2009; Warrick,
1947). This is illustrated in Figure 9.8a, where the Warrick principle is satisfied by placing the rotary
control on the right side of the vertical linear display, but violated by placing it on the left side (9.8b).
This figure brings up the issue of conflicting principles. What would be the cost of keeping the control on the left side (Figure 9.8b), but reversing the direction so that a clockwise-to-decrease mapping were in effect? In such circumstances one would expect the two principles to continue to offset each other, now violating the direction-of-motion stereotype but conforming to the Warrick principle. The guidance, of course, is to configure control (or display) placement so as to satisfy as many principles as possible (the right side), or at least not to violate any, which might be the case if the rotary dial were placed below the linear
scale (Figure 9.8a).

FIGURE 9.8 Three control-display layout configurations illustrating movement compatibility principles. (a) The arrow indicates the expected
control rotation direction to increase the indicator. (b) The display is ambiguous because the clockwise-to-increase and the Warrick proximity of
movement principles are in opposition. In (c) these two principles are congruent.
8. Movement in different planes. Much of our discussion has focused on controlling a display in the
frontal plane, the display mounted vertically in front of the controller. But suppose the display plane is
rotated by 90 degrees (Burgess-Limerick et al., 2010; Chan & Hoffman, 2010): a tabletop display or
one mounted to the right, left, or above (the latter might be the case for an astronaut looking out a
ceiling display toward a space station approached for docking; Wickens, Keller, & Small, 2010). Here
again as discussed in Chapter 5, certain control-display compatibilities underlie the ease of mapping
(and speed or accuracy of response), although some mappings are less well documented. In particular:
• Left-right congruence should be preserved. Thus, mapping a left-right control to left-right movement in either a vertical display in the frontal plane or a horizontal tabletop display makes little difference for lateral (left-right) control movements.
• The visual field compatibility principle (Worringham & Beringer, 1989; Chan & Hoffman, 2010) is
dominant. Here, consider an operator viewing a 3D display mounted parallel to the right window,
depicting an element that she/he wishes to move to the left on the display. To accomplish this,
should she move a front-mounted control to the left (so that left on the control is left on the
display when the display is viewed by a 90 degree rightward head rotation)? Or should she move
the control forward, which would produce a compatible mapping if the object were moving in the real world (e.g., if the display were literally a window looking out the right side)? The answer here is clear: the first is the better mapping, preserving visual field compatibility (Burgess-Limerick et al., 2010; Chan & Hoffman, 2010). While the operator may be viewing the display by looking right, it appears she is inferring the motion relationship as if the display were aligned with
the trunk, not the momentary direction of gaze.
• See-through displays. When head-mounted displays are worn, as discussed in Chapter 5, some
complexities arise in off-axis viewing (such as that described above). If the display is meant to
depict movement in the world beyond the display window (a conformal display), then it becomes
less clear what the compatibility relationship should be (Wickens, Vincow, & Yeh, 2005).
Collectively, the influence of all of these factors on the ideal frame of reference for motion can be quite
complex. It has been argued that the net effect of all of them, some in confirmation, some in violation, may
approximately act in a weighted additive fashion (much like the depth cues of Chapter 4; Hoffman, 1990;
Proctor & Vu, 2006). Clearly, a safe design will be one that tries to satisfy as many principles as possible. As
discussed in Chapter 5, Wickens, Keller, & Small (2010) have developed a FORT (frame of reference
transformation) model that will examine a given 3D display-control mapping layout, integrate the various
penalties of violation, and return an overall penalty score, which can describe the collective extent of
violations of different principles.
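The weighted-additive idea can be made concrete with a few lines of code. The sketch below is only an illustration of the bookkeeping, not the FORT model itself: the principle names, weights, and example mappings are hypothetical placeholders invented for this book's discussion.

```python
# Hypothetical illustration of a weighted-additive compatibility penalty.
# Principles, weights, and mappings are invented for illustration; they are
# not the parameters of the FORT model (Wickens, Keller, & Small, 2010).

PENALTY_WEIGHTS = {
    "up_to_increase_violated": 2.0,        # strong population stereotype
    "clockwise_to_increase_violated": 1.5,
    "warrick_principle_violated": 1.0,     # control far from nearest moving display element
    "left_right_congruence_violated": 2.5,
}

def compatibility_penalty(violations):
    """Sum the weights of all principles that a given mapping violates."""
    return sum(PENALTY_WEIGHTS[v] for v in violations)

# Example: a rotary control left of a vertical scale, wired clockwise-to-increase
# (satisfies the rotation stereotype but violates the Warrick principle) ...
mapping_a = ["warrick_principle_violated"]
# ... versus the same placement wired clockwise-to-decrease (the reverse trade-off).
mapping_b = ["clockwise_to_increase_violated"]

print(compatibility_penalty(mapping_a))  # 1.0
print(compatibility_penalty(mapping_b))  # 1.5
```

In practice, of course, the weights would have to be estimated from performance data such as the stereotype strengths reported by Chan and colleagues.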

2.8.3 TRANSFORMATIONS AND POPULATION STEREOTYPES Not all compatibility relationships are spatially defined.
Any S-R mapping that requires some transformation, even if it is not spatial, will be reduced in its
compatibility. Hence, a mapping between three pairs of stimulus and response digits of 1–1, 2–2, and 3–3 is
more compatible than 1–2, 2–3, and 3–4, which imposes the transformation ‘add one.’ Similarly, the
relationship between stimulus digits and response letters (1-A, 2-B, 3-C, etc.) is less compatible than digits-
digits or letters-letters mappings. Also, any S-R mapping that is many-to-one will be less compatible than a
one-to-one mapping (Norman, 1988; Posner, 1964). Consider, for example, the added cognitive difficulty of
entering alphabetic phone numbers, like 437-HELP, resulting from the 3–1 mapping of letters (stimuli) to
keys (responses) that is found in the suffix of the phone number (H-E-L-P). Ironically, in Chapter 6 we
identified this form of phone number as better from the standpoint of memory load. As we repeatedly see, human engineering constantly encounters such trade-offs.
We discussed movement population stereotypes above, which define mappings that are more directly
related to experience. For example, consider the relationship between the desired lighting of a room and the
movement of a light switch. In North America, the compatible relation is to flip the switch up to turn the light
on. In Europe, the compatible relation is the opposite (up is off). This difference is clearly unrelated to any
difference in the biological hardware between the American and European brain but rather is a function of
experience. Smith (1981) has evaluated population stereotypes in a number of verbal-pictorial relations. For
example, he asks whether the ‘inside lane’ of a four-lane highway refers to the center-most lane on each side
or to the driving lane. Smith finds that the population is equally divided on this categorization. Any mapping
that bases order on reading patterns (e.g., in English, left-right and top-bottom) will also be stereotypic, and
thereby not applicable, say, to Hebrew or Chinese readers. Finally, as noted in Chapter 4, color coding is
strongly governed by population stereotypes: red for danger, stop, and so on.

2.8.4 MODALITY S-R COMPATIBILITY Stimulus-response compatibility is also defined by stimulus and response
modality. Brainard et al. (1962) found that if a stimulus was a light, choice RT was faster for a pointing
(manual) than a voice response, but if the stimulus was an auditorily presented digit, RT was more rapid with
a vocal naming response than with a manual pointing one. Teichner and Krebs (1974) concluded that the four
S-R combinations defined by visual and auditory input and manual and vocal response produced reaction
times in the following order: a voice response to a light is slowest, a key-press response to an auditory digit is of
intermediate latency, and a manual key-press response to a light and vocal naming of a digit are fastest.
Wickens, Sandry, and Vidulich (1983) and Wickens, Vidulich, and Sandry-Garza (1984) proposed that
these modality-based S-R compatibility relations may partially depend on the central processing code (verbal
or spatial) used in the task. In both the laboratory environment and in an aircraft simulator, they found that
tasks that use verbal working memory are served best by auditory inputs and vocal outputs, whereas spatial
tasks are better served by visual inputs and manual outputs. In the aircraft simulation, Wickens, Sandry, and
Vidulich found that these compatibility effects were enhanced when a concurrent flight task became more
difficult (Vidulich & Wickens, 1986), suggesting that compatibility influences resource demand (see Chapter
10).
As discussed in Chapters 4 and 6, these guidelines would hold only when the material is short since a
long auditory input of verbal material can lead to forgetting. Furthermore, for voice control, the guidelines would hold only when the vocal response does not disrupt rehearsal of the retained information (Wickens &
Liu, 1988). The particular advantages of voice control in multitask environments such as the aircraft cockpit
or the computer design station (Baber, Morin, et al., 2011) will be further discussed in the next chapter.

2.8.5 CONSISTENCY AND TRAINING Compatibility is normally considered to be an asset in system design.
However, to reiterate a point made in Chapter 4, the designer should always be wary of any possible violation
of consistency across a set of control-display mappings that may result from trying to optimize the
compatibility of each. For example, Duncan (1984) found that people actually had a more difficult time
responding to two RT tasks if one was compatibly mapped and the other incompatibly mapped than when both
were incompatible. In other words, the consistency of having identical (but incompatible) mappings in both
tasks outweighed the advantages of compatibility in one. Correspondingly, a designer who needs to add
another function to a system that already contains a lot of control-display mappings should be wary of
whether the compatible addition proposed (e.g., a status display) is in disharmony with the existing set (e.g.,
several command displays) (Andre & Wickens, 1992).
We have seen how training and experience form the basis for population stereotypes. Training can also
be used to formulate correct mental models. It is also evident that training will improve performance on both
compatible and incompatible mappings. In fact the rate of improvement with practice is actually faster with
the incompatible mappings because they have more room to improve (Fitts & Seeger, 1953). However,
extensive training of an incompatible mapping will never fully catch up to a compatible one. When the
operator is placed under stress, performance with the incompatible mapping will regress further than with the
compatible one (Fuchs, 1962; Loveless, 1963). Hence, we should be wary of a designer who excuses an
incompatible design with the argument that the problem can be ‘trained away.’

2.8.6 KNOWLEDGE IN THE WORLD Most of our discussion of compatibility has focused on the mapping of stimuli
to responses, or displays to controls. In this context, it can be argued that good S-R compatibility provides the
user with direct visual knowledge of what action to take. Norman (1992) refers to this as ‘knowledge in the
world,’ which can be contrasted with ‘knowledge in the head,’ when the appropriate response must be derived
from learning and experience (the stovetops in Figures 9.5a and 9.5d provide examples of knowledge in the
world, while that in Figure 9.5b requires knowledge in the head).

FIGURE 9.9 (a) Illustrates the availability of action options (knowledge in the world) through a menu. (b) Illustrates the invitation or
affordance of a door handle, which affords grabbing and pulling. (c) A violation of knowledge in the world because it is not obvious which
is the ‘on’ switch on the coffee maker in the upper panel. This is partially fixed on the lower panel by the highly visible label. (d) A lockout, to
prevent people from descending the stairs beyond the ground-floor, fire-exit level.

The concept of knowledge in the world, however, applies to a broader range of actions than merely those
triggered by, or in response to, an event. When approaching a piece of equipment or a computer interface to
turn it on (or otherwise use it), one is responding to an intent, but not an ‘event’ in a way described by the RT
paradigm. Yet the importance of knowledge in the world in supporting compatible actions remains critical,
particularly for the novice user. Good design should provide an easily discriminable set of options for
allowable actions (such as a set of menu options always available on a computer screen; see Figure 9.9a), or it should provide an invitation to the appropriate actions, referred to as an affordance or forcing function (Figures 9.9b and 9.9c), as well as a ‘lockout’ of the inappropriate actions (Figure 9.9d; Norman, 1988).

3. STAGES IN REACTION TIME


A central theme of this book is that human information processing and human performance can be roughly
conceptualized by a series of processing stages, from selective attention and sensation to perception to
response selection to response execution, as shown in the first chapter. Difficulties and delays in task
performance, as well as remedies for poor system design, can often be targeted at certain stages. For example,
problems of incompatible S-R mappings do not lie in delays of perceiving a stimulus, nor in executing a
response, but rather in selecting the response given a perceived stimulus event.
For well over a century (Donders, 1869, translated 1969; Pachella, 1974; Sternberg, 1969), psychologists have worked to establish the reality of these stages, employing three different techniques to estimate the duration of each stage and to pinpoint the stage at which a given manipulation (such as degrading S-R compatibility) affects processing time.

As an example of the subtractive technique, RT in two different tasks can be compared in which one
task clearly ‘deletes’ a stage. For example, an RT task in which one of two responses needs to be chosen can
be compared with the ‘go no-go’ task, in which only a single response is given if one stimulus occurs (‘go’)
and no action is taken for the other (‘no go’). Given that the latter task will produce a shorter RT, the
difference (subtracting the shorter from the longer RT) can be taken as an estimate of the time required to
choose between two responses (e.g., response selection time).
In the additive factors technique (Sternberg, 1969, 1975), two factors affecting RT (e.g., S-R
compatibility and stimulus discriminability) are manipulated orthogonally in a 2 × 2 experimental design. If
RT at the most difficult level of both (an incompatible mapping with confusable stimuli) is simply the sum
(additive) of the effect of each variable in isolation, then it is assumed that they influence different stages;
their effects are additive. This is observed when S-R compatibility is manipulated along with stimulus
discriminability. If, in contrast, the most difficult condition produces an RT greater than would be predicted
by each factor alone (an interaction), then the two factors are assumed to influence the same stage. This happens when S-R compatibility is manipulated along with the number of alternatives, N (Wickens & Hollands, 2000).
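A small numerical sketch may help make the additive-factors logic concrete. The reaction times below are invented for illustration; what matters is how additivity versus interaction is read off a 2 × 2 design.

```python
# Hypothetical RTs (ms) in a 2 x 2 design crossing S-R compatibility with a
# second factor. All values are invented for illustration only.

def interaction_contrast(rt):
    """rt[(compatibility, other_factor)] -> interaction contrast in ms.
    Near zero suggests the two factors affect different stages;
    a positive value (over-additivity) suggests a shared stage."""
    return ((rt[("incompatible", "hard")] - rt[("incompatible", "easy")])
            - (rt[("compatible", "hard")] - rt[("compatible", "easy")]))

# Second factor = stimulus discriminability: effects simply add (different stages).
rt_discrim = {("compatible", "easy"): 400, ("compatible", "hard"): 450,
              ("incompatible", "easy"): 500, ("incompatible", "hard"): 550}

# Second factor = number of alternatives: the effect is amplified when the
# mapping is incompatible (shared stage: response selection).
rt_nalts = {("compatible", "easy"): 400, ("compatible", "hard"): 450,
            ("incompatible", "easy"): 500, ("incompatible", "hard"): 610}

print(interaction_contrast(rt_discrim))  # 0  -> additive effects
print(interaction_contrast(rt_nalts))    # 60 -> interaction
```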
Finally, researchers can employ psychophysiological techniques of event-related brain potentials
(Coles, 1988; Donchin, 1981; see Chapter 11) to help understand how long it takes the brain to perform
various operations. Components of these voltage fluctuations recorded from the surface of the scalp can be
distinctly associated with different mental operations, thanks in part to their appearance near regions of the
brain that are known to reflect those operations (e.g., auditory perception, visual perception, action selection).
As aspects of an RT task are made more difficult, changes in the latency of these components can be used to
infer changes in the speed of processing of the underlying brain functions. For example, reductions in S-R compatibility will not affect the latency of ERP components reflecting perception, but will affect those reflecting the response (McCarthy & Donchin, 1979).
Collectively, the data from the three techniques, described in more detail in Wickens and Hollands (2000),
are quite consistent with the model of information processing described in Chapter 1. However, these data
also suggest that the separation of processing stages should not be taken too literally. In speeded reactions to
external events there clearly is some overlap in time between processing in successive stages (McClelland,
1979), just as the brain in general is capable of a good deal of parallel processing (Meyer & Kieras, 1997; see
Chapters 3 and 10). However, as with other models and conceptions discussed in this book, the stage concept
is a useful one that is consistent with dichotomies made elsewhere between sensitivity and response bias in
detection, between diagnosis and choice in decision making (Chapter 8) and decision support (Chapter 12),
and between early and late processing resources in time-sharing (see Chapter 10). The integrating value of the
stage concept more than compensates for any limitations in its complete accuracy.

4. SERIAL RESPONSES
So far we have discussed primarily the selection of a single discrete action in the RT task. Many tasks in the
real world, however, call for not just one but a series of repetitive actions. Typing and assembly line work are
two examples. The factors that influence single RT are just as important in influencing the speed of repetitive
performance. However, the fact that several stimuli must be processed in sequence brings into play a set of
additional influences that relate to the timing and pacing of sequential stimuli and responses.
In the discussion of serial or repeated responses, we focus initially on the simplest case: only two stimuli
presented in rapid succession. This is the paradigm of the psychological refractory period. Next we examine
response times to several stimuli in rapid succession, the serial RT task. This discussion will lead us to an
analysis of transcription skills, such as typing.

4.1 The Psychological Refractory Period


The psychological refractory period, or PRP (Kantowitz, 1974; Meyer & Kieras, 1997; Pashler, 1998;
Telford, 1931) describes a situation in which two RT tasks are presented close together in time. The
separation in time between the two stimuli is called the interstimulus interval or ISI. The general finding is
that the response to the second stimulus is delayed by the processing of the first when the ISI is short.
Suppose, for example, a subject is to press a key (R1) as soon as a tone (S1) is heard, and is to speak (R2) as
soon as a light (S2) is seen. If the light is presented a fifth of a second or so after the tone, the subject will be
slowed in responding to the light (RT2) because of processing the tone. However, RT to the tone (RT1) will not be affected by the presence of the light response task. The PRP delay in RT2 is typically measured with
respect to a single-task control condition, in which S2 is responded to without any requirement to respond to
S1.
The most plausible account of the PRP is a model that proposes the human being to be a single-channel
processor of information. The single-channel theory of the PRP was originally proposed by Craik (1947) and
has subsequently been expressed and elaborated on by Bertelson (1966), Welford (1967, 1976), Kantowitz (1974), Meyer and Kieras (1997), and Pashler (1998). It is compatible with Broadbent’s (1958)
conception of attention as an information-processing bottleneck that can only process one stimulus or piece of
information at a time (see Chapter 3). As shown in Figure 9.10 in explaining the PRP effect, single-channel
theory assumes that the processing of S1 temporarily ‘captures’ the single-channel bottleneck of the decision-
making/response-selection stage. Thus, until R1 has been released (the single channel has finished processing
S1), the processor cannot begin to deal with S2. The second stimulus S2 must therefore wait at the ‘gates’ of
this single-channel bottleneck until they open. This waiting time is what prolongs RT2. The sooner S2 arrives,
the longer it must wait, just as a customer arriving earlier at a store must wait longer for the owner to open it. According to this view, anything that prolongs the processing of S1 will increase the PRP
delay of RT2. Reynolds (1966), for example, found that the PRP delay in RT2 was lengthened if the task of
RT1 involved a choice rather than a simple response.

FIGURE 9.10 Single-channel theory explanation of the psychological refractory period. The figure shows the delay (waiting time; the dashed
line) imposed on RT2 by the processing involved in RT1. This waiting time makes RT2 in the dual-task setting (top) longer than in the single-
task control (bottom).

This bottleneck in the sequence of information-processing activities does not appear to be located at the
peripheral sensory end of the processing sequence (such as blinders over the eyes that are not removed until
R1 has occurred). If this were the case, then no processing of S2 whatsoever could begin until RT1 is
complete. However, as described in Chapter 6, much of perception is relatively automatic. Therefore the basic
perceptual analysis of S2 can proceed even as the processor is fully occupied with selecting the response to S1
(Karlin & Kestenbaum, 1968; Keele, 1972; Pashler, 1998). Only after its perceptual processing is completed
does S2 have to wait for the bottleneck to dispense with R1. These relations are shown in Figure 9.10.
In the PRP paradigm, we see that the delay in RT2, beyond its single-task baseline, will increase linearly
(on a one-to-one basis) with a decrease in ISI (S2 arrives sooner) and with an increase in the complexity of response selection in task 1, since both increase the waiting time.
Assuming that the single-channel bottleneck is perfect (i.e., post-perceptual processing of S2 will not start at
all until R1 is released), the relationship between ISI and RT2 will look like that shown in Figure 9.11. When
ISI is long (much greater than RT1), RT2 is not delayed at all. When ISI is shortened to about the length of
RT1, some temporal overlap will occur and RT2 will be prolonged because of a waiting period. This waiting
time will then increase linearly as ISI is shortened further.
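The single-channel account can also be written down in a few lines of arithmetic. The sketch below is a minimal, idealized version, assuming a perfect bottleneck located at response selection and using hypothetical stage durations.

```python
# Minimal single-channel model of the PRP (idealized; all durations hypothetical).
# S2's perceptual analysis runs in parallel, but its response selection cannot
# begin until the bottleneck is released at time RT1 (measured from S1 onset).

def rt2(isi, rt1=400.0, percept2=100.0, select_and_respond2=200.0):
    """Predicted RT to S2 (ms) as a function of the interstimulus interval (ms)."""
    single_task_rt2 = percept2 + select_and_respond2
    bottleneck_free_at = rt1                   # channel finishes task 1
    s2_ready_for_bottleneck = isi + percept2   # S2 perceptual analysis complete
    waiting_time = max(0.0, bottleneck_free_at - s2_ready_for_bottleneck)
    return single_task_rt2 + waiting_time

for isi in (50, 100, 200, 300, 400, 600):
    print(isi, rt2(isi))
# Once ISI falls below roughly RT1 (less S2's perceptual time), RT2 grows
# linearly, one millisecond for each millisecond of ISI reduction,
# reproducing the function sketched in Figure 9.11.
```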

FIGURE 9.11 Relationship between ISI and RT2 predicted by single-channel theory.

The relationship between ISI and RT2, as shown in Figure 9.11, successfully describes a large amount of
the PRP data (Bertelson, 1966; Kantowitz, 1974; Meyer & Kieras, 1997; Pashler, 1998). There are, however,
three important qualifications to the general single-channel model as it has been presented so far.
1. When the ISI is very short (less than about 100 msec), a qualitatively different processing sequence
occurs; both responses are emitted together (grouping) and both are delayed (Kantowitz, 1974). It is as
if the two stimuli are occurring so close together in time that S2 gets through the single channel gate
while it is still accepting S1 (Kantowitz, 1974; Welford, 1952).
2. Sometimes RT2 suffers a PRP delay even when the ISI is greater than RT1. That is, S2 is presented
after R1 has been completed. This delay occurs when the subject is monitoring the feedback from the
response of RT1 as it is executed (Welford, 1967).
3. Sometimes, when using separate perceptual resources and extensive training, the bottleneck can be
avoided altogether, as we discuss further in Chapter 10 (Schumacher et al., 2001).
In the world beyond the laboratory, people are more likely to encounter a series of stimulus events that
must be rapidly processed than a simple pair. In the laboratory the former situation is realized in the serial RT
paradigm. Here a series of RT trials occurs sufficiently close to one another in time that each RT is affected
by the processing of the previous stimulus event in the manner described by the single-channel theory. A large
number of factors influence performance in this paradigm, typical of tasks ranging from quality control
inspection to typewriting (keyboard transcription) to assembly-line manufacturing to sight reading music.
Many of these variables were considered earlier in this chapter. Factors such as S-R compatibility, stimulus
discriminability, and practice influence serial RT just as they do single-trial choice RT. However, some of
these variables interact in important ways with the variables that describe the sequential timing of the
successive stimuli.

4.2 Decision Complexity: The Decision Complexity Advantage


Earlier we described how the linear relationship between choice RT and the amount of information
transmitted—the Hick-Hyman law—was seen to reflect a human capacity limit. The slope of this function
expressed as seconds per bit could be inverted and expressed as bits per second. Early interpretations of the
Hick-Hyman law assumed that the latter figure provided an estimate of the bandwidth or upper limit of the
human processing system. As decisions become more complex, decision rate slows proportionately.
If the human being really did have a constant fixed bandwidth for processing information, in terms of
bits per second, this limit should be the same, whether we make a small number of high-bit decisions per unit
time or a large number of low-bit decisions. For example, if one six-bit decision/sec were our maximum performance, we should also be able to make two three-bit decisions/sec, three two-bit decisions/sec, or six
one-bit decisions/sec. In fact, however, this trade-off does not appear to hold. The most restricting limit in
human performance appears to relate more to the absolute number of decisions that can be made per second
than to the number of bits that can be processed per second. People are better able to process information
delivered in the format of one six-bit decision per second than in the format of six one-bit decisions per
second (Alluisi, Muller, & Fitts, 1957; Broadbent, 1971). Thus, the frequency of decisions and their
complexity do not trade off reciprocally.
The advantage of a few complex decisions over several simple ones may be defined as a decision complexity advantage. This finding suggests that there is some fundamental limit to the central-processing or
decision-making rate, independent of decision complexity, that limits the speed of other stages of processing.
This limit appears to be about 2.5 decisions/sec for decisions of even the simplest possible kind (Debecker &
Desmedt, 1970). Such a limit might well explain why our motor output often outruns our decision-making
competence. The ‘uhs’ or ‘uhms’ that we sometimes interject into rapid speech are examples of how our
motor system fills in the non-informative responses while the decision system is slowed by its limits in
selecting the appropriate response (Welford, 1976).
The most general implication of the decision complexity advantage is that greater gains in information
transmission may be achieved by calling for a few complex decisions than by calling for many simple
decisions. Several investigators suggest that this is a reasonable guideline. For example, Deininger, Billington,
and Riesz (1966) evaluated push-button phone dialing. A sequence of 5, 6, 8, or 11 letters to be dialed was
drawn from a vocabulary of 22, 13, 7, and 4 alternatives, respectively, so that each sequence conveyed approximately the same total information (about 22.5 bits). The total dialing time was fastest with the smallest number of units (five letters), each delivering the greatest information content per
letter.
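The informational equivalence of the four conditions is easy to verify: a sequence of n items drawn from a vocabulary of V equally likely alternatives carries n × log2(V) bits. The short check below reproduces only that arithmetic; the behavioral result is Deininger et al.'s.

```python
from math import log2

# Each dialing condition in Deininger, Billington, and Riesz (1966):
# (sequence length, vocabulary size). Total information = n * log2(V).
conditions = [(5, 22), (6, 13), (8, 7), (11, 4)]

for n, v in conditions:
    print(f"{n:2d} letters from {v:2d} alternatives = {n * log2(v):.1f} bits")
# Each condition conveys roughly the same total (about 22-22.5 bits),
# yet dialing was fastest with the fewest, most informative units.
```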
As another example, a general guideline in computer menu design is that people work better with broad-
shallow menus—each choice is among a fairly large number of alternatives (more information per decision), but
there are only a few layers (fewer decisions)—than with narrow-deep menus—choices are simple, but several
choices must be made to get to the bottom of the menu (Commarford et al., 2008; Shneiderman, 1987).
The decision complexity advantage also has implications for any data-entry task, such as keyboarding.
For example, Seibel (1972) concluded that making text more redundant (less information per key stroke) will
increase the rate at which key responses can be made (decisions per second) but will decrease the overall
information transmission rate (bits per second). It follows from these data that processing efficiency could be
increased by allowing each key press to convey more information than the 1.5 bits provided on the average by
each letter (see Chapter 2). One possibility is to allow separate keys to indicate certain words or common
sequences such as and, ing, or th. This ‘rapid type’ technique has indeed proven to be more efficient than
conventional typing, given that the operator receives a minimal level of training (Seibel, 1963). However, if
there are too many of these high-information units, the keyboard itself will become overly large, like the
keyboard of a Chinese character typewriter. In this case, efficiency may decrease because the sheer size of the
keyboard will increase the time it takes to locate keys and to move the fingers from one key to another (see
Chapter 5). That is, a delay in response execution will offset any gain in response selection.
One obvious solution to this motor limitation is to allow chording, such as that used by court reporters transcribing proceedings, in which simultaneous rather than sequential key presses are required (Baber, 1997).
This approach would increase the number of possible strokes without imposing a proportional increase in the
number of keys. Thus, with only a five-finger keyboard, it is possible to produce 2⁵ − 1, or 31, possible chords without requiring any finger movement to different keys. With ten fingers resting on ten keys the possibilities are 2¹⁰ − 1, or 1,023. Consistent with the decision complexity advantage, a number of studies indeed suggest
that the greater information available per key stroke in chording provides a more efficient means of
transmitting information (Conrad & Longman, 1965; Gopher & Raij, 1988; Lockhead & Klemmer, 1959;
Seibel, 1963, 1964).
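The chord counts just quoted follow from counting the nonempty subsets of the keys, and the potential gain in information per stroke can be estimated the same way (assuming, unrealistically, that all chords are equally likely):

```python
from math import log2

def chords(n_keys):
    """Number of distinct chords with n keys (any nonempty subset pressed at once)."""
    return 2 ** n_keys - 1

for n in (5, 10):
    print(f"{n} keys -> {chords(n)} chords, "
          f"up to {log2(chords(n)):.1f} bits per stroke")
# 5 keys -> 31 chords (~5.0 bits per stroke); 10 keys -> 1023 chords (~10.0 bits),
# versus roughly 1.5 bits carried by each ordinary keystroke of English text.
```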

FIGURE 9.12 The letter-shape keyboard devised by Sidorsky uses visual imagery to specify the form of the key press for an alphanumeric
character. There are three keys; for each character, one to three of them are pressed, twice in succession. The small dots indicate the keys that are not pressed. The top
row of each letter represents the first key press; the bottom row represents the second. The keys that are successively pressed have a movement
pattern that approximates the visual pattern of the letter. Source: C. Sidorsky, Alpha-dot: A New Approach to Direct Computer Entry of
Battlefield Data (Arlington, VA: U.S. Army Research Institute for the Behavioral and Social Sciences, 1974), Figure 1.

Besides capitalizing on the decision complexity advantage, chording keyboards are also useful because
they can be easily operated while vision is fixated elsewhere. A major problem with chording keyboards,
however, is that the sometimes arbitrary finger assignments take a long time to learn (Richardson, Telson, et
al., 1987). One solution is to capitalize on visual imagery, assigning the chording fingers in a way that ‘looks’
like the image of the letters. Such a chording keyboard was designed by Sidorsky (1974), following the
scheme in Figure 9.12. Using three fingers, the operator presses twice for each letter, ‘painting’ it from the top
row to the bottom. In the figure, the dots represent keys that are not pressed. Once the operator remembers the
particular idiosyncratic shapes of the letters, little learning is required, and Sidorsky found that subjects were
able to type from 60 percent to 110 percent as fast with this as they could with the conventional keyboard (see
also Gopher & Raij, 1988). Because only one hand is required, the chording keyboard can work in harmony
with a mouse, controlled by the other hand.

4.3 Pacing
The pacing factor defines the circumstances under which the operator proceeds from one stimulus to the next.
Pacing schedules may be force paced, such as the movement of equipment along an assembly-line conveyer belt. Here the speed of the belt determines what, in the laboratory, is defined as the interstimulus interval (ISI), or the speed with which responses must be implemented in order to keep up. Other examples of force pacing in serial RT are the UN translator who must keep up with the speed of the speaker (Killian, 2011) or
the court recorder transcribing a legal deposition. Alternatively, pacing schedules may be self paced. Here the
next stimulus or event to be processed does not appear until some time after the previous response has been
executed, a time defined as the response-stimulus interval, or RSI. For example, in sight reading piano
music, the notes on the page are the stimuli, and the musician can translate them to key presses at any delay
that she desires. With either force-paced or self-paced schedules, the speed of work can be increased by decreasing
the ISI or RSI, respectively.
Several studies have examined the differences between these two schedules in overall productivity, with
somewhat inconclusive results (see Wickens & Hollands, 2000, for a summary). However, some recent
evidence suggests that offering the greater autonomy of the self-paced schedule may be preferable (Dempsey
et al., 2010). The advantages of a self-paced schedule particularly emerge to the extent that the processing
time of the different stimulus events (e.g., words to be transcribed, parts to be assembled, screens to be
inspected) is variable. In a force-paced schedule, such variability will either impose a PRP-like overlap, if two or more difficult items arrive in sequence, or create an unnecessary amount of ‘slack time’ (off time), if a series of easy items arrives. Shortening the ISI will lead to more of the former, with a possible loss of accuracy. Lengthening the ISI will lead to more of the latter, with a loss in productivity. Maintaining a constant, well-chosen RSI in the self-paced schedule will avoid both problems.
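The role of variability can be illustrated with a small queueing sketch. The processing-time distribution and schedule parameters below are hypothetical; the point is only that a fixed ISI generates both backlog and slack when item difficulty varies, whereas a fixed RSI generates neither.

```python
import random

random.seed(1)
# Hypothetical per-item processing times (ms): most items easy, occasional hard ones.
times = [random.choice([300, 300, 300, 700]) for _ in range(1000)]

def force_paced(times, isi):
    """Items arrive every `isi` ms; accumulate backlog (PRP-like overlap) and slack."""
    free_at, overlap, slack = 0.0, 0.0, 0.0
    for i, t in enumerate(times):
        arrival = i * isi
        if arrival < free_at:
            overlap += free_at - arrival   # the new item must wait for the previous one
        else:
            slack += arrival - free_at     # the operator sits idle until the item arrives
        free_at = max(arrival, free_at) + t
    return round(overlap), round(slack)

def self_paced(times, rsi):
    """The next item appears `rsi` ms after the previous response: no backlog."""
    return 0, rsi * len(times)

print(force_paced(times, isi=400))  # both backlog and slack accumulate
print(self_paced(times, rsi=50))    # no backlog; slack is fixed and under the operator's control
```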

4.4 Response Factors

4.4.1 RESPONSE COMPLEXITY More complex responses require longer to initiate. In the serial RT task, one
important consequence of increased response complexity is the requirement for more feedback monitoring of
the response. As noted in the discussion of the psychological refractory period, monitoring the execution of
and feedback from a response will sometimes delay the start of processing a subsequent stimulus event
(Welford, 1976).

4.4.2 RESPONSE FEEDBACK The feedback from a response can have two effects on performance, depending on
the sensory modality in which it is received. Consider first the case in which the feedback is an intrinsic part
of the response such as the perceived sound of one’s voice. Delays, distortions, or elimination of the intrinsic
feedback can produce substantial deficits in performance (Smith, 1962). For example, consider the difficulty
one has in speaking in a controlled voice when listening to loud music over headphones so that one’s voice
cannot be perceived, or in speaking when a delayed echo of the voice is heard. As most users of computers
know, feedback delays can exert a major influence on the fluency of human-computer interaction (Caldwell,
2009).
Less serious are disruptions of extrinsic feedback, such as the appearance of a visual letter on a screen
after the keystroke. Delays or degradation of this feedback can be harmful (Miller, 1968), particularly for
novice operators. However, as expertise on the skill develops, and the operator becomes less reliant on the
feedback to ensure that the right response has been executed, such feedback can be ignored; hence the harmful
effects of its delays (or elimination) are themselves reduced (Long, 1976).

4.4.3 RESPONSE REPETITION Earlier in this chapter, we saw that a response that repeated itself was more rapid
than if it followed a different response (Kornblum, 1973). However, there is a trend in many serial response
skills such as typewriting for the opposite effect to occur, in which a response is slowed by its repetition. This
effect results because the overall speed of responding in these transcription tasks, which may be up to 10 responses/second (Rumelhart & Norman, 1982), is much faster than the speed of successive choice RT
tasks, which was estimated to be around 2.5 responses/second. We describe the reasons for this difference
below, but with regard to repetitions, the faster rate of transcription begins to encroach on the refractory period of individual muscle groups, such as those required to repeatedly depress a key with the same finger.

4.5 Preview and Transcription


We have noted that the limits of serial RT performance are around 2½ decisions per second. Yet skilled
typists can execute key strokes at a rate of more than 15 per second for short bursts (Rumelhart & Norman,
1982). The major difference here is in the way in which typing and, more generally, the class of transcription
tasks (e.g., typing, reading aloud, and musical sight reading) is structured to allow the operator to make use of
preview, lag, and parallel processing. These are characteristics that allow more than one stimulus to be
displayed at a time (preview is available) and therefore allow the operator to lag the response behind
perception. Thus, at any time the response executed is not necessarily relevant to the stimulus that was most
recently encoded but is more likely to be related to a stimulus encoded earlier in the sequence. Therefore,
perception and response are occurring in parallel. Whether one speaks of this as preview (seeing into the
future) or lag (responding behind the present) obviously depends on the somewhat arbitrary frame of
reference one chooses to define the ‘present.’
Preview is demonstrated when the eyes fixate ahead of the material currently being keyed while transcribing a written text to the
keyboard; and its complement, transcription lag, can be demonstrated when the UN translator speaks words
that may have been heard a few seconds prior. Critically, when operators use preview and lag, they must
maintain a running ‘buffer’ memory of encoded stimuli that have not yet been executed as responses. This lag
does not hurt transcription because it is only a few seconds long, shorter than the harmful delays of working
memory discussed in Chapter 7. Furthermore, the lag provides two specific benefits to transcription
performance in that:
1. It allows for variability in input, either in rate (e.g., rate of speaking) or in difficulty of encoding (e.g.,
clarity of spoken words), thereby allowing the buffer to either fill or nearly empty without slowing the
rate of responses.
2. It allows for chunking, which is itself a major source of variance in encoding rate. Thus in transcribing
text, if one could see only a single letter at a time (no preview, no lag allowed), the appearance of the
letter ‘a’ would not distinguish between the word ‘a’ and the first letter of ‘and.’ With preview,
appearance of the latter would allow the single chunk of the word to be encoded as one entity (and
held as a single chunk in the buffer) to the benefit of transcription performance.
Evidence of the importance of preview in allowing for variability, chunking, and enabling a smoother rate of
responding is derived from studies that varied the amount of preview (Hershon & Hillix, 1965; Shaffer, 1973;
Shaffer & Hardwick, 1970). Here, more preview clearly helps, just as it does in tracking (Chapter 5), but the
benefits diminish with the number of entities that can be previewed, such that approximately eight letters of
preview in typing transcription is sufficient to produce maximum gains in transcription speed. Eight letters
would be sufficient to encompass the great majority of words but generally not enough to extract coherent
semantic meaning from word strings. The absence of heavy semantic involvement in transcription would
thereby explain how skilled typists may be able to carry on a conversation or perform other verbal activity
while typing (Shaffer, 1975; see also Chapter 10). Further details regarding the mechanisms and benefits of
preview in transcription skills are found in Wickens & Hollands (2000) and Shaffer (1975).
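The buffering role of preview and lag can be sketched in the same spirit. In the toy model below (all parameters hypothetical), encoding times vary from item to item, responses take a constant time, and encoding may run ahead of output only as far as the preview window allows.

```python
import random

random.seed(2)
# Hypothetical encoding times per item (ms): most items easy, some slow to encode.
encode = [random.choice([60, 60, 90, 300]) for _ in range(2000)]
RESPONSE = 120.0   # hypothetical constant time (ms) to execute each response

def total_time(encode, preview):
    """Tandem 'perceive -> buffer -> respond' model: the buffer may hold
    at most `preview` encoded-but-not-yet-executed items."""
    p_done = [0.0] * len(encode)   # when perception of each item finishes
    r_done = [0.0] * len(encode)   # when the response to each item finishes
    for j, e in enumerate(encode):
        wait_for = r_done[j - preview] if j >= preview else 0.0
        p_start = max(p_done[j - 1] if j else 0.0, wait_for)
        p_done[j] = p_start + e
        r_start = max(p_done[j], r_done[j - 1] if j else 0.0)
        r_done[j] = r_start + RESPONSE
    return r_done[-1]

for preview in (1, 2, 4, 8, 16):
    print(preview, round(total_time(encode, preview)))
# Total transcription time shrinks as the preview window grows and then levels
# off: a modest buffer absorbs most of the variability in encoding.
```

The leveling-off is broadly in the spirit of the empirical finding that a preview of roughly eight characters captures most of the benefit, although the specific break point in this toy model depends entirely on the invented parameters.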

5. ERRORS
In all phases of human performance, errors are a frequent occurrence. It has been estimated in various surveys
that human error is the primary cause of 60 to 90 percent of major accidents and incidents in complex systems
such as nuclear power, process control, and aviation (Rouse & Rouse, 1983). Card, Moran, and Newell (1983)
estimated that operators engaged in word processing make mistakes or choose inefficient commands on 30
percent of their choices. In one study of a well-run intensive care unit, doctors and nurses were estimated to
make an average of 1.7 errors per patient per day (Gopher et al., 1989). Errors in medicine were estimated to
account for approximately 98,000 deaths per year (Kohn et al., 1999). Although the overall accident rate in commercial and business aviation is extremely low, the proportion of accidents attributable to human error is
considerably greater than that due to machine failure. Of all accidents in commercial aviation, 88 percent have
been found to be due, in part, to human error (Boeing, 2000).
In the face of these statistics, it is important to reiterate a point made in Chapter 1—that many of the
errors people commit in operating systems are the result of bad system design or bad organizational structure
rather than irresponsible action by the person committing the error (Norman, 1988; Reason, 1990, 1997,
2008). Furthermore, although human error in accident analysis may be statistically defined as a contributing
cause to an accident, usually the error was only one of a lengthy and complex chain of breakdowns—many of
them mechanical or organizational—that affected the system and weakened its defenses (Perrow, 1984;
Reason, 1997, 2008; Wiegmann & Shappell, 2003).
We have already encountered human error in various guises and forms as we have discussed the different
ways in which human performance can fall short. Examples include misses and false alarms in signal
detection, failures of absolute judgment or discrimination leading to misclassification, failures of working
memory leading to forgetting, prospective memory failures, a variety of ‘decision errors’ resulting from biases
and heuristics, or tracking errors resulting from high bandwidth or instability. Most errors do show up as an inappropriate action, hence our choice to treat them comprehensively within this chapter.
The study of human error has emerged as an important and well-defined discipline (Norman, 1981;
Reason, 1990, 1997, 2008; Senders & Moray, 1991; Woods et al., 1994). Many human factors practitioners
have realized that errors made in operating systems are far more important and costly than delays of the 0.1- to 0.5-second magnitude typically observed in RT studies. This realization has forced human performance
theorists to consider the extent to which design guidelines based on RT generalize to error prediction; it has
also led researchers to consider classes of errors that do not necessarily result from the speed stress and SATO
typical of the RT paradigm, such as forgetting to change a mode switch on a computer or pouring orange juice
rather than syrup on your waffles.

5.1 Categories of Human Error: An Information-Processing Approach


A variety of taxonomies or classification schemes have been proposed for characterizing human error (Sharit,
2006). One example is the simple dichotomy between errors of commission (doing the wrong thing) and
errors of omission (not doing anything, when something should have been done). A more elaborate
classification scheme, consistent with the information processing representation in this book, is presented in
Figure 9.13 and is based upon schemes developed by Norman (1981, 1988) and Reason (1984, 1990, 1997,
2008). The human operator, confronting a state of the world represented by stimulus evidence, may or may
not interpret that evidence correctly; then given an interpretation, may or may not intend to carry out the
correct action to deal with the situation; finally, the operator may or may not execute that intention correctly.
Errors of interpretation or of formulating intentions are called mistakes. Thus, the misdiagnosis of the status
of the nuclear power plant at Three Mile Island is a clear example of a mistake. So, too, would be the
misunderstanding of the meaning of a button on any interface—a misunderstanding that would lead to its
incorrect use.

FIGURE 9.13 Information processing context for representing human error.

Quite different from mistakes are slips, in which the understanding of the situation is correct and the
correct intention is formulated, but the wrong action is accidentally triggered. Common examples are the
typist who presses the wrong key, the driver who turns on the wipers instead of the headlights, or the person who pours orange juice instead of syrup on the pancakes.
As shown in Figure 9.13, it is possible for either or both kinds of errors to occur in a given operation. In
the following, we elaborate the distinction between intentions and executions further, based on the more
detailed schemes (and excellent readings) of Norman (1981, 1988) and Reason (1990, 1997). Reason (2008)
provides a more elaborate distinction between error categories than that described here.

5.1.1 MISTAKES Mistakes—failing to formulate the right intentions—can result from the shortcomings of
perception, memory, and cognition. Reason (1990) has discriminated between knowledge-based and rule-
based mistakes. Knowledge-based mistakes are like the kinds of errors made in front-end decision making, in
which incorrect plans of actions are arrived at because of a failure to correctly assess the situation (i.e.,
incorrect knowledge). Such failures result, in part, from the influences of many of the biases and cognitive
limits described in Chapters 6, 7, and 8. Operators misinterpret communications, their working memory limits
are overloaded, they fail to consider all the alternatives, they may succumb to a confirmation bias, and so
forth. They may also result from insufficient knowledge or expertise to interpret complex information.
Finally, knowledge-based mistakes can often be blamed on poor displays that either present inadequate
information or present it in a poor format, such as a table of digital readouts rather than clear graphical
readouts.
Rule-based mistakes, in contrast, occur when operators are somewhat more sure of their ground. They
know (or believe they know) the situation, and they invoke a rule or plan of action to deal with it. The choice
of a rule typically follows an ‘if-then’ logic. When the understanding of the environmental conditions (the diagnosis) matches the ‘if’ part of the rule, or when the rule has been used successfully in the past, the ‘then’ part is
activated. The latter may be an action—‘If my computer fails to read the disk, I’ll reload and try again’—or
simply a diagnosis—‘If the patient shows a set of symptoms, then the patient has a certain disease’.
Why might rules fail and thereby cause mistakes? Reason notes, first, that a good rule might be
misapplied when the ‘if ’ conditions that trigger it are not actually met by the environment. This mistake often
occurs as exceptions to rules are learned. The rule has worked well in most cases, but subtle distinctions in the
environment or context now indicate that it is no longer appropriate. These distinctions or qualifications might
be overlooked, or their importance might not be realized. For example, although it is usually appropriate to
turn a vehicle in the direction in which you wish to go, an exception occurs when skidding on ice. The correct
rule then is to turn first toward the direction of the skid in order to regain control of the vehicle. Alternatively,
rule-based mistakes can result when a ‘bad rule’ is learned and applied.
Reason (1990) argues that the choice of a rule is guided by frequency and reinforcement. That is, rules
will be chosen that have frequently been employed in the past, have been successful, and therefore reinforced.
Rule-based mistakes tend to be made with a fair degree of certainty, as the operator believes that the
triggering conditions are in effect and that the rule is appropriate and correct. Thus, Reason describes rule-
based mistakes as ‘strong but wrong.’
While both knowledge-based and rule-based mistakes characterize intentions that are not appropriate for
the situation, there are some important differences between the two. Rule-based mistakes will be performed
with confidence, whereas in a situation in which rules do not apply and where knowledge-based mistakes are
more likely, the operator will be less certain. The latter situation will also involve far more conscious effort,
and the likelihood of making a mistake while functioning at a knowledge-based level is higher than it is at a
rule-based level (Reason, 1990) because there are so many more ways in which information acquisition and
integration can fail—through shortcomings of attention, working memory, logical reasoning, and decision
making.

5.1.2 SLIPS In contrast to mistakes, in which the intended action is wrong (either because the diagnosis is wrong
or the rule for action selection is wrong), slips are errors in which the right intention is wrongly carried out. A
common class of slips are capture errors, which result when the intended stream of behavior is ‘captured’ by
a similar, well-practiced behavior routine. Such a capture is allowed to take place for three reasons: (1) The
intended action (or action sequence) involves a slight departure from the routine, frequently performed action,
(2) some characteristic of either the stimulus environment or the action sequence itself is closely related to
the now wrong (but more frequent) action; and (3) the action sequence is relatively automated and therefore
not monitored closely by attention. As Reason (1990) eloquently says, ‘When an attentional check is omitted,
the reins of action or perception are likely to be snatched by some contextually appropriate strong habit
(action schema), or expected pattern (recognition schema).’

Pouring orange juice, rather than syrup, on the pancakes while reading the newspaper is a perfect
example of a slip. Clearly the act was not intended, nor was it attended since attention was focused on the
paper. Finally, both the stimulus (the tactile feel of the pitcher) and the response (pouring) of the intended and
the committed action were sufficiently similar that capture was likely to occur. A more serious type of slip—
related to the same underlying cause—occurs when the incorrect one of two similarly configured and closely
placed controls is activated, for example, flaps and landing gear on some classes of small aircraft. Both
controls have similar appearance, feel, and direction; they are located close together; both are relevant during
the same phases of flight (takeoff and landing); and both are to be operated when there are often large
attention demands in a different direction (outside the cockpit). One might also imagine slips occurring in a
lengthy procedure of checks and switch setting that is operated in one particular way when a system is in its
usual state, but involves a change midway through the sequence when the system is in a different state. In the
absence of close attention, the standard action sequence could easily capture the stream of behavior.

5.1.3 LAPSES Whereas slips represent the commission of an incorrect action, different from the intended one,
lapses represent the failure to carry out any action at all. As such they can be directly tied to failures of
memory, but they are quite distinct from the knowledge-based mistakes associated with working memory
overload typical of poor decision making. Instead, the typical lapse is what is colloquially referred to as
forgetfulness, like forgetting to remove the last page from the photocopier when you have finished (Reason,
1997), a class of lapses that are sometimes referred to as post-completion errors (Byrne & Davis, 2006). As
we discuss in the next chapter, critical lapses may involve the omission of steps in a procedural sequence. In
this case, an interruption is what often causes the sequence to be stopped, then started again a step or two later
than it should have been, with the preceding step now missing, or in some cases, with the final step not
accomplished at all (Li et al., 2008). This reflects a failure of prospective memory (Chapter 7) and/or
interruption management (Chapter 10).
Unfortunately, lapses occur all too frequently in maintenance or installation procedures when a series of
steps must be completed, but the omission of a single step can be critical (Reason, 1997). Such a step might be
the tightening of a nut, closing a fastener, or removing a tool that had been used in the maintenance procedure.
One survey of the causes of 276 in-flight aircraft engine shutdowns revealed that incomplete installation (i.e.,
a step was missed) was by far the largest cause, occurring over twice as frequently as the second largest
(Boeing, 2000). This was often the final action in the sequence.

5.1.4 MODE ERRORS Mode errors are closely related to slips, but also have the memory failure characteristic of
lapses. They result when a particular action that is highly appropriate in one mode of (typically computer)
operation is performed in a different, inappropriate mode because the operator has not correctly remembered
the appropriate context (Norman, 1988). An example would be pressing the accelerator of a car to start at an
intersection when the transmission is in the ‘reverse’ mode. Mode errors are of concern in more automated
cockpits, which have various modes of autopilot control (Wiener, 1988). Mode errors are also of major
concern in human-computer interaction if the operator must deal with keys that serve very different functions,
depending on the setting of another part of the system. Even on the simple word processor, a typist who
intends to type a string of digits (e.g., 1965) may mistakenly leave the case setting in the uppercase mode and
so produce !(^%. Mode errors may occur in computer text editing, in which a command that is intended to
delete a line of text may instead delete an entire page (or data file) because the command was executed in the
wrong mode.
Mode errors are a joint consequence of relatively automated performance or of high workload—when the
operator fails to be aware of which mode is in operation—and of improperly conceived system design, in
which such mode confusions can have major consequences. The reason, of course, that mode errors can occur
is that the same single action may be made in both appropriate and inappropriate circumstances.

5.1.5 DISTINCTIONS BETWEEN ERROR CATEGORIES The various categories of error can be distinguished in a
number of respects. For example, as already noted, knowledge-based mistakes tend to be characteristic of a
relatively low level of experience with the situation and a high attention demand focused on the task, whereas
rule-based mistakes, and particularly slips, are associated with higher skill levels. Slips are also more likely to
occur when attention is directed away from rather than toward the task or problem in question (a redirection
that is only likely when the task is well learned).
One of the most important contrasts between slips on the one hand, and mistakes and lapses on the other,
is in the ease of detectability. The detection of slips appears to be relatively easy because people typically

monitor, consciously or unconsciously, their motor output, and when the feedback of this output fails to match
the expected feedback (based on the correctly formulated intentions), the discrepancy is often detected.
Typing errors (usually slips) are very easily detected (Rabbitt & Vyas, 1970). In contrast, when the intentions
themselves are wrong (mistakes) or a step is omitted (lapse), any feedback about the error typically arrives
much later if at all, and errors cannot easily be detected online. This distinction in error correction is well
supported with data. In an analysis of simulated nuclear power plant incidents, Woods (1984) found that half
of the slips were detected by the operators themselves, whereas none of the mistakes were noted. Reason
(1990) summarized data from other empirical studies to conclude that the ease of error correction as well as
error detection also favors slips over mistakes. This factor is in part related to the easier cognitive process of
revising an action rather than reformulating an intention, rule, or diagnosis. However, system design
principles related to the visibility of feedback and the reversibility of action to be discussed below can have a
large impact on how easy it is to recover from a slip.
Given the many differences between slips and mistakes, it is logical to assume that the two major
categories should have somewhat different prescriptions for their remediation: heaviest emphasis on
preventing slips should focus on system and task design, addressing issues like S-R compatibility and
stimulus and control similarity. For the prevention of mistakes, in contrast, it is necessary to focus relatively
more on design features related to effective displays (supporting accurate updating of a mental model) and on
training (Rouse & Morris, 1987).

5.2 Human Reliability Analysis


Following the Three Mile Island nuclear power disaster in 1979, efforts in the human factors community
began to apply engineering reliability analysis to the human operator (Kirwan & Ainsworth, 1992; Miller &
Swain, 1987; Sharit, 2006), with the objective of predicting human error. A fairly precise analytic technique
can predict the reliability (probability of failure or mean time between failures) of a complex mechanical or
electrical system consisting of components of known reliabilities that are configured in series or in parallel
(Figure 9.14). For example, consider a system consisting of two components, each with a reliability of 0.9
(i.e., a 10 percent chance of failure during a specified time period). Suppose the components are arranged in
series, so that if either fails the total system fails (Figure 9.14a). This describes ‘the chain is only as strong as
its weakest link’ situation. The probability that the system will not fail (the probability that both components
will work successfully) is 0.9 × 0.9 = 0.81. This is the system reliability. Therefore, the probability of system
failure is precisely 1 – (0.9 × 0.9) = 1 – 0.81 = 0.19. In contrast, if the two components are arranged in parallel
(redundantly), as in Figure 9.14b, so that the system will fail only if both of them fail, the probability of
system failure is 0.1 × 0.1 = 0.01. Its reliability is 1 – 0.01 = 0.99.
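To make the arithmetic concrete, here is a minimal Python sketch (not part of the original text; the component reliabilities are simply the illustrative 0.9 values of Figure 9.14) of the series and parallel calculations:

def series_reliability(reliabilities):
    # All components must work: multiply the component reliabilities.
    product = 1.0
    for r in reliabilities:
        product *= r
    return product

def parallel_reliability(reliabilities):
    # The system fails only if every redundant component fails.
    p_all_fail = 1.0
    for r in reliabilities:
        p_all_fail *= (1.0 - r)
    return 1.0 - p_all_fail

components = [0.9, 0.9]
print(series_reliability(components))    # 0.81, so failure probability 0.19
print(parallel_reliability(components))  # 0.99, so failure probability 0.01

Running the sketch reproduces the values in the text: series reliability 0.81 (failure probability 0.19) and parallel reliability 0.99 (failure probability 0.01).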
The work of Miller and Swain on the technique for human error rate prediction (THERP) has attempted
to bridge the gap between machine and human reliability in the prediction of human error (Miller & Swain,
1987; Swain, 1990). THERP has three important components.
1. Human error probability (HEP) is expressed as the ratio of the number of errors made on a particular
task to the number of opportunities for errors. For example, for the task of routine keyboard data entry,
the HEP might be 1/100. These values are obtained, where possible, from databases of actual human
performance (Sharit, 2006). When such data are lacking, they are instead estimated by experts,
although such estimates can be heavily biased and are not always terribly reliable (Reason, 1990).
2. When a task analysis is performed on a series of procedures, it is possible to work forward through an
event tree, or fault tree, such as that shown in Figure 9.15. In the figure, the two events (or actions)
performed are A and B, and each can be performed either correctly (lowercase) or in error (capital).
An example might be an operator who must read a value from a table (event A) and then enter it into a
keyboard (event B). Following the logic of parallel and serial components, and if the reliability of the
components can be accurately determined, it is then possible to deduce the probability of successfully
completing the combined procedure or, alternatively, the probability that the procedure will be in
error, as shown at the bottom of the figure.
3. The HEPs that make up the event tree can be modified by performance shaping factors, multipliers
that predict how a given HEP will increase or decrease as a function of expertise or the stress of an
emergency (Miller & Swain, 1987). Table 9.1 is an example of the predicted effects of these two variables; a brief worked sketch combining the three components follows this list.
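The following sketch illustrates how the three components above might be combined for the two-step read-then-enter procedure of Figure 9.15, treating the two steps as a series arrangement. It is a hedged illustration only: the HEP values and the ×2 stress multiplier are hypothetical placeholders in the spirit of Table 9.1, not values drawn from the THERP database.

# Hypothetical THERP-style calculation for a two-step procedure:
# event A = read a value from a table, event B = key it into a keyboard.
# The HEPs and the shaping-factor multiplier are illustrative only.

hep = {"A": 0.005, "B": 0.01}    # nominal human error probabilities
stress_multiplier = 2.0          # e.g., moderately high stress, skilled operator

def adjusted_hep(nominal, multiplier):
    # Apply a performance shaping factor, never letting the probability exceed 1.0.
    return min(1.0, nominal * multiplier)

p_a_ok = 1.0 - adjusted_hep(hep["A"], stress_multiplier)
p_b_ok = 1.0 - adjusted_hep(hep["B"], stress_multiplier)

p_procedure_ok = p_a_ok * p_b_ok          # both steps must succeed (series logic)
p_procedure_error = 1.0 - p_procedure_ok
print(p_procedure_ok, p_procedure_error)  # roughly 0.970 and 0.030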

FIGURE 9.14 (a) Two components in series; (b) two components in parallel. The numbers in the boxes indicate the component reliabilities.
The numbers below indicate the system reliabilities. Probability of error = 1.0 – reliability.

Human reliability analysis represents an admirable beginning to the development of predictive models of
human error. Its advocates have argued that it can be a useful tool for identifying critical human factors
deficiencies. Furthermore, as noted in Chapter 1, the hard HEP numbers output by the model, which document poor human factors in the form of increased predicted errors, can be an effective tool for lobbying designers to incorporate human factors concerns (Swain, 1990). In spite of its potential value,
however, human reliability analysis has a number of major shortcomings, which have been carefully
articulated by Adams (1982), Reason (1990), and Dougherty (1990). Briefly, these are as follows.

5.2.1 ERROR MONITORING When machine components fail, they require outside repair or replacement. Yet as we
have seen, humans normally have the capability to monitor their own performance, even when operating at a
relatively automated level. As a result, humans often correct errors before those errors ultimately affect
system performance, particularly capture errors or action slips (Rabbitt, 1978). The operator who accidentally
activates the wrong switch may be able to shut it off quickly and activate the right one before any damage is
done. Thus, it is difficult to associate the probability of a human error with the probability that it will be
cascaded onward to induce a system error.

FIGURE 9.15 Fault tree analysis, representing the success or failure of two subtasks (a and b) either in series or in parallel. Lower case indicates successful performance; CAPITALS indicate failed performance.

TABLE 9.1 Model accounting for stress and experience in performing routine tasks

Stress Level          Increase in error probability
                      Skilled        Novice
Very low              X2             X2
Optimum               X1             X1
Moderately high       X2             X4
Extremely high        X5             X10

Source: D. Miller & A. Swain, ‘Human reliability analysis’ in G. Salvendy (Ed) Handbook of Human Factors, NY: John Wiley & Sons, Inc.
Reprinted by permission.

5.2.2 NONINDEPENDENCE OF HUMAN ERRORS The assumption is sometimes made in analyzing machine errors that
the probability of the failure of one component is independent of that of another. Although this assumption is
questionable when dealing with equipment (Perrow, 1984), with humans it is particularly untenable. Such

dependence may work in two opposing directions. On the one hand, if we make one error, our resulting
frustrations and stress may sometimes increase the likelihood of a subsequent error. On the other hand, the
first error (if detected) may increase our care and caution in subsequent operations and make future errors less
likely. Whichever the case, it is impossible to claim that the probability of making an error at one time is
independent of whether an error was made at an earlier time, a critical assumption normally made in reliability
analysis. The actuarial database on human error probability, which is used to predict reliability, will not easily
capture these dependencies because they are determined by mood, caution, personality, and other uniquely
human properties (Adams, 1982).
A similar lack of independence can characterize the parallel operation of two human ‘components.’
When machine reliability is analyzed, the operation of two parallel (or redundant) components is assumed to
be independent. For example, three redundant autopilots are often used on an aircraft so that if one fails, the
two remaining in agreement will still give the true guidance input. None of the autopilots will influence the
others’ operation (unless they are all affected by a superordinate factor such as a total loss of power). This
independence, however, cannot be said to hold true of multiple human operators. Social factors may make the
two operators relatively more likely to agree than had they been processing independently, particularly if one
is in a position of greater authority (see also Chapter 6). Their overall effect may be to make correct
performance either more or less likely, depending on a host of influences that are beyond the scope of this
book.

5.2.3 INTEGRATING HUMAN AND MACHINE RELIABILITIES Adams (1982) argues that it is difficult to justify
mathematically combining human-error data with machine-reliability data, derived independently, to come up
with joint reliability measures of the total system. Here again a non-independence issue is encountered. When
a machine component fails (or is perceived as being more likely to fail), it will probably alter the probability
of human failure in ways that cannot be precisely specified. It is likely, for example, that the operator will
become far more cautious, trustworthy, and reliable when interacting with a system that has a higher
likelihood of failure or with a component that itself has just failed than when interacting with a system that is
assumed to be infallible. We saw this trade-off in our discussion of alarms in Chapter 2, and it will be
considered again in the discussion of automation mistrust in Chapter 12.
The important message here, as stated succinctly by both Reason (1990) and Adams (1982), is that a
considerable challenge is imposed to integrate actuarial data of human error with machine data to estimate
system reliability. Unlike some other domains of human performance (see particularly manual control in
Chapter 5), even if the precise mathematical modeling of human performance were achieved, it would not
appear to allow accurate prediction of total system performance. Although the potential benefits of accurate
human reliability analysis and error prediction are great, it seems likely that the most immediate human
factors benefits will be realized if effort is focused on case studies of individual errors in performance (Woods
et al., 1994). These case studies can be used to diagnose the resulting causes of errors and to recommend the
corrective system modification.

5.3 Errors in an Organizational Context


While the discussion till now has focused on individual information processing causes of error, a vital
extension is to analyze errors within a broader organizational context (Reason, 1990, 2008). This approach
becomes particularly important in the analysis of error causes (Wiegmann & Shappell, 2003). Here one may
identify not only individual breakdowns such as slips and mistakes, but also intentional violations (such as
intentionally exceeding the speed limit or shortcutting safety procedures), poor training, poor supervision, and
corporate policies and climates that circumvent a strong safety culture (Reason, 2008). These issues go well
beyond the scope of this book and the reader is referred to Wiegmann and Shappell (2003) and Reason (1997,
2008) for thorough treatments.

5.4 Error Remedies


We will now discuss the solutions offered to minimize the likelihood of errors or the potential damage that
they might cause.

5.4.1 TASK DESIGN Designers should try to minimize operator requirements to perform tasks that impose heavy
working memory load under conditions of stress or other tasks for which human cognitive mechanisms are
poorly suited. Such efforts will generally decrease the frequency of mistakes.

5.4.2 EQUIPMENT DESIGN There are a number of equipment design remedies that reduce the invitation for errors:
• Minimize perceptual confusions. Norman (1988) has described the care that is taken in the automobile
to ensure that fluid containers and apertures look distinct from one another, so that oil will not be
poured into the antifreeze opening nor antifreeze into the battery and so forth. Such a design stands in
stark contrast to the identical appearance of different fluid tubes and fluid containers supporting the
patient in an intensive care unit, a situation that describes an error waiting to happen (Bogner, 1995;
Gopher et al., 1989). There are, of course, a series of design solutions that can ensure discriminability
between controls and between displays which have been described in earlier chapters of the book:
distinct color and shape, spatial separation, distinct feel, and different control motions.
• Make the execution of action and the response of the system visible to the operator to aid error
recognition (Norman, 1988). When slips occur, they cannot easily be detected (and hence corrected) if
the consequences of actions cannot be seen. Hence, feedback from switches and controls that change a
state should be clearly and immediately visible. If it is not too complex, the way a system carries out
its operations should be revealed. Unfortunately, extreme simplicity, economy, and aesthetics in
engineering design can often mask the visibility of response feedback and system operation, a
visibility useful in preventing errors.
• Use constraints to ‘lock out’ the possibility of errors (Norman, 1988). Sometimes these can be
cumbersome and cause more trouble than they are worth. For example, interlock systems that prevent
a car from starting before the seat belts are fastened have proven to be so frustrating that people
disconnect the systems. On the other hand, an effective constraint is that seen in the car doors that
cannot be locked unless an action is taken on the key itself. This slight inconvenience will prevent the
key from being locked in the car. Other constraints may force a sequence of actions in the computer
that will prevent the commission of major errors—like erasing important files.
• Offer reminders. Given the prevalence of lapses, care can be taken to remind users of steps that are
known to be particularly likely to be omitted. An example is a prominent note on the photocopier
reading: ‘take the last page’ (Reason, 1997), which addresses the common post-completion error of leaving the final page in the copier. Devices can also remind people (Herrmann et al., 1999).
• Avoid multimode systems. Systems, like the multimode digital watch, in which identical actions
accomplish different functions in different contexts, are a sure invitation for mode errors. When they
cannot be avoided, the designer should make the discrimination of modes as visible as possible by
employing salient visual cues. A continuous flashing light on a computer system, for example, is a
salient visual reminder that an unusual mode is in effect. Designers should resist the temptation to
create a great number of modes, where spatial separation can allow distinct physical differences in
controls.

5.4.3 TRAINING Because lack of knowledge is an important source of mistakes, it is not surprising that increased
training will reduce their frequency (although, as we have seen, training may have little effect on slips). As we
have noted in Chapter 7, however, it is appropriate that some errors do occur during training. If operators are
not practiced at correcting errors that occur during training, they will not know how to deal with the errors that
might occur in real system operation.

5.4.4 ASSISTS AND RULES Both assists and rules can represent designer solutions to error-likely situations, and
some of these make very obvious sense. For example, assists such as checklists, which serve as memory aids for procedures, can be extremely valuable (Rouse, Rouse, & Hammer, 1982), whether for operators of equipment following a
start-up procedure or for maintenance personnel carrying out a complex sequence of lapse-prone steps, or for
medical personnel during procedures (Gawande, 2009). If rules are properly explained, are logical, and are
enforced, they can reduce the likelihood of safety violations. However, if the implications of rules adopted for
complex systems, like nuclear and chemical process control plants, are not thought through, they can create
unforeseen problems of their own. As Reason (1990) describes it, the ‘band-aid’ approach to human error may
only make the situation worse. For example, rules may unexpectedly inhibit necessary behavior in times of
crisis, in a way that the rule designer had not anticipated.

5.4.5 ERROR-TOLERANT SYSTEMS Although human error is typically thought of as undesirable, it is possible to
see its positive side (Reason, 2008; Senders & Moray, 1991). In discussing both signal detection theory
(Chapter 2) and decision theory (Chapter 8), we saw that in a probabilistic world, certain kinds of errors will
be inevitable, and engineering psychologists are concerned as much with controlling the different kinds of

errors (e.g., misses versus false alarms) as with eliminating them. In Section 2.2, we saw that the optimal
setting of the speed-accuracy trade-off was usually at some intermediate level, where at least a small number
of errors was better than none at all. In Chapter 7, we saw that error is often necessary for learning to occur
(so long as the error is not repeated).
Finally, as discussed in Chapter 1, error may be viewed as the inevitable downside of the valuable
flexibility and creativity of the human operator. Understanding the inevitable and sometimes even desirable
properties of human error has forced a rethinking of conventional design philosophies, in which all errors
were to be eradicated (Rasmussen, 1989). Instead, researchers and human factors practitioners have advocated
the design of error-tolerant systems (Norman, 1988; Rouse & Morris, 1987). An error-tolerant design, for
example, would not allow the user to carry out irreversible actions without clear reminders (‘are you sure you
want to …’). A file-delete command on a computer will not irreversibly delete the file but simply remove it
and ‘hold’ it in another place for some period of time (e.g., until the computer is turned off or the user
commands ‘empty the trash’). Then the operator would have the chance to recover from the slip (which in this
case was an incorrect deletion command; Norman, 1988). The ‘undo’ button in computer systems has been
immensely valuable in this regard.
The concept of error-tolerant systems is closely linked to the concept of adaptive automation, discussed
in Chapter 12. In many error-tolerant systems, an automated or intelligent agent will monitor human
performance, and if it senses degradation, often manifest in errors, it will intervene to notify the human, or
perhaps even take control and correct the error.

6. TRANSITION
At this point in the book, we have completed our treatment of the various stages of processing information,
typically for a single task. Now in Chapter 10, we transition to discussing what happens when two concurrent
tasks compete for time and resources, the issue of multitasking.

Key Terms
adaptive automation 320
additive factors technique 303
affordance 303
alternation effect 292
bandwidth 306
central processing code 301
choice RT 284
chording 307
colocation principle 294
congruence 294
decision complexity advantage 307
error-tolerant systems 320
executive control 293
expectancy 287
force paced 308
Hick-Hyman law 288
imperative stimulus 286
interstimulus interval 304
knowledge-based mistakes 312

locational compatibility 294
lockout 303
mistakes 311
movement compatibility 294
performance shaping factors 316
post-completion errors 314
psychological refractory period 304
psychophysiological techniques 303
repetition effect 292
response-stimulus interval 308
rule-based mistakes 312
self paced 308
serial RT task 285
simple RT 284
single-channel processor 304
skill-based behavior 284
slips 312
speed-accuracy micro-trade-off 291
speed-accuracy operating characteristic 289
speed-accuracy trade-off 289
stimulus-response compatibility 293
subtractive technique 303
transcription tasks 310
transmitting information 287
violations 318
visual field compatibility 299
warning interval (WI) 286
Warrick principle 299

10 MULTITASKING

A woman was texting while walking along the street. She became very heavily engaged trying to understand a
difficult problem that had arisen at work but was poorly explained in the abbreviated text message. Approaching a crosswalk, she gave a brief and inattentive glance rightward, but did not observe the car speeding toward her. Meanwhile, in that car the driver was himself heavily engaged in electronic communications, here on a hands-free cell phone. His eyes were forward, but his mind was fully engaged on the conversation concerning the
potential loss of a large contract for which his intervention was badly needed. Both parties to this scenario
were overloaded and neither noticed the other until the impact.

1. OVERVIEW
The study of multitasking examines how well each task in a multitask set (usually a dual task pair) will be
performed in a dual task combination, relative to how each is performed alone. If there is a decrease, it is
described as a dual task decrement, and the mechanisms by which this decrement is produced have been the focus of psychologists for over a century (James, 1890; Titchener, 1908) as well as the more recent focus of applied psychology to understand and remedy the causes of multitask overload in environments such as air or ground transportation, the intensive care unit, or the command and control center in crisis (Wickens & McCarley, 2008; Johnson & Proctor, 2004).
Multitasking can also be described as dividing attention between tasks rather than between information
channels as discussed in Chapter 3. While the latter focuses heavily on sensation and perception, the former
considers the causes of task interference at all stages of processing and between all sorts of different cognitive
and response activities. Our approach below will focus on four general mechanisms of human performance
that can account for variability in multitask proficiency (or the dual task decrement) and across task
configurations and across people. These are the effort (resource) demands of a task related to its difficulty, the
similarity between two tasks in their demand for multiple resources, the relative priority or emphasis given on
one task or another, and the similarity between tasks in terms of the specific information and mappings within
each task of the pair. We conclude the chapter by discussing how people differ in their multitasking fluency.
The first three mechanisms, resource demand, resource multiplicity, and resource allocation, can all be
accommodated within the structure of multiple resource theory (Navon & Gopher, 1979; Wickens, 1984,
2002a, 2008) whose architecture is shown in Figure 10.1. At the lower left, task interference is determined by
the difficulty or resource demands of a task. Quite intuitively, we can time share two easy tasks successfully
(walking and talking); but if one or both becomes difficult–walking on a narrow ledge, or explaining a
complex concept–the other may be somewhat degraded. We say they now compete for resources and one or
the other (or both) may not have the resources needed for performance at its single task level.
At the lower right of the figure, we emphasize that the human does not possess just one “pool” of mental
resources for which all tasks compete equally. There are multiple resources, such as those used for voice and
manual responses. In the above, walking the ledge and tying a knot, two visual-motor activities, will be more
interfering than walking and talking. When two tasks demand the same resources, their decrement will be
greater than when they use separate resources. Resource demand and resource multiplicity together determine
the total dual task decrement. But when there is a decrement, which task suffers more? Or is the decrement
divided equally between the two? What happens when an interruption occurs in the middle of a high priority
task? Will the latter task be dropped, or will the interruption be ignored? This determination is made by the
resource allocation component, shown at the top of the figure. In the next three sections, we discuss each of
these three components of multiple resource theory in turn.

FIGURE 10.1 Architecture of multiple resource theory.

2. EFFORT AND RESOURCE DEMAND


We have encountered the concept of effort in several prior contexts within this book: the effort required to
continue a visual search (Wolfe, Horowitz, & Berman, 2005) or information search and seeking
(Janiszewski, 2008; Morgan, Patrick, et al., 2009), the role of effort in constraining visual scanning (the
first E in SEEV; see Chapter 3), the competition between working memory effort and information access
effort in the proximity compatibility principle (Chapter 3), the reduced effort required of heuristics versus
optimal algorithms, or type 1 versus type 2 processing in decision making (Kahneman & Klein, 2009; see
Chapter 8, figure 8.9), the effort invested in learning or following instructions as represented in cognitive load
theory (Mayer, 2007; Paas, Renkl, & Sweller, 2003; Chapter 7). Here we focus explicitly on the role of effort
in predicting or accounting for a dual task decrement.
In 1890, William James first invoked the concept of effort or difficulty by writing: “If you ask how many things or ideas one can attend to at once, the answer is not very easily more than one, unless the processes are very habitual.” In this sentence, James essentially defined a continuum of task difficulty that dictates the ease of dividing attention. Subsequently, the concept of “habitual” as it influences divided attention has been labeled “automaticity” (Fitts & Posner, 1967; Schneider & Shiffrin, 1977).
There is ample evidence that tasks which are automatic, either because of extensive practice (see
Chapter 7; Schneider & Shiffrin, 1977; Bahrick & Shelly, 1958) or because of the very simplicity of their
stimulus-response mappings (e.g., a simple response time task, in the context of Chapter 9) require minimal
attentional resources to be performed, thereby availing near full resources for a concurrent task, and achieving
the gold standard of perfect time sharing (zero decrement). In fact, there is evidence that in some
circumstances very highly practiced tasks may actually degrade when more attention is focused on them. For
example, Gray (2004) examined the phenomenon of “choking” by skilled athletes (baseball players) and
found that performance actually degraded for the expert batters when attention was directed toward the
batting task in a way that was not true for novices.
If automaticity defines one end of the resource demand scale, then we can speak of features that shift this
demand in a positive direction, incurring greater interference with concurrent tasks. Two general factors are
the lack of experience or practice and the intrinsic difficulty or complexity of a task (Halford, Baker, et al.,
2005; Halford, Wilson, & Phillips, 1998). In Chapter 11, we will discuss the ways of representing and
measuring this cognitive difficulty or mental workload of tasks. Here we focus on the notion that increased
difficulty or decreased automaticity leaves the human with less residual attention, residual resources, or
spare capacity to perform concurrent tasks, hence creating or amplifying a dual task decrement.
The relationship between resources demanded by (and therefore invested in) a task and its performance
can be graphically represented in the performance-resource function shown in Figure 10.2 (Norman &
Bobrow, 1975). Here on the X axis is presented the resources invested into a task; think of moving from left
to right as “trying harder.” On the Y axis is any measure of performance on the task such that higher represents “better” (more accurate, more rapid, etc.). Three curves are presented. The relatively linear curve
(A) represents a more difficult task (or a task for a less skilled performer). Full resources are needed to
achieve perfect performance. Hence any withdrawal of resources to allocate to a concurrent task will lead to a
dual task decrement. This withdrawal is indicated by the vertical dashed line. For clear reasons the task

represented by this curve is said to be fully resource limited. The dashed curve (B) shown above it in the
figure typifies an “easier” task, or one performed by the expert. Only a small amount of resources is necessary to attain perfect performance. Additional resource investment can be said to be “wasted” and there
will be ample resources available for a concurrent task. This second task is not much resource limited. As is
evident, the course of learning and practice at any task will produce a continuum of movement from curve A
to curve B.
Curve C in Figure 10.2 illustrates a data limit (Norman & Bobrow, 1975): a kind of limit to performance
that is quite different from the resource limit. Here performance is far from perfect even with full resource
investment, but even after investing a small amount of resources, no further performance gain can be achieved
by investing more. Why? Because the quality of performance is limited by the source of data or information
for the task. As one example, you cannot hear a faint signal below your threshold of hearing, no matter how
hard you “strain your ears.” As another example, you cannot understand a fast conversation in a poorly learned foreign language, no matter how hard you “strain your brain.” In both of these cases, after investing a
certain amount of resources, further investment will be fruitless and you might as well save the residual as
spare capacity for other concurrent tasks.

FIGURE 10.2 The performance-resource function. The three different curves are described in the text.

As the foreign language example illustrates, data limits can also refer to data from long term memory
(Norman & Bobrow, 1975). For example, trying to retrieve the name of a person or vocabulary word you
know that you do not know would be a data limited task, but investing effort into retrieving a name you
“know you know ”(Nelson, 1996) would be more resource limited. As is apparent, any given task can contain
both resource limited and data limited regions along the PRF. In fact the “perfect performance ceiling ”of task
B in Figure 10.2can be referred to as a data limit, in the sense that beyond investing about 30 percent
resources, further resource investment cannot improve performance further.
In dual task performance, the obvious implication of the distinction between resource and data limits can
be seen in Figure 10.3 in which the relative allocation of resources between two tasks is plotted on the X axis.
Here the PRF for task B is shown as it was in Figure 10.2. But the PRF for task A is now plotted backwards.
Hence a given allocation policy of dividing resources between tasks represents a point on the X axis
(Wickens, Kramer, et al., 1983). It is quite feasible to manipulate relative resource allocation through
instructions (Gopher, Brickner, & Navon, 1982) and in real-world environments people will spontaneously
adopt some resource allocation strategy. This is not always optimal as our story at the beginning of the chapter
illustrated; both parties “engaged” in tasks that should have been of lower priority, allowing the task of hazard detection to degrade. In Figure 10.3, the midpoint representing equal priority (vertical solid line) shows a decrement in task A, but none in task B. A different allocation policy, shown by the vertical dashed line,
emphasizing A at the expense of B will actually produce near perfect time sharing (Schneider & Fisk, 1982).
We return to the issue of resource allocation in section 4.
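A minimal numerical sketch of this allocation logic follows. The two piecewise-linear performance-resource functions are invented for illustration (task A fully resource limited, task B reaching its data limit at roughly 30 percent investment) and are not fitted to any empirical data.

# Illustrative performance-resource functions (PRFs); each returns
# performance on a 0-1 scale given the fraction of resources invested.

def prf_task_a(resources):
    # Resource-limited task: performance grows linearly with investment.
    return resources

def prf_task_b(resources):
    # Easier task: performance is perfect once about 30% of resources are invested.
    return min(1.0, resources / 0.3)

for share_to_a in (0.5, 0.7, 0.9):
    share_to_b = 1.0 - share_to_a
    print(share_to_a,
          round(prf_task_a(share_to_a), 2),
          round(prf_task_b(share_to_b), 2))

With a 0.5/0.5 split, task A shows a decrement while task B does not, as at the solid line of Figure 10.3; shifting the allocation toward A (0.7/0.3) shrinks A's decrement while B remains at its data-limited ceiling, analogous to the dashed-line policy.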
The concept of resource demand in dual task circumstances has been adopted in a variety of different
contexts. For example, in Chapter 7, we have described the role of resource demands imposed by germane,
intrinsic and extrinsic load in learning (Paas & van Gog, 2009). Other investigators have argued that certain
kinds of material, such as the relative frequency of events, are learned automatically, in that frequency learning appears to progress as rapidly in dual as in single task conditions (Hasher & Zacks, 1979), even though
resources are more scarce in the former. In one fascinating application Kaplan and Berman (2010) have
argued that the resource demands of executive control compete with those of self regulation, necessary to
control impulses. Hence, heavily demanding cognitive tasks render self control more difficult. Interestingly,
the authors also provide data to suggest that a green, natural environment can restore resources depleted by

the combined demands of self regulation and cognition.

FIGURE 10.3 Two performance resource functions of time shared tasks illustrating the tradeoff of resources between them, as task priority is
varied.

Our discussion of decision making in Chapter 8 implied that “resource-lite” heuristics were often chosen to minimize resource expenditures (Fennema & Kleinmuntz, 1996), and Gray and Fu (2002) have nicely modeled how the user’s choice of different interface options (relying upon imperfect memory versus key presses to retrieve information) is driven, in part, by minimizing resource demands. See also Ballard, Hayhoe, and Pelz (1995).
The concept of resources as effort, articulated clearly in a wonderful book by Kahneman (1973), as well as the concept of effort depletion, has led to an understanding of its close relationship to neurophysiology.
We discuss this linkage more fully in the next chapter. Here however we note two other linkages. First, the
sustained deployment of effort does impose long term costs. We saw this as a limit of sustained attention in
both decision fatigue (Tierney, 2009; Chapter 8) and in the vigilance task (Deaton & Parasuraman, 1988;
Chapter 2). Even though the vigilance paradigm does not represent the general dual task situation (in fact
concurrent tasks can sometimes improve vigilance; Atchley & Chan, 2011), effort deployment and task
performance is very relevant as the “pool” of resources appears to decline over time (or the motivation to
invest effort declines). Sustained effort cumulates in fatigue.
The second linkage concerns an issue that has intrigued many psychologists and is the extent to which
the “pool” of resources is a fixed versus a variable one (Young & Stanton, 2002). What this “pool” might be in terms of actual brain functioning will be addressed in the next chapter. However, Kahneman (1973) has argued that it is not fixed. He asserts that it is simply “harder to try hard on an easier task, than on a hard task.” That is, increasing task demand itself essentially mobilizes additional resources, expanding the pool, as task
demand dictates. Young & Stanton (2002) provide data that are consistent with this view: processing becomes
more efficient with a more difficult task because more resources become available, and the ability to expand
the pool may differ between people (Matthews, Warm, et al., 2011; Matthews & Davies, 2001). Further
discussion of resources in task performance follows as we address mental workload in the next chapter.

3. MULTIPLICITY
Kahneman (1973; see also Rolfe, 1973), who invoked and elegantly articulated the concept of mental resources underlying multiple task performance, also acknowledged in his final chapter that there were other factors accounting for task interference beyond a single “pool” of undifferentiated capacity, as it was then described. In particular, he pointed to structural interference, such as the eye fixation needing to be at different places at the same time (our unfortunate texting pedestrian), or motor interference, such as the hand needing to execute two simultaneous competing actions. At around this time, other investigators (Kantowitz & Knight, 1976; Navon & Gopher, 1979; Wickens, 1976) began to postulate
multiple rather than single resources.
The need for such elaboration came from several sources of evidence. In particular, some experimental
evidence suggested that a more difficult task (e.g., vigilance monitoring) might interfere less with another task
(tracking) than a less difficult one (maintaining a constant force; Wickens, 1976). Others focused on the
concept of “difficulty insensitivity” such that increases of demand in one task sometimes did not degrade
performance of the concurrent task or did not degrade performance of the demand-increased task by an
amount greater than would be the case in single task situations where resources were more plentiful
(Kantowitz & Knight, 1976; Vergauwe, Barrouillet, & Camos, 2010), even though such an effect would be predicted with a single undifferentiated resource “pool.” It was also noted that even when obvious structural limitations were removed (e.g., visual information for two sources was placed adjacently, so no scanning was required), dividing attention between stimuli to different senses (auditory and vision) still led to less

interference than between the same senses (e.g., Treisman & Davies, 1980). Such data are consistent with the
notion that humans possess separate resources, so that when two tasks demand non-overlapping resources, the
above two findings can be observed (Wickens, 1980).
Combining the implications of such human performance data with physiologically plausible dimensions
that might define such separation of resources within the brain (e.g., different sensory cortexes, different
cerebral hemispheres, anterior versus posterior brain regions), Wickens (1980) postulated a relatively simple
three-dimensional multiple resource model (stages, codes, modalities); since elaborated to four (focal-ambient
vision within visual perception; Wickens, 2002a, 2008). To the extent that two tasks demand separate
resources along these four dichotomous dimensions, (a) overall time sharing will improve and (b) increases in
the difficulty of one task will be less likely to degrade performance of the concurrent task. Each of these four
dimensions is now described in turn.

3.1 Stages
The resources used for perceptual activities and for cognitive activities (e.g., involving working memory)
appear to be the same and are functionally separate from those underlying the selection and execution of
responses (Figure 10.4). Evidence for this dichotomy is provided when the difficulty of responding in one task
is varied (demanding greater or fewer resources) and this manipulation does not affect performance of a
concurrent task whose demands are more perceptual and cognitive in nature. Conversely, evidence is provided
when increases in perceptual-cognitive difficulty do not much influence the performance of a concurrent task
whose demands are primarily response-related (Wickens & Kessel, 1980).

FIGURE 10.4 The two stage-defined resources.

In the realm of language, Shallice, McLeod, and Lewis (1985) have examined dual-task performance on
a series of tasks involving speech recognition (perception) and speech production (response) and have
concluded that the resources underlying these two language processes are somewhat separate, even as they
share verbal resources (see codes below). It is important that the stage dichotomy can be associated with
different brain structures (see Chapter 11). That is, speech and motor activity tend to be controlled by frontal
regions in the brain (forward of the central sulcus), while perceptual and language comprehension activity
tends to be posterior of the central sulcus. Physiological support for the dichotomy is also provided by
research on event-related brain potentials (e.g., Isreal, Chesney, et al., 1980; see Chapter 11).
As shown in Figure 10.4, the stage dichotomy of the multiple resource models also predicts that there
will be substantial interference between resource-demanding perceptual tasks and cognitive tasks requiring
working memory to store or transform information (Liu & Wickens, 1992b; Liu, 1996). Even though these do
constitute different stages of information processing, they are supported by common resources. For example,
visual search coupled with mental rotation, or speech comprehension coupled with verbal rehearsal both
provide examples of operations at different stages (perceptual and cognitive), that will still compete for
common stage-defined resources, and will thus be likely to interfere. As our unfortunate driver illustrated, the
cognitive processes in cell-phone conversation clearly interfere with perceptual processes involved in noting
changes in the driving environment (McCarley Vais et al., 2004; Strayer & Drews, 2007), and in pedestrian
judgments of road crossing safety in heavy traffic (Neider, McCarley, et al., 2010). Fougnie and Marois
(2007) have explicitly linked the increasing demands of a working memory task to an increase in change
blindness as discussed in Chapter 3.
Finally, we note how the stage dichotomy of multiple resources is consistent with the evidence for a
bottleneck in response selection (the psychological refractory period), as discussed in Chapter 9 (Pashler,
1998), in that two tasks both involving a response selection stage will aggressively compete for the common

response-related resource (causing a delay in response to the second-arriving stimulus), but such response
selection will compete much less with tasks that rely upon perceptual cognitive processing.

3.2 Processing Codes


The processing code dimension reflects the distinction between analog/spatial processing and
categorical/symbolic (usually linguistic or verbal) processing. Data from multiple task studies (see Wickens,
1980) indicate that spatial and verbal processes, or codes, whether functioning in perception, cognition, or
response, depend on separate resources and that this separation can often be associated with the two cerebral
hemispheres (Polson & Friedman, 1988; see also Baddeley, 1986, 2002, and Logie, 1995, and Chapter 6 and
Chapter 7 for parallel views on the important distinctions between spatial and verbal working memory or
cognitive operations).
The distinction between spatial and verbal resources also accounts for the relatively high degree of
efficiency with which manual and vocal responses can be time-shared, assuming that manual responses are
usually spatial in nature (tracking, steering, joystick or mouse movement) and vocal ones are usually verbal.
In this regard several investigations (e.g., Martin, 1989; Tsang & Wickens, 1988; Wickens & Liu, 1988;
Wickens, Sandry, & Vidulich, 1983; Sarno & Wickens, 1995; Tsang, 2006) have shown that continuous
manual tracking and a discrete verbal task are time-shared more efficiently when the discrete task employs
vocal as opposed to manual responses. Also consistent is the finding that discrete manual responses using the
nontracking hand appear to interrupt the continuous flow of the manual tracking response, whereas discrete
vocal responses do not (Wickens & Liu, 1988). Note that a hybrid operation is keyboarding (typing).
This can best be described as a manual response that is fed by verbal cognition (or visual-verbal input, if it is
simple transcription; see Chapter 9).
An important practical implication of the processing codes distinction is the ability to predict when it
might or might not be good to employ vocal (speech) versus manual control. Manual control may disrupt
performance in a task environment imposing demands on spatial working memory (e.g., driving), whereas
voice control may disrupt performance of tasks with heavy verbal demands (or be disrupted by those tasks,
depending on resource allocation policy). This issue is addressed in discussing distracted driving below in
Section 5.

3.3 Perceptual Modalities


It is apparent that we can sometimes divide attention between the eye and ear better than between two
auditory channels or two visual ones (Wickens, 1980; Meyer & Kieras, 1997). That is, cross-modal time-
sharing is better than intramodal time sharing. As examples, Wickens, Sandry, and Vidulich (1983) found
advantages to cross-modal over intramodal displays in both a laboratory tracking experiment and in a fairly
complex flight simulation, and Wickens, Goh, et al. (2003) solidly replicated the latter results. Parks and Coleman (1990) and Donmez, Boyle, and Lee (2006) observed that visual distractions were more detrimental for drivers than auditory ones when negotiating a curve. A meta-analysis of 29 studies compared auditory-visual (AV) with visual-visual (VV) task pairs in which the modality-varied task was a discrete interruption and the visual task was relatively continuous (Wickens, Prinet, et al., 2011). The results
indicated that auditory presentation of a discrete task offered a significant 15 percent advantage (collapsed
over both speed and accuracy) over visual presentation. This effect will be revisited in the following section
on interruption management.
The degree to which peripheral rather than, or in addition to, central factors are responsible for the
examples of better cross-modal time-sharing (AV better than AA or VV) remains uncertain. Wickens, Prinet, et
al. (2011) found that the 15 percent auditory advantage persisted even when the two sources of visual
information were adjacent, hence ruling out the possibility that the cross-modal AV advantage was entirely
due to the elimination of visual scanning. When visual scanning is minimized, however, cross-modal displays
do not always produce better time-sharing (Wickens & Liu, 1988; Horrey & Wickens, 2004; Wickens, Dixon,
& Seppelt, 2005; Wickens & Colcombe, 2007), particularly for the ongoing visual task (whose modality is not
varied). We address this issue of auditory preemption below in the context of resource allocation and task
interruptions (section 4).
Nevertheless, in most real-world settings, visual scanning does impose enough of a penalty for VV
interfaces that dual-task interference can be reduced by off-loading some information channels from the visual
to the auditory modality in environments such as the anesthesiology work station (Watson & Sanderson,
2004), or the airplane cockpit (Wickens, Goh, et al., 2003), the computer-based instructional work station

(Mayer, 2007; 2009); and on the other side, simultaneous auditory messages (AA) are sufficiently hard to
process that an advantage can usually be gained by displaying one of them visually (AV better than AA;
Rollins & Hendricks, 1980).
In addition to the auditory and visual channels, considerable recent interest has focused on the role of
tactile channels for presenting information: an electronic “tap” on a soldier’s shoulder to inform of an enemy on the right, or a buzz on the wrist of a pilot to inform of an important visual change on the display (Sarter, 2007). In this regard, it appears that the tactile modality acts as another perceptual resource channel in much the same manner as the auditory channel, conferring the same relative advantages to visual-tactile (VT) time sharing as the auditory modality does in visual-auditory (VA) pairing (Lu, Wickens, et al., 2011).
Before closing the section on modalities, it is important to consider some aspects of the redundant presentation of auditory and visual information, as when synthetic speech “echoes” a printed text. We discussed the advantages of multi-modal redundancy gain for instructions in Chapter 6. Consider, for example, in-vehicle navigation information (e.g., while driving) presented redundantly by voice and text. Here the results suggest that redundant display may provide a benefit to the accuracy of processing the navigational information, but not to the ongoing visual tracking task, as if the driver allows the visual text information to compete with visual driving, even though such interference is not necessary. (Attention could be focused on the auditory modality instead; Wickens, Prinet, et al., 2011.) There is some suggestion that training of the appropriate attention allocation strategies can allow redundant presentation to foster “the best of both (auditory and visual) worlds” (Wickens & Gosney, 2003).

3.4 Visual Channels


In addition to the distinction between auditory and visual modalities of processing, there is good evidence that two aspects of visual processing, referred to as focal and ambient vision (Chapter 4), constitute separate resources in the sense of (a) supporting efficient time-sharing, (b) being characterized by qualitatively different brain structures, and (c) being associated with qualitatively different types of information processing (Leibowitz, Post, et al., 1982; Previc, 1998, 2000; Summala, Nieminen, & Punto, 1996; Horrey, Wickens, & Consalus, 2006; Wickens & Horrey, 2009). Focal vision, which is nearly always foveal, is required for perceiving fine detail, pattern, and object recognition (e.g., reading text, identifying small objects). In contrast, ambient vision heavily (but not exclusively) involves peripheral vision, and is used for perceiving orientation and ego motion (see Chapter 4). When the mail carrier manages to successfully walk down the sidewalk while reading a letter address, she is exploiting the parallel processing capabilities of focal and ambient vision, just as we are when keeping the car moving forward in the center of the lane (ambient vision) while reading a road sign, briefly glancing down to check a navigational display, or recognizing a hazardous object in the middle of the road (focal vision; Horrey, Wickens, & Consalus, 2006). Aircraft designers have considered several ways of exploiting ambient vision to provide guidance and alerting information to pilots while their focal vision is heavily loaded by perceiving specific channels of displayed instrument information (Balkly, Dyre, et al., 2009; Nikolic & Sarter, 2001; see also Chapter 5).

3.5 A Computational Model


Collectively, the four dimensions of the multiple resource model can be represented in the “cube” form of Figure 10.5, in which we can see three modalities nested within the perceptual-cognitive stage and the focal-ambient distinction nested within the visual modality. Any task occupies one or more cells within the cube, and the more that two tasks occupy overlapping cells (overlapping levels on a dimension), the greater will be the interference between them due to resource competition.
A computational model to predict the joint influence of resource demand and resource conflict has been developed (Sarno & Wickens, 1995; Wickens, 2002a, 2005; Wickens, Bagnall, et al., 2011) and has been validated against both generic multitasking data (Sarno & Wickens, 1995) and multitask driving in a high-fidelity simulator (Horrey & Wickens, 2003). While it is beyond the scope of this chapter to fully explain the model within the context of Figures 10.1 and 10.5, it essentially computes, and then adds, the costs of the left (resource demand) and right (multiple resource conflict) components. The computation of the demand or mental workload component is straightforward (see Chapter 11). The conflict component in essence tallies the number of dimensions of the model whose levels overlap in the two tasks (e.g., verbal-verbal, or response-response). Other computational models of multitask performance have been proposed by Meyer and Kieras (1997) and Salvucci and Taatgen (2008, 2011), and more elaborate structures of multiple resources have been proposed by Boles, Bursk, et al. (2007).
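To make the flavor of such a demand-plus-conflict computation concrete, the following minimal Python sketch adds a demand component to a tally of overlapping resource dimensions. The dimension names, demand scale, and example task descriptions are illustrative assumptions, not the published model's parameters.

# Illustrative sketch of a multiple-resource interference estimate.
# Assumptions (not the published model): each task is described by a
# demand value (0-1) and by the level it occupies on each resource
# dimension; total interference = summed demand + count of overlapping
# dimensions.

def interference(task_a, task_b):
    """Estimate dual-task interference from demand and resource overlap."""
    dimensions = ["stage", "code", "modality", "visual_channel"]
    # Demand component: total resource demand of the two tasks.
    demand = task_a["demand"] + task_b["demand"]
    # Conflict component: one point per dimension on which the tasks
    # occupy the same level (e.g., both verbal, both visual).
    conflict = sum(
        1 for d in dimensions
        if task_a.get(d) is not None and task_a.get(d) == task_b.get(d)
    )
    return demand + conflict

# Example: driving (spatial/visual/ambient) time-shared with a phone
# conversation (verbal/auditory) versus with reading a text message.
driving = {"demand": 0.6, "stage": "perceptual", "code": "spatial",
           "modality": "visual", "visual_channel": "ambient"}
phoning = {"demand": 0.5, "stage": "perceptual", "code": "verbal",
           "modality": "auditory", "visual_channel": None}
texting = {"demand": 0.5, "stage": "perceptual", "code": "verbal",
           "modality": "visual", "visual_channel": "focal"}

print(interference(driving, phoning))  # lower: overlap only on stage
print(interference(driving, texting))  # higher: adds a modality overlap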

294
FIGURE 10.5 The three dimensional (cube) structure of the Multiple Resource model.

4. EXECUTIVE CONTROL, SWITCHING, AND RESOURCE MANAGEMENT
Effort (resource demand) and multiplicity together yield a dual-task performance decrement. How then is this decrement allocated? Which task is “primary” and protected? Which one is secondary and suffers the brunt of resource competition? This is the allocation component at the top of Figure 10.1. For example, in driving, we see that lane keeping and hazard avoidance normally take precedence over cell phone conversation or attention to an in-vehicle task (otherwise there would be far more distraction accidents than there are; see Section 10.5). But occasionally these priorities are reversed, as we understand when a cell phone conversation is a cause of an accident (Regan, Lee, & Young, 2009).
Indeed, the “poster child” for such a priority reversal occurred in 1987, when an aircraft was taxiing on the runway prior to takeoff from Detroit. The copilot was going through his primary task of following a checklist, to ensure that the plane was configured appropriately to generate enough lift to get off the ground. (In aviation, assuring that the aircraft has lift is always the top priority task; Schutte & Trujillo, 1996.) Midway through the checklist, the copilot was interrupted by air traffic control, assigning a different runway for takeoff. The copilot dealt with this request, but when attention was returned to the checklist, it returned after a critical item instructing the setting of flaps. With the flaps not then set for takeoff, the plane never generated sufficient lift, struggled through a feeble takeoff, and crashed soon after it left the ground, with the tragic loss of over 100 lives (NTSB, 1988). Here the pilots clearly failed to prioritize the task of keeping track of the checklist, with its flight-stability instructions, over the important (but less critical) task of ATC communications.
In fact, we can represent resource allocation in two different manners, graded or all-or-none, manifest in two different sorts of research paradigms, both of which appear to depend on the contribution of the executive control system (Baddeley, 1983, 1995; Banich, 2009). From the perspective of graded allocation, we can ask dual-task performers to dynamically adjust the allocation of resources between two tasks, as represented on the opposing PRFs shown in Figure 10.3 (Gopher, Brickner, & Navon, 1982; Tsang & Wickens, 1980), as was discussed above. Indeed, people can do this, and in fact this represents a way in which the PRF of a given task can be reconstructed by tracing its gain in performance in a dual-task setting, as resources are progressively allocated away from a concurrent task (of similar resource structure) toward its own performance. Neither task is abandoned entirely; one is simply given higher priority than the other, a manipulation that can be accomplished through instructions or monetary incentives. Furthermore, the more that tasks share common resources in the context of Figure 10.5, the greater is the tradeoff (Tsang, 2006). We saw in Chapter 7 that this technique was an effective way of training complex tasks (Gopher, 2007).
The second manifestation of resource allocation, to which we devote considerable attention below, is the all-or-none switching examined in paradigms of interruption management and task management, discussed in the following sections.
In this approach, we consider the multitasker as a decision maker who essentially chooses to abandon one task entirely to perform another, as for example when the driver goes fully head down to program a navigational device for several seconds, while totally abandoning the driving task, which must be supported by out-the-window visual attention. A general way of representing this process is the ongoing task interruption diagram shown in Figure 10.6 (Wickens & McCarley, 2008). An ongoing task (OT) is interrupted by an interrupting task (IT), and when the latter is finished (permanently or temporarily) attention returns to the OT. The OT is typically defined as a more continuous task, and often of higher priority. Ideally, attention is returned to the OT where it was “left off,” but of course sometimes we return to the OT earlier (as when we restart reading a full paragraph after being interrupted midway through), and sometimes we return later, as the tragedy of the Detroit crash illustrated.

FIGURE 10.6 Ongoing task (OT) interrupted by interrupting task (IT) at S1. Return to ongoing task at S2.

The general OT-IT-OT representation underlies a large amount of recent work on interruption management (McFarlane & Latorella, 2002; Trafton & Monk, 2007; Altmann & Trafton, 2002; Dismukes, 2010; Grundgeiger, Sanderson, & Meyer, 2010). Because it focuses its analysis on the switching of attention between an OT and an IT, such a representation is also closely related to research on discrete task switching that began nearly a century ago (Jersild, 1927; Rogers & Monsell, 1995; Monsell, 2003), although here neither task is designated as “ongoing” versus interrupting. Instead, the focus of this research is on the switch between two tasks of relatively similar status and, unlike a more continuous ongoing task, each task is completed before a switch.

4.1 Task Switching


The typical discrete attention switching paradigm is one in which the subject sees a series of digit pairs. On one trial they must judge whether their sum is greater or less than 10; on the next trial they must judge whether both digits are odd or even. The subject must then “switch” the mapping rules of stimuli to response on consecutive trials. Performance (response time) for this switching block can then be compared with performance on two “pure blocks,” one for each task, in which the same mapping rule applies consecutively for the series of trials. Across this and many different variants of the paradigm, three findings are prominent.
1. There is a clear “switch cost.” Response times are longer on mixed than on pure blocks. While this cost is often relatively small in the basic laboratory tasks involving switching decision rules, it can be substantial (over one second) in more complex simulations, such as the supervision and control of unmanned air vehicles (Wickens, Dixon, & Ambinder, 2006). Correspondingly, Rubinstein, Meyer, and Evans (2001) have also found that switch costs increase with task complexity. (A minimal sketch of how the switch cost is computed appears after this list.)
2. This cost is reduced when the stimuli for each task provide a clear indication of what operations should be performed (Allport, Styles, & Hsieh, 1994). For example, switching between the two digit tasks above yields longer switching times than switching between, say, an odd-even digit classification task and a vowel-consonant letter classification task, since in the latter case the stimulus itself (digit or letter) automatically dictates what the classification rule must be. There is no confusion, since letters are not “odd or even,” nor can digits be consonants or vowels.
3. The cost is also amplified if the interval between switches (consecutive stimulus delivery) is shortened, as if it takes some time to prepare or “load” the decision rules for a different oncoming task. But even with longer delays, such a cost does not vanish entirely (Meiran, 1996). It is as if the information for the next task must be physically present in order for those mental rules of stimulus-response mapping to be fully “loaded.”
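To illustrate the switch-cost measure referenced in finding 1, here is a minimal Python sketch. The response times and block labels are invented for illustration; only the arithmetic (mixed-block mean RT minus pure-block mean RT) reflects the comparison described above.

# Minimal sketch: computing a switch cost from hypothetical response times.
# Pure blocks repeat one mapping rule; the mixed block alternates rules.

from statistics import mean

# Hypothetical mean RTs (in milliseconds) per trial; real data would come
# from an experiment log.
pure_block_sum = [520, 540, 510, 530]      # "sum > 10?" task only
pure_block_parity = [560, 550, 570, 540]   # "both odd/even?" task only
mixed_block = [690, 720, 700, 710]         # rules alternate trial to trial

pure_rt = mean(pure_block_sum + pure_block_parity)
mixed_rt = mean(mixed_block)
switch_cost = mixed_rt - pure_rt

print(f"Pure-block RT:  {pure_rt:.0f} ms")
print(f"Mixed-block RT: {mixed_rt:.0f} ms")
print(f"Switch cost:    {switch_cost:.0f} ms")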
If indeed there is a cost for rapid switching between activities, then it would seem that repeated interruptions in any ongoing task will be detrimental to the latter, an observation that certainly conforms to our intuition as well as real-world observations (Loukopoulos, Dismukes, & Barshi, 2009; Dismukes, Berman, & Loukopoulos, 2010). Thus we turn now to consider the frequency and nature of those interruptions, and how such understanding has implications both for design and training in the multitask environment.

4.2 Interruption Management


The frequency of interruptions in the workplace has been well documented in specific workplaces such as those involving information technology (Gonzalez & Mark, 2004), human-computer interaction (McFarlane & Latorella, 2002), aviation (Dornheim, 2000; Loukopoulos, Dismukes, & Barshi, 2010), and health care (Wolf, Potter, et al., 2006; Grundgeiger et al., 2010; Koh, Park, et al., 2011). Within the context of Figure 10.6, we can point to factors that affect interruption management (in particular the smooth and fluent return to the OT after the IT) at each of the two switching points (which we call Switch 1, or S1, and Switch 2, or S2), as well as in terms of properties of the OT and IT themselves. We describe these as follows:

4.2.1 S1 PROPERTIES OF THE OT


1. Engagement. Different OTs can vary in their “engagement” (Horrey, Lesch, & Garabet, 2009; Matthews, Warm, et al., 2010; Montgomery & Sharafi, 2004), a property that makes it difficult for an IT to break in and trigger the switch. In the context of Chapter 3, engagement or “cognitive tunneling” has a severe inhibitory effect on change detection (Wickens & Alexander, 2009), as does any task with a high perceptual load (Lavie, 2010).
A challenge, however, is to define the precise properties of engagement (Montgomery & Sharafi, 2004). Certainly the inherent interest in a task is one feature. In a meta-analysis of cell phone interference with driving, Horrey and Wickens (2006) found that more interesting conversations (simulating cell phone activity) were more disruptive of driving than less interesting but higher workload synthetic cognitive tasks. As discussed in Chapter 7, in the context of cognitive load theory, Mayer et al. (2008) have noted how interesting (engaging) details in an instructional program can divert attention from mastery of the concepts in instruction. Wickens and Alexander (2009) have argued that more compelling immersive 3D flight path displays (see Chapter 5) are more engaging, and hence more disruptive of noticing a discrete hazard event (the IT), than standard 2D guidance displays. In complex systems, the task of fault management also is highly engaging (Moray & Rotenberg, 1989), both because of its high cognitive demands and because of the possible effects of stress in amplifying such cognitive tunneling (see Chapter 11). Dehais, Causse, and Tremblay (2011) have developed a repertoire of successful techniques to “break through” the cognitive attentional tunnel in fault management.
2. Modality. An OT that involves auditory working memory (e.g., listening to a complex series of instructions) will also serve to retain attention on the auditory task and resist an interruption (Latorella, 1996; Wickens & Colcombe, 2007), for the simple reason that the performer may wish to (or need to) rehearse the fragile auditory material lest it be lost from working memory (see Chapter 7). Such a requirement is not imposed by a more permanent visual OT like reading (although the visual OT may require “placeholders,” as we see below). Such a bias may explain why auditory communications tasks are particularly disruptive of some higher priority visual flight tasks (Damos, 1997).
3. Dynamics. The performer of an OT involving control of a vehicle or other dynamic system may (and indeed should) resist an interruption if the system is in a temporarily unstable state. For example, the driver may postpone a look down to a display if the car is veering toward the edge of a lane, until a time when the car is both lane-centered and heading parallel with the lane. Such dynamic instability is not an issue, for example, in a checklist-following or reading OT.
4. Priority of the OT. In general it can be argued that people should be more resistant to an IT (delay S1
longer) if the OT is of higher priority. While this is often found to be the case (Iani & Wickens, 2007), there
are of course occasional violations (e.g., distraction-based automobile accidents), and systematic observations
from the flight deck (Damos, 1997) indicate that pilots often let lower priority communications tasks interrupt
those of higher priority involving navigation. This reversal from optimality may be related, as above, to
modality (auditory communications versus visual navigation). Clearly, the Detroit crash involved a departure
from priority optimization.
5. Subgoal completion. Altmann and Trafton (2002) have proposed a theory of interruption management based on a memory for the status of OT goals that decays while attention is directed to the IT. In particular, OT subgoals that are interrupted before they are achieved will be quite vulnerable to failures upon return at S2. Hence, interruptions will be less disruptive if they occur at a time when a particular subgoal of the OT has just been completed (Monk, Boehm-Davis, & Trafton, 2004; Trafton & Monk, 2007), and people may delay S1 until subgoal completion. For example, reading will be less disrupted if an interruption occurs after a paragraph has been read than in mid-paragraph. While interruptions are less damaging when they occur after subgoal completion, in some environments beyond the laboratory workers do not seem to defer interruptions until subgoals are completed. Studying nurses in the ICU, Grundgeiger et al. (2010) found that this deference was observed only when text tasks were interrupted, but not manual tasks. This leads to an important distinction between what people should do and what they actually do, discussed below.
One important implication of the subgoal completion finding is that intelligent human-computer interaction can “decide” to impose an interruption (e.g., an alert of a waiting e-mail) only when the automation infers that the worker is at a boundary between subgoals (Bailey & Konstan, 2006; Dorneich, Ververs, et al., 2012). As an example, when the information worker is creating text, an interruption could be imposed only after the new-paragraph key is hit (a minimal sketch of such deferral logic appears at the end of this subsection). This is a form of adaptive automation that will be discussed in Chapter 12.
6. Delay in S1. If an interruption occurs and people delay before they switch full attention to it, this should provide the opportunity for two adaptive strategies: (1) rehearsing the place where they left off (thus committing it to a more enduring memory for goals upon return) and (2) physically placing some sort of “bookmark” at the leaving-off place. Consider for a moment how the disaster of the Detroit plane crash could have been avoided had an electronic system triggered a bright flashing highlight surrounding the last item completed before the interruption. (Electronic checklists now essentially do this; Bresley, 1995; Wickens, 2002b.)
Dodhia and Dismukes (2003) and Trafton, Altmann, et al. (2003) have found that such delays are beneficial to overall OT performance, enabling a more timely and fluent return at S2, and Trafton, Altmann, and Brock (2005) noted the benefit of a salient flashing placeholder. McDaniel and colleagues (2004) found that presenting a blue dot on the computer screen during an interruption is an effective cue for supporting the return to the OT. However, they argue that in order to be effective the cue must be used relatively infrequently to make it more distinctive (salient). Grundgeiger et al. (2010) observed that nurses often spontaneously create their own placeholders, in that when an OT is represented by some hand-held physical artifact, task return at S2 is more rapid. St. John and Smallman (2008) describe other display technology that can improve the fluency of interruption management.
It should be noted that the focus on activities at S1 to ease the transition at S2 bears close resemblance to the role of prospective memory (Dismukes & Nowinski, 2007; McDaniel & Einstein, 2007) in remembering to do something in the future, as discussed in Chapter 7. Here the memory is specifically one for returning to the OT, and the work of Loukopoulos, Dismukes, and Barshi (2009; Dismukes, 2010) has nicely integrated these two forms of cognition, prospective memory and attention switching, in the study of task/interruption management. Finally we note here, as suggested above, that many OT activities and properties at (or just prior to) S1 influence the fluency of return at S2. But for now we interrupt this discussion of the OT to focus on properties of the IT.
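Before turning to the IT, the following minimal Python sketch returns briefly to the subgoal completion point above (item 5), illustrating the kind of deferral logic an adaptive notification system might use: low-priority interruptions are held until a subgoal boundary is detected. The event names, priority threshold, and boundary cue (a new-paragraph keystroke) are illustrative assumptions, not a description of any fielded system.

# Minimal sketch: defer low-priority interruptions until a subgoal boundary.
# Assumptions: the ongoing task emits keystroke events, a new-paragraph
# keystroke ("\n\n") marks a subgoal boundary, and interruptions carry a
# priority from 1 (low) to 5 (critical).

class InterruptionScheduler:
    CRITICAL = 5  # critical alerts are delivered immediately

    def __init__(self):
        self.pending = []          # queued (priority, message) tuples
        self.at_boundary = True    # assume we start at a subgoal boundary

    def on_keystroke(self, key):
        """Track whether the worker has just completed a subgoal."""
        self.at_boundary = (key == "\n\n")
        if self.at_boundary:
            self.flush()

    def on_interruption(self, priority, message):
        """Deliver immediately only if critical or at a boundary."""
        if priority >= self.CRITICAL or self.at_boundary:
            self.deliver(message)
        else:
            self.pending.append((priority, message))

    def flush(self):
        """Release queued interruptions, highest priority first."""
        for _, message in sorted(self.pending, reverse=True):
            self.deliver(message)
        self.pending.clear()

    def deliver(self, message):
        print(f"ALERT: {message}")

# Usage: an e-mail alert arriving mid-paragraph is held until the
# new-paragraph key is hit; a critical alert breaks through immediately.
scheduler = InterruptionScheduler()
scheduler.on_keystroke("a")                         # typing; not at boundary
scheduler.on_interruption(2, "New e-mail waiting")  # queued
scheduler.on_interruption(5, "Fire alarm")          # delivered at once
scheduler.on_keystroke("\n\n")                      # boundary: e-mail delivered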

4.2.2 SWITCH 1 PROPERTIES OF THE IT: SALIENCE AND MODALITY

Probably the most important IT factor at S1 is IT salience (discussed in Chapter 3 in the context of attention capture and change blindness). If the IT salience is high, it will rapidly and reliably cause the switch away (Trafton, Altmann, et al., 2003). If it is quite low, it may not trigger a switch at all and cognitive tunneling is observed. Here it is found that both tactile and auditory interruptions are more salient, leading to 15 percent more rapid attention switches than visual interruptions (Lu, Wickens, et al., 2011), particularly if those visual interruptions are farther in the periphery relative to the central point of visual interest for the OT (Wickens, Dixon, & Seppelt, 2005). This phenomenon is sometimes referred to as auditory preemption (Wickens & Colcombe, 2007), leading to an inherent attention allocation bias toward the initial delivery of auditory tasks (Ho, Nikolic, et al., 2004).
One cause of such preemption is related to the cognitive demands of rehearsing and processing auditory linguistic information, as discussed above (Latorella, 1996; Damos, 1997), and this can explain why synthetic voice messages are more disruptive of ongoing visual flight tasks than are equivalent visual text messages (Helleberg & Wickens, 2003; Wickens & Colcombe, 2007). However, because preemption also applies to non-linguistic and tactile interruptions, there is another aspect of preemption that reflects a mechanism different from (although consistent with) the conscious decision to “stay with” an auditory task containing longer strings of verbal material that must be rehearsed, as described above. We may call this second mechanism sensory preemption.

In this regard, we note that auditory preemption may offset the benefits of separate resources for an OT-IT combination, the AV benefit discussed in Section 10.3 above. As a consequence, the preemption produced by an auditory IT may disrupt a visual ongoing task, even as the separate resources used by that auditory IT facilitate the OT in an offsetting fashion. The IT, on the other hand, will clearly benefit from auditory over visual presentation because of the benefits of both preemption and multiple resources. This explanation can account for findings that the auditory (versus visual) delivery of IT information has little impact on the performance of a visual OT (Wickens, Prinet, et al., 2011).
An important concept that has emerged in considering IT properties at S1 is that of “pre-attentive alerting,” proposed by Woods (1995) and evaluated by Ho, Nikolic, et al. (2004). This is a concept by which the IT can register its own presence in a non-salient, non-disruptive form, allowing the performer to be aware of that presence, but not requiring a full attention switch (to establish its content) that would force abandonment of the OT.
Just as high salience makes an S1 switch more rapid, so low salience makes the switch later (or less likely to occur at all; see change blindness in Chapter 3). Indeed, an important concept is that of a zero-salience IT, one which depends entirely on prospective memory to be initiated. Such a situation imposes a demand on “knowledge in the head” rather than “knowledge in the world” (Norman, 1988) and describes the status of a task like “remember to check the altitude” imposed on a pilot who is busy with other tasks. Indeed, the fact that these zero-salience interrupting tasks fail to trigger attention switches can account for the high frequency of “altitude busts” in aviation as well as the prevalence of “controlled flight into terrain” (CFIT) accidents, in which a pilot flies a perfectly airworthy aircraft into the ground (Wiener, 1977). Such an accident must by default result from the failure to remember to perform the “altitude check” task. Although in modern aircraft this task will be triggered by the alert of a ground proximity warning system, such alerts might occur too late to be fully useful and are themselves subject to the problem of alarm false alarms (see Chapter 2). In the context of Chapter 7, the zero-salience IT represents a danger to maintaining Level 1 SA.

4.2.3 S2: QUALITY OF RETURN TO OT

Did you remember what you were reading about before we had this little excursion to discuss the IT? (Hint: It was about the OT.) If so, you can probably pick up the flow of this text fluently and rapidly. Altmann and Trafton (2002) have paid particular attention to the “resumption lag,” the time required to return to the OT following completion of the IT. In this regard, the resumption lag is a close cousin of the switching costs discussed above. But it is possible to speak more broadly of the “fluency of return,” including not only the resumption lag (a time measure), but also the avoidance of unwanted errors in the first few post-S2 seconds and of wasted time, as the OT may be resumed earlier than where it was left (at its worst, it may require “starting from scratch”) or, as in the case of the tragic Detroit crash, at a later place.
In this regard, several of the factors discussed regarding the OT at S1 have their effects realized downstream at S2. A delay at S1, if exploited by rehearsal or placeholder-placing strategies, will increase S2 fluency (McDaniel et al., 2004). A Switch 1 made after an OT subgoal is completed will improve S2 fluency. In contrast, two particular properties of the IT can degrade the S2 return: being long and/or being difficult (Monk, Trafton, & Boehm-Davis, 2008; Grundgeiger et al., 2010). In the context of memory-for-goals theory (Altmann & Trafton, 2002; Trafton & Monk, 2007), a long IT will prolong the period during which OT goals may decay, and a difficult IT will simply prevent goal rehearsal through dual-task interference, hence disrupting the fluency of resumption. Kujala and Saariluoma (2011) note how a more disorganized menu structure can disrupt the fluency of scan return to the menu on a driver’s dashboard, relative to an organized list structure. In this case we speak of the menu task as the OT.
In a corresponding way, any property of the IT that may obscure visibility of the OT workspace will also disrupt return fluency, whether this property involves a blanked computer screen (Ratwani & Trafton, 2010) or simply looking (or walking) farther away from the OT workspace (Grundgeiger et al., 2010). Indeed, Ratwani and Trafton (2010) have shown the impact of visual-visual resource competition in degrading interruption management. When the IT is highly visual, even if it does not actually obscure the view of the OT working surface (here a display on which equipment orders were configured), it will still disrupt return more than when the same IT information is delivered auditorily.
Finally, we note one additional property of the OT-IT relationship that affects interruption management, and this is the similarity of material between the two (Gillie & Broadbent, 1989; Cellier & Eyrolle, 1992). The greater the similarity, the greater is the difficulty of handling interruptions, as if similar IT material intrudes on, or is confused with, OT material, delaying the start-up of the OT at S2. We address this issue further in Section 6.

4.3 From Interruption Management to Task Management


Just as the study of attention switching involves repeated cycles of OT-IT, so the more general topics of task management and workload management involve stringing together many different heterogeneous tasks, with every task essentially interrupting others or “clamoring for attention” like a room full of unruly kindergartners.
How do people perform in managing these multiple heterogeneous tasks, whether this is examined on a smaller time scale, like a surgical nurse in the operating room, or a larger one, like a student with five classes and papers due during finals week? People are not always effective at such task management (Puffer, 1989), even in the highly skilled environment of the airplane cockpit (Funk, 1991; Chao, Madhavan, & Funk, 2003; Loukopoulos et al., 2009). To answer this question, we can turn to a broader perspective that borrows from queuing theory and operations engineering (Walden & Rouse, 1978; Moray, Dessouky, et al., 1991) in specifying the optimal strategies that should be deployed in order to maximize collective performance on all tasks. What strategies should influence the human in deciding “What task should I perform now?” after having just completed one task in a multitask environment? Such strategies would seemingly include the following (Freed, 2000):
• Urgency. More urgent tasks should dominate selection, where urgency can be formally defined in terms of the difference between the time required to finish the task and its deadline. The shorter this interval, the greater the urgency.
• Importance. Much as we have seen task importance driving resource allocation, so importance should also drive task switching (Iani & Wickens, 2007). Importance is distinct from urgency since the former is value based whereas the latter is time based. Importance parallels the V parameter in the SEEV model of visual attention as discussed in Chapter 3.
• Task duration. While duration may have an effect on urgency (with a fixed deadline, longer tasks have greater urgency), it also plays into task selection in a different manner. Because of a reluctance to switch attention (the switching costs discussed above), once a long-duration task is undertaken it will be more likely to dominate attention for a longer period of time, at the expense of shorter tasks. In essence, in order to anticipate these demands one says, “let’s get the little ones out of the way before we tackle the biggy.” Naturally, larger switching costs will lead to a greater “task inertia,” the tendency to stay with a task for a longer duration, whatever its length, importance, or urgency. (A minimal sketch combining urgency, importance, and duration appears after this list.)
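To make these scheduling heuristics concrete, here is a minimal Python sketch that ranks a set of pending tasks by a combination of urgency (slack between deadline and time required), importance, and a penalty for long tasks. The weighting scheme and the example tasks are illustrative assumptions, not a validated model of human or optimal scheduling.

# Illustrative sketch: ranking candidate tasks by urgency, importance,
# and duration. Higher score = select sooner. Weights are arbitrary
# assumptions for illustration.

def priority_score(task, now, w_urgency=1.0, w_importance=1.0, w_duration=0.1):
    """Combine urgency, importance, and a duration penalty into one score."""
    # Urgency: the smaller the slack (deadline - now - time still required),
    # the more urgent the task.
    slack = task["deadline"] - now - task["duration"]
    urgency = 1.0 / max(slack, 0.1)          # avoid division by zero
    return (w_urgency * urgency
            + w_importance * task["importance"]
            - w_duration * task["duration"])

tasks = [
    {"name": "file incident report", "deadline": 60, "duration": 20, "importance": 2},
    {"name": "check patient monitor", "deadline": 10, "duration": 2, "importance": 5},
    {"name": "restock supply cart", "deadline": 120, "duration": 30, "importance": 1},
]

now = 0
for task in sorted(tasks, key=lambda t: priority_score(t, now), reverse=True):
    print(f"{priority_score(task, now):6.2f}  {task['name']}")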
While these three factors influence the optimality of task switching, another factor, preview, can help the human achieve optimal task selection (Tulga & Sheridan, 1980). Anticipation of task duration is often uncertain (and duration is often underestimated; see the planning bias in Chapter 8); hence better planning and task selection can be done to the extent that a planning horizon is visible, providing estimates of the duration of tasks that will be arriving as well as their approximate time of arrival. The utility of displays to support planning was discussed in Chapter 7, Section 9.
These factors, articulated by Freed (2000), dictate what people should do. However, there is considerably less data on what people actually do in task selection. In contrast to computer optimization models, people do not tend to maintain elaborate and highly optimal planning strategies for task management (Liao & Moray, 1993; Laudeman & Palmer, 1995; Raby & Wickens, 1994), such as carefully calculating the optimal sequence in which to perform tasks of differing priority. This cognitive simplification apparently results because applying such strategies is itself a source of high cognitive workload or resource demand (Tulga & Sheridan, 1980). Hence, their application would be self-defeating, competing with task performance at the very time they might be most necessary for optimization of that performance.
In conclusion, the search for valid models of all the collective influences on task scheduling and management is daunting. While some of those influences might clearly be linked to scanning models such as SEEV, we can readily understand that the task of just looking is a lot simpler (and easier to model) than looking coupled with thinking and doing.

5. DISTRACTED DRIVING
A prototypical real-world OT-IT scenario is driving a car with a series of interruptions, ranging from cell phone calls, to passenger conversations, to CD insertions, to programming a navigational device, to unwrapping the veggie burger, to even the internal disruptions of daydreaming (He, Becic, et al., 2011; Smilek et al., 2010; Lavie, 2010). This situation is colloquially the one defining the “distracted driver” (Regan, Lee, & Young, 2009; Hurts, Angell, & Perez, 2011; Lee, 2005; Lee & Angell, 2011). In these circumstances we can clearly define two ongoing primary tasks of equal high importance. These are: (1) lane keeping and headway monitoring (tracking tasks discussed in Chapter 5, relying heavily on ambient vision and a perceptual-motor loop) and (2) hazard monitoring, a purely perceptual task discussed in Chapter 3 and relying far more on focal vision and change detection (Horrey, Wickens, & Consalus, 2006). Periodic interruptions or distractions will then be imposed on these two ongoing tasks, considered “primary” not only because of their continuous nature, but also because of their clear dominance for safety.

5.1 Mechanisms of Interference


There is no doubt that distractions are a huge factor in highway safety, whether from traditional sources (kids fighting in the back seat, eating, unfolding a map, mind wandering, or searching for a road sign), or from more emerging technology (cell phone use, navigational system operation, infotainment, or texting). Indeed, the percentage of crashes due to distraction could be said to be around 20 percent, even as different authors report wide differences in the range of estimates (e.g., Gordon, 2009: 2 percent to 30 percent; Lee, Young, & Regan, 2009: 11 percent to 23 percent; Hurts, Angell, & Perez, 2011: 10 percent to 25 percent). The reason for this wide variability is simply that there is no accurate recording of when a distraction caused an accident, since often such an attribution is solely determined by the driver’s willingness to self-report this on a police record (Dingus, Hanowski, & Klauer, 2011). In a relatively unique naturalistic (non-simulator) study in which actual accident and incident rates could be reliably obtained, Dingus et al. (2006) estimated that 78 percent of crashes and 68 percent of near crashes involved inattention as a contributing factor. (However, this statistic also includes numerous contributions of fatigue to inattention, which is not classified as a dual-task distraction.) The sources of interference from distraction in driving reflect the contributions of all three mechanisms of multitasking encompassed in multiple resource theory, as discussed above.
Effort and resource demand. Mattes and Haller (2009) nicely illustrated the role of effort, as they found that increasing the demands of a fully cognitive task (avoiding any visual interference) increased error in a lane change task by 27 percent. Salvucci and Beltowska (2008) observed that increasing the working memory demand of a concurrent task substantially retarded the time to brake for a hazard.
Multiple resource structure. Ample evidence exists that visual tasks interfere more with driving than do auditory tasks (Dingus, Hanowski, & Klauer, 2011; Horrey & Wickens, 2004; Collet, Guillot, & Petit, 2010) and that manual interfaces interfere more than speech interfaces (Shutko & Tijerina, 2011; Dingus, Hanowski, & Klauer, 2011; Tsimhoni, Smith, & Green, 2004), particularly when data entry requires the processing of visual feedback. Regan, Young, Lee, and Gordon (2009) have carefully analyzed different distracting tasks in terms of their multiple resource components.
Resource allocation. It is true by default that when a distracting task interferes with safe driving, the driver has allowed the less important IT to preempt the more important OT. As described above, Horrey and Wickens (2006) found that more “engaging” concurrent or interrupting tasks interfered more with driving than less engaging (but often more difficult) cognitive tasks, suggesting that this engagement drew resources away from safe driving. Horrey, Lesch, and Garabet (2009) have examined engagement and driving in more detail, finding that while engaging tasks disrupt driving as much as non-engaging ones, drivers feel that they are less disrupted by the engaging tasks, hence reflecting overconfidence and a failure of metacognition. The link between resource allocation and driving tasks can be made explicit when visual scanning is used to infer the direction of attention to tasks. Wickens and Horrey (2009) have developed a model of hazard risk exposure based on the application of the SEEV visual scanning model (Chapter 3) to the dual-tasking of in-vehicle tasks and roadway monitoring.
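As a rough illustration of how a SEEV-style expected-value account can be turned into a scanning prediction, the following minimal Python sketch computes the predicted share of glances to each information channel from salience, effort, expectancy, and value terms. The coefficient values and channel names are invented for illustration and are not parameters from the published model.

# Illustrative SEEV-style sketch: predicted attention share to each channel
# is proportional to salience - effort + expectancy + value (values below
# are arbitrary illustrative numbers, not published parameters).

def attention_shares(channels):
    """Return each channel's predicted share of glances (proportions)."""
    scores = {
        name: max(c["salience"] - c["effort"] + c["expectancy"] + c["value"], 0.0)
        for name, c in channels.items()
    }
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}

channels = {
    "roadway ahead": {"salience": 2, "effort": 0, "expectancy": 3, "value": 3},
    "mirrors": {"salience": 1, "effort": 1, "expectancy": 1, "value": 2},
    "in-vehicle display": {"salience": 1, "effort": 2, "expectancy": 1, "value": 1},
}

for name, share in attention_shares(channels).items():
    print(f"{name:20s} {share:.0%}")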

5.2 Cell Phone Interference


It is clear that all three factors of MRT (demand, structure, and allocation) play important roles in the specific question of mobile or cell phone distraction from driving, an issue we now address in greater depth through three distinct questions.
1. Do cell phones interfere with driving? There is ample evidence that they do (Collet, Guillot, & Petit, 2010), and this led, in December 2011, the National Transportation Safety Board to recommend a ban on all cell phone (and texting) use while driving. For example, Drews and Strayer (2007) have estimated that this interference is equivalent to that of driving under the influence of a 0.08 percent blood alcohol level and can lead to a fourfold increase in the risk of fatalities, based in part on the interpretation of epidemiological data (Redelmeier & Tibshirani, 1997; Violanti & Marshall, 1996). As noted, the Horrey and Wickens (2006) meta-analysis revealed a modest 1/6-second slowing in driver response time attributable to interference from either actual or simulated cell phone engagement, across experiments where dual-task decrements could be precisely measured. A subsequent meta-analysis (Caird, Willness, et al., 2008) revealed a larger estimate of approximately 1/4 second. Flannagan and Sayer (2010) estimated that approximately 3 percent of highway accidents were directly attributable to cell phone distraction.
In response to the objection that “yes, but other forms of distraction also interfere,” it is of course unclear whether the cell phone conversation may interfere more or less than tasks like eating or being distracted by an infant; but this argument is somewhat irrelevant to safety issues, since cell phone use does cause increased accident exposure and legislation can effectively reduce this exposure.
2. What are the mechanisms of interference? A careful task analysis of cell phone tasks (Regan, Young, Lee, & Gordon, 2009) can reveal the precise nature of the interfering effects; and there is certainly documented evidence that the interference involves more than simply listening, as the interference effects are considerably greater than those of listening to the radio. Hence, the cognitive requirements (resource demands) of engaging in conversation add load, just as does the possible competition for response resources between speaking and the various aspects of responding in driving (braking, steering).
Furthermore, as noted above, cell phone conversations can be interesting, engaging, and cognitively loading, as working memory is often demanded to follow the gist of a conversation and prepare appropriate responses. Hence, it is of no surprise that cell phone conversations impose particular interference with visual perception, inhibiting change detection (McCarley, Vais, et al., 2004; Strayer & Drews, 2007; Drews & Strayer, 2008), as well as narrowing the visual scan pattern (Recarte & Nunes, 2000, using a task that simulated the cognitive demands of cell phone use).
In this regard, an important issue is understanding why cell phone conversations interfere more with driving than do conversations with a passenger (Dingus, Hanowski, & Klauer, 2011; Drews, Pasupathi, & Strayer, 2008; Gugerty, Rakauskas, & Brooks, 2004). One explanation here appears to be what is described as the lack of “common ground” with the participant in a remote cell phone conversation, versus one co-located with the driver. The passenger can (and does) modulate his or her rate of conversation based on the perceived demands of the roadway, slowing or halting conversation altogether if, for example, the driver is making a left turn on a busy highway. They share situation awareness. The cell phone partner, on the other hand, has no knowledge of such driving conditions. As such, the common ground interpretation may be seen as a resource allocation effect. If there is no distraction during such busy periods (the passenger conversation is halted), then the driver’s full resources can be allocated to driving.
3. Are hands-free cell phones safer than hand-held phones? There is strong evidence that under certain conditions, hand-held phones do produce more interference (Collet, Guillot, & Petit, 2010; Dingus, Hanowski, & Klauer, 2011; Goodman, Tijerina, et al., 1999). Although most of the time simply holding a phone produces no interference (Drews & Strayer, 2009), common sense informs us that holding a phone with one hand can interfere with negotiating a sharp turn. Furthermore, the requirement to “dial” the phone will impose heavier motor-motor interference and may also require vision, to assure correct finger positioning, creating greater interference than with voice dialing (Hurts, Angell, & Perez, 2011; Shutko & Tijerina, 2011). Thus, the distinction between hands-free and hand-held cell phone usage is important, but at the same time, for all of the reasons described above, hands-free phones do not entirely remove the interference with driving (but see Dingus, Hanowski, & Klauer, 2011). Removal of the peripheral structural aspects of visual-visual and motor-motor interference does not, therefore, allow perfect parallel processing.
Various forms of remedies are of course available (Victor, 2011). The most obvious is legislation, and many states and countries have outlawed hand-held and, in some cases, hands-free phones while driving. Other solutions involve “lockouts” that may prevent incoming calls from being taken, or sometimes placed, during particular phases (e.g., when the car is in motion, or when other dangerous conditions are sensed by smart automation; Donmez, Boyle, Lee, & McGehee, 2006). Alerts and attention guidance to the outside world (see Chapter 3) can be effective if some form of intelligent automation can infer distraction or excessive head-down time (Victor, 2011), a form of adaptive automation discussed in Chapter 12. Careful design can also integrate some of the distracting systems into the natural flow of driving, such as via steering-wheel-mounted controls or integrated displays (Shutko & Tijerina, 2011), rather than having the technology functions belong to a separate physical device. Finally, there is of course the possibility that effective task and interruption management can be trained (Horrey, Lesch, Kramer, & Melton, 2009; Regan, Lee, & Young, 2009b), as outlined earlier in the chapter, although there is only modest and somewhat ambivalent evidence that cell phone interference decreases with experience (Collet, Guillot, & Petit, 2010; Young, Regan, & Lee, 2009).
In closing, the evolution of technology has led to the emergence of texting while driving (Hosking, Young, & Regan, 2009; Drews, Yazdani, et al., 2009), and even while cycling (de Waard & Schepers, 2010). Here the evidence is compelling that the competition for visual and motor resources is so high as to make the interference drastically greater than that of cell phone use (Dingus, Hanowski, & Klauer, 2011).

6 TASK SIMILARITY, CONFUSION, AND CROSSTALK


In Section 10.3, we discussed the strong impact of similarity of resource demand in increasing multitask interference. Here we describe how increasing the similarity of the processing routines, as well as the similarity of material between two tasks, may reduce time-sharing efficiency, a result of confusion. For example, Hirst and Kalmar (1987) found that time-sharing between a spelling and a mental arithmetic task is easier than time-sharing between two spelling or two mental arithmetic tasks. Hirst (1986) showed how distinctive acoustic features of two dichotic messages, by avoiding confusion, can improve the person’s ability to deal with each separately. Many of these confusion effects are closely related to interference effects in memory, discussed in Chapter 7. Indeed, Venturino (1991) has shown similar effects when tasks are performed in sequence, so that the memory trace of one interferes with the processing of the other. Such similarity-based confusion underlies challenges in interruption management, as noted above in Section 10.4 (Gillie & Broadbent, 1989).
Although these findings are analogous in one sense to the concepts underlying multiple-resource theory (greater similarity producing greater interference), it is probably not appropriate to label these elements as “resources” in the same sense as stages, codes, modalities, and visual channels in the context of Figure 10.5 (Wickens, 2007b; Vidulich & Tsang, 2007). This is because such items as a spelling routine or distinctive acoustic features hardly share the gross anatomically based dichotomous characteristics of the dimensions of the multiple-resource model (Wickens, 1984, 2002a, 2005). Instead, it appears that interference of this sort is more likely based on confusion, or a mechanism that Navon (1984; Navon & Miller, 1987) has labeled outcome conflict. Responses (or processes) relevant for one task are activated by stimuli or cognitive activity for a different task, producing confusion or crosstalk between the two (Fracker & Wickens, 1989). This is, of course, a close cousin of the response conflict of the Stroop task discussed in Chapter 3; there it describes a failure of focused attention, and here a failure of divided attention. It is also closely related to the slip or capture error discussed in Chapter 9.
Confusion and crosstalk often occur in dual manual tasks as well (Fracker & Wickens, 1989; Duncan, 1979; Navon & Miller, 1987). Consider the challenges imposed by rubbing your head while patting your stomach or, in music, playing a 4/4 rhythm with one hand and a 3/4 rhythm with the other (Klapp, 1979).
Although confusion due to similarity certainly contributes to task interference in some circumstances, it is not always present nor always an important source of task interference (Pashler, 1998; Fracker & Wickens, 1989). Its greatest impact probably occurs when an operator must deal with two verbal tasks requiring concurrent working memory for one and active processing (comprehension, rehearsal, or speech) for the other, or with two manual tasks with spatially incompatible motions. In the former case, as discussed in Chapter 7, similarity-based confusions in working memory may play an important role.

7 INDIVIDUAL DIFFERENCES IN TIME SHARING


How do people differ in their time-sharing ability? Building upon what we have learned in this chapter and in Chapter 3, we address three major forms of differences: between experts and novices, between younger and older adults, and, in Chapter 11, across what may appear to be genetic differences in inherent ability. The first and third of these are directly relevant to how to train or select people for work domains with high multitasking components. The second may both identify particular areas of vulnerability and aid the design of environments (Fisk & Rogers, 2007) in order to buffer attentional vulnerabilities in the aging population; it may also identify attention skills that decline more rapidly with age, which specialized training can offset.

7.1 Expertise and Attention


There is no doubt that experts are more proficient than novices in many complex tasks, including those that involve considerable time-sharing. A straightforward explanation is that experts are more automated in performance of the component tasks than novices (Chapter 7). Thus the PRFs for the skills on which they demonstrate expertise look more like those of Figure 10.2b than of 10.2a, with a greater data-limited region. Such differences have long been offered as explanations for experts’ multitask proficiency (Bahrick & Shelley, 1958; Bahrick, Noble, & Fitts, 1954; Damos, 1978; Fisk & Schneider, 1982), and there is little doubt that this is a valid explanation. Examining the shape of curves A and B in Figure 10.2, it is important to realize that such differences may not show up in single-task performance, when full resources are devoted to the task, but will readily be expressed in a multitasking environment.
But is single-task automaticity the only source of difference? If so, then expertise in complex multitask environments like driving or flying would be developed most efficiently simply by training all parts in isolation, a type of training called fractionation part-task training (Wightman & Lintern, 1985; see Chapter 7). This is because, when training in parts, full attention can be allocated to learning one part (task) at a time. But ample data reviewed in Chapter 7 suggest that this is not altogether true, and whole-task training is usually more efficient (Wickens, Hutchinson, et al., 2012). Given that this is the case, we can turn our inquiry to identifying the form of this time-sharing skill that differentiates levels of expertise, an emergent feature that is not part of any single task in a multitask ensemble, but of the group together. Below we offer some candidates for such a skill that are supported by research.
• Visual scanning. Ample data indicate that experts scan in a multitask environment differently from novices (Fisher & Pollatsek, 2007; Pradhan et al., 2006; Pradhan, Fisher, & Pollatsek, 2009; Bellenkes, Wickens, & Kramer, 1997; Shinar, 2008; Mourant & Rockwell, 1970; Koh, Park, et al., 2011). As with the expected-value model of scanning (SEEV) discussed in Chapter 3, we can assume that experts know better when to look at each task-relevant source of information to pick up important information. For example, more skilled drivers sample farther down the highway to support lane keeping (Mourant & Rockwell, 1970), and have shorter downward scans away from the road (Pradhan, Divekar, et al., 2011); and more skilled pilots sample that information which is more predictive (Bellenkes, Wickens, & Kramer, 1997). We can say that experts have a better mental model of the information within the multitask ensemble.
• Interruption management. Koh, Park, et al. (2011) have found that in the multitask environment of the operating room, expert nurses are more resistant than novices to interrupting the critical foreign-object count task in the face of lower priority interruptions. Given the wealth of strategies that can govern interruption management, as discussed in Section 10.4, it is not surprising that experience and training assist in learning to deploy these strategies more fluently and optimally (Cades, Boehm-Davis, et al., 2011; Dismukes, 2010; Hess & Detweiler, 1994).
• Attention flexibility. Both of the above are related to task management, and so it is reasonable to hypothesize that experts are better at flexibly allocating resources to more important tasks at the times that those resources are needed (Gopher, 1993). This is reflected in the research on how to train time-sharing expertise that we now describe.

7.2 Training Expertise in Time-Sharing Skills


Just because experts differ from novices in a certain aspect of performance (here multitasking) does not necessarily mean that there are “shortcuts” to developing expertise. But there is evidence that some attentional skills can be directly trained. Here again we provide a bulleted list.
• Shapiro and Raymond (1989) and Pradhan and colleagues (Pradhan, Fisher, & Pollatsek, 2009; Pradhan, Divekar, et al., 2011) have both demonstrated the benefits of scanning training, capturing the patterns of experts’ looking behavior within the context of the multitasking skills of playing a video game and safe driving, respectively, and teaching these to novices. With various methods to induce novices to “look more as experts do,” their programs produce successful positive transfer.
• Dismukes and Nowinski (2007) have advocated explicit training in some of the techniques of interruption management discussed above, as applied specifically to flight training, and Cades, Boehm-Davis, et al. (2008) have noted the success of repeated exposure to interruptions in supporting better interruption management skills.
• Echoing our categorization of expertise differences, the two training programs described above both involve training in attention management. Research has established that attention priority flexibility can also be trained, in a way that not only produces better multitasking on the task pair used in the training (Gopher, Brickner, & Navon, 1982), but also transfers positively to different dual-task combinations (Kramer, Larish, & Strayer, 1995; Gopher, Weil, & Bareket, 1994). Again, in the context of Figure 10.3, a proficient multitasker can know when resources may be temporarily less necessary for one task (e.g., in a data-limited region) and can allocate them to one of greater resource demands (Schneider & Fisk, 1982; Gopher, 1993). To do so in a dynamic environment, where the resource needs for each task may vary continuously, appears to be a general skill that can be taught.
• Indirect evidence may be offered by studies of bilingual children (Bialystok, Craik, et al., 2009). These children appear to be more proficient in executive control and in suppressing unwanted inputs (focused attention) than those raised in a monolingual household (see Chapter 7). They appear to have gained the ability to flexibly switch from one language to another in comprehension, cognition, and speech.
• Navarro et al. (2003) have found that children can be taught to more flexibly manage their attention by playing a game forcing them to divide visual attention between different elements (e.g., faces) in a complex scene, in essence asking them to make judgments that “one of these things is not like the other.” Green and Bavelier (2003) have found evidence that playing certain types of video games can actually expand the useful field of view. While such expansion does not necessarily translate to improved multitasking, it certainly could do so when two sources of visual information for two tasks are not adjacent.
• As we have noted above, some amount of whole-task training, with tasks practiced in pairs, is necessary to teach time-sharing skills and achieve optimal multitasking (Damos & Wickens, 1980). It is apparent that these benefits are realized by learning some of the specific skills above. Furthermore, it is also important to realize that the variable of between-task interaction also enhances the value and importance of whole-task over part-task training (Naylor & Briggs, 1963; Lintern & Wickens, 1991). Such interaction is characteristic of circumstances in which the responses of one task directly affect the perceived information in another. This might characterize concurrent manipulation of the clutch and gear shift in a stick-shift car, simultaneously controlling altitude and heading in an aircraft, or strumming while chording on the guitar. Such linkages and cross-coupling simply cannot be learned when each task is practiced alone.

7.3 Aging and Attention Skills


There is clear evidence that time-sharing, or divided attention skills, decline with age (Verhaeghen, Steitz, et al., 2003; Sit & Fisk, 1999; Fisk & Rogers, 2007). As one direct example, concurrent driving and cell phone use gets worse with age (Alm & Nilsson, 1999). But here again, one may ask which components underlying attention skills decline beyond about age 60 or 70.
• Again, attentional flexibility may be partially responsible. Sit and Fisk (1999) and Tsang and Shaner (1998) both observed age-related deficiencies in the resource allocation component of Figure 10.1. This aging effect was also implicated in the study by Kramer, Larish, and Strayer (1995), which indicated that flexible resource allocation could be trained and transferred to a different dual-task pair. Their experiment also included both younger and older adults. While the older group was less proficient at dual tasking than the younger participants, they also benefited more from the variable priority training, as if this was a capability they were particularly lacking. Bojko, Kramer, and Peterson (2005) observed greater switching costs for older adults in the task switching paradigm discussed in Section 4.1 above.
• Older people also suffer more from distractions (Gazzaley et al., 2005). While the ability to focus attention, which degrades with age, is not in itself a dual-task skill, one can understand how increasing distractibility can disrupt, for example, the ability to concentrate on one member of a dual-task pair requiring working memory and rehearsal, while a less important event in a concurrent task intrudes on that rehearsal. We might say that selective filtering degrades with age (Barr & Giambra, 1990).
• To repeat a theme, both of the above factors appear to reflect some degradation in executive control (Banich, 2009; Shallice & Norman, 1986) for older adults. They perform more poorly in complex tasks demanding working memory and executive control (de Jong, 2001). This also involves a degradation of change detection ability (McCarley, Vais, et al., 2004). Working memory capacity, closely related to executive control, also declines with aging (e.g., Dobbs & Rule, 1989). Hence it is not surprising that reduced efficiency in executive control with aging is observed in a variety of circumstances, just as measures of fluid intelligence, which require such flexibility, decline with age, even as measures of crystallized intelligence, which require direct access to knowledge in long-term memory, may increase. The issue of working memory and intelligence differences between people is addressed further in the next chapter.

8. CONCLUSION AND TRANSITION


Time-sharing and multitasking are ubiquitous in our society, in both leisure and work activities, and can be described by various mechanisms and theories, many of which work in harmony to predict the full range of multitasking performance. Such theories must of necessity accommodate the emergent feature that is the time-sharing of two component tasks, but must also often accommodate theories of the tasks themselves (e.g., what drives the effort demands of each task performed alone). As of now, the gap between well-controlled laboratory research in theory testing and the complexity of real-world multitasking remains large. With the contribution of computational models, the rising concern for the safety implications of multitasking, and a growing understanding of brain mechanisms, however, this gap is being closed.
From a different point of view, there is no doubt that multitasking is often highly stressful, and such stress can have its own consequences. In the following chapter we address the issue of stress, and there we place great emphasis on the assessment and prediction of multitask demands as we address the issue of mental workload. Finally, we note that insight into many aspects of multitasking, stress, and mental workload is being provided by studies of the brain (the study of neuroergonomics), which reveal sources of task differences and of individual differences between people in these endeavors.

Key Terms
adaptive automation 340
allocation policy 324
attention flexibility 342
auditory preemption 335
automaticity 322
between-task interaction 343
codes 327
cognitive tunneling 333
common ground 340
data limit 323
dual task decrement 321
engagement 333
executive control system 331
foveal 329
fractionation 342
interruption management 331
mental workload 323
multiple resource theory 321
neuroergonomics 345
outcome conflict 341
performance-resource function 323
preview 337
processing code 327
prospective memory 334
residual attention 323
residual resources 323
resource limited 323
sensory preemption 335
similarity 336
spare capacity 323
structural interference 325
subgoal completion 334
sustained attention 324
task duration 334
task management 331
time sharing skill 342
undifferentiated capacity 345
urgency 337
visibility 336
visual scanning 342

11 MENTAL WORKLOAD, STRESS, AND
INDIVIDUAL DIFFERENCES: COGNITIVE
AND NEUROERGONOMIC PERSPECTIVES

1. INTRODUCTION
A patient is undergoing a lengthy procedure, say a heart transplant that involves many surgeons,
anesthesiologists, and nurses working together as a team. At some point during the surgery, the patient begins
to exhibit changes in vital signs that might indicate a critical, life-threatening condition. It has been a long,
mentally taxing, and stressful experience for all involved. The lead surgeon must decide on the appropriate
course of action to take, if any. Carrying out the various complex surgical procedures and dealing with unanticipated events during the surgery have imposed a significant demand on the surgeon's attentional
capacity. Is the mental workload experienced by the surgeon so great that the latest unexpected event cannot
be adequately dealt with? Will the stress of the situation impair his or her decision-making ability? Moreover,
can we account for why one surgeon may have sufficient attentional capacity to deal with the latest
emergency, while a colleague may not? Another surgeon may not cope well with the stressful demands of the
situation and may not act decisively, potentially endangering the patient. Yet another one may fall prey to the
fatigue associated with the long operation and may make a faulty decision.
These factors—high mental workload, stressful environments, and differences between people in the way
they are able to cope with such demands—are the focus of this chapter. Attention is the core cognitive ability
that allows human operators to meet these challenges. In previous chapters of this book we have discussed
different aspects of human attention, first with respect to display design in Chapter 3 and then in relation to
multitasking in Chapter 10. In this chapter we continue our examination of applied aspects of attention by
describing its role in mental workload. Some of the theories and empirical findings on dual-task performance
that we discussed in Chapter 10 will be referred to again in relation to workload, but our focus in this chapter
will be on more applied issues of its measurement and evaluation in work settings. Because stress can be a
significant contributor to workload, we also describe some of the dominant theoretical approaches to the study
of stress and methods for its mitigation in the workplace. Finally, because people differ from one another in
their response to sources of task load, individual differences are another topic covered in this chapter.
Our coverage of these three topics—mental workload, stress, and individual differences—is not
comprehensive but selective, with a focus on implications for an understanding of human performance in the
workplace. We examine each of these topics not only from the typical cognitive approach that we have
followed throughout this book, but also from the perspective of neuroergonomics, which is increasingly being applied to the study of a number of different issues in human factors and ergonomics (Parasuraman &
Rizzo, 2007; Parasuraman & Wilson, 2008).

2. THE NEUROERGONOMIC APPROACH


Neuroergonomics has been defined as the study of the human brain in relation to performance at work and in
everyday settings (Parasuraman, 2011). The central premise is that research and practice in human factors and
cognitive engineering can be enriched by considering theories and results from neuroscience. Such a goal has
become possible because of the phenomenal growth in human cognitive, and more recently, social
neuroscience (Gazzaniga, 2009; Cacioppo, 2002). Findings from neuroscience can constrain or extend
theories of human performance (Poldrack & Wagner, 2004). Neuroergonomics can therefore provide added
value, beyond that available from traditional neuroscience and conventional ergonomics, to our understanding
of brain function and behavior as encountered in work and in natural settings.
While human factors research and practice was initially conducted within a behaviorist tradition in its
early history before World War II, the advent of cognitive psychology a decade later saw the adoption of the
information-processing approach, which remains current today and is the approach taken in this book. Until
recently, however, findings from cognitive neuroscience have not had much influence within conventional
human factors work. Some researchers in cognitive neuroscience are aware of the importance of ecological
validity (e.g., see Kingstone et al., 2006), but typically tend to study mental processes in isolation independent
of considerations of the artifacts and technologies of the world that require the use of those processes.
Neuroergonomics goes one critical step further. It postulates that the human brain, which implements
cognition and is itself shaped by the physical environment, must also be examined in interaction with the
environment in order to understand fully the interrelationships of cognition, action, and the world of artifacts
(Parasuraman, 2003). A recent review of progress in human factors describes the historical changes in the field, from its beginnings in behaviorism, through its adoption of the information-processing view, to its culmination in the neuroergonomic approach (Proctor & Vu, 2010).
In this chapter we discuss how our understanding of three areas in human factors research—mental
workload, effects of stress on performance, and individual differences in cognition and human performance—
can be enhanced by examining them from both cognitive and neuroergonomic perspectives.

3. MENTAL WORKLOAD
Mental workload is probably one of the most widely invoked concepts in human factors research and practice
(Bailey & Iqbal, 2008; Loft et al., 2007; Moray, 1979; Parasuraman & Hancock, 2001; Tsang & Wilson,
2006; Wickens, 2008). System designers and managers raise the issue of mental workload when they ask
questions such as: How busy is the operator? How complex are the tasks that the operator is required to
perform? Can any additional tasks be handled above and beyond those that are already performed? Will the
operator be able to respond to unexpected events? How does the operator feel about the tasks being
performed? Each of these questions could be asked of the people in the surgical scenario described at the start
of this chapter. Answers to the questions can be provided given that mental workload can be measured in an
existing system or modeled for a system that is not yet built.
Mental workload characterizes the demands of tasks imposed on the limited information processing
capacity of the brain in much the same way that physical workload characterizes the energy demands upon the
muscles. In any resource-limited system, the most relevant measure of demand is specified relative to the
supply of available resources, as discussed in Chapter 10. Thus a context for conceptualizing this supply-
demand relationship associated with mental workload is provided by the two functions shown in Figure 11.1.
The X-axis depicts increasing resource demands of a task (or set of tasks) in a way that can encompass either
the demands of a single task, or multitask demands (e.g., requirement to supervise more than a single
unmanned vehicle or robot). We will distinguish between the single and multitask cases below.

FIGURE 11.1 Schematic relationship among primary-task resource demand, resources supplied, and performance, indicating the “red line” of
workload overload.

The Y-axis represents two functions. A “resource supply” function (solid line) reflects the fact that when
demands are increased from 0 (doing nothing) to some level, the operator has ample supply to meet those
demands. But as a limited capacity or limited resource system, when the demand exceeds the supply, no
further resources can be supplied; the solid line flattens. Of course this level cannot be established precisely,
and hence the leveling is gradual, not abrupt. The dashed line represents performance on the task(s) in
question. Almost by definition, when supply exceeds demand, performance remains perfect, and is unchanged
by differences in demands. Once demand equals supply, further increases in demand will lead to performance decrements. The discontinuity or "knee" on the two curves is sometimes referred to as the "red
line” of workload (Hart & Wickens, 2010; Rennerman, 2009; Wickens, 2009); or given its fuzziness, a “red
zone.” Importantly, as we describe below, the red line divides two regions of the supply demand space. The
region at the left can be called the “reserve capacity” region. That to the right can be labeled the “overload
region.” The two regions have different implications for workload theory, prediction, and assessment, as well
as the kinds of concerns of engineering psychologists. We treat these in sequence below.

3.1 Workload Overload


Both engineering psychologists and designers are interested in predicting when demand exceeds supply and
performance declines as a result, as well as in applying different remedies when this overload condition
occurs. As we discussed in Chapter 10, when this performance decrement results because of multitasking
overload, models such as the multiple resource model can offer a framework for design or task changes that
will reduce the demand and resulting decrement in performance (see Figure 10.1 in Chapter 10). This may
include using separate, rather than common resources or reducing the resource demands of the task. Examples
of methods for reducing resource demands include reducing working memory load (see Chapter 7),
automating parts of the task (as discussed in Chapter 12), reassigning some of the tasks to another operator or
changing procedures in such a way that previously concurrent tasks can now be performed sequentially.
The multiple resource model is a useful tool for predicting what can be done to lower the multitask
resource demand, and this reduction can be quantified by computational models (e.g., Horrey & Wickens,
2004b; Wickens, 2005). Hence, such models can be used to predict the relative workload (e.g., workload
reduction) of different design alternatives. Multiple resource models can also predict the reduction in
performance decrement achieved by operator training via developing automaticity of one or more of the
component tasks (refer back to Figure 10.2), but such models cannot predict how much training is required to
move demands below the red line. In the same way, the computational models of multiple resources are not
yet able to predict the level of resource demand and resource competition that is at the red line (such that
further demand increases will degrade performance and decreases will not improve it). That is, such models
do not yet predict absolute workload well.
Increasing demands can also be imposed by increasing the difficulty of a single task (rather than
multitasking) as when the working memory load is increased (see Chapter 7), the relational complexity of a
cognitive task is increased (Halford, Baker, et al., 2005; Halford, Wilson, & Phillips, 1998), the bandwidth of
a tracking task is increased (driving along a winding road at faster and faster speeds, see Chapter 5) or the
number of aircraft that a controller needs to supervise in his/her sector rises (Ayaz et al., 2012).
In these cases, where a particular variable can be counted (e.g., number of chunks, number of variable
interactions, number of turns/second or number of aircraft, respectively), it is straightforward to predict
relative workload (more is higher) and in many cases, data have provided a reasonable approximation to a red
line. For example, we have noted that the red line for working memory is at roughly seven chunks of
information (see Chapter 7). For relational complexity it is roughly three (Halford et al., 2005). For tracking
bandwidth, it is roughly one cycle per second (Wickens & Hollands, 2000).
Several variables can moderate these count "constants," effectively moving the red line to the left or right
along the X-axis of Figure 11.1. In the case of the air traffic controller, for example, the degree of uncertainty
in trajectory as well as the complexity of the airspace greatly affect the number of planes that can be
adequately supervised (Hilburn, 2004). Similar modulating factors influence the number of unmanned
vehicles that can be supervised (Cummings & Nehme, 2010).
One of the most important count variables, which can be employed in either single or multitask
circumstances, is time: simple timeline analysis computes the ratio of time required (TR) to time available
(TA) (Parks & Boucek, 1989). We discuss timeline analysis further below in the context of reserve capacity.
More specifically, timeline analysis will enable the system designer to “profile” the workload that operators
encounter during a typical mission, such as landing an aircraft or starting up a power-generating plant (Kirwan
& Ainsworth, 1992). In a simplified but readily usable version, it assumes that workload is proportional to the
ratio of the time occupied performing tasks to total time available. If one is busy with some measurable task(s)
for 100 percent of a time interval, workload is 100 percent during that interval. In a simple model, this may be
defined as a “red line.” Thus, the workload of a mission would be computed by drawing lines representing
different activities, of length proportional to their duration. The total length of the lines would be summed and
then divided by the total time (Parks & Boucek, 1989), as shown in Figure 11.2. In this way the workload
encountered by or predicted for different members of a team (e.g., pilot and copilot) may be compared and
tasks reallocated if there is a great imbalance. Furthermore, epochs of peak workload or work overload in
which load is calculated as greater than 100 percent can be identified as potential bottlenecks.

FIGURE 11.2 Time-line analysis. The percentage of workload at each point is computed as the average number of tasks per unit time within
each time window.

Importantly, time line analysis is equally applicable to both the overload region (TR/TA >1) and the
reserve capacity region (TR/TA <1), and in the latter it can be used equally well in workload predictive
models (if tables are available to look up the time required to perform different tasks) and in workload
assessment, as discussed below, if observers can carefully record operator activity (including non-observable
cognitive tasks). While the 100 percent level may be initially set as the red line, observations by Parks and
Boucek (1989) suggest instead that it is the 80 percent level where errors in performance begin to occur.
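To make the TR/TA computation concrete, the sketch below (our illustration, not a procedure taken from Parks and Boucek) sums the time occupied by a set of task intervals within a time window and compares the resulting ratio with the red-line criterion; the task intervals and the 80 percent threshold shown are hypothetical values.

```python
# Illustrative sketch of simple timeline workload analysis (TR/TA).
# Task intervals and the red-line threshold are hypothetical examples.

def timeline_workload(tasks, window_start, window_end, red_line=0.80):
    """Return workload as time required / time available within a window.

    tasks: list of (start, end) times, in seconds, for each activity.
    Overlapping tasks are summed separately, so concurrent activity
    can push the ratio above 1.0 (work overload).
    """
    time_available = window_end - window_start
    time_required = 0.0
    for start, end in tasks:
        # Count only the portion of each task falling inside the window.
        overlap = min(end, window_end) - max(start, window_start)
        time_required += max(0.0, overlap)
    workload = time_required / time_available
    return workload, workload > red_line

# Example: three activities within a 60-second window.
tasks = [(0, 20), (15, 40), (50, 58)]   # (start, end) in seconds, hypothetical
load, over_red_line = timeline_workload(tasks, 0, 60)
print(f"Workload = {load:.0%}, exceeds red line: {over_red_line}")
```

Computed over successive windows of a mission, such ratios yield the kind of workload profile shown in Figure 11.2, and epochs in which the ratio exceeds 1.0 (or the 80 percent criterion noted above) flag potential bottlenecks for a given crew member.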
The important general point to be made here is that for both single and multitask demands in the
overload region above the red line, simple measures of performance are adequate to measure “workload,” and
models of multitask performance or single task count variables can predict workload increases (performance
decreases) or relative workload above the red line. Count variables can be used to predict absolute workload
values, both above and below the red line, but multitask interference models cannot easily do so at the current
stage of their maturity.

3.2 Reserve Capacity Region


When demands are below the red line, both single task count variables and multiple resource models can
continue to offer reliable relative predictions (for example, four chunks will have a higher workload than
three chunks; and using separate modality resources will create more reserve capacity compared to common
resources). However, as the dotted line curve in Figure 11.1 makes quite clear, performance on the task of
interest is no longer adequate to assess differences in workload. (We refer to this task as the primary task.)
Not only do primary task performance measures fail to reflect differences in resource demands below the red
line, many such primary tasks are often very coarse, or nonexistent in their performance measures because a
final output may not reflect vast differences in the complexity of cognition that supported it. For example, a
decision-making task might have only one of two outcomes (correct or wrong, see Chapter 8). Yet reaching
the decision may involve considerable working memory activity needed to entertain alternative hypotheses
and evaluate possible outcomes. All of this cognitive activity is a large contributor to mental workload, and
yet variation in it will not be reflected by the simple rightwrong measure of decision-making performance. As
another example, the task of “maintaining situation awareness” can impose a high level of workload (and
potential interference with other activities; Wickens, 2002c), but there may be no measure of performance at
all associated with this task (unless situation awareness is periodically “probed” by potentially intrusive
measures like SAGAT; see Chapter 7).

3.3 Measures of Mental Workload and Reserve Capacity


To cope with the inadequacy of primary task performance measures for workload assessment, engineering
psychologists have developed four other categories of workload measurement tools, all designed, directly or
indirectly, to assess the amount of reserve capacity (e.g., the distance to the left of the red line). We discuss
three of these first: behavior, secondary task, and subjective measures, before describing physiological
measures in some detail, given their anchoring in neuroergonomics.

3.3.1 BEHAVIORAL MEASURES Even as performance might not change with increasing resource demands in the
reserve capacity region, behavior often will. If behavior is defined in its simplest form as “doing something,”
then, as described above, time line analysis serves as a very effective measure of workload. In many manual
control tasks, control activity, mean control velocity, or high frequency power can serve as an effective
behavioral measure of workload that may change with task demands even as tracking error (performance)
remains constant.

3.3.2 SECONDARY TASKS Secondary tasks have the greatest fidelity. One simply asks: if the operator is
performing the primary task at an adequate level (e.g., perfect performance), how well can she/he perform a
concurrent secondary task? The better that task can be performed, the more residual capacity we infer is left over from the primary task, and the lower the resource demands of that primary task. A variety of
secondary tasks have been employed in a variety of circumstances, such as responding to an unexpected probe
stimulus, estimating time passing, or doing mental arithmetic, and many of these have been reviewed in other
papers (e.g., Gopher & Donchin, 1986; Hancock & Meshkati, 1988; Hart & Wickens, 2010; Hendy, Liao, &
Milgram, 1997; Moray, 1979, 1988; O’Donnell & Eggemeier, 1986; Tsang & Wilson, 1997; Wierwille &
Williges, 1978; Williges & Wierwille, 1979).
One major problem with secondary tasks is that researchers cannot always control the amount of
attention given to them. For example, in some cases the secondary task may be intrusive on the primary task
whose workload is measured, disrupting that level of performance. Ideally, this performance should be perfect
and untouched, or at least primary task decrements caused by the secondary task should be the same across all
versions of the primary task to be compared. Such disruption may bias the measurement itself (after all, here
the secondary task is given more resources than it should receive) in much the same way that measuring the
length of a worm with a ruler could cause the worm to contract if the ruler touches it in the measurement
process. Alternatively, in order to avoid such intrusiveness, operators (particularly in high risk environments
like driving or flying) might choose to ignore the secondary task altogether, providing no measure
whatsoever.
In order to partially guard against these problems, researchers have recognized the importance of
embedded secondary tasks (Raby & Wickens, 1994). These are in fact natural components of the total task
scenario, but typically of the lower level of priority characteristic of a secondary task. Examples of embedded
secondary tasks might be periodic glances to the rear or side view mirror (to measure driver workload) or
offering periodic position reports to an air traffic controller (to measure pilot workload). As an embedded
secondary task, Metzger and Parasuraman (2005) measured the latency with which air traffic controllers made
a check mark on the flight strip of an aircraft when it had reached a navigational waypoint, and found that
controllers either delayed or omitted such checking when they had more aircraft in their sector to control and
their workload increased.

3.3.3 SUBJECTIVE MEASURES Subjective measures of experienced workload are widely used techniques for workload assessment (e.g., Hart & Wickens, 2010; Hill et al., 1992; O'Donnell & Eggemeier, 1986; Tsang & Vidulich, 2006; Vidulich & Tsang, 1986; Wierwille & Casali, 1983). With this method, workers are simply asked to report their experience of workload (current, or over some previous duration) on a unidimensional scale. A variety of scales exist, such as the Bedford scale or the Modified Cooper-Harper scale (Wierwille & Casali, 1983; see Hart & Wickens, 2010, for a review). Many of these scales have
the advantage that a verbal description of the red line can be actually placed at a given level. For example, a
rating of 7 on a 10-point scale might be described as “no extra attention is available to give to any additional
tasks.” Subjective measures can therefore effectively serve the range of task demands both above and below
the red line.
There are also multidimensional workload scales (Boles, Bursk, et al., 2007; Hill, Iavecchia, et al., 1992;
Reid & Nygren, 1988; Vidulich & Tsang, 1986). These assume that mental workload has different
components, just as physical workload may be imposed separately on, say, the arms, legs, or fingers. Perhaps
the most widely used of these is the NASA TLX (Task Load Index) scale, which asks users to provide
separate subjective ratings on subscales of mental demand, physical demand, time demand (time pressure),
performance, effort, and frustration level (Hart & Staveland, 1988).
Multidimensional scales do not always provide sufficient added information relative to unidimensional scales to justify the added time required to generate multiple ratings. But often, when comparing qualitatively different systems ("apples and oranges") on a workload scale, such qualitative differences can be revealed by different patterns of ratings across the TLX subscales. For example, one procedure for programming a navigational device may impose a high time (temporal) load but low mental load, whereas another, with which it is compared, may show the reverse. Descriptions of TLX provide guidance on how subscales can be combined to produce a single scale if desired (Hart & Staveland, 1988).
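As a concrete illustration of one such combination, the sketch below computes a weighted TLX-style score from six subscale ratings using pairwise-comparison weights, in the spirit of Hart and Staveland (1988); the specific ratings and weights shown are hypothetical values for a single operator.

```python
# Hypothetical example of combining NASA-TLX subscale ratings into one score.
# Ratings are on a 0-100 scale; the weights come from 15 pairwise comparisons
# of the subscales (each subscale can be chosen 0-5 times).

ratings = {              # illustrative ratings for one operator
    "mental demand": 70,
    "physical demand": 20,
    "temporal demand": 85,
    "performance": 40,
    "effort": 65,
    "frustration": 55,
}
weights = {              # number of times each subscale was chosen; sums to 15
    "mental demand": 4,
    "physical demand": 0,
    "temporal demand": 5,
    "performance": 2,
    "effort": 3,
    "frustration": 1,
}

assert sum(weights.values()) == 15, "pairwise-comparison weights must sum to 15"

weighted_tlx = sum(ratings[k] * weights[k] for k in ratings) / 15.0
raw_tlx = sum(ratings.values()) / len(ratings)   # unweighted ("raw TLX") variant

print(f"Weighted TLX = {weighted_tlx:.1f}, Raw TLX = {raw_tlx:.1f}")
```

In this hypothetical profile the weighted score is dominated by temporal and mental demand, illustrating how the pattern across subscales, not just the overall score, can reveal qualitative differences between systems.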

3.3.4 PURPOSE OF WORKLOAD ASSESSMENT Importantly, all measures of workload, whether performance-based, subjective, or physiological, can be assessed for two qualitatively different purposes. Offline measures are assessed during system evaluation, and designers can take the results of this assessment, diagnose a workload deficiency, and proceed to invoke some remedy to move demands far enough below the red line to preserve a certain margin of residual capacity. In contrast, online measures of workload are assessed while the operator is performing the task outside the laboratory in the operational environment and can be used in adaptive
automation to reduce the demands if workload is either over the red line, or perhaps is increasing toward it
(see Chapter 12).
Online measures are only one source of evidence available for automation to infer workload, and other
sources will be discussed in the next chapter. However, we note here that for online applications, there is
heightened concern for the intrusiveness of the measure. Anything that might interfere with the primary task, such as performing a secondary task or even responding to a probe in order to give a subjective rating, could have serious consequences. For this reason, engineering psychologists have been particularly
interested in “passive,” less intrusive neurophysiological measures to index workload. We now turn to these in
some detail.

3.4 Neuroergonomics of Workload

3.4.1 OVERVIEW Over a century ago, the famous neuroscientist Sir Charles Sherrington suggested that mental
work was fundamentally brain work. He proposed that the movement of blood through the brain’s arteries
was in response to the demands placed on neurons by the need for cognitive processing. Such neuronal
demands were met by the brain supplying increased oxygenated blood to the area of the active neurons.
Sherrington therefore proposed that neuronal function was reflected in cerebral blood flow, which he was able
to measure in animal models (Roy & Sherrington, 1890), but which he suggested could also be applied, in
principle, to the study of human brain function. It took nearly a century before technological developments,
first with the invention of positron emission tomography (PET) but later with functional magnetic resonance
imaging (fMRI), allowed for noninvasive measurement of cerebral blood flow in humans, and thereby
provided a basis for confirming Sherrington’s hypothesis in humans. fMRI findings of increased cerebral
blood flow in regions of the prefrontal cortex with increased task demand, e.g., higher working memory load,
have pointed to neural correlates of resources (Parasuraman & Caggiano, 2005; Posner & Tudela, 1997).
Furthermore, other fMRI findings (Just et al., 2003) have supported the distinction between
perceptual/cognitive, verbal/spatial, and input and output modality-specific processing, which are components
of the multiple resource model of Wickens (1984).
Cognitive neuroscience research using fMRI will continue to enhance our knowledge of the specific
neural systems associated with attention and cognitive processing and will therefore contribute to a better
theoretical understanding of the components of mental workload. But fMRI is an expensive, restrictive, and
non-portable technology that is not suitable for routine use or for low-cost practical applications. However, a
variety of other neuroergonomic techniques are available for the assessment of mental workload. These
methods fall into three general classes: (1) electrophysiological; (2) hemodynamic; and (3) autonomic. We
discuss examples of techniques in each of these classes in the following sections.

3.4.2 EEG The electroencephalogram (EEG) records the brain's electrical activity from electrodes placed on the
scalp of a participant’s head. Spectral power in different EEG frequency bands has been found to be sensitive
to increased working memory (WM) load and demand for attentional resources. Given the important role that
WM plays in comprehension, reasoning, and other cognitive tasks (Baddeley, 2003; see also Chapter 7), many
studies have examined changes in the spectral structure of the EEG in tasks in which WM demands are varied.
A common WM task in which workload can be easily manipulated is the “N-back” task, in which
participants are given a continuous stream of stimuli and must respond, not to the current stimulus, but that
presented N stimuli back. This task is trivial when N = 0 and relatively easy when N = 1, but much more
difficult when N >1. To be successful in such tasks when the WM demand is high (e.g., N = 3), participants
must continuously apply mental effort and typically report high levels of workload and show increased neural
activity in frontal and parietal regions of the brain (Owen et al., 2005). The spectral structure of the EEG
shows systematic load-related modulation during such N-back task performance. Typically, when recorded
from midline frontal electrode sites, EEG activity in the theta band (4–7 Hz) is increased in power for high
WM load compared to low load (Gevins & Smith, 2003). Frontal midline theta increases have also often been
reported for other difficult tasks requiring sustained concentration (Gevins et al., 1998). In contrast to the
midline frontal theta, activity in the alpha band (8–12 Hz) shows an inverse relationship with task load, being
reduced with high WM demand. The attenuation of EEG alpha with visual attention and with cognitive load
has been shown in many studies since its initial demonstration by the discoverer of EEG, Hans Berger in
1929.
Frontal theta activity (4–7 Hz) increases while alpha power (8–12 Hz) decreases as more resources have
to be allocated to the task; these two bands thus provide sensitive measures of mental workload (Gevins & Smith, 2003).
Spectral power in these two frequency bands can be fairly easily computed from the raw EEG, including in
near real-time (several seconds) using readily available software packages.
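As a minimal sketch of such a computation (not a description of any particular commercial package), the following code estimates theta and alpha band power for a single EEG channel using Welch's method; the sampling rate, segment length, and simulated signal are illustrative assumptions.

```python
import numpy as np
from scipy.signal import welch

def band_power(signal, fs, low, high):
    """Estimate power in the [low, high] Hz band via Welch's periodogram."""
    freqs, psd = welch(signal, fs=fs, nperseg=2 * fs)   # 2-second segments
    band = (freqs >= low) & (freqs <= high)
    return np.trapz(psd[band], freqs[band])

fs = 256                                    # samples per second (assumed)
t = np.arange(0, 10, 1 / fs)                # 10 seconds of simulated EEG
eeg = (20 * np.sin(2 * np.pi * 6 * t)       # strong theta component (6 Hz)
       + 5 * np.sin(2 * np.pi * 10 * t)     # weaker alpha component (10 Hz)
       + np.random.randn(t.size))           # background noise

theta = band_power(eeg, fs, 4, 7)           # expected to rise with WM load
alpha = band_power(eeg, fs, 8, 12)          # expected to fall with WM load
print(f"theta = {theta:.1f}, alpha = {alpha:.1f}, theta/alpha = {theta / alpha:.2f}")
```

In an applied setting, an index such as the theta/alpha ratio computed over a sliding window of a few seconds could serve as a near real-time workload indicator, subject to the artifact-removal issues discussed below.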
EEG measures have also been found to index operator mental workload in more complex tasks that are
more representative of operational environments. These include tasks such as the Multiple Attribute Task
Battery (Gevins & Smith, 2007), simulated process control (Hockey et al., 2009), and operational tasks such
as flight, air traffic control (ATC), and road and rail transportation (Brookhuis & De Waard, 1993; Hankins &
Wilson, 1998; Lei & Roetting, 2011; Wilson, 2001, 2002). For example, Brookings et al. (1996) recorded
EEG from Air Force controllers while varying the difficulty of a simulated ATC task along two dimensions,
the number or volume of aircraft to be controlled, and the aircraft mix (complexity). Right hemisphere frontal
and temporal EEG theta band activity increased with workload. Midline central and parietal areas showed
theta band activity to also increase with increased workload to both types of task manipulation. Alpha band
activity decreased with increased task complexity but not with the number of aircraft being monitored. Thus
these EEG components were differentially sensitive to different aspects of mental workload.
Can EEG be used to assess mental workload reliably in operational settings? Yes, but only with some
difficulty. One problem is that EEG can be contaminated by eye movement and muscular artifacts in such
environments. While it is relatively easy to remove these artifacts offline, after recordings have been made and stored, online artifact removal is more challenging. However, the recent development of mathematical techniques such as independent component analysis (ICA) has allowed for the measurement of artifact-free EEG in an online manner (Jung et al., 2000). Real-time measurement of artifact-free EEG in
operational settings is currently a topic of much research and development.

3.4.3 EVENT-RELATED POTENTIALS Event-related potentials (ERPs) represent the brain’s neural response to
specific sensory, motor, and cognitive events. ERPs are computed by recording the EEG and by averaging
EEG epochs time-locked to a particular stimulus or response event. At the present time ERPs hold a
somewhat unique position in the tool shed of cognitive neuroscientists because they provide the only
neuroimaging technique that has high temporal resolution, of the order of milliseconds, compared to
techniques such as PET and fMRI which are inherently sluggish (because they index cerebral hemodynamics).
ERPs are often used whenever researchers need to examine the relative timing of neural mechanisms
underlying cognitive processes with millisecond precision. For example, the timing information provided by
ERPs provided critical evidence for the “early selection” view of attention because of findings showing
attentional modulation of neural activity after about 100 ms post-stimulus (Hillyard et al., 1998).
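The core averaging operation is simple enough to sketch. The code below is a hypothetical illustration (not any specific ERP package): it extracts fixed-length EEG epochs around known stimulus onsets, baseline-corrects them, and averages, so that time-locked components such as the P300 emerge from the background EEG.

```python
import numpy as np

def average_erp(eeg, event_samples, fs, pre=0.1, post=0.6):
    """Average EEG epochs time-locked to events to obtain an ERP.

    eeg: 1-D array of samples from one channel.
    event_samples: sample indices of stimulus onsets.
    pre, post: epoch window in seconds before/after each event.
    """
    n_pre, n_post = int(pre * fs), int(post * fs)
    epochs = []
    for onset in event_samples:
        if onset - n_pre < 0 or onset + n_post > eeg.size:
            continue                           # skip events too close to the record edges
        epoch = eeg[onset - n_pre : onset + n_post].astype(float)
        epoch -= epoch[:n_pre].mean()          # baseline-correct on the pre-stimulus interval
        epochs.append(epoch)
    return np.mean(epochs, axis=0)             # random noise averages out; time-locked activity remains

# Hypothetical usage: stimulus onsets every 2 seconds in a 256-Hz recording.
fs = 256
eeg = np.random.randn(fs * 120)                # 2 minutes of simulated EEG
events = np.arange(fs * 2, fs * 110, fs * 2)   # stimulus onsets (sample indices)
erp = average_erp(eeg, events, fs)             # waveform spanning -100 ms to +600 ms
```

With purely simulated noise the averaged waveform is essentially flat; with real data, the amplitude of the averaged waveform in the P300 window (roughly 300 to 600 ms post-stimulus) can then serve as the resource-sensitive measure discussed next.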
The latency of one prominent ERP component, the P300, increases with the difficulty of identifying
targets but not with increases in the difficulty of response choice, suggesting that P300 provides a relatively
pure measure of perceptual processing/categorization time, independent of response selection/execution stages
(Kutas et al., 1977; see Chapter 9). P300 amplitude is also proportional to the amount of attentional resources
allocated to the target (Johnson, 1986; Polich, 2003). Thus, any diversion of resources away from target
discrimination in a dual-task situation will lead to a reduction in P300 amplitude. Isreal, Chesney, et al. (1980)
used this logic to examine the temporal locus of added workload demands in a dual-task situation. They
showed that P300 amplitude decreased when a primary task, tone counting, was combined with a secondary
task of visual tracking. However, increases in the difficulty of the tracking task did not lead to a further
reduction in P300 amplitude. Thus, they argued that P300 reflects processing resources associated with
perceptual processing and stimulus categorization, but not response-related processes (see Chapter 10, Section
3.1). In a subsequent study, Wickens, Kramer, et al. (1983) showed reciprocal changes in P300 amplitude as
resources were flexibly allocated between primary and secondary tasks.
Several studies have used the auditory P300 to assess the workload demands of different complex tasks.
Isreal, Wickens, et al. (1980) showed the sensitivity of P300 to display complexity in an air-traffic monitoring
type of task. Ullsperger et al. (2001) used secondary-task P300 amplitude changes to make inferences
regarding the amount and type of resource demand of a gauge monitoring task. More recent studies have used
P300 to assess the workload demands of learning to use different computer systems. For example, one of the
problems associated with educational systems such as hypermedia is to assess how demanding they are for
individual learners, and thereby to adapt them on a person-by-person basis, as discussed in Chapter 7.
Schultheis and Jamieson (2004) found that P300 amplitude to auditory stimuli was sensitive to the difficulty
of text presented in a hypermedia system. They concluded that auditory P300 amplitude and other measures,
such as reading speed, may be combined to evaluate the relative ease of use of different hypermedia systems.
For another example from the domain of driving assessment, Baldwin and Coyne (2005) found that P300
amplitude was sensitive to the increased difficulty of simulated driving in poor visibility due to fog, compared
to driving in clear conditions. The unique value of this neuroergonomic measure was shown by the finding
that performance-based and subjective indices were not affected by the visibility manipulation.

3.4.4 ULTRASOUND MEASURES OF CEREBRAL BLOOD FLOW EEG and ERP represent the class of
electrophysiological measures. Two hemodynamic measures, in addition to PET and fMRI, are Transcranial
Doppler Sonography (TCD) and near infrared spectroscopy. TCD is an ultrasound device that can be used as a
noninvasive method to monitor cerebral blood flow. TCD therefore provides another technique that, like
fMRI, can be used to examine Sherrington's view that mental work is associated with brain work, as reflected
in cerebral blood flow to the left or right cerebral hemispheres. TCD uses a small 2 MHz pulsed Doppler
transducer to gauge arterial blood flow, typically of the middle cerebral artery (MCA), which can be isolated
through the cranial “windows” in the temporal bone on each side of the head (Aaslid, 1986). The low weight
and small size of the TCD transducer and the ability to embed it in a headband allow for measurement of
cerebral blood flow while not limiting, or becoming hampered by, head and body motion (Tripp & Warm,
2007).
When a particular area of the brain becomes metabolically active due to cognitive processing, byproducts
such as carbon dioxide increase, leading to a dilation of blood vessels serving that area. This, in turn, results in
increased blood flow to that region. Several TCD studies have shown that changes in the difficulty of
perceptual and cognitive tasks are accompanied by increases in cerebral blood flow in either the left or right
hemisphere (see reviews by Duschek and Schandry, 2003; Stroobant & Vingerhoets, 2000). Shaw et al.
(2010) examined dynamic changes in cerebral blood using TCD in a simulated air defense task in which
participants had to protect a “no fly zone” by engaging enemy aircraft that approached the zone. They found
that cerebral blood flow closely tracked changes in the number of enemy threats that lead to changes in mental
workload.

3.4.5 NEAR INFRARED SPECTROSCOPY AND CEREBRAL OXYGENATION The TCD technique provides only an indirect
index of oxygen utilization in the brain, as revealed by changes in blood flow. A more direct measure of
cerebral oxygenation would be useful as another indicator of “brain work”—engagement of neurons recruited
in the service of cognitive processing. Optical imaging, in particular near infrared spectroscopy (NIRS),
provides such a measure. NIRS typically uses near-infrared light that is emitted by several sources embedded
in a strap that is placed over the front of the head. The strap also contains several infrared detectors that detect
the light after it has passed through the skull and brain. Changes in light absorption, typically measured at two
wavelengths, are used to calculate relative changes of oxygenated and deoxygenated blood in the frontal
cortex. NIRS has a precision advantage over TCD, given its ability to assess activation in several frontal brain
regions, and not just in the left and right hemispheres as with TCD.
Previous research using NIRS as well as fMRI has shown that tissue oxygenation increases with the
information-processing demands of the task being performed (Toronov et al., 2001). More recently, Ayaz et
al. (2012) used NIRS to examine cerebral oxygenation in experienced controllers monitoring air traffic in a
high-fidelity simulator. Controller communications with pilots were via standard voice or visual text data link
(see Chapter 7). Ayaz et al. (2012) found that there was a systematic increase in blood oxygenation as the
number of aircraft that had to be controlled increased from 6 to 12 to 18. These neural changes were
accompanied by similar changes in subjective workload, as measured by the NASA-TLX.

3.4.6 HEART-RATE VARIABILITY Autonomic measures constitute the third class of neuroergonomic measure. Of
these, heart-rate variability has been the object of sustained study. Several investigators have examined
different measures associated with the variability or regularity of heart rate as a measure of mental load.
Variability is generally found to decrease as the load increases, particularly that variability which cycles with
a period of around 10 seconds (0.1 Hz) (Mulder & Mulder, 1981). When this variability is associated
specifically with the cycles resulting from respiration, the measure is termed sinus arrhythmia (Backs et al.,
2003; Derrick, 1988; Mulder et al., 2003; Sirevaag et al., 1993; Vicente et al., 1987).
Heart rate variability is sensitive to a number of different difficulty manipulations and therefore appears
to be more sensitive than diagnostic. Derrick (1988) investigated this measure with four quite different tasks
performed in different combinations within the framework of the multiple-resource model. His data suggested
that the variability measure reflected the total demand imposed on all resources within the processing system
more than the amount of resource competition (and therefore dual-task decrement) between tasks. Backs et al.
(2003) examined three different heart rate measures during simulated driving over easy or difficult curved
courses and found that they were differentially affected by curve radius. They concluded that the differential
effects indicated that the perceptual demands of driving could be distinguished from central and motor
processing demands.
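As a rough sketch of how the 0.1 Hz component might be quantified, the code below interpolates a series of R-R intervals onto an evenly sampled time base and integrates its Welch spectrum in a band around 0.1 Hz; the resampling rate, band edges, and simulated data are illustrative assumptions rather than a standard prescription.

```python
import numpy as np
from scipy.signal import welch

def lf_hrv_power(rr_intervals_s, resample_hz=4.0, band=(0.07, 0.14)):
    """Estimate heart-rate variability power in a band around 0.1 Hz.

    rr_intervals_s: successive R-R intervals in seconds.
    The irregularly spaced beat series is interpolated onto an even
    time base before the spectrum is computed.
    """
    beat_times = np.cumsum(rr_intervals_s)
    t_even = np.arange(beat_times[0], beat_times[-1], 1.0 / resample_hz)
    rr_even = np.interp(t_even, beat_times, rr_intervals_s)
    freqs, psd = welch(rr_even, fs=resample_hz, nperseg=min(256, rr_even.size))
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return np.trapz(psd[mask], freqs[mask])

# Simulated example: a 0.1-Hz oscillation superimposed on a 0.8-s mean R-R interval.
n_beats = 400
beat_times_approx = np.arange(n_beats) * 0.8        # approximate beat times in seconds
rr = (0.8 + 0.05 * np.sin(2 * np.pi * 0.1 * beat_times_approx)
      + 0.01 * np.random.randn(n_beats))
print(f"Power near 0.1 Hz = {lf_hrv_power(rr):.2e}")
```

Given the findings described above, this band power would be expected to shrink as task load increases.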

3.4.7 PUPIL DIAMETER Several investigators have observed that the diameter of the pupil correlates quite closely
and accurately with the resource demands of a large number of diverse cognitive activities (Beatty, 1982).
These include mental arithmetic (Kahneman et al., 1967), short-term memory load (Peavler, 1974), visual
search (Porter et al., 2007), air traffic control monitoring load (Jorna, 1997), simulated driving (Recarte &
Nunes, 2003), and on-the-road driving (Razael & Klette, 2011). This diversity of responsiveness suggests that
the pupillometric measure may be highly sensitive, although as a result it is undiagnostic of the type of
workload demand. It will reflect demands imposed anywhere within the information-processing system.
However, changes in ambient illumination must be monitored, since these also affect the pupil; and because of its association with the autonomic nervous system, the measure will also be susceptible to variations in
emotional arousal.

3.4.8 VISUAL SCANNING, ENTROPY, AND THE “NEAREST NEIGHBOR INDEX” While discussed as a measure of
selective attention allocation in Chapter 3, visual scanning—the direction of pupil gaze—can also contribute
extensively to workload modeling in two different ways. First, as we have noted, dwell time can serve as an
index of the resources required for information extraction from a single source. In an aircraft simulation,
Bellenkes et al. (1997) found that dwells were longest on the most information-rich flight instrument (the attitude indicator, or artificial horizon; see Chapter 3) and that dwells were much longer for novice
than expert pilots, reflecting the novices’ greater workload in extracting the information. Second, scanning
can be a diagnostic index of the source of workload within a multielement display environment. For example,
Bellenkes et al. found that long novice dwells on the artificial horizon display were coupled with more
frequent visits, and hence that instrument served as a major “sink” for visual attention. Little time was left for
novices to monitor other instruments, and as a consequence their performance declined on tasks using those
other instruments. Dinges et al. (1987) and Wikman et al. (1998) used scanning as a critical measure of the in-vehicle head-down time caused by the workload associated with different in-vehicle systems such as maps,
radio buttons, etc.
Analyzing the degree of randomness, or entropy, of visual scanning can also be potentially informative regarding mental workload (Ephrath et al., 1980; Harris et al., 1986). One view is that as mental workload increases, a person's pattern of visual exploration of a region of interest in a display becomes more stereotyped and less random, because the person fixates on only the few regions of the display containing the relevant information, so that entropy decreases. Conversely, a reduction in mental workload should increase entropy.
Hilburn et al. (1997) confirmed this finding when examining the effects of automation on the mental workload
and visual scanning patterns of experienced air traffic controllers. A challenge is that the entropy measure in
this and other related studies typically ignores visual fixations outside a defined region of interest. Di Nocera
et al. (2007), however, argued that all areas of visual fixation should be analyzed, and proposed a derived
measure of mental workload called the Nearest Neighbor Index (NNI), defined as the ratio of the average of the observed minimum distances between fixation points to the mean distance that one would expect if the distribution of fixations were random. Di Nocera et al. (2007) found that the NNI was significantly higher during the demanding takeoff and landing phases of flight operations than during cruise flight, pointing to the utility of NNI as an index of mental workload.
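The sketch below shows one way such an index might be computed from a set of fixation coordinates, using the Clark-Evans expected nearest-neighbor distance for a spatially random pattern; this is our illustrative reading of the definition above, not Di Nocera et al.'s exact implementation, and the display dimensions are hypothetical.

```python
import numpy as np

def nearest_neighbor_index(fixations, area):
    """Nearest Neighbor Index for a set of fixation points.

    fixations: array of shape (n, 2) holding (x, y) gaze coordinates.
    area: size of the viewing region, in the same units squared.
    The observed mean nearest-neighbor distance is divided by the value
    expected for a spatially random pattern: 0.5 / sqrt(n / area).
    """
    pts = np.asarray(fixations, dtype=float)
    n = len(pts)
    diffs = pts[:, None, :] - pts[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)           # ignore each point's distance to itself
    observed = dists.min(axis=1).mean()
    expected = 0.5 / np.sqrt(n / area)
    return observed / expected

# Hypothetical example: 100 fixations on a 1280 x 1024 pixel display.
rng = np.random.default_rng(0)
dispersed = rng.uniform([0, 0], [1280, 1024], size=(100, 2))
clustered = rng.normal([640, 512], 40, size=(100, 2))

print(nearest_neighbor_index(dispersed, 1280 * 1024))   # near 1.0: random-like scanning
print(nearest_neighbor_index(clustered, 1280 * 1024))   # well below 1.0: tightly clustered scanning
```

Values near 1.0 thus indicate dispersed, random-like scanning, whereas values well below 1.0 indicate clustered fixations, giving a single number that can be tracked across mission phases as in the flight study described above.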

3.4.9 COSTS AND BENEFITS OF PHYSIOLOGICAL MEASURES OF WORKLOAD Neuroergonomic indexes have two
advantages over behavioral and subjective measures of workload: (1) Such measures provide a relatively
continuous record of data over time. (2) They do not intrude into primary-task performance. But they sometimes require that electrodes be attached, so a degree of physical constraint is imposed, and therefore
they are not truly unobtrusive in a physical sense. However, the latest generation of eye tracking devices does
not require any instrumentation of the participant, as the infrared sensors can be mounted on the desk or the
side of the display being monitored. Other measures do require that the participant be fitted with the sensor in
some manner, e.g., an EEG cap or a head strap for NIRS. These constraints will influence user acceptance.
Many physiological measures have a further potential cost in that they are, generally, one conceptual step
removed from the inference that the system designers would like to make. That is, workload differences
measured by physiological means must be used to infer that performance breakdowns would result or to infer
how the operator would feel about the task. Secondary-task measures assess the former directly, whereas subjective measures assess the latter.
There are many factors such as cost, ease of implementation, intrusiveness, etc., that must be taken into
consideration when choosing a workload assessment technique for engineering psychology applications.
Some of these factors (e.g., cost) may rule out the use of physiological measures in favor of simpler indexes
such as subjective measures. Some individuals may also not wish to be “wired up” for physiological recording
in work environments, so operator acceptance is another important factor to consider. With increasing
miniaturization and development of “dry electrode,” wireless wearable systems, some of these concerns are
diminishing. At the same time, even if practical considerations rule out the use of physiological measurement,
the neuroergonomic approach may nevertheless remain important for theory development, which in turn may
lead to more sensitive assessment of mental workload (Kramer & Parasuraman, 2007).

3.5 Relationship Between Workload Measures


If all measures of workload demonstrated high correlation with one another and the residual disagreement was
due to random error, there would be little need for further validation research in the area. The practitioner
could adopt whichever technique was methodologically simplest and most reliable for the workload
measurement problem at hand. Generally, high correlations between measures will be found if the measures
are assessed across tasks of similar structure and widely varying degrees of difficulty. However, the
correlations may not be high and may even be negative when quite different tasks are contrasted. For
example, consider an experiment conducted by Herron (1980) in which an innovation designed to assist in a
target-aiming task was subjectively preferred by users over the original prototype but generated reliably
poorer performance than the original. Similar dissociations have been observed by Wierwille and Casali
(1983) and by Childress et al. (1982), who measured pilot workload associated with cockpit-display
innovations.
We use the term dissociation to describe these circumstances in which conditions that are compared have
different effects on different workload measures. The understanding of attention and resource theory can be
quite useful in interpreting why these dissociations occur. Yeh and Wickens (1988) suggested that subjective
measures directly reflect two factors: the effort that must be invested into performance of a task and the
number of tasks that must be performed concurrently. These two factors, however, do not always influence
performance. To illustrate, consider the following situations:
A. If two different tasks are in the underload region on the left of Figure 11.1, the greater resources
invested in the more difficult task (and therefore the higher subjective workload) will not yield better
performance.
B. Subjective measures often fail to reflect differences due to data limits (see Chapter 10, Figure 10.2),
particularly if the lower level of performance caused by the lower level of the data limit is not
immediately evident to the performer who is giving the rating. (Note however that this is an advantage
of the NASA TLX measure, which allows the operator to separately rate “performance” and “mental
effort.”)
C. In the context of the performance-resource function, if two systems are compared, one of which
induces a greater investment of effort, this one will probably show higher subjective workload, even as
its performance is improved (through the added effort investment). This dissociation is shown when
effort investment is induced through monetary incentives (Vidulich & Wickens, 1986). However, it
also appears that greater effort is invested when better (e.g., higher resolution) display information is
available to achieve better performance. Thus in tracking tasks, features like an amplified error signal
(achieved through magnification or prediction and inducing more precise corrections) will increase
tracking performance but at the expense of higher subjective ratings of workload (Yeh & Wickens,
1988).
D. Yeh and Wickens (1988) concluded that a very strong influence on subjective workload is exerted by the number of tasks that must be performed at once. The subjective workload from time-sharing two (or more) tasks is almost always greater than that from a single task. We can see here the source of
another dissociation with performance because a single task might be quite difficult (and result in poor
performance as a result), whereas a dual-task combination, if the tasks are not difficult and use
separate resources, may indeed produce a very good performance in spite of its higher level of
subjective load.
The presence of dissociations often leaves the system designer in a quandary. Which system should be
chosen when performance and workload measures do not agree on the systems' relative merits? The previous discussion, and the chapter as a whole, does not provide a firm answer to this question. However, the explanation for the causes of dissociation and its basis in a theory of resources should at least help the
designer to understand why the dissociation occurs, and thus why one measure or the other may offer a less
reliable indicator of the true workload of the system in specific circumstances.

3.6 Consequences of Workload


Increases in workload do not inherently have “bad” consequences. Indeed, in many environments it is the low
levels of workload that, when coupled with boredom, fatigue, or sleep loss, can have negative implications for human performance (Chapter 2; Huey & Wickens, 1993). Adding task requirements can sometimes improve performance in low-workload driving circumstances (Atchley & Chan, 2011). Given some flexibility,
operators usually work homeostatically to achieve an “optimal level” of workload by seeking tasks when
workload is low, and shedding them when workload is excessive (Hart & Wickens, 1990). This basis for
strategic task management was discussed in Chapter 10.
In revisiting these task management issues, we must highlight the importance of understanding the
strategy of task management that operators adopt when workload becomes excessive (i.e., crosses the red line
from the underload to the overload region of Figure 11.1 as measured by the techniques described above). At
a most general level, four types of adaptation are possible.
• People may allow performance of tasks to degrade, as a vehicle driver might allow lane position to
wander as the workload of dealing with an in-vehicle automation system increases.
• People may perform the tasks in a more efficient, less resource-consuming way. For example, in decision making, they may shift from optimal algorithms to satisficing heuristics.
• People may shed tasks altogether, in an “optimal” fashion, eliminating performance of those of lower
priority. For example, under high workload, air traffic controllers may cease to offer pilots weather
information unless requested, while turning their full attention to traffic separation.
• People may shed tasks in a non-optimal fashion, abandoning those that should be performed, as when safe driving is abandoned in favor of a cell phone conversation (see Chapter 10).
Unfortunately, beyond the material covered in Chapter 10 on resource allocation, very little is known about general principles that can account for when people adopt one strategy or the other. However, as discussed there, training can certainly help (Orasanu, 1997).

4. STRESS, PHYSIOLOGICAL AROUSAL, AND HUMAN PERFORMANCE
We have all experienced stress at some point in our lives. Stress is typically seen as an emotional state of
heightened arousal that can impair performance and, if severe enough, potentially disrupt behavior and have
negative consequences for health. Stress is not always negative, however, for it may also serve as an
energizing force that motivates people to perform well. Distinguishing the conditions under which stress
impairs cognition and performance, and the mechanisms by which it does so, is one of the many challenges of
stress research (Hancock & Desmond, 2001; Matthews et al., 2000).
The topic of stress has been studied from many different perspectives in the biological, psychological,
and social sciences, with each discipline tending to define stress in different ways and examine different
aspects of the phenomenon (Cohen et al., 1997). Within engineering psychology, the typical approach has
been to adopt a stress-strain model in which an environmental stressor, such as noise, is compared to a
condition without the stressor and effects on performance, physiology, and subjective feelings are assessed.
The simple stress-strain model is shown in Figure 11.3. Stressors may include environmental influences such
as noise, vibration, heat, dim lighting, and high acceleration, as well as such psychological factors as anxiety,

fatigue, frustration, and anger. As discussed in Chapter 8, they may also include time pressure (Dougherty &
Hunter, 2003; Svenson & Maule, 1993) as well as organizational factors such as severe penalties for poor
performance. An air traffic controller who has only a little time to “de-conflict” two aircraft that are on a
course to lose minimum separation and who could be relieved of duty if such a conflict occurs works under
both these sources of stress.
In general, stressors typically have three manifestations in people: (1) They produce a phenomenological
experience and often an emotional or “affective” one. For example, we are usually (but not always) able to
report a feeling of frustration or arousal as a consequence of a stressor. (2) Closely linked, a change in activity
in the peripheral nervous system is often observable. This might be a transient change—such as the increase in
heart rate in pilots during demanding flight maneuvers such as takeoff and landing (Hankins & Wilson, 1998)
or of air traffic controllers following an increase in the number of aircraft being handled (Wilson & Russell,
2003). The change might also be a more sustained effect as assessed for example by the change in the output
of catecholamines measured in the urine or saliva after periods of flying simulated combat maneuvers in an
F16 (Lieberman et al., 2004) or actual battlefield events (Bourne, 1971). The phenomenological and
physiological characteristics are often, but not invariantly, linked. (3) Stressors affect characteristics of
information processing, although they do not always degrade performance.

FIGURE 11.3 A representation of stress effects.

As Figure 11.3 shows, these effects may be characterized as having either external or internal influences
on human performance. External stressors influence the quality of information received by the receptors or the
precision of the motor or vocal response, and hence their influences and effects are more easily predictable
(Wickens et al., 2004). For example, vibration will reduce the quality of visual input for fine detail and the
precision of motor control, and noise will do the same for auditory input. Time stress may simply curtail the
amount of information that can be perceived in a way that will quite naturally degrade performance. Sleep loss
can have an external influence on sustained visual tasks by increasing the frequency of eye closures. Some
stressors, however—like noise or sleep loss—as well as others for which no external effect can be observed—
like anxiety, fear, or incentives—appear to influence the efficiency of information processing through internal
mechanisms that are not well understood. Because of our emphasis on engineering psychology
and human performance, rather than on the nonpsychological aspects of human factors, we will focus our
discussion on those stress influences on human performance that have internal influences, rather than those
such as lighting, cold, or vibration that have physically measurable external effects.

4.1 Arousal Theory


The effects of stress on human performance—whether considering internal or external sources—have often
been explained in the context of arousal theory (e.g., Duffy, 1957; Selye, 1976). Arousal refers to an
individual’s level of activity, whether reflected in general behavioral states such as active wakefulness or
sleep or in subjective experience such as alertness or drowsiness. Such changes are also accompanied by
systematic changes in brain activity (e.g., in the EEG) and in the peripheral nervous system, particularly the
sympathetic part of the autonomic system.
One of the easiest ways to measure the quantitative levels of many stressors is through physiological
measures of arousal, mainly mediated by the activity of the sympathetic nervous system. These include
measures such as heart rate, pupil diameter, or the output of catecholamines in the blood or urine. Brain
measures of arousal can also be relatively easily obtained through EEG recordings. For example, it has long
been known that increased EEG theta activity recorded from posterior electrode sites on the scalp is associated

with lowered arousal and with poor performance on prolonged, monotonous tasks (O’Hanlon & Beatty, 1997).
Also, fMRI studies have shown that activation in the brain stem and in widespread frontal-parietal networks in
the right hemisphere is associated with variations in arousal (Sturm & Wilmes, 2001).
Many of these psychophysiological and neuroergonomic measures reflect the increased arousal or effort
associated with the motivational variable of “trying harder” as tasks impose increasing difficulty or as goals
are imposed for better performance, as discussed in our treatment of resource theory previously in this chapter
(Hockey, 1997; Kahneman, 1973). While most stressors, such as anxiety and noise, are thought to increase the
level of arousal, others, like sleep loss or fatigue, will decrease arousal.

4.2 The Yerkes Dodson Law


Effects of arousal on human performance have often been interpreted within the Yerkes Dodson law (1908),
which postulates an inverted U-shaped function between stress and performance. This function was originally
proposed in the context of studies in the early 1900s on the learning performance of rats receiving electrical shocks
of different intensity levels. The law was subsequently generalized to human performance and other stressors,
often by secondary sources that did not refer to the original findings with rats reported by Yerkes and Dodson
(see discussion in Hancock and Ganey, 2003). The pattern of performance effects predicted by the law is
shown in Figure 11.4 and suggests that at the lower end of the arousal scale (low stress), increasing stress, by
increasing arousal and effort mobilization, will increase performance. Higher levels of stress, however, will
begin to produce attentional and memory difficulties that will cause performance to decrease.
In addition to the inverted U, a second characteristic of the Yerkes Dodson law is that the function is
shifted as task difficulty increases. The “knee” in the curve, or the optimum level of arousal, is at a lower level
for the more complex task (or the less skilled operator) than for the simpler task (or expert operator;
Kahneman, 1973). This prediction is consistent with the assumption that more complex tasks usually involve
greater demands for attentional selectivity (more possible cues to sample) as well as greater working memory
load, and hence will be more vulnerable to the deficiencies of these processes at higher arousal levels.
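Because the Yerkes Dodson law is qualitative, any equation for it is only illustrative. The short Python sketch below uses a bell-shaped curve (an assumption of convenience, not part of the law) to reproduce the two features just described: an inverted U over arousal, with the optimum shifted toward lower arousal for the more complex task. All parameter values are hypothetical.

import numpy as np

def performance(arousal, optimum, width=0.25):
    # Illustrative inverted-U: peak performance at `optimum`, falling off on either side.
    return np.exp(-((arousal - optimum) ** 2) / (2 * width ** 2))

arousal = np.linspace(0, 1, 11)                    # low to high arousal/stress
simple_task = performance(arousal, optimum=0.7)    # optimum at a higher arousal level
complex_task = performance(arousal, optimum=0.4)   # optimum shifted lower (harder task)

for a, s, c in zip(arousal, simple_task, complex_task):
    print(f"arousal={a:.1f}  simple={s:.2f}  complex={c:.2f}")
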
Despite its intuitive appeal, the Yerkes Dodson law and the inverted-U relation between stress and
performance have been subjected to several criticisms over the years. Hockey (1984) pointed out that the law
is difficult to falsify, since many results could be fitted, post-hoc, to the inverted U, particularly if an
independent measure of the x-axis (stress) is unavailable, as is usually the case in studies of stress.
Consequently, others have suggested that stress/strain and inverted-U models are too mechanistic and need to
be supplemented by considering the adaptive or coping techniques of individuals when exposed to stressors
and the information-processing components that are influenced by such strategies (Hancock & Warm, 1989;
Hockey, 1997; Matthews et al., 2000).

FIGURE 11.4 The Yerkes Dodson Law.

4.3 Transactional and Cognitive Appraisal Theories of Stress


Traditional arousal theories and the Yerkes Dodson Law assume a relatively passive view of humans,
responding to stressful stimuli with “strain” much in the same way as a physical object strains under an
external stressor like weight or heat. Unlike inanimate objects, people differ from one another in response to

the same environmental stressor. Even the response of the same individual to the same stressor can vary at
different times. Whereas traditional arousal theories of stress adopt a stimulus-driven approach, in contrast
transactional and cognitive appraisal theories view stress responses as the outcome of an interaction
between the person and the environment, and in particular the person’s appraisal of the environmental
challenge. One of the major transactional theories of stress is that of Lazarus and Folkman (1984). They
suggested that stress reactions reflect a person’s cognitive appraisal of the environmental event (as threatening,
challenging, or mild) and of the person’s competence in coping with the event. Since people differ in their
cognitive appraisals and coping abilities, stress is therefore not solely a property of the environment but
reflects the joint influence of person and environment.
While transactional theories fit well with the results of many studies of different stress sources such as
noise and anxiety, Matthews (2001) pointed out that they neglect effects on neural functioning that may not be
available to a person’s self-cognition or subjective awareness. Matthews (2001) also proposed that the
Lazarus and Folkman (1984) transactional model does not make explicit predictions about objective
performance changes as a stress outcome. Such outcomes are of major interest to engineering psychologists.
Accordingly, Matthews (2001) proposed an extension of the transactional approach to explain stress effects at
multiple levels: biological hardware (neural level), information processing (representation or computational
level), and overall human goals (adaptation level). Matthews (2001) also developed the Dundee Stress State
Questionnaire (DSSQ), which can be administered to participants before and after a stress-inducing
manipulation in order to assess how the stressor influences affective responses and cognitive appraisals. For
example, Matthews and Desmond (2001) showed that their multilevel transactional model and the use of the
DSSQ can provide a comprehensive assessment of stress in automobile drivers.

4.4 Stress Effects on Performance


Knowing how stress degrades human performance can help to support the design of more stress-tolerant
interfaces, or to develop stress reducing training techniques. But developing models that will accurately
predict stress effects is challenging for two reasons. First, ethical considerations make it difficult to carry out
controlled experiments that place human subjects under the same levels of stress that might be characteristic
of environments or conditions for which that prediction is desired: for example, combat or other life- or
health-threatening circumstances. Hence, relatively little empirical data exist, compared to the available
human performance data in many other domains of engineering psychology. Second, for reasons that are
described below, human performance response to stressors appears to be complex and often inconsistent,
modulated by a great number of cognitive (e.g., appraisal), skill, and personality variables, which makes
derivation of general predictions quite challenging. Before we describe the pattern of effects that have been
observed by different stressors, we consider some of the possible sources of data from which the pattern of
human stress response can be inferred.
First, it is possible to examine many situations like the USS Vincennes incident, or Three Mile Island, in
which errors were made, and stress was undoubtedly high (Orasanu & Fischer, 1997). One might draw
inferences that stress was a causal factor in the errors made in the events, yet the causal inferences will always
be ambiguous: did stress cause the error? Or was the stress a consequence of the error that might have
occurred just as well under unstressed conditions? How many similar stressful circumstances have people
confronted without making the errors of the incident in question? Indeed a careful analysis of the USS
Vincennes’ incident (see Chapter 8) carried out by Klein (1996) revealed relatively little evidence that stress
was responsible for the unfortunate decisions to fire upon the commercial aircraft.
Second, there have been a series of efforts to capitalize on stress imposed for other reasons, to gain
insights into performance changes. For example, Ursin, Baade, and Levine (1978) and Simonov et al. (1977)
described the performance of parachutists awaiting their first jump. In a classic study, Berkun (1964) had
army soldiers attempt to fill out an insurance form while being led to believe that the aircraft in which they
were flying was in danger of crashing or that artillery shells were exploding around them or that a demolition
had seriously injured one of their fellow soldiers. In all cases, the subjects believed that they or someone they
felt responsible for was at serious mortal risk, with a resulting degradation of cognitive performance.
Third, there are a number of studies that have examined the effects of stressors such as the threat of
shock, temperature, noise, sleep loss, or time pressure in more controlled laboratory environments. Such
studies confirm some of the patterns of effects that will be discussed below (Hockey, 1997). However, most
have the inevitable shortcoming that the laboratory conditions can never fully replicate the true experienced
stress of the danger in emergency conditions, a pattern whose prediction is so important for system design.

4.5 Stress Component Effects
One of the best ways of integrating the effects of stress on performance of tasks, observed from the different
classes of data discussed above, is to consider their influence on the different information processing
components or mechanisms that have been discussed in the previous chapters of this book (Hockey, 1997).
Thus, given the nature of a stressor effect on processing components like selective attention, working
memory, or response choice, and given the dependence of a task on particular components, a framework is
established for predicting task performance changes. For example, if stressor A affects working memory and
task B uses working memory, but task C does not, we can predict that stressor A will affect task B, but not
task C. In the following pages we will first describe these component effects, but will then discuss how a large
amount of variance in stress response is related to the adaptation strategies invoked by a particular human
operator. We then describe the way in which stress response can be mediated by other non-stress factors, and
finally consider some of the ways in which the negative effects of stress on performance have been
remediated.
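
The component-based prediction logic described above (a stressor is predicted to degrade a task only if the two share a processing component) can be made concrete with a minimal Python sketch; the stressor and task profiles below are hypothetical placeholders, not empirically derived task analyses.

# Hypothetical component profiles; real profiles would come from task and stressor analyses.
stressor_affects = {
    "noise":      {"working_memory", "selective_attention"},
    "sleep_loss": {"arousal", "selective_attention"},
}
task_uses = {
    "mental_arithmetic": {"working_memory", "response_choice"},
    "simple_reaction":   {"response_choice"},
}

def predicted_effects(stressor):
    # A task is predicted to suffer if it uses any component the stressor degrades.
    return [task for task, components in task_uses.items()
            if components & stressor_affects[stressor]]

print(predicted_effects("noise"))       # ['mental_arithmetic']
print(predicted_effects("sleep_loss"))  # [] -- no overlap with these two tasks
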

4.5.1 SELECTIVE ATTENTION: NARROWING Changes in human selective and focused attention, as discussed in
Chapter 3, mediate many stress effects. One of the most important and robust of these appears to be an
increased selectivity or attentional narrowing that results from a wide variety of different stressors
(Kahneman, 1973). For example, Weltman et al. (1971) found that participants led to believe that they were
experiencing the conditions of a 60-foot dive in a pressure chamber performed as well as a group not told this
on a central detection task but were impaired on a peripheral detection task. Similar perceptual-narrowing
effects of loud noise were reported by Hockey (1970).
The stress effect on tunneling is not simply defined by a reduction of the spatial area of the attention
spotlight, so that peripheral stimuli are automatically filtered. Rather the filtering effect seems to be defined
by subjective importance, or priority, as when skimming text under time stress (Duggan & Payne, 2009).
Performance of those tasks of greatest subjective importance remains unaffected—or perhaps enhanced
(through arousal)—in their processing, whereas those of lower priority are filtered (Broadbent, 1971). In one
sense this kind of tunneling is adaptive, and even optimal, but it will provide undesirable effects if the
subjective importance of the attended channel proves to be unwarranted. Such was the case, for example, in
the Three Mile Island nuclear power plant incident. Operators, under the high stress following the initial
failure, appeared to fixate their attention on the one indicator supporting their belief that the water level was
too high, thereby filtering attention from more reliable indicators that supported the opposite hypothesis.
Correspondingly, stress-induced tunneling should have less of an effect if the task requires the processing of
few information channels than if it requires the processing of many (Edland, 1989).

4.5.2 SELECTIVE ATTENTION: DISTRACTION Many stressors simply impose a distraction and thus divert selective
attention away from task-relevant processing. Loud or intermittent noises or even the conversation at a nearby
table at the library will serve as a source of such distraction (Baldwin, 2012). It also appears to be the case that
the documented influence of life stress events (like family or financial problems) at the workplace (Alkov et
al., 1982; Wine, 1971) relates to the distraction or diversion of attention to thinking about these issues, at the
expense of processing job related information.

4.5.3 WORKING MEMORY LOSS Davies and Parasuraman (1982) and Wachtel (1968) have directly identified the
negative effects of anxiety stress on working memory. Many of the difficulties in cognitive aspects of problem
solving that Berkun (1964) observed when his army subjects were placed under the stress of perceived danger
can also be attributed to reduced working memory capacity. Noise, as well as danger and anxiety, will also
degrade working memory (Hockey, 1997). The stress effects of noise on working memory can be seen to
result from either of two causes. First, it is clear that noise will disrupt the “inner speech” necessary to carry
out rehearsal of verbal information in the phonetic loop, as discussed in Chapter 7 (Poulton, 1976) because
rehearsal is a resource-limited process.
Second, both noise and non-noise stressors can distract or divert attention away from rehearsal of
material that is either phonetic or spatial, in a way that will allow the representation of that information to
degrade. This second effect can account for the influence of non-noise stressors such as anxiety on working
memory (Berkun, 1964), as well as the effects of either noise or non-noise stressors on spatial working
memory (Stokes & Raby, 1989). As an example, in a simulation study of pilot decision making, Wickens,
Stokes, et al. (1993) observed that the negative effects of noise were quite pronounced on decision problems
that relied on spatial visualization for their successful resolution. Examining aviation accident reports that

might be attributed to stress effects, Orasanu (1997) noted the greater frequency of stress effects on situation
awareness, a process which, as we discussed in Chapter 7, is closely tied to working memory. Given the
important role of working memory, as well as broad selective attention, in encoding new information into
long-term memory, it would appear that stress would not lead to efficient learning (Keinan & Friedland,
1984). This reasoning is certainly one of the important factors behind the advocacy of simulators as useful
training devices for dangerous activities such as flying or deep-sea diving (Flexman & Stark, 1987; O’Hare &
Roscoe, 1990; see also Chapter 7). That is, simulators can support the complexity of the real task, without
imposing its stressful life threatening dangers.

4.5.4 PERSEVERATION There is evidence that high levels of stress will cause people to “perseverate” or continue
with a given action or plan of action that they have used in the past (Zakay, 1993). For example, in problem
solving (Luchins, 1942) under stress people will be more likely to continue trying the same unsuccessful
solution (the very failure of which might be a cause of increasing stress). Cowen (1952) found that people
perseverated longer with an inappropriate problem solving solution under the threat of shock. The concept of
perseveration with previous action patterns is also consistent with the view that, under stress, familiar
behavior is little hampered, but more novel behavior becomes disrupted, an effect that has profound
implications for the design of procedures to be used under the stressful conditions of emergency. The greater
disruptive effect of stress on novel or creative behavior is consistent with an effect reported by
Shanteau and Dino (1993), who observed a selective decrease in performance on tests of creativity, caused by
the combined stress of heat, crowding, and distraction.
It is apparent that the combined effects of stress on attentional narrowing and perseveration can
contribute to a pattern of convergent thinking or “cognitive narrowing” that can be dangerous in crisis
decision making (Woods, Johannesen, et al., 1994; see Chapter 8): stress will initially narrow the set of cues
processed to those that are perceived to be most important; as these cues are viewed to support one
hypothesis, the decision maker will perseverate to consider only that hypothesis, and will process the
(restricted) range of cues consistent with that set. That is, stress will enhance the confirmation bias discussed
in Chapter 8, causing the decision maker to be even less likely to consider the information that might support
an alternative hypothesis. This pattern can be used to describe the behavior of the operators at Three Mile
Island, or the dangerous pattern of behavior in which unqualified pilots may continue to fly into bad weather
(Jensen, 1982; Wiegmann & O’Hare, 2003).

4.5.5 STRATEGIC CONTROL Perhaps the most important processing changes that occur under stress can be
characterized by the general label of strategic control: that is, the characterization of a set of strategies that
the human will consciously adapt to cope with the perceived stress effects. These strategies are incorporated
in a feedback control model, presented in Figure 11.5, which is based upon similar concepts proposed by
others (Lazarus & Folkman, 1984; Hockey, 1997; Matthews, 2001). The model has two key components:
appraisal and strategic choice. One important concept of the model is that the operator does not respond to the
stressor per se, but to the perceived or understood level of stress. As described earlier in this chapter, Lazarus
and Folkman have labeled this the process of cognitive appraisal. Thus, two people could be in identical
circumstances (i.e., under the same physical stress or dangerous conditions), but have very different
appreciations of how much danger they were in, or the extent to which they had resources available to cope
with the stressor. Stress would increase as the perceived disparity between necessary and available resources
increases.
Having then appraised the level of stress, the human has the option of choosing a variety of different
information processing strategies to cope with the stressor (Hockey, 1997; Maule & Hockey, 1993). It is in
the selection of the appropriate or inappropriate strategies that much of the variability of stress response
between people is found. Adapting the framework proposed by Hockey (1997; Maule & Hockey, 1993), four
major categories of adaptive responses may be proposed, each with somewhat different implications for
performance.

FIGURE 11.5 An adaptive closed-loop model of stress, based on concepts proposed by Hockey. The ability to cope with stressors is appraised
at the top. A choice of one of four categories of strategies is made as a consequence of this appraisal. These choices will affect performance to
varying degrees (and hence lead to a reappraisal). The choice to mobilize effort for long durations may have physiological costs. The choice to
accelerate will have a selected effect of lowering accuracy. Source: Adapted from G. R. J. Hockey, “Compensatory Control in the Regulation of
Human Performance Under Stress and High Workload,” Biological Psychology, 45, 1997, pp. 73–93; A. J. Maule and G. R. J. Hockey, “State,
Stress, and Time Pressure,” in Time Pressure and Stress in Human Judgment and Decision Making, ed. O. Svenson and A. J. Maule (New York:
Plenum, 1993), pp. 83–102.

4.5.5.1 Recruitment of more resources Here the response is simply to “try harder,” or mobilize more
resources in the face of the stressor. If the source of stress is time pressure (Svenson & Maule, 1993), then this
strategy may be labeled as “acceleration” (Stiensmeier-Pelster & Schürmann, 1993): doing more in less time.
Such a strategy can be adaptive, but it has risks. In the first place, the sustained mobilization of increased
effort may impose long term costs of fatigue and possible health risks (Hockey, 1997) which may leave the
human vulnerable after the stressor is removed (Huey & Wickens, 1993). Furthermore, in some cases
acceleration may eliminate redundancies. As discussed in Chapter 7, removing redundancies in
communications systems can invite confusions and errors.
The strategy of acceleration is one that invites a shift in the speed-accuracy tradeoff, toward faster but
more error prone performance, an effect that has been observed under a variety of stressors (Hockey, 1997).
For example, Villoldo and Tarno (1984) report that bomb disposal experts worked more rapidly, but made
more procedural errors under stress. Keinan and Friedland (1987) found that subjects prematurely terminated
problem-solving activities under the stress of a potential shock. The tendency of the stress of emergency to
cause a shift in performance from accurate to fast (but error prone) responding has been cited as a concern in
operator response to complex failure in nuclear power control rooms. The hasty action of the control room
operators in response to the Three Mile Island incident was to shut down an automated device that had in fact
been properly doing its job. To combat this tendency for a non-optimal speed-accuracy shift in an emergency,
nuclear power plant regulations in some countries explicitly require operators to perform no physical actions
for a fixed time following an alarm while they gain an accurate mental picture of the nature of the
malfunction.
4.5.5.2 Remove the stressor The human may sometimes adapt successfully by simply trying to
eliminate the source of stress. At times this is easy, such as turning off (or removing oneself from) a stressful
source of noise, postponing performance of a task till a time in which one is no longer sleep deprived, or
postponing a deadline to remove time pressure. At other times, removal may be more difficult, such as putting
a source of anxiety out of mind, and may depend upon the availability of trained stress coping skills, to be
described below.
4.5.5.3 Change the goals of the task Stress researchers have revealed a variety of ways in which people
adaptively display qualitatively different performance strategies under higher stress conditions (Driskell et
al., 1994; Ford et al., 1989; Johnson et al., 1993; Klein, 1996). What makes these strategies adaptive is that
they are chosen to be ones that are more immune to the known degrading effects of stress on information

processing, as discussed above. Hence a simpler, less effortful strategy is often chosen. Many of these changes
have been observed in decision making tasks under time pressure (Flin et al., 1997; Svenson & Maule, 1993),
as discussed in Chapter 8 where simpler heuristics may begin to dominate the more working memory
intensive strategies. The skilled operator will often have available a repertoire of such strategies, to be able to
choose the one that is most immune to stress effects. It is for this reason, in part, that stressors sometimes
fail to produce performance decrements: humans adapt by choosing a simpler and more efficient strategy.
Indeed sometimes stressors even produce performance improvements (Driskell et al., 1994). For example,
Lusk (1993) studied professional weather forecasters and found that, under the time pressure imposed by
busier forecasts (more meteorological information to be processed per unit time), forecasting performance
actually improved.
However, it is also the case that strategy choice can degrade performance if the task is not well served by
the simpler strategy. For example, a robust finding that we discussed above is that people choose to generate
fewer hypotheses (Dougherty & Hunter, 2003) and choose fewer cues in decision tasks carried out under time
pressure. If a decision task contains few cues, this strategy will produce no penalty, but for multiple cue tasks
it will (Edland, 1989). Furthermore, the effects of processing fewer cues will depend upon the extent to which
those cues that are filtered out are less important (little cost to performance) or simply less salient. In this case,
there will be a cost if the less salient cues that were filtered are also more important. Wallsten (1993) notes
that both importance and salience are used as cue filtering attributes by people under time pressure.
4.5.5.4 Do nothing The final strategy identified by Maule and Hockey (1993) is for people to simply do
nothing to adjust their processing under stress, allowing the stress effects to influence performance in a more
predictable way.
In considering these four categories of strategic response shown in Figure 11.5, it should be
apparent that different people can respond quite differently to the same stressors, in terms of when (or
whether) each of the three adjustment strategies (1–3) is invoked. Further differences will result if
strategy 1 is chosen, depending on the extent to which more effort will be mobilized (a motivational issue)
and, if strategy 3 is chosen, depending on the extent to which the selected way of performing the task is
optimal or not. It is, in part, these large degrees of choice, that make accurate stress predictions hard to attain.

4.6 Stress Remediation


A variety of techniques may be adopted in the effort to minimize the degrading effects of stress on human
performance. Roughly these may be categorized as environmental solutions, design solutions, which address
the task, and personal solutions, which address the operator, either through task training or through training of
stress management strategies.

4.6.1 ENVIRONMENTAL SOLUTIONS Clearly, where possible, stressors should be removed from the environment, a
solution that is more feasible in the case of external stressors, such as noise or temperature, than for internal
stressors such as those related to anxiety.

4.6.2 DESIGN SOLUTIONS Design solutions may focus on the human factors of displays. If perceptual narrowing
among information sources or unsystematic scanning does occur, then reducing the amount of unnecessary
information (visual clutter) and increasing its organization will somewhat buffer the degrading effects of
stress. Schwartz and Howell (1985) found that the degrading effects of time pressure on a simulated decision
task were reduced by using a graphic rather than a digital display. Similarly, it is clear that any design efforts
that minimize the need for operators to maintain or transform information in working memory should be
effective. Thus high display compatibility, either with responses or with the mental model of the task, is
important. The manner in which this is achieved through the design of ecological interfaces was briefly
discussed in Chapters 3 and 4, and such displays, that endeavor to replace working memory demands with
perceptual ones, are most effective for fault management (Burns et al., 2008), a task which is, almost by
definition, stressful.
Particular attention should be given to the design of support for emergency procedures since these will
probably be less familiar than routine procedures (to the extent that emergencies happen rarely) and will be
likely to be needed under the high stress conditions that are, by definition, the properties of an emergency.
Hence, these procedures must be clear and simply phrased (see Chapter 6) and should be as consistent as
possible with routine operations. Ideally procedural instructions of what to do should be redundantly coded
with speech as well as with print or pictures, should avoid arbitrary symbolic coding (abbreviations or tones,

other than general alerting alarms), and should be phrased in direct statements of what action to take rather
than as statements of what not to do (avoid negatives). As discussed in Chapter 6, commanded actions or
procedures should augment any information that only describes the current state of the system and should not
be confusable with that information. This is the policy inherent in voice alerts for aircraft in emergencies, in
which commands are directed to the pilot of what to do to avoid collision (“Climb, climb, climb”).

4.6.3 TRAINING We have noted before the beneficial effects of training, in particular, extensive training of key
emergency procedures so that they become the dominant and easily retrieved habits from long-term memory
when stress imposes that bias. In fact, a case can possibly be made that training for emergency procedures
should be given greater priority than training for routine operations, particularly when emergency procedures
(or those to be followed in high-stress situations) are in some way inconsistent with normal operations. As an
example of this inconsistency, the procedure to be followed in an automobile when losing control on ice (an
emergency) is to turn in the direction toward the skid, precisely the opposite of our conventional turning
habits in normal driving. Clearly, where possible, systems should be designed so that procedures followed
under emergencies are as consistent as possible with those followed under normal operations.
Programs of stress inoculation training or stress exposure training have been designed to introduce
humans to the consequences of stress on their performance (Johnston & Cannon-Bowers, 1996; Keinan &
Friedland, 1996; Meichenbaum, 1985, 1993). Such programs provide a mixture of explanation of anticipated
stress effects, teaching of stress coping strategies, and actual experience of stressors on performance, an
experience that is gradually introduced and adaptively increased (see Chapter 7). A review of studies which
have evaluated such techniques, applied to such stressful circumstances as test taking, rock descending, public
speaking, or volleyball performance, reveals that many of them have been successful (Johnston & Cannon-
Bowers, 1996). However, positive benefits to trainee attitude (greater confidence) seem to be more
consistently observed across these studies than benefits to actual performance.
In conclusion, it is apparent that prediction of the effects of stressors on performance remains one of the
greatest challenges for human performance theory, a consequence of the multidimensional effects of stress,
and the multiple compensatory or coping strategies available to people. These must be revealed by looking
beyond the final output of task performance to consider the behavior and cognitive processes involved in that
performance, as well as physiological reflections of coping strategies. However, the very availability of those
strategies, which can make precise performance prediction difficult for engineering psychology, serves as a
real benefit for human factors by making available several options for effective remediation, through training
and design.

5. INDIVIDUAL DIFFERENCES
The topics we have described thus far have all involved studies examining behavioral and/or physiological
measures in groups of participants, with the reported findings reflecting the mean of the group with respect to
workload, performance, or stress. Our description of workload and, to a lesser extent, stress, implicitly
assumed that environmental factors that influence these phenomena do so in more or less the same way in all
people. But will all persons in a group under study show the same effects of stress and workload? It is well
recognized in studies of large groups that some individuals within the group may show performance or brain
function changes that are not reflected in the mean profile. However, such deviations from the mean are
typically seen as “noise” because the goal of much research in engineering psychology is to derive general
principles of human performance that are applicable widely, so that such individual differences are not the
focus of study (Szalma, 2009). We have described many examples of such population-wide principles in
this book, such as the limited capacity of working memory (Chapter 7) and Fitts Law (Chapter 9).
Despite their ubiquitous occurrence, individual differences have generally not been considered in detail
in human factors in system design. The implicit assumption has been that good interface design and training
can overcome any difficulty that any particular individual worker may face in operating a system.
However, a consideration of individual differences has implications not only for personnel selection and
training, but also for design. Consider mental workload, which we discussed extensively earlier in this chapter
and the related issue of multitasking, which was examined in Chapter 10. Working memory is thought to be a
major contributor to mental workload. Yet individuals are known to differ widely in the capacity of working
memory (Engle, 2002). Therefore, a design that is predicted or measured to be within the workload limit of
the “average” worker may not be handled well by an individual with a low working memory capacity.
Similarly, given the proliferation of opportunities for multitasking in modern society (cell phones, iPhones,

GPS devices, etc.), it is important to ask whether some individuals are better able to handle such multitasking
demands than others. In Chapter 10 we examined whether individual differences in executive control can
inform the development of training methods for the development of expertise in multitasking. We continue
that discussion in this section, focusing more broadly on cognitive functions that contribute not only to
multitasking but to other aspects of human performance at work as well.
We consider first ability differences between people, possibly innate, which may explain why some
people are better multitaskers than others. Because molecular genetics—a new methodological tool that has
been used in neuroergonomics (Parasuraman, 2009)—now allows for an examination of the specific genes
that control inheritance of cognitive ability, we discuss genetic contributions to individual variation in human
performance. Finally, we briefly discuss methods to enhance performance in individuals who have reduced
cognitive functioning because of physical disabilities, focusing on “neural prostheses” to help such
individuals.

5.1 Ability Differences in Multitasking


In order to establish whether multitasking ability differences exist, it is necessary to adopt a correlational
approach. Large numbers of people are assessed on a variety of component tasks, in isolation, and in paired
(time shared) combinations. Single and dual task performance measures are correlated with each other, and
the extent to which dual task decrements are correlated with each other but are not correlated with
performance of their component tasks is identified. This correlation, the feature that all dual task combinations
but no single task components have in common, may reflect a time-sharing ability (Ackerman et al., 1984;
Fogarty & Stankov, 1983; Jennings & Chiles, 1977; Stankov, 1982). It turns out that the data collected in such
(usually massive) experiments do in fact support such an interpretation (e.g., Fogarty & Stankov 1982,
Wickens et al., 1981; see Wickens & McCarley, 2008). The next step is then to determine what aspects of
cognition may underlie this ability, with the possible goal that if this aspect can be readily assessed, it may
provide a selection tool for those special skills that require a high level of time-sharing proficiency, like flying
a high performance aircraft (Gopher et al., 1994). Here the data appear to reveal three possibilities.
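Before turning to those possibilities, the correlational logic just outlined can be illustrated with a small numerical sketch in Python (synthetic data generated with NumPy; none of the values come from the studies cited above). The diagnostic pattern is that dual-task decrements correlate with one another but not with single-task scores.

import numpy as np

rng = np.random.default_rng(0)
n = 200                                   # hypothetical sample of participants
single_a = rng.normal(size=n)             # single-task performance, task A
single_b = rng.normal(size=n)             # single-task performance, task B
timeshare = rng.normal(size=n)            # latent time-sharing ability (unobserved)

# Dual-task decrements driven partly by the shared latent ability plus noise.
decrement_ab = 0.7 * timeshare + 0.5 * rng.normal(size=n)
decrement_ac = 0.7 * timeshare + 0.5 * rng.normal(size=n)

print(np.corrcoef(decrement_ab, decrement_ac)[0, 1])  # substantial correlation
print(np.corrcoef(decrement_ab, single_a)[0, 1])      # near zero
print(np.corrcoef(decrement_ab, single_b)[0, 1])      # near zero
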
The first of these is reflected in a relatively complex interrelationship between executive control,
working memory, and intelligence (Wickens & McCarley, 2008). The executive control system, well
identified in the brain (Banich, 2009) and known to be highly heritable (Friedman et al., 2008), plays a major
role in attention switching, task priority management, and attentional focus. Also, executive control is closely
related to working memory capacity (see Chapter 7), as such a system must coordinate rehearsal of items with
operations performed on those items (Turner & Engle, 1989). There are large and stable individual differences
in working memory (Engle 2002; see below) that are argued to predict performance on attention demanding
tasks. Working memory is also a function that now seems well associated with specific genetic components in
the brain (Parasuraman, 2009), as we discuss in more detail below. Working memory is also closely related to
fluid intelligence (Cattell, 1971, Engle et al., 1999; see 5.2), which is a major component of general
intelligence or g. And as a final integrating link, g itself has been found to be the best predictor of individual
differences in performance in complex multitask domains (Borman et al., 1997; Caretta & Ree, 2003).
The second possibility is an ability related to the speed of attention switching (Hunt & Lansman 1981;
Hunt et al., 1989; Kahneman et al., 1973), which appears to have some degree of stability and is independent
of (uncorrelated with) single task ability. The extent to which this is modality specific rather than general
however remains unclear (Braune & Wickens, 1986).
A third possibility is that people differ in their motivation to invest effort into a task; that is, to
temporarily shift the red line of Figure 11.1 to the right, or to move resource investment toward the right side
of the performance-resource function, or even to temporarily expand a pool of “malleable attentional
resources,” as discussed in Chapter 10 (Matthews et al., 2010; Young & Stanton, 2002). Matthews and Davies
(2001) refer to these individuals as “high-energy people.”

5.2 Differences in Working Memory


Until recently psychometrics has provided the main tool for the study of how individual differences in various
human abilities affect performance on different tasks. Selection and training methods used in human factors
have also largely depended on the psychometric approach. Typically, tests of general intelligence, such as IQ,
and its principal factor g, as well as sub-components, such as fluid and crystallized intelligence (Cattell, 1971)
are correlated with measures of human performance. Fluid intelligence refers to an individual’s basic ability to
perform cognitive functions (e.g., attention, working memory) that require little new learning and are

relatively free of cultural influences. Crystallized intelligence, on the other hand, reflects abilities linked to
acquired knowledge and is highly dependent on learning, education, and other cultural factors. Fluid
intelligence may decline with age, but crystallized intelligence often increases (Cattell, 1971).
In our previous discussion of mental workload, we described how working memory plays a key role in
the effort that individuals experience and report in performing different tasks. For the same
objective level of task demand, some individuals will report relatively low levels of workload, while others
will exert greater effort and report higher levels of subjective workload. Such differences may reflect
individual differences in working memory capacity (Colom et al., 2003).
As discussed in Chapter 7, several different methods for the assessment of working memory capacity
have been put forward, including different versions of “span” tasks (Engle, 2002). In such tasks participants
have to perform some mental operation on a set of stimuli while simultaneously being presented with other
stimuli that must be retained in memory and recalled in a subsequent memory test. For example, in the
reading span task, participants have to make judgments about a sentence that they read and then recall words
from the sentence in the correct order (Daneman & Carpenter, 1980). In the operation span task, participants
have to make a yes/no judgment about an arithmetic operation that is followed by a letter or word, and recall
these at the end of a series of such operations. The reading span test of verbal working memory has been
correlated with individual differences in reading and listening comprehension (Daneman & Carpenter, 1980).
Individual differences in the span measure of working memory capacity have been related to the ability to
control the focus of attention in visual search tasks (Bleckley et al., 2003) and in the proficiency of complex
aviation decision making (Causse, Dehaise, & Pastor, 2011). We described in Chapter 7 how Baddeley’s
(2003) model of working memory distinguishes between a verbal-phonological loop and a visuospatial
“scratchpad” for temporary storage and manipulation of information, with these sub-systems of working
memory being coordinated by a central executive. Consistent with these views, individual differences in
phonological and visuospatial working memory capacity have been found to be predictive of variation in
performance in verbal (Caplan & Waters, 1999) and spatial (Miyake et al., 2001) tasks, respectively.

5.3 Molecular Genetics and Individual Differences in Cognition


The rapidly expanding new field of molecular genetics complements both the abilities approach and
psychometrics as a tool for examining sources of individual differences in human performance. We briefly
consider this approach as part of our focus in this chapter on neuroergonomics. The advantage of this
approach is that with the completion of the Human Genome project in the early 2000s and with expanding
information on genetic variation and gene expression, results from cognitive neuroscience can be used to
examine molecular pathways associated with individual differences in cognition. This in turn may lead to
improved theories of inter-individual variation in cognition and have important implications for understanding
and improving human performance at work (Parasuraman, 2009).
Genetics is relevant to an examination of individual differences because major aspects of human ability,
such as general intelligence, working memory capacity, and executive function have been found to be highly
heritable, based on twin studies (Ando et al., 2001; Friedman et al., 2008). Twin studies cannot identify the
specific genes that contribute to that heritability, but the new molecular genetic methods do allow for such
identification of contributory genes. Normal variations between people in the specific DNA sequences that
make up a gene can affect the production of proteins encoded by the gene. If the proteins influence
neurotransmitter function in the brain, it is possible that they influence the efficiency of neural networks
associated with a cognitive function. If so, then variations in gene expression can be linked to individual
differences in cognitive performance (Parasuraman & Greenwood, 2004).
Accordingly, molecular genetic studies of individual differences in cognition have pursued the
following line of reasoning: gene—gene variants—protein expression—neurotransmitter modulation—brain
network modulation—cognitive performance. Many studies have focused on “candidate genes”—those that
are likely on a theoretical basis to be linked to cognition, whereas others have used an atheoretical “shot-gun”
approach by examining the entire human genome and its variants in relation to variation in cognitive
performance (Butcher et al., 2008). Some (but not all) genes come in different forms (alleles), with one of the
two alleles in a paired DNA strand being inherited from each parent. A given person may have none, one, or
two alleles in a specified location within the gene. One then can examine the functional consequence of such
allelic variation. Studies using this approach have shown that individual differences in cognitive functioning
can be linked to variations in specific genes (Green et al., 2008; Greenwood, et al., 2005; Parasuraman &
Greenwood, 2004; Posner et al., 2007).

For example, Parasuraman et al. (2005) genotyped a sample of about 100 healthy adults for the
CHRNA4 gene, which codes for the neurotransmitter acetylcholine, and the DBH gene, which controls the
relative availability of the neurotransmitters dopamine and norepinephrine. DNA collected from cheek
samples was tested for the cytosine (C) allele in a specified region of the CHRNA4 gene and the guanine (G)
allele for the DBH gene. Performance on a spatial attention task modeled after Posner (1980; see Chapter 3)
was associated with the CHRNA4 gene but was unrelated to the DBH gene. Conversely, performance on a
spatial working memory task varied with the DBH gene but was not associated with the CHRNA4 gene. Thus
working memory and attention shifting appear to be two distinct and uncoupled abilities (thus parsing the
complex relationship between these two, in executive control and time sharing). Each of these genes had fairly
large effects on these cognitive functions. Cohen (2008) described a statistic known as effect size that can be
used to determine how large the influence of any factor is on human performance. Typically effect sizes of 0.5
are thought to be moderate in size. The effect sizes for the CHRNA4 and DBH genes were in the range between
0.4 and 0.7.
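A common effect-size statistic of this kind is Cohen’s d, the difference between two group means divided by their pooled standard deviation; values near 0.5 are conventionally called moderate. The Python sketch below shows the computation with invented scores for two hypothetical allele groups, not the data from the studies just described.

import numpy as np

def cohens_d(group1, group2):
    # Cohen's d: mean difference divided by the pooled standard deviation.
    g1, g2 = np.asarray(group1, float), np.asarray(group2, float)
    n1, n2 = len(g1), len(g2)
    pooled_sd = np.sqrt(((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1))
                        / (n1 + n2 - 2))
    return (g1.mean() - g2.mean()) / pooled_sd

# Invented scores for two allele groups (illustration only).
carriers = [520, 540, 510, 530, 525]
noncarriers = [495, 505, 500, 515, 498]
print(round(cohens_d(carriers, noncarriers), 2))
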
Recently, molecular genetic studies have gone beyond examining associations with simple laboratory
tasks—selective attention, working memory, and vigilance—to more complex tasks representative of tasks in
the workplace. Parasuraman and colleagues (2012) examined individual differences in complex decision
making in a simulated battlefield command and control task, in which participants were required to identify
the most dangerous enemy target in the terrain view and to select a corresponding friendly unit to engage in
combat, assisted by decision-aid automation that was 80 percent reliable. Both decision accuracy and speed
when using imperfect automation showed considerable individual differences that were associated with
variants of the DBH gene.
The new field of the molecular genetics of cognition is still in its infancy, and hence its future impact on
neuroergonomics is still uncertain. The research to date has established a theoretical framework for examining
genetic associations for basic cognitive functions. Preliminary findings indicate that genetic associations may
also be found for more complex cognitive functions such as decision-making (Parasuraman, 2009;
Parasuraman & Jiang, 2012). As more such studies are conducted, greater potential for practical applications
will emerge, particularly if gene-environment interactions are examined (e.g., studies of training in sub-groups
of individuals defined by genotype). The individuation of user interface design might also be informed by a
better understanding of the genetic basis of cognitive and affective variation between people (Oron-Gilad et
al., 2005).

5.4 Brain Computer Interfaces for Healthy and Disabled Individuals


Thus far we have considered what might be termed the “normal range” of individual differences; that is,
people who are either relatively poor or good in such domains as multitasking or decision making. At one
extreme of this range, however, are individuals with physical disabilities that impact their performance on
everyday and work tasks (Vanderheiden, 2006). Can neuroergonomics help such individuals? The emerging
area of brain computer interfaces (BCIs, Nam, 2012) suggests that it can.
A BCI is a system that allows those with physical disabilities to interact more easily with devices or other
people. Neural activity is sensed while the user thinks, imagines, or performs some other cognitive operation.
For users who are incapable of speaking or moving their limbs—as in patients with “locked-in syndrome”
(amyotrophic lateral sclerosis)—such a device can allow for communication with the outside world and a
degree of social interaction with other people where neither existed previously. With a BCI a user can interact
with the environment without engaging in any muscular activity (e.g., without the need for hand or eye
movements). Instead, the user is trained to engage in a specific type of mental activity that is associated with a
unique brain “signature.” The resulting brain potentials (if EEG is used) or hemodynamic activity (if NIRS is
used) are processed and classified so as to provide a control signal in real time for an external device.
BCI research has increased dramatically in recent years. Different types of brain signals are used to
control external devices without the need for motor output. The basic idea of BCIs follows from the work on
“biocybernetics” in the 1980s pioneered by Donchin (1980) but has progressed beyond the earlier
achievements with further technical developments. The biocybernetic concept was initially proposed as a
means of providing healthy individuals additional communication channels (e.g., in addition to hand
movements or speech) with which to interact with devices. BCI research was also further stimulated by the
Augmented Cognition (Aug Cog) program (Schmorrow et al., 2006; St. John et al., 2004), which sought to
use neurophysiological measures such as EEG to trigger automated support systems that could enhance the
cognitive performance of healthy individuals. The Aug Cog concept overlaps with the use of adaptive
automation based on neuroergonomic measures, which is considered in more detail in Chapter 12.

In contrast to the previous work on biocybernetics and Aug Cog, which focused on enhancing
performance in healthy individuals, most BCI research and development has focused on providing interactive
help for disabled individuals. However, more recently researchers have also proposed “passive” BCIs,
typically based on automatic monitoring and decoding of EEG signals, as interfaces that could be used both
by healthy persons and individuals with physical handicaps (Zander & Kothe, 2011).
Non-invasive BCIs have used a variety of brain signals derived from scalp EEG recordings. These
include quantified EEG from different frequency bands (Pfurtscheller & Neuper, 2001) and ERPs such as
P300 (Donchin et al., 2000). BCIs based on these signals have been used to operate voice synthesizers, move
robotic arms, spell out letters on a computer display, and control other physical devices. Currently, non-
invasive BCIs have relatively slow throughput rates, but this is likely to improve in the future. (For reviews,
see Birbaumer, 2006, and Mussa-Ivaldi et al., 2007). For a recent survey of BCIs based on both intention-
based and spontaneous brain signals, see Coffey et al. (2010).
One interesting BCI application that has been explored in healthy people is based on the “error-related
negativity” (ERN) of the ERP. This is an ERP component that is elicited when people make errors in a perceptual or
cognitive task (Fedota & Parasuraman, 2010). Parra et al. (2003a) showed that the ERN could be identified on
a single-trial basis, without the need for averaging over several trials as is common for many ERP
components. Using this method, Parra et al. (2003b) then showed that in a task such as the Eriksen flanker
task, in which a decision about a central stimulus must be made while ignoring flanking stimuli (Eriksen &
Eriksen, 1974), the ERN could be used to drive a BCI by recording ERNs to individual errors in performing
the task. For online correction of errors, the BCI used the previous 100 correct and error trials to calculate a
threshold ERN signal value that minimized the chance of misclassification. When the threshold ERN signal
strength was exceeded, the BCI interpreted the trial as an error and corrected the response. Such online
corrections led to an average error reduction of 21 percent. Ferrez and del Millan (2005) also reported a BCI that could potentially be used for human-robot interaction. They showed that ERN signals elicited in
response to errors made by the robot interface could be detected on single trials and used to improve the
efficiency of interaction with the robot.
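The logic of this kind of single-trial threshold classifier can be sketched in a few lines of code. The sketch below (in Python) is purely illustrative: the function names, the use of a simple amplitude cutoff, and the exhaustive search over candidate cutoffs are assumptions made for exposition, not the actual algorithm used by Parra et al. (2003b).

```python
import numpy as np

def choose_threshold(ern_correct, ern_error):
    """Pick the ERN-amplitude cutoff that minimizes misclassification over a
    recent history of labeled correct and error trials (e.g., the previous
    trials used for calibration)."""
    ern_correct = np.asarray(ern_correct)
    ern_error = np.asarray(ern_error)
    candidates = np.sort(np.concatenate([ern_correct, ern_error]))
    best_cutoff, best_error_rate = candidates[0], np.inf
    for cutoff in candidates:
        # Trials whose ERN amplitude exceeds the cutoff are classified as errors.
        false_alarms = np.sum(ern_correct > cutoff)   # correct trials flagged as errors
        misses = np.sum(ern_error <= cutoff)          # error trials left uncorrected
        error_rate = (false_alarms + misses) / (len(ern_correct) + len(ern_error))
        if error_rate < best_error_rate:
            best_cutoff, best_error_rate = cutoff, error_rate
    return best_cutoff

def classify_trial(ern_amplitude, cutoff):
    """Treat the trial as an error (and trigger response correction) when the
    single-trial ERN signal strength exceeds the calibrated cutoff."""
    return ern_amplitude > cutoff
```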
In addition to non-invasive BCIs, invasive BCIs have also been developed. These typically involve
recording of field potentials and multi-unit neuronal activity from implanted electrodes; this technique has
been reported to allow monkeys to control robotic arms (Nicolelis, 2003). Such invasive
recording techniques have superior signal-to-noise ratio but are obviously limited in use to patients with no
motor functions in whom electrode implantation is clinically justified. Felton et al. (2005) developed a BCI
based on the electrocorticogram—brain activity recorded from implanted cortical electrodes. The BCI
provided paraplegic patients the ability to compose letters and symbols on a computer. In a subsequent study,
Felton et al. (2009) used Fitts’ law to evaluate the performance of participants who used a scalp EEG-based
BCI in a target acquisition task. The BCI performance of a group of five motor disabled patients was
compared to that of eight healthy controls (the latter also used joystick control in a separate block of trials).
Fitts’ Law (see Chapter 9) predicted and allowed for direct comparisons in movement time (as a function of
index of difficulty) between the healthy and disabled subjects, and between EEG and joystick control in the
former. That Fitts’ Law, a fundamental lawful relation well established in human factors research since the 1950s, applies not only to limb-based direct motor control but also to brain-based control provides a fitting endorsement for the neuroergonomics approach developed some four decades later.
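For readers who wish to see the relation being invoked, Fitts’ Law (Chapter 9) expresses movement time (MT) as a linear function of the index of difficulty (ID): MT = a + b ID, where ID = log2(2A/W), A is the movement amplitude (distance to the target), W is the target width, and a and b are empirically fitted constants. Plotting MT against ID for the EEG-based and joystick conditions yields separate lines whose slopes index the relative efficiency of the two control methods; the specific parameter values obtained by Felton et al. (2009) are not reproduced here.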

6. CONCLUSIONS AND TRANSITION


This chapter has addressed a critical limitation of human performance when high levels of stress are imposed,
with particular emphasis on the stress of high task demand or mental workload. We have also shown how the
response to stress varies between people, in terms of skill automaticity, time-sharing skills, coping strategies,
and genetically based working memory and executive function capabilities. In particular we have emphasized
how this response is manifest in various aspects of brain function, consistent with the neuroergonomics
approach.
As such, we provide two vital links to the next chapter. First, designers of automation to replace or
augment human performance are driven heavily (although not exclusively) by the desire to reduce or
“offload” human operator workload. Second, because such needs are not constant, but vary from person to
person and occasion to occasion, such automation can be applied adaptively rather than in the same, static
way. It is in this domain that we find that the neuroergonomic methods discussed in this chapter provide some
of the most important signals for when to adapt automation to the specific needs of individual human operators.

Key Terms
absolute workload 349
adaptive automation 352
arousal 361
attentional narrowing 364
automaticity 349
brain computer interfaces 374
cognitive appraisal 363
confirmation bias 366
data link 356
embedded secondary tasks 351
entropy 357
mental workload 346
multitasking 348
neuroergonomics 346
offline measures 352
online measures 352
predictive models 350
relative predictions 350
relative workload 349
strategic control 366
stress inoculation 370
time-sharing 371
transactional appraisal 363
workload assessment 350

12 AUTOMATION AND HUMAN
PERFORMANCE

1. INTRODUCTION
Since their invention, computers have become smaller, faster, more powerful, cheaper, and—to a degree—
more “intelligent”. These changes have come about at an exponential rather than a linear rate—an acceleration
known as “Moore’s Law” (Moore, 1965)—and have fuelled the widespread introduction of computer-based
automation, which, from small beginnings in the 1960s, has encroached on all parts of life today. Automated systems are found in all aspects of work: in manufacturing, power generation, health care, transportation, offices, homes, and many other settings. The growth has been so pervasive that
automation is here to stay. Think of life without GPS, Internet search engines, and electronic commerce. In
the near future, miniature automated devices may permeate our clothes and perhaps even our bodies. The
extent to which automation has pervaded both the workplace and everyday life is well captured by a massive
volume on automation published by Nof (2009), which required more than 90 chapters to describe this
widespread application!
Many factors are responsible for the widespread implementation of automation, which shows little sign
of abating. The factors include economic issues, in particular reducing labor costs, increasing efficiency,
meeting safety requirements, and remaining competitive in the marketplace (Satchell, 1998). Have such outcomes been realized? To a large degree, yes.
Automation has yielded many benefits. Consider two domains where automation is common: health care
and aviation. In the former, electronic medical records and decision support systems have contributed to a
reduction in adverse patient outcomes (Gawande & Bates, 2000; Morrow, Wickens, & North, 2006).
Automatic clinical reminders that guide the physician’s attention to health issues for a particular patient and
recommend follow up have also improved patient care (Karsh, 2010; Vashitz et al., 2009). In surgery, “image-
guided navigation” that supports the surgeon during mastectomy operations can improve patient safety
(Manzey et al., 2011). In aviation, automation has allowed aircraft to fly more direct routes, thereby reducing
fuel costs. The safety record of more automated commercial airplanes also continues to improve on that of
earlier generations of aircraft (Billings, 1997; Pritchett, 2009; Wiener, 1988). Similar benefits have been
documented in many other domains where automation has been implemented at work, in transportation, in
leisure activities, and at home (Nof, 2009; Sheridan & Parasuraman, 2006).
A principal benefit of automation, irrespective of the area of application, is that it can, if carefully
designed, reduce the human user’s workload, both mental and physical. Such workload reductions can occur
in response execution and muscular exertion (consider the automated can opener, screwdriver, or pencil
sharpener), in decision choice (remember, as discussed in Chapter 8, the mental effort involved in making
high-risk decisions in unfamiliar domains), and in information acquisition and analysis (recall the cost of
scanning a cluttered display, or mentally adding two numbers). More than anything else, the potential for
automation to reduce workload is what makes it attractive to the human operator in environments in which
time stress is high or in work settings where cognitive effort has to be minimized because of the need to carry
out many other concurrent tasks. Yet, as we shall see later in this chapter, this workload-reducing feature can
at the same time invite new types of problems when automation is introduced.
Given the widespread benefits that automation has provided, it is not surprising that designers have
pushed for greater and more powerful automation when they are charged with developing new systems. This
is often done in the belief that human error will be eliminated, or that excessive levels of operator workload
will be reduced, so that opportunities for human error will decrease. However, such beliefs have turned out to
be fallacious. While automation may reduce some forms of error, it can introduce new ones (Pritchett, 2009;
Sarter, 2008), and in some cases automation may paradoxically increase rather than decrease human mental
workload (Wiener & Curry, 1980). Research on human-automation interaction has shown that automation
changes the nature of the cognitive tasks that humans have to do, often in ways that were unexpected or
unanticipated by designers (Parasuraman & Riley, 1997). Consequently and ironically, as automation becomes more powerful and assumes more authority, the human role actually becomes more rather than less important (Parasuraman & Wickens, 2008).
A technology-centered approach to design has been largely responsible for the human performance
issues that have arisen with automated systems. Designers have typically concentrated their energies on the
sensors, algorithms, and actuators that go into automated systems, with little or no attention given to the
characteristics of the human users of such systems. There is now ample evidence to support the view that
rather than focusing simply on the technical features of the automation, designers should also consider human
performance, an approach sometimes called human-centered automation (Billings, 1997). The challenge,
therefore, is to design for joint human-automation performance (Lee & Seppelt, 2009).
In this chapter we discuss how that challenge can be met. We consider different aspects of human
capabilities and limitations that are brought out when humans interact with automation and which have been
extensively described in previous chapters of this book. Because automation can be applied to the entire range
of human functioning, from sensing through decision making to action, many of the components of the
information-processing model that was introduced in Chapter 1 are relevant to an understanding of human-
automation interaction. We begin our examination of issues in human-automation interaction by first
discussing examples and purposes of automation.

2. EXAMPLES AND PURPOSES OF AUTOMATION


Automation can be defined as the performance by machines (typically computers) of functions that were
previously carried out, whether fully or partially, by humans (Parasuraman & Riley, 1997). In some cases, the
term automation has also been applied to describe those tasks that humans are incapable of performing (e.g.,
sensing beyond the visible or audible spectrum, or robots lifting heavy loads or handling toxic material).
Automation may be described in terms of its purposes, the human performance functions it replaces, and the
strengths and weaknesses it shows as humans interact with automated devices ranging from simple alarm
systems to complex autopilots and decision-aiding systems. The different purposes of automation may be
assigned to five general categories.

2.1 Tasks that Humans Cannot Perform


Automation is sometimes necessary because it can carry out functions that the human operator cannot
perform. This category describes many of the complex mathematical operations performed by computers
(e.g., those involved in statistical analysis). In the realm of dynamic systems, examples include control
guidance in a manned booster rocket, in which the time delay of a human operator would cause instability (see
Chapter 10); aspects of control in complex nuclear reactions, in which the dynamic processes are too complex
for the human operator to respond to online; or robots that operate in hazardous confined spaces, such as their
use in searching for victims in the collapsed World Trade Center following the September 11 terrorist attack
(Casper & Murphy, 2003). In these and similar circumstances, automation appears to be essential and
unavoidable, whatever its costs.

2.2 Human Performance Limitations


This category of automation includes functions that the human operator can do but only poorly or at the cost
of high workload because of system complexity and information load. Examples include the autopilots that
control many aspects of flight on commercial aircraft (Degani, 2004; Pritchett, 2009; Sarter & Woods, 1995;
Sebok et al., 2012), and the automation of certain complex monitoring functions, such as the ground
proximity warning system (GPWS), alerting pilots to the possibility of collision with the terrain, or alerts for
possible collisions with other aircraft (Wickens, Rice, et al., 2009). Efforts have also been directed toward
automating diagnosis and decision processes in such areas as medicine (Garg et al., 2005; Morrow et al.,
2006), nuclear process control (Woods & Roth, 1988), ship navigation (Lee & Sanquist, 2000), and
coordination of multiple unmanned aerial and ground vehicles (Barnes & Jentsch, 2010; Cummings et al.,
2007; Parasuraman et al., 2007). Military command and control operations are also increasingly being carried
out in a network-centric manner, where many entities are connected together in large, complex, distributed
networks, further mandating the use of automated agents (Cummings et al., 2010). These approaches
generally require the implementation of artificial intelligence, in the form of expert systems (Darlington,
2000) or agent-based software (Lewis, 1998).

2.3 Augmenting or Assisting Human Performance

Automation can assist humans in areas where they exhibit limitations. This category is similar to the
preceding one, but automation is intended not as a replacement for integral aspects of the task but as an aid to
peripheral tasks or mental operations necessary to accomplish the main task. As we have seen in previous
chapters, there are major bottlenecks in human performance, in particular, limitations in human working
memory and in prediction or anticipation for which automation would be useful. An automated display or
visual echo of auditory messages is one such example, as discussed in Chapters 6 and 7. Examples of this
might be the phone number retrieved from operator information which appears on a small telephone display;
or digitized data link instructions from air traffic control “uplinked” to the aircraft that can appear as a text
message on the pilot’s console (Helleberg & Wickens, 2003).
Another example is a computer-displayed “scratch pad” of the output of diagnostic tests in fault
diagnosis of the chemical, nuclear, or process control industries. As suggested in Chapter 8, this procedure
would greatly reduce memory load. As noted several times throughout this book, any sort of predictive
display that would off-load the human’s cognitive burden of making predictions would be of great use. Yet
another example of an automated aid is the display “decluttering” option, which can remove unnecessary
detail from an electronic display when it is not needed, thereby facilitating the process of focused and
selective attention (St. John et al., 2005; Yeh & Wickens, 2001).

2.4 Economics
Automation is often introduced because it is less expensive than paying people to do equivalent jobs or to be
trained for those jobs. Thus, we see robots replacing workers in many manufacturing plants and automated
phone menus replacing the human voice on the other end of the line. Unmanned air vehicles are far cheaper to
both manufacture and fly than are manned airplanes (Cooke et al., 2006). But as the phone menu example
suggests, the economy achieved by such automation does not necessarily make the service “user friendly” to
the human who must interact with it (Landauer, 1995; St. Amant et al., 2004).

2.5 Productivity
There are many instances in which increased demands for productivity are imposed when there is limited
manpower. For example, increased demands for air travel put more planes in the sky, but the work force of
skilled air traffic controllers is limited. Doctors may need to see more patients when their number is limited.
The military often seeks to fly more unmanned air vehicles with a limited number of pilots to increase the productivity of surveillance, and hence pushes for more UAVs to be supervised by a single pilot. In such cases, workload limits are rapidly exceeded unless layers of automation are introduced (Cummings & Nehme, 2010; Dixon
et al., 2005).

3. AUTOMATION-RELATED INCIDENTS AND ACCIDENTS


Although automation has yielded many benefits, at the same time it has introduced new problems that have
occasionally led to accidents. Several highly publicized incidents and accidents have underscored the need for
designing automated systems by taking human factors into account early in the systems requirements phase.
Many such incidents have involved commercial automated aircraft (Billings, 1997; Parasuraman & Byrne,
2003). Analyses of these accidents have not only revealed that automation can introduce new vulnerabilities in
system performance, but have also illustrated how human capabilities and limitations are brought to the
forefront when designers introduce automation from a purely technology-centered perspective. We describe a
few of the many such automation-related incidents and accidents.
A caveat must be noted before describing these examples. Most accidents are the result of multiple
precipitating occurrences and conditions ultimately leading to the event (e.g., Reason, 1990, 2008).
Consequently, attributing an accident exclusively to poor automation design can be difficult. Nevertheless,
analysis of several incidents has pointed to a leading role for automation (Funk et al., 1999).
An early example was the 1972 crash of an L-1011 aircraft in the Florida Everglades while on descent to
Miami. The crew became preoccupied with troubleshooting a problem with a landing gear indication light,
and they did not recognize that the “altitude-hold” function of the autopilot had been inadvertently
disconnected. A major factor contributing to this accident was the poor feedback on automation state provided
by the system (Norman, 1990). In their report on the accident, the National Transportation Safety Board
(NTSB) stated that disengagement of automation should be clearly signaled so that the pilot can validate
whether it was intended or unintended (NTSB, 1973). In the L-1011 accident, the principle that automation
states and state changes should be made salient to the human operator was violated. Most current autopilots now provide an aural and/or a visual alert upon disconnect. The alert remains active for a few seconds or
requires a second disconnect command input by the pilot before it is silenced.
At sea, low saliency of alerts and high operator trust in automation (complacency) were major contributing factors in the grounding of the cruise ship Royal Majesty off the coast of Nantucket, Massachusetts, which resulted in several million dollars’ worth of damage to the vessel (Parasuraman & Riley,
1997). This ship was fitted with an automatic radar plotting aid (ARPA) for navigation that was based on GPS
receiver output. The bridge crew had to monitor the ARPA while engaged in other duties. Because of a loss in
the GPS signal due to a frayed cable from the antenna, the ARPA system reverted to “dead reckoning” mode
and did not correct for the prevailing tides and winds, so that the ship was gradually steered toward a sand
bank in shallow waters. The change in automation mode was signaled by a hard-to-see change in a single
letter on a small, liquid crystal display (see change blindness in Chapter 3). At the same time, the crew
continued to follow the ARPA display for over a day and failed to notice other indicators that the ship was in
dangerously shallow waters, such as communications from small fishing vessels in the area and lights on the
shore. The NTSB (1997) report on the incident cited poor interface design, crew over-reliance on the ARPA
system, and complacency associated with insufficient monitoring of other sources of navigational information
(such as another radar and visual lookout).
The upheavals in Wall Street over the past few years provide a third example illustrating the role of
automation in catastrophic incidents. The financial crises in 2008 and 2010 were directly related to the use of
computerized derivatives trading and other forms of automated transactions in the stock market. Automated
trading has long been touted for its economic benefits (Domowitz, 1993; Steil, 2001), but an unintended
consequence was the development of so-called high-frequency trading, where millions of shares were traded
automatically without human intervention, creating extreme volatility that led to the market meltdown of
2008 and again in 2010. The complexity and opacity of the algorithms underlying automated trading, coupled
with human users (including those at regulatory agencies such as the Securities and Exchange Commission)
who had limited understanding of the automation algorithms, were major reasons for the crises (McTeague,
2011). Furthermore, as noted by Taleb (2007), the problem with many of the algorithms in the financial
models that went awry was that they assumed that human decision making was optimal. As discussed
previously in Chapter 8, a large body of research has shown, however, that human decision making is
dominated by heuristics and other cognitive “short cuts,” which work most but not all of the time (Tversky &
Kahneman, 1974). Unfortunately, these decision heuristics were never incorporated in the automation
algorithms.

4. LEVELS AND STAGES OF AUTOMATION


Analyses of automation-related incidents and accidents reveal that the functionality of the automation has a
major influence on how well human operators interact with automation in meeting their system performance
goals. The different functions that automation can take have been described in a number of ways. Automation
is not all or none, but can vary across a continuum of levels, from the lowest level of fully manual
performance (no automation) to the highest level of full automation. Sheridan and Verplanck (1978), in
proposing the concept of supervisory control, first suggested a taxonomy of 10 such levels of automation.
Supervisory control refers to a system in which a human operator does not directly operate on the physical
plant being controlled but does so through an intermediary, usually a computer, that has effectors to act on the
environment based on information obtained from sensors (Sheridan, 2002; Sheridan & Parasuraman, 2006).

HIGH 10. The computer decides everything, acts autonomously, ignoring the human
     9. informs the human only if it, the computer, decides to
     8. informs the human only if asked, or
     7. executes automatically, then necessarily informs the human, and
     6. allows the human a restricted time to veto before automatic execution, or
     5. executes that suggestion if the human approves, or
     4. suggests one alternative
     3. narrows the selection down to a few, or
     2. the computer offers a complete set of decision/action alternatives, or
LOW  1. The computer offers no assistance: human must take all decisions and actions

FIGURE 12.1 Levels of automation scale (after Sheridan & Verplanck, 1978).

Figure 12.1 shows the 10-point Sheridan-Verplanck scale, with higher levels representing increased
autonomy of computer over human action. For example, at a low level 2, several options are provided to the
human, but the system has no further say in which decision is chosen. An example of level 4 automation
would be a conflict detection and resolution system that notifies an air traffic controller of a conflict in the
flight paths of two aircraft and suggests a resolution, but the controller retains authority for executing that
alternative or choosing another one. At a higher level 6, the system gives the human only a limited time for a
veto before carrying out the decision choice. Sheridan further refined this scale in subsequent published work
(Sheridan, 2002; Sheridan & Parasuraman, 2006) and others have proposed related taxonomies (Endsley &
Kaber, 1999).
It should be noted that the concept of levels of automation does not require that there be 10 levels; there
is no “magic number” 10. What is most important is that the levels are defined such that higher levels assign more responsibility to automation and reduce the cognitive work required of the human. The levels of automation
concept also does not imply that humans and automation work as independent agents. As Sheridan and
Verplanck (1978) first noted in their description of the supervisory control concept, the human and machine
components are inter-dependent, with the human making plans to execute via the machine, monitoring its
actions, and “teaching” it what to do next. The relative degree to which the human is engaged in these
activities, however, varies with the level of automation. For example, as the automation takes on more
responsibility, the human requirement for monitoring increases (Parasuraman, 1987).
The Sheridan-Verplanck scale is based on different levels of human versus automation involvement and
control, but one can also think of automation as applied to different information-processing stages, from
sensing through decision making to action. This book has been structured within a framework emphasizing
stages of information processing, and automation too can be conceptualized in terms of how it augments or
assists those different processing stages. Parasuraman et al. (2000, 2008) extended the levels of automation
concept to cover stages of automation in human-machine systems. In this expanded version, a simpler form of
the human information processing model described in Chapter 1 was adopted, a four-stage model consisting
of information acquisition, information analysis, decision making, and action implementation (see Figure
12.2).

FIGURE 12.2 Model of levels of automation for different information-processing stages (after Parasuraman, Sheridan, & Wickens, 2000).

The first stage in the model of Parasuraman et al. (2000) refers to the acquisition and registration of
multiple sources of information. This stage includes sensory processing, initial pre-processing of data prior to
full perception, and selective attention. For example, the alarms discussed in Chapter 2 are a form of
automation designed to direct the user’s attention to a problem. The second stage involves manipulation and
integration of processed and retrieved information in working memory. This stage can also be conceptualized
to include cognitive operations such as rehearsal, integration, and inference, such as situation assessment and
automatic diagnosis. However, such operations are proposed to occur before decision making and action
selection, which is the third stage where automation assists in making a choice. The fourth and final stage
involves the implementation of a response or action consistent with the decision choice. The model proposes that automation can be applied at different levels to each of these four stages from completely manual
operation to full automation.
While the four-stage model simplifies to some extent the complexities of the human information-
processing model discussed throughout this book, with its many feedback loops and parallel
processing, it has proven useful as a framework with far-reaching implications for automation design.
Furthermore, a model of automation support need not be as complex as the human it is meant to aid.
Figure 12.2 provides a schematic of the model of levels and stages of automation. A particular system
can involve automation of all four dimensions at different levels. Thus, for example, a given system (A) could
be designed to have moderate to high levels of automation of information acquisition, information analysis,
and decision making, but a low level of action automation. Another system (B), on the other hand, might have
high levels of automation across all four dimensions. An example of type A is the Theater High Altitude Area
Defense (THAAD) system. THAAD, which is used to intercept ballistic missiles (Department of the Army,
2003), has relatively high levels of automation across information and decision stages; however, action
implementation automation is low, giving the human full control over the firing of missiles. On the other
hand, Robonaut, a robot used in extra-vehicular tasks during deep space missions, represents an example of
type B, with high automation across all stages (Bluethmann et al., 2003). We describe each stage within the
taxonomy as follows.
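Before describing the individual stages, the idea of a stages-by-levels profile can be made concrete with a small sketch (in Python). The numeric levels assigned to the THAAD-like and Robonaut-like examples below are illustrative assumptions chosen to mirror the qualitative descriptions above, not published ratings.

```python
from dataclasses import dataclass

@dataclass
class AutomationProfile:
    """Level of automation (0 = fully manual, 10 = fully automatic)
    applied at each of the four information-processing stages."""
    information_acquisition: int
    information_analysis: int
    decision_selection: int
    action_implementation: int

# Illustrative profiles only (the numbers are assumed for the example):
# a "type A" system such as THAAD is highly automated except for action
# implementation, where the human retains firing authority; a "type B"
# system such as Robonaut is highly automated across all four stages.
type_a = AutomationProfile(8, 8, 7, 2)
type_b = AutomationProfile(9, 9, 9, 9)
```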

4.1 Information Acquisition


Automation of information acquisition (stage 1 automation) applies to the sensing and registration of input
data. These operations are equivalent to the first human information processing stage, supporting human
sensory and selective attention processes. A low level of information acquisition automation could involve
manipulation of sensors in order to scan and observe. For example, modern unmanned air vehicles (UAVs)
typically have cameras that can provide a remotely located operator a video feed of a scene and are capable of
features such as tilt or zoom (Cooke et al., 2006). In the area of health care, automation such as electronic
medical records (EMR) can assist a physician by directing selective attention to sources of information about
the patient or the medications they may be using. A somewhat higher level of automation at this stage could
involve organization of incoming information according to some criteria (e.g., a priority list and highlighting
of some part of the information). For example, modern air traffic control facilities use “electronic flight strips”
that have the capability of listing aircraft in terms of priority for handling by the controller. In human-
computer interaction, the “ping” of a newly arriving e-mail or the highlighting of a misspelled word provides attentional guidance.
As noted in Chapter 2 in the section on signal detection theory, human operators can sometimes fail to
detect critical events in the environment, and such failures to notice critical targets can be more prevalent if
the work period is prolonged (vigilance). Information acquisition automation can mitigate both problems by
providing alarms that direct operator attention to such events. When such alarms are simply triggered by a
sensor, they can be characterized as relatively “dumb” and associated with a low level of information
acquisition (stage 1) automation. However, when alarms integrate information from several sensors to make
an inference regarding the identity or severity of a critical event, then such “smart” alarms qualify as
information analysis (stage 2) automation. A fire alarm that integrates temperature and particulate
concentration might be a simple example of such integration. Pritchett (2009) provides examples of both types
as they are used in cockpit alerting systems.
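The contrast between a single-sensor (stage 1) alert and a sensor-integrating (stage 2) alert can be illustrated with a minimal sketch (Python); the sensor values and thresholds are invented for the example.

```python
def dumb_temperature_alarm(temp_c, threshold=60.0):
    """Stage 1 style: a single sensor crossing a fixed threshold triggers the alert."""
    return temp_c > threshold

def smart_fire_alarm(temp_c, particulate_ug_m3,
                     temp_threshold=60.0, particulate_threshold=150.0):
    """Stage 2 style: integrate temperature and particulate concentration
    before inferring that a fire (rather than, say, a hot kitchen) is present."""
    return temp_c > temp_threshold and particulate_ug_m3 > particulate_threshold

# A hot but clean environment trips only the single-sensor alarm.
print(dumb_temperature_alarm(65.0))        # True
print(smart_fire_alarm(65.0, 20.0))        # False
```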

4.2 Information Analysis


Automation of information analysis involves support of cognitive functions such as working memory and
inferential processes. A low level of stage 2 automation could involve the processing of incoming data and
presentation on the operator’s display of the projected future course of that data, or so-called trend or predictor
displays (Yin et al., 2011, see Chapter 5). For example, nuclear power plant control rooms have displays that
show both the current and the anticipated future state of the plant (Moray, 1997). Such predictor displays were
also discussed in Chapter 5 in relation to their use in process control environments. A higher level of
automation at this stage involves integration of information values rather than only prediction. In such cases,
the systems combine several input variables into a single value or object, as in the integrated polygon displays
that are used in process control or surgical settings (Smith et al., 2006). In both these examples, information
integration assists the human operator by reducing the demand on working memory and the need for effortful
inferential processing. Diagnostic aids in medicine are prototypical examples of stage 2 automation (Garg et
al., 2005). So too is the output of a computer statistics package that makes an inference that two means differ with a certain likelihood. A lower automation level may provide confidence intervals. A higher level will
simply signal “significant” or “not significant.”
The distinction between stage 1 and 2 automation corresponds closely to the distinction between level 1
situation awareness (noticing) on the one hand, and levels 2 and 3 SA (inference and prediction) on the other,
as discussed in Chapter 7. Stage 1 automation assists (or replaces) the first of these, stage 2 assists the second.

4.3 Decision Making and Action Selection


The third stage, decision and action selection, involves selection of one from among alternative decision
choices. Stage 3 automation involves providing the human decision maker either with an entire list of
alternatives, a prioritized list, or a single best choice. Sheridan’s original 10-level taxonomy (Figure 12.1) is
applicable to this stage of automation. We discussed such types of decision automation in the context of
“command displays” in Chapter 6. Important are the distinctions at the highest levels of stage 3 automation in
which (a) the human may be offered a single option, but can choose to ignore it; (b) the human cannot ignore
the option since it will be chosen (and executed) unless the human vetoes it (within some time limit); (c) the
human cannot even veto. Levels (b) and (c) will also mandate the highest level of action implementation
automation.
Examples of stage 3 automation can be found in many work domains. An example from aviation is the
airborne traffic warning system, which provides a resolution advisory that tells the pilot to fly one particular
maneuver (e.g., “climb climb”) to avoid a collision with another aircraft (Pritchett, 2009). In health care,
decision-aiding systems have been developed to support physicians in making diagnostic decisions about
patients or treatments (Garg et al., 2005; Morrow et al., 2006). An example is the appearance on the display
screen of computerized patient record systems of specific recommendations regarding treatment of a patient
with HIV (Patterson et al., 2004).
It is important to distinguish stage 3 automation, which specifies which course of action the human operator should follow, from the previously discussed stage 2 automation, which only supports
inferential processing that leads to a decision. In the context of the statistics package, one that goes beyond
providing a p value, and tells the user whether to “accept” or “reject” the null hypothesis is invoking stage 3
automation.
The contrast between stage 2 and stage 3 automation is directly analogous to the contrast between what
Mosier and Fischer (2010) describe as front end and back end decision making, respectively. The distinction
is critical here, as it was in Chapter 8, because at stage 2, automation need not impose any values in making an
inference on what is the likely diagnostic state or assessment of a situation. However at stage 3, automation
must either explicitly or implicitly assume values for the different decision outcomes it is advising (or
mandating), and these added assumptions leave room for greater departure between a human’s choice and the
recommendations of automation.

4.4 Action Implementation


The final stage of action implementation refers to the physical accomplishment of the action choice. Stage 4
automation involves machine execution of the choice of action, replacing human motor response (e.g., hand or
limb movements or voice commands).
Different levels of action automation may be defined by the relative amount of manual versus automatic
activity in executing the response. For example, in a photocopier, manual sorting, automatic sorting,
automatic collation, and automatic stapling represent different levels of action automation that can be chosen
by the user. A somewhat more complex example from air traffic control is the automated “handoff,” in which
transfer of control of an aircraft from one airspace sector to another is carried out automatically via a single
key press, once the decision has been made by the controller (Wickens et al., 1998). Robotic telesurgery, in
which a surgeon guides a remote robot that carries out surgical actions on a patient, provides another example
of a high level of stage 4 automation. Marescaux et al. (2001) reported successful use of such a system to
allow a surgeon in New York to perform a gall-bladder removal operation on a patient 3,500 miles away in
France.
The implications of the stages and levels model for automation design are discussed in Section 9.2 of this
chapter. In the following sections we consider a number of different aspects of automated systems that can
contribute to difficulties in their use by human operators.

5. AUTOMATION COMPLEXITY
Automation, by its very nature, replaces functions that were originally performed by humans with mechanical or computer components. Thus, while it may eliminate human error (discussed in Chapter 9), the increased number of non-human components will increase the probability of a system error or fault. Furthermore, the greater the
levels or complexity of an automation function, the more components it will contain and, using the reliability
equation of Chapter 9, the greater is the possibility that something, somewhere, sometime, will fail. Thus, it is
almost inevitable that automation in such complex systems will be imperfect. Automation imperfection can
lead to problems of over- or under-reliance on automation, as discussed further in succeeding sections of this
chapter.
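To illustrate the point with a hedged example, assume the serial form of the reliability equation (the system functions only if every component functions, and component failures are independent): if each of n components has reliability r, overall system reliability is r raised to the power n. A single component with r = 0.999 fails only about once in a thousand operations, yet a system of 500 such components has reliability of roughly 0.999^500 ≈ 0.61, so a fault occurs somewhere in about four of every ten operations.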
An assumption that is often made is that computer-based automation can improve the reliability and
safety of systems compared to analog or electro-mechanical devices because the hardware failure modes of
these older technologies are reduced by using software. However, software is also not free of potential failure. The increasing sophistication and complexity of software has led to many more lines of code in automated
systems. Often, new software developed by a company incorporates “legacy” code that was written by
programmers long gone from the company and unavailable to provide information on the old code.
As an example of software size and complexity, the new Boeing 787 “Dreamliner” aircraft requires
several million lines of code to run its automated systems. With such large systems, there is a significant
probability that insidious “bugs” hiding within the software can lead to unforeseen problems (Landauer,
1995). Leveson (2005) has written extensively on the problem of “software safety” and the difficulty of
software verification. She has also analyzed the role of software in many accidents involving aircraft, space
vehicles, and other complex systems. In her analysis of the SOHO spacecraft accident in 1996, for example,
she pointed out that overconfidence and complacency led to inadequate testing and review of changes to
ground-issued software commands to the spacecraft (Leveson, 2005). The human organizational response to
software failures thus represents another type of automation-related accident, in addition to those discussed
previously in this chapter.
Automation complexity brings with it the issue of observability to the human user. When complex
algorithms are embedded within an automated system, the operator is likely not to understand why the
automation performs a certain action because the algorithms are not observable, as was the case with many of
the algorithms involved in computer trading that led to the stock market crash in 2008. In some instances the
automation is so complex that it functions as an independent “agent” through which the human operator acts
on the environment (Lewis, 1998). As a result, mutual intelligibility between the human and machine agent
can be lost (Woods, 1996). Consequently, agent-based systems might be best suited to relatively simple,
low-risk tasks. For more complex tasks involving contextual decision-making, however, such systems must
provide feedback to the human operator so that agent intentions are understood (Olson & Sarter, 2000).
Increased automation complexity brings with it a second concern. If algorithms are so complex as to do
things a different way from how humans normally (or previously) accomplished the same task, then the
human operator may become surprised, and sometimes suspicious of automated functioning. An example is
the flight management system (FMS), a collection of sophisticated autopilots that guide an aircraft through
flight efficient routes, using algorithms and logic considerably more sophisticated than a pilot would use to fly
the same routes (Pritchett, 2009; Sarter & Woods, 1995; Sarter, 2008; Sebok et al., 2012). Because of these
complex, non-human (and therefore non-intuitive) algorithms, such systems will on occasion do things
(legitimately) that pilots do not expect, and hence lead them to ask “why is it doing this?,” a concept in
aviation described as “automation surprises” (Degani, 2004; Sarter, 2008; Sarter et al., 1997). In general,
such surprises do not have major implications unless they lead the human to assume that the automation has failed and hence to intervene, perhaps inappropriately; such situations have led to fatal accidents (Degani, 2004).

6. FEEDBACK ON AUTOMATION STATES AND BEHAVIORS


If automation is not carefully introduced, it can have the characteristics that Sarter and Woods (1996) have
labeled as “not a team player.” Much of this deficiency may result from the absence of effective feedback to
the human monitor of the automation’s functioning, regarding what it is doing and why (Norman, 1990). This
issue has long concerned pilots as they supervise their powerful, but complex and often uncommunicative
FMS (Wiener, 1988; Sarter & Woods, 1995; Sarter, 2008). Deficiencies in automation feedback can be of
several types: it can be completely absent, a state Sarter and Woods (1995) characterized as automation that is
“silent;” it can be poor, in the sense of not being salient enough to draw the operator’s attention to state changes; it can be ambiguous, so that the operator is confused; and finally, it can be inflexible, lacking detail,
and not providing information specific to the situation.
When automation provides no feedback on its state, human operators can be left in the dark. As noted in
Chapter 2, humans have difficulty in detecting subtle changes in the environment because of limitations in
signal detection and vigilance capabilities. Even if feedback is presented, however, its saliency may be so low
that operators do not notice it, particularly if their attention is focused elsewhere on their other tasks. As noted
in Chapter 3, even apparently compelling changes in the environment (e.g., a man in a gorilla suit strolling
through a group of people playing a pass-the-ball game; Simons & Chabris, 1999) can be missed if attention is
directed elsewhere—the phenomenon of change blindness. On the flight deck, flight mode state annunciators
appeared to go unnoticed if they were unexpected (Sarter, Mumaw, & Wickens, 2007). In the Royal Majesty
ship accident that was discussed earlier in this chapter, the failure of the GPS signal to the automated radar system was signaled, but the signal was so small (a change in one character on a small liquid crystal display) that it was virtually unnoticeable.
Even when salient feedback is provided, additional communication deficiencies can result from the
inherent inflexibility in the dialogue with most automated systems. Such systems must, after all, be
preprogrammed with a fixed set of rules that limits their “conversational flexibility.” The increasingly
prevalent phone menu is the perfect example of such inflexibility, where a simple question one might have,
that does not meet the pre-specified set of menu categories, cannot be easily handled. Often one must wait till
the final option: “if you need to speak to an operator, press eight.” Also, as we noted in Chapter 6, there are a
number of non-linguistic features of human-human communications that cannot be readily captured by
computer mediated (i.e., automated) communications. We examine the issue of communication between
automated systems and human users in more detail later in this chapter when we discuss the concept of
“human-computer etiquette.”

7. TRUST IN AND DEPENDENCE ON AUTOMATION


There is probably no variable more important in human-automation interaction than that of trust. Classic
studies by Bainbridge (1983), Muir (1988), Wiener and Curry (1980), and by Lee and Moray (1992)
introduced the concept, and early papers by Parasuraman et al. (1993) and Sorkin (1989) introduced concepts
of complacency (over-trust) and the “cry wolf effect” (under trust), respectively, concepts that will be
described in depth below. There has subsequently emerged a large literature on trust and its relation to
automation usage and human-system performance. Lee and See (2004) provided an overview of this work and
a process model of trust in automation. Madhavan and Wiegmann (2007) extended this review and also
compared human-human and human-automation trust, observing that the two had features in common but
could also be distinguished. Hancock et al. (2011) reported a meta-analysis of trust studies in the specific
context of human-robot interaction.
At the outset, it is essential to distinguish between automation trust and automation dependence. The
former is a cognitive/affective state of the user that is typically assessed with subjective ratings (e.g., Jian et
al., 2000; Singh et al., 1993); the latter is an objective behavior that can be measured from the user’s
interaction with the automation (e.g., Lee & Moray, 1992). It may for example be measured by the extent to
which the user “turns automation on,” follows its advice, or cross checks automation’s recommendations
against the raw data (Bahner et al., 2008).
Automation trust and dependence are usually correlated: if we trust an agent, whether a machine or a
human, we will tend to depend on that agent. For example, “I trust my teenage daughter not to text while
driving since I have lectured her many times that it is an unsafe practice;” or “I trust my automatic teller
machine to give me the correct amount of cash in a transaction without my having to count the money,
because I have never been short changed.” This correlation between trust and dependence is often
considerably less than 1.0. We may be forced to depend on the automation when our workload is high, but
may not always fully trust it; sometimes we may “look over its shoulder.” We may also fully trust automation,
but may not depend on it at all, if we simply prefer to do the task manually because of the excitement and
challenges of the latter.
Several variables are known to affect both trust and dependence in the same way. For example, the more
complex the automation’s algorithms, the lower the trust (Lee & See, 2004). Certainly this was a major source of pilot mistrust of the hugely complex FMS described in Section 5 above. Closely related is the loss of trust caused by the lack of transparency or feedback about what automation is doing, described in Section 6 above. What goes on inside the “black box”? Madhavan et al. (2006) found that the kinds of mistakes made by automation also affect trust. Really “bad” automation errors degrade trust more than plausible errors (like
the kinds the human user would make). There are also individual differences in trust/dependence (Krueger et
al., 2012; Merritt & Ilgen, 2008).
Of all the variables to affect trust/dependence, probably the most critical is automation reliability.
Perfectly (100 percent) reliable automated systems are rare except when they are extremely simple. This
necessarily means that human operators may sometimes choose not to trust the output of an automated
system. Of course, for complex systems operating in an uncertain world, perfect performance is virtually
impossible, whether the task is executed by automation or by a human expert, because of the inherent
uncertainties involved in the information which automation must process, in domains such as weather
forecasting, economics, disease progression, or prediction of the behavior of individual humans (such as in
terrorism or mental health). Though imperfect, automation can provide useful assistance to the human in such
areas (Wickens & Dixon, 2007). Other causes of degraded reliability include the software bugs identified above, power failures (the calculator giving out in the middle of an exam, forcing reliance upon mental long division), and improper human “set up” or programming of the automation (Wiener & Curry, 1980). The latter two may not be
considered failures of the automation itself, but they can have similar consequences for trust and dependence.
Whatever the sources of unreliability, a critical concept in the relation between automation
trust/dependence and reliability is the calibration curve, shown in Figure 12.3. Here reliability is scaled on
the X-axis (and can often be expressed numerically on a 0–1.0 scale, for example as one minus the ratio of automation errors to opportunities for error). Either trust or dependence is represented on the Y-axis, trust by a
minimum-maximum subjective rating scale, while for dependence, any number of objective measures can
quantify the proportion of times automation is used (e.g., the proportion of times a human decision agrees
with an automated recommendation). The diagonal line represents the line of perfect calibration. Importantly,
the curve bisects the space into two regions, elaborated in the sections below: over-trust to the upper left, and under-trust to the lower right. It often happens that these two regions are linked in time via the dynamics of trust (Lee & Moray, 1992; Yeh, Merlo, et al., 2003). In a typical scenario the operator works with an
automation system of high reliability. It may operate for many “trials” (or a long time) without failure, and
during this time the operator builds up trust in and dependence on it, often to the point of being complacent far
to the upper left of Figure 12.3. Then it fails, in what we describe as the first failure, an event that has
particular significance in the study of human-automation interaction (Rovira et al., 2007). The human
response (or non-response) to these first failures is often dramatic (Yeh, Merlo, et al., 2003) and represents
the source of many automation-based accidents, such as the Royal Majesty grounding described above (see
also Dornheim, 2000, for examples of first failure experience in aircraft automation).

FIGURE 12.3 The relation between subjective trust and automation reliability.

Following the first failure the operator will then typically leap across the calibration curve of Figure 12.3
to the far bottom right, showing a great amount of mistrust (“burned once, never again” or “Fool me once, shame on you. Fool me twice, shame on me.”). Then, over time, trust and dependence will gradually recover
toward the calibration line at a level approximating long range reliability (Yeh, Merlo, et al., 2003). In the
following discussion, we present these two regions in their typical sequence, from over-trust to first failure to
under-trust.
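A minimal sketch (Python) of how measured reliability and rated trust might be compared to locate an operator in the two regions of Figure 12.3 is given below; the rescaling of trust onto a 0–1.0 scale and the tolerance band around the diagonal are illustrative assumptions, not part of any published model.

```python
def reliability(errors, opportunities):
    """Express automation reliability on a 0-1.0 scale (one minus the error rate)."""
    return 1.0 - errors / opportunities

def calibration_region(trust_0_to_1, reliability_0_to_1, tolerance=0.05):
    """Classify the operator relative to the diagonal of Figure 12.3.
    Trust is assumed to have been rescaled from the subjective rating
    scale onto 0-1.0 so that the two axes are comparable."""
    gap = trust_0_to_1 - reliability_0_to_1
    if gap > tolerance:
        return "over-trust (upper-left region)"
    if gap < -tolerance:
        return "under-trust (lower-right region)"
    return "well calibrated"

# Example: automation that erred 10 times in 100 opportunities, paired with
# near-maximal rated trust, falls in the over-trust region.
print(calibration_region(trust_0_to_1=0.99,
                         reliability_0_to_1=reliability(10, 100)))
```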

7.1 Over-trust

7.1.1 COMPLACENCY Automated systems that are highly reliable but not perfectly so can invite the tendency not
to monitor automation or the information sources that the automated system uses. The term complacency has
long been used in aviation (Wiener, 1981) and other domains (Casey, 1988). As in the case of the cruise ship
Royal Majesty that was described earlier, complacency has been implicated as a contributing factor in many
accidents.
While automation complacency will provide ample resources for concurrent tasks prior to the first
failure, it can have at least two behavioral consequences upon its failure. On the one hand, the infrequent and
therefore unexpected automation failures, when they do occur, are hard to detect, as we learned in Chapter 2
(expectancy and signal detection), Chapter 3 (expectancy and visual scanning), and in Chapter 9 (expectancy
and reaction time). On the other hand, an operator who expects that automation is doing its job will be less
likely to monitor the job it is doing—losing awareness of the evolving state of, or surrounding, the automated
system (Endsley & Kiris, 1995; Kaber et al., 1999; see situation awareness: Chapter 7). Hence, if the failure
does occur, the monitor will be less able to deal with it appropriately. An example would be a pilot who has to
jump into the control loop to fly the aircraft manually, should the autopilot unexpectedly fail. Furthermore,
research reveals that it is easier to remember an action if you have chosen it yourself than if you have
witnessed another agent (another person, or automation) choose that action—the generation effect (Farrell &
Lewandowsky, 2000; Slamecka & Graf, 1978). Thus, automation leaves the operator less aware of the chosen
actions in the system. For example, what is the mode of automation currently in effect (Sarter & Woods,
1997)? Of course complacency does not reveal a problem until automation fails and such a failure, although
often unlikely, is never impossible.
Experimental evidence for automation complacency was provided by Parasuraman et al. (1993), who had
participants perform three concurrent tasks from the Multi-Attribute Task Battery (MATB), one of which (an engine
monitoring task) was supported by an automated system that was not perfectly reliable. Complacency was
operationally defined as the operator not detecting or being slow to notice failures of the stage 1 automation to
detect engine malfunctions. In a control condition, participants had to perform only the engine-monitoring
task with automation support, without the other manual tasks, so that their overall task load was considerably
lower. Detection of automation failures was significantly poorer in the multi-task condition, in which resources were shared between the tasks (see Chapter 10), than in the single-task condition. When participants had simply to
“back up” the automation routine without other duties, monitoring was efficient and near perfect in accuracy.
Thus, automation complacency represents an active reallocation of attention away from the automation to
other manual tasks in cases of high workload (Parasuraman & Manzey, 2010).
As discussed in Chapter 2, operators are less likely to detect signals when they occur infrequently, a
finding consistent with the expectancy theory of vigilance (Parasuraman, 1987).
Thus, automation complacency should be more severe when automation failures are infrequent, occur for
the first time in the operator’s experience (first failure), and/or occur after long periods of error-free
performance. Indeed, one of the ironies of automation is that the more reliable it is, the more it is trusted, and
the more complacent the operator becomes (Bainbridge, 1983). Molloy and Parasuraman (1996) confirmed
this in a study in which the automation failed on only a single occasion, either early or late, in two separate
sessions. Only about half the participants detected the early automation failure, and even fewer detected the
late failure (see also replications by Bailey and Scerbo (2007) and by Manzey et al. (2012)). In a related study,
De Waard and colleagues (1999) had participants in a simulator drive a vehicle in which steering and lateral
control were automated but could be overridden by depressing the brake. On a single occasion a vehicle merged suddenly into the lane directly in front of the participant’s vehicle, but the automation failed to detect the intrusion. Half the drivers did not detect the failure and depress the brake to retake manual control, while 14
percent did respond but not quickly enough to avoid a collision.
Complacency in stage 1 automation (alerts and alarms) can be reflected in two different forms of
automation dependence: reliance and compliance (Meyer, 2001, 2004, 2012). When operators stop monitoring
the raw data during the long periods while the alert is “silent” (or not activated), they can turn their attention
elsewhere to support concurrent tasks. This form of dependence is described as high reliance on automation
(Dixon & Wickens, 2006; Meyer, 2001). When operators react rapidly when the alert “sounds” (or is
activated), this reflects compliance. While a change in reliance does not necessarily mandate a change in
compliance (or vice versa), the two states are often reciprocally coupled via the alert threshold discussed in
Chapter 2. That is, decreases in the alert threshold (beta in SDT terms, see Chapter 2) will typically cause
compliance to decline even as reliance increases (Dixon & Wickens, 2006; Maltz & Shinar, 2003).
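To make this coupling concrete, the following sketch uses the equal-variance Gaussian signal detection model from Chapter 2; the sensitivity (d′) and criterion values are illustrative assumptions rather than figures from any of the studies cited above.

```python
# Minimal sketch: how lowering an alert threshold trades misses (which undermine
# reliance) against false alarms (which undermine compliance). Equal-variance
# Gaussian SDT model; d-prime and the criteria are illustrative, not from any study.
from scipy.stats import norm

d_prime = 2.0  # assumed sensitivity of the alert system

for criterion in (1.5, 1.0, 0.5):           # progressively lower (more liberal) thresholds
    p_miss = norm.cdf(criterion - d_prime)   # P(no alert | dangerous condition present)
    p_fa = 1 - norm.cdf(criterion)           # P(alert | no dangerous condition)
    beta = norm.pdf(criterion - d_prime) / norm.pdf(criterion)  # likelihood ratio at criterion
    print(f"criterion={criterion:.1f}  beta={beta:.2f}  "
          f"miss rate={p_miss:.3f}  false alarm rate={p_fa:.3f}")

# As the criterion (and beta) falls, misses become rare, so the operator can more
# safely rely on the alert's silence; but false alarms multiply, so compliance with
# each individual alert tends to erode.
```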
Automation complacency has also been found in studies with skilled workers supervising automation
that closely resembles real systems. Galster and Parasuraman (2001), for example, found that experienced
general aviation pilots detected fewer engine malfunctions when using an actual cockpit automation system,
the Engine Indicating and Crew Alerting System (EICAS), than when performing all flight simulation tasks
manually. Yeh et al. (2003) demonstrated the strong first failure effect with army personnel using attention-
guidance automation as discussed in Chapter 3. Metzger and Parasuraman (2005) tested experienced
controllers on a high-fidelity air traffic control simulator with “conflict probe” automation that pointed to a
potential conflict between two aircraft several minutes before its occurrence. Significantly fewer controllers
detected a conflict when the conflict probe failed than when the same conflict was handled manually in a
separate session. Eye movement analysis also showed that controllers who missed the conflict made
significantly fewer fixations of the radar display under automation support than under manual control,
pointing to a link between the automation complacency effect and reduced visual attention to the raw data
information sources feeding automation (see also Bagheri & Jamieson, 2004; Manzey et al., 2012; Wickens,
Dixon, Goh, & Hammer, 2005).
The evidence therefore suggests that automation complacency is typically found under conditions of
multiple task load, when manual (non-automated) tasks compete with the automated task for the operator’s
attention. This finding is also consistent with the meta-analysis of stage 1 automation reliability studies by
Wickens and Dixon (2007). They found automation dependence, reflecting complacency, was more correlated
with automation reliability in dual task conditions, when cognitive resources were scarce, than in single task
conditions. Under such multi-tasking conditions, the operator’s attention allocation policy appears to favor his
or her manual tasks, as opposed to the automated task. This strategy may itself stem from an initial orientation
of trust in the automation, which is then reinforced when the automation performs without failure
(Parasuraman & Manzey, 2010).
Moray (2003; Moray & Inagaki, 2000) pointed out that an attention allocation policy devoted primarily
to non-automated tasks and only occasionally to the automated task can be considered rational (see also
Moray, 1984; Sheridan, 2002). Moray also suggested that complacency could only be inferred if the
operator’s rate of sampling of the automated task was actually below that of an optimal or normative observer.
After all, if something never fails (in your experience), why do you need to look at it? The reason, of course, is
that it could fail. In terms of the SEEV model of scanning (Chapter 3), the value of monitoring automation is
extremely high, even if the expectancy is quite low. But people often use their own actuarial experience to
guide their expectancies (Hertwig & Erev, 2009; see Chapter 8).
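The logic of that argument can be illustrated with a toy SEEV-style computation. The coefficients and the 0–1 ratings below are hypothetical and serve only to show how a near-zero experienced expectancy can leave even a high-value automated display with a small predicted share of scanning.

```python
# Minimal sketch of SEEV-style attention allocation (see Chapter 3). All coefficients
# and ratings are made-up illustrations, not parameters from the SEEV literature.
def seev_weight(salience, effort, expectancy, value, coeffs=(1.0, 1.0, 1.0, 1.0)):
    s, ef, ex, v = coeffs
    return s * salience - ef * effort + ex * expectancy + v * value

areas_of_interest = {
    # name: (salience, effort, expectancy, value), each rated 0-1 for illustration
    "manual task display":    (0.6, 0.2, 0.80, 0.5),
    "automated task display": (0.2, 0.4, 0.05, 0.9),  # rarely changes, but high value
}

raw = {name: max(seev_weight(*params), 0.0) for name, params in areas_of_interest.items()}
total = sum(raw.values())
for name, weight in raw.items():
    print(f"{name}: predicted share of attention = {weight / total:.2f}")

# If expectancy is driven by the operator's own failure-free experience rather than
# by the normative (low but nonzero) probability of failure, the automated display
# receives a much smaller share of scanning than the manual display despite its value.
```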
In support of this argument, Bahner et al. (2008) conducted a study of how operators sample the raw data
processed by automation, comparing the number of information sources participants consulted against the
number needed to verify the automation’s diagnosis. They had participants perform a simulated process control task
requiring supervisory control of sub-systems of a life support system for a space station. An automated fault
management system provided recommended diagnoses regarding system faults. The extent to which
participants accepted the automation’s diagnosis without verifying it provided a measure of complacency.
Participants could access (via mouse click) all relevant system information (e.g., tank flow rates) needed to
verify the diagnosis. Bahner and colleagues reasoned, following Moray (2003), that a participant who
accessed the correct number of information parameters needed to verify a diagnosis before accepting it was
optimal, whereas one who sampled less information than necessary to completely verify the aid’s
recommendation could be classified as complacent. All participants sampled less than the optimal number of
information sources and several demonstrated poor detection of the first failure. Thus complacency was a
general finding. Manzey, Reichenbach, and Onnasch (2012) found that more optimal samplers were less likely
to miss the first failure altogether, and Bahner et al. (2008) also found that those participants who had been
specifically trained with examples of automation failures had higher sampling rates and intervened more
appropriately when automation failed. The results provided strong evidence for the existence of automation
complacency, but also pointed to one method (training) that can be used to reduce its incidence: pre-exposure
to the automation failure.
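A minimal sketch of this verification-sampling logic follows; the participant counts and the assumption that six parameters are needed to verify a diagnosis are invented for illustration, not values from Bahner et al. (2008).

```python
# Sketch of the complacency measure used in verification-sampling studies: compare
# the information actually accessed against what full verification would require.
# Numbers and the 1.0 cutoff are illustrative assumptions.
def sampling_ratio(parameters_checked, parameters_required):
    """Fraction of the information needed to verify a diagnosis that was accessed."""
    return parameters_checked / parameters_required

participants = {"P1": (6, 6), "P2": (3, 6), "P3": (1, 6)}  # (checked, required)

for pid, (checked, required) in participants.items():
    ratio = sampling_ratio(checked, required)
    label = "verifies fully (optimal)" if ratio >= 1.0 else "under-samples (complacent)"
    print(f"{pid}: sampling ratio = {ratio:.2f} -> {label}")
```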

7.1.2 AUTOMATION BIAS What has been referred to by Mosier and Fischer (2010) as automation bias represents
another human performance consequence of over-trust. Closely related to complacency, the automation bias
has typically been associated with automated decision aids that are meant to support human decision-making
in complex environments (Mosier & Fischer, 2010; Mosier et al., 1998). If the users of such systems have
strong trust in such automation, they may ascribe it greater power and authority than other sources of
information and advice. Mosier and Skitka (1996, p. 205) defined automation bias as “a heuristic replacement
for vigilant information seeking and processing.” In this view, individuals may not conduct a thorough
analysis of all available information but simply follow the automation advisory, even when the advice is
incorrect, thereby committing an error of commission (Bahner et al., 2008). An example is a pilot following
the advice of a flight planning automated system although its recommendations are wrong (e.g., Layton et al.,
1994).
In an early flight simulation experiment, Mosier et al. (1992) found that 75 percent of pilots incorrectly
shut down an engine, when the automation also incorrectly recommended such a shutdown (stage 3
automation) based on its incorrect diagnosis of an engine fire in the wrong engine. In contrast, only 25 percent
of pilots using a traditional paper checklist committed the same commission error. A later study revealed that
commercial pilots were just as prone to follow such incorrect automation advice. The failure to check the “raw
data” is what has been previously described as automation complacency, reflecting the allocation of the
operator’s attention to other non-automated tasks in busy multi-tasking environments.
The automation bias can also create attentional tunneling, discussed in Chapter 10. Thus Wickens and
Alexander (2009), summarizing several flight simulator studies with skilled pilots, observed that 52 percent of
pilots followed the direction of the automation stage 3 highway-in-the-sky (HITS) display that directly led
them into the obstacle or hazardous path, even though the hazard would have been visible had pilots
consulted the raw data through the windshield outside the airplane.
Certainly, at least some aspects of the automation bias are due to the same attentional limitations that also
lead to complacency (Parasuraman & Manzey, 2010). However, other aspects of automation bias may reflect
decisional rather than attentional factors (Goddard et al., 2012; Mosier & Fischer, 2010). In this view,
automation bias, like other decision heuristics and biases, reflects the tendency of humans to choose the road
of least cognitive effort in decision-making, the so-called “cognitive miser” hypothesis (see Chapter 8 on
decision making). Automation bias may also occur due to users overestimating the capability of automated
aids. More specifically, they may ascribe to the aid greater performance and authority than to other humans
or themselves (Dzindolet et al., 2002).
Goddard et al. (2012) reviewed studies of automation bias in health care settings, focusing on the use of
decision support systems. They found that automation bias was relatively prevalent in many types of medical
diagnostic decision making situations, particularly computer-aided detection of radiological images such as
mammograms and in computer-based interpretation of electrocardiograms. In each case, participants had
reduced diagnostic accuracy when provided with erroneous advice by the automation, compared to
performance without automation.

7.1.3 OVERDEPENDENCE: DESKILLING AND “OOTLUF” In addition to complacency and the generation effect (loss of
SA), a third negative consequence of high-level automation is that the operator’s ability to carry out the
automated task manually may decline over time, a phenomenon sometimes called “deskilling” (Ferris, Sarter, &
Wickens, 2010; Geiselman, Johnson, & Buck, 2012; Lee & Moray, 1994). There is evidence for such skill
loss among pilots of highly automated aircraft, and it is mitigated by the pilots choosing to “hand fly” the
aircraft from time to time (Wiener, 1988). Collectively, the three phenomena of degraded detection (through
complacency), degraded awareness/diagnosis, and manual skill loss may be referred to as the syndrome of “out of the
loop unfamiliarity” or “OOTLUF”.
Automation-related incidents and accidents were previously described in this chapter. However, a
number of recent aviation accidents have specifically pointed to the issue of deskilling as a direct result of
automation. A highly publicized accident was the Colgan Air crash near Buffalo, NY in 2009. The co-pilot
had input incorrect information into the FMS, causing the aircraft to slow to an unsafe speed that triggered a stall
(“stick shaker”) warning. The loss in aircraft speed was apparently not noticed by the flight crew, but when
the stall warning came on, the captain responded by repeatedly pulling back on the control yoke, which
further caused the aircraft to stall and crash, resulting in the death of 49 people on board. The accident
investigation suggested that the crash could have been avoided if the captain had pushed the yoke forward
rather than back. A similar accident was the Air France crash in the Atlantic in 2009, also involving a high
altitude stall in which the pilot made a “nose up” yoke input whereas the opposite should have been done to
maintain lift. In these and related accidents, pilots’ loss of skill in handling stalls due to extensive use of
autoflight systems has been thought to be a major contributing factor.
In many automated systems, OOTLUF concerns are pitted against the very real automation benefit of
reduced workload. For the busy vehicle driver, navigating in an unfamiliar freeway environment, it will likely
be both preferred and a true benefit to safety to offload some aspects of the inner-loop driving control (lane
keeping and headway monitoring) to an intelligent and reliable autopilot, in order that navigational
information can be consulted and decisions can be made without diversion of resources. But the implications
of this tradeoff should be considered carefully, so that the OOTLUF syndrome does not occur. In Section 9.2,
we address whether there is evidence that there might be optimum levels of automation on the tradeoff that do
not produce OOTLUF, yet still provide automation at a high enough level so that workload is tempered
(Wickens, 2008).

7.2 Mistrust and Alarm False Alarms


As noted at the beginning of Section 7, the first failure will often drive the operator from the state of over-trust
to that of under-trust or distrust, just as other factors, such as complexity and poor feedback, can cause
mistrust. As a consequence, such automation may be abandoned (Parasuraman & Riley, 1997), even when it
is accurate (after all, 10 percent unreliable automation will still be accurate 90 percent of the time). Nowhere
is the phenomenon better illustrated than in the “alarm false alarm” problem within automation stages 1 and 2,
in which an alarm (a form of automated advice) will sound, even if no actual failure condition exists (Dixon et
al., 2007; Parasuraman et al., 1997; Sorkin, 1989). Such circumstances invite the operator to mistrust the
alarm system—that is, to be “under-calibrated” as to the true value that the alarm can offer.
Whether because of true unreliability, or complexity (leading to perceived unreliability), automation
disuse can have consequences that may be relatively minor—sometimes we are less efficient when we turn off
automation than we would be with its assistance. For example, Wickens and Dixon (2007) found that people
were better off depending on automated diagnostic systems with as high as a 20 percent error rate, than they
would be relying on their own manual diagnostic skills. In contrast, catastrophic incidents may sometimes
occur because a true (valid) alarm was ignored, or because a critical condition was never announced in the first place
because the “annoying” unreliable alarm system had been turned off previously. Sorkin (1989) reported many
cases of train engineers taping over the speakers from which auditory alerts emanated in the cab because they
were typically false. Seagull and Sanderson (2001) reported that 42 percent of alarms heard by anesthesiology
nurses were ignored (no action taken), and Wickens, Rice, et al. (2009) found that 45 percent of the conflict
alerts received by air traffic controllers required no action (nor was one taken). In the domain of weather
forecasting, Barnes et al. (2007) reported that 76 percent of tornado warnings that were issued were false.
Sometimes the “cry wolf ” response can have tragic consequences. In one report it was concluded that 21
percent of the deaths or injuries related to long-term patient ventilator incidents resulted from delayed or no
responses to ventilator alarms (Joint Commission, 2002). The 1997 crash of a Korean Air flight in Guam
in which over 100 people died represents a particularly tragic consequence of the “cry wolf ” syndrome. The
air traffic controllers monitoring the flight had disabled the terrain collision avoidance system because it had
issued too many false alerts and consequently did not notice that the aircraft was descending into a mountain
short of the runway.
There are of course many reasons for not responding to a false alert or false warning, unrelated to any
loss of trust (Lees and Lee, 2007; Wickens, Rice, et al., 2009; Xiao et al., 2004), particularly if the operator
also has some access to, and awareness of, the raw data or information that led to the alert. As was
discussed in Chapter 2, the response threshold (or criterion) of alert systems is often set at a low level to guard
against misses, but at the cost of false alarms. But if the false alarm nevertheless warns the operator of a
potentially dangerous future situation even if it is not a true danger, it can still be useful and may not increase
mistrust or lead to the “cry wolf ” syndrome. For example, Lees and Lee (2007) argued that automated alerts
in cars (e.g., collision warnings) can supplement the driver’s judgment as to what safe driving maneuver to
execute. Similarly, Wickens, Rice, et al. (2009) conducted an analysis of conflict alerts issued in an air traffic
control center and found little evidence that controllers were prone to the “cry wolf ” effect because the alerts,
even if false because of a low alert threshold, reinforced the controllers’ own perception of the raw data (that a
close passage of the two aircraft was coming).
In conclusion, the relation between trust, dependence, and reliability is complex, but there is no doubt
that humans are not always optimal in calibrating their cognition and behavior, with potentially serious
consequences whether over-trust or under-trust is manifest. In the following sections, we discuss some
potential solutions for harmonizing human-automation interaction, including adaptive automation, the choice of
a level and stage of automation that balances the tradeoff between workload and OOTLUF, and design and
training that support calibrated trust.

8. ADAPTIVE AUTOMATION
Thus far in our discussion of different aspects of human interaction with automation, we have assumed that
the functional properties of the automation, once designed and implemented, remain constant or static during
system operations. This approach, in which the characteristics of automation are set at the design stage and
then executed in the same way during operational use, has been referred to as static automation. In contrast, in
adaptive automation, the level and/or stage of automation is not fixed but may change during system
operations (Feigh, Dorneich, & Hayes, 2012; Hancock & Chignell, 1989; Inagaki, 2003; Kaber et al., 2005;
Kaber & Kim, 2011; Parasuraman et al., 1992, 1996; Rouse, 1988; Scerbo, 2001).
A general schematic for adaptive automation is shown in Figure 12.4. The cognitive state of the operator,
in this case illustrated by mental workload or the capacity of the human to perform, is inferred and used by a
“task manager” to assign more of a task to automation (if workload is high) or to the human (if workload is
reduced). The task manager itself could be automation, human, or a cooperative enterprise. Figure 12.5 shows
some of the possible ways in which adaptive automation can change workload and situation awareness in
order to maintain a balance between the two.

FIGURE 12.4 Adaptive automation. Workload or the capacity of the human to perform is inferred and used by a “task manager” to assign more
of a task to automation (if workload is high) or to the human (if workload is reduced). The task manager itself could be automation, human, or a
cooperative enterprise.

FIGURE 12.5 Three possible strategies of adaptive automation. It is assumed that point B is at a higher level of automation than points A, C,
and D.

Adaptive automation is akin to dynamic function allocation (Lintern, 2012; Winter & Dodou, 2011), in
which the division of labor between human and machine agents is not fixed but changeable, flexible, and
context dependent. For example, if high human workload is inferred at a particular level of automation and
impending performance breakdown is suspected, automation may go to a higher level to support the operator.
At other times, if the operator is in danger of losing situation awareness due to working with high-level
automation, he or she may be brought back more in the loop through a reduction in the level of automation. In
general, adaptive systems seek to limit the potential costs of automation, in particular OOTLUF, and to boost
overall system performance by changing automation functionality during system operations. Adaptive
automation, in contrast to static automation, allows for restructuring the task environment in terms of (a) what
is adapted, (b) how to infer when changes should be made, and (c) who decides that a change should occur.

8.1 What to Adapt


The first issue concerns what aspect of a task (or task complex) should be adapted. Parasuraman et al. (1999)
distinguished between adaptive aiding, in which a certain component of a task is made simpler (by
automation), and adaptive task allocation, in which an entire task (from a larger multitask context) is shifted to
automation.
Here a reasonable argument can be made that the appropriate choice should be one that reduces workload
to the greatest extent, even as it also reduces situation awareness (i.e., moves from point A to point B in
Figure 12.5). The rationale for such an argument is that if the adaptive automation moves from C to B, there is
no workload savings and hence no reason to invoke automation in the first place; and if it moves from D to B,
the task component might as well be fully and inflexibly automated, since this would produce no loss of
situation awareness. Such a criterion could be applied independently of whether adaptive aiding or adaptive
task allocation is implemented.

8.2 How to Infer


A second, critical issue concerns how to infer when adaptive changes should be made. To do so effectively,
both the operator and the automated system must have knowledge of each other’s current capabilities,
performance, and state. Several different approaches have been proposed to generate criteria for adapting
automation to the user (Parasuraman et al., 1992; Rouse, 1988): (1) Environmentally determined, where
automation functionality is varied in response to easily measurable environmental changes or external task
conditions, e.g., providing descent advisories to air traffic controllers only when traffic load or complexity is
high but not otherwise (Hilburn et al., 1997); (2) continuous assessment of operator performance (Kaber &
Endsley, 2004; Parasuraman et al., 2009); or (3) continuous assessment of mental workload, through
neuroergonomic measures (Wilson & Russell, 2007), so that the operator is aided when a suboptimal state
(e.g., high workload) is detected.
As examples of environmental triggers, Parasuraman et al. (1999) demonstrated the success of adaptive
automation in aviation that was invoked during the takeoff and landing phases (known to be the most demanding) but
removed during the low-workload mid-flight cruise portion. In this case the external conditions were the
known phase of flight. Inagaki (1999) suggested that different time periods during the acceleration of an
airplane for takeoff make it more or less important for automation to assume responsibility for a rejected
takeoff decision, should such a decision be required following an engine failure. Here the passage of time and
speed of the aircraft is the external condition. In driving, one might consider an automation aid that uses the
darkness of night (an external condition) to infer that a driver might be more fatigued and less vigilant, hence
adapting an automated alerting device, sensitive to lane deviations.
Measures of performance can also drive adaptation, particularly to the extent that good performance
modeling has revealed clear “leading indicators” that preview subsequent breakdowns. Kaber and Riley
(1999), for example, demonstrated the benefits of adaptive aiding on a primary task (a dynamic cognitive
monitoring and control task) that was based upon degradation of an automation-monitored secondary task.
Parasuraman et al. (2009) used performance on a change detection task—detecting icon changes on a situation
map—to drive adaptive aiding (automatic target recognition, ATR) on a task requiring supervision of multiple
unmanned air and ground vehicles. Compared to performance without the ATR, or to static automation where
the ATR was continuously available, the adaptive automation condition was associated with reduced
workload and increased situation awareness.
Mental workload or cognitive state can be monitored directly as assessed by physiological measures, as
discussed in Chapter 11. Such measures have some advantages over performance measures, primarily their
higher bandwidth and the ability to be obtained even in the absence of any overt behavioral output, which might
otherwise pull attentional resources away from the primary task of interest (Kramer & Parasuraman, 2007). In
the past decade a number of studies have explored the possibility of using measures such as EEG, heart rate,
etc. in adaptive automation (Dorneich, Ververs, et al., 2012; Feigh et al., 2012; Scerbo, 2001).
For example, Wilson and Russell (2007) had participants supervise unmanned air vehicles (UAVs) that
provided radar images of critical target locations at which weapons had to be released. EEG, eye movements,
and heart rate were monitored and used to train an artificial neural network to recognize low and high mental
workload. On detecting high workload, the operator was aided by automation that slowed the speed of the
relevant UAV, thus giving the operator additional time to complete the targeting task before the vehicle
reached the weapon release point. Compared to a manual condition or to one in which adaptive aiding was
provided randomly, adaptive automation led to a significant improvement in targeting performance.
Physiological measures of operator state can be combined with other environmental and operator
measures to infer when to aid the operator. Ting et al. (2010), for example, used an artificial neural network to
integrate different physiological indexes (heart rate, EEG) with performance measures to trigger adaptive
aiding in a simulated process control task. Inagaki (2008) used physiological measures such as ear and nose
tip temperature and eye fixation data in combination with performance measures to determine when drivers
needed assistance in simulated collision scenarios.
A DARPA research and development program known as Augmented Cognition has been developing
neurophysiological-adaptive systems that could potentially be fielded (Schmorrow, 2005). In one of the
studies funded by this program, Dorneich et al. (2007) used EEG recorded in mobile individuals to estimate
high workload and to drive a communications scheduler that would block incoming messages when high
workload was detected. Similar studies using several EEG parameters to estimate operator workload in real
time using different machine learning techniques to drive adaptive automation have also been reported
(Baldwin & Penaranda, 2012; Christensen et al., 2012; Wang et al., 2012).
How practical are adaptive systems that use physiological measures to assess operator cognitive state?
Clearly, despite their advantages, such measures face challenges such as obtaining artifact-free measurements
in real work environments and user acceptance (Cummings, 2010). A prominent concern with both leading
indicators of performance and assessments of physiological state is that both of these sources require some
time to integrate a sufficient amount of data so that a reliable inference can be made that the capacity to
perform is diminishing (or restored). If adequate time and data are not allowed, then an inference of capacity
change might be wrong, and this could lead to adaptation increasing workload when a decrease is desired, or
vice versa. On the other hand if sufficient time to attain a reliable estimate is used and dynamic changes in
environmental workload are present, then, given the feedback loop shown in Figure 12.4 the resulting lag in
inference could produce closed loop instability, in the sense described with tracking tasks in Chapter 5. That
is, an inference of high (or low) workload could be drawn after a substantial delay, and adaptive aiding
implemented (or removed) at the very time that workload has now diminished (or increased). In this regard it
is important that advocates of closed loop adaptive automation systems endeavor to establish the time required
to make reliable estimates of workload based on EEG or other measures (Christensen et al., 2012).
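The stability concern can be illustrated with a toy closed-loop task manager in the spirit of Figure 12.4. Every parameter below (the smoothing constant, the switching thresholds, and the minimum dwell time) is an assumption chosen for illustration; the point is simply that heavier smoothing buys a more reliable workload estimate at the cost of lag, so a brief spike in workload may be nearly over before the adaptation engages, and the aiding may be withdrawn well after workload has already fallen.

```python
# Minimal sketch of a closed-loop adaptive "task manager": a lagged workload
# estimate drives level-of-automation changes, with hysteresis (two thresholds
# plus a minimum dwell time) to reduce oscillation. All values are illustrative.
def adapt(workload_stream, alpha=0.2, high=0.7, low=0.4, min_dwell=5):
    estimate, level, dwell = 0.5, "low automation", 0
    for raw in workload_stream:
        estimate = alpha * raw + (1 - alpha) * estimate   # moving average -> reliability, but lag
        dwell += 1
        if dwell >= min_dwell:                            # don't switch again too soon
            if level == "low automation" and estimate > high:
                level, dwell = "high automation", 0       # aid the (inferred) overloaded operator
            elif level == "high automation" and estimate < low:
                level, dwell = "low automation", 0        # bring the operator back into the loop
        yield raw, estimate, level

# A brief workload spike: the lagged estimate triggers aiding only near the end of
# the spike and removes it well after the spike has passed.
spike = [0.3] * 10 + [0.9] * 6 + [0.3] * 10
for raw, est, level in adapt(spike):
    print(f"workload={raw:.1f}  estimate={est:.2f}  ->  {level}")
```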

8.3 Who Decides?


The third issue regarding adaptive automation, and perhaps the most controversial one, is the issue of “who
decides” whether to implement or remove automation. That is, in the context of Figure 12.4, who is the “task
manager”? In the previous section, it was implicitly assumed that the machine itself was responsible for
invoking automation, following the signal of one or more of the three inference sources: external conditions,
leading indicators, or workload. We might say that the task manager is itself working at the highest level of
stage 3 automation (Figure 12.2). In contrast, an argument could be made that humans themselves are capable
of monitoring their own workload (or capacity to perform) and making the appropriate choice to invoke or
remove higher automation levels, a concept sometimes referred to as adaptable automation (Ferris et al.,
2010).
Existing data remain ambiguous as to where the choice should lie. One relevant issue is the accuracy of
the humans’ own assessment of their capability to perform. To the extent that humans tend to be
overconfident or inaccurate in this ability (Horrey et al., 2009; see also Chapter 8), particularly in relation to
machine performance at equivalent tasks (Liu et al., 1993), then some caution should be exercised concerning
the wisdom of human choice.
Reinforcing this preference for machine over human choice are the results of the adaptive aiding study by
Kaber and Riley (1999). Using secondary task performance to implement adaptive aiding on a video game
task, Kaber and Riley compared two strategies: a mandating strategy directly implemented the aiding when it
was assumed, by automation, to be desirable (highest stage 3 automation), whereas an advising strategy only
provided the corresponding suggestion (a lower level of stage 3 automation). The authors observed a cost for
the less automated advising strategy, a cost that they attributed to the added workload demands when
operators must monitor their own performance, and then decide whether or not automation was required.
Inagaki (2008) also noted that machine authority to implement adaptive changes when an inference is made
that the human cannot avoid a hazardous situation effectively, while controversial, could be justified in certain
instances. However, such a position is likely to be highly domain specific. Inagaki (2008) suggested that it
may be easier to justify computer authority with everyday users of automation such as car drivers, who are
likely to vary considerably in abilities and skill, than it would be for skilled, expert users such as aircraft pilots
or physicians.
To these formal, data-driven arguments for computer authority as task manager can be added
consideration of some compelling hypothetical scenarios. For example, most people would probably agree
that automation should be responsible for adapting automated steering and slowing (along with alerting)
should a reliable inference be drawn that the driver has fallen asleep, or is otherwise incapacitated. But the
key factor here is reliability; and it would seem that the less reliable the inference, the lower the level or the earlier
the stage at which the automation decision should be, on the stage 2 (information analysis) scale of Figure 12.2. For
example, in this particular case, consider alerting rather than seizing control. By adopting mid levels of this
scale, the designer is thus endorsing a collaborative and cooperative human-machine concept, one well within
the spirit of human-centered automation.
Miller and Parasuraman (2007) also suggested that there exist many situations in which putting the
human in charge of changes in automation functionality can be beneficial. They outlined an architecture for
such “adaptable” control of automation (see Opperman, 1994) called “Playbook,” in which the human can
delegate tasks to automation at either a “hands off ” high level or by specifying various stipulations and
constraints. Parasuraman et al. (2005) reported a study of simulated human-robot interaction in which they
found performance benefits for the Playbook approach to adaptive automation.
The concept of adaptive/adaptable automation is an attractive approach to human-machine
system design, capitalizing on the strengths of human and machine in a dynamic and cooperative fashion
(Winter & Dodou, 2011). The related concept of adjustable autonomy has also been put forward in the field
of human-robot interaction, where the relative merits of machine-directed versus human-directed changes in
the relative autonomy of the robots have been debated (Cummings et al., 2010; Goodrich et al., 2007; Valero-
Gomez et al., 2011).
The adaptive/adaptable automation concept certainly remains in the forefront of the thinking of designers
of many highly automated complex systems (Ahlstrom et al., 2005; Inagaki, 2003; Miller & Parasuraman,
2007; Parasuraman et al., 2007; Valero-Gomez et al., 2011). Yet as we have discussed, there are many issues
that must be addressed before viable systems can become effective or even feasible. Most importantly, these
will depend on a continued and better understanding of the fundamentals of human attention, along with
fascinating areas of human performance theory that have only recently received interest in the human factors
domain—communication, cooperation, and trust.

9. DESIGNING FOR EFFECTIVE HUMAN-AUTOMATION INTERACTION
In the previous pages, we have identified a number of human performance issues that arise when users interact
with automated systems. Many of these issues are prevalent in systems designed purely from a technology-
centered perspective. In contrast, the past two decades of research on human-automation interaction have
pointed to several solutions to these problems (Degani, 2004; Parasuraman, 2000; Sheridan, 2002; Sheridan &
Parasuraman, 2006; Sethumadhavan, 2011). These solutions were identified implicitly in our earlier
discussion of human-automation interaction problems. Many of these can be loosely grouped under the rubric
of human-centered automation (Billings, 1997). It should be noted that these solutions will not necessarily
provide the optimal use of automation from the point of productivity or system performance, but should, if
followed, provide for greater margins of safety, more satisfaction for the human user, and the least disruptive
episodes of “manual recovery” in the instance of system failure.

9.1 Feedback
We saw previously that many cases of accidents and incidents in automated systems have occurred because
human operators were provided poor or no feedback on automation states and behaviors (Norman, 1990).
Accordingly, designers of automation should make efforts to display critical information regarding the current
state of automation, changes in those states (e.g., a switch in automation levels), and the status of the process
being monitored or controlled by the automation (e.g., the continuous variable that is sensed by the automated
alarm). It should be noted that the type of feedback should be carefully thought out; poorly presented or
excessive feedback can be as bad as no feedback at all. In Chapter 4, we discussed some case studies of
successful displays in the context of ecological interface design (Seppelt & Lee, 2007).
One approach to providing feedback to the operator is to use a multi-modal display, so as not
to overload the main sensory channel that the operator uses, which is typically vision (see Chapter 7).
Auditory channels can be considered, and there are examples of the use of auditory feedback to provide
information on system state to enhance performance on primarily visual tasks (Ho & Spence, 2008).
However, as auditory displays grow in sophistication with the advent of auditory “earcons”, speech
synthesizers, etc. (Baldwin, 2012, see Chapter 6), even the auditory channel can become crowded. As a result,
a number of researchers have explored the utility of haptic or tactile displays as feedback channels (Sarter,
2007). For example, Sklar and Sarter (1999) showed that a tactile display worn on the wrist could provide
information on FMS mode changes without disrupting primary flight performance, while improving alert
detection.

9.2 Appropriate Levels and Stages of Automation


Integrating our discussion of trust/dependence with that of stages and levels of automation, we understand that
“more automation” (e.g., later stages and/or higher levels within a stage; that is, a higher degree of automation)
may be a two-edged sword. Typically a higher degree of automation will increase routine performance and/or
decrease workload. (If neither of these is observed, the automation is clearly faulty from a human performance
perspective.) But increasing degree is also likely to increase OOTLUF (by degrading situation awareness)
and, as a consequence, degrade failure management. Thus, the improvements in workload and routine
performance brought about by increasing degree of automation are offset by the loss of SA and failure
response (Wickens, Li, et al., 2010). (This tradeoff is analogous to the changes in signal detection, brought
about by increasing beta, on misses and false alarms.) Furthermore, these tradeoffs would appear to be
enhanced as automation reliability increases. The hypothetical tradeoff of these variables with DOA is shown
in Figure 12.6.

FIGURE 12.6 Hypothetical tradeoff between routine and failure performance, and between workload and the loss of situation awareness, as the
degree of imperfect automation (stages and levels) is increased.

If these performance, workload, and SA functions of degree of automation were predictable and reliable,
they could then be employed to establish the optimum degree of automation to the extent that a designer could
establish the relative weight to be assigned to improving performance (both routine and failure) and reducing
workload. But this has proven to be a challenging task.
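If such functions were known, the weighting exercise might look like the sketch below; the curves and the weights are hypothetical placeholders standing in for the empirical functions of Figure 12.6, not data from any study.

```python
# Sketch of choosing a degree of automation (DOA) by weighting hypothetical
# functions for routine performance, failure performance, workload, and SA loss.
doa_levels = [0, 1, 2, 3, 4]                 # from fully manual to highly automated

def routine_perf(d):  return 0.60 + 0.08 * d       # assumed to improve with DOA
def failure_perf(d):  return 0.90 - 0.03 * d ** 2  # assumed to degrade, increasingly, with DOA
def workload(d):      return 0.80 - 0.15 * d       # assumed to fall with DOA
def sa_loss(d):       return 0.05 * d              # assumed to grow with DOA

weights = {"routine": 1.0, "failure": 1.5, "workload": 1.0, "sa": 0.7}  # designer's priorities

def utility(d):
    return (weights["routine"] * routine_perf(d) + weights["failure"] * failure_perf(d)
            - weights["workload"] * workload(d) - weights["sa"] * sa_loss(d))

best = max(doa_levels, key=utility)
print({d: round(utility(d), 2) for d in doa_levels}, "-> chosen DOA:", best)
# With these made-up curves and weights the best compromise falls at an
# intermediate degree of automation, echoing the shape of Figure 12.6.
```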
There are of course a few studies that have varied stages and/or levels while measuring some of these
critical variables. A classic study was that described in Chapter 6 by Crocoll and Coury (1990), who
contrasted a status display (stage 2 automation) with a command display (stage 3) and found that while the
latter favored routine performance, the former favored failure-management performance. Analogous findings
were obtained by Sarter and Schroeder (2001) when evaluating automation to prevent aircraft icing,
contrasting either displays that showed the inference of where ice was building up (stage 2) or a command
advisor that recommended maneuvers to recover from icing. A classic study examining the tradeoff across
levels of automation was carried out by Endsley and Kiris (1995). They examined the effects of a driving
decision aid in this study and observed that the optimal point on the tradeoff was at a mid-high level of
automation, but not at the highest level.
Rovira et al. (2007) further examined the effects of different levels of automation reliability (60 percent
and 80 percent) and three different levels of decision automation on performance impairment with imperfect
decision automation. The performance cost of inaccurate decision advice was most pronounced at the highest
level of automation (i.e. when a specific recommendation for an optimum decision was given) and when
automation reliability was high.
Wickens, Li, et al. (2010) attempted to integrate in a meta-analysis the collective wisdom of these and
several other studies that have varied the degree of imperfect automation, while assessing two or more of the
four critical variables (performance under routine and failure-mode situations, workload, and situation
awareness, as shown in Figure 12.6) (e.g., Manzey, Reichenbach, & Onnasch, 2012;
Sethumadhavan, 2009; Kaber, Onal, & Endsley, 2000). The results of the meta-analysis revealed consistent
trends for performance across studies: routine performance improved and failure performance degraded as the
degree of automation increased. However, the results for decreasing workload and decreasing situation
awareness were less clear cut (in part given the lack of studies that assessed SA across different degrees of
imperfect automation). One finding, however, bears particular note. Those studies in which higher degrees of
automation were shown to have higher situation awareness were ones in which both routine and failure
performance improved as degree of automation increased. We might infer that these were studies in which
researchers paid particular attention to effective display design and transparency of feedback, a point we
addressed in detail in the previous section.
In addition to the fact that good feedback can probably move the optimal point farther to the right of
Figure 12.6, a case can also be made that, when the risks of imperfection and human/automation error are
high, the optimum point should be moved more to the left (Parasuraman, Sheridan, & Wickens, 2000), but
when the time pressure is extremely high, such that a human operator decision may not be possible in
time (such as the decision to shut down a faulty engine on takeoff; Inagaki, 2003), the optimal point for an
automation system to support the pilot in the decision should be moved farther to the right.

9.3 Designing for Human-Automation “Etiquette”


As discussed previously, trust plays an important role in determining the degree to which human operators
depend on automated systems. Trust has both cognitive and affective properties. The latter takes on a more
prominent role as automated systems increase in their “intelligence” and in their ability to interact with
humans in ways that mimic human-human interaction—e.g., through voice and “face to face”
communications. Nass and colleagues (Nass et al., 1995; Reeves & Nass, 1996) have shown that people often
respond socially to computers in ways similar to how they interact normally with other people. Because some
forms of automation can appear to be endowed with greater intelligence and with other human-like properties,
it is important to ask whether they should be designed to act in socially appropriate ways with
humans.
Our interaction with others is typically governed by rules that are implicitly understood and adhered to in
most settings, whether formal or informal. Such etiquette, or adherence to an accepted but frequently implicit
code of behavior between individuals in any social setting, may also be important for effective human-
computer relations. Parasuraman and Miller (2004) showed that etiquette can influence the efficiency with
which operators make diagnostic decisions when using automation. They tested participants on the MATB
simulation with the aircraft automated fault management system that provided advisories on possible engine
faults. In one condition the automation followed good etiquette, i.e., provided participants with a pre-warning
and then waited for them to complete what they were doing before issuing an advisory. In a second condition,
the automation displayed poor etiquette by not warning the participants and not waiting for them to finish
what they were doing. Diagnostic accuracy was about 20 percent higher in the good etiquette condition than in
the poor etiquette condition. Dorneich, Ververs, et al. (2012) reported similar benefits of good automation
etiquette on multitasking performance. They designed an adaptive automation system that managed the flow
of messages to the user by directing only high priority messages to the user when the user’s workload was
high and storing low-priority messages for later viewing, in a manner similar to a human assistant who
follows good etiquette.
There are also aspects to etiquette other than knowing when and when not to interrupt. Grice (1975)
described the behavioral practices that allow for acceptable and efficient interaction between people. For
example, in conversation with another person we typically try to avoid being obscure or ambiguous if we
want to communicate effectively. Hayes and Miller (2011) suggested that automated systems should similarly
avoid obscurity or ambiguity. In this view, automation that is designed to follow such agreed-upon axioms of
etiquette tends to be accepted and liked by human operators.

9.4 Calibrating Operator Trust: Display Design and Training


As we have seen, poorly calibrated trust is a major contributor to inefficient human-automation interaction.
Human users exhibit both under- and over-trust. Both can be remediated through attention to automation
design and training. We address first the possible solutions to mistrust, and then to the challenges of over-trust
and complacency.

9.4.1 MITIGATING MISTRUST The material in Section 7.2, by identifying the sources of mistrust in automation,
has implicitly suggested solutions. For example, simplifying the complexity of automation functionality
and/or making it more “transparent” to the user via good displays should reduce mistrust. So also should
increased training of the human supervisor of the functioning of those algorithms. To guard against the
mistrust of false-alarm prone alerts, we refer the reader back to the points made in Chapter 2 (Section 4.3) but
wish to elaborate on two of them: the issue of training and that of likelihood alerts.
First, with regard to training, it is necessary for users of alarm systems to realize that, in conditions in
which system failures may be subtle yet catastrophic (so that early warnings are desirable) and in which the
base rate of failures is quite low, alarm false alarms are an inevitable consequence to be tolerated
(Parasuraman et al., 1997).
Second, regarding displays, there is evidence that the alarm false alarm problem can be mitigated to an
extent through the use of likelihood displays (Sorkin et al., 1988). Such displays provide two or more graded
levels of certainty that a critical condition exists. In essence, such a concept allows the system to say “I’m not
sure” rather than just blurting out a full alarm or nothing at all (and setting a risky criterion to avoid misses).
As we learned in Chapter 2, allowing human signal detectors to express their confidence in “signal-present” at
more than one level improves human detection performance. Similarly, allowing the alarm system a
corresponding resolution in confidence provides a corresponding improvement in the sensitivity of the human
and system together (Sorkin et al., 1988).
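A graded alert of this kind might be sketched as follows; the two thresholds and the wording of the levels are illustrative assumptions rather than parameters from Sorkin et al. (1988).

```python
# Minimal sketch of a likelihood (graded) alert: rather than a single all-or-none
# threshold, the evidence is mapped onto multiple confidence bands.
def graded_alert(evidence, caution_threshold=0.5, warning_threshold=1.5):
    """Map a continuous evidence value (e.g., a sensor reading in z-units) to an alert level."""
    if evidence >= warning_threshold:
        return "WARNING: critical condition likely"
    if evidence >= caution_threshold:
        return "CAUTION: possible condition -- check the raw data"
    return "no alert"

for reading in (0.2, 0.9, 2.1):
    print(f"evidence = {reading:.1f} -> {graded_alert(reading)}")
```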
In a field study of a homeland security threat detection system, radiation portals at border crossings,
Sanquist et al. (2008) showed that using a likelihood alarm display and a Bayesian analysis could reduce the
false alarm problem. Monitors for detecting such radioactive sources (e.g., a “dirty bomb”) that are currently
deployed at border crossings are plagued by “nuisance alarms”—alarms that occur because of objects that are
radioactive but are not true threats, such as fertilizer, pet litter, or irradiated fruit. Sanquist et al. first estimated
the (very low) base rate of true threats (e.g., weapons-grade plutonium) and provided design criteria for
increasing alarm positive predictive value (PPV)—the probability that, given an alarm, a true threat exists.
They also showed that the PPV could be increased (and nuisance alarms reduced) by including cargo manifest
information (e.g., whether a truck was carrying fertilizer) in the detection system algorithm. Finally, they
showed that the use of a three-level likelihood display (e.g., “Pass: no material of concern”; “Alert: naturally
occurring radiation material”; and “Alarm: potential threat of radiation material”) also substantially reduced
false alerts.
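The Bayesian core of such an analysis is compact enough to sketch; the base rate, hit rate, and false alarm rates below are illustrative assumptions (if anything, the true-threat base rate at a portal is lower still), not figures from Sanquist et al. (2008).

```python
# Sketch of positive predictive value (PPV): the probability that a threat is
# present given that the alarm has sounded, computed by Bayes' rule.
def ppv(base_rate, hit_rate, false_alarm_rate):
    p_alarm = hit_rate * base_rate + false_alarm_rate * (1 - base_rate)
    return hit_rate * base_rate / p_alarm

# With a very rare threat, even a sensitive detector produces mostly nuisance alarms.
print(f"PPV = {ppv(base_rate=1e-6, hit_rate=0.99, false_alarm_rate=0.01):.5f}")    # ~0.0001

# Folding in contextual information such as a cargo manifest can be thought of as
# reducing the false alarm rate (declared benign sources no longer trigger alarms),
# which raises the PPV of the alarms that remain.
print(f"PPV = {ppv(base_rate=1e-6, hit_rate=0.99, false_alarm_rate=0.0001):.5f}")  # ~0.01
```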

9.4.2 MITIGATING OVER-TRUST AND COMPLACENCY In order to mitigate the trust calibration phenomenon of over-
trust, there is some evidence that providing automation reliability information can help users calibrate their
trust in and dependence on an automated combat identification system (Neyedli et al., 2011; Wang et al., 2009).
As we suggested implicitly in Section 7.1.1, by comparing first-failure responses with subsequent responses to
the failures of automation (e.g., Molloy & Parasuraman, 1996; Merlo et al., 2003; Manzey, Reichenbach,
& Onnasch, 2012; Wickens et al., 2009), one of the best techniques is to get the “first failure out of the
way” with training or practice on the automation, before real-time use is undertaken. Hence, the “first failure” in
that real-time use is now actually a “subsequent failure” after a certain amount of mistrust in the automation
has been allowed to accumulate.
Importantly, however, simply informing the automation user that a failure could occur (Skitka, Mosier, &
Burdick, 2000) is far less effective than actually experiencing that failure, a conclusion echoing that
regarding debiasing in decision making, discussed in Chapter 8, Section 9.1 (Larrick, 2006). This conclusion
is highlighted by the findings of Bahner, Huper, and Manzey (2008), who examined whether experience with
automation failures could reduce complacency. They tested two groups of participants in a process control
simulation in which fault management automation provided advisories on system faults. One group was
simply informed that the automation would work highly reliably, although not perfectly, and that they should
verify each diagnosis before accepting it by checking the information sources pertaining to the diagnosis
(“information group”). The other group received the same information but was additionally exposed to a few
automation failures (incorrect diagnoses) during training (“experience group”). Following the work of Lee and
Moray (1992), Bahner and colleagues found that experience with imperfect automation reduced overall trust
and thereby the degree of complacency, which they measured by the number of information parameters
checked prior to accepting a diagnosis. Participants in the “information group” sampled fewer information
parameters than the “experience group.” This finding indicates that training with exposure to automation
failures can reduce complacency.

10. CONCLUSIONS
Automated systems, supporting or replacing all stages of human information processing, are found in all
aspects of work and life—in manufacturing, power generation, health care, transportation, offices, homes, and
in many other industries. In many such environments, automation has improved efficiency, enhanced safety,
and reduced operator workload. At the same time, automation has also introduced new problems and changed
the nature of cognitive work of human operators, which at times has led to incidents and accidents. Several
human performance issues have arisen because automated systems have often been designed from a
technology-centered perspective. These include unbalanced mental workload, reduced situation awareness,
and uncalibrated trust, both under-trust and over-trust. A number of approaches to designing for effective
human-automation interaction are possible. These include using appropriate levels and stages of automation,
reducing automation complexity, providing feedback, and training for calibrated trust. Adaptive/adaptable
automation may also help in reducing some of the human performance costs of automation, although further
work needs to be conducted on its practical feasibility as a design option.

Key Terms

adaptive automation 395


adjustable autonomy 399
automation bias 392
automation dependence 388
automation etiquette 402
automation reliability 388
automation surprises 387
calibration curve 389
change blindness 397
complacency (over-trust) 388
compliance 391
cry wolf effect (under trust) 388
first failure effect 391
function allocation 396
generation effect 390
human-centered automation 378
levels of automation 381
likelihood displays 403
mental workload 397
multi-modal 400
out of the loop unfamiliarity 393
reliance 391
stages of automation 382
trust 388

EPILOGUE

Over the course of the 11 chapters that have addressed specific components of performance, a number of
themes emerged that characterized findings and principles across more than a single chapter. By virtue of their
repeated occurrences, these are themes that we believe are particularly important for understanding human
performance strengths and limitations in the workplace. Our list below is not exhaustive, and we would be
pleased if readers might like to augment this with thoughts of their own.
1. Working memory limitations. Repeatedly, we have noted that working memory is a very constraining
limit in its own right (as the phone dialing example illustrates), but such limits also drive other
processing constraints and principles, such as the costs of having material to be compared separated in
space and time. Effective use of working memory is effortful. Effort is a limited resource, and the
human’s natural tendency to conserve it can harm processes based on working memory, creating
errors, delaying performance, or imposing cognitive workload.
2. The 2 C’s: compatibility and confusion. Both of these C concepts made repeated appearances:
compatibility in terms of display-control (stimulus-response), ecological, modality, and cognitive
aspects. The key general point this raises is the interaction between stages of information processing.
No stage can easily be treated in isolation because the mapping between them is so important.
  While compatibility has long been a well known principle in engineering psychology, the concept
of similarity-based confusion does not enjoy such a rich history, yet is every bit as critical in
characterizing human performance. If two things look, sound, or feel similar and could both occur in
the same context, there is a high likelihood that the wrong one will be inappropriately perceived, its
associated response triggered, or that they will be confused in working memory, sometimes with
devastating consequences. Thus, where similarity represents the bad here, discriminability represents the good.
3. Tradeoffs. As with the two C’s, tradeoffs also have shown two manifestations. First, people often have
a cognitive set that can trade off two variables or kinds of processes in human performance. For
example, there are misses versus false alarms in signal detection (via beta), speed versus accuracy in many
tasks, task A versus task B in time sharing, effort conservation versus accuracy in decision making and
search, and balancing probability versus value in choice. Human performance theory is critical in
helping to understand these strategic tradeoffs, what drives people along the tradeoff function, where
they should operate versus where they do operate, and how to measure the quality of human
performance across the function. In a sense, the receiver operating characteristic (ROC), the speed-
accuracy operating characteristic (SAOC), and the performance operating characteristic (POC) offer
explicit representations of such tradeoffs.
  Second, principles of design often trade off, as a given design may satisfy one principle while
violating another. As examples, consider the alert threshold (trading off reliance versus compliance),
narrow deep menus versus shallow broad ones (trading off cognitive load for visual search), designing
for consistency versus compatibility across a set of display-control mappings, designing for close
proximity (reducing information access effort) or greater separation (reducing readout clutter), or
designing automation to increase situation awareness or reduce workload. The fact that such tradeoffs
exist amplifies the need for the computational models that can help a designer understand the balance
of one function versus the other, or whether indeed there may be a “sweet spot” in the tradeoff that can
provide either a free lunch, or at least a cheap one.
4. Expectancy. Expectancy has made repeated appearances across chapters, exerting huge impacts on
what and how we see and hear, and how or whether we respond.
5. Stage of Processing. Of course stages of processing represent a hallmark of the information
processing approach, and we saw four (and sometimes just three or two) highlighted across many of
our chapters and applications, from signal detection theory (2: sensitivity and response criterion) to
displays (2: status and command) to situation awareness (3: levels 1, 2, and 3) to decision making (3: cue
cue perception, situation assessment, choice), to transfer of training (2: stimulus similarity, response
similarity) to resources (2: perceptual-cognitive, action), to automation (all four stages of automation
assisting the four information processing stages in the full model). On the one hand, we can highlight
the very real physiological distinctions between these, as discussed in several places in Chapters 10 and
11, as well as the different design implications that may fall from whether a system imposes limits at
one stage or the other. On the other hand, however, the stage distinction does not imply that the stages
must run in purely sequential fashion, nor does it imply that processing “starts” at any particular stage.
Indeed, the prominence of feedback loops is exactly what allows the cycle of human information
processing to start anywhere and carry on continuously if a task, such as manual control, requires it.
  Nevertheless, just as in many work domains, it is characteristic that certain things must ideally
happen before others (for example, reading the safety information before finding the power switch and
then using the dangerous equipment), so in human performance, ideal performance often does proceed
in certain stage-constrained ways; and when it does not, as when an action is taken without a prior
careful evaluation of the situation, poor performance can result.
6. Emphasis on perception-cognition. The reader will note that while we articulated a four-stage model,
seven of the chapters address primarily the “early” stages of perception and cognition, only two focus
heavily upon action selection (Chapters 8 and 9), and only parts of Chapters 5 and 9 address action
execution. This emphasis, in part, reflects the evolution of technology as we described in Chapter 12,
where progressively more functions in the workplace are offloaded to machines, which can perform
response execution tasks with great facility. As a consequence, the relative contributions of human
perception and cognition to total system performance have grown accordingly.
7. From principles to design, and back again. As we noted at the outset, this is not intended to be a
human factors textbook for how to build “human centered things.” The principles we have articulated
here must be coupled with careful task analysis and good engineering in order to assure that they are
well expressed in designs that support fast, accurate, and/or low workload performance. But we hope
that the reader will understand how those principles can be applied, and then will follow up with a
deep understanding of the human factors of design from other sources. At the same time, we urge our
readers to look around and see how the principles may be embodied in examples of both good and bad
design (or annoyances) in their everyday life. Finally, we hope that the principles articulated here (as
well as the findings from human factors) can feed back to the basic researcher in psychology and
cognitive science, to highlight ways in which their theories have been successful, found wanting, or
may need elaboration. As such, the full feedback loop that is human factors will be realized.

REFERENCES

Aaslid, R. (1986). Transcranial Doppler examination techniques. In R. Aaslid (Ed.), Transcranial Doppler
Sonography (pp. 39–59). New York: Springer-Verlag.
Ackerman, P., Schneider, W., & Wickens, C. D. (1984). Deciding the existence of a time-sharing ability: A
combined methodological and theoretical approach. Human Factors, 26, 71–82.
Adamic, E. J., Behre, J., & Dyre, B. P. (2010). Attentional locus and ground dominance in control of speed
during low altitude flight. In Proceedings of the Human Factors and Ergonomics Society 54th Annual
Meeting (pp. 1,665–1,669). Santa Monica, CA: Human Factors and Ergonomics Society.
Adams, B. D., Webb, R. D. G., Angel, H. A., & Bryant, D. J. (2003). Development of theories of collective
and cognitive skill retention. DRDC Contractor Report CR-2003-078. Toronto: Defence Research and
Development Canada.
Adams, J. A., & Hufford, L. E. (1962). Contribution of a part-task trainer to the learning and relearning of a
time-shared flight maneuver. Human Factors, 4, 159–170.
Adams, M. J., Tenney, Y. J. and Pew, R. W. (1995). Situation awareness and the cognitive management of
complex systems. Human Factors, 37, 85–104.
Adelman, L., Bresnick, T., Black, P., Marvin, F., & Sak, S. (1996). Research with Patriot air defense
officers: examining information order effects. Human Factors, 38, 250–261.
Ahlstrom, V., Longo, M., & Truitt, T. (2005). Human factors design guide (DOT/FAA/CT-02-11). Atlantic
City, NJ: Federal Aviation Administration.
Alexander, A. L., Wickens, C. D., & Hardy, T. J. (2005). Synthetic vision systems: The effects of guidance
symbology, display size, and field of view. Human Factors, 47, 693–707.
Alexander, A. L., Wickens, C. D., & Merwin, D. H. (2005). Perspective and coplanar cockpit displays of
traffic information: Implications for maneuver choice, flight safety, and mental workload. International
Journal of Aviation Psychology, 15, 1–21.
Algom, D., Dekel, A., & Pansky, A. (1996). The perception of number from the separability of the stimulus:
The Stroop effect revisited. Memory and Cognition, 24, 557–572.
Alkov, R., Borowsky, M., & Gaynor, M. (1982). Stress coping and US Navy aircrew factor mishap. Aviation,
Space, and Environmental Medicine, 53, 1,112–1,115.
Allen, P. A., Groth, K. E., Grabbe, J. W., Smith, A. F., Pickle, J. L., & Madden, D. J. (2002). Differential
age effects for case and hue mixing in visual word recognition. Psychology and Aging, 17, 622–635.
Allen, G. (1982). Probability judgment in weather forecasting. In Ninth Conference in Weather Forecasting
and Analysis. Boston: American Meteorological Society.
Allison, R. S., Gillam, B. J., & Becellio, E. (2009). Binocular depth discrimination and estimation beyond
interaction space. Journal of Vision, 9(1):10, 1–14.
Allport, D. A. (1993). Attention and control: Have we been asking the wrong questions? A critical review of
the last 25 years. In D. E. Meyer & S. Kornblum (Eds.), Attention and performance XIV: A silver jubilee.
Cambridge, MA: MIT Press.
Allport, D. A., Styles, E. A., & Hsieh, S. (1994). Shifting intentional set: Exploring the dynamic control of
tasks. In C. Umilta & M. Moscovitch (Eds.), Attention and performance XV (pp. 421–452). Cambridge,
MA: MIT Press.
Alluisi, E., Muller, P. I., & Fitts, P. M. (1957). An information analysis of verbal and motor response in a
force-paced serial task. Journal of Experimental Psychology, 53, 153–158.
Altmann, E. M., & Trafton, J. G. (2002). Memory for goals: An activation-based model. Cognitive Science,
23, 39–83.

Amadieu, F., Mariné, C., & Laimay, C. (2011). The attention-guiding effect and cognitive load in the
comprehension of animations. Computers in Human Behavior, 27, 36–40.
Amer, T. S. (2005). Bias due to visual illusion in the graphical presentation of accounting information. Journal
of Information Systems, 19, 1–18.
Amishav, R., & Kimchi, R. (2010). Perceptual integrality of componential and configural information in
faces. Psychonomic Bulletin & Review, 17, 743–748.
Anderson, J. R. (1981). Cognitive skills and their acquisition. Hillsdale, NJ: Erlbaum.
Anderson, J. R. (1991). Is human cognition adaptive? Behavioral and Brain Sciences, 14, 471–484.
Anderson, J. R. (1993). Rules of the mind. Hillsdale, NJ: Erlbaum.
Anderson, J. R. (1996). ACT: A simple theory of complex cognition. American Psychologist, 51, 355–365.
Anderson, M. C. (2003). Rethinking interference theory: Executive control and the mechanisms of forgetting.
Journal of Memory and Language, 49, 415–445.
Ando, J., Ono, Y., & Wright, M. J. (2001). Genetic structure of spatial and verbal working memory.
Behavioral Genetics, 31, 615–624.
Andre, A. D., & Wickens, C. D. (1992). Compatibility and consistency in display-control systems:
Implications for aircraft decision aid design. Human Factors, 34, 639–653.
Andre, A. D., & Wickens, C. D. (1995). When users want what’s not best for them. Ergonomics in Design,
October, 10–14.
Andre, A. D., Wickens, C. D., & Goldwasser, J. B. (1990). Compatibility and consistency in display-control
systems: Implications for decision aid design. University of Illinois Institute of Aviation Technical Report
(ARL-90-13/NASA-A3I-90-2). Savoy, IL: Aviation Research Laboratory.
Andre, A. D., Haskell, I. D., & Wickens, C. D. (1991). S-R compatibility effects with orthogonal stimulus
and response dimensions. In Proceedings of the 35th Annual Meeting of the Human Factors Society (pp.
1546-1550). Santa Monica, CA: Human Factors Society.
Andre, A. D., Wickens, C. D., Moorman, L., & Boschelli, M. M. (1991). Display formatting techniques for
improving situation awareness in the aircraft cockpit. International Journal of Aviation Psychology, 1, 205–
218.
ANSI (1997). Methods for calculation of the speech intelligibility index, S3.5–1997. New York: American
National Standards Institute.
Antonijevic, S. (2008). From text to gesture online: A microethnographic analysis of nonverbal
communication in the Second Life virtual environment. Information, Communication & Society, 11(2),
221–238.
Aretz, A. J. (1991). The design of electronic map displays. Human Factors, 33, 85–101.
Aretz, A. J., & Wickens, C. D. (1992). The mental rotation of map displays. Human Performance, 5, 303–
328.
Arkes, H. R., & Blumer, C. (1985). The psychology of sunk cost. Organizational Behavior and Human
Performance, 35, 129–140.
Arkes, H. R., & Harkness, A. R. (1980). Effect of making a diagnosis on subsequent recognition of
symptoms. Journal of Experimental Psychology: Human Learning and Memory, 6, 568–575.
Arnott, D. (2006). Cognitive biases and decision support systems development: a design science approach.
Information Systems Journal, 16, 55–78.
Arthur, J. J., Prinzel, L. J., Kramer, L. J., & Bailey, R. E. (2006). Dynamic tunnel usability study: Format
recommendations for synthetic vision system primary flight displays. NASA Langley Research Center,
Technical Report TM-2006-214272. Hampton, VA: National Aeronautics and Space Administration.
Arthur, W. Jr., Bennett, W., Jr., Stanush, P. L., & McNelly, T. L. (1998). Factors that influence skill decay
and retention: A quantitative review and analysis. Human Performance, 11, 57–101.
Ashby, F. G., & Lee, W. W. (1991). Predicting similarity and categorization from identification. Journal of
Experimental Psychology: General, 120, 150–172.

Ashby, F. G., & Maddox, W. T. (1994). A response time theory of perceptual separability and perceptual
integrality in speeded classification. Journal of Mathematical Psychology, 33, 423–466.
Atchley, P., & Chan, M. (2011). Potential benefits of concurrent task engagement to maintain vigilance: A
driving simulator study. Human Factors, 53, 3–12.
Avery, B., Sandor, C., & Thomas, B. H. (2009). Improving spatial perception for augmented reality x-ray
vision. In IEEE Virtual Reality 2009 Proceedings (pp. 79–82). New York: Institute of Electrical and
Electronics Engineers.
Ayaz, H., Shewokis, P. A., Bunce, S., Izzetoglu, K., Willems, B., & Onaral, B. (2012). Optical brain
monitoring for operator training and mental workload assessment. NeuroImage, 59, 36–47.
Ayres, T. J. (2006). Fifty years of warning researchers. In Proceedings of the Human Factors and Ergonomics
Society 50th Annual Meeting (pp. 1,794–1,797). Santa Monica, CA: Human Factors and Ergonomics
Society.
Azuma, R. T. (2001). Augmented reality: Approaches and technical challenges. In W. Barfield & T. Caudell
(Eds.), Fundamentals of wearable computers and augmented reality (pp. 27–63). Mahwah, NJ: Erlbaum.
Baber, C. (1997). Beyond the desktop. San Diego: Academic Press.
Baber, C., Morin, C., Parekh, M., Cahillane, M., & Houghton, R. (2011). Multimodal control of sensors on
multiple simulated unmanned vehicles. Ergonomics, 54, 792–805.
Backs, R. W., Lennerman, J. K., Wetzel, J. M., & Green, P. (2003). Cardiac measures of driver workload
during simulated driving with and without visual occlusion. Human Factors, 45, 525–538.
Baddeley, A. (1966). The capacity for generating information by randomization. Quarterly Journal of
Experimental Psychology, 18, 119–130.
Baddeley, A. D. (1972). Selective attention and performance in dangerous environments. British Journal of
Psychology, 63, 537–546.
Baddeley, A. D. (1986). Working memory. Oxford: Clarendon Press.
Baddeley, A. D. (1990). Human memory: Theory and practice. Boston, MA: Allyn and Bacon.
Baddeley, A. D. (1993). Working memory or working attention? In A. Baddeley & L. Weiskrantz (Eds.),
Attention: Selection, awareness, and control. A tribute to Donald Broadbent (pp. 152–170). Oxford:
Oxford University Press.
Baddeley, A. D. (1995). Working memory. In M. S. Gazzaniga et al. (Eds.), The cognitive neurosciences (pp.
755–784). Cambridge, MA: MIT Press.
Baddeley, A. D. (1996). Exploring the central executive. Quarterly Journal of Experimental Psychology, 49A,
5–28.
Baddeley, A. D. (2003). Working memory: Looking back and looking forward. Nature Reviews Neuroscience,
4, 829–839.
Baddeley, A. D. (2007). Working memory, thought and action. Oxford: Oxford University Press.
Baddeley, A. D., & Colquhoun, W. P. (1969). Signal probability and vigilance: A reappraisal of the “signal
rate” effect. British Journal of Psychology, 60, 169–178.
Baddeley, A. D., & Hitch, G. (1974). Working memory. In G. Bower (Ed.), Recent advances in learning and
motivation (vol. 8). New York: Academic Press.
Baddeley, A. D., Chincotta, D., & Adlam, A. (2001). Working memory and the control of action: Evidence
from task switching. Journal of Experimental Psychology: General, 130, 641–657.
Baddeley, A. D., Hitch, G. J., and Allen, R. J. (2009). Working memory and binding in sentence recall.
Journal of Memory and Language, 61, 438–456.
Bagheri, N., & Jamieson, G. A. (2004). Considering subjective trust and monitoring behavior in assessing
automation-induced “complacency.” In D. A. Vicenzi, M., Mouloua, & P. A. Hancock (Eds.), Human
performance, situation awareness, and automation: Current research and trends (pp. 54–59). Mahwah, NJ:

Erlbaum.
Bahner, E., Huper, A. D., & Manzey, D. (2008). Misuse of automated decision aids: Complacency,
automation bias and the impact of training experience. International Journal of Human-Computer Studies,
66, 688–699.
Bahrick, H. P., Noble, M., & Fitts, P. M. (1954). Extra task performance as a measure of learning a primary
task. Journal of Experimental Psychology, 48, 298–302.
Bahrick, H. P., & Shelly, C. (1958). Time-sharing as an index of automization. Journal of Experimental
Psychology, 56, 288–293.
Bailey, B. P., & Iqbal, S. T. (2008). Understanding changes in mental workload during execution of goal-
directed tasks and its application for interruption management. ACM Transactions on Computer-Human
Interaction, 14(4). 21:1–28.
Bailey, B. P., & Konstan, J. A. (2006). On the need for attention-aware systems: measuring effects of
interruption on task performance, error rate, and affective state. Computers in Human Behavior, 23, 685–
708.
Bailey, N., & Scerbo, M. S. (2007). Automation-induced complacency for monitoring highly reliable systems:
The role of task complexity, system experience, and operator trust. Theoretical Issues in Ergonomics
Science, 8, 321–348.
Bailey, R. W. (1989). Human performance engineering: Using human factors/ergonomics to achieve
computer system usability (2nd Ed.). Englewood Cliffs, NJ: Prentice Hall.
Bainbridge, L. (1983). Ironies of automation. Automatica, 19(6), 775–779.
Baker, C. H. (1961). Maintaining the level of vigilance by means of knowledge of results about a secondary
vigilance task. Ergonomics, 4, 311–316.
Baldwin, C. L. (2012). Auditory cognition and human performance: Research and applications. New York:
CRC Press.
Baldwin, C. L., & Coyne, J. (2005). Dissociable aspects of mental workload: Examinations of the P300 ERP
component and performance assessments. Psychologia, 48, 102–119.
Baldwin, C. L., & Penaranda, B. (2012). Adaptive training using an artificial neural network and EEG
metrics for within-and cross-task workload classification. NeuroImage, 59, 48–56.
Balla, J. (1980). Logical thinking and the diagnostic process. Methodology and Information in Medicine, 19,
88–92.
Balla, J. (1982). The use of critical cues and prior probability in concept identification. Methodology and
Information in Medicine, 21, 9–14.
Ballard, D. H., Hayhoe, M. M., & Pelz, J. B. (1995). Memory representation in natural tasks. Journal of
Cognitive Neuroscience, 7(1), 66–86.
Banbury, S., & Berry, D. C. (1997). Habituation and dishabituation to speech and office noise. Journal of
Experimental Psychology: Applied, 3, 1–16.
Banbury, S., & Berry, D. C. (1998). The disruption of office-related tasks by speech and office noise. British
Journal of Psychology, 89, 499–517.
Banbury, S., & Berry, D. C. (2005). Office noise and employee concentration: Identifying causes of
disruption and potential improvements. Ergonomics, 48, 25–37.
Banbury, S., Croft, D. G., Macken, W. J., & Jones, D. M. (2004). A cognitive streaming account of situation
awareness. In S. Banbury & S. Tremblay (Eds.), A cognitive approach to situation awareness: Theory and
application (pp. 117–134). Aldershot, UK: Ashgate.
Banbury, S., & Tremblay, S. (Eds.) (2004). A cognitive approach to situation awareness: Theory and
application. Aldershot: Ashgate.
Banbury, S., Dudfield, H., & Lodge, M. (2007). FASA: Development and validation of a scale to measure
factors affecting commercial airline pilot Situation Awareness. International Journal of Aviation
Psychology, 17, 131–152.

Banbury, S., Fricker, L., Emery, L., & Tremblay, S. (2003). Using auditory streaming to reduce disruption of
serial memory by extraneous auditory warnings. Journal of Experimental Psychology: Applied, 9, 12–29.
Banbury, S., Jones, D. M., & Berry, D. C. (1998). Extending the ‘irrelevant sound effect’: The effects of
extraneous sound on performance in the office and on the flight deck. In Proceedings of the 7th
International Congress on Noise as a Public Health Problem, Sydney, Australia.
Banbury, S., Macken, W. J., Tremblay, S., & Jones, D. M. (2001). Auditory distraction and short-term
memory: Phenomena and practical implications. Human Factors, 45, 12–29.
Banbury, S., Selcon, S. J., & McCrerie, C. M. (1997). New light through old windows: The role of cognitive
compatibility in aircraft dial design. In Proceedings of the Human Factors and Ergonomics Society 41st
Annual Meeting (pp. 56–60). Santa Monica, CA: Human Factors and Ergonomics Society.
Banich, M. T. (2009). Executive function: The search for an integrated account. Current Directions in
Psychological Science, 18, 89–93.
Barclay, R. L., Vicari, J. J., Doughty, A. S., Johanson, J. F., & Greenlaw, R. L. (2006). Colonoscopic
withdrawal times and adenoma detection during screening colonoscopy. New England Journal of Medicine,
355, 2,533–2,541.
Barfield, W. (1997). Skilled performance on software as a function of domain expertise and program
organization. Perceptual and Motor Skills, 85, 1,471–1,480.
Barnes, L. R., Gruntfest, E. C., Hayden, M. H., Schultz, D. M., & Benight, C. (2007). False alarms and close
calls: A conceptual model of warning accuracy. Weather and Forecasting, 22, 1,140–1,147.
Barnes, M., & Jentsch, F. (Eds.) (2010). Human-robot interactions in future military operations. Farnham,
Surrey, UK: Ashgate.
Barnett, B. J. (1990). Aiding type and format compatibility for decision aid interface design. In Proceedings
of the 34th Annual Meeting of the Human Factors Society (pp. 1,552–1,556). Santa Monica, CA: Human
Factors Society.
Barnett, B. J., & Wickens, C. D. (1988). Display proximity in multicue information integration: The benefit
of boxes. Human Factors, 30, 15–24.
Barr, R. A., & Giambra, L. M. (1990). Age-related decrement in auditory selective attention. Psychology and
Aging, 5, 597–599.
Barrouillet, P., Bernardin, S., & Camos, V. (2004). Time constraints and resource sharing in adults’ working
memory spans. Journal of Experimental Psychology: General, 133, 83–100.
Barsalou, L. W. (2008). Grounded cognition. Annual Review of Psychology, 59, 617–645.
Barton, P. H. (1986). The development of a new keyboard for outward sorting foreign mail. IMechE, 57–63.
Bartram, D. J. (1980). Comprehending spatial information: The relative efficiency of different methods of
presenting information about bus routes. Journal of Applied Psychology, 65, 103–110.
Bastardi, A., Uhlman, E., & Ross, L. (2011). Belief, desire and the motivational evaluation of scientific
evidence. Psychological Science, 22, 731–732.
Bateman, S., Mandryk, R. L., Gutwin, C., Genest, A., McDine, D., & Brooks, C. (2010). Useful junk? The
effects of visual embellishment on comprehension and memorability of charts. In Proceedings of the 28th
International Conference on Human Factors in Computing Systems CHI 2010 (pp. 2,573–2,582). New
York: Association for Computing Machinery.
Bates, D. W., Cohen, M., Leape, L. L., Overhage, J. M., Shabot, M. M., & Sheridan, T. (2001). Reducing
the frequency of errors in medicine using information technology. Journal of the American Medical
Informatics Association, 8, 299–308.
Bates, E., & Fitzpatrick, D. (2010). Spoken mathematics using prosody, earcons and spearcons. In K.
Miesenberger et al. (Eds.), Proceedings of the ICCHP 2010, Part II, LNCS 6180, 407–414.
Bazerman, M. (1998). Judgment in managerial decision making (4th Ed.). New York: Wiley.
Beaman, C. P. (2005). Auditory distraction from low-intensity noise: A review of the consequences for
learning and workplace environments. Applied Cognitive Psychology, 19, 1,041–1,064.

Beatty, J. (1982). Task-evoked pupillary responses, processing load, and the structure of processing resources.
Psychological Bulletin, 91, 276–292.
Beck, M. R., Lohrenz, M. C., & Trafton, J. G. (2010). Measuring search efficiency in complex visual search
tasks: Global and local clutter. Journal of Experimental Psychology: Applied, 16, 238–250.
Beck, M. R., Peterson, M. S., & Angelone, B. L. (2007). The roles of encoding, retrieval, and awareness in
change detection. Memory & Cognition, 35, 610–620.
Becker, A. B., Warm, J. S., Dember, W. N., & Hancock, P. A. (1995). Effects of jet engine noise and
performance feedback on perceived workload in a monitoring task. International Journal of Aviation
Psychology, 5, 49–62.
Becker, R., & Cleveland, W. (1987). Brushing scatterplots. Technometrics, 29(2), 127–142.
Bederson, B. B., Hollan, J. D., Stewart, J., Rogers, D., Vick, D., Ring, L., Grose, E., & Forsythe, C. (1998).
A zooming web browser. In C. Forsythe, E. Grose, & J. Ratner (Eds.), Human factors and web development
(pp. 255–266). Mahwah, NJ: Erlbaum.
Beilock, S. L., Bertenthal, B., Hoerger, M. & Carr, T. (2008). When does haste make waste? Journal of
Experimental Psychology: Applied, 14, 340–352.
Bellenkes, A. H., Wickens, C. D., & Kramer, A. F. (1997). Visual scanning and pilot expertise: The role of
attentional flexibility and mental model development. Aviation, Space, and Environmental Medicine, 68,
569–579.
Bennett, A. M., Flach, J. M., McEwen, T. R., Russell, S. M. (2006). Active regulation of speed during a
simulated low-altitude flight task. In Proceedings of the Human Factors and Ergonomics Society 50th
Annual Meeting (pp. 1,589–1,593). Santa Monica, CA: Human Factors and Ergonomics Society.
Bennett, K. B., & Flach, J. M. (1992). Graphical displays: implications for divided attention, focused
attention, and problem solving. Human Factors, 34, 513–533.
Bennett, K. B., & Flach, J. M. (2011). Display and interface design: Subtle science, exact art. Boca Raton,
FL: CRC Press.
Bennett, K. B., & Flach, J. M. (2012). Visual momentum redux. International Journal of Human-Computer
Studies, 70, 399–414.
Ben-Shakhar, G., & Elaad, E. (2003). The validity of psychophysiological detection of information with the
Guilty Knowledge Test: A meta-analytic review. Journal of Applied Psychology, 88, 131–151.
Berends, I. E., & van Lieshout, E. C. D. M. (2009). The effects of illustrations in arithmetic problem-solving:
Effects of increased cognitive load. Learning and Instruction, 19, 345–353.
Beringer, D. B., & Chrisman, S. E. (1991). Peripheral polar-graphic displays for signal/failure detection.
International Journal of Aviation Psychology, 1, 133–148.
Beringer, D. B., Williges, R. C., & Roscoe, S. N. (1975). The transition of experienced pilots to a frequency-
separated aircraft attitude display. Human Factors, 17, 401–414.
Berkun, M. M. (1964). Performance decrement under psychological stress. Human Factors, 6, 21–30.
Bertelson, P. (1965). Serial choice reaction-time as a function of response versus signal-and-response
repetition. Nature, 206, 217–218.
Bertelson, P. (1966). Central intermittency twenty years later. Quarterly Journal of Experimental Psychology,
18, 153–163
Bertin, J. (1983). Semiology of graphics. Madison, WI: University of Wisconsin Press.
Bertolotti, H., and Strybel, T. Z. (2011). Audio and audiovisual cueing in visual search: effects of target
uncertainty and auditory cue precision. In D. Harris (Ed.): Engineering psychology and cognitive
ergonomics, HCII 2011, LNAI 6781 (pp. 10–20). Springer-Verlag: Berlin.
Bettman, J. R., Johnson, E. J., & Payne, J. (1990). A componential analysis of cognitive effort and choice.
Organizational Behavior and Human Performance, 45, 111–139.
Bettman, J. R., Payne, J. W., & Staelin, R. (1986). Cognitive considerations in designing effective labels for
presenting risk information. Journal of Marketing and Public Policy, 5, 1–28.

Bialystok, E., Craik, F. I. M., Green, D. W., & Gollan, T. H. (2009). Bilingual minds. Psychological Science
in the Public Interest, 10, 89–129.
Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological
Review, 94, 115–147.
Biederman, I., Mezzanotte, R.J., Rabinowitz, J.C., Francolin, C. M., & Plude, D. (1981). Detecting the
unexpected in photo interpretation. Human Factors, 23, 153–163.
Biggs, S. J., & Srinivasan, M. A. (2002). Haptic interfaces. In K. M. Stanney (Ed.), Handbook of virtual
environments (pp. 93–115). Mahwah, NJ: Erlbaum.
Billings, C. (1997). Aviation automation: The search for a human-centered approach. Englewood Cliffs, NJ:
Erlbaum.
Birbaumer, N. (2006). Breaking the silence: Brain-computer interfaces (BCI) for communication and motor
control. Psychophysiology, 43, 517–532.
Bird, J. (1993). Sophisticated computer gets new role: system once used only in fighters helping in Bosnia,
Air Force Times, October 25, p. 8.
Bjork, R. A. (1999). Assessing our own competence: Heuristics and illusions. In D. Gopher & A. Koriat (Eds.),
Attention and performance XVII: Cognitive regulation of performance: Interaction of theory and
application. New York: Academic Press.
Bleckley, M. K., Durso, F. T., Crutchfield, J. M., Engle, R. W., & Khanna, M. M. (2003). Individual
differences in working memory capacity predict visual attention allocation. Psychonomic Bulletin &
Review, 10, 884–889.
Bliss, J. P., & Dunn, M. C. (2000). Behavioral implications of alarm mistrust as a function of task workload.
Ergonomics, 43, 1283–1300.
Bluethmann, W., Ambrose, R., Diftler, M., Askew, S., Huber, E., Goza, M., Rehnmark, F., Lovchik, C., &
Magruder, D. (2003). Robonaut: A robot designed to work with humans in space. Autonomous Robots, 14,
179–197.
Boeing Company (2000). Statistical summary of commercial jet airplane accidents: Worldwide operations:
1959–1999. [Online]. Available: www.boeing.com/news/techissues/pdf/1999_statsum.pdf.
Bogner, M. S. (1994) (Ed.). Human error in medicine. Hillsdale, NJ: Erlbaum.
Bojko, A., Kramer, A. F., & Peterson, M. S. (2004). Age equivalence in switch costs for prosaccade and
antisaccade tasks. Psychology and Aging, 19, 226–234.
Boles, D. B., Bursk, J. H., Phillips, J. B., & Perdelwitz, J. R. (2007). Predicting dual-task performance with
the multiple resources questionnaire (MRQ). Human Factors, 49, 32–45.
Bolstad, C. A., & Endsley, M. (2000). Shared displays and team performance. Proceedings of the Human
Performance, Situation Awareness and Automation Conference, Savannah, GA.
Booher, H. R. (1975). Relative comprehensibility of pictorial information and printed words in
proceduralized instructions. Human Factors, 17, 266–277.
Booher, H. R. (2003) (Ed.). Handbook of human systems integration. Hoboken, NJ: Wiley.
Borman, W. C., Hanson, M. A., & Hedge, J. W. (1997). Personnel selection. Annual Review of Psychology,
48, 299–337.
Bos, J. C., & Tack, D. W. (2005). Investigation: Visual display alternatives for infantry soldiers: A literature
review. DRDC Toronto Contract Report CR 2005-027. Toronto: Defence Research and Development
Canada.
Botzer, A., Meyer, J., Bak, P., & Parmet, Y. (2010). User settings of cue thresholds for binary categorization
decisions. Journal of Experimental Psychology: Applied, 16, 1–15.
Bourne, P. G. (1971). Altered adrenal function in two combat situations in Vietnam. In B. E. Elefheriou and
J. P. Scott (Eds.), The physiology of aggression and defeat. New York: Plenum.
Bower, G. H., & Springston, F. (1970). Pauses as recoding points in letter series. Journal of Experimental
Psychology, 83, 421–430.

Bower, G. H., Clark, M. C., Lesgold, A. M., & Winzenz, D. (1969). Hierarchical retrieval schemes in the
recall of categorical word lists. Journal of Verbal Learning and Verbal Behavior, 8, 323–343.
Boyle, E. A., Anderson, A. H., & Newlands, A. (1994). The effect of eye contact on dialogue and
performance in a cooperative problem-solving task. Language & Speech, 37, 1–20.
Brainard, R. W., Irby, T. S., Fitts, P. M., & Alluisi, E. (1962). Some variables influencing the rate of gain of
information. Journal of Experimental Psychology, 63, 105–110.
Bransford, J. D., & Johnson, M. K. (1972). Contextual prerequisites for understanding: Some investigations
of comprehension and recall. Journal of Verbal Learning and Verbal Behavior, 11, 717–726.
Braun, C. C., & Silver, N. C. (1995). Interaction of signal word and colour on warning labels: Differences in
perceived hazard and behavioural compliance. Ergonomics, 38, 2,207–2,220.
Braune, R. J. (1989). The common/same type rating: Human factors and other issues. Anaheim, CA: SAE.
Braune, R., & Wickens, C. D. (1986). Time-sharing revisited: Test of a componential model for the
assessment of individual differences. Ergonomics, 29, 1,399–1,414.
Braunstein, M. L. (1990). Structure from motion. In J. I. Elkind, S. K. Card, J. Hochberg, & B. M. Huey
(Eds.), Human performance models for computer-aided engineering (pp. 89–105). Orlando, FL: Academic
Press.
Bregman, A. S. (1990). Auditory scene analysis: The perceptual organization of sound. Cambridge, MA:
MIT Press.
Brehmer, B. (1981). Models of diagnostic judgment. In J. Rasmussen & W. Rouse (Eds.), Human detection
and diagnosis of system failures. New York: Plenum.
Bremen, P., van Wanrooij, M. M., & van Opstal, A. J. (2010). Pinna cues determine orienting response
modes to synchronous sounds in elevation. The Journal of Neuroscience, 30(1), 194–204.
Bresley, B. (1995, April–June). 777 flight deck design. Airliner. 1–9.
Breslow, L. A., Trafton, J. G., McCurry, J. M., & Ratwani, R. M. (2010). An algorithm for generating color
scales for both categorical and ordinal coding. Color Research and Application, 35, 18–28.
Breslow, L. A., Trafton, J. G., & Ratwani, R. M. (2009). A perceptual process approach to selecting color
scales for complex visualizations. Journal of Experimental Psychology: Applied, 15, 25–34.
Brewer, N., & Wells, G. L. (2006). The confidence-accuracy relationship in eyewitness identification: Effects
of line-up instructions, foil similarity, and target-absent base rates. Journal of Experimental Psychology:
Applied, 12, 11–30.
Brewer, N., Harvey, S., & Semmler, C. (2004). Improving comprehension of jury instructions with audio-
visual presentation. Applied Cognitive Psychology, 18, 765–776.
Brewster, C., & O’Hara, K. (2007). Knowledge representation with ontologies: Present challenges—future
possibilities. International Journal of Human-Computer Studies, 65, 563–568.
Brewster, S. (2009). Nonspeech auditory output. In A. Sears & J. A. Jacko (Eds.), Human-computer
interaction: Fundamentals (pp. 223–240). Boca Raton, FL: CRC Press.
Breznitz, S. (1983). Cry wolf: The psychology of false alarms. Hillsdale, NJ: Lawrence Erlbaum.
Broadbent, D. E. (1958). Perception and communication. New York: Pergamon.
Broadbent, D. E. (1971). Decision and stress. New York: Academic Press.
Broadbent, D. E. (1975). The magic number seven after fifteen years. In A. Kennedy & A. Wilkes (Eds.),
Studies in long-term memory (pp. 3–18). New York: Wiley.
Broadbent, D. E. (1977). Language and ergonomics. Applied Ergonomics, 8, 15–18.
Broadbent, D. E. (1982). Task combination and selective intake of information. Acta Psychologica, 50, 253–
290.
Broadbent, D. E., & Broadbent, M. H. (1980). Priming and the passive/active model of word recognition. In
R. Nickerson (Ed.), Attention and performance VIII. New York: Academic Press.
Broadbent, D. E., & Gregory, M. (1965). Effects of noise and of signal rate upon vigilance as analyzed by

means of decision theory. Human Factors, 7, 155–162.
Brookhuis, K. A., & de Waard, D. (1993). The use of psychophysiology to assess driver status. Ergonomics,
36, 1,099–1,100.
Brookings, J. B., Wilson, G. F., & Swain, C. R. (1996). Psychophysiological responses to changes in
workload during simulated air traffic control. Biological Psychology, 42, 361–377.
Brown, I. D., & Poulton, E. C. (1961). Measuring the spare “mental capacity” of car drivers with a subsidiary
task. Ergonomics, 4, 35–40.
Brown, J. (1959). Some tests of the decay theory of immediate memory. Quarterly Journal of Experimental
Psychology, 10, 12–21.
Brown, M. E., & Gallimore, J. J. (1995). Visualization of three-dimensional structure during computer-aided
design. International Journal of Human-Computer Interaction, 7, 37–56.
Brown, S. D., Marley, A. A. J., Donkin, C., & Heathcote, A. (2008). An integrated model of choices and
response times in absolute identification. Psychological Review, 115, 396–425.
Brown, S. W., & Boltz, M. G. (2002). Attentional processes in time perception: Effects of mental workload
and event structure. Journal of Experimental Psychology: Human Perception and Performance, 28, 600–
615.
Bruno, N., & Cutting, J. E. (1988). Minimodularity and the perception of layout. Journal of Experimental
Psychology: General, 117, 161–170.
Bruyer, R., & Scailquin, J. C. (1998). The visuospatial sketchpad for mental images: Testing the
multicomponent model of working memory. Acta Psychologica, 98, 17–36.
Bryant, D. (2003). Critique, explore, compare, and adapt (CECA): A new model for command decision
making. DRDC Toronto Technical Report TR 2003–105. Toronto: Defence Research and Development
Canada.
Buehler, R., Griffin, D., & Ross, M. (2002). Inside the planning fallacy: the causes and consequences of
optimistic predictions. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases (pp. 250–
270). Cambridge, UK: Cambridge University Press.
Bulkley, N. K., Dyre, B. P., Lew, R., & Caufield, K. (2009). A peripherally-located virtual instrument landing
display affords more precise control of approach path during simulated landings than traditional instrument
landing displays. In Proceedings of the Human Factors and Ergonomics Society—53rd Annual Meeting
(pp. 31–35). Santa Monica, CA: Human Factors and Ergonomics Society.
Bundesen, C. (1990). A theory of visual attention. Psychological Review, 97, 523–547.
Burgess, N., & Hitch, G. J. (2006). A revised model of short-term memory and long-term learning of verbal
sequences. Journal of Memory and Language, 55, 627–652.
Burgess-Limerick, R., Krupenia, V., Wallis, G., Pratim-Bannerjee, A., & Steiner, L. (2010). Directional
control-response relationships for mining equipment. Ergonomics, 53, 748–757.
Burki-Cohen, J., Sparko, A., & Mellman, M. (2011). Flight simulator motion literature pertinent to airline-
pilot recurrent training and evaluation. AIAA Modeling and Simulation Technologies Conference, AIAA
2011-6320.
Burns, C. M., & Hajdukiewicz, J. R. (2004). Ecological interface design. Boca Raton, FL: CRC Press.
Burns, C. M., Skraaning, G., Jamieson, G. A., Lau, N., Kwok, J., Welch, R., & Andresen, G. (2008).
Evaluation of ecological interface design for nuclear process control: Situation awareness effects. Human
Factors, 50, 663–679.
Butcher, L. M., Davis, O. S., Craig, I. W., & Plomin, R. (2008). Genomewide quantitative trait locus
association scan of general cognitive ability using pooled DNA and 500K single nucleotide polymorphism
microarrays. Genes, Brain, and Behavior, 7, 435–446.
Buxton, W. (2007). Sketching user experience. San Francisco: Morgan Kaufmann.
Byrne, M. D., & Davis, E. M. (2006). Task structure and postcompletion error in the execution of a routine
procedure. Human Factors, 48, 627–638.

Cabeza, R., Kapur, S., Craik, F. I. M., McIntosh, A. R., Houle, S., & Tulving, E. (1997). Functional
neuroanatomy of recall and recognition: A PET study of episodic memory. Journal of Cognitive
Neuroscience, 9, 254–265.
Cacioppo, J. T. (2002). Social neuroscience: Understanding the pieces fosters understanding the whole and
vice versa. American Psychologist, 57, 819–831.
Caclin, A., Giard, M. H., Smith, B. K., & McAdams, S. (2007). Interactive processing of timbre dimensions:
A Garner interference study. Brain Research, 1138, 159–170.
Cades, D. M., Boehm-Davis, D. A., Trafton, J. G., & Monk, C. A. (2011). Mitigating disruptive effects of
interruptions through training: What needs to be practiced? Journal of Experimental Psychology: Applied,
17, 97–109.
Cades, D. M., Trafton, J. G., Boehm-Davis, D. A., & Monk C. A. (2007). Does the difficulty of an
interruption affect our ability to resume? In Proceedings of the Human Factors and Ergonomics Society
51st Annual Meeting (pp. 234–238). Santa Monica, CA: Human Factors and Ergonomics Society.
Caggiano, D., & Parasuraman, R. (2004). The role of memory representation in the vigilance decrement.
Psychonomic Bulletin & Review, 11, 932–937.
Cain, B., Magee, L. E., & Kersten, C. (2011). Validation of a virtual environment incorporating virtual
operators for procedural learning. DRDC Technical Report No. 2011–132. Toronto: Defence Research and
Development Canada.
Caird, J., Willness, C., Steel, P., & Scialfa, C. (2008). A meta-analysis of the effects of cell phones on driver
performance. Accident Analysis and Prevention, 40, 1,282–1,293.
Caldwell, B. (2009). Delays and user performance in human-computer network interaction tasks. Human
Factors, 31, 813–830.
Camacho, M. J., Steiner, B. A., & Berson, B. L. (1990). Icons versus alphanumerics in pilot-vehicle
interfaces. In Proceedings of the 34th annual meeting of the Human Factors Society (pp. 11–15). Santa
Monica, CA: Human Factors Society.
Canham, M. S., Wiley, J., & Mayer, R. E. (in press). When diversity in training improves dyadic problem
solving. Applied Cognitive Psychology.
Cannon-Bowers, J. A., and Salas, E. (2001). Reflections on shared cognition. Journal of Organizational
Behavior, 22, 195–202.
Caplan, D., & Waters, D. S. (1999). Verbal working memory and sentence comprehension. Behavioral and
Brain Sciences, 22, 77–126.
Carbonell, J. R., Ward, J. L., & Senders, J. W. (1968). A queueing model of visual sampling: Experimental
validation. IEEE Transactions on Man-Machine Systems, MMS-9, 82–87.
Card, S. K., English, W. K., & Burr, B. J. (1978). Evaluation of mouse, rate-controlled isometric joystick,
step keys, and text keys for text selection on a CRT. Ergonomics, 21, 601–613.
Card, S. K., Mackinlay, J. D., & Shneiderman, B. (1999) (Eds.). Readings in information visualization. San
Francisco: Morgan Kaufmann.
Card, S. K., Moran, T. P., & Newell, A. (1983). The psychology of human-computer interaction. Hillsdale,
NJ: Erlbaum.
Card, S. K., Newell, A., & Moran, T. P. (1986). The model human processor. In K. Boff, L. Kaufman, & J.
Thomas (Eds.), Handbook of perception and human performance (Vol. II, Ch. 45). New York: Wiley.
Carlander, O., Kindström, M., & Eriksson, L. (2005). Intelligibility of stereo and 3D-audio call signs for fire
and rescue command operators. In Proceedings of the Eleventh Meeting of the International Conference on
Auditory Display (pp. 292–295). ICAD: Limerick, Ireland.
Carlson, L., Holscher, C., Shipley, D., & Dalton, R. (2010). Getting lost in buildings. Current Directions in
Psychological Science, 5, 284–289.
Carpenter, P. A., & Shah, P. (1998). A model of the perceptual and conceptual processes in graph
comprehension. Journal of Experimental Psychology: Applied, 4, 75–100.
Carrasco, M., Pizarro, L., & Domingo, M. (2010). Visual inspection of glass bottlenecks by multiple-view

analysis. International Journal of Computer Integrated Manufacturing, 23, 925–941.
Carretta, T. R., Perry, D. C. & Ree, M. J. (1996). Prediction of situational awareness in F-15 Pilots.
International Journal of Aviation Psychology, 6, 21–41.
Carretta, T., & Ree, M. J. (2003). Pilot selection methods. In M. Vidulich & P. Tsang (Eds.), Principles and
Practices of Aviation Psychology. Mahwah, NJ: Lawrence Erlbaum.
Carroll, J. (1990). The Nurnberg Funnel: Designing minimalist instruction for practical computer skills.
Cambridge, MA: MIT Press.
Carroll, J. M. (2002). Human-computer interaction in the new millennium. New York: Addison-Wesley
Professional.
Carroll, J. M., & Olson, J. (Eds.). (1987). Mental models in human-computer interaction: Research issues
about what the user of software knows. Washington, DC: National Academy Press.
Carswell, C. M. (1992a). Reading graphs: Interactions of processing requirements and stimulus structure. In
B. Burns (Ed.), Percepts, Concepts and Categories (pp. 605–645). Amsterdam: Elsevier.
Carswell, C. M. (1992b). Choosing specifiers: An evaluation of the basic tasks model of graphical perception.
Human Factors, 34, 535–554.
Carswell, C. M., Frankenberger, S., & Bernhard, D. (1991). Graphing in depth: Perspectives on the use of
three-dimensional graphs to represent lower-dimensional data. Behaviour & Information Technology, 10,
459–474.
Carswell, C. M., & Wickens, C. D. (1996). Mixing and matching lower-level codes for object displays:
Evidence for two sources of proximity compatibility. Human Factors, 38, 1–22.
Carter, R. C. & Cahill, M. C. (1979). Regression models of search time for color-coded information displays.
Human Factors, 21, 293–302.
Casey, S. (1988). Set phasers on stun. Santa Barbara: Aegean Press.
Casner, S. M. (1991). A task-analytic approach to the automated design of graphic presentations. ACM
Transactions on Graphics, 10, 111–151.
Casner, S. M. (1994). Understanding the determinants of problem-solving behavior in a complex
environment. Human Factors, 36, 580–596.
Casper, J., & Murphy, R. (2003). Human-robot interactions during the robot-assisted urban search and rescue
response at the World Trade Center. IEEE Transactions on Systems, Man, and Cybernetics, 33, 367–385.
Catrambone, R., & Carroll, J. M. (1987). Learning a word processing system with training wheels and
guided exploration. Proceedings of CHI & GI human-factors in computing systems and graphics
conference (pp. 169–174). New York: Association for Computing Machinery.
Causse, M., Dehais, F. & Pastor, J. (2011). Executive functions and pilot characteristics predict flight
simulator performance in general aviation pilots. International Journal of Aviation Psychology 21, 217–
234.
Cattell, R. B. (1971). Abilities: Their structure, growth, and action. Boston: Houghton Mifflin.
Cellier, J. M., & Eyrolle, H. (1992). Interference between switched tasks. Ergonomics, 35, 25–36.
Cellier, J. M., Eyrolle, H., & Mariné, C. (1997). Expertise in dynamic environments. Ergonomics, 40, 28–50.
Cepeda, N. J., Pashler, H., Vul, E., Wixted, J. T., & Rohrer, D. (2006). Distributed practice in verbal recall
tasks: A review and quantitative synthesis. Psychological Bulletin, 132, 354–380.
Chan, A. H. S., & Chan, W. H. (2008). Strength and reversibility of stereotypes for a rotary control with
linear displays. Perceptual and Motor Skills, 106, 341–353.
Chan, A. H. S., & Hoffman, E. (2010). Movement compatibility for frontal controls with displays located in
four cardinal directions. Ergonomics, 53, 1403–1419.
Chan, A. H. S., & Hoffman, E. (2011). Movement compatibility for configurations of displays located in
three cardinal orientations and ipsilateral, contralateral and overhead controls. Applied Ergonomics, 1–13.
Chan, W. H., & Chan, A. H. S. (2007a). Movement compatibility for rotary control and digital display.
Recent Advances in Engineering and Computer Science, 978-988-98671-1-9, pp. 79–84.

Chan, W. H., & Chan, A. H. S. (2007b). Strength and reversibility of movement stereotypes for lever control
and circular display. International Journal of Industrial Ergonomics, 37, 233–244.
Chan, W. H., & Chan, A. H. S. (2008). Movement compatibility for two dimensional lever control and digital
counter. IEEE Transactions on Systems, Man, & Cybernetics A, 38, 528–536.
Chandler, J., & Pronin, E. (2012). Fast thought speed induces risk taking. Psychological Science, 23, 370–374.
Chandrasekaran, B., & Lele, O. (2010). Mapping descriptive models of graph comprehension into
requirements for a computational architecture: Need for supporting imagery operations. In A. K. Goel, M.
Jamnik, & N. H. Narayanan (Eds.), Diagrams 2010 Lecture Notes in Artificial Intelligence 6170 (pp. 235–
242). Berlin: Springer-Verlag.
Chapanis, A., & Lindenbaum, L. E. (1959). A reaction time study of four control-display linkages.
Human Factors, 1, 1–14.
Chapman, G. B., & Johnson, E. J. (2002). Incorporating the irrelevant: Anchors in judgments of belief and
value. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases (pp. 120–138). Cambridge
UK: Cambridge University Press.
Charissis, V., Papanastasiou, S., & Vlachos, G. (2009). Interface development for early notification warning
system: Full windshield head-up display case study. Lecture Notes in Computer Science, HCII 2009, 5613,
683–692. Heidelberg: Springer.
Charness, N. (1976). Memory for chess positions: Resistance to interference. Journal of Experimental
Psychology: Human Learning and Memory, 2, 641–653.
Chase, W. G., & Chi, M. (1979). Cognitive skill: Implications for spatial skill in large-scale environments
(Technical Report No. 1). Pittsburgh: University of Pittsburgh Learning and Development Center.
Chase, W. G., & Ericsson, K. A. (1981). Skilled memory. In J. R. Anderson (Ed.), Cognitive skills and their
acquisition. Hillsdale, NJ: Erlbaum.
Chase, W. G., & Simon, H. A. (1973). The mind’s eye in chess. In W. G. Chase (Ed.), Visual information
processing. New York: Academic Press.
Chau, A. W., & Yeh, Y. Y. (1995). Segregation by color and stereoscopic depth in three-dimensional visual
space. Perception & Psychophysics, 57, 1,032–1,044.
Cheal, M. & Lyon, D. R. (1991). Central and peripheral precueing of forced-choice discrimination. Quarterly
Journal of Experimental Psychology A, 43, 859–880.
Chen, C., & Czerwinski, M. (2000). Introduction to special issue on empirical evaluation of information
visualizations. International Journal of Human-Computer Studies, 53, 631–635.
Chen, J., Forsberg, A. S., Swartz, S. M., & Laidlaw, D. H. (2007). Interactive multiple scale small multiples.
In IEEE Visualization Proceedings. New York: Institute of Electrical and Electronics Engineers.
Chi, C. F., & Drury, C. G. (1998). Do people choose an optimal response criterion in an inspection task? IIE
Transactions, 30, 257–266.
Chignell, M. H., & Peterson, J. G. (1988). Strategic issues in knowledge engineering. Human Factors, 30,
381–394.
Childress, M. E., Hart, S. G., & Bortalussi, M. R. (1982). The reliability and validity of flight task workload
ratings. In R. Edwards (Ed.), Proceedings of the 26th Annual Meeting of the Human Factors Society. Santa
Monica, CA: Human Factors Society.
Childs, J. M. (1976). Signal complexity, response complexity, and signal specification in vigilance. Human
Factors, 18, 149–160.
Chou, C., Madhavan, D., & Funk, K. (1996). Studies of cockpit task management errors. International
Journal of Aviation Psychology, 6, 307–320.
Christ, R. E. (1975). Review and analysis of color coding research for visual displays. Human Factors, 17,
542–570.
Christensen, J. C., Estepp, J. R, Wilson, G. F., & Russell, C. S. (2012). The effects of day-to-day variability
of physiological data on operator functional state classification. NeuroImage, 59, 57–63.

Christensen-Szalanski, J. J., & Bushyhead, J. B. (1981). Physicians’ use of probabilistic information in a
real clinical setting. Journal of Experimental Psychology: Human Perception and Performance, 7, 928–
936.
Chun, M. M., & Wolfe, J. M. (1996). Just say no: How are visual searches terminated when there is no target
present? Cognitive Psychology, 30, 39–78.
Cizaire, C. (2007). Effect of 2 module docked spacecraft configurations on spatial orientation. Unpublished
Master’s thesis, Massachusetts Institute of Technology. Cambridge, MA: MIT.
Clark, H. H., & Brownell, H. H. (1975). Judging up and down. Journal of Experimental Psychology: Human
Perception and Performance, 1, 339–352.
Clark, R. C., & Kwinn, A. (2007). The new virtual classroom: Evidence-based guidelines for synchronous e-
learning. San Francisco: Wiley-Pfeiffer.
Clawson, D. M., Healy, A. F., Ericsson, K. A., & Bourne, L. E., Jr. (2001). Retention and transfer of Morse
code reception skill by novices: Part-whole training. Journal of Experimental Psychology: Applied, 7, 129–
142.
Cleveland, W. S., & McGill, R. (1984). Graphical perception: Theory, experimentation, and application to the
development of graphic methods. Journal of the American Statistical Association, 70, 531–554.
Cleveland, W. S., & McGill, R. (1985). Graphical perception and graphical methods for analyzing scientific
data. Science, 229, 828–833.
Cleveland, W. S., & McGill, R. (1986). An experiment in graphical perception. International Journal of Man-
Machine Studies, 25, 491–500.
Clifasefi, S. L., Takarangi, M. K. T., & Bergman, J. S. (2006). Blind drunk: The effects of alcohol on
inattentional blindness. Applied Cognitive Psychology, 20, 697–704.
Cockburn, A., & McKenzie, B. (2001). What do web users do? An empirical analysis of web use.
International Journal of Human-Computer Studies, 54, 903–922.
Coffey, E. B. J., Brouwer, A.M., Wilschut, E., & van Erp, J. B. F. (2010). Brain–machine interfaces in
space: Using spontaneous rather than intentionally generated brain signals. Acta Astronautica, 67, 1–11.
Cohen, A. L., Rotello, C. M., & Macmillan, N. A. (2008). Evaluating models of remember-know judgments:
Complexity, mimicry, and discriminability. Psychonomic Bulletin & Review, 15, 906–926.
Cohen, G. (2008). Memory for knowledge: General knowledge and expert knowledge. In G. Cohen & M. A.
Conway (Eds.), Memory in the real world (3rd Ed.) (pp. 207–227). New York: Taylor & Francis.
Cohen, M. S., Freeman, J. T., & Thompson, B. B. (1997). Training the naturalistic decision maker. In C. E.
Zsambok & G. Klein (Eds.), Naturalistic decision making (pp. 257–268). Mahwah, NJ: Erlbaum.
Cohen, S., Kessler, R. C., & Gordon, U. (1997). Measuring stress: A guide for health and social scientists.
New York: Oxford University Press.
Cole, W. G. (1986). Medical cognitive graphics. In Proceedings of the ACM–SIGCHI: Human factors in
computing systems (pp. 91–95). New York: Association for Computing Machinery.
Coles, M. G. H. (1988). Modern mind-brain reading: Psychophysiology, physiology, and cognition.
Psychophysiology, 26, 251–269.
Collet, C., Guillot, S. A., & Petit, C. (2010). Phoning while driving II: A review of driving conditions
influence. Ergonomics, 53, 602–616.
Collins, A. M., & Quillian, M. R. (1969). A spreading activation theory of semantic processing.
Psychological Review, 82, 407–428.
Colom, R., Rebollo, I., Palacios, A., Juan-Espinosa, M., & Kyllonen, P. C. (2003). Working memory is
(almost) perfectly predicted by g. Intelligence, 32, 277–296.
Coman, A., Manier, D., & Hirst, W. (2009). Forgetting the unforgettable through conversation: Socially
shared retrieval-induced forgetting of September 11 memories. Psychological Science, 20, 627–633.
Combs, B., & Slovic, P. (1979). Newspaper coverage of causes of death. Journalism Quarterly, 56(4), 837–
843; 849.

Commarford, P. M., Lewis, J. R., Smither, J. A., & Gentzler, M. D. (2008). A comparison of broad versus
deep auditory menu structures. Human Factors, 50, 77–89.
Comstock, J. R., Jones, L. C., & Pope, A. T. (2003). The effectiveness of various attitude indicator display sizes
and extended horizon lines on attitude maintenance in a part-task simulation. In Proceedings of the Human
Factors and Ergonomics Society 47th Annual Meeting (pp. 144–148). Santa Monica, CA: Human Factors
and Ergonomics Society.
Conrad, R., & Longman, D. S. A. (1965). Standard typewriter vs. chord keyboard: An experimental
comparison. Ergonomics, 8, 77–88.
Cook, M. B., & Smallman, H. S. (2008). Human factors of the confirmation bias in intelligence analysis.
Human Factors, 50, 745–754.
Cooke, N. J. (1994). Varieties of knowledge elicitation techniques. International Journal of Human-
Computer Studies, 41, 801–849.
Cooke, N. J., & Gorman, J. C. (2006). Assessment of team cognition. In P. Karwowski (Ed.), International
Encyclopedia of Ergonomics and Human Factors (2nd Ed.). UK: Taylor & Francis.
Cooke, N. J., Pringle, H. L., Pedersen, H. K., & Connor, O. (Eds.) (2006). Human factors of remotely
operated vehicles: Advances in human performance and cognitive engineering research (Vol. 7).
Amsterdam.
Courtney, A. J. (1986). Chinese population stereotypes: Color associations. Human Factors, 28, 97–99.
Cowan, N. (2001). The magical number 4 in short-term memory: A reconsideration of mental storage
capacity. Behavioral and Brain Sciences, 24(1), 87–114.
Cowen, E. L. (1952). The influence of varying degrees of psychosocial stress on problem-solving rigidity.
Journal of Abnormal and Social Psychology, 47, 512–519.
Craig, A. (1981). Monitoring for one kind of signal in the presence of another. Human Factors, 23, 191–198.
Craik, F. I. M., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal
of Verbal Learning and Verbal Behavior, 11, 671–684.
Craik, K. J. W. (1947). Theory of the human operator in control systems I: The operator as an engineering
system. British Journal of Psychology, 38, 56–61.
Crandall, B., Klein, G., Militello, L. G., & Wolfe, S. P. (1994). Tools for applied cognitive task analysis
(Contract summary report on N66001-94-C-7008). Fairborn, OH: Klein Associates.
Crede, M., & Sniezek, J. A. (2003). Group judgment processes and outcomes in video-conferencing versus
face-to-face groups. International Journal of Human-Computer Studies, 59, 875–897.
Crocoll, W. M., & Coury, B. G. (1990). Status or recommendation: Selecting the type of information for
decision aiding. In Proceedings of the 34th Annual Meeting of the Human Factors Society (pp. 1,524–
1,528). Santa Monica, CA: Human Factors Society.
Croft, D., Banbury, S., Butler, L. T., & Berry, D. C. (2004). The role of awareness in situation awareness. In
S. Banbury & S. Tremblay (Eds.), A cognitive approach to situation awareness: Theory and application
(pp. 82–103). Aldershot, UK: Ashgate.
Crossley, S. A., Greenfield, J., & McNamara, D. S. (2008). Assessing text readability using cognitively
based indices. TESOL Quarterly, 42(3), 475–493.
Cummings, M. L. (2004). Automation bias in intelligent time critical decision support systems. Paper
presented to the AIAA 1st Intelligent Systems Technical Conference, September 2004. Reston, VA:
American Institute for Aeronautics and Astronautics. [Available from: http://citeseerx.ist.psu.edu/viewdoc/
summary?doi=10.1.1.91.2634.]
Cummings, M. L. (2010, Spring). Technology impedances to augmented cognition. Ergonomics in Design,
18(2), 25–27.
Cummings, M. L., Brezinski, A. S., & Lee, J. D. (2007). The impact of intelligent aiding for multiple
unmanned aerial vehicle schedule management. IEEE Intelligent Systems, 22, 52–59.
Cummings, M. L., Bruni, S., & Mitchell, P. J. (2010). Human supervisory control challenges in network-
centric operations. Reviews of Human Factors and Ergonomics, 6, 34–78.

Cummings, M. L., & Guerlain, S. (2007). Developing operator capacity estimates for supervisory control of
autonomous vehicles. Human Factors, 49, 1–15.
Cummings, M. L., & Nehme, C. E. (2010). Modeling the impact of workload in network-centric supervisory
control settings. In S. Kornguth, R. Steinberg, & M. D. Matthews (Eds.), Neurocognitive and physiological
factors during high-tempo operations. (pp. 23–40). Surrey, UK: Ashgate.
Cutting, J. E., & Vishton, P. M. (1995). Perceiving layout and knowing distances: The integration, relative
potency, and contextual use of different information about depth. In W. Epstein & S. Rogers (Eds.),
Perception of space and motion (pp. 69–117). San Diego: Academic Press.
Dahlström, Ö., Danielsson, H., Emilsson, M., & Andersson, J. (2011). Does retrieval strategy disruption
cause general and specific collaborative inhibition? Memory, 19, 140–154.
Damos, D. L. (1978). Residual attention as a predictor of pilot performance. Human Factors, 20, 435–440.
Damos, D. L. (1997). Using interruptions to identify task prioritization in Part 121 air carrier operations. In R.
Jensen (Ed.), Proceedings of the 9th International Symposium on Aviation Psychology. Columbus, OH:
Ohio State University.
Damos, D. L., & Wickens, C. D. (1980). The identification and transfer of time-sharing skills. Acta
Psychologica, 46, 15–39.
Danaher, J. W. (1980). Human error in ATC systems. Human Factors, 22, 535–546.
Daneman, M., & Carpenter, P. A. (1980). Individual differences in working memory and reading. Journal of
Verbal Learning and Verbal Behavior, 19, 450–466.
Danziger, S., Levav, J., & Avnaim-Pesso, L. (2011). Extraneous factors in judicial decisions. Proceedings of
the National Academy of Sciences USA, 108, 6,889–6,892.
Darken, R. P., & Peterson, B. (2002). Spatial orientation, wayfinding, and representation. In K. M. Stanney
(Ed.), Handbook of virtual environments (pp. 493–518). Mahwah, NJ: Erlbaum.
Darker, I. T., Gerret, D., Filik, R., Purdy, K. J., & Gales, A. G. (2011). The influence of “Tall Man” lettering
on errors of visual perception in the recognition of written drug names. Ergonomics, 54, 21–33.
Darlington, K. (2000). The essence of expert systems. New York: Pearson Education.
Davenport, W. G. (1968). Auditory vigilance: The effects of costs and values of signals. Australian Journal of
Psychology, 20, 213–218.
Davies, D. R., & Parasuraman, R. (1982). The psychology of vigilance. London: Academic Press.
Davies, G., Shepherd, J., & Ellis, H. (1979). Effects of interpolated mugshot exposure on accuracy of
eyewitness identification. Journal of Applied Psychology, 64, 232–237.
Davis, J. H. (1984). Order in the courtroom. In D. J. Miller, D. G. Blackman, & A. J. Chapman (Eds.),
Perspectives in psychology and law. New York: Wiley.
Davis, M. H., & Johnsrude, I. S. (2007). Hearing speech sounds: Top-down influences on the interface
between audition and speech perception. Hearing Research, 229, 132–147.
Davis, R., Moray, N., & Treisman, A. (1961). Imitative responses and the rate of gain of information.
Quarterly Journal of Experimental Psychology, 13, 78–89.
Dawes, R. M. (1979). The robust beauty of improper linear models in decision making. American
Psychologist, 34, 571–582.
Dawes, R. M., & Corrigan, B. (1974). Linear models in decision making. Psychological Bulletin, 81, 95–106.
Dawes, R. M., Faust, D., & Meehl, P. E. (1989). Clinical versus statistical judgment. Science, 243, 1,668–
1,673.
De Bondt, W. F. M., & Thaler, R.H. (2002). Do analysts overreact? In T. Gilovich, D. Griffin & D.
Kahneman (Eds). Heuristics & biases: The psychology of intuitive judgment. (pp. 678–685). New York:
Cambridge University Press.
De la Pena, N., Weil, P., Liobera, J., et al. (2010). Immersive journalism: Immersive virtual reality for the
first-person experience of news. Presence, 19, 291–301.
De Waard, D., Schepers, P., Ormel, W., and Brookhuis, K. (2010). Mobile phone use while cycling.
Ergonomics, 53, 30–42.
De Waard, D., van der Hulst, M., Hoedemaeker, M., & Brookhuis, K. A. (1999). Driver behavior in an
emergency situation in the automated highway system. Transportation Human Factors, 1, 67–82.
Debecker, J., & Desmedt, R. (1970). Maximum capacity for sequential one-bit auditory decisions. Journal of
Experimental Psychology, 83, 366–373.
Deffenbacher, K. A., Bornstein, B. H., and Penrod, S. D. (2006). Mugshot exposure effects: Retroactive
interference, mugshot commitment, source confusion, and unconscious transference. Law and Human
Behavior, 30(3), 287–307.
Degani, A. (2004). Taming HAL: Designing interfaces beyond 2001. New York: Palgrave Macmillan.
Degani, A., & Wiener, E. L. (1990). Human factors of flight-deck checklists: The normal checklist (NASA
Contractor Report 177549). Moffett Field, CA: NASA Ames Research Center.
deGroot, A. D. (1965). Thought and choice in chess. The Hague: Mouton.
Dehais, F., Causse, M., & Tremblay, S. (2011). Mitigation of conflicts with automation: use of cognitive
countermeasures. Human Factors, 53, 448–460.
Deininger, R. L., Billington, M. J., & Riesz, R. R. (1966). The display mode and the combination of
sequence length and alphabet size as factors of speed and accuracy. IEEE Transactions on Human Factors
in Electronics, 7, 110–115.
DeLucia, P. R. (2003). Judgments about collision in younger and older drivers. Transportation Research,
Part F, 6, 63–80.
DeLucia, P. R. (2004). Time-to-contact judgments of an approaching object that is partially concealed by an
occluder. Journal of Experimental Psychology: Human Perception and Performance, 30, 287–304.
DeLucia, P. R. (2005). Does binocular disparity or familiar size information override effects of relative size
on judgements of time to contact? Quarterly Journal of Experimental Psychology, 58A, 865–886.
DeLucia, P. R. (2007). How big is an optical invariant? In M. A. Peterson, B. Gillam, & H. A. Sedgwick
(Eds.), In the mind’s eye: Julian Hochberg on the perception of pictures, films and the world (pp. 473–
482). Oxford, UK: Oxford University Press.
DeLucia, P. R. (2008). Critical roles for distance, task, and motion in space perception: Initial conceptual
framework and practical implications. Human Factors, 50, 811–820.
DeLucia, P. R., & Griswold, J. A. (2011). Effects of camera arrangement on perceptual-motor performance in
minimally invasive surgery. Journal of Experimental Psychology: Applied, 17, 210–232.
Dempsey, P., Mathiassen, E., Jackson, J., & O’Brien, N. (2010). Influence of three principles of pacing on
the temporal organization of work during cyclic assembly and disassembly tasks. Ergonomics, 53, 1,347–
1,358.
Denton, G. G. (1980). The influence of visual pattern on perceived speed. Perception, 9, 393–402.
Department of the Army (2003). THAAD theatre high altitude area defense missile system, USA. Retrieved
from http://www.army-technology.com/projects/thaad/.
Derrick, W. L. (1988). Dimensions of operator workload. Human Factors, 30, 95–110.
De Soto, C. B., London, M., & Handel, S. (1965). Social reasoning and spatial paralogic. Journal of
Personality and Social Psychology, 2, 513–521.
Dewar, R. E. (1976). The slash obscures the symbol on prohibitive traffic signs. Human Factors, 18, 253–
258.
Dewar, R. E. (1993, July). Warning: Hazardous road signs ahead. Ergonomics in Design, 26–31.
Di Nocera, F., Camilli, M., & Terenzi, M. (2007). A random glance at the flight deck: Pilots’ scanning
strategies and the real-time assessment of mental workload. Journal of Cognitive Engineering and Decision
Making, 1, 271–285.
Diehl, A. E. (1991). The effectiveness of training programs for preventing aircrew error. In R. S. Jensen (Ed.),
Proceedings of the 6th International Symposium on Aviation Psychology (pp. 640–655). Columbus, OH:
Dept. of Aviation, Ohio State University.
Dienes, Z. (2011). Bayesian versus orthodox statistics: Which side are you on? Perspectives on Psychological
Science, 6, 274–290.
Dietz, P. H., & Eidelson, B. D. (2009). SurfaceWare: Dynamic tagging for Microsoft surface. In TEI ‘09
Proceedings of the 3rd International Conference on Tangible and Embedded Interaction (pp. 249–254).
New York: Association for Computing Machinery.
Difede, J., Cukor, J., Jayasinghe, N., Patt, I., Jedel, S., Spielman, L., et al. (2007). Virtual reality exposure
therapy for the treatment of posttraumatic stress disorder following September 11, 2001. Journal of Clinical
Psychiatry, 68, 1639–1647.
Dinges, D. F., Orne, K. T., Whitehouse, W. G., & Orne, E. C. (1987). Temporal placement of a nap for
alertness: Contributions of circadian phase and prior wakefulness. Sleep, 10, 313–329.
Dingus, T. A., Klauer, S. G., Neale, V. L., Petersen, A., Lee, S. E., Sudweeks, J., et al. (2006). The 100-car
naturalistic driving study, phase II–Results of the 100-car field experiment. (Tech Report No. DOT HS 810
593). Washington, DC: National Highway Traffic Safety Administration.
Dingus, T. A., Hanowski, J., & Klauer, S. (2011). Estimating crash risk. Ergonomics in Design, 4, 8–12.
Dismukes, R. K. (2010). Remembrance of things future: prospective memory in the laboratory, workplace and
everyday settings. In D. Harris (Ed.), Reviews of Human factors and Ergonomics (Vol. 6). Santa Monica
CA: Human Factors and Ergonomics Society.
Dismukes, R. K., Berman, B. A., & Loukopoulos, L. D. (2010). The limits of expertise: Rethinking pilot error
and the causes of airline accidents. Aldershot, England: Ashgate.
Dismukes, R. K., & Nowinski, J. (2007). Prospective memory, concurrent task management and pilot error.
In A. Kramer, D. Wiegmann, & A. Kirlik (Eds.), Attention: from theory to practice. Oxford, England:
Oxford University Press.
Dixon, S. R., & Wickens, C. D. (2006). Automation reliability in unmanned aerial vehicle flight control: A
reliance-compliance model of automation dependence in high workload. Human Factors, 48, 474–486.
Dixon, S. R., Wickens, C. D., & Chang, D. (2005). Mission control of multiple unmanned aerial vehicles: A
workload analysis. Human Factors, 47, 479–487.
Dixon, S. R., Wickens, C. D., & McCarley, J. S. (2007). On the independence of compliance and reliance:
Are automation false alarms worse than misses? Human Factors, 49, 564–572.
Doane, S. M., Pellegrino, J. W., & Klatzky, R. L. (1990). Expertise in a computer operating system:
Conceptualization and performance. Human-Computer Interaction, 5, 267–304.
Dobbs, A. R., & Rule, B. G. (1989). Adult age differences in working memory. Psychology and Aging, 4,
500–503.
Dockrell, J. E., and Shield, B.M. (2006). Acoustical barriers in classrooms: the impact of noise on
performance in the classroom. British Educational Research Journal, 32(3), 509–525.
Dodhia, R., & Dismukes, R. K. (2008). Interruptions create prospective memory tasks. Applied Cognitive
Psychology, 22, 1–17.
Doll, T. J., & Hanna, T. E. (1989). Enhanced detection with bimodal sonar displays. Human Factors, 31,
539–550.
Domini, F., & Caudek, C. (2010). Matching perceived depth from disparity and from velocity: Modeling and
psychophysics. Acta Psychologica, 133, 81–89.
Domini, F., Shah, R., & Caudek, C. (2011). Do we perceive a flattened world on the monitor screen? Acta
Psychologica, 138, 359–366.
Domowitz, I. (1993). A taxonomy of automated trade execution systems, Journal of International Money and
Finance, 12, 607–631.
Donald, F. M. (2008). The classification of vigilance tasks in the real world. Ergonomics, 51, 1,643–1,655.
Donchin, E. (1980). Event-related potentials: Inferring cognitive activity in operational settings. In F. E.
Gomer (Ed.), Biocybernetic applications for military systems (pp. 35–42). (Technical Report MDC
EB1911). Long Beach, CA: McDonnell Douglas.
Donchin, E., Spencer, K. M., & Wijesinghe, R. (2000). The mental prosthesis: Assessing the speed of a
P300-based brain-computer interface. IEEE Transactions on Rehabilitation Engineering, 8, 174–179.
Donders, F. C. (1869, trans. 1969). On the speed of mental processes (trans. W. G. Koster). Acta
Psychologica, 30, 412–431.
Dong, X., & Hayes, C. (2011). The impact of uncertainty visualizations on team decision making and
problem solving. In Proceedings of the Human Factors and Ergonomics Society 55th Annual Meeting (pp.
257–261). Santa Monica, CA: Human Factors and Ergonomics Society.
Donmez, B., Boyle, L., & Lee, J. D. (2006). The impact of distraction mitigation strategies on driving
performance. Human Factors, 48, 785–801.
Donovan, J. J., & Radosevich, D. J. (1999). A meta-analytic review of the distribution of practice effect:
Now you see it, now you don’t. Journal of Applied Psychology, 84, 795–805.
Dornheim, M. A. (2000, July 17). Crew distractions emerge as new safety focus. Aviation Week and Space
Technology, 58–65.
Dorneich, M. C., Whitlow, S. D., Mathan, S., Ververs, P. M., Erdogmus, D., Adami, A., Pavel, M., & Lan,
T. (2007). Supporting real-time cognitive state classification on a mobile individual. Journal of Cognitive
Engineering and Decision Making, 1, 240–270.
Dorneich, M. C., Ververs. P. M., Mathan, S., Whitlow, S., & Hayes, C. C. (2012). Considering etiquette in
the design of an adaptive system. Journal of Cognitive Engineering and Decision Making, 6(2), 243–265.
Dosher, B. A., & Lu, Z. L. (2000). Noise exclusion in spatial attention. Psychological Science, 11, 139–146.
Dougherty, E. M. (1990). Human reliability analysis --Where shouldst thou turn? Reliability Engineering and
System Safety, 29, 283–299.
Dougherty, M. R. P., & Hunter, J. E. (2003). Probability judgment and subadditivity: The role of working
memory capacity and constraining retrieval. Memory & Cognition, 31, 968–982.
Draper, M. H. (1998). The effects of image scale factor on vestibulo-ocular reflex adaptation and simulator
sickness in head-coupled virtual environments. In Proceedings of the Human Factors and Ergonomics
Society 42nd Annual Meeting (pp. 1,481–1,485). Santa Monica, CA: Human Factors and Ergonomics
Society.
Drazin, D. (1961). Effects of fore-period, fore-period variability and probability of stimulus occurrence on
simple reaction time. Journal of Experimental Psychology, 62, 43–50.
Drews, F. A., & Strayer, D. L. (2007). Multi-tasking in the automobile. In A. Kramer, D. Wiegmann, & A.
Kirlik (Eds.), Attention: From theory to practice. Oxford, UK: Oxford University Press.
Drews, F. A., & Strayer, D. L. (2009). Cellular phones and driver distraction. In M. Regan, J. Lee, & K.
Young (Eds.), Driver distraction: Theory, effects and mitigation. Boca Raton, FL: CRC Press.
Drews, F. A., Pasupathi, M., & Strayer, D. L. (2008). Passenger and cell phone conversations in simulated
driving. Journal of Experimental Psychology: Applied, 14, 392–400.
Drews, F. A., & Westenskow, D. R. (2006). The right picture is worth a thousand numbers: Data displays in
anesthesia. Human Factors, 48, 59–71.
Drews, F. A., Yazdani, H., Godfrey, C. N., Cooper, J. M., & Strayer, D. L. (2009). Text messaging during
simulated driving. Human Factors, 51, 762–770.
Driskell, J. E., Radtke, P. H., & Salas, E. (2003). Virtual teams: Effects of technological mediation on team
performance. Group Dynamics: Theory, Research, and Practice, 7(4), 297–323.
Driskell, J. E., Salas, E., & Hall, J. K. (1994). The effect of vigilant and hypervigilant decision training on
performance. Paper presented at the Annual Meeting of the Society of Industrial and Organizational
Psychology. Nashville, TN.
Driver, J., & Spence, C. (2004). Crossmodal spatial attention: Evidence from human performance. In C.
Spence & J. Driver (Eds.), Crossmodal space and crossmodal attention (pp. 179–220). Oxford: Oxford
University Press.
Druckman, D., & Bjork, R. A. (1994). Transfer: Training for performance. In Learning, remembering,
believing (pp. 25–56). Washington, DC: National Academy Press.
Drury, C. G. (1975). Inspection of sheet metal: Model and data. Human Factors, 17, 257–265.
Drury, C. G. (1990). Visual search in industrial inspection. In D. Brogan (Ed.), Visual search (pp. 263–276).
London: Taylor & Francis.
Drury, C. G. (1994). The speed accuracy tradeoff in industry. Ergonomics, 37, 747–763.
Drury, C. G. (2001). Human factors in aircraft inspection. In Aging aircraft fleets: Structural and other
subsystem aspects (pp. 7-1–7-11). Report No. ADA390841. Defense Technical Information Center.
Drury, C. G. (2006). Inspection. In W. Karwowski (Ed.), International encyclopedia of ergonomics and
human factors (Vol. 2). Boca Raton, FL: Taylor & Francis.
Drury, C. G., & Chi, C. F. (1995). A test of economic models of stopping policy in visual search. IIE
Transactions, 27, 382–393.
Drury, C. G., & Clement, M. R. (1978). The effect of area, density, and number of background characters on
visual search. Human Factors, 20, 597–602.
Drury, C. G., & Coury, B. G. (1981). Stress, pacing, and inspection. In G. Salvendy & M. J. Smith (Eds.),
Machine pacing and operational stress. London: Taylor & Francis.
Drury, C. G., Maheswar, G., Das, A., & Helander, M. G. (2001). Improving visual inspection using
binocular rivalry. International Journal of Production Research, 39, 2143–2153.
Duffy, E. (1957). The psychological significance of the concept of ‘arousal’ or ‘activation’. Psychological
Review, 64, 265–275.
Duggan, G. B., & Payne, S. J. (2009). Text skimming: The process and effectiveness of foraging through text
under time pressure. Journal of Experimental Psychology: Applied, 15, 228–242.
Dulaney, C. L., & Marks, W. (2007). The effects of training and transfer on global/local processing. Acta
Psychologica, 125, 203–220.
Duncan, J. (1979). Divided attention: The whole is more than the sum of its parts. Journal of Experimental
Psychology: Human Perception and Performance, 5, 216–228.
Duncan, J. (1984). Selective attention and the organization of visual information. Journal of Experimental
Psychology: General, 113, 501–517.
Duncan, J., & Humphreys, G. W. (1989). Visual search and stimulus similarity. Psychological Review, 96,
433–458.
Durding, B. M., Becker, C. A., & Gould, J. D. (1977). Data organization. Human Factors, 19, 1–14.
Durgin, F. H., & Li, Z. (2010). Controlled interaction: Strategies for using virtual reality to study perception.
Behavior Research Methods, 42, 414–420.
Durso, F. T., Bleckley, M. K., & Dattel, A. R. (2006). Does situation awareness add to the validity of
cognitive tests? Human Factors, 48, 721–733.
Durso, F. T., & Dattel, A. R. (2004). SPAM: The real-time assessment of SA. In S. Banbury and S. Tremblay
(Eds.). A cognitive approach to Situation Awareness: Theory and application. Aldershot, England:
Ashgate.
Durso, F. T., & Gronlund, S. D. (1999). Situation awareness. In F. T. Durso, R. Nickerson, R. Schvaneveldt,
S. Dumais, S. Lindsay and M. Chi (Eds.), Handbook of applied cognition (pp. 283–314). New York: Wiley.
Durso, F. T., & Sethumadhavan, A. (2008). Situation awareness: Understanding dynamic environments.
Human Factors, 50, 442–448.
Duschek, S., & Schandry, R. (2003). Functional transcranial Doppler sonography as a tool in
psychophysiological research. Psychophysiology, 40, 436–454.
Dutcher, J. S. (2006). Caution: This Superman suit will not enable you to fly: Are consumer product warning
labels out of control? Arizona State Law Journal, 38, 633–659.
Dutt, V., & Gonzalez, C. (in press). Why do we want to delay actions on climate change? Effects of
probability and timing of climate consequences. Journal of Behavioral Decision Making, 24: n/a. doi:
10.1002/bdm.721.
Dutta, A., & Nairne, J. S. (1993). The separability of space and time: Dimensional interaction in the memory
trace. Memory & Cognition, 21, 440–448.
Dvorak, A. (1943). There is a better typewriter keyboard. National Business Education Quarterly, 12, 51–58.
Dwyer, F. M. (1967). Adapting visual illustrations for effective learning. Harvard Educational Review, 37,
250–263.
Dye, M., Green, S., & Bavelier, D. (2009). Increasing speed of processing with action video games. Current
Directions in Psychological Science, 18, 321–326.
Dyre, B. P. (1997). Perception of accelerating self-motion: Global optical flow rate dominates discontinuity
rate. In Proceedings of the Human Factors and Ergonomics Society 41st Annual Meeting (pp. 1,333–
1,337). Santa Monica, CA: Human Factors and Ergonomics Society.
Dyre, B. P., & Anderson, G. J. (1997). Image velocity magnitudes and perception of heading. Journal of
Experimental Psychology: Human Perception and Performance, 23, 546–565.
Dyre, B. P., & Lew, R. (2005). Steering errors may result from non-rigid transparent optical flow. In
Proceedings of the Human Factors and Ergonomics Society—49th Annual Meeting (pp. 1,531–1,534).
Santa Monica, CA: Human Factors and Ergonomics Society.
Dyson, B. J., & Quinlan, P. T. (2010). Decomposing the Garner interference paradigm: Evidence for
dissociations between macrolevel and microlevel performance. Attention, Perception & Psychophysics, 72, 1,676–
1,691.
Dzindolet, M. T., Pierce, L. G., Beck, H. P., & Dawe, L. A. (2002). The perceived utility of human and
automated aids in a visual detection task. Human Factors, 44, 79–94.
Eberts, R. E., & MacMillan, A. G. (1985). Misperception of small cars. In R. E. Eberts & C. G. Eberts (Eds.),
Trends in ergonomics/human factors II (pp. 33–39). Amsterdam: North Holland.
Eckstein, M. P., Thomas, J. P., Palmer, J., & Shimozaki, S. S. (2000). A signal detection model predicts the
effects of set size on visual search accuracy for feature conjunction, triple conjunction, and disjunction
displays. Perception & Psychophysics, 62, 425–451.
Edland, A. (1989). On cognitive processes under time stress: A selective review of the literature on time stress
and related stress. Reports from the Department of Psychology. University of Stockholm, Sweden.
Edwards, W. (1987). Decision making. In G. Salvendy (Ed.), Handbook of human factors (pp. 1,061–1,104).
New York: Wiley.
Edwards, W., Lindman, H., & Savage, L. J. (1963). Bayesian statistical inference for psychological research.
Psychological Review, 70, 193–242.
Edworthy, J., Hellier, E., Morley, N., Grey, C., Aldrich, K., & Lee, A. (2004). Linguistic and location effects
in compliance with pesticide warning labels for amateur and professional users. Human Factors, 46, 11–31.
Edworthy, J., Hellier, E., Titchener, K., Naweed, A., & Roels, R. (2011). Heterogeneity in auditory alarm
sets makes them easier to learn. International Journal of Industrial Ergonomics, 41, 136–146.
Edworthy, J., & Loxley, S. (1990). Auditory warning design: The ergonomics of perceived urgency. In E. J.
Lovesey (Ed.), Contemporary ergonomics 1990 (pp. 384–388). London: Taylor & Francis.
Egan, J., Carterette, E., & Thwing, E. (1954). Some factors affecting multichannel listening. Journal of the
Acoustical Society of America, 26, 774–782.
Egeth, H. E., & Pachella, R. (1969). Multidimensional stimulus identification. Perception & Psychophysics,
5, 341–346.
Egeth, H. E., & Yantis, S. (1997). Visual attention: control, representation, and time course. Annual Review
of Psychology, 48, 269–297.
Egger, M., & Smith, G. D. (1997). Meta-analysis: Potentials and promise. British Medical Journal, 315,
1,371–1,374.
Ehrenreich, S. L. (1982). The myth about abbreviations. Proceedings of the 1982 IEEE International
Conference on Cybernetics and Society. New York: Institute of Electrical and Electronic Engineers.
Ehrenreich, S. L. (1985). Computer abbreviations: Evidence and synthesis. Human Factors, 27, 143–155.
Ehrlich, J. A., & Kolasinski, E. M. (1998). A comparison of sickness symptoms between dropout and
finishing participants in virtual environment studies. In Proceedings of the Human Factors and Ergonomics
Society 42nd Annual Meeting (pp. 1,466–1,470). Santa Monica, CA: Human Factors and Ergonomics
Society.
Ehrlich, J. A., Singer, M. J., & Allen, R. C. (1998). Relationships between head-shoulder divergences and
sickness in a virtual environment. In Proceedings of the Human Factors and Ergonomics Society 42nd
Annual Meeting (pp. 1,471–1,475). Santa Monica, CA: Human Factors and Ergonomics Society.
Eichstaedt, J. (2002). Measuring differences in preactivation on the Internet: The content category superiority
effect. Experimental Psychology, 49, 283–291.
Einhorn, H. J., & Hogarth, R. M. (1978). Confidence in judgment: Persistence of the illusion of validity.
Psychological Review, 85, 395–416.
Einhorn, H. J., & Hogarth, R. M. (1981). Behavioral decision theory. Annual Review of Psychology, 32, 53–
88.
Einhorn, H., & Hogarth, R. (1982). Theory of diagnostic inference 1: imagination and the psychophysics of
evidence. Technical Report #2. Chicago: University of Chicago School of Business.
Einstein, G. O., & McDaniel, M. A. (1990). Normal aging and prospective memory. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 16, 717–726.
Einstein, G. O., & McDaniel, M. A. (1996). Retrieval processes in prospective memory: Theoretical
approaches and some new findings. In M. Brandimonte, G. O. Einstein, & M. A. McDaniel (Eds.),
Prospective memory: Theory and applications. Mahwah, NJ: Erlbaum.
Eisen, L. A., & Savel, R. H. (2009). What went right: Lessons for the intensivist from the crew of US
Airways Flight 1549. Chest, 136, 910–917.
Elliott, E. M. (2002). The irrelevant speech effect and children: theoretical implications of developmental
change. Memory & Cognition, 30, 478–487.
Ellis, N. C. & Hennelly, R. A. (1980). A bilingual word-length effect: Implications for intelligence testing and
the relative ease of mental calculation in Welsh and English. British Journal of Psychology, 71, 43–51.
Ellis, N. C., & Hill, S. E. (1978). A comparison of seven-segment numerics. Human Factors, 20, 655–660.
Ellis, S. R. (2006). Towards determination of visual requirements for augmented reality displays and virtual
environments for the airport tower. In NATO workshop proceedings: Virtual Media for the Military HFM–
121/RTG 042 HFM–136 (pp. 31-1-31-9). West Point, NY: North Atlantic Treaty Organization.
Ellis, S. R., & Hitchcock, R. J. (1986). The emergence of Zipf's law: Spontaneous encoding optimization by
users of a command language. IEEE Transactions on Systems, Man, and Cybernetics, SMC-16(3), 423–
427.
Ellis, S. R., Mania, K., Adelstein, B. D., & Hill, M. I. (2004). Generalizability of latency detection in a
variety of virtual environments. In Proceedings of the Human Factors and Ergonomics Society—48th
Annual Meeting (pp. 2,632–2,636). Santa Monica, CA: Human Factors and Ergonomics Society.
Ellis, S. R., McGreevy, M. W., & Hitchcock, R. J. (1987). Perspective traffic display format and air pilot
traffic avoidance. Human Factors, 29, 371–382.
Ellis, S. R., Smith, S. R., Grunwald, A. J., & McGreevy, M. W. (1991). Direction judgement error in
computer generated displays and actual scenes. In S. R. Ellis (Ed.),
Pictorial communication in virtual and real environments (pp. 504–526). London: Taylor and Francis.
Emmelkamp, P. M. G., Krijn, M., & Hulsbosch, A. M. (2002). Virtual reality treatment versus exposure in
vivo: A comparative evaluation in acrophobia. Behaviour Research and Therapy, 40, 509–516.
End, C. M., Worthman, S., Mathews, M. B., and Wetterau, K. (2010). Costly cell phones: The impact of cell
phone rings on academic performance. Teaching of Psychology, 37, 55–57.
Endsley, M. R. (1988). Design and evaluation for situation awareness enhancement. In Proceedings of the
Human Factors Society 32nd Annual Meeting (pp. 97–101). Santa Monica, CA: Human Factors Society.
Endsley, M. R. (1995a). Toward a theory of situation awareness in dynamic systems. Human Factors, 37, 32–
64.
Endsley, M. R. (1995b). Measurement of situation awareness in dynamic systems. Human Factors, 37, 65–
84.
Endsley, M. R. (1997). The role of situation awareness in naturalistic decision making. In G. K. Caroline and
E. Zsambok (Eds.), Naturalistic decision making expertise: Research and applications (pp. 269–283).
Mahwah, NJ: Erlbaum.
Endsley, M. R. (2000). Theoretical underpinnings of situation awareness: A critical review. In M. R. Endsley
& D. J. Garland (Eds.), Situation awareness analysis and measurement (pp. 3–32). Mahwah, NJ: Erlbaum.
Endsley, M. R. (2004). Situation awareness: Progress and directions. In S. Banbury & S. Tremblay (Eds.), A
cognitive approach to situation awareness: Theory and application (pp. 317–341). Aldershot, UK:
Ashgate.
Endsley, M. R., & Garland, D. G. (Eds.) (2001). Situation awareness analysis and measurement. Mahwah, NJ:
Erlbaum.
Endsley, M. R., & Jones, D. G. (2001). Disruptions, interruptions, and information attack: Impact on situation
awareness and decision making. In Proceedings of the Human Factors and Ergonomics Society 45th
Annual Meeting (pp. 63–67). Santa Monica, CA: Human Factors and Ergonomics Society.
Endsley, M. R., & Kaber, D. B. (1999). Level of automation effects on performance, situation awareness and
workload in a dynamic control task. Ergonomics, 42, 462–492.
Endsley, M. R., & Kiris, E. O. (1995). The out-of-the-loop performance problem and level of control in
automation. Human Factors, 37, 381–394.
Engle, R. W. (2002). Working memory capacity as executive attention. Current Directions in Psychological
Science, 11(1), 19–23.
Engle, R. W., Tuholski, S. W., Laughlin, J. E., & Conway, A. R. A. (1999). Working memory, short-term
memory, and general fluid intelligence: A latent-variable approach. Journal of Experimental Psychology:
General, 128, 309–331.
Enns, J. T., & Lleras, A. (2008). What’s next? New evidence for prediction in human vision. Trends in
Cognitive Sciences, 12(9), 327–333.
Ephrath, A. R., Tole, J. R., Stephens, A. T., & Young, L. R. (1980). Instrument scan—Is it an indicator of the
pilot’s workload? In Proceedings of the Human Factors Society 24th Annual Meeting. (pp. 257–258). Santa
Monica, CA: Human Factors and Ergonomics Society.
Ericsson, K. A. (2006). The influence of experience and deliberate practice in the development of superior
expert performance. In K. A. Ericsson, N. Charness, P. J. Feltovich, & R. R. Hoffman (Eds.), The
Cambridge handbook of expertise and expert performance (pp. 683–704). New York: Cambridge
University Press.
Ericsson, K. A., & Kintsch, W. (1995). Long-term working memory. Psychological Review, 102, 211–245.
Ericsson, K. A., & Polson, P. G. (1988). An experimental analysis of a memory skill for dinner orders.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 303–316.
Ericsson, K. A., & Ward, P. (2007). Capturing the naturally occurring superior performance of experts in the
laboratory: Toward a science of expert and exceptional performance. Current Directions in Psychological
Science, 16(6), 346–350.
Eriksen, B. A., & Eriksen, C. W. (1974). Effects of noise letters upon the identification of a target letter in a
non-search task. Perception & Psychophysics, 16, 143–149.
Eriksen, C. W., & Hake, H. N. (1955). Absolute judgments as a function of stimulus range and number of
stimulus and response categories. Journal of Experimental Psychology, 49, 323–332.
Erlick, D. E. (1964). Absolute judgments of discrete quantities randomly distributed over time. Journal of
Experimental Psychology, 67, 475–482.
Ersner-Hershfield, H., Garton, T., Ballard, K., Samanez-Larkin, G. & Knutson, B. (2009). Don’t stop
thinking about tomorrow: Individual differences in future self-continuity account for saving. Judgment and
Decision Making, 4, 280–286.
ESSAI (2001). WP2 Identification of factors affecting situation awareness and crisis management on the
flight deck work package report. Report accessible online at www.essai.nlr.nl.
Eulitz, C., & Hanneman, R. (2010). On the matching of top-down knowledge with sensory input in the
perception of ambiguous speech. BMC Neuroscience, 11, 67–78.
Evans, J. St. B. T. (2007). Hypothetical thinking: Dual processes in reasoning and judgment. Hove, East
Sussex, England: Psychology Press.
Fadden, S., Ververs, P. M., & Wickens, C. D. (1998). Costs and benefits of head-up display use: A meta-
analytic approach. In Proceedings of the Human Factors and Ergonomics Society 42nd Annual Meeting
(pp. 16–20). Santa Monica, CA: Human Factors and Ergonomics Society.
Fadden, S., Ververs, P. M., & Wickens, C. D. (2001). Pathway HUDS: Are they viable? Human Factors, 43,
173–193.
Falk, V., Mintz, D., Grunenfelder, J., Fann, J. I., & Burdon, T. A. (2001). Influence of three-dimensional
vision on surgical telemanipulator performance. Surgical Endoscopy, 15(11), 1282–1288.
Farrell, S., & Lewandowsky, S. (2000). A connectionist model of complacency and adaptive recovery under
automation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 395–410.
Fedota, J. R., & Parasuraman, R. (2010). Neuroergonomics and human error. Theoretical Issues in
Ergonomics Science, 11, 402–421.
Feigh, K. M., Dorneich, M. C., & Hayes, C. C. (2012). Toward a characterization of adaptive systems: A
framework for researchers and system designers. Human Factors, 54. doi: 10.1177/0018720812443983.
Fein, R. M., Olson, G. M., & Olson, J. S. (1993). A mental model can help with learning to operate a
complex device. In CHI 93 Proceedings of Human Factors in Computing Systems (pp. 157–158). New
York: Association for Computing Machinery.
Feldon, D. F. (2007). The implications of research on expertise for curriculum and pedagogy. Educational
Psychology Review, 19, 91–110.
Felton, E. A., Radwin, R. G., Wilson, J. A., & Williams, J. C. (2009). Evaluation of a modified Fitts law brain-
computer interface target acquisition task in able and motor disabled individuals. Journal of Neural
Engineering, 6, 1–7.
Felton, E. A., Wilson, J. A., Radwin, R. G., Williams, J. C., & Garell, P. C. (2005). Electrocorticogram-
controlled brain-computer interfaces in patients with temporary subdural electrode implants. Neurosurgery,
57(2), 425.
Fendrich, D. W., & Arengo, R. (2004). The influence of string length and repetition on chunking of digit
strings. Psychological Research, 68, 216–223.
Fennema, M. G., & Kleinmuntz, D. N. (1995). Anticipations of effort and accuracy in multiattribute choice.
Organizational Behavior and Human Decision Processes, 63, 21–32.
Ferrarini, L., Verbist, B. M., Olofsen, H., et al. (2008). Autonomous virtual mobile robot for three-
dimensional medical image exploration: Application to micro-CT cochlear images. Artificial Intelligence
in Medicine, 43, 1–15.
Ferrez, P. W., & del Millan, J. (2005). You are wrong!—automatic detection of interaction errors from brain
waves. In Proceedings of the 19th International Joint Conference on Artificial Intelligence (pp. 1,413–
1,418). Edinburgh, Scotland: IJCAI.
Ferris, T., Sarter, N. B., & Wickens, C. D. (2010). Cockpit automation: Still struggling to catch up…. In E.
Wiener & D. Nagel (Eds.), Human factors in aviation (2nd Ed.). Elsevier.
Figner, B., & Weber, E. (2011). Who takes risks when and why? Determinants of risk taking. Current
Directions in Psychological Science, 20, 211–216.
Fincham, J. M., Carter, C. S., van Veen, V., Stenger, V. A., & Anderson, J. R. (2002). Neural mechanisms
of planning: A computational analysis using event-related fMRI. Proceedings of the National Academy of
Sciences (USA), 99, 3,346–3,351.
Fischer, E., Haines, R., & Price, T. (1980, December). Cognitive issues in head-up displays (NASA
Technical Paper 1711). Washington, DC: NASA.
Fischhoff, B. (1977). Perceived informativeness of facts. Journal of Experimental Psychology: Human
Perception and Performance, 3, 349–358.
Fischhoff, B. (2002). Heuristics and biases in application. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.).
Heuristics and biases: The psychology of intuitive judgment (pp. 730–748). New York: Cambridge
University Press.
Fischhoff, B., & Bar-Hillel, M. (1984). Diagnosticity and the base-rate effect. Memory & Cognition, 12,
402–410.
Fischhoff, B., & MacGregor, D. (1982). Subjective confidence in forecasts. Journal of Forecasting, 1, 155–
172.
Fischhoff, B., Slovic, P., & Lichtenstein, S. (1977). Knowing with certainty: The appropriateness of extreme
confidence. Journal of Experimental Psychology: Human Perception and Performance, 3, 552–564.
Fisher, D. L., & Tan, K. C. (1989). Visual displays: The highlighting paradox. Human Factors, 31, 17–30.
Fisher, D. L., Coury, B. G., Tengs, T. O., & Duffy, S. A. (1989). Minimizing the time to search visual
displays: The role of highlighting. Human Factors, 31, 167–182.
Fisher, D. L., & Pollatsek, A. (2007). Novice driver crashes: Failure to divide attention or failure to recognize
risks. In A. Kramer, D. Wiegmann, & A. Kirlik (Eds.), Attention: from theory to practice (pp. 134–156).
Oxford, UK: Oxford University Press.
Fisher, D. L., Schweickert, R., & Drury, C. G. (2006). Mathematical models in engineering psychology:
Optimizing performance. In G. Salvendy (Ed.), Handbook of human factors and ergonomics (3rd Ed.), (pp.
997–1024). New York: Wiley.
Fisk, A. D., Ackerman, P. L., & Schneider, W. (1987). Automatic and controlled processing theory and its
applications to human factors problems. In P. A. Hancock (ed.), Human factors psychology (pp. 159–197).
Amsterdam: Elsevier.
Fisk, A. D., Oransky, N. A., & Skedsvold, P. R. (1988). Examination of the role of “higher-order”
consistency in skill development. Human Factors, 30, 567–582.
Fisk, A. D., & Rogers, W. (2007). Attention goes home: support for aging adults. In A. Kramer, D.
Wiegmann, & A. Kirlik (Eds.), Attention: From theory to practice. Oxford, UK: Oxford University Press.
Fisk, A. D., & Schneider, W. (1981). Controlled and automatic processing during tasks requiring sustained
attention. Human Factors, 23, 737–750.
Fitts, P. M. (1966). Cognitive aspects of information processing III: Set for speed versus accuracy. Journal of
Experimental Psychology, 71, 849–857.
Fitts, P. M., & Deininger, R. L. (1954). S-R compatibility: Correspondence among paired elements within
stimulus and response codes. Journal of Experimental Psychology, 48, 483–492.
Fitts, P. M., & Posner, M. A. (1967). Human performance. Pacific Palisades, CA: Brooks Cole.
Fitts, P. M., & Seeger, C. M. (1953). S-R compatibility: Spatial characteristics of stimulus and response
codes. Journal of Experimental Psychology, 46, 199–210.
Fitts, P. M., Peterson, J. R., & Wolpe, G. (1963). Cognitive aspects of information processing II:
Adjustments to stimulus redundancy. Journal of Experimental Psychology, 65, 423–432.
Flach, J. M., Hagen, B. A., & Larish, J. F. (1992). Active regulation of altitude as a function of optical
texture. Perception & Psychophysics, 51, 557–568.
Flach, J. M., Warren, R., Garness, S. A., Kelly, L., & Stanard, T. (1997). Perception and control of altitude:
Splay and depression angles. Journal of Experimental Psychology: Human Perception and Performance,
23, 1,764–1,782.
Flach, J., Mulder, M., & van Paassen, M. M. (2004). The concept of situation in psychology. In S. Banbury &
S. Tremblay (Eds.), A cognitive approach to situation awareness: Theory and application (pp. 42–60).
Aldershot, UK: Ashgate.
Flannagan, M., & Sayer (2010). University of Michigan Transportation Research Institute Technical report.
Ann Arbor, Michigan: University of Michigan.
Flavell, R., & Heath, A. (1992). Further investigations into the use of colour coding scales. Interacting with
Computers, 4, 179–199.
Fleetwood, M., & Byrne, M. (2006). Modeling the visual search of displays: A revised ACT-R model of icon
search based on eye-hand tracking data. Human-Computer Interaction, 21, 155–98.
Flexman, R., & Stark, E. (1987). Training simulators. In G. Salvendy (Ed.), Handbook of human factors.
New York: Wiley.
Flight International. (1990, October 31). Lessons to be learned, pp. 24–26.
Flin, R. H. (2007). Crew resource management for teams in the offshore oil industry. Team Performance
Management, 3(2), 121–129.
Flin, R., Fletcher, G., McGeorge, P., Sutherland, A., & Patey, R. (2003). Anaesthetists’ attitudes to
teamwork and safety. Anaesthesia, 58, 233–242.
Flin, R., Salas, E., Strub, M., & Martin, L. (1997). Decision making under stress: Emerging themes and
applications. Burlington, VT: Ashgate.
Flowe, H. D., & Ebbesen, E. B. (2007). The effect of lineup member similarity on recognition accuracy in
simultaneous and sequential lineups. Law and Human Behavior, 31, 33–52.
Fogarty, G., & Stankov, L. (1982). Competing tasks as an index of intelligence. Personality and Individual
Differences, 3, 407–422.
Folk, C. L., Remington, R. W., & Johnston, J. C. (1992). Involuntary covert orienting is contingent on
attentional control settings. Journal of Experimental Psychology: Human Perception and Performance, 18,
1030–1044.
Fong, G. T., & Nisbett, R. E. (1991). Immediate and delayed transfer of training effects in statistical
reasoning. Journal of Experimental Psychology: General, 120, 34–45.
Fontenelle, G. A. (1983). The effect of task characteristics on the availability heuristic or judgments of
uncertainty (Report No. 83–1). Office of Naval Research, Rice University.
Ford, J. K., Schmitt, N., Scheitman, S. L., Hults, B. M., & Doherty, M. L. (1989). Process tracing methods:
Contributions, problems and neglected research questions. Organizational Behavior & Human Decision
Processes, 43, 75–117.
Fougnie, D., & Marois, R. (2007). Executive working memory load induces inattentional blindness.
Psychonomic Bulletin & Review, 14, 142–147.
Foushee, H. C. (1984). Dyads and triads at 35,000 feet: Factors affecting group process and aircrew
performance. American Psychologist, 39, 885–893.
Foushee, H. C., & Helmreich, R. L. (1988). Group interaction and flight crew performance. In E. Wiener &
D. Nagel (Eds.), Human factors in aviation. San Diego, CA: Academic Press.
Fowler, F. D. (1980). Air traffic control problems: A pilot’s view. Human Factors, 22, 645–654.
Fracker, M. L., & Wickens, C. D. (1989). Resources, confusions, and compatibility in dual axis tracking:
Display, controls, and dynamics. Journal of Experimental Psychology: Human Perception and
Performance, 15, 80–96.
Frankenstein, J., Mohler, B., Bulthoff, H. & Meilinger, T. (2012). Is the map in our head oriented north?
Psychological Science, 22, 120–125.
Franklin, N., & Tversky, B. (1990). Searching imagined environments. Journal of Experimental Psychology:
General, 119, 63–76.
Frankmann, J. P., & Adams, J. A. (1962). Theories of vigilance. Psychological Bulletin, 59, 257–272.
Frantz, J. P. (1994). Effect of location and procedural explicitness on user processing of and compliance with
product warnings. Human Factors, 36, 532–546.
Freed, M. (2000). Reactive prioritization. In Proceedings of the 2nd NASA International Workshop on
Planning and Scheduling in Space. Washington, DC: National Aeronautics and Space Administration.
Friedman, D. B., & Hoffman-Goetz, L. (2006). A systematic review of readability and comprehension
instruments used for print and web-based cancer information. Health Education & Behavior, 33(3), 352–
373.
Friedman, N. P., Miyake, A., Young, S. E., DeFries, J. C., Corley, R. P., & Hewitt, J. K. (2008). Individual
differences in executive functions are almost entirely genetic in origin. Journal of Experimental
Psychology: General, 137, 201–225.
Fuchs, A. H. (1962). The progression regression hypothesis in perceptual-motor skill learning. Journal of
Experimental Psychology, 63, 177–192.
Funk, K., Lyall, B., Wilson, J., Vint, R., Niemczyk, M., Suroteguh, C., & Owen, G. (1999). Flight deck
automation issues. International Journal of Aviation Psychology, 9, 109–123.
Gajendran, R. S., & Harrison, D. A. (2007). The good, the bad, and the unknown about telecommuting:
Meta-analysis of psychological mediators and individual consequences. Journal of Applied Psychology, 92,
1,524–1,541.
Gallimore, J. J., & Brown, M. E. (1993). Visualization of 3-D computer-aided design objects. International
Journal of Human-Computer Interaction, 5, 361–382.
Galster, S., & Parasuraman, R. (2001). Evaluation of countermeasures for performance decrements due to
automated-related complacency in IFR-rated General Aviation pilots. In Proceedings of the International
Symposium on Aviation Psychology (pp. 245–249). Columbus, OH: Association of Aviation Psychology.
Gane, B. D., & Catrambone, R. (2011). Extended practice in motor learning under varied practice schedules:
Effects of blocked, blocked-repeated, and random schedules. In Proceedings of the Human Factors and
Ergonomics Society—55th Annual Meeting (pp. 2143–2147). Santa Monica, CA: Human Factors and
Ergonomics Society.
Ganel, T., Goshen-Gottstein, Y., & Goodale, M. A. (2005). Interactions between the processing of gaze
direction and facial expression. Vision Research, 45, 1,191–1,200.
Garbis, C., and Artman, H. (2004). Team situation awareness as communicative practices. In S. Banbury &
S. Tremblay (Eds.), A cognitive approach to situation awareness: Theory and application (pp. 275–296).
Aldershot, UK: Ashgate.
Gardiner, J. M., & Richardson-Klavehn, A. (2000). Remembering and knowing. In E. Tulving & F. I. M.
Craik (Eds.), The Oxford handbook of memory (pp. 229–244). New York: Oxford University Press.
Garg, A. X., Adhikari, N. K., McDonald, H., Rosas-Arellano, M. P., Devereaux, P., & Beyene, J. (2005).
Effects of computerized clinical decision support systems on practitioner performance and patient
outcomes. Journal of the American Medical Association, 293, 1,223–1,238.
Gärling, T. (1989). The role of cognitive maps in spatial decisions. Journal of Environmental Psychology, 9,
269–278.
Garling, T., Kirchler, E., Lewis, A., & van Raaij, F. (2009). Psychology, financial decision making and
financial crises. Psychological Science in the Public Interest, 10 (whole issue).
Garner, W. R. (1974). The processing of information and structure. Hillsdale, NJ: Erlbaum.
Garner, W. R., & Felfoldy, G. L. (1970). Integrality of stimulus dimensions in various types of information
processing. Cognitive Psychology, 1, 225–241.
Garzonis, S., Jones, S., Jay, T., & O’Neill, E. (2009). Auditory icon and earcon mobile service notifications:
Intuitiveness, learnability, memorability and preference. In Proceedings of the 27th International
Conference on Human Factors in Computing Systems. Boston, MA, USA.
Gawande, A., & Bates, D. (2000, February). The use of information technology in improving medical
performance: Part I. Information systems for medical transactions. Medscape General Medicine, 2, 1–6.
Gazzaley, A., Cooney, J. W., Rissman, J., & D’Esposito, M. (2005). Top-down suppression deficit underlies
working memory impairment in normal aging. Nature Neuroscience, 8, 1,298–1,300.
Gazzaniga, M. S. (2009). The cognitive neurosciences. Cambridge, MA: MIT Press.
Geelhoed, E., Parker, A., Williams, D. J., & Groen, M. (2011). Effects of latency on telepresence. HP
Laboratories Report HPL-2009-120. Palo Alto, CA: Hewlett-Packard.
Geisler, W. S. (2008). Visual perception and the statistical properties of natural scenes. Annual Review of
Psychology, 59, 10.1–10.26.
Geisler, W. S., & Chou, K. (1995). Separation of low-level and high-level factors in complex tasks: visual
search. Psychological Review, 102, 356–378.
Gentner, D. R. (1982). Evidence against a central control model of timing in typing. Journal of Experimental
Psychology: Human Perception and Performance, 9, 793–810.
Gentner, D., & Stevens, A. L. (1983). Mental models. Hillsdale, NJ: Erlbaum.
Getty, D., Swets, J., Pickett, R., & Gonthier, D. (1995). System operator response to warnings of danger.
Journal of Experimental Psychology: Applied, 1, 19–33.
Getty, D. J., & Green, P. J. (2007). Clinical applications for stereoscopic 3-D displays. Journal of the Society
for Information Display, 15 (6), 377–384.
Getty, D. J., Pickett, R. M., D’Orsi, C. J., & Swets, J. A. (1988). Enhanced interpretation of diagnostic
images. Investigative Radiology, 23, 240–252.
Getzmann, S. (2003). The influence of the acoustic context on vertical sound localization in the median plane.
Perception & Psychophysics, 65, 1,045–1,057.
Gevins, A., & Smith, M. E. (2003). Neurophysiological measures of cognitive workload during human-
computer interaction. Theoretical Issues in Ergonomics Science, 4(1–2), 113–131.
Gevins, A., & Smith, M. E. (2007). Electroencephalogram in neuroergonomics. In R. Parasuraman & M.
Rizzo (Eds.), Neuroergonomics: The brain at work (pp. 15–31). New York: Oxford University Press.
Gevins, A., Smith, M. E., Leong, H., McEvoy, L., Whitfield, S., & Du, R. (1998). Monitoring working
memory load during computer-based tasks with EEG pattern recognition methods. Human Factors, 40, 79–
91.
Gibb, R. W. (2007). Visual spatial disorientation: Revisiting the black hole illusion. Aviation, Space, and
Environmental Medicine, 78, 801–808.
Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton-Mifflin.
Gigerenzer, G., Czerlinski, J., & Martignon, L. (2002). How good are fast and frugal heuristics? In T.
Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases: The psychology of intuitive judgment
(pp. 559–581). New York: Cambridge University Press.
Gigerenzer, G., & Todd, P. (1999). Simple heuristics that make us smart. New York: Oxford University Press.
Gillan, D. J. (1995). Visual arithmetic, computational graphics, and the spatial metaphor. Human Factors, 37,
766–780.
Gillan, D. J. (2009). A componential model of human interaction with graphs: VII. A review of the mixed
arithmetic-perceptual model. In Proceedings of the Human Factors and Ergonomics Society 53rd Annual
Meeting (pp. 829–833). Santa Monica, CA: Human Factors and Ergonomics Society.
Gillan, D. J., & Lewis, R. (1994). A componential model of human interaction with graphs: I. Linear
regression modeling. Human Factors, 36, 419–440.
Gillan, D. J., & Richman, E. H. (1994). Minimalism and the syntax of graphs. Human Factors, 36, 619–644.
Gillan, D. J., & Sorensen, D. (2009). Minimalism and the syntax of graphs: II. Effects of graph backgrounds
on visual search. In Proceedings of the Human Factors and Ergonomics Society—53rd Annual Meeting (pp.
1,096–1,100). Santa Monica, CA: Human Factors and Ergonomics Society.
Gillan, D. J., Wickens, C. D., Hollands, J. G., & Carswell, C. M. (1998). Guidelines for presenting
quantitative data in HFES Publications. Human Factors, 40, 28–41.
Gillie, T., & Broadbent, D. (1989). What makes interruptions disruptive? A study of length, similarity, and
complexity. Psychological Research, 50, 243–250.
Gillies, M., & Spanlang, B. (2010). Comparing and evaluating real-time character engines for virtual
environments. Presence, 19, 95–117.
Gilovich, T., Griffin, D., & Kahneman, D. (Eds.) (2002). Heuristics and biases: The psychology of intuitive
judgment. New York: Cambridge University Press.
Gilovich, T., Vallone, R., & Tversky, A. (2002). The hot hand in basketball: On the misperception of random
sequences. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and Biases (pp. 601-616).
Cambridge, UK: Cambridge University Press.
Glanzer, M., Kim, K., Hilford, A., & Adams, J. K. (1999). Slope of the receiver-operating characteristic in
recognition memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 500–
513.
Glass, G. V. (1976). Primary, secondary, and meta-analysis of research. Educational Researcher, 5, 3–8.
Gobet, F. (1998). Expert memory: A comparison of four theories. Cognition, 66, 115–152.
Gobet, F. (2005). Chunking models of expertise: Implications for education. Applied Cognitive Psychology,
19, 183–204.
Gobet, F., & Clarkson, G. (2004). Chunks in expert memory: Evidence for the magical number four … or is it
two? Memory, 12(6), 732–747.
Goddard, K., Roudsari, A., & Wyatt, J. C. (2012). Automation bias: a systematic review of frequency, effect
mediators, and mitigators. Journal of the American Medical Informatics Association, 19, 121–127.
Goldberg, L. (1968). Simple models or simple processes? Some research on clinical judgment. American
Psychologist, 23, 483–496.
Golden, T. D., Veiga, J. F., & Dino, R. N. (2008). The impact of professional isolation on teleworker job
performance and turnover intentions: Does time spent teleworking, interacting face-to-face, or having
access to communication-enhancing technology matter? Journal of Applied Psychology, 93, 1,412–1,421.
Goldstein, E. B. (2010) Sensation and perception (8th Ed.). Belmont, CA: Wadsworth.
Goldstein, W. M., & Hogarth, R. M. (1997). Research on judgment and decision making: Currents,
connections, and controversies. New York: Cambridge University Press.
Golestani, N., Rosen, S., & Scott, S. K. (2009). Native-language benefit for understanding speech-in-noise:
The contribution of semantics. Bilingualism: Language & Cognition, 12, 385–392.
Gollwitzer, P. M. (1999). Implementation intentions: strong effects of simple plans. American Psychologist,
54, 493–503.
Gong, L., & Nass, C. (2007). When a talking-face computer agent is half-human and half-humanoid: Human
identity and consistency preference. Human Communication Research, 33(2), 163–193.
Gonzales, V. M., & Mark, G. (2004). Constant, constant, multi-tasking craziness: Managing multiple working
spheres. In Human Factors of Computing Systems: CHI 04 (pp. 113–120). New York: Association for
Computing Machinery.
Gonzalez, C., & Wimisberg, J. (2007). Situation awareness in dynamic decision-making: Effects of practice
and working memory. Journal of Cognitive Engineering and Decision Making, 1, 56–74.
Goodale, M. A., & Milner, A. D. (2005). Sight unseen: An exploration of conscious and unconscious vision.
Oxford, UK: Oxford University Press.
Goodman, M. J., Tijerina, L., Bents, F. D., & Wierwille, W. W. (1999). Using cellular telephones in vehicles:
Safe or unsafe? Transportation Human Factors, 1, 3–42.
Goodrich, M. A., McLain, T. W., Anderson, J. D., Sun, J., & Crandall, J. W. (2007). Managing autonomy in
robot teams: Observations from four experiments. In Proceedings of the Second ACM SIGCHI/SIGART
Conference on Human-Robot Interaction (pp. 25–32). doi:10.1145/1228716.1228721. New York:
Association for Computing Machinery.
Goodstein, L. P. (1981). Discriminative display support for process operators. In J. Rasmussen & W. B.
Rouse (Eds.), Human detection and diagnosis of system failures. New York: Plenum.
Goodwin, G. A. (2006). The training, retention, and assessment of digital skills: A review and integration of
the literature. U.S. Army Research Institute Research Report 1864. Arlington, VA: U.S. Army Research
Institute for the Behavioral and Social Sciences.
Gopher, D. (1993). The skill of attention control: Acquisition and execution of attention strategies. In D.
Meyer & S. Kornblum (Eds.), Attention and performance XIV. Hillsdale, NJ: Erlbaum.
Gopher, D. (2007). Emphasis change in high demand task training. In A. Kramer, D. Wiegmann, & A. Kirlik
(Eds.), Attention: from theory to practice. Oxford, England: Oxford University Press.
Gopher, D., Brickner, M., & Navon, D. (1982). Different difficulty manipulations interact differently with
task performance: evidence for multiple resources. Journal of Experimental Psychology: Human
Perception and Performance, 8, 146–157.
Gopher, D., & Donchin, E. (1986). Workload: An experimentation of the concept. In K. Boff, L. Kauffman,
& J. Thomas (Eds.), Handbook of perception and performance (Vol. II). New York: Wiley.
Gopher, D., & Koriat, A. (Eds.) (1998). Attention and performance XVII: Cognitive regulation of
performance: Interaction of theory and application. New York: Academic Press.
Gopher, D., & Raij, D. (1988). Typing with a two hand chord keyboard—will the QWERTY become
obsolete? IEEE Transactions on Systems, Man, and Cybernetics, 18, 601–609.
Gopher, D., Weil, M., & Bareket, T. (1994). Transfer of skill from a computer game trainer to flight. Human
Factors, 36, 387–405.
Gopher, D., Weil, M., & Siegel, D. (1989). Practice under changing priorities: An approach to the training of
complex skills. Acta Psychologica, 71, 147–177.
Gordon, C. P. (2009). Crash studies of driver distraction. In M. Regan, J. Lee, & K. Young (Eds.), Driver
distraction: Theory, effects and mitigation. Boca Raton, FL: CRC Press.
Gordon, R. L., Schön, D., Magne, C., Astésano, C., & Besson, M. (2010). Words and melody are intertwined
in perception of sung words: EEG and behavioral evidence. PLoS ONE 5(3): e9889.
Gordon, S. E., Schmierer, K. A., & Gill, R. T. (1993). Conceptual graph analysis: Knowledge acquisition for
instructional system design. Human Factors, 35, 459–481.
Gorman, J. C., & Cooke, N. J. (2011). Changes in team cognition after a retention interval: The benefits of
mixing it up. Journal of Experimental Psychology: Applied, 17, 303–319.
Gorman, J. C., Cooke, N. J., and Winner, J. L. (2006). Measuring team situation awareness in decentralized
command and control environments. Ergonomics, 49, 1,312–1,325.
Gramopadhye, A. K., Drury, C. G., Jiang, X., & Sreenivasan, R. (2002). Visual search and visual lobe size:
can training on one affect the other? International Journal of Industrial Ergonomics, 30, 181–195.
Gratton, G., Coles, M. G. H., Sirevaag, E., Eriksen, C. W., & Donchin, E. (1988). Pre- and post-stimulus
activation of response channels: A psychophysiological analysis. Journal of Experimental Psychology:
Human Perception and Performance, 14, 331–344.
Gray, R. (2004). Attending to the execution of a complex sensory motor skill: Expertise differences, choking
and slumps. Journal of Experimental Psychology: Applied, 10, 42–54.
Gray, W. (Ed.) (2007). Integrated models of cognitive systems. Oxford, UK: Oxford University Press.
Gray, R., Geri, G. A., Akhtar, S. C., & Covas, C. M. (2008). The role of visual occlusion in altitude
maintenance during simulated flight. Journal of Experimental Psychology: Human Perception and
Performance, 34, 475–488.
Gray, W. D., & Fu, W. T. (2004). Soft constraints in interactive behavior: The case of ignoring perfect
knowledge in-the-world for imperfect knowledge in-the-head. Cognitive Science, 28, 359–382.
Green, A. E., Munafo, M., DeYoung, C., Fossella, J. A., Fan, J., & Gray, J. R. (2008). Using genetic data in
cognitive neuroscience: From growing pains to genuine insights. Nature Reviews Neuroscience, 9, 710–
720.
Green, C. S., & Bavelier, D. (2003). Action video game modifies visual selective attention. Nature, 423, 534–
537.
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: Wiley.
(Reprinted 1988, Los Altos, CA: Peninsula).
Greenwald, A. (1970). A double stimulation test of ideomotor theory with implications for selective attention.
Journal of Experimental Psychology, 84, 392–398.
Greenwood, P. M., Fossella, J., & Parasuraman, R. (2005). Specificity of the effect of a nicotinic receptor
polymorphism on individual differences in visuospatial attention. Journal of Cognitive Neuroscience, 17,
1,611–1,620.
Gregory, R. L. (1997). Knowledge in perception and illusion. Philosophical Transactions of the Royal Society
London, B, 352, 1,121–1,128.
Grether, W. F. (1949). Instrument reading I: The design of long-scale indicators for speed and accuracy of
quantitative readings. Journal of Applied Psychology, 33, 363–372.
Grether, W. F., & Baker, C. A. (1972). Visual presentation of information. In H. P. Van Cott & R. G.
Kinkade (Eds.), Human engineering guide to system design. Washington, DC: U.S. Government Printing
Office.
Grice, H. P. (1975). Logic and conversation. In P. Cole & J. Morgan (Eds.), Syntax and semantics: Speech
acts (Vol. 3, pp. 276–290). New York: Academic Press.
Griffin, D., & Tversky, A. (1992). The weighing of evidence and the determinants of confidence. Cognitive
Psychology, 24, 411–435.
Griffiths, T. L., Steyvers, M., & Tenenbaum, J. B. (2007). Topics in semantic representation. Psychological
Review, 114, 211–244.
Gronlund, S. D., Ohrt, D. D., Dougherty, M. R. P., Perry, J. L., & Manning, C. A. (1998). Role of memory
in air traffic control. Journal of Experimental Psychology: Applied, 4, 263–280.
Gronlund, S. D., Carlson, C., Dailey, S., & Goodsell, C. (2009). Robustness of the sequential lineup advantage. Journal of Experimental Psychology: Applied, 15, 140–152.
Grossman, T., Dragicevic, P., & Balakrishnan, R. (2007). Strategies for accelerating online learning of
hotkeys. In CHI 2007 Proceedings of Human Factors in Computing Systems (pp. 1,591–1,600). New York:
Association for Computing Machinery.
Grosz, J., Rysdyk, R. T., Bootsma, R. J., Mulder, J. A., van der Vaart, J. C., & van Wieringen, P. C. W.
(1995). Perceptual support for timing of the flare in the landing of an aircraft. In P. Hancock, J. Flach, J.
Caird, & K. Vicente (Eds.), Local applications of the ecological approach to human-machine systems (pp. 104–
121). Hillsdale, NJ: Erlbaum.
Grundgeiger, T., Sanderson, P., Macdougall, H., & Balasubramanian, V. (2010). Interruption management
in the intensive care unit. Journal of Experimental Psychology: Applied, 16, 317–334.
Grunwald, A. J., & Ellis, S. R. (1993). Visual display aid for orbital maneuvering: Design considerations.
Journal of Guidance, Control, and Dynamics, 16, 139–150.
Gugerty, L. J., & Tirre, W. C. (2000). Individual differences in situation awareness. In M. R. Endsley & D. J.
Garland (Eds.), Situation awareness analysis and measurement (pp. 249–276). Mahwah, NJ: Erlbaum.
Gugerty, L. J., Brooks, J. O., & Treadaway, C. A. (2004). Individual differences in situation awareness for
transportation tasks. In S. Banbury & S. Tremblay (Eds.), A cognitive approach to situation awareness:
Theory and application (pp. 193–212). Aldershot, UK: Ashgate.
Gugerty, L. J., Rakauskas, M., & Brooks, J. (2004). Effects of remote and in-person verbal interactions on
verbalization rates and attention to dynamic spatial scenes. Accident Analysis and Prevention, 36, 1,029–
1,043.
Gunn, D. V., Warm, J. S., Nelson, W. T., Bolia, R. S., Schumsky, D. A., & Corcoran, K. J. (2005). Target
acquisition with UAVs: Vigilance displays and advanced cuing interfaces. Human Factors, 47, 488–497.
Gurushanthaiah, K., Weinger, M. B., & Englund, C. E. (1995). Visual display format affects the ability of
anesthesiologists to detect acute physiologic changes: A laboratory study employing a clinical display
simulator. Anesthesiology, 83, 1,184–1,193.
Haber, R. N., & Schindler, R. M. (1981). Error in proofreading: Evidence of syntactic control of letter
processing? Journal of Experimental Psychology: Human Perception and Performance, 7, 573–579.
Haelbig, T. D., Mecklinger, A., Schriefers, H., & Friederici, A. D. (1998). Double dissociation of processing
temporal and spatial information in working memory. Neuropsychologia, 36, 305–311.
Hagen, L., Herdman, C. M., & Brown, M. S. (2007). The performance costs of digital head-up displays.
VSIM Report, Centre for Advanced Studies in Visualization and Simulation, Carleton University, Ottawa,
Canada. Available at http://www6.carleton.ca/ace/projects-and-publications/heads-up-displays/
Hailpern, J., Karahalios, K., DeThorne, L., & Halle, J. (2009). Talking points: the differential impact of
real-time computer generated audio/visual feedback on speech-like & non-speech-like vocalizations in low
functioning children with ASD. Proceedings of the 11th international ACM SIGACCESS conference on
Computers and Accessibility. Pittsburgh, PA.
Hale, S., Stanney, K. M., & Malone, L. (2009). Enhancing virtual environment spatial awareness training and
transfer through tactile and vestibular cues. Ergonomics, 52, 187–203.
Halford, G. S., Baker, R., McCredden, J. E., & Bain, J. D. (2005). How many variables can humans
process? Psychological Science, 16(1), 70–76.
Halford, G., Wilson, W., & Philips, S. (1998). Processing capacity defined by relational complexity.
Behavioral and Brain Sciences. 21, 803–831.
Hammond, K. R., Hamm, R. M., Grassia, J., & Pearson, T. (1987). Direct comparison of the efficacy of
intuitive and analytical cognition in expert judgment. IEEE Transactions on Systems, Man, and Cybernetics,
SMC–17(5), 753–770.
Hampton, D. C. (1994). Expertise: The true essence of nursing art. Advances in Nursing Science, 17, 15–24.
Hancock, P. A., Billings, D. R., Schaefer, K. E., Chen, J. Y. C., de Visser, E., & Parasuraman, R. (2011). A
meta-analysis of factors affecting trust in human-robot interaction. Human Factors, 53, 517–527.
Hancock, P. A., & Chignell, M. H. (Eds.) (1989). Intelligent interfaces: Theory, research and design. Amsterdam: North-Holland.
Hancock, P. A., & Desmond, P. (2001). Stress, workload and fatigue. Mahwah, NJ: Erlbaum.
Hancock, P. A., & Ganey, N. (2003). From the inverted-U to the extended-U: The evolution of a law of
psychology. Journal of Human Performance in Extreme Environments, 7, 5–14.
Hancock, P. A., & Meshkati, N. (1988). Human mental workload. Amsterdam: North Holland.
Hancock, P. A., & Warm, J. S. (1989). A dynamic model of stress and sustained attention. Human Factors,
31, 519–537.
Hankins, T. C., & Wilson, G. F. (1998). A comparison of heart rate, eye activity, EEG and subjective
measures of pilot mental workload during flight. Aviation, Space and Environmental Medicine, 69, 360–
367.
Harrington, D., & Kello, J. (1991). Systematic evaluation of nuclear operator team skills training. In
Proceedings of the American Nuclear Society, San Francisco, CA.
Harris, D. H., & Chaney, F. D. (1969). Human factors in quality assurance. New York: Wiley.
Harris, H., Ballenson, J. N., Nielsen, A., & Yee, N. (2009). The evolution of social behavior over time in
Second Life. Presence, 18, 434–448.
Harris, R. L., Glover, B. L., & Spady, A. A. (1986). Analytic techniques of pilot scanning behavior and their
application, NASA Langley Research Center, Technical Paper No. 2525. Hampton, VA: National
Aeronautics and Space Administration.
Hart, S. G. (1988). Helicopter human factors. In E. L. Wiener & D. C. Nagel (Eds.), Human factors in
aviation (pp. 591–638). San Diego, CA: Academic Press.
Hart, S. G., & Staveland, L. E. (1988). Development of NASA–TLX (Task Load Index): Results of empirical
and theoretical research. In P. A. Hancock & N. Meshkati (Eds.), Human mental workload (pp. 139–183).
Amsterdam: North Holland.
Hart, S. G., & Wickens, C. D. (1990). Workload assessment and prediction. In H. R. Booher (Ed.),
MANPRINT: An emerging technology. Advanced concepts for integrating people, machines and
organizations (pp. 257–300). New York: Van Nostrand Reinhold.
Hart, S. G., & Wickens, C. D. (2010). Cognitive Workload. NASA Human Systems Integration handbook,
Chapter 6.
Hasher, L., & Zacks, R. (1979). Automatic and effortful processes in memory. Journal of Experimental
Psychology: General, 108, 356–388.
Haskell, I. D., & Wickens, C. D. (1993). Two- and three-dimensional displays for aviation: A theoretical and
empirical comparison. The International Journal of Aviation Psychology, 3, 87–109.
Haskell, I. D., Wickens, C. D., & Sarno, K. (1990). Quantifying stimulus-response compatibility for the Army/NASA A3I display layout analysis tool. In Proceedings of the 5th Mid-Central Human Factors/Ergonomics Conference. Dayton, OH.
Hawkins, F. H. (1993). In H. W. Orlady (Ed.), Human factors in flight (2nd ed.). Brookfield, VT: Ashgate.
Hawkins, F., & Orlady, H. W. (1993). Human factors in flight (2nd Ed.). Brookfield, VT: Gower.
Hayes, C., & Miller, C. (Eds.) (2011). Human-computer etiquette: Understanding the impact of human
culture and expectations on the use and effectiveness of computers and technology. New York: Taylor &
Francis.
He, J., Becic, E., Lee, Y. C., & McCarley, J. (2011). Mind wandering behind the wheel: performance and
oculomotor correlates. Human Factors, 53, 13–21.
Healy, A. F. (1976). Detection errors on the word “the”. Journal of Experimental Psychology: Human
Perception and Performance, 2, 235–242.
Healy, A., & Bourne, L. (2012). Training cognition: Optimizing efficiency, durability, and generalizability.
New York: Psychology Press.
Heer, J., & Robertson, G. G. (2007). Animated transitions in statistical data graphics. IEEE Transactions on
Visualization and Computer Graphics, 13, 1,240–1,247.
Heer, J., Kong, N., & Agrawala, M. (2009). Sizing the horizon: The effects of chart size and layering on the
graphical perception of time series visualizations. In CHI 2009: Proceedings of the 27th International
Conference on Human Factors in Computing Systems (pp. 1,303–1,312). New York: Association for
Computing Machinery.
Hegarty, M., & Waller, D. (2005). Individual differences in spatial intelligence. In P. Shah & A. Miyake
(Eds.), The Cambridge handbook of visuospatial thinking. Cambridge, UK: Cambridge University Press.
Helleberg, J. R., & Wickens, C. D. (2003). Effects of data-link modality and display redundancy on pilot
performance: An attentional perspective. International Journal of Aviation Psychology, 13, 189–210.
Hellier, E., Edworthy, J., Weedon, B., Walters, K. & Adams, A. (2002). The perceived urgency of speech
warnings: Semantics versus acoustics. Human Factors, 44, 1–17.
Helmreich, R. L. (2000). On error management: Lessons from aviation. British Medical Journal, 320, 781–
785.
Helmreich, R. L., & Merritt, A. C. (1998). Culture at work in aviation and medicine. Brookfield, VT:
Ashgate.
Henderson, S. J., & Feiner, S. (2009). Evaluating the benefits of augmented reality for task localization in
maintenance of an armored personnel carrier turret. In IEEE Symposium on Mixed and Augmented Reality
Science and Technology Proceedings (pp. 135–144). Orlando FL: Institute of Electrical and Electronic
Engineers.
Hendy, K. C., Liao, J., & Milgram, P. (1997). Combining time and intensity effects in assessing operator
information-processing load. Human Factors, 39, 30–47.
Henrion, M., & Fischoff, B. (2002). Assessing uncertainty in physical constants. In T. Gilovich, D. Griffin,
& D. Kahneman (Eds), Heuristics and biases: The psychology of intuitive judgment. New York: Cambridge
University Press.
Henry, R. A., & Sniezek, J. A. (1993). Situational factors affecting judgments of future performance.
Organizational Behavior and Human Decision Processes, 54, 104–132.
Herbert, W. (2010). On second thought. New York: Random House.
Hermann, D., Brubaker, B., Yoder, C., Sheets, V., & Tio, A. (1999). Devices that remind. In F. Durso (Ed.)
Handbook of Applied Cognition (2nd Ed., pp. 377–408). New York: Wiley.
Herron, S. (1980). A case for early objective evaluation of candidate displays. In G. Corrick, M. Hazeltine, &
R. Durst (Eds.), Proceedings of the 24th Annual Meeting of the Human Factors Society. Santa Monica, CA:
Human Factors Society.
Hershon, R. L., & Hillix, W. A. (1965). Data processing in typing: Typing rate as a function of kind of
material and amount exposed. Human Factors, 7, 483–492.
Hertwig, R., & Erev, I. (2009). The description-experience gap in risky choice. Trends in Cognitive Sciences, 9, 1–7.
Hess, S. M., & Detweiler, M. C. (1994). Training interruptions. In Proceedings of the 38th Annual Meeting of the Human Factors and Ergonomics Society. Santa Monica, CA: Human Factors and Ergonomics Society.
Hess, S. M., & Detweiler, M. C. (1996). The value of display space at encoding and retrieval in keeping track.
In Proceedings of the Human Factors and Ergonomics Society—40th Annual Meeting (pp. 1,232–1,236).
Santa Monica, CA: Human Factors and Ergonomics Society.
Hess, S. M., Detweiler, M. C., & Ellis, R. D. (1999). The utility of display space in keeping-track of rapidly
changing information. Human Factors, 41, 257–281.
Hick, W. E. (1952). On the rate of gain of information. Quarterly Journal of Experimental Psychology, 4, 11–
26.
Hickox, J. C., & Wickens, C. D. (1993). Two- and three-dimensional displays for aviation: A theoretical and
empirical comparison. International Journal of Aviation Psychology, 3, 87–109.
Hickox, J. C. & Wickens, C. D. (1999). Effects of elevation angle disparity, complexity, and feature type on
relating out-of-cockpit field of view to an electronic cartographic map. Journal of Experimental
Psychology: Applied, 5, 284–301.
Hicks, J. L., Marsh, R. L., & Russell, E. J. (2000). The properties of retention intervals and their effect on
retaining prospective memories. Journal of Experimental Psychology: Learning, Memory, and Cognition,
26, 1,160–1,169.
Hilburn, B. (2004). Cognitive complexity in air traffic control: A literature review. (CHPR Technical Report).
The Hague, Netherlands: Center for Human Performance Research.
Hilburn, B., Jorna, P. G., Byrne, E. A., & Parasuraman, R. (1997). The effect of adaptive air traffic control
(ATC) decision aiding on controller mental workload. In M. Mouloua and J. Koonce (Eds.), Human-
automation interaction: Research and practice (pp. 84–91). Mahwah, NJ: Erlbaum.
Hill, S. G., Iavecchia, H. P., Byers, J. C., Bittner, A. C., Jr., Zaklad, A. L., & Christ, R. E. (1992).
Comparison of four subjective workload rating scales. Human Factors, 34, 429–440.
Hillyard, S. A., Vogel, E. K., & Luck, S. J. (1998). Sensory gain control (amplification) as a mechanism of
selective attention: electrophysiological and neuroimaging evidence. Philosophical Transactions of the
Royal Society of London-Series B: Biological Sciences, 353, 1,257–1,270.
Hirst, W. (1986). Aspects of divided and selected attention. In J. LeDoux & W. Hirst (Eds.), Mind and brain.
New York: Cambridge University Press.
Hirst, W., & Kalmar, D. (1987). Characterizing attentional resources. Journal of Experimental Psychology:
General, 116, 68–81.
Ho, C. Y., Nikolic, M. I., Waters, M., & Sarter, N. B. (2004). Not now! Supporting interruption management
by indicating the modality and urgency of pending tasks. Human Factors, 46, 399–410.
Ho, C., & Spence, C. (2008). The multisensory driver. Brookfield, VT: Ashgate.
Ho, G., Scialfa, C. T., Caird, J. K., & Graw, T. (2001). Visual search for traffic signs: The effects of clutter,
luminance and aging. Human Factors, 43, 194–207.
Hochberg, J., & Brooks, V. (1978). Film cutting and visual momentum. In J. W. Senders, D. F. Fisher, & R.
A. Monty (Eds.), Eye movements and the higher psychological functions. Hillsdale, NJ: Erlbaum.
Hockey, G. R. J. (1970). Effect of loud noise on attentional selectivity. Quarterly Journal of Experimental
Psychology, 22, 28–36.
Hockey, G. R. J. (1997). Compensatory control in the regulation of human performance under stress and high
workload: A cognitive-energetical framework. Biological Psychology, 45, 73–93.
Hockey, R. (1984). Varieties of attentional state: The effects of the environment. In R. Parasuraman & D. R.
Davies (Eds.), Varieties of attention (pp. 449–484). New York: Academic Press.
Hockey, G. R. J., Nickel, P., Roberts, A. C., & Roberts, M. H. (2009). Sensitivity of candidate markers of
psychophysiological strain to cyclical changes in manual control load during simulated process control.
Applied Ergonomics, 40(6), 1,011–1,018.
Hodgetts, H., Farmer, E., Joose, M., Parmentier, F., Schaefer, D., Hoogeboom, P., van Gool, M. & Jones,
D. (2005). The effects of party line communication on flight task performance. In D. de Waard, K. A.
Brookhuis, R. van Egmond, and T. Boersema (Eds.), Human factors in design, safety, and management
(pp. 1–12). Maastricht, Netherlands: Shaker.
Hoffman, R. R., Crandall, B., & Shadbolt, N. (1998). Use of the critical decision method to elicit expert
knowledge: A case study in the methodology of cognitive task analysis. Human Factors, 40, 254–276.
Hoffman, R. R., Shadbolt, N. R., Burton, A. M., & Klein, G. (1995). Eliciting knowledge from experts: A
methodological analysis. Organizational Behavior and Human Decision Processes, 62, 129–158.
Hoffmann, E. R. (1990). Strength of component principles determining direction-of-turn stereotypes for
horizontally moving displays. In Proceedings of the 34th Annual Meeting of the Human Factors Society
(pp. 457–461). Santa Monica, CA: Human Factors Society.
Hoffmann, E. R. (1997). Strength of component principles determining direction-of-turn stereotypes for linear displays with rotary controls. Ergonomics, 40, 199–222.
Hoffmann, E. R. (2009). Warrick’s principle, implied linkages and hand/control location effect. The Ergonomics Open Journal, 2, 170–177.
Hogarth, R. M. (1987). Judgment and choice (2nd ed.). Chichester: Wiley.
Hogarth, R. M., & Einhorn, H. J. (1992). Order effects in belief updating: The belief-adjustment model.
Cognitive Psychology, 24, 1–55.
Hogue, J. R., Allen, R. W., MacDonald, J., Schmucker, C., Markham, S., & Harmsen, A. (2001). Virtual
reality parachute simulation for training and mission rehearsal. In 16th AIAA Aerodynamic Decelerator
Systems Seminar and Conference, AIAA 2001–2061 (pp. 1–8). Reston, VA: American Institute of
Aeronautics and Astronautics.
Holding, D. H. (1976). An approximate transfer surface. Journal of Motor Behavior, 8, 1–9.
Holding, D. H. (1987). Training. In G. Salvendy (ed.), Handbook of human factors. New York: Wiley.
Hole, G. J. (1996). Decay and interference effects in visuo-spatial short-term memory. Perception, 25, 53–64.
Hollands, J. G. (2003). The classification of graphical elements. Canadian Journal of Experimental
Psychology, 57, 38–47.
Hollands, J. G., Carey, T. T., Matthews, M. L., & McCann, C. A. (1989). Presenting a graphical network: A
comparison of performance using fisheye and scrolling views. In G. Salvendy & H. Smith (Eds.),
Designing and using human-computer interfaces and knowledge-based systems (pp. 313–320). Amsterdam:
Elsevier.
Hollands, J. G., & Dyre, B. P. (2000). Bias in proportion judgments: The cyclical power model.
Psychological Review, 107, 500–524.
Hollands, J. G., & Lamb, M. (2011). Viewpoint tethering for remotely operated vehicles: Effects on complex
terrain navigation and spatial awareness. Human Factors, 53, 154–167.
Hollands, J. G., & Merikle, P. M. (1987). Menu organization and user expertise in information search tasks.
Human Factors, 29, 577–586.
Hollands, J. G., & Neyedli, H. F. (2011). A reliance model for automated combat identification systems:
Implications for trust in automation. In N. Stanton (Ed.), Trust in military teams (pp. 151–182). Farnham,
England: Ashgate.
Hollands, J. G., Parker, H. A., & Morton, A. (2002). Judgments of 3D bars in depth. In Proceedings of the
Human Factors and Ergonomics Society—46th Annual Meeting (pp 1565–1569). Santa Monica, CA:
Human Factors and Ergonomics Society.
Hollands, J. G., Pavlovic, N. J., Enomoto, Y., & Jiang, H. (2008). Smooth rotation of 2-D and 3-D
representations of terrain: An investigation into the utility of visual momentum. Human Factors, 50, 62–76.
Hollands, J. G., Pierce, B. J., & Magee, L. E. (1998). Displaying information in two and three dimensions.
International Journal of Cognitive Ergonomics, 2, 307–320.
Hollands, J. G., & Spence, I. (1992). Judgments of change and proportion in graphical perception. Human
Factors, 34, 313–334.
Hollands, J. G., & Spence, I. (1998). Judging proportion with graphs: The summation model. Applied
Cognitive Psychology, 12, 173–190.
Hollands, J. G., & Spence, I. (2001). The discrimination of graphical elements. Applied Cognitive
Psychology, 15, 413–431.
Holsanova, J. N., Holmberg, N., & Holmqvist, K. (2009). Reading information graphics: The role of spatial
contiguity and dual attentional guidance. Applied Cognitive Psychology, 23, 1,215–1,226.
Holscher, C. (2009). Adaptivity of wayfinding strategies in a multi-building ensemble: The effects of spatial
structure, task requirements and metric information. Journal of Environmental Psychology, 29, 208–219.
Hoosain, R., & Salili, F. (1988). Language differences, working memory, and mathematical ability. In M. M.
Grunberg, P. E. Morris, & R. N. Sykes (Eds.), Practical aspects of memory: Current research and issues
(Vol. 2, pp. 512–517). Academic Press: New York.
Hope, L., Lewinski, W., Dixon, J., Blocksidge, D., & Gabbert, F. (2012). Witnesses in action: The effect of physical exertion on recall and recognition. Psychological Science, 23, 386–390.
Hope, L., Memon, A., & McGeorge, P. (2004). Understanding pretrial publicity: Predecisional distortion of
evidence by mock jurors. Journal of Experimental Psychology: Applied, 10, 111–119.
Hope, L., & Wright, D. (2007). Beyond unusual? Examining the role of attention in the weapon focus effect.
Applied Cognitive Psychology, 21, 951–961.
Hopkin, V. S. (1980). The measurement of the air traffic controller. Human Factors, 22, 347–360.
Hörmann, H. J., Banbury, S., Dudfield, H., Lodge, M. and Soll, H. (2004). Evaluating the effects of
situation awareness training on flight crew performance. In S. Banbury and S. Tremblay (Eds.), A cognitive
approach to situation awareness: Theory and application. Aldershot, UK: Ashgate.
Horrey, W. J., & Wickens, C. D. (2003). Multiple resource modeling of task interference in vehicle control,
hazard awareness and in-vehicle task performance. In Proceedings of the Second International Driving
Symposium on Human Factors in Driver Assessment, Training, and Vehicle Design, Park City, UT.
Horrey, W. J., & Wickens, C. D. (2004). Driving and side task performance: The effects of display clutter,
separation, and modality. Human Factors, 46, 611–624.
Horrey, W. J., & Wickens, C. D. (2006). The impact of cell phone conversations on driving: A meta-analytic
approach. Human Factors, 48, 196–205.
Horrey, W. J., & Wickens, C. D. (2007). In-vehicle glance duration: Distributions, tails and a model of crash
risk. Transportation Research Record, 2018, 22–28.
Horrey, W. J., Lesch, M. F., & Garabet, A. (2009). Dissociation between driving performance and driver’s
subjective estimates of performance and workload in dual task conditions. Journal of Safety Research, 40,
7–12.
Horrey, W. J., Lesch, M. F., Kramer, A. F., & Melton, D. F. (2009). Examining the effects of a computer-
based training module on drivers’ willingness to engage in distracting activities while driving. Human
Factors, 51, 571–581.
Horrey, W. J., Wickens, C. D., & Consalus, K. P. (2006). Modeling drivers’ visual attention allocation while
interacting with in-vehicle technologies. Journal of Experimental Psychology: Applied, 12, 67–86.
Hosking, S. G., Young, K. L., & Regan, M. A. (2009). The effects of text messaging on young drivers.
Human Factors, 51, 582–592.
Howell, W. C., & Kreidler, D. L. (1963). Information processing under contradictory instructional sets.
Journal of Experimental Psychology, 65, 39–46.
Howell, W. C., & Kreidler, D. L. (1964). Instructional sets and subjective criterion levels in a complex
information processing task. Journal of Experimental Psychology, 68, 612–614.
Hu, Y., & Malthaner, R. A. (2007). The feasibility of three-dimensional displays of the thorax for preoperative
planning in the surgical treatment of lung cancer. European Journal of Cardiothoracic Surgery, 31, 506–
511.
Huang, K. C. (2008). Effects of computer icons and figure/background area ratios and color combinations on
visual search performance on an LCD monitor. Displays, 29(3), 237–242.
Hubbold, R. J., Hancock, D. J., & Moore, C. J. (1997). Autostereoscopic display for radiotherapy planning.
In: S. F. Scott, J. O. Merritt, & M. T. Bolas (Eds.), Stereoscopic display and virtual reality system IV. SPIE
Proceedings; 3012: 16–27.
Huestegge, L., & Philipp, A. M. (2011). Effects of spatial compatibility on integration processes in graph
comprehension. Attention, Perception, & Psychophysics, 73, 1,903–1,915.
Huey, M. B., & Wickens, C. D. (Eds.). (1993). Workload transition: Implications for individual and team
performance. Washington, DC: National Academy Press.
Huggins, A. (1964). Distortion of temporal patterns of speech: Interruptions and alterations. Journal of the
Acoustical Society of America, 36, 1,055–1,065.
Hughes, T. & MacRae, A. W. (1994). Holistic peripheral processing of a polygon display. Human Factors,
36, 645–651.
Humes, L. E., Lee, J. H., and Coughlin, M. P. (2006). Auditory measures of selective and divided attention
in young and older adults using single-talker competition. Journal of the Acoustical Society of America,
120, 2,926–2,937.
Hunn, B. P. (2006). Video imagery’s role in network centric, multiple unmanned aerial vehicle (UAV)
operations. In N. J. Cooke, H. L. Pringle, H. K. Pedersen, O. Connor, & E. Salas (Eds.), Human factors of
remotely operated vehicles (pp. 179–191). Amsterdam: Elsevier.
Hunt, E., & Lansman, M. (1981). Individual differences in attention. In R. J. Sternberg (Ed.), Advances in
the psychology of human intelligence. Vol 1. Hillsdale, NJ: Erlbaum.
Hunt, E., Pellegrino, J. W., & Yee, P. L. (1989). Individual differences in attention. In G. H. Bower (Ed.),
The Psychology of Learning and Motivation, Vol. 24 (pp. 285–310). San Diego: Academic Press.
Hunt, R., & Rouse, W. (1981). Problem-solving skills of maintenance trainees in diagnosing faults in
simulated power plants. Human Factors, 23, 317–328.
Hurts, K., Angell, L., & Perez, M. A. (2011). Attention, distraction, and driver safety. In P. DeLucia (Ed.),
Reviews of Human Factors & Ergonomics. Vol 7. Santa Monica, CA: Human Factors and Ergonomics
Society.
Hyman, I. E., Boss, S. M., Wise, B. M., McKenzie, K. E., & Caggiano, J. M. (2010). Did you see the
unicycling clown? Inattentional blindness while walking and talking on a cell phone. Applied Cognitive
Psychology, 24, 597–607.
Hyman, R. (1953). Stimulus information as a determinant of reaction time. Journal of Experimental
Psychology, 45, 423–432.
Iani, C., & Wickens, C. D. (2007). Factors affecting task management in aviation. Human Factors, 49, 16–24.
Ichikawa, M., & Saida, S. (1996). How is motion disparity integrated with binocular disparity in depth
perception? Perception and Psychophysics, 58, 271–282.
Inagaki, T. (1999). Situation-adaptive autonomy: Trading control of authority in human-machine systems. In
M. W. Scerbo & M. Mouloua (Eds.), Automation technology and human performance: Current research
and trends (pp. 154–159). Mahwah, NJ: Erlbaum.
Inagaki, T. (2003). Adaptive automation: Sharing and trading of control. In E. Hollnagel (Ed.), Handbook of
cognitive task design (pp. 46–89). Mahwah, NJ: Erlbaum.
Inagaki, T. (2008). Smart collaboration between humans and machines based on mutual understanding.
Annual Reviews in Control, 32, 253–261.
Inbar, O., Tractinsky, N., & Meyer, J. (2007). Minimalism in information visualization—attitudes towards
maximizing the data-ink ratio. In Proceedings of the European Conference on Cognitive Ergonomics (pp.
185–188). New York: Association for Computing Machinery.
Ince, F., Williges, R. C., & Roscoe, S. N. (1975). Aircraft simulator motion and the order of merit of flight
attitude and steering guidance displays. Human Factors, 17, 388–400.
Inoue, T., Kawai, T., & Noro, K. (1996). Performance of 3-D digitizing in stereoscopic images. Ergonomics,
39, 1,357–1,363.
Inselberg, A. (1999). Multidimensional detective. In S. K. Card, J. D. Mackinlay, & B. Shneiderman, (Eds.),
Readings in information visualization (pp. 107–114). San Francisco: Morgan Kaufmann.
Isikoff, M., & Corn, D. (2006). Hubris. New York: Random House.
Isherwood, S. (2009). Graphics and semantics: The relationship between what is seen and what is meant in
icon design. In D. Harris (Ed.), Engineering Psychology and Cognitive Ergonomics, Berlin: Springer.
Isherwood, S. J., McDougall, S. J. P., & Curry, M. B. (2007). Icon identification in context: The changing
role of icon characteristics with user experience. Human Factors, 49, 465–476.
Isreal, J. B., Chesney, G. L., Wickens, C. D., & Donchin, E. (1980). P300 and tracking difficulty: Evidence
for a multiple capacity view of attention. Psychophysiology, 17, 259–273.
Isreal, J. B., Wickens, C. D., Chesney, G. L., & Donchin, E. (1980). The event-related brain potential as a
selective index of display monitoring workload. Human Factors, 22, 211–224.
Itti, L., & Koch, C. (2000). A saliency-based search mechanism for overt and covert shifts of visual attention.
Vision Research, 40, 1,489–1,506.
Jack, D., Boian, R., Merians, A. S. et al. (2001). Virtual reality-enhanced stroke rehabilitation. IEEE
Transactions on Neural Systems and Rehabilitation Engineering, 9, 308–318.
Jacob, R. J. K., Sibert, L. E., McFarlane, D. C., & Mullen, M. P. (1994). Integrality and separability of input
devices. ACM Transactions on Computer-Human Interaction, 1, 3–26.
Jagacinski, R. J., & Flach, J. M. (2003). Control theory for humans. Mahwah, NJ: Erlbaum.
Jakobsen, M. R., & Hornbaek, K. (2006). Evaluating a fish-eye view of source code. In Proceedings of the
SIGCHI conference on human factors in computing systems (CHI 2006) (pp. 377–386). New York:
Association for Computing Machinery.
James, W. (1890). Principles of psychology. New York: Holt. (Reprinted in 1983 by Harvard University
Press). Available online at http://psychclassics.yorku.ca/James/Principles/.
Jang, J., Schunn, C. D., & Nokes, T. J. (2011). Spatially distributed instructions improve learning outcomes
and efficiency. Journal of Educational Psychology, 103, 60–72.
Janiszewski, C., Lichtenstein, D., & Belyavsky, J. (2008). Judgments about judgments: The dissociation of
consideration price and willingness to purchase judgments. Journal of Experimental Psychology: Applied,
14, 151–164.
Jarmasz, J., Herdman, C. M., & Johannsdottir, K. R. (2005). Object-based attention and cognitive tunneling.
Journal of Experimental Psychology: Applied, 11, 3–12.
Jarvik, J. G., Hollingworth, W., Martin, B., et al. (2003). Rapid magnetic resonance imaging versus
radiographs for patients with low back pain. Journal of the American Medical Association. 289, 2,810–
2,818.
Jay, C., Glencross, M., & Hubbold, R. (2007). Modeling the effects of delayed haptic and visual feedback in
a collaborative virtual environment. ACM Transactions on Computer-Human Interaction, 14 (2), Article 8.
Jenkins, D., Stanton, N., Salmon, P., & Walker, G. (2009). Cognitive work analysis: Coping with complexity.
Burlington, VT: Ashgate.
Jenkins, H. M., & Ward, W. C. (1965). Judgment of contingency between responses and outcomes.
Psychological Monographs: General and Applied, 79 (whole no. 594).
Jennings, A. E., & Chiles, W. D. (1977). An investigation of time-sharing ability as a factor in complex task
performance. Human Factors, 19, 535–547.
Jensen, R. S. (1982). Pilot judgment: Training and evaluation. Human Factors, 24, 61–74.
Jeon, M., & Walker, B. N. (2009). “Spindex”: Accelerated initial speech sounds improve navigation
performance in auditory menus. In Proceedings of the 53rd Annual Meeting of the Human Factors and
Ergonomics Society (pp. 1,081–1,085). Santa Monica, CA: Human Factors and Ergonomics Society.
Jeon, S., & Choi, S. (2009). Haptic augmented reality: Taxonomy and an example of stiffness modulation.
Presence, 18, 387–408.
Jersild, A. T. (1927). Mental set and shift. Archives of Psychology, Whole No. 89.
Jessa, M., & Burns, C. M. (2007). Visual sensitivities of dynamic graphical displays. International Journal of
Human-Computer Studies, 65, 206–222.
Jex, H. R., & Clement, W. F. (1979). Defining and measuring perceptual-motor workload in manual control
tasks. In N. Moray (Ed.), Mental workload: Its theory and measurement. New York: Plenum.
Jian, J. Y., Bisantz, A., & Drury, C. (2000). Foundations for an empirically determined scale of trust in
automated systems. International Journal of Cognitive Ergonomics, 4, 53–71.
Johannsdottir, K. R., & Herdman, C. M. (2010). The role of working memory in supporting drivers’ situation
awareness for surrounding traffic. Human Factors, 52, 663–673.
Johnson, A., & Proctor, R. (2004). Attention: Theory and practice. Thousand Oaks, CA: Sage.
Johnson, E. J., & Payne, J. W. (1985). Effort and accuracy in choice. Management Science, 31, 395–414.
Johnson, E. J., Payne, J. W., & Bettman, J. R. (1988). Information displays and preference reversals.
Organizational Behavior and Human Decision Processes, 42, 1–21.
Johnson, E. J., Payne, J. W., & Bettman, J. R. (1993). Adapting to time constraints. In O. Svenson & A. J.
Maule (Eds.), Time pressure and stress in human judgment and decision making (pp. 103–116). New York:
Plenum.
Johnson, E. R., Cavanaugh, R., Spooner, R., & Samet, M. (1973). Utilization of reliability measurements in Bayesian inference. IEEE Transactions on Reliability, 176–182.
Johnson, R., Jr. (1986). A triarchic model of P300 amplitude. Psychophysiology, 23, 367–384.
Johnson, S. J., Guediri, S. M., Kilkenny, C., & Clough, P. J. (2011). Development and validation of a virtual
reality simulator: Human factors input to interventional radiology training. Human Factors, 53, 612–625.
Johnson, S. L., & Roscoe, S. N. (1972). What moves, the airplane or the world? Human Factors, 14, 107–
129.
Johnston, J. H., & Cannon-Bowers, J. A. (1996). Training for stress exposure. In J. E. Driskell & E. Salas
(Eds.), Stress and human performance (pp. 223–256). Mahwah, NJ: Erlbaum.
Joint Commission (2002). Sentinel event alert: Preventing ventilator-related deaths and injuries (Issue 25, February 26, 2002). The Joint Commission and the American Association for Respiratory Care.
Jolicoeur, P., & Ingleton, M. (1991). Size invariance in curve tracing. Memory & Cognition, 19, 21–36.
Jones, D. M. (1993). Objects, streams, and threads of auditory attention. In A. D. Baddeley and L. Weiskrantz
(Eds.), Attention: Selection, awareness and control. Oxford, UK: Clarendon Press.
Jones, D. M. (1999). The cognitive psychology of auditory distraction: The 1997 BPS Broadbent Lecture.
British Journal of Psychology, 90, 167–187.
Jones, D. M., & Macken, W. J. (1993). Irrelevant tones produce an irrelevant sound effect: Implications for
phonological coding in working memory. Journal of Experimental Psychology: Learning, Memory and
Cognition, 19, 369–381.
Jones, D. M., Alford, D., Bridges, A., Tremblay, S., and Macken, B. (1999). Organizational factors in
selective attention: The interplay of acoustic distinctiveness and auditory streaming in the irrelevant sound
effect. Journal of Experimental Psychology: Learning, Memory, and Cognition, 25, 464–473.
Jones, D. M., Hughes, R. W., & Macken, W. J. (2010). Auditory distraction and serial memory: The
avoidable and the ineluctable. Noise Health, 12, 201–209.
Jonides, J., & Nee, D. E. (2006). Brain mechanisms of proactive interference in working memory.
Neuroscience, 139, 181–193.
Jorna, P. (1997). In D. Harris (Ed.), Engineering psychology and cognitive ergonomics: Vol 1. Brookfield,
VT: Ashgate.
Joslyn, S., Savelli, S., & Limor, N. G. (2011). Reducing probabilistic weather forecasts to the worst-case
scenario: Anchoring effects. Journal of Experimental Psychology: Applied, 17, 342–353.
Juan, M. C., & Perez, D. (2009). Comparison of the levels of presence and anxiety in an acrophobic
environment viewed via HMD or CAVE. Presence, 18, 232–248.
Jung, T. P., Makeig, S., Humphreys, C., Lee, T., McKeown, M. J., Iragui, V., & Sejnowski, T. (2000).
Removing electroencephalographic artifacts by blind source separation. Psychophysiology, 37, 163–178.
Jungk, A., Thull, B., Hoeft, A., & Rau, G. (2001). Evaluation of two new ecological interface approaches for
the anesthesia workplace. Journal of Clinical Monitoring and Computing, 16, 243–258.
Just, M. A., & Carpenter, P. A. (1971). Comprehension of negation with quantification. Journal of Verbal
Learning and Verbal Behavior, 10, 244–253.
Just, M. A., Carpenter, P. A., & Miyake, A. (2003). Neuroindices of cognitive workload: Neuroimaging,
pupillometric, and event-related potential studies of brain work. Theoretical Issues in Ergonomics Science,
4, 56–88.
Kaarlela-Tuomaala, A., Helenius, R., Keskinen, E., and Hongisto, V. (2009). Effects of acoustic
environment on work in private office rooms and open-plan offices - longitudinal study during relocation.
Ergonomics, 52, 1,423–1,444.
Kaber, D. B., Alexander, A. L., Stelzer, E. M., Kim, S. H., Kaufmann, K., and Hsiang, S. (2008). Perceived
clutter in advanced cockpit displays. Aviation Space and Environmental Medicine, 79, 1–12.
Kaber, D. B., & Endsley, M. (2004). The effects of level of automation and adaptive automation on human
performance, situation awareness and workload in a dynamic control task. Theoretical Issues in
Ergonomics Science, 5, 113–153.
Kaber, D. B., & Kim, S. H. (2011). Understanding cognitive strategy with adaptive automation in dual-task
performance using computational cognitive models. Journal of Cognitive Engineering and Decision
Making, 5, 309–331.
Kaber, D. B., Onal, E., & Endsley, M. R. (1999). Level of automation effects on telerobot performance and
human operator situation awareness and subjective workload. In M. W. Scerbo & M. Mouloua (Eds.),
Automation technology and human performance: Current research and trends (pp. 165–170). Mahwah, NJ:
Erlbaum.
Kaber, D. B., Onal, E., & Endsley, M. R. (2000). Design of automation for telerobots and the effect on
performance, operator situation awareness, and subjective workload. Human Factors and Ergonomics in
Manufacturing, 10, 409–430.
Kaber, D. B., & Riley, J. M. (1999). Adaptive automation of a dynamic control task based on secondary task
work-load measurement. International Journal of Cognitive Ergonomics, 3, 169–187.
Kaber, D. B., Wright, M. C., Prinzel, L. J., & Clamann, M. P. (2005). Adaptive automation of human-
machine system information-processing functions. Human Factors, 47, 730–741.
Kahneman, D. (1973). Attention and effort. Englewood Cliffs, NJ: Prentice Hall.
Kahneman, D. (1991). Judgment and decision making: A personal view. Psychological Science, 2(3), 142–
145.
Kahneman, D. (2003). A perspective on judgment and choice: mapping bounded rationality (Nobel Prize
lecture). American Psychologist, 58, 697–720.
Kahneman, D., Beatty, J., & Pollack, I. (1967). Perceptual deficits during a mental task. Science, 157, 218–
219.
Kahneman, D., Ben-Ishai, R., & Lotan, M. (1973). Relation of a test of attention to road accidents. Journal
of Applied Psychology, 58, 113–115.
Kahneman, D., & Frederick, S. (2002). Representativeness revisited: Attribute substitution in intuitive
judgment. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases: The psychology of
intuitive judgment (pp. 49–81). New York: Cambridge University Press.
Kahneman, D., & Klein, G. A. (2009). Conditions for intuitive expertise. A failure to disagree. American
Psychologist. 64, 515–524.
Kahneman, D., Slovic, P., & Tversky, A. (Eds.). (1982). Judgment under uncertainty: Heuristics and biases.
New York: Cambridge University Press.
Kahneman, D., & Treisman, A. (1984). Changing views of attention and automaticity. In R. Parasuraman and
D. A. Davies (Eds.), Varieties of attention (pp. 29–61). New York: Academic Press.
Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80, 251–273.
Kahneman, D., & Tversky, A. (1984). Choices, values, and frames. American Psychologist, 39, 341–350.
Kalkofen, D., Mendez, E., & Schmalstieg, D. (2009).
Kalyuga, S. (2011). Cognitive load theory: How many types of load does it really need? Educational Psychology Review, 23, 1–19.
Kalyuga, S., Chandler P., & Sweller, J. (2001). Learner experience and efficiency of instructional guidance.
Educational Psychology, 21, 5–23.
Kalyuga, S., & Renkl, A. (2010). Expertise reversal effect and its instructional implications: Introduction to
the special issue. Instructional Science, 38, 209–215.
Kalyuga, S., Chandler, P., Tuovinen, J., & Sweller, J. (2001). When problem solving is superior to studying
worked examples. Journal of Educational Psychology, 93, 579–588.
Kanarick, A. F., Huntington, A., & Peterson, R. C. (1969). Multisource information acquisition with optimal
stopping. Human Factors, 11, 379–386.
Kane, M. J., Bleckley, M. K., Conway, A. R. A., & Engle, R. W. (2001). A controlled-attention view of
working-memory capacity. Journal of Experimental Psychology: General, 130, 169–183.
Kane, M. J., & Engle, R. W. (2000). Working memory capacity, proactive interference and divided attention:
Limits on long-term memory retrieval. Journal of Experimental Psychology: Learning, Memory and
Cognition, 26, 336–358.
Kane, M. J., & Engle, R. W. (2002). The role of prefrontal cortex in working memory capacity, executive
attention, and general fluid intelligence: An individual differences perspective. Psychonomic Bulletin and
Review, 9, 637–671.
Kantowitz, B. H. (1974). Double stimulation. In B. H. Kantowitz (Ed.), Human information processing.
Hillsdale, NJ: Erlbaum.
Kantowitz, B. H., & Knight, J. L. (1976). Testing tapping timesharing. I. Auditory secondary task. Acta
Psychologica, 40, 343–362.
Kaplan, S., & Berman, M. G. (2010). Directed attention as a common resource for executive functioning and
self regulation. Perspectives on Psychological Science, 5, 43–57.
Kapralos, B., Jenkin, M. R., & Milios, E. (2008). Virtual audio systems. Presence, 17, 527–549.
Karlin, L., & Kestenbaum, R. (1968). Effects of number of alternatives on the psychological refractory period. Quarterly Journal of Experimental Psychology, 20, 160–178.
Karlsen, P. J., Allen, R. J., Baddeley, A. D., & Hitch, G. J. (2010). Binding across space and time in visual
working memory. Memory & Cognition, 38, 292–303.
Karpicke, J., & Roediger, H. (2008). The critical importance of retrieval for learning. Science, 319, 966–968.
Karsh, B. T. (2010). Clinical practice improvement and redesign: How change in workflow can be supported
by clinical decision support. (AHRQ Publication No. 09-0054-EF). Rockville, MD: Agency for Healthcare
Research and Quality.
Karsh, R., Walrath, J. D., Swoboda, J. C., & Pillalamarri, K. (1995). Effect of battlefield combat
identification system information on target identification time and errors in a simulated tank engagement
task (Technical report ARL–TR–854). Aberdeen Proving Ground, MD, United States: Army Research
Laboratory.
Karwowski, W., & Mital, A. (Eds.) (1986). Applications of fuzzy set theory in human factors. New York:
Elsevier.
Kaufmann, R., & Glavin, S. J. (1990). General guidelines for the use of colour on electronic charts.
International Hydrographic Review, 67, 87–99.
Keele, S. W. (1969). Repetition effect: A memory dependent process. Journal of Experimental Psychology,
80, 243–248.
Keele, S. W. (1972). Attention demands of memory retrieval. Journal of Experimental Psychology, 93, 245–
248.
Kees, J., Burton, S., Andrews, J. C., & Kozup, J. (2006). Tests of graphic visuals and cigarette package
warning combinations: implications for the framework convention on tobacco control. Journal of Public
Policy & Marketing, 25(2), 212–223.
Keillor, J., Trinh, K., Hollands, J. G., & Perlin, M. (2007). Effects of transitioning between perspective–
rendered views. In Proceedings of the Human Factors and Ergonomics Society–51st Annual Meeting (pp.
1,322–1,326). Santa Monica, CA: Human Factors and Ergonomics Society.
Keinan, G., & Friedland, N. (1984). Dilemmas concerning the training of individuals for task performance
under stress. Journal of Human Stress, 10, 185–190.
Keinan, G., & Friedland, N. (1987). Decision making under stress: Scanning of alternatives under physical
threat. Acta Psychologica, 64, 219–228.
Keinan, G., & Friedland, N. (1996). Training effective performance under stress: Queries, dilemmas, and
possible solutions. In J. E. Driskell & E. Salas (Eds.), Stress and human performance (pp. 257–278).
Mahwah, NJ: Erlbaum.
Keith, N., & Frese, M. (2008). Effectiveness of error management training: A meta-analysis. Journal of
Applied Psychology, 93(1), 59–69.
Kelley, C. M., & McLaughlin, A. C. (2008). How individual differences and task load may affect feedback
use when learning a new task. In Proceedings of the Human Factors and Ergonomics Society Annual
Meeting (pp. 1,825–1,829). Santa Monica, CA: Human Factors and Ergonomics Society.
Kelly, M. L. (1955). A study of industrial inspection by the method of paired comparisons. Psychological
Monographs, 69, (394), 1–16.
Kemler-Nelson, D. G. (1993). Processing integral dimensions: The whole view. Journal of Experimental
Psychology: Human Perception and Performance, 19, 1,105–1,113.
Keeney, R. L. (1973). A decision analysis with multiple objectives: The Mexico City airport. Bell Journal of Economics and Management Science, 4, 101–117.
Keppel, G., & Underwood, B. J. (1962). Proactive inhibition in short-term retention of single items. Journal
of Verbal Learning and Verbal Behavior, 1, 153–161.
Kesting, I., Miller, B., & Lockhart, C. (1988). Auditory alarms during anesthesia monitoring. Anesthesiology,
69, 106–107.
Kidd, D., & Monk, C. (2009). Are unskilled drivers aware of their deficiencies? In Proceedings of the Human
Factors and Ergonomics Society—53rd Meeting (pp. 1,781–1,786). Santa Monica, CA: Human Factors and
Ergonomics Society.
Kim, J., Palmisano, S. A., Ash, A., & Allison, R. S. (2010). Pilot gaze and glideslope control. ACM
Transactions on Applied Perception, 7(3), 18:1–18:18.
Kim, W. S., Ellis, S. R., Tyler, M., Hannaford, B., & Stark, L. (1987). A quantitative evaluation of
perspective and stereoscopic displays in three-axis manual tracking tasks. IEEE Transactions on Systems,
Man, and Cybernetics, 17, 61–71.
Kingstone, A., Smilek, D., & Eastwood, J. D. (2006). Cognitive ethology: A new approach for studying
human cognition. British Journal of Psychology, 99, 317–340.
Kintsch, W., & Van Dijk, T. A. (1978). Toward a model of text comprehension and reproduction.
Psychological Review, 85, 363–394.
Kirby, P. H. (1976). Sequential effects in two choice reaction time: Automatic facilitation or subjective
expectation. Journal of Experimental Psychology: Human Perception and Performance, 2, 567–577.
Kirk, D., Sellen, A., & Cao, X. (2010). Home video communication: Mediating closeness. In Proceedings of
Computer Supported Cooperative Work 2010 (pp. 135–144). New York: Association for Computing
Machinery.
Kirkpatrick, M., & Mallory, K. (1981). Substitution error potential in nuclear power plant control rooms. In
R. C. Sugarman (Ed.), Proceedings of the 25th Annual Meeting of the Human Factors Society (pp. 163–
167). Santa Monica, CA: Human Factors Society.
Kirschenbaum, S. S., & Arruda, J. E. (1994). Effects of graphic and verbal probability information on
command decision making. Human Factors, 36, 406–418.
Kirsh, D. (1995). The intelligent use of space. Artificial Intelligence, 73, 31–68.
Kirwan, B., & Ainsworth, L. (1992). A guide to task analysis. London: Taylor & Francis.
Klapp, S. T. (1979). Doing two things at once: The role of temporal compatibility. Memory & Cognition, 7,
375–381.
Klapp, S. T., & Irwin, C. I. (1976). Relation between programming time and duration of response being
programmed. Journal of Experimental Psychology: Human Perception and Performance, 2, 591–598.
Klatzky, R. L., Marston, J. R., Giudice, N. A., Golledge, R. G., & Loomis, J. M. (2006). Cognitive load of
navigating without vision when guided by virtual sound versus spatial language. Journal of Experimental
Psychology: Applied, 12, 223–232.
Klayman, J., & Ha, Y. W. (1987). Confirmation, disconfirmation, and information in hypothesis testing.
Psychological Review, 94, 211–228.
Klein, G. (1989). Recognition primed decision making. Advances in Man-Machine Systems Research, 5, 47–
92.
Klein, G. (1996). The effects of acute stressors on decision making. In J. E. Driskell & E. Salas (Eds.), Stress
and human performance (pp. 49–88). Mahwah, NJ: Erlbaum.
Klein, G. (1997). The recognition-primed decision (RPD) model: Looking back, looking forward. In C. E.
Zsambok & G. Klein (Eds.), Naturalistic decision making (pp. 285–292). Mahwah, NJ: Erlbaum.
Klein, G., Calderwood, R., & Clinton-Cirocco, A. (1996). Rapid decision making on the fire ground. In
Proceedings of the 30th Annual Meeting of the Human Factors and Ergonomics Society (pp. 576–580).
Santa Monica, CA: Human Factors and Ergonomics Society.
Klein, G., & Crandall, B. W. (1995). The role of mental simulation in problem solving and decision making.
In P. A. Hancock, J. Flach, J. Caird, & K. Vicente (Eds.), Local applications of the ecological approach to
human-machine systems (Vol. 2, pp. 324–358). Hillsdale, NJ: Erlbaum.
Klein, G., Moon, B., & Hoffman, R. (2006). Making sense of sensemaking. IEEE Intellligent Systems, 21,
88–92.
Kleinmuntz, B. (1990). Why we still use our heads instead of formulas: Toward an integrative approach.
Psychological Bulletin, 107, 296–310.
Klemmer, E. T. (1957). Simple reaction time as a function of time uncertainty. Journal of Experimental
Psychology, 54, 195–200.
Klemmer, E. T. (1969). Grouping of printed digits for manual entry. Human Factors, 11, 397–400.
Kliegel, M., Martin, M., McDaniel, M. A., & Einstein, G. O. (2004). Importance effects on performance in
event-based prospective memory tasks. Memory, 12(5), 553–561.
Knight, J. B., Meeks, J. T., Marsh, R. L., Cook, G. I., Brewer, G. A., & Hicks, J. L. (2011). An observation
on the spontaneous noticing of prospective memory event-based cues. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 37, 298–307.
Knill, D. C. (2007). Robust cue integration: A Bayesian model and evidence from cue-conflict studies with
stereoscopic and figure cues to slant. Journal of Vision, 7(7):5, 1–24.
Koehler, D., Brenner, L., & Griffin, D. (2002). The calibration of expert judgment: Heuristics and biases
beyond the laboratory. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and biases: The
psychology of intuitive judgment. New York: Cambridge University Press.
Kohn, L., Corrigan, J., & Donaldson, M. (1999). To err is human: Building a safer health system.
Washington, DC: National Academy Press.
Koh, R., Park, T., Wickens, C., Teng, O., & Chia, N. (2011). Differences in attentional strategies by novice
and experienced operating theatre scrub nurses. Journal of Experimental Psychology: Applied, 17, 233–
246.
Kalyuga, S., Chandler, P., & Sweller, J. (1998). Levels of expertise and instructional design. Human Factors, 40, 1–17.
Kooi, F. (2011). A display with two depth layers: Attentional segregation and declutter. In C. Roda (Ed.),
Human attention in digital environments (pp. 245–258). Cambridge, England: Cambridge University Press.
Kopala, C. J. (1979). The use of color-coded symbols in a highly dense situation display. In Proceedings of
Human Factors Society—23rd Annual Meeting (pp. 397–401). Santa Monica, CA: Human Factors Society.
Kopardekar, P., Schwartz, A., Magyarits, S., & Rhodes, J. (2009). Airspace complexity measurement: An air
traffic control simulation analysis. International Journal of Industrial Engineering, 16, 61–70.
Koriat, A., Lichtenstein, S., & Fischoff, B. (1980). Reasons for confidence. Journal of Experimental
Psychology: Human Learning and Memory, 6, 107–118.
Kornblum, S. (1973). Sequential effects in choice reaction time. A tutorial review. In S. Kornblum (Ed.),
Attention and performance IV. New York: Academic Press.
Kornblum, S., Hasbroucq, T., & Osman, A. (1990). Dimensional overlap: Cognitive basis for stimulus-
response compatibility—A model and taxonomy. Psychological Review, 97, 253–270.
Kornbrot, D. E. (2006). Signal detection theory, the approach of choice: Model-based and distribution-free
measures and evaluation. Perception & Psychophysics, 68, 393–414.
Kosko, B. (1993). Fuzzy thinking: The new science of fuzzy logic. New York: Hyperion.
Kraft, C. (1978). A psychophysical approach to air safety. Simulator studies of visual illusions in night
approaches. In H. L. Pick, H. W. Leibowitz, J. E. Singer, A. Steinschneider, & H. W. Stevenson (Eds.),
Psychology: From research to practice. New York: Plenum.
Kraiger, K., & Jerden, E. (2007). A new look at learner control: Meta-analytic results and directions for future research. In S. M. Fiore & E. Salas (Eds.), Where is the learning in distance learning? Towards a science of
distributed learning and training. Washington, DC: American Psychological Association.
Kraiger, K., Salas, E., & Cannon-Bowers, J. A. (1995). Measuring knowledge organization as a method of
assessing learning during training. Human Factors, 37, 804–816.
Kraiss, K. F., & Knäeuper, A. (1982). Using visual lobe area measurements to predict visual search
performance. Human Factors, 24, 673–682.
Kramer, A. F., & Parasuraman, R. (2007). Neuroergonomics—application of neuroscience to human factors. In
J. Caccioppo, L. Tassinary, & G. Berntson (Eds.), Handbook of psychophysiology (2nd Ed.). New York:
Cambridge University Press.
Kramer, A. F., Larish, J. F., & Strayer, D. L. (1995). Training for attentional control in dual task settings: A
comparison of young and old adults. Journal of Experimental Psychology: Applied, 1, 50–76.
Krueger, F., Parasuraman, R., Iyengar, V., Thornburg, M., Weel, J., Lin, M., Clarke, E., McCabe, K., &
Lipsky, R. (2012). Oxytocin receptor genetic variation promotes trust behavior. Frontiers in Human
Neuroscience, 6, doi: 10.3389/fnhum.2012.00004.
Krijn, M., Emmelkamp, P. M. G., Olafsson, R. P., & Biemond, R. (2004). Virtual reality exposure therapy of
anxiety disorders: A review. Clinical Psychology Review, 24, 259–281.
Kroft, P., & Wickens, C. D. (2003). Displaying multi-domain graphical database information: An evaluation
of scanning, clutter, display size, and user interactivity. Information Design Journal, 11(1), 44–52.
Kryter, K. D. (1972). Speech communications. In H. P. Van Cott & R. G. Kinkade (Eds.), Human engineering
guide to system design. Washington, DC: U.S. Government Printing Office.
Kuhl, S. A., Thompson, W. B., & Creem-Regehr, S. H. (2009). HMD calibration and its effects on distance
judgments. ACM Transactions on Applied Perception, 35, 9, 1–24.
Kühl, T., Scheiter, K., Gerjets, P., & Edelmann, J. (2011). The influence of text modality on learning with
static and dynamic visualizations. Computers in Human Behavior, 27, 29–35.
Kujala, T., & Saariluoma, P. (2011). Effects of menu structure and touch screen scrolling style on the
variability of glance duration during in-vehicle visual search tasks. Ergonomics, 53, 716–732.
Kumagai, J. K., & Massel, L. J. (2005). Alternative visual displays in support of wayfinding. DRDC Toronto
Contractor Report CR-2005-016. Toronto: Defence Research and Development Canada.
Kumar, N., & Benbasat, I. (2004). The effect of relationship encoding, task type, and complexity on
information representation: An empirical evaluation of 2D and 3D line graphs. MIS Quarterly, 28, 255–
281.
Kundel, H. L., & LaFollette, P. S. (1972). Visual search patterns and experience with radiological images.
Radiology, 103, 523–528.
Kundel, H. L., & Nodine, C. F. (1978). Studies of eye movements and visual search in radiology. In J. W.
Senders, D. F. Fisher, & R. A. Monty (Eds.), Eye movements and the higher psychological functions (pp.
317–328). Hillsdale, NJ: Erlbaum.
Kutas, M., McCarthy, G., & Donchin, E. (1977). Augmenting mental chronometry: The P300 as a measure
of stimulus evaluation time. Science, 197, 792–795.
Kveraga, K., Ghuman, A. S., & Bar, M. (2007). Top-down predictions in the cognitive brain. Brain and
Cognition, 65, 145–168.
Kwantes, P. J. (2005). Using context to build semantics. Psychonomic Bulletin & Review, 12, 703–710.
LaBerge, D. (1973). Attention and the measurement of perceptual learning. Memory & Cognition, 1, 268–
276.
Lalomia, M. J., Coovert, M. D., & Salas, E. (1992). Problem-solving performance as a function of problem
type, number progression, and memory load. Behaviour & Information Technology, 11, 268–280.
Lam, T. M., Mulder, M., & van Paassen, M. M. (2007). Haptic Interface for UAV Collision Avoidance.
International Journal of Aviation Psychology, 17, 167–195.
Laming, D. (2001). Statistical information, uncertainty, and Bayes’ theorem: Some applications in
experimental psychology. In Proceedings of ECSQARU 2001, LNAI 2143 (pp. 635–646). Berlin: Springer-
Verlag.
Laming, D. (2010). Statistical information and uncertainty: A critique of applications in experimental
psychology. Entropy, 12, 720–771.
Landauer, T. K. (1995). The trouble with computers. Cambridge, MA: MIT Press.
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato’s problem: The latent semantic analysis theory
of acquisition, induction, and representation of knowledge. Psychological Review, 104, 211–240.
Langewiesche, W. (1998). The lessons of ValuJet 592. The Atlantic Monthly, March, 81–98.
Lanthier, S. N., Risko, E. F., Stolz, J. A., & Besner, D. (2009). Not all visual features are created equal:
Early processing in letter and word recognition. Psychonomic Bulletin & Review, 16, 67–73.
Lappin, J. (1967). Attention in the identification of stimuli in complex visual displays. Journal of
Experimental Psychology, 75, 321–328.
Larish, J. F., & Flach, J. M. (1990). Sources of optical information useful for perception of speed of
rectilinear self–motion. Journal of Experimental Psychology: Human Perception and Performance, 16,
295–302.
Larrick, R. P. (2004). Debiasing. In D. Koehler and N. Harvey (Eds.), Blackwell handbook of judgment and
decision making (pp. 316–357). Oxford, UK: Blackwell.
Laskowski, S. J., & Redish, J. (2006). Making ballot language understandable to voters. In Proceedings of
the USENIX/ Accurate Electronic Voting Technology Workshop 2006 on Electronic Voting Technology
Workshop. Vancouver, B.C., Canada, USENIX Association: 1–1.
Laszlo, S., & Federmeier, K. D. (2007). The acronym superiority effect. Psychonomic Bulletin & Review, 14,
1158–1163.
Latorella, K. A. (1996). Investigating interruptions—An example from the flightdeck. In Proceedings of the
40th Annual Meeting of the Human Factors and Ergonomics Society (pp. 249–253). Santa Monica, CA:
Human Factors and Ergonomics Society.
Lau, N., Veland, O., Kwok, J., Jamieson, G. A., Burns, C. M., Braseth, A. O., & Welch, R. (2008).
Ecological interface design in the nuclear domain: An application to the secondary subsystems of a boiling
water reactor plant simulator. IEEE Transactions on Nuclear Science, 55, 3579–3596.
Laudeman, I. V., & Palmer, E. A. (1995). Quantitative measurement of observed workload in the analysis of
aircrew performance. International Journal of Aviation Psychology, 5, 187–198.
Laudeman, I. V., Shelden, S. G., Branstrom, R., & Brasil, C. L. (1998). Dynamic density: An air traffic
management metric. Technical Report, NASA–TM–1998–112226. Ames, CA: National Aeronautics and
Space Administration.
Lavie, N. (2010). Attention, distraction and cognitive control under load. Current Directions in Psychological
Science, 19, 143–148.
Layton, C., Smith, P. J., & McCoy, C. E. (1994). Design of a cooperative problem-solving system for enroute
flight planning: An empirical evaluation. Human Factors, 36, 94–119.
Lazarus, R., & Folkman, S. (1984). Stress, appraisal and coping. New York: Springer.
Leachtenauer, J. C. (1978). Peripheral acuity and photo interpretation performance. Human Factors, 20,
537–551.
Lee, D. N. (1976). A theory of visual control of braking based on information about time-to-collision.
Perception, 5, 437–459.
Lee, E., & MacGregor, J. (1985). Minimizing user search time in menu-retrieval systems. Human Factors,
27, 157–162.
Lee, J. D. (2005). Driving safety. In R. Nickerson (Ed.), Reviews of Human Factors and Ergonomics (Vol. 1).
Santa Monica, CA: Human Factors and Ergonomics Society.
Lee, J. D., & Angell, L. (Eds.). (2011). Special issue on driver distraction. Ergonomics in Design, October.
Lee, J. D., Caven, B., Haake, S., & Brown, T. L. (2001). Speech-based interaction with in-vehicle computers:
The effect of speech-based e-mail on drivers’ attention to the roadway. Human Factors, 43, 631–640.
Lee, J. D., & Moray, N. (1992). Trust, control strategies and allocation of function in human-machine
systems. Ergonomics, 35, 1,243–1,270.
Lee, J. D., & Moray, N. (1994). Trust, self confidence, and operator’s adaptation to automation. International
Journal of Human–Computer Studies, 40, 153–184.
Lee, J. D., & Sanquist, T. F. (2000). Augmenting the operator function model with cognitive operations:
Assessing the cognitive demands of technological innovation in ship navigation. IEEE Transactions on
Systems, Man, and Cybernetics. Part A: Systems and Humans, 30, 273–285.
Lee, J. D., & See, J. (2004). Trust in automation and technology: Designing for appropriate reliance. Human
Factors, 46, 50–80.
Lee, J. D., & Seppelt, B. D. (2009). Human factors in automation design. In S. Nof (Ed.), Springer handbook
of automation (pp. 417–436). New York: Springer.
Lee, J. D., Young, K., & Regan, M. (2009). Defining driver distraction. In M. Regan, J. Lee, & K. Young
(Eds.), Driver distraction: Theory, effects and mitigation. Boca Raton, FL: CRC Press.
Lee, K. M. (2004). Why presence occurs: Evolutionary psychology, media equation, and presence. Presence,
13, 494–505.
Lee, Y. C., Lee, J., & Boyle, L. (2007). Visual attention in driving: the effects of cognitive load and visual
disruption. Human Factors, 49, 721–733.
Lees, M. N., & Lee, J. D. (2007). The influence of distraction and driving context on driver response to
imperfect collision warning systems. Ergonomics, 50, 1,264–1,286.
Lehrer, J. U. (2009). How we decide. Boston: Houghton-Mifflin.
Lehto, M. (1997). Decision making. In G. Salvendy (Ed.), Handbook of human factors & ergonomics (pp.
1201–1248). New York: Wiley.
Leibowitz, H. W., Post, R. B., Brandt, T., & Dichgans, J. W. (1982). Implications of recent developments in
dynamic spatial orientation and visual resolution for vehicle guidance. In W. Wertheim & H. W. Leibowitz
(Eds.), Tutorials on motion perception (pp. 231–260). New York: Plenum.
Lei, S., & Roetting, M. (2011). Influence of task combination on EEG spectrum modulation for driver
workload estimation. Human Factors, 53(2), 168–179.
Leonard, J. A. (1959). Tactile choice reactions I. Quarterly Journal of Experimental Psychology, 11, 76–83.
Leroy, G., Helmreich, S., Cowie, J. R., Miller, T., & Zheng, W. (2008). Evaluating online health information:
Beyond readability formulas. In Proceedings of the American Medical Informatics Association Symposium
(pp. 394–398). Bethesda, MD: American Medical Informatics Association.
Danziger, S., Levav, J., & Pesso, A. (2011). Extraneous factors in judicial decisions. Proceedings of the
National Academy of Sciences, 108, 6689–6692.
Leveson, N. (2005). Software challenges in achieving space safety. Journal of the British Interplanetary
Society, 62, 265–272.
Levin, D. T., Momen, N., Drivdahl, S. B., & Simons, D. J. (2000). Change blindness blindness: The
metacognitive error of overestimating change-detection ability. Visual Cognition, 7, 397–412.
Levine, M. (1982). You-are-here maps: Psychological considerations. Environment and Behavior, 14, 221–
237.
Lew, R., Dyre, B. P., & Wotring, B. (2006). Effects of roadway visibility on steering errors while driving in
blowing snow. In Proceedings of the Human Factors and Ergonomics Society—50th Annual Meeting (pp.
1,656–1,660). Santa Monica, CA: Human Factors and Ergonomics Society.
Lewandowsky, S., Little, D., & Kalish, M. L. (2007). Knowledge and expertise. In F. Durso (Ed.), Handbook
of applied cognition (2nd Ed.) (pp. 83–109). New York: Wiley.
Lewandowsky, S., Oberauer, K., and Brown, G. D. A. (2009). No temporal decay in verbal short-term
memory. Trends in Cognitive Science, 13(3), 120–126.
Lewis, K. (2003). Measuring transactive memory systems in the field: Scale development and validation.
Journal of Applied Psychology, 88, 587–604.
Lewis, M. (1998). Designing for human-agent interaction. Artificial Intelligence, 19(2), 67–78.
Li, F. F., VanRullen, R., Koch, C., & Perona, P. (2002). Rapid natural scene categorization in the near
absence of attention. Proceedings of the National Academy of Sciences, 99(14), 9,596–9,601.
Li, L., & Chen, J. (2010). Relative contributions of optic flow, bearing, and splay angle information to lane
keeping. Journal of Vision, 10(11), 1–14.
Li, S. Y., Blandford, A., Cairns, P., & Young, R. M. (2008). The effect of interruptions on postcompletion
and other procedural errors: An account based on the activation-based goal memory model. Journal of
Experimental Psychology: Applied, 14, 314–328.
Li, Z., & Durgin, F. H. (2009). Downhill slopes look shallower from the edge. Journal of Vision, 9(11):6, 1–
15.
Liang, D. W., Moreland, R., & Argote, L. (1995). Group versus individual training and group performance:
the mediating role of transactive memory. Personality and Social Psychology Bulletin, 21, 384–393.
Liao, J., & Moray, N. (1993). A simulation study of human performance deterioration and mental workload.
Le Travail humain, 56(4), 321–344.
Liao, T. W. (2003). Classification of welding flaw types with fuzzy expert systems. Expert Systems with
Applications, 25, 101–111.
Liben, L. (2009). The road to understanding maps. Current Directions in Psychological Science, 18, 310–315.
Lieberman, H. R., Bathalon, G. P., Falco, C. M., Kramer, F. M., Morgan, C. A., & Niro, P. (2004). Severe
decrements in cognition function and mood induced by sleep loss, heat, dehydration, and undernutrition
during simulated combat. Biological Psychiatry, 57, 422–429.
Linden, D. E. J., Bittner, R., Muckli, L., Waltz, J. A., Kriegeskorte, N., Goebel, R., Singer, W., &
Munk, M. H. J. (2003). Cortical capacity constraints for visual working memory: dissociation of fMRI load
effects in a fronto-parietal network. NeuroImage, 20, 1,518–1,530.
Lindsay, P. H., & Norman, D. A. (1972). Human information processing. New York: Academic Press.
Lindsay, R. C. L. (1999). Applying applied research: Selling the sequential line-up. Applied Cognitive
Psychology, 13, 219–225.
Lindsay, R. C. L., & Wells, G. L. (1985). Improving eye-witness identification from lineups: simultaneous
versus sequential lineup presentations. Journal of Applied Psychology, 70, 556–564.
Ling, J., & van Schaik, P. (2004). The effects of link format and screen location on visual search of web
pages. Ergonomics, 47, 907–921.
Lintern, G. (2012). Work-focused analysis and design. Cognition, Technology, and Work, 14, 71–81.
Lintern, G., Roscoe, S. N., & Sivier, J. E. (1990). Display principles, control dynamics, and environmental
factors in pilot training and transfer. Human Factors, 32, 299–317.
Lintern, G., & Wickens, C. D. (1991). Issues for acquisition in transfer of timesharing and dual-task skills. In
D. Damos (Ed.), Multiple-task performance. (pp. 123–138). London: Taylor & Francis.
Lipshitz, R. (1997). Naturalistic decision making perspectives on decision errors. In C. E. Zsambok & G.
Klein (Eds.), Naturalistic decision making (pp. 151–162). Mahwah, NJ: Erlbaum.
Lipshitz, R., & Cohen, M. S. (2005). Warrants for prescription: Analytically and empirically base approaches
to improving decision making. Human Factors, 47, 102–120.
Liu, Y. (1996). Quantitative assessment of effects of visual scanning on concurrent task performance.
Ergonomics, 39, 382–289.
Liu, Y. C., Fuld, R., & Wickens, C. D. (1993). Monitoring behavior in manual and automated scheduling
systems. International Journal of Man–Machine Studies, 39, 1,015–1,029.
Liu, Y. C., & Wen, M. H. (2004). Comparison of head-up display (HUD) vs. head-down display (HDD):
driving performance of commercial vehicle operators in Taiwan. International Journal of Human–
Computer Studies, 61, 679–697.
Liu, Y. C., & Wickens, C. D. (1992). Use of computer graphics and cluster analysis in aiding relational
judgment. Human Factors, 34, 165–178.
Liu, Y. C., & Wickens, C. D. (1992). Visual scanning with or without spatial uncertainty and divided and
selective attention. Acta Psychologica, 79, 131–153.
Liu, Y. C., Zhang, X., & Chaffin, D. (1997). Perception and visualization of human posture information for
computer-aided ergonomic analysis. Ergonomics, 40, 819–833.
Liuzzo, J., & Drury, C. G. (1978). An evaluation of blink inspection. Human Factors, 11, 201–210.
Lockhead, G. R., & King, M. C. (1977). Classifying integral stimuli. Journal of Experimental Psychology:
Human Perception & Performance, 3, 436–443.
Lockhead, G. R., & Klemmer, E. T. (1959, November). An evaluation of an 8-k wordwriting typewriter (IBM
Research Report RC–180). Yorktown Heights, NY: IBM Research Center.
Loeb, M., & Binford, J. R. (1968). Variation in performance on auditory and visual monitoring tasks as a
function of signal and stimulus frequencies. Perception & Psychophysics, 4, 361–367.
Loft, S., Sanderson, P., Neal, A., & Mooij, M. (2007). Modeling and predicting mental workload in en route
air traffic control: Critical review and broader implications. Human Factors, 49, 376–399.
Loft, S., Smith, R. E., & Bhaskara, A. (2009). Designing memory aids to facilitate intentions to deviate from
routine in an air traffic control simulation. In Proceedings of the Human Factors and Ergonomics Society
53rd Annual Meeting (pp. 56–60). Santa Monica, CA: Human Factors and Ergonomics Society.
Loftus, E. F. (1979). Eyewitness testimony. Cambridge, MA: Harvard University Press.
Loftus, E. F. (2005). Planting misinformation in the human mind: A 30-year investigation of the malleability
of memory. Learning & Memory, 12, 361–366.
Loftus, E. F., Coan, J. A. and Pickrell, J. E. (1996). Manufacturing false memories using bits of reality. In L.
M. Reder (Ed.), Implicit memory and metacognition (pp. 195–220). Hillsdale, NJ: Erlbaum.
Loftus, G. R., Dark, V. J., & Williams, D. (1979). Short-term memory factors in ground controller/pilot
communications. Human Factors, 21, 169–181.
Logan, G. D. (2004). Cumulative progress in formal theories of attention. Annual Review of Psychology, 55,
207–234.
Logan, G., & Klapp, S. (1991). Automatizing alphabet arithmetic. Journal of Experimental Psychology:
Learning, Memory, & Cognition, 17, 179–195.
Logie, R. H. (1995). Visuo-spatial working memory. Hove, UK: Erlbaum.
Logie, R. H. (2011). The functional organization and capacity limits of working memory. Current Directions
in Psychological Science, 20(4), 240–245.
Logie, R., Baddeley, A., Mane, A., Donchin, E., & Sheptak, R. (1989). Working memory in the acquisition
of complex cognitive skills. Acta Psychologica, 71, 53–87.
Lohse, G. L. (1993). A cognitive model for understanding graphical perception. Human–Computer Interaction,
8, 353–388.
Long, J. (1976). Effects of delayed irregular feedback on unskilled and skilled keying performance.
Ergonomics, 19, 183–202.
Loomis, J. M., & Knapp, J. M. (2003). Visual perception of egocentric distance in real and virtual
environments. In L. J. Hettinger & M. W. Hass (Eds.), Virtual and Adaptive Environments. Hillsdale, NJ:
Erlbaum.
Lopes, L. L. (1982, October). Procedural debiasing (Technical Report WHIPP 15). Madison, WI: Wisconsin
Human Information Processing Program.
Lorenz, B., Di Nocera, F., Roettger, S., & Parasuraman, R. (2002). Automated fault management in a
simulated space flight microworld. Aviation, Space, & Environmental Medicine, 73, 886–897.
Loukopoulos, L., Dismukes, R. K., & Barshi, E. (2009). The multi-tasking myth. Burlington, VT: Ashgate.
Loveless, N. E. (1963). Direction of motion stereotypes: A review. Ergonomics, 5, 357–383.
Lu, S., Wickens, C. D., Sarter, N., & Sebok, A. (2011). Informing the design of multimodal displays: A meta-
analysis of empirical studies comparing auditory and tactile interruptions. In Proceedings of the 55th
Annual Meeting of the Human Factors and Ergonomics Society (pp. 1,155–1,159). Santa Monica, CA:
Human Factors and Ergonomics Society.
Luce, R. D. (2003). Whatever happened to information theory in psychology? Review of General Psychology,
7, 183–188.
Luce, R. D., Nosofsky, R. M., Green, D. M., & Smith, A. F. (1982). The bow and sequential effects in
absolute identification. Perception & Psychophysics, 32, 397–408.
Luchins, A. S. (1942). Mechanizations in problem solving: The effect of Einstellung. Psychological
Monographs, 54 (Whole No. 248).
Luo, Z., Wickens, C. D., Duh, H. B. L., & Chen, I. (2010). Integrating route and survey learning in complex
virtual environments: Using a 3D map. In Proceedings of the Human Factors and Ergonomics Society 54th
Annual Meeting (pp. 2,393–2,397). Santa Monica, CA: Human Factors and Ergonomics Society.
Lusk, C. M. (1993). Assessing components of judgment in an operational setting: The effects of time pressure
on aviation weather forecasting. In O. Svenson & A. J. Maule (Eds.), Time pressure and stress in human
judgment and decision making (pp. 309–322). New York: Plenum.
Lusted, L. B. (1976). Clinical decision making. In D. Dombal & J. Grevy (Eds.), Decision making and
medical care. Amsterdam: North Holland.
Luus, C. A. E., & Wells, G. L. (1991). Eyewitness identification and the selection of distracters for lineups.
Law and Human Behavior, 15, 43–57.
Lyall, B., & Wickens, C. D. (2005). Mixed fleet flying between two commercial aircraft types: An empirical
evaluation of the role of negative transfer. Proceedings of the 49th Annual Meeting of the Human Factors
& Ergonomics Society. Santa Monica, CA: HFES.
Ma, J., Hu, Y., & Loizou, P. C. (2009). Objective measures for predicting speech intelligibility in noisy
conditions based on new band-importance functions. Journal of the Acoustical Society of America, 125(5),
3,387–3,405.
Macedo, J., Kaber, D., Endsley, M., Powanusorn, P., & Myung, S. (1998). The effect of automated
compensation for incongruent axes on teleoperator performance. Human Factors, 40, 541–553.
MacGregor, D., & Slovic, P. (1986). Graphic representation of judgmental information. Human-Computer
Interaction, 2, 179–200.
MacGregor, D., Fischhoff, B., & Blackshaw, L. (1987). Search success and expectations with a computer
interface. Information Processing and Management, 23, 419–432.
MacGregor, J. N., & Chu, Y. (2010). Human performance on the traveling salesman and related problems: A
review. Journal of Problem Solving, 3, 1–29.
MacGregor, J. N., & Ormerod, T. (1996). Human performance on the traveling salesman problem. Perception
& Psychophysics, 58, 527–539.
MacGregor, J. N., Chronicle, E. P., & Ormerod, T. C. (2004). Convex hull or crossing avoidance? Solution
heuristics in the traveling salesperson problem. Memory & Cognition, 32, 260–270.
Mack, A., & Rock, I. (1998). Inattentional blindness. Cambridge, MA: MIT Press.
Mackinlay, J. D., Robertson, G. G., & Card, S. K. (1991). The perspective wall: Detail and context smoothly
integrated. In Proceedings of CHI ’91: Human Factors in Computing Systems (pp. 173–179). New York:
Association for Computing Machinery.
Mackworth, J. F., & Taylor, M. M. (1963). The d’ measure of signal detectability in vigilance–like situations.
Canadian Journal of Psychology, 17, 302–325.
Mackworth, N. H. (1948). The breakdown of vigilance during prolonged visual search. Quarterly Journal of
Experimental Psychology, 1, 5–61.
Mackworth, N. H. (1950). Research in the measurement of human performance (MRC Special Report Series
No. 268). London: H. M. Stationery Office. Reprinted in W. Sinaiko (Ed.), Selected papers on human
factors in the design and use of control systems. New York: Dover, 1961.
MacLean, K. A., Ferrer, E., Aichele, S. R., Bridwell, D. A., Zanesco, A. P., et al. (2010). Intensive
meditation training improves perceptual discrimination and sustained attention. Psychological Science, 21,
829–839.
MacLeod, C. M. (1991). Half a century of research on the Stroop effect: An integrative review. Psychological
Bulletin, 109, 163–203.
MacMahon, C., & Starkes, J. L. (2008). Contextual influences on baseball ball-strike decisions in umpires,
players, and controls. Journal of Sports Sciences, 26, 751–760.
Macmillan, N. A., & Creelman, C. D. (1990). Response bias: Characteristics of detection theory, threshold
theory, and “nonparametric” indexes. Psychological Bulletin, 107, 401–413.
Macmillan, N. A., & Creelman, C. D. (1996). Triangles in ROC space: History and theory of
“nonparametric” measures of sensitivity and response bias. Psychonomic Bulletin and Review, 3, 164–170.
Macmillan, N. A., & Creelman, C. D. (2005). Detection theory: A user’s guide (2nd Ed.). Mahwah, NJ:
Erlbaum.
Maddox, W. T. (2002). Toward a unified theory of decision criterion learning in perceptual categorization.
Journal of the Experimental Analysis of Behavior, 78, 567–595.
Maddox, W. T., & Ashby, F. G. (1996). Perceptual separability, decisional separability, and the identification-
speeded classification relationship. Journal of Experimental Psychology: Human Perception and
Performance, 22, 795–817.
Madhavan, P., & Wiegmann, D. (2007). Similarities and differences between human-human and human-
automation trust: an integrative review. Theoretical Issues in Ergonomics Science, 8, 270–301.
Madhavan, P., Lacson, F., & Wiegmann, D. (2006). Automation failures on tasks easily performed by
operators undermine trust in automated aids. Human Factors, 48, 241–256.
Maki, R. H., Maki, W. S., & Marsh, L. G. (1977). Processing locational and orientational information.
Memory & Cognition, 5, 602–612.
Malcolm, R. (1984). Pilot disorientation and the use of a peripheral vision display. Aviation, Space, and
Environmental Medicine, 55, 231–238.
Malhotra, N. K. (1982). Information load and consumer decision making. Journal of Consumer Research, 8,
419–430.
Malpass, R. S., & Devine, P. G. (1981). Eyewitness identification: lineup instructions and the absence of the
offender. Journal of Applied Psychology, 66, 482–489.
Maltz, M., & Shinar, D. (2003). New alternative methods in analyzing human behavior in cued target
acquisition. Human Factors, 45, 281–295.
Mane, A., Adams, J., & Donchin, E. (1989) Adaptive and part-whole training in the acquisition of a complex
perceptual-motor skill. Acta Psychologica, 71, 179–196.
Manzey, D., Luz, M., Mueller, S., Dietz, A., Meixensberger, J., & Strauss, G. (2011). Automation in surgery:
The impact of navigation-control assistance on performance, workload, situation awareness, and acquisition
of surgical skills. Human Factors, 53, 544–599.
Manzey, D., Reichenbach, J., & Onnasch, L. (2012). Human performance consequences of automated
decision aids: The impact of degree of automation and system experience. Journal of Cognitive
Engineering and Decision Making, 6, 1–31.
Marescaux, J., Leroy, J., Gagner, M., Rubino, F., Mutter, D., Vix, M., Butner, S. E., & Smith, M. K. (2001).
Transatlantic robot-assisted telesurgery. Nature, 413, 379–380.
Marshall, D. C., Lee, J. D., & Austria, P. A. (2007). Alerts for in-vehicle information systems: Annoyance,
urgency and appropriateness. Human Factors, 49, 145–157.
Marshall, D., Lee, J. D., & Austria, A. (2001). Annoyance and urgency of auditory alerts for in-vehicle
information systems. In Proceedings of the Human Factors and Ergonomics Society 45th Annual Meeting
(pp. 1627–1631). Santa Monica, CA: Human Factors and Ergonomics Society.
Martens, M. H. (2011). Change detection in traffic: Where do we look and what do we perceive?
Transportation Research Part F, 14, 240–250.
Martin, B. A., Brown, N. L., and Hicks, J.L. (2011). Ongoing task delays affect prospective memory more
powerfully than filler task delays. Canadian Journal of Experimental Psychology, 65, 48–56.
Martin, G. (1989). The utility of speech input in user-computer interfaces. International Journal of Man-
Machine Studies, 18, 355–376.
Martin, R. C., Wogalter, M. S., & Forlano, J. G. (1988). Reading comprehension in the presence of
unattended speech and music. Journal of Memory and Language, 27, 382–398.
Masalonis, A. J., & Parasuraman, R. (2003). Fuzzy signal detection theory: Analysis of human and machine
performance in air traffic control, and analytic considerations. Ergonomics, 46, 1,045–1,074.
Mattes, S., & Hallen, A. (2009). Surrogate distraction measurement techniques. In M. Regan, J. Lee, & K.
Young (Eds.), Driver distraction. Boca Raton, FL: CRC Press.
Matthews, G. (2001). Levels of transaction: A cognitive science framework for operator stress. In P. A.
Hancock and P. Desmond (Eds.), Stress, workload, and fatigue (pp. 5–33). Mahwah, NJ: Erlbaum.
Matthews, G., & Davies, D. R. (2001). Individual differences in energetic arousal and sustained attention: A
dual-task study. Personality and Individual Differences, 31, 575–589.
Matthews, G., & Desmond, P. (2001). A transactional model of driver stress. In P. A. Hancock and P.
Desmond (Eds.), Stress, workload, and fatigue (pp. 133–163). Mahwah, NJ: Erlbaum.
Matthews, G., Davies, D. R., & Holley, P. J. (1993). Cognitive predictors of vigilance. Human Factors, 35,
3–24.
Matthews, G., Davies, D. R., Westerman, S. J., & Stammers, R. B. (2000). Human performance: Cognition,
stress, and individual differences. Hove, UK: Psychology Press.
Matthews, G., Warm, J., Reinerman-Jones, L., Langheim, L., Washburn, D., & Tripp, L. (2010). Task
engagement, cerebral blood flow velocity, and diagnostic monitoring for sustained attention. Journal of
Experimental Psychology: Applied, 16, 187–203.
Matthews, M. D., Eid, J., Johnsen, B. H., & Boe, O. C. (2011). A comparison of expert ratings and self-
assessments of situation awareness during a combat fatigue course. Military Psychology, 23, 125–136.
Maule, A. J., & Hockey, G. R. J. (1993). State, stress, and time pressure. In O. Svenson & A. J. Maule (Eds.),
Time pressure and stress in human judgment and decision making (pp. 83–102). New York: Plenum Press.
May, P. A., Campbell, M., & Wickens, C. D. (1996). Perspective displays for air traffic control: Display of
terrain and weather. Air Traffic Control Quarterly, 3(1), 1–17.
Mayer, A., Boron, J. B., Kress, C., Fisk, A. D., & Rogers, W. A. (2007). Caution! Warning effectiveness may
be more obfuscated than it appears: Making sense of the warning literature. In Proceedings of the Human
Factors and Ergonomics Society 51st Annual Meeting (pp. 1,511–1,513). Santa Monica, CA: Human
Factors and Ergonomics Society.
Mayer, R. E. (2001). Multimedia learning. New York: Cambridge University Press.
Mayer, R. E. (in press). Multi-media instruction. In Handbook of research on educational communications
and technology.
Mayer, R. (2007). Research guidelines for multimedia instructions. In F. Durso (Ed.), Reviews of Human
Factors and Ergonomics (Vol. 5). Santa Monica, CA: Human Factors and Ergonomics Society.
Mayer, R., Griffith, I., Jurkowitz, N., & Rothman, D. (2008). Increased interestingness of extraneous details
in a multimedia science presentation leads to decreased learning. Journal of Experimental Psychology:
Applied, 14, 329–339.
Mayer, R., Hegarty, M., Mayer, S., & Campbell, J. (2005). When static media promote active learning.
Journal of Experimental Psychology: Applied, 11, 256–265.
Mayer, R. E., & Moreno, R. (2003). Nine ways to reduce cognitive load in multimedia learning. Educational
Psychologist, 38, 45–52.
Mayer, R. E., & Johnson, C. I. (2008). Revising the redundancy principle in multimedia learning. Journal of
Educational Psychology, 100, 380–386.
Mayeur, A., Bremond, R., & Bastien, J. M. C. (2008). Effect of task and eccentricity of the target on
detection thresholds in mesopic vision: Implications for road lighting. Human Factors, 50, 712–721.
Mayhew, D. J. (1992). Principles and guidelines in software user interface design. Englewood Cliffs, NJ:
Prentice–Hall.
McBride, D. M., Beckner, J. K., & Abney, D. H. (2011). Effects of delay of prospective memory cues in an
ongoing task on prospective memory task performance. Memory & Cognition, 39, 1,222–1,231.
McCarley, J. S. (2009). Effects of speed-accuracy instructions on oculomotor scanning and target recognition
in a simulated baggage X-ray screening task. Ergonomics, 52, 325–333.
McCarley, J. S., Kramer, A. F., Wickens, C. D., Vidoni, E. D., & Boot, W. R. (2004). Visual skills in airport-
security screening. Psychological Science, 15, 302–306.
McCarley, J. S., Vais, M. J., Pringle, H., Kramer, A. F., Irwin, D. E., & Strayer, D. L. (2004). Conversation
disrupts change detection in complex traffic scenes. Human Factors, 46, 424–436.
McCarthy, G., & Donchin, E. (1979). Event-related potentials: Manifestation of cognitive activity. In F.
Hoffmeister & C. Muller (Eds.), Bayer Symposium VIII: Brain function in old age. New York: Springer.
McClelland, J. L. (1979). On the time-relations of mental processes: An examination of processes in cascade.
Psychological Review, 86, 287–330.
McConkie, G. W. (1983). Eye movements and perception during reading. In K. Rayner (Ed.), Eye movements
in reading. New York: Academic Press.
McCormick, E., Wickens, C. D., Banks, R., & Yeh, M. (1998). Frame of reference effects on scientific
visualization subtasks. Human Factors, 40, 443–451.
McDaniel, M., Howard, D., & Einstein, G. (2009). The read-recite-review study strategy. Psychological
Science, 20, 516–522.
McDaniel, M. A., & Einstein, G. O. (2007). Prospective memory: An overview and synthesis of an emerging
field. Thousand Oaks, CA: Sage.
McDaniel, M. A., Einstein, G. O., Graham, T., & Rall, E. (2004). Delaying execution of intentions:
Overcoming the costs of interruptions. Applied Cognitive Psychology, 18, 533–547.
McDougall, S. J. P., De Bruijn, O., & Curry, M. B. (2000). Exploring the effects of icon characteristics on
user performance: The role of icon concreteness, complexity, and distinctiveness. Journal of Experimental
Psychology: Applied, 6, 291–306.
McDougall, S., Forsythe, A., Isherwood, S., Petocz, A., Reppa, I., & Stevens, C. (2009). The Use of
Multimodal Representation in Icon Interpretation. In D. Harris (Ed.). Engineering Psychology and
Cognitive Ergonomics (pp. 62–70). Berlin: Springer.
McDougall, S., Reppa, I., Smith, G., & Playfoot, D. (2009). Beyond emoticons: Combining affect and
cognition in icon design. In D. Harris (Ed.). Engineering Psychology and Cognitive Ergonomics (pp. 71–
80). Berlin: Springer.
McFall, R. M., & Treat, T. A. (1999). Quantifying the information value of clinical assessments with signal
detection theory. Annual Review of Psychology, 50, 215–241.
McFarland, C., and Glisky, E. (2011). Implementation intentions and imagery: individual and combined
effects on prospective memory among young adults. Memory & Cognition, 40, 62–69.
McFarlane, D. C., & Latorella, K. A. (2002). The source and importance of human interruption in human-
computer interface design. Human-Computer Interaction, 17, 1–61.
McGeoch, J. A. (1936). Studies in retroactive inhibition: VII. Retroactive inhibition as a function of the
length and frequency of presentation of the interpolated lists. Journal of Experimental Psychology, 19, 674–
693.
McGookin, D. K., & Brewster, S. A. (2004). Understanding concurrent earcons: Applying auditory scene
analysis principles to concurrent earcon recognition. ACM Transactions on Applied Perception, 1(2), 130–
155.
McGowan, A., & Banbury, S. (2004). Evaluating interruption-based techniques using embedded measures of
driver anticipation. In S. Banbury & S. Tremblay (Eds.), A cognitive approach to situation awareness:
Theory and application (pp.176–192). Aldershot, UK: Ashgate.
McGrath, B. J., Estrada, A., Braithwaite, M. G., Raj, A. K., & Rupert, A. H. (2004). Tactile situation
awareness system flight demonstration final report. U.S. Army Report USAARL 2004–10. Fort Rucker,
AL: United States Army Aeromedical Research Laboratory, Aircrew Health and Performance Division.
McGraw, A. P., Larsen, J. T., Kahneman, D., & Schkade, D. (2010). Comparing gains and losses.
Psychological Science, 10, 1,438–1,445.
McIntire, J. P., Havig, P. R., Watamaniuk, S. N. J., & Gilkey, R. H. (2010). Visual search performance with
3-D auditory cues: Effects of motion, target location, and practice. Human Factors, 52, 41–53.
McKee, S. P., & Nakayama, K. (1984). The detection of motion in the peripheral visual field. Vision
Research, 24, 25–32.
McKee, S. P., Levi, D. M., & Bowne, S. F. (1990). The imprecision of stereopsis. Vision Research, 30,
1,763–1,779.
McNeil, B. J., Pauker, S. G., Sox, H. C., Jr., & Tversky, A. (1982). On the elicitation of preferences for alternative
therapies. New England Journal of Medicine, 306, 1,259–1,262.
McTeague, J. (2011). Crapshoot investing. New York: Free Trade Press.
McVay, J., & Kane, M. (2009). Conducting the train of thought: Working memory capacity, goal neglect, and
mind wandering in an executive-control task. Journal of Experimental Psychology: Learning, Memory and
Cognition, 35, 196–204.
Meehl, P. C. (1954). Clinical versus statistical prediction. Minneapolis: University of Minnesota Press.
Meichenbaum, D. (1985). Stress inoculation training. New York: Pergamon.
Meichenbaum, D. (1993). Stress inoculation training: A twenty year update. In R. L. Woolfolk, & P. M.
Lehrer (Eds.), Principles and practice of stress management (2nd ed., pp. 373–406). New York: Guilford.
Meiran, N. (1996). Reconfiguration of processing mode prior to task performance. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 22, 1,423–1,442.
Meissner, C. A., Tredoux, C. G., Parker, J. F., & MacLin, O. H. (2005). Eyewitness decisions in
simultaneous and sequential lineups: A dual-process signal detection theory analysis. Memory & Cognition,
33, 783–792.
Melara, R. D., & Mounts, J. R. W. (1994). Contextual influences on interactive processing: Effects of
discriminability, quantity, and uncertainty. Perception & Psychophysics, 56, 73–90.
Mellers, B. A., Schwartz, A., & Cooke, A. D. J. (1998). Judgment and decision making. Annual Review of
Psychology, 49, 447–477.
Melton, A. W. (Ed.). (1947). Apparatus tests. USAAF Aviation Psychology Program Research Report No. 4
(pp. 917–921).
Melton, A. W. (1963). Implications of short-term memory for a general theory of memory. Journal of Verbal
Learning and Verbal Behavior, 2, 1–21.
Memmert, D. (2006). The effects of eye movements, age, and expertise on inattentional blindness.
Consciousness and Cognition, 15, 620–627.
Merkel, J. (1885). Die zeitlichen Verhältnisse der Willensthätigkeit. Philosophische Studien, 2, 73–127.
Merritt, S. M., & Ilgen, D. R. (2008). Not all trust is created equal: dispositional and history–based trust in
human-automation interactions. Human Factors, 50, 194–210.
Merwin, D. H., Vincow, M. A., & Wickens, C. D. (1994). Visual analysis of scientific data: Comparison of
3D-topographic, color, and gray scale displays in a feature detection task. In Proceedings of the Human
Factors and Ergonomics Society 38th Annual Meeting (pp. 240–244). Santa Monica, CA: Human Factors
and Ergonomics Society.
Merwin, D. H., & Wickens, C. D. (1993). Comparison of eight color and gray scales for displaying
continuous 2D data. In Proceedings of the 37th Annual Meeting of the Human Factors Society. Santa
Monica, CA: The Human Factors and Ergonomics Society.
Metzger, U., & Parasuraman, R. (2001). The role of the air traffic controller in future air traffic management:
An empirical study of active control versus passive monitoring. Human Factors, 43, 519–528.
Metzger, U., & Parasuraman, R. (2005). Automation in future air traffic management: Effects of decision aid
reliability on controller performance and mental workload. Human Factors, 47, 35–49.
Meyer, D. E., & Kieras, D. E. (1997a). A computational theory of executive cognitive processes and
multiple-task performance: Part 1. Basic mechanisms. Psychological Review, 104, 3–65.
Meyer, D. E., & Kieras, D. E. (1997b). A computational theory of executive cognitive processes and
multiple-task performance: Part 2. Accounts of psychological refractory-period phenomena. Psychological
Review, 104, 749–791.
Meyer, J. (2001). Effects of warning validity and proximity on responses to warnings. Human Factors, 43,
563–572.
Meyer, J. (2004). Conceptual issues in the study of dynamic hazard warnings. Human Factors, 46, 196–204.
Meyer, J., Shinar, D., & Leiser, D. (1997). Multiple factors that determine performance with tables and
graphs. Human Factors, 39, 268–286.
Meyer, J., Taieb, M., & Flascher, I. (1997). Correlation estimates as perceptual judgments. Journal of
Experimental Psychology: Applied, 3, 3–20.
Michinov, N., & Michinov, E. (2009). Investigating the relationship between transactive memory and
performance in collaborative learning. Learning and Instruction, 19, 43–54.
Micire, M. J. (2010). Multi-touch interaction for robot command and control. Unpublished doctoral
dissertation, University of Massachusetts, Lowell, Department of Computer Science.
Miles, K. S., & Cottle, J. L. (2011). Beyond plain language: A learner-centered approach to pattern jury
instructions. Technical Communication Quarterly, 20(1), 92–112.
Milgram, P., & Colquhoun, H., Jr. (1999). A taxonomy of real and virtual world display integration. In Y.
Ohta & H. Tamura (Eds.), Mixed reality—merging real and virtual worlds (pp. 5–30). Berlin: Springer-
Verlag.
Milgram, S., & Jodelet, D. (1976). Psychological maps of Paris. In H. M. Proshansky, W. H. Itelson, & L. G.
Revlin (Eds.), Environmental psychology. New York: Holt Rinehart & Winston.
Miller, C., & Parasuraman, R. (2007). Designing for flexible interaction between humans and automation:
Delegation interfaces for supervisory control. Human Factors, 49, 57–75.
Miller, R. B. (1968). Response time in man-computer conversational transactions. In Proceedings of 1968 Fall
Joint Computer Conference. Arlington, VA: AFIPS Press.
Miller, D., & Swain, A. (1987). Human reliability analysis. In G. Salvendy (Ed.), Handbook of human factors.
New York: Wiley.
Miller, G. A. (1956). The magical number seven plus or minus two: Some limits on our capacity for
processing information. Psychological Review, 63, 81–97.
Miller, G. A., & Isard, S. (1963). Some perceptual consequences of linguistic rules. Journal of Verbal
Learning and Verbal Behavior, 2, 217–228.
Miller, R. J., & Penningroth, S. (1997). The effects of response format and other variables on comparisons of
digital and dial displays. Human Factors, 39, 417–424.
Mischel, W., Shoda, Y., & Rodriguez, M. L. (1989). Delay of gratification in children. Science, 244, 933-938.
Misra, S., Ramesh, K. T., & Okamura, A. M. (2008). Modeling of tool-tissue interactions for computer-based
surgical simulation: A literature review. Presence, 17, 463–491.
Mitchell, J., & Shneiderman, B. (1989). Dynamic versus static menus: An exploratory comparison. ACM
SIGCHI Bulletin, 20(4), 33–37.
Mitta, D., & Gunning, D. (1993). Simplifying graphics-based data: Applying the fisheye lens viewing
strategy. Behaviour & Information Technology, 12, 1–16.
Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000). The unity
and diversity of executive functions and their contributions to complex “frontal lobe” tasks: A latent
variable analysis. Cognitive Psychology, 41, 49–100.
Miyake, A., Friedman, N. P., Rettinger, D. A., Shah, P., & Hegarty, M. (2001). How are visuospatial
working memory, executive functioning, and spatial abilities related? A latent-variable analysis. Journal of
Experimental Psychology: General, 130, 621–664.
Moertl, P. M., Canning, J. M., Gronlund, S. D., Dougherty, M. R. P., Johansson, J., & Mills, S. H. (2002).
Aiding planning in air traffic control: An experimental investigation of the effects of perceptual information
integration. Human Factors, 44, 404–412.
Molden, D., & Hui, C. (2011). Promoting deescalation of commitment: a regulatory focus perspective on
sunk costs. Psychological Science, 22, 8–12.
Molloy, R., & Parasuraman, R. (1996). Monitoring an automated system for a single failure: Vigilance and
task complexity effects. Human Factors, 38, 311–322.
Mondor, T. A., & Zatorre, R. J. (1995). Shifting and focusing auditory spatial attention. Journal of
Experimental Psychology: Human Perception & Performance, 21, 387–409.
Mondor, T. A., Zatorre, R. J., & Terrio, N. A. (1998). Constraints on the selection of auditory information.
Journal of Experimental Psychology: Human Perception and Performance, 24, 66–79.
Monk, C., Boehm-Davis, D., & Trafton, J. G. (2004). Recovering from interruptions: implications for driver
distraction research. Human Factors, 46, 650–664.
Monk, C., Trafton, G., & Boehm-Davis, D. (2008) The effect of interruption duration and demand on
resuming suspended goals. Journal of Experimental Psychology: Applied, 13, 299–315.
Monsell, S. (2003). Task switching. Trends in Cognitive Science, 7, 134–140.
Montello, D. (1995). Navigation. In P. Shah & A. S. Miyaki (Eds.), The Cambridge handbook of visuospatial
thinking. Cambridge UK: Cambridge University Press.
Montgomery, H., & Shareafi, P. (2004). Engaging in activities involving information technology:
Dimensions, mode and flow. Human Factors, 46, 334–348.
Moore, A. B., Clark, B. A., & Kane, M. J. (2008). Who shalt not kill? Individual differences in working
memory capacity, executive control and moral judgement. Psychological Science, 19(6), 549–557.
Moore, G. E. (1965). Cramming more components onto integrated circuits. Electronics Magazine, 38 (8),
114–117.
Moray, N. (1959). Attention in dichotic listening. Quarterly Journal of Experimental Psychology, 11, 56–60.
Moray, N. (Ed.). (1979). Mental workload: Its theory and measurement. New York: Plenum.
Moray, N. (1984). Attention to dynamic visual displays in man-machine systems. In R. Parasuraman & D. R.
Davies (Eds.), Varieties of attention (pp. 485–513). San Diego, CA: Academic Press.
Moray, N. (1986). Monitoring behavior and supervisory control. In K. R. Boff, L. Kaufman, & J. P. Thomas
(Eds.), Handbook of perception and performance (Vol. II, pp. 40-1–40-51). New York: Wiley.
Moray, N. (1988). Mental workload since 1979. International Reviews of Ergonomics, 2, 123–150.
Moray, N. (1997). Human factors in process control. In G. Salvendy (Ed.), Handbook of ergonomics and
human factors (pp. 1944–1971). New York: Wiley.
Moray, N. (1999). Mental models in theory and practice. In D. Gopher & A. Koriat (Eds.), Attention and
performance XVII: Cognitive regulation of performance (pp. 223–258). Cambridge, MA: MIT Press.
Moray, N. (2003). Monitoring, complacency, scepticism and eutactic behaviour. International Journal of
Industrial Ergonomics, 31, 175–178.
Moray, N., & Inagaki, T. (2000). Attention and complacency. Theoretical Issues in Ergonomics Science, 1,
354–365.
Moray, N., & Rotenberg, I. (1989). Fault management in process control: Eye movements and action.
Ergonomics, 32, 1,319–1,342.
Moray, N., Dessouky, M. I., Kijowski, B. A., & Adapathya, R. (1991). Strategic behavior, workload and
performance in task scheduling. Human Factors, 33, 607–629.
Moray, N., King, K. R., Turksen, R., & Waterton, K. (1987). A closed-loop model of workload based on a
comparison of fuzzy and crisp measurement techniques. Human Factors, 29, 339–348.
Moreland, R. L., & Myaskovsky, L. (2000). Exploring the performance benefits of group training:
Transactive memory or improved communication? Organizational Behavior and Human Decision
Processes, 82, 117–133.
Morgan, P., Patrick, J., Waldron, S., King, S., & Patrick, T. (2009). Improving memory after interruption:
exploiting soft constraints and manipulating information access cost. Journal of Experimental Psychology:
Applied, 15, 291–306.
Mori, H., & Hayashi, Y. (1995). Visual interference with users’ tasks on multiwindow systems. International
Journal of Human–Computer Interaction, 7, 329–340.
Morrow, D. G., North, R., & Wickens, C. D. (2006). Reducing and mitigating human error in medicine. In R.
S. Nickerson (Ed.), Reviews of Human Factors and Ergonomics (Vol. 1, pp. 254–296). Santa Monica, CA:
Human Factors and Ergonomics Society.
Morrow, D. G., Weiner, M., Steinley, D., Young, J., & Murray, M. D. (2007). Patients’ health literacy and
experience with instructions—Influence preferences for heart failure medication instructions. Journal of
Aging and Health, 19, (4), 575–593.
Moses, F. L., & Ehrenreich, S. L. (1981). Abbreviations for automated systems. In R. Sugarman (Ed.), In
Proceedings of the 25th Annual Meeting of the Human Factors Society. Santa Monica, CA: Human Factors
Society.
Moses, F. L., Maisano, R. E., & Bersh, P. (1979). Natural associations between symbols and military
information. In C. Bensel (Ed.), Proceedings of the 23rd Annual Meeting of the Human Factors Society.
Santa Monica, CA: Human Factors Society.
Mosier, K. L., & Fischer, U. (2010). Judgment and decision making by individuals and teams: Issues, models,
and applications. Reviews of Human Factors and Ergonomics, 6, 198–255.
Mosier, K. L., & Skitka, L. J. (1996). Human decision makers and automated decision aids: Made for each
other? In R. Parasuraman & M. Mouloua (Eds.), Automation and human performance: Theory and
application (pp. 201–220). Mahwah, NJ: Erlbaum.
Mosier, K. L., Sethi, N., McCauley, S., Khoo, L., & Orasanu, J. M. (2007). What you don’t know can hurt
you: Factors impacting diagnosis in the automated cockpit. Human Factors, 49, 300–310.
Mosier, K. L., Skitka, L. J., Heers, S., & Burdick, M. (1998). Automation bias: Decision-making and
performance in high-tech cockpits. International Journal of Aviation Psychology, 8, 47–63.
Most, S. B., & Astur, R. S. (2007). Feature-based attentional set as a cause of traffic accidents. Visual
Cognition, 15, 125–132.
Mourant, R. R., & Rockwell, T. H. (1972). Strategies of visual search by novice and experienced drivers.
Human Factors, 14, 325–335.
Mowbray, G. H., & Gebhard, J. W. (1961). Man’s senses vs. informational channels. In W. Sinaiko (Ed.),
Selected papers on human factors in the design and use of control systems. New York: Dover.
Mowbray, G. H., & Rhoades, M. V. (1959). On the reduction of choice reaction time with practice. Quarterly
Journal of Experimental Psychology, 11, 16–23.
Muhlbach, L., Bocker, M., & Prussog, A. (1995). Telepresence in video communications: A study of
stereoscopy and individual eye contact. Human Factors, 37, 290–305.
Muir, B. (1987). Trust between humans and machines. In E. Hollnagel, G. Mancini, & D. Woods (Eds.),
Cognitive engineering in complex dynamic worlds (pp. 71–83). London: Academic Press.
Mulder, G., & Mulder, L. J. (1981). Information processing and cardiovascular control. Psychophysiology,
18, 392–401.
Mulder, L. J. M., van Roon, A., Veldman, H., Laumann, K., Burov, O., Qusipel, L., & Hogenoom, P.
(2003). How to use cardiovascular state changes in adaptive automation. In G. R. J. Hockey, O. Burov, &
A. W. K. Gaillard (Eds.), Operator functional state (pp. 260–269). Amsterdam: IOS Press.
Mulder, M. (2003). An information-centered analysis of the tunnel-in-the-sky display, Part One: Straight
tunnel trajectories. International Journal of Aviation Psychology, 13, 49–72.
Muller, H. J., & Rabbitt, P. M. (1989). Reflexive and voluntary orienting of visual attention: Time course of
activation and resistance to interruption. Journal of Experimental Psychology: Human Perception &
Performance, 15, 315–330.
Munichor, N., Erev, I., & Lotern, A. (2006). Risk attitude in small timesaving decisions. Journal of
Experimental Psychology: Applied, 12, 129–141.
Munoz, Y., Chebat, J. C., & Suissa, J. A. (2010). Using fear appeals in warning labels to promote responsible
gaming among VLT players: The key role of depth of processing. Journal of Gambling Studies, 26, 593–
609.
Munzer, S., Zimmer, H., & Baus, J. (2012). Navigational assistance: a trade-off between wayfinding support
and configural learning support. Journal of Experimental Psychology: Applied, 16, 18–37.
Murphy, A. H., & Winkler, R. L. (1984). Probability of precipitation forecasts. Journal of the Association of
the American Meteorological Society, 79, 391–400.
Murphy, T. D., & Eriksen, C. W. (1987). Temporal changes in the distribution of attention in the visual field
in response to precues. Perception & Psychophysics, 42, 576–586.
Mursalin, T. E., Eishita, F. Z., & Islam, A. R. (2008). Fabric defect inspection system using neural network
and microcontroller. Journal of Theoretical and Applied Information Technology, 4, 560–570.
Mussa-Ivaldi, F., Miller, L., Rymer, W. Z., & Weir, R. (2007). Neural engineering. In R. Parasuraman & M.
Rizzo (Eds.), Neuroergonomics: The brain at work (pp. 293–312). New York: Oxford.
Mussweiler, T., Strack, F., & Pfeiffer, T. (2000). Overcoming the inevitable anchoring effect: Considering
the opposite compensates for selective accessibility. Personality and Social Psychology Bulletin, 26, 1,142–
1,150.
Mynatt, C. R., Doherty, M. E., & Tweney, R. D. (1977). Confirmation bias in a simulated research
environment: An experimental study of scientific inference. Quarterly Journal of Experimental Psychology,
29, 85–95.
Nagy, A. L., & Sanchez, R. R. (1992). Chromaticity and luminance as coding dimensions in visual search.
Human Factors, 34, 601–614.
Nakano, A., Bachlechner, M. E., Kalia, R. K., et al. (2001). Multiscale simulation of nanosystems.
Computing in Science & Engineering, 3, 56–66.
Nass, C., Moon, Y., Fogg, B. J., Reeves, B., & Dryer, D. C. (1995). Can computer personalities be human
personalities? International Journal of Human-Computer Studies, 43, 223–239.
National Highway Traffic Safety Administration (2005). Traffic safety facts 2005. Department of
Transportation technical report DOT HS 810 631. Washington, DC: U.S. Department of Transportation.
National Transportation Safety Board (1973). Eastern Airlines L-1011, Miami, Florida, 20 December 1972.
(Report NTSB-AAR-94/07). Washington, DC: Author.
National Transportation Safety Board. (1997). Grounding of the Panamanian passenger ship Royal Majesty
on Rose and Crown shoal near Nantucket, Massachusetts, June 10, 1995. (Report NTSB/MAR-97-01).
Washington DC: Author.
Navarro, J., Marchhena, E., Alcalde, C., Ruiz, G., Llorens, I., & Aguillar, M. (2003). Improving attention
behavior in primary and secondary school children with a computer assisted instruction procedure.
International Journal of Psychology, 38, 359–365.
Navon, D. (1977). Forest before trees: The presence of global features in visual perception. Cognitive
Psychology, 9, 353–383.
Navon, D. (1984). Resources: A theoretical soup stone. Psychological Review, 91, 216–334.
Navon, D., & Gopher, D. (1979). On the economy of the human processing system. Psychological Review,
86, 254–255.
Navon, D., & Miller, J. (1987). The role of outcome conflict in dual-task interference. Journal of
Experimental Psychology: Human Perception and Performance, 13, 435–448.
Naylor, J., & Briggs, G. (1963). Effects of task complexity and task organization on the relative efficiency of
part and whole training methods. Journal of Experimental Psychology, 65, 217–224.
Neider, M. B., McCarley, J. S., Crowell, J. A., Kaczmarski, H., & Kramer, A. F. (2010). Pedestrians,
vehicles, and cell phones. Accident Analysis and Prevention, 42, 589–594.
Neisser, U. (1963). Decision time without reaction time: Experiments in visual scanning. American Journal of
Psychology, 76, 376–385.
Neisser, U. (1967). Cognitive psychology. New York: Appleton-Century-Crofts.
Neisser, U., Novick, R., & Lazar, R. (1964). Searching for novel targets. Perceptual and Motor Skills, 19,
427–432.
Nelson, T. O. (1996). Consciousness and metacognition. American Psychologist, 51, 102–116.
Nelson, W. T., Bolia, R. S., & Tripp, L. D. (2001). Auditory localization under sustained +Gz acceleration.
Human Factors, 43, 299–309.
Neuhoff, J. G., & McBeath, M. K. (1996). The Doppler illusion: The influence of dynamic intensity change
on perceived pitch. Journal of Experimental Psychology: Human Perception and Performance, 22, 970–
985.
Nevile, M. (2002). Gesture in the airline cockpit: Allocating control of the power levers during takeoff. In
Proceedings of the First International Conference on Gesture, University of Texas at Austin, USA.
Newsome, S. L., & Hocherlin, M. E. (1989). When “not” is not bad: A reevaluation of the use of negatives.
In Proceedings of the 33rd Annual Meeting of the Human Factors Society (pp. 229–234). Santa Monica,
CA: Human Factors Society.
Neyedli, H. F., Hollands, J. G., & Jamieson, G. A. (2011). Beyond identity: Incorporating system reliability
information into an automated combat identification system. Human Factors, 53, 338–355.
Nguyen, D. T., & Canny, J. (2009). More than face-to-face: Empathy effects of video framing. Proceedings
of CHI 2009—Telepresence and online media. New York: Association for Computing Machinery.
Nickerson, R. S. (1998). Confirmation bias: A ubiquitous phenomenon in many guises. Review of General
Psychology, 2, 175–220.
Nickerson, R. S. (1977). Some comments on human archival memory as a very large data base. In
Proceedings of the Third International Conference on Very Large Data Bases, VLDB 77, Vol. 3. (pp. 159–
168). Tokyo.
Nicolelis, M. A. (2003). Brain-machine interfaces to restore motor function and probe neural circuits. Nature
Reviews Neuroscience, 4, 417–422.
Nikolic, M. I., & Sarter, N. B. (2001). Peripheral visual feedback. Human Factors, 43, 30–38.
Nikolic, M. I., Orr, J. M., & Sarter, N. B. (2004). Why pilots miss the green box: How display context
undermines attention capture. International Journal of Aviation Psychology, 14, 39–52.
Nilsson, L. G., Ohlsson, K., & Ronnberg, J. (1977). Capacity differences in processing and storage of
auditory and visual input. In S. Dornick (Ed.), Attention and Performance VI. Hillsdale, NJ: Erlbaum.
Nisbett, R. E., Zukier, H., & Lemley, R. (1981). The dilution effect: Nondiagnostic information. Cognitive
Psychology, 13, 248–277.
Nishanian, P., Taylor, J. M. G., Korns, E., Detels, R., Saah, A., & Fahey, J. L. (1987). Significance of
quantitative enzyme-linked immunosorbent assay (ELISA) results in evaluation of three ELISAs and
Western blot tests for detection of antibodies to human immunodeficiency virus in a high-risk population.
Journal of the American Medical Association, 259, 2,574–2,579.
Nof, S. Y. (2009). (Ed.), Springer handbook of automation. New York: Springer.
Nolte, L. W., & Jaarsma, D. (1967). More on the detection of one of M orthogonal signals. Journal of the
Acoustical Society of America, 41, 497–505.
Norman, D. (1968). Toward a theory of memory and attention. Psychological Review, 75, 522–536.
Norman, D. A. (1981). Categorization of action slips. Psychological Review, 88, 1–15.
Norman, D. A. (1981). The trouble with UNIX. Datamation, 27(12), 139–150.
Norman, D. A. (1988). The psychology of everyday things. New York: Basic.
Norman, D. A. (1990). The ‘problem’ with automation: Inappropriate feedback and interaction, not ‘over-
automation’. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences,
327, 585–593.
Norman, D. A. (1992). The design of everyday things. New York: Harper & Row.
Norman, D. A., & Bobrow, D. G. (1975). On data-limited and resource-limited processing. Cognitive
Psychology, 7, 44–60.
Norman, D. A., & Fisher, D. (1982). Why alphabetic keyboards are not easy to use: Keyboard layout doesn’t
much matter. Human Factors, 24, 509–520.
North, C. (2006). Information Visualization. In G. Salvendy (Ed.), Handbook of human factors and
ergonomics (3rd Ed.) New York: Wiley.
North, R. A., & Riley, V. A. (1989). A predictive model of operator workload. In G. R. McMillan, D. Beevis,
E. Salas, M. H. Strub, R., Sutton, & L. Van Breda (Eds.), Applications of human performance models to
system design (pp. 81–90). New York: Plenum.
Noyes, J. M., & Starr, A. F. (2007). A comparison of speech input and touch screen for executing checklists
in an avionics application. International Journal of Aviation Psychology, 17, 299–315.
Noyes, J. M., Hellier, E., & Edworthy, J. (2006). Speech warnings: A review. Theoretical Issues in
Ergonomics Science, 7, 551–571.
Nugent, W. A. (1987). A comparative assessment of computer-based media for presenting job task
instructions. In Proceedings of the 31st Annual Meeting of the Human Factors Society (pp. 696–700). Santa
Monica, CA: Human Factors Society.
Nunes, A., Wickens, C. D., & Yin, S. (2006). Examining the viability of the Neisser search model in the flight
domain and the benefits of highlighting in visual search. In Proceedings of the Human Factors and
Ergonomics Society 50th Annual Meeting (pp. 35–39). Santa Monica, CA: Human Factors and Ergonomics
Society.
O’Brien, K. S., & O’Hare, D. (2007). Situation awareness ability and cognitive skills training in a complex
real-world task. Ergonomics, 50, 1064–1091.
O’Donnell, R. D., & Eggemeier, F. T. (1986). Workload assessment methodology. In K. Boff, L. Kaufman,
& J. Thomas (Eds.), Handbook of perception and performance (vol. II). New York: Wiley.
O’Hanlon, J. F., & Beatty, J. (1997). Concurrence of electroencephalographic and performance changes
during a simulated radar watch and some implications for the arousal theory of vigilance. In R. R. Mackie
(Ed.), Vigilance: Theory, operational performance, and physiological correlates (pp. 189–202). New York:
Plenum.
O’Hara, K. P., & Payne, S. J. (1998). The effects of operator implementation cost on planfulness of problem
solving and learning. Cognitive Psychology, 35, 34–70.
O’Regan, J. K., Deubel, H., Clark, J. J., & Rensink, R. A. (2000). Picture changes during blinks: Looking
without seeing and seeing without looking. Visual Cognition, 7, 191–211.
O’Connor, P., Campbell, J., Newon, J., Melton, J., Salas, E., & Wilson, K. A. (2008). Crew Resource
Management training effectiveness: A meta-analysis and some critical needs. The International Journal of
Aviation Psychology, 18, 353–368.
O’Hare, D., & Roscoe, S. N. (1990). Flightdeck performance: The human factor. Ames, IA: Iowa State
University Press.
Okado, Y. and Stark, C. E. L. (2005). Neural activity during encoding predicts false memories created by
misinformation. Learning & Memory, 12, 3–11.
Oliva, A., & Torralba, A. (2007). The role of context in object recognition. Trends in Cognitive Sciences,
11(12), 520–527.
Olmos, O., Liang, C. C., & Wickens, C. D. (1997). Electronic map evaluation in simulated visual
meteorological conditions. International Journal of Aviation Psychology, 7, 37–66.
Olmos, O., Wickens, C. D., & Chudy, A. (2000). Tactical displays for combat awareness: An examination of
dimensionality and frame of reference concepts and the application of cognitive engineering. International
Journal of Aviation Psychology, 10, 247–271.
Olson, J. S., Olson, G. M., & Meader, D. K. (1995). What mix of video and audio is useful for remote
realtime work. In Proceedings of the Conference on Human Factors in Computing Systems (pp. 33–45).
Denver, CO: Academic Press.
Olson, W. A., & Sarter, N. B. (2000). Automation management strategies: Pilot preferences and operational
experiences. International Journal of Aviation Psychology, 10, 327–341.
Opperman, R. (1994). Adaptive user support. Hillsdale, NJ: Erlbaum.
Orasanu, J. (1997). Stress and naturalistic decision making: Strengthening the weak links. In R. Flin, E.
Salas, M. Strub, & L. Martin (Eds.), Decision making under stress: Emerging themes and applications (pp.
43–66). Brookfield: Ashgate.
Orasanu, J., & Fischer, U. (1997). Finding decisions in natural environments: The view from the cockpit. In
C. E. Zsambok & G. Klein (Eds.), Naturalistic decision making (pp. 343–358). Mahwah, NJ: Erlbaum.
Orasanu, J., & Strauch, B. (1994). Temporal factors in aviation decision making. In Proceedings of the 38th
Annual Meeting of the Human Factors and Ergonomics Society (pp. 935–939). Santa Monica, CA: Human
Factors and Ergonomics Society.
Orlansky, J., Taylor, H. L., Levine, D. B., & Honig, J. G. (1997). The cost and effectiveness of the multi-
service distributed training testbed (MDT2) for training close air support. IDA Paper P–3284. Alexandria,
VA: Institute for Defense Analyses.
Oron-Gilad, T., Szalma, J., & Hancock, P. A. (2005). Incorporating individual differences into the adaptive
automation paradigm. In P. Carayon, M. Robertson, B. Kleiner, and P. L. T. Hoonakker (Eds.), Human
factors in organizational design and management VIII (pp. 581–586). Santa Monica, CA: IEA Press.
Oskamp, S. (1965). Overconfidence in case-study judgments. Journal of Consulting Psychology, 29, 261–
265.
Overbye, T. J., Wiegmann, D. A., Rich, A. M., & Sun, Y. (2002). Human factors aspects of power system
voltage contour visualizations. IEEE Transactions on Power Systems, 18, 76–82.
Owen, A. M., McMillan, K. M., Laird, A. R., & Bullmore, E. (2005). N–back working memory paradigm: A
meta-analysis of normative functional neuroimaging studies. Human Brain Mapping, 25, 46–59.
Owen, D. H., & Warren, R. (1987). Perception and control of self-motion: Implications for visual simulation
of vehicular locomotion. In L. S. Mark, J. S. Warm, & R. L. Huston (Eds.), Ergonomics and human factors:
Recent research (pp. 40–70). New York: Springer-Verlag.
Owsley, C., Ball, K., McGwin, G., Sloane, M. E., Roenker, D. L., White, M. F., & Overley, E. T. (1998).
Visual processing impairment and risk of motor vehicle crash among older adults. Journal of the American
Medical Association, 279, 1,083–1,088.
Paas, F., Renkl, A., & Sweller, J. (2003). Cognitive load theory and instructional design. Educational
Psychologist, 38, 1–4.
Paas, F., & van Gog, T. (2009). Principles for designing effective and efficient training of complex cognitive
skills. In F. Durso (Ed.), Reviews of Human Factors and Ergonomics, Vol. 5. Santa Monica, CA: Human
Factors and Ergonomics Society.
Pachella, R. G. (1974). The interpretation of reaction time in information processing research. In B. H.
Kantowitz (Ed.), Human information processing (pp. 41–82). Potomac, MD: Erlbaum.
Paese, P. W., & Sniezek, J. A. (1991). Influences on the appropriateness of confidence in judgment: Practice,
effort, information, and decision making. Organizational Behavior and Human Decision Processes, 48,
100–130.
Palmer, S. E. (1999). Vision science: Photons to phenomenology. Cambridge, MA: MIT Press.
Palmisano, S., Favelle, S., & Sachtler, W. L. (2008). Effects of scenery, lighting, glideslope, and experience
on timing the landing flare. Journal of Experimental Psychology: Applied, 14, 236–246.
Parasuraman, R. (1979). Memory load and event rate control sensitivity decrements in sustained attention.
Science, 205, 925–927.
Parasuraman, R. (1985). Detection and identification of abnormalities in chest x-rays: Effects of reader skill,
disease prevalence, and reporting standards. In R. E. Eberts & C. G. Eberts (eds.), Trends in
ergonomics/human factors II (pp. 59–66). Amsterdam: North-Holland.
Parasuraman, R. (1986). Vigilance, monitoring, and search. In K. Boff, L. Kaufman, & J. Thomas (eds.),
Handbook of perception and human performance. Vol. 2: Cognitive processes and performance (pp. 43.1–
43.39). New York: Wiley.
Parasuraman, R. (1987). Human-computer monitoring. Human Factors, 29, 695–706.
Parasuraman, R. (2000). Designing automation for human use: Empirical studies and quantitative models.
Ergonomics, 43, 931–951.
Parasuraman, R. (2009). Assaying individual differences in cognition with molecular genetics: theory and
application. Theoretical Issues in Ergonomics Science, 10, 399–416.
Parasuraman, R., de Visser, E., Lin, M.-K., & Greenwood, P. M. (2012). DBH genotype identifies individuals less
susceptible to bias in computer-assisted decision making. PLoS One, 7(6). e39675.
doi:10.1371/journal.pone.0039675.
Parasuraman, R. (2011). Neuroergonomics: Brain, cognition, and performance at work. Current Directions
in Psychological Science, 20, 181–186.
Parasuraman, R., Bahri, T., Deaton, J. E., Morrison, J. G., & Barnes, M. (1992). Theory and design of
adaptive automation in aviation systems (Technical Report, Code 6021). Warminster, PA: Naval Air
Development Center.
Parasuraman, R., Barnes, M., & Cosenzo, K. (2007). Adaptive automation for human-robot teaming in
future command and control systems. International Journal of Command and Control, 1(2), 43–68.
Parasuraman, R., & Byrne, E. A. (2003). Automation and human performance in aviation. In P. Tsang and
M. Vidulich (Eds.), Principles of aviation psychology (pp. 311–356). Mahwah, NJ: Erlbaum.
Parasuraman, R., & Caggiano, D. (2005). Neural and genetic assays of mental workload. In D. McBride &
D. Schmorrow (Eds.), Quantifying human information processing (pp. 123–155). Lanham, MD: Rowman
and Littlefield.
Parasuraman, R., Cosenzo, K., & de Visser, E. (2009). Adaptive automation for human supervision of
multiple uninhabited vehicles: Effects on change detection, situation awareness, and mental workload.
Military Psychology, 21, 270–297.
Parasuraman, R., De Visser, E., Clarke, E., McGarry, W. R., Hussey, E., Shaw, T., & Thompson, J. (2009).
Detecting threat-related intentional actions of others: Effects of image quality, response mode, and target
cueing on vigilance. Journal of Experimental Psychology: Applied, 15, 275–290.
Parasuraman, R., Galster, S., Squire, P., Furukawa, H., & Miller, C. (2005). A flexible delegation interface
enhances system performance in human supervision of multiple autonomous robots: Empirical studies with
RoboFlag. IEEE Transactions on Systems, Man, and Cybernetics. Part A: Systems and Humans, 35, 481–
493.
Parasuraman, R., & Greenwood, P. M. (2004). Molecular genetics of visuospatial attention and working
memory. In M. I. Posner (Ed.), Cognitive neuroscience of attention (pp. 245–259). New York: Guilford.
Parasuraman, R., Greenwood, P. M., Kumar, R., & Fossella, J. (2005). Beyond heritability:
Neurotransmitter genes differentially modulate visuospatial attention and working memory. Psychological
Science, 16, 200–207.
Parasuraman, R., & Hancock, P. A. (2001). Adaptive control of workload. In P. A. Hancock & P. E.
Desmond (Eds.), Stress, workload, and fatigue (pp. 305–320). Mahwah, NJ: Erlbaum.
Parasuraman, R., Hancock, P. A., & Olofinboba, O. (1997). Alarm effectiveness in driver-centered
collision-warning systems. Ergonomics, 40, 390–399.
Parasuraman, R., & Jiang, Y. (2012). Individual differences in cognition, affect, and performance:
Behavioral, neuroimaging, and molecular genetic approaches. NeuroImage, 59, 70–82.
Parasuraman, R., & Manzey, D. (2010). Complacency and bias in human use of automation: An attentional
integration. Human Factors, 52, 381–410.
Parasuraman, R., Masalonis, A. J., & Hancock, P. A. (2000). Fuzzy signal detection theory: Basic
postulates and formulas for analyzing human and machine performance. Human Factors, 42, 636–659.
Parasuraman, R., & Miller, C. (2004). Trust and etiquette in high-criticality automated systems.
Communications of the Association for Computing Machinery, 47(4), 51–55.
Parasuraman, R., Molloy, R., & Singh, I. L. (1993). Performance consequences of automation-induced
“complacency”. International Journal of Aviation Psychology, 3, 1–23.
Parasuraman, R., Mouloua, M., & Hilburn, B. (1999). Adaptive aiding and adaptive task allocation enhance
human-machine interaction. In M. W. Scerbo & M. Mouloua (Eds.), Automation technology and human
performance: Current research and trends (pp. 119–123). Mahwah, NJ: Erlbaum.
Parasuraman, R., Mouloua, M., & Molloy, R. (1996). Effects of adaptive task allocation on monitoring of
automated systems. Human Factors, 38, 665–679.
Parasuraman, R., & Riley, V. (1997). Humans and automation: Use, misuse, disuse, abuse. Human Factors, 39,
230–253.
Parasuraman, R., & Rizzo, M. (2007). Neuroergonomics: The Brain at Work. New York: Oxford.
Parasuraman, R., Sheridan, T. B., & Wickens, C. D. (2000). A model for types and levels of human
interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics. Part A: Systems and
Humans, 30, 286–297.
Parasuraman, R., Sheridan, T. B., & Wickens, C. D. (2008). Situation awareness, mental workload, and trust
in automation: Viable, empirically supported cognitive engineering constructs. Journal of Cognitive
Engineering and Decision Making, 2, 141–161.
Parasuraman, R., & Wickens, C. D. (2008). Humans: Still vital after all these years of automation. Human
Factors, 50, 511–520.
Parasuraman, R., & Wilson, G. F. (2008). Putting the brain to work: Neuroergonomics past, present, and
future. Human Factors, 50, 468–474.
Park, O., & Gittelman, S. S. (1995). Dynamic characteristics of mental models and dynamic visual displays.
Instructional Science, 23, 303–320.
Parkes, A. M., & Coleman, N. (1990). Route guidance systems: A comparison of methods of presenting
directional information to the driver. In E. J. Lovesey (Ed.), Contemporary ergonomics 1990 (pp. 480–
485). London: Taylor & Francis.
Parks, D. L., & Boucek, G. P., Jr. (1989). Workload prediction, diagnosis, and continuing challenges. In G.
R. McMillan, D. Beevis, E. Salas, M. H. Strub, R. Sutton, & L. Van Breda (Eds.), Applications of human
performance models to system design (pp. 47–64). New York: Plenum.
Parra, L. C., Spence, C. D., Gerson, A. D., & Sajda, P. (2003b). Response error correction–a demonstration
of improved human-machine performance using real-time EEG monitoring. IEEE Transactions on Neural
Systems and Rehabilitation Engineering, 11(2), 173–177.
Parra, L., Alvino, C., Tang, A., Pearlmutter, B., Yeung, N., Osman, A., & Sajda, P. (2003a). Single-trial
detection in EEG and MEG: Keeping it linear. Neurocomputing, 52–54, 177–183.
Pashler, H. E. (1998). The psychology of attention. Cambridge, MA: MIT Press.
Pashler, H., McDaniel, M., Rohrer, D., & Bjork, R. (2008). Learning styles: Concepts and evidence.
Psychological Science in the Public Interest, 9(3), 105–119.
Patel, V. L., & Groen, G. J. (1991). The general and specific nature of medical expertise: A critical look. In K.
A. Ericsson & J. Smith (Eds.), Toward a general theory of expertise (pp. 93–125). Cambridge, England:
Cambridge University Press.
Paterson, K. B., & Jordan, T. R. (2010). Effects of increased letter spacing on word identification and eye
guidance during reading. Memory & Cognition, 38, 502–512.
Patrick, J., & James, N. (2004). A task-oriented perspective of situation awareness. In S. Banbury & S.
Tremblay (Eds.), A cognitive approach to situation awareness: Theory and application (pp. 61–81).
Aldershot, UK: Ashgate.
Patterson, E., Nguyen, A. D., Halloran, J. M., & Asch, S. M. (2004). Human factors barriers to the effective
use of ten HIV clinical reminders. Journal of the American Medical Informatics Association, 11, 50–59.
Patterson, R. (2007). Human factors of 3D displays. Journal of the Society for Information Display, 15 (11),
861–871.
Pavlovic, N. J., Keillor, J., Chignell, M. H., & Hollands, J. G. (2006). Congruency between visual and
auditory displays on spatial tasks using different reference frames. In Proceedings of the Human Factors
and Ergonomics Society—50th Annual Meeting (pp. 1523–1527). Santa Monica, CA: Human Factors and
Ergonomics Society.
Pavlovic, N. J., Keillor, J., Hollands, J. G., & Chignell, M. H. (2009). Reference frame congruency in
search-and-rescue tasks. Human Factors, 51, 240–250.
Payne, J. W. (1980). Information processing theory: Some concepts and methods applied to decision research.
In T. S. Wallsten (Ed.), Cognitive processes in choice and decision behavior. Hillsdale, NJ: Erlbaum.
Payne, J. W., Bettman, J. R., & Johnson, E. J. (1993). The adaptive decision maker. Cambridge, England:
Cambridge University Press.
Payne, S. J. (1991). Display-based action at the user interface. International Journal of Man-Machine
Studies, 35, 275–289.
Payne, S. J. (1995). Naive judgments of stimulus-response compatibility. Human Factors, 37, 495–506.
Pea, R. D. (2004). The social and technological dimensions of scaffolding and related theoretical concepts for
learning, education, and human activity. The Journal of the Learning Sciences, 13, 423–451.
Peacock, B. (2009). The laws and rules of ergonomics in design. Santa Monica, CA: Human Factors Society.
Peavler, W. S. (1974). Individual differences in pupil size and performance. In M. Janisse (Ed.), Pupillary
dynamics and behavior. New York: Plenum.
Peebles, D. (2008). The effect of emergent features on judgments of quantity in configural and separable
displays. Journal of Experimental Psychology: Applied, 14, 85–100.
Peebles, D., & Cheng, P. C. H. (2003). Modeling the effect of task and graphical representation on response
latency in a graph reading task. Human Factors, 45, 28–45.
Penningroth, S. L., Scott, W. D., & Freuen, M. (2011). Social motivation in prospective memory: Higher
importance ratings and reported performance rates for social tasks. Canadian Journal of Experimental
Psychology, 65, 3–11.
Perham, N., Banbury, S., & Jones, D. M. (2007). Do realistic reverberation levels reduce auditory
distraction? Applied Cognitive Psychology, 21, 839–847.
Perrin, B. M., Barnett, B. J., Walrath, L., & Grossman, J. D. (2001). Information order and outcome
framing: An assessment of judgment in a naturalistic decision making context. Human Factors, 43, 227–
238.
Perrone, J. A. (1982). Visual slant underestimation: A general model. Perception, 11, 641–654.
Perrott, D. R., Saberi, K., Brown, K., & Strybel, T. Z. (1990). Auditory psychomotor coordination and visual
search performance. Perception & Psychophysics, 48, 214–226.
Peterson, C. R., & Beach, L. R. (1967). Man as an intuitive statistician. Psychological Bulletin, 68, 29–46.
Perrow, C. (1984). Normal accidents: Living with high risk technology. New York: Basic Books.
Peterson, L. R., & Peterson, M. J. (1959). Short-term retention of individual verbal items. Journal of
Experimental Psychology, 58, 193–198.
Petrov, A. A., & Anderson, J. R. (2005). The dynamics of scaling: A memory-based anchor model of
category rating and absolute identification. Psychological Review, 112, 383–416.
Pew, R. W. (1969). The speed-accuracy operating characteristic. Acta Psychologica, 30, 16–26.
Pew, R. W. (2000). The state of situation awareness measurement: Heading toward the next century. In M. R.
Endsley & D. J. Garland (Eds.), Situation awareness analysis and measurement (pp. 33–47). Mahwah, NJ:
Erlbaum.
Pew, R., & Mavor, A. (1998). Modeling Human & Organizational Behavior. Washington, DC: National
Academy Press.
Pfurtscheller, G., & Neuper, C. (2001). Motor imagery and direct brain-computer communication.
Proceedings of the IEEE, 89, 1123–1134.
Pichora-Fuller, M. K. (2008). Use of supportive context by younger and older adult listeners: Balancing
bottom-up and top-down information processing. International Journal of Audiology, 47(s2), 144–154.
Pigeau, R. A., Angus, R. G., O’Neill, P., & Mack, I. (1995). Vigilance latencies to aircraft detection among
NORAD surveillance operators. Human Factors, 37, 622–634.
Pilotti, M., Chodorow, M., & Schauss, F. (2009). Text familiarity, word frequency, and sentential constraints
in error detection. Perceptual and Motor Skills, 109, 627–645.
Pinker, S. (1990). A theory of graph comprehension. In R. Freedle (Ed.), Artificial intelligence and the future
of testing (pp. 73–126). Hillsdale, NJ: Erlbaum.
Plath, D. W. (1970). The readability of segmented and conventional numerals. Human Factors, 12, 493–497.
Playfair, W. (1786). Commercial and political atlas. London: Corry.
Poldrack, R. A., & Packard, M. G. (2003). Competition among multiple memory systems: Converging
evidence from animal and human brain studies. Neuropsychologia, 41, 245–251.
Poldrack, R. A., & Wagner, A. D. (2004). What can neuro-imaging tell us about the mind? Insights from
prefrontal cortex. Current Directions in Psychological Science, 13, 177–181.
Polich, J. (2003). Updating P300: An integrative theory of P3a and P3b. Clinical Neurophysiology, 118,
2,128–2,148.
Pollack, I. (1952). The information of elementary auditory displays. Journal of the Acoustical Society of
America, 24, 745–749.
Pollack, E., Chandler, P., & Sweller, J. (2002). Assimilating complex information. Learning and Instruction,
12, 61–86.
Pollack, I., & Ficks, L. (1954). The information of elementary multidimensional auditory displays. Journal of
the Acoustical Society of America, 26, 155–158.
Pollack, I., & Norman, D. A. (1964). A nonparametric analysis of recognition experiments. Psychonomic
Science, 1, 125–126.
Pollatsek, A., Narayanaan, V., Pradhan, A., & Fisher, D. L. (2006). Using eye movements to evaluate a PC–
based risk awareness training program on a driving simulator. Human Factors, 48, 447–464.
Polson, M. C., & Friedman, A. (1988). Task-sharing within and between hemispheres: A multiple-resources
approach. Human Factors, 30, 633–643.
Pomerantz, J. R., & Pristach, E. A. (1989). Emergent features, attention, and perceptual glue in visual form
perception. Journal of Experimental Psychology: Human Perception and Performance, 15, 635–649.
Pond, D. J. (1979). Colors for sizes: An applied approach. In Proceedings of the Human Factors Society—
23rd Annual Meeting (pp. 427–430). Santa Monica, CA: Human Factors Society.
Pool, M. M., Koolstra, C. M., & Van Der Voort, T. H. A. (2003). Distraction effects of background soap operas
on homework performance: An experimental study enriched with observational data. Educational
Psychology, 23(4), 361–380.
Porter, G., Troscianko, T., & Gilchrist, I. D. (2007). Effort during visual search and counting: Insights from
pupillometry. Quarterly Journal of Experimental Psychology, 60, 211–229.
Posner, M. I., Snyder, C. R. R., & Davidson, B. J. (1980). Attention and the detection of signals. Journal of
Experimental Psychology: General, 109(2), 160–174.
Posner, M. I. (1964). Information reduction in the analysis of sequential tasks. Psychological Review, 71,
491–504.
Posner, M. I. (1978). Chronometric explorations of mind. Hillsdale, NJ: Erlbaum.
Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32, 3–25.
Posner, M. I. (1986). Chronometric explorations of mind (2nd Ed.). New York: Oxford University Press.
Posner, M. I., Nissen, M. J., & Ogden, W. C. (1978). Attended and unattended processing modes: The role of
set for spatial location. In H. L. Pick & I. J. Saltzman (Eds.), Modes of perceiving and processing
information. Hillsdale, NJ: Erlbaum.
Posner, M. I., Rothbart, M. K., & Sheese, B. E. (2007). Attention genes. Developmental Science, 10, 24–29.
Posner, M. I., & Tudela, P. (1997). Imaging resources. Biological Psychology, 45, 95–107.
Poulton, E. C. (1976). Continuous noise interferes with work by masking auditory feedback and inner speech.
Applied Ergonomics, 7, 79–84.
Poulton, E. C. (1985). Geometric illusions in reading graphs. Perception & Psychophysics, 37, 543–548.
Povenmire, H. K., & Roscoe, S. N. (1973). Incremental transfer effectiveness of a ground–based general
aviation trainer. Human Factors, 15, 534–542.
Pradhan, A., Hammel, K., De Remus, R., Pollatsek, A., Noyce, D., & Fisher, D. (2005). The use of eye
movements to evaluate the effects of driver age on risk perception in an advanced driving simulator.
Human Factors, 47, 840–852.
Pradhan, A., Pollatsek, A., Knodler, M., & Fisher, D. (2009). Can younger drivers be trained to scan for
information that will reduce their risk in roadway traffic scenarios? Ergonomics, 53, 657–673.
Pradhan, A., Divekar, K., Masserang, K., et al. (2011). The effects of focused attention training on the
duration of novice drivers’ glances inside the vehicle. Ergonomics, 54, 917–931.
Previc, F. H. (1998). The neuropsychology of 3–D space. Psychological Bulletin, 124, 123–164.
Previc, F. H. (2000). Neuropsychological guidelines for aircraft control stations. IEEE Engineering in
Medicine and Biology, March/April, 81–88.
Previc, F., & Ercoline, W. (2004). Spatial disorientation in aviation (Vol. 203). Reston, VA: American
Institute of Aeronautics and Astronautics.
Prichard, J. S., Bizo, L. A., & Stratford, R. J. (2011). Evaluating the effects of team-skills training on
subjective workload. Learning and Instruction, 21, 429–440.
Prinzel, L., & Wickens, C. D. (Eds.) (2009). Preface to special issue on synthetic vision systems.
International Journal of Aviation Psychology, 19, 99–104.
Pritchett, A. (2009). Aviation automation: General perspectives and specific guidance for the design of modes
and alerts. Reviews of Human Factors and Ergonomics, 5, 82–113.
Proctor, R. W., & Dutta, A. (1995). Skill acquisition and human performance. Thousand Oaks, CA: Sage.
Proctor, R. W., & Van Zandt, T. (1994). Human factors in simple and complex systems. Boston: Allyn-
Bacon.
Proctor, R. W., & Van Zandt, T. (2008). Human factors in simple and complex systems (2nd Ed.). Boca
Raton, FL: CRC Press.
Proctor, R. W., & Vu, K. (2006). Selection and control of action. In G. Salvendy (Ed.) Handbook of human
factors and ergonomics (3rd Ed.). New York: Wiley.
Proctor, R. W., & Vu, K. L. (2010). Cumulative knowledge and progress in human factors. Annual Review of
Psychology, 61, 623–651.
Puffer, S. (1989). Task completion schedules: determinants and consequences for performance. Human
Relations, 42, 937–955.
Puto, C. P., Patton, W. E., III, & King, R. H. (1985). Risk handling strategies in industrial vendor selection
decisions. Journal of Marketing, 49, 89–98.
Rabbitt, P. M. A. (1978). Detection of errors by skilled typists. Ergonomics, 21, 945–958.
Rabbitt, P. M. A. (1989). Sequential reactions. In D. H. Holding (Ed.), Human skills (2nd Ed.). New York:
Wiley.
Raby, M., & Wickens, C. D. (1994). Strategic workload management and decision biases in aviation.
International Journal of Aviation Psychology, 4, 211–240.
Randel, J. M., Pugh, H. L., & Reed, S. K. (1996). Differences in expert and novice situation awareness in
naturalistic decision making. International Journal of Human-Computer Studies, 45, 579–597.
Raskin, J. (2000). The humane interface. Boston: Addison–Wesley.
Rasmussen, J. (1981). Models of mental strategies in process control. In J. Rasmussen & W. Rouse (Eds.),
Human detection and diagnosis of system failures. New York: Plenum.
Rasmussen, J. (1986). Information processing and human-machine interaction: An approach to cognitive
engineering. New York: North Holland.
Rasmussen, J., & Rouse, W. B. (1981). Human detection and diagnosis of system failures. New York:
Plenum.
Rattan, A., & Eberhardt, J. L. (2010). The role of social meaning in inattentional blindness: When gorillas in
our midst do not go unseen. Journal of Experimental Social Psychology, 46, 1,085–1,088.
Ratwani, R. M., Trafton, J. G., & Boehm-Davis, D. A. (2008). Thinking graphically: Connecting vision and
cognition during graph comprehension. Journal of Experimental Psychology: Applied, 14, 36–49.
Ratwani, R., & Trafton, J. G. (2010). An eye movement analysis of the effect of interruption modality on
primary task resumption. Human Factors, 52, 370–380.
Rau, P. L. P., & Salvendy, G. (2001). Ergonomics guidelines for designing electronic mail addresses.
Ergonomics, 44, 402–424.
Rayner, K. (2009). Eye movements and attention in reading, scene perception, and visual search. Quarterly
Journal of Experimental Psychology, 62, 1,457–1,506.
Rayner, K., & Juhasz, B. (2004). Eye movements in reading: Old questions and new directions. European
Journal of Cognitive Psychology, 16, 340–352.
Razael, M., & Klette, R. (2011). Simultaneous analysis of driver behavior and road condition for driver
distraction detection. International Journal of Image and Data Fusion, 2(3), 217–236.
Reason, J. T. (1984). Lapses of attention. In R. Parasuraman & R. Davies (Eds.), Varieties of attention. New
York: Academic Press.
Reason, J. (1990). Human error. Cambridge, England: Cambridge University Press.
Reason, J. (2008). The human contribution: Unsafe acts, accidents and heroic recoveries. Burlington, VT:
Ashgate.
Recarte, M. A., & Nunes, L. M. (2000). Effects of verbal and spatial-imagery tasks on eye fixations while
driving. Journal of Experimental Psychology: Applied, 6, 31–43.
Recarte, M. A., & Nunes, L. M. (2003). Mental workload while driving: Effects on visual search,
discrimination, and decision making. Journal of Experimental Psychology: Applied, 9, 119–137.
Redelmeier, D. A., & Tibshirani, R. J. (1997). Association between cellular-telephone calls and motor
vehicle collisions. New England Journal of Medicine, 336, 453–458.
Reder, L. (1996). Implicit memory and metacognition. Mahwah, NJ: Erlbaum.
Reeves, B., & Nass, C. (1996). The media equation: How people treat computers, television, and new media
like real people and places. New York: Cambridge University Press.
Regan, M., Lee, J., & Young, K. (2009a). Driver distraction. Boca Raton, FL: CRC Press.
Regan, M., Lee, J., & Young, K. (2009b). Driver distraction injury prevention countermeasures part 2:
Education and Training. In M. Regan, J. Lee, & K. Young (Eds.), Driver distraction. Boca Raton, FL: CRC
Press.
Regan, M., Young, K., Lee, J., & Gordon, C. (2009a). Distraction, crashes and crash risk. In M. Regan, J.
Lee, & K. Young (Eds.), Driver distraction. Boca Raton, FL: CRC Press.
Regan, M., Young, K., Lee, J., & Gordon, C. (2009b). Sources of driver distraction. In M. Regan, J. Lee, &
K. Young (Eds.), Driver distraction. Boca Raton, FL: CRC Press.
Reicher, G. M. (1969). Perceptual recognition as a function of meaningfulness of stimulus material. Journal
of Experimental Psychology, 81, 275–280.
Reichle, E. D., Liversedge, S. P., Pollatsek, A., & Rayner, K. (2009). Encoding multiple words
simultaneously in reading is implausible. Trends in Cognitive Sciences, 13(3), 115–119.
Reid, G. B., & Nygren, T. E. (1988). The subjective work-load assessment technique: A scaling procedure for
measuring mental workload. In P. A. Hancock & N. Meshkati (Eds.), Human mental workload (pp. 185–
213). Amsterdam: North Holland.
Remington, R. W., Johnston, J. C., Ruthruff, E., Gold, M., & Romera, M. (2000). Visual search in complex
displays: Factors affecting conflict detection by air traffic controllers. Human Factors, 42, 349–366.
Renshaw, J. A., Finlay, J. E., Tyfa, D., & Ward, R. D. (2004). Understanding visual influence in graph
design through temporal and spatial eye movement characteristics. Interacting with Computers, 16, 557–
578.
Rensink, R. A. (2002). Change detection. Annual Review of Psychology, 53, 245–277.
Rey, G., & Buchwald, F. (2010). The expertise reversal effect: cognitive load and motivational explanations.
Journal of Experimental Psychology: Applied, 17, 33–48.
Reynolds, D. (1966). Time and event uncertainty in unisensory reaction time. Journal of Experimental
Psychology, 71, 286–293.
Ricchiute, D. N. (1998). Evidence, memory, and causal order in a complex audit decision task. Journal of
Experimental Psychology: Applied, 4, 3–15.
Richards, A., Hannon, E. M., & Derakshan, N. (2010). Predicting and manipulating the incidence of
inattentional blindness. Psychological Research, 74, 513–523.
Richer, F., Silverman, C., & Beatty, J. (1983). Response selection and initiation in speeded reactions: A
pupillometric analysis. Journal of Experimental Psychology: Human Perception and Performance, 9, 360–
370.
Rieskamp, J. (2006). Positive and negative recency effects in retirement savings decisions. Journal of
Experimental Psychology: Applied. 12, 233–250.
Risden, K., Czerwinski, M., Munzner, T., & Cook, D. (2000). An initial examination of the ease of use for 2D
and 3D information visualizations of Web content. International Journal of Human-Computer Studies, 53.
Rizy, E. F. (1972). Effect of decision parameters on a detection/localization paradigm quantifying sonar
operator performance (Report No. R–1156). Washington, DC: Office of Naval Research Engineering
Program.
Robertson, G. G., Card, S. K., & Mackinlay, J. D. (1993). Information visualization using 3D interactive
animation. Communications of the ACM, 36, 57–71.
Robertson, G., Czerwinski, M., Fisher, D., & Lee, B. (2009). Human factors of information visualization. In
F. Durso (Ed.), Reviews of Human Factors and Ergonomics, (Vol. 5). Santa Monica, CA: Human Factors
and Ergonomics Society.
Roediger, H., & Karpicke, J. (2006). Test-enhanced learning: Taking memory tests improves long-term
retention. Psychological Science, 17, 249–255.
Roenker, D. L., Cissell, G. M., Ball, K. K., Wadley, V. G., & Edwards, J. D. (2003). Speed-of-processing and
driving simulator training result in improved driving performance. Human Factors, 45, 218–233.
Roge, J., Douissembekov, E., & Vienne, F. (2012). Low conspicuity of motorcycles for car drivers. Human
Factors, 54, 14–25.
Rogers, R. D., & Monsell, S. (1995). Costs of a predictable switch between simple cognitive tasks. Journal of
Experimental Psychology: General, 124, 207–231.
Rogers, S. P. (1979). Stimulus-response incompatibility: Extra processing stages versus response competition.
In Proceedings of the 23rd Annual Meeting of the Human Factors Society. Santa Monica, CA: Human
Factors Society.
Rogers, W. A., Rousseau, G. K., & Fisk, A. D. (1999). Application of attention research. In F. Durso (Ed.),
Handbook of Applied Cognition. West Sussex, UK: Wiley.
Rolfe, J. M. (1973). The secondary task as a measure of mental load. In W. T. Singleton, J. G. Fox, & D.
Whitfield (Eds.), Measurement of man at work (pp. 135–148). London: Taylor & Francis.
Rollins, R. A., & Hendricks, R. (1980). Processing of words presented simultaneously to eye and ear. Journal
of Experimental Psychology: Human Perception and Performance, 6, 99–109.
Rolt, L. T. C. (1978). Red for danger. London: Pan Books.
Roring, R. W., Hines, F. G., & Charness, N. (2007). Age differences in identifying words in synthetic
speech. Human Factors, 49, 25–31.
Roscoe, S. N. (1968). Airborne displays for flight and navigation. Human Factors, 10, 321–332.
Roscoe, S. N. (2004). Moving horizons, control reversals, and graveyard spirals. Ergonomics in Design, 12
(4), 15–19.
Roscoe, S. N., & Williges, R. C. (1975). Motion relationships in aircraft attitude guidance displays: A flight
experiment. Human Factors, 17, 374–387.
Roscoe, S. N., Corl, L., & Jensen, R. S. (1981). Flight display dynamics revisited. Human Factors, 23, 341–
353.
Rose, A. M. (1989). Acquisition and retention of skills. In G. R. McMillan, D. Beevis, E. Salas, M. H. Strub, R.
Sutton & L. Van Breda (Eds.), Applications of human performance models to system design. New York:
Plenum.
Rosen, M. A., Salas, E., Fiore, S. M., Pavlas, D., & Lum, H. C. (2009). Team cognition and external
representations: A framework and propositions for supporting collaborative problem solving. In
Proceedings of the Human Factors and Ergonomics Society 53rd Annual Meeting (pp. 257–261). Santa
Monica, CA: Human Factors and Ergonomics Society.
Rosenholtz, R., Li, Y., & Nakano, L. (2007). Measuring visual clutter. Journal of Vision, 7(2), 1–22.
Rosenthal, R., & DiMatteo, M. R. (2001). Meta-analysis: Recent developments in quantitative methods for
literature review. Annual Review of Psychology, 52, 59–82.
Roske-Hofstrand, R. J., & Paap, K. R. (1986). Cognitive networks as a guide to menu organization: An
application in the automated cockpit. Ergonomics, 29, 1,301–1,311.
Rossi, A. L., & Madden, J. M. (1979). Clinical judgment of nurses. Bulletin of the Psychonomic Society, 14,
281–284.
Roth, E. M., & Woods, D. D. (1988). Aiding human performance I: Cognitive analysis. Le Travail Humain,
51, 39–64.
Rothbaum, B. O., Anderson, P., Zimand, E., et al. (2006). Virtual reality exposure therapy and standard (in
vivo) exposure therapy in the treatment of fear of flying. Behavior Therapy, 37, 80–90.
Rothrock, L., Barron, K., Simpson, T. W., Frecker, M., Ligetti, C., & Barton, R. R. (2006). Applying the
proximity compatibility and the control-display compatibility principles to engineering design interfaces.
Human Factors and Ergonomics in Manufacturing, 16, 61–81.
Rouse, W. B. (1981). Experimental studies and mathematical models of human problem solving performance
in fault diagnosis tasks. In J. Rasmussen & W. Rouse (Eds.), Human detection and diagnosis of system
failures. New York: Plenum.
Rouse, W. B. (1988). Adaptive aiding for human/computer control. Human Factors, 30, 431–438.
Rouse, W. B., & Morris, N. M. (1987). Conceptual design of a human error tolerant interface for complex
engineering systems. Automatica, 23(2), 231–235.
Rouse, W. B., & Rouse, S. H. (1983). Analysis and classification of human error. IEEE Transactions on
Systems, Man, and Cybernetics, SMC-13, 539–554.
Rouse, S. H., Rouse, W. B., & Hammer, J. M. (1982). Design and evaluation of an onboard computer-based
information system for aircraft. IEEE Transactions on Systems, Man, and Cybernetics, SMC-12, 451–463.
Rousseau, R., Tremblay, S., and Breton, R. (2004). Defining and modeling situation awareness: A critical
review. In S. Banbury & S. Tremblay (Eds.), A cognitive approach to situation awareness: Theory and
application (pp. 3–21). Aldershot, UK: Ashgate.
Rousseau, R., Tremblay, S., Banbury, S., Breton, R., & Guitouni, A. (2010). The role of metacognition in
the relationship between objective and subjective measures of situation awareness. Theoretical Issues in
Ergonomics Science, 11, 119–130.
Rovira, E., McGarry, K., & Parasuraman, R. (2007). Effects of imperfect automation on decision making in
a simulated command and control task. Human Factors, 49, 76–87.
Rowe, A. L., Cooke, N. J., Hall, E. P., & Halgren, T. L. (1996). Toward an online knowledge assessment
methodology: Building on the relationship between knowing and doing. Journal of Experimental
Psychology: Applied, 2, 31–47.
Roy, C. S., & Sherrington, C. S. (1890). On the regulation of the blood supply of the brain. Journal of
Physiology, 11, 85–108.
Rubenstein, T., & Mason, A. F. (1979, November). The accident that shouldn’t have happened: An analysis
of Three Mile Island. IEEE Spectrum, pp. 33–57.
Rubinstein, J. S., Meyer, D. E., & Evans, J. E. (2001). Executive control of cognitive processes in task
switching. Journal of Experimental Psychology: Human Perception and Performance, 4, 763–797.
Ruffle-Smith, H. P. (1979). A simulator study of the interaction of pilot workload with errors, vigilance, and
decision (NASA Technical Memorandum 78482). Washington, DC: NASA Technical Information Office.
Rumelhart, D. E. (1977). Human information processing. New York: Wiley.
Rumelhart, D. E., & McClelland, J. L. (1986). Parallel distributed processing: Explorations in the
microstructure of cognition (Vol. 1). Cambridge, MA: MIT Press.
Rumelhart, D., & Norman, D. (1982). Simulating a skilled typist: A study of skilled cognitive-motor
performance. Cognitive Science, 6, 1–36.
Russo, J. E. (1977). The value of unit price information. Journal of Marketing Research, 14, 193–201.
Ruva, C. L., & McElvoy, C. (2008). Negative and positive pretrial publicity affect juror memory and decision
making. Journal of Experimental Psychology: Applied, 14, 226–235.
Ryu, H., & Monk, A. (2009). Interaction unit analysis: A new interaction design framework. Human–
Computer Interaction, 24, 367–407.
Sadowski, W., & Stanney, K. (2002). Presence in virtual environments. In K. M. Stanney (Ed.), Handbook of
virtual environments (pp. 791–806). Mahwah, NJ: Erlbaum.
Saito, M. (1972). A study on bottle inspection speed-determination of appropriate work speed by means of
electronystagmography. Journal of Science of Labor, 48, 395–400. (In Japanese, English summary.)
Salamé, P., & Baddeley, A. D. (1989). Effects of background music on phonological short-term memory.
Quarterly Journal of Experimental Psychology, 41A, 107–122.
Salas, E., Wilson, K. A., Burke, C. S., Wightman, D. C., & Howse, W. R. (2006). A checklist for crew
resource management training. Ergonomics in Design, Spring 2006, 6–15.
Salmon, P., Stanton, N., Walker, G., & Green D. (2006). Situation awareness measurement: A review of
applicability for C4i environments. Applied Ergonomics, 37, 225–238.
Salterio, S. (1996). Decision support and information search in a complex environment: Evidence from
archival data in auditing. Human Factors, 38, 495–505.
Salvendy, G. (Ed.). (2012). Handbook of human factors and ergonomics (4th Ed.). New York: Wiley.
Salvucci, D., & Beltowska, J. (2008). Effects of memory rehearsal on driver performance: experiment and
theoretical account. Human Factors, 50, 824–844.
Salvucci, D., & Taatgen, N. A. (2008). Threaded cognition. Psychological Review, 115, 101–130.
Salvucci, D., & Taatgen, N. A. (2011). The multi-tasking mind. Oxford, UK: Oxford University Press.
Salzer, Y., Oron-Gilad, T., Ronen, A., & Parmet, Y. (2011). Vibrotactile “on-thigh” alerting system in the
cockpit. Human Factors, 53, 118–131.
Samet, M. G., Weltman, G., & Davis, K. B. (1976, December). Application of adaptive models to information
selection in C3 systems (Technical Report PTR-1033-76-12). Woodland Hills, CA: Perceptronics.
Sanders, A. F., & Houtmans, M. J. M. (1985). Perceptual processing models in the functional visual field.
Acta Psychologica, 58, 251–261.
Sanderson, P. M. (1989). Verbalizable knowledge and skilled task performance: Association, dissociation,
and mental models. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 729–747.
Sanderson, P. M., Flach, J. M., Buttigieg, M. A., & Casey, E. J. (1989). Object displays do not always
support better integrated task performance. Human Factors, 31, 183–198.
Sanquist, T. F., Doctor, P., & Parasuraman, R. (2008). A threat display concept for radiation detection in
homeland security cargo screening. IEEE Transactions on Systems, Man, and Cybernetics. Part C.
Applications, 38, 856–860.
Sarno, K. J., & Wickens, C. D. (1995). Role of multiple resources in predicting time-sharing efficiency:
Evaluation of three workload models in a multiple-task setting. International Journal of Aviation
Psychology, 5, 107–130.
Sarter, N. B. (2007). Multimodal information presentation: Design guidance and research challenges.
International Journal of Industrial Ergonomics, 36, 439–445.
Sarter, N. B. (2008). Investigating mode errors on automated flight decks: Illustrating the problem-driven,
cumulative, and interdisciplinary nature of human factors research. Human Factors, 50, 506–510.
Sarter, N. B. (2009). The need for multisensory interfaces in support of effective attention allocation in highly
dynamic event-driven domains: The case of cockpit automation. International Journal of Aviation
Psychology, 10, 231–245.
Sarter, N. B., Mumaw, R. J., & Wickens, C. D. (2007). Pilots’ monitoring strategies and performance on
automated flight decks: An empirical study combining behavioral and eye-tracking data. Human Factors,
49, 347–357.
Sarter, N. B., & Schroeder, B. K. (2001). Supporting decision-making and action selection under time
pressure and uncertainty: The case of inflight icing. Human Factors, 43, 573–583.
Sarter, N. B., & Woods, D. D. (1995). How in the world did we ever get into that mode? Mode error and
awareness in supervisory control. Human Factors, 37, 5–19.
Sarter, N. B., & Woods, D. D. (1996). Team play with a powerful and independent agent: Operational
experiences and automation surprises on the Airbus A–320. Human Factors, 39, 559–573.
Sarter, N. B., Woods, D. D., & Billings, C. E. (1997). Automation surprises. In G. Salvendy (Ed.), Handbook
of human factors and ergonomics (2nd ed., pp. 1926–1943). New York: Wiley.
Satchell, P. (1998). Innovation and automation. Brookfield, VT: Ashgate.
Sauer, J., Wastell, D. G., & Schmeink, C. (2009). Designing for the home: A comparative study of support
aids for central heating systems. Applied Ergonomics, 40, 165–174.
Scanlan, L. A. (1975). Visual time compression: Spatial and temporal cues. Human Factors, 17, 337–345.
Scerbo, M. (1996). Theoretical perspectives on adaptive automation. In R. Parasuraman & M. Mouloua
(Eds.), Automation and human performance: Theory and applications. Mahwah, NJ: Erlbaum.
Scerbo, M. (2001). Adaptive automation. In W. Karwowski (Ed.), International encyclopedia of ergonomics
and human factors (pp. 1,077–1,079). London: Taylor & Francis.
Scerbo, M. W., Greenwald, C. Q., & Sawin, D. A. (1993). The effects of subject-controlled pacing and task
type on sustained attention and subjective workload. Journal of General Psychology, 120, 293–307.
Schall, G., Mendez, E., Kruijff, E., Veas, E., Junghanns, S., Reitinger, B., & Schmalstieg, D. (2009).
Handheld augmented reality for underground infrastructure visualization. Personal and Ubiquitous
Computing, 13, 281–291.
Scharenborg, O. (2007). Reaching over the gap: A review of efforts to link human and automatic speech
recognition research. Speech Communication, 49(5), 336–347.
Schaudt, W. A., Caufield, K. J., & Dyre, B. P. (2002). Effects of a virtual air speed error indicator on
guidance accuracy and eye movement control during simulated flight. In Proceedings of the Human
Factors and Ergonomics Society—46th Annual Meeting (pp. 1,594–1,598). Santa Monica, CA: Human
Factors and Ergonomics Society.
Scheck, B., Neufeld, P., & Dwyer, J. (2003). Actual innocence: When justice goes wrong and how to make it
right. New York: New American Library.
Schiff, W., & Oldak, R. (1990). Accuracy of judging time to arrival: Effects of modality, trajectory, and
gender. Journal of Experimental Psychology: Human Perception and Performance, 16, 303–316.
Schkade, D. A., & Kleinmuntz, D. N. (1994). Information displays and choice processes: Differential effects
of organization, form, and sequence. Organizational Behavior and Human Decision Processes, 57, 319–
337.
Schlittmeier, S. J., & Hellbrück, J. (2009). Background music as noise abatement in open-plan offices: A
laboratory study on performance effects and subjective preferences. Applied Cognitive Psychology, 23,
684–697.
Schlittmeier, S. J., Hellbrück, J., Thaden, R., & Vorländer, M. (2008). The impact of background speech
varying in intelligibility: Effects on cognitive performance and perceived disturbance. Ergonomics, 51,
719–736.
Schumacher, E., Seymour, T., Glass, J., Fencsik, D., Lauber, E., Kieras, D., & Meyer, D. (2001). Virtually
perfect time sharing in dual task performance. Psychological Science, 12, 101–108.
Schmauder, A. R., Morris, R. K., & Poynor, D. V. (2000). Lexical processing and text integration of function
and content words: Evidence from priming and eye fixations. Memory & Cognition, 28, 1,098–1,108.
Schmidt, J. K., & Kysor, K. P. (1987). Designing airline passenger safety cards. In Proceedings of the 31st
Annual Meeting of the Human Factors Society (pp. 51–55). Santa Monica, CA: Human Factors Society.
Schmidt, R. A., & Bjork, R. A. (1992). New conceptualizations of practice: Common principles in three
paradigms suggest new concepts for training. Psychological Science, 3, 207–217.
Schmorrow, D. D. (Ed.) (2005). Foundations of augmented cognition. Mahwah, NJ: Erlbaum.
Schmorrow, D. D., Stanney, K., Wilson, G., & Young, P. (2006). Augmented cognition in human-system
interaction. In G. Salvendy (Ed.), Handbook of human factors and ergonomics (3rd Ed.). New York: Wiley.
Schneider, W. (1985). Training high-performance skills: Fallacies and guidelines. Human Factors, 27, 285–
300.
Schneider, W., & Chein, J. M. (2003). Controlled & automatic processing: Behavior, theory, and biological
mechanisms. Cognitive Science, 27, 525–559.
Schneider, W., & Fisk, A. D. (1982). Concurrent automatic and controlled visual search: Can processing
occur without resource cost? Journal of Experimental Psychology: Learning, Memory and Cognition, 8,
261–278.
Schneider, W., & Fisk, A. D. (1984). Automatic category search and its transfer. Journal of Experimental
Psychology: Learning, Memory, and Cognition, 10, 1–15.
Schneider, W., & Shiffrin, R. M. (1977). Controlled and automatic human information processing I:
Detection, search, and attention. Psychological Review, 84, 1–66.
Schoenfeld, V. S., & Scerbo, M. W. (1997). Search differences for the presence and absence of features in
sustained attention. In Proceedings of the Human Factors and Ergonomics Society 41st Annual Meeting
(pp. 1,288–1,292). Santa Monica, CA: Human Factors and Ergonomics Society.
Scholl, B. J. (2001). Objects and attention: The state of the art. Cognition, 80, 1–46.
Schraagen, J. M., Chipman, S. F., & Shalin, V. L. (2000). Cognitive task analysis. Mahwah, NJ: Erlbaum.
Schraagen, J. M., Chipman, S. F., & Shute, V. J. (2000). State-of-the-art review of cognitive task analysis
techniques. In J. M. Schraagen, S. F. Chipman, & V. L. Shalin (Eds.), Cognitive task analysis (pp. 467–
487). Mahwah, NJ: Erlbaum.
Schreiber, B. T., Wickens, C. D., Renner, G. J., Alton, J., & Hickox, J. C. (1998). Navigational checking
using 3D maps: The influence of elevation angle, azimuth, and foreshortening. Human Factors, 40, 209–
223.
Schriver, A. T., Morrow, D. G., Wickens, C. D., & Talleur, D. A. (2008). Expertise differences in attentional
strategies related to pilot decision making. Human Factors, 50, 846–878.
Schröder, S., & Ziefle, M. (2008). Effects of icon concreteness and complexity on semantic transparency:
Younger vs. older users. In K. Miesenberger, J. Klaus, W. Zagler, & A. Karshmer (Eds.), Computers
helping people with special needs (pp. 90–97). Berlin: Springer.
Schroeder, R. G., & Benbassat, D. (1975). An experimental evaluation of the relationship of uncertainty to
information used by decision makers. Decision Sciences, 6, 556–567.
Schultheis, H., & Jamieson, A. (2004). Assessing cognitive load in adaptive hypermedia systems:
Physiological and behavioral methods. In P. De Bra and W. Nejdl (Eds.), Adaptive hypermedia and
adaptive web-based systems. (pp. 18–24). Eindhoven Netherlands: Springer.
Schum, D. (1975). The weighing of testimony of judicial proceedings from sources having reduced
credibility. Human Factors, 17, 172–203.
Schurr, P. H. (1987). Effects of gain and loss decision frames on risky purchase negotiations. Journal of
Applied Psychology, 72, 351–358.
Schustack, M. W., & Sternberg, R. J. (1981). Evaluation of evidence in causal inference. Journal of
Experimental Psychology: General, 110, 101–120.
Schutte, P. C., & Trujillo, A. C. (1996). Flight crew task management in non-normal situations. In
Proceedings of the 40th Annual Meeting of the Human Factors and Ergonomics Society (pp. 244–248).
Santa Monica, CA: Human Factors and Ergonomics Society.
Schwartz, D. R., & Howell, W. C. (1985). Optional stopping performance under graphic and numeric CRT
formatting. Human Factors, 27, 433–444.
Schwarz, N. & Vaughn, L. (2002). The availability heuristics revisited. In T. Gilovich, D. Griffin & D.
Kahneman (Eds.), Heuristics and biases: The psychology of intuitive judgment. New York: Cambridge
University Press.
Scialfa, C. T., Kline, D. W., & Lyman, B. J. (1987). Age differences in target identification as a function of
retinal location and noise level: Examination of the useful field of view. Psychology and Aging, 2, 14–19.
Scullin, M., & McDaniel, M. (2010). Remembering to execute a goal: Sleep on it. Psychological Science, 21,
1,028–1,035.
Seagull, F. J., & Sanderson, P. M. (2001). Anesthesiology alarms in context: An observational study.
Human Factors, 43, 66–78.
Seagull, F. J., Xiao, Y., & Plasters, C. (2004). Information accuracy and sampling effort: A field study of
surgical scheduling coordination. IEEE Transactions on Systems, Man, and Cybernetics. Part A: Systems
and Humans, 34, 764–771.
Seamster, T. L., Redding, R. E., & Kaempf, G. L. (1997). Applied cognitive task analysis in aviation.
Brookfield, VT: Ashgate.
Seamster, T. L., Redding, R. E., Cannon, J. R., Ryder, J. M., & Purcell, J. A. (1993). Cognitive task analysis
of expertise in air traffic control. International Journal of Aviation Psychology, 3, 257–283.
Sears, A., & Jacko, J. (2009). Human-computer interaction fundamentals. Boca Raton, FL: CRC Press.
Sebok, A., Wickens, C. D., Sarter, N. B., Quesada, S., Socash, C., & Anthony, B. (in press). The Automation
Design Advisor Tool (ADAT): Development and validation of a model-based tool to support flight deck
automation design for NextGen operations. Human Factors and Ergonomics in Manufacturing and Service
Industries.
See, J. E., Howe, S. R., Warm, J. S., & Dember, W. N. (1995). Meta-analysis of the sensitivity decrement in
vigilance. Psychological Bulletin, 117, 230–249.
See, J. E., Warm, J. S., Dember, W. N., Howe, S. R. (1997). Vigilance and signal detection theory: An
empirical evaluation of five measures of response bias. Human Factors, 39, 14–29.
Seegmiller, J. K., Watson, J. M., & Strayer, D. L. (2011). Individual differences in susceptibility to
inattentional blindness. Journal of Experimental Psychology: Learning, Memory, and Cognition, 37, 785–
791.
Segal, L. (1995). Designing team workstations: The choreography of teamwork. In P. A. Hancock, J. M.
Flach, J. Caird, & K. J. Vicente (Eds.), Local applications of the ecological approach to human-machine
systems (Vol. 2). Hillsdale, NJ: Erlbaum.
Seibel, R. (1964). Data entry through chord, parallel entry devices. Human Factors, 6, 189–192.
Seibel, R. (1972). Data entry devices and procedures. In R. G. Kinkade & H. S. Van Cott (Eds.), Human
engineering guide to equipment design. Washington, DC: U.S. Government Printing Office.
Seidler, K. S., & Wickens, C. D. (1992). Distance and organization in multifunction displays. Human Factors,
34, 555–569.
Seligman, M. E. P., & Kahana, M. (2009). Unpacking intuition: A conjecture. Perspectives on Psychological
Science, 4(4), 399–402.
Selye, H. (1976). Stress in health and disease. Boston, MA: Butterworth.
Senders, J. (1964). The human operator as a monitor and controller of multidegree of freedom systems. IEEE
Transactions on Human Factors in Electronics, HFE-5, 2-6.
Senders, J. (1980). Visual Scanning Processes. Unpublished Doctoral Dissertation. University of Tilburg,
Netherlands.
Senders, J., & Moray, N. (1991). Human error: Cause, prediction and reduction. Hillsdale, NJ: Erlbaum.
Seppelt, B. D., & Lee, J. D. (2007). Making adaptive cruise control (ACC) limits visible. International
Journal of Human-Computer Studies, 65, 192–205.
Serfaty, D., MacMillan, J., Entin, E. E., & Entin, E. B. (1997). The decision-making expertise of battle
commanders. In C. E. Zsambok & G. Klein (Eds.), Naturalistic decision making (pp. 233–246). Mahwah,
NJ: Erlbaum.
Servos, P., Goodale, M. A., & Jakobson, L. S. (1992). The role of binocular vision in prehension: a kinematic
analysis. Vision Research, 32, 1,513–1,521.
Sethumadhavan, A. (2009). Effects of automation types on air traffic controller situation awareness and
performance. In Proceedings of the Human Factors and Ergonomics Society 53rd Annual Meeting (pp. 1–
5). Santa Monica, CA: Human Factors and Ergonomics Society.
Sethumadhavan, A. (2011). Automation: Friend or foe? Ergonomics in Design, 19(2), 31–32.
Sexton, J. B., & Helmreich, R. L. (2000). Analyzing cockpit communication: The links between language,
performance, error, and workload. In Proceedings of the Tenth International Symposium on Aviation
Psychology, Columbus, OH.
Shaffer, L. H. (1973). Latency mechanisms in transcription. In S. Kornblum (Ed.), Attention and performance
IV. New York: Academic Press.
Shaffer, L. H. (1975). Multiple attention in continuous verbal tasks. In S. Dornic (Ed.), Attention and
performance V. New York: Academic Press.
Shaffer, L. H., & Hardwick, J. (1970). The basis of transcription skill. Journal of Experimental Psychology,
84, 424–440.
Shaffer, M. T., Hendy, K. C., & White, L. R. (1988). An empirically validated task analysis (EVTA) of low
level Army helicopter operations. In Proceedings of the 32nd Annual Meeting of the Human Factors
Society (pp. 178–183). Santa Monica, CA: Human Factors Society.
Shah, P., & Carpenter, P. A. (1995). Conceptual limitations in comprehending line graphs. Journal of
Experimental Psychology: General, 124, 43–61.
Shah, P., & Miyake, A. (Eds.). (2005). The Cambridge handbook of visuospatial thinking. Cambridge, UK:
Cambridge University Press.
Shallice, T., McLeod, P., & Lewis, K. (1985). Isolating cognition modules with the dual-task paradigm: Are
speech perception and production modules separate? Quarterly Journal of Experimental Psychology, 37,
507–532.
Shannon, C. E., & Weaver, W. (1949). The mathematical theory of communications. Urbana, IL: University
of Illinois Press.
Shanteau, J. (1992). Competence in experts: The role of task characteristics. Organizational Behavior and
Human Decision Processes, 53, 252–266.
Shanteau, J., & Dino, G. A. (1993). Environmental stressor effects on creativity and decision making. In O.
Svenson & A. J.Maule(Eds.), Time pressure and stress in human judgment and decision making (pp. 293–
308). New York: Plenum.
Shapiro, K. L., & Raymond, J. (1989). Training of efficient oculomotor strategies enhances skill acquisition.
Acta Psychologica, 71, 217–242.
Sharit, J. (2006). Human error. In G. Salvendy (Ed.), Handbook of human factors and ergonomics (3rd Ed.).
New York: Wiley.
Sharma, G., Mavroidis, C., & Ferreira, A. (2005). Virtual reality and haptics in nano- and bionanotechnology.
In M. Rieth & W. Schommers (Eds.), Handbook of theoretical and computational nanotechnology (Vol. X,
pp. 1–33). Valencia, CA: American Scientific Publishers.
Shaw, T. H., Parasuraman, R., Guagliardo, L., & de Visser, E. (2010). Towards adaptive automation: A
neuroergonomic approach to measuring workload during a command and control task. In W. Karwowski &
G. Salvendy (Eds.), Applied human factors and ergonomics. Boca Raton, FL: Taylor & Francis.
Shebilske, W. L., Goettl, B. P., & Garland, D. J. (2000). Situation awareness, automaticity, and training. In
M. R. Endsley & D. J. Garland, Situation awareness, analysis, and measurement (pp. 271–288). Mahwah,
NJ: Erlbaum.
Shechter, S., & Hochstein, S. (1992). Asymmetric interactions in the processing of the visual dimensions of
position, width, and contrast of bar stimuli. Perception, 21, 297–312.
Sheedy, J. E., Subbaram, M. V., Zimmerman, A. B., & Hayes, J. R. (2005). Text legibility and the letter
superiority effect. Human Factors, 47, 797–815.
Shen, M., Carswell, M., Santhanam, R., & Bailey, K. (2012). Emergency management information systems:
Could decision makers be supported in choosing display formats? Decision Support Systems, 52(2), 318–
330.
Shepard, R. N. (1982). Geometrical approximations to the structure of musical pitch. Psychological Review,
89, 305–333.
Sheridan, T. B. (1970). On how often the supervisor should sample. IEEE Transactions on Systems Science
and Cybernetics, SSC-6(2), 140–145.
Sheridan, T. B. (1996). Further musings on the psychophysics of presence. Presence, 5, 241–246.
Sheridan, T. B. (2002). Humans and automation: Systems design and research issues. New York: Wiley.
Sheridan, T. B., & Ferrell, W. A. (1974). Man-machine systems: Information, control, and decision models of
human performance. Cambridge, MA: MIT Press.
Sheridan, T. B., & Parasuraman, R. (2006). Human-automation interaction. Reviews of Human Factors and
Ergonomics, 1, 89–129.
Sheridan, T. B., & Verplank, W. L. (1978). Human and computer control of undersea teleoperators.
(Technical Report, Man-Machine Systems Laboratory, Department of Mechanical Engineering).
Cambridge, MA: MIT Press.
Sherman, W., & Craig, A. (2003). Understanding virtual reality: Interface, application and design. San
Francisco: Morgan Kaufmann.
Shiffrin, R. M., & Nosofsky, R. M. (1994). Seven plus or minus two: A commentary on capacity limitations.
Psychological Review, 101, 357–361.
Shiffrin, R. M., & Schneider, W. (1977). Controlled and automatic human information processing II:
Perceptual learning, automatic attending, and a general theory. Psychological Review, 84, 127–190.
Shih, S. I., & Sperling, G. (2002). Measuring and modeling the trajectory of visual spatial attention.
Psychological Review, 109, 260–305.
Shinar, D. (2008). Looks are (almost) everything: Where drivers look to get information. Human Factors, 50,
380–384.
Shneiderman, B. & Plaisant, C. (2005). Designing the user interface: Strategies for effective human
computer interaction (4th Ed.). Reading, MA: Addison-Wesley.
Shneiderman, B., & Plaisant, C. (2009). Designing the user interface: Strategies for effective human
computer interaction (5th Ed.). Reading, MA: Addison-Wesley.
Shoda, M. & Rodriguez, M. L. (1989). Delay of gratification in children. Science, 244, 933–938.
Sholl, M. J. (1987). Cognitive maps as orienting schemata. Journal of Experimental Psychology: Learning,
Memory and Cognition, 13, 615–628.
Shortliffe, E. H. (1983). Medical consultation systems. In M. E. Sime and M. J. Coombs (Eds.), Designing
for human–computer communications (pp. 209–238). New York: Academic Press.
Shugan, S. M. (1980). The cost of thinking. Journal of Consumer Research, 7, 99–111.
Shulman, H. G., & McConkie, A. (1973). S-R compatibility, response discriminability and response codes in
choice reaction time. Journal of Experimental Psychology, 98, 375–378.
Shutko, J., & Tijerina, L. (2011). Ford’s approach to managing driver attention: SYNC and MyFord Touch.
Ergonomics in Design, 4, 13–16.
Sidorsky, R. C. (1974, January). Alpha-dot: A new approach to direct computer entry of battlefield data
(Technical Paper 249). Arlington, VA: U.S. Army Research Institute for the Behavioral and Social
Sciences.
Siegel, J. A., & Siegel, W. (1972). Absolute judgment and paired associate learning: Kissing cousins or
identical twins? Psychological Review, 79, 300–316.
Siegrist, M. (1996). The use or misuse of three–dimensional graphs to represent lower-dimensional data.
Behaviour & Information Technology, 15, 96–100.
Simola, J., Kuisma, J., Öörni, A., Uusitalo, L., & Hyönä, J. (2011). The impact of salient advertisements on
reading and attention on web pages. Journal of Experimental Psychology: Applied, 17, 174–190.
Simon, H. A. (1955). A behavioral model of rational choice. Quarterly Journal of Economics, 69, 99–118.
Simon, H. A. (1978). Rationality as process and product of thought. Journal of the American Economic
Association, 68, 1–16.
Simon, H. A. (1981). The sciences of the artificial (2nd Ed.). Cambridge, MA: MIT Press.
Simon, H. A. (1990). Invariants of human behaviour. Annual Review of Psychology, 41, 1–19.
Simon, J. R. (1969). Reaction toward the source of stimulus. Journal of Experimental Psychology, 81, 174–
176.
Simonov, P. V., Frolov, M. V., Evtushenko, V. F., & Suiridov, E. P. (1977). Effect of emotional stress on
recognition of visual patterns. Aviation, Space, and Environmental Medicine, 48, 856–858.
Simons, D. J., & Chabris, C. F. (1999). Gorillas in our midst: sustained inattentional blindness for dynamic
events. Perception, 28, 1,058–1,074.
Simons, D. J., & Levin, D. T. (1998). Failure to detect changes to people during a real-world interaction.
Psychonomic Bulletin & Review, 5, 644–649.
Simonsohn, U. (2009). Direct risk aversion. Psychological Science, 20, 686–691.
Simpson, B. D., Brungart, D. S., Gilkey, R. H., Cowgill, J. L., Dallman, R. C., Green, R. F., Youngblood, K.
L., & Moore, T. J. (2004). 3D audio cueing for target identification in a simulated flight task. In
Proceedings of the Human Factors and Ergonomics Society–48th Annual Meeting (pp. 1,836–1,840). Santa
Monica, CA: Human Factors and Ergonomics Society.
Singh, I. L., Molloy, R., & Parasuraman, R. (1993). Automation-induced “complacency”: Development of
the complacency-potential rating scale. International Journal of Aviation Psychology, 3, 111–121.
Singley, M. K., & Anderson, J. R. (1989). The transfer of cognitive skill. Cambridge, MA: Harvard University
Press.
Sirevaag, E. J., Kramer, A. F., Wickens, C. D., Reisweber, M., Strayer, D. L., & Grenell, J. F. (1993).
Assessment of pilot performance and mental workload in rotary wing aircraft. Ergonomics, 36, 1,121–
1,140.
Sit, R. A., & Fisk, A. D. (1999). Age-related performance in a multiple-task environment. Human Factors, 41,
26–34.
Sitzmann, T., Ely, K., Bell, B. S., & Bauer, K. (2010). The effects of technical difficulties on learning and
attrition during online training. Journal of Experimental Psychology: Applied, 16 (3), 281–292.
Skitka, L. J., Mosier, K. L., & Burdick, M. (2000). Accountability and automation bias. International Journal
of Human-Computer Studies, 52, 701–717.
Sklar, A. E., & Sarter, N. B. (1999). Good vibrations: Tactile feedback in support of attention allocation and
human-automation coordination in event-driven domains. Human Factors, 41, 543–552.
Slamecka, N. J., & Graf, P. (1978). The generation effect: Delineation of a phenomenon. Journal of
Experimental Psychology: Human Learning, Memory, and Cognition, 4, 592–604.
Slater, M. & Usoh, M. (1993). Presence in immersive virtual environments. In IEEE Virtual Reality
International Symposium (pp. 90–96). New York: IEEE.
Sloman, S. (2002). Two systems of reasoning. In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics
and biases: The psychology of intuitive judgment. New York: Cambridge University Press.
Slovic, P. (1987). Perception of risk. Science, 236, 280–285.
Slovic, P., Finucane, M., Peters, E., & MacGregor, D. (2002). The affect heuristic. In T. Gilovich, D.
Griffin, & D. Kahneman (Eds.), Heuristics and biases: The psychology of intuitive judgment. New York:
Cambridge University Press.
Smallman, H. S., & Cook, M. B. (2011). Naïve realism: Folk fallacies in the design and use of visual
displays. Topics in Cognitive Science, 3(3), 579–608.
Smallman, H. S., Manes, D. I., & Cowen, M. B. (2003). Measuring and modeling the misinterpretation of 3-
D perspective views. In Proceedings of the Human Factors and Ergonomics Society—47th Annual Meeting
(pp. 1,615–1,619). Santa Monica, CA: Human Factors and Ergonomics Society.
Smallman, H. S., & St. John, M. (2005). Naïve realism: Misplaced faith in the utility of realistic displays.
Ergonomics in Design, 13, 6–13.
Smallman, H. S., St. John, M., & Cowen, M. B. (2002). Use and misuse of linear perspective in the
perceptual reconstruction of 3-D perspective view displays. In Proceedings of the Human Factors and
Ergonomics Society—46th Annual Meeting (pp. 1,560–1,564). Santa Monica, CA: Human Factors and
Ergonomics Society.
Smallman, H. S., St. John, M., & Cowen, M. B. (2005). Limits of display realism: Human factors issues in
visualizing the common operational picture. In Visualisation and the common operational picture. NATO

431
RTO Meeting Proceedings RTO-MP-IST-043. Neuilly-sur-Seine, France: NATO Research and Technology
Organisation.
Smelcer, J. B., & Walker, N. (1993). Transfer of knowledge across computer command menus. International
Journal of Human-Computer Interaction, 5, 147–165.
Smilek, D., Carriere, J., & Cheyne, J. A. (2010). Out of mind, out of sight: eye blinking as an indicator and
embodiment of mind wandering. Psychological Science, 21, 786–789.
Smith, J. J., & Wogalter, M. S. (2010). Behavioral compliance to in-manual and on-product warnings. In
Proceedings of the Human Factors and Ergonomics Society 54th Annual Meeting (pp. 1,846–1,850). Santa
Monica, CA: Human Factors and Ergonomics Society.
Smith, K. U. (1962). Delayed sensory feedback and balance. Philadelphia: Saunders.
Smith, S. (1981). Exploring compatibility with words and pictures. Human Factors, 23, 305–316.
Smith, K., & Hancock, P. A. (1995). Situation awareness is adaptive, externally directed consciousness.
Human Factors, 37, 137–148.
Smith, P. J., Bennett, K. B., & Stone, R. B. (2006). Representation aiding to support performance on
problem-solving tasks. Reviews of Human Factors and Ergonomics, 2, 74–108.
Smith, S., & Thomas, D. (1964). Color versus shape coding in information displays. Journal of Applied
Psychology, 48, 137–146.
Sniezek, J. A. (1980). Judgments of probabilistic events: Remembering the past and predicting the future.
Journal of Experimental Psychology: Human Perception and Performance, 6, 695–706.
Snodgrass, J. G., & Corwin, J. (1988). Pragmatics of measuring recognition memory: Applications to
dementia and amnesia. Journal of Experimental Psychology: General, 117, 34–50.
Snow, M. P., & Williges, R. C. (1997). Empirical modeling of perceived presence in virtual environments
using sequential exploratory techniques. In Proceedings of the Human Factors and Ergonomics Society—
41st Annual Meeting (pp. 1,224–1,228). Santa Monica, CA: Human Factors and Ergonomics Society.
Sodnik, J., Dicke, C., Tomazic, S., & Billinghurst, M. (2008). A user study of auditory versus visual
interfaces for use while driving. International Journal of Human Computer Studies, 66, 318–322.
Sodnik, J., Jakus, G., & Tomazic, S. (2011). Multiple spatial sounds in hierarchical menu navigation for visually
impaired computer users. International Journal of Human Computer Studies, 69, 100–112.
Soegaard, M. (2010). Interaction styles. Retrieved 29 February 2012 from Interaction-Design.org:
http://www.interaction-design.org/encyclopedia/interaction_styles.html.
Sohn, Y. W., & Doane, S. M. (2003). Roles of working memory capacity and long-term working memory
skill in complex task performance. Memory & Cognition. 31, 458–466.
Sohn, Y. W., & Doane, S. M. (2004). Memory processes of flight situation awareness: Interactive roles of
working memory capacity, long-term working memory, and expertise. Human Factors, 46, 461–475.
Sollenberger, R. L., & Milgram, P. (1993). Effects of stereoscopic and rotational displays in a three-
dimensional path-tracing task. Human Factors, 35, 483–499.
Sorensen, C. (2011). Cockpit crisis. Macleans magazine, September 5, 56–61. Rogers Publishing: Toronto.
Available online at http://www2.macleans.ca/2011/08/24/cockpit-crisis/.
Sorensen, L. J., Stanton, N. A., and Banks, A. P. (2011). Back to SA school: Contrasting three approaches to
situation awareness in the cockpit. Theoretical Issues in Ergonomics Science, 12, 451–471.
Sorkin, R. D. (1989). Why are people turning off alarms? Human Factors Society Bulletin, 32(4), 3–4.
Sorkin, R. D., & Woods, D. D. (1985). Systems with human monitors: A signal detection analysis. Human-
Computer Interaction, 1, 49–75.
Sorkin, R. D., Kantowitz, B. H., & Kantowitz, S. C. (1988). Likelihood alarm displays. Human Factors, 30,
445–460.
Sowerby, L. J., Rehal, G., Husein, M., Doyle, P. C., Agrawal, S., & Ladak, H. M. (2010). Development and
face validity testing of a three-dimensional myringotomy simulator with haptic feedback. Journal of
Otolaryngology—Head & Neck Surgery, 39, 122–129.
Spanish Ministry of Transportation and Communications (1978). Report of collision between PAA B-747
and KLM B-747 at Tenerife. Aviation Week & Space Technology, 109 (November 20), 113–121;
(November 27), 67–74.
Speier, C. (2006). The influence of information presentation formats on complex task decision-making
performance. International Journal of Human-Computer Studies, 64, 1,115–1,131.
Spence, C., McDonald, J., & Driver, J. (2004). Exogenous spatial cuing studies of human crossmodal
attention and multisensory integration. In C. Spence & J. Driver (Eds.), Crossmodal space and crossmodal
attention (pp. 277–320). Oxford: Oxford University Press.
Spence, C., & Read, L. (2003). Speech shadowing while driving: On the difficulty of splitting attention
between eye and ear. Psychological Science, 14, 251–256.
Spence, I. (2004). The apparent and effective dimensionality of representations of objects. Human Factors,
46, 738–747.
Spence, I., & Efendov, A. (2001). Target detection in scientific visualization. Journal of Experimental
Psychology: Applied, 7, 13–26.
Spence, I., Kutlesa, N., & Rose, D. L. (1999). Using color to code quantity in spatial displays. Journal of
Experimental Psychology: Applied, 5, 393–412.
Spencer, K. (1988). The psychology of educational technology and instructional media. London: Routledge.
Sperling, G., & Dosher, B. A. (1986). Strategy and optimization in human information processing. In K. Boff,
L. Kaufman, & J. Thomas (Eds.) Handbook of Perception and Performance (Vol. 1), (pp. 2-1-2-65). New
York: Wiley.
St. Amant, R., Horton, T. E., & Ritter, F. E. (2004). Model-based evaluation of expert cell phone menu
interaction. In Proceedings of the ACM Conference on Human Factors in Computing Systems (pp. 343–
350). Washington DC: Association for Computing Machinery.
St. Cyr, O., & Burns, C. M. (2001). Mental models and the abstraction hierarchy. In Proceedings of the
Human Factors and Ergonomics Society—45th Annual Meeting (pp. 297–301). Santa Monica, CA: Human
Factors and Ergonomics Society.
St. John, M., Cowen, M. B., Smallman, H. S., & Oonk, H. M. (2001). The use of 2D and 3D displays for
shape understanding versus relative position tasks. Human Factors, 43, 79–98.
St. John, M., Kobus, D. A., Morrison, J. G., & Schmorrow, D. (2004). Overview of the DARPA augmented
cognition technical integration experiment. International Journal of Human–Computer Interaction, 17,
131–149.
St. John, M., & Risser, M. R. (2009). Sustaining vigilance by activating a secondary task when inattention is
detected. In Proceedings of the Human Factors and Ergonomics Society 53rd Annual Meeting (pp. 155–
159). Santa Monica, CA: Human Factors and Ergonomics Society.
St. John, M., & Smallman, H. (2008). Four design principles for supporting situation awareness. Journal of
Cognitive Engineering and Decision Making, 2, 118–139.
St. John, M., Smallman, H. S., Manes, D. I., Feher, B. A., & Morrison, J. G. (2005). Heuristic automation
for decluttering tactical displays. Human Factors, 47, 509–525.
Stager, P., & Angus, R. (1978). Locating crash sites in simulated air-to-ground visual search. Human Factors,
20, 453–466.
Stankov, L. (1983). Attention and intelligence. Journal of Educational Psychology, 74(4), 471–490.
Stankov, L. (1988). Single tasks, competing tasks, and their relationship to the broad factors of intelligence.
Personality and Individual Difference, 9, 25–44.
Stanney, K. M., & Zyda, M. (2002). Virtual environments in the 21st century. In K. M. Stanney (Ed.),
Handbook of virtual environments (pp. 1–14). Mahwah, NJ: Erlbaum.
Stansfeld, S. A., Berglund, B., Clark, C., Lopez-Barrio, I., Fischer, P., Öhrström, E., Haines, M., Head, J.,
Hygge, S., van Kamp, I., & Berry, B. F. (2005). Aircraft and road traffic noise and children’s cognition
and health: A cross-national study. The Lancet, 365, 1,942–1,949.
Stansky, D., Wilcox, L., & Dubrowski, A. (2010). Mental rotation: Cross task training and generalization.
Journal of Experimental Psychology: Applied. 16, 349–360.
Stanton, N. A., & Baber, C. (2008). Modeling of human alarm handling response times: A case study of the
Ladbroke Grove rail accident in the UK. Ergonomics, 51, 423–440.
Stanton, N. A., Salmon, P. M., Walker, G. H., and Jenkins, D. P. (2010). Is situation awareness all in the
mind? Theoretical Issues in Ergonomics Science, 11, 29–40.
Starr, M. S., & Rayner, K. (2004). Eye movements during reading: Some current controversies. Trends in
Cognitive Science, 5, 156–163.
Steblay, N. (1997). Social influence in eyewitness recall: A meta-analytic review of lineup instruction effects.
Law and Human Behavior, 21, 283–297.
Steblay, N., Dysart, J., Fulero, S., & Lindsay, R. C. L. (2001). Eyewitness accuracy rates in sequential and
simultaneous lineup presentations: A meta-analytic comparison. Law and Human Behavior, 25, 459–473.
Steelman, K. S., McCarley, J. S., & Wickens, C. D. (2011). Modeling the control of attention in visual
workspaces. Human Factors, 53, 142–153.
Stefanidis, D., Korndorffer, J. R., Markley, S., Sierra, R., & Scott, D. J. (2006). Proficiency maintenance:
Impact of ongoing simulator training on laparoscopic skill retention. Journal of the American College of
Surgeons, 202(4), 599–603.
Stefanidis, D., Korndorffer, J. R., Sierra, R., Touchard, C., Dunne, J. B., & Scott, D. J. (2005). Skill
retention following proficiency-based laparoscopic simulator training. Journal of Surgery, 138(2), 165–
170.
Steil, B. (2001). Creating securities markets in developing countries: A new approach for the age of
automated trading. International Finance, 4(2), 257–278.
Steltzer, E. M., & Wickens, C. D. (2006). Pilots strategically compensate for display enlargements in
surveillance and flight control tasks. Human Factors, 48, 166–181.
Sternberg, S. (1966). High speed scanning in human memory. Science, 153, 652–654.
Sternberg, S. (1969). The discovery of processing stages: Extension of Donders’ method. Acta Psychologica,
30, 276–315.
Sternberg, S. (1975). Memory scanning: New findings and current controversies. Quarterly Journal of
Experimental Psychology, 27, 1–32.
Sternberg, S., Kroll, R. L., & Wright, C. E. (1978). Experiments on temporal aspects of keyboard entry. In J. P.
Duncanson (Ed.), Getting it together: Research and application in human factors. Santa Monica, CA:
Human Factors Society.
Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677–680.
Stevens, S. S. (1957). On the psychophysical law. Psychological Review, 64, 153–181.
Stevens, S. S. (1975). Psychophysics. New York: Wiley.
Stiensmeier-Pelster, J., & Schürmann, M. (1993). Information processing in decision making under time
pressure: The influence of action versus state orientation. In O. Svenson & A. J. Maule (Eds.), Time
pressure and stress in human judgment and decision making (pp. 241–254). New York: Plenum.
Stokes, A. & Kite, K. (1994). Flight Stress: Fatigue and performance in aviation. Aldershot, UK: Ashgate.
Stokes, A. F., Wickens, C. D., & Kite, K. (1990). Display technology: Human factors concepts. Warrendale,
PA: Society of Automotive Engineers.
Stokes, A. F., & Raby, M. (1989). Stress and cognitive performance in trainee pilots. In Proceedings of the
33rd Annual Meeting of the Human Factors Society. Santa Monica, CA: Human Factors Society.
Stone, D. E., & Gluck, M. D. (1980). How do young adults read directions with and without pictures?
(Technical Report). Ithaca, NY: Cornell University, Department of Education.
Stone, E. R., Yates, J. F., & Parker, A. M. (1997). Effects of numerical and graphical displays on professed
risk-taking behavior. Journal of Experimental Psychology: Applied, 3, 243–256.
Stone, R. J. (2002). Applications of virtual environments: An overview. In K. M. Stanney (Ed.), Handbook of
virtual environments (pp. 827–856). Mahwah, NJ: Erlbaum.
Stone, R. T., Watts, K. P., Zhong, P., & Wei, C. S. (2011). Physical and cognitive effects of virtual reality
integrated training. Human Factors, 53, 558–572.
Strayer, D. L., & Drews, F. A. (2007). Multitasking in the automobile. In A. F. Kramer, D. A. Wiegmann, &
A. Kirlik (Eds.), Attention: From theory to practice. Oxford UK: Oxford University Press.
Strayer, D. L., Drews, F. A., & Johnston, W. A. (2003). Cell phone-induced failures of visual attention
during simulated driving. Journal of Experimental Psychology: Applied, 9, 23–32.
Strayer, D. L., Wickens, C. D., & Braune, R. (1987). Adult age differences in the speed and capacity of
information processing. II. An electrophysiological approach. Psychology and Aging, 2, 99–110.
Stroobant, N., & Vingerhoets, G. (2000). Transcranial Doppler ultrasonography monitoring of cerebral
hemodynamics during performance of cognitive tasks: A review. Neuropsychology Review, 10, 213–231.
Stroop, J. R. (1935). Studies of interference in serial verbal reactions. Journal of Experimental Psychology,
18, 643–662.
Sturm, W., & Wilmes, K. (2001). On the functional neuro-anatomy of intrinsic and phasic alertness.
NeuroImage, 14, S76–S84.
Sulistyawati, K., Wickens, C. D., & Chui, Y. P. (2011). Prediction in situation awareness: Confidence bias
and underlying cognitive abilities. The International Journal of Aviation Psychology, 21, 153–174.
Summala, H. (1981). Driver/vehicle steering response latencies. Human Factors, 23, 683–692.
Summala, H., Nieminen, T., & Punto, M. (1996). Maintaining lane position with peripheral vision during
in-vehicle tasks. Human Factors, 38, 442–451.
Svenson, O. (1981). Are we less risky and more skillful than our fellow drivers? Acta Psychologica, 47, 143–
148.
Swain, A. D. (1990). Human reliability analysis: Need, status, trends and limitations. Reliability Engineering
and System Safety, 29, 301–313.
Svenson, O., & Maule, A. J. (Eds.). (1993). Time pressure and stress in human judgment and decision making. New
York: Plenum Press.
Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12, 257–
285.
Sweller, J., & Chandler, P. (1994). Why some material is difficult to learn. Cognition and Instruction, 12,
185–233.
Sweller, J., Chandler, P., Tierney, P., & Cooper, M. (1990). Cognitive load as a factor in the structuring of
technical material. Journal of Experimental Psychology: General, 119, 176–192.
Swets, J. A. (Ed.). (1964). Signal detection and recognition by human observers: Contemporary readings.
New York: Wiley.
Swets, J. A. (1992). The science of choosing the right decision threshold in high-stake diagnostics. American
Psychologist, 47, 522–532.
Swets, J. A. (1996). Signal detection theory and ROC analysis in psychology and diagnostics. Mahwah, NJ:
Erlbaum.
Swets, J. A. (1998). Separating discrimination and decision in detection, recognition, and matters of life and
death. In An invitation to cognitive science: Methods, models, and conceptual issues (Vol. 4, D.
Scarborough and S. Sternberg, Eds.) (2nd Ed., pp. 635–702). Cambridge, MA: MIT Press.
Swets, J. A., & Pickett, R. M. (1982). The evaluation of diagnostic systems. New York: Academic Press.
Szalma, J. L. (2009). Individual differences in human-technology interaction: incorporating variation in
human characteristics into human factors and ergonomics research and design. Theoretical Issues in
Ergonomics Science, 10, 381–397.
Szalma, J. L., & Hancock, P. A. (2011). Noise effects on human performance: A meta-analytic synthesis.
Psychological Bulletin, 137, 682–707.
Taatgen, N. A., Huss, D., Dickison, D., & Anderson, J. R. (2008). The acquisition of robust and flexible
cognitive skills. Journal of Experimental Psychology: General, 137, 548–565.
Taati, B., Tahmasebi, A. M., & Hashtrudi-Zaad, K. (2008). Experimental identification and analysis of the
dynamics of a PHANToM premium 1.5A haptic device. Presence, 17, 327–343.
Takeuchi, A. H., & Hulse, S. H. (1993). Absolute pitch. Psychological Bulletin, 113, 345–361.
Taleb, N. N. (2007). The black swan: The impact of the highly improbable. New York: Random House.
Taylor, H., Brunye, T., & Taylor, S. (2008). Spatial mental representation: Implications for navigation system
design. In M. Carswell (Ed.), Reviews of Human Factors and Ergonomics (Vol 4). Santa Monica, CA:
Human Factors and Ergonomics Society.
Taylor, J. L., O’Hara, R., Mumenthaler, M. S., Rosen, A. C., & Yesavage, J. A. (2005). Cognitive ability,
expertise, and age differences in following air-traffic control instructions. Psychology and Aging, 20(1),
117–133.
Taylor, R. M., & Selcon, S. J. (1990). Cognitive quality and situational awareness with advanced aircraft
attitude displays. In Proceedings of the 34th annual meeting of the Human Factors Society (pp. 26–30).
Santa Monica, CA: Human Factors Society.
Taylor, V. A., & Bower, A. B. (2004). Improving product instruction compliance: “If you tell me why, I might
comply”. Psychology and Marketing, 21(3), 229–245.
Technical Working Group for Eyewitness Evidence (1999). Eyewitness evidence: A guide for law
enforcement. Washington, DC: US Department of Justice, Off. Justice Programs.
Teevan, J. (2008). How people recall, recognize, and reuse search results. ACM Transactions on Information
Systems, 26(4), Article 19.
Teichner, W. H. (1974). The detection of a simple visual signal as a function of time of watch. Human
Factors, 16, 339–353.
Teichner, W. H., & Krebs, M. J. (1972). Laws of the simple visual reaction time. Psychological Review, 79,
344–358.
Telford, C. W. (1931). Refractory phase of voluntary and associate response. Journal of Experimental
Psychology, 14, 1–35.
Teichner, W. H., & Mocharnuk, J. B. (1979). Visual search for complex targets. Human Factors, 21, 259–
276.
Teichner, W. H., & Krebs, M. J. (1974). Laws of visual choice reaction time. Psychological Review, 81, 75–
98.
Tenney, Y. J., & Pew, R. W. (2007). Situation awareness catches on. What? So what? What now? In R. C.
Williges (Ed.), Reviews of human factors and ergonomics (Vol. 2, pp. 89–129). Santa Monica, CA: Human
Factors and Ergonomics Society.
Tetlock, P. E. (2002). Intuitive politicians, theologians and prosecutors: Exploring the empirical implications
of deviant functionalist metaphors. In T. Gilovich, D. Griffin & D. Kahneman (Eds.), Heuristics and
biases: The psychology of intuitive judgment. New York: Cambridge University Press.
Tetlock, P. E. (2005). Expert political judgment: How good is it? How can we know? Princeton, NJ: Princeton
University Press.
Theeuwes, J., Atchley, P., & Kramer, A. F. (1998). Attentional control within 3-D space. Journal of
Experimental Psychology: Human Perception and Performance, 24, 1,476–1,485.
Thomas, L. C., & Wickens, C. D. (2008). Display dimensionality and conflict geometry effects on maneuver
preferences for resolving inflight conflicts. Human Factors, 50, 576–588.
Thurstone, L. L. (1927). A law of comparative judgment. Psychological Review, 34, 273–286.
Tierney, J. (2011). To choose is to lose. NY Times Magazine. Aug 17.
Tiersma, P. M. (2006). Communicating with juries: How to draft more understandable jury instructions.
National Center for State Courts, Williamsburg, VA.
Tindall-Ford, S., Chandler, P., & Sweller, J. (1997). When two sensory modes are better than one. Journal
of Experimental Psychology: Applied, 3, 257–287.
Ting, C., Mahfouf, M., Nassef, A., Linkens, D. A., Panoutsos, G., Nickel, P., Roberts, A. C., & Hockey, G.
(2010). Real-time adaptive automation system based on identification of operator functional state in
simulated process control operations. IEEE Transactions on Systems, Man, and Cybernetics. Part A:
Systems and Humans, 40, 251–262.
Tinker, M. A. (1955). Prolonged reading tasks in visual research. Journal of Applied Psychology, 39, 444–
446.
Titchener, E. B. (1908). Lectures on the elementary psychology of feeling and attention. New York:
MacMillan.
Torgerson, W. S. (1958). Theory and method of scaling. New York: Wiley.
Toronov, V., Webb, A., Choi, J. H., Wolf, M., Michalos, A., Gratton, E., & Huber, D. (2001). Investigation
of human brain hemodynamics by simultaneous near-infrared spectroscopy and functional magnetic
resonance imaging. Medical Physics, 28, 521–527.
Trafton, J. G., & Monk, C. (2007). Dealing with interruptions. Reviews of Human Factors and Ergonomics,
Vol 3. Santa Monica, CA: Human Factors and Ergonomics Society.
Trafton, J. G., Altmann, E. M., & Brock, D. P. (2005). Huh? What was I doing? How people use
environmental cues after an interruption. In Proceedings of the Human Factors and Ergonomics Society
49th Annual Meeting (pp. 468–472). Santa Monica, CA: Human Factors and Ergonomics Society.
Trafton, J. G., Altmann, E. M., Brock, D. P., & Mintz, F. E. (2003). Preparing to resume an interrupted task:
Effects of prospective goal encoding and retrospective rehearsal. International Journal of Human–
Computer Studies, 58, 583–603.
Treisman, A. M. (1964a). The effect of irrelevant material on the efficiency of selective listening. American
Journal of Psychology, 77, 533–546.
Treisman, A. M. (1964b). Verbal cues, language, and meaning in attention. American Journal of Psychology,
77, 206–214.
Treisman, A. M. (1986). Properties, parts, and objects. In K. R. Boff, L. Kaufman, & J. P. Thomas (Eds.),
Handbook of perception and human performance (Vol. II, pp. 35.1–35.70). New York: Wiley.
Treisman, A. M., & Davies A. (1973). Divided attention to eye and ear. In S. Kornblum (Ed.), Attention and
performance IV. New York: Academic Press.
Treisman, A. M., & Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology, 12,
97–136.
Treisman, A., & Souther, J. (1985). Search asymmetry: A diagnostic for preattentive processing of separable
features. Journal of Experimental Psychology: General, 114, 285–310.
Tremblay, S., & Jones, D. M. (1999). Changes in intensity fails to produce an irrelevant sound effect:
Implications for representation of unattended sound. Journal of Experimental Psychology: Human
Perception and Performance, 25, 1,005–1,015.
Tremblay, S., & Jones, D. M. (2001). Beyond the matrix: A study of interference. In D. Harris (Ed.).
Engineering Psychology and Cognitive Ergonomics (Vol. 6, pp. 255–262). Aldershot, England: Ashgate.
Tripp, L. D., & Warm, J. S. (2007). Transcranial Doppler sonography. In R. Parasuraman & M. Rizzo (Eds.)
Neuroergonomics: The brain at work. (pp. 82–94). New York: Oxford University Press.
Tsang, P. S. (2006). Regarding time-sharing with concurrent operations. Acta Psychologica, 121, 137–175.
Tsang, P. S., & Shaner, T. L. (1998). Age, attention, expertise, and time-sharing performance. Psychology
and Aging, 13, 323–347.
Tsang, P. S., & Vidulich, M. A. (2006). Mental workload and situation awareness. In G. Salvendy (Ed.),
Handbook of human factors & ergonomics (pp. 243–268). Hoboken, NJ: Wiley.
Tsang, P. S., & Wickens, C. D. (1988). The structural constraints and strategic control of resource allocation.
Human Performance, 1, 45–72.
Tsang, P. S., & Wilson, G. (1997). Mental workload. In G. Salvendy (Ed.), Handbook of human factors and
ergonomics (2nd Ed.). New York: Wiley.
Tsimhoni, O., Smith, D., & Green, P. (2004). Address entry while driving: speech recognition versus touch
screen keyboard. Human Factors, 46, 600–610.
Tsirlin, I., Allison, R. S., & Wilcox, L. M. (2008). Stereoscopic transparency: Constraints on the perception of
multiple surfaces. Journal of Vision, 8(5):5, 1–10.
Tufte, E. (2001). The visual display of quantitative information. (2nd Ed.). Cheshire, CT: Graphics Press.
Tulga, M. K., & Sheridan, T. B. (1980). Dynamic decisions and workload in multitask supervisory control.
IEEE Transactions on Systems, Man, and Cybernetics, SMC–10, 217–232.
Tullis, T. S. (1988). Screen design. In M. Helander (Ed.), Handbook of human-computer interaction (pp. 377–
411). Amsterdam: North-Holland.
Tulving, E., & Schacter, D. L. (1990). Priming and human memory systems. Science, 247, 302–306.
Tulving, E., Mandler, G., & Baumal, R. (1964). Interaction of two sources of information in tachistoscopic
word recognition. Canadian Journal of Psychology, 18, 62–71.
Turner, M. L., & Engle, R. W. (1989). Is working memory capacity task dependent? Journal of Memory and
Language, 28, 127–154.
Tversky, A. (1972). Elimination by aspects: A theory of choice. Psychological Review, 79, 281–299.
Tversky, A. (1977). Features of similarity. Psychological Review, 84, 327–352.
Tversky, A., & Kahneman, D. (1971). The law of small numbers. Psychological Bulletin, 76, 105–110.
Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185,
1,124–1,131.
Tversky, A., & Kahneman, D. (1981). The framing of decisions and the psychology of choice. Science, 211,
453–458.
Tversky, B., Morrison, J., & Bertrancourt, M. (2002). Animation: Can it facilitate? International Journal of
Human-Computer Studies, 57, 247–262.
U.S. Navy (1988). Investigation report: Formal investigation into the circumstances surrounding the downing
of Iran Air Flight 655 on 3 July 1988. Washington, DC: Department of Defense Investigation Report.
Ullsperger, P., Freude, G., & Erdmann, U. (2001). Auditory probe sensitivity to mental workload changes—
an event-related potential study. International Journal of Psychophysiology, 40, 201–209.
Upton, C., & Doherty, G. (2008). Extending ecological interface design principles: A manufacturing case
study. International Journal of Human-Computer Studies, 66, 271–286.
Ursin, H., Baade, E., & Levine, S. (Eds.). (1978). Psychobiology of stress. New York: Academic Press.
Valero-Gomez, A., de la Puente, P., & Hernando, M. (2011). Impact of two adjustable-autonomy models on
the scalability of single-human/multiple-robot teams for exploration missions. Human Factors, 53(6), 703–
716.
Van Beurden, M. H. P. H., van Hoey, G., Hatzakis, H., & Ijsselsteijn, W. A. (2009). Stereoscopic displays in
medical domains: A review of perception and performance effects. In Human Vision and Electronic
Imaging XIV, Proceedings of the SPIE. ( pp. 72400A-72400A-15). Bellingham, WA: International Society
for Optics and Photonics.
Van Breda, L. (1999). Anticipatory behavior in supervisory vehicle control. Delft, Netherlands: Delft
University Press.
Van Dam, S. B. J., Mulder, M., & van Paassen, M. M. (2008). Ecological interface design of a tactical
airborne separation assistance tool. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems
and Humans, 38, 1221–1233.
Van Der Horst, R. (1988). Driver decision making at traffic signals. In Traffic accident analysis and roadway
visibility (pp. 93–97). Washington, DC: National Research Council.
Van Der Kleij, R., & Brake, G. (2010). Map mediated dialogues. Human Factors, 52, 526–536.
Van Ee, R., Banks, M. S., & Backus, B. T. (1999). An analysis of stereoscopic slant contrast. Perception, 28,
1,121–1,145.
Van Erp, J. B. F., Eriksson, L., Levin, B., Carlander, O., Veltman, J. A., & Vos, W. K. (2007). Tactile
cueing effects on performance in simulated aerial combat with high acceleration. Aviation, Space, and
Environmental Medicine, 78, 1,128–1,134.
Van Gog, T., & Rummel, N. (2010). Example-based learning: Integrating cognitive and social-cognitive
research perspectives. Educational Psychology Review 22(2), 155-174.
Van Laar, D., & Deshe, O. (2007). Color coding of control room displays: The psychocartography of visual
layering effects. Human Factors, 49, 477–490.
Van Merriënboer, J. J. G., Kester, L., & Paas, F. (2006). Teaching complex rather than simple tasks:
balancing intrinsic and germane load to enhance transfer of learning. Applied Cognitive Psychology 20,
343–352.
Van Overschelde, J. P., & Healy, A. F. (2005). A blank look in reading. Experimental Psychology (formerly
Zeitschrift für Experimentelle Psychologie), 52, 213–223.
Van Rooij, I., Stege, U., & Schactman, A. (2003). Convex hull and tour crossings in the Euclidean traveling
sales-person problem: Implications for human performance studies. Memory & Cognition, 31, 215–220.
Van Veen, H. A. H. C., & van Erp, J. B. F. (2003). Providing directional information with tactile torso
displays. In Proceedings of the World Haptics Conference (pp. 471–474). New York: IEEE.
Vanderheiden, G. C. (2006). Design for people with functional limitations. In G. Salvendy (Ed.), Handbook of
human factors and ergonomics (3rd Ed.). New York: Wiley.
Varey, C. A., Mellers, B. A., & Birnbaum, M. H. (1990). Judgments of proportions. Journal of Experimental
Psychology: Human Perception and Performance, 16, 613–625.
Vartabedian, A. G. (1972). The effects of letter size, case, and generation method on CRT display search
time. Human Factors, 14, 511–519.
Vashitz, G., Meyer, J., Parmet, Y., Peleg, R., Goldfar, D., Porath, A., & Gilutz, H. (2009). Defining and
measuring physicians’ responses to clinical reminders. Journal of Biomedical Informatics, 42, 317–326.
Venetjoki, N., Kaarlela-Tuomaala, A., Keskinen, E., & Hongisto, V. (2006). The effect of speech and speech
intelligibility on task performance. Ergonomics, 49, 1,068–1,091.
Venturino, M. (1991). Automatic processing, code dissimilarity, and the efficiency of successive memory
searches. Journal of Experimental Psychology: Human Perception and Performance, 17, 677–695.
Vergauwe, E., Barrouillet, P., & Camos, V. (2010). Do mental processes share a domain-general resource?
Psychological Science, 21, 384–390.
Verhaeghen, P., Steitz, D. W., Sliwinski, M. J., & Cerella, J. (2003). Aging and dual-task performance: A
meta-analysis. Psychology and Aging, 18, 443–460.
Vessey, I. (1985). Expertise in debugging computer programs: A process analysis. International Journal of
Man-Machine Studies, 23, 459–494.
Vessey, I. (1991). Cognitive fit: A theory-based analysis of the graphs versus tables literature. Decision
Sciences, 22, 219–241.
Vicente, K. J. (1990). Coherence- and correspondence-driven work domains: Implications for systems design.
Behaviour & Information Technology, 9, 493–502.
Vicente, K. J. (1992). Memory recall in a process control system: A measure of expertise and display
effectiveness. Memory & Cognition, 20, 356–373.
Vicente, K. J. (1997). Should an interface always match the operator’s mental model? CSERIAC Gateway, 8,
1–5.
Vicente, K. J. (1999). Cognitive work analysis. Mahwah, NJ: Erlbaum.
Vicente, K. J. (2002). Ecological interface design: Progress and challenges. Human Factors, 44, 62–78.
Vicente, K. J., & Rasmussen, J. (1992). Ecological interface design: Theoretical foundations. IEEE
Transactions on Systems, Man, and Cybernetics, 22, 589–606.
Vicente, K. J., Thornton, D. C., & Moray, N. (1987). Spectral analysis of sinus arrhythmia: A measure of
mental effort. Human Factors, 29, 171–182.
Vicente, K. J., & Wang, J. H. (1998). An ecological theory of expertise effects in memory recall.
Psychological Review, 105, 33–57.
Vicentini, M., & Botturi, D. (2009). Human factors in haptic contact of pliable surfaces. Presence, 18, 478–
494.
Vickers, D. (1970). Evidence for an accumulator model of psychophysical discrimination. Ergonomics, 13,
37–58.
Victor, T. (2011). Distraction and inattention countermeasure technologies. Ergonomics in Design, 19(4), 20–
22.
Vidulich, M. A., & Tsang, P. S. (1986). Techniques of subjective workload assessment: A comparison of
SWAT and the NASA-bipolar methods. Ergonomics, 29, 1,385–1,398.
Vidulich, M. A., & Tsang, P. S. (2007). Methodological and theoretical concerns in multitask performance: A
critique of Boles, Bursk, Phillips, and Perdelwitz. Human Factors, 49, 46–49.
Vidulich, M. A., & Wickens, C. D. (1986). Causes of dissociation between subjective workload measures and
performance: Caveats for the use of subjective assessments. Applied Ergonomics, 17, 291–296.
Villoldo, A., & Tarno, R. L. (1984). Measuring the performance of EOD equipment and operators under
stress (DTIC Technical Report AD-B083-850). Indian Head, MD: Naval Explosive and Ordnance Disposal
Technology Center.
Vincow, M. A., & Wickens, C. D. (1998). Frame of reference and navigation through document
visualizations: Flying through information space. In Proceedings of the Human Factors and Ergonomics
Society 42nd Annual Meeting (pp. 511–515). Santa Monica, CA: Human Factors and Ergonomics Society.
Vinze, A. S., Sen, A., & Liou, S. F. T. (1993). Operationalizing the opportunistic behavior in model
formulation. International Journal of Man–Machine Studies, 38, 509–540.
Violanti, J. M., & Marshall, J. R. (1996). Cellular phones and traffic accidents: An epidemiological
approach. Accident Analysis and Prevention, 28(2), 265–270.
Votanopoulos, K., Brunicardi, F. C., Thornby, J., & Bellows, C. F. (2008). Impact of three-dimensional
vision in laparoscopic training. World Journal of Surgery, 32(1), 110–118.
Wachtel, P. L. (1968). Anxiety, attention, and coping with threat. Journal of Abnormal Psychology, 73, 137–
143.
Wagenaar, W. A., & Sagaria, S. D. (1975). Misperception of exponential growth. Perception &
Psychophysics, 18, 416–422.
Walker, B., & Kogan, A. (2009). Spearcon performance and preference for auditory menus on a mobile
phone. In C. Stephanidis (Ed.), Universal access in human-computer interaction: Intelligent and ubiquitous
interaction environments, Berlin: Springer.
Wallis, T. S. A., & Horswill, M. A. (2007). Using fuzzy signal detection theory to determine why experienced
and trained drivers respond faster than novices in a hazard perception test. Accident Analysis and
Prevention, 39, 1,177–1,185.
Wallsten, T. S., & Barton, C. (1982). Processing probabilistic multidimensional information for decisions.
Journal of Experimental Psychology: Learning, Memory and Cognition, 8, 361–384.
Wang, B. (2011). Simplify to clarify. Nature Methods, 8, 611.
Wang, L., Jamieson, G. A., & Hollands, J. G. (2009). Trust and reliance on an automated combat
identification system. Human Factors, 51, 281–291.
Wang, W., & Milgram, P. (2009). Viewpoint animation with a dynamic tether for supporting navigation in a
virtual environment. Human Factors, 51, 393–403.
Wang, Z., Hope, R., Wang, Z., Ji, Q., & Gray, W. D. (2012). Cross-subject workload classification with a
hierarchical Bayes model. NeuroImage, 59, 64–69.
Ward, G., & Allport, A. (1997). Planning and problem-solving using the five-disc Tower of London task.
Quarterly Journal of Experimental Psychology, 50A, 49–78.
Ware, C. & Franck, G. (1996). Evaluating stereo and motion cues for visualizing information nets in three
dimensions. ACM Transactions on Graphics 15, 2, 121–139.
Ware, C., & Mitchell, P. (2008). Visualizing graphs in three dimensions. ACM Transactions on Applied
Perception, 5 (1), 2-1–2-15.
Warm, J. S. (Ed.). (1984). Sustained attention in human performance. Chichester: Wiley.
Wargo, E. (2011). From the lab to the courtroom. APS Observer, 24 (November), 1–14.
Warm, J. S., & Dember, W. N. (1998). Tests of a vigilance taxonomy. In R. R. Hoffman, M. F. Sherrick, & J.
S. Warm (Eds.), Viewing psychology as a whole: The integrative science of William N. Dember (pp. 87–
112). Washington, DC: American Psychological Association.
Warm, J. S., Dember, W. N., & Hancock, P. A. (1996). Vigilance and workload in automated systems. In R.
Parasuraman & M. Mouloua (Eds.), Automation and human performance: theory and applications (pp.
183–200). Mahwah, NJ: Erlbaum.
Warm, J. S., Dember, W. N., Murphy, A. Z., & Dittmar, M. L. (1992). Sensing and decision-making
components of the signal-regularity effect in vigilance performance. Bulletin of the Psychonomic Society,
30, 297–300.
Warm, J. S., Parasuraman, R., & Matthews, G. (2008). Vigilance requires hard mental work and is stressful.
Human Factors, 50, 433–441.
Warren, W. H. (2004). Optic flow. In L. M. Chalupa & J. S. Werner (Eds.), The visual neurosciences (pp.
1,247–1,259). Cambridge, MA: MIT Press.
Warren, W. H., & Hannon, D. J. (1990). Eye movements and optical flow. Journal of the Optical Society of
America A, 7, 160–169.
Warren, W. H., Kay, B. A., Zosh, W. D., Duchon, A. P., & Sahuc, S. (2001). Optic flow is used to control
human walking. Nature Neuroscience, 4, 213–216.
Warrick, M. J. (1947). Direction of movement in the use of control knobs to position visual indicators (USAF
AMC Report no. 694–4C). Wright AFB: U.S. Air Force.
Warrick, M. S., Kibler, A., Topmiller, D. H., & Bates, C. (1964). Response time to unexpected stimuli.
American Psychologist, 19, 528.
Watson, M., & Sanderson, P. (2004). Sonification supports eyes-free respiratory monitoring and task time-
sharing. Human Factors, 46, 497–517.
Watts-Perotti, J., & Woods, D. (1999). How experienced users avoid getting lost in large display networks.
International Journal of Human Computer Interaction, 11, 269–299.
Weber, E. (2010). What shapes perceptions of climate change? Wiley Interdisciplinary Reviews: Climate
Change, 1, 332–342.
Weeks, D. J., & Proctor, R. W. (1990). Salient features coding in the translation between orthogonal stimulus
and response dimensions. Journal of Experimental Psychology: General, 119, 355–366.
Wegner, D. M., Giuliano, T., & Hertel, P. (1985). Cognitive interdependence in close relationships. In W. J.
Ickes (Ed.), Compatible and incompatible relationships (pp. 253–276). New York: Springer.
Weinstein, L. F., & Wickens, C. D. (1992). Use of nontraditional flight displays for the reduction of central
visual overload in the cockpit. International Journal of Aviation Psychology, 2, 121–142.
Weinstein, Y., McDermott, K., & Roediger, H. (2010). A comparison of study strategies for passages:
Rereading, answering questions and generating questions. Journal of Experimental Psychology: Applied,
16, 308–316.
Weintraub, D. J. (1971). Rectangle discriminability: Perceptual relativity and the law of pragnanz. Journal of
Experimental Psychology, 88, 1–11.
Weiss, D., & Shanteau, J. (2003). Empirical assessment of expertise. Human Factors, 45, 104–116.
Weldon, M. S., & Bellinger, K. D. (1997). Collective memory: Collaborative and individual processes in
remembering. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 1,160–1,175.
Welford, A. T. (1952). The psychological refractory period and the timing of high speed performance. British
Journal of Psychology, 43, 2–19.
Welford, A. T. (1967). Single channel operation in the brain. Acta Psychologica, 27, 5–21.
Welford, A. T. (1968). Fundamentals of skill. London: Methuen.
Welford, A. T. (1976). Skilled performance: Perceptual and motor skills. Glenview, IL: Scott, Foresman.
Wellner, M., Sigrist, R., & Riener, R. (2010). Virtual competitors influence rowers. Presence, 19, 313–330.
Wells, G. L. (1993). What do we know about eyewitness identification? American Psychologist, 48, 553–571.
Wells, G. L. (1984). The psychology of lineup identifications. Journal of Applied Social Psychology, 14,
89–103.
Wells, G. L., & Bradfield, A. L. (1998). “Good, you identified the suspect”: Feedback to eyewitnesses distorts
their reports of the witnessing experience. Journal of Applied Psychology, 83, 360–376.
Wells, G. L., & Loftus, E. F. (1984). Eyewitness testimony: Psychological perspectives. New York:
Cambridge University Press.
Wells, G. L., & Olson, E. A. (2003). Eyewitness testimony. Annual Review of Psychology, 54, 277–295.
Wells, G. L., Lindsay, R. C., & Ferguson, T. I. (1979). Accuracy, confidence, and juror perceptions in eye-
witness testimony. Journal of Applied Psychology, 64, 440–448.
Weltman, H., Smith, J., & Egstrom, G. (1971). Perceptual narrowing during simulated pressure chamber
exposure. Human Factors, 13, 99–107.
Wenger, M. J., & Payne, D. G. (1995). On the acquisition of mnemonic skill: Application of skilled memory
theory. Journal of Experimental Psychology: Applied, 1, 194–215.
Westheimer, G. (2011). Three-dimensional displays and stereo vision. Proceedings of the Royal Society B,
278, 2,241–2,248.
Wetherell, A. (1981). The efficacy of some auditory-vocal subsidiary tasks as measures of the mental load on
male and female drivers. Ergonomics, 24, 197–214.
Wetzel, C. D., Radtke, P. H., & Stern, H. W. (1994). Instructional effectiveness of video media. Hillsdale, NJ:
Erlbaum.
Wheatley, D. J., & Basapur, S. (2009). A comparative evaluation of TV video telephony with webcam and
face to face communication. Proceedings of the seventh European conference on interactive television.
Leuven, Belgium.
Whitaker, L. A., & Stacey, S. (1981). Response times to left and right directional signals. Human Factors, 23,
447–452.
Whitney, P., Arnett, P. A., Driver, A., & Budd, D. (2001). Measuring central executive functioning: what’s in
a reading span? Brain and Cognition, 45, 1–14.
Whittaker, S. (2003). Things to talk about when talking about things. Human-Computer Interaction, 18, 149–
170.
Wickelgren, W. (1977). Speed-accuracy tradeoff and information processing dynamics. Acta Psychologica,
41, 67–85.
Wickelgren, W. A. (1964). Size of rehearsal group in short–term memory. Journal of Experimental
Psychology, 68, 413–419.
Wickens, C. D. (2009). The psychology of aviation surprise: An 8 year update regarding the noticing of black
swans. In J. Flach & P. Tsang (Eds.), Proceedings of the 2009 Symposium on Aviation Psychology. Dayton,
OH: Wright State University.
Wickens, C. D. (1976). The effects of divided attention on information processing in tracking. Journal of
Experimental Psychology: Human Perception and Performance, 2, 1–13.
Wickens, C. D. (1980). The structure of attentional resources. In R. Nickerson (Ed.), Attention and
performance VIII (pp. 239–257). Hillsdale, NJ: Erlbaum.
Wickens, C. D. (1984). Engineering psychology and human performance. Columbus, OH: Merrill.
Wickens, C. D. (1984). Processing resources in attention. In R. Parasuraman & R. Davies (Eds.), Varieties of
attention (pp. 63–101). New York: Academic Press.
Wickens, C. D. (1986). The effects of control dynamics on performance. In K. R. Boff, L. Kaufman, & J. P.
Thomas (Eds.), Handbook of Perception and Performance (Vol. II, pp. 39–1/39–60). New York: Wiley.
Wickens, C. D. (1992). Engineering psychology and human performance (2nd ed.). New York: Harper
Collins.
Wickens, C. D. (1993). Cognitive factors in display design. Journal of the Washington Academy of Sciences,
83(4), 179–201.
Wickens, C. D. (1996). Designing for stress. In J. E. Driskell & E. Salas (Eds.), Stress and human
performance (pp. 279–296). Mahwah, NJ: Erlbaum.
Wickens, C. D. (1999). Frames of reference for navigation. In D. Gopher & A. Koriat (Eds.), Attention and
performance XVI (pp. 113–144). Orlando, FL: Academic Press.
Wickens, C. D. (2002a). Multiple resources and performance prediction. Theoretical Issues in Ergonomics
Science, 3, 159–177.
Wickens, C. D. (2002b). Aviation psychology. In L. Backman & C. von Hofsten (Eds.), Psychology at the
turn of the millennium (Vol. 1). East Sussex, UK: Psychology Press.
Wickens, C. D. (2002c). Situation awareness and workload in aviation. Current Directions in Psychological
Science, 11(4), 128–133.
Wickens, C. D. (2003). Aviation displays. In P. Tsang & M. Vidulich (Eds.), Principles and practices of
aviation psychology. Mahwah, NJ: Erlbaum.
Wickens, C. D. (2005). Multiple resource time sharing model. In N. A. Stanton, E. Salas, H. W. Hendrick, A.
Hedge, & K. Brookhuis (Eds.), Handbook of human factors and ergonomics methods (pp. 40–1/40–7).
Taylor & Francis.
Wickens, C. D. (2007). How many resources and how to identify them: Commentary on Boles et al., and
Vidulich & Tsang. Human Factors, 49, 53–56.
Wickens, C. D. (2008a). Multiple resources and mental workload. Human Factors, 50, 449–455.
Wickens, C. D. (2008b). Situation awareness: Review of Mica Endsley’s 1995 articles on SA theory and
measurement. Human Factors, 50, 397–403.
Wickens, C. D. (2012). Noticing events in the visual workplace: The SEEV and NSEEV models. In R.
Hoffman & R. Parasuraman (Eds.), Handbook of Applied Perception (pp. xx–xx). Cambridge, UK:
Cambridge University Press.
Wickens, C. D., & Alexander, A. (2009). Attentional tunneling and task management in synthetic vision
displays. International Journal of Aviation Psychology, 19, 182–199.
Wickens, C. D., Alexander, A. L., Ambinder, M. S., & Martens, M. (2004). The role of highlighting in visual
search through maps. Spatial Vision, 37, 373–388.
Wickens, C. D., Bagnall, T., Gosakan, M., & Walters, B. (2011). Modeling single pilot control of multiple
UAVs. In M. Vidulich & P. Tsang (Eds.), Proceedings of the 16th International Symposium on Aviation
Psychology. Dayton, OH: Wright State University.
Wickens, C. D., & Baker, P. (1995). Cognitive issues in virtual reality. In W. Barfield & T. Furness III (Eds.),
Virtual Environments and Advanced Interface Design (pp. 514–541). New York: Oxford University Press.
Wickens, C. D., Carolan, T., Hutchins, S., & Cumming, J. (2011). Investigating the impact of training on
transfer: A meta-analytic approach. In Proceedings of the 55th Annual Meeting of the Human Factors and
Ergonomics Society. Santa Monica, CA: Human Factors and Ergonomics Society.
Wickens, C. D. & Carswell, C. M. (1995). The proximity compatibility principle: Its psychological
foundation and relevance to display design. Human Factors, 37, 473–494.
Wickens, C. D., & Carswell, C. M. (2012). Information processing. In G. Salvendy (Ed.), Handbook of
Human Factors and Ergonomics (4th Ed.) (Ch. 5., pp. xx–xx). New York: Wiley.
Wickens, C. D., & Colcombe, A. (2007). Performance consequences of imperfect alerting automation
associated with a cockpit display of traffic information. Human Factors, 49, 564–572.
Wickens, C. D., & Dixon, S. R. (2007). The benefits of imperfect diagnostic automation: A synthesis of the
literature. Theoretical Issues in Ergonomics Science, 8, 201–212.
Wickens, C. D., Dixon, S. R., & Ambinder, M. S. (2006). Workload and automation reliability in unmanned
air vehicles. In N. J. Cooke, H. L. Pringle, H. K. Pedersen, & O. Connor (Eds.), Human factors of remotely
operated vehicles (pp. 209–222). Elsevier: Amsterdam.
Wickens, C. D., Dixon, S., Goh, J., & Hammer, B. (2005). Pilot dependence on imperfect diagnostic
automation in simulated UAV flights: an attentional visual scanning analysis. In J. Flach (Ed.), Proceedings
13th International Symposium on Aviation Psychology, Wright-Patterson AFB, Dayton OH.
Wickens, C. D., Keller, J. W. & Small, R. L. (2010). Left, No, Right! Development of the Frame of Reference
Transformation Tool (FORT). In Proceedings of the Annual Meeting of the Human Factors and
Ergonomics Society (pp. 1022-1026). Santa Monica, CA: Human Factors and Ergonomics Society.
Wickens, C. D., Dixon, S., & Seppelt, B. (2002). In-vehicle displays and control task interferences: The
effects of display location and modality (Technical Report AFHD-02-7/NASA-02-5/GM-02-1). Savoy, IL:
University of Illinois, Aviation Research Lab.
Wickens, C. D., Gempler, K., & Morphew, M. E. (2000). Workload and reliability of predictor displays in
aircraft traffic avoidance. Transportation Human Factors Journal, 2, 99–126.
Wickens, C. D., Goh, J., Helleberg, J., Horrey, W. J., & Talleur, D. A. (2003). Attentional models of
multitask pilot performance using advanced display technology. Human Factors, 45, 360–380.
Wickens, C. D., & Gosney, J. L. (2003). Redundancy, modality, and priority in dual-task interference. In
Proceedings of the 47th Annual Meeting of the Human Factors & Ergonomics Society. Santa Monica, CA:
Human Factors and Ergonomics Society.
Wickens, C. D., & Hollands, J. G. (2000). Engineering psychology and human performance (3rd. Ed.). Upper
Saddle River, NJ: Prentice-Hall.
Wickens, C. D., Hooey, B. L., Gore, B. F., Sebok, A., & Koenicke, C. S. (2009). Identifying black swans in
NextGen: Predicting human performance in off-nominal conditions. Human Factors, 51, 638–651.
Wickens, C. D., & Horrey, W. (2009). Models of attention, distraction and highway hazard avoidance. In M.
Regan, J. D. Lee, & K. L. Young (Eds.), Driver distraction: Theory, effects, and mitigation. Boca Raton,
FL: CRC Press.
Wickens, C. D., Li, H., Santamaria, A., Sebok, A., & Sarter, N. B. (2010). Stages and levels of
automation: An integrated meta-analysis. In Proceedings of the Human Factors and Ergonomics Society
54th Annual Meeting. (pp. 389–393). Santa Monica, CA: Human Factors and Ergonomics Society.
Wickens, C. D., Hutchins, S., Carolan, T., & Cumming, J. (2012a). Attention and cognitive resource load
in training strategies. In A. F. Healy & L. E. Bourne (Eds.),
Training cognition: Optimizing efficiency, durability, and generalizability. Boca Raton, FL: CRC Press.
Wickens, C. D., Hutchins, S., Carolan, T., & Cumming, J. (2012b). Effectiveness of part task training and
increasing difficulty training strategies: A meta-analysis approach. Human Factors, 54(4).
Wickens, C. D., Hyman, F., Dellinger, J., Taylor, H., & Meador, M. (1986). The Sternberg Memory Search
task as an index of pilot workload. Ergonomics, 29, 1,371–1,383.
Wickens, C. D., & Kessel, C. (1980). The processing resource demands of failure detection in dynamic
systems. Journal of Experimental Psychology: Human Perception and Performance, 6, 564–577.
Wickens, C. D., Ketels, S. L., Healy, A. F., Buck-Gengler, C. J., & Bourne, L. E. (2010). The anchoring
heuristic in intelligence integration: A bias in need of debiasing. In Proceedings of the Annual Meeting of
the Human Factors and Ergonomics Society (pp. 2,324–2,328). Santa Monica, CA: Human Factors and
Ergonomics Society.
Wickens, C. D., Kramer, A. F., Vanasse, L., & Donchin, E. (1983). Performance of concurrent tasks: a
psychophysiological analysis of the reciprocity of information-processing resources. Science, 221(4615),
1,080–1,082.

Wickens, C. D., Lee, J. D., Liu, Y., & Gordon Becker, S. E. (2004). An Introduction to Human Factors
Engineering (pp. 289–290). Upper Saddle River, NJ.: Pearson.
Wickens, C. D., Liang, C. C., Prevett, T. T., & Olmos, O. (1996). Egocentric and exocentric displays for
terminal area navigation. International Journal of Aviation Psychology, 6, 241–271.
Wickens, C. D., & Liu, Y. (1988). Codes and modalities in multiple resources: A success and a qualification.
Human Factors, 30, 599–616.
Wickens, C. D., & Long, J. (1995). Object versus space-based models of visual attention: Implications for the
design of head-up displays. Journal of Experimental Psychology: Applied, 1, 179–193.
Wickens, C. D., & McCarley, J. S. (2008). Applied attention theory. Boca Raton, FL: CRC Press.
Wickens, C. D., Mavor, A., Parasuraman, R., & McGee, J. (1998). The future of air traffic control: Human
operators and automation. Washington DC: National Academy Press.
Wickens, C. D., Merwin, D. H., & Lin, E. L. (1994). Implications of graphics enhancements for the
visualization of scientific data: Dimensional integrality, stereopsis, motion, and mesh. Human Factors, 36,
44–61.
Wickens, C. D., Miller, S., & Tham, M. (1996). The implications of data link for representing pilot request
information on 2D and 3D air traffic control displays. International Journal of Industrial Ergonomics, 18,
283–293.
Wickens, C. D., & Prevett, T. T. (1995). Exploring the dimensions of egocentricity in aircraft navigation
displays: Influences on local guidance and global situation awareness. Journal of Experimental Psychology:
Applied, 1, 110–135.
Wickens, C. D., Prinet, J., Hutchins, S., Sarter, N., & Sebok, A. (2011). Auditory-visual redundancy in
vehicle control interruptions: Two meta-analyses. In Proceedings of the Human Factors and Ergonomics
Society Annual Meeting (pp. 1,155–1,159). Santa Monica, CA: Human Factors and Ergonomics Society.
Wickens, C. D., Rice, S., Keller, D., Hutchins, S., Hughes, J., & Clayton, K. (2009). False alerts in the air
traffic control conflict alerting system: is there a “cry wolf” effect? Human Factors, 51, 446–462.
Wickens, C. D., & Rose, P. N. (2001). Human factors handbook for displays: Summary of findings from the
Army Research Lab’s Advanced Displays & Interactive Displays Federated Laboratory. Thousand Oaks,
CA: Rockwell Scientific Co.
Wickens, C. D., Sandry, D., & Vidulich, M. (1983). Compatibility and resource competition between
modalities of input, central processing, and output: Testing a model of complex task performance. Human
Factors, 25, 227–248.
Wickens, C. D., Self, B. P., Andre, T. S., Reynolds, T. J., & Small, R. L. (2007). Unusual attitude recoveries
with a spatial disorientation icon. The International Journal of Aviation Psychology, 17, 153–165.
Wickens, C. D., Stokes, A. F., Barnett, B., & Hyman, F. (1993). The effects of stress on pilot judgment in a
MIDIS simulator. In O. Svenson & A. J. Maule (Eds.), Time pressure and stress in human judgment and
decision making (pp. 271–292). New York: Plenum.
Wickens, C. D., Thomas, L. C., & Young, R. (2000). Frames of reference for display of battlefield terrain and
enemy information: Task-display dependencies and viewpoint interaction use. Human Factors, 42, 660–
675.
Wickens, C. D., Todd, S., & Seidler, K. (1989). Three-dimensional displays: Perception, implementation, and
applications (CSERIAC SOAR-89-01). Wright-Patterson AFB, OH: Armstrong Aerospace Medical
Research Laboratory.
Wickens, C. D., Ververs, P., & Fadden, S. (2004). Head-up display design. In D. Harris (Ed.), Human factors
for civil flight deck design (pp. 103–140). UK: Ashgate.
Wickens, C. D., Vidulich, M., & Sandry-Garza, D. (1984). Principles of S-C-R compatibility with spatial and
verbal tasks: The role of display-control location and voice-interactive display-control interfacing. Human
Factors, 26, 533–543.
Wickens, C. D., Vincow, M. A., Schopper, A. W., & Lincoln, J. E. (1997). Computational models of human
performance in the design and layout of controls and displays. CSERIAC State of the Art (SOAR) Report.
Wright-Patterson AFB: Crew Systems Ergonomics Information Analysis Center.

Wickens, T. (2002). Elementary Signal Detection. San Francisco: Freeman.
Wiegmann, D., & Shappell, S. (2003). A human error approach to aviation accident analysis. Burlington
VT: Ashgate.
Wiegmann, D., Goh, J., & O’Hare, D. (2002). The role of situation assessment and flight experience in
pilots’ decisions to continue visual flight rules flight into adverse weather. Human Factors, 44, 171–188.
Wiener, E. L. (1977). Controlled flight into terrain accidents: System-induced errors. Human Factors, 19,
171–181.
Wiener, E. L. (1981). Complacency: Is the term useful for air safety? In Proceedings of the 26th Corporate
Aviation Safety Seminar (pp. 116–125). Denver, CO: Flight Safety Foundation.
Wiener, E. L. (1988). Cockpit automation. In E. L. Wiener & D. C. Nagel (Eds.), Human factors in aviation
(pp. 433–461). San Diego: Academic Press.
Wiener, E. L. (1989). Reflections on human error: Matters of life and death. In Proceedings of the 33rd
Annual Meeting of the Human Factors Society (pp. 1–7). Santa Monica, CA: Human Factors Society.
Wiener, E. L., & Curry, R. E. (1980). Flight deck automation: Promises and problems. Ergonomics, 23, 995–
1,012.
Wiener, E. L., Kanki, B. G., & Helmreich, R. L. (1993). Cockpit resource management. San Diego, CA:
Academic Press.
Wierwille, W. W., & Casali, J. G. (1983). A validated rating scale for global mental workload measurement
applications. In Proceedings of the 27th Annual Meeting of the Human Factors Society. Santa Monica, CA:
Human Factors Society.
Wierwille, W. W., & Williges, R. C. (1978, September). Survey and analysis of operator workload assessment
techniques (Report No. S-78-101). Blacksburg, VA: Systemetrics.
Wiese, E. E. & Lee, J. D. (2004). Auditory alerts for in-vehicle information systems: the effects of temporal
conflict and sound parameters on driver attitudes and performance. Ergonomics, 47, 965–986.
Wiggins, M. W. (2010). Vigilance decrement during a simulated general aviation flight. Applied Cognitive
Psychology, 25, 229–235.
Wiggins, M., & O’Hare, D. (1995). Expertise in aeronautical weather-related decision making: A cross-
sectional analysis of general aviation pilots. Journal of Experimental Psychology: Applied, 1, 305–320.
Wightman, D. C., & Lintern, G. (1985). Part-task training for tracking and manual control. Human Factors,
27, 267–283.
Wikman, A. S., Nieminen, T., & Summala, H. (1998). Driving experience and time-sharing during in-car
tasks on roads of different width. Ergonomics, 41, 358–372.
Wilkinson, R. T. (1964). Artificial “signals” as an aid to an inspection task. Ergonomics, 7, 63–72.
Willemsen, P., Colton, M. B., Creem-Regehr, S. H., & Thompson, W. B. (2009). The effects of head-
mounted display mechanical properties and field of view on distance judgments in virtual environments.
ACM Transactions on Applied Perception, 6(2), Article 8, 1–14.
Williams, A. & Davids, K. (1998) Visual search strategy, selective attention, and expertise in soccer. Research
Quarterly for Exercise and Sport, 69, 111–128.
Williams, D. E., Reingold, E. M., Moscovitch, M., & Behrmann, M. (1997). Patterns of eye movements
during parallel and serial visual search tasks. Canadian Journal of Experimental Psychology, 51, 151–164.
Williams, D. J., & Noyes, J. M. (2007). How does our perception of risk influence decision-making?
Implications for the design of risk information. Theoretical Issues in Ergonomics Science, 8, 1–35.
Williams, H. P., Wickens, C. D., & Hutchinson, S. (1994). Realism and interactivity in navigational training:
A comparison of three methods. In Proceedings of the Human Factors and Ergonomics Society 38th
Annual Meeting (pp. 1,163–1,167). Santa Monica, CA: Human Factors and Ergonomics Society.
Williams, M. D., Hollan, J. D., & Stevens, A. L. (1983). Human reasoning about a simple physical system. In
D. Gentner & A. L. Stevens (eds.), Mental models. Hillsdale, NJ: Erlbaum.
Williges, R. C. (1971). The role of payoffs and signal ratios on criterion changes during a monitoring task.

Human Factors, 13, 261–267.
Williges, R. C., & Wierwille, W. W. (1979). Behavioral measures of aircrew mental workload. Human
Factors, 21, 549–555.
Wilson, G. F. (2001). In-flight psychophysiological monitoring. In F. Fahrenberg & M. Myrtek (Eds.),
Progress in ambulatory monitoring (pp. 435–454). Seattle: Hogrefe and Huber.
Wilson, G. F. (2002). Psychophysiological test methods and procedures. In S. G. Charlton & T. G. O’Brien
(Eds.), Handbook of human factors testing and evaluation (2nd Ed., pp. 127–156). Mahwah, NJ: Erlbaum.
Wilson, G. F., & Russell, C. A. (2003). Operator functional state classification using multiple
psychophysiological features in an air traffic control task. Human Factors, 45, 381–389.
Wilson, G. F., & Russell, C. A. (2007). Performance enhancement in an uninhabited air vehicle task using
psychophysiologically determined adaptive aiding. Human Factors, 49, 1,005–1,018.
Wilson, P. N., Foreman, N., & Tlauka, M. (1997). Transfer of spatial information from a virtual to a real
environment. Human Factors, 39, 526–531.
Wine, J. (1971). Test anxiety and direction of attention. Psychological Bulletin, 76, 92–104.
Winter, J. C. F., & Dodou, D. (2011). Why the Fitts list has persisted throughout the history of function
allocation. Cognition, Technology, and Work, doi:10.1007/s10111-011-0188-1.
Wise, J. A., & Debons, A. (1987). Principles of film editing and display system design. In Proceedings of the
31st Annual Meeting of the Human Factors Society (pp. 121–124). Santa Monica, CA: Human Factors
Society.
Witmer, B. G., & Kline, P. B. (1998). Judging perceived and traversed distance in virtual environments.
Presence, 7, 144-167.
Wixted, J. T. (2007). Dual-process theory and signal-detection theory of recognition memory. Psychological
Review, 114, 152–176.
Wogalter, M. S., & Conzola, V. C. (2002). Using technology to facilitate the design and delivery of warnings.
International Journal of Systems Science, 33(6), 461–466.
Wogalter, M. S., Godfrey, S. S., Fontenelle, G. A., Desaulniers, D. R., Rothstein, P. R., & Laughery, K. R.
(1987). Effectiveness of warnings. Human Factors, 29, 599–612.
Wogalter, M. S., & Laughery, K. R. (2006). Warnings and hazard communications. In G. Salvendy (Ed.),
Handbook of human factors and ergonomics (3rd Ed., pp. 889–911). Hoboken, NJ: Wiley.
Wogalter, M. S., & Silver, N. C. (1995). Warning signal words: Connoted strength and understandability by
children, elders, and non-native English speakers. Ergonomics, 38, 2,188–2,206.
Wolf, L. D., Potter, P., Sedge, J., Bosserman, S., Grayson, D., & Evanoff, B. (2006). Describing Nurses’
work: Combining quantitative and qualitative analysis. Human Factors, 48, 5–14.
Wolfe, F. M. (1986). Meta-analysis: quantitative methods for research synthesis. Newbury Park, CA: Sage.
Wolfe, J. M. (1994). Guided search 2.0: A revised model of visual search. Psychonomic Bulletin and Review,
1, 202–238.
Wolfe, J. M. (2007). Guided search 4.0: Current progress with a model of visual search. In W. D. Gray (Ed.),
Integrated models of cognitive systems (pp. 99–119). New York: Oxford University Press.
Wolfe, J. M., & Horowitz, T. S. (2004). What attributes guide the deployment of visual attention and how do
they do it? Nature Reviews Neuroscience, 5(6), 495–501.
Wolfe, J. M., Horowitz, T. S., & Kenner, N. M. (2005). Rare items often missed in visual searches. Nature,
435, 439–440.
Wolfe, J. M., Horowitz, T. S., Van Wert, M. J., Kenner, N. M., Place, S. S., & Kibbi, N. (2007). Low target
prevalence is a stubborn source of errors in visual search tasks. Journal of Experimental Psychology:
General, 136, 623–638.
Wood, N., & Cowan, N. (1995). The cocktail party phenomenon revisited: How frequent are attention shifts

to one’s name in an irrelevant auditory channel? Journal of Experimental Psychology: Learning, Memory,
& Cognition, 21, 255–260.
Woods, D. D. (1984). Visual momentum: A concept to improve the cognitive coupling of person and
computer. International Journal of Man-Machine Studies, 21, 229–244.
Woods, D. D. (1995). The alarm problem and directed attention in dynamic fault management. Ergonomics,
38, 2,371–2,393.
Woods, D. D. (1996). Decomposing automation: Apparent simplicity, real complexity. In R. Parasuraman &
M. Mouloua (Eds.), Automation and human performance (pp. 3–18). Mahwah, NJ: Erlbaum.
Woods, D. D., Johannesen, L. J., Cook, R. I., & Sarter, N. B. (1994). Behind human error: Cognitive
systems, computers, and hindsight (State-of-the-Art Report CSERIAC 94-01). Wright-Patterson AFB, OH:
CSERIAC Program Office.
Woods, D., Patterson, E., & Roth, E. (2002). Can we ever escape from data overload? Cognition, Technology
and Work, 4, 22–36.
Woods, D. D., & Roth, E. (1988). Aiding human performance: II. From cognitive analysis to support systems.
Le Travail Humain, 51, 139–172.
Woods, D. D., Wise, J., & Hanes, L. (1981). An evaluation of nuclear power plant safety parameter display
systems. In Proceedings of the 25th Annual Meeting of the Human Factors Society. Santa Monica, CA:
Human Factors Society.
Woodworth, R. S., & Schlossberg, H. (1965). Experimental psychology. New York: Holt, Rinehart &
Winston.
Worringham, C., & Beringer, D. (1989) Operator compatibility and orientation in visual-motor task
performance. Ergonomics, 32, 387–399.
Wotring, B., Dyre, B. P., & Behr, J. (2008). Cross-talk between altitude changes and speed control during
simulated low-altitude flight. In Proceedings of the Human Factors and Ergonomics Society—52nd Annual
Meeting (pp. 1,194–1,198). Santa Monica, CA: Human Factors and Ergonomics Society.
Wouters, P., Paas, F., & van Merriënboer, J. J. G. (2008). How to optimize learning from animated models:
A review of guidelines based on cognitive load. Review of Educational Research, 78, 645–675.
Wright, D., & Davies, G. (2007). Eyewitness testimony. In F. Durso (Ed.), Handbook of Applied Cognition
(2nd Ed.). West Sussex, UK: Wiley.
Wright, D., & Loftus, E. (2005). Eyewitness memory. In G. Cohen & M. A. Conway (Eds.), Memory in the
real world (3rd Ed.) (pp. 91–106). New York: Taylor & Francis.
Wright, P. (1974). The harassed decision maker: Time pressures, distractions, and the use of evidence.
Journal of Applied Psychology, 59, 555–561.
Wright, P., & Barnard, P. (1975). Just fill in this form—A review for designers. Applied Ergonomics, 6,
213–220.
Xiao, Y., Seagull, F. J., Nieves-Khouw, F., Barczak, N., & Perkins, S. (2004). Organizational–historical
analysis of the “failure to respond to alarm” problems. IEEE Transactions on Systems, Man, and
Cybernetics. Part A. Systems and Humans, 34, 772–778.
Xu, X., Wickens, C. D., & Rantanen, E. M. (2007). Effects of conflict alerting system reliability and task
difficulty on pilots’ conflict detection with cockpit display of traffic information. Ergonomics, 50, 112–130.
Yallow, E. (1980). Individual differences in learning from verbal and figural materials (Aptitudes Research
Project Technical Report No. 13). Palo Alto, CA: Stanford University, School of Education.
Yamani, Y., & McCarley, J. S. (2010). Visual search asymmetries within color-coded and intensity-coded
displays. Journal of Experimental Psychology: Applied, 16, 124–132.
Yantis, S. (1993). Stimulus driven attentional capture. Current Directions in Psychological Science, 2, 156–
161.
Yantis, S., & Johnston, J. C. (1990). On the locus of visual selection: Evidence from focused attention tasks.
Journal of Experimental Psychology: Human Perception and Performance, 16, 135–149.

Yarbus, A. L. (1967). Eye movements and vision. New York: Plenum Press.
Ye, N., & Salvendy, G. (1994). Quantitative and qualitative differences between experts and novices in
Chunking computer software knowledge. International Journal of Human–Computer Interaction, 6, 105–
118.
Yechiam, E., & Hochman, G. (in press). Losses as modulators of attention: Review and analysis of the unique
effects of losses over gains. Psychological Bulletin.
Yeh, M., Merlo, J. L., Wickens, C. D., & Brandenburg, D. L. (2003). Head up versus head down: The costs
of imprecision, unreliability, and visual clutter on cue effectiveness for display signaling. Human Factors,
45, 390–407.
Yeh, M., Multer, J., & Raslear, T. (2009). An application of signal detection theory for understanding driver
behavior at highway-rail grade crossings. In Proceedings of the Human Factors and Ergonomics Society—
53rd Annual Meeting (pp. 1776–1780). Santa Monica, CA: Human Factors and Ergonomics Society.
Yeh, M., & Wickens, C. D. (2001). Attentional filtering in the design of electronic map displays: A
comparison of color coding, intensity coding, and decluttering techniques. Human Factors, 43, 543–562.
Yeh, M., Wickens, C. D., & Seagull, F. J. (1999). Target cuing in visual search: The effects of conformality
and display location on the allocation of visual attention. Human Factors, 41, 524–542.
Yeh, Y. Y., & Wickens, C. D. (1988). The dissociation of subjective measures of mental workload and
performance. Human Factors, 30, 111–120.
Yin, S. Q., Wickens, C. D., Pang, H., & Helander, M. (2011). Comparing rate of change cues in trend displays
for a process control system. In Proceedings of the 55th Annual Meeting of the Human Factors and
Ergonomics Society. Santa Monica, CA: Human Factors and Ergonomics Society.
Young, M. J., Landy, M. S., & Maloney, L. T. (1993). A perturbation analysis of depth perception from
combinations of texture and motion cues. Vision Research, 33, 2,685–2,696.
Young, M. S., & Stanton, N. A. (2002). Malleable attentional resources theory: A new explanation for the
effects of mental underload on performance. Human Factors, 44, 365–375.
Young, S. L., Wogalter, M. S., & Brelsford, J. W. (1992). Relative contribution of likelihood and severity of
injury to risk perceptions. In Proceedings of the 36th Annual Meeting of the Human Factors and
Ergonomics Society (pp. 1,014–1,018). Santa Monica, CA: Human Factors and Ergonomics Society.
Yuille, J. C. & Bulthoff, H. H. (1995). A Bayesian framework for the integration of visual modules. In T.
Inui & J. L. McClelland (Eds.), Attention and performance: Vol 16. Information integration in perception
and communication (pp. 47–70). Cambridge, MA: MIT Press.
Zadeh, L. A. (1965). Fuzzy sets. Information and Control, 8, 338–353.
Zakay, D. (1993). The impact of time perception processes on decision making under time stress. In O.
Svenson & A. J. Maule (Eds.), Time pressure and stress in human judgment and decision making (pp. 59–
72). New York: Plenum.
Zander, T., & Kothe, C. (2011). Towards passive brain-computer interfaces: applying brain-computer
interface technology to human-machine systems in general. Journal of Neural Engineering, 8, 1–5.
Zarcadoolas, C. (2010). The simplicity complex: exploring simplified health messages in a complex world.
Health Promotion International.
Zeitlin, L. R. (1994). Failure to follow safety instructions: Faulty communications or risky decisions? Human
Factors, 36, 172–181.
Zekveld, A. A., Heslenfeld, D. J., Festen, J. M., & Schoonhoven, R. (2006). Top-down and bottom-up
processes in speech comprehension. NeuroImage, 32, 1,826–1,836.
Zhai, S. (2008). On the ease and efficiency of human-computer interfaces. In ETRA ‘08 Proceedings of the
2008 Symposium on Eye Tracking Research & Applications (pp. 9–10). New York: Association for
Computing Machinery.
Zhai, S., Kristensson, P. O., Appert, C., Andersen, T. H., & Cao, X. (in press). Foundational issues in touch-

screen stroke gesture design—An integrative review. Foundations and Trends in Human–Computer
Interaction.
Zhang, J., & Norman, D. A. (1994). Representations in distributed cognitive tasks. Cognitive Science, 18,
87–122.
Zhang, L., & Cao, C. (2010). The effect of image orientation on a dynamic laparoscopic task. In Proceedings
of the 54th Annual Meeting of the Human Factors and Ergonomics Society. Santa Monica, CA: Human Factors and
Ergonomics Society.
Zheng, Y., Brown, M., Herdman, C. M., & Bleichman, D. (2007). Lane position head-up displays in
automobiles: Further evidence for cognitive tunneling. In 14th International Symposium on Aviation
Psychology. Dayton, OH: Wright State University. Available at http://www6.carleton.ca/ace/projects-and-
publications/heads-up-displays/
Zsambok, C. E., & Klein, G. (1997). Naturalistic decision making. Mahwah, NJ: Erlbaum.

NAME INDEX
A
AARC Joint Commission, 25
Aaslid, R., 355
Abney, D. H., 211
Ackerman, P. L., 233, 371
Acta Psychologica, 223, 228
Adami, A., 398
Adamic, E. J., 108
Adams, A., 79, 192
Adams, B. D., 243
Adams, J. A., 27, 243
Adams, M. J., 215
Adams, J., 317
Adapathya, R., 337
Adelman, L., 261
Adelstein, B. D., 158, 159
Adhikari, N. K., 283, 379, 384, 385
Adlam, A., 200
Agarwal, 231
Agrawal, S., 153
Agrawala, M., 92
Aguillar, M., 343
Ahlstrom, V., 399
Aichele, S. R., 30
AIM, 311
Ainsworth, L., 282, 315, 349
Akhtar, S. C., 106
Alcalde, C., 343
Aldrich, K., 185, 186
Alexander, A. L., 73, 74, 77, 114, 115, 131, 133, 333, 361, 393
Alford, D., 79
Algom, D., 68
Alkov, R., 365
Allen, G., 255, 261
Allen, P. A., 162
Allen, R. C., 159
Allen, R. J., 199, 205
Allen, R. W., 153
Allison, R. S., 107, 111, 119, 120
Allport, A., 209, 220
Allport, D. A., 220, 293, 332
Alluisi, E., 300, 307
Alm, H., 344
Altmann, E. M., 332, 334, 335, 336
Alvino, C., 275
Amadieu, F., 184
Ambinder, M. S., 73, 74, 77, 135, 332, 361
Amer, T. S., 89
Andersen, T. H., 243
Anderson, A. H., 192
Anderson, G. J., 106
Anderson, J., 223, 227
Anderson, J. D., 399
Anderson, J. R., 33, 209, 221, 233, 237
Anderson, M. C., 205, 206
Anderson, P., 154
Andersson, J., 213
Ando, J., 373
Andre, A. D., 82, 145, 228, 296, 298, 301
Andre, T. S., 178
Andresen, G., 100, 283, 369
Angel, H. A., 243
Angell, L., 338, 340
Angelone, B. L., 54
Angus, R., 31, 56
ANSI, 191
Anthony, B., 379, 387
Antonijevic, S., 192
Appert, C., 243
Arengo, R., 167
Aretz, A. J., 125, 126, 127, 129, 131, 145

Argote, L., 213, 214
Arkes, H. R., 262, 272
Arnett, P. A., 206
Arruda, J. E., 85
Arthur, J. J., 243
Artman, H., 195, 215
Asch, S. M., 385
Ash, A., 107
Askew, S., 384
Astésano, C., 78
Astur, R. S., 55, 59
Atchley, P., 30, 67, 325, 359
Austria, P. A., 79, 174
Avery, B., 156, 157
Avnaim-Pessoa, L., 263
Ayaz, H., 349, 356
Ayres, T. J., 184
Azuma, R. T., 155

B
Baad, E., 364
Baber, C., 56, 147, 291, 301, 307
Bachlechner, M. E., 154
Backs, R. W., 356
Baddeley, A. D., 26, 82, 198, 199, 200, 201, 204, 205, 216, 220, 327, 331, 353
Baghieri, N., 391
Bagnall, T., 329, 349
Bahner, E., 388, 392, 404
Bahri, T., 395, 397
Bahrick, H. P., 322, 342
Bailey, B. P., 334, 347
Bailey, N., 391
Bailey, R. W., 170, 177, 180, 205, 235
Bain, J. D., 323, 349
Bainbridge, L., 388, 391
Bak, P., 15, 23
Baker, C. A., 167
Baker, C. H., 28, 30
Baker, P., 151, 152, 153, 158
Baker, R., 228, 323, 349
Balakrishnan, R., 240
Balaubramanian, V., 333, 334, 336
Baldwin, C. L., 355, 365, 398, 400
Ball, K., 60
Balla, J., 256, 259
Ballard, D. H., 325
Ballard, K., 270
Ballenson, J. N., 154
Banbury, S. P., 65, 78, 80, 81, 82, 195, 214, 215, 216, 218, 219, 220
Banich, M. T., 201, 331, 344, 371
Banks, A. P., 215
Bar, M., 172
Barclay, R. L., 58, 60
Barczak, N., 23, 394
Bareket, T., 227, 343, 371
Barfield, W., 151, 158, 209
Bar-Hillel, M., 258, 259
Barnard, P., 179
Barnes, L. R., 23, 394
Barnes, M., 379, 395, 397, 399
Barnett, B. J., 69, 75, 178, 283
Barr, M., 344
Barron, K., 39
Barrouillet, P., 200, 203, 326
Barsalou, L. W., 235
Barshi, E., 332, 333, 334, 335, 337
Barton, C., 256
Barton, R. R., 39
Bartram, D. J., 130, 307
Basapur, S., 193
Bastardi, A., 262
Bastien, J. M. C., 54
Bateman, S., 92
Bates, C., 286
Bates, D., 377

Bates, E., 175
Bathalon, G. P., 361
Bauer, K., 229
Baumal, R., 163, 189
Baus, J., 127
Bavelier, D., 293, 343
Bazerman, M., 272, 282
Beach, L. R., 15
Beaman, C. P., 81
Beatty, J., 356, 361
Becellio, E., 112
Becic, W., 338
Beck, H. P., 393
Beck, M. R., 54, 56, 58
Becker, C. A., 139, 236
Becker, R., 143
Becker, A. B., 30
Beckner, J. K., 211
Bederson, B. B., 145
Behr, J., 108
Behre, J., 108
Behrmann, M., 58
Bell, B. S., 229
Bellenkes, A. H., 218, 342, 357
Bellinger, K. D., 213
Bellows, C. F., 120
Beltowska, J., 338
Benbasat, I., 117
Benbassat, D., 255
Benight, C., 23, 394
Ben-Ishai, R., 372
Bennett, A. M., 108
Bennett, K., 71, 75, 77, 93, 94, 100, 222, 256, 384
Bennett, K. B., 100, 131, 138, 144
Bennett, W., 243
Ben-Shakhar, G., 9
Bents, F. D., 340
Berends, I. E., 221
Berger, 354
Berglund, B., 81, 82
Bergman, J. S., 56
Beringer, D. B., 76, 99, 299
Berkun, M. M., 364, 365
Berman, B. A., 332
Berman, M. G., 325
Bernardin, S., 203
Bernhard, D., 89
Berry, B. F., 81, 82
Berry, D. C., 81, 82, 219, 220
Bersh, P., 34
Berson, B. L., 172
Bertelson, P., 292, 304, 306
Bertin, J., 97, 138
Bertolotti, H., 121
Besson, M., 78
Betrancourt, M., 144
Bettman, J. R., 256, 261, 263, 274, 275, 368
Beyene, J., 283, 379, 384, 385
Bhaskara, A., 241
Bialystok, E., 343
Biederman, I., 171, 172
Bielock, S., 289
Biemond, R., 154
Biggs, S. J., 151
Billings, C., 377, 378, 380, 387, 400
Billings, D. R., 388
Billington, M. J., 307
Binford, J. R., 26
Birbaumer, N., 275
Bird, J., 153
Birnbaum, M. H., 251
Bisantz, A., 388
Bittner, A. C., 352
Bizo, L. A., 214
Bjork, R. A., 180, 226, 227, 229, 233, 234, 277

Bjorneseth, O., 158
Black, P., 261
Blackshaw, L., 275
Blandford, A., 314
Bleckley, M. K., 199, 216, 372
Bluethmann, W., 384
Blumer, C., 272
Bobrow, D. G., 323, 324
Bocker, M., 119
Boe, O. C., 219
Boehm-Davis, D. A., 88, 92, 334, 336, 342, 343
Boeing Company, 311, 314
Bogner, M., 319
Boian, R., 154
Bojko, A., 344
Boles, D. B., 330, 352
Bolia, R. S., 121, 122
Bolstad, C. A., 195
Booher, H. R., 1, 181
Boot, W. R., 8
Bootsma, R. J., 107
Borman, W. C., 371
Bornstein, B. H., 206
Boron, J. B., 184
Borowsky, M., 365
Bortalussi, M. R., 358
Bos, J. C., 158
Boschelli, M. M., 145
Boss, S. M., 56
Bosserman, S., 333
Botturi, D., 151, 153
Botzer, A., 15, 23
Boucek, G. P., 349, 350
Bourne, L., 223, 228, 232, 233, 243, 261, 281
Bourne, P., 361
Bower, A. B., 186
Bower, G. H., 178
Bowne, S. F., 120
Boyle, E. A., 192
Boyle, L., 53, 328, 340
Bradfield, A. L., 23
Brainard, R. W., 300
Braithwaite, M. G., 121
Brake, G., 127, 145
Brandenburg, D. L., 62, 63, 134, 156, 389, 390, 391
Bransford, J. D., 178
Braseth, A. O., 100
Braun, C. C., 186
Braune, R., 228, 291, 372
Braunstein, M. L., 110
Bregman, A. S., 79, 80
Brehmer, B., 250, 276
Brelsford, J. W., 273
Bremen, P., 120
Bremond, R., 54
Brenner, L., 246, 256, 257, 259, 277, 278
Bresley, B., 334
Breslow, L. A., 97
Bresnick, T., 261
Breton, R., 215, 219
Brewer, G. A., 212
Brewer, N., 22, 23, 177, 276
Brewster, C., 239
Brewster, S. A., 174
Brezinski, A. S., 379
Breznitz, S., 25
Brickner, M., 324, 331, 343
Bridges, A., 79
Bridwell, D. A., 30
Briggs, G., 230, 343
Broadbent, D. E., 26, 27, 28, 30, 61, 67, 162, 164, 175, 204, 304, 307, 336, 341, 365
Broadbent, M. H., 162, 164
Brock, D. P., 334, 335
Brookhuis, K. A., 340, 354, 391
Brookings, J., 354
Brooks, C., 92
Brooks, J., 340

Brooks, J. O., 125, 126, 127
Brooks, L. J., 216
Brooks, V., 131, 144
Brouwer, A.-M., 275
Brown, G. D. A., 203
Brown, J., 203
Brown, K., 121
Brown, M. E., 137
Brown, N. L., 211
Brown, M., 141
Brown, S. D., 33
Brownell, H. H., 68
Brungart, D. S., 121
Bruni, S., 149, 379, 399
Brunicardi, F. C., 120
Bruno, N., 112
Brunye, T., 123
Bruyer, R., 201
Bryant, D., 221
Bryant, D. J., 243
Buchwald, 233
Buck-Gengler, C. J., 261, 281
Budd, D., 206
Buehler, R. 276
Bulkley, N. K., 104, 107, 329
Bullmore, E., 353
Bulthoff, H., 135
Bunce, S., 349, 356
Bundesen, C., 52
Burdick, M., 262, 392, 404
Burdon, T. A., 120
Burgess, N., 203
Burgess-Limerick, R., 298, 299, 300
Burke, C. S., 194, 195, 223
Burki-Cohen, J., 227
Burns, C. M., 94, 100, 101, 102, 256, 283, 369
Burov, O., 356
Burr, B. J., 146
Bursk, J. H., 330, 352
Bushyhead, J. B., 259
Butcher, L. M., 373
Butler, L. T., 219, 220
Butner, S. E., 386
Buttigieg, M. A., 75
Byers, J. C., 352
Byrne, E. A., 357, 380, 397
Byrne, M. D., 314

C
Cabeza, R., 240
Cacioppo, J. T., 347
Caclin, A., 37
Cades, D. M., 342, 343
Caggiano, D., 28, 353
Caggiano, J. M., 56
Cahill, M. C., 96
Cahillane, M., 301
Cain, B., 151
Caird, J., 56, 339
Cairns, P., 314
Caldwell, B., 309
Camacho, M. J., 172
Camilli, M., 357
Camos, V., 200, 203, 326
Campbell, J., 194, 232
Campbell, M., 114
Canham, M. S., 222
Canning, J. M., 221
Cannon, J. R., 209
Cannon-Bowers, J. A., 213, 238, 370
Canny, J., 193
Cao, C., 127
Cao, X., 154, 243
Caplan, D., 372
Carbonell, J. R., 52
Card, S. K., 137, 138, 145, 146, 311
Carey, T. T., 142

Carlander, O., 121, 122
Carlson, L., 127, 135, 136, 137
Carlson, C., 22
Carolan, T., 229, 230, 231, 233, 342
Carpenter, P. A., 88, 92, 93, 164, 353, 372
Carrasco, M., 38
Carretta, T. R., 216, 371
Carriere, J., 338
Carroll, J. M., 229, 236
Carswell, C. M., 5, 38, 72, 74, 77, 85, 86, 87, 88, 89, 90, 92, 93, 100, 182, 283
Carter, R. C., 96
Carterette, E., 80
Casali, J. G., 352, 358
Casey, E. J., 75
Casey, S., 390
Casner, S. M., 87, 220, 221
Casper, J., 379
Catrambone, R., 229, 234
Cattell, R. B., 371, 372
Caudek, C., 112, 116
Caufield, K., 104, 107, 329
Causse, M., 199, 333, 372
Cellier, J. M., 208, 336
Cepeda, N., 233
Cerella, J., 344
Chabris, C. F., 55, 56, 387
Chaffin, D., 113, 116
Chan, A. H. S., 124, 128, 296, 297, 298, 299, 300
Chan, M., 30, 325, 359
Chan, W. H., 297
Chan, A., 294, 295
Chandler, J., 233, 274
Chandler, P., 74, 181, 182, 183, 231, 236
Chandrasekaran, B., 88
Chaney, F. D., 15, 26, 362
Chang, D., 149, 380
Chao, C., 337
Chapanis, A., 294
Chapman, P., 218, 260
Charissis, V., 65
Charness, N., 190, 209
Chase, W. G., 136, 208, 209, 216, 262
Chau, A. W., 67
Cheal, M., 62
Chebat, J.-C., 186
Chein, J. M., 166
Chen, I., 159
Chen, J., 92, 106
Chen, J. Y. C., 388
Cheng, P. C. H., 87, 88, 91
Chesney, G. L., 327, 355
Cheyne, J. A., 338
Chi, C. F., 15, 61, 136
Chia, N., 333, 342
Chignell, M. H., 121, 395
Childress, M. E., 358
Childs, J. M., 29
Chiles, W. D., 371
Chincotta, D., 200
Chipman, S. F., 238
Chodorow, M., 165
Choi, J. H., 356
Choi, S., 156
Chou, K., 58
Chrisman, S. E., 76
Christ, R. E., 96, 352
Christensen, J. C., 398
Christenssen-Szalanski, J. J., 259
Chronicle, E. P., 222
Chudy, A., 130
Chui, Y. P., 216, 219
Chun, M. M., 61
Cissell, G. M., 60
Cizarre, C., 126
Clamann, M. P., 395
Clark, B. A., 199

Clark, C., 81, 82
Clark, H. H., 68
Clark, J. J., 53, 55
Clark, M. C., 178
Clark, R. C., 153
Clark, H., 262
Clarke, E., 31, 388, 397
Clarkson, G., 209, 210
Clawson, D. M., 243
Clayton, K., 24, 25, 379, 394, 395
Clement, M. R., 59
Cleveland, W. S., 89, 90, 143
Clifasefi, S. L., 56
Clough, P. J., 153
Coan, J. A., 242
Cockburn, A., 50
Coffey, E. B. J., 275
Cohen, A. L., 241
Cohen, G., 235
Cohen, M. S., 246, 281
Cohen, S., 360
Cohen, J., 373
Colcombe, A., 25, 328, 333, 335
Cole, W. G., 75
Coleman, N., 328
Coles, M. G. H., 303
Collett, C., 338, 339, 340
Collins, A. M., 235
Colom, R., 372
Colquhoun, H. Jr., 155
Colquhoun, W. P., 26
Colton, M. B., 158
Coman, A., 241
Combs, B., 273, 274
Commarford, P. M., 56, 307
Comstock, J. R., 104
Connor, O., 380, 384
Conrad, R., 308
Consalus, K. P., 52, 329, 338
Conway, A. R. A., 199
Conzola, V. C., 185, 186
Cook, D., 141
Cook, G. I., 212
Cook, M. B., 96, 113, 120, 158, 261, 262, 282
Cook, R. I., 250, 280, 311, 318, 366
Cooke, A. D. J., 271
Cooke, N. J., 195, 213, 214, 238, 380, 384
Cooney, J. W., 344
Cooper, J. M., 340
Cooper, M., 182
Cooper, L., 125
Coovert, M. D., 85
Corcoran, K. J., 122
Corl, L., 97, 298
Corley, R. P., 371, 373
Corn, D., 254
Corrigan, B., 1, 255, 258, 283
Corwin, J., 18
Cosenzo, K., 379, 399
Cottle, J. L., 175, 177
Coughlin, M. P., 78, 80
Courtney, A. J., 97
Coury, B. G., 56, 63, 178, 401
Covas, C. M., 106
Cowan, N., 78, 204
Cowen, E. L., 366
Cowen, M. B., 114, 115
Cowgill, J. L., 121
Cowie, J. R., 177
Coyne, J., 355
Craig, A., 59, 151, 152
Craig, I. W., 373
Craik, F. I. M., 230, 240, 343
Craik, K. W. J., 304
Crandall, B., 218, 238, 248, 268
Crandall, J. W., 399

Credé, M., 193
Creelman, C. D., 10, 15, 18, 19
Creem-Regeher, S. H., 158
Crocoll, W. M., 178, 401
Croft, D., 215, 218, 219, 220
Crossley, S. A., 177
Crowell, J. A., 327
Crundall, D., 125, 127
Crundall, L., 127
Crutchfield, J. M., 372
Cukor, J., 154
Cumming, J., 230, 231, 233, 342
Cummings, M. L., 149, 379, 380, 399
Curry, M. B., 173, 174
Curry, R. E., 378, 388, 389
Curry, M., 172
Cutting, J. E., 111, 112
Czerwinski, M., 137, 138, 139, 141, 143, 144

D
D’Orsi, C. J., 21
Dahlström, Ö., 213
Dailey, S., 22
Dallman, R. C., 121
Dalton, R., 135, 136, 137
Damos, D. L., 230, 333, 334, 335, 342, 343
Danaher, J. W., 197
Daneman, M., 372
Danielsson, H., 213
Danzigera, L., 263
Dark, V. J., 203, 205, 206
Darken, R. P., 152
Darker, I. T., 167
Darlington, K., 379
Das, A., 29
Dattel, A. R., 216, 219
Davenport, W. G., 27
Davids, K., 56
Davies, A., 326
Davies, D. R., 26, 27, 325, 360, 363, 365, 372
Davies, G., 22, 206
Davis, E. M., 314
Davis, J. H., 261
Davis, K. B., 255
Davis, M. H., 190
Davis, O. S., 373
Davis, R., 293
Dawe, L. A., 393
Dawes, R. M., 250, 255, 257, 258, 283
de Araujo, 154
De Bondt, W. F. M., 252, 266
de Bruijn, O., 173
de Jong, R., 344
de la Peña, N., 153
de la Puente, P., 399
de Visser, E., 31, 356, 388, 397
de Waard, D., 340, 354, 391
Deaton, J. E., 28, 395, 397
Deaton, 325
Debecker, J., 307
Debons, A., 144
Deffenbacher, K. A., 206
DeFries, J. C., 371, 373
Degani, A., 197, 241, 379, 387, 400
deGroot, A. D., 209
Dehaise, F., 199, 333, 372
Deininger, R. L., 296, 307
Dekel, A., 68
del Millan, J., 275
DeLucia, P. R., 103, 107, 108, 117, 128
Dember, W. N., 18, 19, 26, 27, 28, 30
Dempsey, P., 309
Denton, G. G., 108
Department of the Army, 383
Derakshan, N., 54
Derrick, W. L., 356
Desaulniers, D. R., 185

Deshe, O., 97
Desmedt, R., 307
Desmond, P., 360, 363
DeSota, C. B., 179
D’Esposito, M., 344
Dessouki, M. I., 337
DeThorne, L., 190
Detweiler, M. C., 207, 342
Deubel, H., 53, 55
Devereaux, P., 283, 379, 384, 385
Devine, P. G., 23
Dewar, R. E., 179, 287
DeYoung, C., 373
Di Nocera, F., 357
Dickison, D., 237
Diehl, A. E., 194
Dietz, A., 377
Dietz, P. H., 154
Difede, J., 154
Diftler, M., 384
DiMatteo, M. R., 3
Dinges, D. F., 357
Dingus, T. A., 338, 340
Dino, R. N., 193, 366
Dismukes, R. K., 211, 212, 332, 333, 334, 335, 337, 342, 343
Dittmar, M. L., 26
Divekar, K., 342, 343
Dixon, S. R., 24, 67, 149, 328, 332, 335, 380, 389, 391, 394
Doane, S. M., 209, 216, 218, 237
Dobbs, A. R., 344
Dockrell, J., 81
Doctor, P., 403
Dodhia, R., 334
Doherty, G., 100, 138
Doherty, M. E., 262
Doherty, M. L., 368
Doll, T. J., 29
Domingo, M., 38
Domini, F., 112, 116
Domowitz, I., 381
Donald, F. M., 31
Donaldson, M., 1
Donchin, E., 230, 275, 303, 324, 327, 351, 355, 374
Donders, F. C., 287, 303
Dong, X., 222
Donkin, C., 33
Donmez, B., 328, 340
Donovan, 233
Dorneich, M. C., 334, 395, 397, 398, 402
Dornheim, M. A., 333, 389
Dougherty, E., 316
Dougherty, M. R. P., 210, 216, 221, 316, 360, 368
Doughty, A. S., 58, 60
Douissenbekov, E., 55
Doyle, P. C., 153
Dragicevic, P., 240
Draper, M. H., 158
Drazin, D., 286
Drews, F. A., 69, 75, 327, 339, 340
Driskell, J. E., 193, 368
Drivdahl, S. B., 54, 276
Driver, A., 206
Driver, J., 81
Druckman, D., 226, 227
Drury, C. G., 9, 15, 20, 29, 57, 59, 60, 61, 146, 289, 388
Dryer, D. C., 402
Du, R., 354
Dubrowski, A., 125, 127, 131
Duchon, A. P., 104, 106
Dudfield, H., 195, 216
Duffy, E., 361
Duffy, S. A., 56, 63
Duggan, G. B., 289, 365
Dulaney, C. L., 166
Dumais, S. T., 239
Duncan, J., 58, 68, 301, 341
Durding, B. M., 139, 236

Durgin, F. H., 158
Durso, F. T., 214, 215, 216, 218, 219, 372
Duschek, S., 355, 356
Dutcher, J. S., 184
Dutta, A., 232, 273
Dwyer, F. M., 184
Dwyer, J., 242
Dye, M., 293
Dyre, B. P., 15, 91, 104, 106, 107, 108, 329
Dysart, J., 22
Dyson, B. J., 37
Dzindolet, M. T., 393

E
Eastwood, J. D., 347
Eberhardt, J. L., 56
Eberts, R. E., 113
Edelmann, J., 184
Edland, A., 365, 368
Edwards, J. D., 60
Edwards, W., 247, 249, 260, 268, 282
Edworthy, J., 79, 174, 175, 185, 186, 192
Efendov, A., 97
Egan, J., 80
Egeth, H. E., 35, 62, 63
Eggemeier, F. T., 351, 352
Egger, M., 3
Egstrom, G., 364, 365
Ehrenreich, S. L., 168
Ehrlich, J. A., 159
Eichstaedt, J., 163
Eid, J., 219
Eidelson, B. D., 154
Einhorn, H. J., 260, 261, 262, 264, 279, 280, 281
Einstein, G. O., 211, 212, 231, 334, 336
Eisen, L. A., 194, 195
Eishita, F. Z., 38
Elaad, E., 9
Elliott, E. M., 81, 82
Ellis, H., 206
Ellis, N. C., 167, 204
Ellis, R. D., 207
Ellis, S. R., 115, 117, 120, 152, 157, 159, 170
Ely, K., 229
Emerson, M. J., 199, 201
Emery, L., 80, 216
Emilsson, M., 213
Emmelkamp, P. M. G., 153
End, C., 81
Endsley, M. R., 126, 195, 214, 215, 216, 217, 218, 219, 382, 390, 397, 401
Engle, R. W., 55, 199, 204, 206, 370, 371, 372
English, W. K., 146
Englund, C. E., 76
Enns, J. T., 113
Enomoto, Y., 114, 115, 116, 144, 145
Entin, E. B., 250
Ephrath, A. R., 357
Ercoline, W., 103
Erdmann, U., 355
Erdogmus, D., 398
Erev, I., 251, 271, 272, 274, 392
Ericsson, A., 209
Ericsson, K. A., 208, 210, 216, 243
Eriksen, B. A., 65, 67, 275
Eriksen, C. W., 36, 65, 67, 275
Eriksson, L., 121, 122
Erlick, D. E., 158
Ersner-Herschfield, H., 270
ESSAI, 195
Estepp, J. R., 398
Estrada, A., 121
Eulitz, C., 189
Evanoff, B., 333
Evans, J. E., 332
Evans, J. St. B. T., 246
Evtushenko, V. F., 364
Eyrolle, H., 208, 336

F
Fadden, S., 65, 66, 155
Falco, C. M., 361
Falk, V., 120
Fan, J., 373
Fann, J. I., 120
Farmer, E., 81
Farrell, S., 390
Faust, D., 258
Favelle, S., 106, 107
Fedak, 142
Federmeier, K. D., 163
Fedota, J. R., 275
Feher, B. A., 380
Feigh, K. M., 395, 397
Fein, R. M., 237
Feiner, S., 155
Feldon, D. F., 238
Felfoldy, G. L., 37
Felton, E. A., 275
Fendrich, D. W., 167
Fennema, M. G., 275, 325
Ferguson, T. I., 276
Ferrarina, A., 153
Ferrell, W. A., 15, 169
Ferrer, E., 30
Ferrez, P. W., 275
Ferris, T., 393
Ferris, T., 398
Festen, J. M., 189
Ficks, L., 35
Figner, B., 274
Filik, R., 167
Finlay, J. E., 92
Finucane, M., 251, 260
Fiore, S. M., 222
Fischer, E., 66
Fischer, P., 81, 82
Fischer, U., 247, 248, 250, 255, 265, 275, 278, 282, 289, 291, 364, 385, 392, 393
Fischhoff, B., 258, 259, 275, 276, 277, 279, 280, 281, 282
Fisher, D., 137, 138, 139, 141, 143, 144
Fisher, D. L., 20, 56, 63, 342, 343
Fisk, A. D., 29, 59, 162, 166, 184, 233, 324, 341, 342, 343, 344
Fitts, P. M., 1, 146, 284, 285, 286, 289, 290, 295, 296, 300, 301, 307, 322, 342
Fitzpatrick, D., 175
Flach, J. M., 75, 77, 93, 94, 104, 106, 107, 108, 131, 144, 146, 215
Flannagan, M., 339
Flavell, R., 96
Fletcher, G., 195
Flexman, R., 366
Flight International, 293
Flin, R., 195, 368
Fogarty, G., 371
Fogg, B. J., 402
Folk, C., 55
Folkman, S., 363, 366
Fong, 281
Fontenelle, G. A., 185, 260
Foodsell, C., 22
Ford, J. K., 368
Foreman, N., 153
Forlano, J. G., 82, 200
Forsberg, A. S., 92
Forsythe, A., 172
Forsythe, C., 145
Fossella, J., 373, 399
Fougnie, D., 53, 56
Foushee, H. C., 194
Fowler, F. D., 179, 207
Fracker, M. L., 341
Francolin, C. M., 172
Frankenberger, S., 89
Frankenstein, J., 135
Franklin, N., 124, 128
Frankmann, J. P., 27
Frantz, J. P., 73

Frecker, M., 39
Frederick, S., 258, 259, 273
Freed, M., 337
Freeman, J. T., 281
Frese, M., 231
Freude, G., 355
Freuen, M., 211
Fricker, L., 80, 216
Friederici, A. D., 206
Friedland, N., 366, 367, 370
Friedman, A., 327
Friedman, D. B., 177
Friedman, N. P., 199, 201, 371, 372, 373
Frolov, M. V., 364
Fu, W. T., 325
Fuchs, A., 301
Fuld, R., 399
Fulero, S. R., 22
Funk, K., 337, 380
Furnas, G., 141
Furness, T., 151

G
Gagner, M., 386
Gajendren, R. S., 193
Gales, A. G., 167
Gallimore, J. J., 137
Galster, S., 391
Gane, 234
Garabet, A., 277, 333, 339
Garbis, C., 195, 215
Gardiner, J. M., 241
Gardner, 282
Garell, P. C., 275
Garg, A. X., 283, 379, 384, 385
Garland, D. G., 214
Gärling, T., 222, 250, 268, 270, 271, 272
Garner, W. R., 35, 37, 76
Garness, S. A., 106
Garton, T., 270
Garzonis, S., 174
Gawande, A., 320, 377
Gaynor, M., 365
Gazzaley, A., 344
Gazzaniga, M. S., 347
Gebhard, J. W., 33
Geelhoed, E., 158
Geisler, W. S., 58, 104
Gejets, P., 184
Gelade, G., 56, 58
Gempler, K., 148
Genest, A., 92
Gentner, D., 94
Gentzler, M. D., 56, 307
Geri, G. A., 106
Gerret, D., 167
Gerson, A. D., 275
Getty, D. J., 21, 24, 119
Getzmann, S., 120
Gevins, A., 354
Ghuman, A. S., 172
Giambra, 344
Giard, M.-H., 37
Gibb, R. W., 107, 113
Gibson, J. J., 103, 104, 106
Gigerenzer, G., 263
Gilchrist, I. D., 357
Giley, R. H., 121
Gilkey, R. H., 121
Gill, R. T., 239
Gillam, B. J., 111
Gillan, D. J., 85, 87, 88, 92, 93
Gillie, T., 336, 341
Gillies, M., 154
Gillingham, K., 113
Gilovich, T., 247, 252

Gilutz, H., 377
Gittelman, S. S., 94, 97
Giudice, N. A., 121
Giuliano, T., 213
Glass, G. V., 3
Glavin, S. J., 97
Glencross, M., 158
Glisky, E., 212
Glover, B. L., 357
Gluck, M. D., 181
Gobet, F., 208, 209, 210
Goddard, K., 393
Godfrey, C. N., 185, 340
Goh, J., 51, 52, 248, 328, 391
Gold, M., 58, 62, 73
Goldberg, 281
Golden, T. D., 193
Goldfar, D., 377
Goldstein, E. B., 109
Goldwasser, J. B., 298
Golestani, N., 189
Gollan, T. H., 343
Golledge, R. G., 121
Gollwitzer, P. M., 212
Gong, L., 192
Gonthier, D., 24
Gonzales, V. M., 333
Gonzalez, C., 216
Goodale, M. A., 103, 120
Goodman, M. J., 340
Goodrich, M. A., 399
Goodwin, G. A., 243
Gopher, D., 227, 230, 308, 311, 319, 321, 324, 325, 331, 342, 343, 351, 371
Gordon, C., 338, 339
Gordon, R. L., 78
Gordon, S. E., 239
Gordon, U., 360
Gordon-Becker, S. E., 1, 6, 92, 185, 282
Gore, B. F., 52, 53, 54, 55, 287, 404
Gorman, J. C., 195, 213, 214
Gosney, J. L., 329
Gould, J. D., 139, 236
Goza, M., 384
Grabbe, J. W., 162
Graf, P., 230, 390
Graham, T., 211, 212, 334, 336
Gramopadhye, A. K., 60
Grassia, J., 246, 278
Gratton, E., 291, 356
Graw, T., 56
Gray, J. R., 373
Gray, R., 106, 323
Gray, W. D., 3, 325, 398
Grayson, D., 333
Green, A. E., 373
Green, C. S., 343
Green, P., 119, 120, 338, 356
Green, R. F., 121
Green, S., 293
Green, D., 219, 343
Green, D. M., 10, 12, 14, 17, 33
Greenfield, J., 177
Greenlaw, R. L., 58, 60
Greenwald, C. Q., 29
Greenwood, P. M., 373, 399
Gregory, R. L., 113
Gregory, M., 26, 30
Grenell, J. F., 356
Grether, W. F., 95, 96, 167
Grey, C., 185, 186
Grice, H. P., 402
Griffin, D., 247, 256, 257, 259, 276, 277, 278
Griffith, I., 229, 232, 333
Griffiths, T. L., 239
Griswold, J. A., 117, 128
Groen, G. J., 209
Grondlund, S. D., 22, 210, 215, 216, 218, 221

Grose, E., 145
Grossman, T., 240
Grosz, J., 107
Groth, K. E., 162
Grunenfelder, J., 120
Grungeiger, T., 332, 333, 334, 336
Gruntfest, E. C., 23, 394
Grunwald, A. J., 152
Guagliardo, L., 356
Guediri, S. M., 153
Guerlaine, S., 150
Gugerty, J. O., 216
Gugerty, L. J., 125, 126, 127, 216, 340
Guillot, S. A., 338, 339, 340
Guitouni, A., 219
Gunn, D. V., 121
Gunning, D., 142
Gurushanthaiah, K., 76
Gutwin, C., 92
Gutwin, 142

H
Ha, Y. W., 264
Haber, R. N., 162
Haelbig, T. D., 206
Hagen, B. A., 106
Hagen, 66
Hailpern, J., 190
Haines, M., 81, 82
Haines, R., 66
Hajdukiewicz, J. R., 100, 256
Hake, H. N., 36
Hale, S., 121
Halford, G. S., 228, 323, 349
Halgren, T. L., 238
Hall, E. P., 238
Hall, J. K., 368
Halle, J., 190
Hallen, A., 338
Halloran, J. M., 385
Hamm, R. M., 246, 278
Hammer, J. 319
Hammer, B., 391
Hammond, K. R., 246, 278
Hampton, D. C., 208
Hancock, D. J., 120
Hancock, P. A., 19, 20, 23, 24, 25, 28, 30, 81, 219, 347, 351, 360, 362, 363, 374, 382, 383, 388, 394, 395
Handel, S., 179
Hanes, L., 69, 76
Hankins, T., 354, 361
Hanna, T. E., 29
Hannaford, B., 120
Hannemann, R., 189
Hannon, D. J., 106
Hannon, E. M., 54
Hanowski, J., 338, 340
Hanson, M. A., 371
Hardwick, J., 310
Hardy, T. J., 131, 133
Harkness, A. R., 262
Harmsen, A., 153
Harrington, D., 195
Harris, H., 154
Harris, R. L., 357
Harris, D. H., 15, 26
Harrison, D. A., 193
Hart, S. G., 348, 351, 352, 358, 359
Harvey, S., 177
Hasbroucq, T., 296
Hasher, L., 325
Hashtrudi-Zaad, K., 151
Haskell, I. D., 113, 296
Hatzakis, H., 119, 120
Havig, P. R., 121
Hawkins, F. H., 78, 187, 226, 227
Hayashi, Y., 67
Hayden, M. H., 23, 394

Hayes, C., 222, 334, 395, 397, 402
Hayes, J. R., 167
Hayes-Roth, B., 132
Hayhoe, M. M., 325
He, J., 338
Head, J., 81, 82
Healy, A. F., 162, 167, 223, 228, 232, 233, 243, 261, 281
Heath, A., 96
Heathcote, A., 33
Hedge, J. W., 371
Heer, J., 84, 92
Heers, S., 262, 392
Hegarty, M., 96, 127, 232, 372
Helander, M. G., 29
Helenius, R., 82
Hellbrück, J., 82
Helleberg, J. R., 51, 52, 181, 202, 328, 335, 379
Hellier, E., 79, 174, 175, 185, 186, 192
Helmreich, R. L., 177, 194, 195
Henderson, S. J., 155
Hendricks, R., 328
Hendrix, C., 158
Hendy, K. C., 193, 351
Hennelly, R. A., 204
Henrion, M., 276
Herbert, W., 247, 262
Herdman, C. M., 66, 69, 200, 216
Hermann, D., 319
Hernando, M., 399
Herron, S., 358
Hershon, R., 310
Hertel, P., 213
Hertwig, R., 251, 271, 274, 392
Heslenfeld, D. J., 189
Hess, S. M., 207, 342
Hewitt, J. K., 371, 373
Hick, W. E., 287, 288
Hickox, J. C., 127
Hicks, J. L., 211, 212
Hilburn, B., 349, 357, 396, 397
Hill, M. I., 158, 159
Hill, S. E., 167
Hill, S. G., 352
Hillix, W., 310
Hillyard, S. A., 354
Hines, F. G., 190
Hirst, W., 241, 341
Hitch, G. J., 199, 203, 205, 216
Hitchcock, R. J., 117, 170
Ho, C. Y., 335, 400
Ho, G., 56
Hochberg, J., 131, 144
Hocherlin, M. E., 179
Hockey, G. R. J., 354, 362, 363, 364, 365, 366, 367, 368, 398
Hodgetts, H., 81
Hoedemaeker, M., 391
Hoeft, 100
Hoffman, E., 124, 128, 296, 297, 298, 299, 300
Hoffman, R., 137, 238, 248
Hoffman, E.R., 294, 295
Hoffman-Goetz, L., 177
Hoffmann, E. R., 297, 299
Hogarth, A., 247, 261, 262, 263, 264, 279, 280, 281
Hogenoom, P., 356
Hogue, J. R., 153
Holding, D. H., 227
Hole, G. J., 206
Hollan, J. D., 145, 237
Hollands, J. G., 3, 9, 15, 23, 24, 25, 39, 85, 87, 88, 91, 93, 99, 114, 115, 116, 118, 121, 129, 133, 142, 144, 145, 146, 236, 250, 303, 309, 310,
349, 404
Hollingworth, W., 255
Holmberg, N., 182
Holmqvist, K., 182
Holsanova., J. N., 182
Holscher, C., 135, 136, 137
Hongisto, V., 82
Hooey, B. L., 52, 53, 54, 55, 287, 404

Hoogeboom, P., 81
Hoosain, R., 204
Hope, L., 64, 262
Hope, R., 398
Hope, 278
Hörmann, H. J., 195
Hornbaek, K., 142
Horowitz, T. S., 26, 28, 30, 59, 60, 322
Horrey, W. J., 50, 51, 52, 277, 328, 329, 333, 338, 339, 340, 349, 398
Horswill, M. A., 9, 20
Horton, T. E., 380
Hosking, S. G., 340
Houghton, R., 301
Houle, S., 240
Houtmans, M. J. M., 51
Howard, D., 231
Howe, S. R., 18, 19, 26, 27
Howell, W. C., 95, 290, 369
Howerter, A., 199, 201
Howse, W. R., 194, 195, 223
Hsieh, S., 293, 332
Hu, Y., 113, 191
Huang, K.-C., 173
Hubbold, R., 120, 158
Huber, D., 356
Huber, E., 384
Huestegge, L., 87
Huey, M. B., 359, 367
Hufford, L. E., 243
Huggins, A., 188
Hughes, J., 24, 25, 379, 394, 395
Hughes, R. W., 81
Hughes, T., 69, 76
Hui, C., 272
Hulsbosch, A. M., 154
Hulse, S. H., 33, 35
Hults, B. M., 368
Humes, L. E., 78, 80
Humphreys, C., 354
Humphreys, G. W., 58
Hunn, B. P., 150
Hunt, E., 256, 281, 372
Hunter, J. E., 360, 368
Huper, A.-D., 388, 392, 404
Hurts, K., 338, 340
Husein, M., 153
Huss, D., 237
Hussey, E., 31, 397
Hutchins, S., 24, 25, 81, 122, 170, 181, 229, 230, 231, 233, 291, 328, 329, 335, 379, 394, 395
Hutchinson, S., 125, 127, 132, 153, 228, 342
Hygge, S., 81, 82
Hyman, I. E., 56
Hyman, R., 287, 288, 292
Hyönä, J., 51, 55

I
Iani, C., 333, 334, 337
Iavecchia, H. P., 352
Ichikawa, M., 112
Ijsselsteijn, W. A., 119, 120
Ilgen, D. R., 388
Inagaki, T., 28, 52, 392, 395, 397, 398, 399, 402
Inbar, O., 92
Ince, F., 99
Ingleton, M., 74
Inselberg, A., 140, 141
Iqbal, S. T., 347
Iragui, V., 354
Irby, T. S., 300
Irwin, C. I., 292
Irwin, D. E., 53, 56, 60, 327, 344
Isakof, M., 263, 254
Isard, S., 189
Isherwood, S., 172, 173, 174
Islam, A. R., 38
Isreal, J. B., 327, 355

Itti, L., 52, 55
Iyengar, V., 388
Izzetoglu, K., 349, 356

J
Jack, D., 154
Jacko, J., 1
Jackson, J., 309
Jackson, L., 218
Jacob, R. J. K., 40
Jagacinski, R. J., 146
Jakobsen, M. R., 142
Jakobson, L. S., 120
Jakus, G., 175
James, N., 215
James, W., 321
Jamieson, G. A., 24, 25, 100, 283, 355, 369, 391, 404
Jang, J., 182
Janiszerwinski, C., 322
Jarmasz, J., 66, 69
Jarvic, J. G., 255
Jay, C., 157
Jay, T., 174
Jayasinghe, N., 154
Jedel, S., 154
Jenkin, M. R., 152
Jenkins, H. M., 281
Jenkins, D., 1, 215
Jennings, A. E., 371
Jensen, R. S., 97, 113, 298, 366
Jentsch, F., 379
Jeon, M., 175
Jeon, S., 156
Jerden, E., 231
Jersild, A. T., 293, 332
Jessa, M., 101, 102
Ji, Q., 398
Jian, J-Y., 388
Jiang, H., 114, 115, 116, 144, 145
Jiang, X., 60
Jiang, Y., 374
Jodelet, D., 136
Johannesen, L. J., 250, 280, 311, 318, 366
Johannsdottir, K. R., 66, 69, 200, 216
Johanson, J. F., 58, 60
Johansson, J., 221
Johnsen, B. H., 219
Johnson, A., 49, 321
Johnson, C. I., 182
Johnson, E. J., 256, 263, 274, 275, 368
Johnson, J. C., 67
Johnson, M. K., 178
Johnson, R. Jr., 355
Johnson, S. J., 153
Johnson, S. L., 98
Johnson, E., 260
Johnsrude, I. S., 190
Johnston, J. C., 58, 62, 73, 370
Johnston, J. H., 370
Johnston, J., 55
Joint Commission, 394
Jolicoeur, P., 74
Jones, D., 81
Jones, D. G., 195
Jones, D. M., 78, 79, 81, 82, 200, 215, 218
Jones, L. C., 104
Jones, S., 1, 174
Jonides, J., 63, 206
Joose, M., 81
Jordan, T. R., 167
Jorna, P. G., 357, 397
Joslyn, S., 260
Juan, M. C., 153
Juan-Espinosa, M., 372
Jung, T. P., 354
Jungk, 100
Jurkowitz, N., 229, 232, 333

Jushasz, B., 165
Just, M. A., 164, 353

K
Kaarlela-Tuomaala, A., 82
Kaber, D. B., 61, 126, 382, 390, 395, 397, 399, 401
Kaczmarski, H., 327
Kahana, M., 241
Kahneman, D., 64, 65, 68, 246, 247, 249, 250, 252, 253, 257, 258, 259, 260, 261, 266, 267, 268, 269, 270, 271, 272, 273, 276, 278, 279, 281,
322, 325, 356, 362, 364, 372, 381
Kalia, R. K., 154
Kalish, M. L., 242
Kalkofen, D., 156
Kalmar, D., 341
Kalyuga, S., 228, 233, 236
Kane, M. J., 55, 199, 206
Kantowitz, B. H., 25, 304, 306, 325, 326, 403
Kaplan, S., 325
Kapralos, B., 152
Kapur, S., 240
Karahalios, K., 190
Karlin, L., 305
Karlsen, P. J., 199
Karpicke, J., 231
Karsh, B., 377
Karsh, R., 15
Kaufmann, R., 97
Kay, B. A., 104, 106
Keele, S. W., 78, 305
Keeney, 274
Keillor, J., 121, 125, 145
Keinan, G., 366, 367, 370
Keiras, D. E., 328, 330
Keith, N., 231
Keller, D., 24, 25, 379, 394, 395
Keller, J., 125, 299, 300
Kelley, C. M., 243
Kello, J., 195
Kelly, L., 106
Kelly, M. L., 29
Kenner, N. M., 26, 28, 30, 60
Kenney, R. L., 282
Keppel, G., 206
Kerns, C., 204
Kersten, C., 151
Keskinen, E., 82
Kessel, C., 326
Kessler, R. C., 360
Kester, L., 233
Kestinbaum, L., 305
Kesting, I., 24
Ketels, S. L., 261, 281
Khanna, M. M., 372
Khoo, L., 260, 277, 289
Kibbi, N., 26, 28, 30, 60
Kibler, A., 286
Kidd, D., 278
Kieras, D. E., 304, 306
Kijowski, B. A., 337
Kilkenny, C., 153
Kim, J., 107
Kim, S-H., 395
Kim, W. S., 120
Kindström, M., 121
King, R. H., 272
King, S., 322
King, M. C., 38
Kingstone, A., 347
Kintsch, W., 165, 180, 210, 216
Kiris, E. O., 390, 401
Kirk, D., 154
Kirlik, A., 49
Kirschenbaum, S. S., 85
Kirsh, D., 222
Kirwan, B., 282, 315, 349
Kite, K., 96

Klapp, S. T., 233, 292, 341
Klatzky, R. L., 121, 237
Klauer, S., 338, 340
Klayman, J., 264
Klein, G., 137, 218, 238, 246, 247, 249, 250, 265, 268, 278, 364
Klein, G. A., 246, 247, 249, 250, 261, 266, 276, 278, 279, 322
Kleinmuntz, D. N., 258, 275, 282, 283, 325
Klemmer, E. T., 167, 286, 308
Klette, R., 357
Kliegel, M., 211
Kline, 158
Knäeuper, A., 60
Knight, J. B., 212
Knight, J. L., 325, 326
Knill, D. C., 112
Knutson, B., 270
Kobus, D. A., 374
Koch, C., 52, 55, 64
Koehler, D., 256, 257, 259, 277
Koenicke, C. S., 52, 53, 54, 55, 287, 404
Kogan, A., 175
Koh, R., 333, 342
Kohn, L., 1
Kolasinski, E. M., 158
Kong, N., 92
Konstan, J. A., 334
Kooi, F., 119
Koolstra, C., 81
Kopala, C. J., 40
Koriat, A., 281
Korn, D., 263
Kornblum, S., 292, 296, 309
Kornbrot, D. E., 18
Korndorffer, J. R., 242
Kothe, C., 275
Kraft, C., 107
Kraiger, K., 231, 238
Kraiss, K. F., 60
Kramer, A. F., 8, 49, 53, 56, 60, 67, 218, 324, 327, 340, 342, 344, 355, 356, 358, 397, 398
Kramer, F. M., 361
Krebs, M. J., 285, 301
Kreidler, D. L., 290
Kress, C., 184
Krijn, M., 154
Kristensson, P. O., 243
Kroft, P., 61, 134, 135
Kroll, R. L., 292
Krueger, F., 388
Krupenia, V., 298, 299, 300
Kryter, K. D., 190, 191
Kuhl, S. A., 158, 184
Kuisma, J., 51, 55
Kujala, T., 336
Kumagai, J. K., 95
Kumar, N., 117
Kumar, R., 399
Kundel, H. L., 56, 59, 60
Kutas, M., 355
Kutlesa, N., 97
Kveraga, K., 172
Kwantes, P. J., 239
Kwinn, A., 153
Kwok, J., 100, 283, 369
Kyllonen, P. C., 372
Kysor, K. P., 181, 184

L
LaBerge, D., 161, 162, 166
Lacson, F., 388
Ladak, H. M., 153
LaFollette, P. S., 56, 59
Laidlaw, D. H., 92
Laimay, C., 184
Laird, A. R., 353
Lalomia, M. J., 85
Lam, T. M., 121

Lamb, M., 99, 129, 133
Laming, D., 14, 46, 47
Lan, T., 398
Landauer, T. K., 168, 180, 239, 380, 386
Landy, M. S., 112, 116
Langewiesche, W., 197
Langheim, L., 325, 333, 372
Lappin, J., 74, 75
Larish, J. F., 104, 106, 107, 344
Larrick, R. P., 282, 404
Larsen, J. T., 268
Laskowski, S. J., 175, 176, 178
Laszlo, S., 163
Latorella, K. A., 332, 333, 335
Lau, N., 100, 283, 369
Laudeman, I. V., 337
Laughery, K. R., 184
Laughery, K. R., 185
Laumann, K., 356
Lavie, N., 333, 338
Layton, C., 221, 223, 392
Lazar, R., 59
Lazarus, R., 363, 366
Lazer, R., 161
Leachtenauer, J. C., 56, 60
Lee, A., 185, 186
Lee, B., 137, 138, 139, 141, 143, 144
Lee, D. N., 107
Lee, E., 59, 60
Lee, J., 53, 330, 338, 339, 340
Lee, J. D., 1, 6, 25, 49, 78, 79, 92, 100, 101, 174, 185, 282, 328, 338, 340, 378, 379, 388, 389, 393, 394, 395, 400, 404
Lee, J. H., 78, 80
Lee, K. M., 152
Lee, S. E., 338
Lee, T., 354
Lee, Y.-C., 53, 338
Lees, M. N., 25, 388, 394, 395
Lehrer, J., 246, 255, 265, 269, 282
Lehto, M., 265
Lei, S., 354
Leibowitz, H., 103, 277, 329
Leiser, D., 85
Lele, O., 88
Lennerman, J. K., 356
Leonard, J. A., 293
Leong, H., 354
Leroy, G., 177
Leroy, J., 386
Lesch, M. F., 277, 333, 339, 340, 398
Lesgold, A. M., 178
Levav, J., 263
Leveson, N., 386
Levi, D. M., 120
Levin, B., 122
Levin, D. T., 53, 54, 276
Levine, S., 364
Levine, 126
Lew, R., 104, 107, 329
Lewandowsky, S., 203, 242, 390
Lewis, J. R., 56, 307
Lewis, K., 213, 327
Lewis, M., 379, 386
Lewis, R., 87
Li, F. F., 64
Li, L., 106
Li, S. Y., 314
Li, Y., 61, 62
Li, Z., 158
Li, H., 400, 401
Liang, C.-C., 114, 127, 131, 145
Liang, D. W., 213, 214
Liao, J., 337, 351
Liao, T. W., 38
Liben, L., 132, 136
Lichtenstein, R., 281
Lieberman, H. R., 361
Ligetti, C., 39

Limor, N.-G., 260
Lin, E. L., 117, 119, 139
Lin, M., 388
Lincoln, J. E., 52, 61
Lindemann, H., 260
Lindenbaum, L. E., 294
Lindsay, P. H., 161, 162
Lindsay, R. C., 22, 276
Ling, J., 56
Linkens, D. A., 398
Lintern, G., 98, 230, 342, 343
Liobera, J., 154
Liou, S. F. T., 221
Lipshitz, R., 246, 249, 278, 281
Lipsky, R., 388
Little, D., 242
Liu, Y., 92, 185, 282, 301, 328
Liu, Y. C., 65, 73, 113, 116, 143, 327, 399
Liu, Y., 1, 6
Liuzzo, J., 29
Liversedge, S. P., 165
Lleras, A., 113
Llorens, I., 343
Lockhart, R. S., 230
Lockhart, C., 24
Lockhead, G. R., 38, 308
Lodge, M., 195, 216
Loeb, M., 26
Loft, S., 241, 347
Loftus, E. F., 242
Loftus, G. R., 203, 205, 206
Logan, G.D., 41, 233
Logie, R. H., 198, 199, 327
Lohrenz, M. C., 56, 58, 61
Lohse, G. L., 87
Loizou, P. C., 191
London, M., 179
Long, J., 65, 66, 69, 309
Longman, D., 308
Longo, M., 399
Loomis, J. M., 121
Lopes, L., 281
Lopez-Ba, I., 81, 82
Lotan, M., 372
Lotem, A., 271, 272
Loukopolous, L. D., 332, 333, 334, 335, 337
Lovchik, C., 384
Loveless, N. E., 301
Loxley, S., 79
Lu, S., 328, 335
Luce, R. D., 33, 46
Luchins, A. S., 366
Luck, S. J., 354
Lum, H. C., 222
Luo, Z., 158
Lusk, C. M., 368
Lusted, L. B., 21, 22
Luz, M., 377
Lyall, B., 227, 380
Lyon, D. R., 62

M
Ma, J., 191
MacDonald, J., 153
Macdougall, H., 333, 334, 336
Macedo, J., 126
MacGregor, D., 251, 260, 261, 275, 276
MacGregor, J., 59, 60
MacGregor, J. N., 222
Mack, A., 55
Mack, I., 31
Macken, B., 79
Macken, W. J., 78, 81, 215, 218
Mackinlay, J. D., 137, 138, 145
Mackworth, N. H., 1, 8, 25, 26, 27, 30
MacLeod, C. M., 68
MacLin, O. H., 22

MacMahon, C., 20
MacMillan, A. G., 113
MacMillan, J., 250
Macmillan, N. A., 10, 15, 18, 19, 241
MacRae, A. W., 69, 76
Madden, D. J., 162
Madden, J. M., 257
Maddox, W. T., 27
Madhavan, P., 337, 388
Magee, L. E., 116, 151
Magne, C., 78
Magruder, D., 384
Maheswar, G., 29
Mahfouf, M., 398
Maisano, R. E., 34
Makeig, S., 354
Maki, R. H., 136
Malcolm, R., 104
Malhotra, N. K., 255
Malone, L., 121
Maloney, L. T., 112, 116
Malpass, R. S., 23
Maltz, M., 391
Mandler, G., 163, 189
Mandryk, R. L., 92
Mane, A., 230
Manes, D. I., 25, 380
Mania, K., 158, 159
Manier, D., 241
Manning, C. A., 210, 216
Manzey, D., 377, 388, 390, 391, 392, 393, 401, 402, 404
Marchhena, E., 343
Marescaux, J., 386
Mariné, C., 184, 208
Mark, 333
Markham, S., 153
Markley, S., 243
Marks, W., 166
Marley, A. A. J., 33
Marois, R., 53, 56
Marsh, L. G., 136
Marsh, R. L., 211, 212
Marshall, D. C., 79, 174
Marshall, J. R., 339
Marston, J. R., 121
Martens, M. H., 53, 54, 73, 74, 77, 135, 361
Martin, B., 255
Martin, B. A., 211
Martin, G., 327
Martin, L., 368
Martin, M., 211
Martin, R. C., 82, 200
Marvin, F., 261
Masalonis, A. J., 19, 20, 382, 383
Mason, A. F., 255, 262
Massel, L. J., 95
Masserasng, K., 342, 343
Mathan, S., 334, 397, 398, 402
Mathiassen, E., 309
Mattes, S., 338
Matthews, G., 27, 325, 333, 360, 363, 366, 372
Matthews, M. D., 219
Matthews, M. L., 142
Maule, A., 246, 268, 360, 366, 367, 368, 386
Mavor, A., 3, 386
Mavroidis, C., 154
May, P. A., 114
Mayer, A., 184
Mayer, R., 229, 232, 322, 333
Mayer, R. E., 181, 182, 183, 222
Mayer, S., 232
Mayer, R., 228, 231, 328
Mayeur, A., 54
Mayhew, D. J., 204
McAdams, S., 37
McBride, D. M., 211
McCabe, K., 388

McCann, C. A., 142
McCarley, J. S., 5, 8, 24, 49, 51, 52, 53, 54, 55, 56, 60, 97, 289, 321, 327, 331, 338, 339, 344, 371, 394
McCarthy, G., 303, 355
McCauley, S., 260, 277, 289
McClelland, J. L., 163, 304
McConkie, A., 292
McConkie, G. W., 164, 165
McCormick, E., 137
McCoy, C. E., 221, 223, 392
McCredden, J. E., 323, 349
McCrerie, C. M., 65
McCurry, J. M., 97
McDaniel, M. A., 180, 211, 212, 231, 233, 334, 336
McDermott, K., 231
McDine, D., 92
McDonald, H., 283, 379, 384, 385
McDonald, J., 81
McDougall, S., 172, 174
McDougall, S. J. P., 173, 174
McElvoy, C., 261
McEvoy, L., 354
McEwen, T. R., 108
McFall, R. M., 21
McFarland, C., 212
McFarlane, D. C., 40, 332, 333
McGarry, K., 389, 401
McGarry, W. R., 31, 397
McGee, J., 386
McGeoch, J. A., 206
McGeorge, P., 195, 262
McGill, R., 89, 90
McGookin, D. K., 174
McGowan, A., 219
McGrath, B. J., 121
McGraw, A. P., 268
McGreevy, M. W., 115, 117
McGwin, G., 60
McIntire, J. P., 121
McIntosh, A. R., 240
McKee, S. P., 54, 74, 120
McKenzie, B., 50
McKenzie, K. E., 56
McKeown, M. J., 354
McLain, T. W., 399
McLaughlin, A. C., 243
McLeod, P., 327
McMillan, K. M., 353
McNamara, D. S., 177
McNeil, B. J., 271, 272
McNelly, T. L., 243
McTeague, J., 381
McVay, J., 199
Meader, D. K., 192
Mecklinger, A., 206
Meehl, P. C., 258
Meeks, J. T., 212
Meichenbaum, D., 370
Meilinger, T., 135
Meissner, C. A., 22
Meixensberger, J., 377
Mellers, B. A., 251, 271
Mellman, M., 227
Melton, A. W., 203, 204
Melton, J., 194
Melton, D. F., 1, 340, 398
Memmert, D., 55
Memon, A., 262
Mendez, E., 156
Merians, A. S., 154
Merien, N., 332
Merikle, P. M., 236
Merkel, J., 287
Merlo, J. L., 62, 63, 134, 156, 389, 390, 391, 404
Merrit, A. C., 194
Merritt, S. M., 388
Merwin, D. H., 97, 114, 115, 117, 119, 138, 139
Meshkati, N., 351

Metzger, U., 9, 352, 391
Meyer, D. E., 304, 306, 328, 330, 332
Meyer, J., 15, 23, 85, 92, 377, 391
Mezzanotte, R. J., 172
Michalos, A., 356
Michinov, E., 213, 214
Michinov, N., 213, 214
Micire, M. J., 150
Miles, K. S., 175, 177
Milgram, P., 99, 117, 136, 155, 351
Milios, E., 152
Militello, L. G., 238
Miller, C., 399, 402, 403
Miller, G. A., 33, 189, 204, 205
Miller, J., 341
Miller, L., 275
Miller, R. J., 95
Miller, S., 116
Miller, T., 177
Miller, D., 309, 315, 316, 317
Miller, B., 24
Mills, S., 221
Milner, A. D., 103
Mintz, D., 120
Mintz, F. E., 334, 335
Mischel, W., 270
Misra, S., 152
Mitchell, J., 240
Mitchell, P., 118, 119
Mitchell, P. J., 149, 379, 399
Mitta, D., 142
Miyake, A., 123, 199, 201, 353, 371, 372, 373
Mocharnuk, J. B., 59
Moertl, P. M., 74, 144, 221
Mohler, B., 135
Molden, D., 272
Molloy, R., 31, 388, 390, 391, 395, 404
Momen, N., 54, 276
Mondor, T. A., 37, 80
Monk, A., 240
Monk, C. A., 278, 332, 334, 336, 342
Monsell, S., 293, 332
Montello, D., 132
Montgomery, H., 246, 278, 333
Mooij, M., 347
Moon, B., 137
Moon, Y., 402
Moore, A. B., 199
Moore, C. J., 120
Moore, G. E., 377
Moore, T. J., 121
Moorman, L., 145
Moran, T. P., 311
Moray, N., 28, 52, 78, 94, 203, 236, 237, 293, 311, 319, 333, 337, 347, 351, 356, 384, 388, 389, 392, 393, 404
Moreland, R., 213, 214
Moreno, 228, 231
Morgan, C. A., 361
Morgan, P., 322
Mori, H., 67
Morin, C., 301
Morley, N., 185, 186
Morphew, M. E., 148
Morris, R. K., 165
Morris, N., 218, 315, 320
Morrison, J., 144
Morrison, J. G., 374, 380, 395, 397
Morrow, D. G., 184, 185, 278, 283, 377, 379, 385
Morton, A., 118
Moscovitch, M., 58
Moses, F. L., 34, 168
Mosier, K., 247, 248, 260, 262, 265, 277, 278, 282, 289, 385, 392, 393, 404
Most, S. B., 55, 59
Mouloua, M., 395, 396, 397
Mourant, R. R., 56, 342
Mowbray, G. H., 33, 293
Mueller, S., 377
Muhlbach, L., 119

Muir, B., 388
Mulder, G., 356
Mulder, J. A., 107
Mulder, L. J., 356
Mulder, M., 100, 121, 215
Mullen, M. P., 40
Muller, H. J., 62
Muller, P. I., 307
Multer, J., 9
Multhaner, R. A., 113
Mumaw, R. J., 53, 387
Mumenthaler, M. S., 199
Munafo, M., 373
Munichor, N., 271, 272
Munoz, Y., 186
Munzer, S., 127
Munzer, T., 141
Murphy, A. H., 264, 281
Murphy, R., 379
Murphy, T. D., 67
Murphy, A. Z., 26
Murray, M. D., 185
Mursalin, T. E., 38
Mussa-Ivaldi, F., 275
Mussweiller, T., 281
Mutter, D., 386
Myaskovsky, L., 214
Mynatt, C. R., 262
Myung, S., 126

N
Nagy, A. L., 58
Nakano, L., 61, 62, 153
Nakayama, K., 54, 74
Nass, C., 192, 402
Nassef, A., 398
Navarro, J., 343
Navon, D., 163, 321, 324, 325, 331, 341, 343
Naweed, A., 175
Naylor, J., 230, 343
Neal, A., 347
Neale, V. L., 338
Nee, D. E., 206
Nehme, C. E., 149, 380
Neider, M. B., 327
Neisser, U., 57, 58, 59, 64, 160, 161, 163, 188, 189
Nelson, T. O., 324
Nelson, W. T., 121, 122
Neufeld, P., 242
Neuper, C., 275
Nevile, M., 193
Newell, A., 311
Newlands, A., 192
Newon, J., 194
Newsome, S. L., 179
Neyedli, H. F., 9, 15, 23, 25, 403
Nguyen, A. D., 385
Nguyen, D. T., 193
Nickel, P., 354, 398
Nickerson, R. S., 235, 262, 263, 276
Nicolelis, M. A., 275
Nielsen, A., 154
Niemczyk, M., 380
Nieminen, T., 329, 357
Nieves-Khouw, F., 23, 394
Nikolic, M. I., 53, 329, 335
Nilson, L., 344
Nilsson, L. G., 202
Niro, P., 361
Nissen, M. J., 63
Noble, M., 342
Nodine, C. F., 60
Nof, S. Y., 377
Nokes, T. J., 182
Norman, D. A., 71, 78, 102, 161, 162, 168, 221, 237, 240, 300, 301, 302, 303, 309, 310, 311, 312, 314, 318, 320, 323, 324, 335, 344, 380, 387,
400
North, C., 137, 138, 141, 142, 143

North, R., 184, 283, 377, 379, 385
Nosofsky, R. M., 33, 35
Novick, R., 59, 161
Nowinski, J., 334, 343
Noyes, J. M., 174, 185
NTSB, 331, 380, 381
Nugent, W. A., 183
Nunes, A., 51, 58, 135, 340, 357
Nygren, T. E., 352

O
O’Brien, N., 309
O’Hara, R., 199
O’Neill, E., 174
O’Neill, P., 31
O’Regan, J. K., 53, 55
O’Brien, K. S., 216
O’Connor, P., 194
O’Donnell, R. D., 351, 352
Ogden, W. C., 63
O’Hanlon, J. F., 361
O’Hara, K., 113, 221, 239
O’Hare, D., 208, 216, 248, 366
Ohlsson, K., 202
Öhrström, E., 81, 82
Ohrt, D. D., 210, 216
Okado, Y., 242
Okamura, A. M., 152
Olafsson, R. P., 154
Oldak, R., 107
Oliva, A., 171, 172
Olmos, O., 116, 127, 130, 131, 145
Olofinboba, O., 23, 24, 25, 394
Olson, G. M., 192, 237
Olson, J. S., 192, 236, 237
Olson, W. A., 387
Olson, E. A., 22, 23
Onal, E., 390, 401
Onaral, B., 349, 356
Onnasch, L., 391, 392, 401, 404
Ono, Y., 373
Oonk, H. M., 114, 115
Öörni, A., 51, 55
Opperman, R., 399
Oran-Gilad, T., 121
Oransky, N. A., 59
Orasanu, J., 250, 255, 260, 275, 277, 289, 291, 360, 364, 365
Orlady, H. W., 78, 226, 227
Orlansky, J., 153
Ormel, W., 340
Ormerod, T. C., 222
Orne, E. C., 357
Orne, K. T., 357
Oron-Gilad, T., 374
Orr, J. M., 53
Oskamp, S., 255
Osman, A., 275, 296
Overauer, K., 203
Overbye, T. J., 34
Overley, E. T., 60
Owen, A. M., 353
Owen, D. H., 108
Owen, G., 380
Owsley, C., 60

P
Paap, K. R., 235, 236
Paas, F., 228, 229, 233, 322, 324
Pachella, R. J., 35, 289, 290, 291, 303
Packard, M. G., 234
Paese, P. W., 264
Palacios, A., 372
Palmer, E. A., 337
Palmer, S. E., 64, 76, 113
Palmisano, S., 106, 107
Panoutsos, G., 398

Pansky, A., 68
Papanastasiou, S., 65
Parasuraman, R., 8, 9, 19, 20, 21, 23, 23, 24, 25, 26, 27, 28, 30, 31, 150, 275, 325, 346, 347, 352, 353, 356, 357, 358, 365, 371, 373, 374, 377,
378, 379, 380, 381, 382, 383, 386, 388, 389, 390, 391, 392, 393, 394, 395, 396, 397, 399, 400, 401, 402, 403, 404
Parekh, M., 301
Park, O., 94, 97
Park, T., 333, 342
Parker, A. M., 256, 282
Parker, H. A., 118
Parker, J. F., 22
Parkes, A. M., 328
Parks, D. L., 349, 350
Parmentier, F., 81
Parmet, Y., 15, 23, 122, 377
Parra, L. C., 275
Pashler, H., 180, 233, 304, 305, 306, 327, 341
Pastor, J., 199, 372
Pasupathi, M., 340
Patel, V. L., 209
Paterson, K. B., 167
Patey, R., 195
Patrick, J., 215, 322
Patrick, T., 322
Patt, I., 154
Patterson, E., 144, 385
Patterson, R., 103, 119
Patton, W. E. III, 272
Pauker, S. G., 271, 272
Pavel, M., 398
Pavlas, D., 222
Pavlovic, N. J., 114, 115, 116, 121, 125, 144, 145
Paivio, A., 231
Payne, D. G., 210
Payne, J., 256, 261, 274
Payne, J. W., 256, 263, 274, 275, 368
Payne, S. J., 113, 221, 289, 296, 365
Pea, R. D., 229
Peacock, B., 2, 6, 146
Pearlmutter, B., 275
Pearson, T., 246, 278
Peavler, W. S., 357
Pedersen, H. K., 380, 384
Peebles, D., 76, 87, 88, 91
Peleg, R., 377
Pellegrino, J. W., 237, 372
Pelz, J. B., 325
Penaranda, B., 398
Penningroth, S., 95, 211
Penrod, S. D., 206
Perdelwitz, J. R., 330, 352
Perez, D., 153
Perez, M. A., 338, 340
Perham, N., 82
Perkins, S., 23, 394
Perlin, M., 145
Perona, P., 64
Perrone, J. A., 158
Perrott, D. R., 121
Perrow, C., 311, 317
Perry, D. C., 216
Perry, J. L., 210, 216
Peters, E., 251, 260
Petersen, A., 338
Peterson, B., 152
Peterson, L. R., 203
Peterson, M. J., 203
Peterson, M. S., 54, 344
Peterson, C. R., 15
Petit, C., 338, 339, 340
Petocz, A., 172
Petrov, A. A., 33
Pew, R. W., 3, 214, 215, 290
Pfeiffer, T., 281
Pfurtscheller, G., 275
Philipp, A. M., 87
Phillips, J. B., 330, 352
Phillips, S., 323, 349

Pichora-Fuller, M. K., 189
Pickett, R., 12, 21, 24
Pickle, J. L., 162
Pickrell, J. E., 242
Pierce, B. J., 116
Pierce, L. G., 393
Pigeau, R. A., 31
Pillalamarri, K., 15
Pilotti, M., 165
Pinker, S., 87
Pizarro, L., 38
Place, S. S., 26, 28, 30, 60
Plaisant, C., 1, 138, 141, 143
Plasters, C., 275
Plath, D. W., 167
Playfair, W., 85
Playfoot, D., 174
Plomin, R., 373
Plude, D., 172
Poldrack, R. A., 234
Polich, J., 355
Pollack, I., 35, 356
Pollack, E., 233
Pollatsek, A., 165, 342, 343
Polson, M. C., 327
Polson, P. G., 210
Pomerantz, J. R., 76
Pond, D. J., 34
Ponin, E., 274
Pool, M., 81
Poon, Y., 218, 276
Pope, A. T., 104
Porath, A., 377
Porter, G., 357
Posey, 100
Posner, M. I., 62, 63, 65, 200, 284, 285, 286, 300, 322, 353, 373
Post, D., 275, 329
Potter, P., 333
Poulton, E. C., 89, 365
Povenmire, H. K., 225
Powanusorn, P., 126
Poynor, D. V., 165
Pradhan, A., 59, 342, 343
Pratim-Bannerjee, A., 298, 299, 300
Prevett, T. T., 99, 116, 117, 127, 129, 130
Previc, F., 103, 113, 124, 125, 329
Price, T., 66
Prichard, J. S., 214
Prinet, J., 81, 121, 170, 181, 291, 328, 329, 335
Pringle, H., 53, 56, 60, 327, 344, 380, 384
Prinzel, L., 113, 131, 395
Pristach, E. A., 76
Pritchett, A., 377, 378, 379, 384, 385, 387
Proctor, R., 1, 6, 49, 95, 232, 296, 300, 321, 347
Prussog, A., 119
Puffer, S., 337
Pugh, H. L., 238
Punto, M., 329
Purcell, J. A., 209
Purdy, K. J., 167
Puto, C. P., 272

Q
Quesada, S., 379, 387
Quillian, M. R., 235
Quinlan, P. T., 37
Qusipel, L., 356

R
Rabbitt, P. M., 62, 63, 290, 314, 317
Rabinowitz, J. C., 172
Raby, M., 337, 351, 365
Radosevich, 233
Radtke, P. H., 183, 184, 193
Radwin, R. G., 275
Raij, D., 308

Raj, A. K., 121
Rakauskas, M., 340
Rall, E., 211, 212, 334, 336
Ramesh, K. T., 152
Randel, J. M., 238
Rantanen, E. M., 149
Raskin, J., 243
Raslear, T., 9
Rasmussen, J., 100, 248, 262, 284, 320
Rattan, A., 56
Ratwani, R. M., 88, 92, 97, 336
Rau, P.-L. P., 100, 168, 170
Raymond, J., 343
Rayner, K., 164, 165
Razael, M., 357
Read, L., 81
Reason, J., 197, 241, 311, 312, 313, 314, 315, 316, 318, 319, 380
Rebollo, I., 372
Recarte, M. A., 51, 340, 357
Redding, R. E., 209
Redelmeier, D. A., 339
Reder, L., 234, 249
Redish, J., 175, 176, 178
Ree, M. J., 216, 371
Reed, S. K., 238
Reeves, B., 402
Regan, M., 49, 330, 338, 339, 340
Rehal, G., 153
Rehnmark, F., 384
Reichenbach, J., 391, 392, 401, 402, 404
Reicher, G. M., 163
Reichle, E. D., 165
Reid, G. B., 352
Reinerman-Jones, L., 325, 333, 372
Reingold, E. M., 58
Reisweber, M., 356
Remington, R. W., 55, 58, 62, 73, 135
Renkl, A., 228, 236, 322
Rennerman, L., 348
Renshaw, J. A., 92
Rensink, R. A., 53, 54, 55
Reppa, I., 172, 174
Rettinger, D. A., 372
Rey, 233
Reynolds, D., 304
Reynolds, T. J., 178
Rhoades, M. V., 293
Ricchiute, D. N., 261, 282
Rice, S., 24, 25, 379, 394, 395
Rich, A. M., 34
Richards, A., 54
Richardson, R., 308
Richardson-Klavehn, A., 241
Richman, E. H., 92
Riener, R., 155
Rieskamp, 261
Riesz, R. R., 307
Riley, J. M., 397, 399
Riley, V., 31, 378, 381, 394
Ring, L., 145
Risden, K., 141
Risser, M. R., 31
Rissman, J., 344
Ritter, F. E., 380
Rizy, E. F., 30
Rizzo, M., 346
Roberts, A. C., 354, 398
Roberts, M. H., 354
Robertson, G., 84, 137, 138, 139, 141, 143, 144, 145
Rock, I., 55
Rockwell, T. H., 56, 342
Rodriguez, 270
Roediger, H., 231
Roels, R., 175
Roenker, D. L., 60
Roetting, M., 354
Roge, J., 55

Rogers, D., 145
Rogers, R. D., 293, 332
Rogers, S. P., 68
Rogers, W., 341, 344
Rogers, W. A., 166, 184
Rohrer, D., 180, 233
Rolfe, J. M., 325
Rollins, R. A., 328
Rolt, L. T. C., 197
Romera, M., 58, 62, 73
Ronen, A., 122
Ronnberg, J., 202
Roring, R. W., 190
Rosas-Arellano, M. P., 283, 379, 384, 385
Roscoe, S. N., 95, 97, 98, 99, 113, 225, 298, 366
Rose, A. M., 243
Rose, D. L., 97
Rose, P. N., 63
Rosen, A. C., 199
Rosen, M. A., 222
Rosen, S., 189
Rosenholtz, R., 61, 62
Rosenthal, R., 3
Roske-Hofstrand, R. J., 235, 236
Ross, L., 262
Ross, M., 276
Rossi, A. L., 257
Rotello, C. M., 241
Rotenberg, I., 333
Roth, E., 144, 149, 379
Rothbart, M. K., 373
Rothbaum, B. O., 154
Rothman, D., 229, 232, 333
Rothrock, L., 39
Rothstein, P. R., 185
Roudsari, A., 393
Rouse, S., 319
Rouse, W., 218, 248, 256, 281, 310, 311, 315, 319, 337, 395, 397
Rousseau, G. K., 166
Rousseau, R., 215, 219
Rovira, E., 389, 401
Rowe, A. L., 238
Roy, C. S., 353
Rubenstein, T., 255, 262
Rubino, F., 386
Rubinstein, J. S., 332
Ruffle-Smith, H. P., 31
Ruiz, G., 343
Rule, B. G., 344
Rumelhart, D. E., 163, 309, 310
Rummel, N., 229
Rupert, A. H., 121
Russell, C. A., 361, 397
Russell, C. S., 398
Russell, E. J., 211
Russell, S. M., 108
Russo, J. E., 261
Ruthruff, E., 58, 62, 73
Ruva, C. L., 261
Ryder, J. M., 209
Rymer, W. Z., 275
Rysdyk, R. T., 107
Ryu, H., 240

S
Saariluoma, P., 336
Saberi, K., 121
Sachtler, W. L., 106, 107
Sadowski, W., 152
Sagaria, S. D., 251
Sahuc, S., 104, 106
Saida, S., 112
Saito, M., 29
Sajda, P., 275
Sak, S., 261
Salamé, P., 82, 200
Salas, E., 85, 193, 194, 195, 213, 222, 223, 238, 368

Salili, F., 204
Salmon, P., 215, 219
Salterio, S., 236
Salvendy, G., 6, 168, 170, 209
Salvucci, D., 330, 338
Salzer, Y., 121
Samanez-Larkin, G., 270
Samet, M. G., 255
Sanchez, R. R., 58
Sander, C., 156, 157
Sanders, A. F., 51
Sanderson, P., 328, 333, 334, 336, 347, 394
Sanderson, P. M., 75, 237, 332
Sandry, R., 201, 202, 301, 327, 328, 356
Sandry-Garza, D., 301
Sanquist, T. F., 379, 403
Sarkar, M., 141
Sarno, K., 274, 296, 327, 329
Sarter, N. B., 53, 78, 81, 94, 122, 170, 178, 181, 250, 280, 291, 311, 318, 328, 329, 335, 366, 378, 379, 387, 390, 393, 400, 401
Satchell, P., 377
Sauer, J., 178
Savage, L. J., 260
Savel, R. H., 194, 195
Savelli, S., 260
Sawin, D. A., 29
Sayer, J., 339
Scailquin, J.-C., 201
Scerbo, M., 29, 59, 391, 395, 397
Schachtman, A., 222
Schacter, D. L., 235
Schaefer, D., 81
Schaefer, K. E., 388
Schaffer, L., 310
Schall, G., 150
Scharenborg, O., 190
Schaudt, W. A., 104, 107
Schauss, F., 165
Scheck, B., 242
Scheiter, K., 184
Scheitman, S. L., 368
Schepers, P., 340
Schiff, W., 107
Schindler, R. M., 162
Schkade, D., 268, 282, 283
Schlittmeier, S. J., 82
Schlossberg, H., 284, 285
Schmaltstieg, D., 156
Schmauder, A. R., 165
Schmeink, C., 178
Schmidt, R. A., 181, 184, 234
Schmierer, K. A., 239
Schmitt, N., 368
Schmorrow, D., 374, 398
Schmucker, C., 153
Schneider, W., 28, 29, 59, 162, 166, 226, 233, 243, 322, 324, 342, 343, 371
Schoenfeld, V. S., 59
Scholl, B. J., 65, 68
Scholl, M. J., 135
Schön, D., 78
Schoonhoven, R., 189
Schopper, A. W., 52, 61
Schürmann, 367
Schott, D. J., 243
Schraagen, J. M., 238
Schriefers, H., 206
Schröder, S., 173, 174
Schroeder, B. K., 178, 401
Schroeder, R. G., 255
Schultz, D. M., 23, 394
Schum, D., 256
Schumacher, E., 306
Schumsky, D. A., 122
Schunn, C. D., 182
Schurr, P. H., 272
Schustack, M. W., 262
Schutte, P. C., 330
Schwartz, A., 271

Schwartz, D. R., 95, 369
Schwarz, N., 259
Schweickert, R., 20
Scialfa, C. T., 56, 339
Scott, S. K., 189
Scott, W. D., 211
Scullin, M., 211
Seagull, F. J., 23, 63, 275, 394
Seamster, T. L., 209
Sears, A., 1
Sebok, A., 52, 53, 54, 55, 81, 122, 170, 181, 287, 291, 328, 329, 335, 379, 387, 404
Sedge, J., 333
See, J., 388
See, J. E., 18, 19, 26, 27
Seeger, C. M., 295, 301
Seegmiller, J. K., 55
Segal, L., 193
Seibel, R., 290, 307, 308
Seidler, K., 112, 236
Seigel, D., 230
Sejnowski, T., 354
Selcon, S. J., 65, 178
Self, B. P., 178
Seligman, M. E. P., 241
Sellen, A., 154
Selye, H., 361
Semmler, C., 177
Sen, A., 221
Senders, J. W., 52, 311, 319
Seppelt, B., 67, 100, 101, 328, 335, 378, 400
Serfaty, D., 250
Servos, P., 120
Sethi, N., 260, 277, 289
Sethumadhavan, A., 214, 215, 219, 400, 401, 402
Sexton, J. B., 194
Shadbolt, N., 238, 248
Shaffer, M. T., 193
Shah, P., 92, 93, 123, 372
Shah, R., 112, 116
Shalin, V. L., 238
Shallice, T., 327, 344
Shanar, T. L., 344
Shandry, R., 355
Shannon, C. E., 41
Shanteau, J., 249, 250, 264, 278, 366
Shapiro, K. L., 343
Shappell, S., 311, 318
Shareafi, P., 333
Sharit, J., 311, 315, 316
Sharma, G., 153
Shattuck, L., 100
Shaw, P., 88
Shaw, T., 31, 356, 397
Sheedy, J. E., 167
Sheese, B. E., 373
Shelley, C., 342
Shelly, C., 322
Shelton, J., 81
Shepard, R. N., 33, 125
Shepherd, J., 206
Sheridan, T. B., 15, 52, 150, 169, 337, 381, 382, 383, 392, 400, 402
Sherman, W., 151, 152
Sherrington, C. S., 353
Shewokis, P. A., 349, 356
Shield, B., 81
Shiffrin, R., 28, 35, 59, 162, 166, 322
Shinar, D., 85, 342, 391
Shipley, D., 135, 136, 137
Shneiderman, B., 1, 137, 138, 141, 143, 240, 307
Shoda, 270
Shriver, A., 278
Shugan, S. M., 262
Shulman, H. G., 292
Shute, V. J., 238
Shutko, J., 338, 340
Sibert, L. E., 40

Sidorsky, R., 308
Siegel, D., 311, 319
Siegel, J. A., 33
Siegel, W., 33
Siegrist, M., 89
Sierra, R., 243
Sigrist, R., 155
Silver, N. C., 186
Simola, J., 51, 55
Simon, H. A., 208, 209, 216, 220, 221, 265
Simonov, P. V., 364
Simons, D. J., 53, 54, 55, 56, 276, 387
Simonsohn, U., 272
Simpson, B. D., 121
Simpson, T. W., 39
Singer, M. J., 158
Singh, I. L., 388, 390
Singley, M., 223, 227
Sirevaag, E. J., 356
Sit, R. A., 344
Sitzmann, T., 229
Sivier, J. E., 98
Skedsvold, P. R., 59
Skitka, L. J., 262, 392, 404
Sklar, A. E., 400
Skraaning, G., 100, 283, 369
Slamecka, N. J., 230, 390
Slater, M., 151
Sliwinski, M. J., 344
Sloane, M. E., 60
Sloman, S., 246, 247, 251, 260, 261, 273, 274
Small, R., 121, 125, 178, 299, 300
Smallman, H. S., 96, 113, 114, 115, 120, 158, 217, 220, 226, 261, 262, 282, 334, 380
Smelcer, J. B., 236
Smilek, D., 338, 347
Smith, A. F., 33, 162
Smith, D., 338
Smith, G., 3, 174
Smith, J., 364, 365
Smith, J. J., 185
Smith, K., 219
Smith, K. U., 309
Smith, M., 386
Smith, M. E., 354
Smith, P. J., 71, 138, 221, 222, 223, 256, 384, 392
Smith, R. E., 241
Smith, S., 97
Smith, S., 300
Smith, B. K., 37
Smither, J. A., 56, 307
Sniezek, J. A., 193, 250, 264
Snodgrass, J. G., 18
Snow, M. P., 157
Snyder, C., 62, 63
Socash, C., 379, 387
Sodnik, J., 175
Soegaard, M., 240
Sohn, Y. W., 209, 216, 218
Soll, H., 195
Sollenberger, R. L., 117
Sorensen, C., 95, 215
Sorensen, D., 92
Sorensen, L. J., 215
Sorkin, R. D., 8, 9, 23, 25, 388, 394, 403
Souther, J., 59
Sowerby, L. J., 153
Sox, H. C. Jr., 271, 272
Spady, A. A., 357
Spanish Ministry of Transportation and Communications, 187
Spanlang, B., 155
Sparko, A., 227
Speier, C., 86
Spence, C., 81, 275, 400
Spence, I., 81, 87, 88, 90, 97
Spencer, K., 184, 275
Spielman, L., 154
Sreenivasan, R., 60

Srinivasan, M. A., 151
St. Amant, R., 380
St. John, M., 25, 31, 96, 113, 114, 115, 217, 220, 226, 334, 374, 380
Stacey, S., 179
Staelin, R., 261
Stager, P., 56
Stammers, R. B., 360, 363
Stanard, T., 106
Stankov, L., 371
Stanney, K., 121, 150, 152, 374
Stansfeld, S. A., 81, 82
Stansky, D., 125, 127, 131
Stanton, N. A., 1, 56, 215, 219, 325, 372
Stanush, P. L., 243
Stark, C. E. L., 242
Stark, E., 366
Stark, L., 120
Starkes, J. L., 20
Starr, M. S., 165
Staveland, L. E., 352
St-Cyr, O., 94
Steblay, N., 22, 23
Steel, P., 339
Steelman-Allen, K. S., 52, 53, 54, 55
Stefanidis, D., 242
Stege, U., 222
Steil, B., 381
Steiner, B. A., 172
Steiner, L., 298, 299, 300
Steinley, D., 185
Steitz, D. W., 344
Steltzer, E. M., 53, 117, 134
Stephens, A. T., 357
Stern, H. W., 183
Sternberg, R. J., 262
Sternberg, S., 57, 292, 303
Stevens, A. L., 94, 237
Stevens, C., 172
Stevens, S. S., 90, 138
Stewart, J., 145
Steyvers, M., 239
Stiensmeier-Pelster, J., 367
Stokes, A. F., 96, 365
Stone, D. E., 181
Stone, E. R., 256, 282
Stone, R. B., 71, 138, 222, 256, 384
Stone, R., 152
Strack, F., 281
Stratford, R. J., 214
Strauch, B., 291
Strauss, G., 377
Strayer, D. L., 53, 55, 56, 60, 291, 327, 339, 340, 344, 356
Stroobant, N., 356
Stroop, J. R., 68
Strub, M., 368
Strybel, T. Z., 121
Stull, 96
Sturm, W., 361
Styles, E. A., 293
Subbaram, M. V., 167
Sudweeks, J., 338
Sviridov, E. P., 364
Suissa, J. A., 186
Sulistyawati, K., 216, 218, 219, 276
Summala, H., 287, 329, 357
Sun, J., 399
Sun, Y., 34
Suroteguh, C., 380
Sutherland, A., 195
Svenson, O., 246, 268, 360, 367, 368
Swain, A., 315, 316
Swain, C., 354
Swartz, S. M., 92
Sweller, J., 74, 181, 182, 183, 228, 231, 233, 236, 322
Swets, J. A., 8, 9, 10, 12, 14, 17, 21, 22, 24, 47, 58
Swoboda, J. C., 15
Szalma, J. L., 81, 374

T
Taatgen, N. A., 237, 330
Taati, B., 151
Tack, D. W., 158
Tahmasebi, A. M., 151
Takarangi, M. K. T., 56
Takeuchi, A. H., 33, 35
Taleb, N. N., 54, 250, 266, 278, 287, 381
Talleur, D. A., 51, 52, 328
Tan, K. C., 56, 63
Tang, A., 145, 275
Tarno, R., 367
Taylor, H., 123
Taylor, J. L., 199
Taylor, R. M., 178
Taylor, S., 123
Taylor, V. A., 186
Taylor, M. M., 26
Technical Working Group for Eyewitness Evidence, 23
Teevan, J., 240
Teichner, W. H., 26, 59, 285, 301
Telford, C., 304
Telson, R., 308
Tenenbaum, J. B., 239
Teng, O., 333, 342
Tengs, T. O., 56, 63
Tenney, Y. J., 214, 215
Terenzi, M., 357
Tetlock, P. E., 250, 276, 278, 279
Thaden, R., 82
Thaler, R. H., 252, 266
Tham, M., 114
Theeuwes, J., 60, 67
Thomas, B. H., 156, 157
Thomas, D., 97
Thomas, L. C., 114, 115, 117, 130
Thompson, B. B., 281
Thompson, J., 31, 397
Thompson, W. B., 158
Thornburg, M., 388
Thornby, J., 120
Thorndyke, P., 132
Thornton, D. C., 356
Thull, 100
Thurstone, L. L., 33
Thwing, E., 80
Tibshirani, R. J., 339
Tierney, J., 263, 265, 274, 325
Tierney, P., 182
Tiersma, P. M., 177
Tijerna, L., 338, 340
Tindall-Ford, S., 74, 182, 183, 231
Ting, C., 398
Tinker, M. A., 167
Tirre, W. C., 216
Titchener, K., 175, 321
Tlauka, M., 153
Todd, P., 263
Todd, S., 112
Tole, J. R., 357
Tomazic, S., 175
Topmiller, D. H., 286
Torgerson, W. S., 33
Toronov, V., 356
Torralba, A., 171, 172
Tractinsky, N., 92
Trafton, J. G., 56, 58, 61, 88, 92, 97, 332, 334, 335, 336, 342
Treadaway, C. A., 216
Treat, T. A., 21
Tredoux, C. G., 22
Treisman, A., 56, 58, 59, 65, 68, 74, 78, 79, 80, 135, 293, 326
Tremblay, S., 78, 79, 80, 81, 82, 200, 214, 215, 216, 219, 333
Trinh, K., 145
Tripp, L., 121, 325, 333, 355, 372
Troscianko, T., 357
Truitt, T., 399

Trujillo, A. C., 330
Tsang, P. S., 327, 331, 341, 344, 347, 351, 352
Tsimhomi, O., 338
Tsirlin, I., 119
Tudela, P., 353
Tufte, E., 92, 138
Tulga, M. K., 337
Tullis, T. S., 61
Tulving, E., 163, 189, 234, 240
Tuovinen, J., 236
Turner, M. L., 204, 371
Tversky, A., 144, 247, 252, 253, 257, 258, 259, 260, 265, 268, 269, 270, 271, 272, 276, 278, 281, 292, 381
Tversky, B., 124, 128
Tweney, R. D., 262
Tyfa, D., 92
Tyler, M., 120

U
U.S. Navy, 246, 257, 262
Uhlman, E., 262
Ullsperger, P., 355
Underwood, B. J., 206
Upton, C., 100, 138
Ursin, H., 364
Usoh, M., 151
Uusitalo, L., 51, 55

V
Vais, M. J., 53, 56, 60, 327, 339, 344
Valero-Gomez, A., 399
Vallone, R., 252
Van Beurden, M. H. P. H., 119
van Breda, L., 148
Van Dam, 100
Van Der Horst, R., 286, 287
van der Hulst, M., 391
van der Kleij, R., 127, 145
van der Vaart, J. C., 107
van der Voort, T., 81
Van Dijk, T. A., 165, 180
van Erp, J. B. F., 121, 275
van Gog, T., 228, 229, 233, 324
van Gool, M., 81
van Hoey, G., 119, 120
van Kamp, I., 81, 82
Van Laar, D., 97
van Lieshout, E. C. D. M., 221
van Merriënboer, J. J. G., 233
Van Opstal, A. J., 120
Van Overschelde, J. P., 167
van Paassen, M. M., 100, 121, 215
van Rooij, I., 222
van Roon, A., 356
Van Schaik, P., 56
van Veen, H. A. H. C., 121
van Wanrooij, M. M., 120
Van Wert, M. J., 26, 28, 30, 60
van Wieringen, P. C. W., 107
van Zandt, T., 6, 95
Vanasse, L., 324, 355
Vanderheiden, G. C., 374
VanRullen, R., 64
Varey, C. A., 251
Vartabedian, A. G., 167
Vashitz, G., 377
Vaughn, L., 259
Vecellio, 112
Veland, O., 100
Veldman, H., 356
Veltman, J. A., 122
Venetjoki, N., 82
Venturino, M., 341
Vergauwe, E., 200, 326
Verhaeghen, P., 344
Verplanck, W. L., 381, 382

Ververs, P., 65, 66, 155, 334, 397, 398, 402
Vessey, I., 85, 209
Vicari, J. J., 58, 60
Vicente, K. J., 1, 94, 100, 101, 208, 356
Vicentini, M., 152
Vick, D., 145
Vickers, D., 292
Victor, T., 62, 340
Vidoni, E. D., 8
Vidulich, M., 201, 202, 301, 327, 328, 341, 352, 359
Viega, J. F., 193
Vienne, F., 55
Villoldo, A., 367
Vincow, M. A., 52, 61, 97, 123, 125, 128, 130, 137, 138, 300
Vingerhoets, G., 356
Vint, R., 380
Vinze, A. S., 221
Violante, J. M., 339
Vishton, P. M., 111
Vix, M., 386
Vlachos, G., 65
Vogel, E. K., 354
Vorländer, M., 82
Vos, W. K., 122
Votanopoulos, K., 120
Vu, K., 1, 300, 347
Vyas, M., 315

W
Wachtel, P. L., 365
Wadley, V. G., 60
Wagenaar, W. A., 251
Wager, T. D., 199, 201
Walden, R., 337
Waldron, S., 322
Walker, B., 175
Walker, B. N., 175
Walker, G., 215, 219, 298, 299, 300
Walker, N., 236
Waller, D., 127
Wallis, G., 298, 299, 300
Wallis, T. S. A., 9, 20
Wallsten, T. S., 256, 368
Walrath, J. D., 15
Walters, K., 79, 192
Wang, B., 92
Wang, J. H., 208
Wang, L., 24, 403
Wang, W., 99
Wang, Z., 398
Ward, G., 209, 220
Ward, J. L., 52
Ward, P., 208
Ward, R. D., 92
Ward, W. C., 281
Ware, C., 117, 119, 120, 138
Wargo, E., 22, 242
Warm, J., 18, 19, 26, 27, 122, 325, 333, 355, 363, 372
Warren, R., 106, 108
Warren, W. H., 103, 104, 106
Warrick, M., 286, 289
Washburn, D., 325, 333, 372
Wastell, D. G., 178
Watamaniuk, S. N. J., 121
Waters, D. S., 372
Waters, M., 335
Watson, J. M., 55
Watson, M., 328
Watts, K. P., 152
Watts-Perotti, J., 144
Weaver, W., 41
Webb, A., 356
Webb, R. D. G., 243
Weber, E., 274
Weedon, B., 79, 192
Weeks, D. J., 296
Weel, J., 388

Wegner, D. M., 213
Wei, C. S., 153
Weigmann, D., 248, 311, 366
Weil, M., 227, 230, 311, 319, 343, 371
Weil, P., 154
Weiner, M., 185
Weiner, E., 335
Weinger, M. B., 76
Weinstein, L. F., 106
Weinstein, Y., 231
Weintraub, D. J., 38
Weir, R., 275
Weiss, D., 246
Welch, R., 100, 283, 369
Weldon, M. S., 213
Welford, A. T., 27, 33, 304, 306, 307, 309
Wellner, M., 154
Wells, G. L., 22, 23, 242, 276
Weltman, G., 255
Weltman, H., 364, 365
Wen, M. H., 65
Wenger, M. J., 210
Westenskow, D. R., 69, 75
Westerman, S. J., 360, 363
Westheimer, G., 119
Wetzel, J. M., 183, 184, 356
Wheatley, D. J., 193
Whitaker, L. A., 179
White, L. R., 193
White, M. F., 60
Whitehouse, W. G., 357
Whitfield, S., 354
Whitlow, S., 334, 397, 398, 402
Whitney, P., 206
Whittaker, S., 154
Wickelgren, W., 205, 289, 290
Wickens, C. D., 1, 3, 5, 6, 8, 24, 25, 38, 46, 47, 49, 50, 51, 52, 54, 55, 56, 58, 60, 61, 62, 63, 65, 66, 67, 69, 72, 73, 74, 75, 77, 81, 82, 85, 86,
87, 88, 92, 93, 96, 97, 99, 100, 105, 106, 112, 114, 115, 116, 117, 118, 119, 120, 122, 123, 125, 127, 128, 129, 130, 131, 132, 133, 134, 135,
137, 138, 139, 143, 144, 145, 146, 147, 149, 150, 151, 152, 153, 156, 158, 159, 170, 178, 179, 181, 182, 184, 185, 201, 202, 214, 216, 217,
218, 219, 220, 227, 228, 229, 230, 231, 233, 236, 250, 251, 261, 274, 276, 278, 281, 282, 283, 287, 291, 296, 298, 299, 300, 301, 303, 309,
310, 321, 324, 325, 326, 327, 328, 329, 331, 332, 333, 334, 335, 337, 337, 338, 339, 341, 342, 343, 347, 348, 349, 351, 352, 353, 356, 358,
359, 361, 365, 367, 371, 372, 377, 378, 379, 380, 382, 383, 385, 386, 387, 389, 390, 391, 393, 394, 395, 399, 400, 401, 402, 404
Wickens, T. D., 10
Wiegmann, D., 34, 49, 248, 311, 318, 366, 388
Wiener, E. L., 193, 197, 241, 314, 377, 378, 387, 388, 389, 390, 393
Wierweille, W. W., 340, 351, 352, 358
Wiese, E. E., 78, 79
Wiggins, M., 31, 208
Wightman, D. C., 194, 195, 223, 230, 342
Wijesinghe, R., 275
Wikman, A. S., 357
Wilcox, L., 119, 120, 125, 127, 131
Wiley, J., 222
Wilkinson, R. T., 30, 61
Willems, B., 349, 356
Willemsen, P., 158
Williams, D., 203, 205, 206
Williams, D. E., 58
Williams, D. J., 185
Williams, H. P., 125, 127, 132, 152
Williams, J. C., 275
Williams, M. D., 237
Williams, A., 56
Williges, R. C., 99, 157, 351
Willness, C., 339
Wilmes, K., 361
Wilschut, E., 275
Wilson, G. F., 346, 347, 351, 354, 361, 374, 397, 398
Wilson, J., 380
Wilson, J. A., 275
Wilson, K. A., 194, 195, 223
Wilson, P. N., 153
Wilson, W., 228, 323, 349
Wilson, G., 354, 361
Wimisberg, J., 216
Wine, J., 365
Winkler, R. L., 264, 281

Winner, J. L., 195, 213
Winzenz, D., 178
Wise, B. M., 56
Wise, J., 69, 76, 144
Witmer, 158
Witzki, A. H., 199, 201
Wixted, J. T., 22, 240
Wogalter, M. S., 82, 184, 185, 186, 200, 273
Wolf, L. D., 333
Wolf, M., 356
Wolfe, J. M., 26, 28, 30, 56, 58, 59, 60, 61, 322
Wolfe, S. P., 238
Wolfe, F. M., 3
Woods, D. D., 8, 9, 23, 69, 76, 93, 129, 131, 144, 148, 250, 280, 311, 315, 318, 335, 366, 379, 386, 387, 390
Woods, N., 78
Woodworth, R. S., 284, 285
Worringham, 299
Wotring, B., 107, 108
Wright, C. E., 292
Wright, D., 22, 64, 242
Wright, M. C., 395
Wright, M. J., 373
Wright, P., 179, 255
Wyatt, J. C., 393

X
Xiao, Y., 23, 275, 394
Xu, X., 149

Y
Yallow, E., 180
Yamani, Y., 97
Yantis, S., 53, 62, 63, 67, 179
Yarbus, A. L., 50
Yates, J. F., 256, 282
Yazdani, H., 340
Ye, N., 209
Yechiam, 256, 269
Yee, N., 154
Yee, P. L., 372
Yeh, M., 9, 56, 58, 61, 62, 63, 73, 123, 125, 128, 130, 134, 135, 155, 300, 380, 389, 390, 391
Yeh, Y-Y., 67, 358, 359
Yesavage, J. A., 199
Yeung, N., 275
Yin, S., 58, 135, 218, 384
Young, J., 185
Young, K., 49, 330, 338, 339, 340
Young, K. L., 340
Young, L. R., 357
Young, M. J., 112, 116
Young, M. S., 325, 372
Young, P., 374
Young, R., 114, 116, 130
Young, R. M., 314
Young, S. E., 371, 373
Young, S. L., 273
Youngblood, K. L., 121

Z
Zacks, R., 325
Zadeh, L. A., 19
Zakay, D., 366
Zaklad, A. L., 352
Zander, T., 275
Zanesco, A. P., 30
Zarcadoolas, C., 177
Zatorre, R. J., 80
Zeitlin, L. R., 186
Zekveld, A. A., 189
Zhai, S., 243
Zhang, J., 71, 102, 221
Zhang, L., 127
Zhang, X., 113, 116
Zheng, W., 177

Zhong, P., 153
Ziefle, M., 173, 174
Zimand, E., 154
Zimmer, H., 127
Zimmerman, A. B., 167
Zosh, W. D., 104, 106
Zsambok, C. E., 246, 247, 249, 250, 278
Zyda, M., 150

SUBJECT INDEX

A
Abbreviations, 168
Absolute judgment, 32–40, 96
channel capacity, 33, 35–36, 44–46
multidimensional, 34–40
Accidents. See Safety; Aviation
Additive factors, 303
Aesthetics, 174, 319
Affordance, 301–03
Aging
executive control, 344
focused attention, 344
multi-tasking, 344
perception, 189–90
speed-accuracy tradeoff, 291
vision, 60
working memory, 344
Air traffic control, 10, 25, 31, 67, 74, 114–115, 133, 385, 394–95
Alarms & alerts, 23–25, 63, 166, 383, 384, 391, 394–95
Alcohol, 56
Ambient vision, 103–109
Animation, 139, 144, 145, 232
Arousal
and stress, 361–63
in vigilance, 27
Articulation index, 190
Attensors, 78
Attention. See Directing attention; Focused attention; Multi-tasking; Mental workload; Single channel theory; Selective attention; Timesharing
Attention management, 336–37, 360–61. See also Interruptions
Attention skills. See Training, attention
Attentional cueing, 62–64, 185
Attentional narrowing, 63, 130, 333, 364–65, 393
Attentional switching, 78, 332, 372. See also Interruptions
Auditory processing. See also Alarms, Compatibility, Displays, auditory; Multimodal
absolute judgment, 39
attention in, 77–80
cocktail party effect, 79
dichotic listening, 80
in instructions, 180–82
of icons (earcons), 174–75
preemption, 335
in reaction time, 285
irrelevant sound effect, 81–82
monaural listening, 80
of speech, 186–190
polyphony, 80
streaming, 79–80
three dimensional (3d sound), 80, 120–21, 152
warnings, 174–175
Augmented Cognition, 374, 398
Augmented reality, 150–151, 155–59
Automaticity
of color, 96
and errors, 312–13
in multi-tasking, 322, 342, 349
in reading, 161–62, 166–68
in response time (RT), 293
in training, 232–33
in vigilance, 29
in visual search, 59
Automation, 377–404. See also Alarms & alerts
accidents, 380–81, 393
adaptive, 320, 340, 352, 395–99
automation bias, 392–93
in aviation. See Aviation, automation

complacency, 31, 63, 390–93, 403–04
complexity, 386–87
in decision support, 261, 283, 358, 401
etiquette, 402–403
feedback (displays), 94–95, 100–02, 387–88, 401
human-centered, 399–400
intelligent agents, 152, 239
levels and stages of, 381–386
mode errors, 314, 319
OOTLUF, 393
problems with, 386–95
purpose of, 378–80
reliability of, 23–25, 63–64, 318, 388–89
stages & levels, 381–86, 400–02
trust in, 317–18, 388–95, 403–04.
See also Automation, reliability of
Aviation
accidents in, 95–96, 107–08, 113, 179, 185–86, 215, 293, 311, 380, 393, 394
automation in, 314, 380, 387, 392–93, 397
cockpit task management, 337
cockpit resource management, 204
communications in, 204
decision making, 278, 291, 366
displays, 64–67, 74, 95, 97–99, 104–07, 113–16, 128–29, 131, 148, 178–179, 218, 329
expertise, 291
flight dynamics, 149
training and transfer, 224–27
visual scanning, 342
visual illusions, 88–89, 107–08, 113
workload in, 358

B
Bilingual, 343
Bottleneck theory. See Single channel theory
Brain-computer interface, 374–75
Brushing, 142–43
Business applications, 272, 381

C
Cellular phones. See Phones
Change blindness, 52–56, 135, 327, 339, 344, 368, 387
Checklists, 241–242, 319, 330–31
Chording. See Controls
Climate change, 273
Clutter, 59, 61, 133–35, 143, 387
Code design, 169–70
Cognitive appraisal, 361, 363
Cognitive load theory, 181–84, 228–33. See also Mental Workload; Effort
Cognitive streaming, 218
Cognitive tunneling. See Attentional narrowing
Color coding, 96–97, 135, 138, 300
Communications, 127, 204, 258, 334
non-verbal, 192–193
remote, 155, 193, 340
speech, 187–92
video-mediated, 193
Compatibility
Data-type, 140
display, 34, 94–102, 139–40, 201–02
ecological, 94–95, 99–102, 137
location, 294–96
modality, 201–02, 300–01
movement, 97–99, 296–300
population stereotypes, 97
of proximity. See Proximity compatibility principle
S-R (display-control), 65, 227, 293–301
visual field, 299
in information visualization, 138–140
Complacency, 63, 390–92, 403–04
Complexity, 323, 349
Computer programming, 209
Confidence. See Overconfidence
Configural dimensions, 38

Confirmation bias. See Decision making
Confusion. See also Similarity
errors of, 136, 313, 318
in memory, 205–07
in multi tasking, 321, 336, 341, 345
in reaction time, 292
in visual search, 58
in task interference, 341
Congruence
in instructions, 179
in S-R compatibility, 295, 297
Consistency, 93, 136–37, 144, 227–28, 301, 369
Consumer behavior, 264–65
Controls, 40, 291
chording, 307–08
confusion of, 313
dynamics of, 147–48
keyboard, 227–28, 296, 307, 310, 327
mouse, 147
voice, 301, 327–28, 338, 340
Cost, 152, 157, 225, 380
Crew Resource Management (CRM), 194–195
Cross-modality attention, 80–82
Cueing. See Attentional cueing
Cybersickness. See Motion sickness

D
Data-ink ratio, 91–92
Daydreaming, 338
Decision complexity advantage, 306–08
Decision making
aiding in, 261, 283, 358, 401
Bayesian, 260
bias. See Decision biases
choice, 264–74
compliance cost, 186, 272
debiasing, 281–82
diagnosis, 217, 250–64
displays for, 261, 282–83
effort in, 274–76, 325
expertise in, 246, 274–76
framing in, 271–73
heuristics. See Decision heuristics
loss aversion, 268
naturalistic, 265, 268, 278. See also Expertise
risk perception, 273–74, 282
temporal discounting, 270, 273
under stress, 368
Decision biases, 263–64
confirmation bias, 261–63, 280, 282
Gambler’s fallacy, 252, 292
hindsight bias, 250
overconfidence, 263, 264, 276–78
planning fallacy, 276
sunk cost bias, 272–73
Decision fatigue, 263, 325
Decision heuristics, 247, 263–64
accessibility, 258–59, 263
anchoring, 260–61, 281
as if, 256–58, 277
availability, 259–60, 273–74
Elimination By Aspects (EBA), 265
Representativeness, 259–60, 264, 270–71
Depth perception, 103–20. See also Display, three-dimensional
ambiguity of, 116–18, 130
attention in, 67
cues for, 104–112
cue effectiveness, 111–112
Diagnosis, 217, 250–64, 384–85. See also Decision making
Dichotic listening, 83
Dimensions
configural, 38
integral versus separable, 37–38
Disabilities, 374–75
Discriminability. See Confusion

Display
auditory, 120–22. See also Alarms, Display, voice
aviation. See Aviation, displays
clutter, 61, 133–135
coding of, 138–139
command versus status, 178–79, 298, 369
compatibility of. See Compatibility
coplanar, 114–115
decision, 261, 282–83
digital versus analog, 95
ecological. See Ecological display
frame of reference. See Maps
frequency separated, 98–99
Head-Up (HUD), 65–67, 300
head mounted, 62–63, 151–56. See also Virtual environments
hybrid, 97–99
layout of, 51–53
object, 69–70, 74–76, 101, 117–18
naïve realism in, 96, 113
peripheral, 104
predictive, 149, 337
proximity compatibility in, 71–77
process control, 149
size, 59, 134
stereoscopic, 111–12, 119–20
three-dimensional, 103, 113–20, 127–31, 133, 139, 145, 151, 333
virtual reality. See Virtual environments
voice, 183–84, 190, 301
Directing attention, 80
Distributed cognition, 215, 219
Divided attention, 49. See also Multi-tasking
in audition, 78–79
in instructions, 181–83
in perception, 49, 64–77
Driving
accidents in, 107–08, 330, 335
automation, 100–01, 395, 401
cell phones, 338–340
distracted, 338–340
models of, 329
overconfidence in, 276–77
response time, 286–87
situation awareness, 220
visual attention in, 20, 49, 55, 342
visual illusions, 107, 108, 113, 117

E
Earcons, 174–175
EEG. See Brain waves
Ecological displays, 100–02, 103–09, 256, 283, 400
Effort, 322–25, 337. See also Mental Workload
in decision making, 249, 258, 262, 274–76, 325
in driving, 340
individual differences in, 372
information access, 73, 93
in mental workload, 347–60
physical, 147
in safety compliance, 186
in training, 228–33
in vigilance, 27–28
in visual sampling, 51
Egomotion, 104–08
E-mail, 170
Emergency procedures, 233, 291, 366, 368
Emergent features, 38, 64–65, 69, 86, 92–93, 144, 283
Engagement, 31, 232, 333, 339
Engineering psychology, 1
Environmental design, 135–137
Error, 310–320. See also Speed-accuracy tradeoff
categories of, 312–15
detection of, 314–15
human reliability analysis, 315–18
lapses, 314
mode errors, 314
neuroergonomics of, 375

post completion errors, 314, 319
prediction of, 315–18
remediation, 319–20
in training, 229, 319
Error tolerant systems, 319–20
Event memory, 242
Executive control, 201, 293, 330–32, 343, 371, 373
Exocentric (world-centered) representation, 124–132
in information visualization, 140–143
Expectancy
in change blindness, 54
in decision making, 251, 258–60
in perception, 162, 165–72
in response time, 286–88
in signal detection. See Signal detection theory
in speech perception, 189–190
in vigilance, 28
in visual search, 59–60
Expertise
in decision making, 246, 278–80
development of, 208–209
in driving, 342
in errors, 316
in knowledge organization, 235–237, 241
in learning, 233
in memory, 208–210
in multi-tasking, 342
in situation awareness, 216, 241
in vigilance, 27–28
in visual search, 52, 53, 60
Eye movements. See Visual scanning
Eyewitness testimony, 22–23, 64, 278

F
Fault tree, 282, 316
Feedback
in adaptive automation, 395–98
in control, 309. See also Tracking
in decision making, 279–81
and errors, 314–15
in learning & training, 232, 279–80
in stress, 367
in vigilance tasks, 30
Fisheye view, 141–42
Fitts’ law, 146, 375
fMRI, 353
Focused attention, 49, 67, 77, 341, 343
aging, 344
in audition, 79–80
and stress, 365
Forecasting. See Prediction; Weather forecasting
Forgetting. See Memory
Frame of reference, 97, 124–32, 141–42, 151, 298, 300. See also Maps
Egocentric, 124, 129
Exocentric, 124, 129–130
Landmark, route, survey knowledge, 132
Function allocation, 258, 396. See also Automation, stages & levels

G
Games, 154–55, 343
Generation effect, 230
Genetics and cognition, 373–374
Geographical orientation. See Spatial cognition
Graph perception, 84–93
biases in, 88–91, 118
consistency in, 92–93
graphs vs. tables, 85
guidelines, 85–86
mental operations in, 87–88
parallel coordinate graphs, 141
and the proximity compatibility principle, 75–76, 86–88, 100, 102, 114
tasks, 86

three-dimensional graphs, 92, 117–118

H
Haptic perception, 152–53, 156
Head-up displays. See Display, Head-Up
Health care. See Medical applications
Heart rate, 356
Heuristics. See Decision, heuristics
Hick-Hyman law, 287–88, 306
Highlighting, 63, 134, 145
Highway safety. See Driving
Human factors engineering, 1, 406

I
Icons, 172–74
Inattentional blindness, 55–56
Individual differences
in attention & multi-tasking, 325, 341–44, 371–72
in cognitive ability, 180, 233
in effort, 372
in instructional format, 180–81
in overconfidence, 264
in spatial cognition, 127, 137
in vision, 60
in working memory, 55, 216, 218, 243, 371–72
Information theory, 32, 41–47
bits, 41
channel capacity, 44–46
context, 43–44
in absolute judgment, 33
in reaction time, 287–88
redundancy in, 44. See also Redundancy
transmission, 44–46
Information visualization, 137–44
color in, 97
data representation, 138–41
interactive, 142–144
principles of, 138–44
tasks, 137–138
Insight, 137, 143, 153–54
Inspection. See Quality control
Instructions, 175–84
command versus status, 178–179
multimedia, 180–84
negatives in, 179
working memory load of, 180
Integral dimensions, 37–38
Interference (proactive, retroactive), 205–207
Interruptions, 212, 331–36, 340, 343
Invariants, 104–109

K
Keyboards. See Controls
Keyhole phenomenon, 129–30, 141
Knowledge
acquisition of. See Learning; Training
declarative, 234
elicitation, 235
ontology, 239
organization of, 235–36
procedural, 234
representation methods, 238
representation of, 234–239
spatial, 132, 137
Knowledge of results. See Feedback

L
Labels, 34, 167
Lag, 147–48, 157–58, 217

Language, 204, 327
Learning. See also Expertise; Instructions; Long-term memory; Training
of attention skills, 52, 342–43
in decision making, 278–81
of navigation information, 132, 137
Legal & law enforcement applications, 22–23, 31, 64, 177, 260–61, 263, 278, 339
Lockouts, 303–04, 340
Long-term memory. See Memory, Knowledge

M
Magnitude estimation, 90
Malcolm horizon, 104
Maintenance, 314, 319
Manual control. See Tracking
Maps, 132–35
clutter in, 133–35
frame of reference of, 125–27, 131–32
rotation, 125–26, 131
versus route lists, 130
in visualization, 139
you are here maps, 126, 145
3D. See Displays, 3D
Medical applications, 12, 21–22, 60, 69–70, 114, 117, 119, 153, 185, 334
in automation, 377, 379, 384–86, 394
in decision making, 257, 259, 271, 272, 274, 283
in detection, 12, 21–22
in displays, 114, 117, 120
errors, 311, 318, 319
Meditation, 30
Memory, 239–243. See also Forgetting; Long-term memory; Working memory
in absolute judgment, 33
accidents, 197
echoic, 202
episodic, 234
event memory, 242
expertise in, 208–10
forgetting, 197–198, 239
iconic, 202
implicit memory, 219, 234
in eye witness testimony, 22–23, 64
lapse, 314
long term memory, 197
long term working memory, 209–10, 216–17, 219
measures of, 239–42
procedural, 242–244
prospective, 211–12, 314, 334–35
recall and recognition, 239–240
retrieval cues, 241–242
semantic, 234
sensory, 202
in signal detection, 22–23
skilled, 209–210
transactive memory, 213–14
Memory for goals theory, 332–36
Mental models, 236–38
in display compatibility, 94–96
of environment, 127, 131, 135–36
in memory, 236–238
in menu design, 236
in prediction, 218
in training, 237
Mental rotation, 125–29
Mental simulation, 218
Mental workload, 347–60
in adaptive automation, 397
dissociation, 358–59
measures of, 351–58
redline for, 348–49, 352
in training, 228–229. See also Cognitive load theory
neuroergonomics of, 350–57
Menus, 307, 336
Metacognition, 234. See also Overconfidence
in decision making, 276–78
in memory, 234, 255
Mode errors. See Automation; Errors

Models, 3, 405
of decision making, 247–49
of errors, 312, 315–17
of graph perception, 87–88
of human performance, 3–6
mental. See Mental models
of multiple task performance, 329–30
of noticing, 54–55
of response time, 287–89
of stress, 366–68
of visual search, 57–58
of visual scanning, 50–53
Motion perception, 104–10
Motion sickness, 159
Motivation, 211, 372
Multimedia. See Multimodal
Multimodal. See also Auditory processing, Tactile sense
feedback, 400
instructions, 180–84, 231–32
in multi-tasking, 80–81, 121, 328–29, 333, 335, 400
in RT, 285
redundancy, 181–84, 186
Multitasking, 321–45
in automation, 391
confusion, 341
in decisions, 255
in driving, 338–40
individual differences in, 341–44, 373–74
in learning, 228–30
multiple resources in, 181–84, 210, 231, 325–30, 335, 336, 338, 340–41, 349, 353, 355, 400
performance resource function, 275, 323–24, 331, 342–43, 358
in working memory, 200–01
Music, 33, 80, 82, 200

N
Naïve realism, 96, 113
Nautical applications, 148, 217, 381
Navigation. See Maps, Spatial cognition
Near Infrared Spectroscopy (fNIRS), 366, 374
Negatives, 179
Negative transfer, 227–28
Neuro-ergonomics, 327, 347
cerebral blood flow, 355–56
EEG, 353–54, 361
ERP (event related potential), 303, 354–55, 375
heart rate, 356
pupil diameter, 356
in mental workload, 353–58
Noise, 81–82, 365
Noticing, 53–55
N-SEEV model, 54–55
Nuclear process control. See Process control

O
Object display. See Display, object
Object perception. See Perception, of objects
Optical flow, 106
Optimum performance
in attention allocation, 52–53, 337
in decision making, 247, 264–67
in signal detection, 12–15
Organizations, 318
Overconfidence
in decision making, 276–78, 398
in learning, 233–34
in memory, 23
in situation awareness, 219, 276, 398
in multi-tasking, 339, 398
in vision, 54
Overlearning, 232–33

P
Pacing (self versus forced), 308–09
Parallel processing. See Divided attention
Perception. See also Displays
3D (depth), 109–112
direct vs. indirect, 103
graphical, 84–93
of objects, 68–71, 78, 170–74
of print, 160–70
of risk, 273–74, 282
of scenes, 50, 171–72
of sound, 174–75
of speech, 186–93
of statistics and probability, 84–93, 250–52
Performance shaping function, 316–17
Peripheral vision, 53, 62–63
Phones, 300, 307, 327, 338–40
Planning, 220–23, 276
Opportunistic, 221
Population stereotype, 97, 297. See also Compatibility
Pre-attentive processing, 58, 135. See also Automaticity
Prediction, 148–149, 217–18, 251–52, 276, 337
Presence (in virtual environments), 151
Principle of pictorial realism, 95–96
Principle of the moving part, 97–99
Problem solving, 220–23
Heuristics, 222
Tower of Hanoi, 221
Traveling salesman, 222
Team problem solving, 222
Process control, 69, 100, 148–149, 208, 267, 272
Processing codes (verbal-spatial)
in display compatibility, 201–02, 300–01
in multi-tasking, 327–28
in working memory, 200–02
Proofreading, 162
Prospect theory, 268–71
Proximity compatibility principle, 71–77
in display design, 53, 71–77, 102, 131, 134–35, 283
in graph design, 75–76, 86–88, 100, 102, 113
in instructions, 182–83
in visualization, 138, 143–44
Psychological refractory period, 304–08, 327

Q
Quality control, 13, 15, 29, 34, 308–09

R
Railway applications, 56, 197
Reaction time. See Response time
Reading, 164–65, 175–80, 185. See also Perception, of print
Realism, 16, 96, 113, 184
in training, 226–27
Redundancy, 64, 367
calculation of, 48
in code design, 169–79
in communications, 192
of depth cues, 112
gain, 40, 67
in instructions, 178–79, 181–84, 231
in reading, 162–63, 168–69
in speed-accuracy tradeoff, 291
in visual displays, 36–37, 40, 122, 168–69
in warnings, 186
Reliability analysis, 315–18, 388–89
Research methods, 3
Resources. See Time-sharing skill; Decision making, effort in; Mental workload; Multitasking
Response conflict, 67
Response criterion. See Signal detection theory
Response modality. See Control, voice
Response time. See also Speed-accuracy tradeoff
choice, 287–303

information theory, 288, 306–07
compatibility. See Compatibility
decision complexity advantage, 306–08
repetition effect, 292
serial, 304–10. See also Psychological refractory period; Transcription
simple, 284–86
stages in, 303–04
Risk. See Decision making; Safety
Robots, 112, 147, 149–150
Route knowledge, 132

S
Saccades, 164–165
Safety
Aviation. See Aviation, accidents in
cell phones and, 338
decision, 267, 273–74
health care, 311. See also Medical applications, errors
highway, 338
risk perception, 273–74, 281–282
in training, 225
warnings, 184–86
Salience, 335
Satisficing, 221
Scientific visualization, 137, 153–154
Search. See Visual search
Security, 8, 31, 289
SEEV model, 50–53. See also Selective attention
Selective attention, 49–64. See also Visual scanning
auditory, 77–80
change blindness, 53–54
in decision making, 253–58
eye movements in, 50–53
inattentional blindness, 55–56
optimality of, 50–53
in situation awareness, 215
and stress, 364–65
tasks, 50
in training, 227
to warnings, 185
Separable dimensions, 37–38
Shannon Fano principle, 169–170
Short-term sensory store, 198–200
Signal detection theory, 8–25
applications of, 20–28
fuzzy, 19–20, 30
response criterion (beta), 12–15, 23–25, 48
ROC curve, 16–19
Sensitivity, 15–19, 47
sluggish beta, 14–15, 22, 259, 270–71
in visual search, 61
Similarity, 56, 227–28. See also Confusion
Simulation
fidelity of, 226–27
in training, 152, 226–27, 234
Single channel theory, 304–06
Situation awareness, 214–20
anticipation, 217–218
in automation, 390–93, 395–96
displays for, 119, 130, 207
levels of, 217–18
measures of, 217–18
overconfidence in, 276
shared, 195, 340
team, 195
and workload, 351
Skill retention, 242–43
Sleep, 361
Slips, 312–15
Software reliability, 386
Spatial audio, 120–122
Spatial cognition, 123–35
in data bases, 142–143
distortions of, 136
environments, 135–37

learning, 132
in navigation and map design. See Maps
strategies, 126
Spatial proximity, 65–67, 71–75
Spectral analysis, 187–188
Speech control. See Control, voice
Speech perception, 186–192. See also Auditory displays
Speed-accuracy tradeoff, 60, 320, 367–68
Splay, 105–106
Sports, 20, 251–52, 286, 291, 323
Stages of processing, 404
in automation, 381–86, 400–02
in decision making, 247–48
in multi-tasking, 326–27
in response time, 303–04
Stereo vision (stereopsis), 111–12, 119–20
Stevens’ law, 90–91
Strategic control. See also Speed accuracy tradeoff; Metacognition
in memory, 230–31
in stress, 366–368
in multi-tasking, 330–337
Stress, 360–70
in design, 301
decision making, 246, 368
performance effects, 316–17, 363–68
remedies for, 369–70
theories of, 363–68
time, 36, 275, 361
training for, 369–70
Stroop task, 68, 341
Supervisory control, 50–53, 150, 336–37, 382
Survey knowledge, 132

T
Tactile sense, 121–22, 152, 328
Task analysis, 207, 239
Teams, 145, 195, 213–14, 222, 318
Telepresence, 154
Temporal discounting, 270, 273
Texting, 340
Therapy, 153
Time representation, 139, 142
Three Mile Island, 365
Time line analysis, 349–50
Time stress, 36, 275. See also Speed-accuracy tradeoff
Time-to-contact (Tau), 107
Time-sharing skill, 230, 342–43
Top-down processing. See Expectancies
Tracking, 145–150
compatibility, 297
driving, 147
dynamics, 143–48, 217
gain, 147
lag, 147–48
multi-axis, 149–150
multi-task, 143, 149–150, 327
prediction in, 149–150
stability, 148
in visualization, 142–43
Training, 223–34. See also Expertise; Automaticity
for automation, 25, 392, 404
in attention, 344
cost of, 225
crew resource management (CRM), 194–95, 228
in decision making, 279–82
electronic learning, 153
feedback in, 232. See also Feedback
interruption management, 340
for multi-tasking, 343–44
for navigation, 132
realism, 226–27
and stress, 369–70
techniques of. See Training strategies
team, 214
transfer of, 223–28

virtual environments for, 152
in visual perception, 60
Training strategies, 228–34
attention training, 230, 343
active learning, 230–31
adaptive training, 230
error prevention, 229, 319
overlearning, 233
practice distribution, 233–34
scaffolding, 229
Transcranial Doppler Sonography (TCD), 355–356
Transcription, 310
Translating, 308, 310
Transfer effectiveness ratio (TER), 225–26
Transfer of training. See Training
Trouble shooting, 256
Typing. See Controls, Transcription

U
Unmanned Air Vehicles (UAV), 149–150, 214, 379, 397
Ubiquitous computing, 154–55
Unitization, 166–168
Useful field of view (UFOV), 56–57, 60
USS Vincennes incident, 257, 364

V
Vigilance, 25–31, 325. See also Signal detection theory
and sustained attention, 27–28, 49
techniques to combat decrement, 28–31
theories, 27–28
Violations, 318
Virtual environments, 63, 99, 117, 132, 150–59
applications of, 152–156
features of, 151–52
problems with, 156–59
in training, 159, 227
Visual channels
in perception, 103–09
in multi-tasking, 329
Visual illusions, 88–89, 107–08, 112–13
in depth perception, 112–113
in graph perception, 88–91
Visual momentum, 93, 131, 141, 144–45, 150
Visual scanning, 50–55
in automation, 52, 391
of graphs, 88
mental workload, 357
models of, 50–55, 337, 339
in multi-tasking, 328, 339
in reading, 164–65
skill in, 342
training in, 231, 343
useful field of view (UFOV), 56–57, 60
Visual search, 56–61, 93, 161
in maps, 133–35
models of, 57–60
speed-accuracy tradeoff, 289
in visualization, 137–38
Visualization. See Information visualization
Voice. See Control, voice
Voice recognition, 190–192
Voting, 175–76

W
Warnings, 73, 184–86. See also Alarms & alerts
Warrick principle, 299
Weather forecasting, 261, 264
Working memory, 197–207
analysis, 207
binding, 199
capacity, 203–04

central executive, 200–01
chunking, 109, 167, 205, 243
codes of (spatial versus verbal), 200–201
duration of, 203–04
episodic buffer, 199
forgetting of, 135, 203–04
genetics of, 373–74
individual differences in, 55, 216, 218, 243, 371–72
in instructions, 180
in intelligence, 372
interference of, 81–82, 200–01, 205–07
in learning, 228–31
in multi-tasking, 327, 333, 338, 353–4
in moral control, 199
neuro-ergonomics of, 352–54
in reading, 165
in situation awareness, 216
phonological loop, 198
spatial, 123. See also Spatial cognition
stress effects on, 353–54
visuo-spatial sketchpad (spatial WM), 198
Workload. See Mental workload

Y
Yerkes-Dodson law, 362–63

Z
Zipf's law, 170
