VERITAS
Virtual and Augmented Environments and Realistic User Interactions
To achieve Embedded Accessibility DesignS
247765
Table of Contents
Version History Table............................................................................................i
List of Figures......................................................................................................vi
List of Tables........................................................................................................ix
List of Code Snippets..........................................................................................xi
List of Abbreviations...........................................................................................xiii
Executive Summary.............................................................................................1
1 Introduction........................................................................................................2
1.1 Defining Multimodal Interaction............................................................................2
1.1.1 Multimodal interaction: a human-centered view..................................................2
1.1.2 Multimodal interaction: a system-centered view..................................................3
1.2 Modelling Multimodal Interaction.........................................................................5
1.2.1 Bernsen's taxonomy...............................................................................................5
1.2.2 Architecture of multimodal user interfaces...........................................................6
1.2.3 Fusion of input modalities....................................................................................7
1.2.3.1 PAC-Amodeus................................................................................................... 7
1.2.3.2 Open Agent Architecture....................................................................................7
1.2.3.3 Multimodal Architecture proposed by Obrenovic et al........................................8
1.3 Multimodal Applications.....................................................................................11
1.3.1 Ambient spaces..................................................................................................12
1.3.2 Mobile/wearable.................................................................................................12
1.3.3 Virtual environments..........................................................................................13
1.3.4 Art.......................................................................................................................13
1.3.5 Users with disabilities.........................................................................................13
1.3.5.1 Users with disabilities - automotive................................................................14
1.3.5.2 Users with disabilities - smart living spaces....................................................14
1.3.5.3 Users with disabilities - workplace design.......................................................15
1.3.5.4 Users with disabilities - infotainment...............................................................15
1.3.5.5 Users with disabilities - personal healthcare and well-being...........................16
1.3.6 Public and private spaces..................................................................................16
1.3.7 Other..................................................................................................................16
2 Assistive Technologies (AT).............................................................................17
2.1 Assistive Technologies for visually impaired users.............................................17
2.1.1 Electronic reading devices.................................................................................17
2.1.2 Lasercane..........................................................................................................17
2.1.3 Braille and refreshable braille............................................................................18
2.1.4 Screen magnifiers..............................................................................................18
2.1.5 Screen readers...................................................................................................18
2.1.6 Text browsers.....................................................................................................18
2.1.7 Voice browsers...................................................................................................19
2.1.8 Haptic devices....................................................................................................19
2.2 Assistive Technologies for hearing impaired users.............................................19
2.2.1 Hearing aids.......................................................................................................19
2.2.2 Visual System for Telephone and Mobile Phone Use........................................20
2.2.3 Visual notification devices..................................................................................21
2.2.4 Gesture recognition............................................................................................21
2.3 Assistive Technologies for motor impaired users...............................................22
List of Figures
Figure 1: Design space for multimodal systems [9]......................................................................4
Figure 2: An architecture of multimodal user interfaces. Adapted from [16]..................................6
Figure 3: Obrenovic et al. framework [21]......................................................................8
Figure 4: Multimodal Interaction Models architecture overview diagram....................................44
Figure 5: Generic structure of a Multimodal Interaction Model...................................................45
Figure 6: Multimodal Interaction Model "grasp door" example....................................................45
Figure 7: Enabling relationship - Indicative explanatory case.....................................46
Figure 8: Enabling relationship with information passing - Indicative explanatory case.........47
Figure 9: Choice relationship - Indicative explanatory case.......................................47
Figure 10: Concurrency relationship - Indicative explanatory case.............................47
Figure 11: Concurrency with information passing relationship - Indicative explanatory case.....48
Figure 12: Order independency - Indicative explanatory case..................................48
Figure 13: Disabling relationship - Indicative explanatory case................................48
Figure 14: Suspend/Resume relationship - Indicative explanatory case....................49
Figure 15: Walk Multimodal Interaction Model relationships.....................................................50
Figure 16: See Multimodal Interaction Model relationships......................................................51
Figure 17: Hear Multimodal Interaction Model relationships.....................................................52
Figure 18: Grasp: Door handle Multimodal Interaction Model relationships.............................53
Figure 19: Pull (hand): Door handle Multimodal Interaction Model relationships......................54
Figure 20: Walk: To car seat Multimodal Interaction Model relationships.................................55
Figure 21: Grasp (right hand): Steering wheel Multimodal Interaction Model relationships......56
Figure 22: Grasp (hand): Door handle Multimodal Interaction Model relationships..................58
Figure 23: Pull (hand): Window bar Multimodal Interaction Model relationships......................59
Figure 24: Sit: Bed Multimodal Interaction Model relationships................................................60
Figure 25: Pull (hand): Toilet flush valve Multimodal Interaction Model relationships...............62
Figure 26: Stand up (knee, back): Toilet Multimodal Interaction Model relationships...............63
Figure 27: Grasp (hand): Mouse Multimodal Interaction Model relationships...........................65
Figure 28: See: Computer screen Multimodal Interaction Model relationships.........................66
Figure 29: Push (hand): Medical device button Multimodal Interaction Model relationships.......67
Figure 30: Read: Message on touch screen Multimodal Interaction Model relationships.........69
Figure 31: Press: OK button on the touch screen Multimodal Interaction Model relationships. 70
Figure 32: Multimodal Interfaces Manager architecture..............................................................71
Figure 33: Data input and output in Modality Compensation and Replacement Module.............72
Figure 34: The Multimodal Toolset Manager and its modules. Some modules are used for
getting input from users, while others for passing information to them. The modality type of
each tool is also indicated..................................................................................................... 73
Figure 35: Data flow concerning the Speech Recognition Module. Both real-time and pre-
recorded speech-audio is supported and converted to text by the module............................74
Figure 71: Grasp (hand): Faucet controls Multimodal Interaction Model relationships...........128
Figure 72: Grasp (hand): Hob gas control knob Multimodal Interaction Model relationships..129
Figure 73: Push (hand): Stove knob Multimodal Interaction Model relationships...................130
Figure 74: Pull (hand): Washing machine porthole handle Multimodal Interaction Model
relationships........................................................................................................................ 131
Figure 75: Turn (hand): Dishwasher knob Multimodal Interaction Model relationships..........132
Figure 76: Push (hand): Hood button Multimodal Interaction Model relationships.................133
Figure 77: Pull (hand): Oven door handle Multimodal Interaction Model relationships...........134
Figure 78: Twist (hand): Faucet control Multimodal Interaction Model relationships..............135
Figure 79: Stand up (knee, back): Toilet Multimodal Interaction Model relationships.............137
Figure 80: Push (hand): Mouse Multimodal Interaction Model relationships..........................138
Figure 81: Move (hand): mouse Multimodal Interaction Model relationships..........................139
Figure 82: Press (hand): Keyboard key Multimodal Interaction Model relationships..............140
List of Tables
Table 1: Different senses and their corresponding modalities [5]..................................................2
Table 2: Interaction modalities described using the Obrenovic et al. framework [21].....................9
Table 3: Disabilities and their constraints (from [21])..................................................................10
Table 4: Constraints introduced by driving a car (from [97])........................................................11
Table 5: Walk Multimodal Interaction Model definition..............................................................49
Table 6: See Multimodal Interaction Model definition..............................................................51
Table 7: Hear Multimodal Interaction Model definition..............................................................51
Table 8: Grasp: Door handle Multimodal Interaction Model definition.......................................52
Table 9: Pull (hand): Door handle Multimodal Interaction Model definition...............................53
Table 10: Walk: To car seat Multimodal Interaction Model definition.........................................54
Table 11: Grasp (right hand): Steering wheel Multimodal Interaction Model definition.............56
Table 12: Grasp (hand): Door handle Multimodal Interaction Model definition.........................58
Table 13: Pull (hand): Window bar Multimodal Interaction Model definition..............................59
Table 14: Sit: Bed Multimodal Interaction Model definition.......................................................60
Table 15: Pull (hand): Toilet flush valve Multimodal Interaction Model definition......................62
Table 16: Stand up (knee, back): Toilet Multimodal Interaction Model definition.......................63
Table 17: Grasp (hand): Mouse Multimodal Interaction Model definition..................................65
Table 18: See: Computer screen Multimodal Interaction Model definition................................66
Table 19: Push (hand): Medical device button Multimodal Interaction Model definition............67
Table 20: Read: Message on touch screen Multimodal Interaction Model definition................68
Table 21: Press: OK button on the touch screen Multimodal Interaction Model definition........69
Table 22: Multimodal Toolset Manager's modules and their modality requirements from the
target users........................................................................................................................... 74
Table 23: Features of the chosen speech recognition software, CMU Sphinx............................75
Table 24: Features of the speech synthesizers used by the Speech Synthesis Module...............77
Table 25: Sit: Car seat Multimodal Interaction Model definition................................................92
Table 26: Swing (legs): Inside car Multimodal Interaction Model definition...............................93
Table 27: Grasp (hand): Interior door handle Multimodal Interaction Model definition..............95
Table 28: Pull (left hand): Interior door handle Multimodal Interaction Model definition............96
Table 29: Push (right hand): Lock button Multimodal Interaction Model definition....................97
Table 30: Press (right hand): Eject button on belt buckle Multimodal Interaction Model
definition................................................................................................................................ 98
Table 31: Grasp (right hand): Interior door handle Multimodal Interaction Model definition....100
Table 32: Push (left hand): Interior door side Multimodal Interaction Model definition............101
Table 33: Pull down (hands): Sun shield Multimodal Interaction Model definition..................102
Table 34: Grasp (hand): Steering wheel Multimodal Interaction Model definition...................103
Table 35: Push (left foot): Gear pedal Multimodal Interaction Model definition.......................104
Table 36: Push (right foot): Accelerator pedal Multimodal Interaction Model definition...........105
Table 37: Push (right foot): Brake pedal Multimodal Interaction Model definition...................107
Table 38: Push (thumb): Parking brake release button Multimodal Interaction Model definition.
............................................................................................................................................ 108
Table 39: Pull (hand): Hand brake Multimodal Interaction Model definition............................109
Table 40: Grasp (hand): Light switch Multimodal Interaction Model definition........................110
Table 41: Turn (hand): Light switch Multimodal Interaction Model definition...........................112
Table 42: Move up/down (hand): Direction indicator Multimodal Interaction Model definition. 113
Table 43: Grasp (hand): Radio knob Multimodal Interaction Model definition.........................115
Table 44: Turn (hand): Radio knob Multimodal Interaction Model definition...........................116
Table 45: Push (hand): Radio button Multimodal Interaction Model definition........................117
Table 46: Push (hand): Window button Multimodal Interaction Model definition.....................118
Table 47: Grasp (hand): Window handle Multimodal Interaction Model definition...................119
Table 48: Turn (hand): Window handle Multimodal Interaction Model definition.....................120
Table 49: Turn (hand): Rear mirror Multimodal Interaction Model definition...........................121
Table 50: Push (hand): Rear mirror Multimodal Interaction Model definition..........................122
Table 51: Grasp (right hand): Gear handle Multimodal Interaction Model definition...............123
Table 52: Push (right hand): Gear handle Multimodal Interaction Model definition.................124
Table 53: Push (hand): Navigation system buttons Multimodal Interaction Model definition.. 125
Table 54: Push (right foot): Rear brake pedal Multimodal Interaction Model definition...........126
Table 55: Listen: Navigation system audio cues Multimodal Interaction Model definition.......127
Table 56: Grasp (hand): Faucet controls Multimodal Interaction Model definition..................128
Table 57: Grasp (hand): Hob gas control knob Multimodal Interaction Model definition.........129
Table 58: Push (hand): Stove knob Multimodal Interaction Model definition..........................130
Table 59: Pull (hand): Washing machine porthole handle Multimodal Interaction Model
definition.............................................................................................................................. 131
Table 60: Turn (hand): Dishwasher knob Multimodal Interaction Model definition..................132
Table 61: Push (hand): Hood button Multimodal Interaction Model definition.........................133
Table 62: Pull (hand): Oven door handle Multimodal Interaction Model definition..................134
Table 63: Twist (hand): Faucet control Multimodal Interaction Model definition......................135
Table 64: Sit (knee, back): On toilet Multimodal Interaction Model definition..........................136
Table 65: Push (hand): Mouse Multimodal Interaction Model definition..................................138
Table 66: Move (hand): mouse Multimodal Interaction Model definition.................................139
Table 67: Press (hand): Keyboard key Multimodal Interaction Model definition.....................140
CodeSnippet 26: Push (left hand): Interior door side Multimodal Interaction Model (UsiXML
source code)........................................................................................................................ 102
CodeSnippet 27: Pull down (hands): Sun shield Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 103
CodeSnippet 28: Grasp (hand): Steering wheel Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 104
CodeSnippet 29: Push (left foot): Gear pedal Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 105
CodeSnippet 30: Push (right foot): Accelerator pedal Multimodal Interaction Model (UsiXML
source code)........................................................................................................................ 106
CodeSnippet 31: Push (right foot): Brake pedal Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 108
CodeSnippet 32: Push (thumb): Parking brake release button Multimodal Interaction Model
(UsiXML source code)......................................................................................................... 109
CodeSnippet 33: Pull (hand): Hand brake Multimodal Interaction Model (UsiXML source code).
............................................................................................................................................. 110
CodeSnippet 34: Grasp (hand): Light switch Multimodal Interaction Model (UsiXML source
code).................................................................................................................................... 111
CodeSnippet 35: Turn (hand): Light switch Multimodal Interaction Model (UsiXML source
code).................................................................................................................................... 113
CodeSnippet 36: Move up/down (hand): Direction indicator Multimodal Interaction Model
(UsiXML source code)......................................................................................................... 114
CodeSnippet 37: Grasp (hand): Radio knob Multimodal Interaction Model (UsiXML source
code).................................................................................................................................... 115
CodeSnippet 38: Turn (hand): Radio knob Multimodal Interaction Model (UsiXML source
code).................................................................................................................................... 116
CodeSnippet 39: Push (hand): Radio button Multimodal Interaction Model (UsiXML source
code).................................................................................................................................... 117
CodeSnippet 40: Push (hand): Window button Multimodal Interaction Model (UsiXML source
code).................................................................................................................................... 118
CodeSnippet 41: Grasp (hand): Window handle Multimodal Interaction Model (UsiXML source
code).................................................................................................................................... 119
CodeSnippet 42: Turn (hand): Window handle Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 120
CodeSnippet 43: Turn (hand): Rear mirror Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 121
CodeSnippet 44: Push (hand): Rear mirror Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 122
CodeSnippet 45: Grasp (right hand): Gear handle Multimodal Interaction Model (UsiXML
source code)........................................................................................................................ 123
CodeSnippet 46: Push (right hand): Gear handle Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 124
CodeSnippet 47: Push (hand): Navigation system buttons Multimodal Interaction Model
(UsiXML source code)......................................................................................................... 126
CodeSnippet 48: Push (right foot): Rear brake pedal Multimodal Interaction Model (UsiXML
source code)........................................................................................................................ 127
CodeSnippet 49: Listen: Navigation system audio cues Multimodal Interaction Model (UsiXML
source code)........................................................................................................................ 128
CodeSnippet 50: Grasp (hand): Faucet controls Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 129
CodeSnippet 51: Grasp (hand): Hob gas control knob Multimodal Interaction Model (UsiXML
source code)........................................................................................................................ 130
CodeSnippet 52: Push (hand): Stove knob Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 131
CodeSnippet 53: Pull (hand): Washing machine porthole handle Multimodal Interaction Model
(UsiXML source code)......................................................................................................... 132
CodeSnippet 54: Turn (hand): Dishwasher knob Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 133
CodeSnippet 55: Push (hand): Hood button Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 134
CodeSnippet 56: Pull (hand): Oven door handle Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 135
CodeSnippet 57: Twist (hand): Faucet control Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 136
CodeSnippet 58: Sit (knee, back): On toilet Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 138
CodeSnippet 59: Push (hand): Mouse Multimodal Interaction Model (UsiXML source code).139
CodeSnippet 60: Move (hand): mouse Multimodal Interaction Model (UsiXML source code).
............................................................................................................................................ 140
CodeSnippet 61: Press (hand): Keyboard key Multimodal Interaction Model (UsiXML source
code)................................................................................................................................... 141
List of Abbreviations
Abbreviation Explanation
AT Assistive Technologies
GPS Global Positioning System
PDA Personal digital assistant
SP Sub-Project
UsiXML USer Interface eXtensible Markup Language
Executive Summary
This manuscript presents the research and development of tools, and their respective
data structures, for the design of multimodal interfaces tailored to the demands and
expectations of people with special needs, i.e. people with disabilities and
impairments. The focus is on categories of users with physical or cognitive
impairments (blind and low-vision users, motor-impaired users, users with mild
cognitive impairment, speech-impaired users and hearing-impaired users), including
older people. The generated models are able to simulate the various steps of the
interaction process for mono- and multimodal interfaces and to link them to the
sensorial capabilities of the users.
In Section 1, the concept of multimodal interaction is introduced and a detailed
analysis of the interaction mechanisms of current HCI solutions (e.g. touch, voice
control, speech output, gesture, GUIs) is presented. This analysis is needed to assess
the usability gaps of each of the considered HCI solutions for the selected user
groups.
In Section 2, widely used assistive devices are identified and analysed in terms of
several parameters, such as their input and output modalities, mobility and robustness.
The state of the art of multimodal interfaces for older people and people with
disabilities (in the visual, hearing, motor and cognitive domains) is also reviewed.
In Section 3, two lists are defined: one describing the interfaces, and one describing
the various (impaired) user groups together with the interfaces that can be used to
address their deficiencies.
In Section 4, the manuscript presents the specifications of the Multimodal Interaction
Models, which include combinations of interfacing modalities most suited for the target
user groups and connect the Virtual User Models, developed in SP1, to the Task
Models and the virtual prototype to be tested. Solution cases for five application
domains are presented: automotive, smart living places, workplace/office, infotainment
and personal healthcare.
Section 5 describes the implementation of the Multimodal Interfaces Manager. The
manager's architecture is analysed and the features of its two main components are
presented.
1 Introduction
If we observe the modalities from a neurobiological point of view [6][7], we can divide
them into seven groups:
internal chemical (blood oxygen, glucose, pH),
external chemical (taste, smell),
The different dimensions of the design space define eight possible classes of systems.
By definition [9], multimodal systems require the value Meaning in the Levels of
abstraction. Thus, there are four distinct classes of multimodal systems: exclusive,
alternate, concurrent and synergistic. Use of modalities refers to the temporal
dimension. Parallel operation can be achieved at different levels of abstraction [10].
The most important level is the task level, the level at which the user interacts with the
system; operation must appear concurrent at this level for the user to perceive the
system as parallel. Low-level (physical-level) concurrency is not a requirement for a
multimodal system, but it needs to be considered in the implementation: most current
computing systems can give the impression of concurrent processing even though the
processing is in fact sequential.
Fusion is the most demanding criterion in the design space [11]. Here, Combined
means that different modalities are combined into synergistic higher-level input tokens,
while Independent means that the interface contains parallel but unlinked modalities.
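To make this classification concrete, the following illustrative sketch (added here for explanation; it is not part of the original deliverable) maps the two remaining dimensions onto the four classes, assuming the commonly cited correspondence between the sequential/parallel and independent/combined values and the class names used above:

# Illustrative sketch only: classifying a multimodal system in the design space
# of [9], assuming the usual mapping of "Use of modalities" (sequential/parallel)
# and "Fusion" (independent/combined) onto the four classes named in the text.

from enum import Enum

class Use(Enum):
    SEQUENTIAL = "sequential"
    PARALLEL = "parallel"

class Fusion(Enum):
    INDEPENDENT = "independent"
    COMBINED = "combined"

# (use of modalities, fusion) -> class of multimodal system (Levels of abstraction = Meaning)
CLASSES = {
    (Use.SEQUENTIAL, Fusion.INDEPENDENT): "exclusive",
    (Use.SEQUENTIAL, Fusion.COMBINED):    "alternate",
    (Use.PARALLEL,   Fusion.INDEPENDENT): "concurrent",
    (Use.PARALLEL,   Fusion.COMBINED):    "synergistic",
}

def classify(use: Use, fusion: Fusion) -> str:
    return CLASSES[(use, fusion)]

# Example: speech and pen used at the same time and fused into a single command.
print(classify(Use.PARALLEL, Fusion.COMBINED))  # -> "synergistic"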
The synergistic use of modalities implies the fusion of data from different modelling
techniques. Nigay and Coutaz [9] have identified three levels of fusion: lexical,
syntactic and semantic. They can be mapped to the three conceptual design levels
defined in [12]:
Lexical fusion corresponds to the conceptual Binding level. It happens when
hardware primitives are bound to software events. An example of lexical fusion
is selecting multiple objects when the shift key is down.
Syntactic fusion corresponds to the conceptual Sequencing level. It involves the
combination of data to obtain a complete command. Sequencing of events is
important at this level. An example of syntactic fusion is synchronizing speech
and pen input in a map selection task.
Semantic fusion corresponds to the conceptual Functional level. It specifies the
detailed functionality of the interface: what information is needed for each
operation on the object, how to handle the errors, and what the results of an
operation are. Semantic fusion defines meanings but not the sequence of
actions or the devices with which the actions are conducted. An example of
semantic fusion is a flight route selection task that requires at least two airports
as its input through either touch or speech and draws the route on a map.
Still, defining multimodal systems as systems that make use of multiple input or output
modalities does not describe the properties of the actual systems very well. There are two
The presented architecture contains many models for different processes in the
system. Each of these models can be refined to fulfil the requirements of a given
system. Specifically, user and discourse models are highly important in a multimodal
interface. There can be one or more user models in a given system. If there are several
user models, the system can also be described as an adaptable user interface. The
discourse model handles the user interaction at a high level, and uses media analysis
and media design processes to understand what the user wants and to present the
information with the appropriate output channels. The author of [19] discusses user and
discourse models for multimodal communication when describing an intelligent multimodal
1.2.3.1 PAC-Amodeus
Nigay and Coutaz [9][17] present a fusion mechanism that is used with their PAC-
Amodeus software architecture model. They use a melting pot metaphor to model a
multimodal input event. PAC agents [10] act in this multiagent system and are
responsible for handling these events, in a system similar to Bolt's "Put-That-There"
[20]. In their 1995 paper Nigay and Coutaz explain the fusion mechanism in detail and
give the metrics and rules that they use in the fusion process. They divide fusion into
three classes. Microtemporal fusion is used to combine related informational units
produced in a parallel or pseudo-parallel manner. It is performed when the structural
parts of the input melting pots are complementary and when their time intervals
overlap. Macrotemporal fusion is used to combine related information units that are
produced or processed sequentially. It is performed when the structural parts of the
input melting pots are complementary and their time intervals belong to the same
temporal window but do not overlap. Contextual fusion is used to combine related
information units without attention to temporal constraints. Their algorithm favours
parallelism and is thus based on an eager strategy: it continuously attempts to combine
input data. This may possibly lead to incorrect fusion and requires an undo mechanism
to be implemented.
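The temporal criterion described above can be illustrated with a short sketch (added here for explanation only; the MeltingPot structure and the 500 ms window are assumptions made for the example, not values taken from [9][17]):

# Illustrative sketch of the temporal criterion used in melting-pot fusion.
# The data structure and the temporal window size are assumptions for this example.

from dataclasses import dataclass
from typing import Dict, Optional

TEMPORAL_WINDOW_MS = 500  # assumed size of the macrotemporal window

@dataclass
class MeltingPot:
    slots: Dict[str, object]  # structural parts, e.g. {"action": "put", "location": None}
    t_start: float            # ms
    t_end: float              # ms

def complementary(a: MeltingPot, b: MeltingPot) -> bool:
    """True if the filled slots of the two pots do not conflict."""
    filled_a = {k for k, v in a.slots.items() if v is not None}
    filled_b = {k for k, v in b.slots.items() if v is not None}
    return not (filled_a & filled_b)

def fusion_type(a: MeltingPot, b: MeltingPot) -> Optional[str]:
    """Classify the fusion of two complementary melting pots by their timing."""
    if not complementary(a, b):
        return None
    if a.t_start <= b.t_end and b.t_start <= a.t_end:
        return "microtemporal"  # time intervals overlap (parallel or pseudo-parallel input)
    gap = min(abs(a.t_start - b.t_end), abs(b.t_start - a.t_end))
    if gap <= TEMPORAL_WINDOW_MS:
        return "macrotemporal"  # sequential, but within the same temporal window
    return None  # only contextual fusion, which ignores timing, could still apply

# Example: speech "put that there" overlapping a pointing gesture.
speech = MeltingPot({"action": "put", "location": None}, t_start=0, t_end=800)
gesture = MeltingPot({"action": None, "location": (120, 45)}, t_start=300, t_end=400)
print(fusion_type(speech, gesture))  # -> "microtemporal"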
and reduce the need for error correction. It should be noted, however, that multiple
modalities alone do not bring benefits to the interface: the use of multiple modalities
may be ineffective or even disadvantageous. In this context, Oviatt [11][25] has
presented the common misconceptions (myths) of multimodal interfaces, most of them
related to the use of speech as an input modality.
The types of modalities used, as well as the integration models vary widely from
application to application. The literature on applications that use multimodal interaction
is vast and could well deserve a survey of its own [26][27][28][29][30][31][32].
Therefore, we do not attempt a complete survey of multimodal applications. Instead we
give a general overview of some of the major areas by focusing on specific application
areas in which interesting progress has been made. In particular, we focus on the
areas below.
1.3.2 Mobile/wearable
The recent drop in costs of hardware has led to an explosion in the availability of
mobile computing devices. One of the major challenges is that while devices such as
PDAs and mobile phones have become smaller and more powerful, there has been
little progress in developing effective interfaces to access the increased computational
and media resources available in such devices. Mobile devices, as well as wearable
devices, constitute a very important area of opportunity for research in multimodal
applications, because natural interaction with such devices can be crucial in
overcoming the limitations of current interfaces. Several researchers have recognized
this, and many projects exist on mobile and wearable multimodal applications. The
authors of [50] integrate pen and speech input for PDA interaction. The use of
computer vision, however, is also being explored in projects such as [51], in which a
tourist can take photographs of a site to obtain additional information about the site. In
[52], the authors present two techniques (head tilt and gesture with audio feedback) to
control a mobile device. The authors of [53] use MMHCI to augment human memory:
RFID tags are used in combination with a head mounted display and a camera to
capture video and information of all the objects the user touches. The authors of [54]
combine eye tracking with video, head tracking, and hand motion information. The
authors of [55] use eye tracking to understand eye-hand coordination in natural tasks,
and in [56] eye tracking is used in a video blogging application.
1.3.4 Art
Perhaps one of the most exciting application areas of multimodal applications is art.
Vision techniques can be used to allow audience participation [60] and influence a
performance. In [61], the authors use multiple modalities (video, audio, pressure
sensors) to output different emotional states for Ada, an intelligent space that
responds to multimodal input from its visitors. In [62], a wearable camera pointing at
the wearer's mouth interprets mouth gestures to generate MIDI sounds (so a musician
can play other instruments while generating sounds by moving his mouth). In [63], limb
movements are tracked to generate music. Multimodal applications can also be used in
museums to augment exhibitions [64].
interpret facial gestures for wheelchair navigation. The authors of [68] introduce a
system for presenting digital pictures non-visually (multimodal output), and the
techniques in [69] can be used for interaction using only eye blinks and eyebrow
movements. Some of the approaches in other application areas (e.g., [70]) could also
be beneficial for people with disabilities; multimodal applications have great potential
in making computers and other resources accessible to people with disabilities.
For a particular type of user activity and location, appropriate multimodal interfaces
were designed, incorporating visual, speech-audio, appliance and environmental
activity to drive I/O actions. [82] also describes a smart home with a difference: here
users employ a multimodal VR application (modes employed: visual, acoustic, tactile,
haptic) to check and change online the status of real devices (e.g. lights, window
blinds, heating) in the smart home. [83] describes the AutonomaMente project, which
has developed a highly customizable application based on multimodal communication
(speech, icons, text) to support autonomous living of persons with cognitive disabilities
in special apartments fitted with domotic sensors. Last but not least, the issue of
context awareness and multimodality in a smart home is also the subject of [84].
team of two highly realistic 3D agents present product items in an entertaining and
attractive way. The infotainment arena is one (possibly the only) VERITAS application
scenario where research has been done on the use of olfaction; thus, the
enhancement of multimedia infotainment content with olfactory data is the subject of
[89].
1.3.7 Other
Other applications include biometrics [100][101], surveillance, remote collaboration
[102], gaming and entertainment [103], education, and robotics ([104] gives a
comprehensive review of socially active robots). Multimodal applications can also play
an important role in safety-critical applications (e.g., medicine, military [105][106], etc.)
and in situations in which a lot of information from multiple sources has to be viewed in
short periods of time. A good example of this is crisis management [107].
2.1.2 Lasercane
The LaserCane has three laser beam channels projected from the cane to detect
upward, forward, and downward obstacles. When there is an obstacle, it reflects the
laser beam, which is then detected by the receiver on the cane. The reflected signal
results in the LaserCane producing vibration or sounds to warn the user of the
obstruction. Users have the option to inactivate the sounds and use only vibration for
warnings. The LaserCane is used similarly to the standard long cane and operates with
gain and the MTO switch. The MTO (microphone/telecoil/off) switch is used to select
different modes, according to hearing tasks and occasions: The Microphone (M) mode
is used for general communication occasions, the telecoil (T) operation mode is used
for telephone and induction system use, and the off (O) mode is used for battery saving
[111]. These additional functions are more effective for persons with severe to profound
hearing loss [112].
In-the-Ear Hearing Aids: In-the-ear (ITE) hearing aids fill the outer part of the ear with
all parts contained in a custom-made shell. ITE hearing aids can be further categorized
into: a) ITE aids, b) partially in-the-canal (ITC) aids, and c) complete-in-the-canal (CIC)
aids [111]. ITE aids are usually equipped with a telecoil, whereas ITC and CIC aids normally
have no telecoil inside due to their small size and the possibility of normal telephone
use. The small size of the ITC and CIC hearing aids makes battery replacement
difficult. The limited battery size also restricts amplification power. However, these two
styles of hearing aid sit deeper in the canal, permitting higher sound pressure levels
(SPLs) [111]. The size of all ITE hearing aids results in a greater possibility of feedback
due to the close distance between the microphone and speaker. Sophisticated designs
for ITE hearing aids are necessary to avoid feedback [111]. The controls of these
devices are also difficult to adjust; using a remote control system or automatic volume
control helps solve this problem [111].
Bone-Conduction Hearing Aids: Bone-conduction hearing aids are designed for
individuals with conductive hearing loss or occlusion of the ear canal that cannot be
treated surgically. The only difference between bone-conduction hearing aids and the
above-mentioned hearing aids is the receiver. The bone conduction aids can be BW or
BTE style with receivers (vibrators fitted in a headband) that transmit vibration to the
skull. The vibration is picked up directly by the cochlea.
problems with using a mobile phone with a TDD have been reported. Kozma-Spytek
[114] further indicated that using a TDD with a mobile phone does not provide better
portability and convenience for its users. An interactive text pager provides an easier
way for people with hearing impairment to communicate with text messaging. This
device has a small QWERTY keyboard for thumb input, and it enables users to send
text messages to another person's pager, mobile phone, or computer. Additional
functions available for this device include, for example, emailing, sending faxes, TDD
chat, and instant messaging. Two-way text messaging is now available on most mobile
phones and from most service providers. Sending text messages through mobile
phones is very popular among mainstream users; however, using a numeric keypad for text
input is not time-efficient. Newly designed text-messaging mobile phones (e.g., the
T-Mobile Sidekick) now provide a compact QWERTY keyboard with a large screen. These
phones allow better internet access and instant messaging. In addition to text
messaging, sending email and instant messaging are other popular ways to
communicate through text. Online instant messaging requires both users to be online
and using the same instant messaging program at the same time. However, text-
messaging mobile phones free their users from the constraints of time and place.
hand tracking [122], and the selection of suitable features [116]. After the parameters
are computed, the gestures represented by them need to be classified and interpreted
based on the accepted model and based on some grammar rules that reflect the
internal syntax of gestural commands. The grammar may also encode the interaction of
gestures with other communication modes such as speech, gaze, or facial expressions.
As an alternative to modeling, some authors have explored the use of combinations of
simple 2D motion based detectors for gesture recognition [123].
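As a concrete, entirely illustrative example of such grammar rules (the gesture classes, speech tokens and rules below are invented for this sketch and do not come from the cited work), a recognized hand gesture can be interpreted together with an optional co-occurring speech token:

# Illustrative sketch: a tiny grammar interpreting a gesture/speech pair.

from typing import Optional

# (gesture class, speech token) -> interpreted command; None means no speech is required.
GRAMMAR = {
    ("point", "select"):   "select-object-at-pointed-location",
    ("point", "move"):     "move-object-to-pointed-location",
    ("open_palm", "stop"): "cancel-current-action",
    ("swipe_left", None):  "previous-page",   # purely gestural command
    ("swipe_right", None): "next-page",
}

def interpret(gesture: str, speech: Optional[str]) -> Optional[str]:
    """Return the command encoded by the gesture/speech pair, or None if the
    combination is not covered by the grammar (ambiguous or invalid)."""
    return GRAMMAR.get((gesture, speech))

print(interpret("point", "move"))     # -> "move-object-to-pointed-location"
print(interpret("swipe_left", None))  # -> "previous-page"
print(interpret("point", None))       # -> None: pointing alone is ambiguous here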
In any case, to fully exploit the potential of gestures in an application that supports
multimodal interaction, the class of possible recognized gestures should be as broad
as possible and ideally any gesture performed by the user should be unambiguously
interpretable by the interface. However, most of the gesture-based HCI systems allow
only symbolic commands based on hand posture or 3D pointing. This is due to the
complexity associated with gesture analysis and the desire to build real-time interfaces.
Also, most of the systems accommodate only single-hand gestures. Yet, human
gestures, especially communicative gestures, naturally employ actions of both hands.
However, if two-handed gestures are to be allowed, several ambiguous situations may
appear (e.g., occlusion of the hands, intentional vs. unintentional movements) and the processing
time will likely increase.
Walker designs include rigid, folding, or wheeled (two, three, or four wheels) [128].
Rigid walkers require lifting to move forward. Wheeled walkers are simply pushed
forward. Folding walkers make transport easier. Most two-wheeled walkers have
automatic brakes and some have an auto-glide feature that skims the surface. Three-
and four-wheeled walkers have hand brakes and even though they are heavier, they
require less strength and energy. Today's walkers include features such as seats, trays,
baskets, and platform arm supports [125].
Walkers are a popular choice of mobility AT because they are readily available, easy to
use, robust, and relatively inexpensive when compared to other alternatives, such as
wheelchairs. They are typically used by people who have difficulties with walking,
balancing, agility, and/or standing for prolonged periods of time. These impairments are
a result of various factors and diseases, such as Parkinson's disease, multiple
sclerosis, arthritis, or the normal aging process. Walkers are used for both rehabilitation
and compensation of impairment(s), with the majority of use focusing on compensation.
There are numerous designs of walkers, with most of them falling into one of three
categories: 1) walking frames; 2) two-wheeled walkers; and 3) four-wheeled walkers
[129].
2.3.2 Wheelchairs
Wheelchairs are used for more severe mobility impairments. In general, wheelchairs
have seats, backs, footrests, and casters. The presence of other familiar features such
as push-handles, wheel locks, and large rear wheels with push-rims depend on the
purpose and specific use of the chair [130]. As with walkers, the goal of this type of AT
is typically rehabilitation or compensation for a particular impairment or disability.
People who use wheelchairs are usually unable to walk, or have difficulty walking or
standing due to various neurological dysfunctions or musculoskeletal diseases or
difficulties (e.g. muscular weakness). Common impairments that often require the use
of a wheelchair include spinal cord injuries, hemiplegia and other types of paralysis,
multiple sclerosis, cerebral palsy, arthritis, and lower limb amputation [131]. There are
three categories of wheelchairs: 1) dependent mobility (wheelchairs that are propelled
by an attendant); 2) independent manual mobility (manually propelled by the user); and
3) independent powered mobility (motor-propelled) [131].
2.3.3 Reachers
Reachers are helpful, low-cost devices for older adults designed to pick up small or
large objects (e.g., cans, pans, dishes, books, CDs), and can be used in a variety of
activities such as dressing, cooking, and gardening [131][132]. These devices can be
used to reach for items stored on high shelves, preventing the more risky approach of
climbing on a chair, stool, or ladder. Reachers extend the range of motion of a person
with disabilities (e.g., low back problems, arthritis, hypertension, stroke) [133]. Older
adults use reachers to pick up remote controls and to take cups and dishes in and out
of cabinets [131][132]. Reachers can be purchased from department stores, ordered
from catalogs or web sites, or prescribed in a medical rehabilitation setting. A study
[133] of reacher use by older adults found that they preferred lightweight reachers with
adjustable length, a lock system for grip, lever action trigger, forearm and wrist support,
life-time guarantee, and one-hand use. One of the most important criteria for reachers
used by elders is that they are lightweight. Self-closing or locking mechanisms are also
important because they eliminate the need to grasp the handle for a prolonged period
of time [133].
systems less accurate, although increasing computational power and lower costs mean
that more computationally intensive algorithms can be run in real time. As an
alternative, in [135], the authors propose using a single high-resolution image of one
eye to improve accuracy. On the other hand, infra-red-based systems usually use only
one camera, but the use of two cameras has been proposed to further increase
accuracy [136].
Although most research on non-wearable systems has focused on desktop users, the
ubiquity of computing devices has allowed for application in other domains in which the
user is stationary (e.g., [75][136]). For example, the authors of [75] monitor driver visual
attention using a single, non-wearable camera placed on a car's dashboard to track
face features and for gaze detection.
Wearable eye trackers have also been investigated mostly for desktop applications (or
for users that do not walk wearing the device). Also, because of advances in hardware
(e.g., reduction in size and weight) and lower costs, researchers have been able to
investigate uses in novel applications (eye tracking while users walk). For example, in
[54], eye tracking data are combined with video from the user's perspective, head
directions, and hand motions to learn words from natural interactions with users; the
authors of [55] use a wearable eye tracker to understand hand-eye coordination in
natural tasks, and the authors of [56] use a wearable eye tracker to detect eye contact
and record video for blogging.
The main issues in developing gaze tracking systems are intrusiveness, speed,
robustness, and accuracy. The type of hardware and algorithms necessary, however,
depend highly on the level of analysis desired. Gaze analysis can be performed at
three different levels [70]: a) highly detailed low-level micro-events, b) low-level
intentional events, and c) coarse-level goal-based events. Micro-events include micro-
saccades, jitter, nystagmus, and brief fixations, which are studied for their physiological
and psychological relevance by vision scientists and psychologists. Low-level
intentional events are the smallest coherent units of movement that the user is aware
of during visual activity, which include sustained fixations and revisits. Although most of
the work on HCI has focused on coarse-level goal-based events (e.g., using gaze as a
pointer [137]), it is easy to foresee the importance of analysis at lower levels,
particularly to infer the user's cognitive state in affective interfaces (e.g., [138]). Within
this context, an important issue often overlooked is how to interpret eye-tracking data.
In other words, as the user moves his eyes during interaction, the system must decide
what the movements mean in order to react accordingly. We move our eyes 2-3 times
per second, so a system may have to process large amounts of data within a short
time, a task that is not trivial even if processing does not occur in real-time. One way to
interpret eye tracking data is to cluster fixation points and assume, for instance, that
clusters correspond to areas of interest. Clustering of fixation points is only one option,
however, and as the authors of [139] discuss, it can be difficult to determine the
clustering algorithm parameters. Other options include obtaining statistics on measures
such as number of eye movements, saccades, distances between fixations, order of
fixations, and so on.
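To illustrate the clustering idea mentioned above (an added sketch with invented thresholds, not a description of any cited system), the following code collapses consecutive gaze samples into fixation points and then groups nearby fixation points into candidate areas of interest:

# Illustrative sketch: grouping gaze samples into fixations and fixation points
# into candidate areas of interest. Thresholds and data format are assumptions.

from typing import List, Tuple

Point = Tuple[float, float]

FIXATION_RADIUS_PX = 30.0  # max spread of samples within one fixation (assumed)
CLUSTER_RADIUS_PX = 80.0   # max distance between fixations in one area of interest (assumed)

def dist(a: Point, b: Point) -> float:
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

def detect_fixations(samples: List[Point]) -> List[Point]:
    """Collapse consecutive gaze samples that stay within FIXATION_RADIUS_PX
    of their running centroid into a single fixation point (the centroid)."""
    fixations, group = [], []
    for s in samples:
        if not group:
            group = [s]
            continue
        cx = sum(p[0] for p in group) / len(group)
        cy = sum(p[1] for p in group) / len(group)
        if dist(s, (cx, cy)) <= FIXATION_RADIUS_PX:
            group.append(s)
        else:
            fixations.append((cx, cy))
            group = [s]
    if group:
        cx = sum(p[0] for p in group) / len(group)
        cy = sum(p[1] for p in group) / len(group)
        fixations.append((cx, cy))
    return fixations

def cluster_fixations(fixations: List[Point]) -> List[List[Point]]:
    """Greedily group fixation points that lie close together; each resulting
    cluster is treated as a candidate area of interest."""
    clusters: List[List[Point]] = []
    for f in fixations:
        for c in clusters:
            if any(dist(f, other) <= CLUSTER_RADIUS_PX for other in c):
                c.append(f)
                break
        else:
            clusters.append([f])
    return clusters

gaze = [(100, 100), (104, 98), (99, 103), (400, 250), (402, 251), (120, 110)]
print(cluster_fixations(detect_fixations(gaze)))  # two clusters: around (100,100) and (400,250)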
Examples: Button activated doors, support bars inside car, leg lifters,
lever/switch/button to eject/release belt tongue from buckle, lever/switch/button to open
door, gear control on steering wheel, accelerator lever, ring accelerator, brake radial
lever, floor mounted braking levers, floor mounted accelerator levers, light switches on
steering wheel, headrest controls, voice activated doors, voice controlled system to
eject/release belt tongue from buckle, voice control gear system, voice control brake
system, light voice controlled system, voice controlled radio, voice controlled window,
voice activated mirror, voice controlled navigation system, etc.
Various assistive devices are also used in the workplace and in smart living places.
Examples: foot faucet controls, foot flush valve, sensor enabled faucet control, voice
controlled doors/windows, support bars, etc.
prototype of Activity Compass that recorded the GPS reading into a PDA [141]. The
Activity Compass monitors the activity paths stored on the server and infers which of
them it believes are in progress, based on the time, the user's location, and the path
followed so far. Visual guidance (directional arrows) is then provided on the PDA
screen; the arrows guide the user along the correct route to his/her most likely
destination. The technologies used in the Activity Compass are a hand-held computer,
a GPS receiver, and wireless communication. The most
recent version of Activity Compass uses only a smartphone and GPS receiver.
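A rough, purely hypothetical sketch of this kind of path matching is given below; none of the names, data or scoring choices come from [141]. It scores stored activity paths against the user's recent GPS positions and reports the destination of the best-matching path:

# Hypothetical sketch of destination prediction from stored activity paths.
# Path data and the matching score are invented for illustration only.

from typing import Dict, List, Tuple

LatLon = Tuple[float, float]

def point_to_path_distance(p: LatLon, path: List[LatLon]) -> float:
    """Distance (in degrees, crude) from a point to the nearest path vertex."""
    return min(((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5 for q in path)

def best_destination(recent_fixes: List[LatLon],
                     stored_paths: Dict[str, List[LatLon]]) -> str:
    """Pick the stored path whose vertices lie closest, on average, to the
    user's recent GPS fixes, and return its destination label."""
    scores = {
        destination: sum(point_to_path_distance(p, path) for p in recent_fixes) / len(recent_fixes)
        for destination, path in stored_paths.items()
    }
    return min(scores, key=scores.get)

paths = {
    "grocery store": [(47.651, -122.341), (47.653, -122.340), (47.655, -122.338)],
    "bus stop": [(47.651, -122.341), (47.649, -122.344), (47.647, -122.347)],
}
fixes = [(47.6512, -122.3408), (47.6528, -122.3401)]
print(best_destination(fixes, paths))  # -> "grocery store"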
as the interfaces from the previous section that can be used to address the
deficiencies.
3.2.1 Elderly
Aging is associated with decreases in muscle mass and strength. These decreases
may be partially due to losses of motor neurons. By the age of 70, these losses occur
in both proximal and distal muscles. In biceps brachii and brachialis, old adults show
decreased strength (by 1/3) correlated with a reduction in the number of motor units (by
1/2). Old adults show evidence that remaining motor units may become larger as motor
units innervate collateral muscle fibers [143].
Concerning walking gait, when confronted with an unexpected slip or trip while walking,
old adults have a less effective balance strategy than young adults: smaller
and slower postural muscle responses, altered temporal and spatial organization of the
postural response, agonist-antagonist muscle coactivation and greater upper trunk
instability. Comparing control and slip conditions, after the perturbation, young adults
have a longer stride length, a longer stride duration, and the same walk velocity
whereas old adults have a shorter stride length, the same stride duration, and a lower
walk velocity [144].
For the knee extensors, old adults produce less torque during dynamic or isometric
maximal voluntary contractions than young adults. The mechanisms controlling fatigue
in the elderly during isometric contractions are not the same as those that influence
fatigue during dynamic contractions, while young adults keep the same strategy. The
knee extensors of healthy old adults fatigue less during isometric contractions than do
those of young adults who had similar levels of habitual physical activity [145].
Old adults exhibit reductions in manual dexterity, which are observed through changes in
fingertip force when gripping and/or lifting [146].
There are many diseases, disorders, and age-related changes that may affect the eyes
and surrounding structures. As the eye ages certain changes occur that can be
attributed solely to the aging process. Most of these anatomic and physiologic
processes follow a gradual decline. With aging, the quality of vision worsens due to
reasons independent of diseases of the aging eye. While there are many changes of
significance in the nondiseased eye, the most functionally important changes seem to
be a reduction in pupil size and the loss of accommodation or focusing capability
(presbyopia [147]).
Hearing loss is one of the most common conditions affecting older adults. One in three
people older than 60 and half of those older than 85 have hearing loss.
A degree of memory loss is a normal part of aging.
Modalities affected: motor (speed, dexterity, fatigue resistance, gait), vision, hearing,
cognitive (memory loss).
Interfaces that address this user group:
If vision deficiencies are present: Speech recognition interfaces, Speech
synthesis interfaces, Screen reading interfaces, Voice browsing interfaces,
Audio playback interfaces, Alternative keyboards/switches interfaces, Haptic
interfaces.
3.2.5 Osteoarthritis
Osteoarthritis (OA) is a group of mechanical abnormalities involving degradation of
joints [155] including articular cartilage and subchondral bone. Symptoms may include
joint pain, tenderness, stiffness, locking, and sometimes an effusion. The main
symptom is pain, causing loss of ability and often stiffness. OA commonly affects the
hands, feet, spine, and the large weight bearing joints, such as the hips and knees,
although in theory, any joint in the body can be affected. As OA progresses, the
affected joints appear larger, are stiff and painful, and usually feel better with gentle use
but worse with excessive or prolonged use, thus distinguishing it from rheumatoid
arthritis.
Modalities affected: motor (range of motion, gait).
Interfaces that address this user group: Speech recognition interfaces,
Alternative keyboards/switches interfaces, Gaze/Eye tracking interfaces, Facial
expression recognition interfaces, Gesture recognition based interfaces.
3.2.6 Gonarthritis
Gonarthritis, or knee arthritis, results in mechanical pain, in other words, pain that
increases when walking, particularly going up or down stairs. Cracking or instability of
the knee may be noted (the knee seems to give way). Frequently, in addition to
persistent mechanical pain there may be cycles of intense pain accompanied by
inflammation of the joint with effusion. In the most advanced stages there may be a
decrease in the range of knee mobility (patients are not able to fully extend or bend the
knee) [156].
Modalities affected: motor (pain while walking, limited knee range of motion).
Interfaces that address this user group: Speech recognition interfaces,
Alternative keyboards/switches interfaces, Gaze/Eye tracking interfaces, Facial
expression recognition interfaces, Gesture recognition based interfaces.
3.2.7 Coxarthritis
Hip arthritis typically affects patients over 50 years of age. It is more common in people
who are overweight. The most common symptoms of hip arthritis are pain with
activities, limited range of motion, stiffness of the hip, walking with a limp [157].
Modalities affected: motor (gait, hip stiffness, limited hip range of motion).
Interfaces that address this user group: Speech recognition interfaces,
Alternative keyboards/switches interfaces, Gaze/Eye tracking interfaces, Facial
expression recognition interfaces, Gesture recognition based interfaces.
3.2.10 Hemiparesis
Hemiparesis is weakness on one side of the body. It is less severe than hemiplegia,
the total paralysis of the arm, leg, and trunk on one side of the body. Thus, the patient
can move the impaired side of the body, but with reduced muscular strength.
Depending on the type of hemiparesis diagnosed, different bodily functions can be
affected. People with hemiparesis often have difficulties maintaining their balance due
to limb weaknesses leading to an inability to properly shift body weight. This makes
performing everyday activities such as dressing, eating, grabbing objects, or using the
bathroom more difficult. Hemiparesis with origin in the lower section of the brain
creates a condition known as ataxia, a loss of both gross and fine motor skills, often
manifesting as staggering and stumbling [162].
Right-sided hemiparesis involves injury to the left side of the person's brain, which is
the side of the brain controlling speech and language. People who have this type of
hemiparesis often experience difficulty with talking and understanding what others say
[162].
In addition to problems understanding or using speech, persons with right-sided
hemiparesis often have difficulty distinguishing left from right. When asked to turn left
or right, or to raise a left or right limb, many of those affected with right-sided
hemiparesis will either turn or raise a limb in the wrong direction, or simply not follow
the command at all due to an inability to process the request [162].
Left-sided hemiparesis involves injury to the right side of the person's brain, which
controls learning processes, certain types of behavior, and non-verbal communication.
Injury to this area of a person's brain may also cause people to talk excessively, have
short attention spans, and have problems with short-term memory [162].
Modalities affected: motor (reduced muscular strength, loss of balance).
Interfaces that address to this user group: Speech recognition interfaces,
Alternative keyboards/switches interfaces, Gaze/Eye tracking interfaces, Facial
expression recognition interfaces, Gesture recognition based interfaces.
3.2.11 Hemiplegia
Hemiplegia is total paralysis of the arm, leg, and trunk on the same side of the body
[163]. Hemiplegia is more severe than hemiparesis, in which one half of the body is
weakened rather than paralyzed. Hemiplegia is not an uncommon medical disorder. In elderly
individuals, strokes are the most common cause of hemiplegia. In children, the majority
of cases of hemiplegia have no identifiable cause and occur with a frequency of about
one in every thousand births. Experts indicate that the majority of cases of hemiplegia
that occur up to the age of two should be considered to be cerebral palsy until proven
otherwise.
Problems associated with hemiplegia may include: difficulty with gait and with balance
while standing or walking; difficulty with motor activities like holding, grasping or
pinching; increasing stiffness of muscles; muscle spasms; difficulty with speech;
difficulty swallowing food; and significant delay in achieving developmental milestones
such as standing, smiling, crawling or speaking. The majority of children who develop
hemiplegia also have abnormal mental development, and behavior problems such as anxiety,
anger, irritability, and lack of concentration or comprehension may occur.
Modalities affected: motor (paralysis, gait, facial expressions), speech
(sometimes).
Interfaces that address this user group: Speech recognition interfaces (when
no speech deficiency is present), Alternative keyboards/switches interfaces,
Gaze/Eye tracking interfaces, Facial expression recognition interfaces, Gesture
recognition based interfaces.
3.2.15 Glaucoma
Glaucoma is an eye disease in which the optic nerve is damaged in a characteristic
pattern. This can permanently damage vision in the affected eye(s) and lead to
blindness if left untreated [170]. Glaucoma signs are gradually progressive visual field
loss, and optic nerve changes.
Modalities affected: vision (visual field loss).
Interfaces that address this user group: Speech recognition interfaces, Speech
synthesis interfaces, Screen reading interfaces, Voice browsing interfaces, Audio
playback interfaces, Alternative keyboards/switches interfaces, Haptic interfaces.
3.2.17 Cataract
Cataract is a clouding that develops in the crystalline lens of the eye or in its envelope
(lens capsule), varying in degree from slight to complete opacity and obstructing the
passage of light. As a cataract becomes more opaque, clear vision is compromised. A
loss of visual acuity is noted. Contrast sensitivity is also lost, so that contours, shadows
and color vision are less vivid. Veiling glare can be a problem as light is scattered by
the cataract into the eye [172].
3.2.19 Otitis
Otitis is a general term for inflammation or infection of the ear. It is subdivided into otitis
externa, media and interna. Otitis externa, external otitis, or "swimmer's ear" involves
the outer ear and ear canal. In external otitis, the ear hurts when touched or pulled.
When enough swelling and discharge is present in the ear canal to block the opening,
external otitis may cause temporary conductive hearing loss.
Otitis media or middle ear infection involves the middle ear. In otitis media, the ear is
infected or clogged with fluid behind the ear drum, in the normally air-filled middle-ear
space. Children with recurrent episodes of acute otitis media and those suffering from
otitis media with effusion or chronic otitis media have higher risks of developing
conductive and sensorineural hearing loss [174].
Otitis interna is an inflammation of the inner ear and is usually considered synonymous
with labyrinthitis. It results in severe vertigo lasting for one or more days. In rare cases,
hearing loss accompanies the vertigo in labyrinthitis.
Modalities affected: hearing.
Interfaces that address this user group: Visual notification interfaces, Sign
language synthesis interfaces, Augmented reality interfaces.
3.2.20 Otosclerosis
Otosclerosis is an abnormal growth of bone near the middle ear that can result in
conductive and/or sensorineural hearing loss. The primary form of hearing loss in
otosclerosis is conductive hearing loss (CHL), whereby sounds reach the ear drum but are
incompletely transferred via the ossicular chain in the middle ear, and thus partly fail
to reach the inner ear (cochlea).
3.2.23 Presbycusis
Presbycusis (age-related hearing loss) is the cumulative effect of aging on hearing.
Also known as presbyacusis, it is defined as a progressive bilateral symmetrical age-
related sensorineural hearing loss [178]. The hearing loss is most marked at higher
frequencies. Hearing loss that accumulates with age but is caused by factors other
than normal aging is not presbycusis, although differentiating the individual effects of
multiple causes of hearing loss can be difficult.
Modalities affected: hearing.
Interfaces that address this user group: Visual notification interfaces, Sign
language synthesis interfaces, Augmented reality interfaces.
3.2.24 Stuttering
Stuttering (alalia syllabaris), also known as stammering (alalia literalis or anarthria
literalis), is a speech disorder in which the flow of speech is disrupted by involuntary
repetitions and prolongations of sounds, syllables, words or phrases, and by involuntary
silent pauses or blocks in which the stutterer is unable to produce sounds [179]. It
affects approximately 1% of the adult population. The term stuttering is most commonly
associated with involuntary sound repetition, but it also encompasses the abnormal
hesitation or pausing before speech, referred to by stutterers as blocks, and the
prolongation of certain sounds, usually vowels and semivowels.
Modalities affected: speech.
Interfaces that address this user group: Gaze/Eye tracking interfaces, Facial
expression recognition interfaces, Gesture recognition based interfaces.
3.2.25 Cluttering
Cluttering (tachyphemia) is a speech disorder and a communication disorder
characterized by speech that is difficult for listeners to understand due to rapid
speaking rate, erratic rhythm, poor syntax or grammar, and words or groups of words
unrelated to the sentence. Cluttering has in the past been viewed as a fluency disorder
[180].
Modalities affected: speech.
Interfaces that address this user group: Gaze/Eye tracking interfaces, Facial
expression recognition interfaces, Gesture recognition based interfaces.
3.2.26 Muteness
Muteness (mutism) is complete inability to speak. Those who are physically mute may
have problems with the parts of the human body required for speech (the throat, vocal
cords, lungs, mouth, or tongue, etc.). Being mute is often associated with deafness as
people who have been unable to hear from birth may not be able to articulate words
correctly (see Deaf-mute). A person can be born mute, or become mute later in life as a
result of injury or disease [181].
Modalities affected: speech.
Interfaces that address this user group: Gaze/Eye tracking interfaces, Facial
expression recognition interfaces, Gesture recognition based interfaces.
3.2.27 Dysarthria
Dysarthria is a weakness or paralysis of speech muscles caused by damage to the
nerves and/or brain. Dysarthria is often caused by strokes, Parkinson's disease, ALS,
head or neck injuries, surgical accident, or cerebral palsy [182].
Modalities affected: speech.
Interfaces that address this user group: Gaze/Eye tracking interfaces, Facial
expression recognition interfaces, Gesture recognition based interfaces.
3.2.28 Dementia
Dementia is a serious loss of global cognitive ability in a previously unimpaired person,
beyond what might be expected from normal aging. Dementia is not a single disease,
but rather a non-specific illness syndrome (i.e., set of signs and symptoms) in which
affected areas of cognition may be memory, attention, language, and problem solving
[183].
Modalities affected: cognitive (attention and memory loss).
Interfaces that address this user group: Audio playback interfaces, Pen based
interfaces, Alternative keyboards/switches interfaces, Augmented reality interfaces.
4.1 Overview
The Interaction Models have to cover three basic dimensions concerning the
interaction between the Virtual User and the virtual prototype to be tested:
Multiple Users: denotes the support for users with different disabilities.
Multiple Modalities: identifies the need to support different interaction styles in
different situations.
Multiple Assistive Devices: reflects the need to support multiple interaction
resources and assistive devices.
The multimodal interaction models will be flexible enough to allow adjustment of the
level of sensory capability needed for the interaction, in order to evaluate the
usability level and the interaction effectiveness in relation to the level and type of
impairment of the virtual user.
As depicted in Figure 4, the Multimodal Interaction Models will describe the execution
of the primitive tasks using different modalities and assistive devices, with respect to
the different target groups.
Task Models describe how a complex task can be analyzed into primitive tasks, in an
abstract way, without taking into account alternative ways of tasks execution using
different modalities and/or assistive devices.
The Multimodal Interaction Models intend to fill this gap. More specifically, the
Multimodal Interaction Models will describe the alternative ways of executing a primitive
task, using different modalities and/or assistive devices, with respect to the
disabilities of the target user groups. Then, the Task Models in conjunction with the
Multimodal Interaction Models and the Virtual User Models will be used by the Modality
Compensation and Replacement Module included in the Simulation Platform. The
Modality Compensation and Replacement Module will utilize the characteristics of the
simulated Virtual User Model, in order to convert, whenever possible, modalities that
are not perceived, due to a specific disability, from one sensory channel into another
normally perceived communication channel (e.g. aural information could be
dynamically transformed into text or sign language for hearing impaired users).
Additionally, it will decide if assistive devices should be used during the simulation
process performed by the Simulation Platform.
For every primitive task (with regard to each application sector) that supports different
ways of successful execution, a multimodal interaction model has to be defined.
UsiXML [2] will be used for the definition of the multimodal interaction models, similarly
to the definition of the Task and Simulation Models.
The execution order of the alternative tasks can be defined by the following temporal
operators [186]:
Enabling: specifies that a target task cannot begin until the source task has finished
(Figure 7).
Order independency: specifies that two tasks are independent of the order of
execution (Figure 12).
4.3.1 Walk
The definition of the walk generic Multimodal Interaction Model is presented in Table
5. A visual representation of the relationships is given through Figure 15. The source
code is contained in CodeSnippet 2.
Task | Modality | Task object | Disability | Alternative task(s) | Alternative modality | Alternative task object / assistive device
Walk | Motor | - | Wheelchair users | Roll (hands) | Motor | Wheelchair
Walk | Motor | - | Lower limb impaired | Grasp | Motor | Support bar
Walk | Motor | - | Lower limb impaired | Shuffle (feet) | Motor | -
4.3.2 See
The definition of the see generic Multimodal Interaction Model is presented in Table 6.
A visual representation of the relationships is given through Figure 16. The source code
is contained in CodeSnippet 3.
4.3.3 Hear
The definition of the hear generic Multimodal Interaction Model is presented in Table
7. A visual representation of the relationships is given through Figure 17. The source
code is contained in CodeSnippet 4.
Task | Modality | Task object | Disability | Alternative task(s) | Alternative modality | Alternative task object / assistive device
Hear | Audition | Audio cues | Hearing impaired | See | Vision | Visual cues
Figure 19: Pull (hand): Door handle Multimodal Interaction Model relationships.
Figure 21: Grasp (right hand): Steering wheel Multimodal Interaction Model
relationships.
Figure 22: Grasp (hand): Door handle Multimodal Interaction Model relationships.
Figure 23: Pull (hand): Window bar Multimodal Interaction Model relationships.
Table 15: Pull (hand): Toilet flush valve Multimodal Interaction Model definition.
Figure 25: Pull (hand): Toilet flush valve Multimodal Interaction Model
relationships.
Figure 26: Stand up (knee, back): Toilet Multimodal Interaction Model relationships.
Figure 29: Push (hand): Medical device button Multimodal Interaction Model
relationships.
Table 20: Read: Message on touch screen Multimodal Interaction Model definition.
Table 21: Press: OK button on the touch screen Multimodal Interaction Model
definition.
Figure 31: Press: OK button on the touch screen Multimodal Interaction Model
relationships.
4.9 Conclusions
The current section provided the specifications of the multimodal interaction models,
which include combinations of interfacing modalities most suited for the target user
groups and connect the Virtual User Models, developed in SP1, to the Task Models and
the virtual prototype to be tested.
The multimodal interaction models describe the alternative ways of executing a primitive
task with respect to the different target user groups, the replacement modalities
and the usage of assistive devices for each application sector (automotive, smart living
spaces, office workplace, infotainment, personal healthcare). The replacement of
modalities and the use of assistive devices aim to improve the quality of the user
interaction and make the product more accessible, taking into account the virtual user's
disabilities.
5.1 Architecture
The architecture of the Multimodal Interfaces Manager is presented in Figure 32. As
depicted there, the Multimodal Interfaces Manager consists of two main components:
the Modality Compensation and Replacement Module, which is responsible
for managing the multimodal interaction models and providing alternative tasks,
modalities and assistive devices.
the Multimodal Toolset Manager, which is the provider of various input/output
tools to the user in order to enhance his/her interaction based on his/her
capabilities.
Figure 32: Architecture of the Multimodal Interfaces Manager and its connections to the Core Simulation Platform and the Immersive Simulation Runtime Engine.
Figure 33: Data input and output in Modality Compensation and Replacement Module.
The selection of an appropriate Multimodal Interaction Model depends on three factors:
the primitive task to be executed, the application area, and the deficiencies of the
virtual user. The first two factors can be found in the Task Model and Simulation Model
files. These two files can be provided to the Modality Compensation and Replacement
Module by the Interaction Adaptor. The third factor is specified in the loaded Virtual
User Model (again provided by the Interaction Adaptor). Having knowledge of the task,
application area and virtual user deficiencies, the compensation module selects the
appropriate Multimodal Interaction Model from its pool.
Then, it analyses the various primitive tasks and provides alternative ones that are
more appropriate to the virtual user. The new task definition may be based on a different
modality, or on the same modality but with the usage of an assistive device, or on a
combination of the two. The whole input/output data flow is shown in Figure 33.
Figure 34: The Multimodal Toolset Manager and its modules. Some modules are used for
getting input from users, while others are used for passing information to them. The
modality type of each tool is also indicated.
The implementation, the modalities and the characteristics of these tools are discussed
in paragraphs 5.3.2 to 5.3.5. The modality requirements of the tools are presented
in Table 22. It must be noted that the Multimodal Toolset Manager is implemented in
such a way as to support extra modules, thus increasing its capabilities.
Table 22: Multimodal Toolset Manager's modules and their modality requirements from
the target users.
Figure 35: Data flow concerning the Speech Recognition Module. Both real-time and
pre-recorded speech-audio is supported and converted to text by the module.
The speech recognition implementation is based on the CMU Sphinx toolkit [187]. The
Sphinx architecture is based on two models: a) the language model, which contains the
dictionary and grammar, and b) the acoustic model, which contains the respective audio
information. In its current state, the Speech Recognition Module supports English,
French and Russian. However, the Sphinx framework supports training of new language
and acoustic models. Table 23 summarises the features of the CMU Sphinx toolkit.
Table 23: Features of the chosen speech recognition software, CMU Sphinx.
The output of the Speech Recognition Module is a text string and is sent to the
Immersive or Core Simulation Platform for further processing.
Figure 36: Data flow concerning the Speech Synthesis Module. Text from the Immersive or Core Simulation Platform is converted to speech audio for the immersed user.
The Speech Synthesis Module uses two speech synthesis engines: eSpeak [193] and
Festival [194]. Generally, in terms of processing speed, eSpeak provides faster output
with low CPU usage. The eSpeak speech is clear and can be used at high speeds, but it
is not as natural or smooth as larger synthesizers that are based on human speech
recordings. This is the reason why another speech synthesizer, the Festival Speech
Synthesis System, is included in the Speech Synthesis Module.
Festival is a general multilingual speech synthesis system originally developed at the
University of Edinburgh. Substantial contributions have also been provided by Carnegie
Mellon University and other sites. It is distributed under a free software license similar
to the BSD License. It offers a full text to speech system with various APIs, as well as
an environment for development and research of speech synthesis techniques. It is
written in C++ with a Scheme-like command interpreter for general customization and
extension. Festival is designed to support multiple languages, and comes with support
for English (British and American pronunciation), Welsh, and Spanish. Voice packages
exist for several other languages, such as Castilian Spanish, Czech, Finnish, Hindi,
Italian, Marathi, Polish, Russian and Telugu.
A feature comparison between the two speech synthesizers is presented in Table 24.
Table 24: Features of the speech synthesizers used by the Speech Synthesis Module.
Figure 37: User and Simulation platform exchange information via the haptic device
and the Haptics Module.
The Haptics Module makes use of the CHAI3D API [196] in order to manipulate various
haptic devices. CHAI 3D is an open source set of C++ libraries for computer haptics,
visualization and interactive real-time simulation. CHAI 3D supports several
commercially-available three-, six- and seven-degree-of-freedom haptic devices, and
makes it simple to support new custom force feedback devices. Multiple haptic devices
can be connected to the same computer in order to support two-handed interaction.
The Haptics Module has been tested with, and fully supports, Sensable's PHANTOM
Desktop [197] and PHANTOM Omni [198] haptic devices, and the Novint Falcon
device [199].
Figure 38: Data flow concerning the Sign Language Synthesis Module. Text from the Immersive or Core Simulation Platform is converted into sign language animation for the immersed user.
The processing is based on the ATLAS [200] (Automatic TransLAtion into Sign
language) tool and is currently at a demo stage (testing use only). The input text must
carry special annotation (expressed in XML). The output is a 3D rendered virtual actor
who performs the sign language gestures. In the current phase, only the Italian
language is supported.
Figure 39: Data flow concerning the Symbolic Module. Text, graphics and sound alerts are generated for the immersed user on behalf of the Immersive or Core Simulation Platform.
Future Work
Despite its unique features, the Multimodal Interface Toolset in its current state cannot
be characterised as a complete product.
The most important missing element is the integration with the VERITAS simulation
framework; thus, future work will concentrate on this field. The automatic selection
of different modalities, task tools or assistive devices will offer virtual or immersed
users a more accessible environment in which to test product designs tailored to them.
The selection of the most appropriate Multimodal Interaction Model will be based on
prioritization rules, and this automatic task translator tool will be part of the
Modality Compensation and Replacement Module.
In the future, the Core Simulation Platform (A2.1.1) will support a simulation cascade
mode, i.e. sequential testing of various virtual user models on different product
designs. A similar approach is needed for the Multimodal Interface Toolset, where
cascades over different modalities and different assistive devices will be tested
automatically.
Additionally, future plans include the implementation of a Multimodal Interaction Models
parser for the Modality Compensation and Replacement Module, which will parse the
various UsiXML files and convert them into machine-friendly code.
Also, an investigation will be made concerning the support of mobile phone touch-
screens and, if necessary, a corresponding input module for the Multimodal Toolset
Manager will be implemented.
References
[1] Charwat, H. J. (1992) Lexicon der Mensch-Maschine-Kommunikation.
Oldenbourg.
[2] MacDonald, J. and McGurk, H. (1978) Visual influences on speech perception
process. Perception and Psychophysics, 24 (3), 253-257.
[3] McGurk, H. and MacDonald, J. (1976). Hearing lips and seeing voices. Nature,
264, 746-748.
[4] Parke, F.I. and Waters, K. (1996). Computer Facial Animation. A K Peters.
[5] Silbernagel, D. (1979) Taschenatlas der Physiologie. Thieme.
[6] Kandel, E. R. and Schwartz, J. R. (1981) Principles of Neural Sciences. Elsevier
Science Publishers (North Holland).
[7] Shepherd, G. M. (1988) Neurobiology, 2nd edition. Oxford University Press.
[8] Chatty, S. (1994), Extending a graphical toolkit for two-handed interaction. ACM
UIST 94 Symposium on User Interface Software and Technology, ACM Press,
195-204.
[9] Nigay, L. and Coutaz, J. (1993) A design space for multimodal systems:
concurrent processing and data fusion. Human Factors in Computing Systems,
INTERCHI 93 Conference Proceedings, ACM Press, 172-178.
[10] Coutaz, J. (1987) PAC: An object-oriented model for dialog design. Proceedings
of INTERACT 87: The IFIP Conference on Human Computer Interaction, 431-
436.
[11] Huang, X. and Oviatt, S. (2006) Toward Adaptive Information Fusion in Multimodal
Systems, in S. Renals and S. Bengio (Eds): MLMI 2005, LNCS 3869, 15-27.
[12] Foley, J.D., van Dam, A., Feiner, S.K. and Hughes, J.F. (1990) Computer
Graphics: principles and practice, 2nd edition. Addison-Wesley.
[13] Shneiderman, B. (1982) The future of interactive systems and the emergence of
direct manipulation. Behaviour and Information Technology 1 (3), 237-256.
[14] Shneiderman, B. (1983) Direct manipulation: a step beyond programming
languages. IEEE Computer, 16 (8), 57-69.
[15] Bernsen, N.O. The Structure of the Design Space, in Computers, Communication
and Usability: Design Issues, Research, and Methods for Integrated Services, P.F.
Byerley, P.J. Barnard, and J. May, eds., North Holland, Amsterdam, 1993, pp. 221-244.
[16] Maybury, M. T. and Wahlster, W. (1998) Readings in Intelligent User Interfaces.
Morgan Kaufmann Publishers.
[17] Nigay, L. and Coutaz, J. (1995) A generic platform for addressing the multimodal
challenge. Human Factors in Computing Systems, CHI 95 Conference
Proceedings, ACM Press, 98-105.
[18] Moran, D.B., Cheyer, A.J., Julia, L.E., Martin, D.L., and Park, S. (1997) Multimodal
user interfaces in the Open Agent Architecture. Proceedings of the 1997
International Conference on Intelligent User Interfaces (IUI 97), ACM Press, 61-
68.
[19] Wahlster, W. (1991) User and discourse models for multimodal communication.
In: Intelligent User Interfaces, J. W. Sullivan and S. W. Tyler (Eds.), ACM Press,
45-67.
[20] Bolt, R. (1980) Put-That-There: voice and gesture at the graphics interface,
Computer Graphics, 14, 3, 262-270.
[21] Obrenovic, Z., Abascal, J. and Starcevic, D. (2007) Universal accessibility as a
multimodal design issue. Commun. ACM 50, 5, 83-88.
[22] Burzagli, L., Emiliani, P.L., and Gabbanini, F. (2009). Design for All in action: An
example of analysis and implementation, Expert Systems with Applications, 36,
985-994.
[23] Doyle, J., Bertolotto, M., and Wilson, D. (2008) Multimodal Interaction
Improving Usability and Efficiency in a Mobile GIS Context, 1st Int. Conf. on
Advances in Computer-Human Interaction, 63-68.
[24] Oviatt, S.L. (1999) Mutual Disambiguation of recognition errors in a multimodal
architecture. Proceedings of the SIGCHI conference on Human factors in
computing systems: the CHI is the limit (CHI '99). ACM, New York, NY, USA, 576-
583.
[25] Oviatt, S.L. (1999) Ten myths of multimodal interaction, Communications of the
ACM 42, 11, 74-81.
[26] Cobb, S.V.G and Sharkey, P.M. (2007) A Decade of Research and Development
in Disability, Virtual Reality and Associated Technologies: Review of ICDVRAT
1996-2006, The International Journal of Virtual Reality, 6(2): 51-68.
[27] Edwards, A. D. N. (2002). Multimodal interaction and people with disabilities. (in)
Multimodality in Language and Speech Systems. B. Granström, D. House and I.
Karlsson, (Eds.). Dordrecht, Kluwer, pp. 73-92.
[28] Emiliani, P. L. and Stephanidis C. (2005) Universal access to ambient intelligence
environments: Opportunities and challenges for people with disabilities, IBM
Systems Journal, 44, 3, 605-619.
[29] B. Kisacanin, V. Pavlovic and T.S. Huang, Editors, Real-Time Vision for Human
Computer Interaction , Springer-Verlag (2005).
[30] Jaimes, A. and Sebe, N. (2007) Multimodal human-computer interaction: A survey,
Computer Vision and Image Understanding, 108, 116-134.
[31] Oviatt, S.L. (2001). Designing robust multimodal systems for universal access. In
Proceedings of the 2001 EC/NSF workshop on Universal accessibility of
ubiquitous computing: providing for the elderly (WUAUC'01). ACM, New York, NY,
USA, 71-74.
[32] Sharma, R., Pavlovic, V., and Huang T.S. (1998) Toward Multimodal Human
Computer Interface, Proceedings of the IEEE, 86, 5, 853-869.
[33] Argyropoulos, S., Moustakas, K., Karpov, A.A., Aran, O., Tzovaras, D., Tsakiris, T.,
Varni, G., and Kwon, B. (2008). Multimodal user interface for the communication
of the disabled, Journal on Multimodal User Interfaces, 2, 105-116.
[34] Intille, S., Larson, K. , Beaudin, J., Nawyn, J. , Tapia, E., and Kaushik, P. (2004) A
living laboratory for the design and evaluation of ubiquitous computing
technologies, ACM Conference on Human Factors in Computing Systems (CHI),
1941-1944.
[35] McCowan, L., Gatica-Perez, D., Bengio, S., Lathoud, G., Barnard, M. and Zhang,
D. (2005) Automatic analysis of multimodal group actions in meetings, IEEE
Transactions on PAMI 27, 3, 305-317.
[36] Gatica-Perez, D. (2006) Analyzing group interactions in conversations: a survey,
IEEE International Conference on Multisensor Fusion and Integration for
Intelligent Systems, 41-46.
[37] Pentland, A. (2005) Socially aware computation and communication, IEEE
Computer 38, 3, 33-40.
[38] Meyer S. and Rakotonirainy, A. (2003) A Survey of research on context-aware
homes, Australasian Information Security Workshop Conference on ACSW
Frontiers.
[39] Cheyer, A. and Julia, L. (1998) MVIEWS: multimodal tools for the video analyst,
Conference on Intelligent User Interfaces (IUI), ACM, New York, NY, USA, 55-62.
[40] Bradbury, J.S., Shell, J.S. and Knowles, C.B. (2003) Hands on cooking: towards
an attentive kitchen, ACM Conference Human Factors in Computing Systems
(CHI), 996-997.
[41] Chen, D., Malkin, R., and Yang, J. (2004). Multimodal detection of human
interaction events in a nursing home environment. In Proceedings of the 6th
international conference on Multimodal interfaces (ICMI '04). ACM, New York, NY,
USA, 82-89.
[42] Lauruska V. and Serafinavicius, P. (2003) Smart home system for physically
disabled persons with verbal communication difficulties, Assistive Technology
Research Series (AAATE), 579-583.
[43] Adler, A., Eisenstein, J., Oltmans, M., Guttentag, L., and Davis, R. (2004) Building
the design studio of the future, AAAI Fall Symposium on Making Pen-Based
Interaction Intelligent and Natural (2004).
[44] Ben-Arie, J., Wang, Z., Pandit, P. and Rajaram, S. (2002) Human activity
recognition using multidimensional indexing, IEEE Transactions on PAMI 24, 8,
2002, 1091-1104.
[45] Bobick A.F. and Davis, J. (2001) The recognition of human movement using
temporal templates, IEEE Transactions on PAMI 23, 3, 257-267.
[46] Pentland, A. (2000), Looking at people, Communications of the ACM 43, 3, 35-44.
[47] Chen, L.S., Travis Rose, R., Parrill, F., Han, X., Tu, J., Huang, Z., Harper, M.,
Quek, F., McNeill, D., Tuttle, R., and Huang, T.S. (2005) VACE multimodal
meeting corpus, MLMI.
[48] Garg, A., Naphade, M. and Huang, T.S. (2003) Modeling video using input/output
Markov models with application to multi-modal event detection, Handbook of
Video Databases: Design and Applications.
[49] Hu, W., Tan, T., Wang, L. and Maybank, S. (2004) A survey on visual
surveillance of object motion and behaviors, IEEE Transactions on Systems, Man,
and Cybernetics 34, 3, 334 352.
[50] Dusan, S., Gadbois, G.J. and Flanagan, J. (2003) Multimodal interaction on
PDAs integrating speech and pen inputs, Eurospeech.
[51] Fritz, G., Seifert, C., Luley, P., Paletta L., and Almer, A. (2004) Mobile vision for
ambient learning in urban environments, International Conference on Mobile
Learning (MLEARN).
[52] Brewster, S., Lumsden, J., Bell, M., Hall, M. and Tasker, S. (2003). Multimodal
'eyes-free' interaction techniques for wearable devices. In Proceedings of the
SIGCHI conference on Human factors in computing systems (CHI '03). ACM, New
York, NY, USA, 473-480.
[53] Kono, Y., Kawamura, T., Ueoka, T., Murata S., and Kidode, M. (2004) Real world
objects as media for augmenting human memory, Workshop on Multi-User and
Ubiquitous User Interfaces (MU3I), 37-42.
[54] Yu, C. and Ballard, D.H. (2004) A multimodal learning interface for grounding
spoken language in sensorimotor experience, ACM Transactions on Applied
Perception, 1, 1, 57-80.
[55] Pelz, J.B. (2004) Portable eye-tracking in natural behavior, Journal of Vision 4,11.
[56] Dickie, C., Vertegaal, R., Fono, D., Sohn, C., Chen, D., Cheng, D., Shell J.S., and
Aoudeh, O. Augmenting and sharing memory with eye, Blog in CARPE (2004).
[57] Nijholt, A. and Heylen, D. (2002) Multimodal communication in inhabited virtual
environments, International Journal of Speech Technology 5, 343-354.
[58] Malkawi, A.M. and Srinivasan, R.S. (2004) Multimodal humancomputer
interaction for immersive visualization: integrating speech-gesture recognition
and augmented reality for indoor environments, International Association of
Science and Technology for Development Conference on Computer Graphics
and Imaging.
[59] Paggio, P. and Jongejan, B. (2005). Multimodal communication in the virtual farm
of the staging Project. In: O. Stock and M. Zancanaro, Editors, Multimodal
Intelligent Information Presentation, Kluwer Academic Publishers, Dordrecht, 27-46.
[60] Maynes-Aminzade, D., Pausch, R. and Seitz, S. (2002) Techniques for interactive
audience participation, ICMI, 15-20.
[61] Wassermann, K.C., Eng, K., Verschure, P.F.M.J. Manzolli, and J. (2003). Live
soundscape composition based on synthetic emotions, IEEE Multimedia
Magazine 10, 4, 82-90.
[62] Lyons, M.J., Haehnel, M., and Tetsutani, N. (2003) Designing, playing, and
performing, with a vision-based mouth Interface, Conference on New Interfaces
for Musical Expression, 116-121.
[63] Paradiso, J. and Sparacino, F. (1997) Optical tracking for music and dance
performance, Optical 3-D Measurement Techniques IV, A. Gruen, H. Kahmen,
eds., 11-18.
[127] Mann, W. C., Goodall, S., Justiss, M. D., and Tomita, M. (2002). Dissatisfaction
and non-use of assistive devices among frail elders. Assistive Technology 14(2),
130-139.
[128] Canes and Walkers. In: Helpful Products for Older Persons (booklet series).
University at Buffalo, NY: Center for Assistive Technology, Rehabilitation
Engineering Research Center on Aging.
[129] Fernie, G. (1997). Assistive Devices. Handbook of Human Factors and the Older
Adult. A. D. Fisk and N. Rogers. London, Academic Press: 289-310.
[130] Blesedell-Crepeau, E., Cohn, E. S. et al. (2003). Willard and Spackman's
Occupational Therapy. Philadelphia, PA, Lippincott, Williams & Wilkins.
[131] Cook, A. M. and Hussey, S. M. (2002). Assistive Technologies: Principles and
Practice. Toronto, Mosby.
[132] Mann, W. C. (2002). Assistive devices and home modifications. In: Encyclopedia
of Aging, E. D. J., et al., eds. New York: Macmillan.
[133] Chen, L.K.P., Mann, W. C., Tomita, M. and Burford, T. (1998). An evaluation of
reachers for use by older persons with disabilities. Assistive Technology 10(2),
113-125.
[134] Moriyama, T., Kanade, T., Xiao, J. and Cohn, J., Meticulously Detailed Eye
Region Model and Its Application to Analysis of Facial Images, IEEE Trans. on
PAMI, 28(5):738-752, 2006.
[135] Wang, J.G., Sung, E. and Venkateswarlu, R., Eye gaze estimation from a single
image of one eye, ICCV, pp. 136-143, 2003.
[136] Ruddaraju, R., Haro, A., Nagel, K., Tran, Q., Essa, I., Abowd, G. and Mynatt, E.,
Perceptual user inter-faces using vision-based eye tracking, ICMI, 2003.
[137] Sibert, L.E. and Jacob, R.J.K., Evaluation of eye gaze interaction, ACM Conf.
Human Factors in Computing Systems (CHI), pp. 281-288, 2000.
[138] Heishman, R., Duric, Z. and Wechsler, H., Using eye region biometrics to reveal
affective and cogni-tive states, CVPR Workshop on Face Processing in Video,
2004.
[139] Santella, A. and DeCarlo, D., Robust clustering of eye movement recordings for
quantification of visual interest, Eye Tracking Research and Applications (ETRA),
pp. 27-34, 2004.
[140] Parnes, R. B. (2003). GPS technology and Alzheimer's disease: Novel use for an
existing technology. Retrieved August 15, 2004, from
http://www.cs.washington.edu/assistcog/NewsArticles/HealthGate/GPS
%20Technology%20and%20Alzheimers%20Disease%20Novel%20Use%20for
%20an%20Existing%20Technology%20CHOICE%20For%20HealthGate.htm
[141] Patterson, D. J., Etzioni, O., and Kautz, H. (2002). The activity compass.
Presented at UbiCog 02: First International Workshop on Ubiquitous Computing
for Cognitive Aids, Göteborg, Sweden.
[142] Sharon Oviatt, Multimodal Interfaces, Handbook of Human-Computer
Interaction, (ed. by J. Jacko & A. Sears), Lawrence Erlbaum: New Jersey, 2002.
[143] Doherty T.J., Vandervoort A.A., Taylor A.W., and Brown W.F., Effects of motor unit
[162] Thomas C. Weiss, Hemiparesis Facts and Information, 2010, Disabled World,
Neurological Disorders.
[163] Hemiplegia Treatments - Definition, symptoms and treatments,
http://www.hemiplegiatreatment.net/
[164] Donnan GA, Fisher M, Macleod M, Davis SM (May 2008). "Stroke". Lancet 371
(9624): 1612-23. doi:10.1016/S0140-6736(08)60694-7.
[165] Stanford Hospital & Clinics, Cardiovascular Diseases: Effects of Stroke.
[166] Davis FA, Bergen D, Schauf C, McDonald I, Deutsch W (November 1976).
"Movement phosphenes in optic neuritis: a new clinical sign". Neurology 26 (11):
1100-4.
[167] Page NG, Bolger JP, Sanders MD, January 1982, "Auditory evoked phosphenes
in optic nerve disease". J. Neurol. Neurosurg. Psychiatr. 45 (1): 7-12.
[168] Compston A, Coles A., October 2008, "Multiple sclerosis". Lancet 372 (9648):
1502-17.
[169] AgingEye Times, "Macular Degeneration types and risk factors". Agingeye.net.
[170] Merck Manual, Home Edition, "Glaucoma", Merck.com.
[171] Better Health Channel, "Colour blindness",
http://www.betterhealth.vic.gov.au/bhcv2/bhcarticles.nsf/pages/Colour_blindness
[172] World Health Organization, Priority eye diseases, Prevention of Blindness and
Visual Impairment
[173] Kertes PJ, Johnson TM, ed. (2007). Evidence Based Eye Care. Philadelphia, PA:
Lippincott Williams & Wilkins. ISBN 0-7817-6964-7.
[174] Da Costa SS; Rosito, Letícia Petersen Schmidt; Dornelles, Cristina (February
2009). "Sensorineural hearing loss in patients with chronic otitis media". Eur Arch
Otorhinolaryngol 266 (2): 221-4. doi:10.1007/s00405-008-0739-0.
[175] Dorland's Medical Dictionary, "Otosclerosis", http://en.wikipedia.org/wiki/Dorland
%27s_Medical_Dictionary.
[176] Occupational Safety and Health Standards (OSHA), Occupational noise
exposure, OSHA 29,
http://www.osha.gov/pls/oshaweb/owadisp.show_document?
p_table=STANDARDS&p_id=9735
[177] Kral A, O'Donoghue GM. Profound Deafness in Childhood. New England J
Medicine 2010: 363; 1438-50.
[178] D.W. Robinson and G.J. Sutton "Age effect in hearing - a comparative analysis of
published threshold data." Audiology 1979; 18(4): 320-334.
[179] World Health Organization ICD-10 F95.8 Stuttering
[180] Daly, David A.; Burnett, Michelle L. (1999). Curlee, Richard F.. ed. Stuttering and
Related Disorders of Fluency. New York: Thieme. p. 222. ISBN 0-86577-764-0.
[181] Leonard, Laurence B. (1998). Children with specific language impairment.
Cambridge, Mass: The MIT Press. ISBN 0-262-62136-3.
[182] O'Sullivan, S. B., & Schmitz, T. J. (2007). Physical rehabilitation. (5th ed.).
A. Automotive Area
Figure 41: Swing (legs): Inside car Multimodal Interaction Model relationships.
Figure 42: Grasp (hand): Interior door handle Multimodal Interaction Model
relationships.
Figure 43: Pull (left hand): Interior door handle Multimodal Interaction Model
relationships.
Table 29: Push (right hand): Lock button Multimodal Interaction Model definition.
Figure 44: Push (right hand): Lock button Multimodal Interaction Model relationships.
Figure 45: Press (right hand): Eject button on belt buckle Multimodal Interaction
Model relationships.
Figure 46: Grasp (right hand): Interior door handle Multimodal Interaction Model
relationships.
Table 32: Push (left hand): Interior door side Multimodal Interaction Model definition.
Figure 47: Push (left hand): Interior door side Multimodal Interaction Model
relationships.
Table 33: Pull down (hands): Sun shield Multimodal Interaction Model definition.
Figure 48: Pull down (hands): Sun shield Multimodal Interaction Model relationships.
Table 34: Grasp (hand): Steering wheel Multimodal Interaction Model definition.
Figure 49: Grasp (hand): Steering wheel Multimodal Interaction Model relationships.
Table 35: Push (left foot): Gear pedal Multimodal Interaction Model definition.
Figure 50: Push (left foot): Gear pedal Multimodal Interaction Model relationships.
Figure 51: Push (right foot): Accelerator pedal Multimodal Interaction Model
relationships.
Figure 52: Push (right foot): Brake pedal Multimodal Interaction Model relationships.
Table 38: Push (thumb): Parking brake release button Multimodal Interaction Model
definition.
Figure 53: Push (thumb): Parking brake release button Multimodal Interaction Model
relationships.
Table 39: Pull (hand): Hand brake Multimodal Interaction Model definition.
Figure 54: Pull (hand): Hand brake Multimodal Interaction Model relationships.
Figure 55: Grasp (hand): Light switch Multimodal Interaction Model relationships.
Figure 56: Turn (hand): Light switch Multimodal Interaction Model relationships.
Figure 57: Move up/down (hand): Direction indicator Multimodal Interaction Model
relationships.
Table 43: Grasp (hand): Radio knob Multimodal Interaction Model definition.
Figure 58: Grasp (hand): Radio knob Multimodal Interaction Model relationships.
Table 44: Turn (hand): Radio knob Multimodal Interaction Model definition.
Figure 59: Turn (hand): Radio knob Multimodal Interaction Model relationships.
Table 45: Push (hand): Radio button Multimodal Interaction Model definition.
Figure 60: Push (hand): Radio button Multimodal Interaction Model relationships.
Table 46: Push (hand): Window button Multimodal Interaction Model definition.
Table 47: Grasp (hand): Window handle Multimodal Interaction Model definition.
Table 48: Turn (hand): Window handle Multimodal Interaction Model definition.
Table 49: Turn (hand): Rear mirror Multimodal Interaction Model definition.
Table 50: Push (hand): Rear mirror Multimodal Interaction Model definition.
Table 51: Grasp (right hand): Gear handle Multimodal Interaction Model definition.
Figure 66: Grasp (right hand): Gear handle Multimodal Interaction Model
relationships.
Table 52: Push (right hand): Gear handle Multimodal Interaction Model definition.
Figure 67: Push (right hand): Gear handle Multimodal Interaction Model relationships.
Figure 68: Push (hand): Navigation system buttons Multimodal Interaction Model
relationships.
Figure 69: Push (right foot): Rear brake pedal Multimodal Interaction Model
relationships.
Table 55: Listen: Navigation system audio cues Multimodal Interaction Model
definition.
Figure 70: Listen: Navigation system audio cues Multimodal Interaction Model
relationships.
Table 56: Grasp (hand): Faucet controls Multimodal Interaction Model definition.
Figure 71: Grasp (hand): Faucet controls Multimodal Interaction Model relationships.
Figure 72: Grasp (hand): Hob gas control knob Multimodal Interaction Model
relationships.
Figure 73: Push (hand): Stove knob Multimodal Interaction Model relationships.
Table 59: Pull (hand): Washing machine porthole handle Multimodal Interaction
Model definition.
Figure 74: Pull (hand): Washing machine porthole handle Multimodal Interaction
Model relationships.
Figure 75: Turn (hand): Dishwasher knob Multimodal Interaction Model relationships.
Table 61: Push (hand): Hood button Multimodal Interaction Model definition.
Figure 76: Push (hand): Hood button Multimodal Interaction Model relationships.
Table 62: Pull (hand): Oven door handle Multimodal Interaction Model definition.
Figure 77: Pull (hand): Oven door handle Multimodal Interaction Model relationships.
C. Workplace Office
Table 63: Twist (hand): Faucet control Multimodal Interaction Model definition.
Figure 79: Stand up (knee, back): Toilet Multimodal Interaction Model relationships.
<enabling>
<source sourceId="st0task6"/>
<target targetId="st0task7"/>
</enabling>
</taskmodel>
CodeSnippet 58: Sit (knee, back): On toilet Multimodal Interaction Model (UsiXML
source code).
D. Infotainment
Figure 82: Press (hand): Keyboard key Multimodal Interaction Model relationships.