Announcement
Login name and password needed to access lecture notes on the course website
User name: sysc5303 Password: BIOM5402
Note
Completeness
The information given in this chapter should not be regarded as complete or comprehensive, but rather as providing a representative sampling of some available results.
Accuracy
The field is active and fast-moving, so the information in this chapter may have a short half-life. Although the lecture material is accurate to the best of our current knowledge, some of it may later need to be updated.
Consistency
While attempts will be made to preserve consistency wherever possible, in some cases the desired information was unavailable or the available materials were themselves inconsistent.
Definitions
Interfaces: In INS systems, interfaces are the devices that act as mediators (e.g., data gloves, exoskeletons, stereo glasses) through which the human operator can observe, sense, manipulate and control one of the following:
a remote physical dynamic system over a distance
a physical system at micro- or nano-scale
a virtual simulated dynamic system
Human operator: the person doing the observing and acting.
What kind of interface is ideal?
One whose existence the human operator does not feel.
Introduction
Research on human I/O has a long history
In ancient China
The human body was thought to be made of gold, wood, water, fire and earth.
It was long believed that humans think with the heart rather than the brain.
Aristotle's concept
Sense organs are made of earth, air, water and fire.
The five senses project to the heart, either directly or indirectly.
Overview
Input/sensory channels
Visual
Auditory
Haptic
Smell
Taste
The sixth channel/sense?
Output channels
Verbal Haptic
hand, arm, body, leg, foot
1. Visual Channel
Visual Channel
The most important single sense modality through which humans gather information about their relation to the real world.
Vision is the absorption of light energy by the eye and the successive conversion of this energy into neural messages, which the brain turns into perceptual patterns.
Wavelength range
approximately 0.4 - 0.7 microns (1 micron = 10^-6 meter)
Eyeball
Approximately spherical, about 1 inch (24~25 mm) in diameter.
A black-looking aperture, the pupil, allows light to enter the eye (it appears dark because of the absorbing pigments in the retina).
A colored circular muscle, the iris, gives the eye its color. It controls the size of the pupil so that more or less light, depending on conditions, is allowed to enter the eye.
A transparent external surface, the cornea, covers both the pupil and the iris. It is the first and most powerful lens of the optical system of the eye.
The "white of the eye", the sclera, forms part of the supporting wall of the eyeball.
Eyeball
Light rays are focused by the transparent cornea and lens onto the retina.
The details of the image are formed at the retina and transmitted directly to the brain for the higher operations needed for perception and cognition.
The central point for image focus (the visual axis) in the human retina is the fovea, where the image has the finest detail.
Eye Movement
Capability of rotation:
Rotation to the left = rotation to the right
Vertical upward movement: < 40 degrees
Downward movement: < 60 degrees
Rotation around the visual axis: < 10 degrees
Eye Movement
Reason
The temporal resolution of the human visual system is not sufficient to detect the fast presentation of movie frames.
The human visual system cannot resolve individual pixels (its spatial resolution is limited).
Spatial Resolution
The capacity of the eye to see fine detail. In practice, various ways are employed to measure and specify visual spatial resolution, depending on the type of acuity tasks used.
Target detection: requires only the perception of the presence or absence of an aspect of the stimuli, not the discrimination of target detail
Target recognition: most commonly used in clinical visual acuity measurements; requires the recognition or naming of a target
Target localization: involves discriminating differences in the spatial position of segments of a test object
Spatial Resolution
Determined by:
Density and type of photoreceptors in the retina
Limiting factors
Diffraction
Aberration
Refractive errors, such as myopia (short-sightedness) and hyperopia (long-sightedness)
Size of the pupil
Illumination: background luminance
Time of exposure of the target
State of adaptation of the eye
Eye movement
Area of the retina stimulated
The density of cone receptors determines the ability of our eyes to resolve what we see.
Cones connect to neurons in a more nearly one-to-one fashion
This gives lower temporal resolution, but higher spatial resolution
Foveation
Resolution is highest at the fovea (the point of gaze), where the concentration of cone photoreceptors is highest, and drops rapidly with distance from that point.
The region around the point of fixation (or foveation point) is projected onto the fovea, where it is sampled and perceived at the highest density. The sampling density decreases dramatically with increasing distance from the fovea.
The spatial resolution of human vision is cut in half at about 2.3 degrees from the point of fixation (the fovea).
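The half-resolution figure can be turned into a rough falloff curve. The sketch below assumes a simple hyperbolic model, which is only one possible shape consistent with "half at 2.3 degrees"; the function name and the model itself are illustrative assumptions, not something stated in these notes.

```python
# Minimal sketch: relative spatial resolution versus eccentricity.
# ASSUMPTION: a hyperbolic falloff e2 / (e2 + e); the notes only give the
# half-resolution eccentricity (2.3 degrees), not the shape of the curve.

HALF_RESOLUTION_DEG = 2.3  # eccentricity at which resolution is halved (from the notes)


def relative_resolution(eccentricity_deg, e2_deg=HALF_RESOLUTION_DEG):
    """Return resolution relative to the fovea (1.0 at 0 degrees)."""
    if eccentricity_deg < 0:
        raise ValueError("eccentricity must be non-negative")
    return e2_deg / (e2_deg + eccentricity_deg)


for ecc in (0.0, 2.3, 10.0, 30.0):
    print(f"{ecc:5.1f} deg -> relative resolution {relative_resolution(ecc):.2f}")
```

Under this assumed model, an image region far from the gaze point could be transmitted at a much lower sampling density without a visible loss.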
[Figure: computer vision compared with human vision]
Temporal Resolution
The eye constantly samples information (i.e., images) projected onto the retina in a periodic, intermittent manner, since a finite amount of time is required to collect and process information.
The information is then integrated so that objects around us appear to be stable or to move smoothly.
When intermittent stimuli are presented to the eye at a very low rate
they are perceived as separate
When the presentation rate is high, but lower than a certain rate
they appear to stay on but with changes in intensity, producing the sensation called flicker.
Above a certain critical rate, the flicker stops. This point is called the critical flicker frequency (CFF), which is influenced by a number of factors.
Temporal Resolution
Critical flicker frequency (CFF)
The transition frequency of an intermittent light source at which the flickering stops and the light appears continuous.
Foveal CFF is around 60 Hz
Peripheral CFF is around 75 Hz
Basis of film and TV
Film: 60 Hz
North American TV: 75 Hz
CFF = a log L + b, where a and b are constants and L is the luminance of the flickering stimulus under normal conditions.
Q: From a practical point of view, if a computer monitor is flickering, what can we do?
Increase the refresh rate
Decrease the intensity
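The two remedies follow directly from the CFF relation: flicker is visible when the refresh rate is below the CFF, and the CFF falls as luminance falls. The sketch below illustrates this; the constants `A` and `B`, the base-10 logarithm, and the luminance values are hypothetical choices for illustration only.

```python
import math

# ASSUMPTIONS: the constants a and b are not given in the notes; the values
# below are hypothetical, and "log" is taken to be base 10.
A = 12.5   # hypothetical slope (Hz per decade of luminance)
B = 37.0   # hypothetical intercept (Hz)


def cff_hz(luminance, a=A, b=B):
    """Critical flicker frequency from CFF = a * log10(L) + b."""
    if luminance <= 0:
        raise ValueError("luminance must be positive")
    return a * math.log10(luminance) + b


def flicker_visible(refresh_hz, luminance):
    """Flicker is perceived when the refresh rate is below the CFF."""
    return refresh_hz < cff_hz(luminance)


print(flicker_visible(60.0, 100.0))  # bright screen, 60 Hz refresh
print(flicker_visible(85.0, 100.0))  # remedy 1: raise the refresh rate
print(flicker_visible(60.0, 10.0))   # remedy 2: lower the intensity
```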
Depth Perception
Ten well-defined cues in depth perception
The binocular cue is the most important
Each eye captures its own view.
The two separate images are sent on to the brain for processing.
When the two images arrive simultaneously in the back of the brain, they are united into one picture.
The brain combines the two images by matching up the similarities and adding in the small differences.
The small differences between the two images add up to a big difference in the final picture, creating a stereo picture.
Depth Perception
Other cues:
Relative size
Overlapping
Convergence of parallel lines
Color contrast or difference from the known contrast
Relation of lights and shadows
Texture gradients
Accommodation within the eye
Experience plays an important role
Most stereo-vision systems use human binocular cues to render depth information
Depth Perception
x = x0: the line arranged by the subject is truly straight
x < x0: the points trace concave curves
x > x0: the points trace convex curves
Color Perception
Three principal color receptors (cones), hence three primaries
Any color can be matched by a mixture of three colors: blue, green and red
Any color can be fully specified in terms of its hue, lightness and saturation
Note: wavelength does not necessarily determine color appearance directly
Humans can perceive as many as 30,000 different colors
Does the so-called true-color monitor make sense?
About 16.7 million different colors (2^24)
More colors, bigger data size
Other Remarks
An adult can see 3-6 mm of movement per second when an object is 1 meter away
10^300,000 different visual configurations might conceivably be seen
The retina is reflective
Eye blinks do not affect perception
Attention and gaze direction are correlated
Many illusions play with size and distance
Issues
Lack of depth information (how can depth information be incorporated?)
Stereo vision using specific interfaces based on human binocular cues
Proper arrangement of the environment
Can color cues be used for this purpose?
Data rate: is 100 frames/second good?
Image size: does it make sense to have 24 bits per pixel?
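To put the data-rate and image-size questions in numbers, the sketch below computes the raw (uncompressed) bit rate of a video stream. The 640 x 480 image size is a hypothetical example; only the 100 frames/second and 24 bits/pixel figures come from the questions above.

```python
def raw_video_bitrate(width_px, height_px, bits_per_pixel, frames_per_second):
    """Uncompressed video bit rate in bits per second."""
    return width_px * height_px * bits_per_pixel * frames_per_second


# ASSUMPTION: 640 x 480 is an illustrative image size, not one from the notes.
full = raw_video_bitrate(640, 480, 24, 100)
reduced = raw_video_bitrate(640, 480, 16, 30)

print(f"24 bits/pixel at 100 frames/s: {full / 1e6:.1f} Mbit/s")
print(f"16 bits/pixel at  30 frames/s: {reduced / 1e6:.1f} Mbit/s")
```

Comparing such figures with what the eye can actually resolve (temporal resolution, number of distinguishable colors) is one way to judge whether the extra bandwidth is worthwhile.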
Output (commanding)
Eye movement? (few results)
Advantage: fast (high bandwidth), no extra payload
Limitations: number of DOFs, range of movement, effect of flick, etc.
Image size can be significantly reduced
Less network bandwidth
Less time delay
Any Discussion?
2. Auditory Channel
Is It Relevant?
Auditory stimuli increase the realism (degree of fidelity/telepresence) of an INS system in many instances:
In research conducted at NASA Ames Research Center, pilots had difficulty knowing whether they had positively engaged a touch-screen virtual button without auditory feedback.
Massimino and Sheridan demonstrated that auditory cues could substitute for force feedback in various telemanipulation tasks.
In an experiment at the JPL Advanced Teleoperation Laboratory, auditory feedback given in addition to visual and haptic feedback sped up the completion of manipulation tasks.
The addition of specialized sound was shown to significantly increase the reported sense of presence in a VE.
The Ear
It should be noted:
When an auditory stimulus is transmitted via headphones, the sound bypasses the pinna and arrives directly at the ear canal, so that most of the effect of the pinna disappears.
The difference in propagation delay to the two ears is also a very important cue for localization, especially at low frequencies.
Audibility
Audible frequency
Audible frequency range: 16 Hz to 20,000 Hz
Most efficient: between 1000 Hz and 4000 Hz
Efficiency drops as the sound frequency becomes higher or lower
Physical-Space Description
A single or pure tone sound:
Frequency
Magnitude
Phase
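The three physical parameters fully determine a pure tone, x(t) = magnitude * sin(2*pi*frequency*t + phase). The sketch below generates samples of such a tone; the function name and the sample rate are illustrative assumptions.

```python
import math


def pure_tone(frequency_hz, magnitude, phase_rad, sample_rate_hz, duration_s):
    """Return samples of magnitude * sin(2*pi*f*t + phase)."""
    n_samples = int(round(sample_rate_hz * duration_s))
    return [
        magnitude * math.sin(2.0 * math.pi * frequency_hz * n / sample_rate_hz + phase_rad)
        for n in range(n_samples)
    ]


# ASSUMPTION: an 8 kHz sample rate, chosen only for illustration.
tone = pure_tone(frequency_hz=1000.0, magnitude=0.5, phase_rad=0.0,
                 sample_rate_hz=8000.0, duration_s=0.01)
print(len(tone), "samples; peak value", max(tone))
```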
Perception-Space Representation
Pitch (predominant frequency, not the highest frequency)
Roughly corresponds to the frequency of the predominant sinusoidal components
A single-valued subjective summary of the sensed spectral properties of the sound stimulus
When the spectral content of a sound is spread more diffusely over a band of frequencies, it becomes more difficult for the listener to distinguish pitch
White noise: no pitch at all
Frequency just-noticeable difference (jnd)
Depends on the loudness level and frequency of the sound
Smaller (higher perception precision) for low-frequency tones
Below 20 dB, the average human loses the ability to perceive a change of frequency
Above 20 dB, the jnd is about 3/1000 of the tone's frequency
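As a worked example of the frequency jnd rule, the sketch below returns the smallest detectable frequency change for a tone. Treating exactly 20 dB as "above" the limit, and returning `None` when no change can be perceived, are assumptions made for this illustration.

```python
JND_FRACTION = 3.0 / 1000.0   # from the notes: 3/1000 of the tone's frequency
LOUDNESS_LIMIT_DB = 20.0      # below this, frequency changes are not perceived


def frequency_jnd_hz(frequency_hz, loudness_db):
    """Smallest detectable frequency change in Hz, or None below 20 dB.

    ASSUMPTION: a loudness of exactly 20 dB is treated as "above" the limit.
    """
    if loudness_db < LOUDNESS_LIMIT_DB:
        return None
    return JND_FRACTION * frequency_hz


print(frequency_jnd_hz(1000.0, 40.0))  # jnd for a 1 kHz tone at 40 dB
print(frequency_jnd_hz(1000.0, 10.0))  # too quiet to perceive a change
```

Because the jnd is a fixed fraction of the frequency, it is smaller in absolute terms for low-frequency tones, which matches the remark on perception precision.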
Perception-Space Representation
Loudness (intensity)
Subjective judgment of the intensity of the observed sound stimulus
Audible loudness: 0 dB ~ 160 dB
Discomfort: 120 dB+; Pain: 140 dB+
Loudness just-noticeable difference (jnd)
Depends on the sensation loudness level and frequency
Below 20 dB, 2 to 6 dB depending on frequency
Above 20 dB, up to 1 dB is sufficient
At extremely high frequencies, the jnd is large
Most sensitive (jnd smallest): 500 Hz ~ 10 kHz
Perception-Space Representation
Duration
For very short tones, the perceived intensity is inversely proportional to duration
For very long tones, it is possible to have an auditory after-image
Physical tones shorter than 0.01 s are insufficient to yield a pitch
A maximum loudness is reached at about 0.5 seconds, followed by a decrease in intensity
Perception-Space Representation
Spatial localization
Each ear has a non-uniform directional sensitivity
However, localization information is primarily a result of comparing the stimuli (intensity difference and time interval) sensed separately by the two ears.
Below 1 kHz, time difference is the predominant source of direction information
Above 1 kHz, loudness difference becomes significant for the discrimination of direction
Least sensitive on the median plane (due to symmetry)
Easiest to locate tones at 500 ~ 700 Hz
Difficult to locate tones around 2 kHz
Speech
The most important class of auditory stimuli.
The quality of speech is a subjective description of the particular waveform.
Frequency range
100 Hz to 10 kHz
The fundamental frequency of the male voice is about 125 Hz, while that of the female voice is about 250 Hz.
Power
Typical conversational speech: 10~15 microwatts
Output
Voice-based control
loudness, pitch, spatial location and duration ?
Difficulties
No localised sensory organ
The sense of touch has no single sensory organ, but operates throughout the skin, muscles and bones as a distributed and diffuse process.
Complex sensing
Not a simple transduction of one physical property into an electronic signal.
It is not well understood how the different aspects of haptic phenomena are related and processed.
Hard to imitate
Haptic sensing devices are difficult to create. (This is not a problem when developing a new camera/display or microphone/speaker for visual and auditory sensations.)
Terminology
Haptic
Originates from the Greek haptesthai, meaning "to touch"
Relating to or based on the sense of touch
Truly interactive
Informatic, but more importantly energetic
Needs direct contact
Not well understood compared to the visual and auditory channels
Haptic feedback (I would rather say haptic interaction) can be categorized into:
Tactile (cutaneous)
Kinesthetic (force)
Haptic Exploration
Suppose the hand comes up to an object freely suspended in space.
The initial sense of contact is provided by the touch receptors (nerve endings) in the skin, which provide information on the geometry, texture, slippage, etc. of the object surface. This information is tactile.
When the hand applies more force, trying to hold the object, kinesthetic force comes into play, providing details about the position and motion of the hand and arm relative to the object.
Meanwhile, the force feedback also gives a sense of total contact force, compliance (stiffness), and the weight of the object (if the hand is supporting the object in some way). This perception arises from the muscles and tendons beneath the skin.
Haptic Exploration
In order for the hand to manipulate the object, say to move it horizontally, rotate it, or pinch it, the haptic system must issue stronger motor action that applies forces to the object. The response (feedback) will, in turn, guide further manipulation.
Information processing
When a receptor is stimulated, a response is triggered and an electrical discharge is generated in the nerve fiber;
Second-order neurons transmit the signal up the spine and into the thalamus region of the brain;
There, third-order neurons complete the path to the cortex, where the corresponding sensations of pressure, temperature, or pain are registered.
Sensitivity
Threshold or JND
Temporal resolution
The minimum time difference that can be detected by the receptors
Spatial resolution
The minimum spatial difference that can be detected by the receptors
Saturation
Maximum force exertion
1. Tactile Sensing
Important in object discrimination and manipulation
In general, tactile sensations include:
Tactual
Pressure
Texture
Softness
Wetness
Friction-induced phenomena such as slip, adhesion, etc.
Local features of objects such as shape, edges, embossing and recessed features
Electrical conductivity
Vibrotactile sensations
Thermal
Cold or warmth
The Skin
The largest and heaviest organ
roughly 2 m^2; 5 kilograms
Two layers
Epidermis: the outer protective layer of the skin, covering the dermis.
Dermis: the sensitive connective tissue layer of the skin, located below the epidermis and containing nerve endings, glands and blood vessels.
Tactile Sensation
Hairless skin
Palm/fingertip
Up to 135 receptors per square centimeter at the fingertip
The highest sensorial density of specialized receptors
The hand receptors map to nearly a quarter of the total cortex surface of the brain
The sensorial mapping is dynamic (why?)
Five major types of receptors
Hairy skin
An additional type of receptor, the hair-root plexus, detects movement on the surface of the skin
Hairy regions are more sensitive, since the hairs act as levers, providing considerable amplification of the applied force.
Meissner's Corpuscles
surface curvature, local shape, slippage
poor spatial resolution
43% of receptors
Pacinian Corpuscles
vibration, slippage, acceleration
response frequency range: 70-1000 Hz
13% of receptors
Merkel's Disks
skin curvature, local shape, pressure
25% of receptors
Ruffini Endings
skin stretch, local force
19% of receptors
Sensorial Adaptation
The temporal variation of a receptor's response to a constant stimulus.
Slowly adapting (SA) receptors:
The stimulus can be detected for a long time without much decay
Example: weight
Ruffini corpuscles
SA receptors, as they produce a regular discharge rate for a steady load.
skin stretch, local force
Meissner corpuscles
Rapidly adapting (RA) receptors, as they discharge mostly at the onset of the stimulus.
surface curvature, velocity, local shape, slip
Pacinian corpuscles
RA receptors, as they discharge once for each stimulus application and are not sensitive to constant pressure.
vibration, slip, acceleration
Spatial resolution
Human fingerpad: 1.5 mm
Pain Sensing
Excessive mechanical, chemical, thermal or electrical stimuli can excite the pain sensation.
Negative adaptation
The subject perceives an increase in pain after an injury.
Thermal Sensing
Touching or no touching
Separate types of receptors.
Cold sensitivity extends to a greater depth in the skin than warm sensitivity does.
Some areas are sensitive only to cold, e.g. the cornea.
Many points on the skin surface respond only to cold, or only to warmth, or to neither.
E.g. on the forearm, cold spots average 13 to 14 per cm^2, but warm spots average 1 or 2 per cm^2
Cold spots may be excited by a warm stimulus (over 45 °C), producing the sensation of paradoxical cold: the hot stimulus actually feels cold.
Cold receptors are most sensitive at 1 °C, while warmth receptors have a maximum sensitivity around 37 °C.
There is a physiological zero, a temperature region around which no temperature is sensed.
Skin is not uniformly sensitive to excitation over its entire region.
The delay time of these receptors ranges from about 50 to 500 msec.
The thresholds of different receptors overlap, and the perceptual qualities of touch are determined by the combined inputs from different types of receptors.
The operating frequency range is from at least 0.04 Hz to greater than 1 kHz.
2. Kinesthetic Sensing
To recognize the object
Overall shape
Stiffness
Weight
Overall force
Muscles
Tendon organs: monitor muscle tension
Spindle organs: measure stretch and its rate of change
None of the skin, joint, or muscle receptors provides awareness of weight; instead, this sense arises mainly from signals derived entirely within the central nervous system (negative adaptation)
Motor Aspects
Maximum and sustained force exertion
Finger mechanical impedance
Force control bandwidth
Precision grasps:
less force
higher dexterity (only the fingertips are used)
Hogan at MIT found that the impedance of the human hand is indistinguishable from that of a passive system, even though the human hand is clearly an active system.
This is a fundamental assumption for the stability of a number of control algorithms
Manipulation bandwidth
The speed (rapidity) with which humans can respond/act.
System Design
Input/output asymmetry
Stability vs. fidelity
For fidelity, high-frequency feedback is required
For stability, high frequencies should be filtered out
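The stability side of this trade-off can be illustrated with a first-order low-pass filter applied to a sampled force signal. This is only a sketch of the idea: the filter type, the 30 Hz cutoff and the 1 kHz sample rate are assumptions for illustration, not a design taken from these notes.

```python
import math


def low_pass(samples, cutoff_hz, sample_rate_hz):
    """First-order (RC-style) low-pass filter over a list of samples."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = dt / (rc + dt)            # smoothing factor between 0 and 1
    filtered = []
    y = samples[0] if samples else 0.0
    for x in samples:
        y = y + alpha * (x - y)       # move part of the way toward the new sample
        filtered.append(y)
    return filtered


# ASSUMPTION: force sampled at 1 kHz, hypothetical 30 Hz cutoff.
steady = low_pass([1.0] * 10, cutoff_hz=30.0, sample_rate_hz=1000.0)
chatter = low_pass([1.0, -1.0] * 50, cutoff_hz=30.0, sample_rate_hz=1000.0)
print("steady force passes through:", steady[-1])
print("500 Hz chatter is attenuated to about:", max(abs(v) for v in chatter[-10:]))
```

A lower cutoff suppresses more of the high-frequency content that can destabilize the loop, at the cost of the high-frequency detail that fidelity requires.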
Data Transmission
Transmission rate (depending on the types of haptic signals)
Data volume (depending on spatial resolution)
Purpose of perception
Not actual values
But a mental image
Sensory Substitution
A common basic principle governs the correspondence between stimulus magnitude and sensation magnitude:
the sensation magnitude grows as a power function of the stimulus magnitude
$\psi = k\,\phi^{\beta}$
where:
$\psi$: sensation magnitude
$\phi$: stimulus magnitude
$k$: a constant depending on the unit of measurement
$\beta$: an exponent that differs from one sensory modality to another.
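The power-function relation can be explored numerically. In the sketch below, sensation = k * stimulus ** exponent; the exponent values are approximate figures often quoted from Stevens' magnitude-estimation work and should be treated as illustrative assumptions rather than values given in these notes.

```python
# ASSUMPTION: approximate, commonly quoted exponents; illustrative only.
EXPONENTS = {
    "brightness": 0.33,
    "loudness": 0.67,
    "electric shock": 3.5,
}


def sensation_magnitude(stimulus, exponent, k=1.0):
    """Power function: sensation = k * stimulus ** exponent."""
    if stimulus < 0:
        raise ValueError("stimulus magnitude must be non-negative")
    return k * stimulus ** exponent


def growth_when_doubled(exponent):
    """Factor by which sensation grows when the stimulus doubles."""
    return 2.0 ** exponent


for modality, exponent in EXPONENTS.items():
    print(f"{modality}: doubling the stimulus scales sensation by "
          f"{growth_when_doubled(exponent):.2f}")
```

An exponent below 1 compresses the stimulus range, while an exponent above 1 expands it, which is why the exponent must be known per modality before one sensation is mapped onto another.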
Cross-modality Mapping
Can a scaling relation be created between two distinct sensory modalities?
A stimulus in modality 1 is applied to the subjects. The subjects are asked what the sensation magnitude would be in modality 2, by matching numbers and making direct comparisons between the two sensory modalities.
Yes, there is a relation between any two distinct sensory modalities [Stevens 1975].
Cross-modality matching is common in nature. [Stevens 1959 1966 1975; Stevens et al. 1963; Marks 1986; and Hubbard 1993]
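If each modality follows a power function, the cross-modality relation can be derived in a few lines. The symbols below are assumed notation for this sketch: sensation magnitude $\psi_i$, stimulus magnitude $\phi_i$, unit constant $k_i$ and exponent $\beta_i$ for modality $i$.

```latex
\begin{align}
\psi_1 &= k_1\,\phi_1^{\beta_1}, \qquad \psi_2 = k_2\,\phi_2^{\beta_2} \\
\psi_1 = \psi_2 \;&\Rightarrow\; \phi_2 = \left(\frac{k_1}{k_2}\right)^{1/\beta_2} \phi_1^{\beta_1/\beta_2} \\
\log \phi_2 &= \frac{\beta_1}{\beta_2}\,\log \phi_1 + \frac{1}{\beta_2}\,\log\frac{k_1}{k_2}
\end{align}
```

Under this assumption the matched stimulus magnitudes fall on a straight line in log-log coordinates, with a slope equal to the ratio of the two exponents.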
Sensory Substitution
Sensory substitution is the provision to the brain of information that is usually in one sensory domain (for example, haptic information) by means of the stimuli, receptors, pathways and brain areas of another sensory system (for example, the auditory system) [Bach et al. 1987].
"Sensory substitution" systems transform stimuli characteristic of one sensory modality (for example, haptic) into stimuli of another sensory modality (for example, auditory).
Sensory Substitution
Originally developed as an aid for people with disabilities
Vision through touch
Braille for the blind: information acquired visually (reading) is, instead, acquired through the fingertips
In this course, we will see how and whether it can be used in INS systems
Projects
Flying airplanes, as early as 1936 [deFlorez, 1936]
A pilot was dependent on external visual references to maintain flight in clear weather conditions.
In fog, even the most experienced pilot could not maintain a proper orientation without suitable instruments.
Research was conducted to establish aural reference axes that could be substituted for visual ones during instrument flying conditions:
providing a turn indication consisting of an increase in sound intensity in one ear and a decrease in the other
having changes in the sound's pitch represent changes in airspeed.
It was demonstrated that blindfolded pilots could fly airplanes when two of their instrument indications were presented aurally.
Projects
Further research in this area was conducted at Harvard University in the early to mid 1940s to determine the accuracy and speed of pilot response to a variety of auditory cues [Forbes 1946]:
what types of auditory signals could be followed with greatest ease,
with what accuracy such signals could be utilized,
how many simultaneous auditory signals could be followed successfully.
Projects
The tactile vision substitution system (TVSS) for the blind [since 1969; Bach-y-Rita]
A head-mounted video camera captures an image of the environment.
The image is then converted into a "tactile image".
The tactile image is produced by a matrix of 400 activators (20 rows and 20 columns of solenoids of one millimeter diameter).
The matrix is placed either on the back or on the chest.
Equipped with the TVSS, blind (or blindfolded) subjects are almost immediately able to detect simple targets and to orient themselves.
They are also rapidly able to discriminate vertical and horizontal lines, and to indicate the direction of movement of mobile targets.
The recognition of simple geometric shapes requires some learning (around 50 trials to achieve 100% correct recognition).
More extensive learning is required in order to identify ordinary objects in different orientations. The latter task requires 10 hours of learning in order to achieve recognition within 5 seconds.
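The image-to-tactile conversion can be sketched as block averaging of a grayscale image down to the 20 x 20 grid, followed by thresholding. The on/off (binary) activation, the 0.5 threshold and the function name are assumptions for illustration; the notes do not describe how the original system performed the conversion.

```python
GRID = 20  # 20 rows x 20 columns of activators (from the notes)


def to_tactile_image(image, grid=GRID, threshold=0.5):
    """Reduce a grayscale image (rows of floats in 0..1) to a grid of booleans.

    ASSUMPTION: each activator is simply on or off, switched on when the mean
    brightness of its image block reaches the threshold.
    """
    rows, cols = len(image), len(image[0])
    if rows < grid or cols < grid:
        raise ValueError("image must be at least grid x grid pixels")
    tactile = []
    for r in range(grid):
        r0, r1 = r * rows // grid, (r + 1) * rows // grid
        tactile_row = []
        for c in range(grid):
            c0, c1 = c * cols // grid, (c + 1) * cols // grid
            block = [image[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            tactile_row.append(sum(block) / len(block) >= threshold)
        tactile.append(tactile_row)
    return tactile


# A 40 x 40 test image: bright left half, dark right half (a vertical edge).
test_image = [[1.0 if col < 20 else 0.0 for col in range(40)] for _ in range(40)]
pattern = to_tactile_image(test_image)
print("".join("#" if on else "." for on in pattern[0]))
```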
Projects
Auditory and tactile senses were studied as substitutes for kinesthetic feedback in time-delayed teleoperation [Massimino, MIT 1993, 1995]
The force feedback is substituted by auditory sensations
The task was to insert a rectangular peg into a rectangular hole
To indicate force from contact at the left or right side of the hole, a medium-pitch (1000 Hz) tone sounded in the left or right ear (the subject wore earphones).
To indicate contact at the top or bottom, the tone was at high (3500 Hz) or low (350 Hz) pitch in both ears, which made the tone appear to the subjects to come from the middle of the head.
The loudness of the tone indicated the magnitude of the force.
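The mapping from contact force to sound described above can be written down compactly. In this sketch the tone frequencies come from the description, while the linear, clamped mapping from force to loudness gain, the `max_force` parameter and the function name are assumptions; the description says only that loudness indicated force magnitude. High pitch is taken to mean top contact and low pitch bottom contact.

```python
# Tone frequency per contact side (values from the description above).
TONE_HZ = {"left": 1000.0, "right": 1000.0, "top": 3500.0, "bottom": 350.0}


def contact_to_tone(side, force, max_force):
    """Return (frequency_hz, left_ear_gain, right_ear_gain) for a contact.

    ASSUMPTION: loudness gain grows linearly with force and is clamped to 0..1.
    """
    if side not in TONE_HZ:
        raise ValueError(f"unknown contact side: {side}")
    if max_force <= 0:
        raise ValueError("max_force must be positive")
    gain = min(max(force / max_force, 0.0), 1.0)
    left_gain = gain if side != "right" else 0.0    # silent in left ear for right contact
    right_gain = gain if side != "left" else 0.0    # silent in right ear for left contact
    return TONE_HZ[side], left_gain, right_gain


print(contact_to_tone("left", 5.0, 10.0))
print(contact_to_tone("top", 8.0, 10.0))
```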
Projects
M. Kitagawa, D. Dokko, A. M. Okamura, and D. D. Yuh, "Effect of Sensory Substitution on Suture Manipulation Forces for Robotic Surgical Systems," Journal of Thoracic and Cardiovascular Surgery, Vol. 129, No. 1, pp. 151-158, 2005. (The Johns Hopkins University)
Work was done to substitute direct haptic feedback with visual and auditory cues in robotic surgical systems.
Visual cues
Play an active role in our daily life. For example:
Traffic signals use colored lights to draw people's attention and prompt correct reactions to different traffic conditions.
Animals react strongly to some aggressive colors, such as red and yellow.
We can roughly estimate the temperature of a flame by observing its color.
Disadvantages:
Accuracy
Opposite to the human-centric principle
Not intuitive.
May not work when the user is tired or fatigued.
Mismatching is possible
For example, if used for tactile feedback: a dynamic bandwidth of up to 1 kHz is required for realistic force reflection, yet visual cues are very much slower.