
A Human-Computer Interface Design Using Automatic Gaze Tracking

Kevin Huang, Steve Petkovsek, Binay Poudel, and Taikang Ning
Department of Engineering, Trinity College, Hartford, Connecticut, USA
taikang.ning@trincoll.edu
Abstract: This paper describes the design and implementation of a human-computer interface using an in-house developed automatic gaze-tracking system. A focus has been placed on developing an inexpensive, non-invasive gaze-tracking computer interface by which the user can control computer input in a hands-free manner. In the underlying project, infrared (IR) light-emitting diodes (LEDs) are placed around a computer monitor to produce reference corneal glints from the user's eye and to illuminate the user's pupil. An IR-sensitive video camera is then used to capture images of these glints. A graphical user interface gathers calibration glint information while the user gazes at six strategically placed calibration points. A linear model is derived from these data and used to map the vertical and horizontal displacements of the glints, measured with respect to physical landmarks of the user's pupil, onto the corresponding point of gaze on the monitor. The design is capable of real-time performance and was evaluated with many volunteers with a good success rate.

Keywords: gaze tracking; infrared sensing; bright pupil; image segmentation

I. INTRODUCTION

A personal computer is a ubiquitous tool in modern society. Minimal operation of a personal computer typically requires use of the mouse and keyboard. Because of this, people with limited or obstructed mobility of their hands, such as paraplegics, amputees, and sufferers of carpal tunnel syndrome or arthritis, may have trouble operating a personal computer. One approach to providing hands-free computer access is gaze tracking. First attempts included very invasive techniques, such as electrodes placed on the user's temples to monitor skin electrical potentials corresponding to muscle movement, and contact lenses designed exclusively for gaze tracking [1]. In contrast, the fundamentals of Yoo et al.'s approach are completely non-invasive. Yoo proposed that light produced by LEDs located in the plane of a monitor screen, when reflected off the user's cornea, can provide information about eye gaze. When exactly one LED was placed at each of the four corners of the monitor, Yoo discovered that the pupil center remained within the polygon formed by connecting the four glints off the user's cornea [2]. It was then proposed that the ratio of the vertical and horizontal distances of the pupil center with respect to the corneal glints approximates the corresponding ratio of the point of gaze within the plane of the monitor, and thus eye gaze can be estimated. Under the basic assumption that the user's cornea can be approximated as a planar surface that reflects like a planar mirror, this can be demonstrated by ray tracing, as shown below in Fig. 1.

Figure 1. Cross-ratio methodology (ray-tracing schematic showing the LEDs, camera, corneal glints, pupil center, optical axis, macula, and estimated gaze)
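For concreteness, the proportional reading of the glint polygon described above can be sketched in a few lines of Python. This is a minimal sketch under the planar-cornea assumption, not the authors' exact formulation: the glint ordering, the use of edge midpoints, and the screen resolution are illustrative assumptions.

```python
# Minimal sketch of the simplified cross-ratio mapping (illustrative only).
# `glints` holds image coordinates of the four corneal reflections, assumed
# ordered top-left, top-right, bottom-right, bottom-left; `pupil` is the
# pupil center in the same image coordinates.

def estimate_gaze(glints, pupil, screen_w=1280, screen_h=1024):
    (tlx, tly), (trx, try_), (brx, bry), (blx, bly) = glints
    # Horizontal fraction of the pupil center between the left and right
    # glint edges (edge midpoints used as an illustrative simplification).
    left, right = (tlx + blx) / 2.0, (trx + brx) / 2.0
    u = (pupil[0] - left) / (right - left)
    # Vertical fraction between the top and bottom glint edges.
    top, bottom = (tly + try_) / 2.0, (bly + bry) / 2.0
    v = (pupil[1] - top) / (bottom - top)
    # Under the planar-cornea assumption these fractions carry over to the
    # monitor plane, yielding the estimated point of gaze in screen pixels.
    return u * screen_w, v * screen_h

# Example: a pupil centered in the glint polygon maps near the screen center.
print(estimate_gaze([(10, 10), (50, 10), (50, 40), (10, 40)], (30, 25)))
```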

This method, termed the cross-ratio method, was successfully used to track a user's gaze while the head was stationary. Drawbacks included a lack of consistent accuracy (deviations of as much as 164.5 mm were reported by Kang [2]), a limited range of head motion, and the computational complexity of glint and pupil-center extraction. This project addresses the latter: by limiting image acquisition to the IR range, the method circumvents extraneous signals in the visible spectrum, increasing the signal-to-noise ratio (SNR) of the glint and pupil images and mitigating computational requirements. There are two reasons for using IR light for the illumination sources. First, IR LED light is not visible to the human eye. As a result, using IR illumination as the glint reflection source does not distract the user or create the discomfort that a visible light source could; this also makes the system suitable for low-light settings. Second, the human retina acts as a retro-reflector of IR light [4]; that is, any light entering the pupil is reflected with minimal scattering back along a parallel vector. This property is useful for detecting the pupil: acquiring a signature of the retro-reflection (the bright pupil) requires illumination coaxial with the image-capturing device. Further, unwanted illumination, such as visible-light illumination, which is subject to inconsistency, can be filtered out. This preprocessing of the data captured by the camera reduces processing time and guarantees a stable lighting environment, ensuring that the required information (i.e., the glint and pupil-center locations) is always extractable.


II. METHODS

With the integration of IR illumination described above, the cross-ratio methodology of Yoo et al. [1] was modified and improved in several respects in our study. The improvements include both the system design/construction and the image processing for gaze tracking.

A. System Design and Implementation

The hardware setup consisted of a host PC and computer monitor, coaxial and reference glint LEDs, a webcam with an IR filter, and a microcontroller that managed the glint-source switching circuitry. A simple wooden chin rest was constructed to restrict head movements during gaze tracking. To capture images, a Logitech QuickCam Communicate MP web camera was modified for IR image acquisition. In particular, the hot glass (IR-blocking filter) included in the manufacturer's construction was removed and replaced with an IR band-pass filter (LEE Filters LE8733). The built-in camera lens was replaced with a 25 mm lens to achieve good focus on the user's eye.

Figure 2. The physical setup of the automatic gaze-tracking system using infrared LEDs

To produce the IR reference glints, a pair of 2-pin 875 nm IR LEDs was secured to each corner of the Viewsonic LCD monitor. Similarly, to produce the coaxial illumination for the bright pupil, six 875 nm IR LEDs were placed around the optical lens of the web camera. The LEDs were driven by constant-current sources at 75 mA, which produced adequate glint intensity and equivalent IR illumination from both the glint and coaxial sources.

Figure 3. Transfer characteristics of the TSHA5502 875 nm infrared LED

To ensure synchronization between image capture and illumination control, the v-sync pin of the camera controller was used as a logical toggle signal, alternating the illumination between the coaxial bright-pupil source and the off-axis reference-glint source. Thus, each successive frame was illuminated by the source not used in the previous frame. The frame rate could be set to 5, 10, 15, or 20 frames per second.
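On the host side, the alternating illumination scheme amounts to pairing consecutive frames. The following is a hypothetical sketch assuming a standard OpenCV capture loop; the device index, the frame rate, and the assumption that the first grabbed frame is the bright-pupil frame are placeholders, and the actual phase of the v-sync toggle would need to be established in practice.

```python
import cv2

cap = cv2.VideoCapture(0)        # device index assumed for the modified webcam
cap.set(cv2.CAP_PROP_FPS, 10)    # the system supports 5/10/15/20 fps

def next_gray(capture):
    """Grab one frame and reduce it to a single 8-bit grayscale channel."""
    ok, frame = capture.read()
    if not ok:
        raise IOError("camera read failed")
    return cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

pairs = []
for _ in range(100):
    # The v-sync toggle makes consecutive frames alternate illumination:
    # one coaxial (bright-pupil) frame, then one off-axis (glint) frame.
    bright = next_gray(cap)   # coaxial LEDs on: retro-reflected bright pupil
    glint = next_gray(cap)    # corner LEDs on: four corneal glints
    pairs.append((bright, glint))

cap.release()
```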

B. User Interface

A user interface was created for the system using the NumPy toolbox in Python. Users were asked to view six specific points at known locations for the purpose of calibration. Once the calibration process was completed, users could see the computed gaze location on the screen as an orange circle. Users also had the option of using the chin rest to keep their eyes within the focus area of the camera.
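A calibration screen of this kind might be drawn with OpenCV along the following lines. This is a hypothetical sketch only; the window name, the 3 x 2 point layout, and the colors are illustrative choices rather than the actual interface.

```python
import cv2
import numpy as np

W, H = 1280, 1024                              # assumed screen resolution
canvas = np.zeros((H, W, 3), np.uint8)

# Six calibration targets at known locations (3 columns x 2 rows assumed).
points = [(int(W * fx), int(H * fy))
          for fy in (0.25, 0.75) for fx in (0.15, 0.5, 0.85)]
for p in points:
    cv2.circle(canvas, p, 10, (255, 255, 255), -1)   # white fixation targets

gaze = (640, 512)                                    # placeholder gaze estimate
cv2.circle(canvas, gaze, 20, (0, 165, 255), 2)       # orange gaze cursor (BGR)

cv2.imshow("calibration", canvas)
cv2.waitKey(0)
```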

C. Image Processing and Gaze Extraction


Using the described hardware setup, bright pupil and glint signatures were captured. Figure 4 shows samples of these types of images.

Figure 4. Bright pupil and reference glints

To isolate the pupil and glints from the captured images, a common-mode rejection approach was taken. Subtracting the two images in Fig. 4 resulted in an image that contained both desired signals (pupil center and reference glints), as shown below in Fig. 5.

Figure 5. Subtracted image

The ratio of the relative distances of the pupil center to each of the reference glints was extracted from this image. This was done using a static threshold and a median filter to isolate the pupil and glints. Once the pupil shape was found, its center was approximated by taking the arithmetic mean of the maximum and minimum horizontal and vertical coordinates of the pixels in the identified shape. Following designation of the pupil center, a specified concentric region was cropped around the pupil. Since the pupil center always remains within the polygon formed by connecting the four corneal glints, the image was separated into quadrants about the pupil center, as shown below in Fig. 6. The coordinates of the small dark dot in each quadrant with respect to the pupil center were used to determine the cross-ratio, and thus estimate the gaze location. The images for each quadrant were much smaller than the original picture, which reduced processing time and enabled real-time operation.

Figure 6. Cropped and segmented signal
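The pipeline just described (frame subtraction, static threshold, median filter, pupil-center estimate, quadrant assignment) might look roughly as follows in OpenCV. The threshold value and the use of connected components to separate the pupil blob from the glint blobs are illustrative assumptions, not the authors' exact implementation.

```python
import cv2
import numpy as np

def extract_pupil_and_glints(bright, glint, thresh=60):
    # Common-mode rejection: ambient light appears in both frames and cancels
    # in the difference, while the pupil (bright frame) and the four glints
    # (glint frame) both survive as high-intensity regions.
    diff = cv2.absdiff(bright, glint)
    diff = cv2.medianBlur(diff, 5)                 # median filter kills speckle
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)

    # Treat the largest connected bright region as the pupil.
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(mask)
    pupil = 1 + int(np.argmax(stats[1:, cv2.CC_STAT_AREA]))  # skip background
    ys, xs = np.nonzero(labels == pupil)

    # Pupil center: mean of the extreme coordinates, as described in the text.
    cx, cy = (xs.min() + xs.max()) / 2.0, (ys.min() + ys.max()) / 2.0

    # Remaining blobs are glints; file each under its quadrant relative to
    # the pupil center for the cross-ratio computation.
    quads = {}
    for lbl in range(1, n):
        if lbl != pupil:
            gx, gy = centroids[lbl]
            quads[(gy < cy, gx < cx)] = (gx, gy)   # keys: (is_top, is_left)
    return (cx, cy), quads
```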

To ameliorate the effects of inter-subject variability (corneal surface roughness, pupil size, general eye anatomy, etc.), a simple linear calibration method was used. The gaze estimation described above was performed while the user viewed six specific points at known locations. Data were taken three times at each of the six locations, under constraints on both the precision (variance among three consecutive measurements) and the accuracy (a static threshold on the distance between the estimated gaze and the known calibration location) of the measured data. The data were then used to create two linear models: one mapping the vertical cross-ratio to the vertical gaze estimate and one mapping the horizontal cross-ratio to the horizontal gaze estimate. Figure 7 shows the calibration graphical interface.

Figure 7. The graphical calibration interface screen
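Each per-axis model can be obtained with an ordinary least-squares fit. In this sketch the measurements and target coordinates are placeholder values, and `np.polyfit` stands in for whatever fitting routine the original software used.

```python
import numpy as np

def fit_axis(cross_ratios, screen_coords):
    """Fit screen_coord = a * cross_ratio + b by least squares
    (one such model per axis, as described above)."""
    a, b = np.polyfit(np.asarray(cross_ratios), np.asarray(screen_coords), 1)
    return a, b

# Placeholder horizontal measurements: in practice, three samples at each of
# the six calibration points would contribute to the fit.
u_meas = [0.21, 0.48, 0.79, 0.22, 0.50, 0.80]    # measured horizontal ratios
x_true = [160, 640, 1120, 160, 640, 1120]        # known screen x-coordinates
ax, bx = fit_axis(u_meas, x_true)

# Corrected horizontal gaze estimate for a new measurement:
x_est = ax * 0.5 + bx
```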

After calibration, the corrected gaze estimate appeared as a circular cursor on the screen. The size of the cursor grew in proportion to the gaze location variance, calculated over 100 cycles (200 frames). This feature was included to reduce erratic cursor movement. The software user interface was created in Python 2.6 with the OpenCV library.
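The variance-scaled cursor could be realized along these lines; the window length matches the 100 cycles (200 frames) mentioned above, while the base radius and gain are illustrative values.

```python
from collections import deque
import numpy as np

class CursorSizer:
    """Grow the cursor radius with the variance of recent gaze estimates, so
    an unsteady gaze reads as a larger, softer target instead of a
    jittering point."""
    def __init__(self, window=100, base_radius=8, gain=0.5):
        self.history = deque(maxlen=window)   # one entry per 2-frame cycle
        self.base_radius = base_radius
        self.gain = gain

    def radius(self, x, y):
        self.history.append((x, y))
        pts = np.asarray(self.history, dtype=float)
        variance = pts.var(axis=0).sum()      # combined x/y gaze variance
        return self.base_radius + self.gain * np.sqrt(variance)
```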

III. RESULTS AND DISCUSSION

The device was tested on a diverse group of Trinity students and faculty. Ninety percent of users were able to successfully calibrate the device and move the cursor to a location given at random, with deviations of less than 10% of the screen dimensions. Errors did occur when the user looked too close to the screen edge: the cursor would erroneously jump to the wrong edge when the user looked near the screen edge or off the screen entirely. This happened because a quadrant would then contain two glints, and the coordinates of the wrong glint would be used to estimate gaze.

The device was evaluated in real time. With a mid-range desktop computer as the host handling all computation, the design was able to track eye movement in real time; the on-screen cursor followed a volunteer's eye movement with a delay of less than half a second. We evaluated the performance with different users to verify robustness, and the results showed a good success rate with little sensitivity to user variability. In addition, the complete design and implementation was achieved with an entire system costing less than $50, a major design goal of the study. A more powerful computer and camera would no doubt lead to better performance and could accommodate more sophisticated image processing to improve the system further.

IV. CONCLUSION

A complete hands-free human-computer interface was designed and implemented using an in-house developed automatic gaze-tracking system. The implementation involves a combination of various hardware and software tools and requires delicate system integration. In our study, the system design successfully completed the following tasks:

- It integrates IR illumination with cross-ratio techniques via common-mode rejection to extract a clean pupil center and glint references with satisfactory results in a lighted environment.

The complete system was designed and implemented on a small budget using mostly commercially available parts. The automatic gaze-tracking system has shown robust performance with little sensitivity to different users. The underlying design could be improved considerably with more powerful processors, and more sophisticated image-processing algorithms could make the gaze tracking finer still.

REFERENCES
[1] D. H. Yoo and M. J. Chung, "A novel non-intrusive gaze estimation using cross-ratio under large head motion," Computer Vision and Image Understanding, vol. 98, no. 1, pp. 25-51, 2005.
[2] J. J. Kang, M. Eizenman, E. D. Guestrin, and E. Eizenman, "Investigation of the cross-ratios method for point-of-gaze estimation," IEEE Trans. Biomedical Engineering, pp. 2293-2302, Sept. 2008.
[3] Yoo, Kim, Lee, and M. J. Chung, "Non-contact eye gaze tracking system by mapping of corneal reflections," in Proc. Fifth IEEE Int. Conf. Automatic Face and Gesture Recognition, 2002, pp. 94-95.
[4] H. Hua, P. Krishnaswamy, and J. P. Rolland, "Video-based eyetracking methods and algorithms in head-mounted displays," Optics Express, vol. 14, no. 10, p. 4328, 2006.
[5] X. Liu and F. Xu, "Real-time eye detection and tracking for driver observation under various light conditions," in Proc. IEEE Intelligent Vehicle Symposium, 2002.
[6] D. Koons, A. Amir, and M. Flickner, "Pupil detection and tracking using multiple light sources," Image and Vision Computing, vol. 18, no. 4, pp. 331-335, Mar. 2000.
[7] E. D. Guestrin and M. Eizenman, "Remote point-of-gaze estimation requiring a single-point calibration for applications with infants," in Proc. 2008 Symposium on Eye Tracking Research & Applications.
[8] E. D. Guestrin, M. Eizenman, J. J. Kang, and E. Eizenman, "Analysis of subject-dependent point-of-gaze estimation bias in the cross-ratios method," in Proc. 2008 Symposium on Eye Tracking Research & Applications.



