
A Fast and Robust Eye-Event Recognition (FRER)

for Human-Smartphone Interaction


Ardhiansyah Baskara
School of Electrical and Computer Engineering
Pusan National University
Busan, 609-735 Republic of Korea
ardhiansyah.baskara@gmail.com

Han-You Jeong
School of Electrical and Computer Engineering
Pusan National University
Busan, 609-735 Republic of Korea
hyjeong@pusan.ac.kr

Abstract: Human-computer interaction using computer vision has been extensively studied to give people with severe physical challenges, such as Lou Gehrig's disease or stroke, a chance to access computers. Recently, the smartphone has become one of the most important gadgets in our daily life. However, computer vision for human-smartphone interaction (HSI) still faces many challenges, such as hardware limitations and the unstable distance and pose of the smartphone user. In this paper, we present the fast and robust eye-event recognition (FRER) scheme, which consists of the eye-area extraction, eye-tracking, and eye-event recognition blocks. We also propose the slope-based similarity checking (SSC) algorithm for eye-event recognition of a person with arbitrary eye size. The experimental results show that the FRER scheme can successfully detect eye events with 99.3% accuracy at a frame rate of 19 frames per second.

Keywords: Human-smartphone interaction, computer vision, eye-event recognition.

I. INTRODUCTION
Human-computer interaction (HCI) based on computer vision technology has been one of the hottest research topics, because eye-event recognition is one way to obtain input commands from a user with a camera [1], [2]. Many researchers have extensively studied novel ways to establish the interaction between a user and a computer. The seminal paper in [1] presents a framework of eye-event recognition that detects the eye area through motion analysis, tracks the eye area using a similarity measure, and then recognizes eye events based on threshold values of the similarity. The authors in [2] present a robust implementation of this framework that supports a frame rate of 30 frames per second (FPS) on a desktop PC equipped with a webcam.
Recently, the smartphone has become a ubiquitous mobile device for web browsing, instant messaging, and streaming services in our daily life. In this paper, we focus on human-smartphone interaction (HSI) using computer vision technology. The goal of this research is to give people with severe physical challenges, such as Lou Gehrig's disease or stroke, a chance to access the smartphone. Compared to the HCI, the HSI usually faces a couple of additional challenges: 1) how to detect and track eye events in a computationally efficient way; and 2) how to accurately recognize the eye events of a person with arbitrary eye size.
EyePhone in [3] is the first hand-free interaction for driving

Fig. 1. Overview of the FRER scheme

the apps using the HSI. In [4], the EyeGuardian informs the
user if his/her blink rate is exceptionally low. For the template
of eye tracking, the EyePhone requires an additional step to
collect open-eye templates at the initial phase, whereas the
EyeGuardian uses computationally intensive Haar Cascade
Classifier. For the eye-blink detection, both apps use the
threshold-based similarity checking (TSC) which is not robust
to a person with different eye size.
In this paper, we propose the fast and robust eye-event recognition (FRER) scheme. The FRER scheme first extracts the eye area using face detection, then tracks the location of the eye area, and finally recognizes eye events regardless of eye size. The experimental results show that the FRER scheme can detect eye events with a success probability of 99.3% at a frame rate of 19 FPS.
II. THE FRER SCHEME
Fig. 1 shows the overview of the FRER scheme consisting of three blocks: eye-area extraction (EE), eye tracking (ET), and eye-event recognition (ER). The EE block obtains the eye area through the following steps: it first converts the RGB frame of the smartphone camera in Fig. 1(a) into a grayscale frame, as shown in Fig. 1(b). Next, the EE block employs the Haar Cascade Classifier to extract the face area (see Fig. 1(c)), and then obtains the eye area by cropping it from the face area, as shown in Fig. 1(d). A minimal sketch of these steps is given below.
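As an illustration, the following Python/OpenCV sketch mirrors the EE block. Since the paper's app is built on the Android APIs, the cascade file, the detection parameters, and the crop ratios of the eye band are our assumptions rather than the authors' exact implementation.

```python
import cv2

# Hypothetical sketch of the EE block: grayscale conversion, Haar-cascade
# face detection, and cropping the eye band from the upper face region.
# The cascade file and crop ratios are assumptions, not the paper's exact
# parameters.
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def extract_eye_roi(frame_bgr):
    """Return (x, y, w, h) of the eye band of the largest face, or None."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)     # Fig. 1(b)
    faces = face_cascade.detectMultiScale(gray, 1.3, 5)    # Fig. 1(c)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])     # keep largest face
    # Crop the band where the eyes typically lie (assumed ratios), Fig. 1(d).
    return (x, y + h // 4, w, h // 4)
```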
Once the eye area is obtained from the EE block, the ET block tracks the movement of the eye area using the Haar Cascade Classifier, as shown in Fig. 1(e). To this end, the ET block considers the extracted eye area as the region of interest (ROI), which includes the open-eye template in Fig. 1(f). This ROI is also used to skip the computationally intensive EE block, as explained at the end of this section. A sketch of this ROI-restricted search is given after the table below.

TABLE I
FRAME RATE OF EYE TRACKING

                                Frame rate (FPS)
  Scheme    Head movement     Mean      Max      Min
  Basic     No               15.04    15.74    13.89
  Basic     Yes              15.21    16.12    13.26
  FRER      No               18.57    28.39    14.64
  FRER      Yes              19.34    29.95    12.01
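To make the ROI-based tracking concrete, here is a hedged sketch under the same assumptions as above; haarcascade_eye.xml is a stock OpenCV cascade, and its use here is our illustrative choice, not necessarily the classifier configuration of the paper.

```python
import cv2

# Hypothetical ET-block sketch: re-run a Haar cascade only inside the
# previous ROI, so the costly full-frame face detection (EE block) can be
# skipped on most frames.
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

def track_eye(gray, roi):
    """Search for the eye only within roi = (x, y, w, h); return the new ROI."""
    x, y, w, h = roi
    eyes = eye_cascade.detectMultiScale(gray[y:y + h, x:x + w], 1.1, 3)
    if len(eyes) == 0:
        return None                        # tracking lost
    ex, ey, ew, eh = max(eyes, key=lambda e: e[2] * e[3])
    return (x + ex, y + ey, ew, eh)        # ROI in full-frame coordinates
```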
As shown in Fig. 1(g), we consider two possible eye states: open and closed. To recognize an eye event, the ER block uses the open-eye template of the ET block as the reference. The ER block adopts the same similarity measure, the normalized cross-correlation (NCC), as the existing works in [1]-[4]. The prime difference of the ER block is the use of the slope-based similarity checking (SSC) algorithm to detect eye events. Denoting the NCC value at frame index t by ρ(t), the existing TSC algorithm uses a fixed threshold T on the NCC: an eye is open if ρ(t) ≥ T, and closed otherwise. Instead, the SSC algorithm toggles its state when the slope of the NCC, denoted by Δρ(t), exceeds a fixed threshold S: it maintains the current state if Δρ(t) = |ρ(t) - ρ(t-1)| ≤ S, and switches to the other state otherwise. Fig. 2 plots the NCC of two smartphone users taken from our experiments. We can see that, at the frames of eye closing and opening, both users have a similar slope value, while the NCC value itself differs considerably depending on the eye size. A minimal sketch of the SSC update follows.

Fig. 2. The NCC of two smartphone users
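The SSC rule reduces to a single state update per frame. The sketch below computes ρ(t) with OpenCV's normalized cross-correlation and applies the slope test; the initial open-eye state and the resizing step are our assumptions.

```python
import cv2

S = 0.08  # slope threshold (the value used in Section III)

class SSCDetector:
    """Slope-based similarity checking over the NCC sequence rho(t)."""

    def __init__(self, open_eye_template):
        self.template = open_eye_template  # grayscale patch from the ET block
        self.prev_rho = None
        self.state = "open"                # assumed initial state

    def update(self, eye_area):
        # Resize so the NCC against the template yields a single value.
        h, w = self.template.shape[:2]
        patch = cv2.resize(eye_area, (w, h))
        # rho(t): normalized cross-correlation with the open-eye template.
        rho = float(cv2.matchTemplate(patch, self.template,
                                      cv2.TM_CCORR_NORMED).max())
        # SSC rule: toggle the state only when |rho(t) - rho(t-1)| > S.
        if self.prev_rho is not None and abs(rho - self.prev_rho) > S:
            self.state = "closed" if self.state == "open" else "open"
        self.prev_rho = rho
        return self.state, rho
```

Because the test compares consecutive NCC values rather than their absolute level, the same threshold S can serve users whose open-eye NCC baselines differ, which is exactly the robustness the TSC rule lacks.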
Finally, if ρ(t) is less than 0.5, the FRER scheme interprets this as the ET block failing to track the eye area: it returns to the EE block to extract the eye area again. Otherwise, it skips the EE block and directly executes the ET block with the new ROI. A per-frame loop combining the three blocks is sketched below.
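Putting the blocks together, a per-frame loop under the same assumptions might look as follows; it reuses the helpers sketched above, and the webcam capture stands in for the smartphone camera feed.

```python
import cv2

# Hypothetical per-frame loop tying the EE, ET, and ER blocks together.
# extract_eye_roi(), track_eye(), and SSCDetector come from the sketches
# above; VideoCapture(0) stands in for the smartphone camera.
cap = cv2.VideoCapture(0)
roi, detector = None, None

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    if roi is None:
        # EE block: full-frame face detection to (re)initialize the ROI
        # and capture a fresh open-eye template (assumes the eye is open).
        roi = extract_eye_roi(frame)
        if roi is not None:
            x, y, w, h = roi
            detector = SSCDetector(gray[y:y + h, x:x + w])
        continue

    # ET block: search only inside the previous ROI.
    roi = track_eye(gray, roi)
    if roi is None:
        continue                  # next frame falls back to the EE block

    # ER block: SSC over the NCC sequence; rho < 0.5 means tracking failed.
    x, y, w, h = roi
    state, rho = detector.update(gray[y:y + h, x:x + w])
    if rho < 0.5:
        roi = None                # re-run the EE block on the next frame

cap.release()
```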
III. NUMERICAL RESULTS AND DISCUSSION
In this section, we discuss the numerical results from our experiments with a test app running on a Samsung Galaxy Note 3 Neo. We developed this app using the Android APIs and the OpenCV library.
To demonstrate the computational efficiency, we compare the FRER scheme with a basic scheme that executes all three blocks of FRER at each frame. Table I shows the frame rate of both schemes. In each scenario, we run the test app for three minutes with or without head movements. We can see that the FRER scheme achieves a higher frame rate than the basic scheme: the frame rate of the former is around 19 FPS, while that of the latter is about 15 FPS. We infer that the ET block can reduce the computational load of face detection.
Fig. 3 shows the accuracy of eye-blink recognition for three smartphone users with different eye sizes. To maximize the accuracy of eye-blink recognition, we set the threshold of the TSC algorithm to T = 0.865 and the threshold of the SSC algorithm to S = 0.08. We observe that the accuracy of the SSC algorithm is much higher than that of the existing TSC algorithm: on average, the former achieves an accuracy of 99.3%, while the latter attains 60.0%. We can also see that the SSC algorithm achieves high accuracy regardless of eye size, while the accuracy of the TSC algorithm depends on the eye size of the user: the maximum difference in accuracy among the three users is 2.0% for the SSC algorithm, but 100% for the TSC algorithm.

Fig. 3. The accuracy of eye-blink recognition of three users
From the above results, we conclude that the FRER scheme is not only computationally efficient for a real-time HSI app, but also robust to users with different eye sizes.
ACKNOWLEDGEMENT
This research was supported by the National Research Foundation of Korea (NRF) Grant (No. 2009-0083495) and by the Basic Science Research Program (No. 2013R1A1A1012290) through the NRF, funded by the Ministry of Science, ICT & Future Planning.

REFERENCES
[1] K. Grauman, M. Betke, J. Gips, and G. Bradski, "Communication via eye blinks - detection and duration analysis in real time," in Proc. IEEE CVPR'01, Kauai, Hawaii, Dec. 2001, pp. 1010-1017.
[2] M. Chau and M. Betke, "Real time eye tracking and blink detection with USB cameras," Boston University Computer Science Technical Report No. 2005-12, 2005.
[3] E. Miluzzo, T. Wang, and A. T. Campbell, "EyePhone: activating mobile phones with your eyes," in Proc. ACM MobiHeld'10, New Delhi, India, Aug. 2010, pp. 15-20.
[4] S. Han, S. Yang, J. Kim, and M. Gerla, "EyeGuardian: a framework of eye tracking and blink detection for mobile device users," in Proc. ACM HotMobile'12, San Diego, CA, USA, Feb. 2012.
