
WELCOME
BANDIT FRAMEWORK
FOR SYSTEMATIC
LEARNING IN
WIRELESS VIDEO-BASED
FACE RECOGNITION
GUIDED BY
VINDYA VIJAYAN
VKCET
PRESENTED BY
JIJI BJ
VKCET
INTRODUCTION
Analysis of video streams for face recognition by remote computing
clusters has become common in recent years.
This process experiences time-varying and unknown channel conditions,
traffic loads and processing constraints.
Examples: Google Goggles, Google Glass object recognition,
Facebook face tagging, Microsoft photo gallery face recognition.
A group of devices in a single wireless network is called a wireless cluster.

A group of remotely accessed computers is called a cloud.
The face recognition transaction system consists of:
A video content producer, which captures, encodes and transmits the video
streams.
A cloud computing cluster, which analyzes the visual data from the wireless cluster.
A message sent by the remote cluster and received on the mobile device.
To reduce contention in the wireless network, each device chooses the
encoding bitrate and the number of frames to produce before uploading
its video content.
To reduce task scheduling congestion in the cloud, the required
processing time is scaled.
Finally, within a predetermined time, the device receives a reply from the
cloud.


Each device achieves reliable face recognition while minimizing the
required wireless transmission and cloud-based processing under
highly-varying contention and congestion conditions
Illustration of object or face recognition transactions
between mobile devices and a cloud
RELATED WORK
Every application must maintain its accuracy rate while minimizing its
cost and the number of video frames.
Most existing solutions for designing and configuring wireless
multimedia applications are based on a dynamic cloud.
Reinforcement learning is commonly used in wireless
multimedia applications with cloud processing.
For faster operation, a multi-armed bandit (MAB) approach is proposed.
It keeps an index that consists of the estimated performance and its
uncertainty; the index is updated using feedback.
The MAB chooses the option with the highest index for fast execution.
To exploit context information, we use contextual bandit algorithms.
Our proposed framework learns independently for each context and
can be used in multi-user settings.
MAB performance is measured by a regret bound: the difference between
the expected recognition rate of the oracle solution and the recognition
rate achieved by the online learning algorithm.
The regret is of order O(log k), where k is the number of recognition
transactions.
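The index idea above (estimated performance plus an uncertainty term, updated from feedback, highest index chosen) can be sketched with the classic UCB1 rule of Auer, Cesa-Bianchi and Fischer. This is a minimal illustration, not the algorithm of the presented framework: the class name UCB1, the helper run, and the simulated success probabilities are all assumptions made for the example.

```python
import math
import random


class UCB1:
    """UCB1-style index policy: index = estimated mean + uncertainty bonus."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms      # how often each arm (setting) was tried
        self.means = [0.0] * n_arms     # running mean reward of each arm

    def index(self, arm, total_plays):
        # Untried arms get an infinite index so that each is tried once.
        if self.counts[arm] == 0:
            return math.inf
        bonus = math.sqrt(2.0 * math.log(total_plays) / self.counts[arm])
        return self.means[arm] + bonus

    def select(self):
        # Choose the arm with the highest index.
        total = sum(self.counts)
        return max(range(len(self.counts)), key=lambda a: self.index(a, total))

    def update(self, arm, reward):
        # Feedback step: incremental update of the running mean.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def run(success_probs, rounds, seed=0):
    """Simulate Bernoulli rewards (hypothetical recognition successes)."""
    rng = random.Random(seed)
    policy = UCB1(len(success_probs))
    for _ in range(rounds):
        arm = policy.select()
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        policy.update(arm, reward)
    return policy
```

Calling run([0.2, 0.5, 0.9], 2000) and inspecting policy.counts shows how the plays concentrate on the arm with the highest success probability as the uncertainty bonus of the other arms shrinks.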

CONTENTS
SYSTEM DESCRIPTION AND SYSTEM MODEL
Video Capturing and Encoding
Visual Analysis
Wireless Transmission
System Model
DISTRIBUTED, DEVICE-ORIENTED, BANDIT LEARNING
ALGORITHM
Device oriented Contextual Learning Algorithm
Service oriented Contextual Learning Algorithm
NUMERICAL RESULT
CONCLUSION
REFERENCE
SYSTEM DESCRIPTION AND SYSTEM
MODEL
Video Capturing and Encoding
The video camera of the mobile device captures several frames of
a face.
Each frame is illuminated by artificially modulating the
light of the flash.
Each video frame can be cropped to the object or face area by
automated face detection algorithms

Example of five frames captured under different illumination elevation (E) and azimuth
(A) angles from the Yale Face Database
The cropped frames are then compressed using a standard-compliant
video codec such as MPEG/ITU-T H.264/AVC.
Wireless Transmission
The framework uses either wireless local area network (WLAN)
infrastructures, such as IEEE 802.11 WLANs, or WiMAX/LTE-based
cellular networks.
Multiple wireless devices share the same spectrum.
Each device can adapt its transmission parameters depending on the
number of concurrent transmitters.
Our solution can learn the behavior of such adaptation mechanisms
and the transmission settings to use under time constraints.
Visual Analysis
The cloud computing cluster processes multiple visual analysis tasks
concurrently, so its task scheduler experiences highly-varying levels of
congestion
Cloud computing infrastructure may have to adapt the accuracy of its
feature matching algorithm, as well as the number of video frames
processed.
For face recognition, the server matches the provided video
information to a pre-established database of stored images
L1 minimization, salient-point extraction and matching, and support vector
machines are examples of algorithms used for face recognition.
When the majority of the video frames are matched to the same
person in the database, the match is successful and the result is
returned.
The video frames that match the same person can be stored in a set,
called the a-priori set, to reduce the false positive rate.
Similarly, the system has a false negative rate, mainly due to
varying illumination, motion blur or pose.
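The majority rule described above can be sketched as a vote over per-frame matches. This is only an illustration under stated assumptions: the function name recognize, the use of None for an unmatched frame, and the min_fraction parameter are hypothetical. The default of 0.5 corresponds to a simple majority, while the numerical setup later in these slides requires more than 80%.

```python
from collections import Counter


def recognize(frame_labels, min_fraction=0.5):
    """Return the matched identity, or None if no identity wins the vote.

    frame_labels: one database label per processed frame (None = no match).
    min_fraction: share of all frames that must agree (strictly more than).
    """
    if not frame_labels:
        return None
    # Count only frames that matched some person in the database.
    votes = Counter(label for label in frame_labels if label is not None)
    if not votes:
        return None
    label, count = votes.most_common(1)[0]
    # Success only if the top identity exceeds the required share of frames.
    return label if count > min_fraction * len(frame_labels) else None
```

For example, recognize(["alice", "alice", "bob"]) accepts "alice", whereas the same call with min_fraction=0.8 rejects the transaction because only two of three frames agree.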
System Model
There are m mobile devices, indexed by the set M = {1, 2, ..., m}.
A is the set of all possible actions/settings, A = {a_1, a_2, ..., a_S},
where S is the size of the settings space.
T is the discrete set of contention levels of the wireless medium.
G is the discrete set of congestion levels of the cloud
infrastructure.
k is the index of each recognition transaction.
For each recognition transaction k:
1. Each device observes the current wireless contention level, t(k) ∈ T,
and cloud congestion level, g(k) ∈ G, selects the bitrate and
number of frames to capture, and transmits the encoded video to the cloud.
2. The cloud decodes the video and performs feature matching with
the database.
3. Each device gets the result from the cloud, either a label or an error
message.
4. The device performs further recognition attempts until a successful
result is obtained or the process is cancelled.
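The four steps above can be sketched as a retry loop. This is a simplified illustration, not the system's implementation: run_transaction and its three callback parameters are hypothetical names, and the max_attempts cut-off is an assumed stand-in for the cancellation condition.

```python
def run_transaction(observe_context, choose_setting, attempt, max_attempts=5):
    """Repeat recognition attempts until success or cancellation.

    observe_context() -> (t, g): contention and congestion levels.
    choose_setting(t, g) -> (bitrate, n_frames).
    attempt(bitrate, n_frames) -> label on success, None on an error message.
    Returns (label or None, number of attempts made).
    """
    for n in range(1, max_attempts + 1):
        t, g = observe_context()                   # step 1: observe the context
        bitrate, n_frames = choose_setting(t, g)   # step 1: select a setting
        label = attempt(bitrate, n_frames)         # steps 2-3: cloud reply
        if label is not None:
            return label, n                        # successful recognition
    return None, max_attempts                      # step 4: process cancelled
```

The number of attempts returned here is the quantity averaged in the "average attempts per transaction" results.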

Device-Oriented Model
Devices systematically learn the best transmission setting to
maximize their own recognition rate per attempt.
All devices use the contention level of the wireless medium and the
congestion level in the cloud as contexts.
Service-Oriented Model
The cloud systematically learns the best setting that maximizes the
cluster's average recognition rate per attempt.
The cloud uses the wireless contention level as context, together with
the recognition rate and the cloud congestion level.

Two models are used for the derivation of a bandit-based
systematic learning framework
DISTRIBUTED, DEVICE-ORIENTED, BANDIT
LEARNING ALGORITHM

Device oriented Model
Mobile devices select their own actions and learn through their
interaction with the wireless medium and the cloud
There are two stages for any recognition transaction
1. Exploration stage, where the mobile device chooses an arbitrary
transmission setting to update the estimated recognition rate per
attempt

2. Exploitation stage, where mobile devices select the transmission
setting that yields the highest estimated recognition rate per attempt
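The two stages above can be sketched as a per-context learner that each device runs on its own. This is a minimal illustration under stated assumptions: the class DeviceLearner, the explore_scale constant, and the logarithmic exploration threshold are hypothetical choices, and exploration here picks at random among under-sampled settings rather than any arbitrary setting.

```python
import math
import random
from collections import defaultdict


class DeviceLearner:
    """Per-context explore/exploit learner, run independently on each device."""

    def __init__(self, settings, explore_scale=1.0, seed=0):
        self.settings = list(settings)
        self.explore_scale = explore_scale   # hypothetical tuning constant
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)       # (context, setting) -> attempts
        self.rates = defaultdict(float)      # (context, setting) -> estimated rate
        self.k = 0                           # transactions seen so far

    def choose(self, contention, congestion):
        context = (contention, congestion)
        self.k += 1
        # Assumed rule: a setting is under-sampled if it has fewer than
        # explore_scale * log(k + 1) attempts in the current context.
        threshold = self.explore_scale * math.log(self.k + 1)
        under = [s for s in self.settings if self.counts[(context, s)] < threshold]
        if under:
            return self.rng.choice(under)    # exploration stage
        # Exploitation stage: highest estimated recognition rate per attempt.
        return max(self.settings, key=lambda s: self.rates[(context, s)])

    def update(self, contention, congestion, setting, success):
        key = ((contention, congestion), setting)
        self.counts[key] += 1
        self.rates[key] += (float(success) - self.rates[key]) / self.counts[key]
```

Because the estimates are keyed by the (contention, congestion) pair, what is learned under one context never overwrites what was learned under another.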

Device-oriented model
Sub-optimality Gap and Minimum Sub-optimality Gap
The sub-optimality gap defines the performance difference between
the best and other transmission settings
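Let a ∈ A be a transmission setting.
Sub-optimality gap of a (for a given context) = expected recognition rate
per attempt of the best setting minus the expected recognition rate per
attempt of a
Minimum sub-optimality gap = the smallest non-zero sub-optimality gap
over all settings in A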
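As a small worked example of the performance difference between the best and the other transmission settings, the gaps can be computed from a table of expected recognition rates. The function names and the example rates are assumptions for illustration only; the minimum gap is taken here as the smallest strictly positive gap, which is the usual convention in bandit analysis.

```python
def suboptimality_gaps(expected_rates):
    """Map each setting to (best expected rate - its own expected rate)."""
    best = max(expected_rates.values())
    return {s: best - r for s, r in expected_rates.items()}


def minimum_gap(expected_rates):
    """Smallest strictly positive gap, or None if every setting is optimal."""
    positive = [g for g in suboptimality_gaps(expected_rates).values() if g > 0]
    return min(positive) if positive else None
```

With hypothetical rates {"a": 0.5, "b": 0.75, "c": 1.0}, setting "c" is the best, the gaps of "a" and "b" are 0.5 and 0.25, and the minimum gap is 0.25. A smaller minimum gap means the learner needs more exploration to tell the best setting from the runner-up.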


Service Oriented Model
The cloud makes the decisions and all mobile devices simply
obey.
The congestion level in the cloud is indirectly controlled by the settings
decided for each device.
There are two stages for any recognition transaction:
1. Exploration stage, where the cloud chooses an arbitrary
transmission setting depending on the contention level and updates
the estimated recognition rate per attempt
2. Exploitation stage, where the cloud selects the transmission setting that
yields the highest expected average recognition rate per attempt
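A cloud-side counterpart can be sketched as follows. This is a deliberately simplified illustration: the class CloudLearner, its method names, and the fixed min_trials exploration budget are assumptions, and the sketch assigns one common setting to all devices, whereas the slides describe settings decided for each device.

```python
from collections import defaultdict


class CloudLearner:
    """Cloud-side learner: one estimate per (contention level, setting)."""

    def __init__(self, settings, min_trials=3):
        self.settings = list(settings)
        self.min_trials = min_trials         # hypothetical exploration budget
        self.counts = defaultdict(int)       # (contention, setting) -> trials
        self.avg_rates = defaultdict(float)  # (contention, setting) -> est. rate

    def decide(self, contention):
        # Exploration: try any setting not yet sampled enough in this context.
        for s in self.settings:
            if self.counts[(contention, s)] < self.min_trials:
                return s
        # Exploitation: highest estimated average recognition rate per attempt.
        return max(self.settings, key=lambda s: self.avg_rates[(contention, s)])

    def report(self, contention, setting, device_successes):
        # The reward is the cluster average over the devices that obeyed.
        reward = sum(device_successes) / len(device_successes)
        key = (contention, setting)
        self.counts[key] += 1
        self.avg_rates[key] += (reward - self.avg_rates[key]) / self.counts[key]
```

Only the wireless contention level is used as the key, reflecting that the cloud congestion level is a consequence of the cloud's own decisions rather than an external observation.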


Sub-optimality Gap and Minimum Sub-optimality Gap
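Sub-optimality gap = expected average recognition rate per attempt of the
best setting minus that of the given setting
Minimum sub-optimality gap = the smallest non-zero sub-optimality gap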


NUMERICAL RESULT
General Setup
Our system consists of mobile devices connected via an IEEE 802.11
WLAN to a computing cluster.
Videos of human faces are produced from random images of persons
taken from the extended Yale Face Database.
The camera flash is modulated to emulate light coming from different
angles
H.264/AVC codec is used for encoding the video
The 2D-PCA algorithm is used at the cloud side for face recognition

Wireless Transmission
Wireless contention is generated via the well-known backoff and
retransmission mechanism of the Distributed Coordination Function
(DCF).
The default settings of the DCF simulator based on Bianchi's method are used.
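The backoff mechanism of the DCF can be illustrated with binary exponential backoff: the contention window roughly doubles after each failed transmission, up to a cap, and the device waits a random number of slots within that window. This sketch is not the simulator used for the results; the function names are hypothetical, and the defaults cw_min=15 and cw_max=1023 are assumed typical values rather than settings taken from the slides.

```python
import random


def contention_window(retry, cw_min=15, cw_max=1023):
    """Contention window after `retry` failed attempts (0 = first attempt).

    The window grows as (cw_min + 1) * 2**retry - 1 and is capped at cw_max.
    """
    return min((cw_min + 1) * (2 ** retry) - 1, cw_max)


def backoff_slots(retry, cw_min=15, cw_max=1023, rng=random):
    """Draw a backoff counter uniformly from 0..CW (inclusive)."""
    return rng.randint(0, contention_window(retry, cw_min, cw_max))
```

With more concurrent transmitters, collisions and retries become more frequent, so windows grow and each upload takes longer. This is the contention level the learning framework treats as context.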
Face Recognition on the Cloud
More than 80% of the received video frames have to match the
same person in the database.
A limited number of the received video frames are used by 2D-PCA.
Eigenvectors are used for the distance calculation.
DEVICE ORIENTED MODEL
[Figures: average attempts per transaction; average bit stream size]
SERVICE ORIENTED MODEL
[Figures: average attempts per transaction; average bit stream size]
CONCLUSION
We propose a contextual bandit framework for learning contention and
congestion conditions in object or face recognition via wireless mobile
streaming and cloud-based processing.
The bandit framework converges to the value of the oracle solution.
It quickly adjusts to contention and congestion conditions.
It can be used in multi-user wireless video streaming systems and
high-level multimedia signal processing systems.
REFERENCES
A. Anandkumar, N. Michael, and A. Tang. Opportunistic spectrum
access with multiple players
P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the
multiarmed bandit problem.
W. Bao, H. Li, N. Li, and W Jiang. A liveness detection method for face
recognition based on optical flow field
http://www.Wikipedia.com

THANK YOU
