
Stereovision Based 3D Hand Gesture Recognition

Guided by: Dr. Sumam David; Vineet Roy, TI
By: Pankaj Bongale (09EC60); Vikram Shenoy H (09EC104)
Introduction
 Hand gestures:
Hand gestures are a form of nonverbal communication
in which visible hand actions are used for conversational,
controlling, manipulative, and communicative purposes.
 So, what is 3D hand gesture recognition?
It is the technique of extracting useful information from a
hand gesture made by a user in a 3D environment, using depth
information.
Introduction
 Motivation for hand gesture recognition:
 Hand gestures are a more natural and intuitive form of interaction for computer vision systems, especially in 3D
applications
 They can serve as an assistive means of analyzing human intent and identifying
potential threats in a multi-modality surveillance system
 A major application area is the pervasive computing environment,
where a person can control the applications running on a tablet
PC or a smartphone using hand gestures alone
 Why 3D features?
 Depth information gives better accuracy against cluttered
backgrounds
Methodology

Our proposed work concentrates on three aspects:

 Skin tone detection to separate the user from the background.
 Use of 3D information to construct a depth map and separate the hands
from the face using chamfer distance thresholding.
 Segmentation of the image into fingers, palm, etc. using cues such as contours
and motion, followed by detection of the dynamic gesture.
Related Work
 Skin tone detection:
(1) Bergh and Gool (2010) proposed a hybrid method
for skin tone detection that combines a Gaussian mixture
model with histogram-based probability estimation.
(2) Another simple method uses the YCbCr color space
to locate skin color (Mahmound 2008).
(3) Skin tone detection using 6 different color spaces was
proposed by Gasparini and Schettini.
Related Work
Hand Segmentation Models
 Kuch and Huang (1995) presented a hand model with 26 degrees of freedom (DOFs) for hand gesture
recognition.
Of the 26 DOFs, 3 were used for global hand orientation and the remaining
23 for finger parameters.
 A more realistic model, based on a skinning technique, was developed by Bray et al. (2004).
However, it is specific to a single user, and hence its adaptability is very
poor.
 Geometric hand models are also used for template matching (Wu, Lin and
Huang, 2001).
 A Haar-like feature based hand detection model was proposed by Qing and
Nicolas (2008).
Related Work
Hand Segmentation Models
 Recent advances in hand gesture recognition involve hidden
Markov models (HMMs) and the SIFT and SURF algorithms.
 Jiatong Bao et al. proposed dynamic hand gesture recognition
based on SURF (Speeded-Up Robust Features) tracking.
In their study, interest points are located using the Fast-Hessian
detector, and the dominant direction of the gesture is found using
the Haar wavelet responses in the X and Y directions.
These SURF points are then matched across adjacent frames
to recognize the hand gesture.
Results on Skin-tone detection

Figure panels: original image; resulting binary image after skin tone detection

 We used the YCbCr, RGB and HSV color spaces, with multiple thresholds, for
skin tone detection
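
As a rough illustration of threshold-based skin tone detection, the sketch below converts RGB pixels to YCbCr and keeps those whose chrominance falls inside a fixed box. It covers only the YCbCr part of the approach. The conversion is the full-range BT.601 form; the Cb/Cr ranges are values commonly quoted in the literature and are an assumption here, since the thresholds actually used are not stated on the slide. The function names are hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to YCbCr (BT.601 coefficients, full-range form)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

# Assumed chrominance ranges for skin (not taken from the slides).
CB_RANGE = (77, 127)
CR_RANGE = (133, 173)

def is_skin(r, g, b):
    """True if the pixel's Cb and Cr both fall inside the assumed skin ranges."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return (CB_RANGE[0] <= cb <= CB_RANGE[1]
            and CR_RANGE[0] <= cr <= CR_RANGE[1])

def skin_mask(image):
    """image: rows of (r, g, b) tuples. Returns rows of 0/1 (binary image)."""
    return [[1 if is_skin(*pixel) else 0 for pixel in row] for row in image]

# Tiny 2x2 test image: one skin-like pixel, then blue, white and green.
image = [[(224, 172, 105), (0, 0, 255)],
         [(255, 255, 255), (0, 255, 0)]]
mask = skin_mask(image)
```

Luminance Y is ignored in the test, which is one common way to reduce sensitivity to lighting; a combined scheme would add similar range tests in RGB and HSV.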
Haar-like features for 2D hand tracking

• Haar-like features describe the ratio between the dark and
bright areas within a kernel.
• Advantage: computation is fast and classification is easy.
Better accuracy is obtained when the features are used with an iterative
boosting algorithm such as AdaBoost.
Haar-like features for 2D hand tracking

Concept of the “integral image”

• The “integral image” at the location of pixel (x, y) contains the sum of the
pixel values above and to the left of this pixel.
• According to the definition of the “integral image,” the sum of the pixel
values within the area D in (b) can be computed as P1 + P4 − P2 − P3,
where P1 = A, P2 = A + B, P3 = A + C, and P4 = A + B + C + D.
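
The four-corner rule above can be sketched in a few lines. This is an illustrative version only: it pads the integral image with a leading row and column of zeros (an assumed convention that avoids boundary checks), and the function names are hypothetical.

```python
def integral_image(img):
    """Return ii where ii[y][x] is the sum of img over rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1 (area D)."""
    p1 = ii[top][left]       # A
    p2 = ii[top][right]      # A + B
    p3 = ii[bottom][left]    # A + C
    p4 = ii[bottom][right]   # A + B + C + D
    return p1 + p4 - p2 - p3

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
bottom_right_2x2 = rect_sum(ii, 1, 1, 3, 3)   # 5 + 6 + 8 + 9
```

Once the integral image is built, any rectangle sum costs four lookups regardless of its size, which is why Haar-like features are cheap to evaluate.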
Results obtained for 2D hand tracking

Haar-like features are used for hand tracking. For this purpose, a
Haar cascade XML file specific to hand tracking is used.
Stereovision

 Goal: obtain a 3D depth map in order to separate occluding image components
that may affect gesture recognition.
 Stereo vision systems take two images of a scene from different viewpoints.
 Disparity: the displacement of corresponding points from one image to
the other.
 From the disparity, we can calculate depth.
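
For a rectified, frontal-parallel camera pair, depth follows from disparity through the standard triangulation relation Z = f · B / d. The sketch below applies it. The focal length of 700 pixels is a purely hypothetical value, the 0.10 m baseline only mirrors the roughly 10 cm camera spacing mentioned later in the slides, and the function name is illustrative.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth Z = f * B / d for a rectified, frontal-parallel stereo pair.

    disparity_px: horizontal shift of the point between the two images (pixels)
    focal_px:     focal length expressed in pixels
    baseline_m:   distance between the two camera centres (metres)
    """
    if disparity_px < 0:
        raise ValueError("disparity must be non-negative")
    if disparity_px == 0:
        return float("inf")  # zero disparity corresponds to a point at infinity
    return focal_px * baseline_m / disparity_px

FOCAL_PX = 700.0     # hypothetical focal length
BASELINE_M = 0.10    # assumed baseline of about 10 cm

near = depth_from_disparity(70, FOCAL_PX, BASELINE_M)   # larger disparity, closer point
far = depth_from_disparity(35, FOCAL_PX, BASELINE_M)    # smaller disparity, farther point
```

Because depth is inversely proportional to disparity, depth resolution degrades quickly for distant objects.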
Implementing stereovision using OpenCV
 Requirements:
 2 identical cameras (iball C8.0)
 Details:
 calib3d.hpp – for calibrating the 2 cameras
 imgproc.hpp – to process images and convert them to a disparity image
 highgui.hpp – visualisation of the point cloud image
 Ease of use:
 Plenty of online help available (websites, forums, etc.)
 Not optimized for real-time applications
Stereovision – Principle
Stereovision – Principle (continued)

Assumption: camera pair is frontal parallel


Steps involved in stereo-imaging

 Stereo Calibration
 Stereo Rectification
 Stereo Correspondence
Stereo Calibration
 The process of computing the geometric relationship between the two
cameras in space.
 Goal: find the rotation matrix R and translation vector T between the two
cameras

For a 3D point $P$,
 Left camera: $P_l = R_l P + T_l$ ------(1)
 Right camera: $P_r = R_r P + T_r$ ------(2)
 In general, we have $P_l = R^T (P_r - T)$ ------(3)

From (1), (2), (3) we have

$R = R_r R_l^T$
$T = T_r - R T_l$
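
The relations above can be checked numerically. The sketch below builds two made-up camera poses (the rotation angles, translations and test point are hypothetical values chosen for illustration), computes R and T from them, and confirms that equation (3) reproduces the left-camera coordinates. Plain Python lists are used so the example has no dependencies.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def rot_y(deg):
    """Rotation about the y axis by the given angle in degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def relative_pose(R_l, T_l, R_r, T_r):
    """R = R_r R_l^T and T = T_r - R T_l."""
    R = matmul(R_r, transpose(R_l))
    R_T_l = matvec(R, T_l)
    T = [T_r[i] - R_T_l[i] for i in range(3)]
    return R, T

# Hypothetical extrinsics for the two cameras and a hypothetical 3D point.
R_l, T_l = rot_y(5.0), [0.05, 0.0, 1.0]
R_r, T_r = rot_y(-5.0), [-0.05, 0.0, 1.0]
P = [0.2, -0.1, 0.5]

P_l = [a + b for a, b in zip(matvec(R_l, P), T_l)]   # equation (1)
P_r = [a + b for a, b in zip(matvec(R_r, P), T_r)]   # equation (2)

R, T = relative_pose(R_l, T_l, R_r, T_r)
# Equation (3): recover the left-camera coordinates from the right-camera ones.
P_l_check = matvec(transpose(R), [P_r[i] - T[i] for i in range(3)])
```

If the derivation holds, `P_l_check` matches `P_l` up to floating-point rounding.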
Stereo Calibration (Continued)
• A set of {R, T} is found for multiple images, and the median is taken
• Levenberg-Marquardt iterative algorithm: finds the minimum of the
reprojection error of the chessboard corners for both camera views
Stereo Rectification
 For a point P(x, y) in the left camera image, its
corresponding point in the right image should appear at P(x+a, y), i.e. on the same row
 The optical axes (or principal rays) of the two cameras are
made parallel, so that they intersect at infinity
 Four terms are generated for each of the left and right images: distCoeffs,
Rrect, Mrect, M.

2 algorithms:
 Hartley’s algorithm: Uncalibrated stereo rectification
 Bouguet’s algorithm: Calibrated stereo rectification
Stereo Rectification (continued)
Stereo Correspondence
 Matching a 3D point in the two different camera views.
 Block matching algorithm: uses small sum-of-absolute-differences (SAD)
windows to find matching points between the left and right
stereo-rectified images.
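
To make the SAD window idea concrete, here is a minimal, unoptimized sketch of block matching for a single pixel. It assumes rectified images stored as lists of rows and the convention that a point at column x in the left image appears at column x − d in the right image, with d between 0 and a chosen maximum. The function names and the toy data are illustrative, and this is not OpenCV's implementation.

```python
def sad(left, right, y, x_left, x_right, half):
    """Sum of absolute differences between two (2*half+1) x (2*half+1) windows."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[y + dy][x_left + dx] - right[y + dy][x_right + dx])
    return total

def match_disparity(left, right, y, x, half=1, max_disparity=4):
    """Disparity d in 0..max_disparity that minimises the SAD cost at (x, y)."""
    best_d, best_cost = 0, None
    for d in range(max_disparity + 1):
        if x - d - half < 0:
            break  # the window would fall outside the right image
        cost = sad(left, right, y, x, x - d, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

# Toy data: the left image is the right image shifted 2 pixels to the right.
right = [[3, 7, 1, 9, 4, 8, 2, 6, 5, 0],
         [5, 2, 8, 1, 7, 3, 9, 0, 6, 4],
         [1, 6, 0, 4, 9, 2, 7, 5, 3, 8]]
left = [[0, 0] + row[:-2] for row in right]

d = match_disparity(left, right, y=1, x=5)
```

Because the search runs only along the same row, this relies on the rectification step having aligned the epipolar lines with the image rows.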
Stereo Correspondence (contd)

Matches, Occlusions and Insertions


Implementation: Stereo Calibration
 2 cameras fixed horizontally, around 10 cm apart
 Standard 8x8 chessboard held in hand
 24 different pairs of pictures were taken with the chessboard
held at different inclinations
 R and T matrices calculated for each pair of pictures, 24 pairs of matrices in total
 'cvStereoCalibrate()' (OpenCV):
 calculates these matrices
 uses the Levenberg-Marquardt iterative algorithm to reduce these 24
matrix pairs to one pair (R, T) with minimum reprojection error of the
chessboard corners for both camera views.
Implementation: Stereo Calibration (Continued)
Figure panels: dataset of left camera images; dataset of right camera images
Implementation: Stereo Rectification

Hartley's algorithm: cvStereoRectifyUncalibrated()


Bouguet's algorithm: cvStereoRectify()
Implementation: Stereo Correspondence
 OpenCV block matching algorithm:
cvFindStereoCorrespondenceBM()
 Based on the horizontal distance between the locations of a
point in the two images and a set of predefined constants, a
disparity image is generated

Parameters:
 SADWindowSize
 minDisparity
 numberOfDisparities
 textureThreshold
 speckleWindowSize
 speckleRange
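
The sketch below shows one way to sanity-check such a parameter set before running block matching. The numeric values are illustrative assumptions, not the settings used in this work, and `check_bm_params` is a hypothetical helper rather than an OpenCV function. The rules it encodes (odd window size within 5..255, number of disparities positive and divisible by 16, non-negative texture threshold) reflect constraints that OpenCV's block matcher enforces.

```python
def check_bm_params(params):
    """Return a list of problems found in a block-matching parameter set."""
    problems = []
    window = params["SADWindowSize"]
    if window % 2 == 0 or not 5 <= window <= 255:
        problems.append("SADWindowSize must be odd and within 5..255")
    n_disp = params["numberOfDisparities"]
    if n_disp <= 0 or n_disp % 16 != 0:
        problems.append("numberOfDisparities must be positive and divisible by 16")
    if params["textureThreshold"] < 0:
        problems.append("textureThreshold must be non-negative")
    return problems

# Illustrative values only (assumed, not taken from the slides).
example_params = {
    "SADWindowSize": 15,        # side of the SAD matching window, in pixels
    "minDisparity": 0,          # smallest disparity searched
    "numberOfDisparities": 64,  # width of the disparity search range
    "textureThreshold": 10,     # reject windows with too little texture
    "speckleWindowSize": 100,   # max size of disparity blobs treated as speckle
    "speckleRange": 32,         # max disparity variation inside one blob
}

problems = check_bm_params(example_params)
bad_problems = check_bm_params(dict(example_params, SADWindowSize=8))
```

Larger windows give smoother but less detailed disparity maps, so the window size is usually the first parameter to tune.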
Implementation: Stereo Correspondence (Continued)
Issues
 Insufficient data generated:
 The stereo data obtained in the form of disparity images is still
crude, due to the limitations of the current hardware.
 The disparity image does not give high-density data.

 False shades in disparity:
 Shades do not decrease with distance
 Multiple shades are given to objects at the same distance from the cameras
Future Work
 Completing 3D hand detection:
 implement the same code on better hardware (cameras
specifically designed for stereovision) and generate high-precision,
high-density data
 extract hand data based on the gray shades of the disparity image
 3D gesture recognition:
 implement algorithm(s) to use this data
 train a model on standard gestures
 use this model to perform real-time gesture recognition.
Thank You
