
ROAD MODELING FOR AUTONOMOUS VEHICLE
(FINAL REPORT)
PREPARED AT: INNOVATION CELL, IIT BOMBAY
SUBMITTED BY: AMARTANSH DUBEY
MENTOR: EBRAHIM ATTARWALA

ACKNOWLEDGEMENTS
I take this opportunity to express my gratitude to the people who have
been instrumental in the successful completion of this project. I am
thankful to Innovation Cell, IIT Bombay, for providing me with this
opportunity to learn, explore and implement my skills. I thank my
mentor EBRAHIM ATTARWALA and all the other mentors for their constant
guidance and support throughout the course of the project.

PROBLEM STATEMENT:
To develop image processing algorithms for an autonomous vehicle, which
involves:
1) Lane Detection.
2) Vehicle Detection and Recognition.
3) Character, Text and Road Sign Detection and Recognition.
4) Controlling the motion of the vehicle using the above three algorithms in
real time.

INDEX

1. INTRODUCTION TO IMAGE PROCESSING AND OPENCV

2. OpenCV vs. MATLAB (why OpenCV, not MATLAB)

3. WHY AN FPGA OR GPU IS NEEDED FOR REAL-TIME IMAGE PROCESSING

4. FPGA (FIELD PROGRAMMABLE GATE ARRAY)

5. ADVANTAGES OF USING AN FPGA FOR IMAGE PROCESSING

6. DISADVANTAGES OF USING AN FPGA FOR IMAGE PROCESSING

7. IMAGE PROCESSING ALGORITHMS FOR LANE DETECTION

8. Part A: Briefing the concepts of image processing used in the codes for
detecting lanes, removing noise, etc.
SUBHEADINGS:
CANNY/SOBEL EDGE DETECTION
HOUGH STANDARD TRANSFORM (for shape detection)
HOUGH PROBABILISTIC TRANSFORM (for shape detection)
BIRD'S-EYE VIEW / PERSPECTIVE VISION
GAUSSIAN BLUR
HORIZON REMOVAL
OTHER SMALL CONCEPTS

9. Part B: Algorithms and codes which worked well and detected lanes
successfully but were not adaptive to every condition.
SUBHEADINGS:
ALGORITHM 1, ITS FLOWCHART AND ITS PROBLEM
ALGORITHM 2 AND ITS PROBLEM
ALGORITHM 3 AND ITS PROBLEM

10. Part C: Final code
SUBHEADINGS:
FINAL ALGORITHM
FINAL FLOWCHART
RESULT

11. TRAFFIC SIGN, FACE, CHARACTER RECOGNITION
SUBHEADINGS:
WHY A CASCADE CLASSIFIER IS NEEDED
SAMPLING
SETTING REGION OF INTEREST OR CROPPING
TRAINING AND CLASSIFICATION
OUTPUT OF CODE

12. FUTURE WORK THAT CAN BE DONE

13. IMPORTANT LINKS CONTAINING CODES, SAMPLE VIDEOS, OUTPUT VIDEOS

INTRODUCTION TO IMAGE PROCESSING AND OPENCV:


Image processing is a method to convert an image into digital form and perform
operations on it, in order to get an enhanced image or to extract some useful
information from it. It is a type of signal processing in which the input is an
image, such as a video frame or photograph, and the output may be an image or a
set of characteristics associated with that image. An image processing system
usually treats images as two-dimensional signals and applies standard signal
processing methods to them.
It is among the most rapidly growing technologies today, with applications in
many areas of business, and it forms a core research area within the
engineering and computer science disciplines.
Image processing basically includes the following three steps:

- Importing the image with an optical scanner or by digital photography.

- Analyzing and manipulating the image, which includes data compression, image
enhancement and spotting patterns that are not visible to the human eye, as in
satellite photographs.

- Output, the last stage, in which the result can be an altered image or a
report based on the image analysis.

OPENCV:
OpenCV (Open Source Computer Vision) is a library of programming
functions mainly aimed at real-time computer vision, developed by the Intel
Russia research center in Nizhny Novgorod and now supported by Willow
Garage and Itseez. It is free for use under the open-source BSD license.
The library is cross-platform and focuses mainly on real-time image
processing. If the library finds Intel's Integrated Performance Primitives on
the system, it will use these proprietary optimized routines to accelerate
itself.
OpenCV is written in C++ and its primary interface is in C++, but it still
retains a less comprehensive though extensive older C interface. There are
now full interfaces in Python, Java and MATLAB/Octave.

OpenCV vs. MATLAB (why OpenCV, not MATLAB)


Speed: MATLAB is an interpreted language layered on top of Java (which is in
turn implemented in C). When you run a MATLAB program, your computer is busy
interpreting all that MATLAB code before it is finally executed. OpenCV, on
the other hand, is essentially a library of functions written in C/C++, so
you are much closer to providing machine code directly to the computer.
Ultimately, more of your processor cycles go to image processing and fewer to
interpreting, so programs written with OpenCV run much faster than similar
programs written in MATLAB.
Resources needed: Due to its high-level nature, MATLAB uses a lot of your
system's resources. MATLAB code can require over a gigabyte of RAM to process
video, whereas a typical OpenCV program only requires around 70 MB of RAM to
run in real time. The difference, as you can easily see, is huge.
Cost: The list price for the base MATLAB (commercial, single-user license, no
toolboxes) is around USD 2150. OpenCV is free!
Visual Studio 2010: I used Microsoft Visual Studio 2010 as the programming
platform for writing, compiling and running my OpenCV code; it is very easy
to integrate the OpenCV libraries with Visual Studio, and it is fast and
efficient.

WHY AN FPGA OR GPU IS NEEDED:
Since an autonomous vehicle has to recognize lanes, obstacles (vehicles,
etc.), traffic signs, faces and so on in real time (otherwise it may cause
serious accidents), it is a big challenge to execute the algorithms in real
time without any lag. When the complexity of the code increases, my laptop
starts showing some lag. I tried some embedded boards like the PANDA BOARD
and the INTEL ATOM DE2i-150, but because the complexity of my code was high,
real-time execution was not achievable; the lag was not too large, but if the
complexity increases further it may become quite significant.

FPGA (Field Programmable Gate Array):


FPGAs are programmable semiconductor devices that are based around a matrix
of Configurable Logic Blocks (CLBs) connected through programmable
interconnects. FPGAs can be programmed for the desired application or
functionality requirements using a hardware description language (HDL) such
as VHDL or Verilog. The logic blocks shown below are made up of a simple gate
arrangement forming a multiplexer-style design driven by a specific lookup
table (truth table), and this truth table is loaded into the logic blocks
through the HDL design.

ADVANTAGES OF USING AN FPGA FOR IMAGE PROCESSING:


The biggest advantage of using an FPGA is parallel processing: as mentioned
earlier, an FPGA is an array of hundreds of thousands of programmable gates
which can be configured into the desired building blocks according to our
programming needs. Different regions of the FPGA can be used for different
processing at the same time, which is what parallel processing means here.
This provides great flexibility over a normal CPU, because we can decompose a
complex algorithm into simpler ones and execute them in parallel, which can
boost the speed by up to 10%. Processing can also be shared between the CPU
and the FPGA.

ALTERA/INTEL ATOM DE2i-150 BOARD:

As shown in the figure, the INTEL ATOM DE2i-150 board is a nice combination
of an Intel Atom processor running at 1.6 GHz and a Cyclone IV FPGA. The Atom
processor and the FPGA are connected by a communication bus to boost the
processing and achieve real-time feedback.
When I started working on this board, its Intel Atom processor worked well
for less complex code, but for more complex code it started lagging, so I
tried to link the FPGA available on this board to the Atom processor via the
PCIe communication bus to boost the speed of execution. After a lot of
searching, I finally came to the conclusion that there are many disadvantages
to using an FPGA, mentioned below.

DISADVANTAGES OF USING AN FPGA FOR IMAGE
PROCESSING:
There are three major problems with FPGAs.
First, an FPGA does not have any memory of its own; it is just hardware with
no storage capacity. Once the power is off, the FPGA loses its program, so
one always has to use some processing unit to program the FPGA.
Secondly, one cannot use familiar languages like C, C++ or Java to program an
FPGA. Programming it requires hardware description languages like VHDL or
Verilog, and working with a complex algorithm in an HDL is quite hard.

Third, OpenCV is a dynamic library, whereas an FPGA is only hardware with no
memory, so we cannot use OpenCV for image processing on it. We have to choose
other tools like OpenCL or Vivado, and these tools are not as efficient for
this purpose as OpenCV.

Because of these problems, I used an i7 processor, which has enough
processing speed for my code.

IMAGE PROCESSING ALGORITHMS FOR LANE
DETECTION
I have divided this section into three parts:
Part A: Briefing the concepts of image processing used in the codes for
detecting lanes, removing noise, etc.
Part B: Algorithms and codes which worked well and detected lanes
successfully but were not adaptive to every condition; they depend on certain
factors like the type of road, lanes and surroundings. I tried three such
algorithms and finally reached the final algorithm.
Part C: The final algorithm and code for lane detection, which works well for
all roads and conditions.

PART A:
THE FOLLOWING ARE CONCEPTS AND TOOLS I USED IN LANE
DETECTION:
1) CANNY/SOBEL EDGE DETECTION
2) HOUGH TRANSFORM (for shape detection)
3) BIRD'S-EYE VIEW / PERSPECTIVE VISION
4) GAUSSIAN BLUR
5) HORIZON REMOVAL
6) Many other small concepts, which I have dealt with in the
explanation of the codes.

DETAILED EXPLANATIONS:
1) CANNY/SOBEL EDGE DETECTION:
The Canny edge detector was developed by John F. Canny in 1986. Also known
to many as the optimal detector, the Canny algorithm aims to satisfy three
main criteria:

Low error rate: a good detection of only existent edges.

Good localization: the distance between detected edge pixels and real
edge pixels has to be minimized.

Minimal response: only one detector response per edge.

IMPLEMENTATION:

1) Filter out any noise. The Gaussian filter is used for this purpose. An
example of a Gaussian kernel of size 5 that might be used is:

K = (1/159) * | 2  4  5  4  2 |
              | 4  9 12  9  4 |
              | 5 12 15 12  5 |
              | 4  9 12  9  4 |
              | 2  4  5  4  2 |

2) Find the intensity gradient of the image: apply a pair of convolution
masks in the x and y directions (e.g. the Sobel masks Gx and Gy) and find the
gradient strength and direction with:

G = sqrt(Gx^2 + Gy^2),    theta = arctan(Gy / Gx)

The direction is rounded to one of four possible angles (namely 0, 45, 90 or
135 degrees).
3) Non-maximum suppression is applied. This removes pixels that are not
considered to be part of an edge. Hence, only thin lines (candidate edges)
will remain.
4) Hysteresis: the final step. Canny uses two thresholds (upper and lower):
If a pixel gradient is higher than the upper threshold, the pixel is accepted
as an edge.
If a pixel gradient value is below the lower threshold, it is rejected.
If the pixel gradient is between the two thresholds, it is accepted only if
it is connected to a pixel that is above the upper threshold.
Canny recommended an upper:lower ratio between 2:1 and 3:1.
Example:
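
As a concrete example, here is a minimal sketch of this pipeline using the
OpenCV C++ API (this is illustrative, not my exact project code; the file
name and threshold values are placeholder assumptions):

#include <opencv2/opencv.hpp>

int main()
{
    cv::Mat src = cv::imread("road.jpg");   // placeholder input image
    if (src.empty()) return -1;

    cv::Mat gray, blurred, edges;
    cv::cvtColor(src, gray, cv::COLOR_BGR2GRAY);           // work on intensity only
    cv::GaussianBlur(gray, blurred, cv::Size(5, 5), 1.5);  // step 1: filter out noise

    // Steps 2-4 (gradient, non-maximum suppression, hysteresis) are all done
    // inside cv::Canny; lower:upper = 50:150 follows the 1:3 ratio above.
    cv::Canny(blurred, edges, 50, 150);

    cv::imshow("edges", edges);
    cv::waitKey(0);
    return 0;
}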

2) HOUGH TRANSFORM: The Hough transform is a function available in OpenCV
which helps in detecting some standard geometric shapes like circles, lines
and ellipses. I used the Hough line transform for lane detection; there are
two types of Hough line transform:
1) HOUGH STANDARD TRANSFORM
2) HOUGH PROBABILISTIC TRANSFORM

2A) HOUGH STANDARD TRANSFORM:


When we treat an image as a matrix of pixels, a line in this image matrix can
be represented in two basic forms:
1) Cartesian coordinate system: parameters (m, b), as in y = mx + b.
2) Polar coordinate system: parameters (r, theta).

In the Hough standard transform, we express lines in the polar system. Hence,
a line equation can be written as:

y = (-cos(theta) / sin(theta)) * x + r / sin(theta)

or, rearranging:

r = x * cos(theta) + y * sin(theta)

Each pair (r, theta) represents a line that passes through a given point. In
general, for each point (x0, y0) we can define the family of lines that goes
through that point as:

r_theta = x0 * cos(theta) + y0 * sin(theta)

If for a given point (x0, y0) we plot the family of lines that goes through
it, we get a sinusoid in the (theta, r) plane. For instance, for x0 = 8,
y0 = 6 the plot is one such sinusoidal curve. We consider only points such
that r > 0 and 0 < theta < 2*pi.

We can do the same operation for all the points in an image. If the curves of
two different points intersect in the (theta, r) plane, that means both
points belong to the same line. For instance, continuing the example above
and drawing the curves for two more points that lie on the same line, the
three curves intersect in one single point (theta0, r0); these coordinates
are the parameters of the line on which all three points lie. This means
that, in general, a line can be detected by finding the number of
intersections between curves: the more curves that intersect, the more points
the line represented by that intersection contains. So we can define a
threshold on the minimum number of intersections needed to detect a line.
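
To make this concrete, below is a minimal sketch (illustrative, not my exact
project code) of calling cv::HoughLines on an edge image and converting each
returned (rho, theta) pair back into two drawable Cartesian points; the
resolution and threshold values are assumptions:

#include <opencv2/opencv.hpp>
#include <cmath>
#include <vector>

void drawStandardHough(const cv::Mat& edges, cv::Mat& out)
{
    std::vector<cv::Vec2f> lines;                    // each entry is (rho, theta)
    // 1 px and 1 degree resolution; 100 = minimum intersections (votes) per line
    cv::HoughLines(edges, lines, 1, CV_PI / 180, 100);

    for (size_t i = 0; i < lines.size(); i++)
    {
        float rho = lines[i][0], theta = lines[i][1];
        double a = std::cos(theta), b = std::sin(theta);
        double x0 = a * rho, y0 = b * rho;           // foot of the perpendicular
        // two far-apart points on the line, for drawing
        cv::Point p1(cvRound(x0 - 1000 * b), cvRound(y0 + 1000 * a));
        cv::Point p2(cvRound(x0 + 1000 * b), cvRound(y0 - 1000 * a));
        cv::line(out, p1, p2, cv::Scalar(0, 0, 255), 2);
    }
}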

2B) HOUGH PROBABILISTIC TRANSFORM:


It is a much more efficient and accurate way to detect lines than the
standard Hough transform, because instead of returning lines in polar
coordinates it directly gives two Cartesian endpoints for each detected line,
so it is easy to interpret the data returned by the function.

ADVANTAGES OF HOUGH PROBABILISTIC OVER HOUGH STANDARD (the advantages can be
clearly seen in the two screenshots above, of the standard and probabilistic
transforms applied to a road):

In the probabilistic transform there is a parameter, minLineLength, used to
set the minimum line length; line segments shorter than this are rejected.
This is not present in the standard transform.
Another parameter present in the probabilistic transform but not in the
standard one is maxLineGap, the maximum allowed gap between points on the
same line for them to be linked.
The third and most important advantage is that the probabilistic transform
directly returns Cartesian coordinates, not polar ones. A usage sketch
follows this list.
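
A minimal sketch of the probabilistic version follows (again illustrative;
the parameter values are assumptions). Note how minLineLength and maxLineGap
appear, and how the output is already a pair of Cartesian endpoints per line:

#include <opencv2/opencv.hpp>
#include <vector>

void drawProbabilisticHough(const cv::Mat& edges, cv::Mat& out)
{
    std::vector<cv::Vec4i> segs;                     // each entry is (x1, y1, x2, y2)
    cv::HoughLinesP(edges, segs, 1, CV_PI / 180,
                    50,    // accumulator (votes) threshold
                    40,    // minLineLength: reject segments shorter than this
                    10);   // maxLineGap: link collinear points closer than this
    for (size_t i = 0; i < segs.size(); i++)
        cv::line(out, cv::Point(segs[i][0], segs[i][1]),
                 cv::Point(segs[i][2], segs[i][3]),
                 cv::Scalar(0, 255, 0), 2);
}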

3) BIRD'S-EYE VIEW / PERSPECTIVE VIEW:

When a camera takes an image, it covers an area in the shape of a trapezium:
as the vertical distance increases the camera covers more area, and near the
camera it covers less. As a result, as can be seen in the image above, lanes
which are parallel appear to converge and intersect at some point. So the
camera sees the lanes as non-parallel lines, but we want the view to match
reality, as if a bird were viewing the road from the top.
To make the lanes parallel, I remapped the pixels of the trapezium into a
rectangle by calculating the relation between real-world distance and pixels
(for example, 1 cm = 10 pixels), forming the transformation matrices that
map point 2 onto point C and similarly point 3 onto point D (as labelled in
the figure), and then using the remapping function available in OpenCV. This
gives me parallel lanes as output.
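
A minimal sketch of this remapping with the OpenCV C++ API is shown below;
the four trapezium corners are placeholder assumptions that must be picked by
hand for a given camera mounting:

#include <opencv2/opencv.hpp>

cv::Mat birdEyeView(const cv::Mat& frame)
{
    // Four corners of the road trapezium in the camera image (hand-picked
    // placeholder values for a 640x480 frame; they depend on the camera mount).
    cv::Point2f srcQuad[4] = {
        cv::Point2f(200, 300), cv::Point2f(440, 300),   // far edge of the road
        cv::Point2f(640, 480), cv::Point2f(0, 480) };   // near edge of the road
    // Corresponding rectangle in the top-down output view.
    cv::Point2f dstQuad[4] = {
        cv::Point2f(0, 0),     cv::Point2f(640, 0),
        cv::Point2f(640, 480), cv::Point2f(0, 480) };

    cv::Mat M = cv::getPerspectiveTransform(srcQuad, dstQuad);
    cv::Mat topDown;
    cv::warpPerspective(frame, topDown, M, frame.size());  // trapezium -> rectangle
    return topDown;
}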

4) HORIZON REMOVAL:
This means removing the horizon (the sky and other unwanted area above the
road), which helps in removing unwanted noise and disturbances. It is done by
accessing the rows and columns and setting the pixels to 0 in the unwanted
region, or by setting a region of interest (ROI) on the image using the ROI
facilities OpenCV provides.
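
For illustration, here is a minimal sketch of both variants (the horizon row
is an assumption that depends on the camera mounting):

#include <opencv2/opencv.hpp>

cv::Mat removeHorizon(const cv::Mat& frame, int horizonRow = 240)
{
    // Variant 1: keep only the road region below the horizon (ROI crop).
    cv::Mat road = frame(cv::Rect(0, horizonRow,
                                  frame.cols, frame.rows - horizonRow));

    // Variant 2 (as described above): zero out the unwanted pixels instead.
    // cv::Mat copy = frame.clone();
    // copy(cv::Rect(0, 0, frame.cols, horizonRow)).setTo(cv::Scalar::all(0));

    return road;  // note: this is a view sharing data with the input frame
}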
5) GAUSSIAN BLUR:
It is used to remove noise from the input image.

When the Gaussian blur function is applied to the matrix of pixels of the
input image, it produces the output image by convolving with a Gaussian
kernel, whose entries follow:

G(x, y) = (1 / (2*pi*sigma^2)) * exp(-(x^2 + y^2) / (2*sigma^2))

The function uses this kernel to perform a weighted averaging of the pixels
of the input image.

PART B
Here I have mentioned the algorithms and codes which worked well and
detected lanes successfully but were not adaptive to every condition; they
depend on certain factors like the type of road, lanes and surroundings. I
tried three such algorithms and finally reached the final algorithm.

ALGORITHM 1 & FLOWCHART:

ALGORITHM 1 (IT IS NOT ADAPTIVE FOR EVERY TYPE OF LANE):

Start: main function

1. LOAD IMAGE OR VIDEO.

2. CONVERT THE IMAGE OR FRAME INTO GRAYSCALE AND APPLY THE CANNY EDGE
DETECTOR FUNCTION.

3. REMOVE NOISE AND REMOVE THE HORIZON.

4. APPLY THE HOUGH STANDARD TRANSFORM AND THE HOUGH PROBABILISTIC TRANSFORM
SEPARATELY ON THE SAME IMAGE, WITH ANGLE THRESHOLDING.

5. PERFORM A BITWISE AND ON EACH PAIR OF PIXELS OF THE TWO PROCESSED IMAGES,
i.e. THE ONE FROM THE HOUGH STANDARD TRANSFORM AND THE ONE FROM THE HOUGH
PROBABILISTIC TRANSFORM.

6. DISPLAY THE PROCESSED IMAGE.

7. RELEASE ALL THE MEMORY USED BY THE IMAGES AND WINDOWS.

IMPORTANT NOTE: code, sample videos and output videos are uploaded on Google Drive:
https://drive.google.com/folderview?id=0BxV8Z1s8nFXWcDRuOXN1WEFRV3c&usp=sharing
THIS IS THE OUTPUT WHEN THIS ALGORITHM IS EXECUTED:

The advantage of removing the horizon can be seen: there are no lines above
the boundary between road and sky.
PROBLEM WITH ALGORITHM 1: For controlling a self-driving car and
maintaining a fixed distance from both lanes, I needed the equation of each
detected line, which is not possible here because more than one line is
detected per lane. I needed one Hough line for each lane, so that I could
find its equation and the distance of the car from the lane.
ALGORITHM 2:

As discussed for the previous algorithm, finding the equation of a lane is
possible only if I can get exactly one line per lane, so I used the concept
of averaging the values of rho and theta to get one single line. For example
(here k is a counter incremented inside the loop, so that rho1 and theta1
hold a running weighted mean of rho and theta):

rho1 = (rho1 * (k - 1) + rho) / k;       // running mean of rho
theta1 = (theta1 * (k - 1) + theta) / k; // running mean of theta
k++;

IMPORTANT NOTE:
code, sample videos and output videos are uploaded on Google Drive:
https://drive.google.com/folderview?id=0BxV8Z1s8nFXWcDRuOXN1WEFRV3c&usp=sharing

PROBLEM WITH ALGORITHM 2:


Algorithm 2 solves the problem algorithm 1 had: now I get one Hough line per
lane and can find the equation of the line. But in the case of 4-lane or
6-lane roads, the other lanes are also detected, get added to the weighted
mean of the Hough lines, and cause wrong results. This is a serious problem,
so I dumped this code and moved toward another algorithm.
One more problem with this code is that it does not work for curved roads,
because I applied a threshold on the values of theta obtained from the Hough
function:

if (theta < (CV_PI - 19. * CV_PI / 30.) || theta > 19. * CV_PI / 30.) // for first lane

else if (theta < (CV_PI / 180) * 70 && theta > (CV_PI / 180) * 20) // for 2nd lane

This does not allow the Hough function to output lines that fall outside this
theta range, so curves could not be detected. Only straight lanes were
detected successfully.

ALGORITHM 3:
In this code I used the distance of the Hough lines from the image centre as
a tool to filter out the extra Hough lines and obtain one Hough line for each
lane; this avoids the weighted mean and the errors mentioned for algorithm 2.
Here I find, for each Hough line, the column where it crosses a fixed row
(row 350), which follows from r = x * cos(theta) + y * sin(theta):

column = (-sin(theta) / cos(theta)) * 350 + rho / cos(theta);

I then divided the Hough lines into two categories, those on the left half of
the image and those on the right half, and kept only the lines nearest to the
centre of the camera (on the left and on the right of the centre,
respectively), because in between the lanes there is very little probability
of finding lane-like thick straight lines. Small segments which could be
falsely detected as lines are removed by the Gaussian blur or by setting the
parameters of the probabilistic Hough function (increasing the minimum length
a segment must have to be accepted as a line). K-means clustering can also be
used to remove the other lines, and road signs which may cause problems are
removed using a Haar classifier. A filtering sketch follows below.
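
The sketch below illustrates the filtering idea (illustrative only; it uses
each segment's midpoint as the line's position, which is a simplification of
what I did with the line equations above):

#include <opencv2/opencv.hpp>
#include <cstdlib>
#include <vector>

void pickLaneLines(const std::vector<cv::Vec4i>& segs, int imageCenterX,
                   cv::Vec4i& left, cv::Vec4i& right)
{
    int bestL = -1, bestR = -1;  // distance of the best candidate found so far
    for (size_t i = 0; i < segs.size(); i++)
    {
        int midX = (segs[i][0] + segs[i][2]) / 2;     // midpoint column of segment
        int d = std::abs(midX - imageCenterX);
        // keep the segment closest to the centre on each side
        if (midX < imageCenterX && (bestL < 0 || d < bestL)) { bestL = d; left  = segs[i]; }
        if (midX >= imageCenterX && (bestR < 0 || d < bestR)) { bestR = d; right = segs[i]; }
    }
}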
PROBLEM WITH ALGORITHM 3:
This code solves the weighted-mean problem, but curves still could not be
detected because the thresholding applied on theta is still there. Moreover,
since the lines are heavily slanted by perspective, it is not possible to
calculate the pixel-to-real-world distance relation, and therefore not
possible to localize the car between the lanes.

IMPORTANT NOTE: code, sample videos and output videos are uploaded on Google Drive:
https://drive.google.com/folderview?id=0BxV8Z1s8nFXWcDRuOXN1WEFRV3c&usp=sharing

PART C
Final algorithm:
To get accurate distances, and the relation between real-world distance and
pixels, I applied the BIRD'S-EYE VIEW (perspective vision) transform already
described above. This gives me the relation between pixels and real-world
distance, and now the car can be localized between the lanes.

Also, for detecting curved lanes I used the following algorithm:


The probabilistic Hough transform returns pairs of coordinates which define
line segments. I find the centroid of each pair of coordinates and join it to
the centroid of the next pair (instead of joining the endpoint pairs
themselves, I join the centroids). This algorithm works well for detecting
curved lanes, because this way the drawn Hough lines follow the tangents of
the curve. The idea is suggested by the MEAN VALUE THEOREM: if a function f
is continuous on the closed interval [a, b], where a < b, and differentiable
on the open interval (a, b), then there exists a point c in (a, b) such that:

f'(c) = (f(b) - f(a)) / (b - a)
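
A minimal sketch of this centroid-joining step follows (illustrative; it
assumes the segments in segs are already ordered along the lane, e.g. sorted
by their y coordinate):

#include <opencv2/opencv.hpp>
#include <vector>

void drawCurvedLane(cv::Mat& out, const std::vector<cv::Vec4i>& segs)
{
    // centroid of each segment's endpoint pair
    std::vector<cv::Point> centroids;
    for (size_t i = 0; i < segs.size(); i++)
        centroids.push_back(cv::Point((segs[i][0] + segs[i][2]) / 2,
                                      (segs[i][1] + segs[i][3]) / 2));
    // join consecutive centroids; the polyline follows the curve's tangents
    for (size_t i = 1; i < centroids.size(); i++)
        cv::line(out, centroids[i - 1], centroids[i], cv::Scalar(255, 0, 0), 2);
}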

FINAL FLOWCHART:
IMPORTANT NOTE: code, sample videos and output videos are uploaded on Google Drive:
https://drive.google.com/folderview?id=0BxV8Z1s8nFXWcDRuOXN1WEFRV3c&usp=sharing

Start: main function

1. OPEN THE SERIAL PORT TO SEND DATA THROUGH UART.

2. LOAD IMAGE OR VIDEO.

3. APPLY THE BIRD'S-EYE VIEW AND REMOVE NOISE.

4. CONVERT THE IMAGE OR FRAME INTO GRAYSCALE AND APPLY THE CANNY EDGE
DETECTOR FUNCTION.

5. APPLY THE HOUGH PROBABILISTIC TRANSFORM AND GET THE PAIRS OF LINE
COORDINATES IN A VECTOR.

6. JOIN THE CENTROIDS OF THE COORDINATE PAIRS (INSTEAD OF JOINING THE
COORDINATES THEMSELVES) TO DETECT CURVES, AND FILTER UNWANTED HOUGH LINES BY
THE MINIMUM-DISTANCE METHOD MENTIONED ABOVE.

7. DISPLAY THE PROCESSED IMAGE.

8. RELEASE ALL THE MEMORY USED BY THE IMAGES AND WINDOWS.

RESULT

When the code is executed, the following results are displayed in five
windows:

1) FIRST WINDOW: it shows the input image.

2) SECOND WINDOW: it shows the image after applying the BIRD'S-EYE VIEW,
CANNY EDGE detection and GAUSSIAN BLUR.
3) THIRD WINDOW: it shows the image on which all the detected Hough lines are
drawn, without filtering.

4) FOURTH WINDOW: it shows the final image with Hough lines only on the
lanes.
5) FIFTH WINDOW (shown last): it shows the distance of the centre of the
camera from both lanes, while the data for localizing the car is sent to the
processor over UART.
IMPORTANT NOTE: code, sample videos and output videos are uploaded on Google Drive:
https://drive.google.com/folderview?id=0BxV8Z1s8nFXWcDRuOXN1WEFRV3c&usp=sharing

TRAFFIC SIGN, FACE, CHARACTER RECOGNITION


WHY A CASCADE CLASSIFIER IS NEEDED:
Object recognition in OpenCV can be done in several ways:

COLOR THRESHOLDING: Recognising an object of a particular color is easy with
color thresholding, but it has several constraints; for example, if another
object of the same color appears in the background, one may get wrong
results. So one cannot rely on it for good results if the background is not
fixed.

CONTOUR DETECTION: It increases accuracy compared to color recognition. It
detects closed areas, and one can filter out other objects of the same color
(in the background) using the tools provided by the contour functions; for
example, with the contour area one can set the working range of the camera
(if only areas greater than a particular value are accepted, an object of the
same color in the background with a small apparent area is removed). But an
object of approximately the same area and color can still cause misleading
results.
FEATURE EXTRACTION USING CASCADE CLASSIFIERS: A classifier defines classes,
or categories, of objects. By running samples of each class through the
classifier to train it on what constitutes that class, you can then run the
trained classifier on unknown images to determine to which class each
belongs. There are many classifier types available, like HAAR, LBP, etc.
With these classifiers one can not only detect color but also get good
results on complex tasks which are not possible with contour detection.

Classifiers can be used for face detection, character and text recognition
and much more. A classifier works on feature extraction. It involves the
following steps:

1) SAMPLING:
Sampling means collecting sample images of the object which is to be
detected. This is a very important step, and for good results sampling should
be done carefully. Generally, for a good face detection program, more than
1000 samples need to be taken. Suppose I want to detect a traffic sign: for
that I have to gather sample images of the sign from all possible angles and
brightness conditions. The more samples gathered, the better the accuracy. In
order to train our own classifier we need samples, which means we need a lot
of images that show the object we want to detect (positive samples) and even
more images without the object (negative samples).
POSITIVE IMAGES
These are images of the object to be detected. Take photos of the object you
want to detect, look for them on the internet, or extract them from a video
to generate positive samples for OpenCV to work with. It is also important
that they differ in lighting and background.
NEGATIVE IMAGES
Negative images are also needed: ones that do not show the object to be
detected. In the best case, if one wants to train a highly accurate
classifier, one should have a lot of negative images that look exactly like
the positive ones, except that they don't contain the object we want to
recognize. As I want to detect stop signs on walls, the negative images would
ideally be a lot of pictures of walls, maybe even with other signs. Keep an
eye on the aspect ratios of the cropped images; they shouldn't differ that
much. The best results come from positive images that look exactly like the
ones you'd want to detect the object in, except that they are cropped so only
the object is visible.

2) SETTING REGION OF INTEREST OR CROPPING:


The second step is setting the region of interest in all the gathered sample
images, that is, removing the unwanted background and selecting only the
features which characterize the object. You need to crop the images so that
only the desired object is visible.
3) Now collect the information about the cropped images in a text file; the
information means their size and position in the original image. A link which
explains the method of doing so is given at the end of the report.

4) TRAINING AND CLASSIFICATION:


Training is the process of taking content that is known to belong to
specified classes and creating a classifier on the basis of that known
content. Classification is the process of taking a classifier built with such
a training set and running it on unknown content to determine class
membership. Training is an iterative process whereby you build the best
classifier possible, and classification is a one-time process designed to run
on unknown content.
Finally, after training, an XML file is generated, which is loaded in the
main program to match the features and detect whether the object is present
or not.
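
For illustration, here is a minimal sketch of loading such a trained XML file
and running detection on camera frames; "stop_sign.xml" is a placeholder name
for whatever file the training produced, and the detectMultiScale parameters
are typical values, not my exact settings:

#include <opencv2/opencv.hpp>
#include <vector>

int main()
{
    cv::CascadeClassifier cascade;
    if (!cascade.load("stop_sign.xml")) return -1;  // trained classifier file

    cv::VideoCapture cap(0);
    if (!cap.isOpened()) return -1;

    cv::Mat frame, gray;
    while (cap.read(frame))
    {
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        cv::equalizeHist(gray, gray);                    // normalize illumination

        std::vector<cv::Rect> objects;
        cascade.detectMultiScale(gray, objects, 1.1, 3); // scale step, min neighbours

        for (size_t i = 0; i < objects.size(); i++)
            cv::rectangle(frame, objects[i], cv::Scalar(0, 255, 0), 2);

        cv::imshow("detections", frame);
        if (cv::waitKey(1) == 27) break;                 // Esc to quit
    }
    return 0;
}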

OUTPUT OF CODE:
Here is the output when the code was run:
TRAFFIC SIGN, FACE AND VEHICLE RECOGNITION:

IMPORTANT NOTE: code, sample videos and output videos are uploaded on Google Drive:
https://drive.google.com/folderview?id=0BxV8Z1s8nFXWcDRuOXN1WEFRV3c&usp=sharing

FUTURE WORK THAT CAN BE DONE


1) Pedestrian recognition for better localization.
2) In vehicle and road sign detection I used 200 positive images for
training, which sometimes gives wrong results, so the number of samples
should be increased.
3) Shadow and illumination correction for better results.
4) Multithreading of the cascade classifier can be used; it increases the
speed of training.
5) The LBP classifier can be used for faster training and detection.

IMPORTANT NOTE:
code, sample videos and output videos are uploaded on Google Drive:
https://drive.google.com/folderview?id=0BxV8Z1s8nFXWcDRuOXN1WEFRV3c&usp=sharing
