
OpenCV-Based Autonomous Robot

By:
Asim Ali
Shahab Ahmad
Adnan Khan

A thesis presented to the Department of Electrical Engineering, University of Engineering
and Technology, Peshawar, Pakistan in partial fulfillment of the requirements for the
degree of
Bachelor of Science
in
Electrical Engineering (Communication)

Supervised by:
Engr. Amir Rasheed

DEPARTMENT OF ELECTRICAL ENGINEERING
UNIVERSITY OF ENGINEERING & TECHNOLOGY, PESHAWAR, PAKISTAN

UNIVERSITY OF ENGINEERING AND TECHNOLOGY, PESHAWAR
DEPARTMENT OF ELECTRICAL ENGINEERING

REPORT ON BSc ELECTRICAL ENGG. THESIS DEFENCE EXAMINATION


Title of BSc Thesis: OpenCV-Based Autonomous Robot

Student’s Name and Address Date of Thesis Defence Examination


Asim Ali
Shahab Ahmad
Adnan Khan
Department of Electrical Engineering, _______________________
UET Peshawar, Khyber Pakhtunkhwa

Examination Committee

Members Approving This Report Members Not Approving This Report

1. _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 1. _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

2._ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 2. _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

3._ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 3. _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

In case of a tie or a majority not approving this project, state below the reasons for failure and
conditions to be met if a re-examination is to be administered.

________________________________________ ____________
____________________________________________________
_______________________________________ _____________

AUTHOR'S DECLARATION

We hereby declare that we are the sole authors of this thesis. This is a true copy of the thesis,
including any required final revisions, as accepted by our examiners.
We understand that our thesis may be made electronically available to the public.

Abstract

Vision-based robot navigation has long been a fundamental goal in both robotics and
computer vision research. While the problem is largely solved for robots equipped with
active range-finding devices, for a variety of reasons the task remains challenging for
robots equipped only with vision sensors. Vision is an attractive sensor as it enables the
design of economically viable systems with fewer sensing limitations. It facilitates
passive sensing of the environment and provides valuable semantic information about the
scene that is unavailable to other sensors. Two popular paradigms have emerged to
analyze this problem, namely model-based and model-free algorithms. Model-based
approaches demand a priori model information to be made available in advance; in the
model-free case, the required 3D information is computed online. Model-free navigation
paradigms have gained popularity over model-based approaches due to their simpler
assumptions and wider applicability. This thesis discusses a further paradigm for vision-
based navigation, namely image-based navigation. The basic insight is that model-free
paradigms involve an intermediate depth computation that is redundant for the purpose
of navigation. Instead, the motion instructions required to control the robot can be
inferred directly from the acquired images. This approach is more attractive because the
modeling of objects is simply substituted by the memorization of views, which is far
easier than 3D modeling.

Acknowledgements

I would like to thank Engr. Amir Rasheed for his guidance and support during my research. He
provided me with the opportunities to pursue my personal research interests, and I am thankful
for the same. I would also like to thank Engr. Dawar Awan for advising me and motivating me at
critical junctures. They have been great mentors and have guided me through the ups and downs
of my research career so far.
I have immensely benefited from their knowledge of OpenCV and image processing,
specifically that of Engr. Dawar Awan.

Table of Contents

List of Figures

Chapter 1
Introduction

1.1 Introduction to the Problem

Consider a robot that needs to determine how to navigate around an obstacle in its path.
The robot must detect the obstacle, determine how far away it is and how large it is, and
calculate the best path around it. The second of these tasks, determining how far away
the object is, can be accomplished easily with dedicated equipment such as a laser
range-sensor.
However, when a robot relies entirely on its vision, other techniques must be employed.
Estimating the depth of an object from a single video sequence involves two main tasks:
tracking the object, and estimating the position of the tracked object.
Object tracking is a problem that has been studied widely. Solutions to object tracking
include segmentation, motion vector calculations, and density gradient estimation. Pose
estimation is the problem of determining the translation and rotation of an object in an
image with respect to the camera. The translation obtained from pose estimation is the
position of the object, so pose estimation can be used for the second task involved in
depth estimation. Methods for pose estimation include filters, stereo cameras, and
algebraic methods with projections.
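As a concrete illustration of the algebraic/projection approach (which we do not use in this
project), OpenCV's solvePnP recovers the rotation and translation of an object from known
3D-2D point correspondences. The following is only a hypothetical sketch: the object points,
the camera matrix values, and the function name estimatePoseExample are placeholders, and
real intrinsics would come from camera calibration.

#include "opencv2/calib3d/calib3d.hpp"
#include "opencv2/core/core.hpp"
#include <vector>

using namespace cv;

// Hypothetical sketch: recover object pose from four known 3D points and
// their detected image projections.
void estimatePoseExample(const std::vector<Point2f>& imagePoints)
{
    // 3D model points of the tracked object, in object coordinates (e.g. cm)
    std::vector<Point3f> objectPoints;
    objectPoints.push_back(Point3f(0, 0, 0));
    objectPoints.push_back(Point3f(10, 0, 0));
    objectPoints.push_back(Point3f(10, 10, 0));
    objectPoints.push_back(Point3f(0, 10, 0));

    // Placeholder intrinsics (fx, fy, cx, cy); real values come from calibration
    Mat cameraMatrix = (Mat_<double>(3, 3) << 800, 0, 320,
                                              0, 800, 240,
                                              0, 0, 1);
    Mat distCoeffs = Mat::zeros(4, 1, CV_64F); // assume no lens distortion

    Mat rvec, tvec; // rotation (Rodrigues vector) and translation
    solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs, rvec, tvec);
    // tvec now holds the object's position relative to the camera; its third
    // component is the depth that an obstacle-avoidance routine would use.
}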
In this project we consider the case where a robot has only a single camera and wishes to
estimate the radius, color, and position of an object. Our method for object tracking uses
Hough circle detection together with color filters, and relies on the HSV representation
of the image.
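The tracking approach just described can be sketched in a few OpenCV calls. This is an
outline rather than our final program (the full code appears in Chapter 2); the HSV bounds
and the function name findColouredCircles are placeholders for whatever colour is tracked.

#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/core/core.hpp"
#include <vector>

using namespace cv;

// Sketch: isolate a coloured object in HSV space, then detect it as a circle.
std::vector<Vec3f> findColouredCircles(const Mat& frameBGR)
{
    Mat hsv, mask;
    cvtColor(frameBGR, hsv, COLOR_BGR2HSV); // BGR -> HSV
    inRange(hsv, Scalar(0, 100, 100), Scalar(30, 255, 255), mask); // colour filter

    GaussianBlur(mask, mask, Size(9, 9), 2, 2); // smooth to reduce false circles

    std::vector<Vec3f> circles; // each circle is (centre x, centre y, radius)
    HoughCircles(mask, circles, CV_HOUGH_GRADIENT,
                 1,              // accumulator resolution (same as the image)
                 mask.rows / 8,  // minimum distance between circle centres
                 100, 20,        // Canny high threshold, accumulator threshold
                 10, 200);       // minimum and maximum radius in pixels
    return circles;             // radius gives object size; centre gives position
}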

1.1.1 Vision Sensors

There are many different types of digital vision sensors. Each is used to create an image
that contains some information about the scene. Some sensors detect range information
while others simply take color images; some can even take pictures of the inside of the
human body. In general, though, images consist of a two-dimensional array of binary data
in which each cell in the array makes up a pixel, or picture element. We provide a basic
description of range cameras, normal light-sensing cameras, and video cameras here.
Range cameras use electromagnetic beams to detect depth information in the scene. Each
pixel of the image therefore represents the depth of the object at that location in the
image. Such images contain information about the surfaces of objects in the scene, and
can represent motion more accurately than intensity images.
Light-sensing camera sensors consist of a two-dimensional array of light sensors on a
chip. These sensors detect the intensity of the light that hits them. A major type of camera
sensor is the charge-coupled device (CCD). In CCD cameras, light hits a photoelectric
cell, producing an amount of charge depending on the intensity of the light; the charge is
converted to binary data using a shift register, or CCD.
Video cameras are simply normal cameras that record sequences of images, or frames.
The standard frame rate is 30 fps (frames per second). Humans cannot detect change fast
enough to notice the difference from frame to frame, so the sequence gives the illusion of
motion. For this project, standard color video cameras are used; however, our application
can track objects in still images as well.

1.1.2 Image Representation
Each pixel in an image is represented by binary data. This data can be one value or a
combination of a few values. These values are usually represented in a single 8-bit byte
of data, meaning they can be between 0 and 255.
In grayscale, an image is a single two-dimensional array of pixels, where each pixel is
represented by a single byte. This byte represents the intensity of the pixel. Binary
images are images where each pixel is either a zero or a one. Depending on the encoding,
zero can represent the absence of a pixel and one its presence; the image would then be
white where there are no pixels and black where there are.
Color images contain more information than just intensity or the presence of a pixel:
they must encode the color of each pixel. There are several color systems for doing this,
such as RGB and HSV. The standard color system, Red-Green-Blue (RGB), represents
each pixel by 24 bits (3 bytes). Each byte represents a red, green, or blue value for the
pixel, and the combination of these three values gives the color of the pixel. Our
application uses color images in order to track objects based on their color. However, we
convert the color images to grayscale images that represent the probability of each pixel
belonging to the target color.
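These representations map directly onto a few OpenCV calls. The sketch below is
illustrative only; "input.jpg" is a placeholder file name.

#include <iostream>
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"

using namespace cv;

// Sketch: load a colour image and derive grayscale and binary versions.
int demoRepresentations()
{
    Mat color = imread("input.jpg"); // 3 bytes per pixel: blue, green, red
    if (color.empty()) return -1;

    Mat gray, binary;
    cvtColor(color, gray, COLOR_BGR2GRAY);            // 1 byte per pixel, 0..255
    threshold(gray, binary, 128, 255, THRESH_BINARY); // each pixel 0 or 255

    Vec3b bgr = color.at<Vec3b>(0, 0); // the three colour bytes of one pixel
    uchar  g  = gray.at<uchar>(0, 0);  // the single intensity byte
    std::cout << "B=" << (int)bgr[0] << " intensity=" << (int)g << std::endl;
    return 0;
}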
1.1.3 Object Tracking

Object tracking is a topic within computer vision that deals with the recognition and
tracking of moving objects in a scene. These objects can be known and recognized from a
stored model, or can be recognized based on features such as shape, color, or texture.
Tracking deals with determining the location of the object in the image even if the object
changes position between consecutive images, as in a video sequence. Tracking can
also deal with prediction of object location based on trajectory. There are several
techniques for object detection and tracking. The techniques we looked at, and discuss
here, are segmentation and optical flow. The algorithm used in this project relies on
another method, known as density gradient estimation.
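Density gradient estimation is the idea behind OpenCV's mean-shift tracker: a search
window repeatedly climbs the gradient of a probability image (for example, a colour
back-projection) until it settles on the densest region. A minimal sketch, assuming a
probability image has already been computed:

#include "opencv2/video/tracking.hpp"
#include "opencv2/core/core.hpp"

using namespace cv;

// One tracking step by density gradient (mean shift).
// probImage: 8-bit image where brighter pixels are more likely target pixels.
// window:    current estimate of the object's bounding box, updated in place.
void trackStep(const Mat& probImage, Rect& window)
{
    // Stop after 10 iterations, or when the window moves less than 1 pixel.
    TermCriteria criteria(TermCriteria::EPS | TermCriteria::COUNT, 10, 1.0);
    meanShift(probImage, window, criteria);
    // 'window' has now shifted toward the local density maximum.
}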

1.2 What Do We Want to Do?

Our basic objective was to design an autonomous robot that can avoid an obstacle by
changing its path whenever an object appears in front of it, without using conventional
sensors. For this purpose we have taken advantage of modern software tools and
libraries, the most important of which is OpenCV. In a nutshell, we had the following
objectives in mind:
1. The robot must avoid the specified object in its path.
2. Conventional sensors should be avoided; real-time image processing should be used instead.

Chapter 2

OpenCV Scheme

2.1 Introduction

Fig 2.1: Introduction

2.1.1 History

OpenCV is a library of programming functions mainly aimed at real-time computer
vision. It was originally developed by Intel's research center in Nizhny Novgorod
(Russia), later supported by Willow Garage, and is now maintained by Itseez. The
library is cross-platform and free for use under the open-source BSD license.

Officially launched in 1999, the OpenCV project was initially an Intel Research
initiative to advance CPU-intensive applications, part of a series of projects
including real-time ray tracing and 3D display walls. The first alpha version of
OpenCV was released to the public at the IEEE Conference on Computer Vision and
Pattern Recognition in 2000, and five betas were released between 2001 and 2005.

The first 1.0 version was released in 2006. In mid-2008, OpenCV obtained corporate
support from Willow Garage and came under active development again. A version
1.1 "pre-release" was released in October 2008. OpenCV 2 includes major changes to
the C++ interface, aiming at easier, more type-safe patterns, new functions, and better
implementations of existing ones in terms of performance (especially on multi-core
systems).

2.1.2 Applications
OpenCV's application areas include:

1. 2D and 3D feature toolkits
2. Motion estimation
3. Facial recognition systems
4. Gesture recognition
5. Human–computer interaction (HCI)
6. Mobile robotics
7. Motion understanding
8. Object identification
9. Image processing, etc.

2.1.3 Programming Languages

OpenCV supports the following programming languages:

 OpenCV is written in C++ and its primary interface is in C++, but it still retains a
less comprehensive though extensive older C interface.

 There are bindings for Python, Java and MATLAB.

 New algorithms in OpenCV are developed in the C++ interface.

2.1.4 Major Modules

• Core:

A compact module defining basic data structures, including the dense multi-dimensional
array Mat, and basic functions used by all other modules.

• Image processing:

An image processing module that includes linear and non-linear image filtering,
geometrical image transformations (resize, affine and perspective warping, generic table-
based remapping), color space conversion, histograms, and so on.

• Video:

A video analysis module that includes motion estimation, background subtraction, and
object tracking algorithms.

• Calib3d:

Basic multiple-view geometry algorithms, single and stereo camera calibration, object
pose estimation, stereo correspondence algorithms, and elements of 3D reconstruction.

• Features2d:

Salient feature detectors, descriptors, and descriptor matchers.

• Object detection:

Detection of objects and instances of predefined classes (for example, faces, eyes,
mugs, people, cars, and so on).

• Highgui:

An easy-to-use interface to video capturing, image and video codecs, as well as simple UI
capabilities.

2.1.5 Modules Used In Project

 Image Processing:
Image processing is performed to help the computer understand the content of an
image (features such as colors and shapes). This is done by developing image
processing algorithms.
 High-Level GUI and Media I/O:
While OpenCV was designed for use in full-scale applications and can be used within
functionally rich UI frameworks, it also needs a simple way to visualize results and
prototype quickly; this is what the HighGUI module has been designed for (see the
sketch after this list). It provides an easy interface to:
• Create and manipulate windows that can display images and "remember" their content
(no need to handle repaint events from the OS).
• Add trackbars to the windows; handle simple mouse events as well as keyboard
commands.
• Read and write images to/from disk or memory.
• Read video from a camera or file, and write video to a file.
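The following short sketch exercises the HighGUI facilities listed above: a named window,
a trackbar, live video from the default camera, and keyboard handling. The window name,
trackbar name, and threshold value are arbitrary choices for illustration.

#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"

using namespace cv;

int highguiDemo()
{
    VideoCapture cap(0); // default webcam
    if (!cap.isOpened()) return -1;

    namedWindow("Preview"); // a window that "remembers" its content
    int threshVal = 128;
    createTrackbar("Threshold", "Preview", &threshVal, 255); // simple slider

    Mat frame, gray, binary;
    while (cap.read(frame)) // read video from the camera
    {
        cvtColor(frame, gray, COLOR_BGR2GRAY);
        threshold(gray, binary, threshVal, 255, THRESH_BINARY);
        imshow("Preview", binary);    // display in the window
        if (waitKey(30) == 27) break; // Esc key quits
    }
    return 0;
}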

2.2 Visual Studio C++

Visual Studio is a development platform that lets its users write programs for Android, iOS, and
many Windows applications. It primarily supports the C and C++ languages but is nowadays also
compatible with much Java-written code. The feature for which we selected it was its
compatibility with the OpenCV binaries; it is also very user friendly and exposes many important
HighGUI-related functions in an efficient and easy way.
All we have to do is link the OpenCV libraries with Visual Studio C++.

C++ code:

#include <iostream>
#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/objdetect/objdetect.hpp"

#include <Windows.h>

using namespace cv;
using namespace std;

int main(int argc, char** argv)
{
    bool present = false;
    char outputChars[1]; // single command byte sent over serial
    DWORD btsIO;
    char turn = 'B';     // command used to make the robot turn

    // Set up the serial port connection and needed variables.
    HANDLE hSerial = CreateFile(L"COM7", GENERIC_READ | GENERIC_WRITE, 0, 0,
        OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, 0);

    if (hSerial != INVALID_HANDLE_VALUE)
    {
        printf("Port opened! \n");

        DCB dcbSerialParams = {0};
        dcbSerialParams.DCBlength = sizeof(dcbSerialParams); // required before GetCommState
        GetCommState(hSerial, &dcbSerialParams);

        dcbSerialParams.BaudRate = CBR_9600; // must match the Arduino's baud rate
        dcbSerialParams.ByteSize = 8;
        dcbSerialParams.Parity = NOPARITY;
        dcbSerialParams.StopBits = ONESTOPBIT;

        SetCommState(hSerial, &dcbSerialParams);
    }
    else
    {
        if (GetLastError() == ERROR_FILE_NOT_FOUND)
        {
            printf("Serial port doesn't exist! \n");
        }
        printf("Error while setting up serial port! \n");
        return -1; // cannot continue without the serial link to the robot
    }

    // VideoCapture cap(0); // use this line instead to capture from a local webcam
    VideoCapture cap("http://admin:123456@192.168.0.100/video.cgi?.mjpg"); // MJPEG stream from the IP camera

    if (!cap.isOpened()) // if not successful, exit the program
    {
        cout << "Cannot open the web cam" << endl;
        return -1;
    }

    while (true)
    {
        Mat imgOriginal;

        bool bSuccess = cap.read(imgOriginal); // read a new frame from the video

        if (!bSuccess) // if not successful, break the loop
        {
            cout << "Cannot read a frame from video stream" << endl;
            break;
        }

        Mat imgHSV;
        cvtColor(imgOriginal, imgHSV, COLOR_BGR2HSV); // convert the captured frame from BGR to HSV

        Mat imgThresholded;
        inRange(imgHSV, Scalar(0, 169, 82), Scalar(32, 255, 154), imgThresholded); // threshold: keep only pixels in the target colour range

        outputChars[0] = 'D';
        WriteFile(hSerial, outputChars, 1, &btsIO, NULL); // send one 'D' (drive) byte each frame

        // morphological opening (remove small objects from the foreground)
        erode(imgThresholded, imgThresholded, getStructuringElement(MORPH_RECT, Size(5, 5)));
        dilate(imgThresholded, imgThresholded, getStructuringElement(MORPH_RECT, Size(25, 25)));

        imshow("Thresholded Image", imgThresholded); // show the thresholded image

        Mat finalImg = imgThresholded; // note: shares pixel data with imgThresholded

        // contours of targets
        vector<vector<Point> > contours;

        // get contours from the binary image (findContours modifies its input)
        findContours(finalImg, contours, CV_RETR_TREE, CV_CHAIN_APPROX_SIMPLE);

        // polygonal approximations of the contours
        vector<vector<Point> > contours_poly(contours.size());

        // bounding rectangles
        vector<Rect> boundRect(contours.size());

        present = false;

        for (size_t i = 0; i < contours.size(); i++)
        {
            // approximate the contour by a polygon and take its bounding rectangle
            approxPolyDP(Mat(contours[i]), contours_poly[i], 5, true);
            boundRect[i] = boundingRect(Mat(contours_poly[i]));
            int x1 = boundRect[i].x;
            int y1 = boundRect[i].y;
            int h1 = boundRect[i].height;
            int w1 = boundRect[i].width;

            rectangle(finalImg, Point(x1, y1), Point(x1 + w1, y1 + h1), Scalar(128, 128, 128), 3);
            rectangle(imgOriginal, Point(x1, y1), Point(x1 + w1, y1 + h1), Scalar(0, 0, 255), 3);

            present = true; // at least one target contour was found
        }

        if (present)
        {
            outputChars[0] = turn; // 'B': obstacle detected, turn away
            WriteFile(hSerial, outputChars, 1, &btsIO, NULL);
            outputChars[0] = 'D';
            WriteFile(hSerial, outputChars, 1, &btsIO, NULL);
        }
        else
        {
            outputChars[0] = 'C'; // path clear
            WriteFile(hSerial, outputChars, 1, &btsIO, NULL);
        }

        //imshow("final Image", finalImg);

        imshow("Original", imgOriginal); // show the original image with detections drawn

        if (waitKey(30) == 27) // wait up to 30 ms for a key press; 27 is the Esc key
        {
            cout << "esc key is pressed by user" << endl;
            outputChars[0] = 'S'; // stop the robot before quitting
            WriteFile(hSerial, outputChars, 1, &btsIO, NULL);
            break;
        }
    }

    CloseHandle(hSerial); // release the serial port
    return 0;
}


Chapter 3
Overview of Hardware

Equipment Needed

After the OpenCV programming, we use the following equipment to build the basic
chassis of our robot:
1. Arduino Uno
2. USB extension cables
3. Webcam
4. Jumper wires
5. Wi-Fi router
6. Remote control car
3.1 Arduino Uno
Arduino is the major hardware used in our project. The Arduino Uno is a simple, easy-to-use
microcontroller board that can perform many functions thanks to its wide range of I/O and
PWM pins. (A hypothetical motor-control sketch for this project follows the component list
below.)
The Arduino Uno consists of the following components:
1. Power USB:
The Arduino board can be powered through the USB cable from your computer.
All you need to do is connect the USB cable to the USB connector.
2. Power jack:
The board can also be powered directly from mains power by connecting an
AC-to-DC adapter to the barrel jack.
3. Voltage regulator:
The voltage regulator controls the voltage given to the Arduino board and
stabilizes the DC voltage used by the processor and the other elements.
4. Crystal oscillator:
The crystal oscillator helps the Arduino deal with timing issues.
5. Arduino reset:
You can reset your Arduino board, i.e. restart your program from the beginning,
by using the reset button.
6. Pins (3.3 V, 5 V, GND, Vin):
The 3.3V pin supplies 3.3 volts and the 5V pin supplies 5 volts. Most of the
components used with the Arduino board work fine at 3.3 or 5 volts.
GND (ground): there are several GND pins on the Arduino, any of which can
be used to ground your circuit.
VIN: this pin can also be used to power the Arduino board from an external
power source, such as a mains adapter.
7. Analog pins:
The Arduino Uno board has six analog input pins, A0 through A5. These pins
can read the signal from an analog sensor, such as a humidity or temperature
sensor, and convert it into a digital value that can be read by the
microcontroller.
8. Main microcontroller:
Each Arduino board has its own microcontroller; you can think of it as the
brain of the board. The main IC differs slightly from board to board; the
microcontrollers are usually from Atmel (the Uno uses the ATmega328P). You
must know which IC your board has before uploading a new program from the
Arduino IDE; this information is printed on the top of the IC.
9. ICSP pins:
The ICSP header is a small AVR programming header consisting of MOSI,
MISO, SCK, RESET, VCC, and GND. It is often referred to as SPI (Serial
Peripheral Interface) and can be considered an expansion of the output;
effectively, you are slaving the output device to the master of the SPI bus.
10. Power LED indicator:
This LED should light up when you plug your Arduino into a power source,
indicating that the board is powered up correctly. If it does not turn on,
there is something wrong with the connection.
11. TX and RX LEDs:
On the board you will find two labels, TX (transmit) and RX (receive). They
appear in two places on the Arduino Uno board: first at digital pins 0 and 1,
to indicate the pins responsible for serial communication, and second as the
TX and RX LEDs, which flash during serial transmission.
12. Digital I/O:
The Arduino Uno board has 14 digital I/O pins. These pins can be configured
as digital inputs to read logic values (0 or 1), or as digital outputs to
drive modules like LEDs, relays, etc. The pins labelled with "~" can also be
used to generate PWM.
13. AREF:
AREF stands for Analog Reference. It is sometimes used to set an external
reference voltage (between 0 and 5 volts) as the upper limit for the analog
input pins.
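The PC-side program in Chapter 2 sends single-character commands over the serial port:
'D' while driving, the turn command 'B' when an obstacle is detected, 'C' when the path
is clear, and 'S' to stop. The Arduino firmware itself is not listed in this thesis, so
the sketch below is a hypothetical illustration of how those commands could drive the
motors; the pin numbers, speeds, and the dual H-bridge wiring are all assumptions.

// Hypothetical Arduino sketch: interpret the single-character commands sent
// by the PC program. Pin assignments assume a generic dual H-bridge driver.
const int LEFT_FWD  = 5;  // PWM-capable pins driving the H-bridge inputs
const int LEFT_REV  = 6;
const int RIGHT_FWD = 9;
const int RIGHT_REV = 10;

void setMotors(int left, int right) // speed per side, -255..255
{
  analogWrite(LEFT_FWD,  left  > 0 ?  left  : 0);
  analogWrite(LEFT_REV,  left  < 0 ? -left  : 0);
  analogWrite(RIGHT_FWD, right > 0 ?  right : 0);
  analogWrite(RIGHT_REV, right < 0 ? -right : 0);
}

void setup()
{
  pinMode(LEFT_FWD, OUTPUT);  pinMode(LEFT_REV, OUTPUT);
  pinMode(RIGHT_FWD, OUTPUT); pinMode(RIGHT_REV, OUTPUT);
  Serial.begin(9600); // must match CBR_9600 on the PC side
}

void loop()
{
  if (Serial.available() > 0)
  {
    switch (Serial.read())
    {
      case 'C': setMotors(200, 200);  break; // path clear: drive forward
      case 'B': setMotors(-150, 150); break; // obstacle: pivot away
      case 'D': /* drive keep-alive; no change */ break;
      case 'S': setMotors(0, 0);      break; // stop the robot
    }
  }
}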

Wi-Fi Router
A Wi-Fi router, also called an access point, is a device that provides access to the internet;
in simple words, a router is a networking device that forwards data packets between computer
networks. Wi-Fi routers make it easy to build a fast, reliable network for homes, universities,
offices, etc. The router used here supports the 802.11g protocol set.
In our project the Wi-Fi router provides wireless connectivity between the PC and the IP
camera. Since we know that a router forwards traffic between a source and a destination
address, it treats the IP camera as the source and the PC as the destination: in simple words,
the images captured by the camera are sent to the laptop by means of the Wi-Fi router. The
router provides a network to which both the IP camera and the laptop are connected. To
establish wireless connectivity between the PC (laptop) and the IP camera, some necessary
configuration is required.

IP Camera
An Internet Protocol (IP) camera is a type of digital video camera that is capable of sending
and receiving data via a computer network and the internet. IP cameras connect directly to the
network with no dependence on any other device. IP cameras can offer two-way audio, so the
user can interact with the video footage, and they enable remote viewing by anyone connected
to the network.
In our project we used a D-Link Wireless N network camera. The D-Link Wireless N network
camera (DCS-930L) is well suited to homes and small offices. This camera provides:
1. Surveillance and monitoring
2. Live video viewing from anywhere
3. Motion and sound detection
4. Up to 640 x 480 resolution for clear video images
5. Wireless 802.11n with Wi-Fi Protected Setup (WPS) and an Ethernet LAN port for network
connection
6. M-JPEG video codec support for video compression and viewing
7. Easy setup, flexible placement, anywhere access
Remote Control Car
