
Hand Gesture Based Robotic Mobile Loader

AHMED DAUD (DDP-FA10-BTE-002)
AISHA BATOOL (DDP-FA10-BTE-001)
HAMAYUN KHAN (DDP-FA10-BTE-020)

SPRING 2014
Project Advisor
Dr. Mujtaba Hussain Jaffery

Department of Electrical Engineering

COMSATS-LANCASTER Dual Degree Program

COMSATS INSTITUTE OF INFORMATION TECHNOLOGY


LAHORE PAKISTAN

Submission Form for Final-Year PROJECT REPORT

PROJECT ID: 1905
TITLE: Hand Gesture Based Robotic Mobile Loader
NUMBER OF MEMBERS: 3
SUPERVISOR NAME: Dr. Mujtaba Hussain Jaffery

MEMBER NAME       REG. NO.                   EMAIL ADDRESS
Ahmed Daud        CIIT/DDP-FA10-BTE-002      ahmeddaud@hotmail.com
Aisha Batool      CIIT/DDP-FA10-BTE-001      aishasheikh@yahoo.co.uk
Hamayun Khan      CIIT/DDP-FA10-BTE-020      Hamayunkhan@ymail.com

CHECKLIST:
Number of pages in this report: 56
I/We have enclosed the soft-copy of this document along with the codes and scripts created by myself/ourselves: YES / NO
My/Our supervisor has attested the attached document: YES / NO
I/We confirm to state that this project is free from any type of plagiarism and misuse of copyrighted material: YES / NO

MEMBERS' SIGNATURES

Supervisor's Signature
COMSATS Institute of Information Technology, Lahore Campus

Department of Electrical Engineering

This work, entitled "Hand Gesture Based Robotic Mobile Loader", has been
approved for the award of

Bachelor's in Electrical Engineering

SPRING 2014

External Examiner:

Head of Department:

Department of Electrical Engineering

COMSATS INSTITUTE OF INFORMATION TECHNOLOGY


LAHORE PAKISTAN

Declaration

No portion of the work referred to in the dissertation has been submitted in support of an
application for another degree or qualification of this or any other university/institute or
other institution of learning.

MEMBERS SIGNATURES


Acknowledgements
We are grateful to ALLAH Almighty, the most kind and most merciful, who provides us with
resources of every kind so that we may put them to proper use for the benefit of mankind. We
would like to thank our parents, who kept backing us up at all times with their prayers and with
financial and moral support. We are thankful to our worthy project advisor, Dr. Mujtaba Hussain
Jaffery, who helped, encouraged, guided and motivated us all the way through the difficult journey
of accomplishing this project. We praise Dr. Mujtaba's motivation and kind guidance, which
transformed the dream of this project into reality; without his supervision we would have been
nowhere. Moreover, we would like to thank the faculty members of the Electrical Engineering
Department for their endless and valuable support during our final year of the degree program.
Furthermore, we would like to thank the project lab staff for their help and support in completing
this project.


Abstract
The Kinect sensor is widely used for human motion recognition in video games. In this project, the
Kinect sensor is used as a ubiquitous human-machine interface for the motion control of a mobile robot
according to the movements of the hands (gestures). The robot is in the form of a loader vehicle, which can
be used for the transportation of goods. After decision making based on gesture recognition, the corresponding
command is sent wirelessly to the robot and the action is performed according to the movement of the hands.
Moreover, a Graphical User Interface (GUI) for monitoring and controlling the robot from a computer is
also implemented in this project. Obstacle avoidance using sonar sensors is employed to avoid hurdles in the
robot's path. The robot can also be controlled using speech recognition.


Table of Contents

1 INTRODUCTION ................................................................. 1
  1.1 KINECT SENSOR ............................................................ 1
  1.2 OVERVIEW OF THE SYSTEM ................................................... 2
  1.3 MAJOR CONTRIBUTION ....................................................... 3
  1.4 DESCRIPTION OF THESIS .................................................... 4
2 LITERATURE REVIEW ............................................................ 5
  2.1 ROBOTICS ................................................................. 5
  2.2 OBSTACLE DETECTION AND PREVIOUS WORK ..................................... 6
  2.3 SYSTEM STRUCTURE ......................................................... 7
  2.4 WIRELESS COMMUNICATION ................................................... 7
    2.4.1 Comparison between different wireless communication protocols ....... 7
  2.5 SPEECH AND VOICE RECOGNITION ............................................. 8
    2.5.1 Speech recognition ................................................... 9
    2.5.2 Voice recognition ................................................... 10
  2.6 EXISTING GESTURE RECOGNITION SYSTEMS ................................... 10
3 SOFTWARE ARCHITECTURE ....................................................... 12
  3.1 COMMUNICATION SYSTEM .................................................... 12
    3.1.1 Requirements ........................................................ 12
    3.1.2 Design .............................................................. 12
    3.1.3 X-CTU ............................................................... 12
    3.1.4 Functionality ....................................................... 13
  3.2 HMI MODULES ............................................................. 13
    3.2.1 Gesture Based Control ............................................... 13
    3.2.2 GUI and Features .................................................... 17
    3.2.3 Speech Based Control ................................................ 18
    3.2.4 Voice recognition module ............................................ 20
4 HARDWARE ARCHITECTURE ....................................................... 23
  4.1 ARCHITECTURE OVERVIEW ................................................... 23
  4.2 COMPONENTS .............................................................. 23
    4.2.1 The Kinect Sensor ................................................... 24
    4.2.2 Arduino ............................................................. 25
    4.2.3 Arduino Software .................................................... 27
    4.2.4 H-Bridge ............................................................ 27
    4.2.5 Xbee Module ......................................................... 28
    4.2.6 Batteries ........................................................... 30
    4.2.7 Distance sensor ..................................................... 31
    4.2.8 Motors .............................................................. 32
    4.2.9 LCD Display ......................................................... 33
    4.2.10 Gas Sensor ......................................................... 34
  4.3 ROBOT DESIGN ............................................................ 34
    4.3.1 Base Design ......................................................... 34
    4.3.2 Structural Design ................................................... 36
    4.3.3 Functionality ....................................................... 37
    4.3.4 Dimensions .......................................................... 37
5 IMPLEMENTATION .............................................................. 39
  5.1 TOOLS ................................................................... 39
  5.2 DEVELOPMENT STAGE ....................................................... 39
    5.2.1 Structure and Design ................................................ 40
    5.2.2 Wireless Communication .............................................. 40
    5.2.3 Hand Gesture Recognition ............................................ 40
    5.2.4 Hurdle Avoidance Mechanism .......................................... 41
    5.2.5 Graphical User Interface ............................................ 41
    5.2.6 Wireless Monitoring ................................................. 41
    5.2.7 Speech recognition .................................................. 41
    5.2.8 System Integration .................................................. 42
    5.2.9 Key Components ...................................................... 42
  5.3 HURDLES ................................................................. 43
  5.4 SOLUTION OF THE PROBLEM ................................................. 43
6 TESTING ..................................................................... 44
  6.1 WIRELESS MODULES TESTING ................................................ 44
  6.2 TURNING RADIUS .......................................................... 44
  6.3 MAXIMUM WEIGHT CAPACITY ................................................. 44
  6.4 SYSTEM RUN TIME ......................................................... 45
  6.5 SLIPPING ................................................................ 45
7 CONCLUSIONS AND FUTURE WORK ................................................. 46
  7.1 CONCLUSIONS ............................................................. 46
REFERENCES .................................................................... 47
APPENDIX A: SOURCE CODE ....................................................... 52
APPENDIX B: HARDWARE SCHEMATICS ............................................... 53
APPENDIX C: LIST OF COMPONENTS ................................................ 55
APPENDIX D: PROJECT TIMELINE .................................................. 56

Table of Figures
FIGURE 1-1 A STANDARD KINECT .................................................................................................................... 2
FIGURE 1-2 OVERVIEW OF ENTIRE SYSTEM ........................................................................................................ 3
FIGURE 3-1 X-CTU ................................................................................................................................... 13
FIGURE 3-2 THE VIRTUAL REALITY INTERFACE .................................................................................................. 15
FIGURE 3-3 UML USE-CASE DIAGRAM .......................................................................................................... 16
FIGURE 3-4 IMAGE PROCESSING DIAGRAM...................................................................................................... 16
FIGURE 3-5 IMAGE PROCESSING DIAGRAM FOR DECISION MAKING ....................................................................... 17
FIGURE 3-6 C SHARP BASED GUI.................................................................................................................. 17
FIGURE 3-7 C SHARP BASED SPEECH CONTROL ................................................................................................ 19
FIGURE 3-8 VR MODULE ............................................................................................................................ 20
FIGURE 3-9 VR COMMANDER SOFTWARE ....................................................................................................... 21
FIGURE 4-1 ARCHITECTURE OVERVIEW OF SYSTEM ............................................................................................ 23
FIGURE 4-2 KINECT INTERNAL HARDWARE ..................................................................................... 24
FIGURE 4-3 ARDUINO UNO R3 FRONT ........................................................................................................... 25
FIGURE 4-4 ARDUINO MEGA 2560 R3 FRONT................................................................................................. 26
FIGURE 4-5 ARDUINO SOFTWARE ................................................................................................................. 27
FIGURE 4-6 L298 MOTOR DRIVER ................................................................................................................. 28
FIGURE 4-7 L298 CIRCUIT DIAGRAM ............................................................................................................. 28
FIGURE 4-8 XBEE MODULE .......................................................................................................................... 29
FIGURE 4-9 12V LEAD ACID BATTERY ............................................................................................................ 30
FIGURE 4-10 HC-SR04 SENSOR................................................................................................................... 31
FIGURE 4-11 PITTMAN GEAR MOTOR ......................................................................................................... 32
FIGURE 4-12 BASIC 20X4 CHARACTER LCD .................................................................................................... 33
FIGURE 4-13 GAS SENSOR - MQ-4 ............................................................................................................... 34
FIGURE 4-14 ROBOT BASE DESIGN................................................................................................................ 35
FIGURE 4-15 STRUCTURAL DESIGN................................................................................................................ 36
FIGURE 4-16 ROBOT TOP VIEW.................................................................................................................... 37
FIGURE 4-17 LIFTER OF ROBOT .................................................................................................................... 38
FIGURE 5-1 DEVELOPING STAGE ................................................................................................................... 39
FIGURE 5-2 ROBOT STRUCTURE .................................................................................................................... 40
FIGURE 5-3 GUI ....................................................................................................................................... 41
FIGURE 5-4 INTEGRATION OF ALL MODULES ..................................................................................................... 42
FIGURE 6-1 X-CTU, ZIGBEE COORDINATOR, ZIGBEE ROUTER AND ITS CONNECTION WITH ARDUINO .............................. 44
FIGURE 10-1 SCHEMATIC H BRIDGE .............................................................................................................. 53
FIGURE 10-2 H BRIDGE PCB DESIGN............................................................................................................. 53


Table of Tables
TABLE 2-1 COMPARISON OF BLUETOOTH, WI-FI, AND ZIGBEE PROTOCOLS ............................................................... 8
TABLE 4-1 KINECT SPECIFICATIONS ................................................................................................................ 25
TABLE 4-2 ARDUINO UNO SPECIFICATIONS...................................................................................................... 26
TABLE 4-3 ARDUINO MEGA SPECIFICATIONS .................................................................................................... 26
TABLE 4-4 XBEE SERIES 2 SPECIFICATIONS ....................................................................................................... 29
TABLE 4-5 LEAD ACID BATTERY SPECIFICATION................................................................................................. 30
TABLE 4-6 DISTANCE SENSOR SPECIFICATIONS ................................................................................................. 32
TABLE 4-7 ROBOT DIMENSIONS ................................................................................................................... 38


Chapter 1

1 Introduction

In today's era, the robotics industry has been evolving many new trends to increase the efficiency,
approachability and accuracy of its systems. Robots are used to do jobs that are hazardous to
humans, as well as repetitive jobs that are tedious and hectic. Although robots can be used to
replace humans, they still need to be controlled by humans. Besides controlling robotic systems
through physical devices, recent techniques of controlling them through gesture and speech have
become very popular. The core purpose of using gestures and speech is that they are a more natural
way of controlling and provide an intuitive form of interface with the robotic system. Automation
is an essential part of robotics today; automated robots can perform the tasks of loading and
unloading weights according to need. The motivation behind the development of this project
was to create a prototype for testing different human-machine interfaces. This project
contributes to the field of robotics by integrating the following modules:

i. Speech Recognition
ii. Gesture Recognition
iii. Graphical User Interface
iv. Remote Monitoring

During the development of this project we followed a modular approach, because modularity
provides flexibility for further expansion, research and testing through hardware and software
changes to the existing system. The main module used for gesture and speech recognition was the
Kinect sensor.

1.1 Kinect Sensor


The Kinect sensor is an input device developed by Microsoft, initially for the purpose of gaming
with the Xbox 360. In 2012 a Windows edition was released along with its SDKs for development
purposes. The Kinect for Windows sensor from Microsoft is not the same as the motion-tracking,
voice-activated gaming controller for the Xbox 360 known as Kinect for Xbox 360. The Kinect
for Windows sensor is a tool for developers to create new motion-tracking applications for
computers running the Windows operating system. The Kinect for Windows sensor is used
with the Kinect for Windows Commercial Software Development Kit (SDK) [1].
Kinect can take input by means of human body detection, which enables it to act as a powerful tool
for gesture recognition. It can also take voice commands as input through its multiple
microphones. A 3D camera enables it to act as a depth sensor as well, and it can also perform colour
recognition. The algorithm used in this project enables it to act both as a depth sensor and as an RGB
colour sensor [2].
Kinect's gesture recognition is also used in the gesture-based control module of the project.
Kinect's specifications and functionality are discussed in detail in a later chapter.

Figure 1-1 A Standard Kinect


Courtesy: https://www.microsoft-careers.com/content/hardware/hardware-story-kinect/

1.2 Overview of the Entire System


The main objective of this project was to build a prototype that would facilitate the testing of
different algorithms for the motion of a robot. For that purpose, a Kinect camera has been used. The
software used is Processing 2 and Microsoft Visual Studio.
In the gesture control mode, the Kinect camera captures a stream of images that is forwarded to
Processing 2, which runs the motion algorithm. Based on the hand movements, the robot is given
commands according to the motion algorithm. There is a wireless link between the robot and the
computer that issues the commands. The robot has a microcontroller that processes those commands,
and the robotic loader performs the corresponding task.
In speech control mode, the Kinect sensor captures the user's voice, which is forwarded to
Microsoft Visual Studio, which runs the speech recognition algorithm. Based on the voice
command given by the user, the robot is sent commands according to the speech recognition
algorithm. These commands are sent to the robot via a wireless link between the robot and the
computer. The robot processes those commands and performs the corresponding task.
An on-robot obstacle detection and avoidance mechanism is also implemented. When an obstacle
comes in front of the robot, the robot stops to avoid a collision. Moreover, the robot does not move
backwards when an obstacle is present behind it, does not turn left if an obstacle is present on its
left side, and likewise does not turn right if an obstacle is present on its right side.
The robot can also be controlled using a graphical user interface developed in C#.
Wireless monitoring of the robot's status is also implemented. The current state of the robot, battery
voltage status, obstacle warnings, natural gas warnings, distance sensor data and the orientation of
the robot can be monitored remotely. An overview of the entire system in block diagram form is displayed
in figure 1-2.

Figure 1-2 Overview of Entire System

1.3 Major Contribution


This project has contributed to the fields of:


i. Kinect Application Development
ii. Motion Detection
iii. Human Computer Interaction
iv. Speech Recognition

1.4 Description of Thesis


Chapter 2:
This chapter reviews the literature relevant to this project and the different algorithms involved,
providing the contextual material needed to understand the project.

Chapter 3:
This chapter explains the software architecture of the project, including computer vision, the
different control techniques, the communication system, and the various Human-Machine
Interface modules deployed in this project.

Chapter 4:
This chapter explains the hardware architecture of the project. It gives a brief introduction to the
different hardware components used and describes the construction of the robot.

Chapter 5:
This chapter details the implementation of the project, from the development stages through to
system integration.

Chapter 6:
This chapter covers the testing of the robot.

Chapter 7:
This chapter concludes the project and gives a brief overview of everything that has been achieved
through it. It also discusses future work related to this project.

Chapter 2
2 Literature Review
Lately, a significant amount of the time and resources spent in the field of computing has been
directed towards vision and speech. What has drawn so much interest to these subfields is the
ease and accuracy with which the human brain accomplishes certain tasks related to vision and
speech. The underlying idea is that the human brain is essentially an information-processing device,
much like the modern computer. Computer scientists have based many of their ideas and inspirations
on research into speech and computer vision. Although biologists have long delved into and unravelled
a proportion of the mysteries of the brain, we are not yet capable of reproducing its functionality with
a machine. This project aims to have some of the brain's most important image processing and speech
recognition operations accomplished by a computer. It provides some insight into several basic
techniques used in fields of computing such as image processing and speech recognition, and a few
approaches to performing basic obstacle detection for a vehicle [3].

2.1 Robotics
A robot can be defined as a mechanical device which performs automated tasks, either under
direct human supervision, according to a pre-defined program, or following a set of general
guidelines using artificial intelligence techniques. The first commercial robot was used in the
automotive industry by Ford in 1961. Robots were predominantly proposed to substitute for humans
in repetitive, heavy and hazardous processes. Nowadays, for economic reasons, industrial robots are
purposefully used in a wide variety of applications [4].
The world of robotics is one of the most exciting areas and has been through steady growth and
development. Robotics is interdisciplinary and has become more a part of our lives. Robots are no
longer a dream for the future but a reality of the present. In our everyday lives they fill many
essential roles, from cutting the lawn, guarding our unoccupied houses and feeding our pets to
building our cars. It is clear that they are everywhere and that, in the long run, they will become
even more present [5].
Within robotics, special consideration is given to mobile robots, since they have the ability to
navigate their surroundings and are not fixed to one physical location. The tasks they can perform
are endless and the potential for them to help our lives is vast (e.g., unmanned aerial vehicles,
autonomous underwater vehicles). Recently we have seen a real increase in mobile robots. Nowadays
they are the subject of significant research activity, being present in industry, the military, security
and even home settings as consumer products for entertainment and home-support tasks. The mobile
robotics field has numerous challenges that need to be addressed: navigation (mapping, localization,
path planning), environment perception (sensors), autonomous behaviour (actuators, behaviour
rules, control mechanisms), two-legged locomotion (balance), and the ability to work for an extended
period without human intervention (batteries, human help) are examples of open research
opportunities, among many others. Some of these challenges are addressed in this thesis [6].

2.2 Obstacle Detection and Previous work


Obstacle detection can be defined as:
"The determination of whether a given space is free of obstacles for safe travel by an autonomous
vehicle" [3].
Obstacle detection is one of the most prominent problems within the subfield of computer vision, in
terms of both the volume of research it has attracted and its numerous applications. Together with
research into other subfields of artificial intelligence, obstacle detection is crucial for accomplishing
many basic tasks for mobile robots, such as avoidance and navigation.
An examination of various research studies on automated robots shows that there are different
types of obstacle detection sensors. These sensors range in price from inexpensive to very
expensive, and each sensor has its own unique advantages and disadvantages for different
applications. Previous work in obstacle detection has made use of cameras, laser scanners, sonar,
and odometry. Snapshots are taken of the real scene using input devices and the data is processed
by a computer which eventually performs obstacle detection. Since most of these devices can only
take a limited number of shots of a scene in a given time, and a huge amount of data needs to be
processed, obstacle detection can often be quite slow for certain real-time applications, such as the
navigation of high-speed vehicles. State-of-the-art sensors and processors used to perform obstacle
detection have so far only been able to support navigation at speeds of a few metres per second.
Almost all obstacle detection systems use a combination of passive and active technology. The best
solution for obstacle detection is obtained using a vision system combined with a distance sensor
such as sonar or laser, but in our case we have only used sonar sensors for obstacle detection. Sonar
is the most widely used sensor for obstacle detection because it is cheap and simple to operate; it can
be used for vehicle localization, navigation and obstacle detection. We have used sonar for obstacle
detection together with an obstacle avoidance algorithm.
Sonar works by blanketing a target with ultrasonic energy; the resultant echo contains
information about the geometric structure of the surface, in particular the relative depth
information [44].

2.3 System Structure


The robot control system includes the following parts:
i. A gesture and speech recognition system running on a laptop computer.
ii. A Kinect camera connected to the laptop computer.
iii. The robot controller.
iv. A pair of wireless communication modules connected to the gesture/speech recognition
system and the robot controller respectively.
v. An obstacle detection system.

Each gesture corresponds to a different robot control command, and the wireless module is used
to send these commands to the robot controller. The robot then performs the corresponding actions,
and thus human-robot interaction is achieved. The gesture recognition system is developed with the
help of the OpenNI libraries. Moreover, each speech command corresponds to a different robot
control command, and a wireless module is used to send these commands to the robot controller to
perform the desired actions.

2.4 Wireless Communication


Wireless communication in general works through radio signals that are broadcast by an enabled
device through the air, physical surroundings or atmosphere. The transmitting device can be a
source or an intermediate device with the capability to propagate wireless signals. Communication
between two devices takes place when the destination or receiving intermediate device captures
these signals, creating a wireless communication channel between the sender and receiver devices.
Wireless communication uses low-powered radio waves to transmit data between devices.
High-powered transmission sources typically require government licences to broadcast on a
particular wavelength. This platform has traditionally carried voice signals and has developed
into a huge industry, carrying many thousands of broadcasts all over the world. Radio waves are
now increasingly being used by unregulated computer users [7].

2.4.1 Comparison between different wireless communication protocols


Wi-Fi (over IEEE 802.11), Bluetooth (over IEEE 802.15.1), ZigBee (over IEEE 802.15.4), and
ultra-wideband (UWB, over IEEE 802.15.3) are four protocol standards for short-range wireless
communications with low power consumption. From an application viewpoint, Wi-Fi is aimed at
computer-to-computer connections as an extension or replacement of cabled networks, Bluetooth
is intended for devices such as a cordless mouse, keyboard, and hands-free headset, ZigBee is
intended for reliable wirelessly networked monitoring and control networks, while UWB is aimed
at high-bandwidth multimedia links [8]. A comparison between Bluetooth, Wi-Fi and ZigBee is
shown in Table 2-1 [9].

Table 2-1 Comparison of Bluetooth, Wi-Fi, and ZigBee Protocols

We selected ZigBee for the communication between the computer and the robot. ZigBee is the only
standard wireless technology intended to address the requirements of low-cost, low-power wireless
sensor and control networks in just about any market. ZigBee can be used practically anywhere, is
easy to implement and requires little power to operate. As our robot is a mobile system, our main
concerns were power consumption and a wide operating range, so ZigBee was the best choice under
the given conditions.

2.5 Speech and Voice Recognition


As humans, we talk and listen to each other in human-human interaction. Efforts are now being
made to create a human-machine interface in which people and machines can communicate in a
similarly natural way. Clearly such an interface would yield great benefits. Handwriting recognition
has been developed but has failed to fulfil this dream. Efforts are therefore being made to develop
vocally interactive machines, i.e. computers that can produce speech as output given textual input
(speech synthesizers) and understand speech given as input (speech recognizers) [10].
Voice recognition and speech recognition are two rapidly improving technologies that are strongly
related in their intended purpose, but the differences between the two are frequently confused. In
general terms, the key distinction between voice and speech recognition lies in the analysis of the
gathered data and the output of that analysis. Speech recognition captures the spoken word and then
analyses and presents the result as data, whereas voice recognition is concerned with identifying the
individual who is speaking [11].
Voice and speech recognition differ in the way the input is analysed. Both technologies work with
the human voice, converting it into a data stream that can be analysed. Both are rapidly advancing
technologies with numerous applications that can improve convenience, security, law enforcement
efforts, and more. Although "speech recognition" and "voice recognition" are often used
interchangeably, they are distinct technologies with markedly different objectives.

2.5.1 Speech recognition


Speech recognition is the process of capturing spoken words using a microphone or telephone and
converting them into a digitally stored set of words. The quality of a speech recognition system is
evaluated according to two factors: its speed (how well the software can keep up with a human voice)
and its accuracy (the error rate in converting spoken words to digital data) [12].
Speech recognition technology has countless applications. Typically, such software is used for
automatic translation, dictation, hands-free computing, medical transcription, robotics, automated
customer service, and much more. If you have ever paid a bill by telephone using an automated
system, you have probably benefited from speech recognition software.
Speech recognition technology has made tremendous strides within the last decade. Nevertheless, it
still has its shortcomings and nagging problems; current technology is far from recognizing
conversational speech, for instance.
Despite its deficiencies, speech recognition is rapidly growing in popularity. Within the next few
years, experts say, speech recognition will be the standard in telephone networks the world over. Its
spread will be supported by the fact that voice is the main option for controlling automated services
in places where touch-tone telephones are uncommon. Additional uses of speech recognition include
transcription, translation, and automated telephone services. Although the technology has been in
use for several years, speech recognition keeps improving as the data-analysis software develops
further. Some of the challenges faced in creating speech recognition software include limited slang
terms, conversational language, and accurate interpretation of input from people with speech
impediments.

2.5.2 Voice recognition


While speech recognition is the process of converting speech to digital data, voice recognition is
aimed at identifying the person who is speaking.
Voice recognition works by analysing the features of speech that differ between individuals. Every
person has a unique pattern of speech stemming from their anatomy and behavioural patterns.
The applications of voice recognition are distinctly different from those of speech recognition.
Most commonly, voice recognition technology is used to verify a speaker's identity or to determine
an unknown speaker's identity. Speaker identification and speaker verification are the two basic
kinds of voice recognition [13].
Speaker verification is the process of using a person's voice to confirm that they are who they claim
to be. Essentially, a person's voice is used like a fingerprint. Once a sample of their speech is
recorded, it is tested against a database to check whether their voice matches their claimed identity.
Most commonly, speaker verification is applied in situations where secure access is required. Such
systems operate with the user's knowledge and cooperation.
Speaker identification is the process of determining an unknown speaker's identity. Unlike speaker
verification, speaker identification is typically covert and performed without the user's knowledge.

2.6 Existing Gesture Recognition Systems


Many systems exist that are used for controlling a robot through gestures. Some gesture
recognition systems use adaptive colour segmentation, hand finding and labelling with blocking,
and morphological filtering, with gesture actions then found by template matching and
skeletonising; this does not provide dynamicity for the gesture inputs because of the template
matching. Another system provides real-time gesture input to the robot using a machine-interface
device: analogue flex sensors on a hand glove measure finger bending, while hand orientation and
position are measured ultrasonically for gesture recognition. Another approach to human gesture
recognition uses an accelerometer, which gives the most accurate measurements, but this method
is not practical for routine use, because it is invasive and users may forget to wear it. In yet another
approach, gestures are recognized using Microsoft Kinect for Windows. Kinect gathers colour and
depth information using an RGB and an infrared camera respectively. The depth image generated
by the depth sensor is a simplified 3D description; however, we only treat the depth image as an
additional dimension of information and still implement the recognition process in 2D space. The
input data from the Kinect are streams of vectors of twenty body-joint locations acquired through
the standard Application Programming Interface (API) of the Kinect Software Development Kit
(SDK). These joints represent the human body captured by the Kinect sensor. We have used the
Kinect sensor because it is a recent, reasonably priced and extremely practical consumer-market
gaming camera, and because it is a multipurpose device that can be used for both gesture and
speech recognition.


Chapter 3

3 Software Architecture

This chapter details the complete software architecture of the project, including the different
control techniques associated with it and the communication system. It also discusses the various
HMI modules.

3.1 Communication System


The main goal of the communication system was to allow the robot to communicate wirelessly
with a computer. The hardware used is the Xbee Series 2, chosen for its simplicity, cost efficiency
and low power consumption.

3.1.1 Requirements
A ZigBee coordinator communicates with one ZigBee router attached to the robot to ensure its
real-time operation.

3.1.2 Design
One of the Xbee S2 modules is configured as the Coordinator and the other as a Router for
point-to-point, two-way communication. The software used for configuring the Xbees is X-CTU,
software from Digi International that can be used to configure a variety of modems manufactured
by Digi International, including Xbee and Wi-Fi modules. Xbees can either be configured through
this software or directly from any terminal program.

3.1.3 X-CTU
XCTU, shown in figure 3-1, is a free multi-platform application intended to enable developers
to interact with Digi RF modules through a simple-to-use graphical interface. XCTU contains
tools that allow developers to set up, configure and test Xbee RF modules [14].


Figure 3-1 X-CTU

3.1.4 Functionality
Two ZigBee modules are connected directly to each other and are in constant communication.
The coordinator sends commands to the router, which forwards them to the Arduino Mega on the
robot, where they are decoded and the respective action is performed.
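On the robot side, this decoding reduces to reading single bytes from the serial port connected to the Xbee router and switching on the received value. The following is a minimal sketch of that loop; the single-character command codes and the motor helper functions are illustrative assumptions (the project's actual command set is defined in the source code in Appendix A), and the Xbee is assumed to be wired to Serial1 of the Arduino Mega.

// Minimal command-decoding loop for the Arduino Mega.
// The command characters ('F', 'B', 'L', 'R', 'S') and the motor helpers
// are illustrative assumptions; see Appendix A for the actual source code.

void driveForward()  { /* set H-bridge inputs for forward motion */ }
void driveBackward() { /* set H-bridge inputs for backward motion */ }
void turnLeft()      { /* left motor reverse, right motor forward */ }
void turnRight()     { /* left motor forward, right motor reverse */ }
void stopMotors()    { /* disable both motor channels */ }

void setup() {
  Serial1.begin(9600);               // Xbee router (assumed baud rate)
}

void loop() {
  if (Serial1.available() > 0) {
    char cmd = Serial1.read();       // one byte per command
    switch (cmd) {
      case 'F': driveForward();  break;
      case 'B': driveBackward(); break;
      case 'L': turnLeft();      break;
      case 'R': turnRight();     break;
      case 'S': stopMotors();    break;
      default:  break;               // ignore unknown bytes
    }
  }
}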

3.2 HMI Modules

This section discusses the different Human-Machine Interaction modules present in the project.
The modules include gesture-based control and speech recognition. They are very important to
the development of this project, as they serve greatly in testing and implementing algorithms.

3.2.1 Gesture Based Control


Gesture recognition is the mathematical interpretation of human body language using computing
devices such as depth-aware cameras, stereo cameras and wired gloves. The gestures can come
from any bodily state or motion.


NUI, Ubiquitous Computing and Augmented Reality


Gesture recognition is an example of a Natural User Interface (NUI). NUIs are interfaces that are
effectively invisible to the user: the control actions are related to natural, everyday human
behaviour. According to Microsoft, NUI is the next evolutionary phase of human-computer
interaction.
The project also incorporates augmented reality. Augmented reality is a type of virtual reality
in which a composite view is generated for the user. This view is the combination of the real
scene seen by the user and a virtual scene created by the computer that supplements the scene
with added information. The virtual scene created by the computer improves the user's perception
of the virtual world they are viewing or interacting with. In the gesture control
application, the virtual rectangle interface and background scene serve as the augmented reality.
Software Features
The gesture-controlled robot application allows a human to control the robot using hand gestures.
The user can enable augmented reality creation, which creates a virtual scene in the background. The
commands that the user can give to the robot using hand gestures are: move forward, move back,
turn right, turn left, lifter up, lifter down and stop.
The User Interface and Use-Case Diagrams
Gesture recognition is based on two types of gestures, namely offline and online gestures. Offline
gestures are processed after their completion, whereas online gestures are interpreted directly at
run time. In this project, online gestures have been used because the robot is a real-time embedded
system and requires real-time computing.
The design principle followed for the human-computer interaction was to keep it close to the
natural gestures used for driving vehicles and for commanding someone to come closer or move
away; it was also a requirement to make the interface flexible and adaptive.
The human-computer interaction interface is based on hand positions and rectangles. It uses the
position of the hands in three rectangles to identify the command to send, as shown in figure 3-2.
The commands are then sent wirelessly to the robot through ZigBee dongles.


Figure 3-2 The Virtual Reality Interface

Following is the relation between hand positions and commands:

i. Both hands in the upper rectangle corresponds to Move Forward.
ii. Both hands in the middle rectangle corresponds to Stop.
iii. Both hands in the lower rectangle corresponds to Move Backwards.
iv. Left hand in the upper rectangle and right hand in the lower rectangle corresponds to Turn Right.
v. Right hand in the upper rectangle and left hand in the lower rectangle corresponds to Turn Left.
vi. Right hand in the upper rectangle and left hand in the middle rectangle corresponds to Lifter Up.
vii. Left hand in the upper rectangle and right hand in the middle rectangle corresponds to Lifter Down.

The rectangles are created after six seconds of program run time to give the user time to settle.
The position of the interface and the size of the rectangles depend upon the position of the user's
hands and the distance between them at the moment of rectangle creation.
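The decision rule above reduces to checking which rectangle (upper, middle or lower) each hand currently occupies and mapping the pair of results to a command character. The following is a minimal C-style sketch of that mapping; the actual implementation is written in Processing 2 (see Appendix A), and the rectangle representation, coordinate convention and command characters used here are illustrative assumptions.

// Illustrative sketch of the rectangle-based gesture decision rule.
// Rectangle bounds, the screen coordinate convention (y grows downward)
// and the returned command characters are assumptions; the project's
// actual logic is implemented in Processing 2 (see Appendix A).

struct Rect { float top, bottom; };          // only the vertical bands matter here

// Returns 0 = upper, 1 = middle, 2 = lower, -1 = outside all rectangles.
int bandOf(float handY, const Rect bands[3]) {
  for (int i = 0; i < 3; ++i) {
    if (handY >= bands[i].top && handY <= bands[i].bottom) return i;
  }
  return -1;
}

// Maps the bands of the left and right hands to a command character.
char gestureCommand(float leftY, float rightY, const Rect bands[3]) {
  int l = bandOf(leftY, bands);
  int r = bandOf(rightY, bands);
  if (l == 0 && r == 0) return 'F';          // both hands up       -> move forward
  if (l == 1 && r == 1) return 'S';          // both hands middle   -> stop
  if (l == 2 && r == 2) return 'B';          // both hands down     -> move backwards
  if (l == 0 && r == 2) return 'R';          // left up, right down -> turn right
  if (l == 2 && r == 0) return 'L';          // right up, left down -> turn left
  if (l == 1 && r == 0) return 'U';          // right up, left mid  -> lifter up
  if (l == 0 && r == 1) return 'D';          // left up, right mid  -> lifter down
  return 'S';                                // default: stop
}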


Figure 3-3 UML Use-Case Diagram

Figure 3-4 Image Processing Diagram


Figure 3-5 Image Processing Diagram for Decision Making

3.2.2 GUI and Features


The GUI developed in C# for the monitoring and control of the robot is shown in Figure 3-6.

Figure 3-6 C Sharp Based GUI


The Graphical User Interface (GUI), developed in Microsoft Visual Studio, runs on the remote
laptop or computer. It is a multipurpose interface for both monitoring and control of the robotic
loader. The GUI sends and receives data to and from the robotic loader via the Xbee module
connected to the serial port of the laptop/computer.
Following are the main features of the GUI:
I. Controlling of the robot
   i. Using the up arrow of the keyboard, or by clicking the forward button, the robot can move forward.
   ii. Using the down arrow of the keyboard, or by clicking the backward button, the robot can move backward.
   iii. Using the left arrow of the keyboard, or by clicking the left button, the robot can move left.
   iv. Using the right arrow of the keyboard, or by clicking the right button, the robot can move right.
   v. Using the shift key of the keyboard, or by clicking the stop button, the robot can stop.
   vi. By pressing 8 on the numeric keypad, or by clicking the up button, the robot lifter can move up.
   vii. By pressing 2 on the numeric keypad, or by clicking the down button, the robot lifter can move down.
   viii. By pressing 5 on the numeric keypad, or by clicking the lifter stop button, the robot lifter can stop.

II. Monitoring of the robot
   i. Current state of the robot
   ii. Obstacle status
   iii. Batteries' current voltage level
   iv. Accelerometer reading
   v. Colour detector reading
   vi. Distance sensor readings
   vii. Natural gas detector status

3.2.3 Speech Based Control


The speech recognition control application developed in C# is shown in Figure 3-7.


Figure 3-7 C Sharp Based Speech Control

One of the key features of the Kinect for Windows natural user interface is speech recognition.
We have developed a speech-based control application using the Kinect sensor that responds to
spoken commands. The Kinect sensor contains a microphone array consisting of four microphones
arranged linearly. The microphone array is an excellent input device for speech-recognition-based
applications because, through noise suppression and acoustic echo cancellation, it delivers better
sound quality than a comparable single microphone. The Kinect for Windows Runtime Language
Pack (included in the Kinect for Windows SDK) includes a custom acoustic model adapted to the
Kinect sensor's microphone array.
To recognize speech, we create a speech recognition engine object, then create a grammar and load
it into the engine. Finally, we create event handlers for the recognition events and set the input
audio stream to the sensor's audio source stream. In the event handler, the event arguments contain
a confidence level indicating the quality of the recognition; we use this level to accept or reject the
recognized speech as a command for our robot control application.
Following are the main features of the speech application:
i. By saying "forward" or "straight", the robot can move forward.
ii. By saying "backward" or "back", the robot can move backward.
iii. By saying "turn left" or "left", the robot can move left.
iv. By saying "turn right" or "right", the robot can move right.
v. By saying "stop", the robot can stop.
vi. By saying "upward" or "up", the robot lifter can move up.
vii. By saying "downward" or "down", the robot lifter can move down.
viii. By saying "end", the robot lifter can stop.

Moreover, if the user speaks a valid command, the corresponding command is highlighted on the
screen and the animated turtle moves in that direction.

3.2.4 Voice recognition module


Voice recognition is a tool that permits spoken input into systems. When someone talks to a
computer, phone or device, it uses what is said as input to trigger some event. Voice recognition
technology is being used to substitute for other input approaches such as clicking, typing or
selecting. It is a way to make software and devices more user-friendly and to increase
productivity. There are a number of applications and areas where voice recognition is used,
including the military, the medical field, aids for impaired persons, robotics, etc. [15].
The VR module shown in figure 3-8 is a multi-purpose voice recognition module intended to
easily add robust, versatile and cost-effective multi-language speech recognition abilities to nearly
any application developed on the Arduino platform.

Figure 3-8 VR module


Courtesy: http://www.veear.eu/products/easyvr-arduino-shield/


The module supports both Speaker Independent Commands and up to 32 user-defined Speaker
Dependent (SD) commands, triggers and voice passwords. The module supports Speaker
Dependent commands in any language.
The EasyVR Commander software was used to configure our VR module, connected to the PC
through the microcontroller host board. EasyVR Commander is software for recording voice
commands into the module. We defined a group of commands and a password and generated a
basic code template to handle them. We then edited the generated code template to implement our
robot application logic; the template contained all the subroutines and functions needed to handle
the speech recognition jobs.

Figure 3-9 VR Commander Software
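As an illustration of how such a generated template is typically adapted, the sketch below maps a recognized command index to a robot action. It is a simplified stand-in, not the EasyVR library's actual API: the readRecognizedCommand() placeholder represents the library calls that report the result of a recognition attempt, and the command numbering is an assumption that would follow the group defined in EasyVR Commander.

// Simplified stand-in for the EasyVR-generated template: a recognized
// command index is mapped to a robot action. readRecognizedCommand() is a
// placeholder for the library's recognition calls; the numbering below is
// an assumption matching the command group defined in EasyVR Commander.

enum VoiceCommand {
  CMD_FORWARD = 0, CMD_BACK, CMD_LEFT, CMD_RIGHT,
  CMD_STOP, CMD_UP, CMD_DOWN, CMD_LIFTER_STOP, CMD_NONE = -1
};

int readRecognizedCommand() {
  // Placeholder: in the real sketch this wraps the EasyVR recognition
  // calls and returns the index of the matched Speaker Dependent command.
  return CMD_NONE;
}

void handleVoiceCommand(int cmd) {
  switch (cmd) {
    case CMD_FORWARD:     /* drive forward   */ break;
    case CMD_BACK:        /* drive backward  */ break;
    case CMD_LEFT:        /* turn left       */ break;
    case CMD_RIGHT:       /* turn right      */ break;
    case CMD_STOP:        /* stop driving    */ break;
    case CMD_UP:          /* lifter up       */ break;
    case CMD_DOWN:        /* lifter down     */ break;
    case CMD_LIFTER_STOP: /* stop the lifter */ break;
    default:              /* no valid match  */ break;
  }
}

void setup() { }

void loop() {
  handleVoiceCommand(readRecognizedCommand());
}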

The VR module is used in our project for on-robot processing of voice commands to control the
robot. Following are the main features of the voice application:
i. By saying "forward", the robot can move forward.
ii. By saying "back", the robot can move backward.
iii. By saying "turn left", the robot can turn left.
iv. By saying "right", the robot can turn right.
v. By saying "stop", the robot can stop.
vi. By saying "upward", the robot lifter can move up.
vii. By saying "down", the robot lifter can move down.
viii. By saying "ok", the robot lifter can stop.


Chapter 4
4 Hardware Architecture
This chapter discusses the hardware architecture of the project in detail. It includes the
specifications of the components used and sheds light on the various aspects associated with the
design and construction of the robot.

4.1 Architecture Overview


The design of the robot is explained graphically in the diagram shown in Figure 4-1.
The diagram clarifies the overall communication between the modules and their locations.

Figure 4-1 Architecture Overview of system

4.2 Components
Following are the hardware components used in the project.

4.2.1 The Kinect Sensor


As mentioned before, Kinect is an input device used primarily for human body motion sensing.
It can also be used for colour and depth sensing. An inside look of the Kinect is shown in figure
4-2.

Figure 4-2 Kinect Internal Hardware


Courtesy: http://msdn.microsoft.com/en-us/library/jj131033.aspx

The Kinect comprises the following components:

i. An RGB camera for capturing colour images. It can capture images at a resolution of 1280x960.
ii. An IR emitter and an IR sensor for depth sensing. The emitter projects a beam of infrared light
which is reflected back and captured by the sensor. This information is then converted into a
depth image.
iii. A multi-array microphone, containing four microphones, for both recording audio and
localizing its source.
iv. A 3-axis accelerometer for determining the orientation of the Kinect. It is configured for a 2g
range, where g is the acceleration due to gravity.

Following are the specifications of the Kinect [16].


Table 4-1 Kinect Specifications

4.2.2 Arduino
All the processing on the robot is done by an Arduino microcontroller. It takes input from all the
sensors and from the Xbee module and makes decisions after processing the data. It is the brain of
the robot.
Arduino is an open-source electronics prototyping platform based on flexible, easy-to-use
hardware and software [17].

Figure 4-3 Arduino Uno R3 Front


Courtesy: http://arduino.cc/en/Main/arduinoBoardUno


Table 4-2 Arduino Uno Specifications

Figure 4-4 Arduino Mega 2560 R3 Front


Courtesy: http://arduino.cc/en/Main/arduinoBoardMega2560

Table 4-3 Arduino Mega Specifications


Arduino is the first widespread Open Source Hardware platform. It was launched in 2005 to
simplify the process of electronic prototyping and it enables everyday people with little or no
technical background to build interactive products.
The Arduino system is a fusion of two different elements:
i. A small electronic board containing a microcontroller, which makes it easy and affordable to
program a microcontroller, a type of tiny computer found inside millions of ordinary items.
ii. A software application which is used to program the board.

4.2.3 Arduino Software


The open-source Arduino environment is used to write code and upload it to the I/O board. It runs
on Windows, Linux and Mac OS X. The environment is written in Java and based on avr-gcc,
Processing, and other open-source software [18]. We used the Arduino software for programming
the Arduino microcontroller. The basic interface of the Arduino software is shown in figure 4-5.

Figure 4-5 Arduino Software

4.2.4 H-Bridge
An H-bridge is used to drive high-current motors. The L298, shown in figure 4-6, is a high-voltage,
high-current dual full-bridge driver designed to accept standard TTL logic levels and drive inductive
loads such as solenoids, relays, DC and stepping motors. The emitters of the lower transistors of
each bridge are connected together and the corresponding external terminal can be used for the
connection of an external sensing resistor. An additional supply input is provided so that the logic
works at a lower voltage [19]. To obtain a higher current capacity of 4 A, we have paralleled the
outputs of the L298: channel 2 is paralleled with channel 3 and channel 1 with channel 4. The
circuit diagram of the L298 is shown in figure 4-7.

Figure 4-6 L298 motor driver


Courtesy: https://www.sparkfun.com/datasheets/Robotics/L298_H_Bridge.pdf

Figure 4-7 L298 Circuit Diagram


Courtesy: https://www.sparkfun.com/datasheets/Robotics/L298_H_Bridge.pdf
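As an illustration of how the L298 is driven from the Arduino, the sketch below uses the two input pins of one (paralleled) bridge to select direction and PWM on the enable pin to set speed. The pin numbers are assumptions chosen for illustration; the actual wiring is documented in the hardware schematics in Appendix B.

// Illustrative L298 drive for one motor channel: IN1/IN2 select direction,
// PWM on the enable pin sets speed. Pin numbers are assumptions; the real
// wiring is documented in the hardware schematics (Appendix B).

const int IN1 = 7;    // direction input 1 of the (paralleled) bridge
const int IN2 = 8;    // direction input 2 of the (paralleled) bridge
const int EN  = 9;    // enable pin, driven with PWM for speed control

void setup() {
  pinMode(IN1, OUTPUT);
  pinMode(IN2, OUTPUT);
  pinMode(EN,  OUTPUT);
}

// speed: 0..255 PWM duty; forward selects the direction of rotation
void setMotor(int speed, bool forward) {
  digitalWrite(IN1, forward ? HIGH : LOW);
  digitalWrite(IN2, forward ? LOW  : HIGH);
  analogWrite(EN, speed);
}

void stopMotor() {
  analogWrite(EN, 0);        // disable the bridge so the motor coasts
}

void loop() {
  setMotor(200, true);       // run forward at roughly 78% duty
  delay(2000);
  setMotor(200, false);      // run backward
  delay(2000);
  stopMotor();
  delay(1000);
}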

4.2.5 Xbee Module


We used Xbee modules for communication between the robot and the computer. The Xbee, shown in figure 4-8, is a plug-and-play wireless module that works on the IEEE 802.15.4 standard. It is a simple and reliable communication link and supports multiple network topologies.
Xbee Series 2 transceivers have been used because they are low-power devices, ideal for embedded systems with a limited battery resource. They provide fast and reliable point-to-point communication with a low response time. There are other options, such as Bluetooth and Wi-Fi, but they have a higher cost per node and consume more power, which would compromise battery life. Xbee has the flexibility to connect up to about 65,000 nodes, compared to 7 nodes for a Bluetooth scatternet [20].
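In transparent mode the Xbee behaves like a wireless serial cable, so on the robot side the microcontroller only has to read bytes from a serial port. The sketch below is an illustration under that assumption; the single-character commands and the use of Serial1 (Arduino Mega) are example choices, not the exact protocol of the project:

// Illustrative sketch: assumes the Xbee runs in transparent (AT) mode on Serial1
// of an Arduino Mega at 9600 baud; the command characters are example values.
void setup() {
  Serial1.begin(9600);          // Xbee DOUT -> RX1, Xbee DIN -> TX1
}

void loop() {
  if (Serial1.available() > 0) {
    char cmd = Serial1.read();  // one command byte sent by the PC application
    switch (cmd) {
      case 'F': /* drive forward             */ break;
      case 'B': /* drive backward            */ break;
      case 'L': /* spin left                 */ break;
      case 'R': /* spin right                */ break;
      case 'U': /* raise the lifter          */ break;
      case 'D': /* lower the lifter          */ break;
      default:  /* unknown byte: stop robot  */ break;
    }
  }
}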

Figure 4-8 Xbee module


Courtesy: https://www.sparkfun.com/products/10414

Following are its specifications [21].

Table 4-4 Xbee Series 2 Specifications


4.2.6 Batteries
To meet the power requirements of the robot, it is equipped with five rechargeable lead-acid batteries, shown in figure 4-9:

i. Two 12V batteries connected in parallel, supplying the power circuit for the microcontroller, sensors, LCD and related electronics.

ii. Two 12V batteries connected in series to provide 24V for the drive motors of the robot.

iii. One 12V battery for the lifter motor.

Figure 4-9 12V Lead Acid Battery


Courtesy: http://www.batteryspace.com/sealedleadacidbattery12v23ahs1.aspx

Table 4-5 Lead Acid Battery Specification


4.2.7 Distance sensor

Figure 4-10 HC-SR04 Sensor


Courtesy: http://www.dx.com/p/hc-sr04-ultrasonic-sensor-distance-measuring-module-133696#.U5ydMPmSyvs

We have used four distance sensors, one on each side of the robot, for detecting obstacles on any side. The HC-SR04 ultrasonic sensor uses sonar (echo ranging) to determine the distance to an object, much as dolphins or bats do. The module comprises an ultrasonic transmitter, an ultrasonic receiver and a control circuit. The sensor used with the Arduino controller is shown in figure 4-10. Its basic principle of operation is:

1. A HIGH pulse of at least 10 µs is applied to the trigger pin.

2. The module automatically sends a burst of eight 40 kHz pulses and listens for the returning echo.

3. If an echo is received, the echo pin is driven HIGH for a duration equal to the time from sending the ultrasonic burst to receiving its reflection.

4. Distance = ((duration of the HIGH level) × (velocity of sound: 340 m/s)) / 2
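A minimal Arduino sketch of this trigger/echo measurement is given below; the pin numbers are assumptions for illustration, and pulseIn() measures the HIGH time on the echo pin directly:

// Illustrative HC-SR04 reading; the TRIG_PIN/ECHO_PIN assignments are assumptions.
const int TRIG_PIN = 22;
const int ECHO_PIN = 23;

void setup() {
  Serial.begin(9600);
  pinMode(TRIG_PIN, OUTPUT);
  pinMode(ECHO_PIN, INPUT);
}

float readDistanceCm() {
  // A 10 us HIGH pulse on the trigger pin starts a measurement
  digitalWrite(TRIG_PIN, LOW);
  delayMicroseconds(2);
  digitalWrite(TRIG_PIN, HIGH);
  delayMicroseconds(10);
  digitalWrite(TRIG_PIN, LOW);

  // The echo pin stays HIGH for the round-trip time of the 40 kHz burst;
  // a 30 ms timeout makes pulseIn() return 0 when no echo is received.
  unsigned long duration = pulseIn(ECHO_PIN, HIGH, 30000UL);

  // distance = (time x 340 m/s) / 2; 340 m/s is about 0.034 cm per microsecond
  return duration * 0.034 / 2.0;
}

void loop() {
  Serial.println(readDistanceCm());
  delay(100);
}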

The main reason for choosing the HC-SR04 ultrasonic ranging module is its high ranging accuracy and stable performance. Compared with IR ranging modules, the HC-SR04 is less expensive while offering comparable accuracy and a longer ranging distance. Its operation is not affected by sunlight or black material, although acoustically soft materials such as cloth can be difficult to detect with sonar.

The specifications of the sonar sensor are given in table 4-6.



Table 4-6 Distance Sensor Specifications

4.2.8 Motors

Figure 4-11 PITTMAN Gear Motor


Courtesy: http://www.gearseds.com/standard_motor.html

We have used two PITTMAN GM8224 DC brushed gear motors, shown in figure 4-11, to drive the robot: one on the left side and one on the right side of the base.


Specifications:

Supply Voltage: 24.0 VDC

Speed @ Cont. Torque: 720 RPM

Current @ Cont. Torque: 1.41 A

Output Power: 15.5 Watts

Terminal Resistance: 17 Ohms

No Load Current: 0.09 A

Weight (Mass): 231 g

Peak Current: 5.54 A

We have also used one Jye Maw gear motor for the lifter.

Specifications:

Speed: 1260 RPM

Voltage: 12 VDC

4.2.9 LCD Display


A basic 20x4 character LCD, shown in figure 4-12, has been installed on the robot for displaying its different parameters.

Figure 4-12 Basic 20x4 Character LCD


Courtesy: https://www.sparkfun.com/products/256

Specifications:

20 character by 4 line display.

5x8 dots

+5V power supply

1/16 duty cycle

Green LED Backlight

The LCD is interfaced with the Arduino Mega and displays the following information:

The current status of the robot on the first row.

The obstacle distances from all four sonar sensors on the second row.

The accelerometer reading on the third row.

The battery voltage status on the fourth row.
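A sketch of how such a display can be driven with the standard LiquidCrystal library is shown below; the pin mapping and the example values printed on each row are assumptions for illustration only:

#include <LiquidCrystal.h>

// Illustrative sketch: the pin mapping (RS, E, D4-D7) and the displayed values
// are assumptions, not the exact wiring and data of the robot.
LiquidCrystal lcd(12, 11, 5, 4, 3, 2);

void setup() {
  lcd.begin(20, 4);                       // 20 columns, 4 rows
  lcd.print("Status: IDLE");              // row 1: current robot state
}

void loop() {
  lcd.setCursor(0, 1);
  lcd.print("F:025 B:100 L:060   ");      // row 2: sonar distances in cm
  lcd.setCursor(0, 2);
  lcd.print("Accel X:0.1 Y:0.0   ");      // row 3: accelerometer reading
  lcd.setCursor(0, 3);
  lcd.print("Batt 12.4V / 24.1V  ");      // row 4: battery voltages
  delay(500);
}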

4.2.10 Gas Sensor


A gas sensor is also installed on the robot for monitoring methane and CNG. It gives valuable information about hazardous gases present in the field. The MQ-4 gas sensor, shown in figure 4-13, is a simple-to-use compressed natural gas (CNG) sensor. It can sense natural gas (composed mostly of methane, CH4) concentrations in the air anywhere from 200 to 10,000 ppm. This sensor has a fast response time and high sensitivity.

Figure 4-13 Gas Sensor - MQ-4


Courtesy: https://www.sparkfun.com/products/9404
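As a sketch of how the sensor can be read (the analog pin and the alarm threshold below are illustrative assumptions; converting the raw value to ppm would need calibration against the datasheet curves):

// Illustrative MQ-4 reading; GAS_PIN and ALARM_THRESHOLD are assumptions.
const int GAS_PIN = A0;
const int ALARM_THRESHOLD = 400;   // raw 10-bit ADC value chosen for illustration

void setup() {
  Serial.begin(9600);
}

void loop() {
  int reading = analogRead(GAS_PIN);      // 0-1023, rises with gas concentration
  Serial.println(reading);
  if (reading > ALARM_THRESHOLD) {
    Serial.println("WARNING: methane/CNG level high");
  }
  delay(1000);
}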

4.3 Robot Design


This section explains the robot's design, dimensions, electromechanical characteristics and some of the most important functionalities it is capable of performing.

4.3.1 Base Design


The base of the robot is a four-wheel skid-steer drive with two gearboxes. It is easy to design and build. Moreover, it uses a two-motor chain drive, which is powerful and inexpensive, and results in a strong and stable design. The robot base design is shown in figure 4-14.


Skid Steering
Skid steering is a close relative of the differential drive system. It is a driving mechanism for vehicles with either tracks or wheels that uses the differential drive concept. It is mostly used in tracked machines, e.g. tanks, and can also be used in four- or six-wheeled robots. Skid steering, or simply tank-style driving, is a system where each set of wheels or tracks is independently powered: the right and left wheels are driven separately. Steering is accomplished by actuating each side in a different direction or at a different rate, causing the wheels or tracks to skid, or slip, on the ground.

To drive forward, both the right and left sets must be powered forward.

To drive backward, both the right and left sets must be powered backward.

To steer left, or to perform an on-the-spot spin, the left side is put into reverse and the right side into forward, or the right side must run faster than the left.

To steer right, or to perform an on-the-spot spin, the right side is put into reverse and the left side into forward, or the left side must run faster than the right.
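These rules map directly onto the two motor channels. The sketch below illustrates that mapping; the pin numbers are assumptions, and setSide() follows the same direction-plus-PWM pattern described for the L298 in section 4.2.4:

// Illustrative skid-steer mapping; the pin assignments are assumptions.
const int L_IN1 = 7, L_IN2 = 8, L_EN = 9;    // left wheel pair
const int R_IN1 = 4, R_IN2 = 5, R_EN = 6;    // right wheel pair

void setSide(int in1, int in2, int en, int speed) {
  digitalWrite(in1, speed >= 0 ? HIGH : LOW);
  digitalWrite(in2, speed >= 0 ? LOW : HIGH);
  analogWrite(en, constrain(abs(speed), 0, 255));
}

void setLeft(int s)  { setSide(L_IN1, L_IN2, L_EN, s); }
void setRight(int s) { setSide(R_IN1, R_IN2, R_EN, s); }

void setup() {
  int pins[6] = {L_IN1, L_IN2, L_EN, R_IN1, R_IN2, R_EN};
  for (int i = 0; i < 6; i++) pinMode(pins[i], OUTPUT);
}

void loop() {
  setLeft(200);  setRight(200);    // forward: both sides forward
  delay(2000);
  setLeft(-200); setRight(-200);   // backward: both sides reversed
  delay(2000);
  setLeft(-200); setRight(200);    // on-the-spot left spin
  delay(1000);
  setLeft(0);    setRight(0);      // stop
  delay(1000);
}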

The main advantage of skid steering is that the robot can turn on the spot: skid-steer loaders are capable of zero-radius, pirouette turning, which makes them exceptionally manoeuvrable and valued for certain applications.
The disadvantages are that, for a person used to a steering-wheel style system, it is a little unintuitive. Moreover, motors are often not identical, even within the same batch, so some drift may appear that needs adjustment. A skid-steered vehicle is turned by producing a differential velocity between the opposite sides of the vehicle; the design can be improved for lower ground friction by using specially designed wheels such as the Mecanum wheel.

Figure 4-14 Robot Base Design


4.3.2 Structural Design


The robot had to be designed to comply with a number of requirements, the most important being that its loader/lifter was capable of moving goods from one place to another. The structural design of the robot is shown in figure 4-15.

Figure 4-15 Structural Design

A custom-made frame was used for the body of the robot. The frame has two sections: the lower section holds the batteries and motors, while the upper section holds the controller, sensors, LCD and power circuit.
At the front, the robot has a sonar pair (ultrasonic transmitter and receiver) used to detect obstacles. When obstacle detection is active, the robot automatically stops when it detects an obstacle in its path. In the same manner there is a sonar sensor on each side of the robot for obstacle detection.
The following figure 4-16 shows the top view of the robotic loader.


Figure 4-16 Robot Top View

4.3.3 Functionality

The robot can perform different movements: moving forward, moving backward, turning left, turning right, and raising and lowering the lifter.
Moving forward and backward is achieved by driving both DC motors in the same direction, while rotations are achieved by a differential drive, i.e. driving them in opposite directions.
The lifter up/down functionality has been added to allow the robot to pick up goods and place them at the destination.
Obstacle detection is also implemented on the robot: it automatically stops when a hurdle is present in front of it.

4.3.4 Dimensions
The robot has the following dimensions:
Specification        Value
Total Length         50 cm
Total Height         40 cm
Total Width          30 cm
Total Weight         10 kg approx.
Lifter Length        20 cm
Lifter Min. Height   22 cm
Lifter Max. Height   32 cm

Table 4-7 Robot Dimensions

A view of lifter of robot is shown in figure 4-17.

Figure 4-17 Lifter of Robot


Chapter 5
5 Implementation
5.1 Tools
The following tools have been used to develop the system.

Arduino IDE

Microsoft Visual Studio

X-CTU

Processing 2

5.2 Development Stage

When we started our project, we initially divided it into different phases. The following are the distinct phases that we worked through incrementally to complete our project in the given time.

Figure 5-1 Developing Stage


5.2.1 Structure and Design


The first milestone of the project was to design and implement the structure of the robotic loader. Different structural designs were analysed, and a four-wheel, two-motor, chain-driven skid-steering base was selected. A lifter mechanism was then implemented at the front of the robot. The design and structure of the robot are shown below.

Figure 5-2 Robot Structure

5.2.2 Wireless Communication


The next step was to develop wireless communication between the robot and the computer. Different wireless communication protocols were evaluated and the ZigBee protocol was selected. The main reasons for selecting the Xbee Series 2 module were its low power consumption and wide range, since power saving is the main concern when developing mobile embedded systems. Point-to-point communication was established between the computer and the microcontroller on the robot.

5.2.3 Hand Gesture Recognition


Several human-machine interaction modules are present in the project. Gesture recognition is an example of a Natural User Interface (NUI). NUIs are interfaces that are effectively invisible to the user; the control actions in an NUI are related to natural, everyday human behaviour. We implemented gesture recognition using the Microsoft Kinect sensor in Processing 2.


5.2.4 Hurdle Avoidance Mechanism


The next step was to implement a hurdle detection and avoidance mechanism. Sonar sensors were used for the detection of hurdles, and a hurdle avoidance algorithm was implemented on the microcontroller, which takes decisions to avoid hurdles based on the sensor readings.
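A compact sketch of the kind of decision involved is shown below; readFrontCm() is a placeholder standing in for the HC-SR04 routine of section 4.2.7, and the stopping threshold is an illustrative value rather than our tuned one:

// Illustrative avoidance decision; the threshold and readFrontCm() are placeholders.
const float STOP_DISTANCE_CM = 25.0;

float readFrontCm() {
  // Placeholder: on the robot this is the front HC-SR04 measurement.
  return 100.0;
}

void setup() {
  Serial.begin(9600);
}

void loop() {
  float d = readFrontCm();
  if (d > 0 && d < STOP_DISTANCE_CM) {
    // Obstacle ahead: stop the drive motors and raise a warning over the Xbee link.
    Serial.println("OBSTACLE");
  } else {
    // Path clear: carry on with the last motion command received from the PC.
  }
  delay(50);
}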

5.2.5 Graphical User Interface


We then designed a graphical user interface in C# for controlling the robotic loader.

Figure 5-3 GUI

5.2.6 Wireless Monitoring


The next step was to implement a wireless mechanism for monitoring the robot. We integrated the monitoring of different robot parameters into the graphical user interface, including:

The current state of the robot

The obstacle warning status

The current voltage level of the batteries

The accelerometer reading

The colour sensor reading

The distance sensor readings

The natural gas detector status
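On the robot side, one simple way to support such monitoring is to send the parameters periodically as a delimited text line over the Xbee link for the GUI to parse. The sketch below only illustrates that idea; the field order, separators and example values are assumptions, not the exact packet format used by our C# application:

// Illustrative telemetry sender; field order and example values are assumptions.
void sendTelemetry(const char* state,
                   int front, int back, int left, int right,   // sonar distances (cm)
                   float ax, float ay, float az,               // accelerometer (g)
                   float vBatt, int gasRaw) {                  // battery (V), MQ-4 raw
  Serial1.print(state);   Serial1.print(',');
  Serial1.print(front);   Serial1.print(',');
  Serial1.print(back);    Serial1.print(',');
  Serial1.print(left);    Serial1.print(',');
  Serial1.print(right);   Serial1.print(',');
  Serial1.print(ax);      Serial1.print(',');
  Serial1.print(ay);      Serial1.print(',');
  Serial1.print(az);      Serial1.print(',');
  Serial1.print(vBatt);   Serial1.print(',');
  Serial1.println(gasRaw);
}

void setup() {
  Serial1.begin(9600);    // Xbee link to the monitoring GUI (Arduino Mega)
}

void loop() {
  sendTelemetry("IDLE", 25, 100, 60, 80, 0.1, 0.0, 1.0, 12.4, 310);
  delay(1000);
}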

5.2.7 Speech recognition


The Microsoft Kinect sensor opens up many possibilities. One of the key features of the Kinect for Windows natural user interface is speech recognition. The next phase of our project was to develop a speech-based control application using the Kinect sensor that responds to spoken commands for controlling the robotic loader. We developed a C#-based speech recognition application for controlling the robot.

5.2.8 System Integration


Finally, we had to integrate all the modules so that the robotic loader could perform the required tasks. The following diagram shows the flow of data from one module to another.

Figure 5-4 Integration of all modules

5.2.9 Key Components


The key components used in this project are:

i. ZigBee

ii. Arduino controller

iii. H-Bridge

iv. Kinect sensor

v. Sonar sensor

vi. VR module

vii. Gas sensor

viii. Power supply (battery)



5.3 Hurdles
During the implementation of our project we faced several issues. The problems we faced were:

1. Distance sensor malfunctions
2. Difficulty for the robot to follow the required angle and direction due to skidding
3. H-bridge malfunction
4. Wireless communication
5. Power circuit design

5.4 Solutions to the Problems

There were several ways to solve these problems. We evaluated different approaches for implementing a high-current H-bridge circuit. Selecting a wireless communication protocol was also a difficult task; we evaluated different protocols and selected ZigBee. Even so, during the implementation of Xbee communication we faced issues with the configuration of the Xbee modules, which were solved by troubleshooting.


Chapter 6
6 Testing
6.1 Wireless Modules Testing
The configuration of the wireless modules has already been discussed in detail. Here, the connections of the modules to the microcontrollers and their operation are described. The ground pin of each Xbee S2 was connected to the ground pin of the Arduino, and the 3.3V pin of the Xbee was connected to the 3.3V pin of the Arduino. The Data-Out and Data-In pins of the Xbees were connected to the Rx and Tx pins of the Arduino respectively. During transmission a green light blinks on both the coordinator and the router, indicating successful transmission of data. The following figure 6-1 shows how X-CTU can be used to transmit signals.

Figure 6-1 X-CTU, Zigbee Coordinator, Zigbee Router and its connection with Arduino
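For a quick end-to-end check of the link (an illustration of the kind of test performed, not the exact test sketch), the Arduino on the router side can simply echo back every byte it receives, so characters typed in the X-CTU terminal on the coordinator side reappear there:

// Illustrative loopback test for the Xbee link (Arduino Uno hardware serial assumed).
void setup() {
  Serial.begin(9600);       // Xbee DOUT -> RX (pin 0), Xbee DIN -> TX (pin 1)
}

void loop() {
  if (Serial.available() > 0) {
    char c = Serial.read();
    Serial.write(c);        // echo the byte back to the coordinator / X-CTU terminal
  }
}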

6.2 Turning Radius


The turning radius of our prototype is 25 cm.

6.3 Maximum Weight Capacity


The maximum weight lifting capacity of our robot is 1.5 kg.

6.4 System Run Time


When one 12V battery is connected:
The run time of the system in a single charge without motors = 3 Hours and 15 Minutes.
When two 12V batteries are connected in parallel:
The run time of the system in a single charge without motors = 7 Hours and 20 Minutes.
When two 12V batteries are connected in parallel for motors:
The run time of motors in a single charge = 55 Minutes.

Hence the total run time of the system on a single recharge of the batteries is 55 minutes, limited by the motor batteries.

6.5 Slipping
When the robot is running at full speed in the forward direction, the total slip is 1 cm.


CHAPTER 7
7 Conclusions and Future Work
7.1 Conclusions
The robotics field is quite promising and requires a lot of effort to develop an intelligent robot. The ultimate goal of robotics is a superhuman system that personifies all the skills of humans, such as touch, intelligence and sensitivity, without any of their limitations, such as ageing and limited strength. Despite the massive amount of work put in by prior researchers, there is enormous scope for further research and growth in the field of robotics.
The use of the Kinect sensor has brought these concepts into the project and has helped greatly in testing various guidance, navigation and control techniques. It has also contributed to the computer vision techniques that support the eventual goal of robotics. Speech and gesture control give an alternative way of controlling robots; they are a more natural way of controlling devices and make the control of robots easier and more efficient.

7.2 Future Work


This project can be greatly extended in the future from the point of view of an autonomous robotic controller; for example, it could be applied to a wheelchair or used as an agent for military purposes. That would include adding localization and using Google Protocol Buffers for real-time operation. The use of a holonomic drive could increase the potential applications of the project.
A lot of work can be done in creating and implementing new algorithms and obstacle avoidance strategies to make this system compete with human interfaces at a much improved level. Moreover, efficient path discovery algorithms and localization could be implemented for seamless navigation of the robot.


References
[1] L. Holmquest, Listening with Kinect, Microsoft, 12 December 2012. [Online]. Available:
http://msdn.microsoft.com/en-us/magazine/jj884371.aspx. [Accessed 25 March 2014].
[2] Microsoft, Kinect for Windows Sensor Components and Specifications., Microsoft, 06
December 2012. [Online].
Available: http://msdn.microsoft.com/en-us/library/jj131033.aspx. [Accessed 25 May
2014].
[3] Singh, S.; Keller, P., "Obstacle detection for high speed autonomous navigation," Robotics
and Automation, 1991. Proceedings., 1991 IEEE International Conference on , vol., no.,
pp.2798,2805 vol.3, 9-11 Apr 1991
[4] L. Brthes, P. Menezes, F. Lerasle, and J. Hayet. Face tracking and hand gesture recognition
for human robot interaction. In International Conference on Robotics and Automation,
pages 1901-1906, 2004. New Orleans, Louisiana.
[5] Geng Yang, Yingli Lu, motor and motion control system, Tsinghua University
Press,2006.3, pp.85-110.
[6] D. Press, Sci Sports: Killer Robots, 11 March 2013. [Online]. Available:
http://press.discovery.com/us/sci/programs/sci-sports-killer-robots. [Accessed 21 May
2014].
[7] C. Janssen, Wireless Communications, techopedia, 2 June 2006. [Online]. Available:
http://www.techopedia.com/definition/10062/wireless-communications. [Accessed 25 May
2014].
[8] Jin-Shyan Lee; Yu-Wei Su; Chung-Chou Shen, "A Comparative Study of Wireless
Protocols: Bluetooth, UWB, ZigBee, and Wi-Fi," Industrial Electronics Society, 2007.
IECON 2007. 33rd Annual Conference of the IEEE , vol., no., pp.46,51, 5-8 Nov. 2007
[9] J.-S. Lee, Y.-W. Su and C.-C. Shen, A Comparative Study of Wireless Protocols:
Bluetooth, UWB, ZigBee, and Wi-Fi, in 33rd Annual Conference of the IEEE, Taipei,
2007.
[10] R. H. Dictionary, speech recognition, Random House, Inc. , 21 June 2013. [Online].
Available: http://dictionary.reference.com/browse/speech+recognition. [Accessed 25 May
2014].
[11] Luo Zhizeng, Zhao Jingbing, Speech Recognition and Its Application in Voice-based Robot
Control System, International Conference on Intelligent Mechatronics arid Automation,
960-963, 2004.


[12] N. Unuth, What is Speech Recognition?, About.com, 22 May 2014. [Online]. Available: http://voip.about.com/od/voipbasics/a/What-Is-Speech-Recognition.htm. [Accessed 28 May 2014].
[13] Yuan Meng, "Speech recognition on DSP: Algorithm optimization and performance
analysis", The Chinese University of Hong Kong, July 2004
[14] DIGI, Digi your M2M EXPERT. XBEE/XBEE-PRO., Digi International Inc., 3 April
2014. [Online]. Available:
http://www.digi.com/support/productdetail?pid=3430&osvid=0&type=documentation.
[Accessed 25 May 2014].
[15] EasyVR Arduino Shield 2.0, TIGAL KG, 6 June 2013. [Online]. Available:
http://www.veear.eu/products/easyvr-arduino-shield/. [Accessed 11 May 2014].
[16] Kinect for Windows Sensor Components and Specifications, Microsoft, 21 March 2013.
[Online]. Available: http://msdn.microsoft.com/en-us/library/jj131033.aspx. [Accessed 25
May 2014].
[17] Arduino, 22 February 2012. [Online]. Available: http://arduino.cc/. [Accessed 12 May
2014].
[18] Arduino, Arduino Uno, SmartProjects, [Online]. Available:
http://arduino.cc/en/Main/arduinoBoardUno. [Accessed 25 May 2014].
[19] l298 Dual Full Bridge Driver, STMicroelectronics, 15 July 2011. [Online]. Available:
http://www.st.com/web/catalog/sense_power/FM142/CL851/SC1790/SS1555/PF63147.
[Accessed 25 May 2014].
[20] XBee / XBee-PRO ZB (S2) Modules, Digi International Inc., 14 August 2012. [Online].
Available:
http://www.digi.com/support/productdetail?pid=3430&osvid=0&type=documentation.
[Accessed 25 May 2014].
[21] ZigBee Technology, ZigBee Alliance, 1 January 2008. [Online]. Available:
http://www.zigbee.org/About/AboutTechnology/ZigBeeTechnology.aspx. [Accessed 16
April 2014].
[22] IEEE Computer Society, IEEE ICCV Workshop on Recognition, Analysis, and Tracking of
Faces and Gestures in Real-Time Systems, IEEE Computer Society, 2001.
[23] S. Lee, H. Cho, K.-J. Yoon and J. Lee, Intelligent Autonomous Systems 12, in
Proceedings of the 12th International Conference IAS-12, Jeju Island, Korea, June 26-29,
2012.


[24] S. A. Wilson, A real-time obstacle detection vision system for autonomous high speed
robots, University of Louisiana at Lafayette , Lafayette , 2006.
[25] Kinect for Windows, Microsoft, 21 March 2013. [Online]. Available: http://www.microsoftstore.com/store/msusa/en_US/pdp/Kinect-forWindows/productID.253758600. [Accessed 15 March 2014].


[26] W. Technology, Wireless Technology With Examples, Senior Scribe Publications, 10
March 2011. [Online]. Available: http://wordinfo.info/unit/4003/s:technology. [Accessed
25 May 2014].
[27] Raheja, J.L.; Shyam, R.; Kumar, U.; Prasad, P.B., "Real-Time Robotic Hand Control Using
Hand Gestures," Machine Learning and Computing (ICMLC), 2010 Second International
Conference on , vol., no., pp.12,16, 9-11 Feb. 2010
[28] E. Ferro and F. Potorti, "Bluetooth and Wi-Fi wireless protocols: A survey and a
comparison," IEEE Wireless Commun., vol. 12, no. 1, pp. 12-16, Feb. 2005.
[29] Baker, N. "ZigBee and Bluetooth: Strengths and weaknesses for industrial applications,"
IEE Computing & Control Engineering, vol. 16, no. 2, pp 20-25, April/May 2005.
[30] X. Wang, Y. Ren, J. Zhao, Z. Guo, and R. Yao, "Comparison of IEEE 802.11e and IEEE
802.15.3 MAC," in Proc. IEEE CAS Symp. Emerging Technologies: Mobile & Wireless
Commun, Shanghai, China, May, 2004, pp. 675-680.
[31] Leoni F, Guerrini M, Laschi C, etal. Implimenting robotic grasping tasks using a biological
approach[C]. IEEE International Conference on Robotics and Automation, Leuven,
Belgium, 1998,3:2274-2280
[32] Aijing Che, "The Sunplus SPEC061A microcontroller-based voice control system",
Computer development and application, vol. 19, Aug.2006, pp. 49-51.
[33] J.-M. Valin, J. Rouat, and F. Michaud, "Enhanced robot audition based on microphone array
source separation with post-filter," in Proceedings IEEE/RSJ International Conference on
Intelligent Robots and Systems, 2004.
[34] T. Takahashi, S. Nakanishi, Y. Kuno, and Y. Shirai. Helping Computer Vision by Verbal
and Nonverbal Communication. In Proceedings of the 14-th IEEE International Conference
on Pattern Recognition, volume 4, pages 1216-1218, 1998.
[35] H. Sidenbladh, D. Kragic, and H. Christensen. A Person Following Behaviour for a Mobile
Robot. In Proceedings of the IEEE International Conference on Robotics and Automation,
Detroit, MI, USA, pages 670-675, April 1999.


[36] T.C. Lueth, Th. Laengle, G. Herzog, E. Stopp, and U. Rembold. KANTRA - Human-Machine Interaction for Intelligent Robots using Natural Language. In IEEE International Workshop on Robot and Human Communication, volume 4, pages 106-110, 1994.
[37] Jagdish Raheja, Radhey Shyam and Umesh Kumar, Hand Gesture Capture and Recognition
Technique for Real-time Video Stream, In The 13th IASTED International Conference on
Artificial Intelligence and Soft Computing (ASC 2009), September 7-9, 2009 Palma de
Mallorca, Spain.
[38] Peter X. Liu, A. D. C. Chan, R. Chen, K. Wang, Y. Zhu, Voice Based Robot Control,
International Conference on Information Acquisition,543-547, 2005.
[39] W.S.H. Munro, S. Pomeroy, M. Rafiq, H.R. Williams, M.D. Wybrow and C. Wykes,
Ultrasonic Vehicle GuidanceTransducer, Ultrasonics, Vol. 28, pp. 349-354, 1990.
[40] Reza Hassanpour, Stephan Wong, Asadollah Shahbahrami, VisionBased Hand Gesture
Recognition for Human Computer Interaction: A Review, IADIS International Conference
on Interfaces and Human computer Interaction. 25-27 July 2008 Amsterdam, Netherlands .
[41] Jagdish Raheja, Radhey Shyam and Umesh Kumar, Hand Gesture Capture and Recognition
Technique for Real-time Video Stream, In The 13th IASTED International Conference on
Artificial Intelligence and Soft Computing (ASC 2009), September 7-9, 2009 Palma de
Mallorca, Spain.
[42] Raheja, J.L.; Shyam, R.; Kumar, U.; Prasad, P.B., "Real-Time Robotic Hand Control Using
Hand Gestures," Machine Learning and Computing (ICMLC), 2010 Second International
Conference on , vol., no., pp.12,16, 9-11 Feb. 2010
[43] Takai, H.; Miyake, M.; Okuda, K.; Tachibana, K., "A simple obstacle arrangement detection
algorithm for indoor mobile robots," Informatics in Control, Automation and Robotics
(CAR), 2010 2nd International Asia Conference on , vol.2, no., pp.110,113, 6-7 March 2010
[44] N. Harper, and P. McKerrow, Detecting plants for landmarks with ultrasonic sensing,
Proceedings of the International Conference on Field and Service Robotics, pp. 144-149,
1999.
[45] Discant, A.; Rogozan, A.; Rusu, C.; Bensrhair, A., "Sensors for Obstacle Detection - A
Survey," Electronics Technology, 30th International Spring Seminar on , vol., no.,
pp.100,105, 9-13 May 2007


APPENDIX A: SOURCE CODE


The source code is available on the CD provided.


Appendix B: Hardware Schematics

Figure 10-1 Schematic H Bridge

Figure 10-2 H Bridge PCB Design


Figure 10-3 Robot Front View


Appendix C: List of Components

Three DC Gear Motor

Sabertooth dual 12A motor driver

H-Bridge using L298 IC

Voltage Regulator 7805

Accelerometer MMA 7361

X-bee Module Series 2

Kinect Sensor

Sonar Sensor HC-SR04

Arduino Uno

Arduino Mega

Color Sensor ADJD-S311

LCD 4x20

EasyVR module

Circuit Board

Gas Sensor MQ-4


Appendix D: Project Timeline


DATE: MAY 24, 2014
PROJECT ID: 33536
TITLE: Hand Gesture Based Robotic Mobile Loader
TOTAL NUMBER OF WEEKS IN PLAN: 05

No.   STARTING WEEK   DESCRIPTION OF MILESTONE                       DURATION
1     Week 1          Literature review                              4 weeks
2     Week 5          Working strategy                               4 weeks
3     Week 9          Hardware structure design                      3 weeks
4     Week 12         Wireless Communication                         2 weeks
5     Week 14         Hand Gesture Recognition                       3 weeks
6     Week 17         Sensors and modules interfacing/integration    4 weeks
7     Week 21         Hurdle detection/Avoidance mechanism           2 weeks
8     Week 23         Graphical User Interface Development           3 weeks
9     Week 26         Wireless Monitoring                            2 weeks
10    Week 28         Speech Recognition                             3 weeks
11    Week 31         Testing / Troubleshooting                      2 weeks
12    Week 33         Thesis                                         3 weeks
