
Real Time Detection and Recognition of Hand Held Objects to Assist Blind People

Suchit Adak, Ankit Dongre, Jimil Gandhi, Dhruv Purandare, Prof. Archana G. Said
adak.suchit8@gmail.com, nktdongre@gmail.com, jimilgandhi14@gmail.com, dhruvpurandare18@gmail.com
Computer Department

ABSTRACT
In this paper, we implement a system to help blind individuals in their
daily lives. We use Google's TensorFlow library in our project. TensorFlow
is a fast, scalable, open source machine learning library. TensorFlow is
used by many internal Google products, including Gmail, Translate, YouTube,
and Play.

General Terms
Object Recognition

Keywords
Object recognition; Computer vision; Image registration

1. INTRODUCTION
Globally, there are an estimated 36 million blind people, and this number
increases every year. Printed text is omnipresent in the form of receipts,
reports, household statements, menus, classroom notes, product packages,
instructions, etc. Logos are graphical representations that either recall
some real-world objects, focus on a name, or simply display abstract signs
that have strong appeal. Modern advances in computer vision, digital
cameras, and smartphones make it practical to assist these individuals by
developing camera-based products that incorporate computer vision
technology. We present a camera-based object recognition application to
help visually impaired people recognize hand-held objects in their
day-to-day lives. To separate the object from complex backgrounds or other
surrounding objects in the camera view, we first propose an efficient and
effective motion-based method that defines a region of interest (ROI) in
the image by asking the user to hold the object. Reference objects and test
images are described by local features (regions, interest points, etc.). If
the test image matches a reference image, an output signal is created in
the form of audio. The audio is generated on the mobile phone and sent to a
headphone worn by the blind user, so that the user can identify the product
held in hand. In this system, we use TensorFlow, an open source framework
by Google. TensorFlow uses a training data set to learn images and matches
the images in the video scene against those in the training data set. All
of this happens within a few milliseconds, making it adequate for real-time
object recognition.

There have been many attempts in the past to assist blind people, but they
have their drawbacks. For example, portable bar code readers designed to
help blind people identify different products enable blind users to access
information about these products through speech and Braille. A major
limitation is that it is very hard for blind users to find the position of
the bar code and to point the bar code reader at it correctly. [1] Other
systems used Optical Character Recognition (OCR) to recognize text from
product labels. A common problem in the early stage of OCR preprocessing is
adjusting the orientation of text areas, which is a very difficult task for
a blind person. Another problem with such systems is that another person
has to start the system and hand it over to the blind person, whereas we
want the blind person to use the system independently, without such
assistance. Hence, we propose our own system, with which we try to remove
the limitations of these previous works.

3. Proposed System

3.1 System Architecture
We propose an optical sensor-based object recognition system to help
visually impaired persons recognize hand-held objects in their day-to-day
lives. To isolate the object from cluttered backgrounds or other
surrounding objects in the camera view, we first propose an efficient and
effective motion-based method that defines a region of interest (ROI) in
the image by asking the user to hold the object. If the test image matches
a reference image, an output signal is created in the form of an audio
signal. The audio signal is sent to a headphone worn by the blind user, so
that the user can identify the product held in hand. To make sure the
hand-held object is visible in the camera view, we use a mobile camera with
a sufficiently large optical sensor and good clarity. In the proposed
system we use TensorFlow by Google, as it uses machine learning to classify
and recognize images faster than other image processing algorithms.

Volume: 3 Issue: 2 April - 2018 78
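
The motion-based ROI step of Section 3.1 is described only in words, so the following is a minimal sketch of one possible reading, not the implementation used in this work. It assumes that motion is found by simple frame differencing between two grayscale frames, and the function name motion_roi and the threshold of 30 are hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale image as rows of 0-255 intensities


def motion_roi(prev: Frame, curr: Frame, threshold: int = 30
               ) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of the changed pixels, or None."""
    # Rows that contain at least one pixel whose intensity changed noticeably.
    rows = [r for r, (p_row, c_row) in enumerate(zip(prev, curr))
            if any(abs(c - p) > threshold for p, c in zip(p_row, c_row))]
    if not rows:
        return None  # no motion detected between the two frames
    # Columns that contain at least one changed pixel.
    cols = [c for c in range(len(curr[0]))
            if any(abs(curr[r][c] - prev[r][c]) > threshold
                   for r in range(len(curr)))]
    return rows[0], cols[0], rows[-1], cols[-1]


# Two 4x5 frames: a bright "object" appears in rows 1-2, columns 2-3.
previous = [[10] * 5 for _ in range(4)]
current = [row[:] for row in previous]
for r in (1, 2):
    for c in (2, 3):
        current[r][c] = 200

print(motion_roi(previous, current))  # (1, 2, 2, 3)
```

Only the pixels inside the returned box would then be passed on to recognition, which is what keeps a cluttered background out of the result.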

3.2 Process
3.3 Classification
[11] Classification consists of assigning an image to one of many
different categories. One of the most popular datasets used in
academia is ImageNet, composed of millions of classified
images. In recent years, classification models have surpassed
human performance on this task, and it has been considered
practically solved. In our system we use TensorFlow classifiers to
classify objects into several classes/categories, for
example chair, bottle, car, etc.
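
As an illustration of the final step of classification, the sketch below turns raw classifier scores into probabilities with a softmax and picks the most probable class. It does not call TensorFlow and is not the trained model used in this work; the scores are made up, and the names LABELS, softmax, and classify are assumptions introduced here.

```python
import math
from typing import List, Tuple

LABELS = ["chair", "bottle", "car"]  # example classes named in the text


def softmax(scores: List[float]) -> List[float]:
    """Convert raw classifier scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores: List[float],
             labels: List[str] = LABELS) -> Tuple[str, float]:
    """Return the most probable label and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical scores produced by a classifier for one camera frame.
label, prob = classify([0.5, 3.0, 1.0])
print(label, round(prob, 2))  # bottle 0.82
```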

3.4 Localization
Localization is used to find the location of an individual
object inside an image. Localization has many uses, and it can
be used along with classification for categorizing an object into
one of many categories.
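
A localized object is commonly represented as a bounding box. The sketch below assumes that representation and shows intersection over union (IoU), a standard way to measure how well a predicted box agrees with a reference box. The box coordinates are invented for illustration and the helper names are assumptions, not part of the system described here.

```python
from typing import Tuple

# (left, top, right, bottom) in pixels; right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def area(box: Box) -> int:
    """Area of a box; degenerate (empty) boxes have area 0."""
    left, top, right, bottom = box
    return max(0, right - left) * max(0, bottom - top)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, in the range [0, 1]."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]),
               min(a[2], b[2]), min(a[3], b[3]))
    inter = area(overlap)
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0


predicted = (10, 10, 50, 50)  # hypothetical detector output
reference = (30, 30, 70, 70)  # hypothetical ground-truth box
print(round(iou(predicted, reference), 3))  # 0.143
```

An IoU of 1 means the two boxes coincide, and 0 means they do not overlap at all.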

3.5 Instance Segmentation

Going beyond localization, we want to obtain a pixel-by-pixel
mask of each individual detected object. We refer to this as
instance segmentation.
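
To make the idea of a per-object pixel mask concrete, the sketch below labels each connected foreground region of a binary mask with its own id. This is a deliberately simple stand-in (connected-component labeling by flood fill), not a learned instance segmentation model, and it can only separate objects that do not touch. The function name label_instances is an assumption.

```python
from typing import List

Mask = List[List[int]]


def label_instances(foreground: Mask) -> Mask:
    """Give each 4-connected foreground region its own id (1, 2, ...).

    Background pixels keep the value 0.
    """
    height, width = len(foreground), len(foreground[0])
    labels = [[0] * width for _ in range(height)]
    next_id = 0
    for r in range(height):
        for c in range(width):
            if foreground[r][c] and not labels[r][c]:
                next_id += 1
                labels[r][c] = next_id
                stack = [(r, c)]
                while stack:  # flood fill from the seed pixel
                    y, x = stack.pop()
                    for ny, nx in ((y - 1, x), (y + 1, x),
                                   (y, x - 1), (y, x + 1)):
                        if (0 <= ny < height and 0 <= nx < width
                                and foreground[ny][nx]
                                and not labels[ny][nx]):
                            labels[ny][nx] = next_id
                            stack.append((ny, nx))
    return labels


foreground = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
for row in label_instances(foreground):
    print(row)
# [1, 1, 0, 0]
# [1, 0, 0, 2]
# [0, 0, 2, 2]
```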

4. Block Diagram
Fig. 1 shows a rough idea of how our proposed system
works. The blind person opens the app and points the camera
towards an object in front of them. The object in the image is then
localized, detected, and matched with a reference image present
in the database. If a match is found, output audio is generated and
sent to the headphone worn by the blind user.

Fig 1: Basic Steps

Fig 2: Block Diagram
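
The last stage of the block diagram, deciding whether a match should produce audio, can be sketched as follows. The reference database contents, the minimum score of 0.6, and the function name audio_message are all assumptions made for illustration. The sketch returns the text that would be handed to a text-to-speech engine and does not produce sound itself.

```python
from typing import Dict, Optional

# Hypothetical reference database: recognized label -> spoken name.
REFERENCE_DB: Dict[str, str] = {
    "bottle": "Bottle",
    "knife": "Knife",
    "cup": "Cup",
}


def audio_message(label: str, score: float,
                  min_score: float = 0.6) -> Optional[str]:
    """Return the text to speak, or None if there is no confident match."""
    if score < min_score or label not in REFERENCE_DB:
        return None  # stay silent rather than announce a doubtful result
    return f"{REFERENCE_DB[label]} detected"


print(audio_message("cup", 0.95))    # Cup detected
print(audio_message("cup", 0.30))    # None (score too low)
print(audio_message("chair", 0.90))  # None (not in the database)
```

Staying silent on a weak match is a design choice in this sketch: for a blind user, a wrong announcement is likely to be more harmful than no announcement.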


Fig 3(a): Implementation Result
Fig 3(b): Implementation Result

5. Result
Table 1 shows the results of our implementation.

Table 1. Object recognition result

Object   Time taken to recognize    Accuracy   Time taken to recognize
         in portrait orientation               after change in orientation
Bottle   0.01s                      80%        0.019s
Knife    0.031s                     94%        0.035s
Cup      0.001s                     95%        0.023s

5.1 Result Analysis and Conclusion
From Table 1 it is evident that the time taken to recognize an
image is very low when it is in portrait mode, that is, when the
orientation is orthogonal. The accuracy with which the object
is detected is also significant (80% to 95%). When the object is
rotated by 45° to either side, there is only a small additional
delay (at most 0.022 s in Table 1), which is negligible. Hence, we
can say that our proposed system can be used in practice to assist
blind persons in their daily lives.

6. ACKNOWLEDGMENTS
We would like to express our gratitude to our project guide, Prof.
Archana G. Said, for her expert advice and encouragement
throughout this difficult project, as well as to project coordinator
Dr. K.S. Wagh and Head of Department Prof. S.N. Zaware.
Without their continuous support and encouragement, this
project might not have been possible.

7. REFERENCES
[1] Samruddhi Deshpande, Ms. Revati Shriram, "Real Time
Text Detection and Recognition of Hand Held Objects to
Assist Blind People", 2016 International Conference on
Automatic Control and Dynamic Optimization
Techniques (ICACDOT), International Institute of
Information Technology (IIIT), Pune.
[2] Jia Xingteng, Wang Xuan, Dong Zhe, "Image Matching
Method Based On improved SURF Algorithm", IEEE
International Conference on Computer and
Communications (ICCC), pp. 142-145, 2015.
[3] Runqing Zhang, Yue Ming, Juanjuan Sun, "Hand gesture
recognition with SURF-BOF based on Gray threshold
segmentation", ICSP2016, 978-1-5090-1345-6/16/31.00,
2016 IEEE.
[4] Juan and O. Gwon, "A Comparison of SIFT, PCA-SIFT
and SURF", International Journal of Image
Processing (IJIP), 3(4):143-152, 2009.
[5] Hanen Jabnoun, Faouzi Benzarti, and Hamid Amiri,
"Object recognition for blind people based on features
extraction", IEEE IPAS14: International Image
Conference 2014.
[6] Ricardo Chincha and Ying Li Tian, "Finding Objects for
Blind People Based on SURF Features", 2011 IEEE
International Conference on Bioinformatics and
Biomedicine Workshops.
[7] Payal Panchal, Gaurav Prajapati, Savan Patel, Hinal Shah
and Jitendra, "A Review on Object Detection and
[...] FOR RESEARCH IN EMERGING SCIENCE AND
TECHNOLOGY, VOLUME-2, ISSUE-1, JANUARY-2015.
[8] Lukas T, Hendrik, Andrea Finke and Helge Ritter,
CITEC, "Gaze-contingent audio-visual substitution for
the blind and visually impaired", 2013 7th International
Conference on Pervasive Computing Technologies for
Healthcare and Workshops.
[9] Chucai Yi, Student Member, IEEE, Yingli Tian, Senior
Member, IEEE, and Aries Arditi, "Portable Camera-
Based Assistive Text and Product Label Reading From
Hand-Held Objects for Blind Persons", IEEE/ASME
Transactions on Mechatronics, Vol. 19, No. 3, June 2014.
[10] Hanen Jabnoun, Faouzi Benzarti, Hamid Amiri, "Object
Detection and Identification for Blind People in Video
Scene", 2015 15th International Conference on
Intelligent Systems Design and Applications (ISDA).
[11] https://tryolabs.com/blog/2017/08/30/object-detection-an-overview-in-the-age-of-deep-learning/
