
Proposal and Implementation of a Novel Scheme for
Image and Emotion Recognition Using Hadoop


Parag Saini1, Tanupriya Choudhury2, Praveen Kumar3, Seema Rawat4
Amity University Uttar Pradesh, Noida1,2,3,4
paragsaini22@gmail.com1, tchoudhury@amity.edu2, pkumar3@amity.edu3, Srawat1@amity.edu4

Abstract—Digital media, and social media in particular, has become very popular in recent years, and as a result there are far more videos than ever before. These videos are generally neither tagged nor classified. This paper targets parallel video processing on Hadoop for fast processing and automatic emotion tagging. The system recognizes faces and tags emotions automatically on Hadoop clusters for fast and efficient performance. This makes the human work of classifying videos easier, and processing on a parallel system makes it quick.

Keywords: Automatic emotion tagging, parallel video processing, face recognition
I. INTRODUCTION after that resize image and last one is image raster scan. The
The growth of videos on the internet has accelerated in recent years because of social media and sites like YouTube and Dailymotion. According to recent studies [1], more than 60% of people watch videos online. These videos are generally neither classified nor tagged properly, which makes video searching a difficult task.

Tagging is the process of assigning metadata to videos, which helps to organize and manage the resources. Video searching depends upon these tags, which contain metadata about the videos, such as the most viewed video. Other important information about the video is often not provided.

In 2004 [1], data was first processed by MapReduce clusters; since then, MapReduce has been used for large amounts of data. Video processing can be accelerated by MapReduce, and the technique is cheap, scalable, and capable for video processing. It uses Apache Hadoop and some open-source projects. A large amount of video can be handled, and the time taken to process the data reduced, by utilizing the clusters. The system's automatic tagging process can run in parallel with other systems.
[7] Video is by nature a structured medium, and our first step is to segment it into temporal units. The histogram technique separates each frame into smaller blocks, and the "histogram difference" of successive frames is taken. Many existing approaches to key frame extraction depend on detected shots; one such technique is shot boundary detection. According to shot boundary detection, the video is first read as input and processed by the shot boundary method. Frames are then divided into sub-frames by the same method, and the difference between corresponding sub-frames gives the block difference. From the block differences, the mean deviation and standard deviation are calculated; these give the threshold. If the calculated threshold is smaller than the block difference of a frame, that frame becomes a key frame. A minimal sketch of this procedure follows.
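The following is a minimal illustrative Python/OpenCV sketch of the block-difference criterion just described, not the paper's own implementation. The grid size (4x4 blocks), the 64-bin histograms, and the use of mean plus standard deviation as the threshold are assumptions filled in for the sketch.

```python
import cv2
import numpy as np

def block_histogram_difference(frame_a, frame_b, blocks=4):
    """Split two frames into blocks x blocks sub-frames and sum the
    histogram differences of corresponding sub-frames."""
    gray_a = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    gray_b = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)
    h, w = gray_a.shape
    bh, bw = h // blocks, w // blocks
    diff = 0.0
    for i in range(blocks):
        for j in range(blocks):
            block_a = gray_a[i*bh:(i+1)*bh, j*bw:(j+1)*bw]
            block_b = gray_b[i*bh:(i+1)*bh, j*bw:(j+1)*bw]
            hist_a = cv2.calcHist([block_a], [0], None, [64], [0, 256])
            hist_b = cv2.calcHist([block_b], [0], None, [64], [0, 256])
            diff += float(np.abs(hist_a - hist_b).sum())
    return diff

def key_frames(path):
    """Return the frames whose block difference exceeds the threshold."""
    cap = cv2.VideoCapture(path)
    ok, prev = cap.read()
    diffs, frames = [], []
    while ok:
        ok, cur = cap.read()
        if not ok:
            break
        diffs.append(block_histogram_difference(prev, cur))
        frames.append(cur)
        prev = cur
    cap.release()
    # Threshold derived from the mean and standard deviation of the
    # block differences, as described above (assumed combination).
    threshold = np.mean(diffs) + np.std(diffs)
    # A frame becomes a key frame when its block difference is larger
    # than the threshold.
    return [f for f, d in zip(frames, diffs) if d > threshold]
```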

Face detection and recognition are done together. First, the video stream is taken as input and converted into an image sequence of frames. Important key frames are then extracted from the image sequence by techniques such as PCA or EGM, after which the face is detected and tracked. The next step is image processing and face alignment, which consists of three parts: histogram equalization, image resizing, and an image raster scan. The extracted features are used to create a face print, and the features are then matched against the trained set.

Fig. 1. Face detection

Video tagging is defined as the collection of content in a video, such as a scene or a shot, together with information about it; this description is much more relevant than annotating the whole video.

Through this paper, an efficient method is proposed to detect and recognize faces in video and to automatically tag the emotions, done on parallel systems with the help of Hadoop clusters. The key objective of this paper is to recognize faces from video, extract key features, and use image recognition techniques so that a character can be identified using trained data, and so that important information is labelled and classified under different classes. A template consists of important information about the character, such as a name, and pre-configured nodal points. The nodal points are calculated mathematically to produce a face print, which refers back to the database. The face is recognized and emotions are tagged automatically. Because this process is very slow, Hadoop clusters are used to make it more efficient through parallel processing.

The techniques generally used for feature extraction are described below:

EGM: [3] In this method, the data is gathered and stored, and the facial features are processed as an image graph. Jets are local image descriptors transformed by a wavelet transform; they are the results of Gabor wavelets, which have a 2-D wave field. The jets form a graph that represents the face of the character. If a graph matches the stored geometry, the faces are considered the same.

PCA: [3] This is one of the most widely used algorithms in facial recognition. Eigenfaces are formed by extracting the principal components of the face images. Whenever there is a new image, we must determine whether or not it is a face image; if it is, the person is recognized by the weight pattern. PCA is expensive to operate and does not work well with complex images, yet it remains the most widely used method because it is robust and unsupervised. A small sketch of this approach follows.
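As a rough illustration of the eigenface idea, the sketch below projects training faces onto a few principal components and recognizes a probe by its nearest weight pattern. It is a minimal Python/NumPy sketch, not the system's implementation; the flattened-image input format and the number of components are assumptions.

```python
import numpy as np

def train_eigenfaces(faces, n_components=20):
    """faces: (n_samples, h*w) matrix of flattened, equal-sized face images."""
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # Eigenfaces are the principal components of the centered face images.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    eigenfaces = vt[:n_components]
    # Weight pattern: projection of each training face onto the eigenfaces.
    weights = centered @ eigenfaces.T
    return mean_face, eigenfaces, weights

def recognize(probe, mean_face, eigenfaces, weights, labels):
    """Return the label of the training face with the closest weight pattern."""
    w = (probe - mean_face) @ eigenfaces.T
    distances = np.linalg.norm(weights - w, axis=1)
    return labels[int(np.argmin(distances))]
```

Deciding whether the probe is a face at all (as the text requires) could be done by thresholding its reconstruction error from the eigenface subspace; that check is omitted here for brevity.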
[A] Video Annotation
To improve searching, classification, and indexing of videos, labels and tags must be assigned to the video content. Automated video tagging can be done in two ways. The first is "open-set tagging", in which [9] extraction is required and the tags for the extracted video content can be chosen from information associated with the content, such as phrases, groups of words, or sentences. The second is "closed-set tagging", which classifies tags into predefined classes such as cricket or news reports. A hierarchical method is used to store the video annotation data, grouped according to semantic levels: a video shot corresponds to a feature vector, which in turn maps to a shot label; a number of labels map to a video event, so that each layer of video annotation is calculated from the previous layers of annotation data.

A multimedia content descriptor such as MPEG-7 is used to store video annotation in this hierarchical manner. For fast retrieval of videos, the description is associated with the content. XML is used so that the metadata can be stored, associated with time-codes, and synchronized with events. According to the standards of MPEG [ ], a Description consists of a Descriptive Scheme (the structure) and the set of Descriptor values (the instantiations) that describe the annotated data. A Descriptor value is an instantiation of a Descriptor for a given data set. The DDL [10] defines the structural relations among Descriptors and is based on XML. An illustrative annotation record is sketched below.
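To make the XML-based annotation storage concrete, here is a minimal Python sketch that serializes a shot-level tag with a time-code. The element names are purely illustrative and do not follow the actual MPEG-7 description schemes.

```python
import xml.etree.ElementTree as ET

def annotation_xml(video_id, shot_label, start_ms, end_ms, emotion):
    """Build an illustrative (non-MPEG-7) annotation record for one shot."""
    root = ET.Element("VideoAnnotation", id=video_id)
    shot = ET.SubElement(root, "Shot", label=shot_label)
    # Time-code lets the tag be synchronized with events, as described above.
    ET.SubElement(shot, "TimeCode", start=str(start_ms), end=str(end_ms))
    ET.SubElement(shot, "Emotion").text = emotion
    return ET.tostring(root, encoding="unicode")

print(annotation_xml("v001", "interview", 12000, 15500, "happy"))
```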
[B] Key Feature Extraction Algorithm

The algorithm focuses mainly on techniques that differ in how they capture the fundamental dynamics of the video sequence. A key frame extraction algorithm is explained below (a sketch follows the steps):
Step 1: Calculate the dissimilarity between each general frame and the reference.
Step 2: Locate the maximum difference within the shots.
Step 3: Depending upon these relationships, determine the shots that lie between the maximum and the mean deviation.
Step 4: Calculate the position of the key frame.
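A literal reading of Steps 1-4 can be sketched as follows. This is a minimal Python/NumPy sketch that assumes the first frame of the shot serves as the reference and a mean absolute grayscale difference as the dissimilarity measure; the paper fixes neither choice.

```python
import numpy as np

def key_frame_position(shot_frames):
    """shot_frames: grayscale frames (2-D arrays) belonging to one shot."""
    reference = shot_frames[0].astype(np.float64)
    # Step 1: dissimilarity of every general frame against the reference.
    dissim = np.array([np.abs(f - reference).mean() for f in shot_frames])
    # Step 2: locate the maximum difference in the shot.
    max_d = dissim.max()
    # Step 3: candidate frames lying between the mean deviation and the maximum.
    mean_d = dissim.mean()
    candidates = np.where((dissim >= mean_d) & (dissim <= max_d))[0]
    # Step 4: position of the key frame; the paper leaves the exact rule open,
    # so this sketch picks the candidate with the largest dissimilarity.
    return int(candidates[np.argmax(dissim[candidates])])
```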

Fig. 2. Working of MPEG-7

Fig. 3. Working of key frame extraction



[C] Apache Hadoop System Architecture

The system is basically divided into two sub-systems: task scheduling and distributed storage. The task scheduling system consists of a JobTracker, the master that schedules MapReduce tasks among the clusters, and the TaskTrackers. The NameNode is the master node of the storage system and its DataNodes. A special node called a slave node is deployed with a DataNode and a TaskTracker, and MapReduce tasks are sent to the slaves to improve performance. Through this framework we can manage storage, perform failure recovery, and schedule tasks.

Fig. 4. Apache Hadoop architecture
[5] Fuse-DFS is a sub-part of the Apache Hadoop system. It acts as an interface that fills the gap between the local file system and HDFS, so that programs designed for the local file system can benefit.

The two most commonly used libraries in computer vision are OpenCV and FFMPEG; video processing would not be complete without them. With the help of Fuse-DFS we can use them in Hadoop. They were initially made to run only in C/C++, while Hadoop runs in Java. A further project hosted on Google Code, named JavaCV, gives a better solution by porting the video processing libraries, including OpenCV and FFMPEG, to Java on multiple operating systems such as Linux, Windows, and Android, with support for hardware acceleration.

The system is explained below. HDFS provides the distributed storage services for the video data; Fuse-DFS mounts the distributed files into the local file system; and JavaCV ports the two video processing libraries, OpenCV and FFMPEG, to Java. For concurrent processing of the video data, the MapReduce programming model is used.

The key feature of MapReduce is encapsulating data in key-value pairs, which lets the Mappers and Reducers work concurrently via parallel processing.

HDFS is mounted onto the local file system by Fuse-DFS, and the video data present on HDFS is made available to JavaCV. The video analysis ability of JavaCV is inherited from OpenCV and FFMPEG, which makes the libraries available from video IO to MapReduce.

Fig. 5. Internal working of Apache Hadoop

The procedure of the process is given below (a sketch follows the list):
Video data is read by the RecordReader through the interface given by JavaCV; the data is encapsulated into key-value pairs and submitted to the InputFormat.
Multiple key-value pairs can be accepted by one InputFormat, which is provided by the RecordReader. All key-value pairs are collected by the InputFormat and then submitted to the Mappers.
These key-value pairs are then grouped according to the algorithm's requirements and dispatched to the Reducers.
The key-value pairs are processed by the Reducers, and the final results are submitted to the OutputFormat.
These results are then written to HDFS by the RecordWriter, which is employed by the OutputFormat.

Hadoop processes videos concurrently with JavaCV. There are multiple DataNodes in which the processing is done; each process runs on a different DataNode, so the processing happens in parallel and the system works concurrently. This improves efficiency and makes the system faster.
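The paper's pipeline uses JavaCV with Hadoop's Java InputFormat/RecordReader classes. As a minimal illustration of the same map/group/reduce flow, here is a Hadoop Streaming-style pair of Python scripts; the input line layout (video id, frame index, frame path) and the placeholder per-frame "tag" are assumptions for the sketch, and real frame analysis would replace the stand-in logic.

```python
#!/usr/bin/env python3
# mapper.py -- emits one key-value pair per processed frame.
# Input lines (from an assumed upstream frame-extraction step) look like:
#   <video_id>\t<frame_index>\t<path_to_frame_on_mounted_HDFS>
import sys

for line in sys.stdin:
    video_id, frame_index, frame_path = line.rstrip("\n").split("\t")
    # Real per-frame work (face/emotion detection) would happen here;
    # a placeholder tag keeps the sketch self-contained and runnable.
    tag = "face" if int(frame_index) % 2 == 0 else "no_face"  # stand-in result
    print(f"{video_id}\t{frame_index}:{tag}")
```

```python
#!/usr/bin/env python3
# reducer.py -- Hadoop Streaming delivers mapper output sorted by key,
# so consecutive lines with the same video_id form one group.
import sys
from itertools import groupby

def keyed(lines):
    for line in lines:
        key, value = line.rstrip("\n").split("\t", 1)
        yield key, value

for video_id, group in groupby(keyed(sys.stdin), key=lambda kv: kv[0]):
    tags = [value for _, value in group]
    # One combined record per video, written via the OutputFormat to HDFS.
    print(f"{video_id}\t{','.join(tags)}")
```

Such scripts would typically be launched through the hadoop-streaming jar with its -input, -output, -mapper, and -reducer options, mirroring the RecordReader-to-RecordWriter flow described above.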



II. PROPOSED WORK

After analyzing the existing system and the new techniques available, a system is designed that recognizes the face as well as the emotion, which gives one extra feature for tagging and is helpful in classifying videos. The system runs on Hadoop, which increases its efficiency. This paper proposes an automatically tagged image and emotion recognition system.

Step 1: The video frame rate is first trimmed from the original 25-29 FPS down to 10-15 FPS. This is done to achieve efficiency and to process less video per second while still catching the key frames easily. A sketch of this step follows.
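The frame-rate reduction of Step 1 can be sketched by sampling frames at a fixed stride. This is a minimal Python/OpenCV sketch; the target rate of 12 FPS is an assumed value inside the 10-15 FPS range named above.

```python
import cv2

def downsample_frames(path, target_fps=12):
    """Yield frames at roughly target_fps instead of the native 25-29 FPS."""
    cap = cv2.VideoCapture(path)
    native_fps = cap.get(cv2.CAP_PROP_FPS) or 25.0  # fall back if unknown
    stride = max(1, round(native_fps / target_fps))
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % stride == 0:  # keep every stride-th frame
            yield frame
        index += 1
    cap.release()
```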

Fig. 6. Video stream to image sequence

Step 2: The key frame is extracted and detected. This is facilitated by subtracting consecutive frames, a technique that flags potential frames where movement due to a living or non-living entity is detected. Since this technique may not detect very minuscule movements, such as the flinching of eyes, the candidates are filtered further in the next step.

Fig. 7. Metadata with probe image

Step 3: All of the background is suppressed using image thresholding. Frames passed from the previous steps are cross-checked with the technique used in Step 4, so after this process we are left only with rectangular windows containing facial pixels (the faces present in that frame); see the sketch after Fig. 8.

Fig. 8. Detected faces and suppression of non-facial pixels
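Steps 2 and 3 can be sketched as below in Python/OpenCV. The paper suppresses the background with thresholding and cross-checks against the GFK of Step 4; this sketch substitutes OpenCV's stock Haar cascade as a stand-in face locator, and the motion threshold value is an assumption.

```python
import cv2

FACE_CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def motion_frames(frames, motion_thresh=8.0):
    """Step 2: keep frames whose difference from the previous frame
    indicates movement (may miss minuscule motion such as eye flinching)."""
    prev = None
    for frame in frames:
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev is not None:
            if float(cv2.absdiff(gray, prev).mean()) > motion_thresh:
                yield frame
        prev = gray

def facial_windows(frame):
    """Step 3: suppress the background, returning only the rectangular
    windows that contain facial pixels."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = FACE_CASCADE.detectMultiScale(gray, 1.1, 5)
    return [frame[y:y+h, x:x+w] for (x, y, w, h) in faces]
```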
Step 4: To recognize the face, a "GFK (General Face Knowledge)" is used first of all; this is basically a massive database of metadata and information about the images in the database.

Fig. 9. Face recognition using Elastic Graph Matching

Step 5: Using the Bezier curve algorithm, we detect the important facial features that reveal emotion, such as facial expressions, lip flinching, and rolling of the eyes. The data obtained from these steps is compared with existing data linked to an emotion/expression, and the emotion data is converted into XML format according to the emotion shown in that particular frame. A small sketch of the curve matching follows.

Fig. 10. Emotion detection through a Bezier curve
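The Bezier-curve comparison of Step 5 can be illustrated as follows: a cubic Bezier curve is evaluated from four control points (for example, landmarks along a lip contour) and compared against stored curves for known expressions. This is a minimal Python/NumPy sketch; the landmark extraction and the stored per-emotion reference curves are assumed to exist upstream.

```python
import numpy as np

def cubic_bezier(p0, p1, p2, p3, n=20):
    """Evaluate a cubic Bezier curve through four 2-D control points."""
    t = np.linspace(0.0, 1.0, n)[:, None]
    return ((1 - t)**3 * p0 + 3 * (1 - t)**2 * t * p1
            + 3 * (1 - t) * t**2 * p2 + t**3 * p3)

def match_emotion(lip_points, emotion_curves):
    """Compare a lip curve with stored per-emotion curves; return best label.
    lip_points: four 2-D control points taken from facial landmarks.
    emotion_curves: {label: reference curve of the same shape}."""
    curve = cubic_bezier(*[np.asarray(p, dtype=float) for p in lip_points])
    # Closest stored curve (Euclidean distance) wins.
    return min(emotion_curves,
               key=lambda label: np.linalg.norm(curve - emotion_curves[label]))
```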



Step 6: The image recognition and emotion recognition run in parallel, and both the facial emotion and the image are automatically tagged in this step.

Fig. 11. Flow diagram of the working system

This diagram shows how the steps above work together. First we reduce the frames per second and extract the key frames by differencing the images. We then detect faces by suppressing all non-face pixels. In parallel, we extract facial expressions for image recognition: we compare the image against our database and, in parallel, compare the emotions against our database. Finally, we automatically tag both the face and the emotion.

III. RESULT

We find that when both image and emotion recognition are done on one system they take a lot of time, whereas on Hadoop clusters the process speeds up and takes less time.

Fig. 12. Graph showing speed variation

As seen in the graph above, when the processes are run on a single system the speed is very low, but when the same processes are run on Hadoop clusters the speed increases.

IV. CONCLUSION

This work is used to classify videos and adds an extra dimension, emotion, which helps to classify videos in a more appropriate manner. The image and emotion recognition is effective in terms of performance because Hadoop is used for parallel processing. Image and emotion recognition are processed in parallel, which makes the application fast and efficient. After both processes are done on the parallel nodes, the results are combined, which enables automatic tagging. This automatic tagging helps in searching videos, saving both time and energy. The cost is also not very high, so the system is easily affordable, and it reduces human effort because it tags the videos automatically.

V. FUTURE WORK

This work can be used in robotics with the help of IoT. With it we can build a robot that automatically recognizes a person's face, which can be used to identify the owner of the robot, and that also identifies the owner's mood using emotion recognition. As our work runs on Hadoop nodes, the robot will be fast in identifying its owner as well as the owner's mood.

REFERENCES

[1] Gill, Phillipa, Martin Arlitt, Zongpeng Li, and Anirban Mahanti. "YouTube traffic characterization: a view from the edge." In Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, pp. 15-28. ACM, 2007.
[2] Bloehdorn, Stephan, Kosmas Petridis, Carsten Saathoff, Nikos Simou, Vassilis Tzouvaras, Yannis Avrithis, Siegfried Handschuh, Yiannis Kompatsiaris, Steffen Staab, and Michael G. Strintzis. "Semantic annotation of images and videos for multimedia analysis." In The Semantic Web: Research and Applications, pp. 592-607. Springer Berlin Heidelberg, 2005.
[3] M. Turk and A. Pentland, "Eigenfaces for recognition," Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, January 1991.
[4] Zhu, Xiangxin, and Deva Ramanan. "Face detection, pose estimation, and landmark localization in the wild." In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pp. 2879-2886. IEEE, 2012.
[5] C.-H. Chen. Mohohan: An on-line video transcoding service via Apache Hadoop. [Online]. Available: http://www.gwms.com.tw/TRENDHadoopinTaiwan2012/1002download/C3.pdf
[6] F. Yang and Q.-W. Shen, "Distributed video transcoding on Hadoop," Computer Systems & Applications, vol. 11, p. 020, 2011.
[7] Anastasios D. Doulamis, Nikolaos D. Doulamis, and Stefanos D. Kollias, National Technical University of Athens, Department of Electrical and Computer Engineering.



[8] Sanchita, Kalpana Jaiswal, Praveen Kumar, and Seema Rawat, "Prefetching web pages for improving user access latency using integrated Web Usage Mining," in Proceedings of the International Conference on Communication Control and Intelligent Systems (CCIS-2015), GLA University, Uttar Pradesh, India (published in IEEE Xplore), November 07-08, 2015, pp. 401-405.
[9] Praveen Kumar and Vijay S. Rathore, "Improvising and optimizing resource utilization in Big Data processing," in Proceedings of the 5th International Conference on Soft Computing for Problem Solving (SocProS 2015), IIT Roorkee, India (published in Springer), Dec 18-20, 2015, pp. 586-589.
[10] Seema Rawat, Praveen Kumar, and Geetika, "Implementation of the principle of jamming for Hulk Gripper remotely controlled by Raspberry Pi," in Proceedings of the 5th International Conference on Soft Computing for Problem Solving (SocProS 2015), IIT Roorkee, India, Dec 18-20, 2015, pp. 199-208.

