Abstract—Digital media, especially social media, has become very popular in recent times, and as a result there are far more videos than before. These videos are generally neither tagged nor classified. This paper aims at parallel video processing in Hadoop for fast processing and automatic emotion tagging. The system recognizes faces and tags emotions automatically on Hadoop clusters for fast and efficient performance. This makes the human work of classifying videos easier, and processing on a parallel system makes it quick.

Keywords: Automatic emotion tagging, parallel video processing, face recognition

I. INTRODUCTION

The number of videos on the internet has grown in the last few years because of social media and sites like YouTube and Dailymotion. According to recent studies [1], more than 60% of people watch videos online. These videos are generally not classified and not tagged properly, which makes video searching a difficult task.

Tagging is a process by which we assign metadata to videos, which helps to organize and manage these resources. Video searching depends upon these tags, which contain metadata about the videos, such as which are most viewed; other important information about a video is often not provided.

In 2004 [1], MapReduce was introduced for processing data on clusters, and since then it has been used for large amounts of data. Video processing can be accelerated by MapReduce: the technique is cheap, scalable and capable of handling video workloads, and it builds on Apache Hadoop and other open-source projects. By utilizing clusters, a large amount of video can be handled and the time taken to process the data can be reduced, since the system's automatic tagging process runs in parallel with other systems.

Video is basically structured media by nature [7], and our first step is to segment it into temporal units. Histogram difference is a technique in which each frame is separated into smaller blocks and the "histogram difference" of successive frames is taken.

Many existing approaches to key-frame extraction depend on the detected shots; one such technique is shot boundary detection. In shot boundary detection, the video is first read as input and processed by the shot-boundary method. Frames are then divided into sub-frames in the same way, and the difference between corresponding sub-frames gives the block difference. The block difference is calculated from the required formula, and its mean deviation and standard deviation are computed; together these give the threshold. If the calculated threshold is smaller than the block difference of a frame, that frame becomes a key frame.

Face detection and recognition are done together. First, the video stream is taken as input and converted into a sequence of image frames. Important key frames are then extracted from the image sequence by techniques like PCA or EGM, after which the face is detected and tracked. The next step is image processing and face alignment, which consists of three parts: first histogram equalization, then image resizing, and finally an image raster scan. The extracted features are used to create a face print, and this feature is then matched against the trained set.

Video tagging is defined as collecting content in a video, such as a scene or a shot, together with information about it; such a description is much more relevant than the whole annotated video.

Through this paper, an efficient method is proposed to detect and recognize faces in video and to automatically tag emotions, carried out on parallel systems with the help of Hadoop clusters. The key objective of this paper is to recognize faces from video, extract key features, and use image-recognition techniques so that a character can be identified from trained data and the important information can be labelled and classified under different classes. A template consists of important information such as the character's name and pre-configured nodal points. The nodal points are computed mathematically so that we obtain a face print that refers to the database. The face is recognized and emotions are tagged automatically. This process is very slow, so to make it more efficient Hadoop clusters are used for parallel processing.
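The parallel tagging described above can be sketched in a Hadoop Streaming style, where a mapper emits one (video, emotion) pair per frame and a reducer aggregates them into tags. This is a minimal illustration only: `detect_emotion`, the record format, and the tag-aggregation rule are hypothetical stand-ins, not the paper's actual implementation.

```python
from collections import Counter

def detect_emotion(frame_path):
    # Hypothetical stand-in for the real face/emotion recognizer;
    # here the label is derived from the file name purely for illustration.
    return "happy" if "smile" in frame_path else "neutral"

def mapper(lines):
    # Map step: one "video_id<TAB>frame_path" record in,
    # one (video_id, emotion) pair out.
    for line in lines:
        video_id, frame_path = line.rstrip("\n").split("\t")
        yield video_id, detect_emotion(frame_path)

def reducer(pairs):
    # Reduce step: aggregate per-video emotion counts into a single tag.
    tags = {}
    for video_id, emotion in pairs:
        tags.setdefault(video_id, Counter())[emotion] += 1
    return {vid: counts.most_common(1)[0][0] for vid, counts in tags.items()}

if __name__ == "__main__":
    records = ["v1\tframes/smile_001.jpg",
               "v1\tframes/smile_002.jpg",
               "v2\tframes/frown_001.jpg"]
    print(reducer(mapper(records)))  # {'v1': 'happy', 'v2': 'neutral'}
```

On an actual cluster, the same two functions would read from stdin and write to stdout under Hadoop Streaming, letting the frame-level work run in parallel across nodes.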
978-1-5386-0569-1/$31.00 ©2017 IEEE
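The three-part preprocessing stage described above (histogram equalization, then resizing, then an image raster scan) can be sketched with plain NumPy. This is a minimal sketch assuming 8-bit grayscale frames; a real system would typically use a library such as OpenCV for these steps.

```python
import numpy as np

def equalize_histogram(img):
    # Spread the grayscale intensities over the full 0..255 range
    # using the cumulative distribution of pixel values.
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf = (cdf - cdf.min()) * 255 // max(cdf.max() - cdf.min(), 1)
    return cdf[img].astype(np.uint8)

def resize_nearest(img, size):
    # Nearest-neighbour resize to a fixed (rows, cols) shape.
    rows = np.arange(size[0]) * img.shape[0] // size[0]
    cols = np.arange(size[1]) * img.shape[1] // size[1]
    return img[np.ix_(rows, cols)]

def raster_scan(img):
    # Flatten the aligned image row by row into a 1-D feature vector.
    return img.ravel()

face = np.random.default_rng(0).integers(0, 256, (120, 90), dtype=np.uint8)
vector = raster_scan(resize_nearest(equalize_histogram(face), (64, 64)))
print(vector.shape)  # (4096,)
```

The resulting fixed-length vector is what later stages would match against the trained set to form the face print.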
The techniques generally used for feature extraction are described below:

EGM [3]: In this method, the data is gathered and stored, and the facial features are processed in an image graph. The jets are local image descriptors obtained by a wavelet transform; they are the responses of Gabor wavelets, which form a 2-D wave field. The jets form a graph that represents the face of the character, and if a graph matches the stored geometry, the faces are considered the same.

PCA [3]: This is one of the most widely used algorithms in facial recognition. Eigenfaces are formed by extracting the principal components of the face images. Whenever there is a new image, we have to determine whether it is a face image; if it is, the person is recognized by the weight pattern. PCA is expensive to run and does not work well with complex images, yet it remains the most widely used approach because it is robust and unsupervised.

[A] Video Annotation

To improve searching, classification and indexing of videos, we have to assign labels and tags to the video content. Automated video tagging can be done in two ways. The first is "open-set tagging", in which [9] extraction is required and the tags for the extracted video content are chosen from information associated with the content, such as phrases, groups of words or sentences. The second is "closed-set tagging", which classifies tags into predefined classes such as cricket or news reports. A hierarchical method is used to store video annotation as data, grouped according to semantic levels: a video shot corresponds to a feature vector, which in turn maps to a shot label, and a number of labels map to a video event, meaning that each layer of video annotation is calculated from the previous layers of annotation data.

A multimedia content descriptor such as MPEG-7 is used to store video annotation in this hierarchical manner. For fast retrieval of videos, the description is associated with the content. XML is used so that the metadata can be stored, associated with time-codes, and synchronized with events.

According to the MPEG standards [], a Description consists of a Description Scheme (the structure) and the set of Descriptor values (the instantiations) that describe the annotated data. A Descriptor value is an instantiation of a Descriptor for a given data set. The DDL [10], which is based on XML, defines the structural relations among Descriptors.

[B] Key Feature Extraction Algorithm

The algorithm mainly focuses on techniques that differ in the fundamental dynamics of the video sequence. A key-frame extraction algorithm proceeds as follows:
Step 1: Calculate the dissimilarity between each general frame and the reference frame.
Step 2: Locate the maximum difference within the shots.
Step 3: Depending upon these relationships, determine the shots that lie between the maximum and the mean deviation.
Step 4: Calculate the position of the key frame.
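The four steps above can be sketched as follows. This is a minimal illustration assuming grayscale frames as NumPy arrays and a histogram difference as the dissimilarity measure; the paper's exact dissimilarity formula and threshold rule may differ.

```python
import numpy as np

def dissimilarity(frame, reference, bins=32):
    # Step 1: histogram difference between a frame and the reference frame.
    h1, _ = np.histogram(frame, bins=bins, range=(0, 256))
    h2, _ = np.histogram(reference, bins=bins, range=(0, 256))
    return np.abs(h1 - h2).sum()

def key_frame_position(frames):
    reference = frames[0]
    diffs = np.array([dissimilarity(f, reference) for f in frames[1:]])
    # Step 2: locate the maximum difference within the shot.
    max_diff = diffs.max()
    # Step 3: form a threshold from the mean and standard deviation
    # of the differences.
    threshold = diffs.mean() + diffs.std()
    # Step 4: the first frame whose difference reaches the threshold
    # (falling back to the maximum) is taken as the key-frame position.
    over = np.nonzero(diffs >= threshold)[0]
    idx = over[0] if over.size else int(diffs.argmax())
    return idx + 1, max_diff

rng = np.random.default_rng(1)
shot = [np.full((8, 8), 10, dtype=np.uint8)] * 5 + \
       [rng.integers(0, 256, (8, 8), dtype=np.uint8)]
pos, peak = key_frame_position(shot)
print(pos)  # index of the detected key frame within the shot
```

In the synthetic shot above, the five uniform frames produce near-zero differences and the final random frame stands out, so it is selected as the key frame.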
Step 3 (background suppression): in this step, all of the background is suppressed using image thresholding. Frames passed on from the previous steps are cross-checked with the technique used in step 4, so that after this process we are left only with rectangular windows containing the facial pixels (the faces present in that frame).

Fig. 7. Metadata with probe image

Fig. 10. Emotion detection through Bezier curve

Fig. 11. Flow diagram of the working system

This diagram shows how the above steps work. First, we reduce the frames per second and extract the key frames by differencing the images. Then we detect faces by suppressing all non-face pixels, and in parallel we extract facial expressions for image recognition. We compare the image against our database and, in parallel, compare the emotions against our database. Then we automatically tag both the face and the emotion.

III. RESULT

We find that when both image and emotion recognition are done on a single system it takes a lot of time, while on Hadoop clusters the process speeds up: it takes less time when run on Hadoop clusters.

IV. CONCLUSION

After analysing the existing system and the new techniques that can be used, we built a system that recognizes the face as well as the emotion, which gives one extra feature for tagging and is helpful in classifying videos. The system runs on Hadoop, which increases its efficiency. This paper proposed an automatic image tagging and emotion recognition system.

V. FUTURE WORK

This work can be used in robotics with the help of IoT. With it we can make a robot that automatically recognizes a person's face, which can be used to identify the robot's owner, and that also identifies the owner's mood using emotion recognition. Since our work runs on Hadoop nodes, the robot will be fast at identifying both its owner and the owner's mood.

REFERENCES
[1] Gill, Phillipa, Martin Arlitt, Zongpeng Li, and Anirban Mahanti. "YouTube traffic characterization: a view from the edge." In Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, pp. 15-28. ACM, 2007.
[2] Bloehdorn, Stephan, Kosmas Petridis, Carsten Saathoff, Nikos Simou, Vassilis Tzouvaras, Yannis Avrithis, Siegfried Handschuh, Yiannis Kompatsiaris, Steffen Staab, and Michael G. Strintzis. "Semantic annotation of images and videos for multimedia analysis." In The Semantic Web: Research and Applications, pp. 592-607. Springer Berlin Heidelberg, 2005.
[3] M. Turk and A. Pentland, "Eigenfaces for recognition," Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, January 1991.
[4] Zhu, Xiangxin, and Deva Ramanan. "Face detection, pose estimation, and landmark localization in the wild." In Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on, pp. 2879-2886. IEEE, 2012.
[5] C.-H. Chen. Mohohan: An on-line video transcoding service via Apache Hadoop. [Online]. Available: http://www.gwms.com.tw/TRENDHadoopinTaiwan2012/1002download/C3.pdf
[6] F. Yang and Q.-W. Shen, "Distributed video transcoding on Hadoop," Computer Systems & Applications, vol. 11, p. 020, 2011.
[7] Anastasios D. Doulamis, Nikolaos D. Doulamis and Stefanos D. Kollias, National Technical University of Athens, Department of Electrical and Computer Engineering,