Information contained in this work has been obtained by McGraw Hill Education (India), from sources believed to be reliable. However, neither
McGraw Hill Education (India) nor its authors guarantee the accuracy or completeness of any information published herein, and neither McGraw
Hill Education (India) nor its authors shall be responsible for any errors, omissions, or damages arising out of use of this information. This work
is published with the understanding that McGraw Hill Education (India) and its authors are supplying information but are not attempting to render
engineering or other professional services. If such services are required, the assistance of an appropriate professional should be sought.
Typeset at Tej Composers, WZ 391, Madipur, New Delhi 110 063.
Abstract - Scheduling of tasks is one of the most important aspects of MapReduce. To improve job completion time and cluster throughput, MapReduce uses speculative execution. When a machine takes an unusually long time to complete a task, the so-called straggler machine will delay the job completion time and degrade the cluster throughput significantly. Many research efforts are being undertaken to increase MapReduce performance, such as LATE (Longest Approximate Time to End), resource utilization, and data placement in Hadoop clusters, but some are inappropriate and some do not handle situations such as data skew, tasks that start asynchronously, and abrupt resource changes. These problems are handled via smart speculative execution, wherein a slow task is backed up on an alternative machine in the hope that the backup copy can finish faster. In this paper, a new resource-aware scheduling technique based on an improved speculative execution strategy is proposed. The proposed technique improves speculative execution significantly by decreasing the job completion time and improving the cluster throughput. It aims at improving resource utilization across machines while observing completion time goals.

Keywords: MapReduce, Straggler machine, Hadoop, Speculative Execution

I. INTRODUCTION

MapReduce is a programming model that has been implemented in several systems, including Google's internal implementation (simply known as MapReduce) and the popular open-source implementation Hadoop, which can be obtained, along with the HDFS file system, from the Apache Foundation. All one needs to write are two functions, called Map and Reduce, while the system manages the parallel execution and coordination of the tasks that execute Map or Reduce, and also deals with the possibility that one of these tasks will fail to execute.

Job scheduling is an important feature to consider in MapReduce. MapReduce [1] has rapidly gained popularity in both academia and industry. It has been used in indexing [1], bioinformatics [2], machine learning [3], etc.

Hadoop is a widely used implementation of MapReduce. Besides Hadoop, other data-parallel systems including Dryad [4] and Sector/Sphere [5] have been designed with different features and aspects. In MapReduce, multiple tasks run in parallel, and task duplication redundantly executes some tasks on which other tasks critically depend [6]. MapReduce presents a strategy referred to as LATE (Longest Approximate Time to End) in its Hadoop-0.21 implementation. It keeps watch on the progress rate of the tasks and estimates their remaining time. Tasks whose progress rate is below the slow task threshold are selected as backup candidates [7], and the one with the longest remaining time is given the highest priority.

II. MAPREDUCE MECHANISM

MapReduce is a framework composed of two primitives, map and reduce. It takes data as its input and splits it into small chunks. These chunks are processed by map tasks in a fully parallel and distributed fashion, and the outputs of the map tasks are fed to the reduce tasks. Incomplete and failed tasks are executed again until they complete. The MapReduce framework works on key-value pairs throughout. The mapper function maps input key-value pairs to intermediate key-value pairs; each input pair may produce anywhere from zero to n output pairs. The reduce side has three basic phases: shuffle, sort and reduce.

The architecture of MapReduce is shown in Fig. 1.

Fig. 1: MapReduce Architecture
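The map, shuffle, sort and reduce phases described in Section II can be illustrated with a small single-process sketch. This is not Hadoop code: the helper names (`map_phase`, `shuffle_and_sort`, `reduce_phase`, `run_job`) and the word-count job are assumptions chosen for illustration, and everything runs in one Python process rather than on a cluster.

```python
from collections import defaultdict

def map_phase(records, mapper):
    """Apply the mapper to each input (key, value); it may emit 0..n pairs."""
    for key, value in records:
        yield from mapper(key, value)

def shuffle_and_sort(pairs):
    """Group intermediate values by key (shuffle), then order by key (sort)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())

def reduce_phase(grouped, reducer):
    """Call the reducer once per key with all values collected for that key."""
    return [reducer(key, values) for key, values in grouped]

def run_job(records, mapper, reducer):
    return reduce_phase(shuffle_and_sort(map_phase(records, mapper)), reducer)

# Hypothetical word-count job used only to exercise the phases above.
def word_count_mapper(_offset, line):
    for word in line.lower().split():
        yield word, 1

def word_count_reducer(word, counts):
    return word, sum(counts)

if __name__ == "__main__":
    lines = [(0, "map reduce map"), (1, "reduce shuffle sort"), (2, "")]
    print(run_job(lines, word_count_mapper, word_count_reducer))
    # [('map', 2), ('reduce', 2), ('shuffle', 1), ('sort', 1)]
```

The empty third record shows the "zero to n pairs" case: the mapper emits nothing for it, and the job result is unaffected.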
2 7th Post Graduate & Doctoral Conference for Information Technology
iii. [...] priority dependency, where the restraint job ordering approach will be considered.

3. DynamicMR: A Dynamic Slot Allocation Optimization Framework for MapReduce Clusters

This paper proposed the DynamicMR framework, which aims to improve the performance of MapReduce workloads while maintaining fairness. It consists of three techniques, namely DHSA (Dynamic Hadoop Slot Allocation), SEPB (Speculative Execution Performance Balancing) and Slot Prescheduling, all of which target slot utilization optimization for a MapReduce cluster from different perspectives [12]. DHSA focuses on maximizing slot utilization by allocating map (or reduce) slots to map and reduce tasks dynamically. In particular, it makes no assumptions, requires no prior knowledge, and can be used for any kind of MapReduce job (e.g., independent or dependent ones). Two types of DHSA are presented, namely PI-DHSA and PD-DHSA, based on different levels of fairness. The user can choose either of them accordingly. In contrast to DHSA, SEPB and Slot Prescheduling consider efficiency optimization for a given slot utilization.

Disadvantages:
i. Map and reduce slots can be exchanged, so small jobs may not be completed in time.
ii. There is no load-balancing feature in scheduling for the map and reduce stages.
iii. If reduce tasks steal map slots, some local map tasks become non-local tasks because of the shortage of map slots.

4. Improve MapReduce Performance through Data Placement in Heterogeneous Hadoop Clusters

Two algorithms were implemented and integrated into Hadoop HDFS. The first algorithm initially distributes file fragments to the heterogeneous nodes in a cluster; once it has run, all fragments of an input file are distributed to the computing nodes [13]. The second algorithm reorganizes file fragments to solve the data skew problem. There are two cases in which file fragments must be reorganized. First, the cluster is expanded by adding new nodes to the existing cluster. Second, new data is appended to an existing input file. In both cases, file fragments distributed by the initial data placement algorithm can be disrupted.

C. Speculative execution based schedulers

These schedulers are based on speculative task scheduling. Speculative execution copies a slow task and runs the copy, called a backup task, on a different machine so that the task completes within its time limit and the computation finishes faster. In Hadoop, if a node is available but is performing poorly, the condition is called a straggler; MapReduce runs a speculative copy of its task (also called a backup task) on another machine to finish the computation faster. Hadoop monitors task progress using a parameter called the Progress Score, which takes a value between 0 and 1. For a map task, the Progress Score is the fraction of input data read. For a reduce task, the execution is divided into three phases, each of which accounts for 1/3 of the score. Hadoop looks at the average Progress Score of each category of tasks (maps and reduces) to define a threshold for speculative execution. When a task's Progress Score is less than the average for its category minus 0.2, and the task has run for at least one minute, it is marked as a straggler.

The comparative study of speculative execution based schedulers is given in Table 2.

D. Comparison Parameters

The main focus of this research work is on smarter speculative execution strategies. Existing speculative execution based schedulers were studied and compared on the basis of the following parameters, which helped in determining a smarter speculative execution strategy.

1. Data Skew
Data skew refers to non-uniform distribution in a dataset. Tasks do not always process the same amount of data, and a MapReduce job may experience several types of data skew. When the input data has some big records that cannot be split, the map tasks that process those records will process more data. Partitioning the intermediate data generated by the map tasks unevenly also leads to partition skew among the reduce tasks, typically when the distribution of keys in the input data set is skewed.
The direct impact of data skew on the parallel execution of complex database queries is poor load balancing, leading to high response time.

2. Resource Utilization
Increasing the utilization of system resources increases the performance of the system. Resource utilization is highest in Dolly [14] because of proactive cloning, which also makes its memory utilization higher than that of SAMR (Self-adaptive MapReduce Scheduling Algorithm) [15] and ESAMR (an Enhanced Self-Adaptive MapReduce scheduling algorithm) [16]. Disk utilization is much higher in Wrangler [17] than in SAMR and ESAMR.

3. Reliability and Scalability
Wrangler is more reliable than SAMR and ESAMR. LATE is less reliable than the others. ESAMR and SAMR are quite scalable.
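The Progress Score rule described for Hadoop's default speculative execution (a task is a straggler when its score is below its category average minus 0.2 and it has run for at least one minute) can be sketched as follows. This is a minimal illustration, not Hadoop's scheduler code: the `Task` record, the function names and the constants' names are assumptions; only the 0.2 gap, the one-minute floor and the three equally weighted reduce phases come from the text.

```python
from dataclasses import dataclass

SPECULATIVE_GAP = 0.2   # score must trail the category average by more than this
MIN_RUNTIME_S = 60.0    # task must have run for at least one minute

@dataclass
class Task:
    task_id: str
    kind: str           # "map" or "reduce"
    progress: float     # Progress Score in [0, 1]
    runtime_s: float    # seconds since the task started

def reduce_progress(phases_done: int, phase_fraction: float) -> float:
    """Reduce score: three phases, each accounting for 1/3 of the total."""
    return (phases_done + phase_fraction) / 3.0

def find_stragglers(tasks):
    """Return ids of tasks below (category average - 0.2) that ran >= 1 minute."""
    by_kind = {}
    for task in tasks:
        by_kind.setdefault(task.kind, []).append(task)

    stragglers = []
    for group in by_kind.values():
        average = sum(t.progress for t in group) / len(group)
        threshold = average - SPECULATIVE_GAP
        stragglers.extend(
            t.task_id
            for t in group
            if t.progress < threshold and t.runtime_s >= MIN_RUNTIME_S
        )
    return stragglers

if __name__ == "__main__":
    tasks = [
        Task("m1", "map", 0.9, 120),
        Task("m2", "map", 0.8, 120),
        Task("m3", "map", 0.3, 120),                        # slow and old enough
        Task("r1", "reduce", reduce_progress(2, 0.5), 90),
        Task("r2", "reduce", reduce_progress(0, 0.6), 90),  # still in first phase
    ]
    print(find_stragglers(tasks))
```

Because the threshold is relative to the category average, the rule implicitly assumes that tasks of one category progress at similar rates, which is the assumption that data skew and heterogeneous nodes break.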
IV. PROPOSED SYSTEM IMPLEMENTATION

In the proposed system, a new resource-aware scheduling technique based on an improved speculative execution strategy is proposed. It improves speculative execution significantly by reducing the job completion time, and it also improves the cluster throughput by assigning task slots when assigning jobs. This new resource-aware scheduling technique aims at improving resource utilization across machines while observing completion time goals. The proposed system is divided into 4 modules, as shown in Fig. 2. The implementation phases are:

1. Identifying the stragglers, either by reading previous job logs or at run time using the EWMA (Exponentially Weighted Moving Average) scheme, which is expressed as follows:

$$Z(t) = \alpha\, Y(t) + (1 - \alpha)\, Z(t-1), \qquad 0 < \alpha \le 1$$

where Z(t) and Y(t) are the estimated and the observed process speed at time t, respectively, and \alpha reflects a tradeoff between stability and responsiveness. To see how effective EWMA is in predicting the process speed and the remaining time of a task, a sort job is run in the cluster.

2. A custom scheduler is used to schedule tasks according to the following criteria:
a) Estimated time of completion
b) Average execution time
c) List of all chunks of data in descending order of size
d) Avoiding straggler nodes

3. Sorting of data using the Timsort algorithm [18]. By default MapReduce uses quicksort; Timsort will be used instead. Using Timsort is expected to give a 10-20% improvement over default Hadoop during the shuffle and sort phase.

4. Finally, a comparison between default Hadoop and the modified Hadoop version with the proposed configuration will be done.
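The EWMA-based estimate used for straggler identification in step 1 can be sketched as below, assuming the standard recurrence Z(t) = alpha * Y(t) + (1 - alpha) * Z(t-1) with 0 < alpha <= 1. The function names, the default alpha of 0.2, and the choice to seed the estimate with the first observation are assumptions for illustration; the paper does not state how the estimate is initialized.

```python
def ewma_speeds(observed, alpha=0.2):
    """Smooth observed process speeds Y(t) into estimates Z(t).

    Z(t) = alpha * Y(t) + (1 - alpha) * Z(t-1).
    Assumption: the first estimate is seeded with the first observation.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    estimates = []
    z = None
    for y in observed:
        z = y if z is None else alpha * y + (1.0 - alpha) * z
        estimates.append(z)
    return estimates

def remaining_time(progress, speed):
    """Estimate time to finish: remaining work divided by estimated speed.

    progress is a score in [0, 1]; speed is progress per second.
    """
    if speed <= 0.0:
        return float("inf")
    return (1.0 - progress) / speed

if __name__ == "__main__":
    # Observed speeds in progress-per-second; a larger alpha reacts faster
    # to the drop at the end, a smaller alpha gives a steadier estimate.
    samples = [0.010, 0.012, 0.011, 0.004, 0.003]
    for alpha in (0.2, 0.8):
        z = ewma_speeds(samples, alpha)[-1]
        print(alpha, z, remaining_time(0.6, z))
```

Running the usage example with two values of alpha shows the stability versus responsiveness tradeoff directly: the same slowdown yields a much longer remaining-time estimate under the larger alpha.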
Resource-aware Scheduling Through Improved Speculative Execution 5
VI. CONCLUSION
[15] Q. Chen, D. Zhang, M. Guo, Q. Deng, and S. Guo. SAMR: A Self-adaptive MapReduce Scheduling Algorithm in Heterogeneous Environment. In 10th IEEE International Conference on Computer and Information Technology (CIT 2010), 2010.
[16] C. He, Y. Lu, and X. Sun. ESAMR: An Enhanced Self-Adaptive MapReduce Scheduling Algorithm. In Parallel and Distributed Systems (ICPADS), 2012 IEEE 18th International Conference, 2012.
[17] Neeraja J. Yadwadkar, Ganesh Ananthanarayanan, and Randy Katz. Wrangler: Predictable and faster jobs using fewer resources. In Proceedings of the ACM Symposium on Cloud Computing, SOCC '14, pages 26:1-26:14, New York, NY, USA, 2014. ACM.
[18] Spark the fastest open source engine for sorting a petabyte. https://databricks.com/blog/2014/10/10/spark-petabyte-sort.html.
Emotionizer: Sentence Based Emotion Recognition System

Gaurav Dalvi
PG Student, Dept of Information Technology
Pune Institute of Computer Technology
Pune, India
gauravdalvi63@gmail.com

Emmanuel M.
HOD, Dept of Information Technology
Pune Institute of Computer Technology
Pune, India
hodit@pict.edu
Abstract - [...] cannot be inferred from statistics; it also requires the support of reviews or opinions of customers to provide better feedback or decision making. As most of the population is able to express emotions in the form of text, research in textual analysis becomes a prime need. Initially, a comparative analysis of existing techniques, namely the keyword-spotting approach, the statistical and machine learning approach, the rule-based approach, the hybrid approach and the cause extraction approach, is done. This research work uses the keyword-spotting approach for retrieving affect words and calculating the overall sentence-level emotions. To improve accuracy, it also uses heuristic rules that work on intensifying words and give more insight into the emotions in sentences.

Keywords - [...] Emotion Analysis, Natural Language Processing.

INTRODUCTION

[...] be in his speech, facial expression, body language. Emotion is often tangled with mood, nature, personality, and motivation. Definitions of emotion in the literature have been many and varied. Kleinginna and Kleinginna suggested a formal definition of emotion as a complex set of interactions among subjective and objective factors, mediated by neural and hormonal systems, which can a) give rise to affective experiences such as awakening of feelings, happiness and unhappiness; b) generate cognitive processes such as emotionally relevant perceptual affect, appraisals, labeling processes; c) activate widespread physiological adjustments to the arousing conditions; and d) lead to behavior that is often but not always communicative, goal-directed, and adaptive.

The larger use of new computer-based media for communication and expression gave rise to an increasing need for human-computer interaction to acknowledge human emotions. Emotions can be expressed in many ways, of which the most common is text. Emotion analysis has become a booming area in which machines are trained to analyze emotion; this has helped organizations get a closer look at their products, customer satisfaction, the psychological study of a patient, and much more. Most people express themselves through social networking sites such as Facebook and microblogs. Social networking sites have become an influence on people, who follow these sites for decision making and as a mentor in different walks of life. The use of microblogs has grown fast in recent years. Since users are able to update their content quickly, microblogging services also act as a hub of real-time news.

The paper is organized as follows: Section II describes key concepts of emotion recognition, Section III gives a comparative analysis, Section IV presents the proposed system, Section V covers results and discussion, and Section VI concludes.

EMOTION RECOGNITION KEY CONCEPTS

A. Emotion Classification

The best-known emotion classification method is the Big Six, [...] emotional expressions (Ekman et al. 1969). The Big Six emotions are happiness, sadness, fear, surprise, anger and disgust. They are thought to be basic in two ways: psychological and biological. They do not contain other emotions as parts, and they are innate.

Ekman found that the Fore, an isolated preliterate tribe in New Guinea, tended to associate facial expressions of the Big Six emotions with the same kinds of situations with which people associate them in the West [2].
Rule-based and hybrid approaches use heuristic rules for identifying negation and intensifiers, coupled with the keyword-spotting approach. Anthony proposed a system for Real-Time [...] Time Text-to-[...] communication by extracting emotion from the real-time typed

disgust, surprise). Let Vk be the initial set of emotional [...] synsets are searched using the SimilarTo relationship in WordNet. The newly retrieved synsets are added to Sk. As these synsets are obtained indirectly, a penalty p is attached to them.
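The lexicon expansion step above (start from seed synsets for an emotion, follow WordNet's SimilarTo relation, and attach a penalty p to the indirectly obtained synsets) can be sketched as follows. The sketch rests on stated assumptions: the similar-to relation is a small hand-written dictionary rather than a WordNet lookup, the synset names are made up for illustration, and the penalty is applied as a fixed discount (weight 1 - p), since how p enters the original scoring formula could not be recovered from the source. In NLTK, the same relation is exposed as `Synset.similar_tos()`.

```python
def expand_emotion_synsets(seed_synsets, similar_to, penalty=0.3):
    """Expand one emotion's synset set through a similar-to relation.

    seed_synsets: iterable of seed synset names for the emotion.
    similar_to:   dict mapping a synset name to its similar-to neighbours.
    penalty:      assumed discount p for synsets reached indirectly.

    Returns a dict {synset: weight}; seeds get weight 1.0 and newly
    retrieved synsets get 1.0 - penalty (an assumption, see lead-in).
    """
    if not 0.0 <= penalty <= 1.0:
        raise ValueError("penalty must be in [0, 1]")
    weights = {synset: 1.0 for synset in seed_synsets}
    for synset in list(weights):
        for neighbour in similar_to.get(synset, ()):
            # Seeds keep full weight; only new synsets are penalised.
            weights.setdefault(neighbour, 1.0 - penalty)
    return weights

if __name__ == "__main__":
    # Toy relation with hypothetical synset names, for illustration only.
    toy_similar_to = {
        "happy.a.01": ["glad.a.01", "joyful.a.01"],
        "glad.a.01": ["happy.a.01"],
    }
    print(expand_emotion_synsets(["happy.a.01"], toy_similar_to, penalty=0.25))
```

Only one hop is followed here; whether the original method iterates the expansion further is not stated in the surviving text.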
PROPOSED SYSTEM

Fig. 2 shows the various components of the Proposed System.

System Overview
The results are formatted in XML and JSON for efficient delivery to other applications.

RESULTS AND DISCUSSION

Emotionizer works on any text [...] An annotated dataset is used for testing. Tokenization of the text is performed to convert it into a sentence list. Each sentence is passed through the preprocessing step (stopwords removal and stemming) to filter the sentence and get the core stem words. The keyword spotting technique is used to label the emotion words in a sentence with emotion vectors from the Emotion Lexicon Dictionary. For the construction of the Emotion Lexicon dictionary, ConceptNet and some core annotated words belonging to emotion states are used. After getting all the vectors associated with emotion words in a sentence, Emotion Calculation is performed.

Experiment 1: Data Collection
[...] consists of stories by B. Potter (19 stories), H.C. Andersen (77 stories), and Grimm (80 stories). The dataset is
annotated across 6 different emotions, plus Neutral, as Angry (A), Disgusted (D), Fearful (F), Happy (H), Neutral (N), Sad (Sa), Surprise (Su) [15].

A few sentences of the Ginger and Pickles story:
Once upon a time there was a village shop.
The name over the window was "Ginger and Pickles."
It was a little small shop just the right size for Dolls -- Lucinda and Jane Doll-cook always bought their groceries at Ginger and Pickles.

Experiment 2: Sentence Tokenization
For sentence tokenization, various pickles are provided. As most of the available text is in English, the english pickle is used for tokenization. The result produced is a list which can be iterated to get each sentence.

Sentence Tokenization result:
['Once upon a time there was a village shop.The name over the window was "Ginger and Pickles."', 'It was a little small shop just the right size for Dolls -- Lucinda and Jane Doll-cook always bought their groceries at Ginger and Pickles.']

Experiment 3: Word Tokenization
Word tokenization focuses on two criteria: white spaces/periods and punctuation marks. It splits off punctuation marks other than periods to form words.

Word Tokenization result:
['Once', 'upon', 'a', 'time', 'there', 'was', 'a', 'village', 'shop', '.']
['The', 'name', 'over', 'the', 'window', 'was', '``', 'Ginger', 'and', 'Pickles', '.', "''"]
['It', 'was', 'a', 'little', 'small', 'shop', 'just', 'the', 'right', 'size', 'for', 'Dolls', '--', 'Lucinda', 'and', 'Jane', 'Doll-cook', 'always', 'bought', 'their', 'groceries', 'at', 'Ginger', 'and', 'Pickles', '.']

Experiment 4: Preprocessing
Preprocessing consists of three steps: Part Of Speech (POS) Tagging, Stopwords Removal, and Stemming. POS tagging is done after sentence tokenization to get the part of speech associated with each word. For Stopwords Removal, the predefined set of stop words provided by nltk is used. Finding the root words is an important step, as it helps to match the root words in the Keyword spotting module.

Results of POS tagging:
[('Once', 'RB'), ('upon', 'IN'), ('a', 'DT'), ('time', 'NN'), ('there', 'EX'), ('was', 'VBD'), ('a', 'DT'), ('village', 'NN'), ('shop', 'NN'), ('.', '.')]
[('The', 'DT'), ('name', 'NN'), ('over', 'IN'), ('the', 'DT'), ('window', 'NN'), ('was', 'VBD'), ('``', '``'), ('Ginger', 'NNP'), ('and', 'CC'), ('Pickles', 'NNP'), ('.', '.'), ("''", "''")]
[('It', 'PRP'), ('was', 'VBD'), ('a', 'DT'), ('little', 'JJ'), ('small', 'JJ'), ('shop', 'NN'), ('just', 'RB'), ('the', 'DT'), ('right', 'JJ'), ('size', 'NN'), ('for', 'IN'), ('Dolls', 'NNP'), ('--', ':'), ('Lucinda', 'NNP'), ('and', 'CC'), ('Jane', 'NNP'), ('Doll-cook', 'NNP'), ('always', 'RB'), ('bought', 'VBD'), ('their', 'PRP$'), ('groceries', 'NNS'), ('at', 'IN'), ('Ginger', 'NNP'), ('and', 'CC'), ('Pickles', 'NNP'), ('.', '.')]

Result after applying Stopwords Removal and Stemming:
[u'ont', 'upon', 'tim', 'vil', 'shop', '.']
['the', 'nam', 'window', '``', 'ging', 'pickl', '.', "''"]
['it', 'littl', 'smal', 'shop', 'right', 'siz', 'dol', '--', 'lucind', 'jan', 'doll-cook', 'alway', 'bought', u'grocery', 'ging', 'pickl', '.']

Experiment 5: Keyword Spotting
A Word Lexicon dictionary is used, which consists of emotion words along with their emotion vectors. Words from this dictionary are matched against the words obtained after word tokenization to get the respective vectors.

Result of Keyword Spotting:
{'shop': ['0.0', '0.0', '0.0', '0.0', '0.16666666666666666', '0.0']}
{}
{'shop': ['0.0', '0.0', '0.0', '0.0', '0.16666666666666666', '0.0'], 'right': ['0.027777777777777776', '0.0', '0.0', '0.0', '0.0', '0.0']}
Emotionizer: Sentence Based Emotion Recognition System 11
Experiment 6: Emotion Calculation at Sentence Level
Emotion vectors of all affect words (emotion bearing words) are summed up and a neutral threshold is used to identify neutral sentences.

Final Outcome after Emotion Calculation:
Once upon a time there was a village shop.
[0.0, 0.0, 0.0, 0.0, 0.16666666666666666, 0.0] n
The name over the window was "Ginger and Pickles."
[0.0, 0.0, 0.0, 0.0, 0.0, 0.0] n
It was a little small shop just the right size for Dolls -- Lucinda and Jane Doll-cook always bought their groceries at Ginger and Pickles.
[0.027777777777777776, 0.0, 0.0, 0.0, 0.16666666666666666, 0.0] n
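The keyword spotting and sentence-level emotion calculation described above can be sketched as follows. This is a minimal sketch, not the authors' implementation: the two lexicon entries are copied from the keyword spotting result, while the function names, the regex tokenizer, the small stopword set, the label order of the six vector components, and the threshold value 0.2 are assumptions made for illustration (the paper gives neither its threshold nor its vector order). Stemming is omitted.

```python
import re

# Two entries copied from the keyword spotting result; a real lexicon is much larger.
EMOTION_LEXICON = {
    "shop": [0.0, 0.0, 0.0, 0.0, 0.16666666666666666, 0.0],
    "right": [0.027777777777777776, 0.0, 0.0, 0.0, 0.0, 0.0],
}

# Assumed label order for the six vector components (not stated in the paper).
LABELS = ("A", "D", "F", "H", "Sa", "Su")

# Tiny illustrative stopword set; the paper uses the list shipped with nltk.
STOPWORDS = {"a", "the", "was", "there", "it", "for", "and", "at", "their", "just", "over"}

# Assumed value; the paper mentions a neutral threshold but does not give it.
NEUTRAL_THRESHOLD = 0.2


def tokenize(sentence):
    """Lowercase the sentence and split it into word tokens (hyphenated words kept whole)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", sentence.lower())


def spot_keywords(sentence):
    """Return {word: emotion vector} for every non-stopword found in the lexicon."""
    return {
        word: EMOTION_LEXICON[word]
        for word in tokenize(sentence)
        if word not in STOPWORDS and word in EMOTION_LEXICON
    }


def sentence_emotion(sentence):
    """Sum the vectors of all affect words and pick a label, or 'n' for neutral."""
    total = [0.0] * len(LABELS)
    for vector in spot_keywords(sentence).values():
        total = [t + v for t, v in zip(total, vector)]
    strongest = max(total)
    if strongest < NEUTRAL_THRESHOLD:
        return total, "n"
    return total, LABELS[total.index(strongest)]


if __name__ == "__main__":
    story = [
        "Once upon a time there was a village shop.",
        'The name over the window was "Ginger and Pickles."',
    ]
    for sentence in story:
        print(sentence_emotion(sentence))
```

With the assumed threshold, all three example sentences stay below it and are labelled neutral, which agrees with the "n" labels in the outcome listed above.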
CONCLUSION

Emotions are conveyed in audio, video, and textual formats. Of these, communication among people is mostly done in textual form, which gives rise to research in this field. Different approaches to extract emotion are discussed. It is concluded that the Rule-Based Approach can be enriched by adding more unambiguous rules, coupled with an efficient lexicon word generator. The lexicon word generator can be built by using some preliminary emotion words sorted by experts and then applying the procedure to construct lexicon synonyms.

REFERENCES

[1] M. D. Munezero, C. Montero, E. Sutinen and J. Pajunen, 'Are They Different? Affect, Feeling, Emotion, Sentiment, and Opinion Detection in Text', IEEE Trans. Affective Comput., vol. 5, no. 2, pp. 101-111, 2014.
[2] U. Krcadinac, P. Pasquier, J. Jovanovic and V. Devedzic, 'Synesketch: An Open Source Library for Sentence-Based Emotion Recognition', IEEE Trans. Affective Comput., vol. 4, no. 3, pp. 312-325, 2013.
[3] J. Prinz, 'Which emotions are basic?', Emotion, Evolution, and Rationality, pp. 69-88, 2004.
[4] Quora.com, "What is the difference between sentiment and emotion ? Compare - Quora", 2015. [Online]. Available: https://www.quora.com/What-is-the-difference-between-sentiment-and-emotion-Compare. [Accessed: 23- Dec- 2015].
[5] P. University, "About WordNet - WordNet - About WordNet", Wordnet.princeton.edu, 2015. [Online]. Available: https://wordnet.princeton.edu/wordnet/. [Accessed: 24- Dec- 2015].
[6] Sentiwordnet.isti.cnr.it, "SentiWordNet", 2015. [Online]. Available: http://sentiwordnet.isti.cnr.it/. [Accessed: 24- Dec- 2015].
[7] Conceptnet5.media.mit.edu, "ConceptNet 5", 2015. [Online]. Available: http://conceptnet5.media.mit.edu/. [Accessed: 24- Dec- 2015].
[8] Psych.utoronto.ca, "Artificial Intelligence", 2016. [Online]. Available: http://psych.utoronto.ca/users/reingold/courses/ai/cyc.html. [Accessed: 16- Mar- 2016].
[9] Carrillo-de-Albornoz, L. Plaza and P. Gervás, "SentiSense: An easily scalable concept-based affective lexicon for sentiment analysis", in 8th International Conference on Language Resources and Evaluation (LREC 2012), p. 1, 2012.
[10] Elliott, C. Davidson, "The affective reasoner: A process model of emotions in a multi-agent system", Ph.D, Northwestern University Evanston, IL, USA, 1992.
[11] C. O. Alm, D. Roth and R. Sproat, 'Emotions from Text: Machine Learning for Text-Based Emotion Prediction', Proc. Human Language Technology Conf., Conf. Empirical Methods in Natural Language Processing, pp. 579-586, 2005.
[12] A. C. Boucouvalas, 'Real Time Text-to-Emotion Engine for Expressive Internet Communications', Being There: Concepts, Effects and Measurement of User Presence in Synthetic Environments, pp. 306-318, Ios Press, 2003.
[13] W. Li and H. Xu, 'Text-based emotion classification using emotion cause extraction', Expert Systems with Applications, vol. 41, no. 4, pp. 1742-1749, 2014.
[14] Bird, Steven, Edward Loper and Ewan Klein (2009), Natural Language Processing with Python. O'Reilly Media Inc.
[15] C. Alm, "Downloads for affect data", People.rc.rit.edu, 2016. [Online]. Available: http://people.rc.rit.edu/~coagla/affectdata/. [Accessed: 16- Mar- 2016].
Face Recognition using Facial Color Reflection and
SVM Classification
Himanshu Joshi
PG Student, Dept. of Information Technology
Pune Institute of Computer Technology
Pune, India
himanshuj25@gmail.com

A.M. Bagade
Assistant Professor, Dept. of Information Technology
Pune Institute of Computer Technology
Pune, India
ambagade@pict.edu
Abstract: Widespread use of biometric-supporting consumer systems is increasing day by day. Consumer devices like laptops, tablets, and phones have the potential to act as biometric readers used for authentication. However, this rapidly emerging technology leads to external attacks, especially replay attacks. Current approaches to counter replay attacks in this area are inadequate, or are performed in controlled environments and are basically used to distinguish between live people and spoofing attempts. These systems also need specialized hardware, such as a sensor with a specific processor chip, which is very high in cost and not within reach of a normal user.
The paper proposes a challenge-response method using face reflection with in-band digital watermarking to address replay attacks for face recognition on smart devices. Here the challenge is of [...] and the response of identification and validation is done through [...] under ideal conditions, color reflection from the face may be accurately classified in consumer devices.
Keywords: Biometric, Authentication, Replay attack, Digital watermarking, In-band, Face Recognition

INTRODUCTION

Users hate passwords [1]! In the modern world, the fear of losing personal information and valuables like money urges users to use passwords. But complexity rules and different passwords for different purposes lead users to forget one or more of them.
A biometric system is used to identify a person based upon his physical and behavioral characteristics. Some of the familiar biometric methods are fingerprint recognition, iris scanning, face recognition, handwriting verification, and palm geometry. Among all these technologies, face recognition is a rapidly growing technique and one of the most popular applications of image analysis. Face recognition plays a vital role in accessing personal information, security, and human-machine interaction. The performance of many face recognition applications in controlled environments has reached a satisfactory level, but research is still going on into face recognition techniques which can deal with uncontrolled environments. One challenge of the uncontrolled environment is dealing with variation in illumination, which can drastically change face recognition results; other challenges are face pose, expression, etc.
The existing systems for biometric verification use either a controlled environment setup or specialized hardware sensors. These techniques are expensive and limit the widespread development of biometric systems, as consumer devices like laptops, tablets, etc. are operated in highly uncontrolled environments with no expensive hardware attached to them. Image-based feature extraction (hybrid approach), such as color space with the SVM method, can therefore be used, because this technique is cheap, reliable, and can work against many environmental challenges like auto brightening, auto white balancing, etc. This paper proposes a consumer device face recognition system which has in-band digital watermarking [2,12] to address replay attacks and also produces good results in uncontrolled environments.
This paper is organized as follows: Section II gives a general view of the face recognition method. Section III provides a detailed review of the literature on different techniques used in face recognition. Section IV introduces the proposed system. Section V discusses the different modules and the algorithms used in those modules. Section VI discusses the intermediate results and the expected future outcome. Section VII covers the conclusion and future work.

FACE RECOGNITION

The face recognition method involves various techniques like pattern recognition, computer vision, computer graphics, image processing, etc.
As shown in Fig. 1, the first step is to capture an image, either online or offline; then preprocessing is done; after that, feature extraction is carried out by different image processing algorithms; the extracted features are then matched with the database using a classifier like SVM; and finally the result is obtained.
The different types of face detection and recognition methods are shown in Fig 2.
Face Recognition using Facial Color Reflection and SVM Classification 13
Fig 1: Block diagram of face recognition method using classifier

b. Based on Motion:
When a video sequence is available, motion information can be used to locate moving objects. Moving silhouettes such as the face and body parts can be extracted simply by thresholding accumulated frame differences.

c. Based on edge detection:
Sakai et al. [4] proposed face detection based on edges. Their work is based on analyzing line drawings of faces from photographs, aiming to locate facial features.

C. Template matching Approach:
Template matching methods are used to correlate patterns in the input image with stored standard patterns of a whole face or face features, to determine the presence of a face or face features. Deformable and predefined templates can be used.

D. Image Based Approach:
Image based approach is further classified into:

1. Neural Network:
As face recognition is a two-class pattern recognition problem, various neural network algorithms have been proposed. Neural networks are feasible for face detection because a system can be trained to capture the complex class-conditional density of face patterns. One drawback of neural networks is that the network architecture has to be tuned extensively (number of layers, number of nodes, and learning rate) to get good performance. Feraud and Bernier [7] proposed a detection method using autoassociative neural networks, in which a five-layer autoassociative network is able to perform nonlinear principal component analysis.

2. Statistical Method:

a. Support Vector Machine (SVM):
Osuna et al. [8] first introduced SVMs for face detection. SVMs use a new paradigm to train polynomial function, neural network, or radial basis function (RBF) classifiers. They use structural risk minimization to minimize an upper bound on the expected generalization error. Osuna et al. trained an SVM for large-scale problems like face detection. SVMs in the wavelet domain can detect faces and pedestrians.

b. Principal Component Analysis (PCA):
Kirby and Sirovich in 1988 [13] introduced a technique based on the concept of Eigen faces, known as principal component analysis (PCA). PCA on a training set of face images is performed to generate the Eigen faces in face space. Face images are projected onto the subspace and clustered; non-face training images are treated similarly. To detect a face in a scene, the distance between an image region and face space is computed for all locations in the image. The result of calculating the distance from face space is a face map.

E. Hybrid approach:
This approach uses both statistical pattern recognition techniques like SVMs and feature analysis techniques like HSV color space.

HSV Color Space and Support Vector Machine (SVM):
Daniel F. Smith et al. [9] propose a technique that uses local features and the whole face region to recognize a face. It first converts the captured RGB images into HSV images, then extracts the color features and compares them with the SVM training data; if they match, the face is recognized, otherwise it is not. It uses a frame capture algorithm to improve the quality of color patterns against illumination, ambient light, etc. The hybrid method is considered best because it uses both local feature characteristics and the whole face region to detect or recognize a face.

PROPOSED SYSTEM

[...] pupil region, because bright reflection from the dark pupil region is easy to recognize and can provide a high level of dissimilarity for separating different color reflections; this was suggested by Nishino and Nayar in 2004 [10]. From this technique it can also be stated that the face reflects color too, not as much as the eye pupil, but it can be used in studying face features as it provides large [...] pupil.
Using this approach, a face recognition system is proposed which uses information from the different colors reflected from the face; an SVM is then used to classify the colors and produce a result. In this system different colors are used because the greater the variety in color, the greater the entropy. So, it is planned to use colors displayed on the monitor screen, and the reflected color image is captured through the camera. Then, to avoid replay attacks on the system, a simple in-band digital watermarking technique is used; this part of the system is called the challenge part. In the response part, the response of reflected color is calculated and analyzed. If the calculated response matches the order of presented colors, one can conclude that the video was taken live and from the particular device, rather than being an old video replayed, because a replayed video cannot contain the same color sequence. This system would robustly detect spoofing attacks such as a printed photo shown to the camera, as a printed photo has a different color response and hence can be identified by the system.
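The challenge and response parts described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the color palette, the function names, and the nearest-hue rule are hypothetical, and the nearest-hue rule is a deliberately simple stand-in for the SVM classifier that the proposed system uses. Each captured frame is represented here by a single mean RGB value of the face region, and the watermarking step is not modelled.

```python
import colorsys
import random

# Hypothetical challenge palette: color name -> RGB shown on the screen (0..1 floats).
PALETTE = {
    "red": (1.0, 0.0, 0.0),
    "green": (0.0, 1.0, 0.0),
    "blue": (0.0, 0.0, 1.0),
    "yellow": (1.0, 1.0, 0.0),
}


def make_challenge(length, rng=random):
    """Pick a random sequence of color names to display (the challenge part)."""
    return [rng.choice(sorted(PALETTE)) for _ in range(length)]


def hue_of(rgb):
    """Convert an RGB triple to HSV and return only the hue, in [0, 1)."""
    return colorsys.rgb_to_hsv(*rgb)[0]


def classify_reflection(rgb):
    """Name the palette color whose hue is closest to the reflected color.

    Stand-in for the SVM: hue is circular, so the distance wraps around at 1.0.
    """
    hue = hue_of(rgb)

    def circular_distance(name):
        diff = abs(hue - hue_of(PALETTE[name]))
        return min(diff, 1.0 - diff)

    return min(PALETTE, key=circular_distance)


def verify_response(challenge, reflected_frames):
    """Accept only if the classified reflections match the challenge order."""
    if len(challenge) != len(reflected_frames):
        return False
    return [classify_reflection(f) for f in reflected_frames] == challenge
```

A replayed recording carries the color order of an earlier challenge, so `verify_response` rejects it whenever the freshly generated sequence differs. A longer sequence or a larger palette makes an accidental match less likely, which is the entropy argument made in the text.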
OUTPUT: Face Recognized (YES/NO).
IMAGE CAPTURING (Challenge Part):
Swell the rice in thin cream, or in new milk strongly flavoured with
vanilla or cocoa-nut; add the same ingredients as in the foregoing
receipt, and when the rice is cold, form it into balls, and with the
thumb of the right hand hollow them sufficiently to admit in the centre
a small portion of peach jam, or of apricot marmalade; close the rice
well over it; egg, crumb, and fry the croquettes as usual. As, from the
difference of quality, the same proportions of rice and milk will not
always produce the same effect, the cook must use her discretion in
adding, should it be needed, sufficient liquid to soften the rice
perfectly: but she must bear in mind that if not boiled extremely thick
and dry, it will be difficult to make it into croquettes.[136]
136. We must repeat here what we have elsewhere stated as the result of many
trials of it, that good rice will absorb and become tender with three times its
own bulk or measure of liquid. Thus, an exact half pint (or half pound) will
require a pint and a half, with an extremely gentle degree of heat, to convert
it into a thoroughly soft but firm mass; which would, perhaps, be rather too
dry for croquettes. A pint of milk to four ounces of rice, if well managed,
would answer better.
SAVOURY CROQUETTES OF RICE. (ENTRÉE.)
This is the French name for small fried pastry of various forms,
filled with meat or fish previously cooked; they may be made with
brioche, or with light puff-paste, either of which must be rolled
extremely thin. Cut it with a small round cutter fluted or plain; put a
little rich mince, or good pounded meat, in the centre, and moisten
the edges, and press them securely together that they may not burst
open in the frying. The rissoles may be formed like small patties, by
laying a second round of paste over the meat, or like cannelons; they
may, likewise, be brushed with egg, and sprinkled with vermicelli,
broken small, or with fine crumbs. They are sometimes made in the
form of croquettes, the paste being gathered round the meat, which
must form a ball.[137]
137. If our space will permit, more minute directions for these, and other small
dishes of the kind, shall be given in the chapter of Foreign Cookery.
In frying them, adopt the same plan as for the croquettes, raising
the pan as soon as the paste is lightly coloured. Serve all these fried
dishes well drained, and on a napkin.
From 5 to 7 minutes, or less.
VERY SAVOURY ENGLISH RISSOLES. (ENTRÉE.)
(Very delicate.)
Pare the crust neatly from one or two French rolls, slice off the
ends, and divide the remainder into as many patties as the size of
the rolls will allow; hollow them in the centre, dip them into milk or
thin cream, and lay them on a drainer over a dish; pour a spoonful or
two more of milk over them at intervals, but not sufficient to cause
them to break; brush them with egg, rasp the crust of the rolls over
them, fry and drain them well, fill them with a good mince, or with
stewed mushrooms or oysters, and serve them very hot upon a
napkin; they may be filled for the second course with warm apricot
marmalade, cherry-jam, or other good preserve. This receipt came to
us direct from Dresden, and on testing it we found it answer
excellently, and inserted it in an earlier edition of the present work.
We name this simply because it has been appropriated, with many
other of our receipts, by a contemporary writer without a word of
acknowledgment.
TO PREPARE BEEF MARROW FOR FRYING CROUSTADES,
SAVOURY TOASTS, &C.
(Author’s Receipt.)
Cut very evenly, from a firm stale loaf, slices nearly an inch and a
half thick, and with a plain or fluted paste-cutter of between two and
three inches wide press out the number of patties required,
loosening them gently from the tin, to prevent their breaking; then,
with a plain cutter, scarcely more than half the size, mark out the
space which is afterwards to be hollowed from it. Melt some clarified
beef-marrow in a small saucepan or frying-pan, and, when it begins
to boil, put in the patties, and fry them gently until they are equally
coloured of a pale golden brown. In lifting them from the pan, let the
marrow (or butter) drain well from them; take out the rounds which
have been marked on the tops, and scoop out part of the inside
crumb, but leave them thick enough to contain securely the gravy of
the preparation put into them. Fill them with any good patty-meat,
and serve them very hot on a napkin.
Obs.—These croustades are equally good if dipped into clarified
butter or marrow, and baked in a tolerably quick oven. It is well, in
either case, to place them on a warm sheet of double white blotting-
paper while they are being filled, as it will absorb the superfluous fat.
A rich mince, with a thick, well-adhering sauce, either of mutton and
mushrooms, or oysters, or with fine herbs and an eschalot or two; or
of venison, or hare, or partridges, may be appropriately used for
them.
SMALL CROUSTADES À LA BONNE MAMAN.
Fry lightly, in good butter, clarified marrow, or very pure olive oil,
some slices of bread, free from crust, of about half an inch thick, and
two inches and a half square; lift them on to a dish, and spread a not
very thick layer of Captain White’s currie-paste on the top; place
them in a gentle oven for three or four minutes, then lay two or three
fillets of anchovies on each, replace them in the oven for a couple of
minutes, and send them immediately to table. Their pungency may
be heightened by the addition of cayenne pepper, when a very hot
preparation is liked.
Obs.—We have spoken but slightly in our chapter of curries of
Captain White’s currie-paste, though for many years we have had it
used in preference to any other, and always found it excellent.
Latterly, however, it has been obtained with rather less facility than
when attention was first attracted to it. The last which we procured
directed, on the label of the jar, that orders for it should be sent per
post to 83, Copenhagen Street, Islington. It may, however, be
procured without doubt from any good purveyor of sauces and other
condiments. It is sold in jars of all sizes, the price of the smallest
being one-and-sixpence. We certainly think it much superior to any
of the others which we have tested, its flavour being peculiarly
agreeable.
TO FILLET ANCHOVIES.
Drain them well from the pickle, take off the heads and fins, lay
them separately on a plate, and scrape off the skin entirely; then
place them on a clean dish and with a sharp-edged knife raise the
flesh on either side of the back-bone, passing it from the tail to the
shoulders, and keeping it nearly flat as it is worked along. Divide
each side (or fillet) in two, and use them as directed for the
preceding toasts or other purposes. They make excellent simple
sandwiches with slices of bread and butter only; but very superior
ones when they are potted or made into anchovy butter.
SAVOURY TOASTS.
Cut some slices of bread free from crust, about half an inch thick
and two inches and a half square; butter the tops thickly, spread a
little mustard on them, and then cover them with a deep layer of
grated cheese and of ham seasoned rather highly with cayenne; fry
them in good butter, but do not turn them in the pan; lift them out,
and place them in a Dutch oven for three or [TN: missing word.]
minutes to dissolve the cheese: serve them very hot.
To 4 tablespoonsful of grated English cheese, an equal portion of
very finely minced, or grated ham; but of Parmesan, or Gruyère, 6
tablespoonsful. Seasoning of mustard and cayenne.
Obs.—These toasts, for which we give the original receipt
unaltered, may be served in the cheese-course of a dinner. Such
mere “relishes” as they are called, do not seem to us to demand
much of our space, or many of them which are very easy of
preparation might be inserted here: a good cook, however, will easily
supply them at slight expense. Truffles minced, seasoned, and
stewed tender in butter with an eschalot or two, may be served on
fried toasts or croûtons and will generally be liked.
TO CHOOSE MACCARONI AND OTHER ITALIAN PASTES.