“GRAPHOLOGY”
Thesis submitted in partial fulfillment of curriculum prescribed for
the award of the degree of
BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE AND ENGINEERING
by
Meghasree G
Poornima G Gokhale
Sindhu Chandrashekar
CERTIFICATE
This is to certify that the project titled “Graphology” is a bonafide work car-
ried out by Meghasree G, Poornima G Gokhale, Sindhu Chandrashekar in
partial fulfillment of the award of the degree of Bachelor of Engineering in Com-
puter Science and Engineering of Visvesvaraya Technological University,
Belgaum during the year 2015-16. It is certified that all corrections / sugges-
tions indicated during project presentation have been incorporated in the report.
The project report has been approved as it satisfies the academic requirements in
respect of project presentation prescribed for the Bachelor of Engineering degree.
Date: 2 .....................
3 .....................
DECLARATION
We hereby declare that the project work entitled “GRAPHOLOGY” has been independently
carried out by us under the guidance of Ms Manimala S, Assistant Professor, Department of
Computer Science and Engineering, Sri Jayachamarajendra College of Engineering, Mysuru, and
is a record of original work done by us. This project work is submitted in partial fulfillment
of the award of the degree of Bachelor of Engineering in Computer Science and Engineering
of Visvesvaraya Technological University, Belgaum, during the year 2015-16. The results
embodied in this thesis have not been submitted to any other University or Institute for the
award of any degree or diploma.
Meghasree G
Poornima G Gokhale
Sindhu Chandrashekar
Abstract
Handwriting Analysis, or Graphology, is the study of the physical traits and patterns of
handwriting, which imply the psychological state of the writer at the time of writing.
Handwriting is a reflection of each individual’s personality. The writer leaves their identity
in the subtle characters of their writing. Handwriting analysts study the handwriting and
predict the behaviour of a person based on their skills. To make this more accurate, the
proposed methodology focuses on computer-assisted handwriting analysis by considering nine
features: size, slant of the words, space between words, breaks in writing, pressure, margin,
baseline, loop of ‘e’ and the distance between the tittle (dot) and stem of ‘i’. An estimation
accuracy of 93.77% is achieved.
ACKNOWLEDGEMENT
We would like to thank the management of the Department of Computer Science and
Engineering for giving us the opportunity to carry out this project on our own, and for
trusting and acknowledging our abilities. We have great pleasure in expressing our deep sense
of gratitude to our institution, Sri Jayachamarajendra College of Engineering, Mysuru.
We extend our deep regards to Dr Syed Shakeeb Ur Rahman, Principal of Sri Jay-
achamarajendra College of Engineering for providing an excellent opportunity to carry
out our project at the Computer Science and Engineering Department.
We take this opportunity to thank our Project Guide Ms Manimala S, Assistant Profes-
sor, Department of Computer Science and Engineering, SJCE for suggestions, valuable
support, encouragement and guidance throughout the project.
Our special thanks to the team of the institute “Power of Handwriting, Mumbai”, Mr
Rajesh Jauhari, Archana Sunil Wani, Ashita Thakkar, Siddhi Surve, Bhavisha Merwana, Elisha
Virani, Qrushant Rana who took time from their busy schedule and helped us with the ground
truth by analysing the handwritten samples.
We would like to thank all the teaching and non-teaching staff of Computer Science and
Engineering Department. We also convey our gratitude to all those who have contributed to
this project directly or indirectly.
Meghasree G
Poornima G Gokhale
Sindhu Chandrashekar
Contents
1 Introduction
1.1 Introduction
1.2 Aim/Objective
1.3 Motivation for Handwriting analysis
1.4 Existing Solutions
1.5 Proposed Solution
1.6 Time schedule for the completion of project
1.7 Project Scope
2 Literature Survey
2.1 Introduction
2.2 Detailed Survey
2.2.1 Handwriting Analysis based on Segmentation Method for Prediction of Human Personality using support vector machines
2.2.2 Development of an automated handwriting analysis system
2.2.3 Artificial Neural Network for Human Behaviour Prediction through Handwriting analysis
2.2.4 Automated Prediction of human behaviour system for Career counselling of an individual through handwriting analysis
2.2.5 Honesty and Dishonesty through Handwriting Analysis
2.2.6 Handwriting Analysis for Detection of Personality Traits using Machine Learning Approach
5 System Design
5.1 Introduction
5.2 Block Diagram
5.3 Overview of the proposed algorithm
6 System Implementation
6.1 Interface Implementation
6.2 Software Implementation
6.3 Table of traits and corresponding human behaviour
8 Applications
Publication Details
References
Appendix A
Appendix B
CHAPTER 1
1 Introduction
1.1 Introduction
The authenticity of a person’s signature or handwriting in a suicide note is frequently
subjected to forensic examination during investigation in order to determine authorship. In the
medical field, handwriting analysis can be used as an aid in the diagnosis and tracking of
diseases like Parkinson’s disease, Alzheimer’s disease, and even cancer through Kanfer tests[2].
Handwriting analysis done by a computer is accurate and identifies the handwriting better
than visual inspection. Moreover computer assisted handwriting analysis is automated, efficient
and devoid of human errors. Behavioural prediction by handwriting analysis with the aid of a
computer has been studied previously by various researchers.
Understanding Graphology
Handwriting develops right from childhood. When you write, your pen is under the control
of the muscles of your fingers, hands and arm. All these body parts are under the control of
your mind. The manner in which the words are eventually formed by the pen must bear a
direct relationship to the mind that guides their formation. Each vibration of movement is
unconsciously directed by the brain, so we can judge the mental state of the writer. It is a guide
to the will power, intellect and emotions of a person[4].
1.2 Aim/Objective
The objective of our project is the development of an automated technique for predicting the
personality of a person through image processing. The main focus of the project is feature
extraction and classification. All features are extracted automatically from the digital image
of the handwriting. Segmentation and recognition of cursive English handwriting is a monotonous
task.
1.3 Motivation for Handwriting analysis
Many people want to know why they should go for handwriting analysis. Handwriting analysis
is both important and necessary for us at some level. There are a variety of reasons why a
person should get their handwriting analyzed at least once. The main reason is that handwriting
analysis reveals many things about your personality. It is a tool to know your strengths and
weak points. Until a few years ago, handwriting analysis was restricted to forensic experts.
Not anymore. Today, several multinational corporate firms use handwriting analysis to learn
the personality traits of their job candidates. Handwriting tells about your suitability for
the job, your motivational level, creativity, leadership, teamwork, etc. In fact, we would
suggest that you not only get your handwriting analyzed but also learn the art of handwriting
analysis. The advantages are many, such as having access to the inner feelings of the person
sitting next to you or of your colleague. It is productive and beneficial to know how to get
along with other people and to know yourself a little better.
1.4 Existing Solutions
• Hough transform is a technique used to detect any arbitrary shape in an image by creating
a table for storing all the edge pixels of the target shape[6].
• Template matching with certain predefined templates is also used as a technique for
behavioural analysis. The methods discussed are not simple to automate, and therefore a need
exists for a simple solution that can be automated easily[8].
• Generalised Hough Transform (GHT) is used to determine the shape of the lower loop of the
letter ‘y’, and the height of the t-bar on the stem of the letter ‘t’ is calculated using
template matching[9].
1.5 Proposed Solution
A new methodology is proposed for human behavioural prediction through automated handwriting
analysis. Initially the image of the handwriting sample is acquired and subjected to
pre-processing. Further, for the extraction of different traits, the image is segmented into
lines, words, and letters. In this system we consider nine different features: size of the
words, space between the words, slant, baseline, margin, pressure of writing, breaks in
writing, loop of ‘e’ and dot distance of ‘i’.
These features are extracted through various methods and then classified into different
categories. Each trait indicates a different aspect of human behaviour.
1.6 Time schedule for the completion of project
The figure below describes the timeline followed. Each step involved documenting the
process and meeting with the guide.
1.7 Project Scope
The scope of this project is restricted to images of 300 dpi resolution, and the writing should
be on unruled paper. No trained operator is required to use this software.
CHAPTER 2
2 Literature Survey
2.1 Introduction
The literature survey chapter provides an overview of previous research on computer-assisted
handwriting analysis. It introduces the framework for the work that comprises the main focus
of the research described in this thesis, and it provides a description, summary, and critical
evaluation of each work surveyed. A survey may draw on scholarly journals, books,
dissertations, conference proceedings, etc. It may be completed en route to an essay, thesis or
dissertation and included in the final project, or it may be conducted as its own entity. Here
the survey was carried out in order to scope out the key data collection requirements for the
primary research, and it formed part of the emergent research design process. An appreciation
of previous work in this area served three further purposes. First, by providing direction in
the construction of data collection tools, it guarded against the risk of overload at the
primary data collection stages of the project. Second, working the findings from the extant
literature into a formal review helped maintain a sense of the topic’s perspective throughout
the study. Finally, this activity raised the opportunities for articulating a critical analysis
of the actual meaning of the data collected when the data analysis stages of the research were
reached.
Throughout history, scientists, philosophers, artists and others have been interested in the
relationship between handwriting and the writer. This interest appeared as early as 1622.
Efforts at handwriting analysis began in 1872 with the work of the French abbé Hippolyte
Michon, who gave graphology its name[10].
Methods proposed in the literature involve a preliminary process of text extraction from the
sample, followed by the application of various algorithms or techniques to determine the
characteristic traits. The polygonalization method is one such technique: a closed polygon is
produced around a line in the scanned image of the handwritten text, and the slope of the
text is found using the coordinates of the polygon. The Generalized Hough Transform is a second
technique, used to detect any arbitrary shape in an image by creating a table that stores all
the edge pixels of the target shape. Template matching with certain predefined templates is
also used as a technique for behavioural analysis. The segmentation method, which involves
splitting the handwriting sample into individual letters, is another approach available in the
literature. The methods discussed are not very simple to automate, and therefore a need exists
for a simple method that can be automated easily.
2.2.1 Handwriting Analysis based on Segmentation Method for Prediction of Human Personality using support vector machines
In this method, various features can be used to predict the personality of an individual,
such as slant, size, pressure, upper zone or case (as in i, t, h, s, etc.), lower zone (as in
g, q, y, z, etc.), word spacing, line spacing, page margins, middle zone or case (as in a, o, c,
s, e, etc.), arcade, garland, angle, thread, wavy line (written by authors who are mentally
mature and skilful), and many others. The proposed system, however, uses only six of these
features: size of the letters, slant of words and letters, baseline, pen pressure, spacing
between letters and spacing between words, as they are enough to predict the behaviour of the
person. The main focus of this paper is feature extraction and classification. All features are
extracted automatically from the digital image of the handwriting. These samples are then input
to the support vector machine for classification[5].
The classifier used here is an SVM, as shown in figure 2, as it gives good results with
higher accuracy. In testing, the SVM was also more time-efficient than a neural network. It
analyses the data and recognizes patterns in it. Multi-class SVM aims to assign labels to
instances by using support vector machines, where the classes are drawn from a finite set of
elements. An SVM takes training data $x_1, x_2, \ldots, x_n$ with each $x_i \in \mathbb{R}^d$,
together with the class labels $y_1, y_2, \ldots, y_n$ indicating the class to which each
sample belongs. In a binary SVM $y_i \in \{-1, +1\}$, while in a multi-class SVM $y_i$ ranges
from 1 to $N$, where $N$ is the number of classes. The linear SVM classification function is
$f(x) = w \cdot x + b$, which corresponds to the separating hyperplane $w \cdot x + b = 0$.
The hyperplane separates the training samples with the maximum margin, such that the distance
from the closest point to the plane is $1/\lVert w \rVert$.
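The linear decision function and the margin can be illustrated with a short Python sketch. This is not the surveyed paper's code: the weight vector and bias below are hand-picked, hypothetical values rather than parameters learned from handwriting features, and the function names are chosen only for this example.

```python
import math

def decision_value(w, b, x):
    """Linear SVM decision function f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Binary label in {-1, +1}, taken from the sign of f(x)."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def margin(w):
    """Distance 1 / ||w|| between the hyperplane and the closest training point."""
    return 1.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical parameters, chosen by hand for illustration only.
w, b = [3.0, 4.0], -5.0
print(decision_value(w, b, [2.0, 1.0]))  # 3*2 + 4*1 - 5 = 5.0
print(predict(w, b, [0.0, 0.0]))         # f = -5, so the label is -1
print(margin(w))                         # ||w|| = 5, so the margin is 0.2
```

A multi-class setting, as described above, is usually built from several such binary decision functions, for example one per class.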
2.2.2 Development of an automated handwriting analysis system
This work focuses on the development of an automated technique for determining the
characteristic traits of a person through image processing, called AHWAS (Automated Handwriting
Analysis System). The proposed work requires less pre-processing of the image, as it crops the
given sample automatically and uses an RGB filter to extract the text in the handwriting, and
it identifies eight features in the handwriting simultaneously. The features identified are:
size of the letters, baseline, pressure of the writing, slant of the handwriting, number of
breaks, spacing between the words, margins and speed of the writing in the sample. The system
is designed to directly indicate the behaviour of the person from the above features. The
system can be used in various applications such as detection of diseases like Parkinson’s
disease or Alzheimer’s disease, forensic document examination and lie detection[11].
The primary step in calibrating the images, so as to extract the maximum amount of the
handwriting, is to use an RGB (red green blue) filter to obtain a clear image of the
handwriting. The width of an A4 sheet, which is 8.27 inches, is used for size calibration. This
was done by determining the edges of the paper and passing the input to the program.
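The size calibration described here amounts to a simple ratio, sketched below in Python. The function names and the example edge positions are assumptions made for illustration; the surveyed system's actual edge detection is not reproduced.

```python
A4_WIDTH_INCHES = 8.27  # width of an A4 sheet, as stated in the text

def pixels_per_inch(left_edge_px, right_edge_px):
    """Scale factor derived from the detected left and right paper edges."""
    width_px = right_edge_px - left_edge_px
    if width_px <= 0:
        raise ValueError("right edge must lie to the right of the left edge")
    return width_px / A4_WIDTH_INCHES

def to_inches(length_px, ppi):
    """Convert a measurement in pixels (e.g. a letter height) to inches."""
    return length_px / ppi

# Hypothetical edge positions: a sheet spanning 2481 pixels is about 300 dpi.
ppi = pixels_per_inch(0, 2481)
print(round(ppi))                      # 300
print(round(to_inches(150, ppi), 2))   # 0.5
```

Once the scale is known, every pixel measurement taken from the sample (letter size, margins, word spacing) can be reported in physical units.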
2.2.3 Artificial Neural Network for Human Behaviour Prediction through Handwriting analysis
In this paper a method has been proposed to predict the personality of a person from the
features extracted from his handwriting using Artificial Neural Networks. The personality
traits revealed by the baseline, the pen pressure and the letter ‘t’ as found in an
individual’s handwriting are explored in this paper. Three parameters, the baseline, the pen
pressure and the height of the t-bar on the stem of the letter ‘t’, are the inputs to the ANN,
which outputs the personality trait of the writer. The baseline is evaluated using the
polygonalization method, and the pen pressure is evaluated using the grey-level threshold
value. The height of the t-bar on the stem of the letter ‘t’ is calculated using template
matching. The baseline, the pen pressure and the letter ‘t’ in one’s handwriting reveal a lot
of information about the writer. MATLAB is the tool used for the purpose. The performance is
measured by examining multiple samples[12].
The main technique for finding the baseline slant is the polygonalization of a single line
in the handwritten text. The pen pressure can be calculated as a count of the number of
foreground pixels in the thresholded image. The number of black pixels is indicative of the pen
pressure, the thickness of strokes, and the size of the writing. The lower-case letter ‘t’ is
one of the letters in one’s handwriting that reveals a lot of information about the writer.
People write this letter in many different ways. There are various ways to make the stem, the
cross on the t-bar, and even the entry and exit to the letter ‘t’, each of which relates to a
specific personality trait of a person.
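The pen-pressure measure described above can be sketched in Python as a count over an already thresholded image. The convention that 1 marks an ink pixel, and the extra ink-fraction value, are assumptions of this sketch and are not taken from the surveyed paper.

```python
def pen_pressure(binary):
    """Return (ink pixel count, fraction of the image covered by ink).

    `binary` is a list of rows; by assumption 1 marks a foreground (ink)
    pixel and 0 marks background.
    """
    total = sum(len(row) for row in binary)
    ink = sum(sum(row) for row in binary)
    return ink, (ink / total if total else 0.0)

# Tiny hypothetical sample: 5 ink pixels in a 3 x 4 image.
sample = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 0],
]
print(pen_pressure(sample))  # 5 ink pixels out of 12
```

Reporting the count as a fraction of the image area makes samples of different sizes comparable, which the raw count alone does not.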
The features in handwriting from which behaviour is usually predicted include pen pressure,
baseline, slant, width of margins, spacing between letters, spacing between words, measurements
of the writing, height of the bar on the letter ‘t’, the letter ‘y’, etc. A method may be
proposed to predict the personality of a person from the features extracted from his
handwriting using Artificial Neural Networks. Most research has been done on recognizing the
characters of handwriting, and Artificial Neural Networks (ANN) are commonly used for the
recognition. It is easier to apply a neural network for that purpose because ANN is known as a
good method for pattern recognition. However, the training process for an ANN requires a lot
of time and data[13].
This work implements a multi-input, multi-layered neural network (a parallel distributed
system) for the recognition of Punjabi characters. The network is trained using back
propagation, and the trained network is then used to recognize the patterns it was trained on
and to classify them under the distinct output classes it was taught to group them under. This
problem is divided into two phases:
2.2.5 Honesty and Dishonesty through Handwriting Analysis
The technique of lying, it seems, has at least three ways of achieving its ends. In the
liar’s presentation of the story
(b) One (essential) part is left out and a freely invented part is substituted for it.
(c) One (essential) part is left out and the gap is filled with chitchat, or meaningless, vague
tales.
In all three ways, the liar tries carefully not to appear as such; his story and approach
must not arouse suspicion[14].
These two seemingly different handwritings were written by one person, a pathological
liar. She executed this writing for the doctor who had her under his care, in order to show
“how clever she was”. From the standpoint of graphology, these handwritings are identical
with the exception of the slant; neither contains a basic characteristic that the other lacks.
The pathological liar, to be sure, is not merely a person who tells many lies. He is almost
completely identified with the false roles he unconsciously assumes. Consequently, he will
characteristically show two or more different styles of writing, rather than merely the slips
of the “habitual liar”. Such shifting of style is the clue to pathology, which the graphologist
can discover.
2.2.6 Handwriting Analysis for Detection of Personality Traits using Machine Learning Approach
Among all the unique characteristics of a human being, handwriting carries the richest
information for gaining insights into the physical, mental and emotional state of the writer.
Graphology is the art of studying and analysing handwriting, a method used to determine a
person’s personality by evaluating various features of the handwriting. The prime features of
handwriting, such as the page margins, the slant of the letters, the baseline, etc., can tell a
lot about the individual. To make this method more efficient and reliable, machines can be
introduced to perform the feature extraction and the mapping to various personality traits.
This complements the graphologists and also increases the speed of analysing handwritten
samples. Various approaches can be used for this type of computer-aided graphology. In this
method a novel machine learning approach to implementing an automated handwriting analysis
tool is discussed[15].
Figure 5: Process Flow diagram for generation of training dataset
This paper proposes a methodology to predict the personality traits of an individual from
the features extracted from handwriting using a machine learning approach. It explores the
personality traits revealed by the baseline, margin, slant of the words and height of the
t-bar in a person’s handwriting. These features are extracted from the handwriting samples
into feature vectors, which are compared with an initially trained data set and then mapped to
the class with the corresponding personality trait. The baseline is evaluated using the method
of polygonalization, while the margin is calculated using the method of vertical scanning. The
height of the t-bar on the stem of the letter ‘t’ and the word slant are calculated using
template matching.
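One plausible reading of "vertical scanning" for the margin is to examine the page column by column and note where ink first and last appears. The Python sketch below follows that reading; it is an assumption, not the surveyed paper's procedure, and it assumes a binary image in which 1 marks ink.

```python
def margins(binary):
    """Return (left margin, right margin) in pixels, or None for a blank page.

    Each column is scanned from top to bottom; a column counts as written
    if any pixel in it is ink (1, by assumption).
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    ink_columns = [c for c in range(width)
                   if any(binary[r][c] for r in range(height))]
    if not ink_columns:
        return None
    left = ink_columns[0]                 # blank columns before the first ink
    right = width - 1 - ink_columns[-1]   # blank columns after the last ink
    return left, right

# Hypothetical 2 x 5 page: ink occupies columns 2 and 3.
page = [
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0],
]
print(margins(page))  # (2, 1)
```

The same scan applied to rows instead of columns would give the top and bottom margins.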
Figure 7: Mathematical model
The proposed system can be used as a complementary tool by the graphologist to improve
the accuracy of handwriting analysis and also make the process fast. It will also assist the
HR/company employer in decision making regarding the suitability of an employee for the
specific job and improving the retention of an employee. The future work can be to include
more features from the micro approach of handwriting analysis like the loops of alphabet ‘f’
and ‘l’, gradient, concavity of letters and so on in order to predict more accurate results.
CHAPTER 3
3 System Requirements and Analysis
3.1 Introduction
Requirement specification encompasses the needs of this project. It aims at providing a full
description of the requirements based on the concepts defined in the Problem Domain. The sys-
tem requirement specification is produced at the culmination of the analysis task. The function
and performance allocated to software as part of system engineering are refined by establishing
a complete information description, a detailed functional description, representation of sys-
tem behavior, an indication of performance requirements and design constraints, appropriate
validation criteria and other information pertinent to requirements.
• The spacing between lines in the sample should be sufficiently large so that segmentation
is done correctly.
• Processors: Any Intel or AMD x86 processor supporting the SSE2 instruction set.
• Disk Space: 1 GB for MATLAB only; 3-4 GB for a typical installation.
• Input device : Mouse and Keyboard
CHAPTER 4
4 Tools and Technology Used
Image processing is a technique to enhance raw images received from cameras or sensors placed
on satellites, space probes and aircraft, or pictures taken in normal day-to-day life, for
various applications. There are two methods available in image processing: analog image
processing and digital image processing. Various techniques have been developed in image
processing during the last four to five decades. Most of the techniques were developed for
enhancing images obtained from unmanned spacecraft, space probes and military reconnaissance
flights. Image processing systems are becoming popular due to the easy availability of
powerful personal computers, large memory devices, graphics software, etc. The common steps in
image processing are image scanning, storing, enhancing and interpretation. After the image is
converted into bit information, processing is performed. This processing may be image
enhancement, image restoration or image compression.
This section discusses image processing operations, which are used in this project.
1. Image Segmentation
Image segmentation is the process of partitioning a digital image into multiple segments
(sets of pixels, also known as super pixels). The goal of segmentation is to simplify and/or
change the representation of an image into something that is more meaningful and easier
to analyze. Image segmentation is typically used to locate objects and boundaries (lines,
curves, etc.) in images. More precisely, image segmentation is the process of assigning a
label to every pixel in an image such that pixels with the same label share certain visual
characteristics.
2. Image Thresholding
In image processing, thresholding is used to split an image into smaller segments, or chunks,
using at least one color or grayscale value to define their boundary.
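A minimal Python sketch of grayscale thresholding with a single cut-off value is shown below. The default threshold of 128 and the convention that dark pixels become 1 (ink) are assumptions made for this example.

```python
def threshold_image(gray, t=128):
    """Binarize a grayscale image given as rows of values in 0..255.

    Pixels darker than `t` become 1 (ink); all others become 0 (paper).
    Both the cut-off and the 1 = ink convention are assumptions.
    """
    return [[1 if value < t else 0 for value in row] for row in gray]

# Hypothetical 2 x 3 patch: dark strokes on light paper.
gray = [
    [250, 30, 245],
    [240, 20, 10],
]
print(threshold_image(gray))  # [[0, 1, 0], [0, 1, 1]]
```

A fixed cut-off works when scans are evenly lit; uneven lighting usually calls for a threshold chosen from the image itself.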
4.2 Tool Used for Implementation
4.2.1 Matlab
Matlab is well adapted to numerical experiments since the underlying algorithms for Matlab’s
built in functions are based on the standard libraries LINPACK and EISPACK. Matlab program
and script files always have filenames ending with “.m”; the programming language is excep-
tionally straightforward since almost every data object is assumed to be an array. Graphical
output is available to supplement numerical results.
4.3.1 LaTex
LaTeX is a document markup language and document preparation system for the TeX typesetting
program. The term LaTeX refers only to the language in which documents are written, not
to the editor application used to write those documents. In order to create a document in
LaTeX, a .tex file must be created using some form of text editor. While most text editors
can be used to create a LaTeX document, a number of editors have been created specifically
for working with LaTeX. LaTeX is widely used in academia. As a primary or intermediate format,
e.g., when translating DocBook and other XML-based formats to PDF, LaTeX is used because of the
high quality of typesetting achievable by TeX. The typesetting system offers programmable
desktop publishing features and extensive facilities for automating most aspects of typesetting
and desktop publishing, including numbering and cross-referencing, tables and figures, page
layout and bibliographies. LaTeX is intended to provide a high-level language that accesses the
power of TeX. LaTeX essentially comprises a collection of TeX macros and a program to process
LaTeX documents.
CHAPTER 5
5 System Design
5.1 Introduction
System design describes any constraints on the design and includes any assumptions made by
the project team in developing it. It is the process or art of defining the hardware and
software architecture, components, modules, interfaces, and data for a computer system to
satisfy specified requirements. One could see it as the application of systems theory to
computing. The design of the system is essentially a blueprint, or a plan for a solution. We
consider a system to be a set of components with clear and defined behaviour, which interact
with each other in a fixed, defined manner to provide some behaviour or services to the
environment. Our system is a piece of software that can be used to predict the personality of
a person from his handwriting.
5.2 Block Diagram
Figure: block diagram with the stages Image Pre-Processing, Image Segmentation, Trait
Acquisition and Behavioral Prediction.
• The handwritten forms are taken from IAM database. The database contains forms of
unconstrained handwritten text, which were scanned at a resolution of 300dpi and saved
in png format with 256 gray levels[16].
• The image is pre-processed to remove the background noise by thresholding the image and
it is subjected to different types of segmentation. They are :
1. Line Segmentation
2. Word Segmentation
3. Letter Segmentation.
1. Size
2. Pressure
3. Margin
4. Baseline
5. Breaks
7. Slant
8. Loop of ‘e’
5.3 Overview of the proposed algorithm
CHAPTER 6
6 System Implementation
6.1 Interface Implementation
A graphical user interface (GUI) is a type of user interface that allows the user to interact
easily using graphics rather than text commands. A GUI represents the information and actions
available to a user through graphical icons and visual indicators such as secondary notation,
as opposed to text-based interfaces, typed command labels or text navigation. The actions are
usually performed through direct manipulation of the graphical elements. A GUI allows users to
perform tasks interactively through controls such as buttons and sliders. Within MATLAB, GUI
tools enable us to perform tasks such as creating and customizing plots (plottools), fitting
curves and surfaces (cftool), and analyzing and filtering signals (sptool). We can also create
custom GUIs for others to use, either by running them in MATLAB or as standalone applications.
In our implementation, we have used the user interface to take the input from the user
interactively: the handwritten image to be analysed is selected, and the personality is
predicted from the selected image in the UI.
6.2 Software Implementation
This section narrates step by step implementation and control flow of our system.
1. Image Acquisition
The IAM Handwriting Database contains forms of handwritten English text which can
be used to train and test handwritten text recognizers and to perform writer identification
and verification experiments[16]. A sample image is shown in the figure 8.
2. Pre-Processing of Image
3. Segmentation of Image Handwriting
In this process, the sample of handwriting is segmented in three different ways for
efficient analysis, i.e. line segmentation, word segmentation and letter segmentation.
Firstly, the paragraph is divided into separate lines so that features like baseline
and spacing between words can be extracted to examine the writer’s emotional stability
and dispositions in the baseline of the writing. Here the image is scanned horizontally
and a logical vector is created recording which rows have text and which do not, based
on a threshold[17].
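The horizontal scan described above can be sketched in Python as follows. This is a minimal illustration and not the project's MATLAB code; it assumes a binary image in which 1 marks ink, and the `min_ink` threshold is a hypothetical parameter.

```python
def segment_lines(binary, min_ink=1):
    """Return (first_row, last_row) for each text line in a binary page.

    A row "has text" when it contains at least `min_ink` ink pixels (1s).
    Consecutive text rows are grouped into one line.
    """
    has_text = [sum(row) >= min_ink for row in binary]  # the logical vector
    lines, start = [], None
    for r, flag in enumerate(has_text):
        if flag and start is None:
            start = r                      # a new line begins
        elif not flag and start is not None:
            lines.append((start, r - 1))   # the line ended on the row before
            start = None
    if start is not None:                  # line runs to the bottom edge
        lines.append((start, len(has_text) - 1))
    return lines

# Hypothetical page: rows 1-2 form one line, row 4 forms another.
page = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
print(segment_lines(page))  # [(1, 2), (4, 4)]
```

Raising `min_ink` makes the scan ignore rows that contain only stray specks, at the cost of clipping very thin strokes.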
This segmentation process is used to segment the words in the digital handwriting
document in order to calculate the word-related features (size, breaks in writing,
slant), which indicate the disposition towards criticism and towards argument. Here the
image is first dilated to connect all the letters in each word, and then each word is
cropped out.
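The dilate-then-crop idea can be illustrated with a simplified Python sketch that works on the column occupancy of a single text line instead of a full two-dimensional dilation. That simplification, the `radius` parameter and the 1 = ink convention are assumptions of this example, not the project's implementation.

```python
def segment_words(line, radius=2):
    """Return (first_col, last_col) for each word in a binary text line.

    Columns containing ink are dilated horizontally by `radius`, so gaps
    of up to 2 * radius columns (between letters) are bridged, while the
    wider gaps between words are not.
    """
    width = len(line[0]) if line else 0
    ink = [any(row[c] for row in line) for c in range(width)]
    dilated = [any(ink[max(0, c - radius):c + radius + 1]) for c in range(width)]
    words, start = [], None
    for c in range(width + 1):
        on = c < width and dilated[c]
        if on and start is None:
            start = c
        elif not on and start is not None:
            cols = [k for k in range(start, c) if ink[k]]
            words.append((cols[0], cols[-1]))  # trim the dilation padding
            start = None
    return words

# Hypothetical line: letters at columns 0 and 2, a wide gap, then columns 9-10.
line = [[1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1]]
words = segment_words(line)
print(words)  # [(0, 2), (9, 10)]
crops = [[row[a:b + 1] for row in line] for a, b in words]  # cropped word images
```

The radius has to sit between the typical letter gap and the typical word gap, so in practice it would be tuned to the writing size.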
Figure 10: Word Segmentation
Character segmentation and recognition has been an active field of research for many
years. It still remains an open problem in the field of pattern recognition and image
processing. There are mainly three phases in a character recognition system, namely
preprocessing, segmentation and recognition. Preprocessing aims at eliminating the
variability that is inherent in printed words. Preprocessing techniques such as
background noise removal, scaling, thinning, skew removal, etc. have been employed
by various researchers in an attempt to increase the performance of the segmentation
and recognition process. The role of segmentation is to find correct letter boundaries.
Segmentation precedes character recognition, which means that the output of
segmentation becomes the input to the character recognition module. Segmentation of
off-line cursive words into characters is one of the most difficult and important
processes in handwriting recognition, as it directly affects the result of the
recognition process[18].
The letter segmentation module plays a crucial role in the successful performance of the
overall recognition system. Here segmentation is performed on each letter of every word
in the digital handwriting document. This segmentation is used to calculate the
letter-level features (for ‘e’ and ‘i’) for the prediction of the personality
of individuals.
One crucial step in this part is slant removal: if the words slant to the left or to the
right, the slant has to be removed first. To do this, the mean slant angle of the image is
found by looking for the maximum projection values, and the image is then transformed
by the corresponding geometric transformation.
i. Pre-processing
ii. Thinning
eroding an image. The grayscale images are then converted into a binary matrix
format. The resulting binary images have a value of 0 for every foreground
(black) pixel and 1 for every background (white) pixel.
iv. Overview
There are two types of characters in the English language. Characters of the first
type are called “Closed Characters” and contain a loop or a semi-loop, such as ‘a’, ‘b’,
‘c’, ‘d’, ‘e’, ‘g’, ‘o’, ‘p’, ‘s’ etc. Characters of the second type are termed “Open
Characters” and have no loop or semi-loop, e.g. ‘u’, ‘v’, ‘w’, ‘m’, ‘n’, ‘i’ etc.
In cursive handwritten words, a ligature is a link (a small foreground component)
between two successive characters that joins them. For open characters, it is very
difficult to differentiate between ligatures and characters because of the cursive
nature of handwriting. Two consecutive ‘i’ characters may give the illusion of a
‘u’ and vice versa. The consecutive characters ‘n’ and ‘i’ may look like ‘m’.
Likewise, a handwritten ‘w’ may look like the consecutive characters ‘i’ and ‘v’.
To overcome such challenges in cursive handwriting segmentation and recognition,
a new segmentation approach is developed that analyses the geometric features of
the characters, such as their shape, to identify the characters and the ligatures.
v. Methodology
The word image is scanned vertically, from top to bottom, column by column, and
the number of foreground pixels in the inverted word image is counted in each
column. Each column for which the sum of foreground (white) pixels is 0 or 1
is a Potential Segmentation Column (PSC) and cuts the word image vertically.
The positions of all these columns are saved. The image is then converted from
binary format to RGB format, which makes it easier to display the PSCs in a
colour other than black and white.
vi. Implementation
Step 2: To minimize the computational complexity, the input word image is
inverted for further processing. By complementing the input binary image, the white
pixels become the foreground pixels and the black pixels become the background
pixels. Hence, it becomes easier to count the foreground white pixels, represented
by 1, in each column of the word image.
Step 3: This image is now converted from binary format to RGB format, which
makes it easier to display the PSCs (Potential Segmentation Columns) in a
colour other than black and white.
Step 4: All PSCs over-segmenting the word image are drawn in red. Each
column in the word image for which the sum of foreground white pixels is 0 or 1
is a PSC and cuts the word image vertically.
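Steps 2 to 4 can be sketched in a few lines of plain Python. The sketch assumes the input convention given earlier (0 for black ink, 1 for white background); the function names find_psc and mark_psc and the parameter max_pixels are hypothetical and not taken from the project code.

```python
RED, WHITE, BLACK = (255, 0, 0), (255, 255, 255), (0, 0, 0)


def find_psc(word_image, max_pixels=1):
    """Return the indices of the Potential Segmentation Columns.

    word_image: binary rows with 0 = black ink and 1 = white background.
    """
    # Step 2: complement the image so that ink becomes 1 (white foreground).
    inverted = [[1 - p for p in row] for row in word_image]
    width = len(inverted[0])
    # Count the foreground pixels in every column.
    column_sums = [sum(row[x] for row in inverted) for x in range(width)]
    # A column with 0 or 1 foreground pixels is a PSC.
    return [x for x, total in enumerate(column_sums) if total <= max_pixels]


def mark_psc(word_image, columns):
    """Steps 3 and 4: build an RGB copy of the inverted image with PSCs in red."""
    marked = set(columns)
    return [
        # In the inverted image ink is white and background is black.
        [RED if x in marked else (BLACK if p else WHITE) for x, p in enumerate(row)]
        for row in word_image
    ]


word = [
    [1, 0, 1, 1, 0, 1],
    [1, 0, 1, 1, 0, 1],
    [1, 0, 0, 0, 0, 1],
]
print(find_psc(word))
```

In this toy word the two vertical strokes are joined by a one-pixel ligature, so the blank border columns and the ligature columns are all reported as PSCs, which is the over-segmentation that the later steps have to resolve.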
4. Feature Extraction
(a) Size
Size has its own significance in handwriting analysis. Size is an unfixed
trait[21]. It indicates whether the writer is an extrovert or an introvert and reflects
the writer's present capacity for concentration. It is determined by the average height.
To get the average height, the image is scanned column by column to find the first and
the last black pixel in each column, and the resulting heights are averaged over all columns.
Figure 13: Size of word
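A minimal sketch of the column scan is given below. It assumes a binary word image with 1 for ink, and it skips empty columns, which is an assumption since the text does not say how they are handled; the name average_height is hypothetical.

```python
def average_height(word_image):
    """Mean ink height over the columns of a binary word image (1 = ink).

    For every column the distance from the first to the last ink pixel is
    measured; columns without ink are ignored.
    """
    heights = []
    for x in range(len(word_image[0])):
        rows = [y for y, row in enumerate(word_image) if row[x]]
        if rows:
            heights.append(rows[-1] - rows[0] + 1)
    return sum(heights) / len(heights) if heights else 0.0


word = [
    [0, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
]
print(average_height(word))
```

The resulting value would then be compared against size thresholds to label the writing as small, medium or large; those thresholds are not stated in the text and are therefore not modelled here.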
(b) Baseline
The imaginary line on which the writer writes on blank paper is called the baseline.
The emotional stability and disposition of the writer are judged by the baseline of the
handwriting. It is obtained by finding the coordinates of all the pixels and fitting a
line to them from left to right. The angle of the fitted line, which passes through the
points (x1, y1) and (x2, y2), is calculated by the formula
\[
\text{Angle} = \tan^{-1}\left(\frac{y_2 - y_1}{x_2 - x_1}\right) \tag{1}
\]
Figure 14: Process of finding baseline
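The line fit and the angle formula can be illustrated as follows. The sketch uses an ordinary least-squares fit, which is an assumption because the fitting method is not named in the text, and it expects points with y increasing upwards (row indices of an image would have to be negated first). The name baseline_angle is hypothetical.

```python
import math


def baseline_angle(points):
    """Angle in degrees of the least-squares line through (x, y) points.

    Assumes y grows upwards and that the points have at least two distinct x.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Two points on the fitted line, taken at its left and right ends.
    x1 = min(x for x, _ in points)
    x2 = max(x for x, _ in points)
    y1 = slope * x1 + intercept
    y2 = slope * x2 + intercept
    # Equation (1): Angle = arctan((y2 - y1) / (x2 - x1)).
    return math.degrees(math.atan((y2 - y1) / (x2 - x1)))


rising = [(0, 0), (1, 1), (2, 2)]
level = [(0, 5), (4, 5), (10, 5)]
print(baseline_angle(rising), baseline_angle(level))
```

A positive angle corresponds to an upward baseline, a negative one to a downward baseline, and a value near zero to a straight baseline.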
(c) Breaks
This feature represents the connectivity within a word in the handwriting. A person may
write a word continuously without any break, may connect 2-3 letters at a time,
or may leave a break after every letter. Each way of writing reflects a different
personality of the individual. The feature is determined as the total number of breaks
divided by the total number of words in the given sample. The number of breaks is obtained
by applying the bounding box method to each word.
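One way to sketch this count is to label the connected components of each word, since every component would receive its own bounding box. This substitutes connected-component counting for the bounding box method and is an assumption, as are the 8-connectivity and the names count_components and breaks_per_word. Note that the dot of an ‘i’ would also count as a separate component in this simple form.

```python
def count_components(word_image):
    """Count 8-connected ink components in a binary word image (1 = ink)."""
    height, width = len(word_image), len(word_image[0])
    seen = [[False] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if word_image[y][x] and not seen[y][x]:
                count += 1
                seen[y][x] = True
                stack = [(y, x)]
                while stack:                      # flood fill one component
                    cy, cx = stack.pop()
                    for ny in range(max(0, cy - 1), min(height, cy + 2)):
                        for nx in range(max(0, cx - 1), min(width, cx + 2)):
                            if word_image[ny][nx] and not seen[ny][nx]:
                                seen[ny][nx] = True
                                stack.append((ny, nx))
    return count


def breaks_per_word(word_images):
    """Total number of breaks divided by the total number of words."""
    total_breaks = sum(max(count_components(w) - 1, 0) for w in word_images)
    return total_breaks / len(word_images)


connected = [[1, 1, 1, 1]]
broken = [[1, 0, 1, 0, 1]]
print(breaks_per_word([connected, broken]))
```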
(d) Spacing between the words
Word spacing indicates the disposition towards criticism and towards argument.
This feature is obtained as the number of pixels between the end of one word and
the start of the next word.
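Given the column range of each word in a line, the gaps follow directly. The sketch below assumes ranges of the form (left, right) with right exclusive; the names word_gaps and mean_gap are hypothetical.

```python
def word_gaps(word_ranges):
    """Pixel gaps between consecutive words of one text line.

    word_ranges: (left, right) column ranges, right exclusive.
    """
    ordered = sorted(word_ranges)
    return [nxt[0] - cur[1] for cur, nxt in zip(ordered, ordered[1:])]


def mean_gap(word_ranges):
    """Average spacing between words, or 0.0 for a line with a single word."""
    gaps = word_gaps(word_ranges)
    return sum(gaps) / len(gaps) if gaps else 0.0


ranges = [(0, 4), (10, 12), (20, 25)]
print(word_gaps(ranges), mean_gap(ranges))
```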
(e) Margin
Margin is the space left on the left or the right of the page while writing. This feature
reveals the temperament, adjustment and social interaction of the person
with society. It is obtained by applying a bounding box to the whole paragraph.
The bounding box gives four values [left, top, width, height], from which the
margin distances are derived.
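The bounding box and the margins derived from it can be sketched as follows, assuming a binary page image with 1 for ink that contains at least one ink pixel. The function names bounding_box and margins are hypothetical.

```python
def bounding_box(page):
    """Return [left, top, width, height] of all ink in a binary page (1 = ink)."""
    rows = [y for y, row in enumerate(page) if any(row)]
    cols = [x for x in range(len(page[0])) if any(row[x] for row in page)]
    left, top = cols[0], rows[0]
    return [left, top, cols[-1] - left + 1, rows[-1] - top + 1]


def margins(page):
    """Left and right margin widths in pixels, derived from the bounding box."""
    left, _top, width, _height = bounding_box(page)
    page_width = len(page[0])
    return left, page_width - (left + width)


page = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
print(bounding_box(page), margins(page))
```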
(f) Pressure
Pressure is the weight of the handwriting. How much energy is available to the
person at the time of writing is evident in the pressure[22], which also reveals the
physical and mental level of the person. Based on the pen pressure, a writer can be
classified as a light, medium or heavy writer[23]. To measure this, a simple
thresholding technique is applied to the intensity of the foreground pixels in
the image.
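A thresholding scheme of this kind might look like the sketch below. It assumes an 8-bit grayscale image in which darker ink means heavier pressure; all three threshold values and the name pen_pressure are hypothetical placeholders, since the text does not give the values that were used.

```python
def pen_pressure(gray, ink_threshold=128, heavy_below=60, medium_below=110):
    """Classify pen pressure from an 8-bit grayscale image (0 = black).

    Pixels darker than ink_threshold are treated as foreground; the mean
    foreground intensity is then compared against two assumed cut-offs.
    """
    ink = [p for row in gray for p in row if p < ink_threshold]
    if not ink:
        return None                      # no handwriting found
    mean_intensity = sum(ink) / len(ink)
    if mean_intensity < heavy_below:
        return "heavy"
    if mean_intensity < medium_below:
        return "medium"
    return "light"


sample = [
    [255, 20, 30],
    [255, 255, 40],
]
print(pen_pressure(sample))
```

In practice the cut-offs would need to be calibrated against samples scanned under the same conditions, because scanner settings change the recorded intensities.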
(g) Slant
The slant of the writing indicates the emotional interactions of the author. The three
classes of slant are vertical, rightward and leftward. The slant is obtained by finding
the mean angle of the image by looking for the maximum projection values and transforming
the image by the corresponding geometric transformation.
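One common way to realise "maximum projection values" is a shear-and-project search: the ink is sheared by each candidate angle, and the angle whose vertical projection is most sharply peaked is taken as the slant. The sketch below follows that heuristic, which may differ from the procedure actually used; the candidate angles, the peakedness score (sum of squared column counts), the tolerance and the function names are all assumptions.

```python
import math


def estimate_slant(image, angles=range(-45, 46, 5)):
    """Estimate the slant in degrees from vertical (positive = rightward).

    image: binary rows with 1 = ink, row 0 at the top.
    """
    height = len(image)
    # Ink coordinates as (x, height above the bottom row).
    ink = [(x, height - 1 - y)
           for y, row in enumerate(image) for x, p in enumerate(row) if p]
    best_angle, best_score = 0, -1
    for angle in sorted(angles, key=abs):     # prefer small angles on ties
        shift = math.tan(math.radians(angle))
        histogram = {}
        for x, h in ink:
            column = round(x - h * shift)     # shear that undoes this slant
            histogram[column] = histogram.get(column, 0) + 1
        score = sum(c * c for c in histogram.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


def classify_slant(angle, tolerance=5):
    """Map an angle to the three slant classes named in the text."""
    if abs(angle) <= tolerance:
        return "vertical"
    return "rightward" if angle > 0 else "leftward"


right_leaning = [
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
print(estimate_slant(right_leaning), classify_slant(estimate_slant(right_leaning)))
```

Shearing the image by the negative of the estimated angle would then give the slant-corrected image.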
(h) Loop of ‘e’
The letter ‘e’ is considered to be the ears in the art of graphology, since it tells
us how different people communicate and how they react when they hear something.
The area of the loop is obtained by computing the roundness metric.
Figure 19: Letter ‘e’
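The text does not define the roundness metric. A widely used definition is 4πA/P², where A is the area and P the perimeter of the region; it equals 1 for a perfect circle and decreases for elongated or narrow shapes. The sketch below assumes that definition, and the name roundness is hypothetical.

```python
import math


def roundness(area, perimeter):
    """Roundness metric 4 * pi * A / P**2 (1.0 for a perfect circle)."""
    return 4 * math.pi * area / perimeter ** 2


# A circle of radius 2 and a 2 x 2 square as reference shapes.
circle = roundness(math.pi * 2 ** 2, 2 * math.pi * 2)
square = roundness(4, 8)
print(circle, square)
```

Under this definition an open, rounded loop would score closer to 1 than a narrow or closed loop, which could then be mapped to the "wider" and "closed" classes.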
(i) Dot distance of ‘i’
It is a surprising fact that the dot of the ‘i’ can disclose a lot about the writer's
personality and character. As a general rule, a dot placed close to the stem suggests
extra attention to detail, whereas a dot placed farther away reflects the writer's
tendency to pay less attention to detail[24]. The distance of the dot is calculated as
the Euclidean distance between two points x = (x1, x2) and y = (y1, y2):
\[
\text{Distance}(x, y) = \sqrt{(y_1 - x_1)^2 + (y_2 - x_2)^2} \tag{2}
\]
Figure 20: Finding distance in ‘i’
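Equation (2) translates directly into code. Which two points are measured (for example the centre of the dot and the top of the stem) is an assumption here, and the name dot_distance is hypothetical.

```python
import math


def dot_distance(x, y):
    """Euclidean distance between points x = (x1, x2) and y = (y1, y2)."""
    return math.hypot(y[0] - x[0], y[1] - x[1])


stem_top = (0, 0)       # assumed reference point on the stem
dot_centre = (3, 4)     # assumed centre of the dot
print(dot_distance(stem_top, dot_centre))
```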
6.3 Table of traits and corresponding human behaviour
Trait                  Category       Behaviour
Size                   Small          Introvert
                       Medium         Social
                       Large          Extrovert
Pressure               Light          Sensitive to atmosphere
                       Medium         Emotional
                       Heavy          Takes things seriously
Margin                 Left           Exhibits courage in facing life
                       Right          Less risk taker
Breaks                 Connected      Good analytical thinking power
                       Disconnected   Sensitive
Space between words    Narrow         Dependant
                       Even           Caring
                       Wide           Independent
Baseline               Downward       Negative
                       Straight       Consistent
                       Upward         Active and busy
Slant                  Left           Self-reliant
                       Right          Moody
                       Vertical       Practical
Loop of ‘e’            Closed         Less acceptability
                       Wider          Open minded
Dot distance of ‘i’    Close          Extra attention to details
                       Far            Less attention
                       Just above     Accuracy and perfection
CHAPTER 7
7 Testing and Results
7.1 Introduction
Testing is intended to show that a system conforms to its specification and that the system
meets the expectations of its users. Large systems are built out of subsystems,
which are built out of modules, which in turn are composed of procedures and functions. The
testing process should therefore proceed in stages, with testing carried out incrementally in
conjunction with system implementation. The testing of the learning tool was done along with
the implementation of the various modules. This method of testing helps to ensure the proper
working of the modules at the time of their implementation. Testing involves exercising the
program with real data. The existence of program defects or inadequacies is inferred
from unexpected system outputs. For verification and validation, we make use of the program
testing technique.
The primary goal of unit testing is to take the smallest piece of testable software in the
application, isolate it from the remainder of the code, and determine whether it behaves exactly
as you expect. Each unit is tested separately before integrating them into modules to test
the interfaces between modules. Unit testing has proven its value in that a large percentage
of defects are identified during its use. This type of testing is driven by the architecture and
implementation teams. This focus is also called black-box testing because only the details of
the interface are visible to the test. Limits that are global to a unit are tested here.
Figure 21: User interface
Figure 22: Form selection
Figure 23: Sample Form
Figure 24: Output
7.4 Results
The experiment was carried out on around 80 samples and the results were compared with
the manual analysis obtained from a graphologist. The accuracy for every trait
exceeds 80%, as shown in the table 7.4. In our system, the discriminating features are breaks,
size, space between words, baseline and loop of ‘e’. To estimate the overall accuracy, a
weightage of 90% is given to these 5 features and 10% to the remaining 4 features. An
estimated weighted accuracy of 93.77% is achieved.
The handwriting traits and their corresponding behavioural explanations are shown in the table
6.4.
Features               Accuracy (%)
Pressure               86.15
Margin                 98.46
Breaks                 92.3
Size                   90.76
Space                  98.46
Baseline               93.84
Slant                  83.76
Loop of ‘e’            95.38
Dot distance of ‘i’    93.38
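The weighted accuracy can be checked against the per-feature figures. The sketch below assumes that each weight is applied to the mean accuracy of its group (90% to the five discriminating features, 10% to the remaining four); that reading is an assumption, and the name weighted_accuracy is hypothetical. Under it the result is about 93.78, which agrees with the reported 93.77% up to rounding.

```python
accuracy = {
    "Pressure": 86.15,
    "Margin": 98.46,
    "Breaks": 92.3,
    "Size": 90.76,
    "Space": 98.46,
    "Baseline": 93.84,
    "Slant": 83.76,
    "Loop of 'e'": 95.38,
    "Dot distance of 'i'": 93.38,
}
discriminating = ["Breaks", "Size", "Space", "Baseline", "Loop of 'e'"]


def weighted_accuracy(scores, primary, primary_weight=0.9):
    """Weight the mean of the primary features against the mean of the rest."""
    primary_scores = [scores[name] for name in primary]
    other_scores = [v for name, v in scores.items() if name not in primary]
    primary_mean = sum(primary_scores) / len(primary_scores)
    other_mean = sum(other_scores) / len(other_scores)
    return primary_weight * primary_mean + (1 - primary_weight) * other_mean


overall = weighted_accuracy(accuracy, discriminating)
print(round(overall, 2))
```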
7.5 Conclusion
A relatively simple method has been proposed to anticipate the personality of a person by
exploring various handwriting features. The system considers five discriminating features,
namely breaks, size, space between words, baseline and loop of ‘e’, along with a few other
features: pressure, margin, slant and dot distance in ‘i’. The proposed system can be used
by graphologists as a complementary tool to improve the accuracy and anticipate the
behaviour of a person faster. An estimated weighted accuracy of 93.77% is achieved.
Figure 25: Accuracy plot
CHAPTER 8
8 Applications
1. Personnel Selection
2. Psychological analysis
3. Medical diagnosis
4. Investigations
their last note.
5. Employment Profiling
A writing sample of an applicant is analyzed to match his or her profile with the ideal
psychological profile of an employee required for the vacancy. A graphologist's report is
used for comprehensive background checks, practical demonstration or as a record of work
skills. The analysis can fail completely, although it has also proved highly successful in
certain cases. In most cases, traditional methods are preferred in the employment process,
since there is no evidence of a direct link between handwriting analysis and
various measures of job performance. The use of graphology in the hiring process has
long faced criticism on ethical and legal grounds.
6. Career Guidance
Handwriting analysis is also used for career guidance where knowledge about personality
is essential in order to match the individual correctly to the type of work that would best
suit his or her personality and interests[26].
7. Personal Relationships
Publication Details
References
3. http://graphicinsight.co.za/uses.html
4. http://www.slideshare.net/GargeeHiray/graphology-sciencehandwriting-analysis
10. http://www.ijcaonline.org/volume8/number12/pxc3871758.pdf
12. Handwriting Analysis of Human Behaviour Based on Artificial Neural Network by Champa
13. Automated Prediction of human behaviour system for Career counselling of an individual
through handwriting analysis by Ashish Kathait and Ajit Singh
15. Handwriting Analysis for Detection of Personality Traits using Machine Learning Approach
16. http://www.iam.unibe.ch/fki/databases/iam-handwriting-database/download-the-iam-handwriting-database
17. http://www.mathworks.com/matlabcentral/answers/225185-how-do-you-segment-the-image-horizontally
18. http://www.realsimple.com/work-life/life-strategies/handwriting-101/if-loops-are
21. https://in.mathworks.com/matlabcentral/answers/260193-crop-an-image-per-letter
23. http://timesofindia.indiatimes.com/life-style/books/features/Know-what-your-writing-says-about-you/articleshow/50695314.cms
24. http://www.actforlibraries.org/meaning-of-i-dot-in-handwriting-analysis-what-do-your-i-dots-reveal-guide-to-i-dots/
Appendix A - Project Team Details
THE TEAM
Sindhu Chandrashekar, Poornima G Gokhale, Manimala S (Guide), Meghasree G
Write-up about the Project
Our project aims at developing an automated handwriting analysis system for human
behaviour prediction. Carrying out this project required knowledge of various subjects
learnt in previous semesters of Computer Science and Engineering. To design and implement
any project, an understanding of the Software Development Life Cycle is a must. Hence the core
subject Software Engineering aided the project development process.
Since our project was based on images, the elective subject Pattern Recognition helped in the
analysis of patterns of the external features and their classification. It provided the basic
knowledge needed to understand the concepts of image processing using MATLAB.
Appendix B - CO’s, PO’s and PSO’s Mapping