
Tiberiu-George Copaciu CST Part IA: Machine Learning, SV 1

tgc30@cam.ac.uk 2017-02-25 13:00, William Gates Building

Pipeline for recognizing the happiness of hand-drawn sketches

1 Assumptions and particularisation


To design a pipeline that classifies hand-drawn faces as happy or sad, we first have to make
a fairly large set of sensible assumptions. These restrict our program to certain types of
drawings, but they considerably reduce the complexity of the algorithm. For the purpose of
this task, I believe we are not expected to come up with something very sophisticated that
still could not be guaranteed to work on every possible drawing. The assumptions are
therefore as follows:

• there is only one face in the image
• the contour of the face is continuous
• there are only three other objects besides the contour of the face (2 eyes and the mouth),
  all of them placed inside the contour
• all the other objects have continuous contours
• none of the contours touch each other
• the eyes are roughly 2 circles and the mouth is a curved line
• since the face is drawn by a human being, it respects, at least to some extent, the features
  of a real face, so it is sensible to assume that the line through the centers of the eyes
  is roughly parallel to the line through the edges of the mouth

Taking all these assumptions into account, our drawings will have to look similar to this one:

[Figure: example drawing of a face that satisfies the assumptions]

Since the mouth is the only feature by which we can classify the faces, we can in fact split
our classification into 3 classes: happy, neutral or sad.

2 Designing the pipeline


In a nutshell, our program first removes the border of the face. Then it aligns the other
objects by rotating the image until the line through the centers of the eyes is horizontal and
the mouth is below it. Next, the pipeline isolates the mouth and finds a best-fitting quadratic
function for its shape. Finally, it calculates the distance between the line through the edges
of the mouth and the extreme (lowest or highest) point of the quadratic function. Comparing
this distance with some precomputed values, we should be able to classify our drawing as
happy, neutral or sad.

The precomputed values are obtained from a training set of already classified images, from
which we extract an average distance beyond which the face is either happy or sad. Any smaller
absolute distance results in a neutral face. The drawing below should be suggestive:

[Figure: the distance between the line through the edges of the mouth and the extreme point of
the fitted curve]

For: Dr Luke Church March 6, 2017 1/3



3 Building the pipeline


Firstly, we have to remove the contour of the face. Since there is nothing in the image outside
the face, we traverse the matrix of pixels until we find one that is not white (i.e. it is part
of the contour). We also know that the contour is continuous, so we can now apply a filling
algorithm. Starting from that coloured pixel, we make it white and then recursively apply
the same algorithm to any coloured pixel among its (at most) 8 neighbors. Because objects do
not touch each other, this algorithm cannot accidentally remove any of them.
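
The contour removal described above can be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions: the image is taken to be a list of rows of greyscale values with 255 meaning white, and the names `WHITE` and `remove_face_contour` are hypothetical. It also replaces the recursion with an explicit stack, so that a long contour does not exceed Python's recursion limit; the pixels visited are the same.

```python
WHITE = 255  # assumption: white background, any other value is ink


def remove_face_contour(image):
    """Return a copy of `image` with the first ink object found erased.

    Scanning row by row from the top, the first non-white pixel must lie
    on the face contour, because everything else is inside that contour.
    """
    img = [row[:] for row in image]          # do not modify the input
    height = len(img)
    width = len(img[0]) if height else 0

    # Find the first non-white pixel in row-major order.
    start = next(
        ((r, c) for r in range(height) for c in range(width)
         if img[r][c] != WHITE),
        None,
    )
    if start is None:
        return img                           # blank image, nothing to do

    # Flood fill over the 8-neighbourhood, whitening pixels as we go.
    img[start[0]][start[1]] = WHITE
    stack = [start]
    while stack:
        r, c = stack.pop()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < height and 0 <= nc < width
                        and img[nr][nc] != WHITE):
                    img[nr][nc] = WHITE
                    stack.append((nr, nc))
    return img
```

The fill only ever spreads through touching coloured pixels, which is why the assumption that no contours touch matters: the eyes and the mouth are never reached from the face contour.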

The next step is identifying the centers of the eyes. Since the eyes are above the mouth, we
can traverse the matrix in several different ways to find the points that we are interested in:
(i) from top to bottom: find the highest point of one eye
(ii) from left to right: find the leftmost point of one eye
(iii) from right to left: find the rightmost point of one eye
(iv) from bottom to top: first we go through the lines containing the pixels of the mouth and
then we find the lowest point of one of the eyes

Comparing the distances between all these points and choosing the two smallest, we can pair
the points such that each pair describes one eye. We then use these pairs to find the centers
of the eyes, determine the angle between the line through the centers and a horizontal line,
and rotate our image so that the eye line is horizontal and the mouth lies below it.
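
The alignment step can be illustrated with a short Python sketch. It makes several assumptions that are not in the text above: the pixels of each eye have already been grouped, points are (x, y) pairs with y growing downwards as in an image matrix, the center of a roughly circular eye is taken as the midpoint of its extreme points, and the function names are hypothetical.

```python
import math


def eye_center(pixels):
    """Center of a roughly circular eye: midpoint of its extreme points."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)


def rotate(point, theta, center):
    """Rotate `point` by `theta` radians about `center`."""
    x, y = point[0] - center[0], point[1] - center[1]
    return (center[0] + x * math.cos(theta) - y * math.sin(theta),
            center[1] + x * math.sin(theta) + y * math.cos(theta))


def levelling_angle(eye_a, eye_b, mouth):
    """Angle that makes the eye line horizontal with the mouth below it.

    Returns (theta, center): rotate every point by `theta` about `center`,
    the midpoint between the two eye centers.
    """
    center = ((eye_a[0] + eye_b[0]) / 2, (eye_a[1] + eye_b[1]) / 2)
    # Undo the current tilt of the line through the eye centers.
    theta = -math.atan2(eye_b[1] - eye_a[1], eye_b[0] - eye_a[0])
    # With y growing downwards, "below" means a larger y. If the mouth
    # ends up above the eyes, the face is upside down: turn it by 180°.
    if rotate(mouth, theta, center)[1] < center[1]:
        theta += math.pi
    return theta, center
```

For example, `levelling_angle((-1, -1), (1, 1), (-2, 2))` describes a face tilted by 45 degrees; rotating the eye centers by the returned angle puts both at the same height.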




The only step left is to use the mouth to classify the sentiment. We exploit the fact that a
drawing of a happy or sad mouth usually resembles a quadratic function, and therefore use a
best-fit algorithm to find a function of the form ax^2 + bx + c. In principle, we could simply
say that for a < 0 the face is sad and for a > 0 the face is happy. However, for a neutral
face it is very likely that the fitting algorithm still returns a ≠ 0. Therefore, when we
apply this algorithm to the training set images, we determine the average value of the
distance shown in the second picture, and use this value as the limit for a neutral face.
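
The fitting and classification step might look like the following Python sketch. It is an illustration under assumptions rather than a definitive implementation: the fit is an ordinary least-squares fit solved by hand, mouth pixels are (column, row) pairs with rows growing downwards, the distance is measured vertically at the middle of the mouth (where the gap between the fitted curve and the line through its two ends equals a·w²/4 for a mouth of width w), and all function names and the `threshold` parameter are hypothetical.

```python
def fit_quadratic(points):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c).

    Needs at least three distinct x values, otherwise the system
    below is singular.
    """
    sx = [sum(x ** k for x, _ in points) for k in range(5)]      # sums of x^k
    sy = [sum(y * x ** k for x, y in points) for k in range(3)]  # sums of y*x^k
    # Augmented normal equations for the unknowns (a, b, c).
    m = [[sx[4], sx[3], sx[2], sy[2]],
         [sx[3], sx[2], sx[1], sy[1]],
         [sx[2], sx[1], sx[0], sy[0]]]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(3):
        pivot = max(range(i, 3), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(3):
            if r != i:
                factor = m[r][i] / m[i][i]
                m[r] = [u - factor * v for u, v in zip(m[r], m[i])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def mouth_depth(pixels):
    """Signed distance between the mouth's end-to-end line and the curve.

    Rows are negated so that y grows upwards; a smile then gives a > 0
    and a positive depth, matching the sign convention in the text.
    """
    points = [(col, -row) for col, row in pixels]
    a, _, _ = fit_quadratic(points)
    width = max(x for x, _ in points) - min(x for x, _ in points)
    return a * width ** 2 / 4


def classify_mouth(pixels, threshold):
    """Classify using a threshold precomputed from the training set."""
    depth = mouth_depth(pixels)
    if depth > threshold:
        return "happy"
    if depth < -threshold:
        return "sad"
    return "neutral"
```

Using the distance rather than the raw coefficient a has the advantage that the threshold is expressed in pixels, so it is easier to compare across mouths of different widths.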

4 Testing the pipeline


The accuracy of the algorithm on a random input is A/B, where A is the number of correctly
predicted sentiments and B is the total number of drawings. To see how well our program can
be expected to behave in general, and how consistent it is, we can perform a cross-validation:
we split our input into 10 balanced files (each with a roughly equal number of happy, neutral
and sad faces), then use each file once as the test set and the others as the training set. We
determine the pipeline's accuracy in each case and calculate the mean and the variance. A low
variance means that the pipeline's accuracy is expected to be similar regardless of the given
input.
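
The cross-validation procedure can be sketched in Python as below. This is a hedged illustration: samples are assumed to be (features, label) pairs, `train` and `predict` are hypothetical callables standing in for the pipeline's training and classification steps, and the balanced split is done by dealing the samples of each class out in turn, which is only one of several ways to balance the files.

```python
from statistics import mean, pvariance


def stratified_folds(samples, k=10):
    """Split (features, label) samples into k folds with balanced labels."""
    by_label = {}
    for sample in samples:
        by_label.setdefault(sample[1], []).append(sample)
    folds = [[] for _ in range(k)]
    i = 0
    # Deal the samples of each class out in turn, like a deck of cards.
    for group in by_label.values():
        for sample in group:
            folds[i % k].append(sample)
            i += 1
    return folds


def cross_validate(samples, train, predict, k=10):
    """Return (mean accuracy, variance of accuracy) over k folds."""
    if len(samples) < k:
        raise ValueError("need at least one sample per fold")
    folds = stratified_folds(samples, k)
    accuracies = []
    for i, test_fold in enumerate(folds):
        # Every other fold forms the training set for this round.
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        model = train(training)
        correct = sum(predict(model, x) == label for x, label in test_fold)
        accuracies.append(correct / len(test_fold))   # A / B for this fold
    return mean(accuracies), pvariance(accuracies)
```

Here `pvariance` gives the population variance of the 10 accuracies; `statistics.variance` could be used instead if the sample variance is preferred.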
