
Age and Gender Classification using Convolutional Neural Network


Muskan Chawla, Anupma Gadhwal and Kunal Jain
Department of Computer Science and Electronics
Bharati Vidyapeeth’s College of Engineering

ABSTRACT

Accurately predicting the age of humans is an extremely challenging task. Automatic age and gender classification has become important to many applications, particularly since the rise of social media. In this paper, we show that by the use of convolutional neural networks (CNN), a significant increase in performance can be obtained on these tasks. First, the dataset is collected from Google and consists of 100 images of males and females in the age group 5 to 60. We created a CNN followed by a fully connected network using the Keras library. We inserted 3 convolutional layers: the first layer contains 64 filters with a 7x7 window, followed by a ReLU activation layer and a 2x2 max pooling layer; this is followed by 2 more convolutional layers with 100 and 64 filters and 5x5 and 3x3 windows respectively, with ReLU as the activation function. Because the output of the convolutional layers is in matrix form, we flattened it using a flatten layer and fed it to fully connected layers with 64 and 1 neurons. The final output layer consists of a single neuron with a sigmoid activation function.

Keywords
CNN (Convolutional Neural Network), Dense layer, Pixels, Adam, Haar cascade, OpenCV.

1. INTRODUCTION

The human face holds a large quantity of attributes and information about the person, such as expression, gender and age. Human beings can detect and analyze this information easily; for example, the majority of people are able to recognize human traits like gender, telling whether a person is male or female only by seeing his/her face. Similarly, they can determine the age of the person and say whether that person is a child or an adult.

In this paper we attempt to close the gap between automatic face recognition capabilities and those of age and gender estimation methods. To this end, we follow the successful example laid down by recent face recognition systems: face recognition techniques described in the last few years have shown that tremendous progress can be made by the use of deep convolutional neural networks (CNN). We demonstrate similar gains with a simple network architecture, designed by considering the rather limited availability of accurate age and gender labels in existing face data sets. On the other hand, constructing applications that identify people from their faces and extract their age and gender information is a challenging task for computer vision as well as pattern recognition. Computer vision includes various methods and techniques for understanding, analysing, and extracting information from images. In other words, it is a science that works on building systems that enable the computer to see and describe our world. Pattern recognition, on the other hand, is a technology providing identification, description, and interpretation of how machines can recognize and detect a pattern, which
could be a shape, speech signal, fingerprint image, a handwritten word, environment, or a human face. The design of a pattern recognition system involves three parts: pre-processing, feature extraction, and classification.

Figure 1. Faces for age and gender classification. These images represent some of the challenges of age and gender estimation from real-world, unconstrained images.

2. RELATED WORK

Before describing the proposed method we briefly review related methods for age and gender classification and provide a cursory overview of deep convolutional networks.

2.1 Age and Gender Classification

2.1.1 Age classification. The problem of automatically extracting age related attributes from facial images has received increasing attention in recent years and many methods have been put forth. A detailed survey of such methods can be found in [11] and in [21]. We note that despite our focus here on age group classification rather than precise age estimation (i.e., age regression), the survey below includes methods designed for either task. Early methods for age estimation are based on calculating ratios between different measurements of facial features [29]. Once facial features (e.g. eyes, nose, mouth, chin, etc.) are localized and their sizes and distances measured, ratios between them are calculated and used for classifying the face into different age categories according to hand-crafted rules. More recently, [6] uses a similar approach to model age progression in subjects under 18 years old. As those methods require accurate localization of facial features, a challenging problem by itself, they are unsuitable for in-the-wild images which one may expect to find on social platforms. On a different line of work are methods that represent the aging process as a subspace [16] or a manifold [19]. A drawback of those methods is that they require input images to be near-frontal and well-aligned. These methods therefore present experimental results only on constrained data-sets of near-frontal images (e.g., UIUC-IFP-Y [12, 19], FG-NET [30] and MORPH [23]). Again, as a consequence, such methods are ill-suited for unconstrained images. Different from those described above are methods that use local features for representing face images. In [25] Gaussian Mixture Models (GMM) [13] were used to represent the distribution of facial patches. In [24] GMM were used again for representing the distribution of local facial measurements, but robust descriptors were used instead of pixel patches. Finally, instead of GMM, Hidden-Markov Model super-vectors [20] were used in [26] for representing face patch distributions. An alternative to the local image intensity patches are robust image descriptors: Gabor image descriptors [22] were used in [15] along with a Fuzzy-LDA classifier which considers a face image as belonging to more than one age class. In [20] a combination of Biologically-Inspired Features (BIF) [44] and various manifold-learning methods were used for age estimation. Gabor [23] and local binary patterns (LBP) [1] features were used in [7] along with a hierarchical age classifier composed of Support Vector Machines (SVM) [9] to classify the input image
to an age-class followed by a support vector regression [10] to estimate a precise age. Finally, [4] proposed improved versions of relevant component analysis [3] and locally preserving projections [26]. Those methods are used for distance learning and dimensionality reduction, respectively, with Active Appearance Models [8] as an image feature. All of these methods have proven effective on small and/or constrained benchmarks for age estimation. To our knowledge, the best performing methods were demonstrated on the Group Photos benchmark [14]. We show our proposed method to outperform the results they report on the more challenging Adience benchmark, designed for the same task.

2.1.2 Gender classification. A detailed survey of gender classification methods can be found in [4] and more recently in [12]. Here we quickly survey relevant methods. One of the early methods for gender classification [17] used a neural network trained on a small set of near-frontal face images. In [27] the combined 3D structure of the head (obtained using a laser scanner) and image intensities were used for classifying gender. SVM classifiers were used by [25], applied directly to image intensities. Rather than using SVM, [2] used AdaBoost for the same purpose, here again, applied to image intensities. Finally, viewpoint-invariant age and gender classification was presented by [29].

2.1.3 Deep Convolutional Neural Network

Convolution and the convolutional layer are the major building blocks used in convolutional neural networks.
A convolution is the simple application of a filter to an input that results in an activation. Repeated application of the same filter to an input results in a map of activations called a feature map, indicating the locations and strength of a detected feature in an input, such as an image.
The innovation of convolutional neural networks is the ability to automatically learn a large number of filters in parallel specific to a training dataset under the constraints of a specific predictive modeling problem, such as image classification. The result is highly specific features that can be detected anywhere on input images.

Figure 2. CNN (Convolutional Neural Network)

3. METHODOLOGY

3.1 DATA COLLECTION

The dataset is collected from Google and consists of 100 images of males and females in the age group 5 to 60.

3.2 DATASET FORMATION

Dataset formation is the process of creating a labelled dataset that is used for training and testing. In this project, we stored the male and female data in different folders.

We used the os library to traverse the directory and access the data. Because a colour image usually consists of 3 channels, which makes it harder to process, we read all images in grayscale format and resized each one to a matrix of 70x70 pixels. We labelled male as 0 and female as 1. Then, we exported the data using the pickle library.

3.3 CNN FORMATION

We created a CNN followed by a fully connected network using the Keras library. We inserted 3 convolutional layers: the first layer contains 64 filters with a 7x7 window, followed by a ReLU activation layer and a 2x2 max pooling layer; this is followed by 2 more convolutional layers with 100 and 64 filters and 5x5 and 3x3 windows respectively, with ReLU as the activation function. Because the output of the convolutional layers is in matrix form, we flattened it using a flatten layer and fed it to fully connected layers with 64 and 1 neurons.

The final output layer consists of a single neuron with a sigmoid activation function.
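As a rough illustration of the layer stack described above, the following Python sketch traces the feature-map shapes and gives one possible Keras definition. Several details are assumptions that the text does not state: 'valid' padding with stride 1, max pooling after the first convolution only, a ReLU activation on the 64-unit dense layer, binary cross-entropy as the loss, and the helper names `conv_output_shapes` and `build_model`.

```python
# Sizes taken from the description above; anything marked "assumed" is a guess.
INPUT_SHAPE = (70, 70, 1)                    # 70x70 grayscale input
CONV_LAYERS = [(64, 7), (100, 5), (64, 3)]   # (filters, square window size)
POOL_SIZE = 2                                # 2x2 max pooling


def conv_output_shapes(input_shape=INPUT_SHAPE, conv_layers=CONV_LAYERS,
                       pool_size=POOL_SIZE):
    """Trace feature-map shapes, assuming 'valid' padding and stride 1."""
    height, width, _ = input_shape
    shapes = []
    for index, (filters, window) in enumerate(conv_layers):
        height, width = height - window + 1, width - window + 1
        shapes.append((height, width, filters))
        if index == 0:  # assumed: pooling follows only the first convolution
            height, width = height // pool_size, width // pool_size
            shapes.append((height, width, filters))
    return shapes


def build_model():
    """One possible Keras version of the network (requires TensorFlow)."""
    from tensorflow import keras  # imported here so the shape trace runs without it

    model = keras.Sequential([
        keras.Input(shape=INPUT_SHAPE),
        keras.layers.Conv2D(64, (7, 7), activation="relu"),
        keras.layers.MaxPooling2D((2, 2)),
        keras.layers.Conv2D(100, (5, 5), activation="relu"),
        keras.layers.Conv2D(64, (3, 3), activation="relu"),
        keras.layers.Flatten(),
        keras.layers.Dense(64, activation="relu"),    # activation assumed
        keras.layers.Dense(1, activation="sigmoid"),  # single sigmoid output
    ])
    # Adam is named in the text; the loss function is an assumption.
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Under these assumptions `conv_output_shapes()` ends at a 26 x 26 x 64 feature map, so the flatten layer would pass 43,264 values to the 64-unit dense layer. Different padding or extra pooling layers would change these numbers.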

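To make the operations inside these layers concrete, here is a small pure-Python sketch of a single filter applied to a tiny image, followed by ReLU and 2x2 max pooling. The 5x5 image and the 2x2 edge filter are hypothetical values chosen for illustration; a trained network learns its filter weights rather than using hand-written ones.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of rows) without padding.

    Like deep-learning libraries, this does not flip the kernel, so it is
    strictly a cross-correlation.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k_rows) for j in range(k_cols))
            for c in range(out_cols)
        ]
        for r in range(out_rows)
    ]


def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0, value) for value in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size block."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in range(0, cols - size + 1, size)
        ]
        for r in range(0, rows - size + 1, size)
    ]


# Hypothetical 5x5 "image": dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 2x2 filter that responds to a dark-to-bright vertical edge.
edge_filter = [[-1, 1], [-1, 1]]

feature_map = relu(conv2d_valid(image, edge_filter))
pooled = max_pool(feature_map)
```

The feature map is non-zero only in the column where the edge sits, which is the sense in which a feature map indicates the location and strength of a detected feature.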
3.4 TRAINING/TESTING

We trained the model on the dataset we made, consisting of 100 pictures, with a validation split of 10% and Adam as the optimizer.
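The dataset referred to here is the one built in Section 3.2. A minimal sketch of that step might look as follows, assuming the class folders are named `male` and `female` under a common root. The function names and the error handling are illustrative and not taken from the original code.

```python
import os
import pickle

# Labels as stated in the text: male = 0, female = 1.
# The folder names "male" and "female" are assumed.
LABELS = {"male": 0, "female": 1}
IMAGE_SIZE = (70, 70)


def collect_labelled_paths(root):
    """Walk each class folder under `root` and pair every file with its label."""
    samples = []
    for folder, label in LABELS.items():
        for dirpath, _dirnames, filenames in os.walk(os.path.join(root, folder)):
            for name in sorted(filenames):
                samples.append((os.path.join(dirpath, name), label))
    return samples


def load_grayscale(path):
    """Read one image as a 70x70 grayscale array (requires OpenCV)."""
    import cv2  # imported lazily so the labelling step works without OpenCV

    image = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if image is None:  # cv2.imread returns None for unreadable files
        raise ValueError(f"could not read image: {path}")
    return cv2.resize(image, IMAGE_SIZE)


def export_dataset(root, output_path):
    """Load every labelled image and pickle the (image, label) pairs."""
    data = [(load_grayscale(path), label)
            for path, label in collect_labelled_paths(root)]
    with open(output_path, "wb") as handle:
        pickle.dump(data, handle)
    return len(data)
```

Keeping the path collection separate from the image loading makes it easy to check the labels before any pixel data is read.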

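One detail is worth checking with a 10% validation split: Keras documents that `validation_split` holds out the last fraction of the samples, before any shuffling. If the stored data is still in folder order, the 10 held-out images could all come from one class. The plain-Python sketch below illustrates the effect; the 50/50 class balance and the name `split_last_fraction` are hypothetical.

```python
import random


def split_last_fraction(samples, validation_split=0.1):
    """Hold out the trailing fraction of `samples`, in their current order."""
    n_validation = round(len(samples) * validation_split)
    cut = len(samples) - n_validation
    return samples[:cut], samples[cut:]


# Hypothetical label list in folder order: 50 male (0) then 50 female (1).
labels = [0] * 50 + [1] * 50

# Without shuffling, the 10 validation labels all come from the last folder.
train, validation = split_last_fraction(labels)

# Shuffling first mixes both classes into the held-out part.
shuffled = labels[:]
random.Random(0).shuffle(shuffled)  # fixed seed only to keep the demo repeatable
train_mixed, validation_mixed = split_last_fraction(shuffled)
```

Shuffling the samples and labels together before calling `fit` is one way to avoid a single-class validation set.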
4. IMPLEMENTED WORK

We used our model for the prediction of gender. As the prediction of age is more complex and requires more features, we used a pre-trained model for age detection.

We predicted the age and gender on the live feed from the webcam. For this, we used the OpenCV library. We used a Haar cascade for face detection. Then, we extracted the face from the image, converted it to grayscale, and resized it to a 70x70 pixel matrix, matching the size of the training images. Then, we fed it to our constructed model and obtained the output.

Finally, we overlaid the obtained results onto the image frame using the OpenCV library.
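The webcam pipeline described in this section could be sketched roughly as below. This is not the authors' code: the 0.5 decision threshold, the stock `haarcascade_frontalface_default.xml` file, the absence of pixel scaling, and the `predict_age_range` callable standing in for the pre-trained age model are all assumptions.

```python
GENDER_NAMES = {0: "male", 1: "female"}  # labels from Section 3.2
INPUT_SIZE = (70, 70)                    # must match the training image size


def decode_gender(probability, threshold=0.5):
    """Map the sigmoid output to a label; the 0.5 threshold is assumed."""
    return GENDER_NAMES[1 if probability >= threshold else 0]


def caption(gender, age_range):
    """Build the text drawn on the frame from a label and an age range."""
    return f"{gender} {age_range}"


def run_webcam(gender_model, predict_age_range, camera_index=0):
    """Annotate webcam frames until 'q' is pressed (requires OpenCV).

    `gender_model` is assumed to be a Keras model taking (1, 70, 70, 1) input;
    `predict_age_range` is a hypothetical callable wrapping the pre-trained
    age model and returning a tuple such as (21, 32).
    """
    import cv2  # imported lazily so the helpers above work without OpenCV

    detector = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    capture = cv2.VideoCapture(camera_index)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            faces = detector.detectMultiScale(gray, scaleFactor=1.1,
                                              minNeighbors=5)
            for (x, y, w, h) in faces:
                # Any pixel scaling used in training would need repeating here.
                face = cv2.resize(gray[y:y + h, x:x + w], INPUT_SIZE)
                batch = face.reshape(1, INPUT_SIZE[1], INPUT_SIZE[0], 1)
                probability = float(gender_model.predict(batch)[0][0])
                text = caption(decode_gender(probability),
                               predict_age_range(frame[y:y + h, x:x + w]))
                cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
                cv2.putText(frame, text, (x, y - 10),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
            cv2.imshow("Age and gender", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

The two small helpers are kept free of OpenCV so that the label decoding can be checked on its own, for example `decode_gender(0.8)` gives `"female"` under the labelling above.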

5. RESULTS

The system successfully predicted the age group and gender of a person from images fed to the code via webcam, with an accuracy score of 80%.

Figure 3. Output (gender and age range)

Age Range: (8, 12) Age Range: (21, 32) Age Range: (8, 12) Age Range: (8, 12) Age Range: (8, 12)
Age Range: (21, 32) Age Range: (21, 32) Age Range: (8, 12) Age Range: (21, 32) Age Range: (21, 32)
Age Range: (8, 12) Age Range: (8, 12) Age Range: (21, 32) Age Range: (8, 12) Age Range: (8, 12)
Age Range: (8, 12) Age Range: (21, 32) Age Range: (21, 32) Age Range: (21, 32) Age Range: (21, 32)

6. CONCLUSION

Though many previous methods have addressed the problems of age and gender classification, until recently, much of this work has focused on constrained images taken in lab settings. Such settings do not adequately reflect appearance variations common to the real-world images in social websites and online repositories. Internet images, however, are not simply more challenging: they are also
abundant. The easy availability of huge image collections provides modern machine learning based systems with effectively endless training data. Taking example from the related problem of face recognition we explore how well deep CNN perform on these tasks using Internet data. We provide results with a lean deep-learning architecture designed to avoid overfitting caused by the limited amount of labeled data. We further inflate the size of the training data by artificially adding cropped versions of the images in our training set. The resulting system was tested on unfiltered images and shown to significantly outperform recent state of the art. First, CNN can be used to provide improved age and gender classification results, even considering the much smaller size of contemporary unconstrained image sets labeled for age and gender. Second, the simplicity of our model implies that more elaborate systems using more training data may well be capable of substantially improving results beyond those reported here.

REFERENCES

[1] T. Ahonen, A. Hadid, and M. Pietikainen. Face description with local binary patterns: Application to face recognition. Trans. Pattern Anal. Mach. Intell., 28(12):2037–2041, 2006.
[2] S. Baluja and H. A. Rowley. Boosting sex identification performance. Int. J. Comput. Vision, 71(1):111–119, 2007.
[3] A. Bar-Hillel, T. Hertz, N. Shental, and D. Weinshall. Learning distance functions using equivalence relations. In Int. Conf. Mach. Learning, volume 3, pages 11–18, 2003.
[4] W.-L. Chao, J.-Z. Liu, and J.-J. Ding. Facial age estimation based on label-sensitive learning and age-oriented regression. Pattern Recognition, 46(3):628–641, 2013.
[5] K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman. Return of the devil in the details: Delving deep into convolutional nets. arXiv preprint arXiv:1405.3531, 2014.
[6] J. Chen, S. Shan, C. He, G. Zhao, M. Pietikainen, X. Chen, and W. Gao. Wld: A robust local image descriptor. Trans. Pattern Anal. Mach. Intell., 32(9):1705–1720, 2010.
[7] S. E. Choi, Y. J. Lee, S. J. Lee, K. R. Park, and J. Kim. Age estimation using a hierarchical classifier based on global and local facial features. Pattern Recognition, 44(6):1262–1281, 2011.
[8] T. F. Cootes, G. J. Edwards, and C. J. Taylor. Active appearance models. In European Conf. Comput. Vision, pages 484–498. Springer, 1998.
[9] C. Cortes and V. Vapnik. Support-vector networks. Machine learning, 20(3):273–297, 1995.
[10] E. Eidinger, R. Enbar, and T. Hassner. Age and gender estimation of unfiltered faces. Trans. on Inform. Forensics and Security, 9(12), 2014.
[11] Y. Fu, G. Guo, and T. S. Huang. Age synthesis and estimation via faces: A survey. Trans. Pattern Anal. Mach. Intell., 32(11):1955–1976, 2010.
[12] Y. Fu and T. S. Huang. Human age estimation with regression on discriminative aging manifold. Int. Conf. Multimedia, 10(4):578–584, 2008.
[13] K. Fukunaga. Introduction to statistical pattern recognition. Academic press, 1991.
[14] A. C. Gallagher and T. Chen. Understanding images of groups of people. In Proc. Conf. Comput. Vision Pattern Recognition, pages 256–263. IEEE, 2009.
[15] F. Gao and H. Ai. Face age classification on consumer images with gabor feature and fuzzy lda method. In Advances in biometrics, pages 132–141. Springer, 2009.
[16] X. Geng, Z.-H. Zhou, and K. Smith-Miles. Automatic age estimation based on facial aging patterns. Trans. Pattern Anal. Mach. Intell., 29(12):2234–2240, 2007.
[17] B. A. Golomb, D. T. Lawrence, and T. J. Sejnowski. Sexnet: A neural network identifies sex from human faces. In Neural Inform. Process. Syst., pages 572–579, 1990.
[18] A. Graves, A.-R. Mohamed, and G. Hinton. Speech recognition with deep recurrent neural networks. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, pages 6645–6649. IEEE, 2013.
[19] G. Guo, Y. Fu, C. R. Dyer, and T. S. Huang. Image-based human age estimation by manifold learning and locally adjusted robust regression. Trans. Image Processing, 17(7):1178–1188, 2008.
[20] G. Guo, G. Mu, Y. Fu, C. Dyer, and T. Huang. A study on automatic age estimation using a large database. In Proc. Int. Conf. Comput. Vision, pages 1986–1991. IEEE, 2009.
[21] H. Han, C. Otto, and A. K. Jain. Age estimation from face images: Human vs. machine performance. In Biometrics (ICB), 2013 International Conference on. IEEE, 2013.
[22] T. Hassner. Viewing real-world faces in 3d. In Proc. Int. Conf. Comput. Vision, pages 3607–3614. IEEE, 2013.
[23] T. Hassner, S. Harel, E. Paz, and R. Enbar. Effective face frontalization in unconstrained images. Proc. Conf. Comput. Vision Pattern Recognition, 2015.
[24] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580, 2012.
[25] G. B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report 07-49, University of Massachusetts, Amherst, 2007.
[26] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
[27] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. In Proc. Conf. Comput. Vision Pattern Recognition, pages 1725–1732. IEEE, 2014.
[28] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In Neural Inform. Process. Syst., pages 1097–1105, 2012.
[29] Y. H. Kwon and N. da Vitoria Lobo. Age classification from facial images. In Proc. Conf. Comput. Vision Pattern Recognition, pages 762–767. IEEE, 1994.
[30] A. Lanitis. The FG-NET aging database, 2002. Available: www-prima.inrialpes.fr/FGnet/html/benchmarks.html.
