Muskan Chawla, Anupma Gadhwal and Kunal Jain
Department of Computer Science and Electronics, Bharati Vidyapeeth’s College of Engineering
ABSTRACT

Accurately predicting the age of humans is an extremely challenging task. Automatic age and gender classification has become important to many applications, particularly since the rise of social media. In this paper, we show that by the use of convolutional neural networks (CNN), a significant increase in performance can be obtained on these tasks. First, the dataset is collected from Google and consists of 100 images of males and females aged 5 to 60. We built a CNN followed by a fully connected network using the Keras library. We inserted 3 convolutional layers: the first has 64 filters and a 7x7 window, followed by a ReLU activation layer and a 2x2 max pooling layer; it is followed by 2 more convolutional layers with 100 and 64 filters and windows of 5x5 and 3x3 respectively, each with ReLU as the activation function. The output of the convolutional layers is in matrix form, so we flattened it using a Flatten layer and fed it to fully connected layers with 64 and 1 neurons, respectively. The final output layer consists of a single neuron with a sigmoid activation function.

Keywords
CNN (Convolutional Neural Network), Dense layer, Pixels, adam, haarcascade, opencv.

1. INTRODUCTION

The human face holds a large quantity of attributes and information about the person, such as expression, gender and age. Human beings can detect and analyze this information easily. For example, the majority of people are able to recognize human traits like gender, telling whether a person is male or female only by seeing his/her face. Similarly, they can determine the age of the person and say whether that person is a child or an adult.

In this paper we attempt to close the gap between automatic face recognition capabilities and those of age and gender estimation methods. To this end, we follow the successful example laid down by recent face recognition systems: face recognition techniques described in the last few years have shown that tremendous progress can be made by the use of deep convolutional neural networks (CNN). We demonstrate similar gains with a simple network architecture, designed by considering the rather limited availability of accurate age and gender labels in existing face data sets. On the other hand, constructing applications to identify people from their faces and extracting their age and gender information is a challenging task for computer vision as well as pattern recognition. Computer vision includes various methods and techniques for understanding, analysing, and extracting information from images. In other words, it is a science that works on building a system that is able to make the computer see and describe our world. Pattern recognition, on the other hand, is a technology providing identification, description, and interpretation of how machines can recognize and detect a pattern, which could be a shape, speech signal, fingerprint image, a handwritten word, environment, or a human face. The design of a pattern recognition system involves three parts: pre-processing, feature extraction, and classification.

Figure 1. Faces for age and gender classification. These images represent some of the challenges of age and gender estimation from real-world, unconstrained images.

2. RELATED WORK

Before describing the proposed method we briefly review related methods for age and gender classification and provide a cursory overview of deep convolutional networks.

2.1 Age and Gender Classification

2.1.1 Age classification. The problem of automatically extracting age related attributes from facial images has received increasing attention in recent years and many methods have been put forth. A detailed survey of such methods can be found in [11] and in [21]. We note that despite our focus here on age group classification rather than precise age estimation (i.e., age regression), the survey below includes methods designed for either task. Early methods for age estimation are based on calculating ratios between different measurements of facial features [29]. Once facial features (e.g. eyes, nose, mouth, chin, etc.) are localized and their sizes and distances measured, ratios between them are calculated and used for classifying the face into different age categories according to hand-crafted rules. More recently, [6] uses a similar approach to model age progression in subjects under 18 years old. As those methods require accurate localization of facial features, a challenging problem by itself, they are unsuitable for in-the-wild images which one may expect to find on social platforms. On a different line of work are methods that represent the aging process as a subspace [16] or a manifold [19]. A drawback of those methods is that they require input images to be near-frontal and well-aligned. These methods therefore present experimental results only on constrained data sets of near-frontal images (e.g. UIUC-IFP-Y [12, 19], FG-NET [30] and MORPH [23]). Again, as a consequence, such methods are ill-suited for unconstrained images. Different from those described above are methods that use local features for representing face images. In [25] Gaussian Mixture Models (GMM) [13] were used to represent the distribution of facial patches. In [24] GMM were used again for representing the distribution of local facial measurements, but robust descriptors were used instead of pixel patches. Finally, instead of GMM, Hidden-Markov Model super-vectors [20] were used in [26] for representing face patch distributions. An alternative to the local image intensity patches are robust image descriptors: Gabor image descriptors [22] were used in [15] along with a Fuzzy-LDA classifier which considers a face image as belonging to more than one age class. In [20] a combination of Biologically-Inspired Features (BIF) [44] and various manifold-learning methods were used for age estimation. Gabor [23] and local binary patterns (LBP) [1] features were used in [7] along with a hierarchical age classifier composed of Support Vector Machines (SVM) [9] to classify the input image to an age class, followed by a support vector regression [10] to estimate a precise age. Finally, [4] proposed improved versions of relevant component analysis [3] and locally preserving projections [26]. Those methods are used for distance learning and dimensionality reduction, respectively, with Active Appearance Models [8] as an image feature. All of these methods have proven effective on small and/or constrained benchmarks for age estimation. To our knowledge, the best performing methods were demonstrated on the Group Photos benchmark [14]. We show our proposed method to outperform the results they report on the more challenging Adience benchmark, designed for the same task.

2.1.2 Gender classification. A detailed survey of gender classification methods can be found in [4] and more recently in [12]. Here we quickly survey relevant methods. One of the early methods for gender classification [17] used a neural network trained on a small set of near-frontal face images. In [27] the combined 3D structure of the head (obtained using a laser scanner) and image intensities were used for classifying gender. SVM classifiers were used by [25], applied directly to image intensities. Rather than using SVM, [2] used AdaBoost for the same purpose, here again applied to image intensities. Finally, viewpoint-invariant age and gender classification was presented by [29].

2.1.3 Deep Convolutional Neural Network

Convolution and the convolutional layer are the major building blocks used in convolutional neural networks. A convolution is the simple application of a filter to an input that results in an activation. Repeated application of the same filter to an input results in a map of activations called a feature map, indicating the locations and strength of a detected feature in an input, such as an image.

The innovation of convolutional neural networks is the ability to automatically learn a large number of filters in parallel specific to a training dataset under the constraints of a specific predictive modeling problem, such as image classification. The result is highly specific features that can be detected anywhere on input images.

Figure 2. CNN (Convolutional Neural Network)

3. METHODOLOGY

3.1 DATA COLLECTION

The dataset is collected from Google and consists of 100 images of males and females aged 5 to 60.

3.2 DATASET FORMATION

Dataset formation is the process of creating a labelled dataset which is used for training and testing purposes. In this project, we stored the male and female data in different folders. We used the os library to traverse the directory and access the data. Because an image usually consists of 3 channels, which makes it hard to process, we read all images in grayscale format and resized each to a matrix of 70x70 pixels. We labelled male as 0 and female as 1. Then, we exported the data using the pickle library.

3.3 CNN FORMATION

We built a CNN followed by a fully connected network using the Keras library. We inserted 3 convolutional layers: the first has 64 filters and a 7x7 window, followed by a ReLU activation layer and a 2x2 max pooling layer; it is followed by 2 more convolutional layers with 100 and 64 filters and windows of 5x5 and 3x3 respectively, each with ReLU as the activation function. The output of the convolutional layers is in matrix form, so we flattened it using a Flatten layer and fed it to fully connected layers with 64 and 1 neurons, respectively.
The final output layer consists of a single neuron with a sigmoid activation function.
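The layer stack described in this section can be sketched in Keras as follows. This is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical, and the 'valid' padding, the ReLU on the 64-unit dense layer and the binary cross-entropy loss are not stated in the text. Max pooling is applied only after the first convolution, which is one reading of the description. The helper `feature_map_sizes` only performs the shape arithmetic and needs no library.

```python
def feature_map_sizes(input_size=70):
    """Spatial side length after each conv/pool stage ('valid' padding, stride 1)."""
    sizes = []
    size = input_size - 7 + 1   # 7x7 convolution
    sizes.append(size)
    size = size // 2            # 2x2 max pooling
    sizes.append(size)
    size = size - 5 + 1         # 5x5 convolution
    sizes.append(size)
    size = size - 3 + 1         # 3x3 convolution
    sizes.append(size)
    return sizes


def build_gender_cnn(input_size=70):
    """Build the CNN; TensorFlow is imported lazily so the helper above runs without it."""
    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Input(shape=(input_size, input_size, 1)),  # grayscale input
        layers.Conv2D(64, (7, 7), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(100, (5, 5), activation="relu"),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),              # activation is an assumption
        layers.Dense(1, activation="sigmoid"),            # P(female), since female = 1
    ])
    # The loss is an assumption; the text only names the optimizer (adam).
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


if __name__ == "__main__":
    print(feature_map_sizes())
```

Under these assumptions a 70x70 input yields feature maps of side 64, 32, 28 and 26, so the Flatten layer would pass 26 x 26 x 64 = 43,264 values to the dense layers.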
3.4 TRAINING/TESTING
We trained the model on our dataset of 100 pictures, with a validation split of 10% and Adam as the optimizer.
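One practical detail of a 10% validation split: in Keras, the `validation_split` argument of `fit` takes the last fraction of the samples as provided, before any shuffling. Because the images here are stored in separate male and female folders, shuffling before the split helps keep both classes in the validation set. The sketch below illustrates this with NumPy; the function name, the seed, the 50/50 class balance and the placeholder arrays are illustrative assumptions, not details from the experiment.

```python
import numpy as np


def shuffle_and_split(images, labels, val_fraction=0.1, seed=0):
    """Shuffle images and labels together, then hold out the last val_fraction."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(images))
    images, labels = images[order], labels[order]
    n_val = int(round(len(images) * val_fraction))
    split = len(images) - n_val
    return (images[:split], labels[:split]), (images[split:], labels[split:])


if __name__ == "__main__":
    # Placeholder data standing in for 100 grayscale 70x70 images
    # (a hypothetical 50 male / 50 female split, loaded folder by folder).
    images = np.zeros((100, 70, 70, 1), dtype="float32")
    labels = np.array([0] * 50 + [1] * 50)
    (x_train, y_train), (x_val, y_val) = shuffle_and_split(images, labels)
    print(x_train.shape, x_val.shape)
    # With a compiled Keras model, training could then look like this
    # (the epoch count is arbitrary):
    # model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=10)
```

With 100 images this leaves 90 for training and 10 for validation.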
4. IMPLEMENTED WORK
We used our model to predict gender. Because age prediction is more complex and requires more features, we used a pre-trained model for age detection.
We predicted the age and gender on the live feed from the webcam. For this, we used the OpenCV library, with a Haar cascade (haarcascade) for face detection. We then extracted the face from the image, converted it to grayscale and resized it to a 70x70 pixel matrix, matching the input size used in training. Finally, we fed it to our constructed model and obtained the output.
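The per-frame steps (detect, crop, grayscale, resize, feed) could be sketched as below. This is a sketch under assumptions rather than the authors' implementation: the function names are hypothetical, the crop is resized to 70x70 to match the training input, and scaling pixel values to [0, 1] is an assumption the text does not confirm. The cascade file is the frontal-face one bundled with the `opencv-python` package; the text does not say which cascade was used.

```python
import numpy as np


def to_model_input(face_gray, size=70):
    """Turn a (size, size) uint8 grayscale crop into a (1, size, size, 1) float batch."""
    if face_gray.shape != (size, size):
        raise ValueError(f"expected a {size}x{size} crop, got {face_gray.shape}")
    scaled = face_gray.astype("float32") / 255.0  # scaling to [0, 1] is an assumption
    return scaled.reshape(1, size, size, 1)


def faces_from_frame(frame_bgr, cascade, size=70):
    """Detect faces in one BGR frame and return (box, model_input) pairs."""
    import cv2  # imported lazily so to_model_input works without OpenCV

    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    results = []
    for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5):
        crop = cv2.resize(gray[y:y + h, x:x + w], (size, size))
        results.append(((x, y, w, h), to_model_input(crop, size)))
    return results


# Typical use with a webcam (requires OpenCV and a camera):
# import cv2
# cascade = cv2.CascadeClassifier(
#     cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
# capture = cv2.VideoCapture(0)
# ok, frame = capture.read()
# if ok:
#     for box, batch in faces_from_frame(frame, cascade):
#         prob_female = float(model.predict(batch)[0][0])  # model: the trained CNN
```

Loading the cascade once and passing it in avoids re-reading the XML file on every frame.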
Finally, we drew the obtained results onto the image frame using the OpenCV library.
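Drawing the predictions on a frame might look like the sketch below. The function names, the 0.5 decision threshold, the caption wording and the colours are illustrative assumptions; only the label convention (male = 0, female = 1) comes from Section 3.2.

```python
def format_label(prob_female, age_range, threshold=0.5):
    """Build the caption for one face; female = 1 and male = 0 as in Section 3.2."""
    gender = "Female" if prob_female >= threshold else "Male"
    return f"{gender}, Age Range: {age_range}"


def draw_prediction(frame, box, text):
    """Draw the face box and its caption on the frame in place and return the frame."""
    import cv2  # imported lazily so format_label works without OpenCV

    x, y, w, h = box
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    # Keep the caption inside the image when the face touches the top edge.
    cv2.putText(frame, text, (x, max(y - 10, 15)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
    return frame


if __name__ == "__main__":
    print(format_label(0.8, (21, 32)))
```

The annotated frame can then be displayed with `cv2.imshow` inside the capture loop.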
5. RESULTS

The system successfully predicted the age group and gender of a person from images fed to the code via webcam, with an accuracy score of 80%.

Figure 3. Output (gender and age range). The sample predictions shown fall in the age ranges (8, 12) and (21, 32).

6. CONCLUSION

Though many previous methods have addressed the problems of age and gender classification, until recently, much of this work has focused on constrained images taken in lab settings. Such settings do not adequately reflect appearance variations common to the real-world images in social websites and online repositories. Internet images, however, are not simply more challenging: they are also abundant. The easy availability of huge image collections provides modern machine learning based systems with effectively endless training data. Taking example from the related problem of face recognition, we explore how well deep CNN perform on these tasks using Internet data. We provide results with a lean deep-learning architecture designed to avoid overfitting due to the limited amount of labeled data. We further inflate the size of the training data by artificially adding cropped versions of the images in our training set. The resulting system was tested on unfiltered images and shown to significantly outperform the recent state of the art. CNN can be used to provide improved age and gender classification results, even considering the much smaller size of contemporary unconstrained image sets labeled for age and gender. In addition, the simplicity of our model implies that more elaborate systems using more training data may well be capable of substantially improving results beyond those reported here.

REFERENCES

[1] T. Ahonen, A. Hadid, and M. Pietikainen. Face description with local binary patterns: Application to face recognition. Trans. Pattern Anal. Mach. Intell., 28(12):2037–2041, 2006.
[2] S. Baluja and H. A. Rowley. Boosting sex identification performance. Int. J. Comput. Vision, 71(1):111–119, 2007.
[3] A. Bar-Hillel, T. Hertz, N. Shental, and D. Weinshall. Learning distance functions using equivalence relations. In Int. Conf. Mach. Learning, volume 3, pages 11–18, 2003.
[4] W.-L. Chao, J.-Z. Liu, and J.-J. Ding. Facial age estimation based on label-sensitive learning and age-oriented regression. Pattern Recognition, 46(3):628–641, 2013.
[5] K. Chatfield, K. Simonyan, A. Vedaldi, and A. Zisserman. Return of the devil in the details: Delving deep into convolutional nets. arXiv preprint arXiv:1405.3531, 2014.
[6] J. Chen, S. Shan, C. He, G. Zhao, M. Pietikainen, X. Chen, and W. Gao. Wld: A robust local image descriptor. Trans. Pattern Anal. Mach. Intell., 32(9):1705–1720, 2010.
[7] S. E. Choi, Y. J. Lee, S. J. Lee, K. R. Park, and J. Kim. Age estimation using a hierarchical classifier based on global and local facial features. Pattern Recognition, 44(6):1262–1281, 2011.
[8] T. F. Cootes, G. J. Edwards, and C. J. Taylor. Active appearance models. In European Conf. Comput. Vision, pages 484–498. Springer, 1998.
[9] C. Cortes and V. Vapnik. Support-vector networks. Machine learning, 20(3):273–297, 1995.
[10] E. Eidinger, R. Enbar, and T. Hassner. Age and gender estimation of unfiltered faces. Trans. on Inform. Forensics and Security, 9(12), 2014.
[11] Y. Fu, G. Guo, and T. S. Huang. Age synthesis and estimation via faces: A survey. Trans. Pattern Anal. Mach. Intell., 32(11):1955–1976, 2010.
[12] Y. Fu and T. S. Huang. Human age estimation with regression on discriminative aging manifold. Int. Conf. Multimedia, 10(4):578–584, 2008.
[13] K. Fukunaga. Introduction to statistical pattern recognition. Academic press, 1991.
[14] A. C. Gallagher and T. Chen. Understanding images of groups of people. In Proc. Conf. Comput. Vision Pattern Recognition, pages 256–263. IEEE, 2009.
[15] F. Gao and H. Ai. Face age classification on consumer images with gabor feature and fuzzy lda method. In Advances in biometrics, pages 132–141. Springer, 2009.
[16] X. Geng, Z.-H. Zhou, and K. Smith-Miles. Automatic age estimation based on facial aging patterns. Trans. Pattern Anal. Mach. Intell., 29(12):2234–2240, 2007.
[17] B. A. Golomb, D. T. Lawrence, and T. J. Sejnowski. Sexnet: A neural network identifies sex from human faces. In Neural Inform. Process. Syst., pages 572–579, 1990.
[18] A. Graves, A.-R. Mohamed, and G. Hinton. Speech recognition with deep recurrent neural networks. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, pages 6645–6649. IEEE, 2013.
[19] G. Guo, Y. Fu, C. R. Dyer, and T. S. Huang. Image-based human age estimation by manifold learning and locally adjusted robust regression. Trans. Image Processing, 17(7):1178–1188, 2008.
[20] G. Guo, G. Mu, Y. Fu, C. Dyer, and T. Huang. A study on automatic age estimation using a large database. In Proc. Int. Conf. Comput. Vision, pages 1986–1991. IEEE, 2009.
[21] H. Han, C. Otto, and A. K. Jain. Age estimation from face images: Human vs. machine performance. In Biometrics (ICB), 2013 International Conference on. IEEE, 2013.
[22] T. Hassner. Viewing real-world faces in 3d. In Proc. Int. Conf. Comput. Vision, pages 3607–3614. IEEE, 2013.
[23] T. Hassner, S. Harel, E. Paz, and R. Enbar. Effective face frontalization in unconstrained images. Proc. Conf. Comput. Vision Pattern Recognition, 2015.
[24] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov. Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580, 2012.
[25] G. B. Huang, M. Ramesh, T. Berg, and E. Learned-Miller. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report 07-49, University of Massachusetts, Amherst, 2007.
[26] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
[27] A. Karpathy, G. Toderici, S. Shetty, T. Leung, R. Sukthankar, and L. Fei-Fei. Large-scale video classification with convolutional neural networks. In Proc. Conf. Comput. Vision Pattern Recognition, pages 1725–1732. IEEE, 2014.
[28] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In Neural Inform. Process. Syst., pages 1097–1105, 2012.
[29] Y. H. Kwon and N. da Vitoria Lobo. Age classification from facial images. In Proc. Conf. Comput. Vision Pattern Recognition, pages 762–767. IEEE, 1994.
[30] A. Lanitis. The FG-NET aging database, 2002. Available: www-prima.inrialpes.fr/FGnet/html/benchmarks.html.