Abstract- Computer vision is widely used at present. However, recognizing fruits stacked on a weighing scale remains difficult because of their complexity and overlap. In this paper, a fruit recognition algorithm based on a convolutional neural network (CNN) is proposed. First, image regions are extracted using the selective search algorithm; then the regions are filtered according to the entropy of the fruit images; finally, the remaining regions are used as input to the CNN for training and recognition. The final decision is made by fusing all region classifications through a voting mechanism. To support practical use in supermarkets, we considered the variety of fruits, their stacking, and changes in the number and position of fruits, and built a diverse training set. After the network was trained with an optimal training set, it achieved a high recognition rate for fruits stacked on a weighing scale.

Keywords- fruit recognition; CNN; selective search; vote;

I. INTRODUCTION

At present, selling fruit in a supermarket still requires staff to weigh it. This costs labor and is inefficient, so a fruit recognition algorithm is needed for supermarket fruit scales. There are many types of fruit, and some of them are very similar. Changes in the position and number of fruits make recognition a challenging problem.

In early studies, W. C. Seng and S. H. Mirisaee proposed a fruit recognition system that combines three feature analysis methods: color-based, shape-based and size-based. Fruit images are recognized using nearest-neighbour classification [1]. The system performed well for single-fruit recognition. However, in practice fruits are usually stacked together, so the system is not suitable for this application. Y. Zhang proposed a hybrid classification method based on the fitness-scaled chaotic artificial bee colony (FSCABC) algorithm and a feedforward neural network (FNN) [2]. S. Arivazhagan proposed an efficient fusion of color and texture features for fruit type recognition [3]. The algorithms in [2] and [3] can be applied to databases that vary in the number and kind of fruits, but their recognition rates are not high enough. All of the methods above combine feature extraction with a classifier, and most researchers have concentrated on extracting better features and improving classifier performance.

In this paper, we propose a fruit recognition algorithm based on a convolutional neural network (CNN). Images can be fed directly into the network without feature extraction. The CNN is trained on regions extracted from the original images, and the type of the original image is determined by fusing the classification results of its regions. The experimental results show that the fruit recognition rate is much improved, and the proposed method could also be applied to identifying multiple fruit types in one picture in future work.

The rest of this paper is organized as follows: Section 2 presents the extraction of regions using the selective search algorithm; Section 3 introduces the convolutional neural network (CNN); Section 4 describes how the CNN is combined with the selective search method; Section 5 shows experimental results; and Section 6 gives the conclusion.

II. EXTRACT REGIONS

In a practical application, images of the same category of fruit show a wide range of possible stacking forms because the number and position of the fruits change. We therefore use the selective search algorithm to extract the regions that contain useful information, and this region information is then used to train the CNN. The final result is obtained by fusing the classification results of the regions. This method can handle the various possible stacking forms, so the recognition rate can be improved.

J. R. R. Uijlings et al. put forward the selective search algorithm in 2012 [4]. Before this, the usual practice in object recognition was to select a window, scan the whole image, then change the size of the window and scan the whole image again. Selective search avoids the time-consuming exhaustive scan of the original window search method and obtains better results. First, initial regions are segmented using an image segmentation method, and then a merging strategy is used to combine these regions. Finally, a hierarchical structure is obtained, which can contain the object. The size and location of each region are defined by a rectangle. The specific steps of the selective search algorithm are as follows [4]:

• Input: colour image
• Output: Set of object location hypotheses L
    Get highest similarity s(ri, rj) = max(S)
    Merge corresponding regions rt = ri ∪ rj
    Remove similarities regarding ri: S = S \ s(ri, r*)
    Remove similarities regarding rj: S = S \ s(r*, rj)
    Calculate similarity set St between rt and its neighbours
    S = S ∪ St
    R = R ∪ rt
• Extract object location boxes L from all regions in R

The selective search algorithm is used to extract regions. More than twenty regions of different sizes can be extracted from an original image. Regions that are too small, too tall or too wide are removed because they contain little discriminative information [5]. The remaining regions are used as input to the network.

III. CNN

A convolutional neural network (CNN) is a kind of artificial neural network that has become an active research topic in speech recognition and image recognition. A. Krizhevsky et al. applied a deep convolutional neural network to the ImageNet database in 2012 and achieved good results [6]. Because the image can be fed directly to the network, the feature extraction and data reconstruction steps of traditional recognition algorithms are avoided. A CNN typically contains convolutional layers, pooling layers and fully connected layers.

• Convolutional layer: The convolution operation enhances the features of the original signal and reduces noise.
• Pooling layer: Using the principle of local correlation, subsampling the image reduces the amount of data to process while preserving useful information.

A. CNN Training Algorithm

• The first stage is the forward propagation:
    A sample (X, Yp) is taken from the sample set, and X is the input of the network;
    The corresponding actual output Op is calculated by the network;

... the weight matrix is adjusted by back-propagation according to the error-minimization method.

B. CNN Architecture

As shown in Figure 1, the network contains three convolutional layers, each followed by a pooling layer, and two fully connected layers. The ReLU non-linearity is applied to the output of every convolutional and fully connected layer. The first convolutional layer filters the 32*32*3 input image with 32 kernels of size 5*5*3. The second convolutional layer has 32 kernels of size 5*5*32. The third convolutional layer has 64 kernels of size 5*5*32. All pooling layers pool over 3*3 regions with a stride of 2. The fully connected layers have 64 neurons each. Finally, a softmax classifier is applied to the last layer.

Figure 1. The structure of CNN

C. CNN Advantage

In pattern recognition, the most classic framework is still feature extraction combined with a classifier. Image features are designed by hand, the feature information is fed into a classifier, and classification results are obtained. Multiple features are usually used to describe an image, such as SIFT, HOG and LBP. However, feature selection is uncertain: it is difficult to know which features best express the image and achieve the best classification results, so many experiments are needed to verify the choice. Because pixels are the most redundant representation of image semantics, an abstract feature description loses part of the image information [7]. A CNN connects directly to the data. Through layer-by-layer mapping in the deep network, it obtains an implicit representation of the image information. Through multilayer supervised learning, the machine can learn an efficient representation of the image on its own [8]. After feature extraction, a CNN generally adopts a softmax classifier or an RBF distance classifier for the final decision. A CNN therefore has stronger learning ability: the learned features capture more of the essence of the image and are more conducive to classification. In addition, feature extraction and pattern classification can be performed simultaneously and trained simultaneously.
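As a rough check of the architecture described in Section III.B, the following sketch traces the feature-map sizes and parameter counts layer by layer. The text does not state the convolution padding or how pooling output sizes are rounded, so the values used here (zero padding of 2 for each 5*5 convolution, no pooling padding, floor rounding) are assumptions, and the function names are hypothetical. The final softmax layer is left out because its size depends on the number of classes.

```python
# Sketch only: conv_pad=2 and floor rounding are assumptions, not from the paper.

def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling window (floor rounding)."""
    return (size + 2 * pad - kernel) // stride + 1


def trace_network(input_size=32, in_channels=3, conv_pad=2):
    """Return (layer name, output shape, parameter count) for each layer."""
    kernels_per_conv = [32, 32, 64]   # number of 5*5 kernels, from the text
    size, channels = input_size, in_channels
    layers = []
    for i, out_channels in enumerate(kernels_per_conv, start=1):
        # 5*5 convolution followed by ReLU (ReLU adds no parameters)
        size = conv_out(size, kernel=5, stride=1, pad=conv_pad)
        params = 5 * 5 * channels * out_channels + out_channels  # weights + biases
        layers.append((f"conv{i}", (size, size, out_channels), params))
        channels = out_channels
        # 3*3 pooling with stride 2, from the text
        size = conv_out(size, kernel=3, stride=2)
        layers.append((f"pool{i}", (size, size, channels), 0))
    features = size * size * channels
    for i in (1, 2):
        # two fully connected layers with 64 neurons each
        layers.append((f"fc{i}", (64,), features * 64 + 64))
        features = 64
    return layers


if __name__ == "__main__":
    for name, shape, params in trace_network():
        print(f"{name:6s} output={shape} params={params}")
```

Under these assumptions the spatial size shrinks only in the pooling layers (32, 15, 7, 3), so the first fully connected layer receives 3*3*64 = 576 values. A different padding choice would change these numbers.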
The BP neural network is one of the most widely used neural network models; it is a multilayer feedforward network trained by the error back-propagation algorithm. A BP network can learn and store a large number of input-output mappings. It learns by using gradient descent to adjust the weights and thresholds of the network so that the squared error is minimized. A CNN differs from a general BP neural network not only in its deep structure but also in its use of local receptive fields and weight sharing to further reduce the number of network parameters. With local receptive fields, each convolution kernel is connected to only one region of the image, so each kernel covers only part of the image. The local convolution features are then combined in later layers, which preserves spatial correlation while reducing the number of parameters. Weight sharing means the weights of a convolution kernel are the same at every position, and different image features are extracted by adding more types of convolution kernel. Weight sharing also makes the network structure simpler and more adaptable.

E = -sum(p .* log2(p))

By calculating the entropy of every region of the images in the training set, entropy statistics are obtained. The distribution of the entropy is shown in Figure 2:

Figure 2. Entropy statistics diagram

The horizontal axis shows entropy. The vertical axis shows the number of occurrences of each entropy value.

We find that the entropy of regions representing background is generally between 3 and 4. The entropy of regions that carry little information or show a uniform part of a fruit's surface is between 4.5 and 6.77. No region has an entropy higher than about 7.6. Regions that contain useful information have an entropy larger than 6.77, so regions whose entropy is less than 6.77 are removed. Figure 3 shows the regions under the above-mentioned conditions. Figure 4 shows the result of optimally selected regions.

V. EXPERIMENT

A. Data set

To enable the proposed algorithm to deal with various fruit stacking forms, a set of fruit images with complex stacking forms was collected for the training database. The fruit database covers changes in fruit type, location and quantity, which conforms to the actual application. The database includes: red delicious apple, cherry tomato, orange, kiwi fruit, banana, sugar orange and jujube. The different kinds of fruit images are shown in Figure 5. In particular, Figure 6 shows changes in the number and in the location of fruits in images of the same fruit type.
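The entropy-based region selection (E = -sum(p .* log2(p)) with the 6.77 threshold reported with Figure 2) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each region is given as a flat list of 8-bit grayscale pixel values and that p is the normalized histogram of those values. The function names are hypothetical.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 6.77  # threshold reported in the text


def region_entropy(pixels):
    """Shannon entropy (in bits) of the intensity histogram of one region.

    Assumption: `pixels` is a flat sequence of integer grayscale values.
    """
    total = len(pixels)
    counts = Counter(pixels)
    # Only intensities that occur are summed, so log2(0) never arises.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def keep_informative(regions, threshold=ENTROPY_THRESHOLD):
    """Keep the regions whose entropy is larger than the threshold."""
    return [r for r in regions if region_entropy(r) > threshold]


if __name__ == "__main__":
    flat_background = [128] * 1024      # uniform patch, entropy 0
    textured = list(range(256)) * 4     # every grey level equally often, entropy 8
    kept = keep_informative([flat_background, textured])
    print(len(kept))                    # only the textured region survives
```

With 256 grey levels the entropy is bounded by 8 bits, which is consistent with the observation that no region exceeds about 7.6.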
set. In the experiment that uses only the CNN, the training images are fed directly into the CNN. The parameters are set the same as above. The network is trained for 25 epochs on the training set of 4000 images. A single desktop PC with 64GB RAM was used for training the CNN.

In the end, the kind of fruit is determined by the voting results of all regions. All regions from a fruit image take part in the vote. The fruit image is assigned to the class that receives the highest vote count (if several classes receive the same number of votes, the first is taken as the final result). Figure 7 shows the whole flow chart of the fruit recognition algorithm.
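The voting step can be sketched as below. The text says only that "the first" class is taken when votes are tied; this sketch assumes that means the first class in a fixed class order, which is what an argmax over a vote-count vector would return. The class order and the function name are hypothetical.

```python
# Hypothetical class order; the paper lists these seven fruit types.
CLASSES = [
    "red delicious apple", "cherry tomato", "orange", "kiwi fruit",
    "banana", "sugar orange", "jujube",
]


def vote(region_labels, classes=CLASSES):
    """Fuse per-region predictions into one image-level label by majority vote.

    Ties are broken in favour of the class that comes first in `classes`
    (an assumption about what "the first" means in the text).
    """
    if not region_labels:
        raise ValueError("at least one region prediction is required")
    counts = [0] * len(classes)
    for label in region_labels:
        counts[classes.index(label)] += 1
    # list.index returns the first position of the maximum, so ties resolve
    # to the earliest class in the fixed order.
    return classes[counts.index(max(counts))]


if __name__ == "__main__":
    print(vote(["orange", "banana", "orange", "sugar orange"]))
```

Because every region that passes the selection step casts one vote, a few misclassified regions do not change the image-level result as long as the correct class keeps the majority.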
Compared with the former two methods in Table Ⅱ, the CNN shows better recognition performance, and both of the latter two methods greatly improve the accuracy. Furthermore, selective search combined with the CNN shows the better recognition rate, which is up to 99.77%. Hence, the proposed method greatly increases the recognition rate and can meet the application requirements.

VI. CONCLUSION

In this paper, a fruit recognition algorithm based on a convolutional neural network (CNN) is proposed. The recognition rate is improved greatly. Comparing the two methods, the recognition rate of the CNN combined with the selective search algorithm is higher than that of the CNN alone. Although this method achieves a good recognition rate, the fruit database used in the experiment contains few species, and changes in the external environment, such as lighting, were not considered. In future work we will add more fruit species to the database and focus on fruit detection and localization.

ACKNOWLEDGMENT

The authors gratefully acknowledge support from Fujian Provincial Key Laboratory for Photonics Technology, funding from the Natural Science Foundation of China (Grant No. 61179011) and Science and Technology Major Projects for Industry-academic Cooperation of Universities in Fujian Province (Grant No. 2013H6008), and support from Innovation Team of the Ministry of Education (IRT1115).

REFERENCES

[1] Woo Chaw Seng and Seyed Hadi Mirisaee, “A New Method for Fruits Recognition System,” MNCC Transactions on ICT, vol. 1, no. 1, June 2009.
[2] Y. Zhang, L. Wu, “Classification of fruits using computer vision and a multiclass support vector machine,” Sensors, vol. 12, no. 9, pp. 12489-12505, 2012.
[3] R. N. Shebiah, “Fruit Recognition using Color and Texture Features,” Journal of Emerging Trends in Computing & Information Sciences, vol.??, no. 1, pp. 90-94, 2010.
[4] J. R. Uijlings, K. E. van de Sande, T. Gevers, and A. W. Smeulders, “Selective search for object recognition,” International Journal of Computer Vision, vol. 104, no. 2, pp. 154-171, 2013.
[5] S. Park, N. Kwak, “Cultural event recognition by subregion classification with convolutional neural network,” Computer Vision and Pattern Recognition Workshops, pp. 45-50, 2015.
[6] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, pp. 1097-1105, 2012.
[7] A. Karpathy, “Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation,” Conference on Computer Vision & Pattern Recognition, vol. 1, pp. 580-587, 2014.
[8] J. Wright, A. Y. Yang, A. Ganesh, “Robust face recognition via sparse representation,” Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 2, pp. 210-227, 2009.
[9] Ji Shuiwang, Xu Wei, Yang Ming, “3D convolutional neural networks for human action recognition,” Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221-231, 2013.
[10] J. Zhao, “On-tree fruit recognition using texture properties and color data,” International Conference on Intelligent Robots & Systems, vol. 1, pp. 263-268, 2005.
[11] Y. Song, “Automatic fruit recognition and counting from multiple images,” Biosystems Engineering, vol. 118, pp. 203-215, 2014.
[12] A. R. Jiménez, “Automatic fruit recognition: a survey and new results using Range/Attenuation images,” Pattern Recognition, vol. 32, pp. 1719-1736, 1999.
[13] P. F. Felzenszwalb and D. P. Huttenlocher, “Efficient Graph-Based Image Segmentation,” IJCV, vol. 59, pp. 167-181, 2004.
[14] G. Hinton, R. Salakhutdinov, “Reducing the dimensionality of data with neural networks,” pp. 504-507, 2006.
[15] M. L. Raymer, W. F. Punch, E. D. Goodman, “Dimensionality reduction using genetic algorithms,” Transactions on Evolutionary Computation, vol. 4, no. 2, pp. 164-171, 2000.
[16] L. Zhang, P. N. Suganthan, “A Survey of Randomized Algorithms for Training Neural Networks,” Information Sciences, 2016.
[17] L. Wang, N. Zhou, F. Chu, “A General Wrapper Approach to Selection of Class-Dependent Features,” Transactions on Neural Networks, vol. 19, no. 7, pp. 1267-1278, 2008.
[18] X. Fu, L. Wang, “Data dimensionality reduction with application to simplifying RBF network structure and improving classification performance,” Transactions on Systems Man & Cybernetics Part B Cybernetics A Publication of the IEEE Systems Man & Cybernetics Society, vol. 33, no. 3, pp. 399-409, 2003.