
Biosystems Engineering xxx (2015) 1–17

Available online at www.sciencedirect.com

ScienceDirect

journal homepage: www.elsevier.com/locate/issn/15375110

Research Paper
Special Issue: Robotic Agriculture

Detection of tomatoes using spectral-spatial methods in remotely sensed RGB images captured by UAV

J. Senthilnath a,b, Akanksha Dokania c, Manasa Kandukuri c, Ramesh K.N. d, Gautham Anand e, S.N. Omkar a,*

a Department of Aerospace Engineering, Indian Institute of Science, Bangalore, India
b Geospatial Sciences Center of Excellence, South Dakota State University, Brookings, USA
c Department of Electronics and Electrical Engineering, IIT Guwahati, Guwahati, India
d Department of Electronics and Communication Engineering, UVCE, Bangalore, India
e Department of Electronics, National College Basavanagudi, Bangalore, India

article info

Article history:
Published online xxx

Keywords:
Unmanned aerial vehicle
Spectral clustering
Spatial segmentation

abstract

The spectral-spatial classification of high spatial resolution RGB images obtained from unmanned aerial vehicles (UAVs) for detection of tomatoes in the image is presented. Bayesian information criterion (BIC) was used to determine the optimal number of clusters for the image. Spectral clustering was carried out using K-means, expectation maximisation (EM) and self-organising map (SOM) algorithms to categorise the pixels into two groups, i.e. tomatoes and non-tomatoes. Due to resemblance in spectral intensities, some of the non-tomato pixels were grouped into the tomato group and, in order to remove them, spatial segmentation was performed on the image. Spatial segmentation was carried out using morphological operations and by setting thresholds for geometrical properties. The number of pixels grouped in the tomato cluster is different for each clustering method. EM does not pick up the land patches as tomato pixels. As a result, the size of the tomatoes picked up differs from that obtained with K-means and SOM. Since threshold values chosen for carrying out spatial segmentation are shape and size dependent, different threshold values are applied to different methods of clustering. A synthetic image of 12 × 12 pixels with different labels is created to illustrate the effect of each method used for spatial segmentation on the clustered image. Two representative UAV images captured at different heights from the ground were used to demonstrate the performance of the proposed method. Results and a comparison of performance parameters of the different spectral-spatial classification methods are presented. It is observed that EM performed better than K-means and SOM.
© 2015 IAgrE. Published by Elsevier Ltd. All rights reserved.

* Corresponding author. Tel.: +91 080 229 32873; fax: +91 080 236 00134.
E-mail address: omkar@aero.iisc.ernet.in (S.N. Omkar).
http://dx.doi.org/10.1016/j.biosystemseng.2015.12.003
1537-5110/© 2015 IAgrE. Published by Elsevier Ltd. All rights reserved.

Please cite this article in press as: Senthilnath, J., et al., Detection of tomatoes using spectral-spatial methods in remotely sensed RGB
images captured by UAV, Biosystems Engineering (2015), http://dx.doi.org/10.1016/j.biosystemseng.2015.12.003

Nomenclature

UAV: unmanned aerial vehicle
LAI, NDVI: leaf area index, normalised difference vegetative index
SRS, URS, LARS: satellite remote sensing, UAV remote sensing, low altitude remote sensing
VTOL: vertical take-off and landing
ESC, SBC: electronic speed controllers, single board computer
BIC: Bayesian information criterion
EM: expectation maximisation
SOM: self-organising map
$J$: objective function of K-means
$\mu_k$, $\Sigma_k$, $\gamma_{nk}$: mean, covariance of the data, posterior probability
$N$, $P(x)$, $\pi_k$: multivariate Gaussian distribution, mixture of Gaussian distributions, mixing proportion of the Gaussian distributions
$L(\theta; x)$: log likelihood function
$d_j$, $w_{ij}$, $\eta_t$: discriminant function, weight vector, learning rate
$f$, $s$, $s_{rot}$: image, structuring element, rotated structuring element
$S(A)$, $B$: skeletonisation of object $A$, structuring element
SI, $P$, $A$: shape index, perimeter, area of the object
ROC: receiver operating characteristics
TP, FP, FN: true positive, false positive, false negative

1. Introduction

With the increasing global population, increasing agricultural production is the key to meeting global food security goals. The imperative is to produce more food with fewer resources. The projected growth rate of total world consumption of agricultural products, which include food, fodder and fibre, is 1.1% per annum from 2005–2007 to 2050 (Alexandratos & Bruinsma, 2012). At the world level (not for individual countries or regions), consumption of agricultural products is estimated to be equal to production; this means global production in 2050 should be more than 60% higher than that of 2005–2007. The total world food production is projected to increase from 2217 million tonnes in 2005–2006 to 3291 million tonnes in 2050 (Alexandratos & Bruinsma, 2012). With land availability and water resources becoming scarcer, increases in production in such proportions will not be easy.

Estimating yield is a critical input in crop management to increase production and productivity. Several researchers have analysed yield estimates at both macro and micro levels (the level of individual plants), and crop yield assessments at a macro level have been simulated and empirically evaluated considering factors such as soil, atmosphere and plant genotype. Satellite remote sensing (SRS) data (Launay & Guerif, 2005; Shi & Xingguo, 2011; Yuping et al., 2008) has been used to increase the predictability and accuracy of crop models (Lobell, 2013). Doraiswamy, Moulin, Cook, and Stern (2003) successfully demonstrated a method for integrating SRS data and a crop simulation model for monitoring crop growth and yields of spring wheat at the country and state level. Mo et al. (2005) used SRS to measure the normalised difference vegetative index (NDVI) and derive the leaf area index (LAI) of crops to aid the development of crop growth models. Johnson (2014) used SRS data for corn and soybean yield forecasting models across the US Corn Belt at the country level.

At a micro level, plant level yield estimation can be carried out. Stajnko & Cmelik (2005) used visible spectrum image analysis to model harvested apple fruit yield under orchard conditions. Stajnko, Lakota, & Hočevar (2004) analysed images of apple trees acquired with a thermal camera to detect apple fruits. Zhou, Damerow, Sun, and Blanke (2012) used visible spectrum images to detect both green and red apples using colour thresholding methods. Regunathan and Lee (2005) implemented three different classification techniques, namely Bayesian, neural network and Fisher's discrimination, to differentiate fruit pixels from the background pixels. Hannan, Burks, and Duke (2009) proposed an algorithm based on adaptive segmentation and shape analysis for orange fruit detection. Patel, Jain, and Joshi (2011) developed an algorithm for fruit detection based on multiple features in which different weights were assigned to different features such as intensity, colour, orientation and edge of an image. Seng and Mirisaee (2009) proposed a recognition approach which combined colour-based, shape-based and size-based methods and further used nearest neighbour classification to classify the fruit pixels.

UAVs provide an interesting option for remote sensing for agricultural applications. UAV remote sensing (URS) features high spatial resolution, low temporal resolution and the ability to operate with cloud cover. Herwitz et al. (2004) used a UAV to acquire multispectral imagery of a coffee plantation. The chosen study area had coffee trees that were pruned a priori. A spectral index was used to determine the ripeness of cherry beans. Kooistra et al. (2014) used a hyperspectral camera mounted on a UAV to monitor LAI as an indicator of biomass in a potato field. Hunt et al. (2010) used near infrared reflectivity (NIR) and green and blue photographs to measure LAI and correlate the output with vegetation index.

It can be seen that SRS is widely used to drive crop models at a macro level. At a micro level, individual plants have been analysed. While UAVs are used for agricultural operations such as monitoring and measuring site specific bio-physical parameters, they are not extensively used for yield estimation.

UAV images allow detection of the tree crowns and observation of individual plants, fruits and patches. This motivates the application of URS for intermediary levels between traditional macro and micro level crop modelling techniques to analyse and study actual fruits in yield estimation.

In this paper two representative images acquired by URS were analysed to detect tomatoes. Successful detection of tomatoes demonstrates the potential of using UAV images for actual yield estimates in future. The UAV images of an open field tomato crop were analysed. Spectral clustering techniques followed by spatial segmentation were used to extract tomatoes. This work can be broadly divided into two stages: 1) Spectral clustering: The optimum number of clusters is determined using Bayesian information criteria (BIC) and three different clustering algorithms, namely K-means, expectation


maximisation (EM) and self-organising map (SOM), applied on the image to generate the clusters and 2) Spatial segmentation: Morphological operations with threshold for size and shape based properties such as area and shape intensity are applied to separate the tomatoes from other objects which were falsely picked up in the spectral clustering process.

Section 2 describes the UAV system used in this study and the data acquisition procedure. Section 3 describes the spectral-spatial methods applied on UAV images for tomato detection. In Section 4, a synthetic image is used to illustrate the proposed methods. Sections 5 and 6 detail the performance measures and the results with a discussion, respectively. The conclusions are discussed in Section 7.

2. Unmanned aerial vehicle for remote sensing

In this section the types of UAV suitable for remote sensing, and more specifically for agriculture applications, are discussed. UAVs typically fly at low altitudes to acquire remote sensing data, also known as low altitude remote sensing (LARS) (Saberioon et al., 2014). For LARS, most UAVs are fixed-wing or rotary-wing aircraft with low payload and short endurance capabilities. Payload size and weight are critical factors for UAVs in agricultural applications. The most important payload component in URS is the imaging sensor. The imaging sensor for URS therefore needs to be of small size and low mass.

Córcoles, Ortega, Hernández, and Moreno (2013) used a vertical take-off and landing (VTOL) micro-drone quad-rotor aircraft to determine LAI and canopy cover mapping of an onion crop. Berni, Zarco-Tejada, Suárez, and Fereres (2009) used a helicopter based UAV with hyper-spectral and thermal imaging for vegetation monitoring. Xiang and Tian (2011) used a 14 kg helicopter with multispectral cameras and autonomous capabilities to collect field image data with GPS navigation. Sugiura, Noguchi, and Ishii (2005) used remote sensing for vegetation monitoring using a helicopter UAV. From the literature it can be observed that LARS for agriculture is carried out using cameras mounted on small low payload UAVs. These requirements drive the system characteristics of the UAV used in this work.

2.1. System description

In this section, the characteristics and preferred configuration of URS for agriculture applications are discussed. Further, the process of building the UAV for data acquisition is detailed.

The UAV used in this study is built using off-the-shelf components. It is a quad-copter, which belongs to the class of multi-rotor aircraft. The UAV was an electrically powered VTOL aircraft with low endurance and low cost. The main advantage of VTOL UAVs is their hovering capability, which makes them stable platforms during image acquisition. It is piloted remotely from the ground station by a trained pilot. The take-off weight was 2.1 kg including a payload of 0.1 kg. The UAV and camera specifications are listed in Tables 1 and 2 respectively.

Figures 1 and 2 show an image of the system and a block diagram, respectively. The URS comprised the following sub-systems, namely, a UAV, an imaging sensor and ground control.

2.1.1. UAV sub-system
This sub-system consisted of all the components and devices required for the UAV to be airborne. The UAV sub-system has four brushless DC motors mounted on a quad-copter frame, operated using four electronic speed controllers (ESC) that were powered by Lithium-Polymer batteries. The quad-copter was stabilised using an on-board flight stabilisation system. The flight stabilisation system was an open-source multi-rotor platform called ArduPilot (APM2.5, 3D Robotics Inc, Berkeley, California, USA). It consisted of a MEMS based gyroscope, accelerometer, magnetometer, barometer, GPS module and ATmega 2560 AVR microcontroller. As the pilot navigated using the remote control, the control commands were transmitted over the air interface to the on-board flight stabilisation system. The flight stabilisation system varied the speed of the motors to achieve the desired altitude. The pitch-roll camera stabilisation maintained the nadir view, which was independent of the orientation of the UAV.

2.1.2. Imaging sensor sub-system
The imaging system is realised using a Raspberry Pi® (Model 1 B, Raspberry Pi Foundation, Caldecote, Cambridgeshire, United Kingdom). The Raspberry Pi® is a credit card sized single board computer (SBC) that has a camera board as a standard add-on. The SBC runs a Linux operating system. Aerial video was acquired using the camera module of the Raspberry Pi® computer. The Raspberry Pi® camera module specification is shown in Table 2. The camera has a fixed focal length of 3.6 mm and a fixed horizontal and vertical field of view. The UAV was flown up to an altitude of 50 m. The images chosen for analysis were captured at altitudes below 20 m.

Table 1 – UAV specifications.
Parameter: UAV specification
Class of UAV: Micro Aerial Vehicle (MAV), Rotor aircraft, Quadcopter
Gross Weight: 2.1 kg
Material: Fibre glass, aluminium
Motors: Four (4) BLDC motors, 1000 Kv, 210 W each
Length: 550 mm
Endurance: 20 min
Altitude: 500 m
Battery: Lithium polymer 3S 5000 mAh
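
Because the camera has a fixed field of view, the ground area covered by one frame grows linearly with flying height. The following is a minimal sketch of that arithmetic, using the field-of-view angles listed in Table 2 (53.50° horizontal, 41.41° vertical). It assumes a nadir view over flat ground and assumes that the frame spans the full stated field of view; the function names and the 1920-pixel frame width (taken here from the 1080p video mode) are illustrative assumptions, not values reported by the authors.

```python
import math


def ground_footprint(altitude_m, hfov_deg=53.50, vfov_deg=41.41):
    """Return (width_m, height_m) of the ground area seen from altitude_m.

    Assumes a nadir-pointing camera over flat ground; the default angles
    are the field-of-view values listed in Table 2.
    """
    width = 2.0 * altitude_m * math.tan(math.radians(hfov_deg) / 2.0)
    height = 2.0 * altitude_m * math.tan(math.radians(vfov_deg) / 2.0)
    return width, height


def ground_sample_distance(altitude_m, pixels_across=1920, hfov_deg=53.50):
    """Approximate ground size (m) of one pixel along the horizontal axis.

    pixels_across=1920 is an assumption (width of a 1080p video frame that
    covers the whole horizontal field of view).
    """
    width, _ = ground_footprint(altitude_m, hfov_deg=hfov_deg)
    return width / pixels_across


if __name__ == "__main__":
    # Altitudes mentioned in the text: images below 20 m, flights up to 50 m.
    for altitude in (10, 20, 50):
        w, h = ground_footprint(altitude)
        gsd_cm = 100.0 * ground_sample_distance(altitude)
        print(f"{altitude:>3} m: footprint {w:.1f} m x {h:.1f} m, about {gsd_cm:.2f} cm/pixel")
```

Under these assumptions a frame taken at 20 m covers roughly 20 m by 15 m of ground, which is why area thresholds expressed in pixels have to change with flying height.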


Table 2 – Camera specifications.
Parameter specification: Value
Camera make: Camera module of Raspberry Pi
Video HD resolution: 1080p, 30 Hz
Sensor resolution: 2592 × 1944, 5 megapixels
Sensor image area per pixel: 1.4 µm × 1.4 µm
Focal length: 3.60 mm
Horizontal field of view: 53.50°
Vertical field of view: 41.41°

The camera used its default calibration: it is pre-tuned to an 'average' of the sensors, and the same default setting was used during image acquisition.

2.1.3. Ground control sub-system
This system consists of the wireless devices required to control the UAV flight path and camera. A nine channel remote control is used to manually control the UAV. The SBC also has to be operated in the field. To achieve this, a battery operated display and keyboard are required. A palm size display module was connected to the SBC through a composite video port. The wireless keyboard was used to interface with the SBC.

2.2. Data acquisition

The SBC along with the camera module was used for image acquisition. As the UAV flew over the farm, the SBC captured video for 10 s and wrote the data into a file. This was repeated for the entire flight duration of 5 min. This functionality was accomplished using a shell script program running on the SBC. The spatial resolution of the extracted frames varies with the altitude at which the image was acquired. Since the altitude of the UAV changed as it flew, the spatial resolution, or the depth of the image, varied based on the altitude of the UAV. The images were acquired with nadir orientation, which provides a top view of the tomato fruits. An oblique orientation may occlude the visibility of the fruits because of plant stems and leaves. Although nadir orientation also causes occlusion, occlusion was predominantly due to the stalks of the fruits, which our proposed method addresses.

The video was reviewed offline to extract frames (images) that have a good aerial view of the tomato crop. Regions of interest (RoIs) that depicted the tomatoes well were selected within the extracted frames. It must be noted that the size of the images chosen for analysis varies with the size of the RoI.

Tomato crops shed their leaves during the terminal stages of crop development. During this stage, the fruits are visible in aerial images. Images acquired during the terminal stage were analysed for yield estimation. In earlier stages of the crop, where the plants were leafy and the fruits were concealed below, image analysis was not possible.

The experimental site was at Mudimadagu village (Latitude 13.56N, Longitude 78.36E) in the Rayalpad subdivision of Srinivaspur Taluk, Kolar district, in the south-east of peninsular India.

3. Methodology

Spectral-spatial classification was applied to UAV images to separate tomatoes from non-tomatoes. Spectral-spatial methods have been successfully applied to satellite remote sensing data (Senthilnath et al., 2013; Tarabalka, Benediktsson, & Chanussot, 2009). To achieve better classification with spectral-spatial methods, majority voting on the pixel wise classification is performed using an adaptive neighbourhood. Here, BIC was used to predict the optimal number of clusters for UAV remote sensing images, and spectral clustering was carried out using K-means, EM and SOM. For spatial segmentation, morphological operations were used with geometrical attributes.
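
The majority-voting idea mentioned above for earlier spectral-spatial methods can be illustrated with a small sketch: every pixel first receives a class from a pixel wise classifier, and then all pixels that belong to the same neighbourhood (region) are relabelled with the most frequent class in that region. The function name, the toy label image and the region map below are hypothetical, and this is only an illustration of the cited idea, not the spatial step used in this study, which relies on morphological operations.

```python
from collections import Counter


def majority_vote(pixel_labels, region_map):
    """Relabel each pixel with the most common class inside its region.

    pixel_labels: 2-D list of class labels from a pixel wise classifier.
    region_map:   2-D list of region identifiers with the same shape.
    Ties are resolved in favour of the class seen first in the region.
    """
    votes = {}
    for label_row, region_row in zip(pixel_labels, region_map):
        for label, region in zip(label_row, region_row):
            votes.setdefault(region, Counter())[label] += 1
    winner = {region: counts.most_common(1)[0][0] for region, counts in votes.items()}
    return [[winner[region] for region in region_row] for region_row in region_map]


# Toy example: 1 = tomato, 0 = non-tomato; regions "a" and "b" are hypothetical.
pixel_labels = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 1],
]
region_map = [
    ["a", "a", "b", "b"],
    ["a", "a", "b", "b"],
    ["a", "a", "b", "b"],
]
smoothed = majority_vote(pixel_labels, region_map)
for row in smoothed:
    print(row)
```

In the toy example the isolated non-tomato pixel inside region "a" and the isolated tomato pixel inside region "b" are both overruled by their neighbours.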

Fig. 1 – Unmanned aerial vehicle (UAV) system.


Fig. 2 – UAV system block diagram.

3.1. Spectral clustering

Three unsupervised spectral clustering methods, namely K-means, EM and SOM, are used for grouping pixels into tomatoes and non-tomatoes.

The RGB value of any pixel in an image depends on the reflectance properties of the object it represents. Pixels can be classified into different groups depending on their spectral intensities. The aim was to classify the image into two groups: tomatoes and non-tomatoes. However, due to resemblance in the spectral intensities, some non-tomato pixels were classified with the tomato pixels. To overcome this problem, a spatial classification was applied as discussed in Section 3.2.

A key process for efficient clustering of RGB images is the aggregation and assignment to information class, because many subclasses may be included in a single information class. For example, the information class 'tomatoes' may include several intensity subclasses such as ripe and non-ripe tomatoes. To determine the number of clusters for such a varying intensity distribution, the BIC (Schwarz, 1978) was adopted.

$$\mathrm{BIC} \approx \ln L(\hat{\theta}) - \frac{1}{2}\, k_j \log(n) \tag{1}$$

where

$$L(\hat{\theta}) = p(x \mid \hat{\theta}, M) \tag{2}$$

Equation (2) represents the maximised likelihood measure of model $M$ for dataset $x$, and $\hat{\theta}$ are the parameter values that maximise the likelihood. $k_j$ represents the number of free parameters to be estimated for a specific number of clusters and $n$ represents the number of attributes for the given dataset. For $i$ clusters of $m$ dimensional data, the number of free variables will be $i \cdot (1 + m + m(m + 1)/2)$, where 1 is for responsibility, $m$ is for the mean and the third term is for the degrees of freedom of a covariance matrix. With an increase in the number of clusters, BIC at first tends to increase, reaches a maximum value and then decreases. The value at which it attains the maximum was considered to be the optimal number of clusters.

3.1.1. K-means
K-means aims at minimisation of the Euclidean distance value (Selim & Ismail, 1984). The objective function $J$ is given by

$$J = \sum_{n=1}^{N} \sum_{k=1}^{K} \lVert x_n - \mu_k \rVert^2 \tag{3}$$

where $x_n$ is the $n$th pixel and $\mu_k$ is the $k$th cluster centre. The BIC-K-means steps are

Step 1: Find the number of clusters K to be generated using BIC.
Step 2: Randomly choose K pixel values from the image as cluster centres.
Step 3: Calculate the Euclidean distance of every pixel from each cluster centre.
Step 4: Assign the pixel to the nearest cluster centre, i.e. the cluster centre for which the Euclidean distance is minimum.
Step 5: Recalculate the cluster centre of each cluster by computing the average of the pixel values allocated to that specific cluster.
Step 6: Repeat the process until there is no further change in the distribution of the pixel intensities.
Step 7: Select the cluster which is dominated by tomatoes for spatial segmentation, then apply spatial methods to remove misclassified tomato regions as discussed in Section 3.2.

3.1.2. Expectation maximisation (EM)
EM is an iterative method for determining maximum likelihood estimates in which it is assumed that data points are drawn from a mixture of distributions (Li, Zhang, & Jiang, 2005). In this study, the statistical model considered is a mixture of Gaussian models where mean and covariance are the parameters to be estimated. The BIC-EM algorithm steps are (Li et al., 2005):

Step 1: Consider a mixture of Gaussian distributions given by

$$P(x) = \sum_{k=1}^{K} \pi_k\, N(x \mid \mu_k, \Sigma_k) \tag{4}$$

where $K$ is the total number of Gaussian distributions in the mixture model, $\pi_k \in [0, 1]$ is the mixing proportion of the Gaussian model $k$ satisfying the property $\sum_{k=1}^{K} \pi_k = 1$, $\mu_k$ is the mean of the Gaussian model $k$, $\Sigma_k$ is the covariance of the Gaussian model $k$ and $N(x \mid \mu_k, \Sigma_k)$ is the multivariate Gaussian distribution.

Step 2: Find the number of distributions K in the mixture by using BIC.
Step 3: Randomly initialise mean values, covariance matrices and mixing coefficients.
Step 4: Using these values, evaluate the posterior probability $\gamma_{nk}$ (responsibility) of the $k$th cluster towards the pixel $x_n$, given by

$$\gamma_{nk} = \frac{\pi_k\, N(x_n \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j\, N(x_n \mid \mu_j, \Sigma_j)} \tag{5}$$

Step 5: Assign every pixel to the cluster with greatest posterior probability.


Step 6: Update the parameters of each distribution using Eqs. (6)–(8)

$$\mu_k^{new} = \frac{\sum_{n=1}^{N} \gamma_{nk}\, x_n}{N_k} \tag{6}$$

$$\Sigma_k^{new} = \frac{\sum_{n=1}^{N} \gamma_{nk}\, (x_n - \mu_k^{new})(x_n - \mu_k^{new})^{T}}{N_k} \tag{7}$$

$$\pi_k^{new} = \frac{N_k}{N}, \quad \text{where } N_k = \sum_{n=1}^{N} \gamma_{nk} \tag{8}$$

where $\mu_k^{new}$, $\Sigma_k^{new}$, $\pi_k^{new}$ are the updated parameters after each iteration, $N_k$ is the total sum of the responsibilities of a Gaussian distribution and $N$ is the total number of pixels in the image.

Step 7: Calculate the log likelihood function given by

$$L(\Theta; X) = \sum_{i=1}^{n} \log p(x_i \mid \Theta) \tag{9}$$

where the observations $X = \{x_i \mid i = 1, \ldots, n\}$ are independently drawn from the distribution $p(x)$ parameterised by $\Theta$.
The optimised likelihood is given by

$$\tilde{L}(\Theta; X) = L(\Theta; X) + \gamma\, P(X, y \mid \Theta) \tag{10}$$

where the regulariser $P$ is a functional of the distribution of the complete data given by $\Theta$ and the positive value $\gamma$ is the regularisation parameter that controls the compromise between the degree of regularisation of the solution and the likelihood function.
Repeat steps 4–6 until the log likelihood value converges.

Step 8: Select the cluster which is dominated by tomatoes for spatial segmentation, then apply spatial methods to remove misclassified tomato regions as discussed in Section 3.2.

3.1.3. Self-organising maps (SOM)
SOM is an unsupervised algorithm which belongs to the class of competitive learning (Kohonen, 1990). In competitive learning, output neurons compete with each other to get activated.

SOM transforms an incoming signal pattern into a one or two dimensional discrete map in a topological fashion. The neurons become selectively tuned to various input patterns or classes of input patterns during the course of competitive learning. The BIC-SOM algorithm steps are:

Step 1: Find the number of neurons K in the mixture using BIC.
Step 2: Randomly initialise the connection weights of the neurons.
Step 3: For each input pattern, compute the value of the discriminant function for each neuron. This provides the basis for competition. The discriminant function for a neuron is given by

$$d_j = \sum_{i=1}^{D} (x_i - w_{ij})^2 \tag{11}$$

where $x$ is the input vector of dimension $D$ and $w_{ij}$ is the weight connecting the $i$th component of the input vector to the $j$th neuron. Hence, the winning neuron is the neuron that most closely matches the input, i.e. the neuron for which the discriminant function is minimum.

When one neuron is activated, its closest neighbours tend to get excited more than those farther away. There is a topological neighbourhood that decays with distance (Kohonen, 1990). A neighbourhood set $N_c$ is defined around the winning neuron $c$.

Step 4: Update the winning neuron and the neurons within the topological neighbourhood using

$$\Delta w_{ij} = \eta(t)\, (x_i - w_{ij}) \tag{12}$$

where $\Delta w_{ij}$ indicates the change in the weight and $\eta(t)$ is the time dependent learning rate, which decreases with time. In this study the basic Kohonen network was implemented. The above steps were repeated until there was no further change in the topography.

Step 5: Select the cluster which is dominated by tomatoes for spatial segmentation, then apply spatial methods to remove misclassified tomato regions as discussed in Section 3.2.

3.2. Spatial segmentation

As discussed in the last step of the K-means, EM and SOM algorithms, the spatial features of the clustered image were used to remove misclassified tomato regions.
The steps used in spatial segmentation and the rationale for the chosen sequence of steps were as follows:

3.2.1. Closing operation
Some of the single tomatoes appear bisected in the binary segmented image because of the overhead stalks. While counting, these may be counted as two instead of one. In order to join these, a close operation is applied on the image with an appropriate structuring element.

The closing of an image $f$ by a structuring element $s$ is dilation followed by erosion (Haralick, Sternberg, & Zhuang, 1987). The structuring element is a small matrix of pixels, each with a value of zero or one. The dimensions of the matrix state the size of the structuring element and the pattern of ones and zeros specifies its shape.

$$f \bullet s = (f \oplus s_{rot}) \ominus s_{rot} \tag{13}$$

where $s_{rot}$ means that the dilation and erosion should be performed with a rotated structuring element. In case of


symmetrical structuring elements, the rotation does not make a difference. $\oplus$ represents the dilation of $f$ with $s_{rot}$ and $\ominus$ represents erosion. A disk structuring element of radius 2 is used in this study.

3.2.2. Area thresholding and skeletonisation
The noise was predominantly due to patches of soil, stalks and pebbles. It is not possible to eliminate all noise in one step because the different kinds of noise have different geometric characteristics. The soil patches, being the largest, were removed first by setting a threshold for area values. The threshold of area values for a given image depends on the resolution of the image. The resolution of the image is a function of the altitude of the image from the ground. The objects in the image were first labelled and then the area of each object was calculated. The area of an object was calculated by finding the total number of pixels depicting that object. However, problems arose in selecting a threshold when the tomatoes in the image overlapped.

In some cases, the area values of tomatoes that overlapped and of soil patches were similar. We overcame this difficulty by taking into account the skeleton representation of each object, calculated using morphological means. The skeleton of an object provides a simple and compact representation of a shape that preserves the topology and the original size of the object. The skeletonisation $S(A)$ (Sagar, 2013) of an object $A$ is carried out using the following equations:

$$S_k(A) = (A \ominus kB) - (A \ominus kB) \circ B \tag{14}$$

$$S(A) = \bigcup_{k=0}^{K} S_k(A) \tag{15}$$

where $B$ is a structuring element of appropriate size and shape. At first, $k$ successive erosions of object $A$ were carried out with the structuring element $B$. The eroded image was then opened using $B$ and the result was subtracted from the eroded image to obtain the skeleton points, as shown in Eq. (14). The operations were done iteratively for different sizes of $B$. The union of all such skeleton points gave the skeleton representation of the object. In this study, instead of taking the union as shown in Eq. (15), only Eq. (14) was implemented for a suitable $k$ and $B$ to find the number of skeleton points for each object. Due to overlapping, the connected tomatoes had a complex shape and need a greater number of pixels to represent their skeleton. Therefore, along with an area threshold, a threshold for skeleton points can be set to retain the tomatoes and eliminate the land patches.

3.2.3. Removal of stalks
Tomatoes are nearly circular in shape while the stalks are relatively long, thin and have branches. This distinction in the geometrical features of the stalks can be used to categorise them into the non-tomato group. The geometric parameter shape index (SI) (Senthilnath et al., 2013) is given by

$$SI = \frac{P}{4\sqrt{A}} \tag{16}$$

where $P$ represents the perimeter of the object (i.e. the number of pixels on the boundary of the object) and $A$ represents the area of the object. For a given area, the perimeter of a stalk is greater than that of a tomato, which gives the stalks a higher SI value. Thus, a threshold value was set for SI to remove stalks. The morphological operation open was applied on the image to detach the stalks attached to tomatoes.

Fig. 3 – Illustrative example: (a) clustered image; (b) structuring element disk of radius 1; (c) image after applying closing operation; (d) effect of area thresholding; (e) effect of shape intensity thresholding; (f) removing the elliptical noise by calculating the ratio of major to minor axis.

3.2.4. Removal of elliptical noise
The unwanted objects remaining in the image were mostly pebbles, which are commonly elliptical in shape. The best way to distinguish an ellipse from a circle is by finding the ratio of


the length of the major axis to the minor axis. Circular objects have a ratio of almost one, while elliptical objects have a ratio greater than one. The labels having a ratio greater than the threshold are categorised into the non-tomato group while the others are retained in the tomato group.

The result obtained after implementing all the noise removal methods was an image consisting only of the tomatoes present in the image.

4. Illustrative example

A synthetic image of size 12 × 12 pixels was assumed to illustrate the effect of each step in the proposed algorithm. Figure 3(a) shows the clustered image with tomatoes and misclassified tomato regions. The challenge here is to retain tomatoes by removing misclassified tomato regions, i.e. to obtain labels 1, 6 and 7.

Firstly, the close operation is applied on the image with a disk structuring element of radius 1, shown in Fig. 3(b). Closing fills the gaps in the regions while retaining the initial region size, as shown in Fig. 3(c).

The number of pixels (area) of each label was:

Label 1: 13    Label 2: 02
Label 3: 11    Label 4: 08
Label 5: 08    Label 6: 03
Label 7: 04    Label 8: 01

From Fig. 3(c), the number of skeleton points for each label was:

Label 1: 4    Label 2: 0
Label 3: 2    Label 4: 2
Label 5: 2    Label 6: 1
Label 7: 0    Label 8: 0

An area threshold range of 3 to 9 is set, along with a threshold of 3 for the number of skeleton points. So, all the labels having an area greater than 9 or less than 3, and fewer than 3 skeleton points, are assigned label 0. Although the area of label 1 falls outside the threshold range, it retains its label due to the threshold used for the number of skeleton points, as shown in Fig. 3(d). Labels 2, 3 and 8 are no longer present in the image, as shown in Fig. 3(d). Next, the SI value for each label was calculated using the perimeter value of the respective label. The SI value for each label was:

Label 1: 1.0212    Label 4: 0.9786
Label 5: 0.8536    Label 6: 0.4928    Label 7: 0.5000

Labels 1 and 4 had larger shape intensity values than the other labels. Setting the SI threshold at 0.9 (labels with an SI value below it are retained), along with the skeleton points threshold of 3, removes label 4 from the image, as shown in Fig. 3(e). The major to minor axis ratio calculated for each object is:

Label 1: 1.9861    Label 5: 2.2638
Label 6: 1.4639    Label 7: 1.0000

Label 5 represented a pebble in the image. It had the highest major to minor axis ratio and was eliminated by setting a threshold of 2 for the ratio. The final result obtained after eliminating all the misclassified tomato regions is shown in Fig. 3(f).

5. Performance measures

The performance measures based on ROC parameters were analysed in this study to compare the spectral-spatial classifiers. The reference data, which show the location of tomato pixels in the image, were collected using a field survey. The processed binary image is overlaid on the reference data and the tomato and non-tomato pixels are identified. The numbers of true tomatoes and false tomatoes were counted separately to evaluate the performance measures.

ROC parameters calculated for UAV images to extract tomatoes are defined in terms of true positive (TP), false positive (FP) and false negative (FN) (Fawcett, 2006; Senthilnath, Shivesh, Omkar, Diwakar, & Mani, 2012). A TP means that the extracted object is a tomato and the database also indicates the object to be a tomato. If the extracted object is not a tomato but the database indicates it to be a tomato, then it is counted as an FN. For an FP, the extracted object is a tomato but the database indicates it not to be a tomato.

Based on these ROC parameters, the performance measures Recall, Precision and F-Measure are evaluated.

i. Recall:

$$R = \frac{TP}{TP + FN} \qquad (17)$$

ii. Precision:

$$P = \frac{TP}{TP + FP} \qquad (18)$$

iii. F-Measure:

$$F = \frac{2PR}{P + R} \qquad (19)$$
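As a worked check of Eqs. (17) to (19), the sketch below computes the three measures from raw counts. The function names are hypothetical, and returning 0.0 for an empty denominator is a convention assumed here, not something the paper specifies. The counts are the EM values reported for UAV image 1 with BIC (Table 3).

```python
def recall(tp: int, fn: int) -> float:
    """Eq. (17): share of reference tomatoes that were extracted."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp: int, fp: int) -> float:
    """Eq. (18): share of extracted objects that really are tomatoes."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def f_measure(p: float, r: float) -> float:
    """Eq. (19): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# EM counts for UAV image 1 with BIC, taken from Table 3.
tp, fp, fn = 45, 5, 6
r = recall(tp, fn)
p = precision(tp, fp)
f = f_measure(p, r)
print(f"Recall={r:.4f} Precision={p:.4f} F-Measure={f:.4f}")
# prints: Recall=0.8824 Precision=0.9000 F-Measure=0.8911
```

These values agree with the EM column of Table 4.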


For a good spectral-spatial classifier, all the above measures will be close to one.

6. Results and discussion

In this section, the experimental results obtained using UAV images for tomato detection are presented. The spatial resolution of the images acquired in this study is dynamic since the altitude of the UAV varied. The nearer the UAV was to the tomato field, the better the features it picked up in the image. The two images examined were acquired at different altitudes and hence different spatial resolutions were studied. The detection of tomatoes was carried out using spectral-spatial classification.

6.1. UAV image 1

The size of the first UAV image was 293 × 415 pixels, as shown in Fig. 4. The optimal number of clusters to be chosen for clustering depends on the data set. The BIC curve reached a maximum at 3, as shown in Fig. 5. Hence all the spectral methods generated 3 clusters.

The spectral methods, namely K-means, EM and SOM, were applied to group tomato and non-tomato regions in the image. The resultant images are shown in Figs. 6(a), 7(a) and 8(a) respectively. The spectrally classified image contained soil patches, stalks and pebbles, among other objects, that were misclassified as belonging to the tomato region.

To overcome this problem, spatial classification was applied on the spectrally classified image as discussed in Section 3.2. For the K-means clustered image, an area threshold value of 25 was used, where all the labels having area <25 were grouped into non-tomato pixels. To remove the stalks, an SI threshold value of 1.4125 was used along with a threshold value of 26 for the number of skeleton points. Therefore, all the labels having SI value >1.4125 were classified into non-tomato pixels. However, due to the threshold value for the number of skeleton points, certain overlapped tomatoes, despite having SI values >1.4125, were retained in the tomato group. For eliminating the elliptical shaped noise, a value of 2.3033 was used as the maximum threshold point for the ratio of major to minor axis along with a threshold value of 26 for the skeleton points.

Likewise, for the EM clustered image, an area threshold value of 16 and an SI threshold value of 1.424 were used along with a threshold of 8 for the number of skeleton points. For the ratio of the major axis length to minor axis, a threshold value of 2.0655 along with a threshold value of 25 for skeleton points was used. For the SOM clustered image, an area threshold value of 18 and an SI threshold range of 1.4402 to 2.5077 were used. So, all those labels having an SI value in the threshold range were categorised as non-tomato pixels. For eliminating the elliptical shaped noise, a value of 2.4079 was used as the threshold value for major to minor axis ratio along with a threshold value of 18 for skeleton points.

The spatial classification improves the tomato detection, as shown in Figs. 6(b), 7(b) and 8(b) for K-means, EM and SOM respectively. Figures 6(c), 7(c) and 8(c) show the spectral-spatial classified tomato region overlaid on the original image.

The results of Figs. 6(c), 7(c) and 8(c) are represented in Table 3, which compares the ROC parameters of all the methods, namely K-means, EM and SOM. The performance measures based on ROC parameters were calculated for each method, as shown in Table 4. For this image, the recall was lower for SOM. The advantage of SOM is that it had a higher precision value than the other two methods as it selects fewer false positives. To analyse the results, the graph of precision against recall was plotted, as shown in Fig. 9. The desirable outcome was the point where both measures, i.e. recall and precision, were one. The point closest to the coordinate (1, 1) was selected as the best result. The results of SOM and EM were equally good for both precision and recall compared to the K-means process.

In all these algorithms used for tomato detection, the number of clusters chosen for clustering played an important role. Since there are two classes of objects, i.e. tomato and non-tomato regions, the performance measures were also analysed by taking an exact partitioning into 2 clusters. This was compared with the clusters generated using BIC, which is 3 clusters for this image. The ROC parameters and the performance measures using two clusters are shown in Tables 5 and

Fig. 4 – UAV image 1.

Fig. 5 – BIC curve of UAV image 1.


Fig. 6 – (a) Spectral clustering using K-means on image 1; (b) spatial classification on the clustered image; (c) spectral and spatial classified tomato region overlaid on the original image.

6 respectively. While comparing the performance measures, it is noticed that choosing the 3 cluster centres generated by BIC gave better recall as well as precision values than 2 clusters for K-means and SOM. For EM, the precision value was better for 2 clusters than for the 3 clusters generated by BIC. However, the number of false positives picked was high for the case of 2 clusters as compared to the 3 clusters generated by BIC. Hence, clustering using 3 clusters was a better solution than 2 clusters. Figure 9 shows the precision versus recall values with BIC and without BIC for K-means, EM and SOM. It was concluded that using the number of clusters generated by BIC gave a better result compared with 2 clusters.

In the EM method, the regularisation factor plays a key role in the classification of the pixels as tomatoes. In order to decide the regularisation factor which gives the best result, the change in the ROC parameters and performance measures with respect to different regularisation factors was examined. The complete analysis is shown in Table 7. Since F-Measure takes into account both recall and precision, it can be taken as the prime factor for deciding the best regularisation value. From Table 7, it can be seen that F-Measure reached its maximum value of 0.8911 for a regularisation factor of 0.0005. Hence, the result obtained using 0.0005 as the regularisation value was used for comparing the performance of the EM method with the other two methods.

SOM considers the learning rate as an important parameter for deciding the classification of pixels. The optimal value was observed by varying the learning rate and then comparing the performance measures obtained for each case. The complete analysis is shown in Table 8. From this table it can be observed that the value 0.99 gives the best result.

The algorithms were tested on a computer with a Core-i3 processor and 4 GB RAM, with Matlab (R2013a, The MathWorks, Inc., Natick, Massachusetts, USA), and the execution time for UAV Image 1 was recorded. The execution time for UAV Image 1 was 874.41 s for the K-means-spatial method, 3453.818 s for the EM-spatial method and 2886.445 s for the SOM-spatial method.

6.2. UAV image 2

The second UAV image is of size 1080 × 1920 pixels, as shown in Fig. 10. The number of clusters at which the BIC curve reaches its maximum was analysed. From Fig. 11, it can be observed that 5 is the optimal number of clusters to be applied on this image for all the clustering methods.

The clustering of the image using K-means, EM and SOM is shown in Figs. 12(a), 13(a) and 14(a) respectively. In the clustered image, misclassification can be observed where non-tomato pixels are misclassified as tomato pixels. In order to remove the misclassified pixels, spatial classification is performed on these images.


Fig. 7 – (a) Spectral clustering using EM on image 1; (b) spatial classification on the clustered image; (c) spectral and spatial classified tomato region overlaid on the original image.

In the case of EM, since no soil patches were classified into the tomato group, the “salt and pepper” noise was removed first. To achieve this, a threshold value of 62 was used for the areas, where all the labels having area <62 were grouped into non-tomato pixels. All those labels having an SI value in the range of 0.9219 to 1.9486 were retained as tomato pixels. To remove the pebbles, a threshold value of 2.6084 was set for the ratio of major to minor axis. For K-means and SOM, soil patches were also misclassified into the tomato group. In the case of K-means, soil patches have areas >1475 and the “salt and pepper” noise has areas <140. Therefore, an area threshold range of 140 to 1475, along with 50 as the threshold value for the number of skeleton points, was used to retain the tomatoes. An SI threshold value was set at 1.6137 and, for the ratio of major axis to minor axis, 1.8030 was chosen as the threshold value, with 5 as the threshold for the number of skeleton points. Similarly for SOM, the area threshold range chosen was 130 to 2070, along with 25 as the threshold for the number of skeleton points. The threshold values for SI and ratio of major to minor axis were set as 1.7768 and 1.9396 respectively. The resultant images are shown in Figs. 12(b), 13(b) and 14(b) for K-means, EM and SOM respectively. Figures 12(c), 13(c) and 14(c) are the spectral-spatial classified tomato regions overlaid on the original image.

Due to the similar spectral intensities, a few tomato pixels and non-tomato pixels were misclassified as each other. The number and the type of pixels grouped in the tomato cluster were different in each method. For example, for EM, the tomato cluster did not pick up the soil patches and did not capture the tomato pixels fully. The size of the tomatoes picked up in the EM tomato cluster was smaller compared to the other two methods. In the case of SOM, the soil patches, stalks and pebbles were captured in larger proportion than in the other two methods. Since the threshold values were shape and size dependent, different threshold values were applied to different methods of clustering.

Threshold values were data dependent due to the different resolutions used in this study. Threshold values of the above mentioned parameters were chosen empirically. A sensitivity analysis for the spatial method applied on the K-means spectrally clustered image was carried out. Table 9 shows how the results varied with the change in threshold values for the spatial method. As the threshold values were increased for all these three geometrical properties, individually or together (i.e. the labels having values greater than these thresholds are removed), the number of true positives as well as false positives increased. Hence, the threshold values for the spatial method were chosen empirically so as to give a balance between precision and recall. This was done by calculating the F-Measure in our study. The thresholds which give the better F-Measure value were set.
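The threshold rules described above follow the same three-stage pattern as the synthetic example of Section 4. The sketch below applies that pattern to the label values reported there. It is a minimal illustration under assumptions, not the authors' Matlab code: the data layout, function name and parameter names are hypothetical, and the axis-ratio stage is applied without a skeleton-point exemption, as in Section 4.

```python
# Per-label measurements copied from the synthetic example in Section 4.
# "si" and "ratio" are None where the paper reports no value because the
# label had already been removed by an earlier stage.
regions = {
    1: {"area": 13, "skeleton": 4, "si": 1.0212, "ratio": 1.9861},
    2: {"area": 2, "skeleton": 0, "si": None, "ratio": None},
    3: {"area": 11, "skeleton": 2, "si": None, "ratio": None},
    4: {"area": 8, "skeleton": 2, "si": 0.9786, "ratio": None},
    5: {"area": 8, "skeleton": 2, "si": 0.8536, "ratio": 2.2638},
    6: {"area": 3, "skeleton": 1, "si": 0.4928, "ratio": 1.4639},
    7: {"area": 4, "skeleton": 0, "si": 0.5000, "ratio": 1.0000},
    8: {"area": 1, "skeleton": 0, "si": None, "ratio": None},
}


def spatial_filter(regions, area_range=(3, 9), si_max=0.9,
                   ratio_max=2.0, min_skeleton=3):
    """Return the labels that survive the area, SI and axis-ratio stages."""
    lo, hi = area_range
    kept = []
    for label, r in sorted(regions.items()):
        # Objects with many skeleton points (e.g. overlapped tomatoes)
        # are exempt from the area and SI stages.
        exempt = r["skeleton"] >= min_skeleton
        # Stage 1: remove soil patches and salt-and-pepper noise by area.
        if not (lo <= r["area"] <= hi) and not exempt:
            continue
        # Stage 2: remove stalk-like objects with a high shape intensity.
        if r["si"] > si_max and not exempt:
            continue
        # Stage 3: remove elongated (elliptical) objects such as pebbles.
        if r["ratio"] > ratio_max:
            continue
        kept.append(label)
    return kept


kept_labels = spatial_filter(regions)
print(kept_labels)  # prints: [1, 6, 7]
```

The surviving labels match the target stated in Section 4 (labels 1, 6 and 7). For the UAV images, the same structure would be used with the empirically chosen thresholds quoted in the text.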


Fig. 8 – (a) Spectral clustering using SOM on image 1; (b) spatial classification on the clustered image; (c) spectral and spatial classified tomato region overlaid on the original image.

The ROC parameters and the performance measures for the second UAV image are shown in Tables 10 and 11 respectively. From Table 10, it can be seen that along with the largest number of true positives, SOM gave the largest number of false positives. Because of this the precision value for SOM was the least among the three methods. For EM, the number of true positives was less than SOM, but the number of false positives was also less when compared to SOM or K-means. The difference in the precision values of SOM and EM was greater than the difference in the recall values, which makes EM a better method than SOM. The same inference can be observed from Fig. 15 where the EM (precision and recall)

Table 3 – ROC parameters comparison for UAV Image 1 with BIC.

| ROC parameter | K-means | EM | SOM |
|---|---|---|---|
| TP | 45 | 45 | 44 |
| FP | 06 | 05 | 03 |
| FN | 06 | 06 | 07 |

Table 4 – Evaluating performance measures based on ROC parameters in Table 3.

| Performance measure | K-means | EM | SOM |
|---|---|---|---|
| Recall | 0.8824 | 0.8824 | 0.8627 |
| Precision | 0.8824 | 0.9000 | 0.9362 |
| F-Measure | 0.8824 | 0.8911 | 0.8979 |

Fig. 9 – Precision versus recall curve for UAV image 1.
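The precision versus recall plot is read by picking the point closest to the ideal coordinate (1, 1). The sketch below ranks the three methods that way using the Table 4 values. Euclidean distance is an assumption made here, since the text does not state which distance measure was used, and the names are hypothetical.

```python
import math

# (recall, precision) pairs for UAV image 1 with BIC, taken from Table 4.
table4 = {
    "K-means": (0.8824, 0.8824),
    "EM": (0.8824, 0.9000),
    "SOM": (0.8627, 0.9362),
}


def distance_to_ideal(recall: float, precision: float) -> float:
    """Euclidean distance from (recall, precision) to the ideal point (1, 1)."""
    return math.hypot(1.0 - recall, 1.0 - precision)


# Methods ordered from closest to farthest from (1, 1).
ranked = sorted(table4, key=lambda name: distance_to_ideal(*table4[name]))

for name in ranked:
    print(f"{name}: {distance_to_ideal(*table4[name]):.4f}")
```

Under this assumption SOM and EM lie at nearly the same distance from (1, 1), with K-means farther away, which is consistent with the statement that SOM and EM were equally good for this image.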


values are nearer to (1, 1). K-means performance measure values were lower than the performance measures of EM.

To comprehend the significance of BIC, the analysis was carried out again using exact partitioning of 2 clusters and clusters derived using BIC, which is 5 clusters for this image. The ROC parameters and performance measures obtained using 2 clusters are shown in Tables 12 and 13 respectively. From Table 12 it can be observed that the true positives for all three methods in the case of 2 clusters decreased drastically compared to the true positives shown in Table 10 for the case of 5 clusters. K-means is affected the most with 2 clusters, as there is a substantial decrease in the number of true positives and an increase in the number of false positives. It can also be concluded that SOM and K-means give a better result using BIC. However, for EM the results for both numbers of clusters were equally good, but using 5 clusters gave more true positives while using 2 clusters gave fewer false positives. The complete comparison can be seen in Fig. 15.

Table 5 – ROC parameters comparison for UAV image 1 without BIC.

| ROC parameter | K-means | EM | SOM |
|---|---|---|---|
| TP | 38 | 38 | 43 |
| FP | 19 | 01 | 05 |
| FN | 13 | 13 | 08 |

Table 6 – Evaluating performance measures based on ROC parameters in Table 5.

| Performance measure | K-means | EM | SOM |
|---|---|---|---|
| Recall | 0.7451 | 0.7451 | 0.8431 |
| Precision | 0.6667 | 0.9744 | 0.8958 |
| F-Measure | 0.7037 | 0.8445 | 0.8687 |

Table 7 – ROC parameters and performance measures for different regularisation factors used in EM for UAV Image 1.

| Regularisation factor | True positives | False positives | False negatives | Recall | Precision | F-Measure |
|---|---|---|---|---|---|---|
| 0 | 47 | 8 | 04 | 0.9216 | 0.8545 | 0.8868 |
| **0.0005** | **45** | **5** | **06** | **0.8824** | **0.9000** | **0.8911** |
| 0.0008 | 18 | 0 | 33 | 0.3529 | 1.0000 | 0.5217 |
| 0.0010 | 17 | 0 | 34 | 0.3333 | 1.0000 | 0.5000 |
| 0.0030 | 39 | 1 | 12 | 0.7647 | 0.9750 | 0.8571 |
| 0.0050 | 41 | 4 | 10 | 0.8039 | 0.9111 | 0.8541 |
| 0.0080 | 42 | 2 | 09 | 0.8235 | 0.9545 | 0.8842 |
| 0.0100 | 35 | 2 | 16 | 0.6863 | 0.9459 | 0.7955 |
| 0.0200 | 09 | 3 | 42 | 0.1765 | 0.7500 | 0.2858 |

Bold indicates the value used for further processing.

Table 8 – ROC parameters and performance measures for different learning rates used in SOM for UAV Image 1.

| Learning rate | True positives | False positives | False negatives | Recall | Precision | F-Measure |
|---|---|---|---|---|---|---|
| 0.10 | 39 | 4 | 12 | 0.7647 | 0.9070 | 0.8298 |
| 0.30 | 39 | 7 | 12 | 0.7647 | 0.8478 | 0.8041 |
| 0.50 | 42 | 5 | 09 | 0.8235 | 0.8936 | 0.8571 |
| 0.70 | 42 | 5 | 09 | 0.8235 | 0.8936 | 0.8571 |
| 0.90 | 42 | 4 | 09 | 0.8235 | 0.9130 | 0.8659 |
| **0.99** | **44** | **3** | **07** | **0.8627** | **0.9362** | **0.8979** |

Bold indicates the value used for further processing.
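Tables 7 and 8 select the EM regularisation factor and the SOM learning rate by the highest F-Measure. The sketch below reproduces that selection from the tabulated counts. The helper names are hypothetical, and the F-Measure is computed directly from counts using the identity 2PR/(P + R) = 2TP/(2TP + FP + FN).

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """F-Measure from raw counts; equal to 2PR/(P+R) from Eq. (19)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# parameter value -> (TP, FP, FN), copied from Table 7 (EM, UAV image 1).
em_sweep = {
    0.0: (47, 8, 4), 0.0005: (45, 5, 6), 0.0008: (18, 0, 33),
    0.0010: (17, 0, 34), 0.0030: (39, 1, 12), 0.0050: (41, 4, 10),
    0.0080: (42, 2, 9), 0.0100: (35, 2, 16), 0.0200: (9, 3, 42),
}

# parameter value -> (TP, FP, FN), copied from Table 8 (SOM, UAV image 1).
som_sweep = {
    0.10: (39, 4, 12), 0.30: (39, 7, 12), 0.50: (42, 5, 9),
    0.70: (42, 5, 9), 0.90: (42, 4, 9), 0.99: (44, 3, 7),
}


def best_setting(sweep):
    """Return the parameter value whose counts give the highest F-Measure."""
    return max(sweep, key=lambda value: f_measure(*sweep[value]))


best_reg = best_setting(em_sweep)
best_lr = best_setting(som_sweep)
print(f"EM regularisation factor: {best_reg}, F = {f_measure(*em_sweep[best_reg]):.4f}")
print(f"SOM learning rate: {best_lr}, F = {f_measure(*som_sweep[best_lr]):.4f}")
```

The selected values, 0.0005 for EM and 0.99 for SOM, are the ones the text reports as used for further processing.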

Fig. 10 – UAV image 2.

Fig. 11 – BIC curve of UAV image 2.


Fig. 12 – (a) Spectral clustering using K-means on image 2; (b) spatial classification on the clustered image; (c) spectral and spatial classified tomato region overlaid on the original image.

Fig. 13 – (a) Spectral clustering using EM on image 2; (b) spatial classification on the clustered image; (c) spectral and spatial classified tomato region overlaid on the original image.


Fig. 14 – (a) Spectral clustering using SOM on image 2; (b) spatial classification on the clustered image; (c) spectral and spatial classified tomato region overlaid on the original image.

Table 9 – Sensitivity analysis for UAV image 2 with BIC (K-means).

| Area threshold | Shape intensity | Major axis/Minor axis | Recall | Precision | F-Measure |
|---|---|---|---|---|---|
| **140–1475** | **1.6137** | **1.8030** | **0.6400** | **0.8067** | **0.7137** |
| 150–2000 | 1.6137 | 1.8030 | 0.5733 | 0.7748 | 0.6590 |
| 150–2000 | 2.0000 | 1.8030 | 0.6000 | 0.7692 | 0.6741 |
| 150–2000 | 2.0000 | 2.1000 | 0.6133 | 0.7077 | 0.6571 |
| 140–1475 | 2.0000 | 1.8030 | 0.6533 | 0.7716 | 0.7075 |
| 140–1475 | 2.0000 | 2.1000 | 0.6600 | 0.7279 | 0.6923 |
| 140–1475 | 1.6137 | 2.1000 | 0.6533 | 0.7259 | 0.6877 |

Bold indicates the value used for further processing.

Table 10 – ROC parameters comparison for UAV Image 2 with BIC.

| ROC parameter | K-means | EM | SOM |
|---|---|---|---|
| TP | 96 | 109 | 114 |
| FP | 23 | 20 | 36 |
| FN | 54 | 41 | 52 |

To find the optimal regularisation factor and learning rate in the EM and SOM algorithms respectively, similar analyses were carried out. It was observed that a regularisation factor of 0.003 and a learning rate of 0.99 gave the best result.

In order to improve the overall performance, all three clustering methods can be considered together, such that if a pixel is classified as a tomato pixel in two or more output images it is considered a tomato pixel. Otherwise, it is declared a non-tomato pixel. The output of this combined algorithm is shown in Fig. 16.
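The combination rule above is a per-pixel majority vote over the three binary outputs. A minimal sketch follows, assuming each output is a binary mask stored as a list of rows with 1 for tomato and 0 for non-tomato. The function name and the tiny 2 × 3 masks are hypothetical and are not data from the paper.

```python
from typing import List

Mask = List[List[int]]


def majority_vote(masks: List[Mask], min_votes: int = 2) -> Mask:
    """Mark a pixel as tomato (1) when at least min_votes masks agree."""
    rows, cols = len(masks[0]), len(masks[0][0])
    for mask in masks:
        if len(mask) != rows or any(len(row) != cols for row in mask):
            raise ValueError("all masks must have the same shape")
    return [
        [1 if sum(mask[i][j] for mask in masks) >= min_votes else 0
         for j in range(cols)]
        for i in range(rows)
    ]


# Hypothetical 2 x 3 outputs of the three spectral-spatial classifiers.
kmeans_mask = [[1, 0, 1], [0, 0, 1]]
em_mask = [[1, 0, 0], [0, 1, 1]]
som_mask = [[0, 0, 1], [0, 1, 1]]

combined = majority_vote([kmeans_mask, em_mask, som_mask])
print(combined)  # prints: [[1, 0, 1], [0, 1, 1]]
```

With three classifiers, a minimum of two votes means a single method's false positive is discarded unless another method repeats it.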
Table 11 – Evaluating performance measures based on ROC parameters in Table 10.

| Performance measure | K-means | EM | SOM |
|---|---|---|---|
| Recall | 0.6400 | 0.7267 | 0.6867 |
| Precision | 0.8067 | 0.8450 | 0.7600 |
| F-Measure | 0.7137 | 0.7814 | 0.7215 |

7. Conclusion

Tomatoes grow in clusters and are generally partially hidden by their leaves and stalks. It becomes challenging to identify


tomatoes manually and estimate the yield due to poor visual interpretation (as in Fig. 10). Hence, it is necessary to automate the tomato detection process. To validate the process, we experimented with two UAV images of different resolutions. The task of tomato detection using UAV images was successfully carried out using pixel based spectral information for clustering and shape information for spatial segmentation. Applying BIC with unsupervised clustering techniques on the images gave better results than the exact partitioning (2 clusters). The performance is measured using ROC parameters: precision, recall and F-Measure. A comparative study of the outputs for three different clustering methods was carried out and EM proved to be better than K-means and SOM.

This work demonstrates a novel method to detect tomatoes using UAV images. Tomato detection was successfully carried out on two representative images. The results are promising; however, an extension to this work is planned to include how UAV remote sensing can be used for larger areas cultivated with tomatoes. To achieve this, image stitching is required to overcome the challenges of image acquisition (rotation, translation and, more importantly, the scaling effect) by maintaining constant resolution. The thresholds for tomato area and geometric properties are empirically determined. Adaptive methods can be used to automatically determine thresholds. This work of detecting tomatoes can be extended to counting of tomatoes to estimate the yield.

Fig. 15 – Precision versus recall curve for UAV image 2.

Fig. 16 – Output of the combined algorithm.

Table 12 – ROC parameters comparison for UAV Image 2 without BIC.

| ROC parameter | K-means | EM | SOM |
|---|---|---|---|
| TP | 89 | 91 | 103 |
| FP | 34 | 08 | 38 |
| FN | 61 | 59 | 47 |

Table 13 – Evaluating performance measures based on ROC parameters in Table 12.

| Performance measure | K-means | EM | SOM |
|---|---|---|---|
| Recall | 0.5933 | 0.6066 | 0.6867 |
| Precision | 0.7236 | 0.9191 | 0.7305 |
| F-Measure | 0.6520 | 0.7308 | 0.7079 |

Acknowledgement

We thank the managing editor, guest editors and reviewers for their thorough reading and for suggesting many important aspects. All the suggestions were very useful in revising our study.

References

Alexandratos, N., & Bruinsma, J. (2012). World agriculture towards 2030/2050: The 2012 revision. ESA Work. Pap, 3.
Berni, J., Zarco-Tejada, P. J., Suárez, L., & Fereres, E. (2009). Thermal and narrowband multispectral remote sensing for vegetation monitoring from an unmanned aerial vehicle. Geoscience and Remote Sensing, IEEE Transactions on, 47(3), 722–738.
Córcoles, J. I., Ortega, J. F., Hernández, D., & Moreno, M. A. (2013). Estimation of leaf area index in onion (Allium cepa L.) using an unmanned aerial vehicle. Biosystems Engineering, 115(1), 31–42.
Doraiswamy, P. C., Moulin, S., Cook, P. W., & Stern, A. (2003). Crop yield assessment from remote sensing. Photogrammetric Engineering & Remote Sensing, 69(6), 665–674.
Fawcett, T. (2006). ROC Graphs: Notes and practical considerations for researchers. Technical Report HPL-2003-4, HP Labs.
Hannan, M. W., Burks, T. F., & Duke, M. B. (2009). A machine vision algorithm combining adaptive segmentation and shape analysis for orange fruit detection. Agricultural Engineering International: the CIGR E Journal, XI, 1–7.
Haralick, R. M., Sternberg, S. R., & Zhuang, X. (1987). Image analysis using mathematical morphology. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 4, 532–550.
Herwitz, S. R., Johnson, L. F., Dunagan, S. E., Higgins, R. G., Sullivan, D. V., Zheng, J., et al. (2004). Imaging from an unmanned aerial vehicle: agricultural surveillance and decision support. Computers and Electronics in Agriculture, 44(1), 49–61.


Hunt, E. R., Hively, W. D., Fujikawa, S. J., Linden, D. S., Daughtry, C. S., & McCarty, G. W. (2010). Acquisition of NIR-green-blue digital photographs from unmanned aircraft for crop monitoring. Remote Sensing, 2(1), 290–305.
Johnson, D. M. (2014). An assessment of pre- and within-season remotely sensed variables for forecasting corn and soybean yields in the United States. Remote Sensing of Environment, 141, 116–128.
Kohonen, T. (1990). The self-organizing map. Proceedings of the IEEE, 78(9), 1464–1480.
Kooistra, L., Suomalainen, J., Franke, J., Bartholomeus, H., Mücher, S., & Becker, R. (2014, May). Monitoring agricultural crops using a light-weight hyperspectral mapping system for unmanned aerial vehicles. In EGU General Assembly Conference Abstracts (Vol. 16, p. 2790).
Launay, M., & Guerif, M. (2005). Assimilating remote sensing data into a crop model to improve predictive performance for spatial applications. Agriculture, Ecosystems & Environment, 111(1), 321–339.
Li, H., Zhang, K., & Jiang, T. (2005). The regularized EM algorithm. In Proceedings of the 20th National Conference on Artificial Intelligence (pp. 807–812).
Lobell, D. B. (2013). The use of satellite data for crop yield gap analysis. Field Crops Research, 143, 56–64.
Mo, X., Liu, S., Lin, Z., Xu, Y., Xiang, Y., & McVicar, T. R. (2005). Prediction of crop yield, water consumption and water use efficiency with a SVAT-crop growth model using remotely sensed data on the North China Plain. Ecological Modelling, 183(2), 301–322.
Patel, H. N., Jain, R. K., & Joshi, M. V. (2011). Fruit detection using improved multiple features based algorithm. International Journal of Computer Applications, 13(2), 1–5.
Regunathan, M., & Lee, W. S. (2005, July). Citrus fruit identification and size determination using machine vision and ultrasonic sensors. In ASAE Annual International Meeting.
Saberioon, M. M., Amin, M. S. M., Anuar, A. R., Gholizadeh, A., Wayayok, A., & Khairunniza-Bejo, S. (2014). Assessment of rice leaf chlorophyll content using visible bands at different growth stages at both the leaf and canopy scale. International Journal of Applied Earth Observation and Geoinformation, 32, 35–45.
Sagar, B. S. D. (2013). Mathematical morphology in geomorphology and GISci. CRC Press.
Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464.
Selim, S. Z., & Ismail, M. A. (1984). K-means-type algorithms: a generalized convergence theorem and characterization of local optimality. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 1, 81–87.
Seng, W. C., & Mirisaee, S. H. (2009, August). A new method for fruits recognition system. In ICEEI'09. International Conference on Electrical Engineering and Informatics: Vol. 1 (pp. 130–134). IEEE.
Senthilnath, J., Omkar, S. N., Diwakar, P. G., Mani, V., Nitin, K., & Shreyas, P. B. (2013). Crop stage classification of hyperspectral data using unsupervised techniques. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 6(2), 861–866.
Senthilnath, J., Shivesh, B., Omkar, S. N., Diwakar, P. G., & Mani, V. (2012). An approach to multi-temporal MODIS image analysis using image classification and segmentation. Advances in Space Research, 50(9), 1274–1287.
Shi, H., & Xingguo, M. (2011). Interpreting spatial heterogeneity of crop yield with a process model and remote sensing. Ecological Modelling, 222(14), 2530–2541.
Stajnko, D., & Čmelik, Z. (2005). Modelling of apple fruit growth by application of image analysis. Agriculturae Conspectus Scientificus (ACS), 70(2), 59–64.
Stajnko, D., Lakota, M., & Hočevar, M. (2004). Estimation of number and diameter of apple fruits in an orchard during the growing season by thermal imaging. Computers and Electronics in Agriculture, 42(1), 31–42.
Sugiura, R., Noguchi, N., & Ishii, K. (2005). Remote-sensing technology for vegetation monitoring using an unmanned helicopter. Biosystems Engineering, 90(4), 369–379.
Tarabalka, Y., Benediktsson, J. A., & Chanussot, J. (2009). Spectral–spatial classification of hyperspectral imagery based on partitional clustering techniques. Geoscience and Remote Sensing, IEEE Transactions on, 47(8), 2973–2987.
Xiang, H., & Tian, L. (2011). Development of a low-cost agricultural remote sensing system based on an autonomous unmanned aerial vehicle (UAV). Biosystems Engineering, 108(2), 174–190.
Yuping, M., Shili, W., Li, Z., Yingyu, H., Liwei, Z., Yanbo, H., et al. (2008). Monitoring winter wheat growth in North China by combining a crop model and remote sensing data. International Journal of Applied Earth Observation and Geoinformation, 10(4), 426–437.
Zhou, R., Damerow, L., Sun, Y., & Blanke, M. M. (2012). Using colour features of cv. ‘Gala’ apple fruits in an orchard in image processing to predict yield. Precision Agriculture, 13(5), 568–580.
