fixed in 50 frames in the KTH actions dataset. To obtain the spatio-temporal features, two factors are considered: the pertinence and the dimensionality of the feature vectors. These factors allow us to better characterize human behaviors in terms of accuracy rate and computational cost in the classification stage.
Let us consider a given action from the KTH dataset, composed of N frames denoted $f_i$ and described as follows:

$$f_i = \{ f_1, f_2, f_3, \ldots, f_N \mid 1 \le i \le N \} \tag{1}$$
The corresponding spatio-temporal features, obtained through the SURF descriptor, are denoted $S_m$, which is defined as the set of $S_i$ given by the following formula:

$$S_m = \{ S_1, S_2, S_3, \ldots, S_N \mid 1 \le i \le N \} \tag{2}$$
We recall that N has been set to 50 frames, which represents a short period of the action. Here, m is the number of the training example used in the classification stage. Figure 2 shows the features extracted for the boxing action using the SURF descriptor.
On the other hand, the dimensionality of $S_m$ is very high. Dealing with such high-dimensional data lowers accuracy and raises computational complexity. At the same time, the SURF vectors carry a high level of spatial information, which is required to reach significant accuracy in the classification stage. In such a case, it is useful to reduce the dimensionality of the provided vectors without losing their pertinent information.
Figure 2. SURF feature pattern for the boxing action.

Among the existing methods for reducing dimensionality, we have opted for the Principal Component Analysis (PCA) algorithm [8][9]. The basic concept of this algorithm is quite simple. First, the d-dimensional mean vector $\mu$ and the $d \times d$ covariance matrix $\Sigma$ are computed for the full data set, denoted $S_m = \{ S_1, S_2, S_3, \ldots, S_i, \ldots \mid 1 \le i \le N \}$. Then, the eigenvectors and eigenvalues of $\Sigma$ are computed and sorted in order of decreasing eigenvalue. Call these eigenvector $v_1$ with eigenvalue $\lambda_1$, $v_2$ with eigenvalue $\lambda_2$, and so on. Next, the k eigenvectors with the largest eigenvalues are chosen. In this work, this is done by looking at the spectrum of eigenvalues. Form a $d \times k$ matrix $S$ whose columns consist of the k eigenvectors. Preprocess the data according to:

$$X' = S^{t}(X - \mu) \tag{3}$$
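The PCA steps above can be sketched in Python with NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `pca_fit` and `pca_project`, the toy random data (50 rows to mirror N = 50, with an arbitrary d = 8), and the choice k = 3 are all hypothetical and do not come from the paper.

```python
import numpy as np


def pca_fit(data, k):
    """Fit PCA on `data` of shape (n_samples, d); keep the top-k eigenvectors.

    Returns the mean vector mu (d,), the projection matrix S (d, k),
    and all eigenvalues sorted in decreasing order (d,).
    """
    mu = data.mean(axis=0)                    # d-dimensional mean vector
    sigma = np.cov(data - mu, rowvar=False)   # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(sigma)  # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # re-sort by decreasing eigenvalue
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    S = eigvecs[:, :k]                        # columns are v_1, ..., v_k
    return mu, S, eigvals


def pca_project(x, mu, S):
    """Apply X' = S^t (X - mu) to each row of `x`."""
    return (x - mu) @ S


# Toy stand-in for the feature set: 50 vectors of dimension 8 (assumed values).
rng = np.random.default_rng(0)
scales = np.array([5.0, 3.0, 2.0, 1.0, 1.0, 0.5, 0.5, 0.1])
data = rng.normal(size=(50, 8)) * scales + 10.0

mu, S, eigvals = pca_fit(data, k=3)
features = pca_project(data, mu, S)           # reduced vectors, shape (50, 3)
```

Each row of `features` is one reduced vector X'. In practice, k would be picked by inspecting `eigvals` and keeping the components that account for most of the variance.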
Equation (3) is applied to obtain the most significant features of $S_i$. After applying the PCA algorithm, the dimension of $S_m$, including each $S_i$, is reduced efficiently.

2.3 Classification

Many types of learning algorithms can be used as a binary classifier in order to build the training model
classifier. In this work, we use the support vector machine (SVM) classifier [10] to discriminate human behavior into two classes: Aggressive and Non-aggressive. This choice is motivated by several factors. First, the SVM is effective in high-dimensional spaces and uses only a subset of the training points in the decision function, called support vectors, which makes it possible to obtain a classifier that is probably faster than other methods.

Moreover, the SVM aims to find a decision plane that has the maximum distance from the nearest training patterns of human behaviors. For the two-class classification of aggressive and non-aggressive behaviors, the training data are given as $\{(x_i, y_i) \mid y_i = 1 \text{ or } -1,\ i = 1, \ldots, N\}$, where $x_i$ is the input feature, $y_i$ is the class label, and N is the number of training samples.

3. EXPERIMENTAL RESULTS

To validate the performance of the proposed algorithm, several tests are conducted on the popular KTH actions dataset [11]. It consists of 25 persons performing 6 different actions {boxing, hand-clapping, jogging, running, walking, hand-waving}. Moreover, the KTH dataset includes both normal and abnormal behaviors, which can be regarded as aggressive or non-aggressive depending on the dynamics and appearance of the action, as with the boxing action. The dataset is therefore divided into two classes as follows: the Aggressive class consists of the boxing, hand-clapping, and hand-waving actions, and the second class contains the walking, jogging, and running actions, which are considered non-aggressive behaviors. Figure 3 shows a sample of the KTH dataset actions.

Figure 3. KTH dataset.

The experiments are conducted using the cross-validation technique, which involves partitioning a sample of data into complementary subsets. Here, 6 subjects from the KTH dataset are chosen for the training set and 3 for the validation set. The generated feature vectors X' are fed into the binary SVM classifier model built on the training set to classify aggressive human behavior. The SVM classifier is then run with different kernels, including Quadratic, Polynomial, and Radial Basis Function (RBF) kernels, in order to optimize the accuracy rate.

As shown in Table 1, the accuracy rates achieved with the different SVM kernels are high. Moreover, the RBF kernel gives the highest average recognition rate of all the kernels, with an accuracy of 96.8%.

Table 1: Average accuracy rate on the KTH dataset.

Classifier        Average accuracy rate
SVM-Linear        92%
SVM-Quadratic     95%
SVM-Polynomial    95%
SVM-RBF           96.8%

Due to the rarity of works dealing with aggressive behavior based on binary classification, it is useful to compare our method with that proposed by Datong et al. in [12]. Our method gives significant performance in terms of accuracy rate and the dimensionality of the feature vectors, unlike the method of Datong et al. The latter characterizes the aggressiveness of actions on the KTH dataset using a local binary motion descriptor and uses a one-center SVM to detect the aggressiveness of each action. Nevertheless, the actions and activities of persons are often characterized by two aspects, appearance and motion, which together give spatio-temporal features. These are not involved in the work of Datong et al.

To further verify the performance of the proposed method against competitive methods, it is necessary to adapt the proposed algorithm so that it can discriminate the different actions of the KTH dataset. In this case, we have used a one-versus-all approach [13] to build a multi-class SVM training model in which each input vector X' is fed with its related class.

Table 2 provides the accuracy rate of the proposed method in comparison with current techniques for human behavior recognition. We can see that the proposed method provides satisfactory performance and proves its effectiveness. We can also see that the accuracy rate of the proposed method outperforms the majority of the
reported works and is also slightly better than the method reported in [14].

Table 2: Comparison with the state of the art (KTH).

Methods                  Accuracy rate
Proposed method          96%
Wang et al. [14]         94.2%
Schindler et al. [15]    92.7%
Laptev et al. [16]       91.8%

[6] N. Djelal, N. Saadia and A. Ouanane, "People Tracking Using SURF Algorithm", ISPA'12, Mostaganem, Algeria, 2/4 December 2012.
[7] W. He, T. Yamashita, H. Lu and S. Lao, "SURF Tracking", IEEE 12th International Conference on Computer Vision (ICCV), pp. 1586-1592, 2009.
[8] L. Sirovich and M. Kirby, "Low dimensional procedure for the characterization of human faces", Journal of the Optical Society of America A, Optics, Image Science, and Vision, 4(3), pp. 519-524, 1987.