
Vehicle classification at nighttime using Eigenspaces and Support Vector Machine

Tuan Hue Thi, Kostia Robert, Sijun Lu and Jian Zhang
National ICT of Australia, Kensington, NSW, Australia
{TuanHue.Thi, Kostia.Robert, Sijun.Lu, Jian.Zhang}@nicta.com.au

Abstract
A robust framework for classifying vehicles in nighttime traffic using vehicle eigenspaces and a support vector machine is presented. In this paper, a systematic approach is proposed and implemented to classify vehicles from roadside camera video sequences. Collections of vehicle images are analyzed to obtain their representative eigenspaces. A Support Vector Machine (SVM) model built from these vehicle spaces then serves as a reliable classifier for unknown vehicle images. The approach has been implemented and shown to be robust in both speed and accuracy for vehicle classification at night.

1. Introduction
1.1. Motivation
Nighttime vehicle detection and classification remains a complicated and uncertain area within any traffic surveillance system because of illumination interference and blur. Moreover, the performance requirements are no longer confined to prototype work in research labs but are exposed to real-world problems. This makes the task of building such a system highly challenging, particularly with respect to speed and accuracy. Most recent surveillance systems based on motion detection have been reported to work in daytime traffic; however, the same approach applied at night has not produced results that are desirable even as research prototypes, let alone results that meet the requirements of real-world problems. This motivates us to use an appearance-based approach with eigenspaces and SVM to address specifically the problem of nighttime vehicle classification.

1.2. Related works

The use of principal components analysis to reduce dimensionality and extract characteristic features of objects was first proposed by Kirby and Sirovich [5]. They defined the concept of an eigenpicture to denote the eigenfunctions of the covariance matrix of a set of face images. Following this approach, Turk and Pentland [7] developed an automated system that uses eigenfaces built on the same concept to classify images into four different categories, which helps to recognize true and false positives of faces and to build new sets of image models. Vehicle detection and classification has been done in different ways. Gupte et al. [3] use background subtraction and tracking updates to identify vehicle positions in different scenes. However, since their method relies on the precondition that all shadows are removed before the actual classification, it is infeasible for nighttime traffic, where shadow, reflection and illumination artifacts are always present. Several approaches making use of Principal Component Analysis (PCA) and SVM are described in the work of Zehang et al. [8] and Zhang et al. [9]. Although the reported results show high accuracy, the processing time is not suitable for real traffic problems, and good performance depends heavily on a number of thresholds. In addition, their detection and classification systems, which rely largely on motion detection, were originally designed for daytime traffic scenes; in general nighttime traffic scenes, many daytime conditions no longer hold, and the same approach is not guaranteed to give the same desired results.

1.3. Our approach

In our system, the case of nighttime traffic scenes is of particular interest, and three core objectives make our approach innovative and effective. Firstly, accuracy is a must, since the system will eventually be deployed on road traffic where safety is the most important factor. Secondly, processing must not be expensive in time, in order to reach the real-time speed of the urban transport system. Lastly, it should be capable of producing the desired results for any nighttime traffic scene. In order to meet all these requirements, we have proposed and implemented a feature-based classification approach combining vehicle eigenspace analysis and a Support Vector Machine (detailed in Section 2). The optimization and evaluation steps in Section 3 show how this method is improved and validated against practical problems. The most difficult problem of every support vector machine approach is how to choose the most appropriate parameters to train the models and how to tune the training sets before applying the actual calculation. As suggested by Chang and Lin [2], we first scale all image pixel intensities to a common range, then use grid cross validation on the test data to obtain optimized parameters before generating the SVM model that is eventually used to classify incoming scaled vehicle images (a minimal sketch of this pipeline is given at the end of this subsection). Our main contribution to vehicle detection and classification is a systematic approach using PCA and SVM to classify generic vehicle images, open to easy application to any scene with minimal modification. In addition, we propose a dynamic update approach that uses the tracking system to update the true-positive and false-positive training image sets, which guarantees a highly accurate system after a sufficiently long running time. Although this framework was originally proposed to deal with vehicle images in nighttime traffic scenes, classification problems of other kinds, such as robots, biological cells, or human activities, can be tackled in the same way to obtain the desired classification results.
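Before the detailed derivation in Section 2, the following minimal sketch illustrates the overall train-and-predict flow. It is not the authors' code (the paper uses libSVM [2] and OpenCV); scikit-learn is used as a stand-in, and the stand-in data, pipeline names and parameter grid are assumptions made for illustration only.

```python
# Illustrative sketch only: approximates the paper's PCA + scaling + RBF-SVM
# pipeline with scikit-learn. Data, shapes and parameter grids are assumptions.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV

N, P = 40, 50                      # norm size and number of kept eigenfeatures (Section 3.1)
rng = np.random.default_rng(0)

# Stand-in data: flattened N x N grayscale patches, label 1 = vehicle, -1 = non-vehicle.
X_train = rng.random((400, N * N))
y_train = rng.choice([-1, 1], size=400)

pipeline = Pipeline([
    ("eigenspace", PCA(n_components=P)),             # project images onto P eigenvehicles
    ("scale", MinMaxScaler(feature_range=(-1, 1))),  # scale weights to [-1, 1] (cf. Eq. 9)
    ("svm", SVC(kernel="rbf")),                      # RBF-kernel SVM classifier (cf. Eq. 8)
])

# Exponential grid search with five-fold cross validation over gamma and C.
grid = GridSearchCV(
    pipeline,
    param_grid={"svm__gamma": 2.0 ** np.arange(-4, 5),
                "svm__C": 2.0 ** np.arange(-4, 5)},
    cv=5,
)
grid.fit(X_train, y_train)

X_unknown = rng.random((5, N * N))                   # segmented candidate images to classify
print(grid.predict(X_unknown))                       # +1 for vehicle, -1 for non-vehicle
```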

2. Vehicle classification at night time

2.1. Vehicle image segmentation

We use the feature-based detection approach described in the work of Robert et al. [6] to obtain hypotheses about vehicle positions in night-scene video sequences. Figure 1 (Vehicle Image Segmentation) shows the segmentation results obtained from our implementation of this vehicle detection mechanism: all candidate vehicle images, at different sizes, are detected and segmented from the traffic scene. As can be seen from this list, due to the poor lighting conditions at nighttime, image regions with similar features are wrongly marked as real vehicles. In order to avoid such false positives while still detecting all possible vehicle images, we propose to use Principal Component Analysis (PCA) to build low-dimensional eigenspaces for all possible vehicle images, and then apply a Support Vector Machine classifier to verify unknown vehicle images against the model vehicle spaces.

2.2. Principal Components Analysis

As described in general face recognition applications [5], [7], principal components analysis is used for two main purposes. Firstly, it reduces the dimensionality of the data to a computationally feasible size. Secondly, it extracts the most representative features of the input data, so that although the size is reduced, the main features remain and can still represent the original data.

A set of M vehicle images (including cars, buses and trucks) is collected as the training set for our classifier. In order to build a model from images of different sizes, we first use nearest-neighbour interpolation to resize all vehicle images from the previous section to one common size N. Each image I of size N x N can then be viewed as a vector of length N^2, and the whole input set as a matrix with M rows and N^2 columns. We first compute the covariance matrix of this set of image vectors, understood as an algebraic transformation of the data, and then obtain the eigenvectors of this covariance transformation. Eigenvectors are the vectors whose direction is invariant under the transformation, and they can be used as a representative set for the whole dataset. These components are called eigenfaces in Turk and Pentland's face detection application [7] and eigenvehicles in Zhang et al.'s vehicle detection application [9].

The covariance matrix of the input data is computed starting from the arithmetic mean of all vectors I_1, I_2, ..., I_M:

    \Psi = \frac{1}{M} \sum_{i=1}^{M} I_i    (1)

The difference between an image vector I_i and the mean \Psi is

    \Phi_i = I_i - \Psi    (2)

The theoretical covariance matrix C of all \Phi_i is

    C = \frac{1}{M} \sum_{i=1}^{M} \Phi_i \Phi_i^T    (3)

All eigenvectors u_i and eigenvalues \lambda_i of this covariance matrix are derived from the relationship

    \lambda_i = \frac{1}{M} \sum_{n=1}^{M} (u_i^T \Phi_n)^2    (4)

The collection of M eigenvectors u_i can be seen as a reduced-dimension representation of the original input data (of size N^2) when M << N^2. Each eigenvector has an associated eigenvalue that indicates how much of the dataset's variation the eigenvector represents. Many papers have shown that only a small set of the eigenvectors with the top eigenvalues is enough to capture the characteristics of the whole image set. In our system, we keep the P top eigenvectors, where P represents the number of important features, and stack them to form the vehicle eigenspace U (P rows, N^2 columns):

    U = [u_1, u_2, ..., u_P]^T    (5)

Figure 2 (Eigenvehicles) shows the first 15 eigenfeatures of the vehicle front views produced from our training dataset of 400 vehicle images.
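The eigenspace construction in equations (1)-(5) can be written in a few lines of NumPy. This is an illustrative reconstruction, not the authors' implementation; the function and variable names (`build_eigenspace`, `images`) and the stand-in data are assumptions.

```python
# Hedged sketch of Eqs. (1)-(5): standard eigen-decomposition written with NumPy.
# `images` is an assumed (M, N*N) array of resized, flattened vehicle patches.
import numpy as np

def build_eigenspace(images: np.ndarray, P: int):
    """Return the mean vector Psi (N^2,), eigenspace U (P, N^2) and top eigenvalues."""
    psi = images.mean(axis=0)                    # Eq. (1): mean vehicle image
    phi = images - psi                           # Eq. (2): difference vectors, shape (M, N^2)

    # Eigenvectors of C = (1/M) * phi^T @ phi (Eq. 3) are the right singular
    # vectors of phi, so an SVD avoids forming the huge N^2 x N^2 matrix.
    _, s, vt = np.linalg.svd(phi, full_matrices=False)
    eigenvalues = (s ** 2) / images.shape[0]     # Eq. (4): variance captured by each u_i
    U = vt[:P]                                   # Eq. (5): keep the top-P eigenvectors

    return psi, U, eigenvalues[:P]

if __name__ == "__main__":
    # Example with stand-in data (M = 400 images of size 40 x 40).
    rng = np.random.default_rng(0)
    images = rng.random((400, 40 * 40))
    psi, U, lam = build_eigenspace(images, P=50)
    print(U.shape)  # (50, 1600): P rows, N^2 columns, as in Table 1
```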

These P representative features of the eigenspace are used to derive a transformed version of each vehicle image in this vehicle space. In our system, we call this transformed version the vehicle weight of the image with respect to the whole vehicle eigenspace; it can be used to judge the relationship between each image and the model vehicle space. The weight \omega_i of an input image vector I_i is calculated by multiplying the difference \Phi_i = I_i - \Psi with the eigenspace matrix U:

    \omega_i = U \Phi_i    (6)

The image weight calculated from equation (6) can be seen as the projection of an image onto the vehicle eigenspace, and it indicates the relative certainty that the image is an image of a real vehicle. Table 1 summarizes the main steps for deriving the vehicle eigenspace and the image vehicle weights.

Table 1. Vehicle Eigenspace and Weight
    Task             Result      Equation    Matrix size
 1  Image set        I           -           M x N^2
 2  Mean of I        \Psi        (1)         1 x N^2
 3  Difference       \Phi        (2)         M x N^2
 4  Covariance       C           (3)         N^2 x N^2
 5  Eigenvectors     u_i         (4)         1 x N^2
 6  Eigenspace       U           (5)         P x N^2
 7  Image weight     \omega_i    (6)         1 x P

Using the approach in Table 1, our initial training set T, which consists of M true-positive vehicle images and Q true-negative vehicle images, is transformed into a new set T' of input training image weights. This transformation shows how PCA reduces the original dimension of the dataset T (size (M+Q) x N^2) to T' (size (M+Q) x P), where generally P << N^2. Though the dimensions are greatly reduced, the most representative features of the whole dataset still remain within only P eigenfeatures.
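A short continuation of the sketch above shows how equation (6) turns raw images into their P-dimensional weight vectors and how the training matrix T' could be assembled. Again, the names and the commented usage are assumptions, not the paper's code.

```python
# Hedged sketch of Eq. (6): project images onto the vehicle eigenspace to obtain
# their weight vectors. `psi` and `U` come from build_eigenspace() above.
import numpy as np

def vehicle_weights(images: np.ndarray, psi: np.ndarray, U: np.ndarray) -> np.ndarray:
    """Map a (K, N^2) batch of flattened images to (K, P) weight vectors."""
    phi = images - psi          # difference from the mean vehicle (Eq. 2)
    return phi @ U.T            # Eq. (6): omega_i = U * phi_i, one 1 x P row per image

# Building the SVM training matrix T' from M positives and Q negatives could look like:
# weights_pos = vehicle_weights(positive_images, psi, U)   # (M, P)
# weights_neg = vehicle_weights(negative_images, psi, U)   # (Q, P)
# T_prime = np.vstack([weights_pos, weights_neg])          # ((M+Q), P)
```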

2.3. SVM Training and Prediction

The Support Vector Machine [1] is a statistical learning technique that is increasingly becoming the machine learning method of choice for pattern recognition. Burges [1] gives a detailed description of how to use SVMs for pattern recognition, including how the method emerges from its statistical basis and its various applications. The main objective of training a support vector machine is to find the largest possible classification margin, which corresponds to minimizing

    \frac{1}{2} w^T w + C \sum_i \xi_i    (7)

where \xi_i >= 0 and C is the error tolerance level. In our vehicle classification case the data is non-linear, but the theory applies in the same way: real vehicle images are considered on one side and non-vehicle images on the other. Training on instance-label pairs L_i = (\omega_i, y_i), where \omega_i is the weight vector and y_i \in \{-1, 1\} is the class label of \omega_i, aims to derive the optimal support vectors in the vehicle eigenspace. These support vectors form a maximum margin m = 2 / \|w\| between the two classes and constitute the SVM model S used in the subsequent classification process. The judgement on any unclassified image U_i is based on the relationship between its projection weight \omega_i onto the eigenspace U and its margin distance from the SVM model S. Among the popular kernel functions that can be used to find the optimal margin m, the Radial Basis Function (RBF) kernel K(x_i, x_j) was selected for our vehicle classification system, due to its simplicity and its proven capability of dealing with both non-linear and linear datasets (Keerthi and Lin [4]):

    K(x_i, x_j) = \exp(-\gamma \|x_i - x_j\|^2), \quad \gamma > 0    (8)
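The training step can be sketched as follows. The paper uses libSVM [2]; scikit-learn's SVC wraps the same library, so it is used here as a stand-in, and the data, labels and variable names are assumptions. The gamma and C values shown are the ones reported by the grid search in Section 3.1.

```python
# Hedged sketch of RBF-SVM training on eigenspace weights; not the paper's code.
import numpy as np
from sklearn.svm import SVC

def rbf_kernel(x_i: np.ndarray, x_j: np.ndarray, gamma: float) -> float:
    """Eq. (8): K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)."""
    return float(np.exp(-gamma * np.sum((x_i - x_j) ** 2)))

rng = np.random.default_rng(0)
T_prime = rng.uniform(-1.0, 1.0, size=(600, 50))     # (M+Q) x P scaled weight vectors
labels = rng.choice([-1, 1], size=600)                # y_i in {-1, +1}

model_S = SVC(kernel="rbf", gamma=2.0, C=0.5)         # parameters from Section 3.1
model_S.fit(T_prime, labels)

omega_unknown = rng.uniform(-1.0, 1.0, size=(1, 50))  # weight of an unclassified image
print(model_S.predict(omega_unknown))                 # +1 vehicle, -1 non-vehicle
```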

As suggested by Chang and Lin [2] for dataset scaling in general SVM classification, the training image weights T' are first scaled to the range [R_m, R_M] before being passed to the classifier. The scaling procedure is applied to each weight element E_{ij} of T': starting from the column maximum \max(E_i) and minimum \min(E_i) of each feature, the new value E'_{ij} of each feature weight element is calculated as

    E'_{ij} = R_m + \frac{(E_{ij} - \min(E_i)) (R_M - R_m)}{\max(E_i) - \min(E_i)}    (9)

In our application, we chose [R_m, R_M] to be [-1, 1], and the feature scaling extremes \min(E_i) and \max(E_i) are stored so that unknown vehicle image data can be scaled in the same way.

Support Vector Machines are proven to produce high accuracy on general pattern recognition problems, but that accuracy is also highly sensitive to the input parameters. For our vehicle classifier, the two vital parameters are the RBF kernel parameter \gamma and the error level C; depending on the particular data type and kernel, different values of \gamma and C can produce large differences in the classification results. In order to choose the most suitable parameters, the five-fold cross validation described by Chang and Lin [2] is used. The process performs an exponential grid search over different values of \gamma and C while dividing the whole training set into five mutually exclusive subsets. In each run, one subset is used to train an SVM model, which is then used to predict the other four subsets, and the accuracy rate is recorded. The pair of \gamma and C that gives the best accuracy rate is chosen as the optimized training parameters for this training set.

The values obtained from this grid search and five-fold cross validation are then used to train our vehicle SVM classifier and produce an SVM model S, which is used to predict an unknown vehicle image U_i with weight \omega_i, returning the classification result y \in \{-1, 1\}. Table 2 summarizes our approach to training on the vehicle image set and predicting unknown vehicle images.

Table 2. SVM Training and Prediction
    Task                     Result                  Description
 1  Training weights         T'                      Equation (6)
 2  Scaled weights           T''                     Equation (9)
 3  Labeled weights          L_i = (\omega_i, y_i)   class labels assigned to T''
 4  Optimized \gamma and C   \gamma, C               cross validation on L_i
 5  SVM model                S                       trained from T'', \gamma and C
 6  Unknown image weight     \omega_i                Equation (6)
 7  Scaled weight            \omega'_i               scaled with \min(E_i), \max(E_i)
 8  Result                   y                       predict \omega'_i on S
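The scaling of equation (9) and the exponential grid search with five-fold cross validation can be sketched as below. The paper relies on libSVM's own tools [2]; this NumPy / scikit-learn version is only illustrative, `T_prime` and `labels` are assumed to come from the PCA step, and the grid is deliberately small for brevity.

```python
# Hedged sketch of Eq. (9) scaling and the (gamma, C) grid search; not the paper's code.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

def fit_scaler(T_prime, r_m=-1.0, r_M=1.0):
    """Store per-feature extremes so unknown weights can be scaled identically (Eq. 9)."""
    lo, hi = T_prime.min(axis=0), T_prime.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)                    # guard against constant features
    return lambda E: r_m + (E - lo) * (r_M - r_m) / span

rng = np.random.default_rng(0)
T_prime = rng.random((600, 50))
labels = rng.choice([-1, 1], size=600)

scale = fit_scaler(T_prime)
T_scaled = scale(T_prime)                                     # reuse `scale` on unknown weights too

# Exponential grid over gamma and C, scored by five-fold cross validation.
best = max(
    ((g, c, cross_val_score(SVC(kernel="rbf", gamma=g, C=c), T_scaled, labels, cv=5).mean())
     for g in 2.0 ** np.arange(-2, 3) for c in 2.0 ** np.arange(-2, 3)),
    key=lambda t: t[2],
)
print("best gamma=%.3f C=%.3f accuracy=%.3f" % best)
```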

In order to keep our vehicle image training set up to date, the classification procedure also makes use of the tracking module to add new false-positive images to the training set. This is done using the tracking history of each vehicle: if the classifier keeps returning true on an image but the number of tracking denials exceeds a threshold, that image is put into the false-positive set. When the counter of new false-positive images reaches another threshold, a new classifier training run is carried out, hence updating the classifier. With this approach, after sufficient running time, the detection accuracy of the system is kept up to date; and since the approach does not depend on the initial training set of the classifier, it is highly applicable to other cases: while the initial classifier was tested at one particular highway location, it can still be installed at other locations with different contexts and conditions.
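A minimal sketch of this update rule is given below, assuming hypothetical thresholds, a `retrain` callback and a candidate record; the paper does not specify concrete values or interfaces, so all names here are illustrative.

```python
# Hedged sketch of the tracking-based false-positive update; thresholds and
# interfaces are assumptions, not values from the paper.
from dataclasses import dataclass

DENIAL_THRESHOLD = 5       # tracking denials before an image is treated as a false positive
RETRAIN_THRESHOLD = 50     # accumulated false positives before the classifier is retrained

@dataclass
class TrackedCandidate:
    image_weight: list            # omega vector of the candidate image
    classifier_hits: int = 0      # times the SVM labelled it a vehicle
    tracking_denials: int = 0     # times the tracker failed to confirm it

false_positive_set = []

def update_training_set(candidate: TrackedCandidate, retrain) -> None:
    """Move persistently denied detections into the false-positive set and retrain."""
    if candidate.classifier_hits > 0 and candidate.tracking_denials > DENIAL_THRESHOLD:
        false_positive_set.append(candidate.image_weight)
        if len(false_positive_set) >= RETRAIN_THRESHOLD:
            retrain(false_positive_set)   # e.g. refit the SVM with the enlarged negative set
            false_positive_set.clear()
```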

3. Experiment Results

3.1. Parameters Adjustment and Results

Five-fold cross validation with an exponential grid search on the given training data set revealed two optimized training parameters for our vehicle SVM classifier, \gamma = 2.0 and C = 0.5, with a classification accuracy of 92.34%. Using these values, we then carried out a series of 12 tests to observe how the norm size N and feature range P affect processing time and classification accuracy, and to select the most suitable values of N and P for future classification. These tests were run over a nighttime traffic sequence of 6000 frames (360 x 360) extracted at a rate of 25 frames per second (fps), on a Pentium IV machine with 2 GB RAM. We use the libSVM library [2] for our SVM implementation and the Intel Open Computer Vision library (OpenCV) for most of the image morphological operations. Table 3 summarizes the results obtained from these tests (with N varying from 20 to 60 and P from 10 to 70); the three outputs of particular interest are the training time t_tr(N, P), the prediction time t_pr(N, P), and the total classification accuracy a of the system. Figure 3 is a normalized version of Table 3 which graphically reveals the impact of norm size N and feature range P on classification performance. It can easily be seen that the value of N largely drives the variance of the prediction time, while the training time depends on both N and P; this biased relationship is reasonable, since the squared change in norm size (N^2) is much greater than the change in P. In addition, the gain in classification accuracy slows down once the norm size reaches 40: the accuracy rate a increases by only 0.3% while the classification time nearly doubles. This analysis of N and P led us to select norm size N = 40 and feature range P = 50 as the most suitable candidates for this nighttime vehicle classifier.
Table 3. Variable Analysis
 Run    N     P    t_tr (ms)   t_pr (µs)   a (%)
  1     20    10     1684          96      87.66
  2     20    30     2305          98      90.42
  3     20    50     2864         103      91.40
  4     20    70     3225         107      92.93
  5     40    10     4500         294      91.76
  6     40    30     5132         299      93.97
  7     40    50     5454         314      95.88
  8     40    70     5763         325      95.93
  9     60    10     7032         540      93.17
 10     60    30     7650         556      95.15
 11     60    50     7962         572      96.10
 12     60    70     8133         575      96.20

[Figure 3. Norm Size - Feature Range Analysis]

Table 4. Vehicle Classification Results
 Run     t_tr (ms)   t_pr (µs)   a_p (%)   a_n (%)   a (%)
 LSD1       N/A          70       51.39     56.32    54.45
 ANN1      6125         256       72.93     63.35    66.98
 SVM1      6530         320       93.36     93.65    93.54
 LSD2       N/A          62       45.69     63.32    55.85
 ANN2      7468         278       68.94     70.36    69.76
 SVM2      7645         311       94.45     96.88    95.85
 LSD3       N/A          64       52.70     58.69    56.31
 ANN3      6945         269       75.92     71.63    73.33
 SVM3      7533         316       95.36     93.22    94.07

3.2. Performance Evaluation


In order to evaluate this approach, we use traffic video sequences from three different locations, and for each sequence we also run two additional methods: an Artificial Neural Network (ANN) and a modified version of the Least Squared Distance (LSD) classification from Turk and Pentland's face recognition work [7]. We use the Multi-Layer Perceptron (MLP) feed-forward ANN available in the OpenCV library to train on the vehicle training input weights T' (size (M+Q) x P); ANN prediction is carried out on the unknown image weight \omega_i calculated from equation (6). LSD classification is a quick and simple method which uses the squared distance between the unknown image difference \Phi_i = I_i - \Psi and its reconstruction from the vehicle space, \Phi_v = U^T \omega_i, that is, \epsilon_i^2 = \|\Phi_i - \Phi_v\|^2. An image I_i is classified as a vehicle if its distance satisfies \epsilon_i^2 <= \theta, where the threshold \theta is the average distance of all vehicle images in the training set to the vehicle space. The results obtained from the evaluation runs show that our proposed approach generally takes slightly more classification time than the ANN and LSD methods (due to the additional steps in normalizing the input data), but its average value is still within acceptable real-time performance (around 300 µs compared to the total frame-grab time of 40000 µs). On the performance side, our framework provides a much higher classification rate (around 94%) than the other two (about 55% for LSD and 70% for ANN), which proves the effectiveness of this optimized classifier for vehicle images in nighttime traffic scenes.
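For reference, a minimal sketch of the LSD baseline follows, assuming the `psi`, `U` and weight-projection helpers from the earlier PCA sketches; it illustrates the distance-to-vehicle-space idea from [7], not the authors' exact implementation.

```python
# Hedged sketch of the LSD baseline: distance of an image to the vehicle eigenspace,
# with the threshold set to the average distance of the training vehicles.
import numpy as np

def distance_to_vehicle_space(images, psi, U):
    phi = images - psi                           # difference from the mean (Eq. 2)
    omega = phi @ U.T                            # projection weights (Eq. 6)
    phi_v = omega @ U                            # reconstruction from the eigenspace
    return np.sum((phi - phi_v) ** 2, axis=1)    # epsilon^2 per image

def lsd_classifier(train_vehicle_images, psi, U):
    theta = distance_to_vehicle_space(train_vehicle_images, psi, U).mean()
    return lambda imgs: distance_to_vehicle_space(imgs, psi, U) <= theta   # True = vehicle
```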

4. Conclusion
In this paper, a systematic approach has been proposed and implemented to provide a robust solution for nighttime vehicle classification in traffic surveillance. A vehicle eigenspace model is built from different sources of known vehicles, and an optimized Support Vector Machine classifier is then used to label vehicle images based on that model. Results obtained from different runs demonstrate the robustness of this nighttime vehicle classification framework in both accuracy and processing speed.

5. Acknowledgment
This work is part of a collaboration project between National ICT of Australia (NICTA) and the Roads and Traffic Authority of New South Wales (RTA). NICTA is funded by the Australian Government's Backing Australia's Ability initiative, in part through the Australian Research Council.

References
[1] C. J. C. Burges. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2):121-167, 1998.
[2] C.-C. Chang and C.-J. Lin. LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
[3] S. Gupte, O. Masoud, R. Martin, and N. Papanikolopoulos. Detection and classification of vehicles. IEEE Transactions on Intelligent Transportation Systems, 3(1):37-47, March 2002.
[4] S. S. Keerthi and C.-J. Lin. Asymptotic behaviors of support vector machines with Gaussian kernel. Neural Computation, 15(7):1667-1689, 2003.
[5] M. Kirby and L. Sirovich. Application of the Karhunen-Loeve procedure for the characterization of human faces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(1):103-108, 1990.
[6] K. Robert, N. Gheissari, and R. Hartley. Video processing for traffic surveillance, 2006. Technical report, National ICT of Australia, http://www.nicta.com.au/.
[7] M. Turk and A. Pentland. Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1):71-86, 1991.
[8] S. Zehang, G. Bebis, and R. Miller. On-road vehicle detection using evolutionary Gabor filter optimization. IEEE Transactions on Intelligent Transportation Systems, 6(2):125-137, 2005.
[9] C. Zhang, X. Chen, and W. B. Chen. A PCA-based vehicle classification framework. IEEE International Conference on Data Engineering, pages 17-17, 2006.
