D. Prabavathi, M.Phil Scholar, Department of Computer Science, Selvamm Arts and Science College (Autonomous), Namakkal (Tk) (Dt) – 637003
Mrs. K. V. Sumathi, M.C.A., M.Phil, Assistant Professor, Department of Computer Science, Selvamm Arts and Science College (Autonomous), Namakkal (Tk) (Dt) – 637003
Mrs. K. K. Kavitha, M.C.A., M.Phil., SET., (Ph.D)., Vice Principal, Head of the Department of Computer Science, Selvamm Arts and Science College (Autonomous), Namakkal (Tk) (Dt) – 637003
Abstract: Traffic classification has wide applications in network management, from security monitoring to quality of service measurements. Recent research tends to apply machine learning techniques to flow statistical feature based classification methods. The nearest neighbor (NN)-based method has exhibited superior classification performance. It also has several important advantages: it requires no training procedure, carries no risk of overfitting parameters, and naturally handles a huge number of classes. However, the performance of the NN classifier can be severely affected if the size of the training data is small. In this paper, we propose a novel nonparametric approach for traffic classification, which can improve the classification performance effectively by incorporating correlated information into the classification process. We analyze the new classification approach and its performance benefit from both theoretical and empirical perspectives. A large number of experiments are carried out on two real-world traffic data sets to validate the proposed approach. The results show that the traffic classification performance can be improved significantly even under the extremely difficult circumstance of very few training samples.
__________________________________________________*****_________________________________________________
NETWORK TRAFFIC CLASSIFICATION
Network traffic classification has drawn significant attention over the past few years. Classifying traffic flows by their generating applications plays a very important role in network security and management, such as quality of service (QoS) control, lawful interception, and intrusion detection. Traditional traffic classification methods include port-based prediction methods and payload-based deep inspection methods. In the current network environment, the traditional methods suffer from a number of practical problems, such as dynamic ports and encrypted applications. Recent research efforts have focused on applying machine learning techniques to traffic classification based on flow statistical features.

COMPUTATIONAL PERFORMANCE
The computational performance includes learning time, amount of storage, and classification time. First, the NN classifier does not really involve any learning process, a property shared with our proposed methods. In contrast, other supervised methods, such as neural nets and SVM, need time to learn the parameters of their classification model. Second, the proposed methods use the nearest neighbor rule, which requires storage for all training data samples. However, the amount of storage is tiny if the training data size is small. Identifying the nearest neighbor of a given flow from among n training flows is conceptually straightforward and requires n distance calculations. The nearest neighbor rule is embedded in the proposed methods for traffic classification. With a small training set, the NN classifiers and the proposed methods classify very quickly. For instance, with 10 training samples for each class, the classification time of the proposed methods is about 2 seconds and 5 seconds for the wide dataset and the isp dataset, respectively. The proposed methods, AVG-NN, MIN-NN, and MVT-NN, have the same classification time because they follow the same classification approach. Due to the extra aggregation operation, the classification time of
IJFRCSCE | November 2017, Available @ http://www.ijfrcsce.org
_______________________________________________________________________________________
International Journal on Future Revolution in Computer Science & Communication Engineering ISSN: 2454-4248
Volume: 3 Issue: 11 370 – 375
_______________________________________________________________________________________________
the proposed methods is a little longer than that of NN, but the classification accuracy of our methods is much higher than that of NN. For NN to achieve the same accuracy as our proposed methods, it needs more training samples and must therefore spend more classification time.

SYSTEM FLEXIBILITY
The proposed system model is open to feature extraction and correlation analysis. First, any kind of flow statistical feature can be applied in our system model. In this work, we extract unidirectional statistical features from full flows. Statistical features extracted from parts of flows can also be used to represent traffic flows in our system model. Second, any new correlation analysis method can be applied to discover correlation information in traffic flows and improve the robustness of classification. In this paper, a 3-tuple heuristic based method is applied to discover flow correlation, which is modeled by bags of flows (BoFs). We presented a comprehensive analysis from theoretical and empirical perspectives, which is based on the BoF model instead of the 3-tuple method. Therefore, new correlation analysis methods will not affect the effectiveness of the proposed approach. In the future, we will work on developing new methods for flow correlation analysis.

CORRELATION ANALYSIS
We conduct correlation analysis using a 3-tuple heuristic, which can quickly discover BoFs in real traffic data.
3-tuple heuristic: in a certain period of time, the flows sharing the same 3-tuple {dst ip, dst port, protocol} form a BoF.
The correlated flows sharing the same 3-tuple are generated by the same application. For example, several flows initiated by different hosts all connect to the same host at TCP port 80 within a short period. These flows are very likely generated by the same application, such as a web browser. The 3-tuple heuristic about flow correlation has been considered in several practical traffic classification schemes. Ma et al. proposed a payload-based clustering method for protocol inference, in which they grouped flows into equivalence clusters using the heuristic. Canini et al. tested the correctness of the 3-tuple heuristic with real-world traces. In our previous work, we applied the heuristic to improve unsupervised traffic clustering. In this paper, we use BoFs to model the correlation information obtained by the 3-tuple heuristic and study BoF-model-based supervised classification, which is different from the existing works.

[...] computing, the number of applications deployed on the Internet is quickly increasing, and many applications adopt encryption techniques. This situation makes it harder to classify traffic flows according to their generation applications. The current research of traffic classification concentrates on the application of machine learning techniques to flow statistical feature based classification methods.

A TRAFFIC CLASSIFICATION APPROACH WITH FLOW CORRELATION
This paper presents a new framework, which we call Traffic Classification using Correlation information, or TCC for short. A novel nonparametric approach is also proposed to effectively incorporate flow correlation information into the classification process.
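
As an illustration only, the 3-tuple heuristic and the nearest neighbor rule described above might be sketched in Python as follows. The flow record fields (dst_ip, dst_port, protocol, start, features), the fixed 60-second window, the Euclidean distance, and the majority-vote aggregation over a BoF are all assumptions made for this sketch. The text does not specify them, and this is not the authors' implementation.

```python
from collections import Counter, defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def group_into_bofs(flows, window=60.0):
    """Group flows that share {dst ip, dst port, protocol} within one time window.

    Assumption: "a certain period of time" is modeled as fixed buckets of
    `window` seconds based on each flow's start time.
    """
    bofs = defaultdict(list)
    for flow in flows:
        bucket = int(flow["start"] // window)
        key = (flow["dst_ip"], flow["dst_port"], flow["protocol"], bucket)
        bofs[key].append(flow)
    return list(bofs.values())


def nearest_label(features, training):
    """Plain NN rule: n distance calculations for n training flows."""
    return min(training, key=lambda sample: dist(features, sample[0]))[1]


def classify_bof(bof, training):
    """Assumed aggregation: majority vote over the per-flow NN decisions."""
    votes = Counter(nearest_label(f["features"], training) for f in bof)
    return votes.most_common(1)[0][0]


# Hypothetical training set: (feature vector, class label) pairs.
training = [((100.0, 10.0), "web"), ((1500.0, 200.0), "p2p")]

# Hypothetical flows; the third one would be misclassified on its own.
flows = [
    {"dst_ip": "10.0.0.1", "dst_port": 80, "protocol": "tcp", "start": 1.0, "features": (110.0, 12.0)},
    {"dst_ip": "10.0.0.1", "dst_port": 80, "protocol": "tcp", "start": 5.0, "features": (90.0, 9.0)},
    {"dst_ip": "10.0.0.1", "dst_port": 80, "protocol": "tcp", "start": 9.0, "features": (900.0, 120.0)},
    {"dst_ip": "10.0.0.2", "dst_port": 6881, "protocol": "tcp", "start": 3.0, "features": (1400.0, 190.0)},
    {"dst_ip": "10.0.0.1", "dst_port": 80, "protocol": "tcp", "start": 130.0, "features": (105.0, 11.0)},
]

bofs = group_into_bofs(flows)
labels = [classify_bof(bof, training) for bof in bofs]
print(labels)  # ['web', 'p2p', 'web']
```

In this toy input, the third flow is closer to the "p2p" training sample when classified alone, but the other two flows in its BoF outvote it. This is the kind of benefit that flow correlation is meant to bring when training samples are scarce.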