Академический Документы
Профессиональный Документы
Культура Документы
II.
I.
INTRODUCTION
RELATED WORKS
III.
METHODOLOGY
IV.
C. Data Preprocessing
The next step is conducting data preprocessing. The data
preprocessing steps of this research are described as follows:
1. All the XML report files were parsed to select the most
relevant and important attribute values (feature selection).
2. A term dictionary was created, which contains all the
attribute values that were previously parsed and selected.
3. Each XML report file was compared against the term
dictionary by counting the existence (or non-existence) of
each term word in the term dictionary based on binary
weight and term frequency weight.
4. Sparse vector models were created for each XML report
file and Attribute-Relation File Format (ARFF) files were
created.
100.0%
90.0%
80.0%
70.0%
60.0%
50.0%
Recall (TPR)
J48
SVM
Nave Bayes
kNN
Precision
(PPV)
Accuracy (ACC)
202
204
201
100.0%
100.0%
90.0%
80.0%
70.0%
60.0%
50.0%
90.0%
Recall (TPR)
Recall (TPR)
Precision
(PPV)
70.0%
Accuracy (ACC)
50.0%
Precision (PPV)
Accuracy (ACC)
8
J4
LP
60.0%
Na k N
v
e N
Ba
ye
s
SV
M
J48
SVM
Nave Bayes
kNN
80.0%
100.0%
90.0%
Recall (TPR)
80.0%
Precision (PPV)
70.0%
Accuracy (ACC)
60.0%
50.0%
REFERENCES
J4
8
M
LP
Na k
v N N
e
Ba
ye
s
SV
M
CONCLUSION
[1]
[2]
[3]
[4]
[5]
[6]
203
205
202