we may either evaluate the performance of the classifier by validating the classification results against
any accessible diagnosis information, or evaluate the classification capability of each feature, or of each
set of features within one test, by setting thresholds based on information entropy. In addition, after
predicting the labels of the undetermined subjects, the distance between their test results and those of
subjects with a relatively definite diagnosis can be calculated. This distance can be computed in various
ways; different methods carry different mathematical meanings, from which, hopefully, physical meanings
can also be mined. As for processing the raw data, some machine learning techniques can be used, but
unsolved questions remain. First, the feature
extraction step includes transforming the non-numeric parameters into types that machine learning
techniques can use, especially techniques that operate in Euclidean space. Methods such as expanding
the parameters with indicator variables, or linearizing them directly, are potentially dangerous but remain
to be explored. Secondly, feature selection is also a huge topic in machine learning, and in this
situation it is even vital. Choosing well-defined features will give clinicians both efficiency and high
accuracy. Wrapper-style feature selection with cross-validation is commonly used to get rid of highly
correlated features, and combining it with decision-tree-based algorithms might be a good first trial,
since decision tree methods are fully compatible with non-numeric features.
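
The idea of setting a threshold based on information entropy can be illustrated with a minimal sketch. It assumes a single numeric feature and a binary split scored by information gain, which is one common reading of "entropy-based threshold" and not necessarily the procedure intended here. The function names and the toy scores and diagnoses are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, information_gain) of the best binary split of one feature."""
    base = entropy(labels)
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_t, best_gain = None, 0.0
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no threshold fits between two equal values
        t = (lo + hi) / 2  # candidate threshold: midpoint of neighbouring values
        left = [lab for v, lab in pairs if v <= t]
        right = [lab for v, lab in pairs if v > t]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


# Hypothetical toy data: one test score per subject and a known diagnosis.
scores = [0.2, 0.4, 0.5, 1.1, 1.3, 1.6]
diagnosis = ["healthy", "healthy", "healthy", "patient", "patient", "patient"]

threshold, gain = best_threshold(scores, diagnosis)
print(threshold, gain)
```

A feature whose best split yields a gain close to the entropy of the labels separates the classes almost perfectly, while a gain near zero suggests the feature carries little diagnostic information on its own.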
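
The risk of linearizing a non-numeric parameter directly, compared with expanding it into indicator variables, shows up as soon as distances are computed. The sketch below is an illustration under assumptions: the three-valued parameter and its categories are hypothetical, and Euclidean and Manhattan distances are just two of the possible distance choices.

```python
import math


def one_hot(value, categories):
    """Expand a categorical value into indicator variables."""
    return [1.0 if value == c else 0.0 for c in categories]


def ordinal(value, categories):
    """Linearize a categorical value directly into its list position."""
    return [float(categories.index(value))]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical non-numeric parameter with no natural ordering.
categories = ["left", "right", "both"]

# Direct linearization invents an ordering: "both" ends up twice as far from "left" as "right" is.
d_left_right_ord = euclidean(ordinal("left", categories), ordinal("right", categories))
d_left_both_ord = euclidean(ordinal("left", categories), ordinal("both", categories))

# Indicator variables keep every pair of distinct categories equally far apart.
d_left_right_hot = euclidean(one_hot("left", categories), one_hot("right", categories))
d_left_both_hot = euclidean(one_hot("left", categories), one_hot("both", categories))

# The choice of metric changes the scale: Manhattan distance between two one-hot vectors is 2.
d_left_right_hot_l1 = manhattan(one_hot("left", categories), one_hot("right", categories))

print(d_left_right_ord, d_left_both_ord, d_left_right_hot, d_left_both_hot, d_left_right_hot_l1)
```

Indicator variables avoid the artificial ordering but add one dimension per category, so a parameter with many categories can dominate a Euclidean distance unless the features are rescaled.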
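
Wrapper-style selection with cross-validation can be sketched as a greedy forward search that keeps a feature only if it improves the cross-validated accuracy. This is a minimal illustration, not the method proposed here: to stay dependency-free it swaps in a 1-nearest-neighbour classifier with leave-one-out validation in place of a decision tree, and the data set, in which the second feature duplicates the first, is hypothetical.

```python
import math


def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on the chosen features."""
    correct = 0
    for i, row in enumerate(rows):
        best_j, best_d = None, math.inf
        for j, other in enumerate(rows):
            if j == i:
                continue  # the held-out subject may not vote for itself
            d = sum((row[f] - other[f]) ** 2 for f in features)
            if d < best_d:
                best_j, best_d = j, d
        correct += labels[best_j] == labels[i]
    return correct / len(rows)


def forward_select(rows, labels, n_features):
    """Greedy wrapper: add the feature that most improves accuracy, stop when none does."""
    selected, best_score = [], 0.0
    remaining = list(range(n_features))
    while remaining:
        score, f = max((loo_accuracy(rows, labels, selected + [f]), f) for f in remaining)
        if score <= best_score:
            break  # no remaining feature improves the validated accuracy
        selected.append(f)
        remaining.remove(f)
        best_score = score
    return selected, best_score


# Hypothetical data: feature 1 is an exact copy of feature 0, feature 2 is uninformative.
rows = [
    [0.1, 0.1, 0.90],
    [0.2, 0.2, 0.10],
    [0.3, 0.3, 0.80],
    [0.9, 0.9, 0.20],
    [1.0, 1.0, 0.85],
    [1.1, 1.1, 0.15],
]
labels = [0, 0, 0, 1, 1, 1]

selected, score = forward_select(rows, labels, n_features=3)
print(selected, score)
```

Because the duplicated feature adds no accuracy once its twin has been chosen, the search keeps only one of the two, which is the sense in which a wrapper discards highly correlated features.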
