Emphasizing Minority Class in LDA for Feature Subset Selection on
High-Dimensional Small-Sized Problems

Abstract:
Although mostly used for pattern classification, linear discriminant
analysis (LDA) can also serve in feature selection as an effective measure
of the separative ability of a feature subset. When applied to feature
selection on high-dimensional small-sized (HDSS) data, which are generally
class-imbalanced, LDA encounters four problems: singularity of the scatter
matrix, overfitting, overwhelming of the minority class by the majority
class, and prohibitive computational complexity. In this study, we propose
an LDA-based feature selection method, MCE-LDA (minority class emphasized
linear discriminant analysis), with a new regularization technique that
addresses the first three problems. Unlike conventional forms of
regularization, which give equal or greater emphasis to the majority class,
the proposed regularization places more emphasis on the minority class,
with the expectation of improving overall performance by alleviating both
the overwhelming of the minority class by the majority class and
overfitting in the minority class. To reduce the computational overhead, an
incremental implementation of LDA-based feature selection is introduced.
Comparative studies with other forms of regularization for LDA and with
other popular feature selection methods on five HDSS problems show that
MCE-LDA produces feature subsets with excellent classification performance
and robustness. Further experimental results on the true positive rate
(TPR) and true negative rate (TNR) also verify the effectiveness of the
proposed technique in alleviating the overwhelming and overfitting
problems.
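
The TPR and TNR mentioned above can be computed as in the following minimal
sketch. The function name `tpr_tnr` and the choice of label 1 as the positive
class are assumptions made for illustration; the abstract does not say which
class is treated as positive. The toy labels are made up and only show why
accuracy alone can hide an overwhelmed minority class.

```python
def tpr_tnr(y_true, y_pred, positive=1):
    """Return (TPR, TNR) for a binary problem.

    TPR = TP / (TP + FN), TNR = TN / (TN + FP).
    `positive` marks the class counted as positive (an assumption here).
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    tnr = tn / (tn + fp) if (tn + fp) else float("nan")
    return tpr, tnr


# Made-up example: 2 positive (minority) and 8 negative (majority) samples,
# and a classifier that always predicts the majority class.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
print(tpr_tnr(y_true, y_pred))  # (0.0, 1.0), although accuracy is 0.8
```

Reporting both rates makes the failure visible: every majority sample is
classified correctly while every minority sample is missed.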

Existing System:
A typical set-based feature selection algorithm runs recursively, embedding
two major components in its recursive procedure. The first is a strategy
for searching or generating candidate feature subsets; the second is an
evaluation criterion that measures the goodness of the candidate feature
subsets generated.
For the first component, sequential forward searching (SFS) and sequential
backward searching (SBS) are usually employed. For the second component,
two types of evaluation criteria are generally used: classifier-dependent
and classifier-independent measures.
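
The two components can be sketched as follows for the forward (SFS) case.
This is a generic illustration, written iteratively rather than recursively,
and not the procedure of any particular system: the function name
`sequential_forward_selection`, the stopping rule (a fixed subset size `k`)
and the toy additive criterion are all assumptions.

```python
def sequential_forward_selection(n_features, criterion, k):
    """Greedy SFS: grow the subset one feature at a time.

    criterion(subset) -> float is the evaluation component; higher is better.
    It may be classifier-dependent or classifier-independent.
    """
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        # Search component: try each unused feature next to the current subset.
        best = max(remaining, key=lambda f: criterion(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy criterion (hypothetical): each feature carries a fixed usefulness score.
usefulness = [0.1, 0.9, 0.5, 0.7]
toy_criterion = lambda subset: sum(usefulness[f] for f in subset)

print(sequential_forward_selection(4, toy_criterion, k=2))  # [1, 3]
```

SBS mirrors this loop: it starts from the full feature set and repeatedly
removes the feature whose removal leaves the best-scoring subset.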

Proposed System:
We propose minority class emphasized linear discriminant analysis
(MCE-LDA), which introduces a new form of regularization for LDA.
Conventional forms of regularization, such as shrinking the individual
scatter matrices towards the pooled scatter matrix, give more emphasis to
the class holding the majority of samples; our new regularization instead
places more emphasis on the minority class.
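
To make the contrast concrete, the sketch below scores a feature subset with
the LDA trace criterion tr(Sw^-1 Sb) under two class weightings. It is only an
illustration and is not the MCE-LDA regularizer, whose formula this summary
does not give. The name `lda_criterion`, the `emphasis` switch, the
inverse-class-size weights used for the "minority" setting, and the small
`ridge` term that keeps Sw invertible are all assumptions.

```python
import numpy as np


def lda_criterion(X, y, subset, emphasis="pooled", ridge=1e-3):
    """Score a feature subset by tr(Sw^-1 Sb); larger means better separation.

    emphasis="pooled":   each class covariance is weighted by its sample share,
                         so the majority class dominates Sw.
    emphasis="minority": weights are proportional to 1 / class size
                         (a hypothetical choice, not the paper's formula).
    """
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    Xs = X[:, subset]
    labels, counts = np.unique(y, return_counts=True)
    priors = counts / counts.sum()
    if emphasis == "pooled":
        weights = priors
    elif emphasis == "minority":
        inverse = 1.0 / counts
        weights = inverse / inverse.sum()
    else:
        raise ValueError("emphasis must be 'pooled' or 'minority'")

    d = Xs.shape[1]
    overall_mean = Xs.mean(axis=0)
    Sw = ridge * np.eye(d)  # ridge term avoids a singular within-class matrix
    Sb = np.zeros((d, d))
    for label, prior, weight in zip(labels, priors, weights):
        Xc = Xs[y == label]
        centred = Xc - Xc.mean(axis=0)
        Sw += weight * (centred.T @ centred) / len(Xc)  # weighted class covariance
        diff = Xc.mean(axis=0) - overall_mean
        Sb += prior * np.outer(diff, diff)  # between-class scatter
    return float(np.trace(np.linalg.solve(Sw, Sb)))


# Synthetic HDSS-style data: 36 samples, 50 features, 30 vs. 6 class imbalance.
rng = np.random.default_rng(0)
X = rng.normal(size=(36, 50))
y = np.array([0] * 30 + [1] * 6)
X[y == 1, 0] += 5.0  # only feature 0 separates the classes

score_informative = lda_criterion(X, y, [0], emphasis="minority")
score_noise = lda_criterion(X, y, [1], emphasis="minority")
print(score_informative > score_noise)
```

A function of this shape can serve as the evaluation criterion inside a
sequential search, with the weighting deciding how much each class's scatter
counts when a candidate subset is scored.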
The rationale behind our regularization is that, with a small sample size
and class imbalance, the minority class is prone to being overlooked when a
feature subset is evaluated during feature selection; the influence of the
minority class should therefore be enhanced.
Our experimental studies show that MCE-LDA produces feature subsets that
improve both classification performance and robustness.

Hardware Requirements:

System       : Pentium IV 2.4 GHz
Hard Disk    : 40 GB
Floppy Drive : 1.44 MB
Monitor      : 15" VGA Colour
Mouse        : Logitech
RAM          : 256 MB

Software Requirements:

Operating system : Windows XP
Front End        : JSP
Back End         : SQL Server

Software Requirements:

Operating system : Windows XP
Front End        : .Net
Back End         : SQL Server