1 http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_BMI_Regression#References
Classification Example
Regression Example
Classification tree for county-level outcomes in the 2008 Democratic Party primary (as of April 16), by
Amanda Cox for the New York Times
A decision tree builds classification or regression models in the form of a tree structure. It breaks
down a dataset into smaller and smaller subsets while, at the same time, an associated decision
tree is incrementally developed. The final result is a tree with decision nodes and leaf nodes.
The goal of classification trees is to predict (i.e., suggest a decision) and/or to explain responses
on a categorical dependent variable, providing a straightforward and intuitive explanation of how
the decision was made.
Decision tree algorithms
https://www.stonybrook.edu/commcms/irpe/reports/presentations/DataMiningOverview_Galambos_2015_06_04.pptx
An Algorithm for Building Decision Trees
1. Let T be the set of training instances.
2. Choose an attribute that best differentiates the instances contained in T.
3. Create a tree node whose value is the chosen attribute. Create child links from this
node where each link represents a unique value for the chosen attribute. Use the child
link values to further subdivide the instances into subclasses.
4. For each subclass created in step 3:
a. If the instances in the subclass satisfy predefined criteria or if the set of
remaining attribute choices for this path of the tree is null, specify the classification for
new instances following this decision path.
b. If the subclass does not satisfy the predefined criteria and there is at least one
attribute to further subdivide the path of the tree, let T be the current set of subclass
instances and return to step 2.
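A minimal Python sketch of this procedure, assuming categorical attributes stored as dicts and using information gain as the "best differentiates" criterion of step 2 (the algorithm above does not fix a particular criterion, so that choice is an assumption):

import math
from collections import Counter

def entropy(labels):
    # Shannon entropy of a list of class labels.
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(instances, labels, attr):
    # Reduction in entropy from partitioning on one attribute.
    n = len(instances)
    gain = entropy(labels)
    for value in {inst[attr] for inst in instances}:
        subset = [lab for inst, lab in zip(instances, labels) if inst[attr] == value]
        gain -= (len(subset) / n) * entropy(subset)
    return gain

def build_tree(instances, labels, attributes):
    # Step 4a stopping criteria: pure node, or no attributes left on this path.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    # Step 2: choose the attribute that best differentiates the instances.
    best = max(attributes, key=lambda a: information_gain(instances, labels, a))
    # Step 3: one child link per unique value of the chosen attribute.
    node = {"attribute": best, "children": {}}
    for value in {inst[best] for inst in instances}:
        idx = [i for i, inst in enumerate(instances) if inst[best] == value]
        node["children"][value] = build_tree(
            [instances[i] for i in idx],
            [labels[i] for i in idx],
            [a for a in attributes if a != best],  # step 4b: remaining attributes
        )
    return node

Here build_tree returns either a class label (a leaf, step 4a) or a nested dict representing a decision node with one child per attribute value (step 3).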
Partitioning of search space
Univariate partitioning methods are attractive because
only one feature / attribute is analyzed at a time
they partition the search space axis-parallel, based on only one attribute at a time
the derived decision tree is relatively easy to understand
Ex. 2: [Figure: a decision tree with root test x < 0.43 and further univariate tests on y, shown alongside the corresponding axis-parallel partitioning of the (x, y) space; leaf class counts label the terminal nodes.]
Decision Boundary
The border line between two neighboring regions of different classes is known as the decision
boundary.
The decision boundary is parallel to the axes because each test condition involves a single
attribute at a time.
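The following sketch makes this concrete using scikit-learn (an assumption; the point holds for any univariate tree learner): each printed node test has the form "feature <= threshold", involving a single attribute, which is exactly why the induced boundaries are axis-parallel.

from sklearn.tree import DecisionTreeClassifier, export_text

# Toy 2-D data: the class depends on axis-parallel regions of (x, y).
X = [[0.2, 0.8], [0.3, 0.9], [0.6, 0.2], [0.7, 0.1], [0.5, 0.7], [0.8, 0.6]]
y = [0, 0, 1, 1, 0, 1]

clf = DecisionTreeClassifier().fit(X, y)

# Every printed test checks one attribute against one threshold,
# hence every decision-boundary segment is parallel to an axis.
print(export_text(clf, feature_names=["x", "y"]))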
Partitioning of search space
CART starts out with the best univariate split. It then iteratively searches for perturbations in
attribute values (one attribute at a time) which maximize some goodness metric. At the end of
the procedure, the best oblique and axis-parallel splits found are compared and the better of
these is selected.
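A rough sketch of that final comparison step, assuming Gini impurity decrease as the goodness metric and hand-picked candidate splits (both are assumptions; CART's actual perturbation search is more involved):

from collections import Counter

def gini(labels):
    # Gini impurity of a list of class labels.
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_goodness(points, labels, test):
    # Weighted impurity decrease for a boolean split function.
    left = [lab for p, lab in zip(points, labels) if test(p)]
    right = [lab for p, lab in zip(points, labels) if not test(p)]
    n = len(labels)
    return gini(labels) - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)

points = [(0.2, 0.8), (0.3, 0.9), (0.6, 0.2), (0.7, 0.1), (0.5, 0.7), (0.8, 0.6)]
labels = [0, 0, 1, 1, 0, 1]

axis_parallel = split_goodness(points, labels, lambda p: p[0] < 0.43)  # x < 0.43
oblique = split_goodness(points, labels, lambda p: p[0] - p[1] < 0.0)  # x - y < 0
print("axis-parallel:", axis_parallel, "oblique:", oblique)

On this toy data the oblique split separates the two classes perfectly, so its goodness score is higher and it would be the one selected.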
CHAID4 handles both regression-type problems and classification-type problems.
4 http://www.statsoft.com/Textbook/CHAID-Analysis#index
Basic Tree-Building Algorithm: CHAID and Exhaustive CHAID
Merging categories. The next step is to cycle through the predictors to determine, for each
predictor, the pair of (predictor) categories that is least significantly different with respect to the
dependent variable. For classification problems (where the dependent variable is categorical as
well), it will compute a Chi-square test (Pearson Chi-square); for regression problems (where the
dependent variable is continuous), it will compute F tests.

If the respective test for a given pair of predictor categories is not statistically significant, as
defined by an alpha-to-merge value, then it will merge the respective predictor categories and
repeat this step (i.e., find the next pair of categories, which now may include previously merged
categories).

If the statistical significance for the respective pair of predictor categories is significant (less than
the respective alpha-to-merge value), then (optionally) it will compute a Bonferroni-adjusted
p-value for the set of categories for the respective predictor.
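A small sketch of the merging test for the classification case, assuming SciPy's Pearson chi-square test on the class-count rows of each candidate category pair (the table values and the alpha-to-merge threshold below are illustrative, not from the source):

from itertools import combinations
from scipy.stats import chi2_contingency

def least_significant_pair(table, categories):
    # table[cat] = class counts of the dependent variable for that category.
    # Returns the category pair with the largest chi-square p-value,
    # i.e., the pair least significantly different.
    best_pair, best_p = None, -1.0
    for a, b in combinations(categories, 2):
        _, p, _, _ = chi2_contingency([table[a], table[b]])
        if p > best_p:
            best_pair, best_p = (a, b), p
    return best_pair, best_p

table = {"low": [30, 10], "mid": [28, 12], "high": [10, 30]}
alpha_to_merge = 0.05

pair, p = least_significant_pair(table, list(table))
if p > alpha_to_merge:  # not significantly different: merge the pair
    a, b = pair
    table[a + "+" + b] = [x + y for x, y in zip(table.pop(a), table.pop(b))]
print(table)

In a full CHAID pass this merge-and-retest loop repeats, so a newly merged category can itself be merged again on a later iteration.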
5 http://www.statsoft.com/Textbook/Classification-and-Regression-Trees
COMPARISON: CHAID, EXHAUSTIVE CHAID, CRT, QUEST
SIMILARITY
All four methods handle regression-type problems or classification-type problems.
DIFFERENCES
- as a practical matter,
- for a discussion of various schemes for combining predictions from different models,
see, for example, Witten and Frank, 2000.
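One simple such scheme is a majority vote over the class predictions of several models; the sketch below is only illustrative (the three model outputs are hypothetical):

from collections import Counter

def majority_vote(predictions):
    # Combine class predictions from several models by simple voting,
    # one common scheme for combining predictions from different models.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

# Hypothetical predictions from three tree models on five instances.
model_a = [0, 1, 1, 0, 1]
model_b = [0, 1, 0, 0, 1]
model_c = [1, 1, 1, 0, 0]
print(majority_vote([model_a, model_b, model_c]))  # -> [0, 1, 1, 0, 1]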