http://chem-eng.utoronto.ca/~datamining/
Data Mining
Data mining is about explaining the past and predicting the future by means of data analysis.
Data Mining
[Figure: Data Mining at the intersection of Statistics, AI & Machine Learning, and Database & DW]
[Figure: chart not recovered. Source: KDnuggets.com]
[Figure: chart not recovered. Source: KDnuggets.com]
[Figure: the data mining process in six steps: 1. Problem Definition, 2. Data Preparation, 3. Data Exploration, 4. Modeling, 5. Evaluation, 6. Deployment]
1. Problem Definition
Understanding the project objectives and requirements from a business perspective and then converting this knowledge into a data mining problem definition with a preliminary plan designed to achieve the objectives.
Source: http://www.crisp-dm.org/Process/index.htm
2. Data Preparation
[Figure: data sources (Data via DSN, Text) feed an ETL step that produces the Modeling Data]
3. Data Exploration
Average, StDev, Min, Max, ...
Univariate Analysis
Combination Charts
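As a minimal sketch of univariate analysis, the statistics named above (average, standard deviation, minimum, maximum) can be computed for a single variable with Python's standard library. The `ages` values and the `summarize` helper are hypothetical, made up for illustration; they do not come from the slides.

```python
import statistics

def summarize(values):
    """Return basic univariate statistics for a list of numbers."""
    return {
        "average": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

# Hypothetical sample of one variable (assumption, not from the slides)
ages = [23, 35, 41, 29, 52, 35]
summary = summarize(ages)
print(summary)
```

Running the same summary on each variable in turn is one simple way to spot outliers or implausible ranges before modeling.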
4. Modeling
Classification: Bayesian, SVM
Regression: Linear Regression, Robust Regression, Neural Network
Clustering: Hierarchical, K-Means
Association: Apriori
Covariance Matrix
Linear Regression
Similarity Functions
KNN
Neural Networks
Perceptron
Others
SVM
Bayesian
LDA
(Z Score)
Back Propagation
GA
PCA/PCR
RBF
HMM
Scalable Methods
Modeling - Classification
Age
Responder
e.g., Y or N
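Classification predicts a categorical target, here Responder (Y or N), from predictors such as Age. The sketch below fits a single age threshold (a one-split rule), which is a deliberately simple stand-in and not one of the classifiers the slides name. The data, the helper names `fit_stump` and `predict`, and the assumption that older customers respond are all hypothetical.

```python
def fit_stump(ages, labels):
    """Find the age threshold that best separates 'Y' from 'N'.

    Assumed rule: predict 'Y' when age >= threshold, otherwise 'N'.
    """
    best_threshold, best_correct = None, -1
    for threshold in sorted(set(ages)):
        correct = sum(
            ("Y" if age >= threshold else "N") == label
            for age, label in zip(ages, labels)
        )
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold

def predict(threshold, age):
    """Apply the fitted rule to one new age."""
    return "Y" if age >= threshold else "N"

# Hypothetical training data (assumption, not from the slides)
ages = [22, 25, 31, 38, 45, 52, 60]
responder = ["N", "N", "N", "Y", "Y", "Y", "Y"]

threshold = fit_stump(ages, responder)
print(threshold, predict(threshold, 28), predict(threshold, 50))
```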
Modeling - Regression
Age
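Regression predicts a numeric target from predictors such as Age. A minimal sketch of simple linear regression by ordinary least squares follows; the target variable `spend`, the data values, and the helper `fit_line` are hypothetical and chosen only so the fit is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data (assumption): spend is exactly 5 * age + 10
ages = [20, 30, 40, 50, 60]
spend = [110, 160, 210, 260, 310]

slope, intercept = fit_line(ages, spend)
predicted_spend = slope * 35 + intercept  # prediction for a new age of 35
print(slope, intercept, predicted_spend)
```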
Modeling - Clustering
Income
Age
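Clustering groups records by similarity without a target variable, for example customers described by Age and Income. The sketch below is a plain k-means loop; the data points, the choice of two clusters, and the starting centroids are hypothetical assumptions made for illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate nearest-centroid assignment and centroid update."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty)
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical (age, income in thousands) records (assumption, not from the slides)
points = [(25, 30), (27, 35), (30, 32), (55, 90), (58, 85), (60, 95)]
centroids, clusters = kmeans(points, centroids=[points[0], points[-1]])
print(centroids)
```

On real data the two variables would usually be put on a comparable scale first, since Euclidean distance is otherwise dominated by the variable with the larger range.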
Association Rules
Market Basket Analysis
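In market basket analysis, a rule such as "bread implies milk" is usually judged by its support (how often the items occur together) and its confidence (how often the consequent appears when the antecedent does). A minimal sketch, with hypothetical baskets and helper names:

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical baskets (assumption, not from the slides)
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```

Algorithms in the Apriori family search for all itemsets above a chosen support threshold; this sketch only scores one rule that is already given.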
5. Evaluation
Charts: Gain Chart, Lift Chart, K-S Chart
Stats: Confusion Matrix, Mean Square Error, Variables Contribution
Confusion Matrix (CM)
                   Predicted Positive   Predicted Negative
Actual Positive    True Positive        False Negative
Actual Negative    False Positive       True Negative
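The four cells of a confusion matrix (true positive, false positive, false negative, true negative) can be counted directly from actual and predicted labels. The sketch below assumes a binary target coded "Y"/"N"; the label lists and the helper `confusion_matrix` are hypothetical.

```python
def confusion_matrix(actual, predicted, positive="Y"):
    """Count TP, FP, FN and TN for a binary classifier."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts

# Hypothetical labels (assumption, not from the slides)
actual    = ["Y", "Y", "N", "N", "Y", "N"]
predicted = ["Y", "N", "N", "Y", "Y", "N"]

cm = confusion_matrix(actual, predicted)
accuracy = (cm["TP"] + cm["TN"]) / len(actual)
print(cm, accuracy)
```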
[Figure: evaluation chart; axis label: Population%; tick values 10%, 45%, 50%, 100%]
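A gain chart, one of the evaluation charts listed under Evaluation, ranks records by model score and plots the share of all positives captured against the share of the population contacted. The sketch below computes those points; the scores, labels, and the helper `cumulative_gains` are hypothetical and are not the numbers from the slide's chart.

```python
def cumulative_gains(scores, labels, positive="Y"):
    """Return (population fraction, fraction of positives captured) points."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(label == positive for _, label in ranked)
    points, captured = [], 0
    for i, (_, label) in enumerate(ranked, start=1):
        captured += label == positive
        points.append((i / len(ranked), captured / total_positives))
    return points

# Hypothetical model scores and true outcomes (assumption, not from the slides)
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = ["Y", "Y", "N", "Y", "N", "N", "Y", "N", "N", "N"]

points = cumulative_gains(scores, labels)
for population, gain in points:
    print(f"{population:.0%} of population -> {gain:.0%} of responders")
```

Dividing each gain value by its population fraction gives the corresponding lift, so the same points can feed a lift chart.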
6. Deployment
SQL
VB
JAVA
HTML
Domain Expert
DBA
Analyst
Data Mining tools: SPSS, KXEN, Angoss, KNIME
Case Study...