Guide to Data Warehousing and Data Mining

DATA WAREHOUSING
AND
DATA MINING
A Comprehensive Guide for Students and IT Professionals
(Choice Based Credit System (CBCS) Pattern) New Syllabus
(For B.Sc. Computer Science, B.Sc. Software Computer Science, B.Sc. ISM, B.Sc. IT,
B.Sc. Software System, B.Sc. Software Engineering, BCA, M.Sc. Computer Science,
M.Sc. Information Technology, M.Sc. Information System and Management, M.Sc.
Software Engineering, MCA, B.E. CSE, B.Tech. IT, M.E. CSE, M.Tech. IT, M.Phil., and
IT Professionals)
By
Dr. P. Rizwan Ahmed, MCA, M.Sc., M.A., M.Phil., Ph.D.

Head of the Department
Department of Computer Applications and
PG Department of Information Technology
Mazharul Uloom College,
Ambur - 635 802, Vellore Dist. Tamil Nadu.
CONTENTS
Preface
Acknowledgement
PART- I
DATA MINING
Chapter 1
Introduction
1.1 An Expanding universe of data

1.2 Information and production factor
1.3 KDD and data mining
1.4 Data Mining vs query tools
1.5 Data Mining in Marketing
1.6 Practical applications of data mining
1.7 Learning
1.8 Self-learning computer systems
1.9 Machine learning
1.9.1 Why is machine learning done?
1.10 Machine learning and the methodology of science
1.10.1 Differences between Data Mining and Machine Learning
1.11 Concept Learning
Summary
Review Questions
Chapter 2
Data Mining and the Data Warehouse
2.1 Data Warehouse: Definitions

2.2 Why do we need Data Warehouse?
2.3 Designing decision support systems
2.3.1 Hardware and software products of a decision support system
2.4 Integration with data mining
2.5 Client/server and data warehousing
2.6 Multi-processing machines
2.7 Cost justification
Summary
Review Questions
Chapter 3
Knowledge Discovery Process
3.1 Introduction
3.2 Data selection
3.3 Cleaning
3.4 Coding
3.5 Data mining
3.5.1 Preliminary analysis of the data set using traditional query tools
3.5.1.1 Visualization techniques
3.5.1.2 Likelihood and distance
3.5.1.3 OLAP tools
3.5.1.4 K-nearest neighbor
3.5.1.5 Decision Trees
3.5.1.6 Association Rules
3.5.1.7 Neural networks
3.5.1.8 Genetic algorithms
3.6 Reporting
Summary
Review Questions
Chapter 4
KDD Environment
4.1 Different forms of knowledge

4.2 KDD environment
4.3 Ten golden rules
Summary
Review Questions
Chapter 5
Real life applications
5.1 Customer profiling

5.2 Predicting bid behavior of pilots
5.3 Discovering foreign key relationships
Summary
Review Questions
Chapter 6
Formal aspects of learning algorithms
6.1 Learning as compression of data sets
6.2 Information content of a message

6.3 Noise and redundancy
6.4 Significance of noise
6.5 Fuzzy databases
6.6 The traditional theory of the relational database
6.7 From relations to tables
6.7.1 From keys to statistical dependencies
6.8 Denormalization
6.9 Data mining primitives
Summary
Review Questions
Chapter 7
Data Mining
7.1 Introduction
7.2 Data
7.3 Information
7.4 Knowledge
7.5 Historical Note: Many names of Data Mining
7.6 Data Mining
7.6.1 Some of the definitions of Data Mining
7.7 Why Data Mining
7.8 Why is Data Mining Important?
7.9 Uses of Data Mining
7.10 Data Mining Models
7.10.1 Verification Model
7.10.2 Discovery Model
7.11 Development of data mining
7.12 Applications of Data Mining
7.12.1 Healthcare
7.12.2 Finance
7.12.3 Retail Industry
7.12.4 Telecommunication
7.12.5 Text Mining and Web Mining
7.12.6 Higher Education
7.13 Basic Data Mining Tasks / Taxonomy of data mining tasks
7.13.1 Prediction methods
7.13.2 Descriptive methods
7.14 Data Mining Vs Database
7.15 Data Mining Vs KDD
7.16 Steps in Data Mining Process / Steps involved in KDD

7.17 Architecture of a typical data mining system
7.18 Future Trends
7.18.1 Data Trends
7.18.2 Hardware Trends
7.18.3 Network Trends
7.18.4 Scientific Computing Trends
7.18.5 Business Trends
7.19 Major issues in Data Mining / Data Mining Issues
7.20 Data Mining Metrics
7.21 Social Implications of Data Mining
7.22 Data Mining from a database Perspective
Summary
Review Questions
Chapter 8
Advanced Databases
8.1 Various kinds of data / Types of Data

8.1.1 Flat files
8.1.2 Relational Databases
8.1.3 Data Warehouses
8.1.4 Transaction Databases
8.1.5 Object oriented databases
8.1.6 Temporal Databases
8.1.7 Text and Multimedia Databases
8.1.8 Spatial Databases
8.1.9 Time-Series Databases
8.1.10 World Wide Web (WWW)
8.1.11 Heterogeneous databases
Summary
Review Questions
Chapter 9
Data Mining Functionalities, Classification and Case Study
9.1 Data Mining Functionalities

9.2 Pattern Interestingness / Interestingness of Patterns
9.2.1 Interestingness measures
9.2.2 Objective vs. subjective interestingness measures
9.3 Classification of Data Mining Systems
9.4 Data Mining Task Primitives

9.5 Why Data Mining Primitives and Languages?
9.6 Integration of data mining system with a database or Data warehouse system
9.6.1 No Coupling
9.6.2 Loose Coupling
9.6.3 Semitight coupling
9.6.4 Tight coupling
9.7 Case Study
9.7.1 Customer Attrition: Case Study
9.7.2 Assessing Credit Risk : Case Study
9.7.3 Successful e-commerce - Case Study
Summary
Review Questions
Chapter 10
Overview of Data Mining Techniques-I
10.1 Data Mining Techniques

10.1.1 Cluster Analysis
10.1.2 Induction
10.1.3 Decision Trees
10.1.4 Rule induction
10.1.5 Nearest Neighbour
10.1.6 Neural networks
10.2 Data Mining Application Examples
Summary
Review Questions
Chapter 11
Overview of Data Mining Techniques-II
11.1 Introduction
11.2 A Statistical Perspective on Data Mining
11.2.1 Point Estimation
11.2.2 Models Based on Summarization
11.2.3 Bayes Theorem
11.2.4 Hypothesis Testing
11.2.5 Regression and Correlation
11.3 Similarity Measures
11.4 Decision Trees
11.5 Neural Networks
11.6 Genetic Algorithms

Summary
Review Questions
Chapter 12
Data Preprocessing
12.1 Introduction
12.2 Why preprocess the data / Need for preprocessing
12.3 Data Preprocessing Techniques / Major Tasks in Data Preprocessing
12.4 Data Cleaning
12.4.1 Missing Data / Values
12.4.1.1 Methods of handling missing data
12.4.2 Noisy Data
12.4.2.1 How to Handle Noisy Data?
12.4.3 Outlier Analysis
12.4.4 Regression
12.5 Data Cleaning as a Process
12.5.1 Discrepancy detection
12.5.2 Discrepancy Detection Tools
12.5.3 Data Transformation
12.5.4 Data Transformation Tools
12.6 Data Integration
12.6.1 Issues to be considered in Data Integration
12.6.1.1 Schema integration
12.6.1.2 Redundancy
12.6.1.3 Detecting and resolving data value conflicts
12.6.2 Handling Redundant Data in Data Integration
12.7 Data Transformation
12.7.1 Methods of Data Normalization
12.7.1.1 Min-max normalization
12.7.1.2 z-score normalization
12.7.1.3 Normalization by decimal scaling
12.8 Data Reduction
12.8.1 Data Reduction Strategies
12.8.1.1 Data Cube Aggregation
12.8.1.2 Attribute Subset Selection
12.8.1.3 Dimensionality Reduction
12.8.1.4 Numerosity Reduction

12.8.1.5 Data Discretization and concept hierarchy generation
Data discretization
12.9 Data Mining Query Languages (DMQL)
Summary
Review Questions
Chapter 13
Association Rules
13.1 Association Rules

13.2 Large Item sets
13.3 Basic Algorithm
13.3.1 Apriori Algorithm
13.3.2 Partitioning
13.4 Parallel and Distributed Algorithms
13.4.1 Data parallelism
13.4.2 Task parallelism
13.5 Comparing Approaches
13.6 Incremental Rules
13.7 Advanced Association Rule Techniques
13.7.1 Generalized association rules
13.7.2 Multiple-level association rules
13.7.3 Quantitative association rules
13.7.4 Using Multiple Minimum Supports
13.8 Measuring the Quality of Rules
Summary
Review Questions
Chapter 14
Concept Description: Generalization and Characterization
14.1 Concept Description

14.2 Data Generalization and Summarization-based
14.2.1 Data Generalization
14.2.2 Characterization: Data Cube Approach
14.2.3 Attribute oriented induction for data characterization
14.2.4 Efficient Implementation of Attribute-Oriented Induction

14.3 Analytical characterization: Analysis of attribute relevance
14.4 Mining class comparisons: Discriminating between different classes
Mining Class Comparisons
14.5 Descriptive Data Summarization / Mining descriptive statistical measures in large databases
14.5.1 Measuring the Central Tendency
14.5.2 Measuring the Dispersion of Data
14.5.3 Graphic Displays of Basic Statistical Descriptions
Summary
Review Questions
Chapter 15
Mining Frequent Patterns, Associations & Correlations
15.1 Mining Association Rules in Large Databases

15.1.1 Market Basket Analysis: A Motivating Example
15.1.2 Association Rule: Basic Concepts
15.1.3 Association Rule Mining: A Road Map
15.1.4 Mining Frequent Itemsets: the Key Step
15.2 Mining single-dimensional Boolean association rules from transactional databases: Efficient and Scalable Frequent Itemset Mining Methods
15.2.1 Apriori Algorithm
15.2.2 Generating Association Rules from Frequent Itemsets
15.2.3 Methods to Improve Apriori's Efficiency
15.2.4 Mining Frequent Patterns without Candidate Generation
15.2.5 Principles of Frequent Pattern Growth
15.3 Mining various kinds of Association Rules
15.3.1 Mining multilevel association rules from transactional databases: Multiple-Level Association Rules
15.3.2 Mining multidimensional association rules from transactional databases and data warehouse
15.4 From Association Mining to Correlation Analysis
15.5 Constraint-Based Association Mining
Summary
Review Questions
Chapter 16
Classification
16.1 Introduction
16.1.1 Classification algorithms based on the categorization

Issues in Classification
16.2 Statistical-Based Algorithms
16.2.1 Regression
16.2.2 Bayesian classification
16.2.3 Naive Bayes Classifier
16.3 Distance-Based Algorithms
16.3.1 Simple Approach
16.3.2 K Nearest Neighbors
16.4 Decision Tree-Based Algorithms
16.4.1 C4.5
16.4.2 CART
16.4.2.1 Scalable DT techniques
16.5 Neural Network-Based Algorithms
16.5.1 Propagation
16.5.2 NN supervised learning
16.5.3 Radial Basis Function Networks
16.5.4 Perceptron
16.6 Rule-Based Algorithms
16.6.1 Generating Rules from a DT
16.6.2 Generating Rules from a Neural Net
16.6.3 Generating Rules without a DT or NN
16.7 Combining Techniques
Summary
Review Questions
Chapter 17
Classification and Prediction
17.1 Classification
17.1.1 Classification: A Two-Step Process
17.1.2 Prediction
17.1.3 Issues regarding classification and prediction
17.1.4 Comparing Classification and Prediction Methods
17.2 Classification by decision tree induction
17.2.1 Decision Tree Induction
17.2.2 Attribute Selection Measure
17.2.3 Information Gain (ID3/C4.5)
17.2.4 Gini Index (IBM IntelligentMiner)
17.2.5 Extracting Classification Rules from Trees
17.2.6 Avoid Overfitting in Classification
17.2.7 Enhancements to basic decision tree induction

17.2.8 Classification in Large Databases
17.3 Bayesian Classification: Introduction
17.3.1 Bayesian Classification: Why?
17.3.2 Bayesian Classification
17.3.3 Bayesian Theorem
17.3.4 Naive Bayes Classifier
17.3.5 Bayesian Belief Networks
17.3.6 Training Bayesian Belief Networks
17.4 Rule Based Classification
17.4.1 Using IF-THEN Rules for Classification
17.4.2 Rule Extraction from a Decision Tree
17.4.3 Rule Induction Using a Sequential Covering Algorithm
17.4.4 Rule Quality Measures
17.5 Classification by backpropagation
17.6 Classification based on concepts from association rule mining / Association-Based Classification / Classification by Association Rules
17.7 Lazy Learners (or Learning from Your Neighbors)
17.7.1 k-Nearest Neighbor
17.7.2 Case-Based Reasoning (CBR)
17.8 Other Classification Methods
17.8.1 Genetic Algorithms
17.8.2 Rough Set Approach
17.8.3 Fuzzy Sets Approaches
17.9 Prediction
17.10 Classification accuracy
17.10.1 Classification Accuracy: Estimating Error Rates
Summary
Review Questions
Chapter 18
Clustering
18.1 Introduction
18.2 Similarity and Distance Measures
18.3 Outliers
18.4 Hierarchical Algorithms
18.4.1 Agglomerative Algorithms
18.5 Partitional Algorithms
18.5.1 Minimum spanning tree
18.5.2 Squared Error Clustering Algorithm
18.5.3 K-means clustering

18.5.4 Nearest neighbor algorithm
18.5.5 PAM Algorithm
18.5.5.1 CLARA
18.5.5.2 CLARANS
18.5.6 Clustering with genetic algorithms
18.5.7 Clustering With Neural Networks
18.5.7.1 Self-Organizing Feature Maps
18.6 Clustering Large Databases
18.6.1 BIRCH
18.6.2 DBSCAN
18.6.3 CURE Algorithm
18.7 Comparison of Clustering Algorithm
Summary
Review Questions
Chapter 19
Cluster Analysis
19.1 What is Cluster Analysis?

19.2 General Applications of Clustering
19.3 Examples of Clustering Applications
19.4 What is Good Clustering?
19.5 Requirements of Clustering in Data Mining
19.6 Types of Data in Cluster Analysis
19.6.1 Interval-valued variables
19.6.2 Binary Variables
19.6.3 Nominal, Ordinal, and Ratio-Scaled Variables.
19.7 A Categorization of Major Clustering Methods
19.7.1 Major Clustering Approaches
19.8 Partitioning Methods: Basic Concept
19.8.1 K-Means Clustering Method
19.8.2 K-Medoids Clustering Method
19.8.2.1 Comparison between K-means and K-medoids
19.8.3 PAM
19.8.4 CLARA
19.9 Hierarchical Methods
19.9.1 Types of Hierarchical Clustering Methods
19.9.1.1 Agglomerative Hierarchical Clustering
19.9.1.2 Divisive Hierarchical Clustering
19.9.2 BIRCH
19.9.3 CURE
19.9.4 ROCK
19.9.5 CHAMELEON
19.10 Density-Based Methods
19.10.1 DBSCAN
19.10.2 OPTICS
19.10.3 DENCLUE
19.11 Grid-Based Methods
19.11.1 STING
19.11.2 WaveCluster
19.11.3 CLIQUE
19.12 Model-Based Clustering Methods
19.12.1 Expectation Maximization (EM)
19.12.2 Conceptual clustering
19.12.3 Neural network approaches
19.13 Outlier Analysis
19.13.1 Outlier Discovery: Statistical Approaches
19.13.2 Outlier Discovery: Distance-Based Approach
19.13.3 Outlier Discovery: Deviation-Based Approach
Summary
Review Questions
Chapter 20
Advanced Topics (Mining Complex types of data)
20.1 Multidimensional analysis and descriptive mining of complex data objects

20.1.1 Generalization of Structured Data
20.1.2 Generalizing Spatial and Multimedia Data
20.1.3 Generalizing Object Data
20.1.4 Generalization-based Mining of Plan Databases by Divide and Conquer
20.2 Spatial Data Mining
20.2.1 Dimensions and Measures in Spatial Data Warehouse
20.2.2 Mining Spatial Association and Co-location Patterns
20.2.3 Spatial Classification and Spatial Trend Analysis
20.3 Mining multimedia databases
20.3.1 Similarity Search in Multimedia Data
20.3.2 Multidimensional Analysis of Multimedia Data
20.4 Mining time-series and sequence data
20.4.1 Time-series database
20.4.2 Mining Time-Series and Sequence Data: Trend analysis
20.4.3 Estimation of Trend Curve

20.4.4 Discovery of Trend in Time-Series
20.4.5 Multidimensional Indexing
20.4.6 Subsequence Matching
20.4.7 Query Languages for Time Sequences
20.5 Text Mining / Mining text databases
20.5.1 Text Data Analysis and Information Retrieval
20.5.2 Text Indexing Techniques
20.5.3 Text Mining Approaches
20.6 Mining the World-Wide Web / Web Mining
Chapter 21
Applications and Trends in Data Mining
21.1 Applications of Data Mining

21.1.1 Data Mining for Financial Data Analysis
21.1.2 Data Mining for Retail Industry
21.1.3 Data Mining for Telecommunication Industry
21.1.4 Biomedical Data Mining and DNA Analysis
21.1.5 Data Mining Applications in Sales/Marketing
21.1.6 Data Mining Applications in Banking / Finance
21.1.7 Data Mining Applications in Health Care and Insurance
21.2 Data mining system products and research prototypes
21.2.1 How to choose a data mining system?
21.2.2 Examples of Data Mining Systems
21.3 Additional themes on data mining
21.3.1 Theoretical Foundations of Data Mining
21.3.2 Statistical Data Mining
21.4 Social impact of data mining
21.5 Trends in data mining
Summary
Review Questions
PART II
DATA WAREHOUSING
Chapter 22
Data Warehousing
22.1 Introduction
22.2 Characteristics of Data Warehouse
22.3 Need for Data Warehousing

22.4 Why Separate Data Warehouse?
22.5 Difference between Operational databases and Data Warehouses
22.6 Difference between OLTP and Data warehouse
22.7 Benefits of Data Warehousing
22.8 Future of data warehouse
22.9 Limitations of Data Warehouse
22.10 Applications of Data Warehousing
22.11 Advantages of Data Warehousing
22.12 Data Warehousing Tools
Summary
Review Questions
Chapter 23
Data Warehousing Components
23.1 Overall Architecture

23.2 Data warehouse database
23.3 Sourcing, acquisition, cleanup, and transformation tools
23.4 Metadata
23.5 Access tools
23.5.1 Query and reporting tools
23.5.2 Application
23.5.3 OLAP
23.5.4 Data mining
23.6 Data marts
23.7 Data warehouse administration and management
Summary
Review Questions
Chapter 24
From Data warehousing to data mining
24.1 Data warehouse usage

24.1.1 Three kinds of data warehouse applications
24.2 Information processing Online Analytical Processing
24.2.1 Advantages of OLAM
24.2.2 Architecture of On-Line Analytical Mining
24.2.3 Comparison between OLAP and OLAM
Summary
Review Questions
Chapter 25
Data Warehouse Architecture
25.1 Data Warehouse architecture

25.1.1 Steps for the design and construction of data warehouse
25.1.2 Data Warehouse Design Process
25.1.3 Three Tier Data Warehouse Architecture
25.1.3.1 Enterprise Warehouse
25.1.3.2 Data Mart
25.1.3.3 Virtual data warehouse
25.2 Data Warehouse Back-End Tools and Utilities
25.3 Metadata Repository
25.4 OLAP Engine
25.4.1 Relational OLAP (ROLAP)
25.4.2 Multidimensional OLAP (MOLAP)
25.4.3 Hybrid OLAP (HOLAP)
25.4.4 Specialized Servers
Summary
Review Questions
Chapter 26
Data Warehouse Implementation
26.1 Data Warehouse Implementation

26.1.1 Efficient Computation of Data Cubes
26.1.2 Cube Operation
26.1.3 Indexing OLAP Data: Bitmap Index
26.1.4 Indexing OLAP Data: Join Indices
26.1.5 Efficient Processing OLAP Queries
Summary
Review Questions
Chapter 27
Mapping the data warehouse to a multiprocessor architecture
27.1 Relational database technology for data warehouse

27.1.1 Types of parallelism
27.1.2 Data partitioning
27.2 Database architecture for parallel processing
27.2.1 Shared-memory architecture
27.2.2 Shared-disk architecture
27.2.3 Shared-nothing architecture
27.2.4 Combined architecture

27.3 Parallel RDBMS features
27.4 Alternative technologies
27.5 Parallel DBMS Vendors
27.5.1 Oracle
27.5.2 Informix
27.5.4 Sybase
27.5.5 Microsoft
Summary
Review Questions
Chapter 28
Reporting and Query Tools and Applications
28.1 Tool categories
28.1.1 Reporting tools
28.1.2 Managed Query Tools
28.1.3 Executive information tools
28.1.4 OLAP tools
28.1.5 Data mining tools
28.2 Need for application
28.3 Cognos Impromptu
28.4 Applications
28.4.1 PowerBuilder
Summary
Review Questions
Chapter 29
On-Line Analytical Processing (OLAP)
29.1 Introduction
29.2 Need for OLAP
29.3 Multidimensional data model
29.3.1 From Tables and Spreadsheets to Data Cubes
29.4 OLAP Guidelines / OLAP Product Evaluation Rules
29.5 Data Warehouse Schema / OLAP Schema
29.5.1 Star Schema
29.5.2 Star Schema Keys
29.5.3 Advantages of Star schema
29.5.4 Snowflake Schema
29.5.5 Fact Constellation

29.6 Concept hierarchies
29.7 OLAP operation in the Multidimensional Data Model
29.8 Multidimensional versus Multirelational OLAP
29.9 Categorization of OLAP Tools
29.10 OLAP Tools and the Internet
29.11 Difference between OLTP and OLAP
29.12 Comparison of DBMS, OLAP, and Data Mining
Summary
Review Questions
Chapter 30
Security
30.1 Introduction
30.2 Requirements
30.2.1 User Access
30.2.2 Legal Requirements
30.2.3 Audit Requirements
30.2.4 Network Requirements
30.2.5 Data Movement
30.2.6 Documentation
30.2.7 High-Security Environments
30.3 Performance Impact of Security
30.3.1 Views
30.3.2 Data Movement
30.4 Security Impact on Design
30.4.1 Application Development
30.4.2 Database Design
30.4.3 Testing
Summary
Review Questions
Chapter 31
Service Level Agreement (SLA)
31.1 Introduction
31.2 Definition of Types of System
31.3 Defining the SLA
31.3.1 User Requirements
31.3.2 System Requirements
Summary
Review Questions
Chapter 32
Operating the data warehouse
32.1 Introduction
32.2 Day-to-Day Operations of the Data Warehouse
32.3 Overnight Processing
Summary
Review Questions
Chapter 33
Capacity Planning
33.1 Process
33.2 Estimating the Load
33.2.1 Initial Configuration
33.2.2 How much CPU bandwidth?
33.2.3 How much memory?
33.2.4 How much disk?
Summary
Review Questions
Chapter 34
Tuning and testing the data warehouse
34.1 Tuning the Data Load

34.2 Prioritized Tuning Steps
34.3 Tuning Queries
34.3.1 Fixed queries
34.3.2 Ad hoc queries
34.4 Testing the Data Warehouse
34.4.1 Introduction
34.4.2 The Testing Terminologies
34.4.3 Testing the operational environment
34.4.5 Testing the database
34.4.5.1 Testing database manager and monitoring tools
34.4.5.2 Testing database features
34.4.5.3 Testing database performance
34.5 Testing the Application
Summary
Review Questions
Chapter 35
Backup and Recovery
35.1 Introduction
35.1.1 Types of Backup
35.2 Data Warehouse Recovery Models
35.3 Define Backup and Recovery Strategy
35.4 Security Impact on Design of Data Warehouse
35.4.1 Application Development
35.4.2 Database Design
35.4.3 Testing
35.5 Disaster Recovery
Summary
Review Questions
APPENDIX A: Glossary
APPENDIX B: Two-Mark Questions with Answers
APPENDIX C: Past University Question Papers
BIBLIOGRAPHY
