
Data Warehousing

Data Mining
Topics to be covered

Why do we need a Data Warehouse?
Online Transaction Processing (OLTP)
Online Analytical Processing (OLAP)
Data Warehouse
Data Mining
Why do we need a Data Warehouse?
Online Transaction Processing & Online Analytical Processing
How many units of Maruti WaganR were sold last month?
What is the address and phone number of the person in charge of the Supplies department?
How has employee attrition changed over the years across the company?
Is there a correlation between the geographical location of a company unit and excellent employee appraisals?
Is it financially viable to continue our manufacturing unit in UP?
How many employees have received an
OLTP vs. OLAP

              OLTP                            OLAP
Users         Clerk, IT professional          Knowledge worker
Function      Day-to-day operations           Decision support
DB Design     Application-oriented            Subject-oriented
Data          Current, up-to-date,            Historical, summarized,
              detailed, flat relational,      multidimensional, integrated,
              isolated                        consolidated
Usage         Repetitive                      Ad-hoc
Unit of Work  Short, simple transaction       Complex query
# Records     Tens                            Millions
  Accessed
# Users       Thousands                       Hundreds
DB Size       100MB-GB                        100GB-TB
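
The difference in "unit of work" can be sketched with Python's built-in sqlite3 module. The `sales` table, its columns, and its rows are hypothetical and only illustrate the two access patterns: a short transaction on one record (OLTP-style) versus a summarizing query over history (OLAP-style).

```python
import sqlite3

# Hypothetical in-memory table; names and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (order_id INTEGER PRIMARY KEY, model TEXT, "
    "region TEXT, year INTEGER, units INTEGER)"
)
rows = [
    (1, "hatchback", "North", 2022, 2),
    (2, "sedan", "North", 2022, 1),
    (3, "hatchback", "South", 2023, 3),
    (4, "hatchback", "North", 2023, 4),
]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", rows)

# OLTP-style unit of work: a short transaction that touches one record.
with conn:
    conn.execute("UPDATE sales SET units = units + 1 WHERE order_id = ?", (2,))
oltp_row = conn.execute(
    "SELECT model, units FROM sales WHERE order_id = ?", (2,)
).fetchone()

# OLAP-style unit of work: a query that scans and summarizes many records.
olap_rows = conn.execute(
    "SELECT year, region, SUM(units) FROM sales "
    "GROUP BY year, region ORDER BY year, region"
).fetchall()

print(oltp_row)
print(olap_rows)
```

In a real deployment the two workloads would normally run on separate systems; a single SQLite database is used here only to keep the sketch self-contained.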
Data Warehouse

"A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management's decision-making process."

Bill Inmon
Data Warehouse

A data warehouse is based on a multi-dimensional data model, which views data in the form of a data cube
A data cube, such as sales, allows data to be modeled and viewed in multiple dimensions
Dimension tables, such as item (item_name, brand, type) or time (day, week, month, quarter, year)
Fact table contains measures (such as dollars_sold) and keys to each of the related dimension tables
Example: Star Schema

[Figure: star schema, with a central fact table surrounded by dimension tables]
Sales Cube

Sales Fact Table
  Keys: time_key, item_key, branch_key, location_key
  Measures: units_sold, dollars_sold, avg_sales
time dimension: time_key, day, day_of_the_week, month, quarter, year
item dimension: item_key, item_name, brand, type, supplier_type
branch dimension: branch_key, branch_name, branch_type
location dimension: location_key, street, city, state_or_province, country
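
A minimal sketch of the sales star schema above, built with Python's sqlite3 module. The table names with a `_dim` suffix, the sample rows, and the choice to leave `avg_sales` out of the stored columns are assumptions made for illustration; the column names follow the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, day INTEGER,
    day_of_the_week TEXT, month INTEGER, quarter TEXT, year INTEGER);
CREATE TABLE item_dim (item_key INTEGER PRIMARY KEY, item_name TEXT,
    brand TEXT, type TEXT, supplier_type TEXT);
CREATE TABLE branch_dim (branch_key INTEGER PRIMARY KEY, branch_name TEXT,
    branch_type TEXT);
CREATE TABLE location_dim (location_key INTEGER PRIMARY KEY, street TEXT,
    city TEXT, state_or_province TEXT, country TEXT);
-- The fact table holds one key per dimension plus the measures.
CREATE TABLE sales_fact (
    time_key INTEGER REFERENCES time_dim(time_key),
    item_key INTEGER REFERENCES item_dim(item_key),
    branch_key INTEGER REFERENCES branch_dim(branch_key),
    location_key INTEGER REFERENCES location_dim(location_key),
    units_sold INTEGER, dollars_sold REAL);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 5, "Mon", 1, "Q1", 2024),
    (2, 9, "Tue", 4, "Q2", 2024),
])
conn.executemany("INSERT INTO item_dim VALUES (?, ?, ?, ?, ?)", [
    (1, "Phone A", "BrandX", "phone", "wholesale"),
    (2, "Laptop B", "BrandY", "computer", "retail"),
])
conn.execute("INSERT INTO branch_dim VALUES (1, 'B1', 'store')")
conn.execute(
    "INSERT INTO location_dim VALUES (1, '1 Main St', 'CityA', 'StateA', 'CountryA')"
)
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?, ?)", [
    (1, 1, 1, 1, 10, 5000.0),
    (1, 2, 1, 1, 2, 3000.0),
    (2, 1, 1, 1, 5, 2500.0),
])

# Join the fact table to two dimensions and summarize the measures.
by_quarter_brand = conn.execute("""
    SELECT t.quarter, i.brand, SUM(f.units_sold), SUM(f.dollars_sold)
    FROM sales_fact AS f
    JOIN time_dim AS t ON t.time_key = f.time_key
    JOIN item_dim AS i ON i.item_key = f.item_key
    GROUP BY t.quarter, i.brand
    ORDER BY t.quarter, i.brand
""").fetchall()
print(by_quarter_brand)
```

Every analytical query follows the same shape: start from the fact table, join only the dimensions needed, and group by dimension attributes.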
OLAP operations

Roll up (drill-up): Summarize data
  By climbing up a hierarchy or by dimension reduction

Drill down (roll down): Reverse of roll-up
  From higher-level summary to lower-level summary or detailed data, or by introducing new dimensions

Slice and dice:
  Project and select
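
The operations above can be sketched on a tiny cube held in a Python dictionary. The dimension names, cell values, and the helper functions `aggregate` and `select` are hypothetical; OLAP servers implement these operations very differently, so this is only meant to show what each operation returns.

```python
from collections import defaultdict

# Hypothetical cube cells: (year, quarter, city, item) -> units_sold
DIMS = ("year", "quarter", "city", "item")
cells = {
    (2023, "Q1", "Delhi", "phone"): 10,
    (2023, "Q2", "Delhi", "phone"): 12,
    (2023, "Q1", "Mumbai", "phone"): 7,
    (2023, "Q1", "Delhi", "laptop"): 3,
    (2024, "Q1", "Mumbai", "laptop"): 5,
}

def aggregate(cube, keep):
    """Sum the measure over every dimension not listed in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for key, value in cube.items():
        out[tuple(key[i] for i in idx)] += value
    return dict(out)

def select(cube, **conditions):
    """Keep only cells whose value on each named dimension is allowed."""
    return {
        key: value
        for key, value in cube.items()
        if all(key[DIMS.index(d)] in allowed for d, allowed in conditions.items())
    }

# Roll-up: climb quarter -> year and drop the city dimension.
rollup = aggregate(cells, keep=("year", "item"))
# Drill-down: reintroduce the quarter level for more detail.
drilldown = aggregate(cells, keep=("year", "quarter", "item"))
# Slice: fix a single dimension to one value.
slice_delhi = select(cells, city={"Delhi"})
# Dice: select on several dimensions at once.
dice = select(cells, city={"Delhi", "Mumbai"}, item={"phone"}, quarter={"Q1"})

print(rollup)
print(drilldown)
```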
Data Mining

Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information from data in large databases.
Data Mining Techniques

Classification
Clustering
Association Rule Discovery
Sequential Pattern Discovery
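
As one illustration of these techniques, association rule discovery can be sketched with the usual support and confidence measures. The transactions and thresholds below are made up, and the brute-force search over item pairs is a simplification chosen for brevity, not an efficient algorithm such as Apriori.

```python
from itertools import combinations

# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]

def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def pair_rules(min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) for single-item rules lhs -> rhs."""
    items = sorted(set().union(*transactions))
    found = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs})
            if confidence >= min_confidence:
                found.append((lhs, rhs, pair_support, confidence))
    return found

rules = pair_rules(min_support=0.5, min_confidence=0.7)
for lhs, rhs, sup, conf in rules:
    print(f"{lhs} -> {rhs}: support={sup:.2f}, confidence={conf:.2f}")
```

A rule such as "bread -> milk" answers the retail question of which products customers buy together.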
Data Mining

Text Mining
Web Mining
  Web Content Mining
  Web Structure Mining
  Web Usage Mining
Managing Data Resources
Firm's rules, procedures, and roles for sharing, managing, and standardizing data
Data administration:
  Firm function responsible for specific policies and procedures to manage data
Data governance:
  Policies and processes for managing availability, usability, integrity, and security of enterprise data, especially as it relates to government regulations
Database administration:
  Defining, organizing, implementing, and maintaining the database; performed by the database design and management group
Data Mining Applications
Marketing
  Which customers are likely to respond to this campaign?
  What other products or services should be offered to a customer? (cross-selling)
  What types of customers are loyal?

Telecommunications
  Which customers will switch to competitors?
  Which calls are fraudulent?

Finance and Insurance
  What types of customers have high credit risks / insurance risks?
  What interest rate or insurance premium should be given to different customers?
  Which stocks are likely to perform well in the next 3 months?
Data Mining Applications
Healthcare
  Which patients may take longer to recover?
  What is the likely cause of an illness?

Retail
  Which products do customers buy together (or in sequence)?

Customer Support
  Which customer service representative should be assigned to a task?
  When a customer calls, the customer representative's screen shows exactly where to lead the conversation.
Dashboard
