The following is intended to outline our general product direction. It
is intended for information purposes only, and may not be
incorporated into any contract. It is not a commitment to deliver any
material, code, or functionality, and should not be relied upon in
making purchasing decisions. The development, release, and timing
of any features or functionality described for Oracle’s products
remains at the sole discretion of Oracle.
Copyright © 2006 Oracle Corporation
[Figure: Oracle 10g DB, combining Data Warehousing, ETL, OLAP, Statistics, and Data Mining]
• Demos
From: Analysts
To: Pervasive use
[Figure: Oracle 10g DB with OLAP dimensions REGION, PRODUCT, TIME, and Data Mining]
IRS
– Detecting taxpayer noncompliance
Oracle Data Mining 10g
DEMONSTRATION
Oracle Data Mining’s Activity Guides simplify & automate data mining for business users
Apply model viewers
Additional model evaluation viewers
SELECT *
FROM (SELECT A.CUST_ID, A.MARITAL_STATUS,
             PREDICTION_PROBABILITY(CD_BUYERS76485_DT, 1 USING A.*) prob
      FROM CBERGER.CD_BUYERS A)
WHERE prob > 0.6;
BEGIN
DBMS_PREDICTIVE_ANALYTICS.EXPLAIN(
data_table_name => 'CD_BUYERS',
explain_column_name => 'CD_BUYER',
result_table_name => 'explain_result37');
END;
/
SELECT * FROM explain_result37;
DECLARE
v_accuracy NUMBER(10,9);
BEGIN
DBMS_PREDICTIVE_ANALYTICS.PREDICT (
ACCURACY => v_accuracy,
DATA_TABLE_NAME => 'CD_BUYERS',
CASE_ID_COLUMN_NAME => 'CUST_ID',
TARGET_COLUMN_NAME => 'CD_BUYER',
RESULT_TABLE_NAME => 'predict_result24');
END;
/
Attribute Importance
• Identify most influential attributes
for a target attribute
• Factors associated with high costs, responding to an offer, etc.
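To make "most influential attributes" concrete, here is a minimal sketch that ranks attributes by information gain with respect to a target. This is a stand-in technique chosen for brevity, not Oracle's algorithm (the deck names Minimum Description Length for this function). The toy records and the column names `income`, `gender`, and `responded` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting rows on one attribute."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def rank_attributes(rows, target):
    """Return attribute names sorted from most to least informative."""
    attrs = [a for a in rows[0] if a != target]
    return sorted(attrs, key=lambda a: information_gain(rows, a, target), reverse=True)


# Hypothetical toy data: did the customer respond to an offer?
customers = [
    {"income": "high", "gender": "F", "responded": 1},
    {"income": "high", "gender": "M", "responded": 1},
    {"income": "low", "gender": "F", "responded": 0},
    {"income": "low", "gender": "M", "responded": 0},
]

ranking = rank_attributes(customers, "responded")
print(ranking)
```

In this toy data, income separates responders from non-responders perfectly and gender carries no information, so income ranks first.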
Classification and Prediction
Association Rules
• Find co-occurring items in a market basket
• Suggest product combinations
• Design better item placement on shelves
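As an illustration of finding co-occurring items, the sketch below counts item pairs across baskets and derives simple one-to-one rules with their support (number of baskets containing both items) and confidence (share of baskets containing the left-hand item that also contain the right-hand item). It only considers pairs and is not the full Apriori search. The baskets, thresholds, and the function name `pair_rules` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=2, min_confidence=0.6):
    """Return (lhs, rhs, support, confidence) rules for item pairs."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate items within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), support in pair_counts.items():
        if support < min_support:
            continue
        # Each frequent pair can yield a rule in either direction.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = support / item_counts[lhs]
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    # Highest confidence first; ties broken alphabetically for stable output.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


# Hypothetical market baskets.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["milk", "cereal"],
]

rules = pair_rules(baskets)
for lhs, rhs, support, confidence in rules:
    print(f"IF {lhs} THEN {rhs}  support={support}  confidence={confidence:.2f}")
```

A rule such as "IF butter THEN bread" with high confidence is the kind of result that would suggest a product combination or a shelf placement.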
Feature Extraction
• Reduce a large dataset into representative
new attributes
• Useful for clustering and text mining
Text Mining
• Combine data and text for better models
• Add unstructured text (e.g., physician’s notes) to structured data (e.g., age, weight, height) to predict outcomes
• Classify and cluster documents
• Combine with Oracle Text to develop advanced text mining applications, e.g., Medline
BLAST
• Sequence matching and alignment
• Find genes and proteins that are “similar”
[Figure: example nucleotide sequences being aligned]
ATGCAATGCCAGGATTTCCA
CTGCAAGGCCAGGAAGTTCCA
ATGCGTTGCCAC…ATTTCCA
GGC..TGCAATGCCAGGATGACCA
ATGCAATGTTAGGACCTCCA
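To give a feel for what "similar" sequences means, the sketch below computes a Smith-Waterman local alignment score between two sequences. This is the exact dynamic-programming method, shown only as an illustration; BLAST itself is a faster heuristic and this is not Oracle's implementation. The scoring values (match, mismatch, gap) are arbitrary assumptions.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between strings a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Two of the sequences shown on the slide.
seq1 = "ATGCAATGCCAGGATTTCCA"
seq2 = "ATGCAATGTTAGGACCTCCA"
print(local_alignment_score(seq1, seq2))
```

A higher score means a longer or cleaner shared region; comparing a sequence with itself gives the maximum possible score for its length.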
IF (Income > 50K AND Gender = F AND Status > Single …) THEN P(Buy Car = 1)
Confidence = 0.77
Support = 250
Attribute Importance | Minimum Description Length (MDL) | Attribute reduction; identify useful data; reduce data noise
Association Rules | Apriori | Market basket analysis; link analysis
Likelihood to buy
Create Categories of Customers
Oracle Data Mining reveals important relationships, patterns, predictions & insights to the business users
• Enables Excel users to “mine” Oracle or Excel data using “one click” Predict and Explain predictive analytics features
• Users select a table or view, or point to data in Excel, and select a target attribute
Oracle Data Miner 10gR2
Code Generation Release
• PL/SQL code generation for Mining Activities
In-Database Analytics Example
select cust_name,
prediction(campaign_model using *)
as responder,
prediction_details(campaign_model using *)
as reason
from customers;
-- Compare purchases before and after the 15-Apr-2005 campaign date
select cust_name,
       prediction(campaign_model using *) as responder,
       sum(case when purchase_date < date '2005-04-15'
                then purchase_amt else 0 end) as pre_purch,
       sum(case when purchase_date >= date '2005-04-15'
                then purchase_amt else 0 end) as post_purch
from customers, sales
where sales.cust_id = customers.cust_id
  and purchase_date between date '2005-01-15' and date '2005-07-14'
group by customers.cust_id, cust_name, prediction(campaign_model using *);
-- Same comparison, restricted to DVD products from a remote product database
select cust_name,
       prediction(campaign_model using *) as responder,
       sum(case when purchase_date < date '2005-04-15'
                then purchase_amt else 0 end) as pre_purch,
       sum(case when purchase_date >= date '2005-04-15'
                then purchase_amt else 0 end) as post_purch
from customers, sales, products@PRODDB
where sales.cust_id = customers.cust_id
  and purchase_date between date '2005-01-15' and date '2005-07-14'
  and sales.prod_id = products.prod_id
  and contains(prod_description, 'DVD') > 0
group by customers.cust_id, cust_name, prediction(campaign_model using *);
• Leverage your data, discover new hidden information and valuable insights, and make predictions
• Do More!
• Build applications that automate the extraction and dissemination
of data mining’s insights
• Move from “End User Tool” to “Enterprise BI Application”
• Spend Less!
• Option to Oracle 10g Database Enterprise Edition
• Eliminates need for redundant data, new servers, new software,
and new support skills/resources