Oracle Change Data Capture

Jack Raitto, Development Manager


Oracle NEDC
NYOUG Long Island SIG
October 7, 2004

Oracle Corporation
Capture your change data for FREE!*

[Figure: Change Capture Cost, Before vs. After]
* Zero additional license cost over Oracle10g EE
Virtually zero source system processing cost
What is Oracle CDC?
Captures change data from operational system(s) as it occurs
Part of Extract / Transform / Load (ETL) process for DSS / data warehouse, potentially other applications
Optimizes the extract phase
Unleashes SQL power for transformations
Provides management framework for change data

How was it done before (old way)?
Method                          Major Issues
Application logging / triggers  Maintenance, transaction impacts
Timestamp / change key column   Application design & performance impact, no before image
Table differencing              Impractical for large tables, high transport costs, not timely
Log sniffing                    Not supported, does not track DB releases, security issues, rocket science
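To make the table-differencing row concrete, here is a minimal Python sketch (not Oracle code) that derives inserts, updates, and deletes by comparing two full snapshots keyed by primary key. The function name `diff_snapshots` and the sample rows are hypothetical. It suggests why the method scales poorly: every row of both snapshots has to be read and compared, and only the net difference between snapshots is visible.

```python
def diff_snapshots(old, new):
    """Compare two full snapshots (dicts keyed by primary key).

    Returns a list of (operation, key, before_row, after_row) tuples,
    sorted by key. Every row of both snapshots is examined, which is why
    this method becomes impractical for large tables.
    """
    changes = []
    for key in sorted(old.keys() | new.keys()):
        before, after = old.get(key), new.get(key)
        if before is None:
            changes.append(("I", key, None, after))
        elif after is None:
            changes.append(("D", key, before, None))
        elif before != after:
            changes.append(("U", key, before, after))
    return changes


# Hypothetical customer snapshots taken at two points in time.
yesterday = {123: ("Smith", "Frank"), 124: ("Jones", "Mary")}
today = {123: ("Smith", "Francis"), 125: ("Stein", "Linda")}

changes = diff_snapshots(yesterday, today)
for change in changes:
    print(change)
```

Intermediate values, the order of changes, and the user who made them are all lost between snapshots, which is the gap the rest of the deck says CDC closes.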
CDC Advantages

Built in, custom fit, evolves with the database
Delivers change data when you need it, where you need it
Offers several tradeoffs between timely change delivery vs. source system overhead (sync, async hotlog, async autolog, etc.)
Assumes complete change management responsibility

CDC Advantages (concl.)

Captures all change data along with transaction information: see all changes a given transaction made and who made them
Transactional consistency for changes across multiple source tables is guaranteed
Transparently coordinates sharing of change data across users and applications
You don't need rocket scientists on your staff!

CDC Configurations
Configuration       Sync CDC                             Async CDC HotLog  Async CDC AutoLog
Available           Oracle 9i EE, Oracle 10g SE          Oracle 10g EE     Oracle 10g EE
Source system cost  Transaction delay, system resources  System resources  Minimal (~2%)
Part of txn         YES                                  NO                NO
Latency             Real time                            Near real time    Varies w/ topology, checkpoint & log switch interval
Systems             1                                    1                 2
How CDC Works: Sync CDC

Uses internal triggers to capture before and/or after images of new and updated rows
Has the same performance implications as capture via user triggers
Delivers change data in real time
Uses the same interface as async CDC

Synchronous CDC
[Diagram: Combined Source / Operational BI System. Triggers on the source tables (Customer, Order) populate CDC change tables; the ETL process uses upsert to load dimension tables and direct-path insert to load fact tables.]

How CDC Works: Async CDC

Relational interface to Streams
Prepackaged Streams application
Asynchronously captures change data from redo/archive logs
Presents relational interface to change data stream
Can operate on source system (HotLog) or staging system (AutoLog)

Foundations of Async CDC
Async CDC: change capture, change management, warehouse loading
Streams: replication, message queuing, warehouse loading, event notification, data protection
LogMiner: redo log inspection, debugging, auditing, reversing transactions

Asynchronous CDC HotLog
[Diagram: Combined Source / Operational BI System. LogMiner reads the active redo log for the source tables (Customer, Order); Streams delivers the changes to CDC change tables; the ETL process uses upsert to load dimension tables and direct-path insert to load fact tables.]

Asynchronous CDC AutoLog
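[Diagram: Source Database and Data Warehouse / Staging System. On the source database, the archiver process writes the redo logs for the source tables (Customer, Order) to archived redo logs; LogMiner and Streams turn the archived redo logs into CDC change tables; the ETL process uses upsert to load dimension tables and direct-path insert to load fact tables.]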

Using CDC: Publish/Subscribe

Publisher supplies, subscribers consume change data
Model allows sharing of change data across users and applications
Coordinates retention / purge of change data
Prevents application from accidentally processing change data more than once
Guarantees transactional consistency of change data across source tables via change sets
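One way to picture the change-set guarantee is a single commit-SCN range applied to every change table in the set. The Python sketch below is a conceptual model only, not how Oracle implements it; the function `read_change_set`, the sample rows, and the simplification that each transaction has exactly one commit SCN are all assumptions made for illustration.

```python
from collections import defaultdict


def read_change_set(change_tables, low_scn, high_scn):
    """Return changes with low_scn < commit SCN <= high_scn from every table.

    Applying one SCN range to all change tables in the set means a
    transaction is either wholly inside the result or wholly outside it.
    """
    by_txn = defaultdict(list)
    for table, rows in change_tables.items():
        for cscn, op, payload in rows:
            if low_scn < cscn <= high_scn:
                by_txn[cscn].append((table, op, payload))
    # Return transactions in commit order.
    return [(cscn, by_txn[cscn]) for cscn in sorted(by_txn)]


# Hypothetical change set holding two change tables; each row is
# (commit SCN, operation, changed values).
sh_set = {
    "sales_ct": [(100, "I", {"PROD_ID": 1}), (200, "I", {"PROD_ID": 2})],
    "promo_ct": [(100, "I", {"PROMO_ID": 42}), (300, "D", {"PROMO_ID": 7})],
}

window = read_change_set(sh_set, low_scn=0, high_scn=200)
for cscn, changes in window:
    print(cscn, changes)
```

In this model the transaction committed at SCN 100 touched both tables, and both of its rows appear together; the transaction at SCN 300 is excluded entirely until the range is moved forward.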

Using CDC: Publish/Subscribe
[Diagram: a publisher's publication and change data, with two subscribers whose subscriptions see overlapping ranges of the change data.]

Publisher: Publication
Table  Column  Type
Cust   CustNo  number
Cust   Last    varchar
Cust   First   varchar

Publisher: Change Data
CustNo  Last   First
123     Smith  Frank
124     Jones  Mary
125     Stein  Linda
126     Vine   Abe
127     Block  Greg

Subscriber 1: Subscription
CustNo  Last   First
123     Smith  Frank
124     Jones  Mary
125     Stein  Linda

Subscriber 2: Subscription
CustNo  Last   First
125     Stein  Linda
126     Vine   Abe
127     Block  Greg

Publisher Concepts
Change source: defines the source system to CDC
Change set: collection of source tables for which transactionally consistent change data is needed
Change table: container to receive change data; is published to subscribers

Publisher Concepts
Source Database: HQ
Staging Database: DW
Change Source: HQ_SRC
Change Set: SH_SET
Source table: sh.sales (PROD_ID, CUST_ID, PROMO_ID, AMOUNT_SOLD, QUANTITY_SOLD)
Change table: sales_ct (PROD_ID, CUST_ID, PROMO_ID, AMOUNT_SOLD)
Source table: sh.promotions (PROMO_ID, PROMO_SUBCAT, PROMO_CAT, PROMO_COST)
Change table: promo_ct (PROMO_ID, PROMO_SUBCAT, PROMO_CAT)

Publish Package

DBMS_CDC_PUBLISH
CREATE / ALTER / DROP_AUTOLOG_CHANGE_SOURCE
CREATE / ALTER / DROP_CHANGE_SET
CREATE / ALTER / DROP_CHANGE_TABLE
PURGE
PURGE_CHANGE_SET
PURGE_CHANGE_TABLE
DROP_SUBSCRIPTION

Using Change Data: Subscribers

The subscriber creates a subscription from an available publication
The subscription provides a moving window (view) to the change data
Subscriptions go against a single change set and are therefore transactionally consistent
When all subscribers have advanced past old change data, CDC automatically and efficiently purges it
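The moving window and the automatic purge can be sketched with a small Python model. This is not the DBMS_CDC_SUBSCRIBE API: the classes `ChangeSet` and `Subscription` are hypothetical, and the method names `extend_window` and `purge_window` are borrowed from the procedure names in the deck purely as labels for the two steps.

```python
class ChangeSet:
    """Toy model of change data shared by several subscribers."""

    def __init__(self):
        self.rows = []          # list of (scn, payload), in commit order
        self.subscribers = []

    def publish(self, scn, payload):
        self.rows.append((scn, payload))

    def purge(self):
        """Drop rows that every subscriber has already moved past."""
        if not self.subscribers:
            return
        oldest_needed = min(s.low for s in self.subscribers)
        self.rows = [(scn, p) for scn, p in self.rows if scn > oldest_needed]


class Subscription:
    """A moving window (low, high] over the change set's rows."""

    def __init__(self, change_set):
        self.change_set = change_set
        self.low = 0
        self.high = 0
        change_set.subscribers.append(self)

    def extend_window(self):
        """Raise the upper bound to include all changes published so far."""
        if self.change_set.rows:
            self.high = max(self.high, self.change_set.rows[-1][0])

    def view(self):
        """Rows currently visible to this subscriber."""
        return [p for scn, p in self.change_set.rows if self.low < scn <= self.high]

    def purge_window(self):
        """Mark the current window as processed so it is never seen again."""
        self.low = self.high
        self.change_set.purge()


cs = ChangeSet()
sub1, sub2 = Subscription(cs), Subscription(cs)
for scn, name in [(1, "Smith"), (2, "Jones"), (3, "Stein")]:
    cs.publish(scn, name)

sub1.extend_window()
first_batch = sub1.view()       # Smith, Jones, Stein
sub1.purge_window()             # sub2 has not advanced, so nothing is dropped
cs.publish(4, "Vine")
sub1.extend_window()
second_batch = sub1.view()      # only Vine: earlier rows are not re-delivered

sub2.extend_window()
sub2_batch = sub2.view()        # all four rows, independent of sub1
sub2.purge_window()             # now rows 1 to 3 are behind every subscriber
```

In this model a subscriber cannot process the same change twice, because purging its window moves the lower bound forward, and rows are physically dropped only once the slowest subscriber has passed them.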
Subscriber Concepts
Staging Database: DW
Change Set: SH_SET
Subscription: sales_promo_list
Publication on: sh.sales (PROD_ID, CUST_ID, PROMO_ID, AMOUNT_SOLD)
Subscriber view: spl_sales
Publication on: sh.promotions (PROMO_ID, PROMO_SUBCAT, PROMO_CAT)
Subscriber view: spl_promos

Subscriber View
Subscriber view: spl_sales

Meaning        OPERATION$  CSCN$   USERNAME$  PROD_ID  CUST_ID  PROMO_ID
Insert         I           587322  GRIFFIN    12784    12       0
Update before  UO          587482  SLOAN      12784    12       0
Update after   UN          587482  SLOAN      12784    12       42
Insert         I           594312  BRIGGS     14899    302      42
Insert         I           602311  GRIFFIN    12498    12       55
Delete         D           711413  SLOAN      138922   7934     0
Insert         I           796122  BRIGGS     77741    712      55
Insert         I           796122  BRIGGS     13846    712      55
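As a rough illustration of how an ETL step might consume rows like these, the Python sketch below applies the operation codes from the slide (I, UO, UN, D) to an in-memory target. The function `apply_changes` is hypothetical, and treating (PROD_ID, CUST_ID) as the key is an assumption; the slides do not say what the key of `sh.sales` is.

```python
def apply_changes(target, changes, key_cols=("PROD_ID", "CUST_ID")):
    """Apply subscriber-view rows to `target`, a dict keyed by key_cols.

    I  -> insert the row
    UO -> before image of an update; skipped (the after image is enough here)
    UN -> after image of an update; upsert the row
    D  -> delete the row if present
    """
    for row in changes:
        key = tuple(row[c] for c in key_cols)
        op = row["OPERATION$"]
        if op in ("I", "UN"):
            target[key] = {"PROMO_ID": row["PROMO_ID"]}
        elif op == "D":
            target.pop(key, None)
        elif op != "UO":
            raise ValueError(f"unknown operation code: {op!r}")
    return target


COLUMNS = ("OPERATION$", "CSCN$", "USERNAME$", "PROD_ID", "CUST_ID", "PROMO_ID")
# Rows copied from the spl_sales slide, in CSCN$ order.
ROWS = [
    ("I", 587322, "GRIFFIN", 12784, 12, 0),
    ("UO", 587482, "SLOAN", 12784, 12, 0),
    ("UN", 587482, "SLOAN", 12784, 12, 42),
    ("I", 594312, "BRIGGS", 14899, 302, 42),
    ("I", 602311, "GRIFFIN", 12498, 12, 55),
    ("D", 711413, "SLOAN", 138922, 7934, 0),
    ("I", 796122, "BRIGGS", 77741, 712, 55),
    ("I", 796122, "BRIGGS", 13846, 712, 55),
]

changes = [dict(zip(COLUMNS, r)) for r in ROWS]
target = apply_changes({}, changes)
for key, value in target.items():
    print(key, value)
```

Starting from an empty target, the delete at CSCN$ 711413 has nothing to remove, so it is a no-op here; in a real load the target would already hold that row.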

Subscriber Package

DBMS_CDC_SUBSCRIBE
CREATE_SUBSCRIPTION
SUBSCRIBE
ACTIVATE_SUBSCRIPTION
EXTEND_WINDOW
PURGE_WINDOW
DROP_SUBSCRIPTION

Security

Sync publisher must have SELECT access to the source table
Async publisher must have EXECUTE_CATALOG_ROLE privilege
Publisher uses GRANT and REVOKE on change tables to control subscriber access

Performance Benchmark*
Objectives:
Determine impact on transaction time
Determine latency
Source system: Oracle 10g R1 Beta, Sun Fire 4800 SMP, 8 x 900 MHz / 16 GB, with 8 striped Sun StorEdge T3 arrays (9 x 36.4 GB each)
Customer insurance quote OLTP application run at Oracle, 250 concurrent users / 175 TPS, system warmed up (steady state)
Mixture of inserts, updates, deletes, singleton selects, cursor fetches, rollbacks / commits, savepoints
Capture changes on all tables

* Your mileage will vary!


Transaction Performance
Transaction elongated by 10%
Relative impact varies depending on other overhead

[Chart: y-axis from 0.9 to 1.2; categories: no CDC, Sync CDC (9i), HotLog CDC (10g), AutoLog CDC (10g)]

Transaction Performance
Transaction elongated by 8%
Can reduce elongation by adding RAC nodes / CPUs
[Chart: y-axis from 0.9 to 1.2; categories: no CDC, Sync CDC (9i), HotLog CDC (10g), AutoLog CDC (10g)]

Transaction Performance
Transaction elongation virtually eliminated
Change capture processing moved off system

[Chart: y-axis from 0.9 to 1.2; categories: no CDC, Sync CDC (9i), HotLog CDC (10g), AutoLog CDC (10g)]

HotLog Latency Performance
[Chart: % Changes Arrived (y-axis, 0 to 100) vs. Seconds (x-axis, 0 to 3)]
About the change data arrived in 1 second
Virtually all the change data arrived in 2 seconds
Summary

CDC assumes the burden of change capture for you
Change data is guaranteed consistent and complete
Change data can be shared across users and applications effortlessly
CDC delivers change data where you need it, when you need it, and with minimal overhead

For More Information
Oracle Data Warehousing Guide, 10gR1, Chapter 16
Oracle PL/SQL Packages and Types Reference, 10gR1, packages DBMS_CDC_*
http://www.oracle.com/technology/oramag/oracle/03-nov/o63tech_bi.html
http://www.oracle.com/technology/products/bi/db/10g/pdf/twp_dss_ontime_etl_10gr1_0304.pdf
http://www.rittman.net/archives/000901.html
http://www.nyoug.org/cdc.pdf (Oracle9i)

Questions?
