Вы находитесь на странице: 1из 84

Enterprise Data Warehouse

A Technical Perspective
Tony Dalwood
Information Architecture & Management
University of South Australia
IT Structure
ISTS Information Strategy & Technology Services
Information Strategy
Corporate Information Systems
E-Business
Information Architecture & Management
Technical Services
Customer Services
Network Services
Systems Infrastructure


Information Architecture &
Management (IAM)
Merger of DBA team & Information Integration team
in Feb 2006
IAM manages
Corporate System Databases (3 DBAs)
Operational Data Store Management
Middle Tier Apps
Student Portal (myUniSA)
Staff Portal (UniSAinfo)
UniSAinfo Reporting
EDW

Project Governance
Steering Group
Includes Directors of ISTS, Planning and Assurance Services
(PAS), Student & Academic Services (SAS), Finance
Sponsors Group
Director of Planning & Assurance Services
Dep. Director Information Strategy
Business Project Manager
Technical Project Manager
Reference Group
Senior Officers from PAS, HR, Research, SAS, Finance

Project Governance
Project Team
Business Project Manager (PAS)
Technical Project Manager (ISTS)
Design Architect/Dev Team Leader (ISTS)
Business Analyst (x1.5) (PAS)
Data Quality Manager (0.5) (PAS)
Developers (x3 variant) (ISTS)
EDW Project Milestones
Aug 2004 - Business Case submitted by Planning & Assurance Services (PAS) and ISTS to
extend current reporting environment to an EDW ($150K)
Feb 2005 Project Commenced
Feb-July 2005 Data Gathering Workshops
Sep-Dec 2005 Technical Research & Proof of Concept (0.5 IT Resource)
Jan-Feb 2006 External Consultancy (1 IT Resource)
May 2006 First Star Schema complete (Research Publications) (4 IT Resources)
July 2006 Three more Star Schemas complete (Research Income, AVCC Data, Research
Staff Supervision) (4 IT Resources)
August 2006 First Soft Production Release (2.5 IT Resources)
Beyond Student Data & Finance Data (min 2 IT Resources)

NB: IT Resource not including part time Tech Project Manager


Business Technical
One Source of the truth Conformed Dimensions
Consolidated Facts
Performance Transformed schema design
External Data Flexible data sources
Simplicity Pre-calculated measures
Historical Capability Versioning, Snapshot Data
Quality Verification, Validation,
Audit Trail
Project Goals
By-Products of an EDW Project
Data Discovery
What data do we have
How data is used and maintained
What is the quality of the data
How data can be utilised by more of the organisation

Enhanced Collaboration
Intra and Inter communication between business
units, system owners and IT

Technical Project Plan

Warehousing Research
Proof of Concept exercise
External Assistance
Implementation of an Architecture
Development Standards & Procedures
Build & Implementation of Stage 1
Review

Proof of Concept

Validate Warehouse research findings
Proof of Concept covered the following topics:
Project methodology
Technical architecture
Design methodology
ETL methodology
MetaData options
Data Quality approach
Security implementation options

Project Methodology
Technical Architecture
Inputs into Architecture
Business Goals
Existing Reporting Environments
Technology
Time
$$
Resources/Skills
Data Flow Architecture

Source
System
Replicated Data
Table
24 Hour
Snapshot schema
Previous Snapshot
Schema
Previous
Snapshot Table
Diff Process
ODS
Staging Tables
EDW
Deltas
Target EDW Tables
Transform
& Load
Source
System
Source
Systems
External Files
Design Methodology
Dimensional Modelling chosen as the design
philosophy
Star Schemas/Snowflakes
Facts
Dimensions
Measures
Bridges
History Retention for Slowly Changing Dimensions
Warehouse records are versioned i.e. never deleted or
overwritten.
Views to identify current records
Transformation of Design - Source
Transformation of Design - Target

ETL Methodology
Scripts Vs Tool decision
Tool chosen for following reasons:
Already licensed for Oracle Internet Developer Suite that includes Oracle
Warehouse Builder
Oracle Database environment
Oracle technical skills
Visibility of Development Environment
Auto technical Meta Data generation
Auto and accessible code generation using PL/SQL
Ability to include custom code
Integration with Oracle database and related Oracle technology
Framework for Beginners
Difficult to evaluate other products without expertise
Smarts & Effort into Modelling and Design ETL should be a
no brainer

MetaData
Data about Data
Oracle Warehouse Builder provides technical
metadata
Business MetaData facility currently restricted to
documentation and Cognos catalogs
Evaluation of MetaData methods to be reviewed
at the completion of Stage 1 development

Data Quality
Pre-ETL
Technical profile to ensure physical design has mapped
appropriate data elements
Business profile of source data to identify data attributes e.g.
data type, patterns, nulls, min, max, outlies
ETL
Transform to conformed data sets
Foreign Key checks
Reporting of anomolies
Post ETL
Final Business profile to validate transformations of data


Security
Security options implemented are:
Database Layer
Oracle roles to grant or deny access to database objects
based on Business rules
Oracle views for granular data security where appropriate
User Layer
Access to end user Cognos catalogues/cubes controlled
via Cognos security mechanisms and filesystem access
Development Lifecycle
Business Requirements
Design Process
Logical Design
Physical Design
Data Mapping
Data Profiling

Development Lifecycle
Design & Build ETL Objects & Processes
Extraction routines
Diff routines
Tag records as Inserts, Updates or Deletes
Build Staging tables
Build Target warehouse tables

Standard ETL Process
Scheduled Extract/Diff process runs to populate a Diff
table in the Staging Area
ETL process then performs a standard set of steps
Load Staging from Diff table
Stamp Staging record according to Diff type (U, D or I)
Updated Record Tag staging record as new version of core record
Deleted Record Tag staging record Retired record in warehouse
Inserted Record Tag staging record to be new record (version 1)
Update Core End date existing current record
Load new Core New current record from Staging

Development Lifecycle
Post ETL
Measures
Summary data
Process Flows to execute ETL
Security views
End User Layer e.g. Catalogues

ETL Auditing
When did a process last run
How long did it run for
Did it Succeed, Fail or produce Warnings
How many records did it alter or insert
What were the data exceptions


UniSA EDW Toolset

Oracle Database
Oracle Warehouse Builder
Oracle Workflow
Oracle Enterprise Manager
Datiris Data profiler
Cognos Impromptu/Powerplay
Whiteboard and lots of A3 Paper!!!
Oracle Database
Options assisting Warehouse implementation
External tables
Materialised Views
Query Rewrite
Bitmap indexes
Partitioning
Star Query optimizer options

Oracle Warehouse Builder
Provides the design and development
environment and framework for the build and
deployment of Warehouse objects and
transformation processes
Consists of Design Repository and Runtime
components

Oracle Workflow
Optionally used for job execution with
dependency management
Exists as an optional install with RDBMS
Run as Client/Server or HTTP browser based
application
Workflow engine is a service on the warehouse
database server administered by a workflow
schema

Oracle Enterprise Manager
Optionally used as the scheduling option for
submitting and monitoring Warehouse builder
processes or workflows
Base OEM comes with RDBMS
Optionally run as standalone install or
Management Server mode using a web console

Cognos 7.3 Reporting Suite
Catalogues
Report Developer access layer
Impromptu
Reporting capability
Powerplay
Multi-dimensional analysis
Upfront
Web interface


Oracle Warehouse Builder
Demonstration
OWB 10g Release 2 - Paris
New Features:
Design Tool
Graphic Interface Improvements
Built in Slowly Changing Dimension property
Data Profiling/Quality utilities
Better Integrated Workflow Engine
Job Scheduling within OWB via OEM

Project Review

Sanity Check on whole process, architecture,
methodology
Business & Technical
Evaluate ROI
Quantify metrics on time to deliver
Proposed Future phases
Usage Statistics
Hardware adequacy & capacity
Useful Technical References
Links
Oracle Business Intelligence & Technical Sites
http://www.oracle.com/solutions/business_intelligence/index.html
http://www.oracle.com/technology/tech/bi/index.html
Rittman Blog
http://www.rittman.net/
Kimball Tips
http://www.kimballgroup.com/html/designtips.html
Texts
Oracle 9iRel2 Data Warehousing - Hobbs
Kimball Texts
The Data Warehouse Lifecycle Toolkit
The Data Warehouse Lifecycle Toolkit
The Data Warehouse ETL Toolkit
Questions ?

Вам также может понравиться