IBM, the IBM logo, ibm.com, Informix, solid, DataMirror, Optim, Cognos are trademarks or registered trademarks of
International Business Machines Corporation in the United States, other countries, or both. If these and other IBM
trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols
indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such
trademarks may also be registered or common law trademarks in other countries.
Other company, product, or service names may be trademarks or service marks of others.
Contents
Definition of BI/DW/BA
Types of IDS BI Users
The data warehousing process turns raw data into potentially valuable
information usable by people and systems. Warehousing enhances the value
of data assets by:
– Applying standards and consistency to the data
– Organizing the data into subject areas that cross business functional
lines
– Integrating the data
– Enforcing data consistency over time to provide meaningful history
– Acting as a stable and reliable source
– Providing easy access to data
Business Analytics
The process of using information to enhance knowledge and apply that
knowledge to help a business achieve its objectives. Analytic applications
provide tools to facilitate the business analytics process.
Business Metrics and Business Management
Business Process Management
[Figure: BI capabilities plotted by complexity against business value (low to high). Source: TDWI]
– Prediction: using predictive analysis tools
– Monitoring: using dashboards & scorecards
– Reporting: using query, reporting and search tools
IDS in BI/Warehousing
• Given the IDS characteristics of reliability, high availability,
performance and ease of use, why isn’t IDS in this space?
– IDS has traditionally been viewed as an OLTP solution
• However, there are many more warehousing users on IDS than one
might expect!
– Some customers have implemented IDS warehouses at
Terabyte levels
– There are a lot of features already in IDS that make it suitable
for BI/Warehousing
– BI tools have become very sophisticated over the years
• We recognize the need to provide better warehousing capabilities
for IDS users
What’s Available? IDS Warehousing Features
• Analysis
• Dashboards
• Scorecards
• Industry and Functional Solutions
• Complete Coverage of all capabilities
• Enterprise-Class SOA Platform
Data Warehouse Architecture
SQL Warehousing Tool Overview
– Warehousing Process
– Design Studio
– Admin Console
– Summary
SQL Warehousing Tools Overview
• Typical process
– Identify requirements
• Data Architect
– Define data transformation (ETL/ELT) process
• SQL/ETL developer
– Development of SQL/shell scripts
• SQL/ETL developer
– Deployment in production system
• Application Architect, DBA
– Reporting
• Business user
– Refine requirements
• Challenges
– Dynamic requirements
• Constant refinement
– Multiple roles, tools
• Each has a different perspective
• Communication cost/information loss
– Unreadable, hard-to-debug scripts
• Poor productivity
• SQW Solution
– Data Modeling
• Physical Data Model (reverse engineering, new from scratch, generate DDL), compare & sync
– Data Flows
• Visual Design
• Optimized SQL code generation
• Control flow supports programming logic
– Admin Console
• Schedule, Monitor, Parameterized values
– Eclipse free reporting tool
• e.g. BIRT
– Reusable flows
• Easy refinement
• Copy & paste, refactor
• Values
– Easy to design & reuse
• Increased productivity
– Integrated tools
• Seamless integration inside Eclipse
– Auto generated code from visualized flows
• Optimized SQL code
– Impact analysis for any data model change
SQW Architecture
[Figure: SQW architecture in three stages. Labels recoverable from the diagram:
– DESIGN: Design Center (Eclipse), Design Studio, Data Flows + Control Flows
– DEPLOY: Deployment package (user scripts, code units, build profile), SQW Admin Console
– RUNTIME: HTTP service (WAS), SQW Runtime, Applications
– Databases: IDS SQW Execution DB, IDS preparation DB, IDS SQW Control DB, SQL Server, Oracle, DB2
– Other Servers: DataStage]
SQW: Design Studio
• Design Studio
– Eclipse-based IDE
• Integrated tools, shell sharing
– Team development
• CVS, ClearCase for check-in/check-out of
projects, flows
• Data Warehousing Project
– Data Models
– Data Flows
– Control Flows
– Warehouse Applications (deployment
packages)
– Subflow & Subprocess (reusable flow
module)
– Variables
• Data Source Explorer
– Database connections to multiple
vendors, e.g. Informix, DB2 LUW,
Oracle, SQL Server, MySQL, DB2 z/OS
• DataStage Servers
– Integration with IBM DataStage
SQW: Data Modeling
[Figure: example data flow with a file source, a table source, a table join, an aggregation and a table target]
Data Flow Operators:
– Source & target operators (table, file)
– SQL Transformation operators
– Warehousing operators
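The operator chain named above (sources, join, aggregation, target) can be sketched in plain Python to show how data moves through such a flow. This is only an illustration under stated assumptions: the sample rows, the column names and the functions `join`, `aggregate` and `run_flow` are hypothetical, and the code is not generated by or part of SQW, which produces SQL from the visual design.

```python
from collections import defaultdict

# Hypothetical sample data standing in for a file source and a table source.
file_rows = [
    {"store_id": 1, "amount": 10.0},
    {"store_id": 1, "amount": 5.0},
    {"store_id": 2, "amount": 7.5},
]
table_rows = [
    {"store_id": 1, "region": "East"},
    {"store_id": 2, "region": "West"},
]


def join(left, right, key):
    """Inner join of two row lists on a shared key column."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def aggregate(rows, group_by, measure):
    """Sum one measure per group."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_by]] += row[measure]
    return dict(totals)


def run_flow():
    """File source + table source -> join -> aggregation -> target."""
    joined = join(file_rows, table_rows, "store_id")
    return aggregate(joined, "region", "amount")


# The dict stands in for the table target.
target_table = run_flow()
```

Each function plays the role of one operator, so reordering or reusing steps is a matter of recombining calls, which is the idea behind reusable flows.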
SQW: Data Flows
A simple flow
Control flow
– Common utility operators
– Control logic, parallel execution, loop iteration
– Error handling
SQW Overview
[Figure: warehouse application lifecycle: create, deploy, manage]
– Application package (zip file)
• Generated code
• Deployment profile (database connections, machine resources, variable definitions, DDL files, etc.)
– Admin Console
• Manage warehouse applications
• Schedule
• Monitor
• External Tables
– XPS style loader for easy migration
• Partitioning Strategies
– Auto fragmentation
– Fragment Advisor
– Fragment stats Update
– Truncate Fragments
• Primary Storage Manager (PSM)
– For simpler, easier management of backups
(replacing ISM)
• Merge
– UpSert capabilities
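As a rough illustration of what "UpSert" means, the sketch below updates a row when its key already exists in the target and inserts it otherwise. It is a conceptual Python model only: the function `merge` and the `warehouse` and `staging` data are hypothetical and do not represent the IDS MERGE syntax.

```python
def merge(target, source, key):
    """Upsert source rows into target, a dict of rows indexed by key value."""
    inserted = updated = 0
    for row in source:
        k = row[key]
        if k in target:
            # Matched: update the existing row in place.
            target[k].update(row)
            updated += 1
        else:
            # Not matched: insert a copy of the new row.
            target[k] = dict(row)
            inserted += 1
    return inserted, updated


# Hypothetical warehouse table and staging rows.
warehouse = {1: {"id": 1, "qty": 10}}
staging = [{"id": 1, "qty": 12}, {"id": 2, "qty": 3}]
counts = merge(warehouse, staging, "id")
```

Doing both branches in one pass is what spares a load job from issuing a separate UPDATE followed by an INSERT for the unmatched rows.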
OR
Use MACH 11 Blade Server
[Figure: MACH 11 cluster. OLTP apps and SQW are routed by a Connection Manager, which provides user transparency and a single database view, to an “OLTP” node group and an “SQW” node group made up of the primary and SDS nodes on a shared disk]
IDS Storage Optimization
[Figure: dictionary compression example. Patterns that repeat across rows such as “ANCPRPLT 220J 200 Z165-3 NE132 6157” and “SNCPRPLT 580T 132 Z165-3 NE132 6157” are stored in a dictionary and replaced in the rows by short symbols such as (01) and (02)]
Storage savings
• Tables will often compress by 60% to 80%
• Overall database storage savings will typically be between 40% and 50%
• That’s up to 50% less disk space needed to support an IDS 11 database!
[Figure: example results labeled “78% Smaller” and “81% Smaller”]
• 40% Faster
– 2x as fast in some cases, as the database may now be ½ the size
IDS 11 Compression Operations
• estimate_compression
– Estimates compression ratio on a table
• create_dictionary
– Creates compression dictionary for a table
• compress
– Implicitly runs create_dictionary and compresses all existing data
• uncompress
– Uncompresses the table and deactivates compression
• uncompress_offline
– Takes an exclusive lock (XLOCK) on the table and uncompresses it; also deactivates compression
• purge_dictionary
– Deletes old inactive dictionaries
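To make the roles of these operations concrete, here is a toy Python model of dictionary compression. It is a sketch under loud assumptions: the function names merely echo the operation names above and are not the IDS administration API, and it substitutes whole whitespace-separated tokens, whereas a real engine works on byte patterns. The 2-byte `symbol_size` is also an assumption, and the estimate ignores the size of the dictionary itself.

```python
from collections import Counter


def create_dictionary(rows, max_symbols=16):
    """Map tokens that occur more than once to small integer symbols."""
    counts = Counter(tok for row in rows for tok in row.split())
    frequent = [tok for tok, n in counts.most_common(max_symbols) if n > 1]
    return {tok: i for i, tok in enumerate(frequent)}


def compress(rows, dictionary):
    """Replace dictionary tokens with symbols; keep other tokens as text."""
    return [[dictionary.get(tok, tok) for tok in row.split()] for row in rows]


def uncompress(packed, dictionary):
    """Rebuild the original rows from symbols and literal tokens."""
    reverse = {i: tok for tok, i in dictionary.items()}
    return [
        " ".join(reverse[t] if isinstance(t, int) else t for t in row)
        for row in packed
    ]


def estimate_compression(rows, dictionary, symbol_size=2):
    """Estimate the fraction of token bytes saved (dictionary size ignored)."""
    tokens = [tok for row in rows for tok in row.split()]
    original = sum(len(tok) for tok in tokens)
    packed = sum(symbol_size if tok in dictionary else len(tok) for tok in tokens)
    return 1 - packed / original


# Sample rows taken from the dictionary figure earlier in the deck.
rows = [
    "ANCPRPLT 220J 200 Z165-3 NE132 6157",
    "SNCPRPLT 580T 132 Z165-3 NE132 6157",
]
dictionary = create_dictionary(rows)
packed = compress(rows, dictionary)
ratio = estimate_compression(rows, dictionary)
```

With only two rows the saving is modest; the larger ratios quoted for real tables depend on patterns repeating across many rows.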
Storage Optimization Operations
• repack
– Move rows within a table or fragment to consolidate free space
• repack_offline
– XLOCK the table and move rows within a table or fragment to
consolidate free space
• shrink
– Return free space at end of table or fragment to the dbspace
– Normally done after a repack
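The reason shrink normally follows a repack can be seen in a small Python model, where a table is a list of pages and each page is a list of rows. The functions `repack` and `shrink`, the `rows_per_page` parameter and the sample pages are hypothetical illustrations, not the IDS implementation.

```python
def repack(pages, rows_per_page):
    """Move rows toward the front so free space collects at the end."""
    rows = [r for page in pages for r in page]
    packed = [rows[i:i + rows_per_page] for i in range(0, len(rows), rows_per_page)]
    # Keep the page count unchanged: the trailing pages are now empty.
    return packed + [[] for _ in range(len(pages) - len(packed))]


def shrink(pages):
    """Release trailing empty pages; return the kept pages and the count freed."""
    kept = list(pages)
    while kept and not kept[-1]:
        kept.pop()
    return kept, len(pages) - len(kept)


# Hypothetical table with free space scattered across its pages.
pages = [["r1"], [], ["r2", "r3"], ["r4"]]

# Shrinking first frees nothing, because the last page still holds a row.
_, freed_without_repack = shrink(pages)

# Repacking first consolidates the rows, so shrink can release the tail.
repacked = repack(pages, rows_per_page=2)
table, freed = shrink(repacked)
```

In this model shrink can only release space at the end of the table, which is why scattered free space has to be consolidated first.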
Compression On Data Page With Multiple Rows
[Figure: data pages with multiple rows are compressed using the dictionary, then repacked and shrunk (compress, repack, shrink)]