A High-Performance
Telecommunications
Data Warehouse using
DB2 for Linux

Authors: Ken DeLathouwer, Kwai Wong, Haider Rizvi

Date: August 19, 2005

Abstract: This paper describes the results of a proof of concept (PoC) IBM completed
for a major telecommunications company with a large data warehouse. The PoC
demonstrated the capabilities of the IBM® DB2® Universal Database™ for Linux®
product (DB2 for Linux) to handle workloads with many concurrent users in a star-
schema data warehouse. It also demonstrated some of the new features available for
handling star schema.
Telco Proof of Concept Experience

1 Introduction
2 Benchmark Specification
3 Hardware and Software Configuration
4 Database Design
4.1 Partitions and Data Partition Groups
4.2 Table Spaces & Physical Layout
4.3 Multidimensional Clustering and Indexes
4.4 Statistical views
4.5 Materialized Query Tables
5 Database Configuration
6 Benchmark Tests
6.1 Data Explosion, Database Create and Load
6.2 Query and Data Insert workload
6.2.1 Scenario 1 – Single Stream
6.2.2 Scenario 2 – Multiple Streams
6.2.3 Scenario 3 - Multiple Streams with Concurrent Load
6.2.4 Scenario 4 – Surprise Queries Single Stream
7 Summary of results
8 Conclusions
1 Introduction
In December 2004, the DB2 performance team demonstrated a data warehousing
solution running on low-cost commodity hardware with the IBM DB2 Integrated Cluster
Environment (DB2 ICE) solution (http://www-306.ibm.com/software/data/db2/linux/ice).
The demonstration was to provide a proof of concept (PoC) for a telecommunications
company (telco).

The purpose of the PoC was to benchmark the performance of the tested system
and to evaluate the work required in order to obtain such performance. The customer was
interested in the cost of setup (planning, loading) as well as overall performance. A short
timeframe was allotted for the benchmark. The vendors were expected to demonstrate a
multi-phased data warehousing solution, including:

1. Data loading into a star schema database.


2. Running queries with minimal tuning effort.
3. Tuning single-stream queries with whatever functionality the vendors found
useful for the specific schema / workload.
4. Executing several multi-stream tests, simulating 10 concurrent users as well
as 30 and 50 concurrent users.
5. Demonstrating the ability of the system to allow bringing in new data while a
10-user concurrent queries test was in progress.

Two vendors participated in the PoC. IBM proposed a DB2 ICE /e350 cluster
while the competing vendor proposed their solution. The same rules and success criteria
applied to both IBM and the competition.

Overall the PoC was highly successful for IBM, resulting in a sale to the
customer. We were able to complete all of the required tests during the short interval,
while the competition only completed the single-stream query tests. Our solution
delivered 5x better performance on the single-stream query tests.

This PoC showcased some of the key features of DB2 Universal Database
(DB2 UDB) technology specifically targeted at data warehousing environments, such as
MDC (multidimensional clustering) and MQTs (materialized query tables). We also
evaluated some of the new features available in DB2 UDB, such as statistical views.

This paper describes the details of the setup required for this PoC, as well as the
results achieved during this test. We also discuss how we used the various features
mentioned above to fit the given scenario.
2 Benchmark Specification
Given that the customer was concerned with the total cost of ownership, they
were interested in comparing the time and effort taken by each vendor to design, create,
and load the database. According to the competing vendor’s marketing, this initial setup
process was a strong point for them. This initial setup period was to take place prior to
the start of the 72-hour benchmark window outlined by the customer. Since they had
initially specified a 1 TB benchmarking database and later changed it to 2 TB, the time
allowed for planning was increased while new hardware was set up. The benchmark
specification required that the times for each step (e.g., hardware setup, data explosion,
splitting/loading data, planning, etc.) were to be reported to the customer in the final
deliverable.

Prior to the start of the 72-hour benchmark window, the customer sent the
database schema, a seed file containing 7.3 million rows and a corresponding ballooning
program to expand the seed data into 11 billion rows. Since the initial setup of the
database was done without any knowledge of the workload, tuning and creation of
specific database objects was done only by the "rules of thumb" for star schema
databases. This included selection of multidimensional clustering dimensions,
partitioning keys, primary indexes, and data distribution.

The workload from the customer consisted of 10 queries that were to be sent at
the beginning of the 72-hour period and an additional three queries sent 18 hours prior to
the end of the allotted time. The benchmark specification consisted of four different
scenarios using these queries, all of which had to be run within the 72-hour time frame.

• Scenario 1)
  o Part 1: Non-optimized
    Using "rules of thumb" database setup and tuning without any knowledge of
    the queries, run the 10 queries sequentially. The intent of this test was to see
    the performance without any workload-specific objects or tuning.

  o Part 2: Optimized
    As in Part 1, run all 10 of the original queries sequentially, but now any
    workload-specific indexes, MQTs, tuning, etc., can be applied to optimize
    performance.

• Scenario 2)
  On the optimized database, run multiple streams of 10, 30, and 50 simultaneous
  users. Each stream executes one of the 10 queries repeatedly until all streams have
  finished at least one query.

• Scenario 3)
  While running 10 simultaneous users, concurrently load approximately 29 million
  rows into the database. Any required rebuild of MQTs is included in the timing of
  this run.

• Scenario 4)
  Run the final three "surprise" or ad hoc queries sequentially on the optimized
  database, as in Scenario 1.

3 Hardware and Software Configuration


The sales team proposed a 16-node Linux BCU cluster with the BCU node #2
(e325/e326 nodes with DS4300 storage). Because of the shortage of time for the PoC and
the availability of DS4500 storage at our benchmarking lab, we chose to go with 16 e325
nodes with equivalent DS4500 storage servers per node.

The hardware configuration was a cluster of e325s, each with a single AMD
Opteron CPU, 4 GB of RAM and one internal 73GB drive. The 16-server cluster used
fiber attached DS4500 (a.k.a. FAStT900) storage totaling 14 TB of usable space. The
servers used 1 Gbit interconnect.

DB2 UDB Version 8.2 (FixPak 7) and SLES 8 SP3 were used. Node 1 of the cluster
served as the NFS server for the DB2 instance home directory, with each of the remaining
15 nodes as NFS clients.

Hardware
• Server
  16 x e325, each with 4 GB RAM, 1 x 73 GB internal disk
  32 QLogic QLA2312 fiber channel cards (2 cards per node)
  Cisco (Catalyst) 6009 with 48-port Gbit copper Ethernet blade
• Storage
  4 DS4500 Storage Servers
  16 EXP700 enclosures w/ 14 x 73 GB disks
  RAID5 (6+P) configuration - 2 containers per server
  Total usable space: 14 TB, or 876 GB per server
Node Configuration (e325):
  Processors: 1 x 2.0 GHz AMD 64-bit Opteron Model 248
  Memory: 4 x 1 GB PC2700 CL2.5 ECC DDR SDRAM RDIMM
  Disk Controllers: 2 x QLA2312 HBA (2 ports x 2 Gbps FC)
  Disk Drives: 2 x 36.4 GB 15 Krpm Ultra320 SCSI HS HDD
  Storage: 0.25 x IBM DS4500 Storage Server, 1 x EXP700 (14 x 73 GB 15 Krpm 2 Gbps FC HS HDD)

Figure 3.1: Hardware Configuration. The diagram shows 16 IBM e325 servers (16 CPUs)
running DB2 V8.2 on SLES 8 SP3, connected through an Ethernet switch (1 Gbps private
and public Ethernet) and 2 Gbps Fibre Channel to 4 IBM DS4500 storage servers with
224 drives (73 GB, 15 Krpm, 2 Gbps FC HS), providing 14 TB of usable storage after RAID5.

4 Database Design
Since the two vendors' solutions differed, the particulars of the database design
were left to each competing team, based on the requirements of the customer. In this
section we describe the design choices and the methods used to implement them, including
disk layouts for table spaces, partitioning, and other database objects.

4.1 Partitions and Data Partition Groups

Each of the single CPU e325 servers contained one DB2 data partition, for a total
of 16 data partitions (DPs). The telco database schema consisted of a large (2 TB) fact
table, which was partitioned across the 16 DPs, and 15 very small dimension tables,
which were replicated over each DP. The partitioning key used to distribute the fact table
across the 16 DPs was the primary key. The decision to use this as the partitioning key
was made to ensure even distribution of the data since dimensional co-location was
unnecessary because of dimension replication.

Two data partition groups were used in the final setup of the database. The first
included only the data partition on the catalog and coordinator node and held a table
space containing the dimension tables. The second group encompassed all 16 data
partitions and was used for the table spaces containing the fact table, the staging
tables for updates, and the replicated dimension tables.
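
As an illustration only, DDL along the following lines (object, table space, and column
names are hypothetical, and exact clause ordering may differ in DB2 UDB V8.2) would
produce such a layout:

    -- Partition groups: one on the catalog/coordinator partition only,
    -- one spanning all 16 data partitions (names are hypothetical).
    CREATE DATABASE PARTITION GROUP SDPG ON DBPARTITIONNUMS (0);
    CREATE DATABASE PARTITION GROUP PDPG ON DBPARTITIONNUMS (0 TO 15);

    -- Dimension table on the single-partition group, then replicated across
    -- all data partitions as a replicated MQT.
    CREATE TABLE DIM_CUST (
      CUST_KEY INTEGER NOT NULL PRIMARY KEY,
      NAME     VARCHAR(100)
    ) IN TS_DIM;

    CREATE TABLE DIM_CUST_REP AS (SELECT * FROM DIM_CUST)
      DATA INITIALLY DEFERRED REFRESH IMMEDIATE
      IN TS_REP REPLICATED;
    REFRESH TABLE DIM_CUST_REP;

    -- Fact table hashed on its primary key for even distribution
    -- (the MDC ORGANIZE BY clause is added in Section 4.3).
    CREATE TABLE FACT (
      FACT_ID  BIGINT  NOT NULL,
      DATE_KEY INTEGER NOT NULL,
      CUST_KEY INTEGER NOT NULL,
      PRIMARY KEY (FACT_ID)
    ) IN TS_FACT PARTITIONING KEY (FACT_ID);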

The servers all had a single internal 73 GB drive used for /home. In addition, each
DP used the e325’s internal disk for local DIAGPATH so as to minimize any extraneous
inter-nodal communications that would occur if the DIAGPATH was on an NFS mounted
drive. The nodes also utilized the internal drive for database logs.

4.2 Table Spaces & Physical Layout

Each server exclusively used a full EXP700 enclosure of 14 x 73 GB disks. The
disks were configured as two 6+P RAID5 volumes, visible as two devices to each e325.
Each device had approximately 420 GB of storage space and was divided into three logical
partitions: one used for the DMS raw data table space, another for the DMS raw
temporary table space, and the last for a file system holding the flat files. This horizontal
striping allowed for the maximum number of physical drives per table space. The table
space containing the user-defined staging table used for the load stage was created on the
flat-file file system and was the only SMS (file-based) table space. The page size used
for both the data and temporary table spaces was 16 KB.
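
A sketch of what the corresponding buffer pool and table space DDL could look like
(device paths, container sizes, and names are illustrative assumptions):

    -- 16 KB buffer pools for the data and temporary table spaces.
    CREATE BUFFERPOOL BP_DATA_16K SIZE 50000 PAGESIZE 16K;
    CREATE BUFFERPOOL BP_TEMP_16K SIZE 50000 PAGESIZE 16K;

    -- DMS raw table space striped over both RAID5 devices on each partition
    -- (container sizes in 16 KB pages; 8,388,608 pages = 128 GB per device).
    CREATE TABLESPACE TS_FACT IN DATABASE PARTITION GROUP PDPG
      PAGESIZE 16K MANAGED BY DATABASE
      USING (DEVICE '/dev/raw/raw1' 8388608, DEVICE '/dev/raw/raw2' 8388608)
      BUFFERPOOL BP_DATA_16K;

    -- System temporary table space, also DMS raw over both devices.
    CREATE TEMPORARY TABLESPACE TS_TEMP16K
      PAGESIZE 16K MANAGED BY DATABASE
      USING (DEVICE '/dev/raw/raw3' 10747904, DEVICE '/dev/raw/raw4' 10747904)
      BUFFERPOOL BP_TEMP_16K;

    -- SMS table space on the flat-file file system for the load staging table.
    CREATE TABLESPACE TS_STAGE IN DATABASE PARTITION GROUP PDPG
      PAGESIZE 16K MANAGED BY SYSTEM
      USING ('/flatfiles/stage')
      BUFFERPOOL BP_DATA_16K;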

The DS4500 storage server is capable of providing a theoretical sustained read
throughput of 772 MB/s. With four e325 servers connected to each DS4500, I/O throughput
tests showed approximately 180-190 MB/s per node when all nodes read in parallel; this
test was executed with an OS utility (dd) outside of DB2 UDB. Since each of the e325s
was equipped with two QLA2312 host bus adapters (2 ports each) rated at 2 Gbps per
port, the bottleneck was the storage servers. The table space layout described above
provided the highest possible I/O throughput for DB2 UDB. During a table scan of the
fact table, DB2 UDB delivered a sustained read throughput of approximately 2.6 GB/s
across all nodes.

Figure 4.1: Disk Layout. Each EXP700 enclosure (14 x 73 GB drives) is configured as two
RAID5 (6+P) volumes, /dev/sdb and /dev/sdc, which together hold the data table space
(DMS raw, 256 GB), the temporary table space (DMS raw, 328 GB), and the flat-file file
system (software RAID0, 192 GB).


4.3 Multidimensional Clustering and Indexes

The MDC dimensions and indexes used throughout the benchmark were chosen
prior to seeing the workload. The MDC fact table was dimensioned on the date key, and
nine single-column RID indexes were created on selected dimension-table foreign keys.
Although we had intended to create additional multi-column indexes on the fact table
for Part 2 of Scenario 1 (optimized single stream), this was not done, partly because of
time constraints but mainly because the tremendous performance gains from MQTs made
additional indexes unnecessary. For each replicated dimension table, a unique primary
key index was created.
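
Continuing the hypothetical DDL sketch from Section 4.1, the MDC clustering and the
RID indexes would look roughly like this:

    -- MDC clustering on the date key, added to the fact-table definition.
    CREATE TABLE FACT (
      FACT_ID  BIGINT  NOT NULL,
      DATE_KEY INTEGER NOT NULL,
      CUST_KEY INTEGER NOT NULL,
      PRIMARY KEY (FACT_ID)
    ) IN TS_FACT
      PARTITIONING KEY (FACT_ID)
      ORGANIZE BY DIMENSIONS (DATE_KEY);

    -- Single-column RID index on a dimension foreign key
    -- (one of the nine shown; column name is hypothetical).
    CREATE INDEX FACT_CUST_IX ON FACT (CUST_KEY);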

4.4 Statistical views

The generation of optimal query execution plans relies upon the statistical
information collected by DB2 UDB. The statistics are used to compute estimates of the
filtering (or selectivity) of predicates. Predicates may reference a single table (local
predicates) or multiple tables (join predicates). Distribution statistics are available to
represent uneven data distribution within a single column and are used to improve the
selectivity estimate for local predicates. However, it is difficult to use the distribution
statistics to estimate join selectivities since it isn’t possible to determine which rows will
actually be joined at optimization time. For that reason, the optimizer assumes data
uniformity when computing join predicate selectivity. Fact tables of data warehouses
typically contain dynamic data while dimension data is largely static. Because dimension
attribute data can be positively or negatively correlated with fact attribute data, the base
table statistics do not allow the optimizer to differentiate relationships across tables.
Prior to seeing the workload, it was clear that there would be some queries where the join
predicates would be on non-uniformly distributed columns.

To overcome this problem, we defined statistical views. Statistical views are
views with associated statistics that the optimizer can use to improve cardinality
estimates. A statistical view need not be directly referenced in a query for its statistics
to be used to improve cardinality estimates. Statistical views are planned for general
availability in a future DB2 version; however, the infrastructure is already in place in
DB2 UDB V8.2, and customers can access the technology, under the guidance of DB2
Support, when they encounter an optimization problem.

Consider the following example to observe how statistical views provide information to
the optimizer.
________________________________________________________________________

Example:

- fact table FACT contains 10M rows of sales data
- dimension table CUST contains 5000 customers
- one customer 'CUST#1' comprises 25% of the sales

SELECT ... FROM CUST C, FACT F
WHERE C.CUST_ID = F.CUST_ID
AND C.NAME = 'CUST#1'

In estimating the output cardinality, the optimizer considers:

- for the join predicate, it knows there are only 5000 customers and assumes
  that 1/5000 of the fact rows will qualify
- for the local predicate, it knows that exactly 1 of the 5000 customers qualifies

The optimizer's estimate of the output cardinality is:

estimated cardinality = cardinality(CUST) * cardinality(FACT)
                        * selectivity(join predicate) * selectivity(local predicate)
                      = 5000 * 10,000,000 * 1/5000 * 1/5000
                      = 2000

However, the actual cardinality is 2500000 (25% of the FACT). The inaccurate
cardinality estimate can lead to poor access plan selection.

To provide better information to the optimizer, we create a statistical view:

CREATE VIEW DBA.CUST_FACT_V AS
  (SELECT * FROM CUST C, FACT F
   WHERE C.CUST_ID = F.CUST_ID)

We further specify that the view is enabled for query optimization and provide
basic and distribution statistics for the view.

Note that the view does not contain local predicates, and can be applied to any
query that joins CUST and FACT. When the example select query is compiled, the
optimizer determines that the statistical view can be used to optimize the query. It
will look at the distribution statistics of the view to see that frequent value
'CUST#1' has a count of 2500000 and use this value as the estimated output
cardinality. The accurate cardinality estimate will then allow the optimizer to better
determine the best access plan for the query.
________________________________________________________________________
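
The "enable for optimization and collect statistics" step in the example corresponds
roughly to the following commands, shown here with the syntax that became generally
available in later DB2 releases (under V8.2 the equivalent steps were performed under
DB2 Support guidance):

    -- Allow the optimizer to use the view's statistics, then collect
    -- basic and distribution statistics on it.
    ALTER VIEW DBA.CUST_FACT_V ENABLE QUERY OPTIMIZATION;
    RUNSTATS ON TABLE DBA.CUST_FACT_V WITH DISTRIBUTION;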

We created five statistical views and they provided a clear performance gain. For
example, a query with a runtime of 1600 seconds prior to using statistical views ran in 6
seconds afterwards. The performance improvement provided by statistical views was not
quantifiable for all queries because only one of the 10 initial queries was run to its
completion prior to creating the statistical views. However, several other queries were
tested without statistical views and were manually stopped because of their lengthy
duration. After statistical views were applied, all queries finished in a reasonable time, the
longest of the 10 running for approximately 80 minutes.
4.5 Materialized Query Tables

During the initial unoptimized sequential run, we prepared for the optimized
sequential run where workload-specific objects are permitted. We used the DB2 Design
Advisor to obtain index and MQT recommendations. As mentioned earlier, we did not
create additional indexes, but we did create three MQTs. The MQTs were used for the
optimized sequential run and the remainder of the benchmark.

Of the initial 10 queries, seven routed to one of the three MQTs as did one of the
final surprise queries. In all, eight of the thirteen queries saw performance improvements
from an MQT definition.

The creation of the MQTs required a one-time initial population, taking
approximately five hours for all three. Automatic MQT staging tables and indexes were
also created for each MQT; the former enable concurrent updates to the fact table with
incremental MQT refresh, and the latter help achieve better access plans.

The following process was used for creating MQTs capable of incremental refresh:

1) Create the MQT:


create table MQT_A as
( select... from.. where.. group by...)
data initially deferred refresh deferred;

2) Create the MQT staging table:


create table STG_A for MQT_A propagate immediate;

3) Set integrity for staging table


set integrity for STG_A immediate checked;

4) Perform the initial population of the MQT


refresh table MQT_A;

5) Create indexes and do RUNSTATS


create index MQT_A_PK on MQT_A...;
RUNSTATS on table MQT_A and indexes all;

6) Perform incremental refresh after updates


refresh table MQT_A;
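
As a concrete illustration (the actual benchmark MQT definitions are not reproduced
here; the columns and aggregation below are hypothetical), a fact-table summary MQT of
the kind recommended by the Design Advisor might look like:

    -- Hypothetical daily per-customer summary; count(*) is included so the
    -- MQT can be maintained incrementally through its staging table.
    create table MQT_REV as
      ( select f.date_key, f.cust_key,
               sum(f.revenue) as total_revenue,
               count(*) as row_count
        from fact f
        group by f.date_key, f.cust_key )
      data initially deferred refresh deferred;
    -- Steps 2 through 6 above (staging table, set integrity, initial refresh,
    -- indexes, runstats, incremental refresh) then apply to MQT_REV as shown.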

5 Database Configuration
There was minimal tuning of database manager and database configuration
parameters on the benchmark system. NUM_FREQVALUES and NUM_QUANTILES
were set to good starting values once the schema was received and were not changed
afterwards, as was the case for NUM_IOCLEANERS and NUM_IOSERVERS. The
STMTHEAP was increased to its maximum value to handle the large SQL statements and
to avoid SQL0101N errors. Since no additional tuning was allowed after Scenario 1, a very
small SORTHEAP value was chosen to avoid any potential problems during the multi-stream
runs. For the inserts from the staging table to the fact table, the LOCKLIST was increased
to 100,000.

Two large buffer pools were configured for temp and data table spaces. Each 16
KB page size buffer pool was approximately 800 MB, resulting in DB2 UDB using less
than half of the available 4 GB of memory for buffer pools on each e325. Additional
tuning of DB2 parameters would likely improve times for initial MQT population and for
the ‘unoptimized’ runs without MQTs, but because of the tight schedule and the excellent
performance seen with MQTs, extra tuning was not pursued.

DB2 Database and Database Manager Configuration

• NUM_FREQVALUES 100
• NUM_QUANTILES 100
• STMTHEAP 65535
• LOCKLIST 100000
• NUM_IOCLEANERS 8
• NUM_IOSERVERS 8
• SORTHEAP 2000

DB2 Buffer pool Memory

• TEMP Buffer pool (16K pages) 50000
• DATA Buffer pool (16K pages) 50000

DB2 Registry Variables

• DB2_PARALLEL_IO=*
• DB2_SCATTERED_IO=Y
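
These settings correspond to commands along the following lines (the database name
TELCO is an assumption; the buffer pool DDL is sketched in Section 4.2):

    -- Database configuration parameters (CLP commands, run once per database).
    UPDATE DB CFG FOR TELCO USING NUM_FREQVALUES 100 NUM_QUANTILES 100;
    UPDATE DB CFG FOR TELCO USING STMTHEAP 65535 LOCKLIST 100000 SORTHEAP 2000;
    UPDATE DB CFG FOR TELCO USING NUM_IOCLEANERS 8 NUM_IOSERVERS 8;

    -- Registry variables, set from the shell and picked up at instance restart:
    --   db2set DB2_PARALLEL_IO=*
    --   db2set DB2_SCATTERED_IO=Y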

6 Benchmark Tests
Since the customer was interested in the total cost of ownership, the time for each
stage of the benchmark was recorded and reported. This included the time taken to build
the database (load, index creation, runstats) as well as the performance of the query tests
and updates.

6.1 Data Explosion, Database Create and Load

The database schema and the fact table seed file were sent prior to the start of the
72-hour period and, once the hardware was finalized, the data expansion and partitioning
were kicked off. The seed data was exploded using a ballooning program supplied by the
customer, where each row in the seed data file was expanded into 1,500 rows.
Data explosion:
  7,307,390 rows in seed table (~1.5 GB)
  10,961,085,000 rows in exploded table (~2 TB)

Per data partition:
  685,067,812.5 rows
  137.8 GB needed for flat files

The data explosion and data pre-partitioning was done in a single step to
minimize the amount of temporary space and reduce the time required for these time-
consuming processes. A copy of the seed file was replicated on each of the 16 servers and
each server ran the balloon program to explode the data. The output of the balloon
program was then piped into db2split, where only the data for that DP was kept while the
rest was discarded.

This parallel approach to exploding and partitioning the data eliminated the need
for any inter-nodal file transfer after the explode/split (aside from the seed file) and
reduced the space requirement from 2 TB on a single machine to 1/16th of that on each
machine. The downside to this approach was that each node had to process all the data
while keeping only 1/16th of it. A total of 32 TB of data was processed by db2split across
all nodes, while, ultimately, only 2 TB was written out.

The explode/split step ran in approximately 19 hours. The process was largely
CPU-bound, as each server wrote out only approximately 130 GB (1/16th of 2 TB).
Writing 130 GB in 19 hours on an I/O subsystem capable of roughly 90 MB/s sustained
writes uses only about 2% of the available write throughput, which clearly shows how
CPU-intensive the balloon/db2split processing was. For comparison, if this process had
been done on a single machine and the split data transferred to the other nodes
afterwards, the additional time required would be the time taken to transfer 130 GB of
data to each of the other 15 servers over Gbit Ethernet. The balloon/db2split process
itself would run in approximately the same time, since writing out 16 times the amount
of data would require only ~32% of the disk subsystem's write throughput, so the
balloon/split would still be CPU-bound.

As part of the specification, the time taken for the database creation was part of
the benchmark report. The total time reported for the database build was 19.5 hours with
the fact table load and index creation taking 90% of the time.

Table 6.1: Database Setup Times

Stage                           Time
Data explode/split              19 hours
Database build - Load           4 hours
Database build - Create Index   14 hours
Database build - Runstats       1.5 hours
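
The build stage was driven by the standard DB2 utilities; a minimal per-partition sketch
(file, schema, and object names are assumptions, and the partitioned-load options
actually used are not listed above) is:

    -- Load the pre-split flat file into the fact table on each data partition,
    -- build the indexes, and collect statistics.
    LOAD FROM /flatfiles/fact_split.del OF DEL REPLACE INTO FACT;
    -- ... index creation as sketched in Section 4.3 ...
    RUNSTATS ON TABLE DB2INST1.FACT WITH DISTRIBUTION AND INDEXES ALL;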
6.2 Query and Data Insert workload

The initial 10 queries from the customer were standard star schema queries with
relatively simple joins and a few simple GROUP BYs. The customer provided these
queries to the competing vendors in Oracle syntax, which needed to be converted. Later
on, the customer sent corrections to some of the queries, and Q2 was dropped from the
set of 10 initial queries and replaced by one of the "surprise" queries.

An error in the query conversion from Oracle was discovered during the final
"surprise" query stage, when one of the ad hoc queries did not return any rows. The error
was an incorrect specification of datetime predicates. The root cause was that Oracle
DATE columns include both date and time information, while DB2 UDB has separate
data types for date and time, and its TIMESTAMP type also carries a fractional-seconds
component. When the error was discovered, we re-ran all queries with both the incorrect
and the correct predicates to show that the conversion error did not have a significant
effect on the query timings reported in previous stages.
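
As an illustration of the kind of conversion involved (the column names and literal
values are hypothetical, not taken from the benchmark queries):

    -- Oracle: a DATE column carries both date and time, so a month is usually
    -- selected with a half-open range, e.g.
    --   WHERE CALL_DATE >= TO_DATE('2004-11-01','YYYY-MM-DD')
    --     AND CALL_DATE <  TO_DATE('2004-12-01','YYYY-MM-DD')

    -- DB2 UDB equivalent when the column is a TIMESTAMP:
    SELECT COUNT(*)
    FROM FACT
    WHERE CALL_TS >= TIMESTAMP('2004-11-01-00.00.00')
      AND CALL_TS <  TIMESTAMP('2004-12-01-00.00.00');

    -- If date and time are stored in separate columns, the DATE column alone
    -- can be compared: WHERE CALL_DATE >= '2004-11-01' AND CALL_DATE < '2004-12-01'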

Some of the queries would return millions of rows requiring a large amount of
space to store the results. In addition, writing the large result set would create a
significant bottleneck as the target file system had relatively slow I/O rates. After
discussion with the customer, it was decided to limit the number of rows printed. For the
test results reported in this paper, all rows were fetched by the application but only
50,000 rows were printed to reduce the output file size.

6.2.1 Scenario 1 – Single Stream

The initial setup of the database consisted of only the primary indexes, statistical
views, referential integrity constraints between the fact table and the dimensions, and
MDC clustering on the date key of the fact table (see Section 4.3). The index
configuration was chosen based on probable joins between the fact table and the
dimensions. The partitioning key was chosen to ensure even distribution of fact table
data across all 16 nodes.

The only difference between the unoptimized and the optimized single-stream
runs was that the latter used MQTs. Only three MQTs were created as described earlier in
the paper, and seven of the 10 initial queries benefited from them. The MQTs provided a
tremendous performance gain for the queries that routed to them. After the MQTs were
applied, only two of the ten benchmark queries were relatively long running, while the
other eight ran in sub-10-second time. The timings of all queries are compared with
optimized times below in Table 6.2.
Table 6.2: Single Stream Tests: Query Times (seconds)

        Unoptimized    Optimized
Q1            5.568        6.091
Q2          N/A (*)        3.242
Q3         2383.831        5.991
Q4         1162.681     1169.059
Q5         2189.706        6.528
Q6         2459.580        4.941
Q7         4251.393        5.177
Q8         1449.679        6.531
Q9         4945.123        2.100
Q10         723.141      723.765
Total     19570.702     1933.425
*Q2 does not have an unoptimized time because the customer changed the query after the unoptimized
phase.

6.2.2 Scenario 2 – Multiple Streams

The multi-stream tests consisted of concurrent user runs of 10, 30, and 50 streams.
In these tests, each stream ran one of the 10 queries repeatedly until all streams had
finished at least one query. The 30 and 50 stream runs had 3 and 5 streams running each
query continuously. For example, for the 30 stream run, stream 1, 11 and 21 all ran query
1 repeatedly; the 50 stream test had streams 1, 11, 21, 31 and 41 all running query 1
continuously.

Since the multi-stream runs were run using the same configuration as the Scenario
1 – Optimized runs, the time to complete each test was dependent on Q4 and/or Q10
because those were the only long-running queries after the MQTs were created. The
customer was interested in not only the time taken to run each multi-stream test, but also
the total number of queries that completed. The multi-stream run results in Table 6.3
highlighted the excellent scalability of the DB2 UDB solution.

Table 6.3: Multi-Stream Tests: Query Counts and Time


10 Streams 30 Streams 50 Streams
Q1 305 616 872
Q2 754 1409 1738
Q3 511 961 1251
Q4 1 3 5
Q5 192 311 467
Q6 357 567 804
Q7 335 663 1002
Q8 101 180 251
Q9 549 1092 1375
Q10 1 3 5
Total Queries 3106 5805 7770
Time 2:31:53 5:15:07 8:01:37
6.2.3 Scenario 3 - Multiple Streams with Concurrent Load

The concurrent load plus multiple user streams was achieved by first loading the
new data into a user-defined staging table, then inserting the data into the fact table from
the staging table. Since there were 29 million rows to be loaded into the fact table, the
inserts had to be done in parts so that commits could be performed throughout the update
process. An index was created on the date key of the staging table data in order to
perform the inserts based on date ranges. The insert from the staging table was broken up
into 30 insert/select chunks, each inserting approximately 1 million rows into the fact
table at a time. After all rows were inserted into the fact table, the MQTs were refreshed
to reflect the new rows. The timings for each step are presented in Table 6.4.

Table 6.4: Multi-Stream and Load: Load + Insert Times


Step Time
Load Staging 1:04:09
Insert Fact 0:36:10
Refresh MQTs 0:06:57
Total 1:47:16
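
A sketch of the load-and-insert flow described above (file, table, and date values are
hypothetical):

    -- 1) Load the ~29 million new rows into a user-defined staging table
    --    and index it on the date key for range-based inserts.
    LOAD FROM /flatfiles/new_rows.del OF DEL REPLACE INTO STG_FACT_NEW;
    CREATE INDEX STG_DATE_IX ON STG_FACT_NEW (DATE_KEY);

    -- 2) Insert into the fact table in ~30 date-range chunks of roughly
    --    1 million rows each, committing after every chunk.
    INSERT INTO FACT
      SELECT * FROM STG_FACT_NEW
      WHERE DATE_KEY = 20041101;   -- one chunk per date value (illustrative)
    COMMIT;
    -- ... repeated for each date range ...

    -- 3) Incrementally refresh the MQTs from their staging tables.
    REFRESH TABLE MQT_A;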

The excellent performance seen in the single- and multiple-stream runs was mainly
attributed to the use of MQTs, and a common concern with MQTs is the cost of
MQT maintenance due to update activity. The incremental refresh of the MQTs took
only 7 minutes, compared to the 5 hours needed for the full refresh when the MQTs were
first created.

Table 6.5 compares the multi-stream query run without and with concurrent
inserts to the FACT table. It demonstrates well the ability of the system to bring in new
data while the database is being queried.

Table 6.5: Multi-Stream and Load: Stream Time Compared to 10 Stream Standalone
10 Streams Standalone 10 Streams with Load
Q1 305 151
Q2 754 673
Q3 511 596
Q4 1 1
Q5 192 159
Q6 357 706
Q7 335 289
Q8 101 129
Q9 549 554
Q10 1 1
Total Queries 3106 3259
Time 2:31:53 2:45:02
6.2.4 Scenario 4 – Surprise Queries Single Stream

The final stage of the benchmark involved running the final three 'surprise'
queries without any extra tuning. One of the original queries had been deemed too
unrealistic and was replaced by one of the surprise queries; the customer then sent a
replacement for the surprise query that had been moved into the original set. It was
during this final stage that we discovered the Oracle-to-DB2 datetime predicate
conversion error and executed additional tests to show that previous test results were
not affected by it.

The performance of the final three queries was mixed. One of the final queries
routed to a previously defined MQT and ran in seconds; another ran with times
comparable to the two longer-running queries from the initial set; and the third ran for
well over an hour.

Table 6.6: Surprise Queries Single Stream: Query Times (seconds)

              Time
Ad hoc Q1        1.511
Ad hoc Q2     5131.363
Ad hoc Q3     1231.282

7 Summary of results
After the PoC was complete, we learned how the IBM and competing vendor
solutions compared. The following table summarizes the results:

Summary of results             DB2 ICE Cluster   The Competition
                               (hh:mm:ss)        (hh:mm:ss)
Scenario 1 – unoptimized       5:26:10           Approx. 2.7 hours
Scenario 1 – optimized         0:32:13           N/A*
Scenario 2 – 10 users run      2:31:53           Did not complete
Scenario 2 – 30 users run      5:15:07           Did not complete
Scenario 2 – 50 users run      8:01:37           Did not complete
Scenario 3 – 10 users + load   2:45:02           Did not complete

* The competing vendor does not have additional optimization, so their optimized times
are the same as their unoptimized times.

8 Conclusions
The benchmark showcased many of the new technologies that DB2 UDB uses to
achieve high performance in large, star-schema data warehouse environments. This
demonstration established that unmatched performance can be achieved with relatively
low setup time and effort, using a cost-effective DB2 Integrated Cluster Environment
solution. While total cost of ownership is an important consideration for this customer,
as it is for many other customers, it should not come at the cost of performance.
Features such as the Design Advisor will enable customers to implement
performance-enhancing technologies with relatively little effort.

The competing vendor's solution put its major focus on ease of setup but was
ultimately unable to complete most of the requirements of the PoC because of the size,
complexity, and performance requirements of the workload. Their solution was faster
than the unoptimized DB2 ICE configuration, but that was already the limit of its
performance potential.

The key to the success of this benchmark was the technology in DB2 UDB that
enabled much faster performance via, among other things, the use of statistical views and
MQTs. Overall the DB2 UDB solution performed five times faster than the best
performance of the competitor’s solution.

Disclaimer

All of the information published in this document is the result of product tests run in laboratory
environments. Particular results may vary due to a variety of circumstances, such as workload,
configuration or other factors. Accordingly, IBM DOES NOT WARRANT THE SAME RESULTS WILL
BE OBTAINED FROM ANYONE’S OWN USE OF THIS INFORMATION AND/OR
IMPLEMENTATION OF ANY TECHNIQUES MENTIONED. ACCORDINGLY, ALL INFORMATION
IS PROVIDED SOLELY FOR THE READER'S INFORMATION. IT IS PROVIDED ON AN "AS IS"
BASIS AND IBM EXPRESSLY DISCLAIMS ALL WARRANTIES TO SUCH INFORMATION.

Trademarks

The IBM logo, IBM, DB2, DB2 Universal Database, and eServer are trademarks or registered trademarks
of International Business Machines Corporation in the United States, other countries, or both.

Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

Other company, product, or service names may be trademarks or service marks of others.

© Copyright International Business Machines Corporation, 2005. All rights reserved.
