Integrated
Subject-oriented
Time-variant
Non-volatile
Benefits of Data warehouse (video)
The Data Warehouse
The Data Warehouse is an integrated,
subject-oriented, time-variant, non-volatile
database that provides support for decision
making.
• [...] to present a unified view of the data to the users.
• Time-variant because data in the warehouse is only accurate
and valid at some point in time or over some time interval. The
time-variance of the data warehouse is also shown in the
extended time that the data is held, the implicit or explicit
association of time with all data, and the fact that the data
represents a series of snapshots.
• Non-volatile as the data is not updated in real time but is
refreshed from operational systems on a regular basis. New
data is always added as a supplement to the database, rather
than a replacement. The database continually absorbs this
new data, incrementally integrating it with the previous data.
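The time-variant and non-volatile properties can be illustrated with a minimal Python sketch. It assumes a warehouse modeled as a plain list of rows; the names `refresh` and `as_of`, the `snapshot_date` field, and the sample values are hypothetical and chosen only for illustration.

```python
from datetime import date

# Hypothetical warehouse table: a list of rows, each stamped with a snapshot date.
warehouse = []

def refresh(warehouse, operational_rows, snapshot_date):
    """Append a dated snapshot of the operational data; old rows are never changed."""
    for row in operational_rows:
        warehouse.append({**row, "snapshot_date": snapshot_date})

def as_of(warehouse, product, day):
    """Return the most recent snapshot of a product on or before `day`, or None."""
    rows = [r for r in warehouse
            if r["product"] == product and r["snapshot_date"] <= day]
    return max(rows, key=lambda r: r["snapshot_date"]) if rows else None

# Two periodic refreshes from the operational system (illustrative values).
refresh(warehouse, [{"product": "A", "stock": 100}], date(2024, 1, 31))
refresh(warehouse, [{"product": "A", "stock": 80}], date(2024, 2, 29))

# Both snapshots are kept, so the value valid at any point in time can be recovered.
mid_february = as_of(warehouse, "A", date(2024, 2, 15))
```

Each refresh supplements the existing rows instead of replacing them, so the list behaves like a series of snapshots.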
Figure 13.3
A Data Warehouse Framework and Views
The Data Warehouse
Twelve Rules That Define a Data Warehouse
1. The Data Warehouse and operational environments are
separated.
2. The Data Warehouse data are integrated.
3. The Data Warehouse contains historical data over a
OLAP vs. OLTP
We can divide IT systems into transactional (OLTP) and analytical
(OLAP).
In general we can assume that OLTP systems provide source data
to data warehouses, whereas OLAP systems help to analyze it.
OLTP
OLTP deals with recording the real-time transactions
that occur in operational systems, such as
e-commerce transactions and banking ATM
systems.
OLTP (On-line Transaction Processing) is
characterized by a large number of short on-
line transactions (INSERT, UPDATE, DELETE).
The main emphasis for OLTP systems is on very fast
query processing, maintaining data integrity in
multi-access environments, and effectiveness measured
by the number of transactions per second.
An OLTP database holds detailed, current data, and the
schema used to store transactional data is the
entity model (usually 3NF).
On-Line Analytical Processing
On-Line Analytical Processing (OLAP)
deals with analyzing the data stored in the
data warehouse. It is
an advanced data analysis environment that
[...] aggregations.
For OLAP systems, response time is the effectiveness
measure. OLAP applications are widely used by data
mining techniques. An OLAP database holds
aggregated, historical data, stored in multi-dimensional
schemas (usually a star schema).
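The contrast between the two workloads can be sketched with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the `orders` table and its values are hypothetical, and a single in-memory table stands in for what would normally be separate operational and warehouse databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)

# OLTP-style work: many short transactions, each touching one row.
# Using the connection as a context manager commits each one separately.
with conn:
    conn.execute("INSERT INTO orders (region, amount) VALUES (?, ?)", ("North", 120.0))
with conn:
    conn.execute("INSERT INTO orders (region, amount) VALUES (?, ?)", ("South", 75.5))
with conn:
    conn.execute("INSERT INTO orders (region, amount) VALUES (?, ?)", ("North", 30.0))
with conn:
    conn.execute("UPDATE orders SET amount = ? WHERE id = ?", (80.0, 2))

# OLAP-style work: one read-only query that aggregates over many rows.
totals = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY region ORDER BY region"
).fetchall()
```

The first part is judged by how many such small transactions complete per second; the second by how quickly the aggregate answer comes back.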
More videos
Introduction to OLAP
https://www.youtube.com/watch?v=2ryG3Jy6eIY
Excel Tutorial: What is Business
Intelligence and an OLAP Cube?
https://www.youtube.com/watch?v=yoE6bgJv08E
On-Line Analytical Processing
Multidimensional Data Analysis Techniques
The processing of data in which data are viewed
as part of a multidimensional structure.
Multidimensional view allows end users to
consolidate or aggregate data at different levels.
Multidimensional view allows a business analyst
to easily switch business perspectives.
Refer to the example: Excel
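Consolidating data at different levels and switching perspectives can also be sketched in a few lines of Python. The sales rows, the dimension names, and the `aggregate` helper below are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict

# Illustrative sales facts: (region, product, quarter, units).
sales = [
    ("East", "Widget", "Q1", 10),
    ("East", "Widget", "Q2", 15),
    ("East", "Gadget", "Q1", 7),
    ("West", "Widget", "Q1", 4),
    ("West", "Gadget", "Q2", 9),
]
DIMENSIONS = ("region", "product", "quarter")

def aggregate(facts, by):
    """Sum units grouped by the chosen dimensions (a tuple of dimension names)."""
    positions = [DIMENSIONS.index(name) for name in by]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[i] for i in positions)
        totals[key] += row[3]
    return dict(totals)

# The same facts viewed from different business perspectives and levels.
by_region = aggregate(sales, ("region",))
by_product_quarter = aggregate(sales, ("product", "quarter"))
grand_total = aggregate(sales, ())
```

Changing the `by` argument is the code equivalent of an analyst switching perspective: the underlying facts stay the same and only the grouping changes.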
On-Line Analytical Processing
OLAP Architecture
Three Main Modules
OLAP Graphical User Interface (GUI)
OLAP Analytical Processing Logic
OLAP Data Processing Logic
As Figure 13.17 illustrates, OLAP systems are designed to use both operational and
data warehouse data. The figure shows the OLAP system components on a single computer,
but this single-user scenario is only one of many. In fact, one problem with the
installation shown here is that each data analyst must have a powerful computer to store
the OLAP system and perform all data processing locally.
Types of On-Line Analytical Processing
Multidimensional OLAP (continued)
Relational Vs. Multidimensional OLAP
Table 13.8
Star Schema
• The star schema is a data-modeling technique used
to map multidimensional decision support into a
relational database.
• Facts
• Facts are numeric measurements (values) that represent a
specific business aspect or activity. For example, sales
figures are numeric measurements that represent product
and service sales.
• Facts commonly used in business data analysis are units,
costs, prices, and revenues. Facts are normally stored in a
fact table that is the center of the star schema.
• The fact table contains facts that are linked through their
dimensions, which are explained in the next section.
• Facts can also be computed or derived at run time. Such
computed or derived facts are sometimes called metrics to
differentiate them from stored facts.
• The fact table is updated periodically with data from
operational databases.
Star Schema
• Dimensions
• Dimensions are qualifying characteristics that provide
additional perspectives to a given fact. For instance,
sales might be compared by product from region to
region and from one time period to the next.
Star Schema
• Attributes
• Each dimension table contains attributes. Attributes are
often used to search, filter, or classify facts.
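A small star schema can be sketched with Python's built-in `sqlite3` module. The table names (`fact_sales`, `dim_product`, `dim_region`, `dim_time`), their columns, and the sample rows are assumptions made for illustration and do not come from the figures referenced in these slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, city TEXT, country TEXT);
CREATE TABLE dim_time    (time_id    INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
-- The fact table sits at the center and points at every dimension table.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    region_id  INTEGER REFERENCES dim_region(region_id),
    time_id    INTEGER REFERENCES dim_time(time_id),
    units      INTEGER,
    price      REAL
);
""")
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware")])
conn.executemany("INSERT INTO dim_region VALUES (?, ?, ?)",
                 [(1, "Lyon", "France"), (2, "Turin", "Italy")])
conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)",
                 [(1, 2024, "Q1"), (2, 2024, "Q2")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
                 [(1, 1, 1, 10, 2.0), (1, 2, 1, 5, 2.0), (2, 1, 2, 3, 4.0)])

# Revenue is a derived fact (a "metric"): computed at query time, not stored.
# Dimension attributes (country, year) classify and filter the facts.
revenue_by_country = conn.execute("""
    SELECT r.country, SUM(f.units * f.price) AS revenue
    FROM fact_sales AS f
    JOIN dim_region AS r ON r.region_id = f.region_id
    JOIN dim_time   AS t ON t.time_id   = f.time_id
    WHERE t.year = 2024
    GROUP BY r.country
    ORDER BY r.country
""").fetchall()
```

The stored facts are `units` and `price`; the attributes `country` and `year` live in the dimension tables and are reached through joins.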
• For example, all sales offices are rolled up to the sales
department or sales division to anticipate sales trends.
• Drill-down
• Drill-down is a technique that allows users to navigate
through the details.
• For instance, users can view the sales by individual products
that make up a region's sales
• Slicing and dicing
• Slicing and dicing is a feature whereby users can take out
(slicing) a specific set of data from the OLAP cube and view
(dicing) the slices from different viewpoints.
• These viewpoints are sometimes called dimensions (such as
looking at the same sales by salesperson or by date or by
customer or by product or by region, etc.)
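The operations described above can be sketched in plain Python, following the definitions given on this slide. The sales rows and the `total_by` helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Illustrative sales facts; all names and values are made up.
sales = [
    {"region": "East", "product": "Widget", "salesperson": "Ana", "amount": 100},
    {"region": "East", "product": "Gadget", "salesperson": "Ben", "amount": 60},
    {"region": "West", "product": "Widget", "salesperson": "Cy", "amount": 40},
    {"region": "East", "product": "Widget", "salesperson": "Ben", "amount": 25},
]

def total_by(rows, dimension):
    """Sum the amount of the given rows, grouped by one dimension."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[dimension]] += row["amount"]
    return dict(totals)

# Roll-up: consolidate individual sales to the region level.
by_region = total_by(sales, "region")

# Slicing: take out a specific set of data, here the East region only.
east = [row for row in sales if row["region"] == "East"]

# Drill-down: view the individual products that make up East's sales.
east_by_product = total_by(east, "product")

# Dicing: view the same slice from a different viewpoint.
east_by_salesperson = total_by(east, "salesperson")
```

Both views of the East slice add up to the same regional total; only the viewpoint differs.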
Example of Aggregation in
A Location Attribute Hierarchy
Figure 13.15
Attribute Hierarchies In Multidimensional Analysis
Figure 13.16
Data Warehouse Implementation Road Map
Figure 13.21
• Refer to the following video about “Data Warehouse
Architecture”
• https://www.youtube.com/watch?v=CHYPF7jxlik