
1

Course Overview
What is a Data Warehouse

OLTP Vs. Data Warehousing

Data Warehousing Architecture

Data Warehousing Schemas & Objects

Physical Design in Data Warehouse

Definition of Data Warehousing
2
Course Overview

Data Warehousing Basic Design Approaches

Data Warehousing Operational Processes

Technical Problems in Data Warehousing

Representative DSS Tools

Business Intelligence
3
What is a Data Warehouse?

A data warehouse is a relational database that is designed for query and analysis
rather than for transaction processing. It usually contains historical data derived
from transaction data.

A data warehouse environment includes an extraction, transportation,
transformation, and loading (ETL) solution, online analytical processing (OLAP)
and data mining capabilities, client analysis tools, and other applications that
manage the process of gathering data and delivering it to business users.

It is a series of processes, procedures and tools (h/w & s/w) that help the
enterprise understand more about itself, its products, its customers and the
market it serves
4
Facts!

A data warehouse is NOT a specific technology.
It is NOT possible to purchase a data warehouse, but it is possible to build one.
5
Why Data Warehousing?
Need for Intelligent Information in a Competitive Market

Who are the potential customers?
Which products are sold the most?
What are the region-wise preferences?
What are the competitor products?
What are the projected sales?
What if you sell more quantity of a particular product? What will be the impact on revenue?
What are the results of the promotion schemes introduced?
6
Defining Data Warehouse

"A data warehouse is a subject-oriented, integrated, nonvolatile and time-variant
collection of data in support of management's decisions."
- William Inmon
7
Subject Oriented

The data in the data warehouse is organized around the major subjects of the enterprise (i.e. the high-level entities).

The orientation around the major subject areas causes the data warehouse design to be data driven.

Operational systems, by contrast, are designed around applications and functions, e.g. loans, savings and credit cards in the case of a bank, whereas the data warehouse is designed around subjects such as Customer, Product and Vendor.

[Diagram: operational systems are organized by processes or tasks; the data warehouse is organized by subjects such as Customer, Supplier and Product.]
8
Time Variant

[Diagram: the key of every data warehouse record contains an element of time.]

Data is stored as a series of snapshots or views which record the data as it exists at points in time.
It helps in business trend analysis.
In contrast to the OLTP environment, data warehouses focus on change over time; that is what is meant by time variant.
9
Integrated

Data is stored once in a single integrated location.

[Diagram: customer data stored in several source databases (an auto policy processing system, a fire policy processing system, and FACTS, LIFE, commercial and accounting applications) is consolidated into a single data warehouse database for the subject Customer.]

Integration is closely related to subject orientation: data from disparate sources needs to be put into a consistent format, resolving problems such as naming conflicts and inconsistencies.
10
Non-Volatile
Existing data in the warehouse is not overwritten or updated.
[Diagram: production applications update, insert into and delete from production databases; data from the production databases and external sources is loaded into the read-only data warehouse database within the data warehouse environment.]
This is logical because the purpose of a data warehouse is to enable you to
analyze what has occurred.
11
So, what's different between OLTP and a Data Warehouse?
12
OLTP vs. Data Warehouse
OLTP systems are tuned for known transactions and workloads, while the workload
of a data warehouse is not known in advance

Special data organization, access methods and implementation methods are
needed to support data warehouse queries (typically multidimensional queries)

e.g., the average amount spent on phone calls between 9 AM and 5 PM in Pune during
the month of December
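
A minimal sketch of how such a multidimensional query might look, using Python's built-in sqlite3 module; the call_fact table, its columns and the sample rows are illustrative assumptions and not part of the course material.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE call_fact (
            call_date TEXT,      -- e.g. '2022-12-05'
            call_hour INTEGER,   -- hour of day, 0-23
            city      TEXT,
            amount    REAL
        );
        INSERT INTO call_fact VALUES
            ('2022-12-05', 10, 'Pune',   4.50),
            ('2022-12-05', 14, 'Pune',   2.25),
            ('2022-12-06', 20, 'Pune',   9.00),   -- outside 9AM-5PM, excluded
            ('2022-12-07', 11, 'Mumbai', 3.10);   -- different city, excluded
    """)

    # Average amount spent on calls between 9 AM and 5 PM in Pune during December
    avg_amount, = conn.execute("""
        SELECT AVG(amount)
        FROM   call_fact
        WHERE  city = 'Pune'
          AND  call_hour BETWEEN 9 AND 17
          AND  strftime('%m', call_date) = '12'
    """).fetchone()
    print(avg_amount)   # -> 3.375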

13
OLTP vs. Data Warehouse
OLTP                              WAREHOUSE (DSS)
Application oriented              Subject oriented
Used to run business              Used to analyze business
Detailed data                     Summarized and refined
Current, up-to-date data          Snapshot data
Isolated data                     Integrated data
Repetitive access                 Ad-hoc access
Clerical user                     Knowledge user (manager)
14
OLTP vs Data Warehouse
OLTP                                     DATA WAREHOUSE
Performance sensitive                    Performance relaxed
Few records accessed at a time (tens)    Large volumes accessed at a time (millions)
Read/update access                       Mostly read (batch update)
No data redundancy                       Redundancy present
Database size 100 MB - 100 GB            Database size 100 GB - a few terabytes
15
OLTP vs Data Warehouse
OLTP                                               DATA WAREHOUSE
Transaction throughput is the performance metric   Query throughput is the performance metric
Thousands of users                                 Hundreds of users
Managed in entirety                                Managed by subsets
16
To summarize ...
OLTP Systems are
used to run a business




The Data Warehouse helps to
optimize the business
17
Data Warehouse Architectures
Centralized


In a centralized architecture there is only one data warehouse, which stores all data
necessary for business analysis. As noted earlier, the disadvantage compared with
distributed approaches is a loss of performance.

Central Architecture
18
Federated

In a federated architecture the data is logically consolidated but stored in
separate physical databases, at the same or at different physical sites. The local
data marts store only the relevant information for a department.
The amount of data is reduced in contrast to a central data warehouse. The level
of detail is enhanced.
Federated Architecture
Data Warehouse Architectures Contd
19

Tiered:

A tiered architecture is a distributed data approach. Building the warehouse cannot
be done in one step because many sources have to be integrated into it.
At the first level, the data of all branches in one region is collected; at the
second level, the data from the regions is integrated into one data warehouse.

Advantages:

Faster response time, because the data is located closer to the client applications.
Reduced volume of data to be searched.
Tiered Architecture
Data Warehouse Architectures Contd
20
Complete Warehouse Solution Architecture

[Diagram: data sources (operational data, legacy data, and external data sources such as The Post and VISA) feed an extract / transform / load layer into the organizationally structured enterprise data warehouse; departmentally structured data marts (e.g. Sales, Inventory, Purchase) are sourced from it; metadata supports every layer (data sources, data management, access). Moving from asset assembly (and management) to asset exploitation, data becomes information and then knowledge.]
21

Data Warehouse Architecture Components

Data Sources (disparate data sources):
Legacy data
Operational data
External data sources

Data Management:
Metadata - at all levels of the data warehouse, information is required to support the maintenance and use of the data warehouse.
Data Mart - a data mart is a subject-oriented data warehouse.
22
Introduction To Data Marts

What is a Data Mart

From the data warehouse, atomic data flows to various departments for their
customized needs. If this data is periodically extracted from the data warehouse
and loaded into a local database, it becomes a data mart. The data in a data mart
has a different level of granularity than that of the data warehouse. Since the data
in data marts is highly customized and lightly summarized, the departments can
do whatever they want without worrying about resource utilization. The departments
can also use whichever analytical software they find convenient. The cost of
processing becomes very low.
23
Data Mart Overview
[Diagram: the data warehouse feeds departmental data marts (DM Sales, DM Marketing, DM Finance, DM HR) used by sales representatives and analysts, human resources, and by financial analysts, strategic planners and executives. Data marts satisfy 80% of the local end-users' requests.]
24
From The Data Warehouse To Data Marts
[Diagram: data flows from the organizationally structured data warehouse (normalized, detailed data, more history) to departmentally structured data marts and on to individually structured data (less history); along the way, data becomes information.]
25
Data Warehousing SCHEMAS & OBJECTS

A schema is a collection of database objects, including tables, views,
indexes, and synonyms.

There is a variety of ways of arranging schema objects in the schema
models designed for data warehousing. They are:

Star Schema
Snowflake Schema
Galaxy Schema

26
Star Schema:
It consists of a fact table connected to a set of dimension tables.
Data in the dimension tables is de-normalized.

Snowflake Schema:
It is a refinement of the star schema in which some dimension hierarchies
are normalized into a set of dimension tables.

Galaxy Schema:
Multiple fact tables share dimension tables; viewed as a collection of stars,
it is therefore called a galaxy schema.


27
Star Schema

A star schema is a highly de-normalized, query-centric model where
information is broken into two groups: facts and dimensions.

Sales_Fact: TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, plus the required data (business metrics, or measures)
Time_Dim: TimeKey, TheDate, ...
Employee_Dim: EmployeeKey, EmployeeID, ...
Branch_Dim: BranchID, BranchNo, ...
Customer_Dim: CustomerKey, CustomerID, ...
Shipper_Dim: ShipperKey, ShipperID, ...
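
A minimal sketch of the star schema above, using Python's sqlite3 module; the two measure columns (SalesAmount, UnitsSold) and the data types are illustrative assumptions, and the Product and Branch dimensions from the figure are omitted for brevity.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimension tables: de-normalized, descriptive attributes
        CREATE TABLE Time_Dim     (TimeKey     INTEGER PRIMARY KEY, TheDate    TEXT);
        CREATE TABLE Employee_Dim (EmployeeKey INTEGER PRIMARY KEY, EmployeeID TEXT);
        CREATE TABLE Customer_Dim (CustomerKey INTEGER PRIMARY KEY, CustomerID TEXT);
        CREATE TABLE Shipper_Dim  (ShipperKey  INTEGER PRIMARY KEY, ShipperID  TEXT);

        -- Fact table: one foreign key per dimension plus the measures
        CREATE TABLE Sales_Fact (
            TimeKey     INTEGER REFERENCES Time_Dim(TimeKey),
            EmployeeKey INTEGER REFERENCES Employee_Dim(EmployeeKey),
            CustomerKey INTEGER REFERENCES Customer_Dim(CustomerKey),
            ShipperKey  INTEGER REFERENCES Shipper_Dim(ShipperKey),
            SalesAmount REAL,       -- illustrative measure
            UnitsSold   INTEGER     -- illustrative measure
        );
    """)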
28
Snowflake Schema (Figure 32.2)

Fact table:
Sales_Fact: timeID {FK}, propertyID {FK}, branchID {FK}, clientID {FK}, promotionID {FK}, staffID {FK}, ownerID {FK}, offerPrice, sellingPrice, saleCommission, saleRevenue

Dimension tables (with the branch location hierarchy normalized):
Branch_Dim: branchID {PK}, branchNo, branchType, city {FK}
City: city {PK}, region {FK}
Region: region {PK}, country
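
A short sketch of the same branch hierarchy in snowflake form, again assuming sqlite3 and illustrative data types; in a pure star schema, city, region and country would instead be de-normalized into columns of Branch_Dim.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- The location hierarchy is normalized into its own dimension tables
        CREATE TABLE Region     (region     TEXT PRIMARY KEY,
                                 country    TEXT);
        CREATE TABLE City       (city       TEXT PRIMARY KEY,
                                 region     TEXT REFERENCES Region(region));
        CREATE TABLE Branch_Dim (branchID   INTEGER PRIMARY KEY,
                                 branchNo   TEXT,
                                 branchType TEXT,
                                 city       TEXT REFERENCES City(city));
    """)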
29

Galaxy Schema

Multiple groups of facts linked by a few common dimensions.

[Diagram: three fact tables (Fact1, Fact2, Fact3) share dimensions Dimension1 through Dimension7.]
30
Data Warehousing Objects
All three types of schemas are described in the Data Modeling section

Various Objects used in Data Warehousing are:

Fact Tables
Dimension Tables
Hierarchies
Unique Identifiers
Relationships



31
Data Warehousing Objects
Fact Tables:

Represent a business process, i.e., they model the business process as an artifact in
the data model

Contain the measurements or metrics or facts of business processes
"monthly sales number" in the Sales business process
most are additive (sales this month), some are semi-additive (balance as of),
some are not additive (unit price)

The level of detail is called the grain of the table

Contain foreign keys for the dimension tables


32
Fact Types:

Additive facts:
Additive facts are facts that can be summed up across all of the dimensions
in the fact table.

Semi-additive facts:
Semi-additive facts are facts that can be summed up for some of the dimensions
in the fact table.

Non-additive facts:
Non-additive facts are facts that cannot be summed up for any of the
dimensions present in the fact table.
33
Examples to illustrate additive, semi-additive and non-additive facts:

Fact table: Date, Store, Product, Sales_Amount

The purpose of this table is to record the Sales_Amount for each product in each store
on a daily basis. Sales_Amount is the fact.

In this case, Sales_Amount is an additive fact, because we can sum up this fact along
any of the 3 dimensions present in the fact table: date, store, and product.
34
Example of semi-additive and non-additive facts:

Fact table: Date, Account, Current_Balance, Profit_Margin

The purpose of this table is to record the current balance for each account at the end of
each day, as well as the profit margin for each account for each day.

Current_Balance and Profit_Margin are the facts.

Current_Balance is a semi-additive fact: it makes sense to add it up across all accounts
(what is the total current balance for all accounts in the bank?), but it does not
make sense to add it up through time.

Profit_Margin is a non-additive fact, because it does not make sense to add it up at
either the account level or the day level.
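
A small sketch in plain Python illustrating the distinction; the account rows and numbers are made up for illustration.

    # (date, account, current_balance, profit_margin) for two accounts, two days
    rows = [
        ("2024-01-01", "A", 100.0, 0.02),
        ("2024-01-01", "B", 250.0, 0.05),
        ("2024-01-02", "A", 120.0, 0.03),
        ("2024-01-02", "B", 240.0, 0.04),
    ]

    # Semi-additive: summing Current_Balance across accounts for one day is
    # meaningful (total balance held by the bank on that day) ...
    total_jan_1 = sum(bal for d, _, bal, _ in rows if d == "2024-01-01")  # 350.0

    # ... but summing the same fact across days is not: 350 + 360 = 710
    # does not describe anything the bank actually holds.
    not_meaningful = sum(bal for _, _, bal, _ in rows)                    # 710.0

    # Non-additive: Profit_Margin cannot be summed along any dimension;
    # an aggregate such as an average has to be used instead.
    avg_margin = sum(m for _, _, _, m in rows) / len(rows)                # 0.035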
35
Types of fact tables:

Based on the above classifications, there are two types of fact tables:

Cumulative
Snapshot

Cumulative: This type of fact table describes what has happened over a period of time.
For example, it may describe the total sales by product by store by day. The facts in
this type of fact table are mostly additive. The first example above is a cumulative
fact table.

Snapshot: This type of fact table describes the state of things at a particular instant
of time, and usually includes more semi-additive and non-additive facts. The second
example above is a snapshot fact table.
36
Data Warehousing Objects Contd.
Dimension Tables:

Dimension tables:
Define the business in terms already familiar to users
Wide rows with lots of descriptive text
Small tables (about a million rows)
Joined to the fact table by a foreign key
Heavily indexed

Typical dimensions:
time periods, geographic regions (markets, cities), products, customers, salespersons, etc.
37
Dimension Table Types

Slowly Changing Dimensions
Junk Dimensions
Conformed Dimensions
Degenerate Dimensions
38

Slowly Changing Dimensions (SCD):

Various data elements in a dimension undergo changes (e.g. changes in
attributes or hierarchical structures) which need to be captured for analysis.

The SCD problem is a common one particular to data warehousing. In a nutshell,
it applies to cases where the attribute of a record varies over time. For example:

Customer key   Name        State
1001           Christina   Illinois

Christina is a customer who first lived in Chicago, Illinois. At a later date, she moved to
Los Angeles, California. How should the table be modified to reflect this change?

This is a Slowly Changing Dimension problem.
39
Types of SCD

In general there are 3 ways to solve this type of problem, categorized as follows:

Type 1: The new record replaces the original record. No trace of the old record exists.

Type 2: A new record is added to the customer dimension table.

Type 3: The original record is modified to reflect the change.
40

Type 1:

The new record replaces the original record. No trace of the old record exists.

E.g.:

Customer key   Name        State
1001           Christina   Illinois

After Christina moved from Illinois to California, the new information replaces the
old record and we have the following table:

Customer key   Name        State
1001           Christina   California

Advantages:
This is the easiest way to handle a slowly changing dimension, since there
is no need to keep track of the old information.

Disadvantages:
All history is lost. By applying this methodology, it is not possible to
trace back in history. For example, in the above case the company would not be able to
know that Christina lived in Illinois before.
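
A minimal sketch of a Type 1 change using sqlite3; the table and column names follow the example above, everything else is an assumption.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY,"
                 " name TEXT, state TEXT)")
    conn.execute("INSERT INTO customer_dim VALUES (1001, 'Christina', 'Illinois')")

    # Type 1: simply overwrite the attribute; the Illinois value is lost
    conn.execute("UPDATE customer_dim SET state = 'California'"
                 " WHERE customer_key = 1001")
    print(conn.execute("SELECT * FROM customer_dim").fetchall())
    # -> [(1001, 'Christina', 'California')]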
41

Type 2:

In a Type 2 SCD, a new record is added to the table to represent the new information.
Therefore both the original and the new record will be present.

E.g.: After Christina moved from Illinois to California, we add the new information as a
new row in the table:

Customer key   Name        State
1001           Christina   Illinois
1005           Christina   California

Advantages:
This allows us to accurately keep all historical information.

Disadvantages:
This will cause the size of the table to grow fast. Where the number of rows for the
table is very high to start with, storage and performance can become a concern.
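
A minimal sketch of the same change handled as Type 2; the surrogate key 1005 follows the example table, the rest is illustrative.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY,"
                 " name TEXT, state TEXT)")
    conn.execute("INSERT INTO customer_dim VALUES (1001, 'Christina', 'Illinois')")

    # Type 2: keep the old row and add a new row under a new surrogate key;
    # new fact rows reference 1005, historical fact rows keep pointing at 1001
    conn.execute("INSERT INTO customer_dim VALUES (1005, 'Christina', 'California')")
    print(conn.execute("SELECT * FROM customer_dim ORDER BY customer_key").fetchall())
    # -> [(1001, 'Christina', 'Illinois'), (1005, 'Christina', 'California')]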
42

Type 3:

In a Type 3 SCD there are two columns to indicate the particular attribute of
interest, one indicating the original value and one indicating the current value.
There is also a column that indicates when the current value became active.

E.g.: After Christina moved from Illinois to California, the original row is updated,
and we have the following table (assuming the effective date of the change is
January 15, 2003):

Customer key   Name        Original State   Current State   Effective Date
1001           Christina   Illinois         California      15-Jan-03

Advantages:
This does not increase the size of the table, since the existing row is updated in place.
It allows us to keep some part of the history.

Disadvantages:
Type 3 will not be able to keep all history where an attribute is changed more than
once. For example, if Christina later moves to Texas on December 15, 2003, the
California information is lost.
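
A minimal Type 3 sketch with sqlite3; the column names mirror the example table above, the rest is illustrative.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("""CREATE TABLE customer_dim (
        customer_key   INTEGER PRIMARY KEY,
        name           TEXT,
        original_state TEXT,
        current_state  TEXT,
        effective_date TEXT)""")
    conn.execute("INSERT INTO customer_dim VALUES"
                 " (1001, 'Christina', 'Illinois', 'Illinois', NULL)")

    # Type 3: keep the original value in one column, overwrite the current one
    conn.execute("UPDATE customer_dim SET current_state = 'California',"
                 " effective_date = '2003-01-15' WHERE customer_key = 1001")
    print(conn.execute("SELECT * FROM customer_dim").fetchall())
    # -> [(1001, 'Christina', 'Illinois', 'California', '2003-01-15')]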
43

Degenerate Dimension:

A degenerate dimension is a dimension which is derived from the fact table
and does not have its own dimension table.

Degenerate dimensions are often used when a fact table's grain represents
transaction-level data and one wishes to maintain system-specific identifiers
such as order numbers, invoice numbers and the like without forcing their
inclusion in their own dimension.
44

Conformed Dimensions:

A dimension which is fixed and reusable.

It is also called a fixed dimension: a dimension which does not change
with respect to time.

E.g.: if the name of a city is changed from Bombay to Mumbai, the name will not
keep changing from time to time; once the change is done, it is permanent.
This type of dimension is called a conformed, or fixed, dimension.
45

Junk Dimensions:

A dimension where one can store random transactional codes, flags and text
attributes that are not related to other dimensions, and which provides a
simple way for users to easily find those unrelated attributes.

E.g.: Marital Status (Y or N), Gender (M or F), etc.
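
A small sketch of how such a junk dimension might be pre-populated, in plain Python; the flag values and the key numbering are assumptions.

    from itertools import product

    # A junk dimension holds every combination of the unrelated low-cardinality
    # flags, so the fact table needs only one small surrogate key for all of them
    marital_status = ["Y", "N"]
    gender         = ["M", "F"]

    junk_dim = [
        {"junk_key": key, "marital_status": ms, "gender": g}
        for key, (ms, g) in enumerate(product(marital_status, gender), start=1)
    ]
    # junk_key 1..4 covers (Y, M), (Y, F), (N, M), (N, F)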
46
Data Warehousing Objects Contd.

Hierarchies:

Hierarchies are logical structures that use ordered levels as a means
of organizing data. A hierarchy can be used to define data aggregation.
For example, in a time dimension, a hierarchy might aggregate data from
the month level to the quarter level to the year level. A level represents a
position in a hierarchy. (A small rollup sketch is given below, after Relationships.)

Unique Identifiers:

Unique identifiers are specified for one distinct record in a dimension table.
Artificial unique identifiers are often used to avoid the potential problem of
unique identifiers changing.

Relationships:

Relationships guarantee business integrity. Designing a relationship between
the sales information in the fact table and the dimension tables products and
customers enforces the business rules in databases.
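
A small sketch of rolling monthly data up a time hierarchy (month to quarter to year), in plain Python; the sales figures are invented for illustration.

    # Monthly sales rolled up a time hierarchy: month -> quarter -> year
    monthly_sales = {"2023-01": 120, "2023-02": 90, "2023-03": 150,
                     "2023-04": 80,  "2023-11": 60, "2023-12": 200}

    def quarter(month_key: str) -> str:
        year, month = month_key.split("-")
        return f"{year}-Q{(int(month) - 1) // 3 + 1}"

    quarterly = {}
    for month_key, amount in monthly_sales.items():
        q = quarter(month_key)
        quarterly[q] = quarterly.get(q, 0) + amount

    yearly = {}
    for quarter_key, amount in quarterly.items():
        year = quarter_key.split("-")[0]
        yearly[year] = yearly.get(year, 0) + amount

    # quarterly -> {'2023-Q1': 360, '2023-Q2': 80, '2023-Q4': 260}
    # yearly    -> {'2023': 700}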


47
Physical Design In Datawarehouse
Physical design is the creation of the database with SQL statements. During the
physical design process, you convert the data gathered during the logical design
phase into a description of the physical database structure.

Physical Design Structures:

Tablespaces: A tablespace consists of one or more data files, which are physical
structures within the operating system you are using. A data file is associated
with only one tablespace. From a design perspective, tablespaces are containers
for physical design structures.

Tables and Partitioned Tables: Tables are the basic unit of data storage. They are
the container for the expected amount of raw data in your data warehouse. Using
partitioned tables instead of non-partitioned ones addresses the key problem of
supporting very large data volumes by allowing you to decompose them into
smaller and more manageable pieces.
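
As an illustration only, a range-partitioned fact table might be declared along the following lines (Oracle-style syntax held in a Python string; the exact partitioning clauses vary by database product, and the table and partition names are assumptions):

    # Oracle-style range partitioning, shown only as a sketch
    partitioned_sales_ddl = """
    CREATE TABLE sales_fact (
        sale_date   DATE,
        product_id  NUMBER,
        amount      NUMBER
    )
    PARTITION BY RANGE (sale_date) (
        PARTITION sales_2023 VALUES LESS THAN (DATE '2024-01-01'),
        PARTITION sales_2024 VALUES LESS THAN (DATE '2025-01-01')
    )
    """
    # Each partition can then be loaded, indexed or archived independently.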
48
Physical Design In Data Warehouse Contd.

Views:
A view is a tailored presentation of the data contained in one or more tables or
other views. A view takes the output of a query and treats it as a table. Views do
not require any space in the database.

Integrity Constraints:
Integrity constraints are used to enforce business rules associated with your
database and to prevent having invalid information in the tables. Integrity
constraints in data warehousing differ from constraints in OLTP environments. In
OLTP environments, they primarily prevent the insertion of invalid data into a
record, which is not a big problem in data warehousing environments because
accuracy has already been guaranteed.

Indexes:
Indexes are optional structures associated with tables or clusters. In addition to
the classical B-tree indexes, bitmap indexes are very common in data
warehousing environments.

49
Definition Of Data Warehouse

Ralph Kimball's paradigm:

Data warehouse is the conglomerate of all data marts within the
enterprise. Information is always stored in the dimensional model.

Bill Inmon's paradigm:

Data warehouse is one part of the overall business intelligence system.
An enterprise has one data warehouse, and data marts source their
information from the data warehouse. In the data warehouse, information
is stored in 3rd normal form


50
Basic Design Approaches of Data Warehouse
There are two major types of approaches to building or designing the
Data Warehouse.


The Top-Down Approach

The Bottom-Up Approach

51
The Top Down Approach

The Dependent Data Mart structure or Hub & Spoke: The Top-Down Approach

Inmon advocated a dependent data mart structure

The data flow in the top-down OLAP environment begins with data extraction
from the operational data sources. This data is loaded into the staging area,
validated and consolidated to ensure a level of accuracy, and then transferred
to the Operational Data Store (ODS).

Detailed data is regularly extracted from the ODS and temporarily hosted in the
staging area for aggregation, summarization and then extracted and loaded into
the Data warehouse.

Once the Data warehouse aggregation and summarization processes are
complete, the data mart refresh cycles will extract the data from the Data
warehouse into the staging area and perform a new set of transformations on
them. This will help organize the data in particular structures required by data
marts. Then the data marts can be loaded with the data and the OLAP
environment becomes available to the users.
52
The Top Down Approach Contd
Inmon Approach
The data marts are treated as subsets of the data warehouse. Each
data mart is built for an individual department and is optimized for
analysis needs of the particular department for which it is created.
53
The Bottom-Up Approach
1. The Data warehouse Bus Structure: The Bottom-Up Approach

Ralph Kimball designed the data warehouse with the data marts connected
to it with a bus structure.

The bus structure contained all the common elements that are used by data
marts, such as conformed dimensions, measures, etc., defined for the enterprise
as a whole.

This architecture makes the data warehouse more of a virtual reality than a
physical reality

All data marts could be located in one server or could be located on different
servers across the enterprise while the data warehouse would be a virtual
entity being nothing more than a sum total of all the data marts

In this context even the cubes constructed by using OLAP tools could be
considered as data marts.
54
The Bottom-Up Approach Contd
Kimball Approach

The bottom-up approach reverses the positions of the Data warehouse
and the Data marts. Data marts are directly loaded with the data from the
operational systems through the staging area.

The data flow in the bottom up approach starts with extraction of data
from operational databases into the staging area where it is processed
and consolidated and then loaded into the ODS.

55
The Bottom-Up Approach Contd

The data in the ODS is appended to or replaced by the fresh data being
loaded. After the ODS is refreshed, the current data is once again
extracted into the staging area and processed to fit into the data mart
structure. The data from the data marts is then extracted to the staging
area, aggregated, summarized and so on, and loaded into the data warehouse,
where it is made available to the end user for analysis.
56
DW Operational Processes (Overview of
Extraction, Transformation & Loading)
Operational / source data:

Typically host-based, legacy applications
Customized applications, COBOL, 3GL, 4GL

Point-of-contact devices
POS, ATM, call switches

External sources
Nielsen, Acxiom, CMIE, vendors, partners

[Diagram: source data may be sequential, legacy, relational or external.]
57
DW Operational Processes (Overview of
Extraction, Transformation & Loading) Contd
These tools try to automate or support tasks such as the following (a minimal end-to-end sketch follows this list):

Data Extraction (accessing diff source data bases)

Data Cleansing (finding and resolving inconsistencies in the source data)

Data Transformation (between different data formats, languages, etc.)

Data Loading

Replication (replicating source databases into the data warehouse)

Analyzing & Checking of Data Quality (for correctness and completeness)

Building derived data & views
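
A minimal end-to-end sketch of these steps in Python with sqlite3; the source rows, cleansing rules and target table are assumptions made purely for illustration.

    import sqlite3

    # Extraction: rows pulled from a hypothetical operational source
    source_rows = [
        {"cust": "  Acme Corp ", "country": "IN",    "amount": "1,250.00"},
        {"cust": "Acme Corp",    "country": "India", "amount": "300"},
    ]

    def transform(row):
        # Cleansing and transformation: trim names, resolve inconsistent
        # country codes, unify number formats
        country_map = {"IN": "India"}
        return (
            row["cust"].strip(),
            country_map.get(row["country"], row["country"]),
            float(row["amount"].replace(",", "")),
        )

    # Loading into the warehouse table
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales_fact (customer TEXT, country TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?)",
                     [transform(r) for r in source_rows])
    print(conn.execute("SELECT * FROM sales_fact").fetchall())
    # -> [('Acme Corp', 'India', 1250.0), ('Acme Corp', 'India', 300.0)]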
58
DW Operational Processes (Overview of
Extraction, Transformation & Loading) Contd
Elements of a Data Warehouse
59
DW Operational Processes (Overview of
Extraction, Transformation & Loading) Contd
Loading the Warehouse

Cleaning the data before it is loaded
60
DW Operational Processes (Overview of
Extraction, Transformation & Loading) Contd
These processes are discussed in detail in the ETL section.

Some important definitions:

Data Scrubbing: http://www.wisegeek.com/what-is-data-scrubbing.htm

Data Cleansing: http://www.wisegeek.com/what-is-data-cleansing.htm

Row level security: http://www.securityfocus.com/infocus/1743

Staging Types: http://esj.com/Columns/article.aspx?EditorialsID=55
61
Technical Problems in Data Warehouse
Managing large amounts of data:
The explosion of data volume came about because the data warehouse required
that both detail and history be mixed in the same environment.
Large amounts of data need to be managed in many ways: through flexible
addressability of data stored inside the processor and inside disk
storage, through indexing, through extensions of data, through the efficient
management of overflow, and so forth. To be effective, the technology used
must satisfy the requirements for both volume and efficiency.

Index/Monitor Data:
If data in the warehouse cannot be easily and efficiently indexed, the data
warehouse will not be a success. Monitoring data warehouse data determines
such factors as the following:
If a reorganization needs to be done
If an index is poorly structured
If too much or not enough data is in overflow
The statistical composition of the access of the data
Available remaining space

62
Technical Problems in Data Warehouse Contd

Interfaces to many technologies:

Data passes into the data warehouse from the operational environment
and the ODS, and from the data warehouse into data marts, DSS applications,
exploration and data mining warehouses, and alternate storage.
This passage must be smooth and easy.

The interface to different technologies requires several considerations:

Does the data pass from one DBMS to another easily?
Does it pass from one operating system to another easily?
Does it change its basic format in passage (EBCDIC, ASCII, etc.)?
63
Technical Problems in Data Warehouse Contd
Meta Data Management:

The data warehouse operates under a heuristic, iterative development life cycle.
To be effective, the user of the data warehouse must have access to meta data
that is accurate and up-to-date.

Several types of meta data need to be managed in the data warehouse: distributed
meta data, central meta data, technical meta data, and business meta data.
64
Technical Problems in Data Warehouse Contd
Efficient Loading of Data

Data is loaded into a data warehouse in two fundamental ways:
a record at a time through a language interface or en masse with a utility.
Indexes must be efficiently loaded at the same time the data is loaded. As the
burden of the volume of loading becomes an issue, the load is often parallelized.

Another related approach to the efficient loading of very large amounts of data is
staging the data prior to loading.

As a rule, large amounts of data are gathered into a buffer area before being
processed by extract/transform/load (ETL) software. The staged data is merged,
perhaps edited, summarized, and so forth, before it passes into the ETL layer.
65
Technical Problems in Data Warehouse Contd
Lock Management:
The lock manager ensures that two or more people are not updating the
same record at the same time. But update is not done in the data warehouse;
instead, data is stored in a series of snapshot records. When a change occurs
a new snapshot record is added, rather than an update being done.
66






Steps in Building a Data Warehouse:

Identify key business drivers, sponsorship, risks, ROI
Survey information needs and identify desired functionality and define
functional requirements for initial subject area.
Architect the long-term data warehousing architecture
Evaluate and finalize the DW tools and technology
Conduct a proof of concept
Design the target database schema
Build data mapping, extract, transformation, cleansing and
aggregation/summarization rules
Build initial data mart, using exact subset of enterprise data warehousing
architecture and expand to enterprise architecture over subsequent phases
Maintain and administer data warehouse
67
Representative DSS Tools

Tool Category            Products
ETL Tools                ETI Extract, Informatica, IBM Visual Warehouse, Oracle Warehouse Builder
OLAP Server              Oracle Express Server, Hyperion Essbase, IBM DB2 OLAP Server, Microsoft SQL Server OLAP Services, Seagate HOLOS, SAS/MDDB
OLAP Tools               Oracle Express Suite, Business Objects, Web Intelligence, SAS, Cognos Powerplay/Impromtu, KALIDO, MicroStrategy, Brio Query, MetaCube
Data Warehouse           Oracle, Informix, Teradata, DB2/UDB, Sybase, Microsoft SQL Server, RedBricks
Data Mining & Analysis   SAS Enterprise Miner, IBM Intelligent Miner, SPSS/Clementine, TCS Tools
68
Business Intelligence
How intelligent can you make your business processes?

What insight can you gain into your business?

How integrated can your business processes be?

How much more interactive can your business be with customers, partners,
employees and managers?
69
What is Business Intelligence (BI)?
Business Intelligence is a generalized term applied to a broad category of
applications and technologies for gathering, storing, analyzing and providing
access to data to help enterprise users make better business decisions

Business Intelligence applications include the activities of decision support
systems, query and reporting, online analytical processing (OLAP), statistical
analysis, forecasting, and data mining

An alternative way of describing BI is: the technology required to turn raw data
into information to support decision-making within corporations and business
processes
70
Why BI?
BI technologies help bring decision-makers the data in a form they can quickly
digest and apply to their decision making.

BI turns data into information for managers and executives and in general, people
making decisions in a company.

Companies want to use technology tactically to make their operations more
effective and more efficient; business intelligence can be the catalyst for that
efficiency and effectiveness.
71
Benefits
The benefits of a well-planned BI implementation are going to be closely tied to
the business objectives driving the project.

Identify trends and anomalies in business operations more quickly, allowing
for more accurate and timelier decisions.

Deliver actionable insight and information to the right place with less effort .

Identify and operate based on a single version of the truth, allowing all
analysis to be completed on a core foundation with confidence.
72
Business Intelligence Platform Requirements
Data Warehouse Databases

OLAP

Data Mining

Interfaces

Build and Manage Capabilities
The business intelligence platform should provide good integration across these
technologies. It should be a coherent platform, not a set of diverse and
heterogeneous technologies.
73
Business Intelligence Components
[Diagram: operational data is extracted, transformed and loaded into the data warehouse, which feeds OLAP and data mining.]
74
Business Intelligence Architecture
75
Business Intelligence Technologies

[Pyramid diagram, with increasing potential to support business decisions toward the top:
Data Sources: paper, files, information providers, database systems, OLTP
Data Warehouses / Data Marts
Data Exploration: OLAP, DSS, EIS, querying and reporting
Data Mining: information discovery
Data Presentation: visualization techniques
Decision Making
Typical users, from top to bottom: end user, business analyst, data analyst, DB admin.]
