Business Intelligence and Data Warehouse (BIDW) Overview
Mai Hoang
Mar 18, 2009

OutStart Inc.

02/05/15 04:23

Content
BIDW Overview
Relationship between Business Intelligence (BI) system and Data Warehouse
Data Integration
Data Warehouse vs. Data Mart
BI Common functions
Data Mining and Analysis
Query and Reporting
Business Performance Management

BIDW Drivers/Trends
Quick look at Microsoft tools
Business Process Management (BPM) : MS BizTalk


Introduction
Business Intelligence (BI) systems turn a company's raw data into
usable information that can help management identify important trends,
analyze customer behavior, and make intelligent business decisions
quickly.
BI relies on data warehousing (a data warehouse is a data repository designed
to support an organization's decision making), so storing and managing
warehouse data cost-effectively is critical to any BIDW solution.


BIDW Overview


BIDW Process

Regulated by corporate security policy
Raw data is stored
Information is cleansed and optimized
Data mining, query and analytical tools generate intelligence
Make strategic business decisions
Business Performance Management (BPM) applications track results


Data Integration
Data integration is the process of combining data residing at different
sources and providing the user with a unified view of these data
The need for data integration appears with increasing frequency as the volume
of data and the need to share existing data explode
Data integration tools provide basic extract, transform and load (ETL) functions
Some popular data integration tools
IBM Visual Warehouse (IBM)
Oracle Data Warehouse - ODW (Oracle)
SQL Server Integration Services SSIS (Microsoft)
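
The extract, transform and load steps mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source record sets, their field names and the `sales` target table are hypothetical, and the sketch does not reflect how any of the listed tools is implemented.

```python
import sqlite3

# Hypothetical source systems that describe the same kind of fact differently.
CRM_ROWS = [{"customer": " Alice ", "amount_usd": "120.50"}]
ERP_ROWS = [{"cust_name": "BOB", "total_cents": 9900}]

def extract():
    """Extract: pull raw records from each source, tagged with their origin."""
    return [("crm", r) for r in CRM_ROWS] + [("erp", r) for r in ERP_ROWS]

def transform(tagged_rows):
    """Transform: map every record onto one unified (customer, amount) shape."""
    unified = []
    for source, row in tagged_rows:
        if source == "crm":
            name, amount = row["customer"], float(row["amount_usd"])
        else:
            name, amount = row["cust_name"], row["total_cents"] / 100
        unified.append((name.strip().title(), round(amount, 2)))
    return unified

def load(rows, conn):
    """Load: write the unified records into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the warehouse
load(transform(extract()), conn)
print(conn.execute("SELECT customer, amount FROM sales ORDER BY customer").fetchall())
```

After the load, both sources are visible through one table with one naming and unit convention, which is the "unified view" the slide refers to.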


Data Warehouse vs. Data Mart


A data warehouse is a repository of an organization's electronically stored
data
Data warehouses are designed to facilitate reporting and analysis
Data reaches the warehouse by being extracted from source systems,
transformed and loaded (ETL)


Data Warehouse Benefits


A common data model for all data of interest, regardless of the
data's source, makes it easier to report and analyze information than when
multiple data models must be used to retrieve information such as sales
invoices, order receipts, general ledger charges, etc.
Inconsistencies are identified and resolved before data is loaded into the
data warehouse, which greatly simplifies reporting and analysis.
The information in the warehouse can be stored safely for extended
periods of time, even if the source system data is purged over time.
Because they are separate from operational systems, data warehouses provide
retrieval of data without slowing down operational systems.
Data warehouses can work in conjunction with, and hence enhance the value of,
operational business applications, notably customer relationship
management (CRM) systems.
Data warehouses facilitate decision support system applications such as trend
reports, reports that show actual performance versus goals, etc.


Data Warehouse Disadvantages


Can result in high costs:
High maintenance cost over its lifetime, because a data warehouse is usually not
static
Cost of delivering suboptimal information to the organization, as the data can
become outdated relatively quickly

There is often a fine line between data warehouses and operational
systems. Duplicate, expensive functionality may be developed. Or
functionality may be developed in the data warehouse that, in retrospect,
should have been developed in the operational systems, and vice versa.


Data Mart
A data mart is a subset of an organizational data store, usually oriented to
a specific purpose or major data subject, that may be distributed to
support business needs
Data marts are analytical data stores designed to focus on specific
business functions for a specific community within an organization
Data marts are often, though not always, derived from subsets of data
in a data warehouse
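
One way to picture a data mart derived from a warehouse is as a subject-specific view over a warehouse table. The sketch below uses Python's built-in `sqlite3` module; the table `warehouse_facts`, the view `sales_mart` and the sample rows are hypothetical and only illustrate the idea of a subset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical warehouse fact table covering several departments.
conn.execute("CREATE TABLE warehouse_facts (department TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_facts VALUES (?, ?, ?)",
    [("sales", "EU", 100.0), ("sales", "US", 250.0), ("hr", "EU", 40.0)],
)

# The data mart exposes only the subject one user community needs.
conn.execute(
    "CREATE VIEW sales_mart AS "
    "SELECT region, amount FROM warehouse_facts WHERE department = 'sales'"
)

mart_rows = conn.execute(
    "SELECT region, amount FROM sales_mart ORDER BY region"
).fetchall()
print(mart_rows)
```

A real data mart is often a separately stored and tuned copy rather than a view; the view is used here only to keep the sketch short.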


Why is a Data Mart needed?

Easy access to frequently needed data
Creates a collective view for a group of users
Improves end-user response time
Ease of creation
Lower cost than implementing a full Data Warehouse
Potential users are more clearly defined than in a full Data Warehouse


Business Intelligence
Business Intelligence refers to skills, technologies, applications and
practices used to help a business acquire a better understanding of its
commercial context.
Business intelligence may also refer to the collected information itself
BI applications provide historical, current, and predictive views of
business operations
Common functions of business intelligence applications include:
Data Mining
Analytics (such as On-line Analytical Processing, OLAP)
Business Performance Management
Reporting

Because business intelligence often aims to support better business
decision-making, a BI system can be called a Decision Support System (DSS)


Business Intelligence Tools


Cognos (IBM)
IBM Cognos 8's Web services-based SOA is much better integrated than some competing
offerings, with shared metadata across the platform enabling ease of transfer from report
to query to analysis.
Has a high proportion of enterprise-standard BI platform deployments

Microsoft
Microsoft Analysis Services
A group of OLAP and data mining services provided in MS SQL server

PerformancePoint Server 2007
Monitor Server Operation
Planning Server Operation
Management Reporter
More references at http://en.wikipedia.org/wiki/PerformancePoint_Server_2007

ProClarity
Integrated tightly with MS SQL Server, especially MS Analysis Services

SAS
SAS's approach to BI originates with forecasting, predictive modeling and optimization,
and embedding them within cross-functional and industry-specific applications
An OLAP product that also provides some ETL functions

Data Mining
Data mining is the process of extracting hidden patterns from data
As more data is gathered, with the amount of data doubling every three
years, data mining is becoming an increasingly important tool to transform
this data into information
Data mining is commonly used in a wide range of profiling practices, such
as marketing, surveillance, fraud detection and scientific discovery
Data mining can be applied to data sets of any size. However, while it can
be used to uncover hidden patterns in data that have been collected,
obviously it can neither uncover patterns which are not already present in
the data, nor can it uncover patterns in data that have not been collected.


Process of Data Mining: Knowledge Discovery in Databases (KDD)
Pre-processing raw data
Once the objective for the KDD process is known, a target data set must be
assembled
The target dataset must be large enough to contain the patterns while remaining
concise enough to be mined in an acceptable timeframe
The target set is then cleaned to remove observations with noise and
missing data
The clean data is reduced into feature vectors, one vector per observation. A
feature vector is a summarized version of the raw data observation
The feature vectors are divided into two sets, the "training set" and the "test set"
The training set is used to "train" the data mining algorithms
The test set is used to verify the accuracy of any patterns found
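
The division into a training set and a test set can be sketched as follows. The function name `split_train_test`, the 25% test fraction and the sample feature vectors are assumptions made for illustration; the slides do not prescribe a ratio.

```python
import random

def split_train_test(vectors, test_fraction=0.25, seed=0):
    """Shuffle a copy of the feature vectors and cut it into two disjoint sets."""
    shuffled = list(vectors)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical feature vectors, one tuple per observation.
feature_vectors = [(x, x * 2) for x in range(20)]
training_set, test_set = split_train_test(feature_vectors)
print(len(training_set), len(test_set))
```

Shuffling before cutting avoids a split that simply follows the order in which the data was collected.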


Process of Data Mining: Knowledge Discovery in Databases (KDD) (cont.)
Mine data: commonly involves four classes of task:
Classification: arrange the data into predefined groups
Clustering: like classification but the groups are not predefined, so the algorithm
will try to group similar items together.
Regression: attempt to find a function which models the data with the least error.
Genetic programming is one method; least-squares fitting is another, widely used one
Association rule learning: search for relationships between variables.
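
As a small illustration of the regression task, the sketch below fits a straight line by ordinary least squares. Note that this is a swapped-in technique chosen for brevity, not the genetic programming approach named on the slide, and the sample points are hypothetical.

```python
def fit_line(points):
    """Return (slope, intercept) of the line minimizing the squared error."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx  # assumes the x values are not all identical
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations that lie exactly on y = 2x + 1.
points = [(0, 1.0), (1, 3.0), (2, 5.0), (3, 7.0)]
slope, intercept = fit_line(points)
print(slope, intercept)
```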


Process of Data Mining: Knowledge Discovery in Databases (KDD) (cont.)
Interpret results
Evaluate the patterns produced by the data mining algorithms
Not all patterns found by the data mining algorithms are necessarily valid.
It is common for the data mining algorithms to find patterns in the training set
which are not present in the general data set; this is called overfitting. To
overcome this, the evaluation uses a "test set" of data on which the data mining
algorithm was not trained. The learnt patterns are applied to this "test set"
and the resulting output is compared to the desired output.
A number of statistical methods may be used to evaluate the algorithm such as
ROC curves.
If the learnt patterns do not meet the desired standards, then it is necessary to
re-evaluate and change the preprocessing and data mining steps
If the learnt patterns do meet the desired standards, then the final step is to
interpret the learnt patterns and turn them into knowledge
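
The role of the test set in exposing overfitting can be shown with a toy example. Everything here is hypothetical: the labelled observations, the `memorizer` "model" that only remembers its training examples, and the simpler `threshold_rule`. Plain accuracy is used as the measure instead of ROC curves to keep the sketch short.

```python
def accuracy(predict, labelled):
    """Fraction of observations whose predicted label matches the desired one."""
    correct = sum(1 for features, label in labelled if predict(features) == label)
    return correct / len(labelled)

# Hypothetical labelled feature vectors.
training_set = [((1,), "low"), ((2,), "low"), ((8,), "high"), ((9,), "high")]
test_set = [((3,), "low"), ((7,), "high")]

# An overfitted "model": it memorizes the training examples and guesses
# "low" for anything it has not seen before.
memory = dict(training_set)

def memorizer(features):
    return memory.get(features, "low")

# A simpler rule that captures the general pattern.
def threshold_rule(features):
    return "high" if features[0] >= 5 else "low"

for name, model in [("memorizer", memorizer), ("threshold_rule", threshold_rule)]:
    print(name, accuracy(model, training_set), accuracy(model, test_set))
```

Both models look perfect on the training set; only the comparison on the unseen test set shows that the memorized patterns do not generalize.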


On-line Analytical Processing (OLAP)


Online analytical processing (OLAP) is an approach to quickly answer
multi-dimensional analytical queries
Databases configured for OLAP use a multidimensional data model,
allowing for complex analytical and ad-hoc queries with a rapid execution
time. They borrow aspects of navigational and hierarchical databases, which are
faster than relational databases for this kind of query
OLAP is sometimes described as Fast Analysis
of Shared Multidimensional Information (FASMI)
The output of an OLAP query is typically displayed in a matrix (or pivot)
format. The dimensions form the rows and columns of the matrix; the
measures form the values.
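
The matrix (pivot) output described above can be sketched in plain Python. The fact rows, the dimension names `store` and `quarter`, and the measure `sales` are hypothetical; a real OLAP server would produce this from precomputed aggregates rather than by scanning rows.

```python
from collections import defaultdict

# Hypothetical fact rows: two dimensions (store, quarter) and one measure (sales).
facts = [
    {"store": "North", "quarter": "Q1", "sales": 100},
    {"store": "North", "quarter": "Q2", "sales": 150},
    {"store": "South", "quarter": "Q1", "sales": 80},
    {"store": "North", "quarter": "Q1", "sales": 20},
]

def pivot(rows, row_dim, col_dim, measure):
    """Sum the measure into a {row value: {column value: total}} matrix."""
    matrix = defaultdict(lambda: defaultdict(int))
    for row in rows:
        matrix[row[row_dim]][row[col_dim]] += row[measure]
    return {key: dict(cells) for key, cells in matrix.items()}

table = pivot(facts, "store", "quarter", "sales")
columns = sorted({row["quarter"] for row in facts})
print("store", *columns)
for store in sorted(table):
    # Cells with no facts are shown as 0.
    print(store, *(table[store].get(col, 0) for col in columns))
```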


OLAP Cube
At the core of any OLAP system is the concept of an OLAP cube (also
called a multidimensional cube or a hypercube)
An OLAP cube is a data structure that allows fast analysis of data


OLAP Cube (cont.)


An OLAP cube consists of numeric facts called measures which are
categorized by dimensions
The cube metadata is typically created from a star schema or snowflake
schema of tables in a relational database
Each measure can be thought of as having a set of labels, or meta-data
associated with it
A dimension is what describes these labels; it provides information about
the measure.
Example:
A cube contains a store's sales as a measure, and Date/Time as a dimension. Each
Sale has a Date/Time label that describes more about that sale.
Any number of dimensions can be added to the structure such as Store, Cashier, or
Customer. This allows an analyst to view the measures along any combination of the
dimensions
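
The store sales example can be sketched as a list of facts, each carrying one measure and several dimension labels, plus a function that aggregates the measure along any chosen combination of dimensions. The sample rows and the name `roll_up` are hypothetical, and the sketch ignores the precomputation that makes real cubes fast.

```python
from collections import defaultdict

# Hypothetical sales facts: "amount" is the measure, the other keys are dimensions.
sales = [
    {"date": "2009-03-17", "store": "Downtown", "cashier": "Ann", "amount": 30.0},
    {"date": "2009-03-17", "store": "Downtown", "cashier": "Ben", "amount": 20.0},
    {"date": "2009-03-18", "store": "Downtown", "cashier": "Ann", "amount": 15.0},
    {"date": "2009-03-18", "store": "Airport", "cashier": "Cy", "amount": 40.0},
]

def roll_up(facts, dimensions, measure="amount"):
    """Total the measure for every combination of the chosen dimension labels."""
    totals = defaultdict(float)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact[measure]
    return dict(totals)

by_store = roll_up(sales, ["store"])
by_date_and_store = roll_up(sales, ["date", "store"])
grand_total = roll_up(sales, [])  # no dimensions: a single overall total
print(by_store)
print(by_date_and_store)
print(grand_total)
```

Adding a dimension such as Customer only means adding another key to each fact; `roll_up` itself does not change.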


OLAP Products
Below is a list of top OLAP vendors in 2006
Note:
Microsoft was the only vendor that continuously exceeded the industry
average growth during 2000-2006
Since the above data was collected, Hyperion has been acquired by
Oracle, Cartesis by Business Objects, Business Objects by SAP, Applix by
Cognos, and Cognos by IBM


BIDW Drivers/Trends
Key drivers that are making BIDW solutions mission critical include:
Rapid increase in information democracy
The ability to make intelligent business decisions quickly is imperative to remain
competitive
Data is being customized on a mass scale, thus data warehouses must be
flexible enough to provide different views to different people
New legislation and compliance regulations have made BIDW mission critical.
Data must be captured, retained, and managed in a way that will satisfy courts
and regulators
The enormous diversity of data: Organizations must store and manage data
from multiple different sources such as ERP and CRM systems, and in a variety
of formats such as text, images, voice, video, unstructured data, and more
The increased need for better security due to wider data access availability and
a larger number of users.
There are a number of industry-specific drivers (Government, Financial services,
Retail, Telecom)


MS BizTalk Introduction
Microsoft BizTalk Server, often referred to as simply "BizTalk", is a
business process management (BPM) server
Through the use of "adapters", which are tailored to communicate with
different software systems used in a large enterprise, MS BizTalk enables
companies to automate and integrate business processes.
In a common scenario, BizTalk enables companies to integrate and
manage business processes by exchanging business documents such as
purchase orders and invoices between disparate applications, within or
across organizational boundaries.
With over 7,000 customers, including 90 percent of the Fortune Global
100, BizTalk Server is a trusted solution for Service-Oriented Architecture
(SOA) and Business Process Management
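
The adapter idea described on this slide can be illustrated with a generic message-broker sketch. This is not BizTalk's API: the classes `CsvLineAdapter`, `JsonAdapter` and `Broker`, the system names and the document format are all hypothetical, and the sketch only shows how per-system adapters let disparate applications exchange one business document.

```python
import json

class JsonAdapter:
    """Hypothetical adapter for a system that exchanges JSON documents."""
    def decode(self, raw):
        return json.loads(raw)

    def encode(self, message):
        return json.dumps(message, sort_keys=True)

class CsvLineAdapter:
    """Hypothetical adapter for a system that exchanges 'id,amount' lines."""
    def decode(self, raw):
        order_id, amount = raw.split(",")
        return {"id": order_id, "amount": float(amount)}

    def encode(self, message):
        return f"{message['id']},{message['amount']}"

class Broker:
    """Routes a document between systems via their registered adapters."""
    def __init__(self):
        self.adapters = {}

    def register(self, system, adapter):
        self.adapters[system] = adapter

    def route(self, raw, source, target):
        # Decode into a neutral dict, then encode in the target's own format.
        message = self.adapters[source].decode(raw)
        return self.adapters[target].encode(message)

broker = Broker()
broker.register("erp", CsvLineAdapter())
broker.register("web_shop", JsonAdapter())
print(broker.route("PO-1,250.0", source="erp", target="web_shop"))
```

Because each system only needs one adapter to and from the neutral form, adding a new system does not require changes to the systems already connected.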


MS BizTalk Server 2006 R2


BizTalk Server 2006 R2 builds upon the Business Process Management
and SOA/ESB capabilities in prior releases to help organizations extend
core process management technologies even further with new capabilities
like
native support for Electronic Data Interchange (EDI), Applicability Statement 2
(AS2) and Radio Frequency Identification (RFID)
close alignment with the 2007 Microsoft Office system and Windows Vista,
including key .NET Framework technologies such as Windows Workflow
Foundation and Windows Communication Foundation

BizTalk Server 2006 R2 puts real-time, end-to-end supply chain
management within reach of every customer, spanning systems, people,
and processes, both within and across organizational boundaries
BizTalk Server 2006 R2 empowers customers to make informed business
decisions with real-time data from geographically dispersed, yet integrated
systems


What can BizTalk Server 2006 R2 bring to your organization?

Extend Supply Chains to the Edge
With new capabilities such as native support for EDI and RFID, BizTalk Server
2006 R2 provides the ability to gather data from the edge of the enterprise,
enabling real time visibility across business processes and trading partners

Connect SOA and Interop on a Unified Platform
Provides the infrastructure to connect existing applications (regardless of the
platform) and to compose, expose, and consume new services.
Because it includes tools to connect both proprietary and standards-based systems
and is pre-integrated with the .NET Framework, BizTalk Server can be a central
part of an SOA strategy.
A broad array of technology and application adapters is available for BizTalk
which provide out-of-the-box support for everything from transport protocols
such as FTP, SOAP, and MQSeries, to high level integration with line of
business applications such as PeopleSoft, SAP, and Siebel


What can BizTalk Server 2006 R2 bring to your organization? (cont.)

Deliver Enterprise Proven Solutions
Help organizations cost effectively manage their supply chain from the factory to
the storefront
An end-to-end integrated supply chain allows organizations to drive maximum
efficiency through visibility into critical business processes, and tighter
collaboration with trading partners.


MS BizTalk Core Functions


BizTalk provides the following functions
(1) Business Process Automation (BPA)
the process a business uses to contain costs
consists of integrating applications, cutting labor wherever possible, and using software
applications throughout the organization

(2) Business Process Modeling (BPM)
the activity of representing the processes of an enterprise, so that the current ("as is") process
may be analyzed and improved in the future ("to be")
typically performed by business analysts and managers who are seeking to improve
process efficiency and quality
Some business process modeling techniques:
Business Process Modeling Notation (BPMN)
Cognition enhanced Natural language Information Analysis Method (CogNIAM)
Extended Business Modeling Language (xBML)
Event-driven process chain (EPC)
Unified Modeling Language (UML)


MS BizTalk Core Functions


(3) Business-to-business (B2B) Communication
BizTalk Server 2006 R2 includes comprehensive data exchange options to communicate
with trading partners through industry standards. These features include integrated
support within the BizTalk Server engine for Electronic Data Interchange (EDI) data
(including X12, EDIFACT and HIPAA support) and Applicability Statement 2 (AS2) data for
EDI over the Internet.
BizTalk Accelerators, such as the SWIFT, HL7 and RosettaNet Accelerators, speed up the
development of standards-based B2B solutions within specific industry segments.
Trading partner information and partner agreements can be stored and managed,
allowing for rapid onboarding and provisioning of partners and for streamlining business
communication with them.

(4) Enterprise Application Integration (EAI) and Message broker


The messaging subsystem provides communication with a wide range of external
applications through adapters
Dozens of adapters are supplied, free of charge with BizTalk Server 2006 R2 to handle
proprietary protocols and to support the conversion to and from different data


Q&A
