
Chapter 7

Data Warehouse & OLAP

Database Systems: Design, Implementation, and Management


4th Edition

Peter Rob & Carlos Coronel


The Need for Data Analysis
• Constant pressure from external and internal forces requires prompt tactical and strategic decisions.
• The decision-making cycle time is shrinking, while problems are increasingly complex, with a growing number of internal and external variables.
• Managers need support systems that facilitate quick decision making in a complex environment: decision support systems (DSS).
• Video: Building a Stock Decision Support Tool in Microsoft Excel 2010, https://www.youtube.com/watch?v=iXfxxHx21so
Data warehouse
• A data warehouse is a database that provides support for decision making.
• A data warehouse database must be:
  o Integrated
  o Subject-oriented
  o Time-variant
  o Non-volatile
• Benefits of Data warehouse (video)


The Data Warehouse
• The Data Warehouse is an integrated, subject-oriented, time-variant, non-volatile database that provides support for decision making.
• Subject-oriented: the warehouse is organized around the major subjects of the enterprise (such as customers, products, and sales) rather than the major application areas (such as customer invoicing, stock control, and product sales). This reflects the need to store decision-support data rather than application-oriented data.
The Data Warehouse
• Integrated: source data comes together from different enterprise-wide application systems. The source data is often inconsistent, using, for example, different formats. The integrated data must be made consistent to present a unified view of the data to the users.
• Time-variant: data in the warehouse is only accurate and valid at some point in time or over some time interval. Time-variance also shows in the extended time for which the data is held, the implicit or explicit association of time with all data, and the fact that the data represents a series of snapshots.
• Non-volatile: the data is not updated in real time but is refreshed from operational systems on a regular basis. New data is always added as a supplement to the database rather than as a replacement. The database continually absorbs this new data, incrementally integrating it with the previous data.
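The integrated, time-variant, and non-volatile properties can be illustrated with a minimal Python sketch. The two source systems, their field names, and their date formats are hypothetical and chosen only for illustration; a real warehouse load would be far more involved.

```python
from datetime import date, datetime

# Hypothetical rows from two operational systems that use different formats.
billing_rows = [{"cust": "C1", "amount": "120.50", "day": "2024-01-31"}]
store_rows = [{"customer_id": "C1", "total_cents": 9900, "day": "31/01/2024"}]

def from_billing(row):
    """Integrated: map a billing row onto the unified warehouse layout."""
    return {"customer": row["cust"],
            "amount": float(row["amount"]),
            "day": datetime.strptime(row["day"], "%Y-%m-%d").date()}

def from_store(row):
    """Integrated: map a store row (cents, day/month/year) onto the same layout."""
    return {"customer": row["customer_id"],
            "amount": row["total_cents"] / 100,
            "day": datetime.strptime(row["day"], "%d/%m/%Y").date()}

warehouse = []  # append-only list standing in for a warehouse table

def refresh(load_date, rows):
    """Non-volatile: a periodic batch load adds rows and never overwrites.
    Time-variant: every row carries the date of the snapshot it belongs to."""
    for row in rows:
        warehouse.append({**row, "loaded_on": load_date})

unified = [from_billing(r) for r in billing_rows] + [from_store(r) for r in store_rows]
refresh(date(2024, 2, 1), unified)
refresh(date(2024, 3, 1), [{"customer": "C1", "amount": 80.0, "day": date(2024, 2, 29)}])

# Decision-support style question: total amount per customer over all snapshots.
total_by_customer = {}
for row in warehouse:
    total_by_customer[row["customer"]] = total_by_customer.get(row["customer"], 0) + row["amount"]
```

After the two loads the list holds three rows: the second refresh supplemented the first rather than replacing it.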

Table 13.6 A Comparison Of Data Warehouse And Operational Database Characteristics

Creating A Data Warehouse

Figure 13.3

A Data Warehouse Framework and Views
The Data Warehouse
Twelve Rules That Define a Data Warehouse
1. The Data Warehouse and operational environments are separated.
2. The Data Warehouse data are integrated.
3. The Data Warehouse contains historical data over a long time horizon.
4. The Data Warehouse data are snapshot data captured at a given point in time.
5. The Data Warehouse data are subject-oriented.
6. The Data Warehouse data are mainly read-only with periodic batch updates from operational data. No online updates are allowed.
7. The Data Warehouse development life cycle differs from classical systems development. The Data Warehouse development is data driven; the classical approach is process driven.
The Data Warehouse
8. The Data Warehouse contains data with several levels of detail: current detail data, old detail data, lightly summarized data, and highly summarized data.
9. The Data Warehouse environment is characterized by read-only transactions to very large data sets. The operational environment is characterized by numerous update transactions to a few data entities at a time.
10. The Data Warehouse environment has a system that traces data sources, transformations, and storage.
11. The Data Warehouse’s metadata are a critical component of this environment. The metadata identify and define all data elements. The metadata provide the source, transformation, integration, storage, usage, relationships, and history of each data element.
12. The Data Warehouse contains a charge-back mechanism for resource usage that enforces optimal use of the data by end users.
Architecture of Web-Based Data Warehousing

OLAP vs. OLTP
We can divide IT systems into transactional (OLTP) and analytical (OLAP) systems.
In general, OLTP systems provide source data to data warehouses, whereas OLAP systems help to analyze it.
OLTP
• OLTP records the real-time transactions of operational systems, such as e-commerce purchases and banking ATM transactions.
• OLTP (On-line Transaction Processing) is characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE).
• The main emphasis for OLTP systems is on very fast query processing, maintaining data integrity in multi-access environments, and effectiveness measured by the number of transactions per second.
• An OLTP database holds detailed, current data, and the schema used to store transactional data is the entity model (usually 3NF).
On-Line Analytical Processing
• On-Line Analytical Processing (OLAP) deals with analyzing the data stored in the data warehouse.
• It is an advanced data analysis environment that supports decision making, business modeling, and operations research activities.
• Four Main Characteristics of OLAP
  o Use multidimensional data analysis techniques
  o Provide advanced database support
  o Provide easy-to-use end user interfaces
  o Support client/server architecture
OLAP
• OLAP (On-line Analytical Processing) is characterized by a relatively low volume of transactions.
• Queries are often very complex and involve aggregations.
• For OLAP systems, response time is the effectiveness measure. Data mining techniques make wide use of OLAP applications. An OLAP database holds aggregated, historical data stored in multidimensional schemas (usually a star schema).
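The contrast between the two workloads can be sketched with Python's built-in sqlite3 module. The table name `sale`, its columns, and the sample rows are assumptions made for illustration; a single in-memory database stands in for both the operational system and the warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sale (id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)

# OLTP style: many short transactions, each touching one or a few rows.
orders = [("East", 2015, 100.0), ("East", 2016, 150.0),
          ("West", 2015, 200.0), ("West", 2016, 50.0)]
for region, year, amount in orders:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO sale (region, year, amount) VALUES (?, ?, ?)",
            (region, year, amount),
        )
with conn:
    conn.execute("UPDATE sale SET amount = 60.0 WHERE region = 'West' AND year = 2016")

# OLAP style: one read-only query that aggregates over many rows.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sale GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

The inserts and the update are measured by how many can complete per second; the aggregate query is measured by how quickly it answers.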
More video
• Introduction to OLAP
  https://www.youtube.com/watch?v=2ryG3Jy6eIY
• Excel Tutorial: What is Business Intelligence and an OLAP Cube?
  https://www.youtube.com/watch?v=yoE6bgJv08E
On-Line Analytical Processing
• Multidimensional Data Analysis Techniques
  o Data are processed and viewed as part of a multidimensional structure.
  o A multidimensional view allows end users to consolidate or aggregate data at different levels.
  o A multidimensional view allows a business analyst to easily switch business perspectives.
  o Refer to the example in Excel.

Figure 13.4 Operational Vs. Multidimensional View Of Sales

Figure 13.5 Integration Of OLAP With A Spreadsheet Program

INTEGRATION OF OLAP WITH A SPREADSHEET PROGRAM - Pivot table in Excel
On-Line Analytical Processing
• OLAP Architecture
  o Three Main Modules
    ▪ OLAP Graphical User Interface (GUI)
    ▪ OLAP Analytical Processing Logic
    ▪ OLAP Data Processing Logic
  o OLAP systems are designed to use both operational and Data Warehouse data.

As Figure 13.17 illustrates, OLAP systems are designed to use both operational and
data warehouse data. The figure shows the OLAP system components on a single computer,
but this single-user scenario is only one of many. In fact, one problem with the
installation shown here is that each data analyst must have a powerful computer to store
the OLAP system and perform all data processing locally.
Types of On-Line Analytical Processing
• Relational OLAP (ROLAP)
  o Relational On-Line Analytical Processing (ROLAP) provides OLAP functionality by using relational databases and familiar relational query tools.
• Multidimensional OLAP (MOLAP)
  o MOLAP extends OLAP functionality to multidimensional databases (MDBMS).
  o MDBMS end users visualize the stored data as a multidimensional cube known as a data cube.
  o Data cubes are created by extracting data from the operational databases or from the data warehouse.
  o Watch the video: https://www.youtube.com/watch?v=LzmAbi5ZOhE
ROLAP – using query tool

Multidimensional OLAP (continued)

Relational Vs. Multidimensional OLAP

Table 13.8
Star Schema
• The star schema is a data-modeling technique used to map multidimensional decision support into a relational database.
• Star schemas yield an easily implemented model for multidimensional data analysis while still preserving the relational structure of the operational database.
• A star schema has four components:
  o Facts
  o Dimensions
  o Attributes
  o Attribute hierarchies
Star Schema
• Facts
  o Facts are numeric measurements (values) that represent a specific business aspect or activity. For example, sales figures are numeric measurements that represent product and service sales.
  o Facts commonly used in business data analysis are units, costs, prices, and revenues. Facts are normally stored in a fact table that is the center of the star schema.
  o The fact table contains facts that are linked through their dimensions, which are explained in the next section.
  o Facts can also be computed or derived at run time. Such computed or derived facts are sometimes called metrics to differentiate them from stored facts.
  o The fact table is updated periodically with data from operational databases.
Star Schema
• Dimensions
  o Dimensions are qualifying characteristics that provide additional perspectives to a given fact. For instance, sales might be compared by product from region to region and from one time period to the next.
  o The kind of problem typically addressed by a BI system might be to compare the sales of unit X by region for the first quarters of 2006 through 2016.
  o In that example, sales have product, location, and time dimensions. In effect, dimensions are the magnifying glass through which you study the facts.
  o Such dimensions are normally stored in dimension tables. Figure 13.6 depicts a star schema for sales with product, location, and time dimensions.

A Simple Star Schema
Star Schema
• Attributes
  o Each dimension table contains attributes. Attributes are often used to search, filter, or classify facts.
  o Dimensions provide descriptive characteristics about the facts through their attributes.
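A small star schema for sales with product, location, and time dimensions can be sketched with Python's built-in sqlite3 module. All table names, column names, and rows here are hypothetical; the query answers the kind of question mentioned earlier, the sales of unit X by region for first quarters.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
-- Dimension tables: descriptive attributes used to search, filter, classify.
CREATE TABLE product  (product_id  INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE location (location_id INTEGER PRIMARY KEY, city TEXT, region TEXT);
CREATE TABLE time_dim (time_id     INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);

-- Fact table at the center of the star: numeric facts keyed by dimensions.
CREATE TABLE sales_fact (
    product_id  INTEGER REFERENCES product(product_id),
    location_id INTEGER REFERENCES location(location_id),
    time_id     INTEGER REFERENCES time_dim(time_id),
    units       INTEGER,
    revenue     REAL
);

INSERT INTO product  VALUES (1, 'Unit X', 'Widgets'), (2, 'Unit Y', 'Widgets');
INSERT INTO location VALUES (1, 'Boston', 'East'), (2, 'Denver', 'West');
INSERT INTO time_dim VALUES (1, 2015, 1), (2, 2016, 1);
INSERT INTO sales_fact VALUES
    (1, 1, 1, 10, 100.0), (1, 2, 1, 5, 50.0),
    (1, 1, 2, 12, 120.0), (2, 1, 2, 7, 70.0);
""")

# Revenue of Unit X by region and year, first quarters only.
rows = db.execute("""
    SELECT l.region, t.year, SUM(f.revenue)
    FROM sales_fact f
    JOIN product  p ON p.product_id  = f.product_id
    JOIN location l ON l.location_id = f.location_id
    JOIN time_dim t ON t.time_id     = f.time_id
    WHERE p.name = 'Unit X' AND t.quarter = 1
    GROUP BY l.region, t.year
    ORDER BY l.region, t.year
""").fetchall()
```

Every analytical query follows the same shape: join the fact table to the dimensions it needs, filter on dimension attributes, and aggregate the facts.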

Table 13.10 Possible Attributes For Sales Dimensions


Star Schema
• OLAP consists of three basic analytical operations:
  o Consolidation (roll-up)
    ▪ Consolidation involves the aggregation of data that can be accumulated and computed in one or more dimensions.
    ▪ For example, all sales offices are rolled up to the sales department or sales division to anticipate sales trends.
  o Drill-down
    ▪ Drill-down is a technique that allows users to navigate through the details.
    ▪ For instance, users can view the sales by individual products that make up a region's sales.
  o Slicing and dicing
    ▪ Slicing and dicing is a feature whereby users can take out (slicing) a specific set of data of the OLAP cube and view (dicing) the slices from different viewpoints.
    ▪ These viewpoints are sometimes called dimensions (such as looking at the same sales by salesperson, by date, by customer, by product, or by region).
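The three operations can be sketched in plain Python over a tiny in-memory cube. The cell values and dimension names are hypothetical, and dicing is shown in the sense used above: viewing a slice from another viewpoint.

```python
from collections import defaultdict

# Hypothetical cube cells: (region, office, product, year, sales)
cells = [
    ("East", "Boston", "A", 2015, 100),
    ("East", "Boston", "B", 2015, 40),
    ("East", "Albany", "A", 2015, 60),
    ("West", "Denver", "A", 2015, 80),
    ("West", "Denver", "A", 2016, 90),
]
DIMS = ("region", "office", "product", "year")

def aggregate(rows, by):
    """Sum sales grouped by the named dimensions."""
    idx = [DIMS.index(d) for d in by]
    out = defaultdict(int)
    for row in rows:
        out[tuple(row[i] for i in idx)] += row[-1]
    return dict(out)

def slice_(rows, dim, value):
    """Slice: keep only the cells where one dimension has a given value."""
    i = DIMS.index(dim)
    return [r for r in rows if r[i] == value]

# Roll-up: sales offices are consolidated into their regions.
roll_up = aggregate(cells, ["region"])

# Drill-down: the products that make up the East region's sales.
drill_down = aggregate(slice_(cells, "region", "East"), ["product"])

# Slice the 2015 data out of the cube, then view that slice by office.
year_2015 = slice_(cells, "year", 2015)
by_office = aggregate(year_2015, ["office"])
```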
Example of Aggregation in

A Location Attribute Hierarchy

Figure 13.15

Attribute Hierarchies In Multidimensional Analysis

Figure 13.16

Data Warehouse Implementation Road Map

Figure 13.21
• Refer to the following video about “Data Warehouse Architecture”
• https://www.youtube.com/watch?v=CHYPF7jxlik

• Excel Tutorial: What is Business Intelligence and an OLAP Cube?
• https://www.youtube.com/watch?v=yoE6bgJv08E

• Data Cube Operations – SQL Queries


• https://blogs.perficient.com/2017/08/02/data-cube-operations-sql-queries/
Chapter 6
INTRODUCTION TO DATA MINING
Learning objectives:
• After this lesson, you will be able to:
  o Explain what data mining is
  o Describe the various techniques in the data mining process
  o Understand the KDD process model
  o Describe the various phases of CRISP-DM
  o Identify applications of data mining
Definition
• Data mining is the process of discovering interesting knowledge, such as unknown patterns, associations, or significant structures, from large amounts of data stored in databases, data warehouses, or other information repositories.
• Another definition: data mining is an iterative process of creating predictive and descriptive models by uncovering previously unknown trends and patterns in vast amounts of data in order to support decision making.
• Data mining is a subset of Business Analytics.
• There is a need to turn data into useful information and knowledge for broad applications, including:
  o Market analysis
  o Business management
  o Decision support
  o Customer segmentation and behavior
  o Etc.
How does data mining work?
• Data mining builds models to discover patterns among the attributes present in the data set.
• Models are:
  o Mathematical representations (simple linear relationships and highly non-linear relationships) that identify patterns among the attributes of things, such as customers and products.
  o Some of these patterns are explanatory, and others are predictive (foretelling future values of certain attributes).
Why Mine Data? Commercial Viewpoint
• Lots of data is being collected and warehoused
  o Web data, e-commerce
  o Purchases at department/grocery stores
  o Bank/Credit Card transactions
• Computers have become cheaper and more powerful
• Competitive Pressure is Strong
  o Provide better, customized services for an edge (e.g. in Customer Relationship Management)
What is (not) Data Mining?

What is not Data Mining?
– Look up a phone number in a phone directory
– Query a Web search engine for information about “Amazon”

What is Data Mining?
– Certain names are more prevalent in certain US locations (O’Brien, O’Rurke, O’Reilly… in Boston area)
– Group together similar documents returned by a search engine according to their context (e.g. Amazon rainforest,
Examples of data mining applications
• Temporal data: banking data, for instance, can be mined for changing trends, which may aid in scheduling bank tellers according to the volume of customer traffic.
• Stock exchange data can be mined to uncover trends that could help to plan investment strategies.
• Computer network data streams can be mined to detect intrusions based on anomalies in message flows, which may be discovered by clustering, by dynamic construction of stream models, or by comparing the current frequent patterns with those at a previous time.
• Spatial data: look for patterns that describe changes in metropolitan poverty rates based on city distances from major highways. Examining the relationships among a set of spatial objects can reveal which subsets of objects are spatially autocorrelated or associated.
Industry examples of DM applications
• Sales/Marketing
  o Identify buying patterns from customers
  o Find associations among customer demographic characteristics
• Banking
  o Credit card fraud detection
  o Identify ‘loyal’ customers
• Insurance and Health Care
  o Claims analysis, i.e., which medical procedures are claimed together
  o Predict which customers will buy new policies
• Transportation
  o Determine the distribution schedules for the outlets
  o Analyze loading patterns
• Medicine
  o Characterize patient behavior in order to predict office visits
  o Identify successful medical therapies for different diseases / illnesses
Take a break….
Watch a video
• Source of data mining
  https://www.youtube.com/watch?v=Y_JlkzzhAgw
Prediction
• Prediction refers to the act of telling about the future by taking into account experience, opinions, and other relevant information.
• Depending on the nature of what is being predicted, prediction can be named more specifically:
  o Classification: the predicted thing, such as tomorrow’s forecast, is a class label such as “rainy” or “sunny”.
  o Regression: the predicted thing, such as tomorrow’s temperature, is a real number such as 65 °F.
  o Time-series: the data consist of values of the same variable captured and stored over time at regular intervals, such as a stock price.
Prediction techniques
• Classification: assign a new data record to one of several predefined categories or classes. Also called supervised learning.
• Classification approaches normally use a training set where all objects are already associated with known class labels.
• The classification algorithm learns from the training set and builds a model. The model is used to classify new objects.
• This method has been used in customer segmentation, business modeling, and credit analysis.
• For example, after starting a credit policy, the OurVideoStore managers could analyze the customers’ behaviour via their credit and label the customers who received credit with one of three possible labels: “safe”, “risky”, or “very risky”. The classification analysis would generate a model that could be used to either accept or reject credit requests in the future.
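The training-set idea can be sketched with a k-nearest-neighbour classifier in plain Python. The slides do not name a specific algorithm, so k-nearest-neighbour is only one possible choice, and the two features (late payments, share of credit used) and all training records are hypothetical.

```python
import math

# Hypothetical training set: (late payments, share of credit used) -> known label
training = [
    ((0, 0.10), "safe"),
    ((1, 0.20), "safe"),
    ((3, 0.60), "risky"),
    ((4, 0.70), "risky"),
    ((8, 0.95), "very risky"),
    ((9, 0.90), "very risky"),
]

def classify(record, k=3):
    """Label a new record by majority vote among its k nearest training records."""
    nearest = sorted(training, key=lambda item: math.dist(item[0], record))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

def accept_credit(record):
    """Assumed policy for illustration: accept only requests classified as safe."""
    return classify(record) == "safe"
```

For instance, `classify((3, 0.5))` votes among two "risky" neighbours and one "safe" neighbour. In practice the features would be scaled first, since the raw late-payment count dominates the distance here.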
Associations
• Association, or association rule learning, is a popular and well-researched data mining technique for discovering interesting relationships among variables in large databases.
• With the help of bar-code scanners, systems can capture the purchase data in which association rules discover regularities among products.
• Types of associations:
  o Link analysis: the linkage among many objects of interest is discovered automatically, such as the links between web pages and the referential relationships among groups of academic publication authors.
Associations techniques
• Market-basket analysis: detect sets of attributes/items that frequently show association relationships or correlations among them, e.g. 90% of the people who buy cookies also buy milk (60% of all grocery shoppers buy both).
• In data mining, association rules are useful for analyzing and predicting customer behavior. They play an important part in shopping basket data analysis, product clustering, catalog design, and store layout.
• Sequence mining (categorical): discover sequences of events that commonly occur together, e.g. in a set of DNA sequences, ACGTC is followed by GTCA after a gap of 9, with 30% probability.
• One thing comes after another; for example, when a flu outbreak happens, gloves will be in short supply.
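The two percentages in the cookies-and-milk example correspond to the standard rule measures: confidence (the share of cookie buyers who also buy milk) and support (the share of all shoppers who buy both). A minimal sketch, using five hypothetical baskets whose numbers differ from the slide's:

```python
def support(baskets, items):
    """Share of all baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)

def confidence(baskets, antecedent, consequent):
    """Share of baskets containing the antecedent that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(baskets, both) / support(baskets, antecedent)

# Hypothetical shopping baskets
baskets = [
    {"cookies", "milk"},
    {"cookies", "milk", "bread"},
    {"cookies", "milk"},
    {"bread"},
    {"cookies", "bread"},
]

rule_support = support(baskets, {"cookies", "milk"})           # 3 of 5 baskets
rule_confidence = confidence(baskets, {"cookies"}, {"milk"})   # 3 of 4 cookie baskets
```

A rule such as "cookies → milk" is usually reported only when both measures exceed chosen minimum thresholds.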
Association rules

Clustering
• Clustering: a method of automatically assigning a set of objects to groups or segments based on their similarities.
• Unlike classification, in clustering the class labels are unknown.
• As the selected algorithm goes through the data set, identifying what things have in common based on their characteristics, the clusters are established.
• Clustering techniques include optimization.
• The goal of clustering is to create groups so that the members within each group have maximum similarity and the members across groups have minimum similarity.
Clustering techniques
• Cluster analysis is a means of identifying classes of items so that items in a cluster have more in common with each other than with items in other clusters.
• Example: create customer segmentation based on income, age, race, location, etc.
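Customer segmentation can be sketched with k-means in plain Python. The slides do not prescribe an algorithm, so k-means is an assumed choice, and the customers (age, income in thousands) are hypothetical. Starting from the first k points keeps the sketch deterministic; real implementations pick starting centroids more carefully.

```python
import math

def kmeans(points, k, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute."""
    centroids = list(points[:k])  # naive deterministic start: the first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of the cluster; keep the old one if a cluster is empty.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return clusters, centroids

# Hypothetical customers: (age in years, income in thousands)
customers = [(22, 25), (51, 90), (25, 30), (48, 85), (24, 28), (55, 95)]
clusters, centroids = kmeans(customers, k=2)
```

No labels are supplied: the two segments (younger with lower income, older with higher income) emerge from the distances alone, which is the difference from classification.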
Data Mining Techniques
• Outlier Analysis: find the record(s) that is (are) the most different from the other records, i.e., find all outliers. Outliers are data elements that cannot be grouped in a given class or cluster.
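One simple way to find records that differ most from the rest is a z-score test, sketched here with Python's statistics module. The threshold of 2 standard deviations and the transaction amounts are assumptions for illustration; other outlier definitions (for example, distance from cluster centroids) are equally valid.

```python
from statistics import mean, pstdev

def outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # all values identical: nothing stands out
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical transaction amounts with one unusual record
amounts = [52, 48, 50, 47, 53, 49, 51, 500]
unusual = outliers(amounts)
```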
Example of using Data Mining

Data Mining versus Statistics

Data Mining:
• Starts with a loosely defined discovery statement and uses all existing data (i.e. observational and secondary data) to discover novel patterns and relationships.
• Data sets in data mining are as “big” as possible.

Statistics:
• Starts with a well-defined proposition and collects sample data (i.e. primary data) to test the hypothesis.
• Statistics looks for the right size of data (if more data is available than the statistical analysis requires, a sample of the data is usually used).

Data Visualization
Take a break…
Watch a video
• How Facebook Data Mining, And Your Info, Is Influencing The 2016 Election | TODAY
  https://www.youtube.com/watch?v=i-rIYadXoms
Knowledge Discovery in Databases (KDD)
• Knowledge Discovery from Data (KDD) refers to the broad process of finding knowledge in data and emphasizes the "high-level" application of particular data mining methods.
• The unifying goal of the KDD process is to extract knowledge from data in the context of large databases, using data mining methods.
• KDD refers to the entire process of discovering useful knowledge from data.
• This process involves deciding what qualifies as knowledge by evaluating and possibly interpreting the patterns. It also includes the choice of encoding schemes, preprocessing, sampling, and projections of the data prior to the data mining step.
KDD: A Definition
• KDD is the automatic extraction of non-obvious, hidden knowledge from large volumes of data.
• 10^6–10^12 bytes: we never see the whole data set, so we put it in the memory of computers.
• Then run data mining algorithms.
• What is the knowledge? How to represent and use it?

Knowledge Discovery Process

Steps in KDD process
Knowledge Discovery Process
• The Knowledge Discovery in Databases process comprises a few steps leading from raw data collections to some form of new knowledge.
• The iterative process consists of the following steps:
  o Data cleaning: also known as data cleansing, a phase in which noisy and irrelevant data are removed from the collection and missing data are handled.
  o Data integration: at this stage, multiple data sources, often heterogeneous, may be combined in a common source.
  o Data selection: at this step, the data relevant to the analysis is decided on and retrieved from the data collection.
  o Data transformation: also known as data consolidation, a phase in which the selected data is transformed into forms appropriate for the mining procedure.
  o Data mining: the crucial step in which clever techniques are applied to extract potentially useful patterns. It searches for patterns of interest in a particular representational form or a set of such representations, including classification rules or trees, regression, and clustering.
  o Pattern evaluation: in this step, strictly interesting patterns representing knowledge are identified based on given measures.
  o Knowledge representation: the final phase, in which the discovered knowledge is visually represented to the user. This essential step uses visualization techniques to help users understand and interpret the data mining results.
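The steps above can be sketched as a chain of small Python functions, one per step. Everything here is hypothetical (the two sources, the field names, and the very simple "mining" step that just counts item popularity), and the final step prints text where a real system would use visualization.

```python
from collections import Counter

# Hypothetical raw records from two sources; None marks a missing value.
source_a = [{"customer": "C1", "item": "milk"}, {"customer": "C2", "item": None}]
source_b = [{"customer": "C1", "item": "Cookies "}, {"customer": "C3", "item": "milk"},
            {"customer": "C3", "item": "cookies"}]

def clean(rows):
    """Data cleaning: drop records with missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]

def integrate(*sources):
    """Data integration: combine several sources into one collection."""
    return [row for source in sources for row in source]

def select(rows, fields):
    """Data selection: keep only the fields relevant to the analysis."""
    return [{f: r[f] for f in fields} for r in rows]

def transform(rows):
    """Data transformation: normalise item names and group items per customer."""
    baskets = {}
    for r in rows:
        baskets.setdefault(r["customer"], set()).add(r["item"].strip().lower())
    return baskets

def mine(baskets):
    """Data mining: count how many customers bought each item."""
    return Counter(item for items in baskets.values() for item in items)

def evaluate(patterns, min_count):
    """Pattern evaluation: keep patterns that meet an interest measure."""
    return {item: n for item, n in patterns.items() if n >= min_count}

def present(patterns):
    """Knowledge representation: render the results for the user."""
    return [f"{item}: bought by {n} customers" for item, n in sorted(patterns.items())]

rows = integrate(clean(source_a), clean(source_b))
baskets = transform(select(rows, ["customer", "item"]))
report = present(evaluate(mine(baskets), min_count=2))
```

Because the process is iterative, a weak `report` would normally send the analyst back to an earlier function with different choices.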
3 methodologies of KDD model
• Fayyad et al. (Computer science)
  o E.g., WEKA
• SEMMA (SAS) (Statistics)
  o SAS Enterprise Miner
• CRISP-DM (SPSS, OHRA) (Business)
  o SPSS
Methodology of KDD – CRISP-DM
• CRISP-DM
  o Stands for Cross Industry Standard Process for Data Mining.
  o A non-proprietary, documented, and freely available data mining model.
  o It was developed by industry leaders with input from more than 200 data mining users and data mining tool and service providers.
  o It is an industry-, tool- and application-neutral model.
  o This model encourages best practices and offers organizations the structure needed to realize better, faster results from data mining.

Six phases in CRISP-DM

CRISP –DM (Elaborate view)
Six phases of CRISP-DM
1. Business Understanding
This initial phase focuses on understanding the project objectives and requirements from a business perspective, and then converting this knowledge into a data mining problem definition and a preliminary plan designed to achieve the objectives.
An example question is “What are the common characteristics of the customers we have lost to our competitors recently?”
2. Data Understanding
The data understanding phase starts with an initial data collection. It proceeds with activities:
 ▪ To get familiar with the data,
 ▪ To identify data quality problems,
 ▪ To discover first insights into the data, or
 ▪ To detect interesting subsets to form hypotheses for hidden information.
Six phases of CRISP-DM
3. Data Preparation
The data preparation phase covers all activities to construct the final dataset (data that will be fed into the modeling tool(s)) from the initial raw data.
Data preparation tasks are likely to be performed multiple times, and not in any prescribed order. Tasks include table, record, and attribute selection as well as transformation and cleaning of data for modeling tools.
4. Modeling
In this phase, various modeling techniques are selected and applied, and their parameters are calibrated to optimal values. Typically, several techniques can be applied to the same data mining problem type.
Six phases of CRISP-DM
5. Evaluate Results
Previous evaluation steps dealt with the accuracy and generality of the model. This step assesses the degree to which the model meets the business objectives.
This step also seeks to determine whether there is some valid business reason why the model is deficient. As another evaluation option, if time and budget permit, the model(s) can be tested on test applications in the real application.
6. Deployment
The creation of the model is not the end of the project. Though the purpose of the model is to increase knowledge of the data, the knowledge gained needs to be organized and presented in a way that the client can use.
KDD vs. DM
• DM is a component of the KDD process that is mainly concerned with the means by which patterns and models are extracted and enumerated from the data.
  o DM is quite technical.
• Knowledge discovery involves evaluating and interpreting the patterns and models to decide what constitutes knowledge and what does not.
  o KDD requires a lot of domain understanding.
• The terms DM and KDD are often used interchangeably.
• Perhaps DM is the more common term in the business world, and KDD in the academic world.

The end.
Video: Data Mining and Business Intelligence
https://www.youtube.com/watch?v=peSNJ5bfjX0

How data mining works?
https://www.youtube.com/watch?v=W44q6qszdqY
INTRODUCTION TO BUSINESS ANALYTICS
IBM 2201

What is BIG DATA?
Objectives
• Get an overview of big data that covers:
  o What is “Big Data”
  o Need for Big Data
  o Characteristics of Big Data: 4V of Big Data
  o Importance & Risks of Big Data
  o The structure of Big Data
  o What is Big Data Analytics? Benefits
  o Big Data Adoption
  o Applications

What is big data?
Examples of big data?
• https://www.youtube.com/watch?v=tkOwlXUaGMM&t=250s
What is Big Data?
• Definition: “Big Data” is data whose scale, distribution, diversity, and/or timeliness require the use of new technical architectures and analytics to enable insights that unlock new sources of business value.
• What makes data “big data”?
  o Huge volume of data (for instance, tools that can manage billions of rows and billions of columns)
  o Complexity of data types and structures, with an increasing volume of unstructured data (80-90% of the data in existence is unstructured), part of the Digital Shadow or “Data Exhaust”
  o Speed or velocity of new data creation

Copyright © 2011 EMC Corporation. All Rights Source: McKinsey May 2011 article Big Data: The next frontier for innovation, competition, and 8
What is big data?
• Big Data is a general term used to describe voluminous amounts of unstructured and semi-structured data, captured from sources such as social media, CCTV, sensors, and smart watches.
• The term Big Data is often used when speaking about petabytes and exabytes of data.
• A primary goal of looking at big data is to discover repeatable business patterns.
• For example, if a customer buys a specific type and color of cloth in a shop, only that purchase data would be available. However, the customer may have bought it for a particular occasion, and that occasion and its relationship to the purchase can be captured from external data the customer creates, for instance on social media.

Why “big data” is growing!
https://www.youtube.com/watch?v=9s-vSeWej1U
A Growing Interconnected and Instrumented World
• 12+ TBs of tweet data every day
• ? TBs of data every day
• 25+ TBs of log data every day
• 30 billion RFID tags today (1.3B in 2005)
• 4.6 billion camera phones worldwide
• 100s of millions of GPS-enabled devices sold annually
• 2+ billion people on the Web by end 2011
• 76 million smart meters in 2009… 200M by 2014

Videos
• Does social media have the power to change the world?
  https://www.youtube.com/watch?v=Uppg_2nGo54
• Are you ready for digitization?
  https://www.youtube.com/watch?v=ystdF6jN7hc

Need for Big Data
• Big Data can unlock significant value by making information transparent.
• As organizations create and store more transactional data in digital form, some leading companies are using their ability to collect and analyze big data to conduct controlled experiments to make better management decisions.
• Big Data allows ever-narrower segmentation of customers and therefore much more precisely tailored products or services.
• Sophisticated analytics can substantially improve decision making, minimize risks, and unearth valuable insights that would otherwise remain hidden.
• Big Data can be used to develop the next generation of products and services.

IBM’s Big Data: 4V’s

Characteristics of Big Data
• 1) Volume
  o Volume indicates the amount of data for analysis.
  o The characteristic most associated with big data, volume refers to the mass quantities of data.
  o Data volumes continue to increase at an unprecedented rate.
• 2) Velocity
  o Data in motion. The speed at which data is created, processed, and analyzed continues to accelerate.
  o Velocity impacts latency: the lag time between when data is created or captured and when it is accessible.
  o Data is continually being generated at a pace that is impossible for traditional systems to capture, store, and analyze.
Velocity (Speed)
• Data is being generated fast and needs to be processed fast.
• Late decisions → missed opportunities
• Examples
  o E-Promotions: based on your current location, your purchase history, and what you like → send promotions right now for the store next to you
  o Healthcare monitoring: sensors monitoring your activities and body → any abnormal measurements require immediate reaction

IBM Big Data and Analytics at work in Banking
https://www.youtube.com/watch?v=1RYKgj-QK4I
Characteristics of Big Data
• 3) Variety
  o Different types of data and data sources. Variety is about managing the complexity of multiple data types, including structured, semi-structured, and unstructured data.
  o Organizations need to integrate and analyze data from a complex array of both traditional and non-traditional information sources.
  o The explosion of sensors, smart devices, and social collaboration technologies generates data in countless forms, such as text, web data, tweets, sensor data, audio, video, and more.
• 4) Veracity
  o Veracity is about the uncertainty and correctness of data.
  o Organizations spend huge amounts of money because of data quality issues.
  o Decision makers are not confident in the data they use for decision making.
  o https://www.youtube.com/watch?v=wVAWAeOIIII

Veracity
• Establishing confidence in data is one of the biggest challenges.
• Uncertainty of data, or veracity, is a very important characteristic of Big Data.
• Nearly 27% of respondents to a research study said that they were unsure how much of their data was inaccurate.
• Poor data quality costs the USA economy around $3.1 trillion a year. 1 in every 3 business leaders does not trust the information they use to make decisions.

Types of big data
• Structured data
• Semi-structured data
• Unstructured data

https://www.youtube.com/watch?v=mnoqT8nihT8
Variety – Complex Data Structures
Data Growth is Increasingly Unstructured

More Structured

• Data containing a defined data type, format, structure
• Example: Transaction data and OLAP

• Textual data files with a discernible pattern, enabling parsing
• Example: XML data files that are self-describing and defined by an XML schema

• Data that has no inherent structure and is usually stored as different types of files
• Example: Text documents, PDFs, images, audio, pictures and video
Structured data

 Structured Databases

Structured data is data that has been organized into a formatted repository, typically a
database, so that its elements can be made addressable for more effective processing
and analysis.

It refers to data that has a defined length and format for big data.

Ex. numbers, dates, and groups of words and numbers called strings.

It’s usually stored in a database.
Semi-structured data
 Semi-structured data is a form of structured data that
does not conform with the formal structure of data
models associated with relational databases or other
forms of data tables, but nonetheless contains tags or
other markers to separate semantic elements and enforce
hierarchies of records and fields within the data.
 One example of semi-structured data is a set of web documents
that contain hyperlinks to other documents; it cannot
be modeled naturally in a relational data model because the
pattern of hyperlinks is not regular across documents.
 Examples: XML, log files
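As a rough illustration of how tags separate elements and impose a hierarchy of records and fields, the sketch below reads a small XML document with Python's standard library. The document itself, including the customer names, is invented for illustration and does not come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document, invented for illustration only.
DOC = """
<customers>
  <customer id="1">
    <name>Alice</name>
    <email>alice@example.com</email>
  </customer>
  <customer id="2">
    <name>Bob</name>
  </customer>
</customers>
"""

root = ET.fromstring(DOC)
records = [
    {
        "id": customer.get("id"),
        "name": customer.findtext("name"),
        # Fields may be missing: the second record has no <email> tag,
        # so findtext returns None for it.
        "email": customer.findtext("email"),
    }
    for customer in root.findall("customer")
]
print(records)
```

Unlike a relational table, the two records here do not share the same set of fields, yet the tags still make each field addressable.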

Semi-structured data
 For example, a clickstream log may look like:
2017-11-01 14:27:57,944-INFO : com.ovaledge.oasis.dao.DomainDaoImpl - RUNNING QUERY: Select * from domain where DOMAINTYPE='DATAAPP_CATEGORY';
 Here we can see the structure, but we need some rules to
find the details.
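One possible set of "rules" for such a log line is a regular expression. The sketch below is only an assumption about the layout (timestamp, level, logger name, message) inferred from the single sample line; a real log format may differ.

```python
import re

# Sample line taken from the slide, joined onto one line.
LOG_LINE = (
    "2017-11-01 14:27:57,944-INFO : "
    "com.ovaledge.oasis.dao.DomainDaoImpl - RUNNING QUERY: "
    "Select * from domain where DOMAINTYPE='DATAAPP_CATEGORY';"
)

# Assumed rule: timestamp, "-LEVEL : ", dotted logger name, " - ", free-text message.
LOG_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
    r"-(?P<level>[A-Z]+) : "
    r"(?P<logger>[\w.]+) - "
    r"(?P<message>.*)$"
)

def parse_log_line(line):
    """Return the named fields of one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None

fields = parse_log_line(LOG_LINE)
print(fields["timestamp"], fields["level"], fields["logger"])
```

Lines that do not follow the assumed layout return None, which is where additional rules would have to be added.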
Unstructured Data

It is text written in various forms, such as web pages,
emails, chat messages, PDF files, Word documents, etc.

Examples (figure labels): Applications, Music (audio), Movie (video), X-Rays, Pictures
Real-time/Fast Data – generates unstructured data

 Mobile devices (tracking all objects all the time)
 Social media and networks (all of us are generating data)
 Scientific instruments (collecting all sorts of data)
 Sensor technology and networks (measuring all kinds of data)

 Progress and innovation are no longer hindered by the ability to collect data,
 but by the ability to manage, analyze, summarize, visualize, and discover knowledge
from the collected data in a timely manner and in a scalable fashion
Big Data Analytics Adoption


Big data analytics is the process of examining large
amounts of data of different types (big data) to
uncover hidden patterns, unknown correlations and
other useful information.
 Such information can provide competitive advantages over
rival organizations and result in business benefits, such as
more effective marketing and increased revenue
Goal: to help companies make better business decisions by
enabling data scientists and other users to analyze huge
volumes of transaction data as well as other data sources that
may be left untapped by conventional business intelligence
(BI) programs.

https://www.youtube.com/watch?v=LtScY2guZpo
Big Data Adoption

 The term “big data adoption” is used here to
represent a natural progression of the data,
sources, technologies and skills that are
necessary to create a competitive advantage in
the globally integrated marketplace.
 The four main stages of big data adoption and
progression are Educate, Explore, Engage and
Execute.
Big Data Analysis Adoption Structure
1. Educate – build a base of knowledge
 Most organizations in this stage are studying the potential
benefits of big data technologies and analytics, and are trying
to understand how big data can help address important
business opportunities in their own industries or markets.
2. Explore – define the business case and roadmap
 In this stage, organizations get down to formal in-house
discussions about how to use big data to solve important
business challenges.
3. Engage – embrace big data
 Organizations begin to prove the business value of big data, as
well as perform an assessment of their technologies and skills.
Big Data Analysis Adoption Structure

4. Execute – implement big data at scale
 In the Execute stage, big data and analytics capabilities are
more widely operationalized and implemented within the
organization.
 The small number of organizations in the Execute stage is
consistent with the implementations we see in the
marketplace.
 Importantly, these leading organizations are leveraging big
data to transform their businesses and thus are deriving the
greatest value from their information assets.
Benefits of Big Data Analytics


 Benefits of Big Data Analytics
 Anything involving customers could benefit from big data analytics.
 This includes better-targeted social-influencer marketing, customer-
base segmentation, and recognition of sales and market opportunities.

 Business intelligence in general can benefit from big data analytics
 This could result in more numerous and accurate business insights, an
understanding of business change, better planning and forecasting,
and the identification of root causes of cost.
 Specific analytic applications are likely beneficiaries of big data
analytics
 big data analytics might help automate decisions for real-time business
processes such as loan approvals or fraud detection.

https://www.youtube.com/watch?v=QvyQFXbgW2c
Barriers of Big Data Analytics

 Inadequate staffing and skills are the leading
barriers to big data analytics; without them, big data
cannot be made usable for end users.
 A lack of business support can hinder a big
data analytics program in terms of both cost and
a compelling business case.
 Problems with the current database software,
such as a lack of database analytics, can
be barriers to big data analytics.
Risks of Big Data
 Because we gather data of different types from different sources,
to the extent that we often don’t know exactly what it contains,
big data carries its own special risks.
 You don’t know whether all or just a tiny piece of it might be
essential to corroborate your compliance with some
government regulation.
 You can’t have a perfect predictive model of how the future
business and regulatory environments are going to evolve.
 But you can have a comprehensive data-risk mitigation program
that will help you deal with new challenges as they emerge.

The danger of Big Data


https://www.youtube.com/watch?v=y8yMlMBCQiQ
Real-Time Analytics/Decision Requirement

Figure labels (Customer at the center; Influence Behavior):
 Product Recommendations that are Relevant & Compelling
 Learning why Customers Switch to competitors and their offers; in time to Counter
 Friend Invitations to join a Game or Activity that expands business
 Improving the Marketing Effectiveness of a Promotion while it is still in Play
 Preventing Fraud as it is Occurring & preventing more proactively

What is Real time Analytics?


https://www.youtube.com/watch?v=ioHwEsARPW
2

Applications of Data Analytics – Use Case
 Telecommunication services
 Problem:
 Legacy systems used to gain insights from internally generated
data face issues of high storage costs, long data loading times, and
long administration processes.
 How big data analytics can help:
 DR processing
 Churn prediction
 Geomapping / marketing
 Network monitoring
Applications of Data Analytics – Use Case
 Financial Service
 Problem:
 Manage several petabytes of data, growing at 40-100%
per year, under increasing pressure to prevent fraud and comply with
regulations.
 How big data analytics can help:
 Fraud detection
 Risk management
 360° View of the Customer
Applications of Data Analytics – Use Case
 Transportation services
 Problem:
 Traffic congestion has been increasing worldwide as a result of
increased urbanization and population growth, reducing the efficiency
of transportation infrastructure and increasing travel time and fuel
consumption.
 How big data analytics can help:
 Real-time analysis of weather and traffic congestion data streams to
identify traffic patterns, reducing transportation costs.
Applications of Data Analytics – Use Case
 Healthcare and Life Sciences
 Problem:
 Vast quantities of real-time information are starting to come from
wireless monitoring devices that postoperative patients and those with
chronic diseases are wearing at home and in their daily lives.
 How big data analytics can help:
 Epidemic early warning
 Intensive Care Unit and remote monitoring

Video
Demo: IBM Big Data and Analytics at work in Banking
https://www.youtube.com/watch?v=ioHwEsARPWI

What is Hadoop?
https://www.youtube.com/watch?v=4DgTLaFNQq0

What Is Big Data? & How Big Data Is Changing The World!
https://www.youtube.com/watch?v=G_e3r4S2g80
Summary
 Terminologies of Big Data
 The need for Big Data
 The Characteristics of Big Data (3V and 4V)
 Types of Data
 Big Data Analytics Adoption

IBM 2201
INTRODUCTION TO BUSINESS ANALYTICS

Chapter 2
Key Roles and Responsibilities

Overview

This unit introduces the key roles
working in the business
analytics environment. It further
outlines the responsibilities of
each role and their contribution to
the analytics team.
Objectives
 Get an overview of roles and responsibilities,
which covers:
 The key drivers of analytics in an organization and the
engagement of the roles
 The role and responsibilities of the Data Scientist
 The role and responsibilities of the Business Analyst
An Overview of the Analytics Ecosystem
 The objectives of this section are:
• To identify various sectors of the analytics industry,
provide a classification of different types of industry
participants and illustrate the types of opportunities
that exist for analytics professionals.
• To be aware of organizations and new offerings and
opportunities in sectors allied with analytics
An Overview of the Analytics Ecosystem (1 of 3)

• Figure 1.13 Analytics Ecosystem


Study each section carefully…
• Study each of the industry sections and get to
know some examples of players in each section

Answer the following questions

1. Give examples of companies in each of
the 11 types of players
2. Which companies are dominant in
more than one category?
3. Is it better to be the strongest player in
one category or to be active in multiple
categories?

What does a business analyst do?

Day in the life of a Business Analyst
https://www.youtube.com/watch?v=eFDlUgLxtGM
Users of Business Analytics
 In the past, all departments relied on IT teams
and data analysts to prepare the data and churn
out reports for their BA projects.
 This is no longer the case: as technology has evolved,
businesses now have a choice of
solutions offering analytics and
visualization capabilities that open up BA
to everyone.

Copyright © 2011 EMC Corporation. All Rights


Different types of business analytics users
 1) Data Analyst
 They drill into the data to look for fresh insights that can be
used to underscore the business strategy.
 The role includes documenting all business data, identifying
patterns and creating reports and dashboards that
support the decision-making process.
 2) The Executive
 The CEO drives the company’s success by
improving operational efficiency and constantly looking for
ways to reduce costs.
 BA gives the CEO an organizational overview to spot
trends and business insights that can support business growth,
innovation and operational efficiency.

Different types of business analytics users
 3) Business user
 This type of user can be anyone across the organization.
There are 2 types of business users:
i) Casual BA user
- Makes use of dashboards to analyze predefined sets of
data
ii) Power user
- Has the capability of working with complex data sets.
- A power user is often a manager who is looking for ways to
help a department operate more efficiently and more
effectively.
- They can use BA tools such as reporting to analyze
business activities.

Different types of business analytics users
 4) The IT team
 They maintain the infrastructure and give
departments the tools that allow them to fulfill their
own data requests.
 They work closely with departments and the
business to ensure business users are getting the
most from data analytics.
 They ensure data governance and security.

Business Analytic Team
 Business Analyst
 A business analyst is someone who analyzes an organization or
business domain (real or hypothetical) and documents its
business, processes or systems, assessing the business
model or its integration with technology.
 Uses BI tools and applications to understand business
conditions and drive business processes
 Skills: Business Analysts need to have a baseline
understanding of some core skills: statistics, data munging, data
visualization, exploratory data analysis.
 Tools: Microsoft Excel, SPSS, SPSS Modeler, SAS, SAS Miner,
SQL, Microsoft Access, Tableau, SSAS.
Business Analytic Team
 Data Scientist
 Data scientists are involved with gathering data, massaging it
into a tractable form, making it tell its story, and presenting that
story to others

 Uses advanced algorithms and interactive exploration tools to
uncover non-obvious patterns in data
 Data scientists apply statistics, machine learning and analytic
approaches to solve critical business problems.
 Their primary function is to help organizations turn their volumes of big
data into valuable and actionable insights.
 In comparison with ‘data analysts’, in addition to data analytical skills,
Data Scientists are expected to have strong programming skills, an
ability to design new algorithms, handle big data, with some expertise
in the domain knowledge.
Business Analytics Team – Data Engineer
 Data Engineers are the data professionals who prepare the “big data”
infrastructure to be analyzed by Data Scientists.
 They are software engineers who design, build, and integrate data from
various resources, and manage big data. Then, they write complex
queries on that data and make sure it is easily accessible and works smoothly;
their goal is to optimize the performance of their company’s big data
ecosystem.
 They might also run some ETL (Extract, Transform and Load) on top of
big datasets and create big data warehouses that can be used for
reporting or analysis by data scientists. Beyond that, because Data
Engineers focus more on the design and architecture, they are typically
not expected to know any machine learning or analytics for big data.
 Skills: Hadoop, MapReduce, Hive, Pig, Data streaming, NoSQL, SQL,
programming.
 Tools: DashDB, MySQL, MongoDB, Cassandra
BA team – BI Developers
 Business Intelligence Developers are data experts that interact more
closely with internal stakeholders to understand the reporting needs,
and then to collect requirements, design, and build BI and reporting
solutions for the company.
 They have to design, develop and support new and existing data
warehouses, ETL packages, cubes, dashboards and analytical reports.
 Additionally, they work with databases, both relational and
multidimensional, and should have great SQL development skills to
integrate data from different resources.
 They use all of these skills to meet the enterprise-wide self-service
needs. BI Developers are typically not expected to perform data
analyses.
 Skills: ETL, developing reports, OLAP, cubes, web intelligence,
business objects design.
 Tools: Tableau, dashboard tools, SQL, SSAS, SSIS and SPSS Modeler.

Demands for analytical expertise

According to the McKinsey Global Institute:

Big Data talent is a critical issue. By 2018, the United
States alone could face a shortage of 140,000 to
190,000 people with deep analytical skills.
But companies need to spend time upfront to
identify the kinds of roles they need to make the Big
Data machine run rather than just rushing to recruit
math and science jocks.
Summary

 You have been introduced to
 The key roles and responsibilities of the data analytics team
 The key business drivers towards business
analytics
 A review of the current business analytics team
proposed.

TYPES OF INFORMATION SYSTEMS

CHAPTER 4

Objectives
 Identify and discuss the different types of Information Systems
and their roles in supporting business operations

Types of Information Systems

 Types of ISs used in an organization:
 Enterprise systems:
 Transaction Processing Systems (TPS)
 Enterprise Resource Planning (ERP)
 Management Information Systems (MIS)
 Decision-making systems:
 Supply Chain Management (SCM) systems
 Customer Relationship Management (CRM) systems
 Decision Support Systems (DSS)

Transaction Processing Systems (1)

 Transaction processing system (TPS): a system
that captures daily transactions from business
activity and processes detailed data to update
records about the business operations of the
organization.
 This system captures input data, processes it and
stores it in a database for further processing.
 Input: basic transactions such as customer orders, purchase
orders, receipts, time cards, invoices and customer payments.
 Process: data collection, data editing, data correction, data
manipulation and data storage
 The processed data is stored in the computer system for
further processing by higher-level information systems
such as MIS.
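The processing steps named above (collection, editing, correction, manipulation, storage) can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the order fields, the validation rules and the `orders` table are hypothetical, and an in-memory SQLite database stands in for the TPS database.

```python
import sqlite3

def edit(order):
    """Data editing: return a list of problems found in one raw transaction."""
    problems = []
    if order["quantity"] <= 0:
        problems.append("quantity must be positive")
    if order["unit_price"] < 0:
        problems.append("unit_price must not be negative")
    return problems

def process(raw_orders, conn):
    """Edit, manipulate and store each order; return the rejected ones."""
    rejected = []
    for order in raw_orders:                     # data collection
        problems = edit(order)                   # data editing
        if problems:
            rejected.append((order, problems))   # sent back for data correction
            continue
        total = order["quantity"] * order["unit_price"]   # data manipulation
        conn.execute(                            # data storage
            "INSERT INTO orders (customer, quantity, unit_price, total) "
            "VALUES (?, ?, ?, ?)",
            (order["customer"], order["quantity"], order["unit_price"], total),
        )
    conn.commit()
    return rejected

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer TEXT, quantity INTEGER, unit_price REAL, total REAL)"
)

# Hypothetical transactions, invented for illustration only.
raw_orders = [
    {"customer": "A", "quantity": 2, "unit_price": 5.0},
    {"customer": "B", "quantity": 0, "unit_price": 3.0},  # fails editing
]
rejected = process(raw_orders, conn)
stored = conn.execute("SELECT customer, total FROM orders").fetchall()
print(stored, len(rejected))
```

The stored rows are what a higher-level system such as an MIS would later summarize into reports.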
Transaction Processing Systems (2)

 Examples of TPS:
 Point-of-sale machines: record grocery sales transactions
 ATM machines: record deposits, withdrawals and money transfers
 Purchase order systems: record the purchase of goods in the company
 Online TGV movie ticketing system
 Egenting Hotel reservation system
 Characteristics of TPS:
 Support daily and routine operations
 Work with large amounts of input and output data
 Involve little decision making
 Supply data to higher-level management systems such as MIS

Enterprise Resource Planning

 Enterprise resource planning (ERP) is a set of
integrated programs that manage a company’s vital
business operations for an entire multisite organization.
 It integrates internal and external management
information across an entire organization,
embracing finance/accounting, manufacturing, sales
and service, customer relationship management, etc.
 ERP systems automate this activity with an
integrated software application.
 It offers integrated software from a single vendor to
meet the organization’s needs.

An ERP integrates business processes and the ERP database

Management Information System (MIS)

 An organized collection of people, procedures, software,
databases, and devices that provides managers and decision
makers with detailed information processed from daily
transaction data.
 Each department uses a different type of MIS to help
its managers make better decisions from the information
produced.
 MIS support structured decisions at the operational and
management control levels. However, they are also useful for
planning purposes of senior management staff.
 MIS are generally reporting and control oriented. They are
designed to report on existing operations and therefore to help
provide day-to-day control of operations.

Management Information System (MIS)

 Characteristics of MIS:
 Provide reports with fixed and standard formats
 Produce hard-copy and soft-copy reports
 Use internal data stored in the computer system
 Allow users to develop their own custom reports
 Require user requests for reports developed by systems personnel

Supply Chain Management Systems (1)

 SCM helps determine what supplies are required,
what quantities are needed to meet customer
demand, how the supplies are to be processed
(manufactured) into finished goods and services,
and how the shipment of supplies and products to
customers is to be scheduled, monitored and
controlled.
 Mainly, it controls how information, stock and money flow
along the chain supplier → manufacturer → retailer → consumer.
 Supply chain management (SCM) systems:
 Information systems that automate the flow of information between a
firm and its suppliers in order to optimize the planning, sourcing,
manufacturing, and delivery of products and services.

What is supply chain?

 Watch this!!

What is Supply Chain Management?

 Watch this!!!

Understanding the Basics of Supply Chain Analytics
https://www.supplychain247.com/article/understanding_the_basics_of_supply_chain_analytics

Supply Chain Management Systems (2)
 For example, in an automotive company
 SCM can identify key supplies and parts → negotiate
with vendors for the best prices and support → make
sure all supplies and parts are available to manufacture
cars → send finished products to dealerships around
the country when they are needed.
 The Internet can be used as a platform to negotiate for good prices and
service from different suppliers.
 The Internet is a platform that gathers suppliers from different countries
all over the world, so it is possible to find a supplier that can offer
a better price.

SCM system

Customer Relationship Management Systems

 Customer relationship management (CRM) program:
software that keeps customer profiles and
transaction histories in order to analyze customers’
buying habits and know them better, so as to
retain loyal customers.

 CRM can help a company collect customer data,
contact customers, educate customers on new
products and actively sell products to existing and
new customers.

What is CRM?

Why B2B companies are using CRM + analytics for growth?
https://www.businessanalyze.com/en/how-to-use-crm-analytics-infograph

Customer Relationship Management

CRM

 Goal: to understand and anticipate the needs of current
and potential customers to increase customer retention
and loyalty while optimizing the way products and
services are sold.
 The function of the software is to capture data wherever
customers have contacted the company and store it in the
system so that the company knows the customers’ needs.
 Purposes of implementing CRM:
 Improve marketers’ ability to interact with customers via multiple
channels such as email, SMS, phone, Web and calling.
 Automate the process for developing customer quotes through automatic mail
replies
 Improve communications with customers through messenger
contact
 Provide support for an anticipated increase in customers through
email, Web searching and phone enquiries

Decision Support Systems (1)

 Definition:
 It is a computer-based information system designed to help knowledge
workers select one of many alternative solutions to a problem.
 The success of an organization depends on the quality of the decisions
its employees make.
 When decision making involves large amounts of information and a
lot of processing, computer-based systems can make the process
efficient and effective.

Decision Support Systems (2)

 For example:
 Helps answer “What if?” questions, e.g. what if we double the work shift
and cut down the number of staff?
 Such questions seek answers like: “This is how this action will impact our
revenue, or our market share or our costs.”
 DSSs are programmed to process raw data, make comparisons and
generate information to help in financial investment, marketing strategy,
credit approval, etc.
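A "What if?" question of this kind can be sketched as a comparison between a baseline and a scenario. The cost model below (cost = shifts × staff × wage × days) and all of its numbers are hypothetical assumptions chosen only to show the mechanics; a real DSS would use the organization's own data and models.

```python
def monthly_cost(shifts_per_day, staff_per_shift, wage_per_shift, days=30):
    """Labour cost for one month under a simple, hypothetical cost model."""
    return shifts_per_day * staff_per_shift * wage_per_shift * days

def what_if(baseline, scenario):
    """Compare a scenario against the baseline and report the change in cost."""
    base_cost = monthly_cost(**baseline)
    new_cost = monthly_cost(**scenario)
    return {"baseline": base_cost, "scenario": new_cost, "change": new_cost - base_cost}

baseline = {"shifts_per_day": 1, "staff_per_shift": 10, "wage_per_shift": 100}
# What if we double the work shift and cut the staff per shift from 10 to 6?
scenario = {"shifts_per_day": 2, "staff_per_shift": 6, "wage_per_shift": 100}

result = what_if(baseline, scenario)
print(result)
```

With these made-up figures the scenario costs more than the baseline, which is the kind of impact statement the slide describes.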

The End.

IBM 2201
INTRODUCTION
TO
BUSINESS ANALYTICS

Chapter 1
Overview
 This unit introduces the concept of
Business Analysis and Optimization
and provides an understanding of the
role of Business Analysis in
Business Optimization. It gives a
glimpse of the reference architecture
in Business Analytics.
Objectives
 Get an overview of this chapter, which covers:
 What is BA
 Objective of BA
 Features of BA
 Capabilities of BA
 3 types of BA techniques

Some businesses operate in the same way that they always have

Source: http://blogs.smithsonianmag.com/history/2011/12/emperor-wang-mang-chinas-first-socialist/han-merchants

Some businesses operate in the same way that they always have

Source: http://milkmiracle.net/2010/12/18/acland-in-hindoostan

Some businesses now collect and examine some data

Source: http://www.allposters.com/-sp/Business-Meeting-1960s-Posters_i6848810_.htm

Some businesses are integrating data-driven
algorithmic recommendations into
their decision-making processes
 These organizations are adopting business analytics
 Business analytics is all about the decisions that go
into running a business in the areas of:
 Pricing (for example, setting prices for consumer and industrial
goods, government contracts and maintenance contracts)
 Customer segmentation (for example, identifying and targeting
key customer groups in retail, insurance, and credit card
industries)
 Location (for example, finding the best location for bank
branches and ATMs)

Let’s have a look at an example…

What is business analytics?
 Many practitioners define analytics as the process
of developing actionable decisions or
recommendations for actions based on insights
generated from historical data
 Definition:
 Analytics represents the combination of computer
technology, methodologies from applied mathematics,
applied probability, applied statistics, computer science,
and signal processing for using data to gain insight into
business performance and drive business planning (to
solve real problems in commerce)

Analytics overview

 Generally, BA uses tools and techniques to
turn data into meaningful business insights
 Examples of tools: statistical models, regression,
machine learning, ANOVA tests, etc.
 Data come from Facebook, Instagram, Google,
databases, spreadsheets, sensors, CCTV, smart watches

Features of BA

 It is a scientific reading of information – meaning the
use of scientific methods and tools in reading the
information.
 It is always in relation to a commercial enterprise – the
analysis of data is always related to commerce and
is used in business decision making.
 It is about hidden information – meaning that the
analysis should aim at studying, seeing,
revealing or unearthing what is not obvious from the
given set of information and cannot be
seen by viewing the information directly. The main
focus here is on “analysis” rather than study.

What are business decisions?
 How likely is client X to buy product Y?
 Which product should we recommend next?
 What is a “realistic” view of opportunity by client?
 Are there accounts where there is significant untapped
revenue?
 Which clients are “at risk” of going to our competitors?
 What kind of salesforce do we need to deliver on our targets?
 Which units of products are not performing at par?
 Which sellers might miss their sales quota?
 How can we align sellers with client opportunities to achieve
maximum revenue impact?

What are business decisions? (2)
 Who are the influencers for this product in the marketplace?
 What is the optimal marketing campaign to deploy?
 Should we hire this employee?
 Which employees are at risk of voluntarily leaving the
company?
 How many employees do we need to hire now so that we can
achieve our production goals six months from now?
 What kind of raises or promotions should we offer to retain
our talented employees?
 Will outsourcing help reduce the company’s operating costs?
 Which company should we acquire to expand our customer
base?

Scenario of business analytics
 A couple of examples
 If the business would like to keep selling expenses the same but
increase revenue, business analytics can be used to
recommend changes in the deployment of the salesforce so that
sales teams are better matched to customers and can thus sell
more to customers (higher revenue)
 If workers of the company are voluntarily leaving for higher
paying jobs and there are significant salary premiums and
onboarding costs associated with hiring replacement workers,
then analytics-recommended targeted raises can retain existing
workers, thereby avoiding the high costs associated with hiring

Benefits of Business Analytics
 Take advantage of all types of data and information, using
different views to form a complete picture of your business
environment
 Empower employees to explore and interact with information
and deliver insights to others
 Streamline decisions by individuals or automated systems
based on analytics results
 Provide insights from various perspectives and time horizons,
whether based on historic reporting, real-time analysis, or
predictive modeling.
 Data can reveal more information, e.g. when people are
demanding more “papaya leaves”, there must be
something going on (dengue)

Analytics overview
 INFORMS (Institute for Operations Research and
the Management Sciences) has proposed 3 types of
analytics.
 It suggests that these three are somewhat
independent steps and that one type of analytics
application leads to another

Types of Business Analytics techniques
1) Descriptive Analytics
 Descriptive analytics provides information about the past
state or performance of a business and answers: “What has
happened?”
 First, this involves consolidating data sources and
making all relevant data available in order to generate appropriate
reports, queries, alerts and trends using various reporting
tools and techniques.
 It needs data that is already stored as part of a data
warehouse.
 The key technology in this area is visualization;
tools such as Tableau and Dundas BI allow users to develop powerful
insights into the operations of the organization.

Types of Business Analytics techniques
1) Descriptive Analytics
 Examples:
 Total stock in inventory
 Average dollars spent per customer
 Year-over-year change in sales
 Generate reports that provide historical insights
regarding the company’s production, financials,
operations, sales, inventory and customers
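The first three examples are simple aggregations over stored data. The sketch below computes them in plain Python; the inventory and sales figures are invented for illustration, and a real report would query the data warehouse instead.

```python
from collections import defaultdict

# Hypothetical data, invented for illustration only.
inventory = {"widget": 120, "gadget": 80}
sales = [  # (year, customer, amount in dollars)
    (2022, "alice", 100.0),
    (2022, "bob", 300.0),
    (2023, "alice", 250.0),
    (2023, "bob", 250.0),
]

# Total stock in inventory
total_stock = sum(inventory.values())

spend_by_customer = defaultdict(float)
sales_by_year = defaultdict(float)
for year, customer, amount in sales:
    spend_by_customer[customer] += amount
    sales_by_year[year] += amount

# Average dollars spent per customer
average_spend = sum(spend_by_customer.values()) / len(spend_by_customer)

# Year-over-year change in sales, as a fraction of the earlier year
yoy_change = (sales_by_year[2023] - sales_by_year[2022]) / sales_by_year[2022]

print(total_stock, average_spend, yoy_change)
```

All three numbers describe what has already happened; none of them says anything about the future, which is the boundary between descriptive and predictive analytics.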

Take a break..
Have a look at this scenario

Types of Business Analytics techniques
2) Predictive Analytics
 It aims to determine what is likely to happen in the
future
 This analysis is based on statistical techniques to
predict what will happen in the future so that you can
make well-informed decisions and improve business
outcomes.
 Predictive analytics relies on real-time events and
alerts to suggest actions.
 It uses simulation models to suggest what could
happen in the future and answers: “What could
happen?”. Predictions are based on probabilities and so
cannot be 100% accurate.

Types of Business Analytics techniques
2) Predictive Analytics
 A number of techniques are used in developing
predictive analytical applications:
• Clustering algorithms – to segment customers into
different clusters to be able to target specific
promotions to them.
• Association mining techniques – to estimate
relationships between different purchasing behaviors
(if a customer buys one product, what else is the
customer likely to purchase)
 Any product search on Amazon.com results in the
retailer also suggesting other similar products that the
customer may be interested in
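The idea behind association mining can be shown with two basic measures, support and confidence, computed from co-occurrence counts. This is a minimal sketch rather than a full algorithm such as Apriori, and the shopping baskets are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented for illustration only.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "cereal"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Count each unordered pair of items once per basket.
    pair_counts.update(frozenset(p) for p in combinations(sorted(basket), 2))

def support(a, b):
    """Fraction of all baskets that contain both items."""
    return pair_counts[frozenset((a, b))] / len(baskets)

def confidence(antecedent, consequent):
    """Estimate P(consequent in basket | antecedent in basket)."""
    return pair_counts[frozenset((antecedent, consequent))] / item_counts[antecedent]

print(support("bread", "butter"))
print(confidence("butter", "bread"), confidence("bread", "butter"))
```

Confidence is not symmetric: in this toy data every basket with butter also has bread, but not every basket with bread has butter, so a recommendation in one direction is better supported than in the other.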
Types of Business Analytics techniques
2) Predictive Analytics
 Examples:
 Forecasting customer behavior
 Using purchasing patterns to identify trends in sales
activities
 Using an application such as the credit score used by most
financial services to determine the probability of
customers making future credit payments on time
 Understanding how sales might close at the end of the
year
 Predicting what items customers will purchase
together

Take a break..
Have a look at this scenario

# Types of Analytical Technologies
3) Prescriptive Analytics
 The goal is to provide a decision or
recommendation for a specific action: it recognizes
what is going on and the likely forecast, and then
makes decisions to achieve the best performance.
 These recommendations can be in the form of a
specific yes/no decision for a problem or a specific
amount (for example, the price for a certain item, the
budget for a certain project, or the airfare to charge).
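
Both kinds of recommendation (a specific amount and a yes/no decision) can be illustrated with a small sketch that picks the most profitable price from a list of candidates. The demand curve, unit cost, and fixed cost are made-up assumptions, and the function names are hypothetical; a real prescriptive system would take its forecast from a predictive model.

```python
def expected_demand(price):
    """Hypothetical demand forecast: 100 units at price 0, falling 5 units per dollar."""
    return max(0.0, 100.0 - 5.0 * price)


def profit(price, unit_cost=4.0):
    """Expected profit at a given price under the assumed demand curve."""
    return (price - unit_cost) * expected_demand(price)


def recommend_price(candidates, unit_cost=4.0):
    """Specific-amount recommendation: the candidate price with the highest expected profit."""
    return max(candidates, key=lambda price: profit(price, unit_cost))


def should_launch(price, unit_cost=4.0, fixed_cost=200.0):
    """Yes/no recommendation: launch only if expected profit exceeds the fixed cost."""
    return profit(price, unit_cost) > fixed_cost


candidate_prices = [float(p) for p in range(5, 16)]  # $5 to $15
best = recommend_price(candidate_prices)
print(best, profit(best), should_launch(best))
```

The forecast (expected demand) is the predictive part; choosing the action that maximizes the outcome is what makes the analysis prescriptive.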

# Examples


# Business Analytic Capabilities
 Business intelligence
 Business intelligence helps users to explore all types of information from
all angles and to assess the current business situation to gain a deeper
understanding of the patterns that exist in the data.
 Predictive analytics
 Predictive analytics helps you to uncover unexpected patterns and associations from all data
within your organization and to develop predictive models to guide
interactions.
 Performance management – BA can improve the performance of
the company in the following ways:
 Planning, analysis, and forecasting to automate budgeting and to perform driver-based
forecasting, “what-if” scenario modeling, and multidimensional profitability analysis
 Profitability modeling and optimization to accelerate profitability analysis with an
organization-wide approach that joins financial, operational, and strategic planning
 Performance reporting and scoring that helps you align strategy with execution, communicate
goals, and monitor your performance against targets

# Business Analytic Capabilities

 Analytical decision management focuses on the development and
deployment of decision services, bringing intelligence, predictive
insight, and optimization into repeatable decisions.
 It empowers workers and systems to gain the most from every
transaction and to improve outcomes by using the following tools:
 Combined and integrated predictive models, rules, and decision logic to
deliver recommended actions and extended predictive analytics
 What-if simulations to accommodate changing conditions based on
incoming data
 A user interface that supports intuitive development, optimization, and
implementation of targeted configurations, decisions, and content
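
A repeatable decision that combines a predictive model with business rules, together with a simple what-if simulation, might be sketched as follows. The scoring formula, thresholds, customer fields, and action names are all hypothetical and chosen only for illustration; they do not describe any specific decision management product.

```python
def churn_score(months_inactive, support_tickets):
    """Hypothetical stand-in for a predictive model: a risk score between 0 and 1."""
    raw = 0.1 * months_inactive + 0.05 * support_tickets
    return min(1.0, raw)


def recommend_action(customer, retention_budget):
    """Combine the predictive score with business rules to pick one action."""
    score = churn_score(customer["months_inactive"], customer["support_tickets"])
    # Rule 1 (assumed): VIP customers get a personal call at a lower risk threshold.
    if customer["is_vip"] and score >= 0.3:
        return "call_customer"
    # Rule 2 (assumed): other at-risk customers get a discount while budget remains.
    if score >= 0.5 and retention_budget > 0:
        return "send_discount"
    return "no_action"


def what_if(customer, budgets):
    """What-if simulation: the recommended action under each budget scenario."""
    return {budget: recommend_action(customer, budget) for budget in budgets}


# Hypothetical customer record, invented for illustration only.
customer = {"months_inactive": 4, "support_tickets": 3, "is_vip": False}
print(what_if(customer, [0, 100]))
```

Because the same function is applied to every customer, the decision is repeatable, and rerunning it under different conditions shows how the recommendation would change.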
# 4 Key Business Drivers for Analytics
Current Business Problems Provide Opportunities for Organizations to Become
More Analytical & Data Driven

| # | Driver | Examples |
|---|--------|----------|
| 1 | Desire to optimize business operations | Sales, pricing, profitability, efficiency |
| 2 | Desire to identify business risk | Customer churn, fraud, default |
| 3 | Predict new business opportunities | Upsell, cross-sell, best new customer prospects |
| 4 | Comply with laws or regulatory requirements | Anti-Money Laundering, Fair Lending, Basel II |

Copyright © 2011 EMC Corporation. All Rights


Summary
 You have been introduced to the module
learning outcome with a general overview of:
 The concept of Business Analytics
 The purpose of Business Analytics
 BA users & options and the roles involved
 An introduction to the key business drivers
towards Business Analytics