The decision-making cycle time is reduced, while
problems are increasingly complex with a growing
number of internal and external variables.
Managers need support systems for facilitating quick
decision making in a complex environment.
Decision support systems (DSS).
Building a Stock Decision Support Tool in Microsoft Excel
2010, https://www.youtube.com/watch?v=iXfxxHx21so
Data warehouse
A data warehouse is a database that provides
support for decision making
A data warehouse database must be:
Integrated
Subject-oriented
Time-variant
Non-volatile
to present a unified view of the data to the users.
Time-variant because data in the warehouse is only accurate
and valid at some point in time or over some time interval. The
time-variance of the data warehouse is also shown in the
extended time that the data is held, the implicit or explicit
association of time with all data, and the fact that the data
represents a series of snapshots.
Non-volatile as the data is not updated in real time but is
refreshed from operational systems on a regular basis. New
data is always added as a supplement to the database, rather
than a replacement. The database continually absorbs this
new data, incrementally integrating it with the previous data
Figure 13.3
A Data Warehouse Framework and Views
The Data Warehouse
Twelve Rules That Define a Data Warehouse
1. The Data Warehouse and operational
environments are separated.
2. The Data Warehouse data are integrated.
3. The Data Warehouse contains historical data
data sets. The operational environment is characterized
by numerous update transactions to a few data entities
at a time.
10. The Data Warehouse environment has a system
that traces data resources, transformation, and storage.
11. The Data Warehouse’s metadata are a critical
component of this environment. The metadata identify
and define all data elements. The metadata provide the
source, transformation, integration, storage, usage,
relationships, and history of each data element.
12. The Data Warehouse contains a charge-back
mechanism for resource usage that enforces optimal use
of the data by end users.
Architecture of Web-Based Data
Warehousing
OLAP vs. OLTP
We can divide IT systems into transactional (OLTP) and analytical
(OLAP).
In general we can assume that OLTP systems provide source data
to data warehouses, whereas OLAP systems help to analyze it.
OLTP
OLTP deals with recording the real-time transactions
of operational systems, such as the transactions
that happen in e-commerce and banking ATM
systems.
OLTP (On-line Transaction Processing) is
characterized by a large number of short on-line
transactions (INSERT, UPDATE, DELETE).
The main emphasis for OLTP systems is on very fast
query processing, maintaining data integrity in
multi-access environments, and effectiveness measured
by the number of transactions per second.
An OLTP database holds detailed and current data, and
the schema used to store transactional data is the
entity model (usually 3NF).
On-Line Analytical Processing
On-Line Analytical Processing (OLAP) deals
with analyzing the data stored in the data
warehouse.
It is an advanced data analysis environment that
supports decision making, business modeling,
and operations research activities.
Queries are often very complex and involve
aggregations.
For OLAP systems, response time is the effectiveness
measure. Data mining techniques make wide use of OLAP
applications. An OLAP database holds
aggregated, historical data, stored in multi-dimensional
schemas (usually a star schema).
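The contrast between the two workloads can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module purely for convenience; the `orders` table, its columns, and the sample values are assumptions made up for this illustration, and a real OLAP system would normally query a separate warehouse rather than the transactional table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)

# OLTP-style work: many short transactions, each touching a few rows.
for region, amount in [("North", 120.0), ("South", 80.0), ("North", 45.5)]:
    with conn:  # each block commits (or rolls back) one short transaction
        conn.execute(
            "INSERT INTO orders (region, amount) VALUES (?, ?)", (region, amount)
        )
with conn:
    conn.execute("UPDATE orders SET amount = 90.0 WHERE id = 2")

# OLAP-style work: one read-only query that aggregates over many rows.
totals = conn.execute(
    "SELECT region, COUNT(*), SUM(amount) FROM orders "
    "GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

The inserts and the update stand in for the short INSERT/UPDATE/DELETE transactions of OLTP, while the final GROUP BY query stands in for the complex, aggregating queries of OLAP.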
More video
Introduction to OLAP
https://www.youtube.com/watch?v=2ryG3Jy6eIY
Excel Tutorial: What is Business
Intelligence and an OLAP Cube?
https://www.youtube.com/watch?v=yoE6bgJv08E
On-Line Analytical Processing
Multidimensional Data Analysis Techniques
The processing of data in which data are viewed
as part of a multidimensional structure.
Multidimensional view allows end users to
consolidate or aggregate data at different levels.
Multidimensional view allows a business analyst
to easily switch business perspectives.
Refer to example : Excel
On-Line Analytical Processing
OLAP Architecture
Three Main Modules
OLAP Graphical User Interface (GUI)
OLAP Analytical Processing Logic
OLAP Data Processing Logic
As Figure 13.17 illustrates, OLAP systems are designed to use both operational and
data warehouse data. The figure shows the OLAP system components on a single computer,
but this single-user scenario is only one of many. In fact, one problem with the
installation shown here is that each data analyst must have a powerful computer to store
the OLAP system and perform all data processing locally.
Types of On-Line Analytical Processing
Multidimensional OLAP (continued)
Relational Vs. Multidimensional OLAP
Table 13.8
Star Schema
• The star schema is a data-modeling technique used
to map multidimensional decision support into a
relational database.
• Facts
• Facts are numeric measurements (values) that represent a
specific business aspect or activity. For example, sales
figures are numeric measurements that represent product
and service sales.
• Facts commonly used in business data analysis are units,
costs, prices, and revenues. Facts are normally stored in a
fact table that is the center of the star schema.
• The fact table contains facts that are linked through their
dimensions, which are explained in the next section.
• Facts can also be computed or derived at run time. Such
computed or derived facts are sometimes called metrics to
differentiate them from stored facts.
• The fact table is updated periodically with data from
operational databases.
Star Schema
• Dimensions
• Dimensions are qualifying characteristics that provide
additional perspectives to a given fact. For instance,
sales might be compared by product from region to
region and from one time period to the next.
• The kind of problem typically addressed by a BI
system might be to compare the sales of unit X by
region for the first quarters of 2006 through 2016.
• In that example, sales have product, location, and
time dimensions. In effect, dimensions are the
magnifying glass through which you study the facts.
• Such dimensions are normally stored in dimension
tables. Figure 13.6 depicts a star schema for sales
with product, location, and time dimensions.
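A star schema for sales with product, location, and time dimensions can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module; the table names (`fact_sales`, `dim_product`, `dim_location`, `dim_time`), the columns, and the sample rows are all assumptions and are not taken from Figure 13.6.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables hold the qualifying characteristics.
CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE dim_time     (time_id     INTEGER PRIMARY KEY,
                           year INTEGER, quarter INTEGER);
-- The fact table sits at the center and references every dimension.
CREATE TABLE fact_sales (
    product_id  INTEGER REFERENCES dim_product(product_id),
    location_id INTEGER REFERENCES dim_location(location_id),
    time_id     INTEGER REFERENCES dim_time(time_id),
    units       INTEGER,
    revenue     REAL
);
INSERT INTO dim_product  VALUES (1, 'X'), (2, 'Y');
INSERT INTO dim_location VALUES (1, 'East'), (2, 'West');
INSERT INTO dim_time     VALUES (1, 2015, 1), (2, 2016, 1);
INSERT INTO fact_sales VALUES
    (1, 1, 1, 10, 100.0), (1, 2, 1, 5, 50.0),
    (1, 1, 2, 12, 120.0), (2, 1, 2, 7, 35.0);
""")

# "Compare the sales of unit X by region for the first quarters."
rows = conn.execute("""
    SELECT l.region, t.year, SUM(f.units)
    FROM fact_sales f
    JOIN dim_product  p ON p.product_id  = f.product_id
    JOIN dim_location l ON l.location_id = f.location_id
    JOIN dim_time     t ON t.time_id     = f.time_id
    WHERE p.name = 'X' AND t.quarter = 1
    GROUP BY l.region, t.year
    ORDER BY l.region, t.year
""").fetchall()
print(rows)
```

The query joins the fact table to each dimension table and uses dimension attributes to filter and group the facts, which is the role the text assigns to dimensions.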
A Simple Star Schema
Star Schema
• Attributes
Each dimension table contains attributes. Attributes are
often used to search, filter, or classify facts.
• For example, all sales offices are rolled up to the sales department or
sales division to anticipate sales trends
• Drill-down
• Drill-down is a technique that allows users to navigate through the
details.
• For instance, users can view the sales by the individual products that
make up a region's sales.
• Slicing and dicing
• Slicing and dicing is a feature whereby users can take out (slicing) a
specific set of data from the OLAP cube and view (dicing) the slices from
different viewpoints.
• These viewpoints are sometimes called dimensions (such as looking
at the same sales by salesperson, by date, by customer, by
product, or by region).
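The operations above can be sketched in a few lines of plain Python. The sales facts, the dimension names, and the helper functions `aggregate` and `slice_` are hypothetical and exist only for this illustration; an OLAP tool would provide these operations interactively over a cube.

```python
from collections import defaultdict

# Hypothetical sales facts: (region, product, quarter, units)
sales = [
    ("East", "A", "Q1", 10), ("East", "B", "Q1", 4),
    ("East", "A", "Q2", 6),  ("West", "A", "Q1", 3),
    ("West", "B", "Q2", 8),
]
DIMS = {"region": 0, "product": 1, "quarter": 2}

def aggregate(rows, *dims):
    """Sum units grouped by the named dimensions."""
    out = defaultdict(int)
    for row in rows:
        key = tuple(row[DIMS[d]] for d in dims)
        out[key] += row[3]
    return dict(out)

def slice_(rows, dim, value):
    """Take out the rows where one dimension has a given value."""
    return [r for r in rows if r[DIMS[dim]] == value]

rollup = aggregate(sales, "region")                # coarse view per region
drilldown = aggregate(sales, "region", "product")  # products within a region
q1_by_product = aggregate(slice_(sales, "quarter", "Q1"), "product")
print(rollup, drilldown, q1_by_product)
```

Here `rollup` consolidates to the region level, `drilldown` exposes the products that make up each region's sales, and the last line takes the Q1 slice and views it from the product viewpoint.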
Example of Aggregation in
A Location Attribute Hierarchy
Figure 13.15
Attribute Hierarchies In Multidimensional Analysis
Figure 13.16
Data Warehouse Implementation Road Map
Figure 13.21
• Refer to the following video about “Data Warehouse Architecture”
• https://www.youtube.com/watch?v=CHYPF7jxlik
• https://www.youtube.com/watch?v=yoE6bgJv08E
following:
What is Data Mining?
Describe the various techniques in Data mining process
Understand the KDD Process model
Describe the various phases of CRISP-DM
Applications of Data Mining
Definition
Data mining is the process
of discovering interesting knowledge,
such as unknown patterns, associations, or significant structures,
from large amounts of data stored in databases, data warehouses, or
other information repositories.
Another definition of data mining: data mining is an iterative
process of creating predictive and descriptive models by
uncovering previously unknown trends and patterns in vast amounts
of data in order to support decision making.
Data mining is a subset of Business Analytics
There is a need to turn data into useful information and knowledge
for broad applications including
Market analysis
Business management
Decision support
Customer segmentation and behavior
Etc.
How does data mining work?
Stock exchange data can be mined so that trends that could
help to plan investment strategies can be uncovered
Computer network data streams can be mined to detect
intrusions based on the anomaly of message flows, which
may be discovered by clustering, dynamic construction of
stream models or by comparing the current frequent
patterns with those at a previous time.
With spatial data, look for patterns that describe changes in
metropolitan poverty rates based on city distances from
major highways. By examining the relationships among a
set of spatial objects, one can discover which subsets of objects
are spatially autocorrelated or associated.
Industry examples of DM applications
Sales/ Marketing
Identify buying patterns from customers
Find the association among customer demographic characteristics
Banking
Credit card fraud detection
Identify ‘loyal’ customers
Insurance and Health Care
Claims analysis i.e., which medical procedures are claimed together
Predict the customers who will buy new policies
Transportation
Determine the distribution schedules for the outlets
Analyze loading patterns
Medicine
Characterize patient behavior in order to predict office visits
Identify successful medical therapies for different diseases / illnesses
Take a break….
Watch a video
https://www.youtube.com/watch?v=Y_JlkzzhAgw
Prediction
Prediction refers to the act of telling about
the future by taking into account
experiences, opinions, and other relevant
information.
Depending on the nature of what is being
predicted, prediction can be named more specifically:
Classification (the predicted thing, such as
tomorrow’s forecast, is a class label such as
“rainy” or “sunny”)
Regression (the predicted thing, such as tomorrow’s
temperature, is a real number such as 65 °F)
Time-series (the data consist of values of the
same variable captured and stored over time
at regular intervals, such as a stock price)
Prediction techniques
Classification : assign a new data record to one of several
predefined categories or classes. Also called supervised
learning.
Classification approaches normally use a training set where
all objects are already associated with known class labels.
The classification algorithm learns from the training set and
builds a model. The model is used to classify new objects.
This method has been used in customer segmentation,
business modeling, and credit analysis.
For example, after starting a credit policy, the OurVideoStore
managers could analyze the customers’ behaviours via their
credit, and label accordingly the customers who received
credits with three possible labels “safe”, “risky” and “very
risky”. The classification analysis would generate a model
that could be used to either accept or reject credit requests
in the future
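The train-then-classify idea can be sketched with a very small nearest-neighbour classifier. This is a swapped-in technique chosen for brevity, not the method the slides prescribe; the two features (number of late payments and credit utilisation) and every training record are invented for the illustration.

```python
import math

# Hypothetical training set: (late_payments, utilisation) -> known class label
training = [
    ((0, 0.10), "safe"), ((1, 0.20), "safe"),
    ((3, 0.60), "risky"), ((4, 0.55), "risky"),
    ((8, 0.95), "very risky"), ((9, 0.90), "very risky"),
]

def classify(record):
    """Return the label of the closest training record (1-nearest neighbour)."""
    _, label = min(training, key=lambda item: math.dist(item[0], record))
    return label

print(classify((0, 0.15)))  # a new customer resembling the "safe" group
print(classify((8, 0.90)))  # a new customer resembling the "very risky" group
```

The labelled records play the role of the training set, and `classify` plays the role of the model applied to new credit requests. A real model would also scale the features and be validated on held-out data.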
Associations
With the help of bar-code scanners, the system can
capture the purchase data that association rules
use to discover regularities among products.
Types of associations:
Link analysis : the linkage among many objects
of interest is discovered automatically, such as the
link between web pages and referential
relationships among groups of academic
publication authors
Associations techniques
shoppers buy both)
In data mining, association rules are useful for
analyzing and predicting customer behavior. They
play an important part in shopping basket data
analysis, product clustering, catalog design and store
layout.
Sequence mining (categorical): discover sequences
of events that commonly occur together, e.g. in a set
of DNA sequences, ACGTC is followed by GTCA after
a gap of 9, with 30% probability.
One thing comes after another. For example, when
a flu outbreak happens, gloves will be in short supply.
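Shopping basket analysis is usually quantified with two standard measures, support and confidence. The sketch below computes both for a single rule; the baskets and item names are made up for the illustration.

```python
# Hypothetical shopping baskets captured at the checkout.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
    {"bread", "milk", "eggs"},
]

def count(itemset):
    """Number of baskets that contain every item in the itemset."""
    return sum(itemset <= basket for basket in baskets)

def support(itemset):
    """Fraction of all baskets that contain the itemset."""
    return count(itemset) / len(baskets)

def confidence(antecedent, consequent):
    """Of the baskets with the antecedent, the fraction that also has the consequent."""
    return count(antecedent | consequent) / count(antecedent)

# Rule under test: bread -> milk
print(support({"bread", "milk"}), confidence({"bread"}, {"milk"}))
```

In this toy data, bread and milk appear together in 3 of 5 baskets, and 3 of the 4 baskets containing bread also contain milk.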
Association rules
Clustering
Clustering: a method of automatically assigning a set of objects into
groups or segments based on their similarities.
Unlike classification, in clustering the class labels are
unknown.
As the selected algorithm goes through the data set,
identifying what things have in common based on their
characteristics, the clusters are established.
Clustering techniques include optimization.
The goal of clustering is to create groups so that the
members within each group have maximum similarity and
the members across groups have minimum similarity.
Clustering techniques
Cluster analysis is a means of identifying
classes of items so that items in a cluster
have more in common with each other than
with items in other clusters.
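One widely used clustering algorithm is k-means, shown here as a minimal sketch. The slides do not name a specific algorithm, so k-means, the sample points, and the starting centroids are all assumptions made for the illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-centre."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # New centroid = mean of the members; keep the old one if a cluster is empty.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(clusters)
```

No class labels are supplied: the two groups emerge only from the distances between points, which is the difference from classification noted above.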
class or cluster.
Example of using Data Mining
Data Mining versus Statistics
Data Mining: ... and secondary data) to discover novel patterns and relationships
Statistics: ... to test the hypothesis
Data Mining: data sets in data mining are as “big” as possible
Statistics: statistics looks for the right size of data (if the data are larger than what is required for statistical analysis, usually a sample of the data is used)
Data
Visualization
Take a break…
watch a video
How Facebook Data Mining, And Your Info, Is
Influencing The 2016 Election | TODAY
https://www.youtube.com/watch?v=i-rIYadXoms
Knowledge Discovery in Database
(KDD)
Knowledge Discovery from Data (KDD) refers to the broad
process of finding knowledge in data and emphasizes the
"high-level" application of particular data mining methods.
The unifying goal of the KDD process is to extract knowledge
from data in the context of large databases, which is done by
using data mining methods.
KDD refers to the entire process of discovering useful
knowledge from data.
This process involves making decision of what qualifies
as knowledge by evaluating and possibly interpreting the
patterns. It also includes the choice of encoding schemes,
preprocessing, sampling, and projections of the data prior
to the data mining step.
KDD: A Definition
KDD is the automatic extraction of non-
10^6-10^12 bytes: we never see the whole data set, so we put it in the memory of computers.
What is the knowledge? How to represent and use it?
Knowledge Discovery Process
Steps in KDD process
Knowledge Discovery Process
The Knowledge Discovery in Databases process comprises a few steps
leading from raw data collections to some form of new knowledge.
The iterative process consists of the following steps:
Data cleaning: also known as data cleansing, it is a phase in which noisy data and
irrelevant data are removed from the collection and missing data may be handled.
Data integration: at this stage, multiple data sources, often heterogeneous, may be
combined in a common source.
Data selection: at this step, the data relevant to the analysis is decided on and
retrieved from the data collection.
Data transformation: also known as data consolidation, it is a phase in which the
selected data is transformed into forms appropriate for the mining procedure.
Data mining: it is the crucial step in which clever techniques are applied to extract
potentially useful patterns. It searches for patterns of interest in a particular
representational form or a set of such representations, including classification rules or
trees, regression, and clustering.
Pattern evaluation: in this step, strictly interesting patterns representing knowledge
are identified based on given measures.
Knowledge representation: the final phase, in which the discovered knowledge is
visually represented to the user. This essential step uses visualization techniques to
help users understand and interpret the data mining results.
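The steps above can be sketched as a chain of small functions, one per step. Everything here is a toy assumption made for the illustration: the two source lists, the field names, the frequency count used as the "mining" step, and the text bar chart used as the "representation" step.

```python
from collections import Counter

# Two hypothetical sources with noisy and incomplete records.
store_a = [{"customer": "ann", "item": "milk", "qty": 2},
           {"customer": "bob", "item": None, "qty": 1},      # missing item
           {"customer": "dee", "item": "milk", "qty": 1}]
store_b = [{"customer": "Ann", "item": "Bread", "qty": 1},
           {"customer": "cy", "item": "milk", "qty": -5}]    # noisy quantity

def clean(rows):                      # data cleaning
    return [r for r in rows if r["item"] and r["qty"] > 0]

def integrate(*sources):              # data integration
    return [row for source in sources for row in source]

def select(rows):                     # data selection: keep relevant fields
    return [(r["customer"], r["item"]) for r in rows]

def transform(pairs):                 # data transformation: normalise case
    return [(c.lower(), i.lower()) for c, i in pairs]

def mine(pairs):                      # data mining: item frequencies
    return Counter(item for _, item in pairs)

def evaluate(patterns, min_count=1):  # pattern evaluation against a measure
    return {k: v for k, v in patterns.items() if v >= min_count}

def present(patterns):                # knowledge representation
    return [f"{item}: {'#' * n}" for item, n in sorted(patterns.items())]

report = present(evaluate(mine(transform(select(
    integrate(clean(store_a), clean(store_b)))))))
print("\n".join(report))
```

In practice the process is iterative, so results from the evaluation step would feed back into earlier steps rather than flow through once as in this sketch.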
3 methodologies of KDD model
Fayyad et al. (Computer science)
E.g., WEKA
SEMMA (SAS) (Statistics)
SAS Enterprise Miner
CRISP-DM (SPSS, OHRA) (Business)
SPSS
Methodology of KDD –
CRISP-DM
CRISP-DM
Stands for Cross Industry Standard Process for
Data Mining
CRISP –DM (Elaborate view)
Six phases of CRISP-DM
1. Business Understanding
This initial phase focuses on understanding the project
objectives and requirements from a business perspective, and
then converting this knowledge into a data mining problem
definition and a preliminary plan designed to achieve the
objectives.
An example question: “What are the common characteristics of the
customers we have lost to our competitors recently?”
2. Data Understanding
The data understanding phase starts with an initial data
collection. It proceeds with activities
▪ To get familiar with the data,
▪ To identify data quality problems,
▪ To discover first insights into the data, or to
▪ Detect interesting subsets to form hypotheses for hidden
information.
Six phases of CRISP-DM
3. Data Preparation
The data preparation phase covers all activities to
construct the final dataset (data that will be fed into the
modeling tool(s)) from the initial raw data.
Data preparation tasks are likely to be performed
multiple times, and not in any prescribed order. Tasks
include table, record, and attribute selection as well as
transformation and cleaning of data for modeling tools.
4. Modeling
In this phase, various modeling techniques are selected
and applied, and their parameters are calibrated to optimal
values. Typically, several techniques can be applied
to the same data mining problem type.
Six phases of CRISP-DM
5. Evaluate Results
The accuracy and generality of the model were dealt with in
the previous evaluation steps. The degree to which the
model meets the business objectives is assessed in this
step.
This step also seeks to determine whether there is some valid
business reason why the model is deficient. If time and
budget permit, another option for evaluation is to test the
model(s) on test applications in the real application.
6. Deployment
The end of the project is not just the creation of the
model. Though the purpose of the model is to increase
knowledge of the data, the knowledge gained needs to be
organized and presented in such a way that the client can
use.
KDD vs. DM
DM is a component of the KDD process that is
mainly concerned with the means by which
patterns and models are extracted and
enumerated from the data
DM is quite technical
Knowledge discovery involves the evaluation and
interpretation of the patterns and models to
decide what constitutes
knowledge and what does not
KDD requires a lot of domain understanding
DM and KDD are often used
interchangeably
Perhaps DM is the more common term in the
business world, and KDD in the academic world
The end.
Video: Data Mining and Business Intelligent
https://www.youtube.com/watch?v=peSNJ5bfjX0
Objectives
Get an overview of big data that covers:
What is “Big Data”
Need of Big Data
Characteristics of Big Data: 4V of Big Data
Importance & Risks of Big Data
The structure of Big Data
What is Big Data Analytics? Benefits
Big Data Adoption
Applications
https://www.youtube.com/watch?v=tkOwlXUaGMM&t=250s
What is Big Data ?
Definition: “Big Data” is data whose scale,
distribution, diversity, and/or timeliness require the
use of new technical architectures and analytics to
enable insights that unlock new sources of
business value.
What makes data “big data”?
• Huge volume of data (for instance, tools that can manage
billions of rows and billions of columns)
• Complexity of data types and structures, with an increasing
volume of unstructured data (80-90% of the data in existence
is unstructured)….part of the Digital Shadow or “Data Exhaust”
• Speed or velocity of new data creation
Copyright © 2011 EMC Corporation. All Rights Reserved.
Source: McKinsey May 2011 article Big Data: The next frontier for innovation, competition, and
What is big data?
Big Data is a general term used to describe the
voluminous amount of unstructured and semi-structured
data, where the data are captured from social media,
CCTV, sensors, smart watches, etc.
The term Big Data is often used when speaking about
petabytes and exabytes of data.
A primary goal for looking at big data is to discover
repeatable business patterns.
For example, if a customer buys a specific type and color of cloth in a
shop, only that data would be available to the shop. However, the customer might
have bought it for an occasion, and that occasion
and its relationship to the purchase can be captured from the external
data that the customer creates, for example in social media.
https://www.youtube.com/watch?v=9s-vSeWej1U
A Growing Interconnected and
Instrumental World
[Figure: data volumes in an interconnected world. Recoverable callouts: ? TBs of data every day; 25+ TBs of log data every day; 100s of millions of GPS enabled devices sold annually; 2+ billion people on the Web by end 2011; 76 million smart meters in 2009… 200M by 2014]
Videos
https://www.youtube.com/watch?v=Uppg_2nGo54
Are you ready for digitization?
https://www.youtube.com/watch?v=ystdF6jN7hc
ability to collect and analyze big data to conduct controlled
experiments to make better management decisions.
Big Data allows ever-narrower segmentation of customers and
therefore much more precisely tailored products or services
Sophisticated analytics can substantially improve decision-
making, minimize risks, and unearth valuable insights that
would otherwise remain hidden
Big Data can be used to develop the next generation of
products and services
Characteristics of Big Data
1) Volume
Volume indicates the amount of data for analysis.
The characteristic most associated with big data, volume refers to the
mass quantities of data.
https://www.youtube.com/watch?v=1RYKgj-QK4I
Characteristics of Big Data
3) Variety
Different types of data and data sources. Variety is about managing
the complexity of multiple data types, including structured,
semi-structured and unstructured data.
Organizations need to integrate and analyze data from a complex array
of both traditional and non-traditional information sources.
The explosion of sensors, smart devices and social collaboration
technologies generates data in countless forms like text, web data,
tweets, sensor data, audio, video and more.
4) Veracity
Veracity is about the uncertainty and correctness of data.
Organizations spend huge amounts of money because of data
quality issues.
Decision makers are not confident in the data that they use
for decision making.
https://www.youtube.com/watch?v=wVAWAeOIIII
Veracity
Establishing confidence in data is one of the biggest
challenges.
Uncertainty of Data or Veracity is a very important
https://www.youtube.com/watch?v=mnoqT8nihT8
Variety – Complex Data Structures
Data Growth is Increasingly Unstructured
More Structured
Structured Databases
● Structured data is data that has been organized into a formatted repository, typically a
database, so that its elements can be made addressable for more effective processing
and analysis.
● It refers to data that has a defined length and format for big data.
Ex. numbers, dates, and groups of words and numbers called strings.
● It’s usually stored in a database.
Semi-structured data
Semi-structured data is a form of structured data that
does not conform with the formal structure of data
models associated with relational databases or other
forms of data tables, but nonetheless contains tags or
Semi-structured data
For example, a clickstream log may look like :
2017-11-01 14:27:57,944-INFO :
com.ovaledge.oasis.dao.DomainDaoImpl - RUNNING
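A log line like this has no table schema, but its internal markers (timestamp, level, logger name, message) can still be pulled apart. The sketch below does so with a regular expression; the field names are assumptions, and it treats the two printed lines as one log line joined by a space, which is a guess about the original layout.

```python
import re
from datetime import datetime

# The sample line from the text, joined into a single string (assumed layout).
LINE = ("2017-11-01 14:27:57,944-INFO : "
        "com.ovaledge.oasis.dao.DomainDaoImpl - RUNNING")

PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})-"
    r"(?P<level>[A-Z]+) : (?P<logger>[\w.]+) - (?P<message>.*)"
)

match = PATTERN.match(LINE)
record = match.groupdict()
# Convert the timestamp text into a structured value.
record["timestamp"] = datetime.strptime(
    record["timestamp"], "%Y-%m-%d %H:%M:%S,%f"
)
print(record)
```

Once parsed, each field can be stored in a structured table, which is how semi-structured data is typically prepared for analysis.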
Applications, Music (Audio), Movie (video),
X-Rays, Pictures
Real-time/Fast Data – generate
unstructured data
Mobile devices
(tracking all objects all the time)
other useful information.
Such information can provide competitive advantages over
rival organizations and result in business benefits, such as
more effective marketing and increased revenue
Goal: to help companies make better business decisions by
enabling data scientists and other users to analyze huge
volumes of transaction data as well as other data sources that
may be left untapped by conventional business intelligence
(BI) programs.
https://www.youtube.com/watch?v=LtScY2guZpo
Big Data Analysis Adoption Structure
1. Educate – build a base of knowledge
Most organizations in this stage are studying the potential
benefits of big data technologies and analytics, and are trying
to understand how big data can help address important
organization.
The small number of organizations in the Execute stage is
consistent with the implementations we see in the
marketplace.
Importantly, these leading organizations are leveraging big
data to transform their businesses and thus are deriving the
greatest value from their information assets.
Business intelligence in general can benefit from big data analytics
This could result in more numerous and accurate business insights, an
understanding of business change, better planning and forecasting,
and the identification of root causes of cost.
Specific analytic applications are likely beneficiaries of big data
analytics
Big data analytics might help automate decisions for real-time business
processes such as loan approvals or fraud detection.
https://www.youtube.com/watch?v=QvyQFXbgW2c
Barriers of Big Data Analytics
A lack of business support can hinder a big
data analytics program in terms of cost and
a compelling business case.
Problems with the current database software,
such as a lack of database analytics, can
be barriers to big data analytics.
Risks of Big Data
Because we gather data of different types from different sources,
to the extent that we often don’t know exactly what it contains,
big data carries its own special risks.
You don’t know whether all or just a tiny piece of it might be
Customer
Product Recommendations that are Relevant & Compelling
Influence Behavior
Learning why Customers Switch to competitors and their offers; in time to Counter
Friend Invitations to join a Game or Activity that expands business
Improving the Marketing Effectiveness of a Promotion while it is still in Play
Preventing Fraud as it is Occurring & preventing more proactively
–Geomapping / marketing
–Network monitoring
Telecommunication services
•Problem:
Legacy systems used to gain insights from internally generated
data face issues of high storage costs, long data loading times, and
long administration processes.
Applications of Data Analytics – Use
Case
Financial Service
Problem:
Manage several petabytes of data, growing at 40-100%
per year, under increasing pressure to prevent fraud and comply with
regulations.
How big data analytics can help:
Fraud detection
Risk management
360°View of the Customer
Applications of Data Analytics – Use
Case
Transportation services
Problem:
Traffic congestion has been increasing worldwide as a result of
increased urbanization and population growth reducing the efficiency
of transportation infrastructure and increasing travel time and fuel
consumption.
How big data analytics can help:
Real-time analysis of weather and traffic congestion data streams to
identify traffic patterns, reducing transportation costs.
Applications of Data Analytics – Use
Case
Healthcare and Life Sciences
Problem:
Vast quantities of real-time information are starting to come from
wireless monitoring devices that postoperative patients and those with
chronic diseases are wearing at home and in their daily lives.
How big data analytics can help:
Epidemic early warning
Intensive Care Unit and remote monitoring
Video
Demo: IBM Big Data and Analytics at work in Banking
https://www.youtube.com/watch?v=ioHwEsARPWI
What is Hadoop?
https://www.youtube.com/watch?v=4DgTLaFNQq0
What Is Big Data? & How Big Data Is Changing The World!
https://www.youtube.com/watch?v=G_e3r4S2g80
Summary
Terminologies of Big Data
Need of Big Data
IBM 2201
INTRODUCTION
TO
BUSINESS ANALYTICS
Chapter 2
Key Roles and Responsibilities
Overview
provide a classification of different types of industry
participants and illustrate the types of opportunities
that exist for analytics professionals.
• To be aware of organizations and new offerings and
opportunities in sections allied with analytics
An Overview of the Analytics Ecosystem (1 of 3)
13
2. Which companies are dominant in
more than one category?
3. Is it better to be the strongest player in
one category or be active in multiple
categories?
https://www.youtube.com/watch?v=eFDlUgLxtGM
Users of Business Analytics
In the past, all departments
relied on IT teams and data analysts to
prepare the data and churn out reports
for their BA projects.
This is no longer the case. As technology has evolved,
businesses have gained a choice of
solutions offering analytics and
visualization capabilities that open up BA
to everyone.
i) casual BA user
- make use of dashboards to analyze predefined sets of
data
ii) Power user
- Has the capability of working with complex data sets.
- Power user is often a manager, who is looking for ways to
help a department operate more efficiently and more
effectively
- They can use BA tools such as reporting to analyze
business activities.
Business Analytic Team
Business Analyst
A business analyst is someone who analyzes an organization or
business domain (real or hypothetical) and documents its
business, processes, or systems, assessing the business
model or its integration with technology.
Uses BI tools and applications to understand business
conditions and drive business processes
Skills: Business Analysts need to have a baseline
understanding of some core skills: statistics, data munging, data
visualization, and exploratory data analysis.
Tools: Microsoft Excel, SPSS, SPSS Modeler, SAS, SAS Miner,
SQL, Microsoft Access, Tableau, SSAS.
Business Analytic Team
Data Scientist
Data scientists are involved with gathering data, massaging it
into a tractable form, making it tell its story, and presenting that
story to others
Uses advanced algorithms and interactive exploration tools to
uncover non-obvious patterns in data
Data scientists apply statistics, machine learning and analytic
approaches to solve critical business problems.
Their primary function is to help organizations turn their volumes of big
data into valuable and actionable insights.
In comparison with ‘data analysts’, in addition to data analytical skills,
Data Scientists are expected to have strong programming skills, an
ability to design new algorithms, handle big data, with some expertise
in the domain knowledge.
Business Analytics Team – Data
Engineer
Data Engineers are the data professionals who prepare the “big data”
infrastructure to be analyzed by Data Scientists.
They are software engineers who design, build, integrate data from
various resources, and manage big data. Then, they write complex
BA team – BI Developers
Business Intelligence Developers are data experts that interact more
closely with internal stakeholders to understand the reporting needs,
and then to collect requirements, design, and build BI and reporting
solutions for the company.
They have to design, develop and support new and existing data
warehouses, ETL packages, cubes, dashboards and analytical reports.
Additionally, they work with databases, both relational and
multidimensional, and should have great SQL development skills to
integrate data from different resources.
They use all of these skills to meet the enterprise-wide self-service
needs. BI Developers are typically not expected to perform data
analyses.
Skills: ETL, developing reports, OLAP, cubes, web intelligence, and
business objects design.
Tools: Tableau, dashboard tools, SQL, SSAS, SSIS and SPSS Modeler.
Big Data talent is a critical issue. By 2018, the United
States alone could face a shortage of 140,000 to
190,000 people with deep analytical skills.
But companies need to spend time upfront to
identify the kinds of roles they need to make the Big
Data machine run rather than just rushing to recruit
math and science jocks.
Summary
The key roles and responsibilities of a data analytics team
Introduction to the key business drivers for business analytics
Review of the proposed business analytics team
CHAPTER 4
Objectives
Identify and discuss the different types of Information Systems
and their roles in supporting business operations
Enterprise Resource Planning (ERP)
Management Information Systems (MIS)
Decision making systems:
Supply Chain Management (SCM) systems
Customer Relationship Management Systems (CRM)
Decision Support Systems (DSS)
records about the business operations of the
organization.
This system captures input data, processes it and
stores it in a database for further processing.
Input: basic transactions such as customer orders, purchase
orders, receipts, time cards, invoices and customer payments.
Process: data collection, data editing, data correction, data
manipulation and data storage
The processed data is stored in the computer system for
further processing by higher-level information systems
such as MIS.
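The input, process and storage steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Transaction` fields, the `process` function and the validation rules are assumptions made for the example, not part of any particular TPS product.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    kind: str         # e.g. "customer order", "invoice", "receipt"
    reference: str    # hypothetical document number
    amount: float


def process(raw_records, store):
    """Edit, correct and store raw records; return the ones that were rejected."""
    rejected = []
    for raw in raw_records:
        # Data correction: normalise the text fields.
        kind = str(raw.get("kind", "")).strip().lower()
        reference = str(raw.get("reference", "")).strip()
        # Data editing: check that the amount is a valid, non-negative number.
        try:
            amount = float(raw.get("amount"))
        except (TypeError, ValueError):
            rejected.append(raw)
            continue
        if not kind or not reference or amount < 0:
            rejected.append(raw)
            continue
        # Data storage: keep the cleaned record for later use (e.g. by an MIS).
        store.append(Transaction(kind, reference, round(amount, 2)))
    return rejected


store = []  # stands in for the database
rejected = process(
    [
        {"kind": " Invoice ", "reference": "INV-1", "amount": "10.50"},
        {"kind": "receipt", "reference": "", "amount": 3},
    ],
    store,
)
```

In this made-up run the first record is cleaned and stored, while the second is rejected because its reference is empty.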
Examples of TPS:
Point-of-sale machines: record grocery sales transactions
ATM machines: record deposits, withdrawals and money transfers
Purchase order systems: record the purchase of goods in the company
Online TGV movie ticketing system
Egenting Hotel reservation system
Characteristics of TPS:
Support daily and routine operations
Work with large amounts of input and output data
Involve little decision making
Supply data to higher management system such as MIS
It integrates internal and external management
information across an entire organization,
embracing finance/accounting, manufacturing, sales
and service, customer relationship management, etc.
ERP systems automate this activity with an
integrated software application.
It offers integrated software from a single vendor to
meet these needs.
Every department uses a different type of MIS to help
its managers make better decisions from the information it
produces.
MIS supports structured decisions at the operational and
management control levels. However, it is also useful for
planning purposes of senior management staff.
Use internal data stored in the computer system
Allows users to develop their own custom reports
Require user requests for reports developed by systems personnel
Watch this!!
with vendors for the best prices and support
make sure all supplies and parts are available to manufacture cars
send finished products to dealerships around the country when they are needed.
The Internet can be used as a platform to negotiate for good prices and
service from different suppliers.
The Internet gathers suppliers from different countries all over the world,
so it is possible to find a supplier that offers a better price.
SCM system
Customer Relationship Management Systems
transaction history in order to analyze the customer
buying habits and to know them better in order to
keep and retain loyal customers.
What is CRM?
CRM
The function of the software is to capture data whenever a
customer has contact with the company and store it in the
system so that the company knows the customer's needs.
Purposes of implementing CRM:
To improve marketers’ ability to interact with customers via multiple
channels such as email, SMS, phone and the Web
To automate the process of developing customer quotes through automatic
mail replies
To improve communications with customers through messenger
contact
To provide support for an anticipated increase in customers through
email, Web searching and phone enquiries
Definition:
It is a computer-based information system designed to help knowledge
workers select one of many alternative solutions to a problem.
The success of an organization depends on the quality of the decisions
its employees make.
When decision making involves large amounts of information and a
lot of processing, computer-based systems can make the process
efficient and effective.
For example:
Helps answer “What if?” questions, e.g. what if we double the work shift
and cut down the number of staff?
Such questions seek answers like: “This is how this action will impact our
revenue, or our market share or our costs.”
DSSs are programmed to process raw data, make comparisons and
generate information to help in financial investment, marketing strategy,
credit approval, etc.
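A “What if?” question of this kind can be sketched as a small model that is run twice, once for the current situation and once for the proposed change. The sketch below is illustrative only: the `scenario` function, the productivity, wage and price figures, and the staffing numbers are all made-up assumptions, not data from a real DSS.

```python
def scenario(shifts, staff_per_shift,
             units_per_staff_shift=40,      # assumed productivity
             wage_per_staff_shift=120.0,    # assumed labour cost
             price_per_unit=5.0):           # assumed selling price
    """Return output, revenue, labour cost and margin for one staffing plan."""
    staff_shifts = shifts * staff_per_shift
    units = staff_shifts * units_per_staff_shift
    revenue = units * price_per_unit
    labour_cost = staff_shifts * wage_per_staff_shift
    return {
        "units": units,
        "revenue": revenue,
        "labour_cost": labour_cost,
        "margin": revenue - labour_cost,
    }


# Current plan: one shift with 10 staff.
baseline = scenario(shifts=1, staff_per_shift=10)
# What if we double the work shift and cut staff to 6 per shift?
what_if = scenario(shifts=2, staff_per_shift=6)
# The answer the decision maker wants: how each figure changes.
impact = {name: what_if[name] - baseline[name] for name in baseline}
```

Under these assumed figures, `impact` shows the change in units, revenue, labour cost and margin caused by the proposed action, which is the kind of answer a DSS is meant to provide.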
The End.
IBM 2201
INTRODUCTION TO BUSINESS ANALYTICS
Chapter 1
Overview
This unit introduces the concept of
Business Analysis and Optimization
and provides an understanding of
the role of Business Analysis in
Business Optimization. It gives a
glimpse of the reference architecture
in Business Analytics.
Objectives
This chapter covers:
What is BA
Objective of BA
Features of BA
Capabilities of BA
3 types of BA techniques
Source: http://blogs.smithsonianmag.com/history/2011/12/emperor-wang-mang-chinas-first-socialist/han-merchants
Source: http://milkmiracle.net/2010/12/18/acland-in-hindoostan
Source: http://www.allposters.com/-sp/Business-Meeting-1960s-Posters_i6848810_.htm
Some businesses are integrating data-
driven algorithmic recommendations into
their decision-making processes
These organizations are adopting business analytics
Business analytics is all about the decisions that go
What is business analytics?
Many practitioners define analytics as the process
of developing actionable decisions or
recommendations for actions based on insights
generated from historical data
Definition:
Analytics represents the combination of computer
technology, methodologies from applied mathematics,
applied probability, applied statistics, computer science,
and signal processing for using data to gain insight into
business performance and drive business planning (to
solve real commercial problems)
Analytics overview
Features of BA
What are business decisions?
How likely is client X to buy product Y?
Which product should we recommend next?
What is a “realistic” view of opportunity by client?
Are there accounts where there is significant untapped
revenue?
Which clients are “at risk” of going to our competitors?
What kind of salesforce do we need to deliver on our targets?
Which units of products are not performing at par?
Which sellers might miss their quota of sale?
How can we align sellers with client opportunities to achieve
maximum revenue impact?
What are business decisions? (2)
Who are the influencers for this product in the marketplace?
What is the optimal marketing campaign to deploy?
Should we hire this employee?
Which employees are at risk of voluntarily leaving the
company?
How many employees do we need to hire now so that we can
achieve our production goals six months from now?
What kind of raises or promotions should we offer to retain
our talented employees?
Will outsourcing help reduce the company's operating costs?
Which company should we acquire to expand our customer
base?
Scenario of business analytics
A couple of examples
If the business would like to keep selling expenses the same but
increase revenue, business analytics can be used to
recommend changes in the deployment of the salesforce so that
sales teams are better matched to customers and can thus sell
more to customers (higher revenue)
If workers of the company are voluntarily leaving for higher
paying jobs and there are significant salary premiums and
onboarding costs associated with hiring replacement workers,
then analytics-recommended targeted raises can retain existing
workers, thereby avoiding the high costs associated with hiring
Benefits of Business Analytics
Take advantage of all types of data and information, using
different views to form a complete picture of your business
environment
Empower employees to explore and interact with information
and deliver insights to others
Streamline decisions by individuals or automated systems
based on analytics results
Provide insights from various perspectives and time horizons
whether based on historic reporting, real-time analysis, or
predictive modeling.
Data can reveal more than expected, e.g. when people are
demanding more “papaya leaves”, there must be
something going on (dengue)
Analytics overview
INFORMS (Institute for Operations Research and the
Management Sciences) has proposed 3 types of
analytics.
It suggests that these three are somewhat
Types of Business Analytics techniques
1) Descriptive Analytics
Descriptive analytics provides information about the past
state or performance of a business and answers: “What has
happened?”
First, this involves consolidating data sources and making
all relevant data available in order to generate appropriate
reports, queries, alerts and trends using various reporting
tools and techniques.
It needs data that is already stored as part of a data
warehouse.
The key technology in this area is visualization;
tools such as Tableau and Dundas BI allow users to develop powerful
insights into the operations of the organization.
Average dollars spent per customer
Year over year change in sales
Generate reports that provide historical insights
regarding the company’s production,
operations, sales, finance, inventory and customers
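The first two measures listed above are simple summaries of historical data. The following sketch shows one way to compute them in Python; the `sales` records and the function names are made up for illustration, and a real report would read the data from the data warehouse instead.

```python
from collections import defaultdict

# Hypothetical historical sales records: (year, customer id, amount spent).
sales = [
    (2022, "C1", 100.0), (2022, "C2", 300.0),
    (2023, "C1", 150.0), (2023, "C2", 250.0), (2023, "C3", 200.0),
]


def average_spend_per_customer(rows, year):
    """Average dollars spent per customer in the given year."""
    totals = defaultdict(float)
    for row_year, customer, amount in rows:
        if row_year == year:
            totals[customer] += amount
    if not totals:
        raise ValueError(f"no sales recorded for {year}")
    return sum(totals.values()) / len(totals)


def year_over_year_change(rows, year):
    """Fractional change in total sales compared with the previous year."""
    current = sum(amount for y, _, amount in rows if y == year)
    previous = sum(amount for y, _, amount in rows if y == year - 1)
    if previous == 0:
        raise ValueError(f"no sales recorded for {year - 1}")
    return (current - previous) / previous


avg_2023 = average_spend_per_customer(sales, 2023)
yoy_2023 = year_over_year_change(sales, 2023)
```

With this sample data, total sales rise from 400 in 2022 to 600 in 2023, so the year-over-year change is 0.5 (a 50% increase), and the three customers in 2023 spend 200 each on average.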
Take a break..
Have a look at this scenario
Types of Business Analytics techniques
2) Predictive Analytics
It aims to determine what is likely to happen in the
future
This analysis is based on statistical techniques to
A number of techniques are used in developing
predictive analytical applications:
• Clustering algorithms – to segment customers into
Examples:
Forecasting customer behavior
Using purchasing patterns to identify trends in sales
activities
Credit scoring, an application used by most
financial services to determine the probability of
customers making future credit payments on time
Understanding how sales might close at the end of the
year
Predicting what items customers will purchase
together
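As a rough illustration of the last example, the sketch below counts how often pairs of items appear in the same basket. This is only a simple co-occurrence count on made-up baskets, not a full predictive model; the `pair_counts` function and the sample data are assumptions for the example.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each pair of distinct items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that ("bread", "milk") and ("milk", "bread") are the same pair.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


# Hypothetical past purchases.
baskets = [
    ["bread", "milk"],
    ["bread", "milk", "eggs"],
    ["milk", "eggs"],
    ["bread", "milk"],
]

counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
```

Here bread and milk appear together in three of the four baskets, which suggests that a customer buying one is likely to buy the other.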
Take a break..
Have a look at this scenario
Types of Analytical Technologies
3) Prescriptive Analytics
The goal is to provide a decision or
recommendation for a specific action: to
recognize what is going on as well as the likely
forecast, and to make decisions that achieve the best
performance.
These recommendations can be in the form of a
specific yes/no decision for a problem, or a specific
amount (say, the price for a certain item, the budget for
a certain project or the airfare to charge).
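A recommendation in the form of a specific amount can be sketched as a search over candidate prices for the one with the highest expected revenue. The example below is a minimal sketch under a strong assumption: the linear `expected_demand` curve and the candidate prices are invented for illustration, whereas in practice the demand forecast would come from a predictive model.

```python
def expected_demand(price):
    """Hypothetical forecast: demand falls by 2 units for every 1 added to the price."""
    return max(0.0, 100.0 - 2.0 * price)


def expected_revenue(price):
    return price * expected_demand(price)


def recommend_price(candidates):
    """Prescribe the candidate price with the highest expected revenue."""
    return max(candidates, key=expected_revenue)


candidate_prices = [10, 15, 20, 25, 30, 35, 40]
best_price = recommend_price(candidate_prices)
```

Under the assumed demand curve, revenue peaks at a price of 25 (expected revenue 1250), so that is the amount the sketch recommends.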
Examples
Business Analytic Capabilities
Business intelligence
Business intelligence helps users to explore all types of information from
all angles and to assess the current business situation to gain a deeper
understanding of the patterns that exist in the data.
Predictive analytics
helps you to uncover unexpected patterns and associations from all data
within your organization and to develop predictive models to guide
interactions.
Performance management – BA can improve the performance of
the company in the following ways:
Planning, analysis, and forecasting to automate budgeting and to perform driver-based
forecasting, “what-if” scenario modeling, and multidimensional profitability analysis
Profitability modeling and optimization to accelerate profitability analysis with an organization-
wide approach that joins financial, operational, and strategic planning
Performance reporting and scoring that helps you align strategy with execution, communicate
goals, and monitor your performance against targets
Business Analytic Capabilities
Combined and integrated predictive models, rules, and decision logic to
deliver recommended actions and extended predictive analytics
What-if simulations to accommodate changing conditions based on
incoming data
A user interface that supports intuitive development, optimization, and
implementation of targeted configurations, decisions, and content
4 Key Business Drivers for Analytics
Current Business Problems Provide Opportunities for Organizations to Become
More Analytical & Data Driven
Driver | Examples
1. Desire to optimize business operations | Sales, pricing, profitability, efficiency
2. Desire to identify business risk | Customer churn, fraud, default
3. Predict new business opportunities | Upsell, cross-sell, best new customer prospects
4. Comply with laws or regulatory requirements | Anti-Money Laundering, Fair Lending, Basel II