Analytics Team
Mjunction Services
[Topic map: Data Science techniques]
Data Science
• Classification: OneR, LDA, Naïve Bayesian, Logistic Regression, Decision Tree, K Nearest Neighbors (similarity function), SVM (Support Vector Machines), others
• Regression: K Nearest Neighbors (similarity function), Decision Tree
• Clustering: Hierarchical (divisive), K Means (partitive), Self-Organizing Maps
• Predicting the Future / Modelling: Association Rule, Correlation (covariance matrix)
• Describing numerical & categorical data: bar and line chart, 2-Y-axis plot, frequency table
confidential - mjunction services limited
Solving an Optimization Problem
using Excel Solver
Each unit of the two scooter models requires the following processing times in these
production steps:
Model  | Frame Manufacturing (hours) | Wheels and Deck Assembly (hours) | Quality Assurance and Packaging (hours)
Razor  | 4                           | 1.5                              | 1
Navajo | 5                           | 2                                | 0.8

Production Step                 | Available Time in Coming Week (hours)
Frame Manufacturing             | 5610
Wheels and Deck Assembly        | 2200
Quality Assurance and Packaging | 1200
How many units of each model should ZI produce in the coming week in order to
maximize its weekly profit?
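In Excel Solver this is set up as a linear program with the two production quantities as decision variables, the three hour limits as constraints, and weekly profit as the maximization objective. As a cross-check, the same search can be sketched in code. Note that the per-unit profit contributions are not given in this excerpt, so the figures below (45 for Razor, 50 for Navajo) are hypothetical placeholders.

```python
# Brute-force search for ZI's profit-maximizing integer production plan.
# Hours per unit and available hours come from the table above; the
# per-unit profits (45 and 50) are HYPOTHETICAL, not from the slides.
RAZOR_PROFIT, NAVAJO_PROFIT = 45, 50
RAZOR_HOURS = (4, 1.5, 1)       # frame, wheels/deck, QA+packaging
NAVAJO_HOURS = (5, 2, 0.8)
AVAILABLE = (5610, 2200, 1200)  # hours available in the coming week

best = (0, 0, 0)  # (profit, razors, navajos)
max_razors = int(min(cap / h for cap, h in zip(AVAILABLE, RAZOR_HOURS)))
for r in range(max_razors + 1):
    # For a fixed Razor count, build as many Navajos as the remaining
    # hours allow (profit per Navajo is positive, so more is better).
    remaining = [cap - r * h for cap, h in zip(AVAILABLE, RAZOR_HOURS)]
    n = int(min(rem / h for rem, h in zip(remaining, NAVAJO_HOURS)))
    profit = r * RAZOR_PROFIT + n * NAVAJO_PROFIT
    if profit > best[0]:
        best = (profit, r, n)

print(best)
```

Solver's Simplex LP engine finds the same optimum directly, without enumerating plans.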
The Zooter example treats profit contributions, manufacturing requirements, and supply availabilities as non-random quantities.
If ZI decides to make a certain number of units of each scooter model in the coming week, it will know for sure:
• How much profit it will make
• Whether it will have sufficient supply of each resource
The “no uncertainty” assumption simplifies the search for the best production plan.
In practice, it allows us to tackle analytics models with large numbers of products and resources
Sampling
The process of using a small number of items, or parts of a larger population, to make conclusions about the whole population. Sampling is used because surveying the entire population is impractical.
• Simple Random Sampling: A sampling procedure that ensures that each element in the population has an equal chance of being included in the sample. Implementation is easy, but it becomes both time- and money-consuming when larger samples are required. It is the most widely used sampling technique.
• Systematic sampling: A simple process where every nth name in a list is drawn. The output can be biased and skewed. Used in research.
• Stratified sampling: Subsamples are drawn within different strata. Each stratum is more or less equal on
some characteristic. The process is expensive. Used in research.
• Cluster sampling: The purpose of cluster sampling is to sample economically while retaining the characteristics of the population. It is no longer based on individual elements of the population. The U.S. uses this sampling technique to create clusters of its population.
• Convenience Sampling: A sampling procedure where element selection is based on ease of accessibility. Convenience samples are the least reliable, but are the cheapest and easiest to conduct. Street interviews are a classic example.
• Judgment sampling: An experienced individual selects the sample based on his or her judgment about
some appropriate characteristics required of the sample member.
• Quota sampling: Ensures that the various subgroups in a population are represented on pertinent sample
characteristics.
• Snowball sampling: Initial respondents are selected by probability methods. Additional respondents are
obtained from information provided by the initial respondents.
What is a hypothesis?
Hypothesis Testing
Procedure of Hypothesis Testing
Hypothesis Testing
Null Hypothesis
• The assumption that we wish to test is the Null Hypothesis (H0, read as “H-naught”). We begin with the assumption that what has been happening is correct.
• E.g.: The average weight of students in a class is 58 kg.
• (H0 : µ = 58)
Alternative Hypothesis
• The radical claim that the new theory is correct, or that changes are happening in the said system. It is expressed as the Alternative Hypothesis (Hα, read as “H-alpha”).
• E.g.: The average weight of students in a class is not 58 kg.
• (Hα : µ ≠ 58)
Decision Making: Rejection and Non-Rejection Region Approach
Test Statistic: The sample statistic one uses to either reject H0 (and conclude Hα) or not to reject H0.
Critical values are the values that determine whether the null hypothesis will be rejected or not.
Conditions for Hypothesis Testing
• If a z-test for one proportion with n samples: np0 ≥ 5 and n(1 − p0) ≥ 5, where p0 is the hypothesized population proportion.
• If a t-test: the data come from an approximately normal distribution, or the sample size is at least 30.
When to use what..
Example:
We take a random sample of 500 Penn State students and find that 278 are from Pennsylvania. Can we conclude, at a 5% level of significance, that the proportion is larger than 0.5, i.e., that a majority of students are from Pennsylvania?
Step 1: Use the one-proportion z-test, since the hypothesized value is p0 = 0.5 (population proportion) and we can check that np0 = 250 ≥ 5 and n(1 − p0) = 250 ≥ 5.
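The condition check and the test statistic for this example can be computed directly; a minimal sketch of the one-proportion z-test calculation:

```python
import math

# One-proportion z-test for the Penn State example on this slide:
# 278 of 500 sampled students are from Pennsylvania; H0: p = 0.5.
n, x, p0 = 500, 278, 0.5
p_hat = x / n  # sample proportion = 0.556

# Check the z-test condition: n*p0 and n*(1-p0) must both be at least 5
assert n * p0 >= 5 and n * (1 - p0) >= 5

# Test statistic: z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
print(round(z, 3))  # 2.504, matching the slide's observed Z-value
```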
When to use what..
Step 4: Finding the appropriate critical value for the test using the z-table. From the table, Z0 = 1.645, which is the critical value. The rejection region for this right-tailed test is Z* > 1.645.
Step 5: Check whether the value of the test statistic falls in the rejection region. If it does, then reject H0 in favour of Hα.
The observed Z-value is 2.504; this is our test statistic. Since Z* falls within the rejection region, we reject H0 in favour of Hα.
When to use what..
Example:
The mean length of the lumber is supposed to be 8.5 feet. A builder wants to check whether the shipment of lumber she receives has a mean length different from 8.5 feet. The builder observes that the sample mean of 61 pieces of lumber is 8.3 feet, with a sample standard deviation of 1.2 feet. What will she conclude? Conduct this test at a 1% level of significance (α).
Step 1: Using the t-test, since the population standard deviation is unknown and the sample size is 61 > 30. Then setting the hypotheses: H0 : µ = 8.5 vs. Hα : µ ≠ 8.5.
Step 4: Finding the appropriate critical value for the test using the t-table. From the table (60 degrees of freedom, α = 0.01), the critical value is 2.660. The rejection region for the two-tailed test is |t*| > 2.660.
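The test statistic for this example can be computed directly; a minimal sketch of the one-sample t-test calculation for the numbers above:

```python
import math

# Two-tailed one-sample t-test for the lumber example: H0: mu = 8.5.
n, x_bar, s, mu0 = 61, 8.3, 1.2, 8.5

# Test statistic: t = (x_bar - mu0) / (s / sqrt(n))
t = (x_bar - mu0) / (s / math.sqrt(n))
print(round(t, 1))  # -1.3, the slide's observed t-value

# Critical value at alpha = 0.01 with 60 degrees of freedom (t-table)
t_crit = 2.660
print(abs(t) > t_crit)  # False: t does not fall in the rejection region
```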
When to use what..
Step 5: Check whether the value of the test statistic falls in the rejection region. If it does, then reject H0 (and conclude Hα). If it does not fall in the rejection region, do not reject H0.
The observed t-value is -1.3; this is our test statistic. Since t* does not fall within the rejection region, we fail to reject H0.
Step 6: Concluding that, with a test statistic of -1.3 and critical values of ±2.660 at a 1% level of significance, there is not enough evidence to say that the mean length of lumber differs from 8.5 feet.
Type I and Type II Errors
Analysis of Variance (ANOVA)
When to use ANOVA?
To compare the mean values of a certain characteristic among two or more groups; basically, to see whether two or more groups are equal (or different) on a given metric characteristic.
H0 in ANOVA:
There are no differences among the mean values of the groups being compared (i.e., the group means are all equal):
H0 : µ1 = µ2 = µ3 = … = µk
Hα (conclusion if H0 is rejected)?
Not all group means are equal (i.e., at least one group mean is different from the rest).
Scenario 2: When comparing three or more groups (e.g., Group A, Group B and Group C), it is a two-step test:
Step 1: An overall test that examines whether all group means are equal or not. If not all are equal (H0 rejected), then:
Step 2: Pair-wise (post-hoc) comparison tests to see where (i.e., among which groups) the differences exist, and how.
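The overall test in Step 1 can be sketched by hand: partition the variation into a between-group and a within-group sum of squares and form the F statistic. The three groups and their values below are hypothetical, chosen only to illustrate the calculation.

```python
# One-way ANOVA computed by hand on hypothetical data: three groups
# measured on the same metric characteristic.
groups = {
    "A": [5.1, 4.9, 5.3, 5.0],
    "B": [5.6, 5.8, 5.5, 5.7],
    "C": [5.0, 5.2, 4.8, 5.1],
}
values = [v for g in groups.values() for v in g]
grand_mean = sum(values) / len(values)

# Between-group (treatment) and within-group (error) sums of squares
ssc = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
          for g in groups.values())
sse = sum((v - sum(g) / len(g)) ** 2
          for g in groups.values() for v in g)

df_between = len(groups) - 1              # k - 1 = 2
df_within = len(values) - len(groups)     # n - k = 9
f_stat = (ssc / df_between) / (sse / df_within)
print(round(f_stat, 1))  # 19.3: a large F, so H0 would be rejected here
```

A large F leads to rejecting H0, after which pair-wise post-hoc tests locate the differing group (group B in this made-up data).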
Logic Behind an Analysis of Variance (ANOVA)
ANOVA—you take 1 continuous (“response”) variable and 1 categorical (“factor”) variable and test the null
hypothesis that all group means for the categorical variable are equal.
Example: Analyse the effects of the machine operator on the valve opening measurements of valves produced in a manufacturing plant. The measurements of the openings of 24 valves randomly selected from an assembly line are given in the table below. The mean opening is 6.34 centimeters (cm).
Logic Behind an Analysis of Variance (ANOVA)
Independent/Predictor Variable (factor):
The machine operator.
Treatment/Classification Levels of the factor:
The 4 machine operators: 1, 2, 3 and 4.
Dependent/Response Variable:
The opening measurement of the valves.
Our Hypothesis:
H0 : µ1 = µ2 = µ3 = µ4 (the mean valve opening is the same for all four operators).
Logic Behind an Analysis of Variance (ANOVA)
SST = SSC + SSE
where SST (Total Sum of Squares) measures all the variation in the dependent or response variable; SSC (Sum of Squares between Columns, i.e., treatments) measures the variation between the treatment-level means; and SSE (Sum of Squares of Error) measures the variation within each treatment level.
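The identity SST = SSC + SSE can be verified numerically; a tiny sketch on hypothetical data (two treatment columns, three observations each — not the valve data from the slides):

```python
# Numerical check of the ANOVA identity SST = SSC + SSE on a tiny
# hypothetical data set: two treatment columns, three observations each.
cols = [[6.3, 6.4, 6.2], [6.5, 6.6, 6.4]]
all_vals = [v for c in cols for v in c]
grand = sum(all_vals) / len(all_vals)

sst = sum((v - grand) ** 2 for v in all_vals)                     # total
ssc = sum(len(c) * (sum(c) / len(c) - grand) ** 2 for c in cols)  # between
sse = sum((v - sum(c) / len(c)) ** 2 for c in cols for v in c)    # within

print(abs(sst - (ssc + sse)) < 1e-12)  # True: the partition holds
```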
ANOVA using Excel
Here we observe that the values for treatment
level 3 seem to be located differently from those
of levels 2 and 4.
Regression and Clustering
Regression analysis is a statistical technique for studying linear relationships among variables. It includes many techniques for modeling and analyzing several variables when the focus is on the relationship between a dependent variable and one or more independent variables (or 'predictors'). More specifically, regression analysis helps one understand how the typical value of the dependent variable (or 'criterion variable') changes when any one of the independent variables is varied, while the other independent variables are held fixed.
With many predictors, the linear model is Y = β0 + β1X1 + β2X2 + β3X3 + β4X4 + … + β20X20 + e
[Scatter plots: Sales on the x-axis vs. Advertisement on the y-axis, with candidate fitted lines. The fit has to be a straight line — but which one?]
It aims to create a straight line that minimizes the sum of the squares of the errors, i.e., the squared residuals: the differences between the observed values and the values predicted by the model.
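The least-squares line has a closed form for a single predictor; a minimal sketch, using hypothetical Advertisement/Sales-style numbers (the slides show plots but no data):

```python
# Ordinary least squares for one predictor, via the closed-form slope
# and intercept. The x/y values below are hypothetical.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]   # e.g. advertisement spend
ys = [2.1, 3.9, 6.2, 8.0, 9.9]   # e.g. sales

n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
# Slope b1 = Sxy / Sxx; intercept b0 = y_bar - b1 * x_bar
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))
b0 = y_bar - b1 * x_bar
print(round(b1, 2), round(b0, 2))  # slope 1.97, intercept 0.11
```

This pair (b0, b1) minimizes the sum of squared residuals described above.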
• Clarification https://nptel.ac.in/courses/122104019/numerical-analysis/Rathish-kumar/least-square/r1.htm
o ANOVA
o P value
o R Square
o Adjusted R Square
o MAPE (Mean Absolute Percentage Error)
Clustering
• Suppose you are the head of a retail store and wish to understand the preferences of your customers to scale up your business. Is it possible for you to look at the details of each customer and devise a unique business strategy for each one of them? Definitely not. But what you can do is cluster all of your customers into, say, 10 groups based on their purchasing habits, and use a separate strategy for the customers in each of these 10 groups. And this is what we call clustering.
Clustering: Example - Step 1
[Scatter plot: expression in condition 1 (x-axis) vs. expression in condition 2 (y-axis), showing the data points and three initial centroids k1, k2, k3]
Clustering: Example - Step 2
Algorithm: k-means, Distance Metric: Euclidean Distance
[Scatter plot: points assigned to the nearest of the centroids k1, k2, k3]
Clustering: Example - Step 3
Algorithm: k-means, Distance Metric: Euclidean Distance
[Scatter plot: centroids k1, k2, k3 recomputed from their assigned points]
Clustering: Example - Step 4
Algorithm: k-means, Distance Metric: Euclidean Distance
[Scatter plot: points reassigned to the updated centroids k1, k2, k3]
Clustering: Example - Step 5
Algorithm: k-means, Distance Metric: Euclidean Distance
[Scatter plot: final positions of the centroids k1, k2, k3]
How does the K-Means Clustering algorithm work?
Keep repeating the process till the centroids don’t change anymore.
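The assign/update loop above can be sketched directly; a minimal k-means in plain Python, with hypothetical 2-D points and starting centroids:

```python
# Minimal k-means: assign each point to its nearest centroid (squared
# Euclidean distance), recompute each centroid as the mean of its
# cluster, and repeat until the centroids stop changing.
def kmeans(points, centroids):
    while True:
        # Assignment step
        clusters = [[] for _ in centroids]
        for p in points:
            d = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[d.index(min(d))].append(p)
        # Update step: move each centroid to the mean of its cluster
        new = [(sum(p[0] for p in cl) / len(cl),
                sum(p[1] for p in cl) / len(cl)) if cl else c
               for cl, c in zip(clusters, centroids)]
        if new == centroids:  # stop when centroids don't change anymore
            return centroids, clusters
        centroids = new

# Hypothetical data and starting centroids
points = [(1, 1), (1.5, 2), (3, 4), (5, 7), (3.5, 5), (4.5, 5), (3.5, 4.5)]
centers, clusters = kmeans(points, [(1, 1), (5, 7)])
print(centers)  # converges to one centroid per visible group
```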
Addressing the “Curse of Dimensionality”, where at times too many variables are not good. These techniques are used:
• When there are too many variables
• When you need to visualize certain results in a two-dimensional plane
• When computing time is an issue with the systems at hand
The Usage of Dimensionality reduction
• Forecasting with many variables
• Face Recognition
• Image Compression
• Gene Expression Analysis
• Data Reduction
• Data Classification
• Trend Analysis
• Factor Analysis
• Noise Reduction
• Feature Selection
  – Out of the existing features, select the most relevant based on the target
• Feature Extraction
  – Out of the existing variables (n in number), derive a new, smaller set of variables (say k < n) that contains the maximum information
  – Factor Analysis
  – Principal Component Analysis
  – Linear Discriminant Analysis
We can picture PCA (Principal Component Analysis) as a technique that finds the directions of maximal variance:
• They are the directions where there is the most variance, the directions where the data is most spread out.
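For two variables, those directions can be found by hand: center the data, form the 2×2 covariance matrix, and solve its eigenproblem in closed form. The data points below are hypothetical.

```python
# PCA as "directions of maximal variance", sketched for two variables.
import math

data = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2),
        (3.1, 3.0), (2.3, 2.7), (2.0, 1.6), (1.0, 1.1)]
n = len(data)
mx = sum(x for x, _ in data) / n
my = sum(y for _, y in data) / n

# Sample covariance matrix [[sxx, sxy], [sxy, syy]]
sxx = sum((x - mx) ** 2 for x, _ in data) / (n - 1)
syy = sum((y - my) ** 2 for _, y in data) / (n - 1)
sxy = sum((x - mx) * (y - my) for x, y in data) / (n - 1)

# Closed-form eigenvalues of a symmetric 2x2 matrix; the larger one is
# the variance along the first principal component.
mean_diag = (sxx + syy) / 2
offset = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
lam1, lam2 = mean_diag + offset, mean_diag - offset
print(round(lam1, 3), round(lam2, 3))
```

The first principal component captures lam1 / (lam1 + lam2) of the total variance, which is how PCA decides which directions to keep.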
LDA attempts to find a feature subspace that maximizes class separability (note that LD 2 would be a very bad linear discriminant in the figure above).
Basically, it tries to maximize Fisher’s ratio: (µ1 − µ2)² / (σ1² + σ2²)
Briefly, Bayes’ Theorem can be used to estimate the probability of the output class (k) given the input (x), using the probability of each class and the probability of the data belonging to each class:
P(Y=k|X=x) = (πk * fk(x)) / Σl (πl * fl(x))
where πk refers to the base probability of each class (k) observed in your training data (e.g. 0.5 for a 50-50 split in a two-class problem). In Bayes’ Theorem this is called the prior probability:
πk = nk/n
fk(x) above is the estimated probability of x belonging to class k.
Dk(x) = x * (µk/σ²) − µk²/(2σ²) + ln(πk)
Dk(x) is the discriminant function for class k given input x; µk, σ² and πk are all estimated from your data.
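The discriminant function can be evaluated directly: the class with the largest Dk(x) is predicted. A minimal sketch with two hypothetical classes sharing one variance (the means, priors and σ² below are made up):

```python
# Evaluating the LDA discriminant Dk(x) = x*(mu_k/sigma^2)
# - mu_k^2/(2*sigma^2) + ln(pi_k) for two HYPOTHETICAL classes with a
# shared variance; the class with the larger Dk(x) is predicted.
import math

params = {  # class -> (mu_k, pi_k)
    "A": (4.0, 0.5),
    "B": (8.0, 0.5),
}
sigma2 = 1.5  # shared variance estimate

def discriminant(x, mu, pi):
    return x * (mu / sigma2) - mu ** 2 / (2 * sigma2) + math.log(pi)

x = 5.0
scores = {k: discriminant(x, mu, pi) for k, (mu, pi) in params.items()}
print(max(scores, key=scores.get))  # "A": x = 5 is closer to mu_A = 4
```

With equal priors, the rule reduces to picking the class whose mean is nearest to x, which matches the intuition behind Fisher’s ratio.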
The error terms e1, e2, and e3 serve to indicate that the hypothesized relationships are not exact.