Understanding Six Sigma
Analyze Phase
• 4M Diagram
• Hypothesis Testing
• Mean Testing
• Variance Testing
• Regression Analysis
The aim of the Analyze Phase is to:

a. List all the possible factors through brainstorming.
b. Pick out the vital few potential factors from the trivial many.
c. Verify the potential factors statistically by means of various types of tests.
d. Ascertain that all factors have been considered and nothing is left out.
Analyze - Cause & Effect Diagram
4M Diagram
[Figure: fishbone (cause-and-effect) diagram. Four main bones labelled MACHINE, MAN, METHOD and MATERIAL each carry individual causes and lead to the Effect box on the right.]
The 4M (Man, Method, Machine and Material) Diagram is used to list all the probable factors (causes) responsible for the major problem (effect). After brainstorming, the significant factors are selected for further comparison (Hypothesis Testing).
The symptom or result is put in the dark box on the right. The lighter boxes at the ends of the large bones are the main groups into which ideas are classified: the four Ms (Man, Method, Machine and Material). The middle bones indicate the direction of the path from cause to effect.
Analyze - Cause & Effect Diagram
[Figure: example 4M diagram for the effect OUT CASE DENT, with main bones Machine, Man, Material and Method. Cause labels recoverable from the figure: Cylinder, New casual, Failure, Die, Handling problem, Setting, M/C not clean, Dented sheet, Chips on sheet, Piece unloading, Piece check, Sheet thickness.]
Analyze - Tests used for Comparison
Steps Involved in Analyze

1) Normality Test
2) Variance Test
3) Mean Test

Step 1: Before running any other test in the Analyze phase, check the normality of the data. If the p-value of the normality test is > 0.05, the data can be treated as normal.
Step 2: Variance testing. The test to use depends on whether the data is normal:

                           Normal data         Non-normal data
Two variance test          F-Test              Levene's Test
Test for equal variance    Bartlett's Test     Levene's Test

Step 3: Mean testing. For the 1-Sample Z test and the 1-Sample t test, variance testing is not required.
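The selection logic in the table above can be sketched as a small lookup. This is only an illustration: the function names `is_normal` and `choose_variance_test` are hypothetical, and the 0.05 threshold is the one quoted in Step 1.

```python
def is_normal(normality_p_value: float, alpha: float = 0.05) -> bool:
    """Treat the data as normal when the normality-test p-value exceeds alpha."""
    return normality_p_value > alpha


def choose_variance_test(normal: bool, n_groups: int) -> str:
    """Return the variance test suggested by the table for the given situation."""
    if n_groups < 2:
        raise ValueError("a variance comparison needs at least two groups")
    if not normal:
        # Levene's test is listed for non-normal data in both rows of the table.
        return "Levene's Test"
    # Normal data: F-test for two groups, Bartlett's test for more than two.
    return "F-Test" if n_groups == 2 else "Bartlett's Test"


print(choose_variance_test(is_normal(0.30), 2))   # normal data, two groups
print(choose_variance_test(is_normal(0.01), 4))   # non-normal data, four groups
```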
Analyze - Hypothesis
Hypothesis Testing
A hypothesis is an assumption, something taken to be true for the purpose of argument or investigation. There are two types of hypothesis:
• Null Hypothesis (Ho)
• Alternate Hypothesis (Ha)
Ho (Null Hypothesis) is assumed to be true. This is like the defendant being assumed to be innocent.
Ha (Alternate Hypothesis) is the alternative to the Null Hypothesis. Ha is the one that must be proved.
Hypothesis Testing is defined as the comparison of two or more Populations (equality of mean / variance) by taking samples from those Populations. It is assumed in the beginning that the Populations are equal (Null Hypothesis: μ1 = μ2 = ... = μn; σ1 = σ2 = ... = σn). The Alternate Hypothesis is that they are not all equal (at least one μi or σi differs from the others). The equality is checked by actually conducting tests on the samples. There is always a risk associated with the result, in case the sample taken from a Population does not correctly represent that Population.
Analyze - Tests used for Comparison
[Figure: tree of the tests used in the Analyze phase]
Mean Testing, Continuous Data: 1 Sample Z Test, 1 Sample t Test, 2 Sample t Test, ANOVA
Mean Testing, Discrete Data: 1 Proportion Test, 2 Proportion Test, Chi-Square Test
Variance Testing: Test for Equal Variance, 2 Variance Test
There are many types of hypothesis test. The test is selected depending on the type of data and the comparison required.

Continuous Data
1) 2 Variance Test : Compares variances
   • Levene's Test
   • Bartlett's Test
2) t-test : Compares means
   • 1 sample t-test
   • 2 sample t-test
   • Paired t-test
Discrete Data
3) Chi Square Test : Compares counts
   • Goodness of Fit
   • Contingency Table

Decision outcomes (true state of the Population versus the decision made from the Sample):

                        Population: Ho true       Population: Ha true
Sample decision: Ho     Correct Decision          Type 2 Error (β)
Sample decision: Ha     Type 1 Error (α)          Correct Decision

A correct decision results when the sample correctly represents the Population. A Type 1 or Type 2 error (incorrect decision) results when the sample does not correctly represent the Population.
Conditions for Not Rejecting the Null Hypothesis
• If the calculated value of the test statistic (in absolute value for a two-sided test) is ≤ the table (critical) value, the Null Hypothesis is not rejected.
• If the p-value is ≥ α, the Null Hypothesis is not rejected.
• If 0 falls within the confidence interval around the difference of the two means, the Null Hypothesis is not rejected.
The Alternate Hypothesis is accepted under the converse of the above conditions.
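The p-value and confidence-interval rules above can be written as two small checks. This is a sketch with hypothetical helper names (`decide_by_p_value`, `zero_in_interval`); the numbers passed in are taken from the worked 2-sample t and 1-sample Z examples in this handout.

```python
def decide_by_p_value(p_value: float, alpha: float = 0.05) -> str:
    """Apply the rule: p-value >= alpha means Ho is not rejected."""
    return "do not reject Ho" if p_value >= alpha else "reject Ho"


def zero_in_interval(lower: float, upper: float) -> bool:
    """True when 0 lies inside the confidence interval for the difference."""
    return lower <= 0.0 <= upper


# 2-sample t example: P = 0.701, 95% CI for the difference (-1.450131, 0.979631)
print(decide_by_p_value(0.701))
print(zero_in_interval(-1.450131, 0.979631))

# 1-sample Z example: P = 0.002, tested at alpha = 0.10
print(decide_by_p_value(0.002, alpha=0.10))
```

Both rules give the same decision when the confidence level of the interval equals 1 - α.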
Important Terms
1) Type 1 Error : The error of rejecting the Null Hypothesis when it is true, that is, rejecting the right material. This happens when an unrepresentative sample gets selected for the comparison of mean / variance. Its probability is α, also known as the α-Error or Producer's Risk. Its value is generally set at around 5 %.
2) Type 2 Error : The error of not rejecting the Null Hypothesis when it is false, that is, accepting the wrong material. This also happens when an unrepresentative sample is selected for comparison. Its probability is β, also known as the β-Error or Consumer's Risk. Its value is generally set at around 10 %.
3) 1-α = Confidence of the Test
The probability of correctly not rejecting Ho when it is true.
4) 1-β = Power of the Test
The probability of correctly rejecting Ho when it is false. For a given sample size, as α increases, β decreases, so the Power of the Test increases.
It is not possible to commit a Type 1 and a Type 2 error simultaneously, because a single test either rejects Ho or does not.
1 Sample Z Test : This test compares the mean of a sample with a test (hypothesized) Population mean when the Population standard deviation σ is known. It computes a confidence interval or performs a hypothesis test of the mean. This procedure is based upon the normal distribution.
Example : Measurements were made on nine widgets. You know that the distribution of measurements has historically been close to normal with σ = 0.2. Because you know σ, and you wish to test if the population mean is 5 and obtain a 90% confidence interval for the mean, you use the Z-procedure.
Solution :
1 Open the worksheet and enter the values in a column named Values.
2 Choose Stat > Basic Statistics > 1-Sample Z.
3 In Samples in columns, enter Values.
4 In Standard deviation, enter 0.2.
5 In Test mean, enter 5.
6 Click Options. In Confidence level, enter 90. Click OK.
7 Click Graphs. Check Individual value plot. Click OK in each dialog box.

Values : 4.9, 5.1, 4.6, 5, 5.1, 4.7, 4.4, 4.7, 4.6

Minitab Output :
One-Sample Z: Values
Test of mu = 5 vs not = 5
The assumed standard deviation = 0.2
Variable N Mean StDev SE Mean 90% CI Z P
Values 9 4.78889 0.24721 0.06667 (4.67923, 4.89855) -3.17 0.002
The test mean does not lie within the CI and the p-value is < 0.05, hence Ha, the Alternate Hypothesis, is accepted.
Interpreting the results

The test statistic, Z, for testing if the population mean equals 5 is -3.17. The p-value, the probability of obtaining a test statistic at least this extreme if the null hypothesis is true, is 0.002. This is also called the attained significance level of the test. Because the p-value of 0.002 is smaller than commonly chosen α-levels, there is significant evidence that μ is not equal to 5, so you can reject H0 in favor of μ not being 5.
A hypothesis test at α = 0.1 could also be performed by viewing the individual value plot. The hypothesized value falls outside the 90% confidence interval for the population mean (4.67923, 4.89855), and so you can reject the null hypothesis.
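As a cross-check outside Minitab, the same Z statistic, p-value and 90% confidence interval can be reproduced with the Python standard library. This is a minimal sketch using the nine widget values and σ = 0.2 from the example; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist, mean

values = [4.9, 5.1, 4.6, 5.0, 5.1, 4.7, 4.4, 4.7, 4.6]
sigma = 0.2        # known population standard deviation
mu0 = 5.0          # hypothesized (test) mean
confidence = 0.90

n = len(values)
xbar = mean(values)
se_mean = sigma / sqrt(n)              # standard error of the mean
z = (xbar - mu0) / se_mean             # test statistic

std_normal = NormalDist()              # standard normal, mean 0 and sd 1
p_value = 2 * std_normal.cdf(-abs(z))  # two-sided p-value

z_crit = std_normal.inv_cdf(1 - (1 - confidence) / 2)
ci = (xbar - z_crit * se_mean, xbar + z_crit * se_mean)

print(f"mean = {xbar:.5f}, SE mean = {se_mean:.5f}")
print(f"Z = {z:.2f}, P = {p_value:.3f}")
print(f"90% CI = ({ci[0]:.5f}, {ci[1]:.5f})")
```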
1 Sample t test : This test compares the mean of a sample with a test (hypothesized) Population mean when the Population standard deviation σ is unknown. This procedure is based upon the t-distribution, which is derived from a normal distribution with unknown σ.
Example : Measurements were made on nine widgets. You know that the distribution of widget measurements has historically been close to normal, but suppose that you do not know σ. To test if the population mean is 5 and to obtain a 90% confidence interval for the mean, you use a t-procedure.
Solution :
1 Open the worksheet and enter the data in a column named Values.
2 Choose Stat > Basic Statistics > 1-Sample t.
3 In Samples in columns, enter Values.
4 In Test mean, enter 5.
5 Click Options. In Confidence level, enter 90. Click OK in each dialog box.

Values : 4.9, 5.1, 4.6, 5, 5.1, 4.7, 4.4, 4.7, 4.6

Minitab Output :
One-Sample T: Values
Test of mu = 5 vs not = 5
Variable N Mean StDev SE Mean 90% CI T P
Values 9 4.78889 0.24721 0.08240 (4.63566, 4.94212) -2.56 0.034
Result Interpretation :
The p-value (0.034) is < 0.10, the α that corresponds to the 90% confidence level, and the test mean does not lie within the Confidence Interval. So the Null Hypothesis is rejected and the Alternate Hypothesis is accepted: the Population mean is not equal to the test mean of 5.
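The T statistic and the 90% interval above can be checked by hand. The sketch below uses only the standard library, so the critical value 1.8595 (t-distribution, 8 degrees of freedom, two-sided 90%) is hard-coded from a t-table as an assumption instead of being computed, and the exact p-value is not calculated.

```python
from math import sqrt
from statistics import mean, stdev

values = [4.9, 5.1, 4.6, 5.0, 5.1, 4.7, 4.4, 4.7, 4.6]
mu0 = 5.0            # hypothesized (test) mean
T_CRIT_90_DF8 = 1.8595   # assumed table value: t(0.95, df = 8)

n = len(values)
xbar = mean(values)
s = stdev(values)             # sample standard deviation (n - 1 divisor)
se_mean = s / sqrt(n)         # estimated standard error of the mean
t_stat = (xbar - mu0) / se_mean

ci = (xbar - T_CRIT_90_DF8 * se_mean, xbar + T_CRIT_90_DF8 * se_mean)
reject_h0 = abs(t_stat) > T_CRIT_90_DF8   # critical-value rule at alpha = 0.10

print(f"StDev = {s:.5f}, SE mean = {se_mean:.5f}, T = {t_stat:.2f}")
print(f"90% CI = ({ci[0]:.5f}, {ci[1]:.5f}), reject Ho: {reject_h0}")
```

Compared with the Z-procedure on the same data, the standard error is larger because σ is estimated from the sample, which widens the interval.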
2 Sample t test : It computes a confidence interval and performs a hypothesis test of the difference between two population means when the σ's are unknown and the samples are drawn independently from each other. This procedure is based upon the t-distribution, and for small samples it works best if the data were drawn from distributions that are normal or close to normal. You can have increasing confidence in the results as the sample sizes increase.
Example : A study was performed in order to evaluate the effectiveness of two devices for improving the efficiency of gas home-heating systems. Energy consumption in houses was measured after one of the two devices was installed. The two devices were an electric vent damper (Damper=1) and a thermally activated vent damper (Damper=2). The energy consumption data (BTU.In) are stacked in one column with a grouping column (Damper) containing identifiers or subscripts to denote the population. Suppose that you performed a variance test and found no evidence for the variances being unequal. Now you want to compare the effectiveness of these two devices by determining whether or not there is any evidence that the difference between the devices is different from zero.
Solution :
1 Open the worksheet and enter the data.
2 Choose Stat > Basic Statistics > 2-Sample T.
3 Choose Samples in one column.
4 In Samples, enter 'BTU.In'.
5 In Subscripts, enter Damper.
6 Check Assume equal variances. Click OK.

Data (BTU.In, grouped by Damper) :
Damper 1 : 7.87, 9.43, 7.16, 8.67, 12.31, 9.84, 16.9, 10.04, 12.62, 7.62,
           11.12, 13.43, 9.07, 6.94, 10.28, 9.37, 7.93, 13.96, 6.8, 8.58,
           4, 8, 5.98, 15.24, 8.54, 11.09, 11.7, 12.71, 6.78, 9.82,
           12.91, 10.35, 9.6, 9.58, 9.83, 9.52, 18.26, 10.64, 6.62, 5.2
Damper 2 : 12.28, 7.23, 2.97, 8.81, 9.27, 11.29, 8.29, 9.96, 10.3, 16.06,
           14.24, 11.43, 10.28, 13.6, 5.94, 10.36, 6.85, 6.72, 10.21, 8.61
(The listing shows only 20 of the Damper 2 values; the Minitab output reports N = 50 for Damper 2.)
Minitab Output :
Two-Sample T-Test and CI: BTU.In, Damper
Two-sample T for BTU.In
Damper N Mean StDev SE Mean
1 40 9.91 3.02 0.48
2 50 10.14 2.77 0.39
Difference = mu (1) - mu (2)
Estimate for difference: -0.235250
95% CI for difference: (-1.450131, 0.979631)
T-Test of difference = 0 (vs not =): T-Value = -0.38 P-Value = 0.701 DF = 88
Both use Pooled StDev = 2.8818
Result Interpretation :
Minitab displays a table of the sample sizes, sample means, standard deviations, and standard errors for the two samples.
Since we previously found no evidence for the variances being unequal, we chose to use the pooled standard deviation by choosing Assume equal variances. The pooled standard deviation, 2.8818, is used to calculate the test statistic and the confidence intervals.
A second table gives a confidence interval for the difference in population means. For this example, the 95% confidence interval is (-1.45, 0.98), which includes zero, so the data give no evidence of a difference. Next is the hypothesis test result. The test statistic is -0.38, with a p-value of 0.701 and 88 degrees of freedom.
Since the p-value is greater than commonly chosen α-levels, there is no evidence for a difference in energy use when using an electric vent damper versus a thermally activated vent damper.
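The pooled standard deviation and T-value can be reproduced from summary statistics alone. In this sketch the sample sizes and the estimated difference come from the 2-sample t output, the five-decimal standard deviations (3.01987 and 2.76702) are the ones reported in the 2 Variances example for the same data, and the critical value 1.9873 for 88 degrees of freedom is an assumed t-table value.

```python
from math import sqrt

n1, s1 = 40, 3.01987     # Damper 1: sample size, standard deviation
n2, s2 = 50, 2.76702     # Damper 2: sample size, standard deviation
diff = -0.235250         # estimate for difference, mu(1) - mu(2)
T_CRIT_95_DF88 = 1.9873  # assumed table value: t(0.975, df = 88)

df = n1 + n2 - 2
# Pooled variance: each sample variance weighted by its degrees of freedom.
pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
pooled_sd = sqrt(pooled_var)

se_diff = pooled_sd * sqrt(1 / n1 + 1 / n2)   # standard error of the difference
t_value = diff / se_diff
ci = (diff - T_CRIT_95_DF88 * se_diff, diff + T_CRIT_95_DF88 * se_diff)

print(f"Pooled StDev = {pooled_sd:.4f}, DF = {df}")
print(f"T-Value = {t_value:.2f}")
print(f"95% CI for difference = ({ci[0]:.3f}, {ci[1]:.3f})")
```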
ANOVA : ANOVA determines if the variation between the averages of the levels is greater than the variation that occurs within the levels.

F ratio = Avg SS between / Avg SS within

ANOVA is a tool with which we can compare several means. It is used to search for the significant X factors that have an influence on the response variable Y. In effect, analysis of variance extends the two-sample t-test for testing the equality of two population means to a more general null hypothesis of comparing the equality of more than two means, versus them not all being equal.
[Figure: two levels, level 1 (observed value, at present) and level 2 (expected value, target), separated by a gap, Delta (d). The total variation is split into between-group variation (SSB) and within-group variation (SSW).]
Example : You design an experiment to assess the durability of four experimental carpet products. You place a sample of each of the carpet products in four homes and you measure durability after 60 days. Because you wish to test the equality of means and to assess the differences in means, you use the one-way ANOVA procedure (data in stacked form) with multiple comparisons. Generally, you would choose one multiple comparison method as appropriate for your data. However, two methods are selected here to demonstrate Minitab's capabilities.
Solution :
1 Open the worksheet and enter the data.
2 Choose Stat > ANOVA > One-Way.
3 In Response, enter Durability. In Factor, enter Carpet.
4 Click OK in each dialog box.

Data (Durability, grouped by Carpet) :
Carpet 1 : 18.95, 12.62, 11.94, 14.42
Carpet 2 : 10.06, 7.19, 7.03, 14.66
Carpet 3 : 10.92, 13.28, 14.52, 12.51
Carpet 4 : 10.46, 21.4, 18.1, 22.5
Minitab Output :
One-way ANOVA: Durability versus Carpet
Source DF SS MS F P
Carpet 3 146.4 48.8 3.58 0.047
Error 12 163.5 13.6
Total 15 309.9
S = 3.691 R-Sq = 47.24% R-Sq(adj) = 34.05%
Individual 95% CIs For Mean Based on
Pooled StDev
Level N Mean StDev ---------+---------+---------+---------+
1 4 14.483 3.157 (-------*-------)
2 4 9.735 3.566 (-------*--------)
3 4 12.808 1.506 (-------*-------)
4 4 18.115 5.435 (-------*-------)
---------+---------+---------+---------+
10.0 15.0 20.0 25.0
Pooled StDev = 3.691
Result Interpretation : The p-value (0.047) is < 0.05, so the Null Hypothesis is rejected in favour of Ha: the mean durabilities of the four carpets are not all equal. R-Sq and R-Sq(adj) are the coefficients of determination, not correlation coefficients. Both are below 64 %, so the Carpet factor explains only a small part of the variation in Durability.
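The sums of squares and the F ratio in the ANOVA table can be recomputed directly from the sixteen durability values. This is a plain standard-library sketch; the critical value 3.49 for F(3, 12) at α = 0.05 is an assumed table value, and the exact p-value is left to Minitab.

```python
from statistics import mean

durability = {
    1: [18.95, 12.62, 11.94, 14.42],
    2: [10.06, 7.19, 7.03, 14.66],
    3: [10.92, 13.28, 14.52, 12.51],
    4: [10.46, 21.4, 18.1, 22.5],
}
F_CRIT_05_3_12 = 3.49   # assumed table value: F(0.05; 3, 12)

all_values = [y for group in durability.values() for y in group]
grand_mean = mean(all_values)

# Between-group SS: squared distance of each level mean from the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in durability.values())
# Within-group SS: squared distance of each value from its own level mean.
ss_within = sum((y - mean(g)) ** 2 for g in durability.values() for y in g)

df_between = len(durability) - 1
df_within = len(all_values) - len(durability)
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within
r_sq = ss_between / (ss_between + ss_within)

print(f"SS between = {ss_between:.1f}, SS within = {ss_within:.1f}")
print(f"F = {f_ratio:.2f}, R-Sq = {100 * r_sq:.2f}%")
print("reject Ho" if f_ratio > F_CRIT_05_3_12 else "do not reject Ho")
```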
1 Proportion Test : Performs a test of one binomial proportion.

Use 1 Proportion to compute a confidence interval and perform a hypothesis test of the
proportion. For example, an automotive parts manufacturer claims that his spark plugs are
less than 2% defective. You could take a random sample of spark plugs and determine
whether or not the actual proportion defective is consistent with the claim. For a two-tailed
test of a proportion:
H0: p = p0 versus H1: p ≠ p0 where p is the population proportion and p0 is the
hypothesized value.
Example : A county district attorney would like to run for the office of state district
attorney. She has decided that she will give up her county office and run for state office if
more than 65% of her party constituents support her. You need to test H0: p = .65 versus
H1: p > .65.
As her campaign manager, you collected data on 950 randomly selected party members
and find that 560 party members support the candidate. A test of proportion was
performed to determine whether or not the proportion of supporters was greater than the
required proportion of 0.65. In addition, a 95% confidence bound was constructed to
determine the lower bound for the proportion of supporters.
1 Choose Stat > Basic Statistics > 1 Proportion.
2 Choose Summarized data.
3 In Number of trials, enter 950. In Number of events, enter 560.
4 Click Options. In Test proportion, enter 0.65.
5 From Alternative, choose greater than. Click OK in each dialog box.

Session window output
Test and CI for One Proportion
Test of p = 0.65 vs p > 0.65
95%
Lower Exact
Sample X N Sample p Bound P-Value
1 560 950 0.589474 0.562515 1.000
The p-value of 1.0 suggests that the data are consistent with the null hypothesis (H0: p =
0.65), that is, the proportion of party members that support the candidate is not greater than
the required proportion of 0.65. As her campaign manager, you would advise her not to run
for the office of state district attorney.
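The exact p-value of 1.000 can be understood as an upper-tail binomial probability: the chance of seeing 560 or more supporters out of 950 if the true proportion were 0.65. The sketch below computes it with the standard library, working in logarithms to avoid overflow; the helper name `binom_log_pmf` is hypothetical, and the exact lower confidence bound is not reproduced.

```python
from math import exp, lgamma, log


def binom_log_pmf(k: int, n: int, p: float) -> float:
    """Natural log of the binomial probability of exactly k events in n trials."""
    log_comb = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_comb + k * log(p) + (n - k) * log(1 - p)


x, n = 560, 950      # events (supporters) and trials (party members sampled)
p0 = 0.65            # hypothesized proportion under Ho

sample_p = x / n
# One-sided test, Ha: p > p0, so the p-value is P(X >= x) under Ho.
p_value = min(1.0, sum(exp(binom_log_pmf(k, n, p0)) for k in range(x, n + 1)))

print(f"Sample p = {sample_p:.6f}")
print(f"Exact P-Value = {p_value:.3f}")
```

Because the sample proportion (about 0.589) is below 0.65, almost every outcome under Ho is at least as favourable to Ha as the observed one, which is why the p-value is so close to 1.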
2 Proportion Test : Performs a test of two binomial proportions.
Use the 2 Proportions command to compute a confidence interval and perform a hypothesis test of the difference between two proportions. For example, suppose you wanted to know whether the proportion of consumers who return a survey could be increased by providing an incentive such as a product sample. You might include the product sample with half of your mailings and see if you have more responses from the group that received the sample than from those who did not. For a two-tailed test of two proportions:
H0: p1 - p2 = p0 versus H1: p1 - p2 ≠ p0
where p1 and p2 are the proportions of success in populations 1 and 2, respectively, and p0 is the hypothesized difference between the two proportions.
Example : As your corporation's purchasing manager, you need to authorize the purchase
of twenty new photocopy machines. After comparing many brands in terms of price, copy
quality, warranty, and features, you have narrowed the choice to two: Brand X and Brand Y.
You decide that the determining factor will be the reliability of the brands as defined by the
proportion requiring service within one year of purchase.
Because your corporation already uses both of these brands, you were able to obtain information on the service history of 50 randomly selected machines of each brand. Records indicate that six Brand X machines and eight Brand Y machines needed service, so 44 Brand X machines and 42 Brand Y machines did not. Use this information to guide your choice of brand for purchase. In the steps below, an event is a machine that did not need service.
1 Choose Stat > Basic Statistics > 2 Proportions.
2 Choose Summarized data.
3 In First sample, under Trials, enter 50. Under Events, enter 44.
4 In Second sample, under Trials, enter 50. Under Events, enter 42. Click OK.
Test and CI for Two Proportions

Sample X N Sample p
1 44 50 0.880000
2 42 50 0.840000
Difference = p (1) - p (2)
Estimate for difference: 0.04
95% CI for difference: (-0.0957903, 0.175790)
Test for difference = 0 (vs not = 0): Z = 0.58 P-Value = 0.564
The p-value is > 0.05, so there is no evidence that the proportions for the two brands differ.
Since the p-value of 0.564 is larger than commonly chosen α-levels, the data are consistent with the null hypothesis (H0: p1 - p2 = 0). That is, the proportion of photocopy machines that needed service in the first year did not differ depending on brand. As the purchasing manager, you need to find a different criterion to guide your decision on which brand to purchase.
You can make the same decision using the 95% confidence interval. Because zero falls in
the confidence interval of (-0.096 to 0.176) you can conclude that the data are consistent
with the null hypothesis. If you think that the confidence interval is too wide and does not
provide precise information as to the value of p1 - p2, you may want to collect more data in
order to obtain a better estimate of the difference.
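The Z value, p-value and 95% interval for the photocopier example follow from the normal approximation with an unpooled standard error, which reproduces the figures in the output above. A minimal standard-library sketch, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 44, 50    # Brand X: events (no service needed), trials
x2, n2 = 42, 50    # Brand Y: events (no service needed), trials

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2
# Unpooled standard error of the difference between two proportions.
se_diff = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

std_normal = NormalDist()
z = diff / se_diff
p_value = 2 * std_normal.cdf(-abs(z))     # two-sided test of difference = 0

z_crit = std_normal.inv_cdf(0.975)        # about 1.96 for a 95% interval
ci = (diff - z_crit * se_diff, diff + z_crit * se_diff)

print(f"Estimate for difference: {diff:.2f}")
print(f"Z = {z:.2f}, P-Value = {p_value:.3f}")
print(f"95% CI for difference: ({ci[0]:.6f}, {ci[1]:.6f})")
```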
Chi Square Test : It is a measure of the difference between the observed and expected frequencies. The Chi Square test is a statistical test which covers three different types of analysis:
1) Goodness of Fit
2) Test for Homogeneity
3) Test for Independence
The test for Goodness of Fit determines if the sample under analysis was drawn from a population that follows some specified distribution.
The Test for Homogeneity tests the proposition that several populations are homogeneous with respect to some characteristic.
The Test for Independence tests the Null Hypothesis that two criteria of classification are independent.
Example : You are interested in the relationship between gender and political party
affiliation. You query 100 people about their political affiliation and record the number of
males (row 1) and females (row 2) for each political party. The worksheet data appears as
follows:
Column 1 Column 2 Column 3
Democrat Republican Other
28 18 4
22 27 1
1 Open the worksheet EXH_TABL.MTW.
2 Choose Stat > Tables > Chi-Square Test (Table in Worksheet).
3 In Columns containing the table, enter Democrat, Republican and Other. Click OK.
Chi-Square Test: Democrat, Republican, Other
Expected counts are printed below observed counts
Chi-Square contributions are printed below expected counts
Democrat Republican Other Total
1 28 18 4 50
25.00 22.50 2.50
0.360 0.900 0.900
2 22 27 1 50
25.00 22.50 2.50
0.360 0.900 0.900
Total 50 45 5 100
Chi-Sq = 4.320, DF = 2, P-Value = 0.115
2 cells with expected counts less than 5.
Reading the output : in each cell the first line is the Observed Frequency, the second line is the Expected Frequency and the third line is the Chi Square Value (contribution) for that cell. The right-hand column holds the Row Totals, the bottom row holds the Column Totals, and 100 is the Grand Total.
No evidence exists for association (p = 0.115) between gender and political party affiliation.
Of the 6 cells, 2 have expected counts less than five (33%). Therefore, even if you had
a significant p-value for these data, you should interpret the results with caution. To
be more confident of the results, repeat the test, omitting the Other category.
Formulae :
a) Expected Frequency = ( Row Total x Column Total ) / Grand Total
b) Chi Square Value = Σ ( Observed Frequency - Expected Frequency )² / Expected Frequency, summed over all cells
c) Degrees of Freedom (DF) = ( No. of Rows - 1 ) x ( No. of Columns - 1 )
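Applying formulae a) to c) to the gender and party table reproduces Chi-Sq = 4.320 with DF = 2. The sketch below uses only the standard library. The p-value line relies on the fact that, for exactly 2 degrees of freedom, the chi-square upper-tail probability is exp(-x/2); for any other DF a chi-square table or a statistics library would be needed.

```python
from math import exp

observed = [
    [28, 18, 4],   # row 1: males   (Democrat, Republican, Other)
    [22, 27, 1],   # row 2: females (Democrat, Republican, Other)
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# a) Expected frequency = row total x column total / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# b) Chi-square value = sum over cells of (observed - expected)^2 / expected
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# c) Degrees of freedom = (rows - 1) x (columns - 1)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

# Closed form valid only because dof == 2 here.
p_value = exp(-chi_sq / 2) if dof == 2 else None

small_cells = sum(e < 5 for exp_row in expected for e in exp_row)
print(f"Chi-Sq = {chi_sq:.3f}, DF = {dof}, P-Value = {p_value:.3f}")
print(f"{small_cells} cells with expected counts less than 5")
```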

2 Variance Test : Use to perform hypothesis tests for equality, or homogeneity, of variance among two populations using an F-test and Levene's test. Many statistical procedures, including the two sample t-test procedures, assume that the two samples are from populations with equal variance. The variance test procedure will test the validity of this assumption.
Example : A study was performed in order to evaluate the effectiveness of two devices for
improving the efficiency of gas home-heating systems. Energy consumption in houses was
measured after one of the two devices was installed. The two devices were an electric vent
damper (Damper = 1) and a thermally activated vent damper (Damper = 2). The energy
consumption data (BTU.In) are stacked in one column with a grouping column (Damper)
containing identifiers or subscripts to denote the population. You are interested in comparing
the variances of the two populations so that you can construct a two-sample t-test and
confidence interval to compare the two dampers.
1 Choose Stat > Basic Statistics > 2 Variances.
2 Choose Samples in one column.
3 In Samples, enter 'BTU.In' (the column which contains the values).
4 In Subscripts, enter Damper. Click OK.

[Figure: Test for Equal Variances for BTU.In. Upper panel: 95% Bonferroni Confidence Intervals for StDevs for Damper 1 and Damper 2 (axis from 2.0 to 4.0). Lower panel: boxplots of BTU.In by Damper (axis from 5 to 20). The F-Test result is used if the data is normal; the Levene's Test result is used if the data is not normal.]
Test for Equal Variances: BTU.In versus Damper
95% Bonferroni confidence intervals for standard deviations
Damper N Lower StDev Upper
1 40 2.40655 3.01987 4.02726
2 50 2.25447 2.76702 3.56416
F-Test (normal distribution)
Test statistic = 1.19, p-value = 0.558
Levene's Test (any continuous distribution)
Test statistic = 0.00, p-value = 0.996
Result Interpretation
The variance test generates a plot that displays Bonferroni 95% confidence intervals for the population standard deviation at both factor levels. The graph also displays the side-by-side boxplots of the raw data for the two samples. Finally, the results of the F-test and Levene's test are given in both the Session window and the graph. Note that the 95% confidence level applies to the family of intervals and the asymmetry of the intervals is due to the skewness of the chi-square distribution.
For the energy consumption example, the p-values of 0.558 and 0.996 are greater than reasonable choices of α, so you fail to reject the null hypothesis of the variances being equal. That is, these data do not provide enough evidence to claim that the two populations have unequal variances. Thus, it is reasonable to assume equal variances when using a two-sample t-procedure.
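The F-test statistic of 1.19 is simply the ratio of the two sample variances. The short sketch below recomputes it from the standard deviations in the output; the p-value of 0.558 requires the F-distribution with 39 and 49 degrees of freedom and is not recomputed here.

```python
n1, s1 = 40, 3.01987   # Damper 1: sample size, standard deviation
n2, s2 = 50, 2.76702   # Damper 2: sample size, standard deviation

# F statistic: ratio of the sample variances (squared standard deviations).
f_stat = s1**2 / s2**2
df_numerator, df_denominator = n1 - 1, n2 - 1

print(f"F = {f_stat:.2f} with ({df_numerator}, {df_denominator}) degrees of freedom")
```

A ratio close to 1 means the two sample variances are similar, which is consistent with the large p-value reported by Minitab.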

Test for Equal Variance : This test is used when comparing the variances of two or more populations.
Example : You study conditions conducive to potato rot by injecting potatoes with bacteria that cause rotting and subjecting them to different temperature and oxygen regimes. Before performing analysis of variance, you check the equal variance assumption using the test for equal variances.
Solution :
1 Open the worksheet.
2 Choose Stat > ANOVA > Test for Equal Variances.
3 In Response, enter Rot.
4 In Factors, enter Temp Oxygen. Click OK.
[Figure: Test for Equal Variances for Rot. 95% Bonferroni Confidence Intervals for StDevs for each Temp and Oxygen combination (axis from 0 to 140).]
Test for Equal Variances: Rot versus Temp, Oxygen
95% Bonferroni confidence intervals for standard deviations
Temp Oxygen N Lower StDev Upper
10 2 3 2.26029 5.29150 81.890
10 6 3 1.28146 3.00000 46.427
10 10 3 2.80104 6.55744 101.481
16 2 3 1.54013 3.60555 55.799
16 6 3 1.50012 3.51188 54.349
16 10 3 3.55677 8.32666 128.862
Bartlett's Test (normal distribution)
Test statistic = 2.71, p-value = 0.744
Levene's Test (any continuous distribution)
Test statistic = 0.37, p-value = 0.858
Result Interpretation
The test for equal variances generates a plot that displays Bonferroni 95% confidence intervals for the response standard deviation at each level. Bartlett's and Levene's test results are displayed in both the Session window and in the graph. Note that the 95% confidence level applies to the family of intervals and the asymmetry of the intervals is due to the skewness of the chi-square distribution.
For the potato rot example, the p-values of 0.744 and 0.858 are greater than reasonable choices of α, so you fail to reject the null hypothesis of the variances being equal. That is, these data do not provide enough evidence to claim that the populations have unequal variances.
Analyze - Regression
The Concept of Regression
Regression is a mathematical equation describing the relationship between the "Y" and the "X's"
→ Creating a model of the process
Y = b0 + b1x + error, where b0 = constant and b1 = slope
[Figure: scatter plot of Annual Sales (about 200 to 350) against Floor Space (about 50 to 150). There appears to be a linear relationship between floor space and annual sales, that is, annual sales increase or decrease as floor space changes.]
Regression analysis is used to investigate and model the relationship between a response variable and one or more predictors.
Use least squares procedures when your response variable is continuous.
Use partial least squares regression when your predictors are highly correlated or outnumber your observations.
Use logistic regression when your response variable is categorical.
Both least squares and logistic regression methods estimate parameters in the model so that the fit of the model is optimized. Least squares regression minimizes the sum of squared errors to obtain parameter estimates.
Example : Do regression and residual analysis for Yield as shown in the table and interpret the output results. Note that A, B and C are factors and Yield is the response.
S.No. A B C Yield
1 2 3 5 85
2 2 1 10 71
3 8 3 15 3129
4 6 4 20 1384
5 5 5 25 875
6 8 3 30 3159
7 5 1 35 823
8 3 2 40 254
9 2 2 45 150
10 1 8 50 298
11 9 7 55 4631
12 5 6 60 978
13 3 5 65 367
14 2 6 70 296
15 1 7 75 303
16 4 2 80 556
17 2 4 85 266
18 1 6 90 294
19 2 5 95 313
20 5 6 100 1058
Solution :
1) Enter the columns A, B, C (factors) and Yield (response) in the Minitab worksheet.
2) Go to Stat > Regression > Regression.
3) Select Yield as Response and A, B, C as Predictors by double clicking on each.
4) Click OK.
Regression Analysis: Yield versus A, B, C
The regression equation is
Yield = - 1277 + 458 A + 136 B - 1.54 C
Predictor Coef SE Coef T P
Constant -1277.3 360.6 -3.54 0.003
A 457.70 47.64 9.61 0.000
B 135.72 61.28 2.21 0.042
C -1.544 4.579 -0.34 0.740
S = 487.537 R-Sq = 86.9% R-Sq(adj) = 84.5%
Analysis of Variance
Source DF SS MS F P
Regression 3 25308562 8436187 35.49 0.000
Residual Error 16 3803071 237692
Total 19 29111633
Source DF Seq SS
A 1 23966672
B 1 1314871
C 1 27018
Unusual Observations
Obs A Yield Fit SE Fit Residual St Resid
11 9.00 4631 3707 308 924 2.44R
R denotes an observation with a large standardized residual.
Result Interpretation : R-Sq and R-Sq(adj) are both > 64 %, indicating a strong relationship between the factors and the response (Yield). Factors A (P = 0.000) and B (P = 0.042) are significant at α = 0.05, whereas factor C (P = 0.740) is not.
Minitab fits the regression line using the least squares method. The least squares method minimizes the sum of the squared distances between the points and the fitted line.
[Figure: data points plotted against the predictors, with the fitted line and the vertical distances from the points to the line.]
Fitted Value : The predicted y, written ŷ; the mean response value for the given predictor values using the estimated regression equation.
Residuals : The difference (ei) between the observed values and the predicted or fitted values (data minus fits). This part of the observation is not explained by the fitted model. The formula for the residual of an observation is: ei = yi - ŷi
YIELD = RESIDUAL + FITS
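The fit and residual reported for the unusual observation 11 (A = 9, B = 7, C = 55, Yield = 4631) can be checked by substituting its predictor values into the estimated regression equation. A minimal sketch using the coefficients from the Minitab output; the function name `fitted_yield` is hypothetical.

```python
# Coefficients from the regression output (Coef column).
B0, B_A, B_B, B_C = -1277.3, 457.70, 135.72, -1.544


def fitted_yield(a: float, b: float, c: float) -> float:
    """Fitted value from the estimated regression equation."""
    return B0 + B_A * a + B_B * b + B_C * c


# Observation 11 from the data table.
a, b, c, observed_yield = 9, 7, 55, 4631

fit = fitted_yield(a, b, c)
residual = observed_yield - fit     # data minus fit

print(f"Fit = {fit:.0f}, Residual = {residual:.0f}")
print(f"Fit + Residual = {fit + residual:.0f}")   # recovers the observed Yield
```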

Three Regression Models
Linear     Y = b0 + b1X
Quadratic  Y = b0 + b1X + b11X²
Cubic      Y = b0 + b1X + b11X² + b111X³
Y is the response; X is the predictor; b0 is the intercept; and b1, b11, and b111 are the coefficients.
