
THE ACADEMY OF AEROSPACE QUALITY (AAQ)

Auburn University, Auburn, AL.


Department of Industrial and Systems Engineering

MODULE NAME
Statistics with Excel®

MODULE OBJECTIVE
To introduce the readers to the use of
Microsoft® Office Excel® 2007 to analyze
statistical data

Prepared by Edgardo J. Escalante, Ph.D.

The Academy of Aerospace


Quality
MODULE CONTENTS

Topics:
Introduction, Tests of Hypotheses for two
means, paired observations and two
variances. One-way and two-way anova, and
simple and multiple regression analysis

Evaluations:
Examples and exercises. Solution to the
exercises.

Introduction

This presentation is not intended to make the reader an expert in using
Microsoft Excel® 2007. It is only an introductory description of some of the
statistical methods available in Excel®.
Tests of Hypotheses
Statistical procedure used to make a decision based on
samples regarding the value that a population parameter
(mean(s), variance(s), proportion(s)) may have

Hypotheses

Ho: Null Hypothesis, the hypothesis to be challenged, the status quo or no
change (null) hypothesis. It nullifies or opposes Ha

Ha or H1: Alternative Hypothesis, the question to be answered, the research
hypothesis or theory to be tested
Elements of a Test of Hypothesis (TH)

The Hypotheses: Ho and Ha.

The sample(s): The information obtained from the population(s).

The Test Statistic (TS): A random variable that summarizes the information
from the sample(s).

The Rejection Region of Ho (RRHo): The part of the sampling distribution of
the TS such that, if the TS falls there, Ho is rejected.

The confidence level of the test: (1-α)100%.

The Conclusion: Reject Ho, or fail to reject Ho.
Types of errors and their probabilities in TH

α = p(Type I error)
  = p(Reject Ho when Ho is true)

β = p(Type II error)
  = p(Fail to reject Ho when Ho is false)

Power of the Test = 1 - β
  = 1 - p(Fail to reject Ho when Ho is false)
  = p(Reject Ho when Ho is false)
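As a rough illustration of how α, β and the power relate, the sketch below computes the power of a one-sided Z test for a mean with known σ. The function name and the numbers in the usage lines are hypothetical and do not come from this module; the sketch assumes Ha: μ > μ0 and a normally distributed sample mean.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(delta, sigma, n, alpha=0.05):
    """Power of a one-sided Z test (Ha: mu > mu0) for a true shift `delta`.

    Hypothetical helper for illustration; assumes known sigma and normality.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)   # critical value Z(alpha)
    shift = delta * sqrt(n) / sigma           # mean of the TS when Ha is true
    beta = std_normal.cdf(z_alpha - shift)    # p(fail to reject Ho | Ho false)
    return 1 - beta                           # power = 1 - beta


# With no real shift, the probability of rejecting Ho is alpha (Type I error).
print(round(z_test_power(delta=0.0, sigma=1.0, n=25), 4))
# A larger shift means a smaller beta, hence more power.
print(round(z_test_power(delta=0.2, sigma=1.0, n=25), 3))
print(round(z_test_power(delta=0.5, sigma=1.0, n=25), 3))
```

The first call shows that, when Ho is true, the test rejects with probability α; the other two show the power growing with the size of the shift.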
Test of Hypothesis for the difference of the means of
two populations
Ho: μ1 = μ2

Ha          Rejection Region of Ho (RRHo)
μ1 > μ2     Z > Zα         t > tα,df
μ1 < μ2     Z < −Zα        t < −tα,df
μ1 ≠ μ2     |Z| > Zα/2     |t| > tα/2,df

a) (n1, n2) ≥ 30, independent random samples

TS:
\[ Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \]

\[ CI = \bar{X}_1 - \bar{X}_2 \pm Z_{\alpha/2} \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \]
b) (n1, n2) < 30, sampling from independent normal populations with
unknown variances

b1) σ1 = σ2

TS:
\[ t = \frac{\bar{X}_1 - \bar{X}_2}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \qquad (df = n_1 + n_2 - 2) \]

\[ S_p = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}} \]

\[ CI = \bar{X}_1 - \bar{X}_2 \pm t_{\alpha/2,\, n_1 + n_2 - 2} \; S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \]
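b2) σ1 ≠ σ2

TS:
\[ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} \]

\[ CI = \bar{X}_1 - \bar{X}_2 \pm t_{\alpha/2,\, df} \sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}} \]

\[ df = \frac{\left( \dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2} \right)^2}{\dfrac{\left( S_1^2 / n_1 \right)^2}{n_1 - 1} + \dfrac{\left( S_2^2 / n_2 \right)^2}{n_2 - 1}} \]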
Example
A certain engine head dimension is going to be compared between two
different production lines. A sample of 10 items was taken from line 1
and a sample of 12 items was taken from line 2.

Test the hypothesis of equality of means of the two lines using α=5% and
assuming the samples come from normal distributions with unknown but
equal variances.

Ho: µ1 = µ2    Ha: µ1 ≠ µ2

TS:
\[ t = \frac{\bar{X}_1 - \bar{X}_2}{S_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \]

Line 1    Line 2
415.20    415.23
415.20    415.24
415.23    415.18
415.16    415.24
415.22    415.24
415.15    415.16
415.23    415.15
415.18    415.22
415.19    415.22
415.20    415.16
          415.19
          415.23
Using Excel:

First, check that the DATA ANALYSIS menu is activated: under DATA select
DATA ANALYSIS. If it is empty, select EXCEL OPTIONS, ADD-INS, GO, check
ANALYSIS TOOLPAK and ANALYSIS TOOLPAK-VBA, and hit OK. Go again to the
DATA ANALYSIS menu and you should see the list of analysis tools.
Using Excel: Data, Data Analysis, t-test: Two-sample assuming equal variances

t-Test: Two-Sample Assuming Equal Variances

Variable 1 Variable 2
Mean 415.196172 415.2041
Variance 0.00076576 0.001233
Observations 10 12
Pooled Variance 0.00102293
Hypothesized Mean Difference 0
df 20
t Stat -0.5804293
P(T<=t) one-tail 0.28405404
t Critical one-tail 1.72471822
P(T<=t) two-tail 0.56810808
t Critical two-tail 2.08596344

For a two-sided test the null hypothesis is not rejected
(p-value = 0.568 > 0.05 = alpha)
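The pooled t statistic above can also be checked by hand. The following is a minimal sketch in Python (not part of the Excel workflow) that applies the b1) formulas to the data as printed in the example. The helper name `pooled_t` is hypothetical, and the critical value is copied from the Excel output. Because the printed data are rounded to two decimals, the t value it gives is close to, but not identical to, the one in the Excel printout, which presumably used unrounded measurements.

```python
from math import sqrt
from statistics import mean, variance

line1 = [415.20, 415.20, 415.23, 415.16, 415.22,
         415.15, 415.23, 415.18, 415.19, 415.20]
line2 = [415.23, 415.24, 415.18, 415.24, 415.24, 415.16,
         415.15, 415.22, 415.22, 415.16, 415.19, 415.23]


def pooled_t(x1, x2):
    """Two-sample t statistic assuming equal variances (case b1)."""
    n1, n2 = len(x1), len(x2)
    # Pooled variance Sp^2: weighted average of the two sample variances
    sp2 = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    t = (mean(x1) - mean(x2)) / sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


t_stat, df = pooled_t(line1, line2)
T_CRIT = 2.086  # t(0.025, 20), "t Critical two-tail" in the Excel output
print(f"t = {t_stat:.3f}, df = {df}, reject Ho: {abs(t_stat) > T_CRIT}")
```

The decision is the same as in the Excel output: |t| is smaller than the two-tail critical value, so the equality of means is not rejected.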
Case in which the samples are NOT independent
(paired observations) :

-The weight of a group of people before and after a weight loss program

-A dimensional characteristic of an engine head before and after thermal
treatment
Test of hypothesis for µD = µ1 - µ2 for Paired Observations
If d̄ and Sd are the mean and standard deviation of the normally distributed
differences of n random pairs of measurements, and δ is the hypothesized
mean difference (δ = 0 for the Ho below):

Ho: µD = 0    Ha: µD ≠ 0

TS:
\[ t = \frac{\bar{d} - \delta}{S_d / \sqrt{n}} \]

Reject Ho if |t| > tα/2,n−1
Example

Verify if the 103 dimensional characteristic was modified by a thermal
treatment (TT) by applying a 95% TH for µD = µ1 - µ2 (n=10)
Before TT After TT
47.39 47.28
47.47 47.33
47.58 47.46
47.60 47.47
47.40 47.28
47.68 47.58
47.47 47.36
47.64 47.54
47.48 47.37
47.73 47.61

Using Excel: Data, Data Analysis, t-test: Paired two-sample for means

t-Test: Paired Two Sample for Means

Variable 1 Variable 2
Mean 47.544 47.428
Variance 0.014027 0.014773
Observations 10 10
Pearson Correlation 0.994779
Hypothesized Mean Difference 0
df 9
t Stat 29
P(T<=t) one-tail 1.68E-10
t Critical one-tail 1.833113
P(T<=t) two-tail 3.36E-10
t Critical two-tail 2.262157

For a two-sided test the null hypothesis is rejected (p-value is less than
alpha)
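As a cross-check of the paired test, the sketch below applies the paired TS formula to the before/after data of the example. It is only an illustration in Python; the helper name `paired_t` is hypothetical and the critical value is copied from the Excel output.

```python
from math import sqrt
from statistics import mean, stdev

before = [47.39, 47.47, 47.58, 47.60, 47.40, 47.68, 47.47, 47.64, 47.48, 47.73]
after = [47.28, 47.33, 47.46, 47.47, 47.28, 47.58, 47.36, 47.54, 47.37, 47.61]


def paired_t(x1, x2, delta=0.0):
    """Paired t statistic; delta is the hypothesized mean difference."""
    d = [a - b for a, b in zip(x1, x2)]   # differences within each pair
    n = len(d)
    t = (mean(d) - delta) / (stdev(d) / sqrt(n))
    return t, n - 1


t_stat, df = paired_t(before, after)
T_CRIT = 2.262  # t(0.025, 9), "t Critical two-tail" in the Excel output
print(f"t = {t_stat:.2f}, df = {df}, reject Ho: {abs(t_stat) > T_CRIT}")
```

The mean difference is 0.116 and the resulting t statistic agrees with the "t Stat" row of the Excel output, so Ho is rejected.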
Test of Hypotheses for Variances

Ratio of two variances from normal populations

c) (n1, n2) < 30

Ho: σ1² = σ2²

Ha            Rejection Region of Ho (RRHo)
σ1² > σ2²     F > Fα,n1−1,n2−1
σ1² < σ2²     F < F1−α,n1−1,n2−1
σ1² ≠ σ2²     F < F1−α/2,n1−1,n2−1 or F > Fα/2,n1−1,n2−1

TS:
\[ F = \frac{S_1^2}{S_2^2} \]
(always choose S1² > S2² because of the F table)
d) (n1, n2) ≥ 30

Ho: σ1 = σ2

Ha          Rejection Region of Ho (RRHo)
σ1 > σ2     Z > Zα
σ1 < σ2     Z < −Zα
σ1 ≠ σ2     |Z| > Zα/2

TS:
\[ Z = \frac{S_1 - S_2}{S_p \sqrt{\dfrac{1}{2 n_1} + \dfrac{1}{2 n_2}}} \]

F transformation
\[ F_{prob,\, n_1 - 1,\, n_2 - 1} = \frac{1}{F_{1 - prob,\, n_2 - 1,\, n_1 - 1}} \]
Example
A certain engine head dimension is going to be compared between two
different production lines. A sample of 10 items was taken from line 1
and a sample of 12 items was taken from line 2.

Test the hypothesis of equality of variances of the two lines using α=5%
and assuming the samples come from normal distributions.

Line 1    Line 2
415.20    415.23
415.20    415.24
415.23    415.18
415.16    415.24
415.22    415.24
415.15    415.16
415.23    415.15
415.18    415.22
415.19    415.22
415.20    415.16
          415.19
          415.23
Using Excel: Data, Data Analysis, F-test Two-sample for Variances

F-Test Two-Sample for Variances

Variable 1 Variable 2
Mean 415.196 415.205
Variance 0.00073778 0.0012091
Observations 10 12
df 9 11
F 0.61019215
P(F<=f) one-tail 0.23368318
F Critical one-tail 0.32232223

Since the one-tail p-value = 0.234 is greater than alpha (5%), the equality
of the two variances is not rejected. For a two-sided alternative, the
one-tail p-value is compared with alpha/2 = 0.025 and the conclusion is the
same.
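The F statistic reported by Excel is simply the ratio of the two sample variances. A minimal Python sketch, using the data as printed in the example and the lower-tail critical value copied from the Excel output (the variable names are illustrative only):

```python
from statistics import variance

line1 = [415.20, 415.20, 415.23, 415.16, 415.22,
         415.15, 415.23, 415.18, 415.19, 415.20]
line2 = [415.23, 415.24, 415.18, 415.24, 415.24, 415.16,
         415.15, 415.22, 415.22, 415.16, 415.19, 415.23]

# Excel computes F = S1^2 / S2^2 in the order the ranges are entered,
# so F can be below 1 and is then compared with a lower-tail critical value.
f_stat = variance(line1) / variance(line2)
F_CRIT_LOWER = 0.3223  # "F Critical one-tail" in the Excel output (9 and 11 df)
reject = f_stat < F_CRIT_LOWER
print(f"S1^2 = {variance(line1):.8f}, S2^2 = {variance(line2):.8f}")
print(f"F = {f_stat:.4f}, reject Ho: {reject}")
```

Since F is not below the lower-tail critical value, the equality of variances is not rejected, in agreement with the p-value argument.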
The Analysis of Variance

One way to compare processes or groups by contrasting their means is a
technique called the Analysis of Variance, or ANOVA for short. ANOVA was
developed by the British scientist R. A. Fisher in the early 1920s.
This technique consists of decomposing the total variation of the data
into: (a) the internal, “natural” or “within” groups variation, and (b) the
“between” groups variation, in such a way that when these two types of
variation are compared, it is possible to determine whether there is a
statistically significant difference between the means being analyzed.
One-Way ANOVA
Example: An aerospace supplier would like to
compare three different materials regarding certain
internal characteristic. He decides to take 3 random
samples of each and measures them:

Values of the internal characteristic being analyzed:

              Replicates
Material   1      2      3      ȳi         s²i
A          2.05   2.03   2.02   2.033333   0.0002333
B          1.98   1.99   2.00   1.990000   0.0001000
C          2.07   2.05   2.05   2.056667   0.0001333

Grand mean: ȳ = 2.0267
\[ SST = \sum_i \sum_j y_{ij}^2 - \frac{y_{..}^2}{N} = \sum (\text{Each data value})^2 - \frac{(\text{Total sum})^2}{N} \]

\[ SSt = \frac{\sum_i y_{i.}^2}{n} - \frac{y_{..}^2}{N} = \frac{\sum (\text{Each group sum})^2}{n} - \frac{(\text{Total sum})^2}{N} \]

\[ SSE = SST - SSt \]

N = Total number of data
n = Number of data values per group (replicates)
a = Number of levels per factor (treatments or groups)

df(SST) = N − 1
df(SSt) = a − 1
df(SSE) = df(SST) − df(SSt) = N − a

The empirical minimum number of replicates is n = (26+a)/a
\[ SST = \sum (\text{Each data value})^2 - \frac{(\text{Total sum})^2}{N} = (2.05)^2 + (2.03)^2 + \dots + (2.05)^2 - \frac{(18.24)^2}{9} = 0.0078 \]

\[ SSt = \frac{\sum (\text{Each group sum})^2}{n} - \frac{(\text{Total sum})^2}{N} = \frac{(6.1)^2 + (5.97)^2 + (6.17)^2}{3} - \frac{(18.24)^2}{9} = 0.006867 \]
Results are presented in the ANOVA table:

Sources of Variation         SS     df     MS                  F
Treatments (t)               SSt    a−1    MSt = SSt/(a−1)     MSt/MSE
Error (E) (by difference)    SSE    N−a    MSE = SSE/(N−a)
Total                        SST    N−1

MS = SS/df
F = Comparison between the between-groups and within-groups sources of
variation
For this example:

Sources of Variation         SS         df   MS         F
Treatments (t)               0.006867   2    0.003434   22.08
Error (E) (by difference)    0.000933   6    0.000156
Total                        0.00780    8
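The sums of squares in the table can be reproduced with a few lines of code. The sketch below is an illustration in Python of the same computational formulas (correction term, SST, SSt, and SSE by difference); the function name and the dictionary layout are assumptions made for this example, not something Excel exposes.

```python
groups = {
    "A": [2.05, 2.03, 2.02],
    "B": [1.98, 1.99, 2.00],
    "C": [2.07, 2.05, 2.05],
}


def one_way_anova(groups):
    """One-way ANOVA using the computational formulas of the slides."""
    values = [y for ys in groups.values() for y in ys]
    N, a = len(values), len(groups)
    correction = sum(values) ** 2 / N                  # (Total sum)^2 / N
    sst = sum(y * y for y in values) - correction      # total SS
    sstr = sum(sum(ys) ** 2 / len(ys) for ys in groups.values()) - correction
    sse = sst - sstr                                   # error SS, by difference
    mstr = sstr / (a - 1)                              # MS = SS / df
    mse = sse / (N - a)
    return {"SST": sst, "SSt": sstr, "SSE": sse,
            "MSt": mstr, "MSE": mse, "F": mstr / mse}


result = one_way_anova(groups)
F_CRIT = 5.14  # F(0.05, 2, 6) from the F table
for name, value in result.items():
    print(f"{name} = {value:.6f}")
print("reject Ho:", result["F"] > F_CRIT)
```

Small differences in the last digit with respect to the hand calculation (22.08) come from rounding the intermediate mean squares.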
Rejection region (RR) and decision
Compare F(ANOVA) with F(from the F table):

F(F table) = Fα,df(t),df(E) = F0.05,2,6 = 5.14

Decision:
Since 22.08 is greater than 5.14 (it falls in the rejection region, RR),
the equality of treatment means (different materials) is rejected, i.e. the
variable material affects the measured mean of the characteristic being
analyzed
[Figure: F distribution with 2 and 6 df. The rejection region of Ho
(α = 5% = 0.05) lies to the right of F(0.05,2,6) = 5.14; F(anova) = 22.08
falls inside it, with p-value = 0.002]

p-value: probability of having a value like that if Ho (equality of means)
is true

Since the p-value < alpha, the equality of process means is rejected
Using Excel: Data, Data Analysis

Select ANOVA-SINGLE FACTOR, enter the input range (select the whole table),
select the box “Labels in First Row” and hit OK
Anova: Single Factor

SUMMARY
Groups Count Sum Average Variance
A 3 6.1 2.033333 0.000233
B 3 5.97 1.99 0.0001
C 3 6.17 2.056667 0.000133

F(anova)=22.08 F(0.05,2,6)=5.14
ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 0.006867 2 0.003433 22.07143 0.001713 5.143253
Within Groups 0.000933 6 0.000156

Total 0.0078 8

Since the P-value is less than alpha (5%), the null hypothesis is rejected.
Or, since F=22.07 is greater than F critical=5.14, the null hypothesis is
rejected.

Although it’s vital to check model assumptions, these are not going to be
covered here
Exercise 1
In a company that produces alloys for aircraft, four prototypes have been
produced to determine the effects of a fixed high altitude on their
density. Several tests were performed and the following data were
obtained. Perform an ANOVA to determine the effect of high altitude on the
density of these four prototypes.

PROTOTYPE   DENSITY
1           4.5   7.8   6.7
2           3.8   5.6   9.1
3           7.6   4.6   7.6
4           3.5   3.5   4.8
Two-way ANOVA

Suppose an experiment is going to be performed in space to test the
influence of two factors:

A: Filling pressure (1000, 1100, 1200) psi
B: Cleansing gas (N2, ArN2)

on the luminous flux (lumens) of projection lamps.

The total number of experimental combinations is the product of the two
factors’ levels: 3x2=6. It was decided to take 2 replicates per cell, hence
the total number of observations is 6x2=12.
Each cell has 2 replicates:

                          Filling pressure (A)
Cleansing gas (B)      1000    1100    1200    SUM
N2                      88      91      87
                        89      91      88     534
   Cell sum            177     182     175
ArN2                    92      87      95
                        94      90      93     551
   Cell sum            186     177     188
SUM                    363     359     363    1085

\[ SST = (88)^2 + \dots + (93)^2 - \frac{(1085)^2}{12} = 80.92 \]
(12 terms; 1085 = total sum; 12 = total number of observations)

\[ SSA = \frac{(363)^2 + (359)^2 + (363)^2}{4} - \frac{(1085)^2}{12} = 2.67 \]
(sum of every level of factor A; 4 = number of terms in every level sum)

\[ SSB = \frac{(534)^2 + (551)^2}{6} - \frac{(1085)^2}{12} = 24.08 \]
(sum of every level of factor B; 6 = number of terms in every level sum)
A new term, the SS of the interaction between A and B:

\[ SSAB = \frac{(177)^2 + \dots + (188)^2}{2} - \frac{(1085)^2}{12} - SSA - SSB = 44.67 \]
(sum over the 6 cell sums; 2 = number of terms in every cell)

SOURCE                  SS      DF   MS      F
Filling pressure (A)    2.67    2    1.34    0.85
Cleansing Gas (B)       24.08   1    24.08   15.21*
Interaction AB          44.67   2    22.34   14.11*
Error                   9.5     6    1.58
TOTAL                   80.92   11

F0.05,2,6 = 5.14    F0.05,1,6 = 5.99    (*) Significant variable

F=0.85 is to be compared with F0.05,2,6 = 5.14. Since 0.85<5.14, A is not
significant, i.e. it does not affect the luminous flux significantly.
F=15.21 is to be compared with F0.05,1,6 = 5.99. Since 15.21>5.99, B
significantly affects the luminous flux. F=14.11 is to be compared with
F0.05,2,6 = 5.14. Since 14.11>5.14, the interaction AB is significant.
When the interaction term is statistically significant, it is necessary to
investigate further the effect of any of its factors that individually
appeared to be non-significant. In this case the effect of A should be
investigated further. For this example the assumptions of normality and
constant variance held. Since the order of the experiments isn’t given,
it’s not possible to test the independence of the residuals.
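The two-way sums of squares can be verified with a short script. The following Python sketch follows the same hand formulas (level sums, cell sums and a common correction term); the nested-dictionary layout and the function name are assumptions made for this illustration.

```python
# cells[gas][pressure] -> the 2 replicates of that cell
cells = {
    "N2":   {1000: [88, 89], 1100: [91, 91], 1200: [87, 88]},
    "ArN2": {1000: [92, 94], 1100: [87, 90], 1200: [95, 93]},
}


def two_way_anova(cells):
    """Two-factor ANOVA with replication (A = columns, B = rows)."""
    rows = list(cells)                           # levels of B (gas)
    cols = list(cells[rows[0]])                  # levels of A (pressure)
    a, b = len(cols), len(rows)
    n = len(cells[rows[0]][cols[0]])             # replicates per cell
    N = a * b * n
    values = [y for r in rows for c in cols for y in cells[r][c]]
    cf = sum(values) ** 2 / N                    # (Total sum)^2 / N
    sst = sum(y * y for y in values) - cf
    ss_a = sum(sum(sum(cells[r][c]) for r in rows) ** 2 for c in cols) / (b * n) - cf
    ss_b = sum(sum(sum(cells[r][c]) for c in cols) ** 2 for r in rows) / (a * n) - cf
    ss_cells = sum(sum(cells[r][c]) ** 2 for r in rows for c in cols) / n - cf
    ss_ab = ss_cells - ss_a - ss_b               # interaction
    sse = sst - ss_cells                         # error, by difference
    mse = sse / (a * b * (n - 1))
    return {
        "SSA": ss_a, "SSB": ss_b, "SSAB": ss_ab, "SSE": sse, "SST": sst,
        "F_A": ss_a / (a - 1) / mse,
        "F_B": ss_b / (b - 1) / mse,
        "F_AB": ss_ab / ((a - 1) * (b - 1)) / mse,
    }


table = two_way_anova(cells)
for name, value in table.items():
    print(f"{name} = {value:.4f}")
```

The F ratios are then compared with F0.05,2,6 = 5.14 (A and AB) and F0.05,1,6 = 5.99 (B), exactly as in the discussion above.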
Using Excel: select ANOVA: Two-Factor With Replication

1. Enter the table (from B19 to E23) as the input range
2. Fill in 2 in Rows per sample and hit OK
ANOVA
Source of Variation    SS         df   MS         F          P-value    F crit
Sample (Gas)           24.08333   1    24.08333   15.21053   0.007983   5.987378
Columns (Pressure)     2.666667   2    1.333333   0.842105   0.476054   5.143253
Interaction            44.66667   2    22.33333   14.10526   0.005395   5.143253
Within (Error)         9.5        6    1.583333

Total                  80.91667   11

Significant factors: Gas and the Interaction
(F0.05,2,6 = 5.14, F0.05,1,6 = 5.99)

The conclusion is the same as in the analysis previously done
Exercise 2
One wishes to test the efficiency of two types of dust versus the
dissipated power of 75-watt incandescent lamps that are manufactured in
two different shifts. Compute the corresponding ANOVA and the factorial
plots.

          Shift
Dust    1      2
1       56     58
1       65     60
2       72     63
2       78     67
Regression Analysis

Technique used to relate, through a model, one or more independent
variables and a dependent variable (response)

Uses of regression

1. Description. To represent the behavior of a process

2. Prediction and estimation. Prediction is based on an unknown x value,
estimation is based on a known x value

3. Control. To obtain a certain desirable outcome from the process,
provided a cause-effect relationship exists
Simple Regression Analysis (one factor
regression model)

y = β0 + β1x + ε
y = dependent variable (response)
x = independent variable (predictor of y)
ε = error component, a random variable (RV)
β0 = intercept. If the data include x=0, it represents the mean of the
distribution of y when x=0. It doesn’t have a particular meaning if the
data don’t include zero
β1 = slope. It’s the change in the mean of y for every unit change of x
Model’s parameters estimation

Applying the least squares method

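\[ \hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \]

\[ S_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} \qquad S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} \]

\[ \hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x \]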
Example
Revisiting hardness (y) vs. quenching temperature (x)
from the Regression module
TEMP(x) Hardness(y) x^2 xy y^2
101 49 10201 4949 2401
115 44 13225 5060 1936
115 46 13225 5290 2116
140 38 19600 5320 1444
123 43 15129 5289 1849
107 47 11449 5029 2209
135 41 18225 5535 1681
135 38 18225 5130 1444
105 47 11025 4935 2209
110 45 12100 4950 2025
110 43 12100 4730 1849
135 37 18225 4995 1369
125 44 15625 5500 1936
132 40 17424 5280 1600
130 39 16900 5070 1521
SUM 1818 641 222678 77062 27589

Using Excel: Select REGRESSION, enter the ranges for X (Temp) and Y
(Hardness) and hit OK

You can also obtain a scatter plot with the fitted line by selecting both
the X and Y data ranges first, then choosing Insert-Scatter-Chart layout 3.
Adjust the min and max values of both axes by selecting each axis scale and
hitting the right mouse button to pick the “Format axis” option
The corresponding model is

                     Coefficients
β0 Intercept         75.269132
β1 X Variable 1      -0.268447184

Hardness = 75.26 − 0.2684 Temp

Although it’s vital to check model assumptions, these are not going to be
covered here
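The least squares formulas can be applied directly to the hardness data as a check on the Excel coefficients. This is a minimal Python sketch; the function name `least_squares` is an assumption made for this example.

```python
temp = [101, 115, 115, 140, 123, 107, 135, 135, 105, 110, 110, 135, 125, 132, 130]
hardness = [49, 44, 46, 38, 43, 47, 41, 38, 47, 45, 43, 37, 44, 40, 39]


def least_squares(x, y):
    """Slope and intercept from the Sxy and Sxx formulas."""
    n = len(x)
    sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    sxx = sum(a * a for a in x) - sum(x) ** 2 / n
    b1 = sxy / sxx                       # slope
    b0 = sum(y) / n - b1 * sum(x) / n    # intercept: y-bar - b1 * x-bar
    return b0, b1, sxy, sxx


b0, b1, sxy, sxx = least_squares(temp, hardness)
print(f"Sxy = {sxy:.1f}, Sxx = {sxx:.1f}")
print(f"Hardness = {b0:.4f} + ({b1:.4f}) Temp")
```

The intermediate value Sxx = 2336.4 is the same one used later in the t tests for the individual parameters.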
Exercise 3. Develop a regression model for
Voltage vs. Current and interpret its coefficients

Voltage Current Voltage Current


5.3 0.12 101.9 1.05
23.2 0.21 115.3 1.18
25.8 0.29 117.3 1.24
44.9 0.44 133.6 1.32
58.7 0.55 135.5 1.44
64.2 0.59 145.9 1.48
72.3 0.71 167.2 1.60
76.1 0.84 171.9 1.75
96.5 0.93 179.4 1.83

Test of the significance of the regression model

Ho: β1 = 0   There is no linear relationship between x and y. The
regression model has no meaning

Ha: β1 ≠ 0   x is valuable to explain the variation in y

[Figure: two scatter plots, one with no linear trend (β1 = 0) and one with
a linear trend (β1 ≠ 0)]
ANOVA Table

Sources of Variation   SS    df    MS                F
Regression             SSR   1     MSR = SSR/1       MSR/MSE
Error                  SSE   n−2   MSE = SSE/(n−2)
TOTAL                  SST   n−1

Ho: β1 = 0    Ha: β1 ≠ 0
TS: F = MSR/MSE
Reject Ho if F > F(tables) = Fα,1,n−2
Example
Revisiting hardness vs. quenching temperature
As part of the previous analysis using Excel you get the ANOVA table

ANOVA
df SS MS F Significance F
Regression 1 168.3701 168.3701 76.63029 8.23247E-07
Residual 13 28.56326 2.197174
Total 14 196.9333

Since the SIGNIFICANCE F (or p-value) is less than an alpha value of 0.05
then the regression model is statistically significant
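The regression ANOVA table can be rebuilt from Sxx, Sxy and Syy. The sketch below is an illustration in Python with a hypothetical function name. It uses SSR = Sxy²/Sxx and obtains the critical value from the identity F(α,1,n−2) = t²(α/2,n−2) with t(0.025,13) = 2.16.

```python
temp = [101, 115, 115, 140, 123, 107, 135, 135, 105, 110, 110, 135, 125, 132, 130]
hardness = [49, 44, 46, 38, 43, 47, 41, 38, 47, 45, 43, 37, 44, 40, 39]


def regression_anova(x, y):
    """ANOVA decomposition for simple linear regression."""
    n = len(x)
    sxx = sum(a * a for a in x) - sum(x) ** 2 / n
    sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    syy = sum(b * b for b in y) - sum(y) ** 2 / n   # SST
    ssr = sxy ** 2 / sxx                            # regression SS (1 df)
    sse = syy - ssr                                 # residual SS (n - 2 df)
    mse = sse / (n - 2)
    return {"SSR": ssr, "SSE": sse, "SST": syy, "MSE": mse, "F": ssr / mse}


anova = regression_anova(temp, hardness)
F_CRIT = 2.16 ** 2  # F(0.05, 1, 13) = t(0.025, 13)^2
for name, value in anova.items():
    print(f"{name} = {value:.4f}")
print("reject Ho:", anova["F"] > F_CRIT)
```

Comparing F with the critical value leads to the same decision as comparing the Significance F with alpha.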

Exercise 4
Apply the test of significance to the regression
model for the study of Voltage vs. Current (see
Exercise 3)

t tests for individual model parameters

For β1:
Ho: β1 = 0    Ha: β1 ≠ 0

\[ t = \frac{\hat{\beta}_1}{se(\hat{\beta}_1)} = \frac{\hat{\beta}_1}{\sqrt{\dfrac{MSE}{S_{xx}}}} = \frac{-0.2684}{\sqrt{\dfrac{2.1995}{2336.4}}} = -8.75 \]

tα/2,n−2 = t0.025,13 = 2.16

Reject Ho if |t| > tα/2,n−2. Since |−8.75| > 2.16, reject Ho: the model is
significant
For β0:
Ho: β0 = 0    Ha: β0 ≠ 0

\[ t = \frac{\hat{\beta}_0}{se(\hat{\beta}_0)} = \frac{\hat{\beta}_0}{\sqrt{MSE \left( \dfrac{1}{n} + \dfrac{\bar{x}^2}{S_{xx}} \right)}} = \frac{75.26}{\sqrt{2.1995 \left( \dfrac{1}{15} + \dfrac{(121.2)^2}{2336.4} \right)}} = 20.13 \]

tα/2,n−2 = t0.025,13 = 2.16

Reject Ho if |t| > tα/2,n−2. Since 20.13 > 2.16, reject Ho: the intercept
term should be part of the model
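The two hand calculations above can be reproduced from the raw data. The following Python sketch (function name hypothetical) computes the standard errors and t statistics of both coefficients using the unrounded MSE, so its results match the Excel "t Stat" column more closely than the rounded hand values.

```python
from math import sqrt

temp = [101, 115, 115, 140, 123, 107, 135, 135, 105, 110, 110, 135, 125, 132, 130]
hardness = [49, 44, 46, 38, 43, 47, 41, 38, 47, 45, 43, 37, 44, 40, 39]


def coefficient_t_tests(x, y):
    """t statistics for the intercept and the slope of a simple regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum(a * a for a in x) - sum(x) ** 2 / n
    sxy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    syy = sum(b * b for b in y) - sum(y) ** 2 / n
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    mse = (syy - b1 * sxy) / (n - 2)                   # SSE / (n - 2)
    se_b1 = sqrt(mse / sxx)
    se_b0 = sqrt(mse * (1 / n + x_bar ** 2 / sxx))
    return b0 / se_b0, b1 / se_b1


t_b0, t_b1 = coefficient_t_tests(temp, hardness)
T_CRIT = 2.16  # t(0.025, 13)
print(f"t(b0) = {t_b0:.4f}, reject Ho: {abs(t_b0) > T_CRIT}")
print(f"t(b1) = {t_b1:.4f}, reject Ho: {abs(t_b1) > T_CRIT}")
```

Note that the square of the slope's t statistic equals the F statistic of the regression ANOVA table, as expected for a model with a single predictor.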
Continuing the regression example of Hardness vs. Temperature:

                     Coefficients   Std. Error     t Stat     P-value
β0 Intercept         75.269132      3.736385138    20.14491   3.47E-11
β1 X Variable 1      -0.2684472     0.030666104    -8.75387   8.23E-07

(t Stat: the t-tests; P-value: the p-values for the t-tests)

Conclusion: Both tests are statistically significant; in other words, the
regression model is statistically significant and the intercept should be
part of the model

Exercise 5
Perform t tests for the study of Voltage vs. Current (see Exercise 3)
Coefficient of determination
\[ R^2 = r^2 = \frac{SSR}{S_{yy}}, \qquad 0 \le R^2 \le 1 \]

It’s the proportion of variation explained by the regression model, or the
% of variation in y explained by x. You can get it as part of the standard
regression printout

Regression Statistics
Multiple R 0.92464
R Square 0.85496
Adjusted R Square 0.843803
Standard Error 5.105593
Observations 15

85.50% of the variation in hardness is explained by quenching temperature.
The correlation coefficient is r = ±√R², taking the sign of the slope.
MULTIPLE LINEAR REGRESSION

When more than one X variable is going to be considered in the analysis,
then

\[ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon \]
(hyperplane in k dimensions)

n = number of data
p = number of parameters (βs)
k = number of variables (Xs)

p = k+1
βi (i=1..k) represents the expected change in the response “y” when xi
changes one unit, keeping the other Xs constant.

β0 represents the intercept of the regression hyperplane. If the range of
the data includes zero, it represents the mean “y” when
x1 = x2 = ... = xk = 0

In general, every regression model that is linear in its coefficients
(betas) is a linear regression model, regardless of the shape of the
surface it generates.

The model’s matrix representation is
\[ \mathbf{y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\varepsilon} \]
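where

\[
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}, \quad
\mathbf{X} = \begin{bmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1k} \\
1 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{n1} & x_{n2} & \cdots & x_{nk}
\end{bmatrix}, \quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix}, \quad
\boldsymbol{\varepsilon} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}
\]

The matrix solution is
\[ \hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1} \mathbf{X}'\mathbf{y} \]
to have the model
\[ \hat{\mathbf{y}} = \mathbf{X} \hat{\boldsymbol{\beta}} \]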
Example
Assume that in the thermal treatment example one additional
variable is introduced: Unit Temperature
Hardness(y) TEMP(x1) U. temp(x2)
49 101 848
44 115 845
46 115 847
38 140 837
43 123 844
47 107 847
41 135 840
38 135 838
47 105 846
45 110 845
43 110 844
37 135 836
44 125 845
40 132 840
39 130 839
Using Excel (Regression): fill in the Input Y Range (Hardness) and the
Input X Range (the X range covers both the x1 and x2 columns)
                     Coefficients     Standard Error   t Stat     P-value
β0 Intercept         -574.4598054     107.2563109      -5.35595   0.000172
β1 X Variable 1      -0.060622474     0.037783719      -1.60446   0.134592
β2 X Variable 2      0.741089213      0.122318119      6.058703   5.68E-05

(t Stat: the t-tests; P-value: the p-values for the t-tests)

Conclusion: The regression model is statistically significant (see the
Significance F in the ANOVA table below). The intercept and U. Temp (X2)
should be part of the model. Temp (X1) is not significant at the 5% level
(P-value = 0.135).

ANOVA
df SS MS F Significance F
Regression 2 189.8962982 94.94815 161.9116 2.08172E-09
Residual 12 7.037035096 0.58642
Total 14 196.9333333
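With only two predictors, the matrix solution reduces to a 2x2 system once the variables are centered on their means. The sketch below is a Python illustration of that equivalent route; it does not show how Excel computes the fit internally, and the function name is an assumption. It should reproduce the three coefficients of the printout.

```python
hardness = [49, 44, 46, 38, 43, 47, 41, 38, 47, 45, 43, 37, 44, 40, 39]
temp = [101, 115, 115, 140, 123, 107, 135, 135, 105, 110, 110, 135, 125, 132, 130]
unit_temp = [848, 845, 847, 837, 844, 847, 840, 838, 846, 845, 844, 836, 845, 840, 839]


def fit_two_predictors(y, x1, x2):
    """Least squares fit of y = b0 + b1*x1 + b2*x2 (centered normal equations)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]            # deviations from the means
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))
    det = s11 * s22 - s12 ** 2           # determinant of the 2x2 system
    b1 = (s1y * s22 - s12 * s2y) / det   # Cramer's rule
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


b0, b1, b2 = fit_two_predictors(hardness, temp, unit_temp)
print(f"Hardness = {b0:.4f} + ({b1:.6f}) Temp + ({b2:.6f}) U.temp")
```

Centering the variables keeps the arithmetic well conditioned, which matters here because the unit temperature values are large and vary little.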
Adjusted multiple coefficient of determination

\[ R^2_{adj} = 1 - \frac{MSE}{MST} = 1 - \frac{SSE/(n - p)}{S_{yy}/(n - 1)} = 1 - \frac{(n - 1)}{(n - p)} (1 - R^2) \]

R²adj decreases if non-significant variables are added to the model:
Regression Statistics
Multiple R 0.981971
R Square 0.964267
Adjusted R Square 0.958311
Standard Error 0.76578
Observations 15
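The last form of the formula is easy to evaluate from a regression printout. A small Python sketch (the function name is hypothetical) applied to the two hardness models of this module, using the R Square values from their Excel outputs:

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 from R^2, the number of data n and of parameters p."""
    return 1 - (n - 1) / (n - p) * (1 - r2)


# Simple regression (Temp only): p = 2 parameters
adj_simple = adjusted_r2(0.85496, n=15, p=2)
# Multiple regression (Temp and U. temp): p = 3 parameters
adj_multiple = adjusted_r2(0.964267, n=15, p=3)
print(f"simple:   {adj_simple:.4f}")
print(f"multiple: {adj_multiple:.4f}")
```

Both results agree with the "Adjusted R Square" rows of the corresponding Excel outputs.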

Exercise 6. Consider that in the Voltage vs. Current study, one more
variable is added: Resistor (ohms). Using statistical software, compute the
regression model and perform a complete analysis

Voltage (Y) Current (X1) Resistor (X2)
5.3 0.12 45
23.2 0.21 110
25.8 0.29 90
44.9 0.44 100
58.7 0.55 100
64.2 0.59 105
72.3 0.71 100
76.1 0.84 90
96.5 0.93 100
101.9 1.05 100
115.3 1.18 100
117.3 1.24 95
133.6 1.32 100
135.5 1.44 95
145.9 1.48 100
167.2 1.60 105
171.9 1.75 100
179.4 1.83 105

STATISTICAL
TABLE

Extract of the Z table for typical α and α/2 values

(1-α)100%   80%     90%     95%     97.5%   99%
α(%)        20%     10%     5%      2.5%    1%
α           0.2     0.1     0.05    0.025   0.01
Z(α)        0.842   1.282   1.645   1.960   2.326
Z(α/2)      1.282   1.645   1.960   2.241   2.576
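The entries of this table can be regenerated with the inverse of the standard normal distribution. A minimal Python sketch using the standard library (the helper name `z_upper` is an assumption made for this example):

```python
from statistics import NormalDist


def z_upper(prob):
    """Z value that leaves `prob` in the upper tail of the standard normal."""
    return NormalDist().inv_cdf(1 - prob)


for alpha in (0.2, 0.1, 0.05, 0.025, 0.01):
    print(f"alpha = {alpha:<5}  Z(alpha) = {z_upper(alpha):.3f}  "
          f"Z(alpha/2) = {z_upper(alpha / 2):.3f}")
```

In Excel itself, the equivalent worksheet function is NORMSINV applied to 1-α or 1-α/2.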
SOLUTION TO
THE EXERCISES

Exercise 1

ANOVA
Source of Variation SS df MS F P-value F crit
Between Groups 13.60917 3 4.536389 1.329345 0.33111 4.066181
Within Groups 27.3 8 3.4125

Total 40.90917 11

Since the P-value (0.331) is greater than alpha (5%), the equality of
density for the four prototypes is not rejected
Exercise 2
ANOVA
Source of Variation SS df MS F P-value F crit
Sample (Dust) 210.125 1 210.125 12.27007 0.024832 7.708647
Columns(Shift) 66.125 1 66.125 3.861314 0.120857 7.708647
Interaction 36.125 1 36.125 2.109489 0.220036 7.708647
Within 68.5 4 17.125

Total 380.875 7

Dust is statistically significant but not Shift or their interaction

Exercise 3
Regression Statistics
Multiple R 0.996275809
R Square 0.992565488
Adjusted R Square 0.992100831
Standard Error 4.748866698
Observations 18

ANOVA
df SS MS F Significance F
Regression 1 48173.33 48173.33 2136.125 1.8389E-18
Residual 16 360.827759 22.55173
Total 17 48534.1578

Coefficients Std. Error t Stat P-value


Intercept β0 -0.184402336 2.37042414 -0.07779 0.938957
X Variable 1 β1 98.93678099 2.14064387 46.21823 1.84E-18

Exercise 4
From the ANOVA table of the Exercise 3 output, F = 2136.125 and
Significance F = 1.8389E-18, which is less than alpha = 0.05. The
regression model is statistically significant
Exercise 5
From the coefficients table of the Exercise 3 output, the slope
(X Variable 1) is statistically significant (P-value = 1.84E-18), while the
intercept is not (P-value = 0.938957 > 0.05). The intercept is not needed
in the model. It may be deleted provided certain conditions are met
Exercise 6
Regression Statistics
Multiple R 0.997630613
R Square 0.995266841
Adjusted R Square 0.994635753
Standard Error 3.913394932
Observations 18

ANOVA
df SS MS F Significance F
Regression 2 48304.44 24152.22 1577.065 3.66128E-18
Residual 15 229.7199 15.31466
Total 17 48534.16

                     Coefficients    Std. Error   t Stat     P-value
β0 Intercept         -19.1394386     6.766437     -2.82858   0.012705
β1 X Variable 1      96.74322441     1.916738     50.47285   3.66E-18
β2 X Variable 2      0.218236461     0.074588     2.925908   0.010432

Conclusions: the adjusted R-sq. is high, the regression model is
statistically significant, and all the terms should be included in the
model
