


Problem Statement

A new washing machine prototype has been developed by a company. Its unique design enables efficient removal of dirt, but it is felt that the color of the cloth also gets excessively removed during washing.

The washing machine is therefore subjected to trials in which the color in the wash liquid (C) is analyzed using a photometer. The variables considered while washing with good-quality water are as follows:

a. Temperature of the water (X1)
b. Amount of detergent powder (X2)

The washing time is set to a standard 40-minute cycle.

The data collected are given below. The model to be considered is NOT known. Since the two variables are of different magnitudes, it is better to code them as -1, 0, 1.


The questions to be answered:

a. Consider a linear regression model involving the main effects only. Write this model.
b. Show how the parameters are obtained.
c. Present the variance-covariance matrix.
d. Construct the ANOVA table, explaining the different calculations.
e. Explain how you obtained R2 and adjusted R2.
f. Is there any lack of fit in the model?
g. Demonstrate the "extra sum of squares" approach.
h. Build the model sequentially and indicate whether the additional terms are important.
i. Show the results for the final model if coding of the variables had not been done.


Actual Data

Sl. No.   Temperature (°C)   Mass of Powder (g)   C (ppm)
  1            30                  3                234
  2            40                  3                257.5
  3            50                  3                282
  4            30                  6                193.5
  5            40                  6                187
  6            50                  6                181.5
  7            30                  9                153
  8            40                  9                116.5
  9            50                  9                 81
 10            45                  4.5              226.9
 11            35                  4.5              217.9
 12            45                  7.5              141.4
 13            35                  7.5              162.4
Coded Data

Sl. No.    Tc      Pc     C (ppm)
  1        -1      -1      234
  2         0      -1      257.5
  3         1      -1      282
  4        -1       0      193.5
  5         0       0      187
  6         1       0      181.5
  7        -1       1      153
  8         0       1      116.5
  9         1       1       81
 10         0.5    -0.5    226.9
 11        -0.5    -0.5    217.9
 12         0.5     0.5    141.4
 13        -0.5     0.5    162.4

Tc = (T − 40) / (50 − 40)        Pc = (P − 6) / (9 − 6)
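As a quick check, the coding above can be reproduced with a short script (a sketch assuming NumPy is available; the factor levels are taken from the Actual Data table):

```python
import numpy as np

# Actual factor levels, transcribed from the data table
T = np.array([30, 40, 50, 30, 40, 50, 30, 40, 50, 45, 35, 45, 35], dtype=float)
P = np.array([3, 3, 3, 6, 6, 6, 9, 9, 9, 4.5, 4.5, 7.5, 7.5])

# Code each variable so the low/mid/high levels map to -1, 0, 1:
# centre at the middle level, scale by half the range
Tc = (T - 40) / (50 - 40)
Pc = (P - 6) / (9 - 6)

print(Tc)  # matches the Tc column of the coded-data table
print(Pc)  # matches the Pc column
```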
Regression Analysis involving Main Factors Only

a. Consider a linear regression model involving the main effects only. Write this model.

C = β0 + β1 X1 + β2 X2
Parameter Estimation

b. Show how the parameters are obtained.

β = (X′X)⁻¹ X′Y
X and Y Matrices

        1  −1     −1           234
        1   0     −1           257.5
        1   1     −1           282
        1  −1      0           193.5
        1   0      0           187
        1   1      0           181.5
X =     1  −1      1     Y =   153
        1   0      1           116.5
        1   1      1            81
        1   0.5   −0.5         226.9
        1  −0.5   −0.5         217.9
        1   0.5    0.5         141.4
        1  −0.5    0.5         162.4
X and Y Matrices

          13  0  0                   2434.6
X′X =      0  7  0          X′Y =     −42
           0  0  7                   −493.5

             1/13    0     0                            187.27
(X′X)⁻¹ =      0    1/7    0         β = (X′X)⁻¹X′Y =    −6.00
               0     0    1/7                           −70.5

Significance of β0

It was seen that β0 = 187.27, which is also the average of the responses:

Σᵢ₌₁¹³ Yᵢ / 13 = 2434.6 / 13 = 187.27
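The normal-equation arithmetic above can be verified numerically (a sketch assuming NumPy; data from the coded table):

```python
import numpy as np

Tc = np.array([-1, 0, 1, -1, 0, 1, -1, 0, 1, 0.5, -0.5, 0.5, -0.5])
Pc = np.array([-1, -1, -1, 0, 0, 0, 1, 1, 1, -0.5, -0.5, 0.5, 0.5])
Y  = np.array([234, 257.5, 282, 193.5, 187, 181.5, 153, 116.5, 81,
               226.9, 217.9, 141.4, 162.4])

X = np.column_stack([np.ones(13), Tc, Pc])   # design matrix with intercept
XtX = X.T @ X                                # diagonal: diag(13, 7, 7)
XtY = X.T @ Y                                # (2434.6, -42, -493.5)
beta = np.linalg.solve(XtX, XtY)             # beta = (X'X)^-1 X'Y

# Because X'X is diagonal (orthogonal coded columns), beta0 is the mean response
print(beta)   # approximately [187.277, -6.0, -70.5]
```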
Variance-Covariance Matrix

c. Present the variance-covariance matrix (V).

                     C00  C01  C02
V = (X′X)⁻¹ σ² =     C01  C11  C12   σ²
                     C02  C12  C22

V(βj) = Cjj σ²          Cov(βi, βj) = Cij σ²

Here,

        1/13    0     0
V =       0    1/7    0    σ²
          0     0    1/7

But we do not know σ²!


Residual Sum of Squares

c. Let us use the mean square residual as a surrogate for σ².

Y′Y = 494813.74
β′X′Y = 490988.15

Residual Sum of Squares (SSE) = Y′Y − β′X′Y = 3825.59

Degrees of freedom for the mean square error: n − p = 13 − 3 = 10

MSE = SSE / (n − p) = 382.56

Hence σ² = 382.56, or σ = 19.56.
Variance-Covariance Matrix

c. Present the variance-covariance matrix (V). With σ² = 382.56:

        1/13    0     0
V =       0    1/7    0    σ²
          0     0    1/7

V(β0) = 382.56/13 = 29.43
V(β1) = 382.56/7 = 54.65
V(β2) = 382.56/7 = 54.65
Cov(βi, βj) = 0

Standard Errors of the Coefficients

c. Standard errors for the regression coefficients:

se(β0) = √29.43 = 5.425
se(β1) = √54.65 = 7.393
se(β2) = √54.65 = 7.393
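These variances and standard errors can be checked in a few lines (a sketch assuming NumPy; it refits the model from the coded data):

```python
import numpy as np

Tc = np.array([-1, 0, 1, -1, 0, 1, -1, 0, 1, 0.5, -0.5, 0.5, -0.5])
Pc = np.array([-1, -1, -1, 0, 0, 0, 1, 1, 1, -0.5, -0.5, 0.5, 0.5])
Y  = np.array([234, 257.5, 282, 193.5, 187, 181.5, 153, 116.5, 81,
               226.9, 217.9, 141.4, 162.4])

X = np.column_stack([np.ones(13), Tc, Pc])
beta = np.linalg.solve(X.T @ X, X.T @ Y)

SSE = Y @ Y - beta @ (X.T @ Y)       # Y'Y - beta'X'Y
MSE = SSE / (13 - 3)                 # n - p degrees of freedom
cov = np.linalg.inv(X.T @ X) * MSE   # variance-covariance matrix
se = np.sqrt(np.diag(cov))           # standard errors of the coefficients

print(round(MSE, 2))   # 382.56
print(se)              # approximately [5.425, 7.393, 7.393]
```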
Analysis of Variance

d. Construct the ANOVA table explaining the different calculations.

For constructing the ANOVA table we need the total sum of squares, the regression sum of squares and the error sum of squares.


Total Sum of Squares

d. Total sum of squares:

SSTotal = Σᵢ₌₁¹³ (Yᵢ − Ȳ)² = Σᵢ₌₁¹³ (Yᵢ − 187.277)²

Equivalently,

SSTotal = Y′Y − (Σᵢ₌₁ⁿ Yᵢ)² / n = 494813.7 − 2434.6²/13 = 38869.34
Regression Sum of Squares

d. Exclude the effect of β0:

SSRegression = β′X′Y − (Σᵢ₌₁ⁿ Yᵢ)² / n = 490988.14 − 2434.6²/13 = 35043.74
Analysis of Variance

Source of Variation     Sum of Squares   Degrees of Freedom   Mean Square
Regression (excl. β0)      35043.74            2                17521.87
Residual                    3825.59           10                  382.56
Total (excl. β0)           38869.33           12

SSE = Σᵢ₌₁ⁿ (Yᵢ − Ŷᵢ)² = Y′Y − β̂′X′Y
    = [Y′Y − (Σᵢ₌₁ⁿ Yᵢ)²/n] − [β̂′X′Y − (Σᵢ₌₁ⁿ Yᵢ)²/n]

SSE = 494813.7 − 490988.14 = 3825.56


Analysis of Variance

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square            F0
Regression               35043.74            2                17521.87   17521.87/382.56 = 45.80
Residual                  3825.56           10                  382.56

F0 > F0.05,2,10 = 4.103; the P-value is about 9.2e-6. Hence the regression is significant.
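The ANOVA quantities and the F test can be verified numerically (a sketch assuming NumPy and SciPy are available):

```python
import numpy as np
from scipy import stats

Tc = np.array([-1, 0, 1, -1, 0, 1, -1, 0, 1, 0.5, -0.5, 0.5, -0.5])
Pc = np.array([-1, -1, -1, 0, 0, 0, 1, 1, 1, -0.5, -0.5, 0.5, 0.5])
Y  = np.array([234, 257.5, 282, 193.5, 187, 181.5, 153, 116.5, 81,
               226.9, 217.9, 141.4, 162.4])

X = np.column_stack([np.ones(13), Tc, Pc])
beta = np.linalg.solve(X.T @ X, X.T @ Y)

correction = Y.sum()**2 / 13            # (sum Y)^2 / n, the beta0 term
SST = Y @ Y - correction                # total SS, ~38869.34
SSR = beta @ (X.T @ Y) - correction     # regression SS, ~35043.74
SSE = SST - SSR                         # residual SS, ~3825.59

F0 = (SSR / 2) / (SSE / 10)             # MS_regression / MS_residual, ~45.80
p = stats.f.sf(F0, 2, 10)               # upper-tail F probability
print(F0, p)
```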


Adjusted R2

e. Explain how you obtained R2, adjusted R2.

R2adj = 1 − [SSError/(n − p)] / [SSTotal/(n − 1)]

R2adj = 1 − (3825.56/(13 − 3)) / (38869.33/(13 − 1)) = 0.882

Regression Parameters

Regression Parameter                 Value
Coefficient of Determination (R2)    35043.74/38869.33 = 0.9016
Adjusted R2                          0.882
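Both statistics can be recomputed from the fit (a sketch assuming NumPy):

```python
import numpy as np

Tc = np.array([-1, 0, 1, -1, 0, 1, -1, 0, 1, 0.5, -0.5, 0.5, -0.5])
Pc = np.array([-1, -1, -1, 0, 0, 0, 1, 1, 1, -0.5, -0.5, 0.5, 0.5])
Y  = np.array([234, 257.5, 282, 193.5, 187, 181.5, 153, 116.5, 81,
               226.9, 217.9, 141.4, 162.4])

X = np.column_stack([np.ones(13), Tc, Pc])
beta = np.linalg.solve(X.T @ X, X.T @ Y)

SST = Y @ Y - Y.sum()**2 / 13          # total SS about the mean
SSE = Y @ Y - beta @ (X.T @ Y)         # residual SS

R2 = 1 - SSE / SST                                   # ~0.9016
R2_adj = 1 - (SSE / (13 - 3)) / (SST / (13 - 1))     # ~0.882
print(round(R2, 4), round(R2_adj, 3))
```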
Variance-Covariance Matrix (Uncoded Variables)

i. Present the variance-covariance matrix (V) when the variables are not coded:

            2.9341   −0.0571   −0.0952
V_UC =     −0.0571    0.0014    0         × 382.56
           −0.0952    0         0.0159

V(βUC0) = 1122.5     V(βUC1) = 0.5465     V(βUC2) = 6.072

Standard Errors of the Coefficients

SE(βUC0) = 33.5      SE(βUC1) = 0.74      SE(βUC2) = 2.46

           352.277
β_UC =      −0.60
           −23.5
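The uncoded coefficients follow directly from the coded fit by substituting Tc = (T − 40)/10 and Pc = (P − 6)/3 into the fitted equation (a sketch of that back-transformation):

```python
# Coded fit from the slides: C = 187.277 - 6.00*Tc - 70.5*Pc
b0, b1, b2 = 187.277, -6.00, -70.5

# Substitute Tc = (T - 40)/10 and Pc = (P - 6)/3 and collect terms
b1_uc = b1 / 10                        # slope per degree C: -0.60
b2_uc = b2 / 3                         # slope per gram of powder: -23.5
b0_uc = b0 - b1_uc * 40 - b2_uc * 6    # intercept: 352.277

print(b0_uc, b1_uc, b2_uc)
```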
Analysis of Variance (Uncoded Variables)

Since the uncoded model is only a linear reparameterization of the coded one, the fitted values are unchanged; the total, regression and error sums of squares, the ANOVA table, R2 and adjusted R2 are therefore identical to those obtained with the coded variables.


Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square            F0
Regression               35043.74            2                17521.87   17521.87/382.56 = 45.80
Residual                  3825.56           10                  382.56

As an exercise, show that the first regressor variable, viz. the temperature, is NOT significant in the present model.
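One way to work the exercise (a sketch assuming NumPy and SciPy): compare t0 = β1/se(β1) with the critical value t0.025,10.

```python
import numpy as np
from scipy import stats

Tc = np.array([-1, 0, 1, -1, 0, 1, -1, 0, 1, 0.5, -0.5, 0.5, -0.5])
Pc = np.array([-1, -1, -1, 0, 0, 0, 1, 1, 1, -0.5, -0.5, 0.5, 0.5])
Y  = np.array([234, 257.5, 282, 193.5, 187, 181.5, 153, 116.5, 81,
               226.9, 217.9, 141.4, 162.4])

X = np.column_stack([np.ones(13), Tc, Pc])
beta = np.linalg.solve(X.T @ X, X.T @ Y)
MSE = (Y @ Y - beta @ (X.T @ Y)) / 10
se = np.sqrt(np.diag(np.linalg.inv(X.T @ X) * MSE))

t0 = beta[1] / se[1]              # t statistic for the temperature term, ~ -0.81
t_crit = stats.t.ppf(0.975, 10)   # two-sided 5% critical value, ~2.228

# |t0| < t_crit, so temperature is not significant in this model
print(t0, t_crit)
```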


