
Topic 9

Simple Linear Regression

Week 10 & 11-1
Learning Objectives

• Determine the least squares regression equation, and make point and interval estimates for the dependent variable.
• Determine and interpret the value of the:
 Coefficient of correlation.
 Coefficient of determination.
• Construct confidence intervals and carry out hypothesis tests involving the slope of the regression line.
Learning Outcomes

1. Compute statistical measures that use sample information to draw conclusions about a population
2. Select appropriate hypothesis testing methods for different types of data
3. Use the hypothesis testing methods to test the significance of an effect
4. Interpret the statistical results of regression models

Week 10 & 11-3


Correlation vs. Regression
 A scatter plot (or scatter diagram) can be used
to show the relationship between two variables
 Correlation analysis is used to measure the strength of the association (linear relationship) between two variables
 Correlation is only concerned with strength of the
relationship
 No causal effect is implied with correlation
 Correlation was first presented in Chapter 3

Week 10 & 11-4


Scatter Plots

 A scatter plot is a graph of the ordered pairs (x, y) of numbers consisting of the independent variable, x, and the dependent variable, y.

Week 10 & 11-5


Scatter Plots - Example

 Construct a scatter plot for the data obtained in a study of age and systolic blood pressure of six randomly selected subjects.
 The data are given on the next slide.

Week 10 & 11-6


Scatter Plots - Example

Subject Age, x Pressure, y


A 43 128
B 48 120
C 56 135
D 61 143
E 67 141
F 70 152

Week 10 & 11-7


Scatter Plots - Example

Positive Relationship

[Scatter plot: systolic blood pressure (y), 120-150, vs. age (x), 40-70, showing a positive relationship]

Week 10 & 11-8


Scatter Plots - Other Examples

Negative Relationship

[Scatter plot: final grade (y), 40-90, vs. number of absences (x), 5-15, showing a negative relationship]

Week 10 & 11-9


Scatter Plots - Other Examples

No Relationship

[Scatter plot: y vs. x with points scattered randomly, showing no relationship]

Week 10 & 11-10


Scatter Plot Examples

(continued)
Strong relationships vs. weak relationships

[Four panels of y vs. x: tight scatter around a line (strong) vs. loose scatter (weak)]
Week 10 & 11-11
Scatter Plot Examples
(continued)
No relationship

[Panel of y vs. x: no relationship]
Week 10 & 11-12
Correlation Coefficient

 The correlation coefficient computed from the sample data measures the strength and direction of a relationship between two variables.
 Sample correlation coefficient, r.
 Population correlation coefficient, ρ.

Week 10 & 11-14


Range of Values for the Correlation
Coefficient

-1 ≤ r ≤ +1: values near -1 indicate a strong negative relationship, values near 0 indicate no linear relationship, and values near +1 indicate a strong positive relationship.
Week 10 & 11-15


Features of Correlation Coefficient, r

Correlation Coefficient, r Value


Perfect positive correlation +1.00
Strong positive correlation 0.50 to 0.99
Medium positive correlation 0.30 to 0.49
Weak positive correlation 0.01 to 0.29
No correlation 0
Weak negative correlation -0.01 to -0.29
Medium negative correlation -0.30 to -0.49
Strong negative correlation -0.50 to -0.99
Perfect negative correlation -1.00
Week 10 & 11-16
Coefficient of Correlation
 Measures the relative strength of the linear
relationship between two variables
 Sample coefficient of correlation:

$$ r = \frac{\operatorname{cov}(X,Y)}{S_X S_Y} = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2}\,\sqrt{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}} $$
Week 10 & 11-17


Formula for the Correlation Coefficient r

$$ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} $$

where n is the number of data pairs

Week 10 & 11-18


Correlation Coefficient - Example

Subject Age, x Pressure, y


A 43 128
B 48 120
C 56 135
D 61 143
E 67 141
F 70 152

Week 10 & 11-19


Correlation Coefficient - Example (Verify)

 Compute the correlation coefficient for the age and blood pressure data.

Σx = 345, Σy = 819, Σxy = 47,634
Σx² = 20,399, Σy² = 112,443

Substituting in the formula for r gives r ≈ 0.897.

Strong positive relationship between age (x) and blood pressure (y).
Week 10 & 11-20
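As a quick check on this arithmetic, here is a minimal Python sketch (standard library only; the variable names are illustrative) that applies the formula above to the age and blood-pressure data:

```python
import math

x = [43, 48, 56, 61, 67, 70]        # age
y = [128, 120, 135, 143, 141, 152]  # systolic blood pressure

n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)

# r = [n(Σxy) - (Σx)(Σy)] / sqrt{[nΣx² - (Σx)²][nΣy² - (Σy)²]}
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(round(r, 3))  # 0.897
```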
Introduction to
Regression Analysis
 Regression analysis is used to:
 Predict the value of a dependent variable based on
the value of at least one independent variable
 Explain the impact of changes in an independent
variable on the dependent variable
Dependent variable: the variable we wish to
explain
Independent variable: the variable used to explain
the dependent variable

Week 10 & 11-23


Simple Linear Regression
Model

 Only ONE independent variable, X
 Relationship between X and Y is described by a linear function
 Changes in Y are assumed to be caused by changes in X
Week 10 & 11-24


Simple Linear Regression
Model

Week 10 & 11-25


Types of Relationships
Linear relationships vs. curvilinear relationships

[Four panels of Y vs. X: positive and negative linear relationships (left column); positive and negative curvilinear relationships (right column)]
Week 10 & 11-26
Types of Relationships
(continued)
Strong relationships vs. weak relationships

[Four panels of Y vs. X: tight scatter around the line (strong) vs. loose scatter (weak)]
Week 10 & 11-27
Types of Relationships
(continued)
No relationship

[Panel of Y vs. X: no relationship]
Week 10 & 11-28
Simple Linear Regression
Model
The population regression model:

$$ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i $$

where Y_i is the dependent variable, β0 the population Y intercept, β1 the population slope coefficient, X_i the independent variable, and ε_i the random error term. β0 + β1X_i is the linear component; ε_i is the random error component.
Week 10 & 11-29


Simple Linear Regression
Model
(continued)

[Graph: for a given X_i, the observed Y_i = β0 + β1X_i + ε_i lies a vertical distance ε_i (the random error) from the population line, which has intercept β0 and slope β1; the predicted value of Y for X_i lies on the line]
Week 10 & 11-30
Simple Linear Regression
Equation
The simple linear regression equation provides an estimate of the population regression line:

$$ \hat{Y}_i = b_0 + b_1 X_i $$

where Ŷ_i is the estimated (or predicted) Y value for observation i, b0 the estimate of the regression intercept, b1 the estimate of the regression slope, and X_i the value of X for observation i.

The individual random error terms e_i have a mean of zero

Week 10 & 11-31


Least Squares Method

 b0 and b1 are obtained by finding the values of b0 and b1 that minimize the sum of the squared differences between Y and Ŷ:

$$ \min \sum (Y_i - \hat{Y}_i)^2 = \min \sum \bigl(Y_i - (b_0 + b_1 X_i)\bigr)^2 $$

Week 10 & 11-33


Finding the Least Squares
Equation

 The coefficients b0 and b1, and other regression results in this chapter, will be found using Excel

 Formulas are shown in the text at the end of the chapter for those who are interested; a code sketch follows below

Week 10 & 11-34
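For readers who want to see the formulas in action, here is a minimal Python sketch (standard library only; names are illustrative) of the least-squares estimates b1 = [nΣXY - ΣXΣY]/[nΣX² - (ΣX)²] and b0 = Ȳ - b1X̄, applied to the house-price data introduced on the next slides:

```python
sqft  = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]  # $1000s

n = len(sqft)
sum_x, sum_y = sum(sqft), sum(price)
sum_xy = sum(x * y for x, y in zip(sqft, price))
sum_x2 = sum(x ** 2 for x in sqft)

# Least-squares slope and intercept
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b0 = sum_y / n - b1 * sum_x / n
print(round(b0, 5), round(b1, 5))  # 98.24833 0.10977

# Point prediction for a 2000 sq. ft. house
print(round(b0 + b1 * 2000, 2))
# 317.78 ($1000s; the slides get 317.85 from coefficients rounded to 98.25 and 0.1098)
```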


Interpretation of the
Slope and the Intercept

 b0 is the estimated average value of Y when the value of X is zero

 b1 is the estimated change in the average value of Y as a result of a one-unit change in X

Week 10 & 11-35


Simple Linear Regression
Example
 A real estate agent wishes to examine the
relationship between the selling price of a home
and its size (measured in square feet)

 A random sample of 10 houses is selected


 Dependent variable (Y) = house price in $1000s
 Independent variable (X) = square feet

Week 10 & 11-36


Sample Data for House Price
Model
House Price in $1000s Square Feet
(Y) (X)
245 1,400
312 1,600
279 1,700
308 1,875
199 1,100
219 1,550
405 2,350
324 2,450
319 1,425
255 1,700

Week 10 & 11-37


Graphical Presentation

 House price model: scatter plot


[Scatter plot: house price ($1000s), 0-450, vs. square feet, 0-3000]

Week 10 & 11-38


 Y  2865
 X  17150
 XY  5085975
 X  30983750
2

Find the simple linear equation.

Week 10 & 11-39
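These totals plug straight into the least-squares formulas shown earlier:

$$ b_1 = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2} = \frac{10(5{,}085{,}975) - (17{,}150)(2{,}865)}{10(30{,}983{,}750) - (17{,}150)^2} = \frac{1{,}725{,}000}{15{,}715{,}000} \approx 0.10977 $$

$$ b_0 = \bar{Y} - b_1\bar{X} = 286.5 - \frac{1{,}725{,}000}{15{,}715{,}000}(1{,}715) \approx 98.248 $$

which matches the Excel output on the next slide.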


Excel Output
Regression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10

The regression equation is:
house price = 98.24833 + 0.10977 (square feet)

ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

Week 10 & 11-40


Graphical Presentation

 House price model: scatter plot and regression line

[Scatter plot with fitted line: house price ($1000s), 0-450, vs. square feet, 0-3000; slope = 0.10977, intercept = 98.248]

house price = 98.24833 + 0.10977 (square feet)
Week 10 & 11-41


Interpretation of the
Intercept, b0

houseprice  98.24833 0.10977(squarefeet)

 b0 is the estimated average value of Y when the


value of X is zero (if X = 0 is in the range of
observed X values)
 Here, no houses had 0 square feet, so b0 = 98.24833
just indicates that, for houses within the range of
sizes observed, $98,248.33 is the portion of the
house price not explained by square feet

Week 10 & 11-42


Interpretation of the
Slope Coefficient, b1

houseprice  98.24833 0.10977(squarefeet)

 b1 measures the estimated change in the


average value of Y as a result of a one-
unit change in X
 Here, b1 = .10977 tells us that the average value of a
house increases by .10977($1000) = $109.77, on
average, for each additional one square foot of size

Week 10 & 11-43


Predictions using
Regression Analysis
Predict the price for a house with 2000 square feet:

house price = 98.25 + 0.1098 (sq. ft.)
            = 98.25 + 0.1098 (2000)
            = 317.85

The predicted price for a house with 2000 square feet is 317.85 ($1000s) = $317,850
Week 10 & 11-44
Interpolation vs. Extrapolation
 When using a regression model for prediction, only predict within the relevant range of data

[Scatter plot: house price ($1000s) vs. square feet, with the observed X range (about 1,100-2,450 sq. ft.) marked as the relevant range for interpolation. Do not try to extrapolate beyond the range of observed X's]
Week 10 & 11-45
Measures of Variation

 Total variation is made up of two parts:

$$ SST = SSR + SSE $$

(Total Sum of Squares = Regression Sum of Squares + Error Sum of Squares)

$$ SST = \sum (Y_i - \bar{Y})^2 \qquad SSR = \sum (\hat{Y}_i - \bar{Y})^2 \qquad SSE = \sum (Y_i - \hat{Y}_i)^2 $$

where:
Ȳ = average value of the dependent variable
Y_i = observed values of the dependent variable
Ŷ_i = predicted value of Y for the given X_i value
Week 10 & 11-46
Measures of Variation
(continued)

 SST = total sum of squares
 Measures the variation of the Y_i values around their mean Ȳ
 SSR = regression sum of squares
 Explained variation attributable to the relationship between X and Y
 SSE = error sum of squares
 Variation attributable to factors other than the relationship between X and Y

Week 10 & 11-47
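A short Python sketch (standard library only; names are illustrative) that verifies the decomposition SST = SSR + SSE and the Excel sums of squares for the house-price data:

```python
sqft  = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]  # $1000s

n = len(sqft)
x_bar = sum(sqft) / n
y_bar = sum(price) / n

# Least-squares fit (same formulas as before, in deviation form)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(sqft, price)) \
     / sum((x - x_bar) ** 2 for x in sqft)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * x for x in sqft]

sst = sum((y - y_bar) ** 2 for y in price)               # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)             # explained by regression
sse = sum((y - yh) ** 2 for y, yh in zip(price, y_hat))  # unexplained

print(round(sst, 1), round(ssr, 1), round(sse, 1))  # 32600.5 18934.9 13665.6
print(round(ssr + sse, 1) == round(sst, 1))         # True: SST = SSR + SSE
```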




Coefficient of Determination, r2
 The coefficient of determination is the portion
of the total variation in the dependent variable
that is explained by variation in the
independent variable
 The coefficient of determination is also called
r-squared and is denoted as r2
$$ r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}} $$

note: 0 ≤ r² ≤ 1

Week 10 & 11-50


Examples of Approximate
r2 Values
r² = 1

[Panels: points lying exactly on an upward- or downward-sloping line]

Perfect linear relationship between X and Y: 100% of the variation in Y is explained by variation in X
Week 10 & 11-51
Examples of Approximate
r2 Values
0 < r² < 1

[Panels: points scattered loosely around upward- and downward-sloping lines]

Weaker linear relationships between X and Y: some but not all of the variation in Y is explained by variation in X
Week 10 & 11-52
Examples of Approximate
r2 Values

r² = 0

[Panel: points scattered around a horizontal fitted line]

No linear relationship between X and Y: the value of Y does not depend on X. (None of the variation in Y is explained by variation in X)

Week 10 & 11-53


Excel Output
$$ r^2 = \frac{SSR}{SST} = \frac{18{,}934.9348}{32{,}600.5000} = 0.58082 $$

Regression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10

58.08% of the variation in house prices is explained by variation in square feet
ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

Week 10 & 11-54


Standard Error of Estimate
 The standard deviation of the variation of observations around the regression line is estimated by

$$ S_{YX} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2}{n-2}} $$

where
SSE = error sum of squares
n = sample size

Week 10 & 11-55
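Plugging in the SSE and sample size from the Excel output:

$$ S_{YX} = \sqrt{\frac{13{,}665.5652}{10 - 2}} = \sqrt{1{,}708.1957} \approx 41.33 $$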


Excel Output
Regression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10

S_YX = 41.33032

ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039

Residual 8 13665.5652 1708.1957


Total 9 32600.5000

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

Week 10 & 11-56


Comparing Standard Errors
S_YX is a measure of the variation of observed Y values from the regression line

[Panels: small S_YX (points tight around the line) vs. large S_YX (points widely scattered around the line)]

The magnitude of S_YX should always be judged relative to the size of the Y values in the sample data
i.e., S_YX = $41.33K is moderately small relative to house prices in the $200K-$300K range
Week 10 & 11-57
Assumptions of Regression

 Normality of Error
 Error values (ε) are normally distributed for any given
value of X
 Homoscedasticity
 The probability distribution of the errors has constant
variance
 Independence of Errors
 Error values are statistically independent

Week 10 & 11-58


Normality of Error

Week 10 & 11-59


Residual Analysis

ei  Yi  Ŷi
 The residual for observation i, ei, is the difference
between its observed and predicted value
 Check the assumptions of regression by examining the
residuals
 Examine for linearity assumption
 Examine for constant variance for all levels of X
(homoscedasticity)
 Evaluate normal distribution assumption
 Evaluate independence assumption

 Graphical Analysis of Residuals


 Can plot residuals vs. X
Week 10 & 11-60
How to Compute Residuals
Observation House Price (Y_i) Square Feet (X_i) Predicted (Ŷ_i) Residual (e_i)
1 245 1400 251.9232 -6.9232
2 312 1600 273.8767 38.1233
3 279 1700 284.8535 -5.8535
4 308 1875 304.0628 3.9372
5 199 1100 218.9928 -19.9928
6 219 1550 268.3883 -49.3883
7 405 2350 356.2025 48.7975
8 324 2450 367.1793 -43.1793
9 319 1425 254.6674 64.3326
10 255 1700 284.8535 -29.8535
Week 10 & 11-61
Residual Analysis for Linearity

[Panels: Y vs. x with residuals vs. x below. A curved pattern in the residuals indicates the relationship is not linear; a random scatter around zero supports linearity]
Week 10 & 11-62
Residual Analysis for
Homoscedasticity

[Panels: residuals vs. x. A fanning-out pattern indicates non-constant variance; an even band around zero indicates constant variance]

Week 10 & 11-63


Residual Analysis for
Independence

[Panels: residuals vs. X. A systematic pattern across observations indicates the errors are not independent; a random scatter indicates independence]
Week 10 & 11-64


Excel Residual Output

RESIDUAL OUTPUT

Observation Predicted House Price Residuals
1 251.92316 -6.923162
2 273.87671 38.12329
3 284.85348 -5.853484
4 304.06284 3.937162
5 218.99284 -19.99284
6 268.38832 -49.38832
7 356.20251 48.79749
8 367.17929 -43.17929
9 254.66740 64.33264
10 284.85348 -29.85348

[House Price Model Residual Plot: residuals (-60 to 80) vs. square feet (0 to 3000)]
Does not appear to violate
any regression assumptions
Week 10 & 11-65
Inferences About the Slope

 The standard error of the regression slope coefficient (b1) is estimated by

$$ S_{b_1} = \frac{S_{YX}}{\sqrt{SSX}} = \frac{S_{YX}}{\sqrt{\sum (X_i - \bar{X})^2}} $$

where:
S_b1 = estimate of the standard error of the least squares slope

$$ S_{YX} = \sqrt{\frac{SSE}{n-2}} = \text{standard error of the estimate} $$
Week 10 & 11-66
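As a check against the Excel output below: for the house data, SSX = ΣX² - (ΣX)²/n = 30,983,750 - 17,150²/10 = 1,571,500, so

$$ S_{b_1} = \frac{41.33032}{\sqrt{1{,}571{,}500}} \approx \frac{41.33032}{1{,}253.6} \approx 0.03297 $$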
Excel Output
Regression Statistics
Multiple R 0.76211
R Square 0.58082
Adjusted R Square 0.52842
Standard Error 41.33032
Observations 10

S_b1 = 0.03297
ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

Week 10 & 11-67


Comparing Standard Errors of
the Slope
S_b1 is a measure of the variation in the slope of regression lines from different possible samples

[Panels: small S_b1 (fitted lines from different samples cluster tightly) vs. large S_b1 (fitted lines vary widely)]

Week 10 & 11-68


Inference about the Slope:
t Test
 t test for a population slope
 Is there a linear relationship between X and Y?
 Null and alternative hypotheses
H0: β1 = 0 (no linear relationship)
H1: β1 ≠ 0 (linear relationship does exist)
 Test statistic

$$ t = \frac{b_1 - \beta_1}{S_{b_1}}, \qquad d.f. = n - 2 $$

where:
b1 = regression slope coefficient
β1 = hypothesized slope
S_b1 = standard error of the slope

Week 10 & 11-69


Inference about the Slope:
t Test
(continued)

House Price in $1000s (y) Square Feet (x)
245 1400
312 1600
279 1700
308 1875
199 1100
219 1550
405 2350
324 2450
319 1425
255 1700

Estimated Regression Equation:
house price = 98.25 + 0.1098 (sq. ft.)

The slope of this model is 0.1098. Does square footage of the house affect its sales price?

Week 10 & 11-70


Inferences about the Slope:
t Test Example

H0: β1 = 0
H1: β1 ≠ 0

From Excel output:
Coefficients Standard Error t Stat P-value
Intercept 98.24833 58.03348 1.69296 0.12892
Square Feet 0.10977 0.03297 3.32938 0.01039

$$ t = \frac{b_1 - \beta_1}{S_{b_1}} = \frac{0.10977 - 0}{0.03297} = 3.32938 $$

Week 10 & 11-71


Inferences about the Slope:
t Test Example
(continued)
Test Statistic: t = 3.329

H0: β1 = 0
H1: β1 ≠ 0

d.f. = 10 - 2 = 8; α/2 = .025; critical values ±t_{α/2} = ±2.3060

Decision: since t = 3.329 > 2.3060, reject H0

Conclusion: there is sufficient evidence that square footage affects house price
Week 10 & 11-72
Inferences about the Slope:
t Test Example
(continued)
P-value = 0.01039

H0: β1 = 0
H1: β1 ≠ 0

From Excel output:
Coefficients Standard Error t Stat P-value
Intercept 98.24833 58.03348 1.69296 0.12892
Square Feet 0.10977 0.03297 3.32938 0.01039

This is a two-tail test, so the p-value is P(t > 3.329) + P(t < -3.329) = 0.01039 (for 8 d.f.)

Decision: p-value < α, so reject H0
Conclusion: there is sufficient evidence that square footage affects house price
Week 10 & 11-73
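A short sketch, assuming SciPy is available, that reproduces the test statistic, critical value, and two-tail p-value:

```python
from scipy import stats

b1, s_b1, n = 0.10977, 0.03297, 10
t = (b1 - 0) / s_b1                          # test statistic under H0: beta1 = 0
p_value = 2 * stats.t.sf(abs(t), df=n - 2)   # two-tail p-value, 8 d.f.
t_crit = stats.t.ppf(0.975, df=n - 2)        # upper critical value at alpha = .05

print(round(t, 3), round(p_value, 4), round(t_crit, 3))
# 3.329 0.0104 2.306  ->  |t| > t_crit and p < .05: reject H0
```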
F-Test for Significance

 F Test statistic: F  MSR


MSE
where SSR
MSR 
k
SSE
MSE 
n  k 1
where F follows an F distribution with k numerator and (n – k - 1)
denominator degrees of freedom

(k = the number of independent variables in the regression model)

Week 10 & 11-74


Excel Output
$$ F = \frac{MSR}{MSE} = \frac{18{,}934.9348}{1{,}708.1957} = 11.0848 $$

with 1 and 8 degrees of freedom; the p-value for the F test is Significance F = 0.01039

ANOVA
df SS MS F Significance F
Regression 1 18934.9348 18934.9348 11.0848 0.01039
Residual 8 13665.5652 1708.1957
Total 9 32600.5000

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

Week 10 & 11-75


F-Test for Significance
(continued)

H0: β1 = 0 Test Statistic:


H1: β1 ≠ 0 MSR
F  11.08
 = .05 MSE
df1= 1 df2 = 8
Decision:
Critical Reject H0 at = 0.05
Value:
F = 5.32
= .05 Conclusion:
There is sufficient evidence that
0 F house size affects selling price
Do not Reject H0
reject H0
F.05 = 5.32
Week 10 & 11-76
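In simple regression (one independent variable), the F statistic is simply the square of the slope's t statistic, which gives a quick consistency check on the output:

$$ F = t^2 = (3.32938)^2 \approx 11.0848, \qquad F_{.05;\,1,\,8} = 5.32 \approx (2.3060)^2 $$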
Confidence Interval Estimate
for the Slope
Confidence Interval Estimate of the Slope:

$$ b_1 \pm t_{n-2} S_{b_1}, \qquad d.f. = n - 2 $$

Excel printout for house prices:

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

At the 95% level of confidence, the confidence interval for the slope is (0.0337, 0.1858):

0.0337 < β1 < 0.1858
Week 10 & 11-77
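The interval can be reproduced by hand from the coefficient table, using the critical value t8, 0.025 = 2.3060:

$$ b_1 \pm t_{n-2} S_{b_1} = 0.10977 \pm (2.3060)(0.03297) = 0.10977 \pm 0.07603 \;\Rightarrow\; (0.0337,\ 0.1858) $$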
Confidence Interval Estimate
for the Slope
(continued)

Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 98.24833 58.03348 1.69296 0.12892 -35.57720 232.07386
Square Feet 0.10977 0.03297 3.32938 0.01039 0.03374 0.18580

 Since the units of the house price variable are $1000s, we are 95% confident that the average impact on sales price is between $33.70 and $185.80 per square foot of house size

 This 95% confidence interval does not include 0.
Conclusion: there is a significant relationship between house price and square feet at the .05 level of significance

Week 10 & 11-78


t Test for a Correlation Coefficient
 Hypotheses
H0: ρ = 0 (no correlation between X and Y)
H1: ρ ≠ 0 (correlation exists)

 Test statistic (with n - 2 degrees of freedom)

$$ t = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} $$

where r = +√(r²) if b1 > 0, and r = -√(r²) if b1 < 0
Week 10 & 11-79
Example: House Prices
Is there evidence of a linear relationship between square feet and house price at the .05 level of significance?

H0: ρ = 0 (no correlation)
H1: ρ ≠ 0 (correlation exists)
α = .05, d.f. = 10 - 2 = 8

$$ t = \frac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \frac{.762 - 0}{\sqrt{\dfrac{1 - .762^2}{10 - 2}}} = 3.33 $$
Week 10 & 11-80
Example: Test Solution

r ρ .762  0
t   3.33 Decision:
1 r 2 1 .7622 Reject H0

n2 10  2 Conclusion:
There is
d.f. = 10-2 = 8
evidence of a
linear association
a/2=.025 a/2=.025
at the 5% level of
significance
Reject H0 Do not reject H0 Reject H0
-tα/2 tα/2
0
-2.3060 2.3060
3.33
Week 10 & 11-81
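Note that this is the same t value (up to rounding) as the slope test earlier, t = 3.329: in simple linear regression, testing H0: ρ = 0 is equivalent to testing H0: β1 = 0.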
Pitfalls of Regression Analysis

 Lacking an awareness of the assumptions underlying least-squares regression
 Not knowing how to evaluate the assumptions
 Not knowing the alternatives to least-squares regression if a particular assumption is violated
 Using a regression model without knowledge of the subject matter
 Extrapolating outside the relevant range

Week 10 & 11-82


Strategies for Avoiding
the Pitfalls of Regression
 Start with a scatter plot of Y versus X to observe the possible relationship
 Perform residual analysis to check the
assumptions
 Plot the residuals vs. X to check for violations of
assumptions such as homoscedasticity
 Use a histogram, stem-and-leaf display, box-and-
whisker plot, or normal probability plot of the
residuals to uncover possible non-normality

Week 10 & 11-83


Strategies for Avoiding
the Pitfalls of Regression
(continued)

 If there is violation of any assumption, use alternative methods or models
 If there is no evidence of assumption violation, then test for the significance of the regression coefficients and construct confidence intervals and prediction intervals
 Avoid making predictions or forecasts outside the relevant range

Week 10 & 11-84


Practice

 Suppose that the management of a chain of package delivery stores would like to develop a model for predicting the weekly sales (in thousands of dollars) for individual stores based on the number of customers who made purchases. A random sample of 20 stores was selected from among all the stores in the chain. Since we wish to predict Sales with number of Customers, that makes Sales the dependent, response, or "Y" variable, and number of Customers the independent, explanatory, or "X" variable.

Week 10 & 11-85


 The regression output: [Excel regression output shown on slide]

Week 10 & 11-86


a) Interpret the meaning of the slope b1 in this problem.
b) Predict the average weekly Sales (in thousands of dollars) for stores that have 600 customers.
c) How much variation in Sales is explained by number of Customers?
d) Using α = 0.05, is there evidence of a linear relationship between Sales and number of Customers?
Week 10 & 11-87
Solution

a) As the number of Customers increases by 1, predicted weekly Sales increase by $8.73 (a slope of 0.00873 in $1000s per customer).
b) Sales = 7.661 thousand dollars, i.e. about $7,661.
c) 91.19%
d) Yes. From c), r² = 0.9119, so r ≈ 0.955; with n = 20, t = r√(n-2)/√(1-r²) ≈ 13.6 with 18 d.f., giving a p-value far below 0.05. Reject H0: β1 = 0; there is evidence of a linear relationship.

Week 10 & 11-88


Summary

 Introduced types of regression models
 Reviewed assumptions of regression and correlation
 Discussed determining the simple linear regression equation
 Described measures of variation
 Discussed residual analysis
 Addressed measuring autocorrelation

Week 10 & 11-89


Chapter Summary
(continued)

 Described inference about the slope
 Discussed correlation -- measuring the strength of the association
 Addressed estimation of mean values and prediction of individual values
 Discussed possible pitfalls in regression and recommended strategies to avoid them

Week 10 & 11-90
