
Linear Regression and Model Statistics

Lesson #2

Linear Regression Method

Copyright 2010 DeepThought, Inc. 1



Method Introduction
• One of the simpler methods to use for forecasting
• Estimates a line through the data
• Uses the estimated line equation to forecast future values.
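• Method Format:
– Y = a + b*t, where Y is the forecast value, a is the intercept, b is the slope (trend per period), and t is the time period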


Model Characteristics
• Method Characteristics
– Fits a line to the data
– Estimates the line that minimizes the sum of squared errors between
actual data points and model estimates
• When to use Method
– Estimate trend
– Estimate trend magnitude
• When not to use
– Estimate anything beyond a simple linear relationship.


Forecasting Steps
1. Objective Setting
2. Method Selection
3. Model Evaluation
4. Find Best Models
5. Use Best Models


Objective Setting
• Simpler is better
• Linear regression lets us test whether a line fitted to the data
works as a model. Objectives should take that principle into
consideration.
• Example Objectives for M2 Money Stock (see next slide):
– Test if M2 has a linear trend over time.
– If M2 exhibits a statistically significant trend, what is its
magnitude and does it make sense?
– If the model looks good, create a forecast based on the model.


Example: M2 Money Stock


[Figure: M2 Money Stock (Billions of $'s) over time; x-axis May-79 to Apr-12, y-axis 0 to 9,000]


Method Selection
• Observe time series qualities: trend, seasonality, cyclicality, and
randomness.
• Adjust time frame, units, periods to forecast as needed.
• Determine if linear regression is a possible candidate based on
method characteristics.
– Determine if transforming the units will enable use of model.
• 8 Different Unit Transformation Techniques


Build Model
• Software finds the best-fit line for the data by minimizing the sum
of squared errors.
[Figure: M2 Money Stock (Billions of $'s), May-79 to Apr-12]
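
The slide leaves the fitting to software. As a rough sketch of what such software does for a single time variable, the closed-form least-squares estimates for Y = a + b*t can be computed as below. The function name `fit_line`, the numbering of periods as t = 1, 2, ..., n, and the toy series are assumptions made for illustration; they are not taken from the deck or the M2 data.

```python
def fit_line(y):
    """Fit Y = a + b*t by ordinary least squares, with t = 1, 2, ..., n."""
    n = len(y)
    t = range(1, n + 1)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    # Slope: co-variation of t and y divided by the variation of t.
    s_ty = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    s_tt = sum((ti - t_mean) ** 2 for ti in t)
    b = s_ty / s_tt
    # Intercept: forces the line through the point (t_mean, y_mean).
    a = y_mean - b * t_mean
    return a, b


# Hypothetical toy series (not the M2 data) that lies exactly on Y = 10 + 2*t.
a, b = fit_line([12.0, 14.0, 16.0, 18.0, 20.0])
print(a, b)
```

Because the toy series is perfectly linear, the fitted intercept and slope recover the line that generated it; real data such as M2 will leave nonzero errors around the line.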

Evaluate Model
• Descriptive Statistics
– Mean
– Variance & Standard Deviation
• Accuracy / Error
– SSE
– RMSE
– MAPE
– R-Squared; Adjusted R-Squared
• Statistical Significance
– F-Test
– F-Test P-Value

Descriptive Statistics
Mean
The average value of the data set.


Variance & Standard Deviation


• Variance is the average of the squared deviations of the data from the mean.
– Estimates how much the data varies around the mean
• Standard Deviation is the square root of the variance
– Used to measure the spread of the variable around the
mean; for roughly normal data, most observations of the variable will be within ± 3
standard deviations.
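
The descriptive statistics above can be sketched in a few lines. This is a minimal illustration, assuming the sample form of the variance (dividing by n - 1); the deck does not state which divisor its software uses. The function name `describe` and the toy data are hypothetical.

```python
import math


def describe(y):
    """Return (mean, sample variance, standard deviation) of a data set."""
    n = len(y)
    mean = sum(y) / n
    # Assumption: sample variance, i.e. squared deviations averaged over n - 1.
    variance = sum((v - mean) ** 2 for v in y) / (n - 1)
    return mean, variance, math.sqrt(variance)


# Hypothetical toy data (not the M2 series).
mean, variance, sd = describe([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(mean, variance, sd)

# Consistency check on the deck's own M2 figures: the square root of the
# reported variance (3346475.10) should match the reported SD (1829.34).
m2_sd_check = math.sqrt(3346475.10)
print(round(m2_sd_check, 2))
```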


M2 Example
• Mean
– 4214.38
• Variance
– 3346475.10
• SD (Standard Deviation)
– 1829.34
[Figure: M2 Money Stock (Billions of $'s), May-79 to Apr-12]


Accuracy/Error
SSE

• Sum of Squared Errors (SSE) – Sums the squared errors between the actual
values and model values
• Measures the total error of the model
• M2 Example:
– SSE: 316778645.89
[Figure: M2 Money Stock (Billions of $'s), May-79 to Apr-12]
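
As a small sketch of the SSE calculation described on this slide: each error (actual minus model value) is squared and the squares are summed. The function name `sse` and the toy numbers are hypothetical and are not the M2 data.

```python
def sse(actual, fitted):
    """Sum of squared errors between actual values and model values."""
    return sum((a - f) ** 2 for a, f in zip(actual, fitted))


# Hypothetical toy values: errors are 0, 1, -1, 1, so SSE = 0 + 1 + 1 + 1.
actual = [12.0, 15.0, 15.0, 19.0]
fitted = [12.0, 14.0, 16.0, 18.0]
total_error = sse(actual, fitted)
print(total_error)
```

Squaring keeps positive and negative errors from cancelling, which is also why SSE grows with the number of observations and is hard to compare across data sets of different length.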

RMSE

• The square root of the sum of squared errors divided by the number
of observations: RMSE = sqrt(SSE / n).
• Expresses the typical size of an error, in the same units as the
data.
• Simple way to compare models based on error.
• M2 Example:
– RMSE: 456.82
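
A minimal sketch of the RMSE calculation follows. The function name `rmse` and the toy values are hypothetical. The last lines re-derive the deck's M2 figure from its reported SSE, assuming n = 1518 observations; that count is inferred from the later statement that the next period is 1519, and is not given directly in the deck.

```python
import math


def rmse(actual, fitted):
    """Root mean squared error: sqrt(SSE / n)."""
    n = len(actual)
    sse = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    return math.sqrt(sse / n)


# Hypothetical toy values (not the M2 data): SSE = 3 over n = 4 points.
toy_rmse = rmse([12.0, 15.0, 15.0, 19.0], [12.0, 14.0, 16.0, 18.0])
print(toy_rmse)

# Consistency check against the deck's numbers.
n_assumed = 1518  # assumption: inferred from "Next Period is 1519"
m2_rmse_check = math.sqrt(316778645.89 / n_assumed)
print(round(m2_rmse_check, 2))
```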


MAPE

• The mean absolute percentage error of the model.
• Describes the average absolute difference between actual and
forecasted values, as a percentage of the actual values.
• M2 Example:
– MAPE: 10.09%
[Figure: M2 Money Stock (Billions of $'s), May-79 to Apr-12]
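
The MAPE calculation can be sketched as below, assuming the usual definition: the absolute error at each point divided by the actual value, averaged and expressed as a percentage. The function name `mape` and the toy values are hypothetical; the sketch also assumes no actual value is zero.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Assumes no actual value is 0."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n


# Hypothetical toy values: percentage errors are 10%, 10% and 0%.
toy_mape = mape([100.0, 200.0, 400.0], [110.0, 180.0, 400.0])
print(toy_mape)
```

Because each error is scaled by its actual value, MAPE can be compared across series measured in different units, unlike SSE or RMSE.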


R-Squared & Adjusted R-Squared

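• The proportion of the total variation in the data that the model
explains: R-Squared = 1 – SSE / SST, where SST is the total sum of
squared deviations from the mean.
• Measures the percentage of variation captured by the model.
• Adjusted R-Squared incorporates the number of variables used and
the sample size to adjust the R-Squared value.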
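
A brief sketch of both statistics, using the standard formulas R-Squared = 1 - SSE/SST and Adjusted R-Squared = 1 - (1 - R-Squared)(n - 1)/(n - k - 1), where k is the number of explanatory variables. The function names are hypothetical. The check at the end plugs in the deck's reported SSE and variance under two assumptions that the deck does not state: n = 1518 observations, and a variance computed with the n - 1 divisor (so SST = variance * (n - 1)).

```python
def r_squared(sse, sst):
    """Share of total variation explained by the model."""
    return 1.0 - sse / sst


def adjusted_r_squared(r2, n, k):
    """R-squared penalized for k explanatory variables and sample size n."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Consistency check against the deck's M2 figures.
n = 1518                    # assumption: inferred from "Next Period is 1519"
sse = 316778645.89          # reported SSE
sst = 3346475.10 * (n - 1)  # assumption: reported variance uses the n - 1 divisor
r2 = r_squared(sse, sst)
adj_r2 = adjusted_r_squared(r2, n, k=1)
print(round(100 * r2, 2), round(100 * adj_r2, 2))
```

With a single explanatory variable and roughly 1,500 observations, the adjustment is tiny, which is consistent with the two values agreeing to two decimal places in the M2 example.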


M2 Example
• R2
– 93.76%
• Adjusted R2
– 93.76%
[Figure: M2 Money Stock (Billions of $'s), May-79 to Apr-12]


Statistical Significance
F-Test

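• The ratio of the explained variation to the unexplained variation of
the model, each divided by its degrees of freedom.
• Used to test whether the model built is statistically different from
a model whose coefficients are equal to zero.
• The larger the F statistic, the stronger the evidence that the model
is significant.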
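
The ratio described above can be sketched with the standard regression F statistic, F = ((SST - SSE) / k) / (SSE / (n - k - 1)), where k is the number of explanatory variables. The function name `f_statistic` is hypothetical. As with the earlier checks, n = 1518 and SST = variance * (n - 1) are assumptions inferred from the deck's numbers, not stated in it.

```python
def f_statistic(sse, sst, n, k):
    """Explained variation per variable over unexplained variation per degree of freedom."""
    explained = (sst - sse) / k
    unexplained = sse / (n - k - 1)
    return explained / unexplained


# Hypothetical toy case: half the variation explained, n = 12, k = 1.
toy_f = f_statistic(sse=50.0, sst=100.0, n=12, k=1)
print(toy_f)

# Consistency check against the deck's M2 figures.
n = 1518  # assumption: inferred from "Next Period is 1519"
m2_f = f_statistic(sse=316778645.89, sst=3346475.10 * (n - 1), n=n, k=1)
print(round(m2_f, 2))
```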


F-Test P-Value

• The F-Test P-Value is the probability of seeing an F statistic at
least this large if the model truly had no impact. (Blue area on
graph)
• The higher the value of the F-test, the smaller the shaded blue area.
As the blue area decreases, confidence that our model is
statistically significant increases.
• 1 – p-value = Significance Level of the Model (%)
• Significance Level of the Model (%) represents the amount of
confidence we have that our model is different from a model with
no impact, or zero impact.

M2 Example
• F-Test
– 22778.98
• F-Test P-Value
– 0.00
[Figure: M2 Money Stock (Billions of $'s), May-79 to Apr-12]


Compare Multiple Models


• Skip this step until you have learned multiple methods.
• Accuracy/Error statistics will be used to compare multiple models
and find the best ones.


Use Model
• Understand the limitations of the model.
– Only measures a trend.
– A long-term average.
• Answer the objectives.
– Does M2 have a linear trend?
– If a trend exists, what is its magnitude?
– If the model is statistically significant, forecast.


M2 Example
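• M2 = 1145.31 + 4.04 * Time
• Next Period is 1519
• Forecast for that period is:
– Y = 1145.31 + 4.04 * 1519
– Y = 7283.446866 (the rounded coefficients shown give 7282.07; the slide's value presumably comes from unrounded coefficients)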
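
The forecast step is a direct evaluation of the fitted line. A minimal sketch, using the rounded coefficients printed on the slide; the function name `forecast` is hypothetical.

```python
def forecast(a, b, t):
    """Evaluate the fitted line Y = a + b*t at time period t."""
    return a + b * t


# Rounded coefficients as printed on the slide; period 1519 is the next period.
y_next = forecast(1145.31, 4.04, 1519)
print(round(y_next, 2))  # about 7282.07 with these rounded inputs
```

The small gap between this result and the value printed on the slide illustrates why forecasts are best computed from the software's full-precision coefficients rather than from rounded ones.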
