
MULTIPLE REGRESSION

By

Dr. D. ISRAEL

Objective

For relating two or more independent variables to
one dependent variable.
For testing the relative influence of each of the
independent variables on the dependent variable.
For predicting the value of the dependent variable
from the values of the independent variables.

Requirements
The dependent variable should be intervally scaled.
The independent variables should also be intervally
scaled.
Sometimes, nominal variables can also be used
as predictors (coded as dummy variables; see later).

Meaning of Related Concepts
Multiple Regression Equation
A mathematical model that depicts the relationship between the
independent variables and a dependent variable. It is of the form
y = a + b1x1 + b2x2 + … + bnxn + e, where
y = the value of the dependent variable
a = the intercept, a constant that usually has no meaningful
interpretation of its own
b1 = the partial regression coefficient for independent variable 1
b2 = the partial regression coefficient for independent variable 2
bn = the partial regression coefficient for independent variable n
x1, x2, …, xn = the actual values of the independent variables
e = the error term, which is nothing but the residual (the difference
between the actual and predicted y values)
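To make the equation concrete, here is a minimal Python sketch (not from the original slides; the data are simulated and the coefficient values 2, 1.5 and -0.8 are assumptions for illustration) that recovers a, b1 and b2 by least squares:

```python
import numpy as np

# Simulated data: y = 2 + 1.5*x1 - 0.8*x2 + noise (values assumed for illustration)
rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2 + 1.5 * x1 - 0.8 * x2 + rng.normal(scale=0.5, size=n)

# Design matrix: a leading column of 1s carries the intercept a
X = np.column_stack([np.ones(n), x1, x2])

# Least-squares fit returns [a, b1, b2]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b1, b2 = coef

y_hat = X @ coef    # predicted y values
e = y - y_hat       # residuals: the error term e
print(f"a = {a:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}")
```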
LEAST SQUARES ANALYSIS
•This is the process on which
multiple regression analysis is based.
•It is aimed at producing the minimum
sum of squares of the residuals.
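The closed-form solution behind this process is the normal equations, beta = (X'X)^(-1) X'y. The sketch below (illustrative, simulated data) computes it directly and evaluates the minimised sum of squared residuals:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])   # intercept + 2 predictors
y = X @ np.array([2.0, 1.5, -0.8]) + rng.normal(scale=0.5, size=n)

# Normal equations: beta = (X'X)^{-1} X'y is the coefficient vector
# that minimises the sum of squared residuals
beta = np.linalg.solve(X.T @ X, X.T @ y)
ssr = np.sum((y - X @ beta) ** 2)   # the minimised sum of squared residuals
print(beta, ssr)
```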

STANDARD ERROR OF
ESTIMATE (SEE)
•This is the Standard Deviation of the
residuals of the regression model.
•As a rule of thumb, at least 68% of the residuals
should fall within ± 1 SEE.
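A brief illustration (simulated data; the n - k - 1 divisor is the conventional degrees-of-freedom form) of computing SEE and checking the 68% rule of thumb:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 200, 2                                    # sample size, number of predictors
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

beta = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta

# SEE: standard deviation of the residuals, with n - k - 1 degrees of freedom
see = np.sqrt(np.sum(resid ** 2) / (n - k - 1))

within = np.mean(np.abs(resid) <= see)           # share of residuals within ±1 SEE
print(f"SEE = {see:.3f}, within ±1 SEE: {within:.0%}")
```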

BETA COEFFICIENT (β)
•This is an indicator of the relative importance of a
particular independent variable for the dependent variable.
•It is computed on the standardised scores of the
variables.
•The variable with the highest Beta coefficient is the most
important one in explaining the dependent variable.
•The significance of this beta coefficient can be tested
through a ‘t’ test.
•It is also known as a Beta weight or standardised
regression coefficient.
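An illustrative sketch (assumed, simulated data): beta weights can be obtained from the unstandardised coefficients as beta = b * (s_x / s_y), which is equivalent to refitting the model on z-scores:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 150
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.normal(scale=3.0, size=n)])
y = X @ np.array([1.0, 2.0, 0.5]) + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)            # unstandardised coefficients

# Beta weight for predictor j: b_j * (sd of x_j / sd of y)
sx = X[:, 1:].std(axis=0, ddof=1)
betas = b[1:] * sx / y.std(ddof=1)
print(betas)   # comparable across predictors, unlike the raw b values
```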
Unstandardised Regression Coefficient

Also known as a partial regression coefficient or simply
a regression coefficient.
Indicates the change in the y value for a one-unit change
in a particular independent variable while keeping the
other independent variables constant.
It cannot be used for determining the importance of
independent variables in the model, because its size
depends on the units in which each variable is measured.
Multiple R
Nothing but the simple correlation between the observed y values
and the predicted y values.
It is also called the multiple correlation index.
Multiple R2
It is also known as the Coefficient of multiple determination.
It shows the strength of association between the independent
variables as a group and the dependent variable.
If the multiple R2 is insignificant, the entire regression model
will be useless.
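A small illustration (simulated data) that Multiple R is the correlation between observed and predicted values, and that R2 equals its square and the proportion of explained variation:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 120
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([0.5, 1.0, -2.0, 0.8]) + rng.normal(size=n)

beta = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ beta

multiple_r = np.corrcoef(y, y_hat)[0, 1]         # correlation of observed and predicted y
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print(multiple_r ** 2, r2)                       # equal: R^2 = (Multiple R)^2
```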

The significance of multiple R2 is tested through an F value.
The R2 value can be increased simply by adding more variables.
R2 in a multiple regression will never fall below the highest
bivariate R2 of any individual independent variable with the
dependent variable.
R2 will be larger when the multicollinearity is low.
If the independent variables are statistically independent, then R2
will be the sum of the bivariate R2 of each independent variable
with the dependent variable.
R2 is calculated as the proportion of explained variation to total
variation.
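The F test referred to above is not spelled out on the slide; a standard form (an assumption on my part, not the slide's own formula) is F = (R2/k) / ((1 - R2)/(n - k - 1)), illustrated below:

```python
from scipy import stats   # for the F-distribution p-value

def overall_f_test(r2: float, n: int, k: int):
    """Overall significance of R^2: F with (k, n - k - 1) degrees of freedom."""
    f = (r2 / k) / ((1 - r2) / (n - k - 1))
    p = stats.f.sf(f, k, n - k - 1)
    return f, p

print(overall_f_test(r2=0.35, n=100, k=3))   # e.g. F ≈ 17.2, p < .001
```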

Adjusted R2

• This is the R2 value adjusted for the number of independent
variables and the sample size in the regression model.
• Adjusted R2 is reduced as the number of independent variables is
increased.
• It is calculated as below:

Adjusted R2 = R2 - [k(1 - R2) / (n - k - 1)]

Where, k = number of independent variables
n = sample size
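A one-line check of the formula (it is algebraically the same as the textbook form 1 - (1 - R2)(n - 1)/(n - k - 1)):

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 = R^2 - k(1 - R^2) / (n - k - 1)."""
    return r2 - k * (1 - r2) / (n - k - 1)

print(adjusted_r2(r2=0.35, n=100, k=3))   # 0.3297, slightly below R^2
```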

R2 Increment
• Useful for estimating the predictive power of an
independent variable when it is added to the regression
equation, as compared to the model without that variable.
• It is the only method for comparing the importance of
independent variables in two or more models.

Relationship between R2 and Adjusted R2

• If R2 and Adjusted R2 are close to each other, then the
model is considered good at explaining the variation.
Multicollinearity

• Indicates the intercorrelation of the independent variables.

• Its presence can be estimated using Tolerance and VIF (Variance
Inflation Factor).

• Variables with a correlation greater than .90 indicate bivariate
multicollinearity.

• It makes the beta coefficients and regression coefficients unstable
(unreliable).

How to control Multicollinearity?

• Increase the sample size

• Transform the variables (standardisation)

• Form a composite variable

• Drop the most intercorrelated variable

• Perform a ridge regression

• Perform a stepwise regression

Tolerance:

• It is computed for each independent variable.

• It is 1 - R2 from the regression of that independent variable on all
the other independent variables.

• A tolerance value closer to zero (particularly less than .20)
indicates a high multicollinearity of that variable with the others.

• A lower tolerance increases the standard error of the regression
coefficient.

Variance Inflation Factor (VIF)

• Another multicollinearity measure.

• It is the reciprocal of tolerance: 1 / (1 - R2).

• A high VIF indicates high multicollinearity and instability of the
regression coefficient and the beta coefficient.

• A VIF of 4 and above indicates a multicollinearity problem.
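An illustrative computation of tolerance and VIF (simulated data; the helper function is my own, not from the slides):

```python
import numpy as np

def tolerance_and_vif(X):
    """For each predictor column in X (no intercept column), compute
    tolerance = 1 - R^2 of that column regressed on the other columns,
    and VIF = 1 / tolerance."""
    n, k = X.shape
    results = []
    for j in range(k):
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta = np.linalg.lstsq(others, X[:, j], rcond=None)[0]
        resid = X[:, j] - others @ beta
        r2 = 1 - resid.var() / X[:, j].var()
        tol = 1 - r2
        results.append((tol, 1 / tol))
    return results

rng = np.random.default_rng(5)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.3, size=100)   # deliberately collinear with x1
x3 = rng.normal(size=100)
for tol, vif in tolerance_and_vif(np.column_stack([x1, x2, x3])):
    print(f"tolerance={tol:.3f}, VIF={vif:.2f}")   # x1, x2: low tolerance, high VIF
```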

Testing the significance of the difference between two R2s

• Whenever we want to test the significance of the difference between two
R2 values, the following method can be adopted:

F = [(R22 - R12) / (K2 - K1)] / [(1 - R22) / (n - K2 - 1)]

Where
R22 = R2 of the second model
R12 = R2 of the first model
n = total sample size
K1 = number of independent variables in the first model
K2 = number of independent variables in the second model

• The F value should be obtained for (K2 - K1) and (n - K2 - 1) degrees of
freedom for testing the null hypothesis of no significant R2 increment
between these two models.
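A direct illustrative translation of the formula above into Python (the numbers in the example call are assumptions):

```python
from scipy import stats

def r2_increment_f(r2_1: float, r2_2: float, k1: int, k2: int, n: int):
    """F test for the R^2 increment between a model with k1 predictors (R2_1)
    and a larger model with k2 predictors (R2_2)."""
    df1, df2 = k2 - k1, n - k2 - 1
    f = ((r2_2 - r2_1) / df1) / ((1 - r2_2) / df2)
    return f, stats.f.sf(f, df1, df2)

print(r2_increment_f(r2_1=0.30, r2_2=0.35, k1=2, k2=4, n=100))
```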

DUMMY VARIABLE

• Used for transforming a nominal independent variable.

• Values of 0s and 1s are used for coding the variable.

• If a variable has four categories, it will have 4 - 1 = 3
dummy variables.
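An illustrative coding (the category labels are assumed) of a four-category nominal variable into 4 - 1 = 3 dummy columns, with one category acting as the reference:

```python
import numpy as np

regions = np.array(["north", "south", "east", "west", "south", "east"])  # assumed labels
categories = ["north", "south", "east", "west"]

# One dummy per category except the reference ("north" here)
dummies = np.column_stack([(regions == c).astype(int) for c in categories[1:]])
print(dummies)   # 3 columns of 0s and 1s; "north" rows are all zeros
```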

STEPWISE REGRESSION

• A method where the independent variables enter the model one at a
time, in the order of their importance, until no remaining variable
can significantly improve R2.

• Can be used for exploratory research purposes.

• Two popular forms are forward selection and backward
elimination.
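A minimal forward-selection sketch (illustrative; the entry rule is simplified to "largest R2 gain above a fixed threshold" rather than a formal F-to-enter test):

```python
import numpy as np

def r2(X, y):
    """R^2 of a least-squares fit of y on the columns of X."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

def forward_selection(X, y, min_gain=0.01):
    """Add predictors one at a time, each step picking the column that
    raises R^2 the most; stop when the best gain falls below min_gain."""
    n, k = X.shape
    selected, current_r2 = [], 0.0
    while len(selected) < k:
        gains = {}
        for j in set(range(k)) - set(selected):
            cols = np.column_stack([np.ones(n)] + [X[:, i] for i in selected + [j]])
            gains[j] = r2(cols, y) - current_r2
        best = max(gains, key=gains.get)
        if gains[best] < min_gain:
            break
        selected.append(best)
        current_r2 += gains[best]
    return selected

rng = np.random.default_rng(6)
X = rng.normal(size=(200, 5))
y = 2 * X[:, 1] - X[:, 3] + rng.normal(size=200)   # only columns 1 and 3 matter
print(forward_selection(X, y))                     # typically [1, 3]
```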

Questions?

