
Multiple Regression

Dr. Shahid Ali


Multiple Linear Regression

This is a major jump in the course: from the simple linear regression model with one
predictor to the multiple linear regression model with two or more predictors.

"Simple" means one predictor; "multiple" means two or more predictors.

In the multiple regression setting, because of the potentially large number of predictors, it
is more efficient to use matrices to define the regression model and the subsequent
analyses. This lesson considers some of the more important multiple regression formulas
in matrix form.
These coefficients need to be estimated
Least-squares approximation
Multiple nonlinear regression
Applications of regression in civil engineering
Peak Traffic Flow into City Centre
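The least-squares estimation mentioned above can be sketched numerically. The following Python/NumPy illustration (with made-up data, not taken from the lecture) estimates the coefficients by solving the normal equations X'Xb = X'y:

```python
import numpy as np

# Hypothetical data: a response generated from two predictors plus noise
# (illustrative values only, not from the lecture).
rng = np.random.default_rng(0)
n = 50
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 5, n)
y = 2.0 + 1.5 * x1 - 0.8 * x2 + rng.normal(0, 0.5, n)

# Design matrix: a column of ones for the intercept, then the predictors.
X = np.column_stack([np.ones(n), x1, x2])

# Least-squares estimate: solve the normal equations X'X b = X'y
# (solved directly rather than forming an explicit inverse).
b = np.linalg.solve(X.T @ X, X.T @ y)
print(b)  # estimates should be close to the true coefficients [2.0, 1.5, -0.8]
```

With only modest noise, the estimates recover the coefficients used to generate the data; this is the same computation MATLAB's regress performs (via an orthogonal decomposition rather than the normal equations).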
Sums of squares values:

SSR is the "regression sum of squares" and quantifies how far the estimated regression
line, ŷᵢ, is from the horizontal "no relationship" line, the sample mean ȳ.

SSE is the "error sum of squares" and quantifies how much the data points, yᵢ, vary around
the estimated regression line.

SSTO is the "total sum of squares" and quantifies how much the data points, yᵢ, vary around
their mean, ȳ.

SSTO = SSR + SSE
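The decomposition SSTO = SSR + SSE can be verified numerically. A minimal Python/NumPy sketch with made-up data:

```python
import numpy as np

# Small illustrative dataset (hypothetical values, not from the lecture).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

# Fit a simple linear regression by least squares.
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
yhat = X @ b

SSR = np.sum((yhat - y.mean()) ** 2)   # regression sum of squares
SSE = np.sum((y - yhat) ** 2)          # error sum of squares
SSTO = np.sum((y - y.mean()) ** 2)     # total sum of squares

print(SSTO, SSR + SSE)  # the two values agree: SSTO = SSR + SSE
```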


Adjusted R2

adjusted R2 = 1 − (1 − R2)(n − 1) / (n − p − 1)

where p is the total number of explanatory variables in the model (not including the
constant term), and n is the sample size.

Equivalently, adjusted R2 = 1 − (SSE/dfe) / (SSTO/dft), where dft is the degrees of
freedom, n − 1, of the estimate of the population variance of the dependent variable,
and dfe is the degrees of freedom, n − p − 1, of the estimate of the underlying
population error variance.
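The adjusted R2 can be computed either from R2 together with n and p, or from the ratio of the two variance estimates SSE/dfe and SSTO/dft; a short Python check with hypothetical values confirms the two expressions agree:

```python
# Hypothetical fit summary (illustrative values, not from the lecture).
SSE, SSTO = 12.5, 100.0
n, p = 30, 3              # sample size; predictors excluding the constant

R2 = 1 - SSE / SSTO
dft = n - 1               # df of the total variance estimate
dfe = n - p - 1           # df of the error variance estimate

# Two equivalent expressions for adjusted R2:
R2_adj_df = 1 - (SSE / dfe) / (SSTO / dft)
R2_adj_r2 = 1 - (1 - R2) * (n - 1) / (n - p - 1)
print(R2_adj_df, R2_adj_r2)
```

Note that the adjusted value is always below R2 whenever p > 0, which is the point of the adjustment: it penalizes adding predictors that do not reduce SSE enough to earn their degree of freedom.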
The Analysis of Variance (ANOVA) table
MATLAB Syntax

Syntax
b = regress(y,X)
[b,bint] = regress(y,X)
[b,bint,r] = regress(y,X)
[b,bint,r,rint] = regress(y,X)
[b,bint,r,rint,stats] =
regress(y,X)
[...] = regress(y,X,alpha)
Description
b = regress(y,X) returns a p-by-1 vector b of coefficient estimates for a multiple linear regression
of the responses in y on the predictors in X. X is an n-by-p matrix of p predictors at each
of n observations. y is an n-by-1 vector of observed responses.
regress treats NaNs in X or y as missing values, and ignores them.
If the columns of X are linearly dependent, regress obtains a basic solution by setting the
maximum number of elements of b to zero.
[b,bint] = regress(y,X) returns a p-by-2 matrix bint of 95% confidence intervals for the
coefficient estimates. The first column of bint contains lower confidence bounds for each of
the p coefficient estimates; the second column contains upper confidence bounds.
If the columns of X are linearly dependent, regress returns zeros in elements
of bint corresponding to the zero elements of b.
[b,bint,r] = regress(y,X) returns an n-by-1 vector r of residuals.
[b,bint,r,rint] = regress(y,X) returns an n-by-2 matrix rint of intervals that can be used to
diagnose outliers. If the interval rint(i,:) for observation i does not contain zero, the
corresponding residual is larger than expected in 95% of new observations, suggesting
an outlier.
In a linear model, observed values of y are random variables, and so are their residuals.
Residuals have normal distributions with zero mean but with different variances at
different values of the predictors. To put residuals on a comparable scale, they are
"Studentized," that is, they are divided by an estimate of their standard deviation that is
independent of their value. Studentized residuals have t distributions with known
degrees of freedom. The intervals returned in rint are shifts of the 95% confidence
intervals of these t distributions, centered at the residuals.
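As a rough illustration of studentization, the following Python sketch (made-up data with one deliberate outlier) scales each raw residual by an estimate of its standard deviation using the hat matrix. This is the simplified "internally studentized" version with a single pooled variance estimate; MATLAB's rint is based on externally studentized residuals, so its numbers differ slightly.

```python
import numpy as np

# Hypothetical data: roughly y = x, with observation 6 pushed well off the line.
x = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
y = np.array([1.1, 2.0, 2.9, 4.2, 5.0, 9.5, 7.1, 8.0])

X = np.column_stack([np.ones_like(x), x])
H = X @ np.linalg.solve(X.T @ X, X.T)   # hat matrix: yhat = H y
r = y - H @ y                           # raw residuals
n, p = X.shape

# Residual i has variance sigma^2 * (1 - h_ii): dividing by an estimate of
# its standard deviation puts all residuals on a comparable scale.
sigma2 = np.sum(r ** 2) / (n - p)
t = r / np.sqrt(sigma2 * (1 - np.diag(H)))
print(t)  # the largest |t| flags the planted outlier at index 5
```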
[b,bint,r,rint,stats] = regress(y,X) returns a 1-by-4 vector stats that contains, in order,
the R2 statistic, the F statistic and its p value, and an estimate of the error variance.
load carsmall
x1 = Weight;
x2 = Horsepower;
y = MPG;
X = [ones(size(x1)) x1 x2 x1.*x2];
b = regress(y,X) % Removes NaN data

b =
   60.7104
   -0.0102
   -0.1882
    0.0000


scatter3(x1,x2,y,'filled')
hold on
x1fit = min(x1):100:max(x1);
x2fit = min(x2):10:max(x2);
[X1FIT,X2FIT] = meshgrid(x1fit,x2fit);
YFIT = b(1) + b(2)*X1FIT + b(3)*X2FIT + b(4)*X1FIT.*X2FIT;
mesh(X1FIT,X2FIT,YFIT)
xlabel('Weight')
ylabel('Horsepower')
zlabel('MPG')
view(50,10)
