
Linear Regression

Major: All Engineering Majors Authors: Autar Kaw, Luke Snyder

http://numericalmethods.eng.usf.edu
Transforming Numerical Methods Education for STEM Undergraduates

1/10/2010


What is Regression?
Given $n$ data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, best fit $y = f(x)$ to the data. The best fit is generally based on minimizing the sum of the squares of the residuals, $S_r$.

The residual at a point is

$$\epsilon_i = y_i - f(x_i)$$

and the sum of the squares of the residuals is

$$S_r = \sum_{i=1}^{n} \left( y_i - f(x_i) \right)^2$$

Figure. Basic model for regression.


Linear Regression - Criterion #1

Given $n$ data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, best fit $y = a_0 + a_1 x$ to the data.

$$\epsilon_i = y_i - a_0 - a_1 x_i$$

Figure. Linear regression of y vs. x data showing the residual at a typical point, $x_i$.

Does minimizing

$$\sum_{i=1}^{n} \epsilon_i$$

work as a criterion, where $\epsilon_i = y_i - (a_0 + a_1 x_i)$?


Example for Criterion#1


Example: Given the data points (2,4), (3,6), (2,6), and (3,8), best fit the data to a straight line using Criterion #1.
Table. Data points
  x     y
 2.0   4.0
 3.0   6.0
 2.0   6.0
 3.0   8.0

Figure. Data points for y vs. x data.

Linear Regression - Criterion #1

Using $y = 4x - 4$ as the regression curve:

Table. Residuals at each point for the regression model $y = 4x - 4$
  x     y    y_predicted   ε = y − y_predicted
 2.0   4.0      4.0             0.0
 3.0   6.0      8.0            −2.0
 2.0   6.0      4.0             2.0
 3.0   8.0      8.0             0.0

$$\sum_{i=1}^{4} \epsilon_i = 0$$

Figure. Regression curve $y = 4x - 4$ for the y vs. x data.


Linear Regression - Criterion #1

Using $y = 6$ as the regression curve:

Table. Residuals at each point for the regression model $y = 6$
  x     y    y_predicted   ε = y − y_predicted
 2.0   4.0      6.0            −2.0
 3.0   6.0      6.0             0.0
 2.0   6.0      6.0             0.0
 3.0   8.0      6.0             2.0

$$\sum_{i=1}^{4} \epsilon_i = 0$$

Figure. Regression curve $y = 6$ for the y vs. x data.


Linear Regression - Criterion #1


$$\sum_{i=1}^{4} \epsilon_i = 0$$

for both regression models, $y = 4x - 4$ and $y = 6$.

The sum of the residuals is as small as possible (zero), yet the regression model is not unique. Hence minimizing the sum of the residuals is a bad criterion.

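The failure of Criterion #1 can be checked numerically. The following sketch is illustrative and not part of the original slides; the candidate models are passed in as plain Python callables:

```python
# Data points from the example: (2,4), (3,6), (2,6), (3,8)
xs = [2.0, 3.0, 2.0, 3.0]
ys = [4.0, 6.0, 6.0, 8.0]

def residual_sum(model, xs, ys):
    """Sum of the residuals y_i - f(x_i) for a candidate model f."""
    return sum(y - model(x) for x, y in zip(xs, ys))

s1 = residual_sum(lambda x: 4.0 * x - 4.0, xs, ys)  # model y = 4x - 4
s2 = residual_sum(lambda x: 6.0, xs, ys)            # model y = 6

print(s1, s2)  # both are 0.0: the criterion cannot distinguish the models
```

Positive and negative residuals cancel, which is exactly why this criterion fails.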

Linear Regression - Criterion #2

Will minimizing

$$\sum_{i=1}^{n} \left| \epsilon_i \right|$$

work any better?

$$\epsilon_i = y_i - a_0 - a_1 x_i$$

Figure. Linear regression of y vs. x data showing the residual at a typical point, $x_i$.


Linear Regression - Criterion #2

Using $y = 4x - 4$ as the regression curve:

Table. Absolute residuals at each point for the regression model $y = 4x - 4$
  x     y    y_predicted   |ε| = |y − y_predicted|
 2.0   4.0      4.0             0.0
 3.0   6.0      8.0             2.0
 2.0   6.0      4.0             2.0
 3.0   8.0      8.0             0.0

$$\sum_{i=1}^{4} \left| \epsilon_i \right| = 4$$

Figure. Regression curve $y = 4x - 4$ for the y vs. x data.


Linear Regression - Criterion #2

Using $y = 6$ as the regression curve:

Table. Absolute residuals at each point for the regression model $y = 6$
  x     y    y_predicted   |ε| = |y − y_predicted|
 2.0   4.0      6.0             2.0
 3.0   6.0      6.0             0.0
 2.0   6.0      6.0             0.0
 3.0   8.0      6.0             2.0

$$\sum_{i=1}^{4} \left| \epsilon_i \right| = 4$$

Figure. Regression curve $y = 6$ for the y vs. x data.


Linear Regression - Criterion #2

$$\sum_{i=1}^{4} \left| \epsilon_i \right| = 4$$

for both regression models, $y = 4x - 4$ and $y = 6$.

The sum of the absolute residuals has been made as small as possible (4), yet the regression model is not unique. Hence minimizing the sum of the absolute values of the residuals is also a bad criterion. Can you find a regression line with unique regression coefficients for which $\sum_{i=1}^{4} \left| \epsilon_i \right| < 4$?

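The same numerical check can be run for the absolute-value criterion. This sketch (illustrative, not from the original slides) confirms both models yield the same sum of 4:

```python
# Data points from the example: (2,4), (3,6), (2,6), (3,8)
xs = [2.0, 3.0, 2.0, 3.0]
ys = [4.0, 6.0, 6.0, 8.0]

def abs_residual_sum(model, xs, ys):
    """Sum of the absolute residuals |y_i - f(x_i)| for a candidate model f."""
    return sum(abs(y - model(x)) for x, y in zip(xs, ys))

s1 = abs_residual_sum(lambda x: 4.0 * x - 4.0, xs, ys)  # model y = 4x - 4
s2 = abs_residual_sum(lambda x: 6.0, xs, ys)            # model y = 6

print(s1, s2)  # both are 4.0: minimal, but the model is still not unique
```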

Least Squares Criterion


The least squares criterion minimizes the sum of the squares of the residuals in the model, and also produces a unique line.

$$S_r = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right)^2$$

Figure. Linear regression of y vs. x data showing the residual at a typical point, $x_i$.

Finding Constants of Linear Model


Minimize the sum of the squares of the residuals:

$$S_r = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right)^2$$

To find $a_0$ and $a_1$, we minimize $S_r$ with respect to $a_0$ and $a_1$:

$$\frac{\partial S_r}{\partial a_0} = 2 \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right)(-1) = 0$$

$$\frac{\partial S_r}{\partial a_1} = 2 \sum_{i=1}^{n} \left( y_i - a_0 - a_1 x_i \right)(-x_i) = 0$$

giving the normal equations

$$n a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$$

$$a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$$


Finding Constants of Linear Model


Solving for $a_0$ and $a_1$ directly yields

$$a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}$$

and

$$a_0 = \frac{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2} = \bar{y} - a_1 \bar{x}$$

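The closed-form expressions above translate directly into code. A minimal sketch (the function name `linear_fit` is my own, not from the slides):

```python
def linear_fit(xs, ys):
    """Least-squares fit of y = a0 + a1*x using the closed-form normal equations."""
    n = len(xs)
    sx = sum(xs)                               # sum of x_i
    sy = sum(ys)                               # sum of y_i
    sxx = sum(x * x for x in xs)               # sum of x_i^2
    sxy = sum(x * y for x, y in zip(xs, ys))   # sum of x_i * y_i
    a1 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a0 = sy / n - a1 * sx / n                  # a0 = ybar - a1*xbar
    return a0, a1

# For the four points used earlier, least squares gives one unique line
a0, a1 = linear_fit([2.0, 3.0, 2.0, 3.0], [4.0, 6.0, 6.0, 8.0])
print(a0, a1)  # 1.0 2.0, i.e. the unique line y = 1 + 2x
```

Unlike Criteria #1 and #2, the coefficients returned here are unique for the given data.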

Example 1
The torque, $T$, needed to turn the torsion spring of a mousetrap through an angle, $\theta$, is given below. Find the constants of the model

$$T = k_1 + k_2 \theta$$

Table. Torque vs. angle for a torsional spring
Angle, θ (radians)   Torque, T (N·m)
0.698132             0.188224
0.959931             0.209138
1.134464             0.230052
1.570796             0.250965
1.919862             0.313707

Figure. Data points for torque vs. angle data.


Example 1 cont.
The following table shows the summations needed to calculate the constants of the regression model.

Table. Tabulation of data for calculation of the needed summations
θ (radians)   T (N·m)    θ² (radians²)   θT (N·m·radians)
0.698132      0.188224   0.487388        0.131405
0.959931      0.209138   0.921468        0.200758
1.134464      0.230052   1.2870          0.260986
1.570796      0.250965   2.4674          0.394215
1.919862      0.313707   3.6859          0.602274
Σ: 6.2831     1.1921     8.8491          1.5896

Using the equations described for $a_0$ and $a_1$ with $n = 5$:

$$k_2 = \frac{n \sum_{i=1}^{5} \theta_i T_i - \sum_{i=1}^{5} \theta_i \sum_{i=1}^{5} T_i}{n \sum_{i=1}^{5} \theta_i^2 - \left( \sum_{i=1}^{5} \theta_i \right)^2} = \frac{5(1.5896) - (6.2831)(1.1921)}{5(8.8491) - (6.2831)^2} = 9.6091 \times 10^{-2} \ \text{N·m/rad}$$

Example 1 cont.
Use the average torque and average angle to calculate $k_1$:

$$\bar{T} = \frac{\sum_{i=1}^{n} T_i}{n} = \frac{1.1921}{5} = 2.3842 \times 10^{-1} \ \text{N·m}$$

$$\bar{\theta} = \frac{\sum_{i=1}^{n} \theta_i}{n} = \frac{6.2831}{5} = 1.2566 \ \text{radians}$$

Using $k_1 = \bar{T} - k_2 \bar{\theta}$:

$$k_1 = 2.3842 \times 10^{-1} - \left( 9.6091 \times 10^{-2} \right)(1.2566) = 1.1767 \times 10^{-1} \ \text{N·m}$$
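As a check, Example 1 can be reproduced numerically. This sketch (not part of the original slides) applies the closed-form formulas to the torque data:

```python
# Angle (radians) and torque (N·m) data from Example 1
theta = [0.698132, 0.959931, 1.134464, 1.570796, 1.919862]
torque = [0.188224, 0.209138, 0.230052, 0.250965, 0.313707]

n = len(theta)
s_th = sum(theta)                                  # sum of theta_i
s_T = sum(torque)                                  # sum of T_i
s_thth = sum(t * t for t in theta)                 # sum of theta_i^2
s_thT = sum(t * q for t, q in zip(theta, torque))  # sum of theta_i * T_i

k2 = (n * s_thT - s_th * s_T) / (n * s_thth - s_th ** 2)  # slope, N·m/rad
k1 = s_T / n - k2 * s_th / n                              # intercept, N·m

print(k1, k2)  # approximately 0.1177 N·m and 0.09609 N·m/rad
```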

Example 1 Results
Using linear regression, a trend line is found from the data

Figure. Linear regression of Torque versus Angle data

Can you find the energy in the spring if it is twisted from 0 to 180 degrees?

Example 2
To find the longitudinal modulus of a composite, the following data are collected. Find the longitudinal modulus, $E$, using the regression model

$$\sigma = E \varepsilon$$

and the sum of the squares of the residuals.

Table. Stress vs. strain data
Strain, ε (%)   Stress, σ (MPa)
0               0
0.183           306
0.36            612
0.5324          917
0.702           1223
0.867           1529
1.0244          1835
1.1774          2140
1.329           2446
1.479           2752
1.5             2767
1.56            2896

Figure. Data points for stress vs. strain data.

Example 2 cont.
The residual at each point is given by $\sigma_i - E \varepsilon_i$. The sum of the squares of the residuals is then

$$S_r = \sum_{i=1}^{n} \left( \sigma_i - E \varepsilon_i \right)^2$$

Differentiate with respect to $E$:

$$\frac{\partial S_r}{\partial E} = \sum_{i=1}^{n} 2 \left( \sigma_i - E \varepsilon_i \right)(-\varepsilon_i) = 0$$

Therefore

$$E = \frac{\sum_{i=1}^{n} \sigma_i \varepsilon_i}{\sum_{i=1}^{n} \varepsilon_i^2}$$

Example 2 cont.
Table. Summation data for the regression model
 i    ε (m/m)       σ (Pa)        ε²            εσ
 1    0.0000        0.0000        0.0000        0.0000
 2    1.8300×10⁻³   3.0600×10⁸    3.3489×10⁻⁶   5.5998×10⁵
 3    3.6000×10⁻³   6.1200×10⁸    1.2960×10⁻⁵   2.2032×10⁶
 4    5.3240×10⁻³   9.1700×10⁸    2.8345×10⁻⁵   4.8821×10⁶
 5    7.0200×10⁻³   1.2230×10⁹    4.9280×10⁻⁵   8.5855×10⁶
 6    8.6700×10⁻³   1.5290×10⁹    7.5169×10⁻⁵   1.3256×10⁷
 7    1.0244×10⁻²   1.8350×10⁹    1.0494×10⁻⁴   1.8798×10⁷
 8    1.1774×10⁻²   2.1400×10⁹    1.3863×10⁻⁴   2.5196×10⁷
 9    1.3290×10⁻²   2.4460×10⁹    1.7662×10⁻⁴   3.2507×10⁷
10    1.4790×10⁻²   2.7520×10⁹    2.1874×10⁻⁴   4.0702×10⁷
11    1.5000×10⁻²   2.7670×10⁹    2.2500×10⁻⁴   4.1505×10⁷
12    1.5600×10⁻²   2.8960×10⁹    2.4336×10⁻⁴   4.5178×10⁷
 Σ                                1.2764×10⁻³   2.3337×10⁸

With

$$\sum_{i=1}^{12} \varepsilon_i^2 = 1.2764 \times 10^{-3} \quad \text{and} \quad \sum_{i=1}^{12} \sigma_i \varepsilon_i = 2.3337 \times 10^{8}$$

using

$$E = \frac{\sum_{i=1}^{12} \sigma_i \varepsilon_i}{\sum_{i=1}^{12} \varepsilon_i^2} = \frac{2.3337 \times 10^{8}}{1.2764 \times 10^{-3}} = 182.84 \ \text{GPa}$$

Example 2 Results
The equation $\sigma = 182.84 \, \varepsilon$ GPa describes the data.

Figure. Linear regression for stress vs. strain data.
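Example 2 can likewise be verified in a few lines. This sketch (illustrative, not from the slides) converts strain from % to m/m and stress from MPa to Pa before applying $E = \sum \sigma_i \varepsilon_i / \sum \varepsilon_i^2$:

```python
strain_pct = [0, 0.183, 0.36, 0.5324, 0.702, 0.867,
              1.0244, 1.1774, 1.329, 1.479, 1.5, 1.56]   # strain in %
stress_mpa = [0, 306, 612, 917, 1223, 1529,
              1835, 2140, 2446, 2752, 2767, 2896]        # stress in MPa

eps = [e / 100.0 for e in strain_pct]   # strain in m/m
sig = [s * 1.0e6 for s in stress_mpa]   # stress in Pa

# E = sum(sigma_i * eps_i) / sum(eps_i^2), from setting dSr/dE = 0
E = sum(e * s for e, s in zip(eps, sig)) / sum(e * e for e in eps)
print(E / 1.0e9)  # approximately 182.84 GPa
```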

Additional Resources
For all resources on this topic, such as digital audiovisual lectures, primers, textbook chapters, multiple-choice tests, worksheets in MATLAB, MATHEMATICA, MathCad and MAPLE, blogs, and related physical problems, please visit http://numericalmethods.eng.usf.edu/topics/linear_regression.html

THE END
