• Many engineering design and analysis problems involve factors that are
interrelated and dependent. E.g., (1) runoff volume, rainfall; (2) evaporation,
temperature, wind speed; (3) peak discharge, drainage area, rainfall intensity;
(4) crop yield, irrigated water, fertilizer.
• Due to the inherent complexity of system behavior and an incomplete understanding
of the processes involved, the relationships among the relevant factors
or variables are established empirically or semi-empirically.
• Regression analysis is a useful and widely used statistical tool dealing with
investigation of the relationship between two or more variables related in a
non-deterministic fashion.
• Suppose a variable Y is related to several variables X1, X2, …, XK, and their
relationship can be expressed, in general, as
Y = g(X1, X2, …, XK)
where g(.) = general expression for a function;
Y = Dependent (or response) variable;
X1, X2,…, XK = Independent (or explanatory) variables.
Correlation
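• When a problem involves two dependent random variables, the degree of
linear dependence between the two can be measured by the correlation
coefficient ρ(X,Y), which is defined as
$$\rho(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sigma_X \, \sigma_Y}$$
where Cov(X,Y) = covariance of X and Y; σX, σY = standard deviations of X and Y.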
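[Figure: example scatter plots. One panel shows variables that are uncorrelated in a linear fashion; another shows variables that are perfectly correlated in a nonlinear fashion but uncorrelated linearly.]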
Calculation of Correlation Coefficient
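• Given a set of n paired sample observations of two random variables
(xi, yi), the sample correlation coefficient (r) can be calculated as
$$r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}$$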
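As a minimal sketch (not part of the original notes), the sample correlation coefficient can be computed as the sum of cross-products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name `sample_correlation` and both small data sets are made up for illustration. The second data set shows that an exact but nonlinear relation can still be linearly uncorrelated.

```python
import math


def sample_correlation(xs, ys):
    """Sample correlation coefficient r of paired observations (xs[i], ys[i])."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, symmetric about x = 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys_linear = [2 * x + 1 for x in xs]   # exact linear relation
ys_parabola = [x ** 2 for x in xs]    # exact, but nonlinear, relation

r_linear = sample_correlation(xs, ys_linear)
r_parabola = sample_correlation(xs, ys_parabola)
print(f"linear: r = {r_linear:.3f}, parabola: r = {r_parabola:.3f}")
```

The parabola case is deliberately symmetric, so the positive and negative cross-products cancel; r measures only linear dependence.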
Auto-correlation
• Consider the following daily stream flows (in 1000 m^3) in June 2001 at Chung Mei
Upper Station (610 ha), located upstream of a river feeding Plover Cove
Reservoir. Determine the 1-day (lag-1) auto-correlation coefficient, i.e., ρ(Qt, Qt+1).
Day (t)   Flow Q(t)    Day (t)   Flow Q(t)    Day (t)   Flow Q(t)
   1         8.35         11      313.89         21       20.06
   2         6.78         12      480.88         22       17.52
   3         6.32         13      151.28         23      116.13
   4        17.36         14       83.92         24       68.25
   5       191.62         15       44.58         25      280.22
   6        82.33         16       36.58         26      347.53
   7       524.45         17       33.65         27      771.30
   8       196.77         18       26.39         28      124.20
   9       785.09         19       22.98         29       58.00
  10       562.05         20       21.92         30       44.08
(Flows Q(t) in 1000 m^3.)
[Figures: (1) daily flow Q(t), in 1000 m^3, plotted against day; (2) scatter plot of successive daily flows, with Q(t), in 1000 m^3, on the horizontal axis; (3) auto-correlation coefficient plotted against time lag (days) for lags 1 to 5.]
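The auto-correlation exercise can be sketched in Python as follows. This is one possible approach, not necessarily the one used in the original slides: it assumes the lag-k coefficient is estimated by treating the overlapping pairs (Q(t), Q(t+k)) as a paired sample and applying the ordinary sample correlation formula to them. Other estimators, for example ones using the overall mean of the full series, give slightly different values. The function names are illustrative; the flows are the June 2001 values tabulated above.

```python
import math

# Daily flows Q(t), in 1000 m^3, for days 1..30 (from the table).
flows = [
    8.35, 6.78, 6.32, 17.36, 191.62, 82.33, 524.45, 196.77, 785.09, 562.05,
    313.89, 480.88, 151.28, 83.92, 44.58, 36.58, 33.65, 26.39, 22.98, 21.92,
    20.06, 17.52, 116.13, 68.25, 280.22, 347.53, 771.30, 124.20, 58.00, 44.08,
]


def sample_correlation(xs, ys):
    """Sample correlation coefficient r of paired observations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def lag_correlation(series, lag):
    """Correlation between Q(t) and Q(t+lag) over the overlapping pairs."""
    return sample_correlation(series[:-lag], series[lag:])


# Lags 1 to 5 days, the range shown on the auto-correlation plot.
r_by_lag = {lag: lag_correlation(flows, lag) for lag in range(1, 6)}
for lag, r in r_by_lag.items():
    print(f"lag {lag}: r = {r:.3f}")
```

With 30 daily values there are 29 pairs at lag 1 and only 25 at lag 5, so the estimates at longer lags rest on fewer data.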
Regression Models
• Due to the presence of uncertainties, a deterministic functional
relationship is generally not appropriate or realistic.
• The deterministic model form can be modified to account for
uncertainties in the model as
Y = g(X1, X2, …, XK) + ε
where ε = model error term with E(ε) = 0 and Var(ε) = σ².
[Figure: fitted regression line ŷ = β0 + β1 x, with intercept β0 and slope β1. For an observation (xi, yi), the error (residual) is ei = yi − ŷi.]
Least Squares Criterion
• What are the values of β0 and β1 such that the resulting line “best” fits
the data points?
• The least squares (LS) criterion states that the sum of the squares of
errors (or residuals, deviations) is minimum. Mathematically, the LS
criterion can be written as:
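$$\min_{\beta_0,\,\beta_1} \; D = \sum_{i=1}^{n} \varepsilon_i^2 = \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)^2$$
Setting the partial derivatives of D to zero:
$$\frac{\partial D}{\partial \beta_0} = 2 \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)(-1) = 0 \;\;\Rightarrow\;\; \sum_{i=1}^{n} y_i - n\beta_0 - \beta_1 \sum_{i=1}^{n} x_i = 0$$
$$\frac{\partial D}{\partial \beta_1} = 2 \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i \right)(-x_i) = 0 \;\;\Rightarrow\;\; \sum_{i=1}^{n} x_i y_i - \beta_0 \sum_{i=1}^{n} x_i - \beta_1 \sum_{i=1}^{n} x_i^2 = 0$$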
• Normal equations:
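$$\begin{bmatrix} n & \sum_{i=1}^{n} x_i \\ \sum_{i=1}^{n} x_i & \sum_{i=1}^{n} x_i^2 \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_i y_i \end{bmatrix}$$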
LS Solution (2 Unknowns)
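$$\hat{\beta}_0 = \frac{\sum_{i=1}^{n} y_i - \hat{\beta}_1 \sum_{i=1}^{n} x_i}{n} = \bar{y} - \hat{\beta}_1 \bar{x}$$
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i y_i - \frac{1}{n} \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i^2 - \frac{1}{n} \left( \sum_{i=1}^{n} x_i \right)^2} = \frac{\sum_{i=1}^{n} x_i y_i - n \bar{x} \bar{y}}{\sum_{i=1}^{n} x_i^2 - n \bar{x}^2}$$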
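The two-unknown LS solution can be sketched in a few lines of Python. The data set and the function names `fit_line` and `sum_sq_errors` are hypothetical, chosen only for illustration. The sketch computes the slope and intercept from the sums Σx, Σy, Σxy and Σx², then checks that both normal equations are satisfied and that moving away from the solution increases the sum of squared errors D.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1*x."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b1 = (sxy - sx * sy / n) / (sxx - sx * sx / n)
    b0 = (sy - b1 * sx) / n
    return b0, b1


def sum_sq_errors(xs, ys, b0, b1):
    """D = sum of squared residuals for the line y = b0 + b1*x."""
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))


# Hypothetical observations (not from the original notes).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)

# Left-hand sides of the two normal equations; both should be ~0.
res0 = sum(ys) - len(xs) * b0 - b1 * sum(xs)
res1 = (sum(x * y for x, y in zip(xs, ys))
        - b0 * sum(xs) - b1 * sum(x * x for x in xs))

d_min = sum_sq_errors(xs, ys, b0, b1)
d_perturbed = sum_sq_errors(xs, ys, b0 + 0.1, b1 - 0.05)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, D = {d_min:.4f}")
```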
Fitting a Polynomial Eq. By LS Method
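$$y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \cdots + \beta_k x_i^k + \varepsilon_i, \quad i = 1, 2, \ldots, n$$
LS criterion:
$$\min_{\beta_0,\,\beta_1,\,\ldots,\,\beta_k} \; D = \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_i - \beta_2 x_i^2 - \cdots - \beta_k x_i^k \right)^2$$
Set $\partial D / \partial \beta_j = 0$, for $j = 0, 1, 2, \ldots, k$
Normal equations are:
$$\begin{bmatrix} n & \sum x_i & \cdots & \sum x_i^k \\ \sum x_i & \sum x_i^2 & \cdots & \sum x_i^{k+1} \\ \vdots & \vdots & \ddots & \vdots \\ \sum x_i^k & \sum x_i^{k+1} & \cdots & \sum x_i^{2k} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_i y_i \\ \vdots \\ \sum x_i^k y_i \end{bmatrix}$$
(all sums run over i = 1, …, n)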
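A minimal sketch of the polynomial fit, assuming NumPy is available. It builds the normal-equation matrix, whose entry in row j and column m is Σ x_i^(j+m), together with the right-hand side Σ x_i^j y_i, and solves the resulting linear system. The function name `polyfit_normal_equations` and the noise-free test data are made up for illustration; `numpy.polyfit` is used only as an independent cross-check.

```python
import numpy as np


def polyfit_normal_equations(x, y, k):
    """Return [b0, b1, ..., bk] for a degree-k polynomial fitted by LS."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # A[j, m] = sum_i x_i^(j+m);  b[j] = sum_i x_i^j * y_i
    A = np.array([[np.sum(x ** (j + m)) for m in range(k + 1)]
                  for j in range(k + 1)])
    b = np.array([np.sum(y * x ** j) for j in range(k + 1)])
    return np.linalg.solve(A, b)


# Hypothetical noise-free data generated from y = 1 + 2x + 3x^2.
x = np.arange(6, dtype=float)
y = 1.0 + 2.0 * x + 3.0 * x ** 2

beta = polyfit_normal_equations(x, y, k=2)
# numpy.polyfit returns the highest power first, so reverse it to compare.
beta_check = np.polyfit(x, y, 2)[::-1]
print(beta)
```

For high polynomial degrees the normal-equation matrix becomes badly conditioned, so this direct approach is best kept to small k.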
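For a linear model with k independent variables,
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + \varepsilon_i, \quad i = 1, 2, \ldots, n$$
the sum of squared errors D is minimized with respect to β0, β1, …, βk.
Set $\partial D / \partial \beta_j = 0$, for $j = 0, 1, 2, \ldots, k$
Normal equations:
$$\begin{bmatrix} n & \sum x_{i1} & \cdots & \sum x_{ik} \\ \sum x_{i1} & \sum x_{i1}^2 & \cdots & \sum x_{i1} x_{ik} \\ \vdots & \vdots & \ddots & \vdots \\ \sum x_{ik} & \sum x_{ik} x_{i1} & \cdots & \sum x_{ik}^2 \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix} = \begin{bmatrix} \sum y_i \\ \sum x_{i1} y_i \\ \vdots \\ \sum x_{ik} y_i \end{bmatrix}$$
(all sums run over i = 1, …, n)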
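The following sketch, assuming NumPy is available, assembles the normal equations for several independent variables term by term from sums of products. It then confirms numerically that the coefficient matrix equals X'X and the right-hand side equals X'y, where X is the data matrix with a leading column of ones. The data values are hypothetical and serve only as an illustration.

```python
import numpy as np

# Hypothetical data: n = 5 observations of k = 2 independent variables.
x = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 6.0]])
y = np.array([3.0, 4.0, 8.0, 9.0, 13.0])
n, k = x.shape

# Build the normal equations from sums, one entry at a time.
A = np.empty((k + 1, k + 1))
b = np.empty(k + 1)
A[0, 0] = n
b[0] = y.sum()
for j in range(1, k + 1):
    A[0, j] = A[j, 0] = x[:, j - 1].sum()
    b[j] = (x[:, j - 1] * y).sum()
    for m in range(1, k + 1):
        A[j, m] = (x[:, j - 1] * x[:, m - 1]).sum()

# The same quantities in matrix form.
X = np.column_stack([np.ones(n), x])
same_matrix = np.allclose(A, X.T @ X)
same_rhs = np.allclose(b, X.T @ y)

beta_hat = np.linalg.solve(A, b)
print(same_matrix, same_rhs, beta_hat)
```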
Matrix Form of Multiple Regression by LS
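$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} = \begin{bmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1k} \\ 1 & x_{21} & x_{22} & \cdots & x_{2k} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{nk} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_k \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix}$$
(Note: $x_{ij}$ = the i-th observation of the j-th independent variable)
or $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$ in short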
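LS criterion is:
$$\min \; D = \sum_{i=1}^{n} \varepsilon_i^2 = \boldsymbol{\varepsilon}'\boldsymbol{\varepsilon} = (\mathbf{y} - \mathbf{X}\boldsymbol{\beta})'(\mathbf{y} - \mathbf{X}\boldsymbol{\beta})$$
Set $\partial D / \partial \boldsymbol{\beta} = \mathbf{0}$, which results in:
$$\mathbf{X}'(\mathbf{y} - \mathbf{X}\hat{\boldsymbol{\beta}}) = \mathbf{0}$$
The LS solutions are:
$$\hat{\boldsymbol{\beta}} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$$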
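A short NumPy sketch of the matrix solution follows. It solves the linear system (X'X)β = X'y with `numpy.linalg.solve` rather than forming the inverse explicitly, which gives the same estimate and is the usual numerical practice. The data are hypothetical and noise-free, generated from y = 1 + 2·x1 − 3·x2, so the known coefficients should be recovered. `numpy.linalg.lstsq` is included only as an independent cross-check.

```python
import numpy as np

# Hypothetical, noise-free data: y = 1 + 2*x1 - 3*x2.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = 1.0 + 2.0 * x1 - 3.0 * x2

# Data matrix X: a column of ones (for beta_0) followed by the variables.
X = np.column_stack([np.ones_like(x1), x1, x2])

# LS estimate from the normal equations X'X beta = X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check with NumPy's least-squares routine.
beta_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]

print(beta_hat)
```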
Measure of Goodness-of-Fit
R² = Coefficient of Determination
$$R^2 = 1 - \frac{\sum_{i=1}^{n} \hat{\varepsilon}_i^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
where $\hat{\varepsilon}_i = y_i - \hat{y}_i$ is the i-th residual.
= 1 − (% of variation in the dependent variable, y, unexplained by the regression equation);
= % of variation in the dependent variable, y, explained by the regression equation.
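As an illustration, the coefficient of determination can be computed from the residual sum of squares and the total sum of squares about the mean. The sketch below fits a straight line by least squares to a small made-up data set and then evaluates R². The names `fit_line` and `r_squared` and the data values are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    sxx = sum(x * x for x in xs) - n * x_bar ** 2
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def r_squared(ys, y_hats):
    """R^2 = 1 - (sum of squared residuals) / (total sum of squares)."""
    y_bar = sum(ys) / len(ys)
    sse = sum((y - f) ** 2 for y, f in zip(ys, y_hats))
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - sse / sst


# Hypothetical observations (not from the original notes).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
y_hats = [b0 + b1 * x for x in xs]
r2 = r_squared(ys, y_hats)
print(f"R^2 = {r2:.4f}")
```

A perfect fit gives R² = 1, while predicting the mean of y for every observation gives R² = 0.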
Example 1 (LS Method)
LS Example
LS Example (Matrix Approach)
LS Example (by Minitab w/ β0)
LS Example (by Minitab w/o β0)
LS Example (Output Plots)