
ChE 707 Lecture Notes by A B Tengkiat

Data Fitting / Curve Fitting

Curve Fitting

Data fitting can be done by:
  (a) Least squares regression
  (b) Linear interpolation
  (c) Curvilinear interpolation

Curve Fitting

Curve fitting can involve:
  Interpolation, where an exact fit to the data is required; or
  Smoothing, in which a "smooth" function is constructed that approximately fits the data.

Curve fitting is the process of constructing a curve, or mathematical function, that best fits the data points. It is closely related to regression analysis.

Curve Fitting

Fitted curves can be used:
  as an aid for data visualization
  to infer values of a function where no data are available
  to summarize the relationship among two or more variables
  for extrapolation, though subject to a greater degree of uncertainty, since the result may reflect the method used to construct the curve more than the observed data

Curve Fitting

Curve fitting usually:
  means trying to find the curve that minimizes the vertical displacement of the points from the curve
  can be a smoothing process, since the number of fitted coefficients is typically less than the number of data points
  relaxes the constraint that the interpolant pass exactly through the data points, but requires it to approach the data points as closely as possible

Curve Fitting

Curve fitting requires parameterizing the potential interpolants and having some way of measuring the error; in the simplest case this leads to least squares approximation.

Curve Fitting

[Figure: polynomial fits to a data set — original curve (dotted line), 1st degree (red), 2nd degree (green), 3rd degree (orange), 4th degree (blue)]

Curve Fitting versus Smoothing

Curve Fitting
  often involves the use of an explicit function form
  concentrates on achieving as close a match with the data values as possible

Smoothing
  aims to give a general idea of relatively slow changes of value
  data are changed to "smoothed" values
  often has an associated tuning parameter used to control the extent of smoothing

Curve Fitting Application

Types of application when fitting experimental data:

Trend Analysis
  the process of using the pattern of the data to make a prediction or forecast
  for high-precision data, interpolation can be used
  for imprecise data, least squares regression should be used (a short sketch contrasting the two follows)
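To make the distinction concrete, here is a minimal sketch (Python with NumPy assumed; the data are hypothetical) that interpolates through precise points and fits a least squares line through noisy ones.

```python
import numpy as np

# Precise data: pass exactly through the points (interpolation)
x_precise = np.array([0.0, 1.0, 2.0, 3.0])
y_precise = np.array([1.0, 2.7, 7.4, 20.1])
y_at_1p5 = np.interp(1.5, x_precise, y_precise)   # exact-fit trend estimate

# Imprecise (noisy) data: do not chase every point, regress instead
x_noisy = np.linspace(0.0, 3.0, 20)
y_noisy = 2.0 * x_noisy + 1.0 + np.random.normal(scale=0.5, size=x_noisy.size)
slope, intercept = np.polyfit(x_noisy, y_noisy, deg=1)   # least squares line
trend_at_1p5 = slope * 1.5 + intercept

print(y_at_1p5, trend_at_1p5)
```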

Curve Fitting Application

Hypothesis Testing
  determination of the best-fit coefficients of a mathematical model
  if a mathematical model already exists, the adequacy of the model is tested
  selection of the best model from alternative models

Approximate Fit versus Exact Fit

Even if an exact match exists, it does not necessarily follow that we can find it. Depending on the algorithm used, the following may be encountered:
  a divergent case, where the exact fit cannot be calculated
  too much computational time required to find the solution

Either way, you might end up having to accept an approximate solution.

Approximate Fit versus Exact Fit

Runge's phenomenon: high order polynomials can tend to be highly oscillatory or lumpy.

An approximate fit may be preferred because it averages out questionable data points in a sample rather than distorting the curve to fit them exactly.

Approximate Fit versus Exact Fit

Low order polynomials tend to be smooth. The maximum number of inflection points possible in a polynomial curve is n − 2, where n is the order of the polynomial equation.

Criterion for Best Fit

The best fit corresponds to a minimization of the errors.

Sum of errors (residuals): results in a minimum value due to the cancellation of positive and negative errors.
  Σ ej = Σ (yj − a0 − a1 xj)

Criterion for Best Fit

Sum of the absolute errors (residuals): can have multiple solutions.
  Σ |ej| = Σ |yj − a0 − a1 xj|

Criterion for Best Fit

Minimax criterion: minimizes the maximum distance of an individual point from the curve; fails in the presence of outlier/s.
  min( max |ej| ) = min( max |yj − a0 − a1 xj| )

Criterion for Best Fit

Sum of squares of the residuals:
  Σ ej² = Σ (yj − a0 − a1 xj)²
  Best option, since it overcomes the weaknesses of the previous three criteria
  Provides a unique solution

A short sketch comparing the four criteria follows.
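A minimal sketch (hypothetical data and trial coefficients, NumPy assumed) that evaluates the four candidate criteria for a straight-line model y = a0 + a1x:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])
a0, a1 = 0.1, 2.0                      # trial coefficients for y = a0 + a1*x

e = y - (a0 + a1 * x)                  # residuals

sum_errors = e.sum()                   # positive and negative errors cancel
sum_abs_errors = np.abs(e).sum()       # can admit multiple solutions
minimax = np.abs(e).max()              # sensitive to outliers
sse = (e ** 2).sum()                   # least squares: unique minimizer

print(sum_errors, sum_abs_errors, minimax, sse)
```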


Regression Analysis

Regression analysis refers to any approach to modeling the relationship between one or more variables denoted y and one or more variables denoted x, such that the model depends on unknown parameters to be estimated from the data.

The least squares method is used to determine the parameters of the model equation.

A model is called a linear model when the relationship is linear in the unknown parameters.

Regression Analysis

Most applications fall into two broad categories:

Prediction or Forecasting: fitting data to a predictive model; after such a model is developed, it is used to predict values that are not among the given set of data.

Correlation Testing: evaluation and/or quantification of the strength of the relationship between the dependent variable y and the independent variable/s x1, x2, ..., xn. The relationship may turn out to be nonexistent, redundant, or weak.

Regression Analysis

The first step is to plot and visually inspect the data to ascertain what form of model equation will apply, i.e.
  Linear model
  Nonlinear model

Regression Analysis

Given a model of the form
  y = Xβ + ε
where
  y = [y1, y2, ..., yn]ᵀ is the vector of observations
  X is the matrix of regressors, with row j = [xj1, xj2, ..., xjp]
  β = [β1, β2, ..., βp]ᵀ is the vector of parameters
  ε = [ε1, ε2, ..., εn]ᵀ is the vector of errors

Model equations contain
  Variable/s
  Parameter/s

A short sketch of estimating β by least squares follows.
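A minimal sketch (NumPy assumed, hypothetical data) of estimating β for this linear model by ordinary least squares:

```python
import numpy as np

# Hypothetical data: n = 5 observations, p = 2 regressors (intercept + x)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

X = np.column_stack([np.ones_like(x), x])   # design matrix, one row per observation

# Least squares estimate: beta minimizes ||y - X beta||^2
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ beta                    # estimates of the error term
print(beta, residuals)
```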

Regression Analysis

Regressand, y
  Also called
    Dependent variable
    Endogenous variable
    Response variable
    Measured variable
  Variable/s that is/are caused by, or directly influenced by, the other variables

Regression Analysis

Regressor, x
  Also called
    Exogenous variable
    Explanatory variable
    Covariate
    Input variable
    Predictor variable
    Independent variable

Regression Analysis

Regressor, x
  Can be a
    Constant (called the intercept)
    Linear term
    Nonlinear term

Regression Analysis

Parameter vector, β
  Is called the
    Effect
    Regression coefficients
  Statistical inference on regression focuses on β

Regression Analysis

Error term, ε
  Also called the disturbance term or noise
  Captures all factors other than the regressors x that influence the dependent variable y
  The relationship between the error term and the regressors, for example whether they are correlated, is a crucial consideration in formulating a linear regression model, as it determines the method to use for estimation

Regression Statistics

Sum of Squares of Errors (SSE)
  Also called
    Sum of Squared Residuals (SSR)
    Error Sum of Squares (ESS)
    Residual Sum of Squares (RSS)
  Sum of the squared differences between actual and predicted values

  SSE = Σ (y − ŷ)² = Σ [y − f(x)]²

Regression Statistics

Total Sum of Squares (TSS)
  Variance of the dependent variable around its mean
  Tells how much of the initial variation in the sample is explained by the regression

  TSS = Σ (y − ȳ)² = Σ y² − (Σ y)²/n

Regression Statistics

Standard Error of the Estimate (Sy/x)
  Also called the Standard Error of the Regression
  Quantifies the spread of the data around the regression line

  Sy/x = sqrt[ SSE / (n − (m + 1)) ] = sqrt[ Σ (y − ŷ)² / (n − (m + 1)) ]

  where n = number of data points
        m = number of parameters to be estimated

Regression Statistics

Coefficient of Determination (R²)
  Indicates the goodness of fit of the regression
  Ratio of the explained variance to the total variance, i.e. what fraction of the variation in the data is accounted for by the correlation

  R² = 1 − SSE/TSS = 1 − Σ (y − ŷ)² / Σ (y − ȳ)²

  Value is between 0 and 1
  Biased parameter, since it never decreases when additional regressors are added, even if they are irrelevant

Regression Statistics

Correlation Coefficient (R)
  Measures linearity
  Value is between 0 and 1

  R = sqrt( 1 − SSE/TSS ) = sqrt[ 1 − Σ (y − ŷ)² / Σ (y − ȳ)² ]

Regression Statistics

Adjusted R² (R²adj)
  Slightly modified version of R², designed to penalize an excess number of regressors that do not add to the explanatory power of the regression

  R²adj = 1 − (n − 1)/(n − p) · (1 − R²) = 1 − (n − 1) SSE / [(n − p) TSS] = 1 − (n − 1) Σ (y − ŷ)² / [(n − p) Σ (y − ȳ)²]

  where n = number of data points
        p = number of independent variables

Regression Statistics

Adjusted R² (continued)
  Always smaller in magnitude than R², because it accounts for the number of parameters being estimated
  Can even be negative for poorly fitting models

A short sketch computing these statistics follows.
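A minimal sketch (NumPy assumed, hypothetical data) computing SSE, TSS, the standard error of the estimate, R², and adjusted R² for a straight-line fit; p is counted here as the number of fitted parameters (intercept plus slope), which matches how spreadsheet tools report adjusted R²:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.8, 6.3, 7.9, 10.2, 11.8])

# Straight-line least squares fit: y = a0 + a1*x  (m = 1 regressor)
a1, a0 = np.polyfit(x, y, deg=1)
y_hat = a0 + a1 * x

n, m, p = len(y), 1, 2                         # p = fitted parameters (a0 and a1)
SSE = np.sum((y - y_hat) ** 2)                 # residual sum of squares
TSS = np.sum((y - y.mean()) ** 2)              # total sum of squares
s_yx = np.sqrt(SSE / (n - (m + 1)))            # standard error of the estimate
r2 = 1.0 - SSE / TSS                           # coefficient of determination
r2_adj = 1.0 - (n - 1) / (n - p) * (1.0 - r2)  # adjusted R^2 (penalizes extra regressors)

print(SSE, TSS, s_yx, r2, r2_adj)
```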

Regression Statistics

t Statistic
  Tests the existence of coefficient/s
  Tests equality of means, i.e. of the actual and predicted data
  Large values indicate that the hypothesis that the coefficient is zero can be rejected, i.e. the corresponding coefficient is nonzero
  The physical sense of a coefficient should be checked before accepting or rejecting it

Regression Statistics

p Value
  Expresses the result of the hypothesis test as a significance level using the t distribution
  Values smaller than 0.05 are taken as evidence that the coefficient is nonzero at the 95% confidence level

Regression Statistics

F Statistic
  Tests goodness or lack of fit
  Test of variance, i.e. between the actual data and the predictions
  Large values indicate a good fit between actual and predicted data

  F = explained variance / unexplained variance

Regression Statistics

Significance F
  Is the equivalent, for the F statistic, of the p-value of the t statistic
  Expresses the result of the hypothesis test as a significance level using the F distribution
  Values smaller than 0.05 are taken as evidence that the correlation is good

A short sketch of computing these significance levels follows.
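A minimal sketch (SciPy assumed, hypothetical numbers) of converting a t statistic and an F statistic into the p-value and Significance F reported by spreadsheet regression tools:

```python
from scipy import stats

n = 6          # data points
k = 3          # fitted parameters (e.g. intercept, x, x^2)

# Two-sided p-value for a coefficient's t statistic with n - k degrees of freedom
t_stat = 2.48
p_value = 2.0 * stats.t.sf(abs(t_stat), df=n - k)

# Significance F for the regression F statistic with (k - 1, n - k) degrees of freedom
f_stat = 1004.8
significance_f = stats.f.sf(f_stat, dfn=k - 1, dfd=n - k)

print(p_value, significance_f)
```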

Linear Regression

Linear regression was the first type of regression analysis to be studied rigorously, and it is used extensively in practical applications.

Linear Regression

It is easier to fit linear models than nonlinear models, and the statistical properties of the resulting estimators are easier to determine.

Linear equations or models are usually fitted through the least squares approach.

Even though the terms "least squares" and "linear model" are closely linked, they are not synonymous: the least squares approach can also be used to fit nonlinear models.

Example 1

Fit the data to a quadratic equation, i.e. y = a0 + a1x + a2x², by transformation.

  x     y
  0      2.1
  1      7.7
  2     13.6
  3     27.2
  4     40.9
  5     61.1

Example 1

Using the quadratic equation: y = a0 + a1x + a2x²

Regression Statistics
  Multiple R           0.9993
  R Square             0.9985
  Adjusted R Square    0.9975
  Standard Error       1.1175
  Observations         6

Fit is very good since the values are close to 1; the drop from R² to adjusted R² is negligible, indicating the validity of the variables.

ANOVA
              df        SS          MS          F       Significance F
  Regression   2   2509.647    1254.823    1004.777     5.76E-05
  Residual     3      3.746571    1.248857
  Total        5   2513.393

Significance F << 0.05, indicating good prediction.

Example 1

P-values > 0.05 indicate that the existence of a term is questionable; possible removal of the constant or variable.

Coefficients
              Coefficient  Std Error   t Stat   P-value  Lower 95%  Upper 95%  P-value Test
  Intercept      2.4786      1.0128    2.4471    0.0919    -0.7447     5.7019   Failed
  x              2.3593      0.9527    2.4764    0.0896    -0.6727     5.3912   Failed
  x²             1.8607      0.1829   10.1735    0.0020     1.2787     2.4428   Passed

RESIDUAL OUTPUT                                        PROBABILITY OUTPUT
  Observation  Predicted y  Residual  Std Residual     Percentile       y
       1          2.4786    -0.3786     -0.4373           8.33       2.1000
       2          6.6986     1.0014      1.1569          25.00       7.7000
       3         14.6400    -1.0400     -1.2014          41.67      13.6000
       4         26.3029     0.8971      1.0364          58.33      27.2000
       5         41.6871    -0.7871     -0.9093          75.00      40.9000
       6         60.7929     0.3071      0.3548          91.67      61.1000

The residual is insignificant for some data points.

Fit is good but statistically not sound.
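As a rough cross-check, a sketch like the one below (statsmodels assumed available) reproduces this kind of regression summary for the transformed quadratic fit; the last digits may differ slightly from the tabulated output.

```python
import numpy as np
import statsmodels.api as sm

x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 7.7, 13.6, 27.2, 40.9, 61.1])

# Transformation: treat x and x^2 as two linear regressors plus an intercept
X = sm.add_constant(np.column_stack([x, x ** 2]))
fit = sm.OLS(y, X).fit()

print(fit.params)                      # a0, a1, a2
print(fit.rsquared, fit.rsquared_adj)  # R^2 and adjusted R^2
print(fit.pvalues)                     # t-test p-values for each coefficient
print(fit.f_pvalue)                    # Significance F
```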

Example 1

Removing the constant from the quadratic equation: y = a1x + a2x²

Regression Statistics
  Multiple R           0.9991
  R Square             0.9982
  Adjusted R Square    0.7478
  Standard Error       1.6752
  Observations         6

Fit is very good since the values are close to 1; the drop in adjusted R² is due to force-fitting a quadratic equation without a constant.

ANOVA
              df        SS          MS          F       Significance F
  Regression   2   6383.295    3191.647    1137.296     4.78E-05
  Residual     4     11.22539     2.806348
  Total        6   6394.52

Significance F << 0.05, indicating good prediction.

Example 1

P-values < 0.05 indicate that the existence of the variables is statistically valid.

Coefficients
        Coefficient  Std Error  t Stat   P-value  Lower 95%  Upper 95%
  x        4.1374      0.9237   4.4791    0.0110     1.5728     6.7020
  x²       1.5913      0.2189   7.2682    0.0019     0.9834     2.1992

RESIDUAL OUTPUT                                        PROBABILITY OUTPUT
  Observation  Predicted y  Residual  Std Residual     Percentile       y
       1          0.0000     2.1000      1.5353           8.33       2.1000
       2          5.7287     1.9713      1.4412          25.00       7.7000
       3         14.6400    -1.0400     -0.7603          41.67      13.6000
       4         26.7339     0.4661      0.3408          58.33      27.2000
       5         42.0104    -1.1104     -0.8118          75.00      40.9000
       6         60.4696     0.6304      0.4609          91.67      61.1000

The residual is significant for some data points.

Fit is statistically sound.

Example 1

Which to use?
  Equation 1: y = a0 + a1x + a2x²
  Equation 2: y = a1x + a2x²

                       Eq 1         Eq 2
  R²                  0.9985       0.9982
  R²adj               0.9975       0.7478
  Standard Error      1.1175       1.6752
  Significance F    5.8 × 10⁻⁵   4.8 × 10⁻⁵

Eq 1 provides the better fit but is statistically unsound.

Example 2

Given the set of data, use linear and quadratic equations to fit the data set.

  ppm Pb⁺²     Q (mg Pb⁺²/g)
     0.000          0
     0.398          8.86
     2.805         16.78
    28.620         24.36
    67.451         27.51
   104.974         28.23
   139.615         32.57

[Figure: Q (mg Pb/g moss) versus Ceq (ppm)]

Example 2

Solution
Using the linear equation: Q = a + b (ppm Pb⁺²)

Regression Statistics
  Multiple R           0.8705
  R Square             0.7577
  Adjusted R Square    0.6971
  Standard Error       4.6794
  Observations         6

Fit is not good since the values are far from 1; the drop from R² to adjusted R² is not high, so the variable used is suitable.

ANOVA
              df        SS          MS          F       Significance F
  Regression   1   273.8812    273.8812    12.50776     0.024086
  Residual     4    87.58759    21.8969
  Total        5   361.4688

Passed, with Significance F < 0.05, hence good variance prediction.

Example 2

P-values < 0.05 indicate the existence of the variable and the constant.

Coefficients
              Coefficient  Std Error  t Stat   P-value  Lower 95%  Upper 95%
  Intercept     15.4286      2.8450   5.4231    0.0056     7.5296    23.3276
  ppm Pb⁺²       0.1301      0.0368   3.5366    0.0241     0.0280     0.2322

RESIDUAL OUTPUT                                        PROBABILITY OUTPUT
  Observation  Predicted Q  Residual  Std Residual     Percentile       Q
       1         15.4803    -6.6162     -1.5808           8.33       8.8641
       2         15.7936     0.9815      0.2345          25.00      16.7751
       3         19.1519     5.2058      1.2438          41.67      24.3577
       4         24.2038     3.3057      0.7898          58.33      27.5095
       5         29.0855    -0.8535     -0.2039          75.00      28.2320
       6         33.5921    -2.0234     -0.4834          91.67      31.5688

Fit is not good; the residuals are significant.

Example 2

Using the quadratic equation: Q = a + b (ppm Pb⁺²) + c (ppm Pb⁺²)²

Regression Statistics
  Multiple R           0.9314
  R Square             0.8676
  Adjusted R Square    0.7793
  Standard Error       3.9946
  Observations         6

Fit is not good since R² is far from 1; the drop from R² to adjusted R² is high, indicating that some variable/s is/are unnecessary.

ANOVA
              df        SS          MS         F       Significance F
  Regression   2   313.5979    156.7989    9.826348    0.048195
  Residual     3    47.87097    15.95699
  Total        5   361.4688

Significance F is near the 0.05 border, hence potentially poor prediction.

Example 2

P-values > 0.05 indicate that the existence of a variable is questionable; hence there is no validity in increasing the order of the polynomial.

Coefficients
              Coefficient  Std Error  t Stat   P-value  Lower 95%  Upper 95%  P-value Test
  Intercept     13.2241      2.8019   4.7196    0.0180     4.3070    22.1411   Passed
  ppm            0.3089      0.1176   2.6265    0.0786    -0.0654     0.6832   Failed
  ppm²          -0.0013      0.0009  -1.5777    0.2127    -0.0041     0.0014   Failed

RESIDUAL OUTPUT                                        PROBABILITY OUTPUT
  Observation  Predicted Q  Residual  Std Residual     Percentile       Q
       1         13.3467    -4.4826     -1.4487           8.33       8.8641
       2         14.0801     2.6950      0.8710          25.00      16.7751
       3         20.9637     3.3941      1.0969          41.67      24.3577
       4         27.9426    -0.4331     -0.1400          58.33      27.5095
       5         30.8335    -2.6015     -0.8408          75.00      28.2320
       6         30.1407     1.4281      0.4615          91.67      31.5688

Fit is not good; the residuals are significant.

Example 3

Correlate temperature as a function of reactor length using a polynomial equation.

[Figure: Reactor Temperature (K) versus Reactor Length (cm)]

Example 3

Solution
Using the quadratic equation: T = a + b L + c L²

Regression Statistics
  Multiple R           0.9924
  R Square             0.9848
  Adjusted R Square    0.9835
  Standard Error       13.4849
  Observations         26

Fit is good since the values are close to 1; the drop from R² to adjusted R² is minimal, so the variables used are suitable.

ANOVA
              df       SS        MS       F     Significance F
  Regression   2   271559    135780    747     1.2E-21
  Residual    23     4182       182
  Total       25   275741

Passed: Significance F << 0.05, hence good variance prediction.

Example 3

P-values < 0.05 indicate the existence of a variable.

Coefficients
              Coefficient  Std Error   t Stat     P-value     Lower 95%  Upper 95%  P-value Test
  Intercept     -4.5489     25.1386    -0.1810    0.8580       -56.5521    47.4543   Failed
  L             92.2626      2.3968    38.4947    2.18E-22       87.3045    97.2207   Passed
  L²            -1.9796      0.0527   -37.5762    3.76E-22       -2.0886    -1.8706   Passed

Remove the intercept to improve the fit.

Example 3

Using the quadratic equation without intercept: T = b L + c L²

Regression Statistics
  Multiple R           0.9999
  R Square             0.9998
  Adjusted R Square    0.9582
  Standard Error       13.2103
  Observations         26

Fit is good since the values are almost 1; the drop from R² to adjusted R² is higher than in the previous fit but still acceptable, hence the variables used are suitable.

ANOVA
              df        SS          MS        F      Significance F
  Regression   2   24125754   12062877   69123     3.48E-44
  Residual    24       4188        175
  Total       26   24129942

Passed: Significance F is almost nil, hence good variance prediction.

Example 3

P-values are almost nil, indicating the existence of the variables.

Coefficients
        Coefficient  Std Error    t Stat      P-value     Lower 95%  Upper 95%
  L       91.8379      0.4763    192.8312    8.36E-40       90.8549    92.8209
  L²      -1.9706      0.0172   -114.7757    2.11E-34       -2.0060    -1.9352

We have a good fit; try a higher order to check if a better correlation can be obtained.

Example 3

Using the cubic equation: T = a + b L + c L² + d L³

Regression Statistics
  Multiple R           0.9959
  R Square             0.9918
  Adjusted R Square    0.9906
  Standard Error       10.1622
  Observations         26

Fit is good since the values are almost 1; the drop from R² to adjusted R² is minimal, hence the variables used are suitable.

ANOVA
              df       SS       MS      F     Significance F
  Regression   3   273469    91156    883     4.58E-23
  Residual    22     2272      103
  Total       25   275741

Passed: Significance F is almost nil, hence good variance prediction.

Example 3

P-values are almost nil, indicating the existence of the variables and the constant.

Coefficients
              Coefficient  Std Error   t Stat    P-value     Lower 95%   Upper 95%
  Intercept    -242.2922    58.4318   -4.1466    0.0004      -363.4724   -121.1120
  L             129.2318     8.7831   14.7137    7.23E-13      111.0168    147.4469
  L²             -3.7398     0.4112   -9.0955    6.58E-09       -4.5925     -2.8871
  L³              0.0261     0.0061    4.3011    0.0003          0.0135      0.0387

This is a better fit than the quadratic equation and has a lower standard error. Try a higher order to check if a better correlation can be obtained.

Example 3

Using the quartic equation: T = a + b L + c L² + d L³ + e L⁴

Regression Statistics
  Multiple R           0.9994
  R Square             0.9988
  Adjusted R Square    0.9986
  Standard Error       3.9137
  Observations         26

Fit is good since the values are almost 1; the drop from R² to adjusted R² is minimal, hence the variables used are suitable.

ANOVA
              df       SS       MS       F      Significance F
  Regression   4   275420    68855    4495     1.83E-30
  Residual    21      322       15
  Total       25   275741

Passed: Significance F is almost nil, hence good variance prediction.

Example 3

P-values are almost nil, indicating the existence of the variables and the constant, except for the L variable.

Coefficients
              Coefficient  Std Error   t Stat     P-value     Lower 95%  Upper 95%
  Intercept     514.0935    70.7089     7.2706    3.68E-07      367.0463   661.1406
  L             -29.8530    14.4985    -2.0590    5.21E-02      -60.0043     0.2983
  L²              8.0375     1.0557     7.6137    1.80E-07        5.8421    10.2329
  L³             -0.3402     0.0325   -10.4536    8.86E-10       -0.4079    -0.2726
  L⁴              0.0041     0.0004    11.2839    2.25E-10        0.0033     0.0048

This is a better fit than the cubic equation and has a lower standard error. Refit, removing the L variable.

Example 3

Using the quartic equation without the L term: T = a + c L² + d L³ + e L⁴

Regression Statistics
  Multiple R           0.9993
  R Square             0.9986
  Adjusted R Square    0.9984
  Standard Error       4.1920
  Observations         26

Fit is good since the values are almost 1; the drop from R² to adjusted R² is minimal, hence the variables used are suitable.

ANOVA
              df       SS       MS       F      Significance F
  Regression   3   275355    91785    5223     1.59E-31
  Residual    22      387       18
  Total       25   275741

Passed: Significance F is almost nil, hence good variance prediction.

Example 3

P-values are almost nil, indicating the existence of the variables.

Coefficients
              Coefficient  Std Error   t Stat     P-value     Lower 95%  Upper 95%
  Intercept     369.1665     7.2326    51.0417    2.40E-24      354.1669   384.1661
  L²              5.8722     0.0988    59.4468    8.57E-26        5.6673     6.0770
  L³             -0.2741     0.0058   -47.6313    1.08E-23       -0.2861    -0.2622
  L⁴              0.0033     0.0001    37.1423    2.41E-21        0.0032     0.0035

The variation in R² without the L variable is negligible and the standard error is slightly higher, but the fit is statistically sound. Try a higher order fit.

Example 3

Using the quintic equation: T = a + b L + c L² + d L³ + e L⁴ + f L⁵

Regression Statistics
  Multiple R           0.9995
  R Square             0.9990
  Adjusted R Square    0.9987
  Standard Error       3.8016
  Observations         26

Fit is good since the values are almost 1; the drop from R² to adjusted R² is minimal, hence the variables used are suitable.

ANOVA
              df       SS       MS       F      Significance F
  Regression   5   275452    55090    3812     4.54E-29
  Residual    20      289       14
  Total       25   275741

Passed: Significance F is almost nil, hence good variance prediction.

Example 3

P-values are almost nil, indicating the existence of the variables and the constant, except for the L and L⁵ variables.

Coefficients
              Coefficient  Std Error   t Stat    P-value    Lower 95%    Upper 95%
  Intercept     826.0748   218.7051    3.7771    0.0012      369.8639    1282.2856
  L            -112.4497    56.7479   -1.9816    0.0614     -230.8236       5.9243
  L²             16.3610     5.6338    2.9041    0.0088        4.6090      28.1129
  L³             -0.7407     0.2684   -2.7597    0.0121       -1.3005      -0.1808
  L⁴              0.0133     0.0062    2.1610    0.0430        0.0005       0.0262
  L⁵           -8.21E-05     0.0001   -1.5025    0.1486    -1.96E-04     3.19E-05

This is better than the quartic equation, but the existence of the L variable is statistically doubtful. Try without the L variable.

Example 3

Using the quintic equation without the L term: T = a + c L² + d L³ + e L⁴ + f L⁵

Regression Statistics
  Multiple R           0.9994
  R Square             0.9987
  Adjusted R Square    0.9985
  Standard Error       4.0578
  Observations         26

Fit is good since the values are almost 1; the drop from R² to adjusted R² is minimal, hence the variables used are suitable.

ANOVA
              df       SS       MS       F      Significance F
  Regression   4   275396    68849    4181     3.91E-30
  Residual    21      346       16
  Total       25   275741

Passed: Significance F is almost nil, hence good variance prediction.

Example 3

P-values are almost nil, indicating the existence of the variables, except for the L⁴ and L⁵ variables.

Coefficients
              Coefficient  Std Error   t Stat    P-value     Lower 95%    Upper 95%
  Intercept     393.8714    17.1820   22.9235    2.41E-16      358.1395     429.6032
  L²              5.2247     0.4222   12.3756    4.12E-11        4.3467       6.1027
  L³             -0.2137     0.0388   -5.5143    1.80E-05       -0.2944      -0.1331
  L⁴              0.0013     0.0013    1.0460    0.3074         -0.0013       0.0040
  L⁵            2.28E-05     0.0000    1.5745    0.1303      -7.31E-06     5.29E-05

The quintic equation is not valid, since p-values exceed 0.05. Hence the quartic equation, i.e. T = a + c L² + d L³ + e L⁴, is the best because it satisfies all the conditions.
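A minimal sketch of this kind of order-selection study (NumPy assumed; the 26 measured points are not reproduced here, so illustrative data stand in for them):

```python
import numpy as np

def fit_poly(L, T, order):
    """Least squares polynomial fit of T(L); returns coefficients, R^2 and adjusted R^2."""
    coeffs = np.polyfit(L, T, order)
    T_hat = np.polyval(coeffs, L)
    n, p = len(T), order + 1                       # p = number of fitted parameters
    sse = np.sum((T - T_hat) ** 2)
    tss = np.sum((T - T.mean()) ** 2)
    r2 = 1.0 - sse / tss
    r2_adj = 1.0 - (n - 1) / (n - p) * (1.0 - r2)
    return coeffs, r2, r2_adj

# Illustrative data in place of the 26 reactor measurements
L = np.linspace(10.0, 35.0, 26)
T = 92.0 * L - 2.0 * L ** 2 + np.random.normal(scale=10.0, size=L.size)

# Compare orders 2 to 5 and watch whether adjusted R^2 keeps improving
for order in range(2, 6):
    _, r2, r2_adj = fit_poly(L, T, order)
    print(order, round(r2, 4), round(r2_adj, 4))
```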


Nonlinear Regression

In nonlinear regression, parameters appear inside functions, e.g. β², e^(βx), etc.

∂f/∂βj is usually a combination of the parameter and the independent variable.

Nonlinear regression requires initial value/s or estimate/s. The Method of False Position can be used to solve for a nonlinear parameter.

The solution
  is an iterative process
  may not be unique, due to multiple minima in the sum of squares

Nonlinear Regression

Nonconvergence, or failure to find a minimum value of SSE, is common.

Nonlinear regression is often avoided by simplifying the curve fitting through
  Linearization
  Segmentation of the curve into several sections, where each segment is fitted by a linear equation or a simplified linearized form
(a short sketch of a direct nonlinear fit follows)
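Where a direct nonlinear fit is acceptable instead, a minimal sketch with SciPy's curve_fit (assumed available) for the y = a + bxⁿ model of the next example looks like this; the starting guesses are illustrative:

```python
import numpy as np
from scipy.optimize import curve_fit

def model(x, a, b, n):
    """Nonlinear model y = a + b * x**n."""
    return a + b * np.power(x, n)

x = np.array([0.40, 2.81, 28.62, 67.45, 104.97, 139.61])
y = np.array([8.86, 16.78, 24.36, 24.91, 28.23, 31.57])

# Initial estimates are required; poor guesses can lead to nonconvergence
p0 = [0.0, 10.0, 0.3]
params, cov = curve_fit(model, x, y, p0=p0, maxfev=10000)

a, b, n = params
sse = np.sum((y - model(x, *params)) ** 2)
print(a, b, n, sse)
```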

Example

Fit the data below to an equation of the form y = a + bxⁿ.

  ppm Pb⁺²     Q (mg Pb⁺²/g)
     0.00           0
     0.40           8.86
     2.81          16.78
    28.62          24.36
    67.45          27.51
   104.97          28.23
   139.62          32.57

[Figure: Q (mg Pb/g moss) versus Ceq (ppm)]

Example

Solution
  Model equation:  y = a + bxⁿ
  Residual:        ej = yj − a − bxjⁿ
  SSE:             SSE = Σ ej² = Σ (yj − a − bxjⁿ)²

Setting the partial derivatives to zero:
  ∂SSE/∂a = −2 Σ (yj − a − bxjⁿ) = 0
  ∂SSE/∂b = −2 Σ (yj − a − bxjⁿ) xjⁿ = 0
  ∂SSE/∂n = −2b Σ (yj − a − bxjⁿ) xjⁿ ln xj = 0

Example

Simplifying the equations (N = number of data points):
  Na + b Σxjⁿ = Σyj
  a Σxjⁿ + b Σxj²ⁿ = Σxjⁿyj
  a Σxjⁿ ln xj + b Σxj²ⁿ ln xj = Σxjⁿyj ln xj

Rearranging the 1st equation:
  a = (Σyj − b Σxjⁿ) / N

Example

Combining the equations gives b in terms of n alone:

  b = [N Σxjⁿyj − Σxjⁿ Σyj] / [N Σxj²ⁿ − (Σxjⁿ)²]
    = [N Σxjⁿyj ln xj − Σxjⁿ ln xj Σyj] / [N Σxj²ⁿ ln xj − Σxjⁿ ln xj Σxjⁿ]

With the 6 data points and Σyj = 134.71 (so ȳ = 22.45), the equation simplifies to

  [Σxjⁿyj − 22.45 Σxjⁿ] / [Σxj²ⁿ − (Σxjⁿ)²/6] = [Σxjⁿyj ln xj − 22.45 Σxjⁿ ln xj] / [Σxj²ⁿ ln xj − Σxjⁿ ln xj Σxjⁿ/6]

Example

To solve for n:
  Assume a value of n
  Compute the left-hand side (LHS) and right-hand side (RHS) of the equation

  LHS = [Σxjⁿyj − 22.45 Σxjⁿ] / [Σxj²ⁿ − (Σxjⁿ)²/6]
  RHS = [Σxjⁿyj ln xj − 22.45 Σxjⁿ ln xj] / [Σxj²ⁿ ln xj − Σxjⁿ ln xj Σxjⁿ/6]

When LHS = RHS, stop the iteration; otherwise change the value of n until LHS = RHS (a root-finding sketch follows).
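A minimal sketch of that iteration (SciPy assumed); a bracketing root finder stands in for the manual trial-and-error on n, with the bracket taken from the two trial values used in the tables:

```python
import numpy as np
from scipy.optimize import brentq

x = np.array([0.40, 2.81, 28.62, 67.45, 104.97, 139.61])
y = np.array([8.86, 16.78, 24.36, 24.91, 28.23, 31.57])
ybar = y.mean()                                   # 22.45
N = len(x)

def lhs_minus_rhs(n):
    xn, lnx = x ** n, np.log(x)
    lhs = (np.sum(xn * y) - ybar * np.sum(xn)) / (np.sum(xn ** 2) - np.sum(xn) ** 2 / N)
    rhs = ((np.sum(xn * y * lnx) - ybar * np.sum(xn * lnx))
           / (np.sum(xn ** 2 * lnx) - np.sum(xn * lnx) * np.sum(xn) / N))
    return lhs - rhs

# Bracket taken from the two trial values in the tables (n = -0.25 and n = 0.25)
n_root = brentq(lhs_minus_rhs, -0.25, 0.25)       # expected near 0.016

xn = x ** n_root
b = (np.sum(xn * y) - ybar * np.sum(xn)) / (np.sum(xn ** 2) - np.sum(xn) ** 2 / N)
a = (np.sum(y) - b * np.sum(xn)) / N
print(n_root, b, a)
```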

Example

Assume n = 0.25

    xj       yj     xjⁿ    yj,pred    ej     ej²    xj²ⁿ   xjⁿ ln xj  xj²ⁿ ln xj   xjⁿyj   xjⁿyj ln xj
    0.40    8.86    0.79    10.92   -2.06   4.23    0.63     -0.73      -0.58       7.04      -6.49
    2.81   16.78    1.29    14.71    2.07   4.28    1.67      1.34       1.73      21.71      22.40
   28.62   24.36    2.31    22.42    1.94   3.77    5.35      7.76      17.94      56.34     188.96
   67.45   24.91    2.87    26.60   -1.69   2.85    8.21     12.07      34.59      71.40     300.70
  104.97   28.23    3.20    29.14   -0.91   0.82   10.25     14.90      47.68      90.37     420.54
  139.61   31.57    3.44    30.93    0.64   0.41   11.82     16.98      58.36     108.52     535.94
  343.86  134.71   13.91                           37.93     52.30     159.72     355.37    1462.06

  LHS = 7.57    RHS = 7.47    b = 7.57    a = 4.91    SSE = 614.96

Example

Assume n = −0.25

    xj       yj     xjⁿ    yj,pred    ej      ej²    xj²ⁿ   xjⁿ ln xj  xj²ⁿ ln xj   xjⁿyj   xjⁿyj ln xj
    0.40    8.86    1.26     7.91    0.96    0.92    1.59     -1.16      -1.46      11.16     -10.29
    2.81   16.78    0.77    18.17   -1.39    1.93    0.60      0.80       0.62      12.96      13.37
   28.62   24.36    0.43    25.34   -0.98    0.97    0.19      1.45       0.63      10.53      35.32
   67.45   24.91    0.35    27.10   -2.19    4.78    0.12      1.47       0.51       8.69      36.61
  104.97   28.23    0.31    27.87    0.36    0.13    0.10      1.45       0.45       8.82      41.05
  139.61   31.57    0.29    28.33    3.24   10.52    0.08      1.44       0.42       9.18      45.36
  343.86  134.71    3.42                             2.67      5.45       1.17      61.35     161.42

  LHS = −21.09    RHS = −20.22    b = −21.09    a = 34.46    SSE = 18.34

Example

By Newton-Raphson solution, n = 0.01621

    xj       yj     xjⁿ    yj,pred    ej     ej²    xj²ⁿ   xjⁿ ln xj  xj²ⁿ ln xj   xjⁿyj   xjⁿyj ln xj
    0.40    8.86    0.99     9.26   -0.40   0.16    0.97     -0.91      -0.89       8.73      -8.05
    2.81   16.78    1.02    15.88    0.90   0.81    1.03      1.05       1.07      17.06      17.60
   28.62   24.36    1.06    24.02    0.34   0.12    1.11      3.54       3.74      25.72      86.26
   67.45   24.91    1.07    27.10   -2.19   4.78    1.15      4.51       4.83      26.68     112.34
  104.97   28.23    1.08    28.71   -0.48   0.23    1.16      5.02       5.41      30.44     141.68
  139.61   31.57    1.08    29.75    1.82   3.31    1.17      5.35       5.80      34.20     168.91
  343.86  134.71    6.29                            6.60     18.56      19.95     142.83     518.75

  LHS = 208.61    RHS = 208.61    b = 208.61    a = −196.25    SSE = 9.23

Example

Regression of y on xⁿ (n = 0.01621)

Regression Statistics
  Multiple R           0.9862
  R Square             0.9726
  Adjusted R Square    0.9658
  Standard Error       1.5323
  Observations         6

ANOVA
              df       SS        MS        F      Significance F
  Regression   1   333.69    333.69    142.12     0.0003
  Residual     4     9.39      2.35
  Total        5   343.08

Coefficients
              Coefficient  Std Error  t Stat   P-value  Lower 95%  Upper 95%
  Intercept     -196.25      18.36    -10.69    0.0004    -247.22    -145.29
  xⁿ             208.61      17.50     11.92    0.0003     160.03     257.19

Example

RESIDUAL OUTPUT                                        PROBABILITY OUTPUT
  Observation  Predicted y  Residual  Std Residual     Percentile       y
       1           9.26      -0.40       -0.29            8.33        8.86
       2          15.88       0.90        0.66           25.00       16.78
       3          24.02       0.34        0.25           41.67       24.36
       4          27.10      -2.19       -1.59           58.33       24.91
       5          28.71      -0.48       -0.35           75.00       28.23
       6          29.75       1.82        1.33           91.67       31.57

Methodology of the False Position Method

Determine the nonlinear parameter in the model equation.

Rearrange the model equation to isolate the nonlinear term.

Assume
  an initial guess for the nonlinear parameter, c
  an increment of the nonlinear parameter, Δc

Compute the linear parameter/s and the SSE of the model equation using
  c1 = c + Δc
  c2 = c
  c3 = c − Δc

Methodology of the False Position Method

A new c value is calculated by the Method of False Position:

  c5 = c2 + (SSE1 − SSE3) Δc / [2 (SSE1 − 2 SSE2 + SSE3)]

Repeat the process until a minimum SSE is calculated or the desired accuracy is reached (a sketch of the loop follows).
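A minimal sketch of this search loop (plain Python). The update shown is the standard three-point parabolic (successive quadratic) step, used here as a stand-in for the slide's false position update; linear_fit_sse is a hypothetical helper that fits the linear parameters and returns the SSE for a trial value of the nonlinear parameter:

```python
def parabolic_min_search(linear_fit_sse, c, dc, tol=1e-6, max_iter=50):
    """Three-point parabolic search for the value of c that minimizes SSE.

    linear_fit_sse(c): hypothetical helper returning the SSE after fitting
    the linear parameters for a trial value c of the nonlinear parameter.
    """
    for _ in range(max_iter):
        c1, c2, c3 = c + dc, c, c - dc
        s1, s2, s3 = linear_fit_sse(c1), linear_fit_sse(c2), linear_fit_sse(c3)

        denom = 2.0 * (s1 - 2.0 * s2 + s3)
        if denom == 0.0:                       # flat curvature: cannot interpolate further
            return c
        c_new = c2 + dc * (s3 - s1) / denom    # vertex of the parabola through the 3 points

        if abs(c_new - c) < tol:
            return c_new
        dc = max(abs(c_new - c), tol)          # shrink the increment as the minimum nears (assumed)
        c = c_new
    return c

# Toy usage: minimize a simple quadratic SSE surface starting from c = 0, dc = 1
print(parabolic_min_search(lambda c: (c - 3.0) ** 2 + 5.0, c=0.0, dc=1.0))
```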

Example 1

Fit the data below to an equation of the form y = a + bxⁿ.

  ppm Pb⁺²     Q (mg Pb⁺²/g)
     0.00           0
     0.40           8.86
     2.81          16.78
    28.62          24.36
    67.45          27.51
   104.97          28.23
   139.62          32.57

[Figure: Q (mg Pb/g moss) versus Ceq (ppm)]

Example 1

Solution
  Model equation: y = a + bxⁿ
  The nonlinear parameter is a
  Linearize the equation: ln(y − a) = ln b + n ln x
  Initial guesses: a = −214, Δa = 100 (see the sketch below)
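A minimal sketch (NumPy assumed) of the helper evaluated at each trial value of a: regress ln(y − a) on ln x to recover n and b, then compute the SSE in the original variables. Negative trial values of a keep y − a positive:

```python
import numpy as np

x = np.array([0.40, 2.81, 28.62, 67.45, 104.97, 139.61])
y = np.array([8.86, 16.78, 24.36, 24.91, 28.23, 31.57])

def linear_fit_sse(a):
    """Fit b and n of y = a + b*x**n by linearization and return (SSE, b, n)."""
    n_exp, ln_b = np.polyfit(np.log(x), np.log(y - a), deg=1)   # slope = n, intercept = ln b
    b = np.exp(ln_b)
    y_pred = a + b * x ** n_exp
    return np.sum((y - y_pred) ** 2), b, n_exp

for a in (-314.0, -214.0, -114.0):       # the three trial values from the slides
    sse, b, n_exp = linear_fit_sse(a)
    print(a, round(sse, 3), round(b, 2), round(n_exp, 4))
```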

Example 1

a = −314.00,  b = 326.37,  n = 0.01

     y        x      ln(y − a)   ln x    y_predicted    SSE    Residual   % Abs Error
    8.86     0.40      5.78     -0.92       9.22       0.13     -0.36        4.07
   16.78     2.81      5.80      1.03      15.92       0.74      0.86        5.12
   24.36    28.62      5.82      3.35      24.05       0.09      0.31        1.26
   24.91    67.45      5.83      4.21      27.10       4.79     -2.19        8.78
   28.23   104.97      5.84      4.65      28.69       0.21     -0.46        1.62
   31.57   139.61      5.85      4.94      29.72       3.43      1.85        5.87
                                           SSE =      9.396                 26.73

a = −214.00,  b = 226.36,  n = 0.01

     y        x      ln(y − a)   ln x    y_predicted    SSE    Residual   % Abs Error
    8.86     0.40      5.41     -0.92       9.25       0.15     -0.39        4.39
   16.78     2.81      5.44      1.03      15.88       0.80      0.89        5.32
   24.36    28.62      5.47      3.35      24.02       0.11      0.34        1.38
   24.91    67.45      5.48      4.21      27.10       4.76     -2.18        8.76
   28.23   104.97      5.49      4.65      28.70       0.22     -0.47        1.66
   31.57   139.61      5.50      4.94      29.74       3.35      1.83        5.80
                                           SSE =      9.392                 27.30

Example 1

a = −114.00,  b = 126.34,  n = 0.03

     y        x      ln(y − a)   ln x    y_predicted    SSE    Residual   % Abs Error
    8.86     0.40      4.81     -0.92       9.32       0.21     -0.46        5.14
   16.78     2.81      4.87      1.03      15.80       0.95      0.98        5.81
   24.36    28.62      4.93      3.35      23.95       0.17      0.41        1.68
   24.91    67.45      4.93      4.21      27.08       4.70     -2.17        8.70
   28.23   104.97      4.96      4.65      28.73       0.25     -0.50        1.76
   31.57   139.61      4.98      4.94      29.80       3.13      1.77        5.60
                                           SSE =      9.406                 28.70

New value for a, i.e. a5 = −202.05

New values: a = −200, Δa = 4

Example 1

a = −204.00,  b = 216.36,  n = 0.02

     y        x      ln(y − a)   ln x    y_predicted    SSE    Residual   % Abs Error
    8.86     0.40      5.36     -0.92       9.26       0.15     -0.39        4.43
   16.78     2.81      5.40      1.03      15.88       0.81      0.90        5.35
   24.36    28.62      5.43      3.35      24.02       0.12      0.34        1.40
   24.91    67.45      5.43      4.21      27.10       4.76     -2.18        8.76
   28.23   104.97      5.45      4.65      28.70       0.22     -0.47        1.66
   31.57   139.61      5.46      4.94      29.74       3.34      1.83        5.78
                                           SSE =    9.39181                 27.39

a = −200.00,  b = 212.36,  n = 0.02

     y        x      ln(y − a)   ln x    y_predicted    SSE    Residual   % Abs Error
    8.86     0.40      5.34     -0.92       9.26       0.16     -0.39        4.45
   16.78     2.81      5.38      1.03      15.88       0.81      0.90        5.36
   24.36    28.62      5.41      3.35      24.01       0.12      0.34        1.41
   24.91    67.45      5.42      4.21      27.10       4.76     -2.18        8.76
   28.23   104.97      5.43      4.65      28.70       0.22     -0.47        1.67
   31.57   139.61      5.44      4.94      29.74       3.33      1.82        5.78
                                           SSE =    9.39178                 27.42

Example 1

a = −196.00,  b = 208.36,  n = 0.02

     y        x      ln(y − a)   ln x    y_predicted    SSE    Residual   % Abs Error
    8.86     0.40      5.32     -0.92       9.26       0.16     -0.40        4.47
   16.78     2.81      5.36      1.03      15.87       0.81      0.90        5.38
   24.36    28.62      5.40      3.35      24.01       0.12      0.34        1.42
   24.91    67.45      5.40      4.21      27.10       4.76     -2.18        8.75
   28.23   104.97      5.41      4.65      28.70       0.22     -0.47        1.67
   31.57   139.61      5.43      4.94      29.75       3.32      1.82        5.78
                                           SSE =    9.39177                 27.46

New value for a, i.e. a5 = −197.22

Final values after a few iterations:
  a = −196.52
  b = 208.88
  n = 0.0162

Curve Segmentation

Michaelis-Menten model for enzyme kinetics:

  v = Vmax S / (Km + S)

Curve segmentation:
  Linear (low S):     v = (Vmax/Km) S
  Nonlinear (mid S):  v = a S^b
  Constant (high S):  v = Vmax

A short sketch of fitting the three segments follows.
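A minimal sketch (NumPy assumed; the data and segment boundaries are hypothetical) of fitting each segment with a linear or linearized form instead of a single nonlinear regression:

```python
import numpy as np

# Hypothetical Michaelis-Menten-like data (substrate S, rate v)
S = np.array([0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0])
v = np.array([0.09, 0.18, 0.42, 0.75, 1.20, 1.90, 2.40, 2.70, 2.90, 2.95])

low, high = S < 1.0, S > 20.0          # hypothetical segment boundaries
mid = ~(low | high)

# Low-S segment: v ≈ (Vmax/Km) * S, a straight line through the origin
slope = np.sum(S[low] * v[low]) / np.sum(S[low] ** 2)

# Mid-S segment: v ≈ a * S**b, linearized as ln v = ln a + b ln S
b_exp, ln_a = np.polyfit(np.log(S[mid]), np.log(v[mid]), deg=1)
a_coef = np.exp(ln_a)

# High-S segment: v ≈ Vmax, a constant
v_max = v[high].mean()

print(slope, a_coef, b_exp, v_max)
```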


References
1.

Akai. Applied Numerical Methods for Engineers. New


York: John Wiley and Sons, Inc., 1994.

2.

Chapra and Canale.


Canale. Numerical Methods for Engineers
with Software and Programming Applications. New

York: The McGraw


McGrawHill Companies, Inc., 2002.
3.

Perry, R. H., D. W. Green and J. O. Maloney. Perrys


Chemical Engineers Handbook. 6th ed. New York:
McGraw--Hill, Inc., 1984.
McGraw

4.

Press, Teukolsky,
Teukolsky, Vetterling and Flannery. Numerical

Recipes in Fortran 77: The Art of Scientific


Computing 2nd ed. Melbourne: Cambridge University

Press, 1992.

ChE 707 Lecture Notes by


A B Tengkiat

References
5.

http://www.wikipedia.org/

6.

Mathematics Source Library C & ASM.


http://mymathlib. webtrellis.net/index.html
