
Curve Fitting (Regression)

• Describes techniques to fit curves (curve fitting) to discrete data to obtain intermediate estimates.
• There are two general approaches to curve fitting:
  – Data exhibit a significant degree of scatter. The strategy is to derive a single curve that represents the general trend of the data.
  – Data are very precise. The strategy is to pass a curve or a series of curves through each of the points.

Least-Squares Method - Curve Fitting

[Figure: least-squares curve fitting, linear interpolation, and curvilinear interpolation.]
Mathematical Background
• Arithmetic mean ȳ. The sum of the individual data points (yi) divided by the number of points (n):
  ȳ = Σyi / n
• Standard deviation Sy. The most common measure of spread for a sample:
  Sy = sqrt( St / (n - 1) ),  where St = Σ(yi - ȳ)²
  or, equivalently,
  Sy = sqrt( (Σyi² - (Σyi)²/n) / (n - 1) )
• Variance Sy². Representation of spread by the square of the standard deviation:
  Sy² = Σ(yi - ȳ)² / (n - 1)
  where n - 1 is the number of degrees of freedom.
• Coefficient of variation c.v. Quantifies the spread of the data relative to the mean, expressed as a percentage:
  c.v. = (Sy / ȳ) · 100%
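As a minimal sketch of these definitions in Python (the sample values below are made up purely for illustration):

import math

y = [6.1, 7.3, 8.9, 7.0, 6.8, 8.2]             # made-up sample values
n = len(y)

ybar = sum(y) / n                              # arithmetic mean
St = sum((yi - ybar) ** 2 for yi in y)         # sum of squares about the mean
Sy = math.sqrt(St / (n - 1))                   # standard deviation (n - 1 degrees of freedom)
Sy2 = Sy ** 2                                  # variance
cv = Sy / ybar * 100                           # coefficient of variation, percent

print(f"mean = {ybar:.3f}, Sy = {Sy:.3f}, Sy2 = {Sy2:.3f}, c.v. = {cv:.1f}%")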

Least Squares Regression

Linear Regression
• Fitting a straight line to a set of paired
observations: (x1, y1), (x2, y2),…,(xn, yn).
y = a0 + a1x + e
a1 - slope
a0 - intercept
e - error, or residual, between the model and the observations
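To make the notation concrete, a short sketch (with hypothetical coefficients and made-up data) that evaluates the linear model and the residual e for each observation:

a0, a1 = 1.0, 0.5                      # hypothetical intercept and slope
x = [1.0, 2.0, 3.0, 4.0]               # made-up observations
y = [1.4, 2.1, 2.3, 3.2]

for xi, yi in zip(x, y):
    model = a0 + a1 * xi               # value predicted by the straight line
    e = yi - model                     # residual: measurement minus model
    print(f"x = {xi:.1f}  y = {yi:.1f}  model = {model:.2f}  e = {e:+.2f}")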
Criteria for a “Best” Fit
• Minimize the sum of the residual errors for all available data:
  Σei = Σ(yi - a0 - a1xi),  summed over i = 1, …, n
  n = total number of points
• However, this is an inadequate criterion, and so is minimizing the sum of the absolute values of the residuals:
  Σ|ei| = Σ|yi - a0 - a1xi|
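The inadequacy can be checked numerically. In the sketch below (made-up data lying exactly on y = x), the perfect line and a poor horizontal line through the mean both give a zero sum of residuals, so that criterion cannot tell them apart:

x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.0, 3.0, 4.0]               # data lie exactly on y = x

def sum_of_residuals(a0, a1):
    return sum(yi - (a0 + a1 * xi) for xi, yi in zip(x, y))

print(sum_of_residuals(0.0, 1.0))      # perfect fit y = x        -> 0.0
print(sum_of_residuals(2.5, 0.0))      # horizontal line y = 2.5  -> also 0.0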

[Figure: example data for which (a) minimizing the sum of the residual errors and (b) minimizing the sum of the absolute values of the residual errors are not adequate criteria for a best fit.]
• The best strategy is to minimize the sum of the squares of the residuals between the measured y and the y calculated with the linear model:
  Sr = Σei² = Σ(yi - a0 - a1xi)²
• This strategy yields a unique line for any given set of data.
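A quick numerical check of this claim (a sketch with made-up data; numpy.polyfit is used only as an independent reference for the least-squares coefficients):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])        # made-up scattered data
y = np.array([0.8, 2.3, 2.9, 4.1, 4.8])

def Sr(a0, a1):
    """Sum of the squared residuals for the line y = a0 + a1*x."""
    return np.sum((y - (a0 + a1 * x)) ** 2)

a1_fit, a0_fit = np.polyfit(x, y, 1)           # least-squares slope and intercept
print(Sr(a0_fit, a1_fit))                      # minimum Sr
print(Sr(a0_fit + 0.2, a1_fit))                # perturbing the coefficients
print(Sr(a0_fit, a1_fit + 0.1))                # always increases Sr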

Least-Squares Fit of a Straight Line

Setting the derivatives of Sr with respect to a0 and a1 to zero gives the normal equations, which can be solved simultaneously:
  n·a0 + (Σxi)·a1 = Σyi
  (Σxi)·a0 + (Σxi²)·a1 = Σxiyi
Solving these yields
  a1 = (n·Σxiyi - Σxi·Σyi) / (n·Σxi² - (Σxi)²)
  a0 = ȳ - a1·x̄
where x̄ and ȳ are the mean values of x and y.
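A sketch of solving the normal equations simultaneously with a linear solver, cross-checked against the closed-form expressions (made-up data; numpy assumed available):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])        # made-up data
y = np.array([0.8, 2.3, 2.9, 4.1, 4.8])
n = len(x)

# Normal equations in matrix form:  A @ [a0, a1] = b
A = np.array([[n,       x.sum()],
              [x.sum(), (x ** 2).sum()]])
b = np.array([y.sum(), (x * y).sum()])
a0, a1 = np.linalg.solve(A, b)

# Closed-form check (same formulas as above)
a1_cf = (n * (x * y).sum() - x.sum() * y.sum()) / (n * (x ** 2).sum() - x.sum() ** 2)
a0_cf = y.mean() - a1_cf * x.mean()
print(a0, a1, a0_cf, a1_cf)                    # the two solutions agree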
“Goodness” of our fit
If
• the total sum of the squares around the mean for the dependent variable, y, is St, and
• the sum of the squares of the residuals around the regression line is Sr,
then
• St - Sr quantifies the improvement or error reduction due to describing the data in terms of a straight line rather than as an average value.

r² = (St - Sr) / St - coefficient of determination
r = sqrt(r²) - correlation coefficient
r² - Coefficient of Determination

St = Σ(yi - ȳ)²   (total sum of squares around the mean)
Sr = Σ(yi - a0 - a1xi)²   (sum of squares of the residuals around the regression line)
r² = (St - Sr) / St
• For a perfect fit
Sr=0 and r=r2=1, signifying that the line
explains 100 percent of the variability of the
data.
• For r=r2=0, Sr=St, the fit represents no
improvement.
• A correlation coefficient, r, greater than 0.8 is
generally described as strong, whereas a
correlation less than 0.5 is generally described
as weak.
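A minimal sketch of the two limiting cases described above (made-up numbers):

def r_squared(x, y, a0, a1):
    """Coefficient of determination r2 = (St - Sr) / St for the line y = a0 + a1*x."""
    ym = sum(y) / len(y)
    St = sum((yi - ym) ** 2 for yi in y)
    Sr = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))
    return (St - Sr) / St

x = [1.0, 2.0, 3.0, 4.0]

# Perfect fit: data lie exactly on y = 1 + 2x, so Sr = 0 and r2 = 1
print(r_squared(x, [3.0, 5.0, 7.0, 9.0], a0=1.0, a1=2.0))

# No improvement: the "fit" is just the mean value (slope 0), so Sr = St and r2 = 0
y = [2.0, 4.0, 3.0, 5.0]
print(r_squared(x, y, a0=sum(y) / len(y), a1=0.0))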
Algorithm for Least-Squares Linear Regression
Sumx = 0 : Sumy = 0 : St = 0
Sumxy = 0 : Sumx2 = 0 : Sr = 0

' Calculate Sumx, Sumy, Sumxy, and Sumx2
For i = 1 To n
  Sumx = Sumx + x(i)
  Sumy = Sumy + y(i)
  Sumxy = Sumxy + x(i) * y(i)
  Sumx2 = Sumx2 + x(i)^2
Next i

' Calculate the mean values of x and y
xm = Sumx / n : ym = Sumy / n

' Calculate constants for the line equation a0 and a1
a1 = (n * Sumxy - Sumx * Sumy) / (n * Sumx2 - Sumx^2)
a0 = ym - a1 * xm

Algorithm for Least-Squares Linear Regression (Cont.)
' Calculate St and Sr
For i = 1 To n
  St = St + (y(i) - ym)^2
  Sr = Sr + (y(i) - a1 * x(i) - a0)^2
Next i

' Calculate r2
r2 = (St - Sr) / St
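For readers working in Python rather than the BASIC-style pseudocode above, here is a direct translation of the same algorithm (a sketch; numpy.polyfit is used only as an independent cross-check, and the data are made up):

import numpy as np

def linear_regression(x, y):
    """Least-squares fit of y = a0 + a1*x; returns (a0, a1, r2)."""
    n = len(x)
    sumx, sumy = sum(x), sum(y)
    sumxy = sum(xi * yi for xi, yi in zip(x, y))
    sumx2 = sum(xi ** 2 for xi in x)

    xm, ym = sumx / n, sumy / n
    a1 = (n * sumxy - sumx * sumy) / (n * sumx2 - sumx ** 2)
    a0 = ym - a1 * xm

    St = sum((yi - ym) ** 2 for yi in y)
    Sr = sum((yi - a1 * xi - a0) ** 2 for xi, yi in zip(x, y))
    return a0, a1, (St - Sr) / St

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]        # made-up data
y = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]
print(linear_regression(x, y))
print(np.polyfit(x, y, 1))                     # numpy returns [a1, a0] for comparison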

Linearization of Non-linear Regression

Non-linear (exponential model): y = α·e^(βx)
Linear: ln y = ln α + β·x

Perform regression for xi and (ln yi); the slope is β and the intercept is ln α.
Linearization of Non-linear Regression

Non-linear (power model): y = α·x^β
Linear: ln y = ln α + β·ln x

Perform regression for ln(xi) and ln(yi); the slope is β and the intercept is ln α.
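The corresponding sketch for the power model (again with made-up data):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.5, 1.7, 3.6, 6.1, 9.1])        # made-up, roughly y = 0.5*x^1.8

# Straight-line fit to (ln x, ln y): slope = beta, intercept = ln(alpha)
beta, ln_alpha = np.polyfit(np.log(x), np.log(y), 1)
alpha = np.exp(ln_alpha)
print(f"y ~ {alpha:.3f} * x^{beta:.3f}")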

Linearization of Non-linear Regression (Assignment)

Non-linear: ?
Linear: ?
Perform regression for: ?
Regression Analysis in Excel
Excel has several built-in regression models for correlating two variables.

Regression Analysis in Excel
Here is an example of a sample data set and
the plot of a "best-fit" straight line through
the data

Regression Analysis in Excel

Correlation Coefficient, r

Coefficient of Determination, r² or R²
Questions

