
UNIVERSITY OF GUYANA

FACULTY OF ENGINEERING & TECHNOLOGY

TOPIC: LINEAR REGRESSION AND CORRELATION ANALYSIS

COURSE NAME: Engineering Mathematics V


COURSE CODE: EMT 3200
LECTURER: Ms. Elena Trim

GROUP MEMBERS
NAME USI# DEPARTMENT
Esan Barry 1025262 Civil Engineering
Kevon Gibson 1023589 Civil Engineering
Akelah Young 1024909 Civil Engineering
Nsenga Grant 1025206 Civil Engineering
Lynicia Amsterdam 1025074 Civil Engineering
Rafhel Ward 1025334 Civil Engineering

PRESENTATION OUTLINE
1.0 Introduction
2.0 Linear Regression
3.0 Correlation Analysis
4.0 Conclusion
5.0 Further Problems
6.0 References

1.0 INTRODUCTION
1.1 HISTORY
• Sir Francis Galton (1822–1911), a well-known British anthropologist, developed the ideas of correlation and regression through his studies of sweet peas and human physical characteristics.

• Galton then described how children's heights could be predicted from their parents' heights.

• Correlation and linear regression are the most commonly used
techniques for investigating the relationship between two quantitative
variables.

• Correlation analysis quantifies the strength of the association between two variables, whereas regression expresses the relationship in the form of an equation.
2.0 LINEAR REGRESSION
Regression analysis is a statistical tool used for analyzing multifactor data. It provides a simple method for investigating the relationship among variables. The relationship is expressed in the form of an equation connecting the dependent variable and one or more explanatory or predictor variables (Chatterjee & Hadi, 2006).

2.1 Types of Linear Regression
Linear Regression may be of two types:
1. Simple Linear Regression
2. Multiple Linear Regression

2.1.1 Simple Linear Regression
• In Simple Linear Regression, a single independent variable is used to
predict the value of the dependent variable.

• Simpler and less time consuming to apply than multiple linear regression.

• Can be expressed in the form of a straight line:

Y = a0 + a1 X

  - Y represents the output or dependent variable.
  - a0 and a1 are two unknown constants that represent the intercept and coefficient (slope) respectively.
  - X represents the independent variable.

• a1 can be solved using the mathematical expression (Equation 1):

$$a_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}$$

• a0 can then be found by substituting the mean values of the independent and dependent variables, X̅ and Ῡ, together with the slope a1, into the linear regression equation and transposing for the y-intercept: a0 = Ῡ - a1 X̅.
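The two steps above (the slope from the ratio of summed deviations, then the intercept by transposing the line equation at the mean point) can be sketched in a few lines of Python. The function name `fit_line` and the sample data are illustrative assumptions, not part of the original slides.

```python
def fit_line(xs, ys):
    """Return (a0, a1) for the least-squares line Y = a0 + a1*X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: sum of cross-deviations over sum of squared X-deviations
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    a1 = sxy / sxx
    # The fitted line passes through (x_bar, y_bar), so a0 = y_bar - a1*x_bar
    a0 = y_bar - a1 * x_bar
    return a0, a1


# Hypothetical data lying exactly on Y = 1 + 2X
a0, a1 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a0, a1)
```

For data that lie exactly on a line, the function should recover that line's intercept and slope, which makes it easy to sanity-check before applying it to measured values.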
2.1.2 Multiple Linear Regression
• In Multiple Linear Regression, we try to find the relationship between two or more independent variables and the corresponding dependent variable.

• The independent variables can be continuous. Compared with simple linear regression, the analysis is more time consuming.

• The equation that describes how the predicted values are related to the
independent variables is called the Multiple Linear Regression
equation:

Y = a0 + a1 X1 + a2 X2 + … + an Xn

2.2 Types of Regression Relationships

Example 1:
In an experiment to determine the relationship between frequency and
the inductive reactance of an electrical circuit, the following results
were obtained:

Frequency (Hz)               50   100   150   200   250   300   350
Inductive Reactance (ohms)   30    65    90   130   150   190   200

a) Determine the equation of the regression line of inductive reactance on frequency, assuming a linear relationship.

b) Use the regression equation to calculate the value of inductive reactance when the frequency is 175 Hz.

Solution:
Frequency (X)   Inductive Reactance (Y)   (X - X̅)   (Y - Ῡ)    (X - X̅)²   (X - X̅)(Y - Ῡ)
50              30                        -150      -92.143    22500      13821.45
100             65                        -100      -57.143    10000      5714.30
150             90                        -50       -32.143    2500       1607.15
200             130                       0         7.857      0          0
250             150                       50        27.857     2500       1392.85
300             190                       100       67.857     10000      6785.70
350             200                       150       77.857     22500      11678.55
X̅ = 200         Ῡ = 122.143                                    ∑: 70000   ∑: 41000.00

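Substituting in Equation 1 gives:

Y = a0 + a1 X

a1 = ∑(X - X̅)(Y - Ῡ) / ∑(X - X̅)²
a1 = 41000 / 70000
a1 = 0.586

a0 = Ῡ - a1 X̅ = 122.143 - 0.586(200)
a0 = 4.943

(With the unrounded slope, 41000/70000 = 0.5857, the intercept is exactly 5.000; the rounded values are used below.)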

Thus, the equation of the regression line of inductive reactance on frequency is:

Y = a0 + a1 X

Y = 4.943 + 0.586X

Hence, when the frequency, X, is 175 Hz:

Y = 4.943 + 0.586(175) = 107.5

Therefore, the inductive reactance is 107.5 ohms when the frequency is 175 Hz.
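As a cross-check of Example 1, the short sketch below recomputes the slope and intercept from the tabulated data and compares full-precision arithmetic with the three-decimal rounding used on the slides. The variable names are illustrative assumptions.

```python
freq = [50, 100, 150, 200, 250, 300, 350]        # X, Hz
reactance = [30, 65, 90, 130, 150, 190, 200]     # Y, ohms

n = len(freq)
x_bar = sum(freq) / n
y_bar = sum(reactance) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(freq, reactance))
sxx = sum((x - x_bar) ** 2 for x in freq)

a1 = sxy / sxx                 # unrounded slope
a0 = y_bar - a1 * x_bar        # unrounded intercept
# Slide-style arithmetic: mean and slope rounded to 3 decimals first
a0_rounded = round(y_bar, 3) - round(a1, 3) * x_bar

print(round(a1, 4), round(a0, 3), round(a0_rounded, 3))
print(round(a0 + a1 * 175, 1))   # predicted reactance at 175 Hz
```

Rounding the slope before computing the intercept shifts the intercept from 5.000 to 4.943, but the prediction at 175 Hz still comes out at about 107.5 ohms either way.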
Example 2:
The experimental values relating centripetal force and radius, for a mass
travelling at constant velocity in a circle, are as shown:

Force (N) 5 10 15 20 25 30 35 40

Radius (cm) 55 30 16 12 11 9 7 5

Determine:

a) The equation of the regression line of force on radius.

b) Hence, the force at a radius of 40 cm and the radius corresponding to a force of 32 N.

Solution:
Radius (X)   Force (Y)   (X - X̅)    (Y - Ῡ)   (X - X̅)²    (X - X̅)(Y - Ῡ)
55           5           36.875     -17.5     1359.766    -645.313
30           10          11.875     -12.5     141.016     -148.438
16           15          -2.125     -7.5      4.516       15.938
12           20          -6.125     -2.5      37.516      15.313
11           25          -7.125     2.5       50.766      -17.813
9            30          -9.125     7.5       83.266      -68.438
7            35          -11.125    12.5      123.766     -139.063
5            40          -13.125    17.5      172.266     -229.688
X̅ = 18.125   Ῡ = 22.5                         ∑: 1972.878 ∑: -1217.502

Substituting in Equation 1 gives:

Y = a0 + a1 X

a1 = ∑(X - X̅)(Y - Ῡ) / ∑(X - X̅)²
a1 = -1217.502 / 1972.878
a1 = -0.617

a0 = Ῡ - a1 X̅ = 22.5 - (-0.617)(18.125)
a0 = 33.683

Therefore, the least squares linear regression equation is given as

Y = 33.683 - 0.617X

b) Hence, calculate the force at a radius of 40 cm and the radius corresponding to a force of 32 N, the radius being the X value.

Y = 33.683 - 0.617(40) = 9.003

A radius of 40 cm has a corresponding force of 9.003 N.


3.0 CORRELATION ANALYSIS
Correlation measures the direction and strength of the relationship between two quantitative variables, that is, how the variables change together (Granville, 2012).

Correlation coefficient (r) refers to a number between -1 and +1.

3.1 CORRELATION COEFFICIENT GRAPHS

Source: (Granville, 2012)


3.2 Types of Correlation Analysis
The two main types of correlation measurement are:

1. The Pearson product-moment correlation coefficient

2. Spearman's rank correlation coefficient

3.2.1 PEARSON MOMENT CORRELATION
• Pearson moment correlation coefficient is the measure of the strength
of a linear association between two variables.

• To see if your data would have a linear relationship you simply have to
plot them, usually as scatter plots.

• A correlation of +1 does not mean that for every unit increase of one
variable there is a unit increase in the other. It means there is no
variation between the data points and the line of best fit.
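• The correlation coefficient measures the strength of the linear association only; the proportion of the variation explained by the line is expressed by the coefficient of determination, r².
• It is calculated by:

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$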
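A minimal Python sketch of the calculation follows, assuming the usual computational (sums) form of Pearson's r, which matches the X, Y, XY, X², Y² columns tabulated in the worked solutions. The function name `pearson_r` and the sample data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from the sums of X, Y, XY, X^2 and Y^2."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    num = n * sxy - sx * sy
    den = sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return num / den


# Hypothetical data: one exactly increasing line, one exactly decreasing line
up = pearson_r([1, 2, 3, 4], [3, 5, 7, 9])
down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(up, down)
```

Points lying exactly on a rising line give r = +1 and points on a falling line give r = -1, regardless of how steep the line is.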
 

3.2.2 SPEARMAN’S RANK COEFFICIENT
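• It is a non-parametric version of the Pearson product-moment correlation.
• The strength of association between two ranked variables is denoted by rs, where rs is given by:

$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

where d is the difference between the two ranks of each observation and n is the number of observations (this form assumes no tied ranks).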
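The rank-based calculation can be sketched as below, assuming the standard no-ties formula rs = 1 - 6∑d² / (n(n² - 1)). The helper names and the marks data are hypothetical and chosen so that no ranks are tied.

```python
def ranks(values):
    """Rank from 1 (smallest) upward; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rs(xs, ys):
    """Spearman's rank coefficient via the no-ties shortcut formula."""
    n = len(xs)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical marks for five students in two tests (no ties)
maths = [56, 75, 45, 71, 62]
physics = [66, 70, 40, 60, 65]
rs = spearman_rs(maths, physics)
print(rs)
```

Here the rank differences are -2, 0, 0, 2 and 0, so ∑d² = 8 and rs = 1 - 48/120 = 0.6. With tied values the shortcut formula no longer applies exactly, and Pearson's r on averaged ranks is the usual alternative.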

Example 1:
An agricultural research organization tested a particular chemical
fertilizer to try and find out whether an increase in the amount of
fertilizer used would lead to a corresponding increase in the food
supply.

Fertilizer          2   1   3   2   4   5   3
Bushels of Beans    4   3   4   3   6   5   5
Solution:
Table showing the independent (X) and dependent (Y) variables.
X       Y    XY   X²   Y²
2       4    8    4    16
1       3    3    1    9
3       4    12   9    16
2       3    6    4    9
4       6    24   16   36
5       5    25   25   25
3       5    15   9    25
∑: 20   30   93   68   136
Substituting the totals from the table, with n = 7, into

r = [n∑XY - ∑X∑Y] / √{[n∑X² - (∑X)²][n∑Y² - (∑Y)²]}

gives

r = [7(93) - (20)(30)] / √{[7(68) - (20)²][7(136) - (30)²]}
r = 51 / √(76 × 52)
r = 0.811

Therefore, the amount of fertilizer used has a strong positive correlation with the number of bushels of beans obtained.

4.0 CONCLUSION
• Linear regression is a common Statistical Data Analysis technique that
is used to determine the extent to which there is a linear relationship
between a dependent variable and one or more independent variables.

• In simple linear regression a single independent variable is used to


predict the value of a dependent variable.

• Simple linear regression is similar to correlation in that the purpose is
to measure to what extent there is a linear relationship between two
variables.

• The difference between the two is that correlation makes no distinction


between independent and dependent variables while linear regression
does.

5.0 FURTHER PROBLEMS
1. Electrical Engineering Application Problem
The relationship between the voltage applied to an electrical circuit and
the current flowing is as shown:

Current (mA) 2 4 6 8 10 12 14

Applied Voltage (V) 5 11 15 19 24 28 33

Determine:
a. The equation of the regression line of applied voltage on current
assuming a linear relationship.
b. Using the regression equation determine the value of applied voltage
when the current is 3mA.
c. The Pearson's Correlation Coefficient from the table given.

Solution:
N   Current (X)   Applied Voltage (Y)   (X - X̅)   (Y - Ῡ)    (X - X̅)²   (X - X̅)(Y - Ῡ)
1   2             5                     -6        -14.286    36         85.716
2   4             11                    -4        -8.286     16         33.144
3   6             15                    -2        -4.286     4          8.572
4   8             19                    0         -0.286     0          0
5   10            24                    2         4.714      4          9.428
6   12            28                    4         8.714      16         34.856
7   14            33                    6         13.714     36         82.284
    X̅ = 8         Ῡ = 19.286                                 ∑: 112     ∑: 254
Substituting in Equation 1 gives:

Y = a0 + a1 X

a1 = ∑(X - X̅)(Y - Ῡ) / ∑(X - X̅)²
a1 = 254 / 112
a1 = 2.268

a0 = Ῡ - a1 X̅ = 19.286 - 2.268(8)
a0 = 1.142

a) Therefore, the least squares linear regression equation is given as

Y = 1.142 + 2.268X

b) Hence, when the current, X, is 3 mA:

Y = 1.142 + 2.268(3)
Y = 7.95

Therefore, the applied voltage is 7.95 V when the current is 3 mA.
c) The Pearson's Correlation Coefficient from the table given.
X       Y     X²    Y²     XY
2       5     4     25     10
4       11    16    121    44
6       15    36    225    90
8       19    64    361    152
10      24    100   576    240
12      28    144   784    336
14      33    196   1089   462
∑: 56   135   560   3181   1334
Substituting the totals, with n = 7, gives:

r = [7(1334) - (56)(135)] / √{[7(560) - (56)²][7(3181) - (135)²]}
r = 1778 / √(784 × 4042)
r = 0.999
2. Civil Engineering Application Problem
The data below show the results of a liquid limit test, with X being the number of blows and Y the liquid limit (%).

Blows 4 5 6 7 8
Liquid Limit (%) 5.97 6.98 6.99 8.99 7.99

Determine:
a. The equation of the regression line of Liquid Limit on Blows.
b. The Moisture Content at 25 Blows.
c. The Pearson's Correlation Coefficient from the table given.

Solution:
Blows (X)   Liquid Limit (Y)   (X - X̅)   (Y - Ῡ)   (X - X̅)²   (X - X̅)(Y - Ῡ)
4           5.97               -2        -1.414    4          2.828
5           6.98               -1        -0.404    1          0.404
6           6.99               0         -0.394    0          0
7           8.99               1         1.606     1          1.606
8           7.99               2         0.606     4          1.212
X̅ = 6       Ῡ = 7.384                              ∑: 10      ∑: 6.05

Substituting in Equation 1 gives:

Y = a0 + a1 X

a1 = ∑(X - X̅)(Y - Ῡ) / ∑(X - X̅)²
a1 = 6.05 / 10
a1 = 0.605

a0 = Ῡ - a1 X̅ = 7.384 - 0.605(6)
a0 = 3.754

a) Therefore, the least squares linear regression equation is given as

Y = 3.754 + 0.605X

b) Hence, the moisture content at 25 blows:

Y = 3.754 + 0.605(25)
Y = 18.879

At 25 blows the corresponding liquid limit will be 18.9%.
c) The Pearson's Correlation Coefficient from the table given.
X       Y       X²    Y²       XY
4       5.97    16    35.64    23.88
5       6.98    25    48.72    34.90
6       6.99    36    48.86    41.94
7       8.99    49    80.82    62.93
8       7.99    64    63.84    63.92
∑: 30   36.92   190   277.88   227.57

Substituting the totals, with n = 5, gives:

r = [5(227.57) - (30)(36.92)] / √{[5(190) - (30)²][5(277.88) - (36.92)²]}
r = 30.25 / √(50 × 26.314)
r = 0.834
3. Mechanical Engineering Application Problem
The results obtained from a Tensile Test on a Steel Specimen are shown
below:

Tensile Force (kN)   4.8   9.3   12.8   17.7   21.6   26
Extension (mm)       3.5   8.2   10.1   15.6   18.4   20.8

Assuming a linear relationship, determine:
a. The equation of the regression line of extension on force.
b. The value of extension when the force is 16 kN.
c. The Pearson's Correlation Coefficient from the table given.

Solution:
Tensile Force (X)   Extension (Y)   (X - X̅)   (Y - Ῡ)   (X - X̅)²     (X - X̅)(Y - Ῡ)
4.8                 3.5             -10.567   -9.267    111.661      97.924
9.3                 8.2             -6.067    -4.567    36.808       27.708
12.8                10.1            -2.567    -2.667    6.589        6.846
17.7                15.6            2.333     2.833     5.443        6.609
21.6                18.4            6.233     5.633     38.850       35.110
26                  20.8            10.633    8.033     113.061      85.415
X̅ = 15.367          Ῡ = 12.767                          ∑: 312.413   ∑: 259.613
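Substituting in Equation 1 gives:

Y = a0 + a1 X

a1 = ∑(X - X̅)(Y - Ῡ) / ∑(X - X̅)²
a1 = 259.613 / 312.413
a1 = 0.83

a0 = Ῡ - a1 X̅ = 12.767 - 0.83(15.367)
a0 = 0.0124

(The intercept is sensitive to rounding here: carrying the slope as 0.831 gives a0 ≈ -0.003.)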

a) Therefore, the least squares linear regression equation is given as

Y = 0.0124 + 0.83X

b) Hence, when the tensile force, X, is 16 kN:

Y = 0.0124 + 0.83(16)
Y = 13.2924

Therefore, the extension is 13.29 mm when the force is 16 kN.
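Because the intercept in this problem is very close to zero, it is worth checking how much the two-decimal rounding of the slope matters. The sketch below recomputes the fit at full precision; the variable names are illustrative assumptions.

```python
force = [4.8, 9.3, 12.8, 17.7, 21.6, 26]         # X, kN
extension = [3.5, 8.2, 10.1, 15.6, 18.4, 20.8]   # Y, mm

n = len(force)
x_bar = sum(force) / n
y_bar = sum(extension) / n
a1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(force, extension))
      / sum((x - x_bar) ** 2 for x in force))
a0 = y_bar - a1 * x_bar

y_full = a0 + a1 * 16            # full-precision coefficients
y_slides = 0.0124 + 0.83 * 16    # rounded coefficients from the slides

print(round(a1, 4), round(a0, 4), round(y_full, 3), round(y_slides, 3))
```

At full precision the slope is about 0.831 and the intercept is slightly negative (about -0.003) rather than 0.0124, yet both sets of coefficients predict an extension of about 13.29 mm at 16 kN.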

c) The Pearson's Correlation Coefficient from the table given.
X         Y      X²        Y²        XY
4.8       3.5    23.04     12.25     16.8
9.3       8.2    86.49     67.24     76.26
12.8      10.1   163.84    102.01    129.28
17.7      15.6   313.29    243.36    276.12
21.6      18.4   466.56    338.56    397.44
26        20.8   676       432.64    540.8
∑: 92.2   76.6   1729.22   1196.06   1436.7
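Substituting the totals, with n = 6, gives:

r = [6(1436.7) - (92.2)(76.6)] / √{[6(1729.22) - (92.2)²][6(1196.06) - (76.6)²]}
r = 1557.68 / √(1874.48 × 1308.80)
r = 0.994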
Question 4.
Eight tomato plants of the same variety were selected at random and treated weekly with a solution in which x grams of fertilizer was dissolved in a fixed quantity of water. The yield, y kilograms of tomatoes, was recorded.

Plant A B C D E F G H
X 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5
Y 3.9 4.4 5.8 6.6 7.0 7.1 7.3 7.7
a. Calculate the equation of least squares regression line of y on x.
b. Estimate the yield of a plant treated, weekly with 3.2 grams of
fertilizer.
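The slides give no worked solution for Question 4. One possible way to carry out parts (a) and (b), using the same deviation formulas as the earlier examples, is sketched below; it is an illustration, not the presenters' own working.

```python
x = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5]   # grams of fertilizer
y = [3.9, 4.4, 5.8, 6.6, 7.0, 7.1, 7.3, 7.7]   # yield, kg

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)

a1 = sxy / sxx               # slope of y on x
a0 = y_bar - a1 * x_bar      # intercept
estimate = a0 + a1 * 3.2     # part (b): yield at 3.2 g per week

print(f"y = {a0:.3f} + {a1:.3f}x, estimate at 3.2 g = {estimate:.2f} kg")
```

Since 3.2 g lies inside the range of fertilizer amounts tested, the estimate is an interpolation rather than an extrapolation.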

Question 5.
Over a period of one year a greengrocer sells tomatoes at six different prices (x pence per kilogram). He calculates the average number of kilograms, y, sold per day at each of the six different prices. From these data, the following were calculated:

∑X = 200   ∑Y = 436   ∑X² = 7250   ∑Y² = 39234   ∑XY = 12515   N = 6

a. Calculate the value of the product moment correlation coefficient.
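Because Question 5 supplies only the summary totals, the coefficient can be computed directly from them. The sketch below assumes the usual computational (sums) form of the product-moment coefficient; no worked answer appears in the slides, and the variable names are illustrative.

```python
from math import sqrt

# Summary totals quoted in Question 5
n = 6
sum_x, sum_y = 200, 436
sum_x2, sum_y2 = 7250, 39234
sum_xy = 12515

numerator = n * sum_xy - sum_x * sum_y
denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 3))
```

The numerator is negative (75090 - 87200 = -12110), so r comes out close to -0.96: a strong negative linear correlation, with higher prices going with lower daily sales.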

6.0 REFERENCES
• Army, U., 2009. Buttelake. [Online] Available at:
http://www.buttelake.com/corr.htm [Accessed 20 February 2020].
• Chatterjee, S. & Hadi, A. S., 2006. Regression Analysis by Example.
4th ed. s.l.:John Wiley & Sons.
• Cho, H. A. & Golberg, M. A., 2004. Introduction to Regression
Analysis. Las Vegas: WIT Press.

• Granville, V., 2012. Data Science Central. [Online] Available at: https://www.statisticshowto.datasciencecentral.com/probability-and-statistics/correlation-coefficient-formula/ [Accessed 19 February 2020].

• Kreyszig, E., 2011. Advanced Engineering Mathematics. 10th ed. Ohio: John Wiley and Sons Inc.
• Lani, J., 2013. Statistical Solutions. [Online] Available at:
https://www.statisticssolutions.com/what-is-linear-regression/
[Accessed 21 February 2020].

THANK YOU FOR
LISTENING
QUESTIONS???
