Correlation and Regression

PROF. MYLLAH D. GARCIA

Objectives:

Draw a scatter plot for a set of ordered pairs.
Compute the correlation coefficient.
Test the hypothesis.
Compute the equation of the regression line.
Compute the coefficient of determination.
Compute the standard error of the estimate.
Find the prediction interval.
Be familiar with the concept of multiple regression.

Correlation

the values of the other variable.

Correlation analysis is a group of techniques used to measure the strength of the association between two variables.

Scatter Diagram

variables.

Example: Construct a scatter plot for the data obtained in a study of age and systolic blood pressure of six randomly selected subjects.

Subject   Age (x)   Pressure (y)
A         43        128
B         48        120
C         56        135
D         61        143
E         67        141
F         70        152

Scatter Plot

[Figure: scatter plot of the six (x, y) data points; horizontal axis from 40 to 75, vertical axis from 0 to 160]

larger values of the independent variable, and vice versa. The values are on a straight line, and therefore one can say that there is a perfect positive association between the variables. Perfect association rarely occurs when sample data are collected.

associated with smaller values of the independent variable, and vice versa. The values are on a straight line, and therefore one can say that there is a perfect negative association between the variables. Again, perfect association rarely occurs with sample data.

somewhat closely packed together in a linear manner, and so one can say that there is a very strong positive association between the variables.

relatively closely packed together in a somewhat linear pattern, and so one can say that there is a very strong negative association between the variables.

No association

very little association between the variables.

Nonlinear association

relationship. We will not study such relationships in this text but will concentrate only on linear relationships between two variables.

Correlation Coefficient

The linear correlation coefficient r measures the strength of the linear correlation between paired quantitative x- and y-values in a sample.

The linear correlation coefficient is sometimes referred to as the Pearson product moment correlation coefficient in honor of Karl Pearson (1857-1936), who originally developed it.

Coefficient of Correlation

If there is a perfect positive linear relationship between the variables, the value of r will be equal to +1.

If there is a perfect negative linear relationship between the variables, the value of r will be equal to -1.

If there is a strong positive linear relationship between the variables, the value of r will be close to +1.

If there is a strong negative linear relationship between the variables, the value of r will be close to -1.

If there is little or no linear relationship between the variables, the value of r will be close to 0.
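The behavior of r at its extremes can be checked numerically. The sketch below assumes the usual computational form of the Pearson coefficient, r = (nΣxy - ΣxΣy) / sqrt((nΣx² - (Σx)²)(nΣy² - (Σy)²)); the function name `pearson_r` and the small data sets are illustrative only and do not come from the slides.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient from the computational formula (assumed form)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("xs and ys must have the same length, at least 2")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Illustrative data: points lying exactly on a rising line and on a falling line.
xs = [1, 2, 3, 4, 5]
r_up = pearson_r(xs, [2 * x + 1 for x in xs])     # perfect positive relationship
r_down = pearson_r(xs, [10 - 3 * x for x in xs])  # perfect negative relationship
print(r_up, r_down)
```

If every x (or every y) is the same, the denominator is zero and r is undefined; the sketch does not guard against that case.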

Example

and the dependent variable y.

Answer

variable y. That is, the higher the value of x, the lower the value of y.

Example

Answer

Example

Subject   Age (x)   Pressure (y)
A         43        128
B         48        120
C         56        135
D         61        143
E         67        141
F         70        152

Answer

Subject   Age (x)   Pressure (y)   xy       x²       y²
A         43        128            5,504    1,849    16,384
B         48        120            5,760    2,304    14,400
C         56        135            7,560    3,136    18,225
D         61        143            8,723    3,721    20,449
E         67        141            9,447    4,489    19,881
F         70        152            10,640   4,900    23,104

The correlation coefficient suggests a strong positive relationship between age and blood pressure.
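The correlation coefficient for the age and blood pressure data can be checked with a short script. This is a sketch that assumes the standard computational formula for r (the slide's own formula did not survive extraction); the variable names are illustrative.

```python
from math import sqrt

# Age (x) and systolic blood pressure (y) for subjects A to F.
ages = [43, 48, 56, 61, 67, 70]
pressures = [128, 120, 135, 143, 141, 152]

n = len(ages)
sum_x = sum(ages)                                     # 345
sum_y = sum(pressures)                                # 819
sum_xy = sum(x * y for x, y in zip(ages, pressures))  # 47,634
sum_x2 = sum(x * x for x in ages)                     # 20,399
sum_y2 = sum(y * y for y in pressures)                # 112,443

# Assumed formula: r = (n*Sxy - Sx*Sy) / sqrt((n*Sx2 - Sx^2) * (n*Sy2 - Sy^2))
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(round(r, 3))  # 0.897
```

A value this close to +1 is consistent with the strong positive relationship described above.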

Step 1: State the hypotheses.
Step 2: Find the critical values.
Step 3: Compute the test value.
Step 4: Make the decision.
Step 5: Summarize the results.

The population correlation coefficient ρ is the correlation computed by using all possible pairs of data values (x, y) taken from a population.

This alternative hypothesis means that there is a

population.

Coefficient

Example

Solution:

Step 1:

Step 2: From the t table, the critical values are

Step 3:

Step 5: There is a significant relationship between age and blood pressure.
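The test value for the age and blood pressure example can be sketched as follows. Two things are assumptions here because the slide's formulas and significance level did not survive extraction: the test statistic is taken to be the usual t = r·sqrt((n - 2) / (1 - r²)) with n - 2 degrees of freedom, and the critical value ±2.776 corresponds to an assumed two-tailed α = 0.05 with 4 degrees of freedom.

```python
from math import sqrt

ages = [43, 48, 56, 61, 67, 70]
pressures = [128, 120, 135, 143, 141, 152]

def pearson_r(xs, ys):
    """Pearson r from the computational formula (assumed form)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    return (n * sum_xy - sum_x * sum_y) / sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )

def t_statistic(r, n):
    """Assumed test statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt((n - 2) / (1 - r * r))

n = len(ages)
r = pearson_r(ages, pressures)
t = t_statistic(r, n)
degrees_of_freedom = n - 2

# Assumption: two-tailed test at alpha = 0.05; t-table value for 4 degrees of freedom.
ASSUMED_CRITICAL_VALUE = 2.776
reject_null = abs(t) > ASSUMED_CRITICAL_VALUE
print(f"t = {t:.2f}, df = {degrees_of_freedom}, reject H0: {reject_null}")
```

Under these assumptions the test value lies well beyond the critical value, which agrees with the conclusion in Step 5. Using r rounded to 0.897 instead of the unrounded value shifts t slightly, so hand calculations may differ in the second decimal place.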

costs of a slice of pizza and the subway fares. Use

significance level.

Answer

Step 1:

Step 2: From the t table at n = 6 and

Step 3:

Step 5: We conclude that there is sufficient evidence to support a claim of a linear correlation between the cost of a slice of pizza and subway fares.

Regression

predictions.

Regression Equation: an equation that defines the relationship between two variables.

line of best fit. Best fit means that the sum of the squares of the vertical distances from each point to the line is at a minimum.

The reason one needs a line of best fit is that the values of y will be predicted from the values of x; hence, the closer the points are to the line, the better the fit and the prediction will be.
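The least-squares idea can be illustrated with the age and blood pressure data. This sketch assumes the usual closed-form formulas for the slope, b = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²), and the intercept, a = (Σy - bΣx) / n; the function names are illustrative. It then confirms that nudging either coefficient away from the fitted values increases the sum of squared vertical distances.

```python
ages = [43, 48, 56, 61, 67, 70]
pressures = [128, 120, 135, 143, 141, 152]

def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least-squares line y' = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope

def sum_squared_errors(xs, ys, intercept, slope):
    """Sum of squared vertical distances from each point to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

a, b = least_squares_line(ages, pressures)
best = sum_squared_errors(ages, pressures, a, b)

# Any other line does worse: shift the intercept or tilt the slope slightly.
shifted = sum_squared_errors(ages, pressures, a + 1.0, b)
tilted = sum_squared_errors(ages, pressures, a, b + 0.05)
print(f"y' = {a:.2f} + {b:.3f}x, SSE = {best:.1f}")
```

The fitted coefficients round to an intercept of 81.05 and a slope of 0.96.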

between the actual Y values and the predicted values of Y.

Regression equation

regression equation algebraically describes the relationship between the two variables x and y. The graph of the regression equation is called the regression line (or line of best fit, or least-squares line).

line.

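Subject   Age (x)   Pressure (y)   xy       x²       y²
A         43        128            5,504    1,849    16,384
B         48        120            5,760    2,304    14,400
C         56        135            7,560    3,136    18,225
D         61        143            8,723    3,721    20,449
E         67        141            9,447    4,489    19,881
F         70        152            10,640   4,900    23,104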

Answer

Regression Line

[Figure: scatter plot of the data with the fitted regression line f(x) = 0.96x + 81.05; horizontal axis from 40 to 75, vertical axis from 0 to 160]

Example

pressure for a person who is 50 years old.

Solution:

at x = 50

In other words, the predicted systolic blood pressure for a 50-year-old person is about 129 mm Hg.
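The substitution can be reproduced in a few lines. The sketch uses the rounded regression line f(x) = 0.96x + 81.05 shown on the regression chart; the function name `predict_pressure` is illustrative.

```python
def predict_pressure(age, intercept=81.05, slope=0.96):
    """Predicted systolic blood pressure from the rounded regression line."""
    return intercept + slope * age

predicted = predict_pressure(50)
print(round(predicted))  # 129
```

Predictions of this kind are most trustworthy for ages inside the range of the sample data (43 to 70 here).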

1. For any specific value of the independent variable x, the value of the dependent variable y must be normally distributed about the regression line.
2. The standard deviation of each of the dependent variables must be the same for each value of the independent variable.

of r using

. Determine whether there is

sufficient evidence to support a claim of a linear

correlation between the two variables.

Listed below are systolic blood pressure measurements (in mm Hg) obtained from the same woman. Is there sufficient evidence to conclude that there is a linear correlation between right and left arm systolic blood pressure measurements?

Exercise

Making Predictions

Find the best predicted systolic blood pressure in the left arm given that the systolic blood pressure in the right arm is 100 mm Hg.

using

. Determine whether there is sufficient

evidence to support a claim of a linear correlation between

the two variables.

Listed below are the brain sizes (cm³) and Wechsler IQ scores of subjects. Is there sufficient evidence to conclude that there is a linear correlation between brain size and IQ score? Does it appear that people with larger brains are more intelligent?

Exercise

Find the best predicted IQ score of someone with a brain size of 1,275 cm³.


