
# Correlation

- Deviation & Computational Equations
- Testing Significance
- Intercorrelation Matrix
- Partial Correlations

Charles M. Friel Ph.D., Criminal Justice Center, Sam Houston State University

## Key Concepts

- Correlation
- Correlation coefficient
- Interpretation of the concepts of magnitude and direction
- Use of a scatterplot to diagnose correlation
- Deviation score formula for the Pearson Product-Moment Correlation Coefficient
- Karl Pearson (1857-1936)
- Concepts of:
  - Sum of cross products
  - Sums of squares of X & Y
- Computational formula for the Pearson Product-Moment Correlation Coefficient
- t-test for determining the significance of r and df
- Null hypothesis in determining the significance of r
- Coefficient of determination
- Coefficient of nondetermination
- Assumptions for the Pearson r:
  - Linear relationship
  - X & Y are metric variables
  - Randomly drawn sample
  - X & Y are normally distributed in the population
- The concept of a nonlinear relationship
- Intercorrelation matrix
- Caveats in interpreting an intercorrelation matrix
- Interpretation of a partial correlation
- Zero-order correlation
- 1st-, 2nd-, etc. order correlations


## Lecture Outline

- Coefficient
- Coefficients of correlation, determination & nondetermination
- Intercorrelation matrix


## Relationships

phenomena.

- Q: Why do some prisoners attempt to escape from prison and others do not?
- Q: Why do some judges have a constant backlog of cases while others run an efficient docket?
- Q: What factors account for the fact that some countries have a higher rate of violent crime than others?
- Q: Is violence in the media related to violent behavior in society?

cops"?


How can one determine if one variable is the “cause” of another?

Example

- Do liberal laws on the purchase and possession of firearms cause increases in the incidence of violent crimes involving weapons?

Principles of causality

- 1st: Are the two variables in question (X & Y) related? Is there a covariance between them?
- 2nd: Is there a replicable time sequence between the two variables, the variable thought to be causal (X) always preceding the effect (Y)?
- 3rd: Having eliminated or controlled for all other extraneous variables, can it be demonstrated that when X occurs Y always follows?


- The first step in determining whether one variable (X) is correlated with another variable (Y) involves …
- Is it true that as X increases, Y also increases, and to what extent?
- Or, is it true that as X increases, Y decreases, and to what extent?

Caveat

- The mere fact that two variables covary (i.e. correlate) is no proof that one is the cause of the other.
- Correlation does not necessarily prove causation.

## A Correlation Coefficient

A correlation coefficient is an index number that measures:

- the magnitude, and
- the direction

of the relationship between two variables.

It is designed to range in value between -1.0 and +1.0. The sign gives the direction of the relationship, and the absolute value (0.0 to 1.0) gives its magnitude.

-1.0 -0.8 -0.6 -0.4 -0.2 0.0 +0.2 +0.4 +0.6 +0.8 +1.0

- Negative relationship (below 0.0): as X increases, Y decreases.
- No relationship: 0.0.
- Positive relationship (above 0.0): as X increases, Y increases.


Statisticians have developed many techniques for determining the correlation between two or more variables.

The primary difference among these techniques is a function of the types of variables being correlated (i.e. nonmetric: nominal or ordinal; metric: interval or ratio).

- Pearson product-moment correlation coefficient (metric with metric)
- Spearman's rank-difference coefficient (rho) (ordinal with ordinal)
- Biserial coefficient (a metric variable with a metric variable that has been artificially reduced to two categories)
- Point biserial coefficient (a metric variable with a truly dichotomous variable)
- Tetrachoric correlation coefficient (two metric variables that have been artificially reduced to dichotomous categories)
- … variables)
- Partial correlation (two metric variables with the intercorrelation with a third variable removed from both of them)
- Kendall coefficient of concordance (three or more ordinal variables)
- Multiple correlation (one metric variable with two or more metric and/or nonmetric variables)


## The Scatterplot

A useful tool for visually identifying the presence of a possible relationship between two metric variables.

[Figure: scatterplot with AGE on the horizontal axis, annotated "Correlation r = -0.4174 (p < 0.001)"]

## The Scatterplot (cont.)

[Figure: two scatterplots; one horizontal axis is TIME TO DISPOSITION IN DAYS]


## An Example: Is There a Correlation Between Homicide & Rape?

The incidence of homicide and rape per 100,000 population in a sample of seven medium-size cities:

| City | Homicide (X) | Rape (Y) |
|---|---|---|
| A | 4 | 16 |
| B | 6 | 29 |
| C | 10 | 43 |
| D | 5 | 20 |
| E | 1 | 3 |
| F | 2 | 4 |
| G | 3 | 6 |
| Totals | 31 | 121 |

Scatterplot of Homicide v Rape

[Figure: scatterplot of homicide (vertical axis) against RAPE (horizontal axis, 0 to 50)]

- What is the magnitude of the relationship on a scale of 0.0 to 1.0?
- Is the direction of the relationship positive or negative?

## Pearson Product-Moment Correlation Coefficient

Karl Pearson (1857-1936): British mathematician and statistician who also developed the Chi-Square Test.

$$
r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}}
$$

Incidence of Homicide (X) and Rape (Y)

| City | X | Y | $(X - \bar{X})^2$ | $(Y - \bar{Y})^2$ | $(X - \bar{X})(Y - \bar{Y})$ |
|---|---|---|---|---|---|
| A | 4 | 16 | 0.1849 | 1.6641 | 0.5547 |
| B | 6 | 29 | 2.4649 | 137.1241 | 18.3847 |
| C | 10 | 43 | 31.0249 | 661.0041 | 143.2047 |
| D | 5 | 20 | 0.3249 | 7.3441 | 1.5447 |
| E | 1 | 3 | 11.7649 | 204.2041 | 49.0147 |
| F | 2 | 4 | 5.9049 | 176.6241 | 32.2947 |
| G | 3 | 6 | 2.0449 | 127.4641 | 16.1447 |

Mean number of homicides & rapes: $\bar{X} = 4.43$ and $\bar{Y} = 17.29$

- Sum of squared deviations in homicides = 53.71
- Sum of squared deviations in rapes = 1315.43
- Sum of cross products (SP) = 261.14

Calculation of the Pearson r

$$
r = \frac{261.14}{\sqrt{(53.71)(1315.43)}} = \frac{261.14}{265.80} \approx 0.982
$$
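The deviation-score calculation can be sketched in Python using the seven city scores from the table. This is an illustrative sketch, not the lecture's own code: the function name `pearson_deviation` and the variable names are assumptions. It works from unrounded means, so intermediate sums may differ in the last digits from a hand calculation that rounds the means to two decimals.

```python
import math

# Homicide (X) and rape (Y) rates per 100,000 for cities A-G, from the table.
homicide = [4, 6, 10, 5, 1, 2, 3]
rape = [16, 29, 43, 20, 3, 4, 6]


def pearson_deviation(x, y):
    """Return (r, SP, SSx, SSy) using the deviation-score equation."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross products of the deviations from each mean.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for X and for Y.
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y), sp, ss_x, ss_y


r, sp, ss_x, ss_y = pearson_deviation(homicide, rape)
print(f"SP = {sp:.2f}, SSx = {ss_x:.2f}, SSy = {ss_y:.2f}, r = {r:.3f}")
# SP = 261.14, SSx = 53.71, SSy = 1315.43, r = 0.982
```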

Interpretation

- The magnitude of the correlation between homicide and rape is r ≈ 0.982.
- The direction of the relationship is positive. As the incidence of homicide increases, so does the incidence of rape.
## An Alternative Way to Calculate a Pearson Correlation

- The previous equation is called a deviation score equation since the mean of each variable is subtracted from each respective case.
- An alternative computational equation is given below. It will yield the same result within rounding error.

$$
r = \frac{N \sum XY - (\sum X)(\sum Y)}{\sqrt{\left[N \sum X^2 - (\sum X)^2\right]\left[N \sum Y^2 - (\sum Y)^2\right]}}
$$


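| City | X | Y | $X^2$ | $Y^2$ | $XY$ |
|---|---|---|---|---|---|
| A | 4 | 16 | 16 | 256 | 64 |
| B | 6 | 29 | 36 | 841 | 174 |
| C | 10 | 43 | 100 | 1849 | 430 |
| D | 5 | 20 | 25 | 400 | 100 |
| E | 1 | 3 | 1 | 9 | 3 |
| F | 2 | 4 | 4 | 16 | 8 |
| G | 3 | 6 | 9 | 36 | 18 |
| Totals | 31 | 121 | 191 | 3407 | 797 |

$$
r = \frac{7(797) - (31)(121)}{\sqrt{\left[7(191) - (31)^2\right]\left[7(3407) - (121)^2\right]}} = \frac{1828}{\sqrt{(376)(9208)}} \approx 0.982
$$

This agrees with the value computed with the deviation score equation, within rounding error.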
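The computational equation needs only raw sums, which makes it easy to check in code. The sketch below is illustrative; the function name `pearson_computational` is an assumption, not something from the lecture.

```python
import math


def pearson_computational(x, y):
    """Pearson r from raw sums (the computational equation)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


homicide = [4, 6, 10, 5, 1, 2, 3]
rape = [16, 29, 43, 20, 3, 4, 6]
print(round(pearson_computational(homicide, rape), 3))  # 0.982
```

Because every quantity here is an integer until the final division, this form avoids the rounding that creeps in when the means are rounded before the deviations are taken.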


## Determining the Significance of a Correlation Coefficient

The problem

- Imagine a population in which X and Y are not related, so the correlation ρ = 0.0 (ρ = rho).
- Is it possible to draw a random sample from that population and find that the correlation between X & Y in the sample is not 0.0?
- Of course this is possible, but what is the probability of that happening?

A t-test


## A t-Test for the Significance of a Correlation Coefficient

$$
t = \frac{r \sqrt{N - 2}}{\sqrt{1 - r^2}}, \qquad df = N - 2
$$

- Null hypothesis: the correlation in the population between X & Y is ρ = 0.0.
- What is the probability, therefore, that the correlation obtained in the sample came from a population where the parameter ρ = 0.0?


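For the correlation between homicide and rape, r = 0.982:

$$
t = \frac{0.982 \sqrt{7 - 2}}{\sqrt{1 - (0.982)^2}} = \frac{2.196}{0.1889} \approx 11.63
$$

df = (N - 2) = (7 cities - 2) = 5

The critical value of t at df = 5 and p = 0.05 (two-tailed) is t = 2.571.

Interpretation

- The obtained t (11.63) exceeds the critical value (2.571), so the correlation is significant at p < 0.05.
- Decision: Reject the null hypothesis and affirm that the two variables are positively related in the population.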
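The test can be sketched as a small Python function. The names `t_for_correlation` and `T_CRITICAL_DF5` are assumptions for illustration. The critical value is entered by hand from a t table, since the standard library has no t distribution. The input r = 0.982 is the homicide/rape correlation computed from the raw totals and rounded to three decimals; t is sensitive to that rounding when r is close to 1.

```python
import math


def t_for_correlation(r, n):
    """Return (t, df) for testing H0: rho = 0 given a sample r from n cases."""
    df = n - 2
    # Undefined when r is exactly +1 or -1 (division by zero).
    t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)
    return t, df


T_CRITICAL_DF5 = 2.571  # two-tailed critical t at df = 5, p = 0.05, from a t table

t, df = t_for_correlation(0.982, 7)
print(f"t = {t:.2f}, df = {df}, reject H0: {abs(t) > T_CRITICAL_DF5}")
# t = 11.63, df = 5, reject H0: True
```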


## Coefficients of Determination & Nondetermination

e.g. r = 0.982

- $r^2$ = the coefficient of determination: $r^2 = (0.982)^2 \approx 0.96$. This is the proportion of variance in Y that can be explained by X; in percentage terms, 96%.
- $1 - r^2$ = the coefficient of nondetermination: $1 - r^2 = 1 - (0.982)^2 \approx 0.04$. This is the proportion of variance in Y that cannot be explained by X; in percentage terms, 4%.
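As a quick arithmetic check, the two coefficients can be computed together. The helper name `variance_split` is an assumption for illustration, and r = 0.982 is the homicide/rape correlation computed from the raw totals and rounded to three decimals.

```python
def variance_split(r):
    """Return (coefficient of determination, coefficient of nondetermination)."""
    determination = r ** 2
    return determination, 1 - determination


explained, unexplained = variance_split(0.982)
print(f"explained: {explained:.1%}, unexplained: {unexplained:.1%}")
# explained: 96.4%, unexplained: 3.6%
```

The two proportions always sum to 1, so reporting either one fixes the other.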


## Some Examples of SPSS Correlational Output

Q: Is there a correlation between the age of the offender and the length of sentence?

[Figure: scatterplot with AGE on the horizontal axis]

Correlation r = 0.826, p < 0.001

Correlations

AGE SENTENCE
AGE Pearson Correlation 1.000 .826**
Sig. (2-tailed) . .000
N 70 70
SENTENCE Pearson Correlation .826** 1.000
Sig. (2-tailed) .000 .
N 70 70
**. Correlation is significant at the 0.01 level (2-tailed).


Q: Is there a correlation between the age at first arrest and the length of sentence?

[Figure: scatterplot; horizontal axis ticks run from 12 to 24]

Correlation r = -0.417, p < 0.001

Correlations

AGE_FIRS SENTENCE
AGE_FIRS Pearson Correlation 1.000 -.417**
Sig. (2-tailed) . .000
N 70 70
SENTENCE Pearson Correlation -.417** 1.000
Sig. (2-tailed) .000 .
N 70 70
**. Correlation is significant at the 0.01 level (2-tailed).


Q: Is there a correlation between the time to case disposition and the length of sentence?

[Figure: scatterplot; horizontal axis ticks run from 20 to 160]

Correlation r = -0.084, p = 0.489

Correlations

TM_DISP SENTENCE
TM_DISP Pearson Correlation 1.000 -.084
Sig. (2-tailed) . .489
N 70 70
SENTENCE Pearson Correlation -.084 1.000
Sig. (2-tailed) .489 .
N 70 70


## Assumptions for the Pearson r

- That the relationship between X and Y can be represented by a straight line, i.e. it is linear.
- That X and Y are metric variables, measured on an interval or ratio scale of measurement.
- In using a t distribution to test the significance of the correlation coefficient:
  - That the sample was randomly drawn from the population, and
  - That X and Y are normally distributed in the population. This assumption is less important as the sample size increases.


## An Intercorrelation Matrix

Multiple correlations and their significance can be computed simultaneously and reported in an intercorrelation matrix.

Example

Intercorrelation of age, age at first arrest, number of prior arrests and convictions, and length of sentence

SPSS Intercorrelation Results

Correlations (Pearson correlation, with Sig. (2-tailed) in parentheses; N = 70 for every pair)

| | AGE | AGE_FIRS | PR_ARRST | PR_CONV | SENTENCE |
|---|---|---|---|---|---|
| AGE | 1.000 | -.312** (.009) | .179 (.138) | .302* (.011) | .826** (.000) |
| AGE_FIRS | -.312** (.009) | 1.000 | -.315** (.008) | -.358** (.002) | -.417** (.000) |
| PR_ARRST | .179 (.138) | -.315** (.008) | 1.000 | .795** (.000) | .246* (.040) |
| PR_CONV | .302* (.011) | -.358** (.002) | .795** (.000) | 1.000 | .400** (.001) |
| SENTENCE | .826** (.000) | -.417** (.000) | .246* (.040) | .400** (.001) | 1.000 |

**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
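An intercorrelation matrix is just the Pearson r for every pair of variables, arranged symmetrically with 1.0 on the diagonal. The sketch below shows one way to build it in plain Python. The function names and the six cases of sample data are hypothetical and are not the lecture's 70-case data set.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson r via the deviation-score equation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / math.sqrt(ss_x * ss_y)


def intercorrelation_matrix(data):
    """Map every ordered pair of variable names to its correlation."""
    names = list(data)
    matrix = {(name, name): 1.0 for name in names}
    for a, b in combinations(names, 2):
        # The matrix is symmetric: r(a, b) equals r(b, a).
        matrix[(a, b)] = matrix[(b, a)] = pearson_r(data[a], data[b])
    return matrix


# Hypothetical scores for six offenders (illustration only).
sample = {
    "AGE": [19, 23, 31, 27, 40, 35],
    "PR_ARRST": [1, 0, 4, 2, 6, 3],
    "SENTENCE": [2, 3, 10, 6, 18, 12],
}
matrix = intercorrelation_matrix(sample)
for row in sample:
    print(row.ljust(9), "  ".join(f"{matrix[(row, col)]: .3f}" for col in sample))
```

A matrix of k variables contains k(k - 1)/2 distinct correlations, so with many variables some coefficients can reach significance by chance alone.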


## Caveats in Interpreting an Intercorrelation Matrix

- Has each variable been checked for outliers that might lead to a Type I or II error?
- Has each pair of variables (X & Y) been checked for bivariate outliers that might lead to a Type I or II error?
- Can it be assumed that each variable is normally distributed in the population?
- Can it be assumed that each pair of variables is a random sample from the population?


## What Is the Meaning of a Linear Relationship?

The Pearson correlation assumes that the two variables are linearly related. What does this mean?

Example

[Figure: scatterplot with a straight line; AGE on the horizontal axis]

- Notice that the straight line is a "fair" representation of the relationship.
- The cases are about evenly distributed above and below the line.
- This is called homogeneity of the variance of Y over levels of X.
## What Is the Meaning of a Linear Relationship? (cont.)

Example

Age at first arrest and length of sentence

[Figure: scatterplot with a straight line; horizontal axis ticks run from 12 to 24]

- Notice that the straight line is not a "fair" representation of the relationship.
- The cases are not evenly distributed above and below the line.
- This is called heterogeneity of the variance of Y over levels of X.


each other.

computed …

- Eliminating the intercorrelation that both have with the third variable?

Example

- Age
- Age at first arrest
- Length of sentence


Correlations

AGE AGE_FIRS SENTENCE

AGE Pearson Correlation 1.000 -.312** .826**
Sig. (2-tailed) . .009 .000
N 70 70 70
AGE_FIRS Pearson Correlation -.312** 1.000 -.417**
Sig. (2-tailed) .009 . .000
N 70 70 70
SENTENCE Pearson Correlation .826** -.417** 1.000
Sig. (2-tailed) .000 .000 .
N 70 70 70
**. Correlation is significant at the 0.01 level (2-tailed).

Q: What is the correlation between age and length of sentence …

- Eliminating the intercorrelation of both variables with age at first arrest?

A: This problem can be solved by computing the partial correlation between age and sentence, controlling for age at first arrest.


What is the correlation of X and Y taking out the intercorrelation of both variables with Z?

[Figure: diagram of X and Y]

$r_{XY.Z}$ = the partial correlation between X and Y, partialling out the inter-relationship between X and Z, and Y and Z
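For reference, the standard first-order partial correlation formula expresses this definition in terms of the three zero-order correlations. The symbols $r_{XY}$, $r_{XZ}$ and $r_{YZ}$ are assumed here to denote the ordinary Pearson correlations between each pair of variables.

```latex
r_{XY.Z} = \frac{r_{XY} - r_{XZ}\, r_{YZ}}{\sqrt{\left(1 - r_{XZ}^{2}\right)\left(1 - r_{YZ}^{2}\right)}}
```

When Z is uncorrelated with both X and Y, the numerator and denominator reduce to $r_{XY}$ and 1, so the partial correlation equals the zero-order correlation.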


Example

What is the correlation between age (X) and length of sentence (Y), partialling out or controlling for age at first arrest (Z)?

Correlations

AGE AGE_FIRS SENTENCE

AGE Pearson Correlation 1.000 -.312** .826**
Sig. (2-tailed) . .009 .000
N 70 70 70
AGE_FIRS Pearson Correlation -.312** 1.000 -.417**
Sig. (2-tailed) .009 . .000
N 70 70 70
SENTENCE Pearson Correlation .826** -.417** 1.000
Sig. (2-tailed) .000 .000 .
N 70 70 70
**. Correlation is significant at the 0.01 level (2-tailed).

$$
r_{XY.Z} = \frac{.826 - (-.312)(-.417)}{\sqrt{\left[1 - (-.312)^2\right]\left[1 - (-.417)^2\right]}} \approx 0.806
$$


Notice the difference between the correlation (0.826) and the partial correlation (0.806) when controlled for age at first arrest.

## Partial Correlation Coefficient (cont.)

Example

What is the correlation between age at first arrest (X) and length of sentence (Y), partialling out or controlling for age (Z)?

Correlations

AGE AGE_FIRS SENTENCE

AGE Pearson Correlation 1.000 -.312** .826**
Sig. (2-tailed) . .009 .000
N 70 70 70
AGE_FIRS Pearson Correlation -.312** 1.000 -.417**
Sig. (2-tailed) .009 . .000
N 70 70 70
SENTENCE Pearson Correlation .826** -.417** 1.000
Sig. (2-tailed) .000 .000 .
N 70 70 70
**. Correlation is significant at the 0.01 level (2-tailed).

$$
r_{XY.Z} = \frac{-.417 - (-.312)(.826)}{\sqrt{\left[1 - (-.312)^2\right]\left[1 - (.826)^2\right]}} \approx -0.298
$$

Notice the difference between the correlation (-0.417) and the partial correlation (-0.298) when controlled for age.
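Both worked examples can be reproduced with a short function that takes the three zero-order correlations. The name `partial_correlation` is an assumption for illustration. Because the inputs are the three-decimal values printed in the SPSS table, the second result comes out at about -0.297, slightly different from the -0.298 shown above, which was presumably computed from unrounded correlations.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of X and Y, controlling for Z."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# Age (X) and sentence (Y), controlling for age at first arrest (Z).
print(round(partial_correlation(0.826, -0.312, -0.417), 3))  # 0.806

# Age at first arrest (X) and sentence (Y), controlling for age (Z).
print(round(partial_correlation(-0.417, -0.312, 0.826), 3))  # -0.297
```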

Age and sentence controlling for age at first arrest

--- PARTIAL CORRELATION COEFFICIENTS ---

AGE with SENTENCE: df = 67, P = .000

Age at first arrest and sentence controlling for age

--- PARTIAL CORRELATION COEFFICIENTS ---

Controlling for.. AGE

SENTENCE with AGE_FIRS: df = 67, P = .013


$r_{XY.ZZ'}$

More than one variable can be partialled out of a bivariate correlation.

[Figure: diagram of X, Y, Z and Z′]

Example

What is the correlation between age (X) and length of sentence (Y) …

- Partialling out prior arrests, time to disposition, prior convictions, drug use and the seriousness of the offense?


The bivariate correlations among the seven variables.

Correlations (Pearson correlation, with Sig. (2-tailed) in parentheses; N = 70 for every pair)

| | AGE | SENTENCE | PR_ARRST | TM_DISP | PR_CONV | DR_SCORE | SER_INDX |
|---|---|---|---|---|---|---|---|
| AGE | 1.000 | .826** (.000) | .179 (.138) | .048 (.692) | .302* (.011) | .252* (.036) | .609** (.000) |
| SENTENCE | .826** (.000) | 1.000 | .246* (.040) | -.084 (.489) | .400** (.001) | .346** (.003) | .744** (.000) |
| PR_ARRST | .179 (.138) | .246* (.040) | 1.000 | -.072 (.556) | .795** (.000) | -.003 (.979) | .502** (.000) |
| TM_DISP | .048 (.692) | -.084 (.489) | -.072 (.556) | 1.000 | -.066 (.589) | -.024 (.841) | .032 (.794) |
| PR_CONV | .302* (.011) | .400** (.001) | .795** (.000) | -.066 (.589) | 1.000 | .056 (.645) | .578** (.000) |
| DR_SCORE | .252* (.036) | .346** (.003) | -.003 (.979) | -.024 (.841) | .056 (.645) | 1.000 | .279* (.019) |
| SER_INDX | .609** (.000) | .744** (.000) | .502** (.000) | .032 (.794) | .578** (.000) | .279* (.019) | 1.000 |

**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).


The partial correlation of age and sentence, controlling for five other variables.

--- PARTIAL CORRELATION COEFFICIENTS ---

AGE with SENTENCE: df = 63, P = .000

Notice the difference between the correlation and the partial correlation between age and sentence.

- Correlation = +0.826

The correlation is lower when the intercorrelation with the other five variables is removed.
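A higher-order partial correlation can be obtained by applying the first-order formula repeatedly, removing one control variable at a time. The sketch below does this recursively, starting from the zero-order correlations in the seven-variable SPSS matrix. The names `PAIRS`, `R` and `partial_corr` are assumptions for illustration, and because the inputs are rounded to three decimals the result may differ slightly from what SPSS reports from the raw data.

```python
import math

# Zero-order correlations read from the seven-variable matrix (upper triangle).
PAIRS = {
    ("AGE", "SENTENCE"): 0.826, ("AGE", "PR_ARRST"): 0.179,
    ("AGE", "TM_DISP"): 0.048, ("AGE", "PR_CONV"): 0.302,
    ("AGE", "DR_SCORE"): 0.252, ("AGE", "SER_INDX"): 0.609,
    ("SENTENCE", "PR_ARRST"): 0.246, ("SENTENCE", "TM_DISP"): -0.084,
    ("SENTENCE", "PR_CONV"): 0.400, ("SENTENCE", "DR_SCORE"): 0.346,
    ("SENTENCE", "SER_INDX"): 0.744, ("PR_ARRST", "TM_DISP"): -0.072,
    ("PR_ARRST", "PR_CONV"): 0.795, ("PR_ARRST", "DR_SCORE"): -0.003,
    ("PR_ARRST", "SER_INDX"): 0.502, ("TM_DISP", "PR_CONV"): -0.066,
    ("TM_DISP", "DR_SCORE"): -0.024, ("TM_DISP", "SER_INDX"): 0.032,
    ("PR_CONV", "DR_SCORE"): 0.056, ("PR_CONV", "SER_INDX"): 0.578,
    ("DR_SCORE", "SER_INDX"): 0.279,
}
# Key by unordered pair so that lookups work in either order.
R = {frozenset(pair): r for pair, r in PAIRS.items()}


def partial_corr(R, x, y, controls):
    """Partial correlation of x and y with every variable in `controls` removed.

    Each call removes one control variable from correlations that already
    have the remaining controls partialled out.
    """
    if not controls:
        return R[frozenset((x, y))]
    z, rest = controls[0], controls[1:]
    r_xy = partial_corr(R, x, y, rest)
    r_xz = partial_corr(R, x, z, rest)
    r_yz = partial_corr(R, y, z, rest)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


controls = ["PR_ARRST", "TM_DISP", "PR_CONV", "DR_SCORE", "SER_INDX"]
r_partial = partial_corr(R, "AGE", "SENTENCE", controls)
print(f"zero-order r = 0.826, fifth-order partial r = {r_partial:.3f}")
```

With N = 70 cases and five control variables, the degrees of freedom are 70 - 2 - 5 = 63, which matches the df shown in the SPSS output.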
