
WELCOME TO THE PRESENTATION
A REPORT ON THE STUDY OF
Correlation, correlation coefficient and its significance

SUBMITTED BY:
HAFIJ AL MAHMUD (1044)
MD. HABIBUR RAHMAN (1033)
SHEHAB MOHAMMAD (1524)
MD. NAZMUS SAKIB (1051)

Department of Biochemistry and Molecular Biology


Jahangirnagar University
Savar, Dhaka, Bangladesh
CORRELATION:
 Correlation is one of the most common and most useful statistics. A correlation is a single number that describes the degree of relationship between two variables.
 E.g. the correlation between the demand for a product and its price.
 When two variables are correlated, knowing the value of one variable allows you to predict the value of the other variable.
A PERFECT CORRELATION:
 A perfect correlation of ±1 occurs only when the data points all lie exactly on a straight line. If r = +1, the slope of this line is positive. If r = -1, the slope of this line is negative.
 That is, all the variability in one variable is explained by the variability in the other variable.
[Figure: demand (y-axis, 0 to 4) plotted against supply (x-axis, 0 to 10)]
PERFECT CORRELATION:

 Few, if any, psychological variables are perfectly correlated with each other.
 Many non-psychological variables do have a perfect correlation.
   E.g. the time since the beginning of class and the time remaining in the class are perfectly correlated.
 What are other examples of perfectly correlated variables?
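The class-time example can be checked numerically. The sketch below is only an illustration: the 50-minute class length and the 10-minute sampling interval are assumptions, and `pearson_r` is a hypothetical helper written from the standard definition of Pearson's r.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: sum of cross-products over the root of the two sums of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Assumption: a 50-minute class, observed every 10 minutes.
CLASS_LENGTH = 50
elapsed = [0, 10, 20, 30, 40, 50]
remaining = [CLASS_LENGTH - t for t in elapsed]

r_class = pearson_r(elapsed, remaining)
print(round(r_class, 6))
```

Because the time remaining falls by exactly one minute for every minute elapsed, the points lie on a straight line with negative slope, so this is a perfect negative correlation (r = -1).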
LESS THAN PERFECT CORRELATIONS:
 Even if two variables are correlated, most of the time you cannot perfectly predict the value of one variable given the other.
 E.g., other variables besides the amount of time spent studying influence your GPA.
   Some of the variability in people’s GPA is due to the amount of time spent studying, but not all of the variability is due to it.
LESS THAN PERFECT CORRELATIONS:

Study Time   IQ    GPA
3            80    2.0
3            100   2.5
3            120   3.0
5            80    2.5
5            100   3.0
5            120   3.5

[Figure: GPA (y-axis, 0 to 4) plotted against Study Hours per Day (x-axis, 0 to 10)]
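As a rough check of the table above, the following sketch computes Pearson's r between each predictor and GPA. The six rows are taken from the slide. The helper `pearson_r` and the variable names are illustrative assumptions, not part of the original deck.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# The six rows of the slide's table.
study_hours = [3, 3, 3, 5, 5, 5]
iq = [80, 100, 120, 80, 100, 120]
gpa = [2.0, 2.5, 3.0, 2.5, 3.0, 3.5]

r_hours_gpa = pearson_r(study_hours, gpa)
r_iq_gpa = pearson_r(iq, gpa)

print(f"study time vs GPA: r = {r_hours_gpa:.3f}")
print(f"IQ vs GPA:         r = {r_iq_gpa:.3f}")
```

Neither correlation is perfect (roughly 0.52 for study time and 0.85 for IQ), because each variable accounts for only part of the variability in GPA.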
PEARSON’S PRODUCT MOMENT CORRELATION COEFFICIENT:
 The quantity r, called the linear correlation coefficient, measures the strength and the direction of a linear relationship between two variables. The linear correlation coefficient is sometimes referred to as the Pearson product moment correlation coefficient in honor of its developer, Karl Pearson.
 Pearson’s r must lie in the range of -1 to +1 inclusive.
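The slides do not spell out the formula, so the following is a minimal sketch based on the standard textbook definition: r is the sum of the cross-products of the deviations divided by the square root of the product of the two sums of squared deviations. The function name and the five data points are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """r = S_xy / sqrt(S_xx * S_yy), where S denotes sums of deviation products."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

# Made-up illustrative data, not taken from the presentation.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r_xy = pearson_r(x, y)
r_yx = pearson_r(y, x)  # swapping the variables leaves r unchanged
print(r_xy, r_yx)
```

The formula treats x and y symmetrically, so the order of the two variables does not matter, and the result always lies between -1 and +1.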
INTERPRETATION OF PEARSON’S R:
To interpret Pearson’s r, you must consider two parts of it:
 The sign of r
 The magnitude, or absolute value, of r
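THE SIGN OF R:
 When r is greater than 0 (i.e. its sign is positive), the variables are said to have a direct relation.
   In a direct relation, as the value of one variable increases, the value of the other variable also tends to increase.
 When r is less than 0 (i.e. its sign is negative), the variables have an inverse relation: as the value of one variable increases, the value of the other variable tends to decrease.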
IS THE SIGN OF R + OR - ?
 As the number of cigarettes smoked per day increases, the chance of cancer tends to increase.
 As the number of cats in a farm yard increases, the number of mice tends to decrease.
 As the dose of paracetamol taken increases, the chance of liver damage tends to increase.
THE MAGNITUDE OF R:
 The magnitude refers to the size of the correlation coefficient, ignoring the sign of r.
 The magnitude is equivalent to taking the absolute value of r.
 The larger the magnitude of r is, the more closely the two variables are related to each other.
 The smaller the magnitude of r is, the less closely the two variables are related to each other.
WHEN R = 1:
 When r equals 1.0, there is a perfect correlation between the variables.
 Knowing the value of one variable exactly predicts the value of the other variable.
[Figure: hours (y-axis, 0.0 to 1.0) plotted against minutes (x-axis, 0 to 60)]
WHEN R = 0:
 When r equals 0, either the assumptions of correlation have been violated or there is no relation between the two variables.
 The points in a scatter plot with r = 0 will tend to form a circular cluster.
[Figure: scatter plot with both axes running from -10 to 10]
 A correlation greater than 0.8 is generally described as strong, whereas a correlation less than 0.5 is generally described as weak. These values can vary based upon the "type" of data being examined. A study utilizing scientific data may require a stronger correlation than a study using social science data.
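The sign and magnitude rules can be combined into a small helper. This is only a sketch: the 0.8 and 0.5 cut-offs come from the slide, while the function name `describe_r` and the labels "inverse", "none" and "moderate" are assumptions added to cover the cases the slide does not name.

```python
def describe_r(r):
    """Return (direction, strength) labels for a correlation coefficient."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("Pearson's r must lie between -1 and +1")

    # The sign gives the direction of the relation.
    if r > 0:
        direction = "direct"
    elif r < 0:
        direction = "inverse"   # assumed label for a negative sign
    else:
        direction = "none"      # assumed label for r = 0

    # The magnitude (absolute value) gives the strength.
    magnitude = abs(r)
    if magnitude > 0.8:
        strength = "strong"
    elif magnitude < 0.5:
        strength = "weak"
    else:
        strength = "moderate"   # assumed label for the 0.5 to 0.8 band

    return direction, strength

print(describe_r(0.922))
print(describe_r(-0.3))
```

As the slide notes, these thresholds are conventions that vary with the type of data, so they would normally be adjusted to the field of study.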
COEFFICIENT OF DETERMINATION, R²:
 The coefficient of determination, r², is useful because it gives the proportion of the variance (fluctuation) of one variable that is predictable from the other variable.
 The coefficient of determination is such that 0 ≤ r² ≤ 1, and denotes the strength of the linear association between x and y.
COEFFICIENT OF DETERMINATION, R²:
 The coefficient of determination measures how well the line of best fit represents the data. For example, if r = 0.922, then r² = 0.850, which means that 85% of the total variation in y can be explained by the linear relationship between x and y (as described by the regression equation). The other 15% of the total variation in y remains unexplained.
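The "explained variation" reading of r² can be verified directly. The sketch below fits a least-squares line and compares r² with one minus the unexplained share of the variation. It reuses the IQ and GPA columns from the study-time table earlier in the presentation; all variable names are illustrative assumptions.

```python
import math

# IQ and GPA columns from the study-time table.
iq = [80, 100, 120, 80, 100, 120]
gpa = [2.0, 2.5, 3.0, 2.5, 3.0, 3.5]

n = len(iq)
mean_x = sum(iq) / n
mean_y = sum(gpa) / n
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(iq, gpa))
s_xx = sum((x - mean_x) ** 2 for x in iq)
s_yy = sum((y - mean_y) ** 2 for y in gpa)  # total variation in y

# Least-squares line of best fit: y = slope * x + intercept.
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x
predicted = [slope * x + intercept for x in iq]

# Variation in y left unexplained by the line.
ss_residual = sum((y - p) ** 2 for y, p in zip(gpa, predicted))

r = s_xy / math.sqrt(s_xx * s_yy)
r_squared = r ** 2
explained_fraction = 1 - ss_residual / s_yy

print(f"r^2 = {r_squared:.4f}, explained fraction = {explained_fraction:.4f}")

# The slide's own arithmetic: r = 0.922 gives r^2 of about 0.850.
slide_r_squared = round(0.922 ** 2, 3)
```

The two quantities agree (about 0.73 for this data), which is why r² can be read as the share of the variation in y that the regression line accounts for.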
LINEAR RELATION
 Pearson’s r, in its simplest form, only works for variables that are linearly related.
   That is, the equation that allows us to predict the value of one variable from the value of the other is a line:

Y = slope * X + intercept
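To see why linearity matters, the sketch below compares a straight line with a parabola over the same x values. Both data sets are invented for illustration, and `pearson_r` is a hypothetical helper written from the standard definition.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of deviation products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = list(range(-5, 6))            # -5, -4, ..., 5

line = [2 * x + 1 for x in xs]     # Y = slope * X + intercept
parabola = [x * x for x in xs]     # Y depends on X, but not linearly

r_line = pearson_r(xs, line)
r_parabola = pearson_r(xs, parabola)

print(f"line:     r = {r_line:.3f}")
print(f"parabola: r = {r_parabola:.3f}")
```

The parabola gives r = 0 even though Y is completely determined by X, so a value of r near zero rules out a linear relation only, not every kind of relation.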
SIGNIFICANCE OF CORRELATION AND COEFFICIENT:
 Factors in relationships between two variables:
   The strength of the relationship
   The significance of the relationship
SIGNIFICANCE OF CORRELATION AND COEFFICIENT:
 The strength of the relationship:
   is indicated by the correlation coefficient: r
   but is actually measured by the coefficient of determination: r²
   The larger the correlation, the stronger the relationship.
 The significance of the relationship:
   is expressed in probability levels: p (e.g., significant at p = .05)
   The smaller the p-level, the more significant the relationship.
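Strength and significance are separate questions, and a small sample can show this. The sketch below uses an exact permutation test, which is a different technique from the test SPSS reports and is named here only as an illustration. It asks how often a correlation at least as large as the observed one would arise if the GPA values were paired with the IQ values at random. The data are the IQ and GPA columns from the study-time table earlier in the presentation.

```python
import itertools
import math

def pearson_r(xs, ys):
    """Pearson's r from sums of deviation products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

iq = [80, 100, 120, 80, 100, 120]
gpa = [2.0, 2.5, 3.0, 2.5, 3.0, 3.5]

r_observed = pearson_r(iq, gpa)

# Try every possible re-pairing of GPA with IQ (6! = 720 orderings).
total = 0
at_least_as_extreme = 0
for shuffled in itertools.permutations(gpa):
    total += 1
    # Two-sided comparison; the small tolerance guards against rounding.
    if abs(pearson_r(iq, shuffled)) >= abs(r_observed) - 1e-12:
        at_least_as_extreme += 1

p_value = at_least_as_extreme / total
print(f"r = {r_observed:.3f}, permutation p = {p_value:.3f}")
```

With only six observations, even this fairly strong correlation (r of about 0.85) has a p-level of about 0.09, which is not significant at p = .05.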
USING SPSS FOR CORRELATION:
 SPSS (Statistical Package for the Social Sciences) is a software package used for conducting statistical analyses, manipulating data, and generating tables and graphs that summarize data.
STEPS TO CREATE A GRAPH
 In the SPSS Output Viewer, you will see a
table with the requested descriptive statistics
and correlations.
 The Descriptive Statistics section gives the
mean, standard deviation, and number of
observations (N) for each of the variables
that you specified.
 For example, the mean of GPT is 273.00, the
standard deviation of the rather stay at home
variable is 205.39811, and there were 20
observations (N) for each of the two
variables.
 The Correlations section gives the values of the specified correlation tests, in this case, Pearson's r. Each row of the table corresponds to one of the variables. Each column also corresponds to one of the variables.
 In this example, the cell at the bottom row of the right column represents the correlation of GOT with GOT. Kind of silly, isn't it! This correlation must always be 1.0 (why?). Likewise, the cell at the middle row of the middle column represents the correlation of rather stay at home with rather stay at home. It, too, must always be 1.0.
 The cell at the middle row and right column (or equivalently, the bottom row and middle column) is more interesting. This cell represents the correlation of GPT and rather stay at home (or rather stay at home with GPT; it doesn't matter. Why?).
 There are three numbers in these cells. The top number is the correlation coefficient. The correlation coefficient in this example is .979. The middle number is the significance of this correlation; in this case, it is .000. (The significance basically tells us whether we would expect a correlation this large purely due to chance factors and not due to an actual relation.) The bottom number is the number of observations (N).
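A significance value this small can be sanity-checked by hand. The sketch below converts the reported r = .979 with N = 20 into a t statistic, assuming the usual t test for a Pearson correlation with N - 2 degrees of freedom. The critical value 2.101 is the standard two-tailed 5% table value for 18 degrees of freedom, and the variable names are illustrative.

```python
import math

r = 0.979   # correlation coefficient reported in the output
n = 20      # number of observations (N)

# t statistic for testing whether the true correlation is zero.
degrees_of_freedom = n - 2
t_stat = r * math.sqrt(degrees_of_freedom / (1 - r ** 2))

# Two-tailed 5% critical value of t for 18 degrees of freedom (from tables).
T_CRITICAL_05 = 2.101
significant = abs(t_stat) > T_CRITICAL_05

print(f"t = {t_stat:.2f} on {degrees_of_freedom} df, significant: {significant}")
```

The t statistic comes out at roughly 20, far beyond the critical value, which is consistent with a significance that displays as .000.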
THANK YOU ALL
