
Correlation (Pearson, Kendall, Spearman)
Key concepts
• Correlation is a bivariate analysis that measures the
strength of association between two variables.
• In statistics, the value of the correlation coefficient
varies between +1 and -1. A value of exactly ±1 indicates a
perfect degree of association between the two variables, and
values close to ±1 indicate a strong association.
• As the value approaches 0, the relationship between
the two variables becomes weaker.
• Usually, in statistics, we measure three types of
correlation: Pearson correlation, Kendall rank
correlation and Spearman correlation.
• Pearson r correlation:
• Pearson r correlation is widely used in statistics to
measure the degree of the relationship between
linearly related variables.
• For inference with the Pearson r correlation (for example,
a significance test), both variables should be approximately
normally distributed.
• For example, in the stock market, if we want to
measure how two commodities are related to each
other, Pearson r correlation can be used to measure the
degree of relationship between them.
The following formula is used to calculate the Pearson
r correlation:
$$ r = \frac{N\sum xy - \sum x \sum y}{\sqrt{\left[N\sum x^2 - \left(\sum x\right)^2\right]\left[N\sum y^2 - \left(\sum y\right)^2\right]}} $$
• Where:
• r = Pearson r correlation coefficient
• N = number of paired values (the size of each data set)
• ∑xy = sum of the products of paired scores
• ∑x = sum of x scores
• ∑y = sum of y scores
• ∑x² = sum of squared x scores
• ∑y² = sum of squared y scores
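The sums listed above can be combined directly in code. The following is a minimal Python sketch, assuming the standard computational form of the Pearson formula (written out in the comment); the function name `pearson_r` and the sample data are hypothetical and chosen only for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson r from raw sums.

    r = (N*Sxy - Sx*Sy) / sqrt((N*Sx2 - Sx^2) * (N*Sy2 - Sy^2))
    where Sx = sum of x, Sxy = sum of x*y, Sx2 = sum of x squared, etc.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and hold at least two values")
    n = len(x)                                   # N: number of paired values
    sum_x = sum(x)                               # sum of x scores
    sum_y = sum(y)                               # sum of y scores
    sum_xy = sum(a * b for a, b in zip(x, y))    # sum of products of pairs
    sum_x2 = sum(a * a for a in x)               # sum of squared x scores
    sum_y2 = sum(b * b for b in y)               # sum of squared y scores
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator

# Hypothetical paired observations, for illustration only.
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]
r = pearson_r(x_scores, y_scores)
print(round(r, 4))
```

Working from raw sums mirrors the symbol list; for data with very large values, a version based on deviations from the mean may be numerically safer.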
• Pearson r correlation in SPSS: To perform the Pearson r
correlation in SPSS, select “Correlate” from the “Analyze”
menu and then select “Bivariate” from the Correlate options. After
selecting this option, the following window appears:
[Figure 1: SPSS Bivariate Correlations window]
[Figure 2: SPSS output table for the Pearson r correlation]

• Select the variables for which we want to calculate the Pearson r
correlation and move them to the variable list on the right side.
• Click on “Options” and select “descriptive statistics” from there.
• Select “Pearson” as the correlation coefficient in the window.
• Click on the “OK” button. The result window will show the result
of the Pearson correlation.
• Figure 2 shows the SPSS output table for the Pearson r
correlation.
Kendall’s tau and Spearman’s rank correlation coefficient
• There are two accepted measures of rank correlation, namely Kendall’s
tau and Spearman’s rank correlation coefficient.
• Kendall’s tau measures the strength of the relationship between two
variables. Like Spearman’s rank correlation coefficient, Kendall’s tau
is carried out on the ranks of the data. In other words, each variable
is put in order separately and its values are numbered (ranked).
• Like other measures of correlation, Kendall’s tau takes values between
minus one and plus one. A positive value of Kendall’s tau signifies
that the ranks of both variables increase together. A negative value
signifies that as the rank of one variable increases, the rank of the
other variable decreases.
• It is feasible to calculate the confidence intervals and carry
out the hypothesis tests with the help of Kendall’s tau.
• The main advantages of using Kendall’s tau are as follows:
• The sampling distribution of Kendall’s tau has better
statistical properties.
• The interpretation of Kendall’s tau in terms of the
probabilities of observing agreeable (concordant) and
non-agreeable (discordant) pairs is very direct.
• In most situations, the interpretations of Kendall’s tau
and Spearman’s rank correlation coefficient are very similar
and thus usually lead to the same inferences.
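The concordant/discordant interpretation mentioned above can be sketched in a few lines of Python. This is a minimal illustration, assuming the simplest variant of Kendall’s tau (often called tau-a), in which tied pairs count as neither concordant nor discordant; the function name `kendall_tau_a` and the data are hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and hold at least two values")
    concordant = 0
    discordant = 0
    # Compare every pair of observations exactly once.
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:        # both variables move in the same direction
            concordant += 1
        elif product < 0:      # the variables move in opposite directions
            discordant += 1
        # product == 0 is a tie and is counted in neither group (assumption)
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Hypothetical ranks given to five items by two judges.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]
tau = kendall_tau_a(judge_a, judge_b)
print(tau)  # 8 concordant and 2 discordant pairs out of 10
```

With 8 concordant and 2 discordant pairs, tau is (8 - 2) / 10, which can be read as the difference between the proportion of pairs that agree and the proportion that disagree.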
• Of the two measures, Spearman’s rank correlation coefficient
is the more widely used rank correlation coefficient.
• Symbolically, Spearman’s rank correlation coefficient is
denoted by $r_s$ and is given by the following formula:
$$ r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)} $$
• Here $d_i$ is the difference between the two ranks given to
item $i$, and $n$ is the number of items. This formula is exact
when there are no tied ranks. When there are only a few tied
ranks, it still provides a sufficiently good approximation.
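A short Python sketch of the formula above, assuming no tied values so that the simple form applies. The helper names `ranks` and `spearman_rs` and the data are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n. This sketch rejects ties."""
    if len(set(values)) != len(values):
        raise ValueError("tied values are not handled by the simple formula")
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]

def spearman_rs(x, y):
    """Spearman's r_s = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and hold at least two values")
    n = len(x)
    rank_x = ranks(x)
    rank_y = ranks(y)
    # d_i is the difference between the two ranks of item i.
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical measurements on five items.
x_values = [10, 20, 30, 40, 50]
y_values = [12, 9, 25, 21, 40]
rs = spearman_rs(x_values, y_values)
print(rs)  # ranks of y are [2, 1, 4, 3, 5], so sum(d_i^2) = 4
```

Here the rank differences are -1, 1, -1, 1, 0, so the sum of squared differences is 4 and r_s = 1 - 24/120.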
• The table of critical values for Spearman’s rank correlation
coefficient, called the Spearman table, is constructed by taking
all possible rank orderings into consideration.
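One way to read this construction, offered here as an assumption rather than as the exact procedure behind any particular published table: under the null hypothesis of no association, every ordering of the ranks is equally likely, so tail probabilities can be found by enumerating all orderings. The Python sketch below does this for small n; the function name `exact_upper_tail_p` is hypothetical.

```python
from itertools import permutations

def exact_upper_tail_p(n, observed_sum_d2):
    """Probability, under equally likely rank orderings, of a sum of squared
    rank differences at most as large as the observed one (that is, of an
    r_s at least as large as the observed r_s)."""
    base = range(1, n + 1)
    total = 0
    as_extreme = 0
    # Enumerate all n! orderings of the second variable's ranks.
    for perm in permutations(base):
        sum_d2 = sum((a - b) ** 2 for a, b in zip(base, perm))
        total += 1
        if sum_d2 <= observed_sum_d2:   # integer comparison avoids rounding
            as_extreme += 1
    return as_extreme / total

# For n = 5, a sum of squared differences of 2 corresponds to r_s = 0.9.
p_value = exact_upper_tail_p(5, 2)
print(p_value)  # 5 of the 120 orderings give r_s >= 0.9
```

Enumeration grows as n!, so this approach is only practical for small samples; that is consistent with tables being used for small n and approximations for larger n.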
• Spearman’s rank correlation coefficient is used to test for
association. However, only certain kinds of relationship are
picked up by the Spearman’s rank correlation coefficient test.
• For Spearman’s rank correlation coefficient to work
effectively, the underlying relationship must be monotonic. In
other words, the variables should either increase in value
together, or one should decrease as the other increases.
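The monotonicity requirement can be illustrated with a small Python sketch, assuming untied data and the simple rank-difference formula for r_s. The helper names and the two synthetic relationships (a cubic, which is monotonic, and a U-shaped curve, which is not) are hypothetical and chosen only for illustration.

```python
def rank(values):
    """Ranks from 1 (smallest) to n; this sketch assumes no tied values."""
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]

def spearman_rs(x, y):
    """Spearman's r_s from squared rank differences (no ties assumed)."""
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank(x), rank(y)))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

x = [-3, -2, -1, 0, 1, 2, 3]
monotonic = [v ** 3 for v in x]            # always increases with x
u_shaped = [(v - 0.25) ** 2 for v in x]    # falls, then rises (no ties)

rs_monotonic = spearman_rs(x, monotonic)
rs_u_shaped = spearman_rs(x, u_shaped)
print(rs_monotonic)           # 1.0: a nonlinear but monotonic relationship
print(round(rs_u_shaped, 3))  # -0.214: strong dependence, yet weak r_s
```

Both y series are exact functions of x, but only the monotonic one yields a coefficient of 1; the U-shaped relationship is largely missed.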
• Spearman’s rank correlation coefficient is a nonparametric
test. In other words, it does not depend on assumptions about
the underlying distributions of the variables. This means that
Spearman’s rank correlation coefficient is distribution free.
• Spearman’s rank correlation coefficient can be used to test
for association in hypothesis testing. The null hypothesis
is that there is no association between the variables
under study. Thus, the purpose of Spearman’s rank
correlation coefficient is to investigate the possible
association between the underlying variables.
• It would be incorrect to write the null hypothesis as having no
rank correlation between the variables while using
Spearman’s rank correlation coefficient.
• Difficulties often arise when using Spearman’s rank
correlation coefficient. One such difficulty is with data
from very large or very small samples. In the case of very
large samples, computing Spearman’s rank correlation
coefficient is very time consuming, since it requires
ranking the data of both variables.
