In the case at hand, we get r = 0.935, indicating a very strong positive linear relation between x and y. By itself,
this calculation does not involve any distributional assumptions; Pearson’s correlation coefficient can be used
to measure the degree of linear relation between any pair of quantitative variables without further qualification.
However, the obvious next question is: what does the sample correlation r imply in terms of the population
correlation ρ? In the case at hand, where basic economic logic implies that the relation “should” be positive, we
are interested in the following pair of hypotheses:
H0: ρ = 0 vs. HA: ρ > 0 (2)
The corresponding test (see Sharpe’s section 15.2, pp. 489-490) is parametric; it requires the assumption that
the two variables x and y have a bivariate normal distribution in the population. If we fear that this assumption
is badly violated, Spearman’s rank correlation rs provides a nonparametric alternative to Pearson’s correlation.
Apart from avoiding the normality assumption, Spearman’s rank correlation has two further advantages:
QM3 (EBS2001) 2016-2017: SPEARMAN’S RANK CORRELATION 2
1) unlike the Pearson correlation, Spearman’s correlation is able to detect monotonic but nonlinear patterns;
2) compared to Pearson’s correlation, Spearman’s correlation is less sensitive to outliers.
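Both advantages are easy to check numerically. The sketch below, in Python with scipy (the data are invented for illustration and are not the handout’s households/sales figures), applies both coefficients first to a perfectly monotonic but nonlinear series, and then to the same series with one extreme outlier appended.

```python
from scipy.stats import pearsonr, spearmanr

# Made-up illustrative data -- NOT the handout's households/sales sample.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [v ** 3 for v in x]        # perfectly monotonic, but clearly nonlinear

r, _ = pearsonr(x, y)          # linear fit is imperfect, so r < 1
rs, _ = spearmanr(x, y)        # the ranks agree exactly, so rs = 1

# Advantage 2: one wild outlier hurts Pearson far more than Spearman.
x2 = x + [9]
y2 = y + [-500]
r2, _ = pearsonr(x2, y2)
rs2, _ = spearmanr(x2, y2)
```

On the cubic series the ranks of x and y agree perfectly, so Spearman’s coefficient equals 1 while Pearson’s falls short of it; after the outlier is appended, Pearson’s coefficient collapses much further than Spearman’s does.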
The calculation of Spearman’s rank correlation is extremely simple, as the table below shows. Its upper half
repeats the information from the screenshot on the previous page. Next, we rank the eight values of x and y
separately from 1 to 8. To illustrate: the lowest x-value of 99 translates into a rank of 1, the second-lowest x-
value of 101 becomes 2, etcetera (if specific values occur more than once, we break such a tie by assigning the
average of the consecutive ranks which would otherwise be assigned, e.g. by 6.5 instead of 6 and 7).
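The midrank convention for ties can be checked directly; scipy’s rankdata applies exactly this averaging rule by default. A small Python sketch with made-up values:

```python
from scipy.stats import rankdata

# rankdata's default method="average" is the midrank convention:
# tied values share the average of the ranks they would otherwise occupy.
print(rankdata([10, 20, 20, 30]).tolist())  # [1.0, 2.5, 2.5, 4.0]

# The text's example: a tie at positions 6 and 7 yields 6.5 twice.
print(rankdata([3, 5, 8, 9, 12, 15, 15, 20]).tolist())
# [1.0, 2.0, 3.0, 4.0, 5.0, 6.5, 6.5, 8.0]
```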
Spearman’s rank correlation is now simply the Pearson correlation between the ranks of x and y, rather than
between the original values of x and y! In the case at hand, the result is rS = 0.905; quite close to its parametric
counterpart, and again indicating a strong positive relation.
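This defining identity — Spearman’s rS is nothing but Pearson’s r computed on the ranks — is easy to verify in Python. The eight (households, sales) pairs below are invented for illustration (the handout’s actual figures sit in the screenshot and are not reproduced here); their ranks are arranged so that the coefficient comes out near the 0.905 of the text.

```python
from scipy.stats import pearsonr, rankdata, spearmanr

# Hypothetical (households, sales) pairs, n = 8 -- invented data, not the
# handout's sample; the ranks are chosen so that rS is roughly 0.905.
households = [99, 101, 104, 106, 110, 113, 117, 120]
sales = [205, 198, 224, 215, 240, 233, 256, 249]

rs, _ = spearmanr(households, sales)

# Defining identity: Spearman's rS is Pearson's r applied to the ranks.
r_on_ranks, _ = pearsonr(rankdata(households), rankdata(sales))

print(round(rs, 3))                  # 0.905
print(abs(rs - r_on_ranks) < 1e-9)   # True: the two computations coincide
```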
Now consider the following pair of hypotheses in terms of the population rank correlation, ρS:
H0: ρS = 0 vs. HA: ρS > 0 (3)
This test can be conducted in a very straightforward fashion, using the critical values rα that are reported in the
appendices of most statistics books (but not in Sharpe). For n = 8, such tables show that r0.005 = 0.881; since our
rS is larger, we can reject the null against the stated one-sided alternative at the 0.5% significance level.
Equivalently, we conclude that the two-sided p-value must be smaller than 1%. Clearly, there is massive
evidence against the “no correlation” null hypothesis (3).
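With the critical value in hand, the test itself reduces to a single comparison. A Python sketch, again with invented data rather than the handout’s sample (the alternative= argument of scipy’s spearmanr, available from scipy 1.7 onwards, gives a one-sided p-value directly):

```python
from scipy.stats import spearmanr

# Invented illustrative data (n = 8), not the handout's sample.
households = [99, 101, 104, 106, 110, 113, 117, 120]
sales = [205, 198, 224, 215, 240, 233, 256, 249]

rs, _ = spearmanr(households, sales)

R_CRIT = 0.881           # tabulated critical value r0.005 for n = 8
reject_h0 = rs > R_CRIT  # one-sided test of H0: rhoS = 0 vs HA: rhoS > 0
print(reject_h0)         # True: reject at the 0.5% level

# Note: scipy approximates this p-value with a t distribution rather than
# exact small-sample tables, but the conclusion agrees here.
_, p_one = spearmanr(households, sales, alternative="greater")
print(p_one < 0.005)
```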
To implement Spearman’s rank correlation in
SPSS, use the menu path Analyze > Correlate >
Bivariate to open the Bivariate Correlations dialog
box. Move both “households” and “sales” to the
Variables box, just as you would do for the ordinary
Pearson correlation coefficient. In fact, “Pearson” is
the default option in the “Correlation Coefficients”
field. All we have to do now is to check “Spearman”
(see the screenshot alongside), and press OK. We
will not discuss “Kendall’s tau-b” in this text.
Altogether, we get the output that is printed
below. Two straightforward tables are shown:
- Under the header “Correlations”, the first
“Correlations” table shows us the ordinary Pearson
correlation coefficient, r = 0.935 as stated,
together with the two-sided p-value for the “no
correlation” null hypothesis (2).
- The header “Nonparametric Correlations” offers another “Correlations” table containing Spearman’s rank
correlation coefficient of rS = 0.905, together with a two-sided p-value of 0.002 for the “no correlation” null
hypothesis (3). This p-value is smaller than 1%, which is consistent with our manual analysis.
[SPSS output: the “Correlations” table and the “Nonparametric Correlations” table described above.]