

Quantitative Methods 3 (EBS2001), 2018-2019, lecture 1

Lecturer / coordinator: Christian Kerckhoffs
room A3.09, phone 3883766, c.kerckhoffs@maastrichtuniversity.nl

Today: Preparing Cases 1 & 2
- Sharpe 3rd ed., Chapters 4, 9-14, 20 (relevant parts)
- some additional bells and whistles

1st half QM1: descriptive statistics
2nd half QM1, QM2: inferential statistics, "from a sample, what can we conclude about the population?"

A simple example of inferential statistics

Population: all working Dutch men
Our interest: % part-timers ("sample proportion p̂ ⇒ population proportion p")
Underlying idea: sampling distributions (Ch. 9)
Basic tools: confidence intervals (Chs. 9, 11-12) & hypothesis tests (Chs. 10-12)
Typical lingo: null and alternative hypothesis, confidence & significance level, P-value, Type I & II error, etc.

The intuition behind hypothesis testing: flipping a coin

Benchmark: the coin is fair
Suspicion: the coin is biased towards heads

⇒ throw the coin e.g. 20 times, and conclude!
⇒ the resulting number of heads depends on chance
- if the coin is fair, we expect around 10 heads
- if the coin is fair, we may easily obtain 11 or more heads
- if the coin is fair, we may even obtain 13 or more heads
- if the coin is fair, obtaining 15 or more heads seems unlikely

⇒ if the number of heads is far enough above 10, we tend to conclude that the coin is not fair, but not with total certainty ⇒ risk of making an error

Crucial questions:
- what is "far enough"?
- what "risk" are we willing to take?

Basic idea: the more unlikely or surprising our result, the more doubt about the coin being fair


Step 1 Hypotheses: formulate opposing hypotheses about the population

H0: pH = 0.5   null hypothesis, "status quo"
Ha: pH > 0.5   alternative hypothesis, "what we want to prove"

Population: all conceivable flips
Sample: our experiment of 20 successive flips

⇒ reject H0 if sufficient sample evidence against it!

Step 2 Model: specify the test statistic and its sampling distribution

- the # heads h, our test statistic, is a random number
- if H0 is true, then h ~ binomial with n = 20 and p = 0.5 ⇒ expected value (mean) of h equals 10

P(h ≥ 11) = 0.412
P(h ≥ 13) = 0.132
P(h ≥ 15) = 0.021

Step 3 Mechanics & Conclusion: determine whether the test statistic is unlikely/surprising

a) Reject H0 if h ≥ 11   if H0 true, 41.2% risk to reject it "by accident" → too risky a decision rule!
b) Reject H0 if h ≥ 13   if H0 true, 13.2% risk to reject it "by accident" → better
c) Reject H0 if h ≥ 15   if H0 true, 2.1% risk to reject it "by accident" → safe rule

With decision rule c):
- our critical value is 15
- our significance level is α = 2.1% (risk of Type I Error)
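The three tail probabilities above come straight from the binomial distribution under H0. A minimal sketch for checking them, assuming Python with scipy (not part of the course toolkit, which uses SPSS):

```python
# Sketch: binomial tail probabilities under H0 (scipy assumed)
from scipy.stats import binom

n, p = 20, 0.5  # 20 flips of a fair coin under H0
for k in (11, 13, 15):
    # P(h >= k) is the survival function evaluated at k - 1
    print(f"P(h >= {k}) = {binom.sf(k - 1, n, p):.3f}")
# Prints 0.412, 0.132, 0.021, matching the slide
```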

Hypothesis testing always involves the same three steps:

1) Formulate opposing hypotheses about the population

2) Specify the test statistic and its sampling distribution
   the test statistic:
   - is a random number
   - has a known sampling distribution if H0 is true
   - we observe one "draw" from this distribution

3) Determine whether the test statistic is unlikely / surprising
   option a) the critical value approach (see above)
   option b) the P-value approach (see later on), emphasized by Sharpe

The intuition behind confidence intervals: flipping a coin

Imagine: a sample of 100 flips, with 53 heads

⇒ p̂H = 53/100 = 0.53 as our point estimate for pH

⇒ in view of this sample outcome,
   pH = 0.5 seems reasonable
   pH = 0.55 seems reasonable
   pH = 0.4 does not seem reasonable (if true, then 53 heads would be very extreme!)

Formalization: 95% Confidence Interval for pH: [0.432, 0.628]

"on the basis of the sample outcome, we are 95% confident that the true pH lies somewhere between 0.432 and 0.628"
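The interval is the usual normal-approximation CI for a proportion, p̂ ± z0.025 · √(p̂(1−p̂)/n). A quick check (again a sketch, scipy assumed):

```python
# Sketch: 95% normal-approximation CI for the proportion of heads (scipy assumed)
from math import sqrt
from scipy.stats import norm

n, heads = 100, 53
p_hat = heads / n                       # point estimate, 0.53
se = sqrt(p_hat * (1 - p_hat) / n)      # standard error of p_hat, ~0.050
z = norm.ppf(0.975)                     # 1.96 for a 95% interval
print(p_hat - z * se, p_hat + z * se)   # ~0.432, ~0.628, matching the slide
```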

Today's basic example: a (small) part of Case 1

- a survey among a sample of 1st year SBE students, E&BE, 2 – 2011
- n = 173 respondents, many variables
- we consider a subset of 7 variables (i.e. questions, see alongside)

Q4_1 / Q4_2 / Q4_3   Birthday: Day / Month / Year
Q5   What is your gender?   1 Male, 2 Female
Q6   What is your nationality?   1 Dutch, 2 German, 3 other

Please rate on a 1 to 5 scale, where 1 means "very unimportant" and 5 "very important". I chose this university...
Q9_6   ...because of the city of Maastricht.
Q9_7   ...because of student life here.

- after the survey, data entered into an SPSS data matrix (see alongside)
- variables in columns (note the appealing names)
- respondents in rows

Some prior data considerations

Three basic types of data:
i) cross-section: n objects or subjects, measured at the same time (Cases 1, 2 & 3)
ii) time series: one object or subject, measured during T successive time periods (Cases 4 & 5)
iii) panel data: n objects or subjects, measured during T successive time periods (Case 6) (combines i and ii)

Note: our main focus is on type i)!

Four different scales of measurement ("NOIR"):
Qualitative:   1) nominal: "red, white, blue" (Q5, Q6)
               2) ordinal: "small, medium, large" (Q9_6, Q9_7)
Quantitative:  3) interval: "degrees Celsius" (Q4_1/2/3, maybe Q9_6, Q9_7 (?))
               4) ratio: "income", "age" ("age" defined from Q4_1/2/3)

Note: - from 1) to 4), the data become ever "richer"!
      - downgrading a variable is admissible; upgrading it is not
      - one exception: it is open to debate whether Likert-scale variables (like Q9_6 and Q9_7) can be regarded as "pseudo-interval" or not
First step in empirical research: descriptive analysis

Quantitative variables: Descriptive Statistics
SPSS: Analyze > Descriptive Statistics > Descriptives

Descriptive Statistics
                          N    Minimum  Maximum  Mean  Std. Deviation
Q9_6 City of Maastricht   171  1        5        2.96  1.210
Q9_7 Student life         171  1        5        2.68  1.087
Valid N (listwise)        171

Qualitative variables: Frequency Tables
SPSS: Analyze > Descriptive Statistics > Frequencies

Q5 Gender
                  Frequency  Percent  Valid Percent  Cumulative Percent
Valid    male     112        64.7     66.7           66.7
         female   56         32.4     33.3           100.0
         Total    168        97.1     100.0
Missing  9        5          2.9
Total             173        100.0

Q6 Nationality
                  Frequency  Percent  Valid Percent  Cumulative Percent
Valid    Dutch    66         38.2     40.0           40.0
         German   70         40.5     42.4           82.4
         Other    29         16.8     17.6           100.0
         Total    165        95.4     100.0
Missing  9        8          4.6
Total             173        100.0

Note: - all variables have "missing values"
      - frequencies also interesting for Q4_1, Q4_2 and Q4_3 (to check for "strange answers")

Main dish in empirical research: 8 inferential tools from QM1 & QM2

1) One-Sample T Test: "does the mean of a quantitative variable differ from a specified value?"
2) Independent-Samples T Test: "does the mean of a quantitative variable differ between two (sub-)populations?"
3) One-Way ANOVA: "does the mean of a quantitative variable differ between more than two (sub-)populations?"
4) Paired-Samples T Test: "do the means of two quantitative variables differ?"
5) Correlation: "to what extent are two quantitative variables related?"
6) Chi-square Test for Goodness of Fit: "does the frequency distribution of a nominal variable differ from a specified distribution?"
7) Chi-square Test of Independence: "are two nominal variables dependent?" (crosstables)
8) Regression analysis: "to what extent can a quantitative variable be explained by a (number of) other quantitative variable(s)?"

Which tool to use depends a.o. on your research questions and your measurement scale
- tool 1) discussed extensively today (by hand & SPSS)
- tools 2)-7) discussed by analogy today (only SPSS, after the break)
- tool 8) discussed next lecture

1) One-Sample T Test

Research issue: does the mean of a quantitative variable differ from a specified value?
Null hypothesis: H0: μ = μ0
Assumptions: - independence assumption (the sample data are independent of each other)
             - normal population assumption (the variable of interest is normally distributed)
               (or: large sample, say n > 30)
Sharpe: chapter 11

Application: Q9_7, "... because of student life here"

Descriptive Statistics
                    N    Minimum  Maximum  Mean  Std. Deviation
Q9_7 Student life   171  1        5        2.68  1.087
Valid N (listwise)  171

Research issue: the population mean, μ
Point estimate: the sample mean, here: ȳ = 2.68
but: different sample → different outcome
⇒ ȳ fluctuates randomly between samples
⇒ ȳ obeys a sampling distribution!
⇒ what does our single "draw" for ȳ tell us about μ?

The sampling distribution of ȳ

If we have - observations that are independent
           - either a normally distributed variable, or a large sample (say n > 30)

Then it can be shown that:
1) the expected value of ȳ equals the true μ: "on average, we hit our target"
2) ȳ fluctuates around the true μ with standard deviation SD(ȳ) = σ/√n, where σ = population standard deviation of y
3) ȳ has a normal distribution

⇒ ȳ ~ Normal, with mean μ and st. dev. SD(ȳ) = σ/√n
⇒ (ȳ − μ) / SD(ȳ) ~ Standard Normal
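These three properties can be made tangible with a small simulation; a sketch under assumed population values (numpy assumed, not part of the course toolkit):

```python
# Sketch: simulate the sampling distribution of the sample mean (numpy assumed)
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 3.0, 1.0, 171            # hypothetical population values
means = rng.normal(mu, sigma, size=(10_000, n)).mean(axis=1)

print(means.mean())                     # ~3.0: on average, we hit mu
print(means.std(), sigma / np.sqrt(n))  # both ~0.0765: SD(ybar) = sigma / sqrt(n)
```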
(ȳ − μ) / SD(ȳ) ~ Standard Normal, with SD(ȳ) = σ/√n

- problem: we don't know the population standard deviation σ
- use the sample standard deviation s as point estimate for σ, to get the standard error of ȳ:

SE(ȳ) = s/√n, where s = sample standard deviation of y

⇒ (ȳ − μ) / SE(ȳ) ~ t, the t-distribution with n−1 degrees of freedom

This sampling distribution acts as the basis for inference!

Tool 1: Test statistic for testing H0: μ = μ0:

t = (point estimate − hypothesized value) / standard error = (ȳ − μ0) / SE(ȳ)

if H0 is true, t obeys a t-distribution with n−1 df

Tool 2: 100(1−α)% Confidence Interval for μ

point estimate ± critical value * standard error = [ȳ ± tα/2 · SE(ȳ)]   (hardly used in QM3)

Example 1

Null hypothesis: H0: μ = 3 ("neutral value")
Alternative hypothesis: Ha: μ ≠ 3 (two-sided)

"in the population, the average importance of motive 'student life' is 3"
vs. "the average importance differs from 3"

⇒ calculate the test statistic:

SE(ȳ) = s/√n = 1.087/√171 = 0.0831
⇒ t = (ȳ − μ0) / SE(ȳ) = (2.68 − 3) / 0.0831 = −3.85

- if H0 is true, this is a draw from a t-distribution with n−1 = 170 df
- with so many df, the t-distribution equals the standard normal distribution ⇒ z = −3.85
- the standard normal distribution is centered around 0
⇒ if H0 is true, we expect a test statistic "close to" 0
⇒ values "far above" or "far below" 0 make H0 suspect ⇒ reject H0
⇒ is the actual value of −3.85 "far enough" from 0?

Note: both positive and negative values for the test statistic cast doubt on the null!
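As a cross-check of Example 1's arithmetic, a sketch working from the summary statistics (scipy assumed; SPSS produces the same numbers two slides below):

```python
# Sketch: Example 1's one-sample t test from summary statistics (scipy assumed)
from math import sqrt
from scipy.stats import t as t_dist

n, ybar, s, mu0 = 171, 2.68, 1.087, 3.0
se = s / sqrt(n)                                # standard error, ~0.0831
t_stat = (ybar - mu0) / se                      # ~ -3.85
p_two_sided = 2 * t_dist.sf(abs(t_stat), n - 1)
print(t_stat, p_two_sided)                      # ~ -3.85, ~0.0002 (0.02%)
```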

Option a) The critical value approach

Rejection region: |z| > zα/2

α = significance level = Type I Error risk

E.g. α = 5% ⇒ Table A-59: z0.025 = 1.96
"if the null were true, then 5% chance to observe a test statistic of ≥ 1.96, or ≤ −1.96"

3.85 > 1.96 ⇒ reject H0 at 5% significance (Type I Error risk is 5%)
"the average score is significantly different from 3"

Option b) The P-value approach

"if the null were true, then what would be the probability to observe a test statistic so contradictory to it?"

Table A-63: P(z ≥ 3.85) = 1 − 0.9999 = 0.0001
⇒ P-value around 0.02% (two-sided!)

"if the null were true, there would be 0.02% chance to observe a test statistic of ≥ 3.85, or ≤ −3.85"

⇒ reject the null at any reasonable significance level

"the lower the P-value, the more evidence against the null!"
SPSS: Analyze > Compare Means > One-Sample T Test
      choose 3 as Test Value

One-Sample Statistics
                    N    Mean  Std. Deviation  Std. Error Mean
Q9_7 Student life   171  2.68  1.087           .083

One-Sample Test (Test Value = 3)
                                 Sig.        Mean        95% CI of the Difference
                    t       df   (2-tailed)  Difference  Lower  Upper
Q9_7 Student life   -3.798  170  .000        -.316       -.48   -.15

Note: inspect P-value ⇒ conclusion!

Example 2

Null hypothesis: H0: μ = 2.5
Alternative hypothesis: Ha: μ > 2.5 (one-sided)

⇒ calculate the test statistic:

t ≈ z = (ȳ − μ0) / SE(ȳ) = (2.68 − 2.5) / 0.083 = 2.17

- if H0 is true, this is a draw from the standard normal distribution
- the standard normal distribution is centered around 0
⇒ is the actual value of 2.17 "far enough" above 0?

Note: this time, only positive values for the test statistic cast doubt on the null!
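Anticipating the next slide, the one-sided critical value and P-value can be checked the same way (sketch, scipy assumed):

```python
# Sketch: one-sided critical value and P-value for Example 2 (scipy assumed)
from scipy.stats import norm

z = (2.68 - 2.5) / 0.083   # ~2.17
print(norm.ppf(0.95))      # one-sided 5% critical value, ~1.645
print(norm.sf(z))          # one-sided P-value, ~0.015 (1.5%)
```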


Option a) The critical value approach: z > zα?

α = significance level = Type I Error risk

E.g. α = 5% ⇒ Table A-59: z0.05 = 1.645
"if the null were true, then 5% chance to observe a test statistic of 1.645 or larger"

2.17 > 1.645 ⇒ reject H0 at 5% significance

Option b) The P-value approach

"if the null were true, then what would be the probability to observe a test statistic so contradictory to it?"

Table A-63: P(z ≥ 2.17) = 1 − 0.9850 = 0.0150
⇒ P-value = 1.5% (one-sided!)

"if the null were true, then 1.5% chance to observe a test statistic ≥ 2.17"
⇒ very much evidence against H0

"the lower the P-value, the more evidence against the null!"

2) Independent-Samples T Test

Research issue: does the mean of a quantitative variable differ between two (sub-)populations?

Null hypothesis: H0: μ1 − μ2 = Δ0 (typically Δ0 = 0)

Assumptions: - independence assumption (in each sample, the data are independent of each other)
             - independent groups assumption (the samples are independent rather than paired)
             - normal populations assumption (in each population, the data are normally distributed)
               (or: both samples are large, say n1, n2 > 30)

Test statistic: t = ((ȳ1 − ȳ2) − Δ0) / SE(ȳ1 − ȳ2)

with SE(ȳ1 − ȳ2) = √(s1²/n1 + s2²/n2)           (general version)
or   SE(ȳ1 − ȳ2) = s_pooled · √(1/n1 + 1/n2)    (pooled version, "equal variances")

Distribution: t-distribution ⇒ standard normal with large samples

Sharpe: sections 13.1-13.5
BREAK

Application: quantitative variable: "birthdate" (defined from Q4_1/2/3; see case, item b)
             subpopulations: Q5, "gender"

Null hypothesis: H0: μ1 = μ2 (i.e. Δ0 = 0)
Alternative hypothesis: Ha: μ1 ≠ μ2 (two-sided)

"in the population, the average age of men and women is the same"
vs. "the average age differs between men and women"

SPSS: Analyze > Compare Means > Independent Samples T Test
      choose "birthdate" as Test Variable, Q5 as Grouping Variable

Group Statistics
           Q5 Gender  N    Mean     Std. Deviation  Std. Error Mean
Birthdate  male       109  1990.53  1.699           .163
           female     56   1990.84  2.027           .271

Independent Samples Test
                          Levene's Test  t-test for Equality of Means
                                               Sig.        Mean   Std. Error  95% CI of the Diff.
                          F      Sig.   t      df     (2-tailed)  Diff.  Diff.  Lower  Upper
Birthdate  Equal variances
           assumed        0.104  .747   -1.057 163    .292        -.32   .299   -.905  .274
           Equal variances
           not assumed                  -.999  95.52  .320        -.32   .316   -.943  .312

Normally: inspect P-value ⇒ conclusion
Here: three P-values???

Problem: the sampling distribution of the t statistic for the age means depends on whether the age variances are
i) unrestricted: Sharpe section 13.2, "The Two-Sample t-Test"
ii) equal (σ1² = σ2²): Sharpe section 13.5, "The Pooled t-Test"

Step 1: evaluate Levene's Test, H0: σ1² = σ2²
P-value = 74.7%
⇒ no evidence that the variances differ
⇒ we may (but don't have to) focus on the row "Equal variances assumed"
forget your Marketing class: "It's never wrong not to pool" (Sharpe, margin box p. 410)

Step 2: evaluate the t-Test, H0: μ1 = μ2
P-value = 29.2%
⇒ no evidence that the means differ
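Both t rows of the SPSS table can be reproduced from the group summaries; a sketch using scipy's summary-statistics interface (scipy assumed):

```python
# Sketch: reproduce both rows of the SPSS t-test output from group summaries (scipy assumed)
from scipy.stats import ttest_ind_from_stats

m1, s1, n1 = 1990.53, 1.699, 109   # males
m2, s2, n2 = 1990.84, 2.027, 56    # females

# "Equal variances assumed" (pooled): t ~ -1.06, p ~ 0.29
print(ttest_ind_from_stats(m1, s1, n1, m2, s2, n2, equal_var=True))
# "Equal variances not assumed" (Welch): t ~ -1.00, p ~ 0.32
print(ttest_ind_from_stats(m1, s1, n1, m2, s2, n2, equal_var=False))
```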

3) One-Way ANOVA

Research issue: does the mean of a quantitative variable differ between more than two (sub-)populations?

Null hypothesis: H0: μ1 = μ2 = ... = μk (k subpopulations)

Assumptions: - independence assumption (in each sample the data are independent of each other;
               the group samples are independent)
             - equal variance assumption (in each population, the variance is the same)
             - normal populations assumption (in each population, the data are normally distributed)

Test statistic: F = [SSTr / (k−1)] / [SSE / (N−k)]
- SSTr: variability between subpopulations
- SSE: variability within subpopulations

Distribution: F-distribution

Sharpe: sections 20.6-7, 20.9

Application: quantitative variable: "birthdate"
             subpopulations: Q6, "nationality"

Null hypothesis: H0: μ1 = μ2 = μ3 (k = 3)
Alternative hypothesis: Ha: at least one mean differs from the others

"the average age of Dutch, Germans and others is the same"
vs. "for at least one group, the average age is different"

SPSS: Analyze > Compare Means > One-Way ANOVA
      choose "birthdate" as Dependent, Q6 as Factor; tick Descriptive under Options

Descriptives (Birthdate)
                                            95% CI for Mean
         N    Mean     Std. Dev.  Std. Error  Lower Bound  Upper Bound  Minimum  Maximum
Dutch    66   1991.26  2.010      .247        1990.77      1991.76      1983.57  1992.84
German   67   1990.02  1.474      .180        1989.66      1990.38      1986.49  1992.67
Other    29   1990.51  1.658      .308        1989.88      1991.14      1984.95  1992.86
Total    162  1990.62  1.823      .143        1990.33      1990.90      1983.57  1992.86

ANOVA (Birthdate)
                Sum of Squares  df   Mean Square  F      Sig.
Between Groups  51.675          2    25.837       8.504  .000
Within Groups   483.107         159  3.038
Total           534.782         161
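The F statistic and its P-value follow directly from the printed sums of squares; a sketch (scipy assumed):

```python
# Sketch: recompute the ANOVA F statistic and P-value from the table (scipy assumed)
from scipy.stats import f as f_dist

sstr, sse = 51.675, 483.107
k, N = 3, 162
F = (sstr / (k - 1)) / (sse / (N - k))   # ~8.50
print(F, f_dist.sf(F, k - 1, N - k))     # P-value ~0.0003, reported as .000
```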
Note: - inspect P-value ⇒ conclusion
        P-value = 0.000 (i.e. smaller than 0.05%)
        ⇒ very strong evidence that the means differ
      - alternative always two-sided ("Sums of Squares")
      - the test statistic does not indicate which means differ
        ⇒ check the Descriptives table (but...)
      - are the variances similar?
        ⇒ check the Descriptives table (but...)
      - for k = 2, we get the Independent-Samples T Test back
        ⇒ "Equal variances assumed" version

4) Paired-Samples T Test

Research issue: do the means of two quantitative variables differ?

Null hypothesis: H0: μ1 − μ2 = Δ0 (typically Δ0 = 0)

Test statistic: one population, two distinct variables
⇒ focus on the paired differences di = y1i − y2i, i = 1,...,n, with mean d̄ and variance sd²
⇒ apply the One-Sample T Test to these differences!

Assumptions: - paired data assumption (the samples are paired rather than independent)
             - independence assumption (the paired differences are independent of each other)
             - normal population assumption (the paired differences are normally distributed)
               (or: the sample is large, say n > 30)

Distribution: t-distribution ⇒ standard normal with large samples

Sharpe: sections 13.6-7


Application: quantitative variable 1: Q9_6, "...city of Maastricht"
             quantitative variable 2: Q9_7, "...student life"

Null hypothesis: H0: μ1 = μ2 (i.e. Δ0 = 0)
Alternative hypothesis: Ha: μ1 ≠ μ2 (two-sided)

"on average, the motives 'city of Maastricht' and 'student life' are equally important"
vs. "they are not equally important"

SPSS: Analyze > Compare Means > Paired-Samples T Test
      choose Q9_6 and Q9_7 as Paired Variables

Paired Samples Statistics
                                 Mean  N    Std. Deviation  Std. Error Mean
Pair 1  Q9_6 City of Maastricht  2.96  171  1.210           .092
        Q9_7 Student life        2.68  171  1.087           .083

Paired Samples Correlations
                                                     N    Correlation  Sig.
Pair 1  Q9_6 City of Maastricht & Q9_7 Student life  171  .540         .000

Paired Samples Test
                              Paired Differences
                                    Std.       Std. Error  95% CI of the Diff.              Sig.
                              Mean  Deviation  Mean        Lower  Upper  t      df   (2-tailed)
Pair 1  Q9_6 City of Maastricht
        - Q9_7 Student life   .275  1.106      .085        .108   .442   3.249  170  .001

Note: - 1st table: descriptive statistics
      - 2nd table: correlation, see technique 5) below
      - 3rd table: the actual test
        e.g.: 0.275 = 2.96 − 2.68
              0.085 = 1.106 / √171
              3.249 = 0.275 / 0.085
              170 = n − 1

inspect P-value ⇒ conclusion
P-value = 0.001 or 0.1%
⇒ very strong evidence that the means differ
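Since the paired test is just a one-sample t test on the differences, the third table can be verified from the difference summaries (sketch, scipy assumed):

```python
# Sketch: the paired test as a one-sample t test on the differences (scipy assumed)
from math import sqrt
from scipy.stats import t as t_dist

n, d_bar, s_d = 171, 0.275, 1.106   # summary of the paired differences
se = s_d / sqrt(n)                  # ~0.085
t_stat = d_bar / se                 # ~3.2 (SPSS reports 3.249 from unrounded inputs)
print(t_stat, 2 * t_dist.sf(t_stat, n - 1))   # p ~0.001
```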
5) Correlation

Research issue: to what extent are two quantitative variables related?

Correlation coefficient: measures the degree of linear relation between two variables x and y

r = Σ(xi − x̄)(yi − ȳ) / √[Σ(xi − x̄)² · Σ(yi − ȳ)²]

- −1 ≤ r ≤ 1:   r = −1: perfect negative relation
                r = +1: perfect positive relation
                r = 0: no linear relation
- r is scale-independent
- the role of both variables in r is symmetric
  - no distinction dependent vs. independent
  - the relation need not be causal

Null hypothesis: H0: ρ = 0 (population vs. sample correlation)

Assumptions: normality assumption (both variables are normally distributed)

Test statistic / distribution: too complicated for our purposes

Sharpe: sections 4.1-4.4, 15.2 (pp. 489-490)

Application: quantitative variable 1: Q9_6, "...city of Maastricht"
             quantitative variable 2: Q9_7, "...student life"

SPSS: Analyze > Correlate > Bivariate
      choose Q9_6 and Q9_7 as Variables; tick Pearson (default)

Correlations
                                        Q9_6 City of Maastricht  Q9_7 Student life
Q9_6 City of       Pearson Correlation  1                        .540**
Maastricht         Sig. (2-tailed)                               .000
                   N                    171                      171
Q9_7 Student life  Pearson Correlation  .540**                   1
                   Sig. (2-tailed)      .000
                   N                    171                      171
**. Correlation is significant at the 0.01 level (2-tailed).

Note: - r = 0.540: pretty strong positive relation
      - P-value = 0.000
        ⇒ very strong evidence that ρ differs from 0
      - Correlation ≠ Paired-Samples T Test:
        two variables can be correlated strongly, yet have very different means,
        e.g. body length and shoe size
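Although the slides skip the test statistic, the standard form is t = r·√(n−2) / √(1−r²) with n−2 df; a sketch checking the reported significance with it (scipy assumed):

```python
# Sketch: significance test for r = 0.540 with n = 171 (scipy assumed)
from math import sqrt
from scipy.stats import t as t_dist

r, n = 0.540, 171
t_stat = r * sqrt(n - 2) / sqrt(1 - r**2)    # ~8.3
print(t_stat, 2 * t_dist.sf(t_stat, n - 2))  # p far below 0.001, reported as .000
```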

Alternative: Spearman's rank correlation
- if variables are ordinal rather than quantitative (here?)
- if we doubt whether they are normally distributed
- to measure the strength of monotonic but nonlinear relations
- SPSS: same menu, tick Spearman, not Pearson
- see extra text "Spearman's Rank Correlation" (available on QM3 course pages)

Correlations
                                                 Q9_6 City of Maastricht  Q9_7 Student life
Spearman's rho  Q9_6 City of  Correlation Coeff. 1.000                    .536**
                Maastricht    Sig. (2-tailed)    .                        .000
                              N                  171                      171
                Q9_7 Student  Correlation Coeff. .536**                   1.000
                life          Sig. (2-tailed)    .000                     .
                              N                  171                      171
**. Correlation is significant at the 0.01 level (2-tailed).

6) Chi-square Test for Goodness of Fit

Research issue: does the frequency distribution of a nominal variable differ from a specified distribution?

Null hypothesis: H0: p1 = ..., p2 = ..., pk = ... (k categories)

Assumptions: - independence assumption (the counts should be independent)
             - sample size assumption (all Expi ≥ 5, i = 1,...,k)
             typically satisfied for a random sample, if n is large enough

Test statistic: χ² = Σi (Obsi − Expi)² / Expi
                with observed frequencies Obsi, i = 1,...,k
                and expected frequencies Expi = n·pi, i = 1,...,k

Distribution: χ²-distribution with k−1 df

Sharpe: sections 14.1-14.3

Application: nominal variable Q6, "nationality"

Null hypothesis: H0: p1 = 0.40, p2 = 0.40, p3 = 0.20
Alternative hypothesis: Ha: at least one share is different

"in the 1st year student population, 40% is Dutch, 40% is German, 20% has another nationality"
vs. "the shares of the three nationalities are different"

SPSS: Analyze > Nonparametric Tests > Legacy Dialogs > Chi-square
      choose Q6 as Test Variable
      under Expected Values, tick Values; indicate the shares using Add

Q6 Nationality
         Observed N  Expected N  Residual
Dutch    66          66.0        .0
German   70          66.0        4.0
Other    29          33.0        -4.0
Total    165

Test Statistics (Q6 Nationality)
Chi-Square   .727a
df           2
Asymp. Sig.  .695
a. 0 cells (0.0%) have expected frequencies less than 5. The minimum expected cell frequency is 33.0.

How have these numbers been determined?

- observed frequencies:   if H0 true, we expect:
  Obs1 = 66               Exp1 = n·p1 = 165·0.40 = 66.0
  Obs2 = 70               Exp2 = n·p2 = 165·0.40 = 66.0
  Obs3 = 29               Exp3 = n·p3 = 165·0.20 = 33.0

⇒ are the observed frequencies "far away" from the expected ones?
  note the analogy with the coin-flipping example (multinomial vs. binomial)

⇒ χ² = (66 − 66.0)²/66.0 + (70 − 66.0)²/66.0 + (29 − 33.0)²/33.0 = 0.727

- if H0 is true, the χ² test statistic has a χ²-distribution with 2 df,
  provided that all expected frequencies ≥ 5 (see Sharpe p. 443)
- the χ² test statistic is always ≥ 0 (squares!)
⇒ if H0 is true, the observed and expected frequencies should be close
⇒ the test statistic should be "close to" 0
⇒ values "far above" 0 make H0 suspect → reject
⇒ is the actual value of 0.727 "far enough" above 0?
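scipy's chisquare function reproduces both the statistic and the SPSS P-value directly from the counts (sketch, scipy assumed):

```python
# Sketch: goodness-of-fit test for the nationality shares (scipy assumed)
from scipy.stats import chisquare

observed = [66, 70, 29]
expected = [165 * p for p in (0.40, 0.40, 0.20)]   # [66.0, 66.0, 33.0]
print(chisquare(f_obs=observed, f_exp=expected))   # statistic ~0.727, p ~0.695
```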


Option a) The critical value approach: χ² > χ²α?

α = 5% ⇒ χ²0.05 = 5.991 (Table A-62: df = 2)
"if the null were true, then 5% chance to observe a test statistic of 5.991 or larger"

0.727 < 5.991 ⇒ don't reject H0 at 5% (Type I Error risk)

Option b) The P-value approach

"if the null were true, then what would be the probability to observe a test statistic so contradictory to it?"

P(χ² ≥ 4.605) = 0.10
0.727 < 4.605
⇒ P-value must be larger than 10%
SPSS: P-value = 69.5%
⇒ no evidence against the null at all

"the lower the P-value, the more evidence against the null!"

7) Chi-square Test of Independence (or "Homogeneity")

Research issue: are two nominal variables, tabulated against each other in a crosstable, dependent?

Null hypothesis: H0: the two variables are independent

Assumptions: - independence assumption (the counts should be independent)
             - sample size assumption (all Expij ≥ 5, i = 1,...,R; j = 1,...,C)
             typically satisfied for a random sample, if n is large enough

Test statistic: χ² = Σ over all cells of (Obsij − Expij)² / Expij
                with observed frequencies Obsij, i = 1,...,R; j = 1,...,C
                and expected frequencies Expij = ri · cj / n, i = 1,...,R; j = 1,...,C

Distribution: χ²-distribution with (R−1)(C−1) df

Sharpe: sections 14.4-14.6

Application: nominal variable Q6, "nationality", vs. nominal variable Q5, "gender"

Null hypothesis: H0: Q6 and Q5 are independent
Alternative hypothesis: Ha: Q6 and Q5 are dependent

SPSS: Analyze > Descriptive Statistics > Crosstabs
      choose Q5 as Row, Q6 as Column
      under Cells, tick Expected; under Statistics, tick Chi-square

Q5 Gender * Q6 Nationality Crosstabulation
                                   Q6 Nationality
                                   Dutch  German  Other  Total
Q5 Gender  male    Count           38     51      19     108
                   Expected Count  44.0   44.7    19.3   108.0
           female  Count           28     16      10     54
                   Expected Count  22.0   22.3    9.7    54.0
Total              Count           66     67      29     162
                   Expected Count  66.0   67.0    29.0   162.0

Chi-Square Tests
                              Value   df  Asymp. Sig. (2-sided)
Pearson Chi-Square            5.166a  2   .076
Likelihood Ratio              5.234   2   .073
Linear-by-Linear Association  1.659   1   .198
N of Valid Cases              162
a. 0 cells (0.0%) have expected count less than 5. The minimum expected count is 9.67.

How have these numbers been determined?

- Sharpe section 5.6: Conditional Probability and Independence
  p. 160: "Events A and B are independent whenever P(B|A) = P(B)"

- here: P(Male) = 108 / 162 = 0.667           "P(B)"
        P(Male|Dutch) = 38 / 66 = 0.576       "P(B|A)"
        P(Male|German) = 51 / 67 = 0.761      "P(B|A)"
        P(Male|Other) = 19 / 29 = 0.655       "P(B|A)"

- independence: 66.7% of all students is male
  ⇒ 66.7% of the Dutch students is male,
     66.7% of the German students is male,
     66.7% of the other students is male

- what does this imply in terms of expected frequencies?

- Obs11 = 38 Dutch males: observed frequency

- if H0 (independence) is true, we expect 66.7% of the 66 Dutch to be male

  Exp11 = 0.667 * 66 = 44.0: expected frequency
  "what it 'should' be if nationality and gender were independent"

  Note: 0.667 * 66 = (108 / 162) * 66 = (row total * column total) / total

- similarly for the other 5 cells of the table

⇒ are the observed frequencies "far away" from the expected ones?

⇒ formalized with the χ²-statistic, cf. tool 6), slides 32-35

  this time, the statistic has a χ²-distribution with (R−1)(C−1) df,
  again provided that all expected frequencies ≥ 5 (see Sharpe p. 452)
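scipy bundles these steps: chi2_contingency returns the statistic, the P-value, the df, and the expected-frequency table in one call (sketch, scipy assumed):

```python
# Sketch: chi-square test of independence for gender x nationality (scipy assumed)
from scipy.stats import chi2_contingency

observed = [[38, 51, 19],   # male: Dutch, German, Other
            [28, 16, 10]]   # female
chi2, p, dof, expected = chi2_contingency(observed)
print(chi2, p, dof)         # ~5.166, ~0.076, 2, matching the SPSS output
print(expected)             # [[44.0, 44.7, 19.3], [22.0, 22.3, 9.7]]
```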

Conclusion:
P-value = 0.076 or 7.6%
⇒ some evidence that the variables are dependent
