
CHI-SQUARE AND ODDS RATIOS
Semester Recap
• We’ve covered:
  • Descriptive Statistics
    • Measures of Central Tendency
    • Measures of Variability
    • Z-scores and Graphing
  • Association and Prediction
    • Correlation
    • Regression (simple and multiple)
  • Testing for Group Differences
    • t-tests (one-sample, independent, and paired)
    • ANOVA (one-way, factorial, ANCOVA, RM ANOVA)
  • Statistical Concepts
    • Statistical Significance
    • Type I and Type II Error
    • Alpha and p-values
    • Beta and power
    • Effect Sizes
Last weeks…
• All of the statistical tests on the prior slide are known as ‘parametric’ statistics
  • Parametric statistics have strict assumptions that must be met before a t-test, correlation, etc… can be used
    • Assumes homoscedasticity (the variances of the groups or variables being compared are similar)
    • Assumes a normal distribution (bell-shaped curve)
• These assumptions are not easily met
  • For example, we’ve used physical activity in several of our examples this semester – physical activity is RARELY normally distributed…
Nationally Representative Sample of Minutes of Daily Physical Activity
[Figure: distribution of daily minutes of physical activity (PA)]
• Is this a normal distribution?
• What kind is it?
  • Positive Skew
Additionally…
 Sometimes you need to use a dependent variable
that is categorical (grouping)
 Recall from the chart that the statistical tests we’ve
discussed all require 1 continuous dependent variable
 Example Research Question:
 Are there more male than female athletic trainers across the
United States?
 I want to know if there are more men than women – this is a
nominal dependent variable
 No correlation, regression, or ANOVA, etc… will
help me answer this simple question
Non-Parametric Tests
• Non-parametric tests can be used when:
  • The statistical assumptions of parametric tests are not met
  • Categorical DV’s are used
• Non-parametric statistics is an entirely different line of statistics that includes dozens of new tests
  • Usually a parametric test has a non-parametric ‘relative’
    • Chi-Square Test of Independence is similar to Pearson Correlation
  • Most of the time if you can’t meet the assumptions of the parametric test you wanted to do – you can find an appropriate non-parametric test
Non-Parametric Tests (cont)
 Non-parametric tests have benefits and drawbacks
like all statistical tests
 Benefits:
 Assumptions are easier to meet
 Non-parametric tests basically just need nominal or ordinal data
 Remember that interval or ratio data scales can always be converted
to nominal or ordinal – the inverse is not true
 More ‘robust’
 Non-parametric statistics are more versatile tests
 Recall small changes in our data required completely different
parametric tests (e.g., the difference between a t-test and ANOVA)
 Usually easier to calculate
 We can realistically hand-calculate several non-parametric tests
Non-Parametric Tests (cont)
 Non-parametric tests have benefits and drawbacks
like all statistical tests
 Drawbacks:
 Less ‘mainstream’
 Non-parametric tests are commonly used – but far less frequently
than parametric tests
• Less powerful
  • By design, non-parametric tests have less statistical power…
  • What’s that mean? They are less likely to detect a real effect (a higher chance of a Type II error)
  • Unless you NEED to use a categorical dependent variable, if your data meet the statistical assumptions, you should use parametric stats
Tonight
 We will discuss and use two non-parametric tests:
 Chi-square test of independence
 Logistic Regression (Odds Ratio)

 These are the most common non-parametric tests


 Both are related to each other (often used together)
 Do NOT require normal distribution or homoscedasticity

 Do require 2 categorical variables (nominal or ordinal)


Chi-Square, χ2
• There are actually a few different types of Chi-Square tests; we will discuss the Chi-Square test of independence
  • Test determines if two variables are related (or unrelated)
    • ‘Test of independence’
    • Similar to Pearson correlation
• For example, we would expect
  • Sex (Male/Female) is NOT related to hair color (dark/light)
    • Men and women are just as likely to have dark or light hair
  • Sex (Male/Female) is related to having a heart attack (Yes/No)
    • Men have more heart attacks than women do
• For these tests, start thinking about data in 2x2 tables…


Chi-Square Data ‘Picture’

Cause of Death in Men and Women

              Heart Attack?
              Yes    No
Sex   Men     68     32
      Women   42     58
 A 2x2 table provides a nice summary of the data
 In this example, ‘Sex’ is the IV and ‘Heart Attack’ is the DV
 Does male/female increase risk of heart attack?
 This table provides frequency of occurrence

 Can also convert to percentage – you will get the same result
SPSS View
 Data Structure:
 Key variables are categorical
 Can look at the data labels or
values:
 Males =1
 Females = 2
 HeartAttack Yes = 1
 HeartAttack No = 2

 Look at Labels and Values:


Chi-Square Data ‘Picture’

Cause of Death in Men and Women

              Heart Attack?
              Yes    No
Sex   Men     68     32
      Women   42     58

• You should fill out the margins of the table (how many men, women, total n, heart attacks, other causes, etc…)
  • Do on board
How the Chi-Square works…

Cause of Death in Men and Women

              Heart Attack?
              Yes    No
Sex   Men     68     32
      Women   42     58

• The χ2 test has a null hypothesis that there is no difference in the frequency of men/women having heart attacks
  • If the two variables are unrelated (independent), we would expect men and women to have about the same number
  • But, we need a statistical test to know if this difference is RSE
How to run Chi-Square
• It is CRITICAL you put the variables in the correct spots
  • Typically the IV goes in the Row
  • And the DV goes in the Column
  • It doesn’t really change the answer – but it makes it easier for you to understand the results
• Then click on ‘Statistics’
What else is there?
 The ‘cells’ tab will allow you
to request percentages in
each cell, to go along with
the frequencies
 The ‘format’ tab will allow
you to change the
organization of your table
 E.g., put ‘Females’ on the top
row, or put ‘No Heart Attack’
in the left column
SPSS Output
 SPSS provides two initial tables:
 1) Case Processing Summary: Ignore, repeat info of…
 2) CrossTabs Table (our 2x2):

Cell values = Frequency. Could have asked for Percentages.
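The crosstab that SPSS prints can also be rebuilt by hand. The sketch below uses Python, which is not part of the SPSS workflow on these slides, to fill in the margins of the heart attack table and convert each row to percentages. All variable names are illustrative assumptions.

```python
# Observed frequencies from the 2x2 table (rows = sex, columns = heart attack yes/no).
observed = {
    "Men": {"Yes": 68, "No": 32},
    "Women": {"Yes": 42, "No": 58},
}

# Margins: row totals, column totals, and the overall n.
row_totals = {sex: sum(cells.values()) for sex, cells in observed.items()}
col_totals = {
    outcome: sum(observed[sex][outcome] for sex in observed)
    for outcome in ("Yes", "No")
}
n = sum(row_totals.values())

# Row percentages: share of each sex that did / did not have a heart attack.
row_pct = {
    sex: {outcome: 100 * count / row_totals[sex] for outcome, count in cells.items()}
    for sex, cells in observed.items()
}

print("Row totals:", row_totals)
print("Column totals:", col_totals)
print("Total n:", n)
for sex in observed:
    print(sex, row_pct[sex])
```

Because both rows happen to total 100 here, the row percentages equal the raw counts, which is why frequencies and percentages tell the same story in this example.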


Chi-Square Tests
 We only care about the ‘Pearson Chi-Square’ –
yeah, it’s that same guy from correlation…
 Important info is the χ2 = 13.657, df, p, and n

 χ2 = 13.66 is just like the t-statistic or r, or F ratio...


Chi-Square df
 df is calculated by (number of columns – 1) multiplied
by (number of rows – 1)
  • 2 rows – 1 = 1
  • 2 columns – 1 = 1
  • 1 x 1 = 1 df
• All 2x2 tables have 1 df; more categories in either variable will change this
Reporting the results:
 A chi-square test of independence was used
to compare the frequency of heart attacks between
men and women. Sex was significantly related to
having a heart attack. Men tended to have more
heart attacks than women (χ2(1) = 13.66, p <
0.001).
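The reported result can be checked by hand. The sketch below, in Python rather than SPSS, computes the expected counts under independence, the Pearson chi-square, the df, and the p-value for the heart attack table. The shortcut used for the p-value only holds for df = 1, and the variable names are illustrative assumptions.

```python
import math

observed = [[68, 32],   # Men:   heart attack yes, no
            [42, 58]]   # Women: heart attack yes, no

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count in each cell if sex and heart attack were independent.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# df = (rows - 1) x (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 1 only, the chi-square p-value equals erfc(sqrt(chi_square / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square({df}) = {chi_square:.3f}, p = {p_value:.5f}, n = {n}")
```

Every expected count works out to 55 (yes) or 45 (no), so each cell is off by 13, and the statistic rounds to the 13.66 reported above.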

Questions on Chi-Square?
Logistic Regression
• Researchers often report just the Chi-Square test results. However, it is also common for them to incorporate logistic regression and/or odds ratios
• Quick definition of Logistic Regression:
  • Type of regression equation that uses a categorical DV
    • Such as heart attack yes/no from our example
  • It allows you to include any type of IV (categorical or continuous) – and any number of IV’s
    • In this sense, it is very similar to simple or multiple linear reg.
  • Instead of providing you with a slope – it provides an odds ratio for each IV
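To see how a regression can produce an odds ratio, here is a minimal sketch in Python that fits a logistic regression with one binary IV (sex) to the heart attack counts from the 2x2 table. It uses plain Newton-Raphson steps, a generic fitting technique chosen for illustration; it is not meant to reproduce the SPSS procedure. Coding men as 1 and women as 0 is an assumption, as are all variable names.

```python
import math

# Grouped data from the heart attack example: (x, heart attacks, group size).
# x = 1 for men, x = 0 for women, so women are the baseline group (an assumption).
groups = [(1, 68, 100), (0, 42, 100)]

b0, b1 = 0.0, 0.0  # intercept and slope on the log-odds scale
for _ in range(25):  # Newton-Raphson steps; far more than this small problem needs
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, events, size in groups:
        p = 1 / (1 + math.exp(-(b0 + b1 * x)))  # predicted probability
        resid = events - size * p                # observed minus predicted events
        w = size * p * (1 - p)                   # weight for the curvature terms
        g0 += resid
        g1 += x * resid
        h00 += w
        h01 += x * w
        h11 += x * x * w
    det = h00 * h11 - h01 * h01
    # Solve the 2x2 system (curvature matrix) * step = gradient, then update.
    b0 += (h11 * g0 - h01 * g1) / det
    b1 += (h00 * g1 - h01 * g0) / det

# The slope lives on the log-odds scale; exponentiating it gives the odds ratio.
odds_ratio = math.exp(b1)
print(f"slope = {b1:.3f}, odds ratio = {odds_ratio:.2f}")
```

With a single binary IV, the exponentiated slope equals the odds ratio taken straight from the 2x2 table, which is why the two approaches agree.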
What are the odds…?
 Odds
 The probability of an event happening divided by the
probability of the event not happening
• Students often get confused here…
• A die has 6 sides, each with a different number
  • On one die, the odds of rolling a 1 are…?
  • 1 side has a 1 on it, 5 sides do NOT
  • Odds of rolling a 1:
    • 1/5, or 0.20 (not the same as the probability, which is 1/6, or about 17%)
What’s an odds ratio…?
 Odds Ratio
 The odds of an event happening in one group divided by
the odds of an event happening in another group
 It is literally the ratio of two odds

 It acts like effect size for a chi-square


 The chi-square tells you if there is a difference – the odds ratio
tells you how big/strong that difference is
 If the chi-square is significant – the odds ratio is also
statistically significant
 Literal interpretation is how much more likely an event is
to happen in one group versus another
 It’s easier to see in our example…
Back to our example
 What are the odds of a heart attack in men?
 68/32, or 2.125
 What are the odds of a heart attack in women?
 42/58, or 0.724

Cause of Death in Men and Women

              Heart Attack?
              Yes    No
Sex   Men     68     32
      Women   42     58
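The two odds above can be checked with a few lines of Python. This is a small sketch; the helper name `odds` is an illustrative assumption. It also prints the plain probability for each group, since odds and probability are easy to mix up.

```python
def odds(events, non_events):
    """Odds = number with the event divided by number without it."""
    return events / non_events


def probability(events, non_events):
    """Probability = number with the event divided by everyone in the group."""
    return events / (events + non_events)


odds_men = odds(68, 32)
odds_women = odds(42, 58)
prob_men = probability(68, 32)
prob_women = probability(42, 58)

print(f"Men:   odds = {odds_men:.3f}, probability = {prob_men:.2f}")
print(f"Women: odds = {odds_women:.3f}, probability = {prob_women:.2f}")
```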
Back to our example
 What is the ratio of these odds, or odds ratio?
 2.125 / 0.724 = 2.9 = OR
 Interpretation:
 Men are 2.9 times more likely to have a heart attack
than women (we know it’s significant because of the χ2)

Cause of Death in Men and Women

              Heart Attack?
              Yes    No
Sex   Men     68     32
      Women   42     58
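The same odds ratio can be computed directly from the four cells. In this Python sketch the cell labels a, b, c, d are a common textbook convention and an assumption here, not something the slides define.

```python
# Cells of the 2x2 table:
#            Yes   No
# Men         a     b
# Women       c     d
a, b = 68, 32
c, d = 42, 58

# Ratio of the two odds (men in the numerator, women as the baseline).
odds_ratio = (a / b) / (c / d)

# Algebraically identical shortcut: the cross-product of the table.
cross_product = (a * d) / (b * c)

# For comparison: the ratio of the two probabilities (risks).
risk_ratio = (a / (a + b)) / (c / (c + d))

print(f"odds ratio = {odds_ratio:.2f}")
print(f"cross-product = {cross_product:.2f}")
print(f"risk ratio = {risk_ratio:.2f}")
```

The ratio of the two probabilities (0.68 / 0.42, about 1.6) is smaller than the odds ratio. The two only agree closely when the event is rare in both groups, so it helps to be clear that an odds ratio compares odds.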
More on odds ratios
• Interpreting odds ratios can trip up some students:
  • For example, 2.9 is the odds ratio for men vs. women
    • Men are about 3 times more likely than women
    • Being a man is a ‘risk factor’ for heart attack
  • What is the odds ratio for women vs. men?
    • 0.724 / 2.125 = 0.34
    • Women are about one-third as likely to have a heart attack as men
    • Being a woman is ‘protective’ against a heart attack
• Odds Ratios:
  • > 1.0 indicate an increased risk
  • < 1.0 indicate a decreased risk
  • = 1.0 indicate the SAME risk
Another Example: Lung Cancer

Cause of Death in Men and Women

              Lung Cancer?
              Yes    No
Sex   Men     6      64
      Women   16     201

 First, notice that way more women had lung cancer


 But – there are way more women in this sample
 I’ll run a chi-square in SPSS to see if there is a
difference…
Lung Cancer Chi-Square results
• χ2 = 2.451, df = 1, p = 0.456, n = 287
• Is there a difference in the frequency of lung cancer between men and women?
• A chi-square test revealed that there was no significant difference in the odds of lung cancer between men and women (χ2(1) = 2.451, p = 0.456).
• Let’s calculate the odds ratio…
Odds Ratio: Lung Cancer
• What are the odds of cancer in men?
  • 6/64 = 0.094
• In women?
  • 16/201 = 0.080
• What is the OR?
  • Odds Ratio = 0.094 / 0.080 = 1.18
• It appears men might be slightly more likely than women, but this could be due to RSE

Cause of Death in Men and Women

              Lung Cancer?
              Yes    No
Sex   Men     6      64
      Women   16     201
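As a check on the hand calculation, this short Python sketch computes the lung cancer odds and odds ratio from the table at full precision. Variable names are illustrative assumptions. Carrying full precision instead of rounding the two odds first gives an odds ratio of roughly 1.18.

```python
# Lung cancer table: (yes, no) counts for each sex.
men_yes, men_no = 6, 64
women_yes, women_no = 16, 201

odds_men = men_yes / men_no
odds_women = women_yes / women_no

# Women are the baseline (denominator) group, as in the heart attack example.
odds_ratio = odds_men / odds_women

print(f"odds (men)   = {odds_men:.3f}")
print(f"odds (women) = {odds_women:.3f}")
print(f"odds ratio   = {odds_ratio:.2f}")
```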
More on Odds Ratios
• 1) Take care in setting up your 2x2 table – this can make it really easy to calculate the odds and understand your chi-square, or really hard
• 2) As you can see we are hand-calculating an odds ratio. You can get SPSS to do this for you.
  • If you have only 1 IV (like these examples), it’s called “Risk” in the CrossTabs option
    • Like simple linear regression
New Heart Attack Output
 When you do this, ignore the bottom rows of the
box (they are more confusing than helpful)
 You get the OR and the 95% CI
 This is the same result we got from hand calculating it,
and we knew it was significant because of the chi-
square test
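A 95% CI for an odds ratio can also be approximated by hand. The sketch below uses the log (Woolf) method, a common textbook approach. Whether SPSS uses exactly this formula is an assumption here, so small differences from the SPSS output are possible. Variable names are illustrative.

```python
import math

# Heart attack table: a, b = men yes/no; c, d = women yes/no.
a, b, c, d = 68, 32, 42, 58

odds_ratio = (a * d) / (b * c)

# Standard error of ln(OR) under the log (Woolf) method.
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# 95% CI: build the interval on the log scale, then exponentiate back.
z = 1.96
lower = math.exp(math.log(odds_ratio) - z * se_log_or)
upper = math.exp(math.log(odds_ratio) + z * se_log_or)

# An interval that does not contain 1.0 indicates a significant odds ratio.
excludes_one = lower > 1.0 or upper < 1.0

print(f"OR = {odds_ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print("CI excludes 1.0:", excludes_one)
```

An odds ratio of 1.0 means the SAME odds in both groups, so a 95% CI that stays entirely above (or entirely below) 1.0 signals statistical significance at the 0.05 level.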
95% CI’s and Odds Ratios
 Is this odds ratio statistically significant?

 How can you tell?


More on Odds Ratios
 2) As you can see we are hand-calculating an odds
ratio. You can get SPSS to do this for you.
 With 1 IV, use crosstabs
 With multiple IV’s, use ‘logistic regression’
 I won’t ask you to do this – but you should know it’s there
 3) Odds Ratios can be tricky without the 2x2 table


 They tell you how much more likely something is to
happen for a certain group (like males vs females)
 But, you should always include the information in the
2x2 table so people know what you’re talking about
 Example…
Odds of Winning the Powerball
 Odds of winning the Powerball lotto with 1 ticket:
1 / 175,223,510 = 0.00000000571
 Odds of winning the Powerball with 10 tickets:
 10/175,223,510 = 0.0000000571
 Odds Ratio = 10.0
 You are 10 times as likely to win with 10 tickets!!!!!
 Still better odds to die driving to buy your ticket – or
getting hit by lightning – than winning with 10 tickets.
 Odds Ratios are great statistics – but should not be
used to model VERY, VERY rare things
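The Powerball numbers make the point concrete. This Python sketch compares the odds ratio with the absolute change in probability, assuming the 10 tickets are 10 different number combinations (an assumption not stated on the slide).

```python
total_combinations = 175_223_510

# Probability of winning with 1 ticket and with 10 distinct tickets.
p_one = 1 / total_combinations
p_ten = 10 / total_combinations

# Odds = probability of winning / probability of not winning.
odds_one = p_one / (1 - p_one)
odds_ten = p_ten / (1 - p_ten)

odds_ratio = odds_ten / odds_one
absolute_gain = p_ten - p_one  # how much the chance of winning actually rises

print(f"odds ratio = {odds_ratio:.4f}")
print(f"absolute gain in probability = {absolute_gain:.12f}")
```

The odds ratio is essentially 10, yet the chance of winning rises by only about 5 in 100 million. That gap is why an odds ratio for a very rare event needs the underlying counts next to it.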
Referent Group
 4) Odds ratios are a ratio between two groups
 The odds ratio was 2.9 or 0.34 for our heart attack example
– depending on which group was the referent
 Referent Group = The ‘baseline’ group in the odds ratio.
The group that is in the denominator.
 2.125 / 0.724 = 2.9
 Women were the referent group – meaning men were 2.9 times
more likely than women
 Must mention this when talking about your odds ratio – you can’t
just say, ‘the odds ratio was 2.9’.
  • WHO was 2.9 times more likely than WHO?
• This is especially important when using them for more than 2 groups…
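Swapping the referent group simply inverts the odds ratio. The Python sketch below shows both directions for the heart attack table; the function name `odds_ratio` and its argument order are illustrative assumptions.

```python
def odds_ratio(group, referent):
    """Odds in `group` divided by odds in `referent`.

    Each argument is a (yes, no) pair of counts; `referent` is the
    baseline group that goes in the denominator.
    """
    group_yes, group_no = group
    ref_yes, ref_no = referent
    return (group_yes / group_no) / (ref_yes / ref_no)


table = {"Men": (68, 32), "Women": (42, 58)}

or_men_vs_women = odds_ratio(table["Men"], referent=table["Women"])
or_women_vs_men = odds_ratio(table["Women"], referent=table["Men"])

print(f"Men vs. women (women = referent): {or_men_vs_women:.2f}")
print(f"Women vs. men (men = referent):   {or_women_vs_men:.2f}")
```

The two results are reciprocals of each other, so the same data give 2.9 or 0.34 depending only on which group is named as the referent.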
Example with multiple groups
• Odds ratios from a current study I’m working on
  • Compares the odds of obesity in children meeting recommendations for PA, screen time, sleep
  • Notice that WHO I choose to be the referent group influences all of the odds ratios
QUESTIONS ON CHI-SQUARE?
ODDS RATIOS?
Upcoming…
 In-class activity
 Homework:
 Cronk – Read Section 7.2 on Chi-Square of Independence
 Holcomb Exercises 55, 56, 57 and 59
 Exercises 55-57 are on Chi-Square, 59 is on Odds Ratio
