1308 individuals were asked: How many people have you known personally that were victims
of homicide?
> homicide=data.frame(nvics=rep(c(0:6),2),
+ race=rep(c("Black","White"),each=7),
+ Freq=c(119,16,12,7,3,2,0,1070,60,14,4,0,0,1))
> xtabs(Freq~race+nvics,data=homicide)
nvics
race 0 1 2 3 4 5 6
Black 119 16 12 7 3 2 0
White 1070 60 14 4 0 0 1
> homicide=transform(homicide,race=relevel(factor(race),"White"))
> hom.poi=glm(nvics~race,family=poisson(link="log"),data=homicide,weights=Freq)
> summary(hom.poi)
Call:
glm(formula = nvics ~ race, family = poisson(link = "log"), data = homicide,
weights = Freq)
Deviance Residuals:
Min 1Q Median 3Q Max
-14.051 0.000 5.257 6.216 13.306
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.38321 0.09713 -24.54 <2e-16 ***
raceBlack 1.73314 0.14657 11.82 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
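Because race is the only predictor, the Poisson estimates can be checked by hand: with a log link, the intercept should equal the log of the White sample mean, and the raceBlack coefficient the log of the ratio of the two sample means. The following is a minimal sketch of that check, using only the frequencies from the table above; the variable names are illustrative and do not come from the original session.

```r
# Frequencies copied from the cross-tabulation above (nvics = 0, 1, ..., 6).
nvics <- 0:6
freq_black <- c(119, 16, 12, 7, 3, 2, 0)
freq_white <- c(1070, 60, 14, 4, 0, 0, 1)

# Sample mean number of known victims per respondent in each group.
mean_black <- sum(nvics * freq_black) / sum(freq_black)
mean_white <- sum(nvics * freq_white) / sum(freq_white)

# With one binary predictor and a log link, the Poisson MLEs are the log of
# the reference-group mean and the log of the ratio of the group means.
intercept_by_hand <- log(mean_white)
race_black_by_hand <- log(mean_black / mean_white)

print(c(intercept = intercept_by_hand, raceBlack = race_black_by_hand))
print(c(mean_white = mean_white, mean_black = mean_black,
        ratio = mean_black / mean_white))
```

The two hand-computed values should agree with the Estimate column above, and the printed ratio is the multiplicative effect obtained by exponentiating the raceBlack coefficient.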
> sum(resid(hom.poi,type="pearson")^2)
[1] 2279.873
Notice that $X^2 = 2279.873$, which is large compared to $df = 1308 - 2 = 1306$. Thus,
$$\frac{X^2}{df} = \frac{2279.873}{1306} = 1.745692 > 1$$
is an indication of overdispersion. However, we will fit the negative binomial model and calculate the dispersion parameter $D$ for that model to better determine the potential existence of overdispersion.
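The Pearson statistic quoted above can also be reproduced without the fitted model object. The sketch below assumes only the frequency table and the fact that, in this one-predictor Poisson model, the fitted mean of each row is the sample mean of its race group; `X2`, `df` and `ratio` are illustrative names.

```r
# Frequency table: 7 counts (0..6) for each of the two race groups.
nvics <- rep(0:6, 2)
race  <- rep(c("Black", "White"), each = 7)
freq  <- c(119, 16, 12, 7, 3, 2, 0, 1070, 60, 14, 4, 0, 0, 1)

# Fitted Poisson mean for each row: the sample mean of that row's race group.
group_mean <- tapply(nvics * freq, race, sum) / tapply(freq, race, sum)
mu <- as.numeric(group_mean[race])

# Pearson statistic: (y - mu)^2 / mu summed over all 1308 individuals,
# i.e. each table cell contributes its term Freq times.
X2 <- sum(freq * (nvics - mu)^2 / mu)

# Degrees of freedom: N individuals minus p = 2 estimated coefficients.
df <- sum(freq) - 2
ratio <- X2 / df

print(c(X2 = X2, df = df, ratio = ratio))
```

A ratio noticeably above 1 is the informal overdispersion signal described in the text.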
Remark: The degrees of freedom are actually $N - p = 1308 - 2 = 1306$, not 9 as shown in the software output (the 11 table cells with nonzero frequency minus 2 parameters). This is because the observations are weighted by their frequencies instead of being entered as 1308 rows. For example, a black person who knew 0 victims of homicide is weighted by the frequency 119, whereas the data should contain 119 rows (one per individual) for black respondents who knew 0 victims. However, the model fit remains unchanged in all other aspects.
> homicide2=homicide[rep(1:14,homicide$Freq),]
> hom.poi2=glm(nvics~race,family=poisson(link="log"),data=homicide2)
> hom.poi2$df.residual
[1] 1306
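The remark can be checked by fitting the model both ways and comparing the results. This is a minimal, self-contained sketch: it rebuilds the data frame from the frequencies listed earlier, and the object names `fit_weighted`, `fit_expanded` and `expanded` are illustrative rather than taken from the original session.

```r
# Rebuild the frequency table, with White as the reference level.
homicide <- data.frame(
  nvics = rep(0:6, 2),
  race  = factor(rep(c("Black", "White"), each = 7),
                 levels = c("White", "Black")),
  Freq  = c(119, 16, 12, 7, 3, 2, 0, 1070, 60, 14, 4, 0, 0, 1)
)

# Fit 1: 14 rows, each weighted by its frequency.
fit_weighted <- glm(nvics ~ race, family = poisson(link = "log"),
                    data = homicide, weights = Freq)

# Fit 2: one row per individual (1308 rows), no weights.
expanded <- homicide[rep(seq_len(nrow(homicide)), homicide$Freq), ]
fit_expanded <- glm(nvics ~ race, family = poisson(link = "log"),
                    data = expanded)

# Coefficients and standard errors agree; only the residual df differ.
print(cbind(weighted = coef(fit_weighted), expanded = coef(fit_expanded)))
print(cbind(weighted = sqrt(diag(vcov(fit_weighted))),
            expanded = sqrt(diag(vcov(fit_expanded)))))
print(c(weighted = fit_weighted$df.residual,
        expanded = fit_expanded$df.residual))
```

The weighted fit counts only the table cells with nonzero frequency when computing residual degrees of freedom, which is why it is the expanded fit that reports 1306.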
> library(MASS)
> hom.nb=glm.nb(nvics~race,data=homicide,weights=Freq)
> summary(hom.nb)
Call:
glm.nb(formula = nvics ~ race, data = homicide, weights = Freq,
init.theta = 0.2023119205, link = log)
Deviance Residuals:
Min 1Q Median 3Q Max
-12.754 0.000 2.086 3.283 9.114
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.3832 0.1172 -20.335 < 2e-16 ***
raceBlack 1.7331 0.2385 7.268 3.66e-13 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Theta: 0.2023
Std. Err.: 0.0409
2 x log-likelihood: -995.7980
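The output reports Theta, while the text refers to a dispersion parameter D whose computation is not shown. `glm.nb` parameterizes the variance as mu + mu^2/theta, so one plausible reading, treated here as an assumption, is D = 1/theta with Var(Y) = mu + D*mu^2. The sketch below uses the rounded values printed above and a delta-method approximation for the standard error of D; all names are illustrative.

```r
# Values copied (rounded) from the glm.nb summary above.
theta    <- 0.2023
se_theta <- 0.0409

# ASSUMPTION: the dispersion parameter D in the text is 1/theta,
# so that Var(Y) = mu + D * mu^2.
D <- 1 / theta

# Delta-method approximation: SE(1/theta) is roughly SE(theta) / theta^2.
se_D <- se_theta / theta^2

# Fitted means by race from the printed coefficients, and the implied
# negative binomial variances (a Poisson model would force variance = mean).
mu <- c(White = exp(-2.3832), Black = exp(-2.3832 + 1.7331))
nb_variance <- mu + D * mu^2

print(c(D = D, se_D = se_D))
print(rbind(mean = mu, nb_variance = nb_variance))
```

Under this reading, a D estimate several standard errors above 0 supports the negative binomial model over the Poisson model, since D = 0 corresponds to the Poisson case.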
The Wald and likelihood-ratio CIs for $\exp(\beta)$ under both models are
> # Wald CI
> confint.default(hom.poi)
2.5 % 97.5 %
(Intercept) -2.573577 -2.192840
raceBlack 1.445877 2.020412
> exp(confint.default(hom.poi))
2.5 % 97.5 %
(Intercept) 0.0762623 0.1115994
raceBlack 4.2455738 7.5414329
> exp(confint.default(hom.nb))
2.5 % 97.5 %
(Intercept) 0.07332043 0.1160771
raceBlack 3.54571025 9.0299848
> # Likelihood CI
> confint(hom.poi)
2.5 % 97.5 %
(Intercept) -2.579819 -2.198699
raceBlack 1.443698 2.019231
> exp(confint(hom.poi))
2.5 % 97.5 %
(Intercept) 0.0757877 0.1109474
raceBlack 4.2363330 7.5325339
> exp(confint(hom.nb))
2.5 % 97.5 %
(Intercept) 0.07305976 0.1157258
raceBlack 3.57784560 9.1316443
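The Wald intervals above follow directly from the estimate plus or minus a normal quantile times the standard error, exponentiated to the rate-ratio scale. The sketch below reproduces them from the rounded coefficient tables; `wald_ci` is a hypothetical helper written for illustration, not a function from R or MASS, and small discrepancies in the last digits are expected because the inputs are rounded.

```r
# Hypothetical helper: Wald interval on the log scale.
wald_ci <- function(estimate, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)  # about 1.96 for a 95% interval
  c(lower = estimate - z * se, upper = estimate + z * se)
}

# raceBlack estimate and standard error, copied from the two summaries.
ci_poisson <- wald_ci(1.73314, 0.14657)
ci_negbin  <- wald_ci(1.7331, 0.2385)

# Log-scale intervals, then the rate-ratio scale via exp().
print(rbind(poisson = ci_poisson, negbin = ci_negbin))
print(rbind(poisson = exp(ci_poisson), negbin = exp(ci_negbin)))
```

The negative binomial interval is wider because its standard error allows for the extra variation that the Poisson model ignores.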