HW 2, 512

DEPAUL UNIVERSITY
HW 2
MAT 512 Li
Peter Drogos
2/12/2014
NOTE: R code is bolded in small font
1.
l. R verifies this result (see the output below).
Call:
lm(formula = Minutes ~ Copiers, data = copier.data)

Residuals:
    Min      1Q  Median      3Q     Max
-6.8729 -2.9696 -0.4751  2.8260  7.3315

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  11.4641     3.4390   3.334  0.00875 **
Copiers      24.6022     0.8045  30.580 2.09e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 4.615 on 9 degrees of freedom
Multiple R-squared: 0.9905, Adjusted R-squared: 0.9894
F-statistic: 935.1 on 1 and 9 DF, p-value: 2.094e-10
m. The calculated simple correlation coefficient is r = 0.995. This indicates a strong positive linear relationship between Minutes
and Copiers.
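As a quick check on part m, the correlation coefficient can be recovered from the R output alone. This is a minimal sketch that assumes only the reported Multiple R-squared (0.9905) and the sign of the fitted slope (24.6022); the variable names are illustrative.

```r
# Values copied from the summary(copier.model) output
r.squared <- 0.9905     # Multiple R-squared
slope     <- 24.6022    # estimated coefficient of Copiers

# In simple linear regression, r has the sign of the slope
# and magnitude sqrt(R^2)
r <- sign(slope) * sqrt(r.squared)
round(r, 3)
```

With the data loaded, cor(copier.data$Copiers, copier.data$Minutes) should give the same value.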
n. The F statistic from the R output, F = 935.1 on 1 and 9 degrees of freedom, exceeds the rejection point. This means that we reject
H0: β1 = 0, and we say that the simple regression model is significant at the chosen level of significance. This matches the results
from R, where the p-value (2.094e-10) is much less than the level of significance. This confirms our strong relationship between
Minutes and Copiers.
Call:
lm(formula = Minutes ~ Copiers, data = copier.data)

Residuals:
    Min      1Q  Median      3Q     Max
-6.8729 -2.9696 -0.4751  2.8260  7.3315

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  11.4641     3.4390   3.334  0.00875 **
Copiers      24.6022     0.8045  30.580 2.09e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

F-statistic: 935.1 on 1 and 9 DF, p-value: 2.094e-10
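The F test in part n can be reproduced from the reported statistic alone. The sketch below assumes a 0.05 level of significance (the level is not legible in the text) and takes the F statistic, its degrees of freedom, and the slope's t value from the R output; the variable names are illustrative.

```r
# Values copied from the R output
f.stat <- 935.1   # F-statistic
df1    <- 1       # numerator degrees of freedom
df2    <- 9       # denominator degrees of freedom
alpha  <- 0.05    # ASSUMED level of significance

f.crit  <- qf(1 - alpha, df1, df2)                    # rejection point
p.value <- pf(f.stat, df1, df2, lower.tail = FALSE)   # upper-tail area

f.stat > f.crit    # TRUE means H0: beta1 = 0 is rejected
p.value < alpha

# With a single predictor, F equals the squared t statistic of the slope
t.slope <- 30.580
t.slope^2
```

The last line is a useful cross-check: the squared t value of Copiers agrees with the F statistic up to rounding.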
2.
a.
The first data plot indicates that as the number of monthly X-ray exposures increases, the number of monthly labor hours required
increases. The same can be said of monthly occupied bed days and monthly labor hours. These plots suggest linear regression models.
The plot of the average length of a patient's stay against monthly labor hours required also suggests a linear relationship, but it is
less well defined. The data plots indicate that the given model might be reasonable because we wish to predict the monthly labor hours
required based on the monthly X-ray exposures, the monthly occupied bed days, and the average length of a patient's stay.
b. (see attached written sheet)
c. Without using the handy functions, we find that the ordinary least squares estimate is b = (X'X)^(-1) X'y (see attached Excel work
and R code below).
> b<-solve(t(X)%*%X)%*%t(X)%*%y
> b
                     [,1]
(Intercept) 1946.80203866
x1             0.03857709
x2             1.03939197
x3          -413.75779647
So, we see that the point estimates for β0, β1, β2, and β3 are 1946.802, 0.0386, 1.03939, and -413.7578, respectively.
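The matrix formula in part c can be tried end to end as follows. This sketch assumes the 16 observations as transcribed from the Excel sheet at the end of this document (the file 't4-11 hospital.txt' itself is not reproduced here). It solves the normal equations with crossprod() rather than forming the inverse explicitly, which is a swapped-in variant and not the exact call used above.

```r
# Hospital data as transcribed from the Excel sheet in this document
x1 <- c(2463, 2048, 3940, 6505, 5723, 11520, 5779, 5969,
        8461, 20106, 13313, 10771, 15543, 34703, 39204, 86533)
x2 <- c(472.92, 1339.75, 620.25, 568.33, 1497.6, 1365.83, 1687, 1639.92,
        2872.33, 3655.08, 2912, 3921, 3865.67, 12446.33, 14098.4, 15524)
x3 <- c(4.45, 6.92, 4.28, 3.9, 5.5, 4.6, 5.62, 5.15,
        6.18, 6.15, 5.88, 4.88, 5.5, 10.78, 7.05, 6.35)
y  <- c(566.52, 696.82, 1033.15, 1603.62, 1611.37, 1613.27, 1854.17, 2160.55,
        2305.58, 3503.93, 3571.89, 3741.4, 4026.52, 11732.17, 15414.94, 18854.45)

# Design matrix with a column of ones for the intercept
X <- cbind("(Intercept)" = 1, x1 = x1, x2 = x2, x3 = x3)

# Least squares estimate: solve (X'X) b = X'y
b <- solve(crossprod(X), crossprod(X, y))

# Same fit through lm(), for comparison
b.lm <- coef(lm(y ~ x1 + x2 + x3))
cbind(normal.eq = drop(b), lm = b.lm)
```

If the transcription is accurate, both columns should agree with the point estimates reported in part c.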
d. The results from c) for the point estimates match those obtained with the handy functions in R (see below).
Call:
lm(formula = y ~ x1 + x2 + x3)

Residuals:
    Min      1Q  Median      3Q     Max
-677.23 -270.19   60.93  228.32  517.70

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) 1946.80204  504.18193   3.861  0.00226 **
x1             0.03858    0.01304   2.958  0.01197 *
x2             1.03939    0.06756  15.386 2.91e-09 ***
x3          -413.75780   98.59828  -4.196  0.00124 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The point estimates for β0, β1, β2, and β3 (from R output) are 1946.802, 0.0386, 1.03939, and -413.7578, respectively.
e. When x1=56194, x2=14077.88, and x3=6.89, we calculate a point prediction of 15896.25 (see attached sheet).
f. Using R, we obtain the same results as the written solution.
> predict(hosp.model,data.frame(x1=56194,x2=14077.88,x3=6.89))
       1
15896.25
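The point prediction in parts e and f is just the fitted equation evaluated at the new observation. A minimal sketch, assuming the point estimates from part c and the x values from part e (the names b, x0, and y.hat are illustrative):

```r
# Point estimates from part c and the new observation from part e
b  <- c(1946.80203866, 0.03857709, 1.03939197, -413.75779647)
x0 <- c(1, 56194, 14077.88, 6.89)   # leading 1 multiplies the intercept

# b0 + b1*x1 + b2*x2 + b3*x3
y.hat <- sum(b * x0)
round(y.hat, 2)
```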
g. The 95% PI of the labor hours corresponding to (x1 = 56194, x2 = 14077.88, x3 = 6.89) is [14906.24, 16886.26].

> X<-model.matrix(~x1+x2+x3,data=hosp.data)
> b<-solve(t(X)%*%X)%*%t(X)%*%y          # least squares estimates
> x0<-c(1,56194,14077.88,6.89)           # new observation, with 1 for the intercept
> y.pred<-t(b)%*%x0                      # point prediction
> t.crit<-qt(0.975,length(y)-length(b))  # named t.crit so that t() is not masked
> DV<-t(x0)%*%solve(t(X)%*%X)%*%x0       # distance value
> sum(resid(hosp.model)^2)               # SSE
> s.square<-sum(resid(hosp.model)^2)/(length(y)-4)
> s<-sqrt(s.square)
> low.PI<-y.pred-(t.crit*s*sqrt(1+DV))
> upp.PI<-y.pred+(t.crit*s*sqrt(1+DV))
h. The 95% PI of the labor hours corresponding to (x1 = 56194, x2 = 14077.88, x3 = 6.89) obtained by R is [14906.24, 16886.26].
> predict(hosp.model,data.frame(x1=56194,x2=14077.88,x3=6.89),interval="p",level=0.95)
       fit      lwr      upr
1 15896.25 14906.24 16886.26
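The reported interval can be sanity-checked without the data. The sketch below assumes only the numbers reported in this write-up (fit, limits, s = 387.1598, and n = 16 observations with 3 predictors). It checks that the interval is centered on the point prediction and backs out the distance value implied by the half-width; the names are illustrative.

```r
# Values reported in the write-up
fit <- 15896.25
lwr <- 14906.24
upr <- 16886.26
s   <- 387.1598   # standard error of the regression
n   <- 16         # observations
k   <- 3          # predictors

(lwr + upr) / 2                  # midpoint, should equal the point prediction
half.width <- (upr - lwr) / 2

# half.width = t.crit * s * sqrt(1 + DV), so DV can be backed out
t.crit <- qt(0.975, n - (k + 1))
dv <- (half.width / (t.crit * s))^2 - 1
dv
```

The backed-out value should match the DV computed from the design matrix in part g, up to rounding of the reported limits.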
i. The point estimate of the standard deviation of the error term is s = 387.1598, the square root of the unbiased variance estimate
s^2 = SSE/(n - 4).
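The value of s follows directly from the unexplained variation on the Excel sheet. A minimal sketch, assuming SSE = 1798712.99 and n = 16 as recorded there (the names are illustrative):

```r
sse <- 1798712.99   # unexplained variation (SSE) from the Excel sheet
n   <- 16           # observations
k   <- 3            # predictors

s.square <- sse / (n - (k + 1))   # mean square error
s <- sqrt(s.square)               # standard error of the regression
s
```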
j. The adjusted R^2 =0.995155651 (see attached Excel work)

k. The result obtained by R confirms this result.
> summary(hosp.model)
Residual standard error: 387.2 on 12 degrees of freedom
F-statistic: 1028 on 3 and 12 DF, p-value: 9.919e-15
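R^2 and adjusted R^2 in part j can be recomputed from the two sums on the Excel sheet. This sketch assumes the explained variation (462327888.6) and unexplained variation (1798712.99) recorded there, with n = 16 and 3 predictors; the names are illustrative.

```r
explained   <- 462327888.6   # explained variation from the Excel sheet
unexplained <- 1798712.99    # unexplained variation (SSE)
n <- 16                      # observations
k <- 3                       # predictors

total <- explained + unexplained
r.squared     <- explained / total
adj.r.squared <- 1 - (1 - r.squared) * (n - 1) / (n - (k + 1))
c(r.squared = r.squared, adj.r.squared = adj.r.squared)
```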
l. The model we use is significant according to the F-test: the p-value is far below the 0.05 level of significance (p<0.05).
F-statistic: 1028 on 3 and 12 DF, p-value: 9.919e-15
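The overall F statistic can likewise be rebuilt from the explained and unexplained variation. A minimal sketch under the same assumptions as the Excel sheet (n = 16, 3 predictors) and a 0.05 level of significance; the names are illustrative.

```r
explained   <- 462327888.6   # explained variation from the Excel sheet
unexplained <- 1798712.99    # unexplained variation (SSE)
n <- 16                      # observations
k <- 3                       # predictors

# Mean square for the model divided by mean square error
f.stat  <- (explained / k) / (unexplained / (n - (k + 1)))
p.value <- pf(f.stat, k, n - (k + 1), lower.tail = FALSE)

round(f.stat)
p.value < 0.05   # TRUE means the model is significant at the 0.05 level
```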
m. Based on the t-test results, there is no term in the model we should drop: each term has p<0.05, so each term is significant at
the 0.05 level of significance (see the coefficient table below).
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) 1946.80204  504.18193   3.861  0.00226 **
x1             0.03858    0.01304   2.958  0.01197 *
x2             1.03939    0.06756  15.386 2.91e-09 ***
x3          -413.75780   98.59828  -4.196  0.00124 **
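Each t value in the table is the estimate divided by its standard error, and the p-value is the two-sided tail area with 12 degrees of freedom. The sketch below assumes the rounded estimates and standard errors printed in the table, so the results agree with R's output only up to rounding; the names are illustrative.

```r
# Estimates and standard errors copied from the coefficient table
est <- c("(Intercept)" = 1946.80204, x1 = 0.03858, x2 = 1.03939, x3 = -413.75780)
se  <- c(504.18193, 0.01304, 0.06756, 98.59828)
df  <- 12      # residual degrees of freedom
alpha <- 0.05  # level of significance used in part m

t.value <- est / se
p.value <- 2 * pt(abs(t.value), df, lower.tail = FALSE)   # two-sided p-values

round(cbind(t.value, p.value), 5)
p.value < alpha   # TRUE for every term that should be kept
```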
3. (see attached written sheet)

-------------------------------------------------------------R code (no output)
1.
> copier.data<-read.table('t3-7 service time.txt', header=T, sep=",")
> x<-copier.data[,1]
> y<-copier.data[,2]
> copier.model<-lm(Minutes~Copiers,data=copier.data)
> summary(copier.model)
2.
> hosp.data<-read.table('t4-11 hospital.txt', header=T,sep=",")
> hosp.data
> x1<-hosp.data[,1]
> x2<-hosp.data[,2]
> x3<-hosp.data[,3]
> y<-hosp.data[,4]
> hosp.model<-lm(y~x1+x2+x3)
> predict(hosp.model,data.frame(x1=56194,x2=14077.88,x3=6.89),interval="p",level=0.95)
> X<-model.matrix(~x1+x2+x3,data=hosp.data)
> X
> b<-solve(t(X)%*%X)%*%t(X)%*%y
> x0<-c(1,56194,14077.88,6.89)           # must be defined before y.pred
> y.pred<-t(b)%*%x0
> t.crit<-qt(0.975,length(y)-length(b))  # named t.crit so that t() is not masked
> DV<-t(x0)%*%solve(t(X)%*%X)%*%x0
> summary(hosp.model)
> sum(resid(hosp.model)^2)
> s.square<-sum(resid(hosp.model)^2)/(length(y)-4)
> s<-sqrt(s.square)
> low.PI<-y.pred-(t.crit*s*sqrt(1+DV))
> upp.PI<-y.pred+(t.crit*s*sqrt(1+DV))
-------------------------------------------------------Excel written work

    x1        x2     x3         y  total variation          y^  explained variation
  2463    472.92   4.45    566.52         16618887   692.14447          15610420.02
  2048   1339.75   6.92    696.82         15573496   555.12936          16711887.21
  3940    620.25   4.28   1033.15         13032077   972.59527          13472949.06
  6505    568.33    3.9   1603.62          9238724   1174.8082          12029372.88
  5723    1497.6    5.5   1611.37          9191671   1448.5043          10205741.45
 11520   1365.83    4.6   1613.27          9180154    1907.557           7483452.04
  5779      1687   5.62   1854.17          7778392   1597.8745          9273683.966
  5969   1639.92   5.15   2160.55          6163287   1750.7357           8366042.31
  8461   2872.33   6.18   2305.58          5464219   2701.6564           3769385.45
 20106   3655.08   6.15   3503.93          1297815   3976.8834          443907.0666
 13313      2912   5.88   3571.89          1147591   3054.1924          2524776.288
 10771      3921   4.88    3741.4         813147.4   4418.6337           50406.1466
 15543   3865.67    5.5   4026.52         380228.7   4288.6842          125643.7705
 34703  12446.33  10.78  11732.17         50254249   11761.849          50675922.86
 39204   14098.4   7.05  15414.94         1.16E+08    15195.95          111361644.5
 86533     15524   6.35  18854.45         2.02E+08   18793.152          200222653.6
   sum                                    4.64E+08                      462327888.6

mean of y   4643.147

                 x0   pt. estimates (from R)
intercept                        1946.802039
x1            56194               0.03857709
x2         14077.88               1.03939197
x3             6.89             -413.7577965

y^          15896.24724
unexpl var  1798712.99
R^2         0.996124521
adj. R^2    0.995155651
