
COURSE: DSCI 3710 Print Name:

Exam 2 version A Signature:

Spring 2014 Student ID#:

INSTRUCTIONS:

Please print your name and student ID number on this exam. Also, put your
signature on this exam.

On your scantron PRINT your name and exam version and fill in your student ID
number. To better protect your privacy also print your name on the backside of
your scantron.

You have 75 minutes to complete this 25-question exam. The exam is open
book, open notes, and open mind. You may use any type of hand calculator but
please show all your work on the exam and mark all answers on the scantron.
Usage of cell phones, digital cameras, PDAs, instant messenger, or any other
communication device is strictly prohibited.

Please DO NOT pull this exam apart. When you have completed the exam, please
turn in your scantron and exam booklet to your instructor, at the front desk.

No cheating.

Good luck and we wish you well on the exam.

Note: (1) Whenever question(s) are connected you may be asked to assume a result (given a
value) as an answer for the previous question but this result (value) may or may not be
correct. The procedure is set in place to prevent you from losing points on a subsequent
question because you made a mistake on some previous question/s.
(2) It may not be necessary to compute all the missing values in the partially filled
Tables; please read the questions and compute only the ones you must, to answer those
questions.
Use the following information to answer questions 1 - 7.

An instructor of one of the sections in DSCI 3710 would like to predict a student's final exam
score (in points) based on the amount of time (in hours) spent in studying. A (partial) simple
linear regression of the data follows:

Regression Statistics
Multiple R          XXX
R Square            XXX
Adjusted R Square   XXX
Standard Error      39.944
Observations        9

ANOVA
              df    SS            MS      F       Significance F
Regression    1     2607.4589     XXXX    XXXX    XXXX
Residual      X     XXXX          XXXX
Total         8     13776.0000

                Coefficients   SE        t Stat    P-value   Lower 95%   Upper 95%
Intercept       134.0034       32.0932   4.1754    XXXX      58.1150     XXXX
Time (Hours)    6.3390         4.9587    XXXX      XXXX      XXXX        18.0644

Durbin-Watson statistic = 2.816668

RESIDUAL OUTPUT

Observation   Standardized Residuals   Leverages
1             0.950                    0.166
2             -1.814                   0.344
3             0.489                    0.123
4             0.028                    0.260
5             -0.374                   0.240
6             0.955                    0.111
7             0.726                    0.260
8             -1.939                   0.372
9             0.489                    0.123
1. What is the correlation (R) between these two variables?

A. 0.05
B. 0.08
C. 0.19
D. 0.44*
E. 0.65
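
The keyed answer can be checked from the ANOVA sums of squares: in simple linear regression, R Square = SS Regression / SS Total, and R takes the sign of the slope. A minimal sketch of that check, with variable names chosen here for illustration only:

```python
import math

# Sums of squares read from the ANOVA table of the regression output
ss_regression = 2607.4589
ss_total = 13776.0000

# Share of the variation in exam score explained by study time
r_square = ss_regression / ss_total

# The slope estimate (6.3390) is positive, so R is the positive root
r = math.sqrt(r_square)

print(f"R Square = {r_square:.3f}, R = {r:.2f}")
```

The same R Square value, read as a percentage, is what question 4 asks for.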

2. What is the calculated t statistic for the test of the coefficient of the variable Time?

A. 3.84
B. 2.86
C. 1.28*
D. 0.96
E. 0.78
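
The t statistic for a regression coefficient is the estimate divided by its standard error. A short sketch using the two values printed in the Time (Hours) row; the variable names are illustrative:

```python
# Values from the Time (Hours) row of the coefficient table
coefficient = 6.3390
standard_error = 4.9587

# Test statistic for H0: slope = 0
t_stat = coefficient / standard_error

print(round(t_stat, 2))
```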

3. What is the critical value (absolute) of the test statistic for testing (t-test) the significance of
Time at the 0.05 level of significance?

A. 1.895
B. 2.306
C. 1.860
D. 2.365*
E. 1.965

4. What percentage of the Final exam score is explained by its regression on Time spent on
studying?

A. 5%
B. 8%
C. 19%*
D. 44%
E. 65%

5. Are there any outliers in this data set?

A. None, because all leverage values are within +/- 3
B. None, because all Cook's distance values are within +/- 3
C. None, because all standardized residual values are within +/- 2*
D. One, since one value of the Cook's distance is greater than 1
E. One, since one value of leverage is greater than 0.35
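
The screening rule behind the keyed answer can be sketched as follows, assuming the +/- 2 cutoff on standardized residuals that the answer choices refer to. The list simply copies the nine values from the residual output:

```python
# Standardized residuals for observations 1 through 9
standardized_residuals = [0.950, -1.814, 0.489, 0.028, -0.374,
                          0.955, 0.726, -1.939, 0.489]

CUTOFF = 2.0  # assumed rule of thumb: |standardized residual| > 2 flags an outlier

# Collect (observation number, residual) pairs that exceed the cutoff
outliers = [(obs, res)
            for obs, res in enumerate(standardized_residuals, start=1)
            if abs(res) > CUTOFF]

print(outliers)  # an empty list means no observation is flagged
```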

6. What seems to be the effect of studying for one extra hour (regardless of significance)?

A. The final exam score would increase by 6.34 points*
B. The final exam score would decrease by 6.34 points
C. The final exam score would increase by 40.12 points
D. The final exam score would increase by 134.00 points
E. The final exam score would decrease by 134.00 points

7. What is the predicted exam score for the fifth observation?

A. 153.0*
B. 159.4
C. 165.7
D. 172.0
E. 178.4

Use the following to answer the questions 8 - 11.

In a research study published in 1963, students in grades 4-6 were asked whether good grades,
athletic ability, or popularity was most important to them. The table below provides a
classification of the 335 students by Goals and Grade Classification. Do the data present
sufficient evidence to indicate that student goals depend on their grade classification?

Source: Adapted from Chase, M.A and Dummer, G.M. (1992), "The Role of Sports as a Social Determinant for Children,"
Research Quarterly for Exercise and Sport, 63, 418-424.

8. What is the alternative hypothesis?

A. Ha: Goals is greater than Grade Classification
B. Ha: Goals is less than Grade Classification
C. Ha: Goals and Grade Classification are dependent*
D. Ha: Goals and Grade Classification are independent
E. None of the above

9. What is the expected number of 5th-grade children for whom sports is the most
important goal?

A. 18.9
B. 22.2*
C. 26.9
D. 31.6
E. 46.1

10. Suppose the value of the test statistic is 10.46. Which one of the following would best
describe the p-value of the test?

A. p > 0.10
B. p < 0.01
C. 0.05 < p < 0.10
D. 0.025 < p < 0.05*
E. 0.01 < p < 0.025
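
The p-value range can be checked numerically. The sketch below assumes a 3 x 3 table (three grade levels by three goals), giving (3 - 1)(3 - 1) = 4 degrees of freedom, and uses the closed form of the chi-square upper tail that holds for even degrees of freedom. The helper name `chi2_sf_even_df` is made up for this illustration:

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    # For df = 2k: P(X >= x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    partial_sum = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * partial_sum

rows, cols = 3, 3                # assumed layout: 3 goals by 3 grade levels
df = (rows - 1) * (cols - 1)     # degrees of freedom for the independence test
p_value = chi2_sf_even_df(10.46, df)

print(df, round(p_value, 3))
```

The result falls between 0.025 and 0.05, which agrees with the keyed range.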

11. Assuming the p-value is 0.074, what would be the decision at the 0.05 significance level?

A. Reject Ho and conclude that Goals and Grade Classification are dependent
B. Reject Ho and conclude Goals and Grade Classification are independent
C. Fail to Reject Ho and conclude that Goals and Grade Classification are dependent
D. Fail to Reject Ho and conclude that Goals and Grade Classification are independent*
E. Reject Ho and conclude that Goals is greater than Grade Classification

Use the following to answer the questions 12 - 15.

The U.S. film industry generates billions of dollars in revenue every year from box office sales.
A random sample of 48 movies from 2011 was selected for analysis to build a predictive model
to help forecast the potential first month box office revenue of an upcoming new release. The
total production costs, total amount spent on promotions and the category of the film (Action,
Comedy or Drama) have all been recorded along with their total first month's box office revenue.
The movie types were coded through the use of dummy variables. A sample of the coded data is
shown below. Regression was performed at the 0.05 significance level.

First month box office revenue   Total production costs   Total promotional costs   Drama   Action
(in millions)                    (in millions)            (in millions)
41.12                            11.62                    3.52                      0       0
110.88                           34.61                    10.27                     0       1
124.19                           35.75                    12.21                     0       1
84.92                            26.51                    12.32                     1       0
83.20                            26.17                    11.52                     1       0
77.20                            24.10                    11.20                     1       0

Regression Statistics
Multiple R          XXX
R Square            XXX
Adjusted R Square   XXX
Standard Error      XXX
Observations        48

ANOVA
              df    SS          MS       F       Significance F
Regression    XX    53940.29    XXXX     XXXX    XXXX
Residual      43    XXXX        10.63
Total         XX    XXXX

                           Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept                  8.194          2.229            3.676     0.001     3.698       12.689
Total Production Costs     XXXX           0.175            14.153    0.001     2.123       XXXX
Total Promotional Costs    XXXX           0.280            2.132     0.039     0.032       1.163
Drama                      3.400          2.756            XXXX      0.224     -2.158      8.958
Action                     15.553         XXXX             3.621     0.001     6.891       24.214

12. Based on this output, which set of independent variables are found to be significant
predictors of first month box office revenue at the 0.05 level of significance?

A. Action and Drama
B. Total Production Costs and Drama
C. Total Production Costs, Total Promotional Costs, and Action*
D. Drama and Total Production Costs
E. None of the above

13. The calculated value of the test statistic to test the complete regression model
(i.e., for Ho: all the β's = 0) is:

A. 6.760
B. 17.332
C. 132.33
D. 457.05
E. 1268.69*
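
The overall F statistic is MS Regression divided by MS Residual. The sketch below fills in the missing degrees of freedom from the 48 observations and 43 residual degrees of freedom shown in the output. Because the printed MS Residual (10.63) is rounded, the result matches the keyed value only to about four significant figures:

```python
n_observations = 48
df_residual = 43
# Total df is n - 1, so the regression df is what remains (one per predictor)
df_regression = (n_observations - 1) - df_residual

ss_regression = 53940.29
ms_residual = 10.63            # rounded value as printed in the ANOVA table

ms_regression = ss_regression / df_regression
f_stat = ms_regression / ms_residual

print(df_regression, round(f_stat, 1))
```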

14. Find the upper limit of the 95% confidence interval for the coefficient of Total Production Costs.

A. 1.163
B. 2.123
C. 2.828*
D. 3.566
E. 6.891
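
One way to reach the upper limit without a t table is to use the symmetry of the interval around the estimate. The coefficient itself is recovered as t Stat times Standard Error, and the upper limit lies as far above the estimate as the printed lower limit lies below it. This is a sketch built on the rounded printed values, so the last digit can differ slightly from the key:

```python
# Values from the Total Production Costs row
t_stat = 14.153
standard_error = 0.175
lower_95 = 2.123

# t Stat = coefficient / SE, so the missing coefficient is their product
coefficient = t_stat * standard_error

# The interval is symmetric: estimate +/- margin
margin = coefficient - lower_95
upper_95 = coefficient + margin

print(round(coefficient, 3), round(upper_95, 2))
```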

15. What is the predicted value of first month box office revenue (in millions of dollars) for an
upcoming Action film with 36.2 million in Production Costs and 11.2 million in Promotion
Costs?

A. 137.63
B. 120.06*
C. 103.59
D. 84.73
E. 65.56
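
The prediction plugs the film's values into the fitted equation. The two missing slopes are recovered here as t Stat times Standard Error from the printed output, which introduces a small rounding difference from the keyed answer. An Action film has Action = 1 and Drama = 0 under the dummy coding described above:

```python
# Coefficients: printed directly, or recovered as t Stat * Standard Error
intercept = 8.194
b_production = 14.153 * 0.175   # Total Production Costs
b_promotion = 2.132 * 0.280     # Total Promotional Costs
b_drama = 3.400
b_action = 15.553

# Upcoming Action film (costs in millions of dollars)
production_costs = 36.2
promotional_costs = 11.2
drama, action = 0, 1

predicted_revenue = (intercept
                     + b_production * production_costs
                     + b_promotion * promotional_costs
                     + b_drama * drama
                     + b_action * action)

print(round(predicted_revenue, 1))
```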

16. At any fixed level of Production Costs and Promotional Costs, which film type would be
predicted to provide the largest amount of first month box office revenue?

A. Action*
B. Comedy
C. Drama
D. Insufficient information provided to answer the question

Use the following to answer the questions 17 - 20.

Archeological research in Outer Mongolia is being performed at four separate sites. The head
researcher coordinating these separate digs believes that there is a difference in the depths at
which the artifacts at each site have been discovered. To test his belief, we have selected the
last 9 artifacts at each site and recorded the depth at which they were recovered. The data and
an analysis (partially complete) using the standard ANOVA procedure in Excel, conducted at the
5% significance level, follow:

Anova: Single Factor

SUMMARY
Groups    Count   Sum       Average   Variance
Site 1    9       886.000   XXXX      182.528
Site 2    9       500.000   55.556    341.278
Site 3    9       515.000   57.222    267.944
Site 4    9       691.000   XXXX      37.694

ANOVA
Source of Variation   SS          df   MS         F     P-value   F crit
Between Groups        XXXX        X    3632.815   XXX   XXXX      XXXX
Within Groups         6635.556    XX   XXXX
Total                 17534.000   35

TUKEY MULTIPLE COMPARISON TEST
Critical Q   Distance   Alpha
3.85         18.48      0.05

Means joined by a double line are not significantly different.

Site 2    Site 3    Site 4    Site 1
55.556    57.222    76.778    98.444

17. Which of the following would be the appropriate alternative hypothesis to test for a
difference in the mean depth at which artifacts were discovered at the four sites?

A. HA: Not all mean depths are equal*
B. HA: μ1 ≠ μ2 ≠ μ3 ≠ μ4
C. HA: μ1 = μ2 = μ3 = μ4
D. HA: μ1 < μ2 < μ3 < μ4
E. HA: μ1 > μ2 > μ3 > μ4

18. What is the calculated value of the test statistic?


A. 207.36
B. 18.48
C. 17.52*
D. 3.85
E. 2.90
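
The F statistic is MS Between divided by MS Within. The sketch below derives the missing degrees of freedom from the four sites and 36 total observations, then computes MS Within from the printed SS Within. Variable names are illustrative:

```python
n_groups = 4
n_total = 4 * 9                     # nine artifacts at each of four sites

df_between = n_groups - 1           # 3
df_within = n_total - n_groups      # 32

ss_within = 6635.556
ms_between = 3632.815               # printed in the ANOVA table

ms_within = ss_within / df_within
f_stat = ms_between / ms_within

print(df_between, df_within, round(ms_within, 2), round(f_stat, 2))
```

Note that choice A (207.36) is MS Within rather than the test statistic.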

19. What is the critical value of the test statistic? Use α = 0.05.

A. 1.645
B. 1.960
C. 2.33
D. 3.196
E. 2.901*

20. Using α = 0.05 and assuming the calculated F value is 2.01, what is the conclusion of this
ANOVA test?

A. There is insufficient information to decide whether the depths are different.
B. Fail to reject H0, conclude there is insufficient evidence to indicate that there is a
difference in the mean depth of discovery among the four sites.*
C. Fail to reject H0, conclude the mean depths of discovery at the four sites are all
different.
D. Reject H0, conclude the mean depth of discovery at the four sites is different.
E. Reject H0, conclude the mean depths of discovery at the four sites are
equal.

Use the following to answer the questions 21 - 23.

The average number of days that new condominiums built on Grand Cayman's Seven Mile
Beach stayed on the market before selling was reported to be 120 days by an international
real-estate publication. Nine new condominiums that have been sold during this calendar year on
Grand Cayman's Seven Mile Beach were randomly selected, and the number of days that each
stayed on the market was collected. Is there evidence at the 0.05 level to suggest that the mean
number of days a new condominium spends on the market is different from what was reported by
the international publication?

Days    t Test for Population Mean
107     Days
111     Number of Observations         9
127     Sample Standard Deviation      6.716481
123     Sample Mean                    116.111111
112     Ho: XX    Ha: XX
114     T*                             XXX
111     2 * P[T > |T*|] two tail       XXX
118     | T Critical |, α = 0.05       2.306004
122     95% CI for Pop. Mean           110.948367 to 121.273855

21. Which of the following would be appropriate for the null hypothesis?

A. μ > 120
B. μ ≠ 120
C. μ < 120
D. μ = 120*
E. μ < 120

22. What is the absolute value of the calculated test statistic to test the mean number of days a
home spends on the market?

A. 116.11
B. 110.95
C. 6.72
D. 2.548
E. 1.74*
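
The test statistic can be reproduced from the nine observations in the Days column: t = (sample mean - hypothesized mean) / (s / sqrt(n)). A minimal sketch using only the Python standard library, with illustrative variable names:

```python
import math
import statistics

# Days on the market for the nine sampled condominiums
days = [107, 111, 127, 123, 112, 114, 111, 118, 122]
hypothesized_mean = 120

n = len(days)
sample_mean = statistics.mean(days)
sample_sd = statistics.stdev(days)        # sample standard deviation (n - 1 divisor)

standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - hypothesized_mean) / standard_error

print(round(sample_mean, 3), round(sample_sd, 3), round(abs(t_stat), 2))
```

The mean and standard deviation agree with the values printed in the output table, and the sign of t is negative because the sample mean is below 120.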

23. Assuming the hypothesis test is conducted at the 1% significance level, what should be the
conclusion?

A. There is evidence that the average number of days on the market is different than 120
because the absolute value of the test statistic (calculated) is larger than the table (critical)
value.
B. There is insufficient evidence that the average number of days on the market is different
than 120 because the absolute value of the test statistic (calculated) is larger than the table
(critical) value.
C. There is evidence that the average number of days on the market is larger than 120 because
the absolute value of the test statistic (calculated) is smaller than the table (critical) value.
D. There is insufficient evidence that the average number of days on the market is different
than 120 because the absolute value of the test statistic (calculated) is smaller than the
table (critical) value.*
E. There is very strong evidence that the average number of days on the market is
different than 120 because the absolute value of the test statistic (calculated) is equal to
the table (critical) value.

Use the following to answer the questions 24 - 25.

The U.S. Census Bureau conducts annual surveys to obtain information on the percentage of the
voting-age population that is registered to vote. Suppose that 639 employed persons and 504
unemployed persons are independently and randomly selected and that 394 of the employed
persons and 218 of the unemployed persons have registered to vote. Can we conclude that the
percentage of the employed workers (p1), who have registered to vote, exceeds the percentage of
unemployed workers (p2), who have registered to vote? Use a significance level of α = 0.10 for
the test.

Z Test for Two Proportions

                          Variable 1   Variable 2
Sample Proportion         0.616588     0.432540
Number of Observations    639          504
Ho: XX    Ha: XX
Z*                        6.19
P[Z > Z*]                 XX
Z Critical, α = 0.1       1.281552
90% CI for p1 - p2        0.135854 to XXXX

24. What is the estimated pooled proportion of people in the combined sample who have
registered to vote?

A. 0.617
B. 0.535*
C. 0.585
D. 0.525
E. 0.184
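
The pooled proportion combines the registered voters from both samples over the combined sample size. The sketch below also reproduces the printed Z* from that pooled estimate, assuming the usual pooled standard error for a two-proportion z test:

```python
import math

# Registered voters and sample sizes
x1, n1 = 394, 639    # employed
x2, n2 = 218, 504    # unemployed

p1 = x1 / n1
p2 = x2 / n2

# Pooled proportion under H0: p1 = p2
pooled = (x1 + x2) / (n1 + n2)

# Standard error under H0, then the z statistic
se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z_stat = (p1 - p2) / se_pooled

print(round(pooled, 3), round(z_stat, 2))
```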

25. What is the upper limit of the 90% CI for p1 - p2?

A. 0.011
B. 0.134
C. 0.232*
D. 0.184
E. 0.535
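
The interval can be sketched as (p1 - p2) +/- z * SE, assuming the usual unpooled standard error and the two-sided 90% critical value of about 1.645. `statistics.NormalDist` from the Python standard library supplies that critical value. The computed lower limit agrees with the printed 0.135854 to about four decimal places, so small rounding differences are to be expected:

```python
import math
from statistics import NormalDist

x1, n1 = 394, 639    # employed
x2, n2 = 218, 504    # unemployed
p1, p2 = x1 / n1, x2 / n2

# A two-sided 90% interval leaves 5% in each tail
z_crit = NormalDist().inv_cdf(0.95)

# Unpooled standard error of the difference in sample proportions
se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

difference = p1 - p2
lower = difference - z_crit * se_diff
upper = difference + z_crit * se_diff

print(round(lower, 3), round(upper, 3))
```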

Thanks! And, we wish you a Great Summer!
