STAT1008 Final Exam S1 2006

© All Rights Reserved

First Semester Examination Final, June 2006

QUANTITATIVE RESEARCH METHODS (STAT1008)

Study Period: 15 minutes

Time Allowed: 3 hours

Permitted Material: Calculator, dictionary and 1 A4 page with notes on both sides

Instructions to Candidates:

Attempt ALL questions.

Each question is of equal mark value.

Start your solution to each question on a new page.

To ensure full marks, show all the steps in working out your solution. Marks may be deducted for failure to show appropriate calculations or formulae.

Unless otherwise stated, use a significance level of 5%.

Selected statistical tables are attached to the back of the examination paper.

Page 1 of 12

Question 1: 20 marks

For each question below, choose the best answer from the options given.

Write your answer in your answer booklet, clearly indicating the question (i to xx) and your answer as the appropriate letter (A, B, C, or D).

You will gain 1 mark for each correct answer. Marks will not be deducted for incorrect answers.

(i) The Central Limit Theorem states that as a sample size becomes larger, the distribution of

a. The sample mean becomes more normal

b. The sample standard deviation becomes more normal

c. The population mean becomes more normal

d. None of the above.

(ii) In a simple regression, the R-squared value is computed to be 87.2%. From this alone we can say

a. The regression is an appropriate model for the data

b. The regression is not an appropriate model for the data

c. The residuals will be normally distributed.

d. None of the above.

(iii) The same two variables discussed in part (ii) (above) are used to compute a correlation coefficient. The absolute value of the correlation coefficient (to 3 decimal places) will be

a. 0.872

b. 0.934

c. 0.966

d. 0.760

(iv) A scale of very poor, poor, average, good, very good has been used to assess the teaching methods of a lecturer. A regression is to be run using final exam result as the response, and teaching rating as a dummy variable. How many columns will be needed for the dummy variable coding?

a. 1

b. 2

c. 3

d. 4

(v) Suppose Minitab gives a sample correlation between two variables as -0.955, with a p-value of 0.000. This means that

a. The null hypothesis, ρ = 0, would be rejected.

b. The null hypothesis, ρ = 0, would not be rejected.

c. The response is dependent on the explanatory variable.

d. 95.5% of the variation in the response is explained by the explanatory variable.


(vi) A researcher asks a group of university students to select their favourite method of assessment: open book examination, closed book examination, assignment, group work or other. In summarising this data, which would be the most appropriate graph to use?

a. Bar chart

b. Boxplot

c. Histogram

d. Scatterplot

(vii) In investigating the relationship between years of formal education and income, a researcher finds that the covariance is +101.34. Which of the following statements is true, based only on this figure?

a. There is a very strong positive relationship between the two variables

b. The correlation will be very close to 1.

c. Both years of formal education and income have very large variances.

d. A line of best fit for the data would have a positive slope.

(viii) A study is being performed based on in-home access to broadband internet within the ACT. The ACT has been divided into 32 regions, and a random selection of homes in each region is chosen for participation in the study. This sampling plan is best described as

a. Simple random sampling

b. Systematic sampling

c. Stratified random sampling

d. Cluster sampling


Use the information given below to answer questions (ix) to (xx).

It has long been claimed that if a system of flexible work hours is offered to staff, they will have reduced demand for resources. A local health department decides to test this theory among their 11 field workers. For 12 months, they record the distance driven by each field worker in the course of carrying out his or her duties. Then, they switch from a standard 5 day week to a flexible 4 day week system, and record the new mileage driven over the 12 months immediately following. The data collected (id number of staff member, number of miles driven when working a 5 day week, number of miles driven when working a 4 day week) are entered into Minitab. The data are presented in a boxplot below.

[Figure: Boxplot of 5 Day, 4 Day; vertical axis "Data" (0 to 9000), one box each for 5 Day and 4 Day.]

(ix) When the data are entered into Minitab, how many rows will be required?

a. 12

b. 3

c. 11

d. 10

(x) Based only on the boxplot, we can say that

a. The IQR is larger for the 5 day week than for the 4 day week

b. The Range is larger for the 4 day week than for the 5 day week

c. The mean of the 5 day week is higher.

d. There are more observations for the 5 day week than the 4 day week.

(xi) How many variables are present in the data set?

a. 2

b. 3

c. 4

d. 12


(xii) How many continuous variables are present in the data set?

a. 1

b. 2

c. 3

d. 11

A further variable is calculated: the difference between the mileages under the two schemes, that is (5 Day mileage) − (4 Day mileage). Basic Descriptive Statistics for all three variables are calculated and presented below; however, some values have been obscured by oil blots.

Descriptive Statistics: 5 Day, 4 Day, 5 Day - 4 Day

Variable          N   Mean   SE Mean   StDev   Sum      Sum of Squares
5 Day            11   4955   *oil1*    3161    *oil2*   369966249
4 Day            11   3973   *oil3*    2171    *oil4*   220755581
5 Day - 4 Day    11    982   *oil5*    1140    *oil6*    23593724

(xiii) The value which should be present at *oil1* is given by (to the nearest whole number)

a. 287

b. 953

c. 908356

d. 3161

(xiv) The value which should be present at *oil4* is given by (to the nearest whole number)

a. 43703

b. 23881

c. 14858

d. 20068689

(xv) Based on the above descriptive statistics, a 90% confidence interval for the average mileage driven using a 5 day week is calculated to be

$\left(4955 - c\,\dfrac{3161}{\sqrt{n}},\ 4955 + c\,\dfrac{3161}{\sqrt{n}}\right)$.

a. 1.645

b. 1.96

c. 1.812

d. 2.228

(xvi) In the above confidence interval, n=

a. 10

b. 11

c. 12

d. 5


The local health department wishes to test if there is a significant difference between the mileage under the two work day schemes.

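(xvii) The appropriate test statistic is given by

a. $\dfrac{4955 - 3973}{2711.56\,\sqrt{\frac{1}{11} + \frac{1}{11}}}$

b. $\dfrac{4955 - 3973}{\sqrt{\frac{3161}{11} + \frac{2171}{11}}}$

c. $\dfrac{982}{1140 / \sqrt{11}}$

d. $\dfrac{999921}{4713241}$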

(xviii) In testing if there is a significant difference between mileage under the two work day schemes, the test statistic should be compared to which tables?

a. T tables with 10 degrees of freedom

b. Standard normal tables

c. T tables with 20 degrees of freedom

d. F tables with 11, 11 degrees of freedom.

(xix) The p-value for the test discussed in parts (xvii) and (xviii) (using a two-sided alternative hypothesis) is 0.017. This means that

a. There is a 1.7% chance that the null hypothesis is true.

b. 1.7% of the time, the null hypothesis is true.

c. We would reject the null hypothesis.

d. We would reject the alternative hypothesis.

(xx) If the test were carried out against a one-sided alternative, which of the following statements about the new p-value would be true?

a. The p-value will be equal to 0.017.

b. The p-value will be equal to 0.0085.

c. The p-value will be equal to 0.034.

d. None of the above.


Question 2 (20 marks)

(a) The table below shows the results of a survey of voters, including who they voted for in the most recent federal election (in the House of Representatives) and their positions on the death penalty for convicted murderers.

                    For    Against
Liberal/National    0.26   0.04
ALP                 0.12   0.24
Other               0.24   0.10

i. Find the marginal probability distribution of voting in the most recent federal election.

ii. What is the probability that a randomly chosen Australian voter supports the death penalty for convicted murderers?

iii. What is the probability that an Australian voted for the Liberal/National candidate in the House of Representatives at the last election if it is known that they are against the death penalty for convicted murderers?

iv. Are voting choice and position on the death penalty independent events? Explain your answer.
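The calculations asked for in part (a) can be checked mechanically. The following is a minimal sketch, not a model answer: it assumes the six table entries are joint probabilities, and the dictionary layout and variable names are illustrative only.

```python
# Joint probabilities copied from the table in part (a).
joint = {
    "Liberal/National": {"For": 0.26, "Against": 0.04},
    "ALP": {"For": 0.12, "Against": 0.24},
    "Other": {"For": 0.24, "Against": 0.10},
}

# Marginal distribution of voting choice: sum across each row.
vote_marginal = {party: sum(row.values()) for party, row in joint.items()}

# Marginal distribution of death-penalty position: sum down each column.
position_marginal = {
    pos: sum(row[pos] for row in joint.values()) for pos in ("For", "Against")
}

# Conditional probability P(Liberal/National | Against).
p_ln_given_against = (
    joint["Liberal/National"]["Against"] / position_marginal["Against"]
)

# Independence requires every joint cell to equal the product of its marginals.
independent = all(
    abs(joint[party][pos] - vote_marginal[party] * position_marginal[pos]) < 1e-9
    for party in joint
    for pos in ("For", "Against")
)

print(vote_marginal)
print(position_marginal)
print(round(p_ln_given_against, 4), independent)
```

Comparing each cell with the product of its marginals is one way to frame the explanation asked for in (iv); a single cell that differs is enough to rule out independence.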

(b) A commuter must pass through five traffic lights on her way to work, and will have to stop at each one that is red. She estimates the probability model for the number of red lights she hits as shown below.

# red lights    0      1      2      3      4      5
Probability     0.06   0.25   0.34   0.15   0.16   0.04

i. Find the expected number of red lights at which the commuter will have to stop on her way to work.

ii. Find the standard deviation of the number of red lights.

iii. Find the expected number of red lights the commuter will face on her way to work over a 5 day working week. What is the standard deviation of the number of red lights faced over a 5 day working week?

iv. The local council installs a new set of lights on the commuter's route. The commuter wants to take a sample in order to estimate the new mean number of red lights she can expect to be stopped at. To estimate this mean to within half a red light (0.5), how many journeys should she sample, assuming the new standard deviation is equal to 2.5 red lights? That is, calculate the number of observations she should make, clearly stating any assumptions you make.
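The arithmetic behind part (b) can be sketched as follows. This is an illustration under stated assumptions rather than a marking scheme: the five daily journeys are assumed independent, and the sample-size step assumes 95% confidence (z = 1.96, in line with the paper's default 5% level) with the standard deviation treated as known.

```python
import math

# Probability model copied from the table in part (b).
pmf = {0: 0.06, 1: 0.25, 2: 0.34, 3: 0.15, 4: 0.16, 5: 0.04}

# Expected value and standard deviation of a discrete random variable.
mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
sd = math.sqrt(variance)

# Five-day week: means add; variances add only if the days are independent
# (an assumption, not stated in the question).
days = 5
week_mean = days * mean
week_sd = math.sqrt(days * variance)

# Sample size needed to estimate a mean to within a given margin.
# z = 1.96 (95% confidence) and a known sigma are assumptions.
z, sigma, margin = 1.96, 2.5, 0.5
n_required = math.ceil((z * sigma / margin) ** 2)

print(round(mean, 4), round(sd, 4))
print(round(week_mean, 4), round(week_sd, 4))
print(n_required)
```

The sample size is rounded up because rounding down would leave the margin of error slightly wider than requested.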


Question 3 (20 marks)

(a) Your pocket copy of "Kyrgyzstan on a Budget" claims that you can expect to spend about 4237 soms (the local unit of currency) each day you spend in this country, with a standard deviation of 360 soms. Assume that expenditure follows a normal distribution.

i. Your budget allows you to spend 90,000 soms during your stay (not including transport into and out of the country). What is the maximum number of whole days you can spend in Kyrgyzstan on average, without breaking your budget?

ii. What is the standard deviation of your total expenses for a stay of that duration?

iii. How much money should you budget for each day in order to cover all but the most expensive 10% of days?

iv. After a stay of 10 days, you find that you have spent 41,414 soms. What percentage of travellers with the same length of stay will have spent less than you (assuming the figures in "Kyrgyzstan on a Budget" are accurate)?

(b) Having completed your stay in Kyrgyzstan, you return home to Canberra, and decide to put your budgeting skills to the test. Your part-time job pays well, at $24 an hour, but the number of hours per week is a random variable best represented by a uniform distribution with possible values from 4 to 18. Assume each week is independent.

i. Draw a graph representing the distribution of the number of hours of work per week. Clearly label all axes and points of interest.

ii. Find the expected value of the hours of work per week.

iii. Find the variance of the hours of work per week.

iv. What is the probability you get no more than 6 hours work next week?

v. Your budget requires your job to bring in a minimum of $112 per week to meet minimum expenses, or you will be forced to ask your parents for money. What is the probability that you ask your parents for money next week?

vi. What is the probability that you get more than 600 hours work over the coming year (52 weeks)?
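The quantities in part (b) can be sketched numerically as below. This is a rough check under assumptions that are not in the question: weekly hours are treated as a continuous uniform variable on [4, 18], and the 52-week total is approximated by a normal distribution via the Central Limit Theorem. The helper `uniform_cdf` is a hypothetical name.

```python
import math
from statistics import NormalDist

# Weekly hours, treated as continuous uniform on [a, b] (an assumption).
a, b = 4.0, 18.0
rate = 24.0  # dollars per hour, from the question

# Mean and variance of a continuous uniform distribution.
mean_hours = (a + b) / 2
var_hours = (b - a) ** 2 / 12


def uniform_cdf(x):
    """P(X <= x) for the continuous uniform distribution on [a, b]."""
    return min(max((x - a) / (b - a), 0.0), 1.0)


# Part iv: no more than 6 hours of work.
p_at_most_6 = uniform_cdf(6)

# Part v: earning less than $112 means working fewer than 112 / 24 hours.
p_ask_parents = uniform_cdf(112 / rate)

# Part vi: the total over 52 independent weeks, approximated as normal
# (Central Limit Theorem; the normal approximation is an assumption).
weeks = 52
total = NormalDist(mu=weeks * mean_hours, sigma=math.sqrt(weeks * var_hours))
p_more_than_600 = 1 - total.cdf(600)

print(mean_hours, round(var_hours, 4))
print(round(p_at_most_6, 4), round(p_ask_parents, 4), round(p_more_than_600, 4))
```

`statistics.NormalDist` is in the Python standard library from version 3.8, so the sketch needs no third-party packages.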


Question 4 (20 marks)

(a) The Yummy biscuit company claims that every 500g package of their chocolate chip cookies contains an average of at least 1000 chocolate chips. Being a dedicated student of statistics, you determine to test their claim, and taking a random sample of 16 packages, you find an average of 1238.2 chocolate chips with a standard deviation of 94.3.

i. Perform a hypothesis test at the 5% level to determine if the company's claim is supported by your data.

ii. Comment on any assumptions you have made in performing the inference in part (i).

(b) The Scrumptious biscuit company claims that their 500g packages of chocolate chip cookies are tastier, and contain a different number of chocolate chips on average than those produced by the Yummy company. You take a random sample of 16 Scrumptious packages and find a sample average of 1382.2 with a standard deviation of 123.1.

i. Test at the 10% level whether the data support the assumption of equal population variances.

ii. Perform an appropriate test of equality of means, using a significance level of 10% and clearly stating any assumptions you make.

iii. Given the results of your test in (ii), answer the following question without performing any further calculations: Would the value 0 be found within a 90% confidence interval for the true mean difference in number of chocolate chips between Yummy and Scrumptious packages? Explain your answer.


Question 5 (20 marks)

success, five friends embark on a weight-loss journey together, using a combination of healthy diet and exercise in order to reach their goals. They record their weight loss every week over 20 weeks. Some basic descriptive statistics of both variables appear below.

Descriptive Statistics: Weeks, Weight lost

Variable        N     Mean    StDev   Minimum   Maximum
Weeks         100   10.500    5.795     1.000    20.000
Weight lost   100    7.290    4.242    -1.136    15.414

[Figure: Scatterplot of Weight lost vs Weeks; vertical axis "Weight lost" (0.0 to 17.5), horizontal axis "Weeks" (0 to 20).]

A regression is performed in Minitab on the data, and an excerpt of the output is given below. However, some of the output has been obscured by sweat stains from the exercise program.

Regression Analysis: Weight lost versus Weeks

Weight lost = SWEAT STAIN 1

Predictor      Coef   SE Coef       T       P
Constant    -0.0742    0.2539   -0.29   0.771
Weeks       0.70133   0.02119   33.09   0.000

S = SWEAT STAIN 2   R-Sq = 91.8%   R-Sq(adj) = 91.7%

Analysis of Variance

Source           DF       SS       MS         F       P
Regression        1   1635.5   1635.5   1095.03   0.000
Residual Error   98    146.4      1.5
Total            99   1781.8

Unusual Observations

Obs   Weeks   Weight lost      Fit   SE Fit   Residual   St Resid
 42     2.0         4.586    1.328    0.218      3.258      2.71R
 47     7.0         7.787    4.835    0.143      2.952      2.43R
 79    19.0        10.640   13.251    0.218     -2.611     -2.17R
 97    17.0        14.282   11.848    0.184      2.433      2.01R

R denotes an observation with a large standardized residual.

(a) Describe the scatterplot. Would you expect the covariance between weeks and weight loss to be positive or negative? Give a reason.

(b) Give the equation of the fitted model (SWEAT STAIN 1).

(c) Find the standard error of the estimate (SWEAT STAIN 2). Give an interpretation of what this value means.

(d) The friends wish to test the value of the intercept; in particular, they wish to know if the average weight loss at 0 weeks is 0kg. Use the output (without performing any calculations) to comment on this.

(e) It is often claimed that weight loss of over 0.5kg per week is unsustainable. Test if the average weight loss per week by this group of friends is likely to be unsustainable based on this criterion.

(f) Comment on the unusual observations flagged in the Minitab output. Are they a cause for concern about the model?


(g) The friends wish to find a 95% confidence interval for the average weight loss of all people using the same combination of diet and exercise in week 15. Use the output above, and calculations you have made in earlier parts of this question, to find this interval.

Hint: You may find the following formulae useful:

$(1 - \alpha)\%$ Confidence Interval for $y$ given that $x = x_g$:

$\hat{y} \pm t_{\alpha/2,\,n-2}\; s\,\sqrt{\dfrac{1}{n} + \dfrac{(x_g - \bar{x})^2}{(n-1)\,s_x^2}}$

$\hat{y} \pm t_{\alpha/2,\,n-2}\; s\,\sqrt{1 + \dfrac{1}{n} + \dfrac{(x_g - \bar{x})^2}{(n-1)\,s_x^2}}$
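The two hint formulae differ only in the extra 1 under the square root. The sketch below shows how the half-width could be computed; it is an illustration, not the examiner's method. The function name `interval_half_width` and its parameters are hypothetical, the critical value must be looked up in t tables and passed in, and the squared term (x_g − x̄)² is assumed from the standard form of the formula. As a sanity check, setting the critical value to 1 should reproduce the SE Fit column of the Minitab output.

```python
import math


def interval_half_width(t_crit, s, n, x_g, x_bar, s_x, prediction=False):
    """Half-width of the interval around the fitted value at x = x_g.

    t_crit     -- critical value t_{alpha/2, n-2}, taken from tables
    s          -- standard error of the estimate
    n          -- number of observations
    x_bar, s_x -- sample mean and standard deviation of x
    prediction -- if True, include the extra "1 +" term under the square root
    """
    radicand = 1 / n + (x_g - x_bar) ** 2 / ((n - 1) * s_x ** 2)
    if prediction:
        radicand += 1
    return t_crit * s * math.sqrt(radicand)


# Values read from the Minitab output for Weeks.
n, x_bar, s_x = 100, 10.5, 5.795
s = math.sqrt(146.4 / 98)  # sqrt(Residual Error SS / Residual Error DF)

# With t_crit = 1 the half-width equals the standard error of the fit,
# which can be compared with SE Fit for observation 42 (Weeks = 2.0).
se_fit_week2 = interval_half_width(1.0, s, n, 2.0, x_bar, s_x)
print(round(se_fit_week2, 3))
```

Passing the critical value in as an argument keeps the sketch free of third-party libraries, since the Python standard library has no t distribution.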

_____________________________________________________________________

END OF EXAMINATION

