Endterm - 1

Page 1 of 14
This part is for the use of the GRADER ONLY

Question #       1   2    3            4   5                 6         7          TOTAL
Full Marks       6   14   2+5+4+4=15   4   4+2+4+(3+7)=20    8+3+2+2   2+2+10+2   90
Marks Obtained

Indian Institute of Management Bangalore
Introduction to Statistical Methods
Final Examination
Time: 3 Hrs. Name:___________________
Max Points: 90 Roll No.:_________________

Do not seek any clarifications. Answer all questions in the space provided (if necessary, use
the left/back pages with question numbers). Do not attach any additional sheets.

1. A sample of 2520 units is to be selected from a population that is divided into four divisions,
having respectively 4000, 2000, 6000 and 8000 units. A stratified random sampling scheme
is to be adopted using the divisions as the strata.
[3×2=6]
a) How many units should be selected in the sample from the four divisions, if a
proportional stratified sampling scheme is adopted?

Sample size from
Division 1 Division 2 Division 3 Division 4

b) Past survey records show that the variances of measurements in these
divisions are respectively 500, 400, 500 and 400. How many units should be
selected in the sample from the four divisions, if the objective is to minimize the
variance of the stratified mean?

Sample size from

c) Sampling from Divisions 2 and 3 is the cheapest and it costs Rs. 1 per unit, while
sampling from Division 1 costs Rs. 4 per unit, and sampling from Division 4 is 4
times as costly as from Division 1. What should be the sample sizes from the four
divisions if one works most efficiently with a total budget of Rs 4635? Please note
that it will not be possible to select 2520 units within the budget constraint.

Sample size from
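
The three allocation rules asked for in Question 1 can be sketched as follows. This is only an illustration under stated assumptions: the quoted variances are treated as the stratum variances S_h², part b) is read as Neyman allocation (n_h proportional to N_h·S_h), and part c) as cost-optimal allocation (n_h proportional to N_h·S_h/√c_h) scaled to spend the whole budget. The helper name `allocate` is hypothetical.

```python
from math import sqrt

# Data from the question: stratum sizes and past variances.
N = [4000, 2000, 6000, 8000]
variances = [500, 400, 500, 400]
n_total = 2520

def allocate(weights, total):
    """Split `total` across strata in proportion to `weights`."""
    s = sum(weights)
    return [total * w / s for w in weights]

# a) Proportional allocation: n_h proportional to N_h.
proportional = allocate(N, n_total)

# b) Neyman allocation (assumed reading): n_h proportional to N_h * S_h.
NS = [Nh * sqrt(v) for Nh, v in zip(N, variances)]
neyman = allocate(NS, n_total)

# c) Cost-optimal allocation (assumed reading): n_h proportional to
#    N_h * S_h / sqrt(c_h), scaled so that sum(c_h * n_h) equals the budget.
cost = [4, 1, 1, 16]   # Rs. per unit; Division 4 costs 4 times Division 1
budget = 4635
w = [x / sqrt(c) for x, c in zip(NS, cost)]
k = budget / sum(c * wi for c, wi in zip(cost, w))
cost_optimal = [k * wi for wi in w]

print([round(x) for x in proportional])
print([round(x) for x in neyman])
print([round(x, 1) for x in cost_optimal])
```

The cost-optimal sizes are left unrounded on purpose: rounding them to whole units has to be done so that the total cost stays within Rs. 4635.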

2. Choose the correct alternative in each of the following. If you tick the choice "none of the
above", then specify the right answer.
[7×2=14]

In a hypothesis test for a proportion (p), the plot of Prob.[Type II error] vs. p for a test procedure
looks like the one below. Choose the right alternatives for questions (i) to (v) on the basis of this.

[Plot not reproduced. Horizontal axis: p, marked at 0, 0.3, 0.5, 0.6 and 1. Vertical axis: Prob.[Type II error], marked at 0.4 and 0.9.]

(i) The null hypothesis here is

a) p = 0 b) p = 0.3 c) p = 0.5
d) p = 0.6 e) none of the above

(ii) The level of significance at which the test is constructed is

a) 0.05 b) 0.1 c) 0.4
d) 0.9 e) cannot be concluded

(iii) The test is

a) left-tailed b) right-tailed c) two-tailed
d) cannot be concluded from the given information

(iv) The power of the test (when the true proportion is 0.5) is

a) 0.4 b) 0.5 c) 0.6
d) 0.9 e) none of the above

(v) The probability of Type II error of the test, when p=1, is

a) 0 b)0.5 c)1
d) none of the above

(vi) In each of the following choices, two statistical methods have been
mentioned. Mark the odd pair.

a) Chi-square test for independence and Chi-square test for variance
b) Chi-square test for goodness of fit and Kolmogorov-Smirnov test
c) Sign test and Paired t-test
d) Pearson's correlation and Spearman's correlation coefficients
e) F-test for ANOVA and Kruskal Wallis test

(vii) F-test for one-way ANOVA is always one-tailed because

a) F-table provides only cut-off points for the right tail.
b) After all, it is ONE-WAY ANOVA.
c) Small values of the test statistic provide evidence in favour of the
null hypothesis.
d) There is only ONE unbiased estimator of σ² when the alternative is
true.
e) As per the alternative, the true means of the later populations are
always larger than the true mean of the first population.

3. Soham enrolled in a Business Management program. In his STATS course, his
instructor announced that the final grade would be decided based on QUIZ,
PROJECT, FINAL and CP scores. The instructor is known to disclose the CP score
only when the grades of all courses are announced, which is a good
three months after the FINAL and PROJECT scores are declared to the students.
Once Soham came to know that he got 15 in QUIZ, 8 in PROJECT and 65.5 in the
FINAL, he became desperate to predict his own CP score, possibly based on his scores
in QUIZ, PROJECT and FINAL. With this in mind, Soham set about collecting the
QUIZ, PROJECT, FINAL and CP scores of students from the previous batch.
He could collect data from sixty-eight such students and performed multiple
regression in SPSS using the ENTER method, with CP as the dependent variable and all
combinations of QUIZ, PROJECT and FINAL as independent variables, along with
the constant. Soham did this because he intuitively felt that the constant should be
part of the regression model anyway, but he was not sure which of the three variables he should
include in the list of independent variables. Parts of the SPSS output are shown on
the next page.
[Page 4 of the original paper carried the SPSS output tables; they are not reproduced in this copy.]
A) Using the stepwise regression method, list (sequentially) which variables should be entered
or removed if the criteria Probability-of-F-to-enter <= 0.08 and Probability-of-F-to-delete >= 0.10
are used. (Use only as many steps as appropriate.) The last table from the SPSS stepwise
regression is included at the bottom of the previous page.
[2]

Step Variables entered Variables Removed
1.
2.
3.
4.
5.
6.

B) For the final model, as selected in the last stage in the ABOVE stepwise regression, fill in the
different missing entries in the following ANOVA table.
[5]

Degrees of freedom Sum of Squares Mean Square F Sig.
Regression
Residual
Total

C) Finally, Soham got so confused that he decided to use only the FINAL score (along with the
constant) to predict his CP score. Find an appropriate interval which should contain Soham's
CP score with 90% probability.
[4]

D) Soham had initially thought that for an increase of 29 points in the FINAL score, the (average) CP
score is likely to increase by at most 2 points. Setting this notion as an appropriate null
hypothesis, conduct a test at the 1% level of significance to see if there is any significant
evidence against it. (Use the same model as in part C.)
[4]

4. You are responsible for a single division. The odds are about even that you will get by from
week to week without any serious mistake or problem in the division, which is pretty good
performance in the real world. Now, you are being promoted to look after five divisions, each
running just as efficiently as in the first. What are the odds now that you will have no serious
mistake or problem in your work in a given week? [Hint: An event having odds m:n is
equivalent to the probability of the event being m/(m+n).]
[4]
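
The hint in Question 4 converts odds to a probability; a minimal sketch of the arithmetic follows. It assumes, as the question does not state explicitly, that the five divisions run independently of one another. The function name `odds_to_prob` is hypothetical.

```python
from fractions import Fraction

def odds_to_prob(m, n):
    """Odds m:n in favour of an event correspond to probability m/(m+n)."""
    return Fraction(m, m + n)

p_one = odds_to_prob(1, 1)        # "about even" odds for a single division
p_all = p_one ** 5                # ASSUMPTION: the five divisions are independent
odds_for = p_all / (1 - p_all)    # odds in favour of a trouble-free week

print(p_all, odds_for)
```

Exact fractions are used so that the resulting odds can be read off directly as a ratio.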

5. A new telescope manufacturer wants to be very certain that the variance in resolution [while
focussing on objects about 1000 lyr (lyr = light-year) away] of its products is less than 9 lyr²
before introducing them into the market. Accordingly, Rebati, the manager, set the null
hypothesis as the performance being worse and hoped to prove that wrong on the basis of
sample information. A few of the newly manufactured telescopes were randomly
picked and used to focus on objects 1000 lyr away. The resolutions of a total of 25 such
attempts were studied, and the mean and the standard deviation were found to be
30.3 and 2.156 respectively.

a) What can you say about the p-value of the appropriate test here?
[4]

b) Should Rebati decide to introduce the product in the market if she can tolerate only 1%
type I error?
[2]

c) For the decision criterion in part b) [i.e., α = 0.01], what is the probability that Rebati
would fail to take the right decision of introducing the product when in fact the true
variance in resolution is only 3.31 lyr²?
[4]
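
Parts a) to c) of Question 5 can be explored numerically with the sketch below. It assumes the hypotheses are H0: σ² ≥ 9 against H1: σ² < 9 (a left-tailed chi-square test on (n−1)s²/9 with 24 degrees of freedom) and that the measurements are normally distributed. The helpers `chi2_sf_even` and `chi2_lower_quantile` are hypothetical names; the first uses the closed-form Poisson sum that holds only for an even number of degrees of freedom.

```python
from math import exp

def chi2_sf_even(x, df):
    """P(chi-square_df > x) for EVEN df, via the Poisson-sum identity."""
    half, term, total = x / 2.0, 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return exp(-half) * total

def chi2_lower_quantile(prob, df, lo=0.0, hi=200.0):
    """Bisection for c such that P(chi-square_df <= c) = prob."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if 1 - chi2_sf_even(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n, s, sigma0_sq, alpha = 25, 2.156, 9.0, 0.01
df = n - 1

# a) Test statistic and left-tail p-value.
stat = df * s**2 / sigma0_sq
p_value = 1 - chi2_sf_even(stat, df)

# b) H0 is rejected (product introduced) only if the p-value is below alpha.
reject = p_value < alpha

# c) Type II error when the true variance is 3.31: the chance that the
#    statistic still lands above the 1% critical value.
crit = chi2_lower_quantile(alpha, df)
beta = chi2_sf_even(crit * sigma0_sq / 3.31, df)

print(round(stat, 3), round(p_value, 4), reject, round(crit, 3), round(beta, 3))
```

In an examination setting the same comparison would normally be made against chi-square table entries for 24 degrees of freedom rather than computed exactly.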

d) Rebati finally consulted her friend, Harry, a statistician, because she was not too sure of
her STATS. Harry looked at the resolution figures and immediately commented that it was a
bit disappointing that the measurements didn't specify the corresponding telescope,
especially because five of these instruments were repeatedly used to get the sample data.
Rebati immediately replied, "No problem, I do have that information in my computer. I
just didn't know if that information would be of any use." Harry commented, "It may or
may not be very important in the final analysis, but I think you should first see if there is
any significant difference among the different telescopes that you selected." After
retrieving the required information, the means and standard deviations were quickly
computed:

                                    Telescope 1  Telescope 2  Telescope 3  Telescope 4  Telescope 5
No. of times used in the sample     5            4            5            4
Mean                                29.1         30.3         31.6         28.9
Standard deviation                  2.102        2.008        2.101        2.017

i) Rebati teased Harry: "I have not given you the sample information from Telescope 5.
Being a good statistician, you should be able to deduce it." What are these missing
values?
[0.5+1+1.5=3]

No. of times Telescope 5 was used in the sample =
Mean resolution of those measurements =
Standard deviation of those measurements =
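
One way to recover the Telescope 5 figures is to convert each (n, mean, s.d.) triple into the sum and sum of squares it implies, and subtract the four known telescopes from the pooled sample of 25. The sketch below assumes all standard deviations are sample standard deviations (divisor n−1) and that the quoted figures are rounded, so the results are approximate. The helper name `raw_sums` is hypothetical.

```python
from math import sqrt

# Pooled sample (Question 5) and the four reported telescopes.
n_all, mean_all, sd_all = 25, 30.3, 2.156
ns = [5, 4, 5, 4]
means = [29.1, 30.3, 31.6, 28.9]
sds = [2.102, 2.008, 2.101, 2.017]

def raw_sums(n, mean, sd):
    """Return (sum of x, sum of x^2) implied by n, mean and sample s.d."""
    return n * mean, (n - 1) * sd**2 + n * mean**2

total_sum, total_sq = raw_sums(n_all, mean_all, sd_all)
known = [raw_sums(n, m, s) for n, m, s in zip(ns, means, sds)]

# Whatever is left over belongs to Telescope 5.
n5 = n_all - sum(ns)
sum5 = total_sum - sum(k[0] for k in known)
sq5 = total_sq - sum(k[1] for k in known)
mean5 = sum5 / n5
sd5 = sqrt((sq5 - n5 * mean5**2) / (n5 - 1))

print(n5, round(mean5, 2), round(sd5, 2))
```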

ii) Harry looked at the standard deviation values and felt very comfortable in
assuming that the true s.d.'s of resolutions from the five telescopes are the same.
However, in order to justify pooling all measurements from the different
telescopes (as Rebati did), he needed to infer whether the true mean resolutions
from the five different telescopes are likely to be the same or not. Test the
appropriate hypothesis at the 5% level of significance. Show all necessary
calculations. [If you haven't been able to answer part i) and need some/all of
these values to answer this part, you may assume those answers to be 7, 31 and 2
respectively. Caution: these are not necessarily the correct values.]
[7]
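
A one-way ANOVA can be built from summary statistics alone, as sketched below. For Telescope 5 the sketch uses the fallback values 7, 31 and 2 that the question itself permits, not derived values. The function name `anova_from_summary` is hypothetical, and `F_CRIT` is the tabulated upper 5% point of F(4, 20), taken from standard tables rather than computed.

```python
def anova_from_summary(ns, means, sds):
    """One-way ANOVA from group sizes, means and sample s.d.'s."""
    n, k = sum(ns), len(ns)
    grand = sum(ni * mi for ni, mi in zip(ns, means)) / n
    ssb = sum(ni * (mi - grand) ** 2 for ni, mi in zip(ns, means))  # between groups
    ssw = sum((ni - 1) * si ** 2 for ni, si in zip(ns, sds))        # within groups
    df_b, df_w = k - 1, n - k
    f = (ssb / df_b) / (ssw / df_w)
    return ssb, ssw, df_b, df_w, f

# Telescope 5 uses the fallback values (7, 31, 2) allowed by the question.
ns = [5, 4, 5, 4, 7]
means = [29.1, 30.3, 31.6, 28.9, 31.0]
sds = [2.102, 2.008, 2.101, 2.017, 2.0]

ssb, ssw, df_b, df_w, f = anova_from_summary(ns, means, sds)
F_CRIT = 2.87  # tabulated upper 5% point of F(4, 20)

print(round(ssb, 3), round(ssw, 3), df_b, df_w, round(f, 3), f > F_CRIT)
```

Equality of means is rejected at the 5% level only when the computed F exceeds `F_CRIT`.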

6. A study is undertaken to investigate monthly income of daily wage-earners in Karnataka. A
sample of 1000 wage-earners from the state has been taken and information about their last
monthly wage is sought. The frequency distribution looks like this:

Lower limit Upper-limit Frequency
0 500 45
500 1000 90
1000 1500 100
1500 2000 125
2000 2500 145
2500 3000 180
3000 3500 135
3500 4000 85
4500 5000 65
5000 6000 20
6000 10000 10

sample mean 2500
sample s.d. 1331.3142

a) Before starting any further analysis, the investigative agency wants to verify whether
it would be reasonable to assume that the monthly wage of daily wage earners in the
state has a normal distribution. Use an appropriate non-parametric test at 10% level
of significance to verify this.
[8]
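
One possible reading of part a) is a Kolmogorov-Smirnov comparison of the observed cumulative proportions with a normal distribution fitted by the sample mean and s.d. The sketch below makes several assumptions that the question does not confirm: the statistic is evaluated only at the listed upper class limits, the parameters are plugged in from the sample, and the large-sample 10% critical value 1.22/√n is used. A chi-square goodness-of-fit test would be another defensible choice.

```python
from math import sqrt
from statistics import NormalDist

# Upper class limits and frequencies exactly as listed in the table.
upper = [500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 5000, 6000, 10000]
freq = [45, 90, 100, 125, 145, 180, 135, 85, 65, 20, 10]
n = sum(freq)

# Normal distribution with the reported sample mean and s.d. (plug-in assumption).
fitted = NormalDist(mu=2500, sigma=1331.3142)

cum = 0
d_max = 0.0
for u, f in zip(upper, freq):
    cum += f
    # Gap between observed and fitted cumulative proportions at this limit.
    d_max = max(d_max, abs(cum / n - fitted.cdf(u)))

crit = 1.22 / sqrt(n)   # asymptotic 10% critical value (assumed table constant)
reject_normality = d_max > crit

print(round(d_max, 4), round(crit, 4), reject_normality)
```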

b) Find an interval that contains the average monthly income of ALL daily wage earners
in the state of Karnataka with 95% probability.
[3]
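
With n = 1000 the usual large-sample interval for a mean applies; a minimal sketch follows, assuming the normal approximation (z rather than t) is acceptable at this sample size.

```python
from math import sqrt
from statistics import NormalDist

n, mean, sd = 1000, 2500.0, 1331.3142

z = NormalDist().inv_cdf(0.975)      # two-sided 95% critical value
margin = z * sd / sqrt(n)            # z times the standard error of the mean
ci = (mean - margin, mean + margin)

print(round(ci[0], 1), round(ci[1], 1))
```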

c) Is your conclusion in part a) relevant or useful for your answer in part b)?
[2]

d) In what sense does your interval in part b) contain the true average monthly income
of ALL daily wage earners of the state with 95% probability?
[2]

7. Kiran Creations, the well-known advertising agency, has been hired by NS Petro Chem
Ltd. to launch a new advertising campaign for their new product, All Hair Shampoo. Kiran, the
Account Manager of Kiran Creations, has designed two different campaigns: one is "Jazzy" and
the other is called "Spooky". Kiran has classified the retention levels as LOW (if the average
retention is 5 days) and HIGH (if the average retention is 10 days). He has assigned a probability
of 0.6 for LOW and a probability of 0.4 for HIGH. The effectiveness of the campaign depends on
the retention level of the viewers. The estimated annualized revenue (in Rs. Million) for each of
the campaigns at the two different retention levels is given in the table below:

Campaign/Retention LOW HIGH
Jazzy 10 100
Spooky -40 220

a) Which campaign should NS Petro Chem go for?
[2]

b) What is the maximum worth of any kind of information for NS Petro Chem?
[2]
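
Parts a) and b) reduce to expected monetary values under the stated prior. The sketch below assumes an expected-value (risk-neutral) decision criterion, which the question does not spell out.

```python
# Prior probabilities and payoffs (Rs. Million) from the question.
prior = {"LOW": 0.6, "HIGH": 0.4}
payoff = {
    "Jazzy": {"LOW": 10, "HIGH": 100},
    "Spooky": {"LOW": -40, "HIGH": 220},
}

# a) Expected monetary value of each campaign under the prior.
emv = {c: sum(prior[s] * v for s, v in row.items()) for c, row in payoff.items()}
best = max(emv, key=emv.get)

# b) Expected value of perfect information: pick the best campaign
#    in each state, then subtract the best prior EMV.
ev_perfect = sum(prior[s] * max(payoff[c][s] for c in payoff) for s in prior)
evpi = ev_perfect - emv[best]

print(emv, best, evpi)
```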

Swetha, MD of NS Petro Chem, wanted a quick market survey to be carried out before deciding
whether to go for the "Jazzy" or the "Spooky" campaign. It is assumed that the retention time follows
an exponential distribution. She has proposed that if the retention, based on the market survey, is
less than or equal to 3 days, the situation can be categorized as SHORT; if it is greater than 3 days
but less than or equal to 8 days, the situation can be categorized as NORMAL; and if it is more
than 8 days, the situation can be categorized as LONG. She has also decided a priori that if the
situation turns out to be SHORT or NORMAL, the company should opt for the "Jazzy" campaign,
and that the company should go for "Spooky" only if the situation turns out to be LONG.

c) What is the maximum amount that the company is willing to pay for the market survey?
(In other words, what is the Expected Value of Sample Information (EVSI)?) Draw the
decision tree and label all the branches properly. Show all the numbers such as
probabilities, payoffs, expected values etc. on the tree.
[8]

Show all the calculations with respect to probabilities below:

d) Is the a priori decision made by Swetha with regard to the type of campaign appropriate?
Explain why.
[4]
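
The probabilities for the decision tree in parts c) and d) can be sketched as below. The key assumption, which is an interpretation and not stated in the question, is that the survey yields a single retention time drawn from an exponential distribution whose mean is 5 days under LOW and 10 days under HIGH. The names `signal_probs` and `value_of_rule` are hypothetical.

```python
from math import exp

prior = {"LOW": 0.6, "HIGH": 0.4}
mean_retention = {"LOW": 5.0, "HIGH": 10.0}   # ASSUMED exponential means (days)
payoff = {
    "Jazzy": {"LOW": 10, "HIGH": 100},
    "Spooky": {"LOW": -40, "HIGH": 220},
}

def signal_probs(mean):
    """P(SHORT), P(NORMAL), P(LONG) for one exponential observation."""
    p_le3 = 1 - exp(-3 / mean)
    p_le8 = 1 - exp(-8 / mean)
    return {"SHORT": p_le3, "NORMAL": p_le8 - p_le3, "LONG": 1 - p_le8}

like = {s: signal_probs(m) for s, m in mean_retention.items()}

def value_of_rule(rule):
    """Expected payoff when campaign rule[signal] is run after each signal."""
    return sum(prior[s] * like[s][sig] * payoff[rule[sig]][s]
               for s in prior for sig in rule)

# Swetha's pre-committed rule versus the best response to each signal.
swetha = {"SHORT": "Jazzy", "NORMAL": "Jazzy", "LONG": "Spooky"}
best = {
    sig: max(payoff, key=lambda c: sum(prior[s] * like[s][sig] * payoff[c][s]
                                       for s in prior))
    for sig in swetha
}

emv_no_survey = max(sum(prior[s] * payoff[c][s] for s in prior) for c in payoff)
gain_swetha = value_of_rule(swetha) - emv_no_survey   # value added by her rule
evsi = value_of_rule(best) - emv_no_survey            # value with optimal responses

print(best, round(gain_swetha, 2), round(evsi, 2))
```

Comparing `gain_swetha` with `evsi` shows whether the pre-committed rule matches the best response to each survey outcome under these assumptions.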
