
Examiners Report

Mark Scheme

Final Mark Scheme

2616/01

June 2004

General Instructions
Some marks in the mark scheme are explicitly designated as M, A, B or E.
M marks (method) are for an attempt to use a correct method (not merely for stating the
method).
A marks (accuracy) are for accurate answers and can only be earned if corresponding M
mark(s) have been earned. Candidates are expected to give answers to a sensible level of
accuracy in the context of the problem in hand. The level of accuracy quoted in the mark
scheme will sometimes deliberately be greater than is required, when this facilitates marking.
E marks (explanation) are for explanation and/or interpretation. These will frequently be
subdividable depending on the thoroughness of the candidate's answer.
Follow-through marking should normally be used wherever possible; there will,
however, be an occasional designation of "c.a.o." for "correct answer only".
Full credit MUST be given when correct alternative methods of solution are used. If errors
occur in such methods, the marks awarded should correspond as nearly as possible to
equivalent work using the method in the mark scheme.
All queries about the marking should have been resolved at the standardising meeting.
Assistant Examiners should telephone the Principal Examiner (or Team Leader if
appropriate) if further queries arise during the marking.
Assistant Examiners may find it helpful to use shorthand symbols as follows:

FT    Follow-through marking
      Correct work after error
      Incorrect work after error
      Condonation of a minor slip
BOD   Benefit of doubt
NOS   Not on scheme (to be used sparingly)
      Work of no value

Q1

$X \sim N(\mu, \sigma^2)$, $Y \sim N(2\mu, 4\sigma^2)$; $T = aX + bY$

(i)
We want $\mu = E[aX + bY]$   M1
$= a\mu + 2b\mu$   1
$\therefore 2b = 1 - a$, i.e. $b = \tfrac{1}{2}(1 - a)$   1 Beware printed answer

$\mathrm{Var}(T) = a^2\sigma^2 + \left[\tfrac{1}{2}(1 - a)\right]^2 (4\sigma^2)$   M1 Substitution of $b = \tfrac{1}{2}(1 - a)$ reqd
$= \sigma^2\{a^2 + (1 - a)^2\}$
$= \sigma^2\{2a^2 - 2a + 1\}$

(ii)
Consider $\frac{d}{da}(2a^2 - 2a + 1) = 0$   M1
i.e. $0 = 4a - 2$, so $a = \tfrac{1}{2}$   1 Beware printed answer
Verification that this is a minimum (e.g. trivially by $\frac{d^2}{da^2}$)

$T = \tfrac{1}{2}X + \tfrac{1}{4}Y \sim N(\mu, \tfrac{1}{2}\sigma^2)$   2 if all three items are correct; award 1 if any two are correct

[Both $X$ and $\tfrac{1}{2}Y$ are unbiased for $\mu$ and both are Normally distributed, all of which
is also true for $T$; but] $T$ has smaller variance ($\mathrm{Var}(X) = \sigma^2$, $\mathrm{Var}(\tfrac{1}{2}Y) = \sigma^2$)   E2

(iii)
$t = 7.48$   B1 FT if wrong
One-sided CI is given by

$7.48 - 1.645 \times \sqrt{\dfrac{\tfrac{1}{2} \times 3}{10}}$   M1 M1 B1 M1
M1 (use of $\tfrac{1}{2}\sigma^2$ as Var(T))

$= 7.48 - 0.63(71)$
$= 6.84(29)$   A1 C.A.O.
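The one-sided bound in part (iii) can be checked numerically. The following is a minimal sketch, assuming the values read off the working above ($\sigma^2 = 3$, 10 observations of $T$, estimate 7.48); the function name `lower_confidence_bound` is illustrative and not part of the mark scheme.

```python
import math

def lower_confidence_bound(estimate, variance, n, z):
    """One-sided lower bound: estimate - z * sqrt(variance / n)."""
    return estimate - z * math.sqrt(variance / n)

sigma_sq = 3.0           # known sigma^2, as read from the scheme's working (assumption)
var_T = 0.5 * sigma_sq   # Var(T) = (1/2) * sigma^2 for a single T
bound = lower_confidence_bound(7.48, var_T, n=10, z=1.645)
print(round(bound, 4))
```

Because $\sigma^2$ is known, the Normal point 1.645 is used rather than a t value.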

Q2

A   237  249  213  233  227  236
B   203  222  214  216  230

(i)
Wilcoxon rank sum test (or Mann-Whitney form thereof).

Ranks are
A   10  11   2   8   6   9
B    1   5   3   4   7
M1 for attempt
A1 if all correct

Rank sum is 20 (from B, otherwise the tables can't be used)
(Mann-Whitney is 5)   1
Refer to tables of Wilcoxon rank sum (or Mann-Whitney) statistics.
Lower 2½% tail is needed.   1
Value for (5, 6) is 18 (or 3 for Mann-Whitney).
Result is not significant.   1
Seems medians are the same.   1
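The rank sum and its Mann-Whitney form can be reproduced directly from the two samples. This is a small sketch assuming no tied values (true for these data); `rank_sum` is a hypothetical helper name.

```python
def rank_sum(sample, other):
    """Wilcoxon rank sum of `sample` within the pooled data (assumes no ties)."""
    pooled = sorted(sample + other)
    return sum(pooled.index(v) + 1 for v in sample)

a = [237, 249, 213, 233, 227, 236]
b = [203, 222, 214, 216, 230]

w_b = rank_sum(b, a)                      # rank sum for the smaller sample, B
u_b = w_b - len(b) * (len(b) + 1) // 2    # Mann-Whitney form of the same statistic
print(w_b, u_b)
```

The statistic is taken from the smaller sample (B) because that is how the tables are indexed.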
(ii)
Normality of both underlying populations/distributions.

$n_1 = 6$, $\bar{x} = 232.5$, $s_{n-1}^2 = 143.1$ ($s_{n-1} = 11.9624$) [$s_n^2 = 119.25$, $s_n = 10.9202$]
$n_2 = 5$, $\bar{y} = 217.0$, $s_{n-1}^2 = 100.0$ ($s_{n-1} = 10.0$) [$s_n^2 = 80.0$, $s_n = 8.9443$]

Pooled $s^2 = \dfrac{5 \times 143.1 + 4 \times 100.0}{9} = 123.94$ ($s = 11.1330$)
M1 for any reasonable attempt at pooling (and FT into test)
A1 if correct

Test statistic is

$\dfrac{232.5 - 217.0\ (-0)}{\sqrt{123.94}\,\sqrt{\tfrac{1}{6} + \tfrac{1}{5}}} = \dfrac{15.5}{6.7414} = 2.29(92)$   M1 A1
FT reasonable attempt

Refer to $t_9$.   1 May be awarded even if test statistic is wrong. No FT if wrong.
Double-tailed 5% point is 2.262.   1 No FT if wrong
Significant, seems means differ.   1
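The pooled-variance calculation can be sketched from the raw data in a few lines. This uses only the standard library; the variable names are illustrative.

```python
import math
from statistics import mean, variance

a = [237, 249, 213, 233, 227, 236]
b = [203, 222, 214, 216, 230]
n1, n2 = len(a), len(b)

# Pooled estimate of the common variance, weighting each sample by its d.f.
pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)

# Two-sample t statistic on n1 + n2 - 2 = 9 degrees of freedom
t_stat = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
print(round(pooled_var, 2), round(t_stat, 3))
```

The statistic is then compared with the double-tailed 5% point of $t_9$, 2.262.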
(iii)
If the assumptions for the t procedure are satisfied, it is better (more
sensitive/powerful),   E2
but if not it might be seriously misleading and the non-parametric procedure safer.   E2
4

Q3

(i)
$H_0: \mu_D = 0$ (or $\mu_{AFTER} = \mu_{BEFORE}$)   1
$H_1: \mu_D < 0$ (or $\mu_{AFTER} < \mu_{BEFORE}$)   1

Where $\mu_D$ is the population mean difference "after − before"   1 for verbal defn of $\mu_D$
[NOTE candidate might of course define $\mu_D$ as "before − after"; take care that $H_1$ agrees]

Requires Normality of population of differences   1. "Population" must be clear, or clearly implied

The test procedure, and the CI in (ii), MUST be PAIRED COMPARISON t.

Differences are [as "after − before"; candidate might use "before − after"]
6  19  13  31  22  44  11  14

$\bar{d} = -12.4$, $s_{n-1} = 17.621$, $s_{n-1}^2 = 310.49$   A1 Accept $s_n = 16.716(5)$, $s_n^2 = 279.44$ ONLY if correctly used in sequel.

Test statistic is

$\dfrac{-12.4 - 0}{17.621 / \sqrt{10}} = -2.22(535)$   M1, M1, M1 (don't FT to 2nd M1)   A1

Refer to $t_9$   1 May be awarded even if test statistic is wrong. No FT if wrong
Lower s.t. 5% pt is −1.833   1 Sign must agree with $H_1$/test statistic, unless a
clear argument based on modulus is used. No FT if wrong.
Significant   1
Seems mean afterwards is lower.
14

(ii)
CI is given by
$-12.4 \pm 2.262 \times \dfrac{17.621}{\sqrt{10}} = -12.4 \pm 12.60(4) = (-25.00(4),\ 0.20(4))$   M1 B1 M1 A1 c.a.o.

Zero out of 4 if not same dist as for test. Same wrong dist can score max M1 B0
M1 A0. Recovery to $t_9$ is OK.
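The paired-t statistic and the two-sided interval in (ii) depend only on the summary statistics, so they can be checked as follows. This is a sketch using the quoted values $\bar{d} = -12.4$, $s_{n-1} = 17.621$, $n = 10$ and the tabulated $t_9$ point 2.262; the variable names are illustrative.

```python
import math

d_bar, s, n = -12.4, 17.621, 10   # summary statistics quoted in the scheme
t_crit = 2.262                    # two-tailed 5% point of t with 9 d.f. (from tables)

se = s / math.sqrt(n)             # standard error of the mean difference
t_stat = (d_bar - 0) / se
ci = (d_bar - t_crit * se, d_bar + t_crit * se)
print(round(t_stat, 4), tuple(round(x, 3) for x in ci))
```

The interval uses the same $t_9$ distribution as the test, which is the point the scheme insists on.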
(iii)
Any non-parametric procedure   1
Paired Wilcoxon   1 [allow sign test]

Q4

(i)
$H_0$: no association between age and level of interest.   B1
$H_1$: association between age and level of interest.   B1

$o_i$:
 49   216 | 265
145   435 | 580
194   651 | 845

$e_i$:
 60.84   204.16
133.16   446.84
A2. Award A1 if any one is correct. But deduct 1 if not at least 2 dp

$|o_i - e_i| = 11.84$, or 11.34 with Yates' correction
$X^2 = 3.99(71)$ with Yates
$X^2 = 4.35(73)$ without Yates
M1 for either, near-enough correct
A1 if Yates used

Refer to $\chi^2_1$   1 [FT if 2 or 3 df averred]
Upper 5% point is 3.84   1
Significant   1
Seems there is association   1*
Seems under-30s have less interest than would be expected,
and over-30s more, than if there were no association.   2*
* These 3 marks are not available if $H_0$ and $H_1$ are interchanged
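Both versions of the test statistic can be reproduced from the observed 2 × 2 table. The sketch below computes expected frequencies from the margins and optionally applies Yates' correction (subtracting 0.5 from each $|o - e|$); `chi_squared` is a hypothetical helper name.

```python
def chi_squared(observed, yates=False):
    """Pearson X^2 for a two-way table; Yates' correction only makes sense for 2x2."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total   # expected frequency
            diff = abs(o - e) - (0.5 if yates else 0.0)
            stat += diff ** 2 / e
    return stat

observed = [[49, 216], [145, 435]]
x2_plain = chi_squared(observed)
x2_yates = chi_squared(observed, yates=True)
print(round(x2_plain, 2), round(x2_yates, 2))
```

Either value exceeds the upper 5% point of $\chi^2_1$ (3.84), so the conclusion does not depend on whether the correction is applied.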
(ii)
                          Level of interest
Directly-elected mayor    Great   Little   Total
Yes                        118     314      432
No                          49     216      265
Total                      167     530      697

M1 for table with correctly labelled rows and columns.
M1 if all margins correctly add up from the individual values.
A1, A1, A1, A1 for each individual cell (118, 314, 49, 216).

(iii)
We do not [at least prima facie] have a random sample of 697 people
who were classified over the 4 cells. The usual $\chi^2$ approach requires
such an assumption.   E2

Examiners Report

2616 Statistics 4
General Comments
Most candidates appeared to be well prepared for this examination and there was no
evidence that candidates had insufficient time to complete the paper. In fact, some
candidates gave full answers to all four questions.
As in previous years candidates performed much more strongly when carrying out
the numerical parts of questions than they did when discussing assumptions or
analysing results. The two most common examples of this weakness were as follows.
The first was the assumptions required for the various t-tests to be valid: many
candidates were not clear about whether parent populations, samples, means or data
had to be normally distributed, or whether they were looking at one distribution,
two distributions or the difference between two distributions.
The second weakness was in the contextualisation of the results of a hypothesis test.
Many candidates did not make any statement beyond "reject $H_0$", whilst at the other
end of the scale, candidates were too definitive, making statements such as "reject
$H_0$, hence the median strength using process A is greater than the median strength
using process B".
Once again, Question 1 on estimation was by far the least popular question.
However most candidates who attempted question 1 scored well.

Comments on Individual Questions


Q.1

This question was only attempted by about 20% of candidates.


Virtually all candidates knew what they had to do in part (i) and were able to
verify the value of b. Most were also able to calculate the variance of T,
although poor algebra let down some candidates.
In part (ii) most candidates used calculus to show that the variance was
minimised when a = 0.5, although some showed only that the variance had a
stationary value. A few candidates used a method involving completing the
square.
Candidates who got this far were almost all able to state the distribution of T
and explain why it was a better estimator of $\mu$ than either X or Y.
Most candidates who attempted part (iii) knew what they were doing but a
number failed to realise that Var(T) = $\tfrac{1}{2}\sigma^2$ and a number also did not realise
that because the value of $\sigma^2$ was known, the normal distribution should be
used; indeed one candidate used t specifically because the sample was
small.

Q.2

This was the most popular question on the paper, being attempted by all but 2
candidates.
Part (i) was obviously familiar ground for most candidates and most scored
very well here. The method of choice for most candidates was to calculate the
Wilcoxon rank sum statistic, convert to the Mann-Whitney statistic and then
use the Mann-Whitney tables. Only a small minority of candidates calculated
a statistic (Wilcoxon or Mann-Whitney) and then moved directly to the
relevant statistical table. However, this part of the question was answered
better than any other part of the paper.
Part (ii) was not answered as well, with many candidates not realising that
Normality of both underlying populations was required. The pooled variance
also caused some confusion with some candidates trying to pool standard
deviations, some adding variances and others being confused about the use
of $s_n^2$ and/or $s_{n-1}^2$.
Once a variance had been obtained, most candidates were then able to
calculate the test statistic correctly and compared it with the two-tailed value
of $t_9$.
In both parts (i) and (ii) a significant number of candidates were too definitive
in their interpretation of the rejection, or otherwise, of the null hypothesis.
Answers to part (iii) tended to be too vague with very few candidates
mentioning the fact that the t-test is a more powerful, or sensitive, test than
the non-parametric alternatives, as long as the assumptions are satisfied.
However, if the assumptions are not satisfied, results can be seriously
misleading.
Q.3

In part (i) many candidates lost a significant number of marks because they
did not carefully state their hypotheses or take sufficient care with the
distributional assumption. Hypotheses such as "the intensity remains the
same" and "the intensity reduces" were common. What is required are explicit
statements about either the mean of the population of differences, or about
the means of the populations before and after. In addition all terms used
should be defined. The required distributional assumption was the Normality
of the population of differences.
As with other questions, most candidates were able to carry out the
calculations competently and most used the correct value of t.
Part (ii) was very well done by the majority of candidates, although a few did
use the Normal distribution.
Virtually all candidates correctly named the paired Wilcoxon test in part (iii).

Q.4

Most candidates were obviously on comfortable ground here and tended to
score well.
In part (i) most candidates were able to state the hypotheses correctly,
although some got the hypotheses the wrong way round and some talked
about correlation.
Calculations were inevitably done correctly, but a few candidates only gave
the expected values to 1 decimal place or even to the nearest integer.
Many candidates obviously realised that it would be appropriate to use Yates'
correction, but few actually did. Of those that did, some were unsure whether
to add or subtract 0.5.
Most candidates correctly used 1 degree of freedom for the test and were
able to give the correct critical value. A small minority used 2 or 3 degrees of
freedom.
There was a definite improvement on previous years in the discussion of the
results of the hypothesis test, with many candidates considering the
contributions to the $\chi^2$ statistic, or at the very least considering the
differences between observed and expected values.
Most candidates scored full marks in part (ii).
Candidates struggled with part (iii), with the most common suggestion being
about different sample sizes. The actual reason was that we do not have a
random sample of people who were classified over the 4 cells.

Mark Scheme

MEI STATISTICS 4 (2616)

JANUARY 2005

SOLUTIONS

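Question 1
(i)
We have:
$X_1 \sim \text{Poisson}(\theta)$, $X_2 \sim \text{Poisson}(4\theta)$, $X_3 \sim \text{Poisson}(10\theta)$   M1 might be implicit in sequel

$\hat{\theta} = \tfrac{1}{15}(X_1 + X_2 + X_3)$

$E(\hat{\theta}) = \tfrac{1}{15}(\theta + 4\theta + 10\theta)$   M1 for any attempt to find $E(\hat{\theta})$; M1 for use of Poisson means
$= \theta$, $\therefore \hat{\theta}$ is unbiased   A1

$\mathrm{Var}(\hat{\theta}) = \tfrac{1}{15^2}\mathrm{Var}(X_1 + X_2 + X_3)$   M1 for any (reasonable) attempt to find Var
$= \tfrac{1}{15^2}(\theta + 4\theta + 10\theta)$   M1 for use of Poisson variances
$= \dfrac{\theta}{15}$   A1 - beware printed answer
8

(ii)
$Y \sim \text{Poisson}(10\theta)$
$E(\tfrac{1}{10}\bar{Y}) = \tfrac{1}{10}E(\bar{Y}) = \tfrac{1}{10}E(Y) = \tfrac{1}{10} \cdot 10\theta = \theta$, i.e. unbiased   M1 A1 1

Now $\mathrm{Var}(\tfrac{1}{10}\bar{Y}) = \tfrac{1}{100}\mathrm{Var}(\bar{Y}) = \tfrac{1}{100} \cdot \dfrac{\mathrm{Var}(Y)}{n} = \tfrac{1}{100} \cdot \dfrac{10\theta}{n} = \dfrac{\theta}{10n}$   M1 M1 M1, A1
7

(iii)
$\dfrac{\theta}{10n} < \dfrac{\theta}{15}$ for $n \ge 2$, i.e. $\tfrac{1}{10}\bar{Y}$ is better   M1 E1

For $Z \sim \text{Poisson}(\theta)$, we have $\mathrm{Var}(\bar{Z}) = \dfrac{\theta}{n}$
So would need $n \ge 16$ to be better than $\hat{\theta}$   E2 Allow 1 for $n \ge 15$
5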
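The sample-size comparisons in part (iii) can be verified with exact fractions. The sketch below works with the coefficient of the Poisson parameter in each variance (so the parameter itself, written here as theta by assumption, cancels); all function names are hypothetical.

```python
from fractions import Fraction

def var_coeff_combined():
    """Coefficient of theta in Var((X1 + X2 + X3) / 15): (1/15^2) * (1 + 4 + 10)."""
    return Fraction(1, 15 ** 2) * (1 + 4 + 10)

def var_coeff_scaled_mean(m, n):
    """Coefficient of theta in Var((1/m) * mean of n Poisson(m*theta) values)."""
    return Fraction(1, m ** 2) * Fraction(m, n)

def smallest_n_beating(m, target):
    """Smallest n for which the scaled-mean estimator has strictly smaller variance."""
    n = 1
    while var_coeff_scaled_mean(m, n) >= target:
        n += 1
    return n

target = var_coeff_combined()             # 1/15
n_for_Y = smallest_n_beating(10, target)  # observations at rate 10*theta
n_for_Z = smallest_n_beating(1, target)   # observations at rate theta
print(target, n_for_Y, n_for_Z)
```

The strict inequality is why $n = 15$ is not enough for the rate-$\theta$ observations: it only matches the variance of the combined estimator.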

Question 2
(a)
MUST be N(0,1) test and CI for comparing means

$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$
1 if both correct. DO NOT allow $\bar{X}_1 = \bar{X}_2$ or similar. Allow verbal statement
1 if $\mu_1$, $\mu_2$ are adequately defined in words (population mean times)

Test statistic is

$\dfrac{12.6 - 13.9}{\sqrt{\dfrac{(2.4)^2}{80} + \dfrac{(3.5)^2}{90}}} = \dfrac{-1.3}{\sqrt{0.2081}} = \dfrac{-1.3}{0.4562} = -2.84(97)$   M1 A1

Refer to N(0,1)   1 No FT if wrong
1% critical point (two-sided) is 2.576   1 No FT if wrong
Significant
Seems mean waiting times differ

CI is given by
$-1.3 \pm 1.96 \times 0.4562 = -1.3 \pm 0.894 = (-2.194,\ -0.406)$   M1 B1 M1 A1 accept (−2.2, −0.4)
12
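The large-sample test statistic and the 95% interval in part (a) can be reproduced from the summary figures. This is a sketch with illustrative variable names, using the tabulated Normal point 1.96.

```python
import math

mean1, sd1, n1 = 12.6, 2.4, 80
mean2, sd2, n2 = 13.9, 3.5, 90

diff = mean1 - mean2
se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)   # unpooled standard error
z = diff / se
ci = (diff - 1.96 * se, diff + 1.96 * se)        # 95% two-sided interval
print(round(z, 4), tuple(round(x, 3) for x in ci))
```

With samples of 80 and 90 the sample variances are treated as known, which is why N(0,1) rather than t is required.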
(b)
MUST be Wilcoxon rank-sum test (or Mann-Whitney form thereof).

It is convenient, and natural, to rank "top down".
[For "bottom-up" rankings: W = 55, MW = 34; upper 5% tail needed, with critical values W = 55, MW = 34]

Use of ranks   M1
Ranks are:   I        II
             10  11  13  12
A1

Rank sum (for I) is 29 (Mann-Whitney is 8)
Refer to tables of Wilcoxon (or M-W) statistic
Lower 5% tail is needed
Value for (6,7) is 29 (or 8 if M-W used)
Result is significant
Seems on the whole there are differences in satisfaction scores

Question 3
Differences (after − before):

6  11  22  −5  −1  4  28  −2  7  3  9  8

(a)
MUST be PAIRED WILCOXON test.

Ranks of |d| are
6  10  11  5  1  4  12  2  7  3  9  8   M1 A1 FT if wrong

Test statistic is 5 + 1 + 2 = 8 [or 70]
Refer to paired Wilcoxon table with n = 12
Lower 5% point is 17 [upper is 61]
$\therefore$ the observed 8 [or 70] is significant
Seems coaching programme has improved short-term visual memory
7

(b)
MUST be PAIRED COMPARISON t test   1
Normality of differences

$\bar{d} = 7.5$, $s_{n-1} = 9.5299$ ($s_{n-1}^2 = 90.8182$)   M1 for use of differences
B1 Accept $s_n = 9.1241$ ($s_n^2 = 83.25$) ONLY if correctly used in sequel

Test statistic (for test of $\mu_D = 0$ against $\mu_D > 0$) is

$\dfrac{7.5 - 0}{9.5299 / \sqrt{12}} = 2.72(62)$   M1 A1

Refer to $t_{11}$   1 No FT if wrong
Upper 5% pt is 1.796   1 No FT if wrong
Significant
Seems coaching programme has improved short-term visual memory
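Both procedures in this question can be checked from the twelve differences. In the sketch below the three negative signs (−5, −1, −2) are an assumption inferred from the quoted statistics (Wilcoxon statistic 5 + 1 + 2 = 8 and mean 7.5), since the signs do not survive in the listing; the variable names are illustrative.

```python
import math
from statistics import mean, stdev

# Differences (after - before); signs of -5, -1, -2 inferred from the quoted results
d = [6, 11, 22, -5, -1, 4, 28, -2, 7, 3, 9, 8]

# Rank the absolute differences (no ties here), then sum ranks by sign
ordered = sorted(d, key=abs)
rank_of = {value: i + 1 for i, value in enumerate(ordered)}
w_minus = sum(rank_of[v] for v in d if v < 0)
w_plus = sum(rank_of[v] for v in d if v > 0)

# Paired-comparison t statistic on len(d) - 1 = 11 degrees of freedom
t_stat = mean(d) / (stdev(d) / math.sqrt(len(d)))
print(w_minus, w_plus, round(t_stat, 4))
```

The two rank sums always add to $n(n+1)/2 = 78$, which is why either 8 (lower tail) or 70 (upper tail) may be quoted.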

Look at differences   M1
Consider e.g. dotplot   M1, or for any other relevant display/discussion of the data
Bulk of data appear OK [assuming no concern about being integers],
but the two large upper outliers cast doubt   E2 (E0, E1, E2)

Question 4
(i)
$H_0$: no association (between success of transmission and type of destination)
$H_1$: association
2

(ii)
$O_i$:
100    57    23 | 180
 21    14    13 |  48
 31    21    20 |  72
152    92    56 | 300

$E_i$:
91.2(0)   55.2(0)   33.6(0)
24.32     14.72      8.96
36.48     22.08     13.44
A4 - deduct 1 per error. Must be to this level of accuracy

Contributions to $X^2$:
0.8491   0.0587   3.3440
0.4532   0.0352   1.8216
0.8232   0.0528   3.2019
M1

$X^2 = 10.63(985)$, awrt 10.64   A2 [give A1 if in (10.5, 10.8)]

Refer to $\chi^2_4$   2 [or zero; FT if wrong, unless 300]
Upper 10% point is 7.779   1
Significant   1
Seems there is association   1
ZERO if $H_0$ and $H_1$ are interchanged
12

(iii)
The key feature is the behaviour of transmission when intended
destinations are universities. There are many more "more than one
attempt", and many more "not successful at all", transmissions than
would be expected if there were no association, and many fewer
"successful at first attempt" transmissions. There is little or no
suggestion of any other associations.
E6 (divisible)
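The expected frequencies and cell contributions for the 3 × 3 table can be recomputed from the observed counts alone. This is a minimal sketch; `contributions` is a hypothetical helper name.

```python
def contributions(observed):
    """Return (expected, contribution) tables for a two-way contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    contrib = [[(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
               for o_row, e_row in zip(observed, expected)]
    return expected, contrib

observed = [[100, 57, 23], [21, 14, 13], [31, 21, 20]]
expected, contrib = contributions(observed)
x2 = sum(sum(row) for row in contrib)
dof = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(x2, 4), dof)
```

Looking at `contrib` cell by cell shows that the third column dominates the statistic, which is the basis of the discussion in part (iii).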

Mark Scheme 2616
June 2005

2616 Statistics 4

Q1
$X_1, \ldots, X_n \sim$ ind $N(\mu, \sigma^2)$; $Y = \sum (X_i - \bar{X})^2$
$E(Y) = (n - 1)\sigma^2$, $\mathrm{Var}(Y) = 2(n - 1)\sigma^4$

(i)
$T = kY$
$E(T) = k(n - 1)\sigma^2$   B1
$\mathrm{Var}(T) = 2k^2(n - 1)\sigma^4$   B1

(ii)
Bias $= E(T) - \sigma^2 = k(n - 1)\sigma^2 - \sigma^2$   M1 A1 Allow M1A0 if $\sigma^2 - E(T)$.

(iii)
MSE(T) = Variance + bias$^2$   M1 If both terms present, even if wrong. A1 If both correct.
$= 2k^2(n - 1)\sigma^4 + \{k(n - 1)\sigma^2 - \sigma^2\}^2$
$= 2k^2(n - 1)\sigma^4 + \sigma^4\{k^2(n - 1)^2 - 2k(n - 1) + 1\}$
$= \sigma^4[2(n - 1) + (n - 1)^2]k^2 - 2\sigma^4(n - 1)k + \sigma^4$   A2 Divisible for algebra. BEWARE printed answer.

(iv)
Consider $\dfrac{d\,\mathrm{MSE}(T)}{dk} = 0$   M1 To include "= 0", possibly implied.
$\dfrac{d\,\mathrm{MSE}(T)}{dk} = \sigma^4[2(n - 1) + (n - 1)^2]\,2k - 2\sigma^4(n - 1)$   A1 Correct derivative.
$= 0 \Rightarrow k = \dfrac{n - 1}{2(n - 1) + (n - 1)^2}$   A1 Isolate k.
$= \dfrac{1}{n + 1}$   A1 BEWARE printed answer.
Check minimum by considering
$\dfrac{d^2\,\mathrm{MSE}(T)}{dk^2} = 2\sigma^4[2(n - 1) + (n - 1)^2]$   M1 Or other methods.
$> 0 \Rightarrow$ min   A1 (Since $n > 1$).

(v)
With $k = \dfrac{1}{n + 1}$,
$\mathrm{MSE}(T) = \sigma^4\left\{\dfrac{2(n - 1) + (n - 1)^2}{(n + 1)^2} - \dfrac{2(n - 1)}{n + 1} + 1\right\}$
$= \dfrac{\sigma^4}{(n + 1)^2}\{2n - 2 + n^2 - 2n + 1 - 2n^2 + 2 + n^2 + 2n + 1\}$
$= \dfrac{\sigma^4}{(n + 1)^2}\{2n + 2\} = \dfrac{2\sigma^4}{n + 1}$   B2 Divisible for algebra. Answer not printed.

(vi)
From (ii), we want $k(n - 1)\sigma^2 - \sigma^2 = 0$   M1 For the converse argument, with no support of "only if", award SC B1.
$\therefore k = \dfrac{1}{n - 1}$   A1
In this case, MSE(T) = Var(T)   M1
$= \dfrac{2\sigma^4}{n - 1}$   A1 Or substitute in expression for MSE in (iii); this is not difficult.
4
20
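The algebra in parts (iii) to (vi) can be sanity-checked with exact arithmetic. The sketch below evaluates the coefficient of $\sigma^4$ in MSE(T), namely $2k^2(n-1) + \{k(n-1) - 1\}^2$, so no value of $\sigma$ is needed; `mse_coeff` is a hypothetical helper name.

```python
from fractions import Fraction

def mse_coeff(k, n):
    """Coefficient of sigma^4 in MSE(kY): variance term plus squared bias term."""
    variance_term = 2 * k ** 2 * (n - 1)
    bias_term = (k * (n - 1) - 1) ** 2
    return variance_term + bias_term

checks = []
for n in range(2, 12):
    k_min = Fraction(1, n + 1)        # minimises the MSE
    k_unbiased = Fraction(1, n - 1)   # makes T unbiased
    step = Fraction(1, 1000)
    checks.append((
        mse_coeff(k_min, n) == Fraction(2, n + 1),
        mse_coeff(k_unbiased, n) == Fraction(2, n - 1),
        mse_coeff(k_min - step, n) > mse_coeff(k_min, n) < mse_coeff(k_min + step, n),
    ))
print(all(all(c) for c in checks))
```

The comparison $2/(n+1) < 2/(n-1)$ shows that the biased estimator with $k = 1/(n+1)$ has the smaller mean square error for every $n > 1$.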

Q2
(i)
$H_0: \mu_A = \mu_B$
$H_1: \mu_A \ne \mu_B$
B1 Both hypotheses. Do not allow any other symbols, including, e.g., $\bar{X}_A = \bar{X}_B$ or similar, unless they are clearly and explicitly stated to be population means. Allow statements in words (see below).

Where $\mu_A$, $\mu_B$ are the population mean strengths for processes A and B.
B1 For adequate verbal definitions of $\mu_A$, $\mu_B$. Must indicate "mean"; condone "average". Allow absence of "population" if correct notation is used, otherwise insist on "population".

Normality of both populations.   B1
Same variance.   B1
4

(ii)
$n_1 = 9$, $\bar{x} = 114.6667$, $s_{n-1}^2 = 87.25$ ($s_{n-1} = 9.3408$)
$n_2 = 8$, $\bar{y} = 123.75$, $s_{n-1}^2 = 109.07$ ($s_{n-1} = 10.4437$)
B1 If all means and variances correct. Accept $s_n$'s ONLY if correctly used in sequel:
$s_n^2 = 77.5(6)$, $s_n = 8.8066$; $s_n^2 = 95.4375$, $s_n = 9.7692$

Pooled $s^2 = \dfrac{698 + 763.5}{15} = 97.43$
M1 For any reasonable attempt at pooling (and ft into test and CI).
A1 If correct.

Test statistic is

$\dfrac{114.6667 - 123.75}{\sqrt{97.43}\,\sqrt{\tfrac{1}{9} + \tfrac{1}{8}}} = \dfrac{-9.0833}{\sqrt{23.0051}} = \dfrac{-9.0833}{4.7964} = -1.89(38)$
M1 Overall structure. Allow c's pooled s.
M1 For $\sqrt{\tfrac{1}{9} + \tfrac{1}{8}}$.
A1 ft c's pooled $s^2$.

Refer to $t_{15}$.   M1 No ft from here if wrong.
Double tail 5% point is 2.131.   A1 No ft from here if wrong.
Not significant.   E1 ft only c's test statistic.
Seems mean strengths are the same for both processes.   E1 ft only c's test statistic. Expect reference to means and context.
10

(iii)
CI is given by
$-9.0833 \pm 2.947 \times 4.7964 = -9.0833 \pm 14.1349 = (-23.21(8),\ 5.05(2))$
M1 Must be c's $(\bar{x} - \bar{y}) \pm \ldots$
B1 From $t_{15}$.
M1 Allow c's pooled s.
A1 c.a.o. Must be written as an interval.

(iv)
Wilcoxon   B1
Rank sum test   B1
Or Mann-Whitney scores B2.
2
20

Q3
(a)
$H_0: \mu_D = 0$ or $\mu_E = \mu_S$
$H_1: \mu_D \ne 0$ or $\mu_E \ne \mu_S$
B1 Both hypotheses. Do not allow any other symbols, including, e.g., $\bar{X}_E = \bar{X}_S$ or similar, unless they are clearly and explicitly stated to be population means. Allow statements in words (see below).

Where $\mu_D$ is "population mean for Experimental fertilizer − population mean for Standard fertilizer".
B1 For adequate verbal definition of $\mu_D$. Must indicate "mean"; condone "average". Allow absence of "population" if correct notation is used, otherwise insist on "population".

Normality of differences is required.   B1 Must be explicit about the population.

MUST be PAIRED COMPARISON t test.

Differences are
0.6  2.3  0.8  0.6  0.9  1.5  1.4  0.8  0.1  0.2   M1
$\bar{d} = 0.46$, $s_{n-1} = 1.0668(75)$, $s_{n-1}^2 = 1.1382$   B1 Accept $s_n = 1.0121$, $s_n^2 = 1.0244$ ONLY if correctly used in sequel.

Test statistic is

$\dfrac{0.46 - 0}{1.0668(75) / \sqrt{10}} = 1.36(35)$
M1 Allow c's $\bar{d}$ and/or $s_{n-1}$.
Allow alternative: $0 \pm$ (c's 2.262) $\times \dfrac{1.0668(75)}{\sqrt{10}}$ (= 0.7631) for subsequent comparison with $\bar{d}$.
(Or $\bar{d} \pm$ (c's 2.262) $\times \dfrac{1.0668(75)}{\sqrt{10}}$ (= −0.303, 1.2231) for comparison with 0.)
A1 c.a.o. (but ft from here if this is wrong.) Use of $\mu_D - \bar{d}$ scores M1A0, but next 4 marks still available.

Refer to $t_9$.   M1 No ft from here if wrong.
Double tail 5% point is 2.262.   A1 No ft from here if wrong.
Not significant.   E1 ft only c's test statistic.
Seems mean yield using experimental fertilizer is same as for standard.   E1 ft only c's test statistic. Expect reference to mean(s) and context.
11

(b)
Now need Normality for yields using experimental fertilizer.   B1
For these yields,
$\bar{x} = 20.43$, $s_{n-1} = 4.0803$, $s_{n-1}^2 = 16.649$   B1 Accept $s_n = 3.8709$, $s_n^2 = 14.9841$ ONLY if correctly used in sequel.

One-sided CI (lower confidence bound) is given by

$20.43 - 1.833 \times \dfrac{4.0803}{\sqrt{10}} = 20.43 - 2.36(51) = 18.06(49)$
M1 Mean. Allow c's $\bar{x}$.
M1 Minus.
B1 From $t_9$.
M1 Allow c's $s_{n-1}$, or $s_n / \sqrt{9}$ (see above).
A1 Depends on all 4 preceding marks.

In repeated sampling, lower confidence bounds obtained in this way would fall below the true mean
on 95% of occasions.   E2 (E0, E1, E2). Comment should refer to lower bound rather than just the confidence interval.
9
20

Q4
(a)
Data             29   32   34   38   40   46   51   52   59   63   71   95
Median 60
Difference      −31  −28  −26  −22  −20  −14   −9   −8   −1    3   11   35
Rank of |diff|   11   10    9    8    7    6    4    3    1    2    5   12
M1 For differences. ZERO in this section if differences not used.
M1 For ranks of |difference|.
A1 All correct. ft from here if ranks wrong.

T = 2 + 5 + 12 = 19   B1 Or 1 + 3 + 4 + 6 + 7 + 8 + 9 + 10 + 11 = 59
Refer to tables of Wilcoxon single sample (/paired) statistic.   M1 No ft from here if wrong.
Lower (or upper if 59 used) 2½% tail is needed.   M1 No ft from here if wrong.
Value for n = 12 is 13 (or 65 if 59 used).   A1 No ft from here if wrong.
Result is not significant.   E1 ft only c's test statistic.
No real evidence that median is not 60.   E1 ft only c's test statistic.

(b)
(i)
$P(80 < N(62, \sigma = 27.3) \le 100)$
$= P(0.6593(4) < N(0, 1) \le 1.3919(4))$
$= 0.9180 - 0.7452 = 0.1728$   B1
$\therefore$ expected frequency $= 200 \times 0.1728 = 34.6$   B1 BEWARE printed answer.

(ii)
Grouping the last two cells,
$X^2 = 5.6903 + 0.1946 + 18.3265 + 5.2024 + 8.9526 + 5.6195 = 43.98(59)$   M1 A1
Allow without grouping. This becomes $\ldots + 0.0769 + 21.7529$; $X^2$ becomes 60.19(62). Then must have $\chi^2_4$ below.

Refer to $\chi^2_3$.   M1 NEXT mark not available if not $\chi^2_3$.
Extremely highly significant: overwhelming evidence that Normal model does not fit data.   A1 ft only c's test statistic.

The fit is not particularly good in most of the
intervals, but the main points are that the modal class
is perhaps half an interval lower than expected, that
there are many fewer low values than expected, and
that there are a lot of upper outliers.   E2 (E0, E1, E2)

(iii)
Part (a) has a small sample and it appears that the
underlying distribution is not Normal: could be
dangerous to use a t test.   E2 (E0, E1, E2)

There is also the point that, in the absence of
Normality (or at least of symmetry), we could not use
the t test for the mean as a proxy test for the median.   E1
20
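The expected frequency in part (b)(i) can be checked without tables by writing the Normal distribution function in terms of `math.erf`. This is a sketch using the fitted values quoted in the working (mean 62, standard deviation 27.3, 200 observations); `phi` is a hypothetical helper name.

```python
import math

def phi(z):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n_obs = 62.0, 27.3, 200
lower, upper = 80.0, 100.0

z_lower = (lower - mu) / sigma
z_upper = (upper - mu) / sigma
prob = phi(z_upper) - phi(z_lower)   # P(80 < X <= 100) under the fitted model
expected = n_obs * prob
print(round(z_lower, 4), round(z_upper, 4), round(expected, 1))
```

Small differences in the final decimal place relative to the working come from rounding the tabulated probabilities to four places.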

2616 - Statistics 4

General Comments
There were 93 candidates from 20 centres (June 2004: 82 from 20). The overall
standard of the scripts seen was pleasing: many candidates were clearly well
prepared for this paper. Routine calculations were carried out well but the
candidates' ability to comment and interpret was a little disappointing at this level.
Question 1 was by far the least popular question with only about 15 candidates
attempting it. Every candidate attempted Question 2; Questions 3 and 4 were
equally popular.

Comments on Individual Questions


1)

Estimation theory
Although this was the least popular question it seemed to have the highest mean
mark, with most of those attempting it scoring full or almost full marks. Those
who were prepared to try it were likely to be successful as long as their algebra
was up to the task. Sometimes the algebra arrived at the correct destination by
brute force rather than elegance.
There were just two places where marks seemed likely to be lost: part (iv) where
some neglected to verify that the required value of k did indeed give a minimum
and part (vi) where there was a temptation for some to use the converse argument.

2)

Two sample t test and confidence interval; the strengths of steel rods

This was the most popular question, being attempted by all candidates. It was also
a very high scoring question: about half of the entry scored full or almost full
marks.
(i)
The hypotheses were usually stated correctly but there was rather less
care in providing verbal definitions of the population means. Similarly, the
required assumptions were sometimes less than ideal.

(ii)
Most candidates carried out the test competently. There was rarely any
problem over finding and using the pooled variance. The critical value was
almost always correct but on a number of occasions the conclusion was
badly expressed.

(iii)
As in part (ii) most candidates had little difficulty here. Just occasionally
the standard error (which had been correctly constructed in part (ii))
became "pooled s $\times \sqrt{1/17}$".

(iv)
This part was almost always correct.

3)

Paired sample t test and one-sided confidence interval; comparing fertilizers

(a)
The hypotheses were usually stated correctly but candidates were not as
careful about defining the symbol $\mu_D$. Nor were they sufficiently careful
when it came to the distributional assumption.
However there were only a very few candidates who did not realise that
they should carry out a paired test. The vast majority made good progress
with the test itself, and only the final conclusion left room for
improvement.

(b)
As above, most realised what to do here and the correct value for the
lower bound was usually found. A small minority tried to construct the
confidence interval using the information from the paired test. There was
some uncertainty again with the distributional assumption.
The main area of difficulty was with the interpretation of the interval.
Very many comments revealed a flawed understanding of a confidence
interval to quite a worrying extent.

4)

Wilcoxon single sample test for the median; Chi-squared test for goodness of fit;
waiting times in an airport

(a)
This part of the question was almost always answered well. Many fully
correct solutions were seen.

(b) (i)
This part was frequently done correctly.

(ii)
Most candidates calculated a correct value of $X^2$ (with or without
grouping) but relatively few were able to identify the correct Chi-squared
distribution to look up. Most of those who got this second aspect wrong
made no allowance for estimated parameters while a few thought that
there were 200 degrees of freedom. Hardly any commented on the fact
that the test statistic was significant at any level available to them in the
tables.
Disappointingly few candidates took the trouble to comment at all on the
reasons for the poor quality of fit.

(iii)
In this part of the question very few candidates realised that they could
refer back to the previous part for evidence that the assumption of
background Normality was not viable. They knew that Normality was
required, but often chose to look at the sample data in part (a), sometimes
with the aid of a dot plot. Hardly any candidates included in their
discussion the small sample size which might prompt the use of a t test.
No more than a handful of candidates picked up on the fact that a t test
examines the population mean whereas the Wilcoxon test in part (a)
examined the median.
